Notes on the inverse function theorem

Math 511, Spring 2018

Theorem 0.1 (Inverse Function Theorem). Suppose Ω ⊂ Rn is an open set
and that F : Ω → Rn is a continuously differentiable function on Ω. Suppose
further that F′(a) ∈ L(Rn) is an invertible transformation for some a ∈ Ω.
Denote b = F(a).

1. There exist open sets U, V ⊂ Rn such that a ∈ U, b ∈ V, F is one-to-one
   on U and F(U) = V. In other words the restriction of F to U, denoted
   F : U → V, defines a bijection.

2. If F −1 : V → U denotes the inverse map of the bijection above, then
   F −1 is continuously differentiable on V and (F −1)′(y) = (F′(x))−1
   where y = F(x).

Proof. We first observe that if A ∈ L(Rn) is invertible and b ∈ Rn is any
vector, then the mapping x ↦ Ax + b defines a bijection from Rn to itself
with inverse given by y ↦ A−1(y − b). Both mappings are the composition
of a linear transformation with a translation and hence continuous. Since
inverse images of open sets under continuous maps are open, we know

U is open ⇔ {Ax + b : x ∈ U} is open.

Consequently, it suffices to assume that a = b = 0, that is F(0) = 0, and that
F′(0) = I is the identity. For if this is not the case, we can apply the theorem
with these additional assumptions to F̃(x) := (F′(a))−1(F(x + a) − F(a)),
which is easily verified to be continuously differentiable on the open set
{x ∈ Rn : x + a ∈ Ω}. It is then an exercise to see that the open sets U, V
furnished by applying the theorem to F̃ yield open sets about a and b for
which the conclusion of the theorem holds for F.
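To see concretely that the reduction is harmless, note (a short verification,
spelled out here for convenience) that F̃ satisfies the normalized hypotheses:
by the chain rule,

F̃(0) = (F′(a))−1(F(a) − F(a)) = 0    and    F̃′(x) = (F′(a))−1 F′(x + a),

so in particular F̃′(0) = (F′(a))−1 F′(a) = I.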
From now on, we assume that a = b = 0 so that F(0) = 0 and that
F′(0) = I with 0 ∈ Ω. Define H : Ω → Rn by H(x) = x − F(x), so that
H(0) = 0 and H is verified to be continuously differentiable with H′(0) = 0.
By continuity, there exists r > 0 such that Nr(0) ⊂ Ω and ‖H′(x)‖ < 1/2 for
x ∈ Nr(0). Since Nr(0) is convex, Theorem 9.19 in Rudin implies that

|H(x) − H(x′)| ≤ (1/2)|x − x′|.                  (0.1)
Next, recalling that F(x) + H(x) = x, we have that

|x − x′| = |F(x) + H(x) − (F(x′) + H(x′))|
         ≤ |F(x) − F(x′)| + |H(x) − H(x′)|
         ≤ |F(x) − F(x′)| + (1/2)|x − x′|,

and by rearranging the inequality, we have that

|x − x′| ≤ 2|F(x) − F(x′)|.

This now shows that restricting F to Nr(0) yields an injective map, for if
F(x) = F(x′) with x, x′ ∈ Nr(0), then x = x′.
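As a concrete illustration (an example added here, not part of the notes),
take n = 1 and F(x) = x + x²/2, so F(0) = 0 and F′(0) = 1. Then
H(x) = x − F(x) = −x²/2 and H′(x) = −x, so ‖H′(x)‖ < 1/2 holds on Nr(0)
with r = 1/2, and the estimate above gives |x − x′| ≤ 2|F(x) − F(x′)| for
x, x′ ∈ N1/2(0).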
Next, we have to show that F is surjective near the origin. We thus
take y ∈ Nr/2 (0) and want to show that there exists x ∈ Nr (0) such that
F (x) = y. To this end, define G(x) := x + y − F (x) = H(x) + y and observe
that G has a fixed point in Nr (0) if and only if F (x) = y has a solution on
this set:

x = G(x) = x + y − F (x) ⇔ 0 = y − F (x) ⇔ F (x) = y.

But G is easily observed to be a contraction, since it is a translation of H:

|G(x) − G(x′)| = |H(x) − H(x′)| ≤ (1/2)|x − x′|.
Thus if we can show that G(N̄r(0)) ⊂ N̄r(0), that is, G : N̄r(0) → N̄r(0)
(where N̄r(0) denotes the closed ball of radius r about 0), then G is a
mapping from a complete metric space to itself, at which point the
contraction mapping fixed point theorem shows that G has a unique fixed
point in N̄r(0). Indeed, it can be verified that a closed subset of a complete
metric space defines a complete metric space on its own, or alternatively
that any compact metric space is complete. Suppose x ∈ N̄r(0); then
applying (0.1) with x′ = 0, we obtain

|G(x)| ≤ |H(x)| + |y| < (1/2)|x| + r/2 ≤ r/2 + r/2 = r,

hence G(N̄r(0)) ⊂ Nr(0), and the second inequality is indeed strict since
y ∈ Nr/2(0). Note that the latter point implies that if x = G(x), then
in fact |x| < r, so the equation y = F(x) is solved by some x ∈ Nr(0).
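The proof is constructive: the fixed point of G can be computed by simple
iteration. The following Python sketch (an illustration added here; the
example F, the point y, and the tolerance are arbitrary choices, not part of
the notes) iterates x ↦ x + y − F(x) to solve F(x) = y near the origin.

import numpy as np

def local_inverse(F, y, tol=1e-12, max_iter=200):
    # Iterate the contraction G(x) = x + y - F(x); assumes F(0) = 0,
    # F'(0) = I, and that y is small enough for G to map a closed ball
    # around 0 into itself with contraction constant 1/2.
    x = np.zeros_like(y)
    for _ in range(max_iter):
        x_new = x + y - F(x)
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# One-dimensional example: F(x) = x + x**2/2, so F(0) = 0 and F'(0) = 1.
F = lambda x: x + x**2 / 2
y = np.array([0.1])
x = local_inverse(F, y)
print(x, F(x))   # F(x) agrees with y up to the tolerance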
The first part of the theorem is concluded by setting V = Nr/2 (0) and
U = Nr (0) ∩ F −1 (Nr/2 (0)), which define open sets since F is continuous.
Moreover, the properties established above ensure that F : U → V is a
bijection.
We now prove the second half of the theorem, that F −1 : V → U is
continuously differentiable. Note that since ‖F′(x) − I‖ = ‖H′(x)‖ < 1/2
on U, F′(x) is invertible for x ∈ U by Theorem 9.8(a) in Rudin. Here
it is sufficient to show that if y ∈ V, then (F −1)′(y) exists and is equal
to (F′(x))−1, where y = F(x). Indeed, as soon as we establish that F −1 is
differentiable on V, then we know that x = F −1(y) defines x as a continuous
function of y, so by continuity of inversion (Theorem 9.8(b)), y ↦
(F′(F −1(y)))−1 is a composition of continuous maps, which shows that
F −1 is continuously differentiable. Alternatively, this can be seen by using
matrices: since the entries of (F′(x))−1 are rational functions of the entries
of F′(x) and the partial derivatives Dj fi(x) are continuous functions, again
there is continuous dependence of the entries of (F −1)′(y) on y.
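For example (a standard fact recorded here for concreteness, not taken from
the notes), in the 2 × 2 case the inverse of [ a b ; c d ] is
(ad − bc)−1 [ d −b ; −c a ] (rows separated by semicolons), so each entry of
the inverse is a rational function of a, b, c, d with denominator the
determinant; in general this follows from the cofactor formula
A−1 = (det A)−1 adj(A).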
Recall that if R(h) := F(x + h) − F(x) − F′(x)h, then |R(h)| = o(|h|) as
h → 0. However, here we want to define h as a function of k by the relation

h = F −1(y + k) − F −1(y) = F −1(y + k) − x,

which is a well defined injection for k such that y + k ∈ V. Hence
x + h = F −1(y + k), or equivalently,

F(x + h) = y + k = F(x) + k.

We now return to the function G(x) = x + y − F(x) defined above. Recall
that it is a contraction with constant 1/2. Hence, since

G(x + h) − G(x) = x + h + y − F(x + h) − (x + y − F(x)) = h − k,

we have that

|h − k| = |G(x + h) − G(x)| ≤ (1/2)|h|.

Hence

|h| ≤ |k| + |h − k| ≤ |k| + (1/2)|h|,

or equivalently, |h| ≤ 2|k|. This shows that h → 0 as k → 0 and that, when
h ≠ 0, 1/|k| ≤ 2/|h|.

We now conclude by considering

F −1(y + k) − F −1(y) − (F′(x))−1 k = h − (F′(x))−1 k
                                    = −(F′(x))−1 (k − F′(x)h)
                                    = −(F′(x))−1 (F(x + h) − F(x) − F′(x)h).

Hence

|F −1(y + k) − F −1(y) − (F′(x))−1 k| / |k|
        ≤ 2 ‖(F′(x))−1‖ · |F(x + h) − F(x) − F′(x)h| / |h|,

and since h → 0 as k → 0, the right hand side of this inequality tends to 0
as k → 0. This shows that F −1 is differentiable at y with
(F −1)′(y) = (F′(x))−1, which completes the proof.
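As a sanity check of the formula (F −1)′(y) = (F′(x))−1 (an example added
here, continuing F(x) = x + x²/2 from above): solving y = x + x²/2 for the
root near 0 gives F −1(y) = √(1 + 2y) − 1, and hence

(F −1)′(y) = 1/√(1 + 2y) = 1/(1 + x) = (F′(x))−1,

since 1 + x = √(1 + 2y) when x = F −1(y).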

Theorem 0.2 (Implicit Function Theorem). Suppose Ω ⊂ Rn+m is an open


set and that F : Ω → Rn is continuously differentiable. Let (x, y) denote
coordinates in Rn+m so that x ∈ Rn, y ∈ Rm, and write the Jacobian matrix
F′(x, y) in block form as F′(x, y) = [ ∂F/∂x  ∂F/∂y ], where

∂F/∂x = [ ∂f1/∂x1  · · ·  ∂f1/∂xn ]
        [     ⋮       ⋱        ⋮   ]
        [ ∂fn/∂x1  · · ·  ∂fn/∂xn ]

∂F/∂y = [ ∂f1/∂y1  · · ·  ∂f1/∂ym ]
        [     ⋮       ⋱        ⋮   ]
        [ ∂fn/∂y1  · · ·  ∂fn/∂ym ]

so that ∂F/∂x, ∂F/∂y are n × n and n × m matrices respectively. If ∂F/∂x
defines an invertible transformation in L(Rn) at the point (a, b), where
F(a, b) = 0, then there exist neighborhoods V0, W0 with a ∈ V0 ⊂ Rn and
b ∈ W0 ⊂ Rm and a continuously differentiable mapping G : W0 → V0 with
the property that F(x, y) = 0 for (x, y) ∈ V0 × W0 if and only if x = G(y).
In other words, F −1(0) ∩ (V0 × W0) is the graph of G,

F −1(0) ∩ (V0 × W0) = {(G(y), y) : y ∈ W0}.
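A standard illustration (added here, not part of the notes): take n = m = 1
and F(x, y) = x² + y² − 1 with (a, b) = (1, 0), so that F(a, b) = 0 and
∂F/∂x(a, b) = 2 ≠ 0. The theorem then produces G(y) = √(1 − y²) on a
neighborhood W0 of 0: near (1, 0), the zero set of F is exactly the graph
x = G(y).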

Proof. Define H : Ω → Rn+m by H(x, y) = (F(x, y), y) so that H is
continuously differentiable and the (n + m) × (n + m) Jacobian matrix
H′(a, b) in block form is

H′(a, b) = [ ∂F/∂x(a, b)   ∂F/∂y(a, b) ]
           [ 0m×n          Im×m        ],                  (0.2)

where 0m×n is an m × n matrix of all zeros and Im×m denotes the m × m
identity matrix. Thus by taking determinants, det(H′(a, b)) =
det(∂F/∂x(a, b)) ≠ 0, which shows that H′(a, b) is invertible. Alternatively
we can check that H′(a, b) is invertible by taking any vector (h, k) ∈ Rn+m
in the null space of H′(a, b) and verifying that (h, k) = (0, 0). Indeed,
conflating the matrices ∂F/∂x, ∂F/∂y with the linear transformations they
define, we have

(0, 0) = H′(a, b)(h, k) = ( ∂F/∂x(a, b)h + ∂F/∂y(a, b)k, k ),

and hence k = 0 by matching the Rm entries; inserting this into the Rn
entries then yields 0 = ∂F/∂x(a, b)h and hence h = 0.
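Either argument reflects the general fact (recorded here for convenience,
not in the notes) that a block upper triangular matrix with invertible upper
left block is invertible, with

[ A  B ]−1     [ A−1   −A−1 B ]
[ 0  I ]    =  [  0       I   ],

applied with A = ∂F/∂x(a, b) and B = ∂F/∂y(a, b).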
The inverse function theorem now furnishes neighborhoods U, W with
(a, b) ∈ U and (0, b) ∈ W such that H : U → W is a bijection with
continuously differentiable inverse. Shrinking U if necessary, we may assume it has
the product structure V0 × V1 where V0 ⊂ Rn , V1 ⊂ Rm are open in their
respective spaces.
We now write H −1 (x, y) = (A(x, y), B(x, y)) where A : W → V0 and
B : W → V1 . Hence
(x, y) = H(H −1 (x, y)) = H(A(x, y), B(x, y))
= (F (A(x, y), B(x, y)), B(x, y)) .
Identifying both sides of the Rm identities here, we obtain B(x, y) = y and
inserting this into the Rn identities, we obtain that
x = F(A(x, y), y).
We now simply define G(y) := A(0, y) and W0 := {y ∈ Rm : (0, y) ∈ W }∩V1
so that G : W0 → V0 . It is verified that W0 is open. Thus
(x, y) ∈ F −1 (0) ∩ V0 × W0 ⇔ H(x, y) = (0, y) and (x, y) ∈ V0 × W0
and by the definitions above, the latter is equivalent to (x, y) = H −1 (0, y) =
(G(y), y) when (x, y) ∈ V0 × W0 .
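The construction is again effective: for fixed y near b, the point x = G(y)
can be computed by solving F(x, y) = 0 in x, for instance by Newton's
method using the invertible block ∂F/∂x. A minimal Python sketch (the
example F, the starting point, and the tolerance are choices made here, not
taken from the notes):

import numpy as np

def implicit_G(F, dFdx, y, x0, tol=1e-12, max_iter=50):
    # Solve F(x, y) = 0 for x near x0 with y held fixed, using Newton's
    # method; each step solves a linear system with the block dF/dx.
    x = np.array(x0, dtype=float)
    for _ in range(max_iter):
        step = np.linalg.solve(dFdx(x, y), F(x, y))
        x = x - step
        if np.linalg.norm(step) < tol:
            break
    return x

# Example: F(x, y) = x**2 + y**2 - 1 near (a, b) = (1, 0), where G(y) = sqrt(1 - y**2).
F = lambda x, y: np.array([x[0]**2 + y**2 - 1.0])
dFdx = lambda x, y: np.array([[2.0 * x[0]]])
print(implicit_G(F, dFdx, y=0.3, x0=[1.0]))   # approximately [0.9539]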

Theorem 0.3 (Rank Theorem). Suppose Ω1 ⊂ Rn and Ω2 ⊂ Rm are open


subsets of their respective spaces and that F : Ω1 → Ω2 is a continuously
differentiable map. Suppose further that F′(z) has constant rank k for every
z ∈ Ω1. Then given z0 ∈ Ω1, there exist neighborhoods U, V of z0, F(z0)
respectively and bijections ϕ : U → ϕ(U ), ψ : V → ψ(V ) such that both
ϕ, ψ and their inverses are continuously differentiable with the property that
F (U ) ⊂ V and
ψ ◦ F ◦ ϕ−1 (x1 , . . . , xk , xk+1 , . . . , xn ) = (x1 , . . . , xk , 0, . . . , 0)


where in the last expression, the last m − k entries vanish.

Proof. Similar to the proof of the inverse function theorem, we may assume
that z0 = 0 and F (z0 ) = 0 as any needed translations can be absorbed in
ϕ, ψ. Moreover,

F′(0) = [ D1 f1(0)  · · ·  Dn f1(0) ]
        [     ⋮       ⋱        ⋮   ]
        [ D1 fm(0)  · · ·  Dn fm(0) ]

has some k × k minor with nonvanishing determinant. By permuting coor-


dinates in Rn and Rm, we may assume that this minor is in the upper left
corner, that is,

det [ D1 f1(0)  · · ·  Dk f1(0) ]
    [     ⋮       ⋱        ⋮   ]  ≠ 0.                  (0.3)
    [ D1 fk(0)  · · ·  Dk fk(0) ]
Indeed, as before, such permutations can be absorbed into ϕ, ψ. It is thus
natural to denote coordinates (x, y) ∈ Rn where x ∈ Rk , y ∈ Rn−k and
similarly (v, w) ∈ Rm where v ∈ Rk , w ∈ Rm−k . We now write

F (x, y) = (Q(x, y), R(x, y))

where Q : Ω1 → Rk and R : Ω1 → Rm−k are both continuously differentiable
maps.
Now define ϕ : Ω1 → Rn by ϕ(x, y) = (Q(x, y), y). Using notation similar
to (0.2) in the proof of the implicit function theorem, ϕ′(0, 0) is invertible
since

det ϕ′(0, 0) = det [ ∂Q/∂x(0, 0)   ∂Q/∂y(0, 0)  ]  = det( ∂Q/∂x(0, 0) ) ≠ 0,
                   [ 0(n−k)×k      I(n−k)×(n−k) ]

the last determinant being nonzero because it is exactly (0.3). The inverse
function theorem now furnishes open sets U, Ũ containing 0 such that
ϕ : U → Ũ is a bijection with ϕ, ϕ−1 both continuously differentiable.
Shrinking Ũ if necessary, we may assume that it is convex.
We now write

ϕ−1(x, y) = (A(x, y), B(x, y)),      A : Ũ → Rk ,  B : Ũ → Rn−k ,

with A, B continuously differentiable. Observe that by the definition of ϕ,

(x, y) = ϕ(A(x, y), B(x, y)) = ( Q(A(x, y), B(x, y)), B(x, y) ).

Thus by matching entries, we have that B(x, y) = y which implies that

ϕ−1 (x, y) = (A(x, y), y) and x = Q(A(x, y), y).

We may now write, for some Rm−k -valued function R̃ that is continuously
differentiable on Ũ,

F ◦ ϕ−1(x, y) = F(A(x, y), y) = (Q(A(x, y), y), R(A(x, y), y)) = (x, R̃(x, y)).

Hence

(F ◦ ϕ−1)′(x, y) = [ Ik×k           0k×(n−k)     ]
                   [ ∂R̃/∂x(x, y)   ∂R̃/∂y(x, y) ].                  (0.4)

But by the chain rule, for (x, y) ∈ Ũ, (F ◦ ϕ−1)′(x, y) has rank k. Indeed,

(F ◦ ϕ−1)′(x, y) = F′(ϕ−1(x, y)) (ϕ−1)′(x, y),

and since F′(z) has constant rank k and (ϕ−1)′(x, y) is invertible, this is the
composition of a rank k map with an invertible map, which yields a rank
k linear mapping. But given (0.4), the first k columns of (F ◦ ϕ−1)′(x, y)
are linearly independent, which means that ∂R̃/∂y = 0 on Ũ since otherwise
the matrix would have rank larger than k. But since Ũ is convex, we have
that R̃ is independent of y (cf. Theorem 9.19 and its corollary), that is,
R̃(x, y) = S(x) for some continuously differentiable function S defined on
the open set
Ṽ := {x ∈ Rk : (x, 0) ∈ Ũ }.
This now shows that (F ◦ ϕ−1 )(x, y) = (x, S(x)).
We finally define for v ∈ Ṽ , w ∈ Rm−k , ψ(v, w) = (v, w − S(v)). This de-
fines a continuously differentiable bijection with explicit inverse ψ −1 (s, t) =
(s, t + S(s)), which satisfies

(ψ ◦ F ◦ ϕ−1 )(x, y) = ψ(x, S(x)) = (x, S(x) − S(x)) = (x, 0),

where the latter entries in the last two expressions are in Rm−k . Defining
V to be the open set V := {(v, w) ∈ Rm : v ∈ Ṽ }, the proof is now
concluded.
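A small example (added for illustration, not part of the notes): take
n = m = 2 and F(x1, x2) = (x1, x1²), so that F′(x1, x2) = [ 1 0 ; 2x1 0 ] has
constant rank k = 1. Following the proof, Q(x, y) = x and R(x, y) = x², so
ϕ is the identity, R̃(x, y) = x² = S(x), and ψ(v, w) = (v, w − v²). Indeed,
(ψ ◦ F ◦ ϕ−1)(x1, x2) = ψ(x1, x1²) = (x1, 0), which is the asserted normal
form.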
