Inv FCN THM Notes s18
By continuity, there exists $r > 0$ such that $N_r(0) \subset \Omega$ and $\|H'(x)\| < \frac{1}{2}$ for
$x \in N_r(0)$. Since $N_r(0)$ is convex, Theorem 9.19 in Rudin implies that for all $x, x_0 \in N_r(0)$,
$$|H(x) - H(x_0)| \le \frac{1}{2}|x - x_0|. \tag{0.1}$$
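Next, recalling that $F(x) + H(x) = x$, we have that
$$|x - x_0| \le |F(x) - F(x_0)| + |H(x) - H(x_0)| \le |F(x) - F(x_0)| + \frac{1}{2}|x - x_0|,$$
so that $|x - x_0| \le 2|F(x) - F(x_0)|$.
This now shows that restricting $F$ to $N_r(0)$ yields an injective map, for if
$F(x) = F(x_0)$ with $x, x_0 \in N_r(0)$, then $x = x_0$.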
Next, we have to show that $F$ is surjective near the origin. We thus
take $y \in N_{r/2}(0)$ and want to show that there exists $x \in N_r(0)$ such that
$F(x) = y$. To this end, define $G(x) := x + y - F(x) = H(x) + y$ and observe
that $G$ has a fixed point in $N_r(0)$ if and only if $F(x) = y$ has a solution on
this set:
$$G(x) = x \iff x + y - F(x) = x \iff F(x) = y.$$
in fact $|x| < r$, so the equation $y = F(x)$ is solved by some
$x \in N_r(0)$.
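As a minimal numerical sketch of the fixed point argument, the iteration $x \mapsto G(x) = x + y - F(x)$ can be run directly. The map $F(x) = x + x^2/4$ on $\mathbb{R}$ is a hypothetical example chosen only for illustration (it does not appear in these notes); for it $H(x) = -x^2/4$ and $|H'(x)| < 1/2$ on $N_1(0)$, so one may take $r = 1$ and any $|y| < 1/2$.

```python
import math


def F(x):
    # Hypothetical example map with F(0) = 0 and F'(0) = 1 (an assumption for illustration).
    return x + x * x / 4.0


def solve_by_fixed_point(y, tol=1e-12, max_iter=200):
    """Solve F(x) = y by iterating G(x) = x + y - F(x), starting from x = 0."""
    x = 0.0
    for _ in range(max_iter):
        x_next = x + y - F(x)  # G(x) = H(x) + y
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    raise RuntimeError("fixed-point iteration did not converge")


y = 0.4  # a point of N_{r/2}(0) with r = 1
x = solve_by_fixed_point(y)

# Closed-form solution of x + x^2/4 = y on the branch through 0, for comparison.
exact = -2.0 + 2.0 * math.sqrt(1.0 + y)
print(x, exact, F(x) - y)
```

The iterate stays inside $N_1(0)$ and agrees with the closed-form root, which is what the contraction estimate (0.1) predicts for this example.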
The first part of the theorem is concluded by setting $V = N_{r/2}(0)$ and
$U = N_r(0) \cap F^{-1}(N_{r/2}(0))$, which are open sets since $F$ is continuous.
Moreover, the properties established above ensure that $F : U \to V$ is a
bijection.
We now prove the second half of the theorem, that $F^{-1} : V \to U$ is
continuously differentiable. Note that since $\|F'(x) - I\| = \|H'(x)\| < 1/2$
on $U$, $F'(x)$ is invertible for $x \in U$ by Theorem 9.8(a) in Rudin. Here
it is sufficient to show that if $y \in V$, then $(F^{-1})'(y)$ exists and is equal
to $(F'(x))^{-1}$, where $y = F(x)$. Indeed, as soon as we establish that $F^{-1}$ is
differentiable on $V$, we know that $x = F^{-1}(y)$ defines $x$ as a continuous
function of $y$, so by continuity of inversion (Theorem 9.8(b)), $y \mapsto
(F'(F^{-1}(y)))^{-1}$ is a composition of continuous maps, which shows that
$F^{-1}$ is continuously differentiable. Alternatively, this can be seen by using
matrices: since the entries of $(F'(x))^{-1}$ are rational functions of the entries
of $F'(x)$ and the partial derivatives $D_j f_i(x)$ are continuous functions, again
there is continuous dependence of the entries of $(F^{-1})'(y)$ on $y$.
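To make the rational dependence concrete in the smallest nontrivial case, here is a sketch for $2 \times 2$ matrices using the adjugate formula. The helper name `inverse_2x2` and the sample matrix are assumptions made for illustration, not part of the notes.

```python
def inverse_2x2(a, b, c, d):
    """Return the inverse of [[a, b], [c, d]] as a tuple (p, q, r, s) meaning [[p, q], [r, s]]."""
    det = a * d - b * c
    if det == 0:
        raise ZeroDivisionError("matrix is not invertible")
    # Each entry is a polynomial in (a, b, c, d) divided by det,
    # hence a continuous function of the entries wherever det != 0.
    return (d / det, -b / det, -c / det, a / det)


base = inverse_2x2(2.0, 1.0, 1.0, 1.0)
perturbed = inverse_2x2(2.0 + 1e-6, 1.0, 1.0, 1.0)

# A small change in one entry produces a comparably small change in the inverse.
max_change = max(abs(p - q) for p, q in zip(base, perturbed))
print(base, max_change)
```

The same reasoning applies in any dimension through Cramer's rule, since the determinant and cofactors are polynomials in the matrix entries.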
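Recall that if $R(h) := F(x + h) - F(x) - F'(x)h$, then $|R(h)| = o(|h|)$ as
$h \to 0$. However, here we want to define $h$ as a function of $k$ by the relation
$$h = F^{-1}(y + k) - F^{-1}(y) = F^{-1}(y + k) - x,$$
so that
$$F(x + h) = y + k = F(x) + k.$$
Since $G(x) = x + y - F(x)$ gives $G(x + h) - G(x) = h - k$, and $G$ differs from $H$ by a constant, the bound (0.1) shows that
$$|h - k| = |G(x + h) - G(x)| \le \frac{1}{2}|h|.$$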
Hence
$$|h| \le |k| + |h - k| \le |k| + \frac{1}{2}|h|,$$
or equivalently, $|h| \le 2|k|$. This shows that $h \to 0$ as $k \to 0$ and that when
$h \ne 0$, $\frac{1}{|k|} \le \frac{2}{|h|}$.
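We now conclude by considering
$$F^{-1}(y + k) - F^{-1}(y) - (F'(x))^{-1}k = h - (F'(x))^{-1}k = -(F'(x))^{-1}\bigl(F(x + h) - F(x) - F'(x)h\bigr).$$
Hence
$$\frac{|F^{-1}(y + k) - F^{-1}(y) - (F'(x))^{-1}k|}{|k|} \le 2\|(F'(x))^{-1}\| \, \frac{|F(x + h) - F(x) - F'(x)h|}{|h|},$$
and since $h \to 0$ as $k \to 0$, the right hand side of this inequality tends to $0$
as $k \to 0$. This shows that $F^{-1}$ is differentiable at $y$ with $(F^{-1})'(y) = (F'(x))^{-1}$.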
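The conclusion can be sanity-checked numerically in one variable. The sketch below assumes the hypothetical example $F(x) = x + x^2/4$ (not taken from the notes), whose inverse near $0$ has the closed form $F^{-1}(y) = -2 + 2\sqrt{1 + y}$, and compares a central difference quotient of $F^{-1}$ with $1/F'(x)$. It also checks the bound $|h| \le 2|k|$ for one sample increment.

```python
import math


def F(x):
    # Hypothetical example map (an assumption for illustration).
    return x + x * x / 4.0


def F_prime(x):
    return 1.0 + x / 2.0


def F_inv(y):
    # Closed-form inverse on the branch through 0, valid for y > -1.
    return -2.0 + 2.0 * math.sqrt(1.0 + y)


y = 0.4
x = F_inv(y)
k = 1e-6

# h as a function of k, as in the proof: h = F^{-1}(y + k) - F^{-1}(y).
h = F_inv(y + k) - F_inv(y)

# Central difference approximation of (F^{-1})'(y) versus the predicted value 1 / F'(x).
difference_quotient = (F_inv(y + k) - F_inv(y - k)) / (2.0 * k)
predicted = 1.0 / F_prime(x)
print(difference_quotient, predicted, abs(h) <= 2.0 * abs(k))
```

A finite difference check of this kind only illustrates the formula at one point; it is not a substitute for the estimate above.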
so that $\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$ are $n \times n$ and $n \times m$ matrices respectively. If $\frac{\partial F}{\partial x}$ defines
an invertible transformation in $L(\mathbb{R}^n)$ at the point $(a, b)$, where $F(a, b) = 0$,
then there exist neighborhoods $V_0$, $W_0$ with $a \in V_0 \subset \mathbb{R}^n$ and $b \in W_0 \subset \mathbb{R}^m$
and a continuously differentiable mapping $G : W_0 \to V_0$ with the property
that $F(x, y) = 0$ for $(x, y) \in V_0 \times W_0$ if and only if $x = G(y)$. In other
words, $F^{-1}(0) \cap (V_0 \times W_0)$ is the graph of $G$,
$$F^{-1}(0) \cap (V_0 \times W_0) = \{(G(y), y) : y \in W_0\}.$$
which shows that $H'(a, b)$ is invertible. Alternatively we can check that
$H'(a, b)$ is invertible by taking any vector $(h, k) \in \mathbb{R}^{n+m}$ in the null space of
$H'(a, b)$ and verifying that $(h, k) = (0, 0)$. Indeed, conflating the matrices
$\frac{\partial F}{\partial x}$, $\frac{\partial F}{\partial y}$ with the linear transformations they define, we have
$$(0, 0) = H'(a, b)(h, k) = \left(\frac{\partial F}{\partial x}(a, b)h + \frac{\partial F}{\partial y}(a, b)k, \; k\right),$$
and hence $k = 0$ by matching the $\mathbb{R}^m$ entries. Inserting this into
the $\mathbb{R}^n$ entries yields $0 = \frac{\partial F}{\partial x}(a, b)h$, and hence $h = 0$ since $\frac{\partial F}{\partial x}(a, b)$ is invertible.
The inverse function theorem now furnishes neighborhoods $U$, $W$ with
$(a, b) \in U$ and $(0, b) \in W$ such that $H : U \to W$ is a bijection with continuously
differentiable inverse. Shrinking $U$ if necessary, we may assume it has
the product structure $V_0 \times V_1$ where $V_0 \subset \mathbb{R}^n$, $V_1 \subset \mathbb{R}^m$ are open in their
respective spaces.
We now write $H^{-1}(x, y) = (A(x, y), B(x, y))$ where $A : W \to V_0$ and
$B : W \to V_1$. Hence
$$(x, y) = H(H^{-1}(x, y)) = H(A(x, y), B(x, y)) = \bigl(F(A(x, y), B(x, y)), \; B(x, y)\bigr).$$
Matching the $\mathbb{R}^m$ components on both sides, we obtain $B(x, y) = y$, and
inserting this into the $\mathbb{R}^n$ components, we obtain that
$$x = F(A(x, y), y).$$
We now simply define $G(y) := A(0, y)$ and $W_0 := \{y \in \mathbb{R}^m : (0, y) \in W\} \cap V_1$
so that $G : W_0 \to V_0$. The set $W_0$ is open, since it is the intersection of $V_1$ with the preimage of the open set $W$ under the continuous map $y \mapsto (0, y)$. Thus
$$(x, y) \in F^{-1}(0) \cap (V_0 \times W_0) \iff H(x, y) = (0, y) \text{ and } (x, y) \in V_0 \times W_0,$$
and by the definitions above, the latter is equivalent to $(x, y) = H^{-1}(0, y) =
(G(y), y)$ when $(x, y) \in V_0 \times W_0$.
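The construction $H(x, y) = (F(x, y), y)$, $G(y) = A(0, y)$ can be traced on an explicit example. The sketch below assumes $n = m = 1$ and the hypothetical choice $F(x, y) = x^2 + y^2 - 1$ with $(a, b) = (1, 0)$, where $\frac{\partial F}{\partial x}(a, b) = 2$ is invertible; the closed-form `H_inv` is specific to this example and is valid only near $(0, 0)$ on the branch $x > 0$.

```python
import math


def F(x, y):
    # Hypothetical example: the unit circle as the zero set of F (an assumption for illustration).
    return x * x + y * y - 1.0


def H(x, y):
    # The auxiliary map from the proof: H(x, y) = (F(x, y), y).
    return (F(x, y), y)


def H_inv(u, y):
    # Local inverse of H near (0, 0), on the branch x > 0: H_inv(u, y) = (A(u, y), y).
    return (math.sqrt(1.0 + u - y * y), y)


def G(y):
    # G(y) := A(0, y), the first component of H_inv(0, y).
    return H_inv(0.0, y)[0]


y = 0.3
u_back, y_back = H(*H_inv(0.1, 0.2))  # should reproduce (0.1, 0.2)
print(G(y), F(G(y), y), (u_back, y_back))
```

Here $G(y) = \sqrt{1 - y^2}$, so the zero set of $F$ near $(1, 0)$ is the graph of $G$, as the theorem asserts.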
Proof. Similar to the proof of the inverse function theorem, we may assume
that $z_0 = 0$ and $F(z_0) = 0$ as any needed translations can be absorbed in
$\varphi$, $\psi$. Moreover,
$$F'(0) = \begin{pmatrix} D_1 f_1(0) & \cdots & D_n f_1(0) \\ \vdots & \ddots & \vdots \\ D_1 f_m(0) & \cdots & D_n f_m(0) \end{pmatrix}$$
since the last quantity is (0.3). The inverse function theorem now furnishes
open sets $U$, $\tilde{U}$ containing $0$ such that $\varphi : U \to \tilde{U}$ is a bijection and
$\varphi$, $\varphi^{-1}$ are continuously differentiable. Shrinking $\tilde{U}$ if necessary, we may
assume that it is convex.
We now write
Thus by matching entries, we have that $B(x, y) = y$, which implies that
We may now write, for some $\mathbb{R}^{m-k}$-valued function $\tilde{R}$, continuously
differentiable on $\tilde{U}$,
$$F \circ \varphi^{-1}(x, y) = F(A(x, y), y) = (Q(A(x, y), y), R(A(x, y), y)) = (x, \tilde{R}(x, y)).$$
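Hence
$$(F \circ \varphi^{-1})'(x, y) = \begin{bmatrix} I_{k \times k} & 0_{k \times (n-k)} \\ \frac{\partial \tilde{R}}{\partial x}(x, y) & \frac{\partial \tilde{R}}{\partial y}(x, y) \end{bmatrix}. \tag{0.4}$$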
But by the chain rule, for $(x, y) \in \tilde{U}$, $(F \circ \varphi^{-1})'(x, y)$ has rank $k$. Indeed,
$$(F \circ \varphi^{-1})'(x, y) = F'(\varphi^{-1}(x, y)) \, (\varphi^{-1})'(x, y),$$
and since $F'(z)$ has constant rank $k$ and $(\varphi^{-1})'(x, y)$ is invertible, this is the
composition of a rank $k$ map with an invertible map, which yields a rank
$k$ linear mapping. But given (0.4), the first $k$ columns of $(F \circ \varphi^{-1})'(x, y)$
are linearly independent, which means that $\frac{\partial \tilde{R}}{\partial y} = 0$ on $\tilde{U}$ since otherwise
the matrix would have rank larger than $k$. Since $\tilde{U}$ is convex, it follows
that $\tilde{R}$ is independent of $y$ (cf. Theorem 9.19 and its corollary), that is,
$\tilde{R}(x, y) = S(x)$ for some continuously differentiable function $S$ defined on
the open set
$$\tilde{V} := \{x \in \mathbb{R}^k : (x, 0) \in \tilde{U}\}.$$
This now shows that $(F \circ \varphi^{-1})(x, y) = (x, S(x))$.
We finally define, for $v \in \tilde{V}$, $w \in \mathbb{R}^{m-k}$, $\psi(v, w) = (v, w - S(v))$. This
defines a continuously differentiable bijection with explicit inverse $\psi^{-1}(s, t) =
(s, t + S(s))$, which satisfies
$$\psi \circ F \circ \varphi^{-1}(x, y) = \psi(x, S(x)) = (x, S(x) - S(x)) = (x, 0),$$
where the latter entries in the last two expressions are in $\mathbb{R}^{m-k}$. Defining
$V$ to be the open set $V := \{(v, w) \in \mathbb{R}^m : v \in \tilde{V}\}$, the proof is now
concluded.
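The changes of variables in the proof can be followed on a small explicit example. The sketch below assumes $n = m = 2$, $k = 1$ and the hypothetical map $F(x, y) = (x + y, (x + y)^2)$, whose derivative has rank $1$ everywhere; the names `phi`, `phi_inv`, `S` and `psi` mirror the symbols in the proof, and their formulas are specific to this example.

```python
def F(x, y):
    # Hypothetical constant-rank map (an assumption for illustration): F = (Q, R)
    # with Q(x, y) = x + y and R(x, y) = (x + y)^2.
    q = x + y
    return (q, q * q)


def phi(x, y):
    # phi(x, y) = (Q(x, y), y); its derivative [[1, 1], [0, 1]] is invertible.
    return (x + y, y)


def phi_inv(s, t):
    # Inverse of phi: phi_inv(s, t) = (A(s, t), t) with A(s, t) = s - t.
    return (s - t, t)


def S(s):
    # F(phi_inv(s, t)) = (s, S(s)) with S(s) = s^2, independent of t.
    return s * s


def psi(v, w):
    # psi(v, w) = (v, w - S(v)), with inverse (s, t) -> (s, t + S(s)).
    return (v, w - S(v))


s, t = 0.3, -0.2
straightened = F(*phi_inv(s, t))  # expected to equal (s, S(s)) up to rounding
normal_form = psi(*straightened)  # expected to equal (s, 0) up to rounding
print(straightened, normal_form)
```

In these coordinates the map becomes the linear projection $(s, t) \mapsto (s, 0)$, which is the normal form the proof produces.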