Chapter 3. The Contraction Mapping Principle (2024)
In Rn a basic property is that every Cauchy sequence converges. This property is called
the completeness of the Euclidean space. The notion of a Cauchy sequence is well-defined
in a metric space. Indeed, a sequence {xn } in (X, d) is a Cauchy sequence if for every
ε > 0, there exists some n0 such that d(xn , xm ) < ε, for all n, m ≥ n0 . A metric space
(X, d) is complete if every Cauchy sequence in it converges. A subset E is complete if
(E, d|E×E) is complete or, equivalently, every Cauchy sequence in E converges with limit
in E.
The following proposition shows that, in a complete metric space, a subset is complete
if and only if it is closed: (a) every complete subset of a metric space is closed, and
(b) every closed subset of a complete metric space is complete.
Proof. (a) Let E ⊂ X be complete and {xn} a sequence in E converging to some x in X.
Since every convergent sequence is a Cauchy sequence, {xn} is a Cauchy sequence in E, and
by the completeness of E it converges to some z in E. By the uniqueness of limits, we must
have x = z ∈ E, so E is closed.
(b) Let (X, d) be complete and E a closed subset of X. Every Cauchy sequence {xn } in
E is also a Cauchy sequence in X. By the completeness of X, there is some x in X to
which {xn } converges. However, as E is closed, x also belongs to E. So every Cauchy
sequence in E has a limit in E.
Example 3.1. In MATH 2050 it was shown that the space R is complete. Consequently, as
the closed subsets in R, the intervals [a, b], (−∞, b] and [a, ∞) are all complete sets. In
contrast, the set [a, b), b ∈ R, is not complete. For, simply observe that the sequence
{b − 1/k}, k ≥ k0 , for some large k0 , is a Cauchy sequence in [a, b) and yet it does not have
a limit in [a, b) (the limit is b, which lies outside [a, b)). The set of all rational numbers,
Q, is also not complete. Every irrational number is the limit of some sequence in Q, and
these sequences are Cauchy sequences whose limits lie outside Q.
Example 3.2. In MATH 2060 we learned that every Cauchy sequence in C[a, b] with respect
to the sup-norm converges uniformly, so the limit is again continuous.
Therefore, C[a, b] is a complete space. The subset E = {f : f (x) ≥ 0, ∀x} is also
complete. Indeed, let {fn} be a Cauchy sequence in E; it is also a Cauchy sequence in
C[a, b], and hence there exists some f ∈ C[a, b] such that {fn} converges to f uniformly.
As uniform convergence implies pointwise convergence, f (x) = limn→∞ fn (x) ≥ 0, so f
belongs to E, and E is complete. Next, let P [a, b] be the collection of all polynomials
restricted to [a, b]. It is not complete. For, take the sequence hn given by

hn (x) = Σ_{k=0}^{n} x^k / k! ,

which converges uniformly on [a, b] to the exponential function e^x. Thus {hn} is a Cauchy
sequence in P [a, b] whose uniform limit is not a polynomial, so P [a, b] is not complete.
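The failure of completeness can be observed numerically. The Python sketch below is our own illustration; the choice [a, b] = [0, 1] and the sampled approximation of the sup-norm are assumptions made for the demonstration. It shows the partial sums hn clustering together in the (approximate) sup-norm while their uniform limit is e^x.

```python
import math

def h(n, x):
    """Partial sum h_n(x) = sum_{k=0}^n x^k / k! of the exponential series."""
    return sum(x ** k / math.factorial(k) for k in range(n + 1))

def sup_dist(f, g, a=0.0, b=1.0, m=1000):
    """Sampled approximation of the sup-norm distance between f and g on [a, b]."""
    pts = [a + (b - a) * i / m for i in range(m + 1)]
    return max(abs(f(x) - g(x)) for x in pts)

# successive partial sums cluster together: a Cauchy sequence in P[0, 1] ...
d_5_10 = sup_dist(lambda x: h(5, x), lambda x: h(10, x))
d_10_20 = sup_dist(lambda x: h(10, x), lambda x: h(20, x))
# ... whose uniform limit is e^x, which lies outside P[0, 1]
d_20_exp = sup_dist(lambda x: h(20, x), math.exp)
```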
To obtain a typical non-complete set, we consider a closed interval [a, b] in R. Take
away one point z from it to form E = [a, b] \ {z}. E is not complete, since every sequence
in E converging to z is a Cauchy sequence which does not converge in E. In general,
you may think of sets with “holes” as non-complete ones. Now, given a non-complete
metric space, can we make it into a complete metric space by filling out all the holes?
The answer turns out to be affirmative. We can always enlarge a non-complete metric space
into a complete one by putting in sufficiently many ideal points.
Proof. Let T be a contraction in the complete metric space (X, d). Pick an arbitrary
x0 ∈ X and define a sequence {xn } by setting xn = T xn−1 = T n x0 , ∀n ≥ 1. We claim
that {xn } forms a Cauchy sequence in X. First of all, by iteration we have
d(xn , xN ) = d(T^n x0 , T^N x0 )
 ≤ d(T^n x0 , T^{n−1} x0 ) + · · · + d(T^{N+1} x0 , T^N x0 )
 = Σ_{j=0}^{n−N−1} d(T^{N+j+1} x0 , T^{N+j} x0 )
 ≤ γ^N d(T x0 , x0 ) Σ_{j=0}^{n−N−1} γ^j   (using d(T^{k+1} x0 , T^k x0 ) ≤ γ^k d(T x0 , x0 ))
 < γ^N d(T x0 , x0 ) Σ_{j=0}^{∞} γ^j
 = (d(T x0 , x0 )/(1 − γ)) γ^N .   (3.2)
For ε > 0, choose N so large that d(T x0 , x0 )γ N /(1 − γ) < ε/2. Then for n, m ≥ N ,
d(xn , xm ) ≤ d(xn , xN ) + d(xN , xm )
 < (2 d(T x0 , x0 )/(1 − γ)) γ^N
 < ε,
thus {xn } forms a Cauchy sequence. As X is complete, x = limn→∞ xn exists. By the
continuity of T , limn→∞ T xn = T x. But on the other hand, limn→∞ T xn = limn→∞ xn+1 =
x. We conclude that T x = x.
Suppose there is another fixed point y ∈ X. From
d(x, y) = d(T x, T y)
≤ γd(x, y),
and γ ∈ (0, 1), we conclude that d(x, y) = 0, i.e., x = y.
Incidentally, we point out that this proof is a constructive one. It tells you how to
find the fixed point starting from an arbitrary point. In fact, letting n → ∞ in (3.2)
and then replacing N by n, we obtain an error estimate between the fixed point and the
approximating sequence {xn }:
d(x, xn ) ≤ (d(T x0 , x0 )/(1 − γ)) γ^n ,  n ≥ 1.
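The constructive nature of the proof is easy to see in practice. In the Python sketch below (our own example, not from the text) we take T = cos, which is a contraction on [0.6, 0.9] with γ = sin 0.9 < 1, and check the error estimate along the whole orbit.

```python
import math

def orbit(T, x0, n):
    """Return the orbit x0, Tx0, T^2 x0, ..., T^n x0 of the iteration."""
    xs = [x0]
    for _ in range(n):
        xs.append(T(xs[-1]))
    return xs

# cos maps [0.6, 0.9] into itself and |cos'| = |sin| <= sin(0.9) < 1 there
T, gamma, x0 = math.cos, math.sin(0.9), 0.6
xs = orbit(T, x0, 200)
x_star = xs[-1]                        # numerical fixed point: cos(x*) = x*

# a priori estimate: d(x, x_n) <= gamma^n d(Tx0, x0) / (1 - gamma)
c = abs(T(x0) - x0) / (1 - gamma)
bound_holds = all(abs(x_star - xs[n]) <= c * gamma ** n + 1e-12
                  for n in range(len(xs)))
```

The estimate tells us in advance how many iterations guarantee a prescribed accuracy, before any computation is done.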
The following two examples demonstrate the sharpness of the Contraction Mapping
Principle.
Example 3.3. Consider the map T x = x/2 which maps (0, 1] to itself. It is clearly a
contraction. If T x = x, then x = x/2 which implies x = 0. Thus T does not have a fixed
point in (0, 1]. This example shows that completeness of the underlying space cannot be
removed from the assumption of the theorem.
Next, consider the map S : R → R given by

Sx = x − log (1 + e^x ) .

We have

dS/dx = 1/(1 + e^x ) ∈ (0, 1) , ∀x.
By the Mean-Value Theorem, for some z lying between x and y,
|Sx − Sy| = (1/(1 + e^z )) |x − y| < |x − y| .
However, in view of (1+ez )−1 → 1 as x, y → −∞, it is impossible to find a single γ ∈ (0, 1)
to satisfy
|Sx − Sy| ≤ γ|x − y| , ∀x, y .
It is easy to see that S admits no fixed point: Sx = x would force log(1 + e^x ) = 0, which
is impossible since 1 + e^x > 1. Therefore, the contraction condition cannot be removed
from the assumptions of the theorem.
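A quick numerical experiment (our own check) makes the failure visible: the iterates of S decrease without bound instead of converging, even though successive distances keep shrinking. Indeed, e^{−S(x)} = e^{−x}(1 + e^x) = e^{−x} + 1, so starting from x0 = 0 the n-th iterate is exactly −log(n + 1).

```python
import math

def S(x):
    """S(x) = x - log(1 + e^x): distance-decreasing on R, yet without fixed point."""
    return x - math.log1p(math.exp(x))

xs = [0.0]
for _ in range(2000):
    xs.append(S(xs[-1]))

strictly_decreasing = all(b < a for a, b in zip(xs, xs[1:]))
gaps = [a - b for a, b in zip(xs, xs[1:])]   # successive distances shrink ...
# ... but the orbit escapes to -infinity: x_n = -log(n + 1)
```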
Consider next a continuously differentiable function f from [0, 1] to itself with |f ′(t)| < 1
for every t. By the mean value theorem, |f (x) − f (y)| ≤ γ|x − y|, where γ = sup_{t∈[0,1]} |f ′ (t)| < 1
(Why?). We see that f is a contraction on the complete space [0, 1]. By the Contraction
Mapping Principle, it has a unique fixed point.
In fact, by using the intermediate value theorem one can show that every continuous function
from [0, 1] to itself admits at least one fixed point. More generally, according to
Brouwer’s Fixed Point Theorem, every continuous map from a compact convex set in R^n
to itself admits at least one fixed point.
Our applications of the fixed point theorem are mainly concerned with solving equations
of a certain form. Let us first recall what solving an equation means. Here are some
examples:
• Solve 2x − 5 = 0.
• Solve x^2 − 3x + 5 = 0.
• Solve x − y + 12 = 0, 3x + 5y = 0 .
• Solve y′ = x^2 y^3 + cos x, y(0) = 0.
All these equations can be formulated as some mappings from a metric space to itself.
For instance, in the second case we take T x = x^2 − 3x and X = R. Then the equation
becomes T x = −5, thus solving the equation means to find the preimage of −5 under T .
In the third case we take S(x, y) = (x − y, 3x + 5y), which maps R^2 to itself. Solving the
system means to determine S^{−1} (−12, 0). There are other choices of the map; say, if we
let S1 (x, y) = (x − y + 12, 3x + 5y), then we need to find S1^{−1} (0, 0). In the fourth case,
first observe that the initial value problem is equivalent to solving the integral equation

y(x) = ∫_0^x t^2 y^3 (t) dt + sin x.

Hence, taking Φ(y) = y(x) − ∫_0^x t^2 y^3 (t) dt and X = C[a, b], solving the differential equation
means to determine Φ^{−1} (sin x). Here, in addition to the requirement a < 0 < b, there are
also some technical conditions to ensure that Φ really maps X to X.
All in all, we have seen that solving equations, algebraic or differential alike, means to
determine the preimage of a given map on some metric space.
For instance, consider the equation x = 1/2 + (1/8) cos 5x and define T x = 1/2 + (1/8) cos 5x
on R. For x, x0 ∈ R, by the mean value theorem,

|T x − T x0 | = |(1/2 + (1/8) cos 5x) − (1/2 + (1/8) cos 5x0 )|
 ≤ (1/8) |cos 5x − cos 5x0 |
 = (1/8) |−5 sin 5z| |x − x0 |
 ≤ (5/8) |x − x0 | ,
where z lies between x and x0 . By appealing to the contraction mapping principle, we
conclude that T has a unique fixed point, which is the solution to the equation.
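Assuming, as the estimate above suggests, that the map in question is T x = 1/2 + (1/8) cos 5x, the unique fixed point can be approximated by simple iteration from any starting point; here is a minimal Python sketch.

```python
import math

def T(x):
    """T(x) = 1/2 + cos(5x)/8, a contraction on R with constant 5/8."""
    return 0.5 + math.cos(5 * x) / 8

x = 0.0
for _ in range(100):
    x = T(x)          # x_n converges to the fixed point at rate (5/8)^n
```

Since T maps R into [3/8, 5/8], the fixed point must lie in that interval.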
This example is a rather straightforward application of the fixed point theorem. Now
we describe a common situation where the theorem can be applied. Let (X, ‖ · ‖) be a
normed space and Φ : X → X a map satisfying Φ(x0 ) = y0 . We ask: is the equation
Φ(x) = y locally solvable? That is, for all y sufficiently near y0 , is there some x close
to x0 so that Φ(x) = y holds? We have the following result.
The idea of the following proof can be explained in a few words. Taking x0 = y0 = 0
for simplicity, we would like to find x solving x + Ψ(x) = y. This is equivalent to finding
a fixed point for the map T , T x + Ψ(x) = y, that is, T x = y − Ψ(x). By our assumption,
Ψ is a contraction, so is T .
Proof. We first shift the points x0 and y0 to 0 by redefining Φ. Indeed, for x ∈ Br (0), let

Φ̃(x) = Φ(x + x0 ) − Φ(x0 ) = x + Ψ(x + x0 ) − Ψ(x0 ) .

Then Φ̃(0) = 0. Consider the map on Br (0) given by

T x = x − (Φ̃(x) − y),  y ∈ BR (0) .
We would like to verify that T is a well-defined contraction on Br (0). First, we claim that
T maps Br (0) into itself. Indeed,
‖T x‖ = ‖x − (Φ̃(x) − y)‖
 = ‖Ψ(x0 ) − Ψ(x0 + x) + y‖
 ≤ ‖Ψ(x0 + x) − Ψ(x0 )‖ + ‖y‖
 ≤ γ‖x‖ + R
 ≤ r.
Next, for x1 , x2 ∈ Br (0),

‖T x2 − T x1 ‖ = ‖Ψ(x1 + x0 ) − Ψ(x2 + x0 )‖ ≤ γ‖x2 − x1 ‖ ,

so T is a contraction on Br (0).
As Br (0) is a closed subset of the complete space X, it is also complete. The Contraction
Mapping Principle can be applied to conclude that for each y ∈ BR (0), there is a unique
fixed point for T , T x = x, in Br (0). In other words, Φ̃(x) = y for a unique x ∈ Br (0).
The desired conclusion follows after going back to Φ.
‖T x‖ = ‖x − (Φ̃(x) − y)‖
 = ‖Ψ(x0 ) − Ψ(x0 + x) + y‖
 ≤ ‖Ψ(x0 + x) − Ψ(x0 )‖ + ‖y‖
 < γ‖x‖ + R
 ≤ r.
It follows that the preimage x which satisfies T x = x belongs to Br (0).
(c) The inverse map that sends y ∈ BR (y0 ) back to x ∈ Br (x0 ), the fixed point of T , is
well-defined. Denote it by Φ−1 . We claim that it is continuous. For, let y1 , y2 ∈ BR (y0 ).
Then xi = Φ^{−1} (yi ) satisfy xi = yi − Ψ(xi ), i = 1, 2, that is,

‖Φ^{−1} (y1 ) − Φ^{−1} (y2 )‖ = ‖y1 − Ψ(x1 ) − (y2 − Ψ(x2 ))‖
 ≤ ‖y1 − y2 ‖ + ‖Ψ(x2 ) − Ψ(x1 )‖
 ≤ ‖y1 − y2 ‖ + γ‖x1 − x2 ‖
 = ‖y1 − y2 ‖ + γ‖Φ^{−1} (y1 ) − Φ^{−1} (y2 )‖ ,

which implies

‖Φ^{−1} (y1 ) − Φ^{−1} (y2 )‖ ≤ (1/(1 − γ)) ‖y1 − y2 ‖ .
It follows that Φ−1 is uniformly continuous (in fact, “Lipschitz continuous”) in BR (y0 ).
Obviously, the terminology “perturbation of identity” comes from the expression

Φ̃(x) = Φ(x + x0 ) − Φ(x0 ) = x + Ψ(x + x0 ) − Ψ(x0 ) ,

which is in the form of the identity plus a term satisfying the “smallness condition”

‖Ψ(x + x0 ) − Ψ(x0 )‖ ≤ γ‖x‖ ,  γ ∈ (0, 1) .
Example 3.7. Show that the equation 3x4 − x2 + x = −0.05 has a real root. We look
for a solution near 0. Let X be R and Φ(x) = x + Ψ(x) where Ψ(x) = 3x4 − x2 so
that Φ(0) = 0. According to the theorem, we need to find some r so that Ψ becomes a
contraction. For x1 , x2 ∈ Br (0), that is, x1 , x2 ∈ [−r, r], we have

|Ψ(x1 ) − Ψ(x2 )| = |(3x2^4 − x2^2 ) − (3x1^4 − x1^2 )|
 ≤ (3|x2^3 + x2^2 x1 + x2 x1^2 + x1^3 | + |x2 + x1 |) |x2 − x1 |
 ≤ (12r^3 + 2r) |x2 − x1 | ,
so Ψ is a contraction on [−r, r] as long as γ = 12r^3 + 2r < 1. Taking r = 1/4, γ = 11/16 < 1
will do the job. Then R = (1 − γ)r = 5/64 ∼ 0.078. We conclude that for all numbers
b, |b| < 5/64, the equation 3x4 − x2 + x = b has a unique root in (−1/4, 1/4). Now, −0.05
falls into this range, so the equation has a root.
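The proof of the perturbation theorem is again constructive: iterating T x = b − Ψ(x) from the centre of the ball produces the root. A Python sketch of this example:

```python
def Psi(x):
    """Psi(x) = 3x^4 - x^2, so that Phi(x) = x + Psi(x) = 3x^4 - x^2 + x."""
    return 3 * x ** 4 - x ** 2

b = -0.05                  # |b| < 5/64, so the theorem applies
x = 0.0                    # centre of B_r(0), r = 1/4
for _ in range(200):
    x = b - Psi(x)         # contraction with gamma = 11/16 on [-1/4, 1/4]

residual = abs(3 * x ** 4 - x ** 2 + x - b)
```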
Example 3.8. Solve
x − 3 sin2 (x − 1) = 1.01 .
Here we take Φ(x) = x − 3 sin^2 (x − 1), x0 = 1, y0 = 1, so that Ψ(x) = −3 sin^2 (x − 1).
Using sin^2 (x − 1) − sin^2 (x0 − 1) = 2 sin(z − 1) cos(z − 1)(x − x0 ), where z lies between
x and x0 , we have, for x, x0 ∈ [1 − r, 1 + r],

|Ψ(x) − Ψ(x0 )| = 3| sin 2(z − 1)| |x − x0 | ≤ 6r |x − x0 | .
We take r = 1/7, so that γ = 6/7 and R = (1 − 6/7) · (1/7) = 1/49. We conclude that the
equation has a unique solution x ∈ [1 − 1/7, 1 + 1/7] whenever y ∈ [1 − 1/49, 1 + 1/49] ∼
[1 − 0.02, 1 + 0.02]; in particular, it applies to y = 1.01.
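Numerically the solution is again obtained by iterating the perturbation scheme from x0 = 1; a short Python check (our own):

```python
import math

y = 1.01
x = 1.0                                     # x0 = 1
for _ in range(500):
    # Phi(x) = y rewritten as the fixed point problem x = y + 3 sin^2(x - 1);
    # near x0 = 1 the perturbation has derivative 3|sin 2(x - 1)| <= 6/7 in size
    x = y + 3 * math.sin(x - 1) ** 2

residual = abs(x - 3 * math.sin(x - 1) ** 2 - y)
```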
Proof. It suffices to verify that Ψ is a contraction on Br (0) for sufficiently small r. Then
we can apply the theorem on perturbation of identity to obtain the desired result. To
this end, we fix x1 , x2 ∈ Br (0) where r is to be determined and consider the function
ϕ(t) = Ψi (x1 + t(x2 − x1 )). We have ϕ(0) = Ψi (x1 ) and ϕ(1) = Ψi (x2 ). By the mean
value theorem, there is some t∗ ∈ (0, 1) such that ϕ(1) − ϕ(0) = ϕ0 (t∗ )(1 − 0) = ϕ0 (t∗ ).
By the Chain Rule,

ϕ′ (t) = (d/dt) Ψi (x1 + t(x2 − x1 ))
 = (∂Ψi/∂x1 )(x1 + t(x2 − x1 ))(x21 − x11 ) + · · · + (∂Ψi/∂xn )(x1 + t(x2 − x1 ))(x2n − x1n )
 = Σ_{j=1}^{n} (∂Ψi/∂xj )(x1 + t(x2 − x1 ))(x2j − x1j ) .
Combining this with the Cauchy–Schwarz inequality, it follows that

|Ψ(x2 ) − Ψ(x1 )| ≤ M |x2 − x1 | ,

where

M = sup_{|z|≤r} ( Σ_{i,j} (∂Ψi/∂xj (z))^2 )^{1/2} .
Using the assumptions ∇Ψ(0) = 0 and that Ψ is C^1 , we can find some small r such that
M ≤ 1/2. Applying the theorem on perturbation of identity, the equation Φ(x) = y is
uniquely solvable for y ∈ BR (0), R = (1 − 1/2)r = r/2, with solution x ∈ Br (0).
Theorem 3.4 is also applicable to function spaces. Let us examine the following example.
where K(x, t) ∈ C([0, 1]^2 ) and g ∈ C[0, 1] are given and t is a small parameter. We would
like to show that it admits a solution y as long as t is small in some sense. Our first job is
to formulate this problem as a problem of perturbation of identity. We work on the Banach
space C[0, 1] and let

Φ(y)(x) = y(x) − ∫_0^1 K(x, s) y^2 (s) ds .

That is,

Ψ(y)(x) = − ∫_0^1 K(x, s) y^2 (s) ds .

For y1 , y2 ∈ Br (0),

‖Ψ(y1 ) − Ψ(y2 )‖∞ ≤ M ‖y1 + y2 ‖∞ ‖y1 − y2 ‖∞ ≤ 2M r ‖y1 − y2 ‖∞ ,  M = max_{[0,1]^2} |K| ,

so Ψ is a contraction on Br (0) whenever 2M r < 1. For instance, we fix r = 1/(4M ) so that
2M r = 1/2 and R = (1 − 1/2)r = 1/(8M ). By the theorem on perturbation of identity, the
equation Φ(y) = tg has a unique solution y ∈ Br (0) whenever ‖tg‖∞ ≤ R; that is, this
integral equation is solvable for g as long as |t| < 1/(8M ‖g‖∞ ).
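A discretised sketch of this scheme is below (Python). The kernel K(x, t) = xt/2 and g ≡ 1 are sample data chosen for illustration, the integral is replaced by the trapezoidal rule, and the small parameter is named `lam` in the code to avoid clashing with the integration variable.

```python
m = 100
grid = [k / m for k in range(m + 1)]

def K(x, t):
    return x * t / 2                     # assumed sample kernel, max |K| = 1/2

def integral(x, y):
    """Trapezoidal rule for integral_0^1 K(x, t) y(t)^2 dt on the grid."""
    total = 0.0
    for k in range(1, m + 1):
        f0 = K(x, grid[k - 1]) * y[k - 1] ** 2
        f1 = K(x, grid[k]) * y[k] ** 2
        total += 0.5 * (f0 + f1) / m
    return total

lam = 0.05                               # small parameter, |lam| < 1/(8M) = 1/4
y = [0.0] * (m + 1)
for _ in range(100):
    # fixed point iteration y <- lam*g + integral K y^2, solving Phi(y) = lam*g
    y = [lam + integral(x, y) for x in grid]

residual = max(abs(y[k] - integral(grid[k], y) - lam) for k in range(m + 1))
```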
You should be aware that in these two examples the underlying spaces are, respectively,
the Euclidean space and the space of continuous functions under the sup-norm. This shows
the power of abstraction: the fixed point theorem applies to all complete metric spaces.
and similarly for DG and DH. Then the formula above becomes, in matrix product,
DH(x) = DG(F (x))DF (x) .
Next, the mean-value theorem in one-dimensional case reads as f (y)−f (x) = f 0 (c)(y−
x) for some value c lying between x and y. To remove the uncertainty of c, we note the
alternative formula
f (y) = f (x) + ∫_0^1 f ′ (x + t(y − x)) dt (y − x) ,

which is obtained from

f (y) = f (x) + ∫_0^1 (d/dt) f (x + t(y − x)) dt   (Fundamental Theorem of Calculus)
 = f (x) + ∫_0^1 f ′ (x + t(y − x)) dt (y − x)   (Chain Rule) .
Applying this formula to each component of a map F , we obtain

F (x2 ) − F (x1 ) = ∫_0^1 DF (x1 + t(x2 − x1 )) dt (x2 − x1 ) .

Here F (x2 ) − F (x1 ) and x2 − x1 are viewed as column vectors. Componentwise this means

Fi (x2 ) − Fi (x1 ) = Σ_{j=1}^{n} ∫_0^1 (∂Fi/∂xj )(x1 + t(x2 − x1 )) dt (x2j − x1j ),  i = 1, · · · , n .
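This integral form of the mean value theorem is easy to verify numerically. The Python sketch below checks it for a sample map F : R² → R² (our own choice), approximating the t-integral by the trapezoidal rule.

```python
def F(p):
    """Sample smooth map F(x, y) = (x^2 + y, x y)."""
    x, y = p
    return (x * x + y, x * y)

def DF(p):
    """Jacobian matrix of F at p."""
    x, y = p
    return ((2 * x, 1.0), (y, x))

def integral_mvt(p1, p2, m=200):
    """integral_0^1 DF(p1 + t(p2 - p1)) dt (p2 - p1), trapezoidal rule."""
    d = (p2[0] - p1[0], p2[1] - p1[1])
    out = [0.0, 0.0]
    for k in range(m + 1):
        w = (0.5 if k in (0, m) else 1.0) / m
        t = k / m
        J = DF((p1[0] + t * d[0], p1[1] + t * d[1]))
        for i in range(2):
            out[i] += w * (J[i][0] * d[0] + J[i][1] * d[1])
    return tuple(out)

p1, p2 = (0.3, -1.2), (1.7, 0.4)
lhs = tuple(F(p2)[i] - F(p1)[i] for i in range(2))   # F(x2) - F(x1)
rhs = integral_mvt(p1, p2)                           # the integral formula
```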
The Inverse Function Theorem and Implicit Function Theorem play a fundamental
role in analysis and geometry. They illustrate the principle of linearization, which is
ubiquitous in mathematics. We learned these theorems in advanced calculus, but the
proofs were not emphasized. Now we fill in the gap.
In fact, the value a is equal to f ′ (x0 ), the derivative of f at x0 . We can rewrite the limit
above using the little-o notation:

f (x0 + z) − f (x0 ) = f ′ (x0 ) z + ◦(z) as z → 0.

Here ◦(z) denotes a quantity satisfying limz→0 ◦(z)/|z| = 0.
over to a real-valued function f in some open set in Rn . A function f is called differentiable
at x0 in this open set if there exists a vector a = (a1 , · · · , an ) such that
f (x0 + x) − f (x0 ) = Σ_{j=1}^{n} aj xj + ◦(|x|) as x → 0.
Note that here x0 = (x10 , · · · , xn0 ) is a vector. Again one can show that the vector a is
uniquely given by the gradient vector of f at x0 :

∇f (x0 ) = ( ∂f/∂x1 (x0 ), · · · , ∂f/∂xn (x0 ) ).
More generally, a map F from an open set in R^n to R^m is called differentiable at a point
x0 in this open set if each component of F = (f^1 , · · · , f^m ) is differentiable. We can write
the differentiability condition collectively in the form

F (x0 + x) − F (x0 ) = DF (x0 ) x + ◦(|x|) as x → 0,

where DF (x0 ) is the Jacobian matrix (∂f^i/∂xj (x0 )).
(a) There exist open sets V and W containing x0 and F (x0 ) respectively such that the
restriction of F on V is a bijection onto W with a C 1 -inverse.
In other words, the matrix of the derivative of the inverse map is precisely the inverse
matrix of the derivative of the map. We conclude that, although the inverse may exist
without the non-degeneracy condition, this condition is necessary in order to have a
differentiable inverse. We single it out in the following proposition.
Now we prove Theorem 3.7. At first sight it is not clear how to link this theorem to
the Theorem of Perturbation of Identity. The idea of the proof is as follows. Taking
x0 = y0 = 0, formally we have F (x) = F (0) + DF (0)x + (1/2)D^2 F (0)x^2 + · · · . Hence
solving F (x) = y is the same as solving DF (0)x + (1/2)D^2 F (0)x^2 + · · · = y. Since DF (0)
is invertible, it is equivalent to solving x + DF (0)^{−1} ((1/2)D^2 F (0)x^2 + · · · ) = DF (0)^{−1} y,
and this is in the form x + Ψ(x) = y.
Now let us turn to the proof. First assume that x0 = y0 = 0 and DF (0) = I, the
identity matrix. We write F (x) = y as x + Ψ(x) = y where Ψ(x) = F (x) − x and apply
Theorem 3.4. For this purpose we need to verify Ψ is a contraction. First fix a ball
Br0 (0) satisfying Br0 (0) ⊂ U . As U is open and 0 ∈ U , this is always possible. For
x1 , x2 ∈ Br0 (0), we have, by Proposition 3.6,

Ψ(x1 ) − Ψ(x2 ) = B (x1 − x2 ) ,

where we have used the assumption DF (0) = I. The ij-th entry of B = (bij ) is given by

bij = ∫_0^1 ( ∂Fi/∂xj (x2 + t(x1 − x2 )) − ∂Fi/∂xj (0) ) dt .
By the continuity of ∂Fi /∂xj at 0, given ε > 0, there is some r ≤ r0 such that
| ∂Fi/∂xj (x) − ∂Fi/∂xj (0) | < ε ,  ∀x ∈ Br (0) .

Consequently,

| ∂Fi/∂xj (x2 + t(x1 − x2 )) − ∂Fi/∂xj (0) | < ε ,  x1 , x2 ∈ Br (0),  t ∈ [0, 1] .
It follows that |Ψ(x1 ) − Ψ(x2 )| ≤ nε|x1 − x2 |; choosing ε = 1/(2n) at the outset makes
Ψ a contraction with constant 1/2 on Br (0).
To this end we recall the following fact: The partial derivatives of a function Φ exist at
x0 if there is an n × n-matrix A such that
Φ(x0 + x) − Φ(x0 ) = Ax + R ,
where R = ◦(|x|) . Moreover, when this happens, DΦ(x0 ) = A. (see Remark 3.2 below.)
Here we are concerned with Φ = G.
Let y ∈ BR (0) and y′ ∈ BR (0) be close to y. Let x and x′ be their respective preimages
in Br (0) under F . We have

F (x′ ) − F (x) = DF (x)(x′ − x) + ◦(|x′ − x|) .

Writing it in terms of y,

y′ − y = DF (G(y))(G(y′ ) − G(y)) + ◦(|x′ − x|) .   (3.4)

Here ◦(|x′ − x|) is a quantity which satisfies ◦(|x′ − x|)/|x′ − x| → 0 as x′ → x. Now, since G
is continuous, as y′ → y, x′ = G(y′ ) → x = G(y), and, in view of (3.4),
◦(|x′ − x|)/|y′ − y| = (◦(|x′ − x|)/|x′ − x|) × (|G(y′ ) − G(y)|/|y′ − y|) → 0 ,  as y′ → y .
So we can write

y′ − y = DF (G(y))(G(y′ ) − G(y)) + ◦(|y′ − y|) ,

that is,

G(y′ ) − G(y) = (DF (G(y)))^{−1} (y′ − y) + ◦(|y′ − y|) .

We conclude that G is differentiable in BR (0) and DG(y) = (DF (G(y)))^{−1} .
From linear algebra we know that each entry of DG(y) can be expressed as a rational
function of the entries of the matrix of DF (G(y)). Consequently, DG(y) is C k in y if
DF (G(y)) is C k for 1 ≤ k ≤ ∞.
So far we have been assuming x0 = y0 = 0 and DF (0) = I. For a general F and x0 , y0 ,
set
F̃ (x) = A(F (x + x0 ) − y0 ) ,
where A = (DF (x0 ))^{−1} . Then F̃ is a C^1 -map in the open set Ũ ≡ U − x0 and it satisfies
F̃ (0) = 0 and DF̃ (0) = I. By what has been done, F̃ admits an inverse G̃ from some
open set W̃ containing 0 to an open set Ṽ containing 0 in Ũ . Letting V = Ṽ + x0 and
W = A−1 W̃ + y0 , then V and W are open sets containing x0 and y0 respectively. Define
G(y) = G̃(A(y − y0 )) + x0 ,
which maps W bijectively onto V . We claim that F (G(y)) = y for y ∈ W . For,
observe that
F (x) = A−1 F̃ (x − x0 ) + y0 , x ∈ V .
We have
F (G(y)) = A^{−1} F̃ (G(y) − x0 ) + y0
 = A^{−1} F̃ (G̃(A(y − y0 )) + x0 − x0 ) + y0
 = A^{−1} A(y − y0 ) + y0
 = y.
Finally, observe that G is C k in W as long as G̃ is C k in W̃ . The proof of the Inverse
Function Theorem is completed.
When this happens, ∂ϕ/∂xj (x0 ) = αj . For Φ : U → Rn , applying this fact to each
component of Φ, Φi , we see that the Jacobian matrix DΦ at x0 exists if there is a matrix
A = {αij } such that
Φ(x0 + x) − Φ(x0 ) = Ax + ◦(|x|) .
When this happens, ∂Φi /∂xj (x0 ) = αij .
Example 3.10. The Inverse Function Theorem asserts a local invertibility. Even if the
linearization is non-singular everywhere, we cannot assert global invertibility. Consider
x = et cos θ, y = et sin θ .
The function F : R2 → R2 given by F (t, θ) = (x, y) is a continuously differentiable
function whose Jacobian matrix is non-singular everywhere. However, it is clear that F
is not bijective, for instance, all points (t, θ + 2nπ), n ∈ Z, have the same image under F .
Example 3.11. An exceptional case is dimension one, where a global result is available.
Indeed, in MATH 2060 we learned that if f is continuously differentiable on (a, b) with non-
vanishing f 0 , it is either strictly increasing or decreasing so that its global inverse exists
and is again continuously differentiable.
Example 3.12. Consider the map F : R^2 → R^2 given by F (x, y) = (x^2 , y). Its Jacobian
matrix is singular at (0, 0). In fact, for any point (a, b), a > 0, F (±√a, b) = (a, b). We
cannot find any open set at (0, 0), no matter how small it is, on which F is injective. On the
other hand, the map H(x, y) = (x^3 , y) is bijective with inverse given by J(x, y) = (x^{1/3} , y).
However, as the non-degeneracy condition does not hold at (0, 0), the inverse is not
differentiable there. In both cases the Jacobian matrix is singular, so the non-degeneracy
condition fails.
Next we discuss the Implicit Function Theorem. The simplest situation of this general
theorem concerns the zero set (or locus) of a single function f (x, y) in the plane. Namely,
when is the zero set {(x, y) : f (x, y) = 0} a curve? Consider the function x^2 + y^2 − 1,
whose zero set is the unit circle. It is easy to see that it is a curve. Moreover, near every
point (x, y) ≠ (±1, 0) it is locally a graph over the x-axis, and near every point
(x, y) ≠ (0, ±1) it is locally a graph over the y-axis.
Theorem 3.9. Let f be a C 1 -function in some open U in the plane and f (x0 , y0 ) = 0.
Suppose that fy (x0 , y0 ) 6= 0, then there is some open interval I, x0 ∈ I, an open set
V containing (x0 , y0 ) in U and a C 1 -function ϕ on I whose graph lies in V such that
{(x, y) ∈ V : f (x, y) = 0} = {(x, ϕ(x)) : x ∈ I}.
In other words, the zero set of f (x, y) = 0 near (x0 , y0 ) is given by the graph (x, ϕ(x));
hence it is a curve. Likewise, when fx (x0 , y0 ) ≠ 0, the locus is locally given by the graph
{(ψ(y), y) : y ∈ J} for some interval J containing y0 and a C^1 -function ψ on J satisfying
ψ(y0 ) = x0 .
Proof. Define Φ(x, y) = (x, f (x, y)). Then Φ(x0 , y0 ) = (x0 , 0) and det DΦ(x0 , y0 ) =
fy (x0 , y0 ) 6= 0. By Inverse Function Theorem, Φ has a C 1 -inverse Ψ from some open
set W containing (x0 , 0) satisfying Φ(Ψ(x, z)) = (x, z) on W . By shrinking W a bit, we
may assume W = I × J for two intervals. Writing Ψ(x, z) = (Ψ1 (x, z), Ψ2 (x, z)), we have
Φ(Ψ1 (x, z), Ψ2 (x, z)) = (Ψ1 (x, z), f (Ψ1 (x, z), Ψ2 (x, z))).
By comparing the two components, we have Ψ1 (x, z) = x and f (Ψ1 (x, z), Ψ2 (x, z)) = z.
It follows that f (x, Ψ2 (x, z)) = z. Thus each horizontal line I × {z}, z ∈ J = (c, d)
is mapped to a curve (x, Ψ2 (x, z)). By restricting U we may assume fy (x, y) 6= 0 for all
(x, y) ∈ U . When fy > 0, the horizontal line I × {c} and I × {d} are mapped to the curves
(x, Ψ2 (x, c)) and (x, Ψ2 (x, d)) respectively with Ψ2 (x, c) < Ψ2 (x, d). (When fy (x0 , y0 ) <
0, Ψ2 (x, c) > Ψ2 (x, d).) Thus the image of I × J under Ψ is precisely the set bounded by
x = a, b and the two curves (x, Ψ2 (x, c)) and (x, Ψ2 (x, d)). In particular, at z = 0, we
have f (x, Ψ2 (x, 0)) = 0. Our desired conclusion follows by taking ϕ(x) = Ψ2 (x, 0).
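For the unit circle this construction can be made explicit: near (0, 1), where fy ≠ 0, the implicit function is ϕ(x) = √(1 − x²), and implicit differentiation gives ϕ′ = −fx/fy. The Python sketch below (our own check) verifies both facts numerically.

```python
import math

def f(x, y):
    """f(x, y) = x^2 + y^2 - 1; the zero set is the unit circle."""
    return x * x + y * y - 1

def phi(x):
    """Near (0, 1) the zero set is the graph y = phi(x)."""
    return math.sqrt(1 - x * x)

xs = [i / 100 for i in range(-50, 51)]          # a neighbourhood of x0 = 0
on_curve = all(abs(f(x, phi(x))) < 1e-12 for x in xs)

def phi_prime(x):
    """Implicit differentiation: phi'(x) = -f_x / f_y at (x, phi(x))."""
    return -(2 * x) / (2 * phi(x))

h = 1e-6
numeric = (phi(0.3 + h) - phi(0.3 - h)) / (2 * h)   # numerical phi'(0.3)
```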
Next we consider the function f2 (x, y) = x^2 − y^2 at (0, 0). We have f2x (0, 0) =
f2y (0, 0) = 0. Indeed, the zero set of f2 consists of the two straight lines x = y and
x = −y intersecting at the origin. It is impossible to express it as the graph of a single
function near the origin.
Finally, consider the function f3 (x, y) = x^2 + y^2 at (0, 0). We have f3x (0, 0) =
f3y (0, 0) = 0. Indeed, the zero set of f3 degenerates into a single point {(0, 0)}, which
cannot be the graph of any function.
In words, this theorem asserts that near (x0 , y0 ), the locus of F is given by the graph
of ϕ.
The notation Dy F (x0 , y0 ) stands for the Jacobian matrix (∂Fi /∂yj (x0 , y0 ))i,j=1,··· ,m
where x0 is fixed. In general, a version of Implicit Function Theorem holds when the rank
of DF at a point is m. In this case, we can rearrange the independent variables to make
Dy F non-singular at this point.
The proof of the general case is essentially the same as the proof of the simplest case.
One readily checks that det DΦ(x, y) = det Dy F (x, y), so det DΦ(x0 , y0 ) 6= 0. By the
Inverse Function Theorem, there exists a C 1 -inverse Ψ = (Ψ1 , Ψ2 ) from some open set
W in Rn × Rm containing Φ(x0 , y0 ) = (x0 , 0) to an open subset of U . By restricting W
further we may assume W is of the form V1 × V2 where V1 and V2 are rectangles centered
at x0 and 0 respectively. We have

Φ(Ψ(x, z)) = (x, z),  (x, z) ∈ V1 × V2 .

On the other hand, the definition of Φ gives Φ(Ψ1 (x, z), Ψ2 (x, z)) = (Ψ1 (x, z), F (Ψ1 (x, z), Ψ2 (x, z))) .
Therefore,

Ψ1 (x, z) = x,  and  F (Ψ1 (x, z), Ψ2 (x, z)) = z.
In other words, F (x, Ψ2 (x, z)) = z holds for (x, z) ∈ V1 × V2 . In particular, taking z = 0
gives
F (x, Ψ2 (x, 0)) = 0, ∀x ∈ V1 ,
so the function ϕ(x) ≡ Ψ2 (x, 0) satisfies our requirement. Here we take G = V1 and
V = Ψ(V1 × V2 ).
A basic fact we pick up from the implicit function theorem is that, keeping the
notations in Theorem 3.10, whenever DF is of full rank on the locus of F (x, y) = 0,
the locus is an “n-dimensional surface” in R^{n+m} . Thinking of n + m free variables
constrained by m equations F (x, y) = 0, and thus leaving n free variables, the
terminology of an n-dimensional surface is easily understood.
given smooth F , the level set {(x, y) : F (x, y) = c} may or may not be an n-dimensional
surface. We call those values c such that DF (x, y) is of rank m at every (x, y) satisfying
F (x, y) = c a regular value of F . A theorem of Sard asserts that for a smooth F , regular
values are of full measure. It implies that, in case F (x, y) = c is not regular, we can
always find some regular value c0 arbitrarily close to c. For instance, 0 is not a regular
value for the function x2 − y 2 . However, x2 − y 2 = a, a 6= 0 is a regular value.
It is interesting to note that the Inverse Function Theorem can be deduced from
Implicit Function Theorem, so they are equivalent. To see this, keep the notations
used in Theorem 3.7 and define a map Φ : U × R^n → R^n by

Φ(x, y) = F (x) − y.
some open set V ⊂ R^2 containing (x0 , y0 ) and a C^1 -function ϕ on V such that the locus of
g = 0 is given by the graph of ϕ, that is, g(x, y, ϕ(x, y)) = 0 for (x, y) ∈ V , and ϕ(x0 , y0 ) = z0 .
It follows that (x0 , y0 ) is a local minimum for the function h(x, y) ≡ f (x, y, ϕ(x, y))
over V . We have
hx (x0 , y0 ) = fx (x0 , y0 , ϕ(x0 , y0 )) + fz (x0 , y0 , ϕ(x0 , y0 ))ϕx (x0 , y0 ) = ∇f · (1, 0, ϕx ) = 0,
and
hy (x0 , y0 ) = fy (x0 , y0 , ϕ(x0 , y0 )) + fz (x0 , y0 , ϕ(x0 , y0 ))ϕy (x0 , y0 ) = ∇f · (0, 1, ϕy ) = 0.
That is, ∇f is perpendicular to the two dimensional subspace spanned by (1, 0, ϕx ) and
(0, 1, ϕy ) at p0 . (In fact, this subspace is the tangent space of g = 0 at p0 .) On the other
hand, by differentiating g(x, y, ϕ(x, y)) = 0, we also have
gx (x, y, ϕ(x, y)) + gz (x, y, ϕ(x, y))ϕx (x, y) = ∇g · (1, 0, ϕx ) = 0,
and
gy (x, y, ϕ(x, y)) + gz (x, y, ϕ(x, y))ϕy (x, y) = ∇g · (0, 1, ϕy ) = 0.
It shows that the three vectors ∇g, (1, 0, ϕx ), (0, 1, ϕy ) form a basis of R^3 at p0 . Since
∇f (p0 ) is perpendicular to the latter two vectors, it must be parallel to ∇g(p0 ); that is,
∇f + λ∇g = 0 at p0 for some λ.
In this section we discuss the fundamental existence and uniqueness theorem for differential
equations. I assume that you have already learned the skills of solving ordinary differential
equations, so we will focus on the theoretical aspects.
Most differential equations cannot be solved explicitly; in other words, their solutions
cannot be expressed as compositions of elementary functions. Nevertheless, there are two
exceptional classes which come up very often. Let us review them before going into the
theory.
Example 3.14. Consider the equation

dx/dt = a(t)x + b(t),
where a and b are continuous functions defined on some open interval I. This differential
equation is called a linear differential equation because it is linear in x (with coefficients
functions of t). The general solution of this linear equation is given by the formula
x(t) = e^{α(t)} ( x0 + ∫_{t0}^{t} e^{−α(s)} b(s) ds ) ,  α(t) = ∫_{t0}^{t} a(s) ds ,

where x0 = x(t0 ).
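The formula can be checked numerically by comparing x′(t) against a(t)x(t) + b(t). In the Python sketch below the coefficients a(t) = 2t, b(t) = 1, the initial data t0 = 0, x0 = 1, and the trapezoidal quadrature are all our own assumptions for illustration.

```python
import math

M = 400                                     # quadrature subdivisions

def trapz(g, lo, hi):
    """Composite trapezoidal rule for integral_lo^hi g(s) ds."""
    h = (hi - lo) / M
    return h * (0.5 * g(lo) + sum(g(lo + k * h) for k in range(1, M)) + 0.5 * g(hi))

a = lambda t: 2 * t                         # assumed sample coefficients
b = lambda t: 1.0
t0, x0 = 0.0, 1.0

def alpha(t):
    return trapz(a, t0, t)

def x_formula(t):
    """x(t) = e^{alpha(t)} ( x0 + integral_{t0}^t e^{-alpha(s)} b(s) ds )."""
    return math.exp(alpha(t)) * (x0 + trapz(lambda s: math.exp(-alpha(s)) * b(s), t0, t))

t, h = 0.5, 1e-4
lhs = (x_formula(t + h) - x_formula(t - h)) / (2 * h)    # numerical x'(t)
rhs = a(t) * x_formula(t) + b(t)                         # the equation's right side
```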
The second exceptional class consists of separable equations, in which dx/dt is the
quotient of a function of t and a function of x; one solves them by separating the
variables and integrating both sides. The resulting relation, written as G(x) = F (t), can
be converted formally into x = G^{−1} (F (t)), a solution to the equation, as is immediately
verified by the chain rule. For instance, consider the equation
dx/dt = (t + 3)/x .
The solution is given by integrating
∫_{x0}^{x} x dx = ∫_{t0}^{t} (t + 3) dt ,
to get
x2 = t2 + 6t + c , c ∈ R.
We have

x(t) = ±√(t^2 + 6t + c) .
When x(0) = −2 is specified, we find the constant c = 4, so the solution is given by
x(t) = −√(t^2 + 6t + 4) .
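It is routine to check this answer; a short Python verification by numerical differentiation at a few sample points:

```python
import math

def x_sol(t):
    """Solution of dx/dt = (t + 3)/x with x(0) = -2 (the negative branch)."""
    return -math.sqrt(t * t + 6 * t + 4)

h = 1e-6
def ode_residual(t):
    dxdt = (x_sol(t + h) - x_sol(t - h)) / (2 * h)     # numerical derivative
    return abs(dxdt - (t + 3) / x_sol(t))

max_res = max(ode_residual(t) for t in (0.0, 0.5, 1.0, 2.0))
```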
dx/dt = f (t, x),  x(t0 ) = x0 .   (IVP)
(In some books the independent variable t is replaced by x and the dependent variable
x is replaced by y. We prefer to use t, since in many cases the independent variable is
the time.) To solve the initial value problem means to find a function x(t) defined in a
perhaps smaller rectangle, that is, x : [t0 − a0 , t0 + a0 ] → [x0 − b, x0 + b], which
is differentiable and satisfies x(t0 ) = x0 and x′ (t) = f (t, x(t)), ∀t ∈ [t0 − a0 , t0 + a0 ], for
some 0 < a0 ≤ a. In general, no matter how nice f is, we do not expect there is always a
solution on the entire [t0 − a, t0 + a]. Let us look at the following example.
dx/dt = 1 + x^2 ,  x(0) = 0.
The function f (t, x) = 1 + x2 is smooth on [−a, a] × [−b, b] for every a, b > 0. However,
the solution, as one can verify immediately, is given by x(t) = tan t which is only defined
on (−π/2, π/2). It shows that even when f is very nice, a0 could be strictly less than a.
Furthermore, replace the equation by x′ = α(1 + x^2 ), α > 0. Accordingly, the solution
becomes tan αt, which exists on (−π/2α, π/2α). It indicates that the interval of existence
depends on f .
Note that the Lipschitz condition in particular means that, for each fixed t, f is Lipschitz
continuous in x. The constant L is called a Lipschitz constant. Obviously, if L is a
Lipschitz constant for f , any number greater than L is also a Lipschitz constant. Not all
continuous functions satisfy the Lipschitz condition. An example is given by the function
f (t, x) = tx^{1/2} , which is continuous; I let you verify that it does not satisfy the Lipschitz
condition on any rectangle containing the origin.
In application, most functions satisfying the Lipschitz condition arise in the following
manner. A C 1 -function f (t, x) in a closed rectangle automatically satisfies the Lipschitz
condition. For, by the mean-value theorem, for some z lying on the segment between x1
and x2 ,
f (t, x2 ) − f (t, x1 ) = (∂f/∂x)(t, z)(x2 − x1 ).
Letting

L = max { |(∂f/∂x)(t, x)| : (t, x) ∈ R } ,
(L is a finite number because ∂f /∂x is continuous on R and hence bounded), we have

|f (t, x2 ) − f (t, x1 )| ≤ L|x2 − x1 |,  ∀(t, xi ) ∈ R, i = 1, 2.
From the proof one will see that a0 can be taken to be any number satisfying

0 < a0 < min { a, b/M, 1/L } ,

where M = sup{|f (t, x)| : (t, x) ∈ R}.
To prove Picard-Lindelöf Theorem, we first convert (IVP) into a single integral equa-
tion.
Proposition 3.13. Under the setting of the Picard–Lindelöf Theorem, a continuous
function x : [t0 − a0 , t0 + a0 ] → [x0 − b, x0 + b] solves (IVP) if and only if it satisfies the
integral equation

x(t) = x0 + ∫_{t0}^{t} f (s, x(s)) ds.   (3.7)
Proof. When x satisfies x′ (t) = f (t, x(t)) and x(t0 ) = x0 , (3.7) is a direct consequence of
the Fundamental Theorem of Calculus (first form). Conversely, when x(t) is continuous
on [t0 − a0 , t0 + a0 ] and satisfies (3.7), f (t, x(t)) is also continuous on the same interval.
By the Fundamental Theorem of Calculus (second form), the right hand side of (3.7) is
continuously differentiable on [t0 − a0 , t0 + a0 ]; hence so is x, and differentiating (3.7)
shows that x solves (IVP).
Note that in this proposition we do not need the Lipschitz condition; only the conti-
nuity of f is needed.
Proof of Picard-Lindelöf Theorem. Instead of solving (IVP) directly, we look for a solution
of (3.7). We will work on the metric space
X = {x ∈ C[t0 − a0 , t0 + a0 ] : x(t) ∈ [x0 − b, x0 + b], x(t0 ) = x0 } ,
with the uniform metric (the metric induced by the supnorm). It is easily verified that
it is a closed subset in the complete metric space C[t0 − a0 , t0 + a0 ] and hence complete.
Recall that every closed subset of a complete metric space is complete. The number a0
will be specified below.
We are going to define a contraction on X. Indeed, for x ∈ X, define T by
(T x)(t) = x0 + ∫_{t0}^t f(s, x(s)) ds.
First of all, for every x ∈ X, it is clear that f (t, x(t)) is well-defined and T x ∈ C[t0 −
a0 , t0 + a0 ]. To show that it is in X, we need to verify x0 − b ≤ (T x)(t) ≤ x0 + b for
all t ∈ [t0 − a0 , t0 + a0 ]. We claim that this holds if we choose a0 satisfying a0 ≤ b/M ,
M = sup {|f (t, x)| : (t, x) ∈ R}. For,
|(T x)(t) − x0| = |∫_{t0}^t f(s, x(s)) ds|
 ≤ M|t − t0|
 ≤ M a0
 ≤ b.
Next, for x1, x2 ∈ X, the Lipschitz condition gives

|(T x2)(t) − (T x1)(t)| ≤ ∫_{t0}^t |f(s, x2(s)) − f(s, x1(s))| ds ≤ L|t − t0| ‖x2 − x1‖∞,

so, provided a0 < 1/L,

‖T x2 − T x1‖∞ ≤ γ‖x2 − x1‖∞,  γ = a0 L < 1.
Now we can apply the Contraction Mapping Principle to conclude that T x = x for some
x, and x solves (IVP). We have shown that (IVP) admits a solution in [t0 − a0 , t0 + a0 ]
where a0 can be chosen to be any number less than min{a, b/M, 1/L}.
Finally, any solution to the IVP is a fixed point of the map T , so the IVP has a unique
solution on [t0 − a0 , t0 + a0 ].
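The proof is constructive: the fixed point is the limit of the Picard iterates x_{n+1} = T x_n. A minimal numerical sketch of this iteration (the uniform grid, the trapezoidal quadrature, and the test equation x′ = x, x(0) = 1, with exact solution eᵗ, are all our own choices for illustration):

```python
import math

def picard_iterate(f, t0, x0, a0, n_iter=20, n_grid=200):
    """Iterate (Tx)(t) = x0 + integral of f(s, x(s)) from t0 to t
    on a uniform grid over [t0, t0 + a0], starting from the constant
    function x0.  Integrals use the trapezoidal rule."""
    ts = [t0 + a0 * k / n_grid for k in range(n_grid + 1)]
    x = [x0] * (n_grid + 1)
    for _ in range(n_iter):
        fx = [f(t, xi) for t, xi in zip(ts, x)]
        new = [x0]
        acc = 0.0
        for k in range(n_grid):
            h = ts[k + 1] - ts[k]
            acc += 0.5 * h * (fx[k] + fx[k + 1])   # trapezoidal increment
            new.append(x0 + acc)
        x = new
    return ts, x

# x' = x, x(0) = 1 has the exact solution e^t; compare on [0, 0.5].
ts, xs = picard_iterate(lambda t, x: x, t0=0.0, x0=1.0, a0=0.5)
err = max(abs(xi - math.exp(t)) for t, xi in zip(ts, xs))
```

Here γ = a0 L = 0.5, so each sweep roughly halves the error, in line with the contraction estimate; the remaining discrepancy comes from the quadrature.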
We point out that the existence part of the Picard-Lindelöf Theorem still holds without
the Lipschitz condition. We will prove this in the next chapter. However, the solution
may not be unique.
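A standard example of non-uniqueness (not worked out in the text): f(t, x) = 2√|x| is continuous but fails the Lipschitz condition near x = 0, and the IVP x′ = f(t, x), x(0) = 0 has at least two solutions, x ≡ 0 and x(t) = t² for t ≥ 0. A quick numerical check that x(t) = t² really solves the equation:

```python
import math

# f(t, x) = 2*sqrt(|x|) is continuous but not Lipschitz at x = 0.
# Both x(t) = 0 and x(t) = t^2 satisfy x' = f(t, x), x(0) = 0 on t >= 0.
# Check the residual x'(t) - f(t, x(t)) for x(t) = t^2, x'(t) = 2t.
f = lambda t, x: 2.0 * math.sqrt(abs(x))
residual = max(abs(2.0 * t - f(t, t * t)) for t in [k / 100 for k in range(101)])
```

The residual vanishes up to rounding, while the zero function trivially satisfies the same IVP, confirming that continuity of f alone does not force uniqueness.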
Proposition 3.14. Consider (IVP) where f ∈ C(D) satisfies the Lipschitz condition on an
open set D ⊂ R2. Suppose x1 and x2 are two solutions of this IVP over an interval
I whose graphs lie inside D. If x1(t0) = x2(t0) at some t0 ∈ I, then
x1 coincides with x2 on I.
Proof. Let us take t > t0 and set I+ = I ∩ [t0, ∞). (The case t < t0 can be handled similarly.) The function

H(t) ≡ ∫_{t0}^t |x1(s) − x2(s)| ds

satisfies H(t0) = 0 and, by (3.7) together with the Lipschitz condition,

H′(t) = |x1(t) − x2(t)| ≤ L ∫_{t0}^t |x1(s) − x2(s)| ds = LH(t).

Hence, for each ε > 0, (H + ε)′ ≤ L(H + ε), and integrating this differential inequality from t0 to t gives

H(t) + ε ≤ εe^{L(t−t0)}, t ∈ I+.

Now the desired conclusion follows by letting ε → 0, which forces H ≡ 0 and hence x1 = x2 on I+.
Under the assumption of this proposition, let S be the collection of all pairs (x(t), I)
where x(t) is a solution over the open interval I whose graph passes through (t0, x0), t0 ∈ I.
Letting I∗ = ∪Iα, where Iα ranges over all I's in S, the function x∗ on I∗ defined by
x∗(t) = xα(t) whenever t ∈ Iα is a well-defined function which solves (IVP) over I∗.
It is called the maximal solution to (IVP).
dxj/dt = fj(t, x1, x2, · · · , xn),  xj(t0) = x0j,
where j = 1, 2, · · · , n. By setting x = (x1 , x2 , · · · , xn ) and f = (f1 , f2 , · · · , fn ), we can
express it as in (IVP) but now both x and f are vectors.
Essentially following the same arguments as in the case of a single equation, we obtain a
Picard-Lindelöf theorem for systems. Recall also that every higher order differential equation can be converted
into a system of first order differential equations. As a result, we also have a corresponding
Picard-Lindelöf theorem for higher order differential equations. I leave its formulation to you.
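The reduction to a first order system can be illustrated as follows (the concrete equation and the solver are our own choices, not from the text): the second order equation x″ = −x, x(0) = 1, x′(0) = 0 becomes the system y1′ = y2, y2′ = −y1, which any standard one-step scheme can integrate. A minimal sketch with the classical fourth order Runge-Kutta method:

```python
import math

# Reduce x'' = -x, x(0) = 1, x'(0) = 0 to the first order system
# y1' = y2, y2' = -y1, and integrate it with the classical RK4 scheme.
def f(t, y):
    y1, y2 = y
    return (y2, -y1)

def rk4(f, t0, y0, t1, n):
    """Integrate y' = f(t, y) from t0 to t1 in n steps of size h."""
    h = (t1 - t0) / n
    t, y = t0, y0
    for _ in range(n):
        k1 = f(t, y)
        k2 = f(t + h/2, tuple(yi + h/2 * ki for yi, ki in zip(y, k1)))
        k3 = f(t + h/2, tuple(yi + h/2 * ki for yi, ki in zip(y, k2)))
        k4 = f(t + h, tuple(yi + h * ki for yi, ki in zip(y, k3)))
        y = tuple(yi + h/6 * (a + 2*b + 2*c + d)
                  for yi, a, b, c, d in zip(y, k1, k2, k3, k4))
        t += h
    return y

y = rk4(f, 0.0, (1.0, 0.0), math.pi, 1000)   # exact solution: x(t) = cos t
```

At t = π the exact solution is (cos π, −sin π) = (−1, 0), and the numerical answer agrees to high accuracy.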
Before the proof of the Completion Theorem we briefly describe the ideas behind it.
When (X, d) is not complete, we need to invent ideal points and add them to X to make
it complete. The idea goes back to Cantor’s construction of the real numbers from rational
numbers. Suppose now we have only rational numbers and we want to add irrationals.
First we identify Q with a proper subset in a larger set as follows. Let C be the collection
of all Cauchy sequences of rational numbers. Every point in C is of the form (x1 , x2 , · · · )
where {xn }, xn ∈ Q, forms a Cauchy sequence. A rational number x is identified with
the constant sequence {x, x, x, . . . } or any Cauchy sequence which converges to x. For in-
stance, 1 is identified with {1, 1, 1, . . . }, {0.9, 0.99, 0.999, . . . } or {1.01, 1.001, 1.0001, . . . }.
Clearly, there are Cauchy sequences which cannot be identified with rational numbers.
For instance, there is no rational number corresponding to {3, 3.1, 3.14, 3.141, 3.1415, . . . },
as we know, its correspondent should be the irrational number π. A similar situation holds
for the sequence {1, 1.4, 1.41, 1.414, · · · }, which should correspond to √2. Since the correspondence is not injective, we make it into one by introducing an equivalence relation
on C. Indeed, {xn} and {yn} are said to be equivalent if |xn − yn| → 0 as n → ∞. The
equivalence relation ∼ forms the quotient C/∼, which is denoted by C̃. Then x ↦ x̃
sends Q injectively into C̃. It can be shown that C̃ carries the structure of the real numbers. In particular, those points not in the image of Q are exactly the irrational numbers.
Now, for a metric space the situation is similar. We let X̃ be the quotient space of all
Cauchy sequences in X under the relation {xn} ∼ {yn} if and only if d(xn, yn) → 0. Define

d̃(x̃, ỹ) = lim_{n→∞} d(xn, yn),  for x ∈ x̃, y ∈ ỹ.

We have the embedding (X, d) → (X̃, d̃), and we can further show that (X̃, d̃) is a completion of (X, d).
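The equivalence relation can be illustrated numerically with the sequences mentioned above: {1, 1, 1, . . . } and {0.9, 0.99, 0.999, . . . } are equivalent, while a constant sequence at 1/2 is not equivalent to either. Note that inspecting one late term is only a heuristic for the limit |xn − yn| → 0, not a proof; the index N and the threshold below are arbitrary choices:

```python
from fractions import Fraction

def equivalent(x, y, N=50, tol=Fraction(1, 10**9)):
    """Heuristic check that |x_n - y_n| -> 0: inspect the N-th term.
    For these explicit sequences the tail term already decides."""
    return abs(x(N) - y(N)) < tol

one_const = lambda n: Fraction(1)                    # {1, 1, 1, ...}
one_below = lambda n: Fraction(10**n - 1, 10**n)     # {0.9, 0.99, 0.999, ...}
half      = lambda n: Fraction(1, 2)                 # {1/2, 1/2, 1/2, ...}
```

Here `one_const` and `one_below` land in the same equivalence class (both represent the real number 1), while `half` represents a different point of the completion.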
The following proof is for optional reading. In the exercise we will present a simpler
but less instructive proof.
Proof of Theorem 3.2. Let C be the collection of all Cauchy sequences in (X, d). We
introduce a relation ∼ on C by x ∼ y if and only if d(xn , yn ) → 0 as n → ∞. It is
routine to verify that ∼ is an equivalence relation on C. Let X̃ = C/∼ and define a map
d̃ : X̃ × X̃ → [0, ∞) by

d̃(x̃, ỹ) = lim_{n→∞} d(xn, yn).
Step 1 (well-definedness of d̃). To show that d̃(x̃, ỹ) is independent of the choice of representatives,
let x ∼ x′ and y ∼ y′. From

d(xn, yn) ≤ d(xn, x′n) + d(x′n, y′n) + d(y′n, yn)

and d(xn, x′n), d(yn, y′n) → 0, we get lim_{n→∞} d(xn, yn) ≤ lim_{n→∞} d(x′n, y′n); reversing the
roles of the representatives gives the opposite inequality, so d̃ is well-defined. The metric
properties of d̃ follow from those of d. For instance, the triangle inequality: for x̃, ỹ, z̃ ∈ X̃,

d̃(x̃, z̃) = lim_{n→∞} d(xn, zn)
 ≤ lim_{n→∞} (d(xn, yn) + d(yn, zn))
 = lim_{n→∞} d(xn, yn) + lim_{n→∞} d(yn, zn)
 = d̃(x̃, ỹ) + d̃(ỹ, z̃).
Step 2 (Φ is metric preserving with dense image). Define Φ : X → X̃ by sending each x to
the equivalence class of the constant sequence {x, x, x, . . . }. Then

d̃(Φ(x), Φ(y)) = lim_{n→∞} d(x, y) = d(x, y),

so Φ preserves the metric. Next, let x̃ ∈ X̃ have representative {xn} and set x̃n = Φ(xn).
Given ε > 0, there exists an n0 such that d(xm, xn) < ε/2 for all m, n ≥ n0. Hence
d̃(x̃, x̃n) = lim_{m→∞} d(xm, xn) ≤ ε/2 < ε for n ≥ n0. That is, x̃n → x̃ as n → ∞, so the closure of
Φ(X) is precisely X̃.
Step 3 (completeness of X̃). Let {x̃n} be a Cauchy sequence in X̃. Since Φ(X) is dense in X̃
by Step 2, for each n we can pick yn ∈ X such that, with ỹn = Φ(yn),

d̃(x̃n, ỹn) < 1/n.

Then {yn} is a Cauchy sequence in X; let ỹ ∈ X̃ be its equivalence class. By the triangle inequality,

d̃(x̃n, ỹ) ≤ d̃(x̃n, ỹn) + d̃(ỹn, ỹ) ≤ 1/n + lim_{m→∞} d(yn, ym) → 0

as n → ∞, so x̃n → ỹ in X̃. Hence X̃ is complete, and (X̃, d̃) is a completion of (X, d).
Completion of a metric space is unique once we have clarified the meaning of uniqueness. Indeed, call two metric spaces (X, d) and (X′, d′) isometric if there exists a metric
preserving bijection from (X, d) onto (X′, d′). Since a metric preserving map is always one-to-one,
the inverse of such a mapping exists and is a metric preserving mapping from (X′, d′) to
(X, d). So two spaces are isometric provided there is a metric preserving map from one
onto the other. Two metric spaces will be regarded as the same if they are isometric,
since then they cannot be distinguished after identifying each point in X with its image in
X′ under the metric preserving mapping. With this understanding, the completion of a
metric space is unique in the following sense: If (Y, ρ) and (Y 0 , ρ0 ) are two completions of
(X, d), then (Y, ρ) and (Y 0 , ρ0 ) are isometric. We will not go into the proof of this fact,
but instead leave it to the interested reader. In any case, now it makes sense to use “the
completion” of X to replace “a completion” of X.
mathematicians in this period. Toward the end of the nineteenth century, people began
to feel the need to clarify various notions such as the convergence of series. They soon realized
that mathematics should be built upon the new theory of sets and the number systems. By
the effort of many people, the grand edifice of mathematics nowadays stands on
relatively solid ground.
Mathematics is all about deduction. Proceeding from a few axioms, together with
insightful definitions, people deduce results from the simple to the sophisticated. Set theory is
the first step. Consider, for instance, the axioms proposed by Zermelo and Fraenkel, which
carefully tell us how a set is constructed. A remarkable axiom in this theory is the axiom
of choice, which has many equivalent versions, including Zorn's lemma commonly used
in analysis. With the notion of a set, one introduces ordered pairs and relations. Among
the many relations, equivalence relations and mappings are the most useful.
Next come numbers. The construction of the number system follows the order:
natural numbers, integers, rational numbers, real numbers and finally complex numbers.
Natural numbers are introduced by the five axioms of Peano:
A1. Zero is a natural number.
A2. Every natural number has a successor in the natural numbers.
A3. Zero is not the successor of any natural number.
A4. If the successor of two natural numbers is the same, then the two original numbers
are the same.
A5. If a set contains zero and contains the successor of every number in it, then the set
contains all natural numbers.
With these five axioms one establishes the unique factorization
property of natural numbers and introduces prime numbers; thus classical number theory
is born. After defining the integers and their arithmetic, one introduces rational numbers as
ordered pairs (p, q) where p, q are integers, q ≠ 0. Rational numbers consist of the equivalence
classes of (p, q) under the relation (p, q) ∼ (r, s) if and only if ps = qr. The arithmetic and
ordering of the integers are easily extended to all rational numbers.
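The pair construction can be sketched in a few lines (the function names `equiv`, `add`, and `mul` are our own illustrative choices): arithmetic is defined on representatives, and one checks that it respects the relation (p, q) ∼ (r, s) iff ps = qr.

```python
# Rationals as equivalence classes of integer pairs (p, q), q != 0,
# with (p, q) ~ (r, s) iff p*s == q*r.
def equiv(a, b):
    (p, q), (r, s) = a, b
    return p * s == q * r

def add(a, b):
    # p/q + r/s = (p*s + r*q) / (q*s), computed on representatives
    (p, q), (r, s) = a, b
    return (p * s + r * q, q * s)

def mul(a, b):
    # (p/q) * (r/s) = (p*r) / (q*s)
    (p, q), (r, s) = a, b
    return (p * r, q * s)
```

For example, (1, 2) and (2, 4) represent the same rational, and add((1, 2), (1, 3)) returns (5, 6), a representative of 5/6; different representatives of the summands yield equivalent results.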
There are two popular constructions of the real numbers from the rational numbers: Dedekind
cuts and Cantor's Cauchy sequences. The latter was described briefly in class. You may search the
internet to learn more. (This is not in the scope of MATH3060.)
Recall that in Bartle and Sherbert's book, the construction of the real numbers is replaced by a few
additional axioms. For instance, it is assumed that R is a field satisfying certain order
properties. A crucial assumption is the supremum property: every nonempty
subset of R which is bounded from above has a supremum. It is this axiom which enables
us to deduce the Nested Interval Theorem, the Bolzano-Weierstrass Theorem, the Completeness Theorem, etc.
All these postulations become superfluous after the construction of R from Q. One
can prove the supremum property and then deduce all the other theorems; look up Wiki for more.
Comments on Chapter 3. There are two popular constructions of the real number
system, Dedekind cuts and Cantor’s Cauchy sequences. Although the number system is
fundamental in mathematics, we did not pay much attention to its rigorous construction.
It is too dry and lengthy to be included in Mathematical Analysis I. Indeed, there are two
sophisticated steps in the construction of the real numbers from scratch, namely, the construction
of the natural numbers by Peano's axioms and the construction of the real numbers from
the rational numbers. The other steps are much easier. Cantor's construction of the irrationals
from the rationals is adapted to construct the completion for a metric space in Theorem
3.2. You may google under the key words “Peano’s axioms, Cantor’s construction of the
real numbers, Dedekind cuts” for more.
Contraction Mapping Principle, or the Banach Fixed Point Theorem, was established by the
Polish mathematician S. Banach (1892-1945) in his 1922 doctoral thesis. He is a founder
of functional analysis and operator theory. According to P. Lax, “During the Second
World War, Banach was one of a group of people whose bodies were used by the Nazi
occupiers of Poland to breed lice, in an attempt to extract an anti-typhoid serum. He
died shortly after the conclusion of the war.” The interested reader should look up his
biography at Wiki.
An equally famous fixed point theorem is Brouwer's Fixed Point Theorem. It states
that every continuous map from a closed ball in Rn to itself admits at least one fixed
point. Here it is not the map but the geometry, or more precisely, the topology of the
ball that matters. You will learn it in a course on topology.
The Inverse and Implicit Function Theorems, which reduce complicated structures to simpler
ones via linearization, are the most frequently used tools in the study of the local
behavior of maps. We learned these theorems and some of their applications in Advanced
Calculus I already. In view of this, we provide detailed proofs here but leave
out many standard applications. You may look up Fitzpatrick, "Advanced Calculus", to
refresh your memory. By the way, the proof in this book does not use the Contraction Mapping Principle.
The case of polar coordinates (see Example 3.8) shows that a locally invertible map
may not be globally invertible. A theorem of Hadamard asserts that a continuous, locally
bijective map F is globally bijective under an additional condition, namely, |F(x)| → ∞
as |x| → ∞. Incidentally, let us mention the celebrated Jacobian conjecture. Consider
a map F : Rn → Rn whose components Fj are polynomials in x1, · · · , xn. Assume
that its Jacobian determinant is a nonzero constant. The conjecture asserts that this
map is globally bijective and its inverse is also a polynomial map. Except for some special
cases, this conjecture is still open.