METRIC SPACES
KEITH CONRAD
1. Introduction
As calculus developed, eventually turning into analysis, concepts first explored on the
real line (e.g., a limit of a sequence of real numbers) eventually extended to other spaces
(e.g., a limit of a sequence of vectors or of functions), and in the early 20th century a general
setting for analysis was formulated, called a metric space. It is a set on which a notion of
distance between any two elements is defined, and in which notions from calculus in R (open
and closed intervals, convergent sequences, continuous functions) can be studied. Many of
the fundamental types of spaces used in analysis are metric spaces (e.g., Hilbert spaces and
Banach spaces), so metric spaces are one of the first abstractions that have to be mastered
in order to learn analysis.
2. Metric spaces
In R, the magnitude of a number x is its absolute value |x| and the distance between
two numbers x and y is the absolute value of their difference: |x − y|. In Rm, the length of
a vector x = (x1, . . . , xm) is its norm $\|x\| = \sqrt{x_1^2 + \cdots + x_m^2}$ and the distance between two
vectors x = (x1, . . . , xm) and y = (y1, . . . , ym) is the norm of their difference:
$$\|x - y\| = \sqrt{(x_1 - y_1)^2 + \cdots + (x_m - y_m)^2}.$$
The distance between points is essential in defining limits, the central idea of calculus.
There are limits of function values and limits of sequences. Focusing on the case of sequences
(we will deal with limits and continuous functions in Section 8), we say a sequence {xn } of
real numbers has limit x, and write limn→∞ xn = x or just xn → x, if for every ε > 0 there
is an N ≥ 1 (it is understood that N = Nε is an integer depending on ε) such that
n ≥ N =⇒ |xn − x| < ε.
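As a concrete instance of this definition, here is a minimal Python sketch that produces a valid N for the sequence xn = 1/n with limit 0. The helper name `find_N` and the use of exact fractions are choices made for this illustration only; the finite spot-check at the end is evidence, not a proof.

```python
from fractions import Fraction
import math

def find_N(eps):
    """Return an integer N >= 1 with |1/n - 0| < eps for every n >= N.

    For x_n = 1/n any integer N > 1/eps works, so we take floor(1/eps) + 1.
    """
    return math.floor(1 / eps) + 1

eps = Fraction(1, 100)
N = find_N(eps)
# Spot-check the implication n >= N  =>  |x_n - 0| < eps on a finite range.
ok = all(abs(Fraction(1, n) - 0) < eps for n in range(N, N + 1000))
print(N, ok)
```

Note that N is not unique: any larger integer also works, which is why the definition only asks for the existence of some N depending on ε.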
Distances are useful not only between points in Euclidean space, but also between functions.
For continuous functions f, g : [0, 1] → R, here are two different ways of defining how
far apart they are:
(2.1) $\max_{0 \le x \le 1} |f(x) - g(x)|, \qquad \int_0^1 |f(x) - g(x)|\,dx.$
What do these mean for the graphs of the functions below (in red and blue)?
[Figure: graphs of f (red) and g (blue) over [0, 1]; a dashed vertical segment marks the largest gap between them.]
The first formula in (2.1) is the length of the largest vertical line separating the graphs
(the dashed line in the diagram), so saying f and g are close in this way means their graphs
never get far apart from each other. The second formula is the area of the region over [0, 1]
that is enclosed by both graphs (“area between the curves”), so f and g are close in this
way if, roughly speaking, the graphs can only be far apart over small regions (thereby not
affecting the total area between the curves that much).
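To make the two distances in (2.1) concrete, the following Python sketch approximates both of them on a grid. The function names, the sample functions f(x) = x and g(x) = x², and the grid size are assumptions made for this illustration; the numbers produced are numerical approximations, not exact values.

```python
def sup_distance(f, g, samples=10_000):
    """Approximate the max over [0, 1] of |f(x) - g(x)| on an evenly spaced grid."""
    return max(abs(f(k / samples) - g(k / samples)) for k in range(samples + 1))

def integral_distance(f, g, samples=10_000):
    """Approximate the integral over [0, 1] of |f(x) - g(x)| by the midpoint rule."""
    h = 1 / samples
    return h * sum(abs(f((k + 0.5) * h) - g((k + 0.5) * h)) for k in range(samples))

f = lambda x: x
g = lambda x: x * x

d_sup = sup_distance(f, g)       # exact value is 1/4, attained at x = 1/2
d_int = integral_distance(f, g)  # exact value is 1/6
print(d_sup, d_int)
```

For this pair the largest vertical gap (1/4) is bigger than the area between the graphs (1/6); the two notions of distance measure different things and need not agree.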
The desire to create a single framework for all the known settings where limit ideas are
used inspired Maurice Fréchet in his 1906 PhD thesis to make the following definition.
Definition 2.1. A metric on a set X is a function d : X × X → R satisfying the following
three properties:
(i) d(x, y) ≥ 0 for all x and y in X, with d(x, y) = 0 if and only if x = y,
(ii) d(x, y) = d(y, x) for all x and y in X,
(iii) d(x, y) ≤ d(x, z) + d(z, y) for all x, y, z ∈ X.
A set X together with a choice of a metric d on it is called a metric space and is denoted
(X, d), or just denoted X if the metric [1] is understood from context.
The third property in the definition of a metric is called the triangle inequality since it
abstracts the fact that the length of one side of a triangle is at most the sum of the lengths
of the other two sides (see figure below).
[Figure: triangle with vertices x, y, z whose sides are labeled d(x, y), d(x, z), and d(z, y).]
[1] In his thesis, Fréchet did not use the term “metric,” but instead wrote écart, which is French for “gap.”
which sets all pairs of (distinct) points in X at distance 1 from each other. All three
conditions for being a metric are easy to check.
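The three conditions of Definition 2.1 can be tested mechanically on a finite sample of points. The sketch below is only an illustration: the helper name `violates_metric_axioms` and the sample points are hypothetical, and passing the check on a sample can refute a candidate metric but never proves the axioms for the whole set.

```python
import itertools
import math

def violates_metric_axioms(d, points, tol=1e-12):
    """Return a description of the first violated axiom on the sample, or None."""
    for x, y in itertools.product(points, repeat=2):
        if d(x, y) < 0 or (d(x, y) == 0) != (x == y):
            return ("positivity", x, y)
        if d(x, y) != d(y, x):
            return ("symmetry", x, y)
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:
            return ("triangle inequality", x, y, z)
    return None

discrete = lambda x, y: 0 if x == y else 1   # all distinct points at distance 1
euclid = lambda x, y: math.dist(x, y)        # Euclidean metric in the plane
squared = lambda x, y: (x - y) ** 2          # not a metric on R

plane_points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 4.0)]
print(violates_metric_axioms(discrete, ["a", "b", "c"]))
print(violates_metric_axioms(euclid, plane_points))
print(violates_metric_axioms(squared, [0, 1, 2]))
```

The squared difference fails because (0 − 2)² = 4 exceeds (0 − 1)² + (1 − 2)² = 2, so it violates the triangle inequality even though it is nonnegative and symmetric.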
Example 2.6. If d is a metric on X, then the functions d′(x, y) = min(d(x, y), 1) and
d″(x, y) = d(x, y)/(1 + d(x, y)) are also metrics on X. The only part that requires careful
checking is the triangle inequality, which is left to the reader. Note d′ and d″ are both
bounded: they never take a value larger than 1. These two metrics are similar to the
original metric d when d has value at most 1:
(2.3) d(x, y) ≤ 1 =⇒ d′(x, y) = d(x, y) and (1/2) d(x, y) ≤ d″(x, y) ≤ d(x, y).
If d(x, y) > 1 then d′ and d″ change the distance between x and y in different ways: d′
redefines the distance to be 1, while d″ makes the distance less than 1 in a smoother way.
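The following Python sketch spot-checks the claims of Example 2.6 numerically. It assumes, purely for illustration, that the base metric d is the absolute value metric on R and that a handful of sample points is representative; the names `d_prime` and `d_double_prime` stand for the two bounded metrics of the example.

```python
import itertools

def d(x, y):
    """Base metric for the demo: absolute value on the real line."""
    return abs(x - y)

def d_prime(x, y):
    return min(d(x, y), 1)

def d_double_prime(x, y):
    return d(x, y) / (1 + d(x, y))

points = [0.0, 0.25, 0.5, 1.0, 1.5, 3.0, 10.0]

# (2.3): when d(x, y) <= 1, d' agrees with d and d/2 <= d'' <= d.
holds_2_3 = all(
    d_prime(x, y) == d(x, y) and d(x, y) / 2 <= d_double_prime(x, y) <= d(x, y)
    for x, y in itertools.product(points, repeat=2)
    if d(x, y) <= 1
)

# Triangle inequality for d' and d'' on every triple from the sample.
triangle_ok = all(
    m(x, y) <= m(x, z) + m(z, y) + 1e-12
    for m in (d_prime, d_double_prime)
    for x, y, z in itertools.product(points, repeat=3)
)
print(holds_2_3, triangle_ok, d_prime(0.0, 10.0), d_double_prime(0.0, 10.0))
```

A sample check like this is no substitute for the proof of the triangle inequality left to the reader, but it does show the different behavior for large distances: the first metric cuts off at 1, while the second approaches 1 without reaching it.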
Remark 3.5. If we had stuck with ε rather than ε/2 all the way through the proof then
we’d get 2ε at the end instead of ε, and then we’d have to say “Now go back and replace ε
with ε/2 . . .” to get the desired conclusion. The idea of using ε/2 in place of ε in the middle
in order to get a single ε at the end is called an ε/2 argument. This type of reasoning occurs
all the time in analysis. Instead of ε/2 one might use ε/3, √ε, or ε/2^n (the last one is good
if we have a whole sequence of terms that need bounds whose sum is still less than ε).
Theorem 3.6. Every subsequence of a convergent sequence in a metric space is also
convergent, with the same limit.
Proof. Let x_n → x in (X, d) and let {x_{n_i}} be a subsequence of {x_n}. Then n_1 < n_2 < · · · .
Set y_i = x_{n_i}. We want to show y_i → x.
For ε > 0 there is an N such that n ≥ N =⇒ d(x_n, x) < ε. Since the integers n_i are
increasing, we have n_i ≥ N if we go out far enough: there’s an I such that i ≥ I =⇒ n_i ≥
N =⇒ d(x_{n_i}, x) < ε, so d(y_i, x) < ε. Thus y_i → x.
Theorem 3.7. In a metric space (X, d), if two sequences {xn} and {x′n} converge to the
same value then d(xn, x′n) → 0.
Proof. Suppose xn → x and x′n → x. Then d(xn, x′n) ≤ d(xn, x) + d(x, x′n) = d(xn, x) +
d(x′n, x) and the last two terms get small for large n. This suggests using an ε/2 argument.
For each ε > 0 there’s an N1 such that n ≥ N1 =⇒ d(xn, x) < ε/2 and an N2 such that
n ≥ N2 =⇒ d(x′n, x) < ε/2. Set N = max(N1, N2), so
n ≥ N =⇒ d(xn, x′n) ≤ d(xn, x) + d(x′n, x) < ε/2 + ε/2 = ε.
The converse to Theorem 3.7 is false: sequences for which d(xn, x′n) → 0 do not have
to converge. After all, let {xn} be an arbitrary sequence and let x′n = xn for all n, so
d(xn, x′n) = 0 all the time.
The next result is a partial converse to Theorem 3.7.
Theorem 3.8. In a metric space (X, d), if xn → x and {x′n} is a sequence such that
d(xn, x′n) → 0 then x′n → x.
Proof. This will be an ε/2 argument.
Pick ε > 0. We want to find an N such that n ≥ N =⇒ d(x′n, x) < ε.
There’s an N1 such that n ≥ N1 =⇒ d(xn, x) < ε/2. Since the real numbers d(xn, x′n)
tend to 0, there’s an N2 such that n ≥ N2 =⇒ d(xn, x′n) < ε/2. Setting N = max(N1, N2),
we have
n ≥ N =⇒ d(x′n, x) ≤ d(x′n, xn) + d(xn, x) < ε/2 + ε/2 = ε.
Example 3.9. On Rm , because the metrics dE and d∞ are each bounded above by a
constant multiple of the other (see (2.2)), we have dE (xn , x) → 0 if and only if d∞ (xn , x) →
0. Therefore convergence of sequences in Rm for both metrics means the same thing (with
the same limits).
Example 3.10. In Example 2.6 we introduced two alternatives to a metric d that are both
bounded metrics: d′(x, y) = min(d(x, y), 1) and d″(x, y) = d(x, y)/(1 + d(x, y)). Condition
(2.3) shows d′ and d″ have the same convergent sequences and limits as d.
Example 3.11. In C[0, 1] consider the sequence of functions x^n for n ≥ 1, graphed below.
[Figure: graphs of the functions x^n over [0, 1].]
In fact the sequence {x^n} in C[0, 1] has no limit at all relative to the metric d∞.
To prove {x^n} has no limit in (C[0, 1], d∞), not just that the constant function 0 is not
a limit, we seek a property that all convergent sequences satisfy and the sequence {x^n} in
(C[0, 1], d∞) does not satisfy. Theorem 3.4 tells us every convergent sequence in a metric
space has the distance between consecutive terms tending to 0. Might d∞(x^n, x^{n+1}) not
tend to 0? Well,
$$d_\infty(x^n, x^{n+1}) = \max_{0 \le x \le 1} |x^n - x^{n+1}| = \max_{0 \le x \le 1} (x^n - x^{n+1})$$
[Figure: graphs of y = fn(x) and y = f(x) over [0, 1].]
When fn → f in (C[0, 1], d∞ ) we say fn → f uniformly on [0, 1]. This implies pointwise
convergence: fn (a) → f (a) for each a ∈ [0, 1] since
$$|f_n(a) - f(a)| \le \max_{0 \le x \le 1} |f_n(x) - f(x)| = d_\infty(f_n, f) \to 0.$$
However the converse is false: if fn (a) → f (a) for each a ∈ [0, 1] it does not mean
d∞ (fn , f ) → 0. Pointwise convergence does not imply convergence can be controlled in
the same way simultaneously on the whole domain [0, 1]. Consider the functions fn(x) on
[0, 1] graphed below: the graph of fn is an isosceles triangle of height 1 over [0, 1/n], and fn(x) = 0 for x ≥ 1/n.
We have fn(0) = 0 for all n, and for each a ∈ (0, 1] we have fn(a) = 0 for large enough n,
so fn(a) → 0 for each a ∈ [0, 1], but d∞(fn, 0) = 1 for every n, so fn does not get close to the function
0 as n gets large because every fn has a peak of height 1.
[Figure: graph of y = fn(x) over [0, 1], a triangle of height 1 over [0, 1/n].]
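The gap between pointwise and uniform convergence for these triangle functions can be seen in a short Python sketch. The explicit formula used below (peak at x = 1/(2n), linear on each side) is one way to realize the isosceles triangle described above and is an assumption of this illustration, as are the sample values of n and the point a = 1/10.

```python
from fractions import Fraction

def f(n, x):
    """Triangle of height 1 over [0, 1/n] with peak at x = 1/(2n), zero elsewhere."""
    if x <= 0 or x >= Fraction(1, n):
        return Fraction(0)
    if x <= Fraction(1, 2 * n):
        return 2 * n * x          # rising side of the triangle
    return 2 - 2 * n * x          # falling side of the triangle

a = Fraction(1, 10)
ns = (1, 5, 10, 100, 1000)
values_at_a = [f(n, a) for n in ns]                 # becomes 0 once 1/n <= a
peaks = [f(n, Fraction(1, 2 * n)) for n in ns]      # the peak value of each f_n
print(values_at_a, peaks)
```

At the fixed point a the values are eventually 0, while the peak value stays equal to 1 for every n, which is why the distance d∞(fn, 0) does not shrink.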
for x in the interval [−1, 1], but what does it converge to? The subtlety here is that we are
saying something converges without identifying the limit in any concrete way. (Saying “the
series is what the limit is” sounds more circular than explanatory.) Often we want to prove
a sequence (of numbers or functions or shapes) has a limit even if we don’t yet have a tidy
name for the limiting object. How can convergence be detected before the limit is known?
A clue is in Theorem 3.4: if xn → x in a metric space (X, d) then d(xn , xn+1 ) → 0. The
conclusion makes no reference to the original limit x. Unfortunately, this property that
the terms become “consecutively close” is not good enough to characterize convergence in
general metric spaces. We saw this in Example 3.11, where the sequence of power functions
x^n in (C[0, 1], d∞) does not converge but d∞(x^n, x^{n+1}) ∼ (1/e)(1/(n+1)) → 0. A more basic
example is the harmonic series, which diverges even though its partial sums Hn = 1 + 1/2 + · · · + 1/n
are consecutively close: |Hn − Hn+1| = 1/(n + 1) → 0.
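The asymptotic estimate for the power functions can be checked numerically. The sketch below relies on a calculus fact not spelled out in this passage: the function x^n − x^(n+1) on [0, 1] has its maximum at x = n/(n+1), where its derivative vanishes. The function name `consecutive_gap` is hypothetical.

```python
import math

def consecutive_gap(n):
    """Maximum of x^n - x^(n+1) on [0, 1], evaluated at the critical point n/(n+1)."""
    x = n / (n + 1)
    return x**n * (1 - x)

for n in (10, 100, 1000):
    estimate = (1 / math.e) / (n + 1)
    # The ratio gap/estimate should approach 1 as n grows.
    print(n, consecutive_gap(n), estimate, consecutive_gap(n) / estimate)
```

The gaps shrink toward 0 at the rate predicted by the estimate, even though the sequence of power functions itself has no limit in (C[0, 1], d∞).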
By making a slight change in the proof of Theorem 3.4, we get a much stronger conclusion
than consecutive closeness, and this stronger conclusion will be exactly what we need.
Theorem 4.1. If {xn } is a convergent sequence in a metric space (X, d) then the terms
of the sequence become “uniformly close”: for every ε > 0 there is an N ≥ 1 such that
m, n ≥ N =⇒ d(xm , xn ) < ε.
Proof. We run through the proof of Theorem 3.4 and make a few changes. Letting x =
limn→∞ xn , the triangle inequality tells us for all m and n that
d(xm , xn ) ≤ d(xm , x) + d(x, xn ) = d(xm , x) + d(xn , x).
We now make an ε/2 argument. For every ε > 0 there’s an N ≥ 1 such that for all n ≥ N
we have d(xn , x) < ε/2. Therefore
m, n ≥ N =⇒ d(xm, xn) ≤ d(xm, x) + d(xn, x) < ε/2 + ε/2 = ε.
This concept of uniform closeness, which is a property of a sequence involving no direct
reference to a hypothetical limit, is much more stringent than consecutive closeness. For
example, if Hn = 1 + 1/2 + · · · + 1/n then the numbers Hn are consecutively close (that
is, |Hn − Hn+1 | → 0) but they are not uniformly close. It can be shown, for instance, that
|Hn − H2n | → log 2 ≈ .693.
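The difference between consecutive closeness and uniform closeness for the harmonic partial sums is easy to observe numerically. This Python sketch is only an illustration; the helper name `H` and the sample values of n are choices made here.

```python
import math

def H(n):
    """Partial sum 1 + 1/2 + ... + 1/n of the harmonic series."""
    return math.fsum(1 / k for k in range(1, n + 1))

for n in (10, 100, 10_000):
    consecutive = H(n + 1) - H(n)   # equals 1/(n+1), which tends to 0
    spread = H(2 * n) - H(n)        # approaches log 2, so it does not tend to 0
    print(n, consecutive, spread, math.log(2))
```

No matter how far out we go, the terms Hn and H2n stay roughly log 2 apart, so no single N can make all later terms close to each other.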
When calculus was acquiring rigorous foundations in the 19th century, it was realized that
uniform closeness in Theorem 4.1 captures the idea of convergence for sequences without
mentioning a limit for the sequence. This property is not actually called uniform closeness,
but is named in honor of Cauchy, who articulated and used it in the setting of infinite series.
Definition 4.2. A sequence {xn } in a metric space (X, d) is called a Cauchy sequence if
for every ε > 0 there is an N = Nε such that for all m, n ≥ N we have d(xm , xn ) < ε.
Theorem 4.3. Every convergent sequence in a metric space is a Cauchy sequence.
Proof. This is Theorem 4.1.
Corollary 4.4. If (X, d) is a metric space and Y is a subset of X given the induced metric
d|Y , then any sequence in Y that converges in X is a Cauchy sequence in (Y, d|Y ).
Proof. A sequence {yn } in Y that converges in X is Cauchy in X by Theorem 4.3. Since
the metric d on X is the metric we are using on Y , the Cauchy property of {yn } in X can
be viewed as the Cauchy property in Y .
Example 4.5. Consider the interval (0, ∞) as a metric space using the absolute value
metric induced from R. We have 1/n → 0 in R, but the sequence {1/n} has no limit in
(0, ∞) since 0 ∉ (0, ∞). The sequence {1/n} is a Cauchy sequence in (0, ∞) by Corollary
4.4 but it is not a convergent sequence in (0, ∞).
Example 4.6. On Rm, the metrics dE and d∞ satisfy d∞ ≤ dE ≤ √m d∞, so a sequence
in Rm is Cauchy with respect to one of these metrics if and only if it is Cauchy with respect
to the other one.
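The two-sided comparison between dE and d∞ can be spot-checked on random vectors. The sketch below assumes the usual formulas, which are not restated in this passage: dE is the Euclidean distance and d∞ is the largest absolute difference of coordinates. The dimension m = 5, the sampling range, and the seed are arbitrary choices for the illustration.

```python
import math
import random

def d_E(x, y):
    """Euclidean distance between two vectors of the same length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def d_inf(x, y):
    """Largest absolute difference between corresponding coordinates."""
    return max(abs(a - b) for a, b in zip(x, y))

random.seed(0)  # fixed seed so the sample is reproducible
m = 5
pairs = [
    ([random.uniform(-10, 10) for _ in range(m)],
     [random.uniform(-10, 10) for _ in range(m)])
    for _ in range(1000)
]
# Check d_inf <= d_E <= sqrt(m) * d_inf, with a small tolerance for rounding.
sandwich_ok = all(
    d_inf(x, y) <= d_E(x, y) + 1e-12
    and d_E(x, y) <= math.sqrt(m) * d_inf(x, y) + 1e-12
    for x, y in pairs
)
print(sandwich_ok)
```

Because each metric is bounded by a constant multiple of the other, an ε-condition for one metric translates into an ε-condition for the other, which is all the Cauchy property needs.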
Being a Cauchy sequence means if you go out far enough into the sequence then all the
terms from some point onwards are as close together as you wish. While this property
is much stronger than being a consecutively close sequence, a sequence whose terms get
consecutively close rapidly enough is a Cauchy sequence. The next theorem says that being
consecutively close at least at the rate of a geometric progression is rapid enough.
Theorem 4.7. If {xn} is a sequence in a metric space (X, d) such that d(xn, xn+1) ≤ ar^n
for all n, where a > 0 and 0 < r < 1, then {xn} is a Cauchy sequence.
Example 4.12. The closed intervals [0, 1] and [0, ∞) with metric from R are complete.
The open intervals (0, 1) and (0, ∞) with metric from R are not complete: a sequence in
the interval that tends to 0 is Cauchy but does not converge in the interval.
Example 4.13. The rational numbers Q with the absolute value metric d(r, s) = |r − s|
are not complete. To prove this, pick an irrational real number L (like √2 or π) and let
rn be the sequence of decimal approximations to L truncated at the 1/10^n place. Each
finite decimal is rational, so rn ∈ Q. Since rn → L in R, {rn} is a Cauchy sequence in Q
(Corollary 4.4). However, {rn} has no limit in Q, so Q is not complete.
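For L = √2 the truncated decimals can be produced exactly with integer arithmetic. The Python sketch below is an illustration under that choice of L; the function name `r` mirrors the notation rn, and the checks cover only the first 29 terms.

```python
from fractions import Fraction
import math

def r(n):
    """Decimal expansion of sqrt(2) truncated after n digits, as an exact fraction."""
    # math.isqrt(2 * 10^(2n)) is the integer part of sqrt(2) * 10^n.
    return Fraction(math.isqrt(2 * 10 ** (2 * n)), 10 ** n)

terms = [r(n) for n in range(1, 30)]   # terms[i] is r(i + 1)

# Cauchy-type bound: truncations with n and m >= n digits differ by less than 10^(-n).
cauchy_ok = all(
    abs(terms[j] - terms[i]) < Fraction(1, 10 ** (i + 1))
    for i in range(len(terms))
    for j in range(i, len(terms))
)
# No term squares to 2, consistent with the limit sqrt(2) lying outside Q.
never_exact = all(t * t != 2 for t in terms)
print(r(3), cauchy_ok, never_exact)
```

Every term is a rational number and the terms cluster together as tightly as we like, yet the only possible limit is irrational.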
Example 4.14. The integers with the absolute value metric are complete: any Cauchy
sequence in Z (as a metric space inside R) is eventually constant. This is boring.
Example 4.15. A set X with the discrete metric on it (Example 2.5) is complete: if d
is the discrete metric and d(xm , xn ) < 1 then xm = xn , so every Cauchy sequence for the
discrete metric is an eventually constant sequence, which clearly converges.
Example 4.16. The metric space (C[0, 1], d∞ ) is complete. A proof is in Appendix A.
Example 4.17. The metric space (C[0, 1], d1 ) is not complete. To prove this we will follow
the idea of Example 4.13 by writing down a discontinuous function that is a d1 -limit of
continuous functions. Note the d1 -metric makes sense for piecewise continuous functions on
[0, 1], since they are integrable.
In the picture below, let f (x) (in red) be the discontinuous function that’s 0 for 0 ≤ x <
1/2 and 1 for 1/2 ≤ x ≤ 1. For n ≥ 2 let fn (x) be a piecewise linear approximation (in
blue) breaking from values 0 and 1 at x = 1/2 ± 1/n, where it’s linear in between.
[Figure: graphs of y = f(x) (red) and y = fn(x) (blue) over [0, 1].]
The region below fn(x) and above the x-axis to the left of x = 1/2 is a right triangle
with a base of width 1/2 − (1/2 − 1/n) = 1/n and height 1/2, so
$$\int_0^1 |f_n(x) - f(x)|\,dx = 2\int_0^{1/2} |f_n(x)|\,dx = 2 \cdot \frac{1}{2} \cdot \frac{1}{n} \cdot \frac{1}{2} = \frac{1}{2n}.$$
Therefore
$$d_1(f_m, f_n) = \int_0^1 |f_m(x) - f_n(x)|\,dx \le \int_0^1 |f_m(x) - f(x)|\,dx + \int_0^1 |f(x) - f_n(x)|\,dx = \frac{1}{2m} + \frac{1}{2n},$$
which tends to 0 as m, n → ∞. Thus {fn} is Cauchy in (C[0, 1], d1), but it has no limit in this
metric space. If fn → g in (C[0, 1], d1) then $\int_0^1 |f(x) - g(x)|\,dx = 0$ (this integral is at most
$\int_0^1 |f(x) - f_n(x)|\,dx + \int_0^1 |f_n(x) - g(x)|\,dx$, which tends to 0), so $\int_0^b |f(x) - g(x)|\,dx = 0$
and $\int_b^1 |f(x) - g(x)|\,dx = 0$ for any b in (0, 1). Using 0 < b < 1/2 in the first integral and
1/2 < b < 1 in the second integral, $\int_0^b |g(x)|\,dx = 0$ for 0 < b < 1/2 and $\int_b^1 |1 - g(x)|\,dx = 0$
for 1/2 < b < 1. Therefore g(x) = 0 for 0 ≤ x < 1/2 and g(x) = 1 for 1/2 < x ≤ 1, but no
value can be assigned to g(1/2) to make this continuous.
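The value 1/(2n) for the d1-distance between fn and the step function f can be confirmed numerically. In this Python sketch the midpoint rule stands in for the integral, and the names `f`, `f_n`, and `d1` together with the grid size are assumptions made for the illustration.

```python
def f(x):
    """Step function: 0 on [0, 1/2) and 1 on [1/2, 1]."""
    return 0.0 if x < 0.5 else 1.0

def f_n(n, x):
    """Piecewise linear: 0 up to 1/2 - 1/n, 1 from 1/2 + 1/n on, linear in between."""
    left, right = 0.5 - 1 / n, 0.5 + 1 / n
    if x <= left:
        return 0.0
    if x >= right:
        return 1.0
    return (x - left) / (right - left)

def d1(g, h, samples=20_000):
    """Midpoint-rule approximation of the integral over [0, 1] of |g(x) - h(x)|."""
    step = 1 / samples
    return step * sum(
        abs(g((k + 0.5) * step) - h((k + 0.5) * step)) for k in range(samples)
    )

for n in (2, 4, 10, 100):
    print(n, d1(lambda x: f_n(n, x), f), 1 / (2 * n))
```

The computed distances track 1/(2n), so the continuous functions fn approach the discontinuous function f in the d1 sense even though f itself is not an element of C[0, 1].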
When a metric space is not complete, like (Q, | · |) or (C[0, 1], d1 ), we want to “fill in all
the holes” to create a complete metric space containing the original one. To describe this
larger space, we need one more concept.
Definition 4.18. A subset S ⊂ X is called dense if every element of X is the limit of a
sequence in S or equivalently every open ball around an element of X contains an element
of S.
Example 4.19. The rational numbers are dense in R but the integers are not dense in R.
Definition 4.20. A completion of a metric space (X, d) is a complete metric space $(\widehat{X}, \widehat{d})$
such that $\widehat{X}$ contains X, the metric $\widehat{d}$ restricted to X is d, and X is a dense subset of $\widehat{X}$.
When r = 0, B(a, 0) = ∅ and $\overline{B}(a, 0)$ = {a} by the axioms for a metric. Some writers
only consider balls to have positive radius.
In a general metric space, the picture to have in mind of open and closed balls should be
discs in the plane: the interior of a disc for open balls and the interior of a disc together
with its boundary circle for closed balls, as shown below.
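[Figure: an open ball (disc without its boundary circle) and a closed ball (disc with its boundary circle), each with center a and radius r.]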
In specific metric spaces there are better pictures than these. In R, an open ball B(a, r)
is the open interval (a − r, a + r) and a closed ball $\overline{B}(a, r)$ is the closed interval [a − r, a + r].
Pictured below are the open and closed balls in (C[0, 1], d∞ ) of radius r around a function
f (in blue), consisting of all functions whose graph deviates by less than r or at most r from
the graph of f .
[Figure: two panels over [0, 1] showing the open and closed balls of radius r around f in (C[0, 1], d∞).]
Definition 5.2. A subset of X is called bounded if it is contained in some ball B(a, r). A
subset that is not bounded is called unbounded.
Since $\overline{B}(a, r/2)$ ⊂ B(a, r) ⊂ $\overline{B}(a, r)$, talking about a subset being contained in an open
ball or a closed ball is the same thing: either case can be turned into the other by changing
the radius if necessary. For example, the definition of a bounded subset does not change if
we replace open balls in the definition with closed balls.
Theorem 5.3. Every convergent sequence in a metric space is bounded.
Proof. If xn → x then there’s an N such that n ≥ N =⇒ d(xn , x) < 1. Let
r = max(d(x_1, x) + 1, . . . , d(x_{N−1}, x) + 1, 1),
so d(xn , x) < r for all n ≥ 1. Thus the whole sequence is contained in B(x, r).
Theorem 5.4. Every Cauchy sequence in a metric space is bounded.
Proof. Let {xn } be Cauchy. There’s an N such that m, n ≥ N =⇒ d(xm , xn ) < 1. In
particular, d(xN , xn ) < 1 for n ≥ N , so as in the proof of Theorem 5.3 (using xN here in
place of x there) an r can be found such that the whole sequence is in B(xN , r).
Since convergent sequences are Cauchy, Theorem 5.3 is a special case of Theorem 5.4.
Definition 5.5. A subset U ⊂ X is called open if for each x ∈ U there’s an r > 0 such that
B(x, r) ⊂ U . We also consider the empty subset of X to be an open subset.
The idea behind the concept of a subset U being open is that the points in it are stable
under small perturbations: if we wiggle each element of U a little bit “in all directions”
then we stay inside of U . Exactly how much we can wiggle without leaving U depends on
how close we are to the “edge” of U .
Theorem 5.6. Every open ball B(a, r) in X is an open subset, and every open subset of
X is a union of open balls in X.
Proof. Since B(a, 0) = ∅ is open in X by definition, we can assume r > 0. Pick b ∈ B(a, r).
We want to find an r′ > 0 such that B(b, r′) ⊂ B(a, r). The picture below suggests using
r′ = r − d(a, b), which is positive since d(a, b) < r. A dashed circle with radius r′ is drawn
centered around b and just fits inside B(a, r).
[Figure: the ball B(a, r) containing a point b at distance d(a, b) from a, with a dashed circle of radius r′ around b.]
To show B(b, r′) ⊂ B(a, r), pick x ∈ B(b, r′). By the triangle inequality d(x, a) ≤
d(x, b) + d(b, a) < r′ + d(a, b) = r, so x ∈ B(a, r).
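The choice r′ = r − d(a, b) can be tried out in the Euclidean plane. The Python sketch below samples points of B(b, r′) and confirms they land in B(a, r); the specific center, radius, point b, seed, and the helper `random_point_in_ball` are all hypothetical choices for this illustration, and a random sample only illustrates the inclusion that the triangle inequality proves.

```python
import math
import random

random.seed(1)  # fixed seed so the sample is reproducible
a, r = (0.0, 0.0), 2.0
b = (1.2, 0.5)                  # a point of B(a, r), since d(a, b) = 1.3 < 2
r_prime = r - math.dist(a, b)   # the radius suggested in the proof

def random_point_in_ball(center, radius):
    """Sample a point of the plane at distance less than radius from center."""
    angle = random.uniform(0, 2 * math.pi)
    rho = radius * random.random()   # random() lies in [0, 1)
    return (center[0] + rho * math.cos(angle), center[1] + rho * math.sin(angle))

sample = [random_point_in_ball(b, r_prime) for _ in range(10_000)]
all_inside = all(math.dist(x, a) < r for x in sample)
print(r_prime, all_inside)
```

The closer b is to the edge of B(a, r), the smaller r′ becomes, matching the remark that the allowed amount of wiggling depends on the distance to the edge.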
If U ⊂ X is open then each point in U is the center of an open ball that’s inside U , by
definition, so U is a union of open balls.
A generic picture to have in mind for an open set in a metric space is a blob without its
boundary, as in the picture below.
Theorem 5.7. In a metric space, if {Ui} is any collection of open subsets then $\bigcup_{i \in I} U_i$ is
open. If U1, . . . , Un are finitely many open subsets then U1 ∩ · · · ∩ Un is open.
Proof. Each $x \in \bigcup_{i \in I} U_i$ is in some Ui, so there’s an open ball B(x, r) with r > 0 contained
in that Ui . Thus every point of the union is in an open ball that’s inside the union, so the
union is an open subset.
For the second part, we may assume each Ui is nonempty, since otherwise the intersection
is empty, and ∅ is open by definition. Pick x ∈ U1 ∩ · · · ∩ Un . For each j = 1, . . . , n we have
B(x, rj ) ⊂ Uj for some rj > 0, so B(x, r) ⊂ U1 ∩ · · · ∩ Un when r = min(r1 , . . . , rn ) > 0.
It is false that the intersection of infinitely many open sets has to be open. For example,
in R we can write $[0, 1] = \bigcap_{n \ge 1} (-1/n, 1 + 1/n)$ and [0, 1] is not an open subset of R.
Definition 5.8. A subset C ⊂ X is called closed if for any sequence in C that has a limit
in X, the limit is in C: if cn ∈ C and cn → x ∈ X then x ∈ C. We also consider the empty
subset of X to be a closed subset.
A subset being closed means limit operations don’t take us outside the set. The following
picture is a closed blob. By including the boundary curve, no sequence in the closed blob
can have a limit outside the closed blob.
[Figure: the ball of radius r centered at a, a point x outside it, a point y near x, and a term xn of the sequence.]
Consider the open ball with center x and radius r′ = d(a, x) − r. Then r′ > 0. The
picture above suggests that no element of B(x, r′) lies in $\overline{B}(a, r)$, and we can prove this
using the triangle inequality:
y ∈ B(x, r′) =⇒ d(a, x) ≤ d(a, y) + d(y, x) < d(a, y) + r′ =⇒ d(a, y) > d(a, x) − r′ = r,
so y ∉ $\overline{B}(a, r)$.
Since xn → x, some xn is in B(x, r′), but we proved B(x, r′) is disjoint from $\overline{B}(a, r)$, so
xn ∉ $\overline{B}(a, r)$ and that’s a contradiction. Thus “d(a, x) > r” is false, so d(a, x) ≤ r.
Here is the “closed” counterpart of Theorem 5.7.
Theorem 5.10. In a metric space, if {Ci} is any collection of closed subsets then $\bigcap_{i \in I} C_i$
is closed. If C1, . . . , Cn are finitely many closed subsets then C1 ∪ · · · ∪ Cn is closed.
Proof. We can assume $\bigcap_{i \in I} C_i \ne \emptyset$, since the empty set is closed by definition. Thus we
can assume each Ci is nonempty.
Call the metric space X and let {cn} be a sequence in $\bigcap_{i \in I} C_i$ that converges to some
x ∈ X. For each i ∈ I we have {cn} ⊂ Ci, so x ∈ Ci since Ci is closed. Thus $x \in \bigcap_{i \in I} C_i$.
For the second part, we can assume at least one of C1 , . . . , Cn is nonempty, so the union
is nonempty. Let {cn } be a sequence in C1 ∪ · · · ∪ Cn with limit x ∈ X. We want to show
x is in the union. A sequence has infinitely many terms, so one of C1 , . . . , Cn contains
infinitely many terms of the sequence. That is, a subsequence of {cn } lies entirely inside
some Cj . Since any subsequence of a convergent sequence also converges with the same
limit (Theorem 3.6), and Cj is closed, it follows that x ∈ Cj ⊂ C1 ∪ · · · ∪ Cn .
It is false that the union of infinitely many closed subsets has to be closed. For example,
in R we can write $(0, 1) = \bigcup_{n \ge 1} [1/n, 1 - 1/n]$ and (0, 1) is not a closed subset of R.
Theorem 5.11. Every closed subset of a complete metric space is complete.
Proof. Let C be a closed subset of X. Any Cauchy sequence in C is Cauchy in X, so it has
a limit x ∈ X by completeness of X. Since C is a closed subset of X, we must have x ∈ C.
Thus all Cauchy sequences in C converge in C, so C is complete.
Definition 5.12. A limit point of a subset S ⊂ X is a point of X that is the limit of some
sequence in S.
Definition 5.13. The closure of a subset S ⊂ X is the union of S and its limit points in
X. This set is denoted $\overline{S}$.
Example 5.14. For all a ∈ Rm and r > 0, $\overline{B(a, r)} = \overline{B}(a, r)$: the closure of every
(nonempty) open ball in Rm is the closed ball with the same center and radius. In general
metric spaces, this intuitively appealing property might be false.
Theorem 5.15. For any subset S ⊂ X, its closure $\overline{S}$ is closed in X and $\overline{S}$ is the smallest
closed subset of X containing S. In particular, S is closed if and only if S = $\overline{S}$.
Proof. If S ⊂ C ⊂ X and C is closed in X then any limit point of S lies in C, so $\overline{S}$ ⊂ C. It
remains to show $\overline{S}$ is closed.
Let {xn} be a sequence in $\overline{S}$ with limit x ∈ X. Since each xn is a limit point of S, there’s
some sn ∈ S such that d(xn, sn) < 1/n. Then xn → x and d(xn, sn) → 0, so sn → x by
Theorem 3.8. Therefore x ∈ $\overline{S}$.
Dense subsets were defined in Definition 4.18. They can also be described using closures.
Theorem 5.16. A subset S ⊂ X is dense if and only if $\overline{S}$ = X.
Proof. Saying S is dense means every element of X is a limit point of S, so X ⊂ $\overline{S}$. The
reverse containment is true by definition.
We have developed properties of open subsets and closed subsets separately. The following
theorem shows they are in fact complementary concepts!
Theorem 5.17. A subset of X is open if and only if its complement is closed.
Proof. Pick an open subset U. To prove its complement X − U is closed, we can assume
X − U ≠ ∅ since the empty set is closed by definition.
Let {xn} be a sequence in X − U with limit x ∈ X. To prove x ∈ X − U, assume this is
not true, so x ∈ U. Then there’s some r > 0 such that B(x, r) ⊂ U. However, since xn → x
the ball B(x, r) must contain some xn, which is impossible since xn ∈ X − U. Thus x ∉ U.
Now pick a closed subset C. We want to prove its complement X − C is open, and as
before we can assume X − C ≠ ∅ since the empty set is open by definition. Pick a point
x ∈ X − C. We want to find an r > 0 such that B(x, r) ⊂ X − C. Assume there is no
such r, so for every r > 0 the ball B(x, r) contains an element of C. Using the sequence
of radii 1/n, for each n ≥ 1 the ball B(x, 1/n) contains some element, say xn , of C. Then
d(x, xn ) < 1/n for all n, so xn → x. Since every xn is in C and C is closed, we deduce that
x ∈ C too. That is a contradiction, so X − C is open.
Example 5.18. In R, (0, 1) is open and its complement (−∞, 0] ∪ [1, ∞) is closed. The
interval [0, ∞) is closed and its complement (−∞, 0) is open.
Theorem 5.17 is not saying every subset of a metric space is open or closed. Most subsets
are neither. For example, in R the sets [0, 1) and [0, 1] ∪ (2, 3) are neither open nor closed.
In light of Theorem 5.17, we now see that Theorems 5.7 and 5.10 are equivalent since
complements exchange unions and intersections as well as open and closed subsets. For
example, if Ci are closed subsets of X then their complements Ui = X − Ci are open
subsets and
$$X - \bigcap_{i \in I} C_i = \bigcup_{i \in I} (X - C_i) = \bigcup_{i \in I} U_i,$$
so from Theorem 5.7 saying $\bigcup_{i \in I} U_i$ is open, Theorem 5.17 implies that $\bigcap_{i \in I} C_i$ is closed.
6. Compact subsets
Closed bounded intervals are nice for functions: every continuous real-valued function on
a closed bounded interval is bounded and has maximum and minimum values (see below
on left), but 1/x on (0, 1) is unbounded above and has no minimum (see below on right).
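[Figure: on the left, a continuous function on a closed bounded interval with its maximum and minimum values; on the right, the graph of 1/x on (0, 1).]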
What lies behind the better properties of continuous real-valued functions on closed
bounded intervals turns out to be a property of the interval having nothing to do with
functions: every sequence in a closed bounded interval has a subsequence converging in
that interval. Even if a sequence itself does not converge, some subsequence does. In a
general metric space X, subsets with this property get a special name.
Definition 6.1. A subset K of a metric space is called compact if every sequence in K has
a subsequence that converges in K.
The notion of a compact set (in a metric space) was first defined by Fréchet. We will see in
Section 8 some reasons why it is important. It is not an exaggeration to say compactness is
one of the most important concepts in mathematics. Its initial applications were in analysis,
but it is used in geometry, number theory, and even mathematical logic.
Here we will give some examples (and non-examples) and discuss some properties. We
start with the fundamental example.
Theorem 6.2. Every closed bounded interval [a, b] in R is compact.
Proof. Pick a sequence {xn } in [a, b]. All the terms of the sequence are within distance b − a
of each other. To extract a convergent subsequence, we use a repeated bisection method.
Break up [a, b] into two halves: [a, b] = [a, m] ∪ [m, b] where m = (a + b)/2 is the midpoint.
Infinitely many of the terms in the sequence {xn} have to be in [a, m] or infinitely many
have to be in [m, b] (or maybe both happen). Focusing on a subinterval with infinitely many
xn in it, we get a subsequence denoted $\{x_n^{(1)}\}$ in which all terms are within (b − a)/2 of each
other. Take that subinterval we chose and divide it into left and right halves (overlapping
at the midpoint). Once again, infinitely many terms of $\{x_n^{(1)}\}$ have to be in one of the two
halves, so by passing to the terms of the subsequence in such a half we get a new (refined)
subsequence $\{x_n^{(2)}\}$ in which all the terms are within (b − a)/4 of each other.
Repeating this process, we get for each k ≥ 0 a subsequence $\{x_n^{(k)}\}$ with n = 1, 2, 3, . . .
in which all the terms are within (b − a)/2^k of each other (with $x_n^{(0)} = x_n$). The way we
construct these subsequences makes them nested:
$$\{x_n\} = \{x_n^{(0)}\} \supset \{x_n^{(1)}\} \supset \{x_n^{(2)}\} \supset \{x_n^{(3)}\} \supset \cdots$$
Now set $y_n = x_n^{(n)}$ for n ≥ 1. The sequence {yn} is a subsequence of {xn} and |yn − yn+1| ≤
(b − a)/2^n. Since the sequence {yn} gets consecutively close at least as quickly as the
geometric progression (b − a)/2^n, it is a Cauchy sequence by Theorem 4.7. By completeness
of R, the sequence {yn} has a limit in R, and this limit is in [a, b] since closed intervals are
closed subsets (Theorem 5.9). Thus {xn} has a subsequence that converges in [a, b].
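The repeated bisection in this proof can be imitated on a finite list of numbers. The Python sketch below is only a finite stand-in for the argument: where the proof keeps a half containing infinitely many terms, the code keeps the half containing more of the remaining terms, and it stops after a fixed number of steps. The function name `bisection_subsequence`, the test sequence sin(n), and the step count are hypothetical choices for this illustration.

```python
import math

def bisection_subsequence(xs, a, b, steps):
    """Pick indices n_1 < n_2 < ... so that the k-th picked term lies in the
    k-th nested half-interval, imitating the repeated bisection argument."""
    lo, hi = a, b
    candidates = list(range(len(xs)))   # indices of terms in the current interval
    picked, last = [], -1
    for _ in range(steps):
        mid = (lo + hi) / 2
        left = [i for i in candidates if lo <= xs[i] <= mid]
        right = [i for i in candidates if mid <= xs[i] <= hi]
        # Finite stand-in for "contains infinitely many terms": keep the fuller half.
        if len(left) >= len(right):
            candidates, hi = left, mid
        else:
            candidates, lo = right, mid
        later = [i for i in candidates if i > last]
        if not later:
            break                       # ran out of terms in this finite list
        last = later[0]
        picked.append(last)
    return picked

xs = [math.sin(n) for n in range(1, 5001)]   # a bounded sequence in [-1, 1]
picked = bisection_subsequence(xs, -1.0, 1.0, 8)
print(picked, [xs[i] for i in picked])
```

Since the k-th and (k+1)-th picked terms both lie in the k-th interval, they differ by at most (b − a)/2^k, which is the geometric rate the proof feeds into Theorem 4.7.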
Example 6.3. The open interval (0, 1) is not compact in R since the sequence 1/2, 1/3,
1/4, . . . , 1/n, . . . does not have a convergent subsequence within (0, 1): the sequence
converges to 0, so any of its subsequences converges to 0, but 0 ∉ (0, 1). In a similar way, no
(nonempty) open interval in R is compact. [4]
Theorem 6.4. Every closed bounded box [a1 , b1 ] × · · · × [am , bm ] in Rm is compact.
Proof. Let $x_n = (x_{n1}, \ldots, x_{nm})$ be a sequence in the box. Look at the sequence of first
components: $\{x_{n1}\}$ is a sequence in $[a_1, b_1]$, so by compactness of this interval (Theorem
6.2) there is a convergent subsequence $\{x_{n_i 1}\}$ where $n_1 < n_2 < \cdots$, with limit $y_1 \in [a_1, b_1]$.
Consider the subsequence $x_{n_i} = (x_{n_i 1}, \ldots, x_{n_i m})$. The first components converge to $y_1$.
Look now at the second components $x_{n_i 2}$: it is a sequence in $[a_2, b_2]$, so by compactness
of this interval there is a convergent subsequence $x_{n_{i_j} 2}$ with limit $y_2 \in [a_2, b_2]$. The
corresponding subsequence of first components $x_{n_{i_j} 1}$ still converges to $y_1$ since a subsequence of
a convergent sequence has the same limit (Theorem 3.6).
Now the sub-subsequence $x_{n_{i_j}}$ in the box has its first components converge to $y_1$ and
its second components converge to $y_2$. Repeating this argument until we exhaust all the
components, we will finally get a subsequence of $\{x_n\}$ in which the kth components have a
limit $y_k \in [a_k, b_k]$, so that subsequence converges to $(y_1, \ldots, y_m)$, which is in the box.
Every bounded sequence in Rm lies in a closed bounded box, so Theorem 6.4 tells us
that every bounded sequence in Rm has a convergent subsequence. [5]
Theorem 6.5. Every closed and bounded subset of Rm is compact.
[4] A student taking a real analysis course told me the instructor focused a lot on [a, b] and the student didn’t
understand why there is a big fuss about distinguishing between [a, b] and (a, b) since “they only differ in
two points.” Those two points make a huge difference, since it’s why [a, b] is compact and (a, b) is not.
[5] This property of bounded sequences in Euclidean space is called the Bolzano–Weierstrass theorem.
Proof. Let C be a closed and bounded subset of Rm and {cn } be a sequence in C. We want
to show {cn } has a subsequence that converges in C.
Since C is bounded in Rm it lies in some open ball B(a, r), which in turn lies in the
closed box [a1 − r, a1 + r] × · · · × [am − r, am + r]. This box is compact (Theorem 6.4), so
{cn } has a subsequence converging in this box, and the limit of this subsequence lies in C
since C is closed.
Example 6.6. Every closed ball in Rm is compact, since closed balls are closed subsets
and are clearly bounded.
Theorem 6.7. Every compact subset of a metric space is closed and bounded.
Proof. Let K be a compact subset of the metric space X.
K is closed: Suppose cn ∈ K and cn → x ∈ X. We want to show x ∈ K. By compactness
of K, there is a subsequence cni with a limit c ∈ K. When a sequence converges, every
subsequence converges to the same limit (Theorem 3.6), so cni → x. Thus c = x, so x ∈ K.
K is bounded: We will prove, contrapositively, that an unbounded subset S of a metric
space is not compact. Pick s0 ∈ S. Since S is unbounded, for each integer n ≥ 1 there is
an sn ∈ S such that d(s0 , sn ) > n. It should be intuitively clear that the sequence {sn } is
not bounded. To prove this, we show every open ball in X can contain only finitely many
sn : if sn ∈ B(a, r), then
n < d(s0 , sn ) ≤ d(s0 , a) + d(a, sn ) < d(s0 , a) + r,
which is false for large enough n. Since no open ball contains infinitely many sn , every
subsequence of {sn } is unbounded. Therefore no subsequence of {sn } can converge, since
convergent sequences are bounded (Theorem 5.3).
Theorems 6.5 and 6.7 together give the following important characterization of compact
subsets of Euclidean space.
Theorem 6.8. A subset of Rm is compact if and only if it is closed and bounded.6
Proof. By Theorem 6.7, every compact subset of any metric space is closed and bounded.
By Theorem 6.5, every closed and bounded subset of Rm is compact.
Theorem 6.8 is not true in general metric spaces: a closed and bounded subset of a metric
space does not have to be compact. That is, the converse of Theorem 6.7 in some metric
spaces is false.
Example 6.9. On Rm change the Euclidean metric dE to one of the bounded metrics
min(1, dE ) or dE /(1 + dE ). Convergent sequences and limits in Rm for these metrics are
the same as for dE (Example 3.10), so they define the same closed subsets (and, by taking
complements, the same open subsets) of Rm as dE does. All closed subsets of Rm are
bounded in these metrics, but many closed subsets of Rm (like Rm itself) are not compact.
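The boundedness claim in Example 6.9 is easy to see numerically. The following Python sketch is only an illustration: the helper names d_euclid, d_capped, and d_ratio are ours, not notation from the text, and points of Rm are represented as tuples.

```python
import math

def d_euclid(x, y):
    """Euclidean distance d_E between two points of R^m given as tuples."""
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

def d_capped(x, y):
    """The bounded metric min(1, d_E)."""
    return min(1.0, d_euclid(x, y))

def d_ratio(x, y):
    """The bounded metric d_E / (1 + d_E)."""
    d = d_euclid(x, y)
    return d / (1.0 + d)

# Points far apart in d_E are still at distance at most 1 in the new metrics.
origin = (0.0, 0.0)
for k in (1, 10, 100, 1000):
    p = (float(k), 0.0)
    print(k, d_euclid(origin, p), d_capped(origin, p), d_ratio(origin, p))
```

The points (k, 0) form an unbounded sequence for dE, yet their distances from the origin never exceed 1 in either of the other two metrics, which is why every subset of Rm is bounded for them.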
Here’s a less weird counterexample to the converse of Theorem 6.7 (no strange metric).
Example 6.10. We will show in the complete metric space (C[0, 1], d∞ ) that the closed
unit ball B(0, 1) is not compact.7 The sequence of functions x^n lies in this ball and we will
6This is called the Heine–Borel theorem.
7While compactness in (C[0, 1], d∞ ) is not the same as being closed and bounded, there is a set of conditions
in (C[0, 1], d∞ ) useful for analysis that is equivalent to compactness. Google the Arzelà–Ascoli theorem.
show it has no convergent subsequence. In Example 3.11 we showed this sequence is not
convergent, but saying it has no convergent subsequence is much stronger.
Suppose a subsequence {x^{n_i}} has a limit f in (C[0, 1], d∞ ). For each a ∈ [0, 1] we have
|a^{n_i} − f (a)| ≤ max_{0≤x≤1} |x^{n_i} − f (x)| = d∞ (x^{n_i} , f ),
member of the open covering, say x ∈ Ui . Since Ui is open, we have B(x, 1/m) ⊂ Ui for
some m ≥ 1. Since nj → ∞ and d(xnj , x) → 0, for suitably large nj we have both nj > 2m
and d(xnj , x) < 1/(2m). For that nj we have the implication
y ∈ B(xnj , 1/nj ) =⇒ d(y, x) ≤ d(y, xnj ) + d(xnj , x) < 1/nj + 1/(2m) < 1/(2m) + 1/(2m) = 1/m,
so B(xnj , 1/nj ) ⊂ B(x, 1/m) ⊂ Ui , but that is a contradiction since no B(xn , 1/n) lies
inside any member of the open covering. Thus some r exists with the desired property.
Step 3: Every open covering {Ui } of K in X has a finite subcovering containing K.
Using r as in Step 2, by Step 1 there are x1 , . . . , xn ∈ K such that K ⊂ ⋃_{k=1}^{n} B(xk , r). By
the choice of r, each B(xk , r) is in some Ui , so K is contained in a finite union of members
of the open covering. This completes the proof that (1) ⇒ (2).
(2) =⇒ (1): Let {xn } be a sequence in K. We want to show (2) implies there is a
convergent subsequence, or equivalently the sequence {xn } has a limit point in K. (This
includes the possibility that a point occurs infinitely often in the sequence, making it a
limit of a constant subsequence.) Assume there is no limit point: no point in K is the limit
of a subsequence of {xn }. Then for each y ∈ K there must be an ry > 0 such that the
ball B(y, ry ) contains only finitely many terms from {xn }: if every ball centered at y had
infinitely many terms from {xn } in it then we could build a subsequence tending to y by
using radius 1, 1/2, 1/3, . . ..
The balls B(y, ry ) for y ∈ K are an open covering of K, so by (2) there is a finite
subcovering: K is contained in a union of finitely many of these balls. Each of these balls
has only finitely many terms from {xn } in it, so we’d get that the sequence {xn } has only
finitely many terms, which is absurd.
We call condition (2) in Theorem 6.11 the open covering criterion for compactness. It
leads to a second proof that compact subsets of metric spaces are closed and bounded
(Theorem 6.7).
Compact subsets are closed: (This will be similar to the proof that (2) =⇒ (1) above.)
If {xn } is a sequence in K that converges to some x ∈ X then we want to show x ∈ K.
Every subsequence of {xn } also tends to x, so if x is not in K then every element of K
is contained in an open ball that contains only finitely many terms of the sequence {xn }.
(If this were not true then some point in K would be the limit of a subsequence, which is
impossible since all subsequences tend to x.) These open balls are an open covering of K,
so by the open covering criterion for compactness we can extract a finite subcovering, but
that implies the sequence has only finitely many terms, a contradiction.
Compact subsets are bounded: One open covering of K is ⋃_{x∈K} B(x, 1). By the open
covering criterion for compactness there is a finite subcovering, so K ⊂ ⋃_{k=1}^{n} B(xk , 1) for
some finite set of points x1 , . . . , xn in K. Thus K is in a finite union of balls, so it is
bounded.
As an application of the open covering formulation of compactness, we show that all
compact subsets of R, no matter how complicated they may be, share a property with
closed bounded intervals: they contain maximum and minimum elements.
Theorem 6.12. For every nonempty compact subset K of R there are a ∈ K and b ∈ K
such that all x ∈ K satisfy a ≤ x ≤ b.
Proof. Suppose K does not have a maximum element. Then for each x ∈ K there is y > x
in K, so x ∈ (−∞, y). Thus K ⊂ ⋃_{y∈K} (−∞, y). This open covering of K has a finite
7. Connected subsets
A connected subset of a metric space is a subset that is in “one piece.” What does that
mean? It’s easier to say what it means not to be in one piece: the subset can be covered
by two disjoint open sets. For example, two closed discs in the plane that don’t overlap
should not be considered to be one piece. We can surround them by two open discs that
don’t overlap, as shown below.
If ℓ were in B, which is also an open set in R, then some interval around ℓ would lie
entirely in B. However, since ℓ is the least upper bound of S there must be elements of S
in every interval of the form (ℓ − δ, ℓ], and S ⊂ A, so we have a contradiction. Thus ℓ ∉ B.
Since ℓ is in neither A nor B, we have a final contradiction, so (0, 1) is connected.
Step 2: Unbounded open intervals are connected. Consider the case of (0, ∞). Suppose
(0, ∞) ⊂ U ∪ V where U and V are disjoint open subsets of R. The interval contains 1,
and without loss of generality 1 ∈ U . For all m > 1 we have (0, m) ⊂ (0, ∞) ⊂ U ∪ V
and (0, m) ∩ U ≠ ∅, so by connectedness of (0, m) we get (0, m) ⊂ U . Since this holds
for all m > 1, we get (0, ∞) ⊂ U . The same argument works for any unbounded open
interval with one finite endpoint. The only open interval left is R = (−∞, ∞). Write it as
(−∞, 2) ∪ (0, ∞) and use each part separately (both contain 1) to see R is connected.
Step 3: Other intervals are connected. Let I be a non-open interval and I ⊂ U ∪ V for
disjoint open U and V in R. Let J be I without its finite endpoints, so J is an open interval
and thus we know J is connected. Since J ⊂ U ∪ V , either J ⊂ U or J ⊂ V . Without
loss of generality, J ⊂ U . If an endpoint of I were in V then a small open interval around
that endpoint would be in V (since V is open in R), but this is absurd since any open
interval around an endpoint of I contains elements of J, which are all in U . Therefore finite
endpoints of I are in U too, so I ⊂ U and this proves I is connected.
Theorem 7.4. Every nonempty subset of R that is not an interval is not connected.
Proof. Say S ⊂ R is not an interval. Then S contains two points a and b, say with a < b,
and does not contain some point c in between them. Let U = (−∞, c) and V = (c, ∞).
Then U and V are disjoint open subsets of R, S ⊂ U ∪ V , a ∈ S ∩ U , and b ∈ S ∩ V .
Combining the last two theorems, the nonempty connected subsets of R are precisely the
intervals. There is no simple characterization of connected subsets of Rm for m > 1. In
practice nice subsets of Rm for m > 1 are proved to be connected by proving they have a
stronger, more visually intuitive, property called being path-connected.
Definition 7.5. A subset S of a metric space X is called path-connected if, for every pair
of points s and s′ in S, there is a continuous function p : [0, 1] → X such that p(t) ∈ S for
all t, p(0) = s, and p(1) = s′ .
We call such a function p a path from s to s′ . Since q(t) = p(1 − t) is also continuous
with q(0) = p(1) = s′ and q(1) = p(0) = s, we can think of a path going in either direction,
from s to s′ or from s′ to s.
The picture to have of a path-connected space is the inside of the blob below, where any
two points can be linked by a path.
can “see” right away nice solid regions or surfaces in R3 and their analogues in Rm are
connected because a path can be drawn between any two points. For example, the surface
of a sphere and a solid ball in R3 are path-connected and thus are connected.
The converse of Theorem 7.6 in general is false: connectedness does not imply path-
connectedness. An example is the “infinite broom” pictured below: it is the union of the
closed line segments Ln from (0, 0) to (1, 1/n) as n runs over positive integers together with
the (red) point (1, 0). The x-axis strictly between 0 and 1 is not part of the set. This is
connected, but there is no path from (1, 0) to any other point of the set. See [1] for a proof.
[Figure: the infinite broom, with line segments from (0,0) to (1,1), (1,1/2), (1,1/3), (1,1/4), . . . together with the point (1,0).]
There is an important partial converse to Theorem 7.6: for open subsets of Rm , being
connected implies being path-connected. The proof is omitted.
The concept most unlike being connected is being totally disconnected. A subset of a
metric space is called totally disconnected if its only nonempty connected subsets are one-
element subsets (a point is always connected). Examples of totally disconnected subsets of
the metric space R include Z, Q, and fractals like the Cantor set. The p-adic integers and
p-adic numbers, for a prime p, are important totally disconnected metric spaces in number
theory. In Rm for m > 1 all open and closed balls are connected, which is a nice analogy
with the one-dimensional case, but in a totally disconnected metric space no open or closed
balls are connected (aside from closed balls of radius 0, i.e., points).
closed bounded interval then there are m and M such that (i) m ≤ f (x) ≤ M for all x in
[a, b] and (ii) m and M are values of f (x).
The Extreme Value Theorem justifies the definition of the metric d∞ on C[0, 1] back in
Section 2 as a maximum value of a continuous function on [0, 1]. In contrast to the Extreme
Value Theorem, a continuous bounded function on an open interval doesn’t have to have
a maximum value, such as 1/x on (1, 2). Its values are bounded above by 1, but no value
of the function is greater than all other values. The difference between closed bounded
intervals and open intervals is that closed bounded intervals are compact, and we’ll see that
is what makes the Extreme Value Theorem work.
Before we prove theorems about continuous functions we have to define continuous func-
tions. The definition of continuity for a real-valued function on an interval, usually called the
(ε, δ)-definition, goes as follows. For a real number a and a real-valued function f (x) defined
on an interval containing a, we say f (x) is continuous at a and write limx→a f (x) = f (a) if
for every ε > 0 there is a δ = δa,ε > 0 such that
|x − a| < δ =⇒ |f (x) − f (a)| < ε.
If we have a real-valued function f : Rm → R, and a ∈ Rm , then we say f (x) is continuous
at a and write limx→a f (x) = f (a) if for every ε > 0 there is a δ = δa,ε > 0 such that
||x − a|| < δ =⇒ |f (x) − f (a)| < ε.
Notice the different distances being used here, one on Rm (where the function is defined)
and one on R (where the function takes its values).
Definition 8.3. A function f : X → Y between two metric spaces is called continuous at
a ∈ X if for every ε > 0 there is a δ = δa,ε > 0 such that
dX (x, a) < δ =⇒ dY (f (x), f (a)) < ε.
If f is continuous at each point of X then we say f is continuous on X.
As a warm-up, let’s show every metric is a continuous function on its metric space when
we view it as a function of one of its variables, keeping the other one fixed.
Theorem 8.4. For any metric space (X, d) and point c ∈ X, the function fc : X → R that
is “distance to c”, namely fc (x) = d(c, x), is continuous.
Proof. Pick a ∈ X and ε > 0. We need a δ > 0 such that
d(x, a) < δ =⇒ |fc (x) − fc (a)| < ε.
The inequality on the right says |d(c, x) − d(c, a)| < ε.
Using the triangle inequality in two ways,
d(c, a) ≤ d(c, x) + d(x, a) and d(c, x) ≤ d(c, a) + d(a, x),
so
d(c, a) − d(c, x) ≤ d(x, a) and d(c, x) − d(c, a) ≤ d(a, x).
Thus |d(c, x) − d(c, a)| ≤ d(x, a). Therefore
d(x, a) < ε =⇒ |d(c, x) − d(c, a)| ≤ d(x, a) < ε
so we can use δ = ε.
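The key inequality |d(c, x) − d(c, a)| ≤ d(x, a) in this proof can be spot-checked on random points. The sketch below is a sanity check and not a proof; it assumes R3 with the Euclidean metric as a sample metric space, and the function name max_violation is ours.

```python
import math
import random

def d(x, y):
    """Euclidean distance on R^3 (assumption: a sample metric, not a general one)."""
    return math.dist(x, y)

def max_violation(trials=10_000, seed=0):
    """Largest observed value of |d(c,x) - d(c,a)| - d(x,a) over random triples.

    The inequality in the proof says this quantity is never positive,
    up to floating-point rounding.
    """
    rng = random.Random(seed)
    worst = -math.inf
    for _ in range(trials):
        c, x, a = ([rng.uniform(-5, 5) for _ in range(3)] for _ in range(3))
        worst = max(worst, abs(d(c, x) - d(c, a)) - d(x, a))
    return worst

print(max_violation())
```

Since the bound does not involve the point a at all, the same δ = ε works at every point of X.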
Remark 8.5. By a two-way triangle inequality argument like the one used in this proof,
one can show
|d(x, y) − d(xn , yn )| ≤ d(x, xn ) + d(y, yn )
for xn , x, yn , y ∈ X. Therefore if xn → x and yn → y in X then this inequality shows
d(xn , yn ) → d(x, y), an intuitively reasonable property.
Theorem 8.6. For any metric space (X, d), the identity function X → X where x ↦ x is
continuous.
Proof. This is straightforward, using δ = ε.
Theorem 8.7. Addition and multiplication, as functions R2 → R given by A(x, y) = x + y
and M (x, y) = xy, are both continuous.
Proof. First we prove continuity of addition. Pick (a, b) ∈ R2 and ε > 0. We need δ > 0
such that
||(x, y) − (a, b)|| < δ =⇒ |A(x, y) − A(a, b)| < ε.
We will use δ = ε/2.
If ||(x, y) − (a, b)|| < ε/2 then √((x − a)^2 + (y − b)^2) < ε/2, so |x − a| < ε/2 and |y − b| <
ε/2. Then
|A(x, y) − A(a, b)| = |(x + y) − (a + b)| ≤ |x − a| + |y − b| < ε/2 + ε/2 = ε.
To prove multiplication is continuous we estimate |M (x, y) − M (a, b)| = |xy − ab|:
|xy − ab| = |(x − a)y + (y − b)a|
= |(x − a)(y − b) + (x − a)b + (y − b)a|
≤ |x − a||y − b| + |x − a||b| + |y − b||a|.
Thus if |x − a| < δ and |y − b| < δ, then |xy − ab| < δ^2 + δ|b| + δ|a| = δ(δ + |b| + |a|). If
(8.1) δ ≤ 1 and δ ≤ ε/(1 + |a| + |b|)
then δ(δ + |b| + |a|) ≤ δ(1 + |b| + |a|) ≤ ε. So pick δ = min(1, ε/(1 + |a| + |b|)) to make δ satisfy the
two inequalities in (8.1). Then
||(x, y) − (a, b)|| < δ =⇒ |x − a|, |y − b| < δ
and our calculations above imply |xy − ab| < ε.
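The choice of δ for multiplication can be tested by sampling. In the Python sketch below (function names are ours), absolute values are placed on a and b so that the same formula also covers negative coordinates, and points are drawn from the square |x − a|, |y − b| < δ, which contains the disc ||(x, y) − (a, b)|| < δ used in the proof. This is a numerical check under those assumptions, not a replacement for the estimate.

```python
import random

def delta_for_product(a, b, eps):
    """delta = min(1, eps / (1 + |a| + |b|)); |a|, |b| cover negative a, b."""
    return min(1.0, eps / (1.0 + abs(a) + abs(b)))

def check_product(a, b, eps, trials=10_000, seed=0):
    """Sample (x, y) with |x - a| < delta and |y - b| < delta; test |xy - ab| < eps."""
    rng = random.Random(seed)
    delta = delta_for_product(a, b, eps)
    for _ in range(trials):
        # The factor 0.999 keeps the samples strictly inside the square.
        x = a + 0.999 * rng.uniform(-delta, delta)
        y = b + 0.999 * rng.uniform(-delta, delta)
        if abs(x * y - a * b) >= eps:
            return False
    return True

print(delta_for_product(3.0, -4.0, 0.1), check_product(3.0, -4.0, 0.1))
```

Note how delta_for_product shrinks as |a| + |b| grows: this is the dependence of δ on the point (a, b).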
In this proof notice that for continuity of multiplication our choice of δ depends not only
on ε, but also on the point (a, b) where we are checking continuity. This is typical: in
practice the choice for δ may depend on the point at which we are proving continuity. (In
the definition of continuity at a, we wrote δ = δa,ε .) This did not happen for addition,
where δ = ε/2 everywhere. Such “independence of the point” is special; it will lead later to
the concept of uniform continuity.
Theorem 8.8. Any composition of continuous functions on metric spaces is continuous:
if f : X → Y and g : Y → Z are continuous then the composite function g ◦ f : X → Z is
continuous.
Proof. Pick a ∈ X and ε > 0. We have (g ◦ f )(a) = g(f (a)). By the definition of continuity
of g at f (a), there’s an η > 0 (depending on f (a) and ε) such that
(8.2) dY (y, f (a)) < η =⇒ dZ (g(y), g(f (a))) < ε.
By the definition of continuity of f at a, there’s a δ > 0 (depending on η and a) such that
dX (x, a) < δ =⇒ dY (f (x), f (a)) < η,
and by (8.2) that last inequality implies dZ (g(f (x)), g(f (a))) < ε.
Just as compactness has a formulation in terms of open sets rather than sequences (The-
orem 6.11), using open coverings, continuity of a function on a metric space also has a
formulation in terms of open sets rather than ε’s and δ’s. More precisely, continuity of a
function can be expressed in terms of inverse images of open sets. For a function f : X → Y
and a subset S ⊂ Y , the inverse image f −1 (S) means {x ∈ X : f (x) ∈ S}. An inverse image
of a function on a subset makes sense even if the function is not invertible. For example, if
f : R → R by f (x) = x2 then
f −1 ((0, 1)) = (−1, 0) ∪ (0, 1), f −1 ((−2, 2)) = (−√2, √2),
f −1 ((1, 2)) = (−√2, −1) ∪ (1, √2), f −1 ((−1, 0)) = ∅.
Inverse images of subsets under a function behave well for all set-theoretic operations: if
f : X → Y is a function then for subsets S and T in Y ,
f −1 (S ∩ T ) = f −1 (S) ∩ f −1 (T ), f −1 (S ∪ T ) = f −1 (S) ∪ f −1 (T ),
S ⊂ T =⇒ f −1 (S) ⊂ f −1 (T ), f −1 (S − T ) = f −1 (S) − f −1 (T ),
where S − T = {s ∈ S : s ∉ T }. (For example, if S = {0, 1} and T = {1, 2} then
S − T = {0}.) We will use these properties without comment below.8 In particular,
f −1 (Y − S) = X − f −1 (S), so inverse images send complements to complements.
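These identities can be tried out on finite sets. The Python sketch below uses the squaring function on a small finite stand-in for the domain and codomain (the sets X, Y, S, T and the helper names preimage and image are our choices for illustration). It also compares with images, where the analogous identities for intersections and differences can fail.

```python
def preimage(f, domain, S):
    """Inverse image of S under f, restricted to a finite sample domain."""
    return {x for x in domain if f(x) in S}

def image(f, A):
    """Image of the finite set A under f."""
    return {f(x) for x in A}

def square(x):
    return x * x

X = set(range(-3, 4))                    # finite stand-in for the domain
Y = {square(x) for x in X} | {2, 3, 5}   # finite stand-in for the codomain
S, T = {0, 1, 4}, {1, 2, 9}

# Inverse images respect intersections, unions, and differences.
print(preimage(square, X, S & T) == preimage(square, X, S) & preimage(square, X, T))
print(preimage(square, X, S | T) == preimage(square, X, S) | preimage(square, X, T))
print(preimage(square, X, S - T) == preimage(square, X, S) - preimage(square, X, T))

# Images need not: take A = {1, 2} and B = {1, -2}.
A, B = {1, 2}, {1, -2}
print(image(square, A & B), image(square, A) & image(square, B))
```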
Theorem 8.9. A function f : X → Y between two metric spaces is continuous if and only
if the inverse image of every open set in Y is open in X: for all open U in Y , the set
f −1 (U ) = {x ∈ X : f (x) ∈ U } is open in X.
Proof. First suppose f fits the (ε, δ)-definition of continuity on X, so f is continuous at
each element of X. For every open set U ⊂ Y we want to show f −1 (U ) is open in X.
If f −1 (U ) = ∅ then f −1 (U ) is open by our convention that the empty set is open, so
suppose f −1 (U ) ≠ ∅. Pick a ∈ f −1 (U ), so f (a) ∈ U . Since U is open in Y there’s
some ε > 0 such that B(f (a), ε) ⊂ U . By the (ε, δ)-definition of continuity at a, there
is a δ > 0 such that dX (x, a) < δ =⇒ dY (f (x), f (a)) < ε. This implication is saying
f (B(a, δ)) ⊂ B(f (a), ε), so B(a, δ) ⊂ f −1 (B(f (a), ε)) ⊂ f −1 (U ). This shows each a in
f −1 (U ) is contained in an open ball that’s contained in f −1 (U ), so f −1 (U ) is open in X.
Now we prove the converse. Suppose for all U open in Y we have f −1 (U ) open in X. For
every a ∈ X we will prove f is continuous at a. For each ε > 0, the open ball B(f (a), ε) in
Y is open, so using B(f (a), ε) as U the inverse image f −1 (B(f (a), ε)) is open in X and this
inverse image includes a. Thus there’s a δ > 0 such that B(a, δ) ⊂ f −1 (B(f (a), ε)), and
unwinding the notation this containment is saying that if dX (a, x) < δ then dY (f (a), f (x)) <
ε. That is exactly the (ε, δ)-definition of continuity of f at a.
8Analogous formulas for images of subsets are true for unions and containments but false for intersections
and complements: use f (x) = x^2 with A = {1, 2} and B = {1, −2} to see f (A ∩ B) ≠ f (A) ∩ f (B) and
f (A − B) ≠ f (A) − f (B).
It is false that continuous functions always send open sets to open sets. For example, the
squaring function R → R sends (−1, 1) to [0, 1). Continuity aligns with inverse images of
open sets, not images of open sets. C’est la vie.
Remark 8.10. Theorem 8.9 is an open-set formulation of continuity on a whole set, not
at a particular point. There is an open-set formulation of continuity at a point: f : X → Y
is continuous at a if and only if for all open U ⊂ Y containing f (a) there is an open V ⊂ X
containing a such that f (V ) ⊂ U . Checking this matches the (ε, δ)-definition of continuity
at a is left to the reader.
Corollary 8.11. A function f : X → Y between two metric spaces is continuous if and
only if the inverse image of every closed set in Y is closed in X.
Proof. Theorem 5.17 tells us that open and closed subsets are complementary to each other.
For any closed subset C of Y , U = Y − C is open and f −1 (U ) = f −1 (Y − C) = X − f −1 (C).
Therefore f −1 (C) is closed if and only if f −1 (U ) is open, so Theorem 8.9 tells us continuity
of f is equivalent to f −1 sending closed subsets to closed subsets.
Example 8.12. In R3 , the sphere S 2 = {(x, y, z) ∈ R3 : x2 + y 2 + z 2 = 1} is closed. This
can be proved directly using sequences in S 2 or by observing that f (x, y, z) = x2 + y 2 + z 2
is a continuous function R3 → R and S 2 = f −1 (1) with the one-point set {1} in R being
closed.
To illustrate the open-set formulation of continuity of a function, we prove Theorem 7.6:
path-connected sets are connected.
Proof. Let S be a path-connected subset of a metric space X. We will use paths in S to show
that if S is not connected then [0, 1] is not connected, which of course is a contradiction, so
S has to be connected.
Suppose S is not connected, so we have S ⊂ U ∪ V where U and V are nonempty disjoint
open subsets of X. Pick s ∈ S ∩ U and s′ ∈ S ∩ V . There is a path p : [0, 1] → S where
p(0) = s and p(1) = s′ . The partition of S into S ∩ U and S ∩ V leads via this path to a
partition of [0, 1]: [0, 1] = p−1 (U ) ∪ p−1 (V ). Set A = p−1 (U ) and B = p−1 (V ). Both are
open subsets of [0, 1] since p is continuous.9
Note 0 ∈ A and 1 ∈ B, so A and B are nonempty. Obviously A and B are disjoint, since
no point in [0, 1] can have its p-value in both U and V . Thus the equation [0, 1] = A ∪ B
exhibits [0, 1] as a disjoint union of two nonempty open subsets of [0, 1], which contradicts
the connectedness of [0, 1].
Next we reprove Theorem 8.8 using open sets instead of ε’s and δ’s.
Proof. Let U be an arbitrary open set in Z. Then
(g ◦ f )−1 (U ) = {x ∈ X : (g ◦ f )(x) ∈ U }
= {x ∈ X : g(f (x)) ∈ U }
= {x ∈ X : f (x) ∈ g −1 (U )}
= f −1 (g −1 (U )).
By continuity of g, g −1 (U ) is open in Y , and by continuity of f , f −1 (g −1 (U )) is open in X.
Thus g ◦ f is continuous.
9In [0, 1], intervals of the form [0, ε) are open! This is the ball B(0, ε) in the metric space [0, 1], even though
the same interval is not open in R.
Think about why this proof of Theorem 8.8 and the first proof of Theorem 8.8 really are
the same argument even though at first glance they might look different. Mathematicians
consider the second proof, using open sets, to be more elegant.
As a substantial application of continuity being preserved under composition, we will
prove polynomials with real coefficients are continuous. We’ll need one lemma.
Lemma 8.13. Let f : R → R and g : R → R. If f and g are continuous then the function
R → R2 given by x ↦ (f (x), g(x)) is continuous.
Proof. Set F : R → R2 by F (x) = (f (x), g(x)) and let U be an open set in R2 . To prove
F −1 (U ) is open in R, we may assume F −1 (U ) ≠ ∅. Let a ∈ F −1 (U ), so (f (a), g(a)) ∈ U .
Around each point in U there is a small square, not just a small disc, centered at the
point and contained in U . See the picture below.
Thus for some suitably small ε > 0 we have (f (a) − ε, f (a) + ε) × (g(a) − ε, g(a) + ε) ⊂ U .
By continuity of f and g, Vf = f −1 ((f (a) − ε, f (a) + ε)) and Vg = g −1 ((g(a) − ε, g(a) + ε))
are open in R and a ∈ Vf ∩ Vg ⊂ F −1 (U ). Since Vf ∩ Vg is open, it contains an interval
around a. Thus F −1 (U ) is open in R.
Theorem 8.14. Every polynomial function in one variable with real coefficients is a con-
tinuous function R → R.
Proof. We will use Theorem 8.8 to get continuity of all polynomials from that of constant
functions, f (x) = x, and addition and multiplication on R. In particular, no ε’s or δ’s or
any open sets will appear. They were used in previous results that we will invoke.
First we prove by induction that x^n is continuous for each positive integer n. When n = 1
this is the identity function (Theorem 8.6). For n ≥ 2 assume by induction that x^{n−1} is
continuous. We can think of the function x^n as the composite R → R2 → R where the
first function is x ↦ (x, x^{n−1}) and the second function is multiplication (x, y) ↦ xy: their
composite is x ↦ (x, x^{n−1}) ↦ x · x^{n−1} = x^n . The first function is continuous by Lemma 8.13
and the second function is continuous by Theorem 8.7, so their composite is continuous.
Since x^n is continuous, a general monomial cx^n for c ∈ R can be regarded as a composite
function R → R2 → R where the first function is x ↦ (c, x^n) and the second function is
multiplication (x, y) ↦ xy. The first function is continuous by Lemma 8.13 since constant
functions10 and power functions are continuous. The second function is continuous by
Theorem 8.7, so their composite is continuous.
Polynomials are finite sums of monomials, and we will prove they are continuous by
induction on the number of monomials in the polynomial. The base case of monomials was
proved above. A sum of two monomials, ax^m + bx^n , is a composite function R → R2 → R
where the first function is x ↦ (ax^m , bx^n) and the second function is addition (x, y) ↦ x + y.
Both of these functions are continuous by the base case, Lemma 8.13, and Theorem 8.7, so
their composite is continuous. The general inductive step is left to the reader.
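The structure of this proof, in which a polynomial is assembled from the identity, constants, addition, and multiplication using only pairing and composition, can be mirrored in code. The Python sketch below is an illustration of that assembly and says nothing about continuity itself; all function names are ours, and a constant term is handled as a constant function.

```python
def pair(f, g):
    """x -> (f(x), g(x)), the map into R^2 appearing in Lemma 8.13."""
    return lambda x: (f(x), g(x))

def compose(outer, inner):
    """The composite function outer o inner."""
    return lambda x: outer(inner(x))

def add(p):
    """Addition R^2 -> R."""
    return p[0] + p[1]

def mul(p):
    """Multiplication R^2 -> R."""
    return p[0] * p[1]

def identity(x):
    return x

def constant(c):
    return lambda x: c

def power(n):
    """x -> x^n for n >= 1, built as x -> (x, x^(n-1)) -> x * x^(n-1)."""
    f = identity
    for _ in range(n - 1):
        f = compose(mul, pair(identity, f))
    return f

def monomial(c, n):
    """x -> c * x^n, built as x -> (c, x^n) -> c * x^n (a constant when n = 0)."""
    if n == 0:
        return constant(c)
    return compose(mul, pair(constant(c), power(n)))

def polynomial(terms):
    """Sum of monomials; terms is a nonempty list of (coefficient, exponent)."""
    c, n = terms[0]
    f = monomial(c, n)
    for c, n in terms[1:]:
        f = compose(add, pair(f, monomial(c, n)))
    return f

p = polynomial([(2, 3), (-1, 1), (5, 0)])   # 2x^3 - x + 5
print(p(2))
```

Each call to compose corresponds to one use of Theorem 8.8, and each call to pair corresponds to one use of Lemma 8.13.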
Theorem 8.15. Let f : X → Y be continuous. If S ⊂ X is compact then f (S) is compact
in Y .
Proof. We will give two proofs, one using the convergent subsequence description of com-
pactness and the other using the open covering description of compactness.
First proof: Let {yn } be a sequence in f (S), so we can write yn = f (xn ) for some
xn ∈ S. By compactness of S, the sequence {xn } in S has a convergent subsequence, say
xni → x ∈ S. Then by continuity, f (xni ) → f (x), so yni → f (x) ∈ f (S). We proved every
sequence in f (S) has a convergent subsequence, so f (S) is compact.
Second proof: Let {Ui } be an open covering of f (S), so f (S) ⊂ ⋃_{i∈I} Ui in Y . Then
S ⊂ ⋃_{i∈I} f −1 (Ui ), and each f −1 (Ui ) is open in X, so {f −1 (Ui )} is an open covering of S
Corollary 8.18. If K is a compact subset of the metric space X then for each x ∈ X the
distance from x to the elements of K has a minimum: there is some y0 ∈ K such that
d(x, y) ≥ d(x, y0 ) for all y ∈ K.
Proof. Let f : K → R by f (y) = d(x, y). This is continuous by Theorem 8.4 so its image
f (K) is compact in R. By Theorem 6.12 (with K in that theorem being f (K) here) some
value d(x, y0 ) for y0 ∈ K is a minimum value of f .
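The simplest compact sets are finite sets (a sequence in a finite set must repeat some point infinitely often, giving a constant subsequence), and for them the minimum in Corollary 8.18 can be found by direct search. The Python sketch below assumes the Euclidean metric on R2; the function name nearest_point is ours.

```python
import math

def nearest_point(x, K):
    """Return (y0, d(x, y0)) where y0 in the finite set K minimizes d(x, y).

    Assumption: points are tuples in R^m and d is the Euclidean metric.
    """
    y0 = min(K, key=lambda y: math.dist(x, y))
    return y0, math.dist(x, y0)

K = [(0.0, 0.0), (3.0, 4.0), (1.0, 1.0)]
print(nearest_point((2.0, 2.0), K))
```

For an infinite compact K there is no finite search to run, and the corollary is what guarantees that a nearest point still exists.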
We now turn to the final topic about continuous functions that we’ll discuss. It is an
important refinement of continuity.
Definition 8.19. A function f : X → Y between two metric spaces is called uniformly
continuous if for every ε > 0 there is a δ = δε > 0 such that for all a ∈ X,
dX (x, a) < δ =⇒ dY (f (x), f (a)) < ε.
Uniform continuity is a special kind of continuity. The difference is that uniform continu-
ity says the value of δ can be chosen in terms of ε alone, independently of a choice of point
a (so we can choose δ “uniformly in a,” hence the name). Look at the graphs of y = x2
and y = 2x below. In each case we use ε = .75 and indicate in red the largest δ-intervals
around a = 1 and a = 2 on the x-axis that make |x − a| < δ =⇒ |f (x) − f (a)| < .75. For
f (x) = x2 we need a shorter interval around a = 2 to keep the f -values within .75 of f (2)
than we need around a = 1 to keep f -values within .75 of f (1), but for f (x) = 2x intervals
of the same radius around 1 and 2 lead to intervals of the same radius around f (1) and f (2).
Linear functions on R are uniformly continuous while x2 on R is not uniformly continuous.
[Figure: graphs of y = x^2 and y = 2x, with the points 1 and 2 marked on each x-axis.]
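The δ-intervals described for y = x^2 and y = 2x can be computed explicitly. For f (x) = x^2 and a ≥ 0, the constraint that binds first is on the right side of a, where x^2 reaches a^2 + ε at x = √(a^2 + ε), so the largest usable δ is √(a^2 + ε) − a. For f (x) = 2x it is ε/2 at every a. The Python sketch below (function names are ours) evaluates both formulas under the assumption a ≥ 0.

```python
import math

def largest_delta_square(a, eps):
    """Largest delta with |x - a| < delta => |x^2 - a^2| < eps, assuming a >= 0."""
    return math.sqrt(a * a + eps) - a

def largest_delta_double(a, eps):
    """Largest delta with |x - a| < delta => |2x - 2a| < eps; it ignores a."""
    return eps / 2

eps = 0.75
for a in (1, 2):
    print(a, largest_delta_square(a, eps), largest_delta_double(a, eps))
```

The value for x^2 shrinks as a grows and tends to 0 as a → ∞, so no single δ serves all of R, while the value for 2x is the same everywhere.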
(every positive real number has a real square root), but it’s false that ∃ y ∈ R s.t. ∀x ∈ R s.t.
x > 0, y^2 = x (there’s a universal square root of all positive numbers?). Definitions in real
analysis have many nested quantifiers, so it’s understandable why even a mathematician of
the stature of Cauchy, who introduced the (ε, δ)-definition11 of continuity, confused it with
uniform continuity in proofs.
Unlike continuity, uniform continuity can’t be described as a property at individual points
of the domain (why?). Also unlike continuity, there is no way to convert uniform continuity
into a statement about general open subsets.12 Proofs about uniform continuity use (ε, δ)-
language.
The last general comment we have about uniform continuity is that in its definition a
and x play symmetric roles (they are quantified at the same time, unlike in the definition
of continuity), so we can express uniform continuity in a way that puts a and x on an equal
footing, which we therefore will write as x and x′ . A function f : X → Y is called uniformly
continuous if for every ε > 0 there is a δ = δε > 0 such that for all x, x′ ∈ X,
dX (x, x′ ) < δ =⇒ dY (f (x), f (x′ )) < ε.
This is the formulation of uniform continuity that we will use below. (Note: we are not
saying all x and x′ satisfy dX (x, x′ ) < δ, but rather that they satisfy the total implication
written above: if dX (x, x′ ) < δ then dY (f (x), f (x′ )) < ε.)
Example 8.20. We saw before that x2 is not uniformly continuous on R, since for a given
ε the function values stay within ε of each other at larger numbers that are within δ of
each other only if δ is made progressively smaller. We can’t use the same δ as the points
get too big. But if we restrict ourselves to a bounded domain then x2 becomes uniformly
continuous. For instance, on the interval [−b, b] where b > 0,
|x − x′ | < δ =⇒ |x^2 − x′^2 | = |x − x′ ||x + x′ | ≤ |x − x′ |(|x| + |x′ |) ≤ |x − x′ |(2b) < δ(2b),
so to make |x^2 − x′^2 | < ε when |x − x′ | < δ in [−b, b] we can use δ = ε/(2b).
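The choice δ = ε/(2b) can be tested by random sampling in [−b, b]. The Python sketch below is a numerical check only; the function name and the sampling scheme are ours.

```python
import random

def check_square_on_interval(b, eps, trials=10_000, seed=0):
    """Test |x - x'| < delta => |x^2 - x'^2| < eps on [-b, b], delta = eps / (2b)."""
    rng = random.Random(seed)
    delta = eps / (2 * b)
    for _ in range(trials):
        x = rng.uniform(-b, b)
        # Pick x' within 0.999 * delta of x, clipped so it stays inside [-b, b].
        xp = min(b, max(-b, x + 0.999 * rng.uniform(-delta, delta)))
        if abs(x - xp) < delta and abs(x * x - xp * xp) >= eps:
            return False
    return True

print(check_square_on_interval(1.0, 0.1))
```

The formula for δ involves b, so widening the interval forces δ to shrink, in line with the failure of uniform continuity on all of R.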
Example 8.21. For a > 0 the function 1/x on [a, ∞) is uniformly continuous:
|x − x′ | < δ =⇒ |1/x − 1/x′ | = |x′ − x|/(|x||x′ |) < δ/a^2 ,
so we can make |1/x − 1/x′ | < ε if |x − x′ | < δ in [a, ∞) using δ = a^2 ε.
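Here is the same kind of check for 1/x on [a, ∞), done in exact rational arithmetic so that no rounding is involved. The Python sketch below tests pairs of points on a sample grid; the grid spacing 1/7 and the factor 9/10 are arbitrary choices of ours, as are the function names.

```python
from fractions import Fraction

def delta_for_reciprocal(a, eps):
    """The choice delta = a^2 * eps from the example (a > 0, eps > 0)."""
    return a * a * eps

def reciprocal_gap(x, xp):
    """|1/x - 1/x'|, exact when x and x' are Fractions."""
    return abs(1 / x - 1 / xp)

def check_reciprocal(a, eps, steps=200):
    """Test pairs x, x' in [a, infinity) with |x - x'| < delta on a sample grid."""
    delta = delta_for_reciprocal(a, eps)
    for k in range(steps):
        x = a + Fraction(k, 7)            # hypothetical sample grid starting at a
        xp = x + delta * Fraction(9, 10)  # a point within delta of x
        if reciprocal_gap(x, xp) >= eps:
            return False
    return True

print(check_reciprocal(Fraction(1, 2), Fraction(1, 10)))
```

The tightest case is at the left endpoint x = a, where 1/x changes fastest.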
Example 8.22. For a metric space (X, d) and c ∈ X, the function fc (x) = d(c, x) is
uniformly continuous on X. This comes from the proof of Theorem 8.4, where we could use
δ = ε independently of the point x at which we were checking continuity of fc .
Here is the fundamental property that gives rise to uniformly continuous functions.
Theorem 8.23. A continuous function f : X → Y from a compact metric space to any
metric space is uniformly continuous.
This doesn’t say continuous functions can’t be uniformly continuous on non-compact
metric spaces (see Example 8.21), only that they must be uniformly continuous on compact
metric spaces.
11Bolzano had developed similar ideas earlier, but his work was not widely read until after Cauchy.
12A partial translation into open sets is possible, but the family of all balls of a common radius has to be
given a special status, called a “uniformity”. We don’t get into that here.
Proof. We prove the theorem in two ways, using the subsequence description of compactness
and using the open covering description of compactness.
First proof (using subsequences): Our argument will be by contradiction, starting with
the assumption that f is not uniformly continuous. Figuring out what it means not to be
uniformly continuous is going to take up the bulk of the proof!
One way to say f is not uniformly continuous is to say it is not true that for every ε > 0
there is a δ > 0 such that all x and x′ in X satisfy dX (x, x′ ) < δ =⇒ dY (f (x), f (x′ )) < ε.
But this is not really progress; adding “It is not true that” to the start of a sentence in
order to negate it doesn’t usually let us understand the meaning of the negation.
The negation of a statement of the form “for every ε > 0 there is a δ > 0 such that
something happens” is “there is an ε > 0 such that for all δ > 0 the thing does not
happen.” Using quantifier notation, for a proposition P the negation of
∀ε > 0 ∃ δ > 0 s.t. P
is
∃ ε > 0 s.t. ∀δ > 0 ∼ P ,
where ∼ P is the negation of P . Negating turns ∀ into ∃ and vice versa.
For us, P is “all x and x′ in X satisfy d_X(x, x′) < δ ⟹ d_Y(f(x), f(x′)) < ε.” Its negation
is “some x and x′ in X satisfy d_X(x, x′) < δ and d_Y(f(x), f(x′)) ≥ ε”. (Observe “< ε”
turned into “≥ ε” at the end, as a negation.) Putting everything together, the negation of
f being uniformly continuous says:
There is ε > 0 s.t. for all δ > 0, some x and x′ in X satisfy d_X(x, x′) < δ and d_Y(f(x), f(x′)) ≥ ε.
The points x and x′ here may depend on δ (each δ needs its own example of x and x′).
Now that we can express what it means not to be uniformly continuous, assume f is not
uniformly continuous. For the special ε that occurs, use δ = 1/n with n = 1, 2, 3, . . .: for
each n there are x_n and x′_n in X such that d_X(x_n, x′_n) < 1/n and d_Y(f(x_n), f(x′_n)) ≥ ε. We
have built two sequences in X: {x_n} and {x′_n}. By compactness of X, some subsequence
{x_{n_i}} of {x_n} converges in X, say to x. Also x′_{n_i} → x by Theorem 3.8, since
d_X(x_{n_i}, x′_{n_i}) < 1/n_i → 0. By continuity
of f, f(x_{n_i}) → f(x) and f(x′_{n_i}) → f(x), so d_Y(f(x_{n_i}), f(x′_{n_i})) → 0. This contradicts the
inequality d_Y(f(x_n), f(x′_n)) ≥ ε for all n, so we’re done.
Second proof (using open coverings): This proof will not be by contradiction. Pick ε > 0.
For each x ∈ X, continuity of f at x implies there is some δx > 0 such that
(8.3) d_X(x, x′) < δ_x ⟹ d_Y(f(x), f(x′)) < ε/2.
The balls B(x, δ_x/2) are an open covering of X, so by compactness there is a finite subcovering,
say by B(x_1, δ_{x_1}/2), . . . , B(x_n, δ_{x_n}/2). Set δ = min(δ_{x_1}/2, . . . , δ_{x_n}/2) > 0.
Pick x and x′ in X with d_X(x, x′) < δ. We have x ∈ B(x_i, δ_{x_i}/2) for some i = 1, . . . , n.
Then d_X(x, x_i) < δ_{x_i}/2 < δ_{x_i}, so d_Y(f(x), f(x_i)) < ε/2 by (8.3). Also
d_X(x′, x_i) ≤ d_X(x′, x) + d_X(x, x_i) < δ + δ_{x_i}/2 ≤ δ_{x_i}/2 + δ_{x_i}/2 = δ_{x_i},
so d_Y(f(x′), f(x_i)) < ε/2 by (8.3). Thus
d_Y(f(x), f(x′)) ≤ d_Y(f(x), f(x_i)) + d_Y(f(x_i), f(x′)) < ε/2 + ε/2 = ε.
To illustrate the better control that is built into uniform continuity compared to continu-
ity, we will prove uniformly continuous functions can be extended from dense subsets to the
whole space. We need a lemma about a basic property of uniformly continuous functions
that is not generally true of continuous functions.
Lemma 8.24. If f : X → Y is uniformly continuous and {xn } is a Cauchy sequence in X
then {f (xn )} is a Cauchy sequence in Y .
Proof. Pick ε > 0. By uniform continuity there is a δ > 0 such that for all x and x′ in
X satisfying d_X(x, x′) < δ we have d_Y(f(x), f(x′)) < ε. Since {x_n} is Cauchy in X, using
the value of δ there is an N ≥ 1 such that m, n ≥ N ⟹ d_X(x_m, x_n) < δ. Then uniform
continuity implies d_Y(f(x_m), f(x_n)) < ε for m, n ≥ N, so {f(x_n)} is Cauchy in Y.
In contrast to this, a continuous function might not send Cauchy sequences to Cauchy
sequences: 1/x maps (0, ∞) → R and {1/n} is Cauchy in (0, ∞) but {1/(1/n)} = {n} is
not Cauchy in R. Of course continuous functions send convergent sequences to convergent
sequences, by definition!
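As a numerical illustration of this contrast (a sketch only, not part of any proof; the helper name max_tail_gap and the cutoff 200 are our own choices), the following Python snippet uses exact fractions to compare the tails of {1/n} with the consecutive gaps of the image sequence {n} under 1/x.

```python
from fractions import Fraction

def max_tail_gap(seq, start):
    """Largest distance |seq[m] - seq[n]| over list indices m, n >= start."""
    tail = seq[start:]
    return max(tail) - min(tail)

# x_n = 1/n for n = 1, ..., 200, stored exactly as fractions.
xs = [Fraction(1, n) for n in range(1, 201)]
# Image sequence under f(x) = 1/x, which is just n again.
ys = [1 / x for x in xs]

# Tails of {1/n} become arbitrarily tight as the starting index grows.
for start in (10, 50, 100):
    print(start, float(max_tail_gap(xs, start)))

# Consecutive terms of the image sequence stay a fixed distance apart,
# which already rules out the Cauchy property.
consecutive_gaps = {ys[i + 1] - ys[i] for i in range(len(ys) - 1)}
print(consecutive_gaps)
```

A finite computation cannot prove that a sequence is Cauchy, of course; the point is only to see the two behaviors side by side.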
Theorem 8.25. Let X and Y be metric spaces, with Y complete. For any dense subset
D of X, every uniformly continuous function f : D → Y uniquely extends to a continuous
function f̃ : X → Y, defined by f̃(x) = lim_{n→∞} f(x_n) for any sequence {x_n} in D that
converges to x. Moreover, f̃ is uniformly continuous.
Since, by definition, f̃(x) is the limit of the f(x_n) and f̃(x′) is the limit of the f(x′_n), for
large enough n both d_Y(f̃(x), f(x_n)) and d_Y(f(x′_n), f̃(x′)) are less than ε/3. Also for large
enough n, d_X(x_n, x′_n) < δ since x_n → x and x′_n → x′ with d_X(x, x′) < δ, so d_Y(f(x_n), f(x′_n)) < ε/3 for
large enough n by (8.4). Plugging this all into the above inequality, for large enough n
d_Y(f̃(x), f̃(x′)) ≤ d_Y(f̃(x), f(x_n)) + d_Y(f(x_n), f(x′_n)) + d_Y(f(x′_n), f̃(x′)) < ε/3 + ε/3 + ε/3 = ε.
Example 8.26. While 1/x is uniformly continuous as a function [1, ∞) → R (Example
8.21 with a = 1), it is not uniformly continuous as a function (0, 1] → R: if it were uniformly
continuous on (0, 1], then it would extend to a continuous function on [0, 1], which of course
it does not.
The reader should show 1/x is not uniformly continuous on (0, 1] directly from the defi-
nition of uniform continuity, without needing a result like Theorem 8.25. Use the negation
of uniform continuity from the proof of Theorem 8.23.
Example 8.27. The function f(x) = 1/(x² − 2) on Q ∩ [1, 2] is continuous, but it is not
uniformly continuous: if it were uniformly continuous then the function would extend to a
continuous function on [1, 2], which is clearly false due to a problem at x = √2.
Example 8.28. As a positive application of Theorem 8.25, for each a > 1 we use the
theorem to construct a^x (see footnote 13) for irrational x > 0 from the case of rational x ≥ 0 and show a^x
is continuous in x. To use Theorem 8.25 we restrict attention to bounded intervals of x.
We will take for granted as background the existence of positive rational powers. For
each r ∈ Q with r > 0, written as the ratio of positive integers m/n, the number a^r = a^{m/n}
is the unique solution of x^n = a^m in (1, ∞). Its existence and uniqueness follow from the
Intermediate Value Theorem since x^n is continuous, increasing, and unbounded as x → ∞
with value 1 at x = 1.
Claim 1: For integers N ≥ 2, 1 < a^{1/N} < 1 + (a − 1)/N.
Proof: For h > 0, (1 + h)^N > 1 + Nh by the Binomial Theorem (the inequality is strict because N ≥ 2) and 1 + Nh ≥ a if
h ≥ (a − 1)/N. Thus h ≥ (a − 1)/N ⟹ (1 + h)^N > a, so if a^{1/N} = 1 + h then we see
a^{1/N} − 1 = h < (a − 1)/N, so a^{1/N} < 1 + (a − 1)/N.
From Claim 1, a^{1/N} → 1 as N → ∞ in Z.
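The bound in Claim 1 is easy to test numerically. The sketch below is only a floating-point sanity check under our own choice of sample bases and of the helper name claim1_holds; it relies on Python's built-in power operator rather than on the construction of roots described above. For N = 1 the upper bound is an equality, so the check starts at N = 2.

```python
def claim1_holds(a, N):
    """Check 1 < a**(1/N) < 1 + (a - 1)/N in floating point."""
    root = a ** (1.0 / N)
    return 1.0 < root < 1.0 + (a - 1.0) / N

bases = [1.5, 2.0, 10.0]  # sample values of a > 1 (arbitrary choices)
for a in bases:
    for N in (2, 10, 100, 1000):
        root = a ** (1.0 / N)
        upper = 1 + (a - 1) / N
        print(f"a={a}, N={N}: a^(1/N)={root:.6f}, upper bound={upper:.6f}")
```

Both columns visibly approach 1 as N grows, matching the conclusion that a^(1/N) → 1.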
Take for granted that a^r a^s = a^{r+s} for rational r, s ≥ 0. From this it follows that
0 ≤ r < s ⟹ a^r < a^s: s − r > 0 ⟹ a^{s−r} > 1 ⟹ a^s = a^r a^{s−r} > a^r. Then a^r → 1 as
r → 0⁺ through rational numbers because it is true when r = 1/N and a^r is increasing.
13 All we do could be adapted to 0 < a < 1, but we stick to a > 1 to keep the notation simple.
Claim 2: On each bounded interval Q ∩ [0, M] for integers M ≥ 1, the function f(r) = a^r
is uniformly continuous.
Proof: Pick an integer N ≥ 2 and let r, s ∈ Q ∩ [0, M] satisfy |r − s| < 1/N. Suppose
r ≥ s. Then
f(r) − f(s) = a^r − a^s = a^s(a^{r−s} − 1) ⟹ |f(r) − f(s)| ≤ a^M(a^{r−s} − 1).
From r − s < 1/N we get 1 ≤ a^{r−s} < a^{1/N} < 1 + (a − 1)/N by Claim 1, so 0 ≤ a^{r−s} − 1 <
(a − 1)/N. Thus |f(r) − f(s)| ≤ a^M(a − 1)/N. If instead r < s then we get the same bound by
switching the roles of r and s. Thus
r, s ∈ Q ∩ [0, M] and |r − s| < 1/N ⟹ |f(r) − f(s)| ≤ a^M(a − 1)/N.
This implies uniform continuity on Q ∩ [0, M]: for ε > 0 let δ = 1/N where N is chosen
large enough that a^M(a − 1)/N < ε.
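To see the estimate of Claim 2 in action, here is a small numerical check. It is a sketch under assumptions of our own: the sample values a = 2, M = 3, N = 10 and the grid of rationals with denominator 40 are arbitrary, and the powers are computed with floating-point arithmetic.

```python
from fractions import Fraction

a, M, N = 2.0, 3, 10          # sample base, interval [0, M], and step 1/N
bound = a**M * (a - 1) / N    # right side of the estimate in Claim 2

# Rationals in [0, M] with denominator 40 (an arbitrary sample grid).
grid = [Fraction(i, 40) for i in range(0, 40 * M + 1)]

worst = 0.0
for r in grid:
    for s in grid:
        # Only pairs with |r - s| < 1/N are covered by the estimate.
        if abs(r - s) < Fraction(1, N):
            worst = max(worst, abs(a ** float(r) - a ** float(s)))

print(f"largest |a^r - a^s| found: {worst:.4f}, bound: {bound:.4f}")
```

The largest difference found on the grid stays below a^M(a − 1)/N, and the same δ = 1/N works everywhere in [0, M], which is what uniform continuity asks for.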
By Claim 2 and Theorem 8.25 (with Y = R, which is complete, and Q ∩ [0, M] dense in [0, M]), for each integer M ≥ 1 the function f(x) = a^x on
Q ∩ [0, M] uniquely extends to a uniformly continuous function on [0, M], which we will
also denote by a^x. For example, a^{√2} is the limit of a^r for rationals r tending to √2. The
function a^x on [0, M] for larger M has to agree with a^x on [0, M] for smaller M since there’s
at most one continuous extension on each [0, M]. Therefore letting M → ∞ we get a single
continuous function a^x for all x ≥ 0 that agrees with the originally defined a^x when the
exponent x is rational.
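The description of a^x as a limit of rational powers can be watched numerically. The following is a minimal sketch, assuming a = 2 and using decimal truncations of √2 as the rationals r (both our own choices, as is the helper name truncated_sqrt2). The powers are evaluated in floating point with Python's built-in operator, so this illustrates the convergence rather than carrying out the construction.

```python
from fractions import Fraction
from math import isqrt, sqrt

def truncated_sqrt2(k):
    """Rational r_k = floor(sqrt(2) * 10**k) / 10**k, computed exactly."""
    return Fraction(isqrt(2 * 10 ** (2 * k)), 10 ** k)

a = 2.0
target = a ** sqrt(2)   # floating-point reference value, used only for comparison

approximations = []
for k in range(1, 9):
    r = truncated_sqrt2(k)
    value = a ** float(r)     # a^r for the rational exponent r, in floating point
    approximations.append(value)
    print(f"r = {float(r):.8f}   a^r = {value:.10f}")

print(f"a^sqrt(2) (float reference) = {target:.10f}")
```

Since the truncations increase toward √2 and a^r is increasing in r, the printed values increase toward the reference value.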
Exercises.
(1) On R, prove the function d(x, y) = √|x − y| is a metric but the function d(x, y) =
(x − y)² is not.
(2) Check the functions d∞ and d1 on C[0, 1] in Example 2.3 are metrics.
(3) Check Example 2.4: if (X, d) is a metric space and Y is a nonempty subset of X
then Y together with the restriction of d to Y (strictly speaking, to Y × Y ) is a
metric space.
(4) Verify on any set that the discrete metric (Example 2.5) is a metric.
(5) Verify for any metric space (X, d) that the functions d′(x, y) = min(d(x, y), 1) and
d″(x, y) = d(x, y)/(1 + d(x, y)) from Example 2.6 are both metrics.
(6) In the definition of a limit of a sequence, prove the limit is unique: if xn → x and
xn → y then x = y. (This was implicitly assumed when we used limits, but it should
be proved.)
(7) Prove Theorem 4.8.
(8) Let X be a metric space and Y be a subset, treated as a metric space using the
metric induced from X. Show every open set in the metric space Y has the form
U ∩ Y where U is an open set in X. (Hint: First treat the case of open balls in
Y , and it may help to distinguish open balls in X or Y with the same center and
radius, using notation like BX (y, r) and BY (y, r).)
(9) Let (X, d) be a metric space. For a ∈ X and 0 < r1 < r2 , show the annulus
{x ∈ X : r1 < d(a, x) < r2 } is open and {x ∈ X : r1 ≤ d(a, x) ≤ r2 } is closed.
(10) Show every closed subset of a compact metric space is a compact subset by two
methods: using the convergent subsequence description of compactness and using
the open covering description of compactness.
(11) Show every compact subset of a metric space is complete using the induced metric.
(12) Fill in the details from Remark 8.10: prove a function f : X → Y between two
metric spaces is continuous at a point a ∈ X (based on the (ε, δ)-definition) if and
only if for every open U ⊂ Y containing f (a) there is an open V ⊂ X containing a
such that f (V ) ⊂ U .
(13) Show each line in R² is a closed subset in two ways: by the definition of a closed
subset and by Corollary 8.11. (Hint for the second way: show each line has the form
f^{−1}(c) for some continuous function f : R² → R and c ∈ R.)
(14) For every interval I in R, construct a continuous function f : (0, 1) → R with
image I. Therefore the connectedness of (0, 1) implies the connectedness of all
other intervals by Theorem 8.16. (Hint: Start out with I being an interval (open,
closed, half-open) having endpoints 0 and 1.)
(15) Prove the inequality in Remark 8.5.
(16) Let f : X → Y and g : X → Y be two continuous functions between the same metric
spaces. Show {x ∈ X : f (x) = g(x)} is a closed subset of X.
(17) Let K_1 and K_2 be disjoint compact subsets of a metric space. Prove there is c > 0
such that d(x_1, x_2) ≥ c for all x_1 ∈ K_1 and x_2 ∈ K_2. Give an example of two disjoint
non-compact subsets of R² for which the conclusion is false (Hint:
horizontal asymptote).
(18) Viewing R× as a metric space using the absolute value metric from R, show inver-
sion R× → R given by f (x) = 1/x is a continuous function but is not uniformly
continuous. Then show all rational functions with real coefficients are continuous
on R away from the numbers where the denominator is 0.
(19) For functions f : R → R and g : R → R, define F : R → R² by F(x) = (f(x), g(x)).
Lemma 8.13 says that if f and g are continuous then F is continuous. Prove the
converse: if F is continuous then f and g are continuous.
(20) Show the function 1/x on (0, 1] is not uniformly continuous directly from the defi-
nition of uniform continuity.
(21) Is the function f(x) = √x on [0, ∞) uniformly continuous?
(22) Convergence of a function f : [a, ∞) → R at ∞ is defined analogously to the case
of sequences: say f (x) has limit L as x → ∞ if for every ε > 0 there is an N such
that x ≥ N ⇒ |f (x) − L| < ε.
For a continuous function f : [a, ∞) → R, prove that if f (x) has a limit as x → ∞
then f is uniformly continuous on [a, ∞). (Hint: on each interval [a, b] the function
is uniformly continuous, so focus on large x.) Is the converse true?
Proof. Showing that every Cauchy sequence {f_n} in this space has a limit in this space falls into
three steps:
Step 1: Create a candidate limit function f .
For each a ∈ [0, 1] the sequence of numbers {f_n(a)} is Cauchy since
|f_m(a) − f_n(a)| ≤ d_∞(f_m, f_n)
and the value on the right is arbitrarily small for all large enough m and n. Therefore
lim_{n→∞} f_n(a) exists by completeness of R. Call the limit f(a). We have defined a
function f : [0, 1] → R using pointwise considerations.
Step 2: Show f is continuous.
This will be proved with an ε/3 argument.
Pick a ∈ [0, 1] and ε > 0. We need to find δ > 0 such that |x − a| < δ ⟹
|f(x) − f(a)| < ε.
Since {f_n} is Cauchy, there is N such that m, n ≥ N ⟹ d_∞(f_m, f_n) < ε/3, so
for all x ∈ [0, 1] and m ≥ N we have |f_m(x) − f_N(x)| < ε/3. Letting m → ∞ in this
inequality, |f(x) − f_N(x)| ≤ ε/3 for all x in [0, 1]. Thus for each x in [0, 1],
|f(x) − f(a)| ≤ |f(x) − f_N(x)| + |f_N(x) − f_N(a)| + |f_N(a) − f(a)|
≤ ε/3 + |f_N(x) − f_N(a)| + ε/3
= (2/3)ε + |f_N(x) − f_N(a)|.
Since f_N is continuous at a, there is δ > 0 such that |x − a| < δ ⟹ |f_N(x) − f_N(a)| <
ε/3. Therefore
|x − a| < δ ⟹ |f(x) − f(a)| ≤ (2/3)ε + |f_N(x) − f_N(a)| < ε,
which proves f is continuous at each a ∈ [0, 1].
Step 3: Show d_∞(f_n, f) → 0.
We essentially repeat the beginning of Step 2. Pick ε > 0. From {f_n} being Cauchy there is
an N such that m, n ≥ N ⟹ d_∞(f_m, f_n) < ε/3, so for all x ∈ [0, 1] and m, n ≥ N
we have |f_m(x) − f_n(x)| < ε/3. Letting m → ∞ here, we get |f(x) − f_n(x)| ≤ ε/3
for all x ∈ [0, 1] and n ≥ N. Therefore d_∞(f, f_n) ≤ ε/3 < ε for all n ≥ N.
Next we will prove Theorem 4.22, which we restate.
Theorem A.2. Every metric space has a completion.
Proof. Let the metric space be (X, d). We seek a complete metric space (X̂, d̂) containing
(X, d) as a dense subset. The purpose of the proof is to show some construction is possible,
but in practice nobody thinks about a completion by the way it will be constructed here.
In particular, analysts want to work with concrete complete metric spaces, not abstract
completions. More comments about creating completions will be given after the proof.
Before building X̂, here are properties it must have if it exists.
(1) Each x̂ ∈ X̂ is a limit of a sequence from X since X is dense in X̂: x̂ = lim_{n→∞} x_n
with x_n ∈ X, so {x_n} is a Cauchy sequence in X̂ (Theorem 4.3) and thus is also a
Cauchy sequence in X since d̂ = d on X.
(2) A second Cauchy sequence {x′_n} in X converges to the same limit x̂ if and only if
d̂(x_n, x′_n) → 0 (Theorems 3.7 and 3.8), which is the same as d(x_n, x′_n) → 0 since
d̂ = d on X.
(3) The metric on X̂ can be expressed as a limit of values of the metric on X: if x̂ and
ŷ are in X̂ and x_n → x̂ and y_n → ŷ with x_n, y_n ∈ X, then d(x_n, y_n) → d̂(x̂, ŷ) by
Remark 8.5 (because d(x_n, y_n) = d̂(x_n, y_n)).
Since each element of X̂ is a limit of a Cauchy sequence in X, and two Cauchy sequences
in X tend to the same value in X̂ exactly when their termwise distance tends to 0, this
suggests an idea for how to define X̂: let it be the set of all Cauchy sequences in X, with
sequences identified when the termwise distance goes to 0. This will turn out to work, and is
a generalization of one of the constructions of the real numbers out of the rational numbers:
a real number is an equivalence class of Cauchy sequences of rational numbers relative to
the absolute value on Q.
Let C be the set of all Cauchy sequences in X. Introduce a relation ∼ on C by
{xn } ∼ {yn } if d(xn , yn ) → 0.
It is left to the reader to check ∼ is an equivalence relation on C. Denote the equivalence
class of {xn } in C as [xn ], so
[xn ] = {{yn } ∈ C : d(xn , yn ) → 0}.
We embed X into C by viewing each x ∈ X as the constant Cauchy sequence {x, x, x, . . .}.
For different x and y in X, the constant sequences (x, x, x, . . .) and (y, y, y, . . .) are not
equivalent since d(x, y) > 0.
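For a concrete feel for the relation ∼, take X = Q with the absolute value metric. The sketch below (an illustration under our own choices: the two sequences and the helper names newton and truncation are not from the text) builds two different Cauchy sequences of rationals, both approximating √2, and prints their termwise distances using exact rational arithmetic.

```python
from fractions import Fraction
from math import isqrt

def newton(n):
    """n-th Newton iterate for x^2 = 2 starting from 2, as an exact rational."""
    x = Fraction(2)
    for _ in range(n):
        x = (x + 2 / x) / 2
    return x

def truncation(n):
    """Decimal truncation of sqrt(2) to n places, as an exact rational."""
    return Fraction(isqrt(2 * 10 ** (2 * n)), 10 ** n)

# Termwise distances d(x_n, y_n) = |x_n - y_n| in Q for n = 1, ..., 6.
distances = [abs(newton(n) - truncation(n)) for n in range(1, 7)]
for n, dist in enumerate(distances, start=1):
    print(n, float(dist))
```

The distances shrink toward 0, so the two sequences are equivalent under ∼ even though neither one has a limit in Q.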
Define X̂ to be the set of equivalence classes in C for the relation ∼. The motivation for
this definition of X̂ is to have a set that has the first and second properties that we already
observed a completion of (X, d) should have. To put a metric on X̂ we will get motivation
from the third property listed above. It suggests the definition
d̂([x_n], [y_n]) = lim_{n→∞} d(x_n, y_n).
The terms on the right side both tend to 0 as n → ∞ by the definition of the relation ∼, so
d(x_n, y_n) − d(x′_n, y′_n) → 0 in R. Thus the convergent sequences {d(x_n, y_n)} and {d(x′_n, y′_n)}
in R have the same limit, and this finishes the proof that d̂ makes sense on X̂.
It is left to the reader to show d̂ is a metric on X̂. We raise one issue: the fact that
d̂([x_n], [y_n]) = 0 ⟹ [x_n] = [y_n]
Anyone who is pedantic can replace ε with ε/2 to make the limit strictly less than ε.
By the same reasoning, for all m ≥ N we have d̂(x_m, [x_n]) ≤ ε, so [(x_1, x_2, x_3, . . .)] is the
limit of {xm } as m → ∞. This shows that, in a loose sense, each Cauchy sequence in X has
been turned into the limit of its own terms (but we have to work with equivalence classes
of Cauchy sequences to make everything proper).
(3) (X̂, d̂) is complete.
Let x̂_1, x̂_2, x̂_3, . . . be a Cauchy sequence of elements of X̂. (This is a “Cauchy sequence
of equivalence classes of Cauchy sequences from X,” but don’t think too hard about it that
way.) By (2), for each n ≥ 1 we can pick y_n ∈ X such that d̂(x̂_n, y_n) → 0. (For example, we
can make the distance at most 1/n.) Then y_1, y_2, y_3, . . . is Cauchy in X̂ by Theorem 4.9,
so y_1, y_2, y_3, . . . is Cauchy in X since d̂(y_m, y_n) = d(y_m, y_n). Define ŷ = [y_n] in X̂. We will
show x̂_n → ŷ.
Pick ε > 0. For all m and n,
d̂(x̂_n, y_m) ≤ d̂(x̂_n, y_n) + d̂(y_n, y_m) = d̂(x̂_n, y_n) + d(y_n, y_m).
For all sufficiently large m and n (depending on ε) we can make d̂(x̂_n, y_n) < ε/2 by the
definition of y_n and d(y_n, y_m) < ε/2 by the Cauchy property. Therefore d̂(x̂_n, y_m) < ε for
all sufficiently large m and n. Since y_m → ŷ by (2), by continuity of metrics (Theorem 8.4)
lim_{m→∞} d̂(x̂_n, y_m) = d̂(x̂_n, ŷ).
For sufficiently large n, this limit is at most ε. That proves x̂_n → ŷ in X̂.
The method we just gave for constructing a completion is not the only possible construction.
Let’s describe a second one. For every metric space X, define C_b(X) to be the set of all
continuous bounded functions X → R. This is a metric space where the distance between
two functions f and g is defined as sup_{x∈X} |f(x) − g(x)|. (We use a supremum here since a
continuous bounded function might not have a maximum; X could be non-compact.)
The space C_b(X) with this metric is complete. A proof can be made by adapting the
proof of Theorem 4.16, which is the special case X = [0, 1]. We can embed X into C_b(X) by
fixing a choice of a ∈ X and, in terms of this choice, associating to each y ∈ X the function
f_y : X → R given by f_y(x) = d(y, x) − d(a, x). The function f_y is continuous by Theorem
8.4 and it is bounded since |f_y(x)| ≤ d(a, y) by the proof of Theorem 8.4 (see footnote 14). It turns out
that sup_{x∈X} |f_y(x) − f_z(x)| = d(y, z), so y ↦ f_y is an embedding of X into C_b(X) and the
metric on C_b(X) restricts to the metric d on X under this embedding. Closures of subsets
are closed (Theorem 5.15) and closed subsets of complete spaces are complete (Theorem
5.11), so the closure of the embedded copy of X in C_b(X) is a completion of X.
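The two facts about f_y used here, the bound |f_y(x)| ≤ d(a, y) and the identity sup |f_y(x) − f_z(x)| = d(y, z), can be checked by brute force on a finite metric space. The sketch below assumes a sample space of our own choosing (a 5 × 5 block of integer points in the plane with the taxicab metric, base point a = (0, 0)) so that all arithmetic is exact; the function names are hypothetical.

```python
from itertools import product

def d(p, q):
    """Taxicab metric on integer points of the plane (exact integer values)."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

# A small finite metric space X and a fixed base point a (arbitrary choices).
X = list(product(range(-2, 3), repeat=2))
a = (0, 0)

def f(y):
    """The function f_y(x) = d(y, x) - d(a, x), stored as a dict over X."""
    return {x: d(y, x) - d(a, x) for x in X}

def sup_distance(g, h):
    """sup over X of |g(x) - h(x)|; a maximum here because X is finite."""
    return max(abs(g[x] - h[x]) for x in X)

# The map y -> f_y preserves distances, and each f_y is bounded by d(a, y).
is_isometry = all(sup_distance(f(y), f(z)) == d(y, z) for y in X for z in X)
is_bounded = all(abs(f(y)[x]) <= d(a, y) for y in X for x in X)
print(is_isometry, is_bounded)
```

In the isometry check the supremum is reached at x = z, where f_y(z) − f_z(z) = d(y, z); the triangle inequality shows no other x does better.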
Our first construction of a completion, using equivalence classes of Cauchy sequences, is a
much more robust method than the one using C_b(X). The reason is that often when X has
some additional structure that interacts well with the metric (e.g., being a normed vector
space or a field), the method of completing using equivalence classes of Cauchy sequences
shows the completion has the same additional structure automatically, whereas this is not
clear from the embedding of X into C_b(X).
Sometimes no abstract construction of a completion shows a particular feature of X
persists in its completion, so a more specific technique is needed to reveal that feature
in the completion. For example, the metric space of functions (C[0, 1], d1 ) is incomplete
(Example 4.17) and analysts want to think of elements in the completion as functions on
[0, 1]. Part of a course on measure theory is devoted to explaining why the completion
consists of certain real-valued functions on [0, 1].
14 It is more natural to convert points of X into continuous functions on X by associating to each y ∈ X
the function x ↦ d(y, x), but this function is not bounded if the metric on X takes arbitrarily large values.
Subtracting d(a, x) creates a bounded function.