MA203-notes - All lecture notes for MA203 Real Analysis

Real Analysis (The London School of Economics and Political Science)


Department of Mathematics, London School of Economics

Real Analysis
(MA203)

Amol Sasane

2014/15


Contents

Preface

Chapter 1. R, metric spaces and R^n
§1.1. Distance in R
§1.2. Metric space
§1.3. Neighbourhoods and open sets
§1.4. Notes (not part of the course)

Chapter 2. Sequences
§2.1. Sequences in R
§2.2. Sequences in metric spaces
§2.3. Pointwise versus uniform convergence
§2.4. Convergent sequences and closed sets
§2.5. Compact sets
§2.6. Notes (not part of the course)

Chapter 3. Series
§3.1. Series in R
§3.2. Series in normed spaces
§3.3. Notes (not part of the course)

Chapter 4. Continuous functions
§4.1. Continuity of functions from R to R
§4.2. Continuity of maps between metric spaces
§4.3. Continuous maps and open sets
§4.4. Compactness and continuity
§4.5. Uniform continuity

Chapter 5. Differentiation
§5.1. Mean Value Theorem
§5.2. Uniform convergence and differentiation
§5.3. Derivative of maps from R^n to R^m
§5.4. Partial derivatives
§5.5. Notes (not part of the course)

Chapter 6. Epilogue: integration
§6.1. Motivation and definition
§6.2. Fundamental theorem of integral calculus

Bibliography

Index

Solutions
Solutions to the exercises from Chapter 1
Solutions to the exercises from Chapter 2
Solutions to the exercises from Chapter 3
Solutions to the exercises from Chapter 4
Solutions to the exercises from Chapter 5


Preface

What is Real Analysis?

First of all, “Analysis” refers to the branch of Mathematics which is, roughly speaking, an
abstraction of the familiar subject of Calculus. Calculus arose as a box of tools enabling one to
handle diverse problems in the applied sciences such as physics and engineering where quantities
change (for example with time), and calculations based on “rates of change” were needed. It soon
became evident that the foundations of Calculus needed to be made mathematically precise. This
is roughly the subject of Mathematical Analysis, where Calculus is made rigorous. A
byproduct of this rigorization process is that mathematicians discovered that many of the things
done in the set-up of usual calculus can be done in a much more general set-up, enabling one to
expand the domain of applications. We will study such things in this course.
Secondly, why do we use the adjective “Real”? We will start with the basic setting of making
Calculus with real numbers rigorous, but we will also develop Calculus in more abstract settings,
for example in $\mathbb{R}^n$. Using the adjective “Real” also highlights that the subject is different from
“Complex Analysis”, which is all about doing analysis in $\mathbb{C}$. (It turns out that Complex Analysis
is a very specialized branch of analysis which acquires a somewhat peculiar character owing to the
special geometric meaning associated with the multiplication of complex numbers in the complex
plane.)

Amol Sasane

Thanks to Eleni Katirtzoglou and Tony Whelan for their help in revising these lecture notes.

Konrad Swanepoel


Chapter 1

R, metric spaces and R^n

We are familiar with concepts from calculus such as


(1) convergence of sequences of real numbers,
(2) continuity of a function f : R → R,
(3) differentiability of a function f : R → R.
Once these notions are available, one can prove useful results involving such notions. For example,
we have seen the following:

Theorem 1.1. If f : [a, b] → R is continuous, then f has a minimizer on [a, b].

Theorem 1.2. If $f : \mathbb{R} \to \mathbb{R}$ is such that $f''(x) \ge 0$ for all $x \in \mathbb{R}$ and $f'(x_0) = 0$, then $x_0$ is a
minimizer of $f$.

We will revisit these concepts in this course, and see that the same concepts can be defined
in a much more general context, enabling one to prove results similar to the above in the more
general setting. This means that we will be able to solve problems that arise in applications
(such as optimization and differential equations) that we could not solve earlier with
our limited tools. Besides these immediate applications, concepts and results from real analysis
are fundamental in mathematics itself, and are needed in order to study almost any topic in
mathematics.
In this chapter, we wish to emphasize that the key idea behind defining the above concepts is
that of a distance between points. When one works with real numbers, this distance
is provided by the absolute value of the difference between the two numbers: thus the distance
between $x, y \in \mathbb{R}$ is taken as $|x - y|$. This coincides with our geometric understanding of distance
when the real numbers are represented on the “number line”. For instance, the distance between
$-1$ and $3$ is $4$, and indeed $4 = |-1 - 3|$.

Figure 1. Distance between real numbers: the points $x$ and $y$ on the number line are a distance $|x - y|$ apart.

Recall, for example, that a sequence $(a_n)_{n \in \mathbb{N}}$ is said to converge with limit $L \in \mathbb{R}$ if for every
$\epsilon > 0$, there exists an $N \in \mathbb{N}$ such that whenever $n > N$, $|a_n - L| < \epsilon$. In other words, the sequence
converges to $L$ if no matter what distance $\epsilon > 0$ is given, one can guarantee that all the terms of
the sequence beyond a certain index $N$ are at a distance of less than $\epsilon$ away from $L$ (this is the
inequality $|a_n - L| < \epsilon$). So we notice that in this notion of “convergence of a sequence” the
notion of distance indeed plays a crucial role. After all, we want to say that the terms of the sequence
get “close” to the limit, and to measure closeness, we use the distance between points of $\mathbb{R}$.
A similar thing happens with all the other notions listed at the outset. For example, recall
that a function $f : \mathbb{R} \to \mathbb{R}$ is said to be continuous at $c \in \mathbb{R}$ if for every $\epsilon > 0$, there exists a $\delta > 0$
such that whenever $|x - c| < \delta$, $|f(x) - f(c)| < \epsilon$. Roughly, given any distance $\epsilon$, I can find a
distance $\delta$ such that whenever I choose an $x$ at a distance less than $\delta$ from $c$, I am guaranteed
that $f(x)$ is at a distance less than $\epsilon$ from $f(c)$. Again notice the key role played by the
distance in this definition.

1.1. Distance in R
The distance between points $x, y \in \mathbb{R}$ is taken as $|x - y|$. Thus we have a map that associates to a
pair $(x, y) \in \mathbb{R} \times \mathbb{R}$ of real numbers the number $|x - y| \in \mathbb{R}$, which is the distance between $x$ and
$y$. We think of this map $(x, y) \mapsto |x - y| : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ as the “distance function” in $\mathbb{R}$.
Define $d : \mathbb{R} \times \mathbb{R} \to \mathbb{R}$ by $d(x, y) = |x - y|$ ($x, y \in \mathbb{R}$). Then it can be seen that this distance
function $d$ satisfies the following properties:
(D1) For all $x, y \in \mathbb{R}$, $d(x, y) \ge 0$. If $x \in \mathbb{R}$, then $d(x, x) = 0$. If $x, y \in \mathbb{R}$ are such that
$d(x, y) = 0$, then $x = y$.
(D2) For all $x, y \in \mathbb{R}$, $d(x, y) = d(y, x)$.
(D3) For all $x, y, z \in \mathbb{R}$, $d(x, y) + d(y, z) \ge d(x, z)$.
It turns out that these are the key properties of the distance which are needed in developing
analysis in $\mathbb{R}$. So it makes sense that when we want to generalize the situation with the set $\mathbb{R}$
being replaced by an arbitrary set $X$, we must define a distance function
$$d : X \times X \to \mathbb{R}$$
that associates a number (the distance!) to each pair of points $x, y \in X$, and which has the same
properties (D1)-(D3) (with the obvious changes: $x, y, z \in X$). We do this in the next section.

1.2. Metric space


Definition 1.3. A metric space is a set X together with a function d : X × X → R satisfying the
following properties:
(D1) (Positive definiteness) For all x, y ∈ X, d(x, y) ≥ 0. For all x ∈ X, d(x, x) = 0. If
x, y ∈ X are such that d(x, y) = 0, then x = y.
(D2) (Symmetry) For all x, y ∈ X, d(x, y) = d(y, x).
(D3) (Triangle inequality) For all x, y, z ∈ X, d(x, y) + d(y, z) ≥ d(x, z).

Such a d is referred to as a distance function or metric.


Let us consider some examples.

Example 1.4. X := R, with d(x, y) := |x − y| (x, y ∈ R), is a metric space. ♦

Example 1.5. For any nonempty set $X$, define
$$d(x, y) = \begin{cases} 1 & \text{if } x \neq y, \\ 0 & \text{if } x = y. \end{cases}$$
This $d$ is called the discrete metric. Then $d$ satisfies (D1)-(D3), and so $X$ with the discrete metric
is a metric space. ♦
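
The axioms (D1)-(D3) can be spot-checked numerically on a finite sample of points. The following is a minimal sketch, not a proof: the function names (`usual_metric`, `discrete_metric`, `satisfies_metric_axioms`) and the sample points are assumptions made for illustration, and passing the check on a sample says nothing about points outside it.

```python
from itertools import product

def usual_metric(x, y):
    """d(x, y) = |x - y| on the real line (Example 1.4)."""
    return abs(x - y)

def discrete_metric(x, y):
    """Discrete metric (Example 1.5): 1 if the points differ, 0 otherwise."""
    return 0 if x == y else 1

def satisfies_metric_axioms(d, points):
    """Check (D1)-(D3) on every pair and triple drawn from a finite sample."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:                      # (D1) nonnegativity
            return False
        if (d(x, y) == 0) != (x == y):       # (D1) d(x, y) = 0 exactly when x = y
            return False
        if d(x, y) != d(y, x):               # (D2) symmetry
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, y) + d(y, z) < d(x, z):      # (D3) triangle inequality
            return False
    return True

sample = [-3, -1, 0, 2, 5]   # integer sample, so all arithmetic is exact
print(satisfies_metric_axioms(usual_metric, sample))
print(satisfies_metric_axioms(discrete_metric, sample))
# The squared difference is a non-example: it violates (D3) on this sample.
print(satisfies_metric_axioms(lambda x, y: (x - y) ** 2, sample))
```

The last call illustrates why (D3) is a genuine restriction: for $x = -3$, $y = -1$, $z = 0$ the squared differences are $4$, $1$ and $9$, and $4 + 1 < 9$.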

Note that in particular $\mathbb{R}$ with the discrete metric is a metric space as well. So the above two
examples show that the distance function on a given set is not unique, and which metric is to
be used depends on the application one has in mind. Hence whenever we speak of a metric space,
we always need to specify not just the set $X$ but also the distance function $d$ being considered. So
often we say “consider the metric space $(X, d)$”, where $X$ is the set in question, and $d : X \times X \to \mathbb{R}$
is the metric considered.
However, for some sets, there are natural candidates for distance functions. One such
example is the following.

Example 1.6 (Euclidean space $\mathbb{R}^n$). In $\mathbb{R}^2$ and $\mathbb{R}^3$, where we can think of vectors as points in the
plane or points in space, we can take the distance between two points to be the length of
the line segment joining these points. Thus (by Pythagoras’s Theorem) in $\mathbb{R}^2$, we may use
$$d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2}$$
as the distance between the points $x = (x_1, x_2)$ and $y = (y_1, y_2)$ in $\mathbb{R}^2$. Similarly, in $\mathbb{R}^3$, one may
use
$$d(x, y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + (x_3 - y_3)^2}$$
as the distance between the points $x = (x_1, x_2, x_3)$ and $y = (y_1, y_2, y_3)$ in $\mathbb{R}^3$. See Figure 2.

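Figure 2. Distance in $\mathbb{R}^2$ and $\mathbb{R}^3$: $d(x, y)$ is the length of the segment joining $x$ and $y$, with coordinate differences $|x_1 - y_1|$, $|x_2 - y_2|$ (and $|x_3 - y_3|$ in $\mathbb{R}^3$).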

In an analogous manner to $\mathbb{R}^2$ and $\mathbb{R}^3$, more generally, for $x, y \in \mathbb{R}^n =: X$, we define the
Euclidean distance by
$$d(x, y) = \sqrt{\sum_{k=1}^{n} (x_k - y_k)^2} = \sqrt{(x_1 - y_1)^2 + \cdots + (x_n - y_n)^2}$$
for
$$x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} \in \mathbb{R}^n, \quad y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix} \in \mathbb{R}^n.$$
Then $\mathbb{R}^n$ is a metric space with the Euclidean distance, and is referred to as the Euclidean space.
The verification of (D3) can be done by using the Cauchy-Schwarz inequality: for real numbers
$x_1, \dots, x_n$ and $y_1, \dots, y_n$, there holds that
$$\left( \sum_{k=1}^{n} x_k^2 \right) \left( \sum_{k=1}^{n} y_k^2 \right) \ge \left( \sum_{k=1}^{n} x_k y_k \right)^2.$$

This last property (D3) is sometimes referred to as the triangle inequality. The reason behind this
is that, for triangles in Euclidean geometry of the plane, we know that the sum of the lengths of
two sides of a triangle is at least as much as the length of the third side. If we now imagine the
points $x, y, z \in \mathbb{R}^2$ as the three vertices of a triangle, then this is what (D3) says; see Figure 3.

Figure 3. How the triangle inequality gets its name: a triangle with vertices $x$, $y$ and $z$.

Throughout this course, whenever we refer to $\mathbb{R}^n$ as a metric space, unless specified otherwise,
we mean that it is equipped with this Euclidean metric. Note that Example 1.4 corresponds to
the case when $n = 1$. ♦
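
As a numerical companion to Example 1.6, the sketch below computes the Euclidean distance and spot-checks the Cauchy-Schwarz inequality and (D3) on random points. The helper names (`euclidean_distance`, `cauchy_schwarz_gap`), the dimension 4 and the random sample are assumptions chosen for illustration; a finite check of this kind is evidence, not a proof.

```python
import math
import random

def euclidean_distance(x, y):
    """Euclidean distance between two points of R^n given as equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("points must have the same number of components")
    return math.sqrt(sum((xk - yk) ** 2 for xk, yk in zip(x, y)))

def cauchy_schwarz_gap(x, y):
    """(sum x_k^2)(sum y_k^2) - (sum x_k y_k)^2, which should never be negative."""
    sum_xx = sum(xk * xk for xk in x)
    sum_yy = sum(yk * yk for yk in y)
    sum_xy = sum(xk * yk for xk, yk in zip(x, y))
    return sum_xx * sum_yy - sum_xy ** 2

print(euclidean_distance((0, 0), (3, 4)))        # the 3-4-5 right triangle
print(euclidean_distance((1, 2, 3), (1, 2, 3)))  # a point is at distance 0 from itself

# Spot-check Cauchy-Schwarz and (D3) on random integer points of R^4.
rng = random.Random(0)
for _ in range(1000):
    x, y, z = ([rng.randint(-9, 9) for _ in range(4)] for _ in range(3))
    assert cauchy_schwarz_gap(x, y) >= 0          # exact integer arithmetic
    # small tolerance because square roots are computed in floating point
    assert euclidean_distance(x, y) + euclidean_distance(y, z) >= euclidean_distance(x, z) - 1e-12
```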
Exercise 1.7. Verify that the $d$ given in Example 1.5 does satisfy (D1)-(D3).
Exercise 1.8. One can show the Cauchy-Schwarz inequality as follows: Let $x, y$ be vectors in $\mathbb{R}^n$ with
the components $x_1, \dots, x_n$ and $y_1, \dots, y_n$, respectively. For $t \in \mathbb{R}$, consider the function
$$f(t) = (x + ty)^\top (x + ty) = x^\top x + 2t\, x^\top y + t^2\, y^\top y.$$
From the second expression, we see that $f$ is a quadratic function of the variable $t$. It is clear from the
first expression that $f(t)$, being the sum of squares
$$\sum_{k=1}^{n} (x_k + t y_k)^2,$$
is nonnegative for all $t \in \mathbb{R}$. This means that the discriminant of $f$ must be $\le 0$, since otherwise, $f$
would have two distinct real roots, and would then have negative values between these roots! Calculate
the discriminant of the quadratic function and show that its nonpositivity yields the Cauchy-Schwarz
inequality.

Normed space. Frequently in applications, one needs a metric not just in any old set $X$, but in
a vector space $X$.
Recall that a (real) vector space $X$ is just a set $X$ with the two operations of vector addition
$+ : X \times X \to X$ and scalar multiplication $\cdot : \mathbb{R} \times X \to X$ which together satisfy the vector space
axioms.
But now if one wants to also do analysis in a vector space $X$, there is so far no ready-made
notion of distance between vectors. One way of creating a distance in a vector space is
to equip it with a “norm” $\|\cdot\|$, which is the analogue of the absolute value $|\cdot|$ in the vector space $\mathbb{R}$.
The distance function is then created by taking the norm $\|x - y\|$ of the difference between pairs
of vectors $x, y \in X$, just like in $\mathbb{R}$ the Euclidean distance between $x, y \in \mathbb{R}$ was taken as $|x - y|$.
Definition 1.9. A normed space is a vector space $(X, +, \cdot)$ together with a norm, namely, a
function $\|\cdot\| : X \to \mathbb{R}$ satisfying the following properties:
(N1) (Positive definiteness) For all $x \in X$, $\|x\| \ge 0$. If $x \in X$ is such that $\|x\| = 0$, then $x = 0$
(the zero vector in $X$).
(N2) (Positive homogeneity) For all $\alpha \in \mathbb{R}$ and all $x \in X$, $\|\alpha \cdot x\| = |\alpha| \|x\|$.
(N3) (Triangle inequality) For all $x, y \in X$, $\|x + y\| \le \|x\| + \|y\|$.

If $X$ is a normed space then
$$d(x, y) := \|x - y\| \quad (x, y \in X)$$
satisfies (D1)-(D3) and makes $X$ a metric space (Exercise 1.11). This distance is referred to as
the induced distance in the normed space $(X, \|\cdot\|)$. Clearly then
$$\|x\| = \|x - 0\| = d(x, 0),$$
and so the norm of a vector is the induced distance to the zero vector in a normed space $(X, \|\cdot\|)$.
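
The passage from a norm to its induced distance can be sketched in code. In the sketch below, vectors of $\mathbb{R}^n$ are modelled as tuples, `two_norm` is the 2-norm of Example 1.10, and `max_norm` (largest absolute component) is a second norm chosen purely as an illustrative assumption; `induced_metric` is a hypothetical helper name.

```python
import math

def two_norm(x):
    """The 2-norm on R^n: square root of the sum of squared components."""
    return math.sqrt(sum(xk * xk for xk in x))

def max_norm(x):
    """Largest absolute value of a component (an illustrative second norm)."""
    return max(abs(xk) for xk in x)

def induced_metric(norm):
    """Turn a norm on R^n into the distance d(x, y) = norm(x - y)."""
    def d(x, y):
        return norm([xk - yk for xk, yk in zip(x, y)])
    return d

d2 = induced_metric(two_norm)
d_max = induced_metric(max_norm)

x, y, origin = (1, 2), (4, 6), (0, 0)
print(d2(x, y), d_max(x, y))            # the same pair of points, two different distances
print(d2(x, origin) == two_norm(x))     # the norm is the distance to the zero vector
```

Writing `induced_metric` as a function that takes a norm and returns a distance mirrors the statement in the text: the metric is determined entirely by the norm.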
Example 1.10. $\mathbb{R}$ is a vector space with the usual operations of addition and multiplication. It
is easy to see that the absolute value function
$$x \mapsto |x| \quad (x \in \mathbb{R})$$
satisfies (N1)-(N3), and so $\mathbb{R}$ is a normed space, and the induced distance is the usual Euclidean
metric in $\mathbb{R}$.
More generally, in the vector space $\mathbb{R}^n$, with addition and scalar multiplication defined componentwise,
we can introduce the 2-norm as
$$\|x\|_2 := \sqrt{x_1^2 + \cdots + x_n^2}$$
for vectors $x \in \mathbb{R}^n$ having components $x_1, \dots, x_n$. Then $\|\cdot\|_2$ satisfies (N1)-(N3) and makes $\mathbb{R}^n$ a
normed space. The induced metric is then the usual Euclidean metric in $\mathbb{R}^n$. ♦
Exercise 1.11. Verify that if $X$ is a normed space with norm $\|\cdot\|$, then $d : X \times X \to \mathbb{R}$ defined by
$d(x, y) = \|x - y\|$ satisfies (D1)-(D3).
Hint: Make sure that you use each of the properties (N1), (N2) and (N3).
Exercise 1.12. Verify that the norm $\|\cdot\|_2$ given on $\mathbb{R}^n$ in Example 1.10 does satisfy (N1)-(N3).
Exercise 1.13. (∗) Let $X$ be a metric space with a metric $d$. Define $d_1 : X \times X \to \mathbb{R}$ by
$$d_1(x, y) = \frac{d(x, y)}{1 + d(x, y)} \quad (x, y \in X).$$
Note that $d_1(x, y) \le 1$ for all $x, y \in X$. Show that $d_1$ is a metric on $X$.
Exercise 1.14. Consider the vector space $\mathbb{R}^{m \times n}$ of matrices with $m$ rows and $n$ columns of real numbers,
with the usual entrywise addition and scalar multiplication. Define for $M \in \mathbb{R}^{m \times n}$ with the entry in the
$i$th row and $j$th column denoted by $m_{ij}$, for $1 \le i \le m$, $1 \le j \le n$, the number
$$\|M\|_\infty := \max_{1 \le i \le m,\ 1 \le j \le n} |m_{ij}|.$$
Show that $\|\cdot\|_\infty$ defines a norm on $\mathbb{R}^{m \times n}$.

Exercise 1.15. Let $C[a, b]$ denote the set of all continuous functions $f : [a, b] \to \mathbb{R}$. Then $C[a, b]$ is a
vector space with addition and scalar multiplication defined pointwise. If $f \in C[a, b]$, define
$$\|f\|_\infty = \max_{x \in [a, b]} |f(x)|.$$
Note that the function $x \mapsto |f(x)|$ is continuous on $[a, b]$, and so by the Extreme Value Theorem, the
above maximum exists.
(1) Show that $\|\cdot\|_\infty$ is a norm on $C[a, b]$.
(2) Let $f \in C[a, b]$ and let $\epsilon > 0$. Consider the set $B(f, \epsilon) := \{g \in C[a, b] : \|f - g\|_\infty < \epsilon\}$. Draw a
picture to explain the geometric significance of the statement $g \in B(f, \epsilon)$.
Exercise 1.16. $C[a, b]$ can also be equipped with other norms. For example, prove that
$$\|f\|_1 := \int_a^b |f(x)|\, dx \quad (f \in C[a, b])$$
also defines a norm on $C[a, b]$.


Exercise 1.17. If $(X, d)$ is a metric space, and if $Y \subseteq X$, then show that $(Y, d|_{Y \times Y})$ is a metric space.
(Here $d|_{Y \times Y}$ denotes the restriction of $d$ to the set $Y \times Y$, that is, $d|_{Y \times Y}(y_1, y_2) = d(y_1, y_2)$ for $y_1, y_2 \in Y$.)
Hence every subset of a metric space is itself a metric space with the restriction of the original metric.
The metric $d|_{Y \times Y}$ is referred to as the induced metric on $Y$ by $d$ or the subspace metric of $d$ with respect
to $Y$, and the metric space $(Y, d|_{Y \times Y})$ is called a metric subspace of $(X, d)$.
Exercise 1.18. (∗) The set of integers $\mathbb{Z}$ ($\subseteq \mathbb{R}$) inherits the Euclidean metric from $\mathbb{R}$, but it also carries
a very different metric, called the $p$-adic metric. Given a prime number $p$ and a nonzero integer $n$, the $p$-adic
“norm”¹ of $n$ is $|n|_p := 1/p^k$, where $p^k$ is the largest power of $p$ that divides $n$. The norm of $0$ is by
definition $0$. The more factors of $p$, the smaller the $p$-norm. The $p$-adic metric² on $\mathbb{Z}$ is $d_p(x, y) := |x - y|_p$
($x, y \in \mathbb{Z}$).
(1) Prove that if $x, y \in \mathbb{Z}$, then $|x + y|_p \le \max\{|x|_p, |y|_p\}$.
(2) Show that $d_p$ is a metric on $\mathbb{Z}$.
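
To get a feel for the $p$-adic “norm” before attempting the exercise, one can compute it directly. The sketch below uses exact rational arithmetic; the function names `p_adic_norm` and `p_adic_distance` are assumptions made for illustration, the code does not verify that `p` is prime, and the final check covers only a finite range of integers, so it is no substitute for the proof asked for in part (1).

```python
from fractions import Fraction

def p_adic_norm(n, p):
    """|n|_p = 1/p^k, where p^k is the largest power of p dividing n; |0|_p = 0."""
    if n == 0:
        return Fraction(0)
    k = 0
    while n % p == 0:      # strip one factor of p at a time
        n //= p
        k += 1
    return Fraction(1, p ** k)

def p_adic_distance(x, y, p):
    """d_p(x, y) = |x - y|_p."""
    return p_adic_norm(x - y, p)

print(p_adic_norm(12, 2))   # 12 = 2^2 * 3
print(p_adic_norm(12, 3))
print(p_adic_norm(7, 2))    # no factor of 2 at all

# Spot-check the inequality of part (1) on a range of integers for p = 3.
values = range(-30, 31)
print(all(p_adic_norm(x + y, 3) <= max(p_adic_norm(x, 3), p_adic_norm(y, 3))
          for x in values for y in values))
```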
Exercise 1.19. (∗) Let $\ell^2$ denote the set of all “square summable” sequences of real numbers:
$$\ell^2 = \left\{ (a_n)_{n \in \mathbb{N}} : \sum_{n=1}^{\infty} |a_n|^2 < \infty \right\}.$$
(1) Show that $\ell^2$ is a vector space with addition and scalar multiplication defined termwise.
(2) Let $\|(a_n)_{n \in \mathbb{N}}\|_2 := \sqrt{\sum_{n=1}^{\infty} |a_n|^2}$ for $(a_n)_{n \in \mathbb{N}} \in \ell^2$. Prove that $\|\cdot\|_2$ defines a norm on $\ell^2$.
So $\ell^2$ is an infinite-dimensional analogue of the Euclidean space $(\mathbb{R}^n, \|\cdot\|_2)$.

Exercise 1.20 (Hamming Distance). Let $\mathbb{F}_2^n$ be the set of all ordered $n$-tuples of zeros and ones. For
example, $\mathbb{F}_2^3 = \{000, 001, 010, 011, 100, 101, 110, 111\}$. For $x, y \in \mathbb{F}_2^n$, let
$$d(x, y) = \text{the number of places where } x \text{ and } y \text{ have different entries}.$$
For example, in $\mathbb{F}_2^3$, we have $d(110, 110) = 0$, $d(010, 110) = 1$ and $d(101, 010) = 3$. Show that $(\mathbb{F}_2^n, d)$ is a
metric space. (This metric is used in the digital world, in coding and information theory.)
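
The Hamming distance is easy to compute, and for small $n$ the triangle inequality can be checked by brute force. The sketch below represents elements of $\mathbb{F}_2^n$ as strings of the characters 0 and 1, which is a representation chosen here for convenience; the exhaustive check covers only $n = 3$ and does not replace the general argument the exercise asks for.

```python
from itertools import product

def hamming_distance(x, y):
    """Number of positions at which two equal-length bit strings differ."""
    if len(x) != len(y):
        raise ValueError("strings must have the same length")
    return sum(a != b for a, b in zip(x, y))

# The three values quoted in Exercise 1.20.
print(hamming_distance("110", "110"), hamming_distance("010", "110"), hamming_distance("101", "010"))

# Brute-force check of the triangle inequality on all of F_2^3 (8 strings, 512 triples).
cube = ["".join(bits) for bits in product("01", repeat=3)]
print(all(hamming_distance(x, y) + hamming_distance(y, z) >= hamming_distance(x, z)
          for x in cube for y in cube for z in cube))
```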

1.3. Neighbourhoods and open sets


Let (X, d) be a metric space. Now that we have a metric, we can describe “neighbourhoods” of
points by considering sets which include all points whose distance to the given point is not too
large.
Definition 1.21 (Open ball). Let (X, d) be a metric space. If x ∈ X and r > 0, we call the set
B(x, r) = {y ∈ X : d(x, y) < r}
the open ball centered at x with radius r.

The picture we have in mind is shown in Figure 4.

Figure 4. The open ball $B(x, r)$, centered at $x$, with a point $y$ inside it.

¹ Note that $\mathbb{Z}$ is not a real vector space and so this is not really a norm in the sense we have learnt.
² Some physicists propose this to be the appropriate metric for the fabric of space-time. Thus metrics are something
we choose, depending on context. It is not something that falls out of the sky!


In the sequel, for example in our study of continuous functions, open sets will play an important
role. Here is the definition.
Definition 1.22 (Open set). Let (X, d) be a metric space. A set U ⊆ X is said to be open if for
every x ∈ U , there exists an r > 0 such that B(x, r) ⊆ U .

Note that the radius $r$ can depend on the choice of the point $x$. See Figure 5. Roughly
speaking, in an open set, no matter which point you take in it, there is always some “room” around
it consisting only of points of the open set.

Figure 5. Open set: a point $x$ of $U$ with a ball of radius $r$ around it contained in $U$.

Example 1.23. Let us show that the set $(a, b)$ is open in $\mathbb{R}$. Given any $x \in (a, b)$, we have
$a < x < b$. Motivated by Figure 6, let us take $r = \min\{x - a, b - x\}$. Then we see that $r > 0$ and
whenever $|y - x| < r$, we have $-r < y - x < r$. So
$$a = x - (x - a) \le x - r < y < x + r \le x + (b - x) = b,$$
that is, $y \in (a, b)$. Hence $B(x, r) \subseteq (a, b)$. Consequently, $(a, b)$ is open.

Figure 6. $(a, b)$ is open in $\mathbb{R}$.

On the other hand, the interval $[a, b]$ is not open, because $x := a \in [a, b]$, but no matter how
small an $r > 0$ we take, the set $B(a, r) = \{y \in \mathbb{R} : |y - a| < r\} = (a - r, a + r)$ contains points
that do not belong to $[a, b]$: for example,
$$a - \frac{r}{2} \in B(a, r) \quad \text{but} \quad a - \frac{r}{2} \notin [a, b].$$
Figure 7 illustrates this. ♦

Figure 7. $[a, b]$ is not open in $\mathbb{R}$.
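
The choice of radius in Example 1.23 can be tried out numerically. The following sketch takes $a = 0$, $b = 1$ (an arbitrary choice for illustration) and uses exact fractions; `ball_radius_in_interval` is a hypothetical helper name. It shows that the radius shrinks as $x$ approaches an endpoint but always stays positive, and that at the endpoint $a$ of $[a, b]$ the point $a - r/2$ defeats every radius tried.

```python
from fractions import Fraction

def ball_radius_in_interval(x, a, b):
    """Radius r = min{x - a, b - x} from Example 1.23; requires a < x < b."""
    if not a < x < b:
        raise ValueError("x must lie strictly between a and b")
    return min(x - a, b - x)

a, b = Fraction(0), Fraction(1)

# For points creeping towards a, the radius shrinks but stays positive,
# and the ball (x - r, x + r) still sits inside (a, b).
for k in range(1, 6):
    x = a + Fraction(1, 10 ** k)
    r = ball_radius_in_interval(x, a, b)
    print(x, r, a <= x - r and x + r <= b)

# At the endpoint a of [a, b], every radius fails: a - r/2 lies in B(a, r) but not in [a, b].
for r in (Fraction(1), Fraction(1, 100), Fraction(1, 10 ** 9)):
    witness = a - r / 2
    print(abs(witness - a) < r, a <= witness <= b)
```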

Example 1.24. The set $X$ is open, since given an $x \in X$, we can take any $r > 0$, and notice that
$B(x, r) \subseteq X$ trivially.
The empty set $\emptyset$ is also open (“vacuously”). Indeed, the reasoning is as follows: can one exhibit
an $x \in \emptyset$ for which there is no $r > 0$ such that $B(x, r) \subseteq \emptyset$? The answer is no, because there is no
$x$ in the empty set (let alone an $x$ which has the extra property that there is no $r > 0$ such that
$B(x, r) \subseteq \emptyset$!). ♦
Exercise 1.25. Let (X, d) be a metric space, x ∈ X and r > 0. Show that the open ball B(x, r) is an
open set.

Lemma 1.26. Any finite intersection of open sets is open.

Proof. It is enough to consider two open sets, as the general case follows immediately by induction
on the number of sets. Let $U_1, U_2$ be two open sets. Let $x \in U_1 \cap U_2$. Then there exist $r_1 > 0$,
$r_2 > 0$ such that $B(x, r_1) \subseteq U_1$ and $B(x, r_2) \subseteq U_2$. Take $r = \min\{r_1, r_2\}$. Then $r > 0$, and we
claim that $B(x, r) \subseteq U_1 \cap U_2$. To see this, let $y \in B(x, r)$. Then $d(x, y) < r \le r_1$ and $d(x, y) < r \le r_2$. So
$y \in B(x, r_1) \cap B(x, r_2) \subseteq U_1 \cap U_2$. □
Example 1.27. The finiteness condition in the above lemma cannot be dropped. Here is an
example. Consider the open sets
$$U_n := \left( -\frac{1}{n}, \frac{1}{n} \right) \quad (n \in \mathbb{N})$$
in $\mathbb{R}$. Then we have $\bigcap_{n \in \mathbb{N}} U_n = \{0\}$, which is not open in $\mathbb{R}$. ♦
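
The claim that the intersection of the sets $U_n$ is $\{0\}$ rests on one observation: every $x \neq 0$ is excluded by some $U_n$, namely any $n$ with $1/n \le |x|$. The sketch below makes this concrete; `in_U` and `excluding_index` are hypothetical helper names, and the final line tests membership of $0$ only for finitely many $n$.

```python
from fractions import Fraction
import math

def in_U(x, n):
    """Membership in U_n = (-1/n, 1/n)."""
    return -Fraction(1, n) < x < Fraction(1, n)

def excluding_index(x):
    """For x != 0, return an n with x not in U_n (so x is not in the intersection)."""
    if x == 0:
        raise ValueError("0 belongs to every U_n")
    return math.ceil(1 / abs(x))   # then 1/n <= |x|

for x in (Fraction(1, 2), Fraction(-1, 1000), Fraction(3, 7)):
    n = excluding_index(x)
    print(x, n, in_U(x, n))

print(all(in_U(Fraction(0), n) for n in range(1, 10001)))  # 0 lies in each U_n tested
```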

Lemma 1.28. Any union of open sets is open.

Proof. Let $U_i$ ($i \in I$) be a family of open sets indexed³ by the set $I$. If
$$x \in \bigcup_{i \in I} U_i,$$
then $x \in U_{i_*}$ for some $i_* \in I$. But as $U_{i_*}$ is open, there exists an $r > 0$ such that $B(x, r) \subseteq U_{i_*}$.
Thus
$$B(x, r) \subseteq U_{i_*} \subseteq \bigcup_{i \in I} U_i,$$
and so we see that the union $\bigcup_{i \in I} U_i$ is open. □

Definition 1.29 (Closed set). Let (X, d) be a metric space. A set F is closed if its complement
X \ F is open.
Example 1.30. [a, b] is closed in R. Indeed, its complement R \ [a, b] is the union of the two open
sets (−∞, a) and (b, ∞). Hence R \ [a, b] is open, and [a, b] is closed.
The set (−∞, b] is closed in R. (Why?)
The sets (a, b], [a, b) are neither open nor closed in R. (Why?) ♦
Example 1.31. X, ∅ are closed. ♦
Exercise 1.32. Show that arbitrary intersections of closed sets are closed. Prove that a finite union of
closed sets is closed. Can the finiteness condition be dropped in the previous claim?
Exercise 1.33. We know that the segment $(0, 1)$ is open in $\mathbb{R}$. Show that the segment $(0, 1)$ considered
as a subset of the plane, that is, the set $I = \{(x, y) \in \mathbb{R}^2 : 0 < x < 1,\ y = 0\}$, is not open in $\mathbb{R}^2$.
Exercise 1.34. Consider the following three metrics on $\mathbb{R}^2$: for $x = (x_1, x_2)$, $y = (y_1, y_2) \in \mathbb{R}^2$,
$$d_1(x, y) := |x_1 - y_1| + |x_2 - y_2|,$$
$$d_2(x, y) := \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2},$$
$$d_\infty(x, y) := \max\{|x_1 - y_1|, |x_2 - y_2|\}.$$
We already know that $d_2$ defines a metric on $\mathbb{R}^2$: it is just the Euclidean metric induced by the norm $\|\cdot\|_2$.
(1) Verify that $d_1$ and $d_\infty$ are also metrics on $\mathbb{R}^2$.
(2) Sketch the “unit balls” $B(0, 1)$ in each of the metrics.
(3) Give a pictorial “proof without words” to show that a set $U$ is open in $\mathbb{R}^2$ in the Euclidean
metric if and only if it is open when $\mathbb{R}^2$ is equipped with the metric $d_1$ or the metric $d_\infty$. Hint:
Inside every square you can draw a circle, and inside every circle, you can draw a square!

³ This means that we have a set $I$, and for each $i \in I$, there is a set $U_i$.
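
The hint in part (3) has a numerical counterpart: the three distances are always comparable, with $d_\infty \le d_2 \le d_1 \le 2 d_\infty$. The sketch below computes the three metrics and spot-checks this chain on random points. The function names, the random sample and the tolerance are assumptions made for illustration, and the check is evidence rather than a proof.

```python
import math
import random

def d1(x, y):
    """Sum of the absolute coordinate differences."""
    return abs(x[0] - y[0]) + abs(x[1] - y[1])

def d2(x, y):
    """Euclidean distance in the plane."""
    return math.hypot(x[0] - y[0], x[1] - y[1])

def d_inf(x, y):
    """Largest absolute coordinate difference."""
    return max(abs(x[0] - y[0]), abs(x[1] - y[1]))

origin = (0.0, 0.0)
point = (3.0, 4.0)
print(d1(origin, point), d2(origin, point), d_inf(origin, point))

# Spot-check d_inf <= d2 <= d1 <= 2 * d_inf on random points.
rng = random.Random(1)
tol = 1e-12   # slack for floating-point rounding in d2
for _ in range(10000):
    x = (rng.uniform(-5, 5), rng.uniform(-5, 5))
    y = (rng.uniform(-5, 5), rng.uniform(-5, 5))
    assert d_inf(x, y) <= d2(x, y) + tol
    assert d2(x, y) <= d1(x, y) + tol
    assert d1(x, y) <= 2 * d_inf(x, y) + tol
```

These comparisons are what allow a ball in one metric to be fitted inside a ball in another, which is the content of the “squares inside circles” picture.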
Exercise 1.35. Determine if the following statements are true or false. Give reasons for your answers.
(1) If a set is not open, then it is closed.
(2) If a set is open, then it is not closed.
(3) There are sets which are both open and closed.
(4) There are sets which are neither open nor closed.
(5) Q is open in R.
(6) (∗) Q is closed in R.
(7) Z is closed in R.
Exercise 1.36. Show that the unit sphere with center $0$ in $\mathbb{R}^3$, namely the set
$$S^2 := \{x \in \mathbb{R}^3 : x_1^2 + x_2^2 + x_3^2 = 1\},$$
is closed in $\mathbb{R}^3$.
Exercise 1.37. Let (X, d) be a metric space. Show that a singleton (a subset of X containing precisely
one element) is always closed. Conclude that every finite subset of X is closed.
Exercise 1.38. Let X be any nonempty set equipped with the discrete metric. Prove that every subset
Y of X is both open and closed.
Exercise 1.39. (∗) A subset $Y$ of a metric space $(X, d)$ is said to be dense in $X$ if for all $x \in X$ and all
$\epsilon > 0$, there exists a $y \in Y$ such that $d(x, y) < \epsilon$. (That is, if we take any $x \in X$ and consider any ball
$B(x, \epsilon)$ centered at $x$, it contains a point from $Y$.) Show that $\mathbb{Q}$ is dense in $\mathbb{R}$ by proceeding as follows.
If $x, y \in \mathbb{R}$ and $x < y$, then show that there is a $q \in \mathbb{Q}$ such that $x < q < y$. Hint: By the Archimedean
property⁴ of $\mathbb{R}$, there is a positive integer $n$ such that $n(y - x) > 1$. Next there are positive integers $m_1$,
$m_2$ such that $m_1 > nx$ and $m_2 > -nx$ so that $-m_2 < nx < m_1$. Hence there is an integer $m$ such that
$m - 1 \le nx < m$. Consequently $nx < m \le 1 + nx < ny$, which gives the desired result.
Conclude that $\mathbb{Q}$ is dense in $\mathbb{R}$.
Exercise 1.40. Is the set $\mathbb{R} \setminus \mathbb{Q}$ of irrational numbers dense in $\mathbb{R}$? Hint: Take any $x \in \mathbb{R}$. If $x$ is irrational
itself, then we may just take $y$ to be $x$ and we are done; whereas if $x$ is rational, then take $y = x + \sqrt{2}/n$
with a sufficiently large $n$.
Exercise 1.41. A subset $C$ of a normed space $X$ is called convex if for all $x, y \in C$, and all $t \in (0, 1)$,
$(1 - t)x + ty \in C$. (Geometrically, this means that for any pair of points in $C$, the “line segment” joining
them also lies in $C$.)
(1) Show that the open ball $B(0, r)$ with center $0 \in X$ and radius $r > 0$ is convex.
(2) Is the unit circle $S^1 = \{x \in \mathbb{R}^2 : \|x\|_2 = 1\}$ a convex set in $\mathbb{R}^2$?
(3) Let $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. Prove that the “Linear Programming simplex”⁵
$$\Sigma := \{x \in \mathbb{R}^n : Ax = b,\ x_1 \ge 0, \dots, x_n \ge 0\}$$
is a convex set in $\mathbb{R}^n$.
Exercise 1.42. Define $d_e : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ by
$$d_e(u, v) = \begin{cases} \|u\|_2 + \|v\|_2 & \text{if } u \neq v, \\ 0 & \text{if } u = v, \end{cases} \quad (u, v \in \mathbb{R}^2).$$
We call this metric the “express railway metric”. (For example in the British context, to get from A to
B, travel via London, the origin.)
On the other hand, consider $d_s : \mathbb{R}^2 \times \mathbb{R}^2 \to \mathbb{R}$ defined by
$$d_s(u, v) = \begin{cases} \|u\|_2 + \|v\|_2 & \text{if } u, v \text{ are linearly independent}, \\ \|u - v\|_2 & \text{if } u, v \text{ are linearly dependent}, \end{cases} \quad (u, v \in \mathbb{R}^2).$$
We call this the “stopping railway metric”. (To get from A to B travel via London, unless A and B are
on the same London route.)
Show that the express railway metric and the stopping railway metric are indeed metrics on $\mathbb{R}^2$.

⁴ The Archimedean property of $\mathbb{R}$ says that if $x, y \in \mathbb{R}$ and $x > 0$, then there exists an $n \in \mathbb{N}$ such that $y < nx$. See
the notes for MA103.
⁵ This set arises as the “feasible set” in a certain optimization problem in $\mathbb{R}^n$, where the constraints are described by a
bunch of linear inequalities.
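
Both railway metrics are straightforward to compute, which can help in building intuition before proving the axioms. In the sketch below, points of $\mathbb{R}^2$ are tuples with integer coordinates so that the determinant test for linear dependence is exact; the function names and the sample points are assumptions made for illustration.

```python
import math

def norm2(u):
    """Euclidean norm of a vector in the plane."""
    return math.hypot(u[0], u[1])

def express_railway(u, v):
    """d_e: every journey between distinct points goes through the origin."""
    return 0.0 if u == v else norm2(u) + norm2(v)

def linearly_dependent(u, v):
    """Two plane vectors are linearly dependent exactly when their 2x2 determinant is 0."""
    return u[0] * v[1] - u[1] * v[0] == 0

def stopping_railway(u, v):
    """d_s: travel directly along a common line through the origin, else via the origin."""
    if linearly_dependent(u, v):
        return norm2((u[0] - v[0], u[1] - v[1]))
    return norm2(u) + norm2(v)

a, b, c = (3, 4), (6, 8), (0, 5)        # a and b lie on the same line through the origin
print(express_railway(a, b), stopping_railway(a, b))
print(express_railway(a, c), stopping_railway(a, c))
```

For the points $a$ and $b$ on a common line through the origin, the stopping railway gives the direct distance while the express railway still routes through the origin; for $a$ and $c$, which are linearly independent, the two metrics agree.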

1.4. Notes (not part of the course)


Topology. If we look at the collection $\mathcal{O}$ of open sets in a metric space $(X, d)$, we notice that it has the
following three properties:
(T1) $\emptyset, X \in \mathcal{O}$.
(T2) If $U_i$ ($i \in I$) is a family of sets from $\mathcal{O}$ indexed by $I$, then $\bigcup_{i \in I} U_i \in \mathcal{O}$.
(T3) If $U_1, \dots, U_n$ is a finite collection of sets from $\mathcal{O}$, then $\bigcap_{i=1}^{n} U_i \in \mathcal{O}$.

More generally, if $X$ is any set (not necessarily one equipped with a metric), then any collection $\mathcal{O}$ of
subsets of $X$ that satisfies the properties (T1), (T2), (T3) is called a topology on $X$ and $(X, \mathcal{O})$ is called a
topological space. So for a metric space $X$, if we take $\mathcal{O}$ to be the family of open sets in $X$, then we obtain a
topological space. More generally, if one has a topological space $(X, \mathcal{O})$ given by the topology $\mathcal{O}$, we call
each element of $\mathcal{O}$ open.

[Diagram with regions labelled: Topological spaces, Metric spaces, Normed spaces, Vector spaces.]

It turns out that one can in fact extend some of the notions from Real Analysis (such as convergence
of sequences and continuity of maps) in the even more general set up of topological spaces, devoid of any
metric, where the notion of closeness is specified by considering arbitrary open neighbourhoods provided
by elements of O. In some applications this is exactly the right thing needed, but we will not go into such
abstractions in this course. In fact, this is a very broad subdiscipline of mathematics called Topology.

Construction of the set of real numbers. In these notes, we treat the real number system $\mathbb{R}$ as a
given. But one might wonder if we can take the existence of real numbers on faith alone. It turns out
that a mathematical proof of its existence can be given. Roughly, we are already familiar with the natural
numbers, the integers, and the rational numbers, and their rigorous mathematical construction is also
relatively straightforward. However, the set $\mathbb{Q}$ of rational numbers has “holes” (for example in MA103 we
have seen that this manifests itself in the fact that $\mathbb{Q}$ does not possess the least upper bound property).
The set of real numbers $\mathbb{R}$ is obtained by “filling these holes”. There are several ways of doing this. One
is by a general method called “completion of metric spaces”. Another way, which is more intuitive, is via
“(Dedekind) cuts”, where we view real numbers as places where a line may be cut with scissors. More
precisely, a cut $A|B$ in $\mathbb{Q}$ is a pair of subsets $A, B$ of $\mathbb{Q}$ such that $A \cup B = \mathbb{Q}$, $A \neq \emptyset$, $B \neq \emptyset$, $A \cap B = \emptyset$,
if $a \in A$ and $b \in B$ then $a < b$, and $A$ contains no largest element. $\mathbb{R}$ is then taken as the set of all cuts
$A|B$. Here are two examples of cuts:
$$A|B = \{r \in \mathbb{Q} : r < 1\} \,|\, \{r \in \mathbb{Q} : r \ge 1\},$$
$$A|B = \{r \in \mathbb{Q} : r \le 0 \text{ or } r^2 < 2\} \,|\, \{r \in \mathbb{Q} : r > 0 \text{ and } r^2 \ge 2\}.$$
It turns out that $\mathbb{R}$ is a field containing $\mathbb{Q}$, and it possesses the least upper bound property. The interested
reader is referred to the Appendix to Chapter 1 in the classic textbook by Walter Rudin [R].

Figure 8. Dedekind cut.


Chapter 2

Sequences

In this chapter we study sequences in metric spaces. The notion of a convergent sequence is an
important concept in Analysis. Besides its theoretical importance, it is also a natural concept
arising in applications when one talks about better and better approximations to the solution of a
problem using a numerical scheme. An example is the method of Archimedes for finding the area of
a circle by sandwiching it between the areas of a circumscribed and an inscribed regular polygon
with an ever increasing number of sides. There are also numerical schemes for finding a minimizer of a
convex function (Newton’s method), or for finding a solution to an ordinary differential equation
(Euler’s method), where convergence in more general metric spaces (such as $\mathbb{R}^n$ or $C[a, b]$) will
play a role.
Before proceeding to sequences in general metric spaces, let us first begin with (numerical)
sequences in $\mathbb{R}$.

2.1. Sequences in R
Let us recall the definition of a convergent sequence of real numbers.

Definition 2.1. A sequence $(a_n)_{n\in\mathbb{N}}$ of real numbers is said to be convergent with limit $L \in \mathbb{R}$ if
for every $\epsilon > 0$, there exists an $N \in \mathbb{N}$ such that whenever $n > N$, $|a_n - L| < \epsilon$.

We have learnt that the limit of a convergent sequence $(a_n)_{n\in\mathbb{N}}$ is unique, and we denote it by
\[ \lim_{n\to\infty} a_n. \]

We have also learnt the following important result¹:

Theorem 2.2 (Bolzano-Weierstrass Theorem). Every bounded sequence has a convergent subse-
quence.

An important consequence of this result is the fact that in $\mathbb{R}$, the set of convergent sequences
coincides with the set of Cauchy sequences. Let us first recall the definition of a Cauchy sequence².

Definition 2.3. A sequence $(a_n)_{n\in\mathbb{N}}$ of real numbers is said to be a Cauchy sequence if for every
$\epsilon > 0$, there exists an $N \in \mathbb{N}$ such that whenever $m, n > N$, $|a_n - a_m| < \epsilon$.

Roughly speaking, we can make the terms of the sequence arbitrarily close to each other
provided we go far enough in the sequence.

¹ See Theorem 1.2.8 on page 34 of the lecture notes for the second part of MA103.
² See Exercise 6 on page 18 of the lecture notes for the second part of MA103.


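Example 2.4. The sequence $(\frac{1}{n})_{n\in\mathbb{N}}$ is Cauchy. Indeed, we have
\[ \left|\frac{1}{n} - \frac{1}{m}\right| \leq \frac{1}{n} + \frac{1}{m} < \frac{1}{N} + \frac{1}{N} = \frac{2}{N} \]
whenever $n, m > N$. Thus given $\epsilon > 0$, we can choose $N \in \mathbb{N}$ larger than $\frac{2}{\epsilon}$ so that we then have
$|\frac{1}{n} - \frac{1}{m}| < \frac{2}{N} < \epsilon$ for all $n, m > N$. Consequently, $(\frac{1}{n})_{n\in\mathbb{N}}$ is Cauchy. ♦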
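The choice of N in Example 2.4 can be tried out numerically. This is a minimal sketch with hypothetical helper names; it uses exact fractions and only inspects finitely many terms, so it illustrates the estimate rather than proving it.

```python
import math
from fractions import Fraction

def cauchy_index(eps):
    """Return an N in N with N > 2/eps, as chosen in Example 2.4."""
    return math.floor(2 / eps) + 1

def max_gap(N, horizon):
    """Largest |1/n - 1/m| over all N < n, m <= horizon (exact arithmetic)."""
    terms = [Fraction(1, n) for n in range(N + 1, horizon + 1)]
    return max(terms) - min(terms)

eps = Fraction(1, 100)
N = cauchy_index(eps)
# Every gap among the inspected terms beyond N stays below eps.
print(N, max_gap(N, 2000) < eps)
```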
Exercise 2.5. Show that if (an )n∈N is a Cauchy sequence, then (an+1 − an )n∈N converges to 0.

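Example 2.6. This example shows that for a sequence $(a_n)_{n\in\mathbb{N}}$ to be Cauchy, it is not enough
that $(a_{n+1} - a_n)_{n\in\mathbb{N}}$ converges to 0. Take $a_n := \sqrt{n}$ ($n \in \mathbb{N}$). Then
\[ a_{n+1} - a_n = \sqrt{n+1} - \sqrt{n} = \frac{1}{\sqrt{n+1} + \sqrt{n}} \xrightarrow{n\to\infty} 0, \]
but $(a_n)_{n\in\mathbb{N}}$ is not Cauchy, since for any $n \in \mathbb{N}$, $a_{4n} - a_n = \sqrt{4n} - \sqrt{n} = \sqrt{n} \geq 1$. ♦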
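The contrast in Example 2.6 is easy to see in numbers. The sketch below (floating point, with an illustrative choice of indices) prints the consecutive difference, which shrinks, next to the difference between the terms with indices 4n and n, which grows.

```python
import math

def a(n):
    """The sequence a_n = sqrt(n) from Example 2.6."""
    return math.sqrt(n)

for n in (10, 1_000, 100_000):
    consecutive = a(n + 1) - a(n)   # tends to 0
    far_apart = a(4 * n) - a(n)     # equals sqrt(n), so it is at least 1
    print(n, consecutive, far_apart)
```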

The next result says that Cauchyness is a necessary condition for a sequence to be convergent.
Lemma 2.7. Every convergent sequence is Cauchy.

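Proof. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence of real numbers that converges to $L$. Let $\epsilon > 0$. Then there
exists an $N \in \mathbb{N}$ such that for all $n > N$, $|a_n - L| < \frac{\epsilon}{2}$. Thus for $n, m > N$, we have
\[ |a_n - a_m| = |a_n - L + L - a_m| \leq |a_n - L| + |a_m - L| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon. \]
So the sequence $(a_n)_{n\in\mathbb{N}}$ is a Cauchy sequence. □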

Now we will prove the remarkable fact that in $\mathbb{R}$, Cauchyness turns out to be also a sufficient
condition for the sequence to be convergent. In other words, in $\mathbb{R}$, every Cauchy sequence is
convergent. This is a very useful fact since, in order to prove that a sequence is convergent using
the definition, we would need to guess what the limit is. In contrast, checking whether or not a
sequence is Cauchy needs only knowledge of the terms of the sequence, and no guesswork regarding
the limit is needed. So this is a powerful technique for proving existence results.
Theorem 2.8. Every Cauchy sequence in R is convergent.

Proof. There are three main steps. First we show that every Cauchy sequence is bounded. Then
we use the Bolzano-Weierstrass theorem to conclude that it must have a convergent subsequence.
Finally we show that a Cauchy sequence having a convergent subsequence must itself be convergent.
Step 1. Suppose that $(a_n)_{n\in\mathbb{N}}$ is a Cauchy sequence. Choose any positive $\epsilon$, say $\epsilon = 1$. Then there
exists an $N \in \mathbb{N}$ such that for all $n, m > N$, $|a_n - a_m| < \epsilon$. In particular, with $m = N + 1 > N$,
and $n > N$, $|a_n - a_{N+1}| < \epsilon$. Hence by the triangle inequality, for all $n > N$,
\[ |a_n| = |a_n - a_{N+1} + a_{N+1}| \leq |a_n - a_{N+1}| + |a_{N+1}| < 1 + |a_{N+1}|. \]
On the other hand, for $n \leq N$, $|a_n| \leq \max\{|a_1|, \dots, |a_N|, |a_{N+1}| + 1\} =: M$. Consequently,
$|a_n| \leq M$ ($n \in \mathbb{N}$), that is, the sequence $(a_n)_{n\in\mathbb{N}}$ is bounded.
Step 2. By the Bolzano-Weierstrass Theorem, $(a_n)_{n\in\mathbb{N}}$ has a convergent subsequence $(a_{n_k})_{k\in\mathbb{N}}$,
with limit $L$, say.
Step 3. Finally we show that $(a_n)_{n\in\mathbb{N}}$ is also convergent with limit $L$. Let $\epsilon > 0$. Then there
exists an $N \in \mathbb{N}$ such that for all $n, m > N$,
\[ |a_n - a_m| < \frac{\epsilon}{2}. \qquad (2.1) \]
Also, since $(a_{n_k})_{k\in\mathbb{N}}$ converges to $L$, we can find an $n_K > N$ such that $|a_{n_K} - L| < \frac{\epsilon}{2}$. Taking
$m = n_K$ in (2.1), we have for all $n > N$ that
\[ |a_n - L| = |a_n - a_{n_K} + a_{n_K} - L| \leq |a_n - a_{n_K}| + |a_{n_K} - L| < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon. \]
Thus $(a_n)_{n\in\mathbb{N}}$ is also convergent with limit $L$, and this completes the proof. □


Exercise 2.9. Determine if the following statements are true or false. Give reasons for your answers.
(1) Every subsequence of a convergent real sequence is convergent.
(2) Every subsequence of a divergent real sequence is divergent.
(3) Every subsequence of a bounded real sequence is bounded.
(4) Every subsequence of an unbounded real sequence is unbounded.
(5) Every subsequence of a monotone real sequence is monotone.
(6) Every subsequence of a nonmonotone real sequence is nonmonotone.
(7) If every subsequence of a real sequence converges, the sequence itself converges.
(8) If for a real sequence (an )n∈N , the sequences (a2n )n∈N and (a2n+1 )n∈N both converge, then
(an )n∈N converges.
(9) If for a real sequence (an )n∈N , the sequences (a2n )n∈N and (a2n+1 )n∈N both converge to the
same limit, then (an )n∈N converges.
Exercise 2.10. Fill in the blanks in the following proof of the fact that every bounded increasing sequence
of real numbers converges.
Let $(a_n)_{n\in\mathbb{N}}$ be a bounded increasing sequence of real numbers. Let $M$ be the ______ of the set of
upper bounds of $\{a_n : n \in \mathbb{N}\}$. The existence of $M$ is guaranteed by the ______ of the set of real numbers.
We show that $M$ is the ______ of $(a_n)_{n\in\mathbb{N}}$. Taking $\epsilon > 0$, we must show that there exists a positive integer
$N$ such that ______ for all $n > N$. Since $M - \epsilon < M$, $M - \epsilon$ is not ______ of $\{a_n : n \in \mathbb{N}\}$. Therefore
there exists $N$ with ______ $\geq a_N >$ ______. Since $(a_n)_{n\in\mathbb{N}}$ is ______, $|a_n - M| < \epsilon$ for all $n \geq N$. □
Exercise 2.11. (∗) Consider the sequence
\[ \left(1 + \frac{1}{1}\right)^1, \ \left(1 + \frac{1}{2}\right)^2, \ \left(1 + \frac{1}{3}\right)^3, \ \dots. \]
(1) Show that the sequence is increasing.
(2) Prove that each term in the sequence is smaller than 3.
Hint: Use the Binomial Theorem.
From the previous exercise, one can conclude that the sequence converges. In fact it can be shown
that
\[ \lim_{n\to\infty} \left(1 + \frac{1}{n}\right)^n = e := 1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \dots. \]
Exercise 2.12. For each of the following sequences, determine whether it converges or not, and find the
limit in case of convergence. Give reasons for your answers.

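(1) $(\cos(\pi n))_{n\in\mathbb{N}}$

(2) $(1 + n^2)_{n\in\mathbb{N}}$

(3) $\left(\dfrac{\sin n}{n}\right)_{n\in\mathbb{N}}$

(4) $\left(1 - \dfrac{3n^2}{n+1}\right)_{n\in\mathbb{N}}$

(5) $\left(n^{\frac{1}{n}}\right)_{n\in\mathbb{N}}$

(6) $0.9, 0.99, 0.999, \dots$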

2.2. Sequences in metric spaces


We now give the notion of convergence of a sequence in a general metric space. We will see that
essentially the definition is the same as in $\mathbb{R}$, except that instead of having the distance between
the nth term and the limit $L$ given by $|a_n - L|$, now we will replace it by $d(a_n, L)$ in a general
metric space with metric $d$.
Definition 2.13. A sequence (an )n∈N of points in a metric space (X, d) is said to be convergent
with limit L ∈ X if for every ǫ > 0, there exists a N ∈ N such that whenever n > N , d(an , L) < ǫ.

Let us understand this definition pictorially. We have been given a sequence $(a_n)_{n\in\mathbb{N}}$ of points
and a candidate $L$ for its limit. We are allowed to say that this sequence converges to $L$ if given
any $\epsilon > 0$, that is, no matter how small a ball we consider around $L$,

[Picture: the ball with centre $L$ and radius $\epsilon$.]

we can find an index $N$ such that all the terms of the sequence beyond this index lie inside the
ball.
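As a rough numerical companion to Definition 2.13, the sketch below checks, over a finite range of indices, whether the terms beyond a given N lie inside the ball of radius ε around a candidate limit. The sequence a_n = (1/n, 1 − 1/n) in R², the candidate limit (0, 1) and all helper names are illustrative assumptions, not taken from the notes; a finite check can suggest convergence but never prove it.

```python
import math

def euclid(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(p, q)))

def tail_inside_ball(seq, d, L, eps, N, horizon):
    """True if d(seq(n), L) < eps for all N < n <= horizon (finite check only)."""
    return all(d(seq(n), L) < eps for n in range(N + 1, horizon + 1))

def a(n):
    """Hypothetical example sequence in R^2; here d(a(n), (0, 1)) = sqrt(2)/n."""
    return (1 / n, 1 - 1 / n)

L = (0.0, 1.0)
print(tail_inside_ball(a, euclid, L, 0.01, 141, 5000))  # N = 141 is large enough
print(tail_inside_ball(a, euclid, L, 0.01, 100, 5000))  # N = 100 is too small
```

Passing the metric d as an argument mirrors the definition: only d changes when one moves from R to another metric space.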

Lemma 2.14. The limit of a convergent sequence in a metric space is unique.

Proof. Suppose that $(a_n)_{n\in\mathbb{N}}$ is a convergent sequence, and let it have two distinct limits $L_1$ and
$L_2$. Then $d(L_1, L_2) > 0$. Set
\[ \epsilon = \frac{1}{2} d(L_1, L_2) > 0. \]
Then there exists an $N_1$ such that for all $n > N_1$, $d(a_n, L_1) < \epsilon$. Also, there exists an $N_2$ such
that for all $n > N_2$, $d(a_n, L_2) < \epsilon$. Hence for any $n > \max\{N_1, N_2\}$, we have
\[ d(L_1, L_2) \leq d(L_1, a_n) + d(a_n, L_2) < \epsilon + \epsilon = d(L_1, L_2), \]
a contradiction. Thus the limit of $(a_n)_{n\in\mathbb{N}}$ is unique. □

If $(a_n)_{n\in\mathbb{N}}$ is a convergent sequence, then we will denote its limit by $\lim_{n\to\infty} a_n$.

Exercise 2.15. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence in the Euclidean space $\mathbb{R}^d$. Show that $(a_n)_{n\in\mathbb{N}}$ is convergent
with limit $L$ if and only if for every $k \in \{1, \dots, d\}$, the sequence $(a_n^{(k)})_{n\in\mathbb{N}}$ in $\mathbb{R}$ formed by the kth
component of the terms of $(a_n)_{n\in\mathbb{N}}$ is convergent with limit $L^{(k)}$. (Here we use the notation $v^{(k)}$ to mean
the kth component of a vector $v \in \mathbb{R}^d$.)

Exercise 2.16. Consider the sequence $(a_n)_{n\in\mathbb{N}}$ in the Euclidean space $\mathbb{R}^2$:
\[ a_n := \begin{bmatrix} \dfrac{n}{4n+2} \\[2mm] \dfrac{n^2}{n^2+1} \end{bmatrix} \quad (n \in \mathbb{N}). \]
Show that $(a_n)_{n\in\mathbb{N}}$ is convergent. What is its limit?


Exercise 2.17. Let X be a nonempty set equipped with the discrete metric. Show that a sequence
(an )n∈N is convergent if and only if it is eventually a constant sequence (that is, there is a c ∈ X and an
N ∈ N such that for all n > N , an = c).
Exercise 2.18. (∗) Let (X, d) be a metric space and let (an )n∈N and (bn )n∈N be convergent sequences
in X with limits a and b, respectively. Prove that (d(an , bn ))n∈N is a convergent sequence in R with limit
d(a, b). Hint: d(an , bn ) ≤ d(an , a) + d(a, b) + d(b, bn ).
Exercise 2.19. Let $v_1 = (x_1, y_1) \in \mathbb{R}^2$ be such that $0 < x_1 < y_1$. Define
\[ v_{n+1} := (x_{n+1}, y_{n+1}) := \left(\sqrt{x_n y_n}, \ \frac{x_n + y_n}{2}\right) \quad (n \in \mathbb{N}). \]
(1) Show that $0 < x_n < x_{n+1} < y_{n+1} < y_n$ and that $y_{n+1} - x_{n+1} < y_{n+1} - x_n = \dfrac{y_n - x_n}{2}$.
(2) Conclude that $\lim_{n\to\infty} v_n$ exists and equals $(c, c)$ for some number $c \in \mathbb{R}$. This value $c$ is called the
arithmetic-geometric mean³ of $x_1$ and $y_1$, and is denoted by $\mathrm{agm}(x_1, y_1)$.
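The iteration of Exercise 2.19 is simple to run. The following floating-point sketch is one possible way to approximate agm(x1, y1); the function name, the stopping tolerance and the iteration cap are assumptions made for illustration, not part of the exercise.

```python
import math

def agm(x, y, tol=1e-14, max_iter=100):
    """Approximate the arithmetic-geometric mean of 0 < x < y by iterating
    (x, y) -> (sqrt(x*y), (x + y)/2) until the two components nearly agree."""
    if not 0 < x < y:
        raise ValueError("expected 0 < x < y")
    for _ in range(max_iter):
        if y - x <= tol * y:
            break
        x, y = math.sqrt(x * y), (x + y) / 2
    return (x + y) / 2

c = agm(1.0, 2.0)
# The limit lies between the first geometric and arithmetic means.
print(math.sqrt(2.0) < c < 1.5, c)
```

Because the gap y_n − x_n at least halves at every step, a small iteration cap is more than enough in double precision.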

We can also define Cauchy sequences in a metric space analogous to the situation in R.
Definition 2.20. A sequence $(a_n)_{n\in\mathbb{N}}$ of points in a metric space $(X, d)$ is said to be a Cauchy
sequence if for every $\epsilon > 0$, there exists an $N \in \mathbb{N}$ such that whenever $m, n > N$, $d(a_n, a_m) < \epsilon$.
Lemma 2.21. Every convergent sequence is Cauchy.

Proof. The proof is the same, mutatis mutandis⁴, as the proof of Lemma 2.7. Let $(a_n)_{n\in\mathbb{N}}$ be a
sequence of points in $X$ that converges to $L \in X$. Let $\epsilon > 0$. Then there exists an $N \in \mathbb{N}$ such
that for all $n > N$, $d(a_n, L) < \frac{\epsilon}{2}$. Thus for $n, m > N$, we have
\[ d(a_n, a_m) \leq d(a_n, L) + d(L, a_m) < \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon. \]
So the sequence $(a_n)_{n\in\mathbb{N}}$ is a Cauchy sequence. □

Just as in R, it would be useful to know if also in arbitrary metric spaces, the set of convergent
sequences coincides with the set of Cauchy sequences. Unfortunately, this is not always true. For
example, consider the metric space X = (0, 1] with the same Euclidean metric as used in R. Then
the sequence (1/n)n∈N is easily seen to be Cauchy, but is not convergent in X, as there is a missing
point in X, namely 0. However, in some other metric spaces, such as R, the set of convergent
sequences and the set of Cauchy sequences do coincide. So it makes sense to give such metric
spaces a special name: they are called “complete”.
Definition 2.22. A metric space in which every Cauchy sequence converges is called complete.
Example 2.23. R with the Euclidean metric is complete. ♦
Exercise 2.24. (∗) Show that Q with the Euclidean metric is not complete. Hint: Revisit the solution
to part (6) of Exercise 1.35.
Exercise 2.25. Let X be a metric space. If (xn )n∈N is a Cauchy sequence in X which has a convergent
subsequence (xnk )k∈N with limit L, then show that (xn )n∈N is convergent with the same limit L.
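Theorem 2.26. $\mathbb{R}^d$ is complete.

Proof. (Essentially, this is because $\mathbb{R}$ is complete, and one has $d$ copies of $\mathbb{R}$ in $\mathbb{R}^d$.) Suppose that
$(a_n)_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{R}^d$:
\[ a_n = \begin{bmatrix} x_n^{(1)} \\ \vdots \\ x_n^{(d)} \end{bmatrix}. \]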
³ Gauss observed that
\[ I(a, b) := \int_{-\infty}^{\infty} \frac{dx}{\sqrt{(x^2 + a^2)(x^2 + b^2)}} \]
satisfies $I(a, b) = I\left(\frac{a+b}{2}, \sqrt{ab}\right)$ with the help of the substitution $t = \frac{1}{2}\left(x - \frac{ab}{x}\right)$, and using this, obtained the remarkable result that
\[ \int_{-\infty}^{\infty} \frac{dx}{\sqrt{(x^2 + a^2)(x^2 + b^2)}} = \frac{\pi}{\mathrm{agm}(a, b)}. \]
⁴ Latin phrase meaning “by changing those things which need to be changed”.


We have the inequalities
\[ |x_n^{(k)} - x_m^{(k)}| \leq \|a_n - a_m\|_2 \quad (n, m \in \mathbb{N}, \ k = 1, \dots, d), \]
from which it follows that each of the sequences $(x_n^{(k)})_{n\in\mathbb{N}}$, $k = 1, \dots, d$, is Cauchy in $\mathbb{R}$, and hence
convergent, with respective limits, say $L^{(1)}, \dots, L^{(d)} \in \mathbb{R}$. So given $\epsilon > 0$, there exists a large
enough $N$ such that whenever $n > N$, we have
\[ |x_n^{(k)} - L^{(k)}| < \frac{\epsilon}{\sqrt{d}} \quad (k = 1, \dots, d). \]
Set
\[ L = \begin{bmatrix} L^{(1)} \\ \vdots \\ L^{(d)} \end{bmatrix} \in \mathbb{R}^d. \]
Thus for $n > N$,
\[ \|a_n - L\|_2 = \sqrt{\sum_{k=1}^{d} |x_n^{(k)} - L^{(k)}|^2} < \sqrt{\sum_{k=1}^{d} \frac{\epsilon^2}{d}} = \epsilon. \]
Consequently, the sequence $(a_n)_{n\in\mathbb{N}}$ converges to $L$. □


Exercise 2.27. Show that $\mathbb{R}^{n\times m}$ with the metric induced by $\|\cdot\|_\infty$ is complete. (See Exercise 1.14 for the definition
of the norm $\|\cdot\|_\infty$ on the vector space $\mathbb{R}^{n\times m}$.)

The following theorem is an important result, and lies at the core of a result on the existence
of solutions for Ordinary Differential Equations (ODEs). You can learn more about this in the
course Differential Equations (MA209). (See Exercise 1.15 for the definition of the norm $\|\cdot\|_\infty$ on
$C[a, b]$.)
Theorem 2.28. (∗) $C[a, b]$ with the metric induced by $\|\cdot\|_\infty$ is complete.

Proof. (You may skip this proof.) The idea behind the proof is similar to the proof of the
completeness of $\mathbb{R}^d$. If $(f_n)_{n\in\mathbb{N}}$ is a Cauchy sequence, then we think of the $f_n(x)$ as being the
“components” of $f_n$ indexed by $x \in [a, b]$. We first freeze an $x \in [a, b]$, and show that $(f_n(x))_{n\in\mathbb{N}}$ is
a Cauchy sequence in $\mathbb{R}$, and hence convergent to a number (which depends on $x$), which we
denote by $f(x)$. Next we show that the function $x \mapsto f(x)$ is continuous, and finally that $(f_n)_{n\in\mathbb{N}}$
does converge to $f$.

Figure 1. The Cauchy sequence $(f_n(x))_{n\in\mathbb{N}}$ obtained from the Cauchy sequence $(f_n)_{n\in\mathbb{N}}$ by
freezing $x$.

Let $(f_n)_{n\in\mathbb{N}}$ be a Cauchy sequence. Let $x \in [a, b]$. We claim that $(f_n(x))_{n\in\mathbb{N}}$ is a Cauchy
sequence in $\mathbb{R}$. Let $\epsilon > 0$. Then there exists an $N \in \mathbb{N}$ such that for all $n, m > N$, $\|f_n - f_m\|_\infty < \epsilon$.
But
\[ |f_n(x) - f_m(x)| \leq \max_{y\in[a,b]} |f_n(y) - f_m(y)| = \|f_n - f_m\|_\infty < \epsilon, \]
for $n, m > N$. This shows that indeed $(f_n(x))_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{R}$. But $\mathbb{R}$ is complete,
and so the Cauchy sequence $(f_n(x))_{n\in\mathbb{N}}$ is in fact convergent, with a limit which depends on which
$x \in [a, b]$ we had frozen at the outset. To highlight this dependence on $x$, we denote the limit
of $(f_n(x))_{n\in\mathbb{N}}$ by $f(x)$. (Thus $f(a)$ is the number which is the limit of the convergent sequence
$(f_n(a))_{n\in\mathbb{N}}$, $f(b)$ is the number which is the limit of the convergent sequence $(f_n(b))_{n\in\mathbb{N}}$, and so
on.) So we have a function
    $x$ is sent to the number which is the limit of the convergent sequence $(f_n(x))_{n\in\mathbb{N}}$
from $[a, b]$ to $\mathbb{R}$. We call this function $f$. This will serve as the limit of the sequence $(f_n)_{n\in\mathbb{N}}$. But
first we have to see if it belongs to $C[a, b]$, that is, we need to check that this $f$ is continuous on
$[a, b]$.
Let $x \in [a, b]$. We will show that $f$ is continuous at $x$. Recall that in order to do this, we
have to show that for each $\epsilon > 0$, there exists a $\delta > 0$ such that whenever $|y - x| < \delta$, we have
$|f(y) - f(x)| < \epsilon$. Let $\epsilon > 0$. Choose $N$ large enough so that for all $n, m > N$,
\[ \|f_n - f_m\|_\infty < \frac{\epsilon}{3}. \]
Let $y \in [a, b]$. Then for $n > N$, $|f_n(y) - f_{N+1}(y)| \leq \|f_n - f_{N+1}\|_\infty < \frac{\epsilon}{3}$. Now let $n \to \infty$:
\[ |f(y) - f_{N+1}(y)| = \lim_{n\to\infty} |f_n(y) - f_{N+1}(y)| \leq \frac{\epsilon}{3}. \]
As the choice of $y \in [a, b]$ was arbitrary, we have for all $y \in [a, b]$ that
\[ |f(y) - f_{N+1}(y)| \leq \frac{\epsilon}{3}. \]
Now $f_{N+1} \in C[a, b]$. So there exists a $\delta > 0$ such that whenever $|y - x| < \delta$, we have
\[ |f_{N+1}(y) - f_{N+1}(x)| < \frac{\epsilon}{3}. \]
Thus whenever $|y - x| < \delta$, we have
\[ |f(y) - f(x)| = |f(y) - f_{N+1}(y) + f_{N+1}(y) - f_{N+1}(x) + f_{N+1}(x) - f(x)| \]
\[ \leq |f(y) - f_{N+1}(y)| + |f_{N+1}(y) - f_{N+1}(x)| + |f_{N+1}(x) - f(x)| \]
\[ < \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} = \epsilon. \]
This shows that $f$ is continuous at $x$. As the choice of $x \in [a, b]$ was arbitrary, $f$ is continuous on
$[a, b]$.
Finally, we show that $(f_n)_{n\in\mathbb{N}}$ does converge to $f$. Let $\epsilon > 0$. Choose $N$ large enough so
that for all $n, m > N$, $\|f_n - f_m\|_\infty < \epsilon$. Fix $n > N$. Let $x \in [a, b]$. Then for all $m > N$,
$|f_n(x) - f_m(x)| \leq \|f_n - f_m\|_\infty < \epsilon$. Thus
\[ |f_n(x) - f(x)| = \lim_{m\to\infty} |f_n(x) - f_m(x)| \leq \epsilon. \]
But $x \in [a, b]$ was arbitrary. Hence
\[ \|f_n - f\|_\infty = \max_{x\in[a,b]} |f_n(x) - f(x)| \leq \epsilon. \]
But we could have fixed any $n > N$ at the outset and obtained the same result. So we have that
for all $n > N$, $\|f_n - f\|_\infty \leq \epsilon$. Thus
\[ \lim_{n\to\infty} f_n = f, \]
and this completes the proof. □

The norm $\|\cdot\|_\infty$ is special in that $C[a, b]$ is complete with the corresponding induced metric.
It turns out that $C[a, b]$ with the other natural norm met earlier, namely the $\|\cdot\|_1$-norm, is not
complete. The objective in the following exercise is to demonstrate this.


Exercise 2.29. (∗) Let $C[0, 1]$ be equipped with the $\|\cdot\|_1$-norm given by
\[ \|f\|_1 := \int_0^1 |f(x)|\,dx \quad (f \in C[0, 1]). \]
Show that the corresponding metric space is not complete. For example, you may consider the sequence
$(f_n)_{n\in\mathbb{N}}$ with the $f_n$ as shown in Figure 2. Show that
\[ \|f_n - f_m\|_1 = \int_{\frac{1}{2}}^{\frac{1}{2} + \max\{\frac{1}{n+1}, \frac{1}{m+1}\}} |f_n(x) - f_m(x)|\,dx \leq \frac{2}{N}, \]
for $n, m > N$, and so $(f_n)_{n\in\mathbb{N}}$ is Cauchy. Next, prove that if $(f_n)_{n\in\mathbb{N}}$ converges to $f \in C[0, 1]$, then $f$ must
satisfy
\[ f(x) = \begin{cases} 0 & \text{for } x \in [0, \frac{1}{2}], \\ 1 & \text{for } x \in (\frac{1}{2}, 1], \end{cases} \]
which does not belong to $C[0, 1]$, a contradiction.

Figure 2. $f_n$. (The horizontal axis is marked at $0$, $\frac{1}{2}$, $\frac{1}{2} + \frac{1}{n+1}$ and $1$.)

Exercise 2.30. Show that any nonempty set X equipped with the discrete metric is complete.
Exercise 2.31. Prove that Z equipped with the Euclidean metric induced from R is complete.

2.3. Pointwise versus uniform convergence


Convergence in $(C[a, b], \|\cdot\|_\infty)$ is referred to as uniform convergence. More generally, we have the
following definition.
Definition 2.32. Let X be any set and f, fn : X → R (n ∈ N) be functions.
(1) The sequence (fn )n∈N is said to converge uniformly to f if
∀ǫ > 0, ∃N ∈ N such that ∀n > N, ∀x ∈ X, |fn (x) − f (x)| < ǫ.
(2) The sequence (fn )n∈N is said to converge pointwise to f if
∀ǫ > 0, ∀x ∈ X, ∃N ∈ N such that ∀n > N, |fn (x) − f (x)| < ǫ.

Consider the two statements:


∀ǫ > 0, ∃N ∈ N such that ∀n > N, ∀x ∈ X, |fn (x) − f (x)| < ǫ.
and
∀ǫ > 0, ∀x ∈ X, ∃N ∈ N such that ∀n > N, |fn (x) − f (x)| < ǫ.
Can you spot the difference? What has changed is the order of
∀x ∈ X
and
∃N ∈ N such that ∀n > N.
This seemingly small change makes a world of difference. Indeed, even in everyday language, the
two statements:


There exists a faculty member B such that for every student A, B is a personal tutor of A.
and
For every student A, there exists a faculty member B such that B is a personal tutor of A.
clearly mean two quite different things, where the former says that the faculty member B is
personal tutor to all students, while in the latter statement, the faculty member who is claimed
to exist may depend on which student A is chosen.
This is the same sort of a difference between the uniform convergence requirement, namely:
∀ǫ > 0, ∃N ∈ N such that ∀n > N, ∀x ∈ X, |fn (x) − f (x)| < ǫ.
and the pointwise convergence requirement, namely
∀ǫ > 0, ∀x ∈ X, ∃N ∈ N such that ∀n > N, |fn (x) − f (x)| < ǫ.
In the former, the same N works for all x ∈ X, while in the latter, the N might depend on the x
in question.
It is clear that if fn converges uniformly to f , then fn converges pointwise to f , but there are
pointwise convergent sequences of functions which do not converge uniformly. Here is an example
to illustrate this.
Example 2.33. Consider $f_n : (0, 1) \to \mathbb{R}$ ($n \in \mathbb{N}$) given by $f_n(x) = x^n$ ($x \in (0, 1)$, $n \in \mathbb{N}$). Then
$(f_n)_{n\in\mathbb{N}}$ converges pointwise to the zero function $f$, defined by $f(x) = 0$ ($x \in (0, 1)$). Indeed, for
each $x \in (0, 1)$,
\[ \lim_{n\to\infty} x^n = 0, \]
and so the following statement is true:
∀ǫ > 0, ∀x ∈ (0, 1), ∃N ∈ N such that ∀n > N, |fn (x) − f (x)| < ǫ.
In fact, we can choose $N > \dfrac{\log \epsilon}{\log x}$ so that $x^N < \epsilon$, and if we have $n > N$, we are guaranteed that
\[ |f_n(x) - f(x)| = |x^n - 0| = x^n < x^N < \epsilon. \]
It is clear that our choice of $N$ in the above depends not only on $\epsilon$, but also on the point $x \in (0, 1)$.
The closer $x$ is to 1, the larger $N$ is. The question arises: Is there an $N$ so that for $n > N$,
$|f_n(x) - f(x)| < \epsilon$ for all $x \in (0, 1)$? We will show below that the answer is “no”. In other words,
$(f_n)_{n\in\mathbb{N}}$ does not converge uniformly to the zero function $f$.

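Figure 3. $(x^n)_{n\in\mathbb{N}}$ converges pointwise, but not uniformly, to the zero function on $(0, 1)$. (The picture shows the graphs of $f_1$, $f_2$ and the zero function $f$, together with the level $\epsilon = \frac{1}{2}$.)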


For example, let $\epsilon = 1/2 > 0$. Let us suppose that there does exist an $N \in \mathbb{N}$ such that for all
$n > N$,
\[ \forall x \in (0, 1), \quad |f_n(x) - f(x)| = x^n < \frac{1}{2} = \epsilon. \]
In particular, we would have $\forall x \in (0, 1)$, $x^{N+1} < \frac{1}{2}$. Take $x = 1 - \frac{1}{m}$, $m \in \mathbb{N}$, $m \geq 2$. Then
\[ \left(1 - \frac{1}{m}\right)^{N+1} < \frac{1}{2}. \]
Letting $m \to \infty$, we obtain $1 \leq \frac{1}{2}$, a contradiction.
See Figure 3, which explains this visually. If $(f_n)_{n\in\mathbb{N}}$ were to converge to the zero function
uniformly, then in particular, for all $n$ large enough, the graph of $f_n$ would lie in a strip of width
$\epsilon = 1/2$ around the graph of the zero function. But no matter how large an $n$ we take, some part
of the graph of $f_n$ always falls outside the strip. ♦
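Both halves of Example 2.33 can be illustrated numerically. In the sketch below (floating point, hypothetical helper names), `required_index` computes an N that works for one fixed x, and `witness` produces, for each n, a point of (0, 1) at which x^n is still about 0.75, so the strip of width 1/2 is always left. The value 0.75 is an arbitrary illustrative choice; any number in (1/2, 1) would do.

```python
import math

def required_index(x, eps):
    """Smallest integer N with N > log(eps)/log(x), so that x**n < eps for n > N."""
    return math.floor(math.log(eps) / math.log(x)) + 1

def witness(n):
    """A point x in (0, 1) with x**n approximately 0.75, hence above eps = 1/2."""
    return 0.75 ** (1.0 / n)

# For fixed eps, the index N grows as x approaches 1 (pointwise convergence only).
for x in (0.5, 0.9, 0.99, 0.999):
    print(x, required_index(x, 0.01))

# No single N can work for all x: each f_n exceeds 1/2 somewhere in (0, 1).
for n in (1, 10, 1000):
    x = witness(n)
    print(n, x, x ** n)
```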

To test whether a sequence (fn )n∈N is uniformly convergent, first find its pointwise limit (if it
exists), and then check to see whether the convergence is uniform.
Exercise 2.34. Suppose that $X$ is a set and $f_n : X \to \mathbb{R}$ ($n \in \mathbb{N}$) is a sequence of functions which is pointwise
convergent to $f : X \to \mathbb{R}$. Suppose that the numbers $a_n := \sup\{|f_n(x) - f(x)| : x \in X\}$ ($n \in \mathbb{N}$) all exist. Prove
that $(f_n)_{n\in\mathbb{N}}$ converges uniformly to $f$ if and only if $\lim_{n\to\infty} a_n = 0$.

Let $f_n : (0, \infty) \to \mathbb{R}$ be given by $f_n(x) = x e^{-nx}$ for $x \in (0, \infty)$ and $n \in \mathbb{N}$. Show that the sequence
$(f_n)_{n\in\mathbb{N}}$ converges uniformly on $(0, \infty)$.
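Exercise 2.35. Let $f_n : [0, 1] \to \mathbb{R}$ be defined by
\[ f_n(x) = \frac{x}{1 + nx} \quad (x \in [0, 1]). \]
Does $(f_n)_{n\in\mathbb{N}}$ converge uniformly on $[0, 1]$?
Exercise 2.36. Let $f_n : \mathbb{R} \to \mathbb{R}$ be defined by
\[ f_n(x) = 1 - \frac{1}{(1 + x^2)^n} \quad (x \in \mathbb{R}, \ n \in \mathbb{N}). \]
Show that the sequence $(f_n)_{n\in\mathbb{N}}$ of continuous functions converges pointwise to the function
\[ f(x) = \begin{cases} 1 & \text{if } x \neq 0, \\ 0 & \text{if } x = 0, \end{cases} \]
which is discontinuous at 0.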

Uniform convergence often implies that the limit function inherits properties possessed by the
terms of the sequence. For instance, we will see later on that if a sequence of continuous functions
converges uniformly to a function f , then f is also continuous; see Proposition 4.15. Morally, the
reason nice things can happen with uniform convergence is that we can exchange two limiting
processes, which is not always allowed when one has only pointwise convergence. The following
exercises demonstrate the precariousness of exchanging limiting processes arbitrarily.
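Exercise 2.37. For $n \in \mathbb{N}$ and $m \in \mathbb{N}$, set $a_{m,n} = \dfrac{m}{m+n}$.
Show that for each fixed $n$, $\lim_{m\to\infty} a_{m,n} = 1$, while for each fixed $m$, $\lim_{n\to\infty} a_{m,n} = 0$.
Is $\lim_{m\to\infty} \lim_{n\to\infty} a_{m,n} = \lim_{n\to\infty} \lim_{m\to\infty} a_{m,n}$?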
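The two one-variable limits in Exercise 2.37 can be watched numerically before they are proved. This is a small sketch using exact fractions; the function name and the specific indices are illustrative assumptions.

```python
from fractions import Fraction

def a(m, n):
    """The double sequence a_{m,n} = m / (m + n), computed exactly."""
    return Fraction(m, m + n)

# n fixed, m growing: the values approach 1.
print([float(a(m, 5)) for m in (10, 1_000, 1_000_000)])
# m fixed, n growing: the values approach 0.
print([float(a(5, n)) for n in (10, 1_000, 1_000_000)])
```

Which index is sent to infinity first therefore matters, which is the point of the exercise.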

Exercise 2.38. Let $f_n : \mathbb{R} \to \mathbb{R}$ be defined by
\[ f_n(x) = \frac{\sin(nx)}{\sqrt{n}} \quad (x \in \mathbb{R}, \ n \in \mathbb{N}). \]
Show that $(f_n)_{n\in\mathbb{N}}$ converges pointwise to the zero function $f$. However, show that $(f_n')_{n\in\mathbb{N}}$ does not
converge pointwise to (the zero function) $f'$.


Exercise 2.39. Let $f_n : [0, 1] \to \mathbb{R}$ ($n \in \mathbb{N}$) be defined by $f_n(x) = nx(1 - x^2)^n$ ($x \in [0, 1]$). Show that
$(f_n)_{n\in\mathbb{N}}$ converges pointwise to the zero function $f$. However, show that
\[ \lim_{n\to\infty} \int_0^1 f_n(x)\,dx = \frac{1}{2} \neq 0 = \int_0^1 \lim_{n\to\infty} f_n(x)\,dx. \]
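The failure of the limit and the integral to commute in Exercise 2.39 can be observed numerically. The sketch below uses a plain midpoint rule, which is an assumption made for illustration (any standard quadrature would do), and compares the result with the value n/(2(n+1)) obtained from the substitution u = 1 − x².

```python
def f(n, x):
    """The function f_n(x) = n x (1 - x^2)^n on [0, 1]."""
    return n * x * (1 - x * x) ** n

def midpoint_integral(n, cells=100_000):
    """Midpoint-rule approximation of the integral of f_n over [0, 1]."""
    h = 1.0 / cells
    return h * sum(f(n, (i + 0.5) * h) for i in range(cells))

for n in (1, 10, 100, 1000):
    # Numerical integral next to the closed form n / (2 (n + 1)).
    print(n, midpoint_integral(n), n / (2 * (n + 1)))

# At a fixed point such as x = 0.5, the values f_n(x) still tend to 0.
print([f(n, 0.5) for n in (1, 10, 100, 1000)])
```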

2.4. Convergent sequences and closed sets


We have learnt that closed sets are ones whose complement is open. Here is another characteriza-
tion of closed sets.
Theorem 2.40. A set $F$ is closed if and only if for every convergent sequence $(a_n)_{n\in\mathbb{N}}$ such that
$a_n \in F$ ($n \in \mathbb{N}$), we have that $\lim_{n\to\infty} a_n \in F$.

Proof. Suppose that $F$ is closed. Let $(a_n)_{n\in\mathbb{N}}$ be a convergent sequence such that $a_n \in F$ ($n \in \mathbb{N}$)
and denote its limit by $L$. Assume that $L \notin F$. Then $L \in \complement F$, the complement of $F$, which is
open. So there exists an $r > 0$ such that the open ball $B(L, r)$ with center $L$ and radius $r$ is
contained in $\complement F$, that is, $B(L, r)$ contains no points from $F$. But as $(a_n)_{n\in\mathbb{N}}$ is convergent with
limit $L$, we can choose a large enough $n$ so that $d(a_n, L) < r/2$. This implies that $a_n \in B(L, r)$.
But also $a_n \in F$, and so we have arrived at a contradiction. See Figure 4. This shows the “only
if” part.

Figure 4. Proof of the theorem on the characterization of closed sets in terms of convergent
sequences. The figure on the left is the one for the “only if” part, and the one on the right is
for the “if” part.

Now suppose that the set $F$ is not closed. Then its complement $\complement F$ is not open. This means
that there is a point $L \in \complement F$ such that for every $r > 0$, the open ball $B(L, r)$ has at least one
point from $F$. Now take successively $r = 1/n$ ($n \in \mathbb{N}$), and choose a point $a_n \in F \cap B(L, 1/n)$. In
this manner we obtain a sequence $(a_n)_{n\in\mathbb{N}}$ such that $a_n \in F$ for each $n$, and $d(a_n, L) < 1/n$. The
property $d(a_n, L) < 1/n$ ($n \in \mathbb{N}$) implies that $(a_n)_{n\in\mathbb{N}}$ is a convergent sequence with limit $L$. So
we have obtained the existence of a convergent sequence $(a_n)_{n\in\mathbb{N}}$ such that $a_n \in F$ ($n \in \mathbb{N}$), but
\[ \lim_{n\to\infty} a_n = L \notin F. \]
See Figure 4. This completes the proof of the “if” part. □
Exercise 2.41.
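(1) Let $0 \neq a \in \mathbb{R}^d$ and $\beta \in \mathbb{R}$. Show that the “hyperplane” $H = \{x \in \mathbb{R}^d : a^\top x = \beta\}$ is closed in
$\mathbb{R}^d$.
(2) Prove that if $A \in \mathbb{R}^{m\times n}$ and $b \in \mathbb{R}^m$, then the set of solutions $S = \{x \in \mathbb{R}^n : Ax = b\}$ is a closed
subset of $\mathbb{R}^n$.
(3) Show that the “Linear Programming simplex” $\Sigma := \{x \in \mathbb{R}^n : Ax = b, \ x_1 \geq 0, \dots, x_n \geq 0\}$ is
closed in $\mathbb{R}^n$.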


Exercise 2.42. Let S be an open set and (xn )n∈N a sequence in a metric space. Show that if (xn )
converges to x ∈ S, then there exists N ∈ N such that for all n > N , xn ∈ S. (In words: if the limit of a
convergent sequence lies in an open set, then the sequence eventually stays in the open set.)

2.5. Compact sets


In this section, we study an important class of subsets of a metric space, called compact sets.
Before we learn the definition, let us give some motivation for this concept.
Of the different types of intervals in R, perhaps the most important are those of the form
[a, b], where a, b are finite real numbers. Why are such intervals so important? This is not an easy
question to answer, but we already know of one vital result, namely the Extreme Value Theorem,
where such intervals play a vital role. Recall that the Extreme Value Theorem asserts that any
continuous function f : [a, b] → R attains a maximum and a minimum value on [a, b]. This
result does not hold in general for continuous functions f : I → R with I = (a, b) or I = [a, b) or
I = (a, ∞), and so on. Besides its theoretical importance in Analysis, the Extreme Value Theorem
is also a fundamental result in Optimization Theory. It turns out that when we want to generalize
this result, the notion of “compact sets” is pertinent, and we will learn (later on in Chapter 4) the
following analogue of the Extreme Value Theorem: If K is a compact subset of a metric space X
and f : K → R is continuous, then f assumes a maximum and a minimum on K.
Here is the definition of a compact set.

Definition 2.43. Let (X, d) be a metric space. A subset K of X is said to be compact if every
sequence in K has a convergent subsequence with limit in K, that is, if (xn )n∈N is a sequence such
that xn ∈ K for each n ∈ N, then there exists a subsequence (xnk )k∈N which converges to some
L ∈ K.

Example 2.44. The interval [a, b] is a compact subset of R. Indeed, every sequence (an )n∈N
contained in [a, b] is bounded, and by the Bolzano-Weierstrass Theorem possesses a convergent
subsequence, say (ank )k∈N , with limit L. But since

a ≤ ank ≤ b,

by letting k → ∞, we obtain a ≤ L ≤ b, that is, L ∈ [a, b]. Hence [a, b] is compact.


On the other hand, $(a, b)$ is not compact, since the sequence
\[ \left(a + \frac{b - a}{2n}\right)_{n\in\mathbb{N}} \]
is contained in $(a, b)$, but it has no convergent subsequence whose limit belongs to $(a, b)$. Indeed
this is because the sequence is convergent, with limit $a$, and so every subsequence of this sequence
is also convergent with limit $a$, which doesn’t belong to $(a, b)$.
R is not compact since the sequence (n)n∈N cannot have a convergent subsequence. Indeed,
if such a convergent subsequence existed, it would also be Cauchy, but the distance between any
two distinct terms, being distinct integers, is at least 1, contradicting the Cauchyness. ♦

In the above list of nonexamples, note that R = (−∞, ∞) is not bounded, and that (a, b) is
not closed. On the other hand, the example [a, b] is both bounded and closed. It turns out that
compact sets are always closed and bounded. First we define exactly what we mean by a bounded
subset of a metric space.

Definition 2.45. A subset S of a metric space X is said to be bounded if there exists an M > 0
such that for all x, y ∈ S, d(x, y) ≤ M .


Exercise 2.46. Let (X, d) be a metric space and let x0 ∈ X. Show that a subset S ⊆ X is bounded if
and only if there exists an R > 0 such that for all x ∈ S, d(x, x0 ) ≤ R.
Hint: Note that x0 is not necessarily an element of S. Use the triangle inequality.

Theorem 2.47. Any compact subset K of a metric space (X, d) is closed and bounded.

Proof. We first show that $K$ is closed. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence in $K$ that converges to $L$.
Since $K$ is compact, there is a subsequence, say $(a_{n_k})_{k\in\mathbb{N}}$, that converges to a limit $L' \in K$. But
as $(a_{n_k})_{k\in\mathbb{N}}$ is a subsequence of a convergent sequence with limit $L$, it is also convergent to $L$. By
the uniqueness of limits, $L = L' \in K$. Thus, by Theorem 2.40, $K$ is closed.
Next we show that $K$ is bounded. Let $x_0 \in X$. Suppose $K$ is not bounded. Then given any
$n \in \mathbb{N}$, we can find an $a_n \in K$ such that $d(a_n, x_0) > n$. But this implies that no subsequence of
$(a_n)_{n\in\mathbb{N}}$ is bounded. So no subsequence of $(a_n)_{n\in\mathbb{N}}$ can be convergent either. This contradicts the
compactness of $K$. Thus our assumption was incorrect, that is, $K$ is bounded. □

The converse of the above theorem is, in general, false. That is, there exist metric spaces
with subsets that are closed and bounded, but not compact, as shown by the following example.
(However, as shown by Theorem 2.49 below, the converse is true for subsets of $\mathbb{R}^d$.)

Example 2.48. Consider the unit ball with center 0 in the normed space $\ell^2$:
\[ B = \{x \in \ell^2 : \|x\|_2 \leq 1\}. \]
(See Exercise 1.19 for the definition of $\ell^2$.) Then $B$ is bounded, it is closed (since its complement
is easily seen to be open), but $B$ is not compact, and this can be demonstrated as follows. Take
the sequence $(e_n)_{n\in\mathbb{N}}$, where $e_n$ is the sequence with only the nth term equal to 1, and all other
terms are equal to 0:
\[ e_n := (0, \dots, 0, \underbrace{1}_{n\text{th place}}, 0, \dots) \in B \subseteq \ell^2. \]
Then this sequence $(e_n)_{n\in\mathbb{N}}$ in $B \subseteq \ell^2$ can have no convergent subsequence. Indeed, whenever
$n \neq m$, $\|e_n - e_m\|_2 = \sqrt{2}$, and so no subsequence of $(e_n)_{n\in\mathbb{N}}$ can be Cauchy, much less convergent!
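The distance computation in Example 2.48 can be checked on truncations. The sketch below represents each e_n by its first `length` entries, which is an assumption made so that the vectors fit in memory; since e_n and e_m vanish beyond place max(n, m), the truncated distance agrees with the ℓ² distance whenever `length` is at least max(n, m).

```python
import math

def unit_sequence(n, length):
    """First `length` entries of e_n: a 1 in the nth place and 0 elsewhere."""
    e = [0.0] * length
    e[n - 1] = 1.0
    return e

def dist2(x, y):
    """Euclidean (2-norm) distance between two finite lists of equal length."""
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(x, y)))

length = 10
distances = {
    dist2(unit_sequence(n, length), unit_sequence(m, length))
    for n in range(1, length + 1)
    for m in range(1, length + 1)
    if n != m
}
# All pairwise distances between distinct e_n, e_m coincide with sqrt(2).
print(distances, math.sqrt(2))
```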

We will now show the following important result.

Theorem 2.49. A subset K of Rd is compact if and only if K is closed and bounded.

Before showing this, we prove a technical result, which besides being interesting on its own,
will also somewhat simplify the proof of the above theorem.

Lemma 2.50. Every bounded sequence in $\mathbb{R}^d$ has a convergent subsequence.

Proof. We prove this using induction on d. Let us consider the case when d = 1. Then the
statement is precisely the Bolzano-Weierstrass Theorem!
Now suppose that the result has been proved in Rd for some d ≥ 1. We will show that it holds
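in $\mathbb{R}^{d+1}$. Let $(a_n)_{n\in\mathbb{N}}$ be a bounded sequence in $\mathbb{R}^{d+1}$. We split each $a_n$ into its first d components and its
last component in $\mathbb{R}$:
\[ a_n = \begin{bmatrix} \alpha_n \\ \beta_n \end{bmatrix}, \]
where $\alpha_n \in \mathbb{R}^d$ and $\beta_n \in \mathbb{R}$. Clearly $\|\alpha_n\|_2 \le \|a_n\|_2$, and so $(\alpha_n)_{n\in\mathbb{N}}$ is a bounded sequence in $\mathbb{R}^d$.
By the induction hypothesis, it has a convergent subsequence, say $(\alpha_{n_k})_{k\in\mathbb{N}}$, which converges to,
say, $\alpha \in \mathbb{R}^d$. Consider now the sequence $(\beta_{n_k})_{k\in\mathbb{N}}$ in $\mathbb{R}$. Then $(\beta_{n_k})_{k\in\mathbb{N}}$ is bounded, and so by the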


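Bolzano-Weierstrass Theorem, it has a convergent subsequence $(\beta_{n_{k_\ell}})_{\ell\in\mathbb{N}}$, with limit, say, $\beta \in \mathbb{R}$.
Then we have
\[ a_{n_{k_\ell}} = \begin{bmatrix} \alpha_{n_{k_\ell}} \\ \beta_{n_{k_\ell}} \end{bmatrix} \xrightarrow{\ell\to\infty} \begin{bmatrix} \alpha \\ \beta \end{bmatrix} =: L \in \mathbb{R}^{d+1}. \]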
Thus the bounded sequence (an )n∈N has (ankℓ )ℓ∈N as a convergent subsequence. 

Now we return to the task of proving Theorem 2.49.

Proof. (“If” part.) Let K be closed and bounded. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence in K. Then $(a_n)_{n\in\mathbb{N}}$
is bounded, and so by Lemma 2.50 it has a convergent subsequence, with limit L. But since K is closed and since
each term of the subsequence belongs to K, it follows that also L ∈ K. Consequently, K is compact.

(“Only if” part) This follows by Theorem 2.47. 


Example 2.51. The intervals (a, b], [a, b) are not compact, since although they are bounded, they
are not closed.
The intervals (−∞, b], [a, ∞) are not compact, since although they are closed, they are not
bounded. ♦

Let us consider an interesting compact subset of the real line, called the Cantor set.
Example 2.52 (Cantor set). The Cantor set is constructed as follows. First, denote the closed
interval [0, 1] by $F_1$. Next, delete from $F_1$ the open interval $(\frac13, \frac23)$ which is its middle third, and
denote the remaining closed set by $F_2$. Clearly, $F_2 = [0, \frac13] \cup [\frac23, 1]$. Next, delete from $F_2$ the open
intervals $(\frac19, \frac29)$ and $(\frac79, \frac89)$, which are the middle thirds of its two pieces, and denote the remaining
closed set by $F_3$. It is easy to see that $F_3 = [0, \frac19] \cup [\frac29, \frac13] \cup [\frac23, \frac79] \cup [\frac89, 1]$. If we continue this
process, at each stage deleting the open middle third of each closed interval remaining from the
previous stage, we obtain a sequence of closed sets $F_n$, each of which contains all of its successors.
Figure 5 illustrates this.

Figure 5. Construction of the Cantor set.

The Cantor set is defined by
\[ F = \bigcap_{n=1}^{\infty} F_n. \]
As F is an intersection of closed sets, it is closed. Moreover it is contained in [0, 1] and so it is
also bounded. Consequently it is compact. F consists of those points in the closed interval [0, 1]
which “ultimately remain” after the removal of all the open intervals $(\frac13, \frac23)$, $(\frac19, \frac29)$, $(\frac79, \frac89)$, . . . .
What points do remain? F clearly contains the end-points of the closed intervals which make up
each set $F_n$:
\[ 0,\ 1,\ \frac13,\ \frac23,\ \frac19,\ \frac29,\ \frac79,\ \frac89,\ \cdots. \]
Does F contain any other points? Actually, F contains many more points than the above list
of end points. After all, the above list of endpoints is countable, but it can be shown that F is


uncountable! It turns out that the Cantor set is a very intricate mathematical object, and is often
a source of interesting examples/counterexamples in Analysis. (For example, as the sum of the
lengths of the intervals removed is
\[ \frac13 + 2\cdot\frac{1}{3^2} + 4\cdot\frac{1}{3^3} + \cdots = 1 \]
(factor out 1/3 and sum the resulting geometric series), the “(Lebesgue length) measure” of F is
1 − 1 = 0. So this is an example of an uncountable set with “Lebesgue measure” 0.) ♦
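The construction of the sets $F_n$ can be sketched in a few lines of code. This is an illustrative sketch only: the function names `cantor_stage` and `total_length` are assumptions, and exact rational arithmetic is used so that the end-points match the fractions in Example 2.52.

```python
from fractions import Fraction

def cantor_stage(n):
    """Return the closed intervals (as pairs of end-points) making up F_n, with F_1 = [0, 1]."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n - 1):
        next_intervals = []
        for a, b in intervals:
            third = (b - a) / 3
            # Delete the open middle third (a + third, b - third); keep the two outer pieces.
            next_intervals.append((a, a + third))
            next_intervals.append((b - third, b))
        intervals = next_intervals
    return intervals

def total_length(intervals):
    """Sum of the lengths of the given intervals."""
    return sum(b - a for a, b in intervals)

F3 = cantor_stage(3)
print(F3)                             # the four intervals of F_3
print(total_length(cantor_stage(8)))  # equals (2/3)^7, which tends to 0 as the stage grows
```

The total length remaining in $F_n$ is $(2/3)^{n-1}$, so the length removed up to stage n is $1 - (2/3)^{n-1}$, in agreement with the series above summing to 1.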

Exercise 2.53. Determine if the following statements are true or false. Give reasons for your answers.

(1) If a set S ⊆ R is such that each convergent sequence in S has a convergent subsequence with
limit in S, then S is compact.
(2) All closed and bounded sets are compact.
(3) If (Y, d) is a metric space with subset X, and K is a compact subset of X, then K is a compact
subset of Y .

Exercise 2.54. Let K be a compact subset of Rd . Let F be a closed subset of Rd . Show that F ∩ K is
compact.

Exercise 2.55. Show that the unit sphere with center 0 in Rd , namely

Sd−1 := {x ∈ Rd : kxk2 = 1}

is compact.

Exercise 2.56. Show that $\{1, \frac12, \frac13, \dots\} \cup \{0\}$ is compact.

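Exercise 2.57. (∗) Consider the metric space $(\mathbb{R}^{m\times m}, \|\cdot\|_\infty)$. Is the subset (the “General Linear” group, see footnote 5)
$GL(m, \mathbb{R}) = \{A \in \mathbb{R}^{m\times m} : A \text{ is invertible}\}$ compact?
What about the set of orthogonal matrices $O(m, \mathbb{R}) = \{A \in \mathbb{R}^{m\times m} : AA^\top = I_m\}$?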

Exercise 2.58. Consider the subset H := {(x1 , x2 ) ∈ R2 : x1 x2 = 1} of R2 . Show that H is not compact,
but H is closed.

2.6. Notes (not part of the course)


Definition of compactness. The notion of a compact set that we have defined is really sequential
compactness. In the context of the more general topological spaces, one defines the notion of compactness
as follows.

Definition 2.59. Let X be a topological space with the topology given by the family of open sets O. Let
Y ⊆ X. A collection $C = \{U_i : i \in I\}$ of open sets is said to be an open cover of Y if
\[ Y \subseteq \bigcup_{i\in I} U_i. \]

K ⊆ X is said to be a compact set if every open cover of K has a finite subcover, that is, given any open
cover C = {Ui : i ∈ I} of K, there exist finitely many indices i1 , . . . , in ∈ I such that

K ⊆ Ui1 ∪ · · · ∪ Uin .

In the case of metric spaces, it can be shown that the set of compact sets coincides with the set of
sequentially compact sets. But in general topological spaces, these may not be the same.

5The General Linear group is so named because the columns of an invertible matrix are linearly independent, hence
the vectors they define are in “general position” (linearly independent!), and matrices in the general linear group take
points in general position to points in general position.


Uniform convergence, termwise differentiation and integration. Besides Proposition 4.15, one
also has the following results associated with uniform convergence.
Proposition 2.60. If fn : [a, b] → R (n ∈ N) is a sequence of Riemann-integrable functions on [a, b]
which converges uniformly to f : [a, b] → R, then f is also Riemann-integrable on [a, b], and moreover
\[ \int_a^b f(x)\,dx = \lim_{n\to\infty} \int_a^b f_n(x)\,dx. \]

Proposition 2.61. Let fn : (a, b) → R (n ∈ N) be a sequence of differentiable functions on (a, b), such
that there exists a point c ∈ (a, b) for which (fn (c))n∈N converges. If the sequence (fn′ )n∈N converges
uniformly to g on (a, b), then (fn )n∈N converges uniformly to a differentiable function f on (a, b), and
moreover, f ′ (x) = g(x) for all x ∈ (a, b).

We will see a proof of this last result later on when we study differentiation in Chapter 5.


Chapter 3

Series

In this chapter we study series in normed spaces, but first we will begin with series in R.

3.1. Series in R
Given a sequence (an )n∈N , one can form a new sequence (sn )n∈N of its partial sums:
s1 := a1 ,
s2 := a1 + a2 ,
s3 := a1 + a2 + a3 ,
⋮
Definition 3.1. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence and let $(s_n)_{n\in\mathbb{N}}$ be the sequence of its partial sums.
If $(s_n)_{n\in\mathbb{N}}$ converges, we say that the series $\sum_{n=1}^{\infty} a_n$ converges, and we write
\[ \sum_{n=1}^{\infty} a_n = \lim_{n\to\infty} s_n. \]
If the sequence $(s_n)_{n\in\mathbb{N}}$ does not converge we say that the series $\sum_{n=1}^{\infty} a_n$ diverges.

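Example 3.2. The series $\sum_{n=1}^{\infty} (-1)^n$ diverges. Indeed the sequence of partial sums is the sequence
−1, 0, −1, 0, . . . which is a divergent sequence.
The series $\sum_{n=1}^{\infty} \frac{1}{n(n+1)}$ converges. Its nth partial sum “telescopes”:
\[ s_n = \sum_{k=1}^{n} \frac{1}{k(k+1)} = \sum_{k=1}^{n} \left(\frac1k - \frac{1}{k+1}\right) = \left(1 - \frac12\right) + \left(\frac12 - \frac13\right) + \cdots + \left(\frac1n - \frac{1}{n+1}\right) = 1 - \frac{1}{n+1}. \]
Since $\lim_{n\to\infty} s_n = 1 - 0 = 1$, we have $\sum_{n=1}^{\infty} \frac{1}{n(n+1)} = 1$. ♦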
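The telescoping identity for the partial sums in Example 3.2 can be confirmed with exact rational arithmetic. A minimal sketch, in which the function name `partial_sum` is an assumption for illustration:

```python
from fractions import Fraction

def partial_sum(n):
    """s_n = sum of 1/(k(k+1)) for k = 1..n, computed exactly."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))

for n in (1, 2, 5, 100):
    # Each value agrees with the closed form 1 - 1/(n+1).
    print(n, partial_sum(n), 1 - Fraction(1, n + 1))
```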

Exercise 3.3 (Tantalising $\tan^{-1}$). Show that $\sum_{n=1}^{\infty} \tan^{-1} \frac{1}{2n^2} = \frac{\pi}{4}$.
Hint: Write $\frac{1}{2n^2} = \frac{(2n+1) - (2n-1)}{1 + (2n+1)(2n-1)}$ and use $\tan(a - b) = \frac{\tan a - \tan b}{1 + \tan a \tan b}$.
Exercise 3.4. Show that for every real number x > 1, the series
\[ \frac{1}{1+x} + \frac{2}{1+x^2} + \frac{4}{1+x^4} + \cdots + \frac{2^n}{1+x^{2^n}} + \cdots \]
converges. Hint: Add $\frac{1}{1-x}$.
Exercise 3.5. Consider the Fibonacci sequence $(F_n)_{n\in\mathbb{N}}$ with $F_0 = F_1 = 1$ and $F_{n+1} = F_n + F_{n-1}$ for
n ∈ N. Show that
\[ \sum_{n=2}^{\infty} \frac{1}{F_{n-1}F_{n+1}} = 1. \]


Note that in the above example of the divergent series $\sum_{n=1}^{\infty} (-1)^n$, the sequence
\[ (a_n)_{n\in\mathbb{N}} = ((-1)^n)_{n\in\mathbb{N}} \]
was not convergent. In fact, we have the following necessary condition for convergence of a series.

Proposition 3.6. If the series $\sum_{n=1}^{\infty} a_n$ converges, then $\lim_{n\to\infty} a_n = 0$.

Proof. Let $s_n := a_1 + \cdots + a_n$. Since the series converges we have
\[ \lim_{n\to\infty} s_n = L \]
for some L ∈ R. But as $(s_{n+1})_{n\in\mathbb{N}}$ is a subsequence of $(s_n)_{n\in\mathbb{N}}$, it follows that
\[ \lim_{n\to\infty} s_{n+1} = L. \]
By the algebra of limits,
\[ \lim_{n\to\infty} a_{n+1} = \lim_{n\to\infty} (s_{n+1} - s_n) = \lim_{n\to\infty} s_{n+1} - \lim_{n\to\infty} s_n = L - L = 0. \]
□
Exercise 3.7. Does the series $\sum_{n=1}^{\infty} \cos\left(\frac1n\right)$ converge?

In Theorem 3.9, we will see an instance of a series which shows that although this condition
is necessary for the convergence of a series, it is not sufficient. But first, let us see an important
example of a convergent series. In fact, it lies at the core of most of the convergence results in
Real Analysis.

Theorem 3.8. Let r ∈ R. The geometric series $\sum_{n=0}^{\infty} r^n$ converges if and only if |r| < 1.

Proof. Let |r| < 1. First we will observe that $\lim_{n\to\infty} r^n = 0$. This is clear if r = 0. Otherwise, as 0 < |r| < 1,
\[ |r| = \frac{1}{1+h} \]
for some h > 0. (Just take h = 1/|r| − 1.) Then $(1+h)^n = 1 + \binom{n}{1} h + \cdots + h^n > nh$. Thus
\[ 0 \le |r|^n = \frac{1}{(1+h)^n} < \frac{1}{nh}, \]
and so by the Sandwich Theorem, $\lim_{n\to\infty} |r|^n = 0$. As $-|r|^n \le r^n \le |r|^n$, it follows again from the
Sandwich Theorem that
\[ \lim_{n\to\infty} r^n = 0. \]
Let $s_n := 1 + r + r^2 + \cdots + r^n$. Then $r s_n = r + r^2 + \cdots + r^n + r^{n+1}$, and so
\[ (1 - r)s_n = s_n - r s_n = 1 - r^{n+1}. \]
As $\lim_{n\to\infty} r^{n+1} = 0$, it follows that $\lim_{n\to\infty} (1 - r)s_n = 1$. Hence
\[ \sum_{n=0}^{\infty} r^n = \lim_{n\to\infty} s_n = \frac{1}{1-r}. \]


Now suppose that |r| ≥ 1. If r = 1, then
\[ \lim_{n\to\infty} r^n = 1 \neq 0, \]
and so by Proposition 3.6, the series diverges. Similarly if r = −1, then $(r^n)_{n\in\mathbb{N}} = ((-1)^n)_{n\in\mathbb{N}}$
diverges, and so the series is divergent. Also if |r| > 1, then the sequence $(r^n)_{n\in\mathbb{N}}$ has the
subsequence $(r^{2n})_{n\in\mathbb{N}}$ which is not bounded, and hence not convergent. Consequently $(r^n)_{n\in\mathbb{N}}$
diverges, and hence the series diverges. □
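The identity $(1 - r)s_n = 1 - r^{n+1}$ used in the proof, and the resulting limit 1/(1 − r), can be observed numerically. A small sketch in floating-point arithmetic; the function name `geometric_partial_sum` is an assumption for illustration:

```python
def geometric_partial_sum(r, n):
    """s_n = 1 + r + r^2 + ... + r^n, accumulated term by term."""
    total, term = 0.0, 1.0
    for _ in range(n + 1):
        total += term
        term *= r
    return total

r = 0.5
for n in (1, 5, 20, 60):
    s_n = geometric_partial_sum(r, n)
    # (1 - r) * s_n should equal 1 - r^(n+1); s_n approaches 1 / (1 - r) = 2.
    print(n, s_n, (1 - r) * s_n, 1 - r ** (n + 1))
```

For |r| ≥ 1 the same function shows the partial sums failing to settle, consistent with the divergence part of the theorem.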
Theorem 3.9. The harmonic series (see footnote 1) $\sum_{n=1}^{\infty} \frac1n$ diverges.

Proof. Let $s_n := 1 + \frac12 + \frac13 + \cdots + \frac1n$. We have
\[ s_{2n} - s_n = \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n} \ge n \cdot \frac{1}{2n} = \frac12. \tag{3.1} \]
If the series converges, then $\lim_{n\to\infty} s_n = L$ for some L. But then also $\lim_{n\to\infty} s_{2n} = L$, and so
\[ \lim_{n\to\infty} (s_{2n} - s_n) = L - L = 0, \]
which contradicts (3.1). □
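Inequality (3.1) is easy to test with exact fractions, and applying it repeatedly gives $s_{2^k} \ge 1 + k/2$, which shows how slowly the partial sums grow. A minimal sketch; the function name `harmonic` is an assumption for illustration:

```python
from fractions import Fraction

def harmonic(n):
    """s_n = 1 + 1/2 + ... + 1/n, computed exactly."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

for n in (1, 2, 10, 100):
    gap = harmonic(2 * n) - harmonic(n)
    print(n, float(gap))  # always at least 1/2, as in (3.1)

for k in range(0, 11):
    # Doubling k times adds at least k/2 to s_1 = 1.
    print(k, float(harmonic(2 ** k)), 1 + k / 2)
```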


Note that the nth term of the above series satisfies
\[ \lim_{n\to\infty} a_n = \lim_{n\to\infty} \frac1n = 0, \]
showing that the condition given in Proposition 3.6 is necessary but not sufficient for the convergence
of the series.
Theorem 3.10. Let s ∈ R. The series (see footnote 2) $\sum_{n=1}^{\infty} \frac{1}{n^s}$ converges if and only if s > 1.

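Proof. Let
\[ S_n = 1 + \frac{1}{2^s} + \frac{1}{3^s} + \cdots + \frac{1}{n^s}. \]
Clearly $S_1 \le S_2 \le S_3 \le \dots$, so that $(S_n)_{n\in\mathbb{N}}$ is an increasing sequence.
Let s > 1. We have
\begin{align*}
S_{2n+1} &= 1 + \left(\frac{1}{2^s} + \frac{1}{4^s} + \cdots + \frac{1}{(2n)^s}\right) + \left(\frac{1}{3^s} + \frac{1}{5^s} + \cdots + \frac{1}{(2n+1)^s}\right) \\
&< 1 + 2\left(\frac{1}{2^s} + \frac{1}{4^s} + \cdots + \frac{1}{(2n)^s}\right) \\
&= 1 + \frac{2}{2^s}\left(1 + \frac{1}{2^s} + \cdots + \frac{1}{n^s}\right) \\
&= 1 + 2^{1-s} S_n \\
&< 1 + 2^{1-s} S_{2n+1}.
\end{align*}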

Footnote 1. Its name derives from the concept of overtones, or harmonics in music: the wavelengths of the overtones of a vibrating
string are $\frac12$, $\frac13$, $\frac14$, and so on, of the string’s fundamental wavelength.

Footnote 2. The function $s \mapsto \sum_{n=1}^{\infty} \frac{1}{n^s}$ is called the Riemann zeta function, which is an important function in number theory;
the connection with number theory is brought out by Euler’s identity, which says that
\[ \zeta(s) := \sum_{n=1}^{\infty} \frac{1}{n^s} = \prod_{p \text{ prime}} \frac{1}{1 - p^{-s}}. \]


As $2^{1-s} < 1$ (because s > 1!), this last inequality yields
\[ S_{2n+1} < \frac{1}{1 - 2^{1-s}} \quad (n \in \mathbb{N}) \]
and
\[ S_{2n} < S_{2n+1} < \frac{1}{1 - 2^{1-s}} \quad (n \in \mathbb{N}). \]
So $(S_n)_{n\in\mathbb{N}}$ is bounded. But we know that an increasing sequence which is bounded above is
convergent (to the supremum of its terms). Thus
\[ \sum_{n=1}^{\infty} \frac{1}{n^s} \]
converges for s > 1.
If on the other hand s ≤ 1, the proof is similar to that of showing that the harmonic series
diverges. Indeed, if the series converged, then
\[ \lim_{n\to\infty} (S_{2n} - S_n) = 0, \]
while on the other hand,
\[ S_{2n} - S_n = \frac{1}{(n+1)^s} + \frac{1}{(n+2)^s} + \cdots + \frac{1}{(2n)^s} \ge n \cdot \frac{1}{(2n)^s} \ge n \cdot \frac{1}{2n} = \frac12, \]
where we have used the fact that s ≤ 1 in order to obtain the last inequality. □
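The bound $S_n < 1/(1 - 2^{1-s})$ obtained in the proof for s > 1 can be checked numerically. A rough sketch in floating-point arithmetic; the function names `p_series_partial_sum` and `proof_bound` are assumptions for illustration:

```python
def p_series_partial_sum(s, n):
    """S_n = 1 + 1/2^s + ... + 1/n^s."""
    return sum(1.0 / k ** s for k in range(1, n + 1))

def proof_bound(s):
    """The upper bound 1 / (1 - 2^(1-s)) from the proof, valid for s > 1."""
    return 1.0 / (1.0 - 2.0 ** (1.0 - s))

for s in (1.5, 2.0, 3.0):
    # The partial sums increase with n but stay below the bound.
    print(s, p_series_partial_sum(s, 10_000), proof_bound(s))
```

The bound is far from sharp (for s = 2 it equals 2), but boundedness is all the proof needs.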

For a sequence $(a_n)_{n\in\mathbb{N}}$ with nonnegative terms, we sometimes write
\[ \sum_{n=1}^{\infty} a_n < +\infty \]
to mean the series $\sum_{n=1}^{\infty} a_n$ converges.

Exercise 3.11. (∗) Prove that if $a_1 \ge a_2 \ge a_3 \ge \dots$ is a sequence of nonnegative numbers, and if
\[ \sum_{n=1}^{\infty} a_n < +\infty, \]
then $a_n$ approaches 0 faster than 1/n, that is, $\lim_{n\to\infty} n a_n = 0$.
Hint: Consider the inequalities $a_{n+1} + \cdots + a_{2n} \ge n \cdot a_{2n}$ and $a_{n+1} + \cdots + a_{2n+1} \ge n \cdot a_{2n+1}$.
Show that the assumption $a_1 \ge a_2 \ge a_3 \ge \dots$ above cannot be dropped by considering the lacunary
series whose $n^2$th term is $1/n^2$ and all other terms are zero.

Proposition 3.12. If $\sum_{n=1}^{\infty} |a_n|$ converges, then $\sum_{n=1}^{\infty} a_n$ converges.

Proof. Let $s_n := a_1 + \cdots + a_n$. We will show that $(s_n)_{n\in\mathbb{N}}$ is a Cauchy sequence. We have for
n > m:
\begin{align*}
|s_n - s_m| &= |(a_1 + \cdots + a_n) - (a_1 + \cdots + a_m)| = |a_{m+1} + \cdots + a_n| \\
&\le |a_{m+1}| + \cdots + |a_n| = (|a_1| + \cdots + |a_n|) - (|a_1| + \cdots + |a_m|) = \sigma_n - \sigma_m,
\end{align*}
where $\sigma_k := |a_1| + \cdots + |a_k|$ (k ∈ N). Since
\[ \sum_{n=1}^{\infty} |a_n| < +\infty, \]
its sequence of partial sums $(\sigma_n)_{n\in\mathbb{N}}$ is convergent, and in particular, Cauchy. This shows, in light
of the above inequality $|s_n - s_m| \le \sigma_n - \sigma_m$, that $(s_n)_{n\in\mathbb{N}}$ is a Cauchy sequence in R and hence
it is convergent. □


Definition 3.13. If the series $\sum_{n=1}^{\infty} |a_n|$ converges, then we say that $\sum_{n=1}^{\infty} a_n$ converges absolutely.


Exercise 3.14. Does the series $\sum_{n=1}^{\infty} \frac{\sin n}{n^2}$ converge?

Example 3.15. The series $\sum_{n=1}^{\infty} \frac{(-1)^n}{n}$ does not converge absolutely, since
\[ \sum_{n=1}^{\infty} \left|\frac{(-1)^n}{n}\right| = \sum_{n=1}^{\infty} \frac1n, \]
and we have seen that the harmonic series diverges.

A series of the form
\[ \sum_{n=1}^{\infty} (-1)^n a_n \]
with $a_n \ge 0$ for all n ∈ N is called an alternating series. We note that the series considered above,
namely
\[ \sum_{n=1}^{\infty} \frac{(-1)^n}{n}, \]
is an alternating series $\sum_{n=1}^{\infty} (-1)^n a_n$ with $a_n := \frac1n$ (n ∈ N).
The result below, called the Leibniz Alternating Series Theorem, will enable us
to conclude that this alternating series is in fact convergent, since the sufficient conditions
for convergence in the Leibniz Alternating Series Theorem are satisfied:
\[ a_1 = 1 \ge a_2 = \frac12 \ge a_3 = \frac13 \ge \dots \]
and $\lim_{n\to\infty} a_n = \lim_{n\to\infty} \frac1n = 0$. ♦
Theorem 3.16 (Leibniz Alternating Series Theorem). Let (an )n∈N be a sequence such that
(1) it has nonnegative terms (an ≥ 0 for all n),
(2) it is decreasing (a1 ≥ a2 ≥ a3 ≥ . . . ), and
(3) $\lim_{n\to\infty} a_n = 0$.
Then the series $\sum_{n=1}^{\infty} (-1)^n a_n$ converges.

A pictorial “proof without words” is shown in Figure 1. The sum of the lengths of the disjoint
dark intervals is at most the length of (0, a1 ).

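[Figure: the points $0 \le \dots \le a_{2n} \le a_{2n-1} \le \dots \le a_4 \le a_3 \le a_2 \le a_1$ marked on a line, with dark intervals of lengths $a_1 - a_2$, $a_3 - a_4$, . . . , $a_{2n-1} - a_{2n}$, . . . ]

Figure 1. Pictorial proof of the Leibniz Alternating Series Theorem.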


Proof. We may just as well prove the convergence of $\sum_{n=1}^{\infty} (-1)^{n+1} a_n = -\left(\sum_{n=1}^{\infty} (-1)^n a_n\right)$.
Let $s_n = a_1 - a_2 + a_3 - + \cdots + (-1)^{n-1} a_n$. Clearly
\begin{align*}
s_{2n+1} &= s_{2n-1} - a_{2n} + a_{2n+1} \le s_{2n-1}, \\
s_{2n+2} &= s_{2n} + a_{2n+1} - a_{2n+2} \ge s_{2n},
\end{align*}
and so the sequence $s_2, s_4, s_6, \dots$ is increasing, while the sequence $s_3, s_5, s_7, \dots$ is decreasing. Also,
\[ s_{2n} \le s_{2n} + a_{2n+1} = s_{2n+1} \le s_{2n-1} \le \cdots \le s_3. \]
So $(s_{2n})_{n\in\mathbb{N}}$ is a bounded ($s_2 \le s_{2n} \le s_3$ for all n), increasing sequence, and hence it is convergent.
But as $(a_{2n+1})_{n\in\mathbb{N}}$ is also convergent with limit 0, it follows that
\[ \lim_{n\to\infty} s_{2n+1} = \lim_{n\to\infty} (s_{2n} + a_{2n+1}) = \lim_{n\to\infty} s_{2n}. \]
Hence $(s_n)_{n\in\mathbb{N}}$ is convergent, and so the series converges. □
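The behaviour used in the proof (even-indexed partial sums increase, odd-indexed ones decrease, and the two squeeze together) can be seen for $a_n = 1/n$. The sketch below relies on the standard fact that $1 - \frac12 + \frac13 - \cdots = \ln 2$, which is not proved in these notes; the function name `alternating_partial_sum` is an assumption for illustration.

```python
from fractions import Fraction
import math

def alternating_partial_sum(n):
    """s_n = 1 - 1/2 + 1/3 - ... + (-1)^(n-1)/n, computed exactly."""
    return sum(Fraction((-1) ** (k - 1), k) for k in range(1, n + 1))

for n in range(1, 6):
    even, odd = alternating_partial_sum(2 * n), alternating_partial_sum(2 * n + 1)
    # The even sums creep up and the odd sums creep down towards ln 2.
    print(n, float(even), math.log(2), float(odd))
```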



Exercise 3.17. Let s > 0. Show that $\sum_{n=1}^{\infty} \frac{(-1)^n}{n^s}$ converges.

Exercise 3.18. Prove that $\sum_{n=1}^{\infty} (-1)^n \frac{\sqrt{n}}{n+1}$ converges.

Exercise 3.19. Prove that $\sum_{n=1}^{\infty} (-1)^n \sin\frac1n$ converges.

3.1.1. Comparison, Ratio, Root. We will now learn three important tests for the convergence
of a series:
(1) the comparison test (where we compare with a series whose convergence status is known)
(2) the ratio test (where we look at the behaviour of the ratio of terms $a_{n+1}/a_n$)
(3) the root test (where we look at the behaviour of $\sqrt[n]{|a_n|}$)

Theorem 3.20 (Comparison test).


(1) If $(a_n)_{n\in\mathbb{N}}$ and $(c_n)_{n\in\mathbb{N}}$ are such that there exists an N ∈ N such that $|a_n| \le c_n$ for all
n ≥ N, and $\sum_{n=1}^{\infty} c_n$ converges, then $\sum_{n=1}^{\infty} a_n$ converges absolutely.
(2) If $(a_n)_{n\in\mathbb{N}}$ and $(d_n)_{n\in\mathbb{N}}$ are such that there exists an N ∈ N such that $a_n \ge d_n \ge 0$ for
all n ≥ N, and $\sum_{n=1}^{\infty} d_n$ diverges, then $\sum_{n=1}^{\infty} a_n$ diverges.

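Proof. (1) Let
\begin{align*}
s_n &:= |a_1| + \cdots + |a_n|, \\
\sigma_n &:= c_1 + \cdots + c_n.
\end{align*}
We have for n > m ≥ N
\[ |s_n - s_m| = |a_{m+1}| + \cdots + |a_n| \le c_{m+1} + \cdots + c_n = |\sigma_n - \sigma_m|, \]
and since $(\sigma_n)_{n\in\mathbb{N}}$ is Cauchy, $(s_n)_{n\in\mathbb{N}}$ is also Cauchy. Hence $(s_n)_{n\in\mathbb{N}}$ is convergent, that is, $\sum_{n=1}^{\infty} a_n$ converges absolutely.
(2) The second claim follows from the first one. For if $\sum_{n=1}^{\infty} a_n$ converges, so must $\sum_{n=1}^{\infty} d_n$ (apply (1) with $d_n$ in place of $a_n$ and $a_n$ in place of $c_n$), a contradiction. □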


Theorem 3.21 (Ratio test). Let (an )n∈N be a sequence of nonzero terms.
(1) If ∃r ∈ (0, 1) and ∃N ∈ N such that ∀n > N, $\left|\frac{a_{n+1}}{a_n}\right| \le r$, then $\sum_{n=1}^{\infty} a_n$ converges absolutely.
(2) If ∃N ∈ N such that ∀n > N, $\left|\frac{a_{n+1}}{a_n}\right| \ge 1$, then $\sum_{n=1}^{\infty} a_n$ diverges.

Proof. (1) We have
\begin{align*}
|a_{N+2}| &\le r|a_{N+1}|, \\
|a_{N+3}| &\le r|a_{N+2}| \le r^2|a_{N+1}|, \\
|a_{N+4}| &\le r|a_{N+3}| \le r^3|a_{N+1}|, \\
&\ \,\vdots
\end{align*}
Since the geometric series $\sum_{n=1}^{\infty} r^n$ converges, we obtain
\[ \sum_{n=N+1}^{\infty} |a_n| < +\infty \]
by the Comparison Test. By adding the finite sum $|a_1| + \cdots + |a_N|$ to each partial sum of this
last series, we see also that $\sum_{n=1}^{\infty} |a_n|$ converges. This completes the proof of the claim in (1).

(2) The given condition implies that
\[ \cdots \ge |a_{N+3}| \ge |a_{N+2}| \ge |a_{N+1}|. \tag{3.2} \]
If the series $\sum_{n=1}^{\infty} a_n$ were convergent, then $0 = \lim_{n\to\infty} a_n = \lim_{k\to\infty} a_{N+k}$. Hence $\lim_{k\to\infty} |a_{N+k}| = 0$ as well.
But by (3.2), we see that $\lim_{k\to\infty} |a_{N+k}| \ge |a_{N+1}| > 0$, a contradiction. □

It does not suffice for convergence of the series that for all sufficiently large n, $\left|\frac{a_{n+1}}{a_n}\right| < 1$.
For example, for the harmonic series,
\[ \left|\frac{a_{n+1}}{a_n}\right| = \frac{1/(n+1)}{1/n} = \frac{n}{n+1} < 1, \]
but $\sum_{n=1}^{\infty} \frac1n$ diverges.
Corollary 3.22. Let $(a_n)_{n\in\mathbb{N}}$ have nonzero terms. If $\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| < 1$, then $\sum_{n=1}^{\infty} a_n$ converges
absolutely.

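Proof. Let $L := \lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right|$. Then $\epsilon := \frac{1-L}{2} > 0$. Choose N ∈ N such that for n > N,
\[ \left|\frac{a_{n+1}}{a_n}\right| - L \le \left|\,\left|\frac{a_{n+1}}{a_n}\right| - L\,\right| < \epsilon = \frac{1-L}{2}, \]
and so $\left|\frac{a_{n+1}}{a_n}\right| < \frac{1+L}{2} =: r < \frac{1+1}{2} = 1$. Then we use (1) from Theorem 3.21. □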


Example 3.23 (The exponential series). Let a ∈ R. The series $e^a := \sum_{n=0}^{\infty} \frac{1}{n!} a^n$ converges.
If a = 0, clearly $e^a = e^0 = 1$. If a ≠ 0, the convergence follows from the ratio test. Indeed,
\[ \frac{\left|\dfrac{a^{n+1}}{(n+1)!}\right|}{\left|\dfrac{a^n}{n!}\right|} = \frac{|a|}{n+1} \xrightarrow{n\to\infty} 0. \]
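The rapid convergence promised by the ratio test in Example 3.23 shows up clearly in the partial sums. A minimal sketch, comparing against Python's built-in `math.exp`; the function name `exp_partial_sum` is an assumption for illustration:

```python
import math

def exp_partial_sum(a, n):
    """Sum of a^k / k! for k = 0..n, building each term from the previous one."""
    total, term = 0.0, 1.0
    for k in range(n + 1):
        total += term
        term *= a / (k + 1)  # a^(k+1)/(k+1)! = (a^k/k!) * a/(k+1), the ratio from the example
    return total

for a in (1.0, -3.5, 10.0):
    for n in (5, 20, 60):
        print(a, n, exp_partial_sum(a, n), math.exp(a))
```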

Theorem 3.24 (Root test).
(1) If ∃r ∈ (0, 1) and ∃N ∈ N such that ∀n > N, $\sqrt[n]{|a_n|} \le r$, then $\sum_{n=1}^{\infty} a_n$ converges absolutely.
(2) If for infinitely many n, $\sqrt[n]{|a_n|} \ge 1$, then $\sum_{n=1}^{\infty} a_n$ diverges.


Proof. (1) We have $|a_n| \le r^n$ for all n > N, so that by the Comparison Test, $\sum_{n=N+1}^{\infty} |a_n|$ converges, and hence so does $\sum_{n=1}^{\infty} |a_n|$.
(2) Suppose that for the subsequence $(a_{n_k})_{k\in\mathbb{N}}$, we have $\sqrt[n_k]{|a_{n_k}|} \ge 1$. Then $|a_{n_k}| \ge 1$. If the
series were convergent, then $\lim_{n\to\infty} a_n = 0$, and so also $\lim_{k\to\infty} |a_{n_k}| = 0$, a contradiction. Thus the
series cannot converge. □

It does not suffice for convergence of the series that for sufficiently large n, $\sqrt[n]{|a_n|} < 1$.
For example, for the harmonic series $\sqrt[n]{|a_n|} = \frac{1}{\sqrt[n]{n}} < 1$ for n ≥ 2, but $\sum_{n=1}^{\infty} \frac1n$ diverges.
Exercise 3.25. Determine if the following series are convergent or not.

(1) $\sum_{n=1}^{\infty} \frac{n^2}{2^n}$.
(2) $\sum_{n=1}^{\infty} \frac{(n!)^2}{(2n)!}$.
(3) $\sum_{n=1}^{\infty} \left(\frac45\right)^n n^5$.

Exercise 3.26. Prove that $\sum_{n=1}^{\infty} \frac{n}{n^4 + n^2 + 1}$ converges. (∗) Can you also find its value? Hint: Write the
denominator as $(n^2 + 1)^2 - n^2$.

Exercise 3.27. (∗) Let $(a_n)_{n\in\mathbb{N}}$ be a sequence of real numbers such that the series $\sum_{n=1}^{\infty} a_n^4$ converges.
Show that $\sum_{n=1}^{\infty} a_n^5$ converges. Hint: First conclude that for large n, $|a_n| < 1$.

Exercise 3.28. (∗) We have seen that the series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges if and only if p > 1. Suppose that we
sum the series for all n that can be written without using the numeral 9. (Imagine that the key for 9 on
the keyboard is broken.) Call the resulting summation $\sum'$. Prove that the series $\sum' \frac{1}{n^p}$ converges if and
only if $p > \log_{10} 9$. Hint: First show that there are $8 \cdot 9^{k-1}$ numbers without 9 between $10^{k-1}$ and $10^k$.
Exercise 3.29. Determine if the following statements are true or false. Give reasons for your answers.


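(1) If $\sum_{n=1}^{\infty} |a_n|$ is convergent, then so is $\sum_{n=1}^{\infty} a_n^2$.
(2) If $\sum_{n=1}^{\infty} a_n$ is convergent, then so is $\sum_{n=1}^{\infty} a_n^2$.
(3) If $\lim_{n\to\infty} a_n = 0$, then $\sum_{n=1}^{\infty} a_n$ converges.
(4) If $\lim_{n\to\infty} (a_1 + \cdots + a_n) = 0$, then $\sum_{n=1}^{\infty} a_n$ converges.
(5) $\sum_{n=1}^{\infty} \log\frac{n+1}{n}$ converges.
(6) If $a_n > 0$ (n ∈ N) and the partial sums of $(a_n)_{n\in\mathbb{N}}$ are bounded above, then $\sum_{n=1}^{\infty} a_n$ converges.
(7) If $a_n > 0$ (n ∈ N) and $\sum_{n=1}^{\infty} a_n$ converges, then $\sum_{n=1}^{\infty} \frac{1}{a_n}$ diverges.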

Exercise 3.30 (Fourier series). In order to understand a complicated situation, it is natural to try to
break it up into simpler things. For example, from Calculus we learn that an analytic function can be
expanded into a Taylor series, where we break it down into the simplest possible analytic functions, namely
monomials 1, x, x2 , . . . as follows:
\[ f(x) = f(0) + f'(0)x + \frac{f''(0)}{2!}x^2 + \dots. \]
The idea behind the Fourier series is similar. In order to understand a complicated periodic function,
we break it down into the simplest periodic functions, namely sines and cosines. Thus if T > 0 and
f : R → R is T-periodic, that is, f(x) = f(x + T) (x ∈ R), then one tries to find coefficients $a_0, a_1, a_2, \dots$
and $b_1, b_2, b_3, \dots$ such that
\[ f(x) = a_0 + \sum_{n=1}^{\infty} \left( a_n \cos\left(\frac{2\pi n}{T}x\right) + b_n \sin\left(\frac{2\pi n}{T}x\right) \right). \tag{3.3} \]

(1) Suppose that the Fourier series given in (3.3) converges pointwise to f on R. Show that if
\[ \sum_{n=1}^{\infty} (|a_n| + |b_n|) < \infty, \]

then in fact the series converges uniformly.


(2) The aim of this part of the exercise is to give experimental evidence for two things. Firstly, the
plausibility of the Fourier expansion, and secondly, that the uniform convergence might fail if
the condition in the previous part of this exercise does not hold. To this end, let us consider
the square wave f : R → R given by
\[ f(x) = \begin{cases} 1 & \text{if } x \in [n, n+1) \text{ for } n \text{ even}, \\ -1 & \text{if } x \in [n, n+1) \text{ for } n \text{ odd}. \end{cases} \]
Then f is a 2-periodic signal. From the theory of Fourier Series, which we will not discuss in this
course, the coefficients can be calculated, and they happen to be $0 = a_0 = a_1 = a_2 = a_3 = \dots$
and
\[ b_n = \begin{cases} \dfrac{4}{n\pi} & \text{if } n \text{ is odd}, \\ 0 & \text{if } n \text{ is even}. \end{cases} \]
Write a Maple program to plot the graphs of the partial sums of the series in (3.3) with, say, 3,
33, 333 terms. Discuss your observations.
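For readers without access to Maple, the partial sums can also be evaluated (not plotted) in plain Python. The following is only a sketch under stated assumptions: it takes T = 2 and the coefficients $b_n$ quoted above, reads "N terms" as the sum over n = 1, . . . , N in (3.3), and uses illustrative function names `square_wave_partial_sum` and `peak_on_grid` together with an arbitrary sampling grid.

```python
import math

def square_wave_partial_sum(x, terms):
    """Partial sum of (3.3) over n = 1..terms with T = 2, a_n = 0, b_n = 4/(n*pi) for odd n."""
    return sum(4.0 / (n * math.pi) * math.sin(n * math.pi * x)
               for n in range(1, terms + 1, 2))  # even n contribute nothing

def peak_on_grid(terms, points=2000):
    """Largest value of the partial sum on an evenly spaced grid in [0, 1]."""
    return max(square_wave_partial_sum(k / points, terms) for k in range(points + 1))

for terms in (3, 33, 333):
    # The value at x = 1/2 approaches f(1/2) = 1, but the peak near the jump at x = 0
    # stays visibly above 1, which is evidence against uniform convergence.
    print(terms, square_wave_partial_sum(0.5, terms), peak_on_grid(terms))
```

Feeding the sampled values into any plotting tool reproduces the kind of picture the exercise asks for.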

Exercise 3.31. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence with nonnegative terms. Show that if $\sum_{n=1}^{\infty} a_n$ converges, then
so does $\sum_{n=1}^{\infty} \sqrt{a_n a_{n+1}}$.


Figure 2. Partial sums of the Fourier series for the square wave considered in Exercise 3.30.


Exercise 3.32. (∗) Let $(a_n)_{n\in\mathbb{N}}$ be a sequence with nonnegative terms. Prove that $\sum_{n=1}^{\infty} a_n$ converges if
and only if $\sum_{n=1}^{\infty} \frac{a_n}{1 + a_n}$ converges.

Exercise 3.33. Let $\ell^1$ be defined by $\ell^1 = \left\{ (a_n)_{n\in\mathbb{N}} : \sum_{n=1}^{\infty} |a_n| < \infty \right\}$. Show that $\ell^1 \subseteq \ell^2$. Is $\ell^1 = \ell^2$?
(The normed space $\ell^2$ was defined in Exercise 1.19 on page 6.)



Exercise 3.34. As the harmonic series $\sum_{n=1}^{\infty} \frac1n$ diverges, the reciprocal $1/s_n$ of the nth partial sum
\[ s_n := 1 + \frac12 + \frac13 + \cdots + \frac1n \]
approaches 0 as n → ∞. So the necessary condition for the convergence of the series
\[ \sum_{n=1}^{\infty} \frac{1}{s_n} \]
is satisfied. But we don’t know yet whether or not it actually converges. It is clear that the harmonic
series diverges very slowly, which means that $1/s_n$ decreases very slowly, and this prompts the guess that
this series diverges. Show that in fact our guess is correct. Hint: $s_n \le n$.

Exercise 3.35. Show that the series $\sum_{n=1}^{\infty} \frac{1}{n^n}$ converges.

Exercise 3.36. Consider the Fibonacci sequence $(F_n)_{n\in\mathbb{N}}$ with $F_0 = F_1 = 1$ and $F_{n+1} = F_n + F_{n-1}$ for
n ∈ N. Show that
\[ \sum_{n=0}^{\infty} \frac{1}{F_n} < +\infty. \]
Hint: $F_{n+1} = F_n + F_{n-1} > F_{n-1} + F_{n-1} = 2F_{n-1}$. Using this, show that both $\frac{1}{F_0} + \frac{1}{F_2} + \frac{1}{F_4} + \dots$ and
$\frac{1}{F_1} + \frac{1}{F_3} + \frac{1}{F_5} + \dots$ converge.

3.1.2. Power series. Let (cn )n∈N be a real sequence (thought of as a sequence of “coefficients”).
An expression of the type
\[ \sum_{n=0}^{\infty} c_n x^n \]
is called a power series in the variable x ∈ R.
Example 3.37. All polynomial expressions are (finite) power series, with the coefficients being
eventually all zeros.
$\sum_{n=0}^{\infty} x^n$ and $\sum_{n=0}^{\infty} \frac{1}{n!} x^n$ are also examples of power series. ♦

Note that we have not said anything about the set of x ∈ R where the power series converges.
Of course the power series always converges for x = 0. It turns out that there is a maximal open
interval B(0, r) = (−r, r) centered at 0 where the power series converges absolutely, and we call
the radius r of this interval the radius of convergence of the power series. If
the power series converges for all x ∈ R, that is, if the above maximal interval is (−∞, ∞), we say
that the power series has infinite radius of convergence.

Example 3.38. The radius of convergence of $\sum_{n=0}^{\infty} x^n$ is 1. Indeed, the geometric series converges
for x ∈ (−1, 1) and diverges whenever |x| ≥ 1. ♦

Example 3.39. The radius of convergence of $\sum_{n=0}^{\infty} \frac{1}{n!} x^n$ is infinite, since the series defining $e^x$ converges for every
x ∈ R. ♦

Example 3.40. The radius of convergence of $\sum_{n=0}^{\infty} n^n x^n$ is zero. Indeed, whenever x ≠ 0,
\[ \sqrt[n]{|n^n x^n|} = n|x| > 1 \]
for all n large enough. By the Root test, it follows that the power series diverges for all nonzero
real numbers. ♦
Theorem 3.41. A power series
\[ \sum_{n=0}^{\infty} c_n x^n \]
is either absolutely convergent for all x ∈ R, or there is a unique nonnegative real number ρ such
that the power series is absolutely convergent for all x ∈ R with |x| < ρ and is divergent for all
x ∈ R with |x| > ρ.

Proof. Let $S := \left\{ y \in \mathbb{R} : \text{there exists an } x \in \mathbb{R} \text{ such that } y = |x| \text{ and } \sum_{n=0}^{\infty} c_n x^n \text{ converges} \right\}$.
Clearly 0 ∈ S. We consider the following two possible cases.

1◦ S is not bounded above. Then given x ∈ R, we can find a $|x_0| \in S$ such that $|x| < |x_0|$
and $\sum_{n=0}^{\infty} c_n x_0^n$ converges. It follows that its nth term goes to 0 as n → ∞, and in


particular, it is bounded: $|c_n x_0^n| \le M$. Then with $r := \frac{|x|}{|x_0|}$ (< 1), we have
\[ |c_n x^n| = |c_n x_0^n| \left(\frac{|x|}{|x_0|}\right)^n \le M r^n \quad (n \in \mathbb{N}). \]
But the Geometric Series $\sum_{n=0}^{\infty} M r^n$ converges, and so by the Comparison Test, the series
$\sum_{n=0}^{\infty} c_n x^n$ is absolutely convergent.

2◦ S is bounded above. Let ρ := sup S.

If x ∈ R and |x| < ρ, then by the definition of supremum, it follows that there exists
a $|x_0| \in S$ such that $|x| < |x_0|$. Then we repeat the proof in 1◦ above to conclude that
the series $\sum_{n=0}^{\infty} c_n x^n$ is absolutely convergent.

On the other hand, if x ∈ R and |x| > ρ, then |x| ∉ S, and by the definition of S,
the series $\sum_{n=0}^{\infty} c_n x^n$ diverges.

If ρ, ρ′ have the property described in the theorem and ρ < ρ′, then $\rho < r := \frac{\rho + \rho'}{2} < \rho'$.
Because 0 < r < ρ′, $\sum_{n=0}^{\infty} c_n r^n$ converges, while as ρ < r, it follows that $\sum_{n=0}^{\infty} c_n r^n$ diverges, a
contradiction. □

If ρ is the radius of convergence of a power series, then (−ρ, ρ) is called the interval of
convergence of that power series. We note that the interval of convergence is the empty set if
ρ = 0, and we set the interval of convergence to be R when the radius of convergence is infinite.
See Figure 3.

[Figure: the real line with the points −r, 0, r marked; absolute convergence on (−r, r), divergence to the left of −r and to the right of r.]

Figure 3. The interval of convergence of a power series.

The calculation of the radius of convergence is facilitated in some cases by the following result.

Theorem 3.42. Consider the power series
\[ \sum_{n=0}^{\infty} c_n x^n. \]
If $L := \lim_{n\to\infty} \left|\frac{c_{n+1}}{c_n}\right|$ exists, then ρ = 1/L if L ≠ 0, and the radius of convergence is infinite if
L = 0.


Proof. Let L ≠ 0. For every nonzero x such that |x| < 1/L, there exist a
q < 1 and an N large enough such that
\[ \frac{|c_{n+1} x^{n+1}|}{|c_n x^n|} = \left|\frac{c_{n+1}}{c_n}\right| |x| \le q < 1 \]
for all n > N. (This is because
\[ \left|\frac{c_{n+1}}{c_n}\right| |x| \xrightarrow{n\to\infty} L|x| < 1. \]
So we may take for example q = (L|x| + 1)/2 < 1.) Thus by the Ratio Test, the power series
converges absolutely for such x.
If L = 0, then for any nonzero x ∈ R, we can guarantee that
\[ \frac{|c_{n+1} x^{n+1}|}{|c_n x^n|} = \left|\frac{c_{n+1}}{c_n}\right| |x| \le q < 1 \]
for all n > N. (This is because
\[ \left|\frac{c_{n+1}}{c_n}\right| |x| \xrightarrow{n\to\infty} 0 \cdot |x| = 0 < 1. \]
So we may take for example q = 1/2 < 1.) Thus again by the Ratio Test, the power series
converges absolutely for such x.
On the other hand, if L ≠ 0 and |x| > 1/L, then there exists an N large enough such that
\[ \frac{|c_{n+1} x^{n+1}|}{|c_n x^n|} = \left|\frac{c_{n+1}}{c_n}\right| |x| > 1 \]
for all n > N. This is because
\[ \left|\frac{c_{n+1}}{c_n}\right| |x| \xrightarrow{n\to\infty} L|x| > 1. \]
So again by the Ratio Test, the power series diverges. □
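Theorem 3.42 suggests a simple numerical heuristic: evaluate the ratio $|c_{n+1}/c_n|$ for a large n and invert it. The sketch below is illustrative only. The function name `coefficient_ratio` and the two sample coefficient sequences ($c_n = n 2^n$ and $c_n = 1/n!$) are assumptions chosen for the demonstration, and a single large n gives an estimate of L, not a proof that the limit exists.

```python
from fractions import Fraction
import math

def coefficient_ratio(c, n):
    """|c_{n+1} / c_n| computed exactly, for a coefficient function c with nonzero values."""
    return abs(Fraction(c(n + 1)) / Fraction(c(n)))

# Sample 1: c_n = n * 2^n, where the ratio 2(n+1)/n tends to L = 2, so rho = 1/2.
ratio = coefficient_ratio(lambda n: n * 2 ** n, 10 ** 6)
print(float(ratio), float(1 / ratio))

# Sample 2: c_n = 1/n!, where the ratio 1/(n+1) tends to L = 0 (infinite radius).
print(coefficient_ratio(lambda n: Fraction(1, math.factorial(n)), 200))
```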

Exercise 3.43. (∗) Consider the power series $\sum_{n=0}^{\infty} c_n x^n$. Show that if $L := \lim_{n\to\infty} \sqrt[n]{|c_n|}$ exists, then ρ = 1/L if L ≠ 0,
and the radius of convergence is infinite if L = 0.


Note that whether or not the power series converges at x = ρ and x = −ρ is not answered by
Theorem 3.41. In fact this is a delicate issue, and either convergence or divergence can take place
at these points, as demonstrated by the following examples.

Example 3.44. We have the following:


Power series                                   Radius of convergence   Set of x’s for which the power series converges
$\sum_{n=1}^{\infty} x^n$                      1                       (−1, 1)
$\sum_{n=1}^{\infty} \frac{x^n}{n^2}$          1                       [−1, 1]
$\sum_{n=1}^{\infty} \frac{x^n}{n}$            1                       [−1, 1)
$\sum_{n=1}^{\infty} (-1)^n \frac{x^n}{n}$     1                       (−1, 1]



Exercise 3.45. Check all the claims in Example 3.44.
Exercise 3.46. Find the radius of convergence for each of the following power series:
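\[ \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^{2n-1}}{(2n-1)!} = x - \frac{x^3}{3!} + \frac{x^5}{5!} - + \cdots \quad \text{and} \quad \sum_{n=0}^{\infty} (-1)^n \frac{x^{2n}}{(2n)!} = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - + \cdots. \]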

3.2. Series in normed spaces


We can’t define series in a general metric space, since we need to add terms. But in the setting of
a normed space, addition of vectors is available, and so we can define the notion of convergence
of a series in a normed space.

Definition 3.47. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence in a normed space $(X, \|\cdot\|)$. The sequence $(s_n)_{n\in\mathbb{N}}$
of partial sums is defined as follows:
\[ s_n = a_1 + \cdots + a_n \in X \quad (n \in \mathbb{N}). \]
The series
\[ \sum_{n=1}^{\infty} a_n \]
is called convergent if $(s_n)_{n\in\mathbb{N}}$ converges in $(X, \|\cdot\|)$. Then we write
\[ \sum_{n=1}^{\infty} a_n = \lim_{n\to\infty} s_n. \]
If the sequence $(s_n)_{n\in\mathbb{N}}$ does not converge we say that the series $\sum_{n=1}^{\infty} a_n$ diverges.

It turns out that the convergence of a series in a complete normed space is guaranteed by the convergence
of an associated real series (of the norms of its terms). We first introduce the notion of absolute
convergence of a series in a normed space, analogous to the absolute convergence of a real series.

Definition 3.48. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence in a normed space $(X, \|\cdot\|)$. We say that the series
$\sum_{n=1}^{\infty} a_n$ converges absolutely if the (real) series $\sum_{n=1}^{\infty} \|a_n\|$ converges.

Theorem 3.49. Let $(a_n)_{n\in\mathbb{N}}$ be a sequence in a complete normed space $(X, \|\cdot\|)$ such that
$\sum_{n=1}^{\infty} \|a_n\| < +\infty$. Then $\sum_{n=1}^{\infty} a_n$ converges in X.

Proof. The proof is the same, mutatis mutandis, as the proof of the fact that absolutely convergent
real series converge. The only change is we use norms instead of absolute values, and use the
completeness of X in order to conclude that when the partial sums form a Cauchy sequence, they
converge to a limit in X.
Let $s_n := a_1 + \cdots + a_n$. We will show that $(s_n)_{n\in\mathbb{N}}$ is a Cauchy sequence. We have for n > m:
\begin{align*}
\|s_n - s_m\| &= \|(a_1 + \cdots + a_n) - (a_1 + \cdots + a_m)\| = \|a_{m+1} + \cdots + a_n\| \\
&\le \|a_{m+1}\| + \cdots + \|a_n\| = (\|a_1\| + \cdots + \|a_n\|) - (\|a_1\| + \cdots + \|a_m\|) = \sigma_n - \sigma_m,
\end{align*}


where σk := ka1 k + · · · + kak k. Since the series



X
kan k
n=1

converges (given!), its sequence of partial sums is convergent, and in particular, Cauchy. This
shows, in light of the above inequality ksn − sm k ≤ σn − σm , that (sn )n∈N is a Cauchy sequence
in X. As X is complete, (sn )n∈N is convergent. 
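Example 3.50. Consider the sequence $(f_n)_{n\in\mathbb{N}}$ in the normed space $(C[0,1], \|\cdot\|_\infty)$, where
\[ f_n(x) = \left(\frac{x}{2}\right)^n \quad (x \in [0,1],\ n \in \mathbb{N}). \]
The series $\sum_{n=1}^{\infty} f_n$ converges in $(C[0,1], \|\cdot\|_\infty)$ since
\[ \|f_n\|_\infty = \max_{x\in[0,1]} \left(\frac{x}{2}\right)^n = \frac{1}{2^n}, \]
and the series $\sum_{n=1}^{\infty} \|f_n\|_\infty = \sum_{n=1}^{\infty} \frac{1}{2^n}$ converges.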

Figure 4. Plots of the graphs of the partial sums and the limit f .

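In fact, one can see this directly: since
\[ s_n(x) = f_1(x) + \cdots + f_n(x) = \frac{\frac{x}{2} - \left(\frac{x}{2}\right)^{n+1}}{1 - \frac{x}{2}} = \frac{x}{2-x}\left(1 - \frac{x^n}{2^n}\right), \]
with $f$ defined by $f(x) = \dfrac{x}{2-x}$, we have that
\[ \|s_n - f\|_\infty = \max_{x\in[0,1]} \frac{x}{2-x} \cdot \frac{x^n}{2^n} \le \frac{1}{2^n} \xrightarrow{\ n\to\infty\ } 0, \]
and so the series $\sum_{n=1}^{\infty} f_n$ converges to $f$ in $(C[0,1], \|\cdot\|_\infty)$.
Figure 4 shows plots of the partial sums and their limit $f$. ♦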
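The estimate in Example 3.50 can be checked numerically. The following is only a sketch: the sup-norm is approximated by a maximum over a finite grid (the grid size is an arbitrary choice made here, not part of the notes), so it illustrates the bound $\|s_n - f\|_\infty \le 2^{-n}$ rather than proving it.

```python
# Numerical illustration of Example 3.50 on a finite grid.
# The grid only approximates the sup-norm; this is not a proof.

def partial_sum(x, n):
    """s_n(x) = sum_{k=1}^{n} (x/2)^k."""
    return sum((x / 2) ** k for k in range(1, n + 1))


def limit(x):
    """f(x) = x / (2 - x), the claimed uniform limit on [0, 1]."""
    return x / (2 - x)


# Hypothetical choice of sample points in [0, 1] (includes both endpoints).
GRID = [i / 1000 for i in range(1001)]


def sup_error(n):
    """Grid approximation of ||s_n - f||_inf."""
    return max(abs(partial_sum(x, n) - limit(x)) for x in GRID)


for n in (1, 2, 5, 10, 20):
    # Second column: approximate sup-norm error; third column: the bound 1/2^n.
    print(n, sup_error(n), 2.0 ** (-n))
```

On this grid the largest error occurs at the endpoint $x = 1$, where $f(1) - s_n(1) = 2^{-n}$, so the bound in the example is attained.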

The following result plays a central role in Differential Equation theory.


Theorem 3.51. Let $A \in \mathbb{R}^{n\times n}$. Then the exponential series
\[ e^A := I + A + \frac{1}{2!}A^2 + \frac{1}{3!}A^3 + \cdots \]
converges in $(\mathbb{R}^{n\times n}, \|\cdot\|_\infty)$.


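Proof. It is easy to see that if $A, B \in \mathbb{R}^{n\times n}$, then $\|AB\|_\infty \le n\|A\|_\infty \|B\|_\infty$. This follows
immediately from:
\[ |(AB)_{ij}| = \left| \sum_{k=1}^{n} A_{ik}B_{kj} \right| \le \sum_{k=1}^{n} |A_{ik}||B_{kj}| \le \sum_{k=1}^{n} \|A\|_\infty \|B\|_\infty = n\|A\|_\infty \|B\|_\infty. \]
Hence by induction, we have $\|A^k\|_\infty \le n^{k-1}\|A\|_\infty^k \le (n\|A\|_\infty)^k$ for all $k \in \mathbb{N}$. Thus
\[ \left\| \frac{1}{k!}A^k \right\|_\infty \le \frac{1}{k!}(n\|A\|_\infty)^k. \]
As the series $\sum_{k=1}^{\infty} \frac{1}{k!}(n\|A\|_\infty)^k = e^{n\|A\|_\infty} - 1$ converges, by the Comparison Test, the real series
\[ \sum_{k=1}^{\infty} \left\| \frac{1}{k!}A^k \right\|_\infty \]
converges. Since $(\mathbb{R}^{n\times n}, \|\cdot\|_\infty)$ is complete, it follows from Theorem 3.49 that
\[ e^A := I + A + \frac{1}{2!}A^2 + \frac{1}{3!}A^3 + \cdots \]
converges in $(\mathbb{R}^{n\times n}, \|\cdot\|_\infty)$. □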
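The partial sums of the exponential series can be computed directly. The sketch below uses plain Python lists, and it assumes (as the entrywise estimate in the proof suggests) that $\|A\|_\infty$ denotes the largest absolute entry of $A$. The helper names and the sample matrices are hypothetical choices made for illustration.

```python
from math import factorial


def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_norm(A):
    """Assumed meaning of ||A||_inf here: the largest absolute entry."""
    return max(abs(a) for row in A for a in row)


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def exp_partial_sum(A, terms):
    """I + A + A^2/2! + ... + A^(terms-1)/(terms-1)!."""
    n = len(A)
    total = [[0.0] * n for _ in range(n)]
    power = identity(n)            # holds A^k at the start of step k
    for k in range(terms):
        for i in range(n):
            for j in range(n):
                total[i][j] += power[i][j] / factorial(k)
        power = mat_mul(power, A)  # advance to A^(k+1)
    return total


# A nilpotent example: N^2 = 0, so the series stops after two terms.
N = [[0.0, 1.0], [0.0, 0.0]]
print(exp_partial_sum(N, 10))      # [[1.0, 1.0], [0.0, 1.0]]

# Check the bound ||A^k|| <= n^(k-1) ||A||^k from the proof on a sample matrix.
A = [[1.0, -2.0], [0.5, 3.0]]
P = A
for k in range(1, 8):
    assert max_norm(P) <= len(A) ** (k - 1) * max_norm(A) ** k + 1e-9
    P = mat_mul(P, A)
```

The nilpotent matrix is chosen because its exponential is known exactly ($e^N = I + N$), which makes the output easy to verify by hand.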
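Exercise 3.52. Calculate
(1) $e^{0}$ (here $0$ denotes the $n \times n$ matrix with all entries 0)
(2) $e^{I}$
(3) $e^{\Lambda}$ for the diagonal matrix
\[ \Lambda = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n \end{pmatrix}. \]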

Exercise 3.53. Give an example of a normed space and an absolutely convergent series inside this space
that is not convergent.
Hint: Consider the normed space $c_{00}$ that consists of sequences that are eventually 0, with the norm
of $x = (x^{(n)})_{n\in\mathbb{N}}$ defined to be
\[ \|x\|_2 = \|(x^{(n)})\|_2 = \sqrt{\sum_{n=1}^{\infty} (x^{(n)})^2}. \]
Show that $c_{00}$ is a subspace of $\ell^2$. Find a sequence $(x_k)_{k\in\mathbb{N}}$ in $c_{00}$ such that
$\sum_{k=1}^{\infty} x_k = (1, \frac{1}{4}, \frac{1}{9}, \frac{1}{16}, \dots)$, which is an element of $\ell^2$ but not of $c_{00}$.

Exercise 3.54. (∗) Let $X$ be a normed space in which every series $\sum_{n=1}^{\infty} a_n$ for which there holds
\[ \sum_{n=1}^{\infty} \|a_n\| < +\infty \]
is convergent in $X$. Prove that $X$ is complete. Hint: Given a Cauchy sequence $(x_n)_{n\in\mathbb{N}}$, construct a
subsequence $(x_{n_k})_{k\in\mathbb{N}}$ satisfying $\|x_{n_{k+1}} - x_{n_k}\| < \frac{1}{2^k}$. Then take $a_1 = x_{n_1}$, $a_2 = x_{n_2} - x_{n_1}$,
$a_3 = x_{n_3} - x_{n_2}$, and so on, and use the fact that a Cauchy sequence possessing a convergent
subsequence must itself be convergent, which was a result established in Exercise 2.25.

3.3. Notes (not part of the course)


In connection with the divergence of the harmonic series, we mention the following

Erdős conjecture on arithmetic progressions (APs): If the sum of the reciprocals of the numbers
of a set $A$ of natural numbers diverges, then $A$ contains arbitrarily long arithmetic progressions.
That is, if $\sum_{n\in A} \frac{1}{n}$ diverges, then $A$ contains APs of any given length.

We know that $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges, and in this case the claim is trivially true. It can be shown that
$\sum_{p \text{ prime}} \frac{1}{p}$ diverges. So one may ask: Does the claim hold in this special case? The answer is “Yes”,
and this is the Green-Tao Theorem proved in 2004. Terence Tao was awarded the Fields Medal in 2006,
among other things, for this result.


Chapter 4

Continuous functions

4.1. Continuity of functions from R to R


Recall that continuity is a “local” concept, and we have the following notion of the continuity of
a function at a point.

Definition 4.1. Let $I$ be an interval and let $c \in I$. A function $f : I \to \mathbb{R}$ is said to be continuous at $c$
if for every $\epsilon > 0$, there exists a $\delta > 0$ such that whenever $x \in I$ satisfies $|x - c| < \delta$, we have
$|f(x) - f(c)| < \epsilon$.
$f$ is said to be continuous on $I$ if for every $c \in I$, $f$ is continuous at $c$.

We have seen that if f, g : R → R are continuous on R, then their composition f ◦ g : R → R


defined by
(f ◦ g)(x) = f (g(x)) (x ∈ R),
is also continuous on R.
Recall also that we had learnt the following important properties possessed by continuous
functions: they preserve convergent sequences, the Intermediate Value Theorem and the Extreme
Value Theorem.

Theorem 4.2. Let I be an interval, c ∈ I, and f : I → R. The following are equivalent:


(1) f is continuous at c.
(2) For every sequence (xn )n∈N contained in I such that (xn )n∈N converges to c, the sequence
(f (xn ))n∈N converges to f (c).

In other words, f is continuous at c if and only if f “preserves” convergent sequences.


Exercise 4.3. Show that the statement (2) in Theorem 4.2 can be weakened to the following:
(2′ ) For every sequence (xn )n∈N contained in I such that (xn )n∈N converges to c, the sequence
(f (xn ))n∈N converges.

Theorem 4.4 (Intermediate Value Theorem). Let f : [a, b] → R be continuous on [a, b]. If y ∈ R
is such that f (a) ≤ y ≤ f (b) or f (b) ≤ y ≤ f (a) (that is, if y lies between f (a) and f (b)), then
there exists a c ∈ [a, b] such that f (c) = y.

In other words, a continuous function attains all real values between the values of the function
attained at the endpoints.
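The Intermediate Value Theorem only asserts that a point $c$ with $f(c) = y$ exists, but repeatedly halving the interval gives a simple way to approximate one. The following bisection routine is a sketch under the theorem's hypotheses ($f$ continuous on $[a, b]$, $y$ between $f(a)$ and $f(b)$); the function name, tolerance and test function are illustrative choices, not part of the notes.

```python
def bisect(f, a, b, y, tol=1e-12):
    """Approximate some c in [a, b] with f(c) = y.

    Assumes f is continuous on [a, b] and y lies between f(a) and f(b).
    """
    fa, fb = f(a) - y, f(b) - y
    if fa * fb > 0:
        raise ValueError("y does not lie between f(a) and f(b)")
    while b - a > tol:
        mid = (a + b) / 2
        fm = f(mid) - y
        # Keep the half-interval on which f - y still changes sign
        # (or vanishes at an endpoint).
        if fa * fm <= 0:
            b, fb = mid, fm
        else:
            a, fa = mid, fm
    return (a + b) / 2


# Hypothetical example: f(x) = x^3 + x on [0, 2] takes the values 0 and 10
# at the endpoints, so it must attain y = 5 somewhere in between.
c = bisect(lambda x: x ** 3 + x, 0.0, 2.0, 5.0)
print(c, c ** 3 + c)
```

The loop maintains the invariant that $f - y$ does not have the same strict sign at both endpoints, which is exactly the situation in which the theorem guarantees a solution inside the current interval.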
Finally, we recall the Extreme Value Theorem.


Theorem 4.5 (Extreme Value Theorem). Let f : [a, b] → R be continuous on [a, b]. There exists
a c ∈ [a, b] and there exists a d ∈ [a, b] such that

f (c) = sup{f (x) : x ∈ [a, b]},


f (d) = inf{f (x) : x ∈ [a, b]}.

We note that since c, d ∈ [a, b], we have f (c), f (d) ∈ {f (x) : x ∈ [a, b]}, and so the supremum
and infimum are in fact maximum and minimum, respectively:

f (c) = sup{f (x) : x ∈ [a, b]} = max{f (x) : x ∈ [a, b]},


f (d) = inf{f (x) : x ∈ [a, b]} = min{f (x) : x ∈ [a, b]}.

We observe that in our definition of continuity of a function at a point, the key idea is that:
“We are guaranteed that f (x) stays close to f (c) for all x close enough to c.”
But “closeness” is something we know not just in R but in the context of general metric spaces!
We will now learn that indeed continuity can in fact be defined in a quite abstract setting, when
we have maps between metric spaces. We will also gain insights into the above properties of
continuous functions when we study analogues of the above results in our more general setting.
Exercise 4.6. Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) = \begin{cases} x & \text{if } x \text{ is rational}, \\ -x & \text{if } x \text{ is irrational}. \end{cases} \]
Prove that $f$ is continuous only at 0. Hint: For every rational number, there is a sequence of irrational
numbers that converges to it, and for every irrational number, there is a sequence of rational numbers
that converges to it, by the results in Exercises 1.39 and 1.40.

Exercise 4.7. (∗) Every nonzero rational number $x$ can be uniquely written as $x = n/d$, where $n, d$
denote integers without any common divisors and $d > 0$. When $x = 0$, we take $d = 1$ and $n = 0$. Consider
the function $f : \mathbb{R} \to \mathbb{R}$ defined by
\[ f(x) = \begin{cases} 0 & \text{if } x \text{ is irrational}, \\ \dfrac{1}{d} & \text{if } x = \dfrac{n}{d} \text{ is rational}. \end{cases} \]
Prove that $f$ is discontinuous at every rational number, and continuous at every irrational number.
Hint: For an irrational number $x$, given any $\epsilon > 0$, and any interval $(N, N+1)$ containing $x$, show that
there are just finitely many rational numbers $r$ in $(N, N+1)$ for which $f(r) \ge \epsilon$. Use this to show the
continuity at irrationals.

Exercise 4.8. Consider a flat pancake of arbitrary shape. Show that there is a straight line cut that
divides the pancake into two parts having equal areas. Can the direction of the straight line cut be chosen
arbitrarily?


4.2. Continuity of maps between metric spaces


Definition 4.9. Let $(X, d_X)$, $(Y, d_Y)$ be metric spaces, $c \in X$ and $f : X \to Y$ be a map. $f$ is said
to be continuous at $c$ if for every $\epsilon > 0$, there exists a $\delta > 0$ such that whenever $x \in X$ satisfies
$d_X(x, c) < \delta$, we have $d_Y(f(x), f(c)) < \epsilon$. See Figure 1.
$f$ is said to be continuous on $X$ if for every $c \in X$, $f$ is continuous at $c$.

Figure 1. Continuity of f at c.

First of all we notice that if $I$ is an interval in $\mathbb{R}$, and we take $X = I$, $Y = \mathbb{R}$, both equipped
with the Euclidean metric, then the above definition of continuity of a function $f : I \to \mathbb{R}$ at a
point $c \in I$ coincides with our earlier Definition 4.1.
We remark that although $X$ may be equal to $Y$, they might be equipped with different metrics;
see, as an extreme example, Exercise 4.11 below.
Exercise 4.10. Show that $f : \mathbb{R}^2 \to \mathbb{R}$ given by
\[ f(x) = \begin{cases} \dfrac{x_1 x_2}{\sqrt{x_1^2 + x_2^2}} & \text{if } x \ne 0, \\ 0 & \text{if } x = 0 \end{cases} \]
is continuous at 0.
Exercise 4.11. Let $f : \mathbb{R} \to \mathbb{R}$ be defined by
\[ f(x) = \begin{cases} 0 & \text{if } x \le 0, \\ 1 & \text{if } x > 0. \end{cases} \]
(1) If both the domain $X = \mathbb{R}$ and the codomain $Y = \mathbb{R}$ are equipped with the Euclidean metric,
then show that $f$ is not continuous at 0.
(2) On the other hand, if we equip the domain $X = \mathbb{R}$ of $f$ with the discrete metric, and the
codomain $Y = \mathbb{R}$ of $f$ with the Euclidean metric, then prove that $f$ is continuous at 0.
Exercise 4.12. Let (X, d) be a metric space, and let p ∈ X. Show that the distance to p is a continuous
map, that is, prove that the function f : X → R defined by f (x) := d(x, p) (x ∈ X) is continuous.
Exercise 4.13. (∗) Show that addition $(x, y) \mapsto x + y$ and multiplication $(x, y) \mapsto xy$ are continuous
maps from $\mathbb{R}^2$ to $\mathbb{R}$ with the usual Euclidean metrics.
Exercise 4.14. (∗) Consider the normed space $(C[0,1], \|\cdot\|_\infty)$, and let $S : C[0,1] \to C[0,1]$ be defined
by
\[ (S(f))(x) = (f(x))^2 \quad (x \in [0,1],\ f \in C[0,1]). \]
Show that $S$ is continuous.
Proposition 4.15. Let (X, dX ) be a metric space, and let fn : X → R (n ∈ N) be a sequence of
continuous functions that converges uniformly to f : X → R. Then f is continuous.

Proof. Let $c \in X$ and $\epsilon > 0$. Choose an $N \in \mathbb{N}$ such that for all $x \in X$, $|f_N(x) - f(x)| < \epsilon/3$.
As $f_N$ is continuous, we can find a $\delta > 0$ such that for all $x \in X$ satisfying $d_X(x, c) < \delta$, we have
$|f_N(x) - f_N(c)| < \epsilon/3$. So for all $x \in X$ satisfying $d_X(x, c) < \delta$, we have using the triangle
inequality that
\[ |f(x) - f(c)| \le |f(x) - f_N(x)| + |f_N(x) - f_N(c)| + |f_N(c) - f(c)| < \epsilon/3 + \epsilon/3 + \epsilon/3 = \epsilon. \]
Hence $f$ is continuous at $c$. Since the choice of $c \in X$ was arbitrary, it follows that $f$ is continuous
on $X$. □

Exercise 4.16. A subset $S$ of $\mathbb{R}^n$ is said to be path connected if for any two points $x, y \in S$, there
exists a continuous function $\gamma : [0,1] \to S$ such that $\gamma(0) = x$ and $\gamma(1) = y$. (We think of $\gamma$ as a “path”
beginning at $x$ and ending at $y$.)
(1) Show that every convex set$^1$ $C$ ($\subseteq \mathbb{R}^n$) is path connected.
(2) We define the relation $R$ on $S$ by setting $x \, R \, y$ if there is a path $\gamma : [0,1] \to S$ such that $\gamma(0) = x$
and $\gamma(1) = y$. Prove that $R$ is an equivalence relation on $S$. The equivalence classes of $S$ under
$R$ are called the path components of $S$. Thus if $S$ is path connected, then it has a unique path
component, namely $S$ itself.
(3) Which of the following subsets of $\mathbb{R}^2$ are path connected? If the set is not path connected,
then determine its path components. $\{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$, $\{(x, y) \in \mathbb{R}^2 : xy = 0\}$,
$\{(x, y) \in \mathbb{R}^2 : xy = 1\}$.

4.3. Continuous maps and open sets


We will now learn an important property of continuous functions, namely that “inverse images”
of open sets under a continuous map are open. In fact, we will see that this property is a
characterization of continuity.
But first we fix some standard notation. Let f : X → Y be a map, and let V ⊆ Y . Then we
set f −1 (V ) := {x ∈ X : f (x) ∈ V }, and call it the inverse image of V under f . See Figure 2.
Clearly f −1 (Y ) = X and f −1 (∅) = ∅.

Figure 2. Inverse image of V under f.

Exercise 4.17. Let $f : \mathbb{R} \to \mathbb{R}$ be given by $f(x) = \cos x$ ($x \in \mathbb{R}$). Find $f^{-1}(V)$, where $V = \{-1, 1\}$,
$V = \{1\}$, $V = [-1, 1]$, $V = \mathbb{R}$, $V = (-\frac{1}{2}, \frac{1}{2})$.

On the other hand if U ⊆ X, then we set f (U ) := {f (x) ∈ Y : x ∈ U }, and call it the image
of U under f . See Figure 3.

Figure 3. Image of U under f.

Exercise 4.18. Let f : R → R be given by f (x) = cos x (x ∈ R). Find f (U ), where U = R, U = [0, 2π],
U = [δ, δ + 2π] where δ is any positive number.
1See Exercise 1.41 for the definition of convex sets.


Theorem 4.19. Let (X, dX ), (Y, dY ) be metric spaces and f : X → Y be a map. Then f is
continuous on X if and only if for every V open in Y , f −1 (V ) is open in X.

Proof. (If) Let $c \in X$, and let $\epsilon > 0$. Consider the open ball $B(f(c), \epsilon)$ with center $f(c)$ and
radius $\epsilon$ in $Y$. We know that this open ball $V := B(f(c), \epsilon)$ is an open set in $Y$. Thus we also know
that $f^{-1}(V) = f^{-1}(B(f(c), \epsilon))$ is an open set in $X$. But the point $c \in f^{-1}(B(f(c), \epsilon))$, because
$f(c) \in B(f(c), \epsilon)$ ($d_Y(f(c), f(c)) = 0 < \epsilon$!). So by the definition of an open set, there is a $\delta > 0$
such that $B(c, \delta) \subseteq f^{-1}(B(f(c), \epsilon))$. In other words, whenever $x \in X$ satisfies $d_X(x, c) < \delta$, we
have that $x \in f^{-1}(B(f(c), \epsilon))$, that is, $f(x) \in B(f(c), \epsilon)$, which implies $d_Y(f(x), f(c)) < \epsilon$. Hence
$f$ is continuous at $c$. But the choice of $c \in X$ was arbitrary. Consequently $f$ is continuous on $X$.
See the picture on the left side of Figure 4.

Figure 4. Continuity and open sets: Proof of Theorem 4.19. Left panel: (If), with V := B(f(c), ε). Right panel: (Only if).

(Only if) Now suppose that $f$ is continuous, and let $V$ be an open subset of $Y$. We would like to
show that $f^{-1}(V)$ is open. So let $c \in f^{-1}(V)$. Then $f(c) \in V$. As $V$ is open, there is a small open
ball $B(f(c), \epsilon)$ with center $f(c)$ and radius $\epsilon$ that is contained in $V$. By the continuity of $f$ at $c$,
there is a $\delta > 0$ such that whenever $d_X(x, c) < \delta$, we have $d_Y(f(x), f(c)) < \epsilon$, that is, $f(x) \in V$.
But this means that $B(c, \delta) \subseteq f^{-1}(V)$. Indeed, if $x \in B(c, \delta)$, then $d_X(x, c) < \delta$ and so by the
above, $f(x) \in V$, that is, $x \in f^{-1}(V)$. Consequently, $f^{-1}(V)$ is open in $X$. See the picture on the
right side of Figure 4. □


Note that the theorem does not claim that for every U open in X, f (U ) is open in Y . Consider
for example X = Y = R equipped with the Euclidean metric, and the constant function f (x) = c
(x ∈ R). Then X = R is open in X = R, but f (X) = {c} is not open in Y = R.

Corollary 4.20. Let (X, dX ), (Y, dY ) be metric spaces and f : X → Y be a map. Then f is
continuous on X if and only if for every F closed in Y , f −1 (F ) is closed in X.

Proof. If F ⊆ Y , then f −1 (Y \ F ) = X \ (f −1 (F )). 


Exercise 4.21. Fill in the details of the proof of Corollary 4.20.

Theorem 4.22. Let $(X, d_X)$, $(Y, d_Y)$, $(Z, d_Z)$ be metric spaces, $f : X \to Y$ and $g : Y \to Z$ be
continuous maps. Then the composition map $g \circ f : X \to Z$, defined by $(g \circ f)(x) := g(f(x))$
($x \in X$), is continuous.

Proof. Let $W$ be open in $Z$. Then since $g$ is continuous, $g^{-1}(W)$ is open in $Y$. Also, since $f$
is continuous, $f^{-1}(g^{-1}(W))$ is open in $X$. Finally, we note that $(g \circ f)^{-1}(W) = f^{-1}(g^{-1}(W))$.
Consequently, $g \circ f$ is continuous. □


Exercise 4.23. In the proof of Theorem 4.22, we used the fact that (g ◦ f )−1 (W ) = f −1 (g −1 (W )). Check
this.
Exercise 4.24. Let X be a metric space and f : X → R be a continuous map. Determine if the following
statements are true or false. Justify your answers.
(1) {x ∈ X : f (x) < 1} is an open set.
(2) {x ∈ X : f (x) > 1} is an open set.
(3) {x ∈ X : f (x) = 1} is an open set.
(4) {x ∈ X : f (x) ≤ 1} is a closed set.
(5) {x ∈ X : f (x) = 1} is a closed set.
(6) {x ∈ X : f (x) = 1 or f (x) = 2} is a closed set.
(7) {x ∈ X : f (x) = 1} is a compact set.

Analogous to Theorem 4.2, we have the following characterization of continuous maps in terms
of convergence of sequences.

Theorem 4.25. Let $(X, d_X)$, $(Y, d_Y)$ be metric spaces, $c \in X$, and let $f : X \to Y$ be a map. Then
the following two statements are equivalent:
(1) $f$ is continuous at $c$.
(2) For every sequence $(x_n)_{n\in\mathbb{N}}$ in $X$ such that $(x_n)_{n\in\mathbb{N}}$ converges to $c$, $(f(x_n))_{n\in\mathbb{N}}$ converges
to $f(c)$.

Proof. (1) ⇒ (2): Suppose that $f$ is continuous at $c$. Let $(x_n)_{n\in\mathbb{N}}$ be a sequence in $X$ such
that $(x_n)_{n\in\mathbb{N}}$ converges to $c$. Let $\epsilon > 0$. Then there exists a $\delta > 0$ such that for all $x \in X$
satisfying $d_X(x, c) < \delta$, we have $d_Y(f(x), f(c)) < \epsilon$. As the sequence $(x_n)_{n\in\mathbb{N}}$ converges to $c$, for
this $\delta > 0$, there exists an $N \in \mathbb{N}$ such that whenever $n > N$, $d_X(x_n, c) < \delta$. But then by the
above, $d_Y(f(x_n), f(c)) < \epsilon$. So we have shown that for every $\epsilon > 0$, there is an $N \in \mathbb{N}$ such that
for all $n > N$, $d_Y(f(x_n), f(c)) < \epsilon$. In other words, the sequence $(f(x_n))_{n\in\mathbb{N}}$ converges to $f(c)$.

(2) ⇒ (1): Suppose that $f$ is not continuous at $c$. This means that there is an $\epsilon > 0$ such
that for every $\delta > 0$, there is an $x \in X$ such that $d_X(x, c) < \delta$, but $d_Y(f(x), f(c)) \ge \epsilon$. We
will use this statement to construct a sequence $(x_n)_{n\in\mathbb{N}}$ for which the conclusion in (2) does not
hold. Choose $\delta = 1/n$, and denote the corresponding $x$ as $x_n$: thus, $d_X(x_n, c) < \delta = 1/n$, but
$d_Y(f(x_n), f(c)) \ge \epsilon$. Clearly the sequence $(x_n)_{n\in\mathbb{N}}$ is convergent with limit $c$, but $(f(x_n))_{n\in\mathbb{N}}$ does
not converge to $f(c)$ since $d_Y(f(x_n), f(c)) \ge \epsilon$ for all $n \in \mathbb{N}$. Consequently if (1) does not hold,
then (2) does not hold. In other words, we have shown that (2) ⇒ (1). □
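The sequential characterization can be illustrated numerically. In the sketch below, the two test functions (squaring and the floor function), the point $c = 1$ and the sequence $x_n = 1 - 1/n$ are hypothetical choices made for illustration: the continuous map sends $x_n$ to values approaching $f(c)$, while the floor function, which is not continuous at 1, does not.

```python
import math


def square(x):
    """A continuous map R -> R."""
    return x * x


c = 1.0
# x_n = 1 - 1/n converges to c = 1 from below.
xs = [1.0 - 1.0 / n for n in range(1, 10001)]

# Continuous case: the distance |f(x_n) - f(c)| becomes small.
gap_square = abs(square(xs[-1]) - square(c))

# Discontinuous case: floor(x_n) = 0 for every n, while floor(c) = 1,
# so the distance |f(x_n) - f(c)| stays equal to 1.
gap_floor = abs(math.floor(xs[-1]) - math.floor(c))

print(gap_square, gap_floor)
```

A finite computation of this kind can only suggest a limit, of course; it is the argument in the proof of (2) ⇒ (1) that turns one bad sequence into a proof of discontinuity.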
Exercise 4.26. Let f : R → R be a continuous function such that for all x, y ∈ R, f (x + y) = f (x) + f (y).
Show that there exists a real number a such that for all x ∈ R, f (x) = ax. Hint: Show first that for
natural numbers n, f (n) = nf (1). Extend this to integers n, and then to rational numbers n/d. Finally
use the density of Q in R to prove the claim.
Exercise 4.27. Find all continuous functions $f : \mathbb{R} \to \mathbb{R}$ such that for all $x \in \mathbb{R}$, $f(x) + f(2x) = 0$.
Hint: Show that $f(x) = -f(\frac{x}{2}) = f(\frac{x}{4}) = -f(\frac{x}{8}) = \cdots$.

Exercise 4.28. Show that the multiplication function $f : \mathbb{R}^2 \to \mathbb{R}$ defined by $f(x) = x_1 x_2$ ($x \in \mathbb{R}^2$) is
continuous on $\mathbb{R}^2$ using the characterization of continuous functions in terms of preservation of convergent
sequences. Compare this with Exercise 4.13.
Exercise 4.29. (∗) Two metric spaces $X$ and $Y$ are called homeomorphic if there exists a continuous
bijection $f : X \to Y$ such that $f^{-1} : Y \to X$ is also continuous. For example, circles and triangles in $\mathbb{R}^2$
are homeomorphic; see Figure 5.

Figure 5. Homeomorphism between the triangle and the circle in R2.

Similarly, an open interval in $\mathbb{R}$ is homeomorphic to a circle missing one point in $\mathbb{R}^2$: we just “identify”
the two ends of the interval, as shown in the picture on the left hand side of Figure 6. Using this, one can
also see pictorially that the open interval in $\mathbb{R}$ is homeomorphic to $\mathbb{R}$, as shown in the picture on the right
hand side of Figure 6.

Figure 6. R is homeomorphic to an open interval in R.

A natural question is whether the continuity of $f^{-1}$ is actually implied by the continuity of a bijection
$f$. It is not, and the aim of this exercise is to show such an instructive example. Look at Figure 7, where
a continuous bijection $f$ is produced from the set $X := [0, 1)$ onto the triangle in $\mathbb{R}^2$ by “bending” the
interval at the three points $B, C, D$, stretching the parts between the segments $BC$ and $CD$, and sending
points $A, B, C, D$ to the points $f(A) = (1, 0)$, $f(B)$, $f(C)$, $f(D)$ as shown. But now prove that $f^{-1}$ can’t
be continuous by looking at the sequence of points $(1, (-1)^n/n)$ ($n \in \mathbb{N}$).

Figure 7. The triangle in R2 and the interval [0, 1).

Exercise 4.30. (∗) A “manifold” is a topological space that locally resembles the Euclidean space. More
precisely, we will call a subset $M$ of $\mathbb{R}^n$ a manifold (of dimension$^2$ $k$) if for every $x \in M$, there is an open
set $O_x$ containing $x$ such that $O_x$ is homeomorphic to an open subset $U$ of $\mathbb{R}^k$. Locally the surface of the
earth, which is a sphere, looks flat, and so we expect that the sphere in $\mathbb{R}^3$ is a manifold of dimension 2.
Give an argument, based on pictures, that the unit sphere
\[ S^2 := \{x \in \mathbb{R}^3 : \|x\|_2 = 1\} \]
is indeed a manifold of dimension 2. (This explains why one uses the superscript “2” on top of $S$ in the
(standard) notation for the unit sphere in $\mathbb{R}^3$. Similarly the circle $\{x \in \mathbb{R}^2 : \|x\|_2 = 1\}$ in $\mathbb{R}^2$ is
a manifold of dimension 1, and is denoted by $S^1$. More generally, it can be shown that the unit sphere
$S^{n-1} := \{x \in \mathbb{R}^n : \|x\|_2 = 1\}$ in $\mathbb{R}^n$ is a manifold of dimension $n - 1$.)

2It can be shown that this is a well-defined notion.


Exercise 4.31. (∗) Consider the function $f : \mathbb{R}^2 \to \mathbb{R}$ given by
\[ f(x) = \begin{cases} \dfrac{x_1^2 - x_2^2}{x_1^2 + x_2^2} & \text{if } x \ne 0, \\ c & \text{if } x = 0. \end{cases} \]
Show that no matter what $c \in \mathbb{R}$ we take in the above, $f$ is not continuous at 0.

Exercise 4.32. (∗) Show that the determinant function $M \mapsto \det M$ from $(\mathbb{R}^{n\times n}, \|\cdot\|_\infty)$ to $(\mathbb{R}, |\cdot|)$ is
continuous. Prove that the set of invertible matrices is open in $(\mathbb{R}^{n\times n}, \|\cdot\|_\infty)$. Hint: Consider $\det^{-1}(\{0\})$.
Exercise 4.33. Give an example of a continuous function f : X → Y , where X, Y are metric spaces, and
a Cauchy sequence (xn )n∈N for which (f (xn ))n∈N is not a Cauchy sequence in Y .
Exercise 4.34. Let X, Y be metric spaces. A map f : X → Y is said to be open if for every open subset
U of X, f (U ) is open in Y .
Now take X = R with the usual Euclidean metric, and take Y = R with the discrete metric. Consider
the identity map f : X → Y defined by f (x) = x (x ∈ R). Show that f is open, but for all x ∈ R, f is not
continuous at x.
Exercise 4.35. (∗) Define $f, g$ on $\mathbb{R}^2$ by $f(0, 0) = g(0, 0) = 0$, and if $(x, y) \ne (0, 0)$, then we set
\[ f(x, y) = \frac{xy^2}{x^2 + y^4}, \qquad g(x, y) = \frac{xy^2}{x^2 + y^6}. \]
(1) Show that $f$ is bounded on $\mathbb{R}^2$, that is, $\exists M \in \mathbb{R}$ such that for all $(x, y) \in \mathbb{R}^2$, $|f(x, y)| \le M$.
(2) Prove that $g$ is unbounded in every ball centered at $(0, 0)$.
(3) Show that $f$ is not continuous at $(0, 0)$.
(4) Prove that $g$ is not continuous at $(0, 0)$.
(5) If $Y \subseteq X$ and if $\Phi$ is a function defined on $X$, the restriction of $\Phi$ to $Y$ is the function $\varphi$ whose
domain is $Y$, and such that $\varphi(y) = \Phi(y)$ ($y \in Y$). Show that the restrictions of both $f$ and $g$ to
every straight line in $\mathbb{R}^2$ are continuous!

For functions from the Euclidean space Rn to the Euclidean space Rm , we have the following
simplification.

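Proposition 4.36. A function $f : \mathbb{R}^n \to \mathbb{R}^m$ is continuous if and only if each of its components
$f_1, \dots, f_m : \mathbb{R}^n \to \mathbb{R}$ is continuous. (Here $f_k(x) := e_k^\top f(x)$, $x \in \mathbb{R}^n$, where $e_1, \dots, e_m$ are the
standard basis vectors in $\mathbb{R}^m$.)

Proof. For each $k \in \{1, \dots, m\}$ and all $x, y \in \mathbb{R}^n$, we have
\[ |f_k(x) - f_k(y)| \le \sqrt{\sum_{i=1}^{m} |f_i(x) - f_i(y)|^2} = \|f(x) - f(y)\|_2. \]
The inequality shows that continuity of $f$ implies continuity of each $f_k$, and the equality shows that
if every $f_i$ is continuous, then so is $f$. □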

Another case when checking continuity becomes considerably simpler is in the case of linear
transformations between normed spaces.

Proposition 4.37. Let $(X, \|\cdot\|_X)$, $(Y, \|\cdot\|_Y)$ be normed spaces and let $T : X \to Y$ be a linear
transformation. Then the following are equivalent:
(1) $T$ is continuous.
(2) $T$ is continuous at 0.
(3) There exists an $M > 0$ such that for all $x \in X$, $\|Tx\|_Y \le M\|x\|_X$.

Proof. (1) ⇒ (2) follows from the definition. Let us show that (2) ⇒ (3). As $T$ is continuous
at 0, we have that given $\epsilon := 1 > 0$, there is a $\delta > 0$ such that whenever $\|x - 0\|_X = \|x\|_X < \delta$,
we have $\|Tx - T0\|_Y = \|Tx - 0\|_Y = \|Tx\|_Y < 1$. Define $M := 2/\delta$. Then if $x = 0$, we have
$\|Tx\|_Y = \|T0\|_Y = \|0\|_Y = 0 \le (2/\delta) \cdot 0 = M \cdot \|0\|_X = M\|x\|_X$. If on the other hand $x \ne 0$, then
define
\[ y := \frac{\delta}{2\|x\|_X} x. \]
Then $\|y\|_X = \delta/2 < \delta$ and so $\|Ty\|_Y < 1$. We have
\[ 1 > \|Ty\|_Y = \left\| T\left( \frac{\delta}{2\|x\|_X} x \right) \right\|_Y = \frac{\delta}{2\|x\|_X} \|Tx\|_Y, \]
and so upon rearranging, we obtain
\[ \|Tx\|_Y < \frac{2}{\delta}\|x\|_X = M\|x\|_X. \]
Consequently (3) holds.

Finally, we show that (3) ⇒ (1). Let $c \in X$, and $\epsilon > 0$. Choose $\delta = \epsilon/M > 0$. Then whenever
$\|x - c\|_X < \delta = \epsilon/M$, we have
\[ \|Tx - Tc\|_Y = \|T(x - c)\|_Y \le M\|x - c\|_X < M\delta = M\frac{\epsilon}{M} = \epsilon. \]
Hence $T$ is continuous at $c$. But the choice of $c \in X$ was arbitrary, and so $T$ is continuous. □

Example 4.38. Consider the map $I : C[a, b] \to \mathbb{R}$ from the normed space $(C[a, b], \|\cdot\|_\infty)$ to $\mathbb{R}$
given by
\[ I(f) = \int_a^b f(x)\,dx \quad (f \in C[a, b]). \]
Then clearly $I$ is a linear transformation. Moreover, since for every $f \in C[a, b]$ we have
\[ |I(f)| = \left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx \le \int_a^b \|f\|_\infty\,dx = \|f\|_\infty (b - a), \]
it follows from Proposition 4.37 that $I$ is continuous. ♦
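The bound $|I(f)| \le \|f\|_\infty (b - a)$ can be spot-checked numerically. This is a rough sketch only: the integral is approximated by a midpoint rule and the sup-norm by a maximum over a grid, and the interval $[0, 2]$, the step count and the three sample functions are arbitrary choices made here.

```python
import math


def integral(f, a, b, steps=10_000):
    """Midpoint-rule approximation of I(f), the integral of f over [a, b]."""
    h = (b - a) / steps
    return h * sum(f(a + (i + 0.5) * h) for i in range(steps))


def sup_norm(f, a, b, steps=10_000):
    """Grid approximation of ||f||_inf on [a, b]."""
    return max(abs(f(a + i * (b - a) / steps)) for i in range(steps + 1))


a, b = 0.0, 2.0
samples = [math.sin, math.exp, lambda x: x * x - 1.0]
for f in samples:
    lhs = abs(integral(f, a, b))
    rhs = sup_norm(f, a, b) * (b - a)
    print(lhs, "<=", rhs)
```

By linearity the same estimate applied to $f - g$ gives $|I(f) - I(g)| \le (b - a)\|f - g\|_\infty$, which is the form in which the bound expresses continuity of $I$.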

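Example 4.39. Let $A \in \mathbb{R}^{m\times n}$. Consider the map $T_A$ from the Euclidean space $\mathbb{R}^n$ to the
Euclidean space $\mathbb{R}^m$, given by matrix multiplication:
\[ T_A x = Ax \quad (x \in \mathbb{R}^n). \]
Then $T_A$ is a linear transformation. Moreover, by the Cauchy-Schwarz inequality,
\[ \|T_A x\|_2 = \|Ax\|_2 = \sqrt{\sum_{i=1}^{m} \Big( \sum_{j=1}^{n} a_{ij} x_j \Big)^2} \le \sqrt{\sum_{i=1}^{m} n\|A\|_\infty^2 \|x\|_2^2} = \sqrt{mn}\,\|A\|_\infty \|x\|_2. \]
Hence $T_A$ is continuous by Proposition 4.37. ♦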
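The constant $\sqrt{mn}\,\|A\|_\infty$ in Example 4.39 can be tested on random data. The sketch below assumes that $\|A\|_\infty$ is the largest absolute entry of $A$ (as in the estimate above); the matrix size, the random seed and the sampling ranges are arbitrary illustrative choices.

```python
import math
import random


def mat_vec(A, x):
    """Matrix-vector product for A stored as a list of rows."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def norm2(v):
    """Euclidean norm."""
    return math.sqrt(sum(t * t for t in v))


def max_norm(A):
    """Assumed meaning of ||A||_inf: the largest absolute entry."""
    return max(abs(a) for row in A for a in row)


random.seed(0)
m, n = 3, 4                                   # A has m rows and n columns
A = [[random.uniform(-1, 1) for _ in range(n)] for _ in range(m)]
M = math.sqrt(m * n) * max_norm(A)            # the constant from the example

worst = 0.0
for _ in range(1000):
    x = [random.uniform(-5, 5) for _ in range(n)]
    worst = max(worst, norm2(mat_vec(A, x)) / norm2(x))

print(worst, "<=", M)
```

The observed ratio is typically well below $M$; the example only needs some finite constant, not the best one.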
Exercise 4.40. Show that if $A \in \mathbb{R}^{n\times m}$, then $\ker A = \{x \in \mathbb{R}^m : Ax = 0\}$ is a closed subspace of $\mathbb{R}^m$.

Exercise 4.41. (∗) Prove that every subspace of Rn is closed. Hint: Construct a linear transformation
whose kernel is the given subspace.

Exercise 4.42. A metric space X is said to be connected if X is not the union of two disjoint non-empty
open sets.
Prove that a continuous image of a connected set is connected, that is, if X and Y are metric spaces
such that X is connected, and if f : X → Y is continuous, then f (X) is connected as a metric space (with
the metric induced from Y ).


4.3.1. Limits and continuity.


Definition 4.43. Let $U$ be an open subset of $\mathbb{R}^n$, $c \in U$, $L \in \mathbb{R}^m$ and $f : U \to \mathbb{R}^m$. Then we
write
\[ f(x) \xrightarrow{\ x\to c\ } L \quad \text{or} \quad \lim_{x\to c} f(x) = L \]
if for every $\epsilon > 0$, there exists a $\delta > 0$ such that whenever $x \in U$ satisfies $0 < \|x - c\|_2 < \delta$, we
have $\|f(x) - L\|_2 < \epsilon$. We then say that $f$ has a limit at $c$, and call $L$ its limit.

We can recast this definition in terms of sequences.


Theorem 4.44. Let $U$ be an open subset of $\mathbb{R}^n$, $c \in U$, $L \in \mathbb{R}^m$ and $f : U \to \mathbb{R}^m$. Then
$\lim_{x\to c} f(x) = L$ if and only if for every sequence $(x_n)_{n\in\mathbb{N}}$ contained in $U$ such that $x_n \ne c$ ($n \in \mathbb{N}$)
and $\lim_{n\to\infty} x_n = c$, we have $\lim_{n\to\infty} f(x_n) = L$.

Proof. (If) Suppose that $\lim_{x\to c} f(x) = L$ does not hold. Then there is an $\epsilon > 0$ such that for every
$\delta > 0$, there is a point $x \in U$ (depending on $\delta$), for which $0 < \|x - c\|_2 < \delta$, but $\|f(x) - L\|_2 \ge \epsilon$.
Taking $\delta$ successively to be $1/n$ ($n \in \mathbb{N}$), we can thus find a sequence $(x_n)_{n\in\mathbb{N}}$ in $U$ such that
$0 < \|x_n - c\|_2 < 1/n$ and $\|f(x_n) - L\|_2 \ge \epsilon$ ($n \in \mathbb{N}$). Then $x_n \ne c$ for all $n$ and $\lim_{n\to\infty} x_n = c$, but
the last condition means that $\lim_{n\to\infty} f(x_n) = L$ does not hold. This completes the proof of the “if part”.

(Only if) Suppose now that $\lim_{x\to c} f(x) = L$, and that $(x_n)_{n\in\mathbb{N}}$ is a sequence contained in $U$
such that $x_n \ne c$ ($n \in \mathbb{N}$) and $\lim_{n\to\infty} x_n = c$. Let $\epsilon > 0$. Then there exists a $\delta > 0$ such that whenever
$x \in U$ satisfies $0 < \|x - c\|_2 < \delta$, we have $\|f(x) - L\|_2 < \epsilon$. Also, there exists an $N \in \mathbb{N}$ such that
for all $n > N$, $0 < \|x_n - c\|_2 < \delta$. Consequently, for $n > N$ we have $\|f(x_n) - L\|_2 < \epsilon$. Hence
$\lim_{n\to\infty} f(x_n) = L$. □

Corollary 4.45. Let $U$ be an open subset of $\mathbb{R}^n$, $c \in U$ and $f : U \to \mathbb{R}^m$. If $f$ has a limit at $c$,
then it is unique.

This result follows from the above theorem and the fact that convergent sequences have a
unique limit.
Moreover, using the algebra of limits for real sequences, it follows also that the same sort of
results carry over to limits of real-valued functions.
Theorem 4.46. Let $U$ be an open subset of $\mathbb{R}^n$, $c \in U$. Suppose that $f, g : U \to \mathbb{R}$ and
\[ \lim_{x\to c} f(x) = L_f, \qquad \lim_{x\to c} g(x) = L_g, \]
where $L_f, L_g \in \mathbb{R}$. Define $f + g, fg : U \to \mathbb{R}$ by
\[ (f + g)(x) = f(x) + g(x), \qquad (fg)(x) = f(x)g(x) \quad (x \in U). \]
Then we have
(1) $\lim_{x\to c} (f + g)(x) = L_f + L_g = \lim_{x\to c} f(x) + \lim_{x\to c} g(x)$.
(2) $\lim_{x\to c} (fg)(x) = L_f L_g = \Big( \lim_{x\to c} f(x) \Big) \Big( \lim_{x\to c} g(x) \Big)$.


The following result is clear from the definitions.

Theorem 4.47. Let $U$ be an open subset of $\mathbb{R}^n$, $c \in U$. Then $f : U \to \mathbb{R}^m$ is continuous at $c$ if
and only if
\[ \lim_{x\to c} f(x) = f(c). \]

4.4. Compactness and continuity


In this section we will learn about a very useful result in Optimization Theory, on the existence
of global minimizers of real-valued continuous functions on compact sets.

Theorem 4.48. Let K be a compact subset of a metric space X, and let Y be a metric space.
Suppose that f : K → Y is a continuous function. Then f (K) is a compact subset of Y .

Proof. Suppose that $(y_n)_{n\in\mathbb{N}}$ is a sequence contained in $f(K)$. Then for each $n \in \mathbb{N}$, there
exists an $x_n \in K$ such that $y_n = f(x_n)$. Thus we obtain a sequence $(x_n)_{n\in\mathbb{N}}$ in the set $K$. As
$K$ is compact, there exists a convergent subsequence, say $(x_{n_k})_{k\in\mathbb{N}}$, with limit $L \in K$. As $f$ is
continuous, it preserves convergent sequences. So $(f(x_{n_k}))_{k\in\mathbb{N}} = (y_{n_k})_{k\in\mathbb{N}}$ is convergent with limit
$f(L) \in f(K)$. Consequently, $f(K)$ is compact. □

Now we prove the aforementioned result which turns out to be very useful in Optimization
Theory, namely that a real-valued continuous function on a compact set attains its
maximum/minimum. This is a generalization of the Extreme Value Theorem we had learnt earlier,
where the compact set in question was just the interval [a, b].

Theorem 4.49 (Weierstrass’s theorem). Let K be a nonempty compact subset of a metric space
X, and let f : K → R be a continuous function. Then there exists a c ∈ K such that
f (c) = sup{f (x) : x ∈ K}.

We note that since c ∈ K, f (c) ∈ {f (x) : x ∈ K}, and so the supremum above is actually a
maximum:
f (c) = sup{f (x) : x ∈ K} = max{f (x) : x ∈ K}.
Also, under the same hypothesis of the above result, there exists a minimizer in K, that is, there
exists a d ∈ K such that
f (d) = inf{f (x) : x ∈ K} = min{f (x) : x ∈ K}.
This follows from the above result by just looking at −f , that is by applying the above result to
the continuous function g : K → R given by
g(x) = −f (x) (x ∈ K).

Proof of Theorem 4.49. We know that the image of $K$ under $f$, namely the set $f(K)$, is compact
and hence bounded. So $\{f(x) : x \in K\}$ is bounded. It is also nonempty since $K$ is nonempty.
But by the least upper bound property of $\mathbb{R}$, a nonempty bounded subset of $\mathbb{R}$ has a least upper
bound. Thus $M := \sup\{f(x) : x \in K\} \in \mathbb{R}$. Now consider $M - 1/n$ ($n \in \mathbb{N}$). This number
cannot be an upper bound for $\{f(x) : x \in K\}$. So there must be an $x_n \in K$ such that
$f(x_n) > M - 1/n$. In this manner we get a sequence $(x_n)_{n\in\mathbb{N}}$ in $K$. As $K$ is compact, $(x_n)_{n\in\mathbb{N}}$
has a convergent subsequence $(x_{n_k})_{k\in\mathbb{N}}$ with limit, say $c$, belonging to $K$. As $f$ is continuous,
$(f(x_{n_k}))_{k\in\mathbb{N}}$ is convergent as well with limit $f(c)$. But from the inequalities $f(x_{n_k}) > M - 1/n_k$
($k \in \mathbb{N}$), it follows that $f(c) \ge M$. On the other hand, from the definition of $M$, we also have that
$f(c) \le M$. Hence $f(c) = M$. □


Example 4.50. Since the set $K = \{x \in \mathbb{R}^3 : x_1^2 + x_2^2 + x_3^2 = 1\}$ is compact in $\mathbb{R}^3$ and since the
function $x \mapsto x_1 + x_2 + x_3$ is continuous on $\mathbb{R}^3$, it follows that the optimization problem
\[ \begin{cases} \text{minimize} & x_1 + x_2 + x_3 \\ \text{subject to} & x_1^2 + x_2^2 + x_3^2 = 1 \end{cases} \]
has a minimizer. ♦
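Weierstrass's theorem guarantees a minimizer in Example 4.50 without identifying it. As a sanity check, one can sample the sphere at random and compare against the candidate $\hat{x} = -(1, 1, 1)/\sqrt{3}$, which the Cauchy-Schwarz inequality shows to be the minimizer with value $-\sqrt{3}$ (this candidate is supplied here for illustration and is not stated in the example). The sampling scheme, seed and sample size below are arbitrary choices.

```python
import math
import random


def objective(x):
    """The function x -> x_1 + x_2 + x_3 from Example 4.50."""
    return x[0] + x[1] + x[2]


def random_point_on_sphere(rng):
    """Normalise a Gaussian vector to obtain a point with ||x||_2 = 1."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        r = math.sqrt(sum(t * t for t in v))
        if r > 1e-12:              # avoid dividing by (almost) zero
            return [t / r for t in v]


rng = random.Random(0)
samples = [random_point_on_sphere(rng) for _ in range(20_000)]
best = min(samples, key=objective)

# Candidate minimizer (an assumption checked against the samples below).
candidate = [-1.0 / math.sqrt(3.0)] * 3

# The sampled minimum should be close to, but never below, -sqrt(3).
print(objective(best), objective(candidate))
```

Random sampling can only approach the minimum from above; it is the compactness argument that guarantees the infimum is actually attained.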

Remark 4.51. (∗) In Optimization Theory, one often meets necessary conditions for an optimal
solution, that is, results of the following form:
If $\hat{x}$ is an optimal solution to the optimization problem
\[ \begin{cases} \text{maximize} & f(x) \\ \text{subject to} & x \in F \ (\subseteq \mathbb{R}^n), \end{cases} \]
then $\hat{x}$ satisfies ∗ ∗ ∗.
(Where ∗ ∗ ∗ are certain mathematical conditions, such as the Lagrange multiplier equations.)
Now such a result has limited use as such since even if we find all $\hat{x}$(s) which satisfy ∗ ∗ ∗, we
can’t conclude that there is one that is optimal. But now suppose that we know that $f : F \to \mathbb{R}$
is continuous and that $F$ is compact. Then we know that an optimal solution exists, and so we
know that among the $\hat{x}$(s) that satisfy ∗ ∗ ∗, there is at least one which is an optimal solution.
Exercise 4.52. (∗) Let $N : \mathbb{R}^d \to \mathbb{R}$ be any norm on $\mathbb{R}^d$. The aim of this exercise is to show that $N$ is
“equivalent to”³ $\|\cdot\|_2$, that is, there are constants $M, m > 0$ such that
for all $x \in \mathbb{R}^d$, $m\|x\|_2 \le N(x) \le M\|x\|_2$.
(1) Let $e_1 = (1, 0, \dots, 0), \dots, e_d = (0, \dots, 0, 1)$ be the standard basis vectors in $\mathbb{R}^d$. Thus the vector
$x = (x_1, \dots, x_d) \in \mathbb{R}^d$ is the linear combination $x_1 e_1 + \cdots + x_d e_d$. Show, using the Cauchy-Schwarz
inequality, that there is a constant $M > 0$ such that for all $x \in \mathbb{R}^d$, $N(x) \le M\|x\|_2$.
(2) Prove using the triangle inequality for $N$ that for all $x, y \in \mathbb{R}^d$, $|N(x) - N(y)| \le N(x - y)$.
Conclude that the map $N : (\mathbb{R}^d, \|\cdot\|_2) \to \mathbb{R}$ is continuous.
(3) Consider the compact set $K := \{x \in \mathbb{R}^d : \|x\|_2 = 1\}$ and use Weierstrass’s theorem to prove the
existence of $m > 0$ such that for all $x \in \mathbb{R}^d$, $m\|x\|_2 \le N(x)$.
Exercise 4.53. In each case, give an example of a continuous function f : S → T , such that f (S) = T
or else explain why there can be no such f . (We use the usual topologies, for example (0, 1) in the first
part has the Euclidean topology of R.)
(1) S = (0, 1), T = (0, 1].
(2) S = (0, 1), T = (0, 1) ∪ (1, 2).
(3) S = R, T = Q.
(4) S = [0, 1] ∪ [2, 3], T = {0, 1}.
(5) S = [0, 1] × [0, 1], T = R².
(6) S = [0, 1] × [0, 1], T = (0, 1) × (0, 1).
(7) S = (0, 1) × (0, 1), T = R².
Exercise 4.54. (∗) Let $(X, d)$ be a metric space and let $f : X \to X$ be a function that satisfies
for all $x, y \in X$ such that $x \neq y$, $d(f(x), f(y)) < d(x, y)$. (4.1)
(1) Prove that $f$ has at most one fixed point (that is, a point $c \in X$ such that $f(c) = c$).
(2) Let $X = (0, \frac{1}{2})$ with the usual topology and consider the function $f : X \to X$ given by $f(x) = x^2$
for $x \in (0, \frac{1}{2})$. Show that $f$ satisfies (4.1), but it has no fixed point.
(3) Show that the function $g : X \to \mathbb{R}$ given by $g(x) = d(x, f(x))$ ($x \in X$) is continuous.
(4) Prove that if $X$ is compact, then $f$ has exactly one fixed point.
Hint: $g$ attains a minimum on $X$.

³The fuss about equivalent norms on a vector space $X$ is that whenever two norms $N_1$, $N_2$ are equivalent, the open
sets in $(X, N_1)$ coincide with the ones in $(X, N_2)$, and so as topological spaces, they are the same!


Exercise 4.55. Let X be a compact metric space and let f : X → Z be a continuous function. (Here Z
has the Euclidean topology induced from R.) Prove that f can assume only finitely many values.
Exercise 4.56. (∗)
(1) Show that [0, 1] and (0, 1) are homeomorphic when both spaces are equipped with the discrete
metric.
(2) Show that [0, 1] and (0, 1) are not homeomorphic when both spaces are equipped with the
Euclidean metric.

4.5. Uniform continuity


Definition 4.57. Let $(X, d_X)$, $(Y, d_Y)$ be metric spaces. A function $f : X \to Y$ is said to be uniformly
continuous if for every $\epsilon > 0$, there exists a $\delta > 0$ such that for all $x, y \in X$ satisfying $d_X(x, y) < \delta$,
we have $d_Y(f(x), f(y)) < \epsilon$.

Note that in the definition we are introducing the notion of uniform continuity of a function
on a set, and not at a point.
While checking continuity of a function on a set, we check its continuity at every point as
follows. Given a point c in X and an ε > 0, we need to find a ball of radius δ around c such that
the image of that ball under f is contained in a ball of radius ε around f (c). But in general, the
radius δ of the ball around c may depend on the c we choose, and it may not be possible to use
the same δ everywhere in X. For a uniformly continuous function, it is possible to choose a δ that
depends only on ε and not on the point c.
Proposition 4.58. Let (X, dX ), (Y, dY ) be metric spaces. If f : X → Y is uniformly continuous,
then f is continuous.

Proof. Let $c \in X$. Suppose that $\epsilon > 0$. By the uniform continuity of $f$, there exists a $\delta > 0$ such
that for all $x, y \in X$ satisfying $d_X(x, y) < \delta$, we have $d_Y(f(x), f(y)) < \epsilon$. In particular,
if $x \in X$ satisfies $d_X(x, c) < \delta$, we have $d_Y(f(x), f(c)) < \epsilon$. Thus $f$ is continuous at $c$. But the
choice of $c$ was arbitrary, and so $f$ is continuous on $X$.

The following example shows that uniform continuity is a strictly stronger notion than continuity,
that is, there are continuous functions that are not uniformly continuous.
Example 4.59. The function $f : (0, 1) \to \mathbb{R}$ given by
$$f(x) = \frac{1}{x} \quad (0 < x < 1)$$
is continuous on $(0, 1)$. Indeed, if $(x_n)_{n \in \mathbb{N}}$ is a convergent sequence in $(0, 1)$ with limit $L$, then
$(f(x_n))_{n \in \mathbb{N}} = (1/x_n)_{n \in \mathbb{N}}$ is a convergent sequence with limit $1/L = f(L)$.
However, $f$ is not uniformly continuous. Suppose it is. Then given $\epsilon > 0$, there exists a $\delta > 0$
such that whenever $|x - y| < \delta$, we have
$$\left| \frac{1}{x} - \frac{1}{y} \right| < \epsilon.$$
Consider $x = \frac{1}{n}$ and $y = \frac{1}{2n}$. Then $|x - y| = \frac{1}{2n}$, and so $|x - y| < \delta$ for all $n$ large enough, but
$$\left| \frac{1}{x} - \frac{1}{y} \right| = n > \epsilon,$$
for $n$ large enough. ♦
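
The two sequences used in Example 4.59 can be tabulated. This is a minimal sketch; the particular values of $n$ and the variable names are choices made here for illustration only.

```python
def f(x):
    """The function of Example 4.59 on (0, 1)."""
    return 1.0 / x

rows = []
for n in [10, 100, 1000, 10000]:
    x, y = 1.0 / n, 1.0 / (2 * n)
    # gap between the inputs, and gap between their images under f
    rows.append((n, abs(x - y), abs(f(x) - f(y))))

for n, gap, image_gap in rows:
    print(f"n={n:>6}  |x-y|={gap:.2e}  |f(x)-f(y)|={image_gap:.1f}")
```

The input gap shrinks like $1/(2n)$ while the output gap grows like $n$, so no single $\delta$ can serve a given $\epsilon$ on all of $(0, 1)$.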
Exercise 4.60. (∗) Show that $f : \mathbb{R} \to \mathbb{R}$ given by $f(x) = x^2$ ($x \in \mathbb{R}$) is continuous, but not uniformly
continuous. Hint: Consider $x = n$ and $y = n + \frac{1}{n}$ for large $n$.


Exercise 4.61. (∗) Show that $f : [0, \infty) \to \mathbb{R}$ given by $f(x) = \sqrt{x}$ ($x \ge 0$) is uniformly continuous.

Proposition 4.62. Let (X, dX ), (Y, dY ) be metric spaces, and suppose that X is compact. If
f : X → Y is continuous, then f is also uniformly continuous.

Proof. We give an argument by contradiction. So let us suppose that $f$ is not uniformly continuous.
Then there exists an $\epsilon > 0$ such that for every $\delta > 0$, there are some $x, y \in X$ such that
$d_X(x, y) < \delta$, but $d_Y(f(x), f(y)) \ge \epsilon$. In particular, if we take $\delta = 1/n$, then there exist $x_n, y_n \in X$
such that $d_X(x_n, y_n) < 1/n$ but $d_Y(f(x_n), f(y_n)) \ge \epsilon$. By using the compactness of $X$, and
considering subsequences if necessary, we may assume that $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ are convergent, with
limits say $x, y \in X$, respectively⁴. Since $d_X(x_n, y_n) < 1/n$, we obtain $d_X(x, y) \le 0$, and so $x = y$.
Also, by the continuity of $f$, we have $(f(x_n))_{n \in \mathbb{N}}$, $(f(y_n))_{n \in \mathbb{N}}$ converge respectively to $f(x)$, $f(y)$.
Hence $d_Y(f(x_n), f(y_n))$ converges to $d_Y(f(x), f(y)) = 0$ (as $x = y$!). But on the other hand, from
$d_Y(f(x_n), f(y_n)) \ge \epsilon$, we obtain $d_Y(f(x), f(y)) \ge \epsilon > 0$, a contradiction.
Exercise 4.63. Prove that the function f : R → R defined by f (x) = |x| (x ∈ R) is uniformly continuous.
Exercise 4.64. Let (X, d) be a metric space and let c ∈ X. Define f : X → R by f (x) = d(x, c). Prove
that f is uniformly continuous on X.
Exercise 4.65. A function $f : \mathbb{R}^n \to \mathbb{R}^m$ is said to be (globally) Lipschitz if there exists a number $L > 0$
such that for all $x, y \in \mathbb{R}^n$, $\|f(x) - f(y)\|_2 \le L\|x - y\|_2$. Prove that every Lipschitz function is uniformly
continuous.
Exercise 4.66. Let X, Y be metric spaces, and let f : X → Y be uniformly continuous. Show that
if (xn )n∈N is a Cauchy sequence in X, then (f (xn ))n∈N is a Cauchy sequence in Y . Compare this with
Exercise 4.33.

⁴Since $X$ is compact, the sequence $(x_n)_{n \in \mathbb{N}}$ has a convergent subsequence, say $(x_{n_k})_{k \in \mathbb{N}}$, converging to, say, $x \in X$.
Also, the sequence $(y_{n_k})_{k \in \mathbb{N}}$ has a convergent subsequence, say $(y_{n_{k_\ell}})_{\ell \in \mathbb{N}}$, converging to, say, $y \in X$.


Chapter 5

Differentiation

Given a function $f : (a, b) \to \mathbb{R}$, and a point $c \in (a, b)$, consider the difference quotient for
$x \in (a, b)$, $x \neq c$:
$$\frac{f(x) - f(c)}{x - c}.$$
Geometrically, this number represents the slope of the chord passing through the points $(c, f(c))$
and $(x, f(x))$ on the graph of $f$; see Figure 1.

Figure 1. The difference quotient.

Suppose that as $x$ goes to $c$, the difference quotients approach a number, say $L$, that is,
$$\lim_{x \to c} \frac{f(x) - f(c)}{x - c} = L.$$
In other words, for every $\epsilon > 0$, there exists a $\delta > 0$ such that whenever $x \in (a, b)$ satisfies
$0 < |x - c| < \delta$, we have
$$\left| \frac{f(x) - f(c)}{x - c} - L \right| < \epsilon.$$
Then we say that $f$ is differentiable at the point $c$. See Figure 2, where we see the geometric
interpretation of $L$: it is the slope of the tangent to the graph of $f$ at the point $(c, f(c))$. Notice also that
if we “zoom into” the graph of $f$ around the point $(c, f(c))$, the graph seems to coincide with the
tangent line. In other words, the tangent line is a “linear approximation” of $f$ near the point $c$.
The number $L$ is unique, and we denote this unique number by $f'(c)$, and call it the derivative
of $f$ at $c$. If $f$ is differentiable at every $c \in (a, b)$, then we say that $f$ is differentiable on $(a, b)$.

Theorem 5.1. Let f : (a, b) → R be differentiable at c ∈ (a, b). Then f is continuous at c.


Figure 2. The chords approach the tangent line as x approaches c.

Proof. Let $\epsilon > 0$. First choose a $\delta' > 0$ such that for all $x \in (a, b)$ such that $0 < |x - c| < \delta'$, we
have
$$\left| \frac{f(x) - f(c)}{x - c} - f'(c) \right| < 1.$$
Then rearrangement (using the triangle inequality) gives
$$|f(x) - f(c)| < (1 + |f'(c)|)\,|x - c|.$$
Now define $\delta := \min\left\{ \delta', \dfrac{\epsilon}{1 + |f'(c)|} \right\}$. Then for all $x \in (a, b)$ such that $0 < |x - c| < \delta$, we have
$$|f(x) - f(c)| < (1 + |f'(c)|)\,|x - c| \le (1 + |f'(c)|) \cdot \frac{\epsilon}{1 + |f'(c)|} = \epsilon.$$
Consequently $f$ is continuous at $c$.

The converse of the theorem is not true, and the following example demonstrates this.

Example 5.2. The function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(x) = |x|$ ($x \in \mathbb{R}$) is (uniformly) continuous
since
$$|f(x) - f(y)| = \big| |x| - |y| \big| \le |x - y|.$$
But let us now show that $f$ is not differentiable at 0. If it were, then given an $\epsilon > 0$, there exists
a $\delta > 0$ such that whenever $0 < |x| < \delta$, we have
$$\left| \frac{|x|}{x} - f'(0) \right| < \epsilon.$$
In particular, if we take $x = \delta/2$, then we obtain $|1 - f'(0)| < \epsilon$. If we take $x = -\delta/2$, we
also get $|-1 - f'(0)| < \epsilon$. As the choice of $\epsilon > 0$ was arbitrary, we obtain $|1 - f'(0)| \le 0$ and
$|-1 - f'(0)| \le 0$. But this means that $f'(0) = 1$ and $f'(0) = -1$, and so $1 = -1$, a contradiction.
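
The obstruction in Example 5.2 is visible in the one-sided difference quotients at 0. The snippet below is a small sketch; the helper name `difference_quotient` and the step sizes are assumptions made for this illustration.

```python
def difference_quotient(f, c, h):
    """(f(c + h) - f(c)) / h for a nonzero step h."""
    return (f(c + h) - f(c)) / h

steps = [10.0 ** (-k) for k in range(1, 8)]

# Approach 0 from the right (h > 0) and from the left (h < 0).
right = [difference_quotient(abs, 0.0, h) for h in steps]
left = [difference_quotient(abs, 0.0, -h) for h in steps]

print("from the right:", right)
print("from the left: ", left)
```

Every quotient from the right equals 1 and every quotient from the left equals −1, so no single number $L$ can be the limit.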

One can show the following result about rules for differentiating the sum and product of
differentiable functions.

Proposition 5.3. Let f, g : (a, b) → R be differentiable at c ∈ (a, b). Then:


(1) The sum f +g : (a, b) → R defined by (f +g)(x) = f (x)+g(x) (x ∈ (a, b)) is differentiable
at c, and (f + g)′ (c) = f ′ (c) + g ′ (c).
(2) The product f g : (a, b) → R defined by (f g)(x) = f (x) · g(x) (x ∈ (a, b)) is differentiable
at c, and (f g)′ (c) = f ′ (c)g(c) + f (c)g ′ (c).


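Proof. These claims follow from the algebra of limits, namely Theorem 4.46. Indeed we have
$$\lim_{x \to c} \frac{(f + g)(x) - (f + g)(c)}{x - c} = \lim_{x \to c} \frac{f(x) - f(c)}{x - c} + \lim_{x \to c} \frac{g(x) - g(c)}{x - c} = f'(c) + g'(c),$$
which proves (1). Also, (2) follows from the following:
$$\begin{aligned}
\lim_{x \to c} \frac{(fg)(x) - (fg)(c)}{x - c}
&= \lim_{x \to c} \frac{f(x)g(x) - f(c)g(x) + f(c)g(x) - f(c)g(c)}{x - c} \\
&= \lim_{x \to c} \frac{f(x) - f(c)}{x - c} \cdot g(x) + \lim_{x \to c} f(c) \cdot \frac{g(x) - g(c)}{x - c} \\
&= \lim_{x \to c} \frac{f(x) - f(c)}{x - c} \cdot \lim_{x \to c} g(x) + f(c) \lim_{x \to c} \frac{g(x) - g(c)}{x - c} \\
&= f'(c)g(c) + f(c)g'(c).
\end{aligned}$$
(Here $\lim_{x \to c} g(x) = g(c)$ because $g$, being differentiable at $c$, is continuous at $c$ by Theorem 5.1.)

This completes the proof.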

Example 5.4. The derivative of a constant function is clearly zero. It is also easy to see that if
$f$ is defined by $f(x) = x$ ($x \in \mathbb{R}$), then $f'(x) = 1$. Repeated application of (2) above shows that
the derivative of $x^n$ ($n \in \mathbb{N}$) is $nx^{n-1}$. Thus every polynomial function is differentiable. ♦
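
The "repeated application" in Example 5.4 can be spelled out as one induction step. The display below is a sketch of that step, assuming the power rule for the exponent $n$ as the induction hypothesis and using the product rule of Proposition 5.3(2):

```latex
\begin{align*}
\frac{d}{dx}\,x^{n+1}
  &= \frac{d}{dx}\bigl(x^{n}\cdot x\bigr) \\
  &= \Bigl(\frac{d}{dx}\,x^{n}\Bigr)\cdot x + x^{n}\cdot\frac{d}{dx}\,x
     && \text{(Proposition 5.3(2))} \\
  &= n x^{n-1}\cdot x + x^{n}\cdot 1
     && \text{(induction hypothesis)} \\
  &= (n+1)\,x^{n}.
\end{align*}
```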

Henceforth, we will take for granted the standard results on differentiating elementary functions
such as sin x that the student is familiar with from ordinary calculus.
Exercise 5.5.
(1) Show that if $f, g$ are real valued differentiable functions on an open interval $I$, then
$(fg)'(x) = f'(x)g(x) + f(x)g'(x)$, $x \in I$.
(2) Prove that if $f, g$ are infinitely many times differentiable functions on an open interval $I$, then
$$(fg)^{(n)}(x) = \sum_{k=0}^{n} \binom{n}{k} f^{(k)}(x)\, g^{(n-k)}(x), \quad x \in I.$$
(3) For a real number $x$ and $n$ a nonnegative integer, define
$$x^{[n]} := x(x - 1) \cdots (x - n + 1).$$
Show that if $x, y \in \mathbb{R}$, then
$$(x + y)^{[n]} = \sum_{k=0}^{n} \binom{n}{k} x^{[k]} y^{[n-k]}.$$
Hint: Differentiate $t^{x+y}$ $n$ times with respect to $t \in I := (0, \infty)$.

5.1. Mean Value Theorem


We say that f : (a, b) → R has a local minimum at c ∈ (a, b) if there exists a δ > 0 such that
whenever x ∈ (a, b) satisfies |x − c| < δ, we have f (x) ≥ f (c). In other words, “locally” around c,
the value assumed by f at c is the smallest. See Figure 3. Local maximizers are defined likewise.

Theorem 5.6. Let f : (a, b) → R be such that f has a local minimum at c ∈ (a, b), and f is
differentiable at c. Then f ′ (c) = 0.

An analogous result holds for a local maximizer.

Proof. Let $\delta > 0$ be such that $a < c - \delta < c < c + \delta < b$ and $f(x) \ge f(c)$ for $x$ satisfying
$|x - c| < \delta$. Given an $\epsilon > 0$, we can also ensure (by making $\delta$ smaller if required) that for all $x$
satisfying $0 < |x - c| < \delta$, we have
$$\left| \frac{f(x) - f(c)}{x - c} - f'(c) \right| < \epsilon.$$


Figure 3. The points P , Q and all points in the interior of the line segment AB are all local minimizers.

Hence we have for all $x$ satisfying $c < x < c + \delta$ that
$$0 - f'(c) \le \frac{f(x) - f(c)}{x - c} - f'(c) \le \left| \frac{f(x) - f(c)}{x - c} - f'(c) \right| < \epsilon.$$
(In order to obtain the first inequality, we have used the fact that $x - c > 0$ and $f(x) \ge f(c)$.)
Similarly, for $x$ satisfying $c - \delta < x < c$, we have
$$0 + f'(c) \le -\frac{f(x) - f(c)}{x - c} + f'(c) \le \left| \frac{f(x) - f(c)}{x - c} - f'(c) \right| < \epsilon.$$
Consequently, we have $|f'(c)| < \epsilon$. But the choice of $\epsilon > 0$ was arbitrary, and hence $f'(c) = 0$.

Theorem 5.7 (Mean-Value Theorem). Let $f : [a, b] \to \mathbb{R}$ be continuous on $[a, b]$ and differentiable
on $(a, b)$. Then there is a point $c \in (a, b)$ such that
$$\frac{f(b) - f(a)}{b - a} = f'(c).$$
The above result has a simple geometric interpretation. If we look at the chord AB in the
plane which joins the end points A ≡ (a, f (a)) and B ≡ (b, f (b)) of the graph of f , then there is
a point c ∈ (a, b), where the tangent to f at the point C ≡ (c, f (c)) is parallel to the chord AB.
See Figure 4.

Figure 4. Geometric meaning of the Mean Value Theorem.

Proof of Theorem 5.7. Define $\varphi : [a, b] \to \mathbb{R}$ by
$$\varphi(x) = (f(b) - f(a))x - (b - a)f(x), \quad x \in [a, b].$$


Then $\varphi$ is continuous on $[a, b]$, differentiable on $(a, b)$ and
$$\varphi(a) = (f(b) - f(a))a - (b - a)f(a) = f(b)a - bf(a) = (f(b) - f(a))b - (b - a)f(b) = \varphi(b).$$
Moreover, for $x \in (a, b)$, we have $\varphi'(x) = f(b) - f(a) - (b - a)f'(x)$, and so in order to prove the
theorem, it suffices to show that $\varphi'(c) = 0$ for some $c \in (a, b)$.
If $\varphi$ is constant, this holds for all $c \in (a, b)$. If $\varphi(x) < \varphi(a) = \varphi(b)$ for some $x \in (a, b)$, let $c$ be
a point in $[a, b]$ where $\varphi$ attains its minimum (Extreme Value Theorem!). Then since $\varphi(c) < \varphi(a) = \varphi(b)$,
we conclude that $c \in (a, b)$. By the necessary condition for a local minimizer (Theorem 5.6), we have $\varphi'(c) = 0$.
A similar argument applies if $\varphi(x) > \varphi(a) = \varphi(b)$ for some $x \in (a, b)$, choosing for $c$ a point of $[a, b]$ where $\varphi$ attains its maximum.
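
The Mean Value Theorem only asserts that a suitable $c$ exists. For a concrete function one can sometimes locate it numerically. The sketch below uses bisection, which needs extra assumptions that the theorem does not make: that $f'$ is continuous and that $f' - \text{slope}$ has opposite signs at the endpoints. The function name, the test function $f(x) = x^3$ on $[0, 2]$ and the tolerance are all choices made for this illustration.

```python
def mean_value_point(f, f_prime, a, b, tol=1e-12):
    """Approximate a point c in (a, b) with f'(c) = (f(b) - f(a)) / (b - a).

    Assumes (for this sketch) that f_prime is continuous and that
    f_prime - slope has opposite signs at a and b, so bisection applies.
    """
    slope = (f(b) - f(a)) / (b - a)
    g = lambda x: f_prime(x) - slope
    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("bisection needs a sign change of f' - slope")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # Keep the half-interval on which g still changes sign.
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# f(x) = x^3 on [0, 2]: the chord has slope 4, so we need 3 c^2 = 4.
c = mean_value_point(lambda x: x ** 3, lambda x: 3 * x ** 2, 0.0, 2.0)
print(c)
```

For this choice the chord slope is 4, so the computed $c$ should satisfy $3c^2 = 4$ up to the tolerance.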
Corollary 5.8 (Rolle’s theorem). Let f : [a, b] → R be continuous on [a, b] and differentiable on
(a, b). If f (a) = f (b), then there exists c ∈ (a, b) such that f ′ (c) = 0.
Exercise 5.9 (Cauchy’s theorem). If $f, g : [a, b] \to \mathbb{R}$ are continuous on $[a, b]$ and differentiable on $(a, b)$,
then show that there is a point $c \in (a, b)$ such that $(f(b) - f(a))g'(c) = (g(b) - g(a))f'(c)$.
Hint: Apply Rolle’s Theorem to $\varphi$ given by
$$\varphi(x) = \det \begin{bmatrix} f(x) & g(x) & 1 \\ f(a) & g(a) & 1 \\ f(b) & g(b) & 1 \end{bmatrix} \quad (x \in [a, b]).$$

Corollary 5.10. Suppose that f : (a, b) → R is differentiable on (a, b). Then:


(1) If f ′ (x) ≥ 0 for all x ∈ (a, b), then f is increasing.
(2) If f ′ (x) = 0 for all x ∈ (a, b), then f is constant.
(3) If f ′ (x) ≤ 0 for all x ∈ (a, b), then f is decreasing.

Proof. For each pair of numbers x1 , x2 ∈ (a, b), by the Mean Value Theorem, we have
f (x1 ) − f (x2 ) = f ′ (x)(x1 − x2 )
for some x between x1 and x2 . All the above conclusions can be read off from here. 

The Mean Value Theorem can be used to prove interesting inequalities; here is an example.
Example 5.11. Let us show that for all $x > 0$, $\sqrt{1 + x} < 1 + \frac{1}{2}x$.

Consider the function $f : [0, \infty) \to \mathbb{R}$ defined by $f(x) = \sqrt{1 + x}$. Then $f$ is continuous on
$[0, \infty)$ and differentiable on $(0, \infty)$. If $x > 0$, then applying the Mean Value Theorem to $f$ on the
interval $[0, x]$, we obtain the existence of a $c$ such that $0 < c < x$ and
$$\frac{f(x) - f(0)}{x - 0} = \frac{\sqrt{1 + x} - 1}{x} = f'(c) = \frac{1}{2\sqrt{1 + c}} < \frac{1}{2}.$$
Rearranging, we obtain the desired inequality. ♦
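
The argument of Example 5.11 can be checked numerically on a grid of positive $x$. This is only a sketch: the grid, the helper names and the closed form used to recover $c$ from the chord slope (solving $1/(2\sqrt{1+c}) = q$ for $c$) are assumptions of this illustration.

```python
import math

def chord_slope(x):
    """Difference quotient (f(x) - f(0)) / (x - 0) for f(x) = sqrt(1 + x)."""
    return (math.sqrt(1.0 + x) - 1.0) / x

def mean_value_point(x):
    """Solve 1 / (2 sqrt(1 + c)) = chord_slope(x) for c."""
    q = chord_slope(x)
    return 1.0 / (4.0 * q * q) - 1.0

xs = [0.001 * 2 ** k for k in range(30)]
for x in xs:
    c = mean_value_point(x)
    print(f"x={x:14.3f}  c={c:14.6f}  slope={chord_slope(x):.6f}")
```

In each row the recovered $c$ should lie strictly between 0 and $x$, and the chord slope should stay below $1/2$, in line with the inequality.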

If f has a derivative f ′ (x) at each x ∈ (a, b), then we can consider the derivative function,
namely the map f ′ given by x 7→ f ′ (x) on the interval (a, b). Suppose now that f ′ is itself
differentiable on (a, b). Then we may consider the derivative function f ′′ of f ′ . One can continue
in this manner (provided of course that each successive function obtained is again differentiable),
and obtain the functions
$$f', f'', f^{(3)}, \dots, f^{(n)},$$
each of which is the derivative of the previous one. $f^{(n)}$ is called the $n$th derivative, or the
derivative of order $n$, of $f$.
Example 5.12. All polynomials have derivatives of all orders, and eventually all high order
derivatives are the zero function. ♦
Exercise 5.13. Let $f : \mathbb{R} \to \mathbb{R}$ be such that for all $x, y \in \mathbb{R}$, $|f(x) - f(y)| \le (x - y)^2$. Prove that $f$ is
constant.


Exercise 5.14. Let $f : (a, b) \to \mathbb{R}$ be differentiable on $(a, b)$ and suppose that there is a number $M$ such
that for all $x \in (a, b)$, $|f'(x)| \le M$. Show that $f$ is uniformly continuous on $(a, b)$.
Exercise 5.15. Show that the cubic $2x^3 + 3x^2 + 6x + 10$ has exactly one real zero.
Exercise 5.16. Let $f : \mathbb{R} \to \mathbb{R}$. We call $x \in \mathbb{R}$ a fixed point of $f$ if $f(x) = x$.
(1) If $f$ is differentiable, and for all $x \in \mathbb{R}$, $f'(x) \neq 1$, then prove that $f$ has at most one fixed point.
(2) (∗) Show that if there is an $M < 1$ such that for all $x \in \mathbb{R}$, $|f'(x)| \le M$, then there is a fixed
point $x_*$ of $f$, and that $x_* = \lim_{n \to \infty} x_n$, where $x_1$ is arbitrary and $x_{n+1} = f(x_n)$ ($n \in \mathbb{N}$).
(3) Visualize the process described in part (2) above via the zig-zag path
$$(x_1, x_2) \longrightarrow (x_2, x_2) \longrightarrow (x_2, x_3) \longrightarrow (x_3, x_3) \longrightarrow (x_3, x_4) \longrightarrow \cdots.$$
(4) Prove that the function $f : \mathbb{R} \to \mathbb{R}$ defined by
$$f(x) = x + \frac{1}{1 + e^x} \quad (x \in \mathbb{R})$$
has no fixed point, although $0 < f'(x) < 1$ for all $x \in \mathbb{R}$. Is this a contradiction to the result in
part (2) above? Explain.
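
The iteration $x_{n+1} = f(x_n)$ from part (2) of Exercise 5.16 is easy to try out. The sketch below is not a solution to the exercise; the map $f(x) = \frac{1}{2}\cos x$ (for which $|f'(x)| \le \frac{1}{2}$), the starting points and the number of steps are hypothetical choices made here.

```python
import math

def f(x):
    """Illustrative map with |f'(x)| = 0.5 |sin x| <= 0.5 < 1."""
    return 0.5 * math.cos(x)

def iterate(f, x1, steps):
    """Return [x_1, x_2, ..., x_{steps+1}] with x_{n+1} = f(x_n)."""
    xs = [x1]
    for _ in range(steps):
        xs.append(f(xs[-1]))
    return xs

xs = iterate(f, x1=3.0, steps=60)
ys = iterate(f, x1=-10.0, steps=60)   # a different arbitrary start
x_star = xs[-1]
print(x_star, abs(f(x_star) - x_star), abs(ys[-1] - x_star))
```

Both runs should settle on the same point, which is (numerically) fixed by $f$.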
Exercise 5.17. Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined by $f(0) = 0$ and for $x \neq 0$,
$$f(x) = x^2 \sin \frac{1}{x}.$$
Prove that $f$ is differentiable, but $f'$ is not continuous at 0.
Exercise 5.18. (∗) Find all functions $f : \mathbb{R} \to \mathbb{R}$ such that $f$ is differentiable on $\mathbb{R}$ and for all $x \in \mathbb{R}$ and
all $n \in \mathbb{N}$,
$$f'(x) = \frac{f(x + n) - f(x)}{n}.$$
Hint: Conclude that $f$ must be twice differentiable and calculate $f''(x)$.
Exercise 5.19. Show that for every real a, b ∈ R, | cos a − cos b| ≤ |a − b|.
Exercise 5.20. (∗) A function f : R → R is said to be convex if for all x, y ∈ R and all t ∈ (0, 1),
f ((1 − t)x + ty) ≤ (1 − t)f (x) + tf (y).
(1) Draw a picture and explain the geometric meaning of the inequality above.
(2) Show that if f is twice differentiable, then f is convex if f ′′ (x) ≥ 0 for all x ∈ R. Hint: If x < y,
then apply the Mean Value Theorem for f on the interval [x, (1 − t)x + ty] and [(1 − t)x + ty, y].
(3) Prove that if $f$ is a differentiable and convex function, then $f'$ is increasing. Hint: If $x < u < y$,
then using the convexity, derive the inequalities
$$\frac{f(u) - f(x)}{u - x} \le \frac{f(y) - f(x)}{y - x} \le \frac{f(y) - f(u)}{y - u},$$
and pass to the limits $u \searrow x$ and $u \nearrow y$.
(Combining this result with the result in part (2), we conclude that a twice differentiable
function is convex if and only if $f''(x) \ge 0$ for all $x \in \mathbb{R}$.)
(4) Prove that if $f$ is a differentiable, convex function and $f'(x_0) = 0$ for some $x_0 \in \mathbb{R}$, then $x_0$ is a
minimizer of $f$.
Exercise 5.21. Prove that not all the zeros of the polynomial $x^4 - \sqrt{7}\,x^3 + 4x^2 - \sqrt{22}\,x + 15$ are real.
Hint: Repeated application of Rolle’s Theorem.

5.2. Uniform convergence and differentiation


When we studied uniform convergence, we mentioned that interchanging limits is facilitated by
uniform convergence. An instance of this is the possibility of differentiating a uniformly convergent
series termwise, as shown in Corollary 5.23 below. This relies on the following result, which
in turn is an application of the Mean Value Theorem.


Proposition 5.22. Let $f_n : (a, b) \to \mathbb{R}$ ($n \in \mathbb{N}$) be a sequence of differentiable functions on $(a, b)$,
such that there exists a point $c \in (a, b)$ for which $(f_n(c))_{n \in \mathbb{N}}$ converges. If the sequence $(f_n')_{n \in \mathbb{N}}$
converges uniformly to $g$ on $(a, b)$, then $(f_n)_{n \in \mathbb{N}}$ converges uniformly to a differentiable function
$f$ on $(a, b)$, and moreover, $f'(x) = g(x)$ for all $x \in (a, b)$.

Proof. (∗) (You may skip reading this proof.) Let $\epsilon > 0$. Choose $N_1 \in \mathbb{N}$ such that for all
$m, n > N_1$ and all $x \in (a, b)$,
$$|f_m'(x) - f_n'(x)| < \min\left\{ \frac{\epsilon}{3}, \frac{\epsilon}{3(b - a)} \right\} \quad \text{and} \quad |f_m(c) - f_n(c)| < \frac{\epsilon}{3}.$$
Let $x \in (a, b)$. Apply the Mean Value Theorem to the function $f_m - f_n$ on the interval with the
endpoints $x, c$ to get
$$f_m(x) - f_n(x) = f_m(c) - f_n(c) + (x - c)(f_m'(y) - f_n'(y))$$
for some $y$ between $x, c$. Hence
$$|f_m(x) - f_n(x)| \le |f_m(c) - f_n(c)| + (b - a)|f_m'(y) - f_n'(y)| < \frac{2}{3}\epsilon < \epsilon.$$
Consequently, $(f_n)_{n \in \mathbb{N}}$ is uniformly convergent on $(a, b)$. Let $f : (a, b) \to \mathbb{R}$ be its limit. As each
$f_n$ is continuous, so is $f$.
To show that $f$ is differentiable at a point $x_0 \in (a, b)$, we apply the Mean Value Theorem once
again to $f_m - f_n$ on the interval with endpoints $x_0$ and $x \in (a, b)$, $x \neq x_0$. We obtain
$$(f_m(x) - f_n(x)) - (f_m(x_0) - f_n(x_0)) = (x - x_0)(f_m'(y) - f_n'(y))$$
for some $y$ between $x$ and $x_0$. Dividing by $x - x_0$ and taking absolute values, we get
$$\left| \frac{f_m(x) - f_m(x_0)}{x - x_0} - \frac{f_n(x) - f_n(x_0)}{x - x_0} \right| \le |f_m'(y) - f_n'(y)| < \frac{\epsilon}{3}.$$
Passing to the limit $m \to \infty$ yields, for all $n > N_1$,
$$\left| \frac{f(x) - f(x_0)}{x - x_0} - \frac{f_n(x) - f_n(x_0)}{x - x_0} \right| \le \frac{\epsilon}{3}.$$
Now choose $N_2 \in \mathbb{N}$ such that $|f_n'(x_0) - g(x_0)| < \epsilon/3$ for all $n > N_2$. Let $N := \max\{N_1, N_2\} + 1$ and
choose $\delta > 0$ such that $0 < |x - x_0| < \delta$ implies
$$\left| \frac{f_N(x) - f_N(x_0)}{x - x_0} - f_N'(x_0) \right| < \frac{\epsilon}{3}.$$
Then combining these inequalities, we get for $0 < |x - x_0| < \delta$ that
$$\left| \frac{f(x) - f(x_0)}{x - x_0} - g(x_0) \right| < \epsilon.$$
As the choice of $\epsilon > 0$ was arbitrary, it follows that $f$ is differentiable at $x_0$ and $f'(x_0) = g(x_0)$.

Corollary 5.23. Let $f_n : (a, b) \to \mathbb{R}$, $n \in \mathbb{N}$, be a sequence of differentiable functions such that
(1) $\displaystyle\sum_{n=1}^{\infty} f_n(c)$ converges for some $c \in (a, b)$, and
(2) $g := \displaystyle\sum_{n=1}^{\infty} f_n'$ is uniformly convergent on $(a, b)$.
Then $f := \displaystyle\sum_{n=1}^{\infty} f_n$ is uniformly convergent on $(a, b)$ and $f' = g$ on $(a, b)$.


Example 5.24. We will show that the exponential function $f(x) := e^x$ ($x \in \mathbb{R}$) satisfies the
differential equation
$$\frac{d}{dx} f(x) = f(x) \quad (x \in \mathbb{R}).$$
We have that
$$\sum_{n=0}^{\infty} \frac{1}{n!} \frac{d}{dx} x^n = \sum_{n=1}^{\infty} \frac{1}{n!}\, n x^{n-1} = \sum_{n=1}^{\infty} \frac{1}{(n - 1)!}\, x^{n-1} = \sum_{n=0}^{\infty} \frac{1}{n!}\, x^n$$
converges uniformly on each bounded interval. Thus, by Corollary 5.23, we obtain $f'(x) = f(x)$
as required. ♦
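
The termwise differentiation in Example 5.24 can be illustrated with partial sums. In this sketch the truncation order $N = 30$, the grid on $[-2, 2]$ and the function names are assumptions chosen for the illustration; `math.exp` serves as the reference value of $e^x$.

```python
import math

def partial_sum(x, N):
    """S_N(x) = sum_{n=0}^{N} x^n / n!."""
    term, total = 1.0, 1.0
    for n in range(1, N + 1):
        term *= x / n          # term is now x^n / n!
        total += term
    return total

def termwise_derivative(x, N):
    """d/dx S_N(x) = sum_{n=1}^{N} n x^(n-1) / n!, summed term by term."""
    total = 0.0
    for n in range(1, N + 1):
        total += n * x ** (n - 1) / math.factorial(n)
    return total

grid = [k / 10.0 for k in range(-20, 21)]   # the bounded interval [-2, 2]
N = 30
max_gap = max(abs(termwise_derivative(x, N) - math.exp(x)) for x in grid)
print(max_gap)
```

The termwise derivative of $S_N$ is $S_{N-1}$, and on this bounded interval it already agrees with $e^x$ to many digits.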
Exercise 5.25. (∗) What is the sum of the series $1 + 2x + 3x^2 + 4x^3 + \cdots$ for $x \in (-1, 1)$?

5.3. Derivative of maps from Rⁿ to Rᵐ


In order to differentiate functions whose domain is $\mathbb{R}^n$ (or an open subset of $\mathbb{R}^n$) and which take values
in $\mathbb{R}^m$, let us take another look at the familiar case when $n = m = 1$, and let us recast the old
definition in a manner that will naturally lend itself to extension to the case when $n$ or $m$ is greater than 1.
We had defined $f : (a, b) \to \mathbb{R}$ to be differentiable at $c \in (a, b)$ if
$$f'(c) := \lim_{x \to c} \frac{f(x) - f(c)}{x - c}$$
exists. Or equivalently,
$$\lim_{x \to c} \left| \frac{f(x) - f(c)}{x - c} - f'(c) \right| = 0.$$
This implies that the “remainder”
$$r(x) := f(x) - f(c) - f'(c)(x - c)$$
is small, in the sense that
$$\lim_{x \to c} \frac{|r(x)|}{|x - c|} = \lim_{x \to c} \frac{|f(x) - f(c) - f'(c)(x - c)|}{|x - c|} = \lim_{x \to c} \left| \frac{f(x) - f(c)}{x - c} - f'(c) \right| = 0.$$
Now consider the linear transformation $L : \mathbb{R} \to \mathbb{R}$ given by
$$L(h) = f'(c)h \quad (h \in \mathbb{R}).$$
Then we can write the remainder term as
$$r(x) = f(x) - f(c) - f'(c)(x - c) = f(x) - f(c) - L(x - c),$$
and we see that $r(x)$ measures how different $f(x) - f(c)$ is from the action of the linear transformation
$L$ on $x - c$. So we may view the number $f'(c)$ as describing a linear transformation $L$ so
that the remainder $r$ defined by $r(x) := f(x) - f(c) - f'(c)(x - c)$ has the property that
$$\lim_{x \to c} \frac{|r(x)|}{|x - c|} = 0.$$
This motivates the following definition.
Definition 5.26. Let $U \subseteq \mathbb{R}^n$ be open, $c \in U$ and $f : U \to \mathbb{R}^m$. Then we say that $f$ is
differentiable at $c$ if there exists a linear transformation $L : \mathbb{R}^n \to \mathbb{R}^m$ such that
$$\lim_{x \to c} \frac{\|f(x) - f(c) - L(x - c)\|_2}{\|x - c\|_2} = 0, \tag{5.1}$$
that is, for every $\epsilon > 0$, there exists a $\delta > 0$ such that whenever $x \in U$ satisfies $0 < \|x - c\|_2 < \delta$,
there holds that
$$\frac{\|f(x) - f(c) - L(x - c)\|_2}{\|x - c\|_2} < \epsilon.$$
Then $L$ is called the derivative of $f$ at $c$, and we write $f'(c) = L$.


The relation (5.1) can be expressed by saying that
$$f(x) - f(c) = f'(c)(x - c) + r(x),$$
where the remainder $r$ satisfies
$$\lim_{x \to c} \frac{\|r(x)\|_2}{\|x - c\|_2} = 0.$$
So we can interpret this as saying that $f'(c)$ is that linear transformation which has the property
that for $x$ close to $c$, $f(x) - f(c)$ is approximately equal to its action on $x - c$. Next we show that
in fact there can be only one such linear transformation.
Lemma 5.27. Let $U \subseteq \mathbb{R}^n$ be open, $c \in U$ and $f : U \to \mathbb{R}^m$ be differentiable at $c$. Then the
derivative of $f$ at $c$ is unique.

Proof. Suppose that $L_1$, $L_2$ are two linear transformations such that
$$\lim_{x \to c} \frac{\|f(x) - f(c) - L_1(x - c)\|_2}{\|x - c\|_2} = 0 = \lim_{x \to c} \frac{\|f(x) - f(c) - L_2(x - c)\|_2}{\|x - c\|_2}.$$
Thus given $\epsilon > 0$, we can choose a $\delta > 0$ such that whenever $0 < \|x - c\|_2 < \delta$, we have
$$\frac{\|f(x) - f(c) - L_1(x - c)\|_2}{\|x - c\|_2} < \epsilon, \qquad \frac{\|f(x) - f(c) - L_2(x - c)\|_2}{\|x - c\|_2} < \epsilon.$$
Hence by the triangle inequality, it follows that whenever $0 < \|x - c\|_2 < \delta$,
$$\frac{\|L_2(x - c) - L_1(x - c)\|_2}{\|x - c\|_2} < 2\epsilon,$$
that is, $\|L_2(x - c) - L_1(x - c)\|_2 \le 2\epsilon\|x - c\|_2$. Now given any nonzero $h \in \mathbb{R}^n$, define
$$x = c + \frac{\delta}{2\|h\|_2}\, h.$$
Then $0 < \|x - c\|_2 = \delta/2 < \delta$, and so, by the linearity of $L_1$ and $L_2$, $\|L_2 h - L_1 h\|_2 \le 2\epsilon\|h\|_2$. But the choice of $\epsilon > 0$ was
arbitrary, and so $L_2 h = L_1 h$ for all nonzero $h \in \mathbb{R}^n$. Thus $L_1 = L_2$.
Example 5.28. Let $A \in \mathbb{R}^{m \times n}$. Consider the map $T_A : \mathbb{R}^n \to \mathbb{R}^m$ given by $T_A x = Ax$ ($x \in \mathbb{R}^n$).
If $c \in \mathbb{R}^n$, then is $T_A$ differentiable at $c$? If so, then what is its derivative? The answers turn out
to be very simple. We note that for $x \in \mathbb{R}^n$, we have
$$T_A(x) - T_A(c) = T_A(x - c),$$
and so
$$\lim_{x \to c} \frac{\|T_A(x) - T_A(c) - T_A(x - c)\|_2}{\|x - c\|_2} = \lim_{x \to c} \frac{0}{\|x - c\|_2} = \lim_{x \to c} 0 = 0.$$
So $T_A$ is differentiable at $c \in \mathbb{R}^n$, and $T_A'(c) = T_A$! ♦
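
The identity behind Example 5.28 is that the remainder $T_A(x) - T_A(c) - T_A(x - c)$ vanishes identically. The sketch below checks this with exact rational arithmetic; the matrix and the two points are hypothetical data, and the helper names are introduced only for this illustration.

```python
from fractions import Fraction

def apply(A, x):
    """Matrix-vector product T_A x = A x for A given as a list of rows."""
    return [sum(a * t for a, t in zip(row, x)) for row in A]

def remainder(A, x, c):
    """r(x) = T_A(x) - T_A(c) - T_A(x - c), computed exactly."""
    diff = [xi - ci for xi, ci in zip(x, c)]
    return [u - v - w
            for u, v, w in zip(apply(A, x), apply(A, c), apply(A, diff))]

# Hypothetical data: a 2 x 3 matrix and two points of R^3.
A = [[Fraction(1), Fraction(-2), Fraction(3)],
     [Fraction(0), Fraction(5, 2), Fraction(-1)]]
c = [Fraction(1), Fraction(1, 3), Fraction(-4)]
x = [Fraction(7, 5), Fraction(0), Fraction(2)]

print(remainder(A, x, c))
```

Since the remainder is exactly zero for every $x$, the limit in Definition 5.26 holds trivially with $L = T_A$.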


Exercise 5.29. (∗) Let $f : \mathbb{R}^n \to \mathbb{R}$ be differentiable at $c \in \mathbb{R}^n$. Show that the new function $g : \mathbb{R}^n \to \mathbb{R}$,
defined by
$$g(x) := (f(x))^2 \quad (x \in \mathbb{R}^n)$$
is also differentiable at $c$. Hint: $(f(x))^2 - (f(c))^2 = (f(x) + f(c))(f(x) - f(c)) \approx 2f(c)f'(c)(x - c)$ for $x$
near $c$.
Exercise 5.30. Let $Q \in \mathbb{R}^{n \times n}$ be a symmetric matrix, that is, $Q = Q^\top$. Define $q : \mathbb{R}^n \to \mathbb{R}$ by
$q(x) = x^\top Q x$ ($x \in \mathbb{R}^n$). Prove that $q$ is differentiable at each $c \in \mathbb{R}^n$ and that $q'(c) : \mathbb{R}^n \to \mathbb{R}$ is given by
$q'(c)v = 2c^\top Q v$ ($v \in \mathbb{R}^n$).
Exercise 5.31. Consider the map $f : \mathbb{R}^n \to \mathbb{R}$ given by
$$f(x) = \|x\|_2^4 \quad (x \in \mathbb{R}^n).$$
Calculate $f'(c)$ for $c \in \mathbb{R}^n$. Hint: Use the results in Exercises 5.29 and 5.30.


5.4. Partial derivatives


Suppose that $U$ is an open subset of $\mathbb{R}^n$, and let $f : U \to \mathbb{R}^m$ be a function. Let the components
of $f$ be denoted by $f_1, \dots, f_m$. Thus for $i = 1, \dots, m$,
$$f_i(x) := e_i^\top f(x) \quad (x \in U),$$
where $e_1, \dots, e_m$ denote the standard basis vectors in $\mathbb{R}^m$, that is,
$$e_1 := \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad \dots, \quad e_m := \begin{bmatrix} 0 \\ \vdots \\ 0 \\ 1 \end{bmatrix}.$$
Let $c \in U$. If
$$\frac{\partial f_i}{\partial x_j}(c) := \lim_{x_j \to c_j} \frac{f_i(c_1, \dots, c_{j-1}, x_j, c_{j+1}, \dots, c_n) - f_i(c_1, \dots, c_{j-1}, c_j, c_{j+1}, \dots, c_n)}{x_j - c_j}$$
exists, then we call $\dfrac{\partial f_i}{\partial x_j}(c)$ the $(i, j)$th partial derivative of $f$ at $c$. Thus, we look only at the $i$th
component $f_i : U \to \mathbb{R}$, keep all the variables $x_1, \dots, x_{j-1}, x_{j+1}, \dots, x_n$ fixed, with values
$c_1, \dots, c_{j-1}, c_{j+1}, \dots, c_n$, respectively, and differentiate the function
$$x_j \mapsto f_i(c_1, \dots, c_{j-1}, x_j, c_{j+1}, \dots, c_n)$$
with respect to $x_j$ at $c_j$.

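Example 5.32. Let $f : \mathbb{R}^2 \to \mathbb{R}^2$ be defined by
$$f(x_1, x_2) = \begin{bmatrix} x_1^2 + x_2^2 \\ x_1 x_2 \end{bmatrix}.$$
Then we have
$$\frac{\partial f_1}{\partial x_1}(c_1, c_2) = 2c_1, \quad \frac{\partial f_1}{\partial x_2}(c_1, c_2) = 2c_2, \quad \frac{\partial f_2}{\partial x_1}(c_1, c_2) = c_2, \quad \frac{\partial f_2}{\partial x_2}(c_1, c_2) = c_1.$$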
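
The partial derivatives of Example 5.32 can be compared with central difference quotients, which vary one variable at a time exactly as in the definition. This is a sketch under stated assumptions: the step size `h`, the test point and the helper names are choices made for this illustration.

```python
def f(x):
    """The map of Example 5.32."""
    x1, x2 = x
    return [x1 ** 2 + x2 ** 2, x1 * x2]

def exact_jacobian(c):
    """Matrix of the partial derivatives computed in Example 5.32."""
    c1, c2 = c
    return [[2 * c1, 2 * c2],
            [c2, c1]]

def numerical_jacobian(f, c, h=1e-5):
    """Approximate each partial derivative by a central difference quotient."""
    m, n = len(f(c)), len(c)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        forward, backward = list(c), list(c)
        forward[j] += h       # move only the j-th variable
        backward[j] -= h
        f_plus, f_minus = f(forward), f(backward)
        for i in range(m):
            J[i][j] = (f_plus[i] - f_minus[i]) / (2 * h)
    return J

c = [1.5, -0.5]
J = numerical_jacobian(f, c)
E = exact_jacobian(c)
print(J)
print(E)
```

The two matrices should agree up to rounding error at the chosen point.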

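Theorem 5.33. Let $U$ be an open subset of $\mathbb{R}^n$, and $c \in U$. If $f : U \to \mathbb{R}^m$ is differentiable at $c$,
then all the partial derivatives of $f$ at $c$, namely,
$$\frac{\partial f_i}{\partial x_j}(c) \quad (i = 1, \dots, m;\ j = 1, \dots, n)$$
exist, and the matrix $[f'(c)]$ of the linear transformation $f'(c)$ with respect to the standard bases
for $\mathbb{R}^n$ and $\mathbb{R}^m$ is given by
$$[f'(c)] = \begin{bmatrix} \dfrac{\partial f_1}{\partial x_1}(c) & \cdots & \dfrac{\partial f_1}{\partial x_n}(c) \\ \vdots & & \vdots \\ \dfrac{\partial f_m}{\partial x_1}(c) & \cdots & \dfrac{\partial f_m}{\partial x_n}(c) \end{bmatrix}.$$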

Proof. Let $\epsilon > 0$. Then since $f$ is differentiable at $c$, there exists a $\delta > 0$ such that for all $x \in U$
such that $0 < \|x - c\|_2 < \delta$, we have
$$\frac{\|f(x) - f(c) - f'(c)(x - c)\|_2}{\|x - c\|_2} < \epsilon.$$


Let $x_j$ be such that $0 < |x_j - c_j| < \delta$. Define
$$x := (c_1, \dots, c_{j-1}, x_j, c_{j+1}, \dots, c_n).$$
Then $x - c = (0, \dots, 0, x_j - c_j, 0, \dots, 0) = (x_j - c_j)e_j$, and so $\|x - c\|_2 = |x_j - c_j|$. Consequently, for
these $x$,
$$\frac{\|f(x) - f(c) - f'(c)(x - c)\|_2}{\|x - c\|_2} < \epsilon.$$
Moreover, $f_i(x) - f_i(c) - (e_i^\top [f'(c)] e_j)(x_j - c_j) = e_i^\top (f(x) - f(c) - f'(c)(x - c))$, and so
$$|f_i(x) - f_i(c) - (e_i^\top [f'(c)] e_j)(x_j - c_j)| \le \|f(x) - f(c) - f'(c)(x - c)\|_2.$$
Hence for $x_j$ satisfying $0 < |x_j - c_j| < \delta$, we have
$$\left| \frac{f_i(c_1, \dots, c_{j-1}, x_j, c_{j+1}, \dots, c_n) - f_i(c_1, \dots, c_{j-1}, c_j, c_{j+1}, \dots, c_n)}{x_j - c_j} - e_i^\top [f'(c)] e_j \right| < \epsilon.$$
This completes the proof.


The above result says that for the derivative to exist, it is necessary that the partial derivatives
exist. Surprisingly, this is not a sufficient condition.

Example 5.34. Let $f : \mathbb{R}^2 \to \mathbb{R}$ be defined by $f(0, 0) = 0$ and for $(x_1, x_2) \neq (0, 0)$,
$$f(x_1, x_2) = \frac{x_1 x_2}{x_1^2 + x_2^2}.$$
We will show that although $\dfrac{\partial f}{\partial x_1}(0, 0)$ and $\dfrac{\partial f}{\partial x_2}(0, 0)$ exist, $f'(0, 0)$ doesn’t, that is, $f$ is not
differentiable at $(0, 0)$.
For $x_1 \neq 0$, we have $f(x_1, 0) - f(0, 0) = 0 - 0 = 0$, and so
$$\frac{\partial f}{\partial x_1}(0, 0) = \lim_{x_1 \to 0} \frac{f(x_1, 0) - f(0, 0)}{x_1 - 0} = \lim_{x_1 \to 0} 0 = 0.$$
Similarly, $\dfrac{\partial f}{\partial x_2}(0, 0) = 0$.
Thus all the partial derivatives of $f$ exist at $(0, 0)$. However, we will now show that $f'(0, 0)$
does not exist. Suppose that $f'(0, 0)$ does exist. Then by Theorem 5.33,
$$[f'(0, 0)] = \begin{bmatrix} \dfrac{\partial f}{\partial x_1}(0, 0) & \dfrac{\partial f}{\partial x_2}(0, 0) \end{bmatrix} = \begin{bmatrix} 0 & 0 \end{bmatrix}.$$
Let $\epsilon > 0$. Then there exists a $\delta > 0$ such that for all $x \in \mathbb{R}^2$ satisfying $0 < \|x - 0\|_2 < \delta$, we have
$$\frac{\|f(x) - f(0) - f'(0)(x - 0)\|_2}{\|x - 0\|_2} < \epsilon,$$
that is,
$$\frac{|x_1 x_2|}{(x_1^2 + x_2^2)\sqrt{x_1^2 + x_2^2}} < \epsilon.$$
For all $n \in \mathbb{N}$ large enough, with $x_1 := x_2 := 1/n$, we have that $0 < \|(x_1, x_2) - (0, 0)\|_2 = \sqrt{2}/n < \delta$,
and so there must hold
$$\frac{n}{2\sqrt{2}} = \frac{\frac{1}{n^2}}{\frac{2}{n^2} \cdot \frac{\sqrt{2}}{n}} = \frac{|x_1 x_2|}{(x_1^2 + x_2^2)\sqrt{x_1^2 + x_2^2}} < \epsilon,$$
for all large $n$, a contradiction. ♦
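
The blow-up in Example 5.34 can be observed directly. The following is a sketch; the helper `remainder_ratio` (which plugs in the only possible candidate derivative, the zero row from Theorem 5.33) and the sampled values of $n$ are assumptions of this illustration.

```python
import math

def f(x1, x2):
    """The function of Example 5.34."""
    if x1 == 0.0 and x2 == 0.0:
        return 0.0
    return x1 * x2 / (x1 ** 2 + x2 ** 2)

def remainder_ratio(x1, x2):
    """|f(x) - f(0) - 0| / ||x||_2, using the candidate derivative [0 0]."""
    return abs(f(x1, x2) - f(0.0, 0.0)) / math.hypot(x1, x2)

for n in [1, 10, 100, 1000]:
    t = 1.0 / n
    # along the diagonal x1 = x2 = 1/n the ratio equals n / (2 sqrt(2))
    print(n, remainder_ratio(t, t), n / (2 * math.sqrt(2)))

# Along the axes the function vanishes, which is why both partials are 0.
print(f(0.001, 0.0), f(0.0, 0.001))
```

The ratio grows without bound as the point approaches the origin along the diagonal, even though $f$ vanishes on both coordinate axes.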


Let U be an open subset of Rn . We say that f : U → R has a local minimum at c ∈ U if


there exists a δ > 0 such that whenever x ∈ U satisfies kx − ck2 < δ, we have f (x) ≥ f (c). Local
maximizers are defined analogously.
Corollary 5.35. Let U be an open subset of Rn . Let f : U → R be such that f has a local
minimum at c ∈ U , and f is differentiable at c. Then f ′ (c) = 0.

Proof. It is clear that if $c$ is a local minimizer for $f$, then each of the functions
\[ x_i \mapsto f(c_1, \dots, c_{i-1}, x_i, c_{i+1}, \dots, c_n) \]
has a local minimum at $c_i$, and so by the one variable result, we have
\[ \frac{\partial f}{\partial x_i}(c) = 0 \]
for each $i = 1, \dots, n$. Consequently, Theorem 5.33 yields $f'(c) = 0$. □

An analogous result holds for a local maximizer.


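Exercise 5.36. Let $\tau, \xi$ be real numbers such that $\tau - \xi^2 = 0$. Show that $f(x, t) := e^{\tau t + \xi x}$, $x, t \in (0, \infty)$, satisfies the diffusion equation
\[ \frac{\partial f}{\partial t}(x, t) - \frac{\partial^2 f}{\partial x^2}(x, t) = 0. \]
(Here $\frac{\partial^2 f}{\partial x^2}$ is used to denote the partial derivative with respect to $x$ of the map $(x, t) \mapsto \frac{\partial f}{\partial x}(x, t)$.)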
Exercise 5.37. In the subject of “Calculus of Variations”, the following type of optimization problem is studied:
\[ \text{minimize } f(x) := \int_a^b F(x(t), x'(t), t)\,dt, \]
where the integrand in the cost function is described by a function $F : \mathbb{R}^3 \to \mathbb{R}$ of three real variables $(\alpha, \beta, \gamma)$:
\[ (\alpha, \beta, \gamma) \;(\in \mathbb{R}^3) \longmapsto F(\alpha, \beta, \gamma) \;(\in \mathbb{R}), \]
and the (variable) function $x : [a, b] \to \mathbb{R}$ is such that $x(a) = y_a$ and $x(b) = y_b$. Hence we observe that this is an optimization problem in which the domain of the cost function is itself a set of functions. Figure 5 illustrates this.

Figure 5. Possible xs. (Horizontal axis marked at $a$ and $b$; vertical axis marked at $y_a$ and $y_b$.)

The fundamental result in Calculus of Variations is that if $x_*$ is an optimal solution to this problem, then it must satisfy the following “Euler-Lagrange equation”:
\[ \frac{\partial F}{\partial \alpha}(x_*(t), x_*'(t), t) - \frac{d}{dt}\left( \frac{\partial F}{\partial \beta}(x_*(t), x_*'(t), t) \right) = 0 \quad (t \in [a, b]). \]
Consider for example the following optimization problem which occurs in resource management. One
wants to maximize the profit
\[ f(x) := \int_0^T \big(P - a x(t) - b x'(t)\big)\, x'(t)\,dt \]

associated with a possible choice of operation x : [0, T ] → R over the time interval [0, T ] satisfying x(0) = 0
and x(T ) = Q. Here T, P, a, b, Q are given positive constants. Assuming that an optimal operation x∗
exists, find it using the Euler-Lagrange equation.
Exercise 5.38. (∗) Find all (global) minimizers of $f : \mathbb{R}^2 \to \mathbb{R}$ defined by $f(x_1, x_2) = x_1^4 - 12 x_1 x_2 + x_2^4$, $(x_1, x_2) \in \mathbb{R}^2$.
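Exercise 5.39. Verify that
\[ \frac{\partial^2 f}{\partial x \partial y}(x, y) = \frac{\partial^2 f}{\partial y \partial x}(x, y) \tag{5.2} \]
for all $(x, y) \in \mathbb{R}^2$, where the function $f$ is given by
\[ f(x, y) = 3x^3 + 9y^2 - 9x^3 y \quad ((x, y) \in \mathbb{R}^2). \]
Show that (5.2) does not hold at $(0, 0)$ if $f$ is given by
\[ f(x, y) = \begin{cases} xy\,\dfrac{x^2 - y^2}{x^2 + y^2} & \text{if } (x, y) \neq (0, 0), \\ 0 & \text{if } (x, y) = (0, 0). \end{cases} \]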

Exercise 5.40. Find the derivative of the multiplication map $(x, y) \mapsto xy : \mathbb{R}^2 \to \mathbb{R}$ at $(x_0, y_0)$ in $\mathbb{R}^2$.

5.5. Notes (not part of the course)


In connection with Theorem 5.1, we might wonder how badly behaved continuous functions might be
with respect to the notion of differentiability. It turns out that there are functions that are continuous
everywhere, but differentiable nowhere. One construction is that of the so-called blancmange function
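obtained by taking the basic sawtooth function $f_1$,

[Figure: graph of the sawtooth function $f_1$; the vertical axis is marked at $\frac{1}{2}$.]

and constructing $f_2, f_3, \dots$ by setting
\[ f_2(x) = \frac{1}{2} f_1(2x), \quad f_3(x) = \frac{1}{4} f_1(4x), \quad f_4(x) = \frac{1}{8} f_1(8x), \quad \dots, \quad f_n(x) = \left(\frac{1}{2}\right)^{n-1} f_1(2^{n-1} x), \quad \dots \]
and adding these:
\[ b(x) = \sum_{n=1}^{\infty} f_n(x), \quad x \in \mathbb{R}. \]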
Then it can be shown that b is continuous on R, but not differentiable at any x ∈ R.
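
As a rough illustration, the partial sums of the series defining $b$ can be computed directly. The sketch below rests on an assumption: the notes only show $f_1$ in a figure, so we take $f_1(x)$ to be the distance from $x$ to the nearest integer (a sawtooth with peak value $\frac{1}{2}$). The names `sawtooth` and `blancmange_partial` are ours.

```python
import math

def sawtooth(x):
    """Assumed f_1: distance from x to the nearest integer (peak value 1/2)."""
    return abs(x - math.floor(x + 0.5))

def blancmange_partial(x, terms):
    """Partial sum f_1(x) + ... + f_terms(x), where
    f_n(x) = (1/2)^(n-1) * f_1(2^(n-1) * x)."""
    return sum(sawtooth(2 ** (n - 1) * x) / 2 ** (n - 1)
               for n in range(1, terms + 1))

# The n-th term is at most (1/2)^n, so the partial sums converge uniformly;
# this is the reason b is continuous even though it is nowhere differentiable.
for x in (0.5, 1 / 3, 0.2):
    print(x, [blancmange_partial(x, t) for t in (1, 5, 30)])
```

Under this assumption, at $x = \frac{1}{3}$ every $f_1(2^{n-1}x)$ equals $\frac{1}{3}$, so the partial sums approach $\frac{1}{3}\left(1 + \frac{1}{2} + \frac{1}{4} + \cdots\right) = \frac{2}{3}$.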

We have seen in Example 5.34 that even though all the partial derivatives exist at a point, the function
may not be differentiable at that point. However, the following result says that if the partial derivatives
are continuous in a neighbourhood of the point, then the function is differentiable in that neighbourhood.
Here is the precise statement of the result, a proof of which can be found in Rudin [R].


Theorem 5.41. Suppose that $f$ maps an open set $U \subseteq \mathbb{R}^n$ into $\mathbb{R}^m$. If all the partial derivatives
\[ \frac{\partial f_i}{\partial x_j}(x) \quad (i = 1, \dots, m;\ j = 1, \dots, n) \]
exist at each $x \in U$ and the maps
\[ x \mapsto \frac{\partial f_i}{\partial x_j}(x) : U \to \mathbb{R} \quad (i = 1, \dots, m;\ j = 1, \dots, n) \]
are continuous on $U$, then $f$ is continuously differentiable on $U$.

(In the above the phrase “continuously differentiable on U ” means that at each x ∈ U , f ′ (x) exists
and the map x 7→ f ′ (x) : U → L(Rn ; Rm ) is continuous on U . Here L(Rn ; Rm ) denotes the set of all linear
transformations from Rn to Rm .)
Also, in Exercise 5.39, we saw a real function for which the order of taking partial derivatives mattered.
The following result gives a sufficient condition for it to be irrelevant.
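Theorem 5.42. If $U \subseteq \mathbb{R}^n$ and $f : U \to \mathbb{R}$ is twice continuously differentiable (that is, $f''(x)$ exists at each $x \in U$ and the map $x \mapsto f''(x)$ is continuous on $U$), then
\[ \frac{\partial^2 f}{\partial x_i \partial x_j}(x) = \frac{\partial^2 f}{\partial x_j \partial x_i}(x) \quad (1 \le i, j \le n,\ x \in U). \]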


Chapter 6

Epilogue: integration

One traditional topic in real analysis that we haven’t covered yet in these notes is Integration Theory. There are two important types of integral: the Riemann integral and the Lebesgue integral. Riemann integration will be covered in the course MA212 Further Mathematical Methods, and it has the advantage that it is intuitive and easy to follow. However, it is not amenable to certain natural limiting processes. For example, the functional analogue of the Euclidean space $\mathbb{R}^n$, namely the space $C[a, b]$ equipped with the norm
\[ \|f\|_2 := \sqrt{\int_a^b |f(x)|^2\,dx} \quad (f \in C[a, b]), \]
is not complete, which is awkward when one wants to deal with applications. One can, however, introduce a more general integral called the Lebesgue integral, which rescues this situation. The interested reader is referred to the book by Rudin [R] for these matters. In this last chapter, we study the foundations of Riemann integration in the case of a function $f : [a, b] \to \mathbb{R}$. We study the definition and elementary properties, and finish by establishing the Fundamental Theorem of Integral Calculus.

6.1. Motivation and definition


It is a basic problem in geometry to calculate the area under a curve, which arises from concrete
practical applications.
Given a “nice” function f : [a, b] → R, determine the area under the graph of f . Pictorially,
find the area of the shaded region in Figure 1.

Figure 1. What is the area under the graph of f? (The region lies over the interval from a to b.)

Of course if the function were a simple function, like a constant function taking value c
everywhere, it would be easy to calculate the area. See Figure 2.

Figure 2. Area under the graph of f is c · (b − a). (Rectangle ABCD with base AB over [a, b]; the graph of f is the top side DC.)

Indeed it would just be the area of the rectangle ABCD, that is,
f (a) · (b − a) = f (b) · (b − a) = c · (b − a).

But how do we define the area in the case when f is not constant and the graph of f has a
shape which curves? Well, if there were numbers m, M such that m ≤ f (x) ≤ M for all x ∈ [a, b],
surely we must have that the area under the graph of f , say A(f ), is flanked by the areas of the
two rectangles ABCD and ABC ′ D′ , as shown in Figure 3, that is,
m · (b − a) ≤ A(f ) ≤ M · (b − a).

Figure 3. Bounds/estimates for the area under the graph of f. (Rectangles ABCD and ABC′D′, both with base AB over [a, b].)

This gives us the idea that we can estimate the area A(f ) by breaking it into little rectangles;
see Figure 4.

Figure 4. Calculation of the area under the graph of f. (Little rectangles over subintervals of [a, b].)

In order to make this rigorous, we introduce the notion of a partition and that of upper/lower
sum associated with a partition and a bounded function f : [a, b] → R.
Definition 6.1 (Partition). A partition $P$ of $[a, b]$ is a finite set $P = \{x_0, x_1, \dots, x_{n-1}, x_n\}$ such that $a = x_0 < x_1 < \cdots < x_{n-1} < x_n = b$. The set of all partitions $P$ of $[a, b]$ will be denoted by $\mathcal{P}$.


Definition 6.2 (Upper/lower sum). Let $f : [a, b] \to \mathbb{R}$ be a bounded function and let $P = \{x_0, x_1, \dots, x_n\}$ be a partition of $[a, b]$.

The upper sum $U(f, P)$ of $f$ associated with $P$ is defined to be
\[ U(f, P) := \sum_{k=0}^{n-1} \left( \sup_{x \in [x_k, x_{k+1}]} f(x) \right) \cdot (x_{k+1} - x_k). \]
(In other words, each term is the area of the rectangle with the segment $[x_k, x_{k+1}]$ as its base which is the shortest rectangle containing the graph of the function $f$ in the interval $[x_k, x_{k+1}]$, and $U(f, P)$ is the sum of these areas.)

The lower sum $L(f, P)$ of $f$ associated with $P$ is defined to be
\[ L(f, P) := \sum_{k=0}^{n-1} \left( \inf_{x \in [x_k, x_{k+1}]} f(x) \right) \cdot (x_{k+1} - x_k). \]
(In other words, each term is the area of the rectangle with the segment $[x_k, x_{k+1}]$ as its base which is the tallest rectangle contained in the region under the graph of the function $f$ in the interval $[x_k, x_{k+1}]$, and $L(f, P)$ is the sum of these areas.)

Clearly we expect that the actual area $A(f)$ is bounded above by any upper sum, and hence by the infimum of all upper sums. Similarly, $A(f)$ is bounded below by any lower sum, and hence by the supremum of all lower sums. Thus we expect that
\[ \sup_{P \in \mathcal{P}} L(f, P) \le A(f) \le \inf_{P \in \mathcal{P}} U(f, P). \]
Of course as the partitions get finer, we expect that for a nice function $f$ the two bounds get close to each other, since they both approximate $A(f)$ rather well. This motivates the following definition.
Exercise 6.3.
(1) Show that for any bounded function $f : [a, b] \to \mathbb{R}$, and any partitions $P, P'$ of $[a, b]$, there holds that $U(f, P) \ge L(f, P')$.
Hint: Consider a “refinement” partition that contains both the partitions $P$ and $P'$.
(2) Prove that $\displaystyle \inf_{P \in \mathcal{P}} U(f, P) \ge \sup_{P \in \mathcal{P}} L(f, P)$.

Definition 6.4. A bounded function $f : [a, b] \to \mathbb{R}$ is called Riemann integrable if
\[ \sup_{P \in \mathcal{P}} L(f, P) = \inf_{P \in \mathcal{P}} U(f, P). \]
We denote the common value by $\displaystyle \int_a^b f(x)\,dx$, and call it the Riemann integral of $f$.

Here is an example of a Riemann integrable function.


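Example 6.5. Consider the bounded function $f : [0, 1] \to \mathbb{R}$ given by $f(x) = x^2$, $x \in [0, 1]$. We will show that $f$ is Riemann integrable and that
\[ \int_0^1 f(x)\,dx = \frac{1}{3}. \]
Rather than considering all partitions, it turns out that one can be somewhat more efficient by just considering the special partitions
\[ P_* := \left\{ 0, \frac{1}{n}, \frac{2}{n}, \frac{3}{n}, \dots, \frac{n-1}{n}, 1 \right\}, \]
which will be used to prove our claim. Since $f$ is increasing on $[0, 1]$, its supremum on each subinterval is attained at the right endpoint and its infimum at the left endpoint. Clearly,
\[ U(f, P_*) = \sum_{k=0}^{n-1} \left( \frac{k+1}{n} \right)^2 \frac{1}{n} = \frac{1^2 + 2^2 + 3^2 + \cdots + n^2}{n^3} = \frac{n \cdot (n+1) \cdot (2n+1)}{6 n^3} = \frac{(1 + \frac{1}{n})(2 + \frac{1}{n})}{6}. \]
Thus,
\[ \inf_{P \in \mathcal{P}} U(f, P) \le \inf_{n \in \mathbb{N}} U(f, P_*) = \lim_{n \to \infty} \frac{(1 + \frac{1}{n})(2 + \frac{1}{n})}{6} = \frac{1}{3}. \]
Similarly,
\[ L(f, P_*) = \sum_{k=0}^{n-1} \left( \frac{k}{n} \right)^2 \frac{1}{n} = \frac{1^2 + 2^2 + 3^2 + \cdots + (n-1)^2}{n^3} = \frac{(1 - \frac{1}{n})(2 - \frac{1}{n})}{6}. \]
Hence,
\[ \sup_{P \in \mathcal{P}} L(f, P) \ge \sup_{n \in \mathbb{N}} L(f, P_*) = \lim_{n \to \infty} \frac{(1 - \frac{1}{n})(2 - \frac{1}{n})}{6} = \frac{1}{3}. \]
But these estimates, together with Exercise 6.3(2), yield that
\[ \frac{1}{3} \ge \inf_{P \in \mathcal{P}} U(f, P) \ge \sup_{P \in \mathcal{P}} L(f, P) \ge \frac{1}{3}, \]
showing that
\[ \inf_{P \in \mathcal{P}} U(f, P) = \sup_{P \in \mathcal{P}} L(f, P) = \frac{1}{3}. \]
Hence $f$ is Riemann integrable and $\displaystyle \int_0^1 f(x)\,dx = \frac{1}{3}$. ♦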
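
The closed forms in Example 6.5 can be checked with exact rational arithmetic. The sketch below is illustrative only (the function names `upper_sum` and `lower_sum` are ours); it relies on $f(x) = x^2$ being increasing on $[0, 1]$, so that the supremum and infimum on each subinterval sit at the right and left endpoints respectively.

```python
from fractions import Fraction

def upper_sum(n):
    """U(f, P_*) for f(x) = x^2 and the uniform partition of [0, 1] into n pieces."""
    return sum(Fraction(k + 1, n) ** 2 * Fraction(1, n) for k in range(n))

def lower_sum(n):
    """L(f, P_*) for the same f and the same partition."""
    return sum(Fraction(k, n) ** 2 * Fraction(1, n) for k in range(n))

for n in (1, 10, 100, 1000):
    U, L = upper_sum(n), lower_sum(n)
    # Closed forms derived in Example 6.5.
    assert U == (1 + Fraction(1, n)) * (2 + Fraction(1, n)) / 6
    assert L == (1 - Fraction(1, n)) * (2 - Fraction(1, n)) / 6
    print(n, float(L), float(U), float(U - L))
```

For this particular $f$ the gap $U(f, P_*) - L(f, P_*)$ telescopes to $(f(1) - f(0))/n = 1/n$, so both sums are squeezed onto $\frac{1}{3}$ as $n$ grows.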
Here is an example of a function that is not Riemann integrable.
Example 6.6. Consider the indicator function $f$ of the rationals in $[0, 1]$, that is, $f : [0, 1] \to \mathbb{R}$ is given by
\[ f(x) = \begin{cases} 1 & \text{if } x \in \mathbb{Q}, \\ 0 & \text{if } x \notin \mathbb{Q}, \end{cases} \quad (x \in [0, 1]). \]
Clearly $f$ is bounded by 1. We will show that
\[ \inf_{P \in \mathcal{P}} U(f, P) = 1 > \sup_{P \in \mathcal{P}} L(f, P) = 0, \]
showing that $f$ is not Riemann integrable.

Consider any partition $P = \{x_0 = 0, x_1, \dots, x_{n-1}, x_n = 1\}$ of $[0, 1]$. Then each subinterval $[x_k, x_{k+1}]$ contains a rational number $\alpha_k$ and an irrational number $\beta_k$, and so
\[ \sup_{x \in [x_k, x_{k+1}]} f(x) \ge f(\alpha_k) = 1 \quad \text{and} \quad \inf_{x \in [x_k, x_{k+1}]} f(x) \le f(\beta_k) = 0. \]
Consequently,
\[ U(f, P) = \sum_{k=0}^{n-1} \left( \sup_{x \in [x_k, x_{k+1}]} f(x) \right) \cdot (x_{k+1} - x_k) \ge \sum_{k=0}^{n-1} 1 \cdot (x_{k+1} - x_k) = (x_1 - x_0) + (x_2 - x_1) + \cdots + (x_n - x_{n-1}) = x_n - x_0 = 1 - 0 = 1. \]
Similarly,
\[ L(f, P) = \sum_{k=0}^{n-1} \left( \inf_{x \in [x_k, x_{k+1}]} f(x) \right) \cdot (x_{k+1} - x_k) \le \sum_{k=0}^{n-1} 0 \cdot (x_{k+1} - x_k) = 0. \]
Thus,
\[ \inf_{P \in \mathcal{P}} U(f, P) \ge 1 > 0 \ge \sup_{P \in \mathcal{P}} L(f, P). \]
Hence $f$ is not Riemann integrable. ♦

All continuous functions are Riemann integrable.


Theorem 6.7. All functions continuous on [a, b] are Riemann integrable.


Proof. We know that $[a, b]$ is compact, and so the continuity of $f$ implies that $f$ is bounded and also uniformly continuous. Given any $\epsilon > 0$, there exists $\delta > 0$ such that whenever $x, y \in [a, b]$ are such that $|x - y| < \delta$, we have $|f(x) - f(y)| < \epsilon$. Consider any partition $P_* = \{x_0 = a, x_1, \dots, x_{n-1}, x_n = b\}$ such that $\displaystyle \max_{k=0,\dots,n-1} |x_{k+1} - x_k| < \delta$. Then
\[ U(f, P_*) - L(f, P_*) = \sum_{k=0}^{n-1} \left( \sup_{x \in [x_k, x_{k+1}]} f(x) - \inf_{x \in [x_k, x_{k+1}]} f(x) \right) (x_{k+1} - x_k) \le \sum_{k=0}^{n-1} \epsilon (x_{k+1} - x_k) = \epsilon (b - a). \]
Thus
\[ 0 \le \inf_{P \in \mathcal{P}} U(f, P) - \sup_{P \in \mathcal{P}} L(f, P) \le \inf_{P \in \mathcal{P}} U(f, P) - L(f, P_*) \le U(f, P_*) - L(f, P_*) \le \epsilon (b - a). \]
As the choice of $\epsilon > 0$ was arbitrary, it follows that
\[ \inf_{P \in \mathcal{P}} U(f, P) = \sup_{P \in \mathcal{P}} L(f, P), \]
and so $f$ is Riemann integrable. □

One can show that the Riemann integral has the following elementary properties, which are
left as an exercise to the student.
Exercise 6.8. (∗) Let $f, g$ be Riemann integrable on $[a, b]$ and $\alpha \in \mathbb{R}$. Then the following hold:
(1) $\displaystyle \int_a^b \big(f(x) + g(x)\big)\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$.
(2) $\displaystyle \int_a^b \alpha \cdot f(x)\,dx = \alpha \int_a^b f(x)\,dx$.
(3) If for all $x \in [a, b]$, $f(x) \ge 0$, then $\displaystyle \int_a^b f(x)\,dx \ge 0$.
(4) If for all $x \in [a, b]$, $f(x) \le g(x)$, then $\displaystyle \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$.
(5) $|f|$ is also Riemann integrable and $\displaystyle \left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx$.
(6) If $f$ is continuous and nonnegative on $[a, b]$ and $\displaystyle \int_a^b f(x)\,dx = 0$, then $f$ is identically equal to 0.
a

6.2. Fundamental theorem of integral calculus


We will now prove the following result.

Theorem 6.9. Let $f : [a, b] \to \mathbb{R}$ be continuously differentiable on $[a, b]$. Then
\[ \int_a^b f'(x)\,dx = f(b) - f(a). \]

Proof. $f'$ is continuous on $[a, b]$, so it is Riemann integrable by Theorem 6.7, and it is uniformly continuous there. Let $\epsilon > 0$. Then there exists a $\delta > 0$ such that whenever $x, y \in [a, b]$ are such that $|x - y| < \delta$, we have $|f'(x) - f'(y)| < \epsilon$. Consider any partition $P_* = \{x_0 = a, x_1, \dots, x_{n-1}, x_n = b\}$ with
\[ \max_{k=0,1,\dots,n-1} (x_{k+1} - x_k) < \delta. \]
By the Mean Value Theorem, for each $k = 0, \dots, n-1$, there is a $c_k \in (x_k, x_{k+1})$ such that
\[ \frac{f(x_{k+1}) - f(x_k)}{x_{k+1} - x_k} = f'(c_k). \]
Hence
\begin{align*}
U(f', P_*) - (f(b) - f(a)) &= \sum_{k=0}^{n-1} \left( \sup_{x \in [x_k, x_{k+1}]} f'(x) \right) (x_{k+1} - x_k) - \sum_{k=0}^{n-1} \big(f(x_{k+1}) - f(x_k)\big) \\
&= \sum_{k=0}^{n-1} \left( \sup_{x \in [x_k, x_{k+1}]} f'(x) - f'(c_k) \right) (x_{k+1} - x_k) \quad \text{(note this is } \ge 0\text{)} \\
&\le \sum_{k=0}^{n-1} \epsilon (x_{k+1} - x_k) = \epsilon (b - a).
\end{align*}
Similarly,
\begin{align*}
(f(b) - f(a)) - L(f', P_*) &= \sum_{k=0}^{n-1} \left( f'(c_k) - \inf_{x \in [x_k, x_{k+1}]} f'(x) \right) (x_{k+1} - x_k) \quad \text{(note this is } \ge 0\text{)} \\
&\le \sum_{k=0}^{n-1} \epsilon (x_{k+1} - x_k) = \epsilon (b - a).
\end{align*}
Thus
\[ \int_a^b f'(x)\,dx - (f(b) - f(a)) = \inf_{P \in \mathcal{P}} U(f', P) - (f(b) - f(a)) \le U(f', P_*) - (f(b) - f(a)) \le \epsilon (b - a), \]
and
\[ \int_a^b f'(x)\,dx - (f(b) - f(a)) = \sup_{P \in \mathcal{P}} L(f', P) - (f(b) - f(a)) \ge L(f', P_*) - (f(b) - f(a)) \ge -\epsilon (b - a). \]
Consequently, $\displaystyle \left| \int_a^b f'(x)\,dx - (f(b) - f(a)) \right| \le \epsilon (b - a)$. As the choice of $\epsilon > 0$ was arbitrary, it follows that
\[ \int_a^b f'(x)\,dx = f(b) - f(a). \]
This completes the proof. □
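
The squeeze used in this proof can be observed numerically. The following sketch is only an illustration under our own choice of example: we take $f = \sin$ on $[0, \pi/2]$, so that $f' = \cos$ is decreasing there and its supremum and infimum on each subinterval are attained at the left and right endpoints. The function name `upper_lower` is ours.

```python
import math

def upper_lower(n, a=0.0, b=math.pi / 2):
    """Upper and lower sums of cos on the uniform partition of [a, b] into n pieces.
    Assumes cos is decreasing on [a, b], which holds for the default interval."""
    xs = [a + (b - a) * k / n for k in range(n + 1)]
    upper = sum(math.cos(xs[k]) * (xs[k + 1] - xs[k]) for k in range(n))
    lower = sum(math.cos(xs[k + 1]) * (xs[k + 1] - xs[k]) for k in range(n))
    return upper, lower

target = math.sin(math.pi / 2) - math.sin(0.0)  # f(b) - f(a)
for n in (10, 100, 1000):
    U, L = upper_lower(n)
    print(n, L, target, U)
```

In each row the value $f(b) - f(a)$ lies between the lower and upper sums of $f'$, and the gap between them shrinks like $1/n$.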
Exercise 6.10. Redo Example 6.5 using the Fundamental Theorem of Integral Calculus and the fact that
\[ \frac{d}{dx}\left( \frac{1}{3} x^3 \right) = x^2. \]


Bibliography

[B1] K.G. Binmore. The Foundations of Analysis: a Straightforward Introduction. Book 1. Logic, Sets and
Numbers. Cambridge University Press, Cambridge-New York, 1980.
(This book is an informal introduction to logic and set theory. As the title suggests, the properties of
the real numbers are emphasized in order to prepare the student for courses in analysis. Examples,
exercises, and diagrams are abundant.)
[B2] K.G. Binmore. The Foundations of Analysis: a Straightforward Introduction. Book 2. Topological
Ideas. Cambridge University Press, Cambridge-New York, 1981.
(This is a leisurely introduction to the basic notions of analysis. Clear motivation is provided for the results to follow.)
[R] W. Rudin. Principles of Mathematical Analysis. Third edition. McGraw-Hill, Singapore, 1976.
(This book is tough to read, but ploughing through it, with the exercises, gives a solid foundation.)
[S] G.E. Shilov. Elementary Real and Complex Analysis. Revised English edition translated from the
Russian and edited by Richard A. Silverman. Corrected reprint of the 1973 English edition. Dover
Publications, Mineola, NY, 1996.
(This book is easier to read than Rudin’s book, and offers the reader a leisurely but systematic
introduction to Analysis. The Dover edition is affordable.)



Solutions

Solutions to the exercises from Chapter 1


Solution to Exercise 1.7.
(D1) As d(x, y) is either 0 or 1 for all x, y ∈ X, clearly d(x, y) ≥ 0.
Also, for all x ∈ X, d(x, x) = 0 by definition.
If x, y ∈ X and x 6= y, then d(x, y) = 1 6= 0. So if d(x, y) = 0, then x = y.

(D2) If x = y ∈ X, then d(x, y) = 0 = d(y, x). If x, y ∈ X and x 6= y, then d(x, y) = 1 = d(y, x).

(D3) Let x, y, z ∈ X. If x = y and y = z, then x = z and so clearly

d(x, z) = 0 ≤ 0 + 0 = d(x, y) + d(y, z).

If either x 6= y or y 6= z, then d(x, y) = 1 or d(y, z) = 1, and so d(x, y) + d(y, z) ≥ 1 + 0 = 1 ≥ d(x, z).

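Solution to Exercise 1.8. We have
\[ f(t) = \underbrace{(y^\top y)}_{=:a}\, t^2 + \underbrace{(2 x^\top y)}_{=:b}\, t + \underbrace{x^\top x}_{=:c}. \]
The discriminant of $f$ is
\[ D = b^2 - 4ac = (2 x^\top y)^2 - 4 (y^\top y)(x^\top x) = 4 \big( (x^\top y)^2 - (y^\top y)(x^\top x) \big). \]
Thus $D \le 0$ yields
\[ (x^\top y)^2 = \left( \sum_{i=1}^n x_i y_i \right)^2 \le \left( \sum_{i=1}^n y_i^2 \right) \left( \sum_{i=1}^n x_i^2 \right) = (y^\top y)(x^\top x). \]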

Solution to Exercise 1.11.


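(D1) $d(x, y) = \|x - y\| \overset{\text{N1}}{\ge} 0$.
$d(x, x) = \|x - x\| = \|0\| = \|0 \cdot 0\| \overset{\text{N2}}{=} |0| \cdot \|0\| = 0$.
If $d(x, y) = 0$, then $\|x - y\| = 0 \overset{\text{N1}}{\implies} x = y$.
(D2) $d(x, y) = \|x - y\| = \|(-1)(y - x)\| \overset{\text{N2}}{=} |-1| \cdot \|y - x\| = \|y - x\| = d(y, x)$.
(D3) $d(x, z) = \|x - z\| = \|x - y + y - z\| \overset{\text{N3}}{\le} \|x - y\| + \|y - z\| = d(x, y) + d(y, z)$.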

Solution to Exercise 1.12.


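(N1) If $x \in \mathbb{R}^n$, then $\|x\|_2 = \sqrt{x_1^2 + \cdots + x_n^2} \ge 0$.
If $\|x\|_2 = 0$, then $x_1^2 + \cdots + x_n^2 = 0$ and so $x_1 = \cdots = x_n = 0$, that is, $x = 0$.

(N2) If $x \in \mathbb{R}^n$ and $\alpha \in \mathbb{R}$, then
\[ \|\alpha \cdot x\|_2 = \sqrt{(\alpha x_1)^2 + \cdots + (\alpha x_n)^2} = \sqrt{\alpha^2 (x_1^2 + \cdots + x_n^2)} = |\alpha| \sqrt{x_1^2 + \cdots + x_n^2} = |\alpha| \|x\|_2. \]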


(N3) For $x, y \in \mathbb{R}^n$, we have, using the Cauchy-Schwarz inequality, that
\begin{align*}
\|x + y\|_2^2 &= (x_1 + y_1)^2 + \cdots + (x_n + y_n)^2 \\
&= x_1^2 + \cdots + x_n^2 + 2(x_1 y_1 + \cdots + x_n y_n) + y_1^2 + \cdots + y_n^2 \\
&\le \|x\|_2^2 + 2 \|x\|_2 \|y\|_2 + \|y\|_2^2 = (\|x\|_2 + \|y\|_2)^2.
\end{align*}
As $\|x + y\|_2 \ge 0$ and $\|x\|_2 + \|y\|_2 \ge 0$, we obtain by taking square roots that $\|x + y\|_2 \le \|x\|_2 + \|y\|_2$.

Solution to Exercise 1.13.


(D1) As $d(x, y) \ge 0$ for all $x, y \in X$, also $d_1(x, y) = d(x, y)/(1 + d(x, y)) \ge 0$. If $x \in X$, then $d(x, x) = 0$, and hence $d_1(x, x) = 0/(1 + 0) = 0$ as well. If $x, y \in X$ and $d_1(x, y) = 0$, then $d(x, y)/(1 + d(x, y)) = 0$, and so $d(x, y) = 0$, which gives $x = y$.

(D2) If $x, y \in X$, then $d(x, y) = d(y, x)$ and so
\[ d_1(x, y) = \frac{d(x, y)}{1 + d(x, y)} = \frac{d(y, x)}{1 + d(y, x)} = d_1(y, x). \]
(D3) Let $x, y, z \in X$. Set $a = d(x, z)$, $b = d(x, y)$, $c = d(y, z)$. We know that $a \le b + c$. We want to show that
\[ \frac{a}{1 + a} \le \frac{b}{1 + b} + \frac{c}{1 + c}. \]
We have
\begin{align*}
\frac{a}{1 + a} &= \frac{1 + a - 1}{1 + a} = 1 - \frac{1}{1 + a} \\
&\le 1 - \frac{1}{1 + (b + c)} \quad \text{(using } a \le b + c\text{)} \\
&= \frac{b + c}{1 + b + c} = \frac{b}{1 + b + c} + \frac{c}{1 + b + c} \\
&\le \frac{b}{1 + b} + \frac{c}{1 + c} \quad \text{(as } b, c \ge 0\text{)}.
\end{align*}
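
As a sanity check (not a proof), the triangle inequality for $d_1$ can be tested exhaustively on a finite sample. The sketch below assumes, purely for illustration, that $X = \mathbb{R}$ with $d(x, y) = |x - y|$; the names `d`, `d1`, `points` and `violations` are ours. Exact fractions are used so that no rounding can hide or create a violation.

```python
from fractions import Fraction
from itertools import product

def d(x, y):
    """Illustrative base metric: the usual distance on the real line."""
    return abs(x - y)

def d1(x, y):
    """The bounded metric d / (1 + d) from Exercise 1.13."""
    return d(x, y) / (1 + d(x, y))

# Sample points -3, -2.75, ..., 3 as exact fractions.
points = [Fraction(k, 4) for k in range(-12, 13)]

# Collect every triple that would violate the triangle inequality for d1.
violations = [(x, y, z) for x, y, z in product(points, repeat=3)
              if d1(x, z) > d1(x, y) + d1(y, z)]
print(len(violations))
```

Note also that $d_1(x, y) < 1$ always, so $d_1$ is a bounded metric even when $d$ is not.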
Solution to Exercise 1.14.
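(N1) Clearly $\displaystyle \|M\|_\infty = \max_{1 \le i \le m,\ 1 \le j \le n} |m_{ij}| \ge 0$ for all $M = [m_{ij}] \in \mathbb{R}^{m \times n}$.

If $\|M\|_\infty = 0$, then $|m_{ij}| = 0$ for all $1 \le i \le m$, $1 \le j \le n$, that is, $M = [m_{ij}] = 0$, the zero matrix.

(N2) For $M = [m_{ij}] \in \mathbb{R}^{m \times n}$ and $\alpha \in \mathbb{R}$, we have
\[ \|\alpha \cdot M\|_\infty = \max_{1 \le i \le m,\ 1 \le j \le n} |\alpha m_{ij}| = \max_{1 \le i \le m,\ 1 \le j \le n} |\alpha| |m_{ij}| = |\alpha| \max_{1 \le i \le m,\ 1 \le j \le n} |m_{ij}| = |\alpha| \|M\|_\infty. \]
(N3) For matrices $P = [p_{ij}]$, $Q = [q_{ij}]$ in $\mathbb{R}^{m \times n}$, we have
\[ |p_{ij} + q_{ij}| \le |p_{ij}| + |q_{ij}| \le \|P\|_\infty + \|Q\|_\infty. \]
As this holds for each $i, j$, we obtain $\displaystyle \|P + Q\|_\infty = \max_{1 \le i \le m,\ 1 \le j \le n} |p_{ij} + q_{ij}| \le \|P\|_\infty + \|Q\|_\infty$.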

Solution to Exercise 1.15. (1)


(N1) For every $f \in C[a, b]$, clearly $|f(x)| \ge 0$ for all $x$ in $[a, b]$, and so $\displaystyle \|f\|_\infty = \max_{x \in [a, b]} |f(x)| \ge 0$.

If $\|f\|_\infty = 0$, then for all $x \in [a, b]$, $|f(x)| = 0$, and so $f(x) = 0$, that is, $f = 0$, the zero function.

(N2) For $f \in C[a, b]$ and $\alpha \in \mathbb{R}$, we have
\[ \|\alpha \cdot f\|_\infty = \max_{x \in [a, b]} |(\alpha \cdot f)(x)| = \max_{x \in [a, b]} |\alpha| |f(x)| = |\alpha| \max_{x \in [a, b]} |f(x)| = |\alpha| \|f\|_\infty. \]
(N3) For $f, g \in C[a, b]$ and $x \in [a, b]$, we have
\[ |(f + g)(x)| = |f(x) + g(x)| \le |f(x)| + |g(x)| \le \|f\|_\infty + \|g\|_\infty. \]
As the choice of $x \in [a, b]$ was arbitrary, $\displaystyle \|f + g\|_\infty = \max_{x \in [a, b]} |(f + g)(x)| \le \|f\|_\infty + \|g\|_\infty$.

(2) If $\|f - g\|_\infty < \epsilon$, then $\displaystyle \max_{x \in [a, b]} |f(x) - g(x)| < \epsilon$. In particular, for all $x \in [a, b]$, $|f(x) - g(x)| < \epsilon$, that is, $f(x) - \epsilon < g(x) < f(x) + \epsilon$.


Conversely, if for each $x \in [a, b]$, $f(x) - \epsilon < g(x) < f(x) + \epsilon$, then $|f(x) - g(x)| < \epsilon$. Hence
\[ \|f - g\|_\infty = \max_{x \in [a, b]} |f(x) - g(x)| < \epsilon. \]
Thus there holds
\[ \|f - g\|_\infty < \epsilon \iff \forall x \in [a, b],\ f(x) - \epsilon < g(x) < f(x) + \epsilon. \]
Pictorially, this means that the graph of $g$ lies in a strip of width $2\epsilon$ around the graph of $f$.

[Figure: over $[a, b]$, the graph of $g$ lies between the graphs of $f - \epsilon$ and $f + \epsilon$.]

Solution to Exercise 1.16.


(N1) For every $f \in C[a, b]$, clearly $|f(x)| \ge 0$ for all $x$ in $[a, b]$, and so $\displaystyle \|f\|_1 = \int_a^b |f(x)|\,dx \ge 0$.

Let $f \in C[a, b]$ satisfy $\|f\|_1 = 0$. Then $\displaystyle \int_a^b |f(x)|\,dx = 0$. Suppose that $f(c) > 0$ for some $c \in [a, b]$. Then, since $f$ is continuous at $c$, for $\epsilon = f(c)/2$ there exists $\delta > 0$ such that for all $x \in [a, b] \cap (c - \delta, c + \delta)$, we have $|f(x) - f(c)| < \epsilon = f(c)/2$. It follows that $f(x) > f(c)/2$ for all $x$ in some subinterval $[\alpha, \beta]$ of $[a, b]$, where $\alpha < \beta$. Then
\[ \int_a^b |f(x)|\,dx \ge \int_\alpha^\beta |f(x)|\,dx \ge \int_\alpha^\beta \frac{f(c)}{2}\,dx = \frac{1}{2} f(c) (\beta - \alpha) > 0, \]
a contradiction. A similar argument can be made when $f(c) < 0$.

It follows that $f(c) = 0$ for all $c \in [a, b]$, that is, $f = 0$, the zero function.

(N2) For $f \in C[a, b]$ and $\alpha \in \mathbb{R}$, we have
\[ \|\alpha \cdot f\|_1 = \int_a^b |(\alpha \cdot f)(x)|\,dx = \int_a^b |\alpha| |f(x)|\,dx = |\alpha| \int_a^b |f(x)|\,dx = |\alpha| \|f\|_1. \]
(N3) For $f, g \in C[a, b]$ and $x \in [a, b]$, we have $|(f + g)(x)| = |f(x) + g(x)| \le |f(x)| + |g(x)|$.
So $\displaystyle \|f + g\|_1 = \int_a^b |(f + g)(x)|\,dx \le \int_a^b \big(|f(x)| + |g(x)|\big)\,dx = \|f\|_1 + \|g\|_1$.

Solution to Exercise 1.17.


(D1) If y1 , y2 ∈ Y , then d|Y ×Y (y1 , y2 ) = d(y1 , y2 ) ≥ 0.
If y ∈ Y , then d|Y ×Y (y, y) = d(y, y) = 0.
If y1 , y2 ∈ Y and d|Y ×Y (y1 , y2 ) = 0, then d(y1 , y2 ) = 0, and so y1 = y2 .

(D2) If y1 , y2 ∈ Y , then d|Y ×Y (y1 , y2 ) = d(y1 , y2 ) = d(y2 , y1 ) = d|Y ×Y (y2 , y1 ).

(D3) If y1 , y2 , y3 ∈ Y , then d|Y ×Y (y1 , y3 ) = d(y1 , y3 ) ≤ d(y1 , y2 )+d(y2 , y3 ) = d|Y ×Y (y1 , y2 )+d|Y ×Y (y2 , y3 ).


Solution to Exercise 1.18. (1) If $x = 0$, then $|x + y|_p = |y|_p = \max\{0, |y|_p\} = \max\{|x|_p, |y|_p\}$. So if either $x$ or $y$ is zero, then the result holds. Suppose now that $x \neq 0$ and $y \neq 0$, and let $p^{k_x}, p^{k_y}$ be the largest powers of $p$ that divide $x, y$, respectively. Then $x = p^{k_x} q_x$ and $y = p^{k_y} q_y$ for some integers $q_x, q_y$ which aren't divisible by $p$. Without loss of generality, we may assume that $k_x \ge k_y$. (Otherwise we just interchange $x$ and $y$.) Then $x + y = p^{k_x} q_x + p^{k_y} q_y = p^{k_y} (p^{k_x - k_y} q_x + q_y)$. From this it follows that the largest power of $p$ dividing $x + y$ is at least $p^{k_y}$ and so
\[ |x + y|_p \le \frac{1}{p^{k_y}} = \max\left\{ \frac{1}{p^{k_x}}, \frac{1}{p^{k_y}} \right\} = \max\{|x|_p, |y|_p\}. \]
(2) We will check that (D1), (D2), (D3) hold.

(D1) If $x, y \in \mathbb{Z}$, then clearly $d_p(x, y)$ is either $1/p^k$ for some $k$ or 0, and hence it is surely nonnegative.
If $x \in \mathbb{Z}$, then $d_p(x, x) = |x - x|_p = |0|_p = 0$.
If $x, y \in \mathbb{Z}$ and $d_p(x, y) = 0$, then $|x - y|_p = 0$, and so $x = y$.
(D2) For a nonzero $n \in \mathbb{Z}$, clearly the largest power of $p$ that divides $n$ coincides with the one dividing $-n$. Hence for all $n \in \mathbb{Z}$, $|n|_p = |-n|_p$. If $x, y \in \mathbb{Z}$, then $d_p(x, y) = |x - y|_p = |y - x|_p = d_p(y, x)$.
(D3) If $x, y, z \in \mathbb{Z}$, then we have
\[ d_p(x, z) = |x - z|_p = |(x - y) + (y - z)|_p \le \max\{|x - y|_p, |y - z|_p\} \le |x - y|_p + |y - z|_p = d_p(x, y) + d_p(y, z). \]
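
The inequality in part (1) is easy to test on a range of integers. The following is a small sketch with our own function names (`abs_p`, `d_p`); it follows the description used in this solution, namely $|0|_p = 0$ and $|x|_p = 1/p^k$ where $p^k$ is the largest power of $p$ dividing $x$.

```python
from fractions import Fraction

def abs_p(x, p):
    """p-adic absolute value of an integer x, as an exact fraction."""
    if x == 0:
        return Fraction(0)
    k = 0
    while x % p == 0:  # strip factors of p; also works for negative x
        x //= p
        k += 1
    return Fraction(1, p ** k)

def d_p(x, y, p):
    """p-adic distance d_p(x, y) = |x - y|_p."""
    return abs_p(x - y, p)

p = 3
sample = range(-30, 31)
# Part (1): |x + y|_p <= max(|x|_p, |y|_p) for every pair in the sample.
print(all(abs_p(x + y, p) <= max(abs_p(x, p), abs_p(y, p))
          for x in sample for y in sample))
```

For example, with $p = 3$ the integers 1 and 10 are at distance $|9|_3 = \frac{1}{9}$, so they are "closer" in this metric than 1 and 2.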

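Solution to Exercise 1.19. We know that the set $\mathbb{R}^{\mathbb{N}}$ of all real sequences is a vector space with termwise addition and scalar multiplication. We will prove that $\ell^2$ is a subspace of $\mathbb{R}^{\mathbb{N}}$. If $(a_n)_{n \in \mathbb{N}}, (b_n)_{n \in \mathbb{N}}$ belong to $\ell^2$, then we have that the sequences of partial sums $A_n := a_1^2 + \cdots + a_n^2$ and $B_n := b_1^2 + \cdots + b_n^2$ are bounded. For real numbers $x, y$, we have $(x - y)^2 \ge 0$, and so $(x + y)^2 \le 2(x^2 + y^2)$. This yields that
\[ (a_1 + b_1)^2 + \cdots + (a_n + b_n)^2 \le 2(A_n + B_n), \]
showing that the monotone sequence $((a_1 + b_1)^2 + \cdots + (a_n + b_n)^2)_{n \in \mathbb{N}}$ is bounded, and hence convergent. Consequently $(a_n + b_n)_{n \in \mathbb{N}}$ belongs to $\ell^2$, and so the subset $\ell^2$ is closed under vector addition.
If $(a_n)_{n \in \mathbb{N}}$ belongs to $\ell^2$ and $\alpha \in \mathbb{R}$, then
\[ (\alpha a_1)^2 + \cdots + (\alpha a_n)^2 = \alpha^2 (a_1^2 + \cdots + a_n^2), \]
and so $\alpha \cdot (a_n)_{n \in \mathbb{N}} = (\alpha a_n)_{n \in \mathbb{N}}$ also belongs to $\ell^2$. Consequently the subset $\ell^2$ is closed under scalar multiplication.
Finally the zero sequence $(0)_{n \in \mathbb{N}}$ is obviously an element of $\ell^2$. Thus $\ell^2$ is a subspace of $\mathbb{R}^{\mathbb{N}}$, and hence it is a vector space with termwise addition and scalar multiplication.

(N1) If $(a_n)_{n \in \mathbb{N}}$ belongs to $\ell^2$, then clearly $\displaystyle \|(a_n)_{n \in \mathbb{N}}\|_2 = \sqrt{\sum_{n=1}^{\infty} |a_n|^2} \ge 0$.
If $(a_n)_{n \in \mathbb{N}} \in \ell^2$ and $\|(a_n)_{n \in \mathbb{N}}\|_2 = 0$, then $\displaystyle \sum_{n=1}^{\infty} |a_n|^2 = 0$, and so $|a_n|^2 = 0$ for each $n$. Thus $(a_n)_{n \in \mathbb{N}} = (0)_{n \in \mathbb{N}}$.

(N2) If $(a_n)_{n \in \mathbb{N}}$ belongs to $\ell^2$ and $\alpha \in \mathbb{R}$, then
\[ \|\alpha \cdot (a_n)_{n \in \mathbb{N}}\|_2 = \|(\alpha a_n)_{n \in \mathbb{N}}\|_2 = \sqrt{\sum_{n=1}^{\infty} |\alpha a_n|^2} = |\alpha| \sqrt{\sum_{n=1}^{\infty} |a_n|^2} = |\alpha| \|(a_n)_{n \in \mathbb{N}}\|_2. \]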


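(N3) If $(a_n)_{n \in \mathbb{N}}, (b_n)_{n \in \mathbb{N}}$ belong to $\ell^2$, then for each $n$ we have
\begin{align*}
\sum_{k=1}^{n} (a_k + b_k)^2 &= \sum_{k=1}^{n} a_k^2 + 2 \sum_{k=1}^{n} a_k b_k + \sum_{k=1}^{n} b_k^2 \\
&\le \|(a_n)_{n \in \mathbb{N}}\|_2^2 + 2 \sqrt{\sum_{k=1}^{n} a_k^2} \sqrt{\sum_{k=1}^{n} b_k^2} + \|(b_n)_{n \in \mathbb{N}}\|_2^2 \\
&\le \|(a_n)_{n \in \mathbb{N}}\|_2^2 + 2 \|(a_n)_{n \in \mathbb{N}}\|_2 \|(b_n)_{n \in \mathbb{N}}\|_2 + \|(b_n)_{n \in \mathbb{N}}\|_2^2 \\
&= \big( \|(a_n)_{n \in \mathbb{N}}\|_2 + \|(b_n)_{n \in \mathbb{N}}\|_2 \big)^2,
\end{align*}
and so it follows that $\|(a_n)_{n \in \mathbb{N}} + (b_n)_{n \in \mathbb{N}}\|_2^2 \le \big( \|(a_n)_{n \in \mathbb{N}}\|_2 + \|(b_n)_{n \in \mathbb{N}}\|_2 \big)^2$.

Since $\|(a_n)_{n \in \mathbb{N}} + (b_n)_{n \in \mathbb{N}}\|_2 \ge 0$ and $\|(a_n)_{n \in \mathbb{N}}\|_2 + \|(b_n)_{n \in \mathbb{N}}\|_2 \ge 0$, by taking square roots, we obtain the triangle inequality.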

Solution to Exercise 1.20.


(D1) If $x, y \in \mathbb{F}_2^n$, then clearly $d(x, y) \ge 0$, since it is the number of entries where $x$ and $y$ differ.
If $x \in \mathbb{F}_2^n$, then since $x$ is identical to itself, $d(x, x) = 0$.
If $x, y \in \mathbb{F}_2^n$ and $d(x, y) = 0$, then $x$ and $y$ differ in no place, and so $x = y$.

(D2) If $x, y \in \mathbb{F}_2^n$, then the number of places where $x$ differs from $y$ is the same as the number of places $y$ differs from $x$. Hence $d(x, y) = d(y, x)$.
(D3) Define for $k = 1, \dots, n$, the function $\delta_k : \mathbb{F}_2^n \times \mathbb{F}_2^n \to \mathbb{R}$ by
\[ \delta_k(x, y) = \begin{cases} 1 & \text{if the } k\text{th digits of } x \text{ and } y \text{ differ}, \\ 0 & \text{otherwise.} \end{cases} \]
Clearly $\displaystyle d(x, y) = \sum_{k=1}^{n} \delta_k(x, y)$.
We establish the triangle inequality simply by proving it for each $\delta_k$. Let $x, y, z \in \mathbb{F}_2^n$, and $k \in \{1, \dots, n\}$. If the $k$th digits of $x$ and $z$ coincide, then $\delta_k(x, z) = 0$ and as $\delta_k(x, y), \delta_k(y, z) \ge 0$, $\delta_k(x, z) \le \delta_k(x, y) + \delta_k(y, z)$ holds trivially. If on the other hand, the $k$th digits of $x$ and $z$ differ, then $\delta_k(x, z) = 1$. But then it cannot be the case that both $\delta_k(x, y) = 0$ and $\delta_k(y, z) = 0$. For if this were the case, the $k$th digits of $x, y, z$ would be the same, a contradiction to $\delta_k(x, z) = 1$. Hence $\delta_k(x, y) = 1$ or $\delta_k(y, z) = 1$ (this is the standard mathematical “inclusive or”, meaning that either just one case holds or both can hold too), and then we have $\delta_k(x, y) + \delta_k(y, z) \ge 1$ $(= \delta_k(x, z))$. This completes the proof.
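
For small $n$ the triangle inequality for the Hamming distance can be confirmed by brute force. The sketch below is only an illustration (the function name `hamming` and the choice $n = 4$ are ours); it represents elements of $\mathbb{F}_2^n$ as tuples of 0s and 1s.

```python
from itertools import product

def hamming(x, y):
    """Number of positions in which the equal-length words x and y differ."""
    return sum(1 for a, b in zip(x, y) if a != b)

# All 16 binary words of length 4.
words = list(product((0, 1), repeat=4))

# Check (D3) for every triple of words.
ok = all(hamming(x, z) <= hamming(x, y) + hamming(y, z)
         for x in words for y in words for z in words)
print(ok)
```

The sum inside `hamming` is exactly $\sum_k \delta_k(x, y)$ from the solution, so the digit-by-digit argument above is what makes the check succeed.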


Solution to Exercise 1.25. Consider the open ball B(x, r) = {y ∈ X : d(x, y) < r} in X. If y ∈ B(x, r),
then d(x, y) < r. Define r′ = r − d(x, y) > 0. We claim that B(y, r′ ) ⊆ B(x, r). Let z ∈ B(y, r′ ). Then
d(z, y) < r′ = r − d(x, y) and so d(x, z) ≤ d(x, y) + d(y, z) < r. Hence z ∈ B(x, r). Figure 5 illustrates
this.

Figure 5. The open ball is open. (The ball of radius $r'$ around $y$ sits inside the ball of radius $r$.)

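Solution to Exercise 1.32. Let $F_i$ ($i \in I$) be a family of closed sets. Then $X \setminus F_i$ ($i \in I$) forms a family of open sets. Hence
\[ \bigcup_{i \in I} (X \setminus F_i) = X \setminus \bigcap_{i \in I} F_i \]
is open. Consequently, $\displaystyle \bigcap_{i \in I} F_i = X \setminus \Big( X \setminus \bigcap_{i \in I} F_i \Big)$ is closed.

If $F_1, \dots, F_n$ are closed, then $X \setminus F_1, \dots, X \setminus F_n$ are open and so the intersection
\[ \bigcap_{i=1}^{n} (X \setminus F_i) = X \setminus \bigcup_{i=1}^{n} F_i \]
of these finitely many open sets is open as well. Thus $\displaystyle \bigcup_{i=1}^{n} F_i = X \setminus \Big( X \setminus \bigcup_{i=1}^{n} F_i \Big)$ is closed.

For showing that the finiteness condition cannot be dropped, we simply rework Example 1.27 by taking complements. We know that $F_n := \mathbb{R} \setminus (-\frac{1}{n}, \frac{1}{n})$ ($n \in \mathbb{N}$) is closed, and the union of these is
\[ \bigcup_{n=1}^{\infty} F_n = \bigcup_{n=1}^{\infty} \mathbb{R} \setminus \left( -\frac{1}{n}, \frac{1}{n} \right) = \mathbb{R} \setminus \bigcap_{n=1}^{\infty} \left( -\frac{1}{n}, \frac{1}{n} \right) = \mathbb{R} \setminus \{0\}, \]
which is not closed, since if it were, its complement $\mathbb{R} \setminus (\mathbb{R} \setminus \{0\}) = \{0\}$ would be open, which is not the case.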

Solution to Exercise 1.33. The point c := ( 21 , 0) ∈ I, but for each r > 0, the point y = ( 21 , r2 ) belongs
q
to the ball B(c, r), but not to I, since ky − ck2 = ( 12 − 21 )2 + ( r2 − 0)2 = r2 < r, but 2r 6= 0. See Figure 6.

Solution to Exercise 1.34.


(1) Define k · k1 : R2 → R by kxk1 = |x1 | + |x2 | (x ∈ R2 ). We will show that k · k1 is a norm on R2 and
hence d1 (x, y) := kx − yk1 defines a metric on R2 .

(N1) For x ∈ R2 , clearly kxk1 = |x1 | + |x2 | ≥ 0.


If x ∈ R2 and kxk1 = 0, then |x1 | + |x2 | = 0 and so x1 = x2 = 0 and so x = 0.

(N2) If x ∈ R2 and α ∈ R, then kα · xk1 = |αx1 | + |αx2 | = |α||x1 | + |α||x2 | = |α|kxk1 .

(N3) If x, y ∈ R2 , then kx + yk1 = |x1 + y1 | + |x2 + y2 | ≤ |x1 | + |y1 | + |x2 | + |y2 | = kxk1 + kyk1 .

Define k · k∞ : R2 → R by kxk∞ = max{|x1 |, |x2 |} (x ∈ R2 ). We will show that k · k∞ is a norm on R2


and hence d∞ (x, y) := kx − yk∞ defines a metric on R2 .


Figure 6. The segment (0, 1) × {0} is not open in R2 .

(N1) For x ∈ R², clearly ‖x‖∞ = max{|x₁|, |x₂|} ≥ 0.
If x ∈ R² and ‖x‖∞ = 0, then max{|x₁|, |x₂|} = 0 and so x₁ = x₂ = 0, that is, x = 0.
(N2) If x ∈ R² and α ∈ R, then
‖α · x‖∞ = max{|αx₁|, |αx₂|} = max{|α||x₁|, |α||x₂|} = |α| max{|x₁|, |x₂|} = |α|‖x‖∞.
(N3) If x, y ∈ R², then
|x₁ + y₁| ≤ |x₁| + |y₁| ≤ ‖x‖∞ + ‖y‖∞ and
|x₂ + y₂| ≤ |x₂| + |y₂| ≤ ‖x‖∞ + ‖y‖∞,
and so ‖x + y‖∞ = max{|x₁ + y₁|, |x₂ + y₂|} ≤ ‖x‖∞ + ‖y‖∞.
(2) B(0, 1) in the metric space (R², d₁) is the set {(x₁, x₂) ∈ R² : |x₁| + |x₂| < 1}, namely the interior of
the rhombus with vertices (1, 0), (0, 1), (−1, 0), (0, −1), as shown in Figure 7.

Figure 7. The ball in R² with center 0 and radius 1 in (R², ‖·‖₁).

B(0, 1) in the metric space (R², d₂) is the set {(x₁, x₂) ∈ R² : x₁² + x₂² < 1}, namely the interior of the
circle with center (0, 0) and unit radius, as shown in Figure 8.
In (R², d∞), B(0, 1) = {(x₁, x₂) ∈ R² : max{|x₁|, |x₂|} < 1} = {(x₁, x₂) ∈ R² : |x₁| < 1 and |x₂| < 1},
namely the interior of the square with vertices (1, 1), (−1, 1), (−1, −1), (1, −1), as shown in Figure 9.
Figure 10 summarizes the above findings.

Figure 8. The ball in R² with center 0 and radius 1 in (R², ‖·‖₂).

Figure 9. The ball in R² with center 0 and radius 1 in (R², ‖·‖∞).

Figure 10. Open unit balls with center (0, 0) in R² with the three different metrics d₁, d₂, d∞.
The balls are the respective interiors of the dotted curves.

Figure 11. The set of open sets in (R², d₁), (R², d₂), (R², d∞) coincide.

(3) See the last figure above.
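
The three unit balls described in part (2) can be compared numerically. The following is only an illustrative sketch: the helper names (norm1, norm2, norm_inf, in_unit_ball) and the sample points p and q are assumptions made for this example and do not come from the notes.

```python
import math

def norm1(x):
    # ||x||_1 = |x1| + |x2|
    return abs(x[0]) + abs(x[1])

def norm2(x):
    # ||x||_2 = sqrt(x1^2 + x2^2)
    return math.hypot(x[0], x[1])

def norm_inf(x):
    # ||x||_inf = max{|x1|, |x2|}
    return max(abs(x[0]), abs(x[1]))

def in_unit_ball(norm, x):
    # Membership in the open ball B(0, 1) for the metric induced by `norm`.
    return norm(x) < 1

# p lies near a corner of the square: inside the d_inf ball only.
p = (0.9, 0.9)
# q lies inside the circle but outside the rhombus.
q = (0.6, 0.6)

for name, norm in [("d1", norm1), ("d2", norm2), ("d_inf", norm_inf)]:
    print(name, in_unit_ball(norm, p), in_unit_ball(norm, q))
```

The points p and q witness that the three balls are nested but different: rhombus inside circle inside square.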


Solution to Exercise 1.35.


(1) False.
For example in the metric space R, consider the set [0, 1). This set [0, 1) is not open, since every
open ball with center 0 has at least one negative real number, and so has points not belonging
to [0, 1). On the other hand, this set [0, 1) is not closed either, because its complement is
U := (−∞, 0) ∪ [1, ∞), which is not open since every open ball with center 1 has at least one
positive real number strictly less than one, and so has points that do not belong to U .
(2) False.
R is open in R, and it is also closed.
(3) True.
∅ and X are both open and closed in any metric space X.
(4) True.
[0, 1) is neither open nor closed in R.
(5) False.
0 ∈ Q, but every open ball centered at 0 contains irrational numbers; just consider √2/n, with
a sufficiently large n.
(6) False.
Consider the sequence (an)n∈N given by a1 = 3/2, and for n ≥ 1,
\[ a_{n+1} = \frac{4 + 3a_n}{3 + 2a_n}. \]
Then it can be shown by induction that (an)n∈N is monotone decreasing and bounded below by
√2. So this sequence is convergent with a limit L satisfying
\[ L = \frac{4 + 3L}{3 + 2L}, \]
and so L² = 2. As L must be positive (the sequence is bounded below by √2), it follows that
L = √2. So every ball with center √2 and a positive radius contains elements from Q, showing
that R \ Q is not open, and hence Q is not closed.
(Alternately, consider the real number c with the decimal expansion
c = 0.101001000100001 · · · .
This number c is irrational because it has a nonterminating and nonrepeating decimal expansion.
If we consider the sequence of rational numbers obtained by truncation, namely
0.1
0.101
0.101001
0.1010010001
0.101001000100001
..
.
then this sequence converges with limit c, and so every ball with center c and a positive radius
contains elements from Q, showing again that R \ Q is not open, and hence Q is not closed.)
(7) True.
We have
\[ \mathbb{R} \setminus \mathbb{Z} = \bigcup_{n\in\mathbb{Z}} (n, n+1). \]

As each interval (n, n + 1) is open, their union is open as well, and so Z = R \ (R \ Z) is closed.

Solution to Exercise 1.36. We have already seen in Exercise 1.25 that the interior of S², namely the
open ball B(0, 1) = {x ∈ R³ : ‖x‖₂ < 1}, is open. We now show that the exterior of S², namely the set
U = {x ∈ R³ : ‖x‖₂ > 1}, is open as well. Let y ∈ U. Then ‖y‖₂ > 1. Thus r := ‖y‖₂ − 1 > 0. Consider
the open ball B(y, r) = {z ∈ R³ : ‖z − y‖₂ < r}. We claim that B(y, r) is contained in U. Let z ∈ B(y, r).
Then ‖z − y‖₂ < r. By the triangle inequality, ‖y‖₂ = ‖(y − z) + z‖₂ ≤ ‖z − y‖₂ + ‖z‖₂ < r + ‖z‖₂ and so
‖z‖₂ > ‖y‖₂ − r = ‖y‖₂ − (‖y‖₂ − 1) = 1. Hence z ∈ U.


Thus the complement of S², being the union of the two open sets B(0, 1) and U, is open. Consequently
S² is closed.

Solution to Exercise 1.37. Let (X, d) be the metric space and let x ∈ X. If U := X \ {x} = ∅, then
since ∅ is open, it follows that {x} is closed.
Otherwise, let y ∈ U = X \ {x}. Then r := d(x, y) > 0. We claim that the open ball B(y, r) is contained in U.
If z ∈ B(y, r), then d(y, z) < r. Thus d(z, x) ≥ d(x, y) − d(y, z) > d(x, y) − r = 0. Hence z ≠ x, and so
z ∈ X \ {x} = U. Consequently U is open, and so {x} = X \ U = X \ (X \ {x}) is closed.
Now let F be a finite subset of X. If F is empty, then it is closed. Otherwise,
\[ F = \{x_1, \dots, x_n\} = \bigcup_{i=1}^{n} \{x_i\}, \]
being the finite union of the closed sets {x1}, . . . , {xn}, is closed too.

Solution to Exercise 1.38. We simply prove that every subset Y of X is open. It then follows, by
taking complements of these, that every subset is closed.
If Y is empty, then it is open. Otherwise we have
\[ Y = \bigcup_{y\in Y} \{y\}. \]
We will show that every singleton set {y} is open, from which it follows that the union above is open. To
show that {y} is open, we note that the open ball B(y, 1/2) = {z ∈ X : d(z, y) < 1/2} = {y}.

Solution to Exercise 1.39. Let x, y ∈ R and x < y. By the Archimedean property of R, there is a
positive integer n such that n > 1/(y − x), that is, n(y − x) > 1. Also, there are positive integers m1, m2 such
that m1 > nx and m2 > −nx, so that −m2 < nx < m1. Thus

nx ∈ [−m2, −m2 + 1) ∪ [−m2 + 1, −m2 + 2) ∪ · · · ∪ [m1 − 1, m1).

Hence there is an integer m such that m − 1 ≤ nx < m. We have nx < m ≤ 1 + nx < ny, and so dividing
by n, we have
\[ x < \frac{m}{n} < y. \]
Consequently, between any two real numbers, there is a rational number.
Let x ∈ R and let ε > 0. Then there is a rational number y such that x − ε < y < x + ε, that is,
|x − y| < ε. Hence Q is dense in R.
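
The construction in this proof is effectively an algorithm: pick n with n(y − x) > 1, then take m to be the integer with m − 1 ≤ nx < m. A minimal sketch follows, assuming the inputs are floats or Fractions; the function name rational_between is hypothetical. Exact Fraction arithmetic is used so that the strict inequalities are not affected by rounding.

```python
from fractions import Fraction
import math

def rational_between(x, y):
    """Return a Fraction m/n with x < m/n < y, assuming x < y."""
    x, y = Fraction(x), Fraction(y)  # exact conversion of the inputs
    if not x < y:
        raise ValueError("need x < y")
    # Archimedean step: choose n with n > 1/(y - x), so n(y - x) > 1.
    n = math.floor(1 / (y - x)) + 1
    # m is the unique integer with m - 1 <= n*x < m.
    m = math.floor(n * x) + 1
    return Fraction(m, n)

print(rational_between(math.sqrt(2), 1.5))
```

The same argument works for negative inputs, since floor handles the role of the integers m1, m2 in the proof.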

Solution to Exercise 1.40. Let x ∈ R and let ε > 0. If x ∈ R \ Q, then taking y = x, we have
|x − y| = 0 < ε. If on the other hand x ∈ Q, then let n ∈ N be such that n > √2/ε, so that with y := x + √2/n,
we have that y ∈ R \ Q and |x − y| = √2/n < ε. Thus R \ Q is dense in R.

Solution to Exercise 1.41.

(1) Let x, y ∈ B(0, r) and t ∈ (0, 1). Then ‖x‖ < r and ‖y‖ < r. Thus

‖(1 − t)x + ty‖ ≤ ‖(1 − t)x‖ + ‖ty‖ = (1 − t)‖x‖ + t‖y‖ < (1 − t)r + tr = r,

and so (1 − t)x + ty ∈ B(0, r). Hence B(0, r) is a convex set.

(2) The unit circle is not convex, since x := (1, 0) ∈ S¹, y := (−1, 0) ∈ S¹ and t = 1/2 ∈ (0, 1), but
(1 − t)x + ty = (x + y)/2 = (0, 0) ∉ S¹ since ‖(0, 0)‖₂ = 0 ≠ 1. So S¹ is not convex.
(3) Let x, y ∈ Σ and t ∈ (0, 1). Then Ax = Ay = b and all the components of x and of y
are nonnegative. Hence A((1 − t)x + ty) = (1 − t)Ax + tAy = (1 − t)b + tb = b. Also if
k ∈ {1, . . . , n}, then the kth component of (1 − t)x + ty is (1 − t)xk + tyk, which is nonnegative.
Thus (1 − t)x + ty ∈ Σ. Hence Σ is convex.


Solution to Exercise 1.42. For both metrics we have to verify that (D1), (D2) and (D3) are satisfied.
Since both metrics are defined by cases, we have to carefully consider all possible cases, especially when
verifying (D3).
We first consider the “express railway metric” de .
(D1) Given x, y ∈ R², we have to show that de(x, y) ≥ 0. However, this is true for both cases in the
definition of de (the first case because ‖·‖₂ satisfies (N1)).
Given x ∈ R², we also have to show that de(x, x) = 0. This is true by definition of de.
Finally, given x, y ∈ R² such that de(x, y) = 0, we have to show that x = y. Suppose on the contrary
that x ≠ y. Then by definition, de(x, y) = ‖x‖₂ + ‖y‖₂. Therefore, ‖x‖₂ + ‖y‖₂ = 0. Since ‖x‖₂ ≥ 0 and
‖y‖₂ ≥ 0 ((N1) holds for ‖·‖₂), it follows that ‖x‖₂ = 0 and ‖y‖₂ = 0. By (N1) it follows that x = 0 and
y = 0. This contradicts the assumption that x ≠ y.
Therefore, x = y.
(D2) Let u, v ∈ R². If u = v, then de(u, v) = 0 = de(v, u). Otherwise,

de(u, v) = ‖u‖₂ + ‖v‖₂ = ‖v‖₂ + ‖u‖₂ = de(v, u).

(D3) Given x, y, z ∈ R² we have to show that de(x, z) ≤ de(x, y) + de(y, z).
If x = z then de(x, z) = 0 ≤ de(x, y) + de(y, z) by definition of de(x, x) and by (D1) (already proved).
If x = y then de(x, y) + de(y, z) = de(x, x) + de(x, z) = de(x, z), also by (D1), so the triangle inequality
holds.
If y = z then the triangle inequality holds similarly.
So we may assume without loss of generality that the above three cases do not occur: x ≠ z, x ≠ y,
y ≠ z. Then

de(x, z) = ‖x‖₂ + ‖z‖₂ (by definition of de)
≤ ‖x‖₂ + ‖y‖₂ + ‖y‖₂ + ‖z‖₂ (by (N1), ‖y‖₂ ≥ 0)
= de(x, y) + de(y, z) (again by definition of de).

We have shown that de is a metric on R².

Next we consider the “stopping railway metric” ds .


(D1) Given x, y ∈ R², ds(x, y) ≥ 0, because in both cases ds(x, y) is defined by a 2-norm or a sum of
2-norms, and we know that ‖x‖₂ ≥ 0 for all x ∈ R².
Given x ∈ R²,

ds(x, x) = ‖x − x‖₂ (since x and x are linearly dependent)
= ‖0‖₂ = 0.

Given x, y ∈ R² such that ds(x, y) = 0, we have to show that x = y. Suppose that x and y are linearly
independent. Then ‖x‖₂ + ‖y‖₂ = 0. As before, this implies that ‖x‖₂ = ‖y‖₂ = 0 and x = 0 = y, which
contradicts linear independence.
Therefore, x and y are linearly dependent. Then ds(x, y) = ‖x − y‖₂ = 0. It follows from (N1) that
x − y = 0 and therefore, x = y.
(D2) Let x, y ∈ R² be given.
If x and y are linearly independent, then ds(x, y) = ‖x‖₂ + ‖y‖₂ = ‖y‖₂ + ‖x‖₂ = ds(y, x), the last
equality because y and x are linearly independent.
If x and y are linearly dependent, then ds(x, y) = ‖x − y‖₂ = ‖(−1)(y − x)‖₂ = |−1|‖y − x‖₂ = ds(y, x),
the last equality because y and x are linearly dependent.
(D3) Let x, y, z ∈ R² be given. We have to show that ds(x, z) ≤ ds(x, y) + ds(y, z). For each of {x, z},
{x, y}, {y, z}, the set is either linearly independent (LI) or linearly dependent (LD). Thus there are 2 × 2 × 2 = 8
cases to consider:


{x, z}  {x, y}  {y, z}   Inequality to prove

LI      LI      LI       ‖x‖₂ + ‖z‖₂ ≤ (‖x‖₂ + ‖y‖₂) + (‖y‖₂ + ‖z‖₂).
                         This is true because ‖y‖₂ ≥ 0.

LI      LI      LD       ‖x‖₂ + ‖z‖₂ ≤ (‖x‖₂ + ‖y‖₂) + ‖y − z‖₂.
                         This follows from ‖z‖₂ = ‖y − y + z‖₂ ≤ ‖y‖₂ + ‖−y + z‖₂ (N3)
                         = ‖y‖₂ + ‖y − z‖₂ (N2).

LI      LD      LI       ‖x‖₂ + ‖z‖₂ ≤ ‖x − y‖₂ + (‖y‖₂ + ‖z‖₂).
                         Similar to the previous case.

LI      LD      LD       ‖x‖₂ + ‖z‖₂ ≤ ‖x − y‖₂ + ‖y − z‖₂.
                         This case can only occur if y = 0, since y has to be a multiple of x
                         and a multiple of z, but x and z are linearly independent.
                         So the inequality follows from ‖x‖₂ + ‖z‖₂ = ‖x − 0‖₂ + ‖0 − z‖₂.

LD      LI      LI       ‖x − z‖₂ ≤ (‖x‖₂ + ‖y‖₂) + (‖y‖₂ + ‖z‖₂).
                         This follows from ‖x − z‖₂ ≤ ‖x‖₂ + ‖−z‖₂ (N3)
                         = ‖x‖₂ + ‖z‖₂ (N2)
                         ≤ ‖x‖₂ + 2‖y‖₂ + ‖z‖₂.

LD      LI      LD       ‖x − z‖₂ ≤ (‖x‖₂ + ‖y‖₂) + ‖y − z‖₂.
                         This case can only occur if z = 0, since z has to be a multiple of each of the
                         linearly independent vectors x and y. So the inequality follows from
                         ‖x − 0‖₂ ≤ (‖x‖₂ + ‖y‖₂) + ‖y − 0‖₂.

LD      LD      LI       ‖x − z‖₂ ≤ ‖x − y‖₂ + (‖y‖₂ + ‖z‖₂).
                         Similar to the previous case.

LD      LD      LD       ‖x − z‖₂ ≤ ‖x − y‖₂ + ‖y − z‖₂.
                         Follows from ‖x − z‖₂ = ‖x − y + y − z‖₂ ≤ ‖x − y‖₂ + ‖y − z‖₂ by (N3).
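
As a sanity check on the case analysis, both railway metrics can be tested by brute force on a small grid of integer points. This is only a sketch: the definitions of d_express and d_stopping below are inferred from how de and ds are used in this solution (they are not quoted from the exercise), and all function names are hypothetical. Integer coordinates keep the linear dependence test exact; a small tolerance absorbs floating-point rounding in the norms.

```python
import itertools
import math

def norm2(x):
    return math.hypot(x[0], x[1])

def d_express(x, y):
    # Assumed "express railway" metric: 0 if x == y, else ||x|| + ||y||.
    return 0.0 if x == y else norm2(x) + norm2(y)

def linearly_dependent(x, y):
    # Two vectors in R^2 are linearly dependent iff their determinant is 0.
    return x[0] * y[1] - x[1] * y[0] == 0

def d_stopping(x, y):
    # Assumed "stopping railway" metric: ||x - y|| on a common line through 0,
    # otherwise ||x|| + ||y||.
    if linearly_dependent(x, y):
        return norm2((x[0] - y[0], x[1] - y[1]))
    return norm2(x) + norm2(y)

def triangle_holds(d, points, tol=1e-12):
    # Check d(x, z) <= d(x, y) + d(y, z) for every triple of points.
    return all(d(x, z) <= d(x, y) + d(y, z) + tol
               for x, y, z in itertools.product(points, repeat=3))

grid = [(i, j) for i in range(-2, 3) for j in range(-2, 3)]
print(triangle_holds(d_express, grid), triangle_holds(d_stopping, grid))
```

A finite check like this proves nothing, but a failure on the grid would point to a mistake in one of the eight cases.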


Solutions to the exercises from Chapter 2


Solution to Exercise 2.5. Let ǫ > 0. Since (an )n∈N is a Cauchy sequence, there exists an N ∈ N
such that for all n, m > N , |an − am | < ǫ. In particular for n > N , we have m := n + 1 > N and so
|an − an+1 | < ǫ, that is, |(an+1 − an ) − 0| < ǫ. So (an+1 − an )n∈N is convergent with limit 0.

Solution to Exercise 2.9.

(1) True.
Let (a_n)n∈N be convergent with limit L, and let (a_{n_k})k∈N be a subsequence. If ε > 0, then there
exists an N ∈ N such that for all n > N, |a_n − L| < ε. Choose K ∈ N such that for all k > K,
n_k > N. Then for all k > K, n_k > N and so |a_{n_k} − L| < ε. Thus (a_{n_k})k∈N is convergent with
limit L.
(2) False.
The sequence ((−1)n )n∈N is divergent, but it has as a subsequence ((−1)2n )n∈N = (1)n∈N , which
is convergent with limit 1.
(3) True.
Let (an )n∈N be bounded, and let M > 0 be a number such that for all n ∈ N, |an | ≤ M . If
(ank )k∈N is a subsequence, then also |ank | ≤ M for all k ∈ N, and so it is bounded as well.
(4) False.
The sequence 1, 0, 2, 0, 3, 0, . . . is unbounded, but has as a subsequence 0, 0, 0, . . . , which is
bounded.
(5) True.
Let (a_n)n∈N be an increasing sequence. (The argument is similar for a decreasing sequence.)
Suppose that (a_{n_k})k∈N is a subsequence. For each k ∈ N, n_{k+1} > n_k and so we have that
a_{n_{k+1}} ≥ a_{n_{k+1}−1} ≥ a_{n_{k+1}−2} ≥ · · · ≥ a_{n_k}. Thus (a_{n_k})k∈N is increasing as well.
(6) False.
The sequence 1, 0, 2, 0, 3, 0, . . . is not monotone, but it has as a subsequence 1, 2, 3, . . . , which is
monotone.
(7) True.
The sequence itself is a subsequence of itself, and so it must be convergent too.
(8) False.
The sequence (an )n∈N = ((−1)n )n∈N is divergent, but we have that both (a2n )n∈N = (1)n∈N and
(a2n+1 )n∈N = (−1)n∈N are convergent (with limits 1 and −1, respectively).
(9) True.
Let L be the common limit, and let ǫ > 0. There exists an N1 ∈ N such that for all n > N1 ,
|a2n − L| < ǫ. Also, there exists an N2 ∈ N such that for all n > N2 , |a2n+1 − L| < ǫ. Choose
N ∈ N such that N > max{2N1 , 2N2 + 1}. Let n > N . If n is even, say n = 2k for some k ∈ N,
then 2k > N ≥ 2N1 gives k > N1 , and so |a2k − L| = |an − L| < ǫ. If, on the other hand, n
is odd, say n = 2k + 1 for some k ∈ N, then 2k + 1 = n > N ≥ 2N2 + 1 gives k > N2 , and
so |a2k+1 − L| = |an − L| < ǫ. Hence for all n > N , |an − L| < ǫ. Consequently, (an )n∈N is
convergent with limit L.

Solution to Exercise 2.10. Let (an )n∈N be a bounded increasing sequence of real numbers. Let M
be the minimum (or least element) of the set of upper bounds of {an : n ∈ N}. The existence of M is
guaranteed by the least upper bound property of the set of real numbers. We show that M is the limit of
(an )n∈N . Taking ǫ > 0, we must show that there exists a positive integer N such that |an − M | < ǫ for all
n > N . Since M − ǫ < M , M − ǫ is not an upper bound of {an : n ∈ N}. Therefore there exists N with
M ≥ aN > M − ǫ. Since (an )n∈N is increasing, |an − M | < ǫ for all n ≥ N .


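Solution to Exercise 2.11. By the binomial expansion, we have
\[
\begin{aligned}
\Bigl(1 + \frac{1}{n}\Bigr)^{n}
&= 1 + n \cdot \frac{1}{n} + \frac{n(n-1)}{1 \cdot 2} \frac{1}{n^2} + \frac{n(n-1)(n-2)}{1 \cdot 2 \cdot 3} \frac{1}{n^3} + \dots + \frac{n \cdots 2}{(n-1)!} \frac{1}{n^{n-1}} + \frac{1}{n^n} \\
&= 1 + 1 + \Bigl(1 - \frac{1}{n}\Bigr)\frac{1}{2} + \Bigl(1 - \frac{1}{n}\Bigr)\Bigl(1 - \frac{2}{n}\Bigr)\frac{1}{3!} + \dots \\
&\quad + \Bigl(1 - \frac{1}{n}\Bigr)\Bigl(1 - \frac{2}{n}\Bigr) \cdots \Bigl(1 - \frac{n-2}{n}\Bigr)\frac{1}{(n-1)!} \\
&\quad + \Bigl(1 - \frac{1}{n}\Bigr)\Bigl(1 - \frac{2}{n}\Bigr) \cdots \Bigl(1 - \frac{n-1}{n}\Bigr)\frac{1}{n!} \\
&< 1 + 1 + \Bigl(1 - \frac{1}{n+1}\Bigr)\frac{1}{2} + \Bigl(1 - \frac{1}{n+1}\Bigr)\Bigl(1 - \frac{2}{n+1}\Bigr)\frac{1}{3!} + \dots \\
&\quad + \Bigl(1 - \frac{1}{n+1}\Bigr)\Bigl(1 - \frac{2}{n+1}\Bigr) \cdots \Bigl(1 - \frac{n-2}{n+1}\Bigr)\frac{1}{(n-1)!} \\
&\quad + \Bigl(1 - \frac{1}{n+1}\Bigr)\Bigl(1 - \frac{2}{n+1}\Bigr) \cdots \Bigl(1 - \frac{n-1}{n+1}\Bigr)\frac{1}{n!} \\
&\quad + \Bigl(1 - \frac{1}{n+1}\Bigr)\Bigl(1 - \frac{2}{n+1}\Bigr) \cdots \Bigl(1 - \frac{n}{n+1}\Bigr)\frac{1}{(n+1)!} \\
&= \Bigl(1 + \frac{1}{n+1}\Bigr)^{n+1}.
\end{aligned}
\]
The inequality holds because each factor 1 − k/n is at most 1 − k/(n + 1), and the last line of the
right-hand side is an additional positive term. So the sequence is increasing. Also,
\[
\begin{aligned}
\Bigl(1 + \frac{1}{n}\Bigr)^{n}
&= 1 + 1 + \Bigl(1 - \frac{1}{n}\Bigr)\frac{1}{2} + \Bigl(1 - \frac{1}{n}\Bigr)\Bigl(1 - \frac{2}{n}\Bigr)\frac{1}{3!} + \dots \\
&\quad + \Bigl(1 - \frac{1}{n}\Bigr)\Bigl(1 - \frac{2}{n}\Bigr) \cdots \Bigl(1 - \frac{n-1}{n}\Bigr)\frac{1}{n!} \\
&\le 1 + 1 + 1 \cdot \frac{1}{2} + 1 \cdot \frac{1}{3!} + \dots + 1 \cdot \frac{1}{n!} \\
&\le 1 + 1 + \frac{1}{2} + \frac{1}{2 \cdot 2} + \frac{1}{2 \cdot 2 \cdot 2} + \dots + \frac{1}{2^{n-1}} \\
&= 1 + \frac{1 - \frac{1}{2^n}}{1 - \frac{1}{2}} < 1 + 2 = 3,
\end{aligned}
\]
and so the sequence is also bounded.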
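
The two conclusions (increasing, and bounded above by 3) can be checked exactly for the first few terms using rational arithmetic. This is an illustrative sketch only; the name e_term and the cutoff of 50 terms are arbitrary choices made for this example.

```python
from fractions import Fraction

def e_term(n):
    # Exact value of (1 + 1/n)^n as a rational number.
    return (1 + Fraction(1, n)) ** n

terms = [e_term(n) for n in range(1, 51)]

# Strictly increasing, as shown by the term-by-term comparison above.
increasing = all(a < b for a, b in zip(terms, terms[1:]))
# Bounded above by 3, as shown by the geometric series estimate.
bounded = all(t < 3 for t in terms)

print(increasing, bounded, float(terms[-1]))
```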

Solution to Exercise 2.12.


(1) Divergent. (cos(πn))n∈N = ((−1)^n)n∈N, which has the convergent subsequences 1, 1, 1, . . . and
−1, −1, −1, . . . with different limits 1 and −1, respectively. But all subsequences of a convergent
sequence must converge to the same limit (namely the limit of the original sequence). This
means that (cos(πn))n∈N = ((−1)^n)n∈N can’t be convergent.
(2) Divergent. The sequence is not bounded, whereas we know that all convergent real sequences
must be bounded.
(3) Convergent with limit 0. Indeed, we have for all n that
\[ -\frac{1}{n} \le \frac{\sin n}{n} \le \frac{1}{n}, \]
and the claim follows by the Sandwich Theorem.
(4) Divergent. The sequence is not bounded, since for any M > 0,
\[
\Bigl| 1 - \frac{3n^2}{n+1} \Bigr|
= \Bigl| 1 - \frac{3(n^2-1) + 3}{n+1} \Bigr|
= \Bigl| 1 - 3(n-1) - \frac{3}{n+1} \Bigr|
= \Bigl| -3n + 4 - \frac{3}{n+1} \Bigr|
= 3n - 4 + \frac{3}{n+1} > 3n - 4 > M
\]
if we choose n > (M + 4)/3. On the other hand, all convergent real sequences must be bounded.



(5) Convergent with limit 1. For n ≥ 2, we have n^{1/n} > 1^{1/n} = 1. Set
\[ b_n := n^{1/n} - 1 > 0 \quad \text{for } n \ge 2. \]
Then n^{1/n} = 1 + b_n, and so
\[ n = (1 + b_n)^n = 1 + n b_n + \frac{n(n-1)}{2} b_n^2 + \dots + n b_n^{n-1} + b_n^n > \frac{n(n-1)}{2} b_n^2. \]
Thus
\[ \frac{2}{n-1} > b_n^2 > 0 \quad \text{for } n \ge 2, \]
and so by the Sandwich Theorem, lim b_n² = 0 as n → ∞. Hence also lim b_n = 0, and so
\[ \lim_{n\to\infty} n^{1/n} = \lim_{n\to\infty} (1 + b_n) = 1 + 0 = 1. \]

An alternate proof can be given using the Arithmetic Mean-Geometric Mean Inequality, which
says that for nonnegative real numbers a1, . . . , an, there holds that
\[ \text{their arithmetic mean} := \frac{a_1 + \dots + a_n}{n} \ge \sqrt[n]{a_1 \cdots a_n} =: \text{their geometric mean}. \]
Applying this to the n numbers 1, . . . , 1 (n − 2 times), √n, √n gives
\[ \frac{(n-2) \cdot 1 + \sqrt{n} + \sqrt{n}}{n} \ge \sqrt[n]{\sqrt{n} \cdot \sqrt{n}} = n^{1/n}, \]
and so
\[ 1 - \frac{2}{n} + \frac{2}{\sqrt{n}} \ge n^{1/n} \;(\ge 1). \]
By the Sandwich Theorem, it follows that lim n^{1/n} = 1 as n → ∞.

(6) Convergent with limit 1. We have
\[ a_n := 0.\underbrace{9 \dots 9}_{n \text{ times}} = \frac{9}{10} + \frac{9}{100} + \frac{9}{1000} + \dots + \frac{9}{10^n} = \frac{9}{10} \cdot \frac{1 - \bigl(\frac{1}{10}\bigr)^n}{1 - \frac{1}{10}}. \]
Since lim 1/10^n = 0 as n → ∞, it follows that
\[ \lim_{n\to\infty} a_n = \frac{9}{10} \cdot \frac{1 - 0}{1 - \frac{1}{10}} = 1. \]
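
The two upper bounds for n^{1/n} obtained in part (5) can be compared numerically. This is a floating-point sketch under the stated bounds; the function names are hypothetical, and a small tolerance is needed because the AM-GM bound is an equality at n = 2.

```python
import math

def nth_root_of_n(n):
    return n ** (1.0 / n)

def binomial_bound(n):
    # From b_n^2 < 2/(n - 1): n^(1/n) < 1 + sqrt(2/(n - 1)).
    return 1 + math.sqrt(2 / (n - 1))

def amgm_bound(n):
    # From the AM-GM argument: n^(1/n) <= 1 - 2/n + 2/sqrt(n).
    return 1 - 2 / n + 2 / math.sqrt(n)

for n in (2, 10, 100, 10_000):
    print(n, nth_root_of_n(n), amgm_bound(n), binomial_bound(n))
```

Both bounds tend to 1, which is what the Sandwich Theorem needs.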

Solution to Exercise 2.15. (“If” part) Suppose that (a_n^{(k)})n∈N is convergent with limit L^{(k)}, k = 1, . . . , d.
Let ε > 0. We have the existence of indices N1, . . . , Nd ∈ N such that for n > Nk, |a_n^{(k)} − L^{(k)}| < ε/√d,
k = 1, . . . , d. So with N := max{N1, . . . , Nd}, for n > N we have that |a_n^{(k)} − L^{(k)}| < ε/√d, k = 1, . . . , d.
Now define L ∈ R^d by setting its kth component to be L^{(k)}, k = 1, . . . , d. Then we have
\[ \|a_n - L\|_2 = \sqrt{\sum_{k=1}^{d} |a_n^{(k)} - L^{(k)}|^2} < \sqrt{\sum_{k=1}^{d} \bigl(\epsilon/\sqrt{d}\bigr)^2} = \epsilon. \]
Thus (a_n)n∈N converges to L.

(“Only if” part) Suppose that (a_n)n∈N converges to L, and let the kth component of L be denoted by
L^{(k)}, k = 1, . . . , d. If ε > 0, then let N ∈ N be such that for all n > N, ‖a_n − L‖₂ < ε. But then we have
for n > N that
\[ |a_n^{(k)} - L^{(k)}| \le \sqrt{\sum_{\ell=1}^{d} |a_n^{(\ell)} - L^{(\ell)}|^2} = \|a_n - L\|_2 < \epsilon, \]
for each k = 1, . . . , d. This shows that (a_n^{(k)})n∈N is convergent with limit L^{(k)}, k = 1, . . . , d.


Solution to Exercise 2.16. If we look at the first components of the terms of the sequence, we have
\[ \frac{n}{4n+2} = \frac{1}{4 + \frac{2}{n}} \xrightarrow{\,n\to\infty\,} \frac{1}{4+0} = \frac{1}{4}, \]
by the algebra of limits. If we look at the second components of the terms of the sequence, we have
\[ \frac{n^2}{n^2+1} = \frac{1}{1 + \frac{1}{n^2}} \xrightarrow{\,n\to\infty\,} \frac{1}{1+0} = 1, \]
by the algebra of limits. By the result in Exercise 2.15, we conclude that (an)n∈N is convergent with limit
\[ L := \begin{bmatrix} \frac{1}{4} \\ 1 \end{bmatrix}. \]

Solution to Exercise 2.17. (“If” part) If the sequence (an )n∈N is eventually the constant sequence with
all terms equal to c beyond the index N , then clearly d(an , c) = d(c, c) = 0 for all n > N , and so the
sequence is convergent with limit c.

(“Only if” part) Suppose that the sequence (an)n∈N is convergent with limit L. With ε = 1/2, let N ∈ N
be such that for all n > N, d(an, L) < 1/2. But then by the definition of the discrete metric, it must be
the case that d(an, L) = 0 (since it cannot be 1!). Thus an = L for all n > N, that is, the sequence is
eventually constant.

Solution to Exercise 2.18. Let ǫ > 0. There exists an N1 ∈ N such that for all n > N1 , d(an , a) < ǫ/2.
Also, there exists an N2 ∈ N such that for all n > N2 , d(bn , b) < ǫ/2. We have for each n ∈ N that
d(an , bn ) ≤ d(an , a) + d(a, b) + d(b, bn ),
d(a, b) ≤ d(a, an ) + d(an , bn ) + d(bn , b),
and combining these, we have for n > max{N1 , N2 } that
|d(an , bn ) − d(a, b)| ≤ d(an , a) + d(bn , b) < ǫ/2 + ǫ/2 = ǫ.
Consequently the real sequence (d(an , bn ))n∈N is convergent with limit d(a, b).

Solution to Exercise 2.19.


(1) We will use the following simple fact, called the “Arithmetic mean-geometric mean (AM-GM) inequality”.
For nonnegative real numbers a, b, we have
\[ \frac{a+b}{2} \ge \sqrt{ab}, \]
with equality if and only if a = b. This is an immediate consequence of the inequality (√a − √b)² ≥ 0.
We will prove the first string of inequalities using induction on n. The claim is true for n = 1, since
we have
\[
\begin{aligned}
0 &< x_1 && \text{(given)}, \\
x_2 &= \sqrt{x_1 y_1} > \sqrt{x_1 x_1} = x_1 && (y_1 > x_1 > 0), \\
y_2 &= \frac{x_1 + y_1}{2} > \sqrt{x_1 y_1} = x_2 && \text{(AM-GM inequality and } x_1 \neq y_1), \\
y_2 &= \frac{x_1 + y_1}{2} < \frac{y_1 + y_1}{2} = y_1 && (x_1 < y_1).
\end{aligned}
\]
Thus 0 < x1 < x2 < y2 < y1.
Let us now assume that for some n ∈ N, 0 < x_n < x_{n+1} < y_{n+1} < y_n. Then
\[
\begin{aligned}
x_{n+1} &> 0 && \text{(induction hypothesis)}, \\
x_{n+2} &= \sqrt{x_{n+1} y_{n+1}} > \sqrt{x_{n+1} x_{n+1}} = x_{n+1} && (y_{n+1} > x_{n+1} > 0), \\
y_{n+2} &= \frac{x_{n+1} + y_{n+1}}{2} > \sqrt{x_{n+1} y_{n+1}} = x_{n+2} && \text{(AM-GM inequality and } x_{n+1} \neq y_{n+1}), \\
y_{n+2} &= \frac{x_{n+1} + y_{n+1}}{2} < \frac{y_{n+1} + y_{n+1}}{2} = y_{n+1} && (x_{n+1} < y_{n+1}).
\end{aligned}
\]


Thus 0 < x_{n+1} < x_{n+2} < y_{n+2} < y_{n+1}, completing the induction step. Consequently, the claim holds by
induction.
Using x_{n+1} > x_n, we also have
\[ y_{n+1} - x_{n+1} < y_{n+1} - x_n = \frac{x_n + y_n}{2} - x_n = \frac{y_n - x_n}{2}. \]
(2) Since x_n < x_{n+1} for all n ∈ N, the sequence (x_n)n∈N is increasing. Also y_{n+1} < y_n for all n ∈ N, and so
the sequence (y_n)n∈N is decreasing. In particular, y_n ≤ y_1 for all n ∈ N. The inequality x_n < y_n (n ∈ N)
now yields x_n < y_n ≤ y_1. Thus we have x_1 ≤ x_n ≤ y_1 (n ∈ N), and so the sequence (x_n)n∈N is bounded.
As it is also increasing, it follows from the Bolzano-Weierstrass Theorem that it must be convergent.
For the sequence (y_n)n∈N, we have 0 < y_n ≤ y_1 for all n ∈ N, showing that it is bounded. It is also
decreasing. Again it follows from the Bolzano-Weierstrass Theorem that (y_n)n∈N is convergent as well.
Finally, we have for all n > 1 that
\[ |x_n - y_n| < \frac{|x_{n-1} - y_{n-1}|}{2}. \]
Set L_x := lim x_n and L_y := lim y_n (as n → ∞). Passing to the limit as n → ∞ in the inequality above, it follows
that
\[ |L_x - L_y| \le \frac{|L_x - L_y|}{2}, \]
and so if |L_x − L_y| ≠ 0, we arrive at the contradiction that 1 ≤ 1/2. Hence L_x = L_y = c (say), that is,
\[ \lim_{n\to\infty} x_n = \lim_{n\to\infty} y_n =: c. \]
Consequently, lim v_n (as n → ∞) exists, and is equal to (c, c).
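
The recursion x_{n+1} = √(x_n y_n), y_{n+1} = (x_n + y_n)/2 is easy to iterate numerically, and doing so illustrates the gap estimate y_{n+1} − x_{n+1} < (y_n − x_n)/2. The following is a floating-point sketch only; the starting values 1 and 2 and the function name agm_sequences are arbitrary choices for this example.

```python
import math

def agm_sequences(x1, y1, steps):
    # Iterate the geometric/arithmetic mean recursion, assuming 0 < x1 < y1.
    xs, ys = [x1], [y1]
    for _ in range(steps):
        x, y = xs[-1], ys[-1]
        xs.append(math.sqrt(x * y))   # geometric mean
        ys.append((x + y) / 2)        # arithmetic mean
    return xs, ys

xs, ys = agm_sequences(1.0, 2.0, 6)
gaps = [y - x for x, y in zip(xs, ys)]
print(xs[-1], ys[-1], gaps)
```

In practice the gap shrinks much faster than by a factor of 2 per step, but halving is all the proof requires.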


Solution to Exercise 2.24. Let us revisit the construction of the sequence in Q with limit √2 done in
part (6) of Exercise 1.35.
Consider the sequence (an)n∈N given by a1 = 3/2, and for n ≥ 1,
\[ a_{n+1} = \frac{4 + 3a_n}{3 + 2a_n}. \]
Then it can be shown by induction that (an)n∈N is monotone decreasing and bounded below by √2. So
this sequence is convergent in R, and in particular, it is a Cauchy sequence in R. So given any ε > 0, there
is an N ∈ N such that for all n, m > N, |an − am| < ε. As all the terms an, n ∈ N, are rational, it follows
that (an)n∈N is a Cauchy sequence in Q.
Next we will show that it cannot converge in Q. Suppose, on the contrary, that it has the limit r ∈ Q.
Thus if ε > 0, then there exists an N ∈ N such that for each n ∈ N satisfying n > N, we have |an − r| < ε.
So the sequence (an)n∈N is also a convergent sequence in R with the same limit r. But it then follows
from the recursive definition of the sequence that r must satisfy
\[ r = \frac{4 + 3r}{3 + 2r}, \]
and so r² = 2. But then r cannot belong to Q, a contradiction.
So we have shown the existence of a Cauchy sequence (an)n∈N in Q which does not converge in Q.
This shows that Q is not complete.
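
The rational sequence used here can be generated exactly, which makes the claims "decreasing", "bounded below by √2" and "all terms rational" easy to inspect. A minimal sketch, where the function name sqrt2_approximations is hypothetical:

```python
from fractions import Fraction

def sqrt2_approximations(count):
    # a_1 = 3/2, a_{n+1} = (4 + 3 a_n) / (3 + 2 a_n), computed exactly in Q.
    a = Fraction(3, 2)
    terms = [a]
    for _ in range(count - 1):
        a = (4 + 3 * a) / (3 + 2 * a)
        terms.append(a)
    return terms

terms = sqrt2_approximations(8)
for t in terms[:4]:
    # Each term is rational, and its square stays above 2.
    print(t, float(t * t - 2))
```

Every term is a ratio of integers, yet the squares approach 2, so the only possible limit is irrational.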

Solution to Exercise 2.25. Let ε > 0. First choose an N ∈ N so that if n, m > N, then d(x_n, x_m) < ε/2. Then
choose a K large enough so that n_K > N and d(x_{n_K}, L) < ε/2. Now for n > n_K, we have
d(x_n, L) ≤ d(x_n, x_{n_K}) + d(x_{n_K}, L) < ε/2 + ε/2 = ε.
Thus (x_n)n∈N is convergent with limit L.

Solution to Exercise 2.27. Let (M^{(k)})k∈N be a Cauchy sequence in (R^{n×m}, ‖·‖∞). We have the
inequalities
\[ |M^{(k)}_{ij} - M^{(\ell)}_{ij}| \le \|M^{(k)} - M^{(\ell)}\|_\infty \quad (k, \ell \in \mathbb{N},\ 1 \le i \le n,\ 1 \le j \le m), \]
from which it follows that each of the sequences (M^{(k)}_{ij})k∈N is Cauchy in R, and hence convergent, with
respective limits, say M_{ij}, 1 ≤ i ≤ n, 1 ≤ j ≤ m. So given ε > 0, there exists an N ∈ N such that
whenever k > N, we have
\[ |M^{(k)}_{ij} - M_{ij}| < \epsilon \quad (1 \le i \le n,\ 1 \le j \le m). \]

Set M to be the matrix with the entry in the ith row and jth column as M_{ij}, for 1 ≤ i ≤ n, 1 ≤ j ≤ m.
Thus for k > N, we have
\[ \|M^{(k)} - M\|_\infty = \max_{1 \le i \le n,\ 1 \le j \le m} |M^{(k)}_{ij} - M_{ij}| < \epsilon. \]
So the Cauchy sequence (M^{(k)})k∈N is convergent with limit M. Hence (R^{n×m}, ‖·‖∞) is complete.

Solution to Exercise 2.29. Let n, m ∈ N. Since the functions fn and fm coincide on the intervals [0, 1/2]
and [1/2 + max{1/(n+1), 1/(m+1)}, 1], we have
\[ \|f_n - f_m\|_1 = \int_0^1 |f_n(x) - f_m(x)|\,dx = \int_{\frac12}^{\frac12 + \max\{\frac{1}{n+1}, \frac{1}{m+1}\}} |f_n(x) - f_m(x)|\,dx. \]
On the interval [1/2, 1/2 + max{1/(n+1), 1/(m+1)}], we can estimate the last integral above using the fact that
|fn(x) − fm(x)| ≤ |fn(x)| + |fm(x)| ≤ 1 + 1 = 2, to obtain
\[ \|f_n - f_m\|_1 = \int_{\frac12}^{\frac12 + \max\{\frac{1}{n+1}, \frac{1}{m+1}\}} |f_n(x) - f_m(x)|\,dx \le \int_{\frac12}^{\frac12 + \max\{\frac{1}{n+1}, \frac{1}{m+1}\}} 2\,dx = 2 \max\Bigl\{\frac{1}{n+1}, \frac{1}{m+1}\Bigr\}. \]
Given ε > 0, we can choose an N ∈ N large enough so that N > 2/ε, so that whenever m, n > N, we
obtain from the above that
\[ \|f_n - f_m\|_1 \le 2 \max\Bigl\{\frac{1}{n+1}, \frac{1}{m+1}\Bigr\} \le \frac{2}{N} < \epsilon. \]
Consequently, the sequence (fn)n∈N is Cauchy in C[0, 1].
Now we show that the sequence (fn)n∈N is not convergent in C[0, 1]. Suppose, on the contrary, that
it is convergent with limit f ∈ C[0, 1]. Since fn is zero on [0, 1/2], we have
\[ \|f_n - f\|_1 = \int_0^{\frac12} |f(x)|\,dx + \int_{\frac12}^1 |f_n(x) - f(x)|\,dx. \]
In particular,
\[ 0 \le \int_0^{\frac12} |f(x)|\,dx \le \|f_n - f\|_1, \]
and since lim ‖fn − f‖₁ = 0 as n → ∞, we obtain
\[ \int_0^{\frac12} |f(x)|\,dx = 0. \]
As f is a continuous function, it follows that |f(x)| = 0 for all x in [0, 1/2], and so f(x) = 0 for all x in
[0, 1/2]. We will now show that f(x) = 1 for all x in (1/2, 1]. Suppose that y ∈ (1/2, 1]. Then we can choose a
large N such that 1/2 + 1/(N+1) < y. Consequently, for all n > N, we have
\[ \|f_n - f\|_1 = \int_{\frac12}^{y} |f_n(x) - f(x)|\,dx + \int_y^1 |1 - f(x)|\,dx. \]
In particular,
\[ 0 \le \int_y^1 |1 - f(x)|\,dx \le \|f_n - f\|_1, \]
and since lim ‖fn − f‖₁ = 0 as n → ∞, we obtain
\[ \int_y^1 |1 - f(x)|\,dx = 0. \]
As f is a continuous function, it follows that |1 − f(x)| = 0 for all x in [y, 1], and so f(x) = 1 for all x in
[y, 1]. As the choice of y ∈ (1/2, 1] was arbitrary, it follows that f(x) = 1 for all x in (1/2, 1]. So we arrive at
the conclusion that
\[ f(x) = \begin{cases} 0 & \text{if } x \in [0, \tfrac12], \\ 1 & \text{if } x \in (\tfrac12, 1], \end{cases} \]
which is clearly not continuous at 1/2. This is a contradiction to our assumption that f ∈ C[0, 1]. So the
sequence (fn)n∈N is not convergent in C[0, 1].

Solution to Exercise 2.30. Let (an)n∈N be a Cauchy sequence in X. Then with ε := 1/2, there exists an
N ∈ N such that whenever m, n > N, we have d(an, am) < ε = 1/2. But since the discrete metric assumes
only the values 0 or 1, it follows that for n, m > N, d(an, am) = 0, and so in fact an = am. Thus the
sequence is eventually a constant sequence, and is convergent.


Solution to Exercise 2.31. Let (an)n∈N be a Cauchy sequence in Z. Then with ε := 1/2, there exists an
N ∈ N such that whenever m, n > N, we have |an − am| < ε = 1/2. But since an, am are integers, an − am
is also an integer, and if it were nonzero, we would have |an − am| ≥ 1. Thus for n, m > N, |an − am| = 0,
and so in fact an = am. Thus the sequence is eventually a constant sequence, and is convergent.


Solution to Exercise 2.34. (“If” part) Suppose that (an )n∈N converges to 0. Let ǫ > 0. Then there
exists an N ∈ N such that whenever n > N , |an − 0| < ǫ. As an ≥ 0, we have that an < ǫ for all n > N .
But by the definition of an , this means that for any x ∈ X,
|fn (x) − f (x)| ≤ an < ǫ.
Hence (fn )n∈N is uniformly convergent to f .

(“Only if” part) Now suppose that (fn )n∈N is uniformly convergent to f . Let ǫ > 0. Then there exists
an N ∈ N such that whenever n > N , we have for all x ∈ X, |fn (x) − f (x)| < ǫ. But this says that ǫ is
an upper bound for the set {|fn (x) − f (x)| : x ∈ X}. By the definition of the least upper bound, it now
follows that an ≤ ǫ. Since an is nonnegative, it follows that |an − 0| = |an | = an ≤ ǫ for all n > N . In
other words, the sequence (an )n∈N converges to 0.

For a fixed x ∈ (0, ∞), we have
\[ \lim_{n\to\infty} f_n(x) = \lim_{n\to\infty} x e^{-nx} = x \cdot \lim_{n\to\infty} e^{-nx} = x \cdot 0 = 0. \]
So the pointwise limit of (fn)n∈N is the function f which is identically zero on (0, ∞). We now show that
the convergence is uniform by using the result proved above.
We have to calculate, or at least find a good upper bound for,
\[ a_n := \sup\{|f_n(x) - f(x)| : x \in (0, \infty)\} = \sup_{x\in(0,\infty)} x e^{-nx}. \]
There are various ways to do this.

The most straightforward “calculus” method is to find the extreme values of xe^{−nx} over the interval
(0, ∞). We calculate
\[ \frac{d}{dx}\, x e^{-nx} = e^{-nx} - n x e^{-nx} = e^{-nx}(1 - nx), \]
so the derivative is positive iff 1 − nx > 0 iff x < 1/n, and similarly, the derivative is negative iff x > 1/n.
Therefore, the function (0, ∞) → R, x ↦ xe^{−nx}
is strictly increasing on (0, 1/n) and strictly decreasing on (1/n, ∞), so it attains a global maximum at
x = 1/n with maximum value (1/n)e^{−n·(1/n)} = 1/(ne). Therefore,
\[ a_n = \sup_{x\in(0,\infty)} x e^{-nx} = \frac{1}{ne} \to 0 \]
as n → ∞. Hence (fn)n∈N converges uniformly to the function which is identically zero on (0, ∞).
Another method is to use the inequality e^x ≥ 1 + x, which is valid for all x ∈ R, as can be seen by
calculating the minimum value of e^x − x, x ∈ R. Thus, e^{nx} ≥ 1 + nx, hence
\[ x e^{-nx} \le \frac{x}{1 + nx} = \frac{1}{\frac{1}{x} + n} < \frac{1}{n} \quad (\text{since } x > 0), \]
and we obtain that 0 ≤ an ≤ 1/n. By the Sandwich Theorem it follows that lim an = 0 as n → ∞, and we can
conclude as before that (fn) converges uniformly to the zero function.
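
The value a_n = 1/(ne) can be cross-checked by evaluating xe^{−nx} on a fine grid. This is only a numerical sketch: the grid range (0, 10], the number of grid points and the function names are assumptions made for this example, and a grid maximum can only approximate the supremum from below.

```python
import math

def f(n, x):
    return x * math.exp(-n * x)

def sup_on_grid(n, points=20_000, upper=10.0):
    # Largest sampled value of x * exp(-n x) on a uniform grid in (0, upper].
    step = upper / points
    return max(f(n, k * step) for k in range(1, points + 1))

for n in (1, 2, 5, 10):
    print(n, sup_on_grid(n), 1 / (n * math.e))
```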

Solution to Exercise 2.35. We will show that (fn)n∈N converges uniformly to the function which is
identically 0 on [0, 1]. We have for all x ∈ [0, 1] that
\[ |f_n(x) - 0| = \frac{x}{1 + nx} = \frac{1}{n} \cdot \frac{nx}{1 + nx} < \frac{1}{n} \cdot 1. \]
Let ε > 0. Choose an N ∈ N such that N > 1/ε. Then whenever n > N, we have for all x ∈ [0, 1] that
\[ |f_n(x) - 0| < \frac{1}{n} < \frac{1}{N} < \epsilon. \]
Consequently, (fn)n∈N converges uniformly to the function which is identically zero on [0, 1].

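Solution to Exercise 2.36. If x ≠ 0, then
\[ 0 < \frac{1}{1 + x^2} < 1, \]
and so
\[ \lim_{n\to\infty} f_n(x) = 1 - \lim_{n\to\infty} \Bigl(\frac{1}{1 + x^2}\Bigr)^{n} = 1 - 0 = 1. \]
On the other hand,
\[ \lim_{n\to\infty} f_n(0) = \lim_{n\to\infty} \Bigl(1 - \frac{1}{(1 + 0^2)^n}\Bigr) = 1 - 1 = 0. \]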


So the sequence (fn)n∈N converges pointwise to f. Clearly f is discontinuous at 0: for example, the
sequence (1/n)n∈N converges to 0, but the sequence (f(1/n))n∈N = (1)n∈N converges to 1 ≠ 0 = f(0).

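Solution to Exercise 2.37. We have
\[ \lim_{m\to\infty} a_{m,n} = \lim_{m\to\infty} \frac{m}{m+n} = \lim_{m\to\infty} \frac{1}{1 + \frac{n}{m}} = \frac{1}{1+0} = 1. \]
On the other hand,
\[ \lim_{n\to\infty} a_{m,n} = \lim_{n\to\infty} \frac{m}{m+n} = \lim_{n\to\infty} \frac{\frac{m}{n}}{\frac{m}{n} + 1} = \frac{0}{0+1} = 0. \]
Hence
\[ \lim_{m\to\infty} \lim_{n\to\infty} a_{m,n} = \lim_{m\to\infty} 0 = 0 \neq 1 = \lim_{n\to\infty} 1 = \lim_{n\to\infty} \lim_{m\to\infty} a_{m,n}. \]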

Solution to Exercise 2.38. We have for x ∈ R that
\[ |f_n(x) - f(x)| = |f_n(x)| = \frac{|\sin(nx)|}{\sqrt{n}} \le \frac{1}{\sqrt{n}}. \]
Let ε > 0. Choose an N ∈ N such that N > 1/ε². Then whenever n > N, we have for all x ∈ R that
\[ |f_n(x) - f(x)| \le \frac{1}{\sqrt{n}} < \frac{1}{\sqrt{N}} < \epsilon. \]
Consequently, (fn)n∈N converges pointwise (in fact even uniformly!) to the function f which is identically
zero on R.
We have
\[ f_n'(x) := \frac{d}{dx} f_n(x) = \frac{n \cos(nx)}{\sqrt{n}} = \sqrt{n} \cos(nx). \]
f′ on the other hand is identically 0 on R. If we look at x = 0, the sequence (fn′(0))n∈N = (√n)n∈N is clearly
unbounded, and thus not convergent. Hence (fn′)n∈N is not pointwise convergent (to f′).

Solution to Exercise 2.39. If $x = 0$ or $x = 1$, then $f_n(x) = 0$, and so $(f_n(x))_{n\in\mathbb{N}} = (0)_{n\in\mathbb{N}}$ is convergent
with limit $0$ ($= f(x)$). Now suppose that $x \in (0,1)$. Then $0 < 1 - x^2 < 1$. So we can find a positive
number $h$ such that $1 - x^2 = 1/(1+h)$. By the binomial expansion for $(1+h)^n$, we have for $n \ge 2$ the inequality
$(1+h)^n > \binom{n}{2}h^2 = \frac{n(n-1)}{2}h^2$, which yields
\[ 0 < nx(1-x^2)^n < \frac{2nx}{n(n-1)h^2} = \frac{2x}{(n-1)h^2}. \]
Thus by the Sandwich Theorem we obtain that $(f_n(x))_{n\in\mathbb{N}}$ converges to $0$. Consequently $(f_n)_{n\in\mathbb{N}}$ converges
pointwise to the zero function $f$ on $[0,1]$.
Using the variable substitution $u = 1 - x^2$, we see that
\[ \int_0^1 f_n(x)\,dx = \int_0^1 nx(1-x^2)^n\,dx = \frac{1}{2}\int_0^1 n u^n\,du = \frac{1}{2}\cdot\frac{n}{n+1}. \]
Thus
\[ \lim_{n\to\infty}\int_0^1 f_n(x)\,dx = \lim_{n\to\infty}\frac{1}{2}\cdot\frac{n}{n+1} = \frac{1}{2} \neq 0 = \int_0^1 f(x)\,dx = \int_0^1 \lim_{n\to\infty} f_n(x)\,dx. \]
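
The value of the integral in Exercise 2.39 can be cross-checked numerically. The following is a minimal sketch using a midpoint rule; the step count is an arbitrary choice and the helper names are illustrative.

```python
def f_n(x: float, n: int) -> float:
    """f_n(x) = n x (1 - x^2)^n on [0, 1]."""
    return n * x * (1.0 - x * x) ** n


def midpoint_integral(n: int, steps: int = 100_000) -> float:
    """Approximate the integral of f_n over [0, 1] with the midpoint rule."""
    h = 1.0 / steps
    return h * sum(f_n((i + 0.5) * h, n) for i in range(steps))


def exact_integral(n: int) -> float:
    """Closed form n / (2 (n + 1)) obtained from the substitution in the solution."""
    return n / (2.0 * (n + 1))


if __name__ == "__main__":
    for n in (1, 5, 50):
        print(n, midpoint_integral(n), exact_integral(n))
```

The integrals approach $1/2$ as $n$ grows, even though $f_n(x) \to 0$ for every fixed $x$.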


Solution to Exercise 2.41.

(1) Let $(x^{(n)})_{n\in\mathbb{N}}$ be a sequence such that $x^{(n)}$ belongs to $H$ for each $n \in \mathbb{N}$, and suppose that
\[ \lim_{n\to\infty} x^{(n)} = L = \begin{pmatrix} L_1 \\ \vdots \\ L_d \end{pmatrix} \]
in $\mathbb{R}^d$. If $x_1^{(n)}, \dots, x_d^{(n)}$ are the components of $x^{(n)}$, $n \in \mathbb{N}$, then we have
\[ \lim_{n\to\infty} x_i^{(n)} = L_i \quad (i = 1, \dots, d). \]
Since $x^{(n)}$ belongs to $H$ for each $n \in \mathbb{N}$, we also have
\[ a_1 x_1^{(n)} + \dots + a_d x_d^{(n)} = a^\top x^{(n)} = \beta \quad (n \in \mathbb{N}), \]
where $a_1, \dots, a_d$ denote the components of $a$. Passing to the limit as $n \to \infty$, we obtain
\[ a^\top L = a_1 L_1 + \dots + a_d L_d = \lim_{n\to\infty}\left(a_1 x_1^{(n)} + \dots + a_d x_d^{(n)}\right) = \beta, \]
and so $L \in H$. Hence $H$ is closed, by the characterization of closed sets given in Theorem 2.40.
(2) Let $r_1, \dots, r_m \in \mathbb{R}^n$ be vectors such that $r_1^\top, \dots, r_m^\top$ are the rows of the matrix $A$. Also, let
$b_1, \dots, b_m$ denote the components of $b$. Then it is clear that the equation $Ax = b$ is satisfied by
a vector $x \in \mathbb{R}^n$ if and only if it satisfies $r_1^\top x = b_1, \dots, r_m^\top x = b_m$. Thus
\[ S = \{x \in \mathbb{R}^n : Ax = b\} = \bigcap_{i=1}^{m} \{x \in \mathbb{R}^n : r_i^\top x = b_i\}. \]
Hence $S$ is a finite intersection of the closed sets $H_i := \{x \in \mathbb{R}^n : r_i^\top x = b_i\}$, $i = 1, \dots, m$
(each of which is closed by the argument in part (1)), and so it follows that $S$ is closed.
(3) We first show that the set $\Pi := \{x \in \mathbb{R}^n : x_1 \ge 0, \dots, x_n \ge 0\}$ is closed. Let $(x^{(k)})_{k\in\mathbb{N}}$ be a
sequence such that $x^{(k)}$ belongs to $\Pi$ for each $k \in \mathbb{N}$, and suppose that
\[ \lim_{k\to\infty} x^{(k)} = L = \begin{pmatrix} L_1 \\ \vdots \\ L_n \end{pmatrix} \]
in $\mathbb{R}^n$. If $x_1^{(k)}, \dots, x_n^{(k)}$ are the components of $x^{(k)}$, $k \in \mathbb{N}$, then we have
\[ \lim_{k\to\infty} x_i^{(k)} = L_i \quad (i = 1, \dots, n). \]
Since $x^{(k)}$ belongs to $\Pi$ for each $k \in \mathbb{N}$, we also have
\[ x_1^{(k)} \ge 0, \; \dots, \; x_n^{(k)} \ge 0 \quad (k \in \mathbb{N}). \]
Passing to the limit as $k \to \infty$, these inequalities yield
\[ L_1 \ge 0, \; \dots, \; L_n \ge 0, \]
and so $L \in \Pi$. Hence $\Pi$ is closed, by the characterization of closed sets given in Theorem 2.40.
Since $\Sigma$ is the intersection of the two closed sets $\Pi$ and $S$, it now follows that $\Sigma$ is closed too.

Solution to Exercise 2.46. “=⇒”: We are given that S ⊆ X is bounded, that is, ∃M > 0 such that
∀x, y ∈ S, d(x, y) ≤ M . We have to find an R > 0 such that for all x ∈ S, d(x, x0 ) ≤ R.
Choose some fixed x1 ∈ S. (Note that if S = ∅, then there is nothing to prove.)
Set R = M + d(x0 , x1 ). Since M > 0 and d(x0 , x1 ) ≥ 0, also R > 0.
Let x ∈ S be given. Then
d(x, x0 ) ≤ d(x, x1 ) + d(x1 , x0 ) (triangle inequality in X)
≤ M + d(x1 , x0 ) (since x, x1 ∈ S)
= R. 

“⇐=”: Now we are given that ∃R > 0 such that ∀x ∈ S, d(x, x0 ) ≤ R, and we have to find an M > 0
such that ∀x, y ∈ S, d(x, y) ≤ M .
Set M = 2R. Since R > 0, also M > 0.


Let x, y ∈ S be given. Then again by the triangle inequality,


d(x, y) ≤ d(x, x0 ) + d(x0 , y) ≤ R + R = M. 

Solution to Exercise 2.53.


(1) False. As a counterexample take S = R. Any subsequence of any convergent sequence in R will
have a limit in R, and yet R is not compact. The reason R is not compact is that there exist
sequences in R without any convergent subsequence, such as (n)n∈N .
More generally, any closed S ⊆ R will have the property that each convergent sequence in
S has a convergent subsequence with limit in S, because the subsequence will have the same
limit, and will be in S because S is closed.
(2) False. It is not given that the closed and bounded sets under consideration are in $\mathbb{R}$. As a
counterexample take $\mathbb{N}$ with the discrete metric: for any $x, y \in \mathbb{N}$ define $d(x,y) = 1$ when
$x \neq y$ and $d(x,y) = 0$ when $x = y$. Then $\mathbb{N}$ itself is closed and bounded, but is not compact,
since the sequence $(n)_{n\in\mathbb{N}}$ does not have a convergent subsequence. (There are many other
counterexamples, such as the set $\{f \in C[a,b] : \|f\|_\infty \le 1\}$ in $C[a,b]$.)
(3) True. Let K be a compact subset of X. Considered as a subset of the metric space X, that K
is compact means that any sequence in K has a convergent (in X) subsequence with limit in
K. A subsequence being convergent to, say, L ∈ K ⊆ X also means that it is convergent in Y ,
since L ∈ Y . Therefore, any sequence in K has a convergent (in Y ) subsequence with limit in
K.
The moral is that being compact is a property that depends only on the set K and the
values of the metric on elements of K, not on the underlying space that contains K.

Solution to Exercise 2.54. Since $K$ is compact in $\mathbb{R}^d$, it is closed and bounded. Let $R > 0$ be such
that for all $x \in K$, $\|x\|_2 \le R$. In particular, for every $x \in K \cap F$, we have $\|x\|_2 \le R$. Thus $K \cap F$ is
bounded. Also, since both $K$ and $F$ are closed, it follows that $K \cap F$ is closed as well. Hence $K \cap F$ is closed
and bounded, and so by Theorem 2.49, we can conclude that $K \cap F$ is compact.

Solution to Exercise 2.55. Clearly $S^{d-1}$ is bounded. It is also closed, and we prove this below. Let
$(x^{(n)})_{n\in\mathbb{N}}$ be a sequence in $S^{d-1}$ which converges to $L$ in $\mathbb{R}^d$. Let
\[ L = \begin{pmatrix} L_1 \\ \vdots \\ L_d \end{pmatrix} \quad \text{and} \quad x^{(n)} = \begin{pmatrix} x_1^{(n)} \\ \vdots \\ x_d^{(n)} \end{pmatrix} \quad (n \in \mathbb{N}). \]
Then $\lim_{n\to\infty} x_i^{(n)} = L_i$ ($i = 1, \dots, d$). Since $x^{(n)}$ belongs to $S^{d-1}$ for each $n \in \mathbb{N}$, we have
\[ \|x^{(n)}\|_2^2 = (x_1^{(n)})^2 + \dots + (x_d^{(n)})^2 = 1. \]
Passing to the limit as $n \to \infty$, we obtain
\[ \|L\|_2^2 = L_1^2 + \dots + L_d^2 = \lim_{n\to\infty}\left((x_1^{(n)})^2 + \dots + (x_d^{(n)})^2\right) = 1. \]
Hence $L \in S^{d-1}$. So $S^{d-1}$ is closed. As $S^{d-1}$ is closed and bounded, it follows from Theorem 2.49 that it
is compact.

Solution to Exercise 2.56. Let
\[ K := \left\{1, \frac{1}{2}, \frac{1}{3}, \dots\right\} \cup \{0\}. \]
Since $K \subseteq [0,1]$, clearly $K$ is bounded. Moreover, we have
\[ \mathbb{R} \setminus K = (-\infty, 0) \,\cup\, \left(\bigcup_{n=1}^{\infty}\left(\frac{1}{n+1}, \frac{1}{n}\right)\right) \cup\, (1, \infty), \]
which is open, since it is the union of open intervals. Hence its complement, namely the set $K$, is closed.
Since $K$ is closed and bounded, it is compact.


Solution to Exercise 2.57. Consider the sequence $(nI)_{n\in\mathbb{N}}$ in $GL(m, \mathbb{R})$. Then we have for $n \in \mathbb{N}$ that
\[ \|nI\|_\infty = n, \]
and so the sequence $(nI)_{n\in\mathbb{N}}$ does not have a convergent subsequence. (For if it did, then the subsequence
would be bounded in particular, but no subsequence of $(nI)_{n\in\mathbb{N}}$ can be bounded.) So $GL(m, \mathbb{R})$ is not
compact.
We now show that the set $O(m, \mathbb{R})$ of orthogonal matrices is compact. Let $(A^{(n)})_{n\in\mathbb{N}}$ be a sequence
in $O(m, \mathbb{R})$. Using $(A^{(n)})^\top A^{(n)} = I_m$, we see that
\[ \sum_{i=1}^{m} (a_{ij}^{(n)})^2 = 1 \quad (1 \le j \le m), \]
and so each of the sequences $(a_{ij}^{(n)})_{n\in\mathbb{N}}$ is bounded. So if we look at the sequence $(a_{11}^{(n)})_{n\in\mathbb{N}}$, this has a
convergent subsequence, say $(a_{11}^{(n_k)})_{k\in\mathbb{N}}$. Now look at the sequence $(a_{12}^{(n_k)})_{k\in\mathbb{N}}$. As it is bounded, it has a
convergent subsequence too. Proceeding in this manner, we will get a subsequence of $(A^{(n)})_{n\in\mathbb{N}}$ such that
if we look at any $(i,j)$th entry, the corresponding sequence in $\mathbb{R}$ is convergent, with limit say $L_{ij}$, and so
this subsequence of $(A^{(n)})_{n\in\mathbb{N}}$ is convergent with the limit $L$, where the $(i,j)$th entry of $L$ is $L_{ij}$. From
$(A^{(n)})^\top A^{(n)} = I_m$ ($n \in \mathbb{N}$), it follows that also $L^\top L = I_m$, showing that $L \in O(m, \mathbb{R})$.
Hence every sequence in $O(m, \mathbb{R})$ has a subsequence converging to a limit in $O(m, \mathbb{R})$, that is, $O(m, \mathbb{R})$ is compact.

Solution to Exercise 2.58. We first show that $H$ is closed. Let $(x_n)_{n\in\mathbb{N}}$, with $x_n = (x_1^{(n)}, x_2^{(n)}) \in H$, be a sequence
that converges to $L = (L_1, L_2) \in \mathbb{R}^2$. Since $x_n \in H$, we have that $x_1^{(n)} x_2^{(n)} = 1$ for each $n \in \mathbb{N}$. Taking
limits we obtain that
\[ 1 = \lim_{n\to\infty} x_1^{(n)} x_2^{(n)} = \lim_{n\to\infty} x_1^{(n)} \cdot \lim_{n\to\infty} x_2^{(n)} = L_1 L_2. \]
Since $L_1 L_2 = 1$, it follows that $L \in H$. This shows (by Theorem 2.40) that $H$ is closed.
We next show that $H$ is not bounded. Suppose that there exists $R > 0$ such that $\|x\|_2 \le R$ for
all $x \in H$. Since $(R, 1/R) \in H$, it would then follow that $\|(R, 1/R)\|_2 \le R$. However,
\[ \|(R, 1/R)\|_2 = \sqrt{R^2 + (1/R)^2} > \sqrt{R^2} = R, \]
a contradiction.
Finally, it follows that $H$ is not compact, since otherwise (by Theorem 2.49) $H$ would be closed and
bounded, while we have shown to the contrary that $H$ is unbounded.


Solutions to the exercises from Chapter 3


Solution to Exercise 3.3. We have
\[ \tan^{-1}\frac{1}{2n^2} = \tan^{-1}\frac{(2n+1) - (2n-1)}{1 + (2n+1)(2n-1)} = \tan^{-1}\frac{1}{2n-1} - \tan^{-1}\frac{1}{2n+1}, \]
and so the partial sums telescope to give
\[ \sum_{n=1}^{\infty} \tan^{-1}\frac{1}{2n^2} = \tan^{-1}\frac{1}{1} - \lim_{n\to\infty}\tan^{-1}\frac{1}{2n+1} = \frac{\pi}{4} - 0 = \frac{\pi}{4}. \]
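
The telescoping identity in Exercise 3.3 is easy to sanity-check in floating point. The sketch below compares the directly computed partial sums with the telescoped expression; the function names are illustrative.

```python
import math


def partial_sum(count: int) -> float:
    """Sum of arctan(1 / (2 n^2)) for n = 1, ..., count."""
    return sum(math.atan(1.0 / (2 * n * n)) for n in range(1, count + 1))


def telescoped(count: int) -> float:
    """Telescoped form arctan(1) - arctan(1 / (2 count + 1))."""
    return math.atan(1.0) - math.atan(1.0 / (2 * count + 1))


if __name__ == "__main__":
    for count in (1, 10, 1000):
        print(count, partial_sum(count), telescoped(count), math.pi / 4)
```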

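Solution to Exercise 3.4. We have
\begin{align*}
\frac{1}{1-x} + \frac{1}{1+x} + \frac{2}{1+x^2} + \frac{4}{1+x^4} + \dots + \frac{2^n}{1+x^{2^n}}
&= \frac{2}{1-x^2} + \frac{2}{1+x^2} + \frac{4}{1+x^4} + \dots + \frac{2^n}{1+x^{2^n}} \\
&= \frac{4}{1-x^4} + \frac{4}{1+x^4} + \dots + \frac{2^n}{1+x^{2^n}} \\
&\;\;\vdots \\
&= \frac{2^n}{1-x^{2^n}} + \frac{2^n}{1+x^{2^n}} = \frac{2^{n+1}}{1-x^{2^{n+1}}}.
\end{align*}
Since $x > 1$, we can write $x = 1 + h$ with $h > 0$, and so for integers $k > 2$ we have
\[ x^k = (1+h)^k > \binom{k}{2} h^2 = \frac{k(k-1)}{2} h^2. \]
Thus
\[ 0 < \frac{k}{x^k} < \frac{2}{(k-1)h^2}, \]
and so by the Sandwich Theorem,
\[ \lim_{k\to\infty} \frac{k}{x^k} = 0. \]
Hence for $x > 1$, we obtain
\[ \frac{1}{1-x} + \frac{1}{1+x} + \frac{2}{1+x^2} + \frac{4}{1+x^4} + \dots + \frac{2^n}{1+x^{2^n}} + \dots = \lim_{n\to\infty}\frac{2^{n+1}}{1-x^{2^{n+1}}} = \lim_{n\to\infty}\frac{\dfrac{2^{n+1}}{x^{2^{n+1}}}}{\dfrac{1}{x^{2^{n+1}}} - 1} = \frac{0}{0-1} = 0, \]
and so
\[ \frac{1}{1+x} + \frac{2}{1+x^2} + \frac{4}{1+x^4} + \dots + \frac{2^n}{1+x^{2^n}} + \dots = \frac{1}{x-1}. \]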
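
The collapsing identity used in Exercise 3.4 can be verified exactly with rational arithmetic. This is a small sketch; the sample point $x = 3/2$ is an arbitrary choice and the helper names are illustrative.

```python
from fractions import Fraction


def partial_sum(x: Fraction, n: int) -> Fraction:
    """1/(1+x) + 2/(1+x^2) + 4/(1+x^4) + ... + 2^n/(1+x^(2^n))."""
    return sum(Fraction(2**k) / (1 + x ** (2**k)) for k in range(n + 1))


def closed_form(x: Fraction, n: int) -> Fraction:
    """2^(n+1)/(1 - x^(2^(n+1))) - 1/(1 - x), from the identity in the solution."""
    return Fraction(2 ** (n + 1)) / (1 - x ** (2 ** (n + 1))) - 1 / (1 - x)


if __name__ == "__main__":
    x = Fraction(3, 2)  # arbitrary sample point with x > 1
    for n in range(6):
        print(n, float(partial_sum(x, n)), float(1 / (x - 1)))
```

Because `Fraction` arithmetic is exact, equality of the two expressions for a given $n$ is a genuine check of the algebra, not a floating point coincidence.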

Solution to Exercise 3.7. The sequence $(1/n)_{n\in\mathbb{N}}$ is convergent with limit $0$. The function $\cos : \mathbb{R} \to \mathbb{R}$
is continuous. Since continuous functions preserve convergence of sequences, it follows that $(\cos(1/n))_{n\in\mathbb{N}}$
is convergent with limit $\cos 0 = 1$. By Proposition 3.6, we can conclude that since
\[ \lim_{n\to\infty} \cos\left(\frac{1}{n}\right) = 1 \neq 0, \]
the series $\sum_{n=1}^{\infty} \cos\left(\frac{1}{n}\right)$ does not converge.

Solution to Exercise 3.11. Let $s_n := a_1 + \dots + a_n$ denote the $n$th partial sum of the series for $n \in \mathbb{N}$.
Since the series converges, $(s_n)_{n\in\mathbb{N}}$ converges to some limit $L$. So the sequence $(s_{2n} - s_n)_{n\in\mathbb{N}}$ converges
to $L - L = 0$. We have
\[ s_{2n} - s_n = a_{n+1} + \dots + a_{2n} \ge \underbrace{a_{2n} + \dots + a_{2n}}_{n \text{ times}} = n \cdot a_{2n} \ge 0. \]
Thus by the Sandwich Theorem, we obtain $\lim_{n\to\infty} n \cdot a_{2n} = 0$. Hence also
\[ \lim_{n\to\infty} 2n\, a_{2n} = 0. \tag{6.1} \]


Also, the sequence $(s_{2n+1} - s_{n+1})_{n\in\mathbb{N}}$ converges to $L - L = 0$. We have
\[ s_{2n+1} - s_{n+1} = a_{n+2} + \dots + a_{2n+1} \ge \underbrace{a_{2n+1} + \dots + a_{2n+1}}_{n \text{ times}} = n \cdot a_{2n+1} \ge 0. \]
Thus by the Sandwich Theorem, we obtain $\lim_{n\to\infty} n \cdot a_{2n+1} = 0$. As the sequence $(a_n)_{n\in\mathbb{N}}$ converges to $0$,
we also have that $(a_{2n+1})_{n\in\mathbb{N}}$ converges to $0$. Hence we obtain
\[ \lim_{n\to\infty} (2n+1)\, a_{2n+1} = 2 \lim_{n\to\infty} n\, a_{2n+1} + \lim_{n\to\infty} a_{2n+1} = 2 \cdot 0 + 0 = 0. \tag{6.2} \]
It follows now from (6.1) and (6.2) that $\lim_{n\to\infty} n a_n = 0$.
In order to show that the assumption $a_1 \ge a_2 \ge a_3 \ge \dots$ cannot be dropped, we consider the lacunary
series whose $n^2$th term is $1/n^2$ and all other terms are zero:
\[ a_1 = \frac{1}{1^2}, \quad a_2 = a_3 = 0, \quad a_{2^2} = \frac{1}{2^2}, \quad a_{2^2+1} = \dots = a_{3^2-1} = 0, \quad a_{3^2} = \frac{1}{3^2}, \quad \dots \]
The sequence $(s_n)_{n\in\mathbb{N}}$ of partial sums is clearly increasing since $a_n \ge 0$ for all $n \in \mathbb{N}$. We have
\[ s_{n^2} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \dots + \frac{1}{n^2}, \]
which is the $n$th partial sum of the convergent series
\[ \sum_{n=1}^{\infty} \frac{1}{n^2}. \]
It thus follows from the convergence of $(s_{n^2})_{n\in\mathbb{N}}$ that $(s_{n^2})_{n\in\mathbb{N}}$ is bounded, and so $(s_n)_{n\in\mathbb{N}}$ is bounded
too (after all, given any $m \in \mathbb{N}$, there is a perfect square $N^2$ exceeding it, so that $s_m \le s_{N^2} \le M$, where $M$
is a bound for the sequence $(s_{n^2})_{n\in\mathbb{N}}$). As the sequence $(s_n)_{n\in\mathbb{N}}$ is monotone and bounded, it is convergent
(by the Bolzano-Weierstrass Theorem). Note however that the sequence $(n a_n)_{n\in\mathbb{N}}$ has as a subsequence
$(n^2 a_{n^2})_{n\in\mathbb{N}} = (n^2 \cdot \frac{1}{n^2})_{n\in\mathbb{N}} = (1)_{n\in\mathbb{N}}$, and so it follows that $(n a_n)_{n\in\mathbb{N}}$ does not converge to $0$.

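Solution to Exercise 3.14. Since $-1 \le \sin n \le 1$ for all $n \in \mathbb{N}$, we have
\[ \left|\frac{\sin n}{n^2}\right| \le \frac{1}{n^2} \quad (n \in \mathbb{N}). \]
Thus the sequence $(s_n)_{n\in\mathbb{N}}$ of the partial sums of the series $\sum_{n=1}^{\infty} \left|\frac{\sin n}{n^2}\right|$ is bounded:
\[ s_n \le \frac{1}{1^2} + \frac{1}{2^2} + \dots + \frac{1}{n^2} \le \sum_{n=1}^{\infty} \frac{1}{n^2} < +\infty. \]
Moreover, it is an increasing sequence, and so by the Bolzano-Weierstrass Theorem $(s_n)_{n\in\mathbb{N}}$ is convergent,
that is,
\[ \sum_{n=1}^{\infty} \left|\frac{\sin n}{n^2}\right| < +\infty. \]
It now follows from Theorem 3.12 that $\sum_{n=1}^{\infty} \frac{\sin n}{n^2}$ converges too.
(When we learn the “Comparison Test” (Theorem 3.20) later, we can give a one line justification.)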


Solution to Exercise 3.17. We have

(1) $a_n := \frac{1}{n^s} \ge 0$ for all $n \in \mathbb{N}$.
(2) As $s > 0$, $(n+1)^s > n^s$ and so $\frac{1}{n^s} > \frac{1}{(n+1)^s}$ for all $n \in \mathbb{N}$.
(3) $\lim_{n\to\infty} \frac{1}{n^s} = 0$.
Thus by the Leibniz Alternating Series Theorem $\sum_{n=1}^{\infty} \frac{(-1)^n}{n^s}$ converges.

Solution to Exercise 3.18. We have

(1) $a_n := \frac{\sqrt{n}}{n+1} \ge 0$ for all $n \in \mathbb{N}$.
(2) (We want to know if for all $n \in \mathbb{N}$, $\frac{\sqrt{n}}{n+1} \ge \frac{\sqrt{n+1}}{n+2}$. In order to see this, we square both sides, and
rearrange terms to arrive at the equivalent inequality $n^2 + n \ge 1$, which is clearly true.)
We note that for all $n \in \mathbb{N}$, $n^2 + n \ge 1$, and so
\[ n(n+2)^2 = n^3 + 4n^2 + 4n \ge n^3 + 3n^2 + 3n + 1 = (n+1)^3. \]
Rearranging, we obtain
\[ \frac{n}{(n+1)^2} \ge \frac{n+1}{(n+2)^2}, \]
and finally, taking square roots, we obtain $\frac{\sqrt{n}}{n+1} \ge \frac{\sqrt{n+1}}{n+2}$ for all $n \in \mathbb{N}$.
(3) $\lim_{n\to\infty} \frac{\sqrt{n}}{n+1} = \lim_{n\to\infty} \frac{\frac{1}{\sqrt{n}}}{1 + \frac{1}{n}} = \frac{0}{1+0} = 0$.
Thus by the Leibniz Alternating Series Theorem $\sum_{n=1}^{\infty} (-1)^n \frac{\sqrt{n}}{n+1}$ converges.

Solution to Exercise 3.19. We have

(1) We know that $\sin x \ge 0$ for all $x \in [0, \pi]$. Since $0 \le \frac{1}{n} \le 1 < \pi$ for all $n \in \mathbb{N}$, it follows that
$a_n := \sin\frac{1}{n} \ge 0$ for all $n \in \mathbb{N}$.
(2) The function $x \mapsto \sin x$ defined on $[0, \frac{\pi}{2}]$ is (strictly) increasing. Since $0 < \frac{1}{n+1} < \frac{1}{n} \le 1 < \frac{\pi}{2}$, it
follows that $\sin\frac{1}{n+1} < \sin\frac{1}{n}$ for all $n \in \mathbb{N}$.
(3) Finally, the function $x \mapsto \sin x : \mathbb{R} \to \mathbb{R}$ is continuous, and the sequence $(\frac{1}{n})_{n\in\mathbb{N}}$ is convergent
with limit $0$, and so it follows that
\[ \lim_{n\to\infty} \sin\frac{1}{n} = \sin\left(\lim_{n\to\infty} \frac{1}{n}\right) = \sin 0 = 0. \]
Thus by the Leibniz Alternating Series Theorem $\sum_{n=1}^{\infty} (-1)^n \sin\frac{1}{n}$ converges.

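Solution to Exercise 3.25. We will use the Ratio Test in each part.
(1) We have that $a_n := \frac{n^2}{2^n} \neq 0$ for each $n \in \mathbb{N}$, and
\[ \left|\frac{a_{n+1}}{a_n}\right| = \frac{(n+1)^2}{2^{n+1}} \cdot \frac{2^n}{n^2} = \frac{1}{2}\left(1 + \frac{1}{n}\right)^2, \]
and so $\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} \frac{1}{2}\left(1 + \frac{1}{n}\right)^2 = \frac{1}{2}(1+0)^2 = \frac{1}{2} < 1$. Thus the series
\[ \sum_{n=1}^{\infty} \frac{n^2}{2^n} \]
converges (absolutely).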


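(2) We have that $a_n := \frac{(n!)^2}{(2n)!} \neq 0$ for each $n \in \mathbb{N}$, and
\[ \left|\frac{a_{n+1}}{a_n}\right| = \frac{((n+1)!)^2}{(2n+2)!} \cdot \frac{(2n)!}{(n!)^2} = \frac{(n+1)^2}{(2n+1)(2n+2)} = \frac{n+1}{2(2n+1)} = \frac{1 + \frac{1}{n}}{2(2 + \frac{1}{n})}, \]
and so $\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} \frac{1 + \frac{1}{n}}{2(2 + \frac{1}{n})} = \frac{1+0}{2(2+0)} = \frac{1}{4} < 1$. Thus the series
\[ \sum_{n=1}^{\infty} \frac{(n!)^2}{(2n)!} \]
converges (absolutely).
(3) We have that $a_n := \left(\frac{4}{5}\right)^n n^5 \neq 0$ for each $n \in \mathbb{N}$, and
\[ \left|\frac{a_{n+1}}{a_n}\right| = \left(\frac{4}{5}\right)^{n+1} (n+1)^5 \cdot \left(\frac{5}{4}\right)^n \frac{1}{n^5} = \frac{4}{5}\left(\frac{n+1}{n}\right)^5 = \frac{4}{5}\left(1 + \frac{1}{n}\right)^5, \]
and so $\lim_{n\to\infty} \left|\frac{a_{n+1}}{a_n}\right| = \lim_{n\to\infty} \frac{4}{5}\left(1 + \frac{1}{n}\right)^5 = \frac{4}{5}(1+0)^5 = \frac{4}{5} < 1$. Thus the series
\[ \sum_{n=1}^{\infty} \left(\frac{4}{5}\right)^n n^5 \]
converges (absolutely).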
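
The ratio computed in part (2) of Exercise 3.25 can be confirmed with exact arithmetic. A minimal sketch, with illustrative helper names:

```python
from fractions import Fraction
from math import factorial


def a(n: int) -> Fraction:
    """Term a_n = (n!)^2 / (2n)! of the series in part (2)."""
    return Fraction(factorial(n) ** 2, factorial(2 * n))


def ratio(n: int) -> Fraction:
    """Exact ratio a_{n+1} / a_n used in the Ratio Test."""
    return a(n + 1) / a(n)


if __name__ == "__main__":
    # The ratios decrease towards 1/4, in line with the limit in the solution.
    for n in (1, 10, 100, 1000):
        print(n, float(ratio(n)))
```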

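Solution to Exercise 3.26. We have that $\frac{n}{n^4 + n^2 + 1} \le \frac{n}{n^4} = \frac{1}{n^3}$ for all $n \in \mathbb{N}$. Since $\sum_{n=1}^{\infty} \frac{1}{n^3} < +\infty$,
it follows from the Comparison Test that also $\sum_{n=1}^{\infty} \frac{n}{n^4 + n^2 + 1}$ converges.
We have
\begin{align*}
\sum_{n=1}^{\infty} \frac{n}{n^4 + n^2 + 1} &= \sum_{n=1}^{\infty} \frac{n}{(n^2+1)^2 - n^2} \\
&= \sum_{n=1}^{\infty} \frac{n}{(n^2 + 1 - n)(n^2 + 1 + n)} \\
&= \frac{1}{2}\sum_{n=1}^{\infty} \left(\frac{1}{n^2 + 1 - n} - \frac{1}{n^2 + 1 + n}\right) \\
&= \frac{1}{2}\sum_{n=1}^{\infty} \left(\frac{1}{(n-1)\cdot n + 1} - \frac{1}{n\cdot(n+1) + 1}\right).
\end{align*}
But the partial sum of this last series is the “telescoping sum”
\[ \frac{1}{0\cdot 1 + 1} - \frac{1}{1\cdot 2 + 1} + \frac{1}{1\cdot 2 + 1} - \frac{1}{2\cdot 3 + 1} + \dots + \frac{1}{(n-1)\cdot n + 1} - \frac{1}{n\cdot(n+1) + 1} = 1 - \frac{1}{n\cdot(n+1) + 1}. \]
Consequently, $\sum_{n=1}^{\infty} \frac{n}{n^4 + n^2 + 1} = \frac{1}{2}\lim_{n\to\infty}\left(1 - \frac{1}{n\cdot(n+1)+1}\right) = \frac{1}{2}$.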
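
The telescoping formula for the partial sums in Exercise 3.26 can be checked exactly. This is a short sketch with illustrative names:

```python
from fractions import Fraction


def partial_sum(n: int) -> Fraction:
    """Sum of k / (k^4 + k^2 + 1) for k = 1, ..., n, computed exactly."""
    return sum(Fraction(k, k**4 + k**2 + 1) for k in range(1, n + 1))


def telescoped(n: int) -> Fraction:
    """Closed form (1/2) (1 - 1/(n (n + 1) + 1)) from the telescoping sum."""
    return Fraction(1, 2) * (1 - Fraction(1, n * (n + 1) + 1))


if __name__ == "__main__":
    for n in (1, 2, 10, 100):
        print(n, partial_sum(n), float(partial_sum(n)))
```

The gap between $1/2$ and the $n$th partial sum is exactly $\frac{1}{2(n(n+1)+1)}$, which shrinks like $1/(2n^2)$.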


Solution to Exercise 3.27. Since the series $\sum_{n=1}^{\infty} a_n^4$ converges, we have necessarily that
\[ \lim_{n\to\infty} a_n^4 = 0. \]
But then $\lim_{n\to\infty} |a_n|^4 = 0$ as well, and so it follows from here that
\[ \lim_{n\to\infty} |a_n| = 0. \]


Thus there is an $N \in \mathbb{N}$ such that for all $n > N$, $|a_n| < 1$. But now for $n > N$,
\[ |a_n^5| = a_n^4 |a_n| \le a_n^4, \]
and so it follows from the Comparison Test that $\sum_{n=1}^{\infty} a_n^5$ converges too.

Solution to Exercise 3.28. Clearly numbers $n$ such that $10^{k-1} \le n < 10^k$ have $k$ digits. If we are to
avoid the digit 9 in any of these places, then we see that the first digit can only be $1, 2, 3, \dots, 7, 8$, while
the rest can be any of the digits $0, 1, 2, 3, \dots, 7, 8$. Thus the total number of allowed numbers is $8 \cdot 9^{k-1}$.
Thus in our series $\sum{}' \frac{1}{n^p}$, there are
\begin{align*}
8 &\text{ terms with } \frac{1}{10^{p}} < \frac{1}{n^p} \le \frac{1}{1^p}, \\
8 \cdot 9 &\text{ terms with } \frac{1}{10^{2p}} < \frac{1}{n^p} \le \frac{1}{10^{p}}, \\
8 \cdot 9^2 &\text{ terms with } \frac{1}{10^{3p}} < \frac{1}{n^p} \le \frac{1}{10^{2p}}, \\
&\;\;\vdots \\
8 \cdot 9^{k-1} &\text{ terms with } \frac{1}{10^{kp}} < \frac{1}{n^p} \le \frac{1}{10^{(k-1)p}}, \\
&\;\;\vdots
\end{align*}
Thus
\[ 8\left(\frac{1}{10^p} + \frac{9}{10^{2p}} + \frac{9^2}{10^{3p}} + \dots\right) < \sum{}' \frac{1}{n^p} \le 8\left(\frac{1}{1^p} + \frac{9}{10^p} + \frac{9^2}{10^{2p}} + \frac{9^3}{10^{3p}} + \dots\right). \tag{6.3} \]
We now have the following:
1° $p > \log_{10} 9$. Then $9 < 10^p$ and so $\frac{9}{10^p} < 1$. Thus the geometric series on the right hand side of
(6.3) converges. So by the Comparison Test, $\sum{}' \frac{1}{n^p}$ converges as well.
2° $p \le \log_{10} 9$. Then $9 \ge 10^p$ and so $\frac{9}{10^p} \ge 1$. Thus the geometric series on the left hand side of
(6.3) diverges. So by the Comparison Test, $\sum{}' \frac{1}{n^p}$ diverges as well.
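
The counting step in Exercise 3.28 (there are $8 \cdot 9^{k-1}$ numbers with $k$ digits and no digit 9) can be confirmed by brute force for small $k$. A minimal sketch with an illustrative function name:

```python
def count_without_nine(k: int) -> int:
    """Count the k-digit natural numbers whose decimal expansion avoids the digit 9."""
    return sum(1 for n in range(10 ** (k - 1), 10**k) if "9" not in str(n))


if __name__ == "__main__":
    for k in range(1, 6):
        # Brute-force count next to the formula 8 * 9^(k-1) from the solution.
        print(k, count_without_nine(k), 8 * 9 ** (k - 1))
```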

Solution to Exercise 3.30.


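(1) Let $\epsilon > 0$. Since the series
\[ S := \sum_{n=1}^{\infty} \left(|a_n| + |b_n|\right) \]
is convergent, there exists an $N \in \mathbb{N}$ such that for all $n > N$,
\[ \left|\sum_{k=1}^{n} \left(|a_k| + |b_k|\right) - S\right| = \sum_{k=n+1}^{\infty} \left(|a_k| + |b_k|\right) < \epsilon. \]
As
\[ \left|a_k \cos\left(\frac{2\pi k}{T}x\right) + b_k \sin\left(\frac{2\pi k}{T}x\right)\right| \le |a_k| \left|\cos\left(\frac{2\pi k}{T}x\right)\right| + |b_k| \left|\sin\left(\frac{2\pi k}{T}x\right)\right| \le |a_k| \cdot 1 + |b_k| \cdot 1, \]
we have that
\[ \sum_{k=n+1}^{\infty} \left|a_k \cos\left(\frac{2\pi k}{T}x\right) + b_k \sin\left(\frac{2\pi k}{T}x\right)\right| \le \sum_{k=n+1}^{\infty} \left(|a_k| \cdot 1 + |b_k| \cdot 1\right) < \epsilon. \]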


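Thus if $n > N$, then for all $x \in \mathbb{R}$, we have
\begin{align*}
\left|f(x) - \left(a_0 + \sum_{k=1}^{n} \left(a_k \cos\left(\frac{2\pi k}{T}x\right) + b_k \sin\left(\frac{2\pi k}{T}x\right)\right)\right)\right|
&= \left|\sum_{k=n+1}^{\infty} \left(a_k \cos\left(\frac{2\pi k}{T}x\right) + b_k \sin\left(\frac{2\pi k}{T}x\right)\right)\right| \\
&\le \sum_{k=n+1}^{\infty} \left|a_k \cos\left(\frac{2\pi k}{T}x\right) + b_k \sin\left(\frac{2\pi k}{T}x\right)\right| < \epsilon.
\end{align*}
This shows that the Fourier series converges uniformly to $f$.

(2) The plots are displayed on page 38. From the plots, we do see that the partial sum functions look
increasingly like the square wave, but we also notice the persistent overshoot at the points $-1, 0, 1, 2, 3$,
suggesting nonuniform convergence.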


Solution to Exercise 3.31. It is clear that if the series $\sum_{n=1}^{\infty} a_n$ converges, then so does $\sum_{n=1}^{\infty} a_{n+1}$. Thus
$\sum_{n=1}^{\infty} \frac{a_n + a_{n+1}}{2}$ converges too. By the Arithmetic Mean-Geometric Mean inequality
\[ \frac{a+b}{2} \ge \sqrt{ab} \]
for nonnegative $a, b$ (this is just a rearrangement of the trivial observation that $(\sqrt{a} - \sqrt{b})^2 \ge 0$), it follows
that for all $n \in \mathbb{N}$,
\[ \sqrt{a_n a_{n+1}} \le \frac{a_n + a_{n+1}}{2}, \]
and so the result follows from the Comparison Test.


Solution to Exercise 3.32. (“If” part) Suppose that $\sum_{n=1}^{\infty} \frac{a_n}{1+a_n}$ converges. Then
\[ \lim_{n\to\infty} \frac{a_n}{1+a_n} = 1 - \lim_{n\to\infty} \frac{1}{1+a_n} = 0, \]
and so $\lim_{n\to\infty} \frac{1}{1+a_n} = 1$. Thus there exists an $N \in \mathbb{N}$ such that for all $n > N$,
\[ \frac{1}{1+a_n} > \frac{1}{2}. \]
But this implies that for $n > N$,
\[ \frac{a_n}{1+a_n} \ge \frac{a_n}{2} \quad \text{(note that equality is possible).} \]
By the Comparison Test, it follows that $\sum_{n=1}^{\infty} \frac{a_n}{2}$ converges, and so $\sum_{n=1}^{\infty} a_n$ converges as well.

(“Only if” part) Suppose that $\sum_{n=1}^{\infty} a_n$ converges. Since the $a_n$’s are all positive, we have
\[ \frac{a_n}{1+a_n} < a_n \quad (n \in \mathbb{N}), \]
and so by the Comparison Test, it follows that $\sum_{n=1}^{\infty} \frac{a_n}{1+a_n}$ converges.




Solution to Exercise 3.33. (This was already covered in Exercise 3.29.(1).) If the series $\sum_{n=1}^{\infty} |a_n|$
converges, we have necessarily that $\lim_{n\to\infty} |a_n| = 0$. Thus there is an $N \in \mathbb{N}$ such that for all $n > N$,
$|a_n| < 1$. But now for $n > N$, $|a_n^2| = |a_n||a_n| \le |a_n|$, and so it follows from the Comparison Test that
$\sum_{n=1}^{\infty} a_n^2$ converges too. So every element of $\ell^1$ belongs to $\ell^2$.

The sequence $(1/n)_{n\in\mathbb{N}}$ belongs to $\ell^2$ since
\[ \sum_{n=1}^{\infty} \left(\frac{1}{n}\right)^2 = \sum_{n=1}^{\infty} \frac{1}{n^2} < +\infty, \]
but it does not belong to $\ell^1$ (since the harmonic series diverges). So $\ell^1 \subsetneq \ell^2$.

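Solution to Exercise 3.34. We have
\[ s_n := 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n} \le \underbrace{1 + 1 + 1 + \dots + 1}_{n \text{ times}} = n, \]
and so $\frac{1}{n} \le \frac{1}{s_n}$ for all $n \in \mathbb{N}$. Since the harmonic series diverges, it follows by the Comparison Test that the series $\sum_{n=1}^{\infty} \frac{1}{s_n}$ diverges.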

Solution to Exercise 3.35. We have that
\[ \sqrt[n]{\frac{1}{n^n}} = \frac{1}{n} \le \frac{1}{2} =: r < 1 \]
for all $n \ge 2$. So it follows from the Root Test that the series $\sum_{n=1}^{\infty} \frac{1}{n^n}$ converges.

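Solution to Exercise 3.5. Define the following partial sums:
\begin{align*}
s_n &= \frac{1}{F_0} + \frac{1}{F_2} + \dots + \frac{1}{F_{2n}}, \\
t_n &= \frac{1}{F_1} + \frac{1}{F_3} + \dots + \frac{1}{F_{2n+1}}.
\end{align*}
Then $(s_n)_n$ is the sequence of partial sums of $\sum_{k=0}^{\infty} \frac{1}{F_{2k}}$. Using the hint, we have
\[ \frac{1/F_{2(k+1)}}{1/F_{2k}} = \frac{F_{2k}}{F_{2k+2}} \le \frac{1}{2} \quad \text{for all } k \ge 0, \]
and it follows by the Ratio Test that $\sum_{k=0}^{\infty} \frac{1}{F_{2k}}$ converges. In other words, the sequence $(s_n)_n$ converges.
Also, $(t_n)_n$ is the sequence of partial sums of $\sum_{k=0}^{\infty} \frac{1}{F_{2k+1}}$. Using the Ratio Test again, it similarly follows
that $\sum_{k=0}^{\infty} \frac{1}{F_{2k+1}}$ converges, and therefore, the sequence $(t_n)_n$ converges.
Now consider the sequence of partial sums of $\sum_{k=0}^{\infty} \frac{1}{F_k}$:
\[ \sigma_n = \frac{1}{F_0} + \frac{1}{F_1} + \dots + \frac{1}{F_n}. \]
Note that $s_n + t_n = \sigma_{2n+1}$ and $s_n + t_{n-1} = \sigma_{2n}$. Therefore,
\[ \lim_{n\to\infty} \sigma_{2n+1} = \lim_{n\to\infty} s_n + \lim_{n\to\infty} t_n = \lim_{n\to\infty} s_n + \lim_{n\to\infty} t_{n-1} = \lim_{n\to\infty} \sigma_{2n}. \]
Thus, $(\sigma_{2n})$ and $(\sigma_{2n+1})$ converge to the same limit. It follows that $(\sigma_n)$ converges (see Exercise 2.9(9)).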
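
The ratio bound used in Exercise 3.5 can be checked on an initial segment of the Fibonacci numbers. The sketch below assumes the indexing $F_0 = F_1 = 1$ (an assumption made here because $1/F_0$ appears in the sums, so $F_0$ cannot be $0$); the exact convention of the exercise is not shown in this section.

```python
from fractions import Fraction


def fibonacci(count: int) -> list[int]:
    """First `count` Fibonacci numbers, assuming the indexing F_0 = F_1 = 1."""
    seq = [1, 1]
    while len(seq) < count:
        seq.append(seq[-1] + seq[-2])
    return seq[:count]


F = fibonacci(44)

# Ratios F_{2k} / F_{2k+2} (even-indexed terms) and F_{2k+1} / F_{2k+3} (odd-indexed terms).
even_ratios = [Fraction(F[2 * k], F[2 * k + 2]) for k in range(20)]
odd_ratios = [Fraction(F[2 * k + 1], F[2 * k + 3]) for k in range(20)]

if __name__ == "__main__":
    print([float(r) for r in even_ratios[:5]])
    print([float(r) for r in odd_ratios[:5]])
```

Under this indexing both families of ratios stay at or below $1/2$, which is what the Ratio Test argument needs.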


Solution to Exercise 3.29.

(1) True.
Since the series $\sum_{n=1}^{\infty} |a_n|$ converges, we have necessarily that
\[ \lim_{n\to\infty} |a_n| = 0. \]
Thus there is an $N \in \mathbb{N}$ such that for all $n > N$, $|a_n| < 1$. But now for $n > N$,
\[ |a_n^2| = |a_n||a_n| \le |a_n|, \]
and so it follows from the Comparison Test that $\sum_{n=1}^{\infty} a_n^2$ converges too.
(2) False.
By the Leibniz Alternating Series Theorem, it is easy to check that
\[ \sum_{n=1}^{\infty} (-1)^n \frac{1}{\sqrt{n}} \]
converges. However, $\sum_{n=1}^{\infty} \left((-1)^n \frac{1}{\sqrt{n}}\right)^2$ is the harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$, which is divergent.
(3) False.
The harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$ is divergent, but $\lim_{n\to\infty} \frac{1}{n} = 0$.

(4) True.
The partial sums converge to $0$, and so the series $\sum_{n=1}^{\infty} a_n$ converges to $0$, by definition.
(5) False.
The partial sum
\[ s_n = \log\frac{2}{1} + \log\frac{3}{2} + \dots + \log\frac{n+1}{n} = \log\frac{2 \cdot 3 \cdots (n+1)}{1 \cdot 2 \cdots n} = \log(n+1). \]
Since $(\log(n+1))_{n\in\mathbb{N}}$ diverges, it follows that $\sum_{n=1}^{\infty} \log\frac{n+1}{n}$ diverges.
(6) True.
We know that increasing sequences that are bounded above are convergent. As the sequence of
partial sums of the given series is increasing and bounded above, the series is convergent.
(7) True.
Since $\sum_{n=1}^{\infty} a_n$ converges, $\lim_{n\to\infty} a_n = 0$. This implies that $\left(\frac{1}{a_n}\right)_{n\in\mathbb{N}}$ is unbounded, and hence can’t
be convergent to $0$. So $\sum_{n=1}^{\infty} \frac{1}{a_n}$ diverges.

Solution to Exercise 3.43. Let $L \neq 0$. For all $x$ such that $|x| < \rho = 1/L$, there exist
a $q < 1$ and an $N$ large enough such that
\[ \sqrt[n]{|c_n x^n|} = \sqrt[n]{|c_n|}\,|x| \le q < 1 \]
for all $n > N$. This is because $\sqrt[n]{|c_n|}\,|x| \xrightarrow{n\to\infty} L|x| < 1$. (For example take $q = (L|x| + 1)/2 < 1$.) So by
the Root Test, the power series converges absolutely for such $x$.
If $L = 0$, then for any $x \in \mathbb{R}$, we can guarantee that
\[ \sqrt[n]{|c_n x^n|} = \sqrt[n]{|c_n|}\,|x| \le q < 1 \]
for all $n > N$. This is because $\sqrt[n]{|c_n|}\,|x| \xrightarrow{n\to\infty} 0|x| = 0 < 1$. (So we may arrange for example that
$q = 1/2 < 1$.) So again by the Root Test, the power series converges absolutely for such $x$.
On the other hand, if $L \neq 0$ and $|x| > 1/L$, then there exists an $N$ large enough such that
\[ \sqrt[n]{|c_n x^n|} = \sqrt[n]{|c_n|}\,|x| > 1 \]


for all $n > N$. This is because $\sqrt[n]{|c_n|}\,|x| \xrightarrow{n\to\infty} L|x| > 1$. So again by the Root Test, the power series
diverges.

Solution to Exercise 3.45.

(1) We have
\[ L := \lim_{n\to\infty} \left|\frac{c_{n+1}}{c_n}\right| = \lim_{n\to\infty} \frac{1}{1} = 1, \]
and so the radius of convergence is indeed 1, by Theorem 3.42. When $x = \pm 1$, $x^n = \pm 1$,
and so the sequence $(x^n)_{n\in\mathbb{N}}$ is not convergent with limit $0$, and so the series $\sum_{n=1}^{\infty} x^n$ is not
convergent at these points.
(2) We have
\[ L := \lim_{n\to\infty} \left|\frac{c_{n+1}}{c_n}\right| = \lim_{n\to\infty} \frac{n^2}{(n+1)^2} = 1, \]
and so the radius of convergence is indeed 1, by Theorem 3.42. When $x = \pm 1$, we have that
\[ \left|\frac{x^n}{n^2}\right| = \frac{1}{n^2}, \]
and so it follows from the Comparison Test that the series $\sum_{n=1}^{\infty} \frac{x^n}{n^2}$ is convergent at these points.
(3) We have
\[ L := \lim_{n\to\infty} \left|\frac{c_{n+1}}{c_n}\right| = \lim_{n\to\infty} \frac{n}{n+1} = 1, \]
and so the radius of convergence is indeed 1, by Theorem 3.42. When $x = +1$, we have that the
series $\sum_{n=1}^{\infty} \frac{x^n}{n} = \sum_{n=1}^{\infty} \frac{1}{n}$ diverges. On the other hand, when $x = -1$, $\sum_{n=1}^{\infty} \frac{x^n}{n} = \sum_{n=1}^{\infty} \frac{(-1)^n}{n}$ can be
seen to be convergent by the Leibniz Alternating Series Test.
(4) We have
\[ L := \lim_{n\to\infty} \left|\frac{c_{n+1}}{c_n}\right| = \lim_{n\to\infty} \frac{n}{n+1} = 1, \]
and so the radius of convergence is indeed 1, by Theorem 3.42. When $x = -1$, we have
that the series $\sum_{n=1}^{\infty} (-1)^n \frac{x^n}{n} = \sum_{n=1}^{\infty} \frac{1}{n}$ diverges. On the other hand, when $x = 1$, the series
$\sum_{n=1}^{\infty} (-1)^n \frac{x^n}{n} = \sum_{n=1}^{\infty} \frac{(-1)^n}{n}$ can be seen to be convergent by the Leibniz Alternating Series Test.

Solution to Exercise 3.46.

(1) First we note that
\[ \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^{2n-1}}{(2n-1)!} = x \sum_{n=1}^{\infty} (-1)^{n-1} \frac{x^{2n-2}}{(2n-1)!} = x \sum_{n=1}^{\infty} (-1)^{n-1} \frac{(x^2)^{n-1}}{(2n-1)!}. \]
We will show that the radius of convergence of the auxiliary series
\[ \sum_{n=1}^{\infty} (-1)^{n-1} \frac{y^{n-1}}{(2n-1)!} \]
is infinite, from which it will follow also that the power series in the exercise has infinite radius
of convergence. We have
\[ L := \lim_{n\to\infty} \left|\frac{c_{n+1}}{c_n}\right| = \lim_{n\to\infty} \frac{(2n+1)!}{(2n+3)!} = \lim_{n\to\infty} \frac{1}{(2n+2)(2n+3)} = 0, \]
and so by Theorem 3.42, the radius of convergence is infinite.

(2) This exercise is similar to the previous one. We have
\[ L := \lim_{n\to\infty} \left|\frac{c_{n+1}}{c_n}\right| = \lim_{n\to\infty} \frac{(2n)!}{(2n+2)!} = \lim_{n\to\infty} \frac{1}{(2n+1)(2n+2)} = 0, \]
and so by Theorem 3.42, the radius of convergence is infinite.


Solution to Exercise 3.52. The sequence of partial sums of $e^{0} = I + \frac{1}{1!}0 + \frac{1}{2!}0^2 + \frac{1}{3!}0^3 + \dots$ is the
constant sequence $(I)_{n\in\mathbb{N}}$ and so $e^{0} = I$.
The sequence of partial sums of $e^{I} = I + \frac{1}{1!}I + \frac{1}{2!}I^2 + \frac{1}{3!}I^3 + \cdots$ is the sequence
\[ \left(\left(1 + \frac{1}{1!} + \frac{1}{2!} + \frac{1}{3!} + \dots + \frac{1}{n!}\right) I\right)_{n\in\mathbb{N}} \]
and so we see that $e^{I} = eI$.
Write
\[ \Lambda = \begin{pmatrix} \lambda_1 & 0 & \cdots & 0 \\ 0 & \lambda_2 & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n \end{pmatrix}. \quad \text{Note that for each } k \in \mathbb{N}, \quad \Lambda^k = \begin{pmatrix} \lambda_1^k & 0 & \cdots & 0 \\ 0 & \lambda_2^k & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n^k \end{pmatrix}. \]
Therefore, the $k$-th partial sum is
\begin{align*}
I + \frac{1}{1!}\Lambda + \frac{1}{2!}\Lambda^2 + \frac{1}{3!}\Lambda^3 + \dots + \frac{1}{k!}\Lambda^k
&= \sum_{j=0}^{k} \frac{1}{j!} \begin{pmatrix} \lambda_1^j & 0 & \cdots & 0 \\ 0 & \lambda_2^j & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & \lambda_n^j \end{pmatrix} \\
&= \begin{pmatrix} 1 + \frac{1}{1!}\lambda_1 + \dots + \frac{1}{k!}\lambda_1^k & 0 & \cdots & 0 \\ 0 & 1 + \frac{1}{1!}\lambda_2 + \dots + \frac{1}{k!}\lambda_2^k & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & 1 + \frac{1}{1!}\lambda_n + \dots + \frac{1}{k!}\lambda_n^k \end{pmatrix} \\
&\to \begin{pmatrix} e^{\lambda_1} & 0 & \cdots & 0 \\ 0 & e^{\lambda_2} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & e^{\lambda_n} \end{pmatrix} \quad \text{as } k \to \infty.
\end{align*}
Therefore,
\[ e^{\Lambda} = \begin{pmatrix} e^{\lambda_1} & 0 & \cdots & 0 \\ 0 & e^{\lambda_2} & \ddots & \vdots \\ \vdots & \ddots & \ddots & 0 \\ 0 & \cdots & 0 & e^{\lambda_n} \end{pmatrix}. \]
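
The convergence of the partial sums of $e^{\Lambda}$ for a diagonal matrix can be observed numerically. The sketch below uses plain Python lists; the $2 \times 2$ matrix with diagonal entries $1$ and $-2$ and the cutoff of 40 terms are hypothetical choices made only for illustration.

```python
import math


def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def exp_partial_sum(a, terms):
    """Return the partial sum I + A/1! + A^2/2! + ... + A^terms/terms!."""
    n = len(a)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]  # current term A^k / k!, starting with k = 0
    for k in range(1, terms + 1):
        term = [[entry / k for entry in row] for row in mat_mul(term, a)]
        total = [[total[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    return total


LAMBDA = [[1.0, 0.0], [0.0, -2.0]]  # hypothetical diagonal matrix
APPROX = exp_partial_sum(LAMBDA, 40)

if __name__ == "__main__":
    print(APPROX)
    print(math.exp(1.0), math.exp(-2.0))
```

The off-diagonal entries stay at zero in every partial sum, and the diagonal entries approach $e^{1}$ and $e^{-2}$, matching the formula for $e^{\Lambda}$.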

Solution to Exercise 3.53. Denote the vector space of all sequences that are eventually $0$ by $c_{00}$.
(Check that $c_{00}$ is indeed a vector space.) Define the norm of a sequence $(x^{(n)})_{n\in\mathbb{N}} \in c_{00}$ by
\[ \|(x^{(n)})\|_2 = \sqrt{\sum_{n=1}^{\infty} (x^{(n)})^2}. \]
This makes $c_{00}$ a subspace of $\ell^2$ (see Ex. 1.19) since this formula is the same as that for
the norm of $\ell^2$. You can also independently, without using $\ell^2$, check that this really defines a norm. (The
triangle inequality (N3), namely that $\|(x^{(n)} + y^{(n)})\|_2 \le \|(x^{(n)})\|_2 + \|(y^{(n)})\|_2$, follows from the triangle
inequality in $\mathbb{R}^d$, where $d$ is chosen large enough such that $x^{(n)} = y^{(n)} = 0$ for all $n > d$.)
For each $k \in \mathbb{N}$, let $e_k = (e_k^{(n)})_{n\in\mathbb{N}}$ be the real sequence that is $1$ in the $k$-th position and $0$ everywhere
else. That is, $e_k^{(n)} = 1$ if $n = k$ and $e_k^{(n)} = 0$ if $n \neq k$. (By the way, the set $\{e_1, e_2, e_3, \dots\}$ forms a basis of
$c_{00}$; see the MA103 Lecture Notes.) Consider the series $\sum_{k=1}^{\infty} \frac{1}{k^2} e_k$ in $c_{00}$. Since $\|\frac{1}{k^2} e_k\|_2 = 1/k^2$ and
$\sum_{k=1}^{\infty} \frac{1}{k^2}$ is convergent, by definition $\sum_{k=1}^{\infty} \frac{1}{k^2} e_k$ is absolutely convergent. We next give two proofs that
$\sum_{k=1}^{\infty} \frac{1}{k^2} e_k$ is not convergent in $c_{00}$.
Solution I. One way is to show that the limit is the sequence
\[ L = (L^{(n)}) = (1, 1/2^2, 1/3^2, 1/4^2, 1/5^2, \dots) \in \ell^2 \setminus c_{00}. \]
Indeed, let $s_k$ be the $k$-th partial sum of the series, that is,
\[ s_k = \left(1, \frac{1}{2^2}, \frac{1}{3^2}, \dots, \frac{1}{k^2}, 0, 0, 0, \dots\right). \]
Then
\[ \|s_k - L\|_2 = \left\|\left(0, 0, \dots, 0, \frac{1}{(k+1)^2}, \frac{1}{(k+2)^2}, \dots\right)\right\|_2 = \sqrt{\sum_{n=k+1}^{\infty} \frac{1}{n^4}} \le \sqrt{\sum_{n=k+1}^{\infty} \frac{1}{n^2}}. \]
Note that since $\sum_{n=1}^{\infty} 1/n^2$ is convergent, meaning that the sequence of its partial sums $\sigma_k = \sum_{n=1}^{k} 1/n^2$
converges to $\sum_{n=1}^{\infty} 1/n^2$, it follows that
\[ \lim_{k\to\infty} \left(\sum_{n=1}^{\infty} 1/n^2 - \sigma_k\right) = 0, \]
that is, $\sum_{n=k+1}^{\infty} \frac{1}{n^2} \to 0$ as $k \to \infty$. It follows that $\|s_k - L\|_2 \to 0$ as $k \to \infty$ and
\[ \sum_{k=1}^{\infty} \frac{1}{k^2} e_k = L = (1, 1/2^2, 1/3^2, 1/4^2, 1/5^2, \dots) \in \ell^2. \]
Since limits are unique by Lemma 2.14, it follows that $(s_k)_k$ cannot also have a limit in $c_{00}$.
Solution II. The following is another solution that directly shows by contradiction that there is no
limit in $c_{00}$, without using $\ell^2$. Suppose that $(s_k)_{k\in\mathbb{N}}$ is convergent in $c_{00}$ and let the limit be $c = (c^{(n)}) \in c_{00}$.
Then for some $d \in \mathbb{N}$, $c^{(n)} = 0$ for all $n > d$.
Use $\varepsilon = 1/(d+1)^2$ in the definition of convergence of $(s_k)_k$ to obtain that for some $N \in \mathbb{N}$, $\|s_k - c\|_2 <
1/(d+1)^2$ for all $k > N$.
Now consider the partial sum $s_k$ where $k = \max\{N, d\} + 1$. This is a sequence with $(d+1)$-th term
$s_k^{(d+1)} = 1/(d+1)^2$. Also, $c^{(d+1)} = 0$. Thus, the $(d+1)$-th term of the sequence $s_k - c = (s_k^{(n)} - c^{(n)})_{n\in\mathbb{N}}$
equals $s_k^{(d+1)} - c^{(d+1)} = 1/(d+1)^2$. In particular,
\[ \|s_k - c\|_2 = \sqrt{\sum_{n=1}^{\infty} (s_k^{(n)} - c^{(n)})^2} \ge \sqrt{(s_k^{(d+1)} - c^{(d+1)})^2} = |s_k^{(d+1)} - c^{(d+1)}| = 1/(d+1)^2. \]
However, since $k > N$, we also have $\|s_k - c\|_2 < 1/(d+1)^2$, which is a contradiction.
(Note that a normed space with an absolutely convergent, non-convergent series cannot be complete,
otherwise Theorem 3.49 will be contradicted. We can therefore conclude that $c_{00}$ is not complete.)

Solution to Exercise 3.54. Let $(x_n)_{n\in\mathbb{N}}$ be a Cauchy sequence. Let $n_1 \in \mathbb{N}$ be large enough so that for
all $n, m \ge n_1$, $\|x_n - x_m\| < \frac{1}{2}$. Suppose that $x_{n_1}, \dots, x_{n_k}$ have been constructed. Choose $n_{k+1} > n_k$ such
that whenever $n, m \ge n_{k+1}$, $\|x_n - x_m\| < \frac{1}{2^{k+1}}$. Thus we get a sequence of indices $n_1 < n_2 < n_3 < \dots$
such that whenever $n, m \ge n_k$, $\|x_n - x_m\| < \frac{1}{2^k}$, for all $k$. In particular, $\|x_{n_{k+1}} - x_{n_k}\| < \frac{1}{2^k}$. Now we
take $a_1 = x_{n_1}$, $a_2 = x_{n_2} - x_{n_1}$, $a_3 = x_{n_3} - x_{n_2}$, and so on. Then
\[ \|a_k\| = \|x_{n_k} - x_{n_{k-1}}\| \le \frac{1}{2^{k-1}} \quad (k \ge 2), \]
and so it follows from the Comparison Test that
\[ \sum_{k=1}^{\infty} \|a_k\| < +\infty. \]

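By the hypothesis, $\sum_{k=1}^{\infty} a_k$ converges in $X$, that is, the sequence of partial sums
\[ \left(\sum_{k=1}^{K} a_k\right)_{K\in\mathbb{N}} = \left(x_{n_1} + (x_{n_2} - x_{n_1}) + (x_{n_3} - x_{n_2}) + \dots + (x_{n_K} - x_{n_{K-1}})\right)_{K\in\mathbb{N}} = (x_{n_K})_{K\in\mathbb{N}} \]
is convergent. So the Cauchy sequence $(x_n)_{n\in\mathbb{N}}$ possesses the convergent subsequence $(x_{n_k})_{k\in\mathbb{N}}$, and so it
must be convergent. So $X$ is complete.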


Solutions to the exercises from Chapter 4


Solution to Exercise 4.3. That (1) implies (2′ ) is immediate since (1) implies (2), and (2) implies (2′ ).
Now suppose that (2′ ) holds. Let (xn )n∈N be a sequence contained in I that converges to c ∈ I. Then
the new sequence x1 , c, x2 , c, x3 , c, . . . also converges to c. By the hypothesis (2′ ), it follows that the image
of this sequence under f , namely the sequence f (x1 ), f (c), f (x2 ), f (c), f (x3 ), f (c), . . . must be convergent,
say with limit L. But then each of its subsequences must also be convergent with the same limit L. By
looking at the even indexed terms of this sequence, we obtain that the subsequence f (c), f (c), f (c), . . .
converges to L, and so L = f (c). On the other hand, by looking at the odd indexed terms, we conclude
that the sequence f (x1 ), f (x2 ), f (x3 ), . . . must be also convergent with limit L (=f (c)). Thus (2) holds.
As (2) implies (1), it follows that (1) holds. Hence we have shown that (2′ ) implies (1).

Solution to Exercise 4.6. Let x ≠ 0. We consider two possible cases.

1◦ Let x be rational. Then we can find a sequence (yn )n∈N of irrational numbers that converges
to x, by the result in Exercise 1.40. If f were continuous at x, then (f (yn ))n∈N = (−yn )n∈N
would have to be convergent with limit f (x) = x. But then we obtain x = −x and so x = 0, a
contradiction.
2◦ Let x be irrational. Then we can find a sequence (rn )n∈N of rational numbers that converges to
x, by the result in Exercise 1.39. If f were continuous at x, then (f (rn ))n∈N = (rn )n∈N would
have to be convergent with limit f (x) = −x. But then we obtain x = −x and so x = 0, a
contradiction.
So f is not continuous at x whenever x ≠ 0. We now show that f is continuous at 0. Let ε > 0. Let
δ := ε > 0. Since f (x) = x if x is rational, and −x if x is irrational, it follows that in either case,
|f (x)| = |x|. Thus, whenever x ∈ R and |x − 0| = |x| < δ = ε, we have
|f (x) − f (0)| = |f (x) − 0| = |f (x)| = |x| < δ = ε.
So f is continuous at 0.

Solution to Exercise 4.7. Let x be a rational number. Then f (x) > 0. We can find a sequence (yn )n∈N
of irrational numbers that converges to x, by the result in Exercise 1.40. If f was continuous at x, then
(f (yn ))n∈N = (0)n∈N would have to be convergent with limit f (x) > 0, which is clearly not true. Thus f
is not continuous at x.

Now suppose that x is irrational. Then x is not an integer, and so there exists an integer N such that x ∈ (N, N +1).
Let ε > 0. Suppose that r is a rational number in (N, N + 1) for which f (r) ≥ ε. Let r = n/d, where
n, d are integers without any common divisors and d > 0. Then we have f (r) = 1/d ≥ ε and so d ≤ 1/ε.
So there are just finitely many possibilities for d. Moreover, since N < n/d < N + 1, it follows that
N d < n < (N + 1)d, and so there are only finitely many possibilities for n as well. So there are just finitely
many rational numbers r in (N, N + 1) for which f (r) ≥ ε. Now among these finitely many rational
numbers, let r∗ be one which is the closest to x, and set δ := min{|x − r∗ |/2, |x − N |, |x − (N + 1)|} > 0.
(If there are no such rational numbers, simply omit the first term in the minimum.)
Then whenever z ∈ R and |z − x| < δ, we are guaranteed that z ∈ (N, N + 1), and moreover if z is rational,
then it can’t be one of the rational r’s in (N, N + 1) for which f (r) ≥ ε, and so it must be the case that
f (z) < ε. If z is irrational, then f (z) = 0 by definition of f , and so then too we have f (z) = 0 < ε. So
we have that for all z ∈ R such that |z − x| < δ, |f (z) − f (x)| = |f (z) − 0| = |f (z)| = f (z) < ε. This
completes the proof that f is continuous at every irrational number.
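
The finiteness argument in the second half of this solution can be made concrete. The following Python sketch lists the rationals r in (N, N + 1) with f (r) = 1/d ≥ ε and computes the δ chosen in the proof. The function names are illustrative only and do not come from the notes, and a floating-point number stands in for the irrational x.

```python
from fractions import Fraction
from math import gcd, floor, sqrt

def large_value_rationals(N, eps):
    """Rationals n/d in lowest terms in (N, N+1) with f(n/d) = 1/d >= eps."""
    found = []
    d = 1
    while Fraction(1, d) >= eps:                 # only d <= 1/eps can occur
        for n in range(N * d + 1, (N + 1) * d):  # N*d < n < (N+1)*d
            if gcd(n, d) == 1:                   # keep n/d in lowest terms
                found.append(Fraction(n, d))
        d += 1
    return found

def delta_for(x, eps):
    """The delta from the proof; x is a float standing in for an irrational."""
    N = floor(x)
    candidates = [x - N, (N + 1) - x]
    bad = large_value_rationals(N, eps)
    if bad:                                      # distance to the closest "bad" rational
        candidates.append(min(abs(x - r) for r in bad) / 2)
    return min(candidates)

print(large_value_rationals(0, Fraction(1, 4)))
print(delta_for(sqrt(2), Fraction(1, 4)))
```

Every rational within δ of x then has denominator larger than 1/ε, which is exactly what the proof uses.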

Solution to Exercise 4.8. Imagine a directed line ℓ such that the pancake lies entirely to one side of
the line ℓ (say if we look along the direction of the line, the pancake appears to our right). Now translate
the line to the right parallel to itself till the whole pancake appears to the left of the line. Suppose
that the total distance by which the line is translated is d. At each intermediate distance x ∈ [0, d],
let A(x) denote the area of the part of the pancake that lies to the right of our line. Thus if S is the
total area of the pancake, then A(0) = S and A(d) = 0. As the map A : [0, d] → R is continuous and
A(0) = S ≥ S/2 ≥ 0 = A(d), it follows by the Intermediate Value Theorem that there is a y ∈ [0, d] such
that A(y) = S/2, or in other words, at some position of our line, the pancake is divided into two parts
with equal areas.
As the direction of the line in the above process was inconsequential, given any direction, we can
choose a straight line cut having that direction which divides the pancake into two equal parts.
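
The Intermediate Value Theorem only asserts that a halving position y exists, but for a concrete shape it can be located by bisection. The sketch below assumes a hypothetical pancake, namely the triangle with vertices (0, 0), (1, 0), (0, 1), cut by a vertical line at position t, so that the area to the right of the line is (1 − t)²/2. Neither the shape nor the function names come from the notes.

```python
def bisect_level(A, target, lo, hi, tol=1e-12):
    """Find y in [lo, hi] with A(y) close to target, assuming A is continuous
    and A(lo) >= target >= A(hi), as in the Intermediate Value Theorem."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if A(mid) >= target:
            lo = mid        # invariant: A(lo) >= target
        else:
            hi = mid        # invariant: A(hi) <= target
    return (lo + hi) / 2

def area_right_of(t):
    """Area of the triangle (0,0), (1,0), (0,1) lying to the right of the line x = t."""
    return (1 - t) ** 2 / 2

S = area_right_of(0.0)                          # total area of the pancake
y = bisect_level(area_right_of, S / 2, 0.0, 1.0)
print(y, area_right_of(y), S / 2)
```

For this triangle the halving position can also be computed by hand as 1 − 1/√2, which gives a check on the numerical answer.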


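Solution to Exercise 4.10. For nonzero $x \in \mathbb{R}^2$, we have
\[ \left|\frac{x_1 x_2}{\sqrt{x_1^2 + x_2^2}}\right| = \frac{|x_1||x_2|}{\|x\|_2} \le \frac{\|x\|_2 \|x\|_2}{\|x\|_2} = \|x\|_2. \]
Let $\epsilon > 0$. Set $\delta := \epsilon > 0$. Then if $0 \ne x \in \mathbb{R}^2$ and $\|x - 0\|_2 = \|x\|_2 < \delta = \epsilon$, we have
\[ |f(x) - f(0)| = |f(x) - 0| = |f(x)| = \left|\frac{x_1 x_2}{\sqrt{x_1^2 + x_2^2}}\right| \le \|x\|_2 < \delta = \epsilon. \]
On the other hand, if $x = 0$, then $|f(x) - f(0)| = 0 < \epsilon$, trivially.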
Thus f is continuous at 0.

Solution to Exercise 4.11.


(1) Suppose f is continuous at 0. The sequence (1/n)n∈N is convergent with limit 0. Thus
(f (1/n))n∈N = (1)n∈N must be convergent with limit f (0) = 0, which is clearly false. Hence f
is not continuous at 0.
(2) Let ε > 0. Take δ = 1/2 > 0. Then whenever x ∈ R with dX (x, 0) < δ = 1/2, we have dX (x, 0) = 0
and x = 0, which gives f (x) = f (0) = 0. Thus whenever x ∈ R with dX (x, 0) < δ = 1/2,
|f (x) − f (0)| = |f (0) − f (0)| = 0 < ε. This shows that f is continuous at 0.

Solution to Exercise 4.12. If x, y ∈ X we have by the triangle inequality that


d(x, p) ≤ d(x, y) + d(y, p),
and so d(x, p) − d(y, p) ≤ d(x, y). Similarly,
d(y, p) ≤ d(y, x) + d(x, p) = d(x, y) + d(x, p),
and so d(y, p) − d(x, p) ≤ d(x, y). Consequently, |d(x, p) − d(y, p)| ≤ d(x, y). Let ε > 0, and set δ := ε > 0.
Then whenever y ∈ X and d(x, y) < δ = ε, we have
|d(x, p) − d(y, p)| ≤ d(x, y) < δ = ε.
This shows that the map f is continuous at x ∈ X. As the choice of x ∈ X was arbitrary, it follows that
f is continuous on X.

Solution to Exercise 4.13. Let (x0 , y0 ) ∈ R2 . We have for all (x, y) ∈ R2 that
|x + y − (x0 + y0 )| = |(x − x0 ) + (y − y0 )| ≤ |x − x0 | + |y − y0 |
≤ k(x, y) − (x0 , y0 )k2 + k(x, y) − (x0 , y0 )k2 = 2k(x, y) − (x0 , y0 )k2 .
Thus, given ε > 0, we can set δ := ε/2 > 0, and then for all (x, y) ∈ R2 satisfying the inequality
k(x, y) − (x0 , y0 )k2 < δ, we have
|x + y − (x0 + y0 )| ≤ 2k(x, y) − (x0 , y0 )k2 < 2δ = ε.
As the choice of (x0 , y0 ) ∈ R2 was arbitrary, it follows that addition is continuous.

Let $(x_0, y_0) \in \mathbb{R}^2$. We have for all $(x, y) \in \mathbb{R}^2$
\[ \begin{aligned} |xy - x_0 y_0| &= |xy - x_0 y + x_0 y - x_0 y_0| \le |y||x - x_0| + |x_0||y - y_0| \\ &\le |y|\,\|(x, y) - (x_0, y_0)\|_2 + |x_0|\,\|(x, y) - (x_0, y_0)\|_2 \\ &= (|y| + |x_0|)\,\|(x, y) - (x_0, y_0)\|_2. \end{aligned} \]
But if $(x, y)$ is close to $(x_0, y_0)$, then $|y|$ is close to $|y_0|$. So the continuity will follow from the above
inequality. We give the precise details below. Let $\epsilon > 0$. Set
\[ \delta = \min\left\{1, \frac{\epsilon}{1 + |y_0| + |x_0|}\right\} > 0. \]
Then whenever $\|(x, y) - (x_0, y_0)\|_2 < \delta$, we have first of all that
\[ |y| = |y_0 + (y - y_0)| \le |y_0| + |y - y_0| \le |y_0| + \|(x, y) - (x_0, y_0)\|_2 < |y_0| + \delta \le |y_0| + 1. \]
So if $\|(x, y) - (x_0, y_0)\|_2 < \delta$, then
\[ |xy - x_0 y_0| \le (|y| + |x_0|)\,\|(x, y) - (x_0, y_0)\|_2 < (|y_0| + 1 + |x_0|)\,\delta \le (|y_0| + 1 + |x_0|)\,\frac{\epsilon}{1 + |y_0| + |x_0|} = \epsilon. \]
As the choice of (x0 , y0 ) ∈ R2 was arbitrary, it follows that multiplication is continuous.
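
As a sanity check on the choice of δ for multiplication, the following Python sketch samples points within distance δ of a base point and records the largest value of |xy − x0 y0| it sees. The base point (3, −2), the tolerance and the helper name are arbitrary choices made for this illustration, not part of the notes, and random sampling can only support the estimate, not prove it.

```python
import math
import random

def delta_for_product(x0, y0, eps):
    """delta = min{1, eps / (1 + |y0| + |x0|)}, as chosen in the solution."""
    return min(1.0, eps / (1.0 + abs(y0) + abs(x0)))

random.seed(0)
x0, y0, eps = 3.0, -2.0, 1e-3          # hypothetical base point and tolerance
delta = delta_for_product(x0, y0, eps)

worst = 0.0
for _ in range(10000):
    # random point at Euclidean distance strictly less than delta from (x0, y0)
    r = 0.999 * delta * random.random()
    theta = 2 * math.pi * random.random()
    x, y = x0 + r * math.cos(theta), y0 + r * math.sin(theta)
    worst = max(worst, abs(x * y - x0 * y0))

print(delta, worst, worst < eps)
```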


Solution to Exercise 4.14. Let f ∈ C[0, 1]. If g ∈ C[0, 1], then we have for x ∈ [0, 1] that
|(Sg)(x) − (Sf )(x)| = |(g(x))2 − (f (x))2 | = |g(x) + f (x)||g(x) − f (x)|
≤ (|g(x)| + |f (x)|)kg − f k∞
≤ (kgk∞ + kf k∞ )kg − f k∞ .
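Thus $\|Sg - Sf\|_\infty \le (\|g\|_\infty + \|f\|_\infty)\|g - f\|_\infty$. If $\|g - f\|_\infty < \delta$, then we have
\[ \|g\|_\infty \le \|g - f\|_\infty + \|f\|_\infty < \delta + \|f\|_\infty, \]
and so the above yields that $\|Sg - Sf\|_\infty < (\delta + 2\|f\|_\infty)\delta$. Let $\epsilon > 0$. Define
\[ \delta = \min\left\{\frac{\epsilon}{1 + 2\|f\|_\infty}, 1\right\}. \]
Then whenever $g \in C[0, 1]$ and $\|g - f\|_\infty < \delta$, we have
\[ \|Sg - Sf\|_\infty < (\delta + 2\|f\|_\infty)\delta \le (1 + 2\|f\|_\infty)\delta \le \epsilon. \]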
Hence S is continuous at f . As the choice of f ∈ C[0, 1] was arbitrary, it follows that S is continuous on
C[0, 1].

Solution to Exercise 4.16.


(1) Let x, y ∈ C. Define γ : [0, 1] → C by γ(t) = (1 − t)x + ty, t ∈ [0, 1]. Then γ is well-
defined since γ(0) = x ∈ C, γ(1) = y ∈ C and for all t ∈ (0, 1), C being convex, we have
γ(t) = (1 − t)x + ty ∈ C as well.
Also, if s, t ∈ [0, 1], we have γ(t) = x + t(y − x) and γ(s) = x + s(y − x), and so
kγ(t) − γ(s)k2 = kx + t(y − x) − (x + s(y − x))k2 = k(t − s)(y − x)k2
= |t − s|ky − xk2 .
From here it follows that γ is continuous. Hence γ is the desired path joining x to y. Conse-
quently, C is path connected.
(2) We verify below that R is reflexive, symmetric and transitive.
(R1) (Reflexivity) Let x ∈ S. Define γ : [0, 1] → S by γ(t) = x (t ∈ [0, 1]). Clearly γ is
continuous and γ(0) = γ(1) = x. So xRx.
(R2) (Symmetry) Let x, y ∈ S and suppose that xRy. Then there exists a continuous path
γ : [0, 1] → S such that γ(0) = x and γ(1) = y. Define the new map µ : [0, 1] → S by
µ(t) := γ(1 − t) (t ∈ [0, 1]). Then µ(0) = γ(1) = y, µ(1) = γ(0) = x, and since µ is the
composition of two continuous maps, namely the continuous map γ with the continuous
map t 7→ 1 − t : [0, 1] → [0, 1], it follows that µ is continuous too. Hence yRx as well.
(R3) (Transitivity) Let x, y, z ∈ S be such that xRy and yRz. We would like to show that xRz.
As xRy and yRz, there exist continuous maps γ1 , γ2 : [0, 1] → S such that γ1 (0) = x,
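$\gamma_1(1) = y$, $\gamma_2(0) = y$ and $\gamma_2(1) = z$. Define the map $\gamma : [0, 1] \to S$ as follows:
\[ \gamma(t) = \begin{cases} \gamma_1(2t) & \text{if } t \in [0, \tfrac{1}{2}), \\ \gamma_2(2t - 1) & \text{if } t \in [\tfrac{1}{2}, 1]. \end{cases} \]
Then $\gamma(0) = \gamma_1(0) = x$ and $\gamma(1) = \gamma_2(1) = z$. Also, it is not hard to check that $\gamma$ is
continuous: the two pieces are continuous, and they match at $t = \tfrac{1}{2}$ because $\gamma_1(1) = y = \gamma_2(0)$. Hence xRz.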
Thus R is an equivalence relation.
(3) The circle $S^1 := \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1\}$ is path connected. Indeed, if $a, b \in S^1$, then
there are real numbers $\theta_a, \theta_b \in \mathbb{R}$ such that $a = (\cos\theta_a, \sin\theta_a)$ and $b = (\cos\theta_b, \sin\theta_b)$. Define
the map $\theta : [0, 1] \to \mathbb{R}$ by $\theta(t) = (1 - t)\theta_a + t\theta_b$, $t \in [0, 1]$. Now define $\gamma : [0, 1] \to S^1$ by
$\gamma(t) = (\cos(\theta(t)), \sin(\theta(t)))$ $(t \in [0, 1])$. Then $\gamma$ is continuous, $\gamma(0) = a$ and $\gamma(1) = b$. Thus $S^1$ is
path connected. (Alternately, using the fact that R is an equivalence relation, it is easy to see
that it is enough to check the path connectedness of the two arcs
\[ A_+ := \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1,\ y \ge 0\}, \qquad A_- := \{(x, y) \in \mathbb{R}^2 : x^2 + y^2 = 1,\ y \le 0\} \]
lying in the upper half-plane and lower half-plane, respectively. If $a = (x_a, y_a)$ and $b = (x_b, y_b)$
belong to $A_+$, then define $x : [0, 1] \to \mathbb{R}$ by $x(t) = (1 - t)x_a + t x_b$ $(t \in [0, 1])$. Define $\gamma : [0, 1] \to \mathbb{R}^2$
as follows:
\[ \gamma(t) = \left(x(t), \sqrt{1 - (x(t))^2}\right) \quad (t \in [0, 1]). \]


Then $\gamma$ is continuous, $\gamma(0) = a$ and $\gamma(1) = b$. This shows that $A_+$ is path connected. For
showing that $A_-$ is path connected, we proceed similarly, except that we use
\[ \gamma(t) = \left(x(t), -\sqrt{1 - (x(t))^2}\right) \quad (t \in [0, 1]) \]

as our path.)
The set {(x, y) ∈ R2 : xy = 0} is just the union of the two straight lines, namely the
horizontal “x-axis” and the vertical “y-axis” when we identify R2 with the plane in the usual
manner. It is now pictorially evident that this set is path connected. Indeed, if x, y belong
to this set then we have three cases. If they both lie on the x-axis we just join them by the
horizontal straight line segment which is contained entirely in the x-axis. If they both lie on
the y-axis, we join them by the vertical line segment which is contained entirely in the y axis.
Finally, if one lies on the x axis and the other on the y-axis, then we join the point on the x-axis
to the origin by a horizontal straight line segment, and then join the origin to the point on the
y-axis with a vertical line segment. Figure 12 illustrates this.

Figure 12. Paths in the set {(x, y) ∈ R2 : xy = 0}.

The set H := {(x, y) ∈ R2 : xy = 1} is not path connected, and we prove this by
contradiction. Clearly the points (1, 1) and (−1, −1) both belong to H. Suppose that there is a
continuous path γ : [0, 1] → H such that γ(0) = (−1, −1) and γ(1) = (1, 1). Composing γ with
the linear transformation which is projection onto the x-axis, namely the map

x(t) := the first component of γ(t) (t ∈ [0, 1]),

we have that x : [0, 1] → R is continuous and x(0) = −1 and x(1) = 1. As we have that
x(0) = −1 < 0 < 1 = x(1), it follows from the Intermediate Value Theorem that there is a
t0 ∈ [0, 1] such that x(t0 ) = 0. But this means that the first component of γ(t0 ) is 0 and so the
product of the two components of γ(t0 ) is 0. Thus γ(t0 ) ∉ H, a contradiction. Consequently,
H is not path connected.
In fact, H has two path components, namely,

H+ := {(x, y) ∈ R2 : xy = 1 and x > 0},


H− := {(x, y) ∈ R2 : xy = 1 and x < 0}.

It is easy to see that H+ and H− are path connected. We just show it for H+ . Suppose
that a = (xa , ya ) and b = (xb , yb ) are points in H+ . Let the map x : [0, 1] → R be defined by
$x(t) = (1 - t)x_a + t x_b$ $(t \in [0, 1])$. Then $x(t) > 0$ for all $t \in [0, 1]$. Define $\gamma : [0, 1] \to \mathbb{R}^2$ as
follows:
\[ \gamma(t) = \left(x(t), \frac{1}{x(t)}\right) \quad (t \in [0, 1]). \]

Then γ is continuous, γ(0) = a and γ(1) = b.


Figure 13. Path components H+ and H− of the set {(x, y) ∈ R2 : xy = 1}.
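
The symmetry and transitivity steps in part (2) of Exercise 4.16 amount to two operations on paths, reversal and concatenation. A minimal Python sketch follows, with paths represented as functions on [0, 1]. The helper names and the example in the set {(x, y) : xy = 0} are illustrative assumptions, not part of the notes.

```python
def reverse(gamma):
    """The path t -> gamma(1 - t), used for symmetry."""
    return lambda t: gamma(1 - t)

def concatenate(gamma1, gamma2):
    """Run gamma1 on [0, 1/2) and gamma2 on [1/2, 1], used for transitivity.
    Assumes gamma1(1) == gamma2(0), so that the result is continuous."""
    def gamma(t):
        return gamma1(2 * t) if t < 0.5 else gamma2(2 * t - 1)
    return gamma

# Hypothetical example: join (2, 0) on the x-axis to (0, 3) on the y-axis via the origin.
to_origin = lambda t: ((1 - t) * 2.0, 0.0)
from_origin = lambda t: (0.0, t * 3.0)
path = concatenate(to_origin, from_origin)

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, path(t))
```

Every point printed has at least one coordinate equal to zero, so the concatenated path stays inside the union of the two axes, as in Figure 12.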

Solution to Exercise 4.17. We have

f −1 ({−1, 1}) = {nπ : n ∈ Z},

f −1 ({1}) = {2nπ : n ∈ Z},

f −1 ([−1, 1]) = f −1 (R) = R,


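\[ f^{-1}\Bigl(\bigl(-\tfrac{1}{2}, \tfrac{1}{2}\bigr)\Bigr) = \bigcup_{n\in\mathbb{Z}} \Bigl((2n+1)\frac{\pi}{2} - \frac{\pi}{6},\ (2n+1)\frac{\pi}{2} + \frac{\pi}{6}\Bigr) = \bigcup_{n\in\mathbb{Z}} \Bigl(n\pi + \frac{\pi}{3},\ n\pi + \frac{2\pi}{3}\Bigr). \]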

Solution to Exercise 4.18. Since the cosine function cos is periodic with period 2π (that is, there holds
f (x) = f (x + 2π) for all x ∈ R), it follows that f (R) = f ([0, 2π]) = f ([δ, δ + 2π]) = [−1, 1].

Solution to Exercise 4.21. (“If” part) Suppose that for every closed F in Y , f −1 (F ) is closed in X.
Now let V be open in Y . Then Y \ V is closed in Y . Thus
f −1 (Y \ V ) = f −1 (Y ) \ f −1 (V ) = X \ f −1 (V )
is closed in X. Hence f −1 (V ) = X \ (X \ f −1 (V )) is open in X. So for every open V in Y , f −1 (V ) is
open in X. Consequently, by Theorem 4.19, f is continuous on X.

(“Only if” part) Suppose that f is continuous. Let F be closed in Y . Then Y \ F is open in Y . Hence
f −1 (Y \ F ) = f −1 (Y ) \ f −1 (F ) = X \ f −1 (F ) is open in X. Consequently, we have that
f −1 (F ) = X \ (X \ f −1 (F )) is closed in X. So for every closed F in Y , f −1 (F ) is closed in X.

Solution to Exercise 4.23. Let x ∈ (f ◦ g)−1 (W ). Then (f ◦ g)(x) ∈ W , that is, f (g(x)) ∈ W . So
g(x) ∈ f −1 (W ). Hence x ∈ g −1 (f −1 (W )). So we have the inclusion (f ◦ g)−1 (W ) ⊆ g −1 (f −1 (W )).
Now let x ∈ g −1 (f −1 (W )). This means that g(x) ∈ f −1 (W ). Thus (f ◦ g)(x) = f (g(x)) ∈ W .
Hence x ∈ (f ◦ g)−1 (W ). So we also have the inclusion g −1 (f −1 (W )) ⊆ (f ◦ g)−1 (W ). Consequently,
(f ◦ g)−1 (W ) = g −1 (f −1 (W )).

Solution to Exercise 4.24.


(1) True.
Since (−∞, 1) is open in R and f : X → R is continuous, it follows that
{x ∈ X : f (x) < 1} = f −1 (−∞, 1)


is open in X by Theorem 4.19.


(2) True.
Since (1, ∞) is open in R and f : X → R is continuous, it follows that
{x ∈ X : f (x) > 1} = f −1 (1, ∞)
is open in X by Theorem 4.19.
(3) False.
Take for example X = R with the usual metric, and consider the continuous function f (x) = x
(x ∈ R). Then {x ∈ X : f (x) = 1} = {1}, which is not open in R.
(4) True.
(−∞, 1] is closed in R because its complement is (1, ∞), which is open in R. Since f : X → R is
continuous, it follows that {x ∈ X : f (x) ≤ 1} = f −1 (−∞, 1] is closed in X by Corollary 4.20.
(5) True.
Since {1} is closed in R and since f : X → R is continuous, it follows that
{x ∈ X : f (x) = 1} = f −1 {1}
is closed in X by Corollary 4.20.
(6) True.
Each of the sets f −1 {1} and f −1 {2} is closed (as in part (5)), and so their finite union, namely
{x ∈ X : f (x) = 1 or 2}
is closed as well.
(7) False.
Take for example X = R with the usual metric, and consider the continuous function f (x) = 1
(x ∈ R). Then {x ∈ X : f (x) = 1} = R, which is not bounded, and hence can’t be compact.

Solution to Exercise 4.26. We show by induction that f (n) = nf (1) for all natural numbers n. For
n = 1, we have f (n) = f (1) = 1 · f (1) = nf (1), and so the claim is true when n = 1. If for some n ∈ N
there holds f (n) = nf (1), then we have f (n + 1) = f (n) + f (1) = nf (1) + f (1) = (n + 1)f (1). This
completes the induction step, and so the result holds for all natural numbers.
Also f (0) = f (0 + 0) = f (0) + f (0) shows that f (0) = 0 = 0 · f (1). So f (n) = nf (1) for all nonnegative
integers. Let m be a negative integer. Then
0 = f (0) = f (m + (−m)) = f (m) + f (−m) = f (m) + (−m)f (1),
and so f (m) = −(−m)f (1) = mf (1). So we have that f (n) = nf (1) for all integers n.
Now every rational number $r$ can be expressed as $r = n/d$ for some integers $n, d$ with $d > 0$. Then
\[ n f(1) = f(n) = f(d \cdot (n/d)) = \underbrace{f(n/d) + \cdots + f(n/d)}_{d \text{ times}} = d \cdot f(n/d). \]
Consequently, $f(n/d) = (n f(1))/d$, that is, $f(r) = r f(1)$.

Let $x \in \mathbb{R}$. Then we can find a sequence $(r_n)_{n\in\mathbb{N}}$ of rational numbers which converges to $x$. Using
the continuity of $f$ we obtain
\[ f(x) = f\Bigl(\lim_{n\to\infty} r_n\Bigr) = \lim_{n\to\infty} f(r_n) = \lim_{n\to\infty} r_n f(1) = x f(1). \]
Consequently, $f(x) = x f(1)$ for all $x \in \mathbb{R}$.

Solution to Exercise 4.27. First note that if we substitute x = 0 into the given identity, we obtain
f (0) + f (0) = 0, hence f (0) = 0.
Clearly f (2x) = −f (x) for all x ∈ R, and so
$f(x) = -f(x/2) = +f(x/4) = -f(x/8) = +f(x/16) = \cdots = +f(x/4^n)$
for all $x \in \mathbb{R}$ and $n \in \mathbb{N}$. That is, for any $x \in \mathbb{R}$ the sequence
$f(x), f(x/4), f(x/16), \dots, f(x/4^n), \dots$
is a constant sequence with limit $f(x)$. Since the sequence $(x/4^n)_{n\in\mathbb{N}}$ converges to 0, and $f$ is continuous
at 0, we also obtain that the above sequence converges to f (0) = 0. Since limits are unique, we obtain
that f (x) = 0 for all x ∈ R.


So if f is continuous and it satisfies the given identity then it must be the constant function 0.
Conversely, note that the constant function 0 is indeed continuous and also satisfies the identity
f (2x) + f (x) = 0 for all x ∈ R.
   
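Solution to Exercise 4.28. Let $\left(\begin{bmatrix} a_n \\ b_n \end{bmatrix}\right)_{n\in\mathbb{N}}$ be a sequence in $\mathbb{R}^2$ that converges to $\begin{bmatrix} a \\ b \end{bmatrix}$. Then we
know that $(a_n)_{n\in\mathbb{N}}$ converges to $a$ in $\mathbb{R}$, and $(b_n)_{n\in\mathbb{N}}$ converges to $b$ in $\mathbb{R}$. Hence $(a_n b_n)_{n\in\mathbb{N}}$ converges to
$ab$. In other words, $\left(f\begin{bmatrix} a_n \\ b_n \end{bmatrix}\right)_{n\in\mathbb{N}} = (a_n b_n)_{n\in\mathbb{N}}$ converges to $f\begin{bmatrix} a \\ b \end{bmatrix} = ab$. So by Theorem 4.25,
it follows that $f$ is continuous.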

Solution to Exercise 4.29. Clearly the sequence given by (1, (−1)n /n), n ∈ N, is convergent with
limit (1, 0). So if f −1 were continuous, then (f −1 ((1, (−1)n /n)))n∈N should converge to f −1 (1, 0) = 0.
However, if we look at the odd indexed terms of (f −1 ((1, (−1)n /n)))n∈N , then we obtain the subsequence
(f −1 ((1, −1/n)))n odd , which clearly converges to 1 ≠ 0 = f −1 (1, 0). Hence f −1 is not continuous at the
point (1, 0).

Solution to Exercise 4.30. See Figure 14. Look at a spherical cap Ox around the point x on the
sphere. Imagine a light source at the center O of the (transparent) sphere, which casts a shadow U of the
(coloured) spherical cap on the plane R2 (screen) perpendicular to the line Ox. It is clear then that the
sets Ox and U are homeomorphic. This shows that S2 is a manifold of dimension 2.

Figure 14. S2 is a 2-dimensional manifold.


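Solution to Exercise 4.31. Consider the two sequences $(v_n)_{n\in\mathbb{N}}$ and $(w_n)_{n\in\mathbb{N}}$ defined by
\[ v_n := \begin{bmatrix} \frac{1}{n} \\ 0 \end{bmatrix} \quad\text{and}\quad w_n := \begin{bmatrix} 0 \\ \frac{1}{n} \end{bmatrix} \quad (n \in \mathbb{N}), \]
both of which are convergent to 0. But $(f(v_n))_{n\in\mathbb{N}} = (1)_{n\in\mathbb{N}}$, while $(f(w_n))_{n\in\mathbb{N}} = (-1)_{n\in\mathbb{N}}$, and clearly
these cannot have the same limit $f(0) = c$, no matter what $c$ is. Hence it follows from Theorem 4.25 that
$f$ is not continuous at 0.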

Solution to Exercise 4.32. The determinant of $M$ is given by the sum of expressions of the type
\[ \pm m_{1p(1)} m_{2p(2)} m_{3p(3)} \cdots m_{np(n)}, \]
where $p : \{1, 2, 3, \dots, n\} \to \{1, 2, 3, \dots, n\}$ is a permutation. Since each of the maps
\[ M \mapsto m_{1p(1)} m_{2p(2)} m_{3p(3)} \cdots m_{np(n)} \]
is easily seen to be continuous using the characterization of continuous functions provided by Theorem 4.25,
it follows that their linear combination is also continuous.
$\{0\}$ is closed in $\mathbb{R}$, and so its inverse image under the continuous map $\det : \mathbb{R}^{n\times n} \to \mathbb{R}$ is also closed.
In other words, $\det^{-1}\{0\} = \{M \in \mathbb{R}^{n\times n} : \det M = 0\}$ is closed. Thus its complement, namely the set
$\{M \in \mathbb{R}^{n\times n} : \det M \ne 0\}$, is open. But this is precisely the set of invertible matrices, since $M \in \mathbb{R}^{n\times n}$ is
invertible if and only if $\det M \ne 0$.
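
The permutation expansion used above can be written out directly. The following Python sketch computes the determinant as the signed sum of the products m_{1p(1)} · · · m_{np(n)} over all permutations p. It is meant only to make the formula concrete (it takes n! steps), and the function names are chosen for this illustration.

```python
from itertools import permutations

def sign(p):
    """Sign of a permutation p of (0, ..., n-1), computed from its inversion count."""
    inversions = sum(1 for i in range(len(p))
                       for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

def det(M):
    """Determinant of a square matrix (list of rows) via the permutation expansion."""
    n = len(M)
    total = 0
    for p in permutations(range(n)):
        term = sign(p)
        for i in range(n):
            term *= M[i][p[i]]      # one entry from each row, columns given by p
        total += term
    return total

print(det([[1, 2], [3, 4]]))
print(det([[2, 0, 0], [0, 3, 0], [0, 0, 4]]))
```

Each term is a product of n coordinate functions of M, which is why each term, and hence the whole sum, depends continuously on M.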


Solution to Exercise 4.33. Let $X = (0, 1]$, $Y = \mathbb{R}$ with the usual Euclidean metrics. Define $f : (0, 1] \to \mathbb{R}$ by
\[ f(x) = \frac{1}{x} \quad (x \in (0, 1]). \]
Then $f$ is continuous. Take $(x_n)_{n\in\mathbb{N}}$ to be the Cauchy sequence $\left(\frac{1}{n}\right)_{n\in\mathbb{N}}$. Then $f(x_n) = n$, and we have
$|f(x_n) - f(x_m)| = |n - m| \ge 1$ for all $n \ne m$, showing that the sequence $(f(x_n))_{n\in\mathbb{N}}$ is not Cauchy.

Solution to Exercise 4.34. If U is open in X = R with the Euclidean metric, then f (U ) = U is also
open in Y = R with the discrete metric d, since all subsets of (R, d) are open. So the map f is open.
Let x0 ∈ R and let ε = 1/2 > 0. If f is continuous at x0 , then there exists a δ > 0 such that whenever
|x − x0 | < δ, we have d(f (x), f (x0 )) < ε = 1/2. Now take x = x0 + δ/2. Then |x − x0 | = δ/2 < δ, but
f (x) = x = x0 + δ/2 ≠ x0 = f (x0 ) and so d(f (x), f (x0 )) = 1 ≮ 1/2 = ε. Thus we arrive at a contradiction.
Consequently, f is not continuous at x0 .

Solution to Exercise 4.35.


(1) Since $(x \pm y^2)^2 \ge 0$, we obtain by rearranging that $\pm\dfrac{xy^2}{x^2 + y^4} \le \dfrac{1}{2}$, and so
\[ |f(x, y)| = \left|\frac{xy^2}{x^2 + y^4}\right| \le \frac{1}{2} \]
whenever $(x, y) \ne (0, 0)$.
(2) Consider the ball $B(0, r)$ with center $(0, 0)$ and radius $r > 0$. The idea is to take points $(x_n, y_n)$
in $B(0, r)$ of the type
\[ x_n := \frac{1}{n^m}, \qquad y_n := \frac{1}{n^\ell} \]
with large enough natural numbers $n$ and suitable exponents $\ell, m$ so that
\[ g(x_n, y_n) = \frac{\dfrac{1}{n^{m+2\ell}}}{\dfrac{1}{n^{2m}} + \dfrac{1}{n^{6\ell}}} = \frac{n^{m+4\ell}}{n^{6\ell} + n^{2m}} \]
grows unboundedly with increasing $n$. Clearly this can be arranged if $m + 4\ell > 6\ell$ and moreover
$m + 4\ell > 2m$, that is, if $4\ell > m > 2\ell$. One choice guaranteeing this is $\ell = 1$ and $m = 3$. Then
\[ g(x_n, y_n) = \frac{\dfrac{1}{n^5}}{\dfrac{1}{n^6} + \dfrac{1}{n^6}} = \frac{n}{2}. \]
Clearly for all large enough $n$, $\|(x_n, y_n)\|_2 = \sqrt{\dfrac{1}{n^6} + \dfrac{1}{n^2}} < r$, since $\dfrac{1}{n^6} + \dfrac{1}{n^2} \to 0$ as $n \to \infty$. So the
points $(x_n, y_n)$ belong to $B(0, r)$ for all $n$ large enough. But as $g(x_n, y_n) = \dfrac{n}{2}$, we see that $g$
is not bounded on $B(0, r)$.
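(3) We have that the sequence $((a_n, b_n))_{n\in\mathbb{N}}$ given by $a_n = \dfrac{1}{n^2}$ and $b_n = \dfrac{1}{n}$ $(n \in \mathbb{N})$ converges to
$(0, 0)$. However,
\[ f(a_n, b_n) = \frac{\dfrac{1}{n^4}}{\dfrac{1}{n^4} + \dfrac{1}{n^4}} = \frac{1}{2} \quad (n \in \mathbb{N}), \]
and so $(f(a_n, b_n))_{n\in\mathbb{N}} = (1/2)_{n\in\mathbb{N}}$ does not converge to $f(0, 0) = 0$. Consequently, $f$ is not
continuous at $(0, 0)$.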
(4) Suppose that g is continuous at (0, 0). Then with ε := 1 > 0, there exists an r > 0 such that
whenever (x, y) ∈ B(0, r), we have that
|g(x, y) − g(0, 0)| = |g(x, y) − 0| = |g(x, y)| < ε = 1.
In particular, g is bounded on the ball B(0, r). However, we have already seen in part (2) that
this is false. Hence g is not continuous at (0, 0).


(5) $f(0, y) = 0$ for all $y \in \mathbb{R}$, and so the restriction of $f$ to the line $x = 0$ is continuous. Similarly,
since $f(x, 0) = 0$ for all $x \in \mathbb{R}$, the restriction of $f$ to the line $y = 0$ is continuous. For a point
$p = (x, mx)$ on the line $y = mx$ with $m \ne 0$ and $x \ne 0$, we have that
\[ f(p) = \frac{x(mx)^2}{x^2 + (mx)^4} = \frac{m^2 x}{1 + m^4 x^2}. \]
Now if $p_n = (x_n, m x_n)$ $(n \in \mathbb{N})$ is a sequence of points on the line $y = mx$ that converges to
$p = (x, mx)$, then it is clear that $(x_n)_{n\in\mathbb{N}}$ converges to $x$, and so
\[ f(p_n) = \frac{m^2 x_n}{1 + m^4 x_n^2} \]
converges to
\[ f(p) = \frac{m^2 x}{1 + m^4 x^2} \]
by the algebra of limits. (If $x = 0$, then $x_n$ converges to $x = 0$, and so $f(p_n)$ clearly converges
to $0 = f(0, 0) = f(p)$.) Thus the restriction of $f$ to the line $y = mx$ is continuous.
$g(0, y) = 0$ for all $y \in \mathbb{R}$, and so the restriction of $g$ to the line $x = 0$ is continuous. Similarly,
since $g(x, 0) = 0$ for all $x \in \mathbb{R}$, the restriction of $g$ to the line $y = 0$ is continuous. For a point
$p = (x, mx)$ on the line $y = mx$ with $m \ne 0$ and $x \ne 0$, we have that
\[ g(p) = \frac{x(mx)^2}{x^2 + (mx)^6} = \frac{m^2 x}{1 + m^6 x^4}. \]
Now if $p_n = (x_n, m x_n)$ $(n \in \mathbb{N})$ is a sequence of points on the line $y = mx$ that converges to
$p = (x, mx)$, then it is clear that $(x_n)_{n\in\mathbb{N}}$ converges to $x$, and so
\[ g(p_n) = \frac{m^2 x_n}{1 + m^6 x_n^4} \]
converges to
\[ g(p) = \frac{m^2 x}{1 + m^6 x^4} \]
by the algebra of limits. (If $x = 0$, then $x_n$ converges to $x = 0$, and so $g(p_n)$ clearly converges
to $0 = g(0, 0) = g(p)$.) Thus the restriction of $g$ to the line $y = mx$ is continuous.
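
The computations in parts (2) and (3) can be checked with exact rational arithmetic. In the Python sketch below, f is the function whose formula appears in part (1), while the formula used for g, namely xy²/(x² + y⁶), is inferred from the computation in part (2) and should be read as an assumption about the statement of the exercise.

```python
from fractions import Fraction

def f(x, y):
    """f(x, y) = x y^2 / (x^2 + y^4) away from the origin, and f(0, 0) = 0."""
    return x * y**2 / (x**2 + y**4) if (x, y) != (0, 0) else 0

def g(x, y):
    """g(x, y) = x y^2 / (x^2 + y^6) away from the origin (inferred), g(0, 0) = 0."""
    return x * y**2 / (x**2 + y**6) if (x, y) != (0, 0) else 0

for n in (1, 10, 100, 1000):
    a, b = Fraction(1, n**2), Fraction(1, n)      # the sequence from part (3)
    xn, yn = Fraction(1, n**3), Fraction(1, n)    # the sequence from part (2)
    print(n, f(a, b), g(xn, yn))
```

Along the first sequence f stays at 1/2 while the points tend to the origin, and along the second sequence g equals n/2, in agreement with the two arguments above.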

Solution to Exercise 4.40. We had seen in Exercise 1.37 that a singleton set in any metric space is
closed. In particular, the set {0} is closed in Rn . As the linear transformation TA : Rm → Rn is continuous,
it follows that the inverse image of {0} under TA , namely the set
TA−1 ({0}) = {x ∈ Rm : Ax = 0} = ker A
is closed in Rm .

Solution to Exercise 4.41. Let V be a subspace of Rn , and let {v1 , . . . , vk } be a basis for V . Extend
this to a basis {v1 , . . . , vk , vk+1 , . . . , vn } for Rn . By using the Gram-Schmidt orthogonalization procedure,
we can find an orthonormal set of vectors {u1 , . . . , un } such that for each i ∈ {1, . . . , n}, the span of the
vectors $v_1, \dots, v_i$ coincides with the span of $u_1, \dots, u_i$. Now define $A \in \mathbb{R}^{(n-k)\times n}$ as follows:
\[ A = \begin{bmatrix} u_{k+1}^\top \\ \vdots \\ u_n^\top \end{bmatrix}. \]
It is clear from the orthonormality of the $u_i$'s that $Au_1 = Au_2 = Au_3 = \cdots = Au_k = 0$, and so it follows
that also any linear combination of $u_1, \dots, u_k$ lies in the kernel of $A$. In other words, $V \subseteq \ker A$.
On the other hand, if $x = \alpha_1 u_1 + \cdots + \alpha_n u_n$, where $\alpha_1, \dots, \alpha_n$ are scalars, and if $Ax = 0$, then it
follows that
\[ 0 = Ax = \begin{bmatrix} u_{k+1}^\top(\alpha_1 u_1 + \cdots + \alpha_n u_n) \\ \vdots \\ u_n^\top(\alpha_1 u_1 + \cdots + \alpha_n u_n) \end{bmatrix} = \begin{bmatrix} \alpha_{k+1} \\ \vdots \\ \alpha_n \end{bmatrix}, \]
and so $\alpha_{k+1} = \cdots = \alpha_n = 0$. Thus $x = \alpha_1 u_1 + \cdots + \alpha_k u_k \in V$. Hence also $\ker A \subseteq V$. Consequently
$V = \ker A$, and by the result of the previous exercise, it now follows that $V$ is closed.


Solution to Exercise 4.52.


(1) We have for $x \in \mathbb{R}^d$
\[ \begin{aligned} N(x) &= N(x_1 e_1 + \cdots + x_d e_d) \\ &\le N(x_1 e_1) + \cdots + N(x_d e_d) && \text{(triangle inequality)} \\ &= |x_1| N(e_1) + \cdots + |x_d| N(e_d) \\ &\le M \|x\|_2 && \text{(Cauchy-Schwarz)} \end{aligned} \]
where $M := \sqrt{(N(e_1))^2 + \cdots + (N(e_d))^2} > 0$.
(2) We have for $x, y \in \mathbb{R}^d$ that $N(x) = N(y + x - y) \le N(y) + N(x - y)$. Thus
\[ N(x) - N(y) \le N(x - y) \]
for all $x, y \in \mathbb{R}^d$. Interchanging $x, y$ we also have
\[ N(y) - N(x) \le N(y - x) = |-1| \cdot N(x - y) = N(x - y). \]
Hence $-(N(x) - N(y)) \le N(x - y)$ for all $x, y \in \mathbb{R}^d$. Consequently,
\[ |N(x) - N(y)| \le N(x - y) \]
for all $x, y \in \mathbb{R}^d$.
We have for all $x, y \in \mathbb{R}^d$ that $|N(x) - N(y)| \le N(x - y) \le M \|x - y\|_2$, and this shows the
continuity of $N : (\mathbb{R}^d, \|\cdot\|_2) \to \mathbb{R}$.
(3) $K := \{x \in \mathbb{R}^d : \|x\|_2 = 1\}$ is a compact set in $(\mathbb{R}^d, \|\cdot\|_2)$, and $N : (\mathbb{R}^d, \|\cdot\|_2) \to \mathbb{R}$ is continuous.
So by Weierstrass's theorem, there exists a point $c \in K$ such that
\[ N(x) \ge N(c) =: m \]
for all $x \in K$. As $c \ne 0$, $N(c) \ne 0$. Thus $m > 0$. We claim that in fact for all $x \in \mathbb{R}^d$,
\[ m \|x\|_2 \le N(x). \]
Suppose first that $x = 0$. Then clearly $m\|x\|_2 = m \cdot 0 = 0 = N(x)$. If on the other hand
$x \ne 0$, then we consider the vector
\[ y = \frac{1}{\|x\|_2}\, x \in \mathbb{R}^d, \]
and note that
\[ \|y\|_2 = \left\| \frac{1}{\|x\|_2}\, x \right\|_2 = \frac{1}{\|x\|_2} \|x\|_2 = 1, \]
and so $y \in K$. Consequently,
\[ N(y) = N\left(\frac{1}{\|x\|_2}\, x\right) = \frac{1}{\|x\|_2} N(x) \ge N(c) = m, \]
and rearranging this gives $m\|x\|_2 \le N(x)$. This completes the proof.
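
To see the two constants in a concrete case, take N to be the 1-norm on R^4. This choice, like the names in the sketch below, is only an illustration and is not part of the exercise. Then M = √(N(e1)² + · · · + N(e4)²) = 2, and the ratio N(x)/‖x‖2 stays between a positive lower bound and M on random samples.

```python
import math
import random

def N(x):
    """Hypothetical choice of norm for the illustration: the 1-norm."""
    return sum(abs(t) for t in x)

def norm2(x):
    """The Euclidean norm."""
    return math.sqrt(sum(t * t for t in x))

d = 4
basis = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]
M = math.sqrt(sum(N(e) ** 2 for e in basis))   # the constant from part (1)

random.seed(1)
ratios = []
for _ in range(10000):
    x = [random.gauss(0, 1) for _ in range(d)]
    ratios.append(N(x) / norm2(x))

print(M, min(ratios), max(ratios))
```

For the 1-norm the best lower constant is m = 1, attained at the standard basis vectors, so all sampled ratios lie in the interval [1, 2].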

Solution to Exercise 4.53.


(1) Let
\[ f(x) = \begin{cases} 2x & \text{if } x \in (0, \tfrac{1}{2}), \\ 2 - 2x & \text{if } x \in [\tfrac{1}{2}, 1). \end{cases} \]
Then $f$ is continuous on $(0, 1)$ and $f((0, 1)) = (0, 1]$.
(2) If there existed such a continuous f , then since 1/2, 3/2 ∈ T and since f (S) = T , there would
exist a, b ∈ S = (0, 1) such that f (a) = 1/2 and f (b) = 3/2. But then since we have that
f (a) = 1/2 < 1 < 3/2 = f (b), it follows by the Intermediate Value Theorem that there is a c
between a and b such that f (c) = 1. Thus 1 ∈ f (S) = T , a contradiction.
(3) We have $1, 2 \in \mathbb{Q} = T = f(S)$, and so there exist $a, b \in S = \mathbb{R}$ such that $f(a) = 1$ and $f(b) = 2$.
Since $f(a) = 1 < \sqrt{2} < 2 = f(b)$, it follows by the Intermediate Value Theorem that there is a
$c$ between $a$ and $b$ such that $f(c) = \sqrt{2}$. Thus $\sqrt{2} \in f(S) = T = \mathbb{Q}$, a contradiction.
(4) Set
\[ f(x) = \begin{cases} 0 & \text{if } x \in [0, 1], \\ 1 & \text{if } x \in [2, 3]. \end{cases} \]
Then $f$ is continuous on $S = [0, 1] \cup [2, 3]$. Indeed if $(x_n)_{n\in\mathbb{N}}$ is a convergent sequence in $S$ with limit
$L$ in $S$, then beyond a certain index $N$, the terms all lie in $[0, 1]$, or they all lie in $[2, 3]$. Thus


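the terms of the sequence (f (xn ))n∈N beyond this index N are all 0, or they are all 1. In either
case, the limit L lies in the same closed interval as these terms, so the sequence (f (xn ))n∈N is
eventually constant with value f (L) and hence converges to f (L). This shows that f is continuous.
Also it is clear that f (S) = T .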
(5) S is closed and bounded and so it is compact. Thus f (S) = T = R2 must be compact too. But
R2 is not bounded, and hence not compact. So we arrive at a contradiction.
(6) S is closed and bounded and so it is compact. Thus f (S) = T = (0, 1) × (0, 1) must be
compact too. But (0, 1) × (0, 1) is not compact (for example, the sequence ((1/n, 1/n))n∈N has
no convergent subsequence with limit in T ). So we arrive at a contradiction.
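(7) Define $\varphi : (0, 1) \to \mathbb{R}$ by
\[ \varphi(x) = \frac{x - 1/2}{x(1 - x)} \quad \text{for } x \in (0, 1). \]
Then $\varphi$ is continuous on $(0, 1)$ and $\varphi((0, 1)) = \mathbb{R}$. Now set
\[ f(x, y) = (\varphi(x), \varphi(y)) \quad \text{for } (x, y) \in (0, 1) \times (0, 1). \]
Then $f$ is continuous and $f((0, 1) \times (0, 1)) = \mathbb{R}^2$. We prove these claims below.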
If ((xn , yn ))n∈N is a sequence in (0, 1) × (0, 1) that converges to (x, y) belonging to the set
(0, 1) × (0, 1), then we have that (xn )n∈N converges to x and (yn )n∈N converges to y. As ϕ is
continuous, it follows that (ϕ(xn ))n∈N converges to ϕ(x) and (ϕ(yn ))n∈N converges to ϕ(y). Thus
(f (xn , yn ))n∈N = ((ϕ(xn ), ϕ(yn )))n∈N converges to (ϕ(x), ϕ(y)) = f (x, y). So f is continuous.
If (x, y) ∈ R2 , then x ∈ R and y ∈ R. As ϕ : (0, 1) → R is onto, there exist ξ, η ∈ (0, 1) such
that ϕ(ξ) = x and ϕ(η) = y. Thus f (ξ, η) = (ϕ(ξ), ϕ(η)) = (x, y). Hence f is onto.
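
That ϕ maps (0, 1) onto R can also be seen by solving ϕ(x) = y explicitly. Rearranging gives the quadratic yx² + (1 − y)x − 1/2 = 0, and for y ≠ 0 the root x = (y − 1 + √(1 + y²))/(2y) lies in (0, 1), while y = 0 corresponds to x = 1/2. This closed form is our own computation rather than part of the notes, and the Python sketch below checks it numerically.

```python
import math

def phi(x):
    """phi(x) = (x - 1/2) / (x (1 - x)) for x in (0, 1)."""
    return (x - 0.5) / (x * (1 - x))

def phi_inverse(y):
    """The point x in (0, 1) with phi(x) = y, from the quadratic y x^2 + (1 - y) x - 1/2 = 0."""
    if y == 0:
        return 0.5
    return (y - 1 + math.sqrt(1 + y * y)) / (2 * y)

for y in (-50.0, -3.0, -0.1, 0.0, 0.1, 3.0, 50.0):
    x = phi_inverse(y)
    print(y, x, phi(x))
```

A preimage of a point (x, y) of R² under f is then (phi_inverse(x), phi_inverse(y)).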

Solution to Exercise 4.54.


(1) If there were two distinct fixed points, say c1 , c2 ∈ X, then f (c1 ) = c1 ≠ c2 = f (c2 ), and so from (4.1)
we would have d(c1 , c2 ) = d(f (c1 ), f (c2 )) < d(c1 , c2 ), a contradiction.
(2) For $x \in (0, \tfrac{1}{2})$ we have $0 < f(x) = x^2 < (\tfrac{1}{2})^2 = \tfrac{1}{4} < \tfrac{1}{2}$, and so we have that $f$ maps $(0, \tfrac{1}{2})$ into $(0, \tfrac{1}{2})$. Also, we
have for all $x \ne y$ that
\[ |f(x) - f(y)| = |x^2 - y^2| = |x + y||x - y| \le (|x| + |y|)|x - y| < \left(\frac{1}{2} + \frac{1}{2}\right)|x - y| = |x - y|. \]
If $f$ has a fixed point $c$, then $f(c) = c^2 = c$, and so $c = 0$ or $c = 1$, but neither of these belong
to $X = (0, 1/2)$. Hence $f$ has no fixed points.
(3) It is easy to see that f is continuous: if x0 ∈ X and ε > 0, then setting δ = ε > 0, we have for all x ∈ X
satisfying 0 < d(x, x0 ) < δ that d(f (x), f (x0 )) < d(x, x0 ) < δ = ε.
By the triangle inequality and using the condition in (4.1) we have for all x ≠ y that
d(x, f (x)) ≤ d(x, y) + d(y, f (y)) + d(f (y), f (x)) < d(x, y) + d(y, f (y)) + d(x, y),
and so d(x, f (x)) − d(y, f (y)) < 2d(x, y). Interchanging x, y, this yields
|g(x) − g(y)| = |d(x, f (x)) − d(y, f (y))| < 2d(x, y),
and this gives the desired continuity of g. (Alternately, we could use the result from Exer-
cise 2.18. If (xn )n∈N is a sequence in X convergent with a limit L ∈ X, then (f (xn ))n∈N
is also convergent with limit f (L) ∈ X. Thus by the result in Exercise 2.18, it follows that
(g(xn ))n∈N = (d(xn , f (xn )))n∈N is convergent with limit g(L) = d(L, f (L)). So g is continuous.)
(4) By Weierstrass’s Theorem the continuous function g on the compact set X attains a minimum
at some c ∈ X. We claim that f (c) = c. Suppose not. Then by (4.1), we obtain
g(f (c)) = d(f (c), f (f (c))) < d(c, f (c)) = g(c),
contradicting the fact that c is a minimizer of g on X. So f (c) = c, and hence a fixed point of
f exists. It is also unique by the result in part (1).

Solution to Exercise 4.55. Since X is compact and since f : X → Z is continuous, it follows from
Theorem 4.48 that f (X) is a compact subset of Z. Suppose that f is not finite-valued. Then one can
find an infinite sequence of distinct integers (nk )k∈N such that nk ∈ f (X) for each k ∈ N. But then this
sequence in the set f(X) cannot have a convergent subsequence since whenever k ≠ ℓ, we have $n_k \ne n_\ell$ and so
$|n_k - n_\ell| \ge 1$. This contradicts the compactness of f(X). Hence f can assume only finitely many values.


Solution to Exercise 4.56.


(1) It suffices to construct a one-to-one correspondence between [0, 1] and (0, 1), because this map
and its inverse will then be automatically continuous (since both spaces have the discrete metric).
One way to construct the one-to-one correspondence f : [0, 1] → (0, 1) is to consider an
enumeration of the rationals in (0, 1): r1, r2, r3, . . . , and to define
\[
f(0) = r_1, \quad f(1) = r_2, \quad f(r_1) = r_3, \quad f(r_2) = r_4, \quad \dots, \quad f(r_n) = r_{n+2}, \quad \dots,
\]
and f(x) = x if x ∈ [0, 1] is irrational. It is clear then that f is one-to-one and onto.
(2) [0, 1] is compact with the Euclidean metric, while (0, 1) isn’t compact with the Euclidean metric.
(Indeed, if the sequence 1/2, 1/3, 1/4, . . . in (0, 1) had a convergent subsequence with a limit L ∈ (0, 1),
then this subsequence would also converge in R to L, which forces L to be 0, a contradiction.) If there were a
continuous map f from [0, 1] onto (0, 1), then f([0, 1]) = (0, 1) would be compact, a contradiction.

Solution to Exercise 4.60. Suppose that f is uniformly continuous. Let ε = 1 > 0, and let δ > 0 be such
that for all x, y ∈ R satisfying |x − y| < δ, there holds |x² − y²| < 1. For n ∈ N, set $x_n = n$ and $y_n = n + \tfrac{1}{n}$.
Then $|x_n - y_n| = \tfrac{1}{n} < \delta$ for all n large enough. However, $|x_n^2 - y_n^2| = |n^2 - (n + \tfrac{1}{n})^2| = 2 + \tfrac{1}{n^2} > 2 > 1 = \varepsilon$,
a contradiction. Thus f is not uniformly continuous.
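The computation above can be checked with exact rational arithmetic. The sketch below (the function name `gap` is made up for this example) evaluates |x_n² − y_n²| for x_n = n and y_n = n + 1/n and shows that the gap stays above 2 even as |x_n − y_n| = 1/n shrinks.

```python
from fractions import Fraction

def gap(n):
    # Exact value of |x_n^2 - y_n^2| for x_n = n and y_n = n + 1/n.
    x = Fraction(n)
    y = Fraction(n) + Fraction(1, n)
    return abs(x * x - y * y)

for n in (1, 10, 1000):
    # |x_n - y_n| = 1/n shrinks, yet the gap of the squares equals 2 + 1/n^2.
    print(n, Fraction(1, n), gap(n))
```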

Solution to Exercise 4.63. We have for x, y ∈ R that ||x| − |y|| ≤ |x − y|. If ε > 0, then with δ := ε > 0,
we have for all x, y ∈ R satisfying |x − y| < δ = ε that ||x| − |y|| ≤ |x − y| < δ = ε. So the absolute value
function is uniformly continuous.

Solution to Exercise 4.64. If x, y ∈ X, then by the triangle inequality we have


d(x, c) ≤ d(x, y) + d(y, c),
and so d(x, c) − d(y, c) ≤ d(x, y). Interchanging x, y, we also have
d(y, c) ≤ d(y, x) + d(x, c) = d(x, y) + d(x, c),
and so −(d(x, c) − d(y, c)) ≤ d(x, y). Consequently for all x, y ∈ X, we obtain
|d(x, c) − d(y, c)| ≤ d(x, y).
So if ε > 0 then with δ := ε > 0, we have for all x, y ∈ X satisfying d(x, y) < δ = ε that
|d(x, c) − d(y, c)| ≤ d(x, y) < δ = ε.
So the distance function to a (fixed) point is uniformly continuous.

Solution to Exercise 4.65. Let ε > 0. Set δ = ε/L > 0. Then whenever x, y ∈ Rⁿ and ‖x − y‖₂ < δ,
we have
\[
\|f(x) - f(y)\|_2 \le L\,\|x - y\|_2 < L\delta = \varepsilon.
\]
Thus f is uniformly continuous.

Solution to Exercise 4.66. Suppose that (xn)n∈N is a Cauchy sequence in X. We want to show that
(f(xn))n∈N is a Cauchy sequence in Y. Let ε > 0. Since f is uniformly continuous, there exists a δ > 0
such that whenever d(x, y) < δ, we have d(f(x), f(y)) < ε. As (xn)n∈N is Cauchy, there exists an index
N ∈ N such that for all n, m > N, d(xn, xm) < δ. But then for all n, m > N, we have d(f(xn), f(xm)) < ε.
Hence (f(xn))n∈N is a Cauchy sequence in Y.


Solutions to the exercises from Chapter 5


Solution to Exercise 5.9. We have
\[
\varphi(a) = \det \begin{bmatrix} f(a) & g(a) & 1 \\ f(a) & g(a) & 1 \\ f(b) & g(b) & 1 \end{bmatrix} = 0
\]
since the first and second rows of the matrix are linearly dependent. Similarly,
\[
\varphi(b) = \det \begin{bmatrix} f(b) & g(b) & 1 \\ f(a) & g(a) & 1 \\ f(b) & g(b) & 1 \end{bmatrix} = 0.
\]
Also, for x ∈ [a, b],
\[
\varphi(x) = \det \begin{bmatrix} f(x) & g(x) & 1 \\ f(a) & g(a) & 1 \\ f(b) & g(b) & 1 \end{bmatrix}
= f(x)(g(a) - g(b)) - g(x)(f(a) - f(b)) + f(a)g(b) - f(b)g(a),
\]

and so we see that ϕ is continuous on [a, b], differentiable on (a, b) and

ϕ′ (x) = f ′ (x)(g(a) − g(b)) − g ′ (x)(f (a) − f (b)) (x ∈ (a, b)).

As ϕ(a) = ϕ(b), it follows from Rolle’s Theorem that there is a c ∈ (a, b) for which ϕ′ (c) = 0, that is,

f ′ (c)(g(a) − g(b)) − g ′ (c)(f (a) − f (b)) = 0.

Rearranging, we obtain the desired equality.

Solution to Exercise 5.13. We have |f(x) − f(y)| ≤ (x − y)² = |x − y|², and so for x ≠ y we obtain
\[
\left| \frac{f(x) - f(y)}{x - y} - 0 \right| \le |x - y|.
\]
Let ε > 0. Then with δ := ε > 0, we have that whenever y satisfies 0 < |x − y| < δ = ε,
\[
\left| \frac{f(x) - f(y)}{x - y} - 0 \right| \le |x - y| < \delta = \varepsilon.
\]
Thus f′(x) = 0 for all x ∈ R.
Now suppose that a, b ∈ R and that a < b. Then by the Mean Value Theorem applied to f on the
interval [a, b], we obtain
\[
\frac{f(a) - f(b)}{a - b} = f'(c)
\]
for some c ∈ (a, b). But f ′ (c) = 0, and so f (a) = f (b). Thus f is constant.

Solution to Exercise 5.14. If x, y ∈ (a, b) and x < y, then we have by the Mean Value Theorem applied
to f on the interval [x, y] that
\[
\frac{f(x) - f(y)}{x - y} = f'(c)
\]
for some c ∈ (x, y) ⊆ (a, b). Since |f′(c)| ≤ M, we obtain that

|f (x) − f (y)| ≤ M |x − y|.

Clearly this is also true if x = y, and (by interchanging the roles of x and y) if x > y. Hence for all
x, y ∈ (a, b), there holds |f (x) − f (y)| ≤ M |x − y|.
Let ε > 0. Set δ = ε/(M + 1) (the M + 1 is in case M = 0 and f is constant). Then if x, y ∈ (a, b)
and |x − y| < δ, we have

|f(x) − f(y)| ≤ M|x − y| ≤ Mδ < (M + 1)δ = ε.

Hence f is uniformly continuous on (a, b).


Solution to Exercise 5.15. Call the cubic function f . Suppose that a, b are two distinct zeros such
that a < b. Then f (a) = f (b) (= 0), and so by Rolle’s Theorem, there must exist a c ∈ (a, b) such that
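f′(c) = 0. But
\[
f'(c) = 6c^2 + 6c + 6 = 6(c^2 + c + 1) = 6\left(c^2 + 2 \cdot c \cdot \tfrac{1}{2} + \tfrac{1}{4} + \tfrac{3}{4}\right) = 6\left(\left(c + \tfrac{1}{2}\right)^2 + \tfrac{3}{4}\right) > 0,
\]
a contradiction.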
Thus we see that f has at most one zero.
We now show that in fact there does exist a zero. Clearly f (0) = 10 > 0, while we have that
f (−2) = −16 + 12 − 12 + 10 = −6 < 0. So by the Intermediate Value Theorem, there is a z ∈ [−2, 0] such
that f (z) = 0.
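The existence argument can be made concrete by bisection. This is only a sketch: the cubic 2x³ + 3x² + 6x + 10 used below is inferred from the values f′(c) = 6c² + 6c + 6, f(0) = 10 and f(−2) = −6 appearing in the solution, and the function name `bisect` and the tolerance are choices made here.

```python
def f(x):
    # Cubic consistent with f'(c) = 6c^2 + 6c + 6 and f(0) = 10 in the solution.
    return 2 * x**3 + 3 * x**2 + 6 * x + 10

def bisect(lo, hi, tol=1e-12):
    # Assumes f(lo) < 0 < f(hi); halves the bracket until it is shorter than tol.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

z = bisect(-2.0, 0.0)  # the interval [-2, 0] from the Intermediate Value Theorem step
print(z, f(z))
```

Because f′ > 0 everywhere, the zero located this way is the only one.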

Solution to Exercise 5.16.


(1) Suppose that there exist a, b ∈ R such that a < b and f (a) = a, f (b) = b. Applying the Mean
Value Theorem to f on [a, b], we obtain
\[
\frac{f(b) - f(a)}{b - a} = \frac{b - a}{b - a} = 1 = f'(c)
\]
for some c ∈ (a, b) ⊆ (0, 1). But as f′(c) ≠ 1, we have arrived at a contradiction. Hence if f has
a fixed point, then it is unique.
(2) Let x1 ∈ R, and set xn+1 := f (xn ) for n ∈ N. We claim that
|f (xn+1 ) − f (xn )| ≤ M |xn+1 − xn | (n ∈ N). (6.4)
This is clearly true if xn+1 = xn . If they are not equal, then we apply the Mean Value Theorem
to the interval with endpoints xn+1 and xn to obtain
\[
\frac{f(x_{n+1}) - f(x_n)}{x_{n+1} - x_n} = f'(c),
\]
for some c between xn+1 and xn . As |f ′ (c)| ≤ M , this yields the inequality (6.4).
Thus we have by a repeated application of (6.4) that
\begin{align*}
|x_{n+1} - x_n| = |f(x_n) - f(x_{n-1})| &\le M\,|x_n - x_{n-1}| \\
&\le M \cdot M\,|x_{n-1} - x_{n-2}| \\
&\;\;\vdots \\
&\le M^{n-1}\,|x_2 - x_1|.
\end{align*}
Now consider the series
\[
x_1 + \sum_{n=1}^{\infty} (x_{n+1} - x_n).
\]
The estimate $|x_{n+1} - x_n| \le M^{n-1}|x_2 - x_1|$, with M < 1, shows that this series converges
absolutely, using the Comparison Test. As the partial sums of the above series telescope, that
is,
\[
x_1 + (x_2 - x_1) + (x_3 - x_2) + \cdots + (x_{n+1} - x_n) = x_{n+1},
\]
it follows that
\[
x_* := \lim_{n \to \infty} x_n
\]
exists. Since f is continuous, we also have that
\[
f(x_*) = f\left(\lim_{n \to \infty} x_n\right) = \lim_{n \to \infty} f(x_n) = \lim_{n \to \infty} x_{n+1} = x_*.
\]

So x∗ is a fixed point.
(3) See Figure 15.

Figure 15. The zig-zag path (x1, x2) → (x2, x2) → (x2, x3) → . . . , drawn against the line y = x and approaching x∗.

(4) We have
\[
f'(x) = 1 - \frac{e^x}{(1 + e^x)^2} = \frac{1 + e^{2x} + e^x}{1 + e^{2x} + 2e^x},
\]
and so 0 < f′(x) < 1 for all x ∈ R.
If f had a fixed point $x_*$, then $f(x_*) = x_*$ would imply that
\[
f(x_*) = x_* + \frac{1}{1 + e^{x_*}} = x_*,
\]
and so we would have
\[
\frac{1}{1 + e^{x_*}} = 0,
\]
which is clearly impossible, since $e^{x_*} > 0$.
We observe that although |f′(x)| < 1 for all x ∈ R, it is not uniformly bounded away from
1, that is, there does not exist an M < 1 such that for all x ∈ R, |f′(x)| < M. Suppose, on the
contrary, that such an M exists. Then we would have for all x ∈ R that
\[
1 - \frac{e^x}{(1 + e^x)^2} < M.
\]
In particular, setting x = −n (n ∈ N), we obtain for all n ∈ N that
\[
1 - \frac{e^{-n}}{(1 + e^{-n})^2} < M.
\]
Passing to the limit as n → ∞ yields
\[
1 - \frac{0}{(1 + 0)^2} = 1 \le M,
\]
a contradiction to the fact that M < 1. Thus there is no contradiction to the result from part
(2).
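The contrast between parts (2) and (4) can be seen numerically. In the sketch below, `g(x) = cos(x)/2` is a hypothetical example chosen here because |g′(x)| ≤ 1/2 < 1, while `h` is the map from part (4); the starting points and step counts are arbitrary.

```python
import math

def iterate(f, x, steps):
    # Apply x_{n+1} = f(x_n) `steps` times and return the final value.
    for _ in range(steps):
        x = f(x)
    return x

def g(x):
    # Hypothetical example with |g'(x)| = |sin x| / 2 <= 1/2 < 1, as required in part (2).
    return math.cos(x) / 2

def h(x):
    # The map from part (4): 0 < h'(x) < 1, but h' is not bounded away from 1.
    return x + 1 / (1 + math.exp(x))

x_star = iterate(g, 10.0, 100)
print(x_star, g(x_star))                           # the two values agree closely
print(iterate(h, 0.0, 10), iterate(h, 0.0, 1000))  # the iterates keep increasing
```

For g the iterates settle at a fixed point, as part (2) predicts. For h each step adds the positive amount 1/(1 + e^x), so the iterates never settle.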

Solution to Exercise 5.17. If x ≠ 0, then f is differentiable at x and we have
\[
f'(x) = 2x \sin\frac{1}{x} + x^2 \left(\cos\frac{1}{x}\right)\left(-\frac{1}{x^2}\right) = 2x \sin\frac{1}{x} - \cos\frac{1}{x}.
\]
On the other hand, we have for x ≠ 0 that
\[
\frac{f(x) - f(0)}{x - 0} = x \sin\frac{1}{x},
\]
and so, given ε > 0, if we set δ = ε > 0, then we have for all x ∈ R satisfying 0 < |x − 0| = |x| < δ = ε that
\[
\left| \frac{f(x) - f(0)}{x - 0} - 0 \right| = \left| x \sin\frac{1}{x} \right| = |x| \cdot \left| \sin\frac{1}{x} \right| \le |x| \cdot 1 < \delta = \varepsilon.
\]
Consequently, f is differentiable at 0, and its derivative is f′(0) = 0.
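If f′ were continuous, then since the sequence $\left(\frac{1}{2\pi n}\right)_{n \in \mathbb{N}}$ converges to 0, it would follow that
$\left(f'\left(\frac{1}{2\pi n}\right)\right)_{n \in \mathbb{N}}$ also converges to f′(0) = 0. However, for all n ∈ N,
\[
f'\left(\frac{1}{2\pi n}\right) = \frac{2}{2\pi n} \sin(2\pi n) - \cos(2\pi n) = 0 - 1 = -1,
\]
so this sequence converges to −1 ≠ 0. Hence f′ is not continuous at 0. (So f is differentiable, but not “continuously differentiable”.)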
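The failure of continuity of f′ at 0 can be observed numerically. The sketch below assumes f(x) = x² sin(1/x) for x ≠ 0 with f(0) = 0, which is the function consistent with the derivative formula in the solution; the name `f_prime` is chosen here.

```python
import math

def f_prime(x):
    # Derivative of f(x) = x^2 sin(1/x) for x != 0, together with f'(0) = 0.
    if x == 0:
        return 0.0
    return 2 * x * math.sin(1 / x) - math.cos(1 / x)

for n in (1, 10, 100, 1000):
    x = 1 / (2 * math.pi * n)
    print(x, f_prime(x))  # x tends to 0 while f'(x) stays near -1
```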


Solution to Exercise 5.18. Since x ↦ f(x + n) is differentiable too, it follows that
\[
x \mapsto \frac{f(x + n) - f(x)}{n} = f'(x)
\]
is differentiable. Also, for all m ∈ N, we have
\begin{align*}
f'(x + m) &= \frac{f(x + m + n) - f(x + m)}{n} = \frac{f(x + m + n) - f(x) + f(x) - f(x + m)}{n} \\
&= \frac{n + m}{n} \cdot \frac{f(x + m + n) - f(x)}{n + m} - \frac{m}{n} \cdot \frac{f(x + m) - f(x)}{m} \\
&= \frac{n + m}{n} f'(x) - \frac{m}{n} f'(x) = f'(x).
\end{align*}
Thus
\[
f''(x) = \frac{f'(x + n) - f'(x)}{n} = \frac{f'(x) - f'(x)}{n} = 0.
\]
By the Mean Value Theorem applied to f′, this gives that f′ is a constant, that is, there is a c ∈ R such
that for all x ∈ R, f′(x) = c. Applying the Mean Value Theorem to f, we obtain for every nonzero real x
that
\[
\frac{f(x) - f(0)}{x - 0} = f'(z)
\]
for some z between 0 and x. But since f′ is constant, it follows that f(x) = f(0) + cx for all x ∈ R.

Solution to Exercise 5.19. If a, b ∈ R and a < b, then applying the Mean Value Theorem to the
function cos, we obtain
\[
\frac{\cos a - \cos b}{a - b} = -\sin c
\]
for some c ∈ (a, b). But as |sin θ| ≤ 1 for all real θ, it follows that
\[
\left| \frac{\cos a - \cos b}{a - b} \right| \le 1,
\]
hence |cos a − cos b| ≤ |a − b|.
The case a > b is similar (note that the inequality does not change if a and b are interchanged).
The case a = b is trivial (both sides of the inequality equal 0).

Solution to Exercise 5.20.


(1) See Figure 16. The point (1 − t)x + ty is the point dividing the line segment joining x and
y in the ratio t : (1 − t). The convexity inequality says that the corresponding value of f, namely
f ((1 − t)x + ty) is at most (1 − t)f (x) + tf (y), and so the point ((1 − t)x + ty, f ((1 − t)x + ty))
lies below the point ((1 − t)x + ty, (1 − t)f (x) + tf (y)) on the “chord” joining the points (x, f (x))
and (y, f (y)).

Figure 16. Geometric meaning of the convexity inequality (horizontal axis marks: x, (1 − t)x + ty, y; vertical axis marks: f(x), f((1 − t)x + ty), (1 − t)f(x) + tf(y), f(y)).


(2) (“If part”) Suppose that f″(x) ≥ 0 for all x ∈ R. Let x, y ∈ R and (without loss of generality)
let x < y. Suppose that t ∈ (0, 1). Applying the Mean Value Theorem to f on the interval
[x, (1 − t)x + ty] gives
\[
\frac{f((1 - t)x + ty) - f(x)}{t(y - x)} = f'(c_1)
\]
for some c1 such that x < c1 < (1 − t)x + ty. Also, by the Mean Value Theorem for f on the
interval [(1 − t)x + ty, y] we obtain
\[
\frac{f(y) - f((1 - t)x + ty)}{(1 - t)(y - x)} = f'(c_2)
\]
for some c2 such that (1 − t)x + ty < c2 < y. As f″(ξ) ≥ 0 for all real ξ, and since c1 < c2, we
can conclude that f′(c1) ≤ f′(c2) using Corollary 5.10. Hence we obtain
\[
\frac{f((1 - t)x + ty) - f(x)}{t(y - x)} = f'(c_1) \le f'(c_2) = \frac{f(y) - f((1 - t)x + ty)}{(1 - t)(y - x)}.
\]
Rearranging, we obtain f((1 − t)x + ty) ≤ (1 − t)f(x) + tf(y). So f is convex.
(“Only if part”) Now suppose that f is convex. Let x < u < y. If
\[
t := \frac{u - x}{y - x},
\]
then t ∈ (0, 1), and $1 - t = \frac{y - u}{y - x}$. From the convexity of f, we obtain
\[
\frac{y - u}{y - x} f(x) + \frac{u - x}{y - x} f(y) \ge f\left(\frac{y - u}{y - x}\,x + \frac{u - x}{y - x}\,y\right) = f(u),
\]
that is,
\[
(y - x) f(u) \le (u - x) f(y) + (y - u) f(x). \tag{6.5}
\]
From (6.5), we obtain (y − x)f(u) ≤ (u − x)f(y) + (y − x + x − u)f(x), that is,
\[
(y - x) f(u) - (y - x) f(x) \le (u - x) f(y) - (u - x) f(x),
\]
and so
\[
\frac{f(u) - f(x)}{u - x} \le \frac{f(y) - f(x)}{y - x}. \tag{6.6}
\]
From (6.5), we also have (y − x)f(u) ≤ (u − y + y − x)f(y) + (y − u)f(x), that is,
\[
(y - x) f(u) - (y - x) f(y) \le (u - y) f(y) - (u - y) f(x),
\]
and so (dividing by (y − x)(u − y) < 0, which reverses the inequality)
\[
\frac{f(y) - f(x)}{y - x} \le \frac{f(y) - f(u)}{y - u}. \tag{6.7}
\]
Combining (6.6) and (6.7),
\[
\frac{f(u) - f(x)}{u - x} \le \frac{f(y) - f(x)}{y - x} \le \frac{f(y) - f(u)}{y - u}.
\]
Passing to the limit as u ↘ x in the left inequality and as u ↗ y in the right inequality, we obtain
\[
f'(x) \le \frac{f(y) - f(x)}{y - x} \le f'(y),
\]
and so f′ is increasing.
Moreover, if f is twice differentiable, then
\[
f''(x) = \lim_{\substack{y \to x \\ y > x}} \frac{f'(y) - f'(x)}{y - x} \ge 0.
\]
(3) Suppose that x1 ∈ R is such that f(x1) < f(x0). We consider the two possible cases:
1° x1 > x0. By the Mean Value Theorem applied to f on [x0, x1], we obtain the existence of
a c such that x0 < c < x1 and
\[
f'(x_0) = 0 > \frac{f(x_1) - f(x_0)}{x_1 - x_0} = f'(c),
\]
contradicting the convexity of f (by part (2), f′ is increasing, so c > x0 forces f′(c) ≥ f′(x0)).
2° x1 < x0. By the Mean Value Theorem applied to f on [x1, x0], we obtain the existence of
a c such that x1 < c < x0 and
\[
f'(x_0) = 0 < \frac{f(x_1) - f(x_0)}{x_1 - x_0} = f'(c),
\]
again contradicting the convexity of f (now c < x0 forces f′(c) ≤ f′(x0)).
Thus f(x) ≥ f(x0) for all x ∈ R.

Solution to Exercise 5.25. Let |x| < 1. Choose any r ∈ (0, 1) such that |x| < r < 1. That the series
\[
\sum_{n=1}^{\infty} n r^{n-1}
\]
converges can be seen using the Ratio Test:
\[
\lim_{n \to \infty} \frac{(n + 1) r^n}{n r^{n-1}} = \lim_{n \to \infty} \left(1 + \frac{1}{n}\right) r = r < 1.
\]
Now consider the functions fn : (−r, r) → R defined by fn(x) = xⁿ (x ∈ (−r, r), n ∈ N). Then the
geometric series
\[
\sum_{n=1}^{\infty} f_n(x) = \sum_{n=1}^{\infty} x^n
\]
converges for each x ∈ (−r, r).
Now we will show that
\[
\sum_{n=1}^{\infty} f_n'(x) = \sum_{n=1}^{\infty} n x^{n-1}
\]
converges uniformly using the convergence of the series
\[
\sum_{n=1}^{\infty} n r^{n-1}.
\]
By the Comparison Test, we have the pointwise convergence on (−r, r) of the series
\[
\sum_{n=1}^{\infty} n x^{n-1}
\]
since $|n x^{n-1}| \le n r^{n-1}$ (n ∈ N). Also,
\[
\left| \sum_{n=1}^{N} n x^{n-1} - \sum_{n=1}^{\infty} n x^{n-1} \right| = \left| \sum_{n=N+1}^{\infty} n x^{n-1} \right| \le \sum_{n=N+1}^{\infty} n r^{n-1}
\]
shows the claimed uniform convergence. Thus we have by Corollary 5.23 that
\[
\frac{d}{dx}\left(\frac{1}{1 - x}\right) = \sum_{n=1}^{\infty} n x^{n-1},
\]
and so $1 + 2x + 3x^2 + \cdots = \dfrac{1}{(1 - x)^2}$.
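The identity can be sanity-checked by comparing partial sums with the closed form. This is a minimal sketch; the function name `partial_sum` and the sample point x = 0.5 are choices made here.

```python
def partial_sum(x, N):
    # Sum of n * x^(n-1) for n = 1, ..., N.
    return sum(n * x ** (n - 1) for n in range(1, N + 1))

x = 0.5  # arbitrary sample point with |x| < 1
for N in (5, 20, 80):
    # The partial sums approach the closed form 1 / (1 - x)^2 as N grows.
    print(N, partial_sum(x, N), 1 / (1 - x) ** 2)
```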


Solution to Exercise 5.29. (From the calculation done in the given hint, we guess that the derivative
of g at c is the map Rⁿ ∋ h ↦ 2f(c)f′(c)h, and we prove our claim below.)
We have for x ∈ Rⁿ that
\[
(f(x))^2 - (f(c))^2 - 2f(c)f'(c)(x - c) = (f(x) + f(c))\bigl(f(x) - f(c) - f'(c)(x - c)\bigr) + (f(x) - f(c))\,f'(c)(x - c).
\]
From the Cauchy–Schwarz inequality, we know that |f′(c)(x − c)| ≤ ‖f′(c)‖₂ ‖x − c‖₂, and so
\[
\bigl|(f(x))^2 - (f(c))^2 - 2f(c)f'(c)(x - c)\bigr| \le |f(x) + f(c)|\,|f(x) - f(c) - f'(c)(x - c)| + |f(x) - f(c)|\,\|f'(c)\|_2 \|x - c\|_2.
\]
Thus we have for x ≠ c that
\[
\frac{\bigl|(f(x))^2 - (f(c))^2 - 2f(c)f'(c)(x - c)\bigr|}{\|x - c\|_2}
\le |f(x) + f(c)|\,\frac{|f(x) - f(c) - f'(c)(x - c)|}{\|x - c\|_2} + |f(x) - f(c)|\,\|f'(c)\|_2.
\]
Let ε > 0. Since f is differentiable at c, it is also continuous at c. Hence there exists a δ1 > 0 such that
whenever ‖x − c‖₂ < δ1, we have that
\[
|f(x) - f(c)| < \frac{\varepsilon}{2(\|f'(c)\|_2 + 1)}.
\]
We have |f(x) + f(c)| = |f(x) − f(c) + 2f(c)| ≤ |f(x) − f(c)| + 2|f(c)|. Bearing this in mind, let δ2 > 0
be such that whenever ‖x − c‖₂ < δ2, we have |f(x) − f(c)| < 1. Thus, for ‖x − c‖₂ < δ2,
\[
|f(x) + f(c)| \le |f(x) - f(c)| + 2|f(c)| < 1 + 2|f(c)|.
\]
As f is differentiable at c, there is a δ3 > 0 such that whenever 0 < ‖x − c‖₂ < δ3, we have
\[
\frac{|f(x) - f(c) - f'(c)(x - c)|}{\|x - c\|_2} < \frac{\varepsilon}{2(1 + 2|f(c)|)}.
\]
Set δ = min{δ1, δ2, δ3} > 0. Then whenever 0 < ‖x − c‖₂ < δ, we have
\begin{align*}
\frac{\bigl|(f(x))^2 - (f(c))^2 - 2f(c)f'(c)(x - c)\bigr|}{\|x - c\|_2}
&\le |f(x) + f(c)|\,\frac{|f(x) - f(c) - f'(c)(x - c)|}{\|x - c\|_2} + |f(x) - f(c)|\,\|f'(c)\|_2 \\
&< (1 + 2|f(c)|)\,\frac{\varepsilon}{2(1 + 2|f(c)|)} + \frac{\varepsilon}{2(\|f'(c)\|_2 + 1)}\,\|f'(c)\|_2 < \varepsilon.
\end{align*}
Thus g is differentiable at c and g′(c) : Rⁿ → R is given by g′(c)h = 2f(c)f′(c)h (h ∈ Rⁿ).

Solution to Exercise 5.30. We have
\begin{align*}
q(x) - q(c) - 2c^\top Q(x - c) &= x^\top Q x - c^\top Q c - 2c^\top Q(x - c) \\
&= x^\top Q x - c^\top Q c - 2c^\top Q x + 2c^\top Q c \\
&= x^\top Q x + c^\top Q c - 2c^\top Q x \\
&= x^\top Q x - c^\top Q x + c^\top Q c - c^\top Q x \\
&= (x - c)^\top Q x - c^\top Q(x - c).
\end{align*}
But $c^\top Q(x - c)$ is a scalar, and so it is its own transpose. So we obtain, using $Q = Q^\top$, that
\[
c^\top Q(x - c) = (c^\top Q(x - c))^\top = (x - c)^\top Q^\top c = (x - c)^\top Q c.
\]
Thus $q(x) - q(c) - 2c^\top Q(x - c) = (x - c)^\top Q x - (x - c)^\top Q c = (x - c)^\top Q(x - c)$. Now using the
Cauchy–Schwarz inequality we have the following estimate. For any vector v ∈ Rⁿ, we have
\begin{align*}
\|Qv\|_2^2 &= \sum_{i=1}^{n} \left( \sum_{j=1}^{n} q_{ij} v_j \right)^2 \\
&\le \sum_{i=1}^{n} \left( \sum_{j=1}^{n} q_{ij}^2 \right) \left( \sum_{j=1}^{n} v_j^2 \right) \\
&\le \left( \sum_{i=1}^{n} \sum_{j=1}^{n} \|Q\|_\infty^2 \right) \|v\|_2^2 = n^2 \|Q\|_\infty^2 \|v\|_2^2.
\end{align*}
Thus we obtain
\begin{align*}
|q(x) - q(c) - 2c^\top Q(x - c)| = |(x - c)^\top Q(x - c)| &\le \|x - c\|_2 \,\|Q(x - c)\|_2 \\
&\le \|x - c\|_2 \cdot n \cdot \|Q\|_\infty \|x - c\|_2 \\
&= n \cdot \|Q\|_\infty \cdot \|x - c\|_2^2.
\end{align*}
Let ε > 0. Set $\delta = \dfrac{\varepsilon}{n(\|Q\|_\infty + 1)}$. Then whenever 0 < ‖x − c‖₂ < δ, we have
\[
\frac{|q(x) - q(c) - 2c^\top Q(x - c)|}{\|x - c\|_2} \le n\|Q\|_\infty \|x - c\|_2 < n\|Q\|_\infty \delta = n\|Q\|_\infty \frac{\varepsilon}{n(\|Q\|_\infty + 1)} < \varepsilon.
\]
Hence q is differentiable at c and q′(c) : Rⁿ → R is given by $q'(c)v = 2c^\top Q v$ (v ∈ Rⁿ).

Solution to Exercise 5.31. We have that $f(x) = \|x\|_2^4 = (\|x\|_2^2)^2 = (x^\top x)^2 = (x^\top I x)^2$. By the result of
Exercise 5.30, we know that the map q given by $q(x) = x^\top I x$ has as its derivative at c the map $v \mapsto 2c^\top I v = 2c^\top v$
(v ∈ Rⁿ). Using the result in Exercise 5.29 it now follows that f′(c) is the linear transformation given by
$v \mapsto 2q(c)q'(c)v = 2(c^\top I c)\,2c^\top v = 4\|c\|_2^2\, c^\top v$ (v ∈ Rⁿ).
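The formula f′(c)v = 4‖c‖₂² c⊤v can be compared against a finite-difference approximation. The sketch below uses plain Python lists for vectors; the helper names, the sample vectors and the step size h are all choices made for this illustration.

```python
def f(x):
    # f(x) = ||x||_2^4 for x given as a list of floats.
    return sum(t * t for t in x) ** 2

def derivative(c, v):
    # The claimed derivative at c applied to v: 4 ||c||^2 (c . v).
    return 4 * sum(t * t for t in c) * sum(a * b for a, b in zip(c, v))

def difference_quotient(c, v, h=1e-6):
    # Central difference approximation of the directional derivative of f at c along v.
    plus = [a + h * b for a, b in zip(c, v)]
    minus = [a - h * b for a, b in zip(c, v)]
    return (f(plus) - f(minus)) / (2 * h)

c, v = [1.0, -2.0, 0.5], [0.3, 0.1, -1.0]  # arbitrary sample vectors
print(derivative(c, v), difference_quotient(c, v))
```

The two printed numbers should agree to several decimal places; they are not expected to match exactly because the difference quotient carries truncation and rounding error.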

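Solution to Exercise 5.36. We have
\begin{align*}
\frac{\partial f}{\partial t}(x, t) &= \tau e^{\tau t + \xi x}, \\
\frac{\partial f}{\partial x}(x, t) &= \xi e^{\tau t + \xi x}, \text{ and} \\
\frac{\partial^2 f}{\partial x^2}(x, t) &= \xi \frac{\partial}{\partial x} e^{\tau t + \xi x} = \xi \cdot \xi e^{\tau t + \xi x} = \xi^2 e^{\tau t + \xi x}.
\end{align*}
Thus
\[
\frac{\partial f}{\partial t}(x, t) - \frac{\partial^2 f}{\partial x^2}(x, t) = \tau e^{\tau t + \xi x} - \xi^2 e^{\tau t + \xi x} = (\tau - \xi^2) e^{\tau t + \xi x} = 0 \cdot e^{\tau t + \xi x} = 0.
\]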
∂t ∂x2

Solution to Exercise 5.37. With F : R³ → R defined by
\[
F(\alpha, \beta, \gamma) = (P - a\alpha - b\beta)\beta \qquad ((\alpha, \beta, \gamma) \in \mathbb{R}^3),
\]
we have that
\[
f(x) = \int_0^T (P - a x(t) - b x'(t))\,x'(t)\,dt = \int_0^T F(x(t), x'(t), t)\,dt
\]
for x : [0, T] → R satisfying x(0) = 0 and x(T) = Q. We calculate
\begin{align*}
\frac{\partial F}{\partial \alpha}(\alpha, \beta, \gamma) &= -a\beta, \\
\frac{\partial F}{\partial \beta}(\alpha, \beta, \gamma) &= P - a\alpha - 2b\beta.
\end{align*}
Let $x_*$ denote the optimal mining operation. We have
\begin{align*}
\frac{\partial F}{\partial \alpha}(x_*(t), x_*'(t), t) &= -a x_*'(t), \\
\frac{\partial F}{\partial \beta}(x_*(t), x_*'(t), t) &= P - a x_*(t) - 2b x_*'(t).
\end{align*}
So the Euler-Lagrange equation for this problem is:
\[
\frac{\partial F}{\partial \alpha}(x_*(t), x_*'(t), t) - \frac{d}{dt}\left(\frac{\partial F}{\partial \beta}(x_*(t), x_*'(t), t)\right) = -a x_*'(t) - \frac{d}{dt}\bigl(P - a x_*(t) - 2b x_*'(t)\bigr) = 0
\]
for all t ∈ [0, T]. Thus $-a x_*'(t) - (0 - a x_*'(t) - 2b x_*''(t)) = 0$, that is, $2b\,x_*''(t) = 0$, and so $x_*''(t) = 0$ (t ∈ [0, T]). Hence $x_*'(t) = A$
(t ∈ [0, T]) for some constant A, and so $x_*(t) = At + B$ (t ∈ [0, T]) for some constant B. But we must also
have $x_*(0) = 0$ and $x_*(T) = Q$. Thus
\begin{align*}
A \cdot 0 + B &= 0, \\
A \cdot T + B &= Q,
\end{align*}
which gives us $x_*(t) = Q\,\dfrac{t}{T}$ (t ∈ [0, T]).

Solution to Exercise 5.38. We first calculate candidates for local minimizers. We have
\[
f'(x_1, x_2)v = \begin{bmatrix} 4x_1^3 - 12x_2 & -12x_1 + 4x_2^3 \end{bmatrix} v \qquad (v \in \mathbb{R}^2).
\]
So f′(x1, x2) = 0 if and only if $4x_1^3 = 12x_2$ and $4x_2^3 = 12x_1$, that is, if and only if $x_1^3 = 3x_2$ and $x_1 = x_2^3/3$.
But if this latter condition holds, then $x_2^9/3^3 = 3x_2$, that is, $x_2(x_2^8 - 3^4) = 0$, and so $x_2 = 0$ or $x_2^2 = 3$,
that is, $x_2 \in \{0, \sqrt{3}, -\sqrt{3}\}$. If $x_2 = 0$, then $x_1 = 0$; if $x_2 = \sqrt{3}$, then $x_1 = \sqrt{3}$; and if $x_2 = -\sqrt{3}$, then
$x_1 = -\sqrt{3}$. So if $x_1^3 = 3x_2$ and $x_1 = x_2^3/3$, then we have that $(x_1, x_2) \in \{(0, 0), (\sqrt{3}, \sqrt{3}), (-\sqrt{3}, -\sqrt{3})\}$.
Conversely, if $(x_1, x_2) \in \{(0, 0), (\sqrt{3}, \sqrt{3}), (-\sqrt{3}, -\sqrt{3})\}$, then $x_1^3 = 3x_2$ and $x_1 = x_2^3/3$.
So we have that f′(x1, x2) = 0 if and only if $(x_1, x_2) \in \{(0, 0), (\sqrt{3}, \sqrt{3}), (-\sqrt{3}, -\sqrt{3})\}$.
We calculate that f(0, 0) = 0, while $f(\sqrt{3}, \sqrt{3}) = f(-\sqrt{3}, -\sqrt{3}) = -18$. Since we are interested in
global minimizers, we discard (0, 0).
Also,
\[
f(x_1, x_2) = x_1^4 + x_2^4 - 12x_1x_2 \ge \frac{(x_1^2 + x_2^2)^2}{2} - 12\left(\frac{x_1^2 + x_2^2}{2}\right) = \frac{\|x\|_2^4}{2} - 6\|x\|_2^2.
\]
It follows that there exists an R > 0 such that for all ‖x‖₂ ≥ R,
\[
f(x_1, x_2) > -18 = f(\sqrt{3}, \sqrt{3}) = f(-\sqrt{3}, -\sqrt{3}). \tag{6.8}
\]
In the closed and bounded, and hence compact, set K := {x ∈ R² : ‖x‖₂ ≤ R}, the continuous function
f|K has a global minimizer. The global minimizer(s) can’t lie on the boundary where ‖x‖₂ = R since there
we have that (6.8) holds. So the global minimizer(s) has to be inside the open ball B(0, R). These global
minimizers are also local minimizers of f|B(0,R). So these have to be the points $(\sqrt{3}, \sqrt{3})$, $(-\sqrt{3}, -\sqrt{3})$.
But now they serve as global minimizers for f in R² because we have $f(\sqrt{3}, \sqrt{3}) = f(-\sqrt{3}, -\sqrt{3}) \le f(x)$
for x ∈ K as well as $f(\sqrt{3}, \sqrt{3}) = f(-\sqrt{3}, -\sqrt{3}) = -18 \le f(x)$ for x ∉ K.
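The critical points and the minimum value can be checked numerically. The sketch below evaluates the gradient at the three candidates and then runs a coarse grid search; the grid bounds [−4, 4] and the step 0.01 are arbitrary choices made here, so the search is only a plausibility check and not a proof.

```python
import math

def f(x1, x2):
    # f(x1, x2) = x1^4 + x2^4 - 12 x1 x2, as in the solution.
    return x1**4 + x2**4 - 12 * x1 * x2

def gradient(x1, x2):
    # Partial derivatives read off from the row vector f'(x1, x2).
    return (4 * x1**3 - 12 * x2, 4 * x2**3 - 12 * x1)

s = math.sqrt(3)
for point in [(0.0, 0.0), (s, s), (-s, -s)]:
    print(point, gradient(*point), f(*point))

# Coarse grid search on [-4, 4]^2 with step 0.01.
grid = [i / 100 for i in range(-400, 401)]
best = min(f(a, b) for a in grid for b in grid)
print(best)  # should not fall below -18
```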

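Solution to Exercise 5.39. When f is given by f(x, y) = 3x³ + 9y² − 9x³y, we have
\begin{align*}
\frac{\partial f}{\partial x}(x, y) &= 9x^2 - 27x^2 y, \\
\frac{\partial f}{\partial y}(x, y) &= 18y - 9x^3.
\end{align*}
Thus we obtain
\begin{align*}
\frac{\partial^2 f}{\partial y \partial x}(x, y) &= \frac{\partial}{\partial y}(9x^2 - 27x^2 y) = 0 - 27x^2, \\
\frac{\partial^2 f}{\partial x \partial y}(x, y) &= \frac{\partial}{\partial x}(18y - 9x^3) = 0 - 27x^2.
\end{align*}
Thus we have
\[
\frac{\partial^2 f}{\partial y \partial x}(x, y) = -27x^2 = \frac{\partial^2 f}{\partial x \partial y}(x, y).
\]

Now suppose that f is given by
\[
f(x, y) = \begin{cases} xy\,\dfrac{x^2 - y^2}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0), \\[1ex] 0 & \text{if } (x, y) = (0, 0). \end{cases}
\]
Then if (x, y) ≠ (0, 0), we have
\[
\frac{\partial f}{\partial x}(x, y) = \frac{y(x^4 + 4x^2y^2 - y^4)}{(x^2 + y^2)^2}.
\]
Also, since f(x, 0) = 0 for all x,
\[
\frac{\partial f}{\partial x}(0, 0) = \lim_{x \to 0} \frac{f(x, 0) - f(0, 0)}{x} = \lim_{x \to 0} \frac{0 - 0}{x} = 0.
\]
From the above two computations, we see that in particular, for all y we do have
\[
\frac{\partial f}{\partial x}(0, y) = -y.
\]
Therefore for all y, $\dfrac{\partial^2 f}{\partial y \partial x}(0, y) = -1$ and hence $\dfrac{\partial^2 f}{\partial y \partial x}(0, 0) = -1$ too.
The computation of $\dfrac{\partial^2 f}{\partial x \partial y}(0, 0)$ is similar. We have
\[
\frac{\partial f}{\partial y}(x, y) = \frac{x(x^4 - 4x^2y^2 - y^4)}{(x^2 + y^2)^2}
\]
if (x, y) ≠ (0, 0). Also, since f(0, y) = 0 for all y,
\[
\frac{\partial f}{\partial y}(0, 0) = \lim_{y \to 0} \frac{f(0, y) - f(0, 0)}{y} = \lim_{y \to 0} \frac{0 - 0}{y} = 0.
\]
From the above two computations, we see that in particular, for all x we do have
\[
\frac{\partial f}{\partial y}(x, 0) = x.
\]
Therefore for all x, $\dfrac{\partial^2 f}{\partial x \partial y}(x, 0) = 1$ and hence $\dfrac{\partial^2 f}{\partial x \partial y}(0, 0) = 1$ too. So we have
\[
\frac{\partial^2 f}{\partial y \partial x}(0, 0) = -1 \ne 1 = \frac{\partial^2 f}{\partial x \partial y}(0, 0).
\]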
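The unequal mixed partial derivatives at the origin can be observed with nested central differences. This is a rough numerical sketch: the inner step h and outer step k are chosen here (with h much smaller than k), and the printed values only approximate −1 and 1.

```python
def f(x, y):
    # The second function from the solution, with f(0, 0) = 0.
    if x == 0 and y == 0:
        return 0.0
    return x * y * (x * x - y * y) / (x * x + y * y)

def fx(x, y, h=1e-6):
    # Central difference approximation of the partial derivative in x.
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def fy(x, y, h=1e-6):
    # Central difference approximation of the partial derivative in y.
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

k = 1e-3  # outer step, chosen much larger than the inner step h
d_yx = (fx(0.0, k) - fx(0.0, -k)) / (2 * k)  # approximates d/dy (df/dx) at (0, 0)
d_xy = (fy(k, 0.0) - fy(-k, 0.0)) / (2 * k)  # approximates d/dx (df/dy) at (0, 0)
print(d_yx, d_xy)
```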

Solution to Exercise 5.40. Let us call the map f. Thus f(x, y) = xy for (x, y) ∈ R². Clearly
\[
\frac{\partial f}{\partial x}(x_0, y_0) = y_0 \quad \text{and} \quad \frac{\partial f}{\partial y}(x_0, y_0) = x_0,
\]
and so we guess that f′(x0, y0) must be the linear transformation L that sends (h, k) ∈ R² to
\[
\frac{\partial f}{\partial x}(x_0, y_0) \cdot h + \frac{\partial f}{\partial y}(x_0, y_0) \cdot k = y_0 h + x_0 k.
\]
We will now prove this claim. We have
\begin{align*}
f(x, y) - f(x_0, y_0) - L((x, y) - (x_0, y_0)) &= xy - x_0 y_0 - y_0(x - x_0) - x_0(y - y_0) \\
&= xy - x_0 y_0 - y_0 x + y_0 x_0 - x_0 y + x_0 y_0 \\
&= x(y - y_0) - x_0(y - y_0) = (x - x_0)(y - y_0),
\end{align*}
hence
\begin{align*}
|f(x, y) - f(x_0, y_0) - L((x, y) - (x_0, y_0))| &= |x - x_0| \cdot |y - y_0| \\
&\le \|(x, y) - (x_0, y_0)\|_2 \cdot \|(x, y) - (x_0, y_0)\|_2 = \|(x, y) - (x_0, y_0)\|_2^2.
\end{align*}
Let ε > 0. Set δ = ε > 0. Then whenever 0 < ‖(x, y) − (x0, y0)‖₂ < δ, we have
\[
\frac{|f(x, y) - f(x_0, y_0) - L((x, y) - (x_0, y_0))|}{\|(x, y) - (x_0, y_0)\|_2} \le \|(x, y) - (x_0, y_0)\|_2 < \delta = \varepsilon.
\]
This completes the proof.


Solution to Exercise 6.3.


(1) Let $P = \{x_0, x_1, \dots, x_n\}$ where $a = x_0 < x_1 < \cdots < x_n = b$ and $P' = \{x'_0, x'_1, \dots, x'_m\}$ where
$a = x'_0 < x'_1 < \cdots < x'_m = b$. We have to show that U(f, P) ≥ L(f, P′) where f : [a, b] → R is a
bounded function.
We now define a partition P″ that is finer than both P and P′. Let $P'' = P \cup P' = \{x''_0, x''_1, \dots, x''_s\}$,
where the elements have been relabeled so that $a = x''_0 < x''_1 < \cdots < x''_s = b$.
The new partition P″ subdivides each interval $[x_k, x_{k+1}]$ (0 ≤ k ≤ n − 1) into n(k) ≥ 1
subintervals $[x_{k,0}, x_{k,1}], [x_{k,1}, x_{k,2}], \dots, [x_{k,n(k)-1}, x_{k,n(k)}]$, say, where
$x_k = x_{k,0} < x_{k,1} < \cdots < x_{k,n(k)-1} < x_{k,n(k)} = x_{k+1}$.
Similarly, P″ subdivides each interval $[x'_k, x'_{k+1}]$ (0 ≤ k ≤ m − 1) into m(k) ≥ 1 subintervals
$[x'_{k,0}, x'_{k,1}], [x'_{k,1}, x'_{k,2}], \dots, [x'_{k,m(k)-1}, x'_{k,m(k)}]$, say, where
$x'_k = x'_{k,0} < x'_{k,1} < \cdots < x'_{k,m(k)-1} < x'_{k,m(k)} = x'_{k+1}$.
We can now calculate as follows, where the three inequalities below follow from the fact that
if [c, d] ⊆ [c′, d′], then
\[
\inf_{x \in [c', d']} f(x) \le \inf_{x \in [c, d]} f(x) \le \sup_{x \in [c, d]} f(x) \le \sup_{x \in [c', d']} f(x).
\]
\begin{align*}
U(f, P) &= \sum_{k=0}^{n-1} \Bigl( \sup_{x \in [x_k, x_{k+1}]} f(x) \Bigr) (x_{k+1} - x_k) \\
&= \sum_{k=0}^{n-1} \Bigl( \sup_{x \in [x_k, x_{k+1}]} f(x) \Bigr) \sum_{\ell=0}^{n(k)-1} (x_{k,\ell+1} - x_{k,\ell}) \\
&= \sum_{k=0}^{n-1} \sum_{\ell=0}^{n(k)-1} \Bigl( \sup_{x \in [x_k, x_{k+1}]} f(x) \Bigr) (x_{k,\ell+1} - x_{k,\ell}) \\
&\ge \sum_{k=0}^{n-1} \sum_{\ell=0}^{n(k)-1} \Bigl( \sup_{x \in [x_{k,\ell}, x_{k,\ell+1}]} f(x) \Bigr) (x_{k,\ell+1} - x_{k,\ell}) \quad \text{(note that } [x_{k,\ell}, x_{k,\ell+1}] \subseteq [x_k, x_{k+1}]\text{)} \\
&= \sum_{k=0}^{s-1} \Bigl( \sup_{x \in [x''_k, x''_{k+1}]} f(x) \Bigr) (x''_{k+1} - x''_k) \\
&= U(f, P'') \ge L(f, P'') = \sum_{k=0}^{s-1} \Bigl( \inf_{x \in [x''_k, x''_{k+1}]} f(x) \Bigr) (x''_{k+1} - x''_k) \\
&= \sum_{k=0}^{m-1} \sum_{\ell=0}^{m(k)-1} \Bigl( \inf_{x \in [x'_{k,\ell}, x'_{k,\ell+1}]} f(x) \Bigr) (x'_{k,\ell+1} - x'_{k,\ell}) \\
&\ge \sum_{k=0}^{m-1} \sum_{\ell=0}^{m(k)-1} \Bigl( \inf_{x \in [x'_k, x'_{k+1}]} f(x) \Bigr) (x'_{k,\ell+1} - x'_{k,\ell}) \quad \text{(again note that } [x'_{k,\ell}, x'_{k,\ell+1}] \subseteq [x'_k, x'_{k+1}]\text{)} \\
&= \sum_{k=0}^{m-1} \Bigl( \inf_{x \in [x'_k, x'_{k+1}]} f(x) \Bigr) \sum_{\ell=0}^{m(k)-1} (x'_{k,\ell+1} - x'_{k,\ell}) \\
&= \sum_{k=0}^{m-1} \Bigl( \inf_{x \in [x'_k, x'_{k+1}]} f(x) \Bigr) (x'_{k+1} - x'_k) \\
&= L(f, P').
\end{align*}


(2) We have already shown that U(f, P) ≥ L(f, P′) for any P, P′ ∈ 𝒫. It follows that for any
P′ ∈ 𝒫, L(f, P′) is a lower bound of the set S = {U(f, P) | P ∈ 𝒫}. Therefore,
\[
L(f, P') \le \inf_{P \in \mathcal{P}} U(f, P),
\]
since a lower bound of S is ≤ the greatest lower bound of S. It follows that $\inf_{P \in \mathcal{P}} U(f, P)$ is an
upper bound of the set T = {L(f, P′) | P′ ∈ 𝒫}. Therefore,
\[
\inf_{P \in \mathcal{P}} U(f, P) \ge \sup_{P' \in \mathcal{P}} L(f, P'),
\]
since an upper bound of T is ≥ the least upper bound of T.
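The chain U(f, P) ≥ U(f, P″) ≥ L(f, P″) ≥ L(f, P′) can be illustrated on a small example. The sketch below assumes an increasing function, so that the supremum and infimum on each subinterval are attained at the right and left endpoints; the function `square` and the two partitions are hypothetical choices made for this illustration.

```python
def upper_lower(f, partition):
    # For an increasing f, the sup on [c, d] is f(d) and the inf is f(c).
    upper = sum(f(d) * (d - c) for c, d in zip(partition, partition[1:]))
    lower = sum(f(c) * (d - c) for c, d in zip(partition, partition[1:]))
    return upper, lower

def square(x):
    # Increasing on [0, 1]; chosen here only as an example.
    return x * x

P = [0.0, 0.5, 1.0]
P_prime = [0.0, 0.1, 0.7, 1.0]
P_union = sorted(set(P) | set(P_prime))  # the common refinement P''

U_P, _ = upper_lower(square, P)
_, L_Pp = upper_lower(square, P_prime)
U_ref, L_ref = upper_lower(square, P_union)
print(U_P, U_ref, L_ref, L_Pp)  # expected ordering: U_P >= U_ref >= L_ref >= L_Pp
```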

Solution to Exercise 6.8.


(1) We have to show that f + g is also bounded on [a, b] and
\[
\sup_{P \in \mathcal{P}} L(f + g, P) = \inf_{P \in \mathcal{P}} U(f + g, P) = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.
\]
To show that f + g is bounded, let M1, M2 ∈ R be such that |f(x)| ≤ M1 and |g(x)| ≤ M2 for
all x ∈ [a, b]. Then |f(x) + g(x)| ≤ |f(x)| + |g(x)| ≤ M1 + M2 for all x ∈ [a, b], which shows that
f + g is bounded.
Since we already know that $\sup_{P \in \mathcal{P}} L(f + g, P) \le \inf_{P \in \mathcal{P}} U(f + g, P)$ (by Exercise 6.3), it remains
to show the following two inequalities:
\[
\inf_{P \in \mathcal{P}} U(f + g, P) \le \int_a^b f(x)\,dx + \int_a^b g(x)\,dx \le \sup_{P \in \mathcal{P}} L(f + g, P).
\]
To show the first inequality, let ε > 0. Since $\inf_{P \in \mathcal{P}} U(f, P) = \int_a^b f(x)\,dx$, there exists P ∈ 𝒫 such
that
\[
U(f, P) < \int_a^b f(x)\,dx + \frac{\varepsilon}{2}. \tag{6.9}
\]
Similarly, since $\inf_{P \in \mathcal{P}} U(g, P) = \int_a^b g(x)\,dx$, there exists P′ ∈ 𝒫 such that
\[
U(g, P') < \int_a^b g(x)\,dx + \frac{\varepsilon}{2}. \tag{6.10}
\]
Let P″ be the union of P and P′ (as in Exercise 6.3). Then
\[
U(f, P'') \le U(f, P) \quad \text{and} \quad U(g, P'') \le U(g, P'). \tag{6.11}
\]
Note that for any subinterval [c, d] of [a, b], and any x ∈ [c, d],
\[
f(x) + g(x) \le \sup_{x \in [c, d]} f(x) + \sup_{x \in [c, d]} g(x).
\]
Therefore,
\[
\sup_{x \in [c, d]} (f(x) + g(x)) \le \sup_{x \in [c, d]} f(x) + \sup_{x \in [c, d]} g(x).
\]
It follows that
\[
U(f + g, P'') \le U(f, P'') + U(g, P''). \tag{6.12}
\]
Putting together (6.9), (6.10), (6.11), (6.12), we obtain
\[
U(f + g, P'') \le \int_a^b f(x)\,dx + \int_a^b g(x)\,dx + \varepsilon.
\]
Since $\inf_{P \in \mathcal{P}} U(f + g, P) \le U(f + g, P'')$, we obtain
\[
\inf_{P \in \mathcal{P}} U(f + g, P) \le \int_a^b f(x)\,dx + \int_a^b g(x)\,dx + \varepsilon.
\]


Since this is true for all ε > 0, we obtain
\[
\inf_{P \in \mathcal{P}} U(f + g, P) \le \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.
\]
In a similar way we obtain the second inequality
\[
\sup_{P \in \mathcal{P}} L(f + g, P) \ge \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.
\]
It follows that
\[
\sup_{P \in \mathcal{P}} L(f + g, P) = \inf_{P \in \mathcal{P}} U(f + g, P) = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx,
\]
which implies that f + g is Riemann integrable on [a, b] and that
$\int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$.

(2) If |f(x)| ≤ M then |αf(x)| ≤ |α|M for all x ∈ [a, b]. It follows that αf is also bounded on [a, b].
To show that αf is Riemann integrable and that $\int_a^b \alpha f(x)\,dx = \alpha \int_a^b f(x)\,dx$, we first consider the
case where α > 0. Then using the fact that for any subinterval [c, d] of [a, b],
\[
\sup_{x \in [c, d]} \alpha f(x) = \alpha \sup_{x \in [c, d]} f(x)
\quad \text{and} \quad
\inf_{x \in [c, d]} \alpha f(x) = \alpha \inf_{x \in [c, d]} f(x),
\]
it follows that U(αf, P) = αU(f, P) and L(αf, P) = αL(f, P) (check this by writing out the
summations!), and consequently,
\[
\inf_{P \in \mathcal{P}} U(\alpha f, P) = \alpha \inf_{P \in \mathcal{P}} U(f, P) = \alpha \int_a^b f(x)\,dx
\]
and
\[
\sup_{P \in \mathcal{P}} L(\alpha f, P) = \alpha \sup_{P \in \mathcal{P}} L(f, P) = \alpha \int_a^b f(x)\,dx,
\]
which implies that αf is Riemann integrable on [a, b] and $\int_a^b \alpha f(x)\,dx = \alpha \int_a^b f(x)\,dx$.
Next consider the case where α = −1. Now using the fact that for any subinterval [c, d] of
[a, b],
\[
\sup_{x \in [c, d]} f(x) = -\inf_{x \in [c, d]} (-f(x))
\quad \text{and} \quad
\inf_{x \in [c, d]} f(x) = -\sup_{x \in [c, d]} (-f(x)),
\]
it follows that U(−f, P) = −L(f, P) and L(−f, P) = −U(f, P) (again, write out the summations
to verify this), and consequently,
\[
\inf_{P \in \mathcal{P}} U(-f, P) = -\sup_{P \in \mathcal{P}} L(f, P) = -\int_a^b f(x)\,dx
\]
and
\[
\sup_{P \in \mathcal{P}} L(-f, P) = -\inf_{P \in \mathcal{P}} U(f, P) = -\int_a^b f(x)\,dx,
\]
which implies that −f is Riemann integrable on [a, b] and $\int_a^b -f(x)\,dx = -\int_a^b f(x)\,dx$.


Next consider the remaining case α < 0. Then, since f is Riemann integrable, from what we
have already shown, |α|f is Riemann integrable, hence αf = −|α|f is Riemann integrable, and
\begin{align*}
\int_a^b \alpha f(x)\,dx &= \int_a^b -|\alpha| f(x)\,dx \\
&= -\int_a^b |\alpha| f(x)\,dx \\
&= -|\alpha| \int_a^b f(x)\,dx \\
&= \alpha \int_a^b f(x)\,dx.
\end{align*}

(3) Since f(x) ≥ 0 for all x ∈ [a, b], it follows that for any [c, d] ⊆ [a, b], $\inf_{x \in [c, d]} f(x) \ge 0$.
Therefore, for any P ∈ 𝒫, L(f, P) ≥ 0, and we obtain that
\[
\int_a^b f(x)\,dx = \sup_{P \in \mathcal{P}} L(f, P) \ge 0.
\]

(4) Define h : [a, b] → R by h(x) = g(x) − f(x) for x ∈ [a, b]. By parts (1) and (2), h = g + (−1)f
is Riemann integrable and
\begin{align*}
\int_a^b h(x)\,dx = \int_a^b \bigl(g(x) + (-1)f(x)\bigr)\,dx &= \int_a^b g(x)\,dx + \int_a^b (-1)f(x)\,dx \\
&= \int_a^b g(x)\,dx + (-1)\int_a^b f(x)\,dx.
\end{align*}
Since f(x) ≤ g(x), it follows that h(x) ≥ 0 for all x ∈ [a, b], and by part (3), $\int_a^b h(x)\,dx \ge 0$.
It follows that $\int_a^b g(x)\,dx - \int_a^b f(x)\,dx \ge 0$, that is, $\int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$.

(5) We first show that |f| is Riemann integrable. We will do this by showing that
\[
\inf_{P \in \mathcal{P}} U(|f|, P) \le \sup_{P \in \mathcal{P}} L(|f|, P),
\]
since the opposite inequality is trivial.

Let ε > 0. There exists a partition P ∈ 𝒫 such that
\[
U(f, P) < \int_a^b f(x)\,dx + \frac{\varepsilon}{2}
\]
and there exists P′ ∈ 𝒫 such that
\[
L(f, P') > \int_a^b f(x)\,dx - \frac{\varepsilon}{2}.
\]
Let P″ be the union of P and P′ (as in Exercise 6.3). Then
\[
U(f, P'') - L(f, P'') \le U(f, P) - L(f, P') < \varepsilon. \tag{6.13}
\]
We next show that
\[
U(|f|, P'') - L(|f|, P'') \le U(f, P'') - L(f, P''). \tag{6.14}
\]
We do this termwise by showing that for any subinterval [c, d] ⊆ [a, b],
\[
\sup_{x \in [c, d]} |f(x)| - \inf_{x \in [c, d]} |f(x)| \le \sup_{x \in [c, d]} f(x) - \inf_{x \in [c, d]} f(x). \tag{6.15}
\]


For any x, y ∈ [c, d] we have
\[
f(x) - f(y) \le \sup_{x \in [c, d]} f(x) - \inf_{x \in [c, d]} f(x)
\]
and similarly,
\[
f(y) - f(x) \le \sup_{x \in [c, d]} f(x) - \inf_{x \in [c, d]} f(x).
\]
It follows that
\[
|f(x) - f(y)| \le \sup_{x \in [c, d]} f(x) - \inf_{x \in [c, d]} f(x).
\]
By the triangle inequality (in R),
\[
|f(x)| \le |f(y)| + |f(x) - f(y)|.
\]
It follows that
\[
|f(x)| \le |f(y)| + \sup_{x \in [c, d]} f(x) - \inf_{x \in [c, d]} f(x)
\]
for all x, y ∈ [c, d], hence
\[
\sup_{x \in [c, d]} |f(x)| \le \inf_{y \in [c, d]} |f(y)| + \sup_{x \in [c, d]} f(x) - \inf_{x \in [c, d]} f(x),
\]
which implies (6.15). If we apply (6.15) to each of the subintervals appearing in the upper and
lower sums of |f| and f, we obtain (6.14), which, together with (6.13) gives that
\[
\inf_{P \in \mathcal{P}} U(|f|, P) - \sup_{P \in \mathcal{P}} L(|f|, P) \le U(|f|, P'') - L(|f|, P'') < \varepsilon.
\]
Since this is true for all ε > 0, we obtain
\[
\inf_{P \in \mathcal{P}} U(|f|, P) - \sup_{P \in \mathcal{P}} L(|f|, P) \le 0.
\]
It follows that |f| is Riemann integrable on [a, b].

To show that $\left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx$, note that f(x) ≤ |f(x)| and −f(x) ≤ |f(x)| and
apply part (4) to obtain that
\[
\int_a^b f(x)\,dx \le \int_a^b |f(x)|\,dx
\]
and
\[
-\int_a^b f(x)\,dx = \int_a^b -f(x)\,dx \le \int_a^b |f(x)|\,dx,
\]
which together give
\[
\left| \int_a^b f(x)\,dx \right| \le \int_a^b |f(x)|\,dx.
\]

(6) Suppose to the contrary that f(c) ≠ 0 for some c ∈ [a, b]. Thus f(c) > 0. Applying the
definition of continuity to f at c with ε = f(c)/2, we obtain the existence of δ > 0 such that for all
x ∈ [a, b] with |x − c| < δ we have |f(x) − f(c)| < f(c)/2, from which it follows that f(x) > f(c)/2.
Thus we have f(x) > f(c)/2 for all x in an open interval around c. (This also shows that if c = a
or c = b, then there are other values at which f is positive, so we can choose c ∈ (a, b) and we
may assume without loss of generality that (c − δ, c + δ) ⊆ (a, b).)
Let P be the partition {a, c − δ, c + δ, b}. Then
\begin{align*}
\int_a^b f(x)\,dx &\ge L(f, P) \\
&= \Bigl( \inf_{x \in [a, c - \delta]} f(x) \Bigr)(c - \delta - a) + \Bigl( \inf_{x \in [c - \delta, c + \delta]} f(x) \Bigr)\bigl((c + \delta) - (c - \delta)\bigr) + \Bigl( \inf_{x \in [c + \delta, b]} f(x) \Bigr)\bigl(b - (c + \delta)\bigr) \\
&\ge \Bigl( \inf_{x \in [c - \delta, c + \delta]} f(x) \Bigr)\bigl((c + \delta) - (c - \delta)\bigr) \\
&\ge \frac{f(c)}{2} \cdot 2\delta > 0,
\end{align*}
a contradiction.
