Victor Bryant
METRIC SPACES
Iteration and application
CAMBRIDGE UNIVERSITY PRESS
Cambridge, New York, Melbourne, Madrid, Cape Town,
Singapore, Sao Paulo, Delhi, Tokyo, Mexico City
Cambridge University Press
The Edinburgh Building, Cambridge CB2 8RU, UK
Published in the United States of America by Cambridge University Press, New York
www.cambridge.org
Information on this title: www.cambridge.org/9780521318976
© Cambridge University Press 1985
This publication is in copyright. Subject to statutory exception
and to the provisions of relevant collective licensing agreements,
no reproduction of any part may take place without the written
permission of Cambridge University Press.
First published 1985
Reprinted 1987, 1990, 1994, 1996
A catalogue record for this publication is available from the British Library
Library of Congress Catalogue Card Number: 84-15519
ISBN 978-0-521-26857-8 Hardback
ISBN 978-0-521-31897-6 Paperback
Cambridge University Press has no responsibility for the persistence or
accuracy of URLs for external or third-party internet websites referred to in
this publication, and does not guarantee that any content on such websites is,
or will remain, accurate or appropriate. Information regarding prices, travel
timetables, and other factual information given in this work is correct at
the time of first printing but Cambridge University Press does not guarantee
the accuracy of such information thereafter.
CONTENTS
Preface
1 Sequences by iteration
1.1 Where are we heading?
1.2 Sequences of numbers by iteration
1.3 Iterations in a different world
2 Metric spaces
2.1 Distance
2.2 Examples of metric spaces
2.3 Sequences
3 The three Cs
3.1 Iteration revisited
3.2 Closed sets
3.3 An internal test for convergence
3.4 Complete sets
3.5 Compact sets
4 The contraction mapping principle
4.1 Real fixed points
4.2 Contractions
4.3 Real contractions revisited
4.4 Some extensions
4.5 Differential equations
4.6 The implicit function theorem
4.7 Conclusion
5 What makes analysis work?
5.1 Continuity
5.2 Attained bounds
5.3 Uniform continuity
5.4 Inverse functions
5.5 Intermediate values
5.6 Some final remarks
Index
PREFACE
Some years ago I regularly gave a traditional course on metric spaces
to second-year special honours mathematics students. I was then
asked to give a watered-down version of the same material to a class
of combined honours students (who were doing several subjects,
including mathematics, at a more general level) but, to put it mildly,
the course was not a success. It was impossible to motivate students to
generalise real analysis when they had never understood it in the first
place and certainly could not remember much of it. It was also
counter-productive to start the course by revising real analysis
because that convinced the students that this was 'just another
analysis course' and their interest was lost for evermore.
So when I gave the course again the following year I decided to turn
the material inside out and to start with the applications (namely the
use of contractions in solving a wide range of equations). This meant
that the first chapter was a revision of some iterative techniques used
to obtain approximations to solutions of equations. This immediately
captured the interest of the class: they enjoyed using their calculators
and writing programs to solve the equations. Some of the ideas were
entirely new to them; for example using iteration to solve an equation
with constraints, or solving a differential equation by iterating with an
integral and obtaining a sequence of functions.
The second and third chapters were more traditional but the big
difference was that the need for distance, function space, closed set,
and so on, had been anticipated and motivated. Another difference
was that, having approached the subject via iteration, it was then
natural to define all the concepts in terms of sequences: hence closed
sets (rather than open ones) formed the basis of the approach.
For most students the fourth chapter was the highlight of the
course. It consisted of the contraction mapping principle and the use
of its algorithmic proof in solving equations. But I was then able to
say to them that, as we had developed all the tools of the subject, it was
now an easy matter to look back to real analysis and get a better
understanding of it. The last chapter therefore recalled and generalised
the classic theorems of real analysis.
This book, then, is an approach to metric spaces along those lines.
It tries to avoid assuming that the reader knows much about analysis
and if a difficult concept is to be encountered the reader is prepared by
several glimpses of it, in examples, well in advance. My aim is to
provide a book which can be read and enjoyed by a wide range of
second- or third-year students in universities or polytechnics. The
only prerequisite is to have done a course on elementary analysis: it is
not a prerequisite to have understood it nor to have remembered it all.
There are several people who have contributed indirectly to this
work. Firstly I thank Hazel Perfect and John Pym for their most
constructive comments. Secondly my well-worn copies of the books
on metric spaces by Copson, Simmons and Sutherland testify to the
use which they have been to me over the years. Next I must mention
my colleagues Mary Hawkins and Harry Burkill: at one time we
collaborated on a metric space course, and for any of their examples
and proofs which may remain in this new approach I thank them.
Also I thank Anne Hall for preparing a beautiful typescript from my
almost-illegible original. Finally, and principally, I thank my friend
and mentor, Roger Webster. It was his M.Sc. course on functional
analysis many years ago which rekindled my pleasure in mathematics
and introduced me to metric spaces.
I have tried to provide a readable and natural introduction to an
abstract subject in a down-to-earth manner. Even if I can transmit
only a fraction of the pleasure which the subject has given me, then I
will have been successful.
Sheffield Victor Bryant
1984
1
Sequences by iteration
1.1 Where are we heading?
Imagine a table top covered with a thin layer of dust. We give
it a flick with a duster, but all the dust settles back on the table top.
Suppose that after being rearranged every two specks of dust are
closer together than they were before. Then the remarkable thing is
that there is one and only one speck of dust which is back in the same
place that it started in.
That is an example of a set $X$ and a function $f : X \to X$. The
property that $f$ brings the points of $X$ closer together is enough to
ensure that $X$ has one and only one point which is not moved by $f$; i.e.
a unique $x \in X$ with $x = f(x)$. These so-called 'fixed points' of functions
are invaluable in solving equations. For example, the fixed points of
the function $f$ given by $f(x) = \frac{1}{3}(x^3 - 5x^2 + 1)$ are those $x$ for which
$x = \frac{1}{3}(x^3 - 5x^2 + 1)$; i.e. they are precisely the roots of the equation
$x^3 - 5x^2 - 3x + 1 = 0$. In fact, it turns out that this real function $f$ has
just one fixed point and that it is easy to find with a calculator; so the
unique real root of the cubic equation will be easily found.
In this book our search will be for a large collection of situations $X$
and functions $f : X \to X$ which have unique fixed points. The study of
such situations provides an interesting piece of mathematics, but the
real fascination will lie in the wide range of problems which they will
enable us to solve.
1.2 Sequences of numbers by iteration
To find a real root of the cubic equation
\[ x^3 + x^2 - 5x - 3 = 0 \]
rewrite the equation as
\[ x = \tfrac{1}{5}(x^3 + x^2 - 3) \]
or $x = f(x)$, where $f(x) = \frac{1}{5}(x^3 + x^2 - 3)$. Start with a first guess at a
root, 0 for example. Apply $f$ to this first guess to give $f(0) = -0.6$.
Apply $f$ to this answer to give $f(-0.6) = -0.5712$. Continue in this
way to give a list or sequence of answers:
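\[
\begin{array}{l}
0 \\
\downarrow \\
f(0) = -0.6 \\
\downarrow \\
f(-0.6) = -0.5712 \\
\downarrow \\
f(-0.5712) = -0.5720191 \\
\downarrow \\
f(-0.5720191) = -0.5719924 \\
\downarrow \\
f(-0.5719924) = -0.5719933 \\
\downarrow \\
f(-0.5719933) = -0.5719933 \\
\downarrow
\end{array}
\]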
This is the best my calculator can manage, but to this accuracy at least
there is no point in continuing as the answer $-0.5719933$ will keep
repeating itself. So (to within my calculator's accuracy)
\[ f(-0.5719933) = -0.5719933. \]
In other words, we have found an $x$ with $x = f(x)$ - called a 'fixed
point' of $f$. By our above remarks this $x$ will be (again to within my
calculator's accuracy) a root of the cubic equation $x^3 + x^2 - 5x - 3 = 0$.
Let us check:
\[ (-0.5719933)^3 + (-0.5719933)^2 - 5(-0.5719933) - 3 = -0.0000000081\ldots, \]
which, of course, is very nearly zero as expected. So having rewritten
our equation in the form $x = f(x)$ the solution of the equation was a
mechanical process requiring no further thought. But it must be
remembered that this process is only finding an approximate root: the
sequence which we produce only seems to come to an end because of
the limited accuracy of our calculators. And in all cases we ought to
substitute our answers back into the equations to check their validity.
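The calculator procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper name `iterate`, the stopping tolerance and the cap on the number of steps are choices made for this sketch and do not come from the text.

```python
def f(x):
    """Rearranged cubic: x = (x^3 + x^2 - 3) / 5."""
    return (x**3 + x**2 - 3) / 5


def iterate(func, first_guess, tol=1e-12, max_steps=100):
    """Reapply func until two successive values differ by less than tol."""
    x = first_guess
    history = [x]
    for _ in range(max_steps):
        x_next = func(x)
        history.append(x_next)
        if abs(x_next - x) < tol:
            break
        x = x_next
    return history


history = iterate(f, 0.0)
root = history[-1]
# Substitute the answer back into the cubic, as the text recommends.
residual = root**3 + root**2 - 5 * root - 3
print(history[:4])
print(root, residual)
```

The final substitution mirrors the check made above: the residual should be very nearly zero, but the computed value is still only an approximate root.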
Exercise 1 Find a root, to four decimal places, of the cubic
equation $x^3 + 2x^2 - 7x - 1 = 0$ by rearranging it into the form
$x = f(x) = \frac{1}{7}(x^3 + 2x^2 - 1)$ and making a first guess between
$-3$ and $1$. Reapplying $f$ (or 'iterating' with $f$) should give
you a sequence settling down (or 'converging') to a root of the
equation.
(Calculators are obviously needed! A fairly economical way of
calculating $f(x)$, having entered $x$ into the display, is by the following
steps:
[Calculator key sequence: store in memory ... recall from memory ... and repeat.]
Of course, a programmable machine can be used to great advantage.)
In Exercise 1 you should have found that one root of the equation
$x^3 + 2x^2 - 7x - 1 = 0$ is approximately $-0.1378$: you could check by
substituting this value back into the cubic that it does give a value
very near zero.
Now let us examine exactly what was happening in the above
examples: we wanted to solve the equation $x = f(x)$ and so we made a
first guess $x_1$ at a root and calculated all future guesses by reapplying
$f$ to give
\[ x_2 = f(x_1),\ x_3 = f(x_2),\ \ldots,\ x_{n+1} = f(x_n),\ \ldots \]
We then looked at the limit of the sequence
\[ x_1, x_2, x_3, \ldots, x_n, \ldots \to x \]
and noted that it had the required property that $x = f(x)$. Of course, as
logical people, we must ask some crucial questions about this process:
Will the sequence which we produce by iterating with $f$
definitely converge?
If it does converge will the limit definitely be a root of
$x = f(x)$?
If the answers to both these questions were 'yes' and any function led
to a sequence which converged to a root of the required equation,
then it would be such a wonderful method that practically all other
ways of solving equations would be redundant. Let us answer the first
question by means of an exercise.
Exercise 2 Repeat the previous exercise by rearranging the
cubic equation $x^3 + 2x^2 - 7x - 1 = 0$ into the form
$x = f(x) = \frac{1}{7}(x^3 + 2x^2 - 1)$. Try to find a root of the cubic by making a
first guess of 2 or more and iterating with $f$.
Exercise 3 Rearrange $x^3 + 2x^2 - 7x - 1 = 0$ into the form
$x = f(x) = \sqrt{\frac{1}{2}(-x^3 + 7x + 1)}$. Try iterating with this $f$ for
various first guesses. See if you can find a sequence in this way
which converges to a root of the cubic.
([Calculator key sequence not recoverable] and repeat.)
In Exercise 2 you should have got a sequence which very soon
became out of hand - clearly not settling down to a root of the cubic.
In Exercise 3 if you started with an $x_1$ between 0 and 2 then you
should have got a sequence converging to around 1.91909 - which is
very close indeed to a root of the cubic. But for some starting values
(such as $x_1 = 3$) you cannot even proceed because $f(3)$ is not defined.
So, to summarise, some functions and some starting points will give
sequences converging to the required roots, but often a function will
lead to a sequence which does not converge. The principles of
numerical analysis are beyond our scope here but it is worth
mentioning the Newton-Raphson method of iteration, designed as a
systematic way of deciding which function to use for iteration: it often
leads to a successful iterative process. We content ourselves with one
exercise on this.
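The three kinds of behaviour summarised above can be reproduced with a short Python sketch. The names `g`, `h` and `iterate`, and the step counts, are assumptions made for this illustration; the first iteration is deliberately stopped after a few steps because its values grow so quickly that further steps would overflow.

```python
import math


def g(x):
    """Exercise 2 rearrangement: x = (x^3 + 2x^2 - 1) / 7."""
    return (x**3 + 2 * x**2 - 1) / 7


def h(x):
    """Exercise 3 rearrangement: x = sqrt((-x^3 + 7x + 1) / 2)."""
    return math.sqrt((-x**3 + 7 * x + 1) / 2)


def iterate(func, first_guess, steps):
    """Return the list of guesses obtained by reapplying func `steps` times."""
    xs = [first_guess]
    for _ in range(steps):
        xs.append(func(xs[-1]))
    return xs


diverging = iterate(g, 2.0, 7)      # grows rapidly instead of settling down
converging = iterate(h, 1.0, 100)   # settles down near a root of the cubic

try:
    h(3.0)
    defined_at_3 = True
except ValueError:                  # math.sqrt rejects a negative argument
    defined_at_3 = False

print(diverging)
print(converging[-1], defined_at_3)
```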
Exercise 4 To solve the cubic $2x^3 + 3x^2 + 6x + 1 = 0$
rearrange it to $6x^3 + 6x^2 + 6x = 4x^3 + 3x^2 - 1$ and hence
\[ x = f(x) = \frac{4x^3 + 3x^2 - 1}{6(x^2 + x + 1)}. \]
Confirm that
\[ f(x) = x - \frac{\text{the original cubic}}{\text{the derivative of the cubic}} \]
(this is how, in general, the $f$ is chosen in the Newton-
Raphson method). Choose any $x_1$, iterate with $f$ and hence
find to four decimal places the unique real root of
$2x^3 + 3x^2 + 6x + 1 = 0$.
The function in Exercise 4 always leads to a convergent sequence,
regardless of the first guess $x_1$. One of the aims of this book is to find a
large class of functions which are equally dependable.
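As a hedged illustration of Exercise 4, the sketch below compares the rearranged function with one Newton-Raphson step and iterates from three first guesses. The function names, the chosen first guesses and the tolerance are assumptions of this sketch; trying a handful of guesses illustrates the claim of convergence but does not prove it.

```python
def cubic(x):
    return 2 * x**3 + 3 * x**2 + 6 * x + 1


def cubic_derivative(x):
    return 6 * x**2 + 6 * x + 6


def newton_step(x):
    """One Newton-Raphson step: x - cubic(x) / cubic'(x)."""
    return x - cubic(x) / cubic_derivative(x)


def f(x):
    """The rearranged form from Exercise 4."""
    return (4 * x**3 + 3 * x**2 - 1) / (6 * (x**2 + x + 1))


def iterate(func, first_guess, tol=1e-12, max_steps=200):
    """Reapply func until successive guesses agree to within tol."""
    x = first_guess
    for _ in range(max_steps):
        x_next = func(x)
        if abs(x_next - x) < tol:
            return x_next
        x = x_next
    return x


limits = [iterate(f, guess) for guess in (-10.0, 0.0, 10.0)]
print(limits)
print([cubic(r) for r in limits])
```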
Although we have been able to find functions for which the iterative
process failed to produce convergent sequences, so far whenever a
convergent sequence has been produced it has led us to a root. So the
answer to our second question
If the sequence produced by iteration with $f$ converges will the
limit definitely be a root of $x = f(x)$?
seems much more likely to have a favourable answer. Imagine, for
example, that we are trying to find an $x$ for which $x = f(x)$, where $f$ is a
cubic in $x$, and we produce a convergent sequence
\[ x_1,\ x_2 = f(x_1),\ x_3 = f(x_2),\ \ldots,\ x_{n+1} = f(x_n),\ \ldots \to x. \]
Then $f(x_n)$ is simply a cubic in $x_n$ and so as the $x_n$s get closer to $x$ so do
the $f(x_n)$s get closer to $f(x)$; i.e.
\[
\begin{array}{ccccccl}
f(x_1), & f(x_2), & f(x_3), & \ldots, & f(x_n), & \ldots & \to f(x) \\
\| & \| & \| & & \| & & \\
x_2, & x_3, & x_4, & \ldots, & x_{n+1}, & \ldots & (\to x)
\end{array}
\]
But, as you can see, the sequence of $f(x_n)$s is simply part of the
sequence of $x_n$s, which converge to $x$. Therefore $x = f(x)$ and the limit
$x$ does satisfy the equation.
Most of the functions which we encounter will be 'continuous'; i.e.
if $x$ and $y$ are sufficiently close, then $f(x)$ and $f(y)$ are close. If $f$ is of
this type and an iterative process leads to a convergent sequence, then
the limit of that sequence must be a root of $x = f(x)$, as we now see.
Theorem 1.1 Let $X$ be a subset of the real numbers $\mathbb{R}$, let $f : X \to X$ be
continuous and let $x_1 \in X$. Then if the sequence
\[ x_1,\ x_2 = f(x_1),\ x_3 = f(x_2),\ \ldots,\ x_{n+1} = f(x_n),\ \ldots \]
converges to $x \in X$ it follows that $x = f(x)$.
Proof Let $x_2 = f(x_1)$, $x_3 = f(x_2)$, etc. and assume that
\[ x_1, x_2, x_3, \ldots, x_n, \ldots \to x \]
as stated. Then the $x_n$s get very close to $x$ and so (by continuity) the
$f(x_n)$s get very close to $f(x)$; i.e.
\[
\left.
\begin{array}{ccccccl}
f(x_1), & f(x_2), & f(x_3), & \ldots, & f(x_n), & \ldots & \to f(x) \\
\| & \| & \| & & \| & & \\
x_2, & x_3, & x_4, & \ldots, & x_{n+1}, & \ldots & \to x
\end{array}
\right\} \quad \therefore\ x = f(x). \qquad \square
\]
The above result is only a very special case of one which we shall meet
later (where we will also discuss the necessary conditions on the
domain $X$ of $f$).
For those interested enough to check that the condition of
continuity of / is actually needed in the above theorem we will
consider a slightly more unusual example (the reader not interested in
this esoteric point can turn immediately to the goat example below -
or indeed to the beginning of Section 1.3 if he has had enough of
solving equations by producing real sequences). Let $[x]$ denote the
'largest integer not exceeding $x$'. For example $[3.142] = 3$. This is one
of the simpler functions which is not continuous, for it is possible for $x$
to be very close to $y$ (for example 1.99 and 2) without $[x]$ being close to
$[y]$. Now let us try to solve the equation
\[ x = f(x) = [x] + 1 - \tfrac{1}{2}([x] + 1 - x)^2 \]
by an iterative process starting with $x_1 = 1$. This gives
\[ x_2 = f(x_1) = [1] + 1 - \tfrac{1}{2}([1] + 1 - 1)^2 = 2 - \tfrac{1}{2} = 1.5, \]
\[ x_3 = f(x_2) = [1.5] + 1 - \tfrac{1}{2}([1.5] + 1 - 1.5)^2 = 2 - \tfrac{1}{2}(0.5)^2 = 1.875 \quad \text{etc.} \]
which leads to the sequence
\[ 1,\ 1.5,\ 1.875,\ 1.9921875,\ 1.9999695,\ \ldots \]
which is clearly converging to 2. So
\[
\begin{array}{ccccc}
f(1), & f(1.5), & f(1.875), & f(1.9921875), & \ldots \\
\| & \| & \| & \| & \\
1.5, & 1.875, & 1.9921875, & 1.9999695, & \ldots
\end{array}
\]
also converges to 2. But
\[ f(1),\ f(1.5),\ f(1.875),\ \ldots \not\to f(2) \]
since $f(2) = 2.5$. So the limit does not satisfy $x = f(x)$ and the
continuity of $f$ is needed in the above theorem.
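This example can be checked with exact rational arithmetic. The sketch below uses Python's `fractions.Fraction` rather than floating-point numbers, because in ordinary floating point the terms would eventually round to exactly 2 and the iteration would then jump to 2.5; the variable names `xs`, `gaps` and `jump` are assumptions of this sketch.

```python
from fractions import Fraction
from math import floor


def f(x):
    """f(x) = [x] + 1 - (1/2) * ([x] + 1 - x)^2, with [x] the floor of x."""
    n = floor(x)
    return n + 1 - Fraction(1, 2) * (n + 1 - x) ** 2


xs = [Fraction(1)]                 # x_1 = 1
for _ in range(5):
    xs.append(f(xs[-1]))

gaps = [2 - x for x in xs]         # distance of each term from the limit 2
jump = f(Fraction(2))              # value of f at the limit itself

print([float(x) for x in xs])
print(gaps)
print(jump)
```

Each gap is half the square of the previous one, so the terms approach 2 without ever reaching it, while the value of $f$ at 2 itself is different from 2.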
To conclude our series of numerical examples we will solve a very
famous problem concerning a goat: the complicated function with
which we will iterate begins to show the power of this method. The
problem is that a farmer owns a circular plot of land (of unit diameter,
say) which is covered in grass. He wants to tie his goat by a long rope,
one end attached to a point on the circumference of the field and the
other end attached to the goat. How long should the rope be in order
that the goat can eat exactly half the grass ?
Exercise 5 Show (or take my word for it) that the shaded area
in Figure 1.1 is
\[ \tfrac{1}{2}\sin^{-1} x - \tfrac{x}{2}(1 - x^2)^{1/2} + x^2 \cos^{-1} x. \]
Hence find an equation in $x$ if the shaded area is equal to half
of the area of the larger circle. By rearranging your equation
into the form $x = f(x)$ for some $f$ and by choosing a suitable
first guess for $x_1$, obtain a sequence which converges to the
required value of $x$.
Note that the equation you get in Exercise 5 may have other roots
besides the one relevant to the goat problem, so make a quick mental
check that the limit of your sequence is the right sort of size. The
rearrangement which I used was obtained by finding an expression
for $\sin^{-1} x$ and hence for $x$: the iterative process with a starting value
of just over $\frac{1}{2}$ then produced a convergent sequence (with limit of
approximately 0.57935).
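One possible rearrangement of this kind can be sketched in Python. The specific form below is an assumption of this sketch rather than the author's stated formula: the area from Exercise 5 is set equal to $\pi/8$ (half the area of a field of diameter 1), the equation is solved for $\sin^{-1} x$, and the sine of both sides is taken. The first guess 0.51 and the number of steps are also assumptions.

```python
import math


def shaded_area(x):
    """Area from Exercise 5 for a rope of length x (field of diameter 1)."""
    return (0.5 * math.asin(x)
            - 0.5 * x * math.sqrt(1 - x**2)
            + x**2 * math.acos(x))


def f(x):
    """Assumed rearrangement: asin(x) = pi/4 + x*sqrt(1 - x^2) - 2*x^2*acos(x)."""
    return math.sin(math.pi / 4
                    + x * math.sqrt(1 - x**2)
                    - 2 * x**2 * math.acos(x))


x = 0.51                      # a first guess of just over one half
for _ in range(300):          # the guesses oscillate and settle down slowly
    x = f(x)

print(x, shaded_area(x), math.pi / 8)
```

Comparing `shaded_area(x)` with `math.pi / 8` is the substitution check recommended earlier in this section.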
1.3 Iterations in a different world
The general principle behind our iterative techniques in the
previous section was
\[ \text{old guess} \to \text{process} \to \text{new guess} \]
where the process can be repeated to give a sequence of improving
guesses. In all the examples so far the process has been dealing with
numbers, but now let us illustrate a process which deals with
something different.
Note before proceeding that the list of numbers
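\[ 3, 2, 1, 1, 0, 0, 0 \]
is an inventory of itself, in the sense that
\[
\begin{array}{ccccccc}
3 & 2 & 1 & 1 & 0 & 0 & 0 \\
\| & \| & \| & \| & \| & \| & \| \\
\text{no. of 0s} & \text{no. of 1s} & \text{no. of 2s} & \text{no. of 3s} & \text{no. of 4s} & \text{no. of 5s} & \text{no. of 6s} \\
\text{in the list} & \text{in the list} & \text{in the list} & \text{in the list} & \text{in the list} & \text{in the list} & \text{in the list.}
\end{array}
\]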
We are going to find a self-counting list of ten numbers.
[Fig. 1.1: a circular field of diameter 1, with a rope of length $x$.]
So let $\mathbb{R}^{10}$ be the set of points with ten real coordinates (or 'lists of
ten real numbers'). We shall find a member $x$ of $\mathbb{R}^{10}$ with
\[ x = (x_0, x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_8, x_9) \]
where $x_0$ is the no. of 0s in the list, $x_1$ is the no. of 1s in the list, etc.
In order to employ an iterative technique we will use the process
whereby given any member of $\mathbb{R}^{10}$ we count up the occurrences of the
numbers 0 to 9 as its coordinates. For example,
\[ (8, 7, 8, 2, 1, 1\tfrac{1}{2}, 5.9, 0, 1, 1) \xrightarrow{\ f:\ \text{count up occurrences of 0 to 9}\ } (1, 3, 1, 0, 0, 0, 0, 1, 2, 0) \]
Repeatedly applying the process $f$ to give a sequence of points of $\mathbb{R}^{10}$
yields, for the above starting point,
\[
\begin{array}{l}
x_1 = (8, 7, 8, 2, 1, 1\tfrac{1}{2}, 5.9, 0, 1, 1), \\
x_2 = f(x_1) = (1, 3, 1, 0, 0, 0, 0, 1, 2, 0), \\
x_3 = f(x_2) = (5, 3, 1, 1, 0, 0, 0, 0, 0, 0), \\
x_4 = f(x_3) = (6, 2, 0, 1, 0, 1, 0, 0, 0, 0), \\
x_5 = f(x_4) = (6, 2, 1, 0, 0, 0, 1, 0, 0, 0), \\
x_6 = f(x_5) = (6, 2, 1, 0, 0, 0, 1, 0, 0, 0).
\end{array}
\]
So already the process $f$ is repeating itself and we have actually
reached a point
\[ x = (6, 2, 1, 0, 0, 0, 1, 0, 0, 0) \]
for which $f(x) = x$; i.e. $x$ is a self-counting list. Try this process for
yourself with different starting points (and with lists of different
lengths).
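For readers who would like to experiment, the counting process can be sketched in Python. The function names and the cap of 50 steps are assumptions of this sketch; the cap is there because nothing said so far guarantees that every starting list leads to a list which repeats itself.

```python
def count_occurrences(values):
    """Return the list whose k-th entry counts how often k occurs in values."""
    return [sum(1 for v in values if v == k) for k in range(len(values))]


def iterate_counts(start, max_steps=50):
    """Reapply the counting process until a list is left unchanged by it."""
    seq = [list(start)]
    for _ in range(max_steps):
        nxt = count_occurrences(seq[-1])
        if nxt == seq[-1]:          # a self-counting list has been reached
            break
        seq.append(nxt)
    return seq


seq = iterate_counts([8, 7, 8, 2, 1, 1.5, 5.9, 0, 1, 1])
for member in seq:
    print(member)
```

Changing the starting list, or its length, only requires changing the argument of `iterate_counts`.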
The above example illustrated that our iterations need not only
deal with single numbers. Our next few examples will concern
iteration with functions and this will have far-reaching consequences.
To start with, imagine that we have a process which takes in some
function $x$ (e.g. $x(t) = t^2$ for $t \in \mathbb{R}$) and, after processing, gives out a new
function $y$ (e.g. $y(t) = t^2 + 1$ for $t \in \mathbb{R}$). In general
\[ \text{a function } x\ (\text{a function } \mathbb{R} \to \mathbb{R}) \to \text{process} \to \text{a new function } y. \]
Since $y$ is a function, to define it we must stipulate what $y(t)$ is in terms
of $x$'s values which we know all about. For example, our process could
start with $x$ and produce $y$ with, for each $t \in \mathbb{R}$,
\[ y(t) = 1 + \int_0^t u^2 x(u)\,du. \]
So if $x$ is the zero function, then $y(t) = 1$ for all $t$; and if $x(t) = \sin(t^3)$,
then
\[ y(t) = 1 + \int_0^t u^2 \sin(u^3)\,du = 1 + \left[-\tfrac{1}{3}\cos(u^3)\right]_0^t = \tfrac{4}{3} - \tfrac{1}{3}\cos(t^3). \]
This process can be applied to any continuous function (so that the
integral is defined) to produce another continuous function. So if $X$ is
the set of continuous functions (from $\mathbb{R}$ to $\mathbb{R}$, say) then the process
takes an $x \in X$ and creates a $y \in X$: the process is simply a function
$f : X \to X$. This is no different in principle from the examples in the
previous section except that the elements of $X$ are themselves
functions. To summarise, the above process is a function $f : X \to X$
where $x \in X$ is taken to $y = f(x)$ given by
\[ (f(x))(t) = y(t) = 1 + \int_0^t u^2 x(u)\,du. \]
Exercise 6 Let $x$ be the function given by $x(t) = 1$. Evaluate
the function $f(x)$; i.e. find $(f(x))(t)$ in terms of $t$, where $f$ is as
stated above.
In Exercise 6 $f(x)$ turned out to be the function given by $(f(x))(t) = 1 + \frac{1}{3}t^3$.
Exercise 7 Let $x_1$ be the function given by $x_1(t) = 1$. Let
$f(x_1) = x_2$ where $f$ is as above. Then $x_2$ is the function given
by $x_2(t) = 1 + \frac{1}{3}t^3$. Let $f(x_2) = x_3$: evaluate $x_3(t)$. Let $f(x_3) = x_4$:
evaluate $x_4(t)$.
We learnt in the previous section of the possible interest of reapplying
a function and investigating the behaviour of the resulting sequence.
If we consider our new $f$ defined as above and start with the function
$x_1(t) = 1$, then reapplying $f$ gives the sequence
\[ x_1(t) = 1, \quad x_2(t) = 1 + \frac{t^3}{3}, \quad x_3(t) = 1 + \frac{t^3}{3} + \frac{t^6}{9 \times 2!}, \]
\[ x_4(t) = 1 + \frac{t^3}{3} + \frac{t^6}{9 \times 2!} + \frac{t^9}{27 \times 3!}, \quad \ldots \]
As yet, we have no formal way of testing whether a sequence of
functions settles down, but you may have already spotted a pattern in
the above sequence and guessed that it is settling down to the function
given by
\[ x(t) = 1 + \frac{t^3}{3} + \frac{t^6}{3^2 \times 2!} + \frac{t^9}{3^3 \times 3!} + \cdots + \frac{t^{3n}}{3^n \times n!} + \cdots \]
Our work in the previous section might lead us to suspect that this
function is a fixed point of $f$; i.e. $f(x) = x$ or
\[ x(t) = (f(x))(t) = 1 + \int_0^t u^2 x(u)\,du \quad (\text{all } t \in \mathbb{R}). \]
And indeed this is the case, as can be checked by integration.
So it seems that we have solved the equation in $x$,
\[ x(t) = 1 + \int_0^t u^2 x(u)\,du \quad (t \in \mathbb{R}) \]
by starting with any function $x_1$ and reapplying $f$ to give a sequence
of functions: their limit was the required root.
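The iteration with functions can also be carried out mechanically when the starting function is a polynomial. In the sketch below a polynomial is stored as a list of exact coefficients (entry $k$ multiplies $t^k$); this representation and the names `apply_f` and `evaluate` are assumptions of the sketch. The final comparison uses the fact that the series displayed above is the exponential series in $t^3/3$.

```python
from fractions import Fraction
from math import exp


def apply_f(coeffs):
    """Map the polynomial x to 1 + integral from 0 to t of u^2 x(u) du."""
    result = [Fraction(1), Fraction(0), Fraction(0)]
    # u^2 * (c u^k) integrates to c t^(k+3) / (k+3), stored at index k + 3.
    for k, c in enumerate(coeffs):
        result.append(Fraction(c) / (k + 3))
    return result


def evaluate(coeffs, t):
    """Value of the polynomial with the given coefficients at t."""
    return sum(c * t**k for k, c in enumerate(coeffs))


x = [Fraction(1)]                  # x_1(t) = 1
iterates = [x]
for _ in range(10):
    x = apply_f(x)
    iterates.append(x)

approx = float(evaluate(iterates[-1], Fraction(1)))
print(iterates[1], iterates[2])
print(approx, exp(1 / 3))
```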
At the moment much of this is informal: we do not really know
what it means to say that a sequence of functions converges. (Nor
have we seen the significance of equations like the integral equation
above.) It seems that the general principle of iteration with a function
$f$ might be useful in many situations apart from solving numerical
equations. But we must develop the theory for coping with functions
$f : X \to X$ and for determining in the most general situations whether
the sequence
\[ x_1,\ x_2 = f(x_1),\ x_3 = f(x_2),\ \ldots \]
converges. Before doing so, we conclude this section with one further
'functional' example of the above type.
Exercise 8 Let $X$ be the set of continuous real functions
defined on the open interval $]-1, 1[$. (We shall always use
the notation $]a, b[$ for the open interval $\{x \in \mathbb{R} : a < x < b\}$.)
For $x \in X$ let $f(x) \in X$ be defined by
\[ (f(x))(t) = 1 - \int_0^t [x(u)]^2\,du. \]
Let $x_1$ be the zero function (i.e. $x_1(t) = 0$ for all $t \in\, ]-1, 1[$)
and let $x_2 = f(x_1)$, $x_3 = f(x_2)$, etc. Show that $x_2(t) = 1$ and
$x_3(t) = 1 - t$. Evaluate $x_4$ and $x_5$ and note that the functions
$x_1, x_2, x_3, \ldots$ seem to be approaching the function $x \in X$
given by
\[ x(t) = 1 - t + t^2 - t^3 + \cdots = \frac{1}{1 + t}. \]
Confirm by direct integration that $x(t) = 1/(1 + t)$ is indeed a
root of the equation
\[ x(t) = 1 - \int_0^t [x(u)]^2\,du. \]
2
Metric spaces
2.1 Distance
Our eventual aim is to be able to consider sequences
(sometimes of numbers, sometimes of functions and sometimes,
perhaps, from other worlds) and to ask whether they settle down or
converge. To ask whether the members of $X$
\[ x_1, x_2, x_3, \ldots, x_n, \ldots \]
get closer to $x$ we need to be able to say how far $x_n$ is from $x$. Hence our
basic need is for the idea of the distance between any two members
of $X$.
The approach we choose is a pure mathematical one, but it happens
to pay off handsomely in some of the later applications. We shall
isolate the three fundamental properties of distance and base all our
deductions on these three properties alone. This makes proofs much
easier (with only three properties to think about rather than all the
technical properties of $\mathbb{R}$, say) and it makes the results much more
general because they apply to any situation which has a concept of
distance satisfying our three basic properties.
The three properties are apparent from a normal mileage chart
(Figure 2.1). The first obvious property is that zeros appear down the
diagonal and nowhere else; so the distance between x and y is zero if
and only if x = y. The second obvious property of the mileage chart is
its symmetry; i.e. the distance from x to y is the same as that from y to
x. The third property is inherent in the chart but is not so immediately
seen. The distance from London to Manchester is 190 miles, whereas
the distance from London to Sheffield to Manchester is 160 + 40 =
200 miles. In general, if we go via some other point we must travel as
far or further than we would on the direct route (Figure 2.2). With our
usual straight-line distances this property can be illustrated by means
of the three vertices of a triangle: for this reason it is known as the
triangle inequality.
Fig. 2.1
             London   Manchester   Sheffield
London            0          190         160
Manchester      190            0          40
Sheffield       160           40           0
[Fig. 2.2: a triangle with vertices $x$, $y$ and $z$, illustrating that
distance $x$ to $y$ $\le$ (distance $x$ to $z$) + (distance $z$ to $y$).]
Definition Let $X$ be a non-empty set and for each $x, y \in X$ let $d(x, y)$ be
a real number satisfying
M1. $d(x, y) = 0$ if and only if $x = y$;
M2. $d(x, y) = d(y, x)$ for each $x, y \in X$;
M3. $d(x, y) \le d(x, z) + d(z, y)$ for each $x$, $y$ and $z \in X$.
Then $d$ is called a metric or distance on $X$, and $X$ together with $d$ is
called a metric space $(X, d)$.
Of course, there are other properties of $d$ which you might have
expected to be in the definition. For example, $d(x, y) \ge 0$ for all $x$ and
$y$. But this property follows from the other three, as the interested
reader might like to confirm.
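The three properties can be tested mechanically on a finite sample of points, as in the Python sketch below. The function name `is_metric_on`, the sample and the tolerance are assumptions of this sketch. Passing the test on a sample proves nothing about the whole set, but a single failure is enough to show that a proposed distance is not a metric.

```python
from itertools import product


def is_metric_on(points, d, tol=1e-12):
    """Check M1, M2 and M3 for every choice of x, y, z from a finite sample."""
    for x, y in product(points, repeat=2):
        if (d(x, y) == 0) != (x == y):          # M1
            return False
        if abs(d(x, y) - d(y, x)) > tol:        # M2
            return False
    for x, y, z in product(points, repeat=3):
        if d(x, y) > d(x, z) + d(z, y) + tol:   # M3
            return False
    return True


def usual(x, y):
    """The usual distance between real numbers."""
    return abs(x - y)


def squared(x, y):
    """Satisfies M1 and M2, but fails M3 (for example for -2, 0 and 3)."""
    return (x - y) ** 2


sample = [-2.0, 0.0, 0.5, 1.0, 3.0]
print(is_metric_on(sample, usual))
print(is_metric_on(sample, squared))
```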
Exercise 9 Use the fact that
\[ d(x, x) \le d(x, y) + d(y, x) \]
for any $x$, $y$ in a metric space $(X, d)$ to deduce that $d(x, y) \ge 0$.
Note also that if distances are generally greater going via an
additional point, then they are greater going via any number of
additional points $z_1, z_2, \ldots, z_n$: for, by repeated use of M3,
\[
\begin{aligned}
d(x, y) &\le d(x, z_1) + d(z_1, y) \\
&\le d(x, z_1) + d(z_1, z_2) + d(z_2, y) \\
&\le d(x, z_1) + d(z_1, z_2) + d(z_2, z_3) + d(z_3, y) \\
&\le d(x, z_1) + d(z_1, z_2) + d(z_2, z_3) + \cdots + d(z_n, y).
\end{aligned}
\]
It is crucial to our later applications that there are many examples
of metric spaces. To confirm that some of the examples satisfy M3
may be a little technical, but that does not alter the fact that the basic
idea of distance and a metric space is a very simple and elegant one.
2.2 Examples of metric spaces
1. Let $X = \mathbb{R}$, the set of real numbers, and for $x, y \in X$ define
$d(x, y)$ by $d(x, y) = |x - y|$. Then $(X, d)$ is a metric space, as we now
verify:
M1. $d(x, y) = |x - y| = 0$ if and only if $x = y$;
M2. $d(x, y) = |x - y| = |-(x - y)| = |y - x| = d(y, x)$;
M3. $d(x, y) = |x - y| = |(x - z) + (z - y)| \le |x - z| + |z - y| = d(x, z) + d(z, y)$.
The verification of M3 in Example 1 used the easy fact that, for real
numbers, $|a + b| \le |a| + |b|$. This same property is true for complex
numbers $a$ and $b$: in fact (as the interested reader can verify for
himself), if the arguments of $a$, $b$ and $a + b$ are $\theta$, $\phi$ and $\psi$ respectively,
then
\[ |a + b| = |a| \cos(\theta - \psi) + |b| \cos(\phi - \psi) \le |a| + |b|. \]
So precisely the same reasoning which showed that $d(x, y) = |x - y|$
defines a metric for $\mathbb{R}$ will show that it defines a metric for $\mathbb{C}$. Hence:
2. Let $X = \mathbb{C}$, the set of complex numbers, and for $x, y \in X$
define $d(x, y)$ by $d(x, y) = |x - y|$. Then, as above, $(X, d)$ is a metric
space.
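The identity for complex numbers quoted above can be checked numerically for particular values. The sketch below uses Python's `cmath.phase` for the argument of a complex number; the function name `projection_sum` and the two sample numbers are assumptions of this sketch, and a check of one pair is of course no substitute for the verification left to the reader.

```python
import cmath
import math


def projection_sum(a, b):
    """Return |a| cos(theta - psi) + |b| cos(phi - psi) for complex a, b."""
    theta, phi, psi = cmath.phase(a), cmath.phase(b), cmath.phase(a + b)
    return abs(a) * math.cos(theta - psi) + abs(b) * math.cos(phi - psi)


a, b = 3 + 4j, 1 - 2j
lhs = abs(a + b)              # |a + b|
mid = projection_sum(a, b)    # the expression involving the three arguments
rhs = abs(a) + abs(b)         # |a| + |b|
print(lhs, mid, rhs)
```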
3. Let $X = \mathbb{R}^2$, the set of points in the coordinate plane. We shall
use bold print for the members $\mathbf{x}$ of $X$ to distinguish them from their
coordinates; $\mathbf{x} = (x_1, x_2)$, etc. For $\mathbf{x}, \mathbf{y} \in X$ define $d(\mathbf{x}, \mathbf{y})$ by
\[ d(\mathbf{x}, \mathbf{y}) = d((x_1, x_2), (y_1, y_2)) = [(x_1 - y_1)^2 + (x_2 - y_2)^2]^{1/2}. \]
This is precisely the usual straight-line or Euclidean distance between
points in the plane. It is intuitively clear that our everyday distance
satisfies M1, M2 and M3, but we now give the formal verification:
M1. $d(\mathbf{x}, \mathbf{y}) = 0$ if and only if $[(x_1 - y_1)^2 + (x_2 - y_2)^2]^{1/2} = 0$ which
happens if and only if $x_1 = y_1$ and $x_2 = y_2$, i.e. $\mathbf{x} = \mathbf{y}$.
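M2. $d(\mathbf{x}, \mathbf{y}) = [(x_1 - y_1)^2 + (x_2 - y_2)^2]^{1/2} = [(y_1 - x_1)^2 + (y_2 - x_2)^2]^{1/2} = d(\mathbf{y}, \mathbf{x})$.
M3. We will establish M3 for $\mathbf{x} = (x_1, x_2)$, $\mathbf{y} = (y_1, y_2)$ and $\mathbf{z} = (z_1, z_2)$.
Note that the quadratic in $t$ given by
\[ [(x_1 - z_1)^2 + (x_2 - z_2)^2]t^2 + 2[(x_1 - z_1)(z_1 - y_1) + (x_2 - z_2)(z_2 - y_2)]t + [(z_1 - y_1)^2 + (z_2 - y_2)^2] \]
is in fact
\[ [(x_1 - z_1)t + (z_1 - y_1)]^2 + [(x_2 - z_2)t + (z_2 - y_2)]^2 \]
which is never negative for real $t$. But if a quadratic $\alpha t^2 + \beta t + \gamma$ with
$\alpha, \gamma \ge 0$ is never negative for real $t$ then its discriminant is not positive,
and $\beta \le 2(\alpha\gamma)^{1/2}$. So in this case
\[
\begin{aligned}
2[(x_1 - z_1)(z_1 - y_1) + (x_2 - z_2)(z_2 - y_2)]
&\le 2\{[(x_1 - z_1)^2 + (x_2 - z_2)^2][(z_1 - y_1)^2 + (z_2 - y_2)^2]\}^{1/2} \\
&= 2d(\mathbf{x}, \mathbf{z})\,d(\mathbf{z}, \mathbf{y}).
\end{aligned}
\]
Therefore
\[
\begin{aligned}
[d(\mathbf{x}, \mathbf{y})]^2 &= (x_1 - y_1)^2 + (x_2 - y_2)^2 \\
&= [(x_1 - z_1) + (z_1 - y_1)]^2 + [(x_2 - z_2) + (z_2 - y_2)]^2 \\
&= (x_1 - z_1)^2 + (x_2 - z_2)^2 + 2[(x_1 - z_1)(z_1 - y_1) + (x_2 - z_2)(z_2 - y_2)] + (z_1 - y_1)^2 + (z_2 - y_2)^2 \\
&\le (x_1 - z_1)^2 + (x_2 - z_2)^2 + 2d(\mathbf{x}, \mathbf{z})\,d(\mathbf{z}, \mathbf{y}) + (z_1 - y_1)^2 + (z_2 - y_2)^2 \\
&= [d(\mathbf{x}, \mathbf{z})]^2 + 2d(\mathbf{x}, \mathbf{z})\,d(\mathbf{z}, \mathbf{y}) + [d(\mathbf{z}, \mathbf{y})]^2 \\
&= [d(\mathbf{x}, \mathbf{z}) + d(\mathbf{z}, \mathbf{y})]^2
\end{aligned}
\]
and
\[ d(\mathbf{x}, \mathbf{y}) \le d(\mathbf{x}, \mathbf{z}) + d(\mathbf{z}, \mathbf{y}) \]
as required. So $(X, d)$ is a metric space.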
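The steps of this verification can be followed numerically for particular points. In the sketch below the three points and the function names are assumptions chosen for illustration; `alpha`, `beta` and `gamma` are the coefficients of the quadratic in $t$ used in the verification of M3.

```python
import math


def d(p, q):
    """Euclidean distance between points p and q of the plane."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)


def quadratic_coefficients(x, y, z):
    """Coefficients (alpha, beta, gamma) of the quadratic in t used for M3."""
    alpha = (x[0] - z[0]) ** 2 + (x[1] - z[1]) ** 2
    beta = 2 * ((x[0] - z[0]) * (z[0] - y[0]) + (x[1] - z[1]) * (z[1] - y[1]))
    gamma = (z[0] - y[0]) ** 2 + (z[1] - y[1]) ** 2
    return alpha, beta, gamma


x, y, z = (0.0, 2.0), (3.0, 6.0), (1.0, -1.0)
alpha, beta, gamma = quadratic_coefficients(x, y, z)
discriminant = beta**2 - 4 * alpha * gamma

print(alpha, beta, gamma, discriminant)
# Putting t = 1 in the quadratic gives the square of the distance from x to y.
print(alpha + beta + gamma, d(x, y) ** 2)
print(d(x, y), d(x, z) + d(z, y))
```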
Readers familiar with the representation of $\mathbb{C}$ in an Argand diagram
(or 'complex plane') will perhaps suspect that there is a strong
connection between distances in $\mathbb{C}$ and distances in $\mathbb{R}^2$. In fact if
$x = x_1 + x_2 i$ and $y = y_1 + y_2 i$ in $\mathbb{C}$ (where $x_1$, $x_2$, $y_1$ and $y_2$ are real), then
the definition of distance $d$ in $\mathbb{C}$ given in Example 2 shows that
\[
\begin{aligned}
d(x, y) &= |(x_1 + x_2 i) - (y_1 + y_2 i)| \\
&= |(x_1 - y_1) + (x_2 - y_2)i| \\
&= [(x_1 - y_1)^2 + (x_2 - y_2)^2]^{1/2},
\end{aligned}
\]
which is precisely the distance from $(x_1, x_2)$ to $(y_1, y_2)$ in $\mathbb{R}^2$ using the
definition of distance from Example 3. So the verification of M3 in
that example also provides a verification for $\mathbb{C}$ (or vice versa).
4. Let $X = \mathbb{R}^n$, the set of points with $n$ real coordinates; e.g.
$\mathbf{x} = (x_1, x_2, \ldots, x_n) \in X$. (In the case $n = 2$ this gives the coordinate plane of
Example 3 and in the case $n = 3$ this gives the usual 3-dimensional
space.) For $\mathbf{x}, \mathbf{y} \in X$ define $d(\mathbf{x}, \mathbf{y})$ by
\[
\begin{aligned}
d(\mathbf{x}, \mathbf{y}) &= d((x_1, x_2, \ldots, x_n), (y_1, y_2, \ldots, y_n)) \\
&= [(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_n - y_n)^2]^{1/2}.
\end{aligned}
\]
Then $(X, d)$ is a metric space. The verification is very similar to the
case $n = 2$ given in Example 3, except, of course, that M3 is verified by
considering the quadratic
\[ [(x_1 - z_1)^2 + \cdots + (x_n - z_n)^2]t^2 + 2[(x_1 - z_1)(z_1 - y_1) + \cdots + (x_n - z_n)(z_n - y_n)]t + [(z_1 - y_1)^2 + \cdots + (z_n - y_n)^2]. \]
Exercise 10 Verify that the example $(X, d)$ in 4 does satisfy
M1, M2 and M3 and hence that it is a metric space.
The examples so far have simply put into the setting of a metric space
all the familiar concepts of distance (between real numbers or points
in a plane, etc.) but the main advantage of our abstraction will be that
it incorporates some far less familiar situations, some of which will
lead to powerful applications. The first of these unusual examples
shows how our metrics can differ considerably from an intuitive
concept of distance.
5. Let $X$ be any non-empty set and for $x, y \in X$ define $d(x, y)$ by
\[ d(x, y) = \begin{cases} 0 & \text{if } x = y \\ 1 & \text{if } x \ne y. \end{cases} \]
Then $(X, d)$ is a metric space, called a discrete space.
Before we leave the reader to verify that a discrete space is a metric
space some comments about M3 might be helpful. If $X$ is a set and
$d(x, y)$ is defined for each $x, y \in X$ such that $d(x, y) \ge 0$ and $d$ satisfies
M1 and M2, then it is easy to check that
\[
\begin{array}{ll}
d(x, x) \le d(x, z) + d(z, x) & \text{(which is M3 with } x = y) \\
d(x, y) \le d(x, y) + d(y, y) & \text{(which is M3 with } y = z) \\
d(x, y) \le d(x, x) + d(x, y) & \text{(which is M3 with } x = z).
\end{array}
\]
In other words, if $d(x, y) \ge 0$ and M1, M2 hold, then M3 holds if any
two out of $x$, $y$ and $z$ are the same. So it is only necessary to check M3
for different $x$, $y$ and $z$.
Exercise 11 Verify that a discrete space is a metric space.
Given the points $(0, 2)$ and $(3, 6)$ in $\mathbb{R}^2$ how far apart are they? In the
usual straight-line measurement of distance (as in Example 3) they are
5 apart, but in the discrete measurement of distance (as in Example 5)
they are 1 apart. Since sets can have more than one metric on them we
ought to make it clear which metric we are referring to when asking
any question concerning distance. However, unless we say anything
to the contrary, when talking about distances in $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^n$ or $\mathbb{C}$ we
shall assume that the usual metrics outlined in the above examples are
being used.
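The comparison of the two metrics on the points (0, 2) and (3, 6) can be sketched in a few lines of Python. The function names below are illustrative only, and the spot check of M3 over a handful of sample points is an experiment, not a proof.

```python
import itertools
import math

def euclidean(x, y):
    """Usual metric on R^n (Examples 3 and 4)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def discrete(x, y):
    """Discrete metric (Example 5): 0 for equal points, 1 otherwise."""
    return 0 if x == y else 1

def satisfies_m3(d, points, tol=1e-12):
    """Check d(x, y) <= d(x, z) + d(z, y) for every triple drawn from points."""
    return all(d(x, y) <= d(x, z) + d(z, y) + tol
               for x, y, z in itertools.product(points, repeat=3))

# Hypothetical sample points; the first two are the pair discussed in the text.
points = [(0, 2), (3, 6), (-1, 4), (2, 2)]
print(euclidean(points[0], points[1]))  # 5.0
print(discrete(points[0], points[1]))   # 1
print(satisfies_m3(euclidean, points), satisfies_m3(discrete, points))  # True True
```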
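By way of a diversion we introduce yet another metric on $\mathbb{R}^2$.
6. Let $X = \mathbb{R}^2$ and for $x, y \in X$ define $d(x, y)$ by
\[
d(x, y) = d((x_1, x_2), (y_1, y_2)) =
\begin{cases}
|x_1 - y_1| & \text{if } x_2 = y_2 \\
|x_1| + |x_2 - y_2| + |y_1| & \text{if } x_2 \neq y_2.
\end{cases}
\]
Then $(X, d)$ is a metric space. We verify M1, M2 and M3:
M1. $d(x, y) = 0$ if and only if $|x_1 - y_1| = 0$ and $x_2 = y_2$, or
$|x_1| + |x_2 - y_2| + |y_1| = 0$ and $x_2 \neq y_2$. These latter conditions
are incompatible and so $d(x, y) = 0$ if and only if $x_1 = y_1$ and
$x_2 = y_2$; i.e. $x = y$.
M2.
\[
d(x, y) =
\begin{cases}
|x_1 - y_1| & \text{if } x_2 = y_2 \\
|x_1| + |x_2 - y_2| + |y_1| & \text{if } x_2 \neq y_2
\end{cases}
=
\begin{cases}
|y_1 - x_1| & \text{if } y_2 = x_2 \\
|y_1| + |y_2 - x_2| + |x_1| & \text{if } y_2 \neq x_2
\end{cases}
= d(y, x).
\]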
M3. To verify M3 for $x = (x_1, x_2)$, $y = (y_1, y_2)$ and $z = (z_1, z_2)$ we
note firstly that $|x_1 - y_1| \leq d(x, y)$ and then check M3 in cases.
If $x_2 = y_2$, then
\[
d(x, y) = |x_1 - y_1| \leq |x_1 - z_1| + |z_1 - y_1| \leq d(x, z) + d(z, y).
\]
If $x_2 \neq y_2$, then $z_2$ cannot equal both $x_2$ and $y_2$, so assume
that $z_2 \neq x_2$ (the case $z_2 \neq y_2$ following similarly). Then
\[
\begin{aligned}
d(x, y) &= |x_1| + |x_2 - y_2| + |y_1| \\
        &\leq |x_1| + |x_2 - z_2| + |z_2 - y_2| + |y_1| \\
        &\leq
\begin{cases}
(|x_1| + |x_2 - z_2| + |z_1|) + |z_1 - y_1| & \text{if } y_2 = z_2 \\
(|x_1| + |x_2 - z_2| + |z_1|) + (|z_1| + |z_2 - y_2| + |y_1|) & \text{if } y_2 \neq z_2
\end{cases} \\
        &= d(x, z) + d(z, y),
\end{aligned}
\]
and M3 is verified.
So this is yet another metric on $\mathbb{R}^2$. Its distances can be illustrated in
the plane: we show both cases in one illustration (Figure 2.3). Perhaps
the picture shows why this metric is known as the 'lift metric' or the
'raspberry pickers' metric'! (Think about travelling between parts of a
tall building or between parts of a field planted with rows of raspberry
plants but with a central path linking the rows.)
[Fig. 2.3: two paths in the plane; the length of one path is $d(x, y)$ and the length of the other is $d(x', y')$.]
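As a hedged illustration of Example 6, the 'lift metric' can be written directly from its two-case definition. The function name and the small integer grid used for the spot check of M3 are assumptions made for this sketch.

```python
import itertools

def lift_metric(x, y):
    """Lift metric on R^2: travel along a row if the second coordinates
    agree, otherwise go via the central path (first coordinate 0)."""
    x1, x2 = x
    y1, y2 = y
    if x2 == y2:
        return abs(x1 - y1)
    return abs(x1) + abs(x2 - y2) + abs(y1)

print(lift_metric((1, 0), (4, 0)))  # same row: 3
print(lift_metric((1, 0), (1, 2)))  # different rows: 1 + 2 + 1 = 4

# Spot check of M3 on a small grid of integer points (an experiment, not a proof).
grid = list(itertools.product(range(-2, 3), repeat=2))
print(all(lift_metric(x, y) <= lift_metric(x, z) + lift_metric(z, y)
          for x, y, z in itertools.product(grid, repeat=3)))  # True
```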
Exercise 12 Let $X = \mathbb{C}$, the set of complex numbers, and
define $d$ by
\[
d(x, y) = \begin{cases} 0 & \text{if } x = y \\ |x| + |y| & \text{if } x \neq y. \end{cases}
\]
Show that $d$ is a metric for $\mathbb{C}$ and think of a suitably
illuminating name for it.
Exercise 13 Let $X = \mathbb{R}^2$ and define $d$ by
\[
d(x, y) = d((x_1, x_2), (y_1, y_2)) = \max\{|x_1 - y_1|, |x_2 - y_2|\};
\]
i.e. the larger of $|x_1 - y_1|$ or $|x_2 - y_2|$. Show that $d$ is a metric
for $\mathbb{R}^2$. (The keen reader can check that
\[
d(x, y) = d((x_1, \ldots, x_n), (y_1, \ldots, y_n)) = \max\{|x_1 - y_1|, \ldots, |x_n - y_n|\}
\]
defines a metric for $\mathbb{R}^n$.)
Those last few non-standard examples show that the abstract concept
of distance can differ from our preconceived ideas of distance. But by
far the most important examples for our purposes concern the
distance between functions.
Consider the functions $x, y\colon [0, 2] \to \mathbb{R}$ given by $x(t) = t^2$ and $y(t) =
t + 1$. How far apart are these functions? How could we sensibly
measure this distance? A good starting point is with a picture, so let us
look at the graphs of these functions (Figure 2.4).
[Fig. 2.4: graphs of the functions $x$ and $y$.]
One way of measuring their distance apart is to ask how far their graphs are ever
apart. In this case when $t = \tfrac{1}{2}$ the graphs are $1\tfrac{1}{4}$ apart, and that is the
maximum separation which occurs. In general, given two functions $x$
and $y$ with domain $A$ we ought to consider the values of $|x(t) - y(t)|$ as $t$
takes all values in $A$ and then ask what the biggest of these values is.
Since we shall only ever be considering the distance between
continuous functions on closed intervals $[a, b]$ ($= \{x \in \mathbb{R} : a \leq x \leq b\}$)
this idea of the 'biggest separation' turns out to be sufficient for our
needs: the reader who wishes to consider why this idea is not sufficient
for all functions can read the comments after the next example.
7. Let $X$ be the set of continuous functions from $[a, b]$ to $\mathbb{R}$, and
for $x, y \in X$ define $d(x, y)$ by
\[
d(x, y) = \max\{|x(t) - y(t)| : t \in [a, b]\}.
\]
Then $(X, d)$ is a metric space, as we now verify:
M1. $d(x, y) = \max\{|x(t) - y(t)| : t \in [a, b]\} = 0$ if and only if
$|x(t) - y(t)| = 0$ for each $t \in [a, b]$; i.e. if and only if $x(t) = y(t)$
for each $t \in [a, b]$; i.e. if and only if $x = y$.
M2.
\[
d(x, y) = \max\{|x(t) - y(t)| : t \in [a, b]\} = \max\{|y(t) - x(t)| : t \in [a, b]\} = d(y, x).
\]
M3.
\[
\begin{aligned}
d(x, y) &= \max\{|x(t) - y(t)| : t \in [a, b]\} \\
        &= |x(t_0) - y(t_0)| \quad \text{for some } t_0 \in [a, b] \\
        &\leq |x(t_0) - z(t_0)| + |z(t_0) - y(t_0)| \\
        &\leq \max\{|x(t) - z(t)| : t \in [a, b]\} + \max\{|z(t) - y(t)| : t \in [a, b]\} \\
        &= d(x, z) + d(z, y),
\end{aligned}
\]
as required.
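The 'max' metric can be approximated numerically by sampling the interval. The sketch below is only an approximation on an evenly spaced grid (the name `max_distance` and the sample count are assumptions); it is applied to the functions $x(t) = t^2$ and $y(t) = t + 1$ on $[0, 2]$ discussed before Example 7.

```python
def max_distance(x, y, a, b, samples=2000):
    """Approximate max{|x(t) - y(t)| : t in [a, b]} on an evenly spaced grid.

    The true maximum may fall between grid points, so in general this is
    a lower estimate of the distance in the 'max' metric.
    """
    grid = (a + (b - a) * i / samples for i in range(samples + 1))
    return max(abs(x(t) - y(t)) for t in grid)

d = max_distance(lambda t: t ** 2, lambda t: t + 1, 0.0, 2.0)
print(d)  # 1.25, attained at the grid point t = 0.5
```

Here the maximising point $t = \tfrac{1}{2}$ happens to lie on the grid, so the sampled value agrees with the separation of $1\tfrac{1}{4}$ quoted in the text.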
Exercise 14 Let $x, y\colon [0, 1] \to \mathbb{R}$ be given by $x(t) = t$ and
$y(t) = t^2$. Calculate the distance between $x$ and $y$ in the metric
space of Example 7.
In general, we shall let $C(a, b)$ denote the collection of continuous
functions from $[a, b]$ to $\mathbb{R}$. (Those readers who find it hard to consider
a set of functions can think of it as a set of graphs.) Unless we say
otherwise, the metric which we shall be using on this set is the 'max'
metric defined in 7. Readers who wish to pursue this subject further
will find this metric referred to elsewhere as the 'sup' metric. We shall
pause now to explain some of the possible inadequacies of the 'max'
metric approach in more general situations. The reader not interested
in this subtle point can turn immediately to Example 9.
What is the distance between the functions $x, y$ from the open
interval $]0, 1[$ to $\mathbb{R}$ given by $x(t) = t$ and $y(t) = t^2 + t + 1$? The above
approach would imply that we should look at
\[
|x(t) - y(t)| = |t - (t^2 + t + 1)| = t^2 + 1
\]
for $t \in\, ]0, 1[$ and ask what is its biggest value. But $t^2 + 1$ ($t \in\, ]0, 1[$)
takes all values between 1 and 2 without actually ever equalling 2. So
instead of asking for the maximum of $|x(t) - y(t)|$ (which never
actually occurs) we must ask for the smallest number which is at least
as big as every $|x(t) - y(t)|$. In general a non-empty set $E$ of real
numbers is bounded above if there exists a number $u$ (an upper bound)
with $e \leq u$ for all $e \in E$. Then $E$ has a least upper bound, called its
supremum or $\sup E$. If $E$ has a biggest member then that biggest
member is the supremum.
Similarly a set $E$ is bounded below if there exists a number $l$ (a lower
bound) with $l \leq e$ for all $e \in E$, and $E$ is bounded if it is bounded above
and below.
Now we can extend Example 7 to include functions $x\colon A \to \mathbb{R}$
provided that $\{x(t) : t \in A\}$ is bounded: this simply ensures that the
distance between two functions is never infinite.
8. Let $X$ be the set of bounded functions from some set $A$ to $\mathbb{R}$;
i.e. those $x\colon A \to \mathbb{R}$ such that the set $\{x(t) : t \in A\}$ is bounded. For
$x, y \in X$ define $d(x, y)$ by
\[
d(x, y) = \sup\{|x(t) - y(t)| : t \in A\}.
\]
Then (X, d) is a metric space. The verification is similar to that for
Example 7 except that we cannot assume that
\[
\sup\{|x(t) - y(t)| : t \in A\} = |x(t_0) - y(t_0)|
\]
for some $t_0 \in A$. Therefore the proof of M3 has to be revised as follows.
M3. For any $t_0 \in A$ we have
\[
\begin{aligned}
|x(t_0) - y(t_0)| &\leq |x(t_0) - z(t_0)| + |z(t_0) - y(t_0)| \\
                  &\leq \sup\{|x(t) - z(t)| : t \in A\} + \sup\{|z(t) - y(t)| : t \in A\} \\
                  &= d(x, z) + d(z, y).
\end{aligned}
\]
Therefore $d(x, z) + d(z, y)$ is an upper bound for the set $\{|x(t) - y(t)| :
t \in A\}$ and so
\[
d(x, y) = \sup\{|x(t) - y(t)| : t \in A\} \leq d(x, z) + d(z, y).
\]
Exercise 15 Let $x, y\colon [0, 1] \to \mathbb{R}$ be given by $x(t) = [t]$ (the
integer part of $t$) and $y(t) = t^2$. Evaluate the distance between
these functions using the 'sup' metric of Example 8.
9. Another way of measuring distance between continuous
functions is by 'adding up' all the distances apart of their graphs. Let
$X = C(a, b)$, the set of continuous functions $[a, b] \to \mathbb{R}$, and for
$x, y \in X$ define $d(x, y)$ by
\[
d(x, y) = \int_a^b |x(t) - y(t)|\,dt.
\]
Then $(X, d)$ is a metric space.
Exercise 16 Show that the $(X, d)$ defined in Example 9 does
satisfy M1, M2 and M3. (You may assume all the necessary
properties of integrals. In particular you will need the fact
that, if $x$ is continuous with $x(t) \geq 0$ for $t \in [a, b]$ and
$\int_a^b x(t)\,dt = 0$, then $x(t) = 0$ for each $t$.)
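The integral metric of Example 9 can also be approximated numerically. The sketch below uses the trapezium rule, which is a choice made here for illustration and is not taken from the text; the functions $t$ and $t^2$ on $[0, 1]$ are used as sample inputs, for which the exact integral of $t - t^2$ is $1/6$.

```python
def integral_distance(x, y, a, b, steps=1000):
    """Approximate the integral of |x(t) - y(t)| over [a, b] by the trapezium rule."""
    h = (b - a) / steps
    values = [abs(x(a + i * h) - y(a + i * h)) for i in range(steps + 1)]
    # End points carry half weight in the trapezium rule.
    return h * (sum(values) - 0.5 * (values[0] + values[-1]))

d = integral_distance(lambda t: t, lambda t: t ** 2, 0.0, 1.0)
print(d)  # close to 1/6
```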
We are now in a position to return to the question of whether a
sequence of functions (or other objects) converges.
2.3 Sequences
A sequence is simply an infinite list. Examples of sequences
are
1,2,4,8,16,...
(which is a sequence - or at least the start of a sequence - of integers)
and
(2,i),(3,i),(4,i),(5,i),...
(which is a sequence of points of IR2) and
{ i + ii 2 , 1 : 3 , 1 -
*» 2 + 21? 3 ^ 3 1 ' 4 ^4*> • • •
(which is a sequence of members of C). In general a sequence in a set X
is a list
\[ x_1, x_2, x_3, \ldots, x_n, \ldots \]
of members of X.
What do we mean by a converging sequence? We might guess that
the sequence of real numbers
? 11 i i 11 1 . I
converges to 1. That is because the distance of the terms from 1
dwindles to nothing: the distance of the $n$th term from 1 (as always
with respect to the usual metric on $\mathbb{R}$) is given by
\[
|x_n - 1| = \frac{1}{n} \to 0 \quad \text{as } n \to \infty.
\]
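The sequence
\[
x_1 = (2, 1),\ x_2 = (2\tfrac{1}{2}, \tfrac{1}{2}),\ x_3 = (2\tfrac{2}{3}, \tfrac{1}{3}),\ \ldots,\ x_n = \left(3 - \tfrac{1}{n}, \tfrac{1}{n}\right),\ \ldots
\]
in $\mathbb{R}^2$ would seem to be converging to $x = (3, 0)$. To check this we look
at the distance of the $n$th term from that $x$:
\[
d(x_n, x) = d\left(\left(3 - \tfrac{1}{n}, \tfrac{1}{n}\right), (3, 0)\right)
          = \left[\left(-\tfrac{1}{n}\right)^2 + \left(\tfrac{1}{n}\right)^2\right]^{1/2}
          = \frac{\sqrt{2}}{n}.
\]
Again, this distance certainly tends to 0 as $n$ gets large.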
We can now see what we mean by a sequence converging in a metric
space (X, d).
Definition In a metric space $(X, d)$ the sequence $x_1, x_2, x_3, \ldots, x_n, \ldots$
of points of $X$ converges to limit $x \in X$ if $d(x_n, x) \to 0$ (i.e. if the sequence
$d(x_1, x), d(x_2, x), \ldots$ of real numbers converges to 0) as $n \to \infty$. We
write $x_1, x_2, x_3, \ldots \to x$ or simply $x_n \to x$ as $n \to \infty$.
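The definition reduces convergence in a metric space to convergence of the real numbers $d(x_n, x)$. A minimal numerical sketch, assuming the sequence $x_n = (3 - 1/n, 1/n)$ in $\mathbb{R}^2$ with proposed limit $(3, 0)$ and the usual metric (the helper names are hypothetical):

```python
import math

def euclidean(x, y):
    """Usual metric on R^2."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def term(n):
    """n-th term of the assumed sequence x_n = (3 - 1/n, 1/n)."""
    return (3 - 1 / n, 1 / n)

limit = (3.0, 0.0)
distances = [euclidean(term(n), limit) for n in (1, 10, 100, 1000)]
print(distances)  # shrinking towards 0, roughly sqrt(2)/n
```

Printing a few distances suggests convergence but does not prove it; the proof is the estimate of $d(x_n, x)$ for general $n$.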
Exercise 17 Use the definition to show that the sequence
(0j 1)> (2? ^)» (4» 9)' V8> 3l)> • • • » ( 1 — ^ n - l J ~y~2
converges to (1,^) in U2.
Exercise 18 Show that the sequence
\[
\cos\pi + i\sin\pi,\ \cos\frac{\pi}{2} + i\sin\frac{\pi}{2},\ \cos\frac{\pi}{3} + i\sin\frac{\pi}{3},\ \cos\frac{\pi}{4} + i\sin\frac{\pi}{4},\ \ldots
\]
converges to 1 in $\mathbb{C}$.
The sequences in these two exercises can be illustrated in the plane
and, as far as intuition goes, they certainly seem to converge as
claimed (Figure 2.5). It is also fairly clear that
(0,D,(if),(if),(iH),...
[Fig. 2.5: the two sequences illustrated in $\mathbb{R}^2$ and in $\mathbb{C}$ (real and imaginary axes).]
converges to (1, j) since the first coordinates tend to 1 and the second
coordinates tend to \. And indeed it does turn out that this coordinate
check is enough, as we now see.
Theorem 2.1 The sequence $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots$ converges to
$(x, y)$ in $\mathbb{R}^2$ if and only if $x_1, x_2, x_3, \ldots$ converges to $x$ and $y_1, y_2, y_3, \ldots$
converges to $y$ in $\mathbb{R}$.
Proof Note firstly that for any $(x, y)$ and $(x', y') \in \mathbb{R}^2$ we have
\[
|x - x'| \leq [(x - x')^2 + (y - y')^2]^{1/2} = d((x, y), (x', y')).
\]
So if $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots \to (x, y)$ in $\mathbb{R}^2$, then
\[
0 \leq |x_n - x| \leq d((x_n, y_n), (x, y)) \to 0 \quad \text{as } n \to \infty
\]
and $|x_n - x| \to 0$ as $n \to \infty$. Hence $x_1, x_2, x_3, \ldots \to x$ and, similarly, $y_1,
y_2, y_3, \ldots \to y$ in $\mathbb{R}$.
Conversely, if $x_1, x_2, x_3, \ldots \to x$ and $y_1, y_2, y_3, \ldots \to y$ in $\mathbb{R}$, then
$|x_n - x| \to 0$ and $|y_n - y| \to 0$ as $n \to \infty$. Hence $(x_n - x)^2 \to 0$ and
$(y_n - y)^2 \to 0$ as $n \to \infty$ and
\[
d((x_n, y_n), (x, y)) = [(x_n - x)^2 + (y_n - y)^2]^{1/2} \to 0 \quad \text{as } n \to \infty.
\]
Thus $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots \to (x, y)$ in $\mathbb{R}^2$ as required. $\square$
Exercise 19 Show that $(x_1, y_1, z_1), (x_2, y_2, z_2), (x_3, y_3, z_3),
\ldots \to (x, y, z)$ in $\mathbb{R}^3$ if and only if $x_1, x_2, x_3, \ldots \to x$ and $y_1, y_2,
y_3, \ldots \to y$ and $z_1, z_2, z_3, \ldots \to z$ in $\mathbb{R}$. If you are keen, state
and prove the corresponding result for $\mathbb{R}^n$.
Exercise 20 Use the fact that the distance in $\mathbb{C}$ from $x_1 + x_2 i$
to $y_1 + y_2 i$ is the same as the distance in $\mathbb{R}^2$ from $(x_1, x_2)$ to
$(y_1, y_2)$ to show that the sequence of complex numbers $z_1, z_2,
z_3, \ldots$ converges to $z$ if and only if $\operatorname{Re} z_1, \operatorname{Re} z_2, \operatorname{Re} z_3, \ldots$
converges to $\operatorname{Re} z$ and $\operatorname{Im} z_1, \operatorname{Im} z_2, \operatorname{Im} z_3, \ldots$ converges to
$\operatorname{Im} z$. (Here, $\operatorname{Re} z$ denotes the real part of $z$ and $\operatorname{Im} z$ denotes
its imaginary part.)
All the above examples show that, as far as $\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^n$ and $\mathbb{C}$ are
concerned, our abstract concept of convergence teaches us nothing
that we could not have guessed anyway. It is only in the non-standard
examples that intuition starts to let us down.
Exercise 21 Let $x_1, x_2, x_3, \ldots \to x$ in a set $X$ with the discrete
metric $d$. (So $d(x_n, x)$ is either 0 or 1 for each $n$.) Show that,
apart from a finite number of the $x_n$s, all the terms of the
sequence are equal to $x$. (Hint: $d(x_n, x)$ must eventually be
less than 1.)
Before proceeding note that the sequence (2^, ^), (2£, £), (2^, i ) , . . . in
U2 converges to (2,0) with respect to the usual metric but it does not
converge with respect to the discrete metric. So when talking about
convergence we ought to make it clear which metric is involved. As
mentioned earlier, unless we say otherwise we assume that the usual
metric is being used.
Now let us turn to the function spaces. Recall that $C(1, 2)$ is the set
of continuous functions $[1, 2] \to \mathbb{R}$ and that the metric which will
concern us most on this set is the 'max' metric. In this space consider
the sequence $x_1, x_2, x_3, \ldots$ given by
\[
x_1(t) = \frac{t}{1 + t},\ x_2(t) = \frac{2t}{2 + t},\ x_3(t) = \frac{3t}{3 + t},\ \ldots,\ x_n(t) = \frac{nt}{n + t},\ \ldots.
\]
Figure 2.6 illustrates the graphs of these functions.
[Fig. 2.6: graphs of $x_1, x_2, \ldots, x_6$ and of $x$ against $t$ for $1 \leq t \leq 2$.]
The graphs seem to suggest that this sequence of functions converges to the function $x$
given by $x(t) = t$. Let us check this:
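\[
\begin{aligned}
0 \leq d(x_n, x) &= \max\{|x_n(t) - x(t)| : 1 \leq t \leq 2\} \\
  &= \max\left\{\left|\frac{nt}{n + t} - t\right| : 1 \leq t \leq 2\right\} \\
  &= \max\left\{\left|\frac{nt - t(n + t)}{n + t}\right| : 1 \leq t \leq 2\right\} \\
  &= \max\left\{\frac{t^2}{n + t} : 1 \leq t \leq 2\right\} \\
  &< \max\left\{\frac{t^2}{n} : 1 \leq t \leq 2\right\} = \frac{4}{n} \to 0 \quad \text{as } n \to \infty.
\end{aligned}
\]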
Hence $d(x_n, x) \to 0$ as $n \to \infty$ and we have checked formally that $x_1,
x_2, x_3, \ldots$ does converge to $x$ in $C(1, 2)$. How could we have guessed
the limit without drawing the graphs? Fix $t$ and consider the values of
$x_1, x_2, x_3, \ldots$ at $t$, namely
\[
\frac{t}{1 + t},\ \frac{2t}{2 + t},\ \frac{3t}{3 + t},\ \ldots,\ \frac{nt}{n + t},\ \ldots.
\]
Since $t$ is fixed this is merely a sequence of real numbers and (since
$nt/(n + t) = t/(1 + t/n)$) it converges to $t$. So for each fixed coordinate
the functions converge to $t$, as shown in the above figure: that leads us
to try $x$ given by $x(t) = t$ as the limit. This is justified in general by the
following result:
Theorem 2.2 Let $x, x_1, x_2, x_3, \ldots$ be functions in $C(a, b)$ such that $x_1,
x_2, x_3, \ldots \to x$. Then, for any $t \in [a, b]$, $x_1(t), x_2(t), x_3(t), \ldots \to x(t)$
in $\mathbb{R}$.
Proof Since $d(x_n, x)$ is the maximum of all the $|x_n(t) - x(t)|$ for
$t \in [a, b]$ it follows that, for any particular $t$,
\[
|x_n(t) - x(t)| \leq d(x_n, x).
\]
So if $d(x_n, x) \to 0$ as $n \to \infty$, then $|x_n(t) - x(t)| \to 0$ as $n \to \infty$; i.e. if $x_1,
x_2, x_3, \ldots \to x$ in $C(a, b)$, then $x_1(t), x_2(t), x_3(t), \ldots \to x(t)$ in $\mathbb{R}$. $\square$
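The earlier check that $x_n(t) = nt/(n + t)$ converges to $x(t) = t$ in $C(1, 2)$ can be sketched numerically. The grid-based `max_distance` below is an assumption of this sketch and only approximates the 'max' metric; the printed values are compared with the bound $4/n$ derived in the text.

```python
def x_n(n, t):
    """n-th function of the sequence x_n(t) = nt / (n + t)."""
    return n * t / (n + t)

def max_distance(x, y, a, b, samples=1000):
    """Approximate max{|x(t) - y(t)| : t in [a, b]} on an evenly spaced grid."""
    grid = (a + (b - a) * i / samples for i in range(samples + 1))
    return max(abs(x(t) - y(t)) for t in grid)

for n in (1, 10, 100, 1000):
    d = max_distance(lambda t: x_n(n, t), lambda t: t, 1.0, 2.0)
    print(n, d, 4 / n)  # sampled distance, then the bound 4/n
```

In line with Theorem 2.2, the value of $x_n(t)$ at any single fixed $t$ is then within the printed distance of $t$.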
So the moral of that result is that if we are trying to decide whether a
sequence $x_1, x_2, x_3, \ldots$ in $C(a, b)$ converges, then for each $t \in [a, b]$
find the limit of the real sequence $x_1(t), x_2(t), x_3(t), \ldots$. (If it fails to
exist for some $t$, then the above theorem shows that $x_1, x_2, x_3, \ldots$ does
not converge in $C(a, b)$.) If the limit exists denote it by $x(t)$. The
collection of these $x(t)$s for $t \in [a, b]$ defines a new function
$x\colon [a, b] \to \mathbb{R}$ and we check whether $x \in C(a, b)$ and whether it is the
limit of $x_1, x_2, x_3, \ldots$.
Exercise 22 Let $x_1, x_2, x_3, \ldots$ in $C(0, 1)$ be defined by $x_1(t) = t$,
$x_2(t) = t^2$, $\ldots$, $x_n(t) = t^n$, $\ldots$. Find the limit of the real
sequence $x_1(t), x_2(t), x_3(t), \ldots$ for each $t \in [0, 1]$, and denote
this limit by $x(t)$. Show that the function defined in this way is
not continuous and is not, therefore, in $C(0, 1)$. (Hence $x_1, x_2,
x_3, \ldots$ does not converge in $C(0, 1)$.)
Exercise 23 Let $x_1, x_2, x_3, \ldots$ in $C(0, 1)$ be defined by
\[
x_n(t) = \frac{nt}{1 + n^2 t^2}.
\]
Show that $x_1(t), x_2(t), x_3(t), \ldots \to 0$ for each $t \in [0, 1]$. (Hence,
by the above theorem, if $x_1, x_2, x_3, \ldots \to x$ then $x(t) = 0$ for
each $t \in [0, 1]$.) But show that, for the usual 'max' metric on
$C(0, 1)$, if $x$ is the zero function then $d(x_n, x) = \tfrac{1}{2}$. Hence
confirm that $x_1, x_2, x_3, \ldots$ does not converge in $C(0, 1)$.
So, by the above theorem, convergence in $C(a, b)$ implies 'coordinatewise'
convergence of the functions. But, as Exercises 22 and 23 show,
coordinatewise convergence is not enough to ensure convergence in
$C(a, b)$. Coordinates merely enable us to define the coordinatewise
limit $x$ and then we must check whether $x$ is in $C(a, b)$ and is also the
limit with respect to the 'max' metric. The sequence of functions in
Exercise 23 is illustrated in Figure 2.7.
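The gap between coordinatewise convergence and convergence in the 'max' metric can be seen numerically. The sketch below assumes that the sequence of Exercise 23 is $x_n(t) = nt/(1 + n^2 t^2)$, the form recalled in Section 3.3, and samples $[0, 1]$ on a grid; the sample point $t = 0.5$ and the grid size are arbitrary choices.

```python
def x_n(n, t):
    """Assumed form of the Exercise 23 sequence: x_n(t) = nt / (1 + n^2 t^2)."""
    return n * t / (1 + n * n * t * t)

def max_distance_to_zero(n, samples=10000):
    """Approximate d(x_n, 0) in the 'max' metric on [0, 1] using a grid."""
    return max(abs(x_n(n, i / samples)) for i in range(samples + 1))

for n in (1, 10, 100, 1000):
    # Value at the fixed point t = 0.5 shrinks, but the maximum over [0, 1] does not.
    print(n, x_n(n, 0.5), max_distance_to_zero(n))
```

The second printed column tends to 0 while the third stays near $\tfrac{1}{2}$, which is the value taken at $t = 1/n$.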
It is clear that, in some way, the sequence of functions illustrated
below does not settle down to the zero function as neatly as the earlier
illustrated sequence settled down to the function $x(t) = t$. Convergence
in $C(a, b)$ with respect to the 'max' metric requires much more
than coordinatewise convergence. Readers interested in seeing this
stronger concept in more traditional approaches elsewhere will find it
referred to as uniform convergence.
[Fig. 2.7: graphs of the functions of Exercise 23 on $[0, 1]$. These functions converge coordinatewise to the zero function, but $x_1, x_2, x_3, \ldots$ does not converge to the zero function since $|x_n(1/n) - 0| = \tfrac{1}{2}$.]
We have now set up all the required machinery for asking whether
sequences converge. Before proceeding we have two pure mathematical
points to make to conclude this chapter. The first is that, given
a sequence $x_1, x_2, x_3, \ldots$, a subsequence of it is another sequence $x_{k_1},
x_{k_2}, x_{k_3}, \ldots$ obtained from the original by picking out the $k_1$st, $k_2$nd,
$k_3$rd terms, etc., where $k_1 < k_2 < k_3 < \cdots$. If a sequence converges in a
metric space then any subsequence of it converges with the same limit.
For if $d(x_n, x) \to 0$ as $n \to \infty$, then $d(x_{k_n}, x)$ (being $d(x_m, x)$ for some
$m \geq n$) $\to 0$ as $n \to \infty$. The second point we make is that we can talk
unambiguously of the limit of a convergent sequence in a metric
space, as we now confirm.
Theorem 2.3 If $x_1, x_2, x_3, \ldots \to x$ and $x_1, x_2, x_3, \ldots \to y$ in $(X, d)$, then
$x = y$.
Proof If $d(x_n, x) \to 0$ and $d(x_n, y) \to 0$ as $n \to \infty$, then
\[
0 \leq d(x, y) \leq d(x, x_n) + d(x_n, y) \to 0
\]
and so $d(x, y)$ must equal 0: hence $x = y$. $\square$
This result shows that a sequence can have at most one limit and so we
can talk unambiguously of the limit of a convergent sequence. The
proof is also a neat way of concluding this chapter, for it shows that by
our abstract approach we can give transparent proofs which work for
all metric spaces and which are no harder than the corresponding
proofs for R alone.
3
The three Cs
3.1 Iteration revisited
Let us return briefly to the idea of solving a real equation of
the form $x = f(x)$ by iterating with the function $f$. But now let us try to
solve the equation with the additional constraint that the root should
lie in some given set.
For example let us try to find a root of $x^3 - 10x^2 + 31x - 30 = 0$ with
$x > 2$. We will rearrange the equation as
\[
x = \frac{-x^3 + 10x^2 + 30}{31} = f(x)
\]
and iterate with $f$ starting at $x_1 = 2.5$ (which does satisfy $x_1 > 2$). Then
$x_2 = f(x_1) \approx 2.4798$, $x_3 = f(x_2) \approx 2.4595$, $x_4 = f(x_3) \approx 2.4392$, and so
on. In this way we get a convergent sequence, so its limit is a root of
the given equation. Also the terms of the sequence all satisfy $x_n > 2$. So
far, so good. But unfortunately the limit of the sequence turns out to
be 2, which does not satisfy the condition $x > 2$.
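The first iteration can be reproduced in a few lines. This is a minimal sketch (the helper `iterate` and the choice of 200 terms are assumptions made here), using ordinary floating-point arithmetic.

```python
def f(x):
    """Rearranged form x = f(x) of x^3 - 10x^2 + 31x - 30 = 0."""
    return (-x ** 3 + 10 * x ** 2 + 30) / 31

def iterate(g, x1, count):
    """Return the first `count` terms x1, g(x1), g(g(x1)), ..."""
    terms = [x1]
    while len(terms) < count:
        terms.append(g(terms[-1]))
    return terms

terms = iterate(f, 2.5, 200)
print([round(t, 4) for t in terms[:4]])  # [2.5, 2.4798, 2.4595, 2.4392]
print(all(t > 2 for t in terms))         # every computed term exceeds 2
print(terms[-1])                         # yet the terms approach 2
```

Every computed term lies in the set of numbers greater than 2, while the value they approach does not, which is exactly the difficulty described above.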
As another example let us try to find a rational (i.e. fractional) root
of $x^4 + 2x^3 - 3x^2 - 4x + 2 = 0$. We will rearrange this as
\[
x = \frac{x^4 + 2x^3 - 3x^2 + 2}{4} = f(x)
\]
and iterate with $f$ starting with $x_1 = 1$. This time, as we are looking for
a rational root, we will write the terms of the sequence as fractions. We
find that $x_2 = f(x_1) = \tfrac{1}{2}$, $x_3 = f(x_2) = \tfrac{25}{64}$, $x_4 = f(x_3) = \tfrac{28265057}{67108864}$ and so
on. In this way we get a convergent sequence so its limit is a root of the
given equation, and all the terms of the sequence are rationals. So far,
so good. But the limit of the sequence turns out to be $\sqrt{2} - 1$, which is
not rational, so again this technique has failed to find a root with the
additional constraint.
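The rational iteration can be followed exactly with Python's `fractions.Fraction`. This is a sketch under the assumption that the rearrangement is $f(x) = (x^4 + 2x^3 - 3x^2 + 2)/4$ with $x_1 = 1$; the floating-point loop and its 60 steps are added here only to estimate the limit.

```python
from fractions import Fraction
import math

def f(x):
    """Rearranged form x = f(x) of x^4 + 2x^3 - 3x^2 - 4x + 2 = 0."""
    return (x ** 4 + 2 * x ** 3 - 3 * x ** 2 + 2) / 4

# Exact rational terms x_1, x_2, x_3, x_4.
exact = [Fraction(1)]
for _ in range(3):
    exact.append(f(exact[-1]))
print(exact)

# Floating-point iteration to estimate the limit.
x = 1.0
for _ in range(60):
    x = f(x)
print(x, math.sqrt(2) - 1)
```

Each exact term is a fraction, yet the floating-point estimate of the limit agrees with $\sqrt{2} - 1$ to many decimal places.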
In the first example we were trying to find a root in the open interval
$]2, \infty[$: we found a sequence in the required set converging to a root,
but unfortunately the limit was outside the set. In the second example
we were trying to find a root in the set of rationals: we found a
sequence in the required set converging to a root, but unfortunately
the limit was again outside the set. For this method to work, enabling
us to find a root in the set $A$, say, we must first ensure that $A$ has the
property that for convergent sequences in $A$ their limits are also in $A$.
This rules out open intervals like $]0, 1[$ or $]2, \infty[$, but not closed ones.
We will look now at the corresponding property of a set A in an
arbitrary metric space: this brings us to closed sets, the first of our
three Cs.
3.2 Closed sets
Definition In a metric space $(X, d)$ the set $A \subseteq X$ is closed if
whenever $a_1, a_2, a_3, \ldots \in A$ and $a_1, a_2, a_3, \ldots \to a$ it follows that $a \in A$.
So, for example, the interval $[0, 1]$ is closed in $\mathbb{R}$. For if $a_1, a_2, a_3,
\ldots \in [0, 1]$ and $a_1, a_2, a_3, \ldots \to a$, then $a_1 \geq 0$, $a_2 \geq 0$, $a_3 \geq 0$, $\ldots$ and so
$a \geq 0$ (by a straightforward property of real sequences), and $a_1 \leq 1$,
$a_2 \leq 1$, $a_3 \leq 1$, $\ldots$ and so $a \leq 1$; hence $a \in [0, 1]$. Similarly, in $\mathbb{R}$ all
closed intervals are closed (surprise, surprise). However, the interval
$]0, 1[$ is not closed: e.g. $\tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{4}, \ldots$ are all in that set but their limit, 0, is
not. The set $\mathbb{Q}$ of rationals is not closed in $\mathbb{R}$, for, as we have seen, it is
possible to have a sequence of rationals converging to an irrational
limit.
Exercise 24 Let (X, d) be a metric space and let xeX. Show
that the set {x} is closed.
Exercise 25 Show that the set of irrational numbers is not
closed in $\mathbb{R}$.
Now let us look at some subsets of $\mathbb{R}^2$. Consider first the set $A =
\{(x, y) : x^2 + y^2 < 5\}$. The sequence $(0, 1), (\tfrac{1}{2}, 1\tfrac{1}{2}), (\tfrac{3}{4}, 1\tfrac{3}{4}), (\tfrac{7}{8}, 1\tfrac{7}{8}), \ldots$ of
points of $A$ converges but its limit $(1, 2)$ is not in $A$ since $1^2 + 2^2$ equals
5. Hence $A$ is not closed. On the other hand, the set $B = \{(x, y) :
x^2 + y^2 \leq 5\}$ is closed. For if $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots$ are points of
$B$ converging to $(x, y)$, then
\[
x_1^2 + y_1^2,\ x_2^2 + y_2^2,\ x_3^2 + y_3^2,\ \ldots \to x^2 + y^2,
\]
where each term on the left is at most 5,
and so $x^2 + y^2 \leq 5$. Hence the limit $(x, y)$ is in $B$ and the set $B$ is closed.
These sets are illustrated in Figure 3.1.
[Fig. 3.1: the set $A = \{(x, y) : x^2 + y^2 < 5\}$ (not closed) and the set $B = \{(x, y) : x^2 + y^2 \leq 5\}$ (closed).]
Another way of writing $B$ is as $f^{-1}(]-\infty, 5])$ where $f\colon \mathbb{R}^2 \to \mathbb{R}$ is
given by $f(x, y) = x^2 + y^2$. In other words, $B$ consists of all those $(x, y)$
for which $f(x, y)$ lies in $]-\infty, 5]$. Our verification above that $B$ was
closed depended upon two facts. Firstly we used the property that if
\[
(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots \to (x, y),
\]
then
\[
f(x_1, y_1) = x_1^2 + y_1^2,\ f(x_2, y_2) = x_2^2 + y_2^2,\ f(x_3, y_3) = x_3^2 + y_3^2,\ \ldots \to f(x, y) = x^2 + y^2.
\]
Secondly we used the fact that if a convergent sequence lies in
$]-\infty, 5]$, then so does its limit. The former property is simply the
continuity of $f$ (informally, that if $(x', y')$ is close to $(x, y)$ then $f(x', y')$
is close to $f(x, y)$) and the latter property is, of course, simply that
$]-\infty, 5]$ is closed.
Our next theorem will generalise that result and enable us to find
many closed sets in $\mathbb{R}^2$. For this we will need the concept of a
continuous function $f$ of two variables. A function $f\colon \mathbb{R}^2 \to \mathbb{R}$ is
continuous if whenever
\[
(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots \to (x, y)
\]
it follows that
\[
f(x_1, y_1), f(x_2, y_2), f(x_3, y_3), \ldots \to f(x, y).
\]
(This formal property need not concern us unduly: most uncontrived
functions $\mathbb{R}^2 \to \mathbb{R}$ are continuous.)
Theorem 3.1 Let $f\colon \mathbb{R}^2 \to \mathbb{R}$ be continuous and let $A$ be closed in $\mathbb{R}$.
Then $f^{-1}(A)$ ($= \{(x, y) : f(x, y) \in A\}$) is closed in $\mathbb{R}^2$.
Proof We must consider a convergent sequence in $f^{-1}(A)$ and show
that the limit is also in $f^{-1}(A)$. So let $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots$ be in
$f^{-1}(A)$ with
\[
(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots \to (x, y).
\]
Then, by the continuity of $f$,
\[
f(x_1, y_1), f(x_2, y_2), f(x_3, y_3), \ldots \to f(x, y),
\]
where each term $f(x_n, y_n)$ lies in $A$.
But $A$ is a closed set and therefore it contains the limit $f(x, y)$. Hence
$(x, y) \in f^{-1}(A)$ and we have shown that every convergent sequence in
$f^{-1}(A)$ has its limit in $f^{-1}(A)$; i.e. $f^{-1}(A)$ is closed as claimed. $\square$
Exercise 26 Show that the set {(x, y): xy ^ 1} is closed in U2 .
Exercise 27 Show that all straight lines in $\mathbb{R}^2$ are closed. (You
may recall from Exercise 24 that the set $\{0\}$ is closed in $\mathbb{R}$ and
then express a straight line in the form $f^{-1}(\{0\}) = \{(x, y) :
f(x, y) = 0\}$ for some suitably chosen continuous function $f$.)
Exercise 28 Let $x_1, x_2, x_3, \ldots \to x$ in $C(a, b)$ and let $x_1, x_2, x_3,
\ldots$ all be functions from $[a, b]$ to $[c, d]$. Use Theorem 2.2 to
show that, for each $t \in [a, b]$, $x(t) \in [c, d]$. Deduce that the set
of continuous functions $[a, b] \to [c, d]$ is a closed set in
$C(a, b)$.
Theorem 3.1 is a special case of a very general result concerning
continuous functions and closed sets. The general result need not
concern the reader only interested in our central theme of iteration,
but for those readers who wish to use metric spaces to obtain a better
understanding of analysis a fuller version of Theorem 3.1 will appear
in the final chapter.
Exercise 29 Show that the quadrant $\{(x, y) : x^2 + y^2 \leq 1,\ x \geq 0
\text{ and } y \geq 0\}$ is closed in $\mathbb{R}^2$.
There are various ways of tackling Exercise 29, one being from the
basic definition. In that case let $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots \to (x, y)$
with each $(x_n, y_n)$ satisfying $x_n^2 + y_n^2 \leq 1$, $x_n \geq 0$ and $y_n \geq 0$ and deduce
that $x^2 + y^2 \leq 1$, $x \geq 0$ and $y \geq 0$. A more far-reaching way of showing
that the given quadrant is closed is to note that
\[
\{(x, y) : x^2 + y^2 \leq 1,\ x \geq 0 \text{ and } y \geq 0\}
= \{(x, y) : x^2 + y^2 \leq 1\} \cap \{(x, y) : x \geq 0\} \cap \{(x, y) : y \geq 0\}.
\]
Each of the three sets on the right is closed (by Theorem 3.1) and, as we
see now, any intersection of closed sets is closed.
Theorem 3.2 Any intersection of closed sets in a metric space is itself
closed.
Proof Let $A$ be the intersection of any number of closed sets in a
metric space and let $a_1, a_2, a_3, \ldots$ be points of $A$ with $a_1, a_2, a_3, \ldots \to a$.
Then to show that $A$ is closed it remains to prove that $a \in A$. But since
$a_1, a_2, a_3, \ldots$ are in the intersection of all the closed sets in question,
they are in each of the closed sets. So the limit $a$ is also in each of the
closed sets. Hence $a$ is in the intersection of all these closed sets,
namely $A$. So we have shown that every convergent sequence in $A$ has
its limit in $A$; i.e. that $A$ is closed. $\square$
Exercise 30 A half-plane in $\mathbb{R}^2$ is a set of the form $\{(x, y) :
ax + by \leq c\}$ where $a$, $b$ and $c$ are real constants with not both
$a$ and $b$ zero. Show that half-planes are closed in $\mathbb{R}^2$ and
deduce that all triangles are closed in $\mathbb{R}^2$. The keen reader
might like to deduce that all polygons are closed.
Exercise 31 Let $A \subseteq C(0, 1)$ consist of those functions $x$ in
$C(0, 1)$ with $x(0) = 0$. Use Theorem 2.2 to show that if $x_1, x_2,
x_3, \ldots$ are in $A$ and $x_1, x_2, x_3, \ldots \to x$, then $x \in A$; i.e. that $A$ is
closed. Deduce that the set
\[
\{x \in C(0, 1) : x(0) = 0,\ x(\tfrac{1}{100}) = 1,\ x(\tfrac{1}{50}) = 2,\ \ldots,\ x(\tfrac{99}{100}) = 99 \text{ and } x(1) = 100\}
\]
is closed.
Now let us look at a problem similar to that in Exercise 29 but with
'and' replaced by 'or'. Let us try to show that the set
\[
\{(x, y) : x^2 + y^2 \leq 1 \text{ or } x \geq 0 \text{ or } y \geq 0\}
\]
is closed in $\mathbb{R}^2$ (Figure 3.2). Consider any sequence $(x_1, y_1), (x_2, y_2),
(x_3, y_3), \ldots$ in this set with $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots \to (x, y)$. Since
there is an infinite number of terms in the sequence and each satisfies
at least one of the three conditions $x_n^2 + y_n^2 \leq 1$, $x_n \geq 0$ or $y_n \geq 0$, we
must be able to pick out an infinite number of the $(x_n, y_n)$s which all
satisfy the same condition. (Imagine sorting the $(x_1, y_1), (x_2, y_2),
(x_3, y_3), \ldots$ into three piles: the first pile consists of those satisfying
$x_n^2 + y_n^2 \leq 1$, the second pile consists of those of the remaining ones
satisfying $x_n \geq 0$, and the third pile is all the rest, which must satisfy
$y_n \geq 0$. At least one of these piles must be infinite.) Assume for example
that an infinite number of the terms satisfy $x_n^2 + y_n^2 \leq 1$; pick out a
subsequence with this property: $(x_{k_1}, y_{k_1}), (x_{k_2}, y_{k_2}), (x_{k_3}, y_{k_3}), \ldots$, say. This
sequence also converges to $(x, y)$ and so this limit satisfies $x^2 + y^2 \leq 1$:
hence $(x, y)$ is in the set
\[
\{(x, y) : x^2 + y^2 \leq 1 \text{ or } x \geq 0 \text{ or } y \geq 0\}.
\]
A similar argument would have worked if some subsequence had
satisfied one of the other conditions. So, in general, any convergent
sequence in the given set has its limit in the set; i.e.
$\{(x, y) : x^2 + y^2 \leq 1 \text{ or } x \geq 0 \text{ or } y \geq 0\}$ is closed.
Again, that is only a particular case of the following general result.
Theorem 3.3 Let $A_1, A_2, \ldots, A_k$ be closed sets in a metric space. Then
$A_1 \cup A_2 \cup \ldots \cup A_k$ is closed.
Proof Let $A = A_1 \cup A_2 \cup \ldots \cup A_k$ and let $a_1, a_2, a_3, \ldots$ be in $A$ with
$a_1, a_2, a_3, \ldots \to a$. We must show that $a \in A$. But some $A_i$ must contain
an infinite number of the $a_1, a_2, a_3, \ldots$ (otherwise $A_1, A_2, \ldots, A_k$
would only contain in total a finite number of the $a_1, a_2, a_3, \ldots$). So
assume that the subsequence $a_{k_1}, a_{k_2}, a_{k_3}, \ldots$ lies in $A_i$ (which we know
is closed). Also $a_{k_1}, a_{k_2}, a_{k_3}, \ldots \to a$ and so $a \in A_i$. Hence $a \in A_1 \cup A_2 \cup
\ldots \cup A_k = A$ and we have shown that every convergent sequence in $A$
has its limit in $A$; i.e. $A$ is closed. $\square$
Exercise 32 Show that the set
\[
\{(x, y) : y = nx \text{ for some integer } n \text{ between 1 and 100}\}
\]
is closed in $\mathbb{R}^2$.
Exercise 33 Show that the set
\[
\{x \in C(0, 100) : x(t) = t \text{ for some integer } t\}
\]
is closed in $C(0, 100)$.
In each of those exercises the given set could be expressed as a union
of a finite number of sets which we already know to be closed (from
Exercise 27 and the first part of Exercise 31).
The reader may have noticed a subtle difference between Theorems
3.2 and 3.3: the former concerned any intersection of closed sets, but
the latter only concerned unions of a finite number of closed sets. The
following exercise shows that Theorem 3.3 cannot be extended to
arbitrary unions of closed sets.
Exercise 34 Let $A \subseteq \mathbb{R}^2$ be given by
\[
A = \{(x, y) : y = nx \text{ for some integer } n\}.
\]
Show that $A$ is a union of straight lines (and hence of closed
sets). But show that $A$ is not closed by finding a sequence of
points in $A$ converging to $(0, 1)$ (which is not in $A$).
That concludes our basic introduction to closed sets. We have only
included enough to give the reader a general idea of the concept as far
as it is needed in our subsequent applications to iterative processes. In
Chapter 5, our optional pure mathematical chapter, we shall return to
this concept and see its role in analysis.
The idea of a closed set was a most natural one in our approach via
iteration and sequences. But if we had approached the subject of
metric spaces from a different standpoint (perhaps by generalising $\varepsilon$-$\delta$
arguments from real analysis, or as a preparation for the even more
general subject of 'topology') then we might have more naturally
come across the concept of an 'open' set. Open sets are precisely the
complements of closed sets. There is no real need for both concepts
because all theorems about open sets can be stated in terms of closed
sets. So we shall restrict attention to closed sets for the moment, and
return to open sets in the final chapter.
3.3 An internal test for convergence
Given a sequence is it possible to tell by looking at its terms
whether it converges or not? In other words, can we find a test for
convergence which does not need prior knowledge of the limit?
The real sequence
\[
\tfrac{1}{2}, \tfrac{3}{4}, \tfrac{7}{8}, \tfrac{15}{16}, \ldots
\]
certainly seems to be converging, but then perhaps so does
Is there some test which we can apply to these terms to decide whether
they converge or not? Similarly, in $C(0, 1)$ given the sequence $x_1, x_2,
x_3, \ldots$ defined by
\[
x_1(t) = \frac{t}{1 + t},\ x_2(t) = \frac{2t}{2 + t},\ x_3(t) = \frac{3t}{3 + t},\ \ldots
\]
or the sequence $y_1, y_2, y_3, \ldots$ defined by
\[
y_1(t) = \frac{t}{1 + t^2},\ y_2(t) = \frac{2t}{1 + 4t^2},\ y_3(t) = \frac{3t}{1 + 9t^2},\ \ldots
\]
is there some way of looking at the terms and deciding whether the
sequences are convergent?
The only method available to us so far is to guess the limit and then
check that the terms do get arbitrarily close to that limit. For example,
we would guess that
\[
x_1 = \tfrac{1}{2},\ x_2 = \tfrac{3}{4},\ x_3 = \tfrac{7}{8},\ \ldots
\]
converges to 1; we would then check formally that
\[
|x_n - 1| = \frac{1}{2^n} \to 0
\]
as $n \to \infty$.
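But perhaps we can test the convergence of a sequence by looking at
the distances apart of the terms in the sequence without ever having to
consider the limit at all. In this example, for $m \ge n$ the distance
between terms is given by
$$|x_m - x_n| = \left|\left(1 - \frac{1}{2^m}\right) - \left(1 - \frac{1}{2^n}\right)\right| = \frac{1}{2^n} - \frac{1}{2^m} \le \frac{1}{2^n},$$
which tends to zero as $m$ and $n$ get large. However, for the divergent
sequence
$$y_1 = 1,\quad y_2 = 1 + \tfrac{1}{2},\quad y_3 = 1 + \tfrac{1}{2} + \tfrac{1}{3},\quad \ldots$$
the distance from the $n$th to the $2n$th term is given by
$$|y_{2n} - y_n| = \left(1 + \tfrac{1}{2} + \cdots + \tfrac{1}{2n}\right) - \left(1 + \tfrac{1}{2} + \cdots + \tfrac{1}{n}\right) = \frac{1}{n+1} + \frac{1}{n+2} + \cdots + \frac{1}{2n} \ge \underbrace{\frac{1}{2n} + \cdots + \frac{1}{2n}}_{n\ \text{times}} = \frac{1}{2},$$
and so here it is not the case that $|y_m - y_n| \to 0$ as $m$ and $n$ get large.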
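The contrast between these two sequences can be checked by machine. The following Python sketch is an illustration added here, not part of the original argument; the function names `x` and `y` are chosen for this example only, and exact rational arithmetic is used so that rounding plays no part.

```python
from fractions import Fraction

def x(n):
    """n-th term of 1/2, 3/4, 7/8, ...: x_n = 1 - 1/2^n."""
    return 1 - Fraction(1, 2**n)

def y(n):
    """n-th partial sum of the harmonic series: y_n = 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))

# Compare the gap between the n-th and 2n-th terms of each sequence.
for n in (5, 10, 20):
    gap_x = abs(x(2 * n) - x(n))   # shrinks like 1/2^n
    gap_y = abs(y(2 * n) - y(n))   # never drops below 1/2
    print(n, float(gap_x), float(gap_y))
```

The first gap collapses towards zero while the second stays at or above 1/2, which is the behaviour the inequalities above predict.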
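In $C(0,1)$ the convergent sequence $x_1, x_2, x_3, \ldots$ given by
$$x_1(t) = \frac{t}{1+t},\quad x_2(t) = \frac{2t}{2+t},\quad x_3(t) = \frac{3t}{3+t},\quad \ldots$$
also has the property that the distance between the $m$th and $n$th terms
tends to zero as $m$ and $n$ get large. For if $m \ge n$ then
$$d(x_m, x_n) = \left|\frac{m t_0}{m + t_0} - \frac{n t_0}{n + t_0}\right| \quad\text{for some } t_0 \in [0,1]$$
$$= \left|\frac{(m-n)\,t_0^2}{(m + t_0)(n + t_0)}\right| \le \frac{t_0^2}{n + t_0} \le \frac{1}{n} \to 0.$$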
However, as we saw in Chapter 2, the sequence given by
$$y_1(t) = \frac{t}{1+t^2},\quad y_2(t) = \frac{2t}{1+4t^2},\quad y_3(t) = \frac{3t}{1+9t^2},\quad \ldots$$
diverges in $C(0,1)$, and it turns out that this sequence fails to have the
property that $d(y_m, y_n) \to 0$ as $m, n \to \infty$. For
$$d(y_{3n}, y_n) = \max\{|y_{3n}(t) - y_n(t)| : 0 \le t \le 1\} \ge |y_{3n}(1/n) - y_n(1/n)|$$
$$= \left|\frac{3n \cdot 1/n}{1 + (3n)^2 (1/n)^2} - \frac{n \cdot 1/n}{1 + n^2 (1/n)^2}\right| = \left|\frac{3}{10} - \frac{1}{2}\right| = \frac{1}{5}.$$
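The two function sequences can be compared numerically as well. The sketch below is an added illustration under stated assumptions: it takes $x_n(t) = nt/(n+t)$ and $y_n(t) = nt/(1+n^2t^2)$, and it estimates the maximum distance on a finite grid of sample points, so the helper `sup_dist` (a name invented here) only ever returns a lower estimate of the true distance in $C(0,1)$.

```python
def x(n, t):
    """x_n(t) = n t / (n + t), the convergent sequence."""
    return n * t / (n + t)

def y(n, t):
    """y_n(t) = n t / (1 + n^2 t^2), the divergent sequence."""
    return n * t / (1 + (n * t) ** 2)

def sup_dist(f, g, samples=10001):
    """Largest |f(t) - g(t)| over an evenly spaced grid on [0, 1].

    This is a lower estimate of the max metric, since only grid points are tried.
    """
    grid = (i / (samples - 1) for i in range(samples))
    return max(abs(f(t) - g(t)) for t in grid)

for n in (10, 100):
    dx = sup_dist(lambda t: x(2 * n, t), lambda t: x(n, t))
    dy = sup_dist(lambda t: y(3 * n, t), lambda t: y(n, t))
    print(n, dx, dy)
```

The first column of distances shrinks as $n$ grows, while the second stays above 1/5, in line with the estimate just derived.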
So perhaps we have reason to suspect that a sequence $x_1, x_2, x_3, \ldots$
converges in a metric space $(X, d)$ if and only if $d(x_m, x_n) \to 0$ as
$m, n \to \infty$. Certainly convergent sequences do have this property, as we now
see.
Theorem 3.4 If $x_1, x_2, x_3, \ldots$ is a convergent sequence in a metric
space $(X, d)$, then $d(x_m, x_n) \to 0$ as $m, n \to \infty$.
Proof Assume that the limit of the sequence is $x$. Then
$$0 \le d(x_m, x_n) \le d(x_m, x) + d(x, x_n) \to 0 + 0 = 0 \quad\text{as } m, n \to \infty.$$
Hence $d(x_m, x_n) \to 0$ as $m, n \to \infty$. □
Exercise 35 Let x 1? x 2 , x 3 , . . . in C(0,1) be given by
t2 It2 3t 2
x
l(t) = -< T* X2{t) = - ^"T» X3(t) = - ^TT>
2 3 W 2
IV 1 + r 2> 2V / 1 + 2 t ' l+3t
By considering t = l/« show that d{xni^lni)^\ and use
Theorem 3.4 to deduce that this sequence is not convergent.
What about the converse of Theorem 3.4? If a sequence $x_1, x_2, x_3, \ldots$
in a metric space $(X, d)$ has the property that $d(x_m, x_n) \to 0$ as
$m, n \to \infty$, can we be sure that the sequence converges? Consider the metric
space $(X, d)$ where $X$ is the set of rationals and $d$ is the usual real
distance. Let $x_1, x_2, x_3, \ldots$ be given by
$$x_1 = 1 \quad\text{and}\quad x_{n+1} = \frac{1}{x_n} + \frac{x_n}{2} \quad\text{for } n \ge 1.$$
Then
$$x_1 = 1,\quad x_2 = 1 + \tfrac{1}{2} = \tfrac{3}{2},\quad x_3 = \tfrac{2}{3} + \tfrac{3}{4} = \tfrac{17}{12},\quad x_4 = \tfrac{577}{408},\quad \ldots$$
and so we get a sequence of real numbers. This is quite a well-known
sequence used to find $\sqrt{2}$ by iterative techniques for, as we can check
with a calculator, this sequence seems to converge to $\sqrt{2}$. So, by
Theorem 3.4, $d(x_m, x_n) = |x_m - x_n| \to 0$ as $m, n \to \infty$. But now we are
interested in $X$, the set of rational numbers. The members of the
sequence are all rationals and so we have a sequence in $X$ with
$d(x_m, x_n) \to 0$ as $m, n \to \infty$. However, the sequence does not converge in
$X$ since there is no $x \in X$ with $x_1, x_2, x_3, \ldots \to x$. This example may
seem a little contrived, but in general a metric space $(X, d)$ may consist
of a set like the rationals which can have a sequence $x_1, x_2, x_3, \ldots$ with
$d(x_m, x_n) \to 0$ as $m, n \to \infty$ for which there exists no $x \in X$ with
$x_1, x_2, x_3, \ldots \to x$.
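The rational example is easy to reproduce exactly. The sketch below is an added illustration using Python's `fractions` module; the function name `sqrt2_iterates` is hypothetical. It applies the rule $x_{n+1} = 1/x_n + x_n/2$ starting from $x_1 = 1$, so every term is held as an exact fraction.

```python
from fractions import Fraction

def sqrt2_iterates(count):
    """Return the first `count` terms of x_1 = 1, x_{n+1} = 1/x_n + x_n/2."""
    terms = [Fraction(1)]
    while len(terms) < count:
        x = terms[-1]
        terms.append(1 / x + x / 2)   # stays rational at every step
    return terms

terms = sqrt2_iterates(6)
for n, x in enumerate(terms, start=1):
    # x*x - 2 is never exactly zero, because no rational squares to 2.
    print(n, x, float(x * x - 2))
```

Successive terms crowd together very quickly, yet no term ever squares to exactly 2: the only candidate limit is missing from the rationals.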
In our results concerning iteration in a metric space we are not
going to be able to allow our spaces to have 'missing limits' in that
sense; and this idea leads us on to the second of our three Cs.
3.4 Complete sets
Definition The set $A \subseteq X$ is complete in the metric space $(X, d)$
if whenever $a_1, a_2, a_3, \ldots$ is a sequence in $A$ with $d(a_m, a_n) \to 0$ as
$m, n \to \infty$, it follows that the sequence is convergent with its limit in $A$.
So the complete sets are precisely those for which the $d(a_m, a_n) \to 0$
condition is sufficient to ensure convergence. Our above observations
show that $\mathbb{Q}$, the set of rationals, is not complete in $\mathbb{R}$. But we will now
see that $\mathbb{R}$ itself is complete (with the usual metric). This is the one
point in our development of the subject where we ought to pause to
question the assumptions which we make about $\mathbb{R}$.
A sequence $x_1, x_2, x_3, \ldots$ is bounded above if there exists a number $u$
with $x_n \le u$ for all $n$. Similarly, it is bounded below if there exists a
number $l$ with $l \le x_n$ for all $n$. It is bounded if it is bounded above and
below. In other words, naturally enough, a sequence is bounded if it
lies entirely between two 'bounds'. A common assumption in real
analysis is that a sequence which is increasing and which is bounded
above must be convergent. From this it can be proved easily that a
bounded sequence has a convergent subsequence. (The reader
interested in these results can find proofs, for example, in Copson's or
Sutherland's books listed on page 103.) So we shall assume without
proof that any bounded real sequence has a convergent subsequence.
We are now able to deduce the completeness of $\mathbb{R}$.
Theorem 3.5 $\mathbb{R}$ is complete.
Proof Let $a_1, a_2, a_3, \ldots$ be a sequence in $\mathbb{R}$ with the property that
$|a_m - a_n| \to 0$ as $m, n \to \infty$. Then in particular there exists a positive
integer $N$ with $|a_m - a_n| < 1$ for $m, n \ge N$. But then the whole sequence
lies in the set
$$\{a_1, a_2, \ldots, a_{N-1}\} \cup [a_N - 1, a_N + 1]$$
and so the sequence is bounded. Hence there is a convergent
subsequence $a_{k_1}, a_{k_2}, a_{k_3}, \ldots \to a$, say. But then
$$0 \le |a_n - a| \le |a_n - a_{k_m}| + |a_{k_m} - a|.$$
The first term on the right tends to 0 since $|a_n - a_m| \to 0$ for large
$n$ and $m$, and the second tends to 0 since $a_{k_m} \to a$,
and so $|a_n - a| \to 0$ as $n \to \infty$; i.e. $a_1, a_2, a_3, \ldots \to a$. Therefore the
condition $|a_m - a_n| \to 0$ as $m, n \to \infty$ ensures the convergence of
$a_1, a_2, a_3, \ldots$ in $\mathbb{R}$; i.e. $\mathbb{R}$ is complete. □
Sequences for which $d(a_m, a_n) \to 0$ as $m, n \to \infty$ are often referred to as
Cauchy sequences and Theorem 3.5 will be found in books on real
analysis as the 'Cauchy criterion for convergence'.
Which subsets of $\mathbb{R}$ are complete? The interval $[0,1]$ is complete.
To see this let $a_1, a_2, a_3, \ldots$ be in $[0,1]$ with the property that
$|a_m - a_n| \to 0$ as $m, n \to \infty$. But then, of course, $a_1, a_2, a_3, \ldots$ are in $\mathbb{R}$
and $|a_m - a_n| \to 0$ as $m, n \to \infty$. Therefore, by Theorem 3.5, the
sequence is convergent in $\mathbb{R}$: $a_1, a_2, a_3, \ldots \to a$, say. But since $[0,1]$ is
closed and contains $a_1, a_2, a_3, \ldots$ it also contains the limit $a$. So
whenever $a_1, a_2, a_3, \ldots$ are in $[0,1]$ and $|a_m - a_n| \to 0$ as $m, n \to \infty$ it
follows that the sequence converges to a limit in $[0,1]$; i.e. $[0,1]$ is
complete.
The reader will probably see that we have only used the completeness
of $\mathbb{R}$ and the closedness of $[0,1]$ to establish the completeness of
$[0,1]$ in the above argument.
Theorem 3.6 If $A \subseteq B$ in a metric space and $A$ is closed and $B$ is
complete, then $A$ itself is complete.
Proof Let $a_1, a_2, a_3, \ldots$ be in $A$ (and hence in $B$) with the property that
$d(a_m, a_n) \to 0$ as $m, n \to \infty$. Then, by the completeness of $B$,
$a_1, a_2, a_3, \ldots \to b$ for some $b \in B$. But as $a_1, a_2, a_3, \ldots$ are all contained in the
closed set $A$, it also contains the limit, and so it is complete as
required. □
So all closed subsets of IR are complete. Are there any others? The set
]1,2[ is not complete because the sequence ax = 1^, a2 = l i , a 3 = 1 i,
... certainly lies in the set $]1,2[$, has the property that $|a_m - a_n| \to 0$ as
$m, n \to \infty$ and yet is not convergent to a point of the set. Again this is a
particular case of a general result.
Theorem 3.7 Let $A$ be a complete set in a metric space. Then $A$ is
closed.
Proof We will assume that $A$ is not closed and show that it is not
complete. So let $a_1, a_2, a_3, \ldots$ be a convergent sequence in $A$ with its
limit, $a$, not in $A$. Then, by Theorem 3.4, $d(a_m, a_n) \to 0$ as $m, n \to \infty$.
But $a_1, a_2, a_3, \ldots$ does not converge to a point of $A$ and so $A$ is not
complete. □
The results above show therefore that a subset of $\mathbb{R}$ is complete if and
only if it is closed. In general, if $X$ itself is complete (in which case
$(X, d)$ is called a complete space) then the concepts of closedness and
completeness coincide in the metric space $(X, d)$.
Exercise 36 Let $P \subseteq C(0,1)$ be the set of all polynomials; i.e.
functions of the form $x(t) = c_0 + c_1 t + c_2 t^2 + \cdots + c_n t^n$ for
some real constants $c_0, c_1, c_2, \ldots, c_n$. Show that $P$ is not
closed and hence not complete in $C(0,1)$.
The fact that $\mathbb{R}$ is complete enables us to deduce that $\mathbb{R}^2$, $\mathbb{C}$ and $\mathbb{R}^n$ are
all complete.
Theorem 3.8 $\mathbb{R}^2$ is complete.
Proof Let $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots$ be such that
$$d((x_m, y_m), (x_n, y_n)) \to 0 \quad\text{as } m, n \to \infty.$$
But then, since
$$0 \le |x_m - x_n| \le \left[(x_m - x_n)^2 + (y_m - y_n)^2\right]^{1/2} = d((x_m, y_m), (x_n, y_n)) \to 0 \quad\text{as } m, n \to \infty,$$
it follows that $|x_m - x_n| \to 0$ as $m, n \to \infty$. So, by the completeness of $\mathbb{R}$,
the sequence $x_1, x_2, x_3, \ldots$ converges to $x$, say. Similarly
$y_1, y_2, y_3, \ldots \to y$, say. Then by Theorem 2.1
$$(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots \to (x, y)$$
and the completeness of $\mathbb{R}^2$ is established. □
Exercise 37 Show that $\mathbb{C}$ is complete. (You may recall from
Chapter 2 that the distance from $x + yi$ to $x' + y'i$ in $\mathbb{C}$ is the
same as the distance from $(x, y)$ to $(x', y')$ in $\mathbb{R}^2$.)
Exercise 38 Use a similar procedure to that in the proof of
Theorem 3.8 to show that $\mathbb{R}^n$ is complete.
Exercise 39 Use Exercise 21 to show that every subset of a
discrete space (X, d) is closed. Show also that X is itself
complete and hence that every subset of X is complete.
Exercise 40 Let $d$, $d'$ be metrics on a set $X$ for which there
exist positive constants $\alpha$ and $\beta$ with
$$\alpha\, d'(x, y) \le d(x, y) \le \beta\, d'(x, y)$$
for all $x, y \in X$. Show that
(i) for a sequence $x_1, x_2, x_3, \ldots$ in $X$, $d(x_n, x_m) \to 0$ as $m, n \to \infty$ if
and only if $d'(x_n, x_m) \to 0$ as $m, n \to \infty$;
(ii) for $x, x_1, x_2, x_3, \ldots$ in $X$, $x_1, x_2, x_3, \ldots \to x$ in $(X, d)$ if and only
if $x_1, x_2, x_3, \ldots \to x$ in $(X, d')$;
(iii) $X$ is complete with respect to $d$ if and only if it is complete
with respect to $d'$.
In Exercise 13 we introduced the metric defined on $\mathbb{R}^n$ by
$$d'(x, y) = d'((x_1, \ldots, x_n), (y_1, \ldots, y_n)) = \max\{|x_1 - y_1|, \ldots, |x_n - y_n|\}.$$
This metric space will feature in one of our applications, and again we
shall need the fact that it is complete.
Exercise 41 Let $d'$ be the 'max' metric on $\mathbb{R}^n$ recalled above
and let $d$ be the usual metric on $\mathbb{R}^n$. Show that
$$d'(x, y) \le d(x, y) \le n^{1/2}\, d'(x, y)$$
for all $x, y \in \mathbb{R}^n$. Deduce from Exercise 40 that $\mathbb{R}^n$ is complete
with respect to the metric $d'$.
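Before attempting the proof, the inequality in Exercise 41 can be spot-checked numerically. The sketch below is an added illustration, not a proof; the helper names `max_metric`, `usual_metric` and `bounds_hold` are invented for this example, and a small tolerance is allowed for floating-point rounding.

```python
import math
import random

def max_metric(x, y):
    """d'(x, y): the largest coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))

def usual_metric(x, y):
    """d(x, y): the usual (Euclidean) distance in R^n."""
    return math.dist(x, y)

def bounds_hold(x, y, tol=1e-12):
    """Check d' <= d <= sqrt(n) d' for one pair of points, up to rounding."""
    n = len(x)
    d_max, d_usual = max_metric(x, y), usual_metric(x, y)
    return d_max <= d_usual + tol and d_usual <= math.sqrt(n) * d_max + tol

rng = random.Random(0)  # fixed seed so the run is repeatable
pairs = [([rng.uniform(-5, 5) for _ in range(4)],
          [rng.uniform(-5, 5) for _ in range(4)]) for _ in range(1000)]
print(all(bounds_hold(x, y) for x, y in pairs))
```

The right-hand bound is attained when every coordinate differs by the same amount, for example between $(0,0,0)$ and $(1,1,1)$.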
Our principal applications will be in the function spaces C(a, b) and it
will be crucial to us that these spaces are complete. We now give an
outline proof of that fact.
Theorem 3.9 $C(a, b)$ is complete.
Proof Let $x_1, x_2, x_3, \ldots$ in $C(a, b)$ be such that $d(x_m, x_n) \to 0$ as
$m, n \to \infty$. Then for each $t \in [a, b]$
$$0 \le |x_m(t) - x_n(t)| \le d(x_m, x_n) \to 0 \quad\text{as } m, n \to \infty$$
and so by Theorem 3.5 the sequence $x_1(t), x_2(t), x_3(t), \ldots$ converges,
to $x(t)$, say. Doing this for each $t \in [a, b]$ defines a function
$x\colon [a, b] \to \mathbb{R}$. We need to show that $x \in C(a, b)$ and that $d(x_n, x) \to 0$
as $n \to \infty$.
To show that $x \in C(a, b)$ we have to show that it is continuous. So
let $t_1, t_2, t_3, \ldots \to t$ in $[a, b]$: we will try to deduce that
$x(t_1), x(t_2), x(t_3), \ldots \to x(t)$. The continuity of each of the
$x_1, x_2, x_3, \ldots$ enables us to draw the following chart of limits.
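$$\begin{array}{ccccccc}
x_1(t_1) & x_1(t_2) & x_1(t_3) & \cdots & x_1(t_n) & \cdots \to & x_1(t) \\
x_2(t_1) & x_2(t_2) & x_2(t_3) & \cdots & x_2(t_n) & \cdots \to & x_2(t) \\
x_3(t_1) & x_3(t_2) & x_3(t_3) & \cdots & x_3(t_n) & \cdots \to & x_3(t) \\
\vdots & \vdots & \vdots & & \vdots & & \vdots \\
x_m(t_1) & x_m(t_2) & x_m(t_3) & \cdots & x_m(t_n) & \cdots \to & x_m(t) \\
\downarrow & \downarrow & \downarrow & & \downarrow & & \downarrow \\
x(t_1) & x(t_2) & x(t_3) & \cdots & x(t_n) & \cdots \to & x(t)
\end{array}$$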
Since this is only meant to be an outline proof we hope the reader will
find it plausible from that chart that $x(t_1), x(t_2), x(t_3), \ldots \to x(t)$. For if
we choose $m$ and $n$ large enough, then
$$x(t_n) \approx x_m(t_n) \approx x_m(t) \approx x(t).$$
(The analytical reader will, rightly, want a better verification that $x$
is continuous. So, for readers with an understanding of ε-methods,
here is a formal verification.
Let $\varepsilon > 0$ and let $N$ be such that $d(x_N, x) < \varepsilon/3$. Now $x_N \in C(a, b)$ and
so it is continuous. Hence, given $t_0 \in [a, b]$ there exists a $\delta > 0$ such
that
$$|t - t_0| < \delta \quad\text{implies}\quad |x_N(t) - x_N(t_0)| < \varepsilon/3.$$
Therefore $|t - t_0| < \delta$ implies
$$|x(t) - x(t_0)| \le |x(t) - x_N(t)| + |x_N(t) - x_N(t_0)| + |x_N(t_0) - x(t_0)|$$
$$\le d(x, x_N) + \frac{\varepsilon}{3} + d(x_N, x) < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon.$$
This formal argument involving εs and δs shows that $x$ is
continuous.)
So, having convinced ourselves (one way or another) that
$x \in C(a, b)$, we must check now that $x$ is the limit of $x_1, x_2, x_3, \ldots$ by
showing that $d(x_n, x) \to 0$ as $n \to \infty$. We know that $d(x_m, x_n) \to 0$ as
$m, n \to \infty$. So for $m, n \ge N_1$, say, $d(x_m, x_n) \le 1$. Hence for any fixed
$t \in [a, b]$ and $n \ge N_1$ we have
$$|x_m(t) - x_n(t)| \le 1 \quad\text{for } m \ge N_1$$
and so
$$x_n(t) - 1 \le x_m(t) \le x_n(t) + 1 \quad\text{for } m \ge N_1.$$
It follows that the sequence $x_{N_1}(t), x_{N_1+1}(t), x_{N_1+2}(t), \ldots$ lies in the
closed interval $[x_n(t) - 1, x_n(t) + 1]$, and hence so does the limit of the
sequence, $x(t)$. Therefore
$$x_n(t) - 1 \le x(t) \le x_n(t) + 1$$
and
$$|x(t) - x_n(t)| \le 1.$$
This was true for any $t \in [a, b]$ and any $n \ge N_1$. Hence $d(x_n, x) \le 1$ for
all $n \ge N_1$. A very similar argument will show that there exists an $N_2$
with $d(x_n, x) \le \tfrac{1}{2}$ for all $n \ge N_2$, and an $N_3$ with $d(x_n, x) \le \tfrac{1}{3}$ for all
$n \ge N_3$. Continuing in this way shows that $d(x_n, x) \to 0$ as $n \to \infty$ and
that $x_1, x_2, x_3, \ldots \to x$ as required.
We have thus shown that if a sequence $x_1, x_2, x_3, \ldots$ in $C(a, b)$ has
the property that $d(x_m, x_n) \to 0$ as $m, n \to \infty$, then the sequence
converges in $C(a, b)$. So $C(a, b)$ is complete (and so is this proof). □
In Chapter 2 we met an alternative metric on $C(a, b)$ given by
$$d(x, y) = \int_a^b |x(t) - y(t)|\,dt.$$
In this case $d(x, y)$ is the area sandwiched between the graphs of the
functions $x$ and $y$, as illustrated in Figure 3.3. In the final exercise of
this section we see that $C(-1, 1)$ with this integral metric is not
complete.
Fig. 3.3 Shaded area $= \int_a^b |x(t) - y(t)|\,dt$
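Exercise 42
(i) Let $x_1, x_2, x_3, \ldots$ in $C(-1, 1)$ be given by
$$x_n(t) = \begin{cases} 0 & \text{if } -1 \le t \le 0 \\ nt & \text{if } 0 < t < \tfrac{1}{n} \\ 1 & \text{if } \tfrac{1}{n} \le t \le 1, \end{cases}$$
as illustrated in Figure 3.4. Draw the graph of $x_m$, for some
$m > n$, on the same axes and hence find $d(x_m, x_n)$, where $d$ is
the integral metric. Deduce that $d(x_m, x_n) \to 0$ as $m, n \to \infty$.
Fig. 3.4 Graph of $x_n(t)$ against $t$, reaching the value 1 at $t = 1/n$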
(ii) Now let $y\colon [-1, 1] \to \mathbb{R}$ be given by
$$y(t) = \begin{cases} 0 & \text{if } -1 \le t < 0 \\ 1 & \text{if } 0 \le t \le 1. \end{cases}$$
Find $\int_{-1}^{1} |x_n(t) - y(t)|\,dt$ and confirm that it tends to zero as $n$
gets large.
(iii) Assume that $x_1, x_2, x_3, \ldots \to x$ in $C(-1, 1)$ with the integral
metric. Use the fact that
$$\int_{-1}^{1} |x(t) - y(t)|\,dt \le \int_{-1}^{1} |x(t) - x_n(t)|\,dt + \int_{-1}^{1} |x_n(t) - y(t)|\,dt$$
to show that $\int_{-1}^{1} |x(t) - y(t)|\,dt = 0$. Convince yourself with a
picture that it is impossible to find a continuous function
$x\colon [-1, 1] \to \mathbb{R}$ such that the area sandwiched between the
graphs of $x$ and $y$ is zero. Deduce that $x_1, x_2, x_3, \ldots$ is not
convergent in $C(-1, 1)$ with the integral metric and that
$C(-1, 1)$ is not complete with respect to this metric.
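The areas in Exercise 42 can be estimated numerically as a check on hand calculations. The sketch below is an added illustration under stated assumptions: it takes $x_n$ to be 0 on $[-1, 0]$, $nt$ on $(0, 1/n)$ and 1 on $[1/n, 1]$, takes $y$ to jump from 0 to 1 at $t = 0$, and approximates the integral metric by a midpoint rule. The names `ramp`, `step` and `integral_distance` are hypothetical.

```python
def ramp(n):
    """x_n: 0 on [-1, 0], n*t on (0, 1/n), 1 on [1/n, 1]."""
    def x_n(t):
        if t <= 0:
            return 0.0
        return min(n * t, 1.0)
    return x_n

def step(t):
    """The discontinuous function y: 0 to the left of 0, 1 from 0 onwards."""
    return 0.0 if t < 0 else 1.0

def integral_distance(f, g, a=-1.0, b=1.0, cells=20000):
    """Midpoint-rule estimate of the integral of |f - g| over [a, b]."""
    h = (b - a) / cells
    return h * sum(abs(f(a + (i + 0.5) * h) - g(a + (i + 0.5) * h))
                   for i in range(cells))

for n in (2, 10, 50):
    print(n,
          integral_distance(ramp(2 * n), ramp(n)),   # distance between terms
          integral_distance(ramp(n), step))          # distance to the step y
```

Both columns shrink as $n$ grows, so the terms crowd together and also approach $y$ in area, even though $y$ is not a member of $C(-1, 1)$.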
Before moving on to the third and last C, namely compactness, let us
briefly summarise what we have learned about completeness. A
complete set is one in which if a sequence has the property that the
distances between terms tend to zero, then the sequence converges to a
point of the set. A complete set is necessarily closed, but a closed set
need not be complete. But if we restrict attention to complete metric
spaces (X, d); i.e. where X itself is complete, then subsets of X are
complete if and only if they are closed. All the principal spaces which
we meet in our applications ($\mathbb{R}$, $\mathbb{R}^2$, $\mathbb{R}^n$, $\mathbb{C}$ and $C(a, b)$ with their usual
metrics) are complete and so all their closed subsets are complete.
3.5 Compact sets
The third property of sets which we wish to consider is that of
'compactness'. This is rather less relevant to us than the previous
properties of being closed or complete, so the reader may turn straight
to Chapter 4 without much loss. However, having come so close to
this property we can consider it with little extra effort. Then our
applications in the next chapter will have several extensions and in the
final chapter the relevance of compactness to real analysis will
become clear.
We remarked in the previous section that any real sequence in the
interval [a, b] has a convergent subsequence (with its limit in the set).
Indeed, it was that property of $[a, b]$ which enabled us to show that $\mathbb{R}$
is complete. This property has such far-reaching consequences in
analysis that we isolate it in the following definition.
Definition A set A in a metric space (X, d) is compact if every sequence
in A has a subsequence convergent to a point of A.
The closed bounded intervals $[a, b]$ in $\mathbb{R}$ are therefore compact, but $\mathbb{R}$
itself is not. For example, the sequence $1, 2, 3, 4, \ldots$ has no convergent
subsequence. In fact we shall see in Theorem 3.11 that the compact
subsets of $\mathbb{R}$ are precisely the closed and bounded sets. But first we
provide the missing link in the implications
$$\text{compact} \Rightarrow \text{complete} \Rightarrow \text{closed}$$
which shows that our three properties are progressively stronger.
Theorem 3.10 Let $A$ be a compact set in a metric space. Then $A$ is
complete.
Proof Let $A$ be compact in $(X, d)$ and let $a_1, a_2, a_3, \ldots$ be a sequence
in $A$ with the property that $d(a_m, a_n) \to 0$ as $m, n \to \infty$. To establish the
completeness of $A$ we must show that this sequence converges to a
point of $A$. By compactness there is a convergent subsequence
$$a_{k_1}, a_{k_2}, a_{k_3}, \ldots \to a$$
with its limit in $A$. It is now easy to show that the original sequence
also converges to $a$. For
$$0 \le d(a_n, a) \le d(a_n, a_{k_m}) + d(a_{k_m}, a) \to 0,$$
where the first term tends to 0 since $d(a_n, a_m) \to 0$ as $m, n \to \infty$, and the
second tends to 0 since $a_{k_1}, a_{k_2}, a_{k_3}, \ldots$ converges to $a$,
and so $d(a_n, a) \to 0$; i.e. $a_1, a_2, a_3, \ldots$ itself converges to the point $a$ of $A$.
Hence $A$ is complete. □
We saw in Theorem 3.6 that a closed subset of a complete set is
complete. You can use a similar approach in the following exercise.
Exercise 43 Let $A \subseteq B$ in a metric space, where $A$ is closed
and $B$ is compact. Show that $A$ is compact.
Exercise 44 Let $d$ be the discrete metric on a set $X$.
Remembering that a sequence converges in this space if and
only if it is of the form
$$x_1, x_2, \ldots, x_N, x, x, x, \ldots$$
show that $A \subseteq X$ is compact if and only if $A$ contains only a
finite number of points.
Exercise 45 Which of the following subsets of $\mathbb{R}$ are
compact?
(a) $\mathbb{Q} \cap [0,1]$ (i.e. the fractions in $[0,1]$)
(b) [ 2 , 2 « u [ 3 , 3 j ] u [ 4 f 4 i ] u . . .
(c) {1,2,3,. . . , N } .
In Exercise 45 the set in (a) is not closed (for example, it has a sequence
converging to $1/\sqrt{2}$ which is outside the set), and the set in (b)
contains the sequence $2, 3, 4, \ldots$ which has no convergent
subsequence. So only the set in (c) is compact and, as you may have
realised from your solutions to Exercises 44 and 45, it is easy to check
that any set containing only a finite number of points in a metric space
is compact.
We are now able to prove that, as far as R is concerned, the compact
sets are just the closed and bounded ones.
Theorem 3.11 Let $A \subseteq \mathbb{R}$. Then $A$ is compact if and only if it is closed
and bounded.
Proof First let $A \subseteq \mathbb{R}$ be compact. Then, as we saw in Theorems 3.10
and 3.7,
$$\text{compact} \Rightarrow \text{complete} \Rightarrow \text{closed},$$
and so $A$ is certainly closed. Also $A$ can have no sequence of points
tending to infinity (or $-\infty$) since such sequences do not have any
convergent subsequences. Hence $A$ is bounded.
Conversely, let $A \subseteq \mathbb{R}$ be closed and bounded. Then $A$ is a subset of
some interval $[a, b]$. Hence $A$ is a closed subset of a compact set and
(by Exercise 43) is itself compact. □
Exercise 46 Let $A \subseteq \mathbb{R}$ be non-empty and compact. Show
that $A$ has a least member and a greatest member. (You know
that $A$ is bounded, so it has a least upper bound, $a$, say. There
must be a sequence of members of $A$ converging to $a$.)
Although in R compactness is characterised so simply, it is not the
case in all spaces that closed and bounded sets are necessarily
compact. But of course to consider this we need to know what is
meant in general by a 'bounded' set. As its name implies, a bounded
set is one where the distance between any pair of its points has certain
bounds.
Definition A set $A$ in a metric space $(X, d)$ is bounded if there exists a
number $D$ such that $d(a, a') \le D$ for all $a, a' \in A$.
Exercise 47 Show that a set $A \subseteq \mathbb{R}^2$ is bounded if and only if
there exist numbers $a$, $b$, $c$ and $d$ such that $(x, y) \in A$ implies
$x \in [a, b]$ and $y \in [c, d]$. (Figures 3.5 and 3.6 might help.)
Exercise 48 Show that a non-empty set $A$ in a metric space
$(X, d)$ is bounded if and only if there exists $a \in A$ and a
number $D'$ such that $d(a, a') \le D'$ for all $a' \in A$. (This is like the
definition but with $a$ fixed.)
Exercise 49 Show that if the points $a, a_1, a_2, a_3, \ldots$ of a metric
space $(X, d)$ have the properties
$$d(a, a_1) \ge 1,\quad d(a, a_2) \ge 2,\quad d(a, a_3) \ge 3,\quad \ldots,$$
then the sequence $a_1, a_2, a_3, \ldots$ has no convergent
subsequence. Deduce that in a metric space a compact set is
bounded.
Fig. 3.5 Bounded ($d(a, a') \le D$ for all $a, a' \in A$), therefore contained in a circle, therefore contained in a square
Fig. 3.6 Contained in a rectangle, therefore contained in a circle (of diameter $D$, say), therefore bounded
So, in any metric space, a compact set is both closed and bounded;
and, in $\mathbb{R}$ at least, the converse is true. This simple characterisation
also holds in $\mathbb{R}^2$.
Theorem 3.12 Let $A \subseteq \mathbb{R}^2$. Then $A$ is compact if and only if it is closed
and bounded.
Proof If $A$ is compact, then (as we have seen in general) it must be
closed and bounded.
Conversely, let $A$ be closed and bounded. Then as we saw in
Exercise 47 there exist numbers $a$, $b$, $c$ and $d$ such that $A$ is a closed
subset of the set
$$B = \{(x, y) : x \in [a, b] \text{ and } y \in [c, d]\}.$$
So if we can show that $B$ is compact, then it will follow that $A$ is a
closed subset of a compact set and hence is itself compact.
So let $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots$ be a sequence of points in $B$.
Then $x_1, x_2, x_3, \ldots$ is a sequence of real numbers in $[a, b]$ and, by the
compactness of that set, there is a subsequence $x_{k_1}, x_{k_2}, x_{k_3}, \ldots$
convergent to a point $x$ of $[a, b]$. Then the sequence $y_{k_1}, y_{k_2}, y_{k_3}, \ldots$ of
real numbers in the compact set $[c, d]$ must have a convergent
subsequence $y_{l_1}, y_{l_2}, y_{l_3}, \ldots$ with its limit $y$ in $[c, d]$. But then
$$x_{l_1}, x_{l_2}, x_{l_3}, \ldots \to x \in [a, b],$$
$$y_{l_1}, y_{l_2}, y_{l_3}, \ldots \to y \in [c, d],$$
and so
$$(x_{l_1}, y_{l_1}), (x_{l_2}, y_{l_2}), (x_{l_3}, y_{l_3}), \ldots \to (x, y) \in B.$$
Hence any sequence in $B$ has a convergent subsequence (with limit in
$B$) and so $B$ is compact. Finally, as remarked above, it follows that $A$ is
compact because it is a closed subset of $B$. □
Exercise 50 Show that $A \subseteq \mathbb{R}^3$ is bounded if and only if there
exist real numbers $a_1$, $a_2$, $a_3$, $b_1$, $b_2$ and $b_3$ with
$$A \subseteq \{(x, y, z) \in \mathbb{R}^3 : x \in [a_1, b_1],\ y \in [a_2, b_2] \text{ and } z \in [a_3, b_3]\}.$$
Hence show that $A$ is compact if and only if it is closed and
bounded. (The keen reader could confirm that the same is
true for $\mathbb{R}^n$.)
So we have seen that in $\mathbb{R}^n$ compactness is easily characterised. We have
also seen that, in any metric space, a compact set must be closed and
bounded. But is it true in general that a closed and bounded set must
be compact?
Exercise 51 Let $A$ be a set in a metric space $(X, d)$ and assume
that $A$ contains a sequence $a_1, a_2, a_3, \ldots$ such that, for some
positive number $\delta$, $d(a_m, a_n) \ge \delta$ for each $m \ne n$. Show that $A$ is
not compact.
Exercise 52 Let $A \subseteq C(0,1)$ consist of those continuous
functions [0,1] —• [0,1]. Show that any two members of A
are at most a distance 1 apart. Deduce (using Exercise 28)
that A is closed and bounded. By considering functions like
a„(t) =
1 ±<t*l
show that A is not, however, compact.
It seems that when we move to spaces other than $\mathbb{R}^n$ compactness is
less straightforward. For example the set A in Exercise 52 is closed
and bounded, and yet it fails to be compact. We confirmed this by
finding an infinite number of members of A each mutually at least
distance | apart.
In the final chapter the relevance of compactness to continuity will
become clear. But at this stage we are able to see a preview of one of
those results in some exercises (the second of which is actually needed
in the next chapter).
Exercise 53 Let $A$ be a compact subset of $\mathbb{R}$ and let $f\colon A \to \mathbb{R}$
be continuous. Show that for any $a_1, a_2, a_3, \ldots$ in $A$ the real
sequence $f(a_1), f(a_2), f(a_3), \ldots$ has a convergent
subsequence with limit $f(a_0)$ for some $a_0 \in A$. Deduce that the set
$f(A) = \{f(a) : a \in A\}$ is compact.
Exercise 54 Let $A$ be a non-empty compact subset of the
metric space $(X, d)$, let $L$ be any fixed real number, and let
$F\colon A \to \mathbb{R}$ be a function with the property that
$$|F(x) - F(y)| \le L\, d(x, y)$$
for all $x, y \in A$. Show that for any sequence $a_1, a_2, a_3, \ldots$ in $A$
the real sequence $F(a_1), F(a_2), F(a_3), \ldots$ has a convergent
subsequence with limit $F(a_0)$ for some $a_0 \in A$. Deduce that
the set $F(A) = \{F(a) : a \in A\}$ is compact in $\mathbb{R}$. Hence show that
there exists $a_0 \in A$ such that $F(a_0)$ is less than or equal to
every other $F(a)$ for $a \in A$.
Exercise 55 Let $A$ be a compact subset of the metric space
$(X, d)$, let $(X', d')$ be another metric space, let $L$ be any fixed
real number, and let $f\colon A \to X'$ be a function with the
property that
$$d'(f(x), f(y)) \le L\, d(x, y)$$
for all $x, y \in A$. Show that $f(A)$ is compact in $(X', d')$.
This chapter has been, of necessity, full of the 'bread and butter'
results of the subject. But your patience in getting this far will be
rewarded because we are now able to apply our newly found
techniques to a wide range of problems.
The contraction mapping principle
4.1 Real fixed points
Consider the graph of a continuous real function $f\colon [a, b] \to [a, b]$.
Note that, for these purposes, 'continuous' simply means that
the graph is in one piece, and the fact that $f\colon [a, b] \to [a, b]$ means
that it is defined for all $x$ with $a \le x \le b$ and that $f(x)$ then satisfies
$a \le f(x) \le b$ (see Figure 4.1). Consider, too, the line $y = x$ (dotted in the
picture). When $x = a$ the graph of $f$ is on or above the line (since $f(a)$,
being in $[a, b]$, is at least $a$). When $x = b$ the graph of $f$ is on or below
the line (since $f(b)$, being in $[a, b]$, is at most $b$). Since the graph of $f$ is
in one piece, it means that at some point in $[a, b]$ the graph of $f$
crosses the line; i.e. for some $x \in [a, b]$ we have $x = f(x)$. (To those
readers who remember their basic real analysis: this is, in fact, based
on the 'intermediate value theorem'.)
Exercise 56 Let $f\colon [-\pi/2, \pi/2] \to \mathbb{R}$ be given by $f(x) = \cos 4x$.
Sketch the graph of $f$ and observe that $f(x) \in [-\pi/2, \pi/2]$
for each $x$ (which is not to say, of course, that
every number in $[-\pi/2, \pi/2]$ occurs as an $f(x)$). In how
many places does the graph of $f$ cross the line $y = x$? How
many roots in $[-\pi/2, \pi/2]$ does the equation $x = \cos 4x$
have?
Fig. 4.1 The graph of $y = f(x)$ and the line $y = x$
So the graph of any continuous function $f\colon [a, b] \to [a, b]$ crosses the
line $y = x$ in at least one place; in other words, there is at least one
$x \in [a, b]$ for which $x = f(x)$. So for any continuous function
$f\colon [a, b] \to [a, b]$ the equation $x = f(x)$ has at least one root. But is
that pure mathematical fact of any use in actually calculating the
roots?
Exercise 57 Let $f\colon \mathbb{R} \to \mathbb{R}$ be given by $f(x) = \cos x$. Sketch
the graph of $f$ and observe that it crosses the line $y = x$ in just
one place. Choose any real number $x_1$ and (with your
calculator set in radians) work out $\cos x_1$, $\cos(\cos x_1)$,
$\cos(\cos(\cos x_1))$, ...; i.e. display your number and
repeatedly press the cos button. Continue until consecutive
answers agree to five decimal places.
What you have found in Exercise 57 (to five decimal places, anyway) is
a number unaffected by taking its cosine; i.e. an approximate root of
the equation $\cos x = x$. Check for yourselves that $\cos 0.73908 \approx 0.73908$.
Figure 4.2 shows the position of $x_2 = \cos x_1$ in relation to $x_1$.
So if we start with any $x_1$ and keep reapplying the cosine function,
then the sequence
$$x_1,\quad x_2 = \cos x_1,\quad x_3 = \cos x_2,\quad x_4 = \cos x_3,\quad \ldots$$
can be illustrated as in Figure 4.3. And, as we saw with our earlier
calculations, the sequence does seem to converge to that unique $x$ for
which $\cos x = x$.
Fig. 4.2 The position of $x_2 = \cos x_1$ in relation to $x_1$
Fig. 4.3 The graphs of $y = \cos x$ and $y = x$, with the iterates $x_1, x_2, x_3, x_4$ marked
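The calculator experiment translates directly into a short program. The sketch below is an added illustration; the function name `iterate_until_stable` and its stopping tolerance are choices made for this example, with "agreeing to five decimal places" approximated by two consecutive values differing by less than $0.5 \times 10^{-5}$.

```python
import math

def iterate_until_stable(f, x1, tol=0.5e-5, max_steps=1000):
    """Apply f repeatedly from x1 until two consecutive values differ by less than tol.

    Returns the last value computed and the number of applications of f.
    """
    x = x1
    for step in range(1, max_steps + 1):
        nxt = f(x)
        if abs(nxt - x) < tol:
            return nxt, step
        x = nxt
    raise RuntimeError("consecutive values did not settle within max_steps")

for start in (0.0, 1.0, -3.0, 10.0):
    root, steps = iterate_until_stable(math.cos, start)
    print(start, round(root, 5), steps)
```

Whatever starting value is tried, the displayed value settles near the same number, although the number of button presses needed varies with the start.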
This is precisely the sort of sequence with which we started the
book. We are now ready to ask 'what property of $f$ will ensure that a
solution to $x = f(x)$ can be found by starting with any number $x_1$ and
repeatedly applying the function $f$ to give
$$x_1,\quad x_2 = f(x_1),\quad x_3 = f(x_2),\quad x_4 = f(x_3),\quad \ldots$$
leading to a root?'
For what function $f$ can we be sure that if $x_1$ is a guess at a root of
$x = f(x)$, then $x_2 = f(x_1)$ is closer to the root? Suppose that $x_0$ is the
root of $x = f(x)$: then we require that
$$|x_2 - x_0| < |x_1 - x_0| \quad\text{or}\quad |f(x_1) - f(x_0)| < |x_1 - x_0|.$$
Since we do not know the value of $x_0$ we shall have to require that
$$|f(x) - f(y)| < |x - y|$$
for all $x \ne y$; i.e. $f$ reduces the distance between points.
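The cosine function certainly has this property. For any distinct $x$
and $y$
$$|\cos x - \cos y| = \left|2 \sin\left(\frac{x-y}{2}\right) \sin\left(\frac{x+y}{2}\right)\right| = 2\left|\sin\left(\frac{x-y}{2}\right)\right| \left|\sin\left(\frac{x+y}{2}\right)\right|$$
$$\le 2\left|\sin\left(\frac{x-y}{2}\right)\right| < 2\left|\frac{x-y}{2}\right| \quad(\text{since } |\sin w| < |w| \text{ for all } w \ne 0)$$
$$= |x - y|.$$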
Exercise 58 Let $f\colon \mathbb{R} \to \mathbb{R}$ be given by $f(x) = \sin x$. Show
that, for any distinct $x$ and $y$,
$$|f(x) - f(y)| < |x - y|.$$
Choose any number $x_1$ and enter it on your calculator (set in
radians). Repeatedly apply the sine function and observe that
the displayed numbers seem to converge to 0. This is because
0 is the unique root of $x = \sin x$.
We saw earlier that for the function given by $f(x) = \cos 4x$ the
equation $x = f(x)$ has three roots. If, however, a function has the
property that
$$|f(x) - f(y)| < |x - y|$$
for each $x \ne y$, then the equation $x = f(x)$ can have at most one root.
Exercise 59 Let $X$ be a subset of $\mathbb{R}$ and let $f\colon X \to \mathbb{R}$ have the
property that
$$|f(x) - f(y)| < |x - y|$$
for each $x \ne y$. Show that if $x_0 = f(x_0)$ and $x'_0 = f(x'_0)$ then
$x_0 = x'_0$. Deduce that the equation $x = f(x)$ has at most one
root.
We mention in passing that it is possible, however, for a function to
have the property that
$$|f(x) - f(y)| < |x - y|$$
for each $x \ne y$ and yet for the equation $x = f(x)$ to have no solution.
Exercise 60 Let $f\colon [1, \infty[ \to [1, \infty[$ be given by $f(x) = x + 1/x$.
Show that
$$|f(x) - f(y)| < |x - y|$$
for each $x \ne y$, but that there is no $x$ for which $x = f(x)$.
In Exercise 60, for very large $x$ and $y$, $|f(x) - f(y)|$ gets very close
indeed to $|x - y|$. If we want our equations $x = f(x)$ to have a root
perhaps we ought to ensure that $|f(x) - f(y)|$ cannot get close to
$|x - y|$.
Exercise 61 Let $f\colon [1, \infty[ \to [1, \infty[$ be given by
$f(x) = \tfrac{25}{26}(x + 1/x)$. Show that
$$|f(x) - f(y)| \le \tfrac{25}{26}|x - y|$$
for all $x, y \in [1, \infty[$. Display any number $x_1$ in your
calculator and work out
$$x_1,\quad x_2 = f(x_1),\quad x_3 = f(x_2),\quad \ldots.$$
Observe that your sequence seems to be converging to 5.
Confirm, by solving the equation algebraically, that 5 is the
unique root of the equation $x = f(x)$.
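The difference between Exercises 60 and 61 shows up clearly when the two iterations are run side by side. The sketch below is an added illustration under an assumption: the multiplier in Exercise 61 is taken to be 25/26, the value forced by the stated fixed point 5. The names `f60`, `f61` and `iterate` are hypothetical.

```python
def f60(x):
    """Exercise 60: shrinks distances, but has no fixed point on [1, oo[."""
    return x + 1 / x

def f61(x):
    """Exercise 61 (multiplier 25/26 assumed): a contraction with fixed point 5."""
    return 25 / 26 * (x + 1 / x)

def iterate(f, x1, steps):
    """Return the value reached after applying f to x1 `steps` times."""
    x = x1
    for _ in range(steps):
        x = f(x)
    return x

for steps in (10, 100, 1000):
    print(steps, iterate(f60, 1.0, steps), iterate(f61, 1.0, steps))
```

The first column keeps growing, slowly but without bound, while the second settles at 5. Requiring a fixed factor less than 1 is what separates the two behaviours.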
So although the condition $|f(x) - f(y)| < |x - y|$ was not enough to
ensure a root of the equation $x = f(x)$, perhaps the condition
$$|f(x) - f(y)| \le k|x - y|$$
(where $k$ is some number less than 1) will be enough.
Exercise 62 Let $f\colon [0,1] \to \mathbb{R}$ be given by
$f(x) = \tfrac{1}{7}(x^3 + x^2 + 1)$. Show that $f(x)$ is actually in $[0,1]$ and hence
that $f\colon [0,1] \to [0,1]$. Show also that for any $x, y \in [0,1]$
$$|f(x) - f(y)| \le \tfrac{5}{7}|x - y|.$$
(You will need the fact that
$$|x^3 - y^3| = |x - y|\,|x^2 + xy + y^2| \quad\text{etc.})$$
Calculate $f(0), f(f(0)), f(f(f(0))), \ldots$ and hence find, to three
decimal places, a root of the equation $x = f(x)$. Show that this
is an approximate root of
$$x^3 + x^2 - 7x + 1 = 0.$$
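Both halves of Exercise 62 can be explored numerically. The sketch below is an added illustration under assumptions: it takes $f(x) = (x^3 + x^2 + 1)/7$, estimates the largest distance ratio $|f(x) - f(y)|/|x - y|$ over a grid (which suggests, but does not prove, a contraction constant), and then iterates from 0. The helper names `largest_ratio` and `fixed_point` are invented here.

```python
def f(x):
    """f(x) = (x^3 + x^2 + 1) / 7 on [0, 1]."""
    return (x**3 + x**2 + 1) / 7

def largest_ratio(f, a, b, points=101):
    """Largest |f(x) - f(y)| / |x - y| over distinct pairs of grid points in [a, b]."""
    grid = [a + (b - a) * i / (points - 1) for i in range(points)]
    return max(abs(f(x) - f(y)) / abs(x - y)
               for x in grid for y in grid if x != y)

def fixed_point(f, x1, tol=1e-10, max_steps=1000):
    """Iterate f from x1 until consecutive values differ by less than tol."""
    x = x1
    for _ in range(max_steps):
        nxt = f(x)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    raise RuntimeError("iteration did not settle within max_steps")

print(largest_ratio(f, 0.0, 1.0))        # compare with 5/7
root = fixed_point(f, 0.0)
print(root, root**3 + root**2 - 7 * root + 1)   # residual of the cubic
```

Because the distance ratio near the root is far smaller than the worst case over the whole interval, only a handful of iterations are needed here.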
We have used informal arguments in this section in order to prepare
ourselves for the formality of the next section. But in our informal way
we have seen that a condition like
$$|f(x) - f(y)| \le k|x - y|$$
(for some number $k < 1$) might ensure that the equation $x = f(x)$ has a
root. Then, starting with any number $x_1$, the iterative process of
repeatedly applying $f$ to $x_1$ might lead us to that root.
The points for which $x = f(x)$ are called the fixed points of $f$ and we
shall now see how, in some very general situations, if $f$ reduces
distances (rather like we have seen) then $f$ has a unique fixed point.
We shall also see how iterating with $f$ leads us to that fixed point.
4.2 Contractions
Definition Let $(X, d)$ be a metric space. Then a contraction of
$(X, d)$ is a function $f\colon X \to X$ with the property that, for some real
number $k < 1$,
$$d(f(x), f(y)) \le k\, d(x, y) \quad\text{for all } x, y \in X.$$
So, as we saw in Exercise 61, the function $f: [1, \infty[ \to [1, \infty[$ given by
$f(x) = \tfrac{25}{26}(x + 1/x)$ is a contraction since (as we checked earlier)
$$d(f(x), f(y)) = |f(x) - f(y)| \le \tfrac{25}{26}\,|x - y| = \tfrac{25}{26}\,d(x, y).$$
So, with the usual metric and with $k = \tfrac{25}{26}$, this function is a contraction
of $[1, \infty[$. Similarly, as we saw in Exercise 62, the function $f: [0,1] \to [0,1]$
given by $f(x) = \tfrac{1}{7}(x^3 + x^2 + 1)$ is a contraction since
$$d(f(x), f(y)) \le \tfrac{5}{7}\,d(x, y).$$
In both cases those contractions had unique fixed points, found by
taking any initial guess and reapplying the function. But, of course,
the advantage of our general definition is that it extends to worlds
beyond $\mathbb{R}$.
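The calculator experiments of Exercises 61 and 62 can be imitated in a few lines of Python. This is only a sketch: it assumes the Exercise 61 map is $\tfrac{25}{26}(x + 1/x)$ (the constant forced by its fixed point 5) and the Exercise 62 map is $\tfrac{1}{7}(x^3 + x^2 + 1)$ (so that its fixed points are roots of $x^3 + x^2 - 7x + 1 = 0$). The helper name `iterate` and the step counts are illustrative choices, not part of the text.

```python
def iterate(f, x1, steps):
    """Return the list x1, f(x1), f(f(x1)), ... containing `steps` entries."""
    xs = [x1]
    for _ in range(steps - 1):
        xs.append(f(xs[-1]))
    return xs


def f61(x):
    # Assumed form of the Exercise 61 map on [1, oo[ (constant 25/26).
    return 25.0 / 26.0 * (x + 1.0 / x)


def f62(x):
    # Assumed form of the Exercise 62 map on [0, 1] (constant 1/7).
    return (x ** 3 + x ** 2 + 1.0) / 7.0


limit61 = iterate(f61, 1.0, 2000)[-1]   # slow: the contraction constant is close to 1
limit62 = iterate(f62, 0.0, 200)[-1]    # faster: the constant is at most 5/7

print("Exercise 61 limit:", limit61)
print("Exercise 62 limit:", round(limit62, 3))
print("cubic residual   :", limit62 ** 3 + limit62 ** 2 - 7 * limit62 + 1)
```

The different step counts reflect the sizes of the two contraction constants: the nearer $k$ is to 1, the more iterations are needed.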
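Exercise 63 Use the properties
$$|\cos x - \cos y| \le |x - y| \quad\text{and}\quad |\sin x - \sin y| \le |x - y|$$
derived earlier to show that
$$\left[\left(\frac{\cos y_2 - \cos x_2}{2}\right)^2 + \left(\frac{\sin y_1 - \sin x_1}{2}\right)^2\right]^{1/2} \le \tfrac{1}{2}\left[(y_1 - x_1)^2 + (y_2 - x_2)^2\right]^{1/2}.$$
Deduce that if $f: \mathbb{R}^2 \to \mathbb{R}^2$ is given by
$$f(\mathbf{x}) = f(x_1, x_2) = (\tfrac{1}{2}\cos x_2,\ \tfrac{1}{2}\sin x_1 + 1)$$
(and $d$ is the usual metric on $\mathbb{R}^2$), then
$$d(f(\mathbf{x}), f(\mathbf{y})) \le \tfrac{1}{2}\,d(\mathbf{x}, \mathbf{y})$$
for each $\mathbf{x} = (x_1, x_2)$ and $\mathbf{y} = (y_1, y_2)$ in $\mathbb{R}^2$. This shows that $f$
is a contraction of $\mathbb{R}^2$.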
We saw with real functions that iterating with a contraction $f$ led to
the unique fixed point of $f$.
Exercise 64 Let $f: \mathbb{R}^2 \to \mathbb{R}^2$ be given by
$$f(\mathbf{x}) = f(x_1, x_2) = (\tfrac{1}{2}\cos x_2,\ \tfrac{1}{2}\sin x_1 + 1),$$
as in the previous exercise. Let $\mathbf{x}_1 = (0,0)$. Then
$$f(\mathbf{x}_1) = f(0,0) = (\tfrac{1}{2}\cos 0,\ \tfrac{1}{2}\sin 0 + 1) = (0.5, 1).$$
Let $\mathbf{x}_2 = f(\mathbf{x}_1) = (0.5, 1)$. Continue in this way (so that, for
example,
$$\mathbf{x}_3 = f(\mathbf{x}_2) = f(0.5, 1) = (\tfrac{1}{2}\cos 1,\ \tfrac{1}{2}\sin 0.5 + 1) = \ldots)$$
until two consecutive pairs agree to two decimal places.
What you hope to have found, approximately, is a root of
$\mathbf{x} = f(\mathbf{x})$, i.e.
$$(x_1, x_2) = (\tfrac{1}{2}\cos x_2,\ \tfrac{1}{2}\sin x_1 + 1).$$
Confirm with your calculator that the pair $(x_1, x_2)$ which (to
two decimal places) repeated itself in the above process does
give
$$x_1 \approx \tfrac{1}{2}\cos x_2 \quad\text{and}\quad x_2 \approx \tfrac{1}{2}\sin x_1 + 1.$$
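The iteration of Exercise 64 can be sketched as follows, assuming the map is $f(x_1, x_2) = (\tfrac{1}{2}\cos x_2, \tfrac{1}{2}\sin x_1 + 1)$ as the worked value $f(0,0) = (0.5, 1)$ indicates. The function name `fixed_point`, the tolerance and the step limit are hypothetical choices made for this illustration.

```python
import math


def f(p):
    """The map of Exercise 64 on R^2 (coefficients 1/2 assumed from f(0,0) = (0.5, 1))."""
    x1, x2 = p
    return (0.5 * math.cos(x2), 0.5 * math.sin(x1) + 1.0)


def fixed_point(g, start, tol=1e-12, max_steps=1000):
    """Iterate g from `start` until consecutive points are within `tol` (usual metric)."""
    p = start
    for _ in range(max_steps):
        q = g(p)
        if math.dist(p, q) < tol:
            return q
        p = q
    raise RuntimeError("no convergence within max_steps")


p = fixed_point(f, (0.0, 0.0))
print("approximate fixed point:", p)
print("check:", 0.5 * math.cos(p[1]), 0.5 * math.sin(p[0]) + 1.0)
```

Since the contraction constant here is $\tfrac{1}{2}$, each step roughly halves the distance to the fixed point, so a few dozen steps suffice for the tolerance used.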
Already we have seen that by having a contraction of $\mathbb{R}^2$ rather than $\mathbb{R}$
we can extend the sort of equations to which our methods will apply.
Now we are going to ask whether a contraction of any metric space
will have a unique fixed point and, if so, whether the sequence
$$x_1,\quad x_2 = f(x_1),\quad x_3 = f(x_2),\ \ldots$$
leads to that fixed point. We saw earlier that a sequence which
appears to be behaving nicely might not actually converge because
the metric space might have some gaps.
Exercise 65 Let $X = \{x \in \mathbb{Q}: x \ge 1\}$, the set of rational
numbers of 1 or more, and define $f: X \to X$ by
$f(x) = x/2 + 1/x$. Show that, for all $x, y \in X$,
$$d(f(x), f(y)) \le \tfrac{1}{2}\,d(x, y)$$
(where $d$ is the usual metric for real numbers). Show,
however, that for this contraction $f$ there is no $x \in X$ for
which $x = f(x)$.
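The 'gap' in Exercise 65 can be watched with exact rational arithmetic. The sketch below uses Python's `fractions.Fraction`, so every iterate really is a member of $\mathbb{Q}$; the starting value 1 and the number of steps are arbitrary choices for illustration.

```python
from fractions import Fraction


def f(x):
    """The map x/2 + 1/x of Exercise 65; a Fraction input gives a Fraction output."""
    return x / 2 + 1 / x


x = Fraction(1)
history = [x]
for _ in range(6):
    x = f(x)
    history.append(x)

for term in history[:5]:
    # Each term is rational, and term**2 - 2 shrinks rapidly but never reaches 0.
    print(term, float(term * term - 2))
```

A fixed point would have to satisfy $x^2 = 2$, so the iterates close in on a number that is missing from $X$: the sequence is Cauchy in $X$ but has no limit there.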
All these clues which we have assembled together seem to imply that
we need to consider contractions of a complete metric space, bringing
us at last to the key result of this chapter (and the focal point of our
whole approach). It is known as the contraction mapping principle (or
Banach's fixed-point principle).
Theorem 4.1 Let $f: X \to X$ be a contraction of the complete metric
space $(X, d)$. Then $f$ has a unique fixed point. Furthermore, if $x_1$ is any
point of $X$, then the sequence
$$x_1,\quad x_2 = f(x_1),\quad x_3 = f(x_2),\ \ldots$$
converges to that unique fixed point.
Proof Let $x_1$ be any point of $X$ and, as usual, let $x_2 = f(x_1)$, $x_3 = f(x_2)$,
and so on. We hope to prove that the sequence $x_1, x_2, x_3, \ldots$
converges and that its limit is the unique fixed point of $f$.
Given a sequence in a complete metric space the obvious way to
test convergence is by seeing whether the sequence is a Cauchy
sequence. Calculating the distance from $x_n$ to $x_m$ is a little ambitious,
so let us start by considering the distance from $x_n$ to $x_{n+1}$. From the
inequalities
$$\begin{aligned}
d(x_n, x_{n+1}) &= d(f(x_{n-1}), f(x_n)) \le k\,d(x_{n-1}, x_n) \\
&= k\,d(f(x_{n-2}), f(x_{n-1})) \le k^2\,d(x_{n-2}, x_{n-1}) \\
&\ \ \vdots \\
&= k^{n-2}\,d(f(x_1), f(x_2)) \le k^{n-1}\,d(x_1, x_2)
\end{aligned}$$
we see that $d(x_n, x_{n+1}) \le k^{n-1}\,d(x_1, x_2)$. (And, since $0 \le k < 1$, for large $n$
the distance between $x_n$ and $x_{n+1}$ is very small.)
We can now extend this easily to the distance from $x_n$ to $x_m$ for any
$m > n$. For
$$\begin{aligned}
d(x_n, x_m) &\le d(x_n, x_{n+1}) + d(x_{n+1}, x_m) \\
&\le d(x_n, x_{n+1}) + d(x_{n+1}, x_{n+2}) + d(x_{n+2}, x_m) \\
&\ \ \vdots \\
&\le d(x_n, x_{n+1}) + d(x_{n+1}, x_{n+2}) + d(x_{n+2}, x_{n+3}) + \cdots + d(x_{m-1}, x_m).
\end{aligned}$$
And we have already established an upper bound for the distance
between successive $x$s. Hence
$$\begin{aligned}
d(x_n, x_m) &\le d(x_n, x_{n+1}) + d(x_{n+1}, x_{n+2}) + \cdots + d(x_{m-1}, x_m) \\
&\le k^{n-1}\,d(x_1, x_2) + k^{n}\,d(x_1, x_2) + \cdots + k^{m-2}\,d(x_1, x_2) \\
&\le (k^{n-1} + k^{n} + \cdots)\,d(x_1, x_2) \\
&= k^{n-1}(1 + k + k^2 + \cdots)\,d(x_1, x_2) \\
&= \frac{k^{n-1}}{1 - k}\,d(x_1, x_2) \quad (\to 0 \text{ as } n \to \infty).
\end{aligned}$$
So, as we hoped, if $n \to \infty$ and $m > n$, then $d(x_n, x_m) \to 0$; i.e. $x_1, x_2, x_3,
\ldots$ is a Cauchy sequence in the complete metric space $(X, d)$. Hence
    $x_1, x_2, x_3, \ldots \to x$ (say);
therefore
    $f(x_1), f(x_2), f(x_3), \ldots \to f(x)$, since the distance between
    $f(x_n)$ and $f(x)$ is, if anything, less than that of $x_n$ from $x$;
i.e.
    $x_2, x_3, x_4, \ldots \to f(x)$, since, by our original construction,
    $f(x_1) = x_2$ and $f(x_2) = x_3$, etc.
Also
    $x_2, x_3, x_4, \ldots \to x$, from the first line of this argument.
Therefore
    $x = f(x)$, for sequences cannot have two different limits.
So the limit of the sequence is indeed a fixed point of $f$. The fact that
it is the unique fixed point of $f$ (and hence that, regardless of the
choice of $x_1$, the sequence will always lead to this fixed point) is left as
an exercise:
Exercise 66 Let $(X, d)$ be a metric space and let $f: X \to X$
have the property that
$$d(f(x), f(y)) < d(x, y)$$
for all $x \ne y$ in $X$. Show that if $f(x_0) = x_0$ and $f(x_0') = x_0'$, then
$x_0 = x_0'$; i.e. $f$ has at most one fixed point.
That final fact now completes the proof of Theorem 4.1. □
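Letting $m \to \infty$ in the Cauchy estimate of the proof gives the error bound $d(x_n, x) \le \frac{k^{n-1}}{1-k}\,d(x_1, x_2)$ for the limit $x$. The sketch below checks this numerically on one example, assumed here to be the Exercise 61 map $\tfrac{25}{26}(x + 1/x)$ with $k = \tfrac{25}{26}$ and fixed point 5; the starting point and the number of terms are arbitrary.

```python
def f(x):
    # Assumed example: the Exercise 61 map, a contraction of [1, oo[.
    return 25.0 / 26.0 * (x + 1.0 / x)


k = 25.0 / 26.0          # contraction constant of f
fixed = 5.0              # the fixed point, since 5 = (25/26)(5 + 1/5)
x1 = 1.0
d12 = abs(x1 - f(x1))    # d(x_1, x_2)

rows = []                # (n, actual error d(x_n, x), bound from the proof)
x = x1
for n in range(1, 201):
    bound = k ** (n - 1) / (1 - k) * d12
    rows.append((n, abs(x - fixed), bound))
    x = f(x)

for n, err, bound in rows[:5]:
    print(n, err, bound)
```

The bound is usually pessimistic, but it needs only $k$ and the first step $d(x_1, x_2)$, so it can be evaluated before the limit is known.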
We shall see several applications of the contraction mapping
principle in later sections, but we also include three here as exercises
before proceeding to some extensions of the result.
Exercise 67 Let $f: \mathbb{R}^3 \to \mathbb{R}^3$ be given by
f(\)=f(xl,x2, x 3 )= (^cos x 2 + 1 , f sin x 3 , Jx x ).
Show that $f$ is a contraction of $\mathbb{R}^3$ and deduce that the
simultaneous equations
X =
l 2 COSX2 + l ,
x 2 = f sinx 3 ,
X 3 = 4 Xt
have a unique solution. Find, by an iterative process, an
approximation to that solution.
Exercise 68 Recall that $C(0, \tfrac{1}{2})$ is the collection of continuous
functions from $[0, \tfrac{1}{2}]$ to $\mathbb{R}$ together with the 'max' metric. So
each $x$ in $C(0, \tfrac{1}{2})$ is a function for which each $x(t)$ is defined
($0 \le t \le \tfrac{1}{2}$). Given such an $x$, define a new function $f(x)$ in
$C(0, \tfrac{1}{2})$ by
$$(f(x))(t) = t\,(x(t) + 1).$$
So, for example, if $x(t) = t^2 + 1$ for $0 \le t \le \tfrac{1}{2}$, then $f(x)$ is the
function given by $(f(x))(t) = t(t^2 + 2)$. Show that $f$ is a
contraction of $C(0, \tfrac{1}{2})$. Show also that if $x_1$ is the function
given by $x_1(t) = t$, then iterating with $f$ yields the sequence of
functions $x_1, x_2, x_3, \ldots$ where
$$x_n(t) = t + t^2 + \cdots + t^n.$$
Use the sum of a geometric progression to show that, for each
$t$, $x_n(t)$ tends to $t/(1 - t)$ as $n$ tends to infinity. Confirm directly
that the function $x$ given by $x(t) = t/(1 - t)$ is the unique fixed
point of $f$.
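The iteration in Exercise 68 acts on polynomials, so it can be carried out exactly on lists of coefficients. In this sketch a function $x(t) = c_0 + c_1 t + c_2 t^2 + \cdots$ is stored as the list `[c0, c1, c2, ...]`; the names `apply_f` and `evaluate` are hypothetical helpers introduced only for this illustration.

```python
def apply_f(coeffs):
    """Coefficients of t * (x(t) + 1), where coeffs[i] multiplies t**i in x(t)."""
    inner = list(coeffs) if coeffs else [0]
    inner[0] += 1            # form x(t) + 1
    return [0] + inner       # multiply by t: shift every power up by one


def evaluate(coeffs, t):
    """Value of the polynomial with the given coefficients at t."""
    return sum(c * t ** i for i, c in enumerate(coeffs))


x = [0, 1]                   # x_1(t) = t
iterates = [x]
for _ in range(29):
    x = apply_f(x)
    iterates.append(x)

print(iterates[3])                       # coefficients of x_4
print(evaluate(iterates[-1], 0.5))       # x_30 at t = 1/2, compare with t/(1-t) = 1
```

At the right-hand end point $t = \tfrac{1}{2}$ the partial sums differ from $t/(1-t)$ by exactly $t^{n+1}/(1-t)$, which is where the maximum distance to the fixed point occurs.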
Our applications to C(a, b) will, in general, be much more significant
and useful than that particular exercise. We now see an application to
matrices.
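Exercise 69 Let $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$ be a real matrix all of whose
entries are, in modulus, less than $\tfrac{1}{2}$. Let $f: \mathbb{R}^2 \to \mathbb{R}^2$ be given
by
$$f(\mathbf{x}) = f(x_1, x_2) = (x_1', x_2')$$
where
$$\begin{pmatrix} x_1' \\ x_2' \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$$
Show that
$$\max\{|x_1'|, |x_2'|\} \le 2\max\{|a|, |b|, |c|, |d|\} \cdot \max\{|x_1|, |x_2|\}.$$
Deduce that $f$ is a contraction of the metric space $\mathbb{R}^2$ with
metric given by
$$d(\mathbf{x}, \mathbf{y}) = d((x_1, x_2), (y_1, y_2)) = \max\{|x_1 - y_1|, |x_2 - y_2|\}.$$
As we saw in Exercise 41 this space is complete and so $f$ has a
unique fixed point; i.e. there is precisely one $(x_1, x_2) \in \mathbb{R}^2$ with
$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix}\begin{pmatrix} x_1 \\ x_2 \end{pmatrix}.$$
This $(x_1, x_2)$ must clearly be $(0,0)$. Deduce that
$$\begin{pmatrix} a - 1 & b \\ c & d - 1 \end{pmatrix}$$
is non-singular.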
Exercise 70 The interested reader can extend the ideas of the
previous exercise to show that if $M$ is a real $n \times n$ matrix all of
whose entries are, in modulus, less than $1/n$, then there exists
precisely one $(x_1, x_2, \ldots, x_n) \in \mathbb{R}^n$ with
$$M\begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix} = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$$
and that $M - I$ is non-singular, where $I$ is the $n \times n$ identity
matrix.
Those last two exercises give the reader a taste of the applications to
linear algebra.
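The chain of ideas in Exercise 69 can be tried on a concrete matrix. The entries below are hypothetical (any four numbers of modulus less than $\tfrac{1}{2}$ would do), and the helper names are illustrative only; the sketch checks the max-metric inequality along an orbit and then evaluates $\det(M - I)$.

```python
def apply(m, x):
    """Multiply the 2x2 matrix m (a pair of rows) by the column vector x."""
    (a, b), (c, d) = m
    return (a * x[0] + b * x[1], c * x[0] + d * x[1])


def max_norm(x):
    """Distance from x to the origin in the 'max' metric on R^2."""
    return max(abs(x[0]), abs(x[1]))


m = ((0.4, -0.3), (0.45, 0.2))                  # hypothetical entries, all below 1/2 in modulus
k = 2 * max(abs(e) for row in m for e in row)   # contraction constant from the exercise

x = (1.0, -1.0)
inequality_holds = True
for _ in range(500):
    y = apply(m, x)
    inequality_holds = inequality_holds and max_norm(y) <= k * max_norm(x) + 1e-15
    x = y

det_m_minus_i = (m[0][0] - 1) * (m[1][1] - 1) - m[0][1] * m[1][0]
print("k =", k, " inequality held:", inequality_holds)
print("orbit ends at", x, " det(M - I) =", det_m_minus_i)
```

Because $f$ is linear here, the orbit of any starting vector is driven to the only fixed point $(0,0)$, and a non-zero determinant of $M - I$ is the matrix form of that uniqueness.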
4.3 Real contractions revisited
Most of the real functions which we have encountered have
been differentiable and, for such functions, there is an easy test of
whether or not they are contractions (with respect to the usual real
metric).
Theorem 4.2 Let $f: [a, b] \to [a, b]$ be differentiable. Then $f$ is a
contraction of $[a, b]$ if and only if there exists a number $k < 1$ with
$|f'(x)| \le k$ for all $x \in\ ]a, b[$.
Proof Assume first that $f$ is a contraction, i.e. that there exists $k < 1$
with $|f(x) - f(y)| \le k\,|x - y|$ for all $x, y \in [a, b]$. Then, in particular, for
any $x$ and $x + \delta x$ in $[a, b]$ we have
$$|f(x + \delta x) - f(x)| \le k\,|(x + \delta x) - x| = k\,|\delta x|.$$
Hence, for $\delta x \ne 0$,
$$\left|\frac{f(x + \delta x) - f(x)}{\delta x}\right| \le k$$
and the limit of the left-hand expression as $\delta x \to 0$ is also less than or
equal to $k$. But that limit is precisely $|f'(x)|$. Hence the bound on
$|f'(x)|$ is proved.
Conversely, assume that $|f'(x)| \le k$ ($< 1$) for all $x \in\ ]a, b[$. For any
$x \ne y$ in $[a, b]$ we know that
$$\frac{f(x) - f(y)}{x - y} = f'(c)$$
for some $c$ between $x$ and $y$. This is the mean value theorem from real
analysis, but for those readers who wish to at least be convinced of its
plausibility there is a diagram (Figure 4.4).
[Fig. 4.4: the straight line $AB$ joining the points of the curve above $x$ and $y$
has gradient $(f(x) - f(y))/(x - y)$; at the point $c$ the curve is parallel to $AB$,
i.e. $f'(c) = (f(x) - f(y))/(x - y)$.]
But we know that $|f'(c)| \le k$ and so
$$\left|\frac{f(x) - f(y)}{x - y}\right| = |f'(c)| \le k.$$
Hence
$$|f(x) - f(y)| \le k\,|x - y|$$
and $f$ is a contraction, as required. □
Exercise 71 Show that the function $f: \mathbb{R} \to \mathbb{R}$ given by
f(x) = cos x is not a contraction, but that the function g(x) =
Y§o cos x is a contraction.
Exercise 72 Confirm by differentiation that the function
$f: [1, \infty[ \to [1, \infty[$ given by $f(x) = x + 1/x$ is not a contraction,
but that the function $g: [1, \infty[ \to [1, \infty[$ given by
$g(x) = \tfrac{25}{26}(x + 1/x)$ is.
Exercise 73 Let $a$ be any real number larger than 5. Show by
differentiation that $f: [-1, 1] \to [-1, 1]$ given by
$$f(x) = \frac{1}{a}(x^3 + x^2 + 1)$$
is a contraction of $[-1, 1]$. Deduce that the equation
$$x^3 + x^2 - ax + 1 = 0$$
subject to the restriction $|x| \le 1$ has exactly one root.
Exercise 74 Show that the function $f: \mathbb{R} \to \mathbb{R}$ given by $f(x) =
\cos(\cos x)$ is a contraction, but that the function $g: \mathbb{R} \to \mathbb{R}$
given by
$$g(x) = \sin(\sin(\sin(\ldots(\sin x)\ldots)))$$
(with any number of sines) is not a contraction.
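The derivative test of Theorem 4.2 suggests a quick numerical experiment for Exercises 71 and 74: sample $|f'|$ on a fine grid and look at the largest value found. This is a sketch only, since a sampled maximum is evidence and not a proof; the helper name `sup_abs_derivative`, the grid size and the use of one period $[0, 2\pi]$ (the functions are periodic) are all assumptions of this illustration.

```python
import math


def sup_abs_derivative(df, a, b, samples=100001):
    """Largest sampled value of |df| on an evenly spaced grid over [a, b]."""
    step = (b - a) / (samples - 1)
    return max(abs(df(a + i * step)) for i in range(samples))


def d_cos(x):
    # derivative of cos x
    return -math.sin(x)


def d_cos_cos(x):
    # derivative of cos(cos x), by the chain rule
    return math.sin(math.cos(x)) * math.sin(x)


sup_cos = sup_abs_derivative(d_cos, 0.0, 2 * math.pi)
sup_cos_cos = sup_abs_derivative(d_cos_cos, 0.0, 2 * math.pi)
print("sampled sup |cos'|       :", sup_cos)
print("sampled sup |(cos cos)'| :", sup_cos_cos)
```

The first value comes out at (essentially) 1, so no $k < 1$ can work for $\cos$, while the second stays below $\sin 1 < 1$, in line with Exercise 74.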
4.4 Some extensions
If $f: X \to X$, then $f \circ f$ is given by $f \circ f(x) = f(f(x))$ and it is
again a function from $X$ to $X$: it is called the second iterate of $f$. In
general $f \circ f \circ \cdots \circ f$ (with $N$ $f$s) is the $N$th iterate and, of course, is given
by
$$f \circ f \circ \cdots \circ f(x) = f(f(\cdots f(x) \cdots)).$$
It is abbreviated to $f^N$. So, for example, the third iterate of $f: \mathbb{R} \to \mathbb{R}$
defined by $f(x) = x^2$ is given by
$$f^3(x) = f(f(f(x))) = ((x^2)^2)^2 = x^8,$$
and the fourth iterate of $g: \mathbb{R} \to \mathbb{R}$ defined by $g(x) = x^2 + 1$ is given by
$$g^4(x) = (((x^2 + 1)^2 + 1)^2 + 1)^2 + 1.$$
We saw in Exercises 71 and 74 that the cosine function is not a
contraction of $\mathbb{R}$ but that its second iterate is a contraction. In fact we
shall now see that provided some iterate of $f$ is a contraction we still
get a fixed-point result similar to the contraction mapping principle.
Theorem 4.3 Let $(X, d)$ be a complete metric space and let $f: X \to X$
have the property that, for some $N$, the iterate $f^N$ is a contraction of
$X$. Then $f$ has a unique fixed point. Furthermore the usual iterative
process starting at any $x_1 \in X$ and calculating
$$x_1,\quad x_2 = f(x_1),\quad x_3 = f(x_2),\ \ldots$$
yields a sequence converging to that unique fixed point.
Proof It is a pleasant surprise that we need not return to Cauchy
sequences and the details of the proof of the contraction mapping
principle, for our generalisation follows more easily than that. We
know that, as $f^N$ is a contraction of the complete space $(X, d)$, $f^N$ has a
unique fixed point $x_0$. Hence
$$f^N(x_0) = x_0,$$
therefore
$$f(f^N(x_0)) = f(x_0)$$
(the left-hand side is just $f$ applied to $x_0$ $N + 1$ times),
therefore
$$f^N(f(x_0)) = f(x_0),$$
or $f^N(y) = y$ where $y = f(x_0)$.
So $y$ is another fixed point of $f^N$. But $f^N$ only has one, namely $x_0$.
Hence $f(x_0) = y = x_0$ and $x_0$ is a fixed point of $f$. It must be unique,
since any point which $f$ fixes clearly remains fixed by $f^N$.
Finally, to see that
$$x_1,\quad x_2 = f(x_1),\quad x_3 = f(x_2),\ \ldots$$
converges to $x_0$, we relabel $f^N$ as $g$ and note that
$$x_{N+1} = f(x_N) = f(f(x_{N-1})) = \cdots = f^N(x_1) = g(x_1).$$
Similarly $x_{N+2} = g(x_2)$, etc. So we now rewrite the sequence $x_1, x_2, x_3,
\ldots$ as
$$x_1, x_2, x_3, \ldots, x_N,\ g(x_1), g(x_2), g(x_3), \ldots, g(x_N),\ g(g(x_1)), g(g(x_2)), \ldots.$$
This is actually a combination of the $N$ sequences
$$\begin{aligned}
&x_1,\ g(x_1),\ g(g(x_1)),\ \ldots \\
&x_2,\ g(x_2),\ g(g(x_2)),\ \ldots \\
&\ \ \vdots \\
&x_N,\ g(x_N),\ g(g(x_N)),\ \ldots.
\end{aligned}$$
Each of these is obtained by starting at some point of $X$ and iterating
with the contraction $g$. We know that any such sequence converges to
the unique fixed point of $g$, namely $x_0$. So each of the $N$ sequences
converges to $x_0$ and hence the combined sequence $x_1, x_2, x_3, \ldots$
converges to $x_0$, as claimed. □
Exercise 75 Let $f: \mathbb{R} \to \mathbb{R}$ be given by $f(x) = e^{-x}$. Use
differentiation to show that $f$ is not a contraction of $\mathbb{R}$, but
that $f^2$ is. Use an iterative technique to find (to three decimal
places) the unique fixed point of $f$, and hence the unique root
of $\log x + x = 0$. (Observe that in this iteration the sequence of
approximations which you get splits into two parts, one
increasing towards the limit and the other decreasing. This
resembles the proof of Theorem 4.3 where (when dealing with
an $N$th iterate) the sequence was split into $N$ parts each
converging to the fixed point.)
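The splitting described in Exercise 75 is easy to see numerically. The following sketch iterates $f(x) = e^{-x}$ from the (arbitrarily chosen) starting value $x_1 = 0$ and separates the terms with odd and even labels, which are exactly the two sequences obtained by iterating $f^2$ from $x_1$ and from $x_2$.

```python
import math

xs = [0.0]                        # x_1 = 0, an arbitrary starting value
for _ in range(60):
    xs.append(math.exp(-xs[-1]))  # x_{n+1} = f(x_n) with f(x) = e^{-x}

odd_terms = xs[0::2]              # x_1, x_3, x_5, ...  (iterating f^2 from x_1)
even_terms = xs[1::2]             # x_2, x_4, x_6, ...  (iterating f^2 from x_2)

print("odd-labelled terms :", [round(v, 4) for v in odd_terms[:5]])
print("even-labelled terms:", [round(v, 4) for v in even_terms[:5]])
print("fixed point to three places:", round(xs[-1], 3))
```

One subsequence climbs and the other descends, and the common limit $x$ satisfies $x = e^{-x}$, i.e. $\log x + x = 0$.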
Exercise 76 Let $C(0, \pi/2)$ be the metric space of continuous
real functions $x: [0, \pi/2] \to \mathbb{R}$ together with the usual max
metric. Let $f: C(0, \pi/2) \to C(0, \pi/2)$ take the member $x$ of
$C(0, \pi/2)$ (which is itself a function) to the member $f(x)$ of
$C(0, \pi/2)$ given by
$$(f(x))(t) = \int_0^t (x(u) + u)\sin u\,du.$$
So, for example, if $x$ is the zero function ($x(t) = 0$ for all $t$), then
$f(x)$ is the function given by
$$(f(x))(t) = \int_0^t (0 + u)\sin u\,du = \sin t - t\cos t.$$
Let $x, y \in C(0, \pi/2)$ be given by
$$x(t) = -t, \quad y(t) = 1 - t \quad (0 \le t \le \pi/2).$$
Find $f(x)$ and $f(y)$ and evaluate $d(x, y)$ and $d(f(x), f(y))$.
Deduce that $f$ is not a contraction of $C(0, \pi/2)$.
We have seen that the function $f$ in Exercise 76 is not a contraction:
but we now set out to show that some iterate of $f$ is a contraction.
Note that, for any $x, y \in C(0, \pi/2)$ and $t \in [0, \pi/2]$,
$$\begin{aligned}
(f(x))(t) - (f(y))(t) &= \int_0^t (x(u) + u)\sin u\,du - \int_0^t (y(u) + u)\sin u\,du \\
&= \int_0^t (x(u) - y(u))\sin u\,du \\
&\le \int_0^t |x(u) - y(u)|\,du.
\end{aligned}$$
But $d(x, y)$ is the biggest of all the possible values of $|x(u) - y(u)|$ and so
$$(f(x))(t) - (f(y))(t) \le \int_0^t d(x, y)\,du = t\,d(x, y).$$
In particular $d(f(x), f(y))$ is the largest of all the values of $|(f(x))(t) -
(f(y))(t)|$ and so
$$d(f(x), f(y)) \le \max_{0 \le t \le \pi/2}\,(t\,d(x, y)) = \frac{\pi}{2}\,d(x, y).$$
Exercise 77 Let $f$ be as above. Use the inequalities just
established (at first applied to $f(x)$ and $f(y)$ rather than $x$ and
$y$) to show that
$$(f(f(x)))(t) - (f(f(y)))(t) \le \int_0^t |(f(x))(u) - (f(y))(u)|\,du,$$
i.e.
$$(f^2(x))(t) - (f^2(y))(t) \le \int_0^t u\,d(x, y)\,du = \frac{t^2}{2}\,d(x, y),$$
and deduce further that
$$(f^3(x))(t) - (f^3(y))(t) \le \frac{t^3}{3!}\,d(x, y).$$
Hence show that for all $x, y \in C(0, \pi/2)$,
$$d(f(x), f(y)) \le \frac{\pi}{2}\,d(x, y) \quad \text{(seen above)},$$
and
$$d(f^3(x), f^3(y)) \le \frac{1}{6}\left(\frac{\pi}{2}\right)^3 d(x, y).$$
Deduce that the third iterate of $f$ is a contraction of $C(0, \pi/2)$.
(If you find some of these details hard simply move on: there
is a similar example worked through in full detail at the
beginning of the next section.)
We have therefore seen that if $f: C(0, \pi/2) \to C(0, \pi/2)$ is given by
$$(f(x))(t) = \int_0^t (x(u) + u)\sin u\,du,$$
then the third iterate of $f$ is a contraction of the (complete) metric
space $C(0, \pi/2)$. Hence, by Theorem 4.3, $f$ has a unique fixed point; i.e.
there is precisely one continuous $x: [0, \pi/2] \to \mathbb{R}$ which satisfies
$$x(t) = (f(x))(t) = \int_0^t (x(u) + u)\sin u\,du$$
for all $t \in [0, \pi/2]$. So the 'integral equation'
$$x(t) = \int_0^t (x(u) + u)\sin u\,du \quad (0 \le t \le \pi/2)$$
has a unique solution. But differential equations are much more
common and useful: what is the relevance of the fixed point of $f$ to a
differential equation? We see that the integral equation (on the left)
is equivalent to a differential equation (on the right):
$$x(t) = \int_0^t (x(u) + u)\sin u\,du \ \text{ for all } t \in [0, \tfrac{\pi}{2}]
\quad\Longleftrightarrow\quad
\frac{dx}{dt} = (x + t)\sin t \ \text{ and } \ x(0) = 0.$$
(To pass from left to right, differentiate both sides and note that
$x(0) = 0$; to pass from right to left, integrate both sides from 0 to $t$.)
We will meet this again in the next section and it is only included here
as a preview. But let us summarise what we have seen in this example.
The solutions of the differential equation
$$\frac{dx}{dt} = (x + t)\sin t \quad \text{subject to } x(0) = 0$$
are equivalent to solutions of
$$x(t) = \int_0^t (x(u) + u)\sin u\,du.$$
These, in turn, are the fixed points of a function $f: C(0, \pi/2) \to
C(0, \pi/2)$ for which some iterate is a contraction. Hence $f$ has a unique
fixed point and the differential equation has a unique solution. We
shall return to this topic much more fully in the next section.
That completes our generalisation of the contraction mapping
principle to iterates: our second extension involves weakening the
requirement that $d(f(x), f(y)) \le k\,d(x, y)$ for some $k < 1$. We saw by
means of examples in the first section of this chapter that the
condition $d(f(x), f(y)) < d(x, y)$ for all $x \ne y$ was not enough to ensure
that $f$ has a fixed point. Our counter-example concerned $f(x) = x +
1/x$ for $x \in [1, \infty[$ where, in a very informal sense, the fixed point was
'at infinity' and therefore not allowed. For our second extension of the
contraction mapping principle we shall see now that if we restrict
attention to compact spaces then the slightly weaker condition that $f$
reduces distances (rather than reduces them by at least a factor of
$k < 1$) is sufficient to ensure a unique fixed point. Furthermore our
iterative process will again lead to that fixed point. Those readers who
chose to pass quickly over compactness in the previous chapter can
easily miss out this theorem.
Theorem 4.4 Let $(X, d)$ be a compact metric space (i.e. $X$ is itself
compact) and let $f: X \to X$ satisfy
$$d(f(x), f(y)) < d(x, y)$$
for all $x \ne y$ in $X$. Then $f$ has a unique fixed point $x_0$. Furthermore if
$x_1 \in X$ and $x_2 = f(x_1)$, $x_3 = f(x_2)$, and so on, then the sequence $x_1, x_2,
x_3, \ldots$ converges to $x_0$.
Proof For each $x \in X$ the distance $d(x, f(x))$ is a real number. Define
$F: X \to \mathbb{R}$ by $F(x) = d(x, f(x))$. By the triangle inequality, for any $x$
and $y$ in $X$
$$d(x, f(x)) \le d(x, y) + d(y, f(y)) + d(f(y), f(x)),$$
and so
$$\begin{aligned}
|F(x) - F(y)| &= |d(x, f(x)) - d(y, f(y))| \\
&\le d(x, y) + d(f(y), f(x)) \\
&\le 2\,d(x, y)
\end{aligned}$$
(by the given distance-reducing property of $f$).
So, as in Exercise 54, the set
$$F(X) = \{F(x): x \in X\} = \{d(x, f(x)): x \in X\}$$
has a least element, $\alpha$, say, equal to $d(x_0, f(x_0))$. The exercise will now
show us that $\alpha = 0$ and hence that $x_0 = f(x_0)$.
Exercise 78 Show that if $\alpha \ne 0$ then the distance from $f(x_0)$ to
$f(f(x_0))$ is less than $\alpha$. Deduce that $f$ has a fixed point.
Hence $f$ has a fixed point $x_0$ and (by Exercise 66) this point is unique.
The remaining question, crucial to our approach, is whether choosing
$x_1 \in X$ and evaluating $x_2 = f(x_1)$, $x_3 = f(x_2)$, and so on, will give a
sequence converging to $x_0$. So let $x_1, x_2, x_3, \ldots$ be constructed in that
way. Then
$$d(x_n, x_0) = d(f(x_{n-1}), f(x_0)) \le d(x_{n-1}, x_0)$$
and so the sequence of real numbers
$$d(x_1, x_0),\ d(x_2, x_0),\ d(x_3, x_0),\ \ldots$$
is decreasing and hence convergent to some number $\beta \ge 0$. We aim to
show that $\beta = 0$ (so that
$$d(x_1, x_0),\ d(x_2, x_0),\ d(x_3, x_0),\ \ldots \to 0;$$
i.e. $x_1, x_2, x_3, \ldots \to x_0$ as claimed).
By the compactness of $(X, d)$ the sequence $x_1, x_2, x_3, \ldots$ certainly
has a convergent subsequence
$$x_{k_1},\ x_{k_2},\ x_{k_3},\ \ldots \to y \text{ (say)}.$$
Therefore
$$d(x_{k_1}, x_0),\ d(x_{k_2}, x_0),\ d(x_{k_3}, x_0),\ \ldots \to d(y, x_0)$$
since $d(x_n, x_0)$ lies between $d(y, x_0) - d(y, x_n)$ and $d(y, x_0) + d(y, x_n)$
(the triangle inequality). But
$$d(x_1, x_0),\ d(x_2, x_0),\ d(x_3, x_0),\ \ldots \to \beta$$
and so $d(y, x_0) = \beta$. We have shown therefore that any convergent
subsequence of $x_1, x_2, x_3, \ldots$ has its limit at a distance $\beta$ from $x_0$. But
as
$$x_{k_1},\ x_{k_2},\ x_{k_3},\ \ldots \to y$$
and $f$ reduces distances it follows that
$$f(x_{k_1}),\ f(x_{k_2}),\ f(x_{k_3}),\ \ldots \to f(y);$$
i.e., since $f(x_{k_i}) = x_{k_i + 1}$,
$$x_{k_1 + 1},\ x_{k_2 + 1},\ x_{k_3 + 1},\ \ldots \to f(y).$$
So we have another convergent subsequence of $x_1, x_2, x_3, \ldots$ and by
the above comments its limit must be distance $\beta$ from $x_0$; i.e.
$d(f(y), x_0) = \beta$. Hence
$$d(f(y), f(x_0)) = d(f(y), x_0) = \beta = d(y, x_0)$$
and so by the fact that $f$ actually reduces distances between distinct
points it follows that $x_0 = y$ and $\beta = 0$.
Therefore $\beta$ is 0, the sequence of $d(x_n, x_0)$s does converge to 0, and
$x_1, x_2, x_3, \ldots$ does converge to $x_0$, the fixed point of $f$, as required. □
Exercise 79 Let $f: [-\pi, \pi] \to [-\pi, \pi]$ be given by $f(x) =
\sin x + 1$. Show that $|f'(x)| < 1$ for all $x \in\ ]-\pi, \pi[$ other than
$x = 0$ (where $f'(0) = 1$). Use the technique of the proof of
Theorem 4.2, treating the point 0 separately, to deduce that
$$|f(x) - f(y)| < |x - y|$$
for all $x \ne y$ in $[-\pi, \pi]$. What can you deduce about the
equation
$$\sin x - x + 1 = 0\,?$$
Find a root of the equation (to three decimal places, say).
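The last part of Exercise 79 can be tried by direct iteration of $f(x) = \sin x + 1$. The sketch below starts from $x_1 = 0$ (an arbitrary point of $[-\pi, \pi]$) and uses a fixed number of steps chosen generously for illustration; Theorem 4.4 guarantees convergence but gives no rate, so the step count is an assumption that is checked by printing the residual.

```python
import math

x = 0.0                        # x_1: any point of [-pi, pi] would do
for _ in range(200):
    x = math.sin(x) + 1.0      # x_{n+1} = f(x_n); the values stay in [0, 2]

print("approximate root of sin x - x + 1 = 0:", round(x, 3))
print("residual:", math.sin(x) - x + 1.0)
```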
During our development of the contraction mapping principle we
have seen applications to real equations (Exercises 61, 62, 71, 72, 73,
75 and others), simultaneous equations (Exercises 63, 64 and 67),
linear algebra (Exercises 69 and 70) and differential equations
(Exercises 76 and 77). We now examine that last application in much
more detail.
4.5 Differential equations
The contraction mapping principle will enable us to deduce a
general result about the solutions of a differential equation of the form
$dx/dt = F(x, t)$. But in order to understand the result (and proof) we
begin by working through one example in its full detail.
Let us start therefore by looking at the differential equation
$$\frac{dx}{dt} = (x + t^2)\,e^{t-1}$$
where we want a solution $x$ defined for $t \in [1, 3]$ and which satisfies
the initial condition $x(1) = 4$. Integrating the equation from 1 to $t$ and
using the fact that $x(1) = 4$ gives
$$x(t) = 4 + \int_1^t (x(u) + u^2)\,e^{u-1}\,du \quad (1 \le t \le 3).$$
(Indeed, as we remarked when considering the differential equation in
the previous section, this integral equation ensures that $x(1) = 4$ and
on differentiation we get back to
$$\frac{dx}{dt} = (x + t^2)\,e^{t-1}.$$
So the integral equation is equivalent to the differential equation
together with the initial condition.)
Therefore we can switch attention to the integral equation. For
each $x \in C(1, 3)$ let $f(x)$ be the function in $C(1, 3)$ given by
$$(f(x))(t) = 4 + \int_1^t (x(u) + u^2)\,e^{u-1}\,du \quad (1 \le t \le 3).$$
Then the solutions of the integral equation are precisely those
functions $x \in C(1, 3)$ for which $f(x) = x$; i.e. the fixed points of $f$. We
shall investigate whether $f: C(1, 3) \to C(1, 3)$ (or some iterate of it) is a
contraction.
For each $x, y \in C(1, 3)$ and $t \in [1, 3]$
$$(f(x))(t) - (f(y))(t) = \left[4 + \int_1^t (x(u) + u^2)\,e^{u-1}\,du\right] - \left[4 + \int_1^t (y(u) + u^2)\,e^{u-1}\,du\right].$$
Recall that $d(x, y)$ is the largest of all the values of $|x(u) - y(u)|$ for
$1 \le u \le 3$, and also the biggest value of $e^{u-1}$ for $u \in [1, 3]$ is $e^2$.
Therefore
$$\begin{aligned}
(f(x))(t) - (f(y))(t) &= \int_1^t (x(u) - y(u))\,e^{u-1}\,du \le \int_1^t |x(u) - y(u)|\,e^2\,du \\
&\le \int_1^t d(x, y)\,e^2\,du = d(x, y)\,e^2\,(t - 1).
\end{aligned}$$
If we apply the first part of this process to $f(x)$ and $f(y)$ rather than
$x$ and $y$ we get
$$(f^2(x))(t) - (f^2(y))(t) \le \int_1^t |(f(x))(u) - (f(y))(u)|\,e^2\,du.$$
But we have also established that
$$(f(x))(t) - (f(y))(t) \le d(x, y)\,e^2\,(t - 1)$$
and so
$$|(f(x))(u) - (f(y))(u)| = \left\{\begin{array}{c} (f(x))(u) - (f(y))(u) \\ \text{or} \\ (f(y))(u) - (f(x))(u) \end{array}\right\} \le d(x, y)\,e^2\,(u - 1).$$
Hence
$$\begin{aligned}
(f^2(x))(t) - (f^2(y))(t) &\le \int_1^t |(f(x))(u) - (f(y))(u)|\,e^2\,du \\
&\le \int_1^t d(x, y)\,e^4\,(u - 1)\,du \\
&= d(x, y)\,e^4\,\frac{(t - 1)^2}{2}.
\end{aligned}$$
Exercise 80 Continue the above process to deduce that
$$(f^3(x))(t) - (f^3(y))(t) \le d(x, y)\,e^6\,\frac{(t - 1)^3}{3!}$$
and
$$(f^4(x))(t) - (f^4(y))(t) \le d(x, y)\,e^8\,\frac{(t - 1)^4}{4!}.$$
State the general result concerning $(f^N(x))(t) - (f^N(y))(t)$.
Since, for $1 \le t \le 3$,
$$(f(x))(t) - (f(y))(t) \le d(x, y)\,e^2\,(t - 1) \le 2e^2\,d(x, y)$$
it follows that
$$d(f(x), f(y)) = \max_{1 \le t \le 3} |(f(x))(t) - (f(y))(t)| \le 2e^2\,d(x, y).$$
However, $2e^2 \approx 15$ and so the fact that $d(f(x), f(y)) \le 2e^2\,d(x, y)$ is not
much help in showing whether or not $f$ is a contraction! Similarly we
saw above that for $1 \le t \le 3$
$$(f^2(x))(t) - (f^2(y))(t) \le d(x, y)\,e^4\,\frac{(t - 1)^2}{2} \le e^4\,\frac{2^2}{2}\,d(x, y) = 2e^4\,d(x, y)$$
and so $d(f^2(x), f^2(y)) \le 2e^4\,d(x, y)$: but again $2e^4 \approx 109$ is much too
large to be of any help.
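Exercise 81 Use the results of the previous exercise to show
that
$$d(f^3(x), f^3(y)) \le \frac{2^3}{3!}\,e^6\,d(x, y),$$
$$d(f^4(x), f^4(y)) \le \frac{2^4}{4!}\,e^8\,d(x, y)$$
and, in general,
$$d(f^N(x), f^N(y)) \le \frac{2^N e^{2N}}{N!}\,d(x, y).$$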
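How does Exercise 81 help us to establish that some iterate of $f$ is a
contraction? We look in detail at the factors by which the iterates
change distances:

  inequality                                                          'factor' $k$                                  comment
  $d(f(x), f(y)) \le 2e^2\,d(x, y)$                                   $2e^2 \approx 15$                             in going from the first to the second the factor multiplies by $2e^2/2$
  $d(f^2(x), f^2(y)) \le \frac{2^2}{2!}e^4\,d(x, y)$                  $\frac{2^2}{2!}e^4 \approx 109$               in going from the second to the third the factor multiplies by $2e^2/3$
  $d(f^3(x), f^3(y)) \le \frac{2^3}{3!}e^6\,d(x, y)$                  $\frac{2^3}{3!}e^6 \approx 538$               in going from the third to the fourth the factor multiplies by $2e^2/4$
  $d(f^4(x), f^4(y)) \le \frac{2^4}{4!}e^8\,d(x, y)$                  $\frac{2^4}{4!}e^8 \approx 1987$
  ...
  $d(f^{13}(x), f^{13}(y)) \le \frac{2^{13}}{13!}e^{26}\,d(x, y)$     $\frac{2^{13}}{13!}e^{26} \approx 257493$     multiplied by $2e^2/14$ ($> 1$) so the factors are still rising
  $d(f^{14}(x), f^{14}(y)) \le \frac{2^{14}}{14!}e^{28}\,d(x, y)$     $\frac{2^{14}}{14!}e^{28} \approx 271805$     multiplied by $2e^2/15$ ($< 1$) so the factors, at last, start to decrease
  $d(f^{15}(x), f^{15}(y)) \le \frac{2^{15}}{15!}e^{30}\,d(x, y)$     $\frac{2^{15}}{15!}e^{30} \approx 267784$
  ...
  $d(f^{36}(x), f^{36}(y)) \le \frac{2^{36}}{36!}e^{72}\,d(x, y)$     $\frac{2^{36}}{36!}e^{72} \approx 3.43$
  $d(f^{37}(x), f^{37}(y)) \le \frac{2^{37}}{37!}e^{74}\,d(x, y)$     $\frac{2^{37}}{37!}e^{74} \approx 1.37$
  $d(f^{38}(x), f^{38}(y)) \le \frac{2^{38}}{38!}e^{76}\,d(x, y)$     $\frac{2^{38}}{38!}e^{76} \approx 0.53$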
Incidentally $e^2 = 1 + 2 + \frac{2^2}{2!} + \frac{2^3}{3!} + \cdots$ which is another way of seeing
that the terms $2^n/n!$ tend to zero.
Hence $d(f^{38}(x), f^{38}(y)) \le k\,d(x, y)$ where $k$ is approximately 0.53;
i.e. the 38th iterate of $f$ is a contraction! Therefore $f$ has a unique
fixed point and (if you haven't forgotten by now) this means that the
original integral equation has a unique solution. We have therefore
(rather tortuously) seen that the differential equation
$$\frac{dx}{dt} = (x + t^2)\,e^{t-1} \quad (1 \le t \le 3), \qquad x(1) = 4$$
has a unique solution.
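The factors $c^N/N!$ with $c = 2e^2$ can be generated by the same 'multiply by $c/N$' rule that the discussion above uses. The sketch below does this and reports where the factors peak and where they first drop below 1; the function name `factors` is a hypothetical helper, and the second call (with $c = \pi/2$) revisits the bound from the example of the previous section.

```python
import math


def factors(c, count):
    """Return [c**1/1!, c**2/2!, ..., c**count/count!], built by multiplying by c/N."""
    out, value = [], 1.0
    for n in range(1, count + 1):
        value *= c / n          # going from the (N-1)th factor to the Nth
        out.append(value)
    return out


def first_below_one(values):
    """1-based position of the first entry smaller than 1."""
    return next(n for n, v in enumerate(values, start=1) if v < 1.0)


fs = factors(2 * math.e ** 2, 40)
peak = fs.index(max(fs)) + 1
print("largest factor occurs at N =", peak)
print("first N with factor < 1   :", first_below_one(fs))
print("factors for N = 36, 37, 38:", [round(v, 2) for v in fs[35:38]])
print("same question for c = pi/2:", first_below_one(factors(math.pi / 2, 10)))
```

Building the factors multiplicatively also avoids forming the very large numbers $c^N$ and $N!$ separately.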
Of course we shall not have to go through this process for each
differential equation! Our next theorem will cover all equations of
this type in one go: the advantage of having worked through this
example at length is that, with luck, the proof will now seem natural
and easy. We make one more comment before proceeding to that
general result. The contraction mapping principle, apart from
establishing the existence of unique fixed points, enables us to find the
fixed point by repeatedly applying the function. In this particular
application reapplying the function will mean integrating a
succession of functions. This is generally a daunting task even if the
integration is elementary. For example, solving the differential
equation
$$\frac{dx}{dt} = (x + t^2)\,e^{t-1}, \qquad x(1) = 4$$
considered earlier would require iterating with $f$ given by
$$(f(x))(t) = 4 + \int_1^t (x(u) + u^2)\,e^{u-1}\,du.$$
Starting with an initial guess of $x_1(t) = 0$, say, each stage would
involve integrals of terms of the form $u^n e^{mu + p}$: these are reasonably
elementary to integrate but remember that the 38th iterate of $f$ was
the contraction, so we would have to perform 38 iterations before we
could be sure that our function was any closer to the fixed point than $x_1$! An
alternative is to use a computer to calculate successive terms by
numerical integration. In general, though, our method is a delightful
way of showing the existence and uniqueness of solutions of
differential equations without actually finding them. But before
proceeding to the general result we include one exercise where the
solution of the differential equation can actually be found by
iteration. That is not to say, of course, that it is the most practical way
of solving the equation.
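The suggestion of using a computer with numerical integration can be sketched as follows for the example $dx/dt = (x + t^2)e^{t-1}$, $x(1) = 4$ on $[1, 3]$. Everything about the discretisation is an assumption of this illustration: functions are represented by their values on an evenly spaced grid, the integral is replaced by the trapezium rule, and the grid size and number of iterations are arbitrary. The result is therefore only an approximation to the fixed point of $f$.

```python
import math


def picard_step(ts, xs):
    """Approximate f(x) on the grid ts: 4 + integral from 1 to t of (x(u) + u^2) e^(u-1) du."""
    g = [(x + t * t) * math.exp(t - 1.0) for t, x in zip(ts, xs)]   # the integrand
    out = [4.0]                                                      # value at t = 1
    for i in range(1, len(ts)):
        h = ts[i] - ts[i - 1]
        out.append(out[-1] + 0.5 * h * (g[i - 1] + g[i]))            # trapezium rule
    return out


n = 400
ts = [1.0 + 2.0 * i / n for i in range(n + 1)]
xs = [0.0] * (n + 1)            # initial guess x_1(t) = 0
changes = []                    # max-metric distance between successive iterates
for _ in range(120):
    new = picard_step(ts, xs)
    changes.append(max(abs(a - b) for a, b in zip(new, xs)))
    xs = new

print("distance between first two iterates:", changes[0])
print("largest distance seen              :", max(changes))
print("distance between last two iterates :", changes[-1])
print("approximate x(3)                   :", xs[-1])
```

The distances between successive iterates typically grow before they shrink, which is the numerical counterpart of the rising and then falling factors discussed earlier in this section.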
Exercise 82 The solutions of the differential equation
$$\frac{dx}{dt} = (x + t)\,t \quad (0 \le t \le 1), \qquad x(0) = 0$$
are the fixed points of the function $f: C(0, 1) \to C(0, 1)$ given
by
$$(f(x))(t) = \int_0^t (x(u) + u)\,u\,du.$$
Show that $f$ is a contraction. Hence the differential equation
has a unique solution. To find the solution let $x_1$ be given by
$x_1(t) = 0$ for all $t \in [0, 1]$. Let $x_2 = f(x_1)$, $x_3 = f(x_2)$ and so on.
Show that $x_2(t) = t^3/3$ and $x_3(t) = (t^3/3) + (t^5/15)$. Convince
yourselves by evaluating a few more terms in the sequence
that $x_1, x_2, x_3, \ldots$ converges to the function given by
$$\frac{t^3}{3} + \frac{t^5}{3 \cdot 5} + \frac{t^7}{3 \cdot 5 \cdot 7} + \frac{t^9}{3 \cdot 5 \cdot 7 \cdot 9} + \cdots.$$
This, in series form, is therefore the required unique solution
of the differential equation.
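Because every iterate in Exercise 82 is a polynomial, the 'few more terms' can be evaluated exactly. In the sketch below a polynomial is stored as a list of `Fraction` coefficients (entry `i` multiplies $t^i$); the function name `apply_f` is a hypothetical helper for this illustration.

```python
from fractions import Fraction


def apply_f(coeffs):
    """Coefficients of the integral from 0 to t of (x(u) + u) * u du,
    where coeffs[i] multiplies t**i in x(t)."""
    integrand = [Fraction(0)] + list(coeffs)        # u * x(u): shift powers up by one
    while len(integrand) < 3:
        integrand.append(Fraction(0))
    integrand[2] += 1                               # add the u * u term
    # integrate term by term: u**i becomes t**(i + 1) / (i + 1)
    return [Fraction(0)] + [c / (i + 1) for i, c in enumerate(integrand)]


x = [Fraction(0)]               # x_1 is the zero function
iterates = [x]
for _ in range(4):
    x = apply_f(x)
    iterates.append(x)

for label, poly in enumerate(iterates, start=1):
    terms = {power: str(c) for power, c in enumerate(poly) if c != 0}
    print("x_%d: nonzero coefficients by power of t: %s" % (label, terms))
```

Each application of $f$ leaves the earlier coefficients unchanged and contributes one new odd power, which is why the sequence settles into a series.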
Theorem 4.5 Let $F: \mathbb{R} \times [a, b] \to \mathbb{R}$; i.e. $F$ is a real function of two
variables such that $F(x, t)$ is defined for all $x \in \mathbb{R}$ and $t \in [a, b]$. Assume
that $F$ is continuous and that there exists a fixed real number $L$ with
$$|F(x, t) - F(y, t)| \le L\,|x - y| \quad \text{(this is called a 'Lipschitzian condition')}$$
for all $x, y \in \mathbb{R}$ and $t \in [a, b]$. Then the differential equation
$$\frac{dx}{dt} = F(x, t),$$
subject to an initial condition of the type $x(a) = \beta$, has a unique
solution.
Proof (We will assume that the initial condition is $x(a) = \beta$: there is
no real loss in generality, but it makes the proof a little easier.) Having
seen two particular examples worked through ($F(x, t) = (x + t)\sin t$ in
Exercise 76, etc., and $F(x, t) = (x + t^2)\,e^{t-1}$ at the beginning of this
section), the reader will probably know exactly how the proof will
proceed. Define $f: C(a, b) \to C(a, b)$ by
$$(f(x))(t) = \beta + \int_a^t F(x(u), u)\,du.$$
Then the required solutions of the differential equation are precisely
the fixed points of $f$.
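Now, for each $x, y \in C(a, b)$ and $t \in [a, b]$,
$$\begin{aligned}
(f(x))(t) - (f(y))(t) &= \left[\beta + \int_a^t F(x(u), u)\,du\right] - \left[\beta + \int_a^t F(y(u), u)\,du\right] \\
&= \int_a^t [F(x(u), u) - F(y(u), u)]\,du \\
&\le \int_a^t L\,|x(u) - y(u)|\,du \quad \text{(by the Lipschitzian condition)} \\
&\le \int_a^t L\,d(x, y)\,du = L\,d(x, y)\,(t - a).
\end{aligned}$$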
Hence, in a similar fashion, but with $f(x)$ and $f(y)$ rather than $x$
and $y$,
$$(f^2(x))(t) - (f^2(y))(t) \le \int_a^t L\,|(f(x))(u) - (f(y))(u)|\,du.$$
But we have already shown that
$$(f(x))(t) - (f(y))(t) \le L\,d(x, y)\,(t - a)$$
and so
$$\begin{aligned}
(f^2(x))(t) - (f^2(y))(t) &\le \int_a^t L\,|(f(x))(u) - (f(y))(u)|\,du \\
&\le \int_a^t L^2\,d(x, y)\,(u - a)\,du \\
&= L^2\,d(x, y)\,\frac{(t - a)^2}{2}.
\end{aligned}$$
Continuing in this way it is easy to deduce that in general
$$(f^N(x))(t) - (f^N(y))(t) \le L^N\,d(x, y)\,\frac{(t - a)^N}{N!} \le L^N\,d(x, y)\,\frac{(b - a)^N}{N!} \quad (\text{since } t \in [a, b]).$$
Hence for each positive integer $N$
$$d(f^N(x), f^N(y)) \le \frac{L^N (b - a)^N}{N!}\,d(x, y).$$
Now as $N \to \infty$ the numbers $L^N (b - a)^N / N!$ tend to zero. For, as we
saw in the earlier example, as soon as $N$ is larger than the fixed
number $L(b - a)$ the terms $L^N (b - a)^N / N!$ start to fall and eventually to
decrease dramatically. Hence there exists some $L^{N_0}(b - a)^{N_0}/N_0!$
less than 1, and so the $N_0$th iterate of $f$ is then a contraction. Thus,
by Theorem 4.3, $f$ has a unique fixed point; and the differential
equation has a unique solution. □
In practice, finding whether there exists such an L as in Theorem 4.5 is
straightforward, as the next result and exercises show.
Theorem 4.6 Let $F: \mathbb{R} \times [a, b] \to \mathbb{R}$; i.e. $F$ is a real function of two
variables such that $F(x, t)$ is defined for all $x \in \mathbb{R}$ and $t \in [a, b]$. Assume
that $F$ is continuous, that $F$ can be partially differentiated with respect
to $x$ (i.e. with $t$ fixed) and that $\partial F/\partial x$ is bounded throughout $\mathbb{R} \times [a, b]$.
Then the differential equation
$$\frac{dx}{dt} = F(x, t),$$
subject to an initial condition of the type $x(a) = \beta$, has a unique
solution.
Proof Assume that $|\partial F/\partial x| \le L$ for all $x \in \mathbb{R}$ and $t \in [a, b]$. Then let us
fix $t$ and define a function $G$ of $x$ alone by
$$G(x) = F(x, t) \quad (t \text{ fixed}).$$
It follows that $G$ is differentiable and, by the mean value theorem we
recalled earlier, for any distinct $x$ and $y$ in $\mathbb{R}$
$$G(x) - G(y) = G'(c)\,(x - y)$$
for some $c$ between $x$ and $y$. Hence
$$|F(x, t) - F(y, t)| = |G(x) - G(y)| = |G'(c)|\,|x - y| = \left|\frac{\partial F}{\partial x}\right| |x - y| \le L\,|x - y|,$$
where $\partial F/\partial x$ is evaluated at $(c, t)$.
Thus $F$ satisfies a Lipschitzian condition as in Theorem 4.5 and the
uniqueness of the solution of the differential equation follows from
that result. □
Exercise 83 Show that for any positive real number $T$ and
any real number $x$
2x 2x
2 2 2 ^ 2
(T +x ) T +x'
Now let
$$F(x, t) = \frac{t}{2^t + x^2}$$
for $x \in \mathbb{R}$ and $t \in [-10, 10]$. Use the above inequalities with
$T = 2^{t/2}$ to show that
$$\left|\frac{\partial F}{\partial x}\right| \le 10 \cdot 2^{15}$$
throughout $\mathbb{R} \times [-10, 10]$. It follows, therefore, from
Theorem 4.6, that the differential equation
$$\frac{dx}{dt} = \frac{t}{2^t + x^2}, \qquad x(0) = 1$$
has a unique solution defined on $[-10, 10]$. Show that the
same conclusion can be drawn for any interval $[-n, n]$. Do
you see why this means that the differential equation
$$\frac{dx}{dt} = \frac{t}{2^t + x^2}, \qquad x(0) = 1$$
has a unique solution $x$ defined for $t \in \mathbb{R}$?
Exercise 84 Let $[a,b]$ be an interval contained in $]0,\infty[$.
Show that for $t \in [a,b]$ and $x \in \mathbb{R}$
\[ \frac{e^x}{(t + e^x)^2} \le \frac{1}{t} \le \frac{1}{a}. \]
Use Theorem 4.6 to show that the differential equation
\[ \frac{dx}{dt} = \frac{1}{t + e^x} \]
subject to an initial condition $x(a) = \beta$ has a unique solution
$x$ defined for $t \in [a,b]$. Can you see why this means that there
is a unique such solution $x$ defined for $t \in\, ]0,\infty[$?
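A purely numerical sanity check (not a proof) of the bound in Exercise 84 can be sketched as follows. The interval [a, b] = [0.5, 3] and the finite range of x values are illustrative choices made here, not taken from the text; the sketch samples the partial derivative on a grid and also tests the resulting Lipschitz inequality with L = 1/a.

```python
import math

def F(x, t):
    """The right-hand side 1/(t + e^x) of the equation in Exercise 84."""
    return 1.0 / (t + math.exp(x))

def dF_dx(x, t):
    """Partial derivative of F with respect to x (t held fixed)."""
    ex = math.exp(x)
    return -ex / (t + ex) ** 2

a, b = 0.5, 3.0                      # illustrative interval inside ]0, oo[
L = 1.0 / a                          # the bound suggested by the exercise

xs = [-20.0 + 0.25 * i for i in range(161)]      # sample x in [-20, 20]
ts = [a + (b - a) * j / 50 for j in range(51)]   # sample t in [a, b]

# Largest sampled value of |dF/dx|; it should not exceed L.
largest_slope = max(abs(dF_dx(x, t)) for x in xs for t in ts)

# Sampled version of |F(x,t) - F(y,t)| <= L |x - y|.
lipschitz_ok = all(
    abs(F(x, t) - F(y, t)) <= L * abs(x - y) + 1e-15
    for t in ts for x in xs[::8] for y in xs[::8]
)
```

Sampling can only fail to find a counterexample; the inequality itself still has to be proved as the exercise asks.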
There is an extension of the above approach to simultaneous
differential equations and hence to higher-order linear differential
equations: the reader who has had enough of this topic can turn
straight to the next section.
We saw in Exercise 64 that in order to solve the simultaneous
equations
\[ x_1 = \tfrac{1}{2}\cos x_2 \quad\text{and}\quad x_2 = \tfrac{1}{2}\sin x_1 + 1 \]
we could consider the function $f: \mathbb{R}^2 \to \mathbb{R}^2$ given by $f(x_1,x_2) =
(\tfrac{1}{2}\cos x_2, \tfrac{1}{2}\sin x_1 + 1)$ and look for its fixed points. In other words, to
solve two real equations simultaneously we would consider pairs of
members of $\mathbb{R}$. Similarly, in Exercise 67, in order to solve three real
equations simultaneously we used the metric space of triples of
members of $\mathbb{R}$. So if we want to consider a set of simultaneous
differential equations (with solutions on $[a,b]$, say) perhaps we ought
to consider a metric space consisting of lists of members of $C(a,b)$.
Needless to say the approach can get a little technical and so we give
only a broad outline (leaving most of the work to you!).
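The idea of solving two real equations by iterating a map on pairs can be sketched in a few lines. The sketch assumes that the map recalled from Exercise 64 is f(x1, x2) = (cos(x2)/2, sin(x1)/2 + 1), with coefficients one half; the starting point, the tolerance and the use of the 'max' metric are choices made here for illustration.

```python
import math

def f(p):
    """One step of the assumed map f(x1, x2) = (cos(x2)/2, sin(x1)/2 + 1)."""
    x1, x2 = p
    return (0.5 * math.cos(x2), 0.5 * math.sin(x1) + 1.0)

def max_metric(p, q):
    """Distance between two points of R^2 in the 'max' metric."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def iterate_to_fixed_point(start, tol=1e-12, max_steps=200):
    """Iterate f from `start` until successive points are within `tol`."""
    p = start
    for step in range(1, max_steps + 1):
        q = f(p)
        if max_metric(p, q) < tol:
            return q, step
        p = q
    raise RuntimeError("no convergence within max_steps")

fixed_point, steps = iterate_to_fixed_point((0.0, 0.0))
```

Each step at least halves the distance between successive iterates, since both cosine and sine have slope at most 1 in absolute value, so the loop stops after a few dozen steps.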
Let us see first why the solutions of simultaneous differential
equations are relevant to the solutions of linear differential equations
of higher order.
Exercise 85 Show that given any solution (for $0 \le t \le 1$, say)
of the simultaneous differential equations
\[ \frac{dx_1}{dt} = x_2,\ x_1(0) = 1 \quad\text{and}\quad
   \frac{dx_2}{dt} = e^t - x_1 - 2t x_2,\ x_2(0) = 2 \]
the function $x_1$ is a solution of
\[ \frac{d^2x_1}{dt^2} + 2t\frac{dx_1}{dt} + x_1 = e^t, \quad x_1(0) = 1,\ \frac{dx_1}{dt}(0) = 2. \]
Conversely, show that if $x_1$ is a solution of this second-order
equation, then the pair $x_1$ and $x_2 = dx_1/dt$ satisfy the
simultaneous equations.
If we rewrite the equations
\[ \frac{dx_1}{dt} = x_2 \ (= F_1(x_1,x_2,t), \text{ say}), \quad x_1(0) = 1 \]
and
\[ \frac{dx_2}{dt} = e^t - x_1 - 2t x_2 \ (= F_2(x_1,x_2,t), \text{ say}), \quad x_2(0) = 2 \]
in condensed form as
\[ \frac{d\mathbf{x}}{dt} = \mathbf{F}(\mathbf{x},t), \quad \mathbf{x}(0) = (1,2), \]
then there is a close superficial resemblance to the differential
equations considered in Theorem 4.5. Imitating the proof of that
result, perhaps we should set up a metric space $(X,d)$ of pairs of
continuous functions and then define a relevant $f: X \to X$ whose fixed
points are the required solutions. We could then try to show that
some iterate of $f$ is a contraction.
Exercise 86 Let $X$ be the set of pairs $\mathbf{x} = (x_1,x_2)$ of continuous
real functions defined on $[0,1]$. Define $d$ by
\[ d(\mathbf{x},\mathbf{y}) = d((x_1,x_2),(y_1,y_2))
   = \max_{0 \le t \le 1}|x_1(t) - y_1(t)| + \max_{0 \le t \le 1}|x_2(t) - y_2(t)| \]
for $\mathbf{x},\mathbf{y} \in X$. Show that $(X,d)$ is a metric space and is
complete.
Exercise 87 Let $X$ be as in the previous exercise and define
$f: X \to X$ by $f(\mathbf{x}) = f(x_1,x_2) = (y_1,y_2)$ where
\[ y_1(t) = 1 + \int_0^t x_2(u)\,du \]
and
\[ y_2(t) = 2 + \int_0^t \bigl(e^u - x_1(u) - 2u\,x_2(u)\bigr)\,du. \]
Show that, given any fixed point $(x_1,x_2)$ of $f$, the function $x_1$
is a solution of the differential equation
\[ \frac{d^2x_1}{dt^2} + 2t\frac{dx_1}{dt} + x_1 = e^t, \quad x_1(0) = 1,\ \frac{dx_1}{dt}(0) = 2. \]
Conversely, show that if $x_1$ is a solution of this differential
equation then $\mathbf{x} = (x_1, dx_1/dt)$ is a fixed point of $f$.
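The map f of Exercise 87 can be iterated numerically to watch the iterates settle down. The following is only a sketch under stated assumptions: functions on [0, 1] are represented by their values on a uniform grid, the integrals are approximated by the trapezoidal rule, and the grid size, starting pair and iteration count are arbitrary choices made here.

```python
import math

N = 200                                  # number of grid intervals on [0, 1]
H = 1.0 / N
T = [i * H for i in range(N + 1)]        # grid points t_0 = 0, ..., t_N = 1

def cumulative_integral(values):
    """Trapezoidal approximation to the integral from 0 to each grid point."""
    out = [0.0]
    for i in range(1, len(values)):
        out.append(out[-1] + 0.5 * H * (values[i - 1] + values[i]))
    return out

def f(x):
    """One application of the map f of Exercise 87 to a pair x = (x1, x2)."""
    x1, x2 = x
    integrand2 = [math.exp(u) - x1[i] - 2.0 * u * x2[i] for i, u in enumerate(T)]
    y1 = [1.0 + v for v in cumulative_integral(x2)]
    y2 = [2.0 + v for v in cumulative_integral(integrand2)]
    return (y1, y2)

def d(x, y):
    """The metric of Exercise 86, with each maximum taken over the grid."""
    return sum(max(abs(p - q) for p, q in zip(xi, yi)) for xi, yi in zip(x, y))

x = ([1.0] * (N + 1), [2.0] * (N + 1))   # constant starting pair
gaps = []                                # distances between successive iterates
for _ in range(30):
    y = f(x)
    gaps.append(d(x, y))
    x = y
```

The list `gaps` records d(x, f(x)) at each step; its rapid decay is the numerical counterpart of some iterate of f being a contraction, and the final `x[0]` approximates the solution x1 on the grid.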
In general, to consider the solutions of an nth-order linear differential
equation, we shall rewrite it as n simultaneous first-order differential
equations. We shall then show that these solutions coincide with the
fixed points of a function on the metric space of n-tuples of continuous
functions. The result is stated below, but only a very broad outline of
the proof is given. It generalises the idea, seen in introductory courses,
that a second-order linear differential equation with two initial
conditions has a unique solution.
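Theorem 4.7 Let $h, g_1, g_2, \ldots, g_n: [a,b] \to \mathbb{R}$ be continuous functions.
Then the differential equation
\[ \frac{d^n x_1}{dt^n} + g_1(t)\frac{d^{n-1}x_1}{dt^{n-1}} + g_2(t)\frac{d^{n-2}x_1}{dt^{n-2}} + \cdots + g_n(t)x_1(t) = h(t) \]
subject to initial conditions
\[ x_1(a) = \beta_1,\ \frac{dx_1}{dt}(a) = \beta_2,\ \ldots,\ \frac{d^{n-1}x_1}{dt^{n-1}}(a) = \beta_n \]
has a unique solution.
Outline proof The solutions of the required differential equation are
equivalent to the solutions of the simultaneous equations
\[ \frac{dx_1}{dt} = x_2 \ (= F_1(x_1,\ldots,x_n,t), \text{ say}), \quad x_1(a) = \beta_1, \]
\[ \frac{dx_2}{dt} = x_3 \ (= F_2(x_1,\ldots,x_n,t), \text{ say}), \quad x_2(a) = \beta_2, \]
\[ \vdots \]
\[ \frac{dx_n}{dt} = h(t) - g_1(t)x_n - g_2(t)x_{n-1} - \cdots - g_n(t)x_1 \ (= F_n(x_1,\ldots,x_n,t), \text{ say}), \quad x_n(a) = \beta_n. \]
We abbreviate these $n$ equations to
\[ \frac{d\mathbf{x}}{dt} = \mathbf{F}(\mathbf{x},t), \quad \mathbf{x}(a) = \boldsymbol{\beta} \]
where $\mathbf{x} = (x_1,\ldots,x_n)$, $\mathbf{F}$ stands for the $n$ functions $(F_1,\ldots,F_n)$, and
$\boldsymbol{\beta} = (\beta_1,\ldots,\beta_n)$.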
So now let $X$ be the set of n-tuples of continuous real functions
defined on $[a,b]$. Define $d$ by
\[ d(\mathbf{x},\mathbf{y}) = d((x_1,\ldots,x_n),(y_1,\ldots,y_n))
   = \max_{a \le t \le b}|x_1(t) - y_1(t)| + \cdots + \max_{a \le t \le b}|x_n(t) - y_n(t)|. \]
Then $(X,d)$ is a complete metric space.
Now define $f: X \to X$ by $f(\mathbf{x}) = f(x_1,\ldots,x_n) = (y_1,\ldots,y_n) = \mathbf{y}$,
where
\[ y_i(t) = \beta_i + \int_a^t F_i(x_1(u),\ldots,x_n(u),u)\,du. \]
Then some iterate of $f$ is a contraction. When trying to prove the
corresponding result for a single equation in the proof of Theorem 4.5
we needed the fact that
\[ |F(x,t) - F(y,t)| \le L|x - y|. \]
The fact that some iterate of our new $f$ is a contraction will follow
similarly from the fact that we can find a fixed number $L$ with
\[ \text{distance from } \mathbf{F}(\mathbf{x},t) \text{ to } \mathbf{F}(\mathbf{y},t) \text{ (in } \mathbb{R}^n)
   \ \le\ L \cdot (\text{distance from } \mathbf{x} \text{ to } \mathbf{y} \text{ (in } \mathbb{R}^n)). \]
For the distance from $\mathbf{F}(\mathbf{x},t)$ to $\mathbf{F}(\mathbf{y},t)$ is
\[ \{[F_1(\mathbf{x},t) - F_1(\mathbf{y},t)]^2 + [F_2(\mathbf{x},t) - F_2(\mathbf{y},t)]^2 + \cdots + [F_n(\mathbf{x},t) - F_n(\mathbf{y},t)]^2\}^{1/2} \]
\[ = \{(x_2 - y_2)^2 + (x_3 - y_3)^2 + \cdots + [g_1(t)(x_n - y_n) + \cdots + g_n(t)(x_1 - y_1)]^2\}^{1/2} \]
\[ \le L\{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_n - y_n)^2\}^{1/2} \]
where $L = (2n-1)^{1/2}\max_{a \le t \le b}\{|g_1(t)|, |g_2(t)|, \ldots, |g_n(t)|, 1\}$. We leave the
interested reader to check this fact.
Hence the function F does satisfy a Lipschitzian condition and it
can be shown, in a fashion similar to that in the proof of Theorem 4.5,
that some iterate of / is a contraction. Hence / has a unique fixed
point and the original differential equation has a unique solution. •
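The Lipschitz constant quoted in the outline proof can be spot-checked numerically for the second-order example of Exercise 85, where n = 2, g_1(t) = 2t, g_2(t) = 1 and h(t) = e^t on [0, 1]. This is a sketch only: the sample points and grid are arbitrary choices made here, and sampling does not replace the check left to the reader.

```python
import math

def F(x, t):
    """Right-hand sides (F_1, F_2) for the system of Exercise 85."""
    x1, x2 = x
    return (x2, math.exp(t) - x1 - 2.0 * t * x2)

def euclid(p, q):
    """Usual distance between two points of R^n."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(p, q)))

n = 2
ts = [j / 20 for j in range(21)]                     # grid on [0, 1]
g_max = max(max(abs(2.0 * t), 1.0) for t in ts)      # max of |g_1|, |g_2| and 1
L = math.sqrt(2 * n - 1) * g_max                     # constant from the outline proof

# A small lattice of sample points in R^2.
points = [(-3.0 + 1.5 * i, -3.0 + 1.5 * j) for i in range(5) for j in range(5)]

# Largest sampled ratio of output distance to input distance.
worst_ratio = max(
    euclid(F(x, t), F(y, t)) / euclid(x, y)
    for t in ts for x in points for y in points if x != y
)
```

On these samples `worst_ratio` stays below `L`, as the inequality in the outline proof predicts.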
4.6 The implicit function theorem
Our final application of the contraction mapping principle is
to a classical theorem from analysis. We introduce this theorem by
means of an example.
Exercise 88 Show that the equation
\[ \frac{\pi}{4} + x + \tan^{-1} x = 0 \]
has a unique solution. (This can be done in a naive way
considering the signs of the left-hand side and also showing
that the expression is increasing, or by rearranging the
equation into the form $x = f(x)$ and using our earlier theory.)
Show similarly that for any fixed $t > -1$ there is a unique
real number $x$ with
\[ \tan^{-1} t + x + \tan^{-1}(xt) = 0. \]
The result of Exercise 88 is that the equation
\[ \tan^{-1} t + x + \tan^{-1}(xt) = 0 \]
uniquely defines a function x in terms of t for all t > — 1. We say that x
is defined implicitly in terms of t (rather than explicitly when x is given
in the form x=f(t)). But in Exercise 88 we only showed that x was
defined for each individual t and we learnt nothing of the overall
behaviour of x: in fact it turns out to be a continuous function. We
shall prove this using the following result.
Theorem 4.8 (The implicit function theorem.) Let $F: \mathbb{R} \times [a,b] \to \mathbb{R}$;
i.e. $F$ is a real function of two variables such that $F(x,t)$ is defined for
all $x \in \mathbb{R}$ and $t \in [a,b]$. Assume that $F$ is continuous and that there
exist numbers $m$ and $M$ with
\[ 0 < m \le \frac{\partial F}{\partial x} \le M \]
for all $x \in \mathbb{R}$ and $t \in [a,b]$. Then there exists a unique continuous
function $x: [a,b] \to \mathbb{R}$ such that $F(x(t),t) = 0$ for all $t \in [a,b]$; i.e. the
equation $F(x,t) = 0$ does implicitly define a unique continuous
function $x$ in terms of $t$.
Proof Let $f: C(a,b) \to C(a,b)$ take the function $x \in C(a,b)$ to the
function $f(x)$ given by
\[ (f(x))(t) = x(t) - \frac{F(x(t),t)}{M} \quad (a \le t \le b). \]
We claim that $f$ is a contraction of $C(a,b)$. For if $x, y \in C(a,b)$ and
$t \in [a,b]$ then, as we saw in the proof of Theorem 4.6, the mean value
theorem shows that
\[ F(x,t) - F(y,t) = \frac{\partial F}{\partial x}(x - y), \]
where $\partial F/\partial x$ is evaluated at $(c,t)$ for some $c$ between $x$ and $y$.
Hence
\[ (f(x))(t) - (f(y))(t) = x(t) - \frac{F(x(t),t)}{M} - y(t) + \frac{F(y(t),t)}{M} \]
\[ = (x(t) - y(t)) - \frac{1}{M}\bigl(F(x(t),t) - F(y(t),t)\bigr) \]
\[ = (x(t) - y(t)) - \frac{1}{M}\frac{\partial F}{\partial x}(x(t) - y(t)) \]
\[ = \left(1 - \frac{1}{M}\frac{\partial F}{\partial x}\right)(x(t) - y(t)), \]
where $\partial F/\partial x$ is evaluated at $(c,t)$ for some $c$ between $x(t)$ and $y(t)$.
Since $m \le \partial F/\partial x \le M$ it follows that
\[ |(f(x))(t) - (f(y))(t)| \le \left(1 - \frac{m}{M}\right)|x(t) - y(t)|. \]
Therefore, with the usual metric on $C(a,b)$,
\[ d(f(x),f(y)) = \max_{a \le t \le b}|(f(x))(t) - (f(y))(t)|
   \le \max_{a \le t \le b}\left(1 - \frac{m}{M}\right)|x(t) - y(t)|
   = \left(1 - \frac{m}{M}\right)d(x,y). \]
Since $1 - m/M < 1$ it follows that $f$ is a contraction of $C(a,b)$ and
therefore has a unique fixed point, $x$, say. Thus for all $t \in [a,b]$
\[ x(t) = (f(x))(t) = x(t) - \frac{F(x(t),t)}{M} \]
and
\[ F(x(t),t) = 0. \]
It follows that $x$ is a continuous solution of $F(x,t) = 0$. Its uniqueness
follows from the fact that any such solution must be a fixed point
of $f$. •
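The contraction used in this proof can be run numerically. The sketch below uses a hypothetical example chosen here, F(x, t) = 2x + sin x - t on [-2, 2], for which the partial derivative 2 + cos x lies between m = 1 and M = 3; functions are represented by their values on a grid, which is an approximation to working in C(a, b).

```python
import math

def F(x, t):
    """Hypothetical example: dF/dx = 2 + cos(x) lies between m = 1 and M = 3."""
    return 2.0 * x + math.sin(x) - t

m, M = 1.0, 3.0
a, b = -2.0, 2.0
grid = [a + (b - a) * i / 100 for i in range(101)]

def f(x_values):
    """The map (f(x))(t) = x(t) - F(x(t), t)/M, applied at each grid point."""
    return [x - F(x, t) / M for x, t in zip(x_values, grid)]

def d(x_values, y_values):
    """The usual (max) metric on C(a, b), restricted to the grid."""
    return max(abs(p - q) for p, q in zip(x_values, y_values))

x = [0.0] * len(grid)            # start from the zero function
for _ in range(200):
    x = f(x)                     # each step shrinks distances by at most 1 - m/M

# How far the final iterate is from solving F(x(t), t) = 0 on the grid.
residual = max(abs(F(xv, t)) for xv, t in zip(x, grid))
```

After the loop, `x` holds approximate values of the implicitly defined function at the grid points, and `residual` measures how nearly F(x(t), t) vanishes there.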
Exercise 89 Show that the equation
4x — 3 cos x = 1 — It cos t
uniquely defines a continuous function x in terms of t, for t in
any interval $[a,b]$. Why does this mean that a continuous $x$ is
uniquely defined for $t \in \mathbb{R}$?
Exercise 90 Show that the equation
\[ \tan^{-1} t + x + \tan^{-1}(xt) = 0 \]
uniquely defines a continuous function for $t \in [a,b]$ where
$-1 < a < 0 < b$. (It might help to note that for such an a, b
and t
Why does this mean that a continuous x is uniquely defined
for $t \in\, ]-1,\infty[$?
4.7 Conclusion
Those readers who are not interested in pure mathematical
abstraction will probably not wish to read the final chapter which
generalises real analysis and at the same time shows us the underlying
reasons why it works. It is for such readers that the current approach
has evolved, with its main theme of iteration by contraction. Only the
concepts necessary for that approach were introduced and they were
motivated and explained with that approach in mind. But it would be
wrong for the reader to think that this was the only use of metric
spaces. This subject, as part of the area of mathematics known as
'functional analysis', has a large variety of uses in both pure and
applied mathematics. It just happens that contractions are capable of
being understood without a great deal of pure mathematical background.
However, I hope that most readers will feel that, having worked
hard to understand the abstract ideas of metrics, closed sets,
completeness and compactness, it would be a pity to stop without
seeing their relevance to analysis. So we now include a final chapter
with the subject looked at in a more traditional way.
5
What makes analysis work ?
5.1 Continuity
The central theme of analysis is that of a continuous function.
Whenever possible we have defined our concepts in terms of
sequences and, for the moment, we shall use sequences to define
continuity. We saw in the first section of Chapter 3 that a function
$f: \mathbb{R}^2 \to \mathbb{R}$ is continuous if whenever
\[ (x_1,y_1), (x_2,y_2), (x_3,y_3), \ldots \to (x_0,y_0) \text{ in } \mathbb{R}^2 \]
it follows that
\[ f(x_1,y_1), f(x_2,y_2), f(x_3,y_3), \ldots \to f(x_0,y_0) \text{ in } \mathbb{R}. \]
That idea extends easily to a function $f: X \to X'$ where $(X,d)$ and
$(X',d')$ are both metric spaces.
Definition Let $(X,d)$ and $(X',d')$ be metric spaces and let $f: X \to X'$.
Then $f$ is continuous if whenever
\[ x_1, x_2, x_3, \ldots \to x_0 \text{ in } (X,d) \]
it follows that
\[ f(x_1), f(x_2), f(x_3), \ldots \to f(x_0) \text{ in } (X',d'). \]
Exercise 91 Let $(X,d)$ and $(X',d')$ be metric spaces and let
$f: X \to X'$ have the property that for some fixed number $L$
\[ d'(f(x),f(y)) \le L\,d(x,y) \]
for all $x, y \in X$. Show that $f$ is continuous.
Exercise 92 Let $f: C(0,1) \to \mathbb{R}$ be given by $f(x) = x(0)$; i.e. for
each member $x$ of $C(0,1)$, $f(x)$ is the value of that function at
0. Show that $f$ is continuous (with respect to the usual
metrics on $C(0,1)$ and $\mathbb{R}$).
Exercise 93 Let $(X,d)$ be a metric space and let $x_0 \in X$. Define
$f: X \to \mathbb{R}$ by $f(x) = d(x_0,x)$. Show that, for each $x, y \in X$,
\[ |f(x) - f(y)| \le d(x,y) \]
and deduce from Exercise 91 that $f$ is continuous.
We saw in Theorem 3.1 that if $f: \mathbb{R}^2 \to \mathbb{R}$ is continuous then
$f^{-1}([a,b])$ is closed. This is a very special case of the following result.
Theorem 5.1 Let $(X,d)$ and $(X',d')$ be metric spaces and let $f: X \to X'$
be continuous. Then if $A$ is closed in $(X',d')$ it follows that $f^{-1}(A)$ is
closed in $(X,d)$.
Proof We must consider a convergent sequence in $f^{-1}(A)$ and show
that its limit is also in $f^{-1}(A)$. So let
\[ x_1, x_2, x_3, \ldots \to x_0, \quad \text{with each } x_n \in f^{-1}(A). \]
Then, by the continuity of $f$,
\[ f(x_1), f(x_2), f(x_3), \ldots \to f(x_0). \]
But $A$ is closed and hence $f(x_0) \in A$ and $x_0 \in f^{-1}(A)$ as required. •
Exercise 94 Use Theorem 5.1 and Exercise 93 to show that if
$(X,d)$ is a metric space with $x_0 \in X$ and if $r$ is a real number,
then the sets
\[ \{x \in X: d(x,x_0) \le r\} \quad\text{and}\quad \{x \in X: d(x,x_0) \ge r\} \]
are closed in $(X,d)$.
It turns out that the converse of Theorem 5.1 is also true, namely that
if $f: X \to X'$ is such that $f^{-1}(A)$ is closed in $(X,d)$ whenever $A$ is
closed in $(X',d')$, then $f$ is continuous. It is therefore possible to
characterise continuity in terms of closed sets.
Theorem 5.2 Let $(X,d)$ and $(X',d')$ be metric spaces. Then $f: X \to X'$
is continuous if and only if $f^{-1}(A)$ is closed in $(X,d)$ whenever $A$ is
closed in $(X',d')$.
Proof The fact that continuous functions have the stated property
was established in Theorem 5.1. To prove the converse, assume that
$f^{-1}(A)$ is closed in $(X,d)$ whenever $A$ is closed in $(X',d')$. To show that
$f$ is continuous we take a convergent sequence
\[ x_1, x_2, x_3, \ldots \to x_0 \text{ in } (X,d) \]
and assume that
\[ f(x_1), f(x_2), f(x_3), \ldots \not\to f(x_0) \text{ in } (X',d'). \]
Then some subsequence of $f(x_1), f(x_2), f(x_3), \ldots$ fails to get close to
$f(x_0)$; i.e. there exists an $r > 0$ and a subsequence $f(x_{k_1}), f(x_{k_2}), f(x_{k_3}),
\ldots$ all of whose terms are distance $r$ or more from $f(x_0)$. Hence
\[ f(x_{k_1}), f(x_{k_2}), f(x_{k_3}), \ldots \in \{x \in X': d'(f(x_0),x) \ge r\} = A, \text{ say}, \]
which by Exercise 94 is a closed set in $(X',d')$. Hence, by the assumed
condition on $f$, $f^{-1}(A)$ is closed in $(X,d)$. But
\[ x_{k_1}, x_{k_2}, x_{k_3}, \ldots \to x_0, \quad \text{with each } x_{k_i} \in f^{-1}(A), \text{ which is closed}, \]
and so $x_0 \in f^{-1}(A)$ and $f(x_0) \in A$. That means that $d'(f(x_0),f(x_0)) \ge
r > 0$, which is a contradiction. Thus $f(x_1), f(x_2), f(x_3), \ldots \to f(x_0)$
and $f$ is continuous as claimed. •
Our definition and characterisation of continuity fits in with our
restricted use of the concept in earlier chapters. But in a traditional
approach to metric spaces the starting point is the ε-δ definition of
continuity of a real function:
'$f: \mathbb{R} \to \mathbb{R}$ is continuous if given $x \in \mathbb{R}$ and $\varepsilon > 0$ there exists
$\delta > 0$ such that
\[ |f(x) - f(y)| < \varepsilon \text{ whenever } |x - y| < \delta; \]
i.e.
\[ f(y) \in\, ]f(x) - \varepsilon, f(x) + \varepsilon[\, = I \text{ whenever } y \in\, ]x - \delta, x + \delta[\, = J.' \]
So traditionally a function is continuous if whenever $I$ is an open
interval containing $f(x)$ there exists an open interval $J$ containing $x$
with $f(J) \subseteq I$. In order to generalise this approach to functions from
one metric space to another we need first to generalise the idea of an
open interval. We could actually manage without this concept even in
this last chapter, but in order that the reader can understand other
texts we must include this generalisation here.
Definition Given a metric space $(X,d)$, a member $x_0 \in X$ and a number
$r > 0$, the open ball $B_X(x_0,r)$ is the set
\[ \{x \in X: d(x_0,x) < r\}. \]
So, for example, the open ball $B_{\mathbb{R}}(x_0,r)$ (assuming the usual metric on
$\mathbb{R}$) is the set
\[ \{x \in \mathbb{R}: |x_0 - x| < r\} = \,]x_0 - r, x_0 + r[. \]
Exercise 95 Illustrate the open ball centre (0,0) and radius $r$
in $\mathbb{R}^2$ in the following cases:
(a) with the usual metric,
(b) with the 'max metric' defined in Exercise 13, and
(c) with the 'lift metric' defined in Example 6 of Chapter 2.
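How the shape of an open ball depends on the metric can be explored with a small membership test. This is a minimal sketch: the 'max metric' is assumed here to be max(|x1 - x2|, |y1 - y2|), the 'lift metric' is omitted because its definition is not restated in this section, and the function names are made up for illustration.

```python
import math

def usual_metric(p, q):
    """Usual (Euclidean) distance in R^2."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def max_metric(p, q):
    """Assumed form of the 'max metric': the larger coordinate difference."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def in_open_ball(metric, centre, r, point):
    """True when `point` lies in the open ball of radius r about `centre`."""
    return metric(centre, point) < r

origin = (0.0, 0.0)
corner = (0.8, 0.8)     # inside the unit 'max' ball but outside the usual one
edge = (1.0, 0.0)       # at distance exactly 1 in both metrics
```

For example, `in_open_ball(max_metric, origin, 1.0, corner)` is true while `in_open_ball(usual_metric, origin, 1.0, corner)` is false, and `edge` belongs to neither open ball because the inequality in the definition is strict.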
The generalisation of our ε-δ definition of continuity to functions
$f: X \to X'$ would now read
'$f: X \to X'$ is continuous if given $x \in X$ and $\varepsilon > 0$ there exists
$\delta > 0$ such that
\[ f(y) \in B_{X'}(f(x),\varepsilon) \text{ whenever } y \in B_X(x,\delta).' \]
But it is actually possible to use the idea of an open ball to characterise
continuity without the drawback of the technical complications of εs
and δs.
Definition In a metric space $(X,d)$ the set $A$ is open if given $a \in A$ there
exists $r > 0$ with $B_X(a,r) \subseteq A$.
In other words, $A$ is open if each $a \in A$ is the centre of some open ball
contained entirely in $A$. For example, in $\mathbb{R}$ the set $A = \,]0,\infty[$ is open
because if $a \in\, ]0,\infty[$, then the open ball $B_{\mathbb{R}}(a,a)$ ($= \,]0,2a[$) is
contained in $A$. An open set in $\mathbb{R}^2$ (usual metric) is illustrated in Figure
5.1.
Fig. 5.1
Exercise 96 Show that the set $\{(x,y): x > 0 \text{ and } y > 0\}$ is open
in $\mathbb{R}^2$ (usual metric).
Exercise 97 Let $A$ be the open ball $B_X(x,r)$ in the metric
space $(X,d)$. Use the triangle inequality to show that if $a \in A$
then $r - d(x,a) > 0$ and $B_X(a, r - d(x,a)) \subseteq A$. Deduce that
open balls are open.
We can now characterise continuity in terms of open sets. This has the
advantage of avoiding εs and δs. (Indeed, it avoids any mention of
distance and it prepares the way for the generalisation - not covered
in this text - to 'topology' where the basic concept is not distance but
the idea of an open set.) The characterisation says that $f: X \to X'$ is
continuous if and only if $f^{-1}(A)$ is open in $(X,d)$ whenever $A$ is open
in $(X',d')$, and we (or you!) will prove this in Exercise 100.
Some readers will surely have already asked themselves how we
have managed without the concept of an open set if, as we claim, it is a
central idea in analysis. Other readers will have noticed the strong
connection between the above statement concerning continuity in
terms of open sets and the statement of Theorem 5.2 concerning
continuity in terms of closed sets. The 'missing link' will be established
in Theorem 5.3, but before stating that result we warn the reader not
to think of open as 'not closed'.
Exercise 98 Let $A = [0,1[$ in $\mathbb{R}$ (usual metric). Find a
convergent sequence in $A$ whose limit is not in $A$, and show
also that $B_{\mathbb{R}}(0,r) \not\subseteq A$ for any $r > 0$. Deduce that $A$ is neither
open nor closed.
Exercise 99 Let $(X,d)$ be a metric space. Show that $X$ and $\emptyset$
are both open and closed in $(X,d)$.
Theorem 5.3 Let $(X,d)$ be a metric space. Then $A \subseteq X$ is open if and
only if its complement $X \setminus A$ is closed (Figure 5.2).
Fig. 5.2 $A$ open (in $\mathbb{R}^2$); $\mathbb{R}^2 \setminus A$ closed
Proof Assume first that $A$ is open and let $x_1, x_2, x_3, \ldots$ be a sequence
of points of $X \setminus A$ with $x_1, x_2, x_3, \ldots \to x_0$. To show that $X \setminus A$ is
closed we must show that $x_0 \in X \setminus A$. Let us assume that $x_0 \notin X \setminus A$
and derive a contradiction. If $x_0 \notin X \setminus A$, then $x_0$ is in the open set $A$
and so $B_X(x_0,r) \subseteq A$ for some $r > 0$. Hence
\[ x_1, x_2, x_3, \ldots \to x_0, \quad \text{with each } x_n \in X \setminus A \subseteq X \setminus B_X(x_0,r)
   = \{x \in X: d(x_0,x) \ge r\} \text{ (closed)}, \]
and it follows that $x_0 \in \{x \in X: d(x_0,x) \ge r\}$; i.e. $d(x_0,x_0) \ge r > 0$,
which is the required contradiction. Hence $X \setminus A$ is closed.
Conversely assume that $A$ is not open. Then there exists an $a \in A$
with
\[ B_X(a,1) \not\subseteq A,\ B_X(a,\tfrac{1}{2}) \not\subseteq A,\ B_X(a,\tfrac{1}{3}) \not\subseteq A, \ldots; \]
i.e. there exist an $x_1 \in X \setminus A$ with $d(a,x_1) < 1$, an $x_2 \in X \setminus A$ with
$d(a,x_2) < \tfrac{1}{2}$, an $x_3 \in X \setminus A$ with $d(a,x_3) < \tfrac{1}{3}$, .... Clearly we have a
sequence $x_1, x_2, x_3, \ldots$ in $X \setminus A$ convergent to $a \notin X \setminus A$. Hence $X \setminus A$
is not closed, and the theorem follows. •
Whenever we have needed a result concerning open sets in the earlier
chapters we have therefore been able to quote the results in terms of
closed sets.
Exercise 100 Use Theorems 5.2 and 5.3 to show that $f: X \to
X'$ is continuous if and only if $f^{-1}(A)$ is open in $(X,d)$
whenever $A$ is open in $(X',d')$.
Exercise 101 Use Theorems 3.2, 3.3 and 5.3 to show that
(a) any union of open sets in a metric space is itself open;
(b) if $A_1, A_2, \ldots, A_k$ are open sets in a metric space then
$A_1 \cap A_2 \cap \ldots \cap A_k$ is open.
Give an example to show that an intersection of open sets
need not be open.
We have shown the link between our approach (via sequences) and
the traditional approach (via open sets). We are now able to look at
the classical theorems concerning continuous functions and to
generalise them to arbitrary metric spaces. At the same time we hope
that this abstraction enables us to see in the underlying structure why
the theorems work.
5.2 Attained bounds
Here is a result from introductory courses on real analysis:
the proof is chosen as typical of the way in which the results are
proved at that level.
Theorem 5.4 Let $f: [a,b] \to \mathbb{R}$ be continuous. Then $f$ is bounded and
attains its bounds; i.e. there exist numbers $m$, $M$ with $m \le f(x) \le M$ for
all $x \in [a,b]$ and numbers $\xi, \eta \in [a,b]$ with $m = f(\xi)$ and $M = f(\eta)$
(Figure 5.3).
Fig. 5.3 (graph of $y = f(x)$)
Proof We will show that the set $f([a,b]) = \{f(x): x \in [a,b]\}$ is
bounded above and that there exists $\eta \in [a,b]$ with
\[ f(\eta) = M = \sup f([a,b]); \]
the result for lower bounds follows similarly.
Assume then that $f([a,b])$ is not bounded above. Then there exists
$x_1 \in [a,b]$ with $f(x_1) > 1$, $x_2 \in [a,b]$ with $f(x_2) > 2$, $x_3 \in [a,b]$ with
$f(x_3) > 3$, ....
Now $x_1, x_2, x_3, \ldots$ is a sequence in $[a,b]$ and so, by the compactness
of $[a,b]$, it has a convergent subsequence
\[ x_{k_1}, x_{k_2}, x_{k_3}, \ldots \to x_0 \]
in $[a,b]$. Therefore, by the continuity of $f$,
\[ f(x_{k_1}), f(x_{k_2}), f(x_{k_3}), \ldots \to f(x_0), \]
whereas, since $f(x_{k_1}) > k_1$, $f(x_{k_2}) > k_2$, $f(x_{k_3}) > k_3$, ..., it is clear that
the sequence $f(x_{k_1}), f(x_{k_2}), f(x_{k_3}), \ldots$ is tending
to infinity. This contradiction shows that $f([a,b])$ is bounded above.
Let $M = \sup f([a,b])$. Then there is a sequence in $f([a,b])$ tending
to $M$,
\[ f(y_1), f(y_2), f(y_3), \ldots \to M. \]
Now $y_1, y_2, y_3, \ldots$ is a sequence in $[a,b]$ and so, by the compactness
of $[a,b]$, it has a convergent subsequence
\[ y_{k_1}, y_{k_2}, y_{k_3}, \ldots \to \eta \]
in $[a,b]$. Therefore, by the continuity of $f$,
\[ f(y_{k_1}), f(y_{k_2}), f(y_{k_3}), \ldots \to f(\eta). \]
But $f(y_{k_1}), f(y_{k_2}), f(y_{k_3}), \ldots \to M$ and so, by the uniqueness of limits,
$M = f(\eta)$ as required. •
It seems that the only properties of $[a,b]$ and $f$ used in that proof
were the compactness of $[a,b]$ and the continuity of $f$. So the obvious
generalisation is as stated in the next exercise.
Exercise 102 Let $(X,d)$ be a compact metric space and let
$f: X \to \mathbb{R}$ be continuous. Imitate the proof of Theorem 5.4 to
show that there exist $\xi, \eta \in X$ with $f(\xi) \le f(x) \le f(\eta)$ for all
$x \in X$.
That generalisation was no harder to prove than the more restricted
version. The pleasant bonus to finish this section is that by
generalising the result further the proof becomes easier.
Theorem 5.5 Let $(X,d)$ and $(X',d')$ be metric spaces with $(X,d)$
compact. Let $f: X \to X'$ be continuous. Then $f(X) = \{f(x): x \in X\}$ is
compact (and hence closed and bounded) in $(X',d')$.
Proof We must take any sequence in $f(X)$ and find a convergent
subsequence with its limit in $f(X)$. So let $f(x_1), f(x_2), f(x_3), \ldots$ be a
sequence in $f(X)$. Then the sequence $x_1, x_2, x_3, \ldots$ is in the compact
set $X$ and so it has a convergent subsequence
\[ x_{k_1}, x_{k_2}, x_{k_3}, \ldots \to x_0 \]
in $X$. Hence, by the continuity of $f$,
\[ f(x_{k_1}), f(x_{k_2}), f(x_{k_3}), \ldots \to f(x_0) \]
in $f(X)$. We have found the required convergent subsequence and
thus established that $f(X)$ is compact. •
5.3 Uniform continuity
Returning again to the ε-δ definition of continuity, a
function $f: [a,b] \to \mathbb{R}$ is continuous if for each $x \in [a,b]$ and each
$\varepsilon > 0$ there exists $\delta > 0$ (which may depend upon the choice of $x$ and $\varepsilon$)
such that (for $y \in [a,b]$)
\[ f(y) \in\, ]f(x) - \varepsilon, f(x) + \varepsilon[\, \text{ whenever } y \in\, ]x - \delta, x + \delta[ \]
or
\[ |f(x) - f(y)| < \varepsilon \text{ whenever } |x - y| < \delta. \]
Sometimes it is possible to choose a $\delta$ which does not depend on $x$;
i.e. given $\varepsilon > 0$ there exists $\delta > 0$ such that (for all $x, y \in [a,b]$)
\[ |f(x) - f(y)| < \varepsilon \text{ whenever } |x - y| < \delta. \]
Such a function is said to be uniformly continuous.
A theorem from elementary analysis courses is that if $f: [a,b] \to \mathbb{R}$
is continuous then it is uniformly continuous. Analysts who approach
metric spaces with a view to generalising that result soon see that
again it is the compactness of $[a,b]$ which makes it work. But they
start considering sets like $]x - \delta, x + \delta[$ covering $[a,b]$ and that leads
them to consider compactness in the following way: 'a set A is
compact in a metric space if and only if whenever A is contained in a
union of open sets it follows that A is contained in the union of a finite
number of them.' This characterisation (which turns out to be
equivalent to our definition) has the advantage that it carries over to
the further generalisation of 'topology' where all concepts have to be
defined in terms of open sets. However, our form of compactness is
still sufficient to enable us to understand why the result on uniform
continuity works.
Theorem 5.6 Let $f: [a,b] \to \mathbb{R}$ be continuous. Then it is uniformly
continuous.
Proof Assume that $f$ is continuous but not uniformly continuous: we
shall deduce a contradiction. Since $f$ is not uniformly continuous
there must exist some $\varepsilon > 0$ (which is fixed from this point onwards) so
that no $\delta > 0$ has the property that (for $x, y \in [a,b]$)
\[ |f(x) - f(y)| < \varepsilon \text{ whenever } |x - y| < \delta. \]
So
$\delta = 1$ fails; i.e. there exist $x_1, y_1 \in [a,b]$ with
$|x_1 - y_1| < 1$ but $|f(x_1) - f(y_1)| \ge \varepsilon$.
$\delta = \tfrac{1}{2}$ fails; i.e. there exist $x_2, y_2 \in [a,b]$ with
$|x_2 - y_2| < \tfrac{1}{2}$ but $|f(x_2) - f(y_2)| \ge \varepsilon$.
$\delta = \tfrac{1}{3}$ fails; i.e. there exist $x_3, y_3 \in [a,b]$ with
$|x_3 - y_3| < \tfrac{1}{3}$ but $|f(x_3) - f(y_3)| \ge \varepsilon$.
The sequence $x_1, x_2, x_3, \ldots$ in $[a,b]$ must, by the compactness of $[a,b]$,
have a convergent subsequence
\[ x_{k_1}, x_{k_2}, x_{k_3}, \ldots \to x_0 \]
in $[a,b]$. Since
\[ |x_{k_n} - y_{k_n}| \to 0 \]
it follows by the triangle inequality that
\[ y_{k_1}, y_{k_2}, y_{k_3}, \ldots \to x_0 \]
also. Hence, by the continuity of $f$,
\[ f(x_{k_1}), f(x_{k_2}), f(x_{k_3}), \ldots \to f(x_0) \]
and
\[ f(y_{k_1}), f(y_{k_2}), f(y_{k_3}), \ldots \to f(x_0). \]
Therefore $|f(x_{k_n}) - f(y_{k_n})| \to 0$, which contradicts the fact that
$|f(x_n) - f(y_n)| \ge \varepsilon > 0$ for all $n$. Hence the continuity of $f$ does imply its
uniform continuity. •
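The role of compactness can be seen by running the construction of this proof on a set that is not compact. The sketch below uses an example chosen here, f(x) = 1/x on ]0, 1], with ε = 1: for each n it produces a pair of points closer together than 1/n whose images differ by 1, so no single δ can work. The function name `witness_pair` is hypothetical.

```python
def f(x):
    """Continuous on ]0, 1] but not uniformly continuous there."""
    return 1.0 / x

def witness_pair(n):
    """Points x_n, y_n in ]0, 1] with |x_n - y_n| < 1/n and |f(x_n) - f(y_n)| = 1."""
    return 1.0 / (n + 1), 1.0 / (n + 2)

# Collect (|x_n - y_n|, |f(x_n) - f(y_n)|) for the first few n.
pairs = []
for n in range(1, 51):
    x_n, y_n = witness_pair(n)
    pairs.append((abs(x_n - y_n), abs(f(x_n) - f(y_n))))
```

Here the sequence x_n tends to 0, which lies outside ]0, 1], so no subsequence converges within the set and the contradiction of the proof never arises.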
Definition Let $(X,d)$ and $(X',d')$ be metric spaces. Then $f: X \to X'$ is
uniformly continuous if given $\varepsilon > 0$ there exists $\delta > 0$ such that, for all
$x, y \in X$,
\[ d'(f(x),f(y)) < \varepsilon \text{ whenever } d(x,y) < \delta. \]
Exercise 103 Let $(X,d)$ and $(X',d')$ be metric spaces with
$(X,d)$ compact. Imitate the proof of Theorem 5.6 to show
that if $f: X \to X'$ is continuous then it is uniformly
continuous.
5.4 Inverse functions
Here is another standard result and proof from courses in
real analysis.
Theorem 5.7 Let $A \subseteq \mathbb{R}$ and let $f: [a,b] \to A$ be continuous and
bijective (i.e. for each $y \in A$ there is precisely one $x \in [a,b]$ with
$f(x) = y$). Then $f$ has a well-defined inverse $f^{-1}: A \to [a,b]$ and it,
too, is continuous (Figure 5.4).
Fig. 5.4
Proof The inverse function is certainly well defined because for each
$y \in A$ there is just one $x \in [a,b]$ with $f(x) = y$: so $f^{-1}(y)$ is precisely
that $x$.
To show that $f^{-1}$ is continuous we let $a_1, a_2, a_3, \ldots \to a_0$ in $A$: we
need to deduce that $f^{-1}(a_1), f^{-1}(a_2), f^{-1}(a_3), \ldots \to f^{-1}(a_0)$. Assume
that this fails. Then some subsequence must fail to get close to
$f^{-1}(a_0)$; i.e. there exists $r > 0$ and a subsequence $f^{-1}(a_{k_1}), f^{-1}(a_{k_2}),
f^{-1}(a_{k_3}), \ldots$ whose members are all distance at least $r$ from $f^{-1}(a_0)$.
The sequence
\[ f^{-1}(a_{k_1}), f^{-1}(a_{k_2}), f^{-1}(a_{k_3}), \ldots \]
is in $[a,b]$ and so, by the compactness of $[a,b]$, it has a convergent
subsequence
\[ f^{-1}(a_{l_1}), f^{-1}(a_{l_2}), f^{-1}(a_{l_3}), \ldots \to \alpha \]
in $[a,b]$. Since the terms of this sequence
have
\[ |f^{-1}(a_{l_i}) - f^{-1}(a_0)| \ge r \]
it follows (since $\{x \in \mathbb{R}: |x - f^{-1}(a_0)| \ge r\}$ is closed) that
\[ |\alpha - f^{-1}(a_0)| \ge r > 0 \]
and that $\alpha \ne f^{-1}(a_0)$.
Now since
\[ f^{-1}(a_{l_1}), f^{-1}(a_{l_2}), f^{-1}(a_{l_3}), \ldots \to \alpha \]
it follows, by the continuity of $f$, that
\[ f(f^{-1}(a_{l_1})), f(f^{-1}(a_{l_2})), f(f^{-1}(a_{l_3})), \ldots \to f(\alpha); \]
i.e.
\[ a_{l_1}, a_{l_2}, a_{l_3}, \ldots \to f(\alpha). \]
But $a_1, a_2, a_3, \ldots \to a_0$ and so by the uniqueness of limits $f(\alpha) = a_0$ and
$\alpha = f^{-1}(a_0)$. This contradicts the earlier statement and shows that $f^{-1}$
is indeed continuous, as required. •
Exercise 104 Let $(X,d)$ and $(X',d')$ be metric spaces with
$(X,d)$ compact. Imitate the proof of Theorem 5.7 to show
that if $f: X \to X'$ is continuous and bijective, then the inverse
function $f^{-1}: X' \to X$ is also continuous.
The result quoted in Exercise 104 can, in fact, be proved in a rather
more sophisticated fashion.
Theorem 5.8 Let $(X,d)$ and $(X',d')$ be metric spaces with $(X,d)$
compact. Then if $f: X \to X'$ is continuous and bijective its inverse
$f^{-1}: X' \to X$ is also continuous.
Proof To show that $f^{-1}: X' \to X$ is continuous we use the
characterisation of continuity from Theorem 5.2 and show that if $B$ is closed
in $(X,d)$ then $(f^{-1})^{-1}(B)$ ($= f(B)$) is closed in $(X',d')$. But
$B$ closed in the compact $(X,d)$
$\Rightarrow B$ compact in $(X,d)$ (Exercise 43)
$\Rightarrow f(B)$ compact in $(X',d')$ (Theorem 5.5)
$\Rightarrow f(B)$ closed in $(X',d')$ (Theorem 3.11)
$\Rightarrow (f^{-1})^{-1}(B)$ closed in $(X',d')$.
Hence $f^{-1}$ is continuous, as claimed. •
5.5 Intermediate values
In Chapter 3 we considered the 'three Cs', closed, complete
and compact. There is a fourth equally important property of sets in a
metric space, that of 'connectedness'. The reason that we have left this
concept until now is that it played no role in our iterative approach.
Informally a set is 'connected' if it is in 'one piece'. So, for example,
we would expect the set $F$ (in $\mathbb{R}^2$) illustrated in Figure 5.5 to be
connected, but the sets $D$ and $E$ to be disconnected.
To show that $D$ is disconnected we write it as the union of two
'separate' pieces,
\[ D = \{(x,y): x^2 + y^2 \le 1\} \cup \{(x,y): x^2 + y^2 \ge 2\}, \]
and similarly we write $E$ as the union of two separate pieces
\[ E = \{(x,y): x > 0 \text{ and } y \ge 0\} \cup \{(x,y): x < 0 \text{ and } y \le 0\}. \]
Fig. 5.5 The sets $D$, $E$ and $F$ in $\mathbb{R}^2$
But one could argue that
\[ F = \{(x,y): x \ge 0 \text{ and } y \ge 0\} \cup \{(x,y): x \le 0,\ y \le 0 \text{ and } (x,y) \ne (0,0)\} \]
splits $F$ into two separate pieces. What is it that distinguishes the split
of $D$ and $E$ above from the split of $F$? In the cases of $D$ and $E$ the two
parts can be surrounded by disjoint open sets (see Figures 5.6 and 5.7):
in
\[ D = \{(x,y): x^2 + y^2 \le 1\} \cup \{(x,y): x^2 + y^2 \ge 2\} \]
the first part lies in $\{(x,y): x^2 + y^2 < \tfrac{3}{2}\}$ and the second in
$\{(x,y): x^2 + y^2 > \tfrac{3}{2}\}$, and these two sets are open and disjoint; and in
\[ E = \{(x,y): x > 0 \text{ and } y \ge 0\} \cup \{(x,y): x < 0 \text{ and } y \le 0\} \]
the first part lies in $\{(x,y): x + y > 0\}$ and the second in
$\{(x,y): x + y < 0\}$, and these two sets are again open and disjoint.
Fig. 5.6
Fig. 5.7
Fig. 5.8
However, we cannot do this in the case of our split of $F$ (see Figure
5.8):
\[ F = \{(x,y): x \ge 0 \text{ and } y \ge 0\} \cup \{(x,y): x \le 0,\ y \le 0 \text{ and } (x,y) \ne (0,0)\}. \]
Suppose that the first part lies in a set $B$.
If $B$ is open then, as $(0,0) \in B$, it follows that a ball centre (0,0) lies
entirely in $B$. So $B$ must clearly overlap the other part of $F$.
Definition In a metric space $(X,d)$ a set $A$ is disconnected if $A =
A_1 \cup A_2$ where $A_1, A_2$ are non-empty and $A_1 \subseteq B$, $A_2 \subseteq C$ for some
disjoint open sets $B$ and $C$. Otherwise $A$ is connected.
As you might expect, the connected sets in $\mathbb{R}$ are precisely the intervals
($[0,1]$, $]2,4]$, $]5,9[$, $]1,\infty[$, $\mathbb{R}$, etc.) which are characterised by the fact
that if they contain two points then they contain all points between
them.
Theorem 5.9 In $\mathbb{R}$ (with the usual metric) the connected sets are
precisely the intervals.
Proof Assume first that $A \subseteq \mathbb{R}$ is disconnected. Then $A = A_1 \cup A_2$
with $b \in A_1$, $c \in A_2$ and $b < c$, say, and with $A_1 \subseteq B$, $A_2 \subseteq C$ for some
disjoint open sets $B$ and $C$. The set $[b,c] \cap B$ is non-empty and
bounded above and so it has a supremum $\xi$. Now $\xi \notin C$, for if it were in
that open set then there would exist $r > 0$ with $]\xi - r, \xi + r[\, \subseteq C$. That
would make $\xi - r$ an upper bound for $[b,c] \cap B$ and would contradict
the choice of $\xi$ as the least upper bound. Similarly $\xi \notin B$, for if it were in
that open set there would exist $s > 0$ with $]\xi - s, \xi + s[\, \subseteq B$ which
(together with the fact that $\xi < c$) would contradict the fact that $\xi$ is an
upper bound of $[b,c] \cap B$. Hence $\xi \notin B \cup C$ and as $A \subseteq B \cup C$ it
follows that $\xi \notin A$. Hence $b, c \in A$ but $\xi \in [b,c]$ and $\xi \notin A$: it follows that
$A$ is not an interval.
Conversely if $A \subseteq \mathbb{R}$ is not an interval then there exist $b, c \in A$ and an
$a \in\, ]b,c[$ with $a \notin A$. Then
\[ A = (A \cap\, ]-\infty,a[) \cup (A \cap\, ]a,\infty[), \]
where the two pieces are non-empty and lie in the disjoint open sets
$]-\infty,a[$ and $]a,\infty[$ respectively,
shows that $A$ is disconnected.
Hence $A$ is disconnected if and only if it is not an interval, and the
theorem follows. •
We are now able to recall the most classic result of elementary
analysis: we only give an outline of the sort of proof encountered in
elementary courses.
Theorem 5.10 (The intermediate value theorem.) Let $f: [a,b] \to \mathbb{R}$ be
continuous and let $c$ be a number between $f(a)$ and $f(b)$. Then $c = f(\xi)$
for some $\xi \in [a,b]$; i.e. $f$ takes all 'intermediate values'.
Outline proof Assume that $f(a) < c < f(b)$ and let $B =
\{x: f(x) < c\}$ (see Figure 5.9). Then $B$ has a supremum $\xi$ and
(by various continuity arguments) it can be shown that
$\xi \notin B$ and $\xi \notin C = \{x: f(x) > c\}$.
(Basically this is choosing a number in $[a,b]$ but not in the
disjoint open sets $B = f^{-1}(]-\infty,c[)$ and $C = f^{-1}(]c,\infty[)$; i.e. it is
using the connectedness of $[a,b]$.)
Fig. 5.9 (graph of $y = f(x)$)
Hence $f(\xi) = c$ as required. •
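The intermediate value theorem is what guarantees that the bisection method has something to find. The following is a minimal sketch of bisection, which is a different technique from the supremum argument outlined above and need not return the supremum of B; the function name, tolerance and test function are choices made here for illustration.

```python
def bisect(f, a, b, c, tol=1e-12):
    """Approximate some xi in [a, b] with f(xi) = c, for continuous f with f(a) < c < f(b)."""
    if not f(a) < c < f(b):
        raise ValueError("need f(a) < c < f(b)")
    lo, hi = a, b                 # invariant: f(lo) < c <= f(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < c:
            lo = mid              # a point with f(xi) = c lies to the right of mid
        else:
            hi = mid              # a point with f(xi) = c lies at or left of mid
    return 0.5 * (lo + hi)

# Example: solve x^3 = 2 on [1, 2], i.e. approximate the cube root of 2.
xi = bisect(lambda x: x ** 3, 1.0, 2.0, 2.0)
```

Each pass halves an interval on which f still takes values on both sides of c, so continuity forces a solution of f(ξ) = c to remain inside it.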
Exercise 105 Let $(X,d)$ be a connected metric space (i.e. $X$ is
itself connected) and let $f: X \to \mathbb{R}$ be continuous. Assume
that $a, b \in X$ and $c \in \mathbb{R}$ are such that $f(a) < c < f(b)$. Show that
there exists $\xi \in X$ with
\[ \xi \notin f^{-1}(]-\infty,c[) \cup f^{-1}(]c,\infty[). \]
Deduce that $f(\xi) = c$.
Our final result generalises Theorem 5.10 and Exercise 105 and, at the
same time, its proof becomes most straightforward and natural.
Theorem 5.11 Let (X, d) and (X′, d′) be metric spaces with (X, d)
connected. Then if f : X → X′ is continuous it follows that f(X) =
{f(x): x ∈ X} is connected in (X′, d′).
Proof We shall show that if f(X) is disconnected then X is
disconnected. Indeed, we shall show that if the open sets B, C
'disconnect' f(X) then the open sets f⁻¹(B), f⁻¹(C) will 'disconnect' X.
Assume then that
f(X) = A₁ ∪ A₂, with A₁ and A₂ non-empty, A₁ ⊂ B and A₂ ⊂ C,
where B and C are open and disjoint in (X′, d′).
Then, by the continuity of f, the sets f⁻¹(B) and f⁻¹(C) are open in
(X, d). Also they are disjoint since
f⁻¹(B) ∩ f⁻¹(C) = f⁻¹(B ∩ C) = f⁻¹(∅) = ∅.
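Finally
X = f⁻¹(f(X)) = f⁻¹(A₁ ∪ A₂) = f⁻¹(A₁) ∪ f⁻¹(A₂),
with f⁻¹(A₁) ⊂ f⁻¹(B) and f⁻¹(A₂) ⊂ f⁻¹(C), where f⁻¹(B) and f⁻¹(C)
are open and disjoint. Both f⁻¹(A₁) and f⁻¹(A₂) are non-empty because
A₁ and A₂ are non-empty subsets of f(X), and so X is disconnected.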
So if X is connected it follows that f(X) is connected. •
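To see how Theorem 5.11 recovers Theorem 5.10, the chain of implications can be sketched as follows. This is an outline under stated assumptions rather than a full proof: it takes f(a) < c < f(b), treats [a, b] with the usual metric as a metric space in its own right, and uses the earlier result that the connected subsets of ℝ are exactly the intervals.

```latex
\begin{align*}
&[a,b] \text{ is an interval, hence connected;}\\
&f \text{ continuous} \implies f([a,b]) \text{ is connected in } \mathbb{R}
   \quad\text{(Theorem 5.11);}\\
&f([a,b]) \text{ connected in } \mathbb{R} \implies f([a,b]) \text{ is an interval;}\\
&f(a),\, f(b) \in f([a,b]) \text{ and } f(a) < c < f(b) \implies c \in f([a,b]);\\
&\text{i.e. } c = f(\xi) \text{ for some } \xi \in [a,b].
\end{align*}
```

The same chain, with a connected space X in place of [a, b], gives the conclusion of Exercise 105.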
5.6 Some final remarks
I hope that the reader who has persevered this far will have
found the task rewarding. Not everyone appreciates mathematical
abstraction and for that reason our whole approach was motivated
by iterative techniques. However, having set up all the machinery for
that approach, it was then an easy step for us to use that machinery to
look with fresh insight at analysis.
Those readers who wish to pursue the subject further (perhaps
towards functional analysis or to topology) can read any of the
excellent books on the subject. In particular the following three have
given me a great deal of pleasure and I can recommend them to you.
The first covers a little more material than our approach, but the
other two go much further.
Metric Spaces, by E. T. Copson (Cambridge University
Press, 1968).
Introduction to Topology and Modern Analysis, by G. F.
Simmons (McGraw Hill, 1963).
Introduction to Metric and Topological Spaces, by W. A.
Sutherland (Oxford University Press, 1975).
Finally, I hope that you have enjoyed this natural approach to a
piece of abstract mathematics, and I wish you well in your future
reading.
INDEX
attained bound 92-4

Banach's fixed point principle 58-60
bibliography 103, 104
bounded sequence 39
bounded set 21, 48

C[a, b] 20
Cauchy criterion 39
Cauchy sequence 39
closed interval 19
closed set 30-5
compact set 45-51
compact space 69
completeness of C[a, b] 42
completeness of ℝ 39
completeness of ℝ² 41
complete set 38-45
complete space 40, 45
connected 100, 101
continuous 5, 31, 87
convergent 3, 23
coordinatewise convergence 24, 26-8
contraction 57
contraction mapping principle 58-60

differential equation 68, 71-83
disconnected 100
discrete space 16
distance 12, 13

Euclidean distance 14

fixed point 2, 56

half-plane 33

implicit function 84
implicit function theorem 84
integral equation 10, 68
intermediate values 98
intermediate value theorem 101
inverse function 96-8
irrational 30
iterate 64
iteration 2, 31

least upper bound 21
lift metric 17, 18
limit 3, 23, 28
linear algebra 61, 62

matrices 61, 62
max metric on C[a, b] 19, 20
max metric on ℝⁿ 18
mean value theorem 63
metric 13
metric space 13
metric (usual) on C[a, b] 19
metric (usual) on ℝ, ℂ and ℝ² 14

Newton-Raphson 4

open ball 89
open interval 10
open set 35, 90

rational 29, 30

sequence 2, 22
self-counting list 7
simultaneous differential equations 80-3
simultaneous equations 57, 58, 60
subsequence 28
sup metric 21
supremum 21

topology 35, 91
triangle inequality 13

uniformly continuous 94-6
uniformly convergent 28
unique fixed point 56, 58