Department of Mathematics and Applied Mathematics

University of Cape Town

Real Analysis
(MAM2000W Module 2RA)

Originally written by Anneliese Schauerte


Revised and expanded version by Jurie Conradie

2010
Contents

1 Uncomfortable questions

2 Numbers and Sequences
2.1 Rational numbers
2.2 Sequences
2.3 Least upper and greatest lower bounds
2.4 Real numbers at last

3 Convergent and divergent sequences
3.1 Sequences converging to zero
3.2 Convergent sequences
3.3 Divergent sequences
3.4 Subsequences
3.5 Cauchy sequences

4 Infinite series
4.1 Basic definitions and properties
4.2 Tests for convergence
4.3 Tests for non-negative series
4.4 Regroupings and rearrangements of series
4.5 Power series

5 Continuous and differentiable functions
5.1 A first look at continuity and differentiability
5.2 Limits of functions
5.3 Continuous functions
5.4 Differentiable functions

6 Sequences and series of functions
6.1 Sequences of functions
6.2 Series of functions

7 Integration
7.1 What is integration?
7.2 Partitions, sums and integrals
7.3 When does the integral exist?

A Logic and proofs
A.1 Basic logic
A.2 Definitions, proofs and counterexamples

B Rational numbers


Chapter 1

Uncomfortable questions

Real analysis presumably has something to do with real numbers, and analysis
suggests that whatever we are going to do, it will not be done superficially. In
fact, real analysis certainly has to do with a deeper understanding of real numbers
and what makes them tick, but it goes beyond that and also looks at real-valued
functions and their properties. You may feel that you have a pretty good idea of
what a real number is, and that you spent a substantial part of your first year
working with real-valued functions when you were doing calculus. And then there
was more of this in the Advanced Calculus module, only there you had to work with
real-valued functions of more than one variable. What more is there to say?
In the rest of this section we ask some questions which will hopefully make you
a little uncomfortable, or at least suggest that there are still some problems to be
solved.
Question 1. What is a real number?

This is easy, you may feel. A real number is a number that is either a rational
number or an irrational number.
Good. But what is a rational number?
Easy, you say. It is a number that can be expressed in the form p/q, where p is an
integer and q a non-zero integer.
Good (for the moment). But what is an irrational number?
Well, those are just all the numbers that are not rational, you say (a little irritated).
Ah, but what are these “numbers” you refer to?
The real numbers, you say, ... and then realise you have come full circle.
Perhaps saying that you know what a real number is was a little premature. But
you have worked with them a lot, and you do think that you have a pretty good feel
for them. Now the next question comes from you.

Question 2. Can’t we think of real numbers as being just points on the number
line?

This is certainly a possibility. Let’s first look at how we construct a number
line. The first step (once you’ve drawn the line) is to choose a point to represent
0. The next step is to decide on a unit of length. Once you have fixed this length
(which can be anything other than 0), you can mark off a point to the right of 0 at a
distance from 0 equal to this length; this point represents 1. The point at the same
distance to the right of 1 represents 2, and in the same way we can mark off 3, 4, 5,
etc. Doing the same to the left of 0 gives us the points representing −1, −2, −3, ...
To find the point corresponding to, say, 2/3, first divide the interval between 0 and 1
into three equal pieces. This can be done using only an unmarked ruler (also called
a straight-edge) and a pair of compasses. (How?) The point corresponding to 2/3 lies
at the end of the second piece. In a similar way one can find a point representing
any rational number.

Figure 1.1: The real line. (Marked points: −2, −1, 0, 1/3, 2/3, 1, 2.)

This brings us to our next question.


Question 3. Are there any points on the number line not corresponding to rational
numbers?

Figure 1.2: A length $x$ such that $x^2 = 2$. (Marked points: −2, −1, 0, 1, x, 2.)

This is not at all such a daft question as it may seem to us nowadays. For a
very long time Greek mathematicians thought that every point on the number line
corresponded to a rational number. But Pythagoras’ theorem contained a threat to
this belief. If we construct a right-angled triangle with the two sides adjacent to the
right angle both equal to the unit length, then this theorem tells us that the length
x of the hypotenuse is such that $x^2 = 1^2 + 1^2 = 2$. Hence there is a point on the
number line at distance x from 0.
Question 4. Does the point at a distance x from 0 (where x > 0 and $x^2 = 2$) represent
a rational number?

Well, let’s suppose it does. Then we can find two positive integers p and q such
that x = p/q and p and q have no common factor other than 1 (if they did, we could
cancel the common factor). Since $x^2 = 2$, we get $p^2/q^2 = 2$, or $p^2 = 2q^2$. It follows
from this that $p^2$ is an even number. But then p is also an even number. (Prove this;
if you are stuck, look at Appendix A.) There is therefore a positive integer k such
that p = 2k. From this we get $2q^2 = p^2 = 4k^2$, or $q^2 = 2k^2$. But then $q^2$ is even,
and hence q is even as well. We have shown that both p and q are even; this means that
they have the factor 2 in common. But we have assumed that they have no factor
in common. This contradiction means that our assumption that there is a rational
number x such that $x^2 = 2$ was wrong.
In terms of the number line this means that there is a point on the number
line that does not correspond to a rational number. If we think of real numbers as
numbers corresponding to points on the number line, then we now have (at least)
one real number that is not a rational number. Adapting the proof above enables
us to find many more. (You are asked to find one more in the exercises.)
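The argument above can be accompanied by a small experiment. The sketch below
searches for positive integers p and q with $p^2 = mq^2$; the function name and the
search bound are our own choices for illustration. A finite search can of course
never replace the proof, since it only rules out denominators up to the bound.

```python
from math import isqrt

def rational_square_root(m, max_q):
    """Search for positive integers p, q (q <= max_q) with p*p == m*q*q.

    Returns the first pair (p, q) found, or None if there is none in range.
    """
    for q in range(1, max_q + 1):
        target = m * q * q
        p = isqrt(target)          # largest integer with p*p <= target
        if p * p == target:
            return (p, q)
    return None

print(rational_square_root(2, 10_000))   # None, as the proof predicts
print(rational_square_root(4, 10_000))   # (2, 1), since 4 is a perfect square
```

Exact integer arithmetic is used throughout, so no rounding error can produce a
false match.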
The discovery that there are lengths that do not correspond to rational numbers
came as a profound shock to Greek mathematicians, and led to a distrust of algebra.
They felt that the only safe way to do arithmetic was to work with lengths
corresponding to numbers, and to use geometric constructions to find sums, differences,
products and quotients of numbers. This cautious approach led to stagnation in
Greek mathematics, and it was in India and the Arab world that the major advances
in algebra were made in the Middle Ages.
Using geometry to do arithmetic is not such a practical idea. But what use are
numbers to us if we cannot do arithmetic with them? We know how to do arithmetic
with rational numbers, but what about irrationals? Our problem is that our only
“definition” of a real number at the moment is “something that corresponds to a
length” (for positive real numbers), or, more generally, “a point on the number line”.
You have a flash of inspiration: decimal representations!
Question 5. Can’t we define real numbers using decimal representations?

The idea goes something like this: Rational numbers can be represented by
terminating or recurring decimals. Irrational numbers are numbers with non-recurring
decimal representations. Let’s explore this idea in a little more detail.
A decimal representation of a number is an abbreviation. As an example, 1.375
stands for $1 + \frac{3}{10} + \frac{7}{100} + \frac{5}{1000}$; we can also write this number as the
rational number $\frac{1375}{1000}$. In the same way any decimal representation with a finite
number of digits is an abbreviation for a rational number, with denominator some power of 10.
Does every rational number have a decimal representation? Most of you will
probably feel quite confident in answering “yes” to this question. You will remember
a method for finding one that you have seen at school, which uses a process of “long
division”. Given a rational number, we divide the denominator into the numerator;
if there is a remainder, we divide the denominator into 10 times the remainder, and
so on. If the denominator of a rational number has only 2 and 5 as prime factors,
we’ll reach a point where the remainder is 0. (In the exercises we ask you to explain
why this is the case.) The finite sequence of numbers we get in this case gives us the
digits in the decimal representation. (This is not quite as obvious as you may think:
we ask you to give the reason in the exercises.) But if the denominator has a prime
factor other than 2 or 5, we’ll never reach a point where there is no remainder, and
what we hope will be the decimal representation becomes an infinite sequence of
digits. This leads to the next uncomfortable question:
Question 6. What does an infinite decimal representation mean?

The question is most easily understood in terms of an example. If we use the
method outlined above to find a decimal representation for 5/6, we get
\[ \frac{5}{6} = 0.833333\ldots \]
(This was to be expected, since 6 has the prime factors 2 and 3.) But what does the
right hand side mean? If we see it as the same kind of abbreviation as before, it should
stand for
\[ 0 + \frac{8}{10} + \frac{3}{100} + \frac{3}{1000} + \frac{3}{10000} + \cdots \]
But surely it is impossible to add infinitely many numbers! You can spend the rest
of your life trying to do this addition, and never finish. So how can we claim that 5/6
is equal to this infinite sum?
Question 7. Can we attach a meaning to an infinite sum?

If we look at the example above, there is a glimmer of hope. You may recognise
$\frac{3}{100} + \frac{3}{1000} + \frac{3}{10000} + \cdots$ as an infinite geometric series, and remember that
there is a formula for the sum of such a series. The series has first term $\frac{3}{100}$ and
common ratio $\frac{1}{10}$, so the formula gives
\[ \frac{8}{10} + \frac{3}{100} + \frac{3}{1000} + \frac{3}{10000} + \cdots = \frac{4}{5} + \frac{\frac{3}{100}}{1 - \frac{1}{10}} = \frac{5}{6}. \]

It works! The formula seems to have saved the day in this example. But will
it do the same for any rational number? If we use “long division” to find the
decimal representation for any rational number, will we always end up with an
infinite geometric series to which we can apply the formula to find its sum? You may
remember hearing that if we use this method to find the decimal representation of a
rational number, we always get a finite or recurring infinite decimal representation.
Do you know why? (Try to work this one out for yourself.) If we accept that this is
the case, then this recurring decimal representation, when written out in full, will
give us an infinite geometric series. This is nicely illustrated by the rational number
1/7:
\[ 0.142857142857142857\ldots = \frac{\frac{142857}{1000000}}{1 - \frac{1}{1000000}} = \frac{142857}{999999} = \frac{1}{7}. \]
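The “long division” method can be sketched in a few lines of code. The function
below is only an illustration (its name and return format are our own choices):
it carries out the division digit by digit and keeps a record of the remainders it
has met, stopping as soon as the remainder is 0 or a remainder shows up for the
second time.

```python
def decimal_expansion(p, q):
    """Long division of p by q for positive integers p and q.

    Returns (integer_part, prefix, repetend) as strings, where prefix is the
    non-recurring block of digits after the decimal point and repetend is the
    recurring block (empty if the expansion terminates).
    """
    integer_part, remainder = divmod(p, q)
    digits = []
    seen = {}                      # remainder -> position where it first occurred
    while remainder != 0 and remainder not in seen:
        seen[remainder] = len(digits)
        # Divide q into 10 times the remainder to get the next digit.
        digit, remainder = divmod(10 * remainder, q)
        digits.append(str(digit))
    if remainder == 0:
        return str(integer_part), "".join(digits), ""
    start = seen[remainder]
    return str(integer_part), "".join(digits[:start]), "".join(digits[start:])

print(decimal_expansion(1, 7))    # ('0', '', '142857')
print(decimal_expansion(5, 6))    # ('0', '8', '3')
print(decimal_expansion(11, 8))   # ('1', '375', '')
```

Watching which remainders can occur for a given q is a good starting point for
the “Do you know why?” question above.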

But have you really solved the problem by quoting a formula? And where does
the formula come from? How does the formula get around the fundamental question
of how to add infinitely many numbers?
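Although we cannot add infinitely many numbers, we can add the first n of them
for any n we like. The sketch below does this exactly for the series
8/10 + 3/100 + 3/1000 + · · · and prints how far each finite sum is from 5/6. It is
an experiment only (the function name is our own), and it does not yet tell us
what the infinite sum should mean.

```python
from fractions import Fraction

def partial_sum(n):
    """Sum of the first n terms of 8/10 + 3/100 + 3/1000 + ... (n >= 1)."""
    total = Fraction(8, 10)
    for k in range(2, n + 1):
        total += Fraction(3, 10 ** k)
    return total

for n in range(1, 6):
    s = partial_sum(n)
    # Print n, the exact sum of the first n terms, and its distance from 5/6.
    print(n, s, Fraction(5, 6) - s)
```

In exact arithmetic the gap after n terms is 1/(3 · 10^n): it is never 0, but it
can be made as small as we please by taking n large enough.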
The problems don’t end there, unfortunately. We introduced decimal
representations in the hope that this would enable us to do arithmetic with real numbers,
that this would give us a more practical way of doing this than geometrical
constructions with line segments. The following example illustrates a method often used to
convert recurring decimals to fractions, and suggests that it is business as usual:

Let x = 0.3333 . . .
Then 10x = 3.333 . . .
Hence 9x = 10x − x = 3.333 . . . − 0.333 . . . = 3,
and so x = 1/3.
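The same trick can be written as a short routine. This is a sketch under the
same unproved assumption the calculation above makes, namely that we may
multiply and subtract infinite decimals digit by digit; the function name and its
string arguments are hypothetical choices for illustration.

```python
from fractions import Fraction

def recurring_to_fraction(prefix, repetend):
    """Convert 0.<prefix><repetend><repetend>... to an exact fraction.

    prefix and repetend are strings of digits; repetend must be non-empty.
    """
    k, r = len(prefix), len(repetend)
    # Let x be the number. Then 10**(k+r) * x and 10**k * x have the same
    # digits after the decimal point, so their difference is an integer.
    shifted_far = int(prefix + repetend)          # integer part of 10**(k+r) * x
    shifted_near = int(prefix) if prefix else 0   # integer part of 10**k * x
    return Fraction(shifted_far - shifted_near, 10 ** (k + r) - 10 ** k)

print(recurring_to_fraction("", "3"))        # 1/3
print(recurring_to_fraction("8", "3"))       # 5/6
print(recurring_to_fraction("", "142857"))   # 1/7
```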
It looks as if we can do calculations with infinite decimals in the same way as
with finite decimals. But how do we go about calculating 4 × 0.3333 . . .? We
can’t start multiplying on the right, because there is no “rightmost” digit. What
makes this more infuriating is that we know what the answer should be: surely
4 × 0.333 . . . = 4 × 1/3 = 4/3 = 1.3333 . . . .
So we do have a problem with doing arithmetic with infinite decimals, and this
problem crops up even in the case of decimals representing rational numbers. (In
this case we at least have an escape route: we can convert them to rationals, do
arithmetic with rationals, and convert the answer back to decimal form.)
In short:
Question 8. How does one do arithmetic with infinite decimals?

In case you may have forgotten, recall that we are still trying to say what a real
number is. The idea we are exploring is that we could think of real numbers as
being represented by finite or infinite decimal “expansions”. We have seen that
there is a problem with infinite decimal expansions: we need to be able to make
sense of infinite sums of (rational) numbers. In the case of recurring decimal
expansions we could still try to hide behind a formula. Using this formula, we can
show that every rational number has a decimal expansion that is either finite, or
infinite but recurring. Conversely, we can show that every finite or recurring infinite
decimal expansion represents a rational number. The idea is that all the other
decimal expansions represent irrational numbers. But in the case of the other decimal
expansions, we cannot think of them as infinite geometric series any more (there is
no “common ratio”), so the crutch of the formula falls away. So we are left with:
Question 9. How can we make sense of non-recurring decimal expansions?

We have hit several stumbling blocks in trying to identify real numbers with
decimal expansions. The gist of the problem seems to be that we do not understand
“infinite sums” and how to work with them. This does not mean that the problem
is insurmountable, simply that there is work to be done. We’ll spend the first part
of this course trying to sort out these problems.
For the record we should also mention here that even our acceptance of a rational
number as “a number that can be written in the form p/q, with p an integer and
q a non-zero integer” is problematic. What do we mean by “in the form p/q”?
Are 1/2 and 2/4 different rational numbers? Fortunately these problems can be
resolved, but not without some work. These issues are perhaps best dealt with in a
course on algebra, and so we’ll stick to our rather uncritical acceptance of rational
numbers in this course.
Real analysis is not only about understanding real numbers, but also about
understanding real-valued functions. It is perhaps not surprising that there are
troubling questions that we can ask about them as well. You may recall that there
were a number of rather important theorems in your calculus course that you did
not prove, such as the mean value theorem. We’ll address these omissions in this
module. But there are also new questions that arise when we try to add infinitely
many functions, a situation that arises surprisingly frequently in practice.
One such situation is readily understandable with your first-year knowledge of
Taylor polynomials. As an example, we look at the function $f(x) = e^x$. It is an
infinitely differentiable function (that is, it has derivatives of every order). From
this it follows that we can find Taylor polynomials of all orders for it. The Taylor
polynomial of order n about x = 0 is given by
\[ p_n(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots + \frac{x^n}{n!}. \]
This polynomial gives an approximation to $f(x) = e^x$. The larger n is, the better
this approximation becomes. Since f is infinitely differentiable, we could write down
a “Taylor polynomial of infinite degree” (not acceptable terminology!) for f:
\[ p(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots. \]
The proper name for something like this is a Taylor series. You will probably notice
that we have here one of our old problems in disguise: for every real number x,
the right hand side of the equation above is an infinite sum of real numbers. Even
supposing we can in some way make sense of such a sum, the questions do not stop
there. Is it true that $e^x = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \cdots$ for every x? If it is not true for
every x, for which values of x (if any) will it be true?

Figure 1.3: The function f (x) = ex and its first three Taylor polynomials.

Similar questions could of course be asked for any infinitely differentiable
function, such as the sine and cosine functions for example.
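The claim that the Taylor polynomials of $e^x$ give better approximations as n grows
can be explored numerically. The sketch below (function name ours) evaluates
the Taylor polynomial of order n about 0 and compares it with the library value of
the exponential function; this is numerical evidence at one point, not a proof.

```python
import math

def taylor_exp(x, n):
    """Value at x of the Taylor polynomial of order n of e^x about 0."""
    term = 1.0      # current term x**k / k!, starting with k = 0
    total = 1.0
    for k in range(1, n + 1):
        term *= x / k          # turns x**(k-1)/(k-1)! into x**k/k!
        total += term
    return total

x = 1.0
for n in (1, 2, 3, 5, 10):
    approx = taylor_exp(x, n)
    # Print the order, the approximation, and its distance from e^x.
    print(n, approx, abs(math.exp(x) - approx))
```

Changing x to a larger value such as 5.0 shows that more terms are needed before
the approximation becomes good.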
To summarise:
Question 10. Where (if anywhere) does the Taylor series of an infinitely
differentiable function equal the function?
If the Taylor series of an infinitely differentiable function f about
x = 0 equals the function for every x in some interval (−a, a), then we can write
\[ f(x) = \sum_{n=0}^{\infty} a_n x^n \]
for every x ∈ (−a, a). Roughly speaking, we can think of f as a “polynomial of
infinite degree”. It is now natural to ask whether such “polynomials” behave in the
same way as ordinary polynomials of finite degree. In particular, can we differentiate
and integrate them in the same way as ordinary polynomials? When we differentiate
a polynomial, we use the rules which say that the derivative of a sum of functions
is equal to the sum of their derivatives, and the derivative of a constant times a
function is equal to the constant times its derivative. For integration there are
similar rules. But are these rules also valid for “infinite sums”? Specifically:
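Question 11. If
\[ f(x) = \sum_{n=0}^{\infty} a_n x^n \quad \text{for every } x \in (-a, a), \]
is f differentiable and integrable, and do we have
\[ f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1} \quad \text{and} \quad \int_0^x f(t)\,dt = \sum_{n=0}^{\infty} a_n \frac{x^{n+1}}{n+1} \]
for every x ∈ (−a, a)?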


Non-differentiable functions cannot be approximated by Taylor polynomials. But
such functions occur frequently in practice. An example of such a function is the
so-called square wave function, which is defined on the interval [−π, π) by
\[ f(x) = \begin{cases} 0 & \text{for } -\pi \le x \le 0, \\ 1 & \text{for } 0 < x < \pi. \end{cases} \]
(Note that f is discontinuous, and therefore certainly not differentiable.) We can
extend the definition of f to the whole real line by requiring that f(x + 2π) = f(x)
for all x ∈ R; f then becomes a periodic function with period 2π. Such functions can
be approximated by linear combinations of functions of the form sin kx and cos kx
(where k ∈ N), which are periodic functions themselves. These linear combinations
are known as Fourier polynomials. Figure 1.4 shows the graph of
the square wave function f and its Fourier polynomials of degree 1, 3 and 5.
Figure 1.4: Fourier polynomials for the square wave function
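For readers who want to experiment, here is a sketch that evaluates Fourier
polynomials of the square wave. The coefficients used, a constant term 1/2 and
2/(kπ) in front of sin kx for odd k, are the standard ones for this function; they
are an assumption here, since these notes have not yet explained how they are found.
The function names are our own.

```python
import math

def square_wave(x):
    """The square wave of the text, extended with period 2*pi."""
    x = (x + math.pi) % (2 * math.pi) - math.pi   # reduce to [-pi, pi)
    return 1.0 if x > 0 else 0.0

def fourier_poly(x, degree):
    """Fourier polynomial of the given degree for the square wave.

    Assumed coefficients: 1/2 for the constant term and 2/(k*pi) for sin(kx)
    with k odd; all other coefficients are 0.
    """
    total = 0.5
    for k in range(1, degree + 1, 2):     # only odd k contribute
        total += 2.0 / (k * math.pi) * math.sin(k * x)
    return total

for degree in (1, 3, 5, 51):
    x = math.pi / 2
    # Print the degree, the approximation at x, and the value of the square wave.
    print(degree, fourier_poly(x, degree), square_wave(x))
```

Evaluating at x = 0, where f jumps, gives 1/2 for every degree, which is a first
hint that Question 12 below is not trivial.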

There is nothing that stops us from writing down a “Fourier polynomial of infinite
degree”, or Fourier series for this function. (To be honest, we have not explained
how these polynomials are found. For the moment we ask you to accept that it can
be done.) Once we have done this, we can start asking the same kind of questions
that we asked for Taylor series. We’ll keep it simple:
Question 12. Where (if anywhere) does the Fourier series of a periodic function
equal the function?
We have glossed over some technicalities in the last two questions. But too
much detail would probably have obscured the essence of the problems we raised.
Hopefully there was enough to convince you that there are interesting questions that
can be asked, and that there are no easy answers.
The next problem is easy to state. Suppose we are faced with the problem of
finding the solutions (if there are any) of the equation

x = cos x.

On the left in Figure 1.5 the graphs of the functions f(x) = cos x and g(x) = x
are shown; the solution of the equation above is given by the x-coordinate of the
point of intersection of the two graphs. From the figure it seems to be clear that the
equation has a solution. Unfortunately there is no formula for solving an equation
like this. One possibility is to resort to a “trial and error” method.

Figure 1.5: Solving the equation x = cos x.

As a first step, guess a solution, say x = 1 (radian, of course). To check our guess,
we calculate cos x = cos 1. Since cos 1 ≠ 1, we do not have a solution yet. In the graph
on the right in Figure 1.5 a horizontal line with equation y = cos 1 has been drawn. The
intersection of this line with the line y = x has x-coordinate cos 1. It looks as if cos 1
may be a slightly better approximation to the solution of the equation x = cos x
than 1. So we try x = cos 1 as a second approximation. More formally, we can write
$x_1 = 1$ (our first guess), $x_2 = \cos x_1 = \cos 1$ (our second guess). A check shows that
$x_2$ is still not a solution, so we take $x_3 = \cos x_2$ as our third approximation. The
graph suggests that if we continue in this way, we are going to get better and better
approximations to the solution. It looks as if the sequence $(x_n)$ defined inductively
by
\[ x_1 = 1, \qquad x_{n+1} = \cos x_n, \quad n \ge 1 \]
will converge (whatever that means!) to the solution of the equation x = cos x. But
can we always use this method to solve an equation of the form x = f(x)?
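The “trial and error” method is easy to try out on a computer. The following
sketch (the helper name is ours) generates the first terms of the sequence and
prints, for a few of them, how far $x_n$ is from cos $x_n$.

```python
import math

def iterate(f, x1, steps):
    """Return the list [x1, x2, ..., x_steps] with x_{n+1} = f(x_n)."""
    xs = [x1]
    for _ in range(steps - 1):
        xs.append(f(xs[-1]))
    return xs

xs = iterate(math.cos, 1.0, 50)
for n in (1, 2, 3, 10, 50):
    x = xs[n - 1]
    # Print n, the term x_n, and the size of the "miss" |x_n - cos x_n|.
    print(n, x, abs(x - math.cos(x)))
```

The printed misses shrink steadily, which supports what the graph suggests but
does not explain why it happens.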
To get a better “feel” for the method, we look at a second example. This time
we try to solve the equation $x = e^{0.5x} - 1$. If we take $x_1 = 2$ as our initial guess,
and put $x_{n+1} = e^{0.5x_n} - 1$, we see from Figure 1.6 that the sequence $(x_n)$ obtained
seems to converge to the solution x = 0. This is not all that useful, since it is easy
to spot this solution immediately. Starting with $x_1 = 3$ leads to a sequence with
terms just getting larger and larger. It is clear from the graph, however, that there
is a solution other than x = 0 to this equation. A little bit of experimentation and
a good look at the graph should be enough to convince you that we’ll end up with
a sequence converging to 0 or a sequence with terms getting larger and larger, no
matter which starting point (other than the solution itself) we choose. It is natural
to ask why the method works for the first equation, but not for the second.

Figure 1.6: Attempting to solve the equation $x = e^{0.5x} - 1$.
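The two behaviours described for this equation can be reproduced with a short
experiment. In the sketch below the names and the cut-off value `bound` are our own
choices; the cut-off is only there to stop the computation before the terms
become too large for the computer to represent.

```python
import math

def g(x):
    return math.exp(0.5 * x) - 1.0

def orbit(f, x1, steps, bound=1e6):
    """Iterate x_{n+1} = f(x_n), stopping early once |x_n| exceeds bound."""
    xs = [x1]
    while len(xs) < steps and abs(xs[-1]) <= bound:
        xs.append(f(xs[-1]))
    return xs

print(orbit(g, 2.0, 40)[-1])   # very small: the terms approach 0
print(orbit(g, 3.0, 40))       # stops early: the terms grow very quickly
```

Trying starting points between 2 and 3 is a good way to look for the second
solution that the graph promises.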
The method can go wrong in other ways as well. If it is used to try to solve the
equation
\[ x = 4x^2 e^{-x^2}, \]
most starting points lead to a sequence with terms which eventually alternate
between two different values and therefore do not get closer to any one of the solutions
of the equation.
On the left in Figure 1.7 we see what happens if we start with
$x_1 = 1$, and on the right we see what happens if $x_1 = 2$ is used as a starting point.

Figure 1.7: Attempting to solve the equation $x = 4x^2 e^{-x^2}$.
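The alternating behaviour can also be seen numerically. The sketch below (names
ours) prints the last few terms of the sequence for the two starting points used
in Figure 1.7.

```python
import math

def h(x):
    return 4.0 * x * x * math.exp(-x * x)

def orbit(f, x1, steps):
    """Return the list [x1, ..., x_steps] with x_{n+1} = f(x_n)."""
    xs = [x1]
    for _ in range(steps - 1):
        xs.append(f(xs[-1]))
    return xs

for x1 in (1.0, 2.0):
    tail = orbit(h, x1, 200)[-4:]
    # The last four terms: two values that keep swapping places.
    print(x1, tail)
```

Neither of the two values in the tail is a solution of the equation, as substituting
them into h shows.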

These examples suggest the following question:

Question 13. Suppose we try to solve the equation x = f(x) by guessing a solution
$x_1$ and then constructing a sequence $(x_n)$ by putting $x_{n+1} = f(x_n)$ for every n ≥ 1.
Will the terms of this sequence get closer and closer to something, and if they do,
will this something be a solution of the equation?
For practical applications, even just being able to find sufficient conditions on
$x_1$ and f to ensure convergence to a solution will be very useful.
In your first year you came across another method for finding approximate
solutions to equations: Newton’s method. This was a similar method to the one we
have looked at here in the sense that it was also an iterative method: every
approximation to the solution depended on the previous approximation. You may also
remember that choosing the correct starting point was important, and that a bad
choice could lead to the method failing. We can ask similar questions to Question
13 about Newton’s method.
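As a reminder of how Newton’s method works, here is a sketch that applies it to
the equation x = cos x from earlier in this chapter, rewritten as F(x) = 0 with
F(x) = x − cos x. The function names and the choice of this particular equation
are ours.

```python
import math

def newton(F, dF, x1, steps):
    """Newton's method for F(x) = 0: x_{n+1} = x_n - F(x_n) / dF(x_n)."""
    xs = [x1]
    for _ in range(steps - 1):
        x = xs[-1]
        xs.append(x - F(x) / dF(x))
    return xs

F = lambda x: x - math.cos(x)        # F(x) = 0 exactly when x = cos x
dF = lambda x: 1.0 + math.sin(x)     # derivative of F

for n, x in enumerate(newton(F, dF, 1.0, 6), start=1):
    # Print n, the approximation x_n, and the size of F(x_n).
    print(n, x, abs(F(x)))
```

Comparing the output with that of the simpler iteration $x_{n+1} = \cos x_n$ shows
that, for this equation and this starting point, far fewer steps are needed.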
In the examples above we used a real number $x_1$ and a function f to generate a
sequence $(x_n)$ of real numbers inductively by defining $x_{n+1} = f(x_n)$ for n ≥ 1. This
sequence depends on both $x_1$ and f, and is usually known as the orbit of $x_1$ under
f. From the examples we have considered it is clear that such an orbit can behave
in different ways. The orbit can be a sequence with terms getting closer and closer
to some number, or the terms can get larger and larger (positive or negative), as
can be seen in Figure 1.6. Figure 1.7 shows that the orbit could also be a sequence
which eventually alternates between two values.
But even “worse” things can happen: Figure 1.8 shows an orbit with what seems
to be randomly distributed terms filling up an interval. This is known as chaotic
behaviour.

Figure 1.8: A chaotic orbit.
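The text does not say which function was used to produce Figure 1.8. As a
stand-in, the sketch below uses the map f(x) = 4x(1 − x) on the interval [0, 1], a
standard example of chaotic behaviour; this choice, the starting point 0.2 and the
function names are all assumptions made for illustration.

```python
def logistic(x):
    # An assumed example map, not necessarily the one behind Figure 1.8.
    return 4.0 * x * (1.0 - x)

def orbit(f, x1, steps):
    """Return the list [x1, ..., x_steps] with x_{n+1} = f(x_n)."""
    xs = [x1]
    for _ in range(steps - 1):
        xs.append(f(xs[-1]))
    return xs

xs = orbit(logistic, 0.2, 1000)

# Count how many terms fall in each tenth of the interval [0, 1].
counts = [0] * 10
for x in xs:
    counts[min(int(10 * x), 9)] += 1
print(counts)
```

The counts show terms landing all over the interval rather than settling near one
or two values.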

Sequences like these arise in the study of systems that change with time, provided
we only consider the time at discrete time intervals. We call such systems discrete
dynamical systems. All of this suggests our final question:
Question 14. Is there any way in which we can predict the behaviour of a discrete
dynamical system?

In this introductory chapter we have raised many questions, and answered none of
them. If you feel somewhat less comfortable with real numbers now than before you
started reading, we have achieved our objective. Understanding real numbers
and their properties is far from a trivial exercise. But we hope that some of the
examples have also shown that it may, in the long run, be a rewarding one.

Historical Notes

Brook Taylor (1685 – 1731)
Taylor was an English mathematician best known for his discovery of the formula
for what is now known as the Taylor expansion of a function. This first appeared in
his book Methodus incrementorum directa et inversa of 1715, in which he also
develops the branch of mathematics known as the “calculus of finite differences”. He
is credited with the invention of integration by parts, and gave the first systematic
account of the basic principles of perspective. In 1712 he was elected a Fellow of the
Royal Society and became a member of the committee that had to settle the priority
dispute on the invention of the calculus between Newton and Leibniz.

Joseph Fourier (1768 – 1830)
Fourier was born in Auxerre in France, one of fifteen children. He showed great
interest in mathematics at an early age, but initially decided to train for the
priesthood. However, he never took religious vows, but took a job as a teacher in
Auxerre, at the same time doing mathematical research. It was the time of the French
Revolution and in 1793 Fourier became heavily involved in politics on the side of the
revolutionaries. His support for one faction within the revolution led to his
imprisonment on two occasions. In 1794 he became one of the first students at the
newly established École Normale in Paris and later taught mathematics at the Collège
de France and the École Polytechnique.
He joined Napoleon’s expedition to Egypt as scientific advisor in 1798 and on his
return to France was appointed by Napoleon as Prefect (an administrative position)
in Grenoble. Although he was not very keen on this position, he acquitted himself
well and in addition managed to do some of his best research, into the
mathematical theory of the conduction of heat, during this time. He established
the partial differential equation governing heat diffusion and used infinite series
of trigonometric functions (now called Fourier series in his honour) to solve the
equation. This work at first proved to be very controversial and his essay Théorie
analytique de la chaleur was only published in 1822. Fourier returned to Paris after
Napoleon was finally defeated and worked there until his death in 1830.

Exercises

1. Use a proof by contradiction to show that if p is an integer and $p^2$ is even, then
p is even as well. Is the statement still true if we replace “even” by “odd”?

2. Prove that if x is a positive real number and $x^2 = 3$, then x is not rational.
For which positive integers m can we adapt the proof to show that if $x^2 = m$,
then x is not a rational number?

3. Let p and q be positive integers.
(a) Suppose that every prime factor of q is either 2 or 5. Explain why the
method we use to obtain the decimal representation of p/q will only give
us a finite number of non-zero digits in this case. Also explain why the
decimal representation we get equals p/q.
(b) Explain why p/q has an infinite recurring decimal representation if q has a
prime factor other than 2 or 5.

4. Write 2.345345345 . . . in the form p/q, with p and q integers.

5. Find the decimal representation of 2/13.

6. In this question you may assume that every real number is either a rational
or an irrational number, but never both rational and irrational.
Prove or disprove the following statements:
(a) The sum of two rational numbers is a rational number.
(b) The sum of a rational and an irrational number is irrational.
(c) The product of two rational numbers is a rational number.
(d) The product of a rational and an irrational number is a rational number.

Chapter 2

Numbers and Sequences

In the previous chapter we raised many problems about our understanding of real
numbers. Although we have a number of helpful ways of thinking about real num-
bers, none of them turned out to give us a useful and rigorous definition. This
chapter looks at ways to remedy this unsatisfactory situation.
Let’s start by putting our expectations on the table. As usual, we’ll use Q and
R to denote the set of rational numbers and the set of real numbers respectively. We
would like to define real numbers in such a way that:

• Q is a subset of R;
• we can do everything with the elements of R that we can do with those of
Q; in particular, we can add, subtract, multiply and divide them, and compare
them for size;
• R contains more elements than Q (and in particular contains an element x
such that $x^2 = 2$);
• there is a one-to-one correspondence between the elements of R and the points
on the number line;
• there is a correspondence between the elements of R and decimals (recurring
and otherwise).

There are two possible approaches to defining real numbers:

1. The bottom-up, or constructive approach. This is done by starting with the
rational numbers and their properties and defining real numbers in terms of
rational numbers.


2. The top-down, or axiomatic approach. Here we define the real numbers as a
set with operations satisfying a set of axioms, in much the same way as the
definition of a vector space in linear algebra. To be able to talk about the set
of real numbers, one will then have to show that there is essentially only one
such set.

We’ll opt for the axiomatic approach, but will try to soften the blow by taking some
time to motivate the axioms. This will be done in two stages. Since we want the
real numbers to be a set of numbers with all the properties of the rational numbers,
we’ll start by listing the essential properties of the rational numbers. Then we’ll
try to find a property that we feel the real numbers should have, but the rational
numbers do not have.
In the second stage we shall need to use sequences, and use them in a substantial
way. As you will soon see, this chapter about numbers will end up being mostly
about sequences. As luck would have it, a good understanding of sequences will in the
next chapter help us to come to grips with infinite sums as well.

2.1 Rational numbers


In this short section we list the essential properties of the set of rational numbers.
These properties are “essential” in the sense that we can derive all other properties
of the rational numbers from them. We’ll list the properties in a form that will
remind you of the axioms for a vector space.
On the set Q of rational numbers there are two operations, addition and mul-
tiplication. More formally, addition is a function from Q × Q to Q that takes the
ordered pair (x, y) in Q × Q to x + y ∈ Q, and multiplication a function that takes
the ordered pair (x, y) ∈ Q × Q to xy ∈ Q. (Built into this definition is the fact that
Q must be closed under addition and multiplication.) Addition has the following
properties:

A1 x + y = y + x for all x, y ∈ Q (addition is commutative).


A2 (x + y) + z = x + (y + z) for all x, y, z ∈ Q (addition is associative).
A3 There is a unique element, denoted by 0, in Q with the property that x + 0 =
0 + x = x for all x ∈ Q (0 is the identity for addition).
A4 For each x ∈ Q, there exists a unique element −x ∈ Q with the property that
x + (−x) = 0 (every rational number has a unique inverse with respect to
addition).

Multiplication has the following properties:

M1 xy = yx for all x, y ∈ Q (multiplication is commutative).

M2 (xy)z = x(yz) for all x, y, z ∈ Q (multiplication is associative).

M3 There is a unique element, denoted by 1, in Q with the property that 1x = x
for all x ∈ Q (1 is the identity for multiplication).

M4 For each x ∈ Q such that x ≠ 0, there exists a unique element $x^{-1}$ ∈ Q with
the property that $x x^{-1} = 1$ (every non-zero rational number has a unique
inverse with respect to multiplication).

Addition and multiplication are linked by the distributive property:

D x(y + z) = xy + xz for all x, y, z ∈ Q.

The set of rational numbers with its operations of addition and multiplication
is not the only example of this kind of structure in mathematics. If you are doing
the Introductory Algebra module, you will see others. Any set with operations of
addition and multiplication satisfying A1 – A4, M1 – M4 and D above is called a
field, as you may have seen in the Linear Algebra module already.
In a field we can define two further operations: For x, y ∈ Q,

• subtraction is defined by x − y = x + (−y) and

• division is defined by $x/y = x y^{-1}$, provided y ≠ 0.

From the properties listed above many other familiar algebraic properties of the
set of rational numbers can be derived. We shall not do this here, but will use them
freely. Since the properties above are the axioms defining a field, these deductions
can be done for any field. If you do the Introductory Algebra module, you will see
this and more.
We can also compare rational numbers: there is an order relation between ratio-
nal numbers, which we shall denote by the usual symbol < . This relation has the
following properties:

O1 If x, y ∈ Q, then exactly one of the following holds: x < y or x = y or y < x.

O2 If x, y ∈ Q and x < y, then x + z < y + z for all z ∈ Q.



O3 If x, y ∈ Q and x < y, then xz < yz for all z ∈ Q such that 0 < z.


O4 If x, y, z ∈ Q and x < y and y < z then x < z.

A field with a relation < satisfying properties O1 – O4 is known as an ordered
field. There are ordered fields other than Q; in particular, we’ll see later that R is
also an ordered field.
As usual, y > x means the same as x < y, and x ≤ y means x < y or x = y. A
rational number x such that x > 0 is called positive, and one such that x ≥ 0 is
called non-negative.
From the axioms of an ordered field, a number of properties involving the or-
der relation can be deduced. Since we’ll need these properties when manipulating
inequalities, we list them here (without proof) for future reference.

Proposition 2.1.1 Let F be an ordered field and a, b, c ∈ F.

(a) If c > 0, then a ≤ b ⇔ ac ≤ bc.

(b) If c < 0, then a ≤ b ⇔ ac ≥ bc.

(c) If a > 0 and b > 0, then a ≤ b ⇔ $\frac{1}{a} \ge \frac{1}{b}$.

(d) If a < 0 and b < 0, then a ≤ b ⇔ $\frac{1}{a} \ge \frac{1}{b}$.

(e) If a ≥ 0 and b ≥ 0, then a ≤ b ⇔ $a^2 \le b^2$.

Some of you may feel a bit uneasy about the fact that we seem to have taken
for granted the fact that Q with its usual addition and multiplication satisfies the
properties A1 – A4, M1 – M4, D and O1 – O4. If so, you may reassure yourself by
looking in Appendix B for more detail on how this may be proved.
In the rest of this chapter we’ll frequently use the absolute value of a number.
We give the definition for rational numbers below, but the definition makes sense in
any ordered field. Since we’ll be defining the set of real numbers in such a way that
it is also an ordered field, everything we prove here for absolute values of rational
numbers will remain true for absolute values of real numbers as well.
Recall that if x ∈ Q, then |x| = x if x ≥ 0 and |x| = −x if x < 0. We call |x|
the absolute value, or modulus, of x.
It follows almost immediately from this definition that

• | − x| = |x|;
• if x, c ∈ Q and c > 0, then |x| ≤ c ⇔ −c ≤ x ≤ c;
• if x, a ∈ Q, then |x − a| = x − a if x ≥ a and |x − a| = a − x if a > x, and
therefore |x − a| is exactly the distance between x and a on the number line.

We collect some useful properties of the absolute value:

Proposition 2.1.2 For x, y ∈ Q,

(a) |xy| = |x| |y|.

(b) |x + y| ≤ |x| + |y| (the triangle inequality).

(c) | |x| − |y| | ≤ |x − y|

Proof: (a) and (b) are left as exercises. [Hint: consider cases.]
We derive (c) from (b):
(c) From (b) we have
|x| = |(x − y) + y| ≤ |x − y| + |y| and |y| = |(y − x) + x| ≤ |y − x| + |x| = |x − y| + |x|.
Combining these two inequalities we get
−|x − y| ≤ |x| − |y| ≤ |x − y|, and so | |x| − |y| | ≤ |x − y|.
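
A quick numerical spot-check can make Proposition 2.1.2 more concrete. The sketch below uses Python's standard `fractions.Fraction` type as a stand-in for Q. The helper names `abs_q` and `check_proposition_2_1_2` and the sample values are illustrative assumptions, not part of these notes. Passing such a check on finitely many pairs is only evidence, of course, and not a proof.

```python
from fractions import Fraction
from itertools import product

def abs_q(x):
    """Absolute value exactly as defined in the text: x if x >= 0, else -x."""
    return x if x >= 0 else -x

# A small, arbitrary sample of rational numbers (hypothetical test data).
samples = [Fraction(n, d) for n in range(-4, 5) for d in (1, 2, 3)]

def check_proposition_2_1_2(values):
    """Return True if (a), (b) and (c) hold for every pair drawn from values."""
    for x, y in product(values, repeat=2):
        if abs_q(x * y) != abs_q(x) * abs_q(y):        # (a)
            return False
        if abs_q(x + y) > abs_q(x) + abs_q(y):         # (b) triangle inequality
            return False
        if abs_q(abs_q(x) - abs_q(y)) > abs_q(x - y):  # (c)
            return False
    return True

all_hold = check_proposition_2_1_2(samples)
print(all_hold)  # True
```

Exact rational arithmetic is used here so that no rounding error can blur the comparison between the two sides of each inequality.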

The next step is to try to identify what it is that makes the set of real numbers
different from the set of rational numbers. We have to work for the moment with our
rather incomplete understanding of what a real number is. The route we’ll follow
will depend heavily on the use of sequences, and we therefore look at sequences of
rational numbers in the next section.

Summary:
In this section we have introduced the set of rational numbers Q axiomatically as a
set on which the two operations of addition and multiplication are defined, satisfying
certain properties with respect to these operations. Any set with two operations
satisfying these properties is called a field ; the set of rational numbers with its usual
operations is therefore an example of a field. We also listed some properties of the
order relation < on Q. Any field with a relation satisfying these properties is called
an ordered field, so that Q is an example of an ordered field. The section ended with
some important properties of inequalities and absolute values.

Exercises

1. Use the definition of the absolute value to prove that for all x, y ∈ Q,

(a) | − x| = |x|
(b) if c > 0, then |x| ≤ c ⇔ −c ≤ x ≤ c
(c) if c > 0, then |x − y| ≤ c ⇔ y − c ≤ x ≤ y + c
(d) |xy| = |x| |y|
(e) |x + y| ≤ |x| + |y|.

[Hint: Consider cases.]

2. Prove that in any field F, (−a)(−b) = ab for every a, b ∈ F.
[Hint: You may use only the axioms A1 – A4, M1 – M4 and D.]

2.2 Sequences
The idea of an infinite sequence of numbers is so natural that it is tempting to do
without a definition altogether. Informally, we think of such a sequence as a list
that can be written in the form

$a_1, a_2, a_3, a_4, a_5, \ldots,$

where the dots indicate that the numbers continue indefinitely. The subscripts on
the a’s are labels indicating the positions of the numbers in the list; so $a_3$, for
example, is the third number in the list, and more generally $a_i$ is the i-th number.
The numbers $a_i$ could be real numbers, or rational numbers, or complex numbers, or
anything else; but in this section we will restrict our attention to rational numbers.
We will call a sequence of rational numbers a rational sequence for short. A rigorous
definition of an infinite rational sequence is not hard to give; here it is:

Definition 2.2.1 An infinite rational sequence is a function a : N+ → Q.
A finite rational sequence is a function a : {1, 2, . . . , n} → Q, where n is some
fixed positive integer (n is known as the length of the sequence).

Two remarks are appropriate at this stage. Firstly, since we will be dealing
with rational sequences almost exclusively in this section, we will in the rest of this
section abbreviate “rational sequence” to “sequence”. Secondly, we will work only
with infinite sequences in what follows, so that we may as well drop the qualification
“infinite” as well. This means that if we use the word “sequence” in the rest of this
section, it really means “infinite rational sequence”.
From the point of view of the definition we have given, a sequence should be
denoted by a single letter, like a, and its values by
a(1), a(2), a(3), a(4), . . . ,
but it has become customary to use subscript notation instead. In this notation, if
a : N+ → Q is a sequence, we write
$a_1, a_2, a_3, a_4, \ldots$ instead of a(1), a(2), a(3), a(4), . . . .
We call $a_n$ the n-th term of the sequence, and denote the entire sequence by
$(a_n)_{n \in \mathbb{N}^+}$, $(a_n)_{n=1}^{\infty}$ or, more simply, $(a_n)$. Note that there is a difference between $a_n$
(the n-th term of the sequence) and $(a_n)$ (the entire sequence).

Example 2.2.2 Sequences can be defined by giving an explicit formula for the n-th
term of the sequence. Here are some examples:

(a) $a_n = 2^n$, then $(a_n) = (2, 4, 8, 16, 32, \ldots)$

(b) $a_n = \frac{1}{n}$, then $(a_n) = (1, \frac{1}{2}, \frac{1}{3}, \frac{1}{4}, \frac{1}{5}, \ldots)$

(c) $a_n = (-1)^n$, then $(a_n) = (-1, 1, -1, 1, -1, 1, \ldots)$

(d) $a_n = \sin\frac{n\pi}{2}$, then $(a_n) = (1, 0, -1, 0, 1, 0, -1, 0, 1, 0, -1, 0, \ldots)$

(e) $a_n = n!$, then $(a_n) = (1, 2, 6, 24, 120, \ldots)$
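
Since a sequence is a function on the positive integers, the sequences of Example 2.2.2 translate almost word for word into code. The following Python sketch is one way to list their first few terms. The helper `first_terms` and the variable names are made up for this illustration, and `fractions.Fraction` is used as a stand-in for Q.

```python
from fractions import Fraction
from math import factorial

def first_terms(a, count):
    """Return [a(1), ..., a(count)] for a sequence given as a function on N+."""
    return [a(n) for n in range(1, count + 1)]

# The five sequences of Example 2.2.2, each written as a function n -> a_n.
powers_of_two = first_terms(lambda n: Fraction(2) ** n, 5)
reciprocals = first_terms(lambda n: Fraction(1, n), 5)
alternating = first_terms(lambda n: Fraction((-1) ** n), 6)
# sin(n*pi/2) cycles through 1, 0, -1, 0; a lookup table avoids floating point.
sine_values = first_terms(lambda n: Fraction((0, 1, 0, -1)[n % 4]), 8)
factorials = first_terms(lambda n: Fraction(factorial(n)), 5)

print([str(t) for t in reciprocals])  # ['1', '1/2', '1/3', '1/4', '1/5']
```

Note that the code can only ever produce a finite part of each sequence; the sequence itself is the function, not the list.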

Since we can think of a sequence as a rational-valued function defined on the
set of positive integers, it is possible to draw the graph of such a function (at least
on a part of its domain!). The graph will not be a continuous curve, but a set of
dots, one corresponding to each positive integer. The graph of the second sequence
is shown in Figure 2.1. As an exercise, draw (part of) the graphs
of the sequences in the third and fourth examples above.

Example 2.2.3 A sequence can also be defined in words. As an example, let $p_n$ be
the n-th prime number. Then $(p_n)$ is the sequence of all prime numbers.

Sequences can also be defined recursively. This means that a formula or description
is given for obtaining the n-th term in terms of one or more of the preceding terms.
In such cases we can only find a term once we have obtained the preceding terms.
We illustrate this in the following two examples.

[Figure 2.1: The sequence $(\frac{1}{n})$.]


 
Example 2.2.4 Let $c_1 = 2$ and $c_{n+1} = \frac{1}{2}\left(c_n + \frac{2}{c_n}\right)$ for all n ∈ N+. Calculating
the first few terms gives
$$(c_n) = \left(2, \frac{3}{2}, \frac{17}{12}, \ldots\right).$$
There could be a problem with this definition: What if, for some n, $c_n = 0$? Then
$c_{n+1}$ would be undefined. We must therefore check that $c_n \ne 0$ for all n ∈ N+. This
follows by induction, since $c_1 > 0$ and $c_n > 0 \Rightarrow c_{n+1} > 0$ (why?); so all the terms
are positive, and no difficulty arises.
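
The recursive definition in Example 2.2.4 can be followed step by step in code. The Python sketch below (the function name `c_terms` is an assumption made for this illustration) computes the terms exactly, using `fractions.Fraction` in place of Q, and so reproduces the terms listed above.

```python
from fractions import Fraction

def c_terms(count):
    """Return [c_1, ..., c_count] (count >= 1) with c_1 = 2 and
    c_{n+1} = (c_n + 2/c_n) / 2."""
    terms = [Fraction(2)]
    while len(terms) < count:
        c = terms[-1]
        terms.append((c + 2 / c) / 2)  # exact rational arithmetic, no rounding
    return terms

first_four = c_terms(4)
print([str(c) for c in first_four])  # ['2', '3/2', '17/12', '577/408']
```

Each term can only be computed once the previous one is known, which is exactly what it means for the sequence to be defined recursively.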

Example 2.2.5 Let $u_1 = 1$ and $u_2 = 1$ and $u_{n+2} = u_n + u_{n+1}$ for all n ∈ N+. Then
$(u_n) = (1, 1, 2, 3, 5, 8, 13, \ldots)$.
This is the famous Fibonacci sequence, of which many of you may have heard before.
Note that in this case we need to know the preceding two terms to calculate the
next one.
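
As a small illustration of a recursion that looks back two terms, here is a Python sketch that generates the start of the Fibonacci sequence of Example 2.2.5. The function name `fibonacci_terms` is chosen for this example only.

```python
def fibonacci_terms(count):
    """Return [u_1, ..., u_count] with u_1 = u_2 = 1 and u_{n+2} = u_n + u_{n+1}."""
    terms = [1, 1]
    while len(terms) < count:
        # The next term needs the two preceding ones.
        terms.append(terms[-2] + terms[-1])
    return terms[:count]

print(fibonacci_terms(7))  # [1, 1, 2, 3, 5, 8, 13]
```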

Example 2.2.6 Let $(a_n)$ be any sequence and define the sequence $(b_n)$ by putting,
for each n ∈ N+, $b_n = a_{n+3}$. Then $(b_n) = (a_4, a_5, a_6, \ldots)$, so $(b_n)$ is obtained by
omitting the first three terms of $(a_n)$.
For example, if $(a_n) = (\frac{1}{n})$, then $(a_{n+3}) = (\frac{1}{4}, \frac{1}{5}, \frac{1}{6}, \ldots)$.
More generally, for any fixed k ∈ N+, the sequence $(a_{k+n})$ is obtained from the
sequence $(a_n)$ by omitting the first k terms of $(a_n)$.

There are certain types of sequences that we will frequently use in the rest of this
course. Although the names we use are generally quite good descriptions, we need
to give precise definitions. Such a definition will always be our starting point when
we prove something about a sequence of this type.

Definition 2.2.7

(a) A constant sequence is a sequence with all its terms equal. More formally, a
sequence (an ) is constant if and only if there is a real number c such that an = c for
every n ∈ N+ . Writing this in terms of quantifiers and connectives, we get

(an ) is constant ⇔ (∃c ∈ R)(∀n ∈ N+ )[an = c].

(b) An eventually constant sequence is one in which from some term onward,
all its terms are equal. More formally, (an ) is eventually constant if and only if there
is a real number b and a k ∈ N+ such that, for any n ∈ N+ , if n ≥ k then an = b.
Using quantifiers and connectives, this condition reads:

(an ) is eventually constant ⇔ (∃b ∈ R)(∃k ∈ N+ )(∀n ∈ N+ )[n ≥ k ⇒ an = b].

(c) A sequence $(a_n)$ is an arithmetic sequence if there is a d ∈ Q such that
$a_{n+1} - a_n = d$ for all n ∈ N+.

(d) A sequence $(a_n)$ is a geometric sequence if there is an r ∈ Q such that
$a_{n+1} = r a_n$ for all n ∈ N+.

Example 2.2.8

(a) If $a_n = \cos 2n\pi$ for every n ∈ N+, then $(a_n)$ is a constant sequence, since $a_n = 1$
for every n ∈ N+.

(b) If $b_n$ is the n-th digit in the decimal expansion of $\frac{41}{3}$, then the sequence $(b_n)$ is
eventually constant, since for n ≥ 3, $b_n = 6$.

Some ideas that you are familiar with from calculus are also useful in the study of
sequences. Increasing or decreasing sequences are very similar to increasing or
decreasing real-valued functions. Here are the definitions:

Definition 2.2.9 Let (an ) be a sequence of real numbers. We say that (an ) is

(a) increasing if and only if an ≤ an+1 for all n ∈ N+ ;



(b) strictly increasing if and only if an < an+1 for all n ∈ N+ ;

(c) decreasing if and only if an+1 ≤ an for all n ∈ N+ ;

(d) strictly decreasing if and only if an+1 < an for all n ∈ N+ ;

(e) monotone if and only if (an ) is either increasing or decreasing;

(f ) strictly monotone if and only if (an ) is either strictly increasing or strictly
decreasing.

Warning: Note that the property of being increasing or decreasing is one that a
sequence either has, or does not have. A sequence other than a constant sequence
cannot be both increasing and decreasing, and no sequence can be both strictly
increasing and strictly decreasing. If the terms of a sequence increase up to a point
and then start decreasing (or the other way round), it is not monotone.
To prove that a sequence (an ) is increasing, you have to prove that an+1 ≥ an for
every n ∈ N+ . To prove that it is not increasing, it is enough to find one n ∈ N+
such that an+1 < an . Similar comments apply to decreasing sequences.

Example 2.2.10

(a) The sequence $(\frac{1}{n})$ is strictly decreasing, because $n < n + 1 \Rightarrow \frac{1}{n} > \frac{1}{n+1}$ for all
n ∈ N+.

(b) The sequence (n!) is strictly increasing, because (n + 1)! = (n + 1)n! > n! for all
n ∈ N+.

(c) The eventually constant sequence (5, 4, 3, 2, 1, 1, 1, . . .) is decreasing, but not
strictly decreasing.

(d) The sequence $((-1)^{n+1} n) = (1, -2, 3, -4, 5, -6, \ldots)$ is neither increasing nor
decreasing, so it is not monotone.
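
The four sequences of Example 2.2.10 can be explored numerically. The Python sketch below checks the defining inequalities on a finite initial part of each sequence; the helper names and the prefix length `N` are assumptions made for this illustration. Keep in mind what such a check can and cannot do: one failing pair of consecutive terms does show that a sequence is not increasing (or not decreasing), but passing the check on finitely many terms never proves monotonicity.

```python
from fractions import Fraction
from math import factorial

def prefix_is_increasing(terms, strict=False):
    """Check a_n <= a_{n+1} (or a_n < a_{n+1} if strict) along a finite list."""
    return all((a < b) if strict else (a <= b) for a, b in zip(terms, terms[1:]))

def prefix_is_decreasing(terms, strict=False):
    """Check a_{n+1} <= a_n (or a_{n+1} < a_n if strict) along a finite list."""
    return all((b < a) if strict else (b <= a) for a, b in zip(terms, terms[1:]))

N = 20  # arbitrary number of terms to inspect
reciprocals = [Fraction(1, n) for n in range(1, N + 1)]       # (a)
factorials = [factorial(n) for n in range(1, N + 1)]          # (b)
eventually_one = [5, 4, 3, 2, 1] + [1] * (N - 5)              # (c)
alternating = [(-1) ** (n + 1) * n for n in range(1, N + 1)]  # (d)

print(prefix_is_decreasing(reciprocals, strict=True))     # True
print(prefix_is_decreasing(eventually_one, strict=True))  # False
```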

Remark: A sequence $(a_n)$ is increasing if and only if $a_{n+1} - a_n \ge 0$ for all n ∈ N+.
If $a_n > 0$ for all n ∈ N+, $(a_n)$ is increasing if and only if $\frac{a_{n+1}}{a_n} \ge 1$ for all n ∈ N+.
Similar remarks, with the obvious changes, apply to strictly increasing, decreasing
and strictly decreasing sequences.
 
Example 2.2.11 We show that $(b_n) = \left(\frac{n}{n+1}\right)$ is strictly increasing. To do this,
we must show that $b_n < b_{n+1}$ for all n ∈ N+. This is the same as showing that
$b_{n+1} - b_n > 0$ for all n ∈ N+; so we show that instead. Now
$$b_{n+1} - b_n = \frac{n+1}{n+2} - \frac{n}{n+1} = \frac{(n+1)^2 - n(n+2)}{(n+2)(n+1)} = \frac{1}{(n+2)(n+1)} > 0$$
for all n ∈ N+, as required.
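
If you distrust your algebra in a computation like the one in Example 2.2.11, it is easy to test the resulting formula on many values of n. The Python sketch below does this with exact fractions. The names `b`, `difference_formula_holds` and `checked_up_to` are chosen for this illustration, and agreement on finitely many n is a sanity check rather than a proof.

```python
from fractions import Fraction

def b(n):
    """The n-th term b_n = n / (n + 1)."""
    return Fraction(n, n + 1)

def difference_formula_holds(n):
    """Compare b_{n+1} - b_n with the closed form 1 / ((n + 2)(n + 1))."""
    return b(n + 1) - b(n) == Fraction(1, (n + 2) * (n + 1))

checked_up_to = 200  # arbitrary cut-off
identity_holds = all(difference_formula_holds(n) for n in range(1, checked_up_to + 1))
print(identity_holds)  # True
```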

It is possible to prove things about recursively defined sequences, even though
we do not have an explicit formula for the n-th term. Such proofs usually involve
induction. Make sure that you know how to do a proof by induction.

Example 2.2.12 The sequence $(u_n)$ defined by $u_1 = 1$, $u_2 = 1$, $u_{n+2} = u_n + u_{n+1}$
for all n ∈ N+, is increasing. Listing the first few terms may make this look quite
obvious, but if we want to be sure that nothing goes wrong beyond these terms, we
have to give a proof using the definition. As a first step, prove by induction that
$u_n > 0$ for all n ∈ N+. (Do this; make sure that you get the inductive assumption
correct.)
Now $u_2 \ge u_1$ and $u_{n+2} > u_{n+1}$ for all n ∈ N+ (because $u_{n+2} = u_{n+1} + u_n$, and
$u_n > 0$); so $u_{n+1} \ge u_n$ for all n ∈ N+. (In fact, $u_{n+1} > u_n$ for all n ≥ 2.)

Sequences that do not get out of hand, in the sense that their terms do not get
arbitrarily large positive or negative, are generally easier to work with and more
useful. We call such sequences bounded. The next definition makes this idea precise.

Definition 2.2.13 Let (an ) be a sequence of rational numbers.

(a) The sequence (an ) is bounded above if and only if there is a rational number
c such that an ≤ c for all n ∈ N+ , or symbolically

(an ) is bounded above ⇔ (∃c ∈ Q)(∀n ∈ N+ )(an ≤ c).

The number c is then called an upper bound of (an ).

(b) The sequence (an ) is bounded below if and only if there is a rational number
c such that an ≥ c for all n ∈ N+ , or symbolically,

(an ) is bounded below ⇔ (∃c ∈ Q)(∀n ∈ N+ )(an ≥ c).

The number c is then called a lower bound of (an ).



(c) The sequence (an ) is bounded if and only if it is both bounded above and bounded
below.

(d) The sequence (an ) is called unbounded if and only if it is not bounded.

Remarks

1. We can interpret boundedness in terms of the graph of a sequence:
Saying that a sequence is bounded above means that there is some horizontal
line such that the graph of the sequence lies below this line; for a sequence
that is bounded below the graph lies above a horizontal line. The graph of a
bounded sequence lies in the band between two horizontal lines.



[Figure 2.2: A graph of a bounded sequence (an ), with c ≤ an ≤ d.]

2. If a sequence has an upper bound c, then every real number larger than c is
also an upper bound for the sequence. This means that a sequence which is
bounded above has infinitely many upper bounds. Similarly, a sequence which
is bounded below has infinitely many lower bounds.

3. A sequence (an ) is not bounded above if it does not have an upper bound or,
equivalently, no real number is an upper bound. Symbolically this becomes

(an ) is not bounded above ⇔ (∀c ∈ R)(∃n ∈ N+ )(an > c).

As an exercise write out what it means for a sequence not to be bounded
below.
4. To show that a sequence is not bounded, it is enough to show that it is either
not bounded above, or not bounded below.

Example 2.2.14
(a) The sequence $\left(\frac{4+3n}{n}\right)$ is bounded. To see this, note that $\frac{4+3n}{n} = \frac{4}{n} + 3$.
Since n > 0, $\frac{4}{n} + 3 > 3$, so certainly $a_n = \frac{4}{n} + 3 \ge 3$ for all n ∈ N+. This shows that
$(a_n)$ is bounded below, and that 3 is a lower bound.
Since $\frac{1}{n} \le 1$ and $\frac{4}{n} \le 4$, we have $a_n = \frac{4}{n} + 3 \le 7$ for all n ∈ N+. It follows that $(a_n)$
is bounded above, and that 7 is an upper bound.
So $3 \le \frac{4}{n} + 3 \le 7$ for all n ∈ N+, and this sequence is bounded above and below; i.e.
it is a bounded sequence.

(b) The sequence $\left(\frac{(-1)^n}{n}\right) = (-1, \frac{1}{2}, -\frac{1}{3}, \frac{1}{4}, -\frac{1}{5}, \ldots)$ is bounded; −1 is a lower bound
and $\frac{1}{2}$ is an upper bound.

(c) We show that the sequence (an ) = (n − 10) = (−9, −8, −7, −6, . . .) is bounded
below, but not bounded above. For any n ∈ N+ , n ≥ 1, so n − 10 ≥ −9. So −9 is
a lower bound for the sequence (n − 10). To show that (an ) is not bounded above,
we must show that (∀c ∈ Q)(∃n ∈ N+ ) (an > c). So let c ∈ Q be given. Choose
n ∈ N+ so that n > c + 10. Then an = n − 10 > (c + 10) − 10 = c, as required. It
follows that (an ) is not bounded.

(d) In much the same way it can be shown that the sequence (−n − 10) is bounded
above, but is not bounded below, and hence not bounded.

(e) The sequence $(a_n) = ((-1)^{n+1} n) = (1, -2, 3, -4, \ldots)$ is not bounded above, nor
bounded below. We show that it is not bounded above. Let c ∈ Q be given. Let
n ∈ N+ be chosen so that n is odd and n > c. Then $a_n = (-1)^{n+1} n = n > c$, as
required.
If d ∈ Q and m ∈ N+ is chosen to be even and such that m > −d, then
$a_m = (-1)^{m+1} m = -m < d$, and so $(a_n)$ is also not bounded below.

(f) In the last example we show how induction can be used to prove that a recursively
defined sequence is bounded. Consider the sequence $(c_n)$ defined by
$$c_1 = 2, \qquad c_{n+1} = \frac{1}{2}\left(c_n + \frac{2}{c_n}\right) \text{ for all } n \in \mathbb{N}^+$$
(see Example 2.2.4). We show that $1 \le c_n \le 2$ for all n ∈ N+.
Certainly $1 \le c_1 = 2 \le 2$.
Now suppose that $1 \le c_k \le 2$ for some k ∈ N+. We show that $1 \le c_{k+1} \le 2$. Now
$$\begin{aligned}
1 \le c_k \le 2 &\Rightarrow \frac{1}{2} \le \frac{1}{c_k} \le 1 \\
&\Rightarrow 1 \le \frac{2}{c_k} \le 2 \\
&\Rightarrow 2 \le c_k + \frac{2}{c_k} \le 4 \\
&\Rightarrow 1 \le \frac{1}{2}\left(c_k + \frac{2}{c_k}\right) \le 2 \\
&\Rightarrow 1 \le c_{k+1} \le 2.
\end{aligned}$$

By induction, it follows that $1 \le c_n \le 2$ for all n ∈ N+.
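
The inductive step above is a chain of inequalities, and each link can be tested numerically on the terms of the sequence. The Python sketch below does this for the first few terms, using exact fractions. The names `c_terms` and `chain_holds` and the number of terms are assumptions made for this illustration; the induction argument remains the actual proof.

```python
from fractions import Fraction

def c_terms(count):
    """Return [c_1, ..., c_count] (count >= 1) with c_1 = 2 and
    c_{n+1} = (c_n + 2/c_n) / 2."""
    terms = [Fraction(2)]
    while len(terms) < count:
        c = terms[-1]
        terms.append((c + 2 / c) / 2)
    return terms

def chain_holds(c):
    """Check the intermediate inequalities of the inductive step for one term c."""
    return (
        Fraction(1, 2) <= 1 / c <= 1
        and 1 <= 2 / c <= 2
        and 2 <= c + 2 / c <= 4
    )

terms = c_terms(8)
within_bounds = all(1 <= c <= 2 for c in terms)
steps_valid = all(chain_holds(c) for c in terms)
print(within_bounds, steps_valid)  # True True
```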

The next result is useful when you need to show that a sequence is bounded,
because it allows you to check one condition, instead of two (bounded above and
bounded below).

Proposition 2.2.15 A sequence (an ) is bounded if and only if there is a positive
rational number c such that |an | ≤ c for all n ∈ N+ . Equivalently,

(an ) is bounded ⇔ (∃c ∈ Q+ )(∀n ∈ N+ )(|an | ≤ c).

Proof: ( ⇒ ) : Suppose (an ) is bounded. Then there exist c1 , c2 ∈ Q so that for all
n ∈ N+ , c1 ≤ an ≤ c2 . Let c = maximum {|c1 |, |c2 |, 1}, so that c > 0. We want to show that, for all
n ∈ N+ , |an | ≤ c, i.e. −c ≤ an ≤ c. Now, for any n ∈ N+ :

an ≤ c2 ≤ |c2 | ≤ c and an ≥ c1 ⇒ −an ≤ −c1 ≤ |c1 | ≤ c ⇒ an ≥ −c.

Putting these together gives −c ≤ an ≤ c, as required.


(⇐) : Suppose there exists c ∈ Q+ so that, for all n ∈ N+ , |an | ≤ c. Then
−c ≤ an ≤ c for all n ∈ N+ , so (an ) is bounded above (by c) and bounded below
(by −c).

Summary:
In this section we gave a precise definition of an infinite rational sequence and
introduced the notation we’ll use for such sequences. We looked at different ways of
defining sequences. Special types of sequences were introduced:

• constant and eventually constant sequences;

• arithmetic and geometric sequences;

• increasing and decreasing sequences;

• monotone sequences;

• sequences that are bounded above, bounded below and upper and lower bounds
for sequences;

• bounded and unbounded sequences.

Exercises

1. Classify each of the following sequences as increasing, decreasing or not monotone.
Give a proof for your answer in each case.

(a) $(a_n) = (9 - n^2)$
(b) $(b_n) = \left(\frac{2n+2}{3n+6}\right)$
(c) $(c_n) = \left(\frac{n}{n+3}\right)$
(d) $(d_n) = \left(\frac{n}{\frac{3}{2} - n}\right)$

2. For each of the sequences in Question 1(a) and (b) say whether it is (a) bounded
above (b) bounded below (c) bounded (d) unbounded. Give proofs for your
answers.

3. The sequence $(a_n)$ is defined by:
$$a_1 = 4, \qquad a_{n+1} = 3 - \frac{2}{a_n} \text{ for all } n \in \mathbb{N}^+.$$
(a) Calculate the first three terms of $(a_n)$.
(b) Show that $2 < a_n \le 4$ for all n ∈ N+, by using induction. [Hint: Check
that $2 < a_1 \le 4$. Then suppose that $2 < a_n \le 4$ for some n ∈ N+ and
prove that $2 < a_{n+1} \le 4$.]
(c) Prove that $(a_n)$ is strictly decreasing.

4. Define the sequence $(a_n)$ by $a_1 = \alpha$ and $a_{n+1} = a_n^2 - 2a_n + 2$ for all n ∈ N+.
Suppose that 1 < α < 2. Show that:

(a) 1 < an < 2 for all n ∈ N+ . (Hint: Complete the square.)


(b) (an ) is decreasing.

Is (an ) bounded?

5. Define $(b_n)$ by $b_1 = \beta$ and $b_{n+1} = \frac{2b_n}{b_n + 1}$ for all n ∈ N+.

(a) Show that, if 0 < β < 1, then (bn ) is increasing. Is (bn ) bounded in this
case? Give a proof for your answer.
(b) Show that, if β > 1, then (bn ) is decreasing. Is (bn ) bounded in this case?
Give a proof for your answer.
(c) What happens if β = 0? And if β = 1?

2.3 Least upper and greatest lower bounds


We are still looking for a property that the set of real numbers has that is not shared
by the set of rational numbers. At the moment our best picture of the set of real
numbers is still the number line. As we saw in Chapter 1, there is a point on this
number line at a distance x from 0, where $x^2 = 2$, and this length x does not
correspond to a rational number. We now try to approximate this number x by
rational numbers.
Let’s start the approximation process by trying to approximate x by an integer.
If we try $a_1 = 1$, we get $a_1^2 = 1 < 2 = x^2$, and so $a_1 < x$. On the other hand $b_1 = 2$
gives $b_1^2 = 4 > 2 = x^2$, so $x < b_1$. Our first attempt yields the approximations $a_1$
and $b_1$, with $a_1 < x < b_1$ and $b_1 - a_1 = 1$.
We next try to get approximations $a_2$, $b_2$ such that $a_2 < x < b_2$ and $b_2 - a_2 = 0.1$.
A bit of experimentation shows that $1.4^2 < 2 < 1.5^2$, so we can take $a_2 = 1.4$ and
$b_2 = 1.5$. If we continue in the same way, we find that $1.41^2 < 2 < 1.42^2$, so that we
can take $a_3 = 1.41$, $b_3 = 1.42$ and get $b_3 - a_3 = 0.01$. It should now be clear that we
can construct two sequences $(a_n)$ and $(b_n)$ of rational numbers such that:

• $(a_n)$ is increasing and $(b_n)$ is decreasing;

• $b_n - a_n \le 10^{-(n-1)}$ for every n ∈ N+;

• $a_n < x < b_n$ for every n ∈ N+ (where $x^2 = 2$).
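
The “bit of experimentation” used to find the approximations can be automated. The Python sketch below is one possible way to do it, using exact fractions so that every comparison with 2 is reliable. The function name `bracket_sqrt2` is an assumption made for this illustration. At each stage it moves the lower approximation up in steps of the current interval length for as long as the square stays below 2, and then divides the step by 10.

```python
from fractions import Fraction

def bracket_sqrt2(count):
    """Return (lower, upper): lists of rationals a_n, b_n with a_n^2 < 2 < b_n^2
    and b_n - a_n = 10**-(n-1), for n = 1, ..., count."""
    lower, upper = [], []
    a = Fraction(1)     # a_1 = 1, since 1^2 < 2
    step = Fraction(1)  # current interval length b_n - a_n
    for _ in range(count):
        # Move a up by `step` while the next candidate still squares below 2.
        while (a + step) ** 2 < 2:
            a += step
        lower.append(a)
        upper.append(a + step)
        step /= 10
    return lower, upper

lower, upper = bracket_sqrt2(5)
print([str(a) for a in lower])  # ['1', '7/5', '141/100', '707/500', '7071/5000']
```

The printed fractions are 1, 1.4, 1.41, 1.414 and 1.4142 in lowest terms. Note that the code never needs the number x itself: it only ever compares squares of rational numbers with 2.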



What we have constructed is a nested sequence of intervals with rational endpoints
on the real line such that every one of the intervals contains the point at a
distance x from 0 (where $x^2 = 2$). Saying that the sequence of intervals is nested
means that every interval in the sequence is contained in the previous one. We can
find an interval containing x of length as small as we like by taking n large enough.
The sequence $(a_n)$ certainly is bounded above (by $b_1 = 2$, for example). In fact,
for every m ∈ N+, $b_m$ is an upper bound for the sequence $(a_n)$. To see this note
that $a_m < b_m$, and for k < m, $a_k \le a_m < b_m$ (since $(a_n)$ is increasing). For k > m,
$a_k < b_k \le b_m$ (since $(b_n)$ is decreasing). Hence $b_m$ is an upper bound for $(a_n)$, and
this holds for all m. Since $(b_n)$ is decreasing, it is natural to ask whether amongst
all the upper bounds for $(a_n)$ there is a smallest one. The number corresponding to
the distance x such that $x^2 = 2$ is an obvious choice, since by construction $a_n < x$
for all n (so that x is an upper bound), and $x < b_n$ for all n (so that x is smaller than
all the upper bounds $b_n$). But we know that x does not correspond to a rational
number!
Before we investigate this in more depth, let’s get the appropriate definitions out
of the way.

Definition 2.3.1 Let (an ) be a sequence.

(a) If $(a_n)$ is bounded above, then the smallest of the upper bounds of $(a_n)$ is called
the supremum, or least upper bound (l.u.b.) of $(a_n)$ and denoted by $\sup_n a_n$, or
$\sup\{a_n : n \in \mathbb{N}^+\}$.

(b) If $(a_n)$ is bounded below, then the largest of the lower bounds of $(a_n)$ is called
the infimum, or greatest lower bound (g.l.b.) of $(a_n)$ and denoted by $\inf_n a_n$, or
$\inf\{a_n : n \in \mathbb{N}^+\}$.

The definition does not say anything about the existence of the supremum or
the infimum of a sequence, and the example we started with suggests that this may
not be entirely unproblematic.
If you want to impress the snobs, note that supremum and infimum are Latin
words, and their plurals are respectively suprema and infima (rather than supre-
mums and infimums).
The following proposition gives useful characterisations of the supremum and
infimum.

Proposition 2.3.2 Let (an ) be a sequence.

(a) If $(a_n)$ is bounded above, the following are equivalent:

(i) $b = \sup_n a_n$;
(ii) $a_n \le b$ for every n ∈ N+ (i.e. b is an upper bound) and if $a_n \le c$ for every
n ∈ N+, then b ≤ c (i.e. b is less than or equal to any other upper bound);
(iii) $a_n \le b$ for every n ∈ N+ (i.e. b is an upper bound) and if d < b, then there
is an n ∈ N+ such that $a_n > d$ (i.e. if d is smaller than b, it is not an upper
bound of $(a_n)$).

(b) If $(a_n)$ is bounded below, the following are equivalent:

(i) $b = \inf_n a_n$;
(ii) $a_n \ge b$ for every n ∈ N+ (i.e. b is a lower bound) and if $a_n \ge c$ for every
n ∈ N+, then b ≥ c (i.e. b is greater than or equal to any other lower bound);
(iii) $a_n \ge b$ for every n ∈ N+ (i.e. b is a lower bound) and if d > b, then there
is an n ∈ N+ such that $a_n < d$ (i.e. if d is greater than b, it is not a lower
bound of $(a_n)$).

Proof: For (a), (ii) simply says precisely what it means for b to be the smallest upper
bound, so that (i) and (ii) are equivalent. To see that (ii) and (iii) are equivalent,
note that the second part of (iii) is simply the contrapositive of the second part of
(ii). The proofs for (b) are similar.

It is time for some examples. First we prove two simple but useful results about
the rational numbers.

Lemma 2.3.3

(a) For every x ∈ Q+, there is an n ∈ N+ such that n > x.

(b) If y ∈ Q and $0 \le y < \frac{1}{n}$ for every n ∈ N+, then y = 0.

Proof: (a) We can write $x = \frac{p}{q}$, for p, q ∈ N+. We want an n such that $n > x = \frac{p}{q}$,
or nq > p. If we choose n = 2p then nq = 2pq ≥ 2p > p, since q ≥ 1.
(b) Suppose $0 \le y < \frac{1}{n}$ for every n ∈ N+ but y > 0. Then $\frac{1}{y} \in \mathbb{Q}^+$ and so it follows from (a) that we
can find an n ∈ N+ such that $n > \frac{1}{y}$, or $y > \frac{1}{n}$, contradicting our assumption. Hence
y = 0.
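
The proof of Lemma 2.3.3 is constructive: it tells us exactly which n to take. The Python sketch below follows the proof literally. The function names `integer_above` and `refuting_index` are made up for this illustration, and `fractions.Fraction` plays the role of Q (it stores p/q in lowest terms, which is harmless, since the choice n = 2p works for any way of writing x as p/q).

```python
from fractions import Fraction

def integer_above(x):
    """Part (a): for a positive rational x = p/q, return n = 2p, so that n > x."""
    if x <= 0:
        raise ValueError("x must be a positive rational number")
    return 2 * x.numerator

def refuting_index(y):
    """Part (b): for a rational y > 0, return an n in N+ with 1/n < y.

    Such an n shows that y cannot satisfy 0 <= y < 1/n for every n.
    """
    return integer_above(1 / y)

n = integer_above(Fraction(7, 3))
print(n, n > Fraction(7, 3))  # 14 True

m = refuting_index(Fraction(1, 1000))
print(m, Fraction(1, m) < Fraction(1, 1000))  # 2000 True
```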

Example 2.3.4

(a) The sequence ( n1 ) is bounded above and below.


Since it is a decreasing sequence, its first term 1 is an upper bound, and since it is a
term of the sequence, there can be no smaller upper bound. Therefore supn n1 = 1.
Since all the terms of the sequence are positive, 0 is a lower bound. It follows from
Lemma 2.3.3 (b) that it is also the greatest lower bound, so inf n n1 = 0. Note in this
case the supremum of the sequence is a term of the sequence, but not the infimum.
(b) Let a_n = n/(n+1).
Since a_n = ((n+1) − 1)/(n+1) = 1 − 1/(n+1), it follows that (a_n) is a monotone increasing sequence
and a_1 = 1/2 is a lower bound. Since 1/2 is one of the terms of the sequence, there can be
no larger lower bound, so inf_n a_n = 1/2.
Since clearly a_n = 1 − 1/(n+1) ≤ 1, 1 is an upper bound. Now suppose b is an upper
bound; then 1 − 1/(n+1) ≤ b, or 1 − b ≤ 1/(n+1), for every n. If b ≤ 1, it follows from
Lemma 2.3.3 (b) that 1 − b = 0 and so b = 1, showing that 1 is the smallest upper
bound, or sup_n a_n = 1. Note that in this case inf_n a_n is a term of the sequence, but
not sup_n a_n .

(c) Let a_n = (−1)^n n/(n+1) + 1.
Then (a_n) is not a monotone sequence. Since
0 < n/(n+1) < 1, we have −1 < (−1)^n n/(n+1) < 1 and so 0 < a_n = (−1)^n n/(n+1) + 1 < 2
for all n ∈ N+ . It follows that 0 is a lower bound and 2 an upper bound of (a_n).
We show that 0 is the greatest lower bound. Suppose c is a lower bound and 0 ≤ c.
Then 0 ≤ c ≤ (−1)^n n/(n+1) + 1 for all n. In particular, whenever n = 2k − 1, we have
0 ≤ c ≤ 1/(2k), and this holds for every k ∈ N+ . It follows from Lemma 2.3.3 (b) that
c = 0, so no lower bound can be larger than 0. It follows that inf_n a_n = 0; a similar
argument shows that sup_n a_n = 2. Note that neither the supremum nor the infimum
is a term of the sequence.
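
The three sequences of Example 2.3.4 can be tabulated with exact rational arithmetic. The sketch below is an illustration only: the function names `a`, `b`, `c` and the cut-off `N = 1000` are choices made here, and inspecting finitely many terms suggests the suprema and infima found above but cannot prove them.

```python
from fractions import Fraction


def a(n):  # Example (a): 1/n
    return Fraction(1, n)


def b(n):  # Example (b): n/(n+1)
    return Fraction(n, n + 1)


def c(n):  # Example (c): (-1)^n n/(n+1) + 1
    return Fraction((-1) ** n * n, n + 1) + 1


N = 1000  # number of terms inspected (an arbitrary cut-off)
for name, seq in (("1/n", a), ("n/(n+1)", b), ("(-1)^n n/(n+1) + 1", c)):
    terms = [seq(n) for n in range(1, N + 1)]
    print(f"{name}: smallest = {float(min(terms)):.6f}, largest = {float(max(terms)):.6f}")

# The odd-numbered terms of (c) are exactly 1/(2k), as used in the text.
assert all(c(2 * k - 1) == Fraction(1, 2 * k) for k in range(1, 500))
```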

It is important to distinguish between the notions maximum, minimum, supremum
and infimum. We start with the definitions of the first two concepts (the
definitions of the other two we’ve had already).

Definition 2.3.5 Let (an ) be a sequence.

(a) If the sequence (an ) has a largest term (i.e. if there is a p ∈ N+ such that
an ≤ ap for every n ∈ N+ ), then we call ap the maximum of the sequence and
denote it by maxn an .

(b) If the sequence (an ) has a smallest term (i.e. if there is a p ∈ N+ such that
an ≥ ap for every n ∈ N+ ), then we call ap the minimum of the sequence and
denote it by minn an .

Warning: It is important to understand the relationship between a maximum and
a supremum, and a minimum and an infimum, of a sequence.

• The maximum of a sequence, if it exists, is always an upper bound, and must
therefore be the smallest upper bound, so that in this case sup_n a_n = max_n a_n .
As the second example above shows, a sequence which is bounded above need
not have a maximum (but it always has a supremum).

• The minimum of a sequence, if it exists, is always a lower bound, and must
therefore be the greatest lower bound, so that in this case inf_n a_n = min_n a_n .
As the first example above shows, a sequence which is bounded below need not
have a minimum (but it always has an infimum).

• If the range of the sequence (an ) is finite, that is if the set {an : n ∈ N+ } is
finite (this will be the case, for example, if the sequence is eventually constant),
then both maxn an and minn an exist.

A sequence can only have a least upper bound if it has at least one upper bound,
that is if it is bounded above. But does every sequence that is bounded above have
a supremum? The answer is: “It depends where you want it to be.”
To clarify this rather enigmatic answer, we need to return to the example at the
beginning of the section. There we constructed two sequences of rational numbers,
(a_n) and (b_n) such that (a_n) is increasing, (b_n) decreasing, b_n − a_n = 10^{−(n−1)} and
a_n < x < b_n for every n ∈ N+ , where x^2 = 2. It follows from this that x is an upper
bound for (a_n) and a lower bound for (b_n). It also follows that for a number y to
be the least upper bound of (a_n), it will have to satisfy a_n ≤ y ≤ b_n for every n ∈ N+ (since for every
m ∈ N+ , b_m is an upper bound of the sequence (a_n)).
Suppose now that there is a rational number y such that a_n ≤ y < x ≤ b_n for
every n (that is, y is a rational upper bound for (a_n), and smaller than x). Then
a_n^2 ≤ y^2 < x^2 = 2 < b_n^2, hence

0 < 2 − y^2 < b_n^2 − a_n^2 = (b_n − a_n)(b_n + a_n) ≤ 10^{−n+1} × 4 ≤ 4/n.

Thus 0 < (2 − y^2)/4 < 1/n for every n ∈ N+ .
This contradicts Lemma 2.3.3 (b), and therefore there can be no such y. A similar
argument shows that (b_n) cannot have a rational lower bound greater than x.
We have therefore found examples of

• a sequence of rational numbers which is bounded above but does not have a
rational supremum;

• a sequence of rational numbers which is bounded below, but does not have a
rational infimum.

At the same time we have shown that if we allow a number which corresponds to a
point on the real line, the sequence (an ) has a supremum and the sequence (bn ) has
an infimum. We will define the real numbers in such a way that it is this property
(the existence of a supremum for every sequence which is bounded above) that will
distinguish the real numbers from the rational numbers.
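
The two rational sequences in this argument can be generated exactly. The sketch below assumes that (a_n) and (b_n) are the decimal truncations of x from below and above (the construction itself is given at the beginning of the section, so this is an assumption made here), and the function names `a` and `b` are illustrative.

```python
from fractions import Fraction
from math import isqrt


def a(n: int) -> Fraction:
    """Largest multiple of 10^-(n-1) whose square is less than 2."""
    scale = 10 ** (n - 1)
    return Fraction(isqrt(2 * scale * scale), scale)


def b(n: int) -> Fraction:
    """a_n plus one unit in the last decimal place, so b_n - a_n = 10^-(n-1)."""
    return a(n) + Fraction(1, 10 ** (n - 1))


for n in range(1, 8):
    an, bn = a(n), b(n)
    assert an * an < 2 < bn * bn                # a_n < x < b_n, where x^2 = 2
    assert bn * bn - an * an <= Fraction(4, n)  # the estimate used in the text
    print(n, float(an), float(bn))
```

Every a_n and b_n here is rational, yet no rational number can sit between all of them, which is the gap in Q that the argument above exposes.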

Summary:
In this section we introduced the least upper bound, or supremum, of a sequence
that is bounded above, and the greatest lower bound, or infimum, of a sequence
that is bounded below. An example was given to show that a rational sequence
which is bounded above need not have a rational supremum. A sequence that is
bounded above need not have a maximum, but if it has, it equals the supremum of
the sequence. Similar comments apply to a minimum.

Exercises

1. For each of the following sequences (xn ), say whether supn xn , maxn xn , inf n xn
and minn xn exist, and where it does exist, find it.

(a) xn = (−1)n ;
(b) xn = 3−n ;
(c) xn = (−2)n ;
5n − 9
(d) xn = ;
3n
(e) xn = (2 + (−1)n )n + n1 .

2. Give proofs for your answers in 1(d).

3. Let (an ) and (bn ) be bounded sequences such that for every n, m ∈ N+ ,
an ≤ bm . Show that supn an ≤ inf n bn .

2.4 Real numbers at last


We are now ready to define the real numbers. Since we would like the real numbers
to have all the properties of the rational numbers, we need the real numbers to form
an ordered field. To this requirement we are now going to add one further axiom.
It is important to keep in mind that when we say “the rational numbers” we
have in mind more than simply a set (of numbers). We are also assuming that
there are operations of addition and multiplication defined on the set, which turns
it into a field, and an order relation, with which it becomes an ordered field. Strictly
speaking, when we think of the set of rational numbers as an ordered field we should
use notation like (Q, +, ×, <) instead of just Q. But since it is usually clear from the
context whether we are simply thinking of the set Q or the ordered field (Q, +, ×, <),
we’ll simply use Q in both cases. In the same way we’ll now use the symbol R for
the set of real numbers as well as the ordered field (R, +, ×, <).
We have already defined what we mean by an infinite rational sequence: it is
simply a function a : N+ → Q. In the same way an infinite real sequence (or an
infinite sequence in R) will mean a function a : N+ → R. The definitions and results
we have had before for increasing, decreasing and bounded sequences of rational
numbers carry over to sequences of real numbers, and we’ll make use of this in what
follows. As was the case with rational sequences, we’ll assume that all sequences
are infinite, and “real sequence” will in future mean “infinite real sequence”.

Definition 2.4.1 The real numbers is a set R together with

• an operation of addition (a function which assigns to (x, y) ∈ R × R the
element x + y ∈ R);

• an operation of multiplication (a function which assigns to (x, y) ∈ R × R the
element x × y = xy ∈ R);

• an order relation (<)

such that (R, +, ×, <) is an ordered field, which also satisfies the axiom:

• [LUB] Every sequence in R which is bounded above has a supremum in R.

You may wonder why the axiom LUB is phrased in terms of a supremum only.
What about infima? Here is the reason.

Proposition 2.4.2 Let (an ) be a real sequence.

(a) If (an ) is bounded below, then the sequence (−an ) is bounded above and
inf n an = − supn (−an ).

(b) If (an ) is bounded above, then the sequence (−an ) is bounded below and
supn an = − inf n (−an ).

Proof: (a) Since (an ) is bounded below, there is a b ∈ R such that an ≥ b for all
n ∈ N+ . Hence −an ≤ −b for all n ∈ N+ , showing that (−an ) is bounded above.
By axiom LUB, c = supn (−an ) ∈ R, and so c ≥ −an , and hence an ≥ −c, for all
n ∈ N+ . This shows that −c is a lower bound for (an ). Suppose d is any lower
bound for (an ). Then an ≥ d, and hence also −an ≤ −d, for all n ∈ N+ . This means
that −d is an upper bound for (−an ), and since c is the least upper bound, we must
have c ≤ −d, or −c ≥ d. It follows that −c is the greatest lower bound of (an ), and
so inf n an = −c = − supn (−an ).
(b) The proof is similar.

Corollary 2.4.3 Every sequence in R that is bounded below has an infimum in R.

There are a number of important and uncomfortable questions we can ask about
the definition of the real numbers above.

1. Is there a set R with operations of addition and multiplication and an order
relation defined on it such that all the properties in the definition are satisfied
(or, put more bluntly, does the set of real numbers, as defined, exist)?

2. If the answer to the previous question is “yes”, is there only one such set (i.e.,
is R unique)?

3. Can we say that R contains Q?



Fortunately, the answer to each one of the questions is in the affirmative. The proofs
are lengthy and not easy, and if we were to do them properly it would take up a
large part of this course. Since there are many other interesting things to discover
about real numbers, we’ll only make some comments about the proofs and leave it
at that.
That there is a set satisfying the properties in the definition can be shown by
constructing such a set from the rational numbers, using Q and its properties. There
are several ways to do this, all in some sense using the idea that a real number should
be thought of as a number that can be “approximated” by rational numbers. The
problem is to do it in such a way that a sequence of approximations can be thought
of as a number in its own right, and to define addition, multiplication and the
order relation for such “numbers”. If you are not satisfied with starting with the
rational numbers, you could go all the way back to the natural numbers, define them
axiomatically and then construct first the integers and then the rational numbers.
The proof that, once we have constructed an example of a set with operations
satisfying the properties of the definition, all other examples must essentially be the
same (in much the same way that two isomorphic vector spaces are essentially the
same, as vector spaces) is beyond the scope of this course.
To see that we can think of Q as a subset of R, we think of the additive identity
of R as the natural number 0, and the multiplicative identity as the number 1. Since
R is closed under addition, it must contain 1 + 1, 1 + 1 + 1, . . ., and these we can
identify with the numbers 2, 3, . . .. The additive inverse of each of these numbers
must also be in R, and we can identify them with the numbers −1, −2, −3, . . .. This
shows that Z can be regarded as a subset of R. Since each q ∈ N+ can now be
regarded as an element of R, its multiplicative inverse q^{−1} must also be in R. We
can now identify the rational number p/q (with p ∈ Z, q ∈ N+ ) with the element
pq^{−1} of R. This allows us to think of Q as a subset of R. This is not quite the end
of the story, since we also need to check that the usual addition, multiplication and
order relation in Q are compatible with this identification of Q with a subset of
R. This can be done, but we won’t do it here. If you want a bit of a conceptual
challenge, you can try to work out what it is that has to be proved.
From now on we’ll assume the existence and uniqueness of the real numbers in
the sense explained above, and also that we may regard every rational number as a
real number. In the rest of this section we start exploring the consequences of
the definition.
A bit of a reassurance: we have seen that we can identify the point on the real
line at a distance x from 0, where x^2 = 2, with the supremum of a sequence of
rational numbers, and by our definition of R, it must therefore be a real number.
From now on we’ll denote it by √2, as usual.
The two properties of Q listed in Lemma 2.3.3 also hold in R, but we need the
axiom LUB in the proof.

Proposition 2.4.4 (Archimedean property of R).

(a) If x ∈ R and x > 0, there is an n ∈ N+ such that n > x.

(b) If x ∈ R and 0 ≤ x < 1/n for every n ∈ N+ , then x = 0.

Proof: (a) The proof is by contradiction. Suppose that n ≤ x for every n ∈ N+ .
Then the sequence (n) is bounded above in R, and by axiom LUB must have a
supremum. Let y = supn n. Then y − 1 ∈ R, but since y − 1 < y, y − 1 is not an
upper bound for (n) and therefore there must be an m ∈ N+ such that m > y − 1,
and hence m + 1 > y. Since m + 1 ∈ N+ , this contradicts the fact that y is an upper
bound for (n).
(b) The proof is the same as that for Lemma 2.3.3 (b).
The set R has a very useful property, called the Nested Interval Property
(NIP). As you will see, the proof that this is the case depends on the axiom LUB in
the definition of R. We’ll see later that the NIP is in fact equivalent to the axiom
LUB.

Theorem 2.4.5 Let (an ) and (bn ) be two sequences of real numbers such that
(a) (an ) is increasing and (bn ) is decreasing;
(b) an < bn for every n ∈ N+ ;
(c) for every c > 0 in R, there is an n ∈ N+ such that bn − an < c.
Then there is a unique x ∈ R such that an ≤ x ≤ bn for every n ∈ N+ .

Proof: The sequence (a_n) is bounded above (by b_1) and therefore x = sup_n a_n exists
and is a real number, and a_n ≤ x for every n. As we have shown before, every term
of the sequence (b_n) is an upper bound for (a_n), and therefore x ≤ b_n for every
n ∈ N+ . This shows that a_n ≤ x ≤ b_n for every n. Suppose that there is also a y ∈ R
such that a_n ≤ y ≤ b_n for every n; without loss of generality we may assume that
y ≥ x. Then 0 ≤ y − x ≤ b_n − a_n for every n. For every m ∈ N+ , there is, by
assumption (c), an n ∈ N+ such that b_n − a_n < 1/m. From this we have that for every
m ∈ N+ , 0 ≤ y − x < 1/m, and so by Proposition 2.4.4 (b) y = x.
The Nested Interval Property gives us a way of seeing how we can identify points
on the real line with real numbers as we have defined them here. Given a point on

the real line, we first find an interval of length 1 with integer endpoints containing
the point. Next we divide this interval into two intervals with equal length and
choose the one which contains the point. Continuing with this method of bisecting
intervals we can find a nested sequence of intervals all of which contain the point
we started with and such that the length of the n-th interval is 2^{−(n−1)} . It follows
from Proposition 2.4.4(a) that for every c > 0 we can find an n ∈ N+ such that
2^{−n} < c (check this). The intervals therefore satisfy the conditions of the NIP, which
then guarantees that there is a unique real number common to all the intervals, and
by construction this number corresponds to the point we started with.
If, conversely, we start with an element x ∈ R, then we know that for any integer
n we must have x = n, x < n or x > n. We can use this property to find an integer
m such that m ≤ x ≤ m + 1. The number x therefore corresponds to a point in
the interval [m, m + 1] on the real line. Using the same bisection method as above
we can find a nested sequence of intervals, starting with [m, m + 1], all containing
x, and the point on the real line common to all these intervals then represents the
number x.
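
The bisection procedure described above can be sketched in Python. This is an illustration under stated assumptions, not part of the original notes: the names `bisection_intervals` and `sqrt2_at_most` are hypothetical, and the point on the line is represented by a test that answers "is the point ≤ t?" for rational t. For the point x > 0 with x^2 = 2, that test needs only rational arithmetic.

```python
from fractions import Fraction


def bisection_intervals(point_at_most, m, steps):
    """Nested intervals starting from [m, m + 1], halved `steps` times.

    `point_at_most(t)` must answer the question "is the point <= t?".
    """
    a, b = Fraction(m), Fraction(m + 1)
    intervals = [(a, b)]
    for _ in range(steps):
        mid = (a + b) / 2
        if point_at_most(mid):
            b = mid  # the point lies in the left half [a, mid]
        else:
            a = mid  # the point lies in the right half [mid, b]
        intervals.append((a, b))
    return intervals


def sqrt2_at_most(t):
    """The point x > 0 with x^2 = 2 satisfies x <= t exactly when t > 0 and t^2 >= 2."""
    return t > 0 and t * t >= 2


intervals = bisection_intervals(sqrt2_at_most, 1, 20)
for k, (a, b) in enumerate(intervals):
    assert b - a == Fraction(1, 2 ** k)  # the (k+1)-th interval has length 2^-k
print(float(intervals[-1][0]), float(intervals[-1][1]))
```

The endpoints form an increasing sequence (a_n) and a decreasing sequence (b_n) with b_n − a_n = 2^{−(n−1)}, which is the situation covered by Theorem 2.4.5.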

Summary:
In this section we defined the real numbers R as an ordered field which satisfies the
least upper bound axiom, and showed that this implies that R also satisfies a greatest
lower bound axiom and has the Archimedean and Nested Interval properties. Using
this we could set up a correspondence between the real numbers and points on the
number line.

Exercises
1. Prove that for every x ∈ R, with x > 0, there is an n ∈ N+ such that 1/n < x.
Deduce that if y is a real number such that 0 ≤ y < x for every positive real
number x, then y = 0.

2. Prove that if the real sequence (a_n) is bounded above, then the sequence (−a_n)
is bounded below and inf_n (−a_n) = − sup_n a_n .

3. In this question we give definitions for sets of real numbers that are very similar
to the ones we have already given for sequences.
Let A be a non-empty subset of R.

(a) A is bounded above if there is a c ∈ R such that a ≤ c for every a ∈ A.
If this is the case, we call c an upper bound for A. The smallest upper
bound of A is called the supremum of A, and denoted by sup A.
(b) A is bounded below if there is a b ∈ R such that a ≥ b for every a ∈ A.
If this is the case, we call b a lower bound for A. The greatest lower
bound of A is called the infimum of A, and denoted by inf A.
(c) A is bounded if it is both bounded above and bounded below.

We’ll see later (Theorem 3.5.12) that it can be shown that every non-empty
set of real numbers which is bounded above has a supremum, and every non-
empty set of real numbers which is bounded below has an infimum. You may
assume this for the moment.
For each of the following subsets of R, say whether it is bounded above, and
if it is, find its supremum. Also say whether it is bounded below, and if so,
find its infimum.

(a) {x ∈ R : x^2 < 2}.
(b) {x ∈ R : |x + 3| < 5}.
(c) {x ∈ R : x^{−1} > 2}.
(d) {x ∈ R : x^{−1} ≤ 2}.
Chapter 3

Convergent and divergent sequences

Two of the most fundamental concepts in analysis are those of convergence and
limits. In your calculus course you have certainly worked with limits a great deal;
in fact, the two basic notions in calculus, those of differentiation and integration,
depend for their definitions on the idea of a limit. In addition, the precise definition
of the continuity of a function at a point also makes use of a limit. In this section
we are going to look at limits in a slightly different context: that of sequences.
Looking at the examples and questions raised in the first chapter, you will realise
that satisfactory answers to the questions asked there will have to rely on being able
to say what we mean by the convergence of a sequence, and being able to determine
whether a sequence converges, and if so, to what. We make a start here to finding
the tools to deal with such problems.

3.1 Sequences converging to zero


We start our investigation of convergence by making precise what is meant by saying
a sequence of non-negative real numbers converges to 0. (We say a sequence (an ) is
non-negative if an ≥ 0 for every n ∈ N+ .) You probably already have quite a good
intuitive idea of what it means for a sequence (an ) of such numbers to converge
to 0. It may be something along the following lines: “(a_n) converges to 0 if we
can make the distance between a_n and 0 as small as we like by taking n large
enough.” Sequences like (1/n) and (2^{−n}) would certainly satisfy this criterion, and
would therefore qualify. But one could also argue that a sequence like

1, 1/2, 1, 1/4, 1, 1/8, 1, 1/16, 1, . . .

satisfies this criterion (we can get as close as we like to 0 using the even-numbered
terms), but we would not feel comfortable with saying that the sequence converges
to 0 (because the odd-numbered terms stay far from 0). To avoid this, we could
make our requirement a bit more stringent: “(a_n) converges to 0 if we can make the
distance between a_n and 0, and the distance between all subsequent terms and 0, as
small as we like by taking n large enough.”
A definition in words like this is not very useful when we want to prove things
about convergent sequences. We need a very precise definition. Such a precise
definition is given below. At first sight it is not pretty. It contains three quantifiers,
and a Greek letter (ε). You don’t have to use a Greek letter (any other letter will
do), but the quantifiers have to be there, and they have to be in the correct order.

Definition 3.1.1 A non-negative sequence (a_n) in R converges to 0 if and only
if for every ε ∈ R+ , there is an N ∈ N+ such that for every n ∈ N+ , if n ≥ N, then
0 ≤ a_n < ε.
More formally, this becomes

A non-negative sequence (a_n) in R converges to 0

⇔ (∀ε ∈ R+ )(∃N ∈ N+ )(∀n ∈ N+ )[n ≥ N ⇒ 0 ≤ a_n < ε].

If all the symbols are a bit overwhelming, you may find the following translation
into words less of a shock to the system: A non-negative sequence (an ) of real
numbers converges to 0 if and only if for every positive real distance, we can find a
term in the sequence such that this term and all subsequent terms are within this
distance from 0.
It is time for some examples.

Example 3.1.2 There would be something seriously wrong with our definition if
the sequence (a_n) = (1/n) did not converge to 0. So let us check that it does indeed
converge to 0 according to the definition. Informally, what we need to do is to find,
for every given distance ε ∈ R+ , a term a_N in the sequence such that a_N and all terms
after it are at a distance less than ε from 0. Since the sequence (1/n) is decreasing, it
will be enough to find an N ∈ N+ such that 1/N < ε, or equivalently, N > 1/ε. The
Archimedean property guarantees that we can find such an N.
This gives a good idea of how to give a formal proof. Let ε ∈ R+ . Then 1/ε ∈ R+ ,
and so it follows from Proposition 2.4.4 that there is an N ∈ N+ such that N > 1/ε,
and hence 1/N < ε. Then for every n ∈ N+ , if n ≥ N, then 0 ≤ 1/n ≤ 1/N < ε. This
shows that the sequence (1/n) converges to 0.

It is time to introduce the usual notation for convergence. If (a_n) is a sequence
converging to 0, we’ll write a_n → 0 as n → ∞, or simply a_n → 0; we sometimes say
that a_n tends to 0.
What is perhaps not so clear from the rather slick formal proof above is that
to satisfy the requirements of the definition (and so to show that (a_n) does indeed
converge to 0), it is necessary to produce, for every positive distance ε, a corresponding
positive integer N (which will usually depend on ε) such that for every
term from the N-th one onwards, the distance between the term and 0 is smaller
than the given ε. You may think of this as a game in which you are trying to convince
a doubter that the sequence does indeed converge to 0. The doubter comes up
with a positive ε; you have to find an N as required by the definition. If you can’t
supply such an N, your claim that the sequence converges to 0 will be dismissed.
Let’s illustrate that with an example.

Example 3.1.3 You claim that the sequence (n^{−1/2}) converges to 0.
Your opponent, the doubter, comes up with ε = 1/10 and challenges you to produce a
corresponding N.
You claim that N = 101 will do.
Your opponent checks: If n ≥ 101, then √n ≥ √101 > √100 = 10, and so 1/√n < 1/10,
so she has to admit that you won this round.
Your opponent’s next challenge is ε = 1/1000.
You respond with N = 1000001.
Your opponent checks: If N = 1000001 and n ≥ 1000001, then √n ≥ √1000001 >
√1000000 = 1000, and so 1/√n < 1/1000, so she has to admit that you won the second
round as well.
By now your opponent has probably worked out that you have a winning strategy.
If she comes up with an ε, you choose N to be any integer larger than 1/ε^2. She
realises that you will always win, and concedes that (n^{−1/2}) converges to 0.
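
The winning strategy of Example 3.1.3 can be written down as a function of ε. The sketch below is illustrative: the names `winning_N` and `term_is_within` are hypothetical, ε is assumed to be an exact `Fraction`, and the test 1/√n < ε is rewritten as n ε^2 > 1 so that no square roots are needed.

```python
from fractions import Fraction
from math import floor


def winning_N(eps: Fraction) -> int:
    """Smallest integer strictly larger than 1/eps^2 (eps > 0)."""
    return floor(1 / (eps * eps)) + 1


def term_is_within(n: int, eps: Fraction) -> bool:
    """Exact test of 1/sqrt(n) < eps, rewritten as n * eps^2 > 1 to avoid square roots."""
    return n * eps * eps > 1


for eps in (Fraction(1, 10), Fraction(1, 1000), Fraction(3, 700)):
    N = winning_N(eps)
    # Spot-check a finite stretch of terms; the general claim needs the proof, not the loop.
    assert all(term_is_within(n, eps) for n in range(N, N + 10_000))
    print(f"eps = {eps}: N = {N}")
```

For ε = 1/10 and ε = 1/1000 the function returns 101 and 1000001, the same responses as in the two rounds of the game.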

To prove that a sequence converges to 0 it is not enough to win a few rounds
of this game (no matter how many). What is needed is a winning strategy. In
Example 3.1.2, the Archimedean property of R ensured that the winning strategy
we proposed (choosing N > 1/ε) is legitimate: there is such an N. The same is true
for the last example.
Can we also use the definition to prove that a sequence does not converge to 0?

Example 3.1.4 At the start of this section we argued that the sequence defined by

a_{2n} = 2^{−n} and a_{2n−1} = 1 for all n ∈ N+

does not qualify as a sequence converging to 0. Can we use the definition to show
this does not converge to 0? The odd-numbered terms are all equal to 1, and they
certainly never get to within a distance 1/2 of 0. If we take ε = 1/2, then a_{2n−1} = 1 > ε.
It follows that for every N ∈ N+ , there will be an n ≥ N such that a_n = 1 > ε.
Hence for ε = 1/2 there will be no N that will satisfy the definition, and so the
sequence does not converge to 0.
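
In the language of the game, Example 3.1.4 gives the doubter a winning reply: with ε = 1/2, every proposed N can be defeated by pointing to an odd index n ≥ N. A minimal sketch of that reply follows; the function names `a` and `offending_index` are hypothetical.

```python
from fractions import Fraction


def a(n: int) -> Fraction:
    """a_{2k} = 2^{-k} and a_{2k-1} = 1."""
    if n % 2 == 1:
        return Fraction(1)
    return Fraction(1, 2 ** (n // 2))


def offending_index(N: int) -> int:
    """Return an odd index n >= N; then a_n = 1, which is not below eps = 1/2."""
    return N if N % 2 == 1 else N + 1


eps = Fraction(1, 2)
for N in (1, 2, 10, 1000, 10 ** 6):
    n = offending_index(N)
    assert n >= N and not (a(n) < eps)
    print(f"N = {N}: a_{n} = {a(n)} is not smaller than {eps}")
```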

It would be tedious in the extreme if we had to come up with a new winning strategy
for every sequence we wanted to prove converges to 0. To avoid this, we are going
to prove some results that will help us to treat large classes of examples at the same
time. You will probably find that none of these results are surprising. So you may
well ask why we bother to prove things that seem obvious. There are a number of
reasons:

• To give credibility to the formal definition by showing that results you think
should be true (using your informal intuition) can indeed be proved (using
only the formal definition);

• To give you practice in proving results using this formal definition;

• To build up a useful stock of results that will make it easier to calculate limits
in future.

The first result is a kind of “sandwich theorem”:

Proposition 3.1.5 If (a_n) and (b_n) are real sequences such that 0 ≤ a_n ≤ b_n for
every n ∈ N+ and b_n → 0, then a_n → 0 as well.

Proof: Let ε ∈ R+ . Since b_n → 0, there is an N ∈ N+ such that for every n ∈ N+ ,
if n ≥ N, then 0 ≤ b_n < ε, and since 0 ≤ a_n ≤ b_n , also 0 ≤ a_n < ε. Hence a_n → 0.

Corollary 3.1.6 If b_n → 0 and x is a real number such that 0 ≤ x ≤ b_n for every
n ∈ N+ , then x = 0.

Proof: In the proposition, take an = x for every n.

Example 3.1.7 We use the sandwich theorem to show that the sequence (2^{−n})
converges to 0. Since we have 2^n > n for every n ∈ N+ (if you do not believe this,
you can prove it by induction!), we can deduce that 0 ≤ 2^{−n} ≤ 1/n for all n ∈ N+ .
But we have seen above that 1/n → 0, so Proposition 3.1.5 tells us that 2^{−n} → 0 as
well.
The same proof will in fact show that for every k ∈ N+ with k ≥ 2, the sequence
(k^{−n}) converges to 0.
We can do even better. Let 0 < r < 1/2. We show that the sequence (r^n) converges
to 0. Since 0 < r < 1/2, 0 < r^n < 2^{−n} for all n ∈ N+ . Since 2^{−n} → 0, we also have
r^n → 0.
The case 1/2 < r < 1 needs some more care. We can write r = 1/(1 + b), for some
0 < b < 1. Then we have, for all n ∈ N+ ,

(1 + b)^n = 1 + nb + (n(n − 1)/2) b^2 + · · · + b^n > nb, and so r^n = 1/(1 + b)^n < (1/b)(1/n).

Since (1/b)(1/n) → 0, also r^n → 0.
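
The estimate r^n < (1/b)(1/n) for 1/2 < r < 1 can be spot-checked with exact arithmetic. In the sketch below the function name `bound` and the sample values of r are choices made for illustration; b is recovered from r as b = 1/r − 1.

```python
from fractions import Fraction


def bound(r: Fraction, n: int) -> Fraction:
    """The bound (1/b)(1/n) from the text, where r = 1/(1 + b), i.e. b = 1/r - 1."""
    b = 1 / r - 1
    return 1 / (b * n)


for r in (Fraction(3, 5), Fraction(9, 10), Fraction(99, 100)):
    for n in (1, 10, 100, 1000):
        assert 0 < r ** n < bound(r, n)
    print(f"r = {r}: r^1000 is about {float(r ** 1000):.3e}, bound {float(bound(r, 1000)):.3e}")
```

The bound is far from sharp (r^n shrinks geometrically while the bound only shrinks like 1/n), but it is all that the sandwich theorem needs.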
bn

We look at combinations of sequences next.

Proposition 3.1.8 Let (a_n) and (b_n) be non-negative sequences.

(a) If a_n → 0 and c ∈ R+ , then ca_n → 0.

(b) If a_n → 0 and b_n → 0, then a_n + b_n → 0.

(c) If a_n → 0 and (b_n) is a bounded sequence, then a_n b_n → 0.

Proof: (a) We first look informally at what we have to prove. To show that ca_n → 0,
we need to show that for every ε > 0, we can find an N ∈ N+ such that if n ≥ N,
0 ≤ ca_n < ε, or 0 ≤ a_n < c^{−1}ε. Since a_n → 0 and c^{−1}ε > 0, we can use c^{−1}ε in the
definition of convergence to find a corresponding N. The formal proof consists of
showing that this N works to show that ca_n → 0.
So let ε > 0. Since a_n → 0 and c^{−1}ε > 0, we can find an N ∈ N+ such that for
every n ∈ N+ , if n ≥ N, 0 ≤ a_n < c^{−1}ε. But then it follows that for every n ∈ N+ ,
if n ≥ N, 0 ≤ ca_n < ε. This proves that ca_n → 0.
(b) Informally, given ε > 0, we want 0 ≤ a_n + b_n < ε for large enough n. To do
this, it will be enough to ensure that a_n < ε/2 and b_n < ε/2, for then we’ll have
0 ≤ a_n + b_n < ε. Will this be possible? Since a_n → 0, and ε/2 > 0, we can choose
an N_1 such that 0 ≤ a_n < ε/2 for n ≥ N_1 . In the same way, since b_n → 0, we can
choose an N_2 such that 0 ≤ b_n < ε/2 for n ≥ N_2 . If n is larger than or equal to both
N_1 and N_2 , both inequalities will be satisfied.
For the formal proof, let ε > 0. Since a_n → 0 and b_n → 0, we can find an
N_1 ∈ N+ and an N_2 ∈ N+ such that for every n ∈ N+ , if n ≥ N_1 , then 0 ≤ a_n < ε/2
and if n ≥ N_2 , then 0 ≤ b_n < ε/2. Let N = max{N_1 , N_2 }. Then for every n ∈ N+ , if
n ≥ N, we have n ≥ N_1 and n ≥ N_2 . Hence 0 ≤ a_n + b_n < ε/2 + ε/2 = ε. This shows
that a_n + b_n → 0.
(c) We are given that (b_n) is bounded, so there is a c > 0 such that b_n ≤ c for all
n ∈ N+ , and so 0 ≤ a_n b_n ≤ ca_n . The result now follows from (a) and the sandwich
theorem.
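
The proof of part (b) is a recipe for combining two winning strategies into one. The sketch below applies it to a_n = 1/n and b_n = 2^{−n}; the function names are hypothetical, and the strategy used for 2^{−n} relies on the inequality 2^{−n} ≤ 1/n from Example 3.1.7.

```python
from fractions import Fraction
from math import floor


def N_for_reciprocal(eps: Fraction) -> int:
    """An N with 1/n < eps for all n >= N (Example 3.1.2)."""
    return floor(1 / eps) + 1


def N_for_power(eps: Fraction) -> int:
    """An N with 2^-n < eps for all n >= N; valid because 2^-n <= 1/n (Example 3.1.7)."""
    return N_for_reciprocal(eps)


def N_for_sum(eps: Fraction) -> int:
    """Combine the two strategies as in the proof of (b): use eps/2 for each and take the max."""
    return max(N_for_reciprocal(eps / 2), N_for_power(eps / 2))


for eps in (Fraction(1, 3), Fraction(1, 100), Fraction(7, 5000)):
    N = N_for_sum(eps)
    # Spot-check finitely many terms of the sum 1/n + 2^-n.
    assert all(Fraction(1, n) + Fraction(1, 2 ** n) < eps for n in range(N, N + 500))
    print(f"eps = {eps}: N = {N}")
```

The N produced this way is rarely the smallest possible one; the definition only asks that some N exists.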
We finish this section by returning to the Nested Interval Property (NIP) that
we introduced in the previous chapter. We saw there that the axiom LUB implies
the NIP, and promised to show that the converse is also true. We are now in a
position to do this. To start note that now that we have defined what it means for
a non-negative sequence to converge to 0, we can restate the NIP, as is done in the
theorem below. Check that you can see why this statement of the NIP is equivalent
to the one given in Theorem 2.4.5.

Theorem 3.1.9 Suppose that the NIP holds: If (an ) and (bn ) are two sequences of
real numbers such that
(a) (an ) is increasing and (bn ) is decreasing;
(b) an < bn for every n ∈ N+ ;
(c) bn − an → 0,
then there is a unique x ∈ R such that an ≤ x ≤ bn for every n ∈ N+ .
Then every non-empty subset of R which is bounded above has a supremum.

Proof: Let S be a non-empty subset of R that has an upper bound. We must
show that S has a supremum.
We construct sequences (a_n) and (b_n) satisfying the conditions (a), (b) and (c).
Choose some s ∈ S (this is possible, since S ≠ ∅) and then choose any a_1 < s. Then
a_1 is not an upper bound of S. Let b_1 be an upper bound of S (any upper bound
will do). Let c_1 be the midpoint of [a_1 , b_1 ], i.e. c_1 = (a_1 + b_1)/2.
If c_1 is an upper bound of S, let a_2 = a_1 and b_2 = c_1 .
If c_1 is not an upper bound of S, let a_2 = c_1 and b_2 = b_1 .
In either case, a_2 is not an upper bound of S and b_2 is an upper bound of S. Also
a_1 ≤ a_2 ≤ b_2 ≤ b_1 and b_2 − a_2 = (b_1 − a_1)/2.
We now proceed inductively. Once we have defined a_k and b_k for some k ∈ N+ , we
let c_k = (a_k + b_k)/2.
If c_k is an upper bound of S, let a_{k+1} = a_k and b_{k+1} = c_k .
If c_k is not an upper bound of S, let a_{k+1} = c_k and b_{k+1} = b_k .
In either case, a_{k+1} is not an upper bound of S, and b_{k+1} is an upper bound of S.
Also

a_1 ≤ a_2 ≤ . . . ≤ a_k ≤ a_{k+1} ≤ b_{k+1} ≤ b_k ≤ . . . ≤ b_2 ≤ b_1

and

b_{k+1} − a_{k+1} = (b_k − a_k)/2 = . . . = (b_2 − a_2) 2^{−(k−1)} = (b_1 − a_1) 2^{−k} .

It now follows from Example 3.1.7 and Proposition 3.1.8 that b_n − a_n → 0. Since
(a), (b) and (c) hold, it follows from the assumption that there is a unique x ∈ R
such that a_n ≤ x ≤ b_n for every n ∈ N+ .
We claim that x = sup S.
First we prove, by contradiction, that x is an upper bound of S. Suppose that there
exists s_0 ∈ S such that s_0 > x. Choose n ∈ N+ such that b_n − a_n < s_0 − x. (We
can do this because b_n − a_n → 0 as n → ∞.) Then b_n < a_n + s_0 − x ≤ s_0 , because
a_n − x ≤ 0. This is a contradiction, because b_n is an upper bound of S, so we must
have b_n ≥ s for all s ∈ S.
Next we show that x is the smallest upper bound of S, again using proof by contradiction.
Suppose that b is an upper bound of S and b < x. Choose n ∈ N+ so
that b_n − a_n < x − b. Then a_n > b_n + b − x ≥ b, because b_n − x ≥ 0. This is a
contradiction, because it would make a_n an upper bound of S, which it is not.
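
The construction in this proof is an algorithm, and it can be run for a concrete set. The sketch below is illustrative only: the names `bisect_sup` and `is_upper_bound_of_S` are hypothetical, and the set is taken to be S = {n/(n+1) : n ∈ N+ }, for which Example 2.3.4 (b) shows that t is an upper bound exactly when t ≥ 1.

```python
from fractions import Fraction


def bisect_sup(is_upper_bound, a1, b1, steps):
    """Run the construction from the proof: a_k is never an upper bound, b_k always is."""
    a, b = Fraction(a1), Fraction(b1)
    assert not is_upper_bound(a) and is_upper_bound(b)
    for _ in range(steps):
        c = (a + b) / 2
        if is_upper_bound(c):
            b = c  # a_{k+1} = a_k, b_{k+1} = c_k
        else:
            a = c  # a_{k+1} = c_k, b_{k+1} = b_k
    return a, b


def is_upper_bound_of_S(t):
    """For S = {n/(n+1) : n in N+}, t is an upper bound exactly when t >= 1."""
    return t >= 1


a, b = bisect_sup(is_upper_bound_of_S, 0, 2, 30)
assert b - a == Fraction(2, 2 ** 30)  # (b_1 - a_1) * 2^-k with k = 30
print(float(a), float(b))
```

Both endpoints close in on 1, the supremum of S, with the lower endpoints never being upper bounds and the upper endpoints always being upper bounds, as the proof requires.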
We are now ready to tackle sequences in general (not only non-negative ones)
converging to any number (not only 0).

Summary:
We started our investigation of convergent sequences in this section by defining what
it means for a non-negative sequence to converge to 0. The definition was used to
find some useful examples of such sequences, and we also saw how it can be used
to show that a sequence does not converge to 0. A few important properties of
sequences converging to 0 were proved. Finally we proved that the Nested Interval
Property is equivalent to the Least Upper Bound axiom.

Exercises

1. Let an = 0 for every n ∈ N+ . Prove that (an ) converges to 0.

2. Let (a_n) be a non-negative sequence and suppose that there is a k ∈ N+ such
that a_n = 0 for all n ≥ k. Show that (a_n) converges to 0.

3. Let (an ) be a non-negative sequence converging to 0. Use the definition to
prove the following:

(a) (√an ) converges to 0;
(b) (an^2 ) converges to 0;
(c) for every fixed k ∈ N+ , the sequence (an+k ) converges to 0.

4. Prove or disprove:

(a) The sequence (an ) defined by an = (1 + (−1)^n)(1/n) converges to 0.

(b) The sequence (an ) defined by an = (1 + (−1)^n)(1/4) converges to 0.

3.2 Convergent sequences


The intuitive idea we tried to capture in a precise definition in the previous section
was that a sequence (an ) of non-negative real numbers converges to 0 if the distance
between 0 and an can be made as small as we like by taking n large enough. Since
an ≥ 0, the distance between an and 0 is just an . The next step is to define what we
mean by an arbitrary (not necessarily non-negative) sequence (an ) of real numbers
converging to a real number a (not necessarily 0). To do this we need to be more
careful about distances. The idea now is to make the distance between an and a
small, and this distance is given by |an −a|. The fact that (|an −a|) is a non-negative
sequence allows us to use the definition for convergence of a non-negative sequence
to 0 to define convergence of (an ) to a.

Definition 3.2.1 Let (an ) be a real sequence and a ∈ R. Then we say that the
sequence (an ) converges to a if and only if the non-negative sequence (|an − a|)
converges to 0.
More formally, using quantifiers and connectives, this reads:

A sequence (an ) in R converges to a ∈ R

⇔ (∀ε ∈ R+ )(∃N ∈ N+ )(∀n ∈ N+ )[n ≥ N ⇒ |a − an | < ε].



In this case we call a the limit of the sequence (an ).


If there is an a ∈ R such that (an ) converges to a, we say that the sequence (an ) is
convergent.

Note that since |a − an | = |(−1)(an − a)| = | − 1||an − a| = |an − a|, we could
also have used |a − an | in the place of |an − a| in the definition.
Here is a simple example.

 
Example 3.2.2 Let an = (2n + 1)/n.
Looking at the first few terms, it seems fairly obvious that (an) converges to 2.
According to the definition we need to check that the sequence (|(2n + 1)/n − 2|)
converges to 0. Since
|(2n + 1)/n − 2| = |(2n + 1 − 2n)/n| = 1/n,
and we already know that (1/n) converges to 0, this is indeed the case.
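To see how the quantifiers in Definition 3.2.1 play out for this example, here is a small sketch that produces, for a given ε, a suitable N and then checks a stretch of terms. The function names a and threshold are illustrative choices, not notation from the notes, and checking finitely many terms is of course no substitute for the proof.

```python
from fractions import Fraction
import math

def a(n):
    """n-th term of the example, kept exact as a Fraction."""
    return Fraction(2 * n + 1, n)

def threshold(eps):
    """An N with |a_n - 2| = 1/n < eps for every n >= N (any N > 1/eps works)."""
    return math.floor(1 / Fraction(eps)) + 1

for eps in [Fraction(1, 10), Fraction(1, 100), Fraction(3, 1000)]:
    N = threshold(eps)
    ok = all(abs(a(n) - 2) < eps for n in range(N, N + 500))
    print(eps, N, ok)
```

The choice N > 1/ε is exactly what the identity |an − 2| = 1/n suggests.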

There are many ways of writing the fact that (an ) converges to a. The following all
mean the same:

• (an ) converges to a

• (an ) tends to a

• an → a

• an → a as n → ∞

• limn→∞ an = a

• limn an = a

• lim an = a

• a is the limit of (an ).

It might make more sense to write (an ) → a rather than an → a, but most books
write an → a, so we will stick to this tradition.
We start with some easy deductions from the definition.

Proposition 3.2.3 Let (an ) be a real sequence.

(a) If an = c for every n ∈ N+ , then an → c (a constant sequence is convergent).

(b) If there is a k ∈ N+ such that an = c for every n ≥ k, then an → c (an eventually
constant sequence is convergent).

(c) Let k ∈ N+ and an → a, then the sequence (an+k ) also converges to a.

Proof: (a) and (b) follow at once from the definition and Exercises 1 and 2 of
Section 3.1. Make sure that you know why.
(c) Informally, we want, given ε > 0, to have n large enough so that |an+k − a| < ε.
Since an → a, we can find an N ∈ N+ such that |an − a| < ε if n ≥ N, or |an+k − a| < ε
if n + k ≥ N. This gives us enough information to write down a formal proof.
For the formal proof, let ε > 0. Since an → a, there is an N ∈ N+ such that
for all n ∈ N+ , if n ≥ N, then |an − a| < ε. Hence for all n ∈ N+ , if n ≥ N − k, then
n + k ≥ N and so |an+k − a| < ε. This shows that an+k → a.
The result in (c) above shows that omitting a finite number of terms from a
convergent sequence will not change the fact that it converges, or its limit.
In the case of a constant or eventually constant sequence, the terminology “an
tends to a” is not really appropriate, since in these cases an is always, or eventually,
equal to a. The terminology is so firmly established that we will stick to it even in
these two cases, even though it is not strictly speaking correct.
A convergent sequence can have only one limit:

Proposition 3.2.4 The limit of a convergent sequence is unique.

Proof: Suppose an → a and an → b. We want to show a = b. For every n ∈ N+ ,

0 ≤ |b − a| = |b − an + an − a| ≤ |b − an | + |an − a|.

We know that |b − an | → 0 and |an − a| → 0, and so also |b − an | + |an − a| → 0,
by Proposition 3.1.8 (b). But then it follows from Corollary 3.1.6 that |b − a| = 0
or a = b.

Proposition 3.2.5 Let an → a and bn → b, and c ∈ R. Then

(a) an + bn → a + b;

(b) can → ca;

(c) an − bn → a − b;

(d) |an | → |a|.

Proof: (a) We have, for all n ∈ N+ ,

|(an + bn ) − (a + b)| = |(an − a) + (bn − b)| ≤ |an − a| + |bn − b|.

Since an → a and bn → b, we have that |an − a| → 0 and |bn − b| → 0, and hence by
Proposition 3.1.8(b), |an − a| + |bn − b| → 0. It now follows from Proposition 3.1.5
that |(an + bn ) − (a + b)| → 0, and so an + bn → a + b.
(b) The proof is similar, using the fact that |can − ca| = |c||an − a| and Proposi-
tion 3.1.8(a).
(c) Use an − bn = an + (−1)bn and (a) and (b).
(d) Use the fact that | |an | − |a| | ≤ |an − a| and Proposition 3.1.5.

Proposition 3.2.6 Every convergent sequence is bounded.

Proof: Suppose an → a. Then by Proposition 3.2.5(d) we also have |an | → |a|. It
follows from the definition of convergence that, taking ε = 1, we can find an N ∈ N+
such that for all n ∈ N+ , if n ≥ N, then | |an | − |a| | < 1, and hence

|an | = | |an | − |a| + |a| | ≤ | |an | − |a| | + |a| < 1 + |a|.

Now let C = max{|a1 |, |a2|, . . . , |aN −1 |, 1 + |a|}. Then it follows that for all n ∈ N+ ,
|an | ≤ C, and so (an ) is bounded.

Proposition 3.2.7 If an → a and bn → b, then an bn → ab.

Proof: It follows from Proposition 3.2.6 that (bn ) is bounded. If a = 0, then it
follows from Proposition 3.1.8(c) that an bn → 0 = ab. In general, we have

|an bn − ab| = |an bn − abn + abn − ab| ≤ |an − a||bn | + |a||bn − b|.

Since an → a and bn → b, we have |an − a| → 0 and |bn − b| → 0. Since (bn )
is bounded, |an − a||bn | → 0 (by Proposition 3.1.8(c)), and |a||bn − b| → 0 (by
Proposition 3.1.8(a)), and so |an bn − ab| → 0 (by Propositions 3.1.8 and 3.1.5).
For quotients of convergent sequences we have to be a little more careful:

Proposition 3.2.8 Let (bn) be a sequence with all its terms non-zero, and suppose
bn → b, with b ≠ 0.

(a) The sequence (1/bn) is bounded.

(b) 1/bn → 1/b.

(c) If also an → a, then an/bn → a/b.

Proof: (a) It follows as in the proof of Proposition 3.2.6 that taking ε = (1/2)|b|
we can find an N ∈ N+ such that if n ≥ N, then | |bn| − |b| | < (1/2)|b|, and so
−(1/2)|b| < |bn| − |b| < (1/2)|b|. From this we can deduce that for n ≥ N, |bn| ≥ (1/2)|b|,
or 1/|bn| ≤ 2/|b|. If we let C = max{|b1|^(−1), |b2|^(−1), . . . , |bN−1|^(−1), 2|b|^(−1)}, then it follows
that 1/|bn| ≤ C for all n ∈ N+.
(b) Use the fact that (1/bn) is bounded, bn → b and that
|1/bn − 1/b| = |b − bn| / (|b||bn|)
to deduce that 1/bn → 1/b, by Proposition 3.1.8 (c).
(c) This now follows immediately from (b) and Proposition 3.2.7.

Example 3.2.9 We show that the sequence (an), with an = (n^2 + 1)/(3n^2 − 2), converges
and find its limit.
For any n ∈ N+, an = (n^2 + 1)/(3n^2 − 2) = (1 + 1/n^2)/(3 − 2/n^2)
(divide numerator and denominator by n^2).
Now 1/n → 0, so 1/n^2 → 0, so 1 + 1/n^2 → 1. Also −2/n^2 → 0, so 3 − 2/n^2 → 3. So an → 1/3.
Make sure that you can give a reason or a reference for each of these steps.

We now look at the interaction between convergence and order.

Theorem 3.2.10 Suppose an → a, bn → b and an ≤ bn for every n ∈ N+ . Then
a ≤ b.

Proof: Suppose a > b. We show that this assumption leads to a contradiction; we
can then say that a ≤ b. The idea behind the proof is to show that you can find
an am close to a which is larger than the corresponding bm (which must be close to
b). If we can do this, it will contradict the fact that an ≤ bn for all n. Here is the
formal proof.
If a > b, then ε = (1/2)(a − b) > 0.
Since an → a, there is an N1 ∈ N+ such that for all n ∈ N+ , if n ≥ N1 , then
|an − a| < ε; in particular, an − a > −ε = −(1/2)(a − b), or an > a − (1/2)(a − b) = (1/2)(a + b).
Since bn → b, there is an N2 ∈ N+ such that for all n ∈ N+ , if n ≥ N2 , then
|bn − b| < ε; in particular, bn − b < ε = (1/2)(a − b), or bn < b + (1/2)(a − b) = (1/2)(a + b).
Choose m = max{N1 , N2 }. Then from the above we have am > (1/2)(a + b) > bm ,
a contradiction, since we know that an ≤ bn for every n ∈ N+ .

Proposition 3.2.11 Let (an) be a non-negative sequence and suppose an → a.
Then √an → √a.

Proof: Since an ≥ 0 for every n ∈ N+, it follows from Theorem 3.2.10 that a ≥ 0.
If a = 0, then given ε > 0 we can find an N ∈ N+ such that an < ε^2, and hence
√an < ε, for all n ≥ N; so √an → 0. Suppose therefore that a > 0.
Since |an − a| = |√an − √a| |√an + √a| ≥ |√an − √a| √a, we have that for every
n ∈ N+,
|√an − √a| ≤ |an − a| / √a.
But |an − a| / √a → 0, so it follows from Proposition 3.1.5 that √an → √a.

A similar argument can be used to show that if p ∈ N+, p ≥ 2 and (an) is a
non-negative sequence such that an → a, then also an^(1/p) → a^(1/p). We leave this as an
exercise.

Example 3.2.12 We determine whether the sequence (bn) converges, where

bn = √(6n/(n + 3)).

For any n ∈ N+, bn = √(6n/(n + 3)) = √(6/(1 + 3/n)).
Now 1/n → 0, so 3/n → 0, so 1 + 3/n → 1, so 6/(1 + 3/n) → 6 and, finally

bn = √(6/(1 + 3/n)) → √6.

We can now state and prove a general version of the sandwich theorem.

Theorem 3.2.13 (Sandwich theorem) Let (an ), (bn ) and (cn ) be real sequences such
that an ≤ bn ≤ cn for every n ∈ N+ , and an → s, cn → s. Then (bn ) is convergent,
and bn → s.

Proof: We have 0 ≤ bn − an ≤ cn − an for every n ∈ N+ , and cn − an → s − s = 0.
It follows from Proposition 3.1.5 that bn − an → 0. But then bn = (bn − an ) + an →
0 + s = s.
If we have to rely on the definition of convergence to prove that a sequence
converges, we have to guess a number (the limit for the sequence) and then use
the definition to show that the sequence does indeed converge to this number. The
sandwich theorem allows us, in some cases, to bypass this process by comparing the
sequence with convergent sequences with known limits. We illustrate this with two
non-trivial examples.


Proposition 3.2.14 (a) Let a ∈ R+. Then limn→∞ a^(1/n) = 1.

(b) limn→∞ n^(1/n) = 1.

Proof: (a) We consider cases:

(i) Suppose a = 1. Then (a^(1/n)) is the constant sequence with every term equal to 1,
so a^(1/n) → 1.
(ii) Suppose a > 1. For any n ∈ N+, a^(1/n) > 1. (If a^(1/n) ≤ 1 then a ≤ 1^n = 1; a
contradiction.) So we can write, for every n ∈ N+, a^(1/n) = 1 + kn, with kn > 0. Then

a = (1 + kn)^n
= 1 + n kn + (n(n − 1)/2) kn^2 + . . . + kn^n (using the Binomial Theorem)
≥ 1 + n kn (Why can we omit the remaining terms?)
> n kn.

It follows that, for any n ∈ N+, 0 < kn < a/n. But a/n → 0, so by the Sandwich
Theorem, kn → 0. So 1 + kn → 1; hence a^(1/n) → 1.
(iii) Suppose a < 1. Write b = 1/a. Then a^(1/n) = (1/b)^(1/n) = 1/b^(1/n). But if a < 1, then
b > 1. So b^(1/n) → 1 (by (ii)), so 1/b^(1/n) → 1.

(b) For any n ∈ N+, if n ≥ 2, then n^(1/n) > 1. (Why?) In this case, we can again
write n^(1/n) = 1 + kn for some kn > 0. Then
n = (1 + kn)^n
= 1 + n kn + (n(n − 1)/2) kn^2 + . . . + kn^n
> (n(n − 1)/2) kn^2.
It follows that kn^2 < 2/(n − 1), so kn < √(2/(n − 1)).
Now √(2/(n − 1)) → 0, so since 0 < kn < √(2/(n − 1)), we have kn → 0.
It follows that n^(1/n) = 1 + kn → 1.
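The bound obtained in part (b) is easy to inspect numerically. The sketch below tabulates kn (the amount by which the n-th root of n exceeds 1) next to the bound √(2/(n − 1)); the helper name root_gap_rows is made up for this illustration, and floating-point arithmetic is used, so the printed values are approximations only.

```python
import math

def root_gap_rows(ns):
    """For each n >= 2 return (n, k_n, bound), where
    k_n = n**(1/n) - 1 and bound = sqrt(2/(n-1))."""
    return [(n, n ** (1 / n) - 1, math.sqrt(2 / (n - 1))) for n in ns]

rows = root_gap_rows([2, 10, 100, 10_000, 1_000_000])
for n, k_n, bound in rows:
    print(n, k_n, bound)
```

In each row kn lies strictly between 0 and the bound, which is the sandwich used in the proof.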
Given a sequence, it is quite often fairly obvious that it is likely to converge,
and not too difficult to guess its limit. It is then a question of confirming your
suspicion by using the definition of convergence, or preferably one or more of
the limit rules. But sometimes, as in the two cases above, it is not quite so obvious,
and certainly not so easy to prove. It would be very useful to have a property or
properties of a sequence that would ensure its convergence. We have seen that a
convergent sequence is always bounded. But boundedness alone is not enough to
guarantee convergence; we need something more. The additional property that will
do the trick is monotonicity.

Theorem 3.2.15 Every real sequence that is increasing and bounded above con-
verges. Its limit is its supremum.

Proof: Let (an) be an increasing real sequence that is bounded above, and let
a = supn an. We show that an → a. This means that given ε ∈ R+, we must find
an N ∈ N+ so that for any n ∈ N+, if n ≥ N, then |an − a| < ε. Since a ≥ an for
every n ∈ N+, we can write this last inequality as a − an < ε.
Let ε ∈ R+ be given. Then a − ε < a. Since a is the smallest upper bound of (an),
a − ε is not an upper bound of (an). So there exists k ∈ N+ such that a − ε < ak.
But (an) is increasing, so ak ≤ an for all n ≥ k. Choose N = k. Then for any
n ∈ N+,
n ≥ N ⇒ a − ε < aN ≤ an ⇒ |an − a| = a − an < ε.

Corollary 3.2.16 Every real sequence that is decreasing and bounded below con-
verges. Its limit is its infimum.

Proof: If (an ) is bounded below and decreasing, (−an ) is bounded above and
increasing. By the theorem −an → supn (−an ) = − inf n an . Hence an → inf n an .

Example 3.2.17 Define (an) by a1 = 4 and an+1 = 3 − 2/an for all n ∈ N+. In
Question 3 of the Exercises for Section 2.2 you were asked to show that
2 ≤ an ≤ 4 for all n ∈ N+, and
(an) is decreasing.
So (an) is a decreasing sequence that is bounded below.
It follows from the corollary above that (an) converges. Suppose limn→∞ an = a. By
Theorem 3.2.10 an → a and an ≥ 2 for every n ∈ N+ implies that a ≥ 2. It follows
from Propositions 3.2.3, 3.2.5 and 3.2.8 that

a = lim an+1 = lim (3 − 2/an) = 3 − 2/a.

Then
0 = a^2 − 3a + 2 = (a − 2)(a − 1), and so a = 2 or a = 1.

So a = 2 (since we know that a ≥ 2).
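It can be reassuring to watch this recursion settle down before trusting the algebra. The following sketch iterates an+1 = 3 − 2/an exactly, using rational arithmetic; the function name iterate and the choice of 40 steps are arbitrary illustration choices, not part of the notes.

```python
from fractions import Fraction

def iterate(a1, steps):
    """Return [a_1, a_2, ..., a_(steps+1)] for a_(n+1) = 3 - 2/a_n."""
    terms = [Fraction(a1)]
    for _ in range(steps):
        terms.append(3 - 2 / terms[-1])
    return terms

terms = iterate(4, 40)
for t in terms[:6]:
    print(t, float(t))
print(float(terms[-1] - 2))  # distance of the last computed term from 2
```

The computed terms stay between 2 and 4 and decrease, in line with the two facts quoted from Section 2.2, and they approach the value 2 found above.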

 
Example 3.2.18 Consider the sequence (cn) defined by c1 = 2, cn+1 = (1/2)(cn + 2/cn)
for all n ∈ N+.
We show by induction that cn^2 ≥ 2 for all n ∈ N+.
Clearly c1^2 ≥ 2.
Suppose that ck^2 ≥ 2 for some k ∈ N+. We show that (ck+1)^2 ≥ 2.

(ck+1)^2 − 2 = (1/4)(ck^2 + 4 + 4/ck^2) − 2 = (1/4)(ck^2 − 4 + 4/ck^2) = (1/4)(ck − 2/ck)^2 ≥ 0.

By induction, cn^2 ≥ 2 for all n ∈ N+.
Next we show that (cn) is decreasing. For any n ∈ N+,

cn+1 − cn = (1/2)(cn + 2/cn) − cn = (2 − cn^2)/(2cn) ≤ 0

since cn^2 ≥ 2; so cn+1 ≤ cn for all n ∈ N+.
Next we observe that (cn) is bounded below (because all the terms are positive).
Corollary 3.2.16 thus guarantees that (cn) converges.
To find the limit of (cn), we use Proposition 3.2.3. Let lim cn = c. Then
lim cn+1 = lim cn = c, so

c = (1/2)(c + 2/c) ⇒ (1/2)c = 1/c ⇒ c^2 = 2 ⇒ c = √2.

Note that it follows from Theorem 3.2.10 that if cn is positive and cn^2 ≥ 2 for
every n ∈ N+ and cn → c then c is a positive number with c^2 ≥ 2. So, in particular,
we can divide by c in the above argument.
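Because every term of this sequence is rational, the first few can be computed exactly. The sketch below does so with Python's Fraction type; the function name sqrt2_terms is an illustrative choice and not notation from the notes.

```python
from fractions import Fraction

def sqrt2_terms(count):
    """First `count` terms of c_1 = 2, c_(n+1) = (c_n + 2/c_n)/2, as exact fractions."""
    c = Fraction(2)
    terms = [c]
    for _ in range(count - 1):
        c = (c + 2 / c) / 2
        terms.append(c)
    return terms

terms = sqrt2_terms(6)
for c in terms:
    # exact term, its decimal approximation, and how far c^2 is above 2
    print(c, float(c), float(c * c - 2))
```

The output shows the terms decreasing while their squares stay at or above 2, which are the two facts proved by hand above.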

In the last example we have produced a sequence of rational numbers converging
to the irrational number √2. The theorem we used to show that this sequence
converges relied on the existence of greatest lower bounds, which in its turn relies
on the existence of least upper bounds. This is an illustration of a far more general
result.

Theorem 3.2.19 If a and b are real numbers and a < b, then there is a rational
number x such that a < x < b.

Proof: Let A = {n ∈ N : n > max{(b − a)^(−1), b^(−1)}}. Then A is non-empty. Choose
any q ∈ A. Now let B = {n ∈ N : n < bq}. Then B is a non-empty finite set. Let
p = max B. Then p ∈ B and p + 1 ∉ B.
We show that if we put x = p/q, then a < x < b.
It is clear that p/q < b. Since b ≤ (p + 1)/q, it also follows that
a = b − (b − a) < (p + 1)/q − 1/q = p/q.

Corollary 3.2.20 Every real number is the limit of a sequence of rational numbers.

Proof: Let a be a real number. By the theorem, for every n ∈ N+ , there is a
rational number an such that a < an < a + 1/n. The sandwich theorem shows that
an → a.
The proof can be modified slightly to produce a decreasing sequence of rational
numbers converging to a (which is then also the infimum of the sequence). A similar
construction yields an increasing sequence of rational numbers converging to a. This
gives:

Corollary 3.2.21 Every real number is the supremum of a sequence of rational
numbers, and also the infimum of a sequence of rational numbers.

Summary:
In this section we defined the notion of convergence of a real sequence to a real
number, by saying that a sequence (an ) converges to a iff the distance between an
and a converges to zero. From this definition we were able to prove a large number of
“limit rules”, which enable us to find limits of sequences without having to resort to
the definition. We also proved that bounded monotone sequences always converge:
an increasing sequence to its supremum, and a decreasing sequence to its infimum.
We finished the section by showing that every real number is the limit of a sequence
of rational numbers.

Exercises

1. For each of the following sequences, show that it converges and find its limit.
You may use any of the results proved in this section.
(a) ((2n^3 + n)/(n^3 + 4n^2))
(b) ((n^4 + n^2 − 1)/(3n^5 + 2))
(c) ((2^n + 3^n)/(1 + 5^n))
(d) (√(5n^4 + 3n)/(7n − 5n^2 + 2))
(e) ((8^n + n^3)/n!)
(f) (cos(n)/(2 + n^2)).
2. Suppose that (xn ) converges, and a, b ∈ R.

(a) Show that if xn ≥ a for all n ∈ N+ , then lim xn ≥ a. [Hint: This should
be very quick; use a theorem in the notes.]
(b) Show that if xn ≤ b for all n ∈ N+ , then lim xn ≤ b.
(c) Conclude that if xn ∈ [a, b] for all n ∈ N+ , then lim xn ∈ [a, b].

3. Let α ∈ R be such that α < 7. Consider the sequence (dn ) defined by:
d1 = α and dn+1 = (dn + 7)/2 for all n ∈ N+ .
(a) Show that dn ≤ 7 for all n ∈ N+ .
(b) Show that (dn ) is increasing.
(c) Is (dn ) convergent? If so, find its limit.

4. Go back to the exercises for Section 2.2, Question 4. In it we defined a sequence
(an ) by: a1 = α and an+1 = an^2 − 2an + 2 for all n ∈ N+ (where α was some real
number such that 1 < α < 2). Show that (an ) converges and find its limit.

5. Let (xn ) be a bounded sequence. For each n ∈ N+ , define

yn = sup{xk : k ≥ n} and zn = inf{xk : k ≥ n}.

Show that

(a) the sequence (yn ) is decreasing and bounded below;


(b) the sequence (zn ) is increasing and bounded above;
(c) the sequences (yn ) and (zn ) are both convergent;
(d) inf n xn ≤ limn→∞ zn ≤ limn→∞ yn ≤ supn xn .

The limits limn→∞ zn and limn→∞ yn are known as the lim inf and lim sup
of the sequence (xn ) respectively, and denoted by lim inf n xn and lim supn xn .
 
6. Let xn = (−1)^n + n/(n + 1). Find supn xn , inf n xn , lim inf n xn and lim supn xn .

7. (More of a challenge) For all n ∈ N+, let an be defined by

an = (1 + 1/n)^n.

(a) Prove that the sequence (an) is bounded above, as follows.
i. Show that r! ≥ 2^(r−1) for all r ∈ N+.
ii. Show that (n!/(r!(n − r)!)) (1/n^r) ≤ 1/r! for 2 ≤ r ≤ n.
iii. Use the binomial expansion of (1 + 1/n)^n and the results in (i) and
(ii) to show that an ≤ 3 for all n ∈ N+.
(b) Prove that (an) is increasing, as follows.
Write down the binomial expansions of (1 + 1/n)^n and (1 + 1/(n + 1))^(n+1).
Compare the corresponding terms in these expansions and observe that
for 1 ≤ r ≤ n,

(1 − 1/n)(1 − 2/n) . . . (1 − (r − 1)/n) ≤ (1 − 1/(n + 1)) . . . (1 − (r − 1)/(n + 1)).

(c) Conclude that (an) converges. (In fact, an → e. Sometimes e is defined
as this limit, but it can also be proved from other definitions of e that
this limit is equal to e.)

3.3 Divergent sequences


In the first two sections of this chapter we concentrated on convergent sequences.
In this section we look briefly at sequences that do not converge. A sequence (an )
converges if we can find an a ∈ R such that (an ) converges to a; the sequence does
not converge if we can find no such a. A sequence that does not converge is called
divergent.
It is important to get clarity about what it means for a sequence (an) not to
converge to a. If the sequence does converge to a, it means that for every distance
ε, we can find a positive integer N such that all the terms in the sequence from aN
onwards lie within a distance ε from a. If (an) does not converge to a, there is a
distance ε for which this is not possible, that is, for which no such N exists. Saying
that no such N exists means that for every positive integer N, it is not true that for
every n ≥ N, an is within a distance ε of a, and that means that there is an n ≥ N
such that an is at a distance of at least ε from a.

Here it is, very formally and in symbols:

• To say that (an) converges means:
(∃a ∈ R)(an → a), or:
(∃a ∈ R)(∀ε ∈ R+)(∃N ∈ N+)(∀n ∈ N+)[n ≥ N ⇒ |an − a| < ε].

• So to say that (an) diverges means:
(∀a ∈ R)(∃ε ∈ R+)(∀N ∈ N+)(∃n ∈ N+)[n ≥ N ∧ |an − a| ≥ ε].

To get some practice in proving that a sequence diverges, we illustrate the formal
definition of divergence with a sequence we have always thought of as not convergent.

Example 3.3.1 Let (an) = ((−1)^(n+1)) = (1, −1, 1, −1, 1, −1, . . .).
We show that (an) diverges. We do this by showing that (an) cannot converge to a
non-negative number a, because every second term of the sequence equals −1 and
is therefore at a distance more than 1/2 from a. Likewise, (an) cannot converge to
a negative number a, because every second term of the sequence equals 1 and is
therefore at a distance more than 1/2 from a.
Here is the formal proof: Let a ∈ R be given.
Choose ε = 1/2.
Let N ∈ N+ be given.
We must find an n ∈ N+ such that n ≥ N and |an − a| ≥ 1/2. We do this as follows:
If a ≥ 0, choose any n ≥ N for which an = −1.
Then n ≥ N and |an − a| = | − 1 − a| ≥ 1 > 1/2 (because | − 1 − a| is the distance
between a and −1, and if a ≥ 0 this distance is greater than or equal to 1).
On the other hand, if a < 0, choose any n ≥ N for which an = 1. Then n ≥ N and
|an − a| = |1 − a| ≥ 1 > 1/2, as required.
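The formal proof is really a recipe: given a candidate limit a and any N, it tells us which n ≥ N to pick. The sketch below writes that recipe as a function; the names a_term and witness are hypothetical, and running it on a few candidates only illustrates the argument and proves nothing by itself.

```python
def a_term(n):
    """n-th term of the sequence 1, -1, 1, -1, ..."""
    return (-1) ** (n + 1)

def witness(candidate, N):
    """Return some n >= N with |a_n - candidate| >= 1/2, chosen as in the proof."""
    # For candidate >= 0 look for a term equal to -1, otherwise for a term equal to 1.
    target = -1 if candidate >= 0 else 1
    n = N
    while a_term(n) != target:
        n += 1
    return n

for candidate in [-3, -0.2, 0, 0.5, 1, 7]:
    n = witness(candidate, 100)
    print(candidate, n, abs(a_term(n) - candidate))
```

The loop inside witness stops after at most one step, because the wanted value occurs at every second index.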

Example 3.3.2 Does the sequence (an), with

an = (−1)^n + n/(n + 1),

converge or diverge?
Suppose (an) converges. Then since (n/(n + 1)) converges, it follows that the sequence
(an − n/(n + 1)) also converges (it is the difference of two convergent sequences). But
since (−1)^n = an − n/(n + 1), this clearly contradicts what we found in the previous
example. Hence (an) must be divergent.

How easy is it to spot divergent sequences? There is one large class of sequence
that we can immediately classify as divergent:

Proposition 3.3.3 Every unbounded sequence is divergent.

Proof: This follows immediately from the fact that every convergent sequence is
bounded (Proposition 3.2.6).
The converse of this result is not true, and we have an immediate example: We
have just proved that ((−1)n ) is divergent, but it clearly is a bounded sequence.
Two special kinds of divergent sequences are of interest. We single them out in
the following definition.

Definition 3.3.4 Let (an ) be a real sequence.

(a) We say (an) diverges to ∞ and write an → ∞ as n → ∞ (or limn→∞ an = ∞)
if and only if for any positive real number K there is an N ∈ N+ such that, for any
n ∈ N+, if n ≥ N then an > K.
Using quantifiers and connectives, this condition reads:

an → ∞ ⇔ (∀K ∈ R+)(∃N ∈ N+)(∀n ∈ N+)[n ≥ N ⇒ an > K].

(b) We say (an) diverges to −∞ and write an → −∞ as n → ∞ (or limn→∞ an =
−∞) if and only if for any positive real number K there is an N ∈ N+ such that, for
any n ∈ N+, if n ≥ N then an < −K.
Using quantifiers and connectives, this condition reads:

an → −∞ ⇔ (∀K ∈ R+)(∃N ∈ N+)(∀n ∈ N+)[n ≥ N ⇒ an < −K].

Remarks:
(a) Strictly speaking, before we can classify the types of sequences we have just
defined as divergent, we have to prove that if a sequence diverges to ∞ or to −∞,
then it cannot converge to any real number a. This looks so obvious that one can be
forgiven for not even thinking that it is necessary. But it has to be done, and doing
it is in fact a surprisingly useful exercise which helps to get one to think carefully
about the definitions. We ask you to do this in the exercises.
(b) Not all divergent sequences are of the two types we have just defined. The
sequence ((−1)n+1 ) diverges, but does not diverge to ∞ or −∞.

(c) The notation an → ∞ for a divergent sequence is a bit confusing, because it
looks as if we are saying that the limit of (an ) does exist, when in fact it does not
exist. We use an → ∞ as a convenient notation, but ∞ does not represent a real
number, and should not be treated as such.
For the record we state the following obvious result:

Proposition 3.3.5 If an = n for every n ∈ N+ , then an → ∞.

Proof: This follows almost immediately from the Archimedean property. Write out
the proof for yourself.


Example 3.3.6 We show that −√n → −∞ as n → ∞.
Let’s first look informally at what we need to do. Given K ∈ R+ we must find
N ∈ N+ so that for any n ∈ N+, n ≥ N ⇒ −√n < −K. However, −√n <
−K ⇐⇒ √n > K ⇐⇒ n > K^2. If we choose N > K^2, then the required
condition will be satisfied.
Here is the formal proof: Let K ∈ R+ be given. Choose N ∈ N+ such that
N > K^2 (the Archimedean property guarantees that we can do this). Then, for any
n ∈ N+,
n ≥ N ⇒ n > K^2 ⇒ √n > K ⇒ −√n < −K,
as required.
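The choice of N in this proof can be made completely concrete. The sketch below picks one N with N > K^2 and checks the required inequality on a stretch of indices; the function name n_for is made up for the illustration, and the check uses floating-point square roots, so it is a sanity check rather than a proof.

```python
import math

def n_for(K):
    """One positive integer N with N > K*K (the smallest such, in fact)."""
    return math.floor(K * K) + 1

for K in [0.5, 1, 3, 10.25]:
    N = n_for(K)
    ok = all(-math.sqrt(n) < -K for n in range(N, N + 200))
    print(K, N, ok)
```

Any larger N would do just as well; the definition only asks that some N exists for each K.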

The next proposition gives us a useful way of avoiding the definition in many
cases by making use of sequences we know diverge to ∞ or −∞.

Proposition 3.3.7 Let (an ) and (bn ) be real sequences.

(a) If bn → ∞ and an ≥ bn for every n ∈ N+ , then an → ∞.

(b) If bn → −∞ and an ≤ bn for every n ∈ N+ , then an → −∞.

Proof: In both cases this follows almost immediately from the definition. Write
out the one-line proofs yourself.

Example 3.3.8 We show that the sequence (an), with an = n(2 + (−1)^n), diverges
to ∞.
The value of (−1)^n depends on whether n is odd or even.
If n is odd, then n(2 + (−1)^n) = n(2 + (−1)) = n.
If n is even, then n(2 + (−1)^n) = n(2 + (+1)) = 3n.
From this it follows immediately that an ≥ n for every n ∈ N+. Since (n) diverges
to ∞, it follows from the last proposition that an → ∞.

We finish this section by establishing some fairly obvious links between divergent
and convergent sequences.

Proposition 3.3.9 Let (an) and (bn) be sequences of real numbers.

(a) If an ≠ 0 for all n ∈ N+ and an → ∞, then 1/an → 0.

(b) If an > 0 for all n ∈ N+ and an → 0, then 1/an → ∞.

(c) If bn ≠ 0 for all n ∈ N+, bn → ∞ and (an) is a bounded sequence, then an/bn → 0.

Proof: (a) Let ε > 0, then also K = ε^(−1) > 0. Since an → ∞, there is an N ∈ N+
such that for all n ∈ N+, if n ≥ N, then an > K = 1/ε, or 0 < 1/an < ε. This shows
that 1/an → 0.
(b) The proof is similar, and left as an exercise.
(c) This follows from (a) and Proposition 3.1.8.

Example 3.3.10 For each of the following sequences, decide whether it converges
or diverges:

(a) (cn) = ((1 − 3n)/(n^3 + n))   (b) (dn) = ((2n + 2)/(√(n^2 + n) − n)).

(a) For any n ∈ N+, cn = (1 − 3n)/(n^3 + n) = (1/n − 3)/(n^2 + 1). Now 1/n − 3 → −3 and so the sequence
(1/n − 3) is bounded. Since n^2 + 1 ≥ n for every n ∈ N+, n^2 + 1 → ∞, so by
Proposition 3.3.9(c), cn → 0.

(b) For any n ∈ N+,

dn = (2n + 2)/(√(n^2 + n) − n) = (2 + 2/n)/(√(1 + 1/n) − 1)

(divide numerator and denominator by n). Now 2 + 2/n → 2 and √(1 + 1/n) − 1 → 0.
Then
1/dn = (√(1 + 1/n) − 1)/(2 + 2/n) → 0/2 = 0.

Also √(1 + 1/n) − 1 > 0 and 2 + 2/n > 0 for all n ∈ N+, so 1/dn > 0 for all n ∈ N+.
This means we have everything we need to apply Proposition 3.3.9(b) with an = 1/dn.
We conclude that 1/an → ∞, i.e. dn → ∞.
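As a cross-check on part (b), one can multiply numerator and denominator of dn by √(n^2 + n) + n; since (√(n^2 + n) − n)(√(n^2 + n) + n) = n, this gives dn = (2 + 2/n)(√(n^2 + n) + n) ≥ 4n. This rearrangement is an aside and not the argument used in the example. The sketch below compares the two expressions in floating-point arithmetic; the function names d and d_rationalised are illustrative only.

```python
import math

def d(n):
    """d_n exactly as written in part (b)."""
    return (2 * n + 2) / (math.sqrt(n * n + n) - n)

def d_rationalised(n):
    """The same term after multiplying top and bottom by sqrt(n^2 + n) + n."""
    return (2 * n + 2) * (math.sqrt(n * n + n) + n) / n

for n in [1, 10, 1000, 100_000]:
    print(n, d(n), d_rationalised(n), 4 * n)
```

Both columns agree up to rounding and grow at least as fast as 4n, consistent with dn → ∞.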
an

Summary:
Sequences that do not converge are called divergent. In this section we saw how the
definition of convergence can be used to prove that a sequence diverges. We saw
that the class of divergent sequences contains every unbounded sequence, but many
others as well. Two special kinds of divergent sequences were defined: those that
diverge to ∞, and those that diverge to −∞. A link between sequences converging
to 0 and sequences diverging to ∞ was established.

Exercises

1. For each of the following sequences say whether it is convergent or divergent.
If it is divergent, say whether it diverges to ∞, to −∞, or neither of the two.
(a) ((9/8)^n)
(b) ((n^3 + n)/(2 − n^2))
(c) (n^n/n!)
(d) ((−2)^n)
(e) (n^2/√(n^3 + 1)).

2. Prove that if a sequence (an ) diverges to ∞, then it is not convergent.

3. Prove Proposition 3.3.7.

4. Prove Proposition 3.3.9(b).

5. Let r ∈ R and put an = r^n. We have seen that if 0 < r < 1, then an → 0.
Prove that

(a) if |r| < 1, an → 0;
(b) if r > 1, an → ∞;
(c) if r < −1, (an ) diverges.

What happens if |r| = 1?

6. (a) Suppose that (bn ) is an increasing sequence. Show that if (bn ) is not
bounded above, then bn → ∞.
(b) Give an example to show that the statement in (a) is false if the assumption
that (bn ) is increasing is omitted.

7. (a) Prove that if an → ∞ or an → −∞, then 1/an → 0 (half of this is just
about Proposition 3.3.9(a)).
(b) Show that the converse of this statement is false.

3.4 Subsequences
We have on occasion found it useful when working with a sequence to look at
a “part” of the sequence only. As an example, when calculating the limit of a
convergent recursively defined sequence (an ), we used the fact that the sequence
(an+1 ) obtained by omitting the first term has the same limit as the sequence itself. A
sequence such as ((−1)^n + (1/2)^n) can be better understood by looking at the sequence
of even-numbered terms and the sequence of odd-numbered terms separately. It is
natural to call such a sequence obtained by picking some of the terms of the original
sequence (or, equivalently, by leaving out some of the terms of the original sequence)
a subsequence. It is this idea we are going to explore in more detail in this section.
A subsequence of a sequence is obtained by selecting some of the terms of the
original sequence. It is important that we be very clear about how this selection
should be done.

We want the subsequence to contain

• only terms of the original sequence;

• infinitely many terms;

• the terms in the same order as in the original sequence;

• no term of the original sequence more than once.

We can say how we select the terms by listing the numbers (i.e. the subscripts) of
the terms of the original sequence we select. This list will be a list of infinitely many
positive integers, in other words a sequence of positive integers. To ensure that
the terms in the subsequence appear in the same order as in the original sequence,
the sequence of integers has to be increasing. To ensure that we do not choose a
term more than once, this sequence of integers will have to be strictly increasing.
It is traditional to write n1 for the number of the first term we select, n2 for the
number of the second term, and so on. The list will then be the sequence of integers
(nk ). (We could make this even clearer by writing (n_k)_{k=1}^∞, to indicate that k is the
running index.) The subsequence then is the sequence

an1 , an2 , an3 , an4 , . . . .

We make all of this precise in the definition:

Definition 3.4.1 Let (an ) be a real sequence and let (nk ) = (n1 , n2 , n3 , . . .) be a
strictly increasing sequence of positive natural numbers, i.e. n1 < n2 < n3 < . . ..
Then the sequence (ank ) = (an1 , an2 , an3 , . . .) is called a subsequence of (an ).

For example, given a sequence (an ) = (a1 , a2 , a3 , a4 , . . .) the following are some
subsequences of (an ):

(ak+1) = (a2, a3, a4, . . .) (here nk = k + 1)
(a2k) = (a2, a4, a6, a8, . . .) (here nk = 2k)
(a3k+5) = (a8, a11, a14, a17, . . .) (here nk = 3k + 5)
(a2^k) = (a2, a4, a8, a16, . . .) (here nk = 2^k)
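The four index rules in this list can be tabulated mechanically, which also makes it easy to check that each one is strictly increasing. This is only an illustrative sketch: the dictionary index_rules and the helpers first_indices and is_strictly_increasing are hypothetical names, and only finitely many indices are inspected.

```python
index_rules = {
    "k + 1": lambda k: k + 1,
    "2k": lambda k: 2 * k,
    "3k + 5": lambda k: 3 * k + 5,
    "2^k": lambda k: 2 ** k,
}

def first_indices(rule, how_many):
    """The indices n_1, ..., n_how_many selected by the rule k -> n_k."""
    return [rule(k) for k in range(1, how_many + 1)]

def is_strictly_increasing(values):
    return all(x < y for x, y in zip(values, values[1:]))

for name, rule in index_rules.items():
    idx = first_indices(rule, 4)
    # show the rule, the selected indices, and whether they strictly increase
    print(name, idx, is_strictly_increasing(idx))
```

A list of indices such as 2, 1, 4, 3 or 3, 5, 7, 7 would fail the check, which is exactly why it does not describe a subsequence.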

The condition that (nk ) be strictly increasing is important. It means, for exam-
ple, that a sequence like

(a2 , a1 , a4 , a3 , a6 , a5 , a8 , a7 , . . .)

is not a subsequence of (an ). This is because (2, 1, 4, 3, 6, 5, 8, 7, . . .) is not increasing.


The sequence
a3 , a5 , a7 , a7 , a9 , a11 , . . .
is also not a subsequence of (an ). This is because (3, 5, 7, 7, 9, 11, . . .) is not strictly
increasing.
Informally, a subsequence is obtained from a sequence by simply leaving out
some terms (finitely or infinitely many of them), in such a way that the order of the
remaining terms is unchanged.
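The selection rule can be tried out on a computer. The following is only a minimal sketch: the function names and the sample sequence $a_n = 1/n$ are assumptions made for illustration, and only finitely many indices can be listed.

```python
def is_strictly_increasing(indices):
    """True if the indices are positive integers, each larger than the one before."""
    return all(i >= 1 for i in indices) and all(
        earlier < later for earlier, later in zip(indices, indices[1:])
    )


def select_terms(a, indices):
    """Return [a(n_1), a(n_2), ...] for the given 1-based indices n_k."""
    if not is_strictly_increasing(indices):
        raise ValueError("indices must be strictly increasing positive integers")
    return [a(n) for n in indices]


def a(n):
    # Hypothetical sample sequence a_n = 1/n, chosen only for illustration.
    return 1 / n


even_indices = [2 * k for k in range(1, 6)]     # n_k = 2k
powers_of_two = [2 ** k for k in range(1, 6)]   # n_k = 2^k

print(select_terms(a, even_indices))
print(select_terms(a, powers_of_two))
```

Index lists such as (2, 1, 4, 3, ...) or (3, 5, 7, 7, 9, ...) are rejected by `is_strictly_increasing`, for the same reasons as in the two non-examples above.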
The fact that (nk ) is strictly increasing has the following useful consequence.

Lemma 3.4.2 If $(n_k)$ is a strictly increasing sequence of positive natural numbers,
then $n_k \ge k$ for all $k \in \mathbb{N}^+$.

Proof: We use a proof by induction:

Clearly $n_1 \ge 1$ because $n_1 \in \mathbb{N}^+$. Suppose that $n_k \ge k$ for some $k \in \mathbb{N}^+$. We
show that $n_{k+1} \ge k + 1$. Now $n_{k+1} > n_k$ because $(n_k)$ is strictly increasing. So
$n_{k+1} \ge n_k + 1$. (This is because these are natural numbers: if $p, m \in \mathbb{N}^+$ and $p > m$
then $p \ge m + 1$. This is, of course, false for real numbers.)
Putting this together gives $n_{k+1} \ge n_k + 1 \ge k + 1$, as required.

Proposition 3.4.3 Suppose that (an ) is a convergent sequence. Then every subse-
quence of (an ) also converges, and has the same limit as (an ).

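Proof: Let’s first have an informal look at how to go about the proof. Suppose
$a_n \to a$ and $(a_{n_k})$ is a subsequence of $(a_n)$. We must show that $a_{n_k} \to a$ as $k \to \infty$.
(Note that since $k$ is the index that changes, it is $k$ that tends to infinity.) So, given
$\varepsilon \in \mathbb{R}^+$, we need $N \in \mathbb{N}^+$ so that for all $k \in \mathbb{N}^+$, $k \ge N \Rightarrow |a_{n_k} - a| < \varepsilon$. But since
$a_n \to a$, we can find $M \in \mathbb{N}^+$ so that for all $k \in \mathbb{N}^+$, $k \ge M \Rightarrow |a_k - a| < \varepsilon$. Since
$n_k \ge k$, it looks as if taking $N = M$ will do the trick.
Here is the formal proof: Suppose that $a_n \to a$ and that $(a_{n_k})$ is a subsequence
of $(a_n)$. Let $\varepsilon \in \mathbb{R}^+$ be given. Since $a_k \to a$ as $k \to \infty$, there exists $N \in \mathbb{N}^+$ so
that for all $k \in \mathbb{N}^+$, $k \ge N \Rightarrow |a_k - a| < \varepsilon$. Then since $n_k \ge k$ for all $k \in \mathbb{N}^+$, by
Lemma 3.4.2,
\[ k \ge N \Rightarrow n_k \ge k \ge N \Rightarrow |a_{n_k} - a| < \varepsilon. \]
This shows that $a_{n_k} \to a$ as $k \to \infty$.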

The following consequence of Proposition 3.4.3 turns out to be useful, so we state
it as a separate corollary:

Corollary 3.4.4 If the sequence (an ) has two subsequences that converge to different
limits, then (an ) diverges.

This corollary often provides an easy way to show that a sequence diverges. For
example, we now have a painless way to show that $(a_n) = ((-1)^n)$ diverges. The
subsequence $(a_{2n})$ converges to 1 (it is a constant sequence!), and the subsequence
$(a_{2n-1})$ converges to $-1$. Therefore $(a_n)$ diverges.
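As a quick sanity check of this example, the short sketch below lists the first few even-indexed and odd-indexed terms of $((-1)^n)$. It inspects only finitely many terms, so it illustrates the two constant subsequences rather than proving anything; the variable names are ours.

```python
def a(n):
    """The sequence a_n = (-1)^n for n = 1, 2, 3, ..."""
    return (-1) ** n


even_terms = [a(2 * n) for n in range(1, 11)]      # a_2, a_4, ..., a_20
odd_terms = [a(2 * n - 1) for n in range(1, 11)]   # a_1, a_3, ..., a_19

# Each subsequence takes a single value, and the two values differ.
print(set(even_terms), set(odd_terms))
```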
The next proposition is probably the most surprising result we will show about
subsequences. It indicates that subsequences can often be better behaved than the
original sequence. The proof is quite cunning; it is easy to get confused trying to
prove this, if you do not hit on the right idea.

Proposition 3.4.5 Every real sequence has either an increasing subsequence or a
decreasing subsequence. (It could also have both.)

Proof: We identify certain terms of a sequence for which we use the quite descriptive
name peak point.
Given a sequence $(a_n)$ we say that $a_M$ is a peak point of the sequence $(a_n)$ if and
only if for all $n \in \mathbb{N}^+$, $n \ge M \Rightarrow a_M \ge a_n$.
Informally, if you’re standing on a peak point and you look right (in the direction of
the positive n-axis) all the terms of the sequence are lower down than you, or on the
same level. This is shown in the graph below. Usually the graph of a sequence will
consist of dots only, but here we have joined them with straight lines to illustrate
where the name peak point comes from.

[Graph of the terms $a_1, a_2, \ldots, a_{12}$ plotted against $n$, with consecutive points joined by straight lines.]

Figure 3.1: Peak points of a sequence.

For the sequence in this graph, a2 , a4 , a8 , a9 , a10 are peak points; a1 , a3 , a5 , a6 , a7 , a11
are not.

We divide the proof into two cases.


Case 1: (an ) has infinitely many peak points.
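Suppose $a_{n_1}, a_{n_2}, a_{n_3}, \ldots$ are the peak points, listed so that $n_1 < n_2 < n_3 < \cdots$.
Since $a_{n_1}$ is a peak point and $n_2 \ge n_1$, we have $a_{n_1} \ge a_{n_2}$; in the same way $a_{n_2} \ge a_{n_3}$,
$a_{n_3} \ge a_{n_4}$, and so on. Hence $a_{n_1} \ge a_{n_2} \ge a_{n_3} \ge \cdots$ and this shows that $(a_{n_k})$ is a
decreasing subsequence of $(a_n)$.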
Case 2: $(a_n)$ has only finitely many peak points.
Suppose $a_k$ is the last peak point of $(a_n)$. Put $a_{n_1} = a_{k+1}$.
Now $a_{k+1}$ is not a peak point (because $a_k$ was the last one), so there exists $\ell > k + 1$
such that $a_\ell > a_{k+1}$. Put $a_{n_2} = a_\ell$.
Now $a_\ell$ is not a peak point (because $a_k$ was the last one), so there exists $m > \ell$ such
that $a_m > a_\ell$.
Put $a_{n_3} = a_m$.
Continue inductively to define a subsequence $(a_{n_1}, a_{n_2}, \ldots)$ in this way. From the
construction $a_{n_1} < a_{n_2} < a_{n_3} < \cdots$, so $(a_{n_k})$ is increasing.
(Note: If $(a_n)$ has no peak points at all, begin with $a_{n_1} = a_1$ and continue as above.)
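To get a feeling for peak points, one can compute them for a finite list of numbers. This is only a finite analogue of the definition (a term counts as a peak point if it is at least as large as every later term of the list), and the function name and sample data are assumptions made for illustration.

```python
def peak_indices(terms):
    """Return the 1-based positions M such that terms[M] >= every later term of the list."""
    peaks = []
    running_max = None
    # Scan from the right, so running_max is the largest term to the right of the current one.
    for position in range(len(terms), 0, -1):
        value = terms[position - 1]
        if running_max is None or value >= running_max:
            peaks.append(position)
            running_max = value
    peaks.reverse()
    return peaks


sample = [1, 5, 2, 4, 3]
peaks = peak_indices(sample)
peak_values = [sample[m - 1] for m in peaks]
print(peaks, peak_values)
```

As in Case 1 of the proof, the peak values, read from left to right, form a decreasing list.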

Proposition 3.4.5 gives some indication why monotone sequences are important.
While it is certainly not true that every sequence is monotone, every sequence does
have a monotone subsequence. The next result is even better.

Theorem 3.4.6 (The Bolzano-Weierstrass Theorem)

Every bounded sequence of real numbers has a convergent subsequence.

Proof: Suppose (an ) is a bounded sequence. By Proposition 3.4.5 (an ) has either an
increasing or a decreasing subsequence, (ank ). Now (ank ) is also bounded. (One can
use the upper and lower bounds of (an ).) By Theorem 3.2.15 and Corollary 3.2.16,
(ank ) converges.

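Example 3.4.7 Consider the sequence
\[ \left( \tfrac{1}{2}, \tfrac{1}{3}, \tfrac{2}{3}, \tfrac{1}{4}, \tfrac{2}{4}, \tfrac{3}{4}, \tfrac{1}{5}, \tfrac{2}{5}, \tfrac{3}{5}, \tfrac{4}{5}, \tfrac{1}{6}, \ldots \right). \]

It is bounded, because every term is a rational number between 0 and 1. It has
many convergent subsequences, for example:
\[ \left( \tfrac{1}{2}, \tfrac{2}{4}, \tfrac{3}{6}, \tfrac{4}{8}, \ldots \right), \quad \left( \tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{4}, \tfrac{1}{5}, \ldots \right), \quad \left( \tfrac{1}{2}, \tfrac{2}{3}, \tfrac{3}{4}, \tfrac{4}{5}, \ldots \right). \]

These converge to $\tfrac{1}{2}$, 0 and 1 respectively. Since these limits are different, it is clear
from Corollary 3.4.4 that the original sequence diverges.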
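The sequence in this example can be generated mechanically, which makes it easy to look at the subsequences. The sketch below is one possible way to do it; the helper names `term` and `position` and the position formula are ours, not part of the notes.

```python
from fractions import Fraction


def term(n):
    """Return the n-th term of the listing 1/2, 1/3, 2/3, 1/4, 2/4, 3/4, ..."""
    q = 2
    while n > q - 1:      # the block with denominator q contains q - 1 terms
        n -= q - 1
        q += 1
    return Fraction(n, q)


def position(p, q):
    """1-based position of p/q (with 1 <= p < q) in the listing."""
    return (q - 2) * (q - 1) // 2 + p


halves = [term(position(k, 2 * k)) for k in range(1, 8)]       # k/(2k)
unit_fractions = [term(position(1, k + 1)) for k in range(1, 8)]   # 1/(k+1)
near_one = [term(position(k, k + 1)) for k in range(1, 8)]     # k/(k+1)

print(halves)
print(unit_fractions)
print(near_one)
```

In each case the positions used are strictly increasing in $k$, so these really are subsequences in the sense of Definition 3.4.1.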



Summary:
In this section we made precise the important notion of a subsequence of a sequence.
Subsequences inherit good properties from their parent sequences: a subsequence of
a bounded sequence is itself bounded and a subsequence of a convergent sequence is
also convergent, to the same limit. But a subsequence may have even better proper-
ties than its parent sequence: Every bounded sequence, not necessarily convergent
itself, has at least one convergent subsequence (the Bolzano-Weierstrass theorem).

Historical Notes
Bernard Placidus Johann Nepomuk Bolzano (1781 – 1848)

Bolzano was born in Prague (now in the Czech Republic) and went to the University
of Prague in 1796 to study philosophy and mathematics. From the start he was
interested in the philosophical and foundational aspects of mathematics; as he put
it “those parts of mathematics which was at the same time philosophy”. In 1800 he
started three years of theological study and at the same time began on a doctoral
thesis on geometry. In it he investigated the nature of proof in mathematics. In
1804 he was awarded the doctorate, and ordained as a priest in the Roman Catholic
church. He was also appointed as a professor in philosophy and religion at the
University of Prague. He was a pacifist and had strong views on economic justice
which did not make him popular with the authorities. This led to him losing his
position in 1819, being placed under house arrest and forbidden to publish.
Bolzano embarked on a series of books on the foundations of mathematics. The first
volume was published in 1810, the second was written but never published. Instead he
began work on an attempt to free calculus from the concept of an infinitesimal. He
wanted to develop rigorous definitions of the notions of limit, convergence and
derivative that did not depend on geometry. He used his new approach to give a proof
of the intermediate value theorem, and also gave a definition of what is now called a
Cauchy sequence before Cauchy did so himself!
After 1817, Bolzano did not publish any mathematical work for many years,

but worked on a theory of science and knowledge. He did return to the foundations
of mathematics, and worked on an attempt to put the whole of mathematics on a
logical foundation. In the process he discovered a number of paradoxes concerned
with infinite sets. He was the first to use the term “set”, and to give an example
of an infinite set whose elements are in one-to-one correspondence with the elements
of a proper subset of itself. He anticipated Cantor’s theory of infinite sets in many
ways, but much of his work was only published after his death in 1848.

Karl Theodor Wilhelm Weierstrass (1815 – 1897)


Weierstrass was born in Ostenfelde, Germany, the eldest of four children. His father,
a well-educated man, changed jobs often and this resulted in Karl having to change
schools frequently. His mother died when he was 12, and his father remarried two years
later. In 1829 his father took on a job in Paderborn, and Karl went to school there
and took on a part-time bookkeeping job to help the family. He did very well at
school, especially in mathematics. His father wanted him to follow a career in
finance, and therefore enrolled him for a course in law, finance and economics at the
University of Bonn in 1834.
Karl wished to study mathematics, but was afraid to disobey his father’s wishes.
His way of resolving the conflict was not to attend lectures at all, but to spend the
next four years trying to enjoy student life.
Karl did study mathematics on his own, and became interested in elliptic functions.
He managed to solve a problem posed by Abel, and this made him decide to study
mathematics, even if it went against his father’s wishes. He gave up his studies and
left the university without taking any examinations. His father was greatly upset,
but arranged for Karl to enroll at the Academy in Münster in 1839 in order to qualify
as a teacher. There he was encouraged by Gudermann, who also had an interest
in elliptic functions, to continue his mathematical career. During this time he laid
the foundations for his theory of a function of a complex variable. He finished his
training as a teacher in 1842, and started teaching at secondary schools. He found this
a time of “unending dreariness and boredom”. In 1850 his health began to suffer,

probably as a result of the mental and physical strain brought on by trying to do
mathematics and satisfy the demands of his teaching job.
In 1854 Weierstrass published a paper on Abelian functions which immediately
brought him to the attention of the mathematical world. Although he did not
succeed in getting a university appointment at the time, his next paper in 1856
led to a number of offers from universities. He accepted an offer from the Industry
Institute in Berlin, and was subsequently offered a chair at the University of Berlin, a
position he was only able to take up a few years later. Weierstrass was a very popular
lecturer and attracted students from all over the world. He gave a course on the
foundations of analysis in 1860; his emphasis was on rigour and led to his discovery
of a continuous but nowhere differentiable function. His health deteriorated further
and this led to a complete collapse in 1861. When he returned to work a year later
he had to lecture sitting down, with a student writing on the blackboard for him.
In 1863 Weierstrass began to formulate his theory of real numbers. His lectures on
this and other topics were eventually published and exerted an enormous influence
on the teaching of analysis, so much so that the term “Weierstrassian rigour” is still
in use today. A large number of famous mathematicians benefited from his teaching.
One of the most famous was Sofia Kovalevskaya, who was taught privately by him,
because women were not allowed admission to the university at the time. He was
instrumental in securing her a position in Stockholm and in getting the University
of Göttingen to award her an honorary doctorate.
Weierstrass’ perfectionism resulted in him publishing little, but he did supervise the
initial work on publication of his complete works. Much of this only appeared after
his death, from pneumonia, in 1897.

Exercises

1. Use subsequences to show that the following sequences diverge.

(a) Define $(a_n)$ by: $a_n = 2 + \frac{1}{n^2}$ if $n$ is a multiple of 10, and $a_n = \frac{1}{n}$ otherwise.
(b) $(b_n) = \left( (-1)^n \frac{n}{n+1} \right)$.
2. Show that the following sequences are bounded, and find at least one convergent
subsequence in each case.
(a) $(a_n) = \left( (-1)^n + \frac{1}{2n} \right)$

(b) (bn ) = (sin ).
2

3. Prove or disprove. State clearly whether the statement is true or false.

(a) Every bounded sequence converges.


(b) If (an ) has a convergent subsequence, then (an ) is bounded.
(c) Every subsequence of a convergent sequence is bounded.
(d) If (an ) has two subsequences converging to the same limit, then (an ) also
converges to that limit.

4. Define the sequence $(x_n)$ recursively by:

\[ x_1 = 9 \quad \text{and} \quad x_{n+1} = \frac{2}{1 + x_n} \quad \text{for all } n \in \mathbb{N}^+. \]

(a) Show that one of the subsequences $(a_n) = (x_{2n})$ and $(b_n) = (x_{2n-1})$ is
increasing, and the other decreasing. Show that both of them converge
to the same limit, and find this limit.
(b) Prove that $(x_n)$ converges, to this same limit.

3.5 Cauchy sequences


Suppose that we want to prove that a sequence (an ) converges but we have no idea
in advance what its limit a might be. Then there is no point in appealing directly to
the definition of (an ) converging to a, because we will not be able to prove anything
about |an − a| if we do not know what a is.
So, how else can one show that a sequence converges?
We have already seen one answer to this question in the result that every bounded
monotone sequence converges. (See Theorem 3.2.15 and Corollary 3.2.16). However,
this only helps with monotone sequences.
What we need is a property of a sequence which does not rely on knowing the
limit of the sequence, but does guarantee that it will converge. This means that the
property should be formulated in terms of the terms of the sequence only (horrible
pun!). Since every convergent sequence must have this property, it may help to
look at properties of convergent sequences.
If a sequence converges, it follows from the definition of convergence that for
every given distance $\varepsilon$, there is a term in the sequence such that every term from
that one onwards will be within a distance of $\varepsilon$ from the limit. But then any two
of these terms cannot be more than a distance $2\varepsilon$ apart. Very informally we can
say that the terms of a convergent sequence eventually bunch together. Such an

informal description is not good enough to use when you need to prove things, so
we will have to give a rigorous definition of this property. A sequence with this
property is called a Cauchy sequence, named after the mathematician we may hold
responsible for its introduction.

Definition 3.5.1 A real sequence $(a_n)$ is a Cauchy sequence if and only if, for
any $\varepsilon \in \mathbb{R}^+$ there exists $N \in \mathbb{N}^+$ so that for any $n, m \in \mathbb{N}^+$, if $n \ge m \ge N$ then
$|a_n - a_m| < \varepsilon$.

Using quantifiers and connectives, this condition reads:

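$(a_n)$ is Cauchy $\iff$
\[ (\forall \varepsilon \in \mathbb{R}^+)(\exists N \in \mathbb{N}^+)(\forall n \in \mathbb{N}^+)(\forall m \in \mathbb{N}^+)[\, n \ge m \ge N \Rightarrow |a_n - a_m| < \varepsilon \,]. \]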

Very roughly, the terms of a Cauchy sequence get closer and closer to one another.
There is a danger, however, in relying too heavily on such a rough idea. It is possible,
for example, to give an example of a sequence with the property that the distance
between successive terms tends to 0, but the sequence is not a Cauchy sequence (see
the exercises).
To start, we need to know that convergent sequences are Cauchy sequences. We
are, of course, hoping that the converse will be true as well, since if it is, we’ll be
able to identify convergent sequences without knowing what their limits are.

Proposition 3.5.2 Every convergent sequence is a Cauchy sequence.

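Proof: Suppose that $a_n \to a$ for some $a \in \mathbb{R}$.

Let $\varepsilon \in \mathbb{R}^+$ be given.
Since $a_n \to a$, there exists $N \in \mathbb{N}^+$ such that for any $n \in \mathbb{N}^+$, $n \ge N \Rightarrow |a_n - a| < \frac{\varepsilon}{2}$.
So, for any $n, m \in \mathbb{N}^+$,
\[
\begin{aligned}
n \ge m \ge N \Rightarrow |a_n - a_m| &= |(a_n - a) + (a - a_m)| \\
&\le |a_n - a| + |a_m - a| \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} \\
&= \varepsilon.
\end{aligned}
\]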

The proposition above expresses precisely the idea that, if the terms of a sequence
are getting closer and closer to some limit, then they are also getting closer to one
another.

Corollary 3.5.3 If a real sequence is not a Cauchy sequence, then it is not conver-
gent.

Our main aim in this section will be to prove the converse of Proposition 3.5.2,
i.e. to show that every Cauchy sequence of real numbers converges. First we look
at some examples. Since every convergent sequence is a Cauchy sequence, there is
no lack of examples. What we are looking for are examples of sequences that are
not obviously convergent, but which we can show are Cauchy sequences.

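Example 3.5.4 Let $a, b \in \mathbb{R}$ and let $(c_n)$ be defined by: $c_1 = a$, $c_2 = b$ and
$c_{n+2} = \frac{1}{2}(c_{n+1} + c_n)$ for all $n \in \mathbb{N}^+$. We show that $(c_n)$ is a Cauchy sequence.
We have
\[ c_{n+2} - c_{n+1} = \frac{1}{2}(c_{n+1} + c_n) - c_{n+1} = \frac{1}{2}(c_n - c_{n+1}). \]
Thus
\[ |c_{n+2} - c_{n+1}| = \frac{1}{2}|c_{n+1} - c_n| = \frac{1}{2^2}|c_n - c_{n-1}| = \cdots = \frac{1}{2^n}|c_2 - c_1| = \frac{1}{2^n}|b - a|. \]
So, for any $n, m \in \mathbb{N}^+$, if $n > m$
\[
\begin{aligned}
|c_n - c_m| &= |c_n - c_{n-1} + c_{n-1} - c_{n-2} + \cdots + c_{m+1} - c_m| \\
&\le |c_n - c_{n-1}| + |c_{n-1} - c_{n-2}| + \cdots + |c_{m+1} - c_m| \\
&\le \left( \frac{1}{2^{n-2}} + \frac{1}{2^{n-3}} + \cdots + \frac{1}{2^{m-1}} \right) |b - a| \\
&= \frac{1}{2^{n-2}} \left( 1 + 2 + \cdots + 2^{n-m-1} \right) |b - a| \\
&= \frac{1}{2^{n-2}} \left( \frac{1 - 2^{n-m}}{1 - 2} \right) |b - a| \\
&= \frac{2^{n-m} - 1}{2^{n-2}} |b - a| \\
&\le \frac{2^{n-m}}{2^{n-2}} |b - a| \\
&= \frac{1}{2^{m-2}} |b - a|.
\end{aligned}
\]

Given $\varepsilon \in \mathbb{R}^+$, we could choose $N \in \mathbb{N}^+$ so large that $\frac{1}{2^{N-2}}|b - a| < \varepsilon$. (Find
such an $N$.) Then for any $n, m \in \mathbb{N}^+$,
\[ n \ge m \ge N \Rightarrow |c_n - c_m| \le \frac{1}{2^{N-2}}|b - a| < \varepsilon. \]
This shows that $(c_n)$ is a Cauchy sequence.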
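The estimate $|c_n - c_m| \le \frac{1}{2^{m-2}}|b - a|$ can be spot-checked numerically. The sketch below uses exact rational arithmetic; the function names and the starting values 0 and 1 are assumptions chosen for illustration, and checking finitely many pairs is of course no substitute for the proof.

```python
from fractions import Fraction


def averaged_sequence(a, b, count):
    """Return [c_1, ..., c_count] with c_1 = a, c_2 = b, c_{n+2} = (c_{n+1} + c_n) / 2."""
    terms = [Fraction(a), Fraction(b)]
    while len(terms) < count:
        terms.append((terms[-1] + terms[-2]) / 2)
    return terms[:count]


def bound_holds(a, b, count):
    """Check |c_n - c_m| <= |b - a| / 2^(m-2) for all 1 <= m < n <= count."""
    c = averaged_sequence(a, b, count)
    gap = abs(Fraction(b) - Fraction(a))
    for m in range(1, count + 1):
        for n in range(m + 1, count + 1):
            if abs(c[n - 1] - c[m - 1]) > gap / Fraction(2) ** (m - 2):
                return False
    return True


print(averaged_sequence(0, 1, 6))
print(bound_holds(0, 1, 30))
```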

We next have an important example of a sequence that is not a Cauchy sequence.

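Example 3.5.5 Let $(s_n)$ be defined by:
\[ s_1 = 1, \qquad s_{n+1} = s_n + \frac{1}{n+1} \quad \text{for all } n \in \mathbb{N}^+. \]
Then
\[ (s_n) = \left( 1,\ 1 + \frac{1}{2},\ 1 + \frac{1}{2} + \frac{1}{3},\ 1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4},\ \ldots \right). \]
We show that $(s_n)$ is not a Cauchy sequence.
To say that $(s_n)$ is not a Cauchy sequence means:

\[ (\exists \varepsilon \in \mathbb{R}^+)(\forall N \in \mathbb{N}^+)(\exists n \in \mathbb{N}^+)(\exists m \in \mathbb{N}^+)[\, n \ge m \ge N \wedge |s_n - s_m| \ge \varepsilon \,]. \]

Choose $\varepsilon = \frac{1}{2}$.
Let $N \in \mathbb{N}^+$ be given.
Choose $n = 2N$ and $m = N$.
Then $n \ge m \ge N$. Also:
\[
\begin{aligned}
|s_n - s_m| &= \left| \left( 1 + \frac{1}{2} + \cdots + \frac{1}{N} + \cdots + \frac{1}{2N} \right) - \left( 1 + \frac{1}{2} + \cdots + \frac{1}{N} \right) \right| \\
&= \frac{1}{N+1} + \frac{1}{N+2} + \cdots + \frac{1}{2N} \\
&\ge \frac{1}{2N} + \frac{1}{2N} + \cdots + \frac{1}{2N} \\
&= \frac{N}{2N} \quad \text{(since there are $N$ terms in the sum)} \\
&= \frac{1}{2} \\
&= \varepsilon.
\end{aligned}
\]

This shows that $(s_n)$ is not a Cauchy sequence, and it therefore follows from
Corollary 3.5.3 that $(s_n)$ is not convergent.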
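The contrast between "successive terms get close" and "the sequence is Cauchy" shows up clearly in numbers. In the following sketch (exact arithmetic; the function name and the sample values of $N$ are our choice) the step $s_{N+1} - s_N$ shrinks towards 0, while the gap $s_{2N} - s_N$ never drops below $\frac{1}{2}$.

```python
from fractions import Fraction


def partial_sum(n):
    """Return s_n = 1 + 1/2 + ... + 1/n as an exact fraction."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


for N in (1, 10, 100, 1000):
    step = partial_sum(N + 1) - partial_sum(N)     # equals 1/(N+1)
    gap = partial_sum(2 * N) - partial_sum(N)      # equals 1/(N+1) + ... + 1/(2N)
    print(N, float(step), float(gap), gap >= Fraction(1, 2))
```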

Proposition 3.5.6 Every Cauchy sequence is bounded.

Proof: The proof is very similar to the proof that every convergent sequence is
bounded.
Suppose $(a_n)$ is a Cauchy sequence. Taking $\varepsilon = 1$ in the definition of a Cauchy
sequence, we can find an $N \in \mathbb{N}^+$ such that for every $n, m \in \mathbb{N}^+$, if $n \ge m \ge N$,
then $|a_n - a_m| < 1$. In particular, it follows that for all $n \in \mathbb{N}^+$, if $n \ge N$, then
$|a_n - a_N| < 1$. For such $n$ we have
\[ |a_n| = |(a_n - a_N) + a_N| \le |a_n - a_N| + |a_N| < 1 + |a_N|. \]
Put $C = \max\{|a_1|, |a_2|, \ldots, |a_{N-1}|, 1 + |a_N|\}$. Then it follows that for all $n \in \mathbb{N}^+$,
$|a_n| \le C$, showing that $(a_n)$ is bounded.

Corollary 3.5.7 Every Cauchy sequence has a convergent subsequence.

Proof: Use Theorem 3.4.6 and Proposition 3.5.6.

Proposition 3.5.8 Suppose that (an ) is a Cauchy sequence. If (an ) has the con-
vergent subsequence (ank ), then (an ) converges and its limit is the same as that of
the subsequence (ank ).

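Proof: Suppose that $(a_n)$ is a Cauchy sequence and $(a_{n_k})$ is a subsequence that
converges to $a$. We show that $a_n \to a$ as well. Let $\varepsilon \in \mathbb{R}^+$ be given.
Since $a_{n_k} \to a$ as $k \to \infty$, there exists $N_1 \in \mathbb{N}^+$ so that for any $k \in \mathbb{N}^+$,
$k \ge N_1 \Rightarrow |a_{n_k} - a| < \frac{\varepsilon}{2}$.
Since $(a_n)$ is Cauchy, there exists $N_2 \in \mathbb{N}^+$ so that for any $n, m \in \mathbb{N}^+$,
$n \ge m \ge N_2 \Rightarrow |a_n - a_m| < \frac{\varepsilon}{2}$.
Let $N = \max\{N_1, N_2\}$.
So, for any $n, k \in \mathbb{N}^+$, if $n \ge N$ and $k \ge N$ then:
$|a_{n_k} - a| < \frac{\varepsilon}{2}$, because $k \ge N_1$;
$|a_n - a_{n_k}| < \frac{\varepsilon}{2}$, because $n \ge N_2$ and $n_k \ge k \ge N_2$. Hence
\[
\begin{aligned}
|a_n - a| &= |a_n - a_{n_k} + a_{n_k} - a| \\
&\le |a_n - a_{n_k}| + |a_{n_k} - a| \\
&< \frac{\varepsilon}{2} + \frac{\varepsilon}{2} \\
&= \varepsilon.
\end{aligned}
\]
Since this holds for every $n \ge N$ (take, for instance, $k = N$), we conclude that $a_n \to a$.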

Warning: The statement in this proposition is false without the assumption
that $(a_n)$ is Cauchy. For example, if $(a_n) = ((-1)^n)$ then $(a_n)$ has the convergent
subsequence $(a_{2n})$, but $(a_n)$ itself diverges.

Theorem 3.5.9 A real sequence converges to a real number if and only if it is a
Cauchy sequence.

Proof: Put together Proposition 3.5.2, Corollary 3.5.7 and Proposition 3.5.8.

Example 3.5.10 We can now return to Example 3.5.4. There we showed that
the sequence defined inductively by $c_1 = a$, $c_2 = b$ and $c_{n+2} = \frac{1}{2}(c_{n+1} + c_n)$ for
all $n \in \mathbb{N}^+$ is a Cauchy sequence. We can now say that it must therefore be
convergent. But what is its limit? Let’s put $\lim_{n\to\infty} c_n = c$. Trying a method we
used before, we can then say $\lim_{n\to\infty} c_{n+2} = \lim_{n\to\infty} c_{n+1} = \lim_{n\to\infty} c_n = c$ and so
$c = \lim_{n\to\infty} c_{n+2} = \frac{1}{2}\left( \lim_{n\to\infty} c_{n+1} + \lim_{n\to\infty} c_n \right) = \frac{1}{2}(c + c)$. This is reassuring, but
hardly useful; we can still not say what $c$ is. We try something different: From the
definition of $c_{n+2}$, we easily obtain $c_{n+2} + \frac{1}{2}c_{n+1} = c_{n+1} + \frac{1}{2}c_n$. Applying this result
repeatedly, we get

\[ c_{n+2} + \frac{1}{2}c_{n+1} = c_{n+1} + \frac{1}{2}c_n = c_n + \frac{1}{2}c_{n-1} = \cdots = c_2 + \frac{1}{2}c_1 = \frac{1}{2}a + b. \]
Taking limits in this equation gives
\[ \frac{3}{2}c = c + \frac{1}{2}c = \lim_{n\to\infty}\left( c_{n+2} + \frac{1}{2}c_{n+1} \right) = \frac{1}{2}a + b, \quad \text{and so} \quad c = \frac{2}{3}\left( \frac{1}{2}a + b \right). \]
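Both the conserved quantity $c_{n+1} + \frac{1}{2}c_n$ and the resulting value of the limit can be checked on a computer. The sketch below uses exact fractions; the starting values 3 and $-5$ and all names are arbitrary choices made for illustration.

```python
from fractions import Fraction


def averaged_sequence(a, b, count):
    """Return [c_1, ..., c_count] with c_1 = a, c_2 = b, c_{n+2} = (c_{n+1} + c_n) / 2."""
    terms = [Fraction(a), Fraction(b)]
    while len(terms) < count:
        terms.append((terms[-1] + terms[-2]) / 2)
    return terms[:count]


a, b = Fraction(3), Fraction(-5)          # arbitrary illustrative starting values
c = averaged_sequence(a, b, 60)

# c[n + 1] + c[n] / 2 should equal a/2 + b for every pair of consecutive terms.
invariant_ok = all(c[n + 1] + c[n] / 2 == a / 2 + b for n in range(len(c) - 1))

candidate_limit = Fraction(2, 3) * (a / 2 + b)
error = abs(c[-1] - candidate_limit)

print(invariant_ok, float(candidate_limit), float(error))
```

The error after 60 terms is tiny, in line with the claim that the limit is $\frac{2}{3}\left(\frac{1}{2}a + b\right)$.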

If you look back at what we have done in this chapter, you will realise the
important role played in the results we proved by the least upper bound axiom
(LUB): Every sequence in R that is bounded above has a supremum. Using this
we were able to prove that every bounded monotone sequence converges, and this
in its turn led to a proof that every Cauchy sequence converges.
It is interesting to ask whether we could have taken another axiom in the place
of the least upper bound axiom in the definition of the real numbers. Here is the
answer:

Theorem 3.5.11 The following are equivalent:

(a) Every sequence of real numbers that is bounded above has a least upper bound
in R.

(b) Every increasing sequence of real numbers that is bounded above converges to a
real number.

(c) Every Cauchy sequence of real numbers converges to a real number.

Proof: Since we have already shown that (a) implies (b) (Theorem 3.2.15) and that
(b) implies (c) (Theorem 3.5.9), we only need to prove that (c) implies (a). We leave

this as a challenging exercise for the prospective mathematicians amongst you. You
will find some hints in the exercises.
It follows that we could equally well have used (b) or (c) in the place of (a)
in the definition of the real numbers. What is more, we have already seen that (a)
in the theorem above (the axiom we denoted by LUB) is equivalent to the Nested
Interval property (NIP). This means that we could also have replaced the axiom
LUB in the definition of the real numbers by the NIP.
In the exercises for Section 2.4 we defined the supremum of an arbitrary non-empty
subset of the real numbers. At the time we did not ask whether a non-empty subset
of R which is bounded above always has a supremum, but
simply asked you to find the suprema of some specific sets. The axiom LUB guaran-
tees that if we can write the elements of the set as a sequence, then the supremum
will exist. Fortunately, more is true:

Theorem 3.5.12 The following statements are equivalent:

(a) Every sequence of real numbers that is bounded above has a least upper bound in
R.

(b) Every non-empty set of real numbers that is bounded above has a least upper
bound in R.

Proof: It is clear that (b) implies (a). We leave the proof that (a) implies (b) as
another challenging exercise. You will find some hints in the exercises.

Summary:
In this section we found a property that ensures that a real sequence converges,
but does not require us to guess the limit first. This is the property of being a
Cauchy sequence. We introduced Cauchy sequences and showed that every conver-
gent sequence is a Cauchy sequence. Since a Cauchy sequence is bounded, it has a
convergent subsequence. Although a sequence with a convergent subsequence need
not converge itself, we showed that if the sequence is also a Cauchy sequence, then
it will converge. From all of this we can deduce that a real sequence converges to a
real number if and only if it is a Cauchy sequence.

Historical notes
Augustin Louis Cauchy (1789–1857)

Cauchy was born in Paris at the time of the French Revolution. When he was four,
conditions in Paris became so dangerous that the family moved to Arcueil. There
conditions were so hard that they soon returned to Paris. Cauchy’s father took an
active interest in his son’s education, and sought advice from the mathematicians
Laplace and Lagrange. Cauchy’s formal education started with two years devoted to
classical languages, after which he studied mathematics at the École Polytechnique
and engineering at the École des Ponts et Chaussées. After graduation he took
on a job at Cherbourg, where he worked on harbour facilities for Napoleon’s fleet.
Although he worked extremely hard, he made time for mathematical research and
published his first paper.
In September 1812 Cauchy became ill and returned to Paris. While recuperating
he continued his research and published a further paper. He did not want to return
to Cherbourg, and managed to obtain a posting to the Ourcq Canal project, where
he had worked as a student earlier. Cauchy wanted an academic position, but failed in
a number of applications. He did manage to obtain further sick leave, and a stoppage
on the canal project caused by political events gave him two years in which to do
research. One of the important achievements of this time was a memoir on integration
which became the basis of his theory of complex functions.
In 1815 he did manage to secure a position at the École Polytechnique, and the next
year was awarded a prize by the French Academy of Sciences for a paper on waves. This
was followed by a solution to a problem posed by Fermat, and election to the French
Academy of Sciences.
In 1817 he obtained a post at the Collège de France. His textbook Cours
d’analyse of 1821 attempted to establish a rigorous foundation for the study of
the calculus. This included a rigorous definition of the notion of an integral, and a
rigorous treatment of convergence of infinite series.

Cauchy’s relations with his fellow scientists were not particularly good. He was
a staunch Catholic and tended to bring religion into his work. He supported the
Jesuits in their attack on the Academy of Sciences. His manner was often abrupt
and very critical, and the young mathematicians Abel and Galois amongst others
were treated badly by him.
After the revolution of 1830, Cauchy decided to leave Paris and he spent some
time in Switzerland. When he returned to Paris, he was asked to swear an oath
of allegiance to the new government. When he refused to do this, he lost all his
positions. He went to Turin, and was appointed professor of theoretical physics
there in 1832. The next year he went from there to Prague to tutor the grandson of
Charles X, but Cauchy’s quick temper and the prince’s lack of interest meant that
this was not a particularly successful venture. Cauchy returned to Paris in 1838,
but could not teach because he still refused to take the oath of allegiance. This
even meant that he could not take up a new position to which he was appointed.
His religious and political views also resulted in him not being appointed to a
professorship at the Collège de France, even though he was the best candidate.
He continued to do research in mathematical physics, astronomy and differential
equations.
Political changes in France in 1848 led to Cauchy regaining his old appointments.
When he applied for the chair at the Collège de France in 1850, he narrowly lost
out to Liouville, and this soured relations between them. Another dispute (in which
Cauchy was proved to be wrong) led to a great deal of bitterness in the last years of
Cauchy’s life. He died in 1857. His name lives on in many terms in mathematics, like
Cauchy sequences, the Cauchy-Schwarz inequality, the Cauchy-Riemann equations
and the Cauchy integral formula. He contributed to all the then-known areas of
mathematics, and his contributions show an amazing creativity and insight.

Exercises

1. Let the sequence $(a_n)$ be defined by: $a_1 = 0$ and $a_{n+1} = \sqrt{1 + a_n}$ for all
$n \in \mathbb{N}^+$.

(a) Prove that $a_n \ge 1$ for all $n \ge 2$.
(b) Prove that $|a_{n+1} - a_n| \le \frac{1}{2}|a_n - a_{n-1}|$ for all $n \ge 2$.
(c) Prove that for all $n \in \mathbb{N}^+$,
\[ |a_{n+1} - a_n| \le \left( \frac{1}{2} \right)^{n-1}. \]
(d) Prove that for all $n, m \in \mathbb{N}^+$, if $n > m$ then
\[ |a_n - a_m| \le \left( \frac{1}{2} \right)^{m-2}. \]

(e) Deduce that (an ) is a Cauchy sequence. (Use the definition of a Cauchy
sequence.)
(f) Does (an ) converge? If so, to what?

2. Let $(a_n)$ be a sequence such that $|a_{n+1} - a_n| < C k^n$ for all $n \in \mathbb{N}^+$, where $k$,
$C \in \mathbb{R}$ and $0 < k < 1$. Prove that $(a_n)$ is a Cauchy sequence.

3. Prove Theorem 3.5.11.


[Hints: To prove that (c) implies (a), consider a sequence that is bounded
above, and show that if it does not have a maximum, it has an increasing
subsequence. Then show that an increasing sequence which is bounded above
is a Cauchy sequence. By assumption this Cauchy sequence converges; show
that its limit is the supremum of the original sequence.]

4. Prove Theorem 3.5.12.


[Hints: If $A$ is non-empty and bounded above, let $a_1 \in A$ and $b_1$ be an upper
bound of $A$. Put $c_2 = \frac{1}{2}(a_1 + b_1)$.
If $c_2$ is an upper bound of $A$, let $b_2 = c_2$ and $a_2 = a_1$. If it is not, let $a_2 \in A$
be such that $a_2 > c_2$, and put $b_2 = b_1$.
Continue in this way and define sequences $(a_n)$ and $(b_n)$ inductively such that
for every $n \in \mathbb{N}^+$, $a_n \in A$, $b_n$ is an upper bound of $A$, $0 \le b_n - a_n \le
2^{-n+1}(b_1 - a_1)$, and $(a_n)$ is increasing and $(b_n)$ is decreasing. Then show that
$\sup_n a_n = \inf_n b_n$ is the least upper bound of $A$.]
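The halving construction in this hint can be tried out on a small finite set, where the least upper bound is simply the largest element. The sketch below is a toy illustration under that assumption; the function name `squeeze` and the sample set are ours, and a finite set is of course far simpler than the general case the exercise asks about.

```python
from fractions import Fraction


def squeeze(A, upper_bound, steps):
    """Run the halving construction from the hint on a finite non-empty list A."""
    a = A[0]               # a_1: some element of A
    b = upper_bound        # b_1: some upper bound of A
    for _ in range(steps):
        c = (a + b) / 2
        if all(x <= c for x in A):
            b = c                              # c is an upper bound: keep a, shrink b
        else:
            a = next(x for x in A if x > c)    # pick an element of A above c, keep b
    return a, b


A = [Fraction(1, 3), Fraction(1, 2), Fraction(2, 5)]
a_n, b_n = squeeze(A, Fraction(2), 20)
print(a_n, float(b_n))
```

After each step the distance $b_n - a_n$ is at most half of what it was before, which is where the factor $2^{-n+1}$ in the hint comes from.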
Chapter 4

Infinite series

In the previous chapter we studied infinite sequences and we were particularly inter-
ested in what happens to the terms of a sequence “in the long run”; more precisely,
we wanted to be able to say whether a sequence converges or not, and if it does
converge, what its limit is. These are questions that only make sense for infinite
sequences; for a finite sequence, there is no “long run”. In this chapter we ask what
we can say about the sum of terms of a sequence. If the sequence is finite, there
is not much to say: we can always add finitely many real numbers, and get a real
number as answer. But we saw in Chapter 1 that there are situations that arise
quite naturally where we may want to add infinitely many real numbers.
Taken at face value, the idea of adding infinitely many real numbers seems ridicu-
lous. This is a process that can never stop; there will always be another number to
add. There is clearly no hope of getting a real number as answer. But on second
thoughts there do seem to be situations where we are in fact claiming that we can
add infinitely many numbers and get a real number as answer. The most familiar
case is that of decimal representations for real numbers. An infinite decimal is an
abbreviation for a sum of infinitely many fractions, and we are usually quite com-
fortable with the claim that such a sum equals a real number. We do not think
twice about writing something like
\[ 0.333\ldots = \frac{3}{10} + \frac{3}{100} + \frac{3}{1000} + \cdots = \frac{1}{3}. \]

How do we make sense of all this? The answer is that before we can work
rigorously with infinite sums, we have to attach a meaning to such a sum; it is clear
that we cannot take it at face value. “Attach a meaning to” for a mathematician
means “giving a definition of”. We would like to give a natural and useful definition,
one that makes you want to say: “Now this is a sensible way of going about it; come
to think of it, we couldn’t really have done it any differently.” If the definition
is such that it can accommodate our idea of decimal representations, and results
in infinite sums having many of the properties of finite sums as well, we could feel
well satisfied. This chapter starts with such a definition, and goes on to explore its
consequences.
It is only fair to issue two warnings right at the beginning. The first is that not
all infinite sums are going to cooperate and yield a real number as answer. You’ll
have to accept that there is probably something seriously amiss in a world where
the sum
1+1+1+1+···
yields a real number as answer. So it will be prudent to admit defeat in some cases.
Not every infinite sum has an “answer”.
The second warning is never to take anything infinite lightly. Weird and wonder-
ful things happen when working with the infinite. Infinite sums are no exception.
If you assume that infinite sums behave exactly like finite ones, you do so at your
own peril. You have been warned!

4.1 Basic definitions and properties


Let $(a_n)$ be an infinite sequence. We want to give a meaning to the sum

\[ a_1 + a_2 + a_3 + a_4 + \cdots. \]

We can add finitely many of the terms of the sequence. In particular, we can find
the sums

\begin{align*}
s_1 &= a_1 \\
s_2 &= a_1 + a_2 \\
s_3 &= a_1 + a_2 + a_3 \\
s_4 &= a_1 + a_2 + a_3 + a_4
\end{align*}

and so on; in fact, for every $n \in \mathbb{N}^+$, we can find the sum

\[ s_n = a_1 + a_2 + a_3 + \cdots + a_n. \]

We call these sums partial sums of the sequence $(a_n)$; specifically, $s_n$ is called the $n$-th
partial sum. The fact that we can form such a partial sum for every $n \in \mathbb{N}^+$ means
that starting with an infinite sequence $(a_n)$, we can form a new infinite sequence
$(s_n)$. This in turn allows us to ask what will happen to $s_n$ in the long run. More
precisely, the question is whether the sequence $(s_n)$ converges or not. If it does
converge, the limit would be a very natural candidate for the “sum” of all the terms
of the sequence $(a_n)$. We formalise this intuitive idea in a definition:

Definition 4.1.1 An infinite real sequence $(a_n)$ is called summable if the sequence
$(s_n)$ of its partial sums converges. If this is the case, we call the limit of the sequence
$(s_n)$ the sum of the sequence $(a_n)$, and denote it by $\sum_{k=1}^{\infty} a_k$. In short:

\[ a_1 + a_2 + a_3 + \cdots = \lim_{n\to\infty} s_n, \quad\text{or}\quad \sum_{k=1}^{\infty} a_k = \lim_{n\to\infty} \sum_{k=1}^{n} a_k. \]

We sometimes write $\sum a_k$ or $\sum_{1}^{\infty} a_k$ instead of the more precise $\sum_{k=1}^{\infty} a_k$.
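
The definition can be explored numerically. The following is only a sketch: the helper name `partial_sums` and the sample sequence $a_k = 1/2^k$ are our own choices for illustration, not part of the notes, and a finite list of partial sums can suggest a limit but never prove that one exists.

```python
from itertools import accumulate


def partial_sums(term, n):
    """Return [s_1, ..., s_n], where s_j = term(1) + ... + term(j)."""
    return list(accumulate(term(k) for k in range(1, n + 1)))


# Hypothetical example sequence (an assumption for illustration): a_k = 1 / 2**k.
sums = partial_sums(lambda k: 1 / 2**k, 10)

# The partial sums creep up towards 1, the natural candidate for the "sum".
for n, s in enumerate(sums, start=1):
    print(n, s)
```

Here the $n$-th partial sum is $1 - 2^{-n}$, so the printed values approach 1.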

Our first example (really a whole class of examples) is a familiar one. It illustrates
that sometimes we have to admit defeat: not all sequences are summable.

Example 4.1.2 (Geometric sequences)


Let $r \in \mathbb{R}$ and for $n \in \mathbb{N}^+$, put $a_n = r^{n-1}$.
Then if $r \neq 1$, we have for $n \in \mathbb{N}^+$,

\[ s_n = 1 + r + r^2 + \cdots + r^{n-1} = \sum_{k=1}^{n} r^{k-1} = \frac{1 - r^n}{1 - r} = \frac{1}{1-r} - \frac{1}{1-r}\, r^n. \]

Whether the sequence is summable or not will depend on the value of $r$. We consider
four cases:

(a) If $|r| < 1$, we have seen before that $r^n \to 0$. It follows from this that if $|r| < 1$,
then $(s_n)$ is convergent and $\lim_{n\to\infty} s_n = \frac{1}{1-r}$. Hence if $|r| < 1$, the sequence $(r^{n-1})$ is
summable and has sum $\frac{1}{1-r}$.
(b) If $r = 1$, $s_n = 1 + 1 + \cdots + 1 = n$ (we are simply adding $n$ 1’s), and hence $(s_n)$
is not convergent. Therefore the constant sequence with all terms equal to 1 is not
summable.
(c) If $r = -1$, $s_n = \frac{1 - (-1)^n}{1 - (-1)} = \frac{1}{2}(1 - (-1)^n)$. The sequence $(s_n)$ does not converge,
and so the sequence $((-1)^{n-1})$ is not summable.
(d) If $|r| > 1$, the sequence $(r^n)$ does not converge, and therefore $(s_n)$ also does not
converge. Therefore in this case as well the sequence $(r^{n-1})$ is not summable.
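
The four cases can be watched numerically. This is a sketch only, with function names of our own choosing (`geometric_partial_sum`, `closed_form`); it adds the terms one at a time and compares with the formula $\frac{1-r^n}{1-r}$ derived above.

```python
def geometric_partial_sum(r, n):
    """s_n = 1 + r + ... + r**(n-1), computed by direct addition."""
    total = 0.0
    power = 1.0  # current term r**(k-1)
    for _ in range(n):
        total += power
        power *= r
    return total


def closed_form(r, n):
    """(1 - r**n) / (1 - r); only valid for r != 1."""
    return (1 - r**n) / (1 - r)


# One value of r for each kind of behaviour discussed in the example.
for r in (0.5, -0.5, 1.0, -1.0, 2.0):
    print(r, [geometric_partial_sum(r, n) for n in (1, 2, 5, 10, 50)])
```

For $r = \pm 0.5$ the values settle near $\frac{1}{1-r}$; for $r = 1$ they grow like $n$; for $r = -1$ they oscillate between 1 and 0; for $r = 2$ they blow up.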

We can now settle an old question: What does $0.999\ldots$ equal? This is an
infinite sum of the type we discussed above, in disguise, with $r = \frac{1}{10}$. We have

\[ 0.999\ldots = \frac{9}{10} + \frac{9}{100} + \cdots = \frac{9}{10}\left(1 + \frac{1}{10} + \frac{1}{100} + \cdots\right) = \frac{9}{10} \cdot \frac{1}{1 - \frac{1}{10}} = 1. \]

Note on terminology and notation:

It is customary to refer to the infinite sum $\sum_{k=1}^{\infty} a_k$ as an infinite series. Rather
confusingly, it is entirely conventional to replace the statement that the sequence $(a_n)$
is summable by the statement that $\sum_{k=1}^{\infty} a_k$ converges and the statement that $(a_n)$ is
not summable by the statement that $\sum_{k=1}^{\infty} a_k$ diverges. This is rather odd because,
in the case that $(a_n)$ is summable, $\sum_{k=1}^{\infty} a_k$ is a real number and so cannot converge,
and in the case that $(a_n)$ is not summable, $\sum_{k=1}^{\infty} a_k$ is not defined. Nevertheless, you
will get used to this abuse of notation, and it rarely causes any trouble.
We do some more examples. As with sequences, it will be helpful to build up a
stock of convergent and divergent series.

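Example 4.1.3 Let’s look at the series

\[ \sum_{k=1}^{\infty} \frac{1}{k(k+1)} = \frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \frac{1}{3\cdot 4} + \cdots + \frac{1}{k(k+1)} + \cdots. \]

We use the fact that for every $k \in \mathbb{N}^+$, $\frac{1}{k(k+1)} = \frac{1}{k} - \frac{1}{k+1}$.
The $n$-th partial sum of this series is

\begin{align*}
s_n &= \frac{1}{1\cdot 2} + \frac{1}{2\cdot 3} + \cdots + \frac{1}{n(n+1)} \\
&= \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \cdots + \left(\frac{1}{n} - \frac{1}{n+1}\right) \\
&= 1 - \frac{1}{n+1}.
\end{align*}

So $\lim s_n = 1$ and hence $\sum_{k=1}^{\infty} \frac{1}{k(k+1)} = 1$.

A series like this one is sometimes called a telescoping series.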
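
The telescoping identity $s_n = 1 - \frac{1}{n+1}$ can be checked exactly for a few values of $n$. The sketch below uses Python's `fractions.Fraction` so that no rounding is involved; the function name `telescoping_partial_sum` is ours, and a finite check of this kind illustrates the identity rather than proving it.

```python
from fractions import Fraction


def telescoping_partial_sum(n):
    """Exact value of 1/(1*2) + 1/(2*3) + ... + 1/(n*(n+1))."""
    return sum(Fraction(1, k * (k + 1)) for k in range(1, n + 1))


# Compare direct addition with the closed form 1 - 1/(n+1).
for n in (1, 2, 3, 10, 100):
    assert telescoping_partial_sum(n) == 1 - Fraction(1, n + 1)
    print(n, telescoping_partial_sum(n))
```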


Example 4.1.4 The series $\sum_{k=1}^{\infty} \frac{1}{k} = 1 + \frac{1}{2} + \frac{1}{3} + \cdots$ arises so often that it has a
special name. It is called the harmonic series. In this case $s_n = 1 + \frac{1}{2} + \cdots + \frac{1}{n}$.
If you look at Example 3.5.5, you will see that we showed there that $(s_n)$ is not a
Cauchy sequence, so it does not converge. So $\sum_{k=1}^{\infty} \frac{1}{k}$ diverges.
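
Example 3.5.5 is not reproduced in this chapter. As a hedged illustration of why the partial sums cannot settle down, the sketch below checks the standard bound $s_{2^m} \ge 1 + \frac{m}{2}$ exactly for small $m$; the bound and the function name `harmonic_partial_sum` are our additions and may differ from the argument used in Example 3.5.5.

```python
from fractions import Fraction


def harmonic_partial_sum(n):
    """Exact value of 1 + 1/2 + ... + 1/n."""
    return sum(Fraction(1, k) for k in range(1, n + 1))


# Doubling the number of terms adds at least 1/2 each time,
# so the partial sums exceed every fixed bound eventually.
for m in range(0, 11):
    s = harmonic_partial_sum(2**m)
    assert s >= 1 + Fraction(m, 2)
    print(2**m, float(s))
```

The growth is very slow, which is why looking at only a handful of partial sums can give the misleading impression that the series converges.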

To create new convergent series from existing ones, we need to know how we
can combine convergent series to form convergent series. We start with sums and
constant multiples. The proofs rely on similar results for sequences.

Proposition 4.1.5 Suppose $\sum a_k$ and $\sum b_k$ are convergent series. If $c \in \mathbb{R}$, then

(a) $\sum (a_k + b_k)$ converges and $\sum (a_k + b_k) = \sum a_k + \sum b_k$;

(b) $\sum (c a_k)$ converges and $\sum (c a_k) = c \sum a_k$.

Proof: (a) To begin, we calculate the $n$-th partial sum of the series $\sum (a_k + b_k)$.

\begin{align*}
\sum_{k=1}^{n} (a_k + b_k) &= (a_1 + b_1) + (a_2 + b_2) + \cdots + (a_n + b_n) \\
&= (a_1 + \cdots + a_n) + (b_1 + \cdots + b_n) \\
&= \sum_{k=1}^{n} a_k + \sum_{k=1}^{n} b_k.
\end{align*}

Since the sequences $\left(\sum_{k=1}^{n} a_k\right)$ and $\left(\sum_{k=1}^{n} b_k\right)$ converge, by assumption, the sequence

\[ \left(\sum_{k=1}^{n} (a_k + b_k)\right) = \left(\sum_{k=1}^{n} a_k\right) + \left(\sum_{k=1}^{n} b_k\right) \]

also converges, and

\[ \sum_{k=1}^{\infty} (a_k + b_k) = \lim_{n\to\infty} \sum_{k=1}^{n} (a_k + b_k) = \lim_{n\to\infty} \sum_{k=1}^{n} a_k + \lim_{n\to\infty} \sum_{k=1}^{n} b_k, \]

by Proposition 3.2.5(a).
(b) The proof is similar and left as an exercise.
So far infinite series seem to behave in much the same way as finite sums. The
proofs of the two properties above rely on the corresponding properties of finite
sums. But as we warned you in the introduction to this chapter, it is dangerous to
assume that it is always business as usual.

Example 4.1.6 (a) We have seen in Example 4.1.2 that the series

\[ \sum_{k=1}^{\infty} (-1)^k = -1 + 1 - 1 + 1 - 1 + 1 - \cdots \]

diverges.
(b) Now let’s see what happens if we change this series into a new one by adding
brackets. Consider the series

\[ \sum a_k = (-1 + 1) + (-1 + 1) + (-1 + 1) + \cdots. \]

Here $a_1 = -1 + 1$, $a_2 = -1 + 1$, $\ldots$, $a_n = -1 + 1$, $\ldots$ so if we denote the partial sums
by $s_n$, we get $s_1 = 0$, $s_2 = 0$, $\ldots$, $s_n = 0$, so $(s_n)$ is the constant zero sequence, which of
course converges to 0. So $(-1 + 1) + (-1 + 1) + \cdots = 0$.
(c) Let’s change the brackets again. Consider

\[ -1 + (1 - 1) + (1 - 1) + \cdots. \]

Here $s_1 = -1$, $s_2 = -1 + 0 = -1$, $\ldots$, $s_n = -1$, so $(s_n)$ is the constant sequence with
value $-1$, which converges to $-1$. So $-1 + (1 - 1) + (1 - 1) + \cdots = -1$.

What we see in this example is that, although the original series in (a) diverges,
the two new series in (b) and (c) converge. Introducing brackets therefore really does
create a new series, and even the position of the brackets matters, since the series in
(b) and (c) have different sums. This is in sharp contrast with the situation for finite
sums, where the associative law holds and the way we group terms has no effect on
the sum of the terms.
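
The effect of bracketing in Example 4.1.6 can be seen by listing the partial sums of the three series side by side. This is a small sketch with variable names of our own; each bracketed group is treated as a single term, exactly as in the example.

```python
from itertools import accumulate

N = 8  # number of terms of each series to inspect (an arbitrary choice)

ungrouped = [(-1) ** k for k in range(1, N + 1)]    # -1, 1, -1, 1, ...
grouped_b = [(-1 + 1) for _ in range(N)]            # (-1+1), (-1+1), ...
grouped_c = [-1] + [(1 - 1) for _ in range(N - 1)]  # -1, (1-1), (1-1), ...

print(list(accumulate(ungrouped)))  # oscillates between -1 and 0
print(list(accumulate(grouped_b)))  # constant 0
print(list(accumulate(grouped_c)))  # constant -1
```

The partial sums of the bracketed series are subsequences of the partial sums of the original series, which is why each of them can converge although the original sequence of partial sums does not.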
The next result says that the convergence or divergence of a series is not changed
if we omit or insert finitely many terms at the start of the series. The sum will
change, of course (but not the fact that the series converges).

Proposition 4.1.7 If $m \in \mathbb{N}^+$, the series $\sum_{k=1}^{\infty} a_k$ and $\sum_{k=m}^{\infty} a_k$ either both
converge, or both diverge.

Proof: Let $m \in \mathbb{N}^+$ and suppose that $\sum_{k=m}^{\infty} a_k$ converges. We show that $\sum_{k=1}^{\infty} a_k$
converges. We write $s_n = a_1 + \cdots + a_n$ for the $n$-th partial sum of $\sum_{k=1}^{\infty} a_k$ and

\begin{align*}
t_1 &= a_m \\
t_2 &= a_m + a_{m+1} \\
&\;\;\vdots \\
t_n &= a_m + a_{m+1} + \cdots + a_{m+n-1}
\end{align*}

for the partial sums of $\sum_{k=m}^{\infty} a_k$. We know that $(t_n)$ is convergent and need to show
that $(s_n)$ is convergent. For $n > m$,

\begin{align*}
s_n &= a_1 + \cdots + a_n \\
&= (a_1 + \cdots + a_{m-1}) + (a_m + \cdots + a_n) \\
&= (a_1 + \cdots + a_{m-1}) + t_{n-m+1} \\
&= s_{m-1} + t_{n-m+1}.
\end{align*}

Since $(t_n)$ is convergent, it follows from Proposition 3.4.3 that the subsequence
$(t_{n-m+1})$ also converges. Since $s_{m-1}$ is a constant, it follows from the above that
$(s_n)$ also converges.
The proof of the converse is similar, and is left as an exercise.

Summary:
In this section we made precise the idea of an infinite sequence $(a_n)$ being summable
or not, or equivalently of the infinite series $\sum_{k=1}^{\infty} a_k$ converging or diverging. A series
converges (respectively diverges) if and only if its sequence of partial sums converges
(respectively diverges). As illustration we looked at the important examples of the
geometric and harmonic series. We showed that sums and constant multiples of
converging infinite series are again convergent, but that the associative law does not
hold for infinite series.

Exercises

1. (a) Find the sum of the series

\[ \sum \frac{5}{25k^2 + 5k - 6}. \]

[Hint: factorise the denominator.]

(b) Find the sum of the series

\[ \sum \frac{2^k + 3^k}{5^k}. \]

2. Say whether the following series are convergent or divergent. Give a reason
for your answer.
(a) $\sum e^{-k}$
(b) $\sum \frac{1}{5k}$
(c) $\sum \left(\frac{1}{k} + \left(\frac{2}{3}\right)^k\right)$.
(d) $\sum (a + (k-1)d)$, where $a$ and $d$ are real constants.

3. Use geometric series to express $0.5\dot{2}\dot{6}$ as a rational number.

4.2 Tests for convergence


The most important thing we need to know about an infinite series is whether it
converges or not. If it does, one can try to find its sum exactly, or if this is not
possible (and this is usually the case), we can approximate the sum by one of the
partial sums. It is possible to develop tests which will tell us whether a given series
converges or diverges, without having to try to find a sum for it. In this section
we look at tests that can be used for series with terms that could be positive or
negative, while in the next section we specialise to series with positive terms only.
We start, rather perversely, with a result that gives us a test for divergence.

Proposition 4.2.1 If $\sum a_k$ converges, then $a_n \to 0$.

Proof: Suppose $\sum a_k = s$ and $s_n = a_1 + \cdots + a_n$ is the $n$-th partial sum of the
series. Since $\sum a_k$ converges, $s_n \to s$, and hence also $s_{n-1} \to s$. For all $n \ge 2$,
$a_n = s_n - s_{n-1}$, so

\[ \lim a_n = \lim s_n - \lim s_{n-1} = s - s = 0. \]

Corollary 4.2.2 ($n$-th term test for divergence) If $a_n \not\to 0$, then $\sum a_k$ diverges.

Example 4.2.3 We have already seen (in Example 4.1.4) that the harmonic series
$\sum \frac{1}{n}$ diverges. However, $\frac{1}{n} \to 0$. This example shows that the converse of
Proposition 4.2.1 is false.

Proposition 4.2.1 and its corollary are far more useful than they look at first sight.
Unfortunately, though, these results are often misunderstood and misused. We
summarise exactly what they say (and do not say) as follows:
• $a_n \not\to 0 \Rightarrow \sum a_n$ diverges.
• $a_n \to 0$: Does not tell us anything about the convergence of the series $\sum a_n$.
• $\sum a_n$ converges $\Rightarrow a_n \to 0$.
• $\sum a_n$ diverges: Does not tell us anything about the convergence or divergence
of the sequence $(a_n)$.

Example 4.2.4
(a) Does $\sum \frac{k}{4k+1}$ converge or diverge?
Here the $n$-th term is $a_n = \frac{n}{4n+1} = \frac{1}{4 + 1/n} \to \frac{1}{4}$. Since $a_n \not\to 0$, $\sum a_n$ diverges.
(b) Does $\sum (\ln(k+1) - \ln k)$ converge or diverge?
Here the $n$-th term is $a_n = \ln(n+1) - \ln n = \ln\left(\frac{n+1}{n}\right) = \ln\left(1 + \frac{1}{n}\right) \to 0$ (since
$\ln 1 = 0$). This does not tell us anything about the convergence or divergence of the
series. So we calculate the $n$-th partial sum of the series instead:

\begin{align*}
s_n &= (\ln 2 - \ln 1) + (\ln 3 - \ln 2) + \cdots + (\ln(n+1) - \ln n) \\
&= \ln(n+1) - \ln 1 \\
&= \ln(n+1).
\end{align*}

As $n \to \infty$, $\ln(n+1) \to \infty$ so $(s_n)$ diverges. So $\sum (\ln(k+1) - \ln k)$ diverges.
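
Part (b) is a good case to look at numerically, because the terms shrink to 0 while the partial sums keep growing. The sketch below (function name `log_difference_partial_sum` is our own) adds the terms directly in floating point and compares the result with $\ln(n+1)$.

```python
import math


def log_difference_partial_sum(n):
    """Sum of ln(k+1) - ln(k) for k = 1, ..., n, added term by term."""
    return sum(math.log(k + 1) - math.log(k) for k in range(1, n + 1))


for n in (10, 1000, 100000):
    term = math.log(n + 1) - math.log(n)  # the n-th term, which tends to 0
    print(n, term, log_difference_partial_sum(n), math.log(n + 1))
```

The second column shrinks towards 0 while the third and fourth columns agree (up to rounding) and grow without bound, which is exactly the situation in which the $n$-th term test is silent.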

As we have seen, Proposition 4.2.1 can be useful if we want to show that a series
diverges. It cannot, however, be used for showing that a series converges.
In the previous chapter we proved that a sequence converges if and only if it is
a Cauchy sequence. Since an infinite series converges if and only if its sequence of
partial sums converges, it follows that a series converges if and only if its sequence of
partial sums is a Cauchy sequence. This leads almost immediately to the following
criterion for convergence:

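Theorem 4.2.5 An infinite series $\sum_{k=1}^{\infty} a_k$ converges if and only if for every $\varepsilon \in \mathbb{R}^+$
there is an $N \in \mathbb{N}^+$ such that if $n \in \mathbb{N}^+$, $m \in \mathbb{N}^+$ and $n > m \ge N$, then

\[ \left| \sum_{k=m+1}^{n} a_k \right| < \varepsilon. \]

Proof: Let $s_n = \sum_{k=1}^{n} a_k$ be the $n$-th partial sum of the series. Then for $n > m$,

\[ |s_n - s_m| = \left| \sum_{k=1}^{n} a_k - \sum_{k=1}^{m} a_k \right| = \left| \sum_{k=m+1}^{n} a_k \right|. \]

The result now follows from the definition of a Cauchy sequence.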


The following test gives a sufficient condition for an alternating series to converge.
An alternating series is one in which the terms are alternately positive
and negative. Such a series can be written in the form

\[ \sum_{k=1}^{\infty} (-1)^{k+1} p_k \quad\text{or}\quad \sum_{k=1}^{\infty} (-1)^{k} p_k, \]

with $p_k \ge 0$ for all $k \in \mathbb{N}^+$, where the first form gives a series starting with a
non-negative term, and the second one starting with a non-positive term. The test
is named after the mathematician who introduced the notation that we still use in
calculus today.

Theorem 4.2.6 (The Leibniz Test) Let $(p_n)$ be a sequence such that

(a) $p_n \ge 0$ for all $n \in \mathbb{N}^+$;

(b) $(p_n)$ is a decreasing sequence;

(c) $p_n \to 0$.

Then the series

\[ \sum_{k=1}^{\infty} (-1)^{k+1} p_k = p_1 - p_2 + p_3 - p_4 + p_5 - \cdots \]

converges. It follows that the series

\[ \sum_{k=1}^{\infty} (-1)^{k} p_k = (-1) \sum_{k=1}^{\infty} (-1)^{k+1} p_k = -p_1 + p_2 - p_3 + p_4 - p_5 + \cdots \]

also converges.

Proof: We will need the following inequalities in our calculations:

For any $n, m \in \mathbb{N}^+$, if $n > m$ then

\[ 0 \le p_{m+1} - p_{m+2} + p_{m+3} - \cdots \pm p_n \le p_{m+1}. \]

We have two cases to check.
Case 1: The coefficient of $p_n$ is positive. Then

\[ p_{m+1} + (-p_{m+2} + p_{m+3}) + \cdots + (-p_{n-1} + p_n) \le p_{m+1} \]

since each term in brackets is negative or zero, because $(p_n)$ is decreasing. If we
group differently, we get

\[ (p_{m+1} - p_{m+2}) + (p_{m+3} - p_{m+4}) + \cdots + (p_{n-2} - p_{n-1}) + p_n \ge 0, \]

since each term in brackets is positive or zero, and $p_n \ge 0$.
Case 2: The coefficient of $p_n$ is negative. Then

\[ 0 \le p_{m+1} + (-p_{m+2} + p_{m+3}) + \cdots + (-p_{n-2} + p_{n-1}) - p_n \le p_{m+1} \]

for similar reasons.
Let $(s_n)$ be the sequence of partial sums of $\sum (-1)^{k+1} p_k$. We show that $(s_n)$ is a
Cauchy sequence; that is enough to prove that $\sum (-1)^{k+1} p_k$ converges. Let $\varepsilon \in \mathbb{R}^+$
be given. Choose $N \in \mathbb{N}^+$ so that, for any $n \in \mathbb{N}^+$, $n \ge N \Rightarrow p_{n+1} < \varepsilon$. (This is
possible because $p_n \to 0$.)
Then, for any $n, m \in \mathbb{N}^+$, if $n > m \ge N$,

\begin{align*}
\left| \sum_{k=m+1}^{n} (-1)^{k+1} p_k \right| &= |\pm(p_{m+1} - p_{m+2} + p_{m+3} - \cdots \pm p_n)| \\
&= |p_{m+1} - p_{m+2} + p_{m+3} - \cdots \pm p_n| \\
&\le p_{m+1} \\
&< \varepsilon.
\end{align*}

It follows from Theorem 4.2.5 that the series converges.
Warning: The Leibniz Test is a test for convergence, not divergence. It says that
if certain conditions are satisfied, an alternating series is convergent. It does NOT
say that if condition (a) or (b) fails that the series will necessarily be divergent. (We
can say that if condition (c) fails, the series will be divergent. Why?)
The Leibniz test is particularly useful, not only because it provides a way of
proving convergence, but also because it gives us an easy way of estimating how
close the partial sums of the series are to the (actual) sum. The corollary says how
this can be done.

Corollary 4.2.7 Let $(p_n)$ satisfy the conditions of the Leibniz test. Then, for any
$m \in \mathbb{N}^+$,

\[ \left| \sum_{k=1}^{\infty} (-1)^{k+1} p_k - \sum_{k=1}^{m} (-1)^{k+1} p_k \right| \le p_{m+1}. \]

In words: The absolute value of the error made by approximating $\sum_{k=1}^{\infty} (-1)^{k+1} p_k$
by $\sum_{k=1}^{m} (-1)^{k+1} p_k$ is at most $p_{m+1}$, or, more loosely: The absolute value of the error made
by approximating the sum of the series by a partial sum is at most equal to the
absolute value of the first term not used in the partial sum.
Proof: For brevity, write $s = \sum_{k=1}^{\infty} (-1)^{k+1} p_k$. As usual, let $s_n$ be the $n$-th partial
sum of the series. Then $s = \lim s_n$. For any $m \in \mathbb{N}^+$,

\begin{align*}
\left| \sum_{k=1}^{\infty} (-1)^{k+1} p_k - \sum_{k=1}^{m} (-1)^{k+1} p_k \right| &= |s - s_m| \\
&= \left| \lim_{n\to\infty} s_n - s_m \right| \\
&= \left| \lim_{n\to\infty} (s_n - s_m) \right| \\
&= \lim_{n\to\infty} |s_n - s_m| \\
&\le p_{m+1}.
\end{align*}

This last inequality comes from the proof of Theorem 4.2.6. Make sure that you
know where each of the equalities above comes from.

Example 4.2.8 Consider the series

\[ \sum \frac{(-1)^{k+1}}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots. \]

Let $p_n = \frac{1}{n}$. Then $(p_n)$ is decreasing, $p_n \ge 0$ for all $n \in \mathbb{N}^+$ and $p_n \to 0$. So
Leibniz’s test applies, and the series $\sum (-1)^{k+1} \frac{1}{k}$ converges. We can now use the
error estimate given by Corollary 4.2.7 to estimate the sum, $s$, of the series. If we
approximate $s$ by $1 - \frac{1}{2} + \frac{1}{3} - \cdots + \frac{1}{9}$, the error will be at most $\frac{1}{10}$. If we approximate
$s$ by $1 - \frac{1}{2} + \frac{1}{3} - \cdots + \frac{1}{99}$, the error will be at most $\frac{1}{100}$, and so on.
This is of more interest than it may appear at first sight. You may recall (if
you don’t, don’t worry) that the Taylor series for $\ln(1+x)$ about $x = 0$ is given by:

\[ \ln(1 + x) = x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \cdots \]

and so

\[ \ln 2 = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots. \]

This means that what we are really finding here are estimates for ln 2.
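
The error estimate of Corollary 4.2.7 can be compared with the actual error, taking for granted the identification of the sum with $\ln 2$ mentioned above (which has not been proved in these notes). The sketch below uses a function name of our own, `alternating_harmonic_partial_sum`, and floating-point arithmetic, so the figures are approximate.

```python
import math


def alternating_harmonic_partial_sum(m):
    """1 - 1/2 + 1/3 - ... +/- 1/m, added term by term."""
    return sum((-1) ** (k + 1) / k for k in range(1, m + 1))


for m in (9, 99, 999):
    s_m = alternating_harmonic_partial_sum(m)
    error = abs(math.log(2) - s_m)  # assumes the sum really is ln 2
    bound = 1 / (m + 1)             # p_{m+1}, the first term not used
    print(m, s_m, error, bound)
    assert error <= bound
```

In each row the actual error is roughly half the bound, so the corollary is safe but not sharp for this series.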

Example 4.2.9 Does $\sum \frac{(-1)^{k+1}\, 3k}{k^2 + 1}$ converge or diverge?
Let $p_n = \frac{3n}{n^2+1}$. Then $p_n = \frac{3/n}{1 + 1/n^2} \to 0$. Also $p_n \ge 0$ for all $n \in \mathbb{N}^+$ and $(p_n)$ is
decreasing. (This requires a bit of calculation; check it yourself.) So Leibniz’s test
applies, and $\sum \frac{(-1)^{k+1}\, 3k}{k^2 + 1}$ converges.

Example 4.2.10 Does $\sum \frac{(-1)^{k+1}\, 3k^2}{k^2 + 1}$ converge or diverge?
Let $p_n = \frac{3n^2}{n^2+1}$. Then $p_n = \frac{3}{1 + n^{-2}} \to 3$. The Leibniz test does not apply.
However, Corollary 4.2.2, the $n$-th term test for divergence, does apply. If we write
$a_n = (-1)^{n+1} \frac{3n^2}{n^2+1}$ for the $n$-th term of the series, then the sequence $(a_n)$ diverges.
(This follows from the fact that $a_{2n} \to -3$ and $a_{2n+1} \to 3$.)
So, in particular, $a_n \not\to 0$ and so $\sum a_k = \sum \frac{(-1)^{k+1}\, 3k^2}{k^2 + 1}$ diverges.

Our main aim in this chapter will be to develop tests that allow us to decide
whether a given series converges or diverges. Sometimes we will be able to find the
precise sum, sometimes not. We’ll find that, generally speaking, it is easier to work
with series for which all the terms are non-negative. If a series has both positive
and negative terms, we can create a new series by replacing the terms of the original
series by their absolute values; the new series will then have only non-negative terms.
Our next step will be to explore the relationship between these two series. We first
introduce some helpful terminology:
Definition 4.2.11 A series $\sum a_k$ is called absolutely convergent if and only
if the series $\sum |a_k|$ is convergent. (More formally, the sequence $(a_n)$ is absolutely
summable if and only if the sequence $(|a_n|)$ is summable.)

At first sight, there is no reason to think that this notion could be at all helpful.
We begin with a series that has positive and negative terms in it, make all the terms
positive and see whether the resulting series converges. Why should this tell us
anything about the convergence of the original series? The next theorem tells us
why.

Theorem 4.2.12

(a) Every absolutely convergent series is convergent.

(b) A series is absolutely convergent if and only if the series formed from its positive
terms and the series formed from its negative terms both converge.

Proof: (a) Suppose we are given a series $\sum a_k$ such that $\sum |a_k|$ converges. We use
Theorem 4.2.5 to show that $\sum a_k$ also converges.
Let $\varepsilon \in \mathbb{R}^+$ be given.
Since $\sum |a_k|$ converges, it follows from Theorem 4.2.5 that there is an $N \in \mathbb{N}^+$ so
that for all $n, m \in \mathbb{N}^+$, if $n > m \ge N$, then $|a_{m+1}| + \cdots + |a_n| < \varepsilon$. Hence for
$n > m \ge N$,

\[ |a_{m+1} + a_{m+2} + \cdots + a_n| \le |a_{m+1}| + \cdots + |a_n| < \varepsilon \]

(by the triangle inequality), and it then follows from Theorem 4.2.5 that $\sum a_k$
converges.
(b) We define:

\[ a_k^+ = \begin{cases} a_k & \text{if } a_k \ge 0 \\ 0 & \text{if } a_k < 0 \end{cases} \qquad a_k^- = \begin{cases} 0 & \text{if } a_k \ge 0 \\ -a_k & \text{if } a_k < 0 \end{cases} \]

Note that

(i) $\sum a_k^+$ is the series of positive terms of $\sum a_k$

(ii) $\sum (-a_k^-)$ is the series of negative terms of $\sum a_k$

(iii) $|a_k| = a_k^+ + a_k^-$ for any $k \in \mathbb{N}^+$

(iv) $a_k = a_k^+ - a_k^-$ for any $k \in \mathbb{N}^+$.

First suppose that $\sum a_k^+$ and $\sum (-a_k^-)$ both converge. Then, by (iii) and
Proposition 4.1.5, $\sum |a_k| = \sum a_k^+ + \sum a_k^-$. This shows that $\sum |a_k|$ converges, i.e. $\sum a_k$
converges absolutely.
Next suppose that $\sum a_k$ is absolutely convergent, i.e. that $\sum |a_k|$ converges. If we
add the two equations (iii) and (iv) above, we get

\[ |a_k| + a_k = 2a_k^+, \quad\text{so}\quad a_k^+ = \frac{1}{2}(|a_k| + a_k). \]

Now $\sum |a_k|$ converges by hypothesis, so $\sum a_k$ also converges (by part (a)). So
$\sum a_k^+ = \frac{1}{2} \sum |a_k| + \frac{1}{2} \sum a_k$, i.e. the series of positive terms of $\sum a_k$ converges.
Similarly, $\sum (-a_k^-) = \frac{1}{2} \sum a_k - \frac{1}{2} \sum |a_k|$ and so the series of negative terms of $\sum a_k$
also converges.
From the theorem above it follows that any convergent series with only positive
terms can be used to obtain infinitely many other convergent series, simply by
putting in minus signs at random.
However, not every convergent series can be obtained in this way. There are series
which are convergent, but not absolutely convergent. They get a special name:

Definition 4.2.13 A series is called conditionally convergent if and only if it
is convergent, but not absolutely convergent.

Of course, conditionally convergent series and absolutely convergent series both
converge, but we will see that their behaviour is often significantly different.
Note that to prove that a series $\sum a_k$ is conditionally convergent, you must prove
two things: that $\sum a_k$ does converge and that $\sum |a_k|$ does not converge.

Example 4.2.14 Consider the series $\sum a_k = \sum \frac{(-1)^{k+1}}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$.
In Example 4.2.8, we saw that $\sum a_k$ converges.
In Example 4.1.4, we saw that $\sum |a_k| = 1 + \frac{1}{2} + \frac{1}{3} + \cdots$ diverges.
So $\sum \frac{(-1)^{k+1}}{k}$ is a conditionally convergent series.

Corollary 4.2.15 If a series is conditionally convergent then the series formed from
its positive terms and the series formed from its negative terms both diverge.

Proof: Suppose $\sum a_k$ is a conditionally convergent series. We use the same notation
as in the proof of Theorem 4.2.12, so $\sum a_k^+$ is the series obtained from the positive
terms of $\sum a_k$, and $\sum (-a_k^-)$ is the series obtained from the negative terms of $\sum a_k$.
By Theorem 4.2.12, either $\sum a_k^+$ diverges or $\sum (-a_k^-)$ diverges (since $\sum a_k$ is not
absolutely convergent). Now use a proof by contradiction. Suppose $\sum a_k^+$ converges.
Then $\sum (-a_k^-) = \sum a_k - \sum a_k^+$, so $\sum (-a_k^-)$ converges, which is a contradiction. One obtains
a similar contradiction if one assumes that $\sum (-a_k^-)$ converges.

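Example 4.2.16 Look at Example 4.2.14 again.
There we showed that $\sum a_k = \sum \frac{(-1)^{k+1}}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \cdots$ is conditionally
convergent. From Corollary 4.2.15 we then obtain that

\[ \sum_{k=1}^{\infty} \frac{1}{2k-1} = 1 + \frac{1}{3} + \frac{1}{5} + \cdots \quad\text{and}\quad \sum_{k=1}^{\infty} \frac{-1}{2k} = -\frac{1}{2} - \frac{1}{4} - \frac{1}{6} - \cdots \]

are both divergent.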
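
A numerical look at these two series shows the cancellation at work. In the sketch below (function names are our own) the partial sums of the positive and negative parts drift apart, while their sum, which is the $2n$-th partial sum of the alternating series, stays close to a fixed value. Floating-point arithmetic is used, so the printed values are approximate.

```python
def positive_part_sum(n):
    """1 + 1/3 + ... + 1/(2n - 1)."""
    return sum(1 / (2 * k - 1) for k in range(1, n + 1))


def negative_part_sum(n):
    """-1/2 - 1/4 - ... - 1/(2n)."""
    return sum(-1 / (2 * k) for k in range(1, n + 1))


for n in (10, 1000, 100000):
    p, q = positive_part_sum(n), negative_part_sum(n)
    # p + q is the 2n-th partial sum of 1 - 1/2 + 1/3 - 1/4 + ...
    print(n, p, q, p + q)
```

Neither column `p` nor column `q` settles down, yet `p + q` does. A finite table cannot prove divergence, of course; that is what Corollary 4.2.15 is for.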

Summary:
In this section we started the process of finding tests that will enable us to decide
whether a given series converges or diverges, without having to find a sum for the
series. The terms of a convergent series tend to 0, and therefore a series with terms
that do not converge to 0 must be divergent. Leibniz’s test gives sufficient conditions
for an alternating series to converge, and if it converges, also gives an estimate of
the error made when approximating the sum by a partial sum. A series is absolutely
convergent if the series obtained by taking the absolute value of each of its terms
converges. An absolutely convergent series is convergent, but not conversely: a series
which is convergent but not absolutely convergent is called conditionally convergent.
A series is absolutely convergent if and only if the series formed from its positive
terms and the series formed from its negative terms both converge. If a series is
conditionally convergent, both these series diverge.

Historical Notes
Gottfried Wilhelm von Leibniz (1646 – 1716)

Leibniz is famous both as a mathematician and as a philosopher. He was born in
Leipzig, Saxony (now Germany) in 1646, the son of a professor of philosophy at the
University of Leipzig. His father died when he was only six years old, and he was
brought up by his mother, the daughter of a lawyer. He entered the University of
Leipzig at age fourteen, and graduated two years later in philosophy and mathematics.
He subsequently obtained a master’s degree in philosophy, also at Leipzig, and a
doctorate in law from the University of Altdorf.
In 1667 Leibniz declined a position at
the University of Altdorf and started
working for Baron von Boineberg in
Frankfurt. He worked on scientific,
literary and political projects and at
the same time practised as a lawyer
in Mainz. He developed an interest in
physics and in 1671 published a book
containing his abstract ideas of mo-
tion. He corresponded widely with sci-
entists in London and Paris. In 1672 he
went to Paris on a diplomatic mission.
There he met many mathematicians
and philosophers, and studied mathe-
matics and physics under the famous
physicist Christiaan Huygens. It was
at this time that he did work on the
summing of infinite series.
Leibniz went to London on another diplomatic mission in 1673 and while he was
there, visited the Royal Society (to which he was later elected). In his contact with
mathematicians there he realised that his knowledge of mathematics was insufficient
and on his return to Paris he devoted much of his time to improving his knowledge
of mathematics. In the process he became a highly original and productive
mathematician. It was at this time that he started developing his version of the calculus,
and introduced the convenient notation for derivatives and integrals that is still in
use today. Newton was developing the calculus at the same time and wrote to
Leibniz announcing his results. An unfortunate delay in the correspondence between
them led to a suspicion on Newton’s part that Leibniz had stolen his results. This
later led to a bitter priority dispute between them, and the Royal Society set up
a committee to decide the matter. The clearly biased finding of the committee was
in Newton’s favour, and Leibniz spent a large part of the last years of his life
trying to refute the report of the committee.
Another major contribution of Leibniz to mathematics was the development of
the binary system of arithmetic. He also made substantial contributions to dynam-
ics. He left Paris in 1676 to take up a position in Hanover, working for the Duke of
Hanover, and stayed there until his death in 1716.
Leibniz travelled extensively during his lifetime, met many of the leading scien-
tists and philosophers of his time, corresponded with more than 600 of them and
put much effort into setting up and promoting the aims of scientific societies. He
had extremely wide-ranging interests, and believed in working in many different
disciplines, ignoring the traditional boundaries between them.
Exercises

1. Determine whether the following series are convergent or divergent. Give a
reason for your answer.
(a) $\sum (-1)^k \frac{k}{k+1}$
(b) $\sum (-1)^k \frac{\ln k}{k}$
(c) $\sum \frac{(-1)^{k+1}}{\sqrt{k(k+1)}}$
(d) $\sum_{k=2}^{\infty} \frac{(-1)^{k+1}}{k(\ln k)^2}$

2. (a) Classify the series

\[ \sum \frac{(-1)^{k+1}}{\sqrt[3]{k}} \]

as absolutely convergent, conditionally convergent or divergent.
(b) What can you conclude about the convergence or divergence of the series

\[ \sum \frac{1}{\sqrt[3]{2k-1}} \quad\text{and}\quad \sum \frac{1}{\sqrt[3]{2k}}\,? \]

3. Estimate the maximum error in approximating the sum of the series $\sum \frac{(-1)^{k+1}}{2k+1}$
by the first (i) 8 (ii) 9 terms of the series. How many terms of the series are
needed in order to obtain an error which does not exceed 0.0001 in absolute
value?

4. Prove that if $\sum a_k$ is absolutely convergent then

\[ \left| \sum_{k=1}^{\infty} a_k \right| \le \sum_{k=1}^{\infty} |a_k|. \]

4.3 Tests for non-negative series


We continue our search for tests for convergence or divergence of series, but now
concentrating primarily on series with non-negative terms. Such tests also allow us to
test whether a series with both positive and negative terms is absolutely convergent;
if it is, then we know it is also convergent.

Theorem 4.3.1 Let $a_k \ge 0$ for every $k \in \mathbb{N}^+$. Then the series $\sum a_k$ is convergent
if and only if its sequence of partial sums is bounded.

Proof: It follows from the fact that all the terms of the series are non-negative that
the sequence of partial sums is increasing. Since an increasing sequence is convergent
if and only if it is bounded, the sequence of partial sums will be convergent if and
only if it is bounded. The result follows from the fact that an infinite series converges
if and only if its sequence of partial sums converges.
We can now use this result to find a test that allows us to determine whether
a series converges or diverges by comparing it to one that is known to converge or
diverge.

Theorem 4.3.2 (Comparison test) Let $(a_n)$ and $(b_n)$ be real sequences.

(a) If $0 \le a_k \le b_k$ for all $k \in \mathbb{N}^+$ and $\sum b_k$ converges, then $\sum a_k$ also converges
(and $\sum_{k=1}^{\infty} a_k \le \sum_{k=1}^{\infty} b_k$).

(b) If $0 \le a_k \le b_k$ for all $k \in \mathbb{N}^+$ and $\sum a_k$ diverges, then $\sum b_k$ also diverges.

Proof: (a) Suppose $0 \le a_k \le b_k$ for all $k \in \mathbb{N}^+$. Let $s_n = a_1 + \cdots + a_n$ and
$t_n = b_1 + \cdots + b_n$ denote the $n$-th partial sums of the two series. Then $0 \le s_n \le t_n$ for
every $n \in \mathbb{N}^+$. Since $\sum b_k$ is convergent, $(t_n)$ is a bounded sequence, and hence $(s_n)$
is also bounded. It follows from Theorem 4.3.1 that $\sum a_k$ converges. Taking limits
in the inequality $0 \le s_n \le t_n$ shows that, under these circumstances, $\sum a_k \le \sum b_k$.
(b) This comes directly from the contrapositive of the statement in (a): if $\sum a_k$
does not converge then $\sum b_k$ does not converge.
Combining the theorem with Proposition 4.1.7 gives the following more general
version of the comparison test:

Corollary 4.3.3 Let $(a_n)$ and $(b_n)$ be real sequences.

(a) If there is an $m \in \mathbb{N}^+$ such that $0 \le a_k \le b_k$ for all $k \ge m$ and $\sum b_k$ converges,
then $\sum a_k$ also converges.

(b) If there is an $m \in \mathbb{N}^+$ such that $0 \le a_k \le b_k$ for all $k \ge m$ and $\sum a_k$ diverges,
then $\sum b_k$ also diverges.

We summarise the uses and limitations of the comparison test in the table below.

The comparison test can only be used if $a_k \ge 0$ and $b_k \ge 0$ for every $k \in \mathbb{N}^+$.

                 | Know $\sum b_k$ converges     | Know $\sum b_k$ diverges
$a_k \le b_k$    | Deduce $\sum a_k$ converges   | Cannot say anything
$a_k \ge b_k$    | Cannot say anything           | Deduce $\sum a_k$ diverges

As our first application of the comparison test we show that all infinite decimal
expansions (not only repeating ones) represent real numbers.

Proposition 4.3.4 For each $k \in \mathbb{N}^+$, let $a_k \in \{0, 1, 2, \ldots, 9\}$. Then the series
$\sum_{k=1}^{\infty} \frac{a_k}{10^k}$ converges to a real number $x$, and $0 \le x \le 1$.

Proof: For every $k \in \mathbb{N}^+$, $0 \le \frac{a_k}{10^k} \le 9 \cdot \frac{1}{10^k}$.
Since $\sum 10^{-k}$ is a convergent geometric series, it follows from the comparison test
that $\sum_{k=1}^{\infty} \frac{a_k}{10^k}$ converges to a real number $x$, and that $0 \le x \le 9 \sum_{k=1}^{\infty} \frac{1}{10^k} = 1$.


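Example 4.3.5 Consider the series $\sum_{k=1}^{\infty} \dfrac{1}{k^2} = 1 + \dfrac{1}{4} + \dfrac{1}{9} + \ldots$.
For all $k \ge 2$, $\dfrac{1}{k^2} \le \dfrac{1}{k(k-1)}$.
We saw in Example 4.1.3 that $\sum_{k=1}^{\infty} \dfrac{1}{k(k+1)} = 1$. Hence
$$\sum_{k=2}^{\infty} \frac{1}{(k-1)k} = \sum_{k=1}^{\infty} \frac{1}{k(k+1)} = 1.$$
It follows from the comparison test that $\sum_{k=2}^{\infty} \dfrac{1}{k^2}$ converges and $\sum_{k=2}^{\infty} \dfrac{1}{k^2} \le 1$.
Hence, by Proposition 4.1.7, $\sum_{k=1}^{\infty} \dfrac{1}{k^2}$ converges, and $\sum_{k=1}^{\infty} \dfrac{1}{k^2} \le 2$.

Next we look at $\sum \dfrac{1}{k^p} = 1 + \dfrac{1}{2^p} + \dfrac{1}{3^p} + \dfrac{1}{4^p} + \ldots$ for $p \ge 2$. Since $k^2 \le k^p$ for
all $k \in \mathbb{N}^+$, we have $\dfrac{1}{k^2} \ge \dfrac{1}{k^p}$ for all $k \in \mathbb{N}^+$, so we can use the comparison test to
conclude that $\sum \dfrac{1}{k^p}$ converges.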
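
As a numerical illustration of the argument in Example 4.3.5 (a sketch only; the helper `partial_sum` is a name introduced here, not something from the notes), the partial sums of $\sum 1/k^2$ can be compared with $1$ plus the partial sums of the telescoping series $\sum_{k \ge 2} 1/(k(k-1))$, which equal $1 - 1/n$.

```python
def partial_sum(term, n, start=1):
    """Sum term(k) for k = start, ..., n."""
    return sum(term(k) for k in range(start, n + 1))


for n in (10, 100, 1000, 10000):
    squares = partial_sum(lambda k: 1.0 / k**2, n)
    # Partial sum of the comparison series from k = 2; it equals 1 - 1/n.
    telescoping = partial_sum(lambda k: 1.0 / (k * (k - 1)), n, start=2)
    # The term-by-term inequality gives squares <= 1 + telescoping <= 2.
    assert squares <= 1.0 + telescoping <= 2.0
    print(n, squares, 1.0 + telescoping)
```

The printed values increase with $n$ but stay below 2, as the comparison test predicts.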

Example 4.3.6 Often the comparison test can be used to analyse very complicated
looking series in which most of the complication is irrelevant (for convergence, at
least). For example, consider the series
$$\sum \frac{2 + \sin^4(k+1)}{2^k + k^2}.$$

For all $k \in \mathbb{N}^+$, $0 \le \dfrac{2 + \sin^4(k+1)}{2^k + k^2} \le \dfrac{3}{2^k}$ (Why?).

Also $\sum_{k=1}^{\infty} \dfrac{3}{2^k}$ is a convergent geometric series. Since the terms of the original series
are all positive, the comparison test applies, and the series converges.


Example 4.3.7 Consider the series $\sum_{k=1}^{\infty} \dfrac{k+7}{3k^2-1}$.
We first try to get a feeling for the kind of series we are dealing with. For large $k$,
the expression $\dfrac{k+7}{3k^2-1}$ is approximately equal to $\dfrac{k}{3k^2} = \dfrac{1}{3k}$. Since $\sum \dfrac{1}{k}$ diverges,
we expect the given series to diverge as well.

Now $\dfrac{k+7}{3k^2-1} \ge \dfrac{k}{3k^2} = \dfrac{1}{3k} \ge 0$ for all $k \in \mathbb{N}^+$.
Since $\sum \dfrac{1}{k}$ diverges, $\sum \dfrac{1}{3k}$ diverges as well. The comparison test applies,
and $\sum \dfrac{k+7}{3k^2-1}$ diverges.

Sometimes a little more ingenuity is needed to find an appropriate inequality. If we
are given the series $\sum_{k=1}^{\infty} \dfrac{k-1}{3k^2+1}$, we could argue as above that we would expect it
to behave like $\sum \dfrac{1}{k}$, and therefore to diverge.
We have, for all $k \ge 3$, that $k - 1 \ge \frac{1}{2}(k+1)$ and $3k^2 + 1 \le 3(k+1)^2$. Therefore
$$\frac{k-1}{3k^2+1} \ge \frac{\frac{1}{2}(k+1)}{3(k+1)^2} = \frac{1}{6} \cdot \frac{1}{k+1} \quad \text{for all } k \ge 3.$$
Since the series $\dfrac{1}{6}\sum_{k=3}^{\infty} \dfrac{1}{k+1} = \dfrac{1}{6}\sum_{k=4}^{\infty} \dfrac{1}{k}$ diverges, it follows from the comparison test
that $\sum_{k=3}^{\infty} \dfrac{k-1}{3k^2+1}$ diverges as well. But then the series $\sum_{k=1}^{\infty} \dfrac{k-1}{3k^2+1}$ will also diverge.
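
The lower bound used for the second series in Example 4.3.7 is easy to check by machine. The sketch below is illustrative only (the function names `term` and `lower_bound` are assumptions of this example): it uses exact rational arithmetic to confirm that $\frac{k-1}{3k^2+1} \ge \frac{1}{6(k+1)}$ for a range of $k \ge 3$, and that the bound fails at $k = 1$, which is why the comparison starts later.

```python
from fractions import Fraction


def term(k):
    """k-th term of the series sum (k - 1) / (3k^2 + 1)."""
    return Fraction(k - 1, 3 * k**2 + 1)


def lower_bound(k):
    """Comparison term 1 / (6(k + 1)), a multiple of a harmonic-series term."""
    return Fraction(1, 6 * (k + 1))


# The inequality holds on the whole tested range k = 3, ..., 2000.
assert all(term(k) >= lower_bound(k) for k in range(3, 2001))

# It fails for k = 1, where the term of the series is 0.
assert term(1) < lower_bound(1)
```

A finite check like this is no substitute for the algebraic argument, but it is a quick way to catch an inequality written in the wrong direction.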

The next test is arguably the easiest to apply, and has the further merit that it
only requires that the terms of the series be non-zero. There are series, however, for
which this test is inconclusive.

Theorem 4.3.8 (Limit Ratio Test) Let $a_n \ne 0$ for every $n \in \mathbb{N}^+$ and suppose
there is an $\ell \in \mathbb{R}$ such that $\lim_{n \to \infty} \left| \dfrac{a_{n+1}}{a_n} \right| = \ell$.

(a) If $\ell < 1$, then $\sum a_n$ converges absolutely (and hence converges).

(b) If $\ell > 1$, then $\sum a_n$ diverges.

(c) If $\ell = 1$, the test is inconclusive (i.e., it cannot say whether the series converges
or diverges).

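Proof: (a) We begin with the case $\ell < 1$. (We know $\ell \ge 0$. Why?)
Choose $b \in \mathbb{R}$ such that $\ell < b < 1$. (For example, $b = \dfrac{\ell + 1}{2}$ would do.) Then
$b - \ell > 0$, so there exists $N \in \mathbb{N}^+$ so that for all $n \in \mathbb{N}^+$,
$$\begin{aligned}
n \ge N &\Rightarrow \left| \left| \frac{a_{n+1}}{a_n} \right| - \ell \right| < b - \ell \\
&\Rightarrow -(b - \ell) < \left| \frac{a_{n+1}}{a_n} \right| - \ell < b - \ell \\
&\Rightarrow \left| \frac{a_{n+1}}{a_n} \right| < b \quad \text{(ignore the other inequality)} \\
&\Rightarrow |a_{n+1}| < b|a_n|.
\end{aligned}$$
So for all $n > N$,
$$|a_n| < b|a_{n-1}| < b^2|a_{n-2}| < \ldots < b^{n-N}|a_N| = b^n (b^{-N}|a_N|).$$
Now $\sum b^n$ is a convergent geometric series (since $0 < b < 1$), and therefore $\sum b^{-N}|a_N| b^n$
converges as well. It now follows from the comparison test that $\sum_{k=N}^{\infty} |a_k|$ converges,
hence so does $\sum_{k=1}^{\infty} |a_k|$, i.e. $\sum a_k$ is absolutely convergent.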
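(b) Now consider the case $\ell > 1$.
Choose $b \in \mathbb{R}$ such that $1 < b < \ell$. (Again $b = \dfrac{\ell + 1}{2}$ would do.) Then $\ell - b > 0$ so
there exists $N \in \mathbb{N}^+$ so that for all $n \in \mathbb{N}^+$,
$$\begin{aligned}
n \ge N &\Rightarrow \left| \left| \frac{a_{n+1}}{a_n} \right| - \ell \right| < \ell - b \\
&\Rightarrow -(\ell - b) < \left| \frac{a_{n+1}}{a_n} \right| - \ell < \ell - b \\
&\Rightarrow b < \left| \frac{a_{n+1}}{a_n} \right| \quad \text{(ignore the other inequality)} \\
&\Rightarrow |a_{n+1}| > b|a_n|.
\end{aligned}$$
Since $b > 1$, $b^{n-N}|a_N| > |a_N| > 0$ for all $n > N$. So for any $n \in \mathbb{N}^+$, if $n > N$ then
$$|a_n| > b|a_{n-1}| > b^2|a_{n-2}| > \ldots > b^{n-N}|a_N| > |a_N|.$$
Now $|a_N| > 0$, so this shows that $a_n \not\to 0$ as $n \to \infty$. So, by Corollary 4.2.2, $\sum a_k$
diverges.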
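(c) To see that the case $\ell = 1$ is inconclusive, consider the series $\sum \dfrac{1}{k}$ and $\sum \dfrac{1}{k^2}$.

If $\sum a_k = \sum \dfrac{1}{k}$ then $\left| \dfrac{a_{n+1}}{a_n} \right| = \dfrac{1/(n+1)}{1/n} = \dfrac{n}{n+1} \to 1$.

If $\sum a_k = \sum \dfrac{1}{k^2}$ then $\left| \dfrac{a_{n+1}}{a_n} \right| = \dfrac{1/(n+1)^2}{1/n^2} = \dfrac{n^2}{(n+1)^2} \to 1$.

So in both cases we have that $\lim_{n \to \infty} \left| \dfrac{a_{n+1}}{a_n} \right| = 1$, but $\sum \dfrac{1}{k^2}$ converges, whereas $\sum \dfrac{1}{k}$
diverges.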
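
The inconclusive case can also be seen numerically. In this short sketch (the helper `ratio` and the two lambda names are introduced here for illustration only), the ratios $|a_{n+1}/a_n|$ for $\sum 1/k$ and $\sum 1/k^2$ both creep up to 1, even though one series diverges and the other converges.

```python
def ratio(term, n):
    """Return |a_{n+1} / a_n| for the sequence given by term."""
    return abs(term(n + 1) / term(n))


harmonic = lambda n: 1.0 / n       # terms of the divergent series sum 1/k
squares = lambda n: 1.0 / n**2     # terms of the convergent series sum 1/k^2

for n in (10, 1000, 100000):
    print(n, ratio(harmonic, n), ratio(squares, n))
```

Both columns approach 1 from below, so the limit alone carries no information about convergence here.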

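Example 4.3.9 We show that, for all $r \in \mathbb{R}$,
$$\sum_{k=1}^{\infty} \frac{r^k}{k!} = \frac{r}{1!} + \frac{r^2}{2!} + \frac{r^3}{3!} + \frac{r^4}{4!} + \ldots$$
is absolutely convergent.
If $r = 0$, the result is clear.
If $r \ne 0$, apply the Ratio Test:
$$\left| \frac{a_{n+1}}{a_n} \right| = \left| \frac{r^{n+1}/(n+1)!}{r^n/n!} \right| = \left| \frac{r^{n+1}}{r^n} \right| \cdot \frac{n!}{(n+1)!} = \frac{|r|}{n+1} \to 0.$$
Since $\lim_{n \to \infty} \left| \dfrac{a_{n+1}}{a_n} \right| = 0 < 1$, $\sum \dfrac{r^k}{k!}$ converges absolutely (and so, of course,
converges).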

Example 4.3.10 Consider the series
$$\sum_{k=2}^{\infty} \frac{1}{(\ln k)^k} = \frac{1}{(\ln 2)^2} + \frac{1}{(\ln 3)^3} + \frac{1}{(\ln 4)^4} + \ldots.$$
(We begin at $k = 2$ simply because $(\ln 1)^1 = 0$, so we would be dividing by 0 if we
used $k = 1$.)
In this case $a_n = \dfrac{1}{(\ln n)^n}$, so
$$\frac{a_{n+1}}{a_n} = \frac{1/(\ln(n+1))^{n+1}}{1/(\ln n)^n} = \frac{(\ln n)^n}{(\ln(n+1))^{n+1}} = \frac{1}{\ln(n+1)} \cdot \left( \frac{\ln n}{\ln(n+1)} \right)^n.$$
(Why can we drop the absolute value signs?)

Now $\dfrac{\ln n}{\ln(n+1)} \le 1$ and so $\left( \dfrac{\ln n}{\ln(n+1)} \right)^n \le 1$ as well; therefore
$$0 \le \frac{a_{n+1}}{a_n} \le \frac{1}{\ln(n+1)}.$$
As $n \to \infty$, $\dfrac{1}{\ln(n+1)} \to 0$ so, by the Sandwich Theorem, $\dfrac{a_{n+1}}{a_n} \to 0$.

According to the ratio test $\sum_{k=2}^{\infty} \dfrac{1}{(\ln k)^k}$ converges.

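Example 4.3.11 Consider the series
$$\sum a_k = \sum \frac{e^k}{k^2} = e + \frac{e^2}{2^2} + \frac{e^3}{3^2} + \frac{e^4}{4^2} + \ldots.$$
For this series,
$$\left| \frac{a_{n+1}}{a_n} \right| = \frac{e^{n+1}/(n+1)^2}{e^n/n^2} = \frac{e^{n+1}}{e^n} \cdot \frac{n^2}{(n+1)^2} = e \left( \frac{1}{1 + 1/n} \right)^2 \to e > 1.$$
By the ratio test, $\sum a_k$ diverges.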

Example 4.3.12 For $r \in \mathbb{R}$, consider the series
$$\sum_{k=1}^{\infty} a_k = \sum_{k=1}^{\infty} k r^k = r + 2r^2 + 3r^3 + 4r^4 + \ldots.$$
If $r = 0$ every term is zero and the series clearly converges, so suppose $r \ne 0$.
To use the ratio test, we calculate
$$\left| \frac{a_{n+1}}{a_n} \right| = \left| \frac{(n+1) r^{n+1}}{n r^n} \right| = \left( 1 + \frac{1}{n} \right) |r| \to |r|.$$
Whether the series converges or diverges clearly depends on the value of $r$.

If $|r| < 1$, $\sum a_k$ converges absolutely (and hence converges).
If $|r| > 1$, $\sum a_k$ diverges.
If $|r| = 1$, the ratio test does not help, so we have to think of something else.
If $r = 1$, $\sum a_k = 1 + 2 + 3 + \ldots$. Here $a_n = n$. Obviously $a_n \not\to 0$, so, by
Corollary 4.2.2, $\sum a_k$ diverges.
If $r = -1$, $\sum a_k = -1 + 2 - 3 + 4 - \ldots$. Here $a_n = n(-1)^n$. Again $a_n \not\to 0$, so $\sum a_k$
diverges.
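
The behaviour described in Example 4.3.12 can be explored numerically. The sketch below is illustrative only: `partial_sum` and `ratio` are names introduced here, and the closed form $r/(1-r)^2$ for the sum when $|r| < 1$ is a standard identity that is not derived in these notes.

```python
def partial_sum(r, n):
    """n-th partial sum of the series sum_{k>=1} k * r^k."""
    return sum(k * r**k for k in range(1, n + 1))


def ratio(r, n):
    """|a_{n+1} / a_n| for a_n = n * r^n, which equals (1 + 1/n) * |r|."""
    return abs((n + 1) * r ** (n + 1) / (n * r**n))


r = 0.5
print(ratio(r, 1000))        # close to |r|
print(partial_sum(r, 200))   # close to r / (1 - r)^2, a standard closed form

# For r = -1 the terms n * (-1)^n do not tend to 0, and the partial sums oscillate.
print([partial_sum(-1, n) for n in range(1, 9)])
```

For $|r| \ge 1$ the partial sums do not settle down, in line with the $n$-th term test used in the example.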

Remark: A careful look at the proof of the ratio test shows that it is not essential
for $\lim_{n \to \infty} \left| \dfrac{a_{n+1}}{a_n} \right|$ to exist. We can in fact use much the same argument to show that
the following more general version of the ratio test holds.

Theorem 4.3.13 (Ratio test) Let $(a_n)$ be a sequence of non-zero real numbers.

(a) If there is an $N \in \mathbb{N}^+$ and an $0 < \ell < 1$ such that $\dfrac{|a_{n+1}|}{|a_n|} \le \ell$ for $n \ge N$, then
$\sum a_n$ converges absolutely.

(b) If there is an $N \in \mathbb{N}^+$ such that $\dfrac{|a_{n+1}|}{|a_n|} \ge 1$ for $n \ge N$, then $\sum a_n$ diverges.

Proof: Exercise.
The proof of the next test is quite similar to that of the ratio test, and we
therefore leave it as an exercise. It also has a limit version, which is easier to apply.

Theorem 4.3.14 (Root test) Let $(a_n)$ be a real sequence.

(a) If there is an $N \in \mathbb{N}^+$ and an $0 < \ell < 1$ such that $\sqrt[n]{|a_n|} \le \ell$ for $n \ge N$, then
$\sum a_n$ converges absolutely.

(b) If there is an $N \in \mathbb{N}^+$ and an $\ell > 1$ such that $\sqrt[n]{|a_n|} \ge \ell$ for $n \ge N$, then $\sum a_n$
diverges.

(c) In particular, if $\lim_{n \to \infty} \sqrt[n]{|a_n|} = \ell$ exists, then $\sum a_n$ converges absolutely if
$\ell < 1$ and diverges if $\ell > 1$. If $\ell = 1$, the test is inconclusive.

Proof: Exercise.

Example 4.3.15 In Example 4.3.12 we applied the limit ratio test to the series
$\sum_{k=1}^{\infty} k r^k$. We now try to apply the limit root test to the same series. We have
$a_n = n r^n$, so
$$\lim_{n \to \infty} \sqrt[n]{|a_n|} = \lim_{n \to \infty} \sqrt[n]{n}\,|r| = |r|.$$
It follows from the root test that the series converges for $|r| < 1$ and diverges for
$|r| > 1$.

The last test we consider in this section can be used only for non-negative series,
and uses improper integrals.

Theorem 4.3.16 (Integral Test) Let $f : [1, \infty) \to \mathbb{R}$ be a decreasing function
such that $f(x) \ge 0$ for all $x \ge 1$, and put $a_k = f(k)$ for all $k \in \mathbb{N}^+$.
Then $\sum_{k=1}^{\infty} a_k$ converges if and only if
$$\int_1^{\infty} f(x)\,dx = \lim_{n \to \infty} \int_1^n f(x)\,dx$$
exists.

Proof: The following graph illustrates the inequalities that we will need:

Figure 4.1: Graph illustrating the proof of the integral test.



Since $f$ is decreasing, we have that for all $k \in \mathbb{N}^+$,
$$a_{k+1} = f(k+1) = f(k+1)[(k+1) - k] \le \int_k^{k+1} f(x)\,dx \le f(k)[(k+1) - k] = f(k) = a_k.$$
Then for every $n \in \mathbb{N}^+$,
$$\sum_{k=1}^{n} a_{k+1} \le \sum_{k=1}^{n} \int_k^{k+1} f(x)\,dx \le \sum_{k=1}^{n} a_k. \qquad (*)$$
Let us write
$$\begin{aligned}
s_n &= \sum_{k=1}^{n} a_{k+1} \\
t_n &= \sum_{k=1}^{n} \int_k^{k+1} f(x)\,dx = \int_1^2 f(x)\,dx + \ldots + \int_n^{n+1} f(x)\,dx = \int_1^{n+1} f(x)\,dx \\
u_n &= \sum_{k=1}^{n} a_k.
\end{aligned}$$
The sequences $(s_n)$, $(t_n)$, $(u_n)$ are all increasing, because $f$ is non-negative, and from (*)
we have
$$s_n \le t_n \le u_n \quad \text{for all } n \in \mathbb{N}^+ \qquad (**)$$
If $\sum a_k$ converges, then $(u_n)$ is bounded above. But (**) then shows that $(t_n)$ is
bounded above, and since it is an increasing sequence, it converges.
Hence $\lim_{n \to \infty} \int_1^{n+1} f(x)\,dx = \lim_{n \to \infty} t_n$ exists.
If $\lim_{n \to \infty} \int_1^n f(x)\,dx$ exists, then $(t_n)$ converges and so is bounded above. But (**)
then shows that $(s_n)$ is bounded above, and so converges, i.e. $\sum_{k=2}^{\infty} a_k$ converges.
Then $\sum_{k=1}^{\infty} a_k$ also converges, by Proposition 4.1.7.
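
The chain of inequalities (**) can be checked for a concrete function. The following is a minimal sketch assuming $f(x) = 1/x^2$, a choice made here only for illustration; the names `integral_1_to` and `sandwich` are likewise hypothetical. For this $f$ the integral is known exactly, $t_n = 1 - \frac{1}{n+1}$.

```python
def f(x):
    """Example integrand: decreasing and non-negative on [1, infinity)."""
    return 1.0 / x**2


def integral_1_to(b):
    """Exact value of the integral of 1/x^2 from 1 to b (valid for this f only)."""
    return 1.0 - 1.0 / b


def sandwich(n):
    """Return (s_n, t_n, u_n) as defined in the proof of the integral test."""
    s_n = sum(f(k + 1) for k in range(1, n + 1))
    t_n = integral_1_to(n + 1)
    u_n = sum(f(k) for k in range(1, n + 1))
    return s_n, t_n, u_n


for n in (1, 10, 100, 1000):
    s_n, t_n, u_n = sandwich(n)
    assert s_n <= t_n <= u_n
    print(n, s_n, t_n, u_n)
```

Because $t_n$ stays below 1 here, the sequence $(s_n)$ is bounded above, which is exactly the step the proof uses to conclude convergence.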

Example 4.3.17 The p-series. Let $p \in \mathbb{R}$. The series
$$\sum \frac{1}{k^p} = 1 + \frac{1}{2^p} + \frac{1}{3^p} + \ldots$$
is known as the p-series. There is, of course, not only one series; for each $p$ we have
a series, and it is of interest to know for which $p$ the series converges, and for which
$p$ it diverges. We have already seen that the series converges for all $p \ge 2$. With the
integral test we can treat all $p \ge 0$ at the same time.

Let $f(x) = \dfrac{1}{x^p}$, for $p \ge 0$. Then $f$ is positive and decreasing on $[1, \infty)$.
Now if $p \ne 1$,
$$\int_1^n f(x)\,dx = \int_1^n \frac{1}{x^p}\,dx = \left[ \frac{x^{-p+1}}{-p+1} \right]_1^n = \frac{1}{1-p} \left( \frac{1}{n^{p-1}} - 1 \right).$$
If $p = 1$,
$$\int_1^n f(x)\,dx = \int_1^n \frac{1}{x}\,dx = [\ln x]_1^n = \ln n.$$
We now have to consider three cases:

• If $p > 1$, then $p - 1 > 0$, and so $\dfrac{1}{n^{p-1}} \to 0$, so $\lim_{n \to \infty} \int_1^n f(x)\,dx = \dfrac{-1}{1-p}$, hence
$\sum \dfrac{1}{k^p}$ converges.

• If $p = 1$, $\lim_{n \to \infty} \int_1^n f(x)\,dx = \lim_{n \to \infty} \ln n = \infty$, so $\sum \dfrac{1}{k}$ diverges.
(This is just the harmonic series, so this result is not new.)

• If $0 \le p < 1$, then $1 - p > 0$, so $\dfrac{1}{n^{p-1}} = n^{1-p} \to \infty$, so $\lim_{n \to \infty} \int_1^n \dfrac{1}{x^p}\,dx = \infty$, hence
$\sum \dfrac{1}{k^p}$ diverges.

If $p < 0$, the function $f(x) = \dfrac{1}{x^p}$ is increasing on $[0, \infty)$, and therefore the integral
test cannot be used. We leave it as an exercise for you to show that for $p < 0$, the
series $\sum \dfrac{1}{k^p}$ diverges.
We summarise our results for the different cases of the p-series:

• $\sum \dfrac{1}{k^p}$ converges for $p > 1$

• $\sum \dfrac{1}{k^p}$ diverges for $p \le 1$
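
The three cases of the p-series can be compared side by side. This sketch is for illustration only (the function names `integral` and `partial_sum` are introduced here): it evaluates the integral formula from the example and checks the bounds $\int_1^{n+1} f \le \sum_{k=1}^{n} a_k \le a_1 + \int_1^{n} f$ that follow from the proof of the integral test.

```python
import math


def integral(p, n):
    """Integral of x^(-p) from 1 to n, using the formulas from the example."""
    if p == 1:
        return math.log(n)
    return (n ** (1 - p) - 1) / (1 - p)


def partial_sum(p, n):
    """n-th partial sum of the p-series."""
    return sum(1.0 / k**p for k in range(1, n + 1))


for p in (0.5, 1.0, 2.0):
    for n in (10, 1000, 100000):
        total = partial_sum(p, n)
        # Bounds obtained from the inequalities in the proof of the integral test.
        assert integral(p, n + 1) <= total <= 1.0 + integral(p, n)
        print(p, n, total, integral(p, n))
```

For $p = 2$ both columns level off, while for $p = 1$ and $p = 0.5$ they grow without bound, matching the summary above.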

Example 4.3.18 Consider the series
$$\sum_{k=2}^{\infty} \frac{1}{k \ln k} = \frac{1}{2 \ln 2} + \frac{1}{3 \ln 3} + \frac{1}{4 \ln 4} + \ldots.$$
Let $f(x) = \dfrac{1}{x \ln x}$ for $x \ge 2$.
Then $f$ is decreasing and $f(x) \ge 0$ for $x \in [2, \infty)$, so the Integral Test applies. In
this case we begin with $n = 2$, since $\ln 1 = 0$ so $f(1)$ would be undefined.
$$\int_2^n f(x)\,dx = \int_2^n \frac{1}{x \ln x}\,dx = [\ln(\ln x)]_2^n = \ln(\ln n) - \ln(\ln 2) \to \infty.$$
Since $\lim_{n \to \infty} \int_2^n f(x)\,dx$ does not exist, $\sum_{k=2}^{\infty} \dfrac{1}{k \ln k}$ diverges.

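Example 4.3.19 Consider the series
$$\sum_{k=2}^{\infty} \frac{1}{k (\ln k)^2} = \frac{1}{2 (\ln 2)^2} + \frac{1}{3 (\ln 3)^2} + \frac{1}{4 (\ln 4)^2} + \ldots.$$
Let $f(x) = \dfrac{1}{x (\ln x)^2}$ for $x \ge 2$. Then $f$ is positive and decreasing on $[2, \infty)$.
$$\int_2^n f(x)\,dx = \int_2^n \frac{1}{x (\ln x)^2}\,dx = \left[ \frac{-1}{\ln x} \right]_2^n = \frac{1}{\ln 2} - \frac{1}{\ln n} \to \frac{1}{\ln 2}.$$
By the Integral Test, $\sum_{k=2}^{\infty} \dfrac{1}{k (\ln k)^2}$ converges.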

Summary:
In the last two sections we have looked at a number of tests which enable us to
determine whether a given infinite series converges or diverges. It is often not so
easy to decide which test would be the most appropriate. In the place of the usual
short summary, we give here a strategy for tackling this kind of problem.
Suppose you are given an infinite series $\sum_{n=1}^{\infty} a_n$ and have to decide whether it
converges or diverges.

1. The first, easy test that you can apply to any series is the n-th term test
for divergence: If $(a_n)$ does not tend to 0, then $\sum a_n$ diverges.
WARNING: This is only a test for divergence. If we find that $a_n \to 0$, this
does not help us to decide whether the series converges or diverges. It could
do either.
If $a_n \to 0$, you will have to look for another test to decide whether the series
$\sum a_n$ converges or diverges. Here are the possibilities:

2. If the series is an alternating series (i.e. the terms are alternately positive and
negative) we may be able to apply the Leibniz Test: If $a_n = (-1)^{n+1} p_n$,
where $(p_n)$ is a decreasing, non-negative sequence converging to 0, then the
series $\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} (-1)^{n+1} p_n$ is convergent.
WARNING: This is only a test for convergence, not for divergence. If
the sequence $(p_n)$ satisfies all the conditions, the series $\sum_{n=1}^{\infty} (-1)^{n+1} p_n$ converges.
If the sequence $(p_n)$ does not satisfy all the conditions, this test does
not tell us anything; the series $\sum_{n=1}^{\infty} (-1)^{n+1} p_n$ could converge or diverge.
However, if $(p_n)$ does not converge to 0, then $((-1)^{n+1} p_n)$ will also not converge
to 0 (why?), and so it follows from the n-th term test for divergence that
$\sum_{n=1}^{\infty} (-1)^{n+1} p_n$ is divergent.
3. If $a_n \ne 0$ for every $n \in \mathbb{N}^+$, we can use the ratio test; we give the limit
version:
If $\lim_{n \to \infty} \left| \dfrac{a_{n+1}}{a_n} \right| = \ell$ exists, then
(a) if $\ell < 1$, the series $\sum a_n$ converges;
(b) if $\ell > 1$, the series $\sum a_n$ diverges;
(c) if $\ell = 1$, the test does not tell us anything.
4. If the n-th term of the series contains a power of $n$, it may be more convenient
to use the root test; we give the limit version:
If $\lim_{n \to \infty} \sqrt[n]{|a_n|} = \ell$ exists, then
(a) if $\ell < 1$, the series $\sum a_n$ converges;
(b) if $\ell > 1$, the series $\sum a_n$ diverges;
(c) if $\ell = 1$, the test does not tell us anything.
5. If $\sum a_n$ is a non-negative series (i.e. $a_n \ge 0$ for every $n \in \mathbb{N}^+$), there are two
further tests to choose from:
(a) The comparison test: Suppose $a_n \ge 0$ and $b_n \ge 0$ for all $n \in \mathbb{N}^+$.
i. If $a_n \le b_n$ for all $n \in \mathbb{N}^+$ and $\sum b_n$ converges, then $\sum a_n$ converges.
ii. If $a_n \ge b_n$ and $\sum b_n$ diverges, then $\sum a_n$ diverges.
(b) The integral test: Let $f$ be a function that is non-negative and decreasing
on $[1, \infty)$. If we put $a_n = f(n)$, then $\sum a_n$ converges if and only
if $\int_1^{\infty} f(x)\,dx = \lim_{n \to \infty} \int_1^n f(x)\,dx$ exists.
6. If the sequence $(a_n)$ has both positive and negative terms and the Leibniz test
cannot be used, we can look at the series $\sum |a_n|$ (this is, of course, a different
series). Since $|a_n| \ge 0$ for all $n \in \mathbb{N}^+$, we can use either of the above two tests
for non-negative series. If $a_n \ne 0$ for every $n$, we can of course also use the ratio test. If
the series $\sum |a_n|$ converges, the series $\sum a_n$ will also converge. If the series
$\sum |a_n|$ diverges, the series $\sum a_n$ could still converge.

7. If the series $\sum |a_n|$ converges, the series $\sum a_n$ is called absolutely convergent.
If the series $\sum |a_n|$ diverges, but the series $\sum a_n$ converges, the series
$\sum a_n$ is called conditionally convergent.

Exercises

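1. Classify the following series as convergent or divergent. Give reasons for your
answers.
(a) $\sum e^{-k}$
(b) $\sum \dfrac{1 + 2 + \ldots + k}{k^2}$
(c) $\sum \dfrac{k}{k^3 + 1}$
(d) $\sum \dfrac{1}{3^k + 1}$
(e) $\sum \dfrac{e^k}{k!}$
(f) $\sum \dfrac{\ln k}{2k^3 - 1}$
(g) $\sum \dfrac{(k!)^2}{(2k)!}$
(h) $\sum \left( \sqrt{1 + k^2} - k \right)$
(i) $\sum \dfrac{3^k}{k 5^k}$
(j) $\sum_{k=2}^{\infty} \dfrac{1}{k (\ln k)^p}$ for $p > 0$
(k) $\sum_{k=2}^{\infty} \dfrac{1}{k (\ln k)(\ln(\ln k))^p}$ for $p > 0$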

2. Classify each of the following as absolutely convergent, conditionally convergent
or divergent. Give reasons for your answers.
(a) $\sum_{k=1}^{\infty} \dfrac{(-1)^k k}{1 + k^2}$
(b) $\sum_{k=1}^{\infty} \dfrac{(-1)^k k^2}{1 + k^2}$
(c) $\sum \dfrac{5(-1)^{k+1}}{k^2 - 21}$
(d) $\sum \left( \dfrac{1}{k!} - \dfrac{1}{k} \right)$
3. Let $a \in \mathbb{R}$ such that $a > 1$. Show that the series
$$a - a^{1/2} + a^{1/3} - a^{1/4} + \ldots$$
diverges, but the series
$$(a - a^{1/2}) + (a^{1/3} - a^{1/4}) + (a^{1/5} - a^{1/6}) + \ldots$$
converges. [Hint: For the second series, first show that there is a telescoping
series similar to it that converges.]
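4. The series $\sum \dfrac{1}{k}$, $\sum \dfrac{1}{2k}$ and $\sum \dfrac{1}{2k-1}$ all diverge. Investigate the convergence
or divergence of
(a) $\sum \left( \dfrac{1}{k} - \dfrac{1}{2k} \right)$    (b) $\sum \left( \dfrac{1}{2k} - \dfrac{1}{2k-1} \right)$.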

5. If possible, find examples of series with the following properties. If it is impossible,
explain why that is so.
(a) $\sum a_k$, $\sum b_k$ and $\sum a_k b_k$ all converge.
(b) $\sum a_k$ and $\sum b_k$ diverge, but $\sum a_k b_k$ converges.
(c) $\sum a_k$ converges, $\sum b_k$ diverges and $\sum a_k b_k$ converges.
(d) $\sum a_k$ converges, $\sum b_k$ diverges and $\sum a_k b_k$ diverges.
(e) $\sum a_k$ and $\sum b_k$ converge, but $\sum a_k b_k$ diverges.

6. Adapt the proof of the Ratio Test to show that if $\sum a_k$ is a series for which
$\left| \dfrac{a_{n+1}}{a_n} \right| \to \infty$, then $\sum a_k$ diverges. Apply the result above to show that $\sum \dfrac{k!}{e^k}$
diverges.

7. This question is for those of you interested in decimal expansions and the
structure of the real numbers. We have already seen, in Proposition 4.3.4, that
every infinite decimal expansion represents a real number. We now show that
every real number can be represented by a decimal expansion. We do this for
real numbers between 0 and 1, but the method applies to all real numbers.

(a) Let $r$ be any real number. If $0 \le r < 10$, show that there is an $a \in \mathbb{N}$
such that
$$0 \le a \le r < a + 1 \le 10 \qquad (1)$$
(b) Let $x \in [0, 1)$. Show that there is an $a_1 \in \{0, 1, 2, \ldots, 9\}$ such that
$$\frac{a_1}{10} \le x < \frac{a_1}{10} + \frac{1}{10}.$$
[Hint: Consider $10x$ and apply (1).]
(c) Show that if
$$\sum_{i=1}^{k} \frac{a_i}{10^i} \le x < \sum_{i=1}^{k} \frac{a_i}{10^i} + \frac{1}{10^k} \qquad (2)$$
where $a_i \in \{0, 1, 2, \ldots, 9\}$ for each $i$, then there is an $a_{k+1} \in \{0, 1, 2, \ldots, 9\}$
such that
$$\sum_{i=1}^{k+1} \frac{a_i}{10^i} \le x < \sum_{i=1}^{k+1} \frac{a_i}{10^i} + \frac{1}{10^{k+1}}.$$
[Hint: Multiply (2) by $10^k$ and juggle things so that you can apply (1).]
(d) By induction, for all $i \in \mathbb{N}^+$ there exist $a_i \in \{0, 1, 2, \ldots, 9\}$ such that, for
all $n \in \mathbb{N}^+$,
$$\sum_{i=1}^{n} \frac{a_i}{10^i} \le x < \sum_{i=1}^{n} \frac{a_i}{10^i} + \frac{1}{10^n}.$$
(e) Let $s_n = \sum_{i=1}^{n} \dfrac{a_i}{10^i}$ and show that $s_n \to x$. Then you have proved that
$\sum_{i=1}^{\infty} \dfrac{a_i}{10^i} = x$, or more informally, $x = 0.a_1 a_2 a_3 a_4 a_5 \ldots$.

8. Prove Theorem 4.3.13

9. Prove Theorem 4.3.14



4.4 Regroupings and rearrangements of series


As we have seen, infinite series sometimes behave very differently to finite series (or
sums, as we usually call them). For finite series the question of convergence never
arises. Infinite series can converge or diverge. But there are properties that hold
for finite sums, but may fail to hold for convergent infinite series. To make matters
more complicated, they do not always fail to hold; whether they hold or not will
depend on the type of convergence.
We now look at two properties of finite sums: associativity and commutativity
of addition. For example, for finite sums
(a1 + a2 ) + (a3 + a4 ) = a1 + (a2 + a3 ) + a4 (addition is associative)
a1 + a2 + a3 + a4 = a2 + a4 + a3 + a1 (addition is commutative).
We now ask the corresponding questions for infinite series:
• Given a series $\sum a_k = a_1 + a_2 + a_3 + \ldots$ consider a regrouping
$$(a_1 + a_2) + (a_3 + a_4 + a_5) + a_6 + \ldots,$$
for example. Will this new series converge if $\sum a_k$ does? Will it have the same
sum? (Note that in a regrouping the order of the terms remains unchanged.)

• Given a series $\sum a_k = a_1 + a_2 + a_3 + \ldots$ consider a rearrangement
$$a_3 + a_1 + a_4 + a_2 + a_5 + \ldots,$$
for example. Will this new series converge if $\sum a_k$ does? Will it have the same
sum? (Note that in a rearrangement the order of the terms changes.)

We first look at regroupings.


We have seen before that the series
$$\sum (-1)^{k+1} = 1 - 1 + 1 - 1 + 1 - 1 + \ldots$$
diverges, but that different groupings of its terms give series that converge (possibly
to different sums).
On the other hand, we have also seen an example where regrouping does not
change the sum. It follows from the proof of Proposition 4.1.7 that, if $\sum a_k$ converges,
then
$$\sum_{k=1}^{\infty} a_k = (a_1 + \ldots + a_m) + \sum_{k=m+1}^{\infty} a_k$$
for any $m \in \mathbb{N}^+$. Here grouping the first $m$ terms together does not change the sum.
There is one obvious difference between these two examples: the one series
diverges, the other converges.

Theorem 4.4.1 Given a convergent series $\sum a_k$, we can regroup consecutive terms
of the series without changing the sum of the series.

Proof: Suppose we are given that $\sum a_k = \ell$ for some $\ell \in \mathbb{R}$. Consider another
series $b_1 + b_2 + b_3 + \ldots$ formed from $\sum a_k$ by regrouping the terms so that $b_1$ is the
sum of one or more of the initial terms in $\sum a_k$, $b_2$ is obtained by adding one or
more of the next terms to come along, and so on. We are not allowed to change the
order of the terms, only to regroup them. For convenience, we write:
$$\begin{aligned}
b_1 &= a_1 + a_2 + \ldots + a_{k_1} \\
b_2 &= a_{k_1+1} + a_{k_1+2} + \ldots + a_{k_2} \\
b_3 &= a_{k_2+1} + a_{k_2+2} + \ldots + a_{k_3} \\
&\;\;\vdots
\end{aligned}$$
Note that $k_m \ge m$ for all $m \in \mathbb{N}^+$. Denote the partial sums of the two series by
$s_n = a_1 + \ldots + a_n$ and $t_m = b_1 + \ldots + b_m$. Then
$$t_m = b_1 + b_2 + \ldots + b_m = a_1 + a_2 + \ldots + a_{k_m} = s_{k_m}.$$
We are given that $\sum a_k = \ell$, i.e. that $s_k \to \ell$ as $k \to \infty$.
We must show that $\sum b_n = \ell$, i.e. that $t_m \to \ell$ as $m \to \infty$.
We know $(s_k)$ converges to $\ell$. But for each $m \in \mathbb{N}^+$, $t_m = s_{k_m}$, and $(s_{k_m})$ is a
subsequence of $(s_k)$, and therefore must converge to $\ell$ as well. But then $\sum b_n = \ell$
as needed.
We look at rearrangements next. Here the story is considerably more compli-
cated. We begin with an example.

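Example 4.4.2 Consider the series
$$\sum \frac{(-1)^{k+1}}{k} = 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \ldots.$$
We have seen in Example 4.2.8 that this series converges, and that its sum is not
zero (it equals $\ln 2$). Denote its sum by $s$. Now rearrange the series as follows:
$$1 - \frac{1}{2} - \frac{1}{4} + \frac{1}{3} - \frac{1}{6} - \frac{1}{8} + \frac{1}{5} - \frac{1}{10} - \frac{1}{12} + \frac{1}{7} - \frac{1}{14} - \frac{1}{16} + \ldots.$$
Denote the n-th partial sum of this new series by $t_n$. Then
$$\begin{aligned}
t_{3n} &= 1 - \frac{1}{2} - \frac{1}{4} + \ldots + \frac{1}{2n-1} - \frac{1}{4n-2} - \frac{1}{4n} \\
&= \left( 1 + \frac{1}{3} + \ldots + \frac{1}{2n-1} \right) - \left( \frac{1}{2} + \frac{1}{6} + \ldots + \frac{1}{4n-2} \right) - \left( \frac{1}{4} + \frac{1}{8} + \ldots + \frac{1}{4n} \right) \\
&= \left( 1 + \frac{1}{3} + \ldots + \frac{1}{2n-1} \right) - \frac{1}{2} \left( 1 + \frac{1}{3} + \ldots + \frac{1}{2n-1} \right) - \frac{1}{2} \left( \frac{1}{2} + \frac{1}{4} + \ldots + \frac{1}{2n} \right) \\
&= \frac{1}{2} \left( 1 - \frac{1}{2} + \frac{1}{3} - \frac{1}{4} + \ldots + \frac{1}{2n-1} - \frac{1}{2n} \right).
\end{aligned}$$
(Note that we can of course rearrange a finite sum any way we like.) Now $t_{3n} \to \frac{1}{2}s$,
because $1 - \dfrac{1}{2} + \dfrac{1}{3} - \ldots + \dfrac{1}{2n-1} - \dfrac{1}{2n} = s_{2n}$, the 2n-th partial sum of the original
series.
Also $t_{3n-1} = t_{3n} + \dfrac{1}{4n} \to \dfrac{1}{2}s$ and $t_{3n-2} = t_{3n} + \dfrac{1}{4n} + \dfrac{1}{4n-2} \to \dfrac{1}{2}s$, so the new
series converges to $\frac{1}{2}s$. This is definitely not the same as $s$, because $s \ne 0$.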
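
The halving of the sum in Example 4.4.2 is easy to observe numerically. The sketch below is illustrative (the function names are introduced here, not taken from the notes): it adds up $n$ blocks of the rearranged series, each of the form $\frac{1}{2j-1} - \frac{1}{4j-2} - \frac{1}{4j}$, and compares the result with half the corresponding partial sum of the original series and with $\frac{1}{2}\ln 2$.

```python
import math


def rearranged_partial_sum(n):
    """t_{3n}: the sum of n blocks 1/(2j-1) - 1/(4j-2) - 1/(4j)."""
    total = 0.0
    for j in range(1, n + 1):
        total += 1.0 / (2 * j - 1) - 1.0 / (4 * j - 2) - 1.0 / (4 * j)
    return total


def original_partial_sum(m):
    """s_m: the m-th partial sum of 1 - 1/2 + 1/3 - 1/4 + ..."""
    return sum((-1) ** (k + 1) / k for k in range(1, m + 1))


n = 100000
print(rearranged_partial_sum(n))          # t_{3n}
print(0.5 * original_partial_sum(2 * n))  # (1/2) s_{2n}
print(0.5 * math.log(2))                  # (1/2) ln 2
```

The first two printed values agree up to rounding, reflecting the identity $t_{3n} = \frac{1}{2}s_{2n}$ derived above, and both are close to $\frac{1}{2}\ln 2$.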

What is happening here? The example above shows that it is possible to take a
convergent series, rearrange the terms and get a series that converges to a different
sum. However, this does not always happen. If you experiment with different
rearrangements of
$$\sum \frac{1}{2^{k-1}} = 1 + \frac{1}{2} + \frac{1}{4} + \frac{1}{8} + \ldots$$
you will see that the sum is always 2. What is the difference between these two
series? Well, one difference is that $\sum \dfrac{1}{2^{k-1}}$ contains only positive terms. That is
important (as we shall see), but it is not the whole story, as the next example shows.

Example 4.4.3 Consider the series
$$\sum \frac{(-1)^{k+1}}{2^{k-1}} = 1 - \frac{1}{2} + \frac{1}{4} - \frac{1}{8} + \frac{1}{16} - \ldots.$$
This is a geometric series with sum $\dfrac{1}{1 - (-1/2)} = \dfrac{2}{3}$. All rearrangements of this
series give the same sum. (For a proof of this fact, see the next theorem.)

We get a clue as to what is happening by looking at the sums of the positive
terms and of the negative terms in each case.

In the previous example, the sum of positive terms equals
$$1 + \frac{1}{4} + \frac{1}{16} + \ldots = \frac{1}{1 - 1/4} = \frac{4}{3},$$
while the sum of negative terms equals
$$-\frac{1}{2} - \frac{1}{8} - \frac{1}{32} - \ldots = \frac{-1/2}{1 - 1/4} = -\frac{2}{3}.$$
In Example 4.4.2, the series of positive terms $1 + \dfrac{1}{3} + \dfrac{1}{5} + \ldots$ diverges,
and the series of negative terms $-\dfrac{1}{2} - \dfrac{1}{4} - \dfrac{1}{6} - \ldots$ also diverges.

The difference between the two examples is that $\sum \dfrac{(-1)^{k+1}}{2^{k-1}}$ is absolutely convergent,
whereas $\sum \dfrac{(-1)^{k+1}}{k}$ is conditionally convergent.
For absolutely convergent series all is well:

Theorem 4.4.4 Given an absolutely convergent series $\sum a_k$, we can rearrange the
terms of the series without changing the sum of the series.

Proof: See the exercises for this section.

While absolutely convergent series are very well-behaved with regard to rearrangement
of terms, with conditionally convergent series things can go as dramatically
wrong as we please:

Theorem 4.4.5 If $\sum a_k$ is a conditionally convergent series and $c$ is any real number,
then there is a rearrangement of the terms of $\sum a_k$ such that the resulting series
converges to $c$. There are also rearrangements of the terms for which the rearranged
series will diverge to $\infty$ or $-\infty$.

Proof: We omit the full proof. The basic idea is not difficult. You know that in
a conditionally convergent series, the series formed from its positive terms diverges,
as does the series formed from its negative terms. The terms of the series converge
to 0.
Choose enough positive terms so that your sum is bigger than $c$. Then add enough
negative ones so that the sum is less than $c$ again. Then add positive ones till the
sum is bigger than $c$. In this way, the sum gets closer and closer to $c$.
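
The idea of the proof can be sketched in a few lines for the alternating harmonic series $1 - \frac{1}{2} + \frac{1}{3} - \ldots$. This is only an illustration of the greedy procedure described above, not a proof; the function name `rearrange_to` and the target values are assumptions of this example.

```python
def rearrange_to(c, n_terms):
    """Greedy rearrangement of 1 - 1/2 + 1/3 - ... aimed at the target c.

    While the running total is at most c, take the next unused positive term;
    otherwise take the next unused negative term. Every term is used at most once.
    """
    next_odd = 1    # the next unused positive term is 1/next_odd
    next_even = 2   # the next unused negative term is -1/next_even
    total = 0.0
    used = []
    for _ in range(n_terms):
        if total <= c:
            term = 1.0 / next_odd
            next_odd += 2
        else:
            term = -1.0 / next_even
            next_even += 2
        total += term
        used.append(term)
    return total, used


for target in (1.5, 0.0, -1.0):
    total, _ = rearrange_to(target, 100000)
    print(target, total)
```

Each time the running total crosses the target it overshoots by at most the size of the term just added, and those terms tend to 0; that is why the totals printed above end up close to the chosen targets.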

Summary:
We looked at grouping and rearrangements of infinite series in this section. Grouping
of the terms of a divergent infinite series can change it into a convergent series, and
different groupings can even produce different sums. But groupings of the terms of
a convergent series will not change the sum of the series. Rearrangements of the
terms of an absolutely convergent series will not change the fact that it converges,
or the sum of the series. But the terms of a conditionally convergent series can be
rearranged to converge to any real number, or even to diverge.

Exercises

1. Read the following and answer the questions below.


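Theorem If $\sum a_k$ is absolutely convergent, then for every rearrangement $(b_n)$
of $(a_n)$, $\sum b_k$ is also convergent, and to the same sum.
Proof: For any $n \in \mathbb{N}^+$, let $s_n = a_1 + \ldots + a_n$ and $t_n = b_1 + \ldots + b_n$. (1)
Let $\varepsilon \in \mathbb{R}^+$ be given. (2)
Since $\sum a_k$ converges, there is an $N_1 \in \mathbb{N}^+$ such that, for any $n \in \mathbb{N}^+$,
$$n \ge N_1 \Rightarrow \left| \sum_{k=1}^{\infty} a_k - s_n \right| < \frac{\varepsilon}{2}. \qquad (3)$$
We can also choose $N_2 \in \mathbb{N}^+$ such that
$$\sum_{k=N_2+1}^{\infty} |a_k| < \frac{\varepsilon}{2}. \qquad (4)$$
Let $N = \max\{N_1, N_2\}$ and now choose $M \in \mathbb{N}^+$ such that $M > N$ and
$$\{a_1, \ldots, a_N\} \subseteq \{b_1, \ldots, b_M\}. \qquad (5)$$
If $n > M$, then $|t_n - s_N| \le |a_{N+1}| + |a_{N+2}| + \ldots$ (6)
So
$$\left| \sum_{k=1}^{\infty} a_k - t_n \right| \le \left| \sum_{k=1}^{\infty} a_k - s_N \right| + |t_n - s_N| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \qquad (7)$$
The theorem now follows. (8)

(a) Why does $\sum a_k$ converge?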

(b) Why can we choose N2 so that the inequality in line (4) holds?
(c) Why can M be chosen, as claimed in line (5)?
(d) Explain how the inequality in line (6) follows.
(e) Explain how the expression in line (7) is obtained.
(f) How does the theorem follow in line (8)?

4.5 Power series


So far the terms of the series we have looked at were real constants (that is, real
numbers that are not dependent on a variable). In the last section of this chapter we
consider series with terms that depend on a variable. We can therefore think of the
terms as functions of the variable. This means that for each value of the variable,
we have an infinite series, and each such series can converge or diverge. We will be
interested mainly in determining for which value of the variable the series converges.
The type of series we consider here are called power series, because each term of the
series is a power function (a constant multiple of a power of the variable). Here is
the precise definition:

Definition 4.5.1 A power series in $x$ about the point $a$ is a series of the form
$$\sum_{k=0}^{\infty} a_k (x - a)^k = a_0 + a_1 (x - a) + a_2 (x - a)^2 + \ldots + a_n (x - a)^n + \ldots$$
where $x$ is a real variable, $a \in \mathbb{R}$ and $a_n \in \mathbb{R}$ for every $n \in \mathbb{N}$. The real numbers
$a_n$ are called the coefficients of the power series and do not depend on $x$, and $a$ is
called the centre of the power series.

In the rest of this section we will take $a = 0$. Such a power series has the form
$$\sum_{k=0}^{\infty} a_k x^k = a_0 + a_1 x + a_2 x^2 + \ldots.$$
The results we prove can easily be generalised to the case where $a \ne 0$, by a simple
substitution.
The familiar Taylor series, for instance
$$\begin{aligned}
\ln(1 + x) &= x - \frac{x^2}{2} + \frac{x^3}{3} - \frac{x^4}{4} + \ldots \\
e^x &= 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \ldots \\
\sin x &= x - \frac{x^3}{3!} + \frac{x^5}{5!} - \ldots
\end{aligned}$$
are examples of power series.
Note that it is usual in a power series to begin with $k = 0$, to allow for the term
$a_0 x^0 = a_0$, which is independent of $x$.

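Theorem 4.5.2 Suppose $w \in \mathbb{R}$ and $\sum_{k=0}^{\infty} a_k w^k$ converges.
Then $\sum_{k=0}^{\infty} a_k x^k$ converges absolutely for all $x$ such that $|x| < |w|$.

Proof: Since $\sum_{k=0}^{\infty} a_k w^k$ converges, $a_n w^n \to 0$ as $n \to \infty$. So $(a_n w^n)$ is a bounded
sequence and so we can find $C \in \mathbb{R}^+$ so that, for all $n \in \mathbb{N}$, the inequality $|a_n w^n| \le C$
holds.
Now, for any $x \in \mathbb{R}$ such that $|x| < |w|$, and any $n \in \mathbb{N}$,
$$|a_n x^n| = \left| a_n w^n \cdot \frac{x^n}{w^n} \right| = |a_n w^n| \cdot \left| \frac{x}{w} \right|^n \le C \left| \frac{x}{w} \right|^n.$$
Since $\left| \dfrac{x}{w} \right|$ does not depend on $n$ and $\left| \dfrac{x}{w} \right| < 1$, the series $\sum_{k=0}^{\infty} \left| \dfrac{x}{w} \right|^k$ converges (why?),
so $C \sum_{k=0}^{\infty} \left| \dfrac{x}{w} \right|^k$ also converges and, by the Comparison Test, $\sum_{k=0}^{\infty} |a_k x^k|$ converges
too. This shows that $\sum_{k=0}^{\infty} a_k x^k$ is absolutely convergent.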

Corollary 4.5.3 Suppose $\sum a_k w^k$ diverges.
Then $\sum a_k x^k$ diverges for all $x$ with $|x| > |w|$.

Proof: Suppose that, for some $x \in \mathbb{R}$, $|x| > |w|$ and $\sum a_k x^k$ converged. By
Theorem 4.5.2, $\sum a_k w^k$ would also converge, a contradiction.
In the proof of the next theorem we will make use of the idea of the supremum
of a set (rather than a sequence) of real numbers. If A is a set of real numbers, an
upper bound of A is a real number C such that x ≤ C for all x ∈ A. If A has an
upper bound, we say it is bounded above, and then it also has a least upper bound,
or supremum, denoted by sup A. It is an upper bound with the property that if
c < sup A, there is an x ∈ A such that x > c (see Question 3 in the exercises for
Section 2.4, and Theorem 3.5.12).

Theorem 4.5.4 For a power series $\sum_{k=0}^{\infty} a_k x^k$ exactly one of the following holds:

(a) the series converges for $x = 0$ only;

(b) the series converges for all $x \in \mathbb{R}$;

(c) there exists a positive real number $R$ such that the series converges absolutely if
$|x| < R$ and diverges if $|x| > R$.

(Note this theorem gives no information about the case $|x| = R$.)

Proof: Suppose (a) and (b) do not hold; we show that (c) must then hold. Define
$$R = \sup \left\{ |x| : \sum_{k=0}^{\infty} a_k x^k \text{ converges} \right\}.$$
First we check that this supremum really exists. Since (b) does not hold, there is
a $w \in \mathbb{R}$ for which $\sum_{k=0}^{\infty} a_k w^k$ diverges. Now suppose that $\sum_{k=0}^{\infty} a_k x^k$ converges
for some $x \in \mathbb{R}$. If it were the case that $|w| < |x|$, then $\sum a_k w^k$ would converge
(by Theorem 4.5.2), which it does not. So $|w| \ge |x|$. This shows that the set
$\{ |x| : \sum_{k=0}^{\infty} a_k x^k \text{ converges} \}$ is bounded above, and so has a supremum.

Since (a) does not hold, $R \ne 0$, so $R > 0$.

Now suppose $|x| < R$ for some $x \in \mathbb{R}$. By the definition of a supremum, there exists
$w \in \mathbb{R}$ such that $|x| < |w| \le R$ and $\sum_{k=0}^{\infty} a_k w^k$ converges. Then, by Theorem 4.5.2,
$\sum_{k=0}^{\infty} a_k x^k$ converges absolutely.
If $|x| > R$ for some $x \in \mathbb{R}$, then $\sum_{k=0}^{\infty} a_k x^k$ clearly diverges. (Look at the definition
of $R$ again!)
The number R given in Theorem 4.5.4 is called the radius of convergence of
the series. In general, nothing can be said about convergence in the cases x = R
and x = −R; for a particular series these cases have to be checked separately.
If (a) holds, we write R = 0. If (b) holds, we write R = ∞, and say that the series
has infinite radius of convergence. (This is just a notational convenience; we are not
claiming that ∞ is a real number.)
A power series will therefore converge on an interval that has one of the following
forms (for some real number R):

(−R, R) or [−R, R] or (−R, R] or [−R, R) or R or {0}.

This is known as the interval of convergence of the series.



Example 4.5.5 Consider the power series
$$\sum_{k=0}^{\infty} \frac{x^{k+1}}{k+1} = x + \frac{x^2}{2} + \frac{x^3}{3} + \frac{x^4}{4} + \ldots.$$
We apply the Ratio Test, with $a_n = \dfrac{x^n}{n}$. Then
$$\left| \frac{a_{n+1}}{a_n} \right| = \left| \frac{x^{n+1}}{n+1} \cdot \frac{n}{x^n} \right| = |x| \cdot \frac{n}{n+1} \to |x|.$$
So the series converges absolutely for $|x| < 1$ and diverges for $|x| > 1$. So it has
radius of convergence $R = 1$. To find its interval of convergence, we must still
consider the cases $x = 1$ and $x = -1$.
If $x = 1$, the series becomes $\sum_{k=0}^{\infty} \dfrac{1}{k+1}$, the harmonic series, which diverges.
If $x = -1$, the series becomes $\sum_{k=0}^{\infty} \dfrac{(-1)^{k+1}}{k+1}$, which converges (see Example 4.2.8).
So this power series has interval of convergence $[-1, 1)$.

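Example 4.5.6 We now look at the power series
$$\sum_{k=0}^{\infty} \frac{x^k}{k!} = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \ldots.$$
We can use the ratio test again, with $a_n = \dfrac{x^n}{n!}$:
$$\left| \frac{a_{n+1}}{a_n} \right| = \frac{|x|^{n+1}}{|x|^n} \cdot \frac{n!}{(n+1)!} = \frac{|x|}{n+1} \to 0$$
for all $x \in \mathbb{R}$. From this it follows that this series converges for all $x \in \mathbb{R}$, so its
interval of convergence is $\mathbb{R}$.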

Example 4.5.7 Consider the power series
$$\sum_{k=0}^{\infty} k!\,x^k = 1 + x + 2!\,x^2 + 3!\,x^3 + \ldots.$$
Apply the Ratio Test with $a_n = n!\,x^n$. Then:
$$\left| \frac{a_{n+1}}{a_n} \right| = \left| \frac{(n+1)!\,x^{n+1}}{n!\,x^n} \right| = |x| \cdot (n+1).$$
If $x = 0$, then $|x| \cdot (n+1) = 0$, so the series (not surprisingly!) converges.
If $x \ne 0$, then $|x| \cdot (n+1) \to \infty$ as $n \to \infty$, so the series diverges.
So this power series has radius of convergence $R = 0$ and interval of convergence
$\{0\}$. (Needless to say, we won't have much use for such a power series!)

Example 4.5.8 Consider the power series

\[ \sum_{k=0}^{\infty} \frac{x^k}{e^k} = 1 + \frac{x}{e} + \frac{x^2}{e^2} + \frac{x^3}{e^3} + \dots. \]

This is a geometric series. It converges absolutely for |x/e| < 1, i.e. for |x| < e. It
diverges for |x/e| ≥ 1, i.e. for |x| ≥ e. (Note that the case |x/e| = 1 is taken care
of automatically by the geometric series.) The interval of convergence is therefore
(−e, e).

Summary:
In this section we considered power series, that is, series for which each term is a
constant multiple of a power of a real variable x. We showed that there is an interval
such that for every x in the interval the series converges, and for every x outside the
interval the series diverges. This interval is called the interval of convergence of the
power series, and its radius is the radius of convergence.

Exercises

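1. Find the radius and interval of convergence for each of the following power
series:
(a) \( \sum_{n=0}^{\infty} \frac{(3x)^n}{\sqrt{n+1}} \)
(b) \( \sum_{n=0}^{\infty} \frac{(2x)^n}{(n+1)^3} \)
(c) \( \sum_{n=0}^{\infty} \frac{x^n}{n^n} \)
(d) \( \sum_{n=0}^{\infty} \frac{(nx)^n}{n!} \)
(e) \( \sum_{n=1}^{\infty} \frac{x^n}{n^2 2^n} \)
(f) \( \sum_{n=0}^{\infty} \frac{2^n}{n!} x^n \)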

2. Show that the radius of convergence of the power series \( \sum a_n x^n \) is given by
\( \lim_{n\to\infty} \frac{|a_n|}{|a_{n+1}|} \) whenever this limit exists.

3. Read the following and answer the questions below.


Theorem Suppose that the three series

\[ (1)\ \sum_{k=0}^{\infty} a_k x^k \qquad (2)\ \sum_{k=1}^{\infty} k a_k x^{k-1} \qquad (3)\ \sum_{k=0}^{\infty} \frac{a_k}{k+1} x^{k+1} \]

have radii of convergence R1, R2 and R3 respectively, and that R1, R2 and R3
are all in R. Then R1 = R2 = R3.
Proof: Note that the series (2) is obtained from (1) by differentiating each
term with respect to x. Similarly, series (1) can be obtained from series (3) in
this way. We will need this later. (1)
First we show that R2 ≥ R1. (2)
Fix x ∈ R such that |x| < R1, and choose r such that |x| < r < R1. (3)
Then \( \sum a_k r^k \) is absolutely convergent. (4)
So the sequence (|a_n r^n|) is bounded. (5)
So there exists M ∈ R+ so that |a_n r^n| < M for all n ∈ N+. (6)
Then, for all n ∈ N+,
\[ |n a_n x^{n-1}| = \left| n a_n r^{n-1} \left(\frac{x}{r}\right)^{n-1} \right| < n \, \frac{M}{r} \left(\frac{|x|}{r}\right)^{n-1}. \]
(7)
The series \( \sum k \left(\frac{|x|}{r}\right)^{k-1} \) is convergent. (8)
By the Comparison Test, \( \sum k a_k x^{k-1} \) converges for |x| < R1. (9)
So R2 ≥ R1. (10)
Next we show that R3 ≥ R1. (11)
For any n ∈ N+,
\[ \left| \frac{a_n}{n+1} x^{n+1} \right| = \left| \frac{a_n}{n+1} r^{n+1} \left(\frac{x}{r}\right)^{n+1} \right| < M|x| \left(\frac{|x|}{r}\right)^{n}. \]
(12)
From this, one can deduce that R3 ≥ R1. (13)
However, as we stated at the start, the relation between R3 and R1 is the same
as that between R1 and R2, so we also obtain R1 ≥ R3. (14)
So finally R1 = R2 = R3. (15)

(a) How do you know that there is an r with the properties claimed in line
(3)?
(b) Why is the series in line (4) absolutely convergent?
(c) Why is the sequence (|an r n |) bounded, as claimed in line (5)?
(d) Show how the inequality in line (7) is obtained.
(e) Why is the series in line (8) convergent?
(f) How is line (9) obtained from the previous working?
(g) Why does line (10) follow?
(h) Show how the inequality in line (12) is obtained.
(i) Why does line (13) follow?
(j) Explain how the inequality in line (14) is obtained.

4. Show that the three series of Question 3 need not have the same interval of
convergence, by considering the case where a_n = 1/n.
5. (a) Show that if any one of the series in Question 3 converges for all x ∈ R,
then so do the other two.
(b) Show that if any one of the series in Question 3 has interval of convergence
equal to {0} then so do the other two.
Chapter 5

Continuous and differentiable functions

In this chapter we shift our focus from sequences and series to real-valued functions.
We will look at some familiar notions from the calculus with the same rigour that
we applied to sequences and series. We will not be abandoning sequences altogether,
though. As you will soon see, it is possible to define some key concepts, such as limits
and continuity, in terms of convergence of sequences. This has the huge advantage
that we can use all our knowledge of sequences to prove results about limits and
continuous functions.
Series also will have their day, but that will have to wait till the next chapter,
when we consider series of functions.

5.1 A first look at continuity and differentiability


We begin by recapping the ideas of continuity and differentiability you already know.
Our aim is to recall the geometric intuition behind these ideas, and then to point
out where greater rigour is needed.
In this chapter we will work exclusively with real-valued functions of a real
variable. Informally we can think of such a function as a “rule” (usually denoted by a
letter such as f ) which assigns to every real number x in some set of real numbers (the
domain of the function) a unique real number f (x). We can picture such a function
by drawing its graph. If the domain of the function f is denoted by dom(f ), then
the graph of f is the subset of R2 given by {(x, y) ∈ R2 : x ∈ dom(f ), y = f (x)}. If,
as will often be the case, the domain of the function is all of R, we’ll write f : R → R
to indicate that f is a real-valued function defined for every x ∈ R.


You are already familiar with the geometric idea of a function being continuous
if its graph is “in one piece”, i.e. can be drawn without lifting pencil from paper.
It will be helpful initially to distinguish between continuity of a function at a point
(the graph of the function does not have a gap at that point), and continuity on
an interval (the graph is in one piece, without gaps, on the whole interval). In this
section we look only at the first idea.
We need to move from an informal description of continuity at a point, such as
the one given above, to a more formal definition that we can use to prove results about
continuous functions. We’ll do this in stages. We start with a very simple example
which will help us to formulate a definition.

Example 5.1.1 (a) Let the function f be defined by

\[ f(x) = \frac{|x|}{x}. \]
Since f (x) is not defined for x = 0, there will definitely be a gap in the graph of f at
the point x = 0, and so, at least according to our intuitive idea, f is not continuous
at x = 0.

Figure 5.1: The graphs of f and g

(b) We try to remedy the situation by changing the definition of f by defining it at
x = 0 as well. This creates a new function; let’s call it g. There are of course many
ways that we can do this, but here is the one we have chosen:

\[ g(x) = \begin{cases} \dfrac{|x|}{x} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0 \end{cases} \]

Now g is defined at x = 0, but we have not managed to fill the gap. The problem
is that if we approach x = 0 from the left, we stay on the line y = −1, while if we
approach from the right, we stay on the line y = 1. You may recall from your first-
year calculus that we say in this case that the left-hand limit lim_{x→0−} g(x) = −1,
and the right-hand limit lim_{x→0+} g(x) = 1. Since the two are different, there is no
hope of filling the gap in the graph. We say in such cases that the limit lim_{x→0} g(x)
does not exist. If this is the case, the function cannot be continuous.
(c) Let’s try to change g in such a way that the limit at x = 0 exists. If we put
h(x) = |g(x)|, we get

\[ h(x) = \begin{cases} 1 & \text{if } x \neq 0 \\ 0 & \text{if } x = 0 \end{cases} \]

Now lim_{x→0−} h(x) = 1 = lim_{x→0+} h(x), so lim_{x→0} h(x) does exist. But there is still
a gap in the graph of h at x = 0. The problem is that lim_{x→0} h(x) = 1 ≠ 0 = h(0).
To fill the gap and ensure continuity at x = 0 we need to have h(0) = lim_{x→0} h(x).
(d) To change h into a continuous function, we put k(x) = 1 for all x ∈ R. Then we
have k(0) = limx→0 k(x), the gap at x = 0 is filled and k is a continuous function.
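
The one-sided behaviour described in parts (b) and (c) can be tabulated with a few lines of Python. This is only an illustrative sketch of ours: the functions `g` and `h` mirror the definitions in the example, and the sample sequences (−1/n) and (1/n) are our choice.

```python
def g(x: float) -> float:
    """g from Example 5.1.1(b): |x|/x for x != 0, and 0 at x = 0."""
    return abs(x) / x if x != 0 else 0.0


def h(x: float) -> float:
    """h from Example 5.1.1(c): h(x) = |g(x)|."""
    return abs(g(x))


left = [g(-1 / n) for n in range(1, 6)]       # approaching 0 from the left
right = [g(1 / n) for n in range(1, 6)]       # approaching 0 from the right
near_zero = [h(1 / n) for n in range(1, 6)]   # values of h close to 0

if __name__ == "__main__":
    print(left, right)        # the two one-sided values of g differ
    print(near_zero, h(0.0))  # h is 1 near 0 but 0 at 0
```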

Figure 5.2: The graphs of h and k

This rather simple example motivates the following familiar definition:

Definition 5.1.2 A real-valued function f is continuous at a ∈ R if and only if

(a) f is defined at a;

(b) limx→a f (x) exists;

(c) limx→a f (x) = f (a).

This is sometimes expressed more briefly by simply saying “f is continuous at a if
and only if lim_{x→a} f(x) = f(a)”.

There are a number of legitimate concerns that can be raised about this definition.

• What is limx→a f (x)? When does it exist? In many cases it may be intuitively
clear whether this limit exists and if so what it equals, but this is certainly
not always the case.

• Does it make sense to talk about continuity, and limits, unless the function is
defined not only at a, but also “near” a?

Before we can make much progress, we will have to say clearly what we mean
by the limit of a function at a point, and get clarity on when this limit exists, and
when not.
It is now clear from the examples above and the definition that the functions f, g
and h in Example 5.1.1 are not continuous at 0, but that the function k, defined by
k(x) = 1 for all x ∈ R, is continuous. These are rather trivial examples, but they
do illustrate the essential ideas. There are far more complicated ones to come!
We now turn to differentiability. Geometrically speaking, a real-valued function
f is differentiable at a point a ∈ R if a (unique) tangent can be drawn to the graph
of f at the point (a, f (a)). This means that the graph of f is “smooth” at this
point – it does not change direction abruptly. This idea was used in your first-year
calculus course to motivate the following definition:

Definition 5.1.3 A real-valued function f defined at a is differentiable at a if
the limit

\[ \lim_{h \to 0} \frac{f(a+h) - f(a)}{h} \]

exists; if this is the case, the limit is denoted by f′(a) and called the derivative of
f at a.

The definition of differentiability also needs the definition of a limit of a function
at a point to be made precise. We finish this section with a well-known result
establishing the link between continuity and differentiability.

Theorem 5.1.4 If a real-valued function f is differentiable at a ∈ R, then f is
continuous at a.

Proof: Suppose that f is differentiable at a, i.e. that f′(a) exists. We must show
that lim_{x→a} f(x) = f(a). Now

\[ \begin{aligned}
\lim_{x \to a} f(x) &= \lim_{h \to 0} f(a+h) \\
&= \lim_{h \to 0} \left[ \frac{f(a+h) - f(a)}{h} \cdot h + f(a) \right] \\
&= \lim_{h \to 0} \frac{f(a+h) - f(a)}{h} \cdot \lim_{h \to 0} h + \lim_{h \to 0} f(a) \\
&= f'(a) \cdot 0 + f(a) \\
&= f(a),
\end{aligned} \]

as required.

Warning: The converse of this theorem is false. The standard example is the
function f defined by f (x) = |x|, which is continuous at x = 0, but not differentiable
there.
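
The standard counterexample can be explored numerically. The sketch below is our own illustration (the helper `difference_quotient` is a hypothetical name); it evaluates the difference quotient of f(x) = |x| at a = 0 for small positive and small negative h.

```python
def difference_quotient(f, a: float, h: float) -> float:
    """(f(a + h) - f(a)) / h, the quotient whose limit as h -> 0 defines f'(a)."""
    return (f(a + h) - f(a)) / h


steps = (1, 10, 100, 1000)
from_right = [difference_quotient(abs, 0.0, 1 / n) for n in steps]    # h = 1/n
from_left = [difference_quotient(abs, 0.0, -1 / n) for n in steps]    # h = -1/n
values = [abs(1 / n) for n in steps]    # f(x_n) for x_n = 1/n, tending to f(0) = 0

if __name__ == "__main__":
    print(from_right)   # constant +1
    print(from_left)    # constant -1
    print(values)       # shrinking towards 0
```

The quotients stay at +1 on one side and −1 on the other, so they cannot approach a single value, while the function values themselves do approach f(0) = 0.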
Note that we are using rules like “the limit of a sum is the sum of the limits”
and “the limit of a product is the product of the limits” in the proof above. Before
we can prove results like these, we need a precise definition of the limit of a function
at a point.
We rest our case for the need of a definition. In the next section we do limits
properly!
Summary:
In this section we reminded you of the usual definitions of continuity and
differentiability of a real-valued function at a point, and we saw that in order to make
them rigorous, a precise definition of the limit of a function at a point is needed.

5.2 Limits of functions


We start this section with the promised definition of the limit of a function at a
point. Before we do that, it is worth making two perhaps obvious, but nonetheless
important remarks.

• It is possible to find the limit of a function f at a point a even if the function
is not defined at a.

• To find the limit of f at a, we do need f to be defined near a.

The phrase “near a” is not very precise. What we will require is that there be some
open interval I (i.e., I must be an interval of the form (c, d) = {x ∈ R : c < x < d})
which contains the point a and such that f is defined everywhere on I except perhaps
at a itself. This means that f must be defined on I \ {a} = {x ∈ R : x ∈ I, x ≠ a}.
(We sometimes call I \ {a} a deleted interval.)
Our intuitive idea of the limit of a function f at a point a is that it is the number
that the function values f (x) tend to as x tends to a. In the definition that follows,
we use sequences converging to a to approach the point a.

Definition 5.2.1 Let f be a real-valued function, a ∈ R and suppose there is an
open interval I containing a such that f is defined on I \ {a} = {x ∈ I : x ≠ a}.
Then the limit of f at a exists and is equal to ℓ if and only if for every sequence
(x_n) in I \ {a}, x_n → a implies that f(x_n) → ℓ.
We then write lim_{x→a} f(x) = ℓ.
If for every ℓ ∈ R we can find a sequence (x_n) in I \ {a} such that x_n → a but
f(x_n) ↛ ℓ, we say lim_{x→a} f(x) does not exist.

Using quantifiers and connectives, these definitions read as follows:

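\[ \lim_{x \to a} f(x) \text{ exists} \iff (\exists \ell \in \mathbb{R})(\forall (x_n) \subseteq I \setminus \{a\})\,[\,x_n \to a \Rightarrow f(x_n) \to \ell\,]. \]

\[ \lim_{x \to a} f(x) \text{ does not exist} \iff (\forall \ell \in \mathbb{R})(\exists (x_n) \subseteq I \setminus \{a\})\,[\,x_n \to a \wedge f(x_n) \not\to \ell\,]. \]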


When dealing with limits in the sequel we shall not explicitly state the assump-
tion that the function f is defined on a deleted interval containing a, or that we
consider only sequences in this deleted interval. We shall also use the phrase “near
a” to mean “in a deleted open interval containing a”.
An immediate consequence of the definition is the following fact, useful for prov-
ing that limits do not exist:

Proposition 5.2.2 Suppose f is defined near a. If there exist sequences (x_n) and
(y_n) such that x_n → a and y_n → a but f(x_n) → ℓ1 and f(y_n) → ℓ2 with ℓ1 ≠ ℓ2,
then lim_{x→a} f(x) does not exist.

We begin with some very simple examples.



Example 5.2.3 (a) Let c ∈ R and f be the constant function defined by f (x) = c
for all x ∈ R. Let a be any real number. We show that limx→a f (x) = c.
So suppose (xn ) is a sequence such that xn → a. Then f (xn ) = c for all n ∈ N+ , so
(f (xn )) is the constant sequence with each term equal to c, which certainly converges
to c.
(b) Let g(x) = x for all x ∈ R. Let a be any real number; we show that
limx→a g(x) = a.
So suppose (xn ) is a sequence such that xn → a. Then g(xn ) = xn so (g(xn )) = (xn ),
so g(xn ) → a as required.

Now for a rather remarkable example: a function for which the limit does not exist
at any point!

Example 5.2.4 Define the function h : R → R by

\[ h(x) = \begin{cases} 1 & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational} \end{cases} \]

(This function is known as the Dirichlet function.) We begin by showing that
lim_{x→0} h(x) does not exist, and we do this by using Proposition 5.2.2.
Let x_n = 1/n; then x_n is rational for every n and x_n → 0. Since h(x_n) = 1 for every
n, h(x_n) → 1.
Now let y_n = √2/n; then y_n is irrational for every n and y_n → 0. Since h(y_n) = 0 for
every n, h(y_n) → 0 ≠ 1. Hence lim_{x→0} h(x) does not exist.
The same kind of argument can be used to show that lim_{x→a} h(x) does not exist for
any a ∈ R.

We are now in a position to use all the “limit rules” we have for convergent
sequences to prove similar rules for limits of functions.

Theorem 5.2.5 Let a ∈ R, and let f and g be functions such that lim_{x→a} f(x) and
lim_{x→a} g(x) both exist. Then in each case below the limit on the left exists, and

(a) lim_{x→a} [f(x) + g(x)] = lim_{x→a} f(x) + lim_{x→a} g(x)

(b) lim_{x→a} [f(x) − g(x)] = lim_{x→a} f(x) − lim_{x→a} g(x)

(c) lim_{x→a} f(x)g(x) = lim_{x→a} f(x) · lim_{x→a} g(x)

(d) lim_{x→a} cg(x) = c lim_{x→a} g(x) for any c ∈ R

(e) lim_{x→a} [f(x)/g(x)] = [lim_{x→a} f(x)] / [lim_{x→a} g(x)], provided g(x) ≠ 0 near a
and lim_{x→a} g(x) ≠ 0

(f) lim_{x→a} |f(x)| = |lim_{x→a} f(x)|.

Proof: (a) Suppose that lim_{x→a} f(x) = ℓ1 and lim_{x→a} g(x) = ℓ2.
Let (x_n) be a sequence such that x_n → a. Then f(x_n) → ℓ1 and g(x_n) → ℓ2, so
f(x_n) + g(x_n) → ℓ1 + ℓ2 (by Proposition 3.2.5(a)). This shows that lim_{x→a} [f(x) + g(x)]
exists, and equals ℓ1 + ℓ2 = lim_{x→a} f(x) + lim_{x→a} g(x).
For (b) to (f), apply the other parts of Proposition 3.2.5.
We have now proved all the properties of limits that we needed in the proof of
Theorem 5.1.4, therefore the proof of that theorem is now complete.

Theorem 5.2.6 (Sandwich Theorem for real-valued functions.)
Let a ∈ R and f, g, h be functions such that

(a) f(x) ≤ g(x) ≤ h(x) for all x near a;

(b) lim_{x→a} f(x) = lim_{x→a} h(x) = ℓ.

Then lim_{x→a} g(x) exists and lim_{x→a} g(x) = ℓ also.

Proof: Let (x_n) be a sequence such that x_n → a. Then f(x_n) → ℓ and h(x_n) → ℓ.
Also f(x_n) ≤ g(x_n) ≤ h(x_n) for all n ∈ N+, so by the Sandwich Theorem for
sequences (Theorem 3.2.13) g(x_n) → ℓ.

Example 5.2.7 (a) Let f(x) = x sin(1/x) for x ≠ 0.
Since −1 ≤ sin(1/x) ≤ 1 for all x ≠ 0, −|x| ≤ x sin(1/x) ≤ |x| for all x ≠ 0.
Now lim_{x→0} |x| = 0 and lim_{x→0} (−|x|) = 0, so it follows from the Sandwich Theorem
that lim_{x→0} f(x) = 0.
(b) Now let g(x) = sin(1/x) for x ≠ 0.
It is easy to check that g(1/(nπ)) = 0 and g(2/(π + 4πn)) = 1 for all n ∈ N+.
Let (x_n) = (1/(nπ)). Then x_n → 0 and g(x_n) → 0.
Let (y_n) = (2/(π + 4πn)). Then y_n → 0 and g(y_n) → 1.
By Proposition 5.2.2, lim_{x→0} sin(1/x) does not exist.
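
The two sequences used in Example 5.2.7 can be evaluated in floating-point arithmetic. The Python sketch below is our own illustration, writing f(x) = x sin(1/x) and g(x) = sin(1/x); because π is only approximated, the computed values are close to, not exactly equal to, 0 and 1.

```python
import math


def f(x: float) -> float:
    """f(x) = x * sin(1/x) for x != 0 (Example 5.2.7(a))."""
    return x * math.sin(1 / x)


def g(x: float) -> float:
    """g(x) = sin(1/x) for x != 0 (Example 5.2.7(b))."""
    return math.sin(1 / x)


ns = range(1, 51)
g_along_x = [g(1 / (n * math.pi)) for n in ns]                  # x_n = 1/(n*pi)
g_along_y = [g(2 / (math.pi + 4 * math.pi * n)) for n in ns]    # y_n = 2/(pi + 4*pi*n)
f_along_y = [f(2 / (math.pi + 4 * math.pi * n)) for n in ns]    # squeezed towards 0

if __name__ == "__main__":
    print(max(abs(v) for v in g_along_x))      # tiny: g(x_n) is essentially 0
    print(min(g_along_y))                      # essentially 1
    print(f_along_y[-1])                       # small: |f(y_n)| <= |y_n|
```

Along (x_n) the values of g stay near 0 and along (y_n) they stay near 1, whereas the values of f are forced towards 0 by the factor x.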

Now that we have a rigorous definition of the limit of a function at a point, we
have also made rigorous our definition of continuity of a function at a point. We
explore the consequences in the next section.

Summary:
In this section we gave the definition of the limit of a function at a point in terms
of convergence of sequences. This allowed us to prove results for limits of functions
that look very similar to the results for sequences.

Historical Notes

Johann Peter Gustav Lejeune Dirichlet (1805 – 1859)


Lejeune Dirichlet was born in the small
town of Düren, near Cologne, where his
father was the postmaster. He had a
passion for mathematics from an early
age. He attended school first in Bonn
and later in Cologne. At the age of 16
he finished his schooling, and decided
to attend a university in France, since
the standard of German universities
was not high at the time. He attended
lectures at the Collège de France, where
he was taught by many of the lead-
ing mathematicians of the time. He
stayed in the house of General Foy
(who had retired from the army after
the Napoleonic Wars), and was treated
very well by his family.
Dirichlet immediately attracted attention with his very first paper in mathematics,
in which he was able to make substantial progress with the case n = 5 of Fermat’s last
theorem. When General Foy died in 1825, Dirichlet returned to Germany. He was
awarded an honorary doctorate by the University of Cologne and appointed at the
University of Breslau in 1827. Standards at the university were low, and Dirichlet
took up a position at the Berlin Military College a year later. Soon afterwards he
was appointed at the University of Berlin, where he stayed until 1855. He had a high
teaching load, partly because he was required to keep on teaching at the Military
College even after he was appointed at the University. Improvements in his salary
and an appointment to the Berlin Academy of Sciences in 1831 enabled Dirichlet to
marry Rebecca Mendelssohn, the sister of the composer Felix Mendelssohn.
When Gauss died in 1855, Dirichlet was offered his position at the University of
Göttingen. He initially tried to use the offer to negotiate improved conditions in
Berlin, but when he was unsuccessful, he accepted the chair in Göttingen. Here
conditions were far more favourable, and he had more time for research and excellent
research students. Unfortunately he was not destined to enjoy this for long. While
at a conference in Switzerland in 1858 he suffered a heart attack, and never recovered
fully. He died in Göttingen the next year, shortly after his wife.
Dirichlet made important contributions to many fields of Mathematics. His initial
work was mainly on various aspects of number theory, and he is considered to be the
father of analytic number theory. He was also the first to propose the modern
definition of a function. He did outstanding work in mechanics and is also regarded
as the founder of the theory of Fourier series.

Exercises
1. Prove that if f is a polynomial function (i.e. f is of the form \( f(x) = \sum_{k=0}^{n} a_k x^k \)),
then for every a ∈ R, lim_{x→a} f(x) = f(a).

2. Prove that if f(x) = p(x)/q(x), where p and q are polynomial functions, then
lim_{x→a} f(x) = f(a) for every a ∈ R such that q(a) ≠ 0.

3. Prove or disprove:

(a) If lim_{x→a} f(x) and lim_{x→a} (f(x) + g(x)) exist, then lim_{x→a} g(x) exists.

(b) If lim_{x→a} f(x) exists and lim_{x→a} g(x) does not exist, then lim_{x→a} (f(x) · g(x)) does
not exist.
(c) If lim_{x→0} f(x²) exists, then lim_{x→0} f(x) also exists.

4. Let the function f : R → R be defined by
f(x) = x if x is rational, and f(x) = 1 − x if x is irrational.
(a) Does lim_{x→1} f(x) exist? Give a reason for your answer.
(b) Does lim_{x→1/2} f(x) exist? Give a reason for your answer.

5.3 Continuous functions


Now that we have a proper definition for the limit of a function at a point, we can
rephrase the definition of continuity of a function at a point (Definition 5.1.2) in
terms of sequences:

Proposition 5.3.1 Let a ∈ R and suppose f is a real-valued function defined on
an open interval I containing a. Then f is continuous at a if and only if, for
every sequence (x_n) in I such that x_n → a, we have f(x_n) → f(a).

Proof: Combine Definition 5.1.2 and Definition 5.2.1. Note that for lim_{x→a} f(x) to
be defined we need f to be defined on a deleted open interval containing a, and for
f to be continuous at a, we need f to be defined at a as well, hence the requirement
that f be defined on an open interval containing a.
To avoid statements about continuity at a point a becoming very long and tedious,
we’ll in future omit the fact that the function f is assumed to be defined on
some open interval containing a; whenever we say something like “f is continuous at
a”, we’ll assume that f is defined on such an interval. When working with two
functions both continuous at a, each one will be defined on such an open interval,
and these intervals may be different. But the intersection of the two open intervals
will then still be an open interval containing a.
In future, when we want to prove that a function f is continuous at a, we’ll start
off by saying “Let (x_n) be a sequence such that x_n → a”; the understanding is that
when we say this, there is an open interval I containing a and (x_n) is a sequence in
I, i.e. x_n ∈ I for every n ∈ N+.
Our first two examples are so obvious that they hardly seem worth mentioning, but
these give us the building blocks for many other continuous functions.

Example 5.3.2 Let c ∈ R. The functions f : R → R and g : R → R defined by
f(x) = c and g(x) = x are both continuous at every a ∈ R. This follows at once
from Example 2.2.8.

The next example is more unusual.

Example 5.3.3 Let

\[ h(x) = \begin{cases} x & \text{if } x \in \mathbb{Q} \\ 1 - x & \text{if } x \in \mathbb{R} \setminus \mathbb{Q} \end{cases} \]

We show that h is continuous at x = 1/2, i.e. that lim_{x→1/2} h(x) = h(1/2) = 1/2.
Suppose (x_n) is a sequence such that x_n → 1/2.
Let ε ∈ R+ be given. Then there exists N ∈ N+ so that, for any n ∈ N+, n ≥ N ⇒
|x_n − 1/2| < ε.
Now suppose n ∈ N+ and n ≥ N. Consider two cases:
If x_n ∈ Q then |h(x_n) − 1/2| = |x_n − 1/2| < ε.
If x_n ∈ R \ Q then |h(x_n) − 1/2| = |(1 − x_n) − 1/2| = |1/2 − x_n| < ε.
In either case |h(x_n) − 1/2| < ε, so we have shown that h(x_n) → 1/2.
Now let a ∈ Q and a ≠ 1/2. We show that h is not continuous at a.
Let (x_n) = (a + 1/n). Then x_n → a and h(x_n) → a.
Let (y_n) = (a + √2/n). Then y_n → a and h(y_n) → 1 − a.
Since a ≠ 1 − a, lim_{x→a} h(x) does not exist.
A similar argument shows that h is not continuous at any a ∈ R \ Q. (We leave
the details for you.) This function has the peculiar property of being continuous at
x = 1/2, but nowhere else.
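
Floating-point numbers cannot distinguish rational from irrational inputs, so a direct implementation of h is not possible. As a workaround, the Python sketch below (our own construction, not from the notes) restricts attention to numbers of the form p + q√2 with p and q rational; such a number is rational exactly when q = 0. The names `h`, `to_float`, `along_x` and `along_y` are hypothetical.

```python
from fractions import Fraction
import math


def h(p: Fraction, q: Fraction) -> tuple[Fraction, Fraction]:
    """Apply h from Example 5.3.3 to x = p + q*sqrt(2), with p and q rational.

    The result is returned in the same (p, q) form.
    """
    if q == 0:
        return (p, Fraction(0))    # x is rational: h(x) = x
    return (1 - p, -q)             # x is irrational: h(x) = 1 - x


def to_float(value: tuple[Fraction, Fraction]) -> float:
    """Numerical value of p + q*sqrt(2)."""
    p, q = value
    return float(p) + float(q) * math.sqrt(2)


a = Fraction(1, 3)     # a rational point other than 1/2 (our choice)
n = 10**6
along_x = to_float(h(a + Fraction(1, n), Fraction(0)))    # x_n = a + 1/n
along_y = to_float(h(a, Fraction(1, n)))                  # y_n = a + sqrt(2)/n

if __name__ == "__main__":
    print(along_x)   # close to a
    print(along_y)   # close to 1 - a
```

For a = 1/3 the two values are close to 1/3 and 2/3 respectively, matching the two different limits found in the example.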

The next proposition says that if a function is continuous and positive at a point,
then there will be some open interval containing the point where it will still be
positive. This simple result is surprisingly useful, as we’ll soon see.

Figure 5.3: A continuous function positive at a point is positive on an interval


Proposition 5.3.4 If the function f is continuous at a ∈ R and f(a) > 0 then
there exists δ ∈ R+ so that f(x) > 0 for all x ∈ (a − δ, a + δ).

Proof: We use proof by contradiction: suppose no δ ∈ R+ with the required
property can be found. Then, for any n ∈ N+, there exists x_n ∈ (a − 1/n, a + 1/n) such
that f(x_n) ≤ 0. Now x_n → a as n → ∞. Since f is continuous at a, f(x_n) → f(a).
Since f(x_n) ≤ 0 for all n ∈ N+, f(a) ≤ 0 by Theorem 3.2.10. This contradicts
f(a) > 0.
Clearly the same argument shows that if f is continuous at a and f (a) < 0 then
there exists δ ∈ R+ so that f (x) < 0 for all x ∈ (a − δ, a + δ).

Theorem 5.3.5 Let f and g be functions continuous at a ∈ R. Let c ∈ R. Then:

(a) f + g, f − g, cf, f g and |f | are all continuous at a.

(b) If g(a) ≠ 0 then f/g is continuous at a.

Proof: (a) Let (x_n) be a sequence such that x_n → a. Since f and g are continuous
at a, f(x_n) → f(a) and g(x_n) → g(a). So (by Proposition 3.2.5(a)) f(x_n) + g(x_n) →
f(a) + g(a), so f + g is continuous at a. The other proofs are similar.
(b) Since g(a) ≠ 0, there exists δ ∈ R+ so that g(x) ≠ 0 for all x ∈ (a − δ, a + δ),
by Proposition 5.3.4. Then f/g is defined on the open interval (a − δ, a + δ). Let
(x_n) be a sequence contained in the interval (a − δ, a + δ), such that x_n → a. Then
f(x_n) → f(a) and g(x_n) → g(a), so f(x_n)/g(x_n) → f(a)/g(a) (by Proposition 3.2.8).
So f/g is continuous at a.

Corollary 5.3.6 Polynomial functions are continuous at every point of the real line
and rational functions are continuous at every point where the denominator is not
zero.

Proof: This follows from Example 2.2.8 and Theorem 5.3.5.

Proposition 5.3.7 Let f and g be functions so that f is continuous at a and g is
continuous at f(a). Then the composite function g ◦ f is continuous at a.

Proof: Let (xn ) be a sequence such that xn → a. Then f (xn ) → f (a) (why?) and
g(f (xn )) → g(f (a)) (why?). So (g ◦ f )(xn ) → g(f (a)), as required.
(If you are observant, you may notice that we have to be more careful here about
the open intervals on which f and g are defined. See if you can work out what the
problem is, and if you are stuck, look ahead at Theorem 5.3.23.)
So far we have been concentrating on the idea of a function being continuous at
a point. Often, however, we will be more interested in whether or not a function is
continuous on an interval, or on the whole real line. (In what follows, when we say
“interval”, we’ll include the real line as well.) We state precisely what we mean by
this in the next definition.

Definition 5.3.8 Let I be any interval on the real line and f : I → R be a function.
We say that f is continuous on the interval I if and only if for every sequence
(xn ) in I such that xn → x and x ∈ I also, we have f (xn ) → f (x).

If I is an open interval (or the whole real line), this amounts to saying that f is
continuous at every point of I.
This is not so in general, however. Suppose I = [a, b]. If f is continuous on [a, b],
then for every sequence (xn ) such that a ≤ xn ≤ b for every n ∈ N+ and xn → a,
f (xn ) → f (a).
For f to be continuous at the point a, however, we must have f(x_n) → f(a) for
every sequence (x_n) in an open interval containing a, including sequences
(x_n) for which x_n ≤ a for some or all n.

Example 5.3.9 Consider the function

\[ f(x) = \begin{cases} \dfrac{|x|}{x} & \text{if } x \neq 0 \\ 1 & \text{if } x = 0 \end{cases} \]

We use this function to illustrate the definition of continuity on an interval.
Is f continuous on the interval [0, 4]?
Let (x_n) be a sequence contained in [0, 4], such that x_n → x and x ∈ [0, 4] also. Then
f(x_n) = 1 for all n ∈ N+ and f(x) = 1 so f(x_n) → f(x). Hence f is continuous on
the interval [0, 4].
Is f continuous at 0?
Let (x_n) = (−1/n). Then x_n → 0 but f(x_n) → −1, so f(x_n) ↛ f(0). Hence f is not
continuous at 0.
Is f continuous on the interval [−4, 0]? Use the sequence (x_n) = (−1/n) again to
show that f is not continuous on the interval [−4, 0].
We leave it to you to show that f is continuous on the interval (−4, 0).
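
The difference between continuity on [0, 4] and continuity at 0 comes down to which sequences are allowed. The short Python sketch below is an illustration of ours; the sequence (4/n) is our own choice of a sequence in [0, 4] converging to 0.

```python
def f(x: float) -> float:
    """f from Example 5.3.9: |x|/x for x != 0, and 1 at x = 0."""
    return abs(x) / x if x != 0 else 1.0


inside = [f(4 / n) for n in range(1, 8)]      # a sequence in [0, 4] converging to 0
outside = [f(-1 / n) for n in range(1, 8)]    # the sequence (-1/n), also converging to 0

if __name__ == "__main__":
    print(inside, f(0.0))    # values agree with f(0) = 1
    print(outside, f(0.0))   # values are -1, which differs from f(0) = 1
```

Sequences that stay inside [0, 4] only ever see the value 1, while the sequence (−1/n) sees the value −1.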

We call an interval of the form [a, b] where a, b ∈ R and a ≤ b a closed, bounded
interval. We have used the word “bounded” in various contexts before: for
sequences and for subsets of R. Let’s recall the definition of the latter.

Definition 5.3.10 A subset S of R is bounded above if there is a C1 ∈ R such
that x ≤ C1 for all x ∈ S, bounded below if there is a C2 ∈ R such that x ≥ C2
for all x ∈ S, and bounded if it is both bounded below and above.

As for sequences, we can prove that a set S is bounded if and only if there is a C ∈ R+
such that |x| ≤ C for every x ∈ S. We now extend the notion of boundedness to
functions as well.

Definition 5.3.11 Suppose A ⊆ R and f : A → R is a function. We say that f is
bounded on A if and only if the set {f(a) : a ∈ A} is a bounded subset of R, i.e.
if and only if there exists C ∈ R+ such that |f(a)| ≤ C for all a ∈ A.

Continuous functions defined on closed, bounded intervals have some very useful
properties. We look at some of the most important ones here. The first one says
that such a function is bounded (see Figure 5.4 on page 148).

Theorem 5.3.12 If the function f : [a, b] → R is continuous on [a, b] then f is
bounded on [a, b], i.e. there exists K ∈ R+ such that |f(x)| ≤ K for all x ∈ [a, b].

Proof: We use proof by contradiction; suppose f is not bounded on [a, b].
Then, for each n ∈ N+, there exists x_n ∈ [a, b] such that |f(x_n)| > n. This yields
a sequence (x_n) contained in [a, b]. Since (x_n) is bounded, the Bolzano-Weierstrass
Theorem (Theorem 3.4.6) says that (x_n) has a convergent subsequence (x_{n_k}). Suppose
that x_{n_k} → x as k → ∞. Then x ∈ [a, b] (by Theorem 3.2.10). By continuity of f,
f(x_{n_k}) → f(x) as k → ∞. But then the sequence (|f(x_{n_k})|) is also convergent, and
so bounded. This contradicts the initial assumption that |f(x_{n_k})| > n_k, and hence
|f(x_{n_k})| ≥ k (by Lemma 3.4.2), for all k ∈ N+.

Example 5.3.13 (a) The conclusion of the theorem need not hold if the interval on
which f is defined is not closed and bounded. For example f(x) = 1/x is continuous
on (0, 1) (check!) but is not bounded on this interval.
(b) The conclusion of the theorem also need not hold if f is not continuous. For
example,

\[ f(x) = \begin{cases} \dfrac{1}{x} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0 \end{cases} \]

is defined on the closed interval [0, 1] but is not bounded on [0, 1]. It is not continuous
on [0, 1].
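
Unboundedness of 1/x on (0, 1) means that every proposed bound n is exceeded somewhere in the interval. The Python sketch below makes this concrete with one possible choice of witness, x_n = 1/(n + 1); the helper name `witness` is ours.

```python
def f(x: float) -> float:
    """f(x) = 1/x for x != 0, as in Example 5.3.13."""
    return 1 / x


def witness(n: int) -> float:
    """A point x_n in (0, 1) with f(x_n) > n; here x_n = 1/(n + 1)."""
    return 1 / (n + 1)


if __name__ == "__main__":
    for n in (1, 10, 1000):
        x = witness(n)
        print(n, x, f(x))    # f(x) exceeds the proposed bound n
```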

The Least Upper Bound axiom says that every sequence which is bounded above
has a least upper bound, or supremum. From this we could also deduce that every
sequence which is bounded below has an infimum, or greatest lower bound. We saw
earlier that it follows from this that something similar is true for sets of real
numbers. We state this result again in the form that we will need it:

Proposition 5.3.14 Let S be a non-empty set of real numbers.

(a) If S is bounded above, it has a least upper bound, or supremum (denoted by
sup S).

(b) If S is bounded below, it has a greatest lower bound, or infimum (denoted by
inf S).

(c) If S is bounded, it has both an infimum and a supremum.

As is the case with sequences, if a set S has a minimum (denoted by min S), it is
also the infimum of the set. A non-empty set in R which is bounded below always
has an infimum, but need not have a minimum. Similarly, if a set S has a maximum
(denoted by max S), it is also the supremum of the set. A non-empty set in R which
is bounded above always has a supremum, but need not have a maximum.
Suppose a function f is continuous on an interval [a, b]. Let
S = {f (x) : x ∈ [a, b]}.
From Theorem 5.3.12 we know that S is a bounded subset of R, and so sup S and
inf S exist. In fact, we can prove more: max S and min S exist, as the following
theorem shows.

Theorem 5.3.15 If the function f : [a, b] → R is continuous on [a, b] then there
exist numbers c and d in [a, b] such that f (c) ≤ f (x) ≤ f (d) for all x ∈ [a, b].
(In this case f (c) = min{f (x) : x ∈ [a, b]}, f (d) = max{f (x) : x ∈ [a, b]}.)

Figure 5.4: A continuous function is bounded and has a maximum and a minimum.

Proof: Let S = {f (x) : x ∈ [a, b]}, and write M = sup S. We shall find d ∈ [a, b]
such that f (d) = M; this will show that M = max S.
For each n ∈ N+ , M − 1/n is not an upper bound of S, so there exists xn ∈ [a, b] such
that M − 1/n < f (xn ) ≤ M. Since 1/n → 0, f (xn ) → M by the Sandwich Theorem.
On the other hand, (xn ) is contained in [a, b], so is bounded, so has a convergent
subsequence, (xnk ) say. Suppose xnk → d. Then d ∈ [a, b]. By continuity of f ,
f (xnk ) → f (d). However, f (xnk ) → M (since (f (xnk )) is a subsequence of (f (xn ))).
So, finally, f (d) = M (since limits of convergent sequences are unique).
A similar argument shows that if m = inf S, there exists c ∈ [a, b] such that f (c) =
m; i.e. m = min S. Fill in the details yourself.

Example 5.3.16 Consider the function f (x) = (1/3)x³ − (3/2)x² + 2x on the interval
[0, 2]. Since it is a polynomial function, f is continuous on [0, 2] (in fact, on R).
Let S = {f (x) : x ∈ [0, 2]}. You can use a bit of first-year calculus to show that
max S = 5/6 = f (1) and min S = 0 = f (0).
Now consider the same function f as above, but on the interval (0, 2), and let
S = {f (x) : x ∈ (0, 2)}. Then max S = 5/6 = f (1) still holds, but min S does not
exist. However, inf S does exist and
inf S = 0.
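
A quick numerical check of Example 5.3.16 can be sketched in Python. The grid size and the variable names are our own choices, and a grid search only approximates the extreme values; it is no substitute for the calculus argument.

```python
def f(x):
    """The polynomial of Example 5.3.16."""
    return x**3 / 3 - 3 * x**2 / 2 + 2 * x

# Sample the closed interval [0, 2] on an evenly spaced grid (grid size is arbitrary).
n = 2000
grid = [2 * i / n for i in range(n + 1)]
values = [f(x) for x in grid]

max_value = max(values)
min_value = min(values)
argmax = grid[values.index(max_value)]
argmin = grid[values.index(min_value)]

print(max_value, argmax)  # close to 5/6, attained at x = 1
print(min_value, argmin)  # 0.0, attained at x = 0

# On the open interval (0, 2) the endpoint 0 is excluded:
# the sampled values get close to 0 but never equal it.
interior = [f(x) for x in grid[1:-1]]
print(min(interior) > 0)
```

The last line reflects the fact that on (0, 2) the infimum 0 is approached but not attained.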

Example 5.3.17 Consider

h(x) = 2x if x ∈ [0, 1)
h(x) = 2 − x if x ∈ [1, 2]

Now h is not continuous on [0, 2] (why not?). Let S = {h(x) : x ∈ [0, 2]}. Then
min S = 0 = h(0) = h(2), so in this case h attains its minimum at two points in
[0, 2]. However, max S does not exist, although sup S exists and sup S = 2.

The next theorem is one that makes good intuitive sense: If a continuous function
is negative at some point and positive at another, its graph must cross the x-axis
at some point in between the two. You probably used this result frequently when
trying to find approximations to the zeros of a continuous function; now we can
prove it.

Theorem 5.3.18 If the function f is continuous on [a, b], f (a) < 0 and f (b) > 0
then there exists c ∈ (a, b) such that f (c) = 0.

Figure 5.5: A continuous function that changes sign has a zero.

Proof: Before starting the proof we note that there may be more than one such
number c; our strategy will be to find the smallest one. Let

S = {x ∈ [a, b] : f (x) ≥ 0}.

Since b ∈ S, S is not empty; since S ⊆ [a, b], S is bounded. Hence c = inf S exists.
We show that c ∈ (a, b) and f (c) = 0.
Now c ∈ [a, b] (why?); we check that c ≠ a and c ≠ b. There exists δ ∈ R+
so that f (x) < 0 for all x ∈ [a, a + δ) (this can be proved in the same way as
Proposition 5.3.4). So f (x) ≥ 0 implies that x ≥ a + δ; so c ≥ a + δ. This gives
c ≠ a; a similar argument shows c ≠ b.

Next we show that f (c) = 0. Since c = inf S, for each n ∈ N+ , c + 1/n is not a lower
bound of S, so there exists xn ∈ S such that c ≤ xn < c + 1/n. So xn → c, and so
f (xn ) → f (c) by continuity of f . Since f (xn ) ≥ 0 for all n ∈ N+ , f (c) ≥ 0.
Now suppose f (c) > 0. Then there exists δ ∈ R+ so that f (x) > 0 for all x ∈
(c − δ, c + δ) ⊆ [a, b]. So (c − δ, c + δ) ⊆ S, which gives inf(c − δ, c + δ) ≥ inf S,
i.e. c − δ ≥ c. This is a contradiction, so f (c) ≤ 0. Putting this all together we get
f (c) = 0.
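
The proof locates a zero as an infimum, which does not directly give a way to compute it. In practice one can approximate a zero by repeatedly halving the interval, as in the following Python sketch. The function name bisect_zero, the tolerance and the test function are our own illustrative choices and are not part of the notes.

```python
def bisect_zero(f, a, b, tol=1e-10):
    """Approximate a zero of a continuous f on [a, b], assuming f(a) < 0 < f(b)."""
    if not (f(a) < 0 < f(b)):
        raise ValueError("need f(a) < 0 and f(b) > 0")
    while b - a > tol:
        mid = (a + b) / 2
        if f(mid) < 0:
            a = mid      # invariant: f(a) < 0
        else:
            b = mid      # invariant: f(b) >= 0
    return (a + b) / 2

# Example: f(x) = x^2 - 2 changes sign on [0, 2], so it has a zero there.
root = bisect_zero(lambda x: x * x - 2, 0.0, 2.0)
print(root)
```

At every step the sign condition of Theorem 5.3.18 holds on the current interval, so a zero always lies between a and b.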

Corollary 5.3.19 (The Intermediate Value Theorem) If the function f is continuous
on [a, b] and f (a) ≤ w ≤ f (b) or f (b) ≤ w ≤ f (a), then there is a c ∈ [a, b]
such that f (c) = w.

(In other words, if a continuous function takes on two values, it also takes on any
value in between those two values.)
Proof: Suppose that f (a) ≤ w ≤ f (b) . If w = f (a), we obviously take c = a; if
w = f (b) we can take c = b. So now consider the case f (a) < w < f (b). Define
g(x) = f (x) − w.
Then g is continuous on [a, b] (why?). Also g(a) = f (a) − w < 0 and g(b) =
f (b) − w > 0. So we can apply Theorem 5.3.18 to the function g to obtain c ∈ (a, b)
such that g(c) = 0, i.e. f (c) = w. The case f (b) ≤ w ≤ f (a) is handled in the same
way, with g(x) = w − f (x).
The Intermediate Value Theorem is useful in establishing the existence of real
numbers with various properties, as the following two results show.

Theorem 5.3.20 Every non-negative real number has a non-negative square root.

Proof: Let t be a real number such that t ≥ 0.
Choose n ∈ N+ so that 0 ≤ t ≤ n². Then the function f (x) = x² is continuous
on [0, n], f (0) = 0, f (n) = n². By the Intermediate Value Theorem there exists
c ∈ [0, n] such that f (c) = t; i.e. c² = t. So c is the desired non-negative square root
of t.

Definition 5.3.21 A real number a is a fixed point of a function f if and only if
f (a) = a.

Proposition 5.3.22 Every continuous function f : [0, 1] → [0, 1] has a fixed point.

Proof: Assume that f (0) > 0 and f (1) < 1. (Why can we do this?) Define
g(x) = f (x) − x.
Then g(0) = f (0) − 0 = f (0) > 0 and g(1) = f (1) − 1 < 0.
By Theorem 5.3.18 (applied to −g), there exists c ∈ (0, 1) such that g(c) = 0, i.e. such that f (c) = c.

Note that although Proposition 5.3.22 gives the existence of a fixed point of a
function, it does not provide any method for finding it. For example, if f : [0, π/2] →
[0, 1] is defined by f (x) = cos x, then Proposition 5.3.22 states that there exists
c ∈ [0, π/2] such that cos c = c. To find an approximation to this solution, we can use
the method described in Chapter 1.
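
One simple way to approximate the fixed point of cos numerically is to iterate x ↦ cos x, the scheme studied in Exercise 7 of Section 5.4. The Python sketch below is only an illustration and is not necessarily the method of Chapter 1; the function name, starting value, tolerance and iteration cap are arbitrary assumptions.

```python
import math

def iterate_to_fixed_point(f, x, tol=1e-12, max_steps=10_000):
    """Iterate x -> f(x) until successive values differ by less than tol."""
    for _ in range(max_steps):
        nxt = f(x)
        if abs(nxt - x) < tol:
            return nxt
        x = nxt
    raise RuntimeError("iteration did not settle within max_steps")

c = iterate_to_fixed_point(math.cos, 1.0)
print(c, math.cos(c))  # the two printed numbers should agree to many decimal places
```

Whether such an iteration converges depends on the function; for cos on [0, 1] the exercise mentioned above explains why it does.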
The theory of fixed points is very important, particularly in obtaining the exis-
tence of solutions of differential equations.
Before concluding this section on continuity, we give a characterization of continuity
that you may well encounter as the definition of continuity in other textbooks.
It is the so-called “ε − δ definition”.

Theorem 5.3.23 Let a ∈ R and I be an open interval containing a. Then the
function f : I → R is continuous at a if and only if for every ε ∈ R+ there exists
δ ∈ R+ such that for every x ∈ I, if |x − a| < δ then |f (x) − f (a)| < ε.
In terms of quantifiers and connectives this becomes

f : I → R is continuous at a
⇐⇒ (∀ε ∈ R+ )(∃δ ∈ R+ )(∀x ∈ I)[|x − a| < δ ⇒ |f (x) − f (a)| < ε]. (∗)

Proof: Suppose first that the condition (∗) holds. Let (xn ) be a sequence contained
in I such that xn → a. We must prove that f (xn ) → f (a). Let ε ∈ R+ be given.
From (∗), there exists δ ∈ R+ so that, for any x ∈ I, |x − a| < δ ⇒ |f (x) − f (a)| < ε.
Since xn → a there exists N ∈ N+ so that, for any n ∈ N+ , n ≥ N ⇒ |xn − a| < δ.
Putting this together gives, for any n ∈ N+ ,

n ≥ N ⇒ |xn − a| < δ ⇒ |f (xn ) − f (a)| < ε.

To prove that if f is continuous at a, then (∗) holds, we prove its contrapositive. To
do this, we suppose (∗) does not hold, and show that f is not continuous at a. If
(∗) is not true, there exists an ε ∈ R+ such that, for any δ ∈ R+ , there exists x ∈ I
such that |x − a| < δ but |f (x) − f (a)| ≥ ε. In particular, for this ε and δ = 1/n,
there exists xn ∈ I such that |xn − a| < 1/n but |f (xn ) − f (a)| ≥ ε. We can find

such an xn for each n ∈ N+ . This gives a sequence (xn ) in I such that xn → a but
f (xn ) ↛ f (a), so f is not continuous at a.
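
To see the ε − δ condition at work, take f (x) = x² at a point a. If |x − a| < 1 then |x + a| < 2|a| + 1, so δ = min{1, ε/(2|a| + 1)} gives |x² − a²| = |x − a||x + a| < ε. The Python sketch below spot-checks this choice on random samples; the function names and sample counts are illustrative assumptions, and a finite check is of course not a proof.

```python
import random

def delta_for_square(a, eps):
    """A delta that works for f(x) = x^2 at the point a (derived in the lead-in)."""
    return min(1.0, eps / (2 * abs(a) + 1))

def check(a, eps, samples=10_000, seed=0):
    """Sample x with |x - a| < delta and confirm |x^2 - a^2| < eps each time."""
    rng = random.Random(seed)
    delta = delta_for_square(a, eps)
    for _ in range(samples):
        # The factor 0.999999 keeps x strictly inside (a - delta, a + delta).
        x = a + delta * rng.uniform(-1, 1) * 0.999999
        if abs(x * x - a * a) >= eps:
            return False
    return True

print(check(a=3.0, eps=0.01))
print(check(a=-50.0, eps=1e-3))
# The same eps needs a smaller delta far from 0.
print(delta_for_square(1.0, 0.1), delta_for_square(100.0, 0.1))
```

Note that the δ produced here depends on the point a as well as on ε.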
The theorem gives a necessary and sufficient condition for a function f to be
continuous at a point a. Essentially the same proof gives a necessary and sufficient
condition for f to be continuous on an interval :

Corollary 5.3.24 Let I be an interval. The function f : I → R is continuous on
I if and only if for every ε ∈ R+ , for every a ∈ I, there exists δ ∈ R+ such that for
every x ∈ I, if |x − a| < δ then |f (x) − f (a)| < ε.
In terms of quantifiers and connectives this becomes

f : I → R is continuous on I
⇐⇒ (∀ε ∈ R+ )(∀a ∈ I)(∃δ ∈ R+ )(∀x ∈ I)[|x − a| < δ ⇒ |f (x) − f (a)| < ε].

The condition in the corollary will be satisfied if for a given ε > 0, for each a ∈ I,
there is a δ > 0 for this a such that if |x − a| < δ, then |f (x) − f (a)| < ε, i.e. the δ
could be different for different a ∈ I. If, for a given ε > 0 we can get a δ > 0 that
will do for all a ∈ I, we say that the function is uniformly continuous on I:

Definition 5.3.25 Let I be an interval. The function f : I → R is uniformly
continuous on I if for every ε > 0, there is a δ > 0 such that for all x, y ∈ I, if
|x − y| < δ, then |f (x) − f (y)| < ε.
In terms of quantifiers and connectives this becomes

f : I → R is uniformly continuous on I
⇐⇒ (∀ε ∈ R+ )(∃δ ∈ R+ )(∀x, y ∈ I)[|x − y| < δ ⇒ |f (x) − f (y)| < ε]. (∗∗)

If a function is uniformly continuous on an interval I, it is certainly continuous
on I, but the converse need not be true. For closed and bounded intervals, though,
the converse is also true.
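
For instance, f (x) = x² is continuous on R but not uniformly continuous there: the points xn = n and yn = n + 1/n get arbitrarily close, yet |f (xn ) − f (yn )| = 2 + 1/n² ≥ 2, so no single δ can work for ε = 2. The following Python lines tabulate this; the sample values of n are an arbitrary choice.

```python
def f(x):
    return x * x

for n in [1, 10, 100, 1000]:
    x, y = n, n + 1 / n
    gap = abs(x - y)            # distance between the inputs, tends to 0
    jump = abs(f(x) - f(y))     # distance between the outputs, stays at least 2
    print(n, gap, jump)
```

This does not conflict with the statement about closed and bounded intervals, since R is not bounded.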

Theorem 5.3.26 Let I = [a, b] be a closed and bounded interval and f : I → R be
continuous on I. Then f is uniformly continuous on I.

Proof: The proof is by contradiction. Suppose f is continuous but not uniformly
continuous on I. Then the condition (∗∗) does not hold, i.e. there is an ε > 0 such
that for every δ > 0, there are x, y ∈ I such that |x − y| < δ, but |f (x) − f (y)| ≥ ε.

In particular, for each n ∈ N+ , we can take δ = 1/n and find xn , yn ∈ I such that
|xn − yn | < 1/n, but |f (xn ) − f (yn )| ≥ ε. The sequence (xn ) is bounded (why?), and
therefore has a subsequence (x_{n_k}) which converges to some x ∈ I, by Theorem 3.4.6.
The sequence (y_{n_k}) is likewise bounded, and therefore has a subsequence (y_{n_{k_ℓ}}),
which converges to some y ∈ I. It follows that we also have x_{n_{k_ℓ}} → x as ℓ → ∞.
Since |x_{n_{k_ℓ}} − y_{n_{k_ℓ}}| < 1/n_{k_ℓ} for every ℓ ∈ N+ , we have x_{n_{k_ℓ}} − y_{n_{k_ℓ}} → 0 as ℓ → ∞. Hence

0 = lim_{ℓ→∞} x_{n_{k_ℓ}} − lim_{ℓ→∞} y_{n_{k_ℓ}} = x − y,

and so x = y. This gives f (x) = f (y). But, by the continuity of f at x and y, we also have

|f (x) − f (y)| = | lim_{ℓ→∞} f (x_{n_{k_ℓ}}) − lim_{ℓ→∞} f (y_{n_{k_ℓ}})| = lim_{ℓ→∞} |f (x_{n_{k_ℓ}}) − f (y_{n_{k_ℓ}})| ≥ ε,

which gives a contradiction.

Summary:
In this section we used the precise definition of the limit of a function at a point
to give a rigorous definition of continuity of a function at a point. Continuity of a
function on an interval was then defined, and two important results about functions
continuous on closed bounded intervals were proved: they are bounded and attain
their maximum and minimum values on the interval at points in the interval. A third
important theorem says that if a continuous function on a closed bounded interval
changes sign between two points, it must be zero somewhere between these two
points. From this the Intermediate Value Theorem can be derived, and this theorem
has several useful consequences. An equivalent definition of continuity, not using
sequences (the ε − δ definition), was also derived. We also introduced the notion
of uniform continuity, and showed that a function which is continuous on a closed
bounded interval is uniformly continuous there.

Exercises

1. Let h(x) = x − [x] where [x] denotes the largest integer less than or equal to
x.

(a) Sketch a rough graph of h.
(b) Is h continuous at the point 1? And at 1/2? Justify your answers.
(c) Is h continuous on the interval [0, 1]? And on the interval [0, 1/2]? Justify
your answers.

2. Let I be an interval and suppose the function f : I → R has the property that
there is a constant C ∈ R+ such that |f (x) − f (y)| ≤ C|x − y| for all x, y ∈ I.
Prove that f is continuous on I.
[Functions with this property are called Lipschitz-continuous.]
3. Suppose f is continuous on [a, b].
(a) Show that if f (a) < 0 then there exists δ ∈ R+ so that f (x) < 0 for all
x ∈ [a, a + δ) ⊆ [a, b].
(b) Show that if f (b) > 0 then there exists δ ∈ R+ so that f (x) > 0 for all
x ∈ (b − δ, b] ⊆ [a, b].
4. Prove or disprove:

(a) The function f (x) = x is continuous on [0, ∞).
(b) The function g(x) = 17x⁷ − 19x⁵ − 1 has a zero in the interval (−1, 0).
5. Show that if the functions f and g are continuous at a ∈ R, then the function
h defined by h(x) = max{f (x), g(x)} is continuous at a.
[Hint: First show that 2 max{f (x), g(x)} = |f (x) − g(x)| + f (x) + g(x).]
6. If f : R → R is continuous and f (r) = 1 for all r ∈ Q, then f (x) = 1 for all
x ∈ R. [Use the fact that between any two real numbers there is a rational
number.]
7. Prove that if n is an odd positive integer and a ∈ R, then there is an x ∈ R
such that xⁿ = a.
[Hint: apply the Intermediate Value Theorem.]
8. Show that the function f : (0, 1) → R defined by f (x) = 1/x is continuous on
(0, 1), but not uniformly continuous there.

5.4 Differentiable functions


In this section we look very briefly at some properties of differentiable functions.
Most, if not all of them should be familiar to you from your first-year course. Our
main aim here is to show that now that we have a rigorous definition of a limit,
these results can all be proved. We will not pay any attention to the details of
differentiation techniques, since these were done in depth in your calculus course.
The usual proofs of the sum, difference, product and quotient rules can now all be
made rigorous by the limit rules we proved in Section 5.2. The chain rule can also
be proved, though it is a bit trickier than the other rules (see the exercises).

You may recall that the differentiability (and hence continuity) of the trigono-
metric and inverse trigonometric functions can be deduced from those of the function
sin x. The differentiability of sin x at any non-zero x can be derived from its differ-
entiability at 0. To show that sin x is differentiable at x = 0 you probably used a
geometric argument, which in its turn relies on a geometric definition of sin x. There
are ways around this. It is possible to define sin x, and many other functions by
means of infinite series. This idea will come up in the next chapter.
Differentiability of a function f at a point a, and the derivative f′(a) at a, was
defined in Section 5.1. We now look at differentiability on an open interval.

Definition 5.4.1 We say that a function f is differentiable on the open interval
(a, b) if and only if f′(x) exists for all x ∈ (a, b).

Definition 5.4.2 If a function f is defined on an interval I, we call x0 ∈ I a
maximum point of f on I if and only if f (x) ≤ f (x0 ) for all x ∈ I. Similarly
x0 ∈ I is a minimum point of f on I if and only if f (x) ≥ f (x0 ) for all x ∈ I.

Proposition 5.4.3 If the function f : (a, b) → R has a maximum (or minimum)
point at x0 ∈ (a, b) and f is differentiable at x0 , then f′(x0 ) = 0.

Proof: Suppose that x0 is a maximum point of f . First consider a sequence (hn )
such that hn → 0, hn > 0 and x0 + hn ∈ (a, b) for every n ∈ N+ . Since x0 is a
maximum point, f (x0 + hn ) ≤ f (x0 ) for every n. Then
(f (x0 + hn ) − f (x0 ))/hn ≤ 0 for every n, and
hence f′(x0 ) = lim_{n→∞} (f (x0 + hn ) − f (x0 ))/hn ≤ 0.
A similar argument, but this time using a sequence (hn ) such that hn → 0 and
hn < 0 for every n ∈ N+ , shows that we also have f′(x0 ) ≥ 0. It follows that
f′(x0 ) = 0.
The proof in the case where x0 is a minimum point is similar.

Theorem 5.4.4 (Rolle’s Theorem) If the function f is continuous on [a, b], differentiable
on (a, b) and f (a) = f (b), then there exists c ∈ (a, b) such that f′(c) = 0.

Proof: Consider two cases:
Case 1: f has a maximum point, c, on [a, b] such that c ≠ a and c ≠ b. By
Proposition 5.4.3, f′(c) = 0.
Case 2: If Case 1 does not hold, then a and b are maximum points for f on [a, b]
(since such a maximum point must exist, by Theorem 5.3.15, and f (a) = f (b)).
By Theorem 5.3.15 again, f has a minimum point on [a, b]. If f has a minimum point c
in (a, b), then f′(c) = 0 by Proposition 5.4.3. Otherwise a and b are also minimum
points, so f is constant on [a, b] and f′(c) = 0 for every c ∈ (a, b).

Example 5.4.5 The following two examples show that both conditions in the the-
orem are essential.

(a) Let

h(x) = x² if x ∈ [0, 2)
h(x) = −x + 4 if x ∈ [2, 4]

Then h(0) = 0 = h(4) but there is no c ∈ (0, 4) such that h′(c) = 0. The function is
of course not continuous on [0, 4] (it is not continuous at x = 2).

(b) Let g(x) = |x| for x ∈ [−2, 2]. Then g(−2) = 2 = g(2) but there is no c ∈
(−2, 2) such that g′(c) = 0. In this case, g is not differentiable on (−2, 2) (it is not
differentiable at x = 0).

Theorem 5.4.6 (The Mean Value Theorem) If the function f is continuous on
[a, b] and differentiable on (a, b) then there exists c ∈ (a, b) such that

f′(c) = (f (b) − f (a))/(b − a).

Proof: Define

g(x) = f (x) − ((f (b) − f (a))/(b − a))(x − a).

Then g is continuous on [a, b], differentiable on (a, b), and g(a) = g(b) = f (a).
Apply Rolle’s Theorem to g.
This gives c ∈ (a, b) such that g′(c) = 0, so f′(c) − (f (b) − f (a))/(b − a) = 0.
So f′(c) = (f (b) − f (a))/(b − a), as required.
Note that Rolle’s Theorem is a special case of the Mean Value Theorem. We did
Rolle’s Theorem first because the proof of the Mean Value Theorem is easy using it.
Note also that (f (b) − f (a))/(b − a) is the slope of the line joining the points (a, f (a))
and (b, f (b)). So the Mean Value Theorem states that there is a c ∈ (a, b) at which
the tangent line to the graph of f is parallel to this line.
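
As a concrete check, take f (x) = x³ on [0, 2]. The chord has slope (f (2) − f (0))/2 = 4, and f′(c) = 3c² = 4 gives c = 2/√3 ∈ (0, 2). The Python sketch below finds such a c numerically by bisection on f′(x) − slope. The helper names are our own, and the bisection step assumes that this difference changes sign from negative to positive on [a, b], which holds for this example but is not guaranteed in general.

```python
import math

def mean_value_point(df, slope, a, b, tol=1e-12):
    """Find c in [a, b] with df(c) = slope by bisection, assuming df(a) < slope < df(b)."""
    g = lambda x: df(x) - slope
    if not (g(a) < 0 < g(b)):
        raise ValueError("df - slope must change sign from negative to positive")
    while b - a > tol:
        mid = (a + b) / 2
        if g(mid) < 0:
            a = mid
        else:
            b = mid
    return (a + b) / 2

f = lambda x: x**3
df = lambda x: 3 * x**2          # derivative of f, supplied by hand
a, b = 0.0, 2.0
slope = (f(b) - f(a)) / (b - a)  # slope of the chord, here 4.0
c = mean_value_point(df, slope, a, b)
print(slope, c, 2 / math.sqrt(3))
```

The theorem itself only asserts that some such c exists; the computation is possible here because f′ is known explicitly.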
The Mean Value Theorem is a key ingredient in the proofs of many of the stan-
dard results about differentiability. We conclude with a few of these.

Corollary 5.4.7 If the function f is differentiable on (a, b) and f′(x) = 0 for all
x ∈ (a, b) then f is constant on (a, b).

Proof: Let x1 , x2 ∈ (a, b). Then f (x2 ) − f (x1 ) = (x2 − x1 )f′(c) for some c ∈ (a, b).
(Why?) But f′(c) = 0, so f (x2 ) = f (x1 ).

Corollary 5.4.8 If the functions f and g are both differentiable on (a, b) and f′(x) =
g′(x) for all x ∈ (a, b) then there is a constant c ∈ R such that f (x) = g(x) + c for
all x ∈ (a, b).

Proof: Exercise. (Apply Corollary 5.4.7 to h(x) = f (x) − g(x).)

Corollary 5.4.9 Suppose that the function f is differentiable on (a, b).

(a) If f′(x) ≥ 0 for all x ∈ (a, b) then f is increasing on (a, b).

(b) If f′(x) ≤ 0 for all x ∈ (a, b) then f is decreasing on (a, b).

Proof: (a) Suppose x1 , x2 ∈ (a, b) and x1 < x2 .
Then f (x2 ) − f (x1 ) = (x2 − x1 )f′(c) for some c ∈ (a, b).
So f (x2 ) − f (x1 ) ≥ 0, and so f (x2 ) ≥ f (x1 ) as required.
(b) The proof is similar.

Summary:
In this section we looked at some familiar results from calculus involving derivatives,
such as Rolle’s Theorem and the Mean Value Theorem, and showed that we are now
able to prove them rigorously using the results about limits and continuity proved
in the previous sections.

Historical notes
Michel Rolle (1652 – 1719)

Michel Rolle was born in Ambert, France, the son of a shopkeeper. He received very
little formal education beyond primary school. Before going to Paris in 1675, he
worked as an assistant to several attorneys in his home town. In Paris he worked
as a scribe and an expert in arithmetic, and soon married. He taught himself
mathematics and first came to the attention of the mathematical world when he
solved a problem posed by the mathematician Ozanam. This led to his being
awarded a state pension and an appointment in the civil service and as a tutor to

the sons of the French Secretary of War. He was elected to the Royal Academy of
Sciences in 1685, and subsequently did impressive work in mathematics. He suffered
a stroke in 1708, and this put an end to his mathematical work. He died in 1719
after a second stroke.
Rolle’s main contributions to mathematics were in the fields of number theory,
algebra and geometry. He published a book entitled Traité d’algèbre, on the theory
of equations, in 1690, in which he uses ideas that foreshadow the techniques of the
calculus. Despite this, he was a fierce critic of the calculus then being developed,
and this led to vigorous and sometimes acrimonious debates in the Academy. To
his credit, he later admitted that he had been wrong.
The theorem we now know as Rolle’s theorem appeared in work published by
Rolle in 1691 to justify the methods he used to solve equations. Rolle also introduced
the notation ⁿ√x, and popularised the use of the equality sign (=).

Exercises

1. Suppose that f (x) = x if x ≥ 1 but f (x) = x² if x < 1. Show that f is
continuous, but not differentiable at x = 1.

2. Let f : R → R be defined by f (x) = x sin(1/x) if x ≠ 0 and f (0) = 0. Show that
f is continuous but not differentiable at x = 0.

3. Suppose that f : R → R is n times differentiable on R and f (x) = 0 for n + 1
different values of x ∈ R. Prove that there is an x ∈ R such that f⁽ⁿ⁾(x) = 0.

4. Let I be an interval and f be a function that is differentiable on I.

(a) Show that if the derivative f′ is bounded on I, then f is Lipschitz-continuous
on I, i.e. there is a C ∈ R+ such that |f (x) − f (y)| ≤ C|x − y|
for all x, y ∈ I.
(b) Show that if I is a closed bounded interval and f′ is continuous on I, then
f is Lipschitz-continuous on I.

5. Prove or disprove: If f : R → R is differentiable on R, and f (n) = n for all
n ∈ N+ , then there are infinitely many points where f has derivative equal to
1. [Note that you are not assuming that f (x) = x for all x ∈ R. That would
be a bit too easy....]

6. Let f : [a, b] → R be continuous on [a, b] and differentiable on (a, b) and
suppose f′ is bounded on (a, b). Show that f is Lipschitz-continuous on [a, b]
(i.e. show that there is a C > 0 such that |f (x) − f (y)| ≤ C|x − y| for all
x, y ∈ [a, b]).
[Hint: Use the Mean Value Theorem.]

7. Let f (x) = cos x.

(a) Show that f has a fixed point in [0, 1] (i.e. that there is an x ∈ [0, 1] such
that f (x) = x).
(b) Let x1 = 1, xn+1 = f (xn ) for n ≥ 1. Use Question 6 to show that there
is a constant k such that 0 < k < 1 and |xn+1 − xn | ≤ kⁿ⁻¹ for all
n ∈ N+ , n ≥ 2.
(c) Use the inequality in (b) to show that (xn ) is a Cauchy sequence.
(d) Show that (xn ) converges to a fixed point of f in [0, 1].

8. Prove the following generalisation of the Mean Value Theorem, known as the
Cauchy Mean Value Theorem:
If the functions f and g are continuous on the interval [a, b] and differentiable
on (a, b), then there is a c ∈ (a, b) such that

(f (b) − f (a))g′(c) = (g(b) − g(a))f′(c).

[Hint: Apply Rolle’s Theorem to the function h defined by

h(x) = f (x)(g(b) − g(a)) − g(x)(f (b) − f (a)).]

9. Let f and g be functions such that g is differentiable at a ∈ R and f is
differentiable at g(a).

(a) Define the function ϕ near 0 by

ϕ(h) = (f (g(a + h)) − f (g(a)))/(g(a + h) − g(a)) if g(a + h) − g(a) ≠ 0
ϕ(h) = f′(g(a)) if g(a + h) − g(a) = 0

Prove that ϕ is continuous at 0.

(b) Use this result to show that (d/dx) f (g(x)) = f′(g(a))g′(a) at x = a (the chain rule).
Chapter 6

Sequences and series of functions

In this chapter we combine the notions of sequences, series and functions and look
at sequences and series of functions. This is not such an unfamiliar idea as it may
appear at first. We have in fact already studied one special example of a series of
functions: power series. The terms of a power series are functions (rather simple
ones, namely functions of the form fn (x) = an xⁿ , or power functions as they are
sometimes called). Taylor series are in turn special kinds of power series, and you
have probably realised by now that they are useful. Another very familiar example
of a series of functions is a Fourier series; in this case the functions are sine or cosine
functions of the form an cos nx or bn sin nx.
In the first section of the chapter we will be looking at sequences of functions
and define, in a very natural way, a limit function for such a sequence. We will be
primarily interested in whether the limit function inherits properties like continuity,
differentiability and integrability from the functions in the sequence. In the second
section we ask similar questions about series of functions, and specialise in the end
to power series.

6.1 Sequences of functions


In this section we take the notion of the limit of a sequence of real numbers and
use it to define a limit for a sequence (fn ) of real-valued functions. This limit is a
function itself, call it f , and is defined pointwise, by finding the limit of a sequence
of real numbers. The idea is to define f (x) to be the limit of the real sequence
(fn (x)), for each x for which it makes sense. Here is the precise definition:

CHAPTER 6. SEQUENCES AND SERIES OF FUNCTIONS 161

Definition 6.1.1 For each n ∈ N+ , let fn be a real-valued function. Let A be a
subset of R. If, for some x ∈ A, (fn (x)) is a convergent sequence of real numbers, we
can define
f (x) = lim_{n→∞} fn (x).
In this way, we obtain a new function f with domain
B = {x ∈ A : lim_{n→∞} fn (x) exists }.

This function f is called the pointwise limit of the sequence (fn ) of functions. We
write
fn → f pointwise on B.

Example 6.1.2 For each n ∈ N+ , let fn (x) = x/n.
Fix an x ∈ R and look at the sequence
(f1 (x), f2 (x), f3 (x), . . .) = (x, x/2, x/3, x/4, . . . , x/n, . . .).
This sequence is convergent, and tends to 0. This is the case for every x ∈ R. This
means that
B = {x ∈ R : lim_{n→∞} fn (x) exists } = R and f (x) = lim_{n→∞} fn (x) = 0 for all x ∈ R.

So fn → f pointwise on R.

Example 6.1.3 For each n ∈ N+ , let

gn (x) = nx if 0 ≤ x ≤ 1/n
gn (x) = 1 if 1/n < x ≤ 1

Fix x ∈ [0, 1] and look at (g1 (x), g2 (x), . . . , gn (x), . . .).
If x ≠ 0 there exists N ∈ N+ so that 1/N < x. Then gn (x) = 1 for all n ≥ N. So
gn (x) → 1.
If x = 0, gn (x) = 0 for all n ∈ N+ , so gn (x) → 0.
Hence lim_{n→∞} gn (x) exists for every x ∈ [0, 1]. Define

g(x) = 1 if x ∈ (0, 1]
g(x) = 0 if x = 0

Then gn → g pointwise on [0, 1].

Example 6.1.4 For each n ∈ N+ , let hn (x) = x²ⁿ for all x ∈ R.
Consider various values of x:
If x = 1, hn (x) = 1 for all n, so hn (x) → 1.
If x = −1, hn (x) = 1 for all n, so hn (x) → 1.
If −1 < x < 1, hn (x) = x²ⁿ → 0.
If |x| > 1, hn (x) = x²ⁿ → ∞ so lim_{n→∞} hn (x) is not defined.
Hence B = {x ∈ R : lim_{n→∞} hn (x) exists } = [−1, 1].
Define

h(x) = 0 if |x| < 1
h(x) = 1 if |x| = 1

Then hn → h pointwise on [−1, 1].

We now want to consider questions such as these: Suppose fn → f pointwise.

• If all the fn ’s are continuous, is f continuous?


• If all the fn ’s are differentiable, is f differentiable? If so, is f′(x) = lim_{n→∞} fn′(x)?

• If ∫_a^b fn (x) dx exists for all n, does ∫_a^b f (x) dx exist?
If so, is ∫_a^b f (x) dx = lim_{n→∞} ∫_a^b fn (x) dx?

Instead of the theorems you might expect, we get a list of counter-examples.

Example 6.1.5

(a) In Example 6.1.3 each gn is continuous on the interval [0, 1], but g, their pointwise
limit, is not.

(b) For each n ∈ N+ , define fn (x) = x²ⁿ/(1 + x²ⁿ) for each x ∈ R.
Then each fn is differentiable on the whole real line. Consider various values of x:
If |x| < 1, x²ⁿ → 0 so fn (x) → 0.
If |x| = 1, x²ⁿ → 1 so fn (x) → 1/2.
If |x| > 1, x²ⁿ → ∞ and fn (x) → 1.
Define

f (x) = 0 if |x| < 1
f (x) = 1/2 if |x| = 1
f (x) = 1 if |x| > 1

Then fn → f pointwise on R. Here each fn is differentiable on R, but f is not even
continuous, so certainly is not differentiable on R.

(c) For each n ∈ N+ , define the function gn : [0, 1] → R by

gn (x) = 2n²x if 0 ≤ x ≤ 1/(2n)
gn (x) = 2n − 2n²x if 1/(2n) < x ≤ 1/n
gn (x) = 0 if 1/n < x ≤ 1

Figure 6.1: Graphs of g1 , g2 , g3 and g4 .

On the interval [0, 1/n] the graph of gn forms an isosceles triangle of height n, with
area 1/2.
If x = 0, gn (x) = 0 for all n ∈ N+ so gn (x) → 0. If x ≠ 0, there exists N ∈ N+ so
that 1/n < x for all n ≥ N. So gn (x) = 0 for all n ≥ N, so gn (x) → 0. What this
shows is that gn → g pointwise, where g is the constant zero function.
Now ∫_0^1 gn (x) dx = 1/2 for all n ∈ N+ , but ∫_0^1 g(x) dx = 0, so
∫_0^1 g(x) dx ≠ lim_{n→∞} ∫_0^1 gn (x) dx.

Our idea of pointwise convergence, while perhaps easy to work with, gives
counter-intuitive (and not very useful) results. What is wrong? Even though fn → f
pointwise there is some sense in which fn is not necessarily “close” to f , even for
large n. (Look at Example 6.1.5(c) again.) We will remedy this state of affairs by
introducing a stronger form of convergence than pointwise convergence – it is called
uniform convergence.
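
The behaviour in Example 6.1.5(c) can be checked numerically. The Python sketch below evaluates gn at a fixed point, approximates the integral of gn over [0, 1] with a midpoint rule, and reports the peak value of gn , which grows without bound even though gn (x) → 0 for each fixed x. The function names, the sample point and the number of subintervals are our own choices.

```python
def g(n, x):
    """The tent function g_n of Example 6.1.5(c) on [0, 1]."""
    if 0 <= x <= 1 / (2 * n):
        return 2 * n * n * x
    if x <= 1 / n:
        return 2 * n - 2 * n * n * x
    return 0.0

def midpoint_integral(h, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of h over [a, b]."""
    width = (b - a) / steps
    return width * sum(h(a + (i + 0.5) * width) for i in range(steps))

x0 = 0.05  # any fixed point in (0, 1]
for n in [1, 5, 25, 125]:
    area = midpoint_integral(lambda x: g(n, x), 0.0, 1.0)
    peak = g(n, 1 / (2 * n))
    print(n, g(n, x0), area, peak)
```

The columns show the value at x0 eventually becoming 0, the area staying near 1/2, and the peak equal to n.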

Definition 6.1.6 Let I be an interval and (fn ) a sequence of real-valued functions
defined on I, and f : I → R. For each n ∈ N+ , let

dn = sup_{x∈I} |fn (x) − f (x)|.

We say that (fn ) converges uniformly to f on I (or f is the uniform limit of
(fn )) if and only if dn → 0 as n → ∞. We write: fn → f uniformly on I.

Intuitively, dn is meant to be a measure of the “distance” between fn and f , and fn
converges to f uniformly if this distance tends to 0 as n → ∞.
Note that, in the definition above, dn = sup_{x∈I} |fn (x) − f (x)| may fail to exist for some
(or all!) n ∈ N+ . If it does exist for all n ≥ N (for some fixed N), we calculate
lim_{n→∞} dn as usual (ignoring the first N terms). Otherwise, we conclude immediately
that fn does not converge uniformly to f .

Example 6.1.7 For each n ∈ N+ , define the function fn : [−1, 1] → R by fn (x) = x/n.
We show that fn → f uniformly on [−1, 1], where f is the constant zero function.
For any n ∈ N+ , dn = sup_{x∈[−1,1]} |x/n − 0| = 1/n.
Certainly 1/n → 0 as n → ∞, as required.
Now let fn (x) = x/n on R.
We show that fn does not converge uniformly to f , the constant zero function, on
R.
In this case dn = sup_{x∈R} |x/n| does not exist (no matter what n is).

Example 6.1.8 For each n ∈ N+ , let gn (x) = nx²/(1 + nx) and g(x) = x on the interval
[0, 1]. Then gn → g uniformly on [0, 1], since

dn = sup_{x∈[0,1]} |nx²/(1 + nx) − x|
= sup_{x∈[0,1]} x/(1 + nx)
= sup_{x∈(0,1]} 1/(n + 1/x)
= 1/(1 + n),

and so dn = 1/(1 + n) → 0 as required. (The point x = 0 may be dropped in the third
line because x/(1 + nx) is 0 there.)
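
The value dn = 1/(1 + n) in Example 6.1.8 can be compared with a grid estimate of the supremum. This Python sketch uses an evenly spaced grid on [0, 1]; the grid size and the function name are arbitrary assumptions. Because the grid contains x = 1, where the supremum is attained, the estimate agrees with the exact value up to rounding.

```python
def sup_distance(n, points=1001):
    """Grid estimate of d_n = sup |g_n(x) - g(x)| on [0, 1] for Example 6.1.8."""
    grid = [i / (points - 1) for i in range(points)]
    return max(abs(n * x * x / (1 + n * x) - x) for x in grid)

for n in [1, 10, 100, 1000]:
    print(n, sup_distance(n), 1 / (1 + n))
```

In general a grid maximum only gives a lower bound for a supremum, so this kind of check can suggest uniform convergence but cannot prove it.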

Proposition 6.1.9 A sequence (fn ) converges uniformly to a function f on an
interval I if and only if for every ε ∈ R+ there exists N ∈ N+ such that for every
n ∈ N+ and for every x ∈ I,

n ≥ N ⇒ |fn (x) − f (x)| < ε. (∗)

Proof: Suppose (fn ) converges to f uniformly on I. Let ε ∈ R+ be given.
Since dn → 0, there exists N ∈ N+ so that, for all n ∈ N+ ,

n ≥ N ⇒ |dn | < ε
⇒ sup_{x∈I} |fn (x) − f (x)| < ε
⇒ |fn (x) − f (x)| < ε for every x ∈ I.

Conversely, suppose the condition (∗) holds. Let ε ∈ R+ be given.
Applying (∗) with ε/2 in place of ε, there exists N ∈ N+ so that for every n ∈ N+
and every x ∈ I, n ≥ N ⇒ |fn (x) − f (x)| < ε/2. But then
n ≥ N ⇒ sup_{x∈I} |fn (x) − f (x)| ≤ ε/2 < ε, as required.
It is worthwhile to compare very carefully the criteria for pointwise and uniform
convergence:

• fn → f pointwise on I ⇐⇒
(∀x ∈ I)(∀ε ∈ R+ )(∃N ∈ N+ )(∀n ∈ N+ )[n ≥ N ⇒ |fn (x) − f (x)| < ε].

• fn → f uniformly on I ⇐⇒
(∀ε ∈ R+ )(∃N ∈ N+ )(∀n ∈ N+ )(∀x ∈ I)[n ≥ N ⇒ |fn (x) − f (x)| < ε].

They are not the same!

When showing that fn → f pointwise, the N chosen can depend on x and ε;
when showing that fn → f uniformly the N chosen depends only on ε – one N
must work for all x ∈ I. This may seem a minor distinction, but it makes all the
difference. It might make more sense with an example:

Example 6.1.10 We look at the sequence (fn ) on R defined by fn (x) = x/n again.
Recall that the pointwise limit function f in this case is the constant zero function.
We have already seen that fn → f pointwise but not uniformly.
When checking pointwise convergence, you would begin with:
Let x ∈ R and ε ∈ R+ be given.
We need N ∈ N+ so that n ≥ N ⇒ |x/n| < ε. That’s easy: take N > |x|/ε.

If you tried to prove uniform convergence, you would begin with:
Let ε ∈ R+ be given.
We need N ∈ N+ so that for every x ∈ R, n ≥ N ⇒ |x/n| < ε. This is clearly not
going to work: whichever N you choose, I could choose x so that |x/N| ≥ ε.
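
The dependence of N on x can be made concrete. For fn (x) = x/n, the condition |x/n| < ε holds exactly when n > |x|/ε, so the smallest N that works is ⌊|x|/ε⌋ + 1, and this grows without bound as |x| does. The Python sketch below tabulates it; the function name smallest_N is hypothetical, and ε = 0.25 is chosen only because it is represented exactly in floating point.

```python
import math

def smallest_N(x, eps):
    """Smallest positive integer N with |x/n| < eps for every n >= N."""
    return math.floor(abs(x) / eps) + 1

eps = 0.25
for x in [1, 10, 100, 1000]:
    N = smallest_N(x, eps)
    works = abs(x / N) < eps                         # N itself satisfies the bound
    minimal = N == 1 or abs(x / (N - 1)) >= eps      # N - 1 does not
    print(x, N, works, minimal)
```

No single row's N serves for all larger x, which is exactly the failure of uniform convergence on R.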
N
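
The way $N$ depends on $x$ in this example can be seen numerically. The following is a minimal Python sketch; the helper name `smallest_N` is ours and is introduced only for illustration. For given $x$ and $\epsilon$ it returns the least positive integer $N$ with $|x|/N < \epsilon$.

```python
import math

def smallest_N(x, eps):
    """Least positive integer N with |x| / N < eps (so |x/n| < eps for all n >= N)."""
    # |x|/N < eps  <=>  N > |x|/eps, so take the first integer strictly above |x|/eps.
    return math.floor(abs(x) / eps) + 1

eps = 0.5
for x in [1, 10, 100, 1000]:
    # The required N grows without bound as |x| grows.
    print(x, smallest_N(x, eps))
```

For a fixed $\epsilon$ the values returned grow without bound as $|x|$ grows, so no single $N$ can serve every $x \in \mathbb{R}$ at once. This is exactly the failure of uniform convergence described above.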

Corollary 6.1.11 If $(f_n)$ converges uniformly to $f$ on $I$, then $(f_n)$ converges to $f$ pointwise on $I$ also.

We are now in a position to prove some results involving uniform convergence in settings where pointwise convergence provided only counter-examples.

Theorem 6.1.12 If $(f_n)$ is a sequence of continuous functions on the interval $I$ and $(f_n)$ converges uniformly to $f$ on $I$, then $f$ is continuous on $I$.

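Proof: Let $a \in I$. We take a sequence $(x_m)$ contained in $I$ such that $x_m \to a$ as $m \to \infty$, and we show that $f(x_m) \to f(a)$.
We must show that, given $\epsilon \in \mathbb{R}^+$, there exists $M \in \mathbb{N}^+$ so that for every $m \in \mathbb{N}^+$, $m \ge M \Rightarrow |f(x_m) - f(a)| < \epsilon$. Let $\epsilon \in \mathbb{R}^+$ be given.
Since $f_n \to f$ uniformly on $I$ there exists $N \in \mathbb{N}^+$ so that for all $n \in \mathbb{N}^+$, $n \ge N \Rightarrow |f_n(x) - f(x)| < \frac{\epsilon}{3}$ for every $x \in I$.
In particular, then, $|f_N(x_m) - f(x_m)| < \frac{\epsilon}{3}$ for all $x_m$ and $|f_N(a) - f(a)| < \frac{\epsilon}{3}$.
Now $f_N$ is continuous on $I$, so there exists $M \in \mathbb{N}^+$ so that for all $m \in \mathbb{N}^+$, $m \ge M \Rightarrow |f_N(x_m) - f_N(a)| < \frac{\epsilon}{3}$.
Putting all this together we have, for all $m \in \mathbb{N}^+$, if $m \ge M$ then

$$\begin{aligned}
|f(x_m) - f(a)| &= |f(x_m) - f_N(x_m) + f_N(x_m) - f_N(a) + f_N(a) - f(a)| \\
&\le |f(x_m) - f_N(x_m)| + |f_N(x_m) - f_N(a)| + |f_N(a) - f(a)| \\
&< \frac{\epsilon}{3} + \frac{\epsilon}{3} + \frac{\epsilon}{3} \\
&= \epsilon.
\end{aligned}$$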

Corollary 6.1.13 If $(f_n)$ is a sequence of continuous functions on an interval $I$ converging pointwise to a function $f$ that is not continuous on $I$, then the convergence is not uniform.

Look at Examples 6.1.5(b) and (c) again – in both cases the convergence cannot
be uniform.

Theorem 6.1.14 If the sequence $(f_n)$ of continuous functions converges uniformly to $f$ on $[a, b]$, then

$$\lim_{n \to \infty} \int_a^b f_n(x)\,dx = \int_a^b f(x)\,dx.$$

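Proof: Let $\epsilon \in \mathbb{R}^+$ be given. Since $f_n \to f$ uniformly on $[a, b]$, there exists $N \in \mathbb{N}^+$ so that for all $n \in \mathbb{N}^+$ and all $x \in [a, b]$, $n \ge N \Rightarrow |f(x) - f_n(x)| < \frac{\epsilon}{2(b - a)}$. So, for all $n \in \mathbb{N}^+$, if $n \ge N$ then

$$\begin{aligned}
\left| \int_a^b f(x)\,dx - \int_a^b f_n(x)\,dx \right| &= \left| \int_a^b (f(x) - f_n(x))\,dx \right| \\
&\le \int_a^b |f(x) - f_n(x)|\,dx \\
&\le \int_a^b \frac{\epsilon}{2(b - a)}\,dx \\
&= \frac{\epsilon}{2} \\
&< \epsilon.
\end{aligned}$$

So $\int_a^b f_n(x)\,dx \to \int_a^b f(x)\,dx$ as $n \to \infty$.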
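
As an illustration of Theorem 6.1.14, consider the sequence $f_n(x) = x + \frac{\sin nx}{n}$ on $[0, 1]$, which converges uniformly to $f(x) = x$ because $|f_n(x) - f(x)| \le \frac{1}{n}$. This sequence is our own choice of example, not one from the notes. The Python sketch below approximates the integrals with a simple midpoint rule (the helper `midpoint_integral` is hypothetical and only approximates the integral) and checks that the gap between the integrals is at most $(b - a) \cdot \frac{1}{n}$.

```python
import math

def midpoint_integral(g, a, b, m=10_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / m
    return h * sum(g(a + (i + 0.5) * h) for i in range(m))

def f(x):
    return x

def f_n(n):
    """The n-th function x + sin(n x)/n; it differs from f by at most 1/n."""
    return lambda x: x + math.sin(n * x) / n

exact = 0.5  # integral of f(x) = x over [0, 1]
for n in [1, 10, 100]:
    gap = abs(midpoint_integral(f_n(n), 0.0, 1.0) - exact)
    # The proof's estimate gives gap <= (b - a) * sup|f_n - f| <= 1/n.
    print(n, gap, gap <= 1.0 / n)
```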
We define $\int_a^b f(x)\,dx$ formally in the next chapter and show there that if $f$ is continuous on $[a, b]$ then $\int_a^b f(x)\,dx$ exists. The properties of the integral required in the above proof are simple, and should be familiar from your first-year course. In the proof of the next result we will also need the Fundamental Theorem of Calculus.

Theorem 6.1.15 Let $(f_n)$ be a sequence of differentiable functions defined on $[a, b]$ which converges pointwise to $f$. Suppose also that the sequence $(f_n')$ converges uniformly to a continuous function $g$ on $[a, b]$. Then $f$ is differentiable on $(a, b)$ and $f'(x) = \lim_{n \to \infty} f_n'(x)$ for all $x \in (a, b)$.

Proof: Suppose $f_n' \to g$ uniformly on $[a, b]$. Fix $x \in (a, b)$, and apply Theorem 6.1.14 to the functions $f_n'$ and $g$ on the interval $[a, x]$. Then

$$\lim_{n \to \infty} \int_a^x f_n'(t)\,dt = \int_a^x g(t)\,dt.$$

So $\lim_{n \to \infty} [f_n(x) - f_n(a)] = \int_a^x g(t)\,dt$.
Since $f_n \to f$ pointwise on $[a, b]$, we obtain $f(x) - f(a) = \int_a^x g(t)\,dt$. By the Fundamental Theorem $f$ is differentiable and $f'(x) = g(x)$, so $f'(x) = \lim_{n \to \infty} f_n'(x)$.

Corollary 6.1.16 If $f_n$ and $f$ are as in the Theorem, then $f_n \to f$ uniformly on $[a, b]$.

Proof: Exercise.

In comparison with our nice results about continuity and integration (Theorem 6.1.12 and Theorem 6.1.14), the result about differentiability (Theorem 6.1.15) is a bit disappointing. It is possible for $(f_n)$ to converge uniformly to $f$ on $I$, for differentiable functions $f_n$, but for the conclusion $f'(x) = \lim_{n \to \infty} f_n'(x)$ to be false.

Example 6.1.17 Consider $f_n(x) = \frac{\sin nx}{n}$ on $[0, 2\pi]$. If $f$ is the constant zero function, $f_n \to f$ uniformly on $[0, 2\pi]$. To see this, note that

$$d_n = \sup_{x \in [0, 2\pi]} |f_n(x) - f(x)| = \sup_{x \in [0, 2\pi]} \left| \frac{\sin nx}{n} \right| \le \frac{1}{n},$$

so $d_n \to 0$.
Now $f_n'(x) = \cos nx$ for all $x \in [0, 2\pi]$, and $f'$ is the constant zero function. However $f_n'$ does not even converge pointwise to $f'$; for example $f_n'(\pi) = (-1)^n$ and this does not converge.
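
Both halves of this example can be checked numerically. The Python sketch below is only an illustration: `d_n_estimate` is a hypothetical helper that estimates the supremum on a finite grid, so it gives a lower estimate of $d_n$ rather than its exact value.

```python
import math

def d_n_estimate(n, samples=20_001):
    """Grid estimate of sup |sin(n x)/n| over [0, 2*pi]; it never exceeds 1/n."""
    xs = (2 * math.pi * i / (samples - 1) for i in range(samples))
    return max(abs(math.sin(n * x) / n) for x in xs)

for n in [1, 5, 50]:
    # The estimate shrinks like 1/n, in line with d_n -> 0.
    print(n, d_n_estimate(n), 1 / n)

# The derivatives f_n'(x) = cos(n x) at x = pi alternate between -1 and 1.
print([round(math.cos(n * math.pi)) for n in range(1, 7)])  # [-1, 1, -1, 1, -1, 1]
```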

Summary:
In this section we introduced the pointwise limit of a sequence of functions, and
showed that in general the limit of a sequence of continuous functions need not be
continuous, and that differentiation and integration are not interchangeable with the
limiting process. We then introduced the stronger notion of uniform convergence
of a sequence of functions and showed that for a uniformly convergent sequence of
functions we obtain far more satisfactory results.

Exercises

1. Find the pointwise limit $f$ of the sequence $(f_n)$, where this limit exists, if $f_n : \mathbb{R} \to \mathbb{R}$ and $f_n(x)$ is
(a) $e^{-nx}$ (b) $x e^{-nx}$ (c) $\frac{1}{1 + nx}$ (d) $\frac{n^2 x}{1 + n^3 x^2}$ (e) $\frac{x^n}{1 + x^n}$.
2. Find the pointwise limit $f$ of the sequence $(f_n)$, where this limit exists, if $f_n : \mathbb{R}^+ \to \mathbb{R}$ and $f_n(x)$ is
(a) $\frac{x}{1 + nx}$ (b) $\frac{n x^2}{1 + nx}$
3. For each of the sequences $(f_n)$ in Questions 1 and 2 that have a pointwise limit $f$, determine whether $(f_n)$ converges uniformly to $f$ or not.

4. Determine whether the following sequences of functions converge uniformly on the interval $[-1, 1]$ or not.
(a) $f_n(x) = e^{-nx^2}$ (b) $f_n(x) = \frac{1}{n} e^{-x^2}$.
5. For each $n \in \mathbb{N}^+$, define the function $f_n : \mathbb{R} \to \mathbb{R}$ by
$$f_n(x) = \frac{x}{1 + nx^2}.$$
(a) Find the pointwise limit $f$ of the sequence $(f_n)$.
(b) For a fixed $n$, sketch the graph of $f_n$ indicating maxima and minima. (Use the usual calculus graph-sketching techniques.)
(c) For fixed $n$, find $d_n = \sup\{|f_n(x) - f(x)| : x \in \mathbb{R}\}$. Use this to decide whether $f_n \to f$ uniformly on $\mathbb{R}$ or not.
(d) Find the pointwise limit $g$ of the sequence $(f_n')$ of derivatives.
(e) Note that $f$ is differentiable on $\mathbb{R}$, but that $f'(0) \ne g(0)$.

6. Answer the questions (a) - (c) of the previous question for the sequence of functions $g_n : \mathbb{R} \to \mathbb{R}$ defined by
$$g_n(x) = \frac{x^2}{n + x^2}.$$

6.2 Series of functions


We now turn our attention to series of functions. Since a convergent series of real
numbers is just the limit of the sequence of partial sums, there is a natural way to
handle series of functions, now that we know how to deal with limits of sequences of
functions. We simply define the series of functions to be the limit of the sequence of
partial sums of the functions (where this limit exists, of course). The first definition
makes this idea precise:

Definition 6.2.1 Let $I$ be an interval and for each $n \in \mathbb{N}^+$, let $f_n$ and $f$ be real-valued functions defined on $I$.
For each $x \in I$ and each $n \in \mathbb{N}^+$, let

$$s_n(x) = f_1(x) + f_2(x) + \ldots + f_n(x) = \sum_{k=1}^{n} f_k(x).$$

We say that the series of functions $\sum_{k=1}^{\infty} f_k$ converges pointwise to $f$ if and only if the sequence of functions $(s_n)$ converges pointwise to $f$.
Similarly $\sum_{k=1}^{\infty} f_k$ converges uniformly to $f$ if and only if the sequence $(s_n)$ converges uniformly to $f$.

Example 6.2.2 Consider the series $\sum \frac{(-1)^{k+1} x}{1 + kx}$ on the interval $[0, 1]$.
First let’s check that this series converges pointwise. Fix $x \in [0, 1]$. Now $\frac{x}{1 + nx} \to 0$ as $n \to \infty$ and (for fixed $x$, remember) $\left(\frac{x}{1 + nx}\right)$ is a decreasing sequence. So the Leibniz test applies and $\sum \frac{(-1)^{k+1} x}{1 + kx}$ converges to some real number, which we denote by $f(x)$.
Next let’s check that this series converges uniformly:
By the error estimate associated with the Leibniz test we have, for any $n \in \mathbb{N}^+$,

$$\left| \sum_{k=1}^{n} \frac{(-1)^{k+1} x}{1 + kx} - f(x) \right| \le \frac{x}{1 + (n + 1)x}.$$

Then

$$\begin{aligned}
d_n &= \sup_{x \in [0, 1]} \left| \sum_{k=1}^{n} \frac{(-1)^{k+1} x}{1 + kx} - f(x) \right| \\
&\le \sup_{x \in [0, 1]} \frac{x}{1 + (n + 1)x} \\
&= \frac{1}{1 + (n + 1)}.
\end{aligned}$$

The last equality follows because $\frac{x}{1 + (n + 1)x}$ is an increasing function of $x$ (for fixed $n$, now). Since $\frac{1}{n + 2} \to 0$, $d_n \to 0$ also.


Theorem 6.2.3 Suppose $\sum_{k=1}^{\infty} f_k$ converges uniformly to $f$ on $[a, b]$.

(a) If each $f_n$ is continuous on $[a, b]$, then $f$ is continuous on $[a, b]$.

(b) If each $f_n$ is continuous on $[a, b]$, then

$$\int_a^b f(x)\,dx = \sum_{k=1}^{\infty} \int_a^b f_k(x)\,dx.$$

(c) If each $f_n$ is differentiable and $\sum_{k=1}^{\infty} f_k'$ converges uniformly on $[a, b]$ to a continuous function, then $f$ is differentiable on $(a, b)$ and $f'(x) = \sum_{n=1}^{\infty} f_n'(x)$ for all $x \in (a, b)$.

Proof: Apply Theorems 6.1.12, 6.1.14 and 6.1.15 to the sequence of partial sums

$$s_n(x) = f_1(x) + f_2(x) + \ldots + f_n(x).$$

The next theorem provides a useful criterion for showing that a series of functions
converges uniformly.

Theorem 6.2.4 (The Weierstrass M-test for uniform convergence) Let $(M_n)$ be a sequence of positive real numbers such that $\sum M_n$ converges and let $(f_n)$ be a sequence of real-valued functions defined on an interval $I$ such that

$$|f_n(x)| \le M_n \text{ for each } x \in I.$$

Then, for each $x \in I$, $\sum f_n(x)$ converges absolutely, and $\sum f_n(x)$ converges uniformly on $I$ to the function $f(x) = \sum f_n(x)$.

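Proof: Checking for absolute convergence is a simple application of the comparison test, as we will see now. Suppose that $x \in I$ is given. For all $n \in \mathbb{N}^+$, we have $0 \le |f_n(x)| \le M_n$, so that $\sum |f_n(x)|$ converges, by the comparison test. Therefore $\sum f_n(x)$ converges absolutely.
To check for uniform convergence, we calculate, for any $N \in \mathbb{N}^+$,

$$\begin{aligned}
d_N &= \sup_{x \in I} \left| \sum_{n=1}^{\infty} f_n(x) - \sum_{n=1}^{N} f_n(x) \right| \\
&= \sup_{x \in I} \left| \sum_{n=N+1}^{\infty} f_n(x) \right| \\
&\le \sup_{x \in I} \sum_{n=N+1}^{\infty} |f_n(x)| \\
&\le \sum_{n=N+1}^{\infty} M_n \\
&= \sum_{n=1}^{\infty} M_n - \sum_{n=1}^{N} M_n \\
&\to 0 \text{ as } N \to \infty.
\end{aligned}$$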


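Example 6.2.5 We show that $\sum_{n=1}^{\infty} \frac{x^n}{n^2 + x^2}$ converges uniformly on $[0, 1]$.
For any $x \in [0, 1]$ and $n \in \mathbb{N}^+$, $\left| \frac{x^n}{n^2 + x^2} \right| \le \frac{1}{n^2}$. So we can use $(M_n) = \left( \frac{1}{n^2} \right)$.
Since $\sum \frac{1}{n^2}$ converges, the Weierstrass M-test shows that $\sum_{n=1}^{\infty} \frac{x^n}{n^2 + x^2}$ converges uniformly on $[0, 1]$.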
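
The two ingredients of the M-test in this example can be checked on a grid of sample points. The Python sketch below is an illustration under stated assumptions, not a proof: the names `term`, `M`, `grid`, `gap` and `tail` are ours, the supremum is taken only over 101 sample points, and the infinite tail is truncated at an arbitrarily chosen index $K$.

```python
def term(n, x):
    """n-th term x^n / (n^2 + x^2) of the series in Example 6.2.5."""
    return x**n / (n**2 + x**2)

def M(n):
    """Candidate Weierstrass bound M_n = 1/n^2."""
    return 1.0 / n**2

grid = [i / 100 for i in range(101)]   # sample points in [0, 1]
N, K = 10, 2000                        # compare the partial sums s_N and s_K

# Every sampled term is dominated by the corresponding M_n.
dominated = all(term(n, x) <= M(n) for n in range(1, 50) for x in grid)

# Largest sampled value of |s_K(x) - s_N(x)|, and the matching tail of sum M_n.
gap = max(sum(term(n, x) for n in range(N + 1, K + 1)) for x in grid)
tail = sum(M(n) for n in range(N + 1, K + 1))
print(dominated, gap <= tail)
```

The point of the comparison is that `tail` does not depend on $x$: one bound controls the remainder at every point of $[0, 1]$ simultaneously, which is what uniform convergence requires.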



For the rest of this section, we turn our attention to power series again. The
next few theorems make it clear why power series are so very useful in calculus.


Theorem 6.2.6 Let $\sum_{k=0}^{\infty} a_k x^k$ be a power series. For each $x$ in its interval of convergence $I$ put $f(x) = \sum_{k=0}^{\infty} a_k x^k$. If $[-r, r] \subset I$ then $\sum a_k x^k$ converges absolutely and uniformly to $f$ on $[-r, r]$.

Proof: For each $x \in [-r, r]$ and $n \in \mathbb{N}^+$, we have

$$|a_n x^n| = |a_n||x|^n \le |a_n| r^n.$$

We use $(M_n) = (|a_n| r^n)$. Since $r$ is in the interval of convergence of the series, $\sum |a_k| r^k$ converges. (See Theorem 4.5.2.) By the Weierstrass M-test, $\sum a_k x^k$ converges absolutely and uniformly to $f$ on $[-r, r]$.

We now look at differentiating and integrating power series. Power series are particularly well-behaved in this regard, and this is one of the reasons why they are so useful.


Theorem 6.2.7 Let $\sum_{n=0}^{\infty} a_n x^n$ be a power series with radius of convergence $R$. For each $x$ in its interval of convergence, put $f(x) = \sum_{n=0}^{\infty} a_n x^n$. Then $f$ is differentiable on $(-R, R)$, and for all $x \in (-R, R)$, $f'(x) = \sum_{n=1}^{\infty} n a_n x^{n-1}$.

Proof: Let $x \in (-R, R)$ and choose real numbers $r$ and $s$ such that $|x| < r < s < R$. By Theorem 6.2.6, $\sum a_n x^n$ converges uniformly and absolutely on $[-s, s]$. We show below that $\sum n a_n x^{n-1}$ converges uniformly on $[-r, r]$. This requires a bit of calculation. But once we have done that, we know that $\sum n a_n x^{n-1}$ converges to a continuous function, by Theorem 6.1.12, and so, by Theorem 6.1.15, $f'(x) = \sum n a_n x^{n-1}$.
Now, we show uniform convergence of $\sum n a_n x^{n-1}$. Note that, since $\sum a_n s^n$ is absolutely convergent, the sequence $(|a_n| s^n)$ converges to 0, and so is bounded. Let $B$ be an upper bound for it. Then, for any $x \in [-r, r]$,

$$\begin{aligned}
|n a_n x^{n-1}| &= n |a_n| |x|^{n-1} \\
&\le n |a_n| r^{n-1} \\
&= \frac{n |a_n| r^n s^n}{r s^n} \\
&= \frac{n}{r} \left( \frac{r}{s} \right)^n |a_n| s^n \\
&\le \frac{B}{r}\, n \left( \frac{r}{s} \right)^n.
\end{aligned}$$

Use the Ratio Test to check that $\sum n \left( \frac{r}{s} \right)^n$ converges. You will need the fact that $r < s$. By the Weierstrass M-test, $\sum n a_n x^{n-1}$ converges uniformly on $[-r, r]$, as required.

We follow up this result with a similar one for integrals. As before the working
for integrals is somewhat easier than that for derivatives.

Theorem 6.2.8 Let $f$ be as in Theorem 6.2.7. If $[c, d] \subset (-R, R)$, then

$$\int_c^d f(x)\,dx = \sum_{n=0}^{\infty} \frac{a_n}{n + 1} (d^{n+1} - c^{n+1}).$$

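Proof:

$$\begin{aligned}
\int_c^d f(x)\,dx &= \int_c^d \sum_{n=0}^{\infty} a_n x^n\,dx \\
&= \int_c^d \lim_{n \to \infty} \left( \sum_{k=0}^{n} a_k x^k \right) dx \\
&= \lim_{n \to \infty} \int_c^d \left( \sum_{k=0}^{n} a_k x^k \right) dx \\
&= \lim_{n \to \infty} \sum_{k=0}^{n} \int_c^d a_k x^k\,dx \\
&= \sum_{n=0}^{\infty} \frac{a_n}{n + 1} (d^{n+1} - c^{n+1}).
\end{aligned}$$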

Proposition 6.2.9 The three series $\sum_{n=0}^{\infty} a_n x^n$, $\sum_{n=1}^{\infty} n a_n x^{n-1}$ and $\sum_{n=0}^{\infty} \frac{a_n}{n + 1} x^{n+1}$ have the same radius of convergence. However, they need not have the same interval of convergence.

Proof: See the exercises.

To show that the intervals of convergence need not be the same, show that $\sum \frac{x^k}{k}$ has interval of convergence $[-1, 1)$, $\sum x^{k-1}$ has interval of convergence $(-1, 1)$ and $\sum \frac{x^{k+1}}{k(k + 1)}$ has interval of convergence $[-1, 1]$.


Example 6.2.10 Consider $f(x) = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{(2k + 1)!}$.

To find the radius and interval of convergence, apply the Ratio Test. For any $x \in \mathbb{R}$,

$$\left| \frac{(-1)^{n+1} x^{2n+3}}{(2n + 3)!} \cdot \frac{(2n + 1)!}{(-1)^n x^{2n+1}} \right| = \frac{x^2}{(2n + 3)(2n + 2)}.$$

This tends to 0 as $n \to \infty$, so the interval of convergence of this series is the whole real line.
The theorems of this section tell us a lot about this function:
$f$ is continuous on $\mathbb{R}$. (Why?)
$f$ is differentiable on $\mathbb{R}$. (Why?) If we differentiate $f$ term by term (as discussed in Theorem 6.2.7), we get

$$f'(x) = \sum (-1)^k (2k + 1) \frac{x^{2k}}{(2k + 1)!} = \sum (-1)^k \frac{x^{2k}}{(2k)!}.$$

If you differentiate again (check this), you get $f''(x) = -f(x)$.

These results are certainly not surprising if you recognise that $f(x) = \sin x$. What this illustrates, however, is that we could use this power series as a definition of $\sin x$, and would be able to find its derivative and integral from this definition.
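
The claim that this series and its term-by-term derivative reproduce $\sin x$ and $\cos x$ can be checked numerically. The following Python sketch uses partial sums with an arbitrarily chosen 20 terms; the function names `f_partial` and `f_prime_partial` are ours.

```python
import math

def f_partial(x, terms=20):
    """Partial sum of sum_k (-1)^k x^(2k+1) / (2k+1)!."""
    return sum((-1)**k * x**(2*k + 1) / math.factorial(2*k + 1) for k in range(terms))

def f_prime_partial(x, terms=20):
    """Partial sum of the term-by-term derivative sum_k (-1)^k x^(2k) / (2k)!."""
    return sum((-1)**k * x**(2*k) / math.factorial(2*k) for k in range(terms))

for x in [0.0, 1.0, 2.5, -3.0]:
    # Both differences should be at the level of floating-point rounding error.
    print(x, f_partial(x) - math.sin(x), f_prime_partial(x) - math.cos(x))
```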

Summary:
In this section we introduced the notion of pointwise and uniform convergence of
series of functions. The results of the previous section can be applied to the sequence
of partial sums of such series. The Weierstrass M-test is a very useful test for uniform
convergence of series of functions. When we apply the general results to power series, we find that such series are very well behaved, and can be differentiated and integrated term by term on any interval inside their interval of convergence.

Exercises

1. Decide whether the following series of functions converge uniformly on the


given intervals or not.
P k
(a) on (0, ∞).
x2
e−kx
(−1)k
P
(b) on [0, ∞).
k
P k
(c) x on (−1, 1).
2. Consider the power series

$$f(x) = \sum_{k=0}^{\infty} (-1)^k \frac{x^{2k+1}}{2k + 1} = x - \frac{x^3}{3} + \frac{x^5}{5} - \ldots + (-1)^n \frac{x^{2n+1}}{2n + 1} + \ldots$$

(a) For which $x \in \mathbb{R}$ is $f(x)$ defined?
(b) Find the power series for $f'(x)$. What is its interval of convergence?
(c) Now integrate the original power series term by term. What do you get? What is the interval of convergence of the resulting power series?
(d) Do you recognise the function $f$? What is it?
3. Read the following theorem and its proof and then answer the questions following it.
Theorem Suppose that the three series

$$(1)\ \sum_{k=0}^{\infty} a_k x^k \qquad (2)\ \sum_{k=1}^{\infty} k a_k x^{k-1} \qquad (3)\ \sum_{k=0}^{\infty} \frac{a_k}{k + 1} x^{k+1}$$

have radii of convergence $R_1$, $R_2$ and $R_3$ respectively, and that $R_1$, $R_2$ and $R_3$ are all in $\mathbb{R}$. Then $R_1 = R_2 = R_3$.
Proof: Note that the series (2) is obtained from (1) by differentiating each term with respect to $x$. Similarly, series (1) can be obtained from series (3) in this way. We will need this later. (1)
First we show that $R_2 \ge R_1$. (2)
Fix $x \in \mathbb{R}$ such that $|x| < R_1$, and choose $r$ such that $|x| < r < R_1$. (3)
Then $\sum a_k r^k$ is absolutely convergent. (4)
So the sequence $(|a_n r^n|)$ is bounded. (5)
So there exists $M \in \mathbb{R}^+$ so that $|a_n r^n| < M$ for all $n \in \mathbb{N}^+$. (6)
Then, for all $n \in \mathbb{N}^+$,

$$|n a_n x^{n-1}| = \left| n a_n r^{n-1} \left( \frac{x}{r} \right)^{n-1} \right| < n \frac{M}{r} \left( \frac{|x|}{r} \right)^{n-1}. \qquad (7)$$

The series $\sum k \left( \frac{|x|}{r} \right)^{k-1}$ is convergent. (8)
By the Comparison Test, $\sum k a_k x^{k-1}$ converges for $|x| < R_1$. (9)
So $R_2 \ge R_1$. (10)
Next we show that $R_3 \ge R_1$. (11)
For any $n \in \mathbb{N}^+$,

$$\left| \frac{a_n}{n + 1} x^{n+1} \right| = \left| \frac{a_n}{n + 1} r^{n+1} \left( \frac{x}{r} \right)^{n+1} \right| < M |x| \left( \frac{|x|}{r} \right)^n. \qquad (12)$$

From this, one can deduce that $R_3 \ge R_1$. (13)
However, as we stated at the start, the relation between $R_3$ and $R_1$ is the same as that between $R_1$ and $R_2$, so we also obtain

$$R_1 \ge R_3. \qquad (14)$$

So finally $R_1 = R_2 = R_3$. (15)

(a) How do you know that there is an r with the properties claimed in line
(3)?
(b) Why is the series in line (4) absolutely convergent?
(c) Why is the sequence $(|a_n r^n|)$ bounded, as claimed in line (5)?
(d) Show how the inequality in line (7) is obtained.
(e) Why is the series in line (8) convergent?
(f) How is line (9) obtained from the previous working?
(g) Why does line (10) follow?
(h) Show how the inequality in line (12) is obtained.
(i) Why does line (13) follow?
(j) Explain how the inequality in line (14) is obtained.
(k) Show that the three series need not have the same interval of convergence, by considering the case where $a_n = \frac{1}{n}$.
Chapter 7

Integration

One of our aims in this course has been to put first-year calculus on firmer founda-
tions. We have done this by giving proper definitions for the basic concepts encoun-
tered in the calculus: convergence, limits, continuity, differentiability. There is still
one key concept in the calculus that we have not dealt with, namely integration. In
your first-year course you have seen how the definite integral can be defined as a
limit of sums, and you were told that the definite integral of a continuous function
over a bounded interval exists. It seems reasonable to expect that now that we have
a better understanding of limits and continuity we should be able to treat integra-
tion a little more rigorously, and prove results like these. In this chapter we look
very briefly at a proper definition for definite integrals and show that such integrals
exist, not only for continuous functions, but in fact for a wider class of functions.
Perhaps rather surprisingly, we make extensive use of suprema and infima in the
process.

7.1 What is integration?


The picture that the word “integration” conjures up in the minds of most mathe-
matically inclined people has something to do with finding the area under the graph
of a continuous positive function on some bounded closed interval on the real line.
The canonical picture is shown in Figure 7.1 on the next page.
The argument that goes with this picture runs something like this. If the graph
of the function f on the interval [a, b] is a straight line, or composed of straight line
pieces, we can calculate the area under the graph by using known formulae for the
areas of polygons. If this is not the case, we can try to approximate the area of the

CHAPTER 7. INTEGRATION 180

Figure 7.1: Approximating the area under the graph of a positive continuous func-
tion f .

region under the graph of $f$ by summing the areas of a number of rectangles. These rectangles are obtained by subdividing the interval $[a, b]$ into a number (say $n$) of subintervals using the division points $x_0 = a, x_1, x_2, \ldots, x_n = b$. Each subinterval $[x_{i-1}, x_i]$ then becomes the base of a rectangle. The height of the rectangle is chosen to be the value of the function at some point $t_i$ in the interval $[x_{i-1}, x_i]$. The sum of the areas of the rectangles obtained in this way

$$\sum_{i=1}^{n} f(t_i)(x_i - x_{i-1})$$

then gives a (rough) approximation to the area under the graph of $f$. A sum like this is called a Riemann sum.
To improve on this approximation, the argument continues, we have to increase the number $n$ of subintervals. It is clear (always take this phrase with at least a pinch of salt!) that as $n$ tends to infinity, the approximations will tend to the area under the curve, at least if we ensure that the length of the largest subinterval also goes to 0 as $n$ tends to infinity. The area under the curve is therefore equal to the limit

$$\lim_{n \to \infty} \sum_{i=1}^{n} f(t_i)(x_i - x_{i-1}).$$

At this stage the integral $\int_a^b f(x)\,dx$ is introduced as shorthand for this limit. Note that we could (and in fact we do) formally write down the same kind of sums and limits in the case where $f$ is no longer positive on the whole of $[a, b]$. In this case the integral can no longer be interpreted as an area.

It is not difficult to expose the gaps in the argument above. Here are a few
uncomfortable questions:

• Have we in fact defined what we mean by the area under a graph?

• Does the limit appearing in the definition of the integral always exist (i.e. for all sequences of partitions of $[a, b]$ and all choices of the points $t_i \in [x_{i-1}, x_i]$)?

• If the limit always exists, are all these limits equal (i.e. is the limit independent of the choice of the sequence of partitions and of the points $t_i$ in the subintervals)?

We could get round the first question by admitting that we have not defined
what we mean by the area of a region in the plane that is not a polygon, and then
solving the problem by defining the area to be equal to the integral. This will agree
with our intuitive idea of area, but will only make sense if we know that the integral
exists, and has a unique value. So we are forced to consider the remaining two
questions.
At this stage it will pay to be more precise about definitions. We are going to
formulate the definitions in such a way that we can consider integrals of functions
that are not necessarily continuous.
Summary:
In this introductory section we raised some questions about the definition of the
definite integral usually given in introductory calculus courses.

7.2 Partitions, sums and integrals


The first step is to say exactly what a subdivision of an interval is.

Definition 7.2.1 A partition $P$ of the interval $[a, b]$ is a finite set $\{x_0, x_1, \ldots, x_n\}$ such that $a = x_0 < x_1 < x_2 < \ldots < x_n = b$.
If $P_1$ and $P_2$ are partitions of $[a, b]$, we say $P_2$ is a refinement of $P_1$ if $P_1 \subset P_2$.
If $P_1$ and $P_2$ are partitions of $[a, b]$, the partition $P_1 \cup P_2$ is called the common refinement of $P_1$ and $P_2$.

The next step is to associate with each partition certain special approximating
sums.

Definition 7.2.2 Let $f$ be a bounded real-valued function on the interval $[a, b]$, and $P = \{x_0, x_1, \ldots, x_n\}$ a partition of $[a, b]$. For $i = 1, 2, \ldots, n$, put

$$m_i = \inf\{f(x) : x \in [x_{i-1}, x_i]\}, \qquad M_i = \sup\{f(x) : x \in [x_{i-1}, x_i]\}.$$

The lower sum $L(P, f)$ associated with the partition $P$ is given by

$$L(P, f) = \sum_{i=1}^{n} m_i (x_i - x_{i-1})$$

and the upper sum $U(P, f)$ by

$$U(P, f) = \sum_{i=1}^{n} M_i (x_i - x_{i-1}).$$

Any sum of the form

$$R(P, f) = \sum_{i=1}^{n} f(t_i)(x_i - x_{i-1}),$$

where for each $i$, $t_i \in [x_{i-1}, x_i]$, is called a Riemann sum.

Figure 7.2: Upper and lower sums for a bounded function.

Remarks:

1. Since the function f is bounded on [a, b], it is also bounded on each of the
intervals [xi−1 , xi ]. The sets {f (x) : x ∈ [xi−1 , xi ]} are therefore bounded, and
this ensures that mi and Mi exist.

2. If $f$ is continuous on $[a, b]$, it is also continuous on each of the intervals $[x_{i-1}, x_i]$, and therefore in this case there are $c_i, d_i \in [x_{i-1}, x_i]$ such that $m_i = f(c_i)$ and $M_i = f(d_i)$. It follows that if $f$ is continuous, then the lower and upper sums are also Riemann sums.

3. It is clear that if $t_i \in [x_{i-1}, x_i]$, then $m_i \le f(t_i) \le M_i$. It follows that for any Riemann sum $R(P, f)$,

$$L(P, f) \le R(P, f) \le U(P, f).$$

4. Since the function $f$ is bounded on $[a, b]$, we can find constants $m$ and $M$ such that $m \le f(x) \le M$ for all $x \in [a, b]$. It follows easily from this (see Figure 7.2) that for any partition $P$ we have

$$m(b - a) \le L(P, f) \le U(P, f) \le M(b - a).$$

It follows from this that the set of all upper sums is a non-empty set of real
numbers which is bounded below, and therefore has an infimum. In the same
way it follows that the set of all lower sums has a supremum.
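
The inequalities in Remarks 3 and 4 can be seen in a small computation. The Python sketch below is restricted to increasing functions, an assumption we add so that the infimum and supremum on each subinterval are simply the values at its left and right endpoints; the function name `sums_increasing` and the choice $f(x) = x^2$ on $[0, 1]$ with midpoint tags are ours.

```python
def sums_increasing(f, partition, tags):
    """Return (L, R, U) for an increasing f on a partition with tags t_i in [x_{i-1}, x_i]."""
    L = R = U = 0.0
    for x0, x1, t in zip(partition, partition[1:], tags):
        width = x1 - x0
        L += f(x0) * width   # inf on [x0, x1] is f(x0) because f is increasing
        U += f(x1) * width   # sup on [x0, x1] is f(x1) because f is increasing
        R += f(t) * width    # Riemann sum with the chosen tag t
    return L, R, U

f = lambda x: x * x                      # increasing on [0, 1], with m = 0 and M = 1
n = 8
P = [i / n for i in range(n + 1)]        # uniform partition of [0, 1]
mid = [(u + v) / 2 for u, v in zip(P, P[1:])]
L, R, U = sums_increasing(f, P, mid)
print(L, R, U)                           # 0.2734375 0.33203125 0.3984375
```

Here $m(b - a) = 0 \le L \le R \le U \le 1 = M(b - a)$, as the remarks predict.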

The following definition now makes sense:

Definition 7.2.3 Let $f$ be a bounded real-valued function on the interval $[a, b]$. The lower integral of $f$ on $[a, b]$ is defined by

$$\underline{\int_a^b} f(x)\,dx = \sup\{L(P, f) : P \text{ is a partition of } [a, b]\}$$

and the upper integral by

$$\overline{\int_a^b} f(x)\,dx = \inf\{U(P, f) : P \text{ is a partition of } [a, b]\}.$$

To find out what the relationship between the lower and upper integral is, we
first have to look at the relationship of upper and lower sums for different partitions.
In what follows we’ll assume that f is a bounded real-valued function on [a, b].

Proposition 7.2.4 Let P1 and P2 be partitions of [a, b].

(a) If P2 is a refinement of P1 ,

L(P1 , f ) ≤ L(P2 , f ) and U(P2 , f ) ≤ U(P1 , f ).



Figure 7.3: The effect of a refinement on lower and upper sums.

(b) For any two partitions P1 and P2 ,

L(P1 , f ) ≤ U(P2 , f ).

Proof:

(a) We first look at the situation where P2 contains exactly one point more than
P1 , say P1 = {x0 , x1 , . . . , xn } and P2 = {x0 , x1 , . . . , xi−1 , y, xi , . . . , xn }, with
xi−1 < y < xi (see Figure 7.3). Let mi be defined as before, and put

p = inf{f (x) : xi−1 ≤ x ≤ y} and q = inf{f (x) : y ≤ x ≤ xi }.

Then mi ≤ p and mi ≤ q and therefore

L(P2 , f ) − L(P1 , f ) = p(y − xi−1 ) + q(xi − y) − mi (xi − xi−1 )


= (p − mi )(y − xi−1 ) + (q − mi )(xi − y)
≥ 0.

This shows that L(P1 , f ) ≤ L(P2 , f ) in this case. The general case (P2 contains
m more points than P1 ) can be proved by induction. The proof that U(P2 , f ) ≤
U(P1 , f ) is similar.

(b) Let P = P1 ∪ P2 be the common refinement of P1 and P2 . Then we have

L(P1 , f ) ≤ L(P, f ) ≤ U(P, f ) ≤ U(P2 , f ).



Theorem 7.2.5 Let $f$ be a bounded real-valued function on the interval $[a, b]$. Then

$$\underline{\int_a^b} f(x)\,dx \le \overline{\int_a^b} f(x)\,dx.$$

Proof: If $P$ and $Q$ are any two partitions of $[a, b]$, it follows from the previous proposition that $L(P, f) \le U(Q, f)$. Hence $L(P, f)$ is a lower bound for the set $\{U(Q, f) : Q \text{ is a partition of } [a, b]\}$ and so

$$L(P, f) \le \inf\{U(Q, f) : Q \text{ is a partition of } [a, b]\} = \overline{\int_a^b} f(x)\,dx.$$

Then

$$\underline{\int_a^b} f(x)\,dx = \sup\{L(P, f) : P \text{ is a partition of } [a, b]\} \le \overline{\int_a^b} f(x)\,dx.$$

The following example shows that the inequality in the theorem above can be a
strict one.

Example 7.2.6 Let $f : \mathbb{R} \to \mathbb{R}$ be the Dirichlet function defined by putting $f(x) = 1$ if $x$ is rational and $f(x) = 0$ if $x$ is irrational. If $P$ is any partition of $[0, 1]$, we have $L(P, f) = 0$ (since every subinterval of $[0, 1]$ contains an irrational number) and $U(P, f) = 1$ (since every subinterval of $[0, 1]$ contains a rational number). Hence

$$\underline{\int_0^1} f(x)\,dx = 0 < 1 = \overline{\int_0^1} f(x)\,dx.$$

Summary:
In this section we made precise the notion of a partition of a closed bounded interval
[a, b], and also that of a refinement of a partition. With each partition and each
bounded function defined on the interval we can associate a lower and upper sum.
Refining a partition increases the lower sum and decreases the upper sum. The
supremum of the set of all lower sums and the infimum of the set of all upper sums
are respectively called the lower integral and the upper integral. The lower integral
is always less than or equal to the upper integral, but there are bounded functions
for which the two are not equal.

7.3 When does the integral exist?


We are now ready to say what it means for a bounded function to be Riemann
integrable. The definition we give avoids the use of limits.

Definition 7.3.1 Let $f$ be a bounded real-valued function on the interval $[a, b]$. Then $f$ is said to be Riemann integrable over $[a, b]$ if

$$\underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx.$$

If this is the case, the common value of the two integrals is the Riemann integral of $f$ over $[a, b]$, and is denoted by $\int_a^b f(x)\,dx$.

Note that if the function $f$ is integrable then it follows from the definitions that

$$L(P, f) \le \int_a^b f(x)\,dx \le U(P, f)$$

for every partition $P$ of $[a, b]$, and that in fact the real number $\int_a^b f(x)\,dx$ is the unique real number with this property.
The Dirichlet function is an example of a bounded function that is not Riemann integrable over $[0, 1]$ (or over any interval on the real line, for that matter).
To show that there is in fact a large class of functions that are Riemann integrable, we need the following criterion for integrability:

Proposition 7.3.2 A bounded real-valued function $f$ on the interval $[a, b]$ is Riemann integrable if and only if for every $\epsilon > 0$, there is a partition $P_\epsilon$ of $[a, b]$ such that

$$U(P_\epsilon, f) - L(P_\epsilon, f) < \epsilon.$$

Proof: Suppose the condition is satisfied. We have to prove that

$$\underline{\int_a^b} f(x)\,dx = \overline{\int_a^b} f(x)\,dx.$$

It follows from Theorem 7.2.5 that it will be enough to show that

$$\overline{\int_a^b} f(x)\,dx \le \underline{\int_a^b} f(x)\,dx.$$

Let $\epsilon > 0$. By our assumption we can find a partition $P$ such that $U(P, f) - L(P, f) < \epsilon$. Then

$$\overline{\int_a^b} f(x)\,dx \le U(P, f) \le L(P, f) + \epsilon \le \underline{\int_a^b} f(x)\,dx + \epsilon.$$

Since $\epsilon > 0$ was arbitrary, the required inequality follows.


Suppose, conversely, that $f$ is integrable and $\epsilon > 0$. We can find partitions $P_1$ and $P_2$ of $[a, b]$ such that

$$U(P_1, f) < \overline{\int_a^b} f(x)\,dx + \frac{\epsilon}{2} \quad \text{and} \quad L(P_2, f) > \underline{\int_a^b} f(x)\,dx - \frac{\epsilon}{2}.$$

Let $P$ be the common refinement of $P_1$ and $P_2$. Then

$$U(P, f) - L(P, f) \le U(P_1, f) - L(P_2, f) < \overline{\int_a^b} f(x)\,dx - \underline{\int_a^b} f(x)\,dx + \epsilon = \epsilon,$$

using the fact that the upper and lower integrals are equal since $f$ is integrable.

Theorem 7.3.3 If the real-valued function f is continuous on the interval [a, b], it
is Riemann integrable over [a, b].

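Proof: Since the function $f$ is continuous on the closed and bounded interval $[a, b]$, it is uniformly continuous on $[a, b]$, by Theorem 5.3.26. We use Proposition 7.3.2 to show that $f$ is Riemann integrable. Let $\epsilon > 0$. Since $f$ is uniformly continuous on $[a, b]$, there is a $\delta > 0$ such that $|f(x) - f(y)| < \frac{\epsilon}{b - a}$ whenever $x, y \in [a, b]$ and $|x - y| < \delta$. Choose $n \in \mathbb{N}$ such that $\frac{b - a}{n} < \delta$ and let $P_n$ be the partition of $[a, b]$ into $n$ subintervals of equal length. If $M_i$ and $m_i$ are defined as usual, for $i = 1, 2, \ldots, n$, then $M_i$ and $m_i$ are values of $f$ at points of $[x_{i-1}, x_i]$ that are less than $\delta$ apart, so $M_i - m_i < \frac{\epsilon}{b - a}$ and

$$\begin{aligned}
U(P_n, f) - L(P_n, f) &= \sum_{i=1}^{n} (M_i - m_i) \frac{b - a}{n} \\
&< \sum_{i=1}^{n} \frac{\epsilon}{b - a} \cdot \frac{b - a}{n} \\
&= \epsilon.
\end{aligned}$$

It follows that $f$ is Riemann integrable.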


Proposition 7.3.2 enables us to identify another large class of Riemann integrable
functions.

Theorem 7.3.4 Let $f$ be a bounded monotone real-valued function on the interval $[a, b]$. Then $f$ is Riemann integrable over $[a, b]$.

Proof: It suffices to give a proof for a bounded increasing function; the proof for decreasing functions is similar. For such functions it is easy to see that

$$U(P_n, f) - L(P_n, f) = \frac{(f(b) - f(a))(b - a)}{n},$$

where $P_n$ is the subdivision of $[a, b]$ into $n$ subintervals of equal length. It follows from Proposition 7.3.2 that $f$ is Riemann integrable.

Since there are bounded monotone functions that are not continuous everywhere, it follows that the class of Riemann integrable functions is genuinely larger than the class of continuous functions.
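
The closed form used in the proof of Theorem 7.3.4 can be checked on a discontinuous example. The Python sketch below uses the floor function on $[0, 3]$, which is bounded and increasing but jumps at the integers. The example and the helper names `upper_minus_lower` and `n_for` are our own illustrative choices.

```python
import math

def upper_minus_lower(f, a, b, n):
    """U(P_n, f) - L(P_n, f) for an increasing f and the uniform partition P_n of [a, b]."""
    h = (b - a) / n
    xs = [a + i * h for i in range(n)] + [b]   # force the last point to be exactly b
    # For increasing f, M_i = f(x_i) and m_i = f(x_{i-1}).
    return sum((f(x1) - f(x0)) * h for x0, x1 in zip(xs, xs[1:]))

def n_for(f, a, b, eps):
    """An n for which the closed form (f(b) - f(a))(b - a)/n is below eps."""
    return math.floor((f(b) - f(a)) * (b - a) / eps) + 1

f = math.floor          # bounded, increasing, discontinuous at the integers
a, b, eps = 0.0, 3.0, 0.007
n = n_for(f, a, b, eps)
print(n, upper_minus_lower(f, a, b, n))
```

The computed difference agrees with $(f(b) - f(a))(b - a)/n$ and falls below the chosen $\epsilon$, which is the condition required by Proposition 7.3.2.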
Summary:
A bounded function is Riemann integrable on an interval if its lower integral is equal to its upper integral. In this section we derived a necessary and sufficient condition for a bounded function to be Riemann integrable, and used it to show that continuous and also monotone functions are integrable.
Historical notes
Georg Friedrich Bernhard Riemann (1826 – 1866)
Riemann’s father was a Lutheran minister who took responsibility for the initial schooling of his six children himself. He taught Bernhard until he was ten. At school Bernhard quickly showed an interest in mathematics. In 1846 Riemann went to the University of Göttingen, where he at first studied theology since he was encouraged to do so by his father. His interest was in mathematics, however, and he obtained permission from his father to change course. Although Gauss taught at Göttingen at the time, the mathematics department was not all that good, and Riemann moved to Berlin in 1847. There he came under the influence of Dirichlet. Riemann liked the latter’s intuitive way of reasoning and adopted the same style. During this time Riemann worked on a general theory of complex functions.


In 1849 Riemann returned to Göttingen and in 1851 he finished his Ph.D. under
the supervision of Gauss. In this he studied what are now called Riemann surfaces,
and used topological methods in complex function theory. On Gauss’s recommen-
dation he was appointed at Göttingen. In his research work on representation of
functions by trigonometric series he introduced conditions on a function to ensure
its integrability (now known as Riemann integrability).
To qualify as a lecturer, Riemann had to give a public lecture. The lecture he
gave, “On the hypotheses that lie at the foundations of geometry”, is now a classic of
mathematics. In it he defines what is now known as Riemannian space. Not many
people appreciated the depth of his ideas. It was only much later that his ideas
turned out to be exactly what Einstein needed for his theory of relativity.
In 1857 Riemann was appointed professor and published a very important paper
on the theory of abelian functions. In it he investigated the topological properties of
Riemann surfaces. Dirichlet, who had been appointed to the chair in mathematics
at Göttingen in 1855, died in 1859 and he was succeeded by Riemann. Shortly
afterwards Riemann was elected to the Berlin Academy of Sciences and on this occasion
delivered a lecture on the distribution of prime numbers. In it he used the so-called
zeta function
\[ \zeta(s) = \sum_{n=1}^{\infty} \frac{1}{n^s}, \]
which he showed could be regarded as a function of a complex variable. Riemann
conjectured that all the non-trivial roots of this function had real part equal to
1/2. This is now known as the Riemann hypothesis, and remains one of the major
unsolved problems of mathematics.
Riemann married in 1862. In the same year he caught a bad cold which later
developed into tuberculosis. He tried to improve his health by spending much time
in Italy with its warmer climate. However, his health deteriorated and he died, aged
39, while on a visit to Italy in July 1866.
Exercises

1. Let f(x) = x² and let Pn be the partition of the interval [0, 1] into n subintervals
of equal length.
(a) For each n ∈ N, find L(Pn, f) and U(Pn, f).
(b) Find the lower integral $\underline{\int_0^1} f(x)\,dx$.
(c) Find the upper integral $\overline{\int_0^1} f(x)\,dx$.
(d) Use only your answers to (a), (b) and (c) to determine whether f is Riemann
integrable over [0, 1]. If so, find the value of the integral.
2. Let f : [1, 4] → R be defined by
\[ f(x) = \begin{cases} 2 & \text{if } 1 \le x < \sqrt{2}, \\ 1 & \text{if } \sqrt{2} \le x < \sqrt{5}, \\ 3 & \text{if } \sqrt{5} \le x \le 4, \end{cases} \]
and let Pn be the partition of the interval [1, 4] into n subintervals of equal
length.
(a) Calculate L(P3 , f ) and U(P3 , f ).
(b) Calculate L(P6 , f ) and U(P6 , f ).
(c) Let n ≥ 3. Find U(Pn , f ) − L(Pn , f ) in terms of n.
(d) Is f Riemann-integrable on [1, 4]? If so, what is the value of the Riemann
integral? Give reasons for your answers.
3. Let f be a bounded real-valued function on [a, b]. Prove or disprove the following
statements:
(a) If there is a partition P of [a, b] such that L(P, f ) = U(P, f ), then f is
Riemann integrable on [a, b].
(b) If for every partition P of [a, b] we have L(P, f ) < U(P, f ), then f is not
Riemann integrable on [a, b].
4. Show that if f is Riemann integrable over [a, b] and |f(x)| ≤ M for all x ∈ [a, b],
then $\left| \int_a^b f(x)\,dx \right| \le M(b - a)$.
5. Let a < c < b and let f be a function on [a, b] defined by putting f(x) = 1 if
x = c and f(x) = 0 for all other x ∈ [a, b]. Prove that f is Riemann integrable
and that $\int_a^b f(x)\,dx = 0$. Deduce that if g is Riemann integrable on [a, b] and
the function h is obtained from g by changing the value of g at one point in
[a, b], then h is Riemann integrable and $\int_a^b h(x)\,dx = \int_a^b g(x)\,dx$.
6. Show that if f is Riemann integrable over [a, b] and a < c < b, then f is
Riemann integrable over [a, c] and over [c, b], and
\[ \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx. \]

[Hint: Show that it suffices to consider partitions of [a, b] containing the point
c.]

7. Show that if f and g are both Riemann integrable, so is f + g and
\[ \int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx. \]

8. Show that if f is Riemann integrable and c ∈ R, then cf is Riemann integrable
and
\[ \int_a^b c f(x)\,dx = c \int_a^b f(x)\,dx. \]

9. Show that if a function f is bounded on [a, b] and can be written as the
difference of two increasing functions, then it is Riemann integrable.

10. Let f be Riemann integrable over [a, b] and define the function F : [a, b] → R
by
\[ F(x) = \int_a^x f(t)\,dt \qquad (x \in [a, b]). \]
(a) Show that F is uniformly continuous on [a, b].
(b) Show that if f is continuous at x0 ∈ [a, b], then F is differentiable at x0
and F′(x0) = f(x0).

11. Let f be Riemann integrable over [a, b] and suppose that F is a differentiable
function on [a, b] such that F′(x) = f(x) for every x ∈ [a, b]. Prove that
\[ \int_a^b f(x)\,dx = F(b) - F(a). \]
[Hint: For every ε > 0, choose a partition P as in Proposition 7.3.2. Then use
the Mean Value Theorem.]
Appendix A

Logic and proofs

A.1 Basic logic


One of our aims in this module is to help you to read and understand proofs, and
to write simple proofs of your own. To be able to do this, you will need to know a
few things about elementary logic.
In what follows we introduce some basic ideas and terminology from logic. We
give a very concise outline here; for more detail you can consult the notes for the
Discrete Structures module.

• A (mathematical) statement or proposition is a sentence which is either true
or false, but not both.
• When we prove something, we show that a statement is true.
• Two or more propositions (statements) can be linked by connectives to form
a compound proposition (compound statement).
• Connectives are indicated by the words or phrases and, or, not, if ... then, if
and only if.
• We can use symbols as shorthand for the phrases above; the symbols
∧, ∨, ¬, ⇒, ⇔
are used respectively for the connectives
“and”, “or”, “not”, “if ... then” and “if and only if”.
• The precise meaning of these connectives can be given by making use of
truth tables.
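As a small illustration of the last point, the truth tables of the five connectives can be generated mechanically. The Python sketch below is our own; the helper names `implies`, `iff` and `truth_table` are not part of any library, and Python's `and`, `or` and `not` are used for ∧, ∨ and ¬.

```python
from itertools import product


def implies(p, q):
    """Truth value of "if p then q": false only when p is true and q is false."""
    return (not p) or q


def iff(p, q):
    """Truth value of "p if and only if q": true exactly when p and q agree."""
    return p == q


def truth_table():
    """Rows (P, Q, P and Q, P or Q, not P, P => Q, P <=> Q) for all truth values."""
    return [
        (p, q, p and q, p or q, not p, implies(p, q), iff(p, q))
        for p, q in product([True, False], repeat=2)
    ]


if __name__ == "__main__":
    print("P Q and or notP implies iff")
    for row in truth_table():
        print(" ".join("T" if value else "F" for value in row))
```

Note in particular that an implication with a false hypothesis counts as true; this convention is what makes the connective “if ... then” behave as it should in proofs.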

Example A.1.1 (a) The sentence “91 is divisible by 7” is a proposition, since we
can decide whether it is true or false (it is true).
(b) Is the sentence “x² > x” a proposition?
You may have a problem with the fact that we are not told what x is. Let’s avoid
this problem for the moment by assuming that x is a real number. If we know this,
can we say that “x² > x” is a statement? This depends on being able to decide
whether it is true or false. Someone may say: “It is false; just take x = 1/2.” But
then someone else may say: “It is true; just take x = 2.” It now becomes clear that
“x² > x” is sometimes true, and sometimes false; it all depends on the choice of
x. As it stands, it does not qualify as a statement. We can think of x as being
a variable. Whether the sentence is true or false will depend on the “value” this
variable has; for some values it will be true, for others false. Here we use “value”
for something that could be put in the place of a variable.
(c) The sentence
“If n² is an odd number, then n is an odd number”
looks like a compound proposition. The sentences
“n² is an odd number” and “n is an odd number.”
are linked by the connective “if ... then”. But are these sentences statements? Can
we decide whether they are true or false? Surely that depends on the value of n.
We again have sentences containing a variable; this time it can be taken to be an
integer.
(d) The sentence
“If n is a prime number, then n + 1 is a prime number”
has a similar structure. It contains two sentences depending on the variable n, and
linked by the connective “if ... then”. The variable n can be taken to be a positive
integer in this case.

The examples in (b), (c) and (d) show that in mathematics we quite naturally
come across sentences that contain variables, and that this disqualifies such sentences
from being statements. You may quite rightly feel unhappy about not taking such
sentences seriously. They seem to be saying something that makes sense. Perhaps
we should have a second look at what they seem to be saying. But let’s first agree on
a name for such problematical sentences: we’ll call a sentence containing variables
an open sentence. Whether such a sentence is true or false depends on the values
of the variables.
There is another bit of terminology associated with such open sentences. When
we write down an open sentence, we usually have in mind a set of elements that can
be substituted in the place of the variable(s) in the open sentence. We call this set
the universe for the sentence. Ideally we should specify the universe for the open
sentences we use; you will find that this is sometimes omitted if it is clear from the
context. We have already suggested the set of real numbers as the universe for the
open sentence “x² > x”. For the open sentence “n² is an odd number” the universe
could be taken to be the set of integers, and for the sentence “n is a prime number”
we could use the set of natural numbers.
Associated with an open sentence and a universe there is a truth set: the set of
all the elements of the universe which, when substituted in the open sentence, make
it a true statement. So, for example, 2 is in the truth set of x² > x, but 1/2 is not.
The truth set in this case is the union of the intervals (−∞, 0) and (1, ∞).
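To make the idea of a truth set concrete, here is a small Python sketch. It is only an illustration under a simplifying assumption: instead of all real numbers it uses a finite sample of rational numbers as the universe. The names `open_sentence`, `sample_universe` and `truth_set_sample` are our own. Recall that x² > x holds exactly when x < 0 or x > 1.

```python
from fractions import Fraction


def open_sentence(x):
    """The open sentence "x^2 > x", evaluated at the value x."""
    return x * x > x


# A finite sample standing in for the universe: -2, -3/2, -1, ..., 3/2, 2.
sample_universe = [Fraction(k, 2) for k in range(-4, 5)]

# The part of the truth set that lies in the sample.
truth_set_sample = [x for x in sample_universe if open_sentence(x)]

if __name__ == "__main__":
    print([str(x) for x in truth_set_sample])
```

Every element of the sample that makes the sentence true is either negative or larger than 1, while 0, 1/2 and 1 are left out.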
But what is the use of open sentences if we cannot even decide whether they
are true or false? We need something that will change an open sentence into a
statement. If we change the open sentence “x² > x” into
“For all real numbers x, x² > x”
we do get a statement, since we can decide whether it is true or false. (It is false,
since we have an example of a real number x for which it fails.)
We can also change the open sentence “x² > x” into
“There is a real number x such that x² > x.”
If we do this, we again get a statement, since we can decide whether it is true or
false. (It is true.)
Phrases such as “For all” and “There is (are)” are called quantifiers and they
are used to bind the variables in an open sentence to change it into a statement.
The examples above introduce the two most important kinds of quantifiers. Phrases
such as for all, for every, for any are called universal quantifiers and denoted
by the symbol ∀, while phrases such as there is, there exists, for some are called
existential quantifiers and denoted by the symbol ∃.
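Over a finite collection of values, Python's built-in functions `all` and `any` behave like the quantifiers ∀ and ∃. The sketch below is our own illustration (the names `open_sentence`, `sample` and the three results are assumptions, not notation from these notes). Keep in mind what a finite sample can and cannot do: one failing value is enough to show that a “for all” statement about the real numbers is false, and one succeeding value is enough to show that a “there exists” statement is true, but a sample on which every value succeeds would prove nothing about all real numbers.

```python
from fractions import Fraction


def open_sentence(x):
    """The open sentence "x^2 > x", evaluated at the value x."""
    return x * x > x


# A finite sample of real numbers: -2, -3/2, -1, ..., 3/2, 2.
sample = [Fraction(k, 2) for k in range(-4, 5)]

# "For all x in the sample, x^2 > x".
for_all_holds_on_sample = all(open_sentence(x) for x in sample)

# "There is an x in the sample such that x^2 > x".
exists_on_sample = any(open_sentence(x) for x in sample)

# Values for which the open sentence fails.
failures = [x for x in sample if not open_sentence(x)]

if __name__ == "__main__":
    print(for_all_holds_on_sample, exists_on_sample, [str(x) for x in failures])
```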
The example (c) is a little more difficult to analyse. You will probably agree
that when we say
“If n² is an odd number, then n is an odd number”
we really mean
“It does not matter what the integer n is,
as long as n² is an odd number, then n is an odd number.”
Another way to say this is:
“For all integers n, if n² is an odd number, then n is an odd number.”
Here the quantifier “For all” is used to bind the variable n in the open sentence
“if n² is an odd number, then n is an odd number.”
Once we have done this, it becomes a statement; we can in this case show it is true
(try to do this, if you have not done so already).
We can treat the example (d) in the same way. The open sentence “If n is a
prime number, then n + 1 is a prime number” can be changed into a statement
by using the universal quantifier to get “For all natural numbers n, if n is a prime
number, then n + 1 is a prime number.” It is a statement, because we can decide
whether it is true or false. In this case it is false (why?). We could also use the
existential quantifier: “There exists a natural number n such that if n is a prime
number, then n + 1 is a prime number.” In this case we get a true statement (why?).
It is unfortunately true that we tend to be somewhat sloppy about inserting
quantifiers when we state results. So, for example, you may well find that
“If n² is an odd number, then n is an odd number.”
is stated as a result.
What is passed off as a statement when this is done is strictly speaking only an open
sentence. The result, as we have seen, should read
“For every integer n, if n² is an odd number, then n is an odd number.”
Such sloppiness is widespread, and you will come across it often in these notes
and many mathematics texts you may read. It is assumed in such cases that it is
clear what the quantifier is that is necessary to turn the open sentence into a true
statement; almost invariably it will be some form of universal quantifier. Here is
another example:

Example A.1.2 Suppose you are asked to prove or disprove the statement “The
sum of two odd numbers is an even number.” We can write this as “If n and m
are odd numbers, then n + m is an even number.” This looks quite acceptable, but
strictly speaking it is an open sentence (containing the variables n and m), rather
than a statement. The intended statement is clearly: “For every integer n and every
integer m, if n is an odd number and m is an odd number, then n + m is
an even number.” In this form it is clear that if we want to prove the statement, we
will have to give an argument that is valid for every odd number n and every odd
number m; an example of two odd numbers with a sum which is an even number
will not prove the statement.

You may feel that the insistence on putting in quantifiers is bordering on splitting
hairs. What makes matters worse is that we are not going to adhere to our own
strict standards in future. The reason that we are so petty now is that one should
always be aware of the intended quantifier, even when it is not there explicitly. This
becomes crucial when trying to prove that a statement is false, as we’ll see later.
Two of the connectives listed earlier need some further comment. The word “not”
is not strictly speaking a connective, since it is not used to link two statements, but
rather to change the meaning of a statement by negating it. Thus applying “not”
to the statement “x = 0” we get the statement “x ≠ 0”, and applying “not” to the
statement “n is even”, we get the statement “n is not even” (or, equivalently, “n
is odd”).
The connective “if and only if” (which is often abbreviated to “iff”) is used to
indicate that two implications are both true. As an example, the sentence

n is an odd number if and only if n² is an odd number

says two things:

• If n is an odd number, then n² is an odd number, and

• If n² is an odd number, then n is an odd number.

When you are asked to prove a statement involving the connective “if and only if”,
keep in mind that there are two things you need to prove.

A.2 Definitions, proofs and counterexamples


It is important when you use definitions to realise that they are also compound
statements, using the connective “if and only if”. The confusing part here is that
definitions are usually given with only an “if” where there should really have been
an “if and only if”. Here is an example: the definition of a prime number is usually
given as

A natural number is a prime number if it is larger than 1 and divisible
only by itself and 1.

This should really be:

A natural number is a prime number if and only if it is larger than 1 and
divisible only by itself and 1.

When it is made clear that what is given is a definition, the less precise first version
is usually given (and we’ll do this in these notes as well).
Most of the theorems you’ll see in this course (and elsewhere) will be statements
of the form “For every x, if P (x), then Q(x)”, where P (x) and Q(x) are open
sentences depending on the variable x. To prove that such a statement is true, we
assume that the statement P (x) is true, and show that it follows from this that the
statement Q(x) is true, making sure that our argument is valid for every x in the
universe for the open sentences P (x) and Q(x). Write out a proof of the statement
“For every integer n, if n is an even number, then n² is an even number” and make
sure that your proof satisfies this requirement.
When working with implications, the order in which we write down things is
very important. Here is an example.
Look at the statements:

(a) For every natural number n > 1, if n is an odd number, then n² + 1 is not a prime
number.

(b) For every natural number n > 1, if n² + 1 is not a prime number, then n is an odd
number.

These two statements clearly do not say the same thing. We can prove that
statement (a) is true. (Do this, using the fact that if n is odd, n² will be odd. The
restriction n > 1 is needed because for n = 1 we get n² + 1 = 2, which is prime.)
However, we cannot assume that because statement (a) is true, statement (b) will
also be true (in fact, it is false).
More generally, if P and Q are statements, then P ⇒ Q is a statement again.
We call the statement Q ⇒ P the converse of the statement P ⇒ Q.
WARNING:

If a statement of the form P ⇒ Q is true, its converse Q ⇒ P need not be true.


We tend to use the term “converse” rather loosely. Even though “if n is an
odd number, then n² + 1 is not a prime number” is an open sentence rather than a
statement, we’ll refer to the open sentence “if n² + 1 is not a prime number, then
n is an odd number” as its converse. We’ll also call the statement (b) above the
converse of the statement (a).
We have claimed above that the statement

For every natural number n > 1, if n² + 1 is not a prime number, then n is
an odd number. (∗)

is not true. How do we prove that a statement of this form is false? The statement
claims that for every natural number something is true. To show that the claim is
false, it is enough to give an example of one natural number for which that something
is not true. The “something” in this case is the implication “if n² + 1 is not a prime
number, then n is an odd number”. We therefore need to find a natural number n,
such that n² + 1 is not a prime number, and n is not an odd number, that is, n is
an even number. This means that to disprove the statement (∗) (that is, to show
that it is false) we have to find one example of an even number n such that n² + 1
is not a prime number. A little experimentation shows that n = 8 will do, since
8² + 1 = 65 = 5 · 13 (there are others as well).
An example like this that is used to disprove a statement containing a universal
quantifier is called a counterexample.
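The “little experimentation” needed to find a counterexample can be left to a computer. The following Python sketch is our own illustration; the helper names `is_prime` and `counterexamples` are assumptions and the primality test is the naive one by trial division. It lists the even numbers n up to a given bound for which n² + 1 is not a prime number.

```python
def is_prime(n):
    """Naive primality test by trial division (adequate for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def counterexamples(limit):
    """Even n with 1 <= n <= limit such that n^2 + 1 is not a prime number.

    Each such n disproves "if n^2 + 1 is not prime, then n is odd".
    """
    return [n for n in range(1, limit + 1)
            if n % 2 == 0 and not is_prime(n * n + 1)]


if __name__ == "__main__":
    print(counterexamples(20))
```

A single element of this list is enough to disprove the statement; finding more of them does not make the statement any “more false”.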
We claimed a little earlier that the statement

For all natural numbers n, if n² is an odd number, then n is an odd
number. (∗∗)

is true, and asked you to prove it. You may have discovered that it is not immediately
clear how one could start a proof. A direct proof would start by assuming that n is
a natural number such that n² is an odd number. This means that we can find some
natural number k such that n² = 2k − 1. But where does one go from there? There
is a bit of logic that comes to the rescue here. First recall that ¬ Q is shorthand for
“not Q”. A statement of the form

P ⇒Q

is logically equivalent to the statement

¬ Q ⇒ ¬ P.

(one can show this using truth tables). We write this as

P ⇒ Q ≡ ¬ Q ⇒ ¬ P.

This means that if we can prove that one of the statements P ⇒ Q and ¬ Q ⇒ ¬ P
is true, the other will be true as well. We call ¬ Q ⇒ ¬ P the contrapositive of
P ⇒ Q. There are cases where it turns out to be easier to prove the contrapositive
than the statement itself.
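The truth-table check mentioned above is easy to carry out by machine. The Python sketch below is our own illustration (the names `implies`, `same_truth_table`, `statement`, `contrapositive` and `converse` are assumptions). It compares the truth tables of P ⇒ Q, of its contrapositive and of its converse over all four combinations of truth values.

```python
from itertools import product


def implies(p, q):
    """Truth value of "if p then q"."""
    return (not p) or q


def same_truth_table(form_a, form_b):
    """True if two compound statements in P and Q agree for all truth values."""
    return all(form_a(p, q) == form_b(p, q)
               for p, q in product([True, False], repeat=2))


def statement(p, q):
    return implies(p, q)            # P => Q


def contrapositive(p, q):
    return implies(not q, not p)    # not Q => not P


def converse(p, q):
    return implies(q, p)            # Q => P


if __name__ == "__main__":
    print(same_truth_table(statement, contrapositive))
    print(same_truth_table(statement, converse))
```

The first comparison succeeds and the second fails: an implication is logically equivalent to its contrapositive, but not to its converse.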
Let’s return to the statement we are trying to prove, and write down its contrapositive.
We have not looked at what is meant by the contrapositive of a statement
that contains a quantifier. We’ll say that the contrapositive of a statement of the
form
For every x, P (x) ⇒ Q(x)
is
For every x, ¬ Q(x) ⇒ ¬ P (x) .
The contrapositive of the statement (∗∗) above then becomes

For every natural number n, if n is not an odd number, then n² is not
an odd number.

or, equivalently,

For every natural number n, if n is an even number, then n² is an even
number.

This statement is in fact true. You will find this easier to prove; do this!
A closely related way of proving statements of the form P ⇒ Q is the method
of proof by contradiction. To use this method, we assume that the statement P
is true, and suppose that Q is false. Using these two assumptions, we then try to
prove a statement that is clearly false (this is the contradiction). The only way
to explain this contradiction is that the assumption that P is true and Q is false
was wrong. Hence if P is true, Q must be true as well. There is a good example
in Chapter 1 of this method of proof: the proof that there is no positive rational
number x such that x² = 2.
We often need to negate statements when doing a proof by contradiction, or when
writing down the contrapositive of a statement. These statements often contain
implications and quantifiers. We first look at the negation of an implication.
To say that the statement P ⇒ Q is not true means that P is true and Q is
not true. In symbols:
¬(P ⇒ Q) ≡ P ∧ (¬Q).

Next we look at how negation interacts with quantifiers.

• A statement of the form “ For every x, P (x)” says that for every x in some
universal set the statement P (x) is true. If this is not true, that means that
there is an x in the universal set for which P (x) is not true. The negation of
the statement “For every x, P (x)” is therefore the statement “There exists an
x such that ¬P (x)”. In symbols:
¬((∀x)(P(x))) ≡ (∃x)(¬P(x)).

• A statement of the form “There exists an x for which P(x)” says that there is
some x in the universal set for which the statement P (x) is true. If this is not
true, that means that for every x in the universal set P (x) is not true. The
negation of the statement “There exists an x for which P (x)” is therefore the
statement “For every x, ¬P (x)”. In symbols:
¬((∃x)(P(x))) ≡ (∀x)(¬P(x)).
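Both rules can be tried out on a finite universal set, where “for every” and “there exists” correspond to Python's `all` and `any`. The sketch below is only an illustration: the universal set {1, ..., 10} and the open sentences `is_even` and `exceeds_ten` are our own choices.

```python
def is_even(n):
    """Open sentence P(n): "n is an even number"."""
    return n % 2 == 0


def exceeds_ten(n):
    """Open sentence R(n): "n is larger than 10"."""
    return n > 10


universe = range(1, 11)  # the finite universal set {1, 2, ..., 10}

# Negation of "for every n, P(n)" versus "there exists n with not P(n)".
not_for_all = not all(is_even(n) for n in universe)
exists_not = any(not is_even(n) for n in universe)

# Negation of "there exists n with R(n)" versus "for every n, not R(n)".
not_exists = not any(exceeds_ten(n) for n in universe)
for_all_not = all(not exceeds_ten(n) for n in universe)

if __name__ == "__main__":
    print(not_for_all, exists_not)
    print(not_exists, for_all_not)
```

In each pair the two truth values agree, as the rules predict. A check like this on one universal set is of course no proof of the rules themselves.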
Example A.2.1 (a) The negation of the implication “if n² is an odd number, then
n is an odd number” is “n² is an odd number and n is not an odd number” or “n²
is an odd number and n is an even number”.
(b) The negation of the statement “For every n ∈ N+, if n² is an odd number, then
n is an odd number” is “There exists an n ∈ N+ such that n² is an odd number and
n is an even number”.
(c) Our final example looks at the negation of a statement with more than one
quantifier. We use the definition of convergence of a real sequence to 0. If you have
not seen it before, do not worry. Simply look at the structure of the statement in
the definition, even if you do not yet understand it.

A real sequence (xn) converges to 0 iff for every ε ∈ (0, ∞), there exists
an N ∈ N+ such that for every n ∈ N+, if n ≥ N, then |xn| < ε.

In symbols:

(xn) converges to 0 ⇔ (∀ε ∈ (0, ∞))(∃N ∈ N+)(∀n ∈ N+)[n ≥ N ⇒ |xn| < ε].

To say what it means for the sequence (xn) not to converge to 0, we have to negate
the statement following the “iff”. If we do this we get:

(xn) does not converge to 0 iff there exists an ε ∈ (0, ∞) such that for
every N ∈ N+, there exists an n ∈ N+ such that n ≥ N and |xn| ≥ ε.

In symbols:

(xn) does not converge to 0 ⇔ (∃ε ∈ (0, ∞))(∀N ∈ N+)(∃n ∈ N+)[(n ≥ N) ∧ (|xn| ≥ ε)].
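The negated statement tells us what has to be produced to show that a sequence does not converge to 0: one ε, and then for every N a suitable n. The Python sketch below illustrates this for the sequence xn = (−1)ⁿ with ε = 1/2. The example sequence and the names `x`, `find_witness`, `EPS` and `witnesses` are our own, and the code only tries N = 1, ..., 100, so it illustrates the structure of the statement rather than proving it.

```python
from fractions import Fraction


def x(n):
    """The n-th term of the example sequence x_n = (-1)^n."""
    return (-1) ** n


def find_witness(N, eps, search_width=10):
    """Return some n >= N with |x_n| >= eps, or None if none is found
    among n = N, N + 1, ..., N + search_width - 1."""
    for n in range(N, N + search_width):
        if abs(x(n)) >= eps:
            return n
    return None


EPS = Fraction(1, 2)  # the single epsilon that works for every N

# For each N tried, an n >= N with |x_n| >= EPS.
witnesses = {N: find_witness(N, EPS) for N in range(1, 101)}

if __name__ == "__main__":
    print(witnesses[1], witnesses[50], witnesses[100])
```

Here n = N always works, because |xn| = 1 for every n. Notice the order of the quantifiers: ε is chosen once, before N, while n is allowed to depend on N.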
Appendix B

Rational numbers

We have initially defined the set of rational numbers as the set {p/q : p ∈ Z, q ∈ N+ },
and addition and multiplication by:

p/q + r/s = (ps + qr)/(qs), (p/q)(r/s) = (pr)/(qs).

One could raise two objections to this. The first is that it is not clear what p/q
means. The second is that we would like, for example, 1/2 and 2/4 to be regarded
as equal. We can get around the first difficulty by replacing p/q by the ordered pair
(p, q), with p ∈ Z and q ∈ N+ . Let us write Q for this set, i.e.

Q = {(p, q) : p ∈ Z, q ∈ N+ }.

The definitions for addition and multiplication in Q then become:

(p, q) + (r, s) = (ps + qr, qs), (p, q)(r, s) = (pr, qs).

The way to get around the second difficulty is to introduce a new definition for
equality in the set of all such pairs:

(p, q) = (r, s) ⇔ ps = qr.

A more sophisticated approach to this is to define a relation ∼ on Q by

(p, q) ∼ (r, s) ⇔ ps = qr.

We can then check that ∼ is an equivalence relation. It follows that the equivalence
classes of this equivalence relation form a partition of Q (i.e. every element (p, q)
of Q belongs to exactly one equivalence class); we’ll denote this equivalence class by
[(p, q)]. The set of rational numbers, which we write as ℚ to distinguish it from the
set Q of ordered pairs, is then defined to be the set of all these equivalence classes:

ℚ = {[(p, q)] : (p, q) ∈ Q} = {[(p, q)] : p ∈ Z, q ∈ N+}.

Addition and multiplication on ℚ are then defined by

[(p, q)] + [(r, s)] = [(ps + qr, qs)], [(p, q)][(r, s)] = [(pr, qs)].

Note that we are adding and multiplying equivalence classes here. The first thing
that needs to be checked now is that these definitions of addition and multiplication
do not depend on the representatives that we choose from equivalence
classes. For example, we must show that if [(p, q)] = [(p1, q1)] and [(r, s)] = [(r1, s1)],
then [(ps + qr, qs)] = [(p1 s1 + q1 r1, q1 s1)]. Once we have done this, it is then possible
to check the properties A1 – A4, M1 – M4 and D for the rational numbers using the
properties of the integers.
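The construction can be tried out on a computer. In the Python sketch below a pair (p, q) is represented by a tuple of integers, and the functions `equivalent`, `add` and `mul` (names of our own choosing) implement the relation ∼ and the two operations on representatives. The final lines check, for one choice of representatives only, that equivalent inputs give equivalent outputs; this illustrates what has to be proved, but it does not replace the proof.

```python
def equivalent(a, b):
    """(p, q) ~ (r, s) if and only if ps = qr."""
    (p, q), (r, s) = a, b
    return p * s == q * r


def add(a, b):
    """Sum of two pairs: (p, q) + (r, s) = (ps + qr, qs)."""
    (p, q), (r, s) = a, b
    return (p * s + q * r, q * s)


def mul(a, b):
    """Product of two pairs: (p, q)(r, s) = (pr, qs)."""
    (p, q), (r, s) = a, b
    return (p * r, q * s)


# Two representatives of the class of 1/2 and two of the class of 1/3.
half, half_again = (1, 2), (2, 4)
third, third_again = (1, 3), (3, 9)

sum_one = add(half, third)              # (5, 6)
sum_two = add(half_again, third_again)  # (30, 36)

if __name__ == "__main__":
    print(sum_one, sum_two, equivalent(sum_one, sum_two))
    print(equivalent(mul(half, third), mul(half_again, third_again)))
```

The two sums are different pairs, but they are equivalent, so they represent the same rational number.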
We can define an order relation on the rational numbers (i.e. an order relation on
equivalence classes of ordered pairs) by

[(p, q)] > [(r, s)] ⇔ ps > rq, p, r ∈ Z, q, s ∈ N+ .

We will have to check again that this definition does not depend on the representa-
tives we choose from the equivalence classes. The properties O1 – O4 can then be
checked, using the order properties of the integers.
The above gives a very brief sketch of the construction of the rational numbers,
using the integers and their properties. There is a fair amount of work to be done
to check all the properties, but the checking is not unduly complicated.
One could of course go even further back, and ask how the integers are con-
structed. It is possible to start with a set of axioms for the natural numbers N
(the Peano axioms) and use these to define addition and multiplication on N, and
then to prove the usual properties of the natural numbers. Then the integers can
be constructed as equivalence classes of ordered pairs of natural numbers, in a fash-
ion somewhat similar to the method we indicated above for the construction of the
rational numbers. If you are interested, you will find the detail in many algebra
books.