
G1CMIN Measure and Integration 2003-4: Prof. J.K. Langley May 13, 2004


G1CMIN Measure and Integration 2003-4

Prof. J.K. Langley


May 13, 2004
1 Introduction
Books: W. Rudin, Real and Complex Analysis ; H.L. Royden, Real Analysis (QA331).
Lecturer: Prof. J.K. Langley (jkl@maths, room C6c, 95 14964).
Lectures: Wed 11 C29; Th 1 C4; Th 5 C29.
There will be NO lectures in the week commencing Feb 9th.
Office hours: none specified, but my door is open most of the time and you're free to consult me.
Assessment: 2.5 hour written examination. 5 questions, best 4 count.
PLEASE NOTE: G1CMIN is a traditional pure mathematics theory module along the familiar
definition, theorem, proof structure; in particular there isn't much scope for calculations
and the module is more like G12RAN or G13MTS than G12CAN.
Contents: review of real analysis, Lebesgue measure, Lebesgue integration.
Outline: there are two main themes to the module. The idea of measure is concerned with
the size of sets. The first distinction we meet between sets is usually between finite and infinite
sets. We then refine the idea of an infinite set to distinguish between the countable and the
uncountable (we'll review this concept). When we discuss the Lebesgue measure of a subset of
$\mathbb{R}$, it will give us an indication of how much of the line is filled up by our set. Thus we will be
able to distinguish big uncountable sets from smaller ones.
The second main theme involves integration. G12RAN introduces the Riemann integral, which
has advantages in that it is relatively easy to define, and displays well the link between integration
and differentiation. Its main drawbacks are: (i) the class of Riemann integrable functions is too
small; (ii) it has technical problems, particularly with regard to whether
\[ \lim_{n \to \infty} \left( \int f_n \right) = \int \left( \lim_{n \to \infty} f_n \right). \tag{1} \]
Also, Riemann's integral is difficult to generalize to other settings.
Lebesgue measure gives a means for comparing the size of sets and leads to Lebesgue integration,
which is used widely in pure maths, probability, mathematical physics, PDEs etc. This
module will go as far as the main theorems concerning when (1) holds. More advanced topics
will only be covered if time permits.
Aims and Learning Outcomes:
Aims: to teach the elements of measure theory and Lebesgue integration.
Learning Outcomes: a successful student will:
1. be able to state, and apply in the investigation of examples, the principal theorems as treated;
2. be able to prove simple propositions concerning sets, measure spaces, Lebesgue measure and
integrable functions.
Coursework: is not part of the assessment for G1CMIN, but the questions should be helpful
practice for the exam. There will be a short assignment each week, for handing in the week after.
For anyone taking this module for G13ES1 (Supplementary Maths), the assessment will consist
of a separate coursework assignment handed out by the end of Week 6 of the semester. It
will be due for handing in on the last day of the Spring term and will be based on material from
the first half of the module. You need only attend enough lectures to cover this material.
Web notes: you can find printed notes at
www.maths.nott.ac.uk/personal/jkl/min03.pdf
The lectures will, however, cover all of the material (except for some proofs given in previous
modules such as G1ALIM) so you may prefer to use the Web notes either not at all or just
as backup. These notes may contain errors, omissions or obscure parts: these will be amended
as and when I find them.
2 Sets
2.1 Countability
This is an important idea when deciding how big infinite sets are compared to each other. We shall see that $\mathbb{R}$ is a bigger set than $\mathbb{Q}$. We say that a set $A$ is countable if either $A$ is empty or there is a sequence $(a_n)$, $n = 1, 2, 3, \ldots$, which uses up $A$, by which we mean that each $a_n \in A$ and each member of $A$ appears at least once in the sequence. This is the same as saying that there is a surjective (onto) function $f : \mathbb{N} \to A$ (via $a_n = f(n)$).
FACT 1: Any finite set is countable.
Indeed, if $A = \{x_1, \ldots, x_N\}$, just put $a_n = x_n$ if $n \le N$, and $a_n = x_N$ if $n > N$.
FACT 2: If $B \subseteq A$ and $A$ is countable, then $B$ is countable.
If $B$ is finite, this is obvious. If $B$ is not finite, then nor is $A$, so take a sequence which uses up $A$, and delete all entries in the sequence which don't belong to $B$. We then get an infinite sequence which uses up $B$.
FACT 3: Suppose that $A$ is an infinite, countable set. Then there is a sequence $(b_n)$, $n = 1, 2, \ldots$, of members of $A$ in which each member of $A$ appears exactly once.
To see this, suppose that $(a_n)$, $n = 1, 2, \ldots$, uses up $A$. Go through the list, deleting any entry which has previously occurred. So if $a_n = a_j$ for some $j < n$, we delete $a_n$. The resulting subsequence includes each member of $A$ exactly once. We have thus arranged $A$ into a sequence - first element, second element etc. - hence the name countable.
FACT 4: Suppose that $A_1, A_2, A_3, \ldots$ are countably many countable sets. Then the union $U = \bigcup_{n=1}^{\infty} A_n$, which is the set of all $x$ which each belong to at least one $A_n$, is countable.
Proof: delete any $A_j$ which are empty, and re-label the rest. Now suppose that the $j$th set $A_j$ is used up by the sequence $(a_{j,n})$, $n = 1, 2, \ldots$. Write out these sequences as follows:
\[
\begin{array}{ccccc}
a_{1,1} & a_{1,2} & a_{1,3} & a_{1,4} & \ldots \\
a_{2,1} & a_{2,2} & a_{2,3} & a_{2,4} & \ldots \\
a_{3,1} & a_{3,2} & a_{3,3} & a_{3,4} & \ldots
\end{array}
\]
etc. Now the following sequence uses up all of $U$. We take
\[ a_{1,1},\ a_{1,2},\ a_{2,1},\ a_{1,3},\ a_{2,2},\ a_{3,1},\ a_{1,4},\ a_{2,3},\ \ldots \]
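The diagonal ordering used in the proof of Fact 4 can be sketched in Python. This is an illustration only: the name `diagonal_indices` is an assumption of this sketch and not notation from the lectures. It lists the index pairs $(j, n)$ in the order in which the terms $a_{j,n}$ are taken, so that every pair is reached after finitely many steps.

```python
from itertools import count, islice

def diagonal_indices():
    """Yield index pairs (j, n) in the order (1,1), (1,2), (2,1), (1,3), ..."""
    for s in count(2):              # s = j + n is constant along each diagonal
        for j in range(1, s):       # j runs 1, ..., s-1, so n = s - j runs s-1, ..., 1
            yield (j, s - j)

# The first eight pairs match the sequence written out in the proof.
first_eight = list(islice(diagonal_indices(), 8))
print(first_eight)
```

The pair $(j, n)$ lies on the diagonal $s = j + n$, and only finitely many pairs come before that diagonal ends, which is why each $a_{j,n}$ appears in the combined sequence.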
FACT 5: The set of positive rational numbers is countable. The reason is that this set is the union of the sets $A_m = \{p/m : p \in \mathbb{N}\}$, each of which is countable. Similarly, the set of negative rational numbers is countable, and so is $\mathbb{Q}$ (the union of these two sets and $\{0\}$).
FACT 6: If $A$ and $B$ are countable sets, then so is the Cartesian product $A \times B$, which is the set of all ordered pairs $(a, b)$, with $a \in A$ and $b \in B$.
Here ordered means that $(a, b) \ne (b, a)$ unless $a = b$.
Fact 6 is obvious if $A$ or $B$ is empty. Otherwise, if $(a_n)$ uses up $A$ and $(b_n)$ uses up $B$, then $A \times B$ is the union of the sets $C_n = \{(a_n, b_m) : m = 1, 2, 3, \ldots\}$, each of which is countable.
FACT 7: the interval $(0, 1)$ is not countable, and therefore nor are $\mathbb{R}$, $\mathbb{C}$.
Proof: We prove the following stronger assertion. Consider the collection $T$ of all real numbers $x = 0.d_1 d_2 d_3 d_4 \ldots$ in which each digit $d_j$ is either 4 or 5. Then $T$ is uncountable.
Suppose that the sequence $(a_n)$, $n = 1, 2, \ldots$, uses up $T$. Write out each $a_j$ as a decimal expansion
\[
\begin{aligned}
a_1 &= 0.b_{1,1} b_{1,2} b_{1,3} \ldots \\
a_2 &= 0.b_{2,1} b_{2,2} b_{2,3} \ldots \\
a_3 &= 0.b_{3,1} b_{3,2} b_{3,3} \ldots
\end{aligned}
\]
etc. Here each digit $b_{j,k}$ is 4 or 5. We make a new number $x = 0.c_1 c_2 c_3 c_4 \ldots$ as follows. We look at $b_{n,n}$. If $b_{n,n} = 4$, we put $c_n = 5$, while if $b_{n,n} = 5$, we put $c_n = 4$. Now $x$ cannot belong to the list above, for if we had $x = a_m$, then we'd have $c_m = b_{m,m}$, which isn't true.
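The digit-flipping step in the proof of Fact 7 can be tried on a finite list. The sketch below is only an illustration on truncated expansions: the function name `diagonal_flip` and the sample rows are assumptions made here, and a finite list of course proves nothing about uncountability by itself.

```python
def diagonal_flip(rows):
    """rows[n] holds the digits (each '4' or '5') of the n-th listed number.

    Returns a digit string whose n-th digit differs from the n-th digit of rows[n].
    """
    return "".join("5" if row[n] == "4" else "4" for n, row in enumerate(rows))

rows = ["4545", "5544", "4444", "5554"]   # hypothetical truncated expansions
x = diagonal_flip(rows)
print(x)
```

By construction `x` differs from `rows[n]` in position `n`, so it cannot equal any row of the list, exactly as in the proof.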
Example
Let $S$ be the collection of all sequences $a_1, a_2, \ldots$ with each entry a positive integer. Then $S$ is uncountable: take the subset of $S$ for which each $a_j$ is 4 or 5 and use the above proof.
2.2 The real numbers R
The key idea about R which we need is the existence of least upper bounds.
Let $E$ be a non-empty subset of $\mathbb{R}$. We say that a real number $M$ is an upper bound for $E$ if $x \le M$ for all $x$ in $E$, and if such an $M$ exists then $E$ is called bounded above. Among all upper bounds for $E$ there is one which is the least, called the sup or l.u.b. of $E$.
We adopt the convention that if $E$ is a subset of $\mathbb{R}$ which is not bounded above, then the sup of $E$ is $+\infty$.
For example $\sup((0, 1) \cap \mathbb{Q}) = 1$ and $\sup \mathbb{N} = \infty$.
The greatest lower bound of $E$ (denoted glb or inf) is defined similarly: it is the greatest real number which is less than or equal to every member of $E$. If $E$ is not bounded below the glb is $-\infty$.
2.3 Real sequences
Let $(x_n)$, $n = 1, 2, \ldots$ be a sequence of real numbers.
We say $\lim_{n \to \infty} x_n = L \in \mathbb{R}$ if to each positive real number $\varepsilon$ corresponds an integer $n_0$ such that $|x_n - L| < \varepsilon$ for all $n \ge n_0$.
We say $\lim_{n \to \infty} x_n = +\infty$ if to each positive real number $M$ corresponds an integer $n_0$ such that $x_n > M$ for all $n \ge n_0$.
We say $\lim_{n \to \infty} x_n = -\infty$ if $\lim_{n \to \infty} (-x_n) = +\infty$. Thus every negative real $M$ has $n_0$ with $x_n < M$ for all $n \ge n_0$.
2.4 The monotone sequence theorem
If the real sequence $(x_n)$ is non-decreasing (i.e. $x_n \le x_{n+1}$ for $n \in \mathbb{N}$) then $x_n$ tends to a limit (finite or $+\infty$).
We recall the proof. Let $s$ be the supremum of the set $A = \{x_n : n \in \mathbb{N}\}$. Suppose first that $A$ is not bounded above, so that $\sup A = +\infty$ by our convention. This means that if we are given some positive number $M$, then no matter how large $M$ might be, we can find some member of the set $A$, say $x_{n_1}$, such that $x_{n_1} > M$. But then, because the sequence is non-decreasing, we have $x_n > M$ for all $n \ge n_1$, and this is precisely what we need in order to be able to say that $\lim_{n \to \infty} x_n = +\infty$.
Now suppose that $A$ is bounded above, and let $s$ be the sup. Suppose we are given some positive $\varepsilon$. Then we need to show that $|x_n - s| < \varepsilon$ for all sufficiently large $n$. But we know that $x_n \le s$ for all $n$, so we just have to show that $x_n > s - \varepsilon$ for all large enough $n$.
This we do as follows. The number $s - \varepsilon$ is less than $s$ and so is not an upper bound for $A$, and so there must be some $n_2$ such that $x_{n_2} > s - \varepsilon$. But then $x_n > s - \varepsilon$ for all $n \ge n_2$, and the proof is complete.
2.5 Lemma
Every real sequence $(x_n)$, $n = 1, 2, \ldots$, has a monotone subsequence.
Proof: For each $n$ consider the set $E_n = \{x_m : m \ge n\}$. We look at two cases.
Suppose first that for every $n$ the set $E_n$ has a maximum element i.e. there is some $m \ge n$ such that $x_p \le x_m$ for all $p \ge n$.
To form our subsequence choose $n_1 \ge 1$ such that $x_{n_1}$ is the maximum element of $E_1$. Then choose $n_2 \ge n_1 + 1$ such that $x_{n_2}$ is the maximum element of $E_{n_1 + 1}$. Since $E_{n_1 + 1} \subseteq E_1$ we get $x_{n_2} \le x_{n_1}$. Now take the maximum element of $E_{n_2 + 1}$: this will be $x_{n_3}$ for some $n_3 > n_2$ and we have $x_{n_3} \le x_{n_2}$. Repeating this we get a non-increasing subsequence $(x_{n_k})$.
Suppose now that some $E_N$ has no maximum element. Let $n_1 = N$ and choose $n_2 > n_1$ such that $x_{n_2} > x_{n_1}$. Since $E_N$ has no maximum element, there is an element greater than all of $x_{n_1}, x_{n_1 + 1}, \ldots, x_{n_2}$, and this is $x_{n_3}$ for some $n_3 > n_2$. Carrying on in this way, we get a strictly increasing subsequence.
2.6 Corollary (Bolzano-Weierstrass theorem)
Every bounded real sequence has a bounded monotone subsequence and hence a convergent
subsequence.
2.7 Nested intervals
Let $I_k = [a_k, b_k]$ be closed intervals in $\mathbb{R}$ such that $I_{k+1} \subseteq I_k$. Then $a_k \le a_{k+1} \le b_{k+1} \le b_1$ and $a_1 \le a_{k+1} \le b_{k+1} \le b_k$, so by the monotone sequence theorem $a_k$ converges, to $A$ say, and $b_k$ converges, to $B$ say. We have $a_k \le A \le B \le b_k$ for all $k$, so $[A, B]$ is contained in all of the $I_k$. Thus the intersection of the $I_k$ is non-empty.
The example $I_n = (0, 1/n)$ shows that open intervals do not in general have this property.
2.8 Open sets
A subset $U$ of $\mathbb{R}$ is called open if the following is true. To each $x$ in $U$ corresponds $\delta_x > 0$ such that $(x - \delta_x, x + \delta_x) \subseteq U$. It is easy to check that if $W_t$ is open for each $t$ in some set $T$ then the set $\bigcup_{t \in T} W_t$, which is the set of all $y$ each belonging to at least one $W_t$, is open. Also the intersection of finitely many open sets is open.
2.9 Open intervals
By an open interval in $\mathbb{R}$ we mean any of the following: $(a, b)$, $(a, +\infty)$, $(-\infty, b)$, $\mathbb{R}$. All are open sets.
A subset $E$ of $\mathbb{R}$ is called closed if $\mathbb{R} \setminus E$ is open. Obviously a closed interval $[a, b]$ (where $-\infty < a \le b < \infty$) is closed, since the complement is $(-\infty, a) \cup (b, \infty)$, a union of two open sets.
Since every (non-empty) open interval contains a rational number and an irrational number, neither $\mathbb{Q}$ nor $\mathbb{R} \setminus \mathbb{Q}$ is open.
2.10 Lemma
Let $x \in \mathbb{R}$. For each $t$ in some non-empty set $T$ let $W_t$ be an open interval containing $x$. Then $V = \bigcup_{t \in T} W_t$ is an open interval containing $x$.
Proof.
Let $A$ be the inf of $V$ and $B$ the sup. We claim first that $A$ and $B$ are not in $V$. If $B$ is in $V$ then $B$ is in some $W_t$. Since $W_t$ is an open subset of $\mathbb{R}$ there is a $b > B$ in $W_t$, and $b$ is in $V$, which is a contradiction. The same argument applies to $A$. Thus $A$ and $B$ are not in $V$ and clearly $V$ is a subset of $(A, B)$.
We claim that $(A, B)$ is a subset of $V$. Let $y \in (A, B)$. Obviously if $y = x$ then $y$ is in $V$. Suppose that $x < y < B$. Then (since $B$ is the sup of $V$) there is some $w$ with $x < y < w$ such that $w$ lies in some $W_s$. Since $x$ and $w$ lie in the open interval $W_s$, so does $y$, and $y$ is in $V$. The same proof works if $A < y < x$.
2.11 Theorem
Let $V$ be a non-empty open subset of $\mathbb{R}$. Then $V$ is the union of countably many pairwise disjoint open intervals.
Proof: let $x \in V$, and let $C_x$ be the union of all open intervals $W$ such that $x \in W \subseteq V$. Then $C_x$ is well-defined (since $V$ is open there is at least one such $W$) and open, and by Lemma 2.10 this $C_x$ is an open interval.
We claim that if $y$ is in $C_x$ then $C_x = C_y$. To see this, note that $C_x$ and $C_y$ are both open intervals containing $y$ and contained in $V$, and so is $C_x \cup C_y$, by Lemma 2.10. Since $x \in C_x \cup C_y$ we get $C_x \cup C_y \subseteq C_x$, as $C_x$ is the union of all open intervals containing $x$ and contained in $V$. Since $y \in C_x \cup C_y$, the same argument gives $C_x \cup C_y \subseteq C_y$.
It follows that if $C_x$ and $C_y$ have non-empty intersection, with $t$ belonging to both, then both equal $C_t$ and so $C_x = C_y$.
Now let $d_n$ be a sequence using up all the rational numbers in $V$, and put $D_n = C_{d_n}$. The set of $D_n$ is countable. Since each $C_x$ contains a rational number, every $C_x$, for $x$ in $V$, is one of the $D_n$. So $V$ is the union of the $D_n$. Now we go along our sequence $D_n$, and delete any $D_n$ which is equal to one which has occurred previously. The $D_n$ which remain are pairwise disjoint open intervals whose union is $V$.
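A finite analogue of Theorem 2.11 can be computed directly. The sketch below assumes the open set is given as a union of finitely many bounded open intervals $(a, b)$ with $a < b$; the function name `components` is a choice made for this illustration. It merges overlapping intervals into the pairwise disjoint open intervals of the theorem.

```python
def components(intervals):
    """Pairwise disjoint open intervals with the same union as the given open intervals."""
    merged = []
    for a, b in sorted(intervals):
        # Open intervals overlap only if a is strictly less than the current right end:
        # (0, 1) and (1, 2) stay separate because the point 1 is in neither.
        if merged and a < merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], b)
        else:
            merged.append([a, b])
    return [tuple(c) for c in merged]

comps = components([(0, 1), (0.5, 2), (2, 3), (5, 6), (5.5, 5.8)])
print(comps)
```

Each interval returned plays the role of one of the sets $C_x$ in the proof: it is the largest open interval inside the union that contains any one of its points.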
2.12 Lemma (Special case of the Heine-Borel theorem)
Let $I = [a, b]$ be a closed interval, and suppose that we have open subsets $V_1, V_2, V_3, \ldots$ of $\mathbb{R}$ such that $I$ is contained in the union of the $V_j$. Then $I$ is contained in the union of finitely many of the $V_j$.
We remark that this is related to the concept of compactness (G13MTS), but we do not need this concept here.
To prove the lemma, suppose the conclusion is false. Then we can find $x_1 \in I \setminus V_1$, $x_2 \in I \setminus (V_1 \cup V_2)$, and, in general, $x_n \in I \setminus (V_1 \cup \ldots \cup V_n)$. Since $a \le x_n \le b$, the sequence $(x_n)$ is bounded, and so has a convergent subsequence $x_{n_k}$, with limit $\alpha \in [a, b]$. But then $\alpha$ is in some $V_N$, and there is some $u > 0$ with $(\alpha - u, \alpha + u) \subseteq V_N$, since $V_N$ is open. Since $x_{n_k} \to \alpha$, we see that for large $k$ we have $x_{n_k} \in (\alpha - u, \alpha + u) \subseteq V_N$. But $n_k \to \infty$ so that for large $k$ we have $n_k > N$ and $x_{n_k} \in V_N \subseteq V_1 \cup \ldots \cup V_{n_k}$. This contradiction proves the lemma.
Again, this is a property of closed intervals $[a, b]$ not shared by other intervals. For example $(0, 1]$ is contained in the union of the open intervals $(1/n, 2)$, $n \in \mathbb{N}$, and $[0, \infty)$ is contained in the union of the open intervals $(-1, n)$, $n \in \mathbb{N}$. In neither case will finitely many of those intervals suffice to cover the set.
3 Continuous functions
3.1 Basic facts
Let $E$ be a subset of $\mathbb{R}$ and let $f : E \to \mathbb{R}$ be a function. We say $f$ is continuous on $E$ if the following is true. To each $x_0$ in $E$ and each real $\varepsilon > 0$ corresponds a real $\delta > 0$ such that $|f(x) - f(x_0)| < \varepsilon$ for all $x$ in $E$ with $|x - x_0| < \delta$.
Fact 1: if $t_n$ is a sequence in $E$ converging to $x_0$ then $f(t_n)$ converges to $f(x_0)$. To see this, if $\varepsilon > 0$ take $\delta > 0$ as above. We have some $n_0$ such that $|t_n - x_0| < \delta$ for all $n \ge n_0$, giving $|f(t_n) - f(x_0)| < \varepsilon$ for all $n \ge n_0$.
Fact 2: if $E$ is a closed interval $[a, b]$ and $f : E \to \mathbb{R}$ is continuous then $f$ has a maximum and minimum on $[a, b]$ and in particular is bounded.
To see this, let $M$ be the supremum of the set $f(E) = \{f(x) : x \in E\}$. Take a strictly increasing sequence $(y_n)$ (thus $y_n < y_{n+1}$) with limit $M$. No $y_n$ is an upper bound for $f(E)$, so we can find $s_n$ in $f(E)$ with $y_n < s_n \le M$. Thus $s_n$ tends to $M$. So there exist $t_n$ in $E$ such that $f(t_n) \to M$. By the Bolzano-Weierstrass theorem we can assume WLOG that the sequence $(t_n)$, being bounded, converges, to $x_0$ say, and $x_0$ is in the interval $[a, b]$, since $a \le t_n \le b$. Thus $f(x_0) = M$ by Fact 1. Hence $M$ is in $f(E)$ (and $M$ is the max of $f(E)$). In particular $M$ is finite.
3.2 Pointwise convergence
Let $E$ be a subset of $\mathbb{R}$ and let $f_n$, $n \in \mathbb{N}$, and $f$ be functions from $E$ to $\mathbb{R}$. We say that $f_n$ converges pointwise to $f$ on $E$ if for each $x$ in $E$,
\[ \lim_{n \to \infty} f_n(x) = f(x). \]
Thus for each $\varepsilon > 0$ and for each $x$ in $E$, there is an integer $N(x)$ such that $|f_n(x) - f(x)| < \varepsilon$ for all $n \ge N(x)$.
If the $f_n$ are continuous, does it follow that $f$ is continuous? The answer is no.
Example
Let $g_n(x)$ be defined on $[0, 1]$ for $n \in \mathbb{N}$ by $g_n(x) = 1 - nx$ for $0 \le x \le 1/n$ and $g_n(x) = 0$ for $x > 1/n$. Then $g_n$ is continuous on $[0, 1]$. Set $g(x)$ to be 1 for $x = 0$ and 0 otherwise. Then $g$ is not continuous, but $g_n \to g$ pointwise on $[0, 1]$.
Notice here also that
\[ \lim_{n \to +\infty} \left( \lim_{x \to 0+} g_n(x) \right) = 1 \ne \lim_{x \to 0+} \left( \lim_{n \to +\infty} g_n(x) \right) = 0. \]
A second example displaying the same phenomenon comes from $h_n(x) = e^{-nx}$ on $[0, 1]$. Then again $h_n \to g$ pointwise on $[0, 1]$.
So we need a stronger condition which will force the limit function to be continuous. The idea is to make $N(x)$ independent of $x$.
3.3 Uniform convergence
If $f_n$ and $f$ are functions from $E$ to $\mathbb{R}$ we say that $f_n$ converges uniformly to $f$ on $E$ if the following is true. To each real $\varepsilon > 0$ corresponds an integer $N$ such that $|f_n(x) - f(x)| < \varepsilon$ for all $n \ge N$ and for all $x \in E$.
The example $g_n$ above does not converge uniformly to $g$ on $[0, 1]$. To see this, take $\varepsilon = 1/4$ and $x = 1/(2n)$. Then $g(x) = 0$ but $g_n(x) = 1/2$. No matter how large we take $N$, we can choose $n \ge N$ with $|g_n(1/(2n)) - g(1/(2n))| > \varepsilon$.
3.4 Theorem
If the real-valued functions $f_n$ are continuous on $E$ and converge uniformly to $f : E \to \mathbb{R}$ then $f$ is continuous on $E$.
Proof: take $x_0$ in $E$ and $\varepsilon > 0$. Take $N$ so large that $|f_N(x) - f(x)| < \varepsilon/3$ for all $x$ in $E$. Take $\delta > 0$ so that $|f_N(x) - f_N(x_0)| < \varepsilon/3$ for all $x$ in $E$ with $|x - x_0| < \delta$. For such $x$ we get
\[ |f(x) - f(x_0)| \le |f(x) - f_N(x)| + |f_N(x) - f_N(x_0)| + |f_N(x_0) - f(x_0)| < 3\varepsilon/3 = \varepsilon. \]
4 Riemann Integration
The Riemann integral will be defined for continuous and some other functions. It is relatively
easy to define and use, and displays the interplay between integration and differentiation well,
but it has certain disadvantages.
4.1 Basic definitions for the Riemann integral
Let $f$ be a bounded real-valued function on the closed interval $[a, b] = I$. Henceforth $a < b$ unless otherwise explicitly stated. Assume that $|f(x)| \le M$ for all $x$ in $I$.
A PARTITION $P$ of $I$ is a finite set $\{x_0, \ldots, x_n\}$ such that $a = x_0 < x_1 < \ldots < x_n = b$. The points $x_j$ are called the vertices of $P$. We say that partition $Q$ of $I$ is a refinement of partition $P$ of $I$ if every vertex of $P$ is a vertex of $Q$ (i.e. $P$ is a subset of $Q$). For $P$ as above, we define
\[ M_k(f) = \sup\{f(x) : x_{k-1} \le x \le x_k\} \le M, \quad m_k(f) = \inf\{f(x) : x_{k-1} \le x \le x_k\} \ge -M. \]
Next, we define the UPPER SUM $U(P, f)$ and LOWER SUM $L(P, f)$ by
\[ U(P, f) = \sum_{k=1}^{n} M_k(f)(x_k - x_{k-1}), \quad L(P, f) = \sum_{k=1}^{n} m_k(f)(x_k - x_{k-1}). \]
Notice that $L(P, f) \le U(P, f)$. The reason we require $f$ to be bounded is so that all the $m_k$ and $M_k$ are finite and the sums exist. Notice also that $-M \le m_k \le M_k \le M$ for each $k$, and so
\[ -M(b - a) \le L(P, f) \le U(P, f) \le M(b - a). \]
Suppose that $f$ is positive on $I$ and that the area $A$ under the curve exists. It is not hard to see that $L(P, f) \le A \le U(P, f)$ for every partition $P$ of $I$.
Further, if you draw for yourself a simple curve, it is not hard to convince yourself that refining $P$ tends to increase $L(P, f)$ and decrease $U(P, f)$. We prove this last statement as a lemma.
4.2 Lemma
Let $f$ be a bounded real-valued function on $I = [a, b]$.
(i) If $P$, $Q$ are partitions of $I$ and $Q$ is a refinement of $P$, then
\[ L(P, f) \le L(Q, f), \quad U(P, f) \ge U(Q, f). \]
(ii) If $P_1$ and $P_2$ are any partitions of $I$, then $L(P_1, f) \le U(P_2, f)$. Thus any lower sum is $\le$ any upper sum.
Proof:
(i) We first prove this for the case where $Q$ is $P$ plus one extra point. The general case then follows by adding points one at a time. So suppose that $Q$ is the same as $P$, except that it has one extra vertex $c$, where $x_{k-1} < c < x_k$. Then $U(Q, f) - U(P, f)$ is
\[
\begin{aligned}
&(\sup\{f(x) : x_{k-1} \le x \le c\})(c - x_{k-1}) + (\sup\{f(x) : c \le x \le x_k\})(x_k - c) \\
&\quad - (\sup\{f(x) : x_{k-1} \le x \le x_k\})(x_k - x_{k-1}).
\end{aligned}
\]
This is using the fact that all other terms cancel. But the first two sups above are less than or equal to the third. Since $c - x_{k-1}$, $x_k - c$, $x_k - x_{k-1}$ are all positive, and $(c - x_{k-1}) + (x_k - c) = x_k - x_{k-1}$, we get $U(Q, f) - U(P, f) \le 0$.
The proof for the lower sums uses the same idea, or can be proved by noting that $L(P, f) = -U(P, -f)$.
(ii) Here we just set $P$ to be the partition obtained by taking all the vertices of $P_1$ and all those of $P_2$. We arrange these vertices in order, and $P$ is a refinement of $P_1$ and of $P_2$. Now we can write
\[ L(P_1, f) \le L(P, f) \le U(P, f) \le U(P_2, f). \]
4.3 Definition of the Riemann integral
Let $f$ be bounded, real-valued on $I = [a, b]$ as before, with $|f(x)| \le M$ there. We define the UPPER INTEGRAL of $f$ from $a$ to $b$ as
\[ \overline{\int_a^b} f(x)\,dx = \inf_P \{U(P, f)\} \]
where the infimum is taken over all partitions $P$ of $I$. This exists and is finite, because all the upper sums are bounded below by $-M(b - a)$. Similarly we define the LOWER INTEGRAL
\[ \underline{\int_a^b} f(x)\,dx = \sup_P \{L(P, f)\} \]
taking the sup over all partitions $P$ of $I$. Again this exists and is finite, because all the lower sums are bounded above by $M(b - a)$.
Now we define $f$ to be Riemann integrable on $I$ if the upper integral equals the lower integral, in which case we denote the common value by $\int_a^b f(x)\,dx$.
Notice that the lower integral is always $\le$ the upper integral, because of Lemma 4.2, part (ii). Also, if $f$ is Riemann integrable and positive on $I$ and the area $A$ under the curve exists, then the fact that $L(P, f) \le A \le U(P, f)$ for every partition $P$ of $I$ implies that the lower integral is $\le A$ and the upper integral is $\ge A$, which means that $A$ equals $\int_a^b f(x)\,dx$. As usual in integration, it does not matter whether you write $f(x)\,dx$ or $f(t)\,dt$ etc.
4.4 Example
Define $f$ on $I = [0, 1]$ by $f(x) = 1$ if $x$ is rational and $f(x) = 0$ otherwise. Let $P = \{x_0, \ldots, x_n\}$ be any partition of $I$. Then clearly $M_k(f) = 1$ for each $k$, since in each sub-interval $[x_{k-1}, x_k]$ there is a rational number. Thus $U(P, f) = \sum_{k=1}^{n} (x_k - x_{k-1}) = 1$ and so the upper integral is 1. Similarly, we have $m_k(f) = 0$ for each $k$, all lower sums are 0, and the lower integral is 0. Thus $f$ is not Riemann integrable on $I$.
Before proving that continuous functions are Riemann integrable, we first deal with the rather easier case of monotone functions (non-decreasing or non-increasing).
4.5 Theorem
Suppose that $f$ is a monotone function on $I = [a, b]$. Then $f$ is Riemann integrable on $I$.
Proof: we only deal with the case where $f$ is non-decreasing (i.e. $f(x) \le f(y)$ for $x \le y$). The non-increasing case is similar. Now if $f(b) = f(a)$ then $f$ is constant on $I$ and so the result follows trivially (all upper and lower sums are the same).
Assume henceforth that $f(b) > f(a)$. Let $\varepsilon > 0$. We choose a partition $P = \{x_0, \ldots, x_n\}$ such that for each $k$ we have $x_k - x_{k-1} < \varepsilon/(f(b) - f(a))$. Now, since $f$ is non-decreasing we have $M_k(f) = f(x_k)$ and $m_k(f) = f(x_{k-1})$. Thus
\[
\begin{aligned}
U(P, f) - L(P, f) &= \sum_{k=1}^{n} (M_k(f) - m_k(f))(x_k - x_{k-1}) = \sum_{k=1}^{n} (f(x_k) - f(x_{k-1}))(x_k - x_{k-1}) \\
&< \sum_{k=1}^{n} (f(x_k) - f(x_{k-1}))\,\varepsilon/(f(b) - f(a)) = (f(x_n) - f(x_0))\,\varepsilon/(f(b) - f(a)) = \varepsilon.
\end{aligned}
\]
Therefore $U(P, f) < L(P, f) + \varepsilon$. So the upper integral of $f$ (which is the inf of the upper sums) is at most $L(P, f) + \varepsilon$. But $L(P, f)$ is at most the lower integral (sup of the lower sums). Thus the upper and lower integrals differ by at most $\varepsilon$ and, since $\varepsilon$ is arbitrary, must be equal.
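For a non-decreasing function the upper and lower sums are easy to compute, because the sup and inf on each sub-interval are attained at its right and left end points, as in the proof of Theorem 4.5. The sketch below uses exact rational arithmetic; the names `sums_nondecreasing`, `uniform` and `square` are assumptions of this illustration, and the example $f(x) = x^2$ on $[0, 1]$ is chosen here, not taken from the lectures.

```python
from fractions import Fraction

def sums_nondecreasing(f, partition):
    """Return (L(P, f), U(P, f)) for a non-decreasing f and vertices x_0 < ... < x_n."""
    pairs = list(zip(partition, partition[1:]))
    lower = sum(f(a) * (b - a) for a, b in pairs)   # m_k(f) = f(x_{k-1})
    upper = sum(f(b) * (b - a) for a, b in pairs)   # M_k(f) = f(x_k)
    return lower, upper

def uniform(n):
    """Partition of [0, 1] into n equal sub-intervals."""
    return [Fraction(k, n) for k in range(n + 1)]

def square(x):
    return x * x

L4, U4 = sums_nondecreasing(square, uniform(4))
L8, U8 = sums_nondecreasing(square, uniform(8))   # uniform(8) refines uniform(4)
print(L4, L8, U8, U4)
```

Here the refinement raises the lower sum and reduces the upper sum, as in Lemma 4.2, and $U - L$ equals $(f(1) - f(0))/n$ for the uniform partition with $n$ sub-intervals, which is the quantity made small in the proof above.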
To handle the case of continuous functions, we need the following.
4.6 Uniform continuity
Let $f$ be a real-valued function on the closed interval $I = [a, b]$. We say that $f$ is uniformly continuous on $I$ if the following is true. To each $\varepsilon > 0$ corresponds a $\delta > 0$ such that $|f(x) - f(y)| < \varepsilon$ for all $x$ and $y$ in $I$ such that $|x - y| < \delta$.
4.7 Theorem
If f is continuous on [a, b] then f is uniformly continuous on [a, b].
Proof: suppose that $\varepsilon > 0$ and that NO positive $\delta$ exists with the property in the statement. Then $1/n$, for $n \in \mathbb{N}$, is not such a $\delta$. Thus there are points $x_n$ and $y_n$ in $I$ with $|x_n - y_n| < 1/n$, but with $|f(x_n) - f(y_n)| \ge \varepsilon$.
Now $(x_n)$ is a sequence in the closed interval $I$, and so is a bounded sequence, and therefore we can find a convergent subsequence $(x_{k_n})$, with limit $B$, say. Since $a \le x_{k_n} \le b$ for each $n$, we have $B \in I$. Now $|x_{k_n} - y_{k_n}| \to 0$ as $n \to \infty$, and so $(y_{k_n})$ also converges to $B$. Since $f$ is continuous on $I$, we have $f(x_{k_n}) \to f(B)$ as $n \to \infty$ and $f(y_{k_n}) \to f(B)$ as $n \to \infty$, which contradicts the fact that $|f(x_{k_n}) - f(y_{k_n})|$ is always $\ge \varepsilon$. This contradiction proves the theorem.
Remark: the name UNIFORM continuity arises because the $\delta$ does not depend on the particular choice of $x$ or $y$. The theorem is NOT true for open intervals, as the example $h(x) = 1/x$, $I = (0, 1)$ shows. To see this, just note that $h(1/n) - h(1/(n - 1)) = 1$ for all integers $n \ge 3$, but $|1/n - 1/(n - 1)| = 1/(n(n - 1))$, which we can make as small as we like.
4.8 Theorem
Let $f$ be continuous, real-valued, on $I = [a, b]$ ($a < b$). Then $f$ is Riemann integrable on $I$.
Proof: let $\varepsilon > 0$ be given. By Theorem 4.7 we can choose a $\delta > 0$ such that for all $x$ and $y$ in $I$ with $|x - y| < \delta$ we have $|f(x) - f(y)| < \varepsilon/(b - a)$. We choose a partition $P = \{x_0, \ldots, x_n\}$ of $I$ such that, for each $k$, we have $x_k - x_{k-1} < \delta$. Now take a sub-interval $J = [x_{k-1}, x_k]$. We know that there exist $c$ and $d$ in $J$ such that for all $x$ in $J$ we have $f(c) \le f(x) \le f(d)$. This means that $M_k(f) = f(d)$ and $m_k(f) = f(c)$. But $|c - d| < \delta$ and so $M_k(f) - m_k(f) = f(d) - f(c) < \varepsilon/(b - a)$. This holds for each $k$. Thus
\[ U(P, f) - L(P, f) = \sum_{k=1}^{n} (M_k(f) - m_k(f))(x_k - x_{k-1}) < (\varepsilon/(b - a)) \sum_{k=1}^{n} (x_k - x_{k-1}) = \varepsilon. \]
The same argument as used for non-decreasing functions now applies.
4.9 Theorem
Let $f$ and $g$ be Riemann integrable functions on $I$. Let $c$, $d$ be real numbers. Then $cf + dg$ is Riemann integrable on $I$ and
\[ \int_a^b (cf(x) + dg(x))\,dx = c \int_a^b f(x)\,dx + d \int_a^b g(x)\,dx. \]
Proof:
First consider $cf$, when $c > 0$. Obviously $L(P, cf) = cL(P, f)$ and $U(P, cf) = cU(P, f)$ and so $\int_a^b cf(x)\,dx = c \int_a^b f(x)\,dx$. It is also easy to see that $U(P, -f) = -L(P, f)$ and $L(P, -f) = -U(P, f)$, so $\int_a^b -f(x)\,dx = -\int_a^b f(x)\,dx$. This leaves only $f + g$ to consider.
Take $\varepsilon > 0$ and partitions $P_1$, $P_2$ of $I$ such that $L(P_1, f)$ and $U(P_2, f)$ are both within $\varepsilon$ of $\int_a^b f(x)\,dx$. Note that we also have $L(P_1, f) \le \int_a^b f(x)\,dx \le U(P_2, f)$. We can assume that $P_1 = P_2$, because otherwise we can replace both by $P_1 \cup P_2$, and refinements can only push the lower and upper sums closer to the integral. Write $P$ for this common partition. Similarly, take $Q$ such that $L(Q, g)$ and $U(Q, g)$ are both within $\varepsilon$ of $\int_a^b g(x)\,dx$.
We can assume that $P = Q$, because otherwise we can replace both by $P \cup Q$. Now
\[ L(P, f) + L(P, g) \le L(P, f + g) \le U(P, f + g) \le U(P, f) + U(P, g). \]
Thus $L(P, f + g)$ and $U(P, f + g)$ both lie within $2\varepsilon$ of $\int_a^b f(x)\,dx + \int_a^b g(x)\,dx$. Therefore so do the upper and lower integrals of $f + g$ and, since $\varepsilon$ is arbitrary, the result follows.
4.10 The fundamental theorem of the calculus
Suppose that $F$, $f$ are real-valued functions on $[a, b]$, that $F$ is continuous and $f$ is Riemann integrable on $[a, b]$, and that $F'(x) = f(x)$ for all $x$ in $(a, b)$. Then
\[ \int_a^b f(x)\,dx = F(b) - F(a). \]
Proof.
The proof is based on the mean value theorem. Let $P = \{x_0, \ldots, x_n\}$ be any partition of $[a, b]$. Then by the mean value theorem there exist points $t_k$ satisfying $x_{k-1} < t_k < x_k$ such that
\[ F(b) - F(a) = \sum_{k=1}^{n} (F(x_k) - F(x_{k-1})) = \sum_{k=1}^{n} f(t_k)(x_k - x_{k-1}). \]
But this means that $L(P, f) \le F(b) - F(a) \le U(P, f)$. Hence the lower integral of $f$ is at most $F(b) - F(a)$, and the upper integral of $f$ is at least $F(b) - F(a)$. But the lower and upper integrals of $f$ are, by assumption, the same.
4.11 Examples
(i) The limit of a sequence of Riemann integrable functions need not be Riemann integrable. Let $E$ be the countable set $\mathbb{Q} \cap [0, 1]$, and let $(r_n)$ be a sequence in $E$, in which each element of $E$ appears exactly once. Define $g_n$ as follows. Set $g_N(x) = 1$ if $x$ is one of the $N$ points $r_1, \ldots, r_N$, and $g_N(x) = 0$ otherwise. Obviously all lower sums for $g_N$ are 0. Take the partition $P = \{x_0, \ldots, x_n\}$ with $x_k = k/n$, $0 \le k \le n$, and $n > 2N$ an integer. Then at most $2N$ intervals $[x_{k-1}, x_k]$ contain a point where $g_N(x) \ne 0$, so $U(P, g_N) \le 2N/n$. Since $n$ can be taken arbitrarily large, the upper integral is 0, and so
\[ \int_0^1 g_N(x)\,dx = 0. \]
But as $N \to \infty$ we see that $g_N$ converges pointwise to the function of Example 4.4, which is not Riemann integrable.
(ii) Define $h_n$ on $[0, 1]$, for integer $n > 2$, by $h_n(x) = n^2 x$ if $0 \le x \le 1/n$ and $h_n(x) = n^2 (2/n - x)$ if $1/n \le x \le 2/n$ and $h_n(x) = 0$ if $x \ge 2/n$. Then $h_n$ is continuous on $[0, 1]$ with Riemann integral 1. But $h_n \to 0$ pointwise on $[0, 1]$, and 0 has Riemann integral 0.
5 Series
5.1 The extended real numbers
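We define $\mathbb{R}^* = \mathbb{R} \cup \{-\infty, \infty\}$. Later we will define some products involving $\infty$, but for now we just define
\[ \infty + \infty = \infty, \quad x + \infty = \infty \]
for all $x \in \mathbb{R}$. Note that $\infty + (-\infty)$ is not defined.
We can extend $<$ to $\mathbb{R}^*$ in the obvious way by saying that $-\infty < x < \infty$ for every $x$ in $\mathbb{R}$.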
Note that if we say a subset $A$ of $\mathbb{R}$ is bounded above this will continue to mean that there is some $M \in \mathbb{R}$ such that $x \le M$ for all $x \in A$: of course all $x$ are also $\le \infty$.
If $A$ is any non-empty subset of $\mathbb{R}^*$, then $A$ has a least upper bound (sup) and an inf. Note that $\sup A$ is $\infty$ if $A$ has no upper bound in $\mathbb{R}$. In particular this is true if $\infty \in A$.
We can define limits of sequences in $\mathbb{R}^*$ exactly as in $\mathbb{R}$. In particular $x_n \to L \in \mathbb{R}$ means that given positive real $\varepsilon$ we have $|x_n - L| < \varepsilon$ (and so $x_n \in \mathbb{R}$) for all $n \ge n_0(\varepsilon)$.
The monotone sequence theorem remains true: if $x_n$ is a non-decreasing sequence in $\mathbb{R}^*$ then $x_n$ tends to $\sup\{x_n\}$.
5.2 Series
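We consider here only series with terms $a_k$ in $[0, \infty]$. Given such $a_k$ for $k \ge p \in \mathbb{N}$, the partial sums
\[ s_n = \sum_{k=p}^{n} a_k \]
form a non-decreasing sequence in $\mathbb{R}^*$, and we set
\[ S = \sum_{k=p}^{\infty} a_k = \lim_{n \to \infty} s_n. \]
This will be $\infty$ if any $a_k$ is $\infty$, but the sum can also be $\infty$ when every term is finite: for example $\sum_{k=1}^{\infty} 1/k = \infty$.
Note that $s_n \le S$ for each $n$.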
5.3 Re-arrangements
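Suppose that $a_k \in [0, \infty]$ for each $k \in \mathbb{N}$. Suppose that $\varphi : \mathbb{N} \to \mathbb{N}$ is a bijection, and set $b_k = a_{\varphi(k)}$ for each $k$. Then
\[ S_1 = \sum_{k=1}^{\infty} b_k = a_{\varphi(1)} + a_{\varphi(2)} + \ldots \]
is called a RE-ARRANGEMENT of
\[ S_2 = \sum_{k=1}^{\infty} a_k, \]
and the sums $S_1$, $S_2$ are equal.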
To see this, just note that if $n \in \mathbb{N}$, then we can find $m$ such that $\varphi(j) \le m$ for $1 \le j \le n$, and so
\[ \sum_{k=1}^{n} b_k = \sum_{k=1}^{n} a_{\varphi(k)} \le \sum_{p=1}^{m} a_p \le S_2. \]
Letting $n \to \infty$ we get $S_1 \le S_2$. Now just reverse the roles of the two series.
Surprisingly, this fails if we allow negative terms. The alternating series
\[ S = \ln 2 = 1/1 - 1/2 + 1/3 - 1/4 + \ldots \]
re-arranges (one odd, followed by two evens) to
\[ 1/1 - 1/2 - 1/4 + 1/3 - 1/6 - 1/8 + 1/5 - 1/10 - 1/12 + \ldots \]
which has sum $S/2$.
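The re-arranged alternating series can be checked numerically. The sketch below is a floating-point illustration with function names chosen here; it sums the re-arranged series in blocks of three terms (one odd denominator followed by two even ones) and compares the result with $(\ln 2)/2$.

```python
import math

def rearranged_partial_sum(blocks):
    """Sum of the first `blocks` groups 1/(2m-1) - 1/(4m-2) - 1/(4m), m = 1, 2, ..."""
    return sum(1 / (2 * m - 1) - 1 / (4 * m - 2) - 1 / (4 * m)
               for m in range(1, blocks + 1))

def alternating_partial_sum(terms):
    """Partial sum of 1/1 - 1/2 + 1/3 - 1/4 + ... in its original order."""
    return sum((-1) ** (k + 1) / k for k in range(1, terms + 1))

original = alternating_partial_sum(200_000)
rearranged = rearranged_partial_sum(100_000)
print(original, math.log(2))
print(rearranged, math.log(2) / 2)
```

The reason is visible in each block: $1/(2m-1) - 1/(4m-2) - 1/(4m) = \tfrac{1}{2}\left(1/(2m-1) - 1/(2m)\right)$, which is half of two consecutive terms of the original series.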
5.4 Double series
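Suppose that $m$ and $p$ are (finite) integers and $m \le n \le \infty$ and $p \le q \le \infty$ and $a_{j,k} \in [0, \infty]$ for all integers $j$, $k$ with $m \le j \le n$, $p \le k \le q$. N.B. $j$, $k$ do not take the value $\infty$. Then
\[ \sum_{j=m}^{n} \left( \sum_{k=p}^{q} a_{j,k} \right) = S_1 \quad \text{and} \quad \sum_{k=p}^{q} \left( \sum_{j=m}^{n} a_{j,k} \right) = S_2 \]
are equal.
Proof: this is clearly true if $n$ and $q$ are both finite. Also if $N \le n$ is finite and $q = \infty$,
\[
\begin{aligned}
\sum_{j=m}^{N} \left( \sum_{k=p}^{\infty} a_{j,k} \right) &= \sum_{j=m}^{N} \left( \lim_{M \to \infty} \sum_{k=p}^{M} a_{j,k} \right) = \lim_{M \to \infty} \sum_{j=m}^{N} \left( \sum_{k=p}^{M} a_{j,k} \right) \\
&= \lim_{M \to \infty} \sum_{k=p}^{M} \left( \sum_{j=m}^{N} a_{j,k} \right) = \sum_{k=p}^{\infty} \left( \sum_{j=m}^{N} a_{j,k} \right).
\end{aligned}
\]
Setting $N = n$, this proves the result when $n$ is finite. Now if $n$ and $q$ are both $\infty$, take any finite $N \ge m$ to get
\[ \sum_{j=m}^{N} \left( \sum_{k=p}^{\infty} a_{j,k} \right) = \sum_{k=p}^{\infty} \left( \sum_{j=m}^{N} a_{j,k} \right) \le \sum_{k=p}^{\infty} \left( \sum_{j=m}^{\infty} a_{j,k} \right). \]
Letting $N \to \infty$ we get $S_1 \le S_2$, and the reverse inequality is proved the same way.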
Note that if we take any sum of finitely many $a_{j,k}$ this will be at most the sum of finitely
many rows, and so at most the double sum.
Again, examples show that this property of independence of order of summation fails if we
allow terms of mixed sign.
5.5 Generalized series
If $T$ is a countably infinite set, and $a_t \in [0, \infty]$ for every $t$ in $T$, we can define $\sum_{t \in T} a_t$ as follows. Let $t_n$ be any sequence using up $T$, with every $t$ in $T$ appearing exactly once, set $b_n = a_{t_n}$ and put
\[ \sum_{t \in T} a_t = \sum_{n=1}^{\infty} b_n. \]
It does not matter which particular sequence we take, because if we use a different sequence the series we get will be a re-arrangement of $\sum b_n$ and so have the same sum.
For example,
\[ \sum_{t \in \mathbb{Z}} 2^t = 2^0 + 2^1 + 2^{-1} + 2^2 + \ldots = \infty. \]
6 Measures
We rst need:
6.1 $\sigma$-algebras
Let $X$ be a set, and let $\Pi$ be a collection of subsets of $X$. Then $\Pi$ is called a $\sigma$-algebra if $\Pi$ is non-empty and the following two conditions are satisfied:
(i) for every $A$ in $\Pi$, the complement $X \setminus A$ is in $\Pi$;
(ii) if we have countably many $A_j$, say $A_1, A_2, \ldots$, all in $\Pi$, then their union $\bigcup_j A_j$ is in $\Pi$.
In particular, the union of finitely many elements of $\Pi$ is an element of $\Pi$.
By taking $A \cup (X \setminus A)$, we see that $X$ is always in $\Pi$, and so is the empty set.
It follows from (i) and (ii) that the intersection of countably many elements of $\Pi$ is an element of $\Pi$ (see problem sheet).
The simplest example of a $\sigma$-algebra is the power set $P(X)$, the collection of all subsets of $X$.
Note that some books omit the requirement that $\Pi$ be non-empty: however, an empty collection of subsets of $X$ is not very interesting. Also it is easy to check (optional) that this definition is equivalent to what Dr. Feinstein calls a $\sigma$-field in his G1CMIN lecture notes and exam papers.
17
6.2 Lemma
Let $X$ be a set. If for every $t$ in some set $T$, the collection $\Pi_t$ is a $\sigma$-algebra of subsets of $X$, then the intersection $\bigcap_{t \in T} \Pi_t$, which is the collection of all subsets of $X$ each belonging to all of the $\Pi_t$, is itself a $\sigma$-algebra of subsets of $X$.
The proof is trivial: since $X \in \Pi_t$ for every $t$, the intersection is non-empty, and conditions (i) and (ii) are obviously satisfied.
It follows that every non-empty collection $H$ of subsets of $X$ is a sub-collection of a minimal $\sigma$-algebra of subsets of $X$. To obtain it, take the intersection of all $\sigma$-algebras of subsets of $X$ each of which contains $H$. This is not a vacuous definition, as there is always at least one, namely $P(X)$. This is called the $\sigma$-algebra generated by $H$.
The $\sigma$-algebra generated by the open subsets of $\mathbb{R}$ is called the $\sigma$-algebra of Borel sets. We shall see that not every subset of $\mathbb{R}$ is a Borel set.
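On a finite set the generated $\sigma$-algebra can be computed directly, which may help to make the definition concrete. The sketch below does not form the intersection of all $\sigma$-algebras containing $H$; instead it closes $H$ under complements and pairwise unions until nothing new appears, which gives the same collection when $X$ is finite. The function name and the sample sets are hypothetical.

```python
from itertools import combinations


def generated_sigma_algebra(X, H):
    """Smallest collection containing H that is closed under complement and union.

    On a finite set X, closure under pairwise unions already gives closure
    under countable unions, so the result is the sigma-algebra generated by H.
    H is assumed to be non-empty.
    """
    X = frozenset(X)
    family = {frozenset(A) for A in H}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for A in current:
            comp = X - A
            if comp not in family:
                family.add(comp)
                changed = True
        for A, B in combinations(current, 2):
            union = A | B
            if union not in family:
                family.add(union)
                changed = True
    return family


X = {1, 2, 3, 4}
H = [{1}, {1, 2}]
sigma = generated_sigma_algebra(X, H)
print(sorted(sorted(A) for A in sigma))
```

Here the sets $\{1\}$, $\{2\}$ and $\{3, 4\}$ cannot be split further by $H$, so the generated $\sigma$-algebra consists of the $2^3 = 8$ unions of these three sets; in particular $\{3\}$ does not belong to it.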
6.3 Measures
Let $X$ be a non-empty set, and let $\Pi$ be a (N.B. non-empty) $\sigma$-algebra of subsets of $X$. By a measure on $\Pi$ we mean a function $\mu : \Pi \to [0, +\infty]$ which satisfies the conditions $\mu(\emptyset) = 0$ and
$$\mu(E) = \sum_j \mu(E_j)$$
whenever we have a countable family of pairwise disjoint sets $E_j$ (all in $\Pi$) whose union is $E$.
The elements of $\Pi$ will be called $\mu$-measurable (or just measurable) sets, and we will often talk about $\mu$ as a measure on $X$ (with the existence of $\Pi$ taken for granted).
6.4 Examples
Let $X$ be a set and let $\mu(U)$ be the number of elements of each subset $U$ of $X$. Then $\mu$ is a measure on $P(X)$, called the counting measure.
If we have a measure $\mu$ on a set $X$ and $\mu(X) = 1$, then $\mu$ is called a probability measure. Measurable sets correspond to events. The countable additivity corresponds to the addition of the probabilities of pairwise mutually disjoint events. In particular, $\mu(X \setminus U) = 1 - \mu(U)$.
Let $X$ be a set, let $x \in X$, and let $\mu(U)$ be 1 if $x \in U$, and 0 otherwise. Then $\mu$ is a measure on $P(X)$ (point mass at $x$).
Let $X$ be an uncountable set and, for $U \subseteq X$, let $\mu(U)$ be 0 if $U$ is countable and $\infty$ if $U$ is uncountable. Then $\mu$ is again a measure on $P(X)$, since a countable union of countable sets is countable.
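The first three examples can be tried out on a small finite set. The sketch below, with a hypothetical six-element $X$ and hypothetical function names, checks $\mu(\emptyset) = 0$ and additivity over one particular disjoint decomposition; it is a spot check and not a proof.

```python
from fractions import Fraction

X = frozenset(range(1, 7))  # a six-element set, e.g. the faces of a die
pieces = [frozenset({1, 2}), frozenset({3}), frozenset({4, 5, 6})]  # pairwise disjoint
E = frozenset().union(*pieces)


def counting(U):
    """Counting measure: number of elements of U."""
    return len(U)


def point_mass_at_3(U):
    """Point mass at x = 3."""
    return 1 if 3 in U else 0


def uniform(U):
    """Counting measure normalised so that the measure of X is 1 (a probability measure)."""
    return Fraction(len(U), len(X))


for mu in (counting, point_mass_at_3, uniform):
    assert mu(frozenset()) == 0
    assert mu(E) == sum(mu(P) for P in pieces)

U = frozenset({2, 4, 6})
complement_rule_holds = uniform(X - U) == 1 - uniform(U)
print(complement_rule_holds)
```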
6.5 Some properties of measures
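(i) If $A \subseteq B$ then $\mu(A) \le \mu(B)$.
To see this, write $C = B \setminus A = B \cap A^c$. Then $B = A \cup C$ and $A$, $C$ are disjoint, and $\mu(B) = \mu(A) + \mu(C) \ge \mu(A)$.
(ii) $\mu(A \cup B) = \mu(A) + \mu(B \setminus A) \le \mu(A) + \mu(B)$ (using (i)).
(iii) $\mu\left(\bigcup_{j=1}^{n} G_j\right) \le \sum_{j=1}^{n} \mu(G_j)$. (By induction, using (ii)).
(iv) If $B_j \subseteq B_{j+1}$ and $B$ is the union of the (countably many) $B_j$ then $\mu(B_j) \to \mu(B)$ as $j \to \infty$.
To prove (iv), write $E_1 = B_1$ and $E_{j+1} = B_{j+1} \setminus B_j$. Then $B_j$ is the union of $E_1, \ldots, E_j$ (by induction on $j$), and $B$ is the union of all the $E_j$, and the $E_j$ are disjoint. (If $m < n$ we have $E_m \subseteq B_m \subseteq B_{n-1}$ and $E_n = B_n \setminus B_{n-1}$ and so $E_m \cap E_n = \emptyset$. Also if $m$ is the smallest $j$ for which $x \in B_j$ then $x \in E_m$). We now get
$$\mu(B_j) = \sum_{m=1}^{j} \mu(E_m) \to \sum_{m=1}^{\infty} \mu(E_m) = \mu(B).$$
(v) It follows from (iv) that if the $F_j$ are all $\mu$-measurable then
$$\mu\left(\bigcup_{j=1}^{\infty} F_j\right) = \lim_{n \to \infty} \mu\left(\bigcup_{j=1}^{n} F_j\right).$$
(vi) We then have, using (iii) and (v), subadditivity:
$$\mu\left(\bigcup_{j=1}^{\infty} G_j\right) = \lim_{n \to \infty} \mu\left(\bigcup_{j=1}^{n} G_j\right) \le \lim_{n \to \infty} \sum_{j=1}^{n} \mu(G_j) = \sum_j \mu(G_j).$$
(vii) Note that some books omit the condition $\mu(\emptyset) = 0$. However $\mu(\emptyset) = \mu(\emptyset) + \mu(\emptyset)$ and so the only possible values for $\mu(\emptyset)$ are 0 and $\infty$. If $\mu(\emptyset) = \infty$ then $\mu(A) = \infty$ for every $A \in \Pi$.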
7 Lebesgue outer measure
We are going to construct the Lebesgue measure on a $\sigma$-algebra of subsets of $\mathbb{R}$, which may be thought of as a generalization of the idea of the length of an interval.
Now the idea of length makes sense for an interval, so that the length of $(a, b)$ is $b - a$ (and is $\infty$ if $b = \infty$ or $a = -\infty$). However, for other sets such as $\mathbb{Q}$ the idea of length isn't defined in any obvious way. So we start by constructing the Lebesgue outer measure, which is defined for every subset $A$ of $\mathbb{R}$. The idea is to minimize the total length of open intervals which together cover $A$. We will then prove some properties, including the fact that for intervals the outer measure equals the length.
7.1 Definition
Let $A \subseteq \mathbb{R}$. It is always possible to choose countably many open intervals which together cover $A$, i.e. whose union contains $A$: in fact we can do this with one interval $(-\infty, \infty)$.
So we define the outer measure $\lambda^*(A)$ as follows. Consider all countable collections $C$ of open intervals $(a_k, b_k)$ such that $A \subseteq \bigcup_k (a_k, b_k)$: these intervals have total length $L(C) = \sum_k (b_k - a_k)$.
We do this for all such countable collections $C$ of open intervals covering $A$, and take the inf of $L(C)$ over all possible $C$. In particular, since $A \subseteq (-\infty, +\infty)$, this is always defined. Formally,
$$\lambda^*(A) = \inf \left\{ \sum_k (b_k - a_k) : A \subseteq \bigcup_k (a_k, b_k) \right\},$$
in which we consider only coverings of $A$ by countably many open intervals.
Another way to express this is that $\lambda^*(A)$ is the infimum of those positive $t$ such that there exist countably many open intervals $(a_k, b_k)$ of total length $\sum_k (b_k - a_k) = t$ with $A \subseteq \bigcup_k (a_k, b_k)$.
Obviously, if $A \subseteq B \subseteq \mathbb{R}$ then any collection of open intervals covering $B$ also covers $A$, and so $\lambda^*(A) \le \lambda^*(B)$.
Also $\{x\} \subseteq (x - 1/n, x + 1/n)$ for $n \in \mathbb{N}$, so $\lambda^*(\{x\}) \le 2/n$ and so $\lambda^*(\{x\}) = 0$. Thus we have $\lambda^*(\emptyset) = 0$.
7.2 Theorem
$\lambda^*$ is countably sub-additive, which means the following. If we have countably many subsets $E_1, E_2, \ldots$ of $\mathbb{R}$, and $E = \bigcup_n E_n$ is their union, then
$$\lambda^*(E) \le \sum_n \lambda^*(E_n).$$
Proof. This is obvious if any $\lambda^*(E_n)$ is infinite. Now assume that all $\lambda^*(E_n)$ are finite. Let $\varepsilon > 0$. For each $n$, we have $\lambda^*(E_n) + \varepsilon 2^{-n} > \lambda^*(E_n)$ and so we can choose a countable family of open intervals $I_{j,n}$ of length $L_{j,n}$, such that
$$E_n \subseteq \bigcup_j I_{j,n}, \qquad \sum_j L_{j,n} < \lambda^*(E_n) + \varepsilon 2^{-n}.$$
Then $E$ is contained in the union of all the $I_{j,n}$. There are countably many of these intervals $I_{j,n}$ and together they cover $E$.
Consider the sum of the lengths of all the $I_{j,n}$. A partial sum $s$ for this series is the sum of finitely many $L_{j,n}$. So if $N$ is large enough we get
$$s \le \sum_{n=1}^{N} \sum_j L_{j,n} \le \sum_n \left( \lambda^*(E_n) + \varepsilon 2^{-n} \right) \le \sum_n \lambda^*(E_n) + \sum_{n=1}^{\infty} \varepsilon 2^{-n} = \varepsilon + \sum_n \lambda^*(E_n).$$
Since we're taking an arbitrary partial sum, the sum of all the $L_{j,n}$ is at most $\varepsilon + \sum_n \lambda^*(E_n)$.
Since $\varepsilon$ is arbitrary we get $\lambda^*(E) \le \sum_n \lambda^*(E_n)$.
7.3 Examples
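We have $\lambda^*(\mathbb{Q}) = 0$, while $\lambda^*([0, 1] \setminus \mathbb{Q}) = 1$. The first holds because $\mathbb{Q}$ is a countable union of one-point sets, each of outer measure 0, so that Theorem 7.2 applies. The second then follows from Theorem 7.5 below, since $1 = \lambda^*([0, 1]) \le \lambda^*([0, 1] \cap \mathbb{Q}) + \lambda^*([0, 1] \setminus \mathbb{Q}) = \lambda^*([0, 1] \setminus \mathbb{Q}) \le \lambda^*([0, 1]) = 1$.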
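The fact that a countable set has outer measure 0 can also be seen by covering its $n$-th point with an interval of length $\varepsilon 2^{-n}$. The sketch below does this for the first 200 terms of one particular enumeration of the rationals in $[0, 1]$; the enumeration, the value of $\varepsilon$ and the function names are hypothetical choices, and of course only finitely many intervals are actually built.

```python
from fractions import Fraction


def rationals_in_unit_interval():
    """Enumerate the rationals in [0, 1] without repetition: 0, 1, 1/2, 1/3, 2/3, ..."""
    yield Fraction(0)
    yield Fraction(1)
    q = 2
    while True:
        for p in range(1, q):
            r = Fraction(p, q)
            if r.denominator == q:  # skip fractions already listed in lower terms
                yield r
        q += 1


def cover(eps, count):
    """Open intervals of length eps / 2**n centred on the n-th rational, n = 1, ..., count."""
    intervals = []
    gen = rationals_in_unit_interval()
    for n in range(1, count + 1):
        r = next(gen)
        half = eps / 2 ** (n + 1)
        intervals.append((r - half, r + half))
    return intervals


eps = Fraction(1, 100)
intervals = cover(eps, 200)
total_length = sum(b - a for a, b in intervals)
print(float(total_length))
```

However many points are covered in this way, the total length stays below $\varepsilon$, because the lengths form part of the geometric series $\sum_n \varepsilon 2^{-n} = \varepsilon$.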
7.4 Lemma
Let $A \subseteq \mathbb{R}$, let $x \in \mathbb{R}$, and let $A + x = \{y + x : y \in A\}$. Then $\lambda^*(A) = \lambda^*(A + x)$.
To prove this, let $s > \lambda^*(A)$. Then $s$ is not a lower bound for the set of $t \in (0, \infty]$ such that $A$ can be covered by the union of countably many open intervals of total length $t$. So there exists $t$ with $t < s$ such that $A$ is contained in the union of the $(a_k, b_k)$, where $\sum_k (b_k - a_k) = t$.
But then $A + x$ is contained in the union of the countably many intervals $(a_k + x, b_k + x)$, and these have total length $t$. Hence $\lambda^*(A + x) \le t < s$ and, since $s > \lambda^*(A)$ was arbitrary, $\lambda^*(A + x) \le \lambda^*(A)$. The reverse inequality holds, because $A = (A + x) + (-x)$.
7.5 Theorem
If $A$ is an interval then $\lambda^*(A)$ is the length of $A$.
Proof. Suppose that the interval $A$ has finite end-points $a, b$. Then for $n \in \mathbb{N}$ we have $A \subseteq (a - 1/n, b + 1/n)$ so $\lambda^*(A) \le b - a + 2/n$. Since $n$ is arbitrary we get $\lambda^*(A) \le b - a$.
Thus for a finite interval $A$, we have $\lambda^*(A)$ not more than the length of $A$, and this is obviously also true for an interval of infinite length. So we need to show that $\lambda^*(A)$ is at least the length of $A$.
To prove this, we assume first that $A = [a, b]$, with $a, b$ finite, $a < b$. Suppose that we have a countable family of open intervals $(a_k, b_k)$ ($k \in T \subseteq \mathbb{N}$) which together cover $A$, and $\sum_k (b_k - a_k) < b - a$. By Lemma 2.12 we can assume that there are only finitely many of these intervals. Thus $A$ is contained in the union of $N$ open intervals $(a_k, b_k)$. Let $n$ be large and partition $A$ into $n$ closed intervals $I_1, \ldots, I_n$ of equal length $s = (b - a)/n$, with vertices in the set $\{x_p = a + ps : p \in \mathbb{Z}\}$. For each $k$, the total length of those $I_j$ which meet $(a_k, b_k)$ is at most $b_k - a_k + 2s$ (it is at most the difference between the least $x_p$ which is $\ge b_k$ and the greatest $x_q$ which is $\le a_k$).
Since every $I_j$ is contained in $[a, b]$ and so meets at least one $(a_k, b_k)$, we see that the total length of the $I_j$ is
$$b - a = ns \le \sum_k \left( (b_k - a_k) + 2s \right) \le 2Ns + \sum_k (b_k - a_k).$$
Since $n$ can be chosen arbitrarily large, with $s$ consequently arbitrarily small, we get
$$b - a \le \sum_k (b_k - a_k),$$
which contradicts our assumption.
So $\lambda^*(A)$ is at least the length of $A$ when $A$ is a closed interval with finite end-points. The general case follows, because if $A$ is an interval and $t$ is positive but less than the length of $A$, we can choose a closed interval $B$ contained in $A$, of length $t$, to get $\lambda^*(A) \ge \lambda^*(B) \ge t$.
This idea and proof are easily generalized to higher dimensions. In $\mathbb{R}^2$ we take the infimum of the sum $\sum_j (\text{area of } B_j)$, over all coverings of $A$ by open rectangles $B_j$ (open means no boundary points included). Consider e.g. the straight line $A$ from $(0, 0)$ to $(1, 1)$. The part with $j/n \le x, y \le (j + 1)/n$ lies in the open rectangle $(j - 1)/n < x, y < (j + 2)/n$, which has area $9n^{-2}$. So $\lambda^*(A) \le 9n^{-1}$ and, since $n$ can be arbitrarily large, $A$ has two-dimensional outer measure 0.
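The arithmetic in this example is easy to check. The following sketch builds the $n$ open squares described above, adds up their areas exactly, and spot-checks (at finitely many sample points only) that they cover the diagonal; the function names and the sample sizes are hypothetical.

```python
from fractions import Fraction


def diagonal_cover(n):
    """Open squares (j-1)/n < x, y < (j+2)/n for j = 0, ..., n-1, stored as (lower, upper)."""
    return [(Fraction(j - 1, n), Fraction(j + 2, n)) for j in range(n)]


def total_area(squares):
    """Sum of the areas of the squares, each of side upper - lower."""
    return sum((hi - lo) ** 2 for lo, hi in squares)


def covers_diagonal(squares, samples=200):
    # Spot check only: test finitely many points (t, t) with 0 <= t <= 1.
    points = [Fraction(i, samples) for i in range(samples + 1)]
    return all(any(lo < t < hi for lo, hi in squares) for t in points)


areas = {n: total_area(diagonal_cover(n)) for n in (1, 10, 100)}
print({n: float(area) for n, area in areas.items()})
```

Each square has side $3/n$, so the $n$ squares have total area $9/n$, which can be made as small as we please.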
8 Lebesgue measure
The outer measure $\lambda^*$ of the last section turns out not to be a measure on the whole power set $P(\mathbb{R})$. However, we can find a $\sigma$-algebra of subsets of $\mathbb{R}$ on which it is a measure. The idea is the following. If $\lambda^*$ were a measure on the power set of $\mathbb{R}$ then we'd have
$$\lambda^*(A) = \lambda^*(A \cap E) + \lambda^*(A \setminus E)$$
for every $A, E$, by disjointness. So we consider those $E$ for which this is true for every $A$.
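8.1 Definition
A subset $E$ of $\mathbb{R}$ is said to be Lebesgue measurable if we have
$$\lambda^*(A) = \lambda^*(A \cap E) + \lambda^*(A \setminus E)$$
for every subset $A$ of $\mathbb{R}$. Note that
$$\lambda^*(A) \le \lambda^*(A \cap E) + \lambda^*(A \setminus E)$$
always holds, by sub-additivity, so that to show that some set $E$ is Lebesgue measurable it suffices to prove that
$$\lambda^*(A) \ge \lambda^*(A \cap E) + \lambda^*(A \setminus E)$$
for every $A$.
We will sometimes write $E^c$ for $\mathbb{R} \setminus E$.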
8.2 Some basic facts
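(i) $\mathbb{R}$ and the empty set are Lebesgue measurable.
(ii) Any set $E$ with $\lambda^*(E) = 0$ is Lebesgue measurable.
(iii) If $E$ is Lebesgue measurable, so is $\mathbb{R} \setminus E$.
(iv) If $E$ is Lebesgue measurable, so is $x + E$ for every $x \in \mathbb{R}$.
Property (iii) is obvious. To prove (ii), assume $\lambda^*(E) = 0$. Then $\lambda^*(A \cap E)$ will be 0 for all $A$, which gives
$$\lambda^*(A \cap E) + \lambda^*(A \setminus E) = \lambda^*(A \setminus E) \le \lambda^*(A)$$
as required. Property (i) now follows from (ii) and (iii).
Now we prove (iv). Let $E \subseteq \mathbb{R}$ be Lebesgue measurable, and note that for $A \subseteq \mathbb{R}$ we have
$$A \cap (E + x) = \{y : y \in A, \ y - x \in E\} = \{z + x : z + x \in A, \ z \in E\} = x + \{z : z \in A - x, \ z \in E\} = x + ((A - x) \cap E)$$
and hence $\lambda^*(A \cap (E + x)) = \lambda^*((A - x) \cap E)$, using Lemma 7.4.
Next, note that $(E + x)^c = E^c + x$. This gives
$$A \setminus (E + x) = A \cap (E + x)^c = A \cap (E^c + x) = x + ((A - x) \cap E^c)$$
which gives $\lambda^*(A \setminus (E + x)) = \lambda^*((A - x) \cap E^c)$.
Since $E$ is Lebesgue measurable we now get
$$\lambda^*(A) = \lambda^*(A - x) = \lambda^*((A - x) \cap E) + \lambda^*((A - x) \cap E^c) = \lambda^*(A \cap (E + x)) + \lambda^*(A \setminus (E + x)).$$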
8.3 Theorem
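The union or intersection of finitely many Lebesgue measurable sets is Lebesgue measurable.
We first prove that if $E$ and $F$ are Lebesgue measurable subsets of $\mathbb{R}$, then so is $G = E \cup F$. Write $E^c$ for $\mathbb{R} \setminus E$. We have
$$\lambda^*(A) = \lambda^*(A \cap E) + \lambda^*(A \cap E^c) = \lambda^*(A \cap E \cap F) + \lambda^*(A \cap E \cap F^c) + \lambda^*(A \cap E^c \cap F) + \lambda^*(A \cap E^c \cap F^c)$$
$$= \lambda^*(A \cap E \cap F) + \lambda^*(A \cap E \cap F^c) + \lambda^*(A \cap E^c \cap F) + \lambda^*(A \cap G^c) \ge \lambda^*(A \cap G) + \lambda^*(A \cap G^c),$$
since $\lambda^*$ is sub-additive and
$$G = E \cup F = (E \cap F) \cup (E \cap F^c) \cup (E^c \cap F).$$
Now, if we are given $n \ge 3$ and Lebesgue measurable sets $F_1, \ldots, F_n$, we write
$$\bigcup_{j=1}^{n} F_j = F_n \cup \bigcup_{j=1}^{n-1} F_j$$
and so the assertion about unions of finitely many sets follows by induction. Intersections work as well, because
$$\bigcap_{j=1}^{n} F_j \quad \text{is the complement of} \quad \bigcup_{j=1}^{n} F_j^c$$
and each $F_j^c$ is Lebesgue measurable.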
8.4 Theorem
(i) Suppose that $F_1, F_2, \ldots$ are countably many pairwise disjoint Lebesgue measurable sets with union $F$. Then $F$ is Lebesgue measurable, and
$$\lambda^*(F) = \sum_j \lambda^*(F_j).$$
(ii) The union of countably many Lebesgue measurable sets is Lebesgue measurable.
Proof. (i) We may assume that there are infinitely many $F_j$, by making them all empty from some $j$ on, if necessary. Let $G_n = \bigcup_{j=1}^{n} F_j$. The $G_n$ are Lebesgue measurable.
We claim that for every $n \in \mathbb{N}$, and for every $A \subseteq \mathbb{R}$,
$$\lambda^*(A \cap G_n) = \sum_{j=1}^{n} \lambda^*(A \cap F_j).$$
This is clearly true for $n = 1$. Assuming it true for $n$ and using the Lebesgue measurability of $F_{n+1}$, we get
$$\lambda^*(A \cap G_{n+1}) = \lambda^*(A \cap G_{n+1} \cap F_{n+1}) + \lambda^*(A \cap G_{n+1} \cap F_{n+1}^c) = \lambda^*(A \cap F_{n+1}) + \lambda^*(A \cap G_n) = \sum_{j=1}^{n+1} \lambda^*(A \cap F_j)$$
and the result follows.
Now, by the previous claim, if $A$ is any subset of $\mathbb{R}$,
$$\lambda^*(A) = \lambda^*(A \cap G_n) + \lambda^*(A \cap G_n^c) \ge \lambda^*(A \cap G_n) + \lambda^*(A \cap F^c) = \sum_{j=1}^{n} \lambda^*(A \cap F_j) + \lambda^*(A \cap F^c).$$
Since $n$ is arbitrary,
$$\lambda^*(A) \ge \sum_j \lambda^*(A \cap F_j) + \lambda^*(A \cap F^c). \qquad (2)$$
Since $\lambda^*$ is countably sub-additive, we get, from (2),
$$\lambda^*(A) \ge \lambda^*(A \cap F) + \lambda^*(A \cap F^c),$$
which proves that $F$ is Lebesgue measurable. Now choosing $A = F$ in (2) and using sub-additivity again gives
$$\lambda^*(F) \ge \sum_j \lambda^*(F_j) \ge \lambda^*(F).$$
To prove (ii), just note that if we have $E_1, E_2, \ldots$ with union $E$, then setting $H_1 = E_1$, $H_{n+1} = E_{n+1} \setminus \left( \bigcup_{j=1}^{n} E_j \right)$, the $H_j$ are Lebesgue measurable and pairwise disjoint, and their union is $E$. (If $m < n$ then $H_m \subseteq E_m$ and $H_n \cap E_m$ is empty).
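The construction of the $H_j$ used for (ii) is purely set-theoretic and can be tried out on finite sets. A minimal sketch follows, with a hypothetical function name and hypothetical sample sets.

```python
def disjointify(sets):
    """Return H_1 = E_1 and H_{n+1} = E_{n+1} minus (E_1 union ... union E_n)."""
    seen = set()  # union of the sets processed so far
    pieces = []
    for E in sets:
        E = set(E)
        pieces.append(E - seen)
        seen |= E
    return pieces


E_list = [{1, 2, 3}, {2, 3, 4}, {1, 5}, {4, 5}]
H_list = disjointify(E_list)
print(H_list)
```

The pieces are pairwise disjoint and have the same union as the original sets; some of them may be empty, as happens for the last set here.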
8.5 Corollary
For Lebesgue measurable $F$ we define $\lambda(F) = \lambda^*(F)$. The Lebesgue measurable sets form a $\sigma$-algebra $\Pi$ of subsets of $\mathbb{R}$, and $\lambda$ is a measure on $\Pi$.
8.6 Theorem
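Let $a \in \mathbb{R}$. Then the open interval $(a, +\infty)$ is Lebesgue measurable.
Proof. Let $A$ be any subset of $\mathbb{R}$, and let $A_1 = A \cap (a, +\infty)$, $A_2 = A \cap (-\infty, a]$. We only need to show that $\lambda^*(A) \ge \lambda^*(A_1) + \lambda^*(A_2)$, which is obvious if $\lambda^*(A) = \infty$. Assume now that $\lambda^*(A)$ is finite. Take a real $\varepsilon > 0$ and choose a countable collection of open intervals $I_j$ which cover $A$, such that
$$\sum_j |I_j| < \lambda^*(A) + \varepsilon,$$
using $|I|$ for the length. Let
$$P_j = I_j \cap (a, +\infty), \qquad Q_j = I_j \cap (-\infty, a].$$
Then $A_1 \subseteq \bigcup_j P_j$ so
$$\lambda^*(A_1) \le \sum_j \lambda^*(P_j) = \sum_j |P_j|$$
since each $P_j$ is an interval (or empty). Doing the same for $A_2$,
$$\lambda^*(A_1) + \lambda^*(A_2) \le \sum_j |P_j| + \sum_j |Q_j| = \sum_j |I_j| < \lambda^*(A) + \varepsilon.$$
Since $\varepsilon$ is arbitrary the theorem is proved.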
8.7 Corollary
All Borel sets (in particular, all open sets) are Lebesgue measurable.
8.8 Theorem
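Let $E \subseteq \mathbb{R}$. The following statements are equivalent.
(i) $E$ is Lebesgue measurable.
(ii) For every real $\varepsilon > 0$ there exists an open set $U$ with $E \subseteq U$ and $\lambda^*(U \setminus E) < \varepsilon$.
(iii) There exists a set $V$, such that:
$V$ is the intersection of countably many open sets;
$V$ contains $E$;
$\lambda^*(V \setminus E) = 0$.
Thus Lebesgue measurable sets may in a certain sense be well-approximated by open sets.
Proof. (ii) implies (iii): for each $n \in \mathbb{N}$ take an open set $V_n$ containing $E$, such that $\lambda^*(V_n \setminus E) < 1/n$. Now let $V$ be the intersection of the $V_n$. Then $E \subseteq V$, and for each $n$ we have
$$\lambda^*(V \setminus E) \le \lambda^*(V_n \setminus E) < 1/n.$$
(iii) implies (i): take $V$ as in (iii), and set $W = \mathbb{R} \setminus V$ and $F = \mathbb{R} \setminus E$. Then $W \subseteq F$, and $F \setminus W = V \setminus E$.
Since $V$ is the intersection of open $U_j$, say, then $W = \mathbb{R} \setminus \bigcap U_j = \bigcup (\mathbb{R} \setminus U_j)$ is a countable union of closed sets, and so is Lebesgue measurable. Hence we can write $F$ as the union of the Lebesgue measurable set $W$ and a set $F \setminus W$ which has outer measure 0. Thus $F$ is Lebesgue measurable and so is $E$.
(i) implies (ii). First set $E_n = E \cap [-n, n]$, $n \in \mathbb{N}$. Each $E_n$ is Lebesgue measurable. For each $n \in \mathbb{N}$ choose, directly from the definition of outer measure, a countable union $U_n$ of open intervals containing $E_n$ and having sum of lengths less than $\lambda(E_n) + \varepsilon 2^{-n}$. Then $A_n = U_n \setminus E_n$ has $\lambda(E_n) + \lambda(A_n) = \lambda(U_n)$ and so, since $\lambda(E_n)$ is finite and $\lambda(U_n)$ is at most the sum of the lengths, $\lambda(A_n) = \lambda^*(A_n) < \varepsilon 2^{-n}$. Let $U$ be the union of the $U_n$, $n \in \mathbb{N}$. Then $U$ is open, and
$$U \setminus E = \left( \bigcup U_n \right) \setminus E = \bigcup (U_n \setminus E) \subseteq \bigcup (U_n \setminus E_n) = \bigcup A_n$$
has, by the subadditivity of $\lambda^*$, outer measure at most
$$\sum_{n \in \mathbb{N}} \lambda^*(A_n) < \varepsilon (1/2 + 1/4 + \ldots) = \varepsilon.$$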
9 A set which is not Lebesgue measurable
9.1 The Axiom of Choice
The version of this axiom which we will use is the following:
Suppose that we have a set $T$, and that $A_t$ is a non-empty set, for each $t \in T$, and that $A_t \cap A_s = \emptyset$, for $s, t \in T$, $s \ne t$. Then we can form a set $B = \{c_t : t \in T\}$ by choosing one $c_t$ from each $A_t$.
9.2 Theorem
There exists a subset E of [0, 1] such that E is not Lebesgue measurable.
Proof. We define a relation $\sim$ on $[0, 1]$ by $x \sim y$ iff $x - y$ is rational. Then $\sim$ is an equivalence relation. To see this, obviously $x \sim x$ (so $\sim$ is reflexive) and $x \sim y$ iff $y \sim x$ (symmetric) and $x \sim y$ and $y \sim z$ imply that $y - x$ and $z - y$ are rational, so that $z - x$ is rational (transitive).
For each $x \in I = [0, 1]$ we form the equivalence class
$$[x] = \{y \in I : y \sim x\}.$$
Then either $[x] \cap [y]$ is empty, or $[x] = [y]$. Thus we have a set of pairwise disjoint equivalence classes, whose union is $I$.
(To see that we have a set $T$ of these, use the mapping $\phi : I \to T$ given by $\phi(x) = [x]$, so that $T$ is just the image $\phi(I)$.)
Using the Axiom of Choice, we form a set $E$ which contains precisely one element of each equivalence class. So for each $x$ in $[0, 1]$ there is a unique $y$ in $E$ such that $x - y$ is rational. Note that $-1 \le x - y \le 1$.
Now use the fact that $H = \mathbb{Q} \cap [-1, 1]$ is countable, and write this set as $\{r_1, r_2, r_3, \ldots\}$, with the $r_j$ distinct rational numbers, using up $H$. Then every $x$ in $[0, 1]$ belongs to one of the $E_j$ defined by $E_j = E + r_j$. Also, if $j \ne k$ then $E_j \cap E_k = \emptyset$. For if $u$ lies in both then $u - r_j$ and $u - r_k$ are both in $E$. But $(u - r_j) \ne (u - r_k)$ and $(u - r_j) \sim (u - r_k)$, and $E$ contains just one element of each equivalence class.
Suppose that $E$ is Lebesgue measurable. Then so is each $E_j$. Now, $[0, 1]$ is contained in the countable union of the pairwise disjoint $E_j$, so
$$1 = \lambda([0, 1]) = \lambda^*([0, 1]) \le \sum_{j \in \mathbb{N}} \lambda^*(E_j) = \sum_{j \in \mathbb{N}} \lambda(E_j) = \sum_{j \in \mathbb{N}} \lambda(E).$$
So $\lambda(E) > 0$. But the $E_j$ are disjoint and each is a subset of $[-1, 2]$, so
$$\infty = \sum_{j \in \mathbb{N}} \lambda(E) = \sum_{j \in \mathbb{N}} \lambda(E_j) \le \lambda([-1, 2]) = 3,$$
which is impossible.
Hence it is not always true that
$$\lambda^*(A) = \lambda^*(A \cap E) + \lambda^*(A \cap E^c),$$
so the Lebesgue outer measure described earlier is not additive on the whole power set of $\mathbb{R}$, and so is not a measure on the power set of $\mathbb{R}$.
We also see that not every subset of $\mathbb{R}$ is a Borel set.
Not all mathematicians accept the Axiom of Choice. It can be shown to imply that every set $C$ has an ordering $<^*$ with the properties that, for all $a, b, c$ in $C$,
(i) either $a <^* b$ or $b <^* a$ or $b = a$, and $a <^* b$ and $b <^* c$ implies $a <^* c$, as well as
(ii) every non-empty subset $D$ of $C$ has a least element i.e. there exists $d \in D$ such that $d <^* c$ for all $c$ in $D$ with $c \ne d$.
Such an ordering is called a well-ordering. $\mathbb{N}$ is well-ordered by ordinary $<$, but $\mathbb{R}$ is not.
Every countable set $A$ has an obvious well-ordering. Write $A$ as $\{a_j\}$ without repetition, and order by $a_1 <^* a_2 <^* a_3 <^* \ldots$.
10 Measurable functions
We have constructed Lebesgue measure for (some) subsets of $\mathbb{R}$, and we now return to a general measure space. So assume we have a non-empty set $X$, a (non-empty) $\sigma$-algebra $\Pi$ of subsets of $X$, and a measure $\mu : \Pi \to [0, \infty]$. We will sometimes refer to the elements of $\Pi$ as $\mu$-measurable sets.
The aim is to construct the integral $\int_E f \, d\mu$ for measurable $E$, and to develop its properties, and in this chapter we determine which functions can be used. Since we occasionally need products of functions, we define some products involving $\infty$.
10.1 Products involving infinity
We now define
$$x . \infty = \infty \ \text{ if } x > 0, \quad \text{and} \quad 0 . \infty = 0.$$
The purpose of the latter is to make certain integrals take their expected values.
10.2 Lemma
Let $E \subseteq X$ with $E$ in $\Pi$, and let $f$ be a function from $E$ to the extended real numbers $\mathbb{R}^*$. The following four conditions are equivalent:
(i) For each real $y$, the set $A = \{x \in E : f(x) > y\}$ is in $\Pi$.
(ii) For each real $y$, the set $B = \{x \in E : f(x) \le y\}$ is in $\Pi$.
(iii) For each real $y$, the set $\{x \in E : f(x) \ge y\}$ is in $\Pi$.
(iv) For each real $y$, the set $\{x \in E : f(x) < y\}$ is in $\Pi$.
Each of these conditions implies that:
(v) for each extended real number $y$, the set $\{x \in E : f(x) = y\}$ is in $\Pi$.
Proof. (i) and (ii) are equivalent, because $B = E \setminus A = E \cap A^c$. If $A$ is in $\Pi$ then so are $A^c$ and $B$. Similarly, (iii) and (iv) are equivalent. Also, (i) and (iii) are equivalent, because
$$\{x \in E : f(x) > y\} = \bigcup_{n \in \mathbb{N}} \{x \in E : f(x) \ge y + 1/n\}$$
and
$$\{x \in E : f(x) \ge y\} = \bigcap_{n \in \mathbb{N}} \{x \in E : f(x) > y - 1/n\}.$$
When $y$ is finite, (v) clearly follows, since the intersection of elements of $\Pi$ is in $\Pi$. Finally
$$\{x \in E : f(x) = +\infty\} = \bigcap_{n \in \mathbb{N}} \{x \in E : f(x) > n\}$$
and
$$\{x \in E : f(x) = -\infty\} = \bigcap_{n \in \mathbb{N}} \{x \in E : f(x) < -n\}.$$
We define $f$ to be $\mu$-measurable (on $E$) if $f$ satisfies any of (i) to (iv).
10.3 Lemma
If $f$ is $\mu$-measurable then so are $-f$ and $f^2$ and $|f|$ and $cf$, for any constant $c > 0$.
Proof. Take $y \in \mathbb{R}$. Then $\{x \in E : -f(x) > y\} = \{x \in E : f(x) < -y\}$. Also $\{x \in E : cf(x) > y\} = \{x \in E : f(x) > y/c\}$. For $y < 0$ we clearly have $|f(x)| > y$ and $f(x)^2 > y$ on all of $E$, while for $y \ge 0$,
$$\{x \in E : f(x)^2 > y\} = \{x \in E : f(x) > y^{1/2}\} \cup \{x \in E : f(x) < -y^{1/2}\}$$
and the idea for $|f|$ is the same.
Sums and products are only difficult insofar as $f + g$ is undefined where $f = +\infty$ and $g = -\infty$ and vice versa.
10.4 Theorem
If $f, g$ are $\mu$-measurable functions on $E \in \Pi$, taking values in $\mathbb{R}$, then $f + g$ and $fg$ are $\mu$-measurable.
Proof. Suppose that $f(u) + g(u) > y \in \mathbb{R}$. Then $f(u) > y - g(u)$ and there is a rational number $r$ such that $f(u) > r > y - g(u)$ and so $g(u) > y - r$. The set $\{x \in E : f(x) > r, \ g(x) > y - r\}$ is the intersection of $\mu$-measurable sets and so $\mu$-measurable. Then
$$\{x \in E : f(x) + g(x) > y\} = \bigcup_{r \in \mathbb{Q}} \{x \in E : f(x) > r, \ g(x) > y - r\}.$$
Now $fg$ is also $\mu$-measurable, since
$$fg = \frac{1}{2} \left( (f + g)^2 - f^2 - g^2 \right).$$
Remark: we can modify the same proof for $f + g$ and $fg$ if $f, g$ are measurable on $E$ taking values in $[0, \infty]$. Let $M = \{x \in E : f(x) = \infty\}$ and $N = \{x \in E : g(x) = \infty\}$ and $D = E \setminus (M \cup N)$. Then $M$, $N$ and $D$ are in $\Pi$, and $f, g$ are $\mu$-measurable on $D$ (we always get the intersection of $D$ with an element of $\Pi$). So for all $y \in \mathbb{R}$ we have
$$\{x \in E : f(x) + g(x) > y\} = \{x \in D : f(x) + g(x) > y\} \cup M \cup N,$$
which is in $\Pi$.
Similarly if $y < 0$ then $f(x)g(x) > y$ for all $x$ in $E$, while if $y \ge 0$ then
$$\{x \in E : f(x)g(x) > y\} = \{x \in D : f(x)g(x) > y\} \cup V \cup W,$$
where $V$ is the set on which $f = \infty$ and $g > 0$ (an intersection of measurable sets) and $W$ is the set on which $g = \infty$ and $f > 0$.
10.5 Lemma
Let $f_1, f_2, \ldots$ be countably many $\mu$-measurable extended real-valued functions on $E \in \Pi$. We can assume that $f_n$ is defined for each $n \in \mathbb{N}$ by making all of them the same from some point on, if necessary. Then the functions $g, h$ defined by
$$g(x) = \inf\{f_n(x) : n \in \mathbb{N}\}, \qquad h(x) = \sup\{f_n(x) : n \in \mathbb{N}\}$$
are $\mu$-measurable.
If $f_n \to f$ pointwise on $E$, then $f$ is $\mu$-measurable on $E$.
Proof. If $y \in \mathbb{R}$ then
$$\{x \in E : h(x) > y\} = \bigcup_{n=1}^{\infty} \{x \in E : f_n(x) > y\}$$
and this is in $\Pi$. Similarly for $g$ (take the set of $x$ where $g(x) < y$).
Now suppose that $f_n \to f$. For each $n \in \mathbb{N}$ define $g_n$ by
$$g_n(x) = \sup\{f_k(x) : k \ge n\}.$$
Then each $g_n$ is $\mu$-measurable, by the first part. We claim that for each $x \in E$ we have
$$f(x) = \lim_{n \to \infty} g_n(x).$$
If $y < f(x)$ then for large $n$ we have $g_n(x) \ge f_n(x) > y$. Now suppose $y > f(x)$. Then for large $n$ we have $f_k(x) < y$ for $k \ge n$ and so $g_n(x) \le y$. So $g_n(x) \to f(x)$ as $n \to \infty$ and our claim is proved.
But we clearly have $g_{n+1}(x) \le g_n(x)$ (sup of a subset), and so for each $x \in E$ we get
$$\lim_{n \to \infty} g_n(x) = \inf\{g_n(x) : n \in \mathbb{N}\}.$$
So $f$ is an infimum of measurable functions and so measurable.
We saw in the chapter on Riemann integration that continuous and monotone functions are Riemann integrable. Here we show that they are measurable with respect to Lebesgue measure.
10.6 Theorem
Let $f : \mathbb{R} \to \mathbb{R}$ be continuous, and let $g : \mathbb{R} \to \mathbb{R}$ be non-decreasing. Then $f$ and $g$ are measurable with respect to Lebesgue measure.
Proof. Take $y$ in $\mathbb{R}$. The set $\{x \in \mathbb{R} : f(x) > y\}$ is open, so Lebesgue measurable.
If $g(x) > y$ for all $x \in \mathbb{R}$ then obviously $\{x \in \mathbb{R} : g(x) > y\} = \mathbb{R}$, which is Lebesgue measurable. Now let $a$ be the sup of $x$ such that $g(x) \le y$, assuming henceforth that there exists at least one such $x$. Then $g(x) > y$ for $x > a$ by definition. Next, if $x < a$ then (again by definition) there exists $x'$ with $x < x' \le a$ and $g(x') \le y$, so that since $g$ is non-decreasing we have $g(x) \le y$ for $x < a$. So the set $\{x \in \mathbb{R} : g(x) > y\}$ is either $\emptyset$ or $\mathbb{R}$ or $(a, \infty)$ or $[a, \infty)$ and all of these are Lebesgue measurable.
10.7 The characteristic function
Let $(X, \Pi, \mu)$ be a measure space. Let $A \subseteq X$, and define the characteristic function $\chi_A$ by $\chi_A(x) = 1$ if $x \in A$ and $\chi_A(x) = 0$ if $x \notin A$. It is clear that this function is $\mu$-measurable if and only if $A$ is in $\Pi$.
10.8 Simple functions
Let $(X, \Pi, \mu)$ be a measure space. A simple function is a function $s : X \to \mathbb{R}$ which takes only finitely many different values (all in $\mathbb{R}$ and so finite). We restrict simple functions to taking only finite values so that, for example, the sum of two simple functions is always defined (we avoid $\infty - \infty$). There are therefore finitely many disjoint sets $A_j$, $1 \le j \le n$, whose union is $X$, and pairwise distinct real numbers $\alpha_j$, such that we have
$$s(x) = \sum_{j=1}^{n} \alpha_j \chi_{A_j}(x).$$
Note that since $A_j = \{x \in X : s(x) = \alpha_j\}$, it follows that $s$ is $\mu$-measurable if and only if all the $A_j$ are in $\Pi$.
10.9 Theorem
Let $(X, \Pi, \mu)$ be a measure space and let $f : X \to [0, \infty]$ be a non-negative extended real-valued function on $X$. Then $f$ is $\mu$-measurable if and only if there are $\mu$-measurable simple functions $s_n$ such that $0 \le s_1 \le s_2 \le s_3 \le \ldots \le f$ on $X$ and $s_n \to f$ pointwise on $X$.
Proof. The "if" part follows at once from Lemma 10.5. Now suppose that $f$ is $\mu$-measurable. For $n \in \mathbb{N}$ divide the interval $[0, 2^n]$ into $4^n$ closed intervals of length $2^{-n}$, their end-points forming the set
$$T_n = \{0, \ 1/2^n, \ 2/2^n, \ 3/2^n, \ \ldots, \ 2^n - 1/2^n, \ 2^n\}.$$
For $x \in X$ and $n \in \mathbb{N}$ let $s_n(x)$ be the largest element of $T_n$ which is $\le f(x)$. Clearly $0 \le s_n(x) \le s_{n+1}(x)$ since $T_n \subseteq T_{n+1}$. If $f(x) = \infty$ then we have $s_n(x) = 2^n \to f(x)$. If $f(x)$ is finite then for large $n$ we have $f(x) < 2^n$ and $s_n(x) \le f(x) < s_n(x) + 1/2^n$, and so $s_n(x) \to f(x)$.
Finally $s_n$ is measurable because the sets $\{x : f(x) \ge 2^n\}$ and $\{x : j/2^n \le f(x) < (j + 1)/2^n\}$ are all measurable.
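The construction of $s_n$ in this proof is explicit enough to compute. The sketch below evaluates $s_n(x)$ for a sample function; the function $f(x) = x^2$, the sample points and the helper names are hypothetical, and `math.inf` stands for the value $\infty$.

```python
import math


def dyadic_approx(value, n):
    """Largest element of T_n = {j / 2**n : 0 <= j <= 4**n} that is <= value.

    `value` is a number in [0, inf]; math.inf represents infinity.
    """
    if value >= 2 ** n:
        return float(2 ** n)  # value lies at or beyond the top end-point 2**n
    return math.floor(value * 2 ** n) / 2 ** n


def s(n, f, x):
    """The n-th simple approximation s_n evaluated at x."""
    return dyadic_approx(f(x), n)


f = lambda x: x * x  # hypothetical example function
xs = [0.0, 0.3, 1.0, 2.5, 3.0]
table = [[s(n, f, x) for n in range(1, 8)] for x in xs]
for x, row in zip(xs, table):
    print(x, row)
```

Each row is non-decreasing in $n$ and, once $2^n$ exceeds $f(x)$, lies within $2^{-n}$ of $f(x)$, as in the proof.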
11 The Lebesgue integral of a non-negative simple function
Throughout this section we assume that $X$ is a set, and that $\mu$ is a (non-negative) measure defined on a $\sigma$-algebra $\Pi$ of subsets of $X$. When we write measurable for a set or function we will always mean with respect to $\mu$.
11.1 The integral of a simple function
To motivate this idea, consider the function $s$ given by
$$s(x) = 0 \ (x < 0), \quad s(x) = 1 \ (0 \le x < 1), \quad s(x) = 2 \ (1 \le x \le 2), \quad s(x) = 0 \ (x > 2).$$
Then $s$ is a Lebesgue measurable non-negative simple function. The area under the curve $y = s(x)$ is obviously $1 \cdot 1 + 2 \cdot 1 = 3$. Let
$$\alpha_0 = 0, \ A_0 = (-\infty, 0) \cup (2, \infty), \qquad \alpha_1 = 1, \ A_1 = [0, 1), \qquad \alpha_2 = 2, \ A_2 = [1, 2].$$
Then
$$s(x) = \sum_{j=0}^{2} \alpha_j \chi_{A_j}(x)$$
and the area under the curve is $\alpha_0 \lambda(A_0) + \alpha_1 \lambda(A_1) + \alpha_2 \lambda(A_2)$.
In the general case suppose that $s : X \to [0, \infty)$ is a non-negative simple function which is $\mu$-measurable. Note that $s$ doesn't take the value $\infty$. Hence there are distinct real numbers $\alpha_j \ge 0$ and pairwise disjoint measurable sets $A_j$ for $1 \le j \le n$ such that $X$ is the union of the $A_j$ and
$$s(x) = \sum_{j=1}^{n} \alpha_j \chi_{A_j}(x)$$
on $X$. We define
$$\int_X s \, d\mu = \sum_{j=1}^{n} \alpha_j \mu(A_j).$$
Note that it suffices to sum over those $j$ such that $\alpha_j \ne 0$.
For example, let $X = \mathbb{R}$ and let $A = \mathbb{Q}$. Then $\chi_A$ is a simple function and
$$\int_{\mathbb{R}} \chi_A \, d\lambda = 1 . \lambda(\mathbb{Q}) = 0.$$
Similarly, $\int_{\mathbb{R}} 0 \, d\lambda = 0$. This is the reason why we define $0 . \infty = 0$. We restrict attention (at least for now) to $s \ge 0$ because we need to avoid $\infty - \infty$.
Clearly
$$\int_X cs \, d\mu = c \int_X s \, d\mu$$
if $c$ is a non-negative real constant.
Note also that if $s$ is a non-negative $\mu$-measurable simple function, then $\int_X s \, d\mu = 0$ iff the set $\{x \in X : s(x) > 0\}$ has measure 0.
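The definition, including the convention $0 . \infty = 0$, translates directly into a few lines of code. In the sketch below a simple function is represented, purely for illustration, by a list of pairs (value $\alpha_j$, measure of $A_j$); this representation and the function names are hypothetical.

```python
import math


def times(alpha, m):
    """Product alpha * m in [0, inf], with the convention 0 * inf = 0."""
    if alpha == 0 or m == 0:
        return 0
    return alpha * m


def integrate_simple(terms):
    """Integral of s = sum_j alpha_j * chi_{A_j}, given pairs (alpha_j, measure of A_j)."""
    return sum(times(alpha, m) for alpha, m in terms)


# The motivating step function: value 0 on a set of infinite measure,
# value 1 on [0, 1) and value 2 on [1, 2], each of Lebesgue measure 1.
step = [(0, math.inf), (1, 1), (2, 1)]

# The characteristic function of Q: value 1 on Q (measure 0), value 0 elsewhere.
chi_Q = [(1, 0), (0, math.inf)]

print(integrate_simple(step), integrate_simple(chi_Q))
```

Without the convention, floating-point arithmetic would give `0 * math.inf` as `nan`, which is one way to see why the product $0 . \infty$ has to be defined separately.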
11.2 The integral over a subset
If $s$ is a non-negative $\mu$-measurable simple function and $E$ is a measurable subset of $X$ then $\chi_E s$ is a non-negative measurable simple function and we define
$$\int_E s \, d\mu = \int_X \chi_E s \, d\mu.$$
In fact we can write
$$s = \sum_j \alpha_j \chi_{A_j},$$
in which we just sum over those $j$ with $\alpha_j \ne 0$, and
$$\chi_E s = \sum_j \alpha_j \chi_{A_j \cap E}.$$
Thus
$$\int_E s \, d\mu = \sum_j \alpha_j \mu(A_j \cap E) = \sum_{\alpha_j \ne 0} \alpha_j \mu(A_j \cap E) = \sum_{\alpha_j \ne 0} \alpha_j \mu(\{x \in E : s(x) = \alpha_j\}).$$
In effect we are just restricting $s$ to $E$ and taking the measure of the portion of each set which lies in $E$.
For example,
$$\int_{[0,1]} \chi_{\mathbb{Q}} \, d\lambda = 1 . \lambda([0, 1] \cap \mathbb{Q}) = 0.$$
This integral is very easily computed, whereas the Riemann integral of this function over $[0, 1]$ does not exist.
11.3 Lemma
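Suppose that $s, t$ are non-negative $\mu$-measurable simple functions on $X$ and $0 \le s \le t$ on $E \in \Pi$. Then
$$\int_E s \, d\mu \le \int_E t \, d\mu.$$
Proof. Suppose first that $E = X$ and
$$s = \sum_j \alpha_j \chi_{A_j}, \qquad t = \sum_k \beta_k \chi_{B_k}.$$
Here $\chi_H$ means the characteristic function of $H$. The $A_j$ are disjoint and their union is $X$, and the $\alpha_j$ are distinct, and the same thing is true for $B_k$, $\beta_k$. Let $s = c_{j,k}$, $t = d_{j,k}$ on $A_j \cap B_k$. So $c_{j,k} = \alpha_j$, $d_{j,k} = \beta_k$, and $c_{j,k} \le d_{j,k}$ whenever $A_j \cap B_k$ is non-empty. Then
$$\int_X s \, d\mu = \sum_j \alpha_j \mu(A_j) = \sum_j \alpha_j \sum_k \mu(A_j \cap B_k) = \sum_j \sum_k c_{j,k} \mu(A_j \cap B_k) \le \sum_j \sum_k d_{j,k} \mu(A_j \cap B_k)$$
$$= \sum_k \sum_j d_{j,k} \mu(A_j \cap B_k) = \sum_k \beta_k \sum_j \mu(A_j \cap B_k) = \sum_k \beta_k \mu(B_k) = \int_X t \, d\mu.$$
In the general case we just note that $\chi_E s \le \chi_E t$. Lemma 11.3 is proved.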
11.4 Lemma
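Let $s$ be a non-negative $\mu$-measurable simple function on $X$. For each $\mu$-measurable $E \subseteq X$ (i.e. each $E \in \Pi$), set $\nu(E) = \int_E s \, d\mu = \int_X \chi_E s \, d\mu$. Then $\nu$ is a measure.
Obviously $\nu(\emptyset) = \int_X 0 \, d\mu = 0$. Suppose that $E$ is a countable union of pairwise disjoint sets $E_k$. Suppose $s = \sum_j \alpha_j \chi_{A_j}$. Then
$$\nu(E) = \sum_j \alpha_j \mu(A_j \cap E) = \sum_j \alpha_j \sum_k \mu(A_j \cap E_k) = \sum_j \sum_k \alpha_j \mu(A_j \cap E_k)$$
$$= \sum_k \sum_j \alpha_j \mu(A_j \cap E_k) = \sum_k \int_{E_k} s \, d\mu = \sum_k \nu(E_k).$$
We can change the order of summation since all terms are non-negative.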
11.5 Lemma
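Let $s, t$ be non-negative $\mu$-measurable simple functions on $X$, and let $E$ be a $\mu$-measurable subset of $X$. Then
$$\int_E (s + t) \, d\mu = \int_E s \, d\mu + \int_E t \, d\mu.$$
Proof. We only need prove this when $E = X$, since $\chi_E s$, $\chi_E t$ are simple. Let (as before)
$$s = \sum_j \alpha_j \chi_{A_j}, \qquad t = \sum_k \beta_k \chi_{B_k}, \qquad E_{j,k} = A_j \cap B_k.$$
Then
$$\int_{E_{j,k}} (s + t) \, d\mu = (\alpha_j + \beta_k) \mu(E_{j,k}) = \int_{E_{j,k}} s \, d\mu + \int_{E_{j,k}} t \, d\mu.$$
But then, by the previous lemma, since $s + t$ is simple,
$$\int_X (s + t) \, d\mu = \sum_{j,k} \int_{E_{j,k}} (s + t) \, d\mu = \sum_{j,k} \int_{E_{j,k}} s \, d\mu + \sum_{j,k} \int_{E_{j,k}} t \, d\mu = \int_X s \, d\mu + \int_X t \, d\mu.$$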
12 The Lebesgue integral of a general non-negative function
Note first that if $s$ is a non-negative $\mu$-measurable simple function on $X$ then Lemma 11.3 gives
$$\int_X s \, d\mu = \sup \int_X t \, d\mu$$
in which the sup is taken over all non-negative $\mu$-measurable simple $t$ such that $0 \le t \le s$ on $X$.
Motivated partly by this fact, partly by lower sums for the Riemann integral and partly by Theorem 10.9, we do the following. If $f$ is any non-negative $\mu$-measurable function on $X$, taking values in $[0, \infty]$, we define
$$\int_X f \, d\mu = \sup \int_X s \, d\mu,$$
in which the sup is taken over all measurable simple functions $s$ such that $0 \le s \le f$ on $X$. Note that if $f$ is itself a simple function, then the sup is a max.
For example, we show that
$$\int_{\mathbb{R}} |x| \, d\lambda = \infty.$$
To see this, let $s(x)$ be 0 if $x < 1$, with $s(x) = 1$ if $x \ge 1$. Then $s$ is a non-negative Lebesgue measurable simple function, with $0 \le s(x) \le |x|$ on $\mathbb{R}$. Hence
$$\int_{\mathbb{R}} |x| \, d\lambda \ge \int_{\mathbb{R}} s \, d\lambda = 1 . \lambda([1, \infty)) = \infty.$$
(i) If $c$ is a positive real number then $\int_X cf \, d\mu = c \int_X f \, d\mu$ for every non-negative measurable $f$.
We just write
$$\int_X cf \, d\mu = \sup_{0 \le s \le cf} \int_X s \, d\mu = \sup_{0 \le ct \le cf} \int_X ct \, d\mu = \sup_{0 \le t \le f} c \int_X t \, d\mu = c \int_X f \, d\mu,$$
in which each $s$ is $\mu$-measurable and simple and we write $s = ct$, and use the obvious fact that $ct \le cf$ iff $t \le f$.
(ii) If $f = 0$ on $X$ then $f$ is simple and $\int_X f \, d\mu = 0$ (even if $\mu(X) = \infty$).
12.1 The integral over a subset
Let $E$ be a $\mu$-measurable subset of $X$. Let $f$ be a non-negative $\mu$-measurable extended real-valued function on $X$. Then $g = \chi_E . f$ is a $\mu$-measurable function.
We define
$$\int_E f \, d\mu = \int_X \chi_E . f \, d\mu.$$
Note that this agrees with our earlier definition when $f$ is simple.
If $f$ is not defined on all of $X$, but only on some measurable $F$ with $E \subseteq F \subseteq X$, we extend $f$ to $X$ by making it 0 off $F$. Then $\chi_E . f$ is the same regardless of which $F$ we have.
We list some properties of the integral so defined.
(i) If $f \le g$ on $E$ apart from on a set of $\mu$-measure 0, then $\int_E f \, d\mu \le \int_E g \, d\mu$.
To see this, suppose first that $E = X$, and that we can write $X = F \cup G$, where $G$ has measure 0, $F \cap G = \emptyset$, and $f \le g$ on $F$. Then for non-negative $\mu$-measurable simple $s$, taking values $\alpha_j$ on pairwise disjoint $\mu$-measurable sets $A_j$, we have, since each $\mu(A_j \cap G) = 0$,
$$\int_X s \, d\mu = \sum_j \alpha_j \mu(A_j) = \sum_j \alpha_j \mu(A_j \cap F) + \sum_j \alpha_j \mu(A_j \cap G) = \int_X \chi_F . s \, d\mu.$$
So if $0 \le s \le f$ on $X$ then $\chi_F s \le \chi_F f \le \chi_F g \le g$ and so
$$\int_X s \, d\mu = \int_X \chi_F s \, d\mu \le \int_X g \, d\mu.$$
Taking the sup over all $s$ with $0 \le s \le f$ we get
$$\int_X f \, d\mu \le \int_X g \, d\mu.$$
In the general case we just note that $\chi_E . f \le \chi_E . g$ on all of $X$ apart from a set of measure 0.
Note that it is standard to say that a property holds almost everywhere (or a.e.) on $E$ if it holds on $E \setminus G$, where $G$ has $\mu$-measure 0. This is a very useful concept but it is important to remember that whether $G$ has measure 0 will in general depend on which measure we are using.
(ii) If $f = g$ a.e. on $E$ (i.e. apart from on a set of measure 0) then $\int_E f \, d\mu = \int_E g \, d\mu$.
This is easy, by (i), but is useful, and implies for example that we can change $f$ to be, say, 0 on a set of measure 0 without changing any integral.
(iii) If $\mu(E) = 0$ then $f = 0$ a.e. on $E$ and so $\int_E f \, d\mu = \int_E 0 \, d\mu = 0$.
(iv) If $A \subseteq B$ and $A, B$ are $\mu$-measurable subsets of $X$ then
$$\int_A f \, d\mu = \int_X \chi_A f \, d\mu \le \int_X \chi_B f \, d\mu = \int_B f \, d\mu.$$
(v) Suppose that $\int_E f \, d\mu$ is finite. Then the set $F = \{x \in E : f(x) = +\infty\}$ has measure zero.
To see this, put $s_n(x) = n \chi_F(x)$, $n \in \mathbb{N}$. Then $s_n \le \chi_E . f$ on $X$ so $n \mu(F) = \int_X s_n \, d\mu \le \int_E f \, d\mu$ for every $n$, which forces $\mu(F) = 0$.
We can express this conveniently by saying that $f$ is finite a.e. on $E$.
(vi) Suppose that $\int_E f \, d\mu = 0$. Then the set $F = \{x \in E : f(x) > 0\}$ has measure 0.
If not, the set $F_n = \{x \in E : f(x) > 1/n\}$ has positive measure for some $n \in \mathbb{N}$, and we put $s = (1/n) \chi_{F_n}$. Then $0 \le s \le \chi_E . f$ on $X$ and $0 < (1/n) \mu(F_n) = \int_X s \, d\mu \le \int_E f \, d\mu = 0$, a contradiction.
Combining (ii) and (vi) we see that for a $\mu$-measurable non-negative function $f$ on $E$ we have $\int_E f \, d\mu = 0$ if and only if $f$ vanishes a.e. on $E$.
12.2 Monotone convergence theorem
Let $f_n$ be non-negative measurable functions on a measurable subset $E$ of $X$, such that
(i) $0 \leq f_1 \leq f_2 \leq \ldots$ a.e. on $E$, and
(ii) $f_n \to f$ pointwise a.e. on $E$,
i.e. $0 \leq f_1 \leq f_2 \leq \ldots \leq f$ and $f_n \to f$ pointwise on $F = E \setminus G$, where $\mu(G) = 0$.
Then
\[ \int_E f_n \, d\mu \to \int_E f \, d\mu, \quad n \to \infty. \]
Proof. Strictly speaking $f$ is not defined a priori on all of $E$. However, $f$ is measurable on $F$, by 10.5. If we change $f_n$ and $f$ to all be $0$ on $G$ then (i) is still satisfied and $f_n \to f$ pointwise on all of $E$, the function $f$ is measurable on $E$, and we have not changed the values of any integrals.
Assume for now that $E = X$. Now clearly, by the monotone sequence theorem,
\[ \int_X f_{n-1} \, d\mu \leq \int_X f_n \, d\mu \to y \in [0, +\infty] \]
as $n \to \infty$. Since $f_n \leq f$, we have $y \leq \int_X f \, d\mu$.
The proof will be completed for the case $E = X$ if we can show that $y \geq \int_X f \, d\mu$. Suppose that $s$ is $\mu$-measurable and simple, with $0 \leq s \leq f$ on $X$, and set $\nu(U) = \int_U s \, d\mu$ for each measurable $U \subseteq X$. This is a measure, as shown in the last chapter.
Let $0 < c < 1$. Let $E_n = \{ x \in X : f_n(x) \geq c s(x) \}$. Then $E_n \subseteq E_{n+1}$ (obvious). We claim that every $x \in X$ is in the union of the $E_n$, which is obvious if $s(x) = 0$, because then $x$ is in $E_1$. If $s(x) > 0$ then $f(x) > 0$ and $c s(x) < s(x) \leq f(x)$ so $x \in E_n$ for large $n$, since $f_n(x) \to f(x) \geq s(x)$. Now
\[ \int_X f_n \, d\mu \geq \int_{E_n} f_n \, d\mu \geq c \int_{E_n} s \, d\mu = c \nu(E_n) \to c \nu(X), \]
since $X$ is the union of the expanding sets $E_n$. Thus $y \geq c \nu(X) = c \int_X s \, d\mu$, so $y \geq \int_X s \, d\mu$ since $c$ is arbitrary, and so since $s$ is arbitrary we get $y \geq \int_X f \, d\mu$.
For the general case in which we have $E$ we just extend each $f_n$ and $f$ to be $0$ on $X \setminus E$, and we get
\[ \int_E f_n \, d\mu = \int_X \chi_E f_n \, d\mu \to \int_X \chi_E f \, d\mu = \int_E f \, d\mu. \]
This is by applying the result for $X$ to $\chi_E f_n$ and $\chi_E f$.
12.3 Examples and remarks
(i) We will later see that this result is not true without the condition that $f_1 \geq 0$. Let $s_n(x) = 0$ for $x \leq n$ and $s_n(x) = -1$ for $x > n$. Then $s_n$ is simple, $s_n \leq s_{n+1}$ and $s_n \to 0$ pointwise. But we'll see that, with $\lambda$ denoting Lebesgue measure,
\[ -\infty = \int_{\mathbb{R}} s_n \, d\lambda \not\to 0 = \int_{\mathbb{R}} 0 \, d\lambda. \]
(ii) We also cannot drop the hypothesis that $f_1 \leq f_2$ etc. If we set $f_n(x) = n \chi_{[0,1/n]}(x)$ then $\int_{\mathbb{R}} f_n \, d\lambda = 1$, but $f_n$ converges pointwise to the function which is $\infty$ at $0$ and $0$ everywhere else, and this function has integral $0$.
(iii) If $f$ is a non-negative measurable function and we choose a non-decreasing sequence of non-negative simple functions $s_n$ tending to $f$ pointwise (as in Theorem 10.9), then
\[ \int_X s_n \, d\mu \to \int_X f \, d\mu. \]
(iv) If $a_m \in [0, \infty]$ for each $m \in \mathbb{N}$ and we set $f(m) = a_m$ then
\[ \int_{\mathbb{N}} f \, d\mu = \sum_{m=1}^{\infty} a_m, \]
in which $\mu$ is the counting measure on $\mathbb{N}$. To see this, let $f_n(m) = a_m$ for $1 \leq m \leq n$, with $f_n(m) = 0$ for $m > n$. Assume first that all $a_m$ are finite. Then each $f_n$ is a simple function and $f_n$, $f$ satisfy the hypotheses of the MCT and so
\[ \int_{\mathbb{N}} f \, d\mu = \lim_{n \to \infty} \int_{\mathbb{N}} f_n \, d\mu = \lim_{n \to \infty} \sum_{m=1}^{n} a_m = \sum_{m=1}^{\infty} a_m. \]
If any $a_m$ is infinite then so is the integral of $f$ (since $f$ is infinite on a set of non-zero $\mu$-measure), and so is the sum of the series (look at partial sums).
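To make example (ii) concrete, the following computation is a sketch, assuming that the measure on $\mathbb{R}$ is Lebesgue measure (written $\lambda$ here) and that $f_n = n \chi_{[0,1/n]}$ as in (ii). It shows the integrals staying at $1$ while the limit function has integral $0$, and it exhibits a point where monotonicity fails.

```latex
\[
\begin{aligned}
\int_{\mathbb{R}} f_n \, d\lambda
  &= n \, \lambda\bigl([0, 1/n]\bigr) = n \cdot \frac{1}{n} = 1
  && \text{for every } n \in \mathbb{N}, \\
f(x) = \lim_{n \to \infty} f_n(x)
  &= \begin{cases} \infty, & x = 0, \\ 0, & x \neq 0, \end{cases}
  && \text{so } f = 0 \text{ a.e. and } \int_{\mathbb{R}} f \, d\lambda = 0, \\
f_n(x) = n &> 0 = f_{n+1}(x)
  && \text{for } \tfrac{1}{n+1} < x \leq \tfrac{1}{n}.
\end{aligned}
\]
```

The last line is the reason the MCT does not apply: the sequence is not non-decreasing on any interval $(1/(n+1), 1/n]$, and these intervals have positive measure.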
12.4 Theorem
Let $f_n$ be non-negative measurable functions on a measurable subset $E$ of $X$ and let $f = \sum_{n=1}^{\infty} f_n$. Then
\[ \int_E f \, d\mu = \sum_{n=1}^{\infty} \int_E f_n \, d\mu. \]
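Proof. As usual we can assume $E = X$ because otherwise we can extend the $f_n$ and $f$ to $X$ by making them $0$ on $X \setminus E$. We first prove the theorem for finite sums. Let $s_n$ be non-negative $\mu$-measurable simple functions such that $s_n \leq s_{n+1} \leq f_1$ and $s_n \to f_1$ pointwise, and let $t_n \to f_2$ in the same fashion. Then $0 \leq s_n + t_n \leq s_{n+1} + t_{n+1}$ and so, by the MCT,
\[ \int_X s_n \, d\mu + \int_X t_n \, d\mu = \int_X (s_n + t_n) \, d\mu \to \int_X (f_1 + f_2) \, d\mu. \]
Here we've used the result, already proved, that the integral of the sum of two simple functions is the sum of the integrals. But $\int_X s_n \, d\mu \to \int_X f_1 \, d\mu$ and $\int_X t_n \, d\mu \to \int_X f_2 \, d\mu$. The theorem is thus proved for the sum of two functions and, by induction, for the sum of finitely many functions.
Now the partial sums $\sum_{j=1}^{n} f_j$ are non-decreasing in $n$ and converge pointwise to $f$, so by the MCT
\[ \int_X f \, d\mu = \lim_{n \to \infty} \int_X \sum_{j=1}^{n} f_j \, d\mu = \lim_{n \to \infty} \sum_{j=1}^{n} \int_X f_j \, d\mu = \sum_{j=1}^{\infty} \int_X f_j \, d\mu. \]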
12.5 Theorem
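Let $f$ be a non-negative measurable function on $X$. Define $\nu(E) = \int_E f \, d\mu$. Then $\nu$ is a measure and
\[ \int_E g \, d\nu = \int_E g f \, d\mu \]
for non-negative $\mu$-measurable $g$.
Proof. Let $E$ be a countable union of pairwise disjoint measurable $E_j$. Then, using Theorem 12.4 for the third equality,
\[ \nu(E) = \int_X \chi_E f \, d\mu = \int_X \sum_j \chi_{E_j} f \, d\mu = \sum_j \int_X \chi_{E_j} f \, d\mu = \sum_j \nu(E_j). \]
Thus $\nu$ is a measure (obviously $\nu(\emptyset) = 0$). Now if $g = \chi_E$ for some $E$ then
\[ \int_X g f \, d\mu = \int_E f \, d\mu = \nu(E) = \int_X \chi_E \, d\nu = \int_X g \, d\nu. \]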
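Suppose now that $g$ is non-negative, measurable and simple. Write
\[ g = \sum_j \alpha_j \chi_{A_j} \]
as usual (a finite sum, with $\alpha_j \in [0, \infty)$ and each $A_j$ measurable). Then Theorem 12.4 gives
\[ \int_X g f \, d\mu = \sum_j \alpha_j \int_X \chi_{A_j} f \, d\mu = \sum_j \alpha_j \int_{A_j} f \, d\mu = \sum_j \alpha_j \nu(A_j) = \int_X g \, d\nu. \]
For a general non-negative measurable $g$, take simple $s_n$ with limit $g$ and $0 \leq s_1 \leq s_2 \leq \ldots \leq g$. Then
\[ \int_X g \, d\nu = \lim_{n \to \infty} \int_X s_n \, d\nu = \lim_{n \to \infty} \int_X s_n f \, d\mu = \int_X g f \, d\mu \]
by the MCT. Integrals over a measurable subset $E$ are handled by applying this to $\chi_E g$ in place of $g$.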
13 The integral of a general measurable function
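Suppose now that $f$ is any measurable function on a measurable subset $E$ of $X$, taking values in $\mathbb{R}^* = [-\infty, \infty]$. Then $f^+ = \max\{f, 0\}$ and $f^- = \max\{-f, 0\}$ are measurable, and $f = f^+ - f^-$, and we can define
\[ \int_E f \, d\mu = \int_E f^+ \, d\mu - \int_E f^- \, d\mu \]
provided this is not $\infty - \infty$, i.e. at least one of $\int_E f^+ \, d\mu$, $\int_E f^- \, d\mu$ is finite.
Example: let $s(x) = 0$ for $x \leq a \in \mathbb{R}$, with $s(x) = -1$ for $x > a$. Then $s^+ = 0$, $s^- = \chi_{(a,\infty)}$ and, with $\lambda$ denoting Lebesgue measure,
\[ \int_{\mathbb{R}} s \, d\lambda = 0 - \lambda((a, \infty)) = -\infty. \]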
13.1 Integrable functions
We say that a $\mu$-measurable function $f : E \to [-\infty, \infty]$ is integrable if both $\int_E f^+ \, d\mu$ and $\int_E f^- \, d\mu$ are finite, in which case
\[ \int_E f \, d\mu = \int_E f^+ \, d\mu - \int_E f^- \, d\mu \]
definitely exists. Note that in this case $f^+$ and $f^-$ are finite a.e., and so by changing $f$ on a set of measure $0$ we can assume that in fact $f$ maps $E$ into $\mathbb{R}$.
We list some easy properties of integrable $f$. We have
\[ \int_E f \, d\mu \leq \int_E f^+ \, d\mu \leq \int_E |f| \, d\mu. \]
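Since we also have, obviously,
\[ -\int_E f \, d\mu = \int_E (-f) \, d\mu \leq \int_E |f| \, d\mu, \]
we get the standard inequality
\[ \left| \int_E f \, d\mu \right| \leq \int_E |f| \, d\mu. \]
Finally, if $f$ and $g$ are integrable functions and $f \leq g$ then $f^+ \leq g^+$ and $g^- \leq f^-$, and so
\[ \int_E f \, d\mu \leq \int_E g \, d\mu. \]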
13.2 Lemma
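Suppose that $f$, $g$ are integrable functions and that $\alpha$ is a real number. Then $f + g$ and $\alpha f$ are integrable and
\[ \int_E \alpha f \, d\mu = \alpha \int_E f \, d\mu, \qquad \int_E (f + g) \, d\mu = \int_E f \, d\mu + \int_E g \, d\mu. \]
Proof. Since $f$ and $g$ are integrable we can assume that both take values in $\mathbb{R}$, and so $f + g$ and $\alpha f$ are certainly measurable. It remains only to prove the assertion about $h = f + g$. Since $h^+ \leq f^+ + g^+$ and $h^- \leq f^- + g^-$, the function $h$ is integrable. We have
\[ h^+ - h^- = h = f + g = f^+ - f^- + g^+ - g^- \]
and so $h^+ + f^- + g^- = h^- + f^+ + g^+$. Now Theorem 12.4 gives
\[ \int_E h^+ \, d\mu + \int_E f^- \, d\mu + \int_E g^- \, d\mu = \int_E h^- \, d\mu + \int_E f^+ \, d\mu + \int_E g^+ \, d\mu \]
and re-arranging gives the result, since all six integrals are finite.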
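The re-arrangement at the end of this proof, and the scalar case that the proof does not write out, can be sketched as follows. Two assumptions are made here: the scalar is written $\alpha$, and we use the fact that $\int_E c u \, d\mu = c \int_E u \, d\mu$ for constants $c \geq 0$ and non-negative measurable $u$.

```latex
% Sum: subtract the finite quantities \int h^-, \int f^-, \int g^- from both sides.
\[
\begin{aligned}
\int_E h \, d\mu
  &= \int_E h^+ \, d\mu - \int_E h^- \, d\mu \\
  &= \Bigl( \int_E f^+ \, d\mu - \int_E f^- \, d\mu \Bigr)
   + \Bigl( \int_E g^+ \, d\mu - \int_E g^- \, d\mu \Bigr)
   = \int_E f \, d\mu + \int_E g \, d\mu .
\end{aligned}
\]
% Scalar multiple with \alpha < 0: the positive and negative parts swap roles.
\[
\begin{aligned}
(\alpha f)^+ &= |\alpha| \, f^-, \qquad (\alpha f)^- = |\alpha| \, f^+, \\
\int_E \alpha f \, d\mu
  &= |\alpha| \int_E f^- \, d\mu - |\alpha| \int_E f^+ \, d\mu
   = -|\alpha| \int_E f \, d\mu
   = \alpha \int_E f \, d\mu .
\end{aligned}
\]
```

For $\alpha \geq 0$ the parts do not swap, $(\alpha f)^{\pm} = \alpha f^{\pm}$, and the same computation goes through directly.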
13.3 The dominated convergence theorem
Suppose that $f_n$ and $g$ are measurable functions from a measurable subset $E$ of $X$ to $\mathbb{R}^*$, with $|f_n| \leq g$ a.e. on $E$, and $\int_E g \, d\mu < +\infty$. Suppose that $f_n \to f$ pointwise a.e. on $E$. Then $f$ is integrable and
\[ \int_E |f_n - f| \, d\mu \to 0 \quad \text{and} \quad \int_E f_n \, d\mu \to \int_E f \, d\mu \quad \text{as } n \to +\infty. \]
Proof. As usual we can assume that $E = X$, by extending the functions to be $0$ off $E$. We can also assume that $|f_n| \leq g$ and $f_n \to f$ pointwise on all of $X$, by changing the functions to all be $0$ on the set of measure $0$ where this may fail. Doing this does not change the value of any integrals. We know that $f$ is measurable. But $|f| \leq g$, so $f^+ \leq g$, $f^- \leq g$, and so $f$ is integrable.
Since $\int_X g \, d\mu$ is finite, $g$ is finite off a set $F$ of measure $0$, and again we can change $f_n$, $f$ and $g$ to be $0$ on $F$ without changing the value of any integrals.
For each $n \in \mathbb{N}$, define $h_n$ by
\[ h_n(x) = \sup \{ |f_k(x) - f(x)| : k \geq n \}. \]
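Then, since $f_n(x) \to f(x) \in \mathbb{R}$, we see that $h_n \to 0$ pointwise on $X$. We also have $h_{n+1} \leq h_n$. Moreover, $h_n \leq 2g$, since $|f_k(x) - f(x)| \leq 2g(x)$ for all $k \geq n$ and for all $x \in X$.
Put $g_n = 2g - h_n$. Then $0 \leq g_1 \leq g_2 \leq \ldots \leq 2g$, and $g_n \to 2g$ pointwise. Hence the MCT tells us that
\[ \int_X g_n \, d\mu \to \int_X 2g \, d\mu. \]
This gives
\[ \int_X h_n \, d\mu = \int_X 2g \, d\mu - \int_X g_n \, d\mu \to 0, \]
using the fact that $g$ is integrable. Hence
\[ \int_X |f_n - f| \, d\mu \leq \int_X h_n \, d\mu \to 0 \]
as $n \to +\infty$. Thus
\[ \int_X (f_n - f) \, d\mu \to 0, \]
and
\[ \int_X f_n \, d\mu = \int_X (f_n - f) \, d\mu + \int_X f \, d\mu \to \int_X f \, d\mu. \]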
13.4 Remark
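The existence of $g$ is necessary for the theorem to work. Let $f_n(x) = n^2$ for $0 < x \leq 1/n$, and let $f_n(x) = 0$ otherwise. Then $f_n \to 0$ pointwise, but, with $\lambda$ denoting Lebesgue measure,
\[ \int_{\mathbb{R}} f_n \, d\lambda = n \to \infty. \]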
13.5 Theorem
Let the bounded real-valued function $f$ be Riemann integrable on $I = [a, b]$, with $|f| \leq M < \infty$ there. Then $f$ is measurable with respect to Lebesgue measure $\lambda$ on $I$ and the Riemann integral $J_1 = \int_a^b f(x) \, dx$ equals the Lebesgue integral $J_2 = \int_I f \, d\lambda$.
Proof. Suppose first that $f \geq 0$ on $[a, b]$, say $0 \leq f \leq N$. Let $P$ be a partition of $I$. Then $L(P, f)$ is equal to $\int_I s \, d\lambda$ for some simple function $s$ with $0 \leq s \leq f$. Also $U(P, f)$ equals $\int_I S \, d\lambda$ for some simple $S$ with $N \geq S \geq f$.
We first show that, assuming $f$ is measurable, we have $J_1 = J_2$.
We have $L(P, f) = \int_I s \, d\lambda \leq \int_I f \, d\lambda = J_2$, and taking the supremum over all $P$ we get $J_1 \leq J_2$. Similarly $J_2 \leq \int_I S \, d\lambda = U(P, f)$, and taking the inf over $P$ we get $J_2 \leq J_1$.
We now prove that $f$ is a Lebesgue measurable function. Since $f$ is Riemann integrable, we can take partitions $P_n$ with $P_{n+1}$ a refinement of $P_n$ and $L(P_n, f) \to J_1$, $U(P_n, f) \to J_1$ as $n \to \infty$. This gives us simple functions $s_n$, $S_n$ with
\[ 0 \leq s_n \leq s_{n+1} \leq f \leq S_{n+1} \leq S_n \leq N \]
such that, with $\lambda$ denoting Lebesgue measure,
\[ \int_I (S_n - s_n) \, d\lambda = \int_I S_n \, d\lambda - \int_I s_n \, d\lambda = U(P_n, f) - L(P_n, f) \to 0 \]
as $n \to \infty$. Let $S^*(x) = \inf_n S_n(x)$, $s^*(x) = \sup_n s_n(x)$ on $I = [a, b]$. Then
\[ s^* \leq f \leq S^*, \qquad |S_n - s_n| \leq N, \]
and $S_n - s_n \to S^* - s^*$ pointwise on $I$. So the DCT tells us that
\[ \int_I (S^* - s^*) \, d\lambda = 0, \qquad S^* - s^* = 0 \text{ a.e.,} \]
the second statement holding because $S^* - s^* \geq 0$.
So we get $S^* = s^* = f$ a.e. on $I$, say on $F = I \setminus G$, where $\lambda(G) = 0$. Since $S^*$ is measurable (an infimum of measurable functions) we deduce that $f$ is measurable on $F$. We then have, for $y \in \mathbb{R}$,
\[ \{ x \in I : f(x) > y \} = \{ x \in F : f(x) > y \} \cup \{ x \in G : f(x) > y \}, \]
which is the union of a Lebesgue measurable set and a set of measure $0$, and so Lebesgue measurable.
It remains only to consider the case where $f : I \to [-M, M]$ is Riemann integrable. Here we just apply the above proof to $f + M$, and note that adding $M$ to $f$ just increases the Lebesgue and Riemann integrals by $M(b - a)$.
14 Pointwise and uniform convergence revisited
14.1 Example
Let $h_n(x) = e^{-nx}$ on $I = [0, 1]$, and let
\[ h(0) = 1, \qquad h(x) = 0 \quad (0 < x \leq 1). \]
Then $h_n \to h$ pointwise on $I$.
It is not true that $h_n \to h$ uniformly on $I$: take $\varepsilon = \frac{1}{4}$ and set
\[ x_n = \frac{\ln 2}{n}. \]
Then
\[ h_n(x_n) = \frac{1}{2}, \qquad |h_n(x_n) - h(x_n)| = \frac{1}{2} > \varepsilon. \]
So there is no $N$ such that $|h_n(x) - h(x)| < \varepsilon$ for all $n \geq N$ and all $x \in I$.
However, it is true that if we fix $\delta > 0$ then $h_n \to h$ uniformly on $[\delta, 1]$. In fact, on this set, we have
\[ |h_n(x) - h(x)| = h_n(x) \leq e^{-n\delta} < \varepsilon \]
provided $n$ is large enough.
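How large $n$ must be can be made explicit. The following is a sketch, assuming $0 < \varepsilon < 1$ and $0 < \delta \leq 1$, with $h_n(x) = e^{-nx}$ as in this example; the threshold name $N$ and the sample values are illustrative choices, not taken from the notes.

```latex
\[
\begin{aligned}
e^{-n\delta} < \varepsilon
  &\iff n\delta > \ln(1/\varepsilon)
   \iff n > \frac{\ln(1/\varepsilon)}{\delta},
  \qquad \text{so } N = \Bigl\lfloor \frac{\ln(1/\varepsilon)}{\delta} \Bigr\rfloor + 1 \text{ works.} \\
\text{For } \varepsilon = \tfrac14,\ \delta = \tfrac1{10}: \quad
  \frac{\ln 4}{1/10} &\approx 13.86, \qquad N = 14, \qquad
  e^{-14/10} \approx 0.247 < \tfrac14 < 0.273 \approx e^{-13/10}.
\end{aligned}
\]
```

The threshold grows like $1/\delta$ as $\delta \to 0$, which is another way of seeing that no single $N$ can serve on all of $(0, 1]$.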
14.2 Lemma
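Let $E$ be a subset of $\mathbb{R}$ with finite Lebesgue measure $\lambda(E)$. Let $f_n : E \to \mathbb{R}^*$ be measurable functions converging pointwise on $E$ to a function $f : E \to \mathbb{R}$. Let $\varepsilon$, $\delta$ be positive. Then there exist $N$ and a subset $A$ of $E$ such that $\lambda(A) < \delta$ and $|f_n(x) - f(x)| < \varepsilon$ for all $n \geq N$ and all $x \in E \setminus A$.
Proof. We know that $f$ is also measurable, by Lemma 10.5. Let $E_N$ be the set of all $x \in E$ for which there exists $n \geq N$ with $|f_n(x) - f(x)| \geq \varepsilon$. In fact this set is
\[ E \cap \bigcup_{n \geq N} \{ x : |f_n(x) - f(x)| \geq \varepsilon \} \]
and so is Lebesgue measurable. Then $E_{N+1} \subseteq E_N$ and the intersection $P$ of the $E_N$ is empty, because $f_n \to f$ pointwise on $E$. Since
\[ \lambda(E) = \lambda(E \setminus P) = \lim_{N \to \infty} \lambda(E \setminus E_N) \]
and $\lambda(E)$ is finite, we have $\lambda(E_N) = \lambda(E) - \lambda(E \setminus E_N) \to 0$, so there exists $N$ such that $A = E_N$ has $\lambda(A) < \delta$.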
14.3 Egorov's theorem
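Let $E$ be a subset of $\mathbb{R}$ of finite Lebesgue measure $\lambda(E)$. Let $f_n : E \to \mathbb{R}^*$ be measurable functions converging pointwise on $E$ to $f : E \to \mathbb{R}$. Let $\delta > 0$. Then there exists a subset $A$ of $E$, with $\lambda(A) < \delta$, such that $f_n \to f$ uniformly on $E \setminus A$.
Proof. For each $n \in \mathbb{N}$ use Lemma 14.2 to choose $A_n \subseteq E$ and $p_n$ such that $\lambda(A_n) < \delta / 2^n$ and $|f_m(x) - f(x)| < 1/n$ for all $m \geq p_n$ and all $x \in E \setminus A_n$. Let $A = \bigcup_n A_n$, so that $\lambda(A) \leq \sum_n \lambda(A_n) < \sum_n \delta / 2^n = \delta$. Then $E \setminus A \subseteq E \setminus A_n$, and so $m \geq p_n$ implies that $|f_m(x) - f(x)| < 1/n$ for all $x \in E \setminus A$.
Note that no such theorem holds for $E$ of infinite measure. Let $f_n = \chi_{[n,\infty)}$, for $n \in \mathbb{N}$. Then $f_n \to 0$ pointwise, but not uniformly on the complement of any set of finite measure.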
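The failure of uniform convergence in this last example can be checked in one line. The sketch below assumes Lebesgue measure on $\mathbb{R}$, written $\lambda$ here, takes $f_n = \chi_{[n,\infty)}$ as above, and lets $A$ be any measurable set with $\lambda(A) < \infty$.

```latex
\[
\begin{aligned}
\infty = \lambda\bigl([n, \infty)\bigr)
  &\leq \lambda\bigl([n, \infty) \setminus A\bigr) + \lambda(A)
  \quad \Longrightarrow \quad
  \lambda\bigl([n, \infty) \setminus A\bigr) = \infty , \\
\text{so } [n, \infty) \setminus A \neq \emptyset
  &\quad \text{and} \quad
  \sup_{x \in \mathbb{R} \setminus A} |f_n(x) - 0| = 1
  \quad \text{for every } n \in \mathbb{N}.
\end{aligned}
\]
```

Since this supremum does not tend to $0$, the convergence is not uniform on $\mathbb{R} \setminus A$, however $A$ is chosen.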