Integration Theory
TOMAS SJÖDIN
Contents
1. Notation and Prerequisites
1.1. Metric Spaces
1.2. Normed spaces, Banach spaces
2. Motivation
2.1. Ways to define a more general integral
3. Some general advice
4. Algebras of sets and lattices of functions
4.1. Vector lattices
4.2. Algebras, σ−algebras and Monotone Classes
4.3. Algebras versus vector lattices
5. Measures
5.1. Measures
5.2. Complete measures
5.3. Outer measures
5.4. Construction of outer measures
5.5. Metric outer measures
5.6. Regularity and Invariance of Lebesgue measure
6. Integrals
6.1. Measurable functions in the measure space context
6.2. Simple functions
6.3. The integral of simple functions in the measure space context
6.4. The Daniell integral
7. Convergence theorems
8. Two theorems of Stone
8.1. Stone’s theorem
8.2. Iterated integrals and the Fubini-Stone theorem
9. Different types of convergence
10. Product measures
11. Signed measures and Differentiation
11.1. Signed Measures
11.2. The Radon-Nikodym Theorem and Lebesgue Decomposition
11.3. Differentiation with respect to a doubling measure in a metric space
11.4. The one-dimensional case
12. $L^p$-spaces(*)
12.1. Measurable functions
12.2. Linear functionals on $L^p$
13. Continuous functions on a compact space(**)
14. Recommended textbooks
14.1. Lebesgue-type of treatment
14.2. Daniell-type treatment (or both)
0 · (±∞) = 0,
but of course care must always be taken with algebraic operations involving infinity.
We will use the following notation for set operations. Suppose A, B are sets; then A ∪ B denotes the
union of A and B, A ∩ B denotes their intersection and A \ B the difference, i.e. the set of all points
in A which do not belong to B. When we work with subsets of some fixed space X we also use the
notation $A^c$ for the set X \ A. For families of sets $\{A_i\}_{i\in I}$ we write $\bigcup_{i\in I} A_i$ for their union, and $\bigcap_{i\in I} A_i$
for their intersection. In case I = N we also use the notation $\bigcup_{i=1}^{\infty} A_i$, $\bigcap_{i=1}^{\infty} A_i$. We denote the set of all
subsets of X by P(X). When it comes to sequences of points (sets/functions) we will often just write
that “$x_n$ is a sequence” rather than “$\{x_n\}_{n\in\mathbb{N}}$ is a sequence”.
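Sums of real numbers $\{a_i\}_{i\in I}$, where I is at most countable, will be denoted by
\[ \sum_{i\in I} a_i. \]
When I = {1, 2, . . . , n} or I = N we also write
\[ \sum_{i=1}^{n} a_i \quad\text{and}\quad \sum_{i=1}^{\infty} a_i \]
respectively.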
For two functions f, g : X → R we write f ≤ g if f (x) ≤ g(x) for all x ∈ X. If A ⊂ R then sup A
and inf A denote the supremum and infimum of A respectively. Furthermore we introduce the following
notation:
• (f ∧ g)(x) = min{f (x), g(x)}, (f ∨ g)(x) = max{f (x), g(x)},
• More generally, for a family of functions $\{f_i\}_{i\in I}$ we define
\[ \Big(\bigvee_{i\in I} f_i\Big)(x) = \sup\{f_i(x) : i \in I\}, \qquad \Big(\bigwedge_{i\in I} f_i\Big)(x) = \inf\{f_i(x) : i \in I\}, \]
• $f^+ = f \vee 0$, $f^- = (-f) \vee 0$, so that $f = f^+ - f^-$ and $|f| = f^+ + f^-$,
• if a sequence fn of functions from X to R converges pointwise to the function f , and if fn
is increasing (in its non-strict sense), then we write fn ↗ f . Similarly we write fn ↘ f for
decreasing convergence.
(Note that inequalities involving ∨, ∧ by definition reduce directly to pointwise statements about
min, max of real numbers.)
The choice of notation ∨, ∧ similar to ∪, ∩ is of course no coincidence. Indeed, if we denote by $\chi_A$ the
characteristic function of A, then
\[ \chi_{A\cup B} = \chi_A \vee \chi_B, \qquad \chi_{A\cap B} = \chi_A \wedge \chi_B. \]
Some further simple properties of ∨, ∧, which are direct consequences of the corresponding statements
for real numbers since they are pointwise statements, are as follows
Lemma 1.1. Suppose f, g, h are real-valued functions on some fixed set X. Then the following
identities hold:
(a) $f^- = \tfrac{1}{2}(|f| - f)$,
(b) $f^+ = \tfrac{1}{2}(|f| + f)$,
(c) f ∨ (g ∧ h) = (f ∨ g) ∧ (f ∨ h),
(d) f ∧ (g ∨ h) = (f ∧ g) ∨ (f ∧ h),
(e) f ∨ g = (f + h) ∨ (g + h) − h,
(f) f ∧ g = (f + h) ∧ (g + h) − h.
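Since the identities of Lemma 1.1 are pointwise statements, they can be sanity-checked on real numbers. The following Python sketch is an illustration only and not part of the notes; the helper name `lemma_1_1_holds` is hypothetical. It tests (a)-(f) on random rational triples using exact arithmetic, which of course is no substitute for the (easy) proof.

```python
import random
from fractions import Fraction

def lemma_1_1_holds(f, g, h):
    """Check identities (a)-(f) of Lemma 1.1 for three real numbers f, g, h."""
    half = Fraction(1, 2)
    f_pos, f_neg = max(f, 0), max(-f, 0)   # f^+ and f^-
    return all([
        f_neg == half * (abs(f) - f),                       # (a)
        f_pos == half * (abs(f) + f),                       # (b)
        max(f, min(g, h)) == min(max(f, g), max(f, h)),     # (c)
        min(f, max(g, h)) == max(min(f, g), min(f, h)),     # (d)
        max(f, g) == max(f + h, g + h) - h,                 # (e)
        min(f, g) == min(f + h, g + h) - h,                 # (f)
    ])

random.seed(0)
samples = [
    tuple(Fraction(random.randint(-50, 50), random.randint(1, 9)) for _ in range(3))
    for _ in range(1000)
]
all_ok = all(lemma_1_1_holds(*s) for s in samples)
print(all_ok)
```

Exact fractions are used instead of floats so that the equalities in (e) and (f) are not disturbed by rounding.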
1.1. Metric Spaces. We assume that the reader is familiar with metric spaces. A pair (X, ρ), where X
is a non-empty set and ρ : X × X → [0, ∞) is a function such that for all x, y, z ∈ X
• ρ(x, y) = ρ(y, x),
• ρ(x, y) ≤ ρ(x, z) + ρ(z, y),
• ρ(x, y) = 0 if and only if x = y,
is called a metric space, and ρ is called a metric on X. We denote the open ball with radius r > 0 and
center x by
B(x, r) = {y ∈ X : ρ(x, y) < r},
and the sphere with radius r > 0 and center x by
S(x, r) = {y ∈ X : ρ(x, y) = r}.
Note that B(x, r) is always open, S(x, r) is always closed and ∂B(x, r) ⊂ S(x, r), but there are
situations when the last two sets are not the same (e.g. in X = N with the metric ρ(x, y) = |x − y| we have
B(1, 1) = {1} so ∂B(1, 1) = ∅, but S(1, 1) = {2}).
The space (X, ρ) is said to be complete if every Cauchy sequence x1 , x2 , x3 , . . . in X (i.e. every sequence such that
given ε > 0 there is N such that ρ(xn , xm ) < ε for all n, m ≥ N ) converges to some x ∈ X (i.e. given
ε > 0 there is N such that ρ(xn , x) < ε for all n ≥ N ).
The space (X, ρ) is said to be compact if every cover of X by open sets {Oi }i∈I (i.e.
such that X = ∪i∈I Oi ) has a finite subcover (that is, there is a finite subset J ⊂ I such that X = ∪i∈J Oi ). This
is equivalent to the condition that every sequence xn of points in X has a convergent subsequence.
Given two metric spaces (X, ρX ) and (Y, ρY ), we say that f : X → Y is continuous at x ∈ X if
for every ε > 0 there is δx > 0 such that
ρY (f (x), f (y)) ≤ ε for all y ∈ X such that ρX (x, y) < δx .
In case f is continuous at every point of X, then we say that f is continuous on X. If furthermore we
can choose δx independently of x, then we say that f is uniformly continuous on X.
1.2. Normed spaces, Banach spaces. A pair (V, ∥ · ∥) where V is a real vector space and ∥x∥ ∈ [0, ∞)
for all x ∈ V such that
• ∥kx∥ = |k|∥x∥ for all k ∈ R and x ∈ V ,
• ∥x + y∥ ≤ ∥x∥ + ∥y∥ for all x, y ∈ V ,
• ∥x∥ = 0 if and only if x = 0,
is called a normed linear (or vector) space, and ∥ · ∥ is called a norm on V . A norm induces a metric by
ρ(x, y) = ∥x − y∥.
If (V, ρ) is complete, then (V, ∥ · ∥) (or simply V ) is called a Banach space.
A continuous (or bounded) linear functional on a normed real vector space V is a real-valued linear
function F : V → R for which there is a constant C such that
|F (v)| ≤ C∥v∥ for all v ∈ V.
One usually introduces a norm on these functionals by
∥F ∥ = inf{C : |F (v)| ≤ C∥v∥ for all v ∈ V }.
It is not hard to verify that these functionals form a vector space, called the dual space of V .
With the norm above this space is furthermore always a Banach space.
2. Motivation
• The concept of area goes back a long time, and this is in some sense the starting point of
integration theory.
• The principles of integration were formulated independently by Isaac Newton and Gottfried
Leibniz in the late 17th century, through the fundamental theorem of calculus.
• Probably the first definition that a modern mathematician would say comes close to being
rigorous goes back to Cauchy (Leçons sur le calcul infinitésimal, p.81), who defined
an integral which he “proved” to be well defined for continuous functions on closed bounded
intervals of the real line.
• In the 19th century Fourier wrote his famous book Théorie analytique de la chaleur (The Analytic
Theory of Heat), where the representation of very general functions using infinite trigonometric
series was discussed.
• Now the natural question arose. If we can represent a function using Fourier series, then we
could also consider any convergent Fourier series to be a function. But as it turned out such a
function may not always be so well behaved, and naive definitions of integrals would not do!
• Riemann introduced the first rigorous definition of an integral, and it is usually good enough in
situations when we do explicit calculations.
• But it is unfortunately not enough when we need to deal with limiting processes.
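E.g. suppose
\[ f(x) = \sum_{n=0}^{\infty} (a_n \cos(nx) + b_n \sin(nx)) \quad \text{(Fourier series)}: \]
when do we have that f is integrable, and when can we say that
\[ \int_a^b f(x)\,dx = \sum_{n=0}^{\infty} \int_a^b (a_n \cos(nx) + b_n \sin(nx))\,dx\,? \]
If $f_k$ denotes the k-th partial sum of the series, then
\[ \int_a^b f_k(x)\,dx = \sum_{n=0}^{k} \int_a^b (a_n \cos(nx) + b_n \sin(nx))\,dx \]
for any reasonable definition of the integral since the sum is finite.
Hence the question is equivalent to whether
\[ \int_a^b f(x)\,dx = \lim_{k\to\infty} \int_a^b f_k(x)\,dx. \]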
For Riemann’s definition it is hard to prove such convergence results. A simple example to explain the
problem is given by the Dirichlet function f , which is the characteristic function of the set of rational
points in [0, 1]. Since the rationals are countable we see that there is an increasing sequence $f_n$ such that
$f_n$ is zero apart from n points where it is 1, and $f_n ↗ f$ everywhere on [0, 1]. Clearly $\int_0^1 f_n(x)\,dx = 0$ in
Riemann’s sense, but f is not Riemann integrable.
2.1. Ways to define a more general integral. There are several ways to define a generalization of
the Riemann integral, but there are two that are mainstream (and they are more or less equivalent):
• Lebesgue’s approach, which starts by generalizing length (this gives rise to measure theory) and
then defines the integral using this.
• Daniell’s approach starting with the Riemann integral for say the continuous functions and
extending this directly to a larger class of functions.
Here we will begin by developing measure theory, but the integration theory section is developed for
the Daniell approach, mainly because this offers no essential extra difficulties and is a bit more general.
As it turns out it is not actually much more general, but as a byproduct one gets for instance the Riesz
representation theorem for linear functionals on the space of continuous functions almost for free.
We do however not only want one-dimensional integrals, but want a theory which can be used to cover
situations such as higher dimensional integrals, integrals on curves and surfaces. Furthermore measure
theory is the foundation of modern probability theory as introduced by Kolmogorov. Therefore we will
use an axiomatic approach, which is relevant in all the above cases.
3. Some general advice
Integration theory is a technical subject. Indeed nothing else should be expected since we are dealing
with a subject where simpler natural definitions such as the one by Riemann fail. There are a few
general tips that I find worthwhile to write down already here. It is probably a good idea to go through
them quickly now and come back to them later.
We will later define measure spaces (X, M, µ) where X is a set, M a suitable collection of subsets of
X (called a σ-algebra) and µ a measure which is a positive function on M with suitable properties.
Very often when we want to prove a statement regarding measures a good approach is the following:
Let Φ denote the collection of all sets in M which satisfy the statement. Often a large class of sets is easily seen
to belong to Φ (for instance all intervals if we work on the real line). Show that the class Φ is
closed under suitable algebraic operations (typically unions and complementation), and finally that it is
closed under monotone limits of sequences of sets. Then this typically forces Φ to be the whole of M.
Later when we define integrals a similar approach is often reasonable: Let Φ denote the set of all functions
satisfying the statement. Prove that it contains many elements (often so-called elementary/simple
functions, but it could be continuous functions or something else). Often the last step makes it sufficient
to prove that Φ is closed under monotone limits, which can often be done through one of the limit
theorems we will treat later, but on occasion one could also, in analogy with the above, need to prove that
Φ is closed under suitable algebraic operations (typically that Φ is a vector lattice, or at least a vector
space). Again this will typically force Φ to contain all integrable functions.
Finally it is, as always, important to think about special cases and examples. However, as stated
above, the Lebesgue theory is all about generalizing the earlier definitions to include cases where these
fail. Therefore it is important not to think of too simple cases. Often it is enough to consider sets
and functions on the real line, but choose somewhat nasty ones. One example is the
intersection between the rational numbers and an interval, and the associated Dirichlet function
which is zero at all irrational numbers and one at the rational ones. Other important sets to keep in
mind are Cantor type sets (see Example 5.23 below).
In these notes lattices/vector lattices will always be with respect to pointwise max/min. Notice that
any vector lattice H satisfies 0 ∈ H, so for any h ∈ H we have in particular $h^+, h^-, |h| \in H$. We actually
have a converse to this.
Proposition 4.2. If H is a vector space of real-valued functions then the following are equivalent
(a) H is a vector lattice,
(b) |h| ∈ H for every h ∈ H,
(c) $h^+ \in H$ for every h ∈ H,
(d) $h^- \in H$ for every h ∈ H.
Proof. This is a simple consequence of Lemma 1.1, because if for instance (b) holds, then by Lemma
1.1 (a) and (b) we see that $h^+, h^-$ also belong to H, since H is a vector space. Furthermore
$f \vee g = (f - g)^+ + g$ according to Lemma 1.1 part (e) (applied with h = −g), and hence for any f, g in H
also f ∨ g is in H, and likewise we may treat f ∧ g. □
Example 4.3. If K is a compact metric space and H = C(K) denotes the set of all continuous
real-valued functions, then H is a vector lattice of bounded functions.
4.2. Algebras, σ−algebras and Monotone Classes.
Definition 4.4. Let X be a non-empty set. If $\mathcal{A} \subset P(X)$ is a non-empty family which is closed
under finite unions and complements, i.e.
(A1) $A_1, A_2, \dots, A_k \in \mathcal{A} \Rightarrow \bigcup_{j=1}^{k} A_j \in \mathcal{A}$,
(A2) $A \in \mathcal{A} \Rightarrow A^c \in \mathcal{A}$,
then $\mathcal{A}$ is called an algebra of sets.
• Note that if $\mathcal{A}$ is an algebra and $A \in \mathcal{A}$, then $X = A \cup A^c$ and $\emptyset = X^c$ belong to $\mathcal{A}$.
• Since $\bigcap_{j=1}^{k} A_j = \big(\bigcup_{j=1}^{k} A_j^c\big)^c$ we see that algebras are also closed under finite intersections. Indeed
one could equally well have replaced unions by intersections in the definition of algebras.
• Since $A \setminus B = A \cap B^c$ we see that $A \setminus B \in \mathcal{A}$ if $A, B \in \mathcal{A}$.
Definition 4.5. If M ⊂ P(X) is an algebra which is closed under countable unions, then M is
called a σ-algebra.
Since $\bigcap_{j=1}^{\infty} A_j = \big(\bigcup_{j=1}^{\infty} A_j^c\big)^c$ we see that σ-algebras are also closed under countable intersections.
Definition 4.6. A family Φ ⊂ P(X) which is closed under countable increasing unions and
countable decreasing intersections is called a monotone class in X.
Exercise 4.1. Prove that an algebra A is a σ-algebra if and only if it is closed under monotone
increasing limits. In particular A is a σ-algebra if and only if it is a monotone class.
Exercise 4.2. Let X be a set and let M denote the set of all subsets E of X such that either E
or $E^c$ is at most countable. Prove that M is a σ-algebra.
Exercise 4.3. Suppose $\mathcal{A}$ is an algebra of sets in X, and let E ⊂ X be any set. Show that the
collection
\[ \mathcal{A}_E = \{A \cap E : A \in \mathcal{A}\} \]
is an algebra (called the induced algebra) of sets in E. Prove that in case $\mathcal{A}$ is also a σ-algebra
then so is $\mathcal{A}_E$.
Exercise 4.4. Let {Ai }i∈I be any collection of algebras on the set X. Show that the intersection
A = ∩i∈I Ai is also an algebra. Also prove the corresponding result for σ-algebras and monotone
classes.
Exercise 4.6. Prove that if A is an algebra which is closed under countable disjoint unions, then A
is a σ-algebra. (Hint: Given a sequence of sets $E_1, E_2, E_3, \dots$ in A look at $A_1 = E_1$, $A_2 = E_2 \setminus E_1$,
. . . , $A_n = E_n \setminus \bigcup_{j=1}^{n-1} E_j$, . . .)
Definition 4.7 (Borel sets). Given a metric space (X, ρ) the Borel σ-algebra BX is the smallest
σ-algebra containing all open sets.
Proof. Clearly Φ(A) ⊂ M(A) since σ-algebras are monotone classes. So we only need to prove that
Φ(A) is a σ-algebra. To do so fix E ∈ Φ(A) and define
\[ \Phi_E = \{F \in \Phi(\mathcal{A}) : F \setminus E,\ E \setminus F \text{ and } E \cap F \text{ are in } \Phi(\mathcal{A})\}. \]
For each E, $\Phi_E$ is a monotone class, and $F \in \Phi_E$ if and only if $E \in \Phi_F$. Furthermore if E ∈ A, then
$\mathcal{A} \subset \Phi_E$ since A is an algebra. Hence we see that $\Phi(\mathcal{A}) \subset \Phi_E$ for every E ∈ A. This means that
any F ∈ Φ(A) belongs to $\Phi_E$ whenever E ∈ A, so any E ∈ A belongs to $\Phi_F$, and hence
$\Phi(\mathcal{A}) \subset \Phi_F$ for any F ∈ Φ(A). Hence for any E, F ∈ Φ(A) the sets E \ F , F \ E and E ∩ F belong to Φ(A). So
Φ(A) is an algebra and a monotone class, which proves that it is a σ-algebra. □
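Lemma 4.9. Suppose $\mathcal{A}$ is an algebra on X and $A_i \in \mathcal{A}$ for all i ∈ I = {1, 2, . . . , n}. If we for
each subset J ⊂ I define
\[ C_J = \Big(\bigcap_{i\in J} A_i\Big) \setminus \Big(\bigcup_{i\notin J} A_i\Big), \]
then we have that the sets $C_J$ all belong to $\mathcal{A}$, they are disjoint and
\[ A_i = \bigcup_{\{J : i\in J\}} C_J. \]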
Proof. If $J_1 \neq J_2$, then there must be at least one i which belongs to one but not the other. Without
loss of generality assume that $i \in J_1$. Then $C_{J_1} \subset A_i$ and $C_{J_2} \subset A_i^c$, and hence they are disjoint. That they all
belong to A is obvious since there are only a finite number of operations and all sets $A_i$ belong to A by
assumption.
Finally if $x \in A_i$ then let $J_x$ denote the set of all j such that $x \in A_j$. Clearly $i \in J_x$ and $x \in \bigcap_{j\in J_x} A_j$
but $x \notin \bigcup_{j\notin J_x} A_j$, so $x \in C_{J_x}$. □
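The decomposition in Lemma 4.9 is easy to try out on finite sets. The sketch below is only an illustration (the function name `atoms` and the example sets are made up here, and indices start at 0 rather than 1). It computes $C_J$ for every non-empty index set J and checks the two claims of the lemma: the $C_J$ are pairwise disjoint, and each $A_i$ is the union of those $C_J$ with i ∈ J.

```python
from itertools import combinations

def atoms(sets):
    """Map each non-empty index tuple J to C_J = (intersection of A_i, i in J)
    minus (union of A_i, i not in J)."""
    n = len(sets)
    result = {}
    for size in range(1, n + 1):
        for J in combinations(range(n), size):
            inside = set.intersection(*(set(sets[i]) for i in J))
            outside = set().union(*(sets[i] for i in range(n) if i not in J))
            result[J] = inside - outside
    return result

A = [{1, 2, 3, 4}, {3, 4, 5}, {4, 5, 6, 7}]
C = atoms(A)

# The sets C_J are pairwise disjoint ...
pairwise_disjoint = all(C[J1].isdisjoint(C[J2]) for J1, J2 in combinations(C, 2))
# ... and each A_i is the union of the C_J with i in J.
unions_ok = all(
    A[i] == set().union(*(C[J] for J in C if i in J)) for i in range(len(A))
)
print(pairwise_disjoint, unions_ok)
```

The empty index set is skipped on purpose, since it is not needed for recovering the sets $A_i$.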
Proof. Let A denote the collection of all finite disjoint unions of elements in E. We need to prove that
A is an algebra, i.e. closed under taking finite unions and complements.
To prove that A is closed under taking unions it is clearly enough to prove that the union of any finite
family of sets A1 , A2 , . . . , An in E also lies in A (even if they are not disjoint). We do this by induction
over n. For n = 1 there is of course nothing to prove. Suppose now that
\[ A_1 \cup A_2 \cup \dots \cup A_{n-1} = \bigcup_{j=1}^{m} B_j, \]
where the family $B_1, B_2, \dots, B_m$ is disjoint in E. Then let $A_n^c = \bigcup_{k=1}^{r} C_k$ where the $C_k$ are disjoint in E.
Then we have
\[ A_1 \cup A_2 \cup \dots \cup A_{n-1} \cup A_n = A_n \cup \bigcup_{j=1}^{m} (B_j \cap A_n^c) = A_n \cup \bigcup_{j=1}^{m} \bigcup_{k=1}^{r} (B_j \cap C_k), \]
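Definition 4.11. Let $\mathcal{A}$ be an algebra of subsets of X, and let $H_{\mathcal{A}}$ denote the linear span of all
characteristic functions $\chi_A$ for $A \in \mathcal{A}$, i.e.
\[ H_{\mathcal{A}} = \Big\{ \sum_{i=1}^{n} a_i \chi_{A_i} : a_1, a_2, \dots, a_n \in \mathbb{R},\ A_1, A_2, \dots, A_n \in \mathcal{A} \Big\}. \]
Then the functions in $H_{\mathcal{A}}$ are called simple functions over $\mathcal{A}$.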
so we can always make a refinement to get a case where the sets $C_J$ are disjoint. Also note that ϕ is
a simple function if and only if ϕ takes a finite number of different non-zero values $b_1, b_2, \dots, b_k$ and
$B_{b_i} = \phi^{-1}(b_i) \in \mathcal{A}$ for each $b_i$ as well as $\phi^{-1}(0) \in \mathcal{A}$. The set $\phi^{-1}(0)$ may be empty, i.e. the value 0 is
never attained, or $\phi^{-1}(0) = X$, which corresponds to ϕ being identically 0. Then
\[ \phi = \sum_{i=1}^{k} b_i \chi_{B_{b_i}}. \]
Indeed for all $b_j$ we have
\[ B_{b_j} = \bigcup_{\{J : c_J = b_j\}} C_J. \]
This is called the canonical representation of ϕ.
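To make the canonical representation concrete, here is a small Python sketch for a finite set X with the algebra P(X); both choices, the function name `canonical_representation` and the example data are assumptions made only for illustration. Given any representation $\sum a_i \chi_{A_i}$, it groups the points of X by the non-zero value that ϕ takes there.

```python
from collections import defaultdict

def canonical_representation(terms, X):
    """terms is a list of pairs (a_i, A_i) describing phi = sum a_i * chi_{A_i}.
    Returns a dict {b: B_b} with B_b = phi^{-1}(b) for each non-zero value b."""
    levels = defaultdict(set)
    for x in X:
        value = sum(a for a, A in terms if x in A)   # phi(x)
        if value != 0:
            levels[value].add(x)
    return dict(levels)

X = range(1, 8)
# The sets below overlap, so this is not yet the canonical representation.
terms = [(2, {1, 2, 3, 4}), (3, {3, 4, 5}), (-2, {1, 2})]
canon = canonical_representation(terms, X)
print(canon)   # canon == {5: {3, 4}, 3: {5}}
```

In the example the coefficients 2 and −2 cancel on {1, 2}, so these points end up in $\phi^{-1}(0)$ and do not appear in the canonical representation.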
Proof. Obviously H is a vector space, hence it is, according to Proposition 4.2, enough to prove that
|ϕ| ∈ H for each ϕ ∈ H. But using the notation from above, if
\[ \phi = \sum_{i=1}^{k} b_i \chi_{B_{b_i}} \]
is the canonical representation of ϕ, then
\[ |\phi| = \sum_{i=1}^{k} |b_i| \chi_{B_{b_i}} \]
5. Measures
5.1. Measures.
The property (µ2) is called countable additivity, and it essentially captures both the linearity and
limiting properties of measures. The triple (X, M, µ) is called a measure space. Note that countable
additivity implies finite additivity, since all but finitely many of the sets may be taken to be ∅ (recall that µ(∅) = 0).
Some standard terminology concerning (X, M, µ):
• The sets in M are called µ-measurable or simply measurable,
• µ is called finite if µ(X) < ∞,
• µ is called σ-finite if $X = \bigcup_{j=1}^{\infty} E_j$ where $E_j \in \mathcal{M}$ and $\mu(E_j) < \infty$ for each j,
• A property which holds apart from a set of µ-measure 0 is said to hold
µ-almost everywhere (µ-a.e.).
Example 5.2. Most examples of measure spaces of interest require work to “construct”. Here
are, however, a few that are important and simple.
Let M = P(X) and let x be a point in X. Define µ(E) = 1 if x ∈ E and 0 otherwise. Then µ is
a measure called the Dirac measure at x and is often denoted by $\delta_x$.
Let M = P(X) and define µ(E) = number of points in E (infinite if there are infinitely many).
This is the so-called counting measure on X.
A more general example which contains the two above as special cases is given if we let M = P(X),
let f be a non-negative function on X and define
\[ \mu(E) = \sup\Big\{ \sum_{j=1}^{k} f(x_j) : k \in \mathbb{N},\ x_1, x_2, \dots, x_k \in E \text{ distinct} \Big\}. \]
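For finite sets E the supremum above is simply the sum of f over the points of E, which makes the three examples easy to experiment with. The following Python sketch rests on that assumption (finite sets only), and the names `weighted_measure`, `counting` and `dirac_at_3` are hypothetical. It recovers the counting measure with f ≡ 1 and a Dirac measure with f equal to the indicator of a single point.

```python
def weighted_measure(f):
    """Return mu with mu(E) = sum of f(x) over the points x of a finite set E."""
    def mu(E):
        return sum(f(x) for x in set(E))
    return mu

counting = weighted_measure(lambda x: 1)                     # f = 1 everywhere
dirac_at_3 = weighted_measure(lambda x: 1 if x == 3 else 0)  # f = indicator of {3}

E, F, G = {1, 2, 3}, {3, 4}, {7, 8}

print(counting(E), dirac_at_3(E), dirac_at_3(F - {3}))
# Finite additivity on the disjoint sets E and G:
print(counting(E | G) == counting(E) + counting(G))
# The identity mu(E) + mu(F) = mu(E ∪ F) + mu(E ∩ F):
print(counting(E) + counting(F) == counting(E | F) + counting(E & F))
```

Countable additivity cannot be tested this way, but the finite checks already show how the weight function f determines the measure.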
Exercise 5.2. Suppose (X, M, µ) is a measure space, and let E ∈ M. Define $\mu|_E : \mathcal{M} \to [0, \infty]$
by
\[ \mu|_E(A) = \mu(A \cap E) \quad \text{for all } A \in \mathcal{M}. \]
Prove that $\mu|_E$ is a measure. This is called the restriction of µ to E. (Note that one could
also consider this as a measure on the induced σ-algebra $\mathcal{M}_E$ in a natural way. Formally this is
of course not the same, but the choice makes little difference in practice.)
Exercise 5.3. Suppose (X, M, µ) is a measure space. Show that for any E, F ∈ M we have
µ(E) + µ(F ) = µ(E ∪ F ) + µ(E ∩ F ).
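Theorem 5.3. Let (X, M, µ) be a measure space.
(a) If A, B ∈ M and A ⊂ B then µ(A) ≤ µ(B),
(b) If $E_j \in \mathcal{M}$ for j = 1, 2, . . ., then $\mu\big(\bigcup_{j=1}^{\infty} E_j\big) \le \sum_{j=1}^{\infty} \mu(E_j)$,
(c) If $E_j \in \mathcal{M}$ for j = 1, 2, . . . and $E_1 \subset E_2 \subset \dots$, then $\mu\big(\bigcup_{j=1}^{\infty} E_j\big) = \lim_{j\to\infty} \mu(E_j)$,
(d) If $E_j \in \mathcal{M}$ for j = 1, 2, . . . and $E_1 \supset E_2 \supset \dots$ and $\mu(E_1) < \infty$, then $\mu\big(\bigcap_{j=1}^{\infty} E_j\big) = \lim_{j\to\infty} \mu(E_j)$.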
Proof. We prove (a), (b) and (c) and leave (d) as an exercise. (a) follows since B = A ∪ (B \ A) and the
union is disjoint, so µ(B) = µ(A) + µ(B \ A) ≥ µ(A).
To prove (b) define $A_1 = E_1$ and for n ≥ 2 $A_n = E_n \setminus \bigcup_{j=1}^{n-1} E_j$. Then the sets $A_n \subset E_n$ are disjoint
and $\bigcup_{j=1}^{n} A_j = \bigcup_{j=1}^{n} E_j$ for each n (including n = ∞). Hence by part (a) we get
\[ \mu\Big(\bigcup_{j=1}^{\infty} E_j\Big) = \mu\Big(\bigcup_{j=1}^{\infty} A_j\Big) = \sum_{j=1}^{\infty} \mu(A_j) \le \sum_{j=1}^{\infty} \mu(E_j). \]
The first property is called monotonicity, the second subadditivity and the last two continuity
from below and above respectively.
Note that the assumption $\mu(E_1) < \infty$ in (d) cannot be dropped. For instance if we on N let µ be the
counting measure and $E_j = \{n : n \ge j\}$, then each $\mu(E_j) = \infty$, but $\bigcap_{j=1}^{\infty} E_j = \emptyset$.
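The disjointification used in the proof of (b), $A_1 = E_1$ and $A_n = E_n \setminus \bigcup_{j<n} E_j$, is a recurring trick, so a small sketch may help. The code below is an illustration for finitely many finite sets with the counting measure; the function name `disjointify` and the example sets are assumptions made here.

```python
def disjointify(sets):
    """Return A_1 = E_1, A_n = E_n minus (E_1 ∪ ... ∪ E_{n-1})."""
    seen, result = set(), []
    for E in sets:
        result.append(set(E) - seen)   # the part of E_n not covered earlier
        seen |= set(E)
    return result

E = [{1, 2, 3}, {2, 3, 4}, {1, 5}, {5}]
A = disjointify(E)
print(A)   # A == [{1, 2, 3}, {4}, {5}, set()]

mu = len   # counting measure on finite sets
union = set().union(*E)
# mu(union of E_j) = sum of mu(A_j) <= sum of mu(E_j)
print(mu(union), sum(mu(a) for a in A), sum(mu(e) for e in E))
```

Here the three printed numbers are 5, 5 and 9, in line with the chain of (in)equalities in the proof.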
Exercise 5.5. Suppose E satisfies the assumptions of Proposition 4.10. Assume that µ and η are
two finite measures on M(E). Prove that if µ(A) = η(A) for all A ∈ E, then this also holds for
all A ∈ M(E) (i.e. µ = η). (Hint: Use the result of Proposition 4.10 together with the monotone
class lemma and continuity of measures.)
5.2. Complete measures. A measure µ is said to be complete if every set A for which there is a
measurable set B with A ⊂ B and µ(B) = 0 is itself measurable.
Theorem 5.4. If (X, M, µ) is a measure space and we let $\overline{\mathcal{M}}$ consist of all sets A ∈ P(X) such
that there are E, F ∈ M with E ⊂ A, A \ E ⊂ F and µ(F ) = 0, then $\overline{\mathcal{M}}$ is a σ-algebra which
contains M.
Furthermore if we define
\[ \overline{\mu}(A) = \mu(E), \]
where E is as in the definition above, then $\overline{\mu}$ is a measure on $\overline{\mathcal{M}}$ such that $\overline{\mu}(A) = \mu(A)$ for all
A ∈ M.
5.3. Outer measures.
Outer measures are mainly important as a tool to construct measures, but they do have some interest
in themselves as well.
Definition 5.5. Let X be a non-empty set. A function $\mu^* : P(X) \to [0, \infty]$ such that
(µ∗1) $\mu^*(\emptyset) = 0$,
(µ∗2) $\mu^*(A) \le \mu^*(B)$ if A ⊂ B,
(µ∗3) $\mu^*\big(\bigcup_{j=1}^{\infty} E_j\big) \le \sum_{j=1}^{\infty} \mu^*(E_j)$,
is called an outer measure on X.
Theorem 5.8. If µ∗ is an outer measure on X then the set M of all µ∗ −measurable sets forms
a σ−algebra. Furthermore the restriction µ = µ∗ |M is a complete measure on M.
Proof. Below E ⊂ X is an arbitrary set throughout. First of all note that in case A ⊂ X satisfies
$\mu^*(A) = 0$, then we have
\[ \mu^*(E \cap A) + \mu^*(E \cap A^c) \le \mu^*(A) + \mu^*(E) = \mu^*(E) \]
by monotonicity, and hence A ∈ M. In particular ∅ ∈ M. It is also clear that M is closed under
complementation due to the symmetry of the definition.
Suppose now that A, B ∈ M; then if we use equation (1) first for A and then for B we get
\[ \mu^*(E) = \mu^*(E \cap A) + \mu^*(E \cap A^c) = \mu^*(E \cap A \cap B) + \mu^*(E \cap A \cap B^c) + \mu^*(E \cap A^c \cap B) + \mu^*(E \cap A^c \cap B^c). \]
But $A \cup B = (A \cap B) \cup (A \cap B^c) \cup (A^c \cap B)$, so by subadditivity we get
\[ \mu^*(E \cap A \cap B) + \mu^*(E \cap A \cap B^c) + \mu^*(E \cap A^c \cap B) \ge \mu^*(E \cap (A \cup B)), \]
and hence (since $A^c \cap B^c = (A \cup B)^c$)
\[ \mu^*(E) \ge \mu^*(E \cap (A \cup B)) + \mu^*(E \cap (A \cup B)^c). \]
So M is an algebra. Furthermore $\mu^*$ is finitely additive on M, because if A ∩ B = ∅, then
\[ \mu^*(A \cup B) = \mu^*((A \cup B) \cap A) + \mu^*((A \cup B) \cap A^c) = \mu^*(A) + \mu^*(B). \]
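Finally we only need to prove that M is closed under countable disjoint unions, and that $\mu^*$ is
countably additive on such unions, to complete the proof. So assume now that $A_1, A_2, A_3, \dots$ is a disjoint
sequence of sets in M, and let $B_n = \bigcup_{j=1}^{n} A_j$, $B = \bigcup_{j=1}^{\infty} A_j$. Then
\[ \mu^*(E \cap B_n) = \mu^*(E \cap B_n \cap A_n) + \mu^*(E \cap B_n \cap A_n^c) = \mu^*(E \cap A_n) + \mu^*(E \cap B_{n-1}), \]
so by induction we have $\mu^*(E \cap B_n) = \sum_{j=1}^{n} \mu^*(E \cap A_j)$. Since $B_n \in \mathcal{M}$ and $B^c \subset B_n^c$ we get
$\mu^*(E) \ge \sum_{j=1}^{n} \mu^*(E \cap A_j) + \mu^*(E \cap B^c)$ for every n. Letting n → ∞ we therefore have
\[ \mu^*(E) \ge \sum_{j=1}^{\infty} \mu^*(E \cap A_j) + \mu^*(E \cap B^c) \ge \mu^*\Big(\bigcup_{j=1}^{\infty} (E \cap A_j)\Big) + \mu^*(E \cap B^c) = \mu^*(E \cap B) + \mu^*(E \cap B^c) \ge \mu^*(E). \]
Hence all the inequalities above must be equalities, which proves that B ∈ M. If we let E = B above
we also see that $\mu^*$ is countably additive on M. □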
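On a finite set the theorem can be explored by brute force. The sketch below assumes that equation (1) is the usual Carathéodory condition $\mu^*(E) = \mu^*(E \cap A) + \mu^*(E \cap A^c)$ for all E ⊂ X, as the proof above suggests. The outer measure used (the number of blocks {1, 2}, {3, 4} that a set meets) and all function names are made up for illustration.

```python
from itertools import chain, combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    items = sorted(X)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def measurable_sets(X, outer):
    """Sets A with outer(E) == outer(E ∩ A) + outer(E minus A) for every E ⊂ X."""
    subsets = powerset(X)
    return [A for A in subsets
            if all(outer(E) == outer(E & A) + outer(E - A) for E in subsets)]

BLOCKS = [frozenset({1, 2}), frozenset({3, 4})]

def outer(A):
    """Example outer measure: the number of blocks that A intersects."""
    return sum(1 for block in BLOCKS if block & A)

X = frozenset({1, 2, 3, 4})
M = measurable_sets(X, outer)
print(sorted(sorted(A) for A in M))
```

In this example only the unions of blocks turn out to be measurable, and these four sets do form a σ-algebra on which the outer measure is additive, as the theorem predicts.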
Definition 5.9. A class of subsets K of a set X is called a sequential covering class if ∅ ∈ K and
for every set A ⊂ X there is a sequence $E_n$ of sets in K such that
\[ A \subset \bigcup_{n=1}^{\infty} E_n. \]
Theorem 5.10. Let K be a sequential covering class on X, and let λ : K → [0, ∞] be such that
λ(∅) = 0. For every set A ⊂ X define:
\[ \mu^*(A) = \inf\Big\{ \sum_{n=1}^{\infty} \lambda(E_n) : E_n \in \mathcal{K},\ \bigcup_{n=1}^{\infty} E_n \supset A \Big\}. \]
Then $\mu^*$ is an outer measure on X.
The assumption that K is a sequential covering class guarantees that there actually is something to
take the above infimum over for any subset A of X, so µ∗ is well defined.
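Proof. The only thing that is not trivial to check is the countable subadditivity. To do so let $A_n$ be a
sequence of sets. We need to prove that
\[ \mu^*\Big(\bigcup_{n=1}^{\infty} A_n\Big) \le \sum_{n=1}^{\infty} \mu^*(A_n). \]
To do so notice that by definition, for a given ε > 0 we may for each n choose a sequence $E_k^n$ of sets in
K such that
\[ A_n \subset \bigcup_{k=1}^{\infty} E_k^n, \]
and
\[ \sum_{k=1}^{\infty} \lambda(E_k^n) \le \mu^*(A_n) + \frac{\varepsilon}{2^n}. \]
Hence
\[ \mu^*\Big(\bigcup_{n=1}^{\infty} A_n\Big) \le \sum_{n,k=1}^{\infty} \lambda(E_k^n) \le \sum_{n=1}^{\infty} \mu^*(A_n) + \varepsilon. \]
Since ε > 0 is arbitrary this proves the claim. □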
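When X and K are finite, the infimum in Theorem 5.10 is a minimum over finite subfamilies of K (repeating a set or adding superfluous ones can only increase the sum, since λ ≥ 0), so $\mu^*$ can be computed by brute force. The following Python sketch relies on that finiteness assumption; the covering class, the values of λ and the name `outer_measure` are invented for illustration.

```python
from itertools import combinations

def outer_measure(A, K, lam):
    """Minimum of sum(lam[E]) over subfamilies of K whose union contains A."""
    A = frozenset(A)
    best = float("inf")
    for r in range(len(K) + 1):
        for family in combinations(K, r):
            if A <= frozenset().union(*family):      # the family covers A
                best = min(best, sum(lam[E] for E in family))
    return best

X = frozenset({1, 2, 3, 4})
K = [frozenset(), frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4}), X]
lam = {K[0]: 0, K[1]: 1, K[2]: 1, K[3]: 1, K[4]: 3}

print(outer_measure({2}, K, lam))      # cheapest cover is a single two-point set
print(outer_measure({1, 3}, K, lam))   # needs two of the two-point sets
print(outer_measure(X, K, lam))        # {1, 2} and {3, 4} together beat lam[X] = 3
```

The last line shows that $\mu^*(E)$ can be strictly smaller than λ(E) for a set E in K when λ is chosen carelessly: here λ(X) = 3 while $\mu^*(X) = 2$.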
When constructing outer measures one usually starts with some sequential covering class on which one
has decided what the measure should be. For instance if we want to generalize length on the real line,
then the class of all intervals (a, b) would be a natural choice, and we would then put λ((a, b)) = b − a.
But for the above construction to be helpful one would need to make sure of two things:
• the value is preserved in the sense that µ∗ (E) = λ(E) for any E ∈ K,
• the sets in K are µ∗ -measurable.
To deal with the first question many books start from the assumption that λ is a pre-measure on an
algebra K:
One downside of this approach is that one needs quite a bit of control on both the set K and the
function λ. For instance the open intervals do not form an algebra on R (one can build an algebra
from half-open intervals however). The following definition of a pre-outer measure is more flexible.
Definition 5.12. Suppose K is a sequential covering class, and that λ : K → [0, ∞]. Then we
call λ a pre-outer measure on K if λ(∅) = 0 and if for every E ∈ K and every covering of E by a
countable union $\bigcup_{j=1}^{\infty} E_j$ of sets in K we have
\[ \lambda(E) \le \sum_{j=1}^{\infty} \lambda(E_j). \]
Theorem 5.13. If λ is a pre-outer measure on K, then the outer measure µ∗ constructed from
λ, K satisfies that
µ∗ (E) = λ(E) for all E ∈ K.
It is not difficult to show in the case of length on R with K consisting of open intervals (including the
empty set) that this gives us a pre-outer measure.
Theorem 5.14. Suppose that K is an algebra, and λ is a finitely additive pre-outer measure on
K. If we define µ∗ as above, then all sets in K are µ∗ -measurable with
µ∗ (A) = λ(A) for all A ∈ K.
Remark 5.15. The conditions in the theorem above are automatically fulfilled if λ is a pre-measure on
K.
Proof. The stated equality is a special case of the preceding theorem, so we only need to prove that all
sets in K are measurable. Suppose now that A ∈ K and E ⊂ X is arbitrary. Choose a covering $\bigcup_{i=1}^{\infty} A_i$
of E by sets in K. Then since $A_i \cap A$ and $A_i \cap A^c$ all belong to K (since K is an algebra) we see that
\[ \mu^*(E \cap A) \le \sum_{i=1}^{\infty} \lambda(A_i \cap A), \]
\[ \mu^*(E \cap A^c) \le \sum_{i=1}^{\infty} \lambda(A_i \cap A^c). \]
Exercise 5.7. In case K = P(X), then λ is a pre-outer measure if and only if it is an outer
measure.
Exercise 5.8. Suppose that $\mu_0$ is a finite pre-measure on some algebra A on X (i.e. $\mu_0(X) < \infty$).
Let $\mu^*$ denote the outer measure constructed from $\mu_0$ and A as above, and define the inner measure
$\mu_*$ on X by
\[ \mu_*(A) = \mu_0(X) - \mu^*(A^c). \]
Prove that a set E is measurable (in the sense of equation (1)) if and only if $\mu^*(E) = \mu_*(E)$.
Definition 5.16. An outer measure µ∗ on a metric space (X, ρ) is called a metric outer measure
if for any sets A, B ⊂ X we have
ρ(A, B) > 0 ⇒ µ∗ (A ∪ B) = µ∗ (A) + µ∗ (B).
Theorem 5.17. If µ∗ is a metric outer measure on (X, ρ), then every Borel set is µ∗ −measurable.
Proof. Since the set of $\mu^*$-measurable sets is a σ-algebra it is enough to prove that all closed sets
are measurable. So let F be a closed set, and A any set with $\mu^*(A) < \infty$. We need to prove that
\[ \mu^*(A) \ge \mu^*(A \cap F) + \mu^*(A \cap F^c). \]
(Note that if $\mu^*(A) = \infty$ this inequality is trivially satisfied, and the reverse inequality is also always
satisfied by subadditivity.)
The first step is to prove that there is a sequence of sets $E_n \subset A \cap F^c$ such that
\[ \rho(E_n, F) \ge 1/n, \]
and
\[ \lim_{n\to\infty} \mu^*(E_n) = \mu^*(A \cap F^c). \]
Once we have this it is easy to complete the proof using the definition of metric outer measures.
To prove the existence of $E_n$ define
\[ E_n = \{x \in A \cap F^c : \rho(x, F) \ge 1/n\}. \]
We will prove that these $E_n$ satisfy the above. (Note that $A \cap F^c \subset F^c$, which is open.) Clearly $E_n$ is an
increasing sequence of sets, so $\mu^*(E_n)$ is an increasing sequence of real numbers which are all bounded
from above by $\mu^*(A \cap F^c)$. It is also easy to see that $A \cap F^c = \bigcup_{n=1}^{\infty} E_n$ (using that F is closed). It is
now enough to show that $\mu^*(E_{2n}) \to \mu^*(A \cap F^c)$ as n → ∞. Define the sets
\[ G_n = E_{n+1} \setminus E_n \]
and note that
\[ A \cap F^c = E_{2n} \cup \Big(\bigcup_{k=2n}^{\infty} G_k\Big) = E_{2n} \cup \Big(\bigcup_{k=n}^{\infty} G_{2k}\Big) \cup \Big(\bigcup_{k=n}^{\infty} G_{2k+1}\Big). \]
Hence
\[ \mu^*(E_{2n}) \le \mu^*(A \cap F^c) \le \mu^*(E_{2n}) + \sum_{k=n}^{\infty} \mu^*(G_{2k}) + \sum_{k=n}^{\infty} \mu^*(G_{2k+1}). \]
Now we use (which is the main point of the construction)
\[ \rho(G_{2k}, G_{2k+2}) \ge \frac{1}{2k+1} - \frac{1}{2k+2} = \frac{1}{(2k+1)(2k+2)} > 0. \]
Since $E_{2n} \supset \bigcup_{k=1}^{n-1} G_{2k}$ we get (using that $\mu^*$ is a metric outer measure)
\[ \sum_{k=1}^{n-1} \mu^*(G_{2k}) = \mu^*\Big(\bigcup_{k=1}^{n-1} G_{2k}\Big) \le \mu^*(E_{2n}). \]
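Hence, since $\mu^*(E_{2n}) \le \mu^*(A) < \infty$ for all n, the series
\[ \sum_{k=1}^{\infty} \mu^*(G_{2k}) \]
is convergent. A similar argument will show that
\[ \sum_{k=1}^{\infty} \mu^*(G_{2k+1}) \]
is convergent. This implies that $\sum_{k=n}^{\infty} \mu^*(G_{2k}) + \sum_{k=n}^{\infty} \mu^*(G_{2k+1}) \to 0$ as n → ∞, which
together with the above proves the statement. □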
Now we come to the construction of metric outer measures, and we will need some assumptions on
our sequential covering class K. We define the diameter of the set A
d(A) = sup{ρ(x, y) : x, y ∈ A}, d(∅) = 0.
Theorem 5.18. Assume that K is a sequential covering class on X such that for each n the set
\[ \mathcal{K}_n = \{A \in \mathcal{K} : d(A) \le 1/n\} \]
is also a sequential covering class. Furthermore assume that λ : K → [0, ∞] with λ(∅) = 0.
For each n we define (in accordance with the previous section) the outer measures
\[ \mu_n^*(A) = \inf\Big\{ \sum_{k=1}^{\infty} \lambda(E_k) : E_k \in \mathcal{K}_n,\ \bigcup_{k=1}^{\infty} E_k \supset A \Big\}. \]
Then
\[ \mu_n^*(A) \le \mu_{n+1}^*(A) \]
holds for any subset A of X. If we furthermore define
\[ \mu_0^*(A) = \lim_{n\to\infty} \mu_n^*(A), \]
then $\mu_0^*$ is a metric outer measure.
Proof. Since $\mathcal{K}_n \supset \mathcal{K}_{n+1}$ we see that $\mu_n^*(A) \le \mu_{n+1}^*(A)$ holds for any A ⊂ X.
To prove that $\mu_0^*$ is an outer measure is easy using that all $\mu_n^*$ are.
To prove that it is a metric outer measure, assume that ρ(A, B) > 1/n for all n ≥ $n_0$ (if ρ(A, B) > 0,
then there clearly is such an $n_0$). Let ε > 0. For each n we may then cover A ∪ B with sets $E_k^n \in \mathcal{K}_n$ such
that
\[ \mu_n^*(A \cup B) + \varepsilon \ge \sum_{k=1}^{\infty} \lambda(E_k^n). \]
Since $d(E_k^n) \le 1/n$ for each k we see that each $E_k^n$ can intersect at most one of A and B. From this it
follows that
\[ \mu_0^*(A \cup B) = \mu_0^*(A) + \mu_0^*(B). \]
(Indeed this follows also for $\mu_n^*$ for each n ≥ $n_0$.) □
One may ask when the definition of $\mu^*$ given in section 1:
\[ \mu^*(A) = \inf\Big\{ \sum_{n=1}^{\infty} \lambda(E_n) : E_n \in \mathcal{K},\ \bigcup_{n=1}^{\infty} E_n \supset A \Big\} \]
gives the same measure as $\mu_0^*$. In general this is not true. For this to hold one also needs assumptions
on λ:
Theorem 5.19. Assume that for each set E ∈ K, ε > 0 and n ∈ N there is a sequence $E_k \in \mathcal{K}_n$
such that
\[ E \subset \bigcup_{k=1}^{\infty} E_k, \]
and
\[ \sum_{k=1}^{\infty} \lambda(E_k) \le \lambda(E) + \varepsilon. \]
Then $\mu^* = \mu_0^*$.
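Proof. Clearly $\mu^*(A) \le \mu_0^*(A)$ for all sets A by construction. Fix ε > 0 and n ∈ N. By the definition of $\mu^*$,
for any A ⊂ X with $\mu^*(A) < \infty$ there is a sequence $E_j$ in K covering A such that
\[ \mu^*(A) + \varepsilon/2 \ge \sum_{j=1}^{\infty} \lambda(E_j). \]
Also by assumption each $E_j$ can be covered by sets $E_k^j$ from $\mathcal{K}_n$ such that
\[ \sum_{k=1}^{\infty} \lambda(E_k^j) \le \lambda(E_j) + \frac{\varepsilon}{2^{j+1}}. \]
Hence
\[ \mu^*(A) \ge \sum_{j=1}^{\infty} \lambda(E_j) - \frac{\varepsilon}{2} \ge \sum_{j=1}^{\infty} \Big( \sum_{k=1}^{\infty} \lambda(E_k^j) - \frac{\varepsilon}{2^{j+1}} \Big) - \frac{\varepsilon}{2} = \sum_{j,k=1}^{\infty} \lambda(E_k^j) - \varepsilon. \]
Hence
\[ \mu^*(A) \ge \mu_n^*(A) - \varepsilon. \]
Since ε and n are arbitrary the proof is done. □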
Exercise 5.9 (Lebesgue measure on RN ). Call a set R a rectangle in RN if there are numbers
ai ≤ bi , i = 1, 2, . . . , N , such that
R = {x ∈ RN : ai ≤ xi ≤ bi for all i = 1, 2, . . . , N }.
Example 5.20 (Hausdorff measure). This time we let K consist of all balls (including the empty
set)
B(a, r) = {x ∈ RN : |x − a| < r}.
Define, for s ∈ (0, ∞),
λs (B(a, r)) = r^s .
The pair λs , K satisfies the hypothesis of Theorem 5.18, but in general not that of Theorem 5.19. The measure µ∗0
constructed above using these is called the s-dimensional Hausdorff measure and is often denoted
Hs . (It is instructive to look at the case N = 2 and s = 1 and try to visualize what this measure
is for a curve in the plane.) There is an important point here as well. In metric spaces balls
have a central role, and unless we are in one dimension it is not as easy to build algebras around
these as it is for rectangles. Therefore it is often better not to rely too heavily on the concept of
pre-measures.
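As a rough numerical companion to the case N = 2 and s = 1, the following Python sketch evaluates the covering sum of the values λs (B) = r^s for one particular cover of the unit segment [0, 1] × {0} by m + 1 open balls of radius 1/m centred at the points (k/m, 0). The function name and the choice of cover are illustrative assumptions. A single cover only gives an upper bound for the infimum in the definition of the outer measures, so this is not a computation of the Hausdorff measure itself.

```python
def covering_sum(m: int, s: float) -> float:
    """Sum of lambda_s(B) = r**s over m + 1 balls of radius r = 1/m.

    The balls are centred at (k/m, 0) for k = 0, ..., m, so together they
    cover the unit segment [0, 1] x {0} in the plane, and each ball has
    diameter 2/m.
    """
    radius = 1.0 / m
    return (m + 1) * radius ** s


# Behaviour of the covering sums as the balls shrink (m grows).
for s in (0.5, 1.0, 2.0):
    print(s, [round(covering_sum(m, s), 4) for m in (10, 100, 1000)])
```

For s = 1 the sums stay close to 1, for s = 2 they tend to 0, and for s = 0.5 they grow without bound, which fits the picture of a segment as a one-dimensional object.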
5.6. Regularity and Invariance of Lebesgue measure.
Example 5.23. Before looking at the next exercise here are two situations to look out for, even
on the real line. Suppose we let A denote the set of all rational numbers in [0, 1]. Let furthermore
m denote Lebesgue measure restricted to [0, 1]. Since A is countable we clearly have m(A) = 0.
Indeed, suppose ε > 0 and we enumerate the points in A as q0 , q1 , q2 , . . ., then we may put
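\[
O = \bigcup_{i=0}^{\infty} \left( q_i - \frac{\varepsilon}{2^{i+1}},\ q_i + \frac{\varepsilon}{2^{i+1}} \right).
\]
Then A ⊂ O where O is open, and since the i-th interval has length ε/2^i we have
\[
m(O) \le \sum_{i=0}^{\infty} \frac{\varepsilon}{2^{i}} = 2\varepsilon.
\]
Since ε > 0 is arbitrary this gives m(A) = 0. Also K = [0, 1] \ O is compact, and m(K) ≥ 1 − 2ε.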
Note that an open subset on the real line is always a finite or countable union of disjoint intervals,
whereas nothing of the sort is true for compact sets.
Another case to consider is the Smith–Volterra–Cantor set (similar to the classical Cantor middle
third set) which is constructed as follows. (The material below is taken from the Wikipedia page
"Smith-Volterra-Cantor set", which contains some nice graphics as well.)
In the first step we remove the middle 1/4 from [0, 1] so the remaining set is
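\[
\left[0, \tfrac{3}{8}\right] \cup \left[\tfrac{5}{8}, 1\right].
\]
The following steps consist of removing subintervals of width 1/2^{2n} from the middle of each of the
2^{n−1} remaining intervals. So for the second step the intervals (5/32, 7/32) and (25/32, 27/32)
are removed, leaving
\[
\left[0, \tfrac{5}{32}\right] \cup \left[\tfrac{7}{32}, \tfrac{3}{8}\right] \cup \left[\tfrac{5}{8}, \tfrac{25}{32}\right] \cup \left[\tfrac{27}{32}, 1\right].
\]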
Continuing indefinitely with this removal, the Smith–Volterra–Cantor set is then the set of points
that are never removed. Since a set of measure
\[
\sum_{n=0}^{\infty} \frac{2^{n}}{2^{2n+2}} = \frac{1}{4} + \frac{1}{8} + \frac{1}{16} + \cdots = \frac{1}{2}
\]
is removed from [0, 1] we see that this set has measure 1/2.
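The removal procedure can be followed step by step with exact rational arithmetic. The Python sketch below is only an illustration (the function names are assumptions, not part of the notes): at step n it removes an open interval of width 1/4^n from the middle of each remaining closed interval and reports the total length that is left.

```python
from fractions import Fraction


def svc_step(intervals, n):
    """Remove the middle open interval of width 1/4**n from each interval."""
    width = Fraction(1, 4 ** n)
    result = []
    for a, b in intervals:
        mid = (a + b) / 2
        result.append((a, mid - width / 2))
        result.append((mid + width / 2, b))
    return result


def svc_intervals(steps):
    """Closed intervals remaining after the given number of removal steps."""
    intervals = [(Fraction(0), Fraction(1))]
    for n in range(1, steps + 1):
        intervals = svc_step(intervals, n)
    return intervals


def total_length(intervals):
    """Sum of the lengths of a list of disjoint intervals."""
    return sum((b - a for a, b in intervals), Fraction(0))


for steps in (1, 2, 5, 10):
    print(steps, total_length(svc_intervals(steps)))
```

After n steps the remaining length is 1/2 + 1/2^{n+1}, which decreases to 1/2 as n grows.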
Example 5.24 (Non-measurable set). Here we will give an example of a set N ⊂ R which is not
Lebesgue measurable. More precisely we will prove that there is no map µ : P(R) → [0, ∞] such
that
(1) µ(∪_{i=1}^∞ Ei ) = Σ_{i=1}^∞ µ(Ei ) for all disjoint sequences Ei ⊂ R,
(2) µ({y : y − x ∈ E}) = µ(E) for all E ⊂ R and all x ∈ R,
(3) µ([0, 1)) = 1.
I.e. there is no countably additive translation invariant function µ generalizing length to all
subsets of R. Note that the construction below depends on the axiom of choice (indeed if one
removes the axiom of choice it is consistent with the other axioms of set theory to assume that
all subsets of R are Lebesgue measurable).
To prove this we define the equivalence relation on [0, 1) by
x ∼ y if x − y ∈ Q,
where Q denotes the rational numbers. Let N ⊂ [0, 1) contain exactly one element from each
equivalence class (axiom of choice). Also let R = Q ∩ [0, 1) and put
Nr = {x + r : x ∈ N ∩ [0, 1 − r)} ∪ {x + r − 1 : x ∈ N ∩ [1 − r, 1)} for r ∈ R.
It is easy to see using (1) and (2) above that µ(Nr ) = µ(N ) for each r, and by construction these
form a countable family of pairwise disjoint sets whose union is [0, 1). Hence
\[
1 = \mu([0, 1)) = \sum_{r \in R} \mu(N_r) = \sum_{r \in R} \mu(N).
\]
However there are only two possibilities for the last sum. Either µ(N ) = 0, in which case we get
the sum 0, or µ(N ) > 0 in which case we get the sum ∞. In either case this gives a contradiction.
6. Integrals
In this section we will outline the fundamentals of the definitions and properties of the integral from
the Daniell point of view. The case when we start from a measure space (X, M, µ) is however the most
important, so we start with some basic notions about simple functions and measurability of functions
defined on such a space.
We will here restrict attention to extended real-valued functions, i.e. functions
f : X → R̄ = [−∞, ∞].
R̄ is given the usual topology, i.e. we say that a set O ⊂ R̄ is open if:
• for every x ∈ O ∩ R there is some r > 0 such that the ball B(x, r) is contained in O,
• if ∞ ∈ O then there is some T ∈ R such that (T, ∞] ⊂ O,
• if −∞ ∈ O then there is some T ∈ R such that [−∞, T ) ⊂ O.
6.1. Measurable functions in the measure space context.
We say that f : X → R is (M-)measurable if for every open set O ⊂ R the set f −1 (O) ∈ M.
Hence f is measurable if and only if f −1 (O) ∈ M for every Borel set O ⊂ R. It also follows that in
case the family of sets A generates the Borel σ−algebra (i.e. all sets in A are Borel sets, and the smallest
σ−algebra containing all of them is the Borel σ−algebra) then f is measurable iff f −1 (O) ∈ M for every
Borel set O ∈ A.
In particular f is measurable iff f −1 ((a, ∞]) is measurable for every a ∈ R, or similarly iff f −1 ([−∞, a))
is measurable for every a ∈ R.
The set of measurable functions can quite easily be proved to be closed under many operations:
The reason for the assumption of real values in (a) is simply because the sum f + g is not defined at
points where one of the functions is ∞ and the other −∞. In case there are no such points then the
result still holds for extended real-valued functions.
Proof. Let us prove two of these and leave the rest as an exercise. Suppose f, g are measurable, then
\[
(f + g)^{-1}((\alpha, \infty]) = \{x \in X : f(x) + g(x) > \alpha\} = \bigcup_{q \in \mathbb{Q}} \big( \{x \in X : f(x) > \alpha - q\} \cap \{x \in X : g(x) > q\} \big).
\]
The right hand side of this is measurable, since each set in the countable union is measurable by assumption.
If we look at a sequence gn of measurable functions we have instead
\[
(\sup_n g_n)^{-1}((\alpha, \infty]) = \{x \in X : \sup_n g_n(x) > \alpha\} = \bigcup_{n=1}^{\infty} \{x \in X : g_n(x) > \alpha\}.
\]
Again the right hand side is measurable since it is a countable union of measurable sets. □
Exercise 6.4. Give an example of an uncountable family {fi }i∈I of measurable functions for
which the supremum is not measurable. (Hint: Look at X = R with Lebesgue measure m. You
may use the result that there is a non-measurable set E ⊂ R without proof.)
6.2. Simple functions.
A function which can be written in the form
\[
\varphi = \sum_{i=1}^{n} a_i \chi_{A_i}
\]
where each ai ∈ R and Ai ∈ M is called a measurable simple function (so in particular it has values
in R). We denote the set of all such by HM .
Recall that according to Proposition 4.12 the set of measurable simple functions is a vector lattice
(w.r.t. max/min).
If we furthermore can choose the sets Ai such that µ(Ai ) < ∞ for every i then we say that ϕ is an
integrable simple function. We denote the set of all such by Hµ .
Exercise 6.5. Prove that the set Hµ of all integrable simple functions is a vector lattice.
Exercise 6.6. Prove that the value Iµ (ϕ) does not depend on the particular representation of ϕ of
the form ϕ = Σ_{i=1}^n ai χAi , and that Iµ : Hµ → R is linear and monotone in the sense that if ϕ ≤ ψ
then ∫ ϕ dµ ≤ ∫ ψ dµ. (Hint: given two representations use Lemma 4.9 to get a representation
which is a refinement of both in the natural sense.)
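The independence of the representation can be checked by hand on a small finite example. The Python sketch below assumes the usual definition Iµ (ϕ) = Σ ai µ(Ai ) for a representation ϕ = Σ ai χAi , and uses a hypothetical measure on a four-point set given by point weights. All names and numbers are made up for illustration, and the check is of course no substitute for the proof asked for in the exercise.

```python
from fractions import Fraction


def measure(weights, subset):
    """mu(A) for a measure on a finite set given by point weights."""
    return sum((weights[x] for x in subset), Fraction(0))


def integral(weights, representation):
    """Sum of a_i * mu(A_i) over a representation [(a_i, A_i), ...]."""
    return sum((a * measure(weights, A) for a, A in representation), Fraction(0))


def evaluate(representation, x):
    """Value at x of the simple function sum_i a_i * chi_{A_i}."""
    return sum((a for a, A in representation if x in A), Fraction(0))


# A hypothetical measure on X = {0, 1, 2, 3} and two representations of the
# same simple function: one with overlapping sets, one with disjoint sets.
weights = {0: Fraction(1, 2), 1: Fraction(2), 2: Fraction(1, 3), 3: Fraction(5)}
rep_overlapping = [(Fraction(1), {0, 1, 2}), (Fraction(2), {1, 2, 3})]
rep_disjoint = [(Fraction(1), {0}), (Fraction(3), {1, 2}), (Fraction(2), {3})]

print(integral(weights, rep_overlapping), integral(weights, rep_disjoint))
```

The disjoint representation is the common refinement of the overlapping one, which is the idea behind the hint.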
Exercise 6.7. Let X = N, M = P(N) and µ be counting measure. Describe the integrable
simple functions and the integral Iµ for such.
Lemma 6.3. In case ϕn is a decreasing sequence of integrable simple functions such that ϕn ↘ 0
then Iµ (ϕn ) → 0.
Proof. Let ε > 0. Put An = {x ∈ X : ϕn (x) > ε}. By assumption we have that ∩_{n=1}^∞ An = ∅ and
µ(A1 ) < ∞. Let M = sup{ϕ1 (x) : x ∈ X} and E = {x ∈ X : ϕ1 (x) > 0}. The fact that ϕ1 is a positive
integrable simple function implies that M ∈ [0, ∞) and that µ(E) < ∞. Since the sequence is decreasing
we get
ϕn ≤ εχE + M χAn .
Hence by monotonicity and linearity we get
0 ≤ Iµ ϕn ≤ εIµ χE + M Iµ χAn = εµ(E) + M µ(An ).
Since ε > 0 is arbitrary and µ(An ) → 0 as n → ∞ by continuity from above for measures we get the
result. □
6.4. The Daniell integral. Here we will work with the Daniell integral, which is a bit more general
than the Lebesgue approach. We make this choice because it gives some definite advantages later without
adding any extra difficulty to the proofs.
For the rest of this section we assume that X is a set and H a fixed vector lattice of
bounded real valued functions on X.
Example 6.5. The most important case for us is when we start with a measure space (X, M, µ).
Then we let H = Hµ denote the integrable simple functions, which we know form a vector lattice,
and for a function f ∈ H we let
\[
If = I_\mu(f) = \int f \, d\mu.
\]
Then it follows from our earlier results that (X, Hµ , Iµ ) is an integration space.
Example 6.6. Let R denote a closed and bounded rectangle in RN , and let H = C(R) denote
the set of all continuous real valued functions on R. Then H is a vector lattice, and if we let If
denote the Riemann integral of f , then I is an elementary integral on H. (This is so because
if a sequence fn of continuous functions decreases to 0, then the convergence is automatically
uniform.)
This example offers a different approach to define the Lebesgue integral in RN .
Definition 6.7.
\[
L^+ = \left\{ f : X \to \overline{\mathbb{R}} : \text{there is a sequence } h_n \in H \text{ such that } h_n \nearrow f \text{ and } \sup_n Ih_n < \infty \right\}.
\]
For f ∈ L+ we define
If = sup{Ih : h ∈ H, h ≤ f }.
Clearly H ⊂ L+ since we may for f ∈ H choose hn = f for each n in the definition above, and also
clearly the definition of I still is consistent with the original value of I on H by monotonicity. Note that
a function in L+ does not need to be positive, but they are always bounded from below by some function
in H, and hence bounded from below by some constant.
Remark 6.8. Note that in case we work in the context of a measure space (X, M, µ) with the integrable
simple functions as our elementary functions, then every function in L+ is measurable, since it is a
pointwise limit of a sequence of measurable functions.
Lemma 6.9. If hn is a sequence in H, hn ↗ f and k ∈ H satisfies k ≤ f then Ik ≤ limn→∞ Ihn .
In particular Ihn ↗ If .
Theorem 6.10. L+ is a positive cone (i.e. af + bg ∈ L+ for all a, b ∈ [0, ∞) and f, g ∈ L+ ) and
a lattice (but not a vector lattice).
Furthermore:
(a) I(af + bg) = aIf + bIg holds for all f, g ∈ L+ and a, b ∈ [0, ∞).
(b) f ∈ L+ , f ≥ 0 ⇒ If ≥ 0.
(c) If fn is a sequence in L+ such that fn ↗ f , then either Ifn ↗ ∞ or f ∈ L+ and
Ifn ↗ If .
(d) If fn is a sequence in L+ such that fn ↘ 0, then Ifn ↘ 0.
Proof. That L+ is a positive cone and a lattice is easy to verify (just look at ahn + bkn , hn ∨ kn , hn ∧ kn
where hn ↗ f and kn ↗ g), and also (a) and (b) follows more or less immediately from the definition.
To prove (c), for each n we choose a sequence hnm ∈ H such that hnm ↗ fn as m → ∞. Now define
gn = h1n ∨ h2n ∨ · · · ∨ hnn ∈ H. Clearly gn is increasing, and since hin ≤ gn ≤ fn for all i ≤ n it follows
that gn ↗ f , and also, by Lemma 6.9, we have Ign ↗ If . Since Ign ≤ Ifn ≤ If we get the desired
conclusion.
To prove (d) assume that fn ↘ 0, let ε > 0 be fixed and choose h1 ∈ H such that h1 ≤ f1 and If1 ≤
Ih1 +ε/2 and h2 ≤ h1 ∧f2 such that I(h1 ∧f2 ) ≤ Ih2 +ε/4. Now note that I(h1 ∧f2 )+I(h1 ∨f2 ) = Ih1 +If2
and h1 ∨ f2 ≤ f1 , so
Ih2 + ε/4 ≥ I(h1 ∧ f2 ) = Ih1 + If2 − I(h1 ∨ f2 ) ≥ Ih1 + If2 − If1 ≥ If2 − ε/2.
So
Ih2 ≥ If2 − (ε/2 + ε/4).
Inductively choose hn ≤ hn−1 ∧ fn and Ihn ≥ I(hn−1 ∧ fn ) − ε/2^n . Then one gets
\[
Ih_n \ge If_n - \sum_{j=1}^{n} \varepsilon/2^{j} \ge If_n - \varepsilon.
\]
Now since hn ↘ 0 we have by definition that Ihn ↘ 0 and hence Ifn ↘ 0 since ε is arbitrary.
(We should remark here that some care has to be taken when proving part (d). It would be tempting
to try to use (c) on f1 − fn for instance, but note that L+ is not a vector space, so we can not use
subtraction.) □
We will now extend the definition of I further.
Definition 6.11. We let L denote the set of all extended real-valued functions f such that for
every ε > 0 there are h, g ∈ L+ such that −h ≤ f ≤ g and I(h + g) ≤ ε.
Proposition 6.12. A function f belongs to L if and only if there are decreasing sequences hn , gn ∈
L+ and α ∈ R such that
• −hn ≤ f ≤ gn for all n,
• −Ihn ↗ α and Ign ↘ α.
Proof. If hn , gn are as in the statement, then we have Ign + Ihn = Ign − (−Ihn ) ↘ 0, and hence we see
the sufficiency.
Conversely, if for each n we may choose g′n , h′n such that −h′n ≤ f ≤ g′n and I(h′n + g′n ) ≤ 1/n,
then we may define hn = ∧_{j=1}^n h′j and gn = ∧_{j=1}^n g′j . It is easy to see that −h′n ≤ −hn ≤ f ≤ gn ≤ g′n
for each n, so in particular 0 ≤ I(gn + hn ) ≤ 1/n. Clearly Ign is decreasing to some α, and since
−Ihn = Ign − I(gn + hn ) we get that −Ihn (which is increasing) also converges to α. □
Definition 6.13. We say that a set A ⊂ X is a (I−) null set if there is a function f ∈ L such
that f ≥ 0 on X, f ≥ 1 on A and If = 0. A property which holds apart from a null set is said
to hold (I-)almost everywhere (a.e.).
Clearly the above definition is equivalent to requiring that χA ∈ L with IχA = 0, since 0 ≤ χA ≤ f . It is
also equivalent to requiring that for each ε > 0 there is 0 ≤ g ∈ L+ such that g ≥ 1 on A and Ig ≤ ε. It is
furthermore easy to verify that a countable union of null sets is itself a null set (look at fn /2^n ).
Exercise 6.8. In case we work in the context of a measure space (X, M, µ), show that A is
Iµ -null if and only if there is a set B ⊃ A in M such that µ(B) = 0. (In general A need not be
measurable, but if µ is complete, then it is measurable with measure zero.)
Lemma 6.14. A set A ⊂ X is a null set if and only if for every ε > 0 there is an increasing
sequence hn ∈ H with hn ≥ 0 on X, limn→∞ hn ≥ 1 on A and Ihn ≤ ε for all n.
Remark 6.15. In case we work in the context of a measure space (X, M, µ) with the set of integrable
simple functions as our elementary functions, then we see that for any element f ∈ L there are elements
h, g ∈ L such that h ≤ f ≤ g, both h and g are measurable and h = g µ-a.e. To see this note that if
we choose hn , gn as in Proposition 6.12, then hn ↘ −h and gn ↘ g with the stated properties, because
gn + hn is decreasing with I(gn + hn ) ↘ 0.
In case µ is complete then in particular f is measurable. In this context we will also write
L1 (µ) = {f ∈ L : f is M-measurable}.
In particular L1 (µ) and L are equal if µ is complete, and in either case any function f ∈ L can be
modified on a set of measure zero to become an element in L1 (µ).
We also define
\[
\int f \, d\mu = I_\mu f \quad \text{for all } f \in L^1(\mu).
\]
Lemma 6.16. (a) If f ∈ L then the set {x ∈ X : |f (x)| = ∞} is a null set. Furthermore if
f ∈ L and u : X → R differs from f on at most a null set, then u ∈ L and If = Iu.
(b) If f ≥ 0 belongs to L and If = 0, then {x ∈ X : f (x) > 0} is a null set.
(c) If fn ≥ 0 is a decreasing sequence in L and Ifn ↘ 0, then fn ↘ 0 a.e.
Therefore it is enough to prove that A = {x ∈ X : k(x) = ∞} is a null set for every positive function
k ∈ L+ . But now the function k/n ∈ L+ is infinite on A for each n, and since I(k/n) = (1/n) Ik ↘ 0 this
shows that A is a null set, because we have 0 ≤ χA ≤ k/n for each n.
Since for the null set A = {x ∈ X : u(x) ≠ f (x)} we have ±∞χA ∈ L we see that (with the notation
from above) −h − ∞χA ≤ u ≤ g + ∞χA , and hence we easily get from the definition that u ∈ L with
the same integral as f .
(b): It is enough to prove that Aj = {x ∈ X : f (x) > 1/j} is a null set for each j. Now by definition
of the integral there is a sequence of functions gn ∈ L+ such that f ≤ gn and Ign decreases to zero. But
for each j we have χAj ≤ jgn , and I(jgn ) decreases to zero. Therefore I(χAj ) = 0 by definition.
(c): By definition of L we may for each n choose gn ∈ L+ with fn ≤ gn and Ign ≤ Ifn + 1/n. Let
g′n = ∧_{j=1}^n gj . Then fn ≤ g′n ≤ gn and g′n is a decreasing sequence in L+ . Since g′n decreases to a function
g with 0 ≤ g ≤ g′n we see that g ∈ L and Ig = 0. Therefore {x ∈ X : g(x) ≠ 0} is a null set, and by
construction it contains the set of points where the limit of fn is non-zero. □
What the above lemma says in particular is that how we interpret for instance a sum of two functions
f + g in L on the set where the functions are ±∞ does not make any difference for the integral. For
this reason one can allow integrable functions to be defined only on a subset of X whose complement is
a null set. With this at hand we can now formulate the following theorem, where we now use the above
in the interpretation of the statement about linear combinations of functions in L.
Exercise 6.9. Let X = N, M = P(N) and µ be counting measure. Describe L1 (µ) and the
integral Iµ in this context.
Exercise 6.10 (Integration over subsets). Let (X, M, µ) be a measure space. Given E ∈ M
then we may define integrals with respect to µ over E by the formula
\[
\int_E f \, d\mu = \int f \chi_E \, d\mu,
\]
where we interpret f χE as 0 on E^c regardless of the value of f there. We could alternatively let
(see Exercise 5.2 for the definition of µ|E )
\[
\int_E f \, d\mu = \int f \, d\mu|_E .
\]
Show that these two definitions give the same value for any function f ∈ L1 (µ).
7. Convergence theorems
Proof. Without loss of generality we may assume that the convergence is everywhere. Assume that
Ifn ↗ C < ∞. Let
k1 = f1 , k2 = f2 − f1 , . . . , kn = fn − fn−1
so that each ki ≥ 0 and
k1 + k2 + . . . + kn = fn .
Hence Σ_{n=1}^∞ kn = f , and each kn ∈ L, kn ≥ 0. For each n we may now choose h′n , g′n ∈ L+ such that
0 ≤ −h′n ≤ kn ≤ gn′
and
I(h′n + g′n ) ≤ ε/2^n .
(Because if h′′n ∈ L+ satisfies −h′′n ≤ kn then h′n = h′′n ∧ 0 ∈ L+ also satisfies −h′n = (−h′′n )+ ≤ kn since
kn ≥ 0.) Then
\[
h_n = \sum_{j=1}^{n} h'_j \in L^+ \quad \text{and} \quad g_n = \sum_{j=1}^{n} g'_j \in L^+ .
\]
Furthermore
−hn ≤ fn ≤ gn , I(hn + gn ) ≤ ε and − Ihn ≤ Ifn ≤ Ign .
Now gn ↗ g for some g ∈ L+ , and −hn is also increasing. By Theorem 6.10 we get that Ign ↗ Ig,
where by construction C ≤ Ig ≤ C + ε. But also −Ihn is increasing to some real number α such that
α ≥ C − ε, and hence there is some N such that −IhN > C − 2ε. But now −hN ≤ f ≤ g and
I(g + hN ) ≤ 3ε. This gives the result. □
Proof. Apply the monotone convergence theorem to the sequence gn = Σ_{i=1}^n fi . □
Proof. Without loss of generality assume that the convergence is everywhere. Assume lim inf n→∞ Ifn < ∞. Let
gn = ∧_{j=n}^∞ fj . Then gn ≤ fn , gn is increasing and gn ↗ f pointwise on X. Furthermore note that since
0 ≤ fn − ∧_{j=n}^N fj ∈ L and I(fn − ∧_{j=n}^N fj ) ≤ I(fn ) < ∞ for every N ≥ n, we see by monotone convergence that fn − gn
belongs to L, and hence by linearity so does gn . If we again apply the monotone convergence theorem,
then we have Ign ↗ If , and hence since Ign ≤ Ifn for every n we get the result. □
Proof. Without loss of generality we may assume that the convergence and the inequalities hold everywhere. Since
g + fn ≥ 0 we get by Fatou’s lemma
\[
\liminf_{n\to\infty} I(g + f_n) = Ig + \liminf_{n\to\infty} If_n \ge I(g + f) = Ig + If,
\]
and hence
\[
\liminf_{n\to\infty} If_n \ge If.
\]
But we also have g − fn ≥ 0, and if we apply Fatou’s lemma again we get in a similar manner
\[
\liminf_{n\to\infty} I(g - f_n) = Ig - \limsup_{n\to\infty} If_n \ge Ig - If,
\]
and hence
\[
\limsup_{n\to\infty} If_n \le If.
\]
□
Exercise 7.2. On (0, 1) with Lebesgue measure m, show that the sequence
\[
f_n(x) = \begin{cases} n & 0 < x < 1/n, \\ 0 & 1/n \le x < 1 \end{cases}
\]
satisfies that fn → 0 pointwise but
\[
\int f_n \, dm = 1.
\]
(So the boundedness assumption |fn | ≤ g cannot be dropped in the dominated convergence
theorem.)
Exercise 7.3. Suppose we have two integration spaces (X, H1 , I1 ) and (X, H2 , I2 ) over the same
space X, and use L_i , L_i^+ to denote the corresponding classes of integrable functions.
Prove that in case H1 ⊂ L_2^+ with I1 h = I2 h for all h ∈ H1 , then L_1 ⊂ L_2 with I1 f = I2 f for all
f ∈ L_1 , by completing the following steps:
(1) If h ∈ L_1^+ then there is hn ↗ h, where hn ∈ H1 and I1 hn = I2 hn ≤ C for all n. Hence
h ∈ L_2^+ and I1 hn = I2 hn ↗ I2 h = I1 h by Theorem 6.10 (c).
(2) If f ∈ L_1 then there are by assumption non-increasing sequences hn , gn ∈ L_1^+ with −hn ≤
f ≤ gn as in the definition of the space L_1 . Since these functions belong to L_2^+ with the
same value of the integrals, we see that by definition f ∈ L_2 with I1 f = I2 f .
(In particular in case H1 ⊂ L_2^+ and H2 ⊂ L_1^+ with equality of the integrals, then L_1 = L_2 and the
integrals coincide, so in particular we could start from different lattices and still end up with the
same integration spaces in the end.)
8. Two theorems of Stone
8.1. Stone’s theorem. Here we will look at a special case of Stone’s theorem which relates the general
Daniell integral to the Lebesgue approach. We will only treat the case when the constant function 1 ∈ L,
which corresponds to X having finite measure. This can be very much generalized, but some assumption
about “measurability” of X is needed for the result to hold. (Indeed the assumption can be replaced by
the assumption that (1 ∧ h) ∨ k ∈ L for all h, k ∈ L, but we will not enter into further discussion on this topic here.)
So assume now that 1 ∈ L. We define the following class of sets on X
M = {A ⊂ X : χA ∈ L}.
Then M is a σ−algebra. For instance X ∈ M is the same as the assumption that 1 ∈ L, and ∅ ∈ M is
the same as 0 ∈ L. That it is closed under complementation and countable unions follows easily from the
fact that L is a vector lattice and monotone convergence respectively. We now also define the measure
µ for A ∈ M as
µ(A) = IχA .
It again follows more or less immediately by definition that this indeed is a measure on M. Also note
that this measure is complete.
Proof. As noted above (X, M, µ) is a measure space, and furthermore any µ-integrable simple function
f is in L, because the latter space is a vector space. By linearity it also follows immediately that
If = ∫ f dµ holds for any such function. Furthermore we know that any function in L1 (µ) may be
written as a difference of monotone limits of such functions, and hence it is clear that L1 (µ) ⊂ L and
that If = ∫ f dµ holds for all f ∈ L1 (µ).
The more difficult direction is to prove that L ⊂ L1 (µ). To prove this inclusion we note that it is
enough to prove that any f ∈ L with f ≥ 0 belongs to L1 (µ), since any function in L can be written as
the difference of two such functions and L1 (µ) is a vector space. So we fix such f . First of all we note
that for any a ∈ R the set
E = {x ∈ X : f (x) > a}
is µ−measurable, i.e. χE ∈ L. To see this note that the functions f − a ∈ L and hence all the functions
gn = 1 ∧ n(f − a)+ ∈ L. But it is clear that gn ↗ χE and hence we get the statement by the dominated
convergence theorem. Now for any pair of integers k, n we define the sets
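\[
E_{k,n} = \{x \in X : f(x) > k 2^{-n}\},
\]
and the functions
\[
f_n = 2^{-n} \sum_{k=1}^{2^{2n}} \chi_{E_{k,n}} .
\]
Then we have
\[
f_n(x) = \begin{cases}
0 & \text{if } f(x) = 0, \\
j 2^{-n} & \text{if } j 2^{-n} < f(x) \le (j+1) 2^{-n},\ j = 0, 1, \ldots, 2^{2n} - 1, \\
2^{n} & \text{if } f(x) > 2^{n}.
\end{cases}
\]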
Hence we see that all fn ∈ L1 (µ), since these are µ−integrable simple functions. Furthermore we see
that fn ↗ f . Since we have
\[
\int f_n \, d\mu = If_n \nearrow If < \infty
\]
we may apply the monotone convergence theorem (for L1 (µ)) to see that f ∈ L1 (µ) and that
\[
\int f \, d\mu = If.
\]
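The approximating functions fn in this proof depend on x only through the value f (x), so they are easy to tabulate. The Python sketch below (the function names are illustrative assumptions) computes fn at a point where f takes a given finite value, once by literally counting the indices k with f (x) > k2^{−n} and once with a closed form, so that the monotone convergence fn ↗ f can be observed numerically.

```python
import math


def dyadic_by_counting(value: float, n: int) -> float:
    """2**-n times the number of k in 1..2**(2n) with value > k * 2**-n."""
    count = sum(1 for k in range(1, 4 ** n + 1) if value > k / 2 ** n)
    return count / 2 ** n


def dyadic_closed_form(value: float, n: int) -> float:
    """Same quantity without the loop (assumes 0 <= value < infinity)."""
    if value <= 0:
        return 0.0
    # The positive integers k with k < value * 2**n number ceil(value * 2**n) - 1.
    count = min(math.ceil(value * 2 ** n) - 1, 4 ** n)
    return count / 2 ** n


for n in range(1, 8):
    print(n, dyadic_closed_form(math.pi, n))
```

Because the sets Ek,n are defined with a strict inequality, a point where f (x) equals a grid value k2^{−n} exactly is assigned the grid value just below it at level n.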
8.2. Iterated integrals and the Fubini-Stone theorem. For this subsection we let (X, IX , HX ),
(Y, IY , HY ) and (W, IW , HW ) be three integration spaces where furthermore W = X × Y . We will
write LX , LY , LW for the corresponding spaces of integrable functions. The Fubini-Stone theorem states
conditions to ensure that we can compute IW as an iterated integral
IW f (x, y) = IY (IX f (x, y)),
in the sense that the function IX f (x, y) is defined for a.e. fixed y, and hence is a function of y which
we may possibly integrate with respect to IY . (Of course the roles of X and Y can typically be
interchanged.) Obviously for anything like this to be true IW must be appropriately connected to IX and
IY .
Theorem 8.2 (Fubini-Stone). With the notation from above, suppose we for every h ∈ HW have
(a) h(x, y) ∈ LX for a.e. y ∈ Y ,
(b) IX h(x, y) ∈ LY ,
(c) IW h(x, y) = IY (IX h(x, y)).
In that case (a), (b) and (c) also hold for all h ∈ LW .
Proof. Let Φ denote the set of all functions in LW for which (a), (b) and (c) hold. By assumption
HW ⊂ Φ and we want to prove that Φ = LW . We first note that due to linearity in all arguments
Φ is a vector space. Second of all, suppose ϕn is an increasing sequence of functions in Φ such that
ϕn ↗ ϕ ∈ LW . If we put gn (y) = IX ϕn (x, y), then gn is an increasing sequence in LY by assumption,
which converges to some element g ∈ LY by the monotone convergence theorem. Applying the monotone
convergence theorem we get
IW ϕn → IW ϕ,
IX ϕn (x, y) → IX ϕ(x, y) for a.e. y ∈ Y
IY (IX ϕn (x, y)) = IY gn → IY g.
(The a.e. part in the second line above is apart from the set of all y ∈ Y for which (a) fails to hold for
some n, which is by assumption a countable union of null sets, and hence itself a null set.) In particular
we see that L_W^+ ⊂ Φ. Using linearity we see that also Φ is closed under bounded decreasing limits.
Suppose finally that f is a function in LW . Then there are decreasing sequences hn , gn ∈ L_W^+ such
that −hn ≤ f ≤ gn and IW (hn + gn ) < 1/n. First of all this implies that hn + gn has a limit which
belongs to Φ and
IY (IX (hn (x, y) + gn (x, y))) ≤ 1/n.
However this implies that we must have IX (hn (x, y) + gn (x, y)) → 0 for a.e. y ∈ Y . Hence for all y apart
from these we must have that f (x, y) ∈ LX , since −IX hn (x, y) ≤ IX f (x, y) ≤ IX gn (x, y) holds for all y.
This also shows that IX f (x, y) belongs to LY , and the theorem follows. □
Exercise 8.2. Let X and Y be two closed and bounded rectangles in RN and RM respectively.
Furthermore let W = X ×Y , which then can be regarded as a rectangle in RN +M in a natural way.
Also let HX , HY and HW denote the set of continuous functions on X, Y and W respectively.
Show that the assumptions in the Fubini-Stone theorem are satisfied if IX , IY and IW are the
Riemann integral of the functions in HX , HY and HW respectively. (Results about the Riemann
integral in higher dimensions from advanced calculus can be used without proof.) In this example
condition (a) in the Fubini-Stone theorem is satisfied everywhere for the elements in HW , but is
the same true for all elements in LW ?
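For a concrete feeling for condition (c) in the continuous setting, one can compare a two-dimensional midpoint sum with the corresponding iterated midpoint sums. The Python sketch below is a numerical illustration only: the test function h(x, y) = x y^2 on [0, 1] × [0, 2], the grid size and all function names are assumptions chosen for the example, and midpoint sums merely approximate the Riemann integrals.

```python
def midpoints(a, b, n):
    """Midpoints of n equal subintervals of [a, b], and the step length."""
    step = (b - a) / n
    return [a + (i + 0.5) * step for i in range(n)], step


def double_sum(h_func, x_range, y_range, n):
    """Midpoint approximation of the integral over the product rectangle."""
    xs, hx = midpoints(*x_range, n)
    ys, hy = midpoints(*y_range, n)
    return sum(h_func(x, y) for x in xs for y in ys) * hx * hy


def iterated_sum(h_func, x_range, y_range, n):
    """Integrate in x first (for each fixed y), then integrate the result in y."""
    xs, hx = midpoints(*x_range, n)
    ys, hy = midpoints(*y_range, n)
    inner = [sum(h_func(x, y) for x in xs) * hx for y in ys]
    return sum(inner) * hy


def h(x, y):
    """Hypothetical continuous test function."""
    return x * y ** 2


print(double_sum(h, (0, 1), (0, 2), 200), iterated_sum(h, (0, 1), (0, 2), 200))
```

Both sums approximate the exact value 4/3 of the integral of x y^2 over the rectangle. For finite sums the agreement is just a rearrangement of terms, and the content of the exercise is that it survives in the limit.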
9. Different types of convergence
In this section we let (X, M, µ) be a measure space.
Let fn be a sequence of measurable functions on X. We know what it means that fn converges to f
pointwise a.e. on X, but there are a few other important concepts of convergence as well. Indeed just
having a.e. convergence says very little with regard to integrals, as the following example shows.
Example 9.1. If we define on [0, 1] with Lebesgue measure fn = nχ(0,1/n] , then fn converges to
0 everywhere pointwise, but we still have ∫_0^1 fn dm = 1 for each n.
On the other hand if we put (for n ≥ 2)
\[
f_n = \frac{n}{\ln(n)} \chi_{(0,1/n]}
\]
we have that fn converges to 0 pointwise and ∫_0^1 fn dm = 1/ ln(n), which converges to 0. However
note that if fn ≤ g for each n, then we would have g(x) at least as big as something comparable to
−1/(x ln(x)), which is not integrable on [0, 1], and hence we cannot have dominated convergence
here.
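The failure of domination in the second sequence can be made quantitative. Any g with fn ≤ g for all n ≥ 2 satisfies g ≥ fm = m/ ln(m) on the interval (1/(m + 1), 1/m], so its integral over [0, 1] is at least the sum of the terms 1/((m + 1) ln(m)) over m ≥ 2. The Python sketch below (function names are illustrative assumptions) evaluates the integrals of fn and partial sums of this lower bound.

```python
import math


def integral_fn(n: int) -> float:
    """Integral over [0, 1] of f_n = (n / ln n) * chi_{(0, 1/n]}, for n >= 2."""
    return (n / math.log(n)) * (1.0 / n)


def envelope_lower_bound(N: int) -> float:
    """Sum over m = 2..N of (m / ln m) * (1/m - 1/(m+1)) = 1 / ((m+1) ln m).

    This is a lower bound for the integral of any g dominating f_2, ..., f_N.
    """
    return sum(1.0 / ((m + 1) * math.log(m)) for m in range(2, N + 1))


for N in (10, 10 ** 3, 10 ** 5):
    print(N, integral_fn(N), envelope_lower_bound(N))
```

The integrals of fn decrease to 0, while the lower bound keeps growing with N. The growth is very slow, which reflects how close this example is to the borderline of integrability.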
Definition 9.2. We say that a sequence of a.e. real valued functions fn converges to f
• in L1 (µ) if
\[
\lim_{n\to\infty} \int |f - f_n| \, d\mu = 0,
\]
• in measure if for every ε > 0 we have
\[
\lim_{n\to\infty} \mu(\{x : |f_n(x) - f(x)| \ge \varepsilon\}) = 0,
\]
• almost uniformly if for every ε > 0 there is a set E ⊂ X such that µ(E^c ) < ε and fn
converges to f uniformly on E.
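To see that these notions really differ, one can test them on the sequence fn = nχ(0,1/n] from Example 9.1 with limit f = 0. The Python sketch below approximates Lebesgue measure on [0, 1] by a uniform grid of midpoints. The grid size and the function names are assumptions made for the illustration, and the grid values are only approximations of the integrals and measures in the definition.

```python
def f(n, x):
    """f_n = n * chi_{(0, 1/n]} from Example 9.1."""
    return float(n) if 0.0 < x <= 1.0 / n else 0.0


def grid(cells):
    """Midpoints of a uniform partition of [0, 1] into the given number of cells."""
    return [(j + 0.5) / cells for j in range(cells)]


def l1_distance_to_zero(n, cells=100_000):
    """Grid approximation of the integral of |f_n - 0| over [0, 1]."""
    return sum(abs(f(n, x)) for x in grid(cells)) / cells


def measure_of_deviation(n, eps, cells=100_000):
    """Grid approximation of m({x : |f_n(x) - 0| >= eps})."""
    return sum(1 for x in grid(cells) if abs(f(n, x)) >= eps) / cells


for n in (10, 100, 1000):
    print(n, l1_distance_to_zero(n), measure_of_deviation(n, 0.5))
```

The measure of the set where fn deviates from 0 shrinks like 1/n, so fn → 0 in measure, while the L1 distance stays at 1, so there is no convergence in L1 .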
Remark 9.3. It is implicitly assumed in the definition of L1 (µ) convergence that the functions fn and
f belong to L1 (µ). Also note that in each of the cases f is automatically also a.e. real valued.
Example 9.4. If we have a sequence fn which converges to f a.e. on X and we have dominated
convergence in the sense that |fn | ≤ g, where g ∈ L1 (µ), then according to the dominated
convergence theorem, since |fn − f | converges to 0 a.e. and we have |fn − f | ≤ 2g a.e.,
\[
\int |f_n - f| \, d\mu \to 0 \quad \text{as } n \to \infty,
\]
i.e. fn → f in L1 (µ). Note however that, as we gave an example of above, it is possible to have
convergence in L1 even if no such g exists.
Exercise 9.1. Prove that in case fn → f in measure and also fn → g in measure then we must
have f = g µ-a.e. (i.e. the limit is unique up to null sets).
Remark 9.5. We will prove in theorem 9.6 below that in case fn → f either in L1 (µ) or almost
uniformly, then it also converges in measure. So in particular if a sequence for instance converges to say
f in measure and to g almost uniformly, then f = g a.e.
Exercise 9.2. Prove that ∥f ∥1 = ∫ |f | dµ satisfies the following for all k ∈ R and f, g ∈ L1 (µ):
• ∥kf ∥1 = |k| ∥f ∥1 ,
• ∥f + g∥1 ≤ ∥f ∥1 + ∥g∥1 ,
• ∥f ∥1 = 0 if and only if f = 0 a.e.
Theorem 9.6. Suppose the functions fn , f are µ-a.e. real valued
(a) If fn converges to f in measure, then there is a subsequence which converges to f almost
uniformly.
(b) If fn converges to f in L1 (µ), then fn converges to f in measure.
(c) If fn converges to f almost uniformly, then it converges also a.e. and in measure.
Hence fnk converges to some function f˜ uniformly on X \ Fm for each m. By exercise 9.1 it is clear that
we must indeed have f˜ = f on X \ Fm for each m (because if a sequence converges in measure on X,
then so it does on any measurable subset of X trivially). But we also have
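\[
\mu(F_m) \le \sum_{k=m}^{\infty} \mu(E_k) \le \frac{1}{2^{m-1}} \to 0 \quad \text{as } m \to \infty.
\]
(b): Since
\[
\int |f_n - f| \, d\mu \ge \varepsilon \, \mu(\{x \in X : |f_n(x) - f(x)| \ge \varepsilon\}),
\]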
we see that µ({x ∈ X : |fn (x) − f (x)| ≥ ε}) must go to zero as n → ∞ for each fixed ε > 0, since the
integral on the left hand side does so.
(c): For each k ∈ N let Fk be a set such that µ(Fk ) ≤ 1/k and such that fn converges uniformly
to f on X \ Fk . Then in particular fn converges pointwise to f on X \ ∩_{k=1}^∞ Fk , and ∩_{k=1}^∞ Fk has
measure zero, so fn converges to f a.e. Also for ε > 0 choose k such that ε > 1/k and N such that
|fm − f | ≤ 1/k on X \ Fk for all m ≥ N (which is possible due to the uniform convergence). Then
{x ∈ X : |fm (x) − f (x)| ≥ ε} ⊂ Fk for all m ≥ N , and since k can be chosen arbitrarily large we get
convergence in measure. □
Theorem 9.7 (Egoroff). If µ(X) < ∞ and fn → f a.e., where the functions fn , f are µ-a.e. real
valued, then fn converges to f almost uniformly.
Proof. We may without loss assume that f and fn are everywhere real-valued. For n, k ∈ N let
\[
E_n^k = \bigcap_{m=n}^{\infty} \left\{ x \in X : |f_m(x) - f(x)| < \frac{1}{k} \right\} = \left\{ x \in X : |f_m(x) - f(x)| < \frac{1}{k} \text{ for all } m \ge n \right\}.
\]
Since F ⊂ E_{n_k}^k it follows that |fn (x) − f (x)| < 1/k for any x ∈ F and any n ≥ nk , and hence the sequence converges
uniformly on F . □
Exercise 9.4 (*). We say that a sequence of a.e. real-valued functions fn on a measure space
(X, M, µ) is Cauchy in measure if
\[
\lim_{n,m\to\infty} \mu(\{x \in X : |f_n(x) - f_m(x)| \ge \varepsilon\}) = 0
\]
for all ε > 0. Prove that if fn is Cauchy in measure, then there is a function f which is a.e.
real-valued and such that fn → f in measure. (Hint: adapt the proof of part (a) of Theorem 9.6.)
Exercise 9.5 (**). Let (X, M, µ) be a measure space. A family {fi }i∈I ⊂ L1 (µ) is called
uniformly integrable if for every ε > 0 there is δ > 0 such that
\[
\left| \int_E f_i \, d\mu \right| < \varepsilon \quad \text{for all } i \in I \text{ and all } E \in M \text{ with } \mu(E) < \delta.
\]
Prove that:
(a) If I is finite, then any family {fi }i∈I ⊂ L1 (µ) is uniformly integrable.
(b) If fn → f in L1 (µ) then {fn }n∈N is uniformly integrable.
To be precise, Beppo-Levi’s theorem that we used (as we have developed the material) might not actually
be applicable in case Σ_{i=1}^∞ χAi (x)χBi (y) is not a limit of functions in L1 (µ) (which may happen in case
(X, M, µ) is not σ-finite). However if this happens, then by definition this would have to imply that
there is Ai such that µ(Ai ) = ∞ and y ∈ Bi , in which case the stated inequality holds trivially. Now we
can do a similar argument with y:
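\[
\begin{aligned}
\mu(A)\eta(B) &= \int \mu(A)\chi_B(y) \, d\eta(y) \le \int \left( \sum_{i=1}^{\infty} \mu(A_i)\chi_{B_i}(y) \right) d\eta(y) \\
&\le \sum_{i=1}^{\infty} \int \mu(A_i)\chi_{B_i}(y) \, d\eta(y) = \sum_{i=1}^{\infty} \mu(A_i)\eta(B_i).
\end{aligned}
\]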
□
We define the outer measure on X × Y
\[
\pi^*(E) = \inf\left\{ \sum_{i=1}^{\infty} \lambda(A_i \times B_i) : E \subset \bigcup_{i=1}^{\infty} (A_i \times B_i),\ A_i \times B_i \in K \right\}.
\]
Proof. It is enough to prove that A × B is measurable for each measurable rectangle. For a given set
E ⊂ X × Y with π∗ (E) < ∞ and ε > 0 we may choose a cover E ⊂ ∪_{i=1}^∞ Ai × Bi of measurable rectangles
such that
\[
\pi^*(E) + \varepsilon \ge \sum_{i=1}^{\infty} \mu(A_i)\eta(B_i).
\]
Now we use that (A × B)c = (Ac × B) ∪ (A × B c ) ∪ (Ac × B c ) to get
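\[
E \cap (A \times B) \subset \left( \bigcup_{i=1}^{\infty} (A_i \times B_i) \right) \cap (A \times B) = \bigcup_{i=1}^{\infty} ((A_i \cap A) \times (B_i \cap B)),
\]
\[
\begin{aligned}
E \cap (A \times B)^c &= E \cap ((A^c \times B) \cup (A \times B^c) \cup (A^c \times B^c)) \\
&= (E \cap (A^c \times B)) \cup (E \cap (A \times B^c)) \cup (E \cap (A^c \times B^c)) \\
&\subset \bigcup_{i=1}^{\infty} ((A_i \cap A^c) \times (B_i \cap B)) \cup \bigcup_{i=1}^{\infty} ((A_i \cap A) \times (B_i \cap B^c)) \cup \bigcup_{i=1}^{\infty} ((A_i \cap A^c) \times (B_i \cap B^c)).
\end{aligned}
\]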
Hence
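\[
\begin{aligned}
&\pi^*(E \cap (A \times B)) + \pi^*(E \cap (A \times B)^c) \\
&\quad \le \sum_{i=1}^{\infty} \mu(A_i \cap A)\eta(B_i \cap B) \\
&\qquad + \sum_{i=1}^{\infty} \left( \mu(A_i \cap A^c)\eta(B_i \cap B) + \mu(A_i \cap A)\eta(B_i \cap B^c) + \mu(A_i \cap A^c)\eta(B_i \cap B^c) \right) \\
&\quad = \sum_{i=1}^{\infty} \left( \mu(A_i)\eta(B_i \cap B) + \mu(A_i)\eta(B_i \cap B^c) \right) = \sum_{i=1}^{\infty} \mu(A_i)\eta(B_i) \\
&\quad \le \pi^*(E) + \varepsilon.
\end{aligned}
\]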
Since ε > 0 is arbitrary it follows by definition that A × B is π ∗ -measurable. □
Remark 10.4. Typically this measure is not complete (i.e. there are typically π∗ -measurable sets which
do not belong to M ⊗ N ). The reason for this choice is that we want sections
E y = {x ∈ X : (x, y) ∈ E}, Ex = {y ∈ Y : (x, y) ∈ E}
to be measurable for a measurable set E.
Now we want to get the Fubini theorem for measure spaces. We already have the Fubini-Stone
theorem, which takes us quite a bit on the way, but there is a problem to be addressed. For the product
measure above, for K to be a sequential covering class in general we need to allow sets of the form
A × B where A and B might have infinite measure. If we want to define a suitable integration space of
elementary functions and apply the Fubini-Stone theorem this of-course does not work, since then we
need a finite value of the integral. Therefore we from now on assume that X and Y are σ-finite (indeed,
as we will see in Exercise 10.1 the theorem fails to hold otherwise).
Lemma 10.5. Suppose that X and Y are σ-finite, then the set K′ of all rectangles A × B with
µ(A) < ∞ and η(B) < ∞ is a sequential covering class. If we define
\[
\pi'(E) = \inf \left\{ \sum_{i=1}^{\infty} \lambda(A_i \times B_i) : E \subset \bigcup_{i=1}^{\infty} A_i \times B_i, \ A_i \times B_i \in \mathcal{K}' \right\},
\]
then π ′ = π ∗ .
Proof. If we choose increasing sequences Xn and Yn in M and N respectively with finite measures and
such that X = ∪_{n=1}^∞ Xn and Y = ∪_{n=1}^∞ Yn , then X × Y = ∪_{n=1}^∞ Xn × Yn , and hence K′ is a sequential
covering class.
To see that π′ = π∗ , note that since the infimum is taken over a smaller set we always have
π′ (E) ≥ π∗ (E). The proof that π′ is a pre-outer measure such that all rectangles in K′ are measurable
Since the sets (Ai ∩ Xn ) × (Bi ∩ Yn ) belong to K′ we see by definition that we get π′ (E) ≤ π∗ (E) in
this case.
Finally for general E with π ∗ (E) < ∞ and ε > 0 choose A × B ∈ K with π ∗ (E) ≥ µ(A)η(B) − ε. Since
A × B = ∪_{n=1}^∞ (A ∩ Xn ) × (B ∩ Yn ) we see that indeed A × B is π′ -measurable. But due to continuity
from below for measures we then get, since π′ ((A ∩ Xn ) × (B ∩ Yn )) = π∗ ((A ∩ Xn ) × (B ∩ Yn )) for each
n, that
\[
\pi'(A \times B) = \pi^*(A \times B).
\]
Hence
π ′ (E) ≤ π ∗ (A × B) ≤ π ∗ (E) + ε.
□
Now let W = X × Y . To apply the Fubini-Stone theorem to this context let HW denote the set of all
functions of the form
\[
h(x, y) = \sum_{i=1}^{n} c_i \chi_{A_i \times B_i}(x, y) \quad \text{where } c_i \in \mathbb{R}, \ A_i \times B_i \in \mathcal{K}'.
\]
For each fixed y ∈ Y we have h(·, y) ∈ L1 (µ) since it is then an integrable simple function (χAi ×Bi (x, y) =
χAi (x)χBi (y)).
Lemma 10.6. The set HW is a vector lattice of bounded functions, and the function
\[
I_W h(x, y) = I_W \left( \sum_{i=1}^{n} c_i \chi_{A_i \times B_i} \right) = \sum_{i=1}^{n} c_i \mu(A_i)\eta(B_i)
\]
is a Daniell integral on HW (in particular the value does not depend on the particular represen-
tation of h). Furthermore for any h(x, y) ∈ HW as above we have
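\[
\begin{aligned}
\int h(x, y) \, d\mu \times \eta(x, y) &= \sum_{i=1}^{n} c_i \int \chi_{A_i \times B_i}(x, y) \, d\mu \times \eta \\
&= \sum_{i=1}^{n} c_i \mu(A_i)\eta(B_i) = \int \left( \int \sum_{i=1}^{n} c_i \chi_{A_i}(x)\chi_{B_i}(y) \, d\mu(x) \right) d\eta(y).
\end{aligned}
\]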
Proof. Note that HW precisely consists of the A-simple functions, and according to Proposition 4.12 we
know that they form a vector lattice.
The stated equalities are simple consequences of linearity, and since IW = Iµ×η on HW it is also clear
that it is a Daniell integral on HW . □
Due to the above lemma we see that the Fubini-Stone theorem applies and we get:
Theorem 10.7 (Fubini). Assume that (X, M, µ) and (Y, N , η) are σ-finite. If f ∈ L1 (µ × η)
then for a.e. y ∈ Y the function f (·, y) belongs to L1 (µ) and ∫ f (x, ·)dµ(x) belongs to L1 (η).
Furthermore
\[
\int f \, d\mu \times \eta = \int \left( \int f(x, y) \, d\mu(x) \right) d\eta(y).
\]
Similarly
\[
\int f \, d\mu \times \eta = \int \left( \int f(x, y) \, d\eta(y) \right) d\mu(x).
\]
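The integrability hypothesis in Fubini's theorem cannot simply be dropped. As a hedged sketch (our own classical example, not taken from these notes), let X = Y = N with counting measure, which is σ-finite, and let a(m, n) equal 1 if m = n, −1 if m = n + 1 and 0 otherwise. The function a is not integrable with respect to the product measure, and the two iterated sums differ. The function names below are hypothetical.

```python
def a(m: int, n: int) -> int:
    """Double sequence on N x N (indices start at 1)."""
    if m == n:
        return 1
    if m == n + 1:
        return -1
    return 0


def sum_over_n_then_m(N: int) -> int:
    """Sum over m = 1..N of the full inner sum over n.

    For fixed m the support in n is contained in {m - 1, m}, so summing
    n over 1..m gives the exact inner sum.
    """
    return sum(sum(a(m, n) for n in range(1, m + 1)) for m in range(1, N + 1))


def sum_over_m_then_n(N: int) -> int:
    """Sum over n = 1..N of the full inner sum over m.

    For fixed n the support in m is {n, n + 1}.
    """
    return sum(sum(a(m, n) for m in range(n, n + 2)) for n in range(1, N + 1))


print(sum_over_n_then_m(50), sum_over_m_then_n(50))
```

Every inner sum is exact, so the outer partial sums are constant in N: one order of summation gives 1 and the other gives 0.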
Fubini’s student Tonelli also made the following addition to the theorem.
Theorem 10.8 (Tonelli). Suppose (X, M, µ) and (Y, N , η) are σ-finite measure spaces and f :
X × Y → [0, ∞] is µ × η-measurable. If
\[
\int \left( \int f(x, y) \, d\mu(x) \right) d\eta(y) < \infty,
\]
which follows by applying the monotone convergence theorem twice. But this implies immediately by
the monotone convergence theorem that f ∈ L1 (µ × η), and hence Fubini’s theorem is applicable. □
Remark 10.9. Tonelli’s theorem can be used in connection with Fubini’s theorem in the following way:
given a measurable function f : X × Y → R, show that
\[
\int \left( \int |f(x, y)| \, d\mu(x) \right) d\eta(y) < \infty,
\]
Exercise 10.1. Let X = [0, 1] with µ Lebesgue measure restricted to X, and Y = [0, 1] with
counting measure η (so η is not σ-finite). Let E = {(x, x) : x ∈ [0, 1]} ⊂ X × Y . Prove that, using
the notation from above, π ∗ (E) = ∞ but
\[
\int \left( \int \chi_E(x, y) \, d\mu(x) \right) d\eta(y) = 0,
\]
\[
\int \left( \int \chi_E(x, y) \, d\eta(y) \right) d\mu(x) = 1.
\]
Exercise 10.3 (*). As stated above the measure µ × η is typically not complete. Find (in a
book) and state a version of Fubini’s theorem for the completion of µ × η.
Exercise 10.4 (Cavalieri’s Principle **). Show that if (X, M, µ) is a measure space, then for
any positive measurable function f and any q ∈ (0, ∞) we have
\[
\int f^q \, d\mu = q \int_0^{\infty} t^{q-1} \mu(\{x \in X : f(x) > t\}) \, dt.
\]
(The last integral is with respect to Lebesgue measure on (0, ∞). An interesting remark is that in
case we take q = 1, then the function µ({x ∈ X : f (x) > t}) is decreasing, and hence the integral
would make sense even from the Riemann point of view. This is actually a different approach to
define the integral from a measure µ. Sometimes Cavalieri’s principle is also called the bathtub
principle.)
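A quick sanity check of the identity (not a proof, and not part of the original notes) is possible for a finitely supported measure and integer q, where both sides can be computed exactly. In this sketch the data format (a list of pairs (f value, mass)) and the function names are our own assumptions.

```python
from fractions import Fraction


def lhs(points, q):
    """Integral of f**q for a finitely supported measure.

    points is a list of pairs (value of f at a point, mass of that point).
    """
    return sum(Fraction(v) ** q * Fraction(w) for v, w in points)


def rhs(points, q):
    """q * integral over (0, inf) of t**(q-1) * mu({f > t}) dt, computed exactly."""
    values = sorted({Fraction(v) for v, _ in points if v > 0})
    total, prev = Fraction(0), Fraction(0)
    for v in values:
        # For t in [prev, v) the set {f > t} coincides with {f >= v}.
        mass = sum(Fraction(w) for u, w in points if u >= v)
        # The integral of q * t**(q-1) over [prev, v] equals v**q - prev**q.
        total += (v ** q - prev ** q) * mass
        prev = v
    return total


pts = [(1, 2), (3, 1), (3, Fraction(1, 2)), (0, 5)]
print(lhs(pts, 2), rhs(pts, 2))
```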
11. Signed measures and Differentiation
11.1. Signed Measures.
Example 11.2. Suppose (X, M, µ) is a measure space (µ ≥ 0) and that f ∈ L1 (µ), then
\[
\nu(A) = \int_A f \, d\mu := \int f \chi_A \, d\mu \quad \text{for } A \in \mathcal{M}, \tag{2}
\]
is a signed measure on M.
If µ1 and µ2 are two (positive) measures on M, where at least one is finite, then
(3) ν(A) = µ1 (A) − µ2 (A) for A ∈ M,
is a signed measure on M.
Exercise 11.1. Prove that ν as defined by equations (2) and (3) are signed measures.
Definition 11.3. A set E ∈ M is called positive for ν if for any subset A ⊂ E which belongs to
M we have ν(A) ≥ 0. Similarly we define a set E to be negative for ν if ν(A) ≤ 0 for all A ⊂ E,
and null if ν(A) = 0 for all A ⊂ E.
Exercise 11.2. Prove that if {Bj }j∈N is a sequence of negative sets for ν, then the union ∪_{j=1}^∞ Bj
is also negative for ν. (The corresponding result for positive sets is of-course also true.)
Exercise 11.3. Prove that if E ⊂ F and |ν(F )| < ∞, then also |ν(E)| < ∞. (Hint: Use that
ν(F ) = ν(E) + ν(F \ E).)
Proof. The uniqueness up to null sets is easy to prove and left to the reader. We assume without loss
that ν takes values in (−∞, ∞]. Define
β = inf{ν(E) : E ⊂ X is negative for ν}.
First of all note that −∞ < β < ∞ (this is why we assume that ν takes values in (−∞, ∞]). Then there
is some sequence {Bj }j∈N of negative sets such that ν(Bj ) → β as j → ∞. The union B = ∪_{j=1}^∞ Bj is
itself also a negative set, and we have ν(B) ≤ ν(Bj ) for each j. Hence ν(B) = β.
We shall now prove that A = B c is a positive set, which will complete the proof, since we may then
put E + = A and E − = B. If this set is not positive then by definition there is some subset E0 ⊂ A with
ν(E0 ) < 0. But E0 can not be a negative set, because then ν(B ∪ E0 ) < β with B ∪ E0 a negative set,
which contradicts the definition of β. Therefore E0 contains a set E1 such that ν(E1 ) > 0. We let m1
be the smallest positive integer such that there is such a set with ν(E1 ) ≥ 1/m1 . Since |ν(E0 )| < ∞ we
have that |ν(E1 )| < ∞ (by the previous exercise). Since
\[
\nu(E_0 \setminus E_1) = \nu(E_0) - \nu(E_1) \le \nu(E_0) - \frac{1}{m_1} < 0,
\]
we may apply the same argument to E0 \ E1 and choose a smallest integer m2 for which there is a set
E2 ⊂ E0 \ E1 and ν(E2 ) ≥ 1/m2 . We may continue this construction to get a sequence of numbers mk
and subsets Ek ⊂ E0 \ ∪_{j=1}^{k−1} Ej with ν(Ek ) ≥ 1/mk . By construction these sets are in particular mutually
disjoint, and hence
\[
\nu\left( \bigcup_{j=1}^{\infty} E_j \right) = \sum_{j=1}^{\infty} \nu(E_j) < \infty.
\]
Since ν(Ej ) ≥ 1/mj this in particular means that 1/mj → 0 as j → ∞. But then it follows that for any
measurable subset F ⊂ F0 = E0 \ ∪_{j=1}^∞ Ej we have
\[
\nu(F) \le \frac{1}{m_k - 1} \to 0 \quad \text{as } k \to \infty.
\]
Hence ν(F ) ≤ 0 for all such F , and therefore F0 is a negative set with
\[
\nu(F_0) = \nu(E_0) - \sum_{k=1}^{\infty} \nu(E_k) < \nu(E_0) < 0.
\]
As for E0 above we see that this gives a contradiction to the definition of β and B. □
Definition 11.5. Suppose that both µ and ν are signed measures on M, then we say that µ and
ν are mutually singular, written µ ⊥ ν, if X = E ∪ F where E ∩ F = ∅ and E and F are null-sets
for µ and ν respectively.
Theorem 11.6 (Jordan Decomposition). There are unique positive measures ν + and ν − on M
such that ν = ν + − ν − and ν + ⊥ ν − .
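On a finite set with all subsets measurable, a Hahn decomposition and the Jordan decomposition can be read off from the point masses. The following is a minimal sketch under that assumption (finite X, signed measure given by a dictionary of point masses); the function name jordan_parts is hypothetical.

```python
from itertools import combinations


def jordan_parts(weights):
    """Hahn and Jordan decomposition of a signed measure on a finite set.

    weights maps each point x to nu({x}).
    """
    pos = {x for x, w in weights.items() if w >= 0}  # a positive set
    neg = set(weights) - pos                         # its complement, a negative set

    def nu(A):
        return sum(weights[x] for x in A)

    def nu_plus(A):
        return sum(weights[x] for x in A if x in pos)

    def nu_minus(A):
        return -sum(weights[x] for x in A if x in neg)

    return pos, neg, nu, nu_plus, nu_minus


weights = {"a": 2, "b": -3, "c": 0, "d": 1}
pos, neg, nu, nu_plus, nu_minus = jordan_parts(weights)
subsets = [set(c) for k in range(len(weights) + 1) for c in combinations(weights, k)]
print(sorted(pos), sorted(neg), nu_plus(set(weights)), nu_minus(set(weights)))
```

The point "c" has mass zero and could equally be placed in the negative set, which reflects uniqueness of the Hahn decomposition only up to null sets.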
Exercise 11.4. Prove that ν± as given above indeed satisfy the statement of the Jordan
decomposition theorem. Also prove that in case
ν = ν1 − ν2 ,
where ν1 , ν2 are positive measures, then ν+ ≤ ν1 and ν− ≤ ν2 .
Exercise 11.5. Let ν be a signed measure on (X, M). Prove the following for every B ∈ M:
(a) ν + (B) = sup{ν(A) : A ∈ M, A ⊂ B},
(b) |ν|(B) = sup{ν(A1 ) − ν(A2 ) : A1 , A2 ∈ M, B = A1 ∪ A2 , A1 ∩ A2 = ∅},
(c) If ν is σ-finite then |ν|(B) = sup{ ∫_B f dν : f ∈ L1 (|ν|), |f | ≤ 1}.
Definition 11.7. Suppose that µ and ν are signed measures on M. We say that ν is absolutely
continuous with respect to µ, written ν ≪ µ if ν(E) = 0 for any E such that |µ|(E) = 0.
Exercise 11.6. Prove that ν ≪ |ν| and that ν ≪ µ if and only if |ν| ≪ |µ|.
Exercise 11.7 (*). Prove that in case µ and ν are signed measures on M such that ν ≪ µ and
such that |ν|(E) < ∞ for all sets E with |µ|(E) < ∞, then for any ε > 0 there is δ > 0 such that
|ν|(E) < ε for any E ∈ M with |µ|(E) < δ. (Hint: Assume this is not true, then there is ε > 0
and a sequence En s.t. |µ|(En ) < 1/2^n and |ν|(En ) ≥ ε. Let E = ∩_{n=1}^∞ (∪_{j=n}^∞ Ej ), and show that
|µ|(E) = 0 but |ν|(E) ≥ ε.)
Exercise 11.8. Prove that in case f ∈ L1 (µ) where µ is a measure, then the measure ν defined
by
\[
\nu(E) = \int_E f \, d\mu = \int f \chi_E \, d\mu
\]
is a signed measure which is absolutely continuous with respect to µ.
Remark 11.9. The reason for the fact that we can not take f ∈ L1 (µ) is simply because we may have
|ν|(X) = ∞. The integral should of-course be interpreted as ±∞ in those cases that correspond to sets
E with ν(E) = ±∞. Note also that there is either a lower or an upper bound on ν, since either ν + (E)
or ν − (E) is finite, and we have −ν − (E) ≤ ν(E) ≤ ν + (E) for all E. That is why we can take one of the
functions f + or f − integrable.
Proof. We first reduce the problem to the case when µ, ν are finite. Let X be the union of the disjoint
measurable sets Xj with finite µ- and |ν|-measure, which is easily seen to be possible. Suppose the
theorem holds for each subset Xn so that there are functions fn such that
\[
\nu(E \cap X_n) = \int_{E \cap X_n} f_n \, d\mu
\]
holds for all measurable sets E ⊂ X. Then if we define f (x) = fn (x) if x ∈ Xn , which makes sense since
the sets are disjoint, then we get for every set E with |ν|(E) < ∞:
\[
\sum_{n=1}^{\infty} \int_{E \cap X_n} |f| \, d\mu = \sum_{n=1}^{\infty} |\nu|(E \cap X_n) = |\nu|(E) < \infty.
\]
Hence it is easy to see that f is µ-integrable on E, and that f has the desired properties. We furthermore
may reduce to the case that ν is positive, since we may otherwise treat ν + and ν − separately.
As for the uniqueness part, if f, g are two different functions satisfying the theorem, then
\[
\int_E g \, d\mu = \int_E f \, d\mu
\]
holds for every E, and this implies that the functions are equal a.e. (Simply choose for instance E =
{x ∈ X : f (x) − g(x) ≥ 1/n} to see that it must have measure zero . . . ).
Now we shall prove the existence in the case of finite positive measures which finishes the proof. To
this end let Φ denote the set of all positive functions g for which
\[
\int_E g \, d\mu \le \nu(E) \quad \text{for all } E \in \mathcal{M}.
\]
Let furthermore
\[
\alpha = \sup \left\{ \int g \, d\mu : g \in \Phi \right\}.
\]
Then there is a sequence gn in Φ such that ∫ gn dµ → α as n → ∞. Let hn = ∨_{j=1}^n gj . If we for a
fixed n define the sets E1 = {x ∈ X : hn (x) = g1 (x)} and then inductively Ej = {x ∈ X : hn (x) =
gj (x)} \ ∪_{i=1}^{j−1} Ei , then by construction the sets Ej are disjoint, and for any E we get
\[
\int_E h_n \, d\mu = \sum_{j=1}^{n} \int_{E \cap E_j} g_j \, d\mu \le \sum_{j=1}^{n} \nu(E \cap E_j) = \nu(E).
\]
Therefore we see that also each hn ∈ Φ and this is an increasing sequence. Hence hn ↗ h, and by the
monotone convergence theorem h ∈ Φ with
\[
\int h \, d\mu = \alpha.
\]
Since h ∈ L1 (µ) we may assume that it is finite-valued everywhere. If we now consider the measure η
defined by
\[
\eta(E) = \nu(E) - \int_E h \, d\mu,
\]
then by construction this is a positive measure, and the proof will be finished if we prove that η = 0.
Suppose this is not the case. Define the signed measures
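\[
\eta_n = \eta - \frac{1}{n}\mu.
\]
Let An ∪ Bn be a Hahn decomposition of X w.r.t. ηn , and put
\[
A_0 = \bigcup_{n=1}^{\infty} A_n, \qquad B_0 = \bigcap_{n=1}^{\infty} B_n.
\]
Since B0 ⊂ Bn for each n we have
\[
0 \le \eta(B_0) \le \frac{1}{n}\mu(B_0) \to 0 \quad \text{as } n \to \infty.
\]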
Hence η(B0 ) = 0, and since η ̸= 0 and A0 ∩ B0 = ∅, A0 ∪ B0 = X we must have η(A0 ) > 0. Therefore,
since η is absolutely continuous with respect to µ we must also have µ(A0 ) > 0, and by continuity from
below for measures this implies that µ(An ) > 0 for some n. Hence
\[
\frac{1}{n}\mu(E \cap A_n) \le \eta(E \cap A_n) = \nu(E \cap A_n) - \int_{E \cap A_n} h \, d\mu.
\]
If we now define g = h + χAn /n, then we see that g ∈ Φ, but this contradicts the definition of α since
\[
\int g \, d\mu = \int h \, d\mu + \mu(A_n)/n > \alpha.
\]
□
The function f is usually called the Radon-Nikodym derivative of ν with respect to µ and written
\[
\frac{d\nu}{d\mu} = f.
\]
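In the simplest possible setting, a finite set X with all subsets measurable, the Radon-Nikodym derivative is just the quotient of point masses wherever µ charges the point. The sketch below assumes integer point masses and uses hypothetical names (rn_derivative, integrate); it only illustrates the statement and is not the construction used in the proof above.

```python
from fractions import Fraction
from itertools import combinations


def rn_derivative(nu, mu):
    """Density f with nu(E) = sum over x in E of f(x) * mu({x}).

    nu and mu map points to integer point masses, mu nonnegative.
    """
    f = {}
    for x, m in mu.items():
        if m == 0:
            if nu[x] != 0:
                raise ValueError("nu is not absolutely continuous with respect to mu")
            f[x] = Fraction(0)  # arbitrary choice on a mu-null point
        else:
            f[x] = Fraction(nu[x], m)
    return f


def integrate(f, mu, E):
    """Integral of f over E with respect to mu."""
    return sum(f[x] * mu[x] for x in E)


mu = {"a": 2, "b": 4, "c": 0}
nu = {"a": 1, "b": -6, "c": 0}
f = rn_derivative(nu, mu)
subsets = [set(c) for k in range(len(mu) + 1) for c in combinations(mu, k)]
print(f)
```

The value of f on the µ-null point "c" is irrelevant, in line with uniqueness only up to sets of measure zero.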
Exercise 11.9. Prove that in case µ and ν are positive measures on M and ν ≤ µ, then we have
\[
0 \le \frac{d\nu}{d\mu} \le 1 \quad \text{a.e.}
\]
Proof. As in the proof of the Radon-Nikodym theorem this is reduced to the case of finite positive
measures, and the uniqueness is easy. Since ν ≪ µ + ν, and even ν ≤ µ + ν there is some measurable
function f : X → [0, 1] such that
\[
\nu(E) = \int_E f \, d(\mu + \nu),
\]
for all E ∈ M.
Let
\[
A = \{x \in X : f(x) = 1\}, \qquad B = X \setminus A.
\]
Then
\[
\nu(A) = \int_A d\mu + \int_A d\nu = \mu(A) + \nu(A).
\]
Since ν(A) < ∞ it follows that µ(A) = 0. Now define λ(E) = ν(E ∩ A), and ρ(E) = ν(E) − λ(E) =
ν(E ∩ B). Then it only remains to prove that ρ ≪ µ. So suppose µ(E) = 0, then we have
\[
\int_{E \cap B} d\nu = \int_{E \cap B} f \, d\nu,
\]
or what amounts to the same thing
\[
\int_{E \cap B} (1 - f) \, d\nu = 0.
\]
Since 1 − f > 0 in B it follows that ν(E ∩ B) = ρ(E) = 0. □
11.3. Differentiation with respect to a doubling measure in a metric space. In this section
we shall work in the context of a metric space (X, ρ). We assume that µ is a (non-zero) Borel measure
on X. (The reader who feels uncomfortable with this level of generality may below assume that X = Rn
and that µ is Lebesgue measure.) We will use the notation
B(x, r) = {y ∈ X : ρ(x, y) < r}, S(x, r) = {y ∈ X : ρ(x, y) = r}.
Note in particular that B(x, r) is an open ball and S(x, r) a closed sphere. Also note that ∂B(x, r) ⊂
S(x, r), but there are cases where this is not an equality. For instance if we look at X = N with the
usual distance, then B(1, 1) = {1} so that the boundary is empty, but S(1, 1) = {2}. First we need a
covering theorem, which is a geometric rather than a measure-theoretic result. The result is true in any
separable metric space.
Theorem 11.11 (Vitali covering theorem). Let (X, ρ) be a separable metric space, and let
{B(xj , rj ) : j ∈ J}
be an arbitrary collection of balls in X such that
R = sup {rj : j ∈ J} < ∞.
Then there exists a (at most) countable subcollection
{B(xj , rj ) : j ∈ J ′ }, J′ ⊂ J
of balls from the original collection which are disjoint and satisfy
\[
\bigcup_{j \in J} B(x_j, r_j) \subset \bigcup_{j \in J'} B(x_j, 5r_j).
\]
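For a finite collection of balls the selection can be done greedily: repeatedly pick a ball of largest remaining radius that is disjoint from those already chosen. The sketch below does this for open intervals in R (balls in the metric space R); it is an illustration under the finiteness assumption only, since the general theorem for arbitrary collections needs a more careful choice, and the names vitali_select and is_covered are hypothetical.

```python
def vitali_select(balls):
    """Greedy disjoint subfamily of a FINITE list of open balls (center, radius) in R."""
    chosen = []
    for x, r in sorted(balls, key=lambda b: b[1], reverse=True):
        # Two open balls in R are disjoint iff the centers are at distance >= r + s.
        if all(abs(x - y) >= r + s for y, s in chosen):
            chosen.append((x, r))
    return chosen


def is_covered(ball, chosen, factor=5):
    """True if ball is contained in some enlarged ball B(y, factor * s) with (y, s) chosen."""
    x, r = ball
    return any(abs(x - y) + r <= factor * s for y, s in chosen)


balls = [(0.0, 1.0), (0.5, 0.4), (1.8, 1.0), (3.0, 0.3), (3.1, 0.2), (-2.0, 0.5), (5.0, 2.0)]
chosen = vitali_select(balls)
print(chosen)
```

A ball that is skipped meets an already chosen ball of at least the same radius, which is why enlarging the chosen balls by a fixed factor covers everything.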
In particular this is satisfied for Lebesgue measure in R^N with the constant Cµ = 2^N . Since the
proofs of the theorems in this section are essentially unchanged (if we add some simplifying assumptions)
for the more general case we have decided to treat it here, because it is of importance in many recent
developments of analysis in metric spaces, where such doubling conditions are often crucial.
Remark 11.13. It should be remarked, although outside the scope of these notes, that not every metric
space can have a (non-zero) doubling measure on it. Indeed, for complete spaces, such a measure exists
if and only if the metric space is doubling, which by definition means that there is a constant C ′ such
that any ball of radius 2r in X can be covered by C ′ balls of radius r.
We also assume the following for simplicity:
For the rest of this section (X, ρ) is a metric space, and µ a doubling measure with doubling
constant Cµ . We will also assume that X is separable, and µ(S(x, r)) = 0 for all x, r.
The first assumption is actually redundant in this situation, because the condition that all balls have
strictly positive and finite measure will force X to be separable. The second assumption may however
fail even if there is a doubling measure on X, but most interesting examples would satisfy this.
We introduce the notation
\[
A_r f(x) = \frac{1}{\mu(B(x, r))} \int_{B(x, r)} f \, d\mu
\]
for averages over balls.
Exercise 11.10 (**). Prove that for any f ∈ L1 (µ) the function g(x, r) = Ar f (x) is (jointly)
continuous in (x, r). (Hint: Prove first that if xn → x and rn → r then µ(B(xn , rn )) → µ(B(x, r)).
Note also that we use µ(S(x, r)) = 0 here.)
Exercise 11.11 (**). Prove that if X is a separable metric space, and µ is a Borel measure
on X for which every open ball has finite measure, then for any f ∈ L1 (µ) and ε > 0 there is
an integrable continuous function g on X such that ∫ |f − g|dµ < ε. (Hint: Consider the class
H ⊂ C(X) of continuous functions on X with bounded support (i.e. which are zero outside some
ball), then these form a vector lattice. The integral Iµ g = ∫ gdµ for these is an elementary
integral. If we start from (X, H, Iµ ) as our integration space, then we get a space L, which
contains H. Prove that for any open ball B ⊂ X we have χB ∈ L+ , and use this to prove that
indeed L1 (µ) ⊂ L.)
Proof. Let Ea = {x ∈ X : M f (x) > a}. For each x ∈ Ea we can choose rx such that Arx |f |(x) > a
by continuity of the function Ar |f |(x). By the Vitali covering theorem there is a sequence of points
x1 , x2 , x3 , . . . such that the balls B(x1 , rx1 ), B(x2 , rx2 ), B(x3 , rx3 ), . . . are disjoint,
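and such that B(x1 , 5rx1 ), B(x2 , 5rx2 ), B(x3 , 5rx3 ), . . . cover Ea . Hence
\[
\begin{aligned}
\mu(E_a) &\le \sum_{n=1}^{\infty} \mu(B(x_n, 5r_{x_n})) \\
&\le \sum_{n=1}^{\infty} \mu(B(x_n, 8r_{x_n})) \\
&\le C_\mu^3 \sum_{n=1}^{\infty} \mu(B(x_n, r_{x_n})) \\
&\le C_\mu^3 a^{-1} \sum_{n=1}^{\infty} \int_{B(x_n, r_{x_n})} |f| \, d\mu \\
&\le C_\mu^3 a^{-1} \int |f| \, d\mu.
\end{aligned}
\]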
Theorem 11.16 (Lebesgue differentiation theorem). Suppose f ∈ L1 (µ), then for almost every
x we have
\[
\lim_{r \to 0} \frac{1}{\mu(B(x, r))} \int_{B(x, r)} f \, d\mu = f(x).
\]
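A small exact computation shows what the theorem says and why "almost every" is needed. In this sketch (our own example, with the hypothetical name ball_average) we take X = R with Lebesgue measure and f the indicator function of [0, 1/2], so that the average over B(x, r) is the length of the overlap divided by 2r.

```python
from fractions import Fraction


def ball_average(x, r, a=Fraction(0), b=Fraction(1, 2)):
    """Average over (x - r, x + r) of the indicator of [a, b], Lebesgue measure on R."""
    overlap = max(Fraction(0), min(x + r, b) - max(x - r, a))
    return overlap / (2 * r)


radii = [Fraction(1, 10 ** k) for k in range(1, 6)]
inside = [ball_average(Fraction(1, 4), r) for r in radii]    # f(1/4) = 1
outside = [ball_average(Fraction(3, 4), r) for r in radii]   # f(3/4) = 0
boundary = [ball_average(Fraction(1, 2), r) for r in radii]  # f(1/2) = 1
print(inside[-1], outside[-1], boundary[-1])
```

At the boundary point x = 1/2 the averages equal 1/2 for every small r, so the limit exists but differs from f(1/2); such points form a set of measure zero here.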
Then
\[
\begin{aligned}
\limsup_{r \to 0} |A_r f(x) - f(x)| &= \limsup_{r \to 0} |A_r(f - g)(x) + (A_r g - g)(x) + (g - f)(x)| \\
&\le M(f - g)(x) + 0 + |f - g|(x).
\end{aligned}
\]
So if we put
\[
E_a = \{x \in X : \limsup_{r \to 0} |A_r f(x) - f(x)| > a\}, \qquad F_a = \{x \in X : |f(x) - g(x)| > a\},
\]
then
Ea ⊂ Fa/2 ∪ {x ∈ X : M (f − g)(x) > a/2}.
But
\[
\frac{a}{2} \mu(F_{a/2}) \le \int_{F_{a/2}} |f - g| \, d\mu < \varepsilon,
\]
The set of points x for which lim_{r→0} (1/µ(B(x, r))) ∫_{B(x,r)} |f − f (x)|dµ = 0 is called the Lebesgue set of f .
Proof. For each rational number q put gq (y) = |f (y) − q|. Then according to the above theorem we have
\[
\lim_{r \to 0} \frac{1}{\mu(B(x, r))} \int_{B(x, r)} g_q(y) \, d\mu(y) = g_q(x)
\]
Let E = ∩q∈Q Dq (i.e. E c = ∪q∈Q Dqc which has measure zero). Then we may for any x ∈ E and ε > 0
choose q ∈ Q such that |f (x) − q| < ε, and hence |f (y) − f (x)| ≤ |f (y) − q| + ε. Therefore
\[
\lim_{r \to 0} \frac{1}{\mu(B(x, r))} \int_{B(x, r)} |f(y) - f(x)| \, d\mu(y) < |f(x) - q| + \varepsilon < 2\varepsilon.
\]
11.4. The one-dimensional case. Here we will simply state without proof some results about the
one-dimensional case, when X ⊂ R is equipped with Lebesgue measure m.
Definition 11.18 (Functions of bounded variation on the real line). Suppose F : [a, b] → R, then
we define the total variation TF : [a, b] → [0, ∞] of F over [a, b]:
\[
T_F(x) = \sup \left\{ \sum_{j=1}^{k} |F(x_j) - F(x_{j-1})| : a = x_0 < x_1 < \ldots < x_k = x \right\}.
\]
If TF (b) is finite, then F is said to be of bounded variation on [a, b], written F ∈ BV ([a, b]).
We also define similarly for F : R → R the total variation TF : R → R by
\[
T_F(x) = \sup \left\{ \sum_{j=1}^{k} |F(x_j) - F(x_{j-1})| : -\infty < x_0 < x_1 < \ldots < x_k = x \right\},
\]
and if
\[
T_F(\infty) := \lim_{x \to \infty} T_F(x) < \infty
\]
then we say that F ∈ BV (R).
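The supremum over partitions in the definition matters: a coarse partition can miss most of the oscillation. As a hedged numerical sketch (our own example, floating point rather than exact), take F = sin on [0, 2π], whose total variation is 4; the helper names variation and uniform_partition are hypothetical.

```python
import math


def variation(F, partition):
    """Sum of |F(x_j) - F(x_{j-1})| over consecutive points of the partition."""
    return sum(abs(F(b) - F(a)) for a, b in zip(partition, partition[1:]))


def uniform_partition(a, b, k):
    """k + 1 equally spaced points a = x_0 < ... < x_k = b."""
    return [a + (b - a) * j / k for j in range(k + 1)]


coarse = variation(math.sin, [0.0, math.pi, 2 * math.pi])
fine = variation(math.sin, uniform_partition(0.0, 2 * math.pi, 400))
print(coarse, fine)
```

The coarse partition only samples zeros of sin and sees essentially no variation, while the fine partition contains the turning points π/2 and 3π/2 (up to rounding) and recovers the full value.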
Lemma 11.19. If F : [a, b] → R is increasing and we define G(x) = limy→x+ F (y), then G is
increasing and right continuous. Furthermore:
(a) The set of points where F and G are discontinuous is at most countable,
(b) F and G are differentiable a.e. and F ′ = G′ a.e.
We also introduce the space N BV (R) to denote the set of all functions in BV (R) which are right
continuous with limx→−∞ F (x) = 0.
Then we have the following characterization of finite signed measures on R.
Theorem 11.20. If µF is a finite signed measure on the Borel sets in R and we define F (x) =
µF ((−∞, x]), then F ∈ N BV (R).
Conversely if F ∈ N BV (R), then there is a unique measure µF such that
F (x) = µF ((−∞, x]).
One can fairly easily prove that any absolutely continuous function is of bounded variation, and that
it can be decomposed as a difference of two increasing functions which are also absolutely continuous.
Finally we have the following version of the fundamental theorem of calculus for the Lebesgue integral:
Remark 11.23. As a side remark, not quite related to differentiation: a bounded function
f : [a, b] → R is Riemann integrable if and only if it is continuous m-a.e.
12. Lp -spaces(*)
12.1. Measurable functions.
As usual we let (X, M, µ) denote a fixed measure space (µ ≥ 0). In this section we will by a measurable
function f mean an extended real valued function f defined on X \A where µ(A) = 0 and f is measurable
as a function on X \ A. We also will write f = g to mean that they are equal almost everywhere with
respect to µ on X (i.e. everything “a.e.” in this section). Note in particular that with these conventions
the integrable functions form a vector space.
Definition 12.1 (Lp -norm). For 1 ≤ p < ∞ we define for a measurable function f
\[
\|f\|_p = \left( \int |f|^p \, d\mu \right)^{1/p}.
\]
We also define
||f ||∞ = inf {α ∈ [0, ∞] : µ({x : |f (x)| > α}) = 0} .
This is usually called the essential supremum of |f |, and it means that a.e. |f | ≤ ||f ||∞ .
Remark 12.2. The fact that ∥ · ∥∞ is not defined through an integral, but the rest of the Lp -norms are
indicates that they behave somewhat differently. For instance in a metric space setting one can prove that
continuous functions are dense in Lp (µ) for p ∈ [1, ∞), but if for instance the measure of every open set
is non-zero, then the continuous functions form a closed subspace of L∞ (µ) and the L∞ -norm restricted
to these is the same as the supremum-norm (simply because if we change a continuous function on a set
of measure zero it can not be continuous any more).
The reason for this definition is simply that the “norm” ∥·∥ is only a semi-norm on Lp since it does not
distinguish between functions which are equal almost everywhere. We however always think of elements
in Lp (µ) as functions defined almost everywhere. We then have that || · ||p is a norm on Lp (µ) for each
p, i.e.:
(a) ||f ||p ≥ 0 with equality if and only if f = 0 a.e.,
(b) ||kf ||p = |k|||f ||p for all k ∈ R and f ∈ Lp (µ),
(c) ||f + g||p ≤ ||f ||p + ||g||p for all f, g ∈ Lp (µ).
In particular we see that Lp (µ) is a vector space. The triangle inequality is in this setting usually called
Minkowski’s inequality, and will be discussed below.
Exercise 12.1 (*). Prove that ∥ · ∥p satisfies (a) and (b) above.
For a given p ∈ [1, ∞] we introduce the dual exponent q ∈ [1, ∞] (also commonly denoted p′ ) such
that
\[
\frac{1}{p} + \frac{1}{q} = 1.
\]
In case p = 1 then we interpret this as that q = ∞ and vice versa. Note that p is then the dual exponent
of q. Below p, q will always denote the dual exponents to each other.
Proof. The case p = 1 and p = ∞ are easy and left to the reader. We now assume that 1 < p < ∞. Also
the inequality is trivially satisfied unless ∥f ∥p > 0 and ∥g∥q > 0, which we henceforth assume.
Note that if a, b ≥ 0 then
∥(af )(bg)∥1 = ab∥f g∥1 , ∥af ∥p ∥bg∥q = ab∥f ∥p ∥g∥q
and hence we may assume that
∥f ∥p = ∥g∥q = 1.
Now we re-write the function |f (x)g(x)| as follows:
\[
|f(x)g(x)| = (|f(x)|^p)^{1/p} (|g(x)|^q)^{1/q} = (|f(x)|^p)^{1/p} (|g(x)|^q)^{1 - 1/p} = \left( \frac{|f(x)|^p}{|g(x)|^q} \right)^{1/p} |g(x)|^q.
\]
Now we invoke the inequality from the above exercise with t = |f (x)|p /|g(x)|q to get
\[
\left( \frac{|f(x)|^p}{|g(x)|^q} \right)^{1/p} |g(x)|^q \le \left( \frac{1}{p} \frac{|f(x)|^p}{|g(x)|^q} + \frac{1}{q} \right) |g(x)|^q = \frac{1}{p} |f(x)|^p + \frac{1}{q} |g(x)|^q.
\]
Hence
\[
\int |fg| \, d\mu \le \frac{1}{p} \int |f|^p \, d\mu + \frac{1}{q} \int |g|^q \, d\mu = \frac{1}{p} + \frac{1}{q} = 1.
\]
□
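The inequality just proved, ∥f g∥1 ≤ ∥f ∥p ∥g∥q , can be checked numerically on a finite measure space, together with the equality case g = sgn(f )|f |^{p−1}. This is a floating point sketch with made-up data and hypothetical helper names (lp_norm, l1_of_product); the weights w play the role of the point masses of µ.

```python
import math


def lp_norm(f, w, p):
    """(sum |f_i|**p * w_i) ** (1/p) for a finitely supported measure with masses w."""
    return sum(abs(v) ** p * m for v, m in zip(f, w)) ** (1.0 / p)


def l1_of_product(f, g, w):
    """Integral of |f g| with respect to the measure with masses w."""
    return sum(abs(u * v) * m for u, v, m in zip(f, g, w))


p = 3.0
q = p / (p - 1.0)  # dual exponent, 1/p + 1/q = 1
w = [0.5, 1.0, 2.0, 0.25]
f = [1.0, -2.0, 0.5, 3.0]
g = [0.3, 1.5, -1.0, 0.0]

lhs = l1_of_product(f, g, w)
rhs = lp_norm(f, w, p) * lp_norm(g, w, q)

# Equality case: g_eq = sgn(f) * |f|**(p - 1).
g_eq = [math.copysign(abs(v) ** (p - 1.0), v) for v in f]
lhs_eq = l1_of_product(f, g_eq, w)
rhs_eq = lp_norm(f, w, p) * lp_norm(g_eq, w, q)
print(lhs, rhs, lhs_eq, rhs_eq)
```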
Proof. The result is trivial if ∥f + g∥p = 0, so we assume that it is strictly positive. Note that
|f + g|p ≤ (|f | + |g|)|f + g|p−1 ,
so by Hölder’s inequality and the equality (p − 1)q = p we get
\[
\int |f + g|^p \, d\mu \le \int |f| |f + g|^{p-1} \, d\mu + \int |g| |f + g|^{p-1} \, d\mu
\]
Hence
\[
\|f + g\|_p = \left( \int |f + g|^p \right)^{1 - 1/q} \le \|f\|_p + \|g\|_p.
\]
Recall that this simply means that Lp (µ) as a metric space is complete. I.e. if fk is a sequence in
Lp (µ) such that
||fn − fk ||p → 0 as n, k → ∞,
then there is a function f ∈ Lp (µ) such that
||f − fk ||p → 0 as k → ∞.
Proof. Suppose fk is a Cauchy sequence in Lp (µ). Suppose we prove that fk is then also Cauchy
in measure, then we know (according to Exercise 9.4) that there is some subsequence fkj and some
function f such that fkj → f a.e. We may then apply Fatou’s lemma to conclude that
\[
\int |f|^p \, d\mu \le \liminf_{j \to \infty} \int |f_{k_j}|^p \, d\mu.
\]
The right hand side must be finite since ∥fkj ∥p ≤ ∥fkj − fl ∥p + ∥fl ∥p holds for all l, and since the first
expression on the right hand side is assumed to go towards zero as kj , l goes to infinity there must be a
bound on the integrals. Applying Fatou’s lemma again gives us
\[
\int |f - f_k|^p \le \liminf_{j \to \infty} \int |f_{k_j} - f_k|^p.
\]
12.2. Linear functionals on Lp . By Hölder’s inequality we have that for any g ∈ Lq (µ) the linear
functional
\[
F_g(f) := \int fg \, d\mu
\]
satisfies
||Fg || = inf{C : |Fg (f )| ≤ C||f ||p for all f ∈ Lp (µ)} ≤ ||g||q .
We actually have ||Fg || = ||g||q (see Exercise 12.5).
Exercise 12.4. Prove that if a measurable function f on X belongs to Lp (µ) (1 ≤ p < ∞) then
\[
\|f\|_p = \sup_{g \in L^q(\mu), \ \|g\|_q = 1} \int fg \, d\mu.
\]
Prove also that in case X is σ-finite, then the result holds also for p = ∞. (Hint: consider
the function g = sgn(f )|f |^{p−1} ∥f ∥_p^{−p/q} where the sign function sgn(x) is 0 if x = 0 and x/|x|
otherwise.)
In case X is σ-finite then indeed a measurable function f belongs to Lp (µ) if and only if
\[
\sup_{g \in L^q(\mu), \ \|g\|_q = 1} \int fg \, d\mu < \infty.
\]
Exercise 12.5. Prove that Fg is a continuous linear functional on Lp (µ) for each g ∈ Lq (µ), and
prove that ∥Fg ∥ = ∥g∥q (in case q = ∞ we assume that X is σ-finite for the last equality).
Theorem 12.7. Suppose (X, M, µ) is σ-finite, and 1 ≤ p < ∞. Then every continuous linear
functional F on Lp (µ) is of the form Fg for some unique element g ∈ Lq (µ).
Exercise 12.6. Prove that in case Theorem 12.7 holds for every finite measure space (X, M, µ)
then it follows that it also holds for every σ-finite X. (Hint: Write X as a countable increasing
union of measurable sets Xn with finite measure. Let gn be the corresponding function from the
theorem defined on Xn . Show that if n ≤ m then gn = gm in Xn , so that we may define g = gn
on Xn , which gives a measurable function on X. Then apply the monotone convergence theorem
a couple of times to draw the necessary conclusions.)
Proof. The uniqueness is clear, and by the above exercise what remains is to prove the existence of g for
a given functional F in case µ(X) < ∞. Let
ν(E) = F (χE ).
This is then well defined and by linearity and continuity it is easy to see that it indeed is a signed (finite)
measure on X (note that if E1 ⊂ E2 ⊂ E3 ⊂ · · · and E = ∪_{i=1}^∞ Ei , then χEn → χE in Lp (µ) by monotone
convergence for any p ∈ [1, ∞) but typically NOT for p = ∞). Furthermore by definition of the space
Lp we must have that µ(E) = 0 gives F (χE ) = 0 (since F must give the same value for all functions
equal a.e.). Therefore ν is absolutely continuous with respect to µ, and hence of the form
\[
\nu(E) = \int_E g \, d\mu.
\]
for all integrable simple functions on X. Suppose now that h is a bounded measurable function. Then
there is a sequence of simple functions hn converging pointwise to h such that |hn | ≤ ∥h∥∞ and therefore
we may apply the dominated convergence theorem to conclude that
\[
\int h_n g \, d\mu \to \int hg \, d\mu,
\]
and by our assumptions, since F is continuous, F (hn ) → F (h) (because again by dominated convergence
hn converges to h in Lp (µ)). Therefore equation (4) holds for all bounded measurable functions h. Now
for a given f ∈ Lp (µ) let
fn (x) = |f (x)|sgn(g(x))χ{x∈X:|f (x)|≤n} .
Then ∥fn ∥p ≤ ∥f ∥p , and therefore
|F (fn )| ≤ ∥F ∥∥fn ∥p ≤ ∥F ∥∥f ∥p .
Since fn g ≥ 0 and converges pointwise to |f g| we may apply Fatou’s lemma and we get
\[
\int |fg| \, d\mu \le \liminf_{n \to \infty} \int f_n g \, d\mu = \liminf_{n \to \infty} F(f_n) \le \|F\| \|f\|_p.
\]
If we use the result of Exercise 12.4 we see that this implies that indeed g ∈ Lq (µ). To finish the proof
it is enough to note that for any h ∈ Lp (µ) there is a sequence of bounded functions hn ∈ Lp (µ) which
converges to h in Lp (µ), and by Hölder’s inequality this implies that indeed
∥hn g − hg∥1 ≤ ∥hn − h∥p ∥g∥q ,
which shows that indeed we have
\[
F(h) = \int hg \, d\mu \quad \text{for all } h \in L^p(\mu).
\]
Exercise 12.7. Let 1 ≤ p1 < p2 ≤ ∞. Give an example of a function f on (0, ∞) with Lebesgue
measure such that f ∈ Lp (µ) if and only if p1 < p < p2 . (Hint: Consider a function of the form
f (x) = x−a | log x|b for suitable a, b.)
Exercise 13.1 (**). Prove that ∥ · ∥s is a norm on C(K) and that fn → f in this norm is the
same as uniform convergence.
Lemma 13.1. If (K, ρ) is a compact metric space, then it is separable (i.e. it contains a countable
dense subset).
Proof. To see this, for each n the balls {B(x, 1/n)}x∈K obviously form an open cover of K, and hence it
has a finite subcover B(x_1^n , 1/n), . . . , B(x_{m_n}^n , 1/n). The collection of all x_j^n is countable, and if we relabel
them in a suitable way to x1 , x2 , . . . it is easy to see that we have a countable dense subset of K. □
Lemma 13.2. If (K, ρ) is a compact metric space, then it is second countable (i.e. there is a
countable collection of balls B(xn , rn ) such that any open set is a union of elements from this
collection).
Proof. If O is open, then by definition this means that for every x ∈ O there is a ball B(x, r) ⊂ O.
Now we may choose one of our points xn from the previous lemma and a rational number qn such that
|x − xn | < qn < r/3. Then it follows that x ∈ B(xn , qn ) ⊂ B(x, r) ⊂ O. Hence the countable collection
B(xn , q) with x1 , x2 , x3 , . . . countable and dense and q rational satisfies the statement. □
Theorem 13.3 (Dini). Suppose fn is a sequence of functions in C(K) such that fn ↘ 0 pointwise
on K. Then the convergence is uniform (i.e. ∥fn ∥s ↘ 0).
Proof. Given ε > 0, what we need to prove is that there is N such that fn < ε for all n ≥ N . To do so
look at the sets Bn = {x ∈ K : fn (x) < ε}. By continuity these sets are open and clearly cover K, since
for any x ∈ K there is an N such that fN (x) < ε by assumption. Hence for any ε > 0 there is a finite
subcover of K by such sets. However, we also have since fn is decreasing that the sets Bn are increasing.
Hence there must be some N such that K ⊂ Bn for all n ≥ N . This is exactly the statement that fn < ε
for all n ≥ N as was to be proved. □
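As a hedged numerical illustration of Dini's theorem (our own example), take K = [0, 1] and fn (x) = x^n (1 − x). These functions are continuous and decrease pointwise to 0 on all of K, so the theorem predicts that the supremum norms tend to 0. The sketch approximates the supremum by a maximum over a uniform grid, which is an assumption of the illustration and not an exact computation; the name sup_norm is hypothetical.

```python
def sup_norm(n: int, grid_size: int = 10001) -> float:
    """Grid approximation of the supremum of x**n * (1 - x) over [0, 1]."""
    step = 1.0 / (grid_size - 1)
    return max((k * step) ** n * (1.0 - k * step) for k in range(grid_size))


exponents = (1, 10, 100, 1000)
norms = [sup_norm(n) for n in exponents]
for n, value in zip(exponents, norms):
    print(f"n = {n:4d}: approximate sup norm = {value:.3e}")
```

By contrast the sequence x^n also decreases pointwise on [0, 1], but its limit is not the zero function at x = 1 (and is discontinuous), so the theorem does not apply and the supremum norm stays equal to 1.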
Theorem 13.4 (Riesz). Suppose I : C(K) → R is positive and linear, then I is automatically
order-continuous and there is a unique finite Borel measure µ such that
\[
I(f) = \int f \, d\mu \quad \text{for all } f \in C(K).
\]
What we say in the theorem is that if I is linear and If ≥ 0 for all f ≥ 0 in C(K), then we
automatically have for any sequence fn ↘ 0 in C(K) that Ifn → 0. Hence I is an elementary integral
on C(K), which is an elementary lattice as noted earlier.
Proof. To prove that I is order-continuous let fn ↘ 0 in C(K). By Dini’s theorem for any ε > 0 there
is N such that fn ≤ ε for all n ≥ N . Note that positivity and linearity give that I is monotone in the
sense that if f ≤ g then If ≤ Ig (since g − f is positive and hence Ig − If = I(g − f ) ≥ 0). Hence
0 ≤ Ifn ≤ Iε = εI1 for all n ≥ N.
Since ε > 0 is arbitrary we see that Ifn → 0.
By Stone’s theorem this means that I = Iµ for some measure µ, which is finite since µ(K) = I1. To
prove that this is a Borel measure (i.e. that all Borel sets are measurable) it is enough to prove
that open sets are µ−measurable (because we know that the µ−measurable sets form a σ−algebra, and
hence if this σ−algebra contains the open sets it must also by definition contain all Borel sets). But we
saw above that any open set O is a countable union of open balls. So by the same type of argument it is
enough to prove that all open balls are µ−measurable. But given an open ball B(x, r), the functions
fn defined below can easily be proved to belong to C(K):
\[
f_n(y) :=
\begin{cases}
1, & |y - x| \le r - 1/n, \\
n(r - |y - x|), & r - 1/n < |y - x| \le r, \\
0, & |y - x| > r.
\end{cases}
\]
These functions satisfy fn ↗ χB(x,r) , so that χB(x,r) ∈ L+ as defined before. Hence B(x, r) is
µ−measurable by definition and the proof is done. □
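As a concrete illustration of the theorem (an example added here, not taken from the notes), fix a point x0 ∈ K, a hypothetical choice, and let I be evaluation at x0 . The sketch below identifies the representing measure and checks it against the ball construction in the proof.

```latex
% Hypothetical example: x_0 is a fixed point of K.
\[
I(f) := f(x_0), \qquad f \in C(K).
\]
% I is linear, and f \ge 0 implies I(f) = f(x_0) \ge 0, so I is positive.
% The representing measure is the Dirac measure at x_0:
\[
\mu = \delta_{x_0}, \qquad
\delta_{x_0}(A) =
\begin{cases}
1, & x_0 \in A, \\
0, & x_0 \notin A,
\end{cases}
\qquad
\int f \, d\delta_{x_0} = f(x_0).
\]
% Check with the functions f_n from the proof, for an open ball B(x, r):
\[
I(f_n) = f_n(x_0) \nearrow \chi_{B(x,r)}(x_0) = \delta_{x_0}\bigl(B(x,r)\bigr).
\]
```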
Since any continuous linear functional F on C(K) can be written in the form F = I⁺ − I⁻, where I⁺ and I⁻
are positive linear functionals, we can prove the following result using the above:
Theorem 13.5 (Riesz). Given a continuous linear functional F on C(K) there is a unique finite
signed Borel measure µ such that
\[ F(f) = \int f \, d\mu \]
holds for all f ∈ C(K). Furthermore µ is positive if and only if F (f ) ≥ 0 for every positive f in
C(K). Also
∥F ∥ = ∥µ∥ = |µ|(K).
That is, the continuous linear functionals on C(K) and the finite signed Borel measures are in one-to-one
correspondence.
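The norm identity can be checked by hand in a simple case. The following sketch is an illustration added here, under the assumption that a and b are two distinct points of K (hypothetical names) and that ρ denotes the metric of K.

```latex
% Hypothetical example: a \ne b are two distinct points of K.
\[
F(f) := f(a) - f(b), \qquad \mu = \delta_a - \delta_b, \qquad
|\mu| = \delta_a + \delta_b, \qquad |\mu|(K) = 2 .
\]
% Upper bound for the norm of F:
\[
|F(f)| \le |f(a)| + |f(b)| \le 2 \|f\|_s .
\]
% The bound is attained. The denominator below is at least \rho(a,b) > 0 by the
% triangle inequality, so f is continuous, and |f| \le 1 with f(a) = 1, f(b) = -1:
\[
f(y) := \frac{\rho(y,b) - \rho(y,a)}{\rho(y,a) + \rho(y,b)}, \qquad
\|f\|_s = 1, \qquad F(f) = 1 - (-1) = 2 .
\]
% Hence \|F\| = 2 = |\mu|(K).
```

Here the decomposition F = I⁺ − I⁻ is given by I⁺ f = f (a) and I⁻ f = f (b).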
14. Recommended textbooks
The main recommended extra source for the course is:
• R. Bass: Real Analysis for Graduate Students. A book that is modern and inexpensive (cheap to
buy and also downloadable from the author’s web page). However, it works only from the Lebesgue
point of view.
14.1. Lebesgue-type of treatment.
• G.B. Folland: Real Analysis. This text has been used as the course text for integration theory at
MAI for a long time and is highly recommended. It contains much more than just integration
theory, and this course would roughly correspond to chapters 1-3 and sections 6.1-6.2. Unfortunately it is
unreasonably expensive these days, and therefore I have decided to switch to the above book,
which for this course seems to be just as good.
• W. Rudin: Real and Complex Analysis. Again this is a very widely used text, and
contains a lot of interesting material apart from integration theory. However it relies very much
on the Riesz representation theorem rather than outer measures to show existence of for instance
Lebesgue measure.
• A. Friedman: Foundations of Modern Analysis. This is a very good, short and inexpensive book,
which takes the reader quickly to the main points. Indeed this is the book I learned integration
theory from to begin with. It also contains a lot of other material on functional analysis, and is
highly recommended. It does not treat the Riesz representation theorem though.
• T. Tao: An Introduction to Measure Theory. Although I am not so familiar with this book it
seems like a very good text for self-study with much motivation. It is also freely downloadable via
the author’s home page.
14.2. Daniell-type treatment (or both).
• G.S. Shilov and B.E. Gurevich: Integral, Measure and Derivative. A very nice text which works
from the Daniell point of view.
• A.E. Taylor: General Theory of Functions and Integration. A book which covers both approaches
to integration, and perhaps the one that would be most suitable if you want a book that covers
essentially everything in these notes.
• H. Federer: Geometric Measure Theory. A very comprehensive book, but quite challenging to read.
It treats both alternatives to integration theory (in chapter 2). This is the standard reference work
also for deeper material on “lower-dimensional” measures, covering theorems, differentiation, etc.,
but I would not regard it as very suitable for this course since it is not easy!