Measure Theory

Notes for Pure Math 451


Kenneth R. Davidson
University of Waterloo
CONTENTS
Chapter 1. Measures
1.1. Introduction
1.2. σ-Algebras
1.3. Construction of Measures
1.4. Premeasures
1.5. Lebesgue-Stieltjes measures on R

Chapter 2. Functions
2.1. Measurable Functions
2.2. Simple Functions
2.3. Two Theorems about Measurable Functions

Chapter 3. Integration
3.1. Integrating positive functions
3.2. Two limit theorems
3.3. Integrating complex valued functions
3.4. The space L^1(µ)
3.5. Comparison with the Riemann Integral
3.6. Product Measures
3.7. Product Integration
3.8. Lebesgue Measure on R^n
3.9. Infinite Product Measures

Chapter 4. Differentiation and Signed Measures
4.1. Differentiation
4.2. Signed Measures
4.3. Decomposing Measures

Chapter 5. L^p spaces
5.1. L^p as a Banach space
5.2. Duality for Normed Vector Spaces
5.3. Duality for L^p

Chapter 6. Some Topology
6.1. Topological spaces
6.2. Continuity
6.3. Compactness
6.4. Separation Properties
6.5. Nets*

Chapter 7. Functionals on C_c(X) and C_0(X)
7.1. Radon measures
7.2. Positive functionals on C_c(X)
7.3. Linear functionals on C_0(X)

Chapter 8. A Taste of Probability*
8.1. The language of probability
8.2. Independence
8.3. The Law of Large Numbers

Bibliography
Index
CHAPTER 1

Measures

1.1. Introduction
The idea of Riemann integration is the following: convergence of the upper
and lower sums to the same limit is required for Riemann integrability. The integral
is approximated by Riemann sums
\[ \int_a^b f(x)\,dx \approx \sum_{i=1}^n L_i (t_i - t_{i-1}). \]

FIGURE 1.1. Riemann integral

The function f(x) has to be ‘almost continuous’ in order for this to succeed, which is too restrictive.
Moreover, there are no good limit theorems. (Again, one needs uniform convergence
for most results.)
Lebesgue’s approach was to chop up the range instead. Then one gets the
approximations
\[ \int_a^b f(x)\,dx \approx \sum_{i=1}^n y_i \,\big|\{x : y_{i-1} < f(x) \le y_i\}\big|. \]

FIGURE 1.2. Lebesgue integral

The notation |A| is meant to convey the length of a set A. The complication is that, even for
continuous functions, the sets $\{x : y_{i-1} < f(x) \le y_i\}$ need be neither open nor closed,
and can be complicated. Each is a Borel set, and that will be seen to be good enough.
Lebesgue’s idea was to extend the notion of length or measure from open intervals
to a function m on a much larger class of sets so that
(1) m((a, b)) = b − a
(2) m(A + x) = m(A) for sets A and x ∈ R (translation invariance)
(3) if A is the disjoint union $A = \dot{\bigcup}_{n=1}^\infty A_n$, then $m(A) = \sum_{n=1}^\infty m(A_n)$
(countable additivity).
It turns out that it is not possible to define such a function on all subsets of R.
To see this, we define an equivalence relation on [0, 1) by x ∼ y if x − y ∈ Q. Then
use the Axiom of Choice to select a set A which contains exactly one element from
each equivalence class. Enumerate Q ∩ [0, 1) = {0 = r0 , r1 , r2 , . . . } and define

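\[ A_n = A + r_n \pmod 1 = \big( (A + r_n) \cup (A + r_n - 1) \big) \cap [0, 1) \quad\text{for } n \ge 0. \]

Now by translation invariance and finite additivity,

\begin{align*}
m(A) &= m(A + r_n) \\
 &= m\big((A + r_n) \cap [0, 1)\big) + m\big((A + r_n) \cap [1, 2)\big) \\
 &= m\big((A + r_n) \cap [0, 1)\big) + m\big((A + r_n - 1) \cap [0, 1)\big) \\
 &= m(A_n).
\end{align*}

Observe that $A_m \cap A_n = \varnothing$ if $m \ne n$. By construction, $\bigcup_{n\ge0} A_n = [0, 1)$. Hence
by countable additivity,

\[ 1 = m([0, 1)) = \sum_{n=0}^\infty m(A_n) = \sum_{n=0}^\infty m(A). \]

There is no value that can be assigned to m(A) to make sense of this: if m(A) = 0
the sum is 0, and if m(A) > 0 the sum is infinite.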


The way we deal with this is to declare that the set A is not measurable. We
will only assign a value to m(A) for a class of ‘nice’ sets.

1.1.1. REMARK. Banach showed that it is possible to define a finitely additive,
translation invariant function on all subsets of R. However in R^n for n ≥ 3, even
this is not possible if one also asks for invariance under rotations. The Banach-Tarski
paradox shows that if A and B are two
bounded subsets of R^3 with interior, then there is a finite decomposition of each,
$A = \dot{\bigcup}_{i=1}^n A_i$ and $B = \dot{\bigcup}_{i=1}^n B_i$, and rigid motions (a combination of a rotation and
a translation) which carry $A_i$ onto $B_i$. For example, let A be the ball of radius 1
and let B be the ball of radius $10^{10}$. Clearly this does not preserve volume, in
spite of what common sense suggests. This decomposition requires the Axiom of
Choice, and can’t be accomplished by hand.

1.2. σ-Algebras
1.2.1. DEFINITION. If X is a set, an algebra of subsets is a non-empty collection
A of subsets of X, i.e., A ⊂ P(X), which contains the empty set ∅ and is
closed under complements and finite unions. A σ-algebra is an algebra B of subsets
of X which is closed under countable unions. We say that (X, B) is a measure
space.

1.2.2. SIMPLE PROPERTIES OF σ-ALGEBRAS. Let A be an algebra of subsets,
A ⊂ P(X), and let B be a σ-algebra of subsets of X.
(1) A is non-empty, so suppose that E ∈ A. Then $E^c$, ∅ and $X = \varnothing^c$ all
belong to A.
(2) If E, F ∈ A, then E ∪ F, $E \cap F = (E^c \cup F^c)^c$, $E \setminus F = E \cap F^c$ and
$E \triangle F = (E \setminus F) \cup (F \setminus E)$ belong to A.
(3) If $E_n \in B$, then $\bigcap_{n\ge1} E_n = \big(\bigcup_{n\ge1} E_n^c\big)^c \in B$.
(4) If $E_n \in B$, then $\bigcup_{n\ge1} E_n = \dot{\bigcup}_{n\ge1} F_n$ where $F_n = E_n \setminus \bigcup_{i=1}^{n-1} E_i$ are
disjoint elements of B.
(5) If $B_\lambda$ are σ-algebras for λ ∈ Λ, then $B = \bigcap_{\lambda\in\Lambda} B_\lambda$ is a σ-algebra. So if
E ⊂ P(X) is any collection of subsets, the intersection of all σ-algebras
containing E is the unique smallest σ-algebra containing E. This is called
the σ-algebra generated by E.
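
On a finite set X the generated σ-algebra of item (5) can be computed directly, since countable unions reduce to finite ones. The following is a minimal sketch under that finiteness assumption; the function name `generated_sigma_algebra` and the sample generators are illustrative choices, not part of the notes.

```python
from itertools import combinations

def generated_sigma_algebra(X, generators):
    """Close `generators` under complement and pairwise union.

    On a finite set X this closure is the sigma-algebra generated by
    `generators` (intersections follow from De Morgan's laws).
    """
    X = frozenset(X)
    family = {frozenset(), X} | {frozenset(g) for g in generators}
    changed = True
    while changed:
        changed = False
        current = list(family)
        for E in current:                      # close under complements
            if X - E not in family:
                family.add(X - E)
                changed = True
        for E, F in combinations(current, 2):  # close under unions
            if E | F not in family:
                family.add(E | F)
                changed = True
    return family

# Hypothetical example: the atoms turn out to be {1}, {2} and {3, 4}.
sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {1, 2}])
```

Here the generators cannot separate 3 from 4, so the resulting family consists of all unions of the three atoms.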

1.2.3. DEFINITION. If X is a topological space, the Borel sets are the elements
of the σ-algebra BorX or Bor(X) generated by the collection of open subsets of X.
A Gδ set is a countable intersection of open sets. An Fσ set is a countable
union of closed sets.

1.2.4. DEFINITION. A measure on (X, B) is a map µ : B → [0, ∞) ∪ {∞} =: [0, ∞]
such that µ(∅) = 0 and µ is countably additive: if $A = \dot{\bigcup}_{n\ge1} A_n$ where
$A_n \in B$ and $A_n \cap A_m = \varnothing$ if $m \ne n$, then $\mu(A) = \sum_{n\ge1} \mu(A_n)$.
A measure µ is finite if µ(X) < ∞ and σ-finite if $X = \bigcup_{n\ge1} E_n$ with $E_n \in B$ such that
$\mu(E_n) < \infty$ for n ≥ 1. It is called a probability measure if µ(X) = 1. A measure
µ is semi-finite if for every F ∈ B with µ(F) ≠ 0, there is an E ∈ B with E ⊂ F
and 0 < µ(E) < ∞.

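1.2.5. SIMPLE PROPERTIES OF MEASURES. Let µ be a measure on (X, B).
(1) (monotonicity) If E, F ∈ B with E ⊂ F, then
\[ \mu(F) = \mu(E) + \mu(F \setminus E) \ge \mu(E). \]
(2) (subadditivity) If $E_n \in B$, write $E = \bigcup_{n\ge1} E_n = \dot{\bigcup}_{n\ge1} F_n$ where the $F_n \subset E_n$
are the disjoint sets of 1.2.2(4). Then
\[ \mu(E) = \sum_{n\ge1} \mu(F_n) \le \sum_{n\ge1} \mu(E_n). \]
(3) (continuity from below) If $E_n \in B$ with $E_n \subset E_{n+1}$ for n ≥ 1, then
setting $E_0 = \varnothing$,
\begin{align*}
\mu\Big(\bigcup_{n\ge1} E_n\Big) &= \mu\Big(\dot{\bigcup}_{n\ge1} (E_n \setminus E_{n-1})\Big) = \sum_{n\ge1} \mu(E_n \setminus E_{n-1}) \\
&= \lim_{n\to\infty} \sum_{i=1}^n \mu(E_i \setminus E_{i-1}) = \lim_{n\to\infty} \mu(E_n).
\end{align*}
(4) (continuity from above) If $E_n \in B$ with $E_n \supset E_{n+1}$ for n ≥ 0 and $\mu(E_0) < \infty$, then
\[ \mu\Big(\bigcap_{n\ge1} E_n\Big) = \lim_{n\to\infty} \mu(E_n). \]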
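
The finiteness hypothesis in (4) cannot simply be dropped. A standard illustration, stated here as an added example rather than something taken from the notes, uses the counting measure $\mu_c$ on the natural numbers (defined in 1.2.6 below) and the tails $E_n$:

```latex
\[
E_n = \{n, n+1, n+2, \dots\} \subset \mathbb{N}, \qquad
\mu_c(E_n) = \infty \quad\text{for all } n \ge 1,
\]
\[
\mu_c\Big(\bigcap_{n\ge1} E_n\Big) = \mu_c(\varnothing) = 0
\;\ne\; \infty = \lim_{n\to\infty} \mu_c(E_n).
\]
```

The sets decrease to the empty set, yet every one of them has infinite measure, so the limit of the measures is not the measure of the intersection.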

1.2.6. SIMPLE EXAMPLES OF MEASURES. Let X be a non-empty set.
(1) Counting measure on (X, P(X)) is
\[ \mu_c(E) = \begin{cases} |E| & \text{if } E \text{ is finite} \\ \infty & \text{if } E \text{ is infinite.} \end{cases} \]
This measure is semi-finite. It is σ-finite if and only if X is countable.
(2) The point mass on (X, P(X)) for x ∈ X is
\[ \delta_x(E) = \begin{cases} 1 & \text{if } x \in E \\ 0 & \text{if } x \notin E. \end{cases} \]
This is a probability measure.
(3) Define µ on (X, P(X)) by
\[ \mu(E) = \begin{cases} 0 & \text{if } E = \varnothing \\ \infty & \text{if } E \ne \varnothing. \end{cases} \]
This is a valid measure, but it is essentially useless. Note that it is not even semi-finite.
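
The three examples are easy to experiment with on finite sets. The sketch below is only an illustration: it models sets as finite Python sets (so the infinite case of counting measure is not represented), and the names `counting_measure`, `point_mass` and `trivial_infinite` are made up for this example.

```python
import math

def counting_measure(E):
    """Counting measure of a finite Python set."""
    return len(E)

def point_mass(x):
    """Return the point mass delta_x as a function on sets."""
    return lambda E: 1 if x in E else 0

def trivial_infinite(E):
    """Example (3): 0 on the empty set, infinity on every other set."""
    return 0 if len(E) == 0 else math.inf

# Two disjoint sets; additivity says mu(A u B) = mu(A) + mu(B).
A, B = {1, 2}, {3, 4, 5}
delta_3 = point_mass(3)
additive = [mu(A | B) == mu(A) + mu(B)
            for mu in (counting_measure, delta_3, trivial_infinite)]
```

All three entries of `additive` come out true; for the third measure this relies on the convention ∞ + ∞ = ∞.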

1.2.7. DEFINITION. A measure µ on (X, B) is complete if E ∈ B, µ(E) = 0
and F ⊂ E implies that F ∈ B.

1.2.8. THEOREM. If (X, B, µ) is a measure, then
\[ \bar{B} = \{E \cup F : E, N \in B,\ F \subset N,\ \mu(N) = 0\} \]
is a σ-algebra and µ̄(E ∪ F) := µ(E) is a complete measure on (X, B̄) such that
µ̄|B = µ. Moreover this is the smallest σ-algebra containing B on which µ extends
to a complete measure.

PROOF. Note that $(E \cup F)^c = (E \cup N)^c \cup (N \setminus (E \cup F))$. So B̄ is closed under
complements. Also if $E_i \cup F_i \in \bar{B}$ and $F_i \subset N_i$ where $N_i \in B$ and $\mu(N_i) = 0$,
then
\[ \bigcup_{i\ge1} (E_i \cup F_i) = \bigcup_{i\ge1} E_i \cup \bigcup_{i\ge1} F_i = E \cup F \]
where $E = \bigcup_{i\ge1} E_i \in B$ and $F = \bigcup_{i\ge1} F_i \subset \bigcup_{i\ge1} N_i =: N$, with N ∈ B and µ(N) = 0 by subadditivity. So
B̄ is a σ-algebra. Moreover this is clearly the smallest σ-algebra containing B and
all subsets of null sets.
Next we show that µ̄ is well-defined. Suppose that $A = E_1 \cup F_1 = E_2 \cup F_2$
where $E_i \in B$, $F_i \subset N_i$ and $\mu(N_i) = 0$. Let $E = E_1 \cap E_2 \in B$. Then
\[ E \subset E_i \subset A \subset (E_1 \cup N_1) \cap (E_2 \cup N_2) \subset (E_1 \cap E_2) \cup N_1 \cup N_2. \]
Therefore $\mu(E) \le \mu(E_i) \le \mu(E) + \mu(N_1) + \mu(N_2) = \mu(E)$. So $\mu(E_1) = \mu(E_2)$.
Thus $\bar\mu(A) = \mu(E_i)$ is well-defined. In particular, if A ∈ B, then µ̄(A) = µ(A).
So µ̄|B = µ.
To see that µ̄ is countably additive, suppose that $A_i = E_i \cup F_i$ are disjoint, and
$F_i \subset N_i$ where $N_i$ are null sets. Then $A := \dot{\bigcup}_{i\ge1} A_i = \big(\dot{\bigcup}_{i\ge1} E_i\big) \cup \big(\dot{\bigcup}_{i\ge1} F_i\big)$.
Moreover $F = \bigcup_{i\ge1} F_i \subset \bigcup_{i\ge1} N_i =: N$, and µ(N) = 0 by subadditivity. So
\[ \bar\mu(A) = \mu\Big(\dot{\bigcup}_{i\ge1} E_i\Big) = \sum_{i\ge1} \mu(E_i) = \sum_{i\ge1} \bar\mu(A_i). \]
Thus µ̄ is a measure.
If E ∈ B, then µ̄(E) = µ(E), so µ̄|B = µ; i.e. µ̄ extends the definition of µ.
To see that µ̄ is complete, suppose that µ̄(M) = 0 and G ⊂ M. Then M = E ∪ F
where E, N ∈ B, F ⊂ N and µ(N) = 0. Also µ(E) = µ̄(M) = 0. Thus
G ⊂ M ⊂ E ∪ N, and µ(E ∪ N) = 0. Hence G ∈ B̄. Thus µ̄ is a complete
measure.
Clearly B̄ is the smallest σ-algebra containing B and all subsets of null sets.
Moreover it is clear that the only way to extend the definition of µ to B̄ and be a
measure is to set µ̄(F) = 0 when F ⊂ N and µ(N) = 0. Thus, this is the unique
smallest complete measure extending µ.

1.2.9. DEFINITION. If µ is a measure on (X, B), then (X, B̄, µ̄) is called the
completion of µ.

1.2.10. DEFINITION. Let µ be a measure on (X, B). A property about points
in X is true µ-almost everywhere if it is true except on a set of measure 0 (a null
set). Write a.e.(µ) or just a.e. if the measure is understood.

For example, the statement about f : X → R saying that f = 0 a.e.(µ) means
that there is a µ-null set N so that f(x) = 0 for x ∈ X \ N. It does not say that
{x : f(x) ≠ 0} is measurable, only that it is contained in the set N of measure 0.
However if µ is a complete measure, then all subsets of null sets are measurable.
So in this case, it is true that {x : f(x) ≠ 0} is measurable.

1.3. Construction of Measures


We need to develop some machinery so that we can define more interesting
measures such as Lebesgue measure on the real line. Indeed, the construction of
a family of measures related to Lebesgue measure will be our main application of
this machinery at this time.

1.3.1. DEFINITION. Let X be a non-empty set. An outer measure on X is a
map µ∗ : P(X) → [0, ∞] such that
(1) µ∗(∅) = 0.
(2) (monotonicity) if A ⊂ B, then µ∗(A) ≤ µ∗(B).
(3) (sub-additivity) if $\{A_i : i \ge 1\}$ is a countable collection of sets, then
$\mu^*\big(\bigcup_{i\ge1} A_i\big) \le \sum_{i\ge1} \mu^*(A_i)$.

In other words, µ∗ is a monotone, subadditive function on subsets of X. The
following easy proposition shows that outer measures can easily be produced.

1.3.2. PROPOSITION. Suppose that {∅, X} ⊂ E ⊂ P(X) and µ : E → [0, ∞]
is a function with µ(∅) = 0. For A ∈ P(X), define
\[ \mu^*(A) = \inf\Big\{ \sum_{i\ge1} \mu(E_i) : E_i \in E,\ A \subset \bigcup_{i\ge1} E_i \Big\}. \]
Then µ∗ is an outer measure.

PROOF. Note that µ∗(∅) = 0 by definition, and that monotonicity is immediate.
To establish sub-additivity, suppose that $\{A_i : i \ge 1\} \subset P(X)$. There is
nothing to prove unless $\sum_{i\ge1} \mu^*(A_i) < \infty$. In this case, let ε > 0. For each i, find
sets $E_{ij} \in E$ for j ≥ 1 so that $A_i \subset \bigcup_{j\ge1} E_{ij}$ and $\sum_{j\ge1} \mu(E_{ij}) < \mu^*(A_i) + 2^{-i}\varepsilon$.
Then $A = \bigcup_{i\ge1} A_i$ is covered by $\bigcup_{i\ge1} \bigcup_{j\ge1} E_{ij}$ and so
\[ \mu^*(A) \le \sum_{i\ge1} \sum_{j\ge1} \mu(E_{ij}) < \sum_{i\ge1} \big( \mu^*(A_i) + 2^{-i}\varepsilon \big) = \sum_{i\ge1} \mu^*(A_i) + \varepsilon. \]
Since ε > 0 is arbitrary, this establishes the claim. So µ∗ is an outer measure.
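
When X and the family E are finite, the infimum in Proposition 1.3.2 is a minimum over subfamilies of E, so µ∗ can be found by brute force. The sketch below makes that finiteness assumption; the function name `outer_measure` and the weights are hypothetical choices for illustration.

```python
from itertools import combinations
import math

def outer_measure(A, weights):
    """mu*(A): least total weight of a subfamily of `weights` covering A.

    `weights` maps frozensets (the family E) to non-negative numbers.
    Repeating a set in a cover never lowers the total, so searching
    subfamilies is enough when the family is finite.
    """
    A = frozenset(A)
    sets = list(weights)
    best = math.inf
    for k in range(len(sets) + 1):
        for cover in combinations(sets, k):
            if A <= frozenset().union(*cover):
                best = min(best, sum(weights[E] for E in cover))
    return best

# Hypothetical family on X = {1, 2, 3}; note that mu(X) = 3 is "too big".
X = frozenset({1, 2, 3})
weights = {frozenset(): 0, frozenset({1, 2}): 1, frozenset({2, 3}): 1, X: 3}
mu_star_X = outer_measure(X, weights)
```

In this example µ∗(X) = 2 although µ(X) = 3, because two cheaper sets already cover X. This shows concretely that µ∗ need not agree with µ on E.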

1.3.3. EXAMPLE. Let X = R and let E denote the collection of all bounded
open intervals. Define ρ((a, b)) = b − a. This determines an outer measure which
will be used to construct Lebesgue measure.

1.3.4. DEFINITION. Let µ∗ be an outer measure on X. A subset A ⊂ X is
µ∗-measurable if
\[ \mu^*(E) = \mu^*(E \cap A) + \mu^*(E \cap A^c) \quad\text{for all } E \subset X. \]

Note that subadditivity shows that $\mu^*(E) \le \mu^*(E \cap A) + \mu^*(E \cap A^c)$. So we
only need to show that $\mu^*(E) \ge \mu^*(E \cap A) + \mu^*(E \cap A^c)$ when µ∗(E) < ∞. We
are selecting those sets A that split every set E additively.
The main result about outer measures is the following.

1.3.5. CARATHÉODORY’S THEOREM. Let µ∗ be an outer measure on X.
Then the collection B of all µ∗-measurable sets is a σ-algebra, and µ = µ∗|B is a
complete measure.

PROOF. It is clear from the definition that if A is measurable, then so is $A^c$.
Suppose that A, B ∈ B and let E ⊂ X. Then
\begin{align*}
\mu^*(E) &= \mu^*(E \cap A) + \mu^*(E \cap A^c) \\
&= \mu^*(E \cap A) + \mu^*(E \cap A^c \cap B) + \mu^*(E \cap A^c \cap B^c) \\
&\ge \mu^*(E \cap (A \cup B)) + \mu^*(E \cap (A \cup B)^c).
\end{align*}
The first line follows since A is µ∗-measurable. The second line follows since
B is µ∗-measurable. The last line follows from subadditivity of µ∗. This is the
non-trivial direction, so this is an equality. Thus A ∪ B is µ∗-measurable.
Combining these two observations shows that B is an algebra of sets, and thus
is closed under finite unions and intersections.
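Next suppose that A and B are disjoint. Then
\begin{align*}
\mu^*(E \cap (A \mathbin{\dot\cup} B)) &= \mu^*(E \cap (A \mathbin{\dot\cup} B) \cap A) + \mu^*(E \cap (A \mathbin{\dot\cup} B) \cap A^c) \\
&= \mu^*(E \cap A) + \mu^*(E \cap B).
\end{align*}
By induction, we see that if $A_1, \dots, A_n$ are pairwise disjoint µ∗-measurable sets,
then for any E ⊂ X,
\[ \mu^*\Big(E \cap \dot{\bigcup}_{i=1}^n A_i\Big) = \sum_{i=1}^n \mu^*(E \cap A_i). \]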

Now we consider countable unions. Let $A_i$ be µ∗-measurable for i ≥ 1. Set $B_n = \bigcup_{i=1}^n A_i$
and $B = \bigcup_{i\ge1} A_i$. With $B_0 = \varnothing$, set $A_i' = A_i \setminus B_{i-1}$ so that $B_n = \dot{\bigcup}_{i=1}^n A_i'$ and $B = \dot{\bigcup}_{i\ge1} A_i'$. We
know that $B_n$ and $A_n'$ are µ∗-measurable. Take any E ⊂ X. Using the additivity from the previous
paragraph,
\begin{align*}
\mu^*(E) &= \mu^*(E \cap B_n) + \mu^*(E \cap B_n^c) \\
&\ge \mu^*(E \cap B_n) + \mu^*(E \cap B^c) \\
&= \sum_{i=1}^n \mu^*(E \cap A_i') + \mu^*(E \cap B^c).
\end{align*}
This is true for all n ≥ 1 so we can take limits and get
\begin{align*}
\mu^*(E) &\ge \sum_{i\ge1} \mu^*(E \cap A_i') + \mu^*(E \cap B^c) \\
&\ge \mu^*(E \cap B) + \mu^*(E \cap B^c) \\
&\ge \mu^*(E).
\end{align*}
The last two lines use subadditivity. Hence equality holds throughout, and
thus the set B is µ∗-measurable. Therefore the collection of µ∗-measurable sets is a σ-algebra.
Next suppose that the $A_i$ are pairwise disjoint. So $A_i' = A_i$ in the previous
paragraph. Taking E = B, we get that
\[ \mu(B) \ge \sum_{i\ge1} \mu(A_i) \ge \mu(B). \]
Therefore µ is countably additive on the σ-algebra of µ∗-measurable sets.
Finally suppose that µ∗(A) = 0. Then for E ⊂ X,
\[ \mu^*(E) \le \mu^*(E \cap A) + \mu^*(E \cap A^c) \le \mu^*(A) + \mu^*(E) = \mu^*(E). \]
Hence this is an equality, showing that A is µ∗-measurable and µ(A) = 0. Any
subset F ⊂ A also has µ∗(F) = 0, and thus F is µ∗-measurable. Therefore
(X, B, µ) is a complete measure.
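
On a small finite set the Carathéodory condition can be checked exhaustively. The sketch below assumes a finite X and a finite weighted family, with hypothetical names and weights chosen for illustration. It computes µ∗ by brute force and then tests the splitting condition of Definition 1.3.4 against every E ⊂ X.

```python
from itertools import chain, combinations
import math

def powerset(X):
    """All subsets of X as frozensets."""
    items = list(X)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, k) for k in range(len(items) + 1))]

def outer_measure(A, weights):
    """Least total weight of a subfamily of `weights` that covers A."""
    sets = list(weights)
    best = math.inf
    for k in range(len(sets) + 1):
        for cover in combinations(sets, k):
            if A <= frozenset().union(*cover):
                best = min(best, sum(weights[E] for E in cover))
    return best

def measurable_sets(X, weights):
    """Sets A with mu*(E) = mu*(E & A) + mu*(E - A) for every E."""
    subsets = powerset(X)
    mu_star = {E: outer_measure(E, weights) for E in subsets}
    return [A for A in subsets
            if all(mu_star[E] == mu_star[E & A] + mu_star[E - A]
                   for E in subsets)]

# Hypothetical additive weights on the algebra {∅, {1}, {2,3}, X}.
X = frozenset({1, 2, 3})
weights = {frozenset(): 0, frozenset({1}): 1, frozenset({2, 3}): 2, X: 3}
measurable = set(measurable_sets(X, weights))
```

For these weights the measurable sets are exactly the four sets of the starting algebra. The singletons {2} and {3} fail the test with E = {2, 3}, since µ∗({2}) + µ∗({3}) = 4 while µ∗({2, 3}) = 2.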

1.4. Premeasures
1.4.1. DEFINITION. A premeasure is a function µ : A → [0, ∞] defined on
an algebra A ⊂ P(X) of sets such that µ(∅) = 0 and whenever $A_i \in A$ are
pairwise disjoint and $A = \dot{\bigcup}_{i\ge1} A_i \in A$, then $\mu(A) = \sum_{i\ge1} \mu(A_i)$. In particular,
premeasures are additive: $\mu(A_1 \mathbin{\dot\cup} A_2) = \mu(A_1) + \mu(A_2)$ if $A_1, A_2 \in A$ are disjoint.

A premeasure is an improvement on the arbitrary function used in Proposition 1.3.2
to define an outer measure. In that earlier construction, the outer measure
need not reflect much about the original function µ. However in the case of a
premeasure, the Carathéodory construction yields a measure that extends the
premeasure.

1.4.2. THEOREM. If µ is a premeasure on an algebra A ⊂ P(X), then applying
Carathéodory’s Theorem to the outer measure µ∗ yields a complete measure
(X, B, µ̄) such that B ⊃ A and µ̄|A = µ.

PROOF. By Proposition 1.3.2, µ∗ is an outer measure. So by an application of
Carathéodory’s Theorem 1.3.5, there is a complete measure (X, B, µ̄) defined on
the σ-algebra B of µ∗-measurable sets.
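Suppose that A ∈ A and $A_i \in A$ such that $A \subset \bigcup_{i\ge1} A_i$. Define
\[ B_i = A \cap \Big( A_i \setminus \bigcup_{j=1}^{i-1} A_j \Big) \quad\text{for } i \ge 1, \]
which belong to A since it is an algebra. By construction, the $B_i$ are pairwise
disjoint and $A = \dot{\bigcup}_{i\ge1} B_i$. Therefore, since µ is a premeasure,
\[ \mu(A) = \sum_{i\ge1} \mu(B_i) \le \sum_{i\ge1} \mu(A_i). \]
Taking the infimum over all such covers gives µ(A) ≤ µ∗(A), and the cover of A
by itself gives µ∗(A) ≤ µ(A). Therefore
\[ \mu^*(A) = \inf\Big\{ \sum_{i\ge1} \mu(A_i) : A \subset \bigcup_{i\ge1} A_i \Big\} = \mu(A). \]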
Now we show that A is µ∗-measurable. Let E ⊂ X with µ∗(E) < ∞, and let
ε > 0. We can find $A_i \in A$ so that $E \subset \bigcup_{i\ge1} A_i$ and $\sum_{i\ge1} \mu(A_i) < \mu^*(E) + \varepsilon$.
Notice that $E \cap A \subset \bigcup_{i\ge1} (A_i \cap A)$ and $E \cap A^c \subset \bigcup_{i\ge1} (A_i \cap A^c)$. Therefore by the
additivity of µ,
\begin{align*}
\mu^*(E \cap A) + \mu^*(E \cap A^c) &\le \sum_{i\ge1} \mu(A_i \cap A) + \sum_{i\ge1} \mu(A_i \cap A^c) \\
&= \sum_{i\ge1} \mu(A_i) < \mu^*(E) + \varepsilon.
\end{align*}
Since ε > 0 is arbitrary, $\mu^*(E \cap A) + \mu^*(E \cap A^c) \le \mu^*(E)$, which is the non-trivial
inequality; so this is an equality. Hence A is µ∗-measurable. We conclude
that A ⊂ B and for A ∈ A, µ̄(A) = µ∗(A) = µ(A). So µ̄|A = µ.

Here is further detail about the outer measure construction which explains when
the extension is unique.

1.4.3. PROPOSITION. Let µ be a premeasure on an algebra A ⊂ P(X) and let


(X, B, µ̄) be the measure of Theorem 1.4.2. Let ν be any measure on a σ-algebra
C satisfying A ⊂ C ⊂ B such that ν|A = µ. Then ν(E) ≤ µ̄(E) for all E ∈ C,
with equality if µ̄(E) < ∞. Moreover, if E is σ-finite, then ν(E) = µ̄(E). So if µ
is σ-finite, then µ̄|C is the unique extension of µ to a measure on C.
PROOF. Let E ∈ C. If $E \subset \bigcup_{i\ge1} A_i$ for $A_i \in A$, then by subadditivity,
\[ \nu(E) \le \sum_{i\ge1} \nu(A_i) = \sum_{i\ge1} \mu(A_i). \]
Now take the inf over all such covers of E to obtain ν(E) ≤ µ̄(E).
If µ̄(E) < ∞ and ε > 0, we can choose the $A_i$ so that $\sum_{i\ge1} \mu(A_i) < \bar\mu(E) + \varepsilon$.
Let $B_i = A_i \setminus \bigcup_{j=1}^{i-1} A_j$, which are in A; and $A = \bigcup_{i\ge1} A_i = \dot{\bigcup}_{i\ge1} B_i$. Then
\[ \nu(A) = \sum_{i\ge1} \nu(B_i) = \sum_{i\ge1} \mu(B_i) = \bar\mu(A). \]
Therefore
\[ \nu(E) + \nu(A \setminus E) = \nu(A) = \bar\mu(A) = \bar\mu(E) + \bar\mu(A \setminus E) < \bar\mu(E) + \varepsilon. \]
So ν(A \ E) ≤ µ̄(A \ E) < ε. Whence ν(E) ≥ µ̄(A) − ε ≥ µ̄(E) − 2ε. Let ε → 0
to get ν(E) = µ̄(E).
Now if $E = \bigcup_{n\ge1} E_n$ where $\bar\mu(E_n) < \infty$, we may replace $E_n$ with $\bigcup_{i=1}^n E_i$
so that $E_n \subset E_{n+1}$. Then by continuity from below,
\[ \nu(E) = \lim_{n\to\infty} \nu(E_n) = \lim_{n\to\infty} \bar\mu(E_n) = \bar\mu(E). \]
Finally if µ is σ-finite, then every set E ∈ C is σ-finite. Thus µ̄|C is the unique
extension of µ.

1.5. Lebesgue-Stieltjes measures on R


Suppose that µ is a Borel measure on R such that µ(K) < ∞ if K is compact.
Then we define a function F : R → R by
\[ F(x) = \begin{cases} \mu([0, x]) & \text{if } x \ge 0 \\ -\mu((x, 0)) & \text{if } x < 0. \end{cases} \]
Note that F is monotone increasing (i.e. non-decreasing if you prefer double negatives).
For x ≥ 0 and $x_{n+1} \le x_n$ with $x_n \downarrow x$, the continuity from above
Property 1.2.5(4) shows that
\[ F(x) = \mu\Big(\bigcap_{n\ge1} [0, x_n]\Big) = \lim_{n\to\infty} \mu([0, x_n]) = \lim_{n\to\infty} F(x_n). \]
This uses the fact that $\mu([0, x_1]) < \infty$. Similarly if x < 0 and $0 > x_n \downarrow x$, then the
continuity from below property shows that
\[ F(x) = -\mu\Big(\bigcup_{n\ge1} (x_n, 0)\Big) = \lim_{n\to\infty} -\mu((x_n, 0)) = \lim_{n\to\infty} F(x_n). \]
Thus F is right continuous.
The function F may fail to be left continuous. For example, for the point mass $\delta_0$,
\[ F(x) = \begin{cases} 1 & \text{if } x \ge 0 \\ 0 & \text{if } x < 0. \end{cases} \]
If we knew that there was Lebesgue measure on
the line, it would determine the function F(x) = x.
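
For a finite sum of point masses, the function F defined above can be computed directly and compared with the measure of half open intervals. The sketch below is an illustration only; the atoms, masses, and helper names `make_F` and `mu_half_open` are hypothetical.

```python
def make_F(atoms):
    """Build F for the measure mu = sum of mass * delta_point.

    `atoms` maps a point to its mass. Following the definition above,
    F(x) = mu([0, x]) for x >= 0 and F(x) = -mu((x, 0)) for x < 0.
    """
    def F(x):
        if x >= 0:
            return sum(m for p, m in atoms.items() if 0 <= p <= x)
        return -sum(m for p, m in atoms.items() if x < p < 0)
    return F

def mu_half_open(atoms, a, b):
    """mu((a, b]) computed directly from the atoms."""
    return sum(m for p, m in atoms.items() if a < p <= b)

# Hypothetical measure with atoms at -2, 0 and 1.
atoms = {-2: 1, 0: 3, 1: 2}
F = make_F(atoms)
pairs = [(-3, 1), (-2, 0), (-3, -2), (0, 1), (-1, 0.5)]
agrees = [mu_half_open(atoms, a, b) == F(b) - F(a) for a, b in pairs]
```

Every entry of `agrees` is true, matching the identity µ((a, b]) = F(b) − F(a). The jump F(0) − F(−0.001) = 3 shows the failure of left continuity at the atom 0.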

The Lebesgue-Stieltjes construction works in the other direction. Let F : R →


R be a monotone increasing, right continuous function. We will extend the defini-
tion to include F (∞) = limx→∞ F (x) and F (−∞) = limx→−∞ F (x), and these
values may include ∞ or −∞ respectively.
Set A to be the algebra of sets consisting of all finite unions of half open in-
tervals (a, b], where we allow b = ∞ and a = −∞; so that (a, ∞), (−∞, b] and
(−∞, ∞) belong to A. Since (a, b]c = (−∞, a] ∪ (b, ∞) is in A, it is easy to check
that A is an algebra. Define

\[ \mu_F\Big( \dot{\bigcup}_{i=1}^n (a_i, b_i] \Big) = \sum_{i=1}^n \big( F(b_i) - F(a_i) \big). \]

1.5.1. LEMMA. µF is a premeasure.

PROOF. First let’s show that µF is well defined. To this end, suppose that
$I = (a, b] = \dot{\bigcup}_{i=1}^n (a_i, b_i]$. Then after rearranging if necessary, we may suppose that
$a = a_1 < b_1 = a_2 < \dots < b_{n-1} = a_n < b_n = b$. Then if we set $a_{n+1} := b_n = b$,
\[ \sum_{i=1}^n \big( F(b_i) - F(a_i) \big) = \sum_{i=1}^n \big( F(a_{i+1}) - F(a_i) \big) = F(b) - F(a). \]
Thus µF(I) does not depend on the decomposition into finitely many pieces. This
readily extends to a finite union of intervals.
Now we consider the restricted version of countable additivity. Suppose that
$I = (a, b] = \dot{\bigcup}_{i\ge1} (a_i, b_i]$. This is more difficult to deal with because these intervals
cannot be ordered, end to end, as in the case of a finite union. One direction is easy:
since $I = \dot{\bigcup}_{i=1}^n (a_i, b_i] \mathbin{\dot\cup} \big( I \setminus \bigcup_{i=1}^n (a_i, b_i] \big)$ and the last set belongs to A,
\[ \mu_F(I) = \sum_{i=1}^n \mu_F((a_i, b_i]) + \mu_F\Big( I \setminus \bigcup_{i=1}^n (a_i, b_i] \Big) \ge \sum_{i=1}^n \mu_F((a_i, b_i]). \]
Taking a limit as n → ∞ yields that $\mu_F(I) \ge \sum_{i=1}^\infty \mu_F((a_i, b_i])$.
Conversely, first suppose that a, b ∈ R. Fix ε > 0. By right continuity, there
is a δ > 0 so that F(a + δ) < F(a) + ε. Likewise, there are $\delta_i > 0$ so that
$F(b_i + \delta_i) < F(b_i) + 2^{-i}\varepsilon$. The compact interval [a + δ, b] has the open cover
$\{(a_i, b_i + \delta_i) : i \ge 1\}$; so there is a finite subcover $(a_{i_1}, b_{i_1} + \delta_{i_1}), \dots, (a_{i_p}, b_{i_p} + \delta_{i_p})$.
Therefore $\sum_{j=1}^p \big( F(b_{i_j} + \delta_{i_j}) - F(a_{i_j}) \big) \ge F(b) - F(a + \delta)$. Consequently,
\begin{align*}
\sum_{i\ge1} \mu_F((a_i, b_i]) &= \sum_{i\ge1} \big( F(b_i) - F(a_i) \big) \\
&\ge \sum_{j=1}^p \big( F(b_{i_j}) - F(a_{i_j}) \big) \\
&\ge \sum_{j=1}^p \big( F(b_{i_j} + \delta_{i_j}) - 2^{-i_j}\varepsilon - F(a_{i_j}) \big) \\
&\ge F(b) - F(a + \delta) - \varepsilon \\
&\ge F(b) - F(a) - 2\varepsilon.
\end{align*}
Now let ε ↓ 0 to get $\sum_{i\ge1} \mu_F((a_i, b_i]) \ge F(b) - F(a)$; whence we have equality.
Now if b = ∞, we still have $\sum_{i\ge1} \mu_F((a_i, b_i]) \ge \mu_F((a, N]) = F(N) - F(a)$ for
N ∈ N. Letting N → ∞, we get $\sum_{i\ge1} \mu_F((a_i, b_i]) \ge F(\infty) - F(a)$. Similarly
we can handle a = −∞.
Finally if $A = \dot{\bigcup}_{k=1}^m (a_k, b_k]$ is written as a disjoint union of half open intervals,
we can split the union into m pieces and use the argument for a single interval on
each one. Thus we obtain countable additivity (provided the union remains in A).
So µF is a premeasure.
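
The well-definedness step of the proof (independence of the decomposition) is easy to check numerically. The sketch below uses a hypothetical F with a unit jump at 0; the function names and the particular subdivisions are illustrative assumptions.

```python
def F(x):
    """Hypothetical increasing, right continuous F with a unit jump at 0."""
    return x + (1 if x >= 0 else 0)

def mu_F(intervals):
    """Premeasure of a finite disjoint union of half open intervals (a, b]."""
    return sum(F(b) - F(a) for a, b in intervals)

# Three decompositions of the same interval (-1, 1].
whole = [(-1, 1)]
pieces = [(-1, 0), (0, 1)]
finer = [(-1, -0.5), (-0.5, 0), (0, 0.25), (0.25, 1)]
values = [mu_F(whole), mu_F(pieces), mu_F(finer)]
```

All three decompositions give the value 3, because the sums telescope. The jump at 0 is counted in the piece (−1, 0], which contains the point 0.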

1.5.2. THEOREM. If F : R → R is an increasing, right continuous function,


then there is a complete measure (R, B, µ̄F ) that extends µF . The σ-algebra B
contains the σ-algebra BorR of Borel sets. The restriction of µ̄F to BorR is the
unique Borel measure µF on R such that µF ((a, b]) = F (b) − F (a) for all a < b.
Conversely, if µ is a Borel measure on R such that µ(K) < ∞ for compact sets
K ⊂ R, then there is an increasing, right continuous function F : R → R such that
µ = µF . Also given two increasing, right continuous functions F, G, µF = µG if
and only if F − G is constant.

PROOF. By Lemma 1.5.1, µF is a premeasure. Thus by Theorem 1.4.2, there
is a complete measure µ̄F on a σ-algebra B containing A which extends µF. The
σ-algebra B contains the σ-algebra generated by A. So it contains all intervals
$(a, b) = \bigcup_{n\ge1} (a, b - \tfrac1n]$. Hence it contains all open sets (because every open subset
of R is a countable union of intervals), and thus all Borel sets. The restriction µF
of µ̄F to BorR is a Borel measure with µF((a, b]) = F(b) − F(a) for all a < b.
Since µF is σ-finite, Proposition 1.4.3 shows that this Borel measure is unique.
Conversely, we showed that every Borel measure µ which is finite on bounded
intervals determines an increasing, right continuous function F : R → R such that
µ((a, b]) = F(b) − F(a) for all a < b. By the uniqueness of the construction, we
see that µ = µF. Finally, if µG = µF, then
\[ \mu_G((a, b]) = G(b) - G(a) = F(b) - F(a) \quad\text{for } a < b. \]
It follows that G(x) = F(x) + (G(0) − F(0)); i.e., F − G is constant. The converse
is immediate from the definition of µF.

1.5.3. COROLLARY. Lebesgue measure is the complete measure m = µ̄F for


the function F (x) = x. The σ-algebra L of Lebesgue measurable sets contains all
Borel sets, and m((a, b)) = b − a for all a < b in R.

Lebesgue measure has some special properties. If E ⊂ R and s ∈ R, let


E + s = {x + s : x ∈ E} and sE = {sx : x ∈ E}.

1.5.4. THEOREM. Lebesgue measure is translation invariant: m(E + s) =


m(E) for all Lebesgue measurable sets E ⊂ R. Also m(sE) = |s|m(E).

PROOF. The set of open intervals is invariant under translation, and hence so
is BorR. The measure $m_s(E) = m(E + s)$ agrees with m on open intervals. By
Theorem 1.5.2, the measures are determined by the functions
\[ F(x) = \begin{cases} m((0, x]) & x \ge 0 \\ -m((x, 0]) & x < 0 \end{cases} \quad\text{and}\quad G(x) = \begin{cases} m_s((0, x]) & x \ge 0 \\ -m_s((x, 0]) & x < 0. \end{cases} \]
However as we have observed, these functions are equal. So $m_s = m$. Hence
m(E + s) = m(E) for all Borel sets E ⊂ R. Now Proposition 1.4.3 shows that
there is a unique extension of m|BorR to the σ-algebra L of Lebesgue measurable
sets.
If s = 0, then m(0E) = 0 = 0·m(E), so suppose that s ≠ 0. Let $n_s(E) = |s|^{-1} m(sE)$.
Observe that
\[ n_s((a, b)) = |s|^{-1} |sb - sa| = b - a = m((a, b)). \]
Arguing as in the previous paragraph, we see that $n_s = m$. Hence m(sE) =
$|s|\, n_s(E) = |s|\, m(E)$ for all measurable sets E.
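
Both identities can be sanity-checked on finite unions of intervals, where Lebesgue measure is just total length after merging overlaps. This is a minimal sketch restricted to that special case; the helper names `length`, `translate` and `scale` are hypothetical.

```python
def length(intervals):
    """Total length of a finite union of intervals given as (a, b) pairs."""
    total, end = 0.0, None
    for a, b in sorted(intervals):
        if end is None or a > end:   # starts a new connected piece
            total += b - a
            end = b
        elif b > end:                # overlaps the current piece
            total += b - end
            end = b
    return total

def translate(intervals, s):
    """The set E + s."""
    return [(a + s, b + s) for a, b in intervals]

def scale(intervals, s):
    """The set sE; endpoints are re-sorted because s may be negative."""
    return [tuple(sorted((s * a, s * b))) for a, b in intervals]

# Hypothetical set E = (0, 1) ∪ (0.5, 2) ∪ (3, 4), of total length 3.
E = [(0, 1), (0.5, 2), (3, 4)]
m_E = length(E)
m_shifted = length(translate(E, 2.5))
m_scaled = length(scale(E, -2))
```

Here `m_shifted` equals `m_E` and `m_scaled` equals 2 · `m_E`, in line with m(E + s) = m(E) and m(sE) = |s| m(E).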

1.5.5. REMARK. A point {a} has Lebesgue-Stieltjes measure
\[ \mu_F(\{a\}) = \lim_{n\to\infty} \mu_F\big((a - \tfrac1n, a]\big) = F(a) - \lim_{x\to a^-} F(x) = F(a) - F(a^-). \]
Thus µF({a}) > 0 if and only if F has a jump discontinuity at a. In this case, we
say that a is an atom of µF. The only discontinuities of a monotone function are
jump discontinuities. There are at most countably many jump discontinuities. To
see this, consider how many jumps of size at least δ there can be in (a, b]: at most
(F(b) − F(a))/δ, which is finite. So there are at most n(F(n) − F(−n)) points
in (−n, n] with a jump of more than 1/n. The union of this countable collection
of finite sets is countable, and contains all of the jumps. Note, however, that the
discontinuities can be dense, say at every rational point.
Suppose that the discontinuities occur at $a_i$ with jump $\alpha_i > 0$ for i ≥ 1. Recall
that $\delta_x$ is the point mass that sets $\delta_x(A) = 1$ if x ∈ A and $\delta_x(A) = 0$ otherwise.
Then $\mu_a = \sum_{i\ge1} \alpha_i \delta_{a_i}$ is called an atomic measure. When we subtract it from µF,
we get a measure $\mu_c = \mu_F - \mu_a$ which has no atomic part. There are increasing,
right continuous functions $F_a$ and $F_c$ so that $\mu_a = \mu_{F_a}$ and $\mu_c = \mu_{F_c}$. Moreover
$F_c$ is continuous and $F_a$ is entirely determined by its jumps. We can show that (up
to a constant)
\[ F_a(x) = \begin{cases} \sum_{\{i : 0 \le a_i \le x\}} \alpha_i & \text{if } x \ge 0 \\ -\sum_{\{i : x < a_i < 0\}} \alpha_i & \text{if } x < 0. \end{cases} \]
This converges for every x because $\sum_{\{i : 0 \le a_i \le x\}} \alpha_i \le F(x) - F(0^-) < \infty$ when
x ≥ 0, and similarly $\sum_{\{i : x < a_i < 0\}} \alpha_i \le F(0) - F(x)$ when x < 0.
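
As a small worked illustration of this decomposition (the particular F is an assumed example, not one from the notes), take the identity function plus a unit jump at 0. Here m denotes Lebesgue measure as in Corollary 1.5.3.

```latex
\[
F(x) = x + \chi_{[0,\infty)}(x), \qquad
\mu_F(\{0\}) = F(0) - F(0^-) = 1 - 0 = 1 .
\]
\[
\mu_a = \delta_0, \qquad
F_a(x) = \chi_{[0,\infty)}(x), \qquad
F_c(x) = F(x) - F_a(x) = x, \qquad
\mu_c = m .
\]
\[
\mu_F((-1,1]) = F(1) - F(-1) = 2 - (-1) = 3
= m((-1,1]) + \delta_0((-1,1]) .
\]
```

The single atom at 0 contributes 1 and the continuous part contributes the length 2, which is consistent with $\mu_F = \mu_c + \mu_a$.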

We conclude this section with some regularity properties of µ̄F -measurable


sets.

1.5.6. THEOREM. Let µ̄F be a Lebesgue-Stieltjes measure. For E ⊂ R, the
following are equivalent.
(1) E is µ̄F-measurable.
(2) For all ε > 0, there is an open set U ⊃ E such that $\mu_F^*(U \setminus E) < \varepsilon$.
(3) For all ε > 0, there is a closed set C ⊂ E such that $\mu_F^*(E \setminus C) < \varepsilon$.
(4) There is a $G_\delta$ set G ⊃ E so that $\mu_F^*(G \setminus E) = 0$.
(5) There is an $F_\sigma$ set F ⊂ E so that $\mu_F^*(E \setminus F) = 0$.
In this case,
\[ \bar\mu_F(E) = \inf\{\mu_F(U) : E \subset U \text{ open}\} = \sup\{\mu_F(K) : E \supset K \text{ compact}\}. \]

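PROOF. First assume that E is bounded. Fix ε > 0. Since E is measurable,
$\bar\mu_F(E) = \mu_F^*(E) = \inf \mu_F(A)$ where $E \subset A = \bigcup_{i\ge1} (a_i, b_i]$. Now $\bar\mu_F(E) < \infty$
so we can choose A so that $\mu_F(A) < \bar\mu_F(E) + \varepsilon/2$. For each i ≥ 1, choose $c_i > b_i$
so that $F(c_i) < F(b_i) + 2^{-i-1}\varepsilon$. Let $U = \bigcup_{i\ge1} (a_i, c_i)$. Since E is measurable,
$\mu_F^*(A) = \mu_F^*(E) + \mu_F^*(A \setminus E)$. Thus $\mu_F^*(A \setminus E) < \varepsilon/2$. Also
\[ \mu_F(U \setminus A) \le \mu_F\Big( \bigcup_{i\ge1} (b_i, c_i) \Big) \le \sum_{i\ge1} \big( F(c_i) - F(b_i) \big) < \varepsilon/2. \]
So $\mu_F^*(U \setminus E) = \mu_F(U \setminus A) + \mu_F^*(A \setminus E) < \varepsilon$.
Now if E is unbounded, let $E_n = E \cap (n - 1, n]$ for n ∈ Z. Find an open
set $U_n \supset E_n$ with $\mu_F^*(U_n \setminus E_n) < 2^{-2|n|-1}\varepsilon$. Then $E \subset U := \bigcup_{n\in\mathbb{Z}} U_n$ is an
open set, and
\[ \mu_F^*(U \setminus E) \le \sum_{n\in\mathbb{Z}} \mu_F^*(U_n \setminus E_n) < \frac{\varepsilon}{2} + 2 \sum_{n\ge1} 2^{-2n-1}\varepsilon < \varepsilon. \]
So (1) =⇒ (2).
(2) =⇒ (4). Find open sets $U_n \supset E$ with $\mu_F^*(U_n \setminus E) < \tfrac1n$. Set $G = \bigcap_{n\ge1} U_n$.
This is a $G_\delta$ set containing E such that $\mu_F^*(G \setminus E) \le \mu_F^*(U_n \setminus E) < \tfrac1n$ for all n ≥ 1.
Hence $\mu_F^*(G \setminus E) = 0$.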
(4) =⇒ (1). Since µ̄F is complete and $\mu_F^*(G \setminus E) = 0$, the set G \ E is
µ̄F-measurable. Since µF is a Borel measure, G is also µ̄F-measurable. Therefore
E = G \ (G \ E) is µ̄F-measurable.
(1) =⇒ (3). If E is measurable, so is $E^c$. By the equivalence of (1) and (2),
there is an open set $U \supset E^c$ such that $\mu_F^*(U \setminus E^c) < \varepsilon$. Let $C = U^c$, which is
closed and C ⊂ E. Then $\mu_F^*(E \setminus C) = \mu_F^*(U \setminus E^c) < \varepsilon$.
(3) =⇒ (5). Select closed sets $C_n \subset E$ with $\mu_F^*(E \setminus C_n) < \tfrac1n$. Let
$F = \bigcup_{n\ge1} C_n$. This is an $F_\sigma$ set contained in E such that $\mu_F^*(E \setminus F) \le \mu_F^*(E \setminus C_n) < \tfrac1n$
for all n ≥ 1. Hence $\mu_F^*(E \setminus F) = 0$.
(5) =⇒ (1). Since µ̄F is complete and $\mu_F^*(E \setminus F) = 0$, the set E \ F is
µ̄F-measurable. Since µF is a Borel measure, F is also µ̄F-measurable. Therefore
E = F ∪ (E \ F) is µ̄F-measurable.
For the final statement, the first equality holds by (2), and the second holds by
(3) provided that we use closed sets. If E is bounded, the closed sets are compact.
So suppose that E is unbounded. Set $E_n = E \cap [-n, n]$. Then
\[ \bar\mu_F(E) = \sup_{n\ge1} \bar\mu_F(E_n) = \sup_{n\ge1}\ \sup_{E_n \supset K \text{ compact}} \mu_F(K). \]
CHAPTER 2

Functions

2.1. Measurable Functions


2.1.1. DEFINITION. Let (X, A) and (Y, B) be measure spaces. A function
f : X → Y is measurable if f⁻¹(B) ∈ A for all B ∈ B.
In particular, f : X → F ∈ {R, C} is measurable if f⁻¹(B) ∈ A for all
B ∈ BorF; i.e. for all Borel sets B ⊂ F.

We are most often interested in scalar valued functions, i.e. with range in R or
C. We always use the σ-algebra of Borel sets, not Lebesgue measurable sets, when
defining measurable functions.
It is not necessary to verify the measurability condition on all Borel sets, only
on a generating family. That is, if E ⊂ B generates B as a σ-algebra and f⁻¹(E) ∈
A for all E ∈ E, then f is measurable. This follows from the easy facts:
\[ f^{-1}(E^c) = (f^{-1}(E))^c \quad\text{and}\quad f^{-1}\Big(\bigcup_{n\ge1} E_n\Big) = \bigcup_{n\ge1} f^{-1}(E_n). \]
So the following proposition is immediate.

2.1.2. PROPOSITION. Let (X, A) and (Y, B) be measure spaces.
(a) If f : X → Y, then {B ∈ B : f⁻¹(B) ∈ A} is a σ-algebra.
(b) If f : X → R, the following are equivalent:
(1) f is measurable.
(2) {x ∈ X : f (x) < α} is measurable for all α ∈ R.
(3) {x ∈ X : f (x) ≤ α} is measurable for all α ∈ R.
(4) {x ∈ X : f (x) > α} is measurable for all α ∈ R.
(5) {x ∈ X : f (x) ≥ α} is measurable for all α ∈ R.
(c) If f : X → C, then f is measurable if and only if Re f and Im f are
measurable R-valued functions.

2.1.3. COROLLARY. Let X be a topological space, and consider (X, BorX).


Continuous functions f : X → F are (Borel) measurable.
PROOF. Since f is continuous, if U ⊂ F is open, then f⁻¹(U) is open and thus
Borel. Since the open sets generate BorF, f is measurable.

The following records some basic operations that preserve measurability.

2.1.4. PROPOSITION. Let (X, B) be a measure space.
(a) Suppose that f, g : X → F are measurable. Then f + λg for λ ∈ F and
fg are measurable; and if g ≠ 0, then f/g is measurable.
(b) If $f_n$ : X → R are measurable, then $\sup f_n$, $\inf f_n$, $\limsup f_n$ and
$\liminf f_n$ are measurable. Thus, if $f = \lim f_n$ exists pointwise, then
f is measurable.

PROOF. Clearly if λ ≠ 0, then g is measurable if and only if λg is measurable.
So consider f + g. Note that
\[ \{x : f(x) + g(x) < \alpha\} = \bigcup_{r\in\mathbb{Q}} \{x : f(x) < r\} \cap \{x : g(x) < \alpha - r\} \in B. \]
So f + g is measurable. Now consider fg. If α > 0, then
\begin{align*}
\{x : f(x)g(x) < \alpha\} ={}& \big( \{x : f(x) \ge 0\} \cap \{x : g(x) \le 0\} \big) \\
&\cup \big( \{x : f(x) \le 0\} \cap \{x : g(x) \ge 0\} \big) \\
&\cup \bigcup_{r\in\mathbb{Q}^+} \big( \{x : |f(x)| < r\} \cap \{x : |g(x)| < \alpha/r\} \big).
\end{align*}
This lies in B. Also
\begin{align*}
\{x : f(x)g(x) < 0\} ={}& \big( \{x : f(x) > 0\} \cap \{x : g(x) < 0\} \big) \\
&\cup \big( \{x : f(x) < 0\} \cap \{x : g(x) > 0\} \big).
\end{align*}
And finally, if α < 0, then
\begin{align*}
\{x : f(x)g(x) < \alpha\} ={}& \bigcup_{r\in\mathbb{Q}^+} \big( \{x : f(x) > r\} \cap \{x : g(x) < \alpha/r\} \big) \\
&\cup \bigcup_{r\in\mathbb{Q}^+} \big( \{x : f(x) < \alpha/r\} \cap \{x : g(x) > r\} \big).
\end{align*}
Thus fg is measurable. Finally, if g ≠ 0, then for α > 0, $\{x : 1/g(x) < \alpha\} =
\{x : g(x) > \alpha^{-1}\} \cup \{x : g(x) < 0\}$; also {x : 1/g(x) < 0} = {x : g(x) < 0};
and if α < 0, $\{x : 1/g(x) < \alpha\} = \{x : \alpha^{-1} < g(x) < 0\}$. So 1/g is measurable.
Thus f/g is measurable by the product result.
Now suppose that $f_n$ are measurable for n ≥ 1. Then
\[ \{x : \sup f_n(x) > \alpha\} = \bigcup_{n\ge1} \{x : f_n(x) > \alpha\} \in B. \]

So $\sup f_n$ is measurable. Similarly $\inf f_n$ is measurable. Therefore
$\limsup f_n = \inf_{k \ge 1} \sup_{n \ge k} f_n$ is measurable, and similarly for $\liminf f_n$.
If $f = \lim f_n$ exists pointwise, then $f = \limsup f_n = \liminf f_n$, so f is measurable. ∎

When the measure is complete, changing things on a set of measure 0 has no


important consequence.

2.1.5. PROPOSITION. Let µ be a complete measure on (X, B). Then if f is
measurable and g = f a.e.(µ), then g is measurable. Also if $f_n$ are measurable
and f is a function such that $f_n$ converge to f a.e.(µ), then f is measurable.

PROOF. Suppose that N is a null set such that f = g on X \ N. For any Borel
set B ⊂ R, f⁻¹(B) △ g⁻¹(B) ⊂ N. As these two sets differ by a subset of a null
set, they differ by a null set by completeness. So one is measurable because the
other is.
Suppose that $f(x) = \lim f_n(x)$ except on a null set N. Then $f = \limsup f_n$
except on N, and hence is measurable by the previous paragraph. ∎

2.2. Simple Functions


2.2.1. DEFINITION. Let (X, B) be a measure space. A simple function is a
function ϕ : X → C of the form
\[ \varphi(x) = \sum_{i=1}^{n} a_i \chi_{E_i}(x) \quad\text{where } a_i \in \mathbb{C}^*,\ E_i \in \mathcal{B} \text{ and } E_i \cap E_j = \emptyset \text{ for } i \ne j. \]

A simple function is just a measurable function with finite range. Indeed, the
range of ϕ is contained in $\{a_i : 1 \le i \le n\} \cup \{0\}$. Conversely, if the range of a
measurable function ϕ is contained in $\{a_i : 1 \le i \le n\} \cup \{0\}$ with $a_i \ne a_j$ and $a_i \ne 0$ for
i ≠ j, then we can set $E_i = \varphi^{-1}(\{a_i\})$. These are disjoint, and $\varphi = \sum_{i=1}^{n} a_i \chi_{E_i}$.
This is called the standard form of a simple function.
Note that the set of all simple functions is an algebra, meaning that it is a
vector space and is closed under products. The subset of simple functions such that
µ(Ei ) < ∞ for all i is a subalgebra.
The main result of this section is that every measurable function is a limit of
simple functions in a nice way.

2.2.2. P ROPOSITION .
(a) If f : X → [0, ∞] is measurable, then there is a sequence of simple functions
ϕn so that ϕn ≤ ϕn+1 , limn→∞ ϕn (x) = f (x), and convergence is uniform on
{x : f (x) ≤ R} for any R ∈ R.

(b) If f : X → C is measurable, then there is a sequence of simple functions $\varphi_n$
so that $|\varphi_n| \le |\varphi_{n+1}|$, $\lim_{n\to\infty} \varphi_n(x) = f(x)$, and convergence is uniform on
{x : |f(x)| ≤ R} for any R ∈ R.

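PROOF. (a) For n ≥ 1, let
\[ A_{k,n} = \Big\{x : \frac{k}{2^n} \le f(x) < \frac{k+1}{2^n}\Big\} \quad\text{for } 0 \le k < 4^n \]
and $A_{4^n,n} = \{x : f(x) \ge 2^n\}$. Define simple functions
\[ \varphi_n = \sum_{k=1}^{4^n} \frac{k}{2^n}\, \chi_{A_{k,n}}. \]
Observe that
\[ A_{k,n} = A_{2k,n+1} \mathbin{\dot\cup} A_{2k+1,n+1} \ \text{ for } 0 \le k < 4^n \quad\text{and}\quad A_{4^n,n} = \dot{\bigcup_{k=2\cdot 4^n}^{4^{n+1}}} A_{k,n+1}. \]
Moreover for $0 \le k < 4^n$,
\[ \frac{k}{2^n}\chi_{A_{k,n}} \le \frac{2k}{2^{n+1}}\chi_{A_{2k,n+1}} + \frac{2k+1}{2^{n+1}}\chi_{A_{2k+1,n+1}} \quad\text{and}\quad 2^n \chi_{A_{4^n,n}} \le \sum_{k=2\cdot 4^n}^{4^{n+1}} \frac{k}{2^{n+1}}\chi_{A_{k,n+1}}. \]
Thus $\varphi_n \le \varphi_{n+1} \le f$. Moreover, if $f(x) \le 2^N$, then $f(x) - \varphi_n(x) \le 2^{-n}$ for
all n ≥ N; and if f(x) = ∞, then $\varphi_n(x) = 2^n$. Thus $\lim_{n\to\infty} \varphi_n = f$ pointwise.
Moreover convergence is uniform on {x : f(x) ≤ R} since our estimate is
uniformly good once $2^n \ge R$.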
(b) Let f = g1 −g2 +ig3 −ig4 where g1 = max{Re f, 0}, g2 = max{− Re f, 0},
g3 = max{Im f, 0} and g4 = max{− Im f, 0}. For each 1 ≤ i ≤ 4, apply part (a)
to get sequences of simple functions ψi,n increasing to gi . Then the sequence
ϕn = ψ1,n − ψ2,n + iψ3,n − iψ4,n
works. Details are left to the reader. 
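
The dyadic construction in part (a) can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper name `phi` and the sample values are assumptions made for illustration. It evaluates the n-th simple function at a point from the value f(x) alone, which is all the construction uses.

```python
import math

def phi(n, fx):
    """Value of the n-th dyadic simple function at a point where f takes the value fx.

    fx may be any number in [0, inf]. (Hypothetical helper, for illustration only.)
    """
    if fx >= 2 ** n:              # the top set A_{4^n, n} = {f >= 2^n}; covers fx = inf
        return float(2 ** n)
    k = math.floor(fx * 2 ** n)   # the unique k with k/2^n <= fx < (k+1)/2^n
    return k / 2 ** n

samples = [0.0, 0.3, 1.0, 2.718281828, 7.5, math.inf]
for n in range(1, 6):
    print(n, [phi(n, v) for v in samples])
```

Each column of the printed table is non-decreasing in n, and once $2^n$ exceeds a finite sample value the gap to that value is at most $2^{-n}$, matching the estimate used for uniform convergence.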

2.3. Two Theorems about Measurable Functions


Littlewood had three principles of Lebesgue measure:
• Every measurable set of finite measure is almost a finite union of intervals
(use Theorem 1.5.6(2) and throw out the very small intervals).
• Every measurable function is almost continuous (Lusin’s Theorem).
• A pointwise convergent sequence of measurable functions is almost
uniformly convergent (Egorov’s Theorem).

2.3.1. EGOROV’S THEOREM. Let (X, B, µ) be a finite measure space,
i.e., µ(X) < ∞. Suppose that $f_n$ : X → C are measurable functions and
$f_n \to f$ a.e.(µ). Then for ε > 0, there is a set E ∈ B with µ(X \ E) < ε so
that $f_n \to f$ uniformly on E.

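PROOF. Observe that $f_n \to f$ uniformly on E provided that for each m ≥ 1,
there is an integer $N_m$ so that $|f_n(x) - f(x)| < \frac{1}{m}$ for all x ∈ E and all $n \ge N_m$.
Define
\[ A_{m,N} = \{x : |f_n(x) - f(x)| < \tfrac{1}{m} \ \ \forall n \ge N\} = \bigcap_{n \ge N} \{x : |f_n(x) - f(x)| < \tfrac{1}{m}\}. \]
Note that $A_{m,1} \subset A_{m,2} \subset \dots$ and that
\[ \bigcup_{n \ge 1} A_{m,n} \supset \{x : f_n(x) \to f(x)\} = X \setminus N \]
where N is a null set. Since X has finite measure, continuity from below gives
$\lim_{n\to\infty} \mu(A_{m,n}) = \mu(X)$, so we can choose an integer
$N_m$ so that $\mu(A_{m,N_m}) > \mu(X) - 2^{-m}\varepsilon$. Hence $\mu(A_{m,N_m}^c) < 2^{-m}\varepsilon$. Let
$E = \bigcap_{m \ge 1} A_{m,N_m}$. Then,
\[ \mu(E^c) = \mu\Big(\bigcup_{m \ge 1} A_{m,N_m}^c\Big) \le \sum_{m \ge 1} \mu(A_{m,N_m}^c) < \sum_{m \ge 1} 2^{-m}\varepsilon = \varepsilon. \]
By the first line of the proof, $f_n$ converges uniformly to f on E. ∎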
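
As an illustration (not taken from the notes), consider $f_n(x) = x^n$ on [0, 1] with Lebesgue measure. It converges pointwise to 0 on [0, 1) but not uniformly there, while on $E = [0, 1 - \varepsilon/2]$ the convergence is uniform and the discarded set has measure ε/2 < ε. The sketch below computes, for this assumed example, an index $N_m$ of the kind chosen in the proof; the function names are hypothetical.

```python
def sup_on_E(n, eps):
    """sup of |x**n - 0| over E = [0, 1 - eps/2]; attained at the right endpoint."""
    return (1 - eps / 2) ** n

def first_N(eps, m):
    """Smallest N with sup_E |f_n| < 1/m for all n >= N (the sup decreases in n)."""
    N = 1
    while sup_on_E(N, eps) >= 1 / m:
        N += 1
    return N

eps = 0.1
for m in (1, 10, 100, 1000):
    print(m, first_N(eps, m))
```

The printed indices grow with m but are always finite, which is exactly uniform convergence on E.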

2.3.2. LUSIN’S THEOREM. Let f : [a, b] → C be a Lebesgue measurable
function, and let ε > 0. Then there is a continuous function g ∈ C[a, b] so that
m({x : f(x) ≠ g(x)}) < ε.
PROOF. If $\varphi = \sum_{i=1}^{m} a_i \chi_{E_i}$ is a simple function and δ > 0, we can find
compact sets $A_i \subset E_i$ so that $m\big(\bigcup_{i=1}^{m} (E_i \setminus A_i)\big) < \delta$. Observe that $K = \bigcup A_i$ is
compact and $\varphi|_K$ is continuous as a function on K (because it is locally constant).
Choose simple functions $\varphi_n$ converging pointwise to f. Find compact sets $K_n$
so that $\varphi_n|_{K_n}$ is continuous and $m([a,b] \setminus K_n) < 2^{-n-1}\varepsilon$. Then $K_0 = \bigcap K_n$ is
compact, each $\varphi_n|_{K_0}$ is continuous, and
\[ m\big([a,b] \setminus K_0\big) \le \sum_{n \ge 1} m\big([a,b] \setminus K_n\big) < \varepsilon/2. \]
By Egorov’s Theorem, there is a measurable set $E \subset K_0$ with $m(K_0 \setminus E) < \varepsilon/4$
so that $\varphi_n$ converge uniformly to f on E. Then by Theorem 1.5.6, there
is a closed set K ⊂ E with m(E \ K) < ε/4. Since $\varphi_n|_K$ are continuous, and
converge uniformly to $f|_K$, we see that $f|_K$ is continuous. Also, since $K \subset E \subset K_0$,
\[ m\big([a,b] \setminus K\big) = m\big([a,b] \setminus K_0\big) + m(K_0 \setminus E) + m(E \setminus K) < \frac{\varepsilon}{2} + \frac{\varepsilon}{4} + \frac{\varepsilon}{4} = \varepsilon. \]
It remains to extend $f|_K$ to a continuous function g on [a, b]. This is a (non-trivial)
exercise in a basic real analysis course. Just make g piecewise linear on
each open component of the complement. Once achieved, {x : f(x) ≠ g(x)} is
contained in [a, b] \ K, thus has measure less than ε. ∎
CHAPTER 3

Integration

The goal of this chapter is to develop a theory of integration with respect to an


arbitrary measure.

3.1. Integrating positive functions


We begin by considering the integral only of positive functions. Moreover, we
insist that the function be measurable. We fix a measure space (X, B, µ). Let
L+ = {f : X → [0, ∞] measurable}.
Notice that L+ only depends on the measure space (X, B) and not on µ. The
simple functions have a very natural definition of integral. We bootstrap that into a
definition for positive measurable functions. Remember that simple functions only
take finite values.
3.1.1. DEFINITION. If $\varphi = \sum_{i=1}^{n} a_i \chi_{E_i}$ is a simple function in $L^+$, set
\[ \int \varphi \, d\mu := \sum_{i=1}^{n} a_i \mu(E_i). \]
We use the convention that 0 · ∞ = 0 = ∞ · 0. For general $f \in L^+$, define
\[ \int f \, d\mu := \sup\Big\{ \int \varphi \, d\mu : 0 \le \varphi \le f,\ \varphi \text{ simple} \Big\}. \]
Also define $\int_A f \, d\mu := \int f \chi_A \, d\mu$ for any measurable set A ∈ B.

The convention that 0 · ∞ = 0 means that $f = 0\chi_X$ has $\int f \, d\mu = 0$ even
if µ(X) = ∞; the integral of a simple function should not be changed by
adding a zero function to it. The second identity ∞ · 0 = 0 is also needed to
coincide with the notion that the value of f on a set of measure 0 should not affect
the integral. So $\int \infty\chi_{\{x\}} \, d\mu = 0$ if µ({x}) = 0. At this stage, we allow ∞ both as a
value of f and of the integral. Later, in some cases, we impose further restrictions.
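
To make the convention concrete, here is a small sketch of the integral of a simple function given as a list of pairs $(a_i, \mu(E_i))$. The representation and the helper names `times` and `integrate_simple` are assumptions for illustration. Note that ordinary floating point arithmetic does not follow the convention (in Python `0.0 * math.inf` is NaN), so the product has to be defined explicitly.

```python
import math

def times(a, m):
    """Product a * m in [0, inf] with the convention 0 * inf = 0 = inf * 0."""
    return 0.0 if a == 0 or m == 0 else a * m

def integrate_simple(terms):
    """Integral of sum a_i * chi_{E_i}, given as a list of (a_i, mu(E_i)) pairs."""
    return sum(times(a, m) for a, m in terms)

print(integrate_simple([(2.0, 0.5), (3.0, 1.0)]))   # finite values and finite measures
print(integrate_simple([(0.0, math.inf)]))          # the zero function on a set of infinite measure
print(integrate_simple([(math.inf, 0.0)]))          # an infinite value on a null set
```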

In various books, you will see the following notations which all mean the same
thing:
\[ \int f \, d\mu, \qquad \int f(x) \, d\mu(x), \qquad \int f(x) \, \mu(dx), \qquad \int f. \]
Of course, the last one will only make sense if the measure µ is known. We will
use this when convenient to simplify the notation.

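3.1.2. LEMMA. Let ϕ, ψ be simple functions in $L^+$; and let $f, g \in L^+$.
(a) $\int \varphi \, d\mu$ is well defined.
(b) If c ≥ 0, then $\int c\varphi \, d\mu = c \int \varphi \, d\mu$. (b′) $\int cf \, d\mu = c \int f \, d\mu$.
(c) $\int \varphi + \psi \, d\mu = \int \varphi \, d\mu + \int \psi \, d\mu$.
(d) If ϕ ≤ ψ, then $\int \varphi \, d\mu \le \int \psi \, d\mu$. (d′) If f ≤ g, $\int f \, d\mu \le \int g \, d\mu$.
(e) $\nu(A) := \int_A \varphi \, d\mu$ for A ∈ B defines a measure on (X, B).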

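PROOF. (a) Suppose that $\varphi = \sum_{i=1}^{m} a_i \chi_{E_i} = \sum_{j=1}^{n} b_j \chi_{F_j}$. We can disregard
zeros, so we may suppose that $a_i \ne 0 \ne b_j$. Moreover since the $E_i$ and $F_j$ are
pairwise disjoint collections, we have that $\{a_i : 1 \le i \le m\} = \{b_j : 1 \le j \le n\}$.
Let $c_1, \dots, c_p$ be the distinct non-zero values of ϕ, and set $A_k = \{x : \varphi(x) = c_k\}$
for 1 ≤ k ≤ p. Then
\[ A_k = \bigcup \{E_i : a_i = c_k\} = \bigcup \{F_j : b_j = c_k\}. \]
Therefore
\[
\begin{aligned}
\sum_{i=1}^{m} a_i \mu(E_i) &= \sum_{k=1}^{p} c_k \sum_{a_i = c_k} \mu(E_i)
 = \sum_{k=1}^{p} c_k\, \mu\Big(\bigcup_{a_i = c_k} E_i\Big) = \sum_{k=1}^{p} c_k \mu(A_k) \\
&= \sum_{k=1}^{p} c_k\, \mu\Big(\bigcup_{b_j = c_k} F_j\Big) = \sum_{j=1}^{n} b_j \mu(F_j).
\end{aligned}
\]
(b) is trivial, and (b′) follows because
\[ \{\varphi : \varphi \in L^+, \text{ simple}, \ \varphi \le cf\} = \{c\varphi : \varphi \in L^+, \text{ simple}, \ \varphi \le f\}. \]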
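For (c), suppose that $\varphi = \sum_{i=0}^{m} a_i \chi_{E_i}$ and $\psi = \sum_{j=0}^{n} b_j \chi_{F_j}$, where we add in
the term $a_0 = b_0 = 0$ and $E_0 = \big(\bigcup_{i=1}^{m} E_i\big)^c$ and $F_0 = \big(\bigcup_{j=1}^{n} F_j\big)^c$ in order that
$X = \bigcup_{i=0}^{m} E_i = \bigcup_{j=0}^{n} F_j$. Define $A_{ij} = E_i \cap F_j$ for 0 ≤ i ≤ m and 0 ≤ j ≤ n.
Then
\[ \varphi = \sum_{i=0}^{m} \sum_{j=0}^{n} a_i \chi_{A_{ij}}, \qquad \psi = \sum_{i=0}^{m} \sum_{j=0}^{n} b_j \chi_{A_{ij}} \qquad\text{and}\qquad \varphi + \psi = \sum_{i=0}^{m} \sum_{j=0}^{n} (a_i + b_j) \chi_{A_{ij}}. \]
Therefore using (a), we get
\[
\begin{aligned}
\int \varphi + \psi \, d\mu &= \sum_{i=0}^{m} \sum_{j=0}^{n} (a_i + b_j) \mu(A_{ij}) \\
&= \sum_{i=0}^{m} a_i \sum_{j=0}^{n} \mu(A_{ij}) + \sum_{j=0}^{n} b_j \sum_{i=0}^{m} \mu(A_{ij}) \\
&= \sum_{i=0}^{m} a_i\, \mu\Big(\bigcup_{j=0}^{n} A_{ij}\Big) + \sum_{j=0}^{n} b_j\, \mu\Big(\bigcup_{i=0}^{m} A_{ij}\Big) \\
&= \sum_{i=0}^{m} a_i \mu(E_i) + \sum_{j=0}^{n} b_j \mu(F_j) = \int \varphi \, d\mu + \int \psi \, d\mu.
\end{aligned}
\]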

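(d) Write ϕ and ψ as in (c), and note that
\[ \varphi = \sum_{i=0}^{m} \sum_{j=0}^{n} a_i \chi_{A_{ij}} \le \psi = \sum_{i=0}^{m} \sum_{j=0}^{n} b_j \chi_{A_{ij}} \]
means that $a_i \le b_j$ if $A_{ij} \ne \emptyset$. Therefore by (a),
\[ \int \varphi \, d\mu = \sum_{i=0}^{m} \sum_{j=0}^{n} a_i \mu(A_{ij}) \le \sum_{i=0}^{m} \sum_{j=0}^{n} b_j \mu(A_{ij}) = \int \psi \, d\mu. \]
(d′) follows because
\[ \{\varphi : \varphi \in L^+, \text{ simple}, \ \varphi \le f\} \subset \{\varphi : \varphi \in L^+, \text{ simple}, \ \varphi \le g\}. \]
(e) Clearly ν(∅) = 0 and ν takes values in [0, ∞]. Suppose that $A_j$ are pairwise
disjoint sets in B with $A = \dot{\bigcup}_{j \ge 1} A_j$. Then
\[
\begin{aligned}
\nu(A) &= \int_A \varphi \, d\mu = \sum_{i=1}^{n} a_i \mu(E_i \cap A) \\
&= \sum_{i=1}^{n} a_i\, \mu\Big(\dot{\bigcup_{j \ge 1}} E_i \cap A_j\Big) = \sum_{i=1}^{n} a_i \sum_{j \ge 1} \mu(E_i \cap A_j) \\
&= \sum_{j \ge 1} \sum_{i=1}^{n} a_i \mu(E_i \cap A_j) = \sum_{j \ge 1} \int_{A_j} \varphi \, d\mu = \sum_{j \ge 1} \nu(A_j).
\end{aligned}
\]
Thus ν is countably additive, and hence is a measure. ∎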

3.2. Two limit theorems


This section establishes two fundamental limit theorems, and extends the addi-
tivity of the integral to arbitrary functions in L+.

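3.2.1. LEMMA. Let ϕ ≥ 0 be a simple function. Let $A_n \in \mathcal{B}$ with $A_n \subset A_{n+1}$
and $\bigcup_{n \ge 1} A_n = X$. Then
\[ \lim_{n\to\infty} \int_{A_n} \varphi \, d\mu = \int \varphi \, d\mu. \]

PROOF. Let $\nu(A) = \int_A \varphi \, d\mu$ be the measure defined in Lemma 3.1.2(e). Then
\[ \lim_{n\to\infty} \int_{A_n} \varphi \, d\mu = \lim_{n\to\infty} \nu(A_n) = \nu(X) = \int \varphi \, d\mu \]
by the property of continuity from below. ∎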

3.2.2. MONOTONE CONVERGENCE THEOREM. Let (X, B, µ) be a measure
space. Let $f_n \in L^+$ such that $f_n \le f_{n+1}$ for n ≥ 1. Define $f(x) = \lim_{n\to\infty} f_n(x)$. Then
\[ \int f \, d\mu = \lim_{n\to\infty} \int f_n \, d\mu. \]

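PROOF. Since $f_n(x)$ is increasing, the limit always exists in [0, ∞]. Also since
$f_n \le f_{n+1} \le f$, we have that
\[ \int f_n \, d\mu \le \int f_{n+1} \, d\mu \le \int f \, d\mu. \]
Thus $\lim_{n\to\infty} \int f_n \, d\mu$ exists in [0, ∞] and is no larger than $\int f \, d\mu$.
Conversely, suppose that ϕ ≤ f is a non-negative simple function, and let
ε > 0. Define $A_n = \{x : f_n(x) \ge (1 - \varepsilon)\varphi(x)\}$. Since ϕ(x) < ∞ and
$f_n(x) \to f(x) \ge \varphi(x)$, we have $A_n \subset A_{n+1}$ and $\bigcup_{n \ge 1} A_n = X$. Therefore
by Lemma 3.2.1,
\[ (1 - \varepsilon) \int \varphi \, d\mu = \lim_{n\to\infty} \int_{A_n} (1 - \varepsilon)\varphi \, d\mu \le \lim_{n\to\infty} \int_{A_n} f_n \, d\mu \le \lim_{n\to\infty} \int f_n \, d\mu. \]
Let ε ↓ 0 to get that $\int \varphi \, d\mu \le \lim_{n\to\infty} \int f_n \, d\mu$. Taking the supremum over all such ϕ yields
$\int f \, d\mu \le \lim_{n\to\infty} \int f_n \, d\mu$. So equality holds. ∎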
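
A numerical illustration of the theorem, under assumptions that are not in the notes: take $f(x) = x^{-1/2}$ on (0, 1] with Lebesgue measure and the truncations $f_n = \min(f, n)$, which increase to f. A direct computation gives $\int f_n \, dm = 2 - 1/n$, increasing to $\int f \, dm = 2$. The sketch below estimates the integrals with a midpoint rule; the function name and cell count are arbitrary choices.

```python
def integral_truncated(n, cells=100_000):
    """Midpoint-rule estimate of the integral over [0, 1] of min(x**-0.5, n)."""
    h = 1.0 / cells
    return h * sum(min(((j + 0.5) * h) ** -0.5, n) for j in range(cells))

for n in (1, 2, 4, 8):
    # compare the numerical estimate with the closed form 2 - 1/n
    print(n, integral_truncated(n), 2 - 1 / n)
```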

3.2.3. COROLLARY. The integral is countably additive on $L^+$. That is, if $f_n \in L^+$
for n ≥ 1, then
\[ \int \sum_{n \ge 1} f_n \, d\mu = \sum_{n \ge 1} \int f_n \, d\mu. \]

PROOF. Apply MCT to the sequence $g_k = \sum_{n=1}^{k} f_n$. ∎

3.2.4. COROLLARY. If $f \in L^+$, $\nu(A) = \int_A f \, d\mu$ is a measure on (X, B).

PROOF. Countable additivity is a special case of the previous corollary; since
if $A = \dot{\bigcup}_{n \ge 1} A_n$, then $f\chi_A = \sum_{n \ge 1} f\chi_{A_n}$. ∎

3.2.5. LEMMA. If $f \in L^+$, then $\int f \, d\mu = 0$ if and only if f = 0 a.e.(µ).

PROOF. If f = 0 a.e.(µ) and 0 ≤ ϕ ≤ f where $\varphi = \sum_{i=1}^{n} a_i \chi_{E_i}$ is a simple
function, then $a_i > 0$ implies $\mu(E_i) = 0$. Therefore $\int \varphi \, d\mu = \sum_{i=1}^{n} a_i \mu(E_i) = 0$.
Hence $\int f \, d\mu = \sup_{0 \le \varphi \le f} \int \varphi \, d\mu = 0$.
Conversely, suppose that $\int f \, d\mu = 0$. Let E = {x : f(x) > 0} and
$E_n = \{x : f(x) \ge \frac{1}{n}\}$. Then $\varphi_n = \frac{1}{n}\chi_{E_n} \le f$. So $0 = \int f \, d\mu \ge \int \varphi_n \, d\mu = \frac{1}{n}\mu(E_n)$.
Therefore $\mu(E) = \lim_{n\to\infty} \mu(E_n) = 0$; i.e. f = 0 a.e.(µ). ∎

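3.2.6. COROLLARY. If $f, f_n \in L^+$ such that $f_n \le f_{n+1}$ a.e.(µ) for n ≥ 1 and
$f(x) = \lim_{n\to\infty} f_n(x)$ a.e.(µ), then
\[ \int f \, d\mu = \lim_{n\to\infty} \int f_n \, d\mu. \]

PROOF. Define
\[ E_n = \{x : f_n(x) > f_{n+1}(x)\} \text{ for } n \ge 1 \quad\text{and}\quad E_0 = \{x : \lim_{n\to\infty} f_n(x) \ne f(x)\}. \]
Then $\mu(E_n) = 0$ for n ≥ 0 by hypothesis. Let $E = \bigcup_{n \ge 0} E_n$. Then µ(E) = 0,
$f_n\chi_{E^c} \le f_{n+1}\chi_{E^c}$ and $f\chi_{E^c} = \lim_{n\to\infty} f_n\chi_{E^c}$. So by the MCT and Lemma 3.2.5,
\[ \int f \, d\mu = \int f\chi_{E^c} \, d\mu = \lim_{n\to\infty} \int f_n\chi_{E^c} \, d\mu = \lim_{n\to\infty} \int f_n \, d\mu. \ ∎ \]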

3.2.7. EXAMPLE. Things don’t work out quite as well for limits which are not
monotone. For example, let $f_n = \frac{1}{n}\chi_{[0,n]} \in L^+(m)$. Then $f_n \to 0$ uniformly on
R. However,
\[ \lim_{n\to\infty} \int f_n \, dm = \lim_{n\to\infty} 1 = 1 \ne 0 = \int 0 \, dm. \]
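
The two competing quantities in this example can be tabulated exactly. The sketch below (function names are illustrative, not from the notes) uses rational arithmetic: the supremum of $f_n$ is 1/n, while the integral is the height 1/n times the length n of the interval.

```python
from fractions import Fraction

def sup_norm(n):
    """sup of f_n = (1/n) * chi_[0, n]."""
    return Fraction(1, n)

def integral(n):
    """Integral of f_n with respect to Lebesgue measure: height times length."""
    return Fraction(1, n) * n

for n in (1, 10, 100, 1000):
    print(n, sup_norm(n), integral(n))
```

The suprema tend to 0 while every integral equals 1, so uniform convergence on a set of infinite measure does not allow the limit to pass through the integral.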

The following theorem provides an important inequality.

3.2.8. FATOU’S LEMMA. Let (X, B, µ) be a measure space. Suppose $f_n \in L^+$
for n ≥ 1. Then
\[ \int \liminf f_n \, d\mu \le \liminf \int f_n \, d\mu. \]

PROOF. Let $g_n(x) = \inf_{k \ge n} f_k(x) \le f_n(x)$. Then $g_n \le g_{n+1}$ for n ≥ 1, and
$\liminf f_n = \lim_{n\to\infty} g_n$. Therefore by MCT,
\[ \int \liminf f_n \, d\mu = \lim_{n\to\infty} \int g_n \, d\mu = \liminf \int g_n \, d\mu \le \liminf \int f_n \, d\mu. \ ∎ \]
n→∞

3.3. Integrating complex valued functions


3.3.1. DEFINITION. Let (X, B, µ) be a measure space. A measurable function
f : X → C is integrable if
\[ \|f\|_1 = \int |f| \, d\mu < \infty. \]
The set of all integrable functions will be (provisionally) called $L^1(\mu)$. Write
$f = f_1 - f_2 + i f_3 - i f_4$ where $f_1 = \operatorname{Re} f \vee 0 = \max\{\operatorname{Re} f, 0\}$, $f_2 = (-\operatorname{Re} f) \vee 0$,
$f_3 = \operatorname{Im} f \vee 0$ and $f_4 = (-\operatorname{Im} f) \vee 0$. If f is integrable, define
\[ \int f \, d\mu = \int f_1 \, d\mu - \int f_2 \, d\mu + i \int f_3 \, d\mu - i \int f_4 \, d\mu. \]
Note that since $f_i \in L^+$ and $f_i \le |f|$, we have $\int f_i \, d\mu \le \int |f| \, d\mu < \infty$. So
$\int f \, d\mu$ is defined as a complex number.
The reason for the temporary nature of our definition of $L^1(\mu)$ is that this is
not a normed vector space, due to the fact that if f = 0 a.e.(µ), then $\|f\|_1 = 0$. So
$\|\cdot\|_1$ is just a pseudo-norm. That is, it satisfies $\|\lambda f\|_1 = |\lambda| \, \|f\|_1$ and the triangle
inequality, but is not positive definite. We will rectify this soon by identifying
functions which agree a.e. into an equivalence class representing an element of
$L^1(\mu)$.

3.3.2. PROPOSITION. $L^1(\mu)$ is a vector space and $\|\cdot\|_1$ is a pseudo-norm.
That is, it is positive homogeneous: $\|\lambda f\|_1 = |\lambda| \, \|f\|_1$ for λ ∈ C, and the triangle
inequality holds, but $\|f\|_1 = 0$ if and only if f = 0 a.e.(µ). The map taking f to
$\int f \, d\mu$ is linear, and
\[ \Big| \int f \, d\mu \Big| \le \int |f| \, d\mu = \|f\|_1. \]

PROOF. Showing that $L^1(\mu)$ is a vector space is straightforward. Multiplication
by a real number or an imaginary number merely scales and shuffles the
functions $f_i$, so homogeneity for these scalars is easy. In general, you need to write
out λ = a + ib and chase some details which are left to the reader.
Lemma 3.2.5 shows that $\|f\|_1 = 0$ if and only if f = 0 a.e.(µ). The triangle
inequality is also straightforward: if $f, g \in L^1(\mu)$, then
\[ \|f + g\|_1 = \int |f + g| \, d\mu \le \int |f| + |g| \, d\mu = \|f\|_1 + \|g\|_1. \]
Linearity follows from linearity of the integral in $L^+$, and is left to the reader.
Let $f \in L^1(\mu)$ and choose θ so that $\int f = e^{i\theta} \big| \int f \big|$. Then
\[ \Big| \int f \Big| = e^{-i\theta} \int f = \int e^{-i\theta} f = \operatorname{Re} \int e^{-i\theta} f = \int \operatorname{Re} e^{-i\theta} f. \]
Write $\operatorname{Re} e^{-i\theta} f = g_1 - g_2$, where $g_1 = (\operatorname{Re} e^{-i\theta} f) \vee 0$ and $g_2 = (-\operatorname{Re} e^{-i\theta} f) \vee 0$.
Then
\[ 0 \le \int \operatorname{Re} e^{-i\theta} f = \int g_1 - \int g_2 \le \int g_1 + g_2 \le \int |f|. \]
Combining these two formulae, we get the desired inequality. ∎

We now arrive at the most important limit theorem in measure theory. With
this result and MCT, you can deal with most situations that arise.

3.3.3. LEBESGUE DOMINATED CONVERGENCE THEOREM. Suppose
that $f_n, g \in L^1(\mu)$, g ≥ 0, such that $\lim_{n\to\infty} f_n = f$ a.e.(µ) and $|f_n| \le g$ a.e.(µ) for
n ≥ 1. Then $f \in L^1(\mu)$ and
\[ \int f \, d\mu = \lim_{n\to\infty} \int f_n \, d\mu. \]

PROOF. First assume that $f_n$ are real valued. Then we apply Fatou’s Lemma 3.2.8
to the sequences $g \pm f_n$ in $L^+$ to get
\[ \int g + f \, d\mu \le \liminf \int g + f_n \, d\mu = \int g \, d\mu + \liminf \int f_n \, d\mu \]
and
\[ \int g - f \, d\mu \le \liminf \int g - f_n \, d\mu = \int g \, d\mu - \limsup \int f_n \, d\mu. \]
Since $\int g \, d\mu < \infty$, it can be cancelled off and we obtain that
\[ \limsup \int f_n \, d\mu \le \int f \, d\mu \le \liminf \int f_n \, d\mu. \]
It follows that $\int f \, d\mu = \lim_{n\to\infty} \int f_n \, d\mu$. In particular, applying this to |f| and
$|f_n|$ we obtain that
\[ \|f\|_1 = \int |f| \, d\mu = \lim_{n\to\infty} \int |f_n| \, d\mu \le \int g \, d\mu < \infty. \]
So $f \in L^1(\mu)$ and thus is integrable.
In general, split f as the sum of its real and imaginary parts, use the real result,
and then recombine. ∎

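3.3.4. EXAMPLE. Consider $\lim_{n\to\infty} \int_a^b \frac{n}{1 + n^2x^2} \, dx$ for 0 ≤ a < b ≤ ∞.
Observe that
\[ f_n(x) = \frac{n}{1 + n^2x^2} = \frac{1}{\frac{1}{n} + nx^2}. \]
So $\lim_{n\to\infty} f_n(x) = 0$ for all x > 0. Note that $f_n(x) = 1/h(\frac{1}{n})$ where
$h(t) = t + \frac{1}{t}x^2$. The function h attains its minimum at t = x. Thus if 0 < x ≤ 1,
\[ \sup_{n \ge 1} f_n(x) \le \sup_{0 < t \le 1} \frac{1}{h(t)} = \frac{1}{h(x)} = \frac{1}{2x}. \]
On the other hand, if x ≥ 1, then h is decreasing on (0, 1], so
\[ \sup_{n \ge 1} f_n(x) \le \sup_{0 < t \le 1} \frac{1}{h(t)} = \frac{1}{h(1)} = \frac{1}{1 + x^2}. \]
So if we set
\[ g(x) = \begin{cases} \dfrac{1}{2x} & \text{if } 0 < x \le 1 \\[1ex] \dfrac{1}{1 + x^2} & \text{if } x \ge 1 \end{cases} \]
then $0 \le f_n \le g$. The problem is
that $g \notin L^1(0, \infty)$ since $\int_0^1 \frac{1}{2x} \, dx = +\infty$. However if 0 < a ≤ 1, then
\[ \int_a^\infty g(x) \, dx = \int_a^1 \frac{1}{2x} \, dx + \int_1^\infty \frac{1}{1 + x^2} \, dx = \tfrac{1}{2} \ln x \Big|_a^1 + \tan^{-1}(x) \Big|_1^\infty = \tfrac{1}{2} \ln a^{-1} + \tfrac{\pi}{4} < \infty. \]
Hence as long as a > 0, the Lebesgue dominated convergence theorem (LDCT)
applies. So $\lim_{n\to\infty} \int_a^b f_n \, dx = 0$. However if a = 0,
\[ \int_0^b f_n(x) \, dx = \int_0^b \frac{n}{1 + n^2x^2} \, dx = \tan^{-1}(nx) \Big|_0^b = \tan^{-1}(nb). \]
So
\[ \lim_{n\to\infty} \int_0^b f_n(x) \, dx = \lim_{n\to\infty} \tan^{-1}(nb) = \frac{\pi}{2} \ne \int_0^b 0 \, dx = 0. \]
Thus the domination condition by an integrable function is crucial.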
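
The two regimes of this example are easy to see numerically, since the antiderivative $\tan^{-1}(nx)$ gives the integrals in closed form. A minimal sketch (the function name `integral_fn` is an illustrative choice):

```python
import math

def integral_fn(n, a, b):
    """Integral over [a, b] of n / (1 + n^2 x^2), via the antiderivative atan(n x)."""
    return math.atan(n * b) - math.atan(n * a)

for n in (1, 10, 1000, 10**6):
    # a = 0.1 > 0: dominated, the integrals tend to 0
    # a = 0: no integrable dominating function, the integrals tend to pi/2
    print(n, integral_fn(n, 0.1, 1.0), integral_fn(n, 0.0, 1.0))
```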

3.4. The space L1 (µ)


For the purposes of this section, let $\mathcal{L}^1(\mu)$ denote the vector space of all
µ-integrable functions.

3.4.1. DEFINITION. Given (X, B, µ), let $N = \{f \in \mathcal{L}^1(\mu) : \|f\|_1 = 0\} =
\{f \text{ measurable} : f = 0 \text{ a.e.}(\mu)\}$. Put an equivalence relation on $\mathcal{L}^1(\mu)$ by f ∼ g if
f = g a.e.(µ); i.e. if f − g ∈ N. The normed vector space $L^1(\mu) = \mathcal{L}^1(\mu)/N$
consists of the equivalence classes with the induced norm $\|[f]\|_1 = \|f\|_1$. By abuse
of notation, we will frequently write $f \in L^1(\mu)$ when we mean [f].

3.4.2. T HEOREM . L1 (µ) is a complete normed vector space (a Banach space).



PROOF. It is easy to check that N is a subspace of the vector space $\mathcal{L}^1(\mu)$ of
integrable functions, so that the quotient $L^1(\mu) = \mathcal{L}^1(\mu)/N$ is a vector space. Notice that by Proposition 3.3.2,
f = g a.e.(µ) if and only if f − g ∈ N if and only if $\|f - g\|_1 = 0$. If [f] is an element
of $L^1(\mu)$ and f ∼ f′, then h = f − f′ = 0 a.e. Thus $\int |f| \, d\mu = \int |f'| \, d\mu$. So
$\|[f]\|_1$ is well defined. The construction has quotiented out all non-zero elements
of zero norm, and thus the norm on $L^1(\mu)$ is positive definite. It is easily seen to
be positive homogeneous. The triangle inequality follows immediately from the
triangle inequality for the pseudo-norm on $\mathcal{L}^1(\mu)$. Thus $L^1(\mu)$ is a normed vector space.
Now suppose that $([f_n])_{n \ge 1}$ is a Cauchy sequence in $L^1(\mu)$. Choose representatives
$f_n \in [f_n]$ so that we can evaluate them at points. For each k ≥ 1,
there is an $n_k$ (which we may take to be increasing in k) so that if $m, n \ge n_k$, then $\|f_m - f_n\|_1 < 2^{-k}$. In particular,
$\|f_{n_{k+1}} - f_{n_k}\|_1 < 2^{-k}$. Let $g = |f_{n_1}| + \sum_{k \ge 1} |f_{n_{k+1}} - f_{n_k}|$. By the MCT,
\[
\begin{aligned}
\|g\|_1 = \int g \, d\mu &= \int |f_{n_1}| \, d\mu + \sum_{k \ge 1} \int |f_{n_{k+1}} - f_{n_k}| \, d\mu \\
&= \|f_{n_1}\|_1 + \sum_{k \ge 1} \|f_{n_{k+1}} - f_{n_k}\|_1 < \|f_{n_1}\|_1 + \sum_{k \ge 1} 2^{-k} < \infty.
\end{aligned}
\]
Hence g is integrable, and therefore is finite a.e.(µ), say on X \ N for µ(N) = 0.
It follows that the sum
\[ f(x) = f_{n_1}(x) + \sum_{k \ge 1} \big( f_{n_{k+1}}(x) - f_{n_k}(x) \big) = \lim_{k\to\infty} f_{n_k}(x) \]
converges absolutely on X \ N. We define $f|_N = 0$. Moreover |f| ≤ g so that f is
integrable, and $f = \lim_{k\to\infty} f_{n_k}$ a.e.(µ). Also
\[ |f_{n_k}| \le |f_{n_1}| + \sum_{i=1}^{k-1} |f_{n_{i+1}} - f_{n_i}| \le g. \]
Therefore $|f - f_{n_k}| \le 2g$; and so by LDCT,
\[ \lim_{k\to\infty} \|f - f_{n_k}\|_1 = \|f - f\|_1 = 0. \]
That is, $f_{n_k}$ converges to f in $L^1(\mu)$. Finally a standard argument shows that the
whole Cauchy sequence $(f_n)$ converges to f in norm. ∎

3.5. Comparison with the Riemann Integral


We will do a speedy review of part of the Riemann integral theory. Given a
bounded real valued function f on a finite interval [a, b], we consider a partition
P = {a = x0 < x1 < · · · < xn = b} of [a, b]. Define real numbers
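\[ M_j = \sup_{x_{j-1} \le x \le x_j} f(x) \quad\text{and}\quad m_j = \inf_{x_{j-1} \le x \le x_j} f(x) \quad\text{for } 1 \le j \le n. \]
Then
\[ l_P(x) = m_1\chi_{\{a\}} + \sum_{j=1}^{n} m_j \chi_{(x_{j-1}, x_j]} \le f \le M_1\chi_{\{a\}} + \sum_{j=1}^{n} M_j \chi_{(x_{j-1}, x_j]} = u_P(x). \]
The upper and lower sums are
\[ L(f, P) = \sum_{j=1}^{n} m_j (x_j - x_{j-1}) \quad\text{and}\quad U(f, P) = \sum_{j=1}^{n} M_j (x_j - x_{j-1}). \]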

If P and Q are two partitions, let P ∨ Q denote the partition using the points in
P ∪ Q. It is easy to show that
L(f, P) ≤ L(f, P ∨ Q) ≤ U (f, P ∨ Q) ≤ U (f, Q).
It follows that
\[ \sup_P L(f, P) \le \inf_Q U(f, Q). \]

A function f is Riemann integrable if for every ε > 0, there is a partition P so
that
\[ U(f, P) - L(f, P) = \sum_{j=1}^{n} (M_j - m_j)(x_j - x_{j-1}) < \varepsilon. \]
In this case, one defines
\[ \int_a^b f(x) \, dx := \sup_P L(f, P) = \inf_Q U(f, Q). \]
If one defines the mesh of a partition as $\operatorname{mesh}(P) = \max_{1 \le j \le n} (x_j - x_{j-1})$, then
for any ε > 0, there is a δ > 0 so that if mesh(P) < δ, then U(f, P) − L(f, P) < ε.
Hence we can take a nested sequence of partitions $P_1 \subset P_2 \subset \dots \subset P_n \subset P_{n+1} \subset \dots$ with
$\operatorname{mesh}(P_n) \to 0$ and conclude that
\[ \int_a^b f(x) \, dx = \lim_{n\to\infty} L(f, P_n) = \lim_{n\to\infty} U(f, P_n). \]
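
The upper and lower sums are straightforward to compute for a monotone function, where the supremum and infimum on each cell sit at the endpoints. The sketch below is an illustration under that assumption (increasing f, uniform partitions, exact rational arithmetic); the function name is hypothetical. For $f(x) = x^2$ on [0, 1] the gap U − L is exactly 1/n, and the sums squeeze the value 1/3.

```python
from fractions import Fraction

def lower_upper(f, a, b, n):
    """L(f, P) and U(f, P) for an increasing f on the uniform n-cell partition of [a, b].

    For increasing f, m_j = f(x_{j-1}) and M_j = f(x_j).
    """
    xs = [a + (b - a) * Fraction(j, n) for j in range(n + 1)]
    lower = sum(f(xs[j - 1]) * (xs[j] - xs[j - 1]) for j in range(1, n + 1))
    upper = sum(f(xs[j]) * (xs[j] - xs[j - 1]) for j in range(1, n + 1))
    return lower, upper

for k in range(1, 8):
    n = 2 ** k                      # dyadic partitions are nested
    lo, up = lower_upper(lambda x: x * x, 0, 1, n)
    print(n, float(lo), float(up), up - lo)
```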

3.5.1. THEOREM. Every Riemann integrable function f on [a, b] is Lebesgue
integrable, and $\int f \, dm = \int_a^b f(x) \, dx$.

P ROOF. Notice that a Riemann integrable function is approximated from above


and below by a special class of simple functions, the piecewise constant functions,
lP ≤ f ≤ uP . Moreover the Lebesgue and Riemann integrals agree on these
piecewise constant functions. Take a nested sequence of partitions Pn as above
with mesh Pn → 0 and observe that
lPn ≤ lPn+1 ≤ f ≤ uPn+1 ≤ uPn .

Define $l(x) = \lim_{n\to\infty} l_{P_n}(x)$ and $u(x) = \lim_{n\to\infty} u_{P_n}(x)$. Note that l ≤ f ≤ u.
These functions are Lebesgue measurable. Moreover, $l_{P_n} - l_{P_1} \ge 0$ and increase
to $l - l_{P_1}$. Since $l_{P_1}$ is integrable, the MCT shows that
\[
\begin{aligned}
\int l \, dm &= \int l_{P_1} \, dm + \lim_{n\to\infty} \int l_{P_n} - l_{P_1} \, dm \\
&= \lim_{n\to\infty} \int l_{P_n} \, dm = \lim_{n\to\infty} L(f, P_n) = \int_a^b f(x) \, dx.
\end{aligned}
\]
Similarly, $\int u \, dm = \int_a^b f(x) \, dx$. In particular, $\int u - l \, dm = 0$. By Lemma 3.2.5,
u − l = 0 a.e.(m). Therefore l = f = u a.e.(m), so that f is measurable and
\[ \int f \, dm = \int u \, dm = \int_a^b f(x) \, dx. \ ∎ \]

3.5.2. REMARKS. There is one situation where the Riemann integral can do
something that the Lebesgue integral can’t. That is the improper Riemann integrals,
in which a function which is not Riemann integrable can be integrated as a limit.
The typical example is $\int_0^\infty \frac{\sin x}{x} \, dx$. The integrand extends to be continuous at
x = 0, so that is not an issue. In the Riemann theory, this integral is evaluated as
\[ \int_0^\infty \frac{\sin x}{x} \, dx = \lim_{r\to\infty} \int_0^r \frac{\sin x}{x} \, dx. \]
This can be shown by a number of techniques to equal $\frac{\pi}{2}$. The reason it is not
Lebesgue integrable is that this integral exists only as a conditional limit, and
\[ \int_0^\infty \Big| \frac{\sin x}{x} \Big| \, dx = \lim_{r\to\infty} \int_0^r \Big| \frac{\sin x}{x} \Big| \, dx = +\infty. \]
Since a Lebesgue integrable function f always has |f| integrable as well, this
function is neither Lebesgue nor (properly) Riemann integrable.
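Let’s examine this example in more detail.
\[ \int_0^\infty \frac{\sin x}{x} \, dx = \sum_{n=0}^{\infty} \int_{n\pi}^{(n+1)\pi} \frac{\sin x}{x} \, dx = \sum_{n=0}^{\infty} (-1)^n a_n \]
where for n ≥ 1,
\[ a_n = \int_{n\pi}^{(n+1)\pi} \frac{|\sin x|}{x} \, dx \le \frac{1}{n\pi} \int_0^\pi \sin x \, dx = \frac{2}{n\pi} \]
and
\[ a_n = \int_{n\pi}^{(n+1)\pi} \frac{|\sin x|}{x} \, dx \ge \frac{1}{(n+1)\pi} \int_0^\pi \sin x \, dx = \frac{2}{(n+1)\pi}. \]
It follows that $a_n \ge \frac{2}{(n+1)\pi} \ge a_{n+1}$ for n ≥ 1; so $a_n \to 0$ monotonely. Therefore
$\sum_{n=0}^{\infty} (-1)^n a_n$ converges by the alternating series test. It also shows that
\[ \int_0^\infty \Big| \frac{\sin x}{x} \Big| \, dx = \sum_{n=0}^{\infty} \int_{n\pi}^{(n+1)\pi} \frac{|\sin x|}{x} \, dx = \sum_{n=0}^{\infty} a_n. \]
Since $a_n \ge \frac{2}{(n+1)\pi}$ for n ≥ 1, this series diverges by comparison with the harmonic
series.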
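
The bounds on $a_n$ can be sanity-checked numerically. The following sketch estimates each $a_n$ with a midpoint rule (the quadrature method, cell count and function names are illustrative assumptions, not part of the notes) and accumulates partial sums, which grow without bound like the harmonic series.

```python
import math

def a(n, cells=2000):
    """Midpoint-rule estimate of the integral of |sin x| / x over [n*pi, (n+1)*pi]."""
    h = math.pi / cells
    return h * sum(abs(math.sin(x)) / x
                   for x in (n * math.pi + (j + 0.5) * h for j in range(cells)))

def partial_sum(N):
    """a_1 + ... + a_N, a lower bound for the integral of |sin x| / x over [pi, (N+1)*pi]."""
    return sum(a(n) for n in range(1, N + 1))

for n in (1, 5, 25):
    # lower bound, estimate, upper bound
    print(n, 2 / ((n + 1) * math.pi), a(n), 2 / (n * math.pi))
print(partial_sum(50))
```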
The other place where an improper Riemann integral is required is the integration
of unbounded functions. The Riemann theory works only for bounded functions.
A function like $f(x) = x^{-a}$ for 0 < a < 1 has an improper Riemann integral
on [0, 1]. Since this function is positive and continuous with a finite integral, it is
Lebesgue integrable. It is possible again to construct a function with alternating
sign on (0, 1] that blows up near 0 so that the improper Riemann integral exists as
\[ \int_0^1 f(x) \, dx = \lim_{\varepsilon\to 0} \int_\varepsilon^1 f(x) \, dx \]
but so that the absolute value is not integrable. An example is $\int_0^1 \frac{1}{x} \sin\frac{1}{x} \, dx$, which
after the change of variables $u = \frac{1}{x}$ becomes the integral $\int_1^\infty \frac{\sin u}{u} \, du$ which
we have already analyzed.
When you have to integrate specific functions, you will usually find yourself
falling back on the many techniques that have been developed for the Riemann in-
tegral. Almost all explicit functions that you will need to integrate will be Riemann
integrable or at least locally Riemann integrable.
The power of the Lebesgue integral lies in the limit theorems MCT and LDCT.
These are much stronger than anything available in the Riemann theory. Also we
obtain the completeness of L1 (µ) and many other related spaces that allow the
power of functional analysis to apply.

There is a nice characterization of Riemann integrable functions due to Lebesgue.

3.5.3. T HEOREM . If f : [a, b] → R is a bounded function, then f is Riemann


integrable if and only if f is continuous except on a set of measure 0.

PROOF. Define functions
\[ U(x) = \inf_{\delta > 0} \, \sup_{|y - x| < \delta} f(y) \quad\text{and}\quad L(x) = \sup_{\delta > 0} \, \inf_{|z - x| < \delta} f(z). \]

It is an easy exercise to show that f is continuous at x if and only if L(x) = U (x).


The quantity ω(f, x) = U (x) − L(x) is called the oscillation of f at x.

Suppose that f is Riemann integrable. We use the notation from the proof of
Theorem 3.5.1. Choose an increasing family of partitions $P_n$ with $\operatorname{mesh}(P_n) \to 0$,
and define $l_{P_n}$, l, $u_{P_n}$ and u as before. Let
\[ A = \{x : l(x) \ne u(x)\} \cup \bigcup_{n \ge 1} P_n. \]
The set {x : l(x) ≠ u(x)} has measure 0, and $\bigcup_{n \ge 1} P_n$ is countable. Therefore
m(A) = 0. Moreover
\[ \lim_{n\to\infty} u_{P_n}(x) - l_{P_n}(x) = u(x) - l(x) = 0 \quad\text{for } x \in [a, b] \setminus A. \]
If x ∈ [a, b] \ A and ε > 0, choose n so that $u_{P_n}(x) - l_{P_n}(x) < \varepsilon$. Then there are adjacent
points $x_{i-1} < x_i$ in $P_n$ so that $x_{i-1} < x < x_i$. Let $\delta = \min\{x - x_{i-1}, x_i - x\}$.
Then since $u_{P_n}$ and $l_{P_n}$ are constant on $(x_{i-1}, x_i]$ and are upper and lower bounds
for f, respectively, it follows that
\[ U(x) - L(x) \le \sup_{|y - x| < \delta,\ |z - x| < \delta} f(y) - f(z) \le u_{P_n}(x) - l_{P_n}(x) < \varepsilon. \]
Since ε > 0 is arbitrary, we have U(x) = L(x), and thus f is continuous on
[a, b] \ A, which is a.e.(m).
Conversely suppose that f is continuous except on a set A of Lebesgue measure
0. Then U(x) = L(x) for x ∈ [a, b] \ A. Note that $A = \bigcup_{n \ge 1} A_n$ where
$A_n = \{x \in [a, b] : U(x) - L(x) = \omega(f, x) \ge 2^{-n}\}$. The set $A_n$ is closed because its
complement is open: if ω(f, x) < r, then there is a δ > 0 so that
\[ \sup_{|y - x| < \delta,\ |z - x| < \delta} f(y) - f(z) < r. \quad\text{So if } |x' - x| = d < \delta, \quad \sup_{|y - x'| < \delta - d,\ |z - x'| < \delta - d} f(y) - f(z) < r. \]
Thus ω(f, x′) < r as well. Cover $A_n$ with a countable family of open intervals
of total length at most $2^{-n}$. Then since $A_n$ is compact, there is a finite subcover
$I_1, \dots, I_m$. Let $B = [a, b] \setminus \bigcup_{i=1}^{m} I_i$. For each x ∈ B, since $U(x) - L(x) < 2^{-n}$,
there is an open interval $J_x \ni x$ so that the oscillation over $J_x$ is less than $2^{-n}$. The
collection $\{J_x : x \in B\}$ is an open cover of the compact set B. Therefore there is
a finite subcover $J_{x_1}, \dots, J_{x_p}$.
Let $P_n$ be the partition consisting of the endpoints of $I_1, \dots, I_m, J_{x_1}, \dots, J_{x_p}$
(together with a, b if necessary). Any of the intervals $[x_{i-1}, x_i]$ contained in the
union $\bigcup_{j=1}^{p} J_{x_j}$ will have oscillation less than $2^{-n}$, meaning $u_{P_n}(x) - l_{P_n}(x) < 2^{-n}$
on $(x_{i-1}, x_i]$; these intervals have total length at most b − a. The remaining intervals are contained in $\bigcup_{i=1}^{m} I_i$ and thus have total
length at most $2^{-n}$. The only thing we can say is that $u_{P_n}(x) - l_{P_n}(x) \le 2\|f\|_\infty$
on these intervals. Therefore we obtain the estimate
\[ U(f, P_n) - L(f, P_n) \le 2^{-n}(b - a) + 2\|f\|_\infty \, 2^{-n} = 2^{-n}\big(b - a + 2\|f\|_\infty\big). \]
As n → ∞, this converges to 0. Hence f is Riemann integrable. ∎

3.6. Product Measures


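3.6.1. DEFINITION. If $X_\lambda$ for λ ∈ Λ are non-empty sets, then the product
space is
\[ X = \prod_{\lambda \in \Lambda} X_\lambda = \{(x_\lambda) : x_\lambda \in X_\lambda,\ \lambda \in \Lambda\} = \Big\{f : \Lambda \to \dot{\bigcup_{\lambda \in \Lambda}} X_\lambda : f(\lambda) = x_\lambda \in X_\lambda\Big\}. \]
The maps $\pi_\lambda : X \to X_\lambda$ given by $\pi_\lambda(x) = x_\lambda$ are the coordinate projections.
If $(X_\lambda, \mathcal{B}_\lambda)$ are σ-algebras, then the product σ-algebra $\big(\prod_{\lambda \in \Lambda} X_\lambda, \bigotimes_{\lambda \in \Lambda} \mathcal{B}_\lambda\big)$
is the σ-algebra of subsets of X generated by the sets $\{\pi_\lambda^{-1}(A) : A \in \mathcal{B}_\lambda,\ \lambda \in \Lambda\}$.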

3.6.2. R EMARK . When Λ = {1, . . . , n} is finite, there is no problem defining


the product space. In this case, the product σ-algebra is generated by all of the
“cubes” A1 × · · · × An for Ai ∈ Bi , 1 ≤ i ≤ n.
When Λ is infinite, the Axiom of Choice is often needed to guarantee that the
product space is non-empty. In this case, the product σ-algebra is generated by
sets $\pi_\lambda^{-1}(A_\lambda) = A_\lambda \times \prod_{\mu \in \Lambda \setminus \{\lambda\}} X_\mu$. The intersection of finitely many yields
a cube of the form $A_{\lambda_1} \times \dots \times A_{\lambda_n} \times \prod_{\mu \in \Lambda \setminus \{\lambda_i : 1 \le i \le n\}} X_\mu$. One can also take
the intersection of countably many such slices, but if Λ is uncountable, the cubes
formed with proper subsets from each $X_\lambda$ will generally not belong to the product
σ-algebra. Even when Λ is countable, these infinite cubes may turn out to have
measure 0 and hence be negligible.

The following proposition shows that in the familiar case of Borel sets on met-
ric spaces, and finite products, we get the desired result. The separability hypothesis
is crucial.

3.6.3. PROPOSITION. If $(X_i, d_i)$ are separable metric spaces for 1 ≤ i ≤ n,
then
\[ \bigotimes_{i=1}^{n} \operatorname{Bor}(X_i) = \operatorname{Bor}\Big(\prod_{i=1}^{n} X_i\Big). \]

PROOF. The product space $X = \prod_{i=1}^{n} X_i$ is also a metric space with the metric
$d((x_i), (y_i)) = \max\{d_i(x_i, y_i)\}$. With this (or any equivalent) metric, the coordinate
projections are continuous. Thus if $U_i$ is open in $X_i$, the set $\pi_i^{-1}(U_i)$ is open
in X. These sets generate $\bigotimes_{i=1}^{n} \operatorname{Bor}(X_i)$, and thus it is contained in Bor(X).
Conversely, since each $X_i$ is separable, so is X. Therefore X is second countable.
Indeed if $\{x_j : j \ge 1\}$ is a dense subset, then $\{b_r(x_j) : j \ge 1,\ r \in \mathbb{Q}^+\}$
is a countable neighbourhood base for X. Now $b_r(x_j) = \prod_{i=1}^{n} b_r(x_{j,i})$, where
$\pi_i(x_j) = x_{j,i}$, belongs to $\bigotimes_{i=1}^{n} \operatorname{Bor}(X_i)$. As a σ-algebra is closed under countable
unions and every open set in X is the union of those (countably many) sets in the
neighbourhood base that it contains, it follows that every open subset of X belongs
to $\bigotimes_{i=1}^{n} \operatorname{Bor}(X_i)$. Thus Bor(X) is contained in $\bigotimes_{i=1}^{n} \operatorname{Bor}(X_i)$. ∎


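3.6.4. COROLLARY.
\[ \operatorname{Bor}(\mathbb{R}^n) = \bigotimes_{i=1}^{n} \operatorname{Bor}(\mathbb{R}) \quad\text{and}\quad \operatorname{Bor}(\mathbb{C}) = \operatorname{Bor}(\mathbb{R}) \otimes \operatorname{Bor}(\mathbb{R}). \]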

Now let (X, B, µ) and (Y, B′, ν) be two measure spaces. Let A be the collection
of all finite unions of disjoint rectangles of the form A × B for A ∈ B and
B ∈ B′. This is an algebra since it is closed under finite unions and complements.
Indeed, $(A \times B)^c = A^c \times Y \mathbin{\dot\cup} A \times B^c$ and
\[ A_1 \times B_1 \cup A_2 \times B_2 = A_1 \times B_1 \mathbin{\dot\cup} (A_2 \setminus A_1) \times B_2 \mathbin{\dot\cup} (A_1 \cap A_2) \times (B_2 \setminus B_1). \]
Now define a set function
\[ \pi\Big(\dot{\bigcup_{i=1}^{n}} A_i \times B_i\Big) = \sum_{i=1}^{n} \mu(A_i)\nu(B_i). \]
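
The two set identities behind the algebra property can be checked on small finite sets. The sketch below is only an illustration: the particular sets are arbitrary choices, and the union is split into the three pieces $A_1 \times B_1$, $(A_2 \setminus A_1) \times B_2$ and $(A_1 \cap A_2) \times (B_2 \setminus B_1)$. With counting measure on both factors, the sum of the sizes of the pieces is the value of π on the union.

```python
from itertools import product

def rect(A, B):
    """The rectangle A x B as a set of pairs."""
    return set(product(A, B))

def is_disjoint_cover(pieces, target):
    """True if the pieces are pairwise disjoint and their union is exactly target."""
    total = sum(len(p) for p in pieces)
    return set().union(*pieces) == target and total == len(target)

X, Y = set(range(6)), set(range(5))
A1, B1 = {0, 1, 2, 3}, {0, 1, 2}
A2, B2 = {2, 3, 4}, {1, 2, 3}

# complement of a rectangle as a disjoint union of two rectangles
comp_pieces = [rect(X - A1, Y), rect(A1, Y - B1)]
# union of two rectangles as a disjoint union of three rectangles
union_pieces = [rect(A1, B1), rect(A2 - A1, B2), rect(A1 & A2, B2 - B1)]

print(is_disjoint_cover(comp_pieces, rect(X, Y) - rect(A1, B1)))
print(is_disjoint_cover(union_pieces, rect(A1, B1) | rect(A2, B2)))
```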
To apply Carathéodory’s Theorem, we need the following lemma.

3.6.5. L EMMA . π is a premeasure on A.

PROOF. We need to show that if $A \times B = \dot{\bigcup}_{i \ge 1} A_i \times B_i$, then
\[ \pi(A \times B) = \mu(A)\nu(B) = \sum_{i \ge 1} \mu(A_i)\nu(B_i) = \sum_{i \ge 1} \pi(A_i \times B_i). \]
Note that
\[ \chi_A(x)\chi_B(y) = \sum_{i \ge 1} \chi_{A_i}(x)\chi_{B_i}(y). \]
If we fix y ∈ Y, we obtain a sum of non-negative measurable functions on X. By
the Monotone Convergence Theorem,
\[
\begin{aligned}
\mu(A)\chi_B(y) &= \int \chi_A(x) \, d\mu(x) \, \chi_B(y) \\
&= \sum_{i \ge 1} \int \chi_{A_i}(x) \, d\mu(x) \, \chi_{B_i}(y) = \sum_{i \ge 1} \mu(A_i)\chi_{B_i}(y).
\end{aligned}
\]
These functions are non-negative measurable functions on Y, so a second application
of the MCT yields
\[
\begin{aligned}
\mu(A)\nu(B) &= \mu(A) \int \chi_B(y) \, d\nu(y) \\
&= \sum_{i \ge 1} \mu(A_i) \int \chi_{B_i}(y) \, d\nu(y) = \sum_{i \ge 1} \mu(A_i)\nu(B_i).
\end{aligned}
\]
Hence π is a premeasure. ∎

Now we can apply Carathéodory’s Theorem via Theorem 1.4.2 to obtain the
following, called the product measure.

3.6.6. THEOREM. Let (X, B, µ) and (Y, B′, ν) be two measure spaces. There
is a complete measure µ × ν on X × Y, defined on a σ-algebra $\overline{\mathcal{B} \otimes \mathcal{B}'} \supset \mathcal{B} \otimes \mathcal{B}'$,
such that µ × ν(A × B) = µ(A)ν(B) for all A ∈ B and B ∈ B′.

We need some more refined information about the sets in B ⊗ B 0 .

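3.6.7. DEFINITION. Let A be the algebra of finite unions of rectangles in B ⊗ B′
as above. Define
\[ \mathcal{A}_\sigma = \Big\{E = \bigcup_{i \ge 1} A_i : A_i \in \mathcal{A}\Big\} \]
and
\[ \mathcal{A}_{\sigma\delta} = \Big\{G = \bigcap_{j \ge 1} E_j : E_j \in \mathcal{A}_\sigma\Big\}. \]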

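3.6.8. LEMMA. If E ∈ B ⊗ B′ and µ × ν(E) < ∞, then there is a set $G \in \mathcal{A}_{\sigma\delta}$
such that E ⊂ G and µ × ν(G \ E) = 0.

PROOF. By definition of the outer measure π* which defines µ × ν, we have
that
\[ \mu \times \nu(E) = \inf\Big\{ \sum_{i \ge 1} \mu \times \nu(A_i) : A_i \in \mathcal{A} \text{ and } E \subset \bigcup_{i \ge 1} A_i \Big\}. \]
Thus we can choose $E_j = \bigcup_{i \ge 1} A_{j,i} \supset E$ in $\mathcal{A}_\sigma$ so that $\mu \times \nu(E_j) < \mu \times \nu(E) + \frac{1}{j}$.
Hence $\mu \times \nu(E_j \setminus E) < \frac{1}{j}$. Define $G = \bigcap_{j \ge 1} E_j \in \mathcal{A}_{\sigma\delta}$. Then E ⊂ G and
µ × ν(G \ E) = 0. ∎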

3.7. Product Integration


An important result from the Riemann theory for integration in multiple vari-
ables is that integration over a nice region in Rn is equal to an iterated integral in
which the integration is done one variable at a time. There is an important analogue
of this for general measures. It actually comes in two flavours.

3.7.1. FUBINI'S THEOREM. Let $(X, \mathcal{B}, \mu)$ and $(Y, \mathcal{B}', \nu)$ be complete measures. If $f \in L^1(\mu \times \nu)$, then
(1) (i) $f_x(y) = f(x, y) \in L^1(\nu)$ for a.e. $x(\mu)$.
    (ii) $f^y(x) = f(x, y) \in L^1(\mu)$ for a.e. $y(\nu)$.
(2) (i) $\int_Y f_x(y)\,d\nu =: F(x) \in L^1(\mu)$.
    (ii) $\int_X f^y(x)\,d\mu =: G(y) \in L^1(\nu)$.
(3) $\displaystyle \int_{X\times Y} f\,d\mu{\times}\nu = \int_X \Big( \int_Y f(x, y)\,d\nu(y) \Big) d\mu(x) = \int_Y \Big( \int_X f(x, y)\,d\mu(x) \Big) d\nu(y).$

3.7.2. TONELLI'S THEOREM. Let $(X, \mathcal{B}, \mu)$ and $(Y, \mathcal{B}', \nu)$ be complete measures, and suppose that $\mu \times \nu$ is σ-finite. If $f \in L^+(\mu \times \nu)$, then
(1) (i) $f_x(y) = f(x, y) \in L^+(\nu)$ for a.e. $x(\mu)$.
    (ii) $f^y(x) = f(x, y) \in L^+(\mu)$ for a.e. $y(\nu)$.
(2) (i) $\int_Y f_x(y)\,d\nu =: F(x) \in L^+(\mu)$.
    (ii) $\int_X f^y(x)\,d\mu =: G(y) \in L^+(\nu)$.
(3) $\displaystyle \int_{X\times Y} f\,d\mu{\times}\nu = \int_X \Big( \int_Y f(x, y)\,d\nu(y) \Big) d\mu(x) = \int_Y \Big( \int_X f(x, y)\,d\mu(x) \Big) d\nu(y).$

3.7.3. REMARKS. (1) The hypothesis that $\mu \times \nu$ is σ-finite in Tonelli's Theorem is critical, as an example will show. However it is not needed for Fubini's Theorem because when $f \in L^1(\mu \times \nu)$, we have that $\int |f|\,d\mu{\times}\nu = L < \infty$. It follows that $C_n = \{(x, y) \in X \times Y : |f(x, y)| \ge \frac1n\}$ has $\mu \times \nu(C_n) \le nL$, since $\frac1n \chi_{C_n} \le |f|$. Thus the restriction of $\mu \times \nu$ to $C = \{(x, y) \in X \times Y : f(x, y) \ne 0\} = \bigcup_{n\ge1} C_n$ is σ-finite.

(2) The hypothesis that $f \in L^1(\mu \times \nu)$ or $f \in L^+(\mu \times \nu)$ is also critical, as examples will show. The failure to check this condition leads to common misuses of these theorems.

(3) Normal practice is to omit the parentheses and, if there is no confusion, also the variables, in multiple integrals. So we write
\[ \iint f\,d\nu\,d\mu \quad\text{or}\quad \iint f(x, y)\,d\nu(y)\,d\mu(x) \quad\text{for}\quad \int_X \Big( \int_Y f(x, y)\,d\nu(y) \Big) d\mu(x). \]

If $E \subset X \times Y$, define $E_x = \{y \in Y : (x, y) \in E\}$ for $x \in X$; and let $E^y = \{x \in X : (x, y) \in E\}$ for $y \in Y$.

3.7.4. LEMMA. If $E \in \mathcal{A}_{\sigma\delta}$ and $\mu \times \nu(E) < \infty$, then $g(x) = \nu(E_x)$ is µ-measurable, $g \in L^+ \cap L^1(\mu)$, and $\int g\,d\mu = \mu \times \nu(E)$. Similarly, $h(y) = \mu(E^y)$ is ν-measurable, $h \in L^+ \cap L^1(\nu)$, and $\int h\,d\nu = \mu \times \nu(E)$.

PROOF. If $E = A \times B$ for $A \in \mathcal{B}$ and $B \in \mathcal{B}'$, then
\[ E_x = \begin{cases} B & \text{if } x \in A \\ \emptyset & \text{if } x \notin A. \end{cases} \]
Therefore $g(x) = \nu(B)\chi_A(x)$ is µ-measurable and $g \ge 0$. Moreover,
\[ \int g\,d\mu = \int \nu(B)\chi_A\,d\mu = \mu(A)\nu(B) = \mu \times \nu(E) < \infty. \]
Thus $g \in L^+ \cap L^1(\mu)$.
Next, if $E \in \mathcal{A}_\sigma$, we can write $E = \bigcup_{i\ge1} A_i \times B_i$ for $A_i \in \mathcal{B}$ and $B_i \in \mathcal{B}'$. We can rewrite this as a disjoint union because $(A_n \times B_n) \setminus \bigcup_{i=1}^{n-1} A_i \times B_i \in \mathcal{A}$ and thus can be written as a finite disjoint union of rectangles, which are clearly also disjoint from $\bigcup_{i=1}^{n-1} A_i \times B_i$. Proceeding recursively, $E$ can be rewritten as a disjoint union $E = \dot{\bigcup}_{i\ge1} C_i \times D_i$ for $C_i \in \mathcal{B}$ and $D_i \in \mathcal{B}'$. Let $g(x) = \nu(E_x)$ and $g_i(x) = \nu(D_i)\chi_{C_i}(x)$. Then $g(x) = \sum_{i\ge1} g_i(x)$. Hence $g$ is measurable, and thus belongs to $L^+(\mu)$. By the MCT,
\[ \int g\,d\mu = \sum_{i\ge1} \int g_i\,d\mu = \sum_{i\ge1} \mu \times \nu(C_i \times D_i) = \mu \times \nu(E) < \infty. \]
Moreover, we now have $g \in L^+ \cap L^1(\mu)$.
Finally, suppose that $E = \bigcap_{n\ge1} E_n$ where $E_n \supset E_{n+1}$ all lie in $\mathcal{A}_\sigma$ such that $\mu \times \nu(E_1) < \infty$ and $\mu \times \nu(E_n) \downarrow \mu \times \nu(E)$. Let $g_n(x) = \nu((E_n)_x)$ and $g(x) = \nu(E_x)$. Then
\[ 0 \le g \le g_{n+1} \le g_n \le g_1; \]
and $g(x) = \inf g_n(x) = \lim g_n(x)$. Hence $g$ is µ-measurable. It is dominated by $g_1$ which is integrable, so $g \in L^+ \cap L^1(\mu)$. By the LDCT,
\[ \int g\,d\mu = \lim_{n\to\infty} \int g_n\,d\mu = \lim_{n\to\infty} \mu \times \nu(E_n) = \mu \times \nu(E). \]
The last equality follows by continuity from above since $\mu \times \nu(E_1) < \infty$.
Similarly we can interchange the role of $x$ and $y$. □
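As a sanity check on the statement of Lemma 3.7.4, the following Python sketch treats the simplest possible case, where both measures are counting measure on finite sets (an assumption made purely for illustration; the helper name `section_sizes` is hypothetical). Then $\nu(E_x)$ and $\mu(E^y)$ are just the sizes of the sections, and both integrals reduce to counting the points of $E$.

```python
from collections import Counter

def section_sizes(E):
    """For a finite set E of pairs, return (g, h) where
    g[x] = number of points in the section E_x and
    h[y] = number of points in the section E^y."""
    g = Counter(x for x, _ in E)
    h = Counter(y for _, y in E)
    return g, h

# A small subset of {0,1,2} x {0,1,2}.
E = {(0, 0), (0, 1), (1, 1), (2, 0), (2, 1), (2, 2)}
g, h = section_sizes(E)

# Integrating the section sizes against counting measure is a finite sum.
total_by_x = sum(g.values())  # sum over x of nu(E_x)
total_by_y = sum(h.values())  # sum over y of mu(E^y)
```

Both totals agree with the number of points in `E`, which is the counting-measure value of $\mu \times \nu(E)$.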

3.7.5. LEMMA. If $E \in \mathcal{B} \mathbin{\bar\otimes} \mathcal{B}'$ has $\mu \times \nu(E) = 0$, then $\nu(E_x) = 0$ a.e.$(\mu)$ and $\mu(E^y) = 0$ a.e.$(\nu)$.

PROOF. By Lemma 3.6.8, there is a set $G \in \mathcal{A}_{\sigma\delta}$ such that $E \subset G$ and $\mu \times \nu(G) = 0$. Therefore by the previous lemma, $f(x) = \nu(G_x) \in L^+ \cap L^1(\mu)$ and $\int f\,d\mu = 0$. Therefore $f = 0$ a.e.$(\mu)$. Now $E_x \subset G_x$, and since $\nu$ is a complete measure, $g(x) = \nu(E_x) = 0$ a.e.$(\mu)$. It could happen that $E_x$ is not ν-measurable for $x$ in a set of µ-measure 0. However, since $\mu$ is complete, a function which differs from a measurable function on a set of measure 0 is also measurable.
Similarly we can interchange the role of $x$ and $y$. □

3.7.6. COROLLARY. If $E \in \mathcal{B} \mathbin{\bar\otimes} \mathcal{B}'$ and $\mu \times \nu(E) < \infty$, then $E_x$ is ν-measurable for a.e.$(\mu)$ $x \in X$, $g(x) = \nu(E_x)$ is µ-measurable, $g \in L^+ \cap L^1(\mu)$, and $\int g\,d\mu = \mu \times \nu(E)$. Similarly, $E^y$ is µ-measurable for a.e.$(\nu)$ $y \in Y$, $h(y) = \mu(E^y)$ is ν-measurable, $h \in L^+ \cap L^1(\nu)$, and $\int h\,d\nu = \mu \times \nu(E)$.

PROOF. Find $G \in \mathcal{A}_{\sigma\delta}$ so that $E \subset G$ and $\mu \times \nu(G \setminus E) = 0$. Then by Lemma 3.7.4, $f(x) = \nu(G_x) \in L^+ \cap L^1(\mu)$, and $\int f\,d\mu = \mu \times \nu(G)$. By Lemma 3.7.5 applied to $G \setminus E$, $0 \le f - g = 0$ a.e.$(\mu)$. The result follows.
Similarly we can interchange the role of $x$ and $y$. □
Now we are ready to prove the theorems.
PROOF OF FUBINI'S THEOREM. The various integrals and iterated integrals are linear operations, so we can write $f \in L^1(\mu \times \nu)$ as $f = f_1 - f_2 + if_3 - if_4$ where $f_i \in L^+ \cap L^1(\mu \times \nu)$ by letting $f_1 = \operatorname{Re} f \vee 0$, etc. So we may suppose that $f \in L^+ \cap L^1(\mu \times \nu)$. There are simple functions $\varphi_n$ with $0 \le \varphi_n \le \varphi_{n+1} \le f$ so that $f = \lim_{n\to\infty} \varphi_n$ and each $\varphi_n$ is supported on a set of finite measure (such as the set $C_n$ in Remark 3.7.3(1)).
By Corollary 3.7.6 and linearity, Fubini's Theorem is valid for simple functions with finite measure support. By the MCT,
\[ \int f\,d\mu{\times}\nu = \lim_{n\to\infty} \int \varphi_n\,d\mu{\times}\nu = \lim_{n\to\infty} \int_X \int_Y \varphi_n(x, y)\,d\nu(y)\,d\mu(x). \]
The functions $F_n(x) = \int_Y \varphi_n(x, y)\,d\nu(y)$ are positive, measurable and monotone increasing to $F(x) = \int_Y f(x, y)\,d\nu(y)$. Hence this is a measurable function, and by MCT,
\[ \lim_{n\to\infty} \int_X F_n(x)\,d\mu(x) = \int_X F(x)\,d\mu(x). \]
Thus
\[ \int_{X\times Y} f\,d\mu{\times}\nu = \int_X \int_Y f(x, y)\,d\nu(y)\,d\mu(x). \]
Interchanging the role of $X$ and $Y$ yields the other iterated integral. □
PROOF OF TONELLI'S THEOREM. Let $f \in L^+(\mu \times \nu)$. Since $\mu \times \nu$ is σ-finite, there are measurable sets $C_n \subset C_{n+1}$ with $\mu \times \nu(C_n) < \infty$ and $X \times Y = \bigcup_{n\ge1} C_n$. For $n \ge 1$, let $f_n = (f \wedge n)\chi_{C_n}$, where $(f \wedge n)(x) = \min\{f(x), n\}$. Then $0 \le f_n \le n\chi_{C_n}$, and so $f_n \in L^+ \cap L^1(\mu \times \nu)$. Now by Fubini's Theorem and MCT, we have
\[ \int f\,d\mu{\times}\nu = \lim_{n\to\infty} \int f_n\,d\mu{\times}\nu = \lim_{n\to\infty} \int_X \int_Y f_n(x, y)\,d\nu(y)\,d\mu(x). \]
The proof is completed as before. The functions $F_n(x) = \int_Y f_n(x, y)\,d\nu(y)$ are positive, measurable and monotone increasing to $F(x) = \int_Y f(x, y)\,d\nu(y)$. Hence this is a measurable function, and by MCT,
\[ \lim_{n\to\infty} \int_X F_n(x)\,d\mu(x) = \int_X F(x)\,d\mu(x). \]
Thus
\[ \int_{X\times Y} f\,d\mu{\times}\nu = \int_X \int_Y f(x, y)\,d\nu(y)\,d\mu(x). \]
Interchanging the role of $X$ and $Y$ yields the other iterated integral. □

There is a straightforward variant of these results when the measures are not complete. The difference is that given measures $(X, \mathcal{B}, \mu)$ and $(Y, \mathcal{B}', \nu)$, we form $\mu \times \nu$ and restrict it to the σ-algebra $\mathcal{B} \otimes \mathcal{B}'$. I will also call this $\mu \times \nu$.

3.7.7. FUBINI-TONELLI THEOREM WITHOUT COMPLETENESS. Let $(X, \mathcal{B}, \mu)$ and $(Y, \mathcal{B}', \nu)$ be measures with product $(X \times Y, \mathcal{B} \otimes \mathcal{B}', \mu \times \nu)$. If $f \in L^1(\mu \times \nu)$ (or if $\mu \times \nu$ is σ-finite and $f \in L^+(\mu \times \nu)$), then
(1) (i) $f_x(y) = f(x, y) \in L^1(\nu)$ (or $L^+(\nu)$) for all $x \in X$.
    (ii) $f^y(x) = f(x, y) \in L^1(\mu)$ (or $L^+(\mu)$) for all $y \in Y$.
(2) (i) $\int_Y f_x(y)\,d\nu =: F(x) \in L^1(\mu)$ (or $L^+(\mu)$).
    (ii) $\int_X f^y(x)\,d\mu =: G(y) \in L^1(\nu)$ (or $L^+(\nu)$).
(3) $\displaystyle \int_{X\times Y} f\,d\mu{\times}\nu = \int_X \Big( \int_Y f(x, y)\,d\nu(y) \Big) d\mu(x) = \int_Y \Big( \int_X f(x, y)\,d\mu(x) \Big) d\nu(y).$

In Assignment 4, Q5, you showed that if $E \in \mathcal{B} \otimes \mathcal{B}'$, then $E_x \in \mathcal{B}'$ and $E^y \in \mathcal{B}$ for all $x \in X$ and $y \in Y$. This extends to simple functions, so that (1i) and (1ii) hold for simple functions. This extends to limits, and since every measurable function is a limit of simple functions, we obtain (1i) and (1ii). The remainder of the proof is identical to the proof in the complete case.
We now present a few counterexamples that show the limits of these theorems.

3.7.8. EXAMPLE. Consider counting measure on $(\mathbb{N}, \mathcal{P}(\mathbb{N}), m_c)$. Form the product, which is $(\mathbb{N}^2, \mathcal{P}(\mathbb{N}^2), m_c \times m_c)$, where $m_c \times m_c$ is just counting measure on $\mathbb{N}^2$. Consider
\[ f(m, n) = \begin{cases} 1 & \text{if } n = m \\ -1 & \text{if } n = m + 1 \\ 0 & \text{otherwise.} \end{cases} \]
Think of this as an $\mathbb{N}$ by $\mathbb{N}$ array
\[ \begin{array}{rrrrrc} 1 & -1 & 0 & 0 & 0 & \cdots \\ 0 & 1 & -1 & 0 & 0 & \cdots \\ 0 & 0 & 1 & -1 & 0 & \cdots \\ 0 & 0 & 0 & 1 & -1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{array} \]
The first iterated integral first sums the rows, each absolutely summable with total 0:
\[ \int_{\mathbb{N}} \int_{\mathbb{N}} f(m, n)\,dm_c(n)\,dm_c(m) = \sum_{m\ge1} \Big( \sum_{n\ge1} f(m, n) \Big) = \sum_{m\ge1} 0 = 0. \]
The second iterated integral first sums the columns, each absolutely summable. However the first column sums to 1, the rest to 0:
\[ \int_{\mathbb{N}} \int_{\mathbb{N}} f(m, n)\,dm_c(m)\,dm_c(n) = \sum_{n\ge1} \Big( \sum_{m\ge1} f(m, n) \Big) = 1 + \sum_{n\ge2} 0 = 1. \]
These measures are σ-finite since $\mathbb{N}^2$ is countable and points have measure 1. Tonelli's theorem is not applicable because $f$ is not positive. More importantly, Fubini's theorem does not apply because $f$ is not integrable. If it were, then $|f|$ would have a finite integral, but clearly
\[ \iint |f|\,dm_c{\times}m_c = \sum_{m\ge1} \sum_{n\ge1} |f(m, n)| = \infty. \]
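The two iterated sums in Example 3.7.8 can be checked directly. The Python sketch below is only an illustration (the helper names `row_sum` and `col_sum` and the cutoff `N` are assumptions); it relies on the fact that each row and each column of the array has at most two non-zero entries, so every inner sum is a finite sum that can be computed exactly.

```python
def f(m, n):
    """The array entry from Example 3.7.8 (indices start at 1)."""
    if n == m:
        return 1
    if n == m + 1:
        return -1
    return 0

def row_sum(m):
    # Row m is supported on columns m and m + 1, so summing n <= m + 1 is exact.
    return sum(f(m, n) for n in range(1, m + 2))

def col_sum(n):
    # Column n is supported on rows n - 1 and n, so summing m <= n is exact.
    return sum(f(m, n) for m in range(1, n + 1))

N = 50  # number of outer terms to accumulate
rows_first = sum(row_sum(m) for m in range(1, N + 1))
cols_first = sum(col_sum(n) for n in range(1, N + 1))

# Partial sums of |f| grow linearly in N, reflecting that f is not integrable.
abs_partial = sum(abs(f(m, n)) for m in range(1, N + 1) for n in range(1, N + 2))
```

Every row sums to 0, while the first column sums to 1 and the others to 0, so the two orders of summation disagree for every cutoff `N`.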

3.7.9. EXAMPLE. Consider Lebesgue measure $(X = [0, 1], \mathcal{B}, m)$ and counting measure on $(Y = [0, 1], \mathcal{P}([0, 1]), m_c)$. We have $([0, 1]^2, \mathcal{B} \otimes \mathcal{P}([0, 1]), m \times m_c)$ as the product. Consider $D = \{(x, x) : x \in [0, 1]\}$, and let $f = \chi_D$. Observe that
\[ D = \bigcap_{k\ge1} \bigcup_{j=1}^{2^k} \Big[ \frac{j-1}{2^k}, \frac{j}{2^k} \Big]^2 \]
is a measurable set. Consider the iterated integrals. When integrating $f_x(y) = \delta_x$ with respect to counting measure, we get 1. Therefore
\[ \int_X \int_Y f(x, y)\,dm_c(y)\,dm(x) = \int_X 1\,dm(x) = 1. \]
When integrating $f^y(x) = \delta_y$ with respect to Lebesgue measure, we get 0. Therefore
\[ \int_Y \int_X f(x, y)\,dm(x)\,dm_c(y) = \int_Y 0\,dm_c(y) = 0. \]
What went wrong? Here $f \ge 0$ is measurable, so belongs to $L^+(m \times m_c)$. The problem is that $m_c$ on $[0, 1]$ is not σ-finite because $[0, 1]$ is uncountable.
Finally let's compute $\iint f\,dm{\times}m_c = m \times m_c(D)$. This is computed using the outer measure obtained by covering $D$ with a countable number of rectangles, say $D \subset \bigcup_{i\ge1} A_i \times B_i$ where $A_i$ are Lebesgue measurable. Note that this union cannot cover $D$ if for each $i \ge 1$, either $m(A_i) = 0$ or $B_i$ is finite. Indeed if $m(A_i) = 0$ for $i \in J_1$ and $B_i$ is finite for $i \in J_2$, then
\[ \bigcup_{i\in J_1} A_i \times B_i \subset A \times [0, 1] \quad\text{and}\quad \bigcup_{i\in J_2} A_i \times B_i \subset [0, 1] \times B \]
where $A = \bigcup_{i\in J_1} A_i$ has $m(A) = 0$, and $B = \bigcup_{i\in J_2} B_i$ is countable. But their union omits $A^c \times B^c$, which contains $(A \cup B)^c \times (A \cup B)^c$ and so intersects $D$. Therefore there is some $i_0$ so that $m(A_{i_0}) > 0$ and $B_{i_0}$ is infinite, and hence $m_c(B_{i_0}) = \infty$. But then the premeasure of the rectangle is $\pi(A_{i_0} \times B_{i_0}) = m(A_{i_0})m_c(B_{i_0}) = \infty$. Since $D$ is measurable, we have that $m \times m_c(D) = \infty$. So neither iterated integral yields the value of the product integral even though both exist.

3.7.10. EXAMPLE. Let $X = Y = [0, 1]$ with Lebesgue measure. If we assume the Continuum Hypothesis, there is a well-ordering $\prec$ on $[0, 1]$ with the property that $\{y \in [0, 1] : y \prec x\}$ is countable for all $x$. Let $E = \{(x, y) : y \prec x\} \subset [0, 1]^2$. Let $f = \chi_E$. Clearly $f \ge 0$. For each $x \in [0, 1]$, $E_x = \{y \in [0, 1] : y \prec x\}$ is countable. And for each $y \in [0, 1]$, $E^y = \{x \in [0, 1] : y \prec x\}$ is the complement of a countable set. In particular, these sets are Lebesgue measurable. So we can evaluate the iterated integrals
\[ \int_X \int_Y f(x, y)\,dm(y)\,dm(x) = \int_X m(E_x)\,dm(x) = 0 \]
and
\[ \int_Y \int_X f(x, y)\,dm(x)\,dm(y) = \int_Y m(E^y)\,dm(y) = 1. \]
What went wrong? The measures are finite, and hence σ-finite. If $E$ were measurable, then its measure would be finite and so $f$ would belong to $L^+ \cap L^1(m^2)$. Then both Fubini's Theorem and Tonelli's Theorem would apply! It must be the case that $E$ is not measurable. This is in spite of the fact that $E_x$ and $E^y$ are Borel for all $x$ and $y$.
There is a variant of this example which avoids the Continuum Hypothesis, but requires more detailed knowledge of ordinals. Let $\Omega$ be the first uncountable ordinal. This is a well-ordered uncountable set with respect to an order $\prec$ with the property that $\{y \in \Omega : y \prec x\}$ is countable for all $x \in \Omega$. The open sets in the order topology are generated by $\{y \in \Omega : y \prec x\}$ and $\{y \in \Omega : x \prec y\}$ for $x \in \Omega$. It is not hard to see that these sets are either countable or have countable complement (co-countable). Moreover one can check that $\mathrm{Bor}(\Omega)$ consists of all countable and co-countable sets. Put a measure $\mu$ on $(\Omega, \mathrm{Bor}(\Omega))$ by
\[ \mu(A) = \begin{cases} 0 & \text{if } A \text{ is countable} \\ 1 & \text{if } A \text{ is co-countable.} \end{cases} \]
This is a finite measure. Form the product space $(\Omega^2, \mathrm{Bor}(\Omega^2), \mu \times \mu)$. Again we set $E = \{(x, y) \in \Omega^2 : y \prec x\}$ and $f = \chi_E$. We have the same contradiction. It must be the case that $E$ is not a measurable set.

3.8. Lebesgue Measure on $\mathbb{R}^n$

Let $(\mathbb{R}, \mathcal{L}, m)$ be Lebesgue measure on the real line. Then we can form $m^n = m \times \cdots \times m$ on $(\mathbb{R}^n, \mathcal{L} \mathbin{\bar\otimes} \cdots \mathbin{\bar\otimes} \mathcal{L}) = (\mathbb{R}^n, \overline{\mathrm{Bor}(\mathbb{R}^n)})$. This is the completion of a Borel measure on $\mathbb{R}^n$ such that
\[ m^n(A_1 \times \cdots \times A_n) = \prod_{i=1}^n m(A_i) \quad\text{for } A_i \in \mathcal{L}. \]
This is the unique complete measure with this property. Let $\mathcal{L}_n$ denote the σ-algebra of Lebesgue measurable sets in $\mathbb{R}^n$.
Even though $m$ is a complete measure, the σ-algebra $\mathcal{L} \otimes \mathcal{L}$ is not complete. For example, if $E \subset \mathbb{R}$ is not measurable and $B \in \mathcal{L}$ is non-empty with $m(B) = 0$, then
\[ E \times B \subset \mathbb{R} \times B = \bigcup_{n\in\mathbb{Z}} [n, n+1) \times B. \]
Since $m^2([n, n+1) \times B) = m(B) = 0$, we have that $m^2(\mathbb{R} \times B) = 0$. Then since $m^2$ is complete, $m^2(E \times B) = 0$ as well. This set does not belong to $\mathcal{L} \otimes \mathcal{L}$.
Note that when there is no confusion, Lebesgue measure on $\mathbb{R}^n$ is commonly written as $m$ instead of $m^n$. We will adopt this practice.
The first result is to show that certain regularity properties of Lebesgue measure on the line transfer to $m^n$.

3.8.1. PROPOSITION. Let $E \in \mathcal{L}_n$, and let $\varepsilon > 0$.
(1) There is an open set $U \supset E$ such that $m(U \setminus E) < \varepsilon$; and
    $m(E) = \inf\{m(U) : E \subset U,\ U \text{ open}\}$.
(2) There is a closed set $C \subset E$ so that $m(E \setminus C) < \varepsilon$; and
    $m(E) = \sup\{m(K) : K \subset E,\ K \text{ compact}\}$.
(3) There is an $F_\sigma$ set $F$ and a $G_\delta$ set $G$ so that $F \subset E \subset G$ and $m(E \setminus F) = 0 = m(G \setminus E)$.
(4) If $m(E) < \infty$, then there is a finite set of pairwise disjoint rectangles $R_1, \dots, R_N$ whose sides are intervals so that $m\big( E \mathbin{\triangle} \bigcup_{i=1}^N R_i \big) < \varepsilon$.

PROOF. (1) First assume that $E$ is bounded. The construction of $m^n$ comes from the outer measure on the algebra $\mathcal{A}$ of finite unions of disjoint rectangles. Thus there is a countable union of (bounded) rectangles $R_i$ covering $E$ such that $\sum_{i\ge1} m(R_i) < m(E) + \varepsilon/2$. For each $i \ge 1$, we can use Theorem 1.5.6 to enclose the sides of $R_i$ in open sets with slight increase in measure to obtain an open rectangle $U_i \supset R_i$ with $m(U_i) < m(R_i) + 2^{-i-1}\varepsilon$. Then $E \subset U = \bigcup_{i\ge1} U_i$ and
\[ m(U) \le \sum_{i\ge1} m(U_i) < \sum_{i\ge1} \big( m(R_i) + 2^{-i-1}\varepsilon \big) < m(E) + \varepsilon. \]
Hence $m(U \setminus E) < \varepsilon$.
Now if $E$ is unbounded, divide $\mathbb{R}^n$ into countably many disjoint unit cubes labeled $C_i$ for $i \ge 1$. Then $E = \bigcup_{i\ge1} E \cap C_i$. For each $i$, find an open set $U_i \supset E \cap C_i$ so that $m(U_i \setminus (E \cap C_i)) < 2^{-i}\varepsilon$. Then $U = \bigcup_{i\ge1} U_i$ does the job.
The other parts of this proposition are left as an exercise. □

3.8.2. LEMMA. If $f$ is Lebesgue measurable, there is a $G_\delta$ set $G$ with $m(G) = 0$ so that $f\chi_{G^c}$ is Borel measurable.

PROOF. Write $f = f_1 - f_2 + if_3 - if_4$ where $f_i$ are positive and measurable. This reduces the problem to the case of a positive function $f$.
Let $\{r_n : n \ge 1\}$ be a dense subset of $[0, \infty)$. Then $E_n = f^{-1}([0, r_n))$ are measurable. Select $F_\sigma$ sets $B_n$ and null sets $N_n$ so that $E_n = B_n \cup N_n$. Then $N = \bigcup_{n\ge1} N_n$ is a null set. Let $G$ be a $G_\delta$ set so that $N \subset G$ and $m(G) = 0$. Then $(f\chi_{G^c})^{-1}([0, r_n)) = E_n \cup G = B_n \cup G$ is Borel for all $r_n$. The countable union of Borel sets is Borel, and thus $f\chi_{G^c}$ is Borel. □

As for Lebesgue measure on the line, Lebesgue measure on $\mathbb{R}^n$ is translation invariant. It also behaves well under a linear change of variables just as the Riemann integral does. Let $GL_n$ denote the group of invertible $n \times n$ real matrices. It is a fact from linear algebra that every invertible matrix is the product of a number of elementary matrices [1, §3.2, Corollary 3]. The elementary matrices have three types: multiplying the $i$th row by a non-zero scalar $c$
\[ R_i(x_1, \dots, x_i, \dots, x_n) = (x_1, \dots, cx_i, \dots, x_n), \]
adding a multiple of the $j$th row to the $i$th row for $i \ne j$, $1 \le i, j \le n$
\[ A_{ij}(x_1, \dots, x_i, \dots, x_j, \dots, x_n) = (x_1, \dots, x_i + cx_j, \dots, x_j, \dots, x_n), \]
and interchanging two rows $i, j$ for $i \ne j$, $1 \le i, j \le n$
\[ S_{ij}(x_1, \dots, x_i, \dots, x_j, \dots, x_n) = (x_1, \dots, x_j, \dots, x_i, \dots, x_n). \]

3.8.3. THEOREM. Let $m$ be Lebesgue measure on $\mathbb{R}^n$.
(1) $m$ is translation invariant: $m(E + x) = m(E)$ for $E \in \mathcal{L}_n$ and $x \in \mathbb{R}^n$.
(2) If $T \in GL_n$ and $f$ is measurable, then $f \circ T$ is measurable. If $f \in L^1(m)$ or $L^+(m)$, then
\[ \int f\,dm = |\det T| \int f \circ T\,dm. \]
    In particular, if $E \in \mathcal{L}_n$, then $m(T(E)) = |\det T|\,m(E)$.
(3) $m$ is invariant under rotation: $m(U(E)) = m(E)$ for $E \in \mathcal{L}_n$ and $U$ unitary.

PROOF. (1) Note that translation of a cube preserves the measure. Thus the premeasure is translation invariant. It follows that the outer measure is translation invariant, and hence so is $m$.
(2) If $f$ is Borel measurable, then so is $f \circ T$ because $T$ is continuous and hence Borel by Corollary 2.1.3. If the result is true for invertible matrices $S$ and $T$, then
\[ \int f\,dm = |\det S| \int f \circ S\,dm = |\det S|\,|\det T| \int (f \circ S) \circ T\,dm = |\det ST| \int f \circ ST\,dm. \]
Hence it is true for their product. Therefore it suffices to establish the result for elementary matrices.
Using either the Fubini or Tonelli Theorem, the integral is equal to the iterated integral. For multiplying a row by a non-zero constant, the iterated integral reduces to the one variable fact. For adding a multiple of one row to another, integrate first with respect to $x_i$ and use translation invariance. Interchanging two coordinates is equivalent to interchanging the order of integration, which results in no change, again by the Fubini or Tonelli theorem.
For $E$ Borel, apply the result to $\chi_E$ to get
\[ m(E) = \int \chi_E\,dm = |\det T^{-1}| \int \chi_E \circ T^{-1}\,dm = |\det T|^{-1} \int \chi_{T(E)}\,dm = |\det T|^{-1} m(T(E)). \]
The result for a Lebesgue measurable function $f$ now follows because there is a $G_\delta$ set $G$ so that $f\chi_{G^c}$ is Borel measurable. Now $(f\chi_{G^c}) \circ T = (f \circ T)\,\chi_{T^{-1}(G)^c}$. By the Borel result, $m(T^{-1}(G)) = 0$. So $f \circ T = (f\chi_{G^c}) \circ T$ a.e.$(m)$. The result follows.
(3) If $U$ is unitary, then $|\det U| = 1$. In particular every rotation is unitary; and thus $m$ is rotation invariant. □
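The identity $m(T(E)) = |\det T|\,m(E)$ is easy to test by hand in the plane when $E$ is the unit square, since $T(E)$ is then a parallelogram whose area the shoelace formula gives exactly. The Python sketch below is an illustration under that assumption; the helper names `apply`, `shoelace_area` and `det2` are hypothetical and the matrix is an arbitrary choice.

```python
def apply(T, v):
    """Apply a 2x2 matrix T = ((a, b), (c, d)) to the vector v."""
    (a, b), (c, d) = T
    return (a * v[0] + b * v[1], c * v[0] + d * v[1])

def shoelace_area(poly):
    """Area of a simple polygon given by its vertices in order."""
    s = 0
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        s += x1 * y2 - x2 * y1
    return abs(s) / 2

def det2(T):
    return T[0][0] * T[1][1] - T[0][1] * T[1][0]

# Unit square E, so m(E) = 1.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]

T = ((2, 1), (1, 3))                      # an invertible matrix, det = 5
image = [apply(T, v) for v in square]     # vertices of the parallelogram T(E)
area = shoelace_area(image)
```

Here `area` equals `abs(det2(T))`, in agreement with part (2) of the theorem. A shear such as `((1, 4), (0, 1))` has determinant 1 and leaves the area unchanged, which matches the treatment of the elementary matrices $A_{ij}$ in the proof.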

3.9. Infinite Product Measures


We briefly mention the subject of infinite products. Suppose that we are given probability measures $(X_i, \mathcal{B}_i, \mu_i)$; i.e. $\mu_i(X_i) = 1$ for $i \ge 1$. We wish to define a measure $\mu$ on $\big( \prod_{i\ge1} X_i, \bigotimes_{i\ge1} \mathcal{B}_i \big)$ so that
\[ \mu\Big( A_1 \times \cdots \times A_n \times \prod_{i>n} X_i \Big) = \prod_{i=1}^n \mu_i(A_i) \]
for all $n \ge 1$ and $A_i \in \mathcal{B}_i$.
Let $\mathcal{A}$ denote the algebra of sets generated by the cylinders $\pi_i^{-1}(A_i)$ for $i \ge 1$ and $A_i \in \mathcal{B}_i$, where as usual $\pi_i$ is the coordinate projection of $X = \prod_{i\ge1} X_i$ onto $X_i$. One can show as in section 3.6 that elements of $\mathcal{A}$ can be written as a finite disjoint union of rectangles of the form
\[ R(A_1, \dots, A_n) := A_1 \times \cdots \times A_n \times \prod_{i>n} X_i. \]
Define $\mu\big( R(A_1, \dots, A_n) \big) = \prod_{i=1}^n \mu_i(A_i)$ and extend this to finite disjoint unions by additivity. Note that we have already shown that there are measures $\mu^{(n)} = \mu_1 \times \cdots \times \mu_n$ on $\big( \prod_{i=1}^n X_i, \bigotimes_{i=1}^n \mathcal{B}_i \big)$ so that $\mu^{(n)}(A_1 \times \cdots \times A_n) = \mu(R)$ for all rectangles $R = R(A_1, \dots, A_n)$ in $\bigotimes_{i=1}^n \mathcal{B}_i \times \prod_{i>n} X_i$. If we let $\mu^{(m,n]} = \mu_{m+1} \times \cdots \times \mu_n$ for $1 \le m < n$, we have that $\mu^{(n)} = \mu^{(m)} \times \mu^{(m,n]}$.

3.9.1. THEOREM. $\mu$ is a premeasure on $\mathcal{A}$.

PROOF. It suffices to show that if $R$, $R_i$ for $i \ge 1$ are rectangles and $R = \dot{\bigcup}_{i\ge1} R_i$, then $\mu(R) = \sum_{i\ge1} \mu(R_i)$. There is an integer $m_0$ so that
\[ R \in \bigotimes_{j=1}^{m_0} \mathcal{B}_j \times \prod_{j>m_0} X_j =: \mathcal{B}^{(m_0)}. \]
Then choose integers $m_0 \le m_n \le m_{n+1}$ so that $R_i \in \mathcal{B}^{(m_n)}$ for $1 \le i \le n$. Define $F_n = R \setminus \dot{\bigcup}_{i=1}^n R_i$. Observe that $F_n \in \mathcal{A} \cap \mathcal{B}^{(m_n)}$, $F_1 \supset \cdots \supset F_n \supset F_{n+1} \supset \cdots$ and $\bigcap_{n\ge1} F_n = \emptyset$. Therefore
\[ \mu(R) = \mu^{(m_n)}(R) = \sum_{i=1}^n \mu^{(m_n)}(R_i) + \mu^{(m_n)}(F_n) = \sum_{i=1}^n \mu(R_i) + \mu(F_n). \]

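Our result will follow if we show that $\lim_{n\to\infty} \mu(F_n) = 0$.
Assuming that this is false, we have that $\mu(F_n) \ge \varepsilon > 0$ for all $n \ge 1$. We will produce a contradiction. Given $F \in \mathcal{A}$ and $x_i \in X_i$ for $1 \le i \le p$, define
\[ F(x_1, \dots, x_p) = \Big\{ x \in \prod_{i>p} X_i : (x_1, \dots, x_p, x) \in F \Big\}. \]
Then define
\[ G_{1,n} = \big\{ x_1 \in X_1 : \mu^{(1,m_n]}(F_n(x_1)) > \varepsilon/2 \big\}. \]
Note that
\begin{align*}
\varepsilon \le \mu(F_n) &= \int_{X_1} \mu^{(1,m_n]}(F_n(x_1))\,d\mu_1(x_1) \\
&\le \int_{G_{1,n}} 1\,d\mu_1(x_1) + \int_{G_{1,n}^c} \frac{\varepsilon}{2}\,d\mu_1(x_1) \le \mu_1(G_{1,n}) + \frac{\varepsilon}{2}.
\end{align*}
Therefore $\mu_1(G_{1,n}) \ge \frac{\varepsilon}{2}$. Now $G_{1,n} \supset G_{1,n+1}$ and hence $\mu_1\big( \bigcap_{n\ge1} G_{1,n} \big) \ge \frac{\varepsilon}{2}$. Choose a point $a_1 \in \bigcap_{n\ge1} G_{1,n}$. Then
\[ \mu^{(1,m_n]}(F_n(a_1)) \ge \frac{\varepsilon}{2} \quad\text{for all } n \ge 1. \]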
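Recursively we construct points $a_i \in X_i$ so that
\[ \mu^{(k,m_n]}(F_n(a_1, \dots, a_k)) \ge \frac{\varepsilon}{2^k} \quad\text{for all } n \ge 1. \]
Suppose that $a_1, \dots, a_{k-1}$ have been defined with this property. Define
\[ G_{k,n} = \Big\{ x_k \in X_k : \mu^{(k,m_n]}(F_n(a_1, \dots, a_{k-1}, x_k)) > \frac{\varepsilon}{2^k} \Big\}. \]
Arguing as above,
\begin{align*}
\frac{\varepsilon}{2^{k-1}} &\le \mu^{(k-1,m_n]}(F_n(a_1, \dots, a_{k-1})) \\
&= \int_{X_k} \mu^{(k,m_n]}(F_n(a_1, \dots, a_{k-1}, x_k))\,d\mu_k(x_k) \\
&\le \int_{G_{k,n}} 1\,d\mu_k(x_k) + \int_{G_{k,n}^c} \frac{\varepsilon}{2^k}\,d\mu_k(x_k) \le \mu_k(G_{k,n}) + \frac{\varepsilon}{2^k}.
\end{align*}
Therefore $\mu_k\big( \bigcap_{n\ge1} G_{k,n} \big) \ge \frac{\varepsilon}{2^k}$. Pick a point $a_k \in \bigcap_{n\ge1} G_{k,n}$. This does the job.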
It follows that for all $k \ge 1$ and $n \ge 1$, there is a point $x$ so that $(a_1, \dots, a_k, x)$ is in $F_n$. In particular, there is a point $y$ so that $(a_1, \dots, a_{m_n}, y) \in F_n$. But $F_n$ is a union of cylinder sets of level no greater than $m_n$, and hence $(a_1, \dots, a_{m_n}, y) \in F_n$ for all $y \in \prod_{i>m_n} X_i$. Therefore $a = (a_1, a_2, a_3, \dots) \in \bigcap_{n\ge1} F_n$. This is a contradiction since this intersection is empty.
It follows that $\lim_{n\to\infty} \mu(F_n) = 0$ and hence $\mu(R) = \sum_{i\ge1} \mu(R_i)$. □
As an immediate consequence of Theorem 1.4.2, we obtain the desired measure.

3.9.2. COROLLARY. There is a unique complete measure $\mu = \prod_{i\ge1} \mu_i$ on $\big( \prod_{i\ge1} X_i, \bigotimes_{i\ge1} \mathcal{B}_i \big)$ so that
\[ \mu\Big( A_1 \times \cdots \times A_n \times \prod_{i>n} X_i \Big) = \prod_{i=1}^n \mu_i(A_i) \]
for all $n \ge 1$ and $A_i \in \mathcal{B}_i$.

3.9.3. EXAMPLE. Fix $0 < p < 1$. Take $X_i = 2 = \{0, 1\}$ and let $\mu_i$ be the probability measure on $(X_i, \mathcal{P}(X_i))$ such that $\mu_i(0) = p$ and $\mu_i(1) = 1 - p$. Then $X = \prod_{i\ge1} X_i$ is homeomorphic to the Cantor set $C$. Let $\mu_p$ denote the infinite product measure $\prod_{i\ge1} \mu_i$.

Identify the Cantor ternary set with $\{x = 0.(2\varepsilon_1)(2\varepsilon_2)(2\varepsilon_3)\dots_{\text{base } 3}\}$ for $\varepsilon = (\varepsilon_1, \varepsilon_2, \dots) \in X$. The homeomorphism of $X$ onto $C$ is given by
\[ h(\varepsilon) = 0.(2\varepsilon_1)(2\varepsilon_2)(2\varepsilon_3)\dots_{\text{base } 3} \quad\text{for } \varepsilon = (\varepsilon_1, \varepsilon_2, \varepsilon_3, \dots) \in X. \]
We obtain a Borel measure on $C$ by setting $\nu_p(A) = \mu_p(h^{-1}(A))$.
Let $C_n$ denote the subset of $[0, 1]$ remaining after $n$ operations of removing the middle thirds have been carried out, so that there remain $2^n$ intervals of length $3^{-n}$. With this explicit homeomorphism, the rectangle $R(\varepsilon_1, \dots, \varepsilon_n)$ corresponds to one of the intervals of $C_n$ intersected with $C$. For example, $R(0, 0) = [0, \frac19] \cap C$, $R(0, 1) = [\frac29, \frac13] \cap C$, $R(1, 0) = [\frac23, \frac79] \cap C$ and $R(1, 1) = [\frac89, 1] \cap C$.
Consider the analogue of the Cantor function which is defined on $[0, 1]$ by $f_p(x) = \nu_p(C \cap [0, x])$. Then $f_p(0) = 0$ and $f_p(1) = 1$. Observe that this is a monotone increasing function. Moreover on the middle third $[\frac13, \frac23]$, it takes the value $p$. Then on the interval $[\frac19, \frac29]$, it takes the value $p^2$ and on $[\frac79, \frac89]$, it takes the value $p + p(1 - p)$. Indeed, at level $n$, we have defined $f_p(x)$ on $[0, 1] \setminus \operatorname{int}(C_n)$. If $[a, b]$ is a component of $C_n$, then the middle third is removed to form $C_{n+1}$ and
\[ f_p(x) = f_p(a) + p\big( f_p(b) - f_p(a) \big) \quad\text{for } a + \frac{b-a}{3} \le x \le a + \frac{2(b-a)}{3}. \]
Notice that $f_p$ takes all of the values $p^k(1 - p)^{n-k}$ for $0 \le k \le n$ and $n \ge 1$. These values are dense in $[0, 1]$. As for the usual Cantor function, it follows that $f_p$ has no jump discontinuities. So $f_p$ is continuous.
The measure $\nu_p$ is the Lebesgue-Stieltjes measure for the function $f_p$ by Theorem 1.5.2. So $\nu_p$ is a probability measure supported on $C$. Since $f_p$ takes the same value at each endpoint of a removed interval, we see that $\nu_p(\mathbb{R} \setminus C) = 0$. The continuity of $f_p$ means that $\nu_p$ has no atoms.
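The recursive description of $f_p$ suggests a direct way to evaluate it. The Python sketch below is illustrative only: the function name `cantor_like` and the truncation parameter `depth` are assumptions, not part of the notes. It uses the self-similarity implied by the construction, namely that the left third of $C$ carries mass $p$ and the right third carries mass $1 - p$, so that $f_p(x) = p\,f_p(3x)$ on $[0, \frac13]$ and $f_p(x) = p + (1 - p)\,f_p(3x - 2)$ on $[\frac23, 1]$.

```python
from fractions import Fraction

def cantor_like(x, p, depth=60):
    """Approximate f_p(x) = nu_p(C ∩ [0, x]) for x in [0, 1].

    The value is exact when the recursion reaches 0, 1 or a removed middle
    third within `depth` steps; otherwise it is a lower bound whose error
    is at most max(p, 1 - p) ** depth.
    """
    x, p = Fraction(x), Fraction(p)
    if x <= 0:
        return Fraction(0)
    if x >= 1:
        return Fraction(1)
    if depth == 0:
        return Fraction(0)  # truncation point of the recursion
    if x < Fraction(1, 3):
        # Left third: carries mass p and is a scaled copy of the whole set.
        return p * cantor_like(3 * x, p, depth - 1)
    if x <= Fraction(2, 3):
        # Closed middle third: f_p is constant, equal to the mass of the left third.
        return p
    # Right third: mass p already accumulated, plus a scaled copy with mass 1 - p.
    return p + (1 - p) * cantor_like(3 * x - 2, p, depth - 1)
```

For instance, with `p = Fraction(1, 3)` the call `cantor_like(Fraction(1, 9), p)` returns $p^2 = \frac19$ and `cantor_like(Fraction(7, 9), p)` returns $p + p(1-p) = \frac59$, matching the values stated in the example. Taking `p = Fraction(1, 2)` gives the usual Cantor function.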
CHAPTER 4

Differentiation and Signed Measures

4.1. Differentiation
Question: Is there a measure theoretic analogue of the Fundamental Theorem of Calculus? That is, to what extent is $F(x) = c + \int_{[a,x]} f\,dm$ differentiable on $\mathbb{R}$? What functions arise in this way?
We restrict our attention to real valued functions. In fact, splitting a function f
as a difference of two positive functions, we may suppose that f ≥ 0; and so the
integral will be monotone increasing.

4.1.1. DEFINITION. The upper and lower derivative from left and right of a real valued function $f$ are defined as
\[ \overline{D}_r f(x) = \limsup_{h\to0^+} \frac{f(x+h) - f(x)}{h} \qquad \underline{D}_r f(x) = \liminf_{h\to0^+} \frac{f(x+h) - f(x)}{h} \]
\[ \overline{D}_l f(x) = \limsup_{h\to0^+} \frac{f(x) - f(x-h)}{h} \qquad \underline{D}_l f(x) = \liminf_{h\to0^+} \frac{f(x) - f(x-h)}{h} \]
Then $f$ is differentiable at $x$ if $\overline{D}_r f(x) = \underline{D}_r f(x) = \overline{D}_l f(x) = \underline{D}_l f(x) \in \mathbb{R}$.

This definition is clearly equivalent to the familiar definition from calculus: a function $f$ is differentiable at $x$ if and only if there is a finite limit
\[ f'(x) = \lim_{h\to0} \frac{f(x+h) - f(x)}{h}. \]
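As a worked illustration of the four one-sided derivatives (an example chosen here, not taken from the notes), consider $f(x) = x\sin(1/x)$ for $x \ne 0$ with $f(0) = 0$, and compute the difference quotients at $x = 0$ for $h > 0$.

```latex
\[
  \frac{f(0+h) - f(0)}{h} = \sin\!\Big(\frac{1}{h}\Big),
  \qquad
  \frac{f(0) - f(0-h)}{h} = \frac{-(-h)\sin(-1/h)}{h} = -\sin\!\Big(\frac{1}{h}\Big).
\]
% As h -> 0+, sin(1/h) takes every value in [-1, 1] infinitely often, so
\[
  \limsup_{h\to 0^+} \frac{f(h) - f(0)}{h} = 1,
  \qquad
  \liminf_{h\to 0^+} \frac{f(h) - f(0)}{h} = -1,
\]
\[
  \limsup_{h\to 0^+} \frac{f(0) - f(-h)}{h} = 1,
  \qquad
  \liminf_{h\to 0^+} \frac{f(0) - f(-h)}{h} = -1.
\]
```

Both upper derivatives at 0 equal 1 and both lower derivatives equal $-1$, so the four quantities do not agree and $f$ is not differentiable at 0, even though it is continuous there.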
We require a technical definition. A singleton {a} and the empty set are con-
sidered to be degenerate intervals.

4.1.2. DEFINITION. If $E \subset \mathbb{R}$, a collection $\mathcal{J}$ of non-degenerate intervals is a Vitali cover if for every $x \in E$ and $\varepsilon > 0$, there is $I \in \mathcal{J}$ so that $x \in I$ and $m(I) < \varepsilon$.

4.1.3. VITALI COVERING LEMMA. If $E \subset \mathbb{R}$ has finite outer measure ($m^*(E) < \infty$), $\mathcal{J}$ is a Vitali cover of $E$, and $\varepsilon > 0$, then there are disjoint intervals $I_1, \dots, I_N \in \mathcal{J}$ such that $m^*\big( E \setminus \bigcup_{j=1}^N I_j \big) < \varepsilon$.

PROOF. Fix an open $U \supset E$ with $m(U) < \infty$. Let $\mathcal{J}' = \{I \in \mathcal{J} : I \subset U\}$. This is also a Vitali cover since if $x \in E$ and $\delta = \operatorname{dist}(x, U^c)$, then any interval $I \in \mathcal{J}$ containing $x$ of length less than $\delta$ will belong to $\mathcal{J}'$.
Recursively choose disjoint $I_k \in \mathcal{J}'$ so that $m(I_k) > \alpha_k/2$ where
\[ \alpha_k = \sup\{m(I) : I \in \mathcal{J}',\ I \text{ disjoint from } I_1, \dots, I_{k-1}\}. \]
Observe that $m\big( \bigcup_{k\ge1} I_k \big) = \sum_{k\ge1} m(I_k) \le m(U) < \infty$. Since $\alpha_k < 2m(I_k)$, the sequence $(\alpha_k)$ is summable. So we may choose $N$ so that $\sum_{k>N} m(I_k) < \varepsilon/5$. We claim that $I_1, \dots, I_N$ works.
Let $X = E \setminus \bigcup_{j=1}^N I_j$. If $x \in X$, $\delta := \operatorname{dist}\big( x, \bigcup_{j=1}^N I_j \big) > 0$. Hence there is some $I \in \mathcal{J}'$ with $x \in I$ and $m(I) < \delta$. So $I$ is disjoint from $\bigcup_{j=1}^N I_j$, and hence $m(I) \le \alpha_{N+1}$. Pick $K > N$ so that $\alpha_{K+1} < m(I) \le \alpha_K$. Then by construction, $I$ cannot be disjoint from $\bigcup_{j=1}^K I_j$. Thus there is some $k$ with $N < k \le K$ so that $I_k \cap I \ne \emptyset$. Now $m(I_k) \ge \alpha_k/2 \ge \alpha_K/2 \ge m(I)/2$. Hence
\[ \operatorname{dist}(x, \text{midpoint of } I_k) \le \tfrac12 m(I_k) + m(I) \le \tfrac52 m(I_k). \]
Let $J_k$ be the closed interval with the same midpoint as $I_k$ but with $m(J_k) = 5m(I_k)$. Then $x \in J_k$. Therefore $X \subset \bigcup_{k>N} J_k$. It follows that
\[ m^*(X) \le \sum_{k>N} m(J_k) = 5 \sum_{k>N} m(I_k) < \varepsilon. \]
This proves the claim. □

4.1.4. THEOREM. Let $f : [a, b] \to \mathbb{R}$ be monotone increasing. Then $f$ is continuous except on a countable set, and is differentiable except on a set of measure 0. The derivative $f'$ is integrable, and $\int_a^b f'\,dm \le f(b) - f(a)$.
a

PROOF. Define $f(x) = f(a)$ for $x < a$ and $f(x) = f(b)$ for $x > b$. Since $f$ is monotone, for $c \in [a, b]$, we have
\[ f(c^-) = \lim_{x\to c^-} f(x) = \sup_{x<c} f(x) \le f(c) \le f(c^+) = \inf_{x>c} f(x) = \lim_{x\to c^+} f(x). \]
Thus $f$ is continuous at $c$ unless it is a jump discontinuity, in which case the jump has length $j(c) = f(c^+) - f(c^-)$. Clearly $\sum j(c) \le f(b) - f(a) < \infty$. In particular the number of points with jump at least $\delta > 0$ is at most $\delta^{-1}(f(b) - f(a)) < \infty$. It follows that the number of discontinuities is countable.
Clearly we have $\underline{D}_r f(x) \le \overline{D}_r f(x)$ and $\underline{D}_l f(x) \le \overline{D}_l f(x)$. We will show that $\overline{D}_l f(x) \le \underline{D}_r f(x)$ a.e. and $\overline{D}_r f(x) \le \underline{D}_l f(x)$ a.e. For $u, v \in \mathbb{Q}$ with $u < v$, let
\[ E_{u,v} = \{x : \underline{D}_r f(x) < u < v < \overline{D}_l f(x)\} \quad\text{and}\quad E = \bigcup_{u<v\in\mathbb{Q}} E_{u,v}. \]
To show that m∗ (E) = 0, it suffices to show that m∗ (Eu,v ) = 0 for all u < v.
Let $m^*(E_{u,v}) = s$ and fix an $\varepsilon > 0$. Choose an open set $U \supset E_{u,v}$ with $m(U) < s + \varepsilon$. Let $\mathcal{J} = \{I = [x, x+h] \subset U : f(x+h) - f(x) < uh\}$. By definition of $\underline{D}_r f(x)$, this contains arbitrarily small intervals $[x, x+h]$ for each $x \in E_{u,v}$. Thus $\mathcal{J}$ is a Vitali cover of $E_{u,v}$. By the Vitali Covering Lemma, we can find disjoint intervals $I_1 = [x_1, x_1 + h_1], \dots, I_N = [x_N, x_N + h_N]$ in $\mathcal{J}$ so that $m^*\big( E_{u,v} \setminus \bigcup_{j=1}^N I_j \big) < \varepsilon$. Therefore
\[ s - \varepsilon < \sum_{j=1}^N m(I_j) = \sum_{j=1}^N h_j < m(U) < s + \varepsilon, \]
and $m^*\big( E_{u,v} \cap \bigcup_{j=1}^N I_j \big) > s - \varepsilon$. Let
\[ F = E_{u,v} \cap \bigcup_{j=1}^N (x_j, x_j + h_j) \subset \bigcup_{j=1}^N (x_j, x_j + h_j) =: V. \]

Consider $\mathcal{J}' = \{I = [x - k, x] \subset V : f(x) - f(x - k) > vk\}$. As for $\mathcal{J}$, one sees that $\mathcal{J}'$ is a Vitali cover of $F$. Choose disjoint intervals $J_i = [y_i - k_i, y_i] \in \mathcal{J}'$ for $1 \le i \le M$ so that $m^*\big( F \setminus \bigcup_{i=1}^M J_i \big) < \varepsilon$. Therefore
\[ \sum_{i=1}^M k_i = \sum_{i=1}^M m(J_i) > m^*(F) - \varepsilon > s - 2\varepsilon. \]
Since the intervals $J_i$ are disjoint and are contained in $\bigcup_{j=1}^N I_j$, we have that
\begin{align*}
v(s - 2\varepsilon) < \sum_{i=1}^M k_i v &< \sum_{i=1}^M f(y_i) - f(y_i - k_i) \\
&\le \sum_{j=1}^N f(x_j + h_j) - f(x_j) < \sum_{j=1}^N uh_j < u(s + \varepsilon).
\end{align*}
Letting $\varepsilon \to 0$ yields $vs \le us$, and thus $s = 0$. Hence $m^*(E_{u,v}) = 0$. Therefore, since $m$ is complete, $m(E) = 0$. Consequently, $\overline{D}_l f(x) \le \underline{D}_r f(x)$ on $E^c$, that is a.e.
Similarly $\overline{D}_r f(x) \le \underline{D}_l f(x)$ a.e. Thus $f'(x)$ exists except on a set of measure 0. Note however that the derivative might be $+\infty$. We will show soon that this also happens only on a set of measure 0.
Define $g_n(x) = n\big( f(x + \frac1n) - f(x) \big)$ on $[a, b]$. Monotone functions are Borel, and thus are measurable. So $g_n$ is measurable and positive. Moreover
\[ \lim_{n\to\infty} g_n(x) = \lim_{n\to\infty} \frac{f(x + \frac1n) - f(x)}{1/n} = f'(x) \]
whenever $f'(x)$ is defined, which is almost everywhere. Therefore $f'$ is measurable and $f' \ge 0$. We can apply Fatou's Lemma to this sequence to get
\begin{align*}
\int_a^b f'\,dm &\le \liminf_{n\to\infty} \int_a^b g_n\,dm = \liminf_{n\to\infty} \Big( n\int_{a+\frac1n}^{b+\frac1n} f\,dm - n\int_a^b f\,dm \Big) \\
&= \liminf_{n\to\infty} \Big( n\int_b^{b+\frac1n} f(b)\,dm - n\int_a^{a+\frac1n} f\,dm \Big) \le f(b) - f(a) < \infty.
\end{align*}
In particular, $f'$ is integrable. This implies that $f'(x) < \infty$ a.e. □

4.1.5. EXAMPLE. Recall the Cantor ternary function. This is defined on $[0, 1]$ by $f(0) = 0$, $f(1) = 1$. Then $f(x) = \frac12$ for $x \in [\frac13, \frac23]$, the closure of the middle third removed in the construction of the Cantor set. Then we set $f(x) = \frac14$ on $[\frac19, \frac29]$ and $f(x) = \frac34$ on $[\frac79, \frac89]$. This repeats: on each middle third removed, $f$ is defined as the midpoint between the values defined at the endpoints of the (larger) interval. Finally, for $x \in C$, where $C$ is the Cantor set, we can define $f(x) = \sup\{f(y) : 0 \le y \le x,\ y \notin C\}$. This evidently yields a monotone increasing function. Moreover the values attained by $f$ include all dyadic rationals $2^{-n}k$ for $0 \le k \le 2^n$. So the range of $f$ is dense in $[0, 1]$. In particular, $f$ cannot have any jump discontinuities. Therefore $f$ is continuous.
Next observe that on each of the removed (open) intervals, $(\frac13, \frac23)$, $(\frac19, \frac29)$ and $(\frac79, \frac89)$, etc., the function $f$ is constant. Therefore it is differentiable with $f'(x) = 0$. This occurs on $[0, 1] \setminus C$, and since $m(C) = 0$, we see that $f'$ is defined and finite almost everywhere. However
\[ \int_0^1 f'\,dm = 0 < 1 = f(1) - f(0). \]
So it can certainly be the case that you cannot recover $f$ from its derivative.

4.1.6. DEFINITION. A function $f : [a, b] \to \mathbb{R}$ is of bounded variation (belongs to $BV[a, b]$) if
\[ V_a^b(f) = \sup\Big\{ \sum_{i=1}^n |f(x_i) - f(x_{i-1})| : n \ge 1,\ a = x_0 < x_1 < \dots < x_n = b \Big\} < \infty. \]
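To make the supremum in the definition of bounded variation concrete, the following Python sketch evaluates the sum $\sum |f(x_i) - f(x_{i-1})|$ for a given partition. The helper names `variation_sum` and `uniform_partition` are hypothetical. Any single partition only gives a lower bound for the variation, so the sketch illustrates the definition rather than computing the supremum.

```python
def variation_sum(f, partition):
    """Sum of |f(x_i) - f(x_{i-1})| over consecutive points of a partition.

    This is a lower bound for the total variation of f over
    [partition[0], partition[-1]].
    """
    return sum(abs(f(b) - f(a)) for a, b in zip(partition, partition[1:]))

def uniform_partition(a, b, n):
    """The n + 1 equally spaced points a = x_0 < x_1 < ... < x_n = b."""
    return [a + (b - a) * i / n for i in range(n + 1)]

def square(x):
    return x * x

coarse = variation_sum(square, [-1.0, 1.0])                       # sees no change at all
fine = variation_sum(square, uniform_partition(-1.0, 1.0, 4))     # includes the point 0
```

For $f(x) = x^2$ on $[-1, 1]$ the coarse partition gives 0, while any partition containing 0 gives 2, which is the sum of the decrease on $[-1, 0]$ and the increase on $[0, 1]$. This shows why the supremum over all partitions is needed.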

4.1.7. T HEOREM . A function f ∈ BV[a, b] if and only if f = g − h where g, h


are monotone increasing.
4.1 Differentiation 53

PROOF. If $f = g - h$ where $g, h$ are monotone increasing, then for a partition
$a = x_0 < x_1 < \dots < x_n = b$,
\begin{align*}
\sum_{i=1}^n |f(x_i) - f(x_{i-1})| &\le \sum_{i=1}^n |g(x_i) - g(x_{i-1})| + \sum_{i=1}^n |h(x_i) - h(x_{i-1})| \\
&= g(b) - g(a) + h(b) - h(a) =: L.
\end{align*}
So $V_a^b(f) \le L$.
Conversely, if $f \in BV[a,b]$, define $g(x) = V_a^x(f)$. Clearly this is an increasing
function of $x$ since if $a \le x < y \le b$, then $V_a^y(f) = V_a^x(f) + V_x^y(f)$. Let $h = g - f$,
so that $f = g - h$. If $a \le x < y \le b$, then
\[ h(y) - h(x) = V_x^y(f) - (f(y) - f(x)) \ge |f(y) - f(x)| - (f(y) - f(x)) \ge 0. \]
So $h$ is also monotone increasing. □
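
The converse direction of the proof is constructive, and a discrete version of it is easy to experiment with. The sketch below is an illustration only: `variation_split` is a hypothetical helper, and the variation is approximated from below on a uniform grid rather than computed as a true supremum. It builds $g$ as the running variation and $h = g - f$.

```python
import math

def variation_split(f, a, b, n=1000):
    """Grid approximation of g(x) = V_a^x(f) and h = g - f on [a, b]."""
    xs = [a + (b - a) * i / n for i in range(n + 1)]
    g, h = [0.0], [-f(xs[0])]
    for i in range(1, n + 1):
        d = f(xs[i]) - f(xs[i - 1])
        g.append(g[-1] + abs(d))        # running variation, never decreases
        h.append(h[-1] + (abs(d) - d))  # h = g - f, and |d| - d >= 0
    return xs, g, h

xs, g, h = variation_split(math.sin, 0.0, 2.0 * math.pi)
```

For $f = \sin$ on $[0, 2\pi]$ the last entry of `g` approximates the total variation 4, and both lists are nondecreasing.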

4.1.8. COROLLARY. If $f \in BV[a,b]$, then $f$ is continuous except on a countable
set, $f'(x)$ exists a.e.($m$) and $f'$ is $m$-integrable.

4.1.9. COROLLARY. If $f \in L^1[a,b]$, then $F(x) = \int_a^x f \, dm$ is of bounded variation.

PROOF. Write $f = f_+ - f_-$ where $f_+ = f \vee 0$ and $f_- = (-f) \vee 0$ are positive
integrable functions. Then $g(x) = \int_a^x f_+ \, dm$ and $h(x) = \int_a^x f_- \, dm$ are monotone
increasing. Hence $F = g - h$ is BV by Theorem 4.1.7. □

4.1.10. DEFINITION. A function $f : [a,b] \to \mathbb{R}$ is absolutely continuous (AC)
if for all $\varepsilon > 0$, there is a $\delta > 0$ so that whenever $(x_i, y_i)$ are disjoint intervals in
$[a,b]$ and $\sum_{i=1}^n (y_i - x_i) < \delta$, then $\sum_{i=1}^n |f(y_i) - f(x_i)| < \varepsilon$.

4.1.11. EXAMPLE. The Cantor ternary function $f$ is monotone, and thus BV.
However it is not absolutely continuous. If $x_i, y_i$ for $1 \le i \le 2^n$ are the endpoints
of the intervals remaining after $n$ stages of the construction of the Cantor set, then
because $f$ is constant on the complements of these intervals, we have
\[ \sum_{i=1}^{2^n} |f(y_i) - f(x_i)| = 1 \quad\text{and}\quad \sum_{i=1}^{2^n} (y_i - x_i) = \big(\tfrac23\big)^n. \]
Since $(\frac23)^n \to 0$, $f$ is not absolutely continuous.
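
The two sums in this example can be verified exactly for small $n$. This is a sketch under stated assumptions: `cantor`, `stage_intervals` and `increments_and_length` are illustrative names, and the Cantor function is evaluated by the usual self-similar recursion with exact fractions.

```python
from fractions import Fraction

def cantor(x, depth=60):
    """Cantor ternary function; exact at endpoints of stage-n intervals for n <= depth."""
    x = Fraction(x)
    value, scale = Fraction(0), Fraction(1, 2)
    for _ in range(depth):
        if x < Fraction(1, 3):
            x = 3 * x
        elif x > Fraction(2, 3):
            value += scale
            x = 3 * x - 2
        else:
            return value + scale
        scale /= 2
    return value + 2 * scale * x

def stage_intervals(n):
    """The 2^n closed intervals left after n stages of the Cantor construction."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for lo, hi in intervals:
            third = (hi - lo) / 3
            refined.append((lo, lo + third))   # keep the left third
            refined.append((hi - third, hi))   # keep the right third
        intervals = refined
    return intervals

def increments_and_length(n):
    ivs = stage_intervals(n)
    total_increment = sum(cantor(hi) - cantor(lo) for lo, hi in ivs)
    total_length = sum(hi - lo for lo, hi in ivs)
    return total_increment, total_length
```

The total increment stays equal to 1 at every stage while the total length shrinks like $(\frac23)^n$, which is exactly what rules out absolute continuity.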

4.1.12. PROPOSITION. Let $f \in L^1(\mu)$ and $\varepsilon > 0$. Then there is a $\delta > 0$ so
that whenever $\mu(A) < \delta$, we have $\int_A |f| \, d\mu < \varepsilon$.

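PROOF. Choose a simple function $0 \le \varphi \le |f|$ so that $\int \varphi \, d\mu > \int |f| \, d\mu - \frac{\varepsilon}{2}$.
Since $\varphi$ is simple, there is a constant $M$ so that $\varphi \le M$. Set $\delta = \frac{\varepsilon}{2M}$. If $\mu(A) < \delta$,
then
\[ \int_A |f| \, d\mu < \int_A \varphi \, d\mu + \frac{\varepsilon}{2} \le M\mu(A) + \frac{\varepsilon}{2} < \varepsilon. \] □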

4.1.13. COROLLARY. If $f \in L^1[a,b]$, then $F(x) = \int_a^x f \, dm$ is absolutely
continuous.

PROOF. Given $\varepsilon > 0$, let $\delta$ be provided by Proposition 4.1.12. Then if $(x_i, y_i)$
are disjoint intervals in $[a,b]$ and $\sum_{i=1}^n (y_i - x_i) < \delta$, then
\[ \sum_{i=1}^n |F(y_i) - F(x_i)| = \sum_{i=1}^n \Big| \int_{x_i}^{y_i} f \, dm \Big| \le \int_{\bigcup_{i=1}^n [x_i, y_i]} |f| \, dm < \varepsilon. \] □

4.1.14. LEMMA. If $f$ is absolutely continuous on $[a,b]$, then it has bounded
variation.

PROOF. Take $\varepsilon = 1$, and find $\delta > 0$. Split $[a,b] = \bigcup_{j=1}^p [a_{j-1}, a_j]$ where
$a_j - a_{j-1} < \delta$ for $1 \le j \le p$. If $[a_{j-1}, a_j]$ contains disjoint intervals $(x_i, y_i)$ for
$1 \le i \le n$, then $\sum_{i=1}^n (y_i - x_i) \le a_j - a_{j-1} < \delta$. Hence $\sum_{i=1}^n |f(y_i) - f(x_i)| < 1$.
Taking the supremum yields $V_{a_{j-1}}^{a_j}(f) \le 1$. Thus $V_a^b(f) = \sum_{j=1}^p V_{a_{j-1}}^{a_j}(f) \le p$. □

4.1.15. LEMMA. If $f \in L^1[a,b]$ and $F(x) = \int_a^x f \, dm$ is monotone increasing,
then $f \ge 0$ a.e.

PROOF. Let $E = \{x : f(x) < 0\}$ and $E_n = \{x : f(x) < -\frac1n\}$ for $n \ge 1$.
Then if $m(E) > 0$, there is some $n$ so that $m(E_n) > 0$. Let $\varepsilon = \frac{m(E_n)}{2n}$,
and choose $\delta > 0$ using Proposition 4.1.12. Select an open set $U \supset E_n$ so that
$m(U \setminus E_n) < \delta$. Write $U = \dot{\bigcup}_{i \ge 1} (x_i, y_i)$. Then
\begin{align*}
0 \le \sum_{i \ge 1} \big( F(y_i) - F(x_i) \big) &= \int_U f \, dm \\
&= \int_{E_n} f \, dm + \int_{U \setminus E_n} f \, dm \le -\frac{m(E_n)}{n} + \varepsilon < 0.
\end{align*}
This contradiction shows that $m(E) = 0$, or that $f \ge 0$ a.e. □
The following consequence is immediate.

4.1.16. COROLLARY. Let $f \in L^1[a,b]$ and $F(x) = \int_a^x f \, dm$. If $F = 0$, then
$f = 0$ a.e.

The main result here is the following theorem of Lebesgue, his substitute for the
Fundamental Theorem of Calculus, which answers the first part of the question
posed at the beginning of this section.

4.1.17. LEBESGUE DIFFERENTIATION THEOREM. Let $f \in L^1[a,b]$ and
$F(x) = c + \int_a^x f \, dm$. Then $F'(x) = f(x)$ a.e.

PROOF. Since $F$ is BV by Corollary 4.1.9, Corollary 4.1.8 shows that $F'$ exists
a.e. and the derivative is integrable. Extend $f$ by setting $f(x) = 0$ for $x > b$, so that
$F(x) = F(b)$ for $x > b$. Let $g_n(x) = n\big(F(x + \frac1n) - F(x)\big)$. Then $g_n$ converges
pointwise a.e. to $F'$.
Case 1: $|f| \le M$. Then $g_n(x) = n \int_x^{x + 1/n} f \, dm$. So
\[ |g_n(x)| \le n \int_x^{x + 1/n} M \, dm = M. \]
Thus $|g_n| \le M \chi_{[a,b]}$, which is integrable. By the LDCT and the fact that $F$ is
continuous, for $c \in [a,b]$,
\begin{align*}
\int_a^c F' \, dm &= \lim_{n\to\infty} \int_a^c g_n \, dm = \lim_{n\to\infty} n \int_a^c \big( F(x + \tfrac1n) - F(x) \big) \, dx \\
&= \lim_{n\to\infty} \Big( n \int_c^{c + 1/n} F(x) \, dx - n \int_a^{a + 1/n} F(x) \, dx \Big) \\
&= F(c) - F(a) = \int_a^c f \, dm.
\end{align*}
Therefore $\int_a^c (F' - f) \, dm = 0$ for all $c \in [a,b]$. By Corollary 4.1.16, $F' = f$ a.e.
Case 2: $f \ge 0$. Let $f_n = f \wedge n$ for $n \ge 1$. Then $f_n$ is bounded and Case 1
applies. Observe that
\[ F(x) = F(a) + \int_a^x f_n \, dm + \int_a^x (f - f_n) \, dm. \]
Because the last integral is monotone increasing in $x$,
\[ F'(x) = f_n(x) + \frac{d}{dx} \int_a^x (f - f_n) \, dm \ge f_n(x) \quad \text{a.e.} \]
Consequently, $F'(x) \ge \lim_{n\to\infty} f_n(x) = f(x)$ a.e. For $c \in [a,b]$,
\[ \int_a^c F'(x) \, dm \le F(c) - F(a) = \int_a^c f \, dm \le \int_a^c F'(x) \, dm. \]
Therefore $\int_a^c \big( F'(x) - f(x) \big) \, dm = 0$ for all $c \in [a,b]$. Hence $F' = f$ a.e. by
Corollary 4.1.16.
Case 3: Write $f = f_+ - f_-$ where $f_\pm \in L^+ \cap L^1(m)$. Then by Case 2, we
obtain that $F' = f_+ - f_- = f$ a.e. □
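
The difference quotients $g_n$ used in the proof are easy to compute for a concrete integrand. The sketch below assumes a specific bounded step function on $[0,1]$ (equal to 1 on $[0, \frac12)$ and 3 on $[\frac12, 1]$, extended by 0 to the right as in the proof); the names `f`, `F` and `g` are chosen to mirror the notation and are not taken from any library.

```python
def f(x):
    """Integrand on [0, 1]: 1 on [0, 1/2), 3 on [1/2, 1], and 0 to the right of 1."""
    if x > 1.0:
        return 0.0
    return 1.0 if x < 0.5 else 3.0

def F(x):
    """F(x) = integral of f over [0, x], computed in closed form."""
    x = min(x, 1.0)                      # F is constant to the right of b = 1
    return x if x < 0.5 else 0.5 + 3.0 * (x - 0.5)

def g(n, x):
    """Difference quotient g_n(x) = n (F(x + 1/n) - F(x))."""
    return n * (F(x + 1.0 / n) - F(x))
```

At points of continuity of $f$, such as $x = 0.25$ or $x = 0.75$, `g(n, x)` agrees with `f(x)` up to rounding once $\frac1n$ is smaller than the distance to the jump.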

Next we wish to characterize which functions are integrals. We first need another
lemma.

4.1.18. LEMMA. If $f \in C[a,b]$ is absolutely continuous and $f' = 0$ a.e., then
$f$ is constant.

PROOF. Fix $c \in (a,b]$. Let $\varepsilon > 0$. Obtain $\delta > 0$ from the definition of absolute
continuity. Set $E = \{x \in (a,c) : f'(x) = 0\}$. So $m([a,c] \setminus E) = 0$. Define
\[ \mathcal{J} = \{ [x, x+h] : x \in E,\ h > 0,\ [x, x+h] \subset (a,c) \text{ and } |f(x+h) - f(x)| < \varepsilon h \}. \]
This collection is a Vitali cover of $E$ because for each $x \in E$, all sufficiently small $h$
work. Therefore there are disjoint intervals $I_1, \dots, I_n$ in $\mathcal{J}$ so that
\[ m\Big( [a,c] \setminus \bigcup_{j=1}^n I_j \Big) = m\Big( E \setminus \bigcup_{j=1}^n I_j \Big) < \delta. \]
Write $I_j = [a_j, b_j]$, ordered so that
\[ a < a_1 < b_1 < a_2 < b_2 < \dots < a_n < b_n < c. \]
Then
\begin{align*}
|f(c) - f(a)| &\le \sum_{j=1}^n |f(b_j) - f(a_j)| \\
&\quad + \Big( |f(a_1) - f(a)| + \sum_{j=1}^{n-1} |f(a_{j+1}) - f(b_j)| + |f(c) - f(b_n)| \Big) \\
&< \sum_{j=1}^n \varepsilon |b_j - a_j| + \varepsilon < (c - a + 1)\varepsilon.
\end{align*}
The second term, in parentheses, is less than $\varepsilon$ by absolute continuity since the total
length of the complementary intervals is less than $\delta$. Now $\varepsilon > 0$ was arbitrary, and therefore
$f(c) = f(a)$; whence $f$ is constant. □

4.1.19. THEOREM. Let $F : [a,b] \to \mathbb{R}$. The following are equivalent:
(1) There is an $f \in L^1(a,b)$ so that $F(x) = c + \int_a^x f \, dm$.
(2) $F$ is absolutely continuous.
(3) $F$ is differentiable a.e., $F' \in L^1(a,b)$ and $F(x) = F(a) + \int_a^x F' \, dm$.

PROOF. Lebesgue's Differentiation Theorem shows that (1) implies (3). Clearly
(3) implies (1). Corollary 4.1.13 shows that integrals are AC, so (1) implies (2). If (2)
holds, then $F$ is BV by Lemma 4.1.14; and thus by Corollary 4.1.8, $F'$ exists a.e.
and belongs to $L^1(m)$. Let
\[ G(x) = F(a) + \int_a^x F' \, dm. \]
Then Lebesgue's Differentiation Theorem shows that $G' = F'$ a.e. Both $F$ and $G$
are absolutely continuous, and $(G - F)' = 0$ a.e. Therefore Lemma 4.1.18 shows
that $G - F$ is constant; and $G(a) = F(a)$. So (3) holds. □

4.2. Signed Measures


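4.2.1. DEFINITION. A signed measure on $(X, \mathcal{B})$ is a map $\nu : \mathcal{B} \to \mathbb{R} \cup \{\pm\infty\}$
such that $\nu(\emptyset) = 0$, $\nu$ takes at most one of the values $\pm\infty$, and $\nu$ is countably
additive, meaning that if $E_i$ are disjoint sets in $\mathcal{B}$, then $\nu\big( \dot{\bigcup}_{i \ge 1} E_i \big) = \sum_{i \ge 1} \nu(E_i)$.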

4.2.2. REMARK. The definition implies that if $\nu\big( \dot{\bigcup}_{i \ge 1} E_i \big)$ is finite, then the
series $\sum_{i \ge 1} \nu(E_i)$ converges absolutely. One argument would be that the union is
independent of order, and thus the sum should be independent of the order as well.
But unconditionally convergent series of real numbers converge absolutely. An
alternative argument, which is better, is that if the sum converges only conditionally,
then the sum of the positive terms diverges, as does the sum of the negative terms.
But then if $J = \{i : \nu(E_i) \ge 0\}$, we have that $\nu\big( \dot{\bigcup}_{i \in J} E_i \big) = \sum_{i \in J} \nu(E_i) = +\infty$
and $\nu\big( \dot{\bigcup}_{i \notin J} E_i \big) = \sum_{i \notin J} \nu(E_i) = -\infty$. This is not permitted for signed measures
because we cannot make sense of $\nu(E \,\dot{\cup}\, F) = \nu(E) + \nu(F)$ if $\nu(E) = +\infty$ and
$\nu(F) = -\infty$.

4.2.3. EXAMPLES.
(1) Let $f \in L^1_{\mathbb{R}}(\mu)$ and define $\nu(E) = \int_E f \, d\mu$. In this case, neither of $\pm\infty$ arises as
a value. The absolute convergence over countably many disjoint sets follows from
the LDCT.
(2) Let $f \in L^+(\mu)$ and $g \in L^+ \cap L^1(\mu)$. Define $\nu(E) = \int_E (f - g) \, d\mu$. If $f$ is not
integrable, then $\nu(X) = \int f \, d\mu - \|g\|_1 = +\infty$. If $E = \dot{\bigcup}_{i \ge 1} E_i$ and $\int_E f \, d\mu < \infty$,
then countable additivity follows as in (1).

4.2.4. DEFINITION. A null set for a signed measure $\nu$ is a measurable set $E$
such that $\nu(F) = 0$ for all $F \subset E$, $F \in \mathcal{B}$. A positive set (or negative set) for $\nu$ is
a measurable set $E$ such that $\nu(F) \ge 0$ (or $\le 0$) for all $F \subset E$, $F \in \mathcal{B}$.

4.2.5. HAHN DECOMPOSITION THEOREM. Let $\nu$ be a signed measure on
$(X, \mathcal{B})$. Then there are sets $P, N \in \mathcal{B}$ so that $X = P \,\dot{\cup}\, N$, $P$ is a positive set, and
$N$ is a negative set. If $X = P' \,\dot{\cup}\, N'$ is another decomposition into positive and
negative sets, then $P \triangle P'$ is a null set.

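4.2.6. LEMMA. If $0 < \nu(E) < \infty$, then there is a positive set $A \subset E$ with
$\nu(A) > 0$.

PROOF. If $E$ contains a set of negative measure, choose a set $B_1 \subset E$ in $\mathcal{B}$ so
that
\[ \nu(B_1) \le \max\big\{ -1,\ \tfrac12 \inf\{ \nu(B) : B \subset E,\ B \in \mathcal{B} \} \big\}. \]
Recursively choose $B_n \subset E \setminus \dot{\bigcup}_{i=1}^{n-1} B_i$ with
\[ \nu(B_n) \le \max\Big\{ -1,\ \tfrac12 \inf\big\{ \nu(B) : B \subset E \setminus \dot{\bigcup}_{i=1}^{n-1} B_i,\ B \in \mathcal{B} \big\} \Big\}. \]
Either this terminates because $A = E \setminus \dot{\bigcup}_{i=1}^{n} B_i$ is a positive set, or there is an
infinite sequence. In this case, let $A = E \setminus \dot{\bigcup}_{i \ge 1} B_i$. Now by countable additivity,
\[ \nu(E) = \nu(A) + \sum_{i \ge 1} \nu(B_i). \]
Since $|\nu(E)| < \infty$, this series converges absolutely. Therefore
\[ \nu(A) = \nu(E) - \sum_{i \ge 1} \nu(B_i) < \infty. \]
Moreover, since each $\nu(B_i) < 0$, this shows that $\nu(A) \ge \nu(E) > 0$.
The convergence also shows that $\nu(B_i) \to 0$. If $B \subset A$ had $\nu(B) < 0$, then
$\nu(B) < 2\nu(B_n)$ for some $n$ with $\nu(B_n) > -1$. But this contradicts the fact that
\[ \inf\big\{ \nu(B) : B \subset E \setminus \dot{\bigcup}_{i=1}^{n-1} B_i,\ B \in \mathcal{B} \big\} \ge 2\nu(B_n). \]
Hence if $B \subset A$, then $\nu(B) \ge 0$. Thus $A$ is a positive set. □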


4.2.7. LEMMA. If $A_n$ are positive sets, so is $A := \bigcup_{n \ge 1} A_n$.

PROOF. If $B \subset A$ and $B \in \mathcal{B}$, let $B_n = B \cap \big( A_n \setminus \bigcup_{i=1}^{n-1} A_i \big)$ for $n \ge 1$. Then
the $B_n$ are pairwise disjoint sets in $\mathcal{B}$ such that $B = \dot{\bigcup}_{n \ge 1} B_n$. Since each $A_n$ is
positive, $\nu(B_n) \ge 0$. Therefore $\nu(B) = \sum_{n \ge 1} \nu(B_n) \ge 0$. □

PROOF OF THE HAHN DECOMPOSITION. We may suppose that $\nu$ does not
take the value $+\infty$ (by considering $-\nu$ instead if necessary). Let
\[ m = \sup\{ \nu(A) : A \text{ is positive} \}. \]
Choose positive sets $A_n$ with $\nu(A_n) \to m$. Set $P = \bigcup_{n \ge 1} A_n$. By Lemma 4.2.7,
$P$ is a positive set. Moreover $\nu(P) = \nu(A_n) + \nu(P \setminus A_n) \ge \nu(A_n)$ for all $n \ge 1$,
and hence $\nu(P) = m$. Since $\nu$ does not take the value $+\infty$, we see that $m < \infty$.
Let $N = P^c$. Claim: $N$ is negative. If it isn't, there is a subset $E \subset N$ such
that $\nu(E) > 0$. By Lemma 4.2.6, $E$ contains a positive set $A$ with $\nu(A) > 0$. Then
$P \,\dot{\cup}\, A$ is positive with $\nu(P \,\dot{\cup}\, A) = \nu(P) + \nu(A) > m$. This contradicts the definition of $m$.
Hence $N$ must be negative.
Uniqueness. Suppose that $X = P' \,\dot{\cup}\, N'$ is a second decomposition of $X$ into
a positive and negative set. Then $A = P \setminus P' = N' \setminus N$ is both positive and
negative, and hence is a null set. Similarly $B = P' \setminus P = N \setminus N'$ is a null set.
Thus $P \triangle P' = A \cup B$ is a null set. □

4.2.8. DEFINITION. Two signed measures $\mu, \nu$ on $(X, \mathcal{B})$ are mutually singular
($\mu \perp \nu$) if there is a decomposition $X = A \,\dot{\cup}\, B$ such that $A$ is $\nu$-null and $B$ is $\mu$-null.

4.2.9. JORDAN DECOMPOSITION THEOREM. If $\nu$ is a signed measure
on $(X, \mathcal{B})$, then there is a unique pair of mutually singular (positive) measures $\nu_+$
and $\nu_-$ such that $\nu = \nu_+ - \nu_-$.

PROOF. Let $X = P \,\dot{\cup}\, N$ be the Hahn decomposition for $\nu$. Define $\nu_+(A) = \nu(A \cap P)$
and $\nu_-(A) = -\nu(A \cap N)$. Then by construction, $\nu_+$ and $\nu_-$ are positive
measures supported on disjoint sets, so they are mutually singular, and $\nu = \nu_+ - \nu_-$.
For uniqueness, suppose that $\mu_+ \perp \mu_-$ and $\nu = \mu_+ - \mu_-$. Let $X = P' \,\dot{\cup}\, N'$
such that $P'$ is $\mu_-$-null and $N'$ is $\mu_+$-null. Then $P'$ is a positive set for $\nu$ and $N'$
is a negative set. By the uniqueness of the Hahn decomposition, $P \triangle P'$ is a $\nu$-null
set. For any set $A \in \mathcal{B}$,
\[ \mu_+(A) = \nu(A \cap P') = \nu(A \cap P) = \nu_+(A). \]
So $\mu_+ = \nu_+$. Similarly $\mu_- = \nu_-$. □

4.2.10. DEFINITION. The absolute value of a signed measure with Jordan
decomposition $\nu = \nu_+ - \nu_-$ is $|\nu| = \nu_+ + \nu_-$.

Note that a set $A \in \mathcal{B}$ is $\nu$-null if and only if $|\nu|(A) = 0$.
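
On a finite set with every subset measurable, a signed measure is just an assignment of real weights to points, and the Hahn and Jordan decompositions can be written down directly. The following sketch assumes that setting; `hahn_jordan` and `measure` are illustrative names, and points of weight zero are placed in $P$ by convention (they form a null set, so this choice is harmless).

```python
def hahn_jordan(nu):
    """nu: dict mapping each point of a finite set X to its (finite) signed weight."""
    P = {x for x, w in nu.items() if w >= 0}   # a positive set
    N = set(nu) - P                            # its complement, a negative set
    nu_plus = {x: (nu[x] if x in P else 0) for x in nu}
    nu_minus = {x: (-nu[x] if x in N else 0) for x in nu}
    return P, N, nu_plus, nu_minus

def measure(weights, E):
    """Measure of a subset E of X for the point weights given."""
    return sum(weights[x] for x in E)

nu = {"a": 2, "b": -3, "c": 0, "d": 1}
P, N, nu_plus, nu_minus = hahn_jordan(nu)
X = set(nu)
total_variation = measure(nu_plus, X) + measure(nu_minus, X)   # |nu|(X)
```

Here $\nu(X) = 0$ even though $|\nu|(X) = 6$, which is why null sets are detected by $|\nu|$ rather than by $\nu$.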

4.3. Decomposing Measures


4.3.1. DEFINITION. A signed measure $\nu$ is absolutely continuous with respect
to a (positive) measure $\mu$ ($\nu \ll \mu$) if $A \in \mathcal{B}$ and $\mu(A) = 0$ implies that $\nu(A) = 0$.

Since any measurable subset of $A$ is also a $\mu$-null set, $A$ must be $\nu$-null. This
is equivalent to saying that $|\nu|(A) = 0$. So $\nu \ll \mu$ if and only if $|\nu| \ll \mu$.

4.3.2. RADON-NIKODYM THEOREM. Let $\mu$ and $\nu$ be $\sigma$-finite measures on
$(X, \mathcal{B})$. Suppose that $\nu \ll \mu$. Then there is an $f \in L^+$ so that $\nu(E) = \int_E f \, d\mu$ for
$E \in \mathcal{B}$. Also $f$ is uniquely determined a.e.($\mu$).

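PROOF. First we assume that $\mu(X) < \infty$ and $\nu(X) < \infty$. For each $r \in \mathbb{Q}^+$,
$\nu - r\mu$ has a Hahn decomposition $(P_r, N_r)$. Also, let $P_0 = X$ and $N_0 = \emptyset$.
Define $f(x) = \sup\{r \ge 0 : x \in P_r\}$. Then for $t \ge 0$, $f^{-1}(t, \infty] = \bigcup_{r > t} P_r$ is
measurable. So $f \in L^+$.
If $r < s$ in $\mathbb{Q}^+$, then $P_s$ is a positive set for $\nu - s\mu$ and thus is positive for
$\nu - r\mu = (\nu - s\mu) + (s - r)\mu$. Since $N_r$ is a negative set for $\nu - r\mu$, we have
$\mu(N_r \cap P_s) = 0$. Therefore $\mu\big( N_r \cap \bigcup_{s > r,\, s \in \mathbb{Q}^+} P_s \big) = 0$. Hence $f|_{N_r} \le r$ a.e.($\mu$).
Thus $\mu(f^{-1}(r, \infty]) \le \mu(P_r)$.
Since $P_r$ is positive for $\nu - r\mu$, $r\mu(P_r) \le \nu(P_r)$. Hence
\[ \mu(P_r) \le \nu(P_r)/r \le \nu(X)/r. \]
This goes to 0 as $r \to \infty$, and thus $\mu(f^{-1}(\infty)) \le \lim_{r \to \infty} \mu(P_r) = 0$. Therefore
$f < \infty$ a.e.($\mu$).
Let $E \in \mathcal{B}$. Fix an integer $N \ge 1$. Define $E_k = E \cap P_{k/N} \cap N_{(k+1)/N}$ for $k \ge 0$; and let
$E_\infty = E \setminus \bigcup_{k \ge 0} E_k = E \cap \bigcap_r P_r$. Then $\mu(E_\infty) = 0$ and so $\nu(E_\infty) = 0$ as well.
Observe that
\[ \big( \nu - \tfrac{k}{N}\mu \big)(E_k) \ge 0 \ge \big( \nu - \tfrac{k+1}{N}\mu \big)(E_k). \]
Thus
\[ \tfrac{k}{N}\mu(E_k) \le \nu(E_k) \le \tfrac{k+1}{N}\mu(E_k). \]
Also $\frac{k}{N} \le f(x) \le \frac{k+1}{N}$ for a.e. $x \in E_k$. So $\frac{k}{N}\chi_{E_k} \le f\chi_{E_k} \le \frac{k+1}{N}\chi_{E_k}$. Integrating
yields
\[ \tfrac{k}{N}\mu(E_k) \le \int_{E_k} f \, d\mu \le \tfrac{k+1}{N}\mu(E_k). \]
Summing over $k \ge 0$, we obtain that
\[ \sum_{k \ge 0} \tfrac{k}{N}\mu(E_k) \le \sum_{k \ge 0} \nu(E_k) = \nu(E) \le \sum_{k \ge 0} \tfrac{k+1}{N}\mu(E_k) = \sum_{k \ge 0} \tfrac{k}{N}\mu(E_k) + \frac{\mu(E)}{N}. \]
And similarly
\[ \sum_{k \ge 0} \tfrac{k}{N}\mu(E_k) \le \sum_{k \ge 0} \int_{E_k} f \, d\mu = \int_E f \, d\mu \le \sum_{k \ge 0} \tfrac{k+1}{N}\mu(E_k) = \sum_{k \ge 0} \tfrac{k}{N}\mu(E_k) + \frac{\mu(E)}{N}. \]
Therefore
\[ \Big| \nu(E) - \int_E f \, d\mu \Big| \le \frac{\mu(E)}{N}. \]
However $N$ is arbitrary, and hence $\nu(E) = \int_E f \, d\mu$.
Suppose that $f, g \in L^+$ are such that $\nu(E) = \int_E f \, d\mu = \int_E g \, d\mu$ for $E \in \mathcal{B}$.
Define $A = \{x : f(x) > g(x)\}$ and $A_n = \{x : f(x) \ge g(x) + \frac1n\}$. If $\mu(A) > 0$,
then $\mu(A_n) > 0$ for some $n$. But then
\[ \nu(A_n) = \int_{A_n} f \, d\mu \ge \int_{A_n} \big( g + \tfrac1n \big) \, d\mu = \int_{A_n} g \, d\mu + \tfrac1n \mu(A_n) > \nu(A_n). \]
Hence $f \le g$ a.e.($\mu$). Similarly $g \le f$ a.e.($\mu$), so that $f = g$ a.e.($\mu$).
For the general case, use the fact that $\mu$ and $\nu$ are $\sigma$-finite to chop $X$ up into a
disjoint union of countably many pieces $X_n$ so that $\mu(X_n) < \infty$ and $\nu(X_n) < \infty$
for each $n \ge 1$. Then apply the previous argument on each piece, and combine. □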

4.3.3. EXAMPLE. On $([0,1], \mathrm{Bor}([0,1]))$, consider Lebesgue measure $m$ and
counting measure $m_c$. Then $m \ll m_c$ but there is no function $f \in L^+$ so that
$m(E) = \int_E f \, dm_c$. But $m_c$ is not $\sigma$-finite, and this shows the necessity of this
condition in the Radon-Nikodym Theorem.
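
In contrast to this example, on a finite set the Radon-Nikodym derivative is simply a ratio of point masses. The sketch below assumes finite point weights stored in dictionaries; `radon_nikodym` and `integrate` are illustrative helper names, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def radon_nikodym(nu, mu):
    """Return f with nu(E) = sum over E of f * mu, assuming nu << mu on a finite set."""
    for x in mu:
        if mu[x] == 0 and nu[x] != 0:
            raise ValueError("nu is not absolutely continuous with respect to mu")
    # On mu-null points the value of f is irrelevant; 0 is chosen here.
    return {x: (Fraction(nu[x]) / mu[x] if mu[x] != 0 else Fraction(0)) for x in mu}

def integrate(f, mu, E):
    """Integral of f over E with respect to the point weights mu."""
    return sum(f[x] * mu[x] for x in E)

mu = {1: 2, 2: 0, 3: 4}
nu = {1: 1, 2: 0, 3: 6}
f = radon_nikodym(nu, mu)
```

The freedom in choosing `f` at the point 2, where `mu` vanishes, is the finite analogue of uniqueness only a.e.($\mu$).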

Next we deduce the corresponding results for signed measures.

4.3.4. COROLLARY. If $\nu$ is a signed measure on $(X, \mathcal{B})$, there is a measurable
function $f$ with $|f| = 1$ so that $\nu(E) = \int_E f \, d|\nu|$.
If in addition, $|\nu|$ and $\mu$ are $\sigma$-finite measures on $(X, \mathcal{B})$ and $\nu \ll \mu$, there
is a measurable function $g = g_+ - g_-$ with at least one of $g_\pm$ integrable so that
$\nu(E) = \int_E g \, d\mu$.

PROOF. Let $X = P \,\dot{\cup}\, N$ be the Hahn decomposition for $\nu$. Define $f(x) = 1$
on $P$ and $f(x) = -1$ on $N$. Then we have that $\nu = \nu_+ - \nu_-$ and $|\nu| = \nu_+ + \nu_-$
and $\nu_+(E) = \nu(E \cap P)$. So
\[ \int_E f \, d|\nu| = \int_{E \cap P} 1 \, d\nu_+ + \int_{E \cap N} -1 \, d\nu_- = \nu_+(E) - \nu_-(E) = \nu(E). \]
If $|\nu|$ and $\mu$ are $\sigma$-finite and $\nu \ll \mu$, then $|\nu| \ll \mu$, so by the Radon-Nikodym theorem there is a
measurable function $h \in L^+$ so that $|\nu|(E) = \int_E h \, d\mu$. Then
\[ \nu(E) = \int_{E \cap P} h \, d\mu - \int_{E \cap N} h \, d\mu = \int_E f h \, d\mu. \]
So let $g = fh = h\chi_P - h\chi_N =: g_+ - g_-$. Since $\nu$ takes at most one of the values $\pm\infty$, one of $\nu(P)$ and
$\nu(N)$ is finite, which means that one of $g_\pm$ is integrable. □
The Radon-Nikodym Theorem is usually combined with the following decomposition
result.

4.3.5. LEBESGUE DECOMPOSITION THEOREM. Let $\nu, \mu$ be two $\sigma$-finite
measures on $(X, \mathcal{B})$. Then there is a unique decomposition $\nu = \nu_a + \nu_s$ so that
$\nu_a \ll \mu$ and $\nu_s \perp \mu$.

PROOF. Let $\lambda = \mu + \nu$. This is also a $\sigma$-finite measure on $(X, \mathcal{B})$. Clearly
$\mu \ll \lambda$ and $\nu \ll \lambda$. By the Radon-Nikodym Theorem, there are functions $f, g \in L^+$
so that
\[ \mu(E) = \int_E f \, d\lambda \quad\text{and}\quad \nu(E) = \int_E g \, d\lambda \quad\text{for } E \in \mathcal{B}. \]
Let $A = f^{-1}(0, \infty]$ and $B = f^{-1}(0)$. Set $\nu_a(E) = \nu(E \cap A)$ and $\nu_s(E) = \nu(E \cap B)$.
Clearly $\nu = \nu_a + \nu_s$. Also $\nu_s \perp \mu$ since it is supported on $B$, which
is a $\mu$-null set. If $\mu(E) = 0$, then $f\chi_E = 0$ a.e.($\lambda$); and hence $\lambda(E \cap A) = 0$.
Therefore $\nu_a(E) = \nu(E \cap A) = 0$. Thus $\nu_a \ll \mu$.
Uniqueness. Suppose that $\nu = \nu_a' + \nu_s'$ is another decomposition so that $\nu_a' \ll \mu$
and $\nu_s' \perp \mu$. Then there is a set $A' \in \mathcal{B}$ so that $\nu_s'(A') = 0$ and $\mu(B') = 0$, where
$B' = A'^c$. Thus
\[ \mu(B \cup B') = 0 \quad\text{and}\quad \nu_s(A \cap A') = \nu_s'(A \cap A') = 0. \]
Now if $E \subset B \cup B'$, then $\mu(E) = 0$ and so $\nu_a(E) = \nu_a'(E) = 0$; and so $\nu_s(E) = \nu_s'(E) = \nu(E)$.
On the other hand, if $E \subset A \cap A'$, then $\nu_s(E) = \nu_s'(E) = 0$.
Combining, we deduce that $\nu_s' = \nu_s$. Hence $\nu_a' = \nu_a$ as well. □
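
The construction in this proof can be traced step by step on a finite set. The sketch below assumes nonnegative integer point masses (an assumption made only so that exact fractions can be used), and `lebesgue_decomposition` is a hypothetical helper that follows the proof: form $\lambda = \mu + \nu$, compute $f = d\mu/d\lambda$, and split $\nu$ along $A = \{f > 0\}$ and $B = \{f = 0\}$.

```python
from fractions import Fraction

def lebesgue_decomposition(nu, mu):
    """Split nu = nu_a + nu_s with nu_a << mu and nu_s singular to mu (finite set)."""
    lam = {x: mu[x] + nu[x] for x in mu}                       # lambda = mu + nu
    f = {x: (Fraction(mu[x], lam[x]) if lam[x] > 0 else Fraction(0)) for x in mu}
    A = {x for x in mu if f[x] > 0}                            # where d(mu)/d(lambda) > 0
    nu_a = {x: (nu[x] if x in A else 0) for x in mu}
    nu_s = {x: (nu[x] if x not in A else 0) for x in mu}
    return nu_a, nu_s

mu = {"a": 1, "b": 0, "c": 2}
nu = {"a": 3, "b": 5, "c": 0}
nu_a, nu_s = lebesgue_decomposition(nu, mu)
```

In this example the singular part is the mass of $\nu$ sitting at the point `"b"`, which $\mu$ does not see.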
We finish this section with a discussion of measures taking complex values.
These measures are not allowed to take infinite values.

4.3.6. DEFINITION. A complex measure on $(X, \mathcal{B})$ is a map $\nu : \mathcal{B} \to \mathbb{C}$
such that $\nu(\emptyset) = 0$ that is countably additive: if $E = \dot{\bigcup}_{i \ge 1} E_i$, then $\nu(E) = \sum_{i \ge 1} \nu(E_i)$
and this series converges absolutely.

4.3.7. REMARK. As in Remark 4.2.2, when $E = \dot{\bigcup}_{i \ge 1} E_i$, countable additivity
together with the fact that every set has a finite measure forces absolute convergence
of the series $\sum_{i \ge 1} \nu(E_i)$.

Note that $\operatorname{Re} \nu$ and $\operatorname{Im} \nu$ are finite signed measures. Thus there are positive
finite measures $\nu_i$ for $1 \le i \le 4$ so that $\operatorname{Re} \nu = \nu_1 - \nu_2$ and $\operatorname{Im} \nu = \nu_3 - \nu_4$. Hence
$\nu = \nu_1 - \nu_2 + i\nu_3 - i\nu_4$.

4.3.8. EXAMPLE. The main example (and essentially the only example) is obtained
as follows: let $\mu$ be a measure and let $f \in L^1(\mu)$. Define $\nu(E) = \int_E f \, d\mu$.
You can readily check that this is a complex measure. We will write $d\nu = f \, d\mu$
and $\frac{d\nu}{d\mu} = f$ in this case.

The second statement of the following result is also sometimes called the Radon-
Nikodym Theorem.

4.3.9. THEOREM. If $\nu$ is a complex measure on $(X, \mathcal{B})$, there is a unique finite
measure $|\nu|$ and measurable function $h$ with $|h| = 1$ a.e.($|\nu|$) so that $d\nu = h \, d|\nu|$.
Moreover if $\mu$ is a $\sigma$-finite measure on $(X, \mathcal{B})$, then $\nu$ decomposes as $\nu_a + \nu_s$ where
$\nu_a = f \, d\mu$ for $f \in L^1(\mu)$ and $\nu_s$ is supported on a $\mu$-null set.

PROOF. With the notation as in the discussion preceding the theorem, let
$\mu = \nu_1 + \nu_2 + \nu_3 + \nu_4$. Then $\operatorname{Re} \nu \ll \mu$ and $\operatorname{Im} \nu \ll \mu$. By the Radon-Nikodym
Theorem, there are $f, g \in L^1(\mu)$ so that $d \operatorname{Re} \nu = f \, d\mu$ and $d \operatorname{Im} \nu = g \, d\mu$. Hence
$\nu(E) = \int_E (f + ig) \, d\mu$. Define $|\nu|(E) = \int_E |f + ig| \, d\mu$ and $h = \operatorname{sign}(f + ig)$.
Then $|h| = 1$ a.e.($|\nu|$) and $\nu(E) = \int_E h \, d|\nu|$. Also $|\nu|(X) = \|f + ig\|_1 < \infty$; so
$|\nu|$ is finite. Uniqueness is left as an exercise.
Now if $\mu$ is $\sigma$-finite, the Lebesgue decomposition for $|\nu|$ yields $|\nu| = |\nu|_a + |\nu|_s$
where $d|\nu|_a = f \, d\mu$ for $f \in L^+ \cap L^1(\mu)$ and $|\nu|_s$ is supported on a $\mu$-null set $A$.
Then $d\nu = h \, d|\nu| = h \, d|\nu|_a + h \, d|\nu|_s = hf \, d\mu + h \, d|\nu|_s$. Thus $d\nu_a = hf \, d\mu \ll \mu$
and $hf \in L^1(\mu)$, and $d\nu_s = h \, d|\nu|_s$ is supported on the $\mu$-null set $A$. □

Finally, we can integrate with respect to complex measures. If $d\nu = h \, d|\nu|$ is a
complex measure and $f \in L^1(|\nu|)$, then we define
\[ \int f \, d\nu := \int f h \, d|\nu|. \]
It is easy to check that this is linear. It is advisable to convert to integrals with
respect to positive measures when taking limits.
CHAPTER 5

$L^p$ spaces

5.1. $L^p$ as a Banach space


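5.1.1. DEFINITION. Let $(X, \mathcal{B}, \mu)$ be a measure space. For $1 \le p < \infty$, let
“$L^p(\mu)$” $= \big\{ f \text{ measurable, complex valued} : \|f\|_p^p = \int |f|^p \, d\mu < \infty \big\}$.
Set
\[ \mathcal{N} = \{ f \text{ measurable, complex valued} : f = 0 \text{ a.e.}(\mu) \} \]
and define $L^p(\mu) =$ “$L^p(\mu)$”$/\mathcal{N}$ with the norm $\|[f]\|_p = \|f\|_p$.
Similarly, “$L^\infty(\mu)$” $= \{ f \text{ measurable, complex valued} : \|f\|_\infty = \operatorname{ess\,sup} |f| < \infty \}$ where
$\operatorname{ess\,sup} |f| = \sup\{ t \ge 0 : \mu(\{x : |f(x)| > t\}) > 0 \}$. Set $L^\infty(\mu) =$ “$L^\infty(\mu)$”$/\mathcal{N}$
with norm $\|[f]\|_\infty = \|f\|_\infty$.
By convention, we write elements of $L^p(\mu)$ as $f$, where the equivalence class is
understood.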

Note that “$L^p(\mu)$” is a linear space because if $f, g \in$ “$L^p(\mu)$”, then $\lambda f \in$ “$L^p(\mu)$” with
$\|\lambda f\|_p = |\lambda| \, \|f\|_p$ for $\lambda \in \mathbb{C}$; and
\[ |f + g|^p \le (2 \max\{|f|, |g|\})^p \le 2^p (|f|^p + |g|^p), \]
whence $\|f + g\|_p^p \le 2^p (\|f\|_p^p + \|g\|_p^p) < \infty$. Notice that $\mathcal{N}$ is a subspace of
“$L^p(\mu)$”, so that $L^p(\mu)$ is also a vector space. Moreover, if $f$ is measurable, then
$\|f\|_p^p = \int |f|^p \, d\mu = 0$ if and only if $f = 0$ a.e.($\mu$) if and only if $f \in \mathcal{N}$. In
particular, $\|\cdot\|_p$ is only a seminorm on “$L^p(\mu)$”. However, for $f \in L^p(\mu)$, we
have $\|f\|_p = 0$ if and only if $f = 0$. To verify that $\|\cdot\|_p$ is a norm on $L^p(\mu)$, we
need to verify the triangle inequality, which means eliminating the annoying factor of 2 in
the inequality above.
A function belongs to “$L^\infty(\mu)$” if it agrees with a bounded function a.e.($\mu$).
The triangle inequality is very easy here. It is also very easy for $L^1(\mu)$ (see
section 3.4).

5.1.2. MINKOWSKI'S INEQUALITY. Let $(X, \mathcal{B}, \mu)$ be a measure space. For
$1 < p < \infty$, the triangle inequality is valid for $L^p(\mu)$. Equality holds only when $f$
and $g$ lie in a 1-dimensional subspace.

PROOF. Let $f, g \in L^p(\mu)$. We may suppose that neither is 0, as that case is
trivial. Define $A = \|f\|_p$ and $B = \|g\|_p$, and set $f_0 = f/A$ and $g_0 = g/B$; so
$\|f_0\|_p = 1 = \|g_0\|_p$.
Consider $\varphi(x) = x^p$ on $[0, \infty)$. Note that $\varphi''(x) = p(p-1)x^{p-2} > 0$ on $(0, \infty)$,
and thus $\varphi(x)$ is a strictly convex function, meaning that for all $x_1, x_2 \in [0, \infty)$ and
$0 \le t \le 1$,
\[ \varphi(t x_1 + (1-t) x_2) \le t\varphi(x_1) + (1-t)\varphi(x_2), \]
with equality only when $x_1 = x_2$ or $t = 0$ or $t = 1$. That is, every chord between
distinct points on the curve $y = \varphi(x)$ lies strictly above the curve.
Hence for any $x \in X$,
\[ \Big( \frac{A}{A+B} |f_0(x)| + \frac{B}{A+B} |g_0(x)| \Big)^p \le \frac{A}{A+B} |f_0(x)|^p + \frac{B}{A+B} |g_0(x)|^p \]
with equality only when $|f_0(x)| = |g_0(x)|$. Also
\[ \frac{1}{A+B} |f(x) + g(x)| \le \frac{|f(x)| + |g(x)|}{A+B} = \frac{A}{A+B} |f_0(x)| + \frac{B}{A+B} |g_0(x)|, \]
with equality only when $\operatorname{sign}(f(x)) = \operatorname{sign}(g(x))$. Integrate the $p$th power:
\begin{align*}
\frac{1}{(A+B)^p} \int |f(x) + g(x)|^p \, d\mu &\le \frac{A}{A+B} \int |f_0(x)|^p \, d\mu + \frac{B}{A+B} \int |g_0(x)|^p \, d\mu \\
&= \frac{A}{A+B} \|f_0\|_p^p + \frac{B}{A+B} \|g_0\|_p^p = 1.
\end{align*}
Multiply through by $(A+B)^p$ and take the $p$th root to get
\[ \|f + g\|_p \le A + B = \|f\|_p + \|g\|_p. \]
For equality, we require that $|f_0(x)| = |g_0(x)|$ a.e. and $\operatorname{sign}(f(x)) = \operatorname{sign}(g(x))$ a.e.
This implies that $f_0 = g_0$ a.e.($\mu$), so $g = Bf/A$; i.e., $g$ is a scalar multiple of $f$. □
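
Minkowski's inequality and its equality case are easy to test numerically for counting measure on a finite set, where functions are just finite lists. The sketch below makes that assumption; `p_norm` and `minkowski_gap` are illustrative names only.

```python
def p_norm(v, p):
    """The p-norm of a finite list with respect to counting measure, 1 <= p < infinity."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

def minkowski_gap(f, g, p):
    """(||f||_p + ||g||_p) - ||f + g||_p, which should never be negative."""
    total = [a + b for a, b in zip(f, g)]
    return p_norm(f, p) + p_norm(g, p) - p_norm(total, p)

f = [1.0, -2.0, 3.0]
g = [4.0, 0.0, -1.0]
gap_generic = minkowski_gap(f, g, 3)                       # strictly positive here
gap_parallel = minkowski_gap(f, [2.0 * t for t in f], 3)   # zero up to rounding
```

The gap vanishes (up to floating-point error) exactly when one list is a nonnegative multiple of the other, matching the equality statement above.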

Now that we have established that $L^p(\mu)$ is a normed space, we will show
that it is complete, and so is a Banach space. This extends the result for $L^1(\mu)$,
Theorem 3.4.2. This is another important piece of evidence that this is the right
context for integration.

5.1.3. RIESZ-FISCHER THEOREM. Let $(X, \mathcal{B}, \mu)$ be a measure space. For
$1 \le p \le \infty$, $L^p(\mu)$ is complete.

PROOF. Suppose that $(f_n)_{n \ge 1}$ is a Cauchy sequence in $L^p(\mu)$. Select a subsequence
$(n_j)_{j \ge 1}$ so that $\|f_{n_j} - f_m\|_p < 2^{-j}$ for all $m > n_j$. Define
\[ h_k = |f_{n_1}| + \sum_{j=1}^{k-1} |f_{n_{j+1}} - f_{n_j}| \quad\text{and}\quad h = \lim_{k \to \infty} h_k. \]
Note that $0 \le h_k \le h_{k+1}$. By the triangle inequality,
\[ \|h_k\|_p \le \|f_{n_1}\|_p + \sum_{j=1}^{k-1} 2^{-j} < \|f_{n_1}\|_p + 1. \]
Since $h_k^p$ are increasing to $h^p$, the MCT shows that $\|h\|_p \le \|f_{n_1}\|_p + 1$ when
$p < \infty$. For $p = \infty$, $\|h\|_\infty \le \|f_{n_1}\|_\infty + 1$ follows directly. So $h \in L^p(\mu)$, and in
particular, $h(x) < \infty$ a.e.($\mu$).
Note that $f_{n_k} = f_{n_1} + \sum_{j=1}^{k-1} (f_{n_{j+1}} - f_{n_j})$. Let
\[ f = f_{n_1} + \sum_{j \ge 1} (f_{n_{j+1}} - f_{n_j}) = \lim_{k \to \infty} f_{n_k}. \]
This series converges absolutely whenever $h(x) < \infty$, and so is defined a.e.($\mu$).
Moreover $|f| \le h$, so $f \in L^p(\mu)$. Also $|f - f_{n_k}|^p \le \big( \sum_{j \ge k} |f_{n_j} - f_{n_{j+1}}| \big)^p \le h^p$
when $p < \infty$. Thus by the LDCT,
\[ \lim_{k \to \infty} \|f - f_{n_k}\|_p^p = \lim_{k \to \infty} \int |f - f_{n_k}|^p \, d\mu = \int \lim_{k \to \infty} |f - f_{n_k}|^p \, d\mu = 0. \]
When $p = \infty$, $\|f - f_{n_k}\|_\infty \le \sum_{j \ge k} \|f_{n_j} - f_{n_{j+1}}\|_\infty < 2^{1-k}$. Therefore $f_{n_k}$
converges to $f$ in $L^p(\mu)$. Since $(f_n)_{n \ge 1}$ is Cauchy, in fact $f_n \to f$ in $L^p(\mu)$. So
$L^p(\mu)$ is complete. □

5.1.4. EXAMPLES.
(1) $L^p(0,1) = L^p((0,1), m)$. Here Lebesgue measure, $m$, is a probability measure
on $(0,1)$. If $1 \le p < r < \infty$ and $f \in L^r(0,1)$, then
\[ \|f\|_p^p = \int |f|^p \, dm \le \int_{|f| \le 1} 1 \, dm + \int_{|f| > 1} |f|^r \, dm \le 1 + \|f\|_r^r < \infty. \]
Thus $f \in L^p(0,1)$. So $L^1(0,1) \supset L^p(0,1) \supset L^r(0,1) \supset L^\infty(0,1)$. These
subspaces are not closed in the larger spaces. In the next section, we will improve
the norm inequality. We also write $L^p(\mathbb{R})$ for $L^p((\mathbb{R}, m))$.
(2) Consider $(\mathbb{N}, \mathcal{P}(\mathbb{N}), m_c)$. Then
\[ L^p(\mathbb{N}, m_c) = l^p = \Big\{ (a_i)_{i \ge 1} : \|(a_i)\|_p = \Big( \sum_{i \ge 1} |a_i|^p \Big)^{1/p} < \infty \Big\} \]
for $p < \infty$, and $L^\infty(\mathbb{N}, m_c) = l^\infty = \big\{ (a_i)_{i \ge 1} : \|(a_i)\|_\infty = \sup_{i \ge 1} |a_i| < \infty \big\}$. If
$1 \le p < r < \infty$ and $(a_i) \in l^p$, then
\begin{align*}
\|(a_i)\|_r^r = \sum_{i \ge 1} |a_i|^r &= \sum_{i \ge 1} |a_i|^p |a_i|^{r-p} \\
&\le \Big( \sum_{i \ge 1} |a_i|^p \Big) \sup_{i \ge 1} |a_i|^{r-p} \le \|(a_i)\|_p^p \, \|(a_i)\|_p^{r-p} = \|(a_i)\|_p^r.
\end{align*}
Thus $\|(a_i)\|_r \le \|(a_i)\|_p$. In particular, $l^1 \subset l^p \subset l^r \subset l^\infty$.
(3) Let $\mu$ be the measure on $(\mathbb{N}, \mathcal{P}(\mathbb{N}))$ given by $\mu(A) = \infty$ if $A \ne \emptyset$ and $\mu(\emptyset) = 0$.
Then $L^p(\mu) = \{0\}$ if $1 \le p < \infty$ and $L^\infty(\mu) = l^\infty$. You need sets of non-zero
finite measure to get interesting functions in $L^p(\mu)$.
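
The opposite monotonicity of the norms in examples (1) and (2) can be seen on a two-point space. The following sketch assumes a finite set with point weights; `lp_norm` is an illustrative helper in which `weights=None` means counting measure.

```python
def lp_norm(a, p, weights=None):
    """The L^p norm of a finite list for the measure with the given point weights."""
    if weights is None:
        weights = [1.0] * len(a)            # counting measure
    if p == float("inf"):
        return max(abs(t) for t, w in zip(a, weights) if w > 0)
    return sum(w * abs(t) ** p for t, w in zip(a, weights)) ** (1.0 / p)

a = [3.0, 4.0]
inf = float("inf")
counting = [lp_norm(a, p) for p in (1, 2, inf)]                  # decreasing in p
uniform = [lp_norm(a, p, [0.5, 0.5]) for p in (1, 2, inf)]       # increasing in p
```

With counting measure the norms decrease as $p$ grows, while for the probability measure with weights $\frac12, \frac12$ they increase, in line with the inclusions above.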

5.1.5. PROPOSITION. Let $(X, \mathcal{B}, \mu)$ be a measure space and let $1 \le p < \infty$. Then
the simple functions of finite support ($\varphi = \sum_{i=1}^n a_i \chi_{A_i}$, where $\mu(A_i) < \infty$) are
dense in $L^p(\mu)$. For $p = \infty$, the set of all simple functions is dense in $L^\infty(\mu)$.

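PROOF. Fix $f \in L^p(\mu)$ for $p < \infty$. Choose simple functions $\varphi_n$ so that
\[ |\varphi_n| \le |\varphi_{n+1}| \le |f| \quad\text{and}\quad f(x) = \lim_{n \to \infty} \varphi_n(x). \]
Then $|\varphi_n|^p \le |f|^p$ is integrable, so $\varphi_n \in L^p(\mu)$. In particular, each set $A_i$ in the
definition of $\varphi_n$ for which $a_i \ne 0$ must have finite measure. Also since
$|f - \varphi_n| \le |f| + |\varphi_n| \le 2|f|$, we have $|f - \varphi_n|^p \le 2^p |f|^p$. Therefore by the LDCT,
\[ \lim_{n \to \infty} \|f - \varphi_n\|_p^p = \lim_{n \to \infty} \int |f - \varphi_n|^p \, d\mu = \int \lim_{n \to \infty} |f - \varphi_n|^p \, d\mu = 0. \]
So $\varphi_n$ converge to $f$ in $L^p(\mu)$.
When $p = \infty$, suppose that $\|f\|_\infty = N < \infty$. For $n \ge 1$ and integer pairs $(j, k) \in [-nN, nN]^2$,
let $A_{j,k} = \{x : \frac{j}{n} \le \operatorname{Re} f(x) < \frac{j+1}{n},\ \frac{k}{n} \le \operatorname{Im} f(x) < \frac{k+1}{n}\}$. These
are measurable sets, and $\varphi_n = \sum_{j,k=-nN}^{nN} \frac{j + ik}{n} \chi_{A_{j,k}}$ satisfies $\|f - \varphi_n\|_\infty \le \frac{\sqrt2}{n}$.
So the simple functions are dense. If $\mu(A) = \infty$, then $\chi_A$ is not the limit of simple
functions with finite support. □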

Recall that if $X$ is a topological space, then $C_c(X)$ denotes the space of continuous
functions on $X$ with compact support (i.e. the closure of $\{x : f(x) \ne 0\}$ is compact).

5.1.6. COROLLARY. If $1 \le p < \infty$, then $C_c(\mathbb{R})$ is dense in $L^p(\mathbb{R})$. This is
false for $p = \infty$.

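PROOF. Let $f \in L^p(\mathbb{R})$ and let $\varepsilon > 0$. Use Proposition 5.1.5 to find a simple
function of finite support $\varphi \ne 0$ with $\|f - \varphi\|_p < \varepsilon/2$. Write $\varphi = \sum_{i=1}^n a_i \chi_{A_i}$, where the $A_i$ are disjoint and
$m(A_i) < \infty$. Let $L = \sup |a_i|$ and $\eta = \frac12 \big( \frac{\varepsilon}{2nL} \big)^p$. By the regularity of Lebesgue measure
(Theorem 1.5.6), there are compact sets $K_i$ so that $K_i \subset A_i$ and $m(A_i \setminus K_i) < \eta$.
Choose disjoint open sets $U_i \supset K_i$ so that $m(U_i \setminus K_i) < \eta$ and $\overline{U_i}$ is compact.
For each $1 \le i \le n$, let $h_i(x) = \dfrac{\operatorname{dist}(x, U_i^c)}{\operatorname{dist}(x, K_i) + \operatorname{dist}(x, U_i^c)}$. Note that this is a continuous
function which vanishes off of $U_i$, has $h_i(x) = 1$ on $K_i$, and takes values in $[0,1]$.
Then $\chi_{A_i} - h_i$ vanishes on $K_i$ and on $U_i^c \cap A_i^c$, and $|\chi_{A_i} - h_i| \le 1$; so
\[ \|\chi_{A_i} - h_i\|_p^p \le m(A_i \setminus K_i) + m(U_i \setminus A_i) < 2\eta = \Big( \frac{\varepsilon}{2nL} \Big)^p. \]
Hence $h = \sum_{i=1}^n a_i h_i$ belongs to $C_c(\mathbb{R})$ and
\[ \|\varphi - h\|_p \le \sum_{i=1}^n |a_i| \, \|\chi_{A_i} - h_i\|_p < nL \frac{\varepsilon}{2nL} = \frac{\varepsilon}{2}. \]
Thus $\|f - h\|_p < \varepsilon$.
Clearly the constant function 1 is not a limit in $L^\infty(\mathbb{R})$ of functions of compact
support. So this fails for $p = \infty$. □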

5.2. Duality for Normed Vector Spaces


In this section, we recall some basic facts about linear functionals on normed
vector spaces.

5.2.1. DEFINITION. Let $(V, \|\cdot\|)$ be a normed vector space over $\mathbb{F} \in \{\mathbb{R}, \mathbb{C}\}$.
Let $L(V, \mathbb{F})$ denote the vector space of linear maps of $V$ into the scalars, the linear
functionals, and let $V^*$ denote the dual space of $V$ consisting of all continuous
linear functionals.

5.2.2. PROPOSITION. Let $(V, \|\cdot\|)$ be a normed vector space over $\mathbb{F}$, and let
$\varphi \in L(V, \mathbb{F})$. The following are equivalent:
(1) $\varphi$ is continuous.
(2) $\|\varphi\| := \sup\{|\varphi(v)| : \|v\| \le 1\} < \infty$.
(3) $\varphi$ is continuous at $v = 0$.

PROOF. (2) ⇒ (1). If $u \ne v \in V$, set $w = \frac{u - v}{\|u - v\|}$ and note that
\[ |\varphi(u) - \varphi(v)| = |\varphi(u - v)| = |\varphi(w)| \, \|u - v\| \le \|\varphi\| \, \|u - v\|. \]
Hence $\varphi$ is Lipschitz, and in particular is continuous.
(1) ⇒ (3) is trivial.
(3) ⇒ (2). Assume that (2) fails. Then there are vectors $v_n \in V$ with $\|v_n\| = 1$
and $|\varphi(v_n)| > n^2$. Thus $\frac1n v_n \to 0$ while $|\varphi(\frac1n v_n)| > n$ diverges. So $\varphi$ is
discontinuous at 0. The result follows. □

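5.2.3. THEOREM. Let $(V, \|\cdot\|)$ be a normed vector space. Then $(V^*, \|\cdot\|)$ is
a Banach space.

PROOF. First we show that $\|\cdot\|$ is a norm. Clearly $\|\varphi\| = 0$ if and only if
$\varphi(v) = 0$ for all $v \in V$ with $\|v\| \le 1$. This forces $\varphi = 0$ by linearity. Also if
$\lambda \in \mathbb{F}$, then
\[ \|\lambda\varphi\| = \sup_{\|v\| \le 1} |\lambda\varphi(v)| = |\lambda| \sup_{\|v\| \le 1} |\varphi(v)| = |\lambda| \, \|\varphi\|. \]
For the triangle inequality, take $\varphi, \psi \in V^*$. Then
\[ \|\varphi + \psi\| = \sup_{\|v\| \le 1} |\varphi(v) + \psi(v)| \le \sup_{\|v\| \le 1} |\varphi(v)| + \sup_{\|v\| \le 1} |\psi(v)| = \|\varphi\| + \|\psi\|. \]
To establish completeness, let $(\varphi_n)_{n \ge 1}$ be a Cauchy sequence in $V^*$. For each
$v \in V$, $|\varphi_m(v) - \varphi_n(v)| \le \|\varphi_m - \varphi_n\| \, \|v\|$. It follows that $(\varphi_n(v))_{n \ge 1}$ is a Cauchy
sequence in $\mathbb{F}$. Thus we may define $\varphi(v) = \lim_{n \to \infty} \varphi_n(v)$. Then
\[ \varphi(\lambda u + \mu v) = \lim_{n \to \infty} \varphi_n(\lambda u + \mu v) = \lim_{n \to \infty} \big( \lambda\varphi_n(u) + \mu\varphi_n(v) \big) = \lambda\varphi(u) + \mu\varphi(v). \]
Therefore $\varphi$ is linear. Now let $\varepsilon > 0$ and select $N$ so that if $m, n \ge N$, then
$\|\varphi_m - \varphi_n\| < \varepsilon$. In particular, if $\|v\| \le 1$, we have $|\varphi_m(v) - \varphi_n(v)| < \varepsilon$. Holding
$m$ fixed and letting $n \to \infty$, we obtain that $|\varphi_m(v) - \varphi(v)| \le \varepsilon$. Taking the
supremum over all $v$ with $\|v\| \le 1$ yields $\|\varphi_m - \varphi\| \le \varepsilon$ when $m \ge N$. In
particular, $\|\varphi\| \le \|\varphi_m\| + \|\varphi_m - \varphi\| < \infty$; so $\varphi \in V^*$. Moreover we have shown
that $\lim_{m \to \infty} \varphi_m = \varphi$ in $(V^*, \|\cdot\|)$. So $V^*$ is complete. □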

5.3. Duality for Lp


In this section, we determine the Banach space dual of the spaces $L^p(\mu)$ for
$1 \le p < \infty$. We need another important inequality.

5.3.1. LEMMA. If $a, b \in (0, \infty)$ and $0 \le t \le 1$, then
\[ a^t b^{1-t} \le ta + (1-t)b \]
with equality only when $a = b$ or $t = 0$ or 1.

PROOF. This is just the AM-GM inequality. The function $f(x) = e^x$ is strictly
convex. So $e^{t\alpha + (1-t)\beta} \le te^\alpha + (1-t)e^\beta$, with equality only when $\alpha = \beta$ or $t = 0$
or 1. Take $a = e^\alpha$ and $b = e^\beta$ and the result follows. □

5.3.2. HÖLDER'S INEQUALITY. Let $(X, \mathcal{B}, \mu)$ be a measure space. Let
$1 < p < \infty$ and define $q$ so that $\frac1p + \frac1q = 1$. If $f \in L^p(\mu)$ and $g \in L^q(\mu)$, then
$fg \in L^1(\mu)$ and
\[ \|fg\|_1 \le \|f\|_p \|g\|_q. \]
Equality holds if and only if $|f|^p$ and $|g|^q$ are collinear.

PROOF. We may suppose that $f, g$ are non-zero since the inequality is trivial
if $fg = 0$ a.e.($\mu$). Let $f_0 = |f|/A$ where $A = \|f\|_p$ and $g_0 = |g|/B$ where
$B = \|g\|_q$. Apply Lemma 5.3.1 to $a = f_0(x)^p$ and $b = g_0(x)^q$ with $t = \frac1p$ and
$1 - t = \frac1q$. Then
\[ \frac{|f(x)g(x)|}{AB} = f_0(x) g_0(x) \le \frac1p f_0(x)^p + \frac1q g_0(x)^q = \frac{1}{pA^p} |f(x)|^p + \frac{1}{qB^q} |g(x)|^q. \]
Integrate to get
\[ \frac{\|fg\|_1}{AB} \le \frac{1}{pA^p} \|f\|_p^p + \frac{1}{qB^q} \|g\|_q^q = \frac1p + \frac1q = 1. \]
Hence $\|fg\|_1 \le AB = \|f\|_p \|g\|_q$.
Equality holds in the first inequality only when $f_0(x)^p = g_0(x)^q$. For it to hold
for the integral, this identity must hold a.e.($\mu$). Hence $|g|^q = \frac{B^q}{A^p} |f|^p$ a.e.($\mu$). □
For more elementary reasons, if $f \in L^1(\mu)$ and $g \in L^\infty(\mu)$, we have that
$fg \in L^1(\mu)$ and $\|fg\|_1 \le \|f\|_1 \|g\|_\infty$.
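
Both the inequality and its equality case can be checked on a finite measure space. The sketch below assumes functions given as lists with a list `w` of point weights; `weighted_norm` and `holder_sides` are illustrative names. With $p = 3$ and $q = \frac32$, taking $g = f^2$ for positive $f$ makes $|g|^q = |f|^p$, so equality should hold.

```python
def weighted_norm(v, w, p):
    """The L^p norm of the list v for the measure with point weights w."""
    return sum(wi * abs(vi) ** p for vi, wi in zip(v, w)) ** (1.0 / p)

def holder_sides(f, g, w, p):
    """Return (||fg||_1, ||f||_p ||g||_q) with q the conjugate exponent of p > 1."""
    q = p / (p - 1.0)
    lhs = sum(wi * abs(fi * gi) for fi, gi, wi in zip(f, g, w))
    rhs = weighted_norm(f, w, p) * weighted_norm(g, w, q)
    return lhs, rhs

w = [1.0, 1.0, 1.0]
f = [1.0, 2.0, 3.0]
lhs_generic, rhs_generic = holder_sides(f, [2.0, -1.0, 0.5], w, 3.0)
lhs_equal, rhs_equal = holder_sides(f, [t ** 2 for t in f], w, 3.0)
```

In the second call both sides equal 36 up to rounding, since $\sum |f|^3 = \sum |g|^{3/2} = 36$.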

5.3.3. EXAMPLE. We return to Example 5.1.4(1). Suppose that $\mu$ is a probability
measure. If $1 \le p < r < \infty$ and $f \in L^r(\mu)$, choose $s$ so that $\frac{p}{r} + \frac1s = 1$. Note
that $|f|^p \in L^{r/p}(\mu)$. Then
\[ \|f\|_p^p = \big\| |f|^p \cdot 1 \big\|_1 \le \big\| |f|^p \big\|_{r/p} \, \|1\|_s = \|f\|_r^p. \]
Hence $\|f\|_p \le \|f\|_r$.
More generally if $\mu(X) < \infty$, the same computation yields
\[ \|f\|_p \le \|1\|_s^{1/p} \, \|f\|_r = \mu(X)^{1/sp} \, \|f\|_r = \mu(X)^{\frac1p - \frac1r} \, \|f\|_r. \]
When every point has measure 1, you get the $l^p$ spaces, where the inequalities
are reversed. See Example 5.1.4(2).
For the spaces $L^p(\mathbb{R})$, there is no containment between $L^p(\mathbb{R})$ and $L^r(\mathbb{R})$ when
$p \ne r$.

We now get to the main result of this section.

5.3.4. THEOREM (Riesz). Let $(X, \mathcal{B}, \mu)$ be a $\sigma$-finite measure space. Suppose that
$1 \le p < \infty$, and let $q$ satisfy $\frac1p + \frac1q = 1$, where $q = \infty$ when $p = 1$. Then $L^p(\mu)^* = L^q(\mu)$
via the isometric pairing $L^q(\mu) \ni g \mapsto \Phi_g$, where $\Phi_g(f) = \int f g \, d\mu$.

5.3.5. LEMMA. Suppose that $\mu$ is $\sigma$-finite, $1 \le p < \infty$ and $g \in L^q(\mu)$. Then
$\|\Phi_g\| = \|g\|_q$.

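PROOF. First suppose that $1 < p < \infty$. By Hölder’s inequality
\[ |\Phi_g(f)| = \Big| \int fg \, d\mu \Big| \le \|fg\|_1 \le \|f\|_p \|g\|_q. \]
Hence $\|\Phi_g\| = \sup_{\|f\|_p \le 1} |\Phi_g(f)| \le \|g\|_q$. On the other hand, if $g \ne 0$, let
\[ f = \frac{|g|^{q-1}\, \overline{\operatorname{sign}(g)}}{\|g\|_q^{q-1}}, \]
where $\operatorname{sign}(g) = g/|g|$ when $g \ne 0$, so that $\overline{\operatorname{sign}(g)}\, g = |g|$. We will use that $p(q-1) = pq(1 - \frac1q) = q$. Then $|f|^p = \dfrac{|g|^q}{\|g\|_q^q}$. So
\[ \|f\|_p^p = \int \frac{|g|^q}{\|g\|_q^q} \, d\mu = 1. \]
Therefore
\[ \|\Phi_g\| \ge \Phi_g(f) = \int \frac{|g|^{q-1}\, \overline{\operatorname{sign}(g)}}{\|g\|_q^{q-1}} \, g \, d\mu = \frac{1}{\|g\|_q^{q-1}} \int |g|^q \, d\mu = \|g\|_q. \]
For $p = 1$, take $g \in L^\infty(\mu)$. As above, $|\Phi_g(f)| \le \int |f| \, \|g\|_\infty \, d\mu = \|f\|_1 \|g\|_\infty$. Conversely, given $\varepsilon > 0$, let $A = \{x : |g(x)| > \|g\|_\infty - \varepsilon\}$. Then $\mu(A) > 0$. Since µ is σ-finite, there is a measurable subset $E \subset A$ with $0 < \mu(E) < \infty$. Let $f = \dfrac{\overline{\operatorname{sign}(g)}}{\mu(E)} \chi_E$. Then $\|f\|_1 = 1$ and
\[ \Phi_g(f) = \frac{1}{\mu(E)} \int_E |g| \, d\mu > \|g\|_\infty - \varepsilon. \]
Since $\varepsilon > 0$ was arbitrary, $\|\Phi_g\| \ge \|g\|_\infty$. Thus $\|\Phi_g\| = \|g\|_\infty$. □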
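
The extremal function used in the case $1 < p < \infty$ can be checked on a finite measure space with real-valued $g$. This is a minimal sketch under those assumptions; the names `extremal_function` and `pairing`, the weights and the values of `g` are invented for illustration.

```python
def lp_norm(values, weights, p):
    """L^p norm on a finite measure space whose points carry masses `weights`."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

def sign(v):
    """Sign of a real number: -1, 0 or 1."""
    return (v > 0) - (v < 0)

def extremal_function(g, weights, p):
    """f = |g|^(q-1) sign(g) / ||g||_q^(q-1) for real-valued g and 1 < p < infinity."""
    q = p / (p - 1.0)
    norm_q = lp_norm(g, weights, q)
    return [abs(v) ** (q - 1) * sign(v) / norm_q ** (q - 1) for v in g]

def pairing(f, g, weights):
    """Phi_g(f), the integral of f*g against the point masses."""
    return sum(w * a * b for a, b, w in zip(f, g, weights))

weights = [0.5, 2.0, 1.25, 3.0]   # illustrative point masses
g = [1.0, -2.0, 0.5, 4.0]         # illustrative real-valued g
p, q = 3.0, 1.5                   # conjugate exponents: 1/3 + 2/3 = 1

f = extremal_function(g, weights, p)
norm_f = lp_norm(f, weights, p)    # should be 1 up to rounding
value = pairing(f, g, weights)     # should equal ||g||_q up to rounding
norm_g = lp_norm(g, weights, q)
```

Here `norm_f` is the unit $L^p$ norm of the test function and `value` recovers `norm_g`, which is the lower bound $\|\Phi_g\| \ge \|g\|_q$ in this finite setting.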

5.3.6. LEMMA. Suppose that µ is σ-finite, $1 \le p < \infty$ and $g$ is a measurable function such that
\[ \Big| \int \varphi g \, d\mu \Big| \le M \|\varphi\|_p \quad \text{for all simple } \varphi \text{ of finite support.} \]
Then $g \in L^q(\mu)$ and $\|g\|_q \le M$.

PROOF. First take $1 < p < \infty$, so that $q < \infty$. Suppose first that $g$ is real valued. Choose simple functions $\psi_n$ so that $|\psi_n| \le |\psi_{n+1}| \le |g|$ and $\psi_n \to g$. Since µ is σ-finite, we can write $X = \bigcup_{n \ge 1} X_n$ where $X_n \subset X_{n+1}$ and $\mu(X_n) < \infty$ for all $n$. Then $\varphi_n = \psi_n \chi_{X_n}$ are simple functions with finite support such that $|\varphi_n| \le |\varphi_{n+1}| \le |g|$ and $\varphi_n \to g$. Analogous to the previous lemma, define
\[ f_n = \frac{|\varphi_n|^{q-1} \operatorname{sign}(g)}{\|\varphi_n\|_q^{q-1}}. \]
Note that $\operatorname{sign}(g)$ takes only the values $\pm 1, 0$ and so $f_n$ is a simple function of finite support. As in the previous lemma, $\|f_n\|_p = 1$. Therefore
\[ M \ge \sup_{n \ge 1} \int f_n g \, d\mu = \sup_{n \ge 1} \int \frac{|\varphi_n|^{q-1} |g|}{\|\varphi_n\|_q^{q-1}} \, d\mu \ge \sup_{n \ge 1} \int \frac{|\varphi_n|^q}{\|\varphi_n\|_q^{q-1}} \, d\mu = \sup_{n \ge 1} \|\varphi_n\|_q = \|g\|_q. \]
The last equality follows from the MCT.
Now for complex valued $g$, note that $\operatorname{Re} g$ and $\operatorname{Im} g$ satisfy the hypotheses of the lemma. Thus they both belong to $L^q(\mu)$, and hence $g \in L^q(\mu)$. So by Lemma 5.3.5, $\|\Phi_g\| = \|g\|_q$. Proposition 5.1.5 shows that simple functions of finite support are dense in $L^p(\mu)$, and hence the optimal constant $M$ must be $\|\Phi_g\|$; so $\|g\|_q \le M$.
For $p = 1$, we need to show that $\|g\|_\infty \le M$. If this fails, then for some $N > M$, $\{x : |g(x)| > N\}$ has positive measure. Hence there is a $\theta \in \mathbb{R}$ so that $A = \{x : \operatorname{Re} e^{i\theta} g(x) > M\}$ has positive measure. Let $E \subset A$ have positive and finite measure. Define $f = e^{i\theta} \mu(E)^{-1} \chi_E$. Then $\|f\|_1 = 1$ and
\[ M \ge \operatorname{Re} \int fg \, d\mu > M. \]
This is a contradiction, and hence $\|g\|_\infty \le M$. □

PROOF OF THEOREM 5.3.4. First assume that $\mu(X) < \infty$. Let $\Phi \in L^p(\mu)^*$. Define $\nu(E) = \Phi(\chi_E)$ for $E \in \mathcal{B}$. Note that $\nu(\varnothing) = \Phi(0) = 0$. Also if $E = \dot\bigcup_{i \ge 1} E_i$ is a disjoint union of sets in $\mathcal{B}$, then
\[ \Big\| \chi_E - \sum_{i=1}^n \chi_{E_i} \Big\|_p^p = \mu\Big( \dot\bigcup_{i > n} E_i \Big) \to 0 \quad \text{as } n \to \infty. \]
Since $\Phi$ is continuous,
\[ \nu(E) = \Phi(\chi_E) = \lim_{n \to \infty} \Phi\Big( \sum_{i=1}^n \chi_{E_i} \Big) = \sum_{i \ge 1} \nu(E_i). \]
Therefore $\nu$ is countably additive, and thus is a complex measure. Moreover if $\mu(E) = 0$, then $\chi_E = 0$ a.e.(µ) and hence $\nu(E) = \Phi(0) = 0$. Thus $\nu \ll \mu$.
By the Radon-Nikodym Theorem 4.3.2, there is a measurable $g \in L^1(\mu)$ so that $\nu(E) = \int_E g \, d\mu$. If $\varphi = \sum_{i=1}^n a_i \chi_{E_i}$ is a simple function (with finite support), then
\[ \Phi(\varphi) = \sum_{i=1}^n a_i \nu(E_i) = \int \varphi \, d\nu = \int \varphi g \, d\mu. \]
Thus
\[ \Big| \int \varphi g \, d\mu \Big| = |\Phi(\varphi)| \le \|\Phi\| \, \|\varphi\|_p. \]
By Lemma 5.3.6, $g \in L^q(\mu)$, and so $\Phi_g$ is a continuous functional on $L^p(\mu)$ which agrees with $\Phi$ on all simple functions of finite support. Such functions are dense in $L^p(\mu)$ by Proposition 5.1.5. So by continuity, $\Phi = \Phi_g$. By Lemma 5.3.5, $\|\Phi\| = \|\Phi_g\| = \|g\|_q$. Moreover this shows that $g$ is unique, because distinct $L^q(\mu)$ functions yield distinct functionals.
Now consider when $\mu(X) = \infty$. We can write $X = \bigcup_{n \ge 1} X_n$ where $X_n \subset X_{n+1}$ and $\mu(X_n) < \infty$. We can restrict $\Phi$ to $L^p(X_n, \mu|_{X_n})$ (since this is a closed subspace of $L^p(\mu)$). Call it $\Phi_n$. The first part of the proof shows that there is a unique $g_n \in L^q(X_n, \mu|_{X_n})$ so that $\Phi_n = \Phi_{g_n}$ and $\|g_n\|_q \le \|\Phi\|$. Let $g = \bigcup_{n \ge 1} g_n$ be the unique measurable function such that $g|_{X_n} = g_n$. The various functions must match up a.e.(µ) because of uniqueness. By the MCT,
\[ \|g\|_q = \lim_{n \to \infty} \|g_n\|_q \le \|\Phi\|. \]
So $\Phi$ and $\Phi_g$ agree on the dense subspace $\bigcup_{n \ge 1} L^p(X_n, \mu|_{X_n})$ and they are continuous functionals; and so $\Phi = \Phi_g$. Again by Lemma 5.3.5, $\|\Phi\| = \|g\|_q$. □

5.3.7. REMARK. In fact, σ-finiteness is not needed to obtain $L^p(\mu)^* = L^q(\mu)$ provided that $1 < p < \infty$. See Folland’s book for the details.
However σ-finiteness is critical for establishing that $L^1(\mu)^* = L^\infty(\mu)$. In Example 5.1.4(3), each point in $\mathbb{N}$ has $\mu(\{n\}) = \infty$ and so $L^1(\mu) = \{0\}$ while $L^\infty(\mu) = l^\infty$. However $L^1(\mu)^* = \{0\}$.
A more subtle example is to take $(\mathbb{R}, \mathcal{P}(\mathbb{R}), m_c)$. Let $\mathcal{B}$ be the σ-algebra of countable and co-countable subsets of $\mathbb{R}$. Let $\mu = m_c|_{\mathcal{B}}$. Any function in $L^1(m_c)$ has countable support, and thus is $\mathcal{B}$-measurable. Therefore $L^1(\mu) = L^1(m_c)$. However changing the measure has a dramatic effect on the $L^\infty$ spaces. The space $L^\infty(m_c) = l^\infty(\mathbb{R})$ is the space of all bounded functions on $\mathbb{R}$. However for a function to be $\mathcal{B}$-measurable, it must be constant on a co-countable set. In fact, $L^1(\mu)^* = L^1(m_c)^* = L^\infty(m_c) = l^\infty(\mathbb{R})$. One way to see this is that if $\Phi \in L^1(\mu)^*$, we can define $g(r) = \Phi(\chi_{\{r\}})$. Then $|g(r)| \le \|\Phi\|$, so that $g$ is a bounded function; and so $g \in L^\infty(m_c)$. Moreover, if $f \in L^1(\mu)$, then there is a countable (or finite) collection of real numbers $\{r_n : n \ge 1\}$ and scalars $\alpha_n$ so that $f = \sum_{n \ge 1} \alpha_n \chi_{\{r_n\}}$. Moreover $\|f\|_1 = \sum_{n \ge 1} |\alpha_n|$; whence this series converges absolutely. Therefore by continuity,
\[ \Phi(f) = \sum_{n \ge 1} \alpha_n \Phi(\chi_{\{r_n\}}) = \sum_{n \ge 1} \alpha_n g(r_n). \]
Moreover this same computation shows that any element of $L^\infty(m_c)$ determines a distinct continuous functional on $L^1(\mu)$.

CHAPTER 6

Some Topology

This is a brief introduction skewed to certain things needed in the next section.
This is not intended as a comprehensive introduction to point set topology. A good
general reference is Willard’s book [3].

6.1. Topological spaces


6.1.1. DEFINITION. A topology τ on a set X is a collection of subsets such that
(1) ∅, X ∈ τ.
(2) If $\{U_\lambda : \lambda \in \Lambda\} \subset \tau$, then $\bigcup_{\lambda \in \Lambda} U_\lambda \in \tau$.
(3) If $U_1, \dots, U_n \in \tau$, then $\bigcap_{i=1}^n U_i \in \tau$.

The elements U ∈ τ are called open sets.
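
On a finite set the three axioms can be checked mechanically. The sketch below assumes a finite family of subsets; the function name `is_topology` and the sample families are illustrative only.

```python
from itertools import combinations

def is_topology(X, tau):
    """Check the axioms of a topology for a finite family `tau` of subsets of X."""
    X = frozenset(X)
    tau = {frozenset(U) for U in tau}
    # Axiom (1): the empty set and X itself must be open.
    if frozenset() not in tau or X not in tau:
        return False
    # For a finite family, closure under pairwise unions and intersections
    # gives axioms (2) and (3) by induction.
    return all((U | V) in tau and (U & V) in tau for U, V in combinations(tau, 2))

two_points = {0, 1}
sierpinski = [set(), {0}, {0, 1}]          # a topology on {0, 1}
missing_whole_space = [set(), {0}, {1}]    # fails axiom (1)
three_points = {0, 1, 2}
missing_union = [set(), {0}, {1}, {0, 1, 2}]  # {0} | {1} is not in the family
```

For instance, `is_topology(two_points, sierpinski)` holds, while the other two families fail the check.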

6.1.2. EXAMPLES.
(1) If (X, d) is a metric space, then U is open if for every x ∈ U , there is an r > 0
so that the open ball br (x) ⊂ U .
(2) If X is any set, the discrete topology has τd = P(X), the collection of all
subsets of X.
(3) If X is any set, the trivial topology has τ = {∅, X}.
(4) If (X, ≤) is a totally ordered set, the intervals (a, b) = {x ∈ X : a < x < b},
(−∞, b) = {x ∈ X : x < b} and (a, ∞) = {x ∈ X : x > a} are open, and the
topology consists of arbitrary unions of such intervals.
(5) If (X, τ) is a topological space and Y ⊂ X, the induced topology on Y is $\tau|_Y = \{U \cap Y : U \in \tau\}$.

6.1.3. DEFINITION. A set F ⊂ X is closed if $F^c$ is open. If A ⊂ X, the closure of A is $\overline{A} = \bigcap \{F : A \subset F,\ F \text{ closed}\}$. A point in $\overline{A}$ is called a limit point of A.
If A ⊂ X, then a ∈ A is an interior point of A if there exists U ∈ τ with a ∈ U ⊂ A. If A ⊂ X, the interior of A is $A^o$ or $\operatorname{int} A = \bigcup \{U \in \tau : U \subset A\}$.
If x ∈ X, a neighbourhood of x is a set N such that $x \in N^o$.

6.1.4. PROPOSITION.

(1) Finite unions and arbitrary intersections of closed sets are closed.
(2) $\overline{A}$ is the smallest closed set containing A.
(3) $x \in \overline{A}$ if and only if every U ∈ τ with x ∈ U has A ∩ U ≠ ∅.
(4) $\overline{A} = A^{coc}$ is the complement of the interior of $A^c$.

PROOF. Since open sets are closed under arbitrary unions and finite intersections, the collection of closed sets is closed under arbitrary intersections and finite unions. Hence the intersection of all closed sets F ⊃ A is closed, and is thus the smallest closed set containing A. Now $x \in \overline{A}$ if and only if x ∈ F for every closed F ⊃ A if and only if x ∉ U whenever U is open and disjoint from A. Finally
\[ X \setminus \overline{A} = \bigcup \{U \in \tau : U \cap A = \varnothing\} = \bigcup \{U \in \tau : U \subset A^c\} = A^{co}. \] □
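
For a finite topological space, closure and interior can be computed directly from the definitions, and the identity that the closure of A is the complement of the interior of the complement of A can be tested on every subset. The names `interior` and `closure` and the three-point topology below are assumptions chosen for illustration.

```python
from itertools import combinations

def interior(A, tau):
    """Union of all open sets contained in A."""
    A = frozenset(A)
    result = frozenset()
    for U in tau:
        U = frozenset(U)
        if U <= A:
            result |= U
    return result

def closure(A, X, tau):
    """Intersection of all closed sets (complements of open sets) containing A."""
    A, X = frozenset(A), frozenset(X)
    result = X
    for U in tau:
        F = X - frozenset(U)  # a closed set
        if A <= F:
            result &= F
    return result

X = frozenset({0, 1, 2})
tau = [set(), {0}, {0, 1}, {0, 1, 2}]   # an illustrative nested topology

subsets = [frozenset(c) for k in range(4) for c in combinations(sorted(X), k)]
# Closure of A versus complement of the interior of the complement of A.
identity_holds = all(closure(A, X, tau) == X - interior(X - A, tau) for A in subsets)
closure_of_zero = closure({0}, X, tau)  # the only closed set containing 0 is X
closure_of_one = closure({1}, X, tau)   # the closed sets are X, {1, 2}, {2} and the empty set
```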

6.1.5. DEFINITION. If σ and τ are two topologies on X, we say that σ is a weaker topology than τ, and τ is a stronger topology than σ, if σ ⊂ τ.

6.1.6. PROPOSITION. If $\mathcal{S} \subset \mathcal{P}(X)$, then there is a weakest topology τ containing $\mathcal{S}$. It consists of arbitrary unions of sets which are intersections of finitely many elements of $\mathcal{S}$.

PROOF. Clearly if $\tau \supset \mathcal{S}$ is a topology, then it contains all intersections of finitely many elements of $\mathcal{S}$, and arbitrary unions of these sets. The intersection of no sets is X by convention, and ∅ is the union of no sets, so they both belong to τ. This collection is clearly closed under arbitrary unions. To check that it is stable under intersection, observe that if $A_{\alpha,i}$ and $B_{\beta,j}$ are in $\mathcal{S}$, then
\[ \Big( \bigcup_{\alpha \in A} A_{\alpha,1} \cap \dots \cap A_{\alpha,n_\alpha} \Big) \cap \Big( \bigcup_{\beta \in B} B_{\beta,1} \cap \dots \cap B_{\beta,m_\beta} \Big) = \bigcup_{\alpha \in A,\ \beta \in B} A_{\alpha,1} \cap \dots \cap A_{\alpha,n_\alpha} \cap B_{\beta,1} \cap \dots \cap B_{\beta,m_\beta}. \]
Hence this collection is a topology. By construction, this is the weakest topology containing $\mathcal{S}$. □
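
The two-step recipe in this proof (finite intersections first, then arbitrary unions) can be carried out explicitly when X is finite. The following sketch assumes a finite set and a finite family; `generated_topology` and the sample data are hypothetical names introduced here.

```python
from itertools import combinations

def generated_topology(X, S):
    """Weakest topology on a finite set X containing the family S."""
    X = frozenset(X)
    S = [frozenset(s) for s in S]
    # Step 1: all finite intersections; the empty intersection is X by convention.
    base = {X}
    for k in range(1, len(S) + 1):
        for combo in combinations(S, k):
            base.add(frozenset.intersection(*combo))
    # Step 2: all unions of subfamilies of the base; the empty union is the empty set.
    tau = {frozenset()}
    for B in base:
        tau |= {U | B for U in tau}
    return tau

X = {1, 2, 3, 4}
S = [{1, 2}, {2, 3}]
tau = generated_topology(X, S)
expected = {
    frozenset(), frozenset({2}), frozenset({1, 2}), frozenset({2, 3}),
    frozenset({1, 2, 3}), frozenset({1, 2, 3, 4}),
}
```

In this example the base is $\{X, \{1,2\}, \{2,3\}, \{2\}\}$, and taking unions adds only ∅ and $\{1,2,3\}$.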

6.1.7. DEFINITION. Say that $\mathcal{S} \subset \mathcal{P}(X)$ is a base for a topology τ if every open set U ∈ τ is the union of elements of $\mathcal{S}$. Also $\mathcal{S}$ is a subbase for a topology τ if the collection of finite intersections of elements of $\mathcal{S}$ is a base for τ.

6.1.8. EXAMPLES.
(1) If (X, d) is a metric space, then $\{b_{1/n}(x) : x \in X,\ n \ge 1\}$ is a base for the topology.
(2) $\{(r, s) : r < s \in \mathbb{Q}\}$ is a base for the topology of $\mathbb{R}$.
(3) Let C[0, 1] denote the space of continuous functions on [0, 1]. For each x ∈
[0, 1], a ∈ C and r > 0, let U (x, a, r) = {f ∈ C[0, 1] : f (x) ∈ br (a)}. Let τ be the
topology generated by these sets. This is the topology of pointwise convergence.
An open neighbourhood of f must contain a set of the form
{g ∈ C[0, 1] : |g(xi ) − f (xi )| < r for 1 ≤ i ≤ n}
for x1 , . . . , xn ∈ [0, 1] and r > 0.

6.1.9. DEFINITION. A set A is dense in X if $X = \overline{A}$. X is separable if it has a countable dense subset. X is first countable if for each x ∈ X, there is a countable
family {Ui } ⊂ τ with x ∈ Ui which forms a countable base of neighbourhoods
of x; i.e., if x ∈ V is open, then there is some i so that Ui ⊂ V . X is second
countable if there is a countable family of open sets which is a base for τ .

6.1.10. EXAMPLES.
(1) If (X, d) is a metric space and x ∈ X, then $\{b_{1/n}(x) : n \ge 1\}$ is a countable base of neighbourhoods of x. If X is separable, and $\{x_i : i \ge 1\}$ is dense in X, then $\{b_{1/n}(x_i) : i \ge 1,\ n \ge 1\}$ is a base for τ. Indeed, suppose that x ∈ U is open. Pick r > 0 so that $b_r(x) \subset U$, then pick n and $x_i$ so that $d(x, x_i) < 1/n < r/2$. Then $x \in b_{1/n}(x_i) \subset U$. So X is second countable. In particular, compact metric spaces are separable and so second countable.
(2) Consider the discrete topology $\tau_d$ on a set X. Since the topology is generated by $\{\{x\} : x \in X\}$, X is always first countable. However it is second countable if and only if X is countable if and only if X is separable.

6.2. Continuity
6.2.1. DEFINITION. A function f : (X, τ) → (Y, σ) between topological spaces is continuous if for all V ⊂ Y open, the set $f^{-1}(V)$ is open in X. Say that f is a homeomorphism if f is a bijection such that both f and $f^{-1}$ are continuous.
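
For maps between finite topological spaces, the preimage condition can be tested one open set at a time. The sketch below represents a function as a Python dict; `is_continuous` and the three topologies on a two-point set are illustrative assumptions.

```python
def is_continuous(f, tau_X, tau_Y):
    """f (a dict) is continuous iff the preimage of each open set of Y is open in X."""
    opens_X = {frozenset(U) for U in tau_X}
    for V in tau_Y:
        preimage = frozenset(x for x in f if f[x] in V)
        if preimage not in opens_X:
            return False
    return True

identity = {0: 0, 1: 1}
discrete = [set(), {0}, {1}, {0, 1}]
sierpinski = [set(), {0}, {0, 1}]
trivial = [set(), {0, 1}]

results = (
    is_continuous(identity, discrete, sierpinski),  # stronger to weaker topology
    is_continuous(identity, sierpinski, trivial),   # stronger to weaker topology
    is_continuous(identity, sierpinski, discrete),  # weaker to stronger topology
    is_continuous(identity, trivial, sierpinski),   # weaker to stronger topology
)
```

The identity map is continuous exactly when it goes from a stronger topology to a weaker one, so a continuous bijection need not be a homeomorphism.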

6.2.2. EXAMPLES.
(1) The identity maps $(X, \text{discrete}) \xrightarrow{\ \mathrm{id}\ } (X, \tau) \xrightarrow{\ \mathrm{id}\ } (X, \text{trivial})$ are continuous bijections; however in both cases the inverse map will be discontinuous provided that τ satisfies $\{\varnothing, X\} \subsetneq \tau \subsetneq \mathcal{P}(X)$.

(2) A function f : (X, trivial) → R is continuous only if it is constant, while every function f : (X, discrete) → R is continuous. On the other hand, a function f : R → (X, discrete) is continuous only if it is constant, while every function f : R → (X, trivial) is continuous.

(3) f : (−1, 1) → R by $f(x) = \tan\frac{\pi x}{2}$ is a homeomorphism.

(4) Let X = {0, 1}. Let $\tau = \{\varnothing, \{0\}, X\}$. Then {1} is closed, but {0} is not, and $\overline{\{0\}} = X$. If f : X → R is continuous, then f is constant.

(5) Let X = [0, 1) ∪ {a, b}. Let the open sets in τ be U ⊂ [0, 1) which are open in
the usual metric on [0, 1) together with sets U ∪ (r, 1) ∪ {a}, U ∪ (r, 1) ∪ {b} and
U ∪ (r, 1) ∪ {a, b} for r < 1. Here the points {a} and {b} are closed because the
complement is open. However if a ∈ U and b ∈ V are open sets, then U ∩ V ⊃
(r, 1) for some r < 1. That means that you cannot separate a and b from one
another by open sets. If f : X → R is continuous, then f (a) = f (b).

6.2.3. DEFINITION. Let $C^b(X)$ and $C^b_{\mathbb{R}}(X)$ or $C^b(X, \mathbb{R})$ denote the normed vector space of bounded continuous functions from X into $\mathbb{C}$ and $\mathbb{R}$, respectively, with norm $\|f\|_\infty = \sup_X |f(x)|$. Similarly, $C(X)$ and $C_{\mathbb{R}}(X)$ or $C(X, \mathbb{R})$ denote the vector space of continuous functions from X into $\mathbb{C}$ and $\mathbb{R}$, respectively.

6.2.4. DEFINITION. A topological space is Hausdorff if for all x, y ∈ X two distinct points, there are open sets U ∋ x and V ∋ y so that U ∩ V = ∅.

6.2.5. PROPOSITION. If $C^b(X)$ separates points of X, i.e., for x ≠ y in X, there is a continuous function $f \in C^b(X)$ so that f(x) ≠ f(y), then X is Hausdorff.

PROOF. If f(x) = α and f(y) = β and r = |α − β|/2 > 0, then $x \in U = f^{-1}(b_r(\alpha))$ and $y \in V = f^{-1}(b_r(\beta))$. These sets are open by continuity, and U ∩ V = ∅. □

Consider the Examples 6.2.2 (4) and (5) in light of this proposition.
Recall that $f_n \in C^b(X)$ converge uniformly to a function f if $\|f - f_n\|_\infty \to 0$.
The following standard result for metric spaces extends easily.

6.2.6. PROPOSITION. The uniform limit f of a sequence $f_n \in C^b(X)$ is continuous.

PROOF. Let U be open in $\mathbb{C}$ and let $x \in f^{-1}(U)$. Then there is an r > 0 so that $b_r(f(x)) \subset U$. Choose n so large that $\|f - f_n\|_\infty < r/3$. Then $x \in V = f_n^{-1}(b_{r/3}(f_n(x)))$ is open. If y ∈ V, then $|f_n(y) - f_n(x)| < r/3$, so
\[ |f(y) - f(x)| \le |f(y) - f_n(y)| + |f_n(y) - f_n(x)| + |f_n(x) - f(x)| < \|f - f_n\|_\infty + \frac r3 + \|f - f_n\|_\infty < r. \]
Hence $f(y) \in b_r(f(x)) \subset U$. Thus $V \subset f^{-1}(U)$. So every point of $f^{-1}(U)$ has an open neighbourhood contained in $f^{-1}(U)$; hence $f^{-1}(U)$ is open and f is continuous. □

The norm $\|f\|_\infty$ makes $C^b(X)$ into a normed vector space. In view of Proposition 6.2.5, the following is most interesting when X is Hausdorff.

6.2.7. THEOREM. For any topological space X, $C^b(X)$ is complete.

PROOF. Let $(f_n)_{n \ge 1}$ be a Cauchy sequence in $C^b(X)$. If ε > 0, there is an N so that if N ≤ m < n, then $\|f_n - f_m\|_\infty < \varepsilon$. In particular, for x ∈ X, the sequence $(f_n(x))_{n \ge 1}$ is Cauchy in $\mathbb{C}$. So we may define $f(x) = \lim_{n \to \infty} f_n(x)$ pointwise. However for m ≥ N,
\[ |f(x) - f_m(x)| = \lim_{n \to \infty} |f_n(x) - f_m(x)| \le \varepsilon. \]
Hence $\|f - f_m\|_\infty \le \varepsilon$. So convergence is uniform. By Proposition 6.2.6, f is continuous. Also $\|f\|_\infty = \lim_{n \to \infty} \|f_n\|_\infty < \infty$, and so f lies in $C^b(X)$. Therefore $C^b(X)$ is complete. □

6.3. Compactness
6.3.1. DEFINITION. An open cover of a set A ⊂ X is a collection of open sets $\{U_\lambda : \lambda \in \Lambda\}$ such that $A \subset \bigcup_\Lambda U_\lambda$. A set A is compact if every open cover has a finite subcover, i.e., a finite subset $U_{\lambda_1}, \dots, U_{\lambda_n}$ such that $A \subset \bigcup_{i=1}^n U_{\lambda_i}$.

6.3.2. EXAMPLE. In Example 6.2.2(4), the set {0} is compact but not closed.

6.3.3. PROPOSITION. If X is compact and A ⊂ X is closed, then A is compact.
If X is Hausdorff and A ⊂ X is compact, then A is closed. Moreover, if x ∉ A, there are disjoint open sets U ⊃ A and V ∋ x.

PROOF. If $\mathcal{U} = \{U_\lambda : \lambda \in \Lambda\}$ is an open cover of A, then $\mathcal{U} \cup \{A^c\}$ is an open cover of X. By compactness, it has a finite subcover $U_{\lambda_1}, \dots, U_{\lambda_n}, A^c$. Hence $U_{\lambda_1}, \dots, U_{\lambda_n}$ covers A; whence A is compact.
Suppose that X is Hausdorff and A ⊂ X is compact, and let $x \in A^c$. For each a ∈ A, there are open sets $a \in U_a$ and $x \in V_a$ so that $U_a \cap V_a = \varnothing$. Clearly $\{U_a : a \in A\}$ is an open cover of A. By compactness, there is a finite subcover $U_{a_1}, \dots, U_{a_n}$. Let $V = \bigcap_{i=1}^n V_{a_i}$. Then x ∈ V is open, and
\[ V \cap A \subset \bigcup_{i=1}^n V \cap U_{a_i} = \varnothing. \]
Hence $x \notin \overline{A}$. So A is closed. Moreover $A \subset U = \bigcup_{i=1}^n U_{a_i}$, and U ∩ V = ∅. □

6.3.4. DEFINITION. A family $\{A_\lambda : \lambda \in \Lambda\}$ of subsets of X has the finite intersection property (FIP) if whenever $\lambda_1, \dots, \lambda_n$ are finitely many elements of Λ, then $\bigcap_{i=1}^n A_{\lambda_i} \ne \varnothing$.

6.3.5. PROPOSITION. A topological space X is compact if and only if every family $\mathcal{F} = \{A_\lambda : \lambda \in \Lambda\}$ of closed sets with FIP has non-empty intersection $\bigcap \mathcal{F} := \bigcap_\Lambda A_\lambda \ne \varnothing$.

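PROOF. Suppose that X is compact and $\mathcal{F}$ has FIP. Define open sets $U_\lambda = A_\lambda^c$. If $\bigcap \mathcal{F} = \varnothing$, then $\bigcup_\Lambda U_\lambda = \big(\bigcap \mathcal{F}\big)^c = X$. So $\mathcal{U} = \{U_\lambda : \lambda \in \Lambda\}$ is an open cover of X. By compactness, there is a finite subcover $U_{\lambda_1}, \dots, U_{\lambda_n}$. Hence $\bigcap_{i=1}^n A_{\lambda_i} = \big(\bigcup_{i=1}^n U_{\lambda_i}\big)^c = \varnothing$, contradicting FIP. Therefore $\bigcap \mathcal{F} \ne \varnothing$.
Conversely, suppose that every family of closed sets with FIP has non-empty intersection, and that $\mathcal{U} = \{U_\lambda : \lambda \in \Lambda\}$ is an open cover of X. Define closed sets $A_\lambda = U_\lambda^c$. If there is no finite subcover, then $\bigcap_{i=1}^n A_{\lambda_i} = \big(\bigcup_{i=1}^n U_{\lambda_i}\big)^c \ne \varnothing$; and thus $\mathcal{F} = \{A_\lambda : \lambda \in \Lambda\}$ has FIP. But then $\bigcap \mathcal{F} \ne \varnothing$. Therefore $\bigcup_\Lambda U_\lambda = \big(\bigcap \mathcal{F}\big)^c \ne X$, contradicting the fact that $\mathcal{U}$ is an open cover. Hence $\mathcal{F}$ does not have FIP, so there is a finite set $A_{\lambda_1}, \dots, A_{\lambda_n}$ such that $\bigcap_{i=1}^n A_{\lambda_i} = \varnothing$. Hence $\bigcup_{i=1}^n U_{\lambda_i} = \big(\bigcap_{i=1}^n A_{\lambda_i}\big)^c = X$. Therefore X is compact. □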

6.3.6. PROPOSITION. If f : (X, τ) → (Y, σ) is continuous and A ⊂ X is compact, then f(A) is compact.

PROOF. Let $\{V_\lambda : \lambda \in \Lambda\}$ be an open cover of f(A) in Y. Define $U_\lambda = f^{-1}(V_\lambda)$. These are open sets by continuity, and they cover A. Thus there is a finite subcover $U_{\lambda_1}, \dots, U_{\lambda_n}$. Then since $V_\lambda \supset f(U_\lambda)$, it follows that $V_{\lambda_1}, \dots, V_{\lambda_n}$ covers f(A). Hence f(A) is compact. □
The following important consequence follows directly.

6.3.7. EXTREME VALUE THEOREM. If (X, τ) is compact and f ∈ C(X), then |f| attains its maximum. In particular, $\|f\|_\infty < \infty$.

6.3.8. DEFINITION. If $(X_\lambda, \tau_\lambda)$ are topological spaces for λ ∈ Λ, we define the product space to be $X = \prod_\Lambda X_\lambda = \{(x_\lambda) : x_\lambda \in X_\lambda\}$ with the weakest topology τ which makes the coordinate projections $\pi_\lambda : X \to X_\lambda$ by $\pi_\lambda(x) = x_\lambda$ continuous. That is, the sets $\pi_\lambda^{-1}(U) = \prod_{\mu \in \Lambda \setminus \{\lambda\}} X_\mu \times U$ for $U \in \tau_\lambda$ are open and form a subbase for the topology.

6.3.9. REMARKS.
(1) The product topology τ consists of arbitrary unions of finite intersections of the subbase. So if $\lambda_1, \dots, \lambda_n \in \Lambda$ and $U_i \in \tau_{\lambda_i}$, then the sets of the form
\[ U_1 \times \dots \times U_n \times \prod_{\mu \in \Lambda \setminus \{\lambda_i : 1 \le i \le n\}} X_\mu \]
form a base for the topology.

(2) If Λ is finite, this is a familiar construction in the metric space case. Indeed, if $(X_i, d_i)$ are metric spaces for 1 ≤ i ≤ n, then
\[ D\big( (x_1, \dots, x_n), (y_1, \dots, y_n) \big) = \max\{ d_i(x_i, y_i) : 1 \le i \le n \} \]
is a metric on the product, and the metric topology coincides with the product topology.

(3) When Λ is infinite, it often requires the Axiom of Choice to be able to say that X is non-empty.
6.3.10. THEOREM. If $X_i$ are compact for 1 ≤ i ≤ n, then $X = \prod_{i=1}^n X_i$ is compact.

PROOF. It suffices to show that X × Y is compact if both X and Y are compact, as the result follows by finitely many repetitions. Let $\mathcal{W} = \{W_\lambda : \lambda \in \Lambda\}$ be an open cover of X × Y. For each (x, y) ∈ X × Y, there is a λ(x, y) so that $(x, y) \in W_{\lambda(x,y)}$. Hence this set contains a basic open set $W_{\lambda(x,y)} \supset U_{x,y} \times V_{x,y} \ni (x, y)$ for open sets $U_{x,y}$ in X and $V_{x,y}$ in Y. Fix y ∈ Y. The sets $\{U_{x,y} : x \in X\}$ form an open cover of X. Select a finite subcover $U_{x_1^y, y}, \dots, U_{x_{n_y}^y, y}$. Then $U_{x_i^y, y} \times V_{x_i^y, y}$ for $1 \le i \le n_y$ covers X × {y}. Let $V_y = \bigcap_{i=1}^{n_y} V_{x_i^y, y}$. This is open, contains y, and $U_{x_i^y, y} \times V_{x_i^y, y}$ for $1 \le i \le n_y$ covers $X \times V_y$. Now $\{V_y : y \in Y\}$ is an open cover of Y. Let $V_{y_1}, \dots, V_{y_m}$ be a finite subcover. Then the finite collection
\[ \{ U_{x_i^{y_j}, y_j} \times V_{x_i^{y_j}, y_j} : 1 \le i \le n_{y_j},\ 1 \le j \le m \} \]
covers X × Y. Therefore $\{ W_{\lambda(x_i^{y_j}, y_j)} : 1 \le i \le n_{y_j},\ 1 \le j \le m \}$ covers X × Y. □

6.3.11. REMARK. An infinite product of compact spaces is also compact. This is known as Tychonoff’s Theorem. It turns out to be equivalent to the Axiom of Choice. We usually prove it in the functional analysis course.

6.4. Separation Properties


There is a whole hierarchy of separation properties to classify how nice a topo-
logical space is.

6.4.1. DEFINITION. A topological space is $T_0$ if whenever x ≠ y ∈ X, there is an open set containing one of these points, but not the other.
A topological space is $T_1$ if points are closed.
A topological space is $T_2$ if it is Hausdorff.
A topological space is $T_3$ if it is $T_1$ and regular: given a closed set A and a point x ∉ A, there are disjoint open sets U ⊃ A and V ∋ x.
A topological space is $T_{3.5}$ or Tychonoff if it is $T_1$ and completely regular: given a closed set A and a point x ∉ A, there is a continuous function f : X → [0, 1] so that f(x) = 1 and $f|_A = 0$.
A topological space is $T_4$ if it is $T_1$ and normal: given disjoint closed sets A, B, there are disjoint open sets U ⊃ A and V ⊃ B.

6.4.2. REMARKS.
(1) The $T_0$ property is very weak. Example 6.2.2(4) is $T_0$ but not $T_1$.
(2) $T_1$ is also a very weak property. It implies $T_0$ since if x ≠ y, the set $\{x\}^c$ is an open set containing y but not x. To be $T_1$ it is enough to find open sets U ∋ x with y ∉ U and V ∋ y with x ∉ V. (Exercise.) Example 6.2.2(5) is $T_1$ but not Hausdorff. Points are closed in Hausdorff spaces, so $T_2$ implies $T_1$.
(3) The trivial topology is regular because there are no points which are disjoint from a non-empty closed set. But throwing in the $T_1$ condition makes a $T_3$ space Hausdorff because you can take your closed set to be {y}.
(4) Completely regular spaces are regular, because if x ∉ A and A is closed, let f : X → [0, 1] be a continuous function with f(x) = 1 and $f|_A = 0$. Then U = {y : f(y) > 1/2} and V = {y : f(y) < 1/2} are disjoint open sets with x ∈ U and A ⊂ V. Again we need to add the $T_1$ property to exclude examples like the trivial topology.
(5) The $T_1$ property ensures that $T_4$ spaces are Hausdorff. A metric space (X, d) is normal. If A and B are disjoint closed sets, let U = {x : d(x, A) < d(x, B)} and V = {x : d(x, A) > d(x, B)}. It is easy to check that these are disjoint open sets containing A and B respectively.
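
The lowest separation axioms can be tested exhaustively on finite spaces. In the sketch below, $T_1$ is checked through the open-set criterion mentioned in Remark (2) rather than through closed points; the function names and the three topologies on a two-point set are illustrative assumptions.

```python
from itertools import combinations, permutations

def is_T0(X, tau):
    """Some open set contains exactly one of x, y, for each pair x != y."""
    return all(any((x in U) != (y in U) for U in tau) for x, y in combinations(X, 2))

def is_T1(X, tau):
    """For each ordered pair x != y, some open set contains x but not y."""
    return all(any(x in U and y not in U for U in tau) for x, y in permutations(X, 2))

def is_hausdorff(X, tau):
    """Distinct points have disjoint open neighbourhoods."""
    return all(
        any(x in U and y in V and not (set(U) & set(V)) for U in tau for V in tau)
        for x, y in combinations(X, 2)
    )

X = [0, 1]
sierpinski = [set(), {0}, {0, 1}]      # the topology of Example 6.2.2(4)
discrete = [set(), {0}, {1}, {0, 1}]
trivial = [set(), {0, 1}]
```

On this two-point set, the topology of Example 6.2.2(4) passes `is_T0` but fails `is_T1`, the discrete topology passes all three tests, and the trivial topology fails even `is_T0`. A finite $T_1$ space is necessarily discrete, so an example that is $T_1$ but not Hausdorff, such as Example 6.2.2(5), has to be infinite.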

6.4.3. PROPOSITION. Compact Hausdorff spaces are normal.

PROOF. Let A, B be disjoint closed subsets of a compact Hausdorff space X. Then A and B are compact by Proposition 6.3.3. Moreover since X is Hausdorff, that same Proposition shows that for each point x ∈ B, there are disjoint open sets $U_x \supset A$ and $V_x \ni x$. The collection $\{V_x : x \in B\}$ is an open cover of B. Let $V_{x_1}, \dots, V_{x_n}$ be a finite subcover. Set $V = \bigcup_{i=1}^n V_{x_i}$ and $U = \bigcap_{i=1}^n U_{x_i}$. These are disjoint open sets with A ⊂ U and B ⊂ V. □
Now we prove that T4 spaces have lots of continuous functions.

6.4.4. URYSOHN’S LEMMA. Let X be a normal topological space, and let A and B be disjoint closed sets. Then there is a continuous function f : X → [0, 1] such that $f|_A = 0$ and $f|_B = 1$.

PROOF. Normality implies the following property: if A is closed and W is open and A ⊂ W, then there is an open set U such that $A \subset U \subset \overline{U} \subset W$. To see this, take $B = W^c$. Use normality to find disjoint open sets U ⊃ A and V ⊃ B. Then $\overline{U} \subset V^c \subset W$.
Start with $U_1 = B^c$. Find an open $U_{1/2}$ so that $A \subset U_{1/2} \subset \overline{U_{1/2}} \subset U_1$. Repeating this procedure recursively, we find open sets $U_{k/2^n}$ for $1 \le k \le 2^n$ and n ≥ 1 so that
\[ A \subset U_{k/2^n} \subset \overline{U_{k/2^n}} \subset U_{(k+1)/2^n} \quad \text{for } 1 \le k < 2^n. \]
Let $D = \{k/2^n : 1 \le k \le 2^n,\ n \ge 1\}$. Define $f(x) = \inf\{r \in D : x \in U_r\}$ if $x \in U_1$ and $f|_B = 1$. Clearly 0 ≤ f ≤ 1 and $f|_A = 0$.
Claim: f is continuous. Note that
\[ f^{-1}[0, t) = \bigcup_{r < t,\ r \in D} U_r \quad \text{is open for } t \in [0, 1]. \]
Also for 0 ≤ t < 1, since t < r < s for r, s ∈ D implies that $\overline{U_r} \subset U_s$,
\[ f^{-1}[0, t] = \bigcap_{r > t,\ r \in D} f^{-1}[0, r) = \bigcap_{r > t,\ r \in D} U_r = \bigcap_{r > t,\ r \in D} \overline{U_r}. \]
This is closed, and therefore $f^{-1}(t, 1] = \big( \bigcap_{r > t,\ r \in D} \overline{U_r} \big)^c$ is open. Hence, $f^{-1}(s, t)$ is open for s < t, and so f is continuous. □


6.4.5. REMARK. If (X, d) is a metric space and A and B are disjoint closed sets, define
\[ f(x) = \frac{d(x, A)}{d(x, A) + d(x, B)}. \]
This satisfies the conclusion of Urysohn’s Lemma.
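
The formula in this remark is easy to evaluate when X is the real line and A, B are finite (hence closed) disjoint sets. The sketch below assumes exactly that setting; `dist_to_set`, `urysohn` and the sample sets are hypothetical names and data.

```python
def dist_to_set(x, S):
    """Distance from the real number x to a finite non-empty set S of reals."""
    return min(abs(x - s) for s in S)

def urysohn(x, A, B):
    """d(x, A) / (d(x, A) + d(x, B)); the denominator is positive when A and B are disjoint."""
    dA = dist_to_set(x, A)
    dB = dist_to_set(x, B)
    return dA / (dA + dB)

A = [-2.0, -1.0]   # illustrative closed set
B = [1.0, 3.0]     # illustrative closed set disjoint from A

# Sample the function on a grid covering [-4, 4].
samples = [urysohn(-4 + 0.125 * k, A, B) for k in range(65)]
```

The function vanishes exactly on A, equals 1 exactly on B, and takes values in [0, 1] everywhere else.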

6.4.6. COROLLARY. If X is a $T_4$ space, then it is Tychonoff ($T_{3.5}$).

6.4.7. COROLLARY. If X is a compact Hausdorff space, C(X) separates points.

Urysohn’s Lemma implies the following significant strengthening.

6.4.8. TIETZE’S EXTENSION THEOREM. Let X be a normal topological space, and let A ⊂ X be a closed set. If f : A → [a, b] is continuous, there is a continuous function F : X → [a, b] such that $F|_A = f$.

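PROOF. After scaling, we may assume that the range is [−1, 1]. Let $A_1 = f^{-1}([-1, -\frac13])$ and $B_1 = f^{-1}([\frac13, 1])$. By Urysohn’s Lemma, there is a function $g_1 : X \to [-\frac13, \frac13]$ so that $g_1|_{A_1} = -\frac13$ and $g_1|_{B_1} = \frac13$. Then $f_1 = f - g_1|_A$ has range in $[-\frac23, \frac23]$. Repeat the process, setting $A_2 = f_1^{-1}([-\frac23, -\frac29])$ and $B_2 = f_1^{-1}([\frac29, \frac23])$, and finding $g_2 : X \to [-\frac29, \frac29]$ with $g_2|_{A_2} = -\frac29$ and $g_2|_{B_2} = \frac29$. Then $f_2 = f_1 - g_2|_A$ has range in $[-(\frac23)^2, (\frac23)^2]$. Recursively we obtain functions $g_n : X \to [-2^{n-1} 3^{-n}, 2^{n-1} 3^{-n}]$ so that $f_n = f - \sum_{i=1}^n g_i|_A$ has range in $[-(\frac23)^n, (\frac23)^n]$. Let $g = \sum_{n \ge 1} g_n$. The series converges uniformly, so g is continuous by Proposition 6.2.6. Then $g|_A = f$ and
\[ \|g\|_\infty \le \sum_{n \ge 1} \|g_n\|_\infty \le \sum_{n \ge 1} 2^{n-1} 3^{-n} = 1. \]
So F = g is the desired extension. □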
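
The bookkeeping in this proof can be checked with exact rational arithmetic. The sketch below assumes that each correction $g_n$ is bounded by one third of the previous residual bound $(2/3)^{n-1}$, which matches the values 1/3 and 2/9 used for $g_1$ and $g_2$; the function names are invented for illustration.

```python
from fractions import Fraction

def correction_bound(n):
    """Sup-norm bound for g_n: one third of the previous residual bound (2/3)^(n-1)."""
    return Fraction(1, 3) * Fraction(2, 3) ** (n - 1)

def residual_bound(n):
    """Bound (2/3)^n on the range of f_n = f - (g_1 + ... + g_n) restricted to A."""
    return Fraction(2, 3) ** n

# Running totals of the bounds on g_1, ..., g_n.
partial_sums = []
total = Fraction(0)
for n in range(1, 11):
    total += correction_bound(n)
    partial_sums.append(total)
```

Each partial sum equals $1 - (2/3)^n$, so the bounds on the $g_n$ sum to 1 while the residual bound tends to 0.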

6.4.9. DEFINITION. A Hausdorff space X is locally compact (LCH) if every point has a compact neighbourhood; i.e. for x ∈ X, there is a compact set K with x ∈ int K.

6.4.10. PROPOSITION. Locally compact Hausdorff spaces are regular.

PROOF. Let A be a closed set in a LCH space X, and x ∉ A. Let K be a compact neighbourhood of x. Then A ∩ K is compact. So by Proposition 6.3.3, there are disjoint open sets U ∋ x and V ⊃ A ∩ K. Then W = U ∩ int K ⊂ K \ V is an open neighbourhood of x and $\overline{W} \subset K \setminus V \subset A^c$. Hence $\overline{W}^c \supset A$ is open and disjoint from W. Therefore X is regular. □

6.4.11. DEFINITION. If f : X → C, the support of f is
\[ \operatorname{supp}(f) = \overline{\{x : f(x) \ne 0\}}. \]
If X is LCH, $C_c(X)$ denotes the space of continuous $\mathbb{C}$-valued functions with compact support. Let $C_0(X)$ be the closure of $C_c(X)$ in $(C^b(X), \|\cdot\|_\infty)$.

There is a weaker version of Urysohn’s Lemma valid for LCH spaces.

6.4.12. PROPOSITION. Locally compact Hausdorff spaces are completely regular. Moreover, if X is LCH, A ⊂ X is closed, B ⊂ X is compact and A ∩ B = ∅, then there is $f \in C_c(X)$ with 0 ≤ f ≤ 1, $f|_A = 0$ and $f|_B = 1$.

PROOF. Arguing as in the previous proof, for each x ∈ B, there is a compact neighbourhood $\overline{W_x} \ni x$ which is disjoint from A. We can take $W_x = \operatorname{int} \overline{W_x}$. Then $\{W_x : x \in B\}$ is an open cover. Let $W_{x_1}, \dots, W_{x_n}$ be a finite subcover. Then $W = \bigcup_{i=1}^n W_{x_i}$ is open containing B and $\overline{W} = \bigcup_{i=1}^n \overline{W_{x_i}}$ is compact and disjoint from A.
Let $C = \overline{W} \setminus W$. Then B and C are disjoint closed sets in the compact Hausdorff space $\overline{W}$. By Urysohn’s Lemma, there is a continuous function $f : \overline{W} \to [0, 1]$ with $f|_B = 1$ and $f|_C = 0$. Extend f to a function on X by setting $f|_{W^c} = 0$. Now $A \subset W^c$, and thus $f|_A = 0$. To see that f is continuous, note that if a ≥ 0, $f^{-1}(a, b)$ is open in W, and hence open in X. If a < 0 < b, then $D = f^{-1}([b, \infty))$ is closed in $\overline{W}$, and thus $f^{-1}(a, b) = D^c$ is open in X. Since $\operatorname{supp}(f) \subset \overline{W}$, we have $f \in C_c(X)$. □
There is also a weaker version of Tietze’s Theorem valid for LCH spaces.

6.4.13. COROLLARY. Suppose that X is LCH, A ⊂ X is closed, B ⊂ X is compact, A ∩ B = ∅ and g ∈ C(B). Then there is an $f \in C_c(X)$ with $f|_A = 0$ and $f|_B = g$, and $\|f\|_\infty = \|g\|_\infty$.

PROOF. The setup is the same as the previous proof. Define $f|_B = g$ and $f|_C = 0$, and extend this to a continuous function on $\overline{W}$ using Tietze’s Theorem. Then extend f to all of X by setting $f|_{W^c} = 0$. This is continuous with compact support as in the previous proof. □

6.4.14. DEFINITION. If U is open, say f ≺ U for a continuous function f if 0 ≤ f ≤ 1 and supp f ⊂ U. Suppose that $\mathcal{U} = \{U_\lambda : \lambda \in \Lambda\}$ is an open cover of K ⊂ X. Then a partition of unity for K relative to $\mathcal{U}$ is a collection of functions $g_\lambda \prec U_\lambda$ which is locally finite, i.e., for each x ∈ X, $\{\lambda : g_\lambda(x) \ne 0\}$ is finite, and $\sum_\Lambda g_\lambda(x) = 1$ for all x ∈ K.

6.4.15. PROPOSITION. Let X be LCH, and let K ⊂ X be compact. Suppose that $\mathcal{U} = \{U_1, \dots, U_n\}$ is an open cover of K. Then there are $g_i \prec U_i$ in $C_c(X)$ which form a partition of unity for K relative to $\mathcal{U}$.

PROOF. For each x ∈ K, there is some $U_i \ni x$. Pick a compact neighbourhood with $x \in C_x \subset U_i$. Now $\{\operatorname{int} C_x : x \in K\}$ is an open cover with finite subcover $C_{x_1}, \dots, C_{x_m}$. Let $K_i = \bigcup \{C_{x_j} : C_{x_j} \subset U_i\}$. Then $K_i \subset U_i$ and $K \subset \bigcup_{i=1}^n K_i =: C$ is compact. Let W be an open neighbourhood of C with $\overline{W}$ compact. By Urysohn’s lemma, there are functions $h_i$ so that $\chi_{K_i} \le h_i \prec U_i \cap W$. Let $g_i = h_i / \sum_{j=1}^n h_j$. □
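
A concrete instance on the real line may help. The sketch below takes K = [0, 1] with the open cover $U_1 = (-1, 0.75)$, $U_2 = (0.25, 2)$, and uses piecewise linear bump functions in place of the functions supplied by Urysohn's lemma. All names and numerical choices are assumptions for illustration, and the quotient is simply set to 0 where the denominator vanishes.

```python
def trapezoid(x, a, b, c, d):
    """Continuous bump: 0 outside (a, d), 1 on [b, c], linear in between."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

def h1(x):
    """Equals 1 on K_1 = [0, 0.5]; support [-0.5, 0.7] lies inside U_1 = (-1, 0.75)."""
    return trapezoid(x, -0.5, 0.0, 0.5, 0.7)

def h2(x):
    """Equals 1 on K_2 = [0.5, 1]; support [0.3, 1.5] lies inside U_2 = (0.25, 2)."""
    return trapezoid(x, 0.3, 0.5, 1.0, 1.5)

def partition(x):
    """Return (g_1(x), g_2(x)) with g_i = h_i / (h_1 + h_2) where the sum is positive."""
    total = h1(x) + h2(x)
    if total == 0.0:
        return (0.0, 0.0)
    return (h1(x) / total, h2(x) / total)

grid = [i / 100 for i in range(101)]          # sample points of K = [0, 1]
sums = [sum(partition(x)) for x in grid]      # each entry should be 1
```

On K the two functions sum to 1, and each $g_i$ vanishes outside the support of the corresponding $h_i$.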

6.5. Nets*
Sequences are not sufficient for dealing with convergence in general topolog-
ical spaces, including many that arise in normal contexts. The replacement is the
notion of a net, which you can think of as a very wide and very long generalized
sequence. This optional section is not required in the next chapter.
We will need to call on the Axiom of Choice frequently. Recall that this is the assumption that whenever $\{A_\lambda : \lambda \in \Lambda\}$ is a collection of non-empty sets, there is a selection (choice function) $\varphi : \Lambda \to \bigcup_\Lambda A_\lambda$ so that $\varphi(\lambda) \in A_\lambda$ for all λ ∈ Λ.

6.5.1. DEFINITION. A partial order on a set Λ is a relation ≤ satisfying

(1) λ ≤ λ for λ ∈ Λ (reflexive)
(2) λ ≤ µ and µ ≤ λ implies that λ = µ (antisymmetric)
(3) λ ≤ µ and µ ≤ ν implies that λ ≤ ν (transitive).

Then (Λ, ≤) is called a poset. A poset is upward directed if for $\lambda_1, \lambda_2 \in \Lambda$, there is a µ ∈ Λ so that $\lambda_1 \le \mu$ and $\lambda_2 \le \mu$.

6.5.2. DEFINITION. A net in X is an upward directed poset Λ with a function j : Λ → X, say $x_\lambda = j(\lambda)$. We usually write the net as $(x_\lambda)_\Lambda$. A net $(x_\lambda)_\Lambda$ converges to x in (X, τ) if for every open set U ∋ x, there is $\lambda_0 \in \Lambda$ so that $x_\lambda \in U$ for every $\lambda \ge \lambda_0$. We write $\lim_\Lambda x_\lambda = x$.
A subnet (yγ )Γ of (xλ )Λ is given by a cofinal function ϕ : Γ → Λ so that
yγ = xϕ(γ) , where we say that ϕ is cofinal if for all λ ∈ Λ, there is a γ0 ∈ Γ so that
ϕ(γ) ≥ λ for all γ ≥ γ0 . It is convenient if ϕ is monotone, meaning that γ1 ≤ γ2
implies that ϕ(γ1 ) ≤ ϕ(γ2 ). But this is not necessary.

We present a detailed example to explain why nets are needed, and how to use
them.

6.5.3. EXAMPLE. Let $X = \mathbb{N}_0 \times \mathbb{N}_0$. Declare that U ⊂ X is open if (0, 0) ∉ U; and that a set U ∋ (0, 0) is open if $\{m : \pi_1^{-1}(m) \cap U \text{ is cofinite in } \mathbb{N}_0\}$ is cofinite in $\mathbb{N}_0$. It is easy to verify that this defines a topology.
(a) X is Hausdorff because {(m, n)} is open if m + n ≥ 1 and {(m, n)}c is an
open neighbourhood of (0, 0).
(b) $(0, 0) \in \overline{X \setminus \{(0, 0)\}}$ because every open set U ∋ (0, 0) intersects X \ {(0, 0)}.
(c) However no sequence xk = (mk , nk ) in X \ {(0, 0)} converges to (0, 0).
There are two cases. If {mk : k ≥ 1} is bounded, pick m0 so that mk = m0 infin-
itely often. The set U = {(m, n) : m 6= m0 } ∪ {(0, 0)} is an open neighbourhood
of (0, 0), and the sequence is not eventually in U. Otherwise, there is a sequence $k_i \to \infty$ so that $m_{k_i} < m_{k_{i+1}}$ for i ≥ 1. Then $U = X \setminus \{x_{k_i} : i \ge 1\}$ is an open neighbourhood of (0, 0), and the sequence is not eventually in U.
(d) There is a net in X \ {(0, 0)} converging to (0, 0). Let Λ = {U ∈ τ :
(0, 0) ∈ U } where U ≤ V if U ⊃ V . (We say that Λ is ordered by containment.)
This is directed because U, V ≤ U ∩ V . Order X \ {(0, 0)} by

(0, 1), (1, 0), (0, 2), (1, 1), (2, 0), (0, 3), (1, 2), (2, 1), (3, 0), . . . .

Define xU to be the least element in this list which belongs to U . (This avoids any
issues with the Axiom of Choice.) Then $(x_U)$ converges to (0, 0) because given an open neighbourhood U ∋ (0, 0), we have $x_V \in V \subset U$ whenever U ≤ V.
(e) The sequence (0, 1), (1, 0), (0, 2), (1, 1), (2, 0), (0, 3), (1, 2), (2, 1), . . . has
a subnet converging to (0, 0). Let Λ be the net just constructed. Define ϕ(U ) = xU
considered as an element in this sequence. To see that this map is cofinal, let
(m0 , n0 ) be in this sequence, and set N0 = m0 + n0 . Let

U0 = X \ {(m, n) : 1 ≤ m + n ≤ N0 }.

Then if $U_0 \le U$, it follows that $x_U = (m, n)$ with $m + n > N_0$ and thus ϕ(U) follows $(m_0, n_0)$ in the sequence. Therefore this is a subnet of the sequence which converges to (0, 0).

Now we show that nets replace sequences in some familiar results.

6.5.4. PROPOSITION. Let A ⊂ X. Then $x \in \overline{A}$ if and only if there is a net $(a_\lambda)_\Lambda$ in A such that $\lim_\Lambda a_\lambda = x$.

PROOF. Suppose that $x \in \overline{A}$. By Proposition 6.1.4(3), every open neighbourhood U of x intersects A. Let $\mathcal{O}(x)$ be the open neighbourhoods of x ordered by containment. By the Axiom of Choice, we can pick a point $a_U \in A \cap U$ for each $U \in \mathcal{O}(x)$. The net $(a_U)_{\mathcal{O}(x)}$ converges to x by construction.
Conversely, suppose that $(a_\lambda)_\Lambda$ is a net in A such that $\lim_\Lambda a_\lambda = x$. Then for any neighbourhood U of x, there is a λ so that $a_\lambda \in U$. In particular, A ∩ U ≠ ∅. Hence by Proposition 6.1.4(3), $x \in \overline{A}$. □

6.5.5. THEOREM. Let f : (X, τ) → (Y, σ). Then f is continuous if and only if whenever $(x_\lambda)_\Lambda$ is a net in X converging to x, it follows that $f(x) = \lim_\Lambda f(x_\lambda)$.

P ROOF. Suppose that f is continuous, and let (xλ )Λ be a net in X converging


to x. Let V be an open neighbourhood of f (x). Then U = f −1 (V ) is an open
neighbourhood of x. By convergence, there is a λ0 ∈ Λ so that xλ ∈ U for all
λ ≥ λ0 . Hence f (xλ ) ∈ f (U ) ⊂ V for all λ ≥ λ0 . That means that f (x) =
limΛ f (xλ ).
Conversely, suppose f is not continuous. Thus there is an open set V ⊂ Y
such that U = f^{-1}(V ) is not open. Then the closure of U^c meets U , so there is
a point x in U which lies in the closure of U^c . By Proposition 6.5.4, there is a
net (xλ )Λ in U^c with limit x. Therefore f (xλ ) ∈ f (U^c ) ⊂ V^c .
Since V^c is closed, any limit point of this net must remain in V^c by Proposition 6.5.4.
Therefore it cannot converge to f (x) which lies in V . So f (x) ≠ limΛ f (xλ ). ∎

6.5.6. THEOREM. A topological space X is compact if and only if every net in
X has a convergent subnet.

PROOF. Suppose that every net in X has a convergent subnet. Consider a
collection F = {Cα : α ∈ A} of closed sets with the FIP. Let
Λ = {F ⊂ A : F is finite, non-empty}
ordered by inclusion, i.e., F ≤ G if F ⊂ G. This is an upward directed poset:
F1 , F2 ≤ F1 ∪ F2 . For each F ∈ Λ, use the Axiom of Choice to select a point
xF ∈ ⋂_{α∈F} Cα . This is possible since the finite intersection is non-empty. Then
(xF )Λ is a net in X. Let (yγ )Γ be a subnet with limit x, where ϕ : Γ → Λ and
yγ = xϕ(γ) . For any α ∈ A, there is a γα ∈ Γ so that γ ≥ γα implies that
ϕ(γ) ≥ {α}. Hence yγ ∈ Cα for all γ ≥ γα . Since Cα is closed, the limit point
x ∈ Cα . This holds for all α ∈ A. Therefore x ∈ ⋂_{α∈A} Cα . By Proposition 6.3.5, it
follows that X is compact.
Conversely suppose that X is compact. Let (xλ )Λ be a net in X. For each
λ ∈ Λ, define Cλ to be the closure of {xµ : µ ≥ λ}. Then F = {Cλ : λ ∈ Λ} is a
collection of non-empty closed sets. It has FIP because if λ1 , . . . , λn ∈ Λ, the
upward directed property ensures that there is some λ0 ∈ Λ so that λi ≤ λ0 for
1 ≤ i ≤ n. Hence ⋂_{i=1}^n Cλi ⊃ Cλ0 ≠ ∅. By Proposition 6.3.5, there is a point
x ∈ ⋂_{λ∈Λ} Cλ .
Now we build a subnet with limit x. Let O(x) be the set of all open neighbourhoods
of x. Let Γ = Λ × O(x) with order (λ, U ) ≤ (µ, V ) if λ ≤ µ and
U ⊃ V . Let Sλ,U = {µ ∈ Λ : µ ≥ λ and xµ ∈ U }. This set is non-empty
because x ∈ Cλ ∩ U , where Cλ is the closure of {xµ : µ ≥ λ}; and thus by
Proposition 6.5.4, xµ ∈ U for some µ ≥ λ. Use the Axiom of Choice to select
µ = ϕ(λ, U ) ∈ Sλ,U for each (λ, U ) ∈ Γ. The map ϕ : Γ → Λ is cofinal because
if λ0 ∈ Λ, then every (λ, U ) ≥ (λ0 , X) will have ϕ(λ, U ) = µ ≥ λ ≥ λ0 . So
yλ,U = xϕ(λ,U ) defines a subnet (yλ,U )Γ of (xλ )Λ .
Finally we claim that limΓ yλ,U = x. Indeed, let U ∈ O(x) be any open
neighbourhood of x. Fix some λ0 ∈ Λ. Whenever (λ, V ) ≥ (λ0 , U ), we have
yλ,V ∈ V ⊂ U . Thus this net converges to x. ∎
CHAPTER 7

Functionals on Cc(X) and C0(X)

7.1. Radon measures


Let X be a locally compact Hausdorff space. Let Cc (X) denote the space of
continuous functions with compact support (i.e. the closure of {x : f (x) ≠ 0} is
compact). Let C0 (X) be the space of all continuous functions vanishing at infinity,
meaning that {x : |f (x)| ≥ ε} is compact for all ε > 0. Both spaces are endowed
with the sup norm ‖f ‖∞ = sup |f (x)|. It is easy to see that C0 (X) is the closure
of Cc (X) in the norm ‖·‖∞ . When X is
compact, Cc (X) = C0 (X) = C(X).

7.1.1. DEFINITION. A Borel measure µ is outer regular on E ∈ Bor(X) if


µ(E) = inf{µ(U ) : E ⊂ U open}

and inner regular on E if

µ(E) = sup{µ(K) : E ⊃ K compact}.

We call µ a regular measure if it is both inner and outer regular on all Borel sets.
A Borel measure µ is a Radon measure if µ(K) < ∞ for all compact sets K, it is
outer regular on all Borel sets and inner regular on open sets.

7.1.2. REMARKS.
(1) Some books define a Radon measure to be a regular measure which takes finite
values on all compact sets. This is somewhat less general (so we would require an
additional hypothesis such as σ-compactness), and does not simplify the proof of
the main result.

(2) If µ(E) < ∞, then µ is regular on E if and only if for ε > 0, there is an open
set U and a compact set K so that K ⊂ E ⊂ U and µ(U \ K) < ε.

(3) There are compact Hausdorff spaces which support finite Borel measures which
are not regular. See the exercises in [2].

7.1.3. PROPOSITION. If X is LCH, µ is a Radon measure on X and E ∈
Bor(X) is σ-finite, then µ is regular on E.

PROOF. Let E ∈ Bor(X) such that µ(E) < ∞, and let ε > 0. By outer
regularity, there is an open U ⊃ E with µ(U \ E) < ε/2. Then there is an open
set V ⊃ U \ E with µ(V ) < ε/2. Since µ is inner regular on U , there is a compact
L ⊂ U with µ(U \ L) < ε/2. Then K = L \ V is compact and K ⊂ U \ V ⊂ E.
Finally,
µ(E \ K) ≤ µ(U \ K) ≤ µ(U \ L) + µ(V ) < ε.
Thus µ is regular on E.
If E is a σ-finite set, write E as a disjoint union E = ⊔_{i≥1} Ei where µ(Ei ) < ∞.
Find compact sets Kn ⊂ En with µ(En \ Kn ) < 2^{-n} ε. Then the sets
Cp = ⋃_{n=1}^p Kn are compact. Set K = ⋃_{n≥1} Kn . Then K ⊂ E and
µ(E \ K) ≤ Σ_{n≥1} µ(En \ Kn ) < ε. By
continuity from below, we get lim_{p→∞} µ(Cp ) = µ(K) > µ(E) − ε. Therefore µ
is regular on E. ∎
The following immediate corollary covers many cases of interest.

7.1.4. COROLLARY. Every σ-finite Radon measure is regular.


In particular, if X is σ-compact (i.e. a countable union of compact sets) and µ
is a Radon measure on X, then µ is regular.

7.1.5. COROLLARY. If X is a separable, locally compact metric space and µ
is a Radon measure on X, then µ is regular.

PROOF. We show that X is σ-compact. A separable metric space is second
countable (see Example 6.1.10(1)). Indeed, if {xi : i ≥ 1} is a dense subset, then
T = {br (xi ) : i ≥ 1, r ∈ Q+ }
is a countable base for the topology. For each xi , local compactness implies that
there is some ri > 0 so that the closure b̄ri (xi ) of the ball bri (xi ) is compact. Then
T0 = {br (xi ) : i ≥ 1, r ∈ Q+ , r ≤ ri }
is also a base for the topology. Hence

X = ⋃_{T0} br (xi ) = ⋃_{T0} b̄r (xi )

is the union of countably many compact sets. The result now follows from Corol-
lary 7.1.4. 

7.1.6. PROPOSITION. If X is a separable, locally compact metric space and µ
is a finite Borel measure on X, then µ is regular and hence Radon.

PROOF. Let S be the collection of all Borel sets on which µ is regular. The
proof of Corollary 7.1.5 shows that X is σ-compact. Write X = ⋃_{n≥1} Kn where
Kn are compact. If C is closed, then the sets Cn = C ∩ ⋃_{i=1}^n Ki are compact, and
Cn ⊂ Cn+1 ⊂ ⋃_{n≥1} Cn = C. By continuity from below, µ(C) = lim_{n→∞} µ(Cn ).
So µ is inner regular on closed sets.
As X is a metric space, a closed set C is a Gδ (let Un = {x : d(x, C) < 1/n}).
Since µ is finite, continuity from above shows that if C = ⋂_{n≥1} Un and Un ⊃
Un+1 , then µ(C) = lim_{n→∞} µ(Un ). Thus µ is regular on C.
Next we claim S is closed under complements. Let A ∈ S. By Remark 7.1.2(2),
for ε > 0, there is a compact set K and open set U with K ⊂ A ⊂ U and
µ(U \ K) < ε. Therefore U c ⊂ Ac ⊂ K c , and µ(K c \ U c ) = µ(U \ K) < ε. The
set U c is closed but not necessarily compact. Since U c ∈ S, there is a compact set
L ⊂ U c so that µ(U c \ L) < ε − µ(K c \ U c ). Thus µ(K c \ L) < ε. Therefore µ
is regular on Ac .
Now we show that S is closed under countable unions. Let Ai ∈ S for i ≥ 1,
and let A = ⋃_{i≥1} Ai . Given ε > 0, find compact sets Ki and open sets Ui so that
Ki ⊂ Ai ⊂ Ui and µ(Ui \ Ki ) < 2^{-i} ε. Set

Cn = ⋃_{i=1}^n Ki ,   C = ⋃_{i≥1} Ki   and   U = ⋃_{i≥1} Ui .

Then Cn ⊂ A ⊂ U , each Cn is compact, and U is open. Therefore

lim_{n→∞} µ(U \ Cn ) = µ(U \ C) ≤ Σ_{i≥1} µ(Ui \ Ki ) < Σ_{i≥1} 2^{-i} ε = ε.

The first step uses continuity from above, which requires the finiteness of µ.
Hence S is a σ-algebra. As S contains all closed sets, it contains all Borel sets.
So µ is regular and finite, whence Radon. ∎
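
As a small illustration of inner regularity for a finite measure, here is a Python sketch under stated assumptions: the space is the integers with the discrete metric, and the measure µ({n}) = 2^{-|n|} (total mass 3) is a hypothetical example, not one from the notes. The function name `compact_core` is also made up. It searches for the smallest finite (hence compact) set K = {−N, . . . , N } whose complement has measure below ε.

```python
from fractions import Fraction

def mass(n):
    """Weight of the point n under the illustrative measure mu({n}) = 2**(-|n|)."""
    return Fraction(1, 2 ** abs(n))

TOTAL = Fraction(3)  # mu(Z) = 1 + 2 * (1/2 + 1/4 + ...) = 3

def compact_core(eps):
    """Smallest N such that K = {-N, ..., N} has complement of measure < eps.

    Returns the pair (N, measure of the complement of K).
    """
    N, inside = 0, mass(0)
    while TOTAL - inside >= eps:
        N += 1
        inside += mass(N) + mass(-N)
    return N, TOTAL - inside
```

Exact rational arithmetic is used so that the comparison with ε is not affected by rounding.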

7.2. Positive functionals on Cc (X)


7.2.1. DEFINITION. A positive linear functional ϕ on Cc (X) or C0 (X) is a
linear functional such that ϕ(f ) ≥ 0 if f ≥ 0.

7.2.2. PROPOSITION. Let X be a locally compact Hausdorff space; and let µ
be a Borel measure on X which is finite on compact sets.
(1) Φµ (f ) = ∫ f dµ determines a positive linear functional on Cc (X).

(2) Φµ is continuous with respect to the sup norm (i.e. |Φµ (f )| ≤ C‖f ‖∞
for f ∈ Cc (X)) if and only if M = sup{µ(K) : K is compact} < ∞.
Moreover ‖Φµ ‖ = M .
(3) In particular, if µ is Radon, then Φµ is continuous if and only if ‖µ‖ :=
µ(X) < ∞.

PROOF. If f ∈ Cc (X) with ‖f ‖∞ ≤ 1 and supp(f ) ⊂ K, where K is compact, then

|Φµ (f )| = |∫ f dµ| ≤ ∫_K |f | dµ ≤ µ(K).

So this is a linear functional which is norm continuous on the subspace of functions
supported on K. Clearly it takes positive functions to positive values.
By Proposition 6.4.12, given a compact set K, there is a function h ∈ Cc (X)
so that χK ≤ h ≤ 1. Hence ‖Φµ ‖ ≥ Φµ (h) ≥ µ(K). Taking the supremum over
all compact K shows that ‖Φµ ‖ ≥ M = sup{µ(K) : K is compact}. Therefore if
Φµ is continuous with respect to the sup norm, then M < ∞. Conversely, if M is
finite and f ∈ Cc (X) with ‖f ‖∞ ≤ 1, then K = supp f is compact. Thus

|Φµ (f )| ≤ ∫ |f | dµ ≤ ∫_K dµ = µ(K) ≤ M.

Hence ‖Φµ ‖ = M .
If µ is Radon, then it is inner regular on X; so µ(X) = M . Thus Φµ is
continuous if and only if ‖µ‖ := µ(X) < ∞. ∎
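
The estimate |Φµ (f )| ≤ ‖f ‖∞ µ(K) can be checked by hand for a measure made of finitely many point masses. The Python sketch below is only an illustration: the measure `mu`, the function `tent` and the helper `integrate` are hypothetical choices, not objects from the notes.

```python
from fractions import Fraction

# Hypothetical measure on R: point masses mu({x}) at finitely many points.
mu = {0: Fraction(1, 2), 1: Fraction(1, 3), 2: Fraction(1, 6), 5: Fraction(4)}

def integrate(f, measure):
    """Phi_mu(f) = sum of f(x) * mu({x}) over the atoms of the measure."""
    return sum(f(x) * w for x, w in measure.items())

def tent(x):
    """A continuous function with 0 <= tent <= 1 and support [0, 2]."""
    return max(Fraction(0), 1 - abs(Fraction(x) - 1))

# mu(K) for K = supp(tent) = [0, 2]: only the atoms 0, 1, 2 lie in K.
mu_K = sum(w for x, w in mu.items() if 0 <= x <= 2)
phi = integrate(tent, mu)   # only the atom at 1 contributes
```

The large atom at 5 lies outside the support of `tent`, so it does not enter the bound; this is why the constant depends on K rather than on µ(X).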

7.2.3. PROPOSITION. Let X be a LCH space. Suppose that ϕ is a positive
linear functional on Cc (X). If K ⊂ X is compact, then there is a constant CK so
that |ϕ(f )| ≤ CK ‖f ‖∞ for all f ∈ Cc (X) with supp(f ) ⊂ K.
A positive linear functional on C0 (X) is continuous, i.e. there is a C < ∞ so
that |ϕ(f )| ≤ C‖f ‖∞ for all f ∈ C0 (X).

PROOF. As in the previous proof, there is a function h ∈ Cc (X) such that
χK ≤ h ≤ χL for some compact set L. Let f ∈ Cc (X) with supp(f ) ⊂ K
and ‖f ‖∞ ≤ 1. First suppose that f is real valued. Then −h ≤ f ≤ h, so
0 ≤ f + h ≤ 2h. By positivity, 0 ≤ ϕ(f ) + ϕ(h) ≤ 2ϕ(h); whence |ϕ(f )| ≤ ϕ(h).
If f takes arbitrary complex values, let ϕ(f ) = e^{iθ} |ϕ(f )|. Replace f by e^{−iθ} f
so that we may suppose that ϕ(f ) ≥ 0. Write f = Re f + i Im f . Since real
functions are sent to real values, 0 ≤ ϕ(f ) = ϕ(Re f ) + iϕ(Im f ) implies that
ϕ(f ) = ϕ(Re f ). Hence |ϕ(f )| ≤ ϕ(h). So CK = ϕ(h) is the desired constant.
Now suppose that ϕ is defined on C0 (X). Let
M = sup{ϕ(f ) : 0 ≤ f ≤ 1, f ∈ C0 (X)}.
If M = ∞, we can choose fn ∈ C0 (X) with 0 ≤ fn ≤ 1 and ϕ(fn ) > 2^n . Define
f = Σ_{n≥1} 2^{-n} fn . This converges uniformly, and so lies in C0 (X). By positivity,

ϕ(f ) ≥ sup_{p≥1} ϕ(Σ_{n=1}^p 2^{-n} fn ) = sup_{p≥1} Σ_{n=1}^p 2^{-n} ϕ(fn ) ≥ sup_{p≥1} p = ∞.

This is impossible, and thus M < ∞. We now argue as in the first paragraph to
deduce that |ϕ(f )| ≤ ϕ(|f |) ≤ M ‖f ‖∞ for f in C0 (X). ∎

For convenience of notation, given an open set U ⊂ X, we will write f ≺ U if
f ∈ Cc (X), 0 ≤ f ≤ 1 and supp(f ) ⊂ U . Also recall that Cc (X, R) is a lattice,
and we write f ∨ g = max{f, g} and f ∧ g = min{f, g}.

7.2.4. RIESZ-MARKOV THEOREM. Let ϕ be a positive linear functional on
Cc (X) for a LCH space X. Then there is a unique Radon measure µ on X so that
ϕ = Φµ .

PROOF. For U ⊂ X open, define

ρ(U ) = sup{ϕ(f ) : f ≺ U }.
Use this to define an outer measure µ∗ given by

µ∗ (E) = inf{ Σ_{i≥1} ρ(Ui ) : E ⊂ ⋃_{i≥1} Ui , Ui open }.

Let µ̄ denote the complete measure on the σ-algebra of µ∗ -measurable sets.


Claim 1: µ∗ (U ) = ρ(U ) for U open. Since U ⊂ U , we have µ∗ (U ) ≤ ρ(U ).
Suppose that U ⊂ ⋃_{i≥1} Ui for Ui open. Take f ≺ U . Then K = supp(f ) is a
compact subset of U , and hence K ⊂ U1 ∪ · · · ∪ Un for some n. By Proposition 6.4.15,
there is a partition of unity gi ∈ Cc (X) with gi ≺ Ui ∩ U such that
χK ≤ Σ_{i=1}^n gi ≺ U . Then f = Σ_{i=1}^n f gi and f gi ≺ Ui ∩ U . Therefore

ϕ(f ) = Σ_{i=1}^n ϕ(f gi ) ≤ Σ_{i=1}^n ρ(Ui ).

Taking the supremum over f ≺ U , we see that ρ(U ) ≤ Σ_{i=1}^∞ ρ(Ui ).
This means that there is no advantage to using more than one open set to cover
E. That is, if E ⊂ ⋃_{i≥1} Ui = U for Ui open, then ρ(U ) ≤ Σ_{i≥1} ρ(Ui ). Therefore
µ∗ (E) = inf{ρ(U ) : E ⊂ U, U open}.
Claim 2: an open set U is µ∗ -measurable, i.e., µ∗ (E) = µ∗ (E ∩ U ) + µ∗ (E \ U )
for all sets E ⊂ X. Recall that µ∗ (E) ≤ µ∗ (E ∩ U ) + µ∗ (E \ U ) is always valid.
Thus it is enough to prove the reverse inequality when µ∗ (E) < ∞.
First let E be open with µ∗ (E) < ∞. Let ε > 0. Then for some f ≺ E ∩ U ,
µ∗ (E ∩ U ) = ρ(E ∩ U ) < ϕ(f ) + ε.
Let K = supp(f ). Then µ∗ (E \ U ) ≤ µ∗ (E \ K) = ρ(E \ K) < ϕ(g) + ε for
some g ≺ E \ K. Since f, g have disjoint supports, f + g ≺ E. Therefore
µ∗ (E ∩ U ) + µ∗ (E \ U ) < ϕ(f + g) + 2ε ≤ ρ(E) + 2ε.
Letting ε → 0, we get µ∗ (E ∩ U ) + µ∗ (E \ U ) ≤ µ∗ (E), and thus equality holds.
Now let E be arbitrary with µ∗ (E) < ∞. For ε > 0, pick E ⊂ V with V open
so that ρ(V ) < µ∗ (E) + ε. Then
µ∗ (E ∩ U ) + µ∗ (E \ U ) ≤ µ∗ (V ∩ U ) + µ∗ (V \ U ) = µ∗ (V ) < µ∗ (E) + ε.
Let ε → 0 to get the non-trivial inequality. Thus U is µ∗ -measurable.


Therefore µ̄ is defined on the σ-algebra generated by all open sets, namely
Bor(X). Let µ be the Borel measure obtained by restricting µ̄ to Bor(X).
Claim 3: µ is Radon. First µ is outer regular since for E ∈ Bor(X),
µ(E) = µ∗ (E) = inf{µ(U ) : E ⊂ U, U open}.
If K is compact, there is an open set W ⊃ K with L = W̄ compact. By
Urysohn’s Lemma, there is a function f with χK ≤ f ≺ W . Thus
µ(K) ≤ µ(W ) ≤ CL < ∞,
where CL is the constant from Proposition 7.2.3. By outer regularity, given ε > 0
we can choose an open W ⊃ K so that µ(W ) < µ(K) + ε. If f is chosen with
χK ≤ f ≺ W , then ϕ(f ) ≤ µ(W ) < µ(K) + ε.
On the other hand, suppose f ∈ Cc (X) such that χK ≤ f . Let ε > 0, and
set V = {x : f (x) > 1 − ε}. Then K ⊂ V , and V is open. Let g ≺ V so
that ϕ(g) > µ(V ) − ε ≥ µ(K) − ε. Observe that (1 − ε)g ≤ f ; and hence
ϕ(f ) ≥ (1 − ε)(µ(K) − ε). Letting ε → 0 yields ϕ(f ) ≥ µ(K). Combining these
two results shows that
µ(K) = inf{ϕ(f ) : χK ≤ f, f ∈ Cc (X)}.
Now if U is open, we have that
µ(U ) = sup{ϕ(f ) : f ≺ U }.
Suppose that r < µ(U ). Choose g ≺ U with ϕ(g) > r. Set K = supp(g), which
is a compact subset of U . By Urysohn’s Lemma, there is an h ∈ Cc (X) with
χK ≤ h ≺ U . By the previous paragraph, there is an f ∈ Cc (X) with χK ≤ f
and ϕ(f ) < µ(K) + ε. So χK ≤ h ∧ f ≺ U . Therefore
r < ϕ(g) ≤ ϕ(h ∧ f ) ≤ ϕ(f ) < µ(K) + ε.
Hence µ(K) > r − ε. It follows that
µ(U ) = sup{µ(K) : K ⊂ U, K compact}.
Thus µ is inner regular on open sets. Consequently µ is Radon.
Claim 4: ϕ = Φµ . By linearity, it suffices to verify this for 0 ≤ f ≤ 1. Given
N ∈ N, let ti = i/N for 0 ≤ i ≤ N . Define

fi = (f ∨ ti−1 ) ∧ ti − ti−1 and Ki = {x : f (x) ≥ ti } for 1 ≤ i ≤ N,

and K0 = supp(f ). Then (1/N ) χKi ≤ fi ≤ (1/N ) χKi−1 for 1 ≤ i ≤ N and
f = Σ_{i=1}^N fi . Integrating, we obtain

(1/N ) µ(Ki ) ≤ Φµ (fi ) ≤ (1/N ) µ(Ki−1 ).

We also obtain (1/N ) µ(Ki ) ≤ ϕ(fi ). If Ki−1 ⊂ U for U open, then N fi ≺ U and so
ϕ(fi ) ≤ (1/N ) µ(U ). But µ is outer regular, and hence ϕ(fi ) ≤ (1/N ) µ(Ki−1 ). Therefore

(1/N ) µ(Ki ) ≤ ϕ(fi ) ≤ (1/N ) µ(Ki−1 ).

Summing both of these inequalities from 1 to N , we find that both Φµ (f ) and ϕ(f ) lie
between the same bounds:

(1/N ) Σ_{i=1}^N µ(Ki ) ≤ Φµ (f ), ϕ(f ) ≤ (1/N ) Σ_{i=0}^{N−1} µ(Ki ).

Hence

|Φµ (f ) − ϕ(f )| ≤ (µ(K0 ) − µ(KN ))/N ≤ µ(K0 )/N.
Since N was arbitrary, we obtain Φµ (f ) = ϕ(f ).
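
The slicing of f into the layers fi in Claim 4 can be checked pointwise. The following Python sketch is an illustration only (the helper names `layers` and `check` are hypothetical): for a single value f (x) in [0, 1] it computes the numbers fi (x) and verifies that they sum to f (x) and satisfy (1/N ) χKi ≤ fi ≤ (1/N ) χKi−1 .

```python
from fractions import Fraction

def layers(value, N):
    """Return [f_1(x), ..., f_N(x)] for f(x) = value in [0, 1], where
    f_i = (f v t_{i-1}) ^ t_i - t_{i-1} and t_i = i/N."""
    t = [Fraction(i, N) for i in range(N + 1)]
    return [min(max(value, t[i - 1]), t[i]) - t[i - 1] for i in range(1, N + 1)]

def check(value, N):
    """Verify sum f_i = f and (1/N) chi_{K_i} <= f_i <= (1/N) chi_{K_{i-1}} at one point.

    K_0 = supp(f) is replaced by the smaller set {f > 0}, which only makes
    the upper bound for f_1 harder to satisfy.
    """
    fs = layers(value, N)
    ok = sum(fs) == value
    for i, fi in enumerate(fs, start=1):
        in_K_i = value >= Fraction(i, N)                       # K_i = {f >= t_i}
        in_K_prev = value >= Fraction(i - 1, N) if i > 1 else value > 0
        ok = ok and Fraction(int(in_K_i), N) <= fi <= Fraction(int(in_K_prev), N)
    return ok
```

For example, with f (x) = 3/10 and N = 4 the layers are 1/4, 1/20, 0, 0: the first layer is full, the second is partially filled, and the rest vanish.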
Claim 5: uniqueness. Suppose that ν is a Radon measure such that Φν = ϕ.
Then if U is open and f ≺ U , then ϕ(f ) = Φν (f ) ≤ ν(U ). On the other hand,
since ν is inner regular on U , if r < ν(U ), there is a compact K ⊂ U with
ν(K) > r. Take some f ∈ Cc (X) with χK ≤ f ≺ U and observe that ϕ(f ) =
Φν (f ) ≥ ν(K) > r. Therefore
ν(U ) = sup{ϕ(f ) : f ≺ U } = µ(U ).
Both measures are outer regular, and therefore they agree on all Borel sets. 

7.2.5. COROLLARY. Let X be a LCH space. Suppose that ϕ is a positive linear
functional on C0 (X). Then there is a unique finite Radon measure µ on X so that
ϕ = Φµ .

PROOF. The restriction of ϕ to Cc (X) is positive. Thus by the Riesz-Markov
Theorem, there is a unique Radon measure µ on X so that ϕ = Φµ on Cc (X). By
Proposition 7.2.3, ϕ is continuous and thus ϕ = Φµ on C0 (X), and moreover
µ(X) = sup{ϕ(f ) : 0 ≤ f ≤ 1, f ∈ Cc (X)} ≤ C < ∞.
So µ is a finite measure. ∎

7.3. Linear functionals on C0 (X)


To discuss the continuous linear functionals on C0 (X), we need complex Borel
measures. Since they are finite measures, things simplify somewhat. A complex
measure µ is called regular if |µ| is regular.

7.3.1. PROPOSITION. Let X be a LCH space. Let µ be a complex Borel measure
on X. Then Φµ (f ) = ∫ f dµ defines a continuous linear functional on C0 (X)
such that ‖Φµ ‖ ≤ ‖µ‖ := |µ|(X). If µ is regular, then ‖Φµ ‖ = ‖µ‖.

PROOF. If f ∈ C0 (X),

|Φµ (f )| = |∫ f dµ| ≤ ∫ |f | d|µ| ≤ ‖f ‖∞ |µ|(X).

Thus ‖Φµ ‖ ≤ ‖µ‖.


Suppose that µ is regular. By the Radon-Nikodym Theorem, there is a Borel
function h so that |h| = 1 and µ = h|µ|. Given ε > 0, chop the unit circle T into
disjoint arcs I1 , . . . , In of length at most ε and midpoints ζj . Set Ej = h^{-1}(Ij ).
By regularity, find compact Kj ⊂ Ej so that |µ|(Ej \ Kj ) < ε/n. Find disjoint
open sets Uj ⊃ Kj . By Urysohn’s Lemma, there are functions fj ∈ Cc (X) with
χKj ≤ fj ≺ Uj . Let f = Σ_{j=1}^n ζ̄j fj . Then ‖f ‖∞ = 1 and, since ‖µ‖ = ∫ h̄ dµ,

|‖µ‖ − Φµ (f )| = |∫ (h̄ − f ) dµ| ≤ ∫ |h̄ − f | d|µ|
    ≤ Σ_{j=1}^n ∫_{Kj} |h̄ − ζ̄j | d|µ| + Σ_{j=1}^n 2|µ|(Ej \ Kj )
    ≤ (ε/2) ‖µ‖ + 2ε.

Letting ε → 0 yields ‖Φµ ‖ = ‖µ‖. ∎

7.3.2. COROLLARY. If µ, ν are regular complex Borel measures on a LCH
space X, then Φµ = Φν if and only if µ = ν.

7.3.3. DEFINITION. If X is a LCH space, let M (X) be the vector space of all
complex regular Borel measures on X with norm ‖µ‖ = |µ|(X).

7.3.4. RIESZ REPRESENTATION THEOREM. Let X be a LCH space. Then
C0 (X)∗ is isometrically isomorphic to M (X) via the pairing µ → Φµ .

PROOF. Proposition 7.3.1 shows that the map from M (X) into C0 (X)∗ is an
isometric linear map. On the other hand, Corollary 7.2.5 of the Riesz-Markov
Theorem shows that every positive linear functional on C0 (X) is Φµ for a finite
positive regular Borel measure µ. The result will follow if the linear span of the
positive linear functionals on C0 (X) is all of C0 (X)∗ .
We call a linear functional self-adjoint if ϕ(f ) ∈ R for f ∈ C0 (X, R). Let
ϕ ∈ C0 (X)∗ . Define ϕ̃(f ) = \overline{ϕ(f̄ )}. Set Re ϕ = (1/2)(ϕ + ϕ̃) and
Im ϕ = (1/2i)(ϕ − ϕ̃). Then if f ∈ C0 (X, R) is real valued,

(Re ϕ)(f ) = (ϕ(f ) + \overline{ϕ(f )})/2 = Re ϕ(f ) ∈ R.

Similarly Im ϕ takes real values on C0 (X, R). So Re ϕ and Im ϕ are self-adjoint.
Note that these functionals are not real valued on C0 (X). Observe that ϕ = Re ϕ +
i Im ϕ. So C0 (X)∗ is spanned by the self-adjoint linear functionals.
Claim: if ϕ is self-adjoint, there are positive linear functionals ϕ1 , ϕ2 on
C0 (X) such that ϕ = ϕ1 − ϕ2 . This is an analogue of the Jordan decomposition
theorem. If f ≥ 0, define
ϕ1 (f ) = sup{ϕ(g) : 0 ≤ g ≤ f }.
Then 0 ≤ ϕ1 (f ) ≤ sup_{0≤g≤f} ‖ϕ‖ ‖g‖ = ‖ϕ‖ ‖f ‖. Also for t > 0, ϕ1 (tf ) =
tϕ1 (f ).
Additivity: suppose that f1 ≥ 0 and f2 ≥ 0 in C0 (X). If 0 ≤ gi ≤ fi , then
0 ≤ g1 + g2 ≤ f1 + f2 . Hence ϕ1 (f1 + f2 ) ≥ ϕ(g1 ) + ϕ(g2 ). Taking the supremum
over all choices of gi yields ϕ1 (f1 + f2 ) ≥ ϕ1 (f1 ) + ϕ1 (f2 ). On the other hand, if
0 ≤ g ≤ f1 + f2 , let g1 = g ∧ f1 and g2 = g − g1 , so that

g2 (x) = 0 if g(x) ≤ f1 (x), and g2 (x) = g(x) − f1 (x) ≤ f2 (x) if g(x) > f1 (x).

So 0 ≤ g2 ≤ f2 . Thus ϕ(g) = ϕ(g1 ) + ϕ(g2 ) ≤ ϕ1 (f1 ) + ϕ1 (f2 ). Taking the
supremum over 0 ≤ g ≤ f1 + f2 yields the reverse inequality: ϕ1 (f1 + f2 ) ≤
ϕ1 (f1 ) + ϕ1 (f2 ). Therefore ϕ1 (f1 + f2 ) = ϕ1 (f1 ) + ϕ1 (f2 ).
Now we extend ϕ1 to C0 (X). If f ∈ C0 (X), we write f = Re f + i Im f =
f1 − f2 + if3 − if4 where f1 = Re f ∨ 0, f2 = − Re f ∨ 0, f3 = Im f ∨ 0 and
f4 = − Im f ∨ 0. Define ϕ1 (f ) = ϕ1 (f1 ) − ϕ1 (f2 ) + iϕ1 (f3 ) − iϕ1 (f4 ). It is left
to the reader to verify that ϕ1 is linear. Once that is checked, it is clear that ϕ1 is a
continuous positive linear functional. Set ϕ2 = ϕ1 − ϕ. Then if f ≥ 0,
ϕ2 (f ) = sup{ϕ(g) − ϕ(f ) : 0 ≤ g ≤ f } ≥ ϕ(f ) − ϕ(f ) = 0.
So ϕ2 is also a positive linear functional; and ϕ = ϕ1 − ϕ2 .
We have shown that the linear span of the positive linear functionals is all of
C0 (X)∗ . Therefore by the Riesz-Markov theorem for C0 (X), each is given as in-
tegration against a finite Radon measure, hence regular. Thus the whole functional
is given by integration against a regular complex Borel measure. 
CHAPTER 8

A Taste of Probability*

This chapter is purely for interest. I expect to give one lecture on some of this, but
it won’t be tested. This chapter closely follows Folland’s book [2].

8.1. The language of probability


While the basic theorems of probability theory often coincide with the analyst’s
view of measure theory, the vocabulary is completely different. This chart explains
the correspondence for some familiar notions.

Analyst’s term                        Probabilist’s term

measure space (X, B, µ)               sample space (Ω, B, P ) and P (Ω) = 1
σ-algebra                             σ-field
measurable set                        event
real measurable function              random variable X
integral ∫ f dµ                       expectation E(X)
µ({x : f (x) < a})                    P (X < a)
‖f ‖p < ∞                             X has a finite pth moment
almost everywhere (a.e.)              almost surely (a.s.)
Borel probability measure on R        distribution
characteristic function χA            indicator function 1A

In analysis, we say that a sequence fn of measurable functions on (X, B, µ)
converges in measure to f if for all ε > 0,

lim_{n→∞} µ({x : |f (x) − fn (x)| > ε}) = 0.

In probability theory, this is called convergence in probability.

If X is a random variable, E(X) is the expected value or mean. The standard
deviation (which is finite if X ∈ L2 (P )) is

σ(X) = √(E((X − E(X))^2 )) = ‖X − E(X)‖2 .

If ϕ : (Ω, B) → (Ω′, B′) is a measurable map, define Pϕ on (Ω′, B′) by
Pϕ (B) = P (ϕ^{-1}(B)). For example, if X : Ω → R is a random variable, the
distribution function of X is

F (t) = P (X ≤ t) = PX ((−∞, t]).
Moreover PX = dF = µF is the Lebesgue-Stieltjes measure of F . A family
of random variables {Xα }α∈A are identically distributed if PXα = PXβ for all
α, β ∈ A. The joint distribution of (X1 , . . . , Xn ) is P(X1 ,...,Xn ) where we consider
(X1 , . . . , Xn ) : Ω → Rn .
Many properties of random variables can be recovered from their distributions.

8.1.1. PROPOSITION. If ϕ : (Ω, B) → (Ω′, B′) is measurable and f : Ω′ → R
is measurable with respect to Pϕ , then

∫_{Ω′} f dPϕ = ∫_Ω f ◦ ϕ dP.

PROOF. Let f = χA . Then

∫_{Ω′} χA dPϕ = Pϕ (A) = P (ϕ^{-1}(A)) = ∫_Ω χ_{ϕ^{-1}(A)} dP = ∫_Ω χA ◦ ϕ dP.

Thus, by linearity, the result is true for simple functions. It extends to all positive
measurable functions by the MCT. Then it extends to all integrable functions by the LDCT. ∎
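
On a finite sample space the change of variables formula reduces to regrouping a finite sum, which can be checked directly. The Python sketch below is an illustration under assumptions not in the notes: the sample space is two fair dice, ϕ is the sum of the faces, and f (s) = s^2 ; the helper name `pushforward` is hypothetical.

```python
from fractions import Fraction
from collections import defaultdict

# Hypothetical sample space: two fair dice, Omega = {1..6} x {1..6}, uniform P.
P = {(a, b): Fraction(1, 36) for a in range(1, 7) for b in range(1, 7)}

def phi(omega):
    """Measurable map Omega -> Omega' = {2, ..., 12}: the sum of the two dice."""
    return omega[0] + omega[1]

def pushforward(P, phi):
    """P_phi(B) = P(phi^{-1}(B)), stored by its values on singletons."""
    P_phi = defaultdict(Fraction)
    for omega, p in P.items():
        P_phi[phi(omega)] += p
    return dict(P_phi)

def f(s):
    """Any real function on Omega'; here f(s) = s**2."""
    return s * s

P_phi = pushforward(P, phi)
lhs = sum(f(s) * p for s, p in P_phi.items())    # integral of f dP_phi over Omega'
rhs = sum(f(phi(w)) * p for w, p in P.items())   # integral of (f o phi) dP over Omega
```

Both sums use exact fractions, so `lhs` and `rhs` can be compared for equality rather than approximate agreement.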

8.1.2. EXAMPLES.
(1) E(X) = ∫ X dP = ∫_R t dPX .

(2) σ^2(X) = E((X − E(X))^2 ) = ∫_R (t − E(X))^2 dPX .

(3) E(X + Y ) = ∫_{R^2} (s + t) dP(X,Y ) (s, t).

8.2. Independence
8.2.1. DEFINITION. Events {Ai : i ∈ I} are independent if whenever i1 , . . . , ik
in I are distinct, then

P (Ai1 ∩ · · · ∩ Aik ) = ∏_{s=1}^k P (Ais ).

Random variables {Xi : i ∈ I} are independent if for every choice of Borel sets Bi ,
the collection {Xi^{-1}(Bi ) : i ∈ I} is independent. That is, whenever i1 , . . . , ik ∈ I
are distinct, writing Ais = Xis^{-1}(Bis ),

P(Xi1 ,...,Xik ) (∏_{s=1}^k Bis ) = P (⋂_{s=1}^k Xis^{-1}(Bis )) = ∏_{s=1}^k P (Ais ) = ∏_{s=1}^k PXis (Bis ).

In other words, P(Xi1 ,...,Xik ) = ∏_{s=1}^k PXis .

8.2.2. EXAMPLE. It is not sufficient to check independence for pairs of variables.
Let X = {1, 2, 3, 4}, P (i) = 1/4, and A1 = {2, 3}, A2 = {1, 3} and
A3 = {1, 2}. You can check that P (Ai ) = 1/2 and P (Ai ∩ Aj ) = 1/4 for i ≠ j, but
P (A1 ∩ A2 ∩ A3 ) = 0.
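
The check suggested in this example is short enough to carry out mechanically. The following Python sketch is one way to do it; the variable names are hypothetical, while the sets are those of the example.

```python
from fractions import Fraction
from itertools import combinations

OMEGA = {1, 2, 3, 4}
A = {1: {2, 3}, 2: {1, 3}, 3: {1, 2}}   # the events A1, A2, A3 of the example

def prob(event):
    """Uniform probability on OMEGA: each point has mass 1/4."""
    return Fraction(len(event & OMEGA), len(OMEGA))

# Pairwise independence: P(Ai n Aj) = P(Ai) P(Aj) for every pair i != j.
pairwise = all(
    prob(A[i] & A[j]) == prob(A[i]) * prob(A[j]) for i, j in combinations(A, 2)
)
triple = prob(A[1] & A[2] & A[3])                 # P(A1 n A2 n A3)
product = prob(A[1]) * prob(A[2]) * prob(A[3])    # P(A1) P(A2) P(A3)
```

Here `pairwise` holds, while `triple` and `product` differ, so the three events are pairwise independent but not independent.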

8.2.3. PROPOSITION. Let X, Y be independent integrable random variables.
Then E(XY ) = E(X) E(Y ).

PROOF. Compute

E(XY ) = ∫ st dP(X,Y ) (s, t) = ∫∫ st dPX (s) dPY (t)
       = ∫ s dPX (s) ∫ t dPY (t) = E(X) E(Y ). ∎

8.2.4. PROPOSITION. Let {Xi,j : 1 ≤ j ≤ Ji , 1 ≤ i ≤ n} be independent
random variables. If fi : R^{Ji} → R are Borel for 1 ≤ i ≤ n, and
Yi = fi (Xi,1 , . . . , Xi,Ji ), then {Y1 , . . . , Yn } are independent.

PROOF. Let X̄i = (Xi,1 , . . . , Xi,Ji ). Then Yi^{-1}(B) = X̄i^{-1}(fi^{-1}(B)). Thus if
Y = (Y1 , . . . , Yn ) and X = (X̄1 , . . . , X̄n ),

Y^{-1}(B1 × · · · × Bn ) = ⋂_{i=1}^n X̄i^{-1}(fi^{-1}(Bi )) = X^{-1}(∏_{i=1}^n fi^{-1}(Bi )).

Therefore

P (Y^{-1}(B1 × · · · × Bn )) = PX (∏_{i=1}^n fi^{-1}(Bi ))
    = (∏_{i=1}^n ∏_{j=1}^{Ji} PXi,j )(∏_{i=1}^n fi^{-1}(Bi ))
    = ∏_{i=1}^n (∏_{j=1}^{Ji} PXi,j )(fi^{-1}(Bi ))
    = ∏_{i=1}^n PX̄i (fi^{-1}(Bi )) = ∏_{i=1}^n PYi (Bi ).

So {Y1 , . . . , Yn } are independent. ∎

8.2.5. COROLLARY. If {Xi } are independent variables in L2 (P ), then

σ^2(X1 + · · · + Xn ) = Σ_{i=1}^n σ^2(Xi ).

PROOF. Let Yi = Xi − E(Xi ); so E(Yi ) = 0. Then {Yi } are independent, so
E(Yi Yj ) = E(Yi ) E(Yj ) = 0 for i ≠ j. Therefore

σ^2(X1 + · · · + Xn ) = E((Σ_{i=1}^n (Xi − E(Xi )))^2 ) = E((Σ_{i=1}^n Yi )^2 )
    = Σ_{i=1}^n E(Yi^2 ) + Σ_{i≠j} E(Yi ) E(Yj ) = Σ_{i=1}^n σ^2(Xi ). ∎
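
Additivity of the variance can be verified exactly for small finite distributions. The Python sketch below is an illustration only: the fair die and fair coin are assumed examples, and the helpers `expect`, `variance` and `sum_of_independent` are hypothetical names. Independence is built in by giving (X, Y ) the product distribution.

```python
from fractions import Fraction
from itertools import product

def expect(dist, g=lambda t: t):
    """E(g(X)) for a finite distribution given as {value: probability}."""
    return sum(g(t) * p for t, p in dist.items())

def variance(dist):
    """sigma^2(X) = E((X - E(X))^2)."""
    m = expect(dist)
    return expect(dist, lambda t: (t - m) ** 2)

def sum_of_independent(d1, d2):
    """Distribution of X + Y when (X, Y) has the product distribution d1 x d2."""
    out = {}
    for (s, p), (t, q) in product(d1.items(), d2.items()):
        out[s + t] = out.get(s + t, 0) + p * q
    return out

die = {k: Fraction(1, 6) for k in range(1, 7)}
coin = {0: Fraction(1, 2), 1: Fraction(1, 2)}
total = sum_of_independent(die, coin)
```

With these choices `variance(total)` agrees with `variance(die) + variance(coin)`, as the corollary predicts.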

8.3. The Law of Large Numbers


The following easy lemma is surprisingly useful.

8.3.1. LEMMA (Chebychev’s Inequality). If X ≥ 0 is a random variable and
a > 0, then P (X ≥ a) ≤ E(X)/a.

PROOF. E(X) = ∫_0^∞ t dPX (t) ≥ ∫_a^∞ a dPX (t) = aP (X ≥ a). ∎
0 a

8.3.2. COROLLARY. Let X be a random variable such that σ(X) < ∞ (i.e.
X ∈ L2 (P )). Then for b > 0, P (|X − E(X)| ≥ b) ≤ σ^2(X)/b^2 .

PROOF. Let Y = (X − E(X))^2 ; so E(Y ) = σ^2(X). Hence by Chebychev’s
inequality, P (|X − E(X)| ≥ b) = P (Y ≥ b^2 ) ≤ E(Y )/b^2 = σ^2(X)/b^2 . ∎
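
To see how much room the bound of Corollary 8.3.2 leaves, one can compare both sides for a concrete distribution. The Python sketch below assumes a fair six-sided die as the distribution PX ; this choice and the helper names `tail` and `bound` are illustrative, not part of the notes.

```python
from fractions import Fraction

die = {k: Fraction(1, 6) for k in range(1, 7)}   # illustrative distribution P_X

mean = sum(t * p for t, p in die.items())                 # E(X) = 7/2
var = sum((t - mean) ** 2 * p for t, p in die.items())    # sigma^2(X) = 35/12

def tail(b):
    """P(|X - E(X)| >= b) computed directly from the distribution."""
    return sum(p for t, p in die.items() if abs(t - mean) >= b)

def bound(b):
    """Right-hand side sigma^2(X) / b^2 of Corollary 8.3.2."""
    return var / Fraction(b) ** 2
```

For b = 2 the exact tail is 1/3 while the bound is 35/48, so the inequality holds with some slack, as is typical for Chebychev-type estimates.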

8.3.3. WEAK LAW OF LARGE NUMBERS. Let {Xi }i≥1 be independent
L2 random variables. Denote E(Xi ) = µi and σ^2(Xi ) = σi^2 for i ≥ 1. If
lim_{n→∞} (1/n^2 ) Σ_{i=1}^n σi^2 = 0, then for any ε > 0,

lim_{n→∞} P (|(1/n) Σ_{i=1}^n (Xi − µi )| > ε) = 0.

That is, lim_{n→∞} (1/n) Σ_{i=1}^n (Xi − µi ) = 0 in probability.

n
PROOF. Let Sn = (1/n) Σ_{i=1}^n (Xi − µi ); so E(Sn ) = 0 and by Corollary 8.2.5,
σ^2(Sn ) = (1/n^2 ) Σ_{i=1}^n σ^2(Xi − µi ) = (1/n^2 ) Σ_{i=1}^n σi^2 . Hence by Corollary 8.3.2,

P (|Sn | > ε) ≤ σ^2(Sn )/ε^2 → 0.

That is, Sn converges to 0 in probability. ∎
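
For independent fair coins the probability in the Weak Law can be computed exactly from the binomial distribution and compared with the Chebychev bound used in the proof. The Python sketch below is an illustration under that assumption (Xi takes values 0 and 1 with probability 1/2, so µi = 1/2 and σi^2 = 1/4); the function names are hypothetical.

```python
from fractions import Fraction
from math import comb

def deviation_prob(n, eps):
    """Exact P(|(X_1 + ... + X_n)/n - 1/2| > eps) for independent fair 0/1 coins."""
    total = Fraction(0)
    for k in range(n + 1):
        if abs(Fraction(k, n) - Fraction(1, 2)) > eps:
            total += Fraction(comb(n, k), 2 ** n)
    return total

def chebyshev_bound(n, eps):
    """sigma^2(S_n) / eps^2 with sigma_i^2 = 1/4, so sigma^2(S_n) = 1/(4n)."""
    return Fraction(1, 4 * n) / Fraction(eps) ** 2

eps = Fraction(1, 10)
# Rows (n, exact probability, Chebychev bound), converted to floats for display.
table = [(n, float(deviation_prob(n, eps)), float(chebyshev_bound(n, eps)))
         for n in (10, 100, 1000)]
```

The exact probabilities decrease much faster than the bound 1/(4nε^2 ), which only decays like 1/n; the proof needs nothing sharper.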

8.3.4. BOREL-CANTELLI LEMMA. Let {An }n≥1 be events in a probability
space. Define B = lim sup An := ⋂_{k≥1} ⋃_{n≥k} An .
(a) If Σ_{n≥1} P (An ) < ∞, then P (B) = 0.
(b) If {An } are independent and Σ_{n≥1} P (An ) = ∞, then P (B) = 1.

PROOF. (a) For every k ≥ 1, P (B) ≤ P (⋃_{n≥k} An ) ≤ Σ_{n≥k} P (An ) → 0 as k → ∞.

(b) For every k ≥ 1,

1 − P (⋃_{n≥k} An ) = P (⋂_{n≥k} An^c ) = ∏_{n≥k} P (An^c )
    = ∏_{n≥k} (1 − P (An )) ≤ ∏_{n≥k} e^{−P (An )} = e^{−Σ_{n≥k} P (An )} = 0.

Thus P (B) = 1. ∎
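
The product estimate in part (b) can be watched numerically. The Python sketch below assumes independent events with P (An ) = 1/n, a hypothetical choice with divergent sum; the function names are also made up. It computes the finite product ∏ (1 − P (An )) exactly and the exponential bound from 1 − x ≤ e^{−x} in floating point.

```python
from fractions import Fraction
from math import exp

def prob_none_occur(k, m):
    """P(A_k^c n ... n A_m^c) for independent events with P(A_n) = 1/n (assumed)."""
    result = Fraction(1)
    for n in range(k, m + 1):
        result *= 1 - Fraction(1, n)
    return result

def exp_bound(k, m):
    """The upper bound exp(-sum_{n=k}^m P(A_n)) coming from 1 - x <= e^{-x}."""
    return exp(-sum(1.0 / n for n in range(k, m + 1)))
```

For this choice the product telescopes to (k − 1)/m, which tends to 0 as m → ∞ for each fixed k, in line with the conclusion P (B) = 1.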

8.3.5. KOLMOGOROV’S INEQUALITY. Let X1 , . . . , Xn be independent
random variables with E(Xi ) = 0 and σ^2(Xi ) = σi^2 . Then for a > 0,

P (max_{1≤k≤n} |X1 + · · · + Xk | ≥ a) ≤ (1/a^2 ) Σ_{i=1}^n σi^2 .

PROOF. Let Sk = X1 + · · · + Xk . Let A = {max{|S1 |, . . . , |Sn |} ≥ a} and
Ak = {|Sj | < a, 1 ≤ j < k, |Sk | ≥ a}. So A is the disjoint union A = ⊔_{k=1}^n Ak .
Since Xi are independent and mean 0,

Σ_{i=1}^n σi^2 = σ^2(X1 + · · · + Xn ) = E(Sn^2 ) ≥ E(Sn^2 χA ) = Σ_{k=1}^n E(Sn^2 χAk ).

Also note that Sk χAk depends only on X1 , . . . , Xk and Sn − Sk depends only on
Xk+1 , . . . , Xn . Hence these are independent quantities, so that

E((Sk χAk )(Sn − Sk )) = E(Sk χAk ) E(Xk+1 + · · · + Xn ) = 0.

Therefore

E(Sn^2 χAk ) = E((Sk χAk + (Sn − Sk )χAk )^2 )
    = E(Sk^2 χAk ) + 2E((Sk χAk )(Sn − Sk )) + E((Sn − Sk )^2 χAk )
    ≥ E(Sk^2 χAk ) ≥ a^2 P (Ak ).

Consequently,

a^2 P (A) = Σ_{k=1}^n a^2 P (Ak ) ≤ Σ_{k=1}^n E(Sn^2 χAk ) ≤ Σ_{i=1}^n σi^2 .

Thus P (A) ≤ (1/a^2 ) Σ_{i=1}^n σi^2 . ∎

8.3.6. LEMMA. Let {Xi : i ≥ 1} be independent random variables with
E(Xi ) = 0 and σ^2(Xi ) = σi^2 . If Σ_{i≥1} σi^2 < ∞, then P (Σ_{i≥1} Xi converges) = 1.

PROOF. Let Sn = X1 + · · · + Xn . So Sn+k − Sn = Xn+1 + · · · + Xn+k . Given
δ > 0, choose N so that Σ_{i≥N} σi^2 < δ. Then if ε > 0 and n ≥ N , Kolmogorov’s
inequality yields

P (max_{1≤k≤m} |Sn+k − Sn | ≥ ε) ≤ (1/ε^2 ) Σ_{i=n+1}^{n+m} σi^2 .

Letting m → ∞, we get

P (max_{k≥1} |Sn+k − Sn | ≥ ε) ≤ (1/ε^2 ) Σ_{i>n} σi^2 < δ/ε^2 .

Choose δ = 2^{-3j} and ε = 2^{-j} and obtain an integer nj so that

Aj = {max_{k≥1} |Snj +k − Snj | ≥ 2^{-j} }

has P (Aj ) < 2^{-j} . By the Borel-Cantelli Lemma, since Σ_{j≥1} P (Aj ) < ∞, we
have P (⋂_{k≥1} ⋃_{j≥k} Aj ) = 0. Hence P (⋃_{i≥1} ⋂_{j≥i} Aj^c ) = 1. Therefore a.s., x
belongs to ⋂_{j≥i} Aj^c for some i, meaning that max_{k≥1} |Snj +k − Snj | < 2^{-j} for
j ≥ i. This means that Sn (x) is Cauchy and thus converges with probability 1. ∎

8.3.7. LEMMA. If (ai )i≥1 is a sequence of real numbers and Σ_{i≥1} ai /i converges,
then lim_{n→∞} (1/n) Σ_{i=1}^n ai = 0.

PROOF. Let bn = an /n and Sn = Σ_{i=1}^n bi , with S0 = 0. Let L = lim_{n→∞} Sn .
Then L = lim_{n→∞} (1/n) Σ_{i=0}^{n−1} Si . Compute

(1/n) Σ_{i=1}^n ai = (1/n) Σ_{i=1}^n i bi = (1/n) Σ_{i=1}^n Σ_{k=i}^n bk
    = (1/n) Σ_{i=1}^n (Sn − Si−1 ) = Sn − (1/n) Σ_{k=0}^{n−1} Sk → L − L = 0. ∎
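
The summation identity at the heart of this proof holds for every finite sequence, so it can be tested exactly. The Python sketch below is an illustration only: the function name `kronecker_check` and the test sequence ai = (−1)^i (for which Σ ai /i converges) are assumptions, not part of the notes.

```python
from fractions import Fraction

def kronecker_check(a):
    """For a = [a_1, ..., a_n], return both sides of
    (1/n) sum a_i = S_n - (1/n) sum_{k=0}^{n-1} S_k, where S_k = sum_{i<=k} a_i / i."""
    n = len(a)
    S = [Fraction(0)]                              # S_0 = 0
    for i, a_i in enumerate(a, start=1):
        S.append(S[-1] + Fraction(a_i, i))         # S_i = S_{i-1} + b_i
    lhs = Fraction(sum(a), n)
    rhs = S[n] - Fraction(1, n) * sum(S[:n])       # S[:n] = S_0, ..., S_{n-1}
    return lhs, rhs

# Illustrative sequence a_i = (-1)^i, for which sum a_i / i converges.
a = [(-1) ** i for i in range(1, 2001)]
lhs, rhs = kronecker_check(a)
```

The two sides agree exactly for any integer input; the lemma adds that when Sn converges, the right-hand side tends to 0.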

8.3.8. THEOREM (Kolmogorov). Let {Xi : i ≥ 1} be independent random
variables with E(Xi ) = µi and σ^2(Xi ) = σi^2 . If Σ_{i≥1} (1/i^2 ) σi^2 < ∞, then

lim_{n→∞} (1/n) Σ_{i=1}^n (Xi − µi ) = 0 a.s.

PROOF. Let Yi = (1/i)(Xi − µi ). Then E(Yi ) = 0, σ^2(Yi ) = (1/i^2 ) σi^2 and {Yi }
are independent. By hypothesis, Σ_{i≥1} σ^2(Yi ) < ∞. Hence by Lemma 8.3.6,
P (Σ_{i≥1} Yi converges) = 1. Therefore by Lemma 8.3.7, applied pointwise with
ai = Xi − µi ,

P (lim_{n→∞} (1/n) Σ_{i=1}^n (Xi − µi ) = 0) = 1. ∎

8.3.9. STRONG LAW OF LARGE NUMBERS. Let {Xi }i≥1 be independent,
identically distributed random variables.
(a) If Xi are in L1 with E(Xi ) = µ, then

P (lim_{n→∞} (1/n)(X1 + · · · + Xn ) = µ) = 1.

(b) If Xi are not in L1 , then

P (lim sup_{n→∞} (1/n) |X1 + · · · + Xn | = ∞) = 1.

PROOF. (a) By replacing Xn by Xn − µ, we may suppose that E(Xn ) = 0.
Let λ = PXn , which is independent of n. Then ‖Xn ‖1 = E(|Xn |) = ∫_R |t| dλ and
∫_R t dλ = E(Xn ) = 0. Let Yn = Xn χ{|Xn |≤n} and Zn = Xn − Yn . Then

Σ_{n≥1} P (Zn ≠ 0) = Σ_{n≥1} P (|Xn | > n) = Σ_{n≥1} (∫_{−∞}^{−n} + ∫_n^∞ ) dλ
    ≤ ∫_{−∞}^∞ ⌊|t|⌋ dλ ≤ ∫_{−∞}^∞ |t| dλ = ‖X1 ‖1 < ∞.

By the Borel-Cantelli Lemma, P (Zn ≠ 0 infinitely often) = 0. Therefore

P ((1/n) Σ_{i=1}^n Zi → 0) = 1.

Now σ^2(Yn ) ≤ E(Yn^2 ) = ∫_{−n}^n t^2 dλ. Hence by the MCT,

Σ_{n≥1} (1/n^2 ) σ^2(Yn ) ≤ Σ_{n≥1} ∫_{−n}^n (t^2 /n^2 ) dλ = ∫_{−∞}^∞ (Σ_{n≥1} (1/n^2 ) χ[−n,n] ) t^2 dλ;

for n − 1 < |t| ≤ n, we have Σ_{k≥1} (1/k^2 ) χ[−k,k] (t) = Σ_{k≥n} 1/k^2 < 2/n, and thus

Σ_{n≥1} (1/n^2 ) σ^2(Yn ) ≤ ∫_{−∞}^∞ 2|t| dλ < ∞.

Therefore by Kolmogorov’s Theorem, (1/n) Σ_{i=1}^n (Yi − E(Yi )) → 0 a.s. Since
E(Yn ) = ∫_{−n}^n t dλ → ∫_R t dλ = 0 by the LDCT, the averages (1/n) Σ_{i=1}^n E(Yi )
also tend to 0, and so P (lim_{n→∞} (1/n) Σ_{i=1}^n Yi = 0) = 1. Consequently, since
Xi = Yi + Zi and both limits hold almost surely,

P (lim_{n→∞} (1/n) Σ_{i=1}^n Xi = 0) = 1.

(b) In this case, for any C > 0,

Σ_{n≥1} P (|Xn | ≥ Cn) = Σ_{n≥1} ∫_R χ{|t|≥Cn} dλ ≥ ∫_R (|t|/C − 1) dλ = +∞.

By the Borel-Cantelli Lemma, P (|Xn | ≥ Cn infinitely often) = 1. But then

max{ |X1 + · · · + Xn−1 |/(n − 1), |X1 + · · · + Xn |/n } ≥ C/2 infinitely often a.s.

Therefore lim sup_{n→∞} (1/n) |X1 + · · · + Xn | = ∞ a.s. ∎
BIBLIOGRAPHY
[1] S. Friedberg, A. Insel and L. Spence, Linear Algebra, 5th ed., Prentice Hall, Englewood
Cliffs, NJ, 2018.
[2] G.B. Folland, Real Analysis: Modern Techniques and Their Applications, 2nd ed., John Wiley &
Sons, New York, 1999.
[3] S. Willard, General Topology, Addison-Wesley Pub. Co, Reading, 1970.

I NDEX

Fσ set, 3 dense, 76
Gδ set, 3 differentiable, 49
Lp (µ), 64 discrete topology, 74
M (X), 95 distribution function of X, 98
T0 , 81 dual of Lp (µ), 70
T1 , 81
T2 , 81 Egorov’s Theorem, 19
T3 , 81 expectation, 97
T3.5 , 81 expected value, 97
µ∗ -measurable, 6 Extreme Value Theorem, 79
σ-algebra, 3
Fatou’s Lemma, 25
σ-finite, 3
finite, 3
absolute value, 59 finite intersection property, 79
absolutely continuous, 53, 60 first countable, 76
algebra of subsets, 3 Fubini’s Theorem, 36
almost everywhere, 5 Fubini-Tonelli Theorem without
almost surely, 97 Completeness, 40
atom, 13
Hölder’s Inequality, 69
atomic measure, 14
Hahn Decomposition Theorem, 58
base, 75 Hausdorff, 77
Borel sets, 3 homeomorphism, 76
Borel-Cantelli Lemma, 101
identically distributed, 98
bounded variation, 52
independent, 98
Cantor ternary function, 52 indicator function, 97
Carathéodory’s Theorem, 7 induced topology, 74
closed, 74 inner regular, 88
closure, 74 integrable, 26
cofinal, 85 integral, 21
compact, 78 interior, 74
complete, 4 interior point, 74
completely regular, 81
joint distribution, 98
completion, 5
Jordan Decomposition Theorem, 59
complex measure, 62
continuity from above, 4 Kolmogorov’s Inequality, 101
continuity from below, 4 Kolmogorov’s Theorem, 103
continuous, 76
converge uniformly, 77 Lebesgue Decomposition Theorem, 62
convergence in probability, 97 Lebesgue Differentiation Theorem, 55
converges in measure, 97 Lebesgue Dominated Convergence Theorem,
coordinate projections, 34 27
countable base of neighbourhoods of x, 76 limit point, 74
countably additive, 3 linear functionals, 68
Counting measure, 4 locally compact, 83


lower derivative, 49 subadditivity, 4


Lusin’s Theorem, 20 subbase, 75
subnet, 85
measurable, 16 support, 83
measure, 3
Minkowski’s inequality, 64 Tietze’s Extension Theorem, 83
Monotone Convergence Theorem, 24 Tonelli’s Theorem, 37
monotonicity, 3 topology, 74
mutually singular, 59 trivial topology, 74
Tychonoff, 81
neighbourhood, 75
net, 85 upper derivative, 49
normal, 81 upward directed, 85
null set, 58 Urysohn’s Lemma, 82

open cover, 78 vanishing at infinity, 88


open sets, 74 Vitali cover, 49
oscillation, 32 Vitali Covering Lemma, 49
outer measure, 6
outer regular, 88 Weak Law of Large Numbers, 101
weaker topology, 75
partial order, 85
partition of unity, 84
point mass, 4
poset, 85
positive linear functional, 90
positive set, 58
premeasure, 8
probability measure, 3
product σ-algebra, 34
product measure, 36
product space, 34, 79

Radon measure, 88
Radon-Nikodym Theorem, 60
random variable, 97
regular, 81, 94
regular measure, 88
Riemann integrable, 30
Riesz Representation Theorem, 95
Riesz-Fischer Theorem, 65
Riesz-Markov Theorem, 92

second countable, 76
self-adjoint, 95
semi-finite, 3
separable, 76
separation properties, 81
signed measure, 57
simple function, 18
standard deviation, 97
Strong Law of Large Numbers, 103
stronger topology, 75
