Measure Theory
Contents
Chapter 2. Functions
2.1. Measurable Functions
2.2. Simple Functions
2.3. Two Theorems about Measurable Functions
Chapter 3. Integration
3.1. Integrating positive functions
3.2. Two limit theorems
3.3. Integrating complex valued functions
3.4. The space L¹(µ)
3.5. Comparison with the Riemann Integral
3.6. Product Measures
3.7. Product Integration
3.8. Lebesgue Measure on Rⁿ
3.9. Infinite Product Measures
Chapter 5. Lᵖ spaces
5.1. Lᵖ as a Banach space
5.2. Duality for Normed Vector Spaces
5.3. Duality for Lᵖ
Bibliography
Index
CHAPTER 1
Measures
1.1. Introduction
The idea of Riemann integration is the following: Convergence of the upper
and lower sums to the same limit is required for Riemann integrability. The integral
|A| is meant to convey the length of a set A. The complication is that, even for
continuous functions, the sets {x : y_{i−1} < f(x) ≤ y_i} are in general neither open nor closed,
and can be complicated. Each such set is a Borel set, and that will be seen to be good enough.
Lebesgue’s idea was to extend the notion of length or measure from open intervals
to a function m on a much larger class of sets so that
(1) m((a, b)) = b − a
(2) m(A + x) = m(A) for sets A and x ∈ R (translation invariance)
(3) if A is the disjoint union $A = \dot{\bigcup}_{n=1}^{\infty} A_n$, then $m(A) = \sum_{n=1}^{\infty} m(A_n)$
(countable additivity).
It turns out that it is not possible to define such a function on all subsets of R.
To see this, we define an equivalence relation on [0, 1) by x ∼ y if x − y ∈ Q. Then
use the Axiom of Choice to select a set A which contains exactly one element from
each equivalence class. Enumerate Q ∩ [0, 1) = {0 = r0 , r1 , r2 , . . . } and define
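\[ A_n = A + r_n \pmod 1 = \big( (A + r_n) \cup (A + r_n - 1) \big) \cap [0, 1) \quad \text{for } n \ge 0. \]
\begin{align*}
m(A) &= m(A + r_n) \\
     &= m\big( (A + r_n) \cap [0, 1) \big) + m\big( (A + r_n) \cap [1, 2) \big) \\
     &= m\big( (A + r_n) \cap [0, 1) \big) + m\big( (A + r_n - 1) \cap [0, 1) \big) \\
     &= m(A_n).
\end{align*}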
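Observe that Am ∩ An = ∅ if m ≠ n. By construction, $\bigcup_{n \ge 0} A_n = [0, 1)$. Hence
by countable additivity,
\[ 1 = m([0, 1)) = \sum_{n=0}^{\infty} m(A_n) = \sum_{n=0}^{\infty} m(A). \]
This is impossible: the sum on the right is 0 if m(A) = 0 and ∞ if m(A) > 0.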
1.2. σ-Algebras
1.2.1. DEFINITION. If X is a set, an algebra of subsets is a non-empty collection
of subsets of X, i.e., A ⊂ P(X), which contains the empty set, ∅, and is
closed under complements and finite unions. A σ-algebra is an algebra B of subsets
of X which is closed under countable unions. We say that (X, B) is a measurable
space.
1.2.3. DEFINITION. If X is a topological space, the Borel sets are the elements
of the σ-algebra BorX or Bor(X) generated by the collection of open subsets of X.
A Gδ set is a countable intersection of open sets. An Fσ set is a countable
union of closed sets.
then
\[ \bigcup_{i \ge 1} (E_i \cup F_i) = \bigcup_{i \ge 1} E_i \cup \bigcup_{i \ge 1} F_i = E \cup F \]
where $F = \bigcup_{i \ge 1} F_i \subset \bigcup_{i \ge 1} N_i =: N$, N ∈ B and µ(N) = 0 by subadditivity. So
B̄ is a σ-algebra. Moreover this is clearly the smallest σ-algebra containing B and
all subsets of null sets.
Next we show that µ̄ is well-defined. Suppose that A = E1 ∪ F1 = E2 ∪ F2
where Ei ∈ B, Fi ⊂ Ni and µ(Ni ) = 0. Let E = E1 ∩ E2 ∈ B. Then
E ⊂ Ei ⊂ A ⊂ (E1 ∪ N1 ) ∩ (E2 ∪ N2 ) ⊂ (E1 ∩ E2 ) ∪ N1 ∪ N2 .
Therefore µ(E) ≤ µ(Ei ) ≤ µ(E) + µ(N1 ) + µ(N2 ) = µ(E). So µ(E1 ) = µ(E2 ).
Thus µ̄(A) = µ(Ei ) is well-defined. In particular, if A ∈ B, then µ̄(A) = µ(A).
So µ̄|B = µ.
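To see that µ̄ is countably additive, suppose that Ai = Ei ∪ Fi are disjoint, and
Fi ⊂ Ni where Ni are null sets. Then $A := \dot{\bigcup}_{i \ge 1} A_i = \dot{\bigcup}_{i \ge 1} E_i \cup \dot{\bigcup}_{i \ge 1} F_i$.
Moreover $F = \bigcup_{i \ge 1} F_i \subset \bigcup_{i \ge 1} N_i =: N$, and µ(N) = 0 by subadditivity. So
\[ \bar\mu(A) = \mu\Big( \dot{\bigcup}_{i \ge 1} E_i \Big) = \sum_{i \ge 1} \mu(E_i) = \sum_{i \ge 1} \bar\mu(A_i). \]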
Thus µ̄ is a measure.
If E ∈ B, then µ̄(E) = µ(E), so µ̄|B = µ; i.e. µ̄ extends the definition of µ.
To see that µ̄ is complete, suppose that µ̄(M) = 0 and G ⊂ M. Then M = E ∪ F
where E, N ∈ B, F ⊂ N and µ(N) = 0. Also µ(E) = µ̄(M) = 0. Thus
G ⊂ M ⊂ E ∪ N, and µ(E ∪ N) = 0. Hence G ∈ B̄. Thus µ̄ is a complete
measure.
Clearly B̄ is the smallest σ-algebra containing B and all subsets of null sets.
Moreover it is clear that the only way to extend the definition of µ to B̄ and be a
measure is to set µ̄(F) = 0 when F ⊂ N and µ(N) = 0. Thus, this is the unique
smallest complete measure extending µ.
1.2.9. DEFINITION. If µ is a measure on (X, B), then (X, B̄, µ̄) is called the
completion of µ.
(3) (sub-additivity) if {Ai : i ≥ 1} is a countable collection of sets, then
$\mu^*\big( \bigcup_{i \ge 1} A_i \big) \le \sum_{i \ge 1} \mu^*(A_i)$.
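sets Eij ∈ E for j ≥ 1 so that $A_i \subset \bigcup_{j \ge 1} E_{ij}$ and $\sum_{j \ge 1} \mu(E_{ij}) < \mu^*(A_i) + 2^{-i}\varepsilon$.
Then $A = \bigcup_{i \ge 1} A_i$ is covered by $\bigcup_{i \ge 1} \bigcup_{j \ge 1} E_{ij}$ and so
\[ \mu^*(A) \le \sum_{i \ge 1} \sum_{j \ge 1} \mu(E_{ij}) < \sum_{i \ge 1} \big( \mu^*(A_i) + 2^{-i}\varepsilon \big) = \sum_{i \ge 1} \mu^*(A_i) + \varepsilon. \]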
1.3.3. EXAMPLE. Let X = R and let E denote the collection of all bounded
open intervals. Define ρ((a, b)) = b − a. This determines an outer measure which
will be used to construct Lebesgue measure.
µ∗ (E) = µ∗ (E ∩ A) + µ∗ (E ∩ Ac )
= µ∗ (E ∩ A) + µ∗ (E ∩ Ac ∩ B) + µ∗ (E ∩ Ac ∩ B c )
≥ µ∗ (E ∩ (A ∪ B)) + µ∗ (E ∩ (A ∪ B)c ).
The first line follows since A is µ∗ -measurable. The second line follows since
B is µ∗ -measurable. The last line follows from subadditivity of µ∗ . This is the
non-trivial direction, so this is an equality. Thus A ∪ B is µ∗ -measurable.
Combining these two observations shows that B is an algebra of sets, and thus
is closed under finite unions and intersections.
Next suppose that A and B are disjoint. Then
\begin{align*}
\mu^*(E \cap (A \,\dot\cup\, B)) &= \mu^*(E \cap (A \,\dot\cup\, B) \cap A) + \mu^*(E \cap (A \,\dot\cup\, B) \cap A^c) \\
                                 &= \mu^*(E \cap A) + \mu^*(E \cap B).
\end{align*}
know that Bn, A′n ∈ B. Take any E ⊂ X. Using the additivity from the previous
paragraph,
\begin{align*}
\mu^*(E) &= \mu^*(E \cap B_n) + \mu^*(E \cap B_n^c) \\
         &\ge \mu^*(E \cap B_n) + \mu^*(E \cap B^c) \\
         &= \sum_{i=1}^{n} \mu^*(E \cap A_i') + \mu^*(E \cap B^c).
\end{align*}
≥ µ∗ (E)
The last two lines use subadditivity. When µ∗ (E) < ∞, we obtain equality, and
thus B is µ∗ -measurable. Therefore B is a σ-algebra.
Next suppose that the Ai are pairwise disjoint. So A′i = Ai in the previous
paragraph. Taking E = B, we get that
\[ \mu(B) \ge \sum_{i \ge 1} \mu(A_i) \ge \mu(B). \]
1.4. Premeasures
1.4.1. DEFINITION. A premeasure is a function µ : A → [0, ∞] defined on
an algebra A ⊂ P(X) of sets such that µ(∅) = 0 and whenever Ai ∈ A are
pairwise disjoint and $A = \dot{\bigcup}_{i \ge 1} A_i \in \mathcal{A}$, then $\mu(A) = \sum_{i \ge 1} \mu(A_i)$. In particular,
Suppose that A ∈ A and Ai ∈ A such that $A \subset \bigcup_{i \ge 1} A_i$. Define
\[ B_i = A \cap \Big( A_i \setminus \bigcup_{j=1}^{i-1} A_j \Big) \quad \text{for } i \ge 1, \]
Therefore
\[ \mu^*(A) = \inf \Big\{ \sum_{i \ge 1} \mu(A_i) : A \subset \bigcup_{i \ge 1} A_i \Big\} = \mu(A). \]
Here is further detail about the outer measure construction which explains when
the extension is unique.
Now take the inf over all such covers of E to obtain ν(E) ≤ µ̄(E).
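If µ̄(E) < ∞ and ε > 0, we can choose the Ai so that $\sum_{i \ge 1} \mu(A_i) < \bar\mu(E) + \varepsilon$.
Let $B_i = A_i \setminus \bigcup_{j=1}^{i-1} A_j$, which are in A; and $A = \bigcup_{i \ge 1} A_i = \dot{\bigcup}_{i \ge 1} B_i$. Then
\[ \nu(A) = \sum_{i \ge 1} \nu(B_i) = \sum_{i \ge 1} \mu(B_i) = \bar\mu(A). \]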
Therefore
ν(E) + ν(A \ E) = ν(A) = µ̄(A) = µ̄(E) + µ̄(A \ E) < µ̄(E) + ε.
So ν(A \ E) ≤ µ̄(A \ E) < ε. Whence ν(E) ≥ µ̄(A) − ε ≥ µ̄(E) − 2ε. Let ε → 0
to get ν(E) = µ̄(E).
Now if $E = \bigcup_{n \ge 1} E_n$ where µ̄(En) < ∞, we may replace En with $\bigcup_{i=1}^{n} E_i$
so that En ⊂ En+1. Then by continuity from below,
\[ \nu(E) = \lim_{n \to \infty} \nu(E_n) = \lim_{n \to \infty} \bar\mu(E_n) = \bar\mu(E). \]
Finally if µ is σ-finite, then every set E ∈ C is σ-finite. Thus µ̄|C is the unique
extension of µ.
This uses the fact that µ([0, x1]) < ∞. Similarly if x < 0 and 0 > xn ↓ x, then the
continuity from below property shows that
\[ F(x) = -\mu\Big( \bigcup_{n \ge 1} (x_n, 0) \Big) = \lim_{n \to \infty} -\mu((x_n, 0)) = \lim_{n \to \infty} F(x_n). \]
\[ \mu_F\Big( \dot{\bigcup}_{i=1}^{n} (a_i, b_i] \Big) = \sum_{i=1}^{n} F(b_i) - F(a_i) \]
PROOF. First let’s show that µF is well defined. To this end, suppose that
$I = (a, b] = \dot{\bigcup}_{i=1}^{n} (a_i, b_i]$. Then after rearranging if necessary, we may suppose that
a = a1 < b1 = a2 < · · · < bn−1 = an < bn = b. Then if we set an+1 := bn = b,
\[ \sum_{i=1}^{n} F(b_i) - F(a_i) = \sum_{i=1}^{n} F(a_{i+1}) - F(a_i) = F(b) - F(a). \]
Thus µF (I) does not depend on the decomposition into finitely many pieces. This
readily extends to a finite union of intervals.
Now we consider the restricted version of countable additivity. Suppose that
$I = (a, b] = \dot{\bigcup}_{i \ge 1} (a_i, b_i]$. This is more difficult to deal with because these intervals
cannot be ordered, end to end, as in the case of a finite union. One direction is easy:
since $I = \dot{\bigcup}_{i=1}^{n} (a_i, b_i] \,\dot\cup\, \big( I \setminus \bigcup_{i=1}^{n} (a_i, b_i] \big)$ and the last set belongs to A,
\[ \mu_F(I) = \sum_{i=1}^{n} \mu_F((a_i, b_i]) + \mu_F\Big( I \setminus \bigcup_{i=1}^{n} (a_i, b_i] \Big) \ge \sum_{i=1}^{n} \mu_F((a_i, b_i]). \]
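Therefore $\sum_{j=1}^{p} F(b_{i_j} + \delta_{i_j}) - F(a_{i_j}) \ge F(b) - F(a + \delta)$. Consequently,
\begin{align*}
\sum_{i \ge 1} \mu_F((a_i, b_i]) &= \sum_{i \ge 1} F(b_i) - F(a_i) \\
  &\ge \sum_{j=1}^{p} F(b_{i_j}) - F(a_{i_j}) \\
  &\ge \sum_{j=1}^{p} F(b_{i_j} + \delta_{i_j}) - 2^{-i_j}\varepsilon - F(a_{i_j}) \\
  &\ge F(b) - F(a + \delta) - \varepsilon \\
  &\ge F(b) - F(a) - 2\varepsilon.
\end{align*}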
Now let ε ↓ 0 to get $\sum_{i \ge 1} \mu_F((a_i, b_i]) \ge F(b) - F(a)$; whence we have equality.
Now if b = ∞, we still have $\sum_{i \ge 1} \mu_F((a_i, b_i]) \ge \mu_F((a, N]) = F(N) - F(a)$ for
N ∈ N. Letting N → ∞, we get $\sum_{i \ge 1} \mu_F((a_i, b_i]) \ge F(\infty) - F(a)$. Similarly
we can handle a = −∞.
Finally if $A = \dot{\bigcup}_{k=1}^{m} (a_k, b_k]$ is written as a disjoint union of half open intervals,
we can split the union into m pieces and use the argument for a single interval on
each one. Thus we obtain countable additivity (provided the union remains in A).
So µF is a premeasure.
PROOF. The set of open intervals is invariant under translation, and hence so
is BorR. The measure ms(E) = m(E + s) agrees with m on open intervals. By
Theorem 1.5.2, the measures are determined by the functions
\[ F(x) = \begin{cases} m((0, x]) & x \ge 0 \\ -m((x, 0]) & x < 0 \end{cases}
   \quad \text{and} \quad
   G(x) = \begin{cases} m_s((0, x]) & x \ge 0 \\ -m_s((x, 0]) & x < 0 \end{cases}. \]
However as we have observed, these functions are equal. So ms = m. Hence
m(E + s) = m(E) for all Borel sets E ⊂ R. Now Proposition 1.4.3 shows that
there is a unique extension of m|BorR to the σ-algebra L of Lebesgue measurable
sets.
If s = 0, then m(0E) = 0 = 0m(E), so suppose that s ≠ 0. Let ns(E) =
|s|⁻¹ m(sE). Observe that
ns ((a, b)) = |s|−1 |sb − sa| = b − a = m((a, b)).
Arguing as in the previous paragraph, we see that ns = m. Hence m(sE) =
|s|ns (E) = |s|m(E) for all measurable sets E.
Thus µF (a) > 0 if and only if F has a jump discontinuity at a. In this case, we
say that a is an atom of µF . The only discontinuities of a monotone function are
jump discontinuities. There are at most countably many jump discontinuities. To
see this, consider how many jumps of size at least δ there can be in (a, b]; at most
(F (b) − F (a))/δ which is finite. So there are at most n(F (n) − F (−n)) points
in (−n, n] with a jump of more than 1/n. The union of this countable collection
of finite sets is countable, and contains all of the jumps. Note, however that the
discontinuities can be dense, say at every rational point.
Suppose that the discontinuities occur at ai with jump αi > 0 for i ≥ 1. Recall
that δx is the point mass that sets δx(A) = 1 if x ∈ A and δx(A) = 0 otherwise.
Then $\mu_a = \sum_{i \ge 1} \alpha_i \delta_{a_i}$ is called an atomic measure. When we subtract it from µF,
we get a measure µc = µF − µa which has no atomic part. There are increasing,
right continuous functions Fa and Fc so that µa = µFa and µc = µFc . Moreover
Fc is continuous and Fa is entirely determined by its jumps. We can show that (up
to a constant)
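\[ F_a(x) = \begin{cases} \sum_{\{i : 0 \le a_i \le x\}} \alpha_i & \text{if } x \ge 0 \\ -\sum_{\{i : x < a_i < 0\}} \alpha_i & \text{if } x < 0 \end{cases}. \]
This converges for every x because $\sum_{\{i : 0 \le a_i \le x\}} \alpha_i \le F(x) - F(0) < \infty$ when
x ≥ 0, and similarly $\sum_{\{i : x < a_i < 0\}} \alpha_i \le F(0) - F(x)$ when x < 0.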
So µ̄∗F (U
\ E) = µF (U \ A) + µ∗F (A \ E) < ε.
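Now if E is unbounded, let En = E ∩ (n − 1, n] for n ∈ Z. Find an open
set Un ⊃ En with $\mu_F^*(U_n \setminus E_n) < 2^{-2|n|-1}\varepsilon$. Then $E \subset U := \bigcup_{n \in \mathbb{Z}} U_n$ is an
open set, and $\mu_F^*(U \setminus E) \le \sum_{n \in \mathbb{Z}} \mu_F^*(U_n \setminus E_n) < \frac{\varepsilon}{2} + 2 \sum_{n \ge 1} 2^{-2n-1}\varepsilon < \varepsilon$. So
(1) =⇒ (2).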
(2) =⇒ (4). Find open sets Un ⊃ E with $\mu_F^*(U_n \setminus E) < \frac{1}{n}$. Set $G = \bigcap_{n \ge 1} U_n$.
This is a Gδ set containing E such that $\mu_F^*(G \setminus E) \le \mu_F^*(U_n \setminus E) < \frac{1}{n}$ for all n ≥ 1.
Hence $\mu_F^*(G \setminus E) = 0$.
Functions
We are most often interested in scalar valued functions, i.e. with range in R or
C. We always use the σ-algebra of Borel sets, not Lebesgue measurable sets, when
defining measurable functions.
It is not necessary to verify the measurability condition on all Borel sets, only
on a generating family. That is, if E ⊂ B generates B as a σ-algebra and f −1 (E) ∈
A for all E ∈ E, then f is measurable. This follows from the easy facts:
\[ f^{-1}(E^c) = (f^{-1}(E))^c \quad \text{and} \quad f^{-1}\Big( \bigcup_{n \ge 1} E_n \Big) = \bigcup_{n \ge 1} f^{-1}(E_n). \]
PROOF. Suppose that N is a null set such that f = g on X \ N. For any Borel
set B ⊂ R, f⁻¹(B) △ g⁻¹(B) ⊂ N. As these two sets differ by a subset of a null
set, they differ by a null set by completeness. So one is measurable because the
other is.
Suppose that f (x) = lim fn (x) except on a null set N . Then f = lim sup fn
except on N , and hence is measurable by the previous paragraph.
A simple function is just a measurable function with finite range. Indeed, the
range of ϕ is contained in {ai : 1 ≤ i ≤ n} ∪ {0}. Conversely, if the range of a
measurable function ϕ is contained in {ai : 1 ≤ i ≤ n} ∪ {0} with ai ≠ aj ≠ 0 for
i ≠ j, then we can set Ei = ϕ⁻¹({ai}). These are disjoint, and $\varphi = \sum_{i=1}^{n} a_i \chi_{E_i}$.
This is called the standard form of a simple function.
Note that the set of all simple functions is an algebra, meaning that it is a
vector space and is closed under products. The subset of simple functions such that
µ(Ei ) < ∞ for all i is a subalgebra.
The main result of this section is that every measurable function is a limit of
simple functions in a nice way.
2.2.2. PROPOSITION.
(a) If f : X → [0, ∞] is measurable, then there is a sequence of simple functions
ϕn so that ϕn ≤ ϕn+1 , limn→∞ ϕn (x) = f (x), and convergence is uniform on
{x : f (x) ≤ R} for any R ∈ R.
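The statement in (a) can be checked numerically with the usual dyadic construction, sketched below. This is an illustration under assumptions, not necessarily the construction used in the proof: the helper name `simple_approx` and the sample function f(x) = x² are hypothetical choices. Here ϕn rounds f down to a multiple of 2⁻ⁿ and caps the result at 2ⁿ, so ϕn has finite range, increases with n, and is within 2⁻ⁿ of f wherever f ≤ 2ⁿ.

```python
import math

def simple_approx(f, n):
    """Return phi_n, the n-th dyadic simple function lying below f (f >= 0)."""
    def phi(x):
        y = f(x)
        if y >= 2 ** n:                          # cap large (or infinite) values at 2^n
            return float(2 ** n)
        return math.floor(y * 2 ** n) / 2 ** n   # round down to a multiple of 2^-n
    return phi

f = lambda x: x * x                      # sample non-negative function (assumption)
points = [k / 8 for k in range(0, 33)]   # sample points in [0, 4]

for n in range(1, 6):
    phi_n, phi_next = simple_approx(f, n), simple_approx(f, n + 1)
    for x in points:
        # monotone in n and bounded above by f
        assert phi_n(x) <= phi_next(x) <= f(x)
        # uniform error bound 2^-n on the set where f <= 2^n
        if f(x) <= 2 ** n:
            assert f(x) - phi_n(x) <= 2 ** -n
```

The cap at 2ⁿ is what keeps the range finite; it is also why uniform convergence is only claimed on sets where f is bounded.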
Integration
The convention that 0 · ∞ = 0 means that f = 0χX has $\int f \, d\mu = 0$ even
if µ(X) = ∞; since the integral of a simple function should not be changed by
adding a zero function to it. The second identity ∞ · 0 = 0 is also needed to
coincide with the notion that the value of f on a set of measure 0 should not affect
the integral. So $\int \infty \chi_{\{x\}} \, d\mu = 0$ if µ({x}) = 0. At this stage, we allow ∞ both as a
value of f and of the integral. Later, in some cases, we impose further restrictions.
In various books, you will see the following notations which all mean the same
thing:
\[ \int f \, d\mu, \qquad \int f(x) \, d\mu(x), \qquad \int f(x) \, \mu(dx), \qquad \int f. \]
Of course, the last one will only make sense if the measure µ is known. We will
use this when convenient to simplify the notation.
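\begin{align*}
\nu(A) = \int_A \varphi \, d\mu &= \sum_{i=1}^{n} a_i \mu(E_i \cap A) \\
  &= \sum_{i=1}^{n} a_i \mu\Big( \dot{\bigcup}_{j \ge 1} E_i \cap A_j \Big) = \sum_{i=1}^{n} a_i \sum_{j \ge 1} \mu(E_i \cap A_j) \\
  &= \sum_{j \ge 1} \sum_{i=1}^{n} a_i \mu(E_i \cap A_j) = \sum_{j \ge 1} \int_{A_j} \varphi \, d\mu = \sum_{j \ge 1} \nu(A_j).
\end{align*}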
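3.2.1. LEMMA. Let ϕ ≥ 0 be a simple function. Let An ∈ B with An ⊂ An+1
and $\bigcup_{n \ge 1} A_n = X$. Then
\[ \lim_{n \to \infty} \int_{A_n} \varphi \, d\mu = \int \varphi \, d\mu. \]
PROOF. Let $\nu(A) = \int_A \varphi \, d\mu$ be the measure defined in Lemma 3.1.2(e). Then
\[ \lim_{n \to \infty} \int_{A_n} \varphi \, d\mu = \lim_{n \to \infty} \nu(A_n) = \nu(X) = \int \varphi \, d\mu. \]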
PROOF. Since fn(x) is increasing, the limit always exists in [0, ∞]. Also since
fn ≤ fn+1 ≤ f, we have that
\[ \int f_n \, d\mu \le \int f_{n+1} \, d\mu \le \int f \, d\mu. \]
Thus $\lim_{n \to \infty} \int f_n \, d\mu$ exists in [0, ∞] and is no larger than $\int f \, d\mu$.
Conversely, suppose that ϕ ≤ f is a non-negative simple function, and let
ε > 0. Define An = {x : fn(x) ≥ (1 − ε)ϕ(x)}. Since ϕ(x) < ∞ and
fn(x) → f(x) ≥ ϕ(x), we have An ⊂ An+1 and $\bigcup_{n \ge 1} A_n = X$. Therefore
by Lemma 3.2.1,
\[ (1 - \varepsilon) \int \varphi \, d\mu = \lim_{n \to \infty} \int_{A_n} (1 - \varepsilon) \varphi \, d\mu \le \lim_{n \to \infty} \int_{A_n} f_n \, d\mu \le \lim_{n \to \infty} \int f_n \, d\mu. \]
Let ε ↓ 0 to get that $\int \varphi \, d\mu \le \lim_{n \to \infty} \int f_n \, d\mu$. Taking the supremum yields
$\int f \, d\mu \le \lim_{n \to \infty} \int f_n \, d\mu$. So equality holds.
PROOF. Apply MCT to the sequence $g_k = \sum_{n=1}^{k} f_n$.
PROOF. Countable additivity is a special case of the previous corollary; since
if $A = \dot{\bigcup}_{n \ge 1} A_n$, then $f \chi_A = \sum_{n \ge 1} f \chi_{A_n}$.
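PROOF. Define
\[ E_n = \{ x : f_n(x) > f_{n+1}(x) \} \text{ for } n \ge 1 \quad \text{and} \quad E_0 = \{ x : \lim_{n \to \infty} f_n(x) \ne f(x) \}. \]
Then µ(En) = 0 for n ≥ 0 by hypothesis. Let $E = \bigcup_{n \ge 0} E_n$. Then µ(E) = 0,
$f_n \chi_{E^c} \le f_{n+1} \chi_{E^c}$ and $f \chi_{E^c} = \lim_{n \to \infty} f_n \chi_{E^c}$. So by the MCT and Lemma 3.2.5,
\[ \int f \, d\mu = \int f \chi_{E^c} \, d\mu = \lim_{n \to \infty} \int f_n \chi_{E^c} \, d\mu = \lim_{n \to \infty} \int f_n \, d\mu. \]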
3.2.7. EXAMPLE. Things don’t work out quite as well for limits which are not
monotone. For example, let $f_n = \frac{1}{n} \chi_{[0,n]} \in L^+(m)$. Then fn → 0 uniformly on
R. However,
\[ \lim_{n \to \infty} \int f_n \, dm = \lim_{n \to \infty} 1 = 1 \ne 0 = \int 0 \, dm. \]
PROOF. Let $g_n(x) = \inf_{k \ge n} f_k(x) \le f_n(x)$. Then gn ≤ gn+1 for n ≥ 1, and
lim inf fn = limn→∞ gn. Therefore by MCT,
\[ \int \liminf f_n \, d\mu = \lim_{n \to \infty} \int g_n \, d\mu = \liminf \int g_n \, d\mu \le \liminf \int f_n \, d\mu. \]
The set of all integrable functions will be (provisionally) called L1(µ). Write f =
f1 − f2 + if3 − if4 where f1 = Re f ∨ 0 = max{Re f, 0}, f2 = (− Re f) ∨ 0,
f3 = Im f ∨ 0 and f4 = (− Im f) ∨ 0. If f is integrable, define
\[ \int f \, d\mu = \int f_1 \, d\mu - \int f_2 \, d\mu + i \int f_3 \, d\mu - i \int f_4 \, d\mu. \]
Note that since fi ∈ L+ and fi ≤ |f|, we have $\int f_i \, d\mu \le \int |f| \, d\mu < \infty$. So
$\int f \, d\mu$ is defined as a complex number.
The reason for the temporary nature of our definition of L1(µ) is that this is
not a normed vector space, due to the fact that if f = 0 a.e.(µ), then ‖f‖₁ = 0. So
‖ · ‖₁ is just a pseudo-norm. That is, it satisfies ‖λf‖₁ = |λ| ‖f‖₁ and the triangle
inequality, but is not positive definite. We will rectify this soon by identifying
functions which agree a.e. into an equivalence class representing an element of
L1(µ).
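Linearity follows from linearity of the integral in L+, and is left to the reader.
Let f ∈ L1(µ) and choose θ so that $\int f = e^{i\theta} \big| \int f \big|$. Then
\[ \Big| \int f \Big| = e^{-i\theta} \int f = \operatorname{Re} e^{-i\theta} \int f = \int \operatorname{Re}\big( e^{-i\theta} f \big). \]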
We now arrive at the most important limit theorem in measure theory. With
this result and MCT, you can deal with most situations that arise.
PROOF. First assume that fn are real valued. Then we apply Fatou’s Lemma 3.2.8
to the sequences g ± fn in L+ to get
\[ \int g + f \, d\mu \le \liminf \int g + f_n \, d\mu = \int g \, d\mu + \liminf \int f_n \, d\mu \]
and
\[ \int g - f \, d\mu \le \liminf \int g - f_n \, d\mu = \int g \, d\mu - \limsup \int f_n \, d\mu. \]
Since $\int g \, d\mu < \infty$, it can be cancelled off and we obtain that
\[ \limsup \int f_n \, d\mu \le \int f \, d\mu \le \liminf \int f_n \, d\mu. \]
It follows that $\int f \, d\mu = \lim_{n \to \infty} \int f_n \, d\mu$. In particular, applying this to |f| and
|fn| we obtain that
\[ \|f\|_1 = \int |f| \, d\mu = \lim_{n \to \infty} \int |f_n| \, d\mu \le \int g \, d\mu < \infty. \]
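3.3.4. EXAMPLE. Consider $\lim_{n \to \infty} \int_a^b \frac{n}{1 + n^2 x^2} \, dx$ for 0 ≤ a < b ≤ ∞.
Observe that $f_n(x) = \frac{n}{1 + n^2 x^2} = \frac{1}{\frac{1}{n} + n x^2}$. So $\lim_{n \to \infty} f_n(x) = 0$ for all x > 0. The
function $h(t) = t + \frac{1}{t} x^2$ satisfies $f_n(x) = 1/h(\frac{1}{n})$ and attains its minimum at t = x. Thus if 0 < x ≤ 1,
\[ \sup_{n \ge 1} f_n(x) \le \sup_{0 < t \le 1} \frac{1}{h(t)} = \frac{1}{h(x)} = \frac{1}{2x}. \]
On the other hand, if x ≥ 1, then
\[ \sup_{n \ge 1} f_n(x) \le \sup_{0 < t \le 1} \frac{1}{h(t)} = \frac{1}{h(1)} = \frac{1}{1 + x^2}. \]
So if we set
\[ g(x) = \begin{cases} \frac{1}{2x} & \text{if } 0 < x \le 1 \\ \frac{1}{1 + x^2} & \text{if } x \ge 1 \end{cases}, \]
then 0 ≤ fn ≤ g. The problem is that g ∉ L1(0, ∞) since $\int_0^1 \frac{1}{2x} \, dx = +\infty$. However if 0 < a ≤ 1, then
\[ \int_a^\infty g(x) \, dx = \int_a^1 \frac{1}{2x} \, dx + \int_1^\infty \frac{1}{1 + x^2} \, dx = \tfrac{1}{2} \ln x \Big|_a^1 + \tan^{-1}(x) \Big|_1^\infty = \tfrac{1}{2} \ln a^{-1} + \tfrac{\pi}{4} < \infty. \]
So when a > 0, the functions fn are dominated by an integrable function on [a, b] and the limit of the integrals is 0. When a = 0, however,
\[ \lim_{n \to \infty} \int_0^b f_n(x) \, dx = \lim_{n \to \infty} \tan^{-1}(nb) = \frac{\pi}{2} \ne \int_0^b 0 \, dx = 0. \]
Thus the domination condition by an integrable function is crucial.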
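A quick numerical check of both cases, as a sketch for finite b only. It relies on the antiderivative tan⁻¹(nx) of n/(1 + n²x²); the helper name `integral_fn` and the sample endpoints 0.5 and 2.0 are hypothetical choices made for illustration.

```python
import math

def integral_fn(n, a, b):
    # Exact value of the integral of n / (1 + n^2 x^2) over [a, b],
    # computed from the antiderivative arctan(n x).
    return math.atan(n * b) - math.atan(n * a)

for n in (1, 10, 100, 10_000):
    # second column: a = 0.5 > 0, tends to 0
    # third column:  a = 0, tends to pi/2
    print(n, integral_fn(n, 0.5, 2.0), integral_fn(n, 0.0, 2.0))
```

With a > 0 the integrals shrink to 0, as dominated convergence predicts; with a = 0 they approach π/2 because the mass of fn concentrates at the origin, where no integrable dominating function exists.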
That is, fnk converges to f in L1 (µ). Finally a standard argument shows that the
whole Cauchy sequence (fn ) converges to f in norm.
Then
\[ l_P(x) = m_1 \chi_{\{a\}} + \sum_{j=1}^{n} m_j \chi_{(x_{j-1}, x_j]} \le f \le M_1 \chi_{\{a\}} + \sum_{j=1}^{n} M_j \chi_{(x_{j-1}, x_j]} = u_P(x). \]
If P and Q are two partitions, let P ∨ Q denote the partition using the points in
P ∪ Q. It is easy to show that
L(f, P) ≤ L(f, P ∨ Q) ≤ U (f, P ∨ Q) ≤ U (f, Q).
It follows that
\[ \sup_{P} L(f, P) \le \inf_{Q} U(f, Q). \]
Define $l(x) = \lim_{n \to \infty} l_{P_n}(x)$ and $u(x) = \lim_{n \to \infty} u_{P_n}(x)$. Note that l ≤ f ≤ u.
These functions are Lebesgue measurable. Moreover, $l_{P_n} - l_{P_1} \ge 0$ and increase
to $l - l_{P_1}$. Since $l_{P_1}$ is integrable, the MCT shows that
\begin{align*}
\int l \, dm &= \int l_{P_1} \, dm + \lim_{n \to \infty} \int l_{P_n} - l_{P_1} \, dm \\
             &= \lim_{n \to \infty} \int l_{P_n} \, dm = \lim_{n \to \infty} L(f, P_n) = \int_a^b f(x) \, dx.
\end{align*}
Similarly, $\int u \, dm = \int_a^b f(x) \, dx$. In particular, $\int u - l \, dm = 0$. By Lemma 3.2.5,
u − l = 0 a.e.(m). Therefore l = f = u a.e.(m), so that f is measurable and
\[ \int f \, dm = \int u \, dm = \int_a^b f(x) \, dx. \]
3.5.2. REMARKS. There is one situation where the Riemann integral can do
something that the Lebesgue integral can’t. That is the improper Riemann integral,
in which a function which is not Riemann integrable can be integrated as a limit.
The typical example is $\int_0^\infty \frac{\sin x}{x} \, dx$. The integrand extends to be continuous at
x = 0, so that is not an issue. In the Riemann theory, this integral is evaluated as
\[ \int_0^\infty \frac{\sin x}{x} \, dx = \lim_{r \to \infty} \int_0^r \frac{\sin x}{x} \, dx. \]
This can be shown by a number of techniques to equal π/2. The reason it is not
Lebesgue integrable is that this integral exists only as a conditional limit, and
\[ \int_0^\infty \Big| \frac{\sin x}{x} \Big| \, dx = \lim_{r \to \infty} \int_0^r \Big| \frac{\sin x}{x} \Big| \, dx = +\infty. \]
So this function is neither Lebesgue nor Riemann integrable. However if a function
f is Lebesgue integrable, then so is |f|.
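Let’s examine this example in more detail.
\[ \int_0^\infty \frac{\sin x}{x} \, dx = \sum_{n=0}^{\infty} \int_{n\pi}^{(n+1)\pi} \frac{\sin x}{x} \, dx = \sum_{n=0}^{\infty} (-1)^n a_n \]
where for n ≥ 1,
\[ a_n = \int_{n\pi}^{(n+1)\pi} \frac{|\sin x|}{x} \, dx \le \frac{1}{n\pi} \int_0^\pi \sin x \, dx = \frac{2}{n\pi} \]
and
\[ a_n = \int_{n\pi}^{(n+1)\pi} \frac{|\sin x|}{x} \, dx \ge \frac{1}{(n+1)\pi} \int_0^\pi \sin x \, dx = \frac{2}{(n+1)\pi}. \]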
It follows that $a_n \ge \frac{2}{(n+1)\pi} \ge a_{n+1}$ for n ≥ 1; so an → 0 monotonically. Therefore
$\sum_{n=0}^{\infty} (-1)^n a_n$ converges by the alternating series test. It also shows that
\[ \int_0^\infty \Big| \frac{\sin x}{x} \Big| \, dx = \sum_{n=0}^{\infty} \int_{n\pi}^{(n+1)\pi} \frac{|\sin x|}{x} \, dx = \sum_{n=0}^{\infty} a_n. \]
Since $a_n \ge \frac{2}{(n+1)\pi}$ for n ≥ 1, this series diverges by comparison with the harmonic
series.
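The bounds on an can be checked numerically. The following is a rough sketch, not part of the argument: it approximates an for n ≥ 1 by composite Simpson's rule, and the function name `a` and the step count are hypothetical choices.

```python
import math

def a(n, steps=2000):
    """Approximate a_n = integral of |sin x| / x over [n*pi, (n+1)*pi], n >= 1,
    by composite Simpson's rule (steps must be even)."""
    lo, hi = n * math.pi, (n + 1) * math.pi
    h = (hi - lo) / steps
    g = lambda x: abs(math.sin(x)) / x
    total = g(lo) + g(hi)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * g(lo + k * h)   # Simpson weights 4, 2, 4, ...
    return total * h / 3

partial = 0.0
for n in range(1, 9):
    an = a(n)
    # the two-sided estimate 2/((n+1) pi) <= a_n <= 2/(n pi)
    assert 2 / ((n + 1) * math.pi) <= an <= 2 / (n * math.pi)
    partial += an
    print(n, round(an, 6), round(partial, 6))
```

The terms decrease to 0, which is what the alternating series test needs, while the running total keeps growing roughly like a harmonic sum.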
The other place where an improper Riemann integral is required is the integration
of unbounded functions. The Riemann theory works only for bounded functions.
A function like $f(x) = x^{-a}$ for 0 < a < 1 has an improper Riemann integral
on [0, 1]. Since this function is positive and continuous with a finite integral, it is
Lebesgue integrable. It is possible again to construct a function with alternating
sign on (0, 1] that blows up near 0 so that the improper Riemann integral exists as
\[ \int_0^1 f(x) \, dx = \lim_{\varepsilon \to 0} \int_\varepsilon^1 f(x) \, dx \]
but so that the absolute value is not integrable. An example is $\int_0^1 \frac{1}{x} \sin \frac{1}{x} \, dx$, which
the change of variables $u = \frac{1}{x}$ converts to the integral $\int_1^\infty \frac{\sin u}{u} \, du$ which
we have already analyzed.
When you have to integrate specific functions, you will usually find yourself
falling back on the many techniques that have been developed for the Riemann in-
tegral. Almost all explicit functions that you will need to integrate will be Riemann
integrable or at least locally Riemann integrable.
The power of the Lebesgue integral lies in the limit theorems MCT and LDCT.
These are much stronger than anything available in the Riemann theory. Also we
obtain the completeness of L1 (µ) and many other related spaces that allow the
power of functional analysis to apply.
Suppose that f is Riemann integrable. We use the notation from the proof of
Theorem 3.5.1. Choose an increasing family of partitions Pn with mesh(Pn ) → 0,
and define lPn , l, uPn and u as before. Let
\[ A = \{ x : l(x) \ne u(x) \} \cup \bigcup_{n \ge 1} P_n. \]
The set {x : l(x) ≠ u(x)} has measure 0, and $\bigcup_{n \ge 1} P_n$ is countable. Therefore
m(A) = 0. Moreover
\[ \lim_{n \to \infty} u_{P_n}(x) - l_{P_n}(x) = u(x) - l(x) = 0 \quad \text{for } x \in [a, b] \setminus A. \]
If x ∈ [a, b] \ A, choose n so that uPn (x) − lPn (x) < ε. Then there are adjacent
points xi−1 < xi in Pn so that xi−1 < x < xi . Let δ = min{x − xi−1 , xi − x}.
Then since uPn and lPn are constant on (xi−1 , xi ] and are upper and lower bounds
for f , respectively, it follows that
\[ U(x) - L(x) \le \sup_{|y - x| < \delta,\ |z - x| < \delta} f(y) - f(z) \le u_{P_n}(x) - l_{P_n}(x) < \varepsilon. \]
Thus ω(f, x0) < r as well. Cover An with a countable family of open intervals
of total length at most $2^{-n}$. Then since An is compact, there is a finite subcover
I1, . . . , Im. Let $B = [a, b] \setminus \bigcup_{i=1}^{m} I_i$. For each x ∈ B, since $U(x) - L(x) < 2^{-n}$,
there is an open interval Jx ∋ x so that the oscillation over Jx is less than $2^{-n}$. The
The following proposition shows that in the familiar case of Borel sets on met-
ric spaces, and finite products, we get the desired result. The separability hypothesis
is crucial.
PROOF. The product space $X = \prod_{i=1}^{n} X_i$ is also a metric space with the metric
d((xi), (yi)) = max{di(xi, yi)}. With this (or any equivalent) metric, the coordinate
projections are continuous. Thus if Ui is open in Xi, the set $\pi_i^{-1}(U_i)$ is open
in X. These sets generate $\bigotimes_{i=1}^{n} \mathrm{Bor}(X_i)$, and thus it is contained in Bor(X).
Conversely, since each Xi is separable, so is X. Therefore X is second countable.
Indeed if {xj : j ≥ 1} is a dense subset, then $\{ b_r(x_j) : j \ge 1, r \in \mathbb{Q}^+ \}$
is a countable neighbourhood base for X. Now $b_r(x_j) = \prod_{i=1}^{n} b_r(x_{j,i})$, where
$\pi_i(x_j) = x_{j,i}$, belongs to $\bigotimes_{i=1}^{n} \mathrm{Bor}(X_i)$. As a σ-algebra is closed under countable
unions and every open set in X is the union of those (countably many) sets in the
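3.6.4. COROLLARY.
\[ \mathrm{Bor}(\mathbb{R}^n) = \bigotimes_{i=1}^{n} \mathrm{Bor}(\mathbb{R}) \quad \text{and} \quad \mathrm{Bor}(\mathbb{C}) = \mathrm{Bor}(\mathbb{R}) \otimes \mathrm{Bor}(\mathbb{R}). \]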
Now let (X, B, µ) and (Y, B′, ν) be two measure spaces. Let A be the collection
of all finite unions of disjoint rectangles of the form A × B for A ∈ B and
B ∈ B′. This is an algebra since it is closed under finite unions and complements.
Indeed, (A × B)ᶜ = Aᶜ × Y ∪˙ A × Bᶜ and
A1×B1 ∪ A2×B2 = A1×B1 ∪˙ (A2 \ A1)×B2 ∪˙ (A1 ∩ A2)×(B2 \ B1).
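A decomposition of a union of two rectangles into disjoint rectangles can be sanity-checked on small finite sets. The sketch below takes the three pieces to be A1×B1, (A2 \ A1)×B2 and (A1 ∩ A2)×(B2 \ B1); the helper `rect` and the sample sets are hypothetical choices.

```python
from itertools import product

def rect(A, B):
    # The rectangle A x B as a set of pairs.
    return set(product(A, B))

A1, B1 = {1, 2, 3}, {"a", "b"}
A2, B2 = {2, 3, 4}, {"b", "c"}

pieces = [rect(A1, B1), rect(A2 - A1, B2), rect(A1 & A2, B2 - B1)]
union = rect(A1, B1) | rect(A2, B2)

# the pieces cover the union ...
assert set().union(*pieces) == union
# ... and are pairwise disjoint, since their sizes add up to the size of the union
assert sum(len(p) for p in pieces) == len(union)
```

The third piece has to use A1 ∩ A2: points whose first coordinate lies in both A1 and A2, and whose second coordinate lies in B2 but not B1, belong to the union and are not covered by the first two pieces.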
Now define a set function
\[ \pi\Big( \dot{\bigcup}_{i=1}^{n} A_i \times B_i \Big) = \sum_{i=1}^{n} \mu(A_i) \nu(B_i). \]
To apply Carathéodory’s Theorem, we need the following lemma.
Hence π is a premeasure.
Now we can apply Carathéodory’s Theorem via Theorem 1.4.2 to obtain the
following, called the product measure.
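3.6.6. THEOREM. Let (X, B, µ) and (Y, B′, ν) be two measure spaces. There
is a complete measure $(X \times Y, \overline{\mathcal{B} \otimes \mathcal{B}'}, \mu \times \nu)$ on X × Y such that $\overline{\mathcal{B} \otimes \mathcal{B}'} \supset \mathcal{B} \otimes \mathcal{B}'$
and µ × ν(A × B) = µ(A)ν(B) for all A ∈ B and B ∈ B′.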
3.7.3. REMARKS. (1) The hypothesis that µ × ν is σ-finite in Tonelli’s Theorem
is critical, as an example will show. However it is not needed for Fubini’s Theorem
because when f ∈ L1(µ × ν), we have that $\int |f| \, d\mu \times \nu = L < \infty$. It follows
that $C_n = \{ (x, y) \in X \times Y : |f(x, y)| \ge \frac{1}{n} \}$ has µ × ν(Cn) ≤ nL. Thus the
restriction of µ × ν to $C = \{ (x, y) \in X \times Y : f(x, y) \ne 0 \} = \bigcup_{n \ge 1} C_n$ is σ-finite.
(3) Normal practice is to omit the parentheses and if there is no confusion, also
the variables, in multiple integrals. So we write
\[ \iint f \, d\nu \, d\mu \quad \text{or} \quad \iint f(x, y) \, d\nu(y) \, d\mu(x) \quad \text{for} \quad \int_X \Big( \int_Y f(x, y) \, d\nu(y) \Big) d\mu(x). \]
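PROOF. If E = A × B for A ∈ B and B ∈ B′, then
\[ E_x = \begin{cases} B & \text{if } x \in A \\ \emptyset & \text{if } x \notin A. \end{cases} \]
Therefore g(x) = ν(B)χA is µ-measurable and g ≥ 0. Moreover,
\[ \int g \, d\mu = \int \nu(B) \chi_A \, d\mu = \mu(A) \nu(B) = \mu \times \nu(E) < \infty. \]
Thus g ∈ L+ ∩ L1(µ).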
Next, if E ∈ Aσ, we can write $E = \bigcup_{i \ge 1} A_i \times B_i$ for Ai ∈ B and Bi ∈ B′.
gi(x) = ν(Ci)χDi. Then $g(x) = \sum_{i \ge 1} g_i(x)$. Hence g is measurable, and thus
The last equality follows by continuity from above since µ × ν(E1 ) < ∞.
Similarly we can interchange the role of x and y.
Thus
\[ \int_{X \times Y} f \, d\mu \times \nu = \int_X \int_Y f(x, y) \, d\nu(y) \, d\mu(x). \]
Interchanging the role of X and Y yields the other iterated integral.
There is a straightforward variant of these results when the measures are not
complete. The difference is that given measures (X, B, µ) and (Y, B′, ν), we form
µ × ν and restrict it to the σ-algebra B ⊗ B′. I will also call this µ × ν.
The second iterated integral first sums the columns, each absolutely summable.
However the first column sums to 1, the rest to 0:
\[ \int_Y \int_X f(m, n) \, dm_c(m) \, dm_c(n) = \sum_{n \ge 1} \sum_{m \ge 1} f(m, n) = 1 + \sum_{n \ge 2} 0 = 1. \]
These measures are σ-finite since N² is countable and points have measure 1.
Tonelli’s theorem is not applicable because f is not positive. More importantly,
Fubini’s theorem does not apply because f is not integrable. If it were, then |f| would
have a finite integral, but clearly
\[ \iint |f| \, dm_c \times m_c = \sum_{m \ge 1} \sum_{n \ge 1} |f(m, n)| = \infty. \]
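The definition of f is not reproduced in this passage, so the sketch below assumes one standard choice that matches the column sums quoted above: f(m, m) = 1, f(m, m + 1) = −1, and f = 0 elsewhere. With that assumption (and the hypothetical helper names `column_sum` and `row_sum`), the two iterated sums can be computed exactly, because each row and each column has only finitely many non-zero entries.

```python
def f(m, n):
    # Assumed example: +1 on the diagonal, -1 where n = m + 1, 0 elsewhere.
    if m == n:
        return 1
    if m == n - 1:
        return -1
    return 0

def column_sum(n):
    # Sum over m >= 1 for fixed n; f(m, n) vanishes for m > n.
    return sum(f(m, n) for m in range(1, n + 1))

def row_sum(m):
    # Sum over n >= 1 for fixed m; f(m, n) vanishes for n > m + 1.
    return sum(f(m, n) for n in range(1, m + 2))

N = 50
columns_first = sum(column_sum(n) for n in range(1, N + 1))   # inner sum over m
rows_first = sum(row_sum(m) for m in range(1, N + 1))         # inner sum over n
abs_total = sum(abs(f(m, n)) for m in range(1, N + 1) for n in range(1, N + 1))
print(columns_first, rows_first, abs_total)
```

Under this assumption the two orders of summation give different answers, and the sum of |f| over the N × N square grows without bound as N increases, so f is not integrable.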
This is a finite measure. Form the product space (Ω², Bor(Ω²), µ × µ). Again we
set E = {(x, y) ∈ Ω² : y ≺ x} and f = χE. We have the same contradiction. It
must be the case that E is not a measurable set.
PROOF. (1) Note that translation of a cube preserves the measure. Thus the
premeasure is translation invariant. It follows that the outer measure is translation
invariant, and hence so is m.
(2) If f is Borel measurable, then so is f ◦T because T is continuous and hence
Borel by Corollary 2.1.3. If the result is true for invertible matrices S and T, then
\begin{align*}
\int f \, dm &= |\det S| \int f \circ S \, dm \\
             &= |\det S| \, |\det T| \int (f \circ S) \circ T \, dm \\
             &= |\det ST| \int f \circ ST \, dm.
\end{align*}
Hence it is true for their product. Therefore it suffices to establish the result for
elementary matrices.
Using either the Fubini or Tonelli Theorem, the integral is equal to the iterated
integral. For multiplying a row by a non-zero constant, the iterated integral reduces
to the one variable fact. Adding a multiple of one row to another, integrate first
with respect to xi and use translation invariance. Interchanging two coordinates
is equivalent to interchanging the order of integration, which results in no change,
again by the Fubini or Tonelli theorem.
For E Borel, apply the result to χE to get
\begin{align*}
m(E) = \int \chi_E \, dm &= |\det T^{-1}| \int \chi_E \circ T^{-1} \, dm \\
                         &= |\det T|^{-1} \int \chi_{T(E)} \, dm = |\det T|^{-1} m(T(E)).
\end{align*}
The result for a Lebesgue measurable function f now follows because there is
a Gδ set G so that $f \chi_{G^c}$ is Borel measurable. Now $(f \chi_{G^c}) \circ T = (f \circ T) \chi_{T^{-1}(G)^c}$.
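Then define
\[ G_{1,n} = \big\{ x_1 \in X_1 : \mu_{(1, m_n]}(F_n(x_1)) > \varepsilon / 2 \big\}. \]
Note that
\begin{align*}
\varepsilon \le \mu(F_n) &= \int_{X_1} \mu_{(1, m_n]}(F_n(x_1)) \, d\mu_1(x_1) \\
  &\le \int_{G_{1,n}} 1 \, d\mu_1(x_1) + \int_{G_{1,n}^c} \frac{\varepsilon}{2} \, d\mu_1(x_1) \le \mu_{(1)}(G_{1,n}) + \frac{\varepsilon}{2}.
\end{align*}
Therefore $\mu_{(1)}(G_{1,n}) \ge \frac{\varepsilon}{2}$. Now $G_{1,n} \supset G_{1,n+1}$ and hence $\mu_1\big( \bigcap_{n \ge 1} G_{1,n} \big) \ge \frac{\varepsilon}{2}$.
Choose a point $a_1 \in \bigcap_{n \ge 1} G_{1,n}$. Then
\[ \mu_{(1, m_n]}(F_n(a_1)) \ge \frac{\varepsilon}{2} \quad \text{for all } n \ge 1. \]
Recursively we construct points ai ∈ Xi so that
\[ \mu_{(k, m_n]}(F_n(a_1, \dots, a_k)) \ge \frac{\varepsilon}{2^k} \quad \text{for all } n \ge 1. \]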
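Suppose that a1, . . . , ak−1 have been defined with this property. Define
\[ G_{k,n} = \Big\{ x_k \in X_k : \mu_{(k, m_n]}(F_n(a_1, \dots, a_{k-1}, x_k)) > \frac{\varepsilon}{2^k} \Big\}. \]
Arguing as above,
\begin{align*}
\frac{\varepsilon}{2^{k-1}} &\le \mu_{(k-1, m_n]}(F_n(a_1, \dots, a_{k-1})) \\
  &= \int_{X_k} \mu_{(k, m_n]}(F_n(a_1, \dots, a_{k-1}, x_k)) \, d\mu_k(x_k) \\
  &\le \int_{G_{k,n}} 1 \, d\mu_k(x_k) + \int_{G_{k,n}^c} \frac{\varepsilon}{2^k} \, d\mu_k(x_k) \le \mu_{(k)}(G_{k,n}) + \frac{\varepsilon}{2^k}.
\end{align*}
Therefore $\mu_k\big( \bigcap_{n \ge 1} G_{k,n} \big) \ge \frac{\varepsilon}{2^k}$. Pick a point $a_k \in \bigcap_{n \ge 1} G_{k,n}$. This does the job.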
It follows that for all k ≥ 1 and n ≥ 1, there is a point x so that (a1, . . . , ak, x)
is in Fn. In particular, there is a point y so that $(a_1, \dots, a_{m_n}, y) \in F_n$. But Fn is a
union of cylinder sets of level no greater than mn, and hence $(a_1, \dots, a_{m_n}, y) \in F_n$
for all $y \in \prod_{i > m_n} X_i$. Therefore $a = (a_1, a_2, a_3, \dots) \in \bigcap_{n \ge 1} F_n$. This is a
contradiction since this intersection is empty.
It follows that $\lim_{n \to \infty} \mu(F_n) = 0$ and hence $\mu(R) = \sum_{i \ge 1} \mu(R_i)$.
As an immediate consequence of Theorem 1.4.2, we obtain the desired mea-
sure.
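3.9.2. COROLLARY. There is a unique complete measure $\mu = \prod_{i \ge 1} \mu_i$ on
$\big( \prod_{i \ge 1} X_i, \bigotimes_{i \ge 1} \mathcal{B}_i \big)$ so that
\[ \mu\Big( A_1 \times \cdots \times A_n \times \prod_{i > n} X_i \Big) = \prod_{i=1}^{n} \mu_i(A_i) \]
for all n ≥ 1 and Ai ∈ Bi.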
3.9.3. EXAMPLE. Fix 0 < p < 1. Take Xi = 2 = {0, 1} and let µi be the
probability measure on (Xi, P(Xi)) such that µi(0) = p and µi(1) = 1 − p. Then
$X = \prod_{i \ge 1} X_i$ is homeomorphic to the Cantor set C. Let µp denote the infinite
product measure $\prod_{i \ge 1} \mu_i$.
Identify the Cantor ternary set with {x = 0.(2ε1 )(2ε2 )(2ε3 ) . . . base 3 } for
ε = (ε1 , ε2 , . . . ) ∈ X. The homeomorphism of X onto C is given by
h(ε) = 0.(2ε1 )(2ε2 )(2ε3 ) . . . base 3 for ε = (ε1 , ε2 , ε3 , . . . ) ∈ X.
We obtain a Borel measure on C by setting νp (A) = µp (h−1 (A)).
Let Cn denote the subset of [0, 1] after n operations of removing the middle
thirds have been accomplished; so that there remain 2ⁿ intervals of length 3⁻ⁿ. With
this explicit homeomorphism, the rectangle R(ε1, . . . , εn) corresponds to one of the
intervals of Cn intersected with C. For example, R(0, 0) = [0, 1/9] ∩ C, R(0, 1) =
[2/9, 1/3] ∩ C, R(1, 0) = [2/3, 7/9] ∩ C and R(1, 1) = [8/9, 1] ∩ C.
Consider the analogue of the Cantor function which is defined on [0, 1] by
fp(x) = νp(C ∩ [0, x]). Then fp(0) = 0 and fp(1) = 1. Observe that this is a
monotone increasing function. Moreover on the middle third [1/3, 2/3], it takes the
value p. Then on the interval [1/9, 2/9], it takes the value p² and on [7/9, 8/9], it takes the
value p + p(1 − p). Indeed, at level n, we have defined fp(x) on [0, 1] \ int(Cn). If
[a, b] is a component of Cn, then the middle third is removed to form Cn+1 and
\[ f_p(x) = f_p(a) + p\big( f_p(b) - f_p(a) \big) \quad \text{for } a + \frac{b - a}{3} \le x \le a + \frac{2(b - a)}{3}. \]
Notice that fp takes all of the values $p^k (1 - p)^{n-k}$ for 0 ≤ k ≤ n and n ≥ 1.
These values are dense in [0, 1]. As for the usual Cantor function, it follows that fp
has no jump discontinuities. So fp is continuous.
The measure $\nu_p$ is the Lebesgue-Stieltjes measure for the function $f_p$ by Theorem 1.5.2.
So $\nu_p$ is a probability measure supported on $C$. Since $f_p$ takes the same
value at each endpoint of a removed interval, we see that $\nu_p(\mathbb{R} \setminus C) = 0$. The
continuity of $f_p$ means that $\nu_p$ has no atoms.
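The recursion for f_p on the removed middle thirds can be evaluated one ternary step at a time. The following Python sketch is not part of the original notes; the name `cantor_fp` and the `depth` cutoff are illustrative assumptions. It uses exact rational arithmetic, so values on the removed intervals are exact, while points of C are only approximated (to within max(p, 1 − p)^depth).

```python
from fractions import Fraction

def cantor_fp(x, p, depth=64):
    """Approximate f_p(x) = nu_p(C ∩ [0, x]) for a rational x in [0, 1].

    `value` is the mass already known to lie to the left of x, and `mass`
    is the nu_p-mass of the component of C_n that currently contains x.
    """
    x, p = Fraction(x), Fraction(p)
    value, mass = Fraction(0), Fraction(1)
    for _ in range(depth):
        if x <= 0:
            return value
        if x >= 1:
            return value + mass
        if 3 * x < 1:            # left third: digit 0 has weight p
            mass *= p
            x = 3 * x
        elif 3 * x <= 2:         # removed middle third: f_p is constant here
            return value + p * mass
        else:                    # right third: digit 1 has weight 1 - p
            value += p * mass
            mass *= 1 - p
            x = 3 * x - 2
    return value                 # truncated: x lies in C (or very close to it)

p = Fraction(1, 3)
print(cantor_fp(Fraction(1, 2), p))   # a point of [1/3, 2/3]
print(cantor_fp(Fraction(1, 6), p))   # a point of [1/9, 2/9]
print(cantor_fp(Fraction(5, 6), p))   # a point of [7/9, 8/9]
```

With p = 1/2 the same function evaluates the usual Cantor ternary function.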
CHAPTER 4
Differentiation and Signed Measures
4.1. Differentiation
Question: Is there a measure theoretic analogue of the Fundamental Theorem
of Calculus? That is, to what extent is $F(x) = c + \int_{[a,x]} f\,dm$ differentiable on R?
What functions arise in this way?
We restrict our attention to real valued functions. In fact, splitting a function f
as a difference of two positive functions, we may suppose that f ≥ 0; and so the
integral will be monotone increasing.
4.1.1. DEFINITION. The upper and lower derivatives from the left and right of a
real valued function f are defined as
\[ D^r f(x) = \limsup_{h\to0^+} \frac{f(x+h) - f(x)}{h}, \qquad D_r f(x) = \liminf_{h\to0^+} \frac{f(x+h) - f(x)}{h}, \]
\[ D^l f(x) = \limsup_{h\to0^+} \frac{f(x) - f(x-h)}{h}, \qquad D_l f(x) = \liminf_{h\to0^+} \frac{f(x) - f(x-h)}{h}. \]
P ROOF. Define f (x) = f (a) for x < a and f (x) = f (b) for x > b. Since f is
monotone, for c ∈ [a, b], we have
\[ f(c^-) = \lim_{x\to c^-} f(x) = \sup_{x<c} f(x) \le f(c) \le f(c^+) = \inf_{x>c} f(x) = \lim_{x\to c^+} f(x). \]
let
\[ E_{u,v} = \{x : D_r f(x) < u < v < D^l f(x)\} \quad\text{and}\quad E = \bigcup_{u<v\in\mathbb{Q}} E_{u,v}. \]
To show that $m^*(E) = 0$, it suffices to show that $m^*(E_{u,v}) = 0$ for all $u < v$.
Let $m^*(E_{u,v}) = s$ and fix an ε > 0. Choose an open set $U \supset E_{u,v}$ with
$m(U) < s + \varepsilon$. Let $\mathcal{J} = \{I = [x, x + h] \subset U : f(x + h) - f(x) < uh\}$. By
definition of $D_r f(x)$, this contains arbitrarily small intervals $[x, x + h]$ for each
$x \in E_{u,v}$. Thus $\mathcal{J}$ is a Vitali cover of $E_{u,v}$. By the Vitali Covering Lemma, we can
find $I_1 = [x_1, x_1 + h_1], \dots, I_N = [x_N, x_N + h_N]$ disjoint intervals in $\mathcal{J}$ so that
$m^*\big(E_{u,v} \setminus \bigcup_{j=1}^N I_j\big) < \varepsilon$. Therefore
\[ s - \varepsilon < \sum_{j=1}^N m(I_j) = \sum_{j=1}^N h_j < m(U) < s + \varepsilon, \]
and $m^*\big(E_{u,v} \cap \bigcup_{j=1}^N I_j\big) > s - \varepsilon$. Let
\[ F = E_{u,v} \cap \bigcup_{j=1}^N (x_j, x_j + h_j) \subset \bigcup_{j=1}^N (x_j, x_j + h_j) =: V. \]
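\[ \begin{aligned}
\int_a^b f' \,dm &\le \liminf_{n\to\infty} \int_a^b g_n \,dm = \liminf_{n\to\infty} \Big( n \int_{a+\frac1n}^{b+\frac1n} f \,dm - n \int_a^b f \,dm \Big) \\
&= \liminf_{n\to\infty} \Big( n \int_b^{b+\frac1n} f(b) \,dm - n \int_a^{a+\frac1n} f \,dm \Big) \le f(b) - f(a) < \infty.
\end{aligned} \]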
4.1.5. EXAMPLE. Recall the Cantor ternary function. This is defined on [0, 1]
by f(0) = 0, f(1) = 1. Then $f(x) = \frac12$ for $x \in [\frac13, \frac23]$, the closure of the
middle third removed in the construction of the Cantor set. Then we set $f(x) = \frac14$
on $[\frac19, \frac29]$ and $f(x) = \frac34$ on $[\frac79, \frac89]$. This repeats: on each middle third removed,
f is defined as the midpoint between the values defined at the endpoints of the
(larger) interval. Finally, for x ∈ C, where C is the Cantor set, we can define
$f(x) = \sup\{f(y) : 0 \le y \le x,\ y \notin C\}$. This evidently yields a monotone
increasing function. Moreover the values attained by f include all dyadic rationals
$k\,2^{-n}$ for $0 \le k \le 2^n$. So the range of f is dense in [0, 1]. In particular, f cannot
have any jump discontinuities. Therefore f is continuous.
Next observe that on each of the removed (open) intervals, $(\frac13, \frac23)$, $(\frac19, \frac29)$ and
$(\frac79, \frac89)$, etc., the function f is constant. Therefore it is differentiable with $f'(x) = 0$.
This occurs on $[0, 1] \setminus C$, and since m(C) = 0, we see that $f'$ is defined and finite
almost everywhere. However
\[ \int_0^1 f' \,dm = 0 < 1 = f(1) - f(0). \]
So it can certainly be the case that you cannot recover f from its derivative.
\[ V_a^b(f) = \sup\Big\{ \sum_{i=1}^n |f(x_i) - f(x_{i-1})| : n \ge 1,\ a = x_0 < x_1 < \dots < x_n = b \Big\} < \infty. \]
4.1.11. EXAMPLE. The Cantor ternary function f is monotone, and thus BV.
However it is not absolutely continuous. If $x_i, y_i$ for $1 \le i \le 2^n$ are the endpoints
of the intervals remaining after n stages of the construction of the Cantor set, then
because f is constant on the complements of these intervals, we have
\[ \sum_{i=1}^{2^n} |f(y_i) - f(x_i)| = 1 \quad\text{and}\quad \sum_{i=1}^{2^n} (y_i - x_i) = \big(\tfrac23\big)^n. \]
Since $(\frac23)^n \to 0$, f is not absolutely continuous.
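The two sums in this example can be checked exactly for small n. The sketch below is an illustration only; the helper names `cantor` and `stage_intervals` are assumptions introduced here, and the Cantor function is evaluated by following ternary digits with exact fractions.

```python
from fractions import Fraction

def cantor(x, depth=64):
    """Cantor ternary function at a rational x in [0, 1].

    Exact at endpoints of removed intervals; truncated after `depth`
    ternary digits at other points of the Cantor set.
    """
    x = Fraction(x)
    value, weight = Fraction(0), Fraction(1)
    for _ in range(depth):
        if x <= 0:
            return value
        if x >= 1:
            return value + weight
        weight /= 2
        if 3 * x < 1:            # left third
            x = 3 * x
        elif 3 * x <= 2:         # removed middle third: constant value
            return value + weight
        else:                    # right third
            value += weight
            x = 3 * x - 2
    return value

def stage_intervals(n):
    """The 2**n closed intervals [x_i, y_i] remaining after n stages."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined += [(a, a + third), (b - third, b)]
        intervals = refined
    return intervals

for n in range(1, 6):
    ivs = stage_intervals(n)
    variation = sum(abs(cantor(y) - cantor(x)) for x, y in ivs)
    length = sum(y - x for x, y in ivs)
    print(n, variation, length)   # variation stays 1 while length shrinks
```

The total length of the intervals can be made smaller than any δ > 0 while the total variation of f over them stays equal to 1, which is exactly the failure of absolute continuity.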
4.1.13. COROLLARY. If $f \in L^1[a, b]$, then $F(x) = \int_a^x f \,dm$ is absolutely
continuous.
PROOF. Given ε > 0, let δ be provided by Proposition 4.1.12. Then if $(x_i, y_i)$
are disjoint intervals in [a, b] and $\sum_{i=1}^n (y_i - x_i) < \delta$, then
\[ \sum_{i=1}^n |F(y_i) - F(x_i)| = \sum_{i=1}^n \Big| \int_{x_i}^{y_i} f \,dm \Big| \le \int_{\bigcup_{i=1}^n [x_i, y_i]} |f| \,dm < \varepsilon. \]
PROOF. Take ε = 1, and find δ > 0. Split $[a, b] = \bigcup_{j=1}^p [a_{j-1}, a_j]$ where
4.1.15. LEMMA. If $f \in L^1[a, b]$ and $F(x) = \int_a^x f \,dm$ is monotone increasing,
then f ≥ 0 a.e.
\[ \begin{aligned}
0 \le \sum_{i\ge1} F(y_i) - F(x_i) &= \int_U f \,dm \\
&= \int_{E_n} f \,dm + \int_{U \setminus E_n} f \,dm \le -\frac{m(E_n)}{n} + \varepsilon < 0.
\end{aligned} \]
This contradiction shows that m(E) = 0, or that f ≥ 0 a.e.
The following consequence is immediate.
4.1.16. COROLLARY. Let $f \in L^1[a, b]$ and $F(x) = \int_a^x f \,dm$. If F = 0, then
f = 0 a.e.
The main result here is the following result of Lebesgue, his substitute for the
Fundamental Theorem of Calculus, which answers the first part of the question
posed at the beginning of this section.
4.1.17. LEBESGUE DIFFERENTIATION THEOREM. Let $f \in L^1[a, b]$ and
$F(x) = c + \int_a^x f \,dm$. Then $F'(x) = f(x)$ a.e.
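As a small numerical illustration of the statement (not taken from the notes), take the hypothetical integrand f = 2·χ_[1/2, 1] on [0, 1], for which F is known in closed form, and look at the difference quotients g_n(x) = n(F(x + 1/n) − F(x)). The names `F` and `g` below are assumptions made for this sketch.

```python
def F(x):
    """F(x) = integral of f over [0, x], where f = 2 on [1/2, 1] and 0 elsewhere."""
    return 2.0 * max(0.0, min(x, 1.0) - 0.5)

def g(n, x):
    """Difference quotient n * (F(x + 1/n) - F(x))."""
    return n * (F(x + 1.0 / n) - F(x))

for n in (10, 100, 1000):
    # x = 0.25 and x = 0.75 are continuity points of f; x = 0.5 is the jump.
    print(n, g(n, 0.25), g(n, 0.75), g(n, 0.5))
```

At the continuity points the quotients settle on f(x), namely 0 and 2. At the single jump point x = 1/2 the one-sided quotients disagree, so F is not differentiable there, which is consistent with the conclusion holding only almost everywhere.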
Thus $|g_n| \le M \chi_{[a,b]}$, which is integrable. By the LDCT and the fact that F is
continuous, for c ∈ [a, b],
\[ \begin{aligned}
\int_a^c F' \,dm &= \lim_{n\to\infty} \int_a^c g_n \,dm = \lim_{n\to\infty} n \int_a^c \big( F(x + \tfrac1n) - F(x) \big) \,dx \\
&= \lim_{n\to\infty} \Big( n \int_c^{c+\frac1n} F(x) \,dx - n \int_a^{a+\frac1n} F(x) \,dx \Big) \\
&= F(c) - F(a) = \int_a^c f \,dm.
\end{aligned} \]
Therefore $\int_a^c (F' - f) \,dm = 0$ for all c ∈ [a, b]. By Corollary 4.1.16, $F' = f$ a.e.
Case 2 f ≥ 0. Let $f_n = f \wedge n$ for n ≥ 1. Then $f_n$ is bounded and case 1
applies. Observe that
\[ F(x) = \int_a^x f_n \,dm + \int_a^x (f - f_n) \,dm. \]
Next we wish to characterize which functions are integrals. We first need an-
other lemma.
PROOF. Fix c ∈ (a, b]. Let ε > 0. Obtain δ > 0 from the definition of absolute
continuity. Set $E = \{x \in (a, c) : f'(x) = 0\}$. So $m([a, c] \setminus E) = 0$. Define
$\mathcal{J} = \{[x, x + h] : x \in E,\ h > 0,\ [x, x + h] \subset (a, c) \text{ and } |f(x + h) - f(x)| < \varepsilon h\}$.
This collection is a Vitali cover of E because for each x, all sufficiently small h
work. Therefore there are disjoint intervals $I_1, \dots, I_n$ in $\mathcal{J}$ so that
\[ m\Big( [a, c] \setminus \bigcup_{j=1}^n I_j \Big) = m\Big( E \setminus \bigcup_{j=1}^n I_j \Big) < \delta. \]
The second term in parentheses is at most ε by absolute continuity since the total
length of the intervals is less than δ. Now ε > 0 was arbitrary, and therefore
f (c) = f (a); whence f is constant.
4.2 Signed Measures 57
P ROOF. Lebesgue’s Differentiation Theorem shows that (1) implies (3). Clearly
(3) implies (1). Lemma 4.1.9 shows that integrals are AC, so (1) implies (2). If (2)
holds, then F is BV by Lemma 4.1.14; and thus by Corollary 4.1.8, $F'$ exists a.e.
and belongs to $L^1(m)$. Let
\[ G(x) = F(a) + \int_a^x F' \,dm. \]
4.2.2. REMARK. The definition implies that if $\nu\big(\dot{\bigcup}_{i\ge1} E_i\big) < \infty$, then the
series $\sum_{i\ge1} \nu(E_i)$ converges absolutely. One argument would be that the union is
independent of order, and thus the sum should be independent of the order as well.
But unconditionally convergent series of real numbers converge absolutely. An
alternative argument, which is better, is that if the sum converges only conditionally,
then the sum of the positive terms diverges, as does the sum of the negative terms.
But then if $J = \{i : \nu(E_i) \ge 0\}$, we have that $\nu\big(\dot{\bigcup}_{i\in J} E_i\big) = \sum_{i\in J} \nu(E_i) = +\infty$
and $\nu\big(\dot{\bigcup}_{i\notin J} E_i\big) = \sum_{i\notin J} \nu(E_i) = -\infty$. This is not permitted for signed measures
4.2.3. EXAMPLES.
(1) Let $f \in L^1_{\mathbb{R}}(\mu)$ and define $\nu(E) = \int_E f \,d\mu$. In this case, neither of ±∞ arises as
a value. The absolute convergence over countably many disjoint sets follows from
the LDCT.
4.2.6. L EMMA . If 0 < ν(E) < ∞, then there is a positive set A ⊂ E with
ν(A) > 0.
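Recursively choose $B_n \subset E \setminus \dot{\bigcup}_{i=1}^{n-1} B_i$ with
\[ \nu(B_n) \le \max\Big\{ -1,\ \tfrac12 \inf\big\{ \nu(B) : B \subset E \setminus \dot{\bigcup}_{i=1}^{n-1} B_i,\ B \in \mathcal{B} \big\} \Big\}. \]
Either this terminates because $A = E \setminus \dot{\bigcup}_{i=1}^{n} B_i$ is a positive set, or there is an
infinite sequence. In this case, let $A = E \setminus \dot{\bigcup}_{i\ge1} B_i$. Now by countable additivity,
\[ \nu(E) = \nu(A) + \sum_{i\ge1} \nu(B_i). \]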
4.2.10. D EFINITION . The absolute value of a signed measure with Jordan de-
composition ν = ν+ − ν− is |ν| = ν+ + ν− .
Since any measurable subset of A is also a µ-null set, A must be ν-null. This
is equivalent to saying that |ν|(A) = 0. So ν ≪ µ if and only if |ν| ≪ µ.
PROOF. First we assume that µ(X) < ∞ and ν(X) < ∞. For each $r \in \mathbb{Q}^+$,
$\nu - r\mu$ has a Hahn decomposition $(P_r, N_r)$. Also, let $P_0 = X$ and $N_0 = \emptyset$.
Define $f(x) = \sup\{r \ge 0 : x \in P_r\}$. Then for t ≥ 0, $f^{-1}(t, \infty] = \bigcup_{r>t} P_r$ is
measurable. So $f \in L^+$.
If r < s in $\mathbb{Q}^+$, then $P_s$ is a positive set for $\nu - s\mu$ and thus is positive for
$\nu - r\mu = (\nu - s\mu) + (s - r)\mu$. Since $N_r$ is a negative set for $\nu - r\mu$, we have
$\mu(N_r \cap P_s) = 0$. Therefore $\mu\big(N_r \cap \bigcup_{s>r,\, s\in\mathbb{Q}^+} P_s\big) = 0$. Hence $f|_{N_r} \le r$ a.e.(µ).
Thus $\mu(f^{-1}(r, \infty]) \le \mu(P_r)$.
Since $P_r$ is positive for $\nu - r\mu$, $r\mu(P_r) \le \nu(P_r)$. Hence
\[ \mu(P_r) \le \nu(P_r)/r \le \nu(X)/r. \]
This goes to 0 as r → ∞, and thus $\mu(f^{-1}(\infty)) \le \lim_{r\to\infty} \mu(P_r) = 0$. Therefore
f < ∞ a.e.(µ).
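Let $E \in \mathcal{B}$. Fix N. Define $E_k = E \cap (P_{k/N} \cap N_{(k+1)/N})$ for k ≥ 0; and let
$E_\infty = E \setminus \bigcup_{k\ge0} E_k = E \cap \bigcap_r P_r$. Then $\mu(E_\infty) = 0$ and so $\nu(E_\infty) = 0$ as well.
Observe that
\[ \big(\nu - \tfrac{k}{N}\mu\big)(E_k) \ge 0 \ge \big(\nu - \tfrac{k+1}{N}\mu\big)(E_k). \]
Thus
\[ \tfrac{k}{N}\,\mu(E_k) \le \nu(E_k) \le \tfrac{k+1}{N}\,\mu(E_k). \]
And similarly, since $\frac{k}{N} \le f \le \frac{k+1}{N}$ a.e.(µ) on $E_k$,
\[ \sum_{k\ge0} \tfrac{k}{N}\,\mu(E_k) \le \sum_{k\ge0} \int_{E_k} f \,d\mu = \int_E f \,d\mu \le \sum_{k\ge0} \tfrac{k+1}{N}\,\mu(E_k) = \sum_{k\ge0} \tfrac{k}{N}\,\mu(E_k) + \frac{\mu(E)}{N}. \]
Therefore
\[ \Big| \nu(E) - \int_E f \,d\mu \Big| \le \frac{\mu(E)}{N}. \]
However N is arbitrary, and hence $\nu(E) = \int_E f \,d\mu$.
Suppose that $f, g \in L^+$ are such that $\nu(E) = \int_E f \,d\mu = \int_E g \,d\mu$ for $E \in \mathcal{B}$.
Define $A = \{x : f(x) > g(x)\}$ and $A_n = \{x : f(x) \ge g(x) + \frac1n\}$. If µ(A) > 0,
then $\mu(A_n) > 0$ for some n. But then
\[ \nu(A_n) = \int_{A_n} f \,d\mu \ge \int_{A_n} \big(g + \tfrac1n\big) \,d\mu = \int_{A_n} g \,d\mu + \tfrac1n \mu(A_n) > \nu(A_n). \]
Hence f ≤ g a.e.(µ). Similarly g ≤ f a.e.(µ), so that f = g a.e.(µ).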
For the general case, use the fact that µ and ν are σ-finite to chop X up into a
disjoint union of countably many pieces $X_n$ so that $\mu(X_n) < \infty$ and $\nu(X_n) < \infty$
for each n ≥ 1. Then apply the previous argument on each piece, and combine.
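On a finite set the construction in the proof can be carried out by hand, since a Hahn decomposition of ν − rµ is read off pointwise: x ∈ P_r exactly when ν{x} ≥ rµ{x}. The sketch below is an illustration under that assumption; the weights of `mu` and `nu` and the helper names are hypothetical. It restricts r to the grid k/N, as in the proof, and checks the estimate |ν(E) − ∫_E f dµ| ≤ µ(E)/N on every subset E.

```python
from fractions import Fraction
from itertools import combinations
from math import floor

# Hypothetical finite measure space X = {a, b, c, d}; weights are illustrative.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6), "d": Fraction(0)}
nu = {"a": Fraction(1, 5), "b": Fraction(7, 10), "c": Fraction(1, 10), "d": Fraction(0)}
# nu << mu holds here: nu vanishes at the only mu-null point "d".

def measure(m, E):
    return sum((m[x] for x in E), Fraction(0))

def grid_density(N):
    """f_N(x) = max{k/N : x in P_{k/N}}, i.e. the largest k/N with nu{x} >= (k/N) mu{x}."""
    f = {}
    for x in mu:
        if mu[x] == 0:
            f[x] = Fraction(0)       # the value on a mu-null set does not matter
        else:
            f[x] = Fraction(floor(N * nu[x] / mu[x]), N)
    return f

def worst_gap(N):
    """Largest value of nu(E) - integral_E f_N dmu over all subsets E."""
    f = grid_density(N)
    points = sorted(mu)
    gaps = []
    for r in range(len(points) + 1):
        for E in combinations(points, r):
            integral = sum((f[x] * mu[x] for x in E), Fraction(0))
            gap = measure(nu, E) - integral
            assert 0 <= gap <= measure(mu, E) / N   # the estimate from the proof
            gaps.append(gap)
    return max(gaps)

for N in (1, 7, 10, 1000):
    print(N, worst_gap(N))
```

As N grows the gap shrinks to 0, and the limiting density is the pointwise ratio ν{x}/µ{x} wherever µ{x} > 0.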
4.3.3. EXAMPLE. On ([0, 1], Bor([0, 1])), consider Lebesgue measure m and
counting measure $m_c$. Then $m \ll m_c$ but there is no function $f \in L^+$ so that
$m(E) = \int_E f \,dm_c$. But $m_c$ is not σ-finite, and this shows the necessity of this
condition in the Radon-Nikodym Theorem.
Note that Re ν and Im ν are finite signed measures. Thus there are positive
finite measures νi for 1 ≤ i ≤ 4 so that Re ν = ν1 − ν2 and Im ν = ν3 − ν4 . Hence
ν = ν1 − ν2 + iν3 − iν4 .
4.3.8. EXAMPLE. The main example (and essentially the only example) is obtained
as follows: let µ be a measure and let $f \in L^1(\mu)$. Define $\nu(E) = \int_E f \,d\mu$.
You can readily check that this is a complex measure. We will write $d\nu = f \,d\mu$
and $\frac{d\nu}{d\mu} = f$ in this case.
The second statement of the following result is also sometimes called the Radon-
Nikodym Theorem.
PROOF. With the notation as in the discussion preceding the proposition, let
$\mu = \nu_1 + \nu_2 + \nu_3 + \nu_4$. Then Re ν ≪ µ and Im ν ≪ µ. By the Radon-Nikodym
Theorem, there are $f, g \in L^1(\mu)$ so that $d\,\mathrm{Re}\,\nu = f \,d\mu$ and $d\,\mathrm{Im}\,\nu = g \,d\mu$. Hence
$\nu(E) = \int_E (f + ig) \,d\mu$. Define $|\nu|(E) = \int_E |f + ig| \,d\mu$ and $h = \operatorname{sign}(f + ig)$.
Then |h| = 1 a.e.(|ν|) and $\nu(E) = \int_E h \,d|\nu|$. Also $|\nu|(X) = \|f + ig\|_1 < \infty$; so
|ν| is finite. Uniqueness is left as an exercise.
Now if µ is σ-finite, the Lebesgue decomposition for |ν| yields $|\nu| = |\nu|_a + |\nu|_s$
where $d|\nu|_a = f \,d\mu$ for $f \in L^+ \cap L^1(\mu)$ and $|\nu|_s$ is supported on a µ-null set A.
Then $d\nu = h \,d|\nu| = h \,d|\nu|_a + h \,d|\nu|_s = hf \,d\mu + h \,d|\nu|_s$. Thus $d\nu_a = hf \,d\mu \ll \mu$
with $hf \in L^1(\mu)$, and $d\nu_s = h \,d|\nu|_s$ is supported on the µ-null set A.
Lp spaces
Set
N = {f measurable, complex valued : f = 0 a.e.(µ)}
and define $L^p(\mu)$ = “$L^p(\mu)$”/N with the norm $\|[f]\|_p = \|f\|_p$.
“$L^\infty(\mu)$” = {f measurable, complex valued : $\|f\|_\infty$ = ess sup |f| < ∞} where
ess sup |f| = sup{t ≥ 0 : µ({x : |f(x)| > t}) > 0}. Set $L^\infty(\mu)$ = “$L^\infty(\mu)$”/N
with norm $\|[f]\|_\infty = \|f\|_\infty$.
By convention, we write elements of $L^p(\mu)$ as f, where the equivalence class is
understood.
Note that “$L^p$” is a linear space because if $f, g \in L^p(\mu)$, then $\lambda f \in L^p(\mu)$ with
$\|\lambda f\|_p = |\lambda|\,\|f\|_p$ for λ ∈ C; and
\[ |f + g|^p \le (2 \max\{|f|, |g|\})^p \le 2^p (|f|^p + |g|^p), \]
whence $\|f + g\|_p^p \le 2^p (\|f\|_p^p + \|g\|_p^p) < \infty$. Notice that N is a subspace of
“$L^p(\mu)$”, so that $L^p(\mu)$ is also a vector space. Moreover, if f is measurable, then
$\|f\|_p^p = \int |f|^p \,d\mu = 0$ if and only if f = 0 a.e.(µ) if and only if f ∈ N. In
particular, $\|\cdot\|_p$ is only a seminorm on “$L^p(\mu)$”. However, for $f \in L^p(\mu)$, we
have $\|f\|_p = 0$ if and only if f = 0. To verify that $\|\cdot\|_p$ is a norm on $L^p(\mu)$, we
need to verify the triangle inequality, which means eliminating the annoying 2 in
the inequality above.
A function belongs to “L∞ (µ)” if it agrees with a bounded function a.e.(µ).
The triangle inequality is very easy here. It is also very easy for L1 (µ) (see sec-
tion 3.4).
Now that we have established that Lp (µ) is a normed space, we will show
that it is complete, and so is a Banach space. This extends the result for L1 (µ),
Theorem 3.4.2. This is another important piece of evidence that this is the right
context for integration.
PROOF. Suppose that $(f_n)_{n\ge1}$ is a Cauchy sequence in $L^p(\mu)$. Select a subsequence
$(n_j)_{j\ge1}$ so that $\|f_{n_j} - f_m\|_p < 2^{-j}$ for all $m > n_j$. Define
\[ h_k = |f_{n_1}| + \sum_{j=1}^{k-1} |f_{n_{j+1}} - f_{n_j}| \quad\text{and}\quad h = \lim_{k\to\infty} h_k. \]
Since $h_k^p$ are increasing to $h^p$, the MCT shows that $\|h\|_p \le \|f_{n_1}\|_p + 1$ when
p < ∞. For p = ∞, $\|h\|_\infty \le \|f_{n_1}\|_\infty + 1$ follows directly. So $h \in L^p(\mu)$, and in
particular, h(x) < ∞ a.e.(µ).
Note that $f_{n_k} = f_{n_1} + \sum_{j=1}^{k-1} (f_{n_{j+1}} - f_{n_j})$. Let
\[ f = f_{n_1} + \sum_{j\ge1} (f_{n_{j+1}} - f_{n_j}) = \lim_{k\to\infty} f_{n_k}. \]
This series converges absolutely whenever h(x) < ∞, and so is defined a.e.(µ).
Moreover |f| ≤ h, so $f \in L^p(\mu)$. Also $|f - f_{n_k}|^p \le \big(\sum_{j\ge k} |f_{n_j} - f_{n_{j+1}}|\big)^p \le h^p$
when p < ∞. Thus by the LDCT,
\[ \lim_{k\to\infty} \|f - f_{n_k}\|_p^p = \lim_{k\to\infty} \int |f - f_{n_k}|^p \,d\mu = \int \lim_{k\to\infty} |f - f_{n_k}|^p \,d\mu = 0. \]
When p = ∞, $\|f - f_{n_k}\|_\infty \le \sum_{j\ge k} \|f_{n_j} - f_{n_{j+1}}\|_\infty < 2^{1-k}$. Therefore $f_{n_k}$
converges to f in $L^p(\mu)$. Since $(f_n)_{n\ge1}$ is Cauchy, in fact $f_n \to f$ in $L^p(\mu)$. So
$L^p(\mu)$ is complete.
5.1.4. EXAMPLES.
(1) $L^p(0, 1) = L^p((0, 1), m)$. Here Lebesgue measure, m, is a probability measure
on (0, 1). If 1 ≤ p < r < ∞ and $f \in L^r(0, 1)$, then
\[ \|f\|_p^p = \int |f|^p \,dm \le \int_{|f|\le1} 1 \,dm + \int_{|f|>1} |f|^r \,dm \le 1 + \|f\|_r^r < \infty. \]
for p < ∞, and $L^\infty(\mathbb{N}, m_c) = l^\infty = \{(a_i)_{i\ge1} : \|(a_i)\|_\infty = \sup_{i\ge1} |a_i| < \infty\}$. If
(3) Let µ be the measure on (N, P(N)) given by µ(A) = ∞ if A ≠ ∅ and µ(∅) =
0. Then $L^p(\mu) = \{0\}$ if 1 ≤ p < ∞ and $L^\infty(\mu) = l^\infty$. You need sets of non-zero
finite measure to get interesting functions in $L^p(\mu)$.
So ϕn converge to f in Lp (µ).
When p = ∞, suppose that $\|f\|_\infty = N < \infty$. For n ≥ 1, and $(j, k) \in
[-nN, nN]^2$, let $A_{j,k} = \{x : \frac{j}{n} \le \mathrm{Re} f(x) < \frac{j+1}{n},\ \frac{k}{n} \le \mathrm{Im} f(x) < \frac{k+1}{n}\}$. These
are measurable sets, and $\varphi_n = \sum_{j,k=-nN}^{nN} \frac{j+ik}{n} \chi_{A_{j,k}}$ satisfies $\|f - \varphi_n\|_\infty \le \frac{\sqrt2}{n}$.
So the simple functions are dense. If µ(A) = ∞, then $\chi_A$ is not the limit of simple
functions with finite support.
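The bound √2/n in the case p = ∞ is just the diagonal of a square of side 1/n in the complex plane. A quick numerical sketch, assuming randomly sampled values of f in the square [−2, 2]² (the sample and the name `phi` are illustrative, not from the notes):

```python
import math
import random

def phi(n, z):
    """Round z down to the grid (j + ik)/n, as the simple function phi_n does."""
    return complex(math.floor(n * z.real), math.floor(n * z.imag)) / n

random.seed(0)
# Hypothetical values f(x), sampled from the square [-2, 2] x [-2, 2].
samples = [complex(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(1000)]

for n in (1, 10, 100):
    err = max(abs(z - phi(n, z)) for z in samples)
    print(n, err, math.sqrt(2) / n)   # observed error versus the bound
```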
Recall that if X is a topological space, then $C_c(X)$ denotes the space of continuous
functions on X with compact support (i.e. the closure of {x : f(x) ≠ 0} is compact).
Hence $h = \sum_{i=1}^n a_i h_i$ belongs to $C_c(\mathbb{R})$ and
\[ \|\varphi - h\|_p \le \sum_{i=1}^n |a_i| \,\|\chi_{A_i} - h_i\|_p < nL \cdot \frac{\varepsilon}{2nL} = \frac{\varepsilon}{2}. \]
5.2.1. DEFINITION. Let $(V, \|\cdot\|)$ be a normed vector space over F ∈ {R, C}.
Let L(V, F) denote the vector space of linear maps of V into the scalars, called linear
functionals, and let $V^*$ denote the dual space of V consisting of all continuous
linear functionals.
5.2.2. PROPOSITION. Let $(V, \|\cdot\|)$ be a normed vector space over F, and let
ϕ ∈ L(V, F). The following are equivalent:
(1) ϕ is continuous.
(2) $\|\varphi\| := \sup\{|\varphi(v)| : \|v\| \le 1\} < \infty$.
(3) ϕ is continuous at v = 0.
PROOF. (2) ⇒ (1). If u ≠ v ∈ V, set $w = \frac{u-v}{\|u-v\|}$ and note that
λ ∈ F, then
\[ \|\lambda\varphi\| = \sup_{\|v\|\le1} |\lambda\varphi(v)| = |\lambda| \sup_{\|v\|\le1} |\varphi(v)| = |\lambda| \,\|\varphi\|. \]
PROOF. This is just the AM-GM inequality. The function $f(x) = e^x$ is strictly
convex. So $e^{t\alpha + (1-t)\beta} \le t e^\alpha + (1 - t) e^\beta$, with equality only when α = β or t = 0
or 1. Take $a = e^\alpha$ and $b = e^\beta$ and the result follows.
PROOF. We may suppose that f, g are non-zero since the inequality is trivial
if fg = 0 a.e.(µ). Let $f_0 = |f|/A$ where $A = \|f\|_p$ and $g_0 = |g|/B$ where
$B = \|g\|_q$. Apply Lemma 5.3.1 to $a = f_0(x)^p$ and $b = g_0(x)^q$ with $t = \frac1p$ and
$1 - t = \frac1q$. Then
\[ \frac{|f(x) g(x)|}{AB} = f_0(x) g_0(x) \le \frac1p f_0(x)^p + \frac1q g_0(x)^q = \frac{1}{pA^p} |f(x)|^p + \frac{1}{qB^q} |g(x)|^q. \]
Integrate to get
\[ \frac{\|fg\|_1}{AB} \le \frac{1}{pA^p} \|f\|_p^p + \frac{1}{qB^q} \|g\|_q^q = \frac1p + \frac1q = 1. \]
Hence $\|fg\|_1 \le AB = \|f\|_p \|g\|_q$.
Equality holds in the first inequality only when $f_0(x)^p = g_0(x)^q$. For it to hold
for the integral, this identity must hold a.e.(µ). Hence $|g|^q = \frac{B^q}{A^p} |f|^p$ a.e.(µ).
For more elementary reasons, if $f \in L^1(\mu)$ and $g \in L^\infty(\mu)$, we have that
$fg \in L^1(\mu)$ and $\|fg\|_1 \le \|f\|_1 \|g\|_\infty$.
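For counting measure on a finite set, the inequality and its equality case are easy to test numerically. The following is only a sanity check on hypothetical random data, with p = 3 chosen arbitrarily; the helper `norm` is introduced here for illustration.

```python
import math
import random

def norm(v, p):
    """The l^p norm of a finite sequence (counting measure on a finite set)."""
    return sum(abs(t) ** p for t in v) ** (1.0 / p)

random.seed(1)
p = 3.0
q = p / (p - 1.0)                      # conjugate exponent: 1/p + 1/q = 1
f = [random.uniform(-1, 1) for _ in range(50)]
g = [random.uniform(-1, 1) for _ in range(50)]

lhs = sum(abs(a * b) for a, b in zip(f, g))
rhs = norm(f, p) * norm(g, q)
print(lhs, "<=", rhs)

# Equality case: |g|^q proportional to |f|^p, for instance g = |f|^(p-1).
g_eq = [abs(a) ** (p - 1.0) for a in f]
lhs_eq = sum(abs(a * b) for a, b in zip(f, g_eq))
rhs_eq = norm(f, p) * norm(g_eq, q)
print(lhs_eq, "==", rhs_eq, "(up to rounding)")
```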
PROOF. First take 1 < p < ∞, so that q < ∞. Suppose first that g is real
valued. Choose simple functions $\psi_n$ so that $|\psi_n| \le |\psi_{n+1}| \le |g|$ and $\psi_n \to g$.
Since µ is σ-finite, we can write $X = \bigcup_{n\ge1} X_n$ where $X_n \subset X_{n+1}$ and $\mu(X_n) < \infty$
for all n. Then $\varphi_n = \psi_n \chi_{X_n}$ are simple functions with finite support such that
$|\varphi_n| \le |\varphi_{n+1}| \le |g|$ and $\varphi_n \to g$. Analogous to the previous lemma, define
\[ f_n = \frac{|\varphi_n|^{q-1} \operatorname{sign}(g)}{\|\varphi_n\|_q^{q-1}}. \]
Note that sign(g) takes only the values ±1, 0 and so $f_n$ is a simple function of finite
support. As in the previous lemma, $\|f_n\|_p = 1$. Therefore
\[ \begin{aligned}
M &\ge \sup_{n\ge1} \int f_n g \,d\mu = \sup_{n\ge1} \int \frac{|\varphi_n|^{q-1} |g|}{\|\varphi_n\|_q^{q-1}} \,d\mu \\
&\ge \sup_{n\ge1} \int \frac{|\varphi_n|^q}{\|\varphi_n\|_q^{q-1}} \,d\mu = \sup_{n\ge1} \|\varphi_n\|_q = \|g\|_q.
\end{aligned} \]
$\|\Phi_g\| = \|g\|_q$. Proposition 5.1.5 shows that simple functions of finite support are
dense in $L^p(\mu)$, and hence the optimal constant M must be $\|\Phi_g\|$; so $\|g\|_q \le M$.
For p = 1, we need to show that $\|g\|_\infty \le M$. If this fails, then for some
N > M, {x : |g(x)| > N} has positive measure. Hence there is a θ ∈ R
so that $A = \{x : \mathrm{Re}\, e^{i\theta} g(x) > M\}$ has positive measure. Let E ⊂ A have
positive and finite measure. Define $f = e^{i\theta} \mu(E)^{-1} \chi_E$. Then $\|f\|_1 = 1$ and
$M \ge \mathrm{Re} \int f g \,d\mu > M$. This is a contradiction, and hence $\|g\|_\infty \le M$.
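PROOF OF THEOREM 5.3.4. First assume that µ(X) < ∞. Let $\Phi \in L^p(\mu)^*$.
Define $\nu(E) = \Phi(\chi_E)$ for $E \in \mathcal{B}$. Note that ν(∅) = Φ(0) = 0. Also if
$E = \dot{\bigcup}_{i\ge1} E_i$ is a disjoint union of sets in $\mathcal{B}$, then
\[ \Big\| \chi_E - \sum_{i=1}^n \chi_{E_i} \Big\|_p^p = \mu\Big( \dot{\bigcup}_{i>n} E_i \Big) \to 0 \quad\text{as } n \to \infty. \]
Since Φ is continuous,
\[ \nu(E) = \Phi(\chi_E) = \lim_{n\to\infty} \Phi\Big( \sum_{i=1}^n \chi_{E_i} \Big) = \sum_{i\ge1} \nu(E_i). \]
So Φ and $\Phi_g$ agree on the dense subspace $\bigcup_{n\ge1} L^p(X_n, \mu|_{X_n})$ and they are con-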
Moreover this same computation shows that any element of L∞ (mc ) determines a
distinct continuous functional on L1 (µ).
CHAPTER 6
Some Topology
This is a brief introduction skewed to certain things needed in the next section.
This is not intended as a comprehensive introduction to point set topology. A good
general reference is Willard’s book [3].
6.1.2. E XAMPLES .
(1) If (X, d) is a metric space, then U is open if for every x ∈ U , there is an r > 0
so that the open ball br (x) ⊂ U .
(2) If X is any set, the discrete topology has τd = P(X), the collection of all
subsets of X.
(3) If X is any set, the trivial topology has τ = {∅, X}.
(4) If (X, ≤) is a totally ordered set, the intervals (a, b) = {x ∈ X : a < x < b},
(−∞, b) = {x ∈ X : x < b} and (a, ∞) = {x ∈ X : x > a} are open, and the
topology consists of arbitrary unions of such intervals.
(5) If (X, τ) is a topological space and Y ⊂ X, the induced topology on Y is $\tau|_Y =
\{U \cap Y : U \in \tau\}$.
6.1.3. DEFINITION. A set F ⊂ X is closed if $F^c$ is open. If A ⊂ X, the
closure of A is $\overline{A} = \bigcap \{F : A \subset F,\ F \text{ closed}\}$. A point in $\overline{A}$ is called a limit point
of A.
If A ⊂ X, then a ∈ A is an interior point of A if there exists U ∈ τ with
a ∈ U ⊂ A. If A ⊂ X, the interior of A is $A^\circ$ or $\operatorname{int} A = \bigcup \{U \in \tau : U \subset A\}$.
6.1 Topological spaces 75
6.1.4. PROPOSITION.
(1) Finite unions and arbitrary intersections of closed sets are closed.
(2) $\overline{A}$ is the smallest closed set containing A.
(3) $x \in \overline{A}$ if and only if every U ∈ τ with x ∈ U has A ∩ U ≠ ∅.
(4) $\overline{A} = A^{c \circ c}$ is the complement of the interior of $A^c$.
PROOF. Since open sets are closed under arbitrary unions and finite intersections,
the collection of closed sets is closed under arbitrary intersections and finite
unions. Hence the intersection of all closed sets F ⊃ A is closed, and is thus the
smallest closed set containing A. Now $x \in \overline{A}$ if and only if x ∈ F for every closed
F ⊃ A if and only if x ∉ U whenever U is open and disjoint from A. Finally
\[ X \setminus \overline{A} = \bigcup \{U \in \tau : U \cap A = \emptyset\} = \bigcup \{U \in \tau : U \subset A^c\} = A^{c \circ}. \]
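On a finite topological space, closure and interior can be computed directly from the definitions, which gives a brute-force check of part (4). The three-point topology below is a hypothetical example chosen for this sketch.

```python
from itertools import combinations

X = frozenset({0, 1, 2})
# A hypothetical topology on X: contains the empty set and X, and is
# closed under unions and intersections.
tau = [frozenset(), frozenset({0}), frozenset({0, 1}), X]

def interior(A):
    """Union of all open sets contained in A."""
    result = frozenset()
    for U in tau:
        if U <= A:
            result |= U
    return result

def closure(A):
    """Intersection of all closed sets containing A."""
    result = X
    for U in tau:
        F = X - U                     # closed sets are complements of open sets
        if A <= F:
            result &= F
    return result

def subsets(S):
    items = sorted(S)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

for A in subsets(X):
    # Part (4): the closure of A is the complement of the interior of A^c.
    print(sorted(A), sorted(closure(A)), closure(A) == X - interior(X - A))
```

Here the closure of {0} is all of X, while {2} is already closed.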
6.1.8. EXAMPLES.
(1) If (X, d) is a metric space, then $\{b_{1/n}(x) : x \in X,\ n \ge 1\}$ is a base for the
topology.
(2) $\{(r, s) : r < s \in \mathbb{Q}\}$ is a base for the topology of R.
(3) Let C[0, 1] denote the space of continuous functions on [0, 1]. For each x ∈
[0, 1], a ∈ C and r > 0, let $U(x, a, r) = \{f \in C[0, 1] : f(x) \in b_r(a)\}$. Let τ be the
topology generated by these sets. This is the topology of pointwise convergence.
An open neighbourhood of f must contain a set of the form
\[ \{g \in C[0, 1] : |g(x_i) - f(x_i)| < r \text{ for } 1 \le i \le n\} \]
for $x_1, \dots, x_n \in [0, 1]$ and r > 0.
6.1.10. EXAMPLES.
(1) If (X, d) is a metric space and x ∈ X, then $\{b_{1/n}(x) : n \ge 1\}$ is a countable
base of neighbourhoods of x. If X is separable, and $\{x_i : i \ge 1\}$ is dense in X,
then $\{b_{1/n}(x_i) : i \ge 1,\ n \ge 1\}$ is a base for τ. Indeed, suppose that x ∈ U is
open. Pick r > 0 so that $b_r(x) \subset U$ and $x_i$ so that $d(x, x_i) < 1/n < r/2$. Then
$x \in b_{1/n}(x_i) \subset U$. So X is second countable. In particular, compact metric spaces
are separable and so second countable.
(2) Consider the discrete topology $\tau_d$ on a set X. Since the topology is generated
by $\{\{x\} : x \in X\}$, X is always first countable. However it is second countable if
and only if X is countable if and only if X is separable.
6.2. Continuity
6.2.1. D EFINITION . A function f : (X, τ ) → (Y, σ) between topological
spaces is continuous if for all V ⊂ Y open, the set f −1 (V ) is open in X. Say that
f is a homeomorphism if f is a bijection such that both f and f −1 are continuous.
6.2.2. EXAMPLES.
(1) The identity maps $(X, \text{discrete}) \xrightarrow{\ \mathrm{id}\ } (X, \tau) \xrightarrow{\ \mathrm{id}\ } (X, \text{trivial})$ are continuous
bijections; however in both cases the inverse will be discontinuous provided that τ satisfies
$\{\emptyset, X\} \subsetneq \tau \subsetneq \mathcal{P}(X)$.
(4) Let X = {0, 1}. Let τ = {∅, {0}, X}. Then {1} is closed, but {0} is not, and
the closure of {0} is X. If f : X → R is continuous, then f is constant.
(5) Let X = [0, 1) ∪ {a, b}. Let the open sets in τ be U ⊂ [0, 1) which are open in
the usual metric on [0, 1) together with sets U ∪ (r, 1) ∪ {a}, U ∪ (r, 1) ∪ {b} and
U ∪ (r, 1) ∪ {a, b} for r < 1. Here the points {a} and {b} are closed because the
complement is open. However if a ∈ U and b ∈ V are open sets, then U ∩ V ⊃
(r, 1) for some r < 1. That means that you cannot separate a and b from one
another by open sets. If f : X → R is continuous, then f (a) = f (b).
6.2.3. DEFINITION. Let $C^b(X)$ and $C^b_{\mathbb{R}}(X)$ or $C^b(X, \mathbb{R})$ denote the normed
vector space of bounded continuous functions from X into C and R, respectively,
with norm $\|f\|_\infty = \sup_X |f(x)|$. Similarly, $C(X)$ and $C_{\mathbb{R}}(X)$ or $C(X, \mathbb{R})$ denote
the vector space of continuous functions from X into C and R, respectively.
Consider the Examples 6.2.2 (4) and (5) in light of this proposition.
Recall that $f_n \in C^b(X)$ converge uniformly to a function f if $\|f - f_n\|_\infty \to 0$.
The following standard result for metric spaces extends easily.
$f_n^{-1}(b_{r/3}(f_n(x)))$ is open. If y ∈ V, then $|f_n(y) - f_n(x)| < r/3$, so
\[ \begin{aligned}
|f(y) - f(x)| &\le |f(y) - f_n(y)| + |f_n(y) - f_n(x)| + |f_n(x) - f(x)| \\
&< \|f - f_n\|_\infty + \frac{r}{3} + \|f - f_n\|_\infty < r.
\end{aligned} \]
Hence $f(y) \in b_r(f(x)) \subset U$. Thus $V \subset f^{-1}(U)$. So f is continuous.
The norm $\|f\|_\infty$ makes $C^b(X)$ into a normed vector space. In view of Proposition 6.2.5,
the following is most interesting when X is Hausdorff.
6.3. Compactness
6.3.1. DEFINITION. An open cover of a set A ⊂ X is a collection of open sets
$\{U_\lambda : \lambda \in \Lambda\}$ such that $A \subset \bigcup_\Lambda U_\lambda$. A set A is compact if every open cover has a
finite subcover, i.e., a finite subset $U_{\lambda_1}, \dots, U_{\lambda_n}$ such that $A \subset \bigcup_{i=1}^n U_{\lambda_i}$.
6.3.2. E XAMPLE . In 6.2.2 (4), the point {0} is compact but not closed.
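PROOF. Suppose that X is compact and $\mathcal{F}$ has FIP. Define open sets $U_\lambda = A_\lambda^c$.
If $\bigcap \mathcal{F} = \emptyset$, then $\bigcup_\Lambda U_\lambda = \big(\bigcap \mathcal{F}\big)^c = X$. So $\mathcal{U} = \{U_\lambda : \lambda \in \Lambda\}$ is an
open cover of X. By compactness, there is a finite subcover $U_{\lambda_1}, \dots, U_{\lambda_n}$. Hence
$\bigcap_{i=1}^n A_{\lambda_i} = \big(\bigcup_{i=1}^n U_{\lambda_i}\big)^c = \emptyset$, contradicting FIP. Therefore $\bigcap \mathcal{F} \ne \emptyset$.
Conversely, suppose that $\mathcal{U} = \{U_\lambda : \lambda \in \Lambda\}$ is an open cover of X. Define
closed sets $A_\lambda = U_\lambda^c$. If there is no finite subcover, then $\bigcap_{i=1}^n A_{\lambda_i} = \big(\bigcup_{i=1}^n U_{\lambda_i}\big)^c \ne \emptyset$;
and thus $\mathcal{F} = \{A_\lambda : \lambda \in \Lambda\}$ has FIP. But then $\bigcap \mathcal{F} \ne \emptyset$. Therefore
$\bigcup_\Lambda U_\lambda = \big(\bigcap \mathcal{F}\big)^c \ne X$, contradicting the fact that $\mathcal{U}$ is an open cover. Hence
$\mathcal{F}$ does not have FIP, so there is a finite set $A_{\lambda_1}, \dots, A_{\lambda_n}$ such that $\bigcap_{i=1}^n A_{\lambda_i} = \emptyset$.
Hence $\bigcup_{i=1}^n U_{\lambda_i} = \big(\bigcap_{i=1}^n A_{\lambda_i}\big)^c = X$. Therefore X is compact.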
6.3.9. R EMARKS .
(1) The product topology τ consists of arbitrary unions of finite intersections of the
subbase. So if $\lambda_1, \dots, \lambda_n \in \Lambda$ and $U_i \in \tau_{\lambda_i}$, then the sets of the form
\[ U_1 \times \cdots \times U_n \times \prod_{\mu \in \Lambda \setminus \{\lambda_i : 1 \le i \le n\}} X_\mu \]
(2) If Λ is finite, this is a familiar construction in the metric space case. Indeed, if
$(X_i, d_i)$ are metric spaces for 1 ≤ i ≤ n, then
\[ D\big( (x_1, \dots, x_n), (y_1, \dots, y_n) \big) = \max\{ d_i(x_i, y_i) : 1 \le i \le n \} \]
is a metric on the product, and the metric topology coincides with the product
topology.
(3) When Λ is infinite, it often requires the Axiom of Choice to be able to say that
X is non-empty.
6.3.10. THEOREM. If $X_i$ are compact for 1 ≤ i ≤ n, then $X = \prod_{i=1}^n X_i$ is
compact.
open cover of Y. Let $V_{y_1}, \dots, V_{y_m}$ be a finite subcover. Then the finite collection
$\{U_{x_i^{y_j}, y_j} \times V_{x_i^{y_j}, y_j} : 1 \le i \le n_{y_j},\ 1 \le j \le m\}$ covers X × Y. Therefore
$\{W_{\lambda(x_i^{y_j}, y_j)} : 1 \le i \le n_{y_j},\ 1 \le j \le m\}$ covers X × Y.
6.4.2. R EMARKS .
(1) The $T_0$ property is very weak. Example 6.2.2(4) is $T_0$ but not $T_1$.
(2) $T_1$ is also a very weak property. It implies $T_0$ since if x ≠ y, the set $\{x\}^c$ is
an open set containing y but not x. To be $T_1$ it is enough to find open sets U ∋ x
with y ∉ U and V ∋ y with x ∉ V. (Exercise.) Example 6.2.2(5) is $T_1$ but not
Hausdorff. Points are closed in Hausdorff spaces, so $T_2$ implies $T_1$.
(3) The trivial topology is regular because there are no points which are disjoint
from a non-empty closed set. But throwing in the $T_1$ condition makes a $T_3$ space
Hausdorff because you can take your closed set to be {y}.
(4) Completely regular spaces are regular, because if x ∉ A and A is closed, let
f : X → [0, 1] be a continuous function with f(x) = 1 and $f|_A = 0$. Then
U = {y : f(y) > 1/2} and V = {y : f(y) < 1/2} are disjoint open sets with
x ∈ U and A ⊂ V. Again we need to add the $T_1$ property to exclude examples like
the trivial topology.
(5) The $T_1$ property ensures that $T_4$ spaces are Hausdorff. A metric space (X, d) is
normal. If A and B are disjoint closed sets, let U = {x : d(x, A) < d(x, B)} and
V = {x : d(x, A) > d(x, B)}. It is easy to check that these are disjoint open sets.
6.4.5. REMARK. If (X, d) is a metric space and A and B are disjoint closed
sets, define
\[ f(x) = \frac{d(x, A)}{d(x, A) + d(x, B)}. \]
This satisfies the conclusion of Urysohn’s Lemma.
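As a concrete instance of this formula, take X = R with the hypothetical disjoint closed sets A = [0, 1] and B = [2, 3]. The sketch below (names `dist_to_interval` and `urysohn` are illustrative) evaluates f and shows it is 0 on A, 1 on B, and between 0 and 1 elsewhere.

```python
def dist_to_interval(x, a, b):
    """Distance from x to the closed interval [a, b] in R."""
    return max(a - x, 0.0, x - b)

A = (0.0, 1.0)   # hypothetical closed set A = [0, 1]
B = (2.0, 3.0)   # hypothetical closed set B = [2, 3], disjoint from A

def urysohn(x):
    d_a = dist_to_interval(x, *A)
    d_b = dist_to_interval(x, *B)
    # The denominator is positive: no point lies in both closed sets.
    return d_a / (d_a + d_b)

for x in (-1.0, 0.5, 1.25, 1.5, 1.75, 2.5, 4.0):
    print(x, urysohn(x))
```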
PROOF. After scaling, we may assume that the range is [−1, 1]. Let $A_1 =
f^{-1}([-1, -\frac13])$ and $B_1 = f^{-1}([\frac13, 1])$. By Urysohn’s Lemma, there is a function
$g_1 : X \to [-\frac13, \frac13]$ so that $g_1|_{A_1} = -\frac13$ and $g_1|_{B_1} = \frac13$. Then $f_1 = f - g_1|_A$ has range
in $[-\frac23, \frac23]$. Repeat the process, setting $A_2 = f_1^{-1}([-\frac23, -\frac29])$ and $B_2 = f_1^{-1}([\frac29, \frac23])$,
and finding $g_2 : X \to [-\frac29, \frac29]$ with $g_2|_{A_2} = -\frac29$ and $g_2|_{B_2} = \frac29$.
Then $f_2 = f_1 - g_2|_A$ has range in $[-(\frac23)^2, (\frac23)^2]$. Recursively we obtain functions
$g_n : X \to [-2^{n-1} 3^{-n}, 2^{n-1} 3^{-n}]$ so that $f_n = f - \sum_{i=1}^n g_i|_A$ has range in
$[-(\frac23)^n, (\frac23)^n]$. Let $g = \sum_{n\ge1} g_n$. Then $g|_A = f$ and
\[ \|g\|_\infty \le \sum_{n\ge1} \|g_n\|_\infty = \sum_{n\ge1} 2^{n-1} 3^{-n} = 1. \]
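The sup-norm bounds in this construction are 1/3 for g_1 and 2/9 for g_2, and each further step shrinks the bound by the factor 2/3. Assuming the general bound is 2^(n−1)/3^n, a short exact computation confirms that the partial sums equal 1 − (2/3)^N, so the series sums to 1 and the remainder f − Σ g_i has size at most (2/3)^N. The helper names below are illustrative.

```python
from fractions import Fraction

def bound(n):
    """Sup-norm bound for g_n: 1/3, 2/9, 4/27, ..."""
    return Fraction(2 ** (n - 1), 3 ** n)

def partial_sum(N):
    return sum((bound(n) for n in range(1, N + 1)), Fraction(0))

for N in (1, 2, 5, 10, 20):
    print(N, partial_sum(N), 1 - Fraction(2, 3) ** N)
```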
P ROOF. The setup is the same as the previous proof. Define f |B = g and
f |C = 0, and extend this to a continuous function on W using Tietze’s Theorem.
Then extend f to all of X by setting f |W c = 0. This is continuous with compact
support as in the previous proof.
6.5. Nets*
Sequences are not sufficient for dealing with convergence in general topolog-
ical spaces, including many that arise in normal contexts. The replacement is the
notion of a net, which you can think of as a very wide and very long generalized
sequence. This optional section is not required in the next chapter.
We will need to call on the Axiom of Choice frequently. Recall that this is the
assumption that whenever $\{A_\lambda : \lambda \in \Lambda\}$ is a collection of non-empty sets, there is
a selection (choice function) $\varphi : \Lambda \to \bigcup_\Lambda A_\lambda$ so that $\varphi(\lambda) \in A_\lambda$ for all λ ∈ Λ.
We present a detailed example to explain why nets are needed, and how to use
them.
(0, 1), (1, 0), (0, 2), (1, 1), (2, 0), (0, 3), (1, 2), (2, 1), (3, 0), . . . .
Define $x_U$ to be the least element in this list which belongs to U. (This avoids any
issues with the Axiom of Choice.) Then $(x_U)$ converges to (0, 0) because given an
open neighbourhood U ∋ (0, 0), we have $x_V \in V \subset U$ whenever U ≤ V.
(e) The sequence (0, 1), (1, 0), (0, 2), (1, 1), (2, 0), (0, 3), (1, 2), (2, 1), . . . has
a subnet converging to (0, 0). Let Λ be the net just constructed. Define ϕ(U ) = xU
considered as an element in this sequence. To see that this map is cofinal, let
(m0 , n0 ) be in this sequence, and set N0 = m0 + n0 . Let
U0 = X \ {(m, n) : 1 ≤ m + n ≤ N0 }.
6.5.5. T HEOREM . Let f : (X, τ ) → (Y, σ). Then f is continuous if and only if
whenever (xλ )Λ is a net in X converging to x, it follows that f (x) = limΛ f (xλ ).
We call µ a regular measure if it is both inner and outer regular on all Borel sets.
A Borel measure µ is a Radon measure if µ(K) < ∞ for all compact sets K, it is
outer regular on all Borel sets and inner regular on open sets.
7.1.2. R EMARKS .
(1) Some books define a Radon measure to be a regular measure which takes finite
values on all compact sets. This is somewhat less general (so we would require an
additional hypothesis such as σ-compactness), and does not simplify the proof of
the main result.
(2) If µ(E) < ∞, then µ is regular on E if and only if for ε > 0, there is an open
set U and a compact set K so that K ⊂ E ⊂ U and µ(U \ K) < ε.
(3) There are compact Hausdorff spaces which support finite Borel measures which
are not regular. See the exercises in [2].
P ROOF. Let E ∈ Bor(X) such that µ(E) < ∞, and let ε > 0. By outer
regularity, there is an open U ⊃ E with µ(U \ E) < ε/2. Then there is an open
set V ⊃ U \ E with µ(V ) < ε/2. Since µ is inner regular on U , there is a compact
L ⊂ U with µ(U \ L) < ε/2. Then K = L \ V is compact and K ⊂ U \ V ⊂ E.
Finally,
µ(E \ K) < µ(U \ K) ≤ µ(U \ L) + µ(V ) < ε.
Thus µ is regular on E.
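If E is a σ-finite set, write $E = \dot{\bigcup}_{n\ge1} E_n$ where $\mu(E_n) < \infty$. Find compact
sets $K_n \subset E_n$ with $\mu(E_n \setminus K_n) < 2^{-n}\varepsilon$. Then $C_p = \bigcup_{n=1}^p K_n$ are compact. Set
$K = \bigcup_{n\ge1} K_n$. Then K ⊂ E and $\mu(E \setminus K) \le \sum_{n\ge1} \mu(E_n \setminus K_n) < \varepsilon$. By
continuity from below, we get $\lim_{p\to\infty} \mu(C_p) = \mu(K) > \mu(E) - \varepsilon$. Therefore µ
is regular on E.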
The following immediate corollary covers many cases of interest.
is the union of countably many compact sets. The result now follows from Corol-
lary 7.1.4.
PROOF. Let $\mathcal{S}$ be the collection of all Borel sets on which µ is regular. The
proof of Corollary 7.1.5 shows that X is σ-compact. Write $X = \bigcup_{n\ge1} K_n$ where
$K_n$ are compact. If C is closed, then the sets $C_n = C \cap \bigcup_{i=1}^n K_i$ are compact, and
90 Functionals on Cc (X) and C0 (X)
$C_n \subset C_{n+1} \subset \bigcup_{n\ge1} C_n = C$. By continuity from below, $\mu(C) = \lim_{n\to\infty} \mu(C_n)$.
So µ is inner regular on closed sets.
As X is a metric space, a closed set C is a $G_\delta$ (let $U_n = \{x : d(x, C) < \frac1n\}$).
Since µ is finite, continuity from above shows that if $C = \bigcap_{n\ge1} U_n$ and $U_n \supset
U_{n+1}$, then $\mu(C) = \lim_{n\to\infty} \mu(U_n)$. Thus µ is regular on C.
Next we claim S is closed under complements. Let A ∈ S. By Remark 7.1.2(2),
for ε > 0, there is a compact set K and open set U with K ⊂ A ⊂ U and
µ(U \ K) < ε. Therefore U^c ⊂ A^c ⊂ K^c, and µ(K^c \ U^c) = µ(U \ K) < ε. The
set U^c is closed but not necessarily compact. Since U^c ∈ S, there is a compact set
L ⊂ U^c so that µ(U^c \ L) < ε − µ(K^c \ U^c). Thus µ(K^c \ L) < ε. Therefore µ
is regular on A^c.
Now we show that S is closed under countable unions. Let A_i ∈ S for i ≥ 1,
and let A = ⋃_{i≥1} A_i. Given ε > 0, find compact sets K_i and open sets U_i so that
K_i ⊂ A_i ⊂ U_i and µ(U_i \ K_i) < 2^{−i} ε. Set
    C_n = ⋃_{i=1}^n K_i,   C = ⋃_{i≥1} K_i   and   U = ⋃_{i≥1} U_i.
The first step uses continuity from above, which requires the finiteness of µ.
Hence S is a σ-algebra. As S contains all closed sets, it contains all Borel sets.
So µ is regular and finite, whence Radon.
(2) Φ_µ is continuous with respect to the sup norm (i.e. |Φ_µ(f)| ≤ C‖f‖_∞
for f ∈ C_c(X)) if and only if M = sup{µ(K) : K is compact} < ∞.
Moreover ‖Φ_µ‖ = M.
(3) In particular, if µ is Radon, then Φ_µ is continuous if and only if ‖µ‖ :=
µ(X) < ∞.
Hence ‖Φ_µ‖ = M.
If µ is Radon, then it is inner regular on X; so µ(X) = M. Thus Φ_µ is
continuous if and only if ‖µ‖ := µ(X) < ∞.
This is impossible, and thus M < ∞. We now argue as in the first paragraph to
deduce that |ϕ(f)| ≤ ϕ(|f|) ≤ M‖f‖_∞ for f in C_0(X).
Hence
    |Φ_µ(f) − ϕ(f)| ≤ (µ(K_0) − µ(K_N))/N ≤ µ(K_0)/N.
Since N was arbitrary, we obtain Φ_µ(f) = ϕ(f).
Claim 5: uniqueness. Suppose that ν is a Radon measure such that Φ_ν = ϕ.
If U is open and f ≺ U, then ϕ(f) = Φ_ν(f) ≤ ν(U). On the other hand,
since ν is inner regular on U, if r < ν(U), there is a compact K ⊂ U with
ν(K) > r. Take some f ∈ C_c(X) with χ_K ≤ f ≺ U and observe that ϕ(f) =
Φ_ν(f) ≥ ν(K) > r. Therefore
    ν(U) = sup{ϕ(f) : f ≺ U} = µ(U).
Both measures are outer regular, and therefore they agree on all Borel sets.
PROOF. If f ∈ C_0(X),
    |Φ_µ(f)| = |∫ f dµ| ≤ ∫ |f| d|µ| ≤ ‖f‖_∞ |µ|(X).
    |‖µ‖ − Φ_µ(f)| = |∫ (h̄ − f) dµ| ≤ ∫ |h̄ − f| d|µ|
        ≤ Σ_{j=1}^n ∫_{K_j} |h̄ − ζ_j| d|µ| + Σ_{j=1}^n 2|µ|(E_j \ K_j)
        ≤ (ε/2)‖µ‖ + 2ε.
Letting ε → 0 yields ‖Φ_µ‖ = ‖µ‖.
7.3.3. DEFINITION. If X is an LCH space, let M(X) be the vector space of all
complex regular Borel measures on X with norm ‖µ‖ = |µ|(X).
PROOF. Proposition 7.3.1 shows that the map from M(X) into C_0(X)* is an
isometric linear map. On the other hand, Corollary 7.2.5 of the Riesz-Markov
Theorem shows that every positive linear functional on C_0(X) is Φ_µ for a finite
positive regular Borel measure µ. The result will follow if the linear span of the
positive linear functionals on C_0(X) is all of C_0(X)*.
We call a linear functional self-adjoint if ϕ(f) ∈ R for f ∈ C_0(X, R). Let
ϕ ∈ C_0(X)*. Define ϕ̃(f) to be the complex conjugate of ϕ(f̄). Set
Re ϕ = (1/2)(ϕ + ϕ̃) and Im ϕ = (1/(2i))(ϕ − ϕ̃).
A Taste of Probability*
This chapter is purely for interest. I expect to give one lecture on some of this, but
it won’t be tested. This chapter closely follows Folland’s book [2].
distribution function of X is
    F(t) = P(X ≤ t) = P_X((−∞, t]).
Moreover P_X = dF = µ_F is the Lebesgue-Stieltjes measure of F. A family
of random variables {X_α}_{α∈A} is identically distributed if P_{X_α} = P_{X_β} for all
α, β ∈ A. The joint distribution of (X_1, ..., X_n) is P_{(X_1,...,X_n)} where we consider
(X_1, ..., X_n) : Ω → R^n.
Many properties of random variables can be recovered from their distributions.
8.1.2. EXAMPLES.
(1) E(X) = ∫ X dP = ∫_R t dP_X.
(2) σ²(X) = E((X − E(X))²) = ∫_R (t − E(X))² dP_X.
(3) E(X + Y) = ∫_{R²} (s + t) dP_{(X,Y)}(s, t).
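For a random variable taking finitely many values, the integrals in Examples 8.1.2 reduce to finite sums over the distribution. The following is a minimal sketch under that assumption; the fair die, the choice of a product joint distribution, and the helper names `expectation` and `variance` are illustrative and not part of the notes.

```python
from fractions import Fraction
from itertools import product

# Distribution P_X of a fair six-sided die: value -> probability (illustrative choice).
die = {t: Fraction(1, 6) for t in range(1, 7)}

def expectation(dist):
    """E(X) = integral of t dP_X, a finite sum for a discrete distribution."""
    return sum(t * p for t, p in dist.items())

def variance(dist):
    """sigma^2(X) = integral of (t - E(X))^2 dP_X."""
    m = expectation(dist)
    return sum((t - m) ** 2 * p for t, p in dist.items())

# A joint distribution P_(X,Y) on R^2; here simply the product of two dice (an assumption).
joint = {(s, t): die[s] * die[t] for s, t in product(die, die)}

# E(X + Y) = integral of (s + t) dP_(X,Y)(s, t).
e_sum = sum((s + t) * p for (s, t), p in joint.items())

print(expectation(die))  # 7/2
print(variance(die))     # 35/12
print(e_sum)             # 7
```

Exact rational arithmetic is used so that the three quantities can be compared with hand computations without rounding.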
8.2. Independence
8.2.1. DEFINITION. Events {A_i : i ∈ I} are independent if whenever i_1, ..., i_k
in I are distinct, then
    P(A_{i_1} ∩ ··· ∩ A_{i_k}) = ∏_{s=1}^k P(A_{i_s}).
Random variables {X_i : i ∈ I} are independent if for every choice of Borel sets B_i,
the collection {X_i^{−1}(B_i) : i ∈ I} is independent. That is, whenever i_1, ..., i_k ∈ I
are distinct, writing A_{i_s} = X_{i_s}^{−1}(B_{i_s}),
    P_{(X_{i_1},...,X_{i_k})}(∏_{s=1}^k B_{i_s}) = P(⋂_{s=1}^k X_{i_s}^{−1}(B_{i_s}))
        = ∏_{s=1}^k P(A_{i_s}) = (∏_{s=1}^k P_{X_{i_s}})(∏_{s=1}^k B_{i_s}).
In other words, P_{(X_{i_1},...,X_{i_k})} = ∏_{s=1}^k P_{X_{i_s}}.
PROOF. Compute
    E(XY) = ∫ st dP_{(X,Y)}(s, t) = ∬ st dP_X dP_Y
          = ∫ s dP_X(s) ∫ t dP_Y(t) = E(X) E(Y).
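The computation above can be checked exactly on a small discrete example. This is only a sketch: the two distributions `p_x` and `p_y` are hypothetical values chosen for illustration, and independence is modelled by taking the joint distribution to be the product measure. A dependent pair (Y = X) is included for contrast.

```python
from fractions import Fraction

# Two illustrative finite distributions (hypothetical values, not from the notes).
p_x = {-1: Fraction(1, 4), 0: Fraction(1, 4), 2: Fraction(1, 2)}
p_y = {1: Fraction(1, 3), 3: Fraction(2, 3)}

def mean(dist):
    """E(X) as a finite sum of t * P_X({t})."""
    return sum(t * p for t, p in dist.items())

# Independence: the joint distribution is the product measure P_X x P_Y.
joint = {(s, t): ps * pt for s, ps in p_x.items() for t, pt in p_y.items()}
e_xy = sum(s * t * p for (s, t), p in joint.items())

# A dependent pair for contrast: Y = X, so the joint distribution lives on the diagonal.
e_xx = sum(s * s * p for s, p in p_x.items())

print(e_xy, mean(p_x) * mean(p_y))  # equal under independence
print(e_xx, mean(p_x) ** 2)         # differ when Y = X
```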
PROOF. Let X̄_i = (X_{i,1}, ..., X_{i,J_i}). Then Y_i^{−1}(B) = X̄_i^{−1}(f_i^{−1}(B)). Thus if
Y = (Y_1, ..., Y_n) and X = (X̄_1, ..., X̄_n),
    Y^{−1}(B_1 × ··· × B_n) = ⋂_{i=1}^n X̄_i^{−1}(f_i^{−1}(B_i)) = X^{−1}(∏_{i=1}^n f_i^{−1}(B_i)).
Therefore
    P(Y^{−1}(B_1 × ··· × B_n)) = P_X(∏_{i=1}^n f_i^{−1}(B_i))
        = (∏_{i=1}^n ∏_{j=1}^{J_i} P_{X_{i,j}})(∏_{i=1}^n f_i^{−1}(B_i))
        = ∏_{i=1}^n (∏_{j=1}^{J_i} P_{X_{i,j}})(f_i^{−1}(B_i))
        = ∏_{i=1}^n P_{X̄_i}(f_i^{−1}(B_i)) = ∏_{i=1}^n P_{Y_i}(B_i).
8.3.2. COROLLARY. Let X be a random variable such that σ(X) < ∞ (i.e.
X ∈ L²(P)). Then for b > 0, P(|X − E(X)| ≥ b) ≤ σ²(X)/b².
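To get a feel for how sharp this bound is, one can compare both sides exactly for a simple distribution. The sketch below assumes X is a fair six-sided die (an illustrative choice, not from the notes) and evaluates the tail probability and the bound σ²(X)/b² for a few values of b.

```python
from fractions import Fraction

# Distribution of a fair die (illustrative assumption).
die = {t: Fraction(1, 6) for t in range(1, 7)}
mean = sum(t * p for t, p in die.items())
var = sum((t - mean) ** 2 * p for t, p in die.items())

def tail(b):
    """P(|X - E(X)| >= b), computed exactly from the distribution."""
    return sum(p for t, p in die.items() if abs(t - mean) >= b)

for b in (Fraction(3, 2), 2, Fraction(5, 2)):
    bound = var / b ** 2
    # Each row: b, exact tail probability, Chebyshev bound, whether the bound holds.
    print(b, tail(b), bound, tail(b) <= bound)
```

For small b the bound exceeds 1 and carries no information; it becomes useful only once b is larger than σ(X).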
    lim_{n→∞} P(|(1/n) Σ_{i=1}^n (X_i − µ_i)| > ε) = 0.
That is, lim_{n→∞} (1/n) Σ_{i=1}^n (X_i − µ_i) = 0 in probability.
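As a concrete instance of this statement, take the X_i to be independent fair coin flips with values in {0, 1}, so µ_i = 1/2 and σ_i² = 1/4. This special case is an assumption made for illustration. The deviation probability can then be computed exactly from the binomial distribution and compared with the Chebyshev bound σ²(S_n)/ε² = 1/(4nε²) from Corollary 8.3.2.

```python
from fractions import Fraction
from math import comb

def deviation_probability(n, eps):
    """Exact P(|(1/n) * sum(X_i - 1/2)| > eps) for n independent fair coin flips."""
    total = Fraction(0)
    for heads in range(n + 1):
        if abs(Fraction(heads, n) - Fraction(1, 2)) > eps:
            total += Fraction(comb(n, heads), 2 ** n)
    return total

eps = Fraction(1, 10)
for n in (10, 100, 1000):
    exact = deviation_probability(n, eps)
    chebyshev = Fraction(1, 4) / (n * eps ** 2)  # sigma^2(S_n) / eps^2 with sigma_i^2 = 1/4
    print(n, float(exact), float(chebyshev))
```

The exact probabilities decay much faster than the Chebyshev bound, which is all the weak law needs.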
PROOF. Let S_n = (1/n) Σ_{i=1}^n (X_i − µ_i); so E(S_n) = 0 and by Corollary 8.2.5,
σ²(S_n) = (1/n²) Σ_{i=1}^n σ²(X_i − µ_i) = (1/n²) Σ_{i=1}^n σ_i². Hence by Corollary 8.3.2,
(b) 1 − P(⋃_{n≥k} A_n) = P(⋂_{n≥k} A_n^c) = ∏_{n≥k} P(A_n^c)
        = ∏_{n≥k} (1 − P(A_n)) ≤ ∏_{n≥k} e^{−P(A_n)} = e^{−Σ_{n≥k} P(A_n)} = 0.
Thus P(B) = 1.
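The key estimate in (b) is 1 − x ≤ e^{−x}, which turns a divergent sum into a vanishing product. A small numerical sketch, assuming independent events with P(A_n) = 1/n (a hypothetical choice for which Σ P(A_n) = ∞): the finite product from n = k to N telescopes to (k − 1)/N, and it sits below the exponential bound.

```python
import math
from fractions import Fraction

def product_of_complements(k, N):
    """prod_{n=k}^{N} (1 - P(A_n)) with the illustrative choice P(A_n) = 1/n."""
    result = Fraction(1)
    for n in range(k, N + 1):
        result *= 1 - Fraction(1, n)
    return result

def exponential_bound(k, N):
    """exp(-sum_{n=k}^{N} P(A_n)), the upper bound coming from 1 - x <= e^{-x}."""
    return math.exp(-sum(1.0 / n for n in range(k, N + 1)))

k = 2
for N in (10, 100, 1000):
    # Both columns tend to 0 as N grows, since the harmonic series diverges.
    print(N, float(product_of_complements(k, N)), exponential_bound(k, N))
```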
PROOF. Let S_k = X_1 + ··· + X_k. Let A = {max{|S_1|, ..., |S_n|} ≥ a} and
A_k = {|S_j| < a, 1 ≤ j < k, |S_k| ≥ a}. So A is the disjoint union A = ⋃_{k=1}^n A_k.
Therefore
    E(S_n² χ_{A_k}) = E((S_k χ_{A_k} + (S_n − S_k) χ_{A_k})²)
Letting m → ∞, we get
    P(max_{k≥1} |S_{n+k} − S_n| ≥ ε) ≤ (1/ε²) Σ_{i>n} σ_i² < δ/ε².
has P(A_j) < 2^{−j}. By the Borel-Cantelli Lemma, since Σ_{j≥1} P(A_j) < ∞, we
have P(⋂_{k≥1} ⋃_{j≥k} A_j) = 0. Hence P(⋃_{i≥1} ⋂_{j≥i} A_j^c) = 1. Therefore a.s., x
belongs to ⋂_{j≥i} A_j^c for some i, meaning that max_{k≥1} |S_{n_j+k} − S_{n_j}| < 2^{−j} for
j ≥ i. This means that S_n(x) is Cauchy and thus converges with probability 1.
8.3.7. LEMMA. If (a_i)_{i≥1} is a sequence of real numbers and Σ_{i≥1} a_i/i converges,
then lim_{n→∞} (1/n) Σ_{i=1}^n a_i = 0.
PROOF. Let b_n = a_n/n and S_n = Σ_{i=1}^n b_i. Let L = lim_{n→∞} S_n. Then L =
lim_{n→∞} (1/n) Σ_{i=0}^{n−1} S_i. Compute
    (1/n) Σ_{i=1}^n a_i = (1/n) Σ_{i=1}^n i b_i = (1/n) Σ_{i=1}^n Σ_{k=i}^n b_k
        = (1/n) Σ_{i=1}^n (S_n − S_{i−1}) = S_n − (1/n) Σ_{k=0}^{n−1} S_k → 0.
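The summation-by-parts identity in this proof can be verified exactly on any finite list, and the conclusion can be observed numerically. The sketch below makes two illustrative assumptions: an arbitrary integer test list for the identity, and the sequence a_i = (−1)^i √i, for which Σ a_i/i = Σ (−1)^i/√i converges as an alternating series. The helper names are hypothetical.

```python
import math
from fractions import Fraction

def check_identity(a):
    """Check (1/n) sum a_i == S_n - (1/n) sum_{k=0}^{n-1} S_k exactly, with b_i = a_i / i."""
    n = len(a)
    partial = [Fraction(0)]  # S_0 = 0
    for i, a_i in enumerate(a, start=1):
        partial.append(partial[-1] + Fraction(a_i, i))  # S_i = S_{i-1} + b_i
    lhs = Fraction(sum(a), n)
    rhs = partial[n] - sum(partial[:n]) / n
    return lhs == rhs

def average(n):
    """(1/n) sum_{i=1}^{n} a_i for the illustrative sequence a_i = (-1)^i * sqrt(i)."""
    return sum((-1) ** i * math.sqrt(i) for i in range(1, n + 1)) / n

print(check_identity([3, -1, 4, 1, -5, 9, 2, -6]))
for n in (10, 100, 10000):
    print(n, average(n))  # the averages shrink toward 0
```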
PROOF. Let Y_i = (1/i)(X_i − µ_i). Then E(Y_i) = 0, σ²(Y_i) = (1/i²) σ_i² and {Y_i}
are independent. By hypothesis, Σ_{i≥1} σ²(Y_i) < ∞. Hence by Lemma 8.3.6,
P({Σ_{i≥1} Y_i converges}) = 1. Therefore by Lemma 8.3.7, with a_i = X_i − µ_i,
    P(lim_{n→∞} (1/n) Σ_{i=1}^n (X_i − µ_i) = 0) = 1.
∫_R t dλ = E(X_n) = 0. Let Y_n = X_n χ_{{|X_n| ≤ n}} and Z_n = X_n − Y_n. Then
    Σ_{n≥1} P(Z_n ≠ 0) = Σ_{n≥1} P(|X_n| > n) = Σ_{n≥1} (∫_{−∞}^{−n} + ∫_n^∞) dλ
        = ∫_{−∞}^∞ ⌊|t|⌋ dλ ≤ ∫_{−∞}^∞ |t| dλ = ‖X_1‖_1 < ∞.
By the Borel-Cantelli Lemma, P(Z_n ≠ 0 infinitely often) = 0. Therefore
    P((1/n) Σ_{i=1}^n Z_i → 0) = 1.
Now σ²(Y_n) = E(Y_n²) = ∫_{−n}^n t² dλ. Hence by the MCT,
    Σ_{n≥1} (1/n²) σ²(Y_n) = Σ_{n≥1} ∫_{−n}^n (t²/n²) dλ = ∫_{−∞}^∞ Σ_{n≥1} (1/n²) χ_{[−n,n]}(t) t² dλ.
For n − 1 < |t| ≤ n, we have Σ_{k≥1} (1/k²) χ_{[−k,k]}(t) = Σ_{k≥n} 1/k² < 2/n, and thus
    Σ_{n≥1} (1/n²) σ²(Y_n) ≤ ∫_{−∞}^∞ 2|t| dλ < ∞.
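The elementary estimate Σ_{k≥n} 1/k² < 2/n used above can be sanity-checked numerically. This is a rough sketch rather than a proof: the infinite tail is replaced by a long partial sum plus the integral bound Σ_{k≥m} 1/k² ≤ 1/(m − 1), so the function `tail_sum` (a hypothetical helper) returns an upper bound for the true tail.

```python
def tail_sum(n, terms=20_000):
    """Upper bound for sum_{k>=n} 1/k^2: a partial sum plus an integral bound on the rest."""
    upper = n + terms
    partial = sum(1.0 / (k * k) for k in range(n, upper))
    return partial + 1.0 / (upper - 1)  # sum_{k>=upper} 1/k^2 <= 1/(upper - 1)

for n in (1, 2, 5, 10, 50):
    # Each row: n, an upper bound for the tail, and the claimed bound 2/n.
    print(n, tail_sum(n), 2.0 / n)
```

Since t² · (2/n) ≤ 2|t| whenever |t| ≤ n, this is exactly what yields the integrable majorant 2|t|.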