
Lecture 21

This document summarizes Birkhoff's Ergodic Theorem, which describes the limiting behavior of the average of a function over iterates of a measure-preserving transformation. It introduces concepts like conditional expectation, Radon-Nikodym derivatives, and the maximal inequality. The theorem states that for integrable functions, the time average converges almost everywhere to the conditional expectation with respect to the invariant sigma-algebra. For ergodic systems, the time average converges almost everywhere to the integral of the function. The proof uses the maximal inequality and dominated convergence.


MATH41112/61112 Ergodic Theory Lecture 21

21. Birkhoff’s Ergodic Theorem

§21.1 Introduction
An ergodic theorem is a result that describes the limiting behaviour of the
sequence
\[
  \frac{1}{n} \sum_{j=0}^{n-1} f \circ T^j \tag{21.1}
\]
as n → ∞. The precise formulation of an ergodic theorem depends on the
class of function f (for example, one could assume that f is integrable,
L², or continuous) and on the notion of convergence that we use (for
example, we could study pointwise convergence, L² convergence, or uniform
convergence). The result that we are interested in here, Birkhoff’s Ergodic
Theorem, deals with pointwise convergence of (21.1) for an integrable
function f.

§21.2 Conditional expectation


We will need the concepts of Radon-Nikodym derivatives and conditional
expectation.
Definition. Let µ be a measure on (X, B). We say that a measure ν
is absolutely continuous with respect to µ, and write ν ≪ µ, if ν(B) = 0
whenever µ(B) = 0, B ∈ B.

Remark. Thus ν is absolutely continuous with respect to µ if sets of
µ-measure zero also have ν-measure zero (but there may be more sets of
ν-measure zero).

For example, let f ∈ L¹(X, B, µ) be non-negative and define a measure ν by
\[
  \nu(B) = \int_B f \, d\mu.
\]
Then ν ≪ µ.
The following theorem says that, essentially, all absolutely continuous
measures occur in this way.

Theorem 21.1 (Radon-Nikodym)
Let (X, B, µ) be a probability space. Let ν be a measure defined on B and
suppose that ν ≪ µ. Then there is a non-negative measurable function f
such that
\[
  \nu(B) = \int_B f \, d\mu
\]
for all B ∈ B.


Moreover, f is unique in the sense that if g is a measurable function with
the same property then f = g µ-a.e.
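
On a finite set the Radon-Nikodym theorem reduces to dividing point masses, which can make the statement easier to picture. The following is a minimal sketch under that assumption; the four-point space, the weights `mu` and `nu`, and the helper names are all hypothetical and are not taken from the lecture.

```python
from fractions import Fraction

# Hypothetical finite probability space X = {0, 1, 2, 3} with point masses mu[x].
mu = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(0)}
# A measure nu that vanishes wherever mu does, so nu << mu.
nu = {0: Fraction(1, 8), 1: Fraction(3, 8), 2: Fraction(0), 3: Fraction(0)}


def density(nu, mu):
    """Return one version of dnu/dmu; its value on mu-null points is arbitrary (0 here)."""
    return {x: (nu[x] / mu[x] if mu[x] != 0 else Fraction(0)) for x in mu}


def integrate(h, measure, subset):
    """Integral of h over `subset` with respect to `measure` (a finite sum)."""
    return sum((h[x] * measure[x] for x in subset), Fraction(0))


def all_subsets(points):
    """Yield every subset of `points` as a list."""
    points = list(points)
    for mask in range(1 << len(points)):
        yield [p for i, p in enumerate(points) if (mask >> i) & 1]


f = density(nu, mu)
one = {x: Fraction(1) for x in mu}
# nu(B) should equal the integral of f over B for every subset B.
check = all(integrate(one, nu, B) == integrate(f, mu, B) for B in all_subsets(mu))
print(f, check)
```

The value of the density at the µ-null point 3 could be changed freely without affecting `check`, which is the finite analogue of uniqueness only up to µ-a.e. equality.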

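Exercise 21.1
If ν ≪ µ then it is customary to write dν/dµ for the function given by the
Radon-Nikodym theorem, that is
\[
  \nu(B) = \int_B \frac{d\nu}{d\mu} \, d\mu.
\]
Prove the following relations:

(i) If ν ≪ µ and f is a µ-integrable function then
\[
  \int f \, d\nu = \int f \, \frac{d\nu}{d\mu} \, d\mu.
\]

(ii) If ν1, ν2 ≪ µ then
\[
  \frac{d(\nu_1 + \nu_2)}{d\mu} = \frac{d\nu_1}{d\mu} + \frac{d\nu_2}{d\mu}.
\]

(iii) If λ ≪ ν ≪ µ then
\[
  \frac{d\lambda}{d\mu} = \frac{d\lambda}{d\nu} \, \frac{d\nu}{d\mu}.
\]
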
Let A ⊂ B be a sub-σ-algebra. Note that µ defines a measure on A by
restriction. Let f ∈ L¹(X, B, µ) be non-negative. Then we can define a
measure ν on A by setting
\[
  \nu(A) = \int_A f \, d\mu.
\]
Note that ν ≪ µ|A. Hence by the Radon-Nikodym theorem, there is a
unique A-measurable function E(f | A) such that
\[
  \nu(A) = \int_A E(f \mid A) \, d\mu.
\]
We call E(f | A) the conditional expectation of f with respect to the
σ-algebra A.
So far, we have only defined E(f | A) for non-negative f. To define
E(f | A) for an arbitrary f, we split f into positive and negative parts
f = f+ − f−, where f+, f− ≥ 0, and define

E(f | A) = E(f+ | A) − E(f− | A).

Thus we can view conditional expectation as an operator

E(· | A) : L¹(X, B, µ) → L¹(X, A, µ).

Note that E(f | A) is uniquely determined by the two requirements that


(i) E(f | A) is A-measurable, and

(ii) ∫_A f dµ = ∫_A E(f | A) dµ for all A ∈ A.

Intuitively, one can think of E(f | A) as the best approximation to f in the
smaller space of all A-measurable functions.
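
When the sub-σ-algebra is generated by a finite partition, the two requirements above force E(f | A) to be the block-wise average of f. The sketch below illustrates this under that assumption; the six-point space, the partition, and the function values are hypothetical choices made only for the example.

```python
from fractions import Fraction

# Hypothetical example: X = {0,...,5} with the uniform probability measure.
X = range(6)
mu = {x: Fraction(1, 6) for x in X}
# Assumption for this sketch: the sub-sigma-algebra is generated by this partition.
partition = [[0, 1], [2, 3, 4], [5]]
f = {0: 3, 1: 5, 2: 1, 3: 2, 4: 6, 5: -4}


def conditional_expectation(f, mu, partition):
    """On each block, replace f by its mu-weighted average over that block."""
    result = {}
    for block in partition:
        mass = sum(mu[x] for x in block)
        avg = sum(f[x] * mu[x] for x in block) / mass
        for x in block:
            result[x] = avg
    return result


def integral(h, mu, subset):
    """Integral of h over `subset` with respect to mu (a finite sum)."""
    return sum(h[x] * mu[x] for x in subset)


Ef = conditional_expectation(f, mu, partition)
# Requirement (ii): the integrals agree on every block, hence on every union of blocks.
agree = all(integral(f, mu, b) == integral(Ef, mu, b) for b in partition)
print(Ef, agree)
```

The result is constant on each block, which is what measurability with respect to the partition σ-algebra means here (requirement (i)).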

Exercise 21.2
(i) Prove that f ↦ E(f | A) is linear.

(ii) Suppose that g is A-measurable and |g| < ∞ µ-a.e. Show that
E(fg | A) = g E(f | A).

(iii) Suppose that T is a measure-preserving transformation. Show that

E(f | A) ◦ T = E(f ◦ T | T⁻¹A).

(iv) Show that E(f | B) = f.

(v) Let N denote the trivial σ-algebra consisting of all sets of measure 0
and 1. Show that E(f | N) = ∫ f dµ.

To state Birkhoff’s Ergodic Theorem precisely, we will need the
sub-σ-algebra I of T-invariant subsets, namely:

I = {B ∈ B | T⁻¹B = B a.e.}.

Exercise 21.3
Prove that I is a σ-algebra.

§21.3 Birkhoff’s Pointwise Ergodic Theorem

Birkhoff’s Ergodic Theorem deals with the behaviour of
$\frac{1}{n} \sum_{j=0}^{n-1} f(T^j x)$ for µ-a.e. x ∈ X, and for
f ∈ L¹(X, B, µ).

Theorem 21.2 (Birkhoff’s Ergodic Theorem)
Let (X, B, µ) be a probability space and let T : X → X be a
measure-preserving transformation. Let I denote the σ-algebra of
T-invariant sets. Then for every f ∈ L¹(X, B, µ), we have
\[
  \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) \to E(f \mid I) \quad \text{as } n \to \infty
\]
for µ-a.e. x ∈ X.
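
A finite, non-ergodic example may help to show why the limit is E(f | I) rather than the integral of f. The sketch below assumes a hypothetical permutation of six points with the uniform measure; there the invariant sets are unions of cycles, so E(f | I) is the mean of f over the cycle through x. None of these names or values come from the lecture.

```python
from fractions import Fraction

# Hypothetical system: a permutation T of X = {0,...,5} with cycles
# (0 1 2), (3 4) and the fixed point 5. The uniform measure is T-invariant,
# but T is not ergodic because each cycle is an invariant set.
T = {0: 1, 1: 2, 2: 0, 3: 4, 4: 3, 5: 5}
f = {0: 6, 1: 0, 2: 0, 3: 1, 4: 5, 5: 7}


def birkhoff_average(f, T, x, n):
    """Exact value of (1/n) * sum_{j<n} f(T^j x)."""
    total = 0
    for _ in range(n):
        total += f[x]
        x = T[x]
    return Fraction(total, n)


def cycle_mean(f, T, x):
    """Mean of f over the cycle through x, which is E(f | I)(x) in this example."""
    orbit, y = [x], T[x]
    while y != x:
        orbit.append(y)
        y = T[y]
    return Fraction(sum(f[z] for z in orbit), len(orbit))


for x in sorted(T):
    print(x, float(birkhoff_average(f, T, x, 6001)), cycle_mean(f, T, x))
```

The time averages settle on three different values, one per cycle, whereas the space average of f over all of X is a single number; this is the distinction the conditional expectation records.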


Corollary 21.3
Let (X, B, µ) be a probability space and let T : X → X be an ergodic
measure-preserving transformation. Let f ∈ L¹(X, B, µ). Then
\[
  \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) \to \int f \, d\mu \quad \text{as } n \to \infty
\]
for µ-a.e. x ∈ X.

Proof. If T is ergodic then I is the trivial σ-algebra N consisting of sets
of measure 0 and 1. If f ∈ L¹(X, B, µ) then E(f | N) = ∫ f dµ. The result
follows from the general version of Birkhoff’s Ergodic Theorem. □
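
The corollary can be explored numerically. The sketch below uses an irrational rotation of the circle, T(x) = x + α mod 1, which is not discussed in this section; that it is ergodic with respect to Lebesgue measure is a standard fact assumed here. The test function f(x) = cos²(2πx), whose integral over [0, 1) is 1/2, and the starting points are arbitrary choices.

```python
import math


def birkhoff_average(f, T, x, n):
    """Floating-point value of (1/n) * sum_{j<n} f(T^j x)."""
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = T(x)
    return total / n


# Assumed example: rotation by the golden mean, an irrational number.
alpha = (math.sqrt(5) - 1) / 2


def T(x):
    return (x + alpha) % 1.0


def f(x):
    return math.cos(2 * math.pi * x) ** 2


# The time averages should be close to the space average 1/2 for each start.
for x0 in (0.0, 0.3, 0.77):
    print(x0, birkhoff_average(f, T, x0, 100_000))
```

A rotation is used rather than an expanding map such as doubling because its orbits remain meaningful in floating-point arithmetic; this is a convenience of the sketch, not a feature of the theorem.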

§21.4 Appendix: The proof of Birkhoff’s Ergodic Theorem


The proof is something of a tour de force of hard analysis. It is based on
the following inequality.

Theorem 21.4 (Maximal Inequality)
Let (X, B, µ) be a probability space, let T : X → X be a measure-preserving
transformation and let f ∈ L¹(X, B, µ). Define f_0 = 0 and, for n ≥ 1,
\[
  f_n = f + f \circ T + \cdots + f \circ T^{n-1}.
\]
For n ≥ 1, set
\[
  F_n = \max_{0 \le j \le n} f_j
\]
(so that F_n ≥ 0). Then
\[
  \int_{\{x \mid F_n(x) > 0\}} f \, d\mu \ge 0.
\]

Proof. Clearly F_n ∈ L¹(X, B, µ). For 0 ≤ j ≤ n, we have F_n ≥ f_j, so
F_n ◦ T ≥ f_j ◦ T. Hence
\[
  F_n \circ T + f \ge f_j \circ T + f = f_{j+1}
\]
and therefore
\[
  F_n \circ T(x) + f(x) \ge \max_{1 \le j \le n} f_j(x).
\]
If F_n(x) > 0 then
\[
  \max_{1 \le j \le n} f_j(x) = \max_{0 \le j \le n} f_j(x) = F_n(x),
\]
so we obtain that
\[
  f \ge F_n - F_n \circ T
\]
on the set A = {x | F_n(x) > 0}. Hence
\[
\begin{aligned}
  \int_A f \, d\mu &\ge \int_A F_n \, d\mu - \int_A F_n \circ T \, d\mu \\
  &= \int_X F_n \, d\mu - \int_A F_n \circ T \, d\mu \\
  &\ge \int_X F_n \, d\mu - \int_X F_n \circ T \, d\mu \\
  &= 0
\end{aligned}
\]
where we have used

(i) F_n = 0 on X \ A

(ii) F_n ◦ T ≥ 0

(iii) µ is T-invariant.
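
The maximal inequality can be checked exactly on a finite system. The sketch below assumes a hypothetical setting: X = {0, ..., 11} with the uniform measure, T a randomly chosen permutation (which preserves that measure), and f a random integer-valued function. It computes the integral of f over {F_n > 0} for several n.

```python
from fractions import Fraction
import random


def maximal_inequality_lhs(f, T, n):
    """Integral of f over {F_n > 0} for the uniform measure on {0,...,N-1}."""
    N = len(f)
    total = Fraction(0)
    for x in range(N):
        partial, best, y = 0, 0, x      # f_0 = 0, so F_n >= 0
        for _ in range(n):
            partial += f[y]             # partial now equals f_j(x) for j = 1..n
            best = max(best, partial)
            y = T[y]
        if best > 0:                    # x lies in {F_n > 0}
            total += Fraction(f[x], N)
    return total


rng = random.Random(0)
N = 12
T = list(range(N))
rng.shuffle(T)                          # a permutation preserves the uniform measure
f = [rng.randint(-5, 5) for _ in range(N)]
values = [maximal_inequality_lhs(f, T, n) for n in range(1, 20)]
print(all(v >= 0 for v in values))
```

Because the arithmetic is exact, a negative entry in `values` would indicate a bug in the sketch rather than rounding error.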

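Corollary 21.5
If g ∈ L¹(X, B, µ) and if
\[
  B_\alpha = \left\{ x \in X \;\middle|\; \sup_{n \ge 1} \frac{1}{n} \sum_{j=0}^{n-1} g(T^j x) > \alpha \right\}
\]
then for all A ∈ B with T⁻¹A = A we have that
\[
  \int_{B_\alpha \cap A} g \, d\mu \ge \alpha \mu(B_\alpha \cap A).
\]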

Proof. Suppose first that A = X. Let f = g − α, then
\[
\begin{aligned}
  B_\alpha &= \bigcup_{n=1}^{\infty} \left\{ x \;\middle|\; \sum_{j=0}^{n-1} g(T^j x) > n\alpha \right\} \\
  &= \bigcup_{n=1}^{\infty} \{x \mid f_n(x) > 0\} \\
  &= \bigcup_{n=1}^{\infty} \{x \mid F_n(x) > 0\}
\end{aligned}
\]
(since f_n(x) > 0 ⇒ F_n(x) > 0 and F_n(x) > 0 ⇒ f_j(x) > 0 for some
1 ≤ j ≤ n). Write C_n = {x | F_n(x) > 0} and observe that C_n ⊂ C_{n+1}.
Thus χ_{C_n} converges to χ_{B_α} and so fχ_{C_n} converges to fχ_{B_α},
as n → ∞. Furthermore, |fχ_{C_n}| ≤ |f|. Hence, by the Dominated
Convergence Theorem,
\[
  \int_{C_n} f \, d\mu = \int_X f \chi_{C_n} \, d\mu \to \int_X f \chi_{B_\alpha} \, d\mu = \int_{B_\alpha} f \, d\mu \quad \text{as } n \to \infty.
\]
Applying the maximal inequality, we have, for all n ≥ 1,
\[
  \int_{C_n} f \, d\mu \ge 0.
\]
Therefore
\[
  \int_{B_\alpha} f \, d\mu \ge 0,
\]
i.e.,
\[
  \int_{B_\alpha} g \, d\mu \ge \alpha \mu(B_\alpha).
\]
For the general case, we work with the restriction of T to A, T : A → A,
and apply the maximal inequality on this subset to get
\[
  \int_{B_\alpha \cap A} g \, d\mu \ge \alpha \mu(B_\alpha \cap A),
\]
as required. □

Proof of Birkhoff’s Ergodic Theorem. Let
\[
  f^*(x) = \limsup_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x)
\]
and
\[
  f_*(x) = \liminf_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x).
\]
Writing
\[
  a_n(x) = \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x),
\]
observe that
\[
  \frac{n+1}{n} \, a_{n+1}(x) = a_n(Tx) + \frac{1}{n} f(x).
\]
Taking the lim sup and lim inf as n → ∞ gives us that f^* ◦ T = f^* and
f_* ◦ T = f_*.
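
The recurrence for the averages a_n is purely algebraic, so it can be verified exactly for any map on a finite set. The sketch below does this with rational arithmetic; the map `T` and values `f` are hypothetical, and T is deliberately not assumed to preserve any measure.

```python
from fractions import Fraction


def a(f, T, x, n):
    """Ergodic average a_n(x) = (1/n) * sum_{j<n} f(T^j x), computed exactly."""
    total = 0
    for _ in range(n):
        total += f[x]
        x = T[x]
    return Fraction(total, n)


# Hypothetical map on {0,...,4} and integer-valued function.
T = [2, 0, 3, 1, 4]
f = [4, -1, 7, 0, 2]

# Check ((n+1)/n) * a_{n+1}(x) == a_n(Tx) + f(x)/n for every x and many n.
holds = all(
    Fraction(n + 1, n) * a(f, T, x, n + 1) == a(f, T, T[x], n) + Fraction(f[x], n)
    for x in range(5)
    for n in range(1, 30)
)
print(holds)
```

Since (n+1)/n → 1 and f(x)/n → 0, the identity shows that a_{n+1}(x) and a_n(Tx) have the same lim sup and the same lim inf, which is the invariance used in the proof.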
We have to show

(i) f^* = f_* µ-a.e.,

(ii) f^* ∈ L¹(X, B, µ),

(iii) ∫ f^* dµ = ∫ f dµ.

We prove (i). For α, β ∈ R, define

E_{α,β} = {x ∈ X | f_*(x) < β and f^*(x) > α}.

Note that
\[
  \{x \in X \mid f_*(x) < f^*(x)\} = \bigcup_{\substack{\alpha, \beta \in \mathbb{Q} \\ \beta < \alpha}} E_{\alpha,\beta}
\]
(a countable union). Thus, to show that f^* = f_* µ-a.e., it suffices to show
that µ(E_{α,β}) = 0 whenever β < α. Since f_* ◦ T = f_* and f^* ◦ T = f^*, we
see that T⁻¹E_{α,β} = E_{α,β}. If we write
\[
  B_\alpha = \left\{ x \in X \;\middle|\; \sup_{n \ge 1} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) > \alpha \right\}
\]
then E_{α,β} ∩ B_α = E_{α,β}.


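Applying Corollary 21.5 we have that
\[
\begin{aligned}
  \int_{E_{\alpha,\beta}} f \, d\mu &= \int_{E_{\alpha,\beta} \cap B_\alpha} f \, d\mu \\
  &\ge \alpha \mu(E_{\alpha,\beta} \cap B_\alpha) = \alpha \mu(E_{\alpha,\beta}).
\end{aligned}
\]
Replacing f, α and β by −f, −β and −α and using the fact that
(−f)^* = −f_* and (−f)_* = −f^*, we also get
\[
  \int_{E_{\alpha,\beta}} f \, d\mu \le \beta \mu(E_{\alpha,\beta}).
\]
Therefore
\[
  \alpha \mu(E_{\alpha,\beta}) \le \beta \mu(E_{\alpha,\beta})
\]
and since β < α this shows that µ(E_{α,β}) = 0. Thus f^* = f_* µ-a.e. and
\[
  \lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} f(T^j x) = f^*(x) \quad \mu\text{-a.e.}
\]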

We prove (ii). Let
\[
  g_n = \left| \frac{1}{n} \sum_{j=0}^{n-1} f \circ T^j \right|.
\]
Then g_n ≥ 0 and
\[
  \int g_n \, d\mu \le \int |f| \, d\mu
\]
so we can apply Fatou’s Lemma to conclude that lim_{n→∞} g_n = |f^*| is
integrable, i.e., that f^* ∈ L¹(X, B, µ).
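We prove (iii). For n ∈ N and k ∈ Z, define
\[
  D_k^n = \left\{ x \in X \;\middle|\; \frac{k}{n} \le f^*(x) < \frac{k+1}{n} \right\}.
\]
For every ε > 0, we have that
\[
  D_k^n \cap B_{\frac{k}{n} - \varepsilon} = D_k^n.
\]
Since T⁻¹D_k^n = D_k^n, we can apply Corollary 21.5 again to obtain
\[
  \int_{D_k^n} f \, d\mu \ge \left( \frac{k}{n} - \varepsilon \right) \mu(D_k^n).
\]
Since ε > 0 is arbitrary, we have
\[
  \int_{D_k^n} f \, d\mu \ge \frac{k}{n} \mu(D_k^n).
\]
Thus
\[
\begin{aligned}
  \int_{D_k^n} f^* \, d\mu &\le \frac{k+1}{n} \mu(D_k^n) \\
  &\le \frac{1}{n} \mu(D_k^n) + \int_{D_k^n} f \, d\mu
\end{aligned}
\]
(where the first inequality follows from the definition of D_k^n). Since
\[
  X = \bigcup_{k \in \mathbb{Z}} D_k^n
\]
(a disjoint union), summing over k ∈ Z gives
\[
\begin{aligned}
  \int_X f^* \, d\mu &\le \frac{1}{n} \mu(X) + \int_X f \, d\mu \\
  &= \frac{1}{n} + \int_X f \, d\mu.
\end{aligned}
\]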

Since this holds for all n ≥ 1, we obtain
\[
  \int_X f^* \, d\mu \le \int_X f \, d\mu.
\]
Applying the same argument to −f gives
\[
  \int (-f)^* \, d\mu \le \int -f \, d\mu
\]
so that, since (−f)^* = −f_* and f^* = f_* µ-a.e.,
\[
  \int f^* \, d\mu = \int f_* \, d\mu \ge \int f \, d\mu.
\]
Therefore
\[
  \int f^* \, d\mu = \int f \, d\mu,
\]
as required.
Finally, we prove that f^* = E(f | I). First note that as f^* is T-invariant,
it is measurable with respect to I. Moreover, if I is any T-invariant set then
\[
  \int_I f \, d\mu = \int_I f^* \, d\mu
\]
(apply (iii) to fχ_I, noting that (fχ_I)^* = f^*χ_I because I is T-invariant).
Hence f^* = E(f | I). □
