
Basic Concepts of Ergodic Theory

December 12, 2014


Abstract
In these notes I will outline basic concepts of ergodic theory, Boltzmann's ergodic hypothesis, and the idea of a deterministic system behaving in a probabilistic manner. I will start by briefly demonstrating, with simple and concrete examples, how a mathematician thinks of probability. Then I will explain the important parts of the ergodic theorem, that is, how space averages, time averages and mean visit times are related to invariance and ergodicity. With these in hand, I will discuss these results in relation to Boltzmann's ergodic hypothesis and explain the sense in which an ergodic system has "probabilistic behaviour". The notes are aimed at non-mathematicians as well, so I will skip technical details whenever possible and add them at the end of sections as remarks for the interested reader.

1 Probability

Probability is a tool for modeling randomly occurring results. Thus, first and foremost, this model starts with a certain set of all possible results. Enough real-life experiments will then give information about how likely it is to observe certain subsets of all possible results. Since finding these rates is the fundamental question about randomly occurring results, the model must also contain a way of assigning expected occurrence rates to these subsets of all possible results. In this respect, when a mathematician thinks of probability s/he thinks of three objects:
1- A sample space $S$, which is the set of all possible results (we denote the set of all subsets of $S$ by $2^S$).
2- A set $\mathcal{S} \subset 2^S$, which can be thought of as the set of all observable subsets of $S$.
3- A probability measure $\mu$, which measures how likely each event is to happen. It is a function $\mu : \mathcal{S} \to [0, 1]$, so it takes a subset of $S$ and gives it a real value.
It is better to demonstrate these with different cases, i.e. when $S$ is discrete and when it is continuous.
First is the discrete case:
Example 1.1. Let us describe the probabilistic experiment of tossing a fair die with six faces. In this case the space $S$ is $\{1, 2, 3, 4, 5, 6\}$, while $\mathcal{S}$ is the set of all subsets of $S$, for instance $\{1\}$, $\{1, 3\}$, $\{4, 5, 6\}$, $\{1, 2, 3, 4, 5, 6\}$. The set $\{1, 2\}$, for instance, describes the event that the die results in either 1 or 2. We call events made of a single number, such as $\{1\}$, basic events or results, as previously agreed.

The probability measure can be constructed by setting $\mu(\{i\}) = \frac{1}{6}$ for $i = 1, 2, \dots, 6$ and, for $A \in \mathcal{S}$, $\mu(A) = \sum_{i \in A} \mu(\{i\}) = \sum_{i \in A} \frac{1}{6}$.
This means that each number has equal probability $\frac{1}{6}$ of being observed and the probability of an event is simply the sum of the probabilities of each result (i.e. number) it contains.
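As a small concrete illustration (my own sketch, not part of the original notes), this discrete measure can be written down directly in Python; the names S and mu are just illustrative choices:

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}  # sample space of a fair six-sided die

def mu(A):
    """Probability measure of an event A, given as a subset of S."""
    assert A <= S, "events must be subsets of the sample space"
    return sum(Fraction(1, 6) for _ in A)

print(mu({1, 2}))  # 1/3: the event "the die shows 1 or 2"
print(mu(S))       # 1: the whole sample space has probability one
```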

Second is the continuous case:
Example 1.2. Let us now describe the probabilistic experiment of throwing a dart. Consider $S$ to be the unit square, which will be our dart board. Let $\mathcal{S} \subset 2^S$ be some collection of subsets whose area can be measured. Then $\mathcal{S}$ contains open sets, closed sets and some other sets which can be approximated by open and closed sets. The measure we will use is the area $m$ of a region, noting that the area of the unit square is 1. Then the probability of throwing a dart into a certain region, which corresponds to an event, is simply its area.
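A quick numerical illustration of this example (again a sketch of my own): throwing many uniformly random darts at the unit square and counting how often they land in a region estimates the region's area, i.e. the probability of the corresponding event. The quarter-disk region below is just an arbitrary choice:

```python
import random

def estimate_probability(in_region, n_darts=100_000):
    """Estimate the probability of a region of the unit square by random dart throws."""
    hits = sum(in_region(random.random(), random.random()) for _ in range(n_darts))
    return hits / n_darts

# Example region: the quarter disk x^2 + y^2 <= 1, whose area is pi/4 ~ 0.785.
print(estimate_probability(lambda x, y: x * x + y * y <= 1.0))
```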
These two examples describe how to formalize the notion of a single probabilistic experiment, that is, throwing a dart into a certain region, or getting a number (among a certain collection of numbers) after a die throw. Let us take this a bit further and describe independent successive experiments, which is really the first step in making the connection to dynamical systems. If the events of a probabilistic experiment can be described as subsets of a certain sample space $S$, then the natural space to consider for independent successive experiments is the product space $S^{\mathbb{N}} = S \times S \times \dots$. The observable subsets of $S^{\mathbb{N}}$ are described as the collections $A_1 \times A_2 \times \dots$ where $A_i \in \mathcal{S}$, so we can also set $\mathcal{S}^{\mathbb{N}} = \mathcal{S} \times \mathcal{S} \times \dots$. Then, given a measure $\mu$ on $S$, the measure $\mu^{\mathbb{N}}$ on $S^{\mathbb{N}}$ is simply given by
$$\mu^{\mathbb{N}}(A_1 \times A_2 \times \dots) = \mu(A_1)\,\mu(A_2)\cdots$$
Let us see how this fits nicely into the theme of successive experiments.
Example 1.3. Let us say we are throwing a die repeatedly. What is the probability of throwing 1, 2 or 3 on each throw up to and including the third throw, in any successive die-throwing experiment? It is $\left(\frac{1}{2}\right)^3 = \frac{1}{8}$. This can be modelled in $S^{\mathbb{N}}$ as follows. Let $A = \{1, 2, 3\}$. The event means that our first three throws must be in $A$, while the following throws can be anything, i.e. they should be in $S$. Then the probability of this event is
$$\mu^{\mathbb{N}}(A \times A \times A \times S \times S \times \dots) = \tfrac{1}{2}\cdot\tfrac{1}{2}\cdot\tfrac{1}{2}\cdot 1\cdot 1\cdots = \left(\tfrac{1}{2}\right)^3.$$
Thus we see that any set of the form $A_1 \times A_2 \times \dots$ corresponds to the event where the result of the $i$th experiment is in $A_i$.
Now let us try to understand a probability experiment from the point of view of dynamical systems (for now, be content with the idea that a dynamical system is simply a map $f$ acting on some space as $f : M \to M$, i.e. an evolution of some space under some dynamics; we will elaborate on this later). Define the following map $\sigma : S^{\mathbb{N}} \to S^{\mathbb{N}}$,
$$\sigma((x_1, x_2, x_3, \dots)) = (x_2, x_3, \dots),$$
which is sometimes called the shift map. An ideal probabilistic experiment is a single point in $S^{\mathbb{N}}$, i.e. a sequence of the form $s = (s_1, s_2, \dots)$. The asymptotics of this sequence can be described, for $A \in \mathcal{S}$, by
$$n(s, A) = \lim_{k \to \infty} \frac{1}{k}\, \#\{j = 0, 1, \dots, k-1 \text{ such that } \sigma^j(s)_1 \in A\},$$
where $\#$ denotes the number of elements in a set, $\sigma^j$ is $\sigma$ applied $j$ times and $\sigma^j(s)_1$ is the first entry of the sequence $\sigma^j(s)$. Probability theory (the strong law of large numbers) tells us that, for every $A \in \mathcal{S}$,
$$n(s, A) = \mu(A) \qquad (1.1)$$
holds for almost every experiment $s$.


In fact, the probability, or measure, of obtaining an ideal experiment for which this is not satisfied is 0. Consider for instance the theoretically possible experiment $(6, 6, 6, \dots)$. This does not satisfy equation (1.1). There are uncountably many such sequences that fail this equation. However, if you group them all together into some set $A_0$, probability theory tells us that even this whole set has measure 0 in $S^{\mathbb{N}}$ with respect to our measure $\mu^{\mathbb{N}}$. Thus the set of all sequences which satisfy equation (1.1) covers all of the space $S^{\mathbb{N}}$ from a probabilistic, or measure-theoretic, point of view. At this moment we take time to introduce the notion of almost everywhere. Given a space $M$ with a measure $\mu$ on it, a certain property is said to hold almost everywhere if the property holds on a full measure subset $A \subset M$. In this case we can rephrase what we have said as follows: for almost every sequence $s \in S^{\mathbb{N}}$, equation (1.1) holds true. From a physical point of view this says that the probability of observing something contrary to equation (1.1) is 0.
Keep this example in mind, as it will be very useful when we want to compare ergodic dynamical systems to probabilistic ones.
The same concepts apply directly to the continuous case of dart throwing as well, and we leave that as an exercise.
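Equation (1.1) can be checked numerically for the die experiment. The following sketch (mine, with illustrative names) simulates one long "ideal experiment" and computes the frequency with which the first entry of the shifted sequence $\sigma^j(s)$ lands in $A = \{1, 2, 3\}$; it should approach $\mu(A) = 1/2$:

```python
import random

def first_entry_frequency(sequence, A):
    """Frequency with which the first entry of sigma^j(s), j = 0, ..., k-1, lies in A.
    Since the first entry of sigma^j(s) is simply s[j], this is a plain running frequency."""
    return sum(1 for x in sequence if x in A) / len(sequence)

random.seed(0)
s = [random.randint(1, 6) for _ in range(100_000)]  # one "ideal experiment", truncated
A = {1, 2, 3}
print(first_entry_frequency(s, A))  # close to mu(A) = 1/2
```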
Remark 1.4. Formally, the triple $(S, \mathcal{S}, \mu)$ is a probability space, where $S$ is any space, $\mathcal{S}$ is a $\sigma$-algebra of subsets of $S$ and $\mu$ is a measure on this $\sigma$-algebra. For instance, in the dart-throwing example, $\mathcal{S}$ is the smallest $\sigma$-algebra that contains all the open and closed subsets. The construction obtained by passing from $S$ to $S^{\mathbb{N}}$ is called a Bernoulli scheme. Bernoulli schemes have very deep and intricate connections to certain kinds of dynamical systems; in these notes we will explore their connection to ergodic systems.
We also remark that, for any finite set $S$ with at least two elements, $S^{\mathbb{N}}$ is not countable. Indeed, consider the case $S = \{0, 1, 2, \dots, 9\}$: one can map $S^{\mathbb{N}}$ onto the interval $[0, 1]$ using decimal expansions of real numbers, which shows that $S^{\mathbb{N}}$ is uncountable.

2 Dynamical Systems

A dynamical system is a way to represent something that changes with time. For instance, we might want to find a representation for a classical mechanics system that changes under the application of forces. Newton's laws and the theory of ordinary differential equations tell us that such a system can be uniquely represented by its coordinates and velocities, which together make up the phase space $M = \mathbb{R}^{2n}$. Each trajectory in the phase space can be described by a flow
$$\varphi_t(x) : M \times \mathbb{R} \to M, \qquad \varphi_t(\varphi_s(x)) = \varphi_{t+s}(x),$$
or by its time discretization, with $n \in \mathbb{Z}$,
$$\varphi^n(x) = \varphi(\varphi(\cdots\varphi(x)\cdots)) : M \times \mathbb{Z} \to M.$$
In the first case the forward orbit of a point is given by $O(x) = \{\varphi_t(x),\ t \in \mathbb{R}^+\}$ and in the second case by $O(x) = \{\varphi^n(x),\ n \in \mathbb{N}\}$. The second case, where we model the evolution by a discrete-time model, is closer to the probability models we have described in the first section.
Indeed, the orbit of a point can be given as a sequence $(x, \varphi(x), \varphi^2(x), \dots)$, or as a point in $M^{\mathbb{N}}$. Then, if we are given a measure $\mu$ on $M$, the asymptotics of the map
$$\sigma(x_1, x_2, \dots) = (x_2, x_3, \dots)$$
is precisely what connects the ergodicity of $\varphi$ to probability, in the light of Example 1.3. In fact, the concept of $\varphi$ being ergodic can be stated precisely as a condition on the map $\sigma$ acting on $M^{\mathbb{N}}$ that will turn out to be formula (1.1). The reader who is not familiar with the notion of measure may imagine $\mu$ as the volume; a measurable set is then any set whose volume can be measured. Much like continuous maps (maps such that the inverse image of any open set is open), measurable maps are maps such that the inverse image of any measurable set is measurable. In particular, open and closed sets are measurable and continuous maps are measurable. Therefore the reader not comfortable with the notions of measure theory can again think in terms of open sets and continuous maps.
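To make the discrete-time picture concrete, here is a small sketch of my own (the doubling map is just a convenient example, not one used in these notes) that builds the orbit sequence $(x, f(x), f^2(x), \dots)$ and checks that applying the shift map $\sigma$ to it gives the orbit started at $f(x)$:

```python
def doubling_map(x):
    """An example map f: [0, 1) -> [0, 1), f(x) = 2x mod 1."""
    return (2.0 * x) % 1.0

def orbit(f, x, n):
    """The first n points of the orbit (x, f(x), f^2(x), ...)."""
    points = []
    for _ in range(n):
        points.append(x)
        x = f(x)
    return points

x0 = 0.123456
seq = orbit(doubling_map, x0, 6)
# The shift map drops the first entry; the result is the orbit started at f(x0).
print(seq[1:] == orbit(doubling_map, doubling_map(x0), 5))  # True
```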
Now to make this connection precise we will study ergodic dynamical
systems.

3 Ergodic Dynamical Systems

The main reference for this chapter is [1].


In many contexts, physics among them, it is important to understand the asymptotic behaviour of a system starting at a certain initial condition. For instance, systems near equilibrium, such as protein molecules, tend to fold their way to the global minimum of their energy landscape over long evolution times. In this case the phase space trajectory of the protein converges to a subset of the phase space corresponding to this minimum energy. Thus evidently it is as important to understand the overall behaviour of a trajectory as to know where it is at a certain time. This leads us to ask how much time a system spends in a certain region of the phase space $M$ when it is allowed to evolve for long times. Does it keep coming back to a certain region? After a certain time does it always stay in one place? To answer these questions we evidently have to assign a measure to the space $M$, the most physical choice being again the volume measure, which we denote by $\mu$. With these questions in mind, we will first discuss discrete-time dynamics in the case where $M$ is a set satisfying $\mu(M) < \infty$, and then comment on flows and the case $M = \mathbb{R}^{2n}$. So let $f : M \to M$ be a diffeomorphism (i.e. an invertible map which is differentiable and whose inverse is differentiable) and let $\mu$ be the normalized volume measure on $M$ (so that $\mu(M) = 1$). We first give the dry definition of ergodicity. Note that a subset $A \subset M$ is called measurable if we can assign it a value $\mu(A)$ through our measure (it is a technical detail that not all sets will be measurable). The collection of all such sets we denote by $\mathcal{M}$ (mark the analogy to $S$, $\mathcal{S}$, $\mu$ in the first section).
Definition 3.1. $\mu$ is called $f$-invariant if for any $A \in \mathcal{M}$, $\mu(f^{-1}(A)) = \mu(A)$. A set $A$ is called invariant if $f^{-1}(A) = A$ (which also implies $f(A) = A$). $\mu$ is called ergodic if it is invariant and if for every invariant set $A$, $\mu(A) = 1$ or $\mu(A) = 0$ (we remind the reader that $\mu(M) = 1$, i.e. $\mu$ is the normalized volume measure on $M$, a set of finite volume).

Remark 3.2. A bit of intuition about this definition is as follows. If $f(A) = A$ then $f^n(A) = A$, and if $x \notin A$ then $f^n(x) \notin A$ for all $n$. Thus an invariant set describes a subset of $M$ which really does not interact with the rest of the system; it is an isolated part of the dynamics by itself. If a map is ergodic for the volume measure $\mu$, this means that the system does not break up into smaller invariant, isolated parts: any such part must be almost equal to $M$ (up to volume), or else it is a negligible set (in terms of volume). For instance, if you take the rotation of a disk around 0, then this map breaks up into many isolated dynamical systems of positive volume. Indeed, take any annulus in the disk centered at 0: it is invariant but has positive measure not equal to 1. But the concept of ergodicity obviously depends both on the measure and on the map. Instead of the volume, if one takes the measure to be the Dirac delta at 0, i.e. $\delta_0$, then it is intuitively clear that any invariant set has either measure 1 or 0. In fact, sets which do not contain 0 have zero measure (for instance annuli), and although $\{0\}$ is an invariant set it has full measure with respect to $\delta_0$; any smaller disk centered at 0 is also invariant and again has full measure with respect to this measure. So, as you see, it is more precise to say that if a measure $\mu$ is ergodic for $f$ then the part of $M$ which the measure $\mu$ sees does not break up into significant invariant parts. As far as $\delta_0$ is concerned, the only significant parts are those that contain 0.
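A tiny numerical check of the annulus example (my own sketch): a rotation of the plane around 0 preserves the distance to the origin, so an orbit that starts inside an annulus never leaves it, confirming that annuli are invariant sets of positive, non-full volume:

```python
import math

def rotate(point, angle):
    """Rotation of the plane around the origin by a fixed angle."""
    x, y = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y)

p = (0.3, 0.4)             # a point at distance 0.5 from the origin
for _ in range(1000):
    p = rotate(p, 1.2345)  # iterate the rotation of the disk
print(math.hypot(*p))      # still ~0.5: the orbit stays in any annulus containing it
```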
What are some other, more physical ways to define ergodicity in terms of physical quantities, i.e. observables on $M$ (such as energy and so on)? To be able to answer this we need to understand certain properties of invariant measures.

3.1 Invariant Measures

One of the most important theorems about the statistical properties of maps with invariant measures is the Poincaré recurrence theorem. It is one of the earliest theorems related to the statistical properties of invariant measures and was formulated by Poincaré to study the dynamics of celestial bodies. It states:
Theorem 3.3. Let $f : M \to M$ be a continuous (or measurable) map on some space $M$ with a measure $\mu$. If $\mu$ is $f$-invariant, then for almost every $x \in M$ and for any subset $U \subset M$ that contains $x$ and has positive measure, $f^n(x)$ returns to $U$ infinitely many times.
We note that this return time can be quite large and is inversely proportional to the measure of the set $U$, which is the content of a result known as Kac's theorem. After Poincaré's recurrence theorem, and with the motivation given by Boltzmann's ergodic hypothesis, there has been a lot of work on the statistical properties of invariant measures, which led to the Birkhoff ergodic theorem. To state this theorem, we define the following averages, the first being the mean visit time and the second the time average of an observable:
$$n(E, x) = \lim_{n \to \infty} \frac{1}{n}\, \#\{j = 0, 1, \dots, n-1 : f^j(x) \in E\}$$
$$\tilde{\varphi}(x) = \lim_{n \to \infty} \frac{1}{n} \sum_{i=0}^{n-1} \varphi(f^i(x))$$

where $\varphi : M \to \mathbb{R}$ is any measurable function, usually called an observable (for instance the energy). The tilde is put there to remind us that we do not yet know whether these limits exist. In fact, one of the hardest-to-establish main results of this theory is the following, which states that the limits do exist:
Theorem 3.4 (Birkhoff). Let $f : M \to M$ be measurable and $\mu$ an invariant measure. Given any measurable set $E$, the mean visit time $n(E, x)$ exists for almost every $x$. Moreover $\int n(E, x)\, d\mu = \mu(E)$, and $n(E, \cdot)$ is $f$-invariant, that is, $n(E, f(x)) = n(E, x)$. Similarly, given any integrable function $\varphi$, the time average $\tilde{\varphi}(x)$ exists almost everywhere and satisfies $\int \tilde{\varphi}\, d\mu = \int \varphi\, d\mu$. Again, $\tilde{\varphi}(f(x)) = \tilde{\varphi}(x)$.

Remark 3.5. A critical aspect to notice here is that for each integrable function $\varphi$ there exists a full measure set $A_\varphi$ such that the limit exists for all $x \in A_\varphi$. In principle, different integrable functions may have different full measure sets $A_\varphi$.
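The following sketch (my own; the irrational circle rotation is a standard example and not taken from these notes) estimates both averages along one orbit of $x \mapsto x + \alpha \pmod 1$, which preserves Lebesgue measure: the mean visit time of $E = [0, 0.3)$ comes out close to $\mu(E) = 0.3$ and the time average of $\varphi(x) = x^2$ close to $\int \varphi\, d\mu = 1/3$:

```python
import math

ALPHA = math.sqrt(2) - 1  # an irrational rotation number

def rotation(x):
    """The circle rotation x -> x + alpha (mod 1), which preserves Lebesgue measure."""
    return (x + ALPHA) % 1.0

def birkhoff_averages(x, n_steps=200_000):
    """Estimate the mean visit time of E = [0, 0.3) and the time average of phi(x) = x^2."""
    visits, phi_sum = 0, 0.0
    for _ in range(n_steps):
        visits += (0.0 <= x < 0.3)   # indicator of E
        phi_sum += x * x             # the observable phi
        x = rotation(x)
    return visits / n_steps, phi_sum / n_steps

print(birkhoff_averages(0.123))  # approximately (0.3, 0.333...): mu(E) and the integral of phi
```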
We note that up to this point we have not used ergodicity at all. That is why we do not have very precise information about how $\tilde{\varphi}$ or $n(E, x)$ behaves; we only know how they behave on average. The next section settles this issue. Before we move on, we also give an earlier version of Birkhoff's theorem, due to von Neumann, which may be of more interest to people doing quantum mechanics, and we also state the flow version of Birkhoff's theorem.
Theorem 3.6 (von Neumann). Let $U : H \to H$ be an isometry of a Hilbert space $H$ and let $P$ be the orthogonal projection onto the invariant subspace $I(U) = \{v \in H \text{ such that } Uv = v\}$. Then
$$\lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n-1} U^j v = P v$$
for every $v \in H$.
Here the concept of invariance is replaced by that of an isometry. We also note that here the convergence of the partial averages $\frac{1}{n}\sum_{j=0}^{n-1} U^j v$ to $Pv$ is in the norm of the Hilbert space (that is, $\|\frac{1}{n}\sum_{j=0}^{n-1} U^j v - Pv\| \to 0$), which is weaker than the almost everywhere convergence given by Birkhoff's theorem.
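A concrete finite-dimensional illustration of von Neumann's theorem (my own sketch, assuming NumPy is available): take $H = \mathbb{R}^3$ and let $U$ rotate the first two coordinates while fixing the third; the invariant subspace is the $z$-axis, and the Cesàro averages of $U^j v$ converge to the projection of $v$ onto that axis:

```python
import numpy as np

theta = 1.0  # rotation angle, not a multiple of 2*pi
U = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])  # isometry of R^3: rotate x-y plane, fix z

v = np.array([1.0, 2.0, 3.0])
n = 10_000
average = np.zeros(3)
w = v.copy()
for _ in range(n):        # Cesaro average (1/n) * sum of U^j v for j = 0, ..., n-1
    average += w
    w = U @ w
average /= n

print(average)            # close to (0, 0, 3): the orthogonal projection of v onto the z-axis
```

The rotating part of $v$ averages out, while the component along the fixed axis survives; this is exactly the statement of the theorem in this small example.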
Finally, to define the flow version of Birkhoff's theorem, we define the time averages for flows. Let $f^t : M \to M$ be a flow on some space with a measure $\mu$. The measure $\mu$ is called invariant for this flow if it is invariant for $f^t$ for every $t \in \mathbb{R}$, that is, $\mu(f^t A) = \mu(A)$ for every $t$ and every measurable set $A$. Then the mean visit time and the time average of a function are respectively defined as
$$n(E, x) = \lim_{T \to \infty} \frac{1}{T}\, \lambda(\{t : 0 \le t \le T,\ f^t(x) \in E\})$$
$$\tilde{\varphi}(x) = \lim_{T \to \infty} \frac{1}{T} \int_0^T \varphi(f^t(x))\, dt,$$
where $\lambda$ denotes the length (Lebesgue measure) of a set of times. These have the property that $\int_M n(E, x)\, d\mu = \mu(E)$ and $\int_M \tilde{\varphi}\, d\mu = \int_M \varphi\, d\mu$. Birkhoff's theorem for flows states that, given $E$ or $\varphi$, these limits exist for almost every point $x$.

4 Ergodic Measures and the Connection to Probability
In the beginning of Section 3 we gave the definition of an ergodic measure. Now we are going to state some equivalent definitions which are much more dynamical in nature, but which depend crucially on the results given in Birkhoff's ergodic theorem.
Theorem 4.1. Let $\mu$ be an invariant probability measure for a measurable transformation $f : M \to M$. The following conditions are all equivalent:
- For every measurable set $B$, $n(B, x) = \mu(B)$ for almost every point $x$.
- For every measurable set $B$, $n(B, \cdot)$ is constant almost everywhere.
- For every integrable function $\varphi$, $\tilde{\varphi}(x) = \int \varphi\, d\mu$ for almost every point $x$.
- For every integrable function $\varphi$, $\tilde{\varphi}$ is constant almost everywhere.
- For every invariant integrable function $\varphi$ (i.e. $\varphi \circ f = \varphi$), $\varphi(x) = \int \varphi\, d\mu$ for almost every point $x$.
- For every invariant integrable function $\varphi$, $\varphi(x)$ is constant almost everywhere.
- For every invariant subset $A$, either $\mu(A) = 0$ or $\mu(A) = 1$.
Remark 4.2. Although this theorem gives very powerful statistical results, the proof of Birkhoff's ergodic theorem is MUCH harder than the proof of this theorem. The proof of this theorem is actually quite easy and is not even attributed to anyone in particular. The existence of the limits in Birkhoff's theorem is really the tricky part; every now and then one can still see newly published proofs of that theorem, claiming to be simpler than the previous ones.
Now it is really easy to make the connection to the probability experiments we introduced in the first section. Take any measurable set $E$. By the theorem above, we can pick a point $x$ such that, considering the sequence $\bar{x} = (x, f(x), f^2(x), \dots)$, one has
$$\lim_{n \to \infty} \frac{1}{n}\, \#\{j = 0, 1, \dots, n-1 \text{ such that } \sigma^j(\bar{x})_1 \in E\} = \mu(E).$$
This is precisely the condition (1.1) that we stated for probabilistic experiments. The connection is therefore that generic orbits of an ergodic system behave as if they were the successive results of a probability experiment whose sample space is the space $M$ and whose probability measure is the measure $\mu$ on $M$.
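As a last numerical illustration (my own sketch; the logistic map is a standard example not used in these notes, and floating-point round-off is simply ignored here), the logistic map $f(x) = 4x(1-x)$ is ergodic for the measure $d\mu = dx / (\pi\sqrt{x(1-x)})$, and a generic orbit visits a set $E$ with frequency $\mu(E)$ rather than with frequency equal to the length of $E$:

```python
import math

def logistic(x):
    """The logistic map f(x) = 4x(1 - x), ergodic for d(mu) = dx / (pi * sqrt(x(1 - x)))."""
    return 4.0 * x * (1.0 - x)

def visit_frequency(x, in_E, n_steps=500_000):
    """Mean visit time of the orbit of x to the set E."""
    hits = 0
    for _ in range(n_steps):
        hits += in_E(x)
        x = logistic(x)
    return hits / n_steps

print(visit_frequency(0.123456, lambda x: x <= 0.25))  # ~0.333, not 0.25
print(2 / math.pi * math.asin(math.sqrt(0.25)))        # mu([0, 1/4]) = 1/3
```

The visit frequency matches $\mu(E)$, not the Lebesgue length of $E$, illustrating that ergodicity is always relative to a particular invariant measure.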
We also state the flow version of the ergodic theorem. A measure $\mu$ is called ergodic for the flow $f^t$ if every invariant set $A$ (a set such that $f^t(A) \subset A$ for all $t$) has zero or full measure. In this case ergodicity is equivalent to the statement that for every measurable set $E$,
$$n(E, x) = \lim_{T \to \infty} \frac{1}{T}\, \lambda(\{t : 0 \le t \le T,\ f^t(x) \in E\}) = \mu(E)$$
for almost every $x$, which is to say that the asymptotic fraction of time the orbit of $x$ spends in $E$ equals the measure of $E$.

5 Boltzmann's Ergodic Hypothesis

Finally, we comment on Boltzmann's ergodic hypothesis and on the sense in which it turned out not to be completely true. The systems that interested Boltzmann were very complicated conservative Hamiltonian systems, and he had the intuition that such systems could be described in a statistical manner. This intuition fueled the subsequent studies in dynamical systems, where the common idea was that if one wants to study a dynamical system in complete generality, then the statistical approach is the only viable one. Together with Boltzmann, Poincaré was one of the fathers of this idea. To study Hamiltonian systems, which are described on phase spaces (of configurations and momenta), one starts with a theorem which preceded these observations, namely Liouville's theorem, which asserts that the volume measure of the phase space is invariant under any Hamiltonian flow. Note, however, that one cannot directly use this measure: it is not a probability measure, it is not even finite, and the ergodic theory of non-finite measures does not have very useful analogues of Birkhoff's theorem. Therefore one uses the observation that the energy level sets (that is, the points of the phase space that correspond to the same energy) of a conservative Hamiltonian system are invariant; we do not have to consider the whole phase space, it is enough to consider each energy level set separately. In this case we can describe the normalized Liouville measure. Given an energy level set $S$ (assume it has enough regularity to be a hypersurface), we can consider the restriction of the Liouville measure to $S$, which we denote by $\lambda_S$; this gives a volume measure on $S$. Then we divide this by the norm of the gradient of the Hamiltonian to obtain
$$d\mu_S = \frac{1}{\|\nabla H\|}\, d\lambda_S.$$
In the case where $S$ is a bounded hypersurface (for instance when the system is constrained to a certain region of configuration space), this measure is (after normalization) a probability measure on $S$, and one can study the ergodicity properties of the Hamiltonian flow restricted to this surface. It was Boltzmann's expectation that this restricted flow is ergodic, and in particular that the trajectory of an orbit fills up the whole energy surface. He did not use measure-theoretic notions at the time, but it is now known, thanks to the Kolmogorov-Arnold-Moser (KAM) theorem, that this is typically not true even in a measure-theoretic sense. There are Hamiltonian systems whose energy level sets contain invariant subsets of non-zero and non-full measure (with respect to the normalized Liouville measure), which contradicts the definition of an ergodic measure. Not only that, but such systems retain this property under small perturbations of the Hamiltonian, which implies that such systems are physically observable. Nevertheless, Boltzmann's expectation is true for certain types of systems, which may be called systems near equilibrium. This idea forms much of the basis of modern equilibrium thermodynamics, where the time average of an observable of a system is accepted to be the same as its space average (that is, the ensemble average). This idea is exploited in many simulations of such systems, such as protein folding, where a protein near equilibrium (the folded configuration) is assumed to sample the phase space near the equilibrium over long enough simulation times.

References
[1] M. Viana, K. Oliveira, Fundamentos da Teoria Ergódica.
