
ISSN 0070-7414

Sgríbhinní Institiúid Ard-Léinn Bhaile Átha Cliath, Sraith A, Uimh. 30
Communications of the Dublin Institute for Advanced Studies
Series A (Theoretical Physics), No. 30

LECTURES ON
MATHEMATICAL
STATISTICAL MECHANICS
By

S. Adams

DUBLIN

Institiúid Ard-Léinn Bhaile Átha Cliath
Dublin Institute for Advanced Studies
2006

Contents

1 Introduction

2 Ergodic theory
  2.1 Microscopic dynamics and time averages
  2.2 Boltzmann's heuristics and ergodic hypothesis
  2.3 Formal Response: Birkhoff and von Neumann ergodic theories
  2.4 Microcanonical measure

3 Entropy
  3.1 Probabilistic view on Boltzmann's entropy
  3.2 Shannon's entropy

4 The Gibbs ensembles
  4.1 The canonical Gibbs ensemble
  4.2 The Gibbs paradox
  4.3 The grandcanonical ensemble
  4.4 The orthodicity problem

5 The thermodynamic limit
  5.1 Definition
  5.2 Thermodynamic function: Free energy
  5.3 Equivalence of ensembles

6 Gibbs measures
  6.1 Definition
  6.2 The one-dimensional Ising model
  6.3 Symmetry and symmetry breaking
  6.4 The Ising ferromagnet in two dimensions
  6.5 Extreme Gibbs measures
  6.6 Uniqueness
  6.7 Ergodicity

7 A variational characterisation of Gibbs measures

8 Large deviations theory
  8.1 Motivation
  8.2 Definition
  8.3 Some results for Gibbs measures

9 Models
  9.1 Lattice gases
  9.2 Magnetic models
  9.3 Curie-Weiss model
  9.4 Continuous Ising model

Preface

In these notes we give an introduction to mathematical statistical mechanics, based on the six lectures given at the Max Planck Institute for Mathematics in the Sciences in February/March 2006. The material covers more than was said in the lectures; in particular, examples and some proofs are worked out, and the Curie-Weiss model is discussed in Section 9.3. The course partially grew out of lectures given for final-year students at University College Dublin in spring 2004. Parts of the notes are inspired by notes of Joe Pule at University College Dublin.

The aim is to motivate the theory of Gibbs measures starting from basic principles in classical mechanics. The first part covers Sections 1 to 5 and gives a route from physics to the mathematical concepts of Gibbs ensembles and the thermodynamic limit. Sections 6 to 8 develop a mathematical theory for Gibbs measures. In Subsection 6.4 we give a proof of the existence of phase transitions for the two-dimensional Ising model via Peierls arguments. Translation-invariant Gibbs measures are characterised by a variational principle, which we outline in Section 7. Section 8 gives a quick introduction to the theory of large deviations, and Section 9 covers some models of statistical mechanics. The part about Gibbs measures is an excerpt of parts of the book by Georgii [Geo88]. In these notes we discuss neither Boltzmann's equation, nor fluctuation theory, nor quantum mechanics.

Some comments on the literature. More detailed hints are found throughout the notes. The books [Tho88] and [Tho79] are suitable for readers who want to learn more about the physics behind the theory. A standard reference in physics is still the book [Hua87]. The route from microphysics to macrophysics is well described in [Bal91] and [Bal92]. The old book [Kur60] is a nice starting point, developing an axiomatics for statistical mechanics out of classical mechanics. The following books are on a higher level, with special emphasis on the mathematics. The first is [Khi49], where the setup for the microcanonical measures is given in detail (although not in the modern manner used nowadays). The standard reference for mathematical statistical mechanics is the book [Rue69] by Ruelle. Further developments are in [Rue78] and [Isr79]. The book [Min00] contains notes for a lecture and presents in detail the two-dimensional Ising model and the Pirogov-Sinai theory; the latter we do not study here. [Gal99] gives a nice overview of deep questions in statistical mechanics, whereas [Ell85] and [Geo79], [Geo88] put their emphasis on probability theory and large deviations theory. The book [EL02] gives a very nice introduction to the philosophical background as well as the basic skeleton of statistical mechanics.

I hope these lectures will motivate further reading and perhaps even further research in this interesting field of mathematical physics and stochastics.
Many thanks to Thomas Blesgen for reading the manuscript. In particular I
thank Tony Dorlas, who gave valuable comments and improvements.
Leipzig, Easter 2006

Stefan Adams


1 Introduction

The aim of equilibrium statistical mechanics is to derive all the equilibrium properties of a macroscopic system from the dynamics of its constituent particles. Thus its aim is not only to derive the general laws of thermodynamics but also the thermodynamic functions of a given system. Mathematical statistical mechanics has originated from the desire to obtain a mathematical understanding of a class of physical systems of the following nature:

The system is an assembly of identical subsystems.

The number of subsystems is large (N ≈ 10^23; Avogadro's number is 6.023 × 10^23, e.g. 1 cm^3 of hydrogen contains about 2.7 × 10^19 molecules/atoms).

The interactions between the subsystems are such as to produce a thermodynamic behaviour of the system.

Thermodynamic behaviour is phenomenological and refers to a macroscopic description of the system. Here "macroscopic description" is operationally defined in the sense that the subsystems are considered as small and are not individually observed.

Thermodynamic behaviour:

(1) Equilibrium states are defined operationally. A state of an isolated system tends to an equilibrium state as time tends to +∞ (approach to equilibrium).

(2) An equilibrium state of a system consists of one or more macroscopically homogeneous regions called phases.

(3) Equilibrium states can be parametrised by a finite number of thermodynamic parameters (e.g. temperature, volume, density, etc.) which determine the thermodynamic functions (e.g. free energy, pressure, entropy, magnetisation, etc.).

It is believed that the thermodynamic functions depend piecewise analytically (or smoothly) on the parameters, and that singularities correspond to changes in the phase structure of the system (phase transitions). Classical thermodynamics consists of laws governing the dependence of the thermodynamic functions on the experimentally accessible parameters. These laws are derived from experiments with macroscopic systems and thus are not derived from a microscopic description. Basic principles are the zeroth, first and second law, as well as the equation of state for the ideal gas.

The art of the mathematical physicist consists in finding a mathematical justification for the statements (1)-(3) from a microscopic description. Complete microscopic information is not accessible and is not of interest. Hence, despite the determinism of the dynamical laws for the subsystems, randomness comes into play due to the lack of knowledge (macroscopic description). The large number of subsystems is replaced, in a mathematical idealisation, by infinitely many subsystems, such that the extensive quantities are scaled so as to stay finite in that limit. Stochastic limit procedures such as the law of large numbers, central limit theorems and large deviations principles will provide the appropriate tools. In these notes we will give a glimpse of the basic concepts. In the second chapter we are concerned mainly with the Mechanics in the name Statistical Mechanics. Here we motivate the basic concepts of ensembles via Hamilton's equations of motion, as done by Boltzmann and Gibbs.

2 Ergodic theory

2.1 Microscopic dynamics and time averages

We consider in the following N identical classical particles moving in R^d, d ≥ 1, or in a finite box Λ ⊂ R^d. We idealise these particles as point masses of mass m. All spatial positions and momenta of the single particles are elements of the phase space

    Γ = (R^d × R^d)^N    or    Γ_Λ = (Λ × R^d)^N.    (2.1)

Specify, at a given instant of time, the values of the positions and momenta of the N particles. Hence, one has to specify 2dN coordinate values that determine a single point in the phase space Γ, respectively Γ_Λ. Each single point in the phase space corresponds to a microscopic state of the given system of N particles. Now the question arises whether the 2dN-dimensional continuum of microscopic states is reasonable. Going back to Boltzmann [Bol84], it seems that at that time the 2dN-dimensional continuum was not really deeply accepted ([Bol74], p. 169):

    Therefore if we wish to get a picture of the continuum in words, we first have to imagine a large, but finite number of particles with certain properties and investigate the behaviour of the ensembles of such particles. Certain properties of the ensemble may approach a definite limit as we allow the number of particles ever more to increase and their size ever more to decrease. Of these properties one can then assert that they apply to a continuum, and in my opinion this is the only non-contradictory definition of a continuum with certain properties...

and likewise the phase space itself is really thought of as divided into a finite number of very small cells of essentially equal dimensions, each of which determines the position and momentum of each particle with a maximum precision. Here the maximum precision that the most perfect measurement apparatus can possibly provide is meant. Thus, for any position and momentum coordinates,

    Δp_i^(j) Δq_i^(j) ≥ h,    for i = 1, ..., N, j = 1, ..., d,

with h = 6.62 × 10^-34 Js being Planck's constant. Microscopic states of the system of N particles should thus be represented by phase space cells consisting of points in R^{2dN} (positions and momenta together) with given centres and cell volumes h^{dN}. In principle all that follows should be formulated within this cell picture, in particular when one is interested in an approach to the quantum case. However, in these lectures we will stick to the mathematical idealisation of the 2dN-dimensional continuum of microscopic states, because all important properties remain nearly unchanged upon going over to the phase cell picture.
Let two functions W : R^d → R and V : R_+ → R be given. The energy of the system of N particles is a function of the positions and momenta of the single particles, and it is called the Hamiltonian or Hamilton function. It is of the form

    H(q, p) = Σ_{i=1}^{N} ( p_i²/(2m) + W(q_i) ) + Σ_{1≤i<j≤N} V(|q_i − q_j|),    (2.2)

where q = (q_1, ..., q_N) and p = (p_1, ..., p_N). Here, the function W is called the external potential, acting at the spatial positions due to external forces (walls, pressure, ...) or external fields (gravitation, magnetic field, ...), and the function V is called the pair potential, depending only on the spatial distance of each pair of particles. More general many-particle interaction potentials can also be considered (see [Rue69] for details). In the following we assume that the Hamiltonian H : Γ → R is twice continuously differentiable, and we abbreviate n = dN. The phase space dynamics is governed by Hamilton's equations of motion

    q̇_i = ∂H/∂p_i    and    ṗ_i = −∂H/∂q_i,    i = 1, ..., N,    (2.3)

where the dot denotes, as usual, differentiation with respect to the time variable. If J denotes the 2n × 2n matrix

    J = (   0     1l_n )
        ( −1l_n    0  ),

with 1l_n the identity in R^n, the Hamiltonian vector field is given as

    v : Γ → R^{2n},  x ↦ J ∇H(x),

with x = (q, p) ∈ R^{2n} and

    ∇H(x) = ( ∂H/∂q_1, ..., ∂H/∂q_N, ∂H/∂p_1, ..., ∂H/∂p_N ).

With the vector field v we associate the differential equation

    (d/dt) x(t) = v(x(t)),    (2.4)

where x(t) denotes a single microstate of the system at time t ∈ R. For each point x ∈ Γ there is one and only one function x : R → Γ such that x(0) = x and dx(t)/dt = v(x(t)) for any t ∈ R. For any t ∈ R we define a phase space map

    φ_t : Γ → Γ,  x ↦ φ_t(x) = x(t).

From the uniqueness property we get that Φ = {φ_t : t ∈ R} is a one-parameter group, which is called a Hamiltonian flow. Hamiltonian flows have the following property.
Lemma 2.1 Let Φ be a Hamiltonian flow with Hamiltonian H. Then any function F = f ∘ H of H is invariant under Φ:

    F ∘ φ_t = F    for all t ∈ R.
Proof. The proof follows from the chain rule and ⟨x, Jx⟩ = 0. Recall that if G : R^n → R^m, then G'(x) is the m × n matrix

    ( ∂G_i/∂x_j (x) )_{i=1,...,m; j=1,...,n}    for x ∈ R^n.

Then

    (d/dt) F(φ_t(x)) = F'(φ_t(x)) (dφ_t/dt)(x)
                     = f'(H(φ_t(x))) H'(φ_t(x)) (dφ_t/dt)(x)
                     = f'(H(φ_t(x))) (∇H(φ_t(x)))^T (dφ_t/dt)(x)
                     = f'(H(φ_t(x))) ⟨∇H(φ_t(x)), (dφ_t/dt)(x)⟩
                     = f'(H(φ_t(x))) ⟨∇H(φ_t(x)), J ∇H(φ_t(x))⟩ = 0.


The following theorem is the well-known Liouville's theorem.

Theorem 2.2 (Liouville's Theorem) The Jacobian |det φ'_t(x)| of a Hamiltonian flow is constant and equal to 1.
Proof. Let M(t) and A(t) be linear mappings such that

    dM(t)/dt = A(t) M(t)    for all t ≥ 0;

then

    det M(t) = det M(0) exp( ∫_0^t trace A(s) ds ).

Now

    dφ_t(x)/dt = (v ∘ φ_t)(x),

and thus

    dφ'_t(x)/dt = v'(φ_t(x)) φ'_t(x),

so that det φ'_t(x) = exp( ∫_0^t trace v'(φ_s(x)) ds ), because φ'_0(x) is the identity map on Γ. Now

    v(x) = J ∇H(x)    and    v'(x) = J H''(x),

where H''(x) = ( ∂²H/∂x_i ∂x_j (x) ) is the Hessian of H at x. Since H is twice continuously differentiable, H''(x) is symmetric. Thus

    trace(J H''(x)) = trace((J H''(x))^T) = trace((H''(x))^T J^T)
                    = trace(H''(x)(−J)) = −trace(J H''(x)).

Therefore trace(J H''(x)) = 0 and det(φ'_t(x)) = 1.
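Theorem 2.2 can also be observed numerically. The sketch below (an illustrative addition, assuming a pendulum Hamiltonian H(q, p) = p²/2 − cos q that is not discussed in the text) estimates the Jacobian determinant of the flow map x ↦ φ_t(x) by central finite differences; since the leapfrog integrator used is itself volume preserving, the result is 1 up to small numerical error.

```python
import math

def flow(q, p, t, steps=4000):
    # Leapfrog integration of (2.3) for the pendulum H(q, p) = p^2/2 - cos(q)
    dt = t / steps
    for _ in range(steps):
        p -= 0.5 * dt * math.sin(q)
        q += dt * p
        p -= 0.5 * dt * math.sin(q)
    return q, p

def jacobian_det(q, p, t, h=1e-5):
    # Central-difference Jacobian of the map (q, p) -> phi_t(q, p)
    qp_plus, qp_minus = flow(q + h, p, t), flow(q - h, p, t)
    pp_plus, pp_minus = flow(q, p + h, t), flow(q, p - h, t)
    a = (qp_plus[0] - qp_minus[0]) / (2 * h)   # d q(t) / d q(0)
    b = (pp_plus[0] - pp_minus[0]) / (2 * h)   # d q(t) / d p(0)
    c = (qp_plus[1] - qp_minus[1]) / (2 * h)   # d p(t) / d q(0)
    d = (pp_plus[1] - pp_minus[1]) / (2 * h)   # d p(t) / d p(0)
    return a * d - b * c

print(jacobian_det(0.7, 0.3, t=5.0))   # approximately 1, as Liouville's theorem asserts
```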

From Lemma 2.1 and Theorem 2.2 it follows that a probability measure on the phase space Γ (i.e., an element of the set P(Γ, B_Γ) of probability measures on Γ with Borel σ-algebra B_Γ) whose Radon-Nikodym density with respect to the Lebesgue measure is a function of the Hamiltonian H alone is stationary with respect to the Hamiltonian flow Φ = {φ_t : t ∈ R}.

Corollary 2.3 Let μ ∈ P(Γ, B_Γ) with density ρ = F ∘ H for some function F : R → R be given, i.e., μ(A) = ∫_A ρ(x) dx for any A ∈ B_Γ. Then

    μ ∘ φ_t^{-1} = μ    for any t ∈ R.

Proof. Using Lemma 2.1 (ρ ∘ φ_t = ρ) and Theorem 2.2 (|det φ'_t(x)| = 1), we have

    μ(A) = ∫ 1l_A(x) ρ(x) dx = ∫ 1l_A(φ_t(x)) ρ(φ_t(x)) |det φ'_t(x)| dx
         = ∫ 1l_{φ_t^{-1}A}(x) ρ(x) dx = μ(φ_t^{-1} A).

For such a stationary probability measure one gets a unitary group of time evolution operators in an appropriate Hilbert space as follows.

Theorem 2.4 (Koopman's Lemma) Let Γ_1 be a subset of the phase space Γ invariant under the flow Φ, i.e., φ_t Γ_1 ⊂ Γ_1 for all t ∈ R. Let μ ∈ P(Γ_1, B_{Γ_1}) be a probability measure on Γ_1 stationary under the flow Φ, that is μ ∘ φ_t^{-1} = μ for all t ∈ R. Define U_t f = f ∘ φ_t for any t ∈ R and any function f ∈ L²(Γ_1, μ). Then {U_t : t ∈ R} is a unitary group of operators on the Hilbert space L²(Γ_1, μ).

Proof. Since U_t U_{−t} = U_0 = I, U_t is invertible, and thus we only have to prove that U_t preserves inner products:

    ⟨U_t f, U_t g⟩ = ∫ (f̄ ∘ φ_t)(x) (g ∘ φ_t)(x) μ(dx) = ∫ (f̄ g)(φ_t(x)) μ(dx)
                   = ∫ (f̄ g)(x) (μ ∘ φ_t^{-1})(dx) = ∫ (f̄ g)(x) μ(dx) = ⟨f, g⟩.



Remark 2.5 (Boundary behaviour) If the particles move inside a finite-volume box Λ ⊂ R^d according to the equations (2.3), respectively (2.4), these equations of motion do not hold when one of the particles reaches the boundary of Λ. Therefore it is necessary to add to these equations some rules of reflection of the particles at the inner boundary of the domain Λ. For example, we can consider the elastic reflection condition: the angle of incidence is equal to the angle of reflection. Formally, such a rule can be specified by a boundary potential V_bc.
We discuss briefly Boltzmann's proposal for calculating measured values of observables. Observables are bounded continuous functions on the phase space. Let Γ_1 ⊂ Γ be a subset of the phase space invariant under the flow Φ, i.e. φ_t Γ_1 ⊂ Γ_1 for all t ∈ R. Suppose that f : Γ → R is an observable and suppose that the system is in Γ_1, so that it never leaves Γ_1. Boltzmann proposed that when we make a measurement it is not sharp in time but takes place over a period of time which is long compared to, say, the times between collisions. Therefore we can represent the observed value by

    f̄(x) = lim_{T→∞} (1/2T) ∫_{−T}^{T} dt (f ∘ φ_t)(x).

Suppose that μ is a probability measure on Γ_1 invariant with respect to φ_t. Then

    ∫_{Γ_1} f̄(x) μ(dx) = lim_{T→∞} (1/2T) ∫_{−T}^{T} dt ∫_{Γ_1} (f ∘ φ_t)(x) μ(dx)
                        = lim_{T→∞} (1/2T) ∫_{−T}^{T} dt ∫_{Γ_1} f(x) (μ ∘ φ_t^{-1})(dx)
                        = lim_{T→∞} (1/2T) ∫_{−T}^{T} dt ∫_{Γ_1} f(x) μ(dx)
                        = ∫_{Γ_1} f(x) μ(dx).

Assume now that the observed value is independent of where the system is at t = 0 in Γ_1, i.e. f̄(x) = f̃ for a constant f̃. Then

    ∫_{Γ_1} f̄(x) μ(dx) = ∫_{Γ_1} f̃ μ(dx) = f̃,

and therefore

    f̃ = ∫_{Γ_1} f(x) μ(dx).

We have made two assumptions in this argument:

(1) lim_{T→∞} (1/2T) ∫_{−T}^{T} dt (f ∘ φ_t)(x) exists.
(2) f̄(x) is constant on Γ_1.

Statement (1) has been proved by Birkhoff (Birkhoff's pointwise ergodic theorem, see Section 2.3 below): f̄(x) exists almost everywhere. We shall prove a weaker version of this, von Neumann's ergodic theorem. Statement (2) is more difficult and we shall discuss it later.
The continuity of the Hamiltonian flow entails that for each f ∈ C_b(Γ) and each x ∈ Γ the function

    f_x : R → R,  t ↦ f_x(t) = f(x(t))    (2.5)

is a continuous function of the time t. For any time-dependent function ψ we define the limit

    ⟨ψ⟩ := lim_{T→∞} (1/2T) ∫_{−T}^{T} dt ψ(t)    (2.6)

as the time average.

2.2 Boltzmann's heuristics and ergodic hypothesis

Boltzmann's argument (1896-1898) for the introduction of ensembles in statistical mechanics can be broken down into steps which were unfortunately entangled.
1. Step: Find a set of time-dependent functions which admit an invariant mean ⟨·⟩ as in (2.6).

2. Step: Find a reasonable set of observables which ensure that the time average of each observable over an orbit in the phase space is independent of the orbit.

Let C_b(R) be the set of all bounded continuous functions on the real line R, equip it with the supremum norm ||ψ|| = sup_{t∈R} |ψ(t)|, and define

    M = { ψ ∈ C_b(R) : ⟨ψ⟩ = lim_{T→∞} (1/2T) ∫_{−T}^{T} dt ψ(t) exists }.    (2.7)

Lemma 2.6 There exist positive linear functionals η : C_b(R) → R, normalised to 1 and invariant under time translations, i.e., such that

(i) η(ψ) ≥ 0 for all ψ ∈ C_b(R) with ψ ≥ 0,
(ii) η is linear,
(iii) η(1) = 1,
(iv) η(ψ_s) = η(ψ) for all s ∈ R, where ψ_s(t) = ψ(t − s) for t, s ∈ R,

and such that η(ψ) = lim_{T→∞} (1/2T) ∫_{−T}^{T} dt ψ(t) for all ψ ∈ C_b(R) for which this limit exists.

Our time evolution

    (t, x) ∈ R × Γ ↦ φ_t(x) = x(t)

is continuous, and thus we can substitute f_x from (2.5) for ψ in the above result. This allows us to define averages of observables as follows.

Lemma 2.7 For every f ∈ C_b(Γ) and every x ∈ Γ, there exists a time-invariant mean η_x given by

    η_x : C_b(Γ) → R,  f ↦ η_x(f) = η(f_x),

with η_x depending only on the orbit {x(t) : t ∈ R, x(0) = x}.

For any E ∈ R_+ let

    Ω_E = {(q, p) ∈ Γ : H(q, p) = E}

denote the energy surface for the energy value E for a given Hamiltonian H of the system of N particles.

The strict ergodicity hypothesis: The energy surface contains exactly one orbit, i.e. for every x ∈ Γ and E ≥ 0,

    {x(t) : t ∈ R, x(0) = x} = Ω_E.

There is a more realistic mathematical version of this conjecture.

The ergodicity hypothesis: Each orbit in the phase space is dense on its energy surface, i.e. {x(t) : t ∈ R, x(0) = x} is a dense subset of Ω_E.

2.3 Formal Response: Birkhoff and von Neumann ergodic theories

We present briefly in this section the important results in the field of ergodic theory initiated by the ergodic hypothesis. For that we introduce the notion of a classical dynamical system.
Notation 2.8 (Classical dynamical system) A classical dynamical system is a quadruple (Γ, F, μ; Φ) consisting of a probability space (Γ, F, μ), where F is a σ-algebra on Γ, together with a one-parameter (additive) group T (T = R or Z) and a group of actions Φ : T × Γ → Γ, (t, x) ↦ φ_t(x), of the group T on the phase space Γ, such that the following holds:

(a) (t, x) ↦ f(φ_t(x)) is measurable for any measurable f : Γ → R,
(b) φ_t ∘ φ_s = φ_{t+s} for all s, t ∈ T,
(c) μ(φ_t(A)) = μ(A) for all t ∈ T and A ∈ F.


Theorem 2.9 (Birkhoff) Let (Γ, F, μ; Φ) be a classical dynamical system. For every f ∈ L¹(Γ, F, μ), let

    η_x^T(f) = (1/2T) ∫_{−T}^{T} dt f(φ_t(x)).

Then there exists an event A_f ∈ F with μ(A_f) = 1 such that

(i) η_x(f) = lim_{T→∞} η_x^T(f) exists for all x ∈ A_f,
(ii) η_{φ_t(x)}(f) = η_x(f) for all (t, x) ∈ R × A_f,
(iii) ∫_Γ μ(dx) η_x(f) = ∫_Γ μ(dx) f(x).

Proof. See [Bir31] or the book [AA68].

Note that in Birkhoff's theorem one has convergence almost surely. There exists a weaker version, the following ergodic theorem of von Neumann. We restrict in the following to classical dynamical systems with T = R, the real time. Let H = L²(Γ_1, F, μ) and define U_t f = f ∘ φ_t for any f ∈ H. Then by Koopman's lemma U_t is unitary for any t ∈ R.

Theorem 2.10 (Von Neumann's Mean Ergodic Theorem) Let

    M = {f ∈ H : U_t f = f for all t ∈ R};

then for any g ∈ H,

    g_T := (1/T) ∫_0^T dt U_t g

converges to P g as T → ∞, where P is the orthogonal projection onto M.
For the proof of this theorem we need the following discrete version.

Theorem 2.11 Let H be a Hilbert space and let U : H → H be a unitary operator. Let N = {f ∈ H : U f = f} = ker(U − I). Then

    lim_{N→∞} (1/N) Σ_{n=0}^{N−1} U^n g = Q g    for any g ∈ H,

where Q is the orthogonal projection onto N.

Proof of Theorem 2.10. Let U = U_1 and g = ∫_0^1 dt U_t f. Then

    U^n g = ∫_0^1 dt U_{n+t} f = ∫_n^{n+1} dt U_t f,

and thus

    Σ_{n=0}^{N−1} U^n g = ∫_0^N dt U_t f.

Therefore, by Theorem 2.11, (1/N) ∫_0^N dt U_t f converges as N → ∞. For T ∈ R_+, by writing T = N + r where 0 ≤ r < 1 and N ∈ N, we deduce that (1/T) ∫_0^T dt U_t f converges as T → ∞. Define the operator P by

    P f = lim_{T→∞} (1/T) ∫_0^T dt U_t f.

Note that P f ∈ M. If f ∈ M then clearly P f = f, while if f ⊥ M then for all g ∈ H,

    ⟨P f, g⟩ = ⟨f, P* g⟩ = 0,

since P* g = lim_{T→∞} (1/T) ∫_0^T dt U_{−t} g ∈ M; therefore P f = 0. Thus P is the orthogonal projection onto M.

Proof of the discrete form, Theorem 2.11. We first check that

    [ker(I − U)]^⊥ = cl(Range(I − U)),

where cl denotes the closure. If f ∈ ker(I − U) and g = (I − U)h ∈ Range(I − U), then, since U f = f and hence U* f = U^{-1} f = f,

    ⟨f, g⟩ = ⟨f, (I − U)h⟩ = ⟨(I − U*)f, h⟩ = 0.

Thus Range(I − U) ⊂ [ker(I − U)]^⊥, and since [ker(I − U)]^⊥ is closed,

    cl(Range(I − U)) ⊂ [ker(I − U)]^⊥.

If f ∈ [Range(I − U)]^⊥, then for all g ∈ H,

    0 = ⟨f, (I − U)g⟩ = ⟨(I − U*)f, g⟩,

so that (I − U*)f = 0 and, U being unitary, also (I − U)f = 0, that is f ∈ ker(I − U). Thus

    [Range(I − U)]^⊥ ⊂ ker(I − U),

and hence

    [ker(I − U)]^⊥ ⊂ [Range(I − U)]^{⊥⊥} = cl(Range(I − U)).

If g ∈ Range(I − U), then g = (I − U)h for some h. Therefore

    (1/N) Σ_{n=0}^{N−1} U^n g = (1/N) { h − Uh + Uh − U²h + U²h − U³h + ... + U^{N−1}h − U^N h }
                              = (1/N) { h − U^N h }.

Thus

    || (1/N) Σ_{n=0}^{N−1} U^n g || ≤ 2||h||/N → 0    as N → ∞.

Approximating elements of cl(Range(I − U)) by elements of Range(I − U), we have that (1/N) Σ_{n=0}^{N−1} U^n g → 0 = Q g for all

    g ∈ [ker(I − U)]^⊥ = cl(Range(I − U)).

If g ∈ ker(I − U), then

    (1/N) Σ_{n=0}^{N−1} U^n g = g = Q g.

The general case follows by decomposing g = Qg + (I − Q)g.
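A finite-dimensional illustration of Theorem 2.11 (an added example, not from the notes): let U be the rotation of R³ about the z-axis by an angle that is not a rational multiple of 2π. Then ker(U − I) is the z-axis, and the Cesàro means (1/N) Σ_{n=0}^{N−1} U^n g converge to the orthogonal projection Qg of g onto that axis.

```python
import math

theta = 1.0   # 1 radian; since pi is irrational, only the z-axis is fixed by U
c, s = math.cos(theta), math.sin(theta)

def U(v):
    # Rotation about the z-axis: orthogonal, hence unitary on R^3
    x, y, z = v
    return (c * x - s * y, s * x + c * y, z)

def cesaro_mean(g, N):
    # (1/N) * sum_{n=0}^{N-1} U^n g
    total, v = [0.0, 0.0, 0.0], g
    for _ in range(N):
        total = [a + b for a, b in zip(total, v)]
        v = U(v)
    return [a / N for a in total]

g = (1.0, 2.0, 3.0)
print(cesaro_mean(g, 100000))   # tends to Qg = (0, 0, 3)
```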


Definition 2.12 (Ergodicity) Let Φ = (φ_t)_{t∈R} be a flow on Γ_1 and μ a probability measure on Γ_1 which is stationary with respect to Φ. μ is said to be ergodic if for every measurable set F ⊂ Γ_1 such that φ_t(F) = F for all t ∈ R, we have μ(F) = 0 or μ(F) = 1.
Theorem 2.13 (Ergodic flows) Φ = (φ_t)_{t∈R} is ergodic if and only if the only functions in L²(Γ_1, μ) which satisfy f ∘ φ_t = f are the constant functions.

Proof. Below a.s. (almost surely) means that the statement is true except on a set of zero measure. Suppose that the only invariant functions are the constant functions. If φ_t(F) = F for all t, then 1l_F is an invariant function and so 1l_F is constant a.s., which means that 1l_F(x) = 0 a.s. or 1l_F(x) = 1 a.s. Therefore μ(F) = 0 or μ(F) = 1.

Conversely, suppose φ_t is ergodic and f ∘ φ_t = f. Let F = {x : f(x) < a}. Then φ_t(F) = F, since

    φ_t(F) = {φ_t(x) : f(x) < a} = {φ_t(x) : f(φ_t(x)) < a} = F.

Therefore μ(F) = 0 or μ(F) = 1. Thus f(x) < a a.s. or f(x) ≥ a a.s. for every a ∈ R. Let

    a_0 = sup{a : f(x) ≥ a a.s.}.

Then, if a > a_0, μ({x : f(x) ≥ a}) = 0, and if a < a_0, μ({x : f(x) < a}) = 0. Let (a_n) and (b_n) be sequences converging to a_0 such that a_n > a_0 > b_n. Then

    {x : f(x) ≠ a_0} = ∪_n ( {x : f(x) ≥ a_n} ∪ {x : f(x) < b_n} ).

Thus

    μ({x : f(x) ≠ a_0}) ≤ Σ_n ( μ({x : f(x) ≥ a_n}) + μ({x : f(x) < b_n}) ) = 0,

and so f(x) = a_0 a.s.

If we can prove that a system is ergodic, then there is no problem in applying Boltzmann's prescription for the time average. For an ergodic system, by the above theorem, M is the one-dimensional space of constant functions, so that

    P g = ⟨1, g⟩ 1 = ∫_{Γ_1} g(x) μ(dx).

Therefore

    lim_{T→∞} (1/2T) ∫_{−T}^{T} dt U_t g = ∫_{Γ_1} g(x) μ(dx).
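A minimal concrete instance of this identity between time and space averages (a discrete-time illustration added here, not part of the notes) is the irrational rotation of the circle, x ↦ x + α mod 1, which is ergodic with respect to Lebesgue measure when α is irrational: along a single orbit, the time average of an observable converges to its integral over the circle, independently of the starting point.

```python
import math

alpha = math.sqrt(2) - 1                   # irrational rotation number

def f(x):
    return math.cos(2 * math.pi * x) ** 2  # observable on the circle [0, 1)

def time_average(x, n):
    total = 0.0
    for _ in range(n):
        total += f(x)
        x = (x + alpha) % 1.0              # one step of the rotation
    return total / n

space_average = 0.5                        # integral of cos^2(2*pi*x) over [0, 1)
print(time_average(0.1, 200000))           # close to 0.5 for any starting point
```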

Remark 2.14 However, proving ergodicity has turned out to be the most difficult part of the programme. There is only one example for which ergodicity has been claimed to be proved, and that is a system of hard rods (Sinai). This concerns finite systems. In the thermodynamic limit (see Chapter 5) ergodicity should hold, but we do not discuss this problem.

2.4 Microcanonical measure

Suppose that we consider a system with Hamiltonian H, and suppose also that we fix the energy of the system to be exactly E. We would like to devise a probability measure on the points of Γ with energy E such that the measure is stationary with respect to the Hamiltonian flow.

Note that the energy surface Ω_E is closed, since Ω_E = H^{-1}({E}) and H is assumed to be continuous. Clearly φ_t(Ω_E) = Ω_E, since H ∘ φ_t = H. Let A(Γ) denote the algebra of continuous functions Γ → R with compact support. The following Riesz-Markov theorem identifies positive linear functionals on A(Γ) with positive measures on Γ.

Theorem 2.15 (Riesz-Markov) If l : A(Γ) → R is linear and l(f) ≥ 0 for any positive f ∈ A(Γ), then there is a unique Borel measure μ on Γ such that

    l(f) = ∫_Γ f(x) μ(dx).

Now define a linear functional l_E on A(Γ) by

    l_E(f) = lim_{ε→0} (1/ε) ∫_{Γ_[E,E+ε]} f(x) dx,

where Γ_[E,E+ε] = {x ∈ Γ : H(x) ∈ [E, E+ε]} is the energy shell of thickness ε. By the Riesz-Markov theorem there is a unique Borel measure μ_E^0 on Γ such that

    l_E(f) = ∫_Γ f(x) μ_E^0(dx),

with the following properties:

(i) μ_E^0 is concentrated on Ω_E. If supp f ∩ Ω_E = ∅, then for ε small enough, since supp f and Γ_[E,E+ε] are closed, Γ_[E,E+ε] ∩ supp f = ∅ and

    ∫_{Γ_[E,E+ε]} f(x) dx = ∫_Γ f(x) 1l_{Γ_[E,E+ε]}(x) dx = 0.

(ii) μ_E^0 is stationary with respect to φ_t. Since the Lebesgue measure is stationary,

    l_E(f ∘ φ_t) = lim_{ε→0} (1/ε) ∫_{Γ_[E,E+ε]} (f ∘ φ_t)(x) dx = lim_{ε→0} (1/ε) ∫_{φ_t(Γ_[E,E+ε])} f(x) dx
                 = lim_{ε→0} (1/ε) ∫_{Γ_[E,E+ε]} f(x) dx = l_E(f).
Definition 2.16 (Microcanonical measure) If ω(E) := μ_E^0(Γ) < ∞ we can normalise μ_E^0 to obtain

    μ_E := μ_E^0 / ω(E),    (2.8)

which is a probability measure on (Γ, B_Γ), concentrated on the energy shell Ω_E.

The probability measure μ_E is called the microcanonical measure or microcanonical ensemble. The normalisation ω(E) is also called the microcanonical partition function.

The expression S = k log ω(E) is called the Boltzmann entropy or microcanonical entropy, where k = 1.38 × 10^-23 J/K is Boltzmann's constant.

We now give an explicit expression for the microcanonical measure. First we briefly recall some facts on curvilinear coordinates.

Let ψ : R^ℓ → A ⊂ R^ℓ be a bijection. Then we can use coordinates t_1, t_2, ..., t_ℓ, where the point x corresponds to the point t = ψ(x) in the new coordinates. The coordinates are orthogonal if the level surfaces ψ_i = constant, i = 1, ..., ℓ, are orthogonal to each other, that is, for all x ∈ R^ℓ and i ≠ j,

    ⟨∇ψ_i(x), ∇ψ_j(x)⟩ = 0.

Changing the variables of integration we then get

    ∫_{R^ℓ} f(x) dx = ∫_A f(ψ^{-1}(t)) |det(ψ^{-1})'(t)| dt
                    = ∫_A f(ψ^{-1}(t)) dt / |det ψ'(ψ^{-1}(t))|
                    = ∫_A f(ψ^{-1}(t)) dt / Π_{i=1}^{ℓ} ||∇ψ_i(ψ^{-1}(t))||.

Note that if A is an n × n matrix with rows a_1, ..., a_n, where ⟨a_i, a_j⟩ = 0 for i ≠ j, then AA^T is a diagonal matrix with diagonal entries ||a_1||², ..., ||a_n||²; therefore |det(A)| = Π_{i=1}^{n} ||a_i||.

Let Σ_{t_1} be the level surface ψ_1(x) = t_1 (constant). We define the element of surface area on Σ_{t_1} to be

    dσ_{t_1} = dt_2 ... dt_ℓ / Π_{i=2}^{ℓ} ||∇ψ_i(ψ^{-1}(t))||.

Then

    dx = (dt_1 / ||∇ψ_1||) dσ_{t_1}.

We apply this to the microcanonical measure. Choose ψ : R^{2n} → A ⊂ R^{2n} such that ψ_1 = H, so that Σ_{t_1} is an energy surface. Then

    ∫_{Γ_[E,E+ε]} f(x) dx = ∫_E^{E+ε} dt_1 ∫_{Σ_{t_1}} ( f(ψ^{-1}(t)) / ||∇H(ψ^{-1}(t))|| ) dσ_{t_1}.

Therefore

    lim_{ε→0} (1/ε) ∫_{Γ_[E,E+ε]} f(x) dx = ∫_{Σ_E} ( f(ψ^{-1}(E, t_2, ..., t_{2n})) / ||∇H(ψ^{-1}(E, t_2, ..., t_{2n}))|| ) dσ_E.

Thus

    μ_E^0(dx) = dσ_E / ||∇H||.

In particular

    ω(E) = ∫_{Σ_E} dσ_E / ||∇H||.

Note also that

    ∫_Γ g(H(x)) f(x) dx = ∫ dt_1 g(t_1) ∫_{Σ_{t_1}} ( f(ψ^{-1}(t)) / ||∇H(ψ^{-1}(t))|| ) dσ_{t_1}.
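As a sanity check on the formula ω(E) = ∫_{Σ_E} dσ_E/||∇H|| (a numerical aside added here, not in the original text): for a single one-dimensional harmonic oscillator H(q, p) = (q² + p²)/2, the energy surface Σ_E is the circle of radius √(2E), on which ||∇H|| = √(2E), so ω(E) = 2π√(2E)/√(2E) = 2π for every E. The sketch below estimates ω(E) the other way, as (1/ε) vol(Γ_[E,E+ε]), by Monte Carlo, and recovers the same value.

```python
import math, random

def omega_estimate(E, eps, samples=400000, seed=0):
    # (1/eps) * vol({(q, p) : E <= (q^2 + p^2)/2 <= E + eps}),
    # estimated by uniform sampling from an enclosing square.
    rng = random.Random(seed)
    R = math.sqrt(2 * (E + eps))          # the shell lies inside [-R, R]^2
    hits = 0
    for _ in range(samples):
        q, p = rng.uniform(-R, R), rng.uniform(-R, R)
        if E <= (q * q + p * p) / 2 <= E + eps:
            hits += 1
    return (hits / samples) * (2 * R) ** 2 / eps

print(omega_estimate(E=1.0, eps=0.01))    # approximately 2*pi, independently of E
```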

Notation 2.17 (Microcanonical Gibbs ensemble) Let Λ ⊂ R^d and N ∈ N, and let H_Λ^(N) denote the Hamiltonian for N particles in Λ with elastic boundary conditions. Then we denote the microcanonical measure on (Γ_Λ, B_{Γ_Λ}) by μ_{E,Λ}^0, and the partition function by

    ω_Λ(E, N) = ∫_{Σ_E} dσ_E / ||∇H_Λ^(N)||.

The microcanonical entropy is denoted by S_Λ(E, N) = k log ω_Λ(E, N).


Remark 2.18 Measure which are constructed like the microcanonical measure on hyperplanes are called Gelfand-Leray measures. In general one might
imagine that there are several integrals of motions. For example the angular
momentum is conserved. Then one has to consider intersections of several
level surfaces. We will not discuss this in these lectures.

3 Entropy

3.1 Probabilistic view on Boltzmann's entropy

We discuss briefly the famous Boltzmann formula S = k log W for the entropy and give here an elementary probabilistic interpretation. For that, let Γ be a finite set (the state space) and let there be given a probability measure ν ∈ P(Γ) on Γ, where P(Γ) denotes the set of probability measures on Γ, the σ-algebra being the set of all subsets of Γ. In the picture of Maxwell and Boltzmann, the set Γ is the set of all possible energy levels for a system of particles, and the probability measure ν corresponds to a specific histogram of energies describing some macrostate of the system. Assume that ν(x) is a multiple of 1/n for any x ∈ Γ, n ∈ N, i.e. ν is a histogram for n trials or a macrostate for a system of n particles. A microscopic state for the system of n particles is any configuration ω ∈ Γ^n.

Boltzmann's idea: The entropy of a macrostate ν ∈ P(Γ) corresponds to the degree of uncertainty about the actual microstate ω ∈ Γ^n when only ν is known, and thus can be measured by log N_n(ν), the logarithm of the number of microstates leading to ν.

The associated macrostate for a microstate $\omega \in \Omega^n$ is
$$ L_n(\omega) = \frac{1}{n}\sum_{i=1}^n \delta_{\omega_i}, $$
and $L_n(\omega)$ is called the empirical distribution or histogram of $\omega$. The number of microstates leading to a given $\nu \in \mathcal{P}(\Omega) \cap \frac{1}{n}\mathbb{N}_0^\Omega$ is the number
$$ N_n(\nu) = \#\{\omega \in \Omega^n \colon L_n(\omega) = \nu\} = \frac{n!}{\prod_{x \in \Omega} (n\nu(x))!}. $$
We may approximate $\nu \in \mathcal{P}(\Omega)$ by a sequence $(\nu_n)_{n\in\mathbb{N}}$ of probability measures $\nu_n \in \mathcal{P}(\Omega) \cap \frac{1}{n}\mathbb{N}_0^\Omega$. Then we define the uncertainty $H(\nu)$ of $\nu$ via Stirling's formula as the $n \to \infty$ limit of the mean uncertainty of $\nu_n$ per particle.

Proposition 3.1 Let $\nu \in \mathcal{P}(\Omega)$ and $\nu_n \in \mathcal{P}(\Omega) \cap \frac{1}{n}\mathbb{N}_0^\Omega$ with $n \in \mathbb{N}$ and $\nu_n \to \nu$ as $n \to \infty$. Then the limit $\lim_{n\to\infty} \frac{1}{n}\log N_n(\nu_n)$ exists and equals
$$ H(\nu) = -\sum_{x\in\Omega} \nu(x)\log\nu(x). \qquad (3.9) $$

Proof. A proof with exact error bounds can be found in [CK81].

The entropy $H(\nu)$ counts the number of possibilities to obtain the macrostate or histogram $\nu$, and thus it describes the hidden multiplicity of the true microstates consistent with the observed $\nu$. It is therefore a measure of the complexity inherent in $\nu$.
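Proposition 3.1 is easy to test numerically: compute $\frac{1}{n}\log N_n(\nu)$ for a concrete histogram via log-factorials and compare with $H(\nu)$ (a sketch with my own helper names):

```python
import math

def log_microstates(counts):
    # log N_n(nu) = log n! - sum_x log (n nu(x))!, computed via lgamma
    n = sum(counts)
    return math.lgamma(n + 1) - sum(math.lgamma(c + 1) for c in counts)

def shannon(counts):
    # H(nu) = -sum_x nu(x) log nu(x) for the histogram nu(x) = counts[x] / n
    n = sum(counts)
    return -sum((c / n) * math.log(c / n) for c in counts if c > 0)

counts = [5000, 3000, 2000]           # histogram for n = 10000 trials
n = sum(counts)
per_particle = log_microstates(counts) / n
print(per_particle, shannon(counts))  # differ only by a Stirling correction O(log n / n)
```

The gap between the two printed values shrinks like $\log n / n$, exactly the error controlled by Stirling's formula in the proof.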

3.2 Shannon's entropy

We give a brief view on the basic facts of Shannon's entropy, which was established by Shannon ([Sha48] and [SW49]). We base the specific form of the Shannon entropy functional on probability measures on just a couple of clear intuitive arguments. For that we start with a sequence of four axioms on a functional $S$ that formalises the intuitive idea that entropy should measure the lack of information (or uncertainty) pertaining to a probability measure. For didactic reasons we limit ourselves to probability measures on a finite set $\Omega = \{\omega_1,\ldots,\omega_n\}$ of elementary events. Let $P \in \mathcal{P}(\Omega)$ be the probability measure with $P(\{\omega_i\}) = p_i \in [0,1]$, $i = 1,\ldots,n$, and $\sum_{i=1}^n p_i = 1$. Now we formulate four axioms for a functional $S$ acting on the set $\mathcal{P}(\Omega)$ of probability measures.
Axiom 1: To express the property that $S$ is a function of $P \in \mathcal{P}(\Omega)$ alone and not of the order of the single entries, one imposes:

(a) For every permutation $\pi \in S_n$, where $S_n$ is the group of permutations of $n$ elements, and any $P \in \mathcal{P}(\Omega)$ let $P_\pi \in \mathcal{P}(\Omega)$ be defined as $P_\pi(\{\omega_i\}) = p_{\pi(i)}$ for any $i = 1,\ldots,n$. Then
$$ S(P_\pi) = S(P). $$
(b) $S(P)$ is continuous in each of the entries $p_i = P(\{\omega_i\})$, $i = 1,\ldots,n$.
The next axiom expresses the intuitive fact that the outcome is most random for the uniform distribution.
Axiom 2: Let $P^{\mathrm{(uniform)}}(\{\omega_i\}) = \frac{1}{n}$ for $i = 1,\ldots,n$. Then
$$ S(P) \le S(P^{\mathrm{(uniform)}}) $$
for any $P \in \mathcal{P}(\Omega)$.
The next axiom states that the entropy remains constant whenever we extend our space of outcomes by events with vanishing probability.
Axiom 3: Let $P' \in \mathcal{P}(\Omega')$, where $\Omega' = \Omega \cup \{\omega_{n+1}\}$, and assume that $P'(\{\omega_{n+1}\}) = 0$. Then
$$ S(P') = S(P) $$
for $P \in \mathcal{P}(\Omega)$ with $P(\{\omega_i\}) = P'(\{\omega_i\})$ for all $i = 1,\ldots,n$.
Finally we consider compositions.
Axiom 4: Let $P \in \mathcal{P}(\Omega)$ and $Q \in \mathcal{P}(\Omega')$ for some set $\Omega' = \{\omega_1',\ldots,\omega_m'\}$ with $m \in \mathbb{N}$. Define the probability measure $P \circ Q \in \mathcal{P}(\Omega \times \Omega')$ as
$$ P \circ Q(\{(\omega_i, \omega_l')\}) = Q(\{\omega_l'\}|\{\omega_i\})\, P(\{\omega_i\}) $$
for $i = 1,\ldots,n$ and $l = 1,\ldots,m$. Here $Q(\{\omega_l'\}|\{\omega_i\})$ is the conditional probability of the event $\{\omega_l'\} \subset \Omega'$ conditioned on the occurrence of the event $\{\omega_i\}$. Then
$$ S(P \circ Q) = S(P) + S(Q|P), $$
where $S(Q|P) = \sum_{i=1}^n p_i S_i(Q)$ is the expectation of
$$ S_i(Q) = -\sum_{l=1}^m Q(\{\omega_l'\}|\{\omega_i\}) \log Q(\{\omega_l'\}|\{\omega_i\}) $$
with respect to the probability measure $P$. $S_i(Q)$ is the conditional entropy of $Q$ given that the event $\{\omega_i\}$ occurred. Note that when $P$ and $Q$ are independent one has $S(P \circ Q) = S(P) + S(Q)$.
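Axiom 4 can be verified directly for the functional $-\sum_i p_i \log p_i$ (with $k = 1$); a small numerical sketch with arbitrary toy distributions:

```python
import math

def H(probs):
    # S(P) = -sum_i p_i log p_i with k = 1
    return -sum(p * math.log(p) for p in probs if p > 0)

P = [0.5, 0.3, 0.2]                          # measure on Omega
Q = [[0.7, 0.3], [0.4, 0.6], [0.1, 0.9]]     # conditional Q(.|omega_i) on Omega'

joint = [P[i] * Q[i][l] for i in range(len(P)) for l in range(2)]
S_cond = sum(P[i] * H(Q[i]) for i in range(len(P)))   # S(Q|P)
print(H(joint), H(P) + S_cond)     # identical: S(P∘Q) = S(P) + S(Q|P)
```

The identity holds exactly for every choice of $P$ and conditional kernel $Q$; the independent case is recovered when all rows of $Q$ coincide.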
Equipped with these elementary assumptions we cite the following theorem, which gives birth to the Shannon entropy.

Theorem 3.2 (Shannon entropy) Let $\Omega = \{\omega_1,\ldots,\omega_n\}$ be a finite set. Any functional $S \colon \mathcal{P}(\Omega) \to \mathbb{R}$ satisfying Axioms (1) to (4) must necessarily be of the form
$$ S(P) = -k \sum_{i=1}^n p_i \log p_i \qquad \text{for } P \in \mathcal{P}(\Omega) \text{ with } P(\{\omega_i\}) = p_i,\ i = 1,\ldots,n, $$
where $k \in \mathbb{R}_+$ is a positive constant.


Proof. The proof can be found in the original work by Shannon and Weaver [SW49] or in the book by Khinchin [Khi57].

Notation 3.3 (Entropy) The functional
$$ H(\nu) = -\sum_{x\in\Omega} \nu(x) \log \nu(x) \qquad \text{for } \nu \in \mathcal{P}(\Omega) $$
is called the Shannon entropy of the probability measure $\nu$.


The connection with the previous Boltzmann entropy for the microcanonical ensemble is apparent from Axiom 2 above. Moreover, there are also connections to the Boltzmann H-function, which we do not discuss here. The interested reader is referred to any of the following monographs: [Bal91], [Bal92], [Gal99] and [EL02].

4 The Gibbs ensembles

In 1902 Gibbs proposed three ensembles: the microcanonical, the canonical and the grandcanonical ensemble. The microcanonical ensemble was introduced in Section 2.4 as a probability measure on the energy surface, a hypersurface in the phase space. The microcanonical ensemble is the most natural one from the physical point of view. However, in practice mainly the canonical and the grandcanonical Gibbs ensembles are studied. The main reason is that these ensembles are defined as probability measures on the phase space with a density given by the so-called Boltzmann factor $e^{-\beta H}$, where $\beta > 0$ is the inverse temperature and $H$ the Hamiltonian of the system. The mathematical justification for replacing the microcanonical ensemble by the canonical or grandcanonical Gibbs ensemble goes under the name equivalence of ensembles, which we will discuss in Subsection 5.3. In this section we first introduce the canonical Gibbs ensemble. Then we study the so-called Gibbs paradox concerning the correct counting for a system of indistinguishable identical particles. There follows the definition of the grandcanonical Gibbs ensemble. In the last subsection we relate all the introduced Gibbs ensembles to classical thermodynamics. This leads to the orthodicity problem, namely the question whether the laws of thermodynamics can be derived from ensemble averages in the thermodynamic limit.

4.1 The canonical Gibbs ensemble

We define the canonical Gibbs ensemble for a finite-volume box $\Lambda \subset \mathbb{R}^d$ and a fixed number $N \in \mathbb{N}$ of particles with Hamiltonian $H_\Lambda^{(N)}$ having appropriate boundary conditions (like elastic ones, as for the microcanonical ensemble, or periodic ones). In the following we denote the Borel $\sigma$-algebra on the phase space $\Gamma_\Lambda$ by $\mathcal{B}_\Lambda$. The universal Boltzmann constant is $k = k_B = 1.3806505 \times 10^{-23}$ joule/kelvin, and $T$ denotes the temperature measured in Kelvin.
Definition 4.1 Call the parameter $\beta = \frac{1}{kT} > 0$ the inverse temperature. The canonical Gibbs ensemble for parameter $\beta$ is the probability measure $\mu_{\beta,N} \in \mathcal{P}(\Gamma_\Lambda, \mathcal{B}_\Lambda)$ having the density
$$ \rho_{\beta,N}(x) = \frac{e^{-\beta H_\Lambda^{(N)}(x)}}{Z_\Lambda(\beta,N)}, \qquad x \in \Gamma_\Lambda, \qquad (4.10) $$
with respect to the Lebesgue measure. Here
$$ Z_\Lambda(\beta,N) = \int_{\Gamma_\Lambda} \mathrm{d}x\, e^{-\beta H_\Lambda^{(N)}(x)} \qquad (4.11) $$
is the normalisation and is called the partition function (Zustandssumme).


Gibbs introduced this canonical measure as a matter of simplicity: he wanted the measure with density to describe an equilibrium, i.e., to be invariant under the time evolution, so the most immediate candidates were functions of the energy. Moreover, he proposed that the simplest case conceivable is to take the logarithm of the density linear in the energy. The following theorem was one of his justifications of the utility of the definition of the canonical ensemble.
Theorem 4.2 Let $\Lambda_1, \Lambda_2 \subset \mathbb{R}^d$ with $\Lambda_1 \cap \Lambda_2 = \emptyset$ be given, with aggregate phase space $\Gamma_0 = \Gamma_{\Lambda_1} \times \Gamma_{\Lambda_2}$, and let $\mu_0 \in \mathcal{P}(\Gamma_0, \mathcal{B}_0)$ with Lebesgue density $\rho_0$ be given. Define the reduced probability measures (or marginals) $\mu_i \in \mathcal{P}(\Gamma_i, \mathcal{B}_i)$, $i = 1,2$, as
$$ \mu_1(A) = \int_{A \times \Gamma_2} \rho_0(x_1,x_2)\,\mathrm{d}x_1\mathrm{d}x_2 \qquad \text{for } A \in \mathcal{B}_1, $$
$$ \mu_2(B) = \int_{\Gamma_1 \times B} \rho_0(x_1,x_2)\,\mathrm{d}x_1\mathrm{d}x_2 \qquad \text{for } B \in \mathcal{B}_2, $$
with the Lebesgue densities
$$ \rho_1(x_1) = \int_{\Gamma_2} \rho_0(x_1,x_2)\,\mathrm{d}x_2 \qquad \text{and} \qquad \rho_2(x_2) = \int_{\Gamma_1} \rho_0(x_1,x_2)\,\mathrm{d}x_1. $$
Then the entropies
$$ S_i = -k \int_{\Gamma_i} \rho_i(x) \log \rho_i(x)\,\mathrm{d}x, \qquad i = 0,1,2, $$
satisfy the inequality
$$ S_0 \le S_1 + S_2, $$
with equality $S_0 = S_1 + S_2$ if and only if $\rho_0 = \rho_1 \otimes \rho_2$.
Proof. The proof is given by a straightforward calculation and the use of Jensen's inequality for the convex function $f(x) = x \log x + 1 - x$.

Gibbs himself recognised the condition for equality as a condition for independence. He claimed that, with the notations of Theorem 4.2, in the special case where $\rho_0$ is the canonical ensemble density and the Hamiltonian is of the form $H_0 = H_1 + H_2$, with $H_1$ (respectively $H_2$) independent of $\Gamma_2$ (respectively $\Gamma_1$), the reduced densities (marginal densities) $\rho_1$ and $\rho_2$ are independent, i.e., $\rho_0 = \rho_1 \otimes \rho_2$, and are themselves canonical ensemble densities. In [Gib02] he writes:

"...a property which enormously simplifies the discussion, and is the foundation of extremely important relations to thermodynamics."

Indeed, it follows from this that the temperatures are all equal, i.e., $T_1 = T_2 = T_0$.
Remark 4.3 (Lagrange multipliers) We note that the inverse temperature $\beta$ in the canonical Gibbs ensemble can be seen as the Lagrange multiplier for the extremal problem for the entropy under the constraint that the mean energy is fixed; see further [Jay89], [Bal91], [Bal92], [EL02].
Theorem 4.4 (Maximum principle for the entropy) Let $U > 0$, $\Lambda$ and $N \in \mathbb{N}$ be given. The canonical Gibbs ensemble $\mu_{\beta,N}$, where $\beta > 0$ is determined by $\int_{\Gamma_\Lambda} H_\Lambda^{(N)}(x)\rho_{\beta,N}(x)\,\mathrm{d}x = U$, maximises the entropy
$$ S(\mu) = -k \int_{\Gamma_\Lambda} \rho(x) \log \rho(x)\,\mathrm{d}x $$
over all $\mu \in \mathcal{P}(\Gamma_\Lambda, \mathcal{B}_\Lambda)$ having a Lebesgue density $\rho$ subject to the constraint
$$ U = \int_{\Gamma_\Lambda} \rho(x) H_\Lambda^{(N)}(x)\,\mathrm{d}x. \qquad (4.12) $$
Moreover, the values of the temperature $T$ and the partition function $Z_\Lambda(\beta,N)$ are uniquely determined from the condition
$$ U = -\frac{\partial}{\partial\beta} \log Z_\Lambda(\beta,N) \qquad \text{with } \beta = \frac{1}{kT}. $$
Proof.

We give only a rough sketch of the proof. We use that
$$ a \log a - b \log b \le (a - b)(1 + \log a), \qquad a, b \in (0,\infty). $$
Let $\mu \in \mathcal{P}(\Gamma_\Lambda, \mathcal{B}_\Lambda)$ with Lebesgue density $\rho$. Put $a = \rho_{\beta,N}(x)$ and $b = \rho(x)$ for any $x \in \Gamma_\Lambda$ and recall that $\rho_{\beta,N}$ is the density of the canonical Gibbs ensemble. Then
$$ \rho_{\beta,N}(x)\log\rho_{\beta,N}(x) - \rho(x)\log\rho(x) \le \big(\rho_{\beta,N}(x) - \rho(x)\big)\big(1 - \log Z_\Lambda(\beta,N) - \beta H_\Lambda^{(N)}(x)\big). $$
Integrating with respect to the Lebesgue measure we get
$$ S(\mu_{\beta,N}) - S(\mu) \ge -k\Big\{ \int_{\Gamma_\Lambda} \big(1 - \log Z_\Lambda(\beta,N) - \beta H_\Lambda^{(N)}(x)\big)\rho_{\beta,N}(x)\,\mathrm{d}x - \int_{\Gamma_\Lambda} \big(1 - \log Z_\Lambda(\beta,N) - \beta H_\Lambda^{(N)}(x)\big)\rho(x)\,\mathrm{d}x \Big\} $$
$$ = -k\big\{(1 - \log Z_\Lambda(\beta,N) - \beta U) - (1 - \log Z_\Lambda(\beta,N) - \beta U)\big\} = 0. $$
Therefore
$$ S(\mu_{\beta,N}) \ge S(\mu). $$

Note that the entropy for the canonical ensemble is given by
$$ S(\mu_{\beta,N}) = k \int_{\Gamma_\Lambda} \big(\log Z_\Lambda(\beta,N) + \beta H_\Lambda^{(N)}(x)\big)\rho_{\beta,N}(x)\,\mathrm{d}x = k \log Z_\Lambda(\beta,N) + k\beta \int_{\Gamma_\Lambda} H_\Lambda^{(N)}(x)\rho_{\beta,N}(x)\,\mathrm{d}x. $$
To prove the second assertion, note that
$$ -\frac{\partial}{\partial\beta}\log Z_\Lambda(\beta,N) = \int_{\Gamma_\Lambda} \rho_{\beta,N}(x) H_\Lambda^{(N)}(x)\,\mathrm{d}x, $$
$$ \frac{\partial^2}{\partial\beta^2}\log Z_\Lambda(\beta,N) = \int_{\Gamma_\Lambda} \rho_{\beta,N}(x)\Big( H_\Lambda^{(N)}(x) - \int_{\Gamma_\Lambda} \rho_{\beta,N}(y) H_\Lambda^{(N)}(y)\,\mathrm{d}y \Big)^2 \mathrm{d}x \ge 0. $$
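Theorem 4.4 can be illustrated on a toy discrete level space (my own finite sketch, with $k = 1$): among all distributions on three energy levels whose mean energy equals that of the Gibbs distribution, the Gibbs distribution has maximal entropy.

```python
import math

E = [0.0, 1.0, 2.0]                   # toy energy levels
beta = 1.0
w = [math.exp(-beta * e) for e in E]
Z = sum(w)
gibbs = [x / Z for x in w]
U = sum(p * e for p, e in zip(gibbs, E))   # mean energy of the Gibbs distribution

def entropy(p):
    return -sum(q * math.log(q) for q in p if q > 0)

# one-parameter family of distributions with the same mean energy U:
# p = (1 - U + t, U - 2t, t), scanned on a fine grid of admissible t
family = [[1 - U + t, U - 2 * t, t]
          for t in (i / 10000 for i in range(1, 5000))
          if 0 < U - 2 * t and 0 < 1 - U + t < 1]
best = max(entropy(p) for p in family)
print(best, entropy(gibbs))   # the grid maximum sits at the Gibbs distribution
```

Every member of `family` satisfies constraint (4.12) with the same $U$, and the scan confirms that none beats the Gibbs entropy.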


Thermodynamic functions
For the canonical ensemble the relevant thermodynamical variables are the temperature $T$ (or $\beta = (kT)^{-1}$) and the volume $V$ of the region $\Lambda \subset \mathbb{R}^d$. We have already defined the entropy $S$ of the canonical ensemble by
$$ S_\Lambda(\beta,N) = k \log Z_\Lambda(\beta,N) + \frac{1}{T}\,\mathbb{E}_{\mu_{\beta,N}}\big(H_\Lambda^{(N)}\big), $$
where $U = \mathbb{E}_{\mu_{\beta,N}}(H_\Lambda^{(N)}) = \int_{\Gamma_\Lambda} H_\Lambda^{(N)}(x)\rho_{\beta,N}(x)\,\mathrm{d}x$ is the expectation of $H_\Lambda^{(N)}$, sometimes denoted also by $\langle H_\Lambda^{(N)} \rangle$. We define the Helmholtz free energy by $A = U - TS$, and we shall call the Helmholtz free energy simply the free energy from now on. We have
$$ A = U - TS_\Lambda(\beta,N) = \mathbb{E}_{\mu_{\beta,N}}\big(H_\Lambda^{(N)}\big) - T\Big(k \log Z_\Lambda(\beta,N) + \frac{1}{T}\mathbb{E}_{\mu_{\beta,N}}\big(H_\Lambda^{(N)}\big)\Big) = -\frac{1}{\beta}\log Z_\Lambda(\beta,N). $$

By analogy with thermodynamics we define the absolute pressure $P$ of the system by
$$ P = -\Big(\frac{\partial A}{\partial V}\Big)_T. $$
The other thermodynamic functions can be defined as usual:
the Gibbs potential, $G = U + PV - TS = A + PV$;
the heat capacity at constant volume, $C_V = \big(\frac{\partial U}{\partial T}\big)_V$.
Note that
$$ S = -\Big(\frac{\partial A}{\partial T}\Big)_V $$
is also satisfied. The thermodynamic functions can all be calculated from $A$. Therefore all calculations in the canonical ensemble begin with the calculation of the partition function $Z_\Lambda(\beta,N)$.
To make the free energy density finite in the thermodynamic limit we redefine the canonical partition function by introducing the correct Boltzmann counting,
$$ Z_\Lambda(\beta,N) = \frac{1}{(n/d)!} \int_{\Gamma_\Lambda} e^{-\beta H_\Lambda^{(N)}(x)}\,\mathrm{d}x = \frac{1}{N!} \int_{\Gamma_\Lambda} e^{-\beta H_\Lambda^{(N)}(x)}\,\mathrm{d}x, \qquad (4.13) $$
where $n = Nd$; see the following Subsection 4.2 for a justification of this correct Boltzmann counting.
Example 4.5 (The ideal gas in the canonical ensemble) Consider a non-interacting gas of $N$ identical particles of mass $m$ in $d$ dimensions, contained in a box $\Lambda \subset \mathbb{R}^d$ of volume $V$. The Hamiltonian for this system is
$$ H_\Lambda^{(N)}(x) = \frac{1}{2m}\sum_{i=1}^N p_i^2, \qquad x = (q,p) \in \Gamma_\Lambda. $$
We have for the partition function $Z_\Lambda(\beta,N)$
$$ Z_\Lambda(\beta,N) = \frac{1}{N!\,h^{Nd}} \int_{\Gamma_\Lambda} e^{-\beta H_\Lambda^{(N)}(x)}\,\mathrm{d}x = \frac{1}{N!\,h^{Nd}}\, V^N \Big( \int_{\mathbb{R}^d} e^{-\frac{\beta}{2m}p^2}\,\mathrm{d}p \Big)^N $$
$$ = \frac{1}{N!\,h^{Nd}}\, V^N \Big( \int_{\mathbb{R}} e^{-\frac{\beta}{2m}p^2}\,\mathrm{d}p \Big)^{Nd} = \frac{1}{N!}\, V^N \Big( \frac{2\pi m}{\beta h^2} \Big)^{\frac{Nd}{2}} = \frac{1}{N!}\Big(\frac{V}{\lambda^d}\Big)^N, \qquad (4.14) $$
where
$$ \lambda = \Big( \frac{\beta h^2}{2\pi m} \Big)^{\frac12} $$

is called the thermal wavelength, because it is of the order of the de Broglie wavelength of a particle of mass $m$ with energy $\beta^{-1}$. The free energy $A_\Lambda(\beta,N)$ is given by
$$ A_\Lambda(\beta,N) = -\frac{1}{\beta}\log Z_\Lambda(\beta,N) = \frac{1}{\beta}\big(\log N! + Nd\log\lambda - N\log V\big). $$
Thus the pressure is given by
$$ P_\Lambda(\beta,N) = -\Big(\frac{\partial A_\Lambda(\beta,N)}{\partial V}\Big)_T = \frac{N}{\beta V} = \frac{kTN}{V}. $$
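As a consistency check on (4.14), one can differentiate the free energy numerically and recover the ideal-gas law $P = N/(\beta V)$; a sketch in plain Python (the helper names and the choice $h = m = 1$ are mine):

```python
import math

def A(beta, N, V, m=1.0, h=1.0, d=3):
    # free energy -(1/beta) log Z for the ideal gas, via (4.14)
    lam = math.sqrt(beta * h * h / (2 * math.pi * m))
    logZ = -math.lgamma(N + 1) + N * math.log(V) - N * d * math.log(lam)
    return -logZ / beta

beta, N, V = 1.5, 50, 10.0
dV = 1e-6
P = -(A(beta, N, V + dV) - A(beta, N, V - dV)) / (2 * dV)   # P = -dA/dV at fixed T
print(P, N / (beta * V))
```

The central difference reproduces $N/(\beta V)$ to numerical precision, since only the $N\log V$ term of $\log Z$ depends on the volume.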
Let $a_N(\beta,v)$ be the free energy per particle considered as a function of the specific volume $v$, that is,
$$ a_N(\beta,v) = \frac{1}{N} A_{\Lambda_N}(\beta,N), $$
where $\Lambda_N$ is a sequence of boxes with volume $vN$, and let $p_N(\beta,v) = P_{\Lambda_N}(\beta,N)$ be the corresponding pressure. Then
$$ p_N(\beta,v) = -\Big(\frac{\partial a_N(\beta,v)}{\partial v}\Big)_T. $$
For the ideal gas we then get
$$ a_N(\beta,v) = \frac{1}{\beta}\Big( \frac{1}{N}\log N! + d\log\lambda - \log v - \log N \Big), $$
leading to
$$ a(\beta,v) := \lim_{N\to\infty} a_N(\beta,v) = \frac{1}{\beta}\big( d\log\lambda - \log v - 1 \big), $$
since
$$ \lim_{N\to\infty}\Big( \frac{1}{N}\log N! - \log N \Big) = -1. $$
If $p(\beta,v) := \lim_{N\to\infty} p_N(\beta,v)$, one gets
$$ p(\beta,v) = -\Big(\frac{\partial a(\beta,v)}{\partial v}\Big)_T, $$
and thus $p(\beta,v) = \frac{1}{\beta v}$. We can also define the free energy density as a function of the particle density $\rho$, i.e.,
$$ f_l(\beta,\rho) = \frac{1}{V_l} A_{\Lambda_l}(\beta, \rho V_l), $$
where $\Lambda_l$ is a sequence of boxes with volume $V_l$ such that $\lim_{l\to\infty} V_l = \infty$, and
$$ f(\beta,\rho) = \lim_{l\to\infty} f_l(\beta,\rho). $$
The pressure $p(\beta,\rho)$ then satisfies
$$ p(\beta,\rho) = \rho\,\Big(\frac{\partial f(\beta,\rho)}{\partial\rho}\Big)_T - f(\beta,\rho). $$
Clearly $f(\beta,\rho) = \rho\, a(\beta,\rho^{-1})$. For the ideal gas we get
$$ f(\beta,\rho) = \frac{\rho}{\beta}\big( d\log\lambda + \log\rho - 1 \big) $$
and therefore $p(\beta,\rho) = \frac{\rho}{\beta}$.
Finally we want to check the relative dispersion of the energy in the canonical ensemble. Let $\langle H_\Lambda^{(N)} \rangle = \mathbb{E}_{\mu_{\beta,N}}(H_\Lambda^{(N)})$. Then
$$ \frac{\big\langle \big(H_\Lambda^{(N)} - \langle H_\Lambda^{(N)}\rangle\big)^2 \big\rangle}{\langle H_\Lambda^{(N)}\rangle^2} = \frac{\frac{\partial^2}{\partial\beta^2}\log Z_\Lambda(\beta,N)}{\big(\frac{\partial}{\partial\beta}\log Z_\Lambda(\beta,N)\big)^2}. $$
This gives for the ideal gas
$$ \frac{\sqrt{\big\langle \big(H_\Lambda^{(N)} - \langle H_\Lambda^{(N)}\rangle\big)^2 \big\rangle}}{\langle H_\Lambda^{(N)}\rangle} = \Big(\frac{dN}{2}\Big)^{-\frac12} = O\big(N^{-\frac12}\big). $$
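Under the canonical ideal-gas measure the momenta are independent Gaussians with variance $m/\beta$ per component, so the $O(N^{-1/2})$ relative dispersion can be checked by direct sampling (a Monte Carlo sketch; all parameter values are arbitrary):

```python
import math
import random

random.seed(0)
m, beta, d, N = 1.0, 2.0, 3, 200
samples = 1000

def energy():
    # H = (1/2m) sum p_i^2 with each p-component Gaussian of variance m/beta
    s = math.sqrt(m / beta)
    return sum(random.gauss(0, s) ** 2 for _ in range(d * N)) / (2 * m)

es = [energy() for _ in range(samples)]
mean = sum(es) / samples
var = sum((e - mean) ** 2 for e in es) / samples
print(math.sqrt(var) / mean, (d * N / 2) ** -0.5)   # both close to (dN/2)^(-1/2)
```

With $d = 3$ and $N = 200$ the predicted relative dispersion is about $0.058$, and the sample estimate agrees within Monte Carlo error.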

4.2 The Gibbs paradox

The Gibbs paradox illustrates an essential correction of the counting within the microcanonical and the canonical ensemble. Gibbs in 1902 was not aware of the fact that the partition function needed a redefinition, for instance the redefinition in (4.13) in the case of the canonical ensemble. The ideal gas suffices to illustrate the main issue of the paradox. Recall the entropy of the ideal gas in the canonical ensemble (without the correct Boltzmann counting),
$$ S_\Lambda(\beta,N) = kN\log\big(V T^{\frac d2}\big) + \frac d2\, kN\big(1 + \log(2\pi m k)\big), \qquad \beta^{-1} = kT, \qquad (4.15) $$
where $V = |\Lambda|$ is the volume of the box $\Lambda \subset \mathbb{R}^d$. Now, make the following Gedankenexperiment. Consider two vessels of volume $V_i$ containing $N_i$, $i = 1,2$, particles, separated by a thin wall. Suppose further that both vessels are in equilibrium, having the same temperature and pressure. Now imagine that the wall between the two vessels is gently removed. The aggregate vessel is now filled with a gas that is still in equilibrium at the same temperature and pressure. Denote by $S_1$ and $S_2$ the entropy on each side of the wall. Since the corresponding canonical Gibbs ensembles are independent of one another, the entropy $S_{12}$ of the aggregate vessel, before the wall is removed, is exactly $S_1 + S_2$. However, an easy calculation gives us
$$ S_{12} - (S_1 + S_2) = k\big( (N_1+N_2)\log(V_1+V_2) - N_1\log V_1 - N_2\log V_2 \big) \qquad (4.16) $$
$$ = -k\Big( N_1\log\frac{V_1}{V_1+V_2} + N_2\log\frac{V_2}{V_1+V_2} \Big) > 0. $$
This shows that the informational (Shannon) entropy has increased, while we expected the thermodynamic entropy to remain constant, since the wall between the two vessels is immaterial from a thermodynamical point of view. This is the Gibbs paradox.
We have indeed lost information in the course of removing the wall. Imagine that the gas before removing the wall consists of yellow molecules in one vessel and of blue molecules in the other. After removal of the wall we get a uniform greenish mixture throughout the aggregate vessel. Before, we knew with probability 1 that a blue molecule was in the vessel where we had put it; after removal of the wall we only know that it is in that part of the aggregate vessel with probability $\frac{N_1}{N_1+N_2}$.
The Gibbs paradox is resolved in classical statistical mechanics with an ad hoc ansatz. Namely, instead of the canonical partition function $Z_\Lambda(\beta,N)$ one takes $\frac{1}{N!} Z_\Lambda(\beta,N)$, and instead of the microcanonical partition function $\omega_\Lambda(E,N)$ one takes $\frac{1}{N!}\omega_\Lambda(E,N)$. This is called the correct Boltzmann counting. The appearance of the factorial can be justified in quantum mechanics; it has to do with the indistinguishability of identical particles. A state describing a system of identical particles should be invariant under any permutation of the labels identifying the single-particle variables. However, this very interesting issue goes beyond the scope of this lecture, and we will therefore assume it from now on. In Subsection 5.1 we give another justification by computing the partition function and the entropy in the microcanonical ensemble.
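The increase (4.16), and the way the correct Boltzmann counting repairs it, can be sketched numerically: dividing the partition functions by $N!$ subtracts $k\log N!$ from each entropy, and for vessels at equal density the corrected mismatch collapses from order $N$ to order $\log N$ (my own toy numbers, $k = 1$):

```python
import math

def delta_S(N1, V1, N2, V2):
    # S_12 - (S_1 + S_2) as in (4.16), without correct Boltzmann counting
    return ((N1 + N2) * math.log(V1 + V2)
            - N1 * math.log(V1) - N2 * math.log(V2))

N1 = N2 = 100
gap = delta_S(N1, 1.0, N2, 1.0)        # equal T, P and density: gap = 200 log 2 > 0
# correct Boltzmann counting subtracts log((N1+N2)!) - log(N1!) - log(N2!)
counting = math.lgamma(N1 + N2 + 1) - math.lgamma(N1 + 1) - math.lgamma(N2 + 1)
print(gap, gap - counting)             # corrected gap is only about (1/2) log(pi N)
```

The uncorrected gap grows linearly in the particle number, while the corrected one vanishes per particle in the thermodynamic limit, as the thermodynamic entropy should.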

4.3 The grandcanonical ensemble

We give a brief introduction to the grandcanonical Gibbs ensemble. One can argue that the canonical ensemble is more physical, since in experiments we never consider an isolated system and we never measure the total energy, but deal with systems at a given temperature. Similarly, one may prefer not to specify the number of particles but only the average number of particles. In the grandcanonical ensemble the system can have any number of particles, with the average number determined by external sources. The grandcanonical Gibbs ensemble is obtained if the canonical ensemble is put in a particle bath, meaning that the particle number is no longer fixed; only the mean of the particle number is determined by a parameter. This was done similarly in the canonical ensemble for the energy, where one considers a heat bath. The phase space for exactly $N$ particles in the box $\Lambda \subset \mathbb{R}^d$ can be written as
$$ \Gamma_{\Lambda,N} = \big\{ \omega \subset (\Lambda \times \mathbb{R}^d) \colon \omega = \{(q,p_q) \colon q \in \widehat\omega\},\ \mathrm{Card}(\widehat\omega) = N \big\}, \qquad (4.17) $$
where $\widehat\omega$, the set of positions occupied by the particles, is a locally finite subset of $\Lambda$, and $p_q$ is the momentum of the particle at position $q$. If the number of particles is not fixed, then the phase space is
$$ \Gamma_\Lambda = \big\{ \omega \subset (\Lambda \times \mathbb{R}^d) \colon \omega = \{(q,p_q) \colon q \in \widehat\omega\},\ \mathrm{Card}(\widehat\omega) \text{ finite} \big\}. \qquad (4.18) $$
A counting variable on $\Gamma_\Lambda$ is a random variable $N_\Delta$ on $\Gamma_\Lambda$, for any Borel set $\Delta \subset \Lambda$, defined by $N_\Delta(\omega) = \mathrm{Card}(\widehat\omega \cap \Delta)$ for any $\omega \in \Gamma_\Lambda$.
Definition 4.6 (Grandcanonical ensemble) Let $\Lambda \subset \mathbb{R}^d$, $\beta > 0$ and $\mu \in \mathbb{R}$. Define the phase space $\Gamma_\Lambda = \bigcup_{N=0}^\infty \Gamma_{\Lambda,N}$, where $\Gamma_{\Lambda,N} = (\Lambda \times \mathbb{R}^d)^N$ is the phase space in $\Lambda$ for $N$ particles, and equip it with the $\sigma$-algebra $\mathcal{B}_\Lambda$ generated by the counting variables. The probability measure $\mu_{\beta,\mu} \in \mathcal{P}(\Gamma_\Lambda, \mathcal{B}_\Lambda)$ whose restrictions onto $\Gamma_{\Lambda,N}$ have the densities
$$ \rho^{(N)}_{\beta,\mu}(x) = Z_\Lambda(\beta,\mu)^{-1}\, e^{-\beta(H_\Lambda^{(N)}(x) - \mu N)}, \qquad N \in \mathbb{N}, $$
where $H_\Lambda^{(N)}$ is the Hamiltonian for $N$ particles in $\Lambda$, and whose partition function is
$$ Z_\Lambda(\beta,\mu) = \sum_{N=0}^\infty \int_{\Gamma_{\Lambda,N}} e^{-\beta(H_\Lambda^{(N)}(x) - \mu N)}\,\mathrm{d}x, \qquad (4.19) $$
is called the grandcanonical ensemble in $\Lambda$ for the inverse temperature $\beta$ and the chemical potential $\mu$.
Instead of the chemical potential $\mu$, sometimes the fugacity or activity $z = e^{\beta\mu}$ is used for the grandcanonical ensemble. Observables are now sequences $f = (f_0, f_1, \ldots)$ with $f_0 \in \mathbb{R}$ and $f_N \colon \Gamma_{\Lambda,N} \to \mathbb{R}$, $N \in \mathbb{N}$, functions on the $N$-particle phase spaces. Hence, the expectation in the grandcanonical ensemble is written as
$$ \mathbb{E}_{\beta,\mu}(f) = \frac{1}{Z_\Lambda(\beta,\mu)} \sum_{N=0}^\infty e^{\beta\mu N}\, Z_\Lambda(\beta,N) \int_{\Gamma_{\Lambda,N}} f_N(x)\,\mu_{\beta,N}(\mathrm{d}x). \qquad (4.20) $$
If $N$ denotes the particle number observable we get that
$$ \mathbb{E}_{\beta,\mu}(N) = \frac{1}{\beta}\frac{\partial}{\partial\mu}\log Z_\Lambda(\beta,\mu). $$
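For the ideal gas the grandcanonical sum can be carried out in closed form: with the canonical $Z_\Lambda(\beta,N) = V^N/(N!\,\lambda^{dN})$ of Example 4.5 one gets $Z_\Lambda(\beta,\mu) = \exp(e^{\beta\mu}V/\lambda^d)$, and the formula above yields $\mathbb{E}_{\beta,\mu}(N) = e^{\beta\mu}V/\lambda^d$. A small numerical check of the $\mu$-derivative (toy values; $\lambda^d$ set by hand):

```python
import math

beta, V, lam_d = 2.0, 10.0, 1.5       # arbitrary toy values; lam_d plays lambda^d

def log_Xi(mu):
    # log Z_Lambda(beta, mu) = e^{beta mu} V / lambda^d for the ideal gas
    return math.exp(beta * mu) * V / lam_d

mu, h = -1.0, 1e-6
mean_N = (log_Xi(mu + h) - log_Xi(mu - h)) / (2 * h) / beta   # (1/beta) d/dmu log Xi
print(mean_N, math.exp(beta * mu) * V / lam_d)                # both about 0.902
```

The numerical derivative matches the closed-form mean particle number, illustrating how the chemical potential tunes the average occupation.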

For the grandcanonical measure we have a principle of maximum entropy very similar to those for the other two ensembles. We maximise the entropy subject to the constraint that the mean energy $\mathbb{E}_{\beta,\mu}(H)$ and the mean particle number $\mathbb{E}_{\beta,\mu}(N)$ are fixed, where $H = (H^{(0)}, H^{(1)}, \ldots)$ is the sequence of Hamiltonians for each possible number of particles.


Theorem 4.7 (Principle of maximum entropy) Let $P$ be a probability measure on $\Gamma_\Lambda$ such that its restriction to $\Gamma_{\Lambda,N}$, denoted by $P_N$, is absolutely continuous with respect to the Lebesgue measure, that is,
$$ P_N(A) = \int_A \rho_N(x)\,\mathrm{d}x \qquad \text{for any } A \in \mathcal{B}_\Lambda^{(N)}. $$
Define the entropy of the probability measure $P$ to be
$$ S(P) = -k\rho_0\log\rho_0 - k\sum_{N=1}^\infty \int_{\Gamma_{\Lambda,N}} \rho_N(x)\log\big(N!\,\rho_N(x)\big)\,\mathrm{d}x. $$
Then the grandcanonical ensemble/measure $\mu_{\beta,\mu}$, where $\beta$ and $\mu$ are determined by $\mathbb{E}_{\beta,\mu}(H) = E$ and $\mathbb{E}_{\beta,\mu}(N) = N_0$, $N_0 \in \mathbb{N}$, maximises the entropy among the absolutely continuous probability measures on $\Gamma_\Lambda$ with mean energy $E$ and mean particle number $N_0$.
Proof. As in the two previous cases we use $a\log a - b\log b \le (a-b)(1+\log a)$, and so
$$ a\log(ta) - b\log(tb) \le (a-b)(1 + \log a + \log t). $$
Let
$$ \rho^{(N)}_{\beta,\mu}(x) = \frac{e^{\beta\mu N}\, e^{-\beta H_\Lambda^{(N)}(x)}}{N!\, Z_\Lambda(\beta,\mu)} $$
and put $a = \rho^{(N)}_{\beta,\mu}(x)$, $b = \rho_N(x)$ and $t = N!$. Then, writing $Z$ for $Z_\Lambda(\beta,\mu)$,
$$ \rho^{(N)}_{\beta,\mu}(x)\log\big(N!\,\rho^{(N)}_{\beta,\mu}(x)\big) - \rho_N(x)\log\big(N!\,\rho_N(x)\big) \le \big(\rho^{(N)}_{\beta,\mu}(x) - \rho_N(x)\big)\big(1 - \log Z - \beta H_\Lambda^{(N)}(x) + \beta\mu N\big). $$
Integrating with respect to the Lebesgue measure on $\Gamma_{\Lambda,N}$ and summing over $N$ we get
$$ S(\mu_{\beta,\mu}) - S(P) \ge -k\Big\{ (1-\log Z)\rho^{(0)}_{\beta,\mu} + \sum_{N=1}^\infty \int_{\Gamma_{\Lambda,N}} \big(1 - \log Z - \beta H_\Lambda^{(N)}(x) + \beta\mu N\big)\rho^{(N)}_{\beta,\mu}(x)\,\mathrm{d}x $$
$$ - (1-\log Z)\rho_0 - \sum_{N=1}^\infty \int_{\Gamma_{\Lambda,N}} \big(1 - \log Z - \beta H_\Lambda^{(N)}(x) + \beta\mu N\big)\rho_N(x)\,\mathrm{d}x \Big\} $$
$$ = -k\big\{ (1 - \log Z - \beta E + \beta\mu N_0) - (1 - \log Z - \beta E + \beta\mu N_0) \big\} = 0. $$
Therefore
$$ S(\mu_{\beta,\mu}) \ge S(P). $$
Note that the entropy for the grandcanonical ensemble is given by
$$ S(\mu_{\beta,\mu}) = k\log Z_\Lambda(\beta,\mu) + k\beta\,\mathbb{E}_{\beta,\mu}(H) - k\beta\mu\,\mathbb{E}_{\beta,\mu}(N). $$


Thermodynamic functions:
We shall write $Z$ for $Z_\Lambda(\beta,\mu)$ and we suppress for a while some obvious sub-indices and arguments. We have already defined the entropy $S$ by
$$ S = k\log Z + \frac{1}{T}\big( \mathbb{E}_{\beta,\mu}(H) - \mu\,\mathbb{E}_{\beta,\mu}(N) \big), $$
and as before we define the internal energy of the system $U$ by $U = \mathbb{E}_{\beta,\mu}(H)$. We then define the Helmholtz free energy as before by $A = U - TS$, and we shall call the Helmholtz free energy simply the free energy from now on. We have
$$ A = U - TS = \mathbb{E}_{\beta,\mu}(H) - T\Big( k\log Z + \frac{1}{T}\big(\mathbb{E}_{\beta,\mu}(H) - \mu\,\mathbb{E}_{\beta,\mu}(N)\big) \Big) = -\frac{1}{\beta}\log Z + \mu\,\mathbb{E}_{\beta,\mu}(N). $$
In analogy with thermodynamics we should define the absolute pressure $P$ of the system by
$$ P = -\Big(\frac{\partial A}{\partial V}\Big)_T $$
with the constraint
$$ \mathbb{E}_{\beta,\mu}(N) = \text{constant}. $$

This constraint means that $\mu$ is a function of $V$ and $\beta$. Therefore
$$ P = -\frac{\partial}{\partial V}\Big( -\frac{1}{\beta}\log Z + \mu\,\mathbb{E}_{\beta,\mu}(N) \Big) = \frac{1}{\beta}\frac{\partial}{\partial V}\log Z. $$
It is argued that $\frac{1}{V}\log Z$ should be independent of $V$ for large $V$, and therefore we can write
$$ P = \frac{1}{\beta}\frac{\partial}{\partial V}\log Z = \frac{1}{\beta}\frac{\partial}{\partial V}\Big( V\cdot\frac{1}{V}\log Z \Big) = \frac{1}{\beta V}\log Z + \frac{V}{\beta}\frac{\partial}{\partial V}\Big(\frac{1}{V}\log Z\Big) \approx \frac{1}{\beta V}\log Z. $$
Therefore we define the pressure by the equation
$$ P = \frac{1}{\beta V}\log Z. $$

This definition can be justified a posteriori when we consider the equivalence of ensembles, see Subsection 5.3. The other thermodynamic functions can be defined as usual:
the Gibbs potential, $G = U + PV - TS = A + PV$;
the heat capacity at constant volume, $C_V = \big(\frac{\partial U}{\partial T}\big)_V$.
Note that
$$ S = -\Big(\frac{\partial A}{\partial T}\Big)_V $$
is also satisfied. All the thermodynamic functions can be calculated from $Z = Z_\Lambda(\beta,\mu)$. Therefore all calculations in the grandcanonical ensemble begin with the calculation of the partition function $Z = Z_\Lambda(\beta,\mu)$.

4.4 The orthodicity problem

We refer to one of the main aims of statistical mechanics, namely to derive the known laws of classical thermodynamics from the ensemble theory. The following question is called the orthodicity problem.
Which set $\mathcal{E}$ of statistical ensembles (or probability measures) has the property that, as an element $\mu \in \mathcal{E}$ changes infinitesimally within the set $\mathcal{E}$, the corresponding infinitesimal variations $\mathrm{d}U$ and $\mathrm{d}V$ of $U$ and $V$ are related to the pressure $P$ and to the average kinetic energy per particle,
$$ \overline{T}_{\mathrm{kin}} = \frac{\mathbb{E}_\mu(T_{\mathrm{kin}})}{N}, \qquad T_{\mathrm{kin}} = \frac{1}{2m}\sum_{i=1}^N p_i^2, $$
such that the differential
$$ \frac{\mathrm{d}U + P\,\mathrm{d}V}{\overline{T}_{\mathrm{kin}}} $$
is an exact differential, at least in the thermodynamic limit? This will then provide the second law of thermodynamics. Let us provide a heuristic check for the canonical ensemble. Here,
$$ \mathbb{E}_{\beta,N}(T_{\mathrm{kin}}) = \frac{1}{Z_\Lambda(\beta,N)} \int_{\Gamma_{\Lambda,N}} T_{\mathrm{kin}}(x)\, e^{-\beta H_\Lambda^{(N)}(x)}\,\mathrm{d}x, $$
and $U = -\partial_\beta \log Z_\Lambda(\beta,N)$. The pressure in the canonical ensemble can be calculated as
$$ P_\Lambda(\beta,N) = \frac{1}{Z_\Lambda(\beta,N)} \sum_Q \int_{p_1>0} e^{-\beta H_\Lambda^{(N)}(x)}\, \frac{p_1^2}{2m}\,\frac{a}{A}\, \frac{\mathrm{d}q_2\cdots\mathrm{d}q_N\,\mathrm{d}p_1\cdots\mathrm{d}p_N}{N!}, $$
where the sum goes over all small cubes $Q$ adjacent to the boundary of the box $\Lambda$ with volume $V$ by a side with area $a$, while $A = \sum_Q a$ is the total area of the container surface and $q_1$ is the centre of $Q$. Let $n(Q,v)\,\mathrm{d}v$, where $v$ is the normal velocity corresponding to the momentum $p_1$, be the density of particles with normal velocity $v$ that are about to collide with the external wall at $Q$. Particles will cede a momentum $2mv$ in the normal direction to the wall at the moment of their collision ($mv \mapsto -mv$ due to the elastic boundary conditions). Then
$$ \sum_Q \int_{v>0} \mathrm{d}v\; n(Q,v)\,(2mv)\,v\,\frac{a}{A} $$
is the momentum transferred per unit time and surface area to the wall. A Gaussian calculation then gives, after a couple of steps, that due to $F_\Lambda(\beta,N) = -\frac{1}{\beta}\log Z_\Lambda(\beta,N)$ and $S_\Lambda(E,N) = \beta\big(U - F_\Lambda(\beta,N)\big)$ we have $T = (k\beta)^{-1} = \frac{2}{dk}\,\overline{T}_{\mathrm{kin}}$, and that
$$ T\,\mathrm{d}S_\Lambda = \mathrm{d}\big(F_\Lambda + TS_\Lambda\big) + p\,\mathrm{d}V = \mathrm{d}U + p\,\mathrm{d}V, $$
with $p = \frac{1}{\beta}\frac{\partial}{\partial V}\log Z_\Lambda(\beta,N)$. Details can be found in [Gal99], where references are also provided for rigorous proofs of orthodicity in the canonical ensemble. The orthodicity problem is more difficult in the microcanonical ensemble. The heuristic approach is similar. However, for a rigorous proof of the orthodicity one needs here a proof that the expectation of the kinetic energy in the microcanonical ensemble satisfies
$$ \mathbb{E}(T_{\mathrm{kin}}^\alpha) = \mathbb{E}(T_{\mathrm{kin}})^\alpha\,(1 + \varepsilon_N), \qquad \alpha > 0, $$
with $\varepsilon_N \to 0$ as $N \to \infty$ (thermodynamic limit). The last requirement would be easy for independent velocities, but this is not the case here due to the microcanonical energy constraint, and therefore this is not an application of the usual law of large numbers. A rigorous proof concerning the fluctuations and moments of the kinetic energy in the microcanonical ensemble is in preparation [AL06].

5 The Thermodynamic limit

In this section we introduce the concept of taking the thermodynamic limit, give a simple example in the microcanonical ensemble, and prove in Subsection 5.2 the existence of the thermodynamic limit for the specific free energy in the canonical Gibbs ensemble for a given class of interactions. In the last subsection we briefly discuss the equivalence of ensembles and the thermodynamic limit at the level of states/measures.

5.1 Definition

Let us call a state of a physical system an expectation value functional on the observable quantities for this system. The averages, i.e., the expectation values with respect to the Gibbs ensembles, are such states. We shall say that the systems for which the expectation with respect to the Gibbs ensembles is taken are finite systems (e.g. finitely many particles in a region with finite volume), but we may also consider the corresponding infinite systems, which contain an infinity of subsystems and extend throughout $\mathbb{R}^d$, or $\mathbb{Z}^d$ for lattice systems. Thus the discussion in the introduction leads us to expect that the ensemble expectations for finite systems approach, in some sense, states/measures of the corresponding infinite system. Besides the existence of such limit states/measures, one is also interested in proving that these are independent of the choice of the ensemble, leading to the question of equivalence of ensembles. One of the main problems of equilibrium statistical mechanics is to study the infinite systems' equilibrium states/measures and their relation to the interactions which give rise to them. In Section 6 we introduce the mathematical concept of Gibbs measures, which appear as natural candidates for equilibrium states/measures. We now turn down one level in our study and consider the problem of determining the thermodynamic functions from statistical mechanics in the thermodynamic limit. We introduced earlier for each Gibbs ensemble a partition function, which is the total mass of the measure defining the Gibbs ensemble. The logarithm of the partition function, divided by the volume of the region containing the system, has a limit when the system becomes large ($\Lambda \uparrow \mathbb{R}^d$), and this limit is identified with a thermodynamic function. Any singularities of these thermodynamic functions in the thermodynamic limit may correspond to phase transitions (see [Min00], [Gal99] and [EL02] for details on these singularities).
Taking the thermodynamic limit thus involves letting $\Lambda$ tend to infinity, i.e., approaching $\mathbb{R}^d$ or $\mathbb{Z}^d$ respectively. We have to specify how $\Lambda$ tends to infinity. Roughly speaking we consider the following notion.
Notation 5.1 (Thermodynamic limit) A sequence $(\Lambda_n)_{n\in\mathbb{N}}$ of boxes $\Lambda_n \subset \mathbb{R}^d$ is a cofinal sequence approaching $\mathbb{R}^d$ if the following holds:
(i) $\Lambda_n \uparrow \mathbb{R}^d$ as $n \to \infty$;
(ii) if $\partial^h\Lambda_n = \{x \in \mathbb{R}^d \colon \mathrm{dist}(x, \partial\Lambda_n) \le h\}$ denotes the set of points with distance less than or equal to $h$ from the boundary of $\Lambda_n$, then
$$ \lim_{n\to\infty} \frac{|\partial^h\Lambda_n|}{|\Lambda_n|} = 0. $$
The thermodynamic limit consists thus in letting $n \to \infty$ for a cofinal sequence $(\Lambda_n)_{n\in\mathbb{N}}$ of boxes, with the following additional requirements for the microcanonical and the canonical ensemble:
Microcanonical ensemble: there are energy densities $\varepsilon_n \in (0,\infty)$, given as $\varepsilon_n = \frac{E_n}{|\Lambda_n|}$ with $E_n \to \infty$ as $n \to \infty$, and particle densities $\rho_n \in (0,\infty)$, given as $\rho_n = \frac{N_n}{|\Lambda_n|}$ with $N_n \to \infty$ as $n \to \infty$, such that $\varepsilon_n \to \varepsilon$ and $\rho_n \to \rho \in (0,\infty)$ as $n \to \infty$.
Canonical ensemble: there are particle densities $\rho_n \in (0,\infty)$, given as $\rho_n = \frac{N_n}{|\Lambda_n|}$ with $N_n \to \infty$ as $n \to \infty$, such that $\rho_n \to \rho \in (0,\infty)$ as $n \to \infty$.
In some models one needs more assumptions on the cofinal sequence of boxes; for details see [Rue69] and [Isr79].
We check the thermodynamic limit of the following simple model in the microcanonical ensemble, which will also give another justification of the correct Boltzmann counting.
Ideal gas in the microcanonical ensemble
Consider a non-interacting gas of $N$ identical particles of mass $m$ in $d$ dimensions, contained in the box $\Lambda$ of volume $V = |\Lambda|$. The gradient of the Hamiltonian for this system is
$$ \nabla H(x) = \nabla\Big( \frac{1}{2m}\sum_{i=1}^n p_i^2 \Big) = \frac{1}{m}(0,\ldots,0,p_1,\ldots,p_n), \qquad x \in \Gamma_\Lambda, $$
where as usual $n = Nd$. We have
$$ |\nabla H(x)|^2 = \frac{1}{m^2}\sum_{i=1}^n p_i^2 = \frac{2}{m} H(x), \qquad x \in \Gamma_\Lambda, $$
where $|\cdot|$ denotes the norm in $\mathbb{R}^{2n}$. Therefore on the energy surface $\Omega_E$, $|\nabla H| = \big(\frac{2E}{m}\big)^{\frac12}$. Let $S_\nu(r)$ be the hypersphere of radius $r$ in $\nu$ dimensions, that is $S_\nu(r) = \{x \in \mathbb{R}^\nu \colon |x| = r\}$, and let $|S_\nu(r)|$ be the surface area of $S_\nu(r)$. Then
$$ |S_\nu(r)| = c_\nu\, r^{\nu-1} \qquad \text{for some constant } c_\nu. $$
For the non-interacting gas we have $\Omega_E = \Lambda^N \times S_n\big((2mE)^{\frac12}\big)$ and
$$ \omega(E) = \Big(\frac{m}{2E}\Big)^{\frac12} \int_{\Omega_E} \mathrm{d}\sigma = \Big(\frac{m}{2E}\Big)^{\frac12} V^N \big|S_{Nd}\big((2mE)^{\frac12}\big)\big| $$
$$ = \Big(\frac{m}{2E}\Big)^{\frac12} V^N c_{Nd}\, (2mE)^{\frac12(Nd-1)} = m V^N c_{Nd}\, (2mE)^{\frac{Nd}{2}-1}. $$
The entropy $S$ is given by
$$ \exp(S/k) = \omega(E) = m V^N c_{Nd}\, (2mE)^{\frac{Nd}{2}-1}, $$
and therefore
$$ (2mE)^{\frac{Nd}{2}-1} = \frac{\exp(S/k)\, V^{-N}}{m\, c_{Nd}}. $$
Thus, the internal energy follows as
$$ U(S,V) = E = \frac{1}{2m}\,\exp\Big(\frac{2S}{k(Nd-2)}\Big)\, V^{-\frac{2N}{Nd-2}}\,\big(m\, c_{Nd}\big)^{-\frac{2}{Nd-2}}, $$
and the temperature, as the partial derivative of the internal energy with respect to the entropy, is
$$ T = \Big(\frac{\partial U}{\partial S}\Big)_V = \frac{2}{k(Nd-2)}\, U = \frac{2U}{kNd\big(1 - \frac{2}{Nd}\big)} \approx \frac{2U}{kNd} $$
for large $N$. This gives for large $N$ the following relations:
$$ U \approx \frac{d}{2}\, NkT, $$

$$ C_V = \Big(\frac{\partial U}{\partial T}\Big)_V \approx \frac{d}{2}\, Nk, $$
$$ P = -\Big(\frac{\partial U}{\partial V}\Big)_S = \frac{2}{d\big(1 - \frac{2}{Nd}\big)}\,\frac{U}{V} \approx \frac{2U}{dV} \approx \frac{NkT}{V}. $$
The last relation is the empirical ideal gas law, in which $k$ is Boltzmann's constant. We can therefore identify $k$ in the definition of the entropy with Boltzmann's constant.
We need to calculate $c_\nu$. We have, via a standard trick,
$$ \pi^{\nu/2} = \Big( \int_{\mathbb{R}} e^{-x^2}\,\mathrm{d}x \Big)^\nu = \int_{\mathbb{R}^\nu} e^{-|x|^2}\,\mathrm{d}x = \int_0^\infty |S_\nu(r)|\, e^{-r^2}\,\mathrm{d}r $$
$$ = c_\nu \int_0^\infty r^{\nu-1} e^{-r^2}\,\mathrm{d}r = \frac{c_\nu}{2}\int_0^\infty t^{\frac{\nu}{2}-1} e^{-t}\,\mathrm{d}t = \frac{c_\nu}{2}\,\Gamma\Big(\frac{\nu}{2}\Big). $$
This gives $c_\nu = \frac{2\pi^{\nu/2}}{\Gamma(\nu/2)}$, where $\Gamma$ is the Gamma function, defined as
$$ \Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\,\mathrm{d}t. $$
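The constant $c_\nu$ is easy to sanity-check against the familiar low-dimensional cases with the standard library:

```python
import math

def c(nu):
    # surface-area constant: |S_nu(r)| = c(nu) * r**(nu - 1)
    return 2 * math.pi ** (nu / 2) / math.gamma(nu / 2)

print(c(2), c(3))   # 2*pi and 4*pi: unit-circle circumference, unit-sphere area
```

For $\nu = 2$ and $\nu = 3$ this recovers the circumference $2\pi r$ and the surface area $4\pi r^2$.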

Note that if $n \in \mathbb{N}$ then $\Gamma(n) = (n-1)!$. The behaviour of $\Gamma(x)$ for large positive $x$ is given by Stirling's formula,
$$ \Gamma(x) \sim \sqrt{2\pi}\, x^{x-\frac12} e^{-x}. $$
This gives $\lim_{x\to\infty}\big( \frac{1}{x}\log\Gamma(x) - \log x \big) = -1$. We now have for the entropy of the non-interacting gas in a box of volume $V$
$$ S_\Lambda(E,N) = k\log\Big( m V^N\, \frac{2\pi^{\frac{Nd}{2}}}{\Gamma(\frac{Nd}{2})}\, (2mE)^{\frac{Nd}{2}-1} \Big). $$
Let $v$ be the specific volume and $\varepsilon$ the energy per particle, that is,
$$ v = \frac{V}{N} \qquad \text{and} \qquad \varepsilon = \frac{E}{N}. $$
Let $s_N(\varepsilon,v)$ be the entropy per particle considered as a function of $\varepsilon$ and $v$,
$$ s_N(\varepsilon,v) = \frac{1}{N}\, S_{\Lambda_N}(N\varepsilon, N), $$

where $\Lambda_N$ is a sequence of boxes with volume $vN$. Then
$$ s_N(\varepsilon,v) = \frac{k}{N}\log\Big( m (vN)^N\, \frac{2\pi^{\frac{Nd}{2}}}{\Gamma(\frac{Nd}{2})}\, (2mN\varepsilon)^{\frac{Nd}{2}-1} \Big) $$
$$ \approx k\Big( \log v + \frac{d+2}{2}\log N + \frac{d}{2}\log(2\pi m\varepsilon) - \frac{1}{N}\log\Gamma\Big(\frac{Nd}{2}\Big) \Big) $$
$$ \approx k\Big( \log v + \log N + \frac{d}{2}\log\Big(\frac{4\pi m\varepsilon}{d}\Big) + \frac{d}{2} \Big) \sim k\log N. $$
We expect $s_N(\varepsilon,v)$ to be finite for large $N$, but it diverges. Gibbs (see Section 4.2) postulated that we have made an error in calculating $\omega(E)$, the number of states of the gas with energy $E$: we must divide $\omega(E)$ by $N!$. It is not possible to understand this classically, since in classical mechanics particles are distinguishable. The reason is inherently quantum mechanical: in quantum mechanics particles are indistinguishable. We also divide $\omega(E)$ by $h^{dN}$, where $h$ is Planck's constant. This makes classical and quantum statistical mechanics compatible for high temperatures.
We therefore redefine the microcanonical entropy of the system to be
$$ S_\Lambda(E,N) = k\log\Big( \frac{\omega_\Lambda(E,N)}{(n/d)!} \Big) = k\log\Big( \frac{\omega_\Lambda(E,N)}{N!} \Big), \qquad (5.21) $$
where we put Planck's constant $h = 1$. Then
$$ s_N(\varepsilon,v) \approx k\Big( \log v + \log N + \frac{d}{2}\log\Big(\frac{4\pi m\varepsilon}{d}\Big) + \frac{d}{2} - d\log h - \frac{1}{N}\log N! \Big) $$
$$ \longrightarrow k\Big( \frac{d+2}{2} + \log v + \frac{d}{2}\log\Big(\frac{4\pi m\varepsilon}{d h^2}\Big) \Big) \qquad \text{as } N \to \infty. $$
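The convergence of $s_N$ to this limit can be checked numerically (a sketch with my own helper names; $k = 1$, $d = 3$, $m = h = 1$):

```python
import math

def s_N(eps, v, N, d=3, m=1.0, h=1.0):
    # entropy per particle with correct Boltzmann counting, finite N:
    # S = log( omega(E, N) / (N! h^{dN}) ), omega = m V^N c_{Nd} (2mE)^{Nd/2 - 1}
    V, E, nd = v * N, eps * N, N * d
    log_c = math.log(2) + (nd / 2) * math.log(math.pi) - math.lgamma(nd / 2)
    log_omega = (math.log(m) + N * math.log(V) + log_c
                 + (nd / 2 - 1) * math.log(2 * m * E))
    return (log_omega - math.lgamma(N + 1) - nd * math.log(h)) / N

def s_limit(eps, v, d=3, m=1.0, h=1.0):
    # the N -> infinity limit derived above
    return (d + 2) / 2 + math.log(v) + (d / 2) * math.log(4 * math.pi * m * eps / (d * h * h))

print(s_N(1.0, 2.0, 10**6), s_limit(1.0, 2.0))   # agree to several decimals
```

The finite-$N$ value approaches the limit at rate $O(\log N / N)$, consistent with the Stirling corrections dropped in the derivation.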

5.2 Thermodynamic function: Free energy

We shall prove that the canonical free energy density exists in the thermodynamic limit for very general interactions. We consider a general interacting gas of $N$ identical particles of mass $m$ in $d$ dimensions, contained in the box $\Lambda$ of volume $V$ with elastic boundary conditions. The Hamiltonian for this system is
$$ H_\Lambda^{(N)} = \frac{1}{2m}\sum_{i=1}^N p_i^2 + U(q_1,\ldots,q_N). $$

We have for the partition function Z_Λ(β, N)

$$ Z_\Lambda(\beta, N) = \frac{1}{N!\,h^{Nd}}\int e^{-\beta H^{(N)}(x)}\,\mathrm{d}x = \frac{1}{N!\,h^{Nd}}\Big(\int_{\mathbb{R}^d} e^{-\beta\frac{p^2}{2m}}\,\mathrm{d}p\Big)^{N}\int_{\Lambda^N} e^{-\beta U(q)}\,\mathrm{d}q $$
$$ = \frac{1}{N!}\Big(\frac{2\pi m}{\beta h^2}\Big)^{\frac{Nd}{2}}\int_{\Lambda^N} e^{-\beta U(q)}\,\mathrm{d}q = \frac{1}{N!\,\lambda^{dN}}\int_{\Lambda^N} e^{-\beta U(q)}\,\mathrm{d}q, $$

where λ = h(β/(2πm))^{1/2} denotes the thermal wavelength.
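The Gaussian momentum integral used in this factorisation, ∫_ℝ e^{−βp²/2m} dp = (2πm/β)^{1/2}, can be checked by direct numerical integration (arbitrary sample values of β and m, our own choice):

```python
import math

beta, m = 1.3, 0.7          # arbitrary sample values

# midpoint Riemann sum on [-P, P], wide enough for the Gaussian tails
P, n = 30.0, 200000
dp = 2*P/n
riemann = sum(math.exp(-beta*(-P + (i + 0.5)*dp)**2/(2*m))
              for i in range(n)) * dp

closed = math.sqrt(2*math.pi*m/beta)
print(riemann, closed)      # agree to high precision
```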

We shall assume that the interaction potential is given by a pair interaction potential φ: ℝ₊ → ℝ, that is

$$ U(q_1,\ldots,q_N) = \sum_{1\le i<j\le N} \varphi(|q_i - q_j|), \qquad (q_1,\ldots,q_N)\in\mathbb{R}^{dN}, $$

where |x| denotes the Euclidean norm of the vector x ∈ ℝ^d.


Definition 5.2 Let the pair potential function φ: ℝ₊ → ℝ be given.

(i) The pair potential φ is tempered if there exists R > 0 such that φ(|q|) ≤ 0 if |q| > R, q ∈ ℝ^d.

(ii) The pair potential φ is stable if there exists B ≥ 0 such that for all q₁, q₂, …, q_N in ℝ^d

$$ \sum_{1\le i<j\le N} \varphi(|q_i - q_j|) \ge -NB. $$

(iii) If the pair potential is not stable then it is said to be catastrophic.

Recall that the free energy in a box Λ ⊂ ℝ^d for an inverse temperature β and particle number N is given by

$$ A_\Lambda(\beta, N) = -\frac{1}{\beta}\log Z_\Lambda(\beta, N). $$

Theorem 5.3 (Fisher–Ruelle) Let φ be a stable and tempered pair potential. Let R be as above in Definition 5.2, and for ρ ∈ (0,∞) let L₀ be such that ρ(L₀+R)^d ∈ ℕ. Let L_n = 2^n(L₀+R) − R and let Λ_n be the cube centred at the origin with side L_n and volume V_n = |Λ_n| = L_n^d. Let N_n = ρ(L₀+R)^d 2^{dn}. If

$$ f_n(\beta,\rho) = \frac{1}{V_n}\,A_{\Lambda_n}(\beta, N_n), $$

then lim_{n→∞} f_n(β,ρ) exists.

Figure 1: typical form of the pair potential φ

Figure 2: Λ_n contains 4 cubes Λ^{(1)}_{n−1}, …, Λ^{(4)}_{n−1} of side length L_{n−1} (d = 2), separated by corridors of width R


Proof. Note that lim_{n→∞} N_n/V_n = ρ. Note also that N_n = 2^d N_{n−1} and L_n = 2L_{n−1} + R. Because of the last equation Λ_n contains 2^d cubes of side L_{n−1} with a corridor of width R between them. Denote these by Λ^{(i)}_{n−1}, i = 1, …, 2^d, and let

$$ U_N(q_1,\ldots,q_N) = \sum_{1\le i<j\le N} \varphi(|q_i - q_j|). $$

Let Z_n = Z_{Λ_n}(β, N_n) and g_n = (1/N_n) log Z_n. It is sufficient to prove that g_n converges. We have

$$ Z_n = \frac{1}{N_n!\,\lambda^{dN_n}}\int_{\Lambda_n^{N_n}} \mathrm{d}q_1\ldots\mathrm{d}q_{N_n}\, e^{-\beta U_{N_n}(q_1,\ldots,q_{N_n})}. $$
Let

$$ \widetilde\Lambda_n = \big\{(q_1,\ldots,q_{N_n}) \in \Lambda_n^{N_n} : \text{each } \Lambda^{(i)}_{n-1} \text{ contains } N_{n-1} \text{ of the } q_k\big\}. $$

Note that for (q₁, …, q_{N_n}) ∈ Λ̃_n there are no q_k's in the corridors between the Λ^{(i)}_{n−1}. Let

$$ \Delta_n = \big\{(q_1,\ldots,q_{N_n}) \in \Lambda_n^{N_n} : q_1,\ldots,q_{N_{n-1}} \in \Lambda^{(1)}_{n-1},\ q_{N_{n-1}+1},\ldots,q_{2N_{n-1}} \in \Lambda^{(2)}_{n-1},\ \ldots,\ q_{(2^d-1)N_{n-1}+1},\ldots,q_{2^d N_{n-1}} \in \Lambda^{(2^d)}_{n-1}\big\}. $$

Since Λ̃_n ⊂ Λ_n^{N_n},

$$ Z_n \ge \frac{1}{N_n!\,\lambda^{dN_n}}\int_{\widetilde\Lambda_n} \mathrm{d}q_1\ldots\mathrm{d}q_{N_n}\, e^{-\beta U_{N_n}(q_1,\ldots,q_{N_n})} = \frac{N_n!}{(N_{n-1}!)^{2^d}}\,\frac{1}{N_n!\,\lambda^{dN_n}}\int_{\Delta_n} \mathrm{d}q_1\ldots\mathrm{d}q_{N_n}\, e^{-\beta U_{N_n}(q_1,\ldots,q_{N_n})}. $$

Since φ is tempered, we get for (q₁, …, q_{N_n}) ∈ Δ_n

$$ U_{N_n}(q_1,\ldots,q_{N_n}) \le U_{N_{n-1}}(q_1,\ldots,q_{N_{n-1}}) + \cdots + U_{N_{n-1}}(q_{(2^d-1)N_{n-1}+1},\ldots,q_{2^d N_{n-1}}), $$

because all cross terms between distinct sub-cubes involve distances larger than R and are therefore non-positive.
Thus

$$ Z_n \ge \frac{1}{(N_{n-1}!)^{2^d}\,\lambda^{dN_n}}\Big(\int_{\Lambda_{n-1}^{N_{n-1}}} \mathrm{d}q_1\ldots\mathrm{d}q_{N_{n-1}}\, e^{-\beta U_{N_{n-1}}(q_1,\ldots,q_{N_{n-1}})}\Big)^{2^d} = (Z_{n-1})^{2^d}. $$
Therefore

$$ g_n = \frac{1}{N_n}\log Z_n = \frac{1}{2^d N_{n-1}}\log Z_n \ge \frac{1}{2^d N_{n-1}}\log (Z_{n-1})^{2^d} = g_{n-1}. $$

The sequence (g_n)_{n∈ℕ} is increasing. To prove that g_n converges it is sufficient to show that g_n is bounded from above. Since the pair potential φ is stable we have

$$ U_{N_n}(q_1,\ldots,q_{N_n}) \ge -BN_n $$
and therefore

$$ Z_n = \frac{1}{N_n!\,\lambda^{dN_n}}\int_{\Lambda_n^{N_n}} \mathrm{d}q_1\ldots\mathrm{d}q_{N_n}\, e^{-\beta U_{N_n}(q_1,\ldots,q_{N_n})} \le \frac{1}{N_n!\,\lambda^{dN_n}}\int_{\Lambda_n^{N_n}} \mathrm{d}q_1\ldots\mathrm{d}q_{N_n}\, e^{\beta B N_n} = \frac{V_n^{N_n}}{N_n!\,\lambda^{dN_n}}\,e^{\beta B N_n}. $$

Thus

$$ g_n \le \log V_n - \frac{1}{N_n}\log N_n! - d\log\lambda + \beta B = \log\frac{V_n}{N_n} + \Big(\log N_n - \frac{1}{N_n}\log N_n!\Big) - d\log\lambda + \beta B \le -\log\rho + 2 - d\log\lambda + \beta B $$

for large n, since

$$ \lim_{n\to\infty}\frac{N_n}{V_n} = \rho \qquad\text{and}\qquad \lim_{n\to\infty}\Big(\log N_n - \frac{1}{N_n}\log N_n!\Big) = 1 $$

by Stirling's formula. Hence (g_n)_{n∈ℕ} is increasing and bounded from above, so it converges; since N_n/V_n → ρ, also f_n(β,ρ) = −(N_n/V_n)β^{−1}g_n converges.
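The mechanism of the proof can be seen numerically in the simplest case φ = 0 (free gas), which is stable with B = 0 and tempered for any R; then Z_n = V_n^{N_n}/(N_n! λ^{dN_n}) is explicit. With the sample choices d = 1, ρ = 1, L₀ + R = 2, λ = 1 (our own illustration), the sequence g_n is indeed increasing and approaches −log ρ + 1 = 1:

```python
import math

d, R, rho = 1, 0.5, 1.0     # sample values (our own choice)
L0R = 2.0                   # L0 + R, chosen so rho*(L0+R)**d is an integer

def g(n):
    # g_n = (1/N_n) log Z_n for the free gas with lambda = 1
    N = int(rho * L0R**d * 2**(d*n))        # N_n = rho (L0+R)^d 2^{dn}
    V = (2**n * L0R - R)**d                 # V_n = L_n^d, L_n = 2^n(L0+R) - R
    return math.log(V) - math.lgamma(N + 1)/N

gs = [g(n) for n in range(1, 12)]
print(gs)   # increasing, tending to -log(rho) + 1 = 1
```

The corridors make V_n slightly smaller than (2^n(L₀+R))^d, which is exactly why the sequence approaches its limit from below.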

5.3

Equivalence of ensembles

The equivalence of the Gibbs ensembles is the key problem of equilibrium statistical mechanics. It goes back to Gibbs (1902), who conjectured that both the canonical and the grandcanonical Gibbs ensemble are equivalent with the microcanonical ensemble. The main difficulty in answering the question of equivalence lies in the precise definition of the notion of equivalence. Nowadays the term can have three different meanings, each on a different level of information. We briefly introduce these concepts, but refer for details to the following research articles ([Ada01], [Geo95], [Geo93]).

Equivalence at the level of thermodynamic functions

Under general assumptions on the interaction potential, e.g. stability and temperedness as in Subsection 5.2, one is able to prove the following thermodynamic limits of the thermodynamic functions given by the three Gibbs ensembles. Let (Λ_n)_{n∈ℕ} be any cofinal sequence of boxes with corresponding sequences of energy densities (ε_n)_{n∈ℕ} and particle densities (ρ_n)_{n∈ℕ}. Then there is a close-packing density ρ^{(cp)} ∈ (0,∞) and an energy density ε(ρ) ∈ (0,∞) such that the following limits exist under some additional requirements (details are in [Rue69]) depending on the specific model chosen.
Grandcanonical Gibbs ensemble Let β > 0 be the inverse temperature and μ ∈ ℝ the chemical potential. Then the function p(β,μ), defined by

$$ p(\beta,\mu) = \lim_{n\to\infty}\frac{1}{\beta|\Lambda_n|}\log Z_{\Lambda_n}(\beta,\mu), $$

is called the pressure.


Canonical Gibbs ensemble Let β > 0 be the inverse temperature and (ρ_n)_{n∈ℕ} with ρ_n → ρ ∈ (0, ρ^{(cp)}) a sequence of particle densities ρ_n = N_n/|Λ_n|. Then the function f(β,ρ), defined by

$$ f(\beta,\rho) = -\lim_{n\to\infty}\frac{1}{\beta|\Lambda_n|}\log Z_{\Lambda_n}(\beta,\rho_n|\Lambda_n|), $$

is called the free energy.


Microcanonical Gibbs ensemble Let (ρ_n)_{n∈ℕ} with ρ_n → ρ ∈ (0, ρ^{(cp)}) be a sequence of particle densities ρ_n = N_n/|Λ_n|, and let (ε_n)_{n∈ℕ} with ε_n → ε ∈ (ε(ρ),∞) be a sequence of energy densities ε_n = E_n/|Λ_n|. Then the function s(ε,ρ), defined by

$$ s(\varepsilon,\rho) = \lim_{n\to\infty}\frac{1}{|\Lambda_n|}\log \omega_{\Lambda_n}\big(\varepsilon_n|\Lambda_n|,\, \rho_n|\Lambda_n|\big), $$

is called the entropy.

Now equivalence at the level of thermodynamic functions is given if all three thermodynamic functions in the thermodynamic limit are related to each other by Legendre–Fenchel transforms for specific regions of the parameters β, μ, ε and ρ. Roughly speaking, this kind of equivalence holds mainly in the absence of phase transitions, i.e., only for those parameters where there are no singularities. For details see the monograph [Rue69], where the following transforms are established:

$$ p(\beta,\mu) = \sup_{\rho\le\rho^{(\mathrm{cp})},\,\varepsilon>\varepsilon(\rho)}\big\{\mu\rho-\varepsilon+\beta^{-1}s(\varepsilon,\rho)\big\} $$
$$ f(\beta,\rho) = \inf_{\varepsilon>\varepsilon(\rho)}\big\{\varepsilon-\beta^{-1}s(\varepsilon,\rho)\big\} $$
$$ p(\beta,\mu) = \sup_{\rho\le\rho^{(\mathrm{cp})}}\big\{\mu\rho-f(\beta,\rho)\big\}. $$

Equivalence at the level of canonical and microcanonical Gibbs measures
In Section 6 we introduce the concept of Gibbs measures. A similar concept
can be formulated for canonical Gibbs measures (see [Geo79] for details)
as well as for microcanonical Gibbs measures (see [Tho74] and [AGL78] for
details). The idea behind these concepts is roughly speaking to condition
outside any finite region on particle density and energy density events.
Equivalence at the level of canonical and microcanonical Gibbs measures is
then given if the microcanonical and canonical Gibbs measures are certain
convex combinations of Gibbs measures (see details in [Tho74], [AGL78] and
[Geo79]).
Equivalence at the level of states/measures
At the level of states/measures one is interested in any weak (in the sense of probability measures, i.e., weak-*-topology) limit points (accumulation points) of the Gibbs ensembles. To define a consistent limiting procedure we
need an appropriate phase or configuration space for the infinite systems in
the thermodynamic limit. We consider here only continuous systems whose
phase space (configuration space) for any finite region and finitely many
particles is . In Section 6 we introduce the corresponding configuration
space for lattice systems. Define
$$ \Omega = \big\{\omega \subset \mathbb{R}^d\times\mathbb{R}^d : \omega = \{(q, p_q) : q \in \hat\omega\}\big\}, $$

where ω̂, the set of occupied positions, is a locally finite subset of ℝ^d, and p_q is the momentum of the particle at position q. Let B denote the σ-algebra of this set generated by the counting variables (see [Geo95] for details). Then each Gibbs ensemble can be extended trivially to a probability measure on (Ω, B) just by putting the whole mass on a subset. Therefore it makes sense to consider
all weak limit points in the thermodynamic limit. If the limit points are not unique, i.e., there are several accumulation points, one considers the whole set of accumulation points, closed appropriately, as the set of equilibrium states/measures or Gibbs measures. Equivalence at the level of states/measures is given if all accumulation points of the different Gibbs ensembles belong to the same set of equilibrium points or the same set of Gibbs measures ([Geo93], [Geo95], [Ada01]).
In the next section we develop the mathematical theory for Gibbs measures without any limiting procedure.

6

Gibbs measures

In this section we introduce the mathematical concept of Gibbs measures,


which are natural candidates to be the equilibrium measures for infinite systems, i.e., for systems after taking the thermodynamic limit. We will restrict
our study from now on to lattice systems, i.e., the phase space is given as the
set of functions (configurations) on some countable discrete set with values
in a finite set, called the state space.

6.1

Definition

Let Z^d be the square lattice for dimension d ≥ 1 and let E be any finite set. Define Ω := E^{Z^d} = {ω = (ω_i)_{i∈Z^d} : ω_i ∈ E}, the set of configurations with values in the state space E. Let E be the power set of E, and define the σ-algebra F = E^{⊗Z^d} such that (Ω, F) is a measurable space. Denote the set of all probability measures on (Ω, F) by P(Ω, F).

Definition 6.1 (Random field) Let P ∈ P(Ω, F). Any family (σ_i)_{i∈Z^d} of random variables which is defined on the probability space (Ω, F, P) and which takes values in (E, E) is called a random field.

If one considers the canonical setup, where σ_i: Ω → E are the projections for any i ∈ Z^d, a random field is synonymous with a probability measure P ∈ P(Ω, F). Let S = {Λ ⊂ Z^d : |Λ| < ∞} be the set of finite volume subsets of the square lattice Z^d. Cylinder events are defined as {σ_Λ ∈ A} for any A ∈ E^Λ and any projection σ_Λ: Ω → E^Λ for Λ ∈ S. Then F is the smallest σ-algebra containing all cylinder events. If Λ ∈ S, the σ-algebra F_Λ on Ω contains all cylinder events {σ_{Λ₀} ∈ A} for all A ∈ E^{Λ₀} and Λ₀ ⊂ Λ.

If we return to our physical intuition, we are interested in random fields for which the so-called spin variables σ_i exhibit a particular type of dependence. We employ a similar dependence structure as for Markov chains, where the dependence is expressed as a condition on past events. This approach was introduced by Dobrushin ([Dob68a], [Dob68b], [Dob68c]) and Lanford and Ruelle ([LR69]). Here, we condition on the complement of any finite set Λ ⊂ Z^d. To prescribe these conditional distributions of all finite collections of variables we define the σ-algebras

$$ \mathcal{T}_\Lambda = \mathcal{F}_{\mathbb{Z}^d\setminus\Lambda} \qquad\text{for any }\Lambda\in\mathcal{S}. \qquad(6.22) $$

The intersection of all these σ-algebras is denoted by T = ∩_{Λ∈S} T_Λ and called the tail-σ-algebra or tail-field. The dependence structure will be described by some functions linking the random variables and expressing the energy for a given dependence structure.
Definition 6.2 An interaction potential is a family Φ = (Φ_A)_{A∈S} of functions Φ_A: Ω → ℝ such that the following holds.

(i) Φ_A is F_A-measurable for all A ∈ S.

(ii) For any Λ ∈ S and any configuration ω ∈ Ω the expression

$$ H_\Lambda^\Phi(\omega) = \sum_{A\in\mathcal{S},\,A\cap\Lambda\neq\emptyset} \Phi_A(\omega) \qquad(6.23) $$

exists. The term exp(−βH_Λ^Φ(ω)) is called the Boltzmann factor for some parameter β > 0, where β is the inverse temperature.
Example 6.3 (Pair potential) Let Φ_A = 0 whenever |A| > 2 and let J: Z^d × Z^d → ℝ, U: E × E → ℝ and V: E → ℝ be symmetric and measurable. Then a general pair interaction potential is given by

$$ \Phi_A(\omega) = \begin{cases} J(i,j)\,U(\omega_i,\omega_j), & \text{if } A=\{i,j\},\ i\neq j,\\ J(i,i)\,V(\omega_i), & \text{if } A=\{i\},\\ 0, & \text{if } |A|>2, \end{cases} \qquad\text{for } \omega\in\Omega. $$
We combine configurations outside and inside of any finite set Λ of random variables as follows. Let ω, η ∈ Ω and Λ ∈ S be given. Then ω_Λ η_{Z^d∖Λ} ∈ Ω denotes the configuration with (ω_Λ η_{Z^d∖Λ})_Λ = ω_Λ and (ω_Λ η_{Z^d∖Λ})_{Z^d∖Λ} = η_{Z^d∖Λ}. With this notation we can define a nearest-neighbour Hamiltonian with given boundary condition.

Example 6.4 (Hamiltonian with boundary) Let Λ ∈ S, η ∈ Ω and the functions J, U and V as in Example 6.3 be given. Then

$$ H_\Lambda^\eta(\omega) = \frac{1}{2}\sum_{\substack{i,j\in\Lambda,\\ \langle i,j\rangle=1}} J(i,j)\,U(\omega_i,\omega_j) + \sum_{\substack{i\in\Lambda,\,j\in\Lambda^c,\\ \langle i,j\rangle=1}} J(i,j)\,U(\omega_i,\eta_j) + \sum_{i\in\Lambda} J(i,i)\,V(\omega_i) $$

denotes a Hamiltonian in Λ with nearest-neighbour interaction and with configurational boundary condition η_{Λ^c}, where ⟨x,y⟩ = max_{i∈{1,…,d}} |x_i − y_i| for x, y ∈ Z^d. Instead of a given configurational boundary condition one can model the free boundary condition and the periodic boundary condition as well.
In the following we fix a probability measure ρ ∈ P(E, E) on the state space and call it the reference or a priori measure. Later we may also consider the Lebesgue measure as reference measure. Choosing a probability measure as a reference measure for finite sets Λ gives just a constant from the normalisation.
Definition 6.5 (Gibbs measure)

(i) Let η ∈ Ω, Λ ∈ S, β > 0 the inverse temperature and Φ an interaction potential. Define for any event A ∈ F

$$ \gamma_\Lambda^{\beta,\Phi}(A\,|\,\eta) = Z_\Lambda^{\beta,\Phi}(\eta)^{-1}\int \rho^{\otimes\Lambda}(\mathrm{d}\omega_\Lambda)\, 1{\rm l}_A(\omega_\Lambda\eta_{\mathbb{Z}^d\setminus\Lambda})\,\exp\big(-\beta H_\Lambda^\Phi(\omega_\Lambda\eta_{\mathbb{Z}^d\setminus\Lambda})\big) \qquad(6.24) $$

with normalisation or partition function

$$ Z_\Lambda^{\beta,\Phi}(\eta) = \int \rho^{\otimes\Lambda}(\mathrm{d}\omega_\Lambda)\,\exp\big(-\beta H_\Lambda^\Phi(\omega_\Lambda\eta_{\mathbb{Z}^d\setminus\Lambda})\big). $$

Then γ_Λ^{β,Φ}(·|η) is called the Gibbs distribution in Λ with boundary condition η_{Z^d∖Λ}, with interaction potential Φ, inverse temperature β and reference measure ρ.

(ii) A probability measure μ ∈ P(Ω, F) is a Gibbs measure for the interaction potential Φ and inverse temperature β if

$$ \mu(A\,|\,\mathcal{T}_\Lambda) = \gamma_\Lambda^{\beta,\Phi}(A\,|\,\cdot\,) \qquad \mu\text{-a.s. for all } A\in\mathcal{F},\ \Lambda\in\mathcal{S}, \qquad(6.25) $$

where γ_Λ^{β,Φ} is the Gibbs distribution for the parameter β from (6.24). The set of Gibbs measures for inverse temperature β with interaction potential Φ is denoted by G(β, Φ).

(iii) An interaction potential Φ is said to exhibit a first-order phase transition if |G(β, Φ)| > 1 for some β > 0.
If the interaction potential is known we may skip its explicit appearance and write G(β) for the set of Gibbs measures with inverse temperature β. Note that the parameter β can always be incorporated in the interaction potential Φ.
Remark 6.6

(i) γ_Λ^{β,Φ}(A|·) is T_Λ-measurable for any event A ∈ F.

(ii) Equation (6.25) is called the DLR-equation or DLR-condition in honour of R. Dobrushin, O. Lanford and D. Ruelle.
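Before turning to the Ising model, the consistency behind the DLR-equation can be made concrete in a minimal numerical sketch (our own toy computation, not part of the text): averaging the single-site Gibbs distribution γ_{{1}} against the Gibbs distribution in the larger volume {0,1,2} reproduces the larger-volume probability of the same event. We use a nearest-neighbour Ising-type pair potential with arbitrary sample values of β, J, h and fixed boundary spins.

```python
import itertools
import math

# Gibbs distribution on Lambda' = {0,1,2} in Z with boundary spins
# eta_{-1} = +1 and eta_3 = -1; beta, J, h are arbitrary sample values.
beta, J, h = 0.8, 1.0, 0.3
left, right = +1, -1

def H(w):
    # Hamiltonian with boundary condition: bonds (-1,0), (0,1), (1,2), (2,3)
    s = [left] + list(w) + [right]
    return -J*sum(s[k]*s[k+1] for k in range(4)) - h*sum(w)

configs = list(itertools.product((-1, 1), repeat=3))
weights = {w: math.exp(-beta*H(w)) for w in configs}
Z = sum(weights.values())

def gamma_site1(w):
    # single-site Gibbs distribution gamma_{{1}}(sigma_1 = +1 | w0, w2)
    e = lambda y: math.exp(beta*(J*y*(w[0] + w[2]) + h*y))
    return e(+1)/(e(+1) + e(-1))

# consistency: averaging gamma_{{1}} over the larger Gibbs distribution
# reproduces the larger-volume probability of {sigma_1 = +1}
lhs = sum(weights[w]/Z * gamma_site1(w) for w in configs)
rhs = sum(weights[w]/Z for w in configs if w[1] == +1)
print(lhs, rhs)   # the two numbers coincide
```

This two-step averaging property, γ_{Λ'}γ_Λ = γ_{Λ'} for Λ ⊂ Λ', is exactly what the DLR-equation (6.25) imposes on an infinite-volume measure μ.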

6.2

The one-dimensional Ising model

In this subsection we study the one-dimensional Ising model. If the state space E is finite, then one can show a one-to-one correspondence between the set of all positive transition matrices and a suitable class of nearest-neighbour interaction potentials such that the set G(β) of Gibbs measures is a singleton containing the corresponding Markov chain distribution. This holds essentially for a geometric reason: in dimension one, conditioning on the boundary yields a two-sided Markov chain; for details see [Geo88]. A simple one-dimensional model which shows this equivalence was suggested in 1920 by W. Lenz ([Len20]), and its investigation by E. Ising ([Isi24]) was a first and important step towards a mathematical theory of phase transitions. Ising discovered that this model fails to exhibit a phase transition, and he conjectured that this would hold also in the multidimensional case. Nowadays we know that this is not true. In Subsection 6.4 we discuss the multidimensional case.
Let E = {−1, +1} be the state space and consider the lattice Z. At each site the spin can point downwards, i.e., −1, or upwards, i.e., +1. The nearest-neighbour interaction is modelled by a constant J, called the coupling constant, through the expression −Jω_iω_j for any i, j ∈ Z with |i − j| = 1:

J > 0: Any two adjacent spins have minimal energy if and only if they are aligned, i.e. have the same sign. This interaction is therefore ferromagnetic.

J < 0: Any two adjacent spins prefer to point in opposite directions. Thus this is a model of an antiferromagnet.


h ∈ ℝ: A constant h describes the action of an external field (directed upwards when h > 0).

Hence the nearest-neighbour interaction potential Φ^{J,h} = (Φ^{J,h}_A)_{A∈S} reads

$$ \Phi^{J,h}_A(\omega) = \begin{cases} -J\omega_i\omega_{i+1}, & \text{if } A=\{i,i+1\},\\ -h\omega_i, & \text{if } A=\{i\},\\ 0, & \text{else,} \end{cases} \qquad\text{for } \omega\in\Omega. \qquad(6.26) $$

We employ periodic boundary conditions, i.e., for Λ ⊂ Z finite and with Λ = {1, …, |Λ|} we set ω_{|Λ|+1} = ω_1 for any ω ∈ Ω. The Hamiltonian in Λ with periodic boundary conditions reads

$$ H^{(\mathrm{per})}_\Lambda(\omega) = -J\sum_{i=1}^{|\Lambda|}\omega_i\omega_{i+1} - h\sum_{i=1}^{|\Lambda|}\omega_i. \qquad(6.27) $$

The partition function depends on the inverse temperature β > 0, the coupling constant J and the external field h ∈ ℝ, and is given by

$$ Z_\Lambda(\beta, J, h) = \sum_{\omega_1=\pm1}\cdots\sum_{\omega_{|\Lambda|}=\pm1} \exp\big(-\beta H^{(\mathrm{per})}_\Lambda(\omega)\big). \qquad(6.28) $$

We compute this by the one-dimensional version of the transfer matrix formalism introduced in [KW41] for the two-dimensional Ising model. More details about this formalism and further investigations of lattice systems can be found in [BL99a] and [BL99b]. Crucial is the identity

$$ Z_\Lambda(\beta, J, h) = \sum_{\omega_1=\pm1}\cdots\sum_{\omega_{|\Lambda|}=\pm1} V_{\omega_1\omega_2} V_{\omega_2\omega_3}\cdots V_{\omega_{|\Lambda|-1}\omega_{|\Lambda|}} V_{\omega_{|\Lambda|}\omega_1} $$

with

$$ V_{\omega_i\omega_{i+1}} = \exp\Big(\beta\Big(\frac{1}{2}h\omega_i + J\omega_i\omega_{i+1} + \frac{1}{2}h\omega_{i+1}\Big)\Big) $$

for any ω ∈ Ω and i = 1, …, |Λ|. Hence Z_Λ(β, J, h) = Trace(V^{|Λ|}) with the symmetric matrix

$$ V = \begin{pmatrix} e^{\beta(J+h)} & e^{-\beta J}\\[2pt] e^{-\beta J} & e^{\beta(J-h)} \end{pmatrix} $$

that has the eigenvalues

$$ \lambda_\pm = e^{\beta J}\cosh(\beta h) \pm \big(e^{2\beta J}\sinh^2(\beta h) + e^{-2\beta J}\big)^{\frac{1}{2}}. \qquad(6.29) $$

This gives finally

$$ Z_\Lambda(\beta, J, h) = \lambda_+^{|\Lambda|} + \lambda_-^{|\Lambda|}. \qquad(6.30) $$
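Identity (6.30) can be verified by brute force on a small periodic chain; the following sketch (sample parameters of our own choosing) compares the direct sum over all 2^{|Λ|} configurations with the eigenvalue expression.

```python
import itertools
import math

beta, J, h, n = 0.7, 1.0, 0.4, 10   # sample parameters (our own choice)

# direct sum over all 2^n periodic configurations (omega_{n+1} = omega_1)
Z = 0.0
for w in itertools.product((-1, 1), repeat=n):
    H = -J*sum(w[i]*w[(i + 1) % n] for i in range(n)) - h*sum(w)
    Z += math.exp(-beta*H)

# eigenvalues (6.29) of the transfer matrix
root = math.sqrt(math.exp(2*beta*J)*math.sinh(beta*h)**2
                 + math.exp(-2*beta*J))
lam_p = math.exp(beta*J)*math.cosh(beta*h) + root
lam_m = math.exp(beta*J)*math.cosh(beta*h) - root
print(Z, lam_p**n + lam_m**n)   # identical up to rounding
```

The agreement is exact (up to floating-point rounding) for every chain length, since (6.30) is an identity and not an asymptotic statement.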

This is a smooth expression in the external field parameter h and the inverse temperature β; it rules out the appearance of a discontinuous isothermal magnetisation: so far, no phase transition. The thermodynamic limit of the free energy per volume is

$$ f(\beta,J,h) = \lim_{\Lambda\uparrow\mathbb{Z}} -\frac{1}{\beta|\Lambda|}\log Z_\Lambda(\beta,J,h) = -\frac{1}{\beta}\log\lambda_+, \qquad(6.31) $$

because $|\Lambda|^{-1}\log Z_\Lambda(\beta,J,h) = \log\lambda_+ + |\Lambda|^{-1}\log\big(1+(\lambda_-/\lambda_+)^{|\Lambda|}\big)$. The magnetisation in the canonical ensemble is given as the partial derivative of the specific free energy per volume,

$$ m(\beta,J,h) = -\partial_h f(\beta,J,h) = \frac{\sinh(\beta h)}{\sqrt{\sinh^2(\beta h) + e^{-4\beta J}}}. $$

This is odd in h, with m(β,J,0) = 0 and lim_{h→∞} m(β,J,h) = 1, and for all h ≠ 0 we have |m(β,J,h)| > |m(β,0,h)|, saying that the absolute value of the magnetisation is increased by a non-vanishing coupling constant J > 0. The set G(β,Φ^{J,h}) of Gibbs measures contains only one element, called μ^{J,h}; see [Geo88] for the explicit construction of this measure as the corresponding Markov chain distribution. Here we outline only the main steps.
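Before outlining those steps, the closed form for m(β,J,h) can be checked against the finite-volume expectation ⟨σ₀⟩ under the periodic Gibbs measure (sample parameters, our own illustration); the deviation decays like (λ₋/λ₊)^{|Λ|} and is already tiny at |Λ| = 14.

```python
import itertools
import math

beta, J, h, n = 0.5, 1.0, 0.5, 14   # sample parameters (our own choice)

Z = mag = 0.0
for w in itertools.product((-1, 1), repeat=n):
    H = -J*sum(w[i]*w[(i + 1) % n] for i in range(n)) - h*sum(w)
    weight = math.exp(-beta*H)
    Z += weight
    mag += w[0]*weight          # accumulate <sigma_0> numerator

m_exact = math.sinh(beta*h)/math.sqrt(math.sinh(beta*h)**2
                                      + math.exp(-4*beta*J))
print(mag/Z, m_exact)           # already very close at n = 14
```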
1.) The nearest-neighbour interaction potential Φ^{J,h} in (6.26) defines in the usual way the Gibbs distributions γ^{J,h}_Λ(·|ω) for any finite Λ ⊂ Z and any boundary condition ω ∈ Ω. Define the function g: E³ → (0,∞) by

$$ \gamma^{J,h}_{\{i\}}(\sigma_i = y\,|\,\omega) = g(\omega_{i-1}, y, \omega_{i+1}), \qquad y\in E,\ i\in\mathbb{Z},\ \omega\in\Omega. \qquad(6.32) $$

We compute

$$ g(x,y,z) = \frac{e^{\beta y(h+Jx+Jz)}}{2\cosh(\beta(h+Jx+Jz))} \qquad\text{for } x,y,z\in E. \qquad(6.33) $$

Fix any a ∈ E. Then the matrix

$$ Q = \Big(\frac{g(a,x,y)}{g(a,a,y)}\Big)_{x,y\in E} \qquad(6.34) $$

is positive. By the well-known theorem of Perron and Frobenius there is a unique positive maximal eigenvalue q of Q with a corresponding strictly positive right eigenvector r.

2.) The matrix P^{J,h}, defined as

$$ P^{J,h} = \Big(\frac{Q(x,y)\,r(y)}{q\,r(x)}\Big)_{x,y\in E}, \qquad(6.35) $$

is uniquely determined by the matrix Q and therefore by g in (6.33). Clearly, P^{J,h} is stochastic. We then let μ^{J,h} ∈ P(Ω) denote (the distribution of) the unique stationary Markov chain with transition matrix P^{J,h}. It is uniquely defined by

$$ \mu^{J,h}(\sigma_i = x_0,\, \sigma_{i+1} = x_1,\, \ldots,\, \sigma_{i+n} = x_n) = \alpha^{J,h}(x_0)\prod_{k=1}^{n} P^{J,h}(x_{k-1}, x_k), \qquad(6.36) $$

where i ∈ Z, n ∈ ℕ, x₀, …, x_n ∈ E, and α^{J,h} is the invariant distribution satisfying α^{J,h} P^{J,h} = α^{J,h}.


The expectation at each site is given by

$$ \mathbb{E}_{\mu^{J,h}}(\sigma_i) = \big(e^{-4\beta J} + \sinh^2(\beta h)\big)^{-\frac{1}{2}}\sinh(\beta h), \qquad i\in\mathbb{Z}. $$
In the low temperature limit one is interested in the behaviour of the set G(β,Φ) of Gibbs measures as β → ∞. A configuration η ∈ Ω is called a ground state of the interaction potential Φ^{J,h} if for each site i ∈ Z the pair (η_i, η_{i+1}) is a minimal point of the function

$$ \psi: \{-1,+1\}^2 \to \mathbb{R};\qquad (x,y)\mapsto \psi(x,y) = -Jxy - \frac{1}{2}h(x+y). $$

Note that the interaction potential Ψ = (Ψ_A)_{A∈S} with

$$ \Psi_A = \begin{cases} \psi(\omega_i,\omega_{i+1}), & \text{if } A=\{i,i+1\},\\ 0, & \text{otherwise,} \end{cases} $$

is equivalent to the given nearest-neighbour interaction potential Φ^{J,h}. We denote the constant configuration with only upward spins by η⁺ (respectively the constant configuration with only downward spins by η⁻). The Dirac measures on these constant configurations are denoted by δ_{η⁺} respectively δ_{η⁻}.
Then, for h > 0, we get that

$$ \mu^{J,h} \to \delta_{\eta^+} \qquad\text{weakly in the sense of probability measures as } \beta\to\infty, $$

and hence η⁺ is the unique ground state of the nearest-neighbour interaction potential Φ^{J,h}. Similarly, for h < 0, we get that

$$ \mu^{J,h} \to \delta_{\eta^-} \qquad\text{weakly in the sense of probability measures as } \beta\to\infty, $$

and hence η⁻ is the unique ground state of the nearest-neighbour interaction potential Φ^{J,h}. In the case h = 0 the nearest-neighbour interaction potential has precisely two ground states, namely η⁺ and η⁻, and hence we get

$$ \mu^{J,0} \to \frac{1}{2}\delta_{\eta^+} + \frac{1}{2}\delta_{\eta^-} \qquad\text{weakly in the sense of probability measures as } \beta\to\infty. $$

6.3

Symmetry and symmetry breaking

Before we study the two-dimensional Ising model, we briefly discuss the role of symmetries for Gibbs measures and their connection with phase transitions. As is seen from the spontaneous magnetisation below the Curie temperature, the spin system takes one of several possible equilibrium states, each of which is characterised by a well-defined direction of magnetisation. In particular, these equilibrium states fail to be preserved by the spin reversal (spin-flip) transformation. Thus the breaking of symmetries has some connection with the occurrence of phase transitions.

Let T denote the set of transformations

$$ \tau: \Omega\to\Omega,\qquad \omega\mapsto \big(\tau_i\,\omega_{\pi^{-1}(i)}\big)_{i\in\mathbb{Z}^d}, $$

where π: Z^d → Z^d is any bijection of the lattice Z^d, and the τ_i: E → E, i ∈ Z^d, are invertible measurable transformations of E with measurable inverses. Each τ ∈ T is a composition of a spatial transformation and the spin transformations τ_i, i ∈ Z^d, which act separately at distinct sites of the square lattice Z^d.
Example 6.7 (Spatial shifts) Denote by Θ = (θ_j)_{j∈Z^d} the group of all spatial transformations or spatial shifts or shift transformations

$$ \theta_j: \Omega\to\Omega,\qquad (\omega_i)_{i\in\mathbb{Z}^d} \mapsto (\omega_{i-j})_{i\in\mathbb{Z}^d}. $$

Example 6.8 (Spin-flip transformation) Let the state space E be a symmetric Borel set of ℝ and define the spin-flip transformation

$$ \tau: \Omega\to\Omega,\qquad (\omega_i)_{i\in\mathbb{Z}^d} \mapsto (-\omega_i)_{i\in\mathbb{Z}^d}. $$
Notation 6.9 The set of all translation invariant probability measures on Ω is denoted by P_Θ(Ω, F) = {μ ∈ P(Ω, F) : μ = μ∘θ_i^{−1} for any i ∈ Z^d}. The set of all translation invariant Gibbs measures for the interaction potential Φ and inverse temperature β is denoted by G_Θ(β, Φ) = {μ ∈ G(β, Φ) : μ = μ∘θ_i^{−1} for any i ∈ Z^d}.
Definition 6.10 (Symmetry breaking) A symmetry τ ∈ T is said to be broken if there exists some μ ∈ G(β, Φ) such that τ(μ) ≠ μ, where τ(μ) := μ∘τ^{−1}.

A direct consequence of symmetry breaking is that |G(β, Φ)| > 1, i.e., when there is symmetry breaking the interaction potential exhibits a phase transition. There are models where all possible symmetries are broken as well as models where only a subset of symmetries is broken. A first example is the one-dimensional inhomogeneous Ising model, which is probably the

simplest model showing symmetry breaking. The one-dimensional inhomogeneous Ising model on the lattice ℕ has the inhomogeneous nearest-neighbour interaction potential Φ = (Φ_A)_{A∈S}, defined for a sequence (J_n)_{n∈ℕ} of real numbers J_n > 0 for all n ∈ ℕ such that Σ_{n∈ℕ} e^{−2J_n} < ∞, as follows:

$$ \Phi_A = \begin{cases} -J_n\,\omega_n\omega_{n+1}, & \text{if } A=\{n,n+1\},\\ 0, & \text{otherwise.} \end{cases} $$
This model is spatially inhomogeneous; the potential is invariant under the spin-flip transformation, but for β > 0 some Gibbs measures are not invariant under this spin-flip transformation (for details see [Geo88]). The simplest spatially shift-invariant model which exhibits a phase transition is the two-dimensional Ising model, which we will study in the next Subsection 6.4. This model breaks the spin-flip symmetry while the shift-invariance is preserved. Another example of symmetry breaking is the discrete two-dimensional Gaussian model of Shlosman ([Shl83]). Here the spatial shift invariance is broken. More information can be found in [Geo88] or [GHM00].

6.4

The Ising ferromagnet in two dimensions

Let E = {−1, 1} be the state space and define the nearest-neighbour interaction potential Φ = (Φ_A)_{A∈S} as

$$ \Phi_A(\omega) = \begin{cases} -\omega_i\omega_j, & \text{if } A=\{i,j\},\ |i-j|=1,\\ 0, & \text{otherwise.} \end{cases} $$

The interaction potential Φ is invariant under the spin-flip transformation τ and the shift transformations θ_i, i ∈ Z^d. Let δ_{η⁺}, δ_{η⁻} be the Dirac measures for the constant configurations η⁺ and η⁻. The interaction potential Φ takes its minimum at η⁺ and η⁻; hence η⁺ and η⁻ are ground states for the system. The ground state degeneracy implies a phase transition if η⁺, η⁻ are stable in the sense that the set of Gibbs measures G(β, Φ) is attracted by each of the measures δ_{η⁺} and δ_{η⁻} for β → ∞. Let d denote the Lévy metric compatible with weak convergence in the sense of probability measures.
Theorem 6.11 (Phase transition) Under the above assumptions it holds that

$$ \lim_{\beta\to\infty} d\big(\mathcal{G}_\Theta(\beta,\Phi),\,\delta_{\eta^+}\big) = \lim_{\beta\to\infty} d\big(\mathcal{G}_\Theta(\beta,\Phi),\,\delta_{\eta^-}\big) = 0. $$

For sufficiently large β there exist two shift-invariant Gibbs measures μ⁺, μ⁻ ∈ G_Θ(β, Φ) with τ(μ⁺) = μ⁻ and

$$ \mu^-(\sigma_0) = \mathbb{E}_{\mu^-}(\sigma_0) < 0 < \mu^+(\sigma_0) = \mathbb{E}_{\mu^+}(\sigma_0). $$

Remark 6.12

(i) μ⁺(σ₀) is the mean magnetisation. Thus: the two-dimensional Ising ferromagnet admits an equilibrium state/measure of positive magnetisation although there is no action of an external field. This phenomenon is called spontaneous magnetisation.

(ii) The equivalence |G_Θ(β,Φ)| > 1 ⟺ μ⁺(σ₀) > 0 goes back to [LL72]. Moreover, the Griffiths inequality implies that the magnetisation μ⁺(σ₀) is a non-negative non-decreasing function of β. Moreover, there is a critical inverse temperature β_c such that |G_Θ(β,Φ)| = 1 when β < β_c and |G_Θ(β,Φ)| > 1 when β > β_c. The value of β_c is

$$ \beta_c = \frac{1}{2}\sinh^{-1}(1) = \frac{1}{2}\log\big(1+\sqrt{2}\big), $$

and the magnetisation for β ≥ β_c is

$$ \mu^+(\sigma_0) = \big(1 - (\sinh 2\beta)^{-4}\big)^{\frac{1}{8}}. $$

(iii) For the same model in three dimensions one has again μ⁺, μ⁻ ∈ G_Θ(β, Φ), but there also exist non-shift-invariant Gibbs measures ([Dob73]).
Proof of Theorem 6.11. Let Λ ⊂ Z² be a centred cube. Denote by

$$ B_\Lambda = \big\{\{i,j\}\subset\mathbb{Z}^2 : |i-j|=1,\ \{i,j\}\cap\Lambda\neq\emptyset\big\} $$

the set of all nearest-neighbour bonds which emanate from sites in Λ. Each bond b = {i,j} ∈ B_Λ should be visualised as a line segment between i and j. This line segment crosses a unique dual line segment between two nearest-neighbour sites u, v in the dual lattice (the lattice shifted by ½ in both canonical directions). The associated set b* = {u,v} is called the dual bond of b, and we write

$$ B^*_\Lambda = \{b^* : b\in B_\Lambda\} = \big\{\{u,v\} : |u-v|=1\big\} $$

for the set of all dual bonds. Note that

$$ b^* = \Big\{u\in\mathbb{Z}^2+\big(\tfrac{1}{2},\tfrac{1}{2}\big) : \big|u-(i+j)/2\big| = \tfrac{1}{2}\Big\} \qquad\text{for } b=\{i,j\}. $$

A set c ⊂ B*_Λ is called a circuit of length l if c = {{u^{(k−1)}, u^{(k)}} : 1 ≤ k ≤ l} for some (u^{(0)}, …, u^{(l)}) with u^{(l)} = u^{(0)}, |{u^{(1)}, …, u^{(l)}}| = l and {u^{(k−1)}, u^{(k)}} ∈ B*_Λ, 1 ≤ k ≤ l. A circuit c surrounds a site a ∈ Λ if for all paths (i^{(0)}, …, i^{(n)}) in Z² with i^{(0)} = a and i^{(n)} ∉ Λ and {i^{(m−1)}, i^{(m)}} ∈ B_Λ for all 1 ≤ m ≤ n there exists an m with {i^{(m−1)}, i^{(m)}}* ∈ c. We denote the set of circuits which surround a by C_a. We need a first lemma.

Lemma 6.13 For all a ∈ Λ and l ≥ 1 we have

$$ \big|\{c\in C_a : |c| = l\}\big| \le l\,3^{l-1}. $$

Proof. Each c ∈ C_a of length l contains at least one of the l dual bonds

$$ \{a+(k-1,0),\ a+(k,0)\}^*, \qquad k = 1,\ldots,l, $$

which cross, for example, the horizontal half-axis from a to the right. The remaining l − 1 dual bonds are successively added; at each step there are at most 3 possible choices.

The ingenious idea of Peierls ([Pei36]) was to look at circuits which occur in a configuration. For each ω ∈ Ω we let

$$ B^*_\Lambda(\omega) = \big\{b^* : b = \{i,j\}\in B_\Lambda,\ \omega_i\neq\omega_j\big\} $$

denote the set of all dual bonds in B*_Λ which cross a bond between spins of opposite sign. A circuit c with c ⊂ B*_Λ(ω) is called a contour for ω. We let ω be constant outside of Λ. As in Figure 3 we put +-spins outside. If a site a ∈ Λ is occupied by a minus spin then a is surrounded by a contour for ω.

The idea for the proof of Theorem 6.11 is as follows. Fix the + boundary condition outside of Λ. Then the minus spins in Λ form (with high probability) small islands in an ocean of plus spins. In the limit Λ ↑ Z² one obtains a μ⁺ ∈ G(β) which is close to the Dirac measure δ_{η⁺} for β sufficiently large. As δ_{η⁺} and δ_{η⁻} are distinct, so are μ⁺ and μ⁻ when β is large. Hence |G(β)| > 1 when β is large. We turn to the details. The next lemma just ensures the existence of a contour for positive boundary conditions and one minus spin in Λ. We cite it without proof.

Lemma 6.14 Let ω ∈ Ω with ω_i = +1 for all i ∈ Λ^c and ω_a = −1 for some a ∈ Λ. Then there exists a contour for ω which surrounds a.

Now we are at the heart of the Peierls argument, which is formulated in the next lemma.

Lemma 6.15 Suppose c ⊂ B*_Λ is a circuit. Then

$$ \gamma_\Lambda\big(c\subset B^*_\Lambda(\omega)\,\big|\,\eta\big) \le e^{-2\beta|c|} $$

for all β > 0 and for all η ∈ Ω.

Figure 3: a contour for the + boundary condition

Proof. Note that for all ω ∈ Ω we have

$$ H_\Lambda(\omega) = -\sum_{\{i,j\}\in B_\Lambda}\omega_i\omega_j = -|B_\Lambda| + \sum_{\{i,j\}\in B_\Lambda}(1-\omega_i\omega_j) = -|B_\Lambda| + 2\,\big|\{\{i,j\}\in B_\Lambda : \omega_i\neq\omega_j\}\big| = -|B_\Lambda| + 2|B^*_\Lambda(\omega)|. $$
Now we define two disjoint sets of configurations which we need later for an estimate,

$$ A_1 = \big\{\omega\in\Omega : \omega_{\mathbb{Z}^d\setminus\Lambda} = \eta_{\mathbb{Z}^d\setminus\Lambda},\ c\subset B^*_\Lambda(\omega)\big\}, $$
$$ A_2 = \big\{\omega\in\Omega : \omega_{\mathbb{Z}^d\setminus\Lambda} = \eta_{\mathbb{Z}^d\setminus\Lambda},\ c\cap B^*_\Lambda(\omega) = \emptyset\big\}. $$

There is a mapping τ_c: Ω → Ω with

$$ (\tau_c\omega)_i = \begin{cases} -\omega_i, & \text{if } i \text{ is surrounded by } c,\\ \omega_i, & \text{otherwise,} \end{cases} $$

which flips all spins in the interior of the circuit c. Moreover, for all {i,j} ∈ B_Λ we have

$$ (\tau_c\omega)_i(\tau_c\omega)_j = \begin{cases} \omega_i\omega_j, & \text{if } \{i,j\}^*\notin c,\\ -\omega_i\omega_j, & \text{if } \{i,j\}^*\in c, \end{cases} $$

resulting in B*_Λ(τ_cω) △ B*_Λ(ω) = c (this was the motivation behind the definition of the mapping), where △ denotes the symmetric difference of sets. In particular we get that τ_c is a bijection from A₁ to A₂, and for ω ∈ A₁ we have

$$ H_\Lambda(\omega) - H_\Lambda(\tau_c\omega) = 2|B^*_\Lambda(\omega)| - 2|B^*_\Lambda(\tau_c\omega)| = 2|c|. $$
Now we can estimate with the help of the sets of events A₁, A₂:

$$ \gamma_\Lambda\big(c\subset B^*_\Lambda(\omega)\,\big|\,\eta\big) \le \frac{\sum_{\omega\in A_1}\exp(-\beta H_\Lambda(\omega))}{\sum_{\omega\in A_2}\exp(-\beta H_\Lambda(\omega))} = \frac{\sum_{\omega\in A_1}\exp(-\beta H_\Lambda(\omega))}{\sum_{\omega\in A_1}\exp(-\beta H_\Lambda(\tau_c\omega))} = \exp(-2\beta|c|). $$
Now we finish the proof of Theorem 6.11. For β > 0 define

$$ r(\beta) = 1\wedge\sum_{l\ge 1} l\,(3e^{-2\beta})^l, $$

where ∧ denotes the minimum, and note that r(β) → 0 as β → ∞. The preceding lemmas yield

$$ \gamma_\Lambda(\sigma_a = -1\,|\,\eta^+) \le \sum_{c\in C_a}\gamma_\Lambda\big(c\subset B^*_\Lambda(\omega)\,\big|\,\eta^+\big) \le \sum_{c\in C_a} e^{-2\beta|c|} \le \sum_{l\ge 1} l\,3^{l-1}e^{-2\beta l}, $$

and thus γ_Λ(σ_a = −1 | η⁺) ≤ r(β) for all a ∈ Z², β > 0 and Λ ⊂ Z². Choose Λ_N = [−N, N]² ∩ Z² and define the approximating measures

$$ \mu_N^+ = \frac{1}{|\Lambda_N|}\sum_{i\in\Lambda_N}\gamma_{\Lambda_N+i}(\,\cdot\,|\,\eta^+). $$

As P(Ω, F) is compact, the sequence (μ_N⁺)_{N∈ℕ} has a cluster point μ⁺, and one can even show that μ⁺ ∈ G_Θ(β, Φ) ([Geo88]). Our estimate above then gives μ⁺(σ_a = −1) ≤ r(β), and in particular one can show that

$$ \lim_{\beta\to\infty}\mu^+ = \delta_{\eta^+} \qquad\text{and}\qquad \lim_{\beta\to\infty} d\big(\mathcal{G}_\Theta(\beta),\,\delta_{\eta^+}\big) = 0. $$

Note that μ⁻ = τ(μ⁺). Hence

$$ \mu^+(\sigma_0 = -1) = \mu^-(\sigma_0 = +1). $$

If β is so large that μ⁺(σ₀ = −1) ≤ r(β) < 1/3, then

$$ \mu^-(\sigma_0 = +1) = \mu^+(\sigma_0 = -1) < \frac{1}{3}. $$

But {σ₀ = −1} ∪ {σ₀ = +1} = Ω, and hence

$$ \mu^+(\sigma_0 = +1) = 1 - \mu^+(\sigma_0 = -1) \ge \frac{2}{3}, $$

so μ⁺ ≠ μ⁻.
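The bound r(β) used above can be summed in closed form, since Σ_{l≥1} l x^l = x/(1−x)² for 0 ≤ x < 1 with x = 3e^{−2β}. The sketch below (our own illustration) confirms the closed form and shows that r(β) < 1/3 already at β = 1.5, so the argument applies at this moderate inverse temperature.

```python
import math

def r_series(beta, terms=5000):
    # r(beta) = min(1, sum_{l>=1} l (3 e^{-2 beta})^l), truncated series
    x = 3*math.exp(-2*beta)
    return min(1.0, sum(l * x**l for l in range(1, terms)))

def r_closed(beta):
    # closed form of the series, valid for x = 3 e^{-2 beta} < 1
    x = 3*math.exp(-2*beta)
    return min(1.0, x/(1 - x)**2) if x < 1 else 1.0

beta = 1.5
print(r_series(beta), r_closed(beta))   # both about 0.206, below 1/3
```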


6.5

Extreme Gibbs measures

The set G(β, Φ) of Gibbs measures for some interaction potential Φ and inverse temperature β > 0 is a convex set, i.e., if μ, ν ∈ G(β, Φ) and 0 < s < 1 then sμ + (1−s)ν ∈ G(β, Φ). An extreme Gibbs measure (or, in physics, a pure state) is an extreme element of the convex set G(β, Φ). The set of all extreme Gibbs measures is denoted by ex G(β, Φ). Below we give a characterisation of extreme Gibbs measures. But first we briefly discuss microscopic and macroscopic quantities. A real function f: Ω → ℝ is said to describe a macroscopic observable if f is measurable with respect to the tail-σ-algebra T. The T-measurability of a function f means that the value of f is not affected by the behaviour of any finite set of spins. For example, the event

$$ \Big\{\omega\in\Omega : \lim_{n\to\infty}\frac{1}{|\Lambda_n|}\sum_{i\in\Lambda_n}\omega_i \text{ exists and belongs to } B\Big\}, \qquad B\in\mathcal{B}_{\mathbb{R}}, $$

is a tail event in T for any cofinal sequence (Λ_n)_{n∈ℕ} with Λ_n ↑ Z^d as n → ∞.


A function f describes a microscopic observable if it depends only on finitely many spins. A function f: Ω → ℝ is called a cylinder function or local function if it is F_Λ-measurable for some Λ ∈ S. The function f is called quasi-local if it can be approximated in supremum norm by local functions. The following theorem gives a characterisation of extreme Gibbs measures. It goes back to Lanford and Ruelle [LR69], and was introduced earlier in a weaker form by Dobrushin [Dob68a], [Dob68b].

Theorem 6.16 (Extreme Gibbs measures) A Gibbs measure μ ∈ G(β) is extreme if and only if μ is trivial on the tail-σ-algebra T, i.e. if μ(A) = 0 or μ(A) = 1 for any A ∈ T.
Microscopic quantities are subject to rapid fluctuations, in contrast to macroscopic quantities. A probability measure μ ∈ P(Ω, F) describing the equilibrium state of a given system is consistent with the observed empirical distributions of microscopic variables when it is a Gibbs measure. The second requirement even gives that macroscopic quantities are constant with probability one, and with Theorem 6.16 it follows that only extreme Gibbs measures are an appropriate description of equilibrium states. For this reason, an extreme Gibbs measure is often called a phase. However, this term should not be confused with the physical concept of a pure phase. Note that the stable coexistence of distinct pure phases in separated regions of space will also be represented by an extreme Gibbs measure (see Figure 4, which was taken from [Aiz80]). This can be seen quite nicely in the three-dimensional Ising model ([Dob73]), where a Gibbs measure is constructed via Gibbs distributions whose boundary condition is given by upward spins on one half-space and by downward spins on the other half-space. It is a tempting misunderstanding to believe that the coexistence of two pure phases is described by a mixture like ½(μ₁ + μ₂), μ₁, μ₂ ∈ G(β, Φ). Such a mixture rather corresponds to an uncertainty about the true phase of the system. See Figure 4 for an illustration of this fact.

6.6

Uniqueness

In this subsection we give a short intermezzo on the question of uniqueness of Gibbs measures, i.e. the situation when there is at most one Gibbs measure for the given interaction potential. One might guess that this question has something to do with the dependence structure introduced by the interaction potential. One is therefore led to check the dependence structure of the conditional Gibbs distributions at one given lattice site. For

Figure 4: an extreme Gibbs measure (coexistence of water and ice) versus the mixture $\frac12(\mu_{\text{water}} + \mu_{\text{ice}})$


that, fix any $i \in \mathbb{Z}^d$ and consider the $\omega_j$-dependence of the Gibbs distribution $\gamma_{\{i\}}(\cdot\,|\,\omega)$ for each $j \in \mathbb{Z}^d$ and for a given interaction potential $\Phi$. Introduce the matrix elements
$$C_{i,j}(\gamma) = \sup_{\substack{\omega,\tilde\omega \in \Omega,\\ \omega_{\mathbb{Z}^d\setminus\{j\}} = \tilde\omega_{\mathbb{Z}^d\setminus\{j\}}}} \big\|\gamma_{\{i\}}(\cdot\,|\,\omega) - \gamma_{\{i\}}(\cdot\,|\,\tilde\omega)\big\|,$$
where $\|\cdot\|$ denotes the uniform distance of probability measures on $E$, which is one half of the total variation distance. Note that $\gamma_{\{i\}}(\cdot\,|\,\omega) \in \mathcal{P}(E,\mathcal{E})$ for any $\omega \in \Omega$. The matrix $(C_{i,j})_{i,j \in \mathbb{Z}^d}$ is called Dobrushin's interdependence matrix.
A first guess at describing the dependence structure would be to consider the sum
$$\sum_{j \in \mathbb{Z}^d} C_{i,j}(\gamma);$$
however, this tells us nothing about the behaviour of the configuration at infinity.
Definition 6.17 An interaction potential $\Phi$ is said to satisfy Dobrushin's uniqueness condition if
$$c(\gamma) = \sup_{i \in \mathbb{Z}^d} \sum_{j \in \mathbb{Z}^d} C_{i,j}(\gamma) < 1. \qquad (6.37)$$

To provide a sufficient condition for Dobrushin's condition to hold, define the oscillation of any function $f : \Omega \to \mathbb{R}$ as
$$\delta(f) = \sup_{\omega,\tilde\omega \in \Omega} |f(\omega) - f(\tilde\omega)|.$$

Theorem 6.18 Let $\Phi$ be an interaction potential and $d \ge 1$.

(i) If Dobrushin's uniqueness condition (6.37) is satisfied, then $|\mathcal{G}(\gamma)| \le 1$.

(ii) If
$$\sup_{i \in \mathbb{Z}^d} \sum_{A \ni i} (|A| - 1)\,\delta(\Phi_A) < 2,$$
then Dobrushin's uniqueness condition (6.37) is satisfied.


Proof. See [Geo88] and references therein.
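The sufficient condition of Theorem 6.18(ii) is easy to evaluate for a concrete finite-range potential. The following sketch (an illustration, not part of the lecture notes; the potential, the inverse temperature and the dimension are assumed parameters) checks it for a nearest-neighbour Ising pair potential $\Phi_{\{i,j\}}(\sigma) = -\beta J\sigma_i\sigma_j$ on $\mathbb{Z}^d$, where each site lies in $2d$ pairs, $|A|-1 = 1$ and the oscillation of each pair term is $2\beta|J|$:

```python
# Check the sufficient condition of Theorem 6.18(ii) for a
# nearest-neighbour Ising pair potential Phi_{i,j} = -beta*J*s_i*s_j
# on Z^d.  (Illustrative sketch; beta, J, d are assumed parameters.)

def dobrushin_sum(beta, J, d):
    """sup_i sum_{A contains i} (|A|-1) * osc(Phi_A) for the NN Ising potential.

    Each site i lies in 2*d nearest-neighbour pairs A = {i, j};
    |A| - 1 = 1 and the oscillation of -beta*J*s_i*s_j over
    s in {-1, +1}^A is 2*beta*|J|.
    """
    pairs_per_site = 2 * d
    oscillation = 2 * beta * abs(J)
    return pairs_per_site * 1 * oscillation

def uniqueness_guaranteed(beta, J, d):
    # Theorem 6.18(ii): the sum must be < 2, i.e. beta*|J| < 1/(2d).
    return dobrushin_sum(beta, J, d) < 2

print(uniqueness_guaranteed(beta=0.2, J=1.0, d=2))   # True: high temperature
print(uniqueness_guaranteed(beta=1.0, J=1.0, d=2))   # False: condition fails
```

So for the two-dimensional nearest-neighbour Ising model the criterion guarantees uniqueness only in the high-temperature regime $\beta|J| < \frac{1}{4}$; it says nothing about the low-temperature regime, where uniqueness indeed fails.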

Example 6.19 (Lattice gas) Let $E = \{0,1\}$ and let the reference measure $\lambda$ be the counting measure. Let $K : \mathcal{S} \to \mathbb{R}$ be any function on the set of all finite subsets of $\mathbb{Z}^d$ and define the interaction potential $\Phi$ by
$$\Phi_A(\omega) = \begin{cases} K(A) & \text{if } \prod_{i \in A} \omega_i = 1,\\ 0 & \text{otherwise,}\end{cases}$$
for any $A \in \mathcal{S}$. Note that $\delta(\Phi_A) = |K(A)|$. Thus uniqueness holds whenever
$$\sup_{i \in \mathbb{Z}^d} \sum_{A \ni i} (|A| - 1)\,|K(A)| < 4.$$

Example 6.20 (One-dimensional systems) Let $\Phi$ be a shift-invariant interaction potential and $d = 1$. Then there is at most one Gibbs measure whenever
$$\sum_{A \in \mathcal{S},\ \min A = 0} \operatorname{diam}(A)\,\delta(\Phi_A) < \infty.$$

6.7 Ergodicity

We look at the convex set $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ of all shift-invariant random fields on $\mathbb{Z}^d$; it is always non-empty. We also consider the $\sigma$-algebra
$$\mathcal{I} = \{A \in \mathcal{F} \colon \theta_i A = A \text{ for all } i \in \mathbb{Z}^d\} \qquad (6.38)$$
of all shift-invariant events. An $\mathcal{F}$-measurable function $f : \Omega \to \mathbb{R}$ is $\mathcal{I}$-measurable if and only if $f$ is invariant, in that $f \circ \theta_i = f$ for all $i \in \mathbb{Z}^d$. A standard result in ergodic theory is the following theorem.
Theorem 6.21 (i) A probability measure $\mu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ is extreme in $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ if and only if $\mu$ is trivial on the invariant $\sigma$-algebra $\mathcal{I}$.

(ii) Each $\mu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ is uniquely determined (within $\mathcal{P}_\Theta(\Omega,\mathcal{F})$) by its restriction to $\mathcal{I}$.

(iii) Distinct probability measures $\mu, \nu \in \operatorname{ex} \mathcal{P}_\Theta(\Omega,\mathcal{F})$ are mutually singular on $\mathcal{I}$, in that there exists an $A \in \mathcal{I}$ such that $\mu(A) = 1$ and $\nu(A) = 0$.
Proof. Standard textbooks of ergodic theory, or [Geo88].

Definition 6.22 (Ergodic measure) A probability measure $\mu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ is said to be ergodic (with respect to the shift-transformation group $\Theta$) if $\mu$ is trivial on the $\sigma$-algebra $\mathcal{I}$ of all shift-invariant events. In mathematical physics such a $\mu$ is often called a pure state.
Proposition 6.23 (Characterisation of ergodic measures) Let $\mu$ be a probability measure in $\mathcal{P}_\Theta(\Omega,\mathcal{F})$ and $(\Lambda_N)_{N\in\mathbb{N}}$ any sequence of cubes with $\Lambda_N \uparrow \mathbb{Z}^d$ as $N \to \infty$. Then the following statements are equivalent.

(i) $\mu$ is ergodic.

(ii) For all events $A \in \mathcal{F}$,
$$\lim_{N\to\infty} \sup_{B \in \mathcal{F}} \Big| \frac{1}{|\Lambda_N|} \sum_{i \in \Lambda_N} \mu(A \cap \theta_i B) - \mu(A)\mu(B) \Big| = 0.$$

(iii) For arbitrary cylinder events $A$ and $B$,
$$\lim_{N\to\infty} \frac{1}{|\Lambda_N|} \sum_{i \in \Lambda_N} \mu(A \cap \theta_i B) = \mu(A)\mu(B).$$

One can show that each extreme measure is a limit of finite-volume Gibbs distributions with suitable boundary conditions. Now, what about ergodic Gibbs measures? The ergodic Theorem 6.24 below gives an answer: if $\mu \in \operatorname{ex} \mathcal{P}_\Theta(\Omega,\mathcal{F})$ and $(\Lambda_N)_{N\in\mathbb{N}}$ is a sequence of cubes with $\Lambda_N \uparrow \mathbb{Z}^d$ as $N \to \infty$, one gets
$$\mu(f) = \lim_{N\to\infty} \frac{1}{|\Lambda_N|} \sum_{i \in \Lambda_N} f(\theta_i \omega)$$
for $\mu$-almost all $\omega \in \Omega$ and every bounded measurable function $f : \Omega \to \mathbb{R}$. Thus
$$\mu = \lim_{N\to\infty} \frac{1}{|\Lambda_N|} \sum_{i \in \Lambda_N} \delta_{\theta_i \omega} \qquad \text{for } \mu\text{-a.a. } \omega \in \Omega$$
in any topology which is generated by countably many evaluation mappings $\nu \mapsto \nu(f)$. For $E$ finite, the weak topology (of probability measures) has this property.

For any given measurable function $f : \Omega \to \mathbb{R}$ define
$$R_N f = \frac{1}{|\Lambda_N|} \sum_{i \in \Lambda_N} f \circ \theta_i, \qquad N \in \mathbb{N}. \qquad (6.39)$$
The multidimensional ergodic theorem describes the limiting behaviour of $R_N f$ as $N \to \infty$. Let $(\Lambda_N)_{N\in\mathbb{N}}$ be a cofinal sequence of boxes with $\Lambda_N \uparrow \mathbb{Z}^d$ as $N \to \infty$.
Theorem 6.24 (Multidimensional Ergodic Theorem) Let a probability measure $\mu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ be given. For any measurable $f : \Omega \to \mathbb{R}$ with $\mu(|f|) < \infty$,
$$\lim_{N\to\infty} R_N f = \mu(f\,|\,\mathcal{I}) \qquad \mu\text{-a.s.}$$
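For an i.i.d. random field, which is ergodic, the invariant $\sigma$-algebra is trivial and $\mu(f|\mathcal{I}) = \mu(f)$, so Theorem 6.24 reduces to a law of large numbers for spatial averages. A quick numerical illustration in $d = 1$ (the field, the function $f$ and all parameters are choices made for this demonstration):

```python
import numpy as np

rng = np.random.default_rng(0)

# i.i.d. Bernoulli(p) field on a segment of Z -- an ergodic measure.
p, n = 0.7, 200_000
omega = (rng.random(n) < p).astype(int)      # omega_i in {0, 1}

# f depends on the pair (omega_0, omega_1); shifting by theta_i and
# averaging over i gives the spatial average R_N f.
f_shifted = omega[:-1] * omega[1:]           # f(theta_i omega) = omega_i * omega_{i+1}
R_N = f_shifted.mean()

# For an i.i.d. field, mu(f | I) = mu(f) = p**2.
print(abs(R_N - p**2))   # small
```

The spatial average converges to the expectation $p^2$ at the usual $O(N^{-1/2})$ rate; for a non-ergodic $\mu$ the limit would instead be the conditional expectation $\mu(f|\mathcal{I})$, a genuinely random quantity.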

7 A variational characterisation of Gibbs measures

In this section we give a variational characterisation of translation-invariant Gibbs measures. This characterisation will prove useful in the study of Gibbs measures, and it has a close connection to the physical intuition that an equilibrium state minimises the free energy. This will be proved rigorously in this section. Let us start with some heuristics and assume for this purpose only that the set $\Omega$ of configurations is finite. Denote by $\mu(\omega) = Z^{-1}\exp(-H(\omega))$ a Gibbs measure with suitable normalisation $Z$ and Hamiltonian $H$. The mean energy for any probability measure $\nu \in \mathcal{P}(\Omega)$ is
$$\mathbb{E}_\nu(H) = \nu(H) = \sum_{\omega\in\Omega} \nu(\omega) H(\omega),$$
and its entropy is given by
$$\mathcal{H}(\nu) = -\sum_{\omega\in\Omega} \nu(\omega)\log\nu(\omega).$$
Now $F(\nu) = \nu(H) - \mathcal{H}(\nu)$ is called the free energy of $\nu$, and for any $\nu \in \mathcal{P}(\Omega)$ we have
$$F(\nu) \ge -\log Z \qquad\text{and}\qquad F(\nu) = -\log Z \text{ if and only if } \nu = \mu.$$

To see this, apply Jensen's inequality for the convex function $\psi(x) = x\log x$ and conclude by a simple calculation
$$\nu(H) - \mathcal{H}(\nu) + \log Z = \sum_{\omega} \nu(\omega)\log\frac{\nu(\omega)}{\mu(\omega)} = \sum_{\omega} \mu(\omega)\,\psi\Big(\frac{\nu(\omega)}{\mu(\omega)}\Big) \ge \psi\Big(\sum_{\omega}\mu(\omega)\frac{\nu(\omega)}{\mu(\omega)}\Big) = \psi(1) = 0,$$
and as $\psi$ is strictly convex there is equality if and only if $\frac{\nu(\omega)}{\mu(\omega)}$ is a constant. As $\Omega$ is finite one gets that $\nu = \mu$. If $\Omega$ is not finite one has to employ quite some mathematical theory, which we present briefly in the rest of this section.
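The finite-$\Omega$ heuristic can be checked directly on a computer. The sketch below (with a toy Hamiltonian chosen purely for illustration) compares $F(\nu) = \nu(H) - \mathcal{H}(\nu)$ with $-\log Z$ for the Gibbs measure itself and for another measure:

```python
import math

# Toy finite configuration space and Hamiltonian (illustrative choice).
states = [0, 1, 2, 3]
H = {0: 0.0, 1: 0.5, 2: 1.5, 3: 3.0}

Z = sum(math.exp(-H[w]) for w in states)
mu = {w: math.exp(-H[w]) / Z for w in states}    # Gibbs measure

def free_energy(nu):
    """F(nu) = nu(H) - H(nu), with the entropy H(nu) = -sum nu log nu."""
    mean_energy = sum(nu[w] * H[w] for w in states)
    entropy = -sum(nu[w] * math.log(nu[w]) for w in states if nu[w] > 0)
    return mean_energy - entropy

print(abs(free_energy(mu) + math.log(Z)))   # = 0: the Gibbs measure attains -log Z

# Any other measure has strictly larger free energy.
nu = {0: 0.25, 1: 0.25, 2: 0.25, 3: 0.25}
print(free_energy(nu) > -math.log(Z))       # True
```

This is exactly the Jensen computation above: the gap $F(\nu) + \log Z$ is the relative entropy of $\nu$ with respect to $\mu$, which vanishes only at $\nu = \mu$.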
Definition 7.1 (Relative entropy) Let $\mathcal{A} \subset \mathcal{F}$ be a sub-$\sigma$-algebra of $\mathcal{F}$ and $\mu,\nu \in \mathcal{P}(\Omega,\mathcal{F})$ be two probability measures. Then
$$\mathcal{H}_{\mathcal{A}}(\nu\,|\,\mu) = \begin{cases} \mu(f_{\mathcal{A}}\log f_{\mathcal{A}}) = \int f_{\mathcal{A}}(\omega)\log f_{\mathcal{A}}(\omega)\,\mu(d\omega) & \text{if } \nu \ll \mu \text{ on } \mathcal{A},\\ \infty & \text{otherwise,}\end{cases}$$
where $f_{\mathcal{A}}$ is the Radon-Nikodym density of $\nu|_{\mathcal{A}}$ relative to $\mu|_{\mathcal{A}}$ ($\nu|_{\mathcal{A}}$ and $\mu|_{\mathcal{A}}$ are the restrictions of the measures to the sub-$\sigma$-algebra $\mathcal{A}$), is called the relative entropy or Kullback-Leibler information or information divergence of $\nu$ relative to $\mu$ on $\mathcal{A}$.

If $\mathcal{A} = \mathcal{F}_\Lambda$ for some $\Lambda \in \mathcal{S}$ one writes
$$\mathcal{H}_\Lambda(\nu\,|\,\mu) = \begin{cases} \mu_\Lambda(f_\Lambda\log f_\Lambda) = \int f_\Lambda(\omega)\log f_\Lambda(\omega)\,\mu_\Lambda(d\omega) & \text{if } \nu_\Lambda \ll \mu_\Lambda,\\ \infty & \text{otherwise,}\end{cases}$$
where $f_\Lambda = \frac{d\nu_\Lambda}{d\mu_\Lambda}$ is the Radon-Nikodym density and $\nu_\Lambda$ and $\mu_\Lambda$ are the marginals of $\nu$ and $\mu$ on $\Omega_\Lambda$ for the projection map $\pi_\Lambda : \Omega \to \Omega_\Lambda$.
We collect the most important properties of the relative entropy in the following proposition.

Proposition 7.2 Let $\mathcal{A} \subset \mathcal{F}$ be a sub-$\sigma$-algebra of $\mathcal{F}$ and $\mu,\nu \in \mathcal{P}(\Omega,\mathcal{F})$ any two probability measures. Then

(a) $\mathcal{H}_{\mathcal{A}}(\nu|\mu) \ge 0$,

(b) $\mathcal{H}_{\mathcal{A}}(\nu|\mu) = 0$ if and only if $\nu = \mu$ on $\mathcal{A}$,

(c) $\mathcal{H}_{\mathcal{A}}(\nu|\mu)$ is an increasing function of $\mathcal{A}$,

(d) $\mathcal{H}(\nu|\mu)$ is convex.
We now connect the relative entropy to our previous definition of the entropy functional in Section 3. For this, let any finite signed a priori measure $\lambda$ on $(E,\mathcal{E})$ be given. Note that the a priori or reference measure need not be normalised to one (a probability measure), and the following notion depends on the choice of this reference measure. Recall that $\lambda^\Lambda$ denotes the product measure on $\Omega_\Lambda = (E^\Lambda,\mathcal{E}^\Lambda)$.

Notation 7.3 Let $\nu \in \mathcal{P}(\Omega,\mathcal{F})$. The function
$$S_\Lambda(\nu) = -\mathcal{H}_\Lambda(\nu\,|\,\lambda^\Lambda)$$
is called the entropy of $\nu$ in $\Lambda$ relative to the reference/a priori measure $\lambda$. If the reference measure is the counting measure we get back Shannon's formula
$$H_\Lambda(\nu) = -\sum_{\omega\in\Omega_\Lambda} \nu(\sigma_\Lambda = \omega)\log\nu(\sigma_\Lambda = \omega) \ge 0$$
for the entropy. We wish to show that the thermodynamic limit of the entropy exists, i.e. that
$$h(\nu) = \lim_{n\to\infty} \frac{1}{|\Lambda_n|} H_{\Lambda_n}(\nu)$$
exists for any cofinal sequence $(\Lambda_n)_{n\in\mathbb{N}}$ of finite-volume boxes in $\mathcal{S}$. The essential device for the proof of the existence of this limit is the following sub-additivity property.
Proposition 7.4 (Strong sub-additivity) Let $\Lambda,\Delta \in \mathcal{S}$ and $\nu \in \mathcal{P}(\Omega,\mathcal{F})$ be given. Then
$$H_{\Lambda\cup\Delta}(\nu) + H_{\Lambda\cap\Delta}(\nu) \le H_\Lambda(\nu) + H_\Delta(\nu). \qquad (7.40)$$

Proof. A proof is given in [Rue69], [Isr79] and in [Geo88].


Equipped with this inequality we go further and assume $\lambda(E) = 1$ (so that $H_\varnothing(\nu) = 0$), and observe that for a translation-invariant probability measure $\nu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ we get
$$H_{\Lambda+i}(\nu) = H_\Lambda(\nu) \qquad \text{for any } \Lambda \in \mathcal{S} \text{ and any } i \in \mathbb{Z}^d.$$

Denote by $\mathcal{S}_{\mathrm{r.B.}}$ the set of all rectangular boxes in $\mathbb{Z}^d$.

Lemma 7.5 Suppose that the function $a : \mathcal{S}_{\mathrm{r.B.}} \to [-\infty,\infty)$ satisfies

(i) $a(\Lambda + i) = a(\Lambda)$ for all $\Lambda \in \mathcal{S}_{\mathrm{r.B.}}$, $i \in \mathbb{Z}^d$,

(ii) $a(\Lambda) + a(\Delta) \ge a(\Lambda\cup\Delta)$ for $\Lambda,\Delta \in \mathcal{S}_{\mathrm{r.B.}}$ with $\Lambda\cap\Delta = \varnothing$,

and let $(\Lambda_n)_{n\in\mathbb{N}}$ be a cofinal sequence of cubes with $\Lambda_n \uparrow \mathbb{Z}^d$ as $n \to \infty$. Then the limit
$$\lim_{n\to\infty} \frac{1}{|\Lambda_n|} a(\Lambda_n) = \inf_{\Lambda\in\mathcal{S}_{\mathrm{r.B.}}} \frac{1}{|\Lambda|} a(\Lambda) \qquad (7.41)$$
exists in $[-\infty,\infty)$.

Proof. Choose
$$c > \lambda := \inf_{\Lambda\in\mathcal{S}_{\mathrm{r.B.}}} \frac{1}{|\Lambda|} a(\Lambda)$$
and let $\Delta \in \mathcal{S}_{\mathrm{r.B.}}$ be such that $\frac{1}{|\Delta|} a(\Delta) < c$. Denote by $N_n$ the number of disjoint translates of $\Delta$ contained in $\Lambda_n$, chosen as large as possible. Then $\Lambda_n$ is split into $N_n$ translates of $\Delta$ and a remainder in the boundary layer, and $\lim_{n\to\infty} \frac{N_n|\Delta|}{|\Lambda_n|} = 1$. The sub-additivity gives
$$a(\Lambda_n) \le N_n a(\Delta) + \big(|\Lambda_n| - N_n|\Delta|\big)\,a(\{0\}).$$
Hence,
$$\limsup_{n\to\infty} \frac{1}{|\Lambda_n|} a(\Lambda_n) = \limsup_{n\to\infty} \frac{1}{N_n|\Delta|} a(\Lambda_n) \le \frac{1}{|\Delta|} a(\Delta) < c.$$
Letting $c$ tend to $\lambda$ gives the proof of the lemma.
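In $d = 1$ the lemma reduces to the classical subadditivity (Fekete) argument: for a subadditive sequence $a(n)$, the ratio $a(n)/n$ converges to $\inf_n a(n)/n$. A toy numerical check (the particular sequence is an assumption chosen for illustration):

```python
import math

# a(n) = n + sqrt(n) is subadditive, since sqrt(n + m) <= sqrt(n) + sqrt(m).
a = lambda n: n + math.sqrt(n)

ratios = [a(n) / n for n in range(1, 100_001)]
print(min(ratios))          # the infimum over this range ...
print(ratios[-1])           # ... is attained at the largest n computed,
print(abs(ratios[-1] - 1))  # and the limit is inf_n a(n)/n = 1
```

Here $a(n)/n = 1 + n^{-1/2}$ decreases monotonically to its infimum $1$, exactly the behaviour the lemma asserts for $\frac{1}{|\Lambda_n|}a(\Lambda_n)$.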

Now, both Proposition 7.4 and Lemma 7.5 provide the main steps of the proof of the following theorem.

Theorem 7.6 (Specific entropy) Fix a finite signed reference measure $\lambda$ on the measurable state space $(E,\mathcal{E})$. Let $\nu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ be a translation-invariant probability measure and $(\Lambda_n)_{n\in\mathbb{N}}$ a cofinal sequence of boxes with $\Lambda_n \uparrow \mathbb{Z}^d$ as $n \to \infty$. Then,

(a)
$$h(\nu) = \lim_{n\to\infty} \frac{1}{|\Lambda_n|} H_{\Lambda_n}(\nu)$$
exists in $[-\infty, \lambda(E)]$.

(b) $h : \mathcal{P}_\Theta(\Omega,\mathcal{F}) \to \mathbb{R}$, $\nu \mapsto h(\nu)$, is affine and upper semi-continuous. The level sets $\{h \ge c\}$, $c \in \mathbb{R}$, are compact with respect to the weak topology of probability measures.

Notation 7.7 $h(\nu)$ is called the specific entropy per site of $\nu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ relative to the reference measure $\lambda$.

Proof of Theorem 7.6. The existence of the specific entropy was proved first by Shannon ([Sha48]) for the case $d = 1$, $|E| < \infty$ and $\lambda$ the counting measure. Extensions are due to McMillan ([McM53]) and Breiman ([Bre57]). The multidimensional version of Shannon's result is due to Robinson and Ruelle ([RR67]). The first two assertions of (b) can already be found in [RR67]; an explicit proof is given in [Isr79].

Now the following question arises: what happens if, instead of the reference measure, we take any Gibbs measure and evaluate the relative entropy? We analyse this question in the following. To define the specific energy of a translation-invariant probability measure it proves useful to introduce the following function. Let $\Phi = (\Phi_A)_{A\in\mathcal{S}}$ be a translation-invariant interaction potential. Define the function $f_\Phi : \Omega \to \mathbb{R}$ as
$$f_\Phi = \sum_{A\ni 0} |A|^{-1}\Phi_A. \qquad (7.42)$$

In the following theorem we prove the existence of the specific energy. To derive an expression which is independent of any chosen boundary condition, we formulate the theorem for an arbitrary sequence of boundary conditions; this applies also to the case of periodic and free boundary conditions.

Theorem 7.8 (Specific energy) Let $\nu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$, $\Phi$ be a translation-invariant interaction potential, $(\Lambda_n)_{n\in\mathbb{N}}$ be a cofinal sequence of boxes with $\Lambda_n \uparrow \mathbb{Z}^d$ as $n \to \infty$ and $(\omega_n)_{n\in\mathbb{N}}$ be a sequence of configurations $\omega_n \in \Omega$. Then the specific energy
$$\mathbb{E}_\nu(f_\Phi) = \nu(f_\Phi) = \lim_{n\to\infty} \frac{1}{|\Lambda_n|}\,\nu\big(H_{\Lambda_n}^{\omega_n}\big) \qquad (7.43)$$
exists.
Notation 7.9 (Specific free energy) $\mathbb{E}_\nu(f_\Phi)$ or $\nu(f_\Phi)$ is called the specific (internal) energy per site of $\nu$ relative to $\Phi$. The quantity $f(\nu) = \beta\nu(f_\Phi) - h(\nu)$ is called the specific free energy of $\nu$ for $\Phi$.

Proof of Theorem 7.8. For the proof see any of the books [Geo88], [Rue69] or [Isr79]. The proof goes back to Dobrushin [Dob68b] and Ruelle [Rue69].

We continue our investigations with the previously raised question of the relative entropy with respect to a given Gibbs measure.

Theorem 7.10 (Pressure) Let $\nu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ and $\mu \in \mathcal{G}_\Theta(\beta,\Phi)$, $\beta > 0$, $\Phi$ be a translation-invariant interaction potential, $(\Lambda_n)_{n\in\mathbb{N}}$ be a cofinal sequence of boxes with $\Lambda_n \uparrow \mathbb{Z}^d$ as $n \to \infty$ and $(\omega_n)_{n\in\mathbb{N}}$ be a sequence of configurations $\omega_n \in \Omega$. Then,

(a) $P(\Phi) = \lim_{n\to\infty} \frac{1}{|\Lambda_n|}\log Z_{\Lambda_n}(\omega_n)$ exists.

(b) The limit $\lim_{n\to\infty} \frac{1}{|\Lambda_n|}\mathcal{H}_{\Lambda_n}(\nu\,|\,\mu)$ exists and equals
$$h(\nu\,|\,\mu) = P(\Phi) + \beta\nu(f_\Phi) - h(\nu) = P(\Phi) + f(\nu). \qquad (7.44)$$

Notation 7.11 $P = P(\Phi)$ is called the pressure or specific Gibbs free energy.
Proof of Theorem 7.10. We just give the main idea of the proof; details can be found in [Isr79], [Geo88] and go back to [GM67]. Let $\Lambda \in \mathcal{S}$ and $\omega \in \Omega$ be fixed. Recall that the marginal $\nu_\Lambda$ of $\nu$ on $\Omega_\Lambda$ is a probability measure on $(\Omega_\Lambda,\mathcal{F}_\Lambda)$, as is the conditional Gibbs distribution $\gamma_\Lambda(\cdot\,|\,\omega)$ for any given configuration $\omega$. Then compute
$$\mathcal{H}_\Lambda\big(\nu\,\big|\,\gamma_\Lambda(\cdot\,|\,\omega)\big) = \int \nu(d\tilde\omega)\,\log\frac{d\nu_\Lambda}{d\gamma_\Lambda}(\tilde\omega) = -H_\Lambda(\nu) + \beta\int \nu(d\tilde\omega)\,H_\Lambda\big(\tilde\omega_\Lambda\,\omega_{\mathbb{Z}^d\setminus\Lambda}\big) + \log Z_\Lambda(\omega),$$
and divide by the volume along the given sequence of boxes.
We can draw an easy corollary, which is the first part of the variational principle for Gibbs measures.

Corollary 7.12 (First part of the variational principle) For a translation-invariant interaction potential $\Phi$ and $\nu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ we have $h(\nu|\mu) \ge 0$. If moreover $\nu \in \mathcal{G}_\Theta(\beta,\Phi)$ then $h(\nu|\mu) = 0$.

Proof. The assertions are due to Dobrushin ([Dob68a]) and Lanford and Ruelle ([LR69]).

The next theorem gives the reversed direction and a summary of the whole variational principle.

Theorem 7.13 (Variational principle) Let $\Phi$ be a translation-invariant interaction potential, $(\Lambda_n)_{n\in\mathbb{N}}$ a cofinal sequence of boxes with $\Lambda_n \uparrow \mathbb{Z}^d$ as $n \to \infty$ and $\nu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$. Then,

(a) Let $\nu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ be such that $\liminf_{n\to\infty} \frac{1}{|\Lambda_n|}\mathcal{H}_{\Lambda_n}(\nu\,|\,\mu) = 0$. Then $\nu \in \mathcal{G}_\Theta(\beta,\Phi)$.

(b) $h(\nu|\mu) \ge 0$, and $h(\nu|\mu) = 0$ if and only if $\nu \in \mathcal{G}_\Theta(\beta,\Phi)$.

(c) $h(\,\cdot\,|\mu) : \mathcal{P}_\Theta(\Omega,\mathcal{F}) \to [0,\infty]$ is an affine lower semi-continuous functional which attains its minimum 0 on the set $\mathcal{G}_\Theta(\beta,\Phi)$. Equivalently, $\mathcal{G}_\Theta(\Phi)$ is the set on which the specific free energy functional
$$f : \mathcal{P}_\Theta(\Omega,\mathcal{F}) \to (-\infty,\infty]$$
attains its minimum $-P(\Phi)$.

Proof. This variational principle is due to Lanford and Ruelle ([LR69]). A transparent proof which reveals the significance of the relative entropy is due to Föllmer ([Fol73]).


8 Large deviations theory

In this section we give a short overview of large deviations theory. We motivate it by the simple coin-tossing model and finish with some recent large deviations results for Gibbs measures, which we can only discuss briefly.

8.1 Motivation

Consider the coin-tossing experiment. The microstates are elements of the configuration space $\Omega = \{0,1\}^{\mathbb{N}}$ equipped with the product measure $P = \mu^{\otimes\mathbb{N}}$, where $\mu \in \mathcal{P}(\{0,1\})$ is given as $\mu = \mu_0\delta_0 + \mu_1\delta_1$ with $\mu_0 + \mu_1 = 1$. If $\mu_0 = \mu_1 = \frac12$ we have a fair coin. Recall the projections $\sigma_j : \Omega \to \{0,1\}$, $j \in \mathbb{N}$, and consider the mean
$$S_N(\omega) = \frac{1}{N}\sum_{j=1}^N \sigma_j(\omega) \qquad \text{for } \omega \in \Omega.$$
If $m$ denotes the mean ($m = \frac12$ for a fair coin), the weak law of large numbers (WLLN) tells us that for $\varepsilon > 0$
$$P\big(S_N \in (m-\varepsilon, m+\varepsilon)\big) \to 1 \qquad \text{as } N \to \infty,$$
and for $\varepsilon > 0$ small enough and $z \ne m$
$$P\big(S_N \in (z-\varepsilon, z+\varepsilon)\big) \to 0 \qquad \text{as } N \to \infty.$$
In particular one can even prove exponential decay of the latter probability, which we sketch briefly. The problem of the decay of probabilities of rare events

is the main task of large deviations theory. For simplicity we assume now that $m = \frac12$. Then
$$F(z,\varepsilon) = \lim_{N\to\infty} \frac{1}{N}\log P\big(S_N \in (z-\varepsilon, z+\varepsilon)\big) = -\inf_{x\in(z-\varepsilon,z+\varepsilon)} I(x),$$
where the function $I$ is defined as
$$I(x) = \begin{cases} x\log 2x + (1-x)\log 2(1-x) & x \in [0,1],\\ \infty & x \notin [0,1],\end{cases}$$
where as usual $0\log 0 = 0$. Since $F(z,\varepsilon) \to -I(z)$ as $\varepsilon \downarrow 0$, we may (heuristically) write
$$P\big(S_N \in (z-\varepsilon, z+\varepsilon)\big) \approx \exp(-N I(z))$$
for $N$ large and $\varepsilon$ small. The term $I(z)$ measures the randomness of $z$, and $z = \frac12 = m$ is the macrostate which is compatible with the most microstates ($I(\frac12) = 0$). The mean $S_N$ gives only very limited information. If we want to know more about the whole random process we might go over to the empirical measure
$$L_N(\omega) = \frac{1}{N}\sum_{j=1}^N \delta_{\sigma_j(\omega)} \in \mathcal{P}(\{0,1\}) \qquad \text{for any } \omega \in \Omega;$$

or even to the empirical field
$$R_N(\omega) = \frac{1}{N}\sum_{k=0}^{N-1} \delta_{T^k(\omega^{(N)})} \in \mathcal{P}(\Omega),$$
where $T^0 = \mathrm{id}$, $(T\omega)_j = \omega_{j+1}$ is the shift and $\omega^{(N)}$ is the periodic continuation of the restriction of $\omega$ to $\Lambda_N$.
The latter example can be connected to our experience with Gibbs measures and distributions as follows. Let $\Lambda_N = [-N,N]^d \cap \mathbb{Z}^d$, $N \in \mathbb{N}$, and define the periodic empirical field as
$$R_N^{(\mathrm{per})}(\omega) = \frac{1}{|\Lambda_N|}\sum_{k\in\Lambda_N} \delta_{\theta_k(\omega^{(N)})} \in \mathcal{P}_\Theta(\Omega,\mathcal{F}) \qquad \text{for all } \omega \in \Omega,$$
where $\omega^{(N)}$ is the periodic continuation of the restriction of $\omega$ onto $\Lambda_N$ to the whole lattice $\mathbb{Z}^d$. Here, the periodic continuation ensures that the

periodic empirical field is translation invariant. The LLN is not available in general; it is then replaced by some ergodic theorem. For example, if $\nu \in \mathcal{P}_\Theta(\Omega,\mathcal{F})$ is an ergodic measure, then $R_N^{(\mathrm{per})} \to \nu$ $\nu$-a.s. as $N \to \infty$. Going back to the coin-tossing example, the distributions of $S_N$, $L_N$ and $R_N$ under the product measure $P$ are the following probability measures:
$$P \circ S_N^{-1} \in \mathcal{P}([0,1]), \qquad P \circ L_N^{-1} \in \mathcal{P}\big(\mathcal{P}(\{0,1\})\big), \qquad P \circ R_N^{-1} \in \mathcal{P}\big(\mathcal{P}(\Omega)\big).$$
For all of these probabilities one has exponential decay of the probabilities of rare events, with a function $I$ giving the rate in $N$. This will be generalised in the next subsection, where such functions $I$ are called rate functions.
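The exponential decay can be seen directly from the binomial distribution: for a fair coin, $-\frac{1}{N}\log P(S_N \ge z)$ approaches $I(z)$ as $N$ grows. A small numerical check (the values of $N$ and $z$ are choices made for this illustration):

```python
import math

def I(x):
    """Rate function of the fair coin, with the convention 0*log 0 = 0."""
    if not 0.0 <= x <= 1.0:
        return math.inf
    terms = 0.0
    if x > 0:
        terms += x * math.log(2 * x)
    if x < 1:
        terms += (1 - x) * math.log(2 * (1 - x))
    return terms

def empirical_rate(N, z):
    # P(S_N >= z) = sum_{k >= N z} C(N, k) / 2^N, evaluated in log form
    # with exact integer arithmetic to avoid floating-point underflow.
    total = sum(math.comb(N, k) for k in range(math.ceil(N * z), N + 1))
    return -(math.log(total) - N * math.log(2)) / N

for N in (100, 1000, 10000):
    print(N, empirical_rate(N, 0.7), I(0.7))   # the rates approach I(0.7)
```

The finite-$N$ rate differs from $I(z)$ by a polynomial correction of order $\frac{\log N}{N}$, which is exactly the "heuristic" error hidden in $P \approx e^{-NI(z)}$.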

8.2 Definition

In the following we consider the general setup, i.e. we let $X$ denote a Polish space and equip it with the corresponding Borel $\sigma$-algebra $\mathcal{B}_X$.

Definition 8.1 (Rate function) A function $I : X \to [0,\infty]$ is called a rate function if

(1) $I \not\equiv \infty$,

(2) $I$ is lower semi-continuous,

(3) $I$ has compact level sets.
Definition 8.2 (Large deviations principle) A sequence $(P_N)_{N\in\mathbb{N}}$ of probability measures $P_N \in \mathcal{P}(X,\mathcal{B}_X)$ on $X$ is said to satisfy the large deviations principle with rate (speed) $N$ and rate function $I$ if the following upper and lower bounds hold:
$$\limsup_{N\to\infty} \frac{1}{N}\log P_N(C) \le -\inf_{x\in C} I(x) \qquad \text{for } C \subset X \text{ closed}, \qquad (8.45)$$
$$\liminf_{N\to\infty} \frac{1}{N}\log P_N(O) \ge -\inf_{x\in O} I(x) \qquad \text{for } O \subset X \text{ open}.$$

Let us consider the following situation. Let $(X_i)_{i\in\mathbb{N}}$ be i.i.d. real-valued random variables, i.e. there is a probability space $(\Omega,\mathcal{F},P)$ such that each random variable has the distribution $\mu = P \circ X_1^{-1} \in \mathcal{P}(\mathbb{R},\mathcal{B}_\mathbb{R})$. Denote the distribution of the mean $S_N$ by $\mu_N = P \circ S_N^{-1} \in \mathcal{P}(\mathbb{R},\mathcal{B}_\mathbb{R})$. For this situation there is the following theorem about a large deviations principle for the sequence $(\mu_N)_{N\in\mathbb{N}}$. Before we formulate that theorem we need some further definitions. For $\mu \in \mathcal{P}(\mathbb{R},\mathcal{B}_\mathbb{R})$ let
$$\Lambda(\lambda) = \log\int_{\mathbb{R}} \exp(\lambda x)\,\mu(dx), \qquad \lambda \in \mathbb{R},$$
be the logarithmic moment generating function. It is known that $\Lambda$ is lower semi-continuous and $\Lambda(\lambda) \in (-\infty,\infty]$ for $\lambda \in \mathbb{R}$. The Legendre-Fenchel transform of $\Lambda$ is given by
$$\Lambda^*(x) = \sup_{\lambda\in\mathbb{R}}\{\lambda x - \Lambda(\lambda)\}, \qquad x \in \mathbb{R}.$$

Theorem 8.3 (Cramér's Theorem) Let $(X_i)_{i\in\mathbb{N}}$ be i.i.d. real-valued random variables with distribution $\mu \in \mathcal{P}(\mathbb{R},\mathcal{B}_\mathbb{R})$ and let $\mu_N$ denote the distribution of the mean $S_N$. Assume further that $\Lambda(\lambda) < \infty$ for all $\lambda \in \mathbb{R}$. Then the sequence $(\mu_N)_{N\in\mathbb{N}}$ satisfies a large deviations principle with rate function given by the Legendre-Fenchel transform $\Lambda^*$ of the logarithmic moment generating function, i.e. for any measurable $\Gamma \in \mathcal{B}_\mathbb{R}$,
$$\limsup_{N\to\infty} \frac{1}{N}\log\mu_N(\Gamma) \le -\inf_{x\in\overline{\Gamma}} \Lambda^*(x), \qquad \liminf_{N\to\infty} \frac{1}{N}\log\mu_N(\Gamma) \ge -\inf_{x\in\Gamma^{\circ}} \Lambda^*(x). \qquad (8.46)$$

Proof. See [DZ98] or [Dor99].
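For the coin of Subsection 8.1, Cramér's theorem reproduces the rate function $I$: with $\mu = \frac12(\delta_0+\delta_1)$ one has $\Lambda(\lambda) = \log\frac{1+e^\lambda}{2}$, and its Legendre-Fenchel transform equals $x\log 2x + (1-x)\log 2(1-x)$. A numerical sketch (the grid over $\lambda$ is a choice made for the demonstration):

```python
import math

def Lambda(lam):
    # logarithmic moment generating function of mu = (delta_0 + delta_1)/2
    return math.log((1.0 + math.exp(lam)) / 2.0)

def legendre(x):
    """Legendre-Fenchel transform sup_lambda {lambda*x - Lambda(lambda)},
    approximated by maximising over a grid lambda in [-20, 20]."""
    grid = [l / 100.0 for l in range(-2000, 2001)]
    return max(lam * x - Lambda(lam) for lam in grid)

def I(x):
    # closed-form rate function of the fair coin, for x in (0, 1)
    return x * math.log(2 * x) + (1 - x) * math.log(2 * (1 - x))

for x in (0.3, 0.5, 0.8):
    print(x, legendre(x), I(x))   # the two columns agree
```

The supremum is attained at $\lambda^* = \log\frac{x}{1-x}$, which lies well inside the grid for these values of $x$; the grid maximisation therefore matches the closed form to high accuracy.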

An important tool for proving large deviations principles is the following alternative version of the well-known Varadhan Lemma ([DZ98]).

Theorem 8.4 (Tilted LDP) Let the sequence $(P_N)_{N\in\mathbb{N}}$ of probability measures $P_N \in \mathcal{P}(X,\mathcal{B}_X)$ satisfy a large deviations principle with rate (speed) $N$ and rate function $I$. Let $F : X \to \mathbb{R}$ be a continuous function that is bounded from above. Define
$$J_N(S) = \int_S e^{NF(x)}\,P_N(dx), \qquad S \in \mathcal{B}_X.$$
Then the sequence $(P_N^F)_{N\in\mathbb{N}}$ of probability measures $P_N^F \in \mathcal{P}(X,\mathcal{B}_X)$ defined by
$$P_N^F(S) = \frac{J_N(S)}{J_N(X)}, \qquad S \in \mathcal{B}_X,$$
satisfies a large deviations principle on $X$ with rate $N$ and rate function
$$I^F(x) = \sup_{y\in X}\{F(y) - I(y)\} - \big(F(x) - I(x)\big).$$

Proof. See [dH00] or [DZ98] for the original version of Varadhan's Lemma and [Ell85] or [Dor99] for a version as in the theorem.


8.3 Some results for Gibbs measures

We present some results on large deviations principles for Gibbs measures. We assume the set-up of Section 6 and Section 7. Let $\Phi$ be an interaction potential and note that the expectation of the interaction potential with respect to the periodic empirical field is given by
$$\langle R_N^{(\mathrm{per})}(\omega), \Phi\rangle = |\Lambda_N|^{-1}\,H_{\Lambda_N}^{(\mathrm{per})}(\omega), \qquad \omega \in \Omega, \qquad (8.47)$$
where $H_{\Lambda_N}^{(\mathrm{per})}$ is the Hamiltonian in $\Lambda_N$ with interaction potential $\Phi$ and periodic boundary conditions. Recall that $\gamma_{\Lambda_N}^{\beta,\omega}$ denotes the Gibbs distribution in $\Lambda_N$ with configurational boundary condition $\omega$, and $\gamma_{\Lambda_N}^{\beta,\mathrm{per}}$ the Gibbs distribution in $\Lambda_N$ with periodic boundary condition. Further, if $\mu \in \mathcal{G}_\Theta(\beta,\Phi)$ is a Gibbs measure, $h(\cdot|\mu) = h(\cdot|\Phi)$ denotes the specific relative entropy with respect to the Gibbs measure, i.e. with respect to the given interaction potential $\Phi$. Denote by $e(\Omega)$ the evaluation $\sigma$-algebra for the probability measures on $\Omega$. Note that the mean energy $\langle\,\cdot\,,\Phi\rangle$ can be identified as a linear form on a vector space of finite-range interaction potentials. In particular we define
$$\Psi_N(\omega) = \langle R_N^{(\mathrm{per})}(\omega), \Psi\rangle$$
for any interaction potential $\Psi$ with finite range. In the limit $N \to \infty$ one gets a linear functional on the vector space $V$ of all interaction potentials with finite range (see [Isr79] and [Geo88] for details on this vector space).
Theorem 8.5 (LDP for Gibbs measures) Let $\Lambda_N = [-N,N]^d \cap \mathbb{Z}^d$, $\beta > 0$ and $\Phi$ be an interaction potential with finite range. Then the following assertions hold.

(a) Let $\mu \in \mathcal{G}_\Theta(\beta,\Phi)$ be given. Then the sequence $\big(\mu\circ(R_N^{(\mathrm{per})})^{-1}\big)_{N\in\mathbb{N}}$ of probability measures $\mu\circ(R_N^{(\mathrm{per})})^{-1} \in \mathcal{P}\big(\mathcal{P}(\Omega,\mathcal{F}), e(\Omega)\big)$ satisfies a large deviations principle with rate (speed) $|\Lambda_N|$ and rate function $h(\cdot|\mu)$.

(b) Let $\gamma_{\Lambda_N}^{\beta,\omega}$ be the Gibbs distribution in $\Lambda_N$ with boundary condition $\omega$. Then for any closed set $F \subset \mathcal{P}_\Theta(\Omega,\mathcal{F})$ and any open set $G \subset \mathcal{P}_\Theta(\Omega,\mathcal{F})$,
$$\limsup_{N\to\infty} \frac{1}{|\Lambda_N|}\log\sup_{\omega\in\Omega}\gamma_{\Lambda_N}^{\beta,\omega}\big(R_N^{(\mathrm{per})} \in F\big) \le -\inf_{\nu\in F}\big\{\beta\langle\nu,\Phi\rangle - h(\nu) + P(\Phi)\big\},$$
$$\liminf_{N\to\infty} \frac{1}{|\Lambda_N|}\log\sup_{\omega\in\Omega}\gamma_{\Lambda_N}^{\beta,\omega}\big(R_N^{(\mathrm{per})} \in G\big) \ge -\inf_{\nu\in G}\big\{\beta\langle\nu,\Phi\rangle - h(\nu) + P(\Phi)\big\}. \qquad (8.48)$$

(c) Let $K$ be a measurable subset of the dual of the vector space $V$. Then
$$\limsup_{N\to\infty} \frac{1}{|\Lambda_N|}\log\sup_{\omega\in\Omega}\gamma_{\Lambda_N}^{\beta,\omega}\big(\Psi_N \in K\big) \le -\inf_{\Psi\in K}\{J_V(\Psi)\},$$
$$\liminf_{N\to\infty} \frac{1}{|\Lambda_N|}\log\sup_{\omega\in\Omega}\gamma_{\Lambda_N}^{\beta,\omega}\big(\Psi_N \in K\big) \ge -\inf_{\Psi\in K}\{J_V(\Psi)\}, \qquad (8.49)$$
with $J_V(\Psi) = P(\Phi) + \inf_{\tilde\Phi\in V}\big\{-\Psi(\tilde\Phi) + P(\Phi+\tilde\Phi)\big\}$.


Proof. Part (a) can be found in [Geo88] and in [FO88], or alternatively in [Oll88], all for the case that the state space $E$ is finite. If $E$ is an arbitrary measurable space, see [Geo93]. Part (b) is in [Geo93] and [Oll88], and part (c) in [Geo93]. Note that the restrictions on the interaction potential can even be relaxed, see [Geo93]. The corresponding theorems for continuous systems can be found in [Geo95].

A last remark concerns part (c) of Theorem 8.5.

Theorem 8.6 (Equivalence of ensembles) Let $\Lambda_N = [-N,N]^d \cap \mathbb{Z}^d$ and $\Phi$ be an interaction potential with finite range. Let $K \subset \mathbb{R}$ be a measurable set of energy densities. Then there is an interaction potential $\Psi \in V$ such that $\Phi + \Psi \in V$ and
$$\operatorname{acc}_{N\to\infty}\,\gamma_{\Lambda_N}^{\beta,\mathrm{per}}\big(\,\cdot\,\big|\,\Psi_N \in K\big) \subset \mathcal{G}_\Theta\big(\beta(\Phi+\Psi)\big), \qquad (8.50)$$
where $\operatorname{acc}$ denotes the set of accumulation points.

Proof. See [Geo93], [Geo95] and [LPS95]. Observe that the periodic boundary conditions are crucial for this result (they ensure the translation invariance). Translation invariance provides, as we know from Section 7, a variational characterisation of Gibbs measures. There exists no proof for configurational boundary conditions in dimension $d \ge 2$; for the case $d = 1$ see [Ada01].


9 Models

We present here some important models in statistical mechanics. For more models for lattice systems see [BL99a] and [BL99b]. The last example in this section is the continuous Ising model, which is an effective model for interfaces and plays an important role in many investigations.

9.1 Lattice Gases

We consider here a system of particles occupying a set $\Lambda \subset \mathbb{Z}^d$ with $|\Lambda| = V$, where $|\Lambda|$ denotes the number of sites in $\Lambda$. At each point of $\Lambda$ there is at most one particle. For $i \in \Lambda$ we set $\sigma_i = 1$ if there is a particle at the site $i$ and $\sigma_i = 0$ otherwise. Any $\sigma \in \Omega_\Lambda := \{0,1\}^\Lambda$ is called a configuration. For a configuration $\sigma$ we have the Hamiltonian $H_\Lambda(\sigma)$. The canonical partition function is
$$Z_\Lambda(\beta, N) = \sum_{\sigma\in\Omega_\Lambda,\ \sum_i\sigma_i = N} e^{-\beta H_\Lambda(\sigma)}.$$
Note that there is no need here for $N!$ since the particles are indistinguishable. The thermal wavelength is put equal to 1. The grand canonical partition function is then
$$Z_\Lambda(\beta,\mu) = \sum_{N=0}^{V} e^{\beta\mu N}\sum_{\sigma,\ \sum_i\sigma_i=N} e^{-\beta H_\Lambda(\sigma)} = \sum_{N=0}^{V}\sum_{\sigma,\ \sum_i\sigma_i=N} e^{-\beta(H_\Lambda(\sigma)-\mu\sum_i\sigma_i)} = \sum_{\sigma\in\Omega_\Lambda} e^{-\beta(H_\Lambda(\sigma)-\mu\sum_i\sigma_i)}.$$
The thermodynamic functions are defined in the usual way. The probability of a configuration $\sigma \in \{0,1\}^\Lambda$ is
$$\frac{e^{-\beta(H_\Lambda(\sigma)-\mu\sum_i\sigma_i)}}{Z_\Lambda(\beta,\mu)}.$$
The Hamiltonian is of the form
$$H_\Lambda(\sigma) = \sum_{i,j\in\Lambda,\ i\ne j} \sigma_i\sigma_j\,\phi(q_i - q_j),$$
where $q_i$ is the position vector of the site $i \in \Lambda$. However, this is too difficult to solve in general. We consider two simplifications of the potential energy.

Mean-field models: $\phi$ is taken to be a constant $-\alpha$. Therefore
$$H_\Lambda(\sigma) = -\alpha\sum_{i,j\in\Lambda,\ i\ne j} \sigma_i\sigma_j.$$
Take $\alpha > 0$, otherwise the interaction potential is not tempered. When $\sigma_i = 1$ for all $i \in \Lambda$, $H_\Lambda(\sigma) = -\alpha\,\frac{V(V-1)}{2}$ and therefore $H_\Lambda$ is not stable. For $H_\Lambda$ to be stable we must take $\alpha = \frac{\gamma}{V}$ with $\gamma > 0$. Thus
$$H_\Lambda(\sigma) = -\frac{\gamma}{V}\sum_{i,j\in\Lambda,\ i\ne j} \sigma_i\sigma_j.$$
Note that for mean-field models the lattice structure is not important, since the interaction does not depend on the location of the lattice sites; we can therefore take $\Lambda = \{1,2,\ldots,V\}$.

Nearest-neighbour models: In these models we take
$$\phi(q_i - q_j) = \begin{cases} -J & \text{if } |q_i - q_j| = 1,\\ 0 & \text{if } |q_i - q_j| > 1,\end{cases}$$
that is, the interaction is only between nearest neighbours and is then of strength $J$, $J \in \mathbb{R}$. If we denote a pair of neighbouring sites $i$ and $j$ by $\langle i,j\rangle$ we have
$$H_\Lambda(\sigma) = -J\sum_{\langle i,j\rangle} \sigma_i\sigma_j.$$
Note that J can be negative or positive.
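For a tiny volume the grand canonical partition function can be evaluated by brute force, summing over all $2^V$ configurations. A sketch for the nearest-neighbour model on a chain (the geometry and parameter values are choices made for illustration); for $J = 0$ the sites decouple and $Z_\Lambda(\beta,\mu) = (1 + e^{\beta\mu})^V$, which gives a check:

```python
import itertools, math

def Z_grand(V, beta, mu, J):
    """Grand canonical partition function of a nearest-neighbour lattice
    gas on a chain of V sites with H = -J * sum_i sigma_i * sigma_{i+1}."""
    Z = 0.0
    for sigma in itertools.product((0, 1), repeat=V):
        H = -J * sum(sigma[i] * sigma[i + 1] for i in range(V - 1))
        N = sum(sigma)                       # particle number of sigma
        Z += math.exp(-beta * (H - mu * N))
    return Z

beta, mu, V = 1.0, 0.5, 8
# Independent sites when J = 0: Z = (1 + e^{beta*mu})^V.
print(abs(Z_grand(V, beta, mu, 0.0) - (1 + math.exp(beta * mu))**V))  # ~ 0
print(Z_grand(V, beta, mu, 1.0) > Z_grand(V, beta, mu, 0.0))          # True: attraction raises Z
```

With an attractive coupling $J > 0$ every occupied pair lowers the energy, so each Boltzmann weight grows and $Z$ increases, as the second line confirms.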

9.2 Magnetic Models

In magnetic models, at each site of $\Lambda$ there is a dipole or spin. This spin can point upwards or downwards, that is, along the direction of the external magnetic field or in the opposite direction. For $i \in \Lambda$ we set $\sigma_i = 1$ if the spin at the site $i$ is pointing upwards and $\sigma_i = -1$ if it is pointing downwards. The element $\sigma \in \{-1,1\}^\Lambda$ is called a configuration. For a configuration $\sigma$ we have an energy $E(\sigma)$ and an interaction with an external magnetic field of strength $h$, namely $-h\sum_{i\in\Lambda}\sigma_i$. The partition function is then
$$Z_\Lambda(\beta,h) = \sum_{\sigma\in\{-1,1\}^\Lambda} e^{-\beta(E(\sigma)-h\sum_i\sigma_i)}.$$
The free energy per lattice site is
$$f_\Lambda(\beta,h) = -\frac{1}{\beta V}\log Z_\Lambda(\beta,h).$$
The probability of a configuration $\sigma \in \{-1,1\}^\Lambda$ is
$$\frac{e^{-\beta(E(\sigma)-h\sum_i\sigma_i)}}{Z_\Lambda(\beta,h)}.$$
The total magnetic moment is the random variable
$$M_\Lambda(\sigma) = \sum_{i\in\Lambda}\sigma_i,$$
and therefore
$$\mathbb{E}(M_\Lambda) = \frac{\sum_{\sigma\in\{-1,1\}^\Lambda}\big(\sum_i\sigma_i\big)\,e^{-\beta(E(\sigma)-h\sum_i\sigma_i)}}{Z_\Lambda(\beta,h)} = \frac{1}{\beta}\frac{\partial}{\partial h}\log Z_\Lambda(\beta,h).$$
Then, if $m_\Lambda(\beta,h)$ denotes the mean magnetisation per lattice site, we have
$$m_\Lambda(\beta,h) = \frac{\mathbb{E}(M_\Lambda)}{V} = -\frac{\partial}{\partial h}f_\Lambda(\beta,h).$$
Note that
$$-\frac{\partial^2}{\partial h^2}f_\Lambda(\beta,h) = \frac{\beta}{V}\,\mathbb{E}\big(M_\Lambda - \mathbb{E}(M_\Lambda)\big)^2 \ge 0.$$
Therefore $h \mapsto f_\Lambda(\beta,h)$ is concave. If $E(-\sigma) = E(\sigma)$, then
$$f_\Lambda(\beta,-h) = f_\Lambda(\beta,h).$$
If $\Lambda_l$ is a sequence of regions tending to infinity and if
$$\lim_{l\to\infty} f_{\Lambda_l}(\beta,h) = f(\beta,h),$$
then $h \mapsto f(\beta,h)$ is also concave and, if it is differentiable,
$$m(\beta,h) := \lim_{l\to\infty} m_{\Lambda_l}(\beta,h) = -\frac{\partial}{\partial h}f(\beta,h).$$
Relation between Lattice Gas and Magnetic Models

We can relate the lattice gas to a magnetic model and vice versa by the transformation
$$t_i = (\sigma_i + 1)/2 \qquad\text{or}\qquad \sigma_i = 2t_i - 1,$$
where $t_i \in \{0,1\}$ is the occupation number and $\sigma_i \in \{-1,1\}$ the spin. This gives
$$H_\Lambda(t) - \mu\sum_i t_i = E(\sigma) - \Big(a + \frac{\mu}{2}\Big)\sum_i\sigma_i - \Big(b + \frac{\mu}{2}\Big)V,$$
where $a$ and $b$ are constants. Therefore the pressure and density satisfy
$$p(\beta,\mu) = \Big(b + \frac{\mu}{2}\Big) - f\Big(\beta,\, a + \frac{\mu}{2}\Big)$$
and
$$\rho(\beta,\mu) = \frac12\Big(1 + m\Big(\beta,\, a + \frac{\mu}{2}\Big)\Big).$$

9.3 Curie-Weiss model

We study here the Curie-Weiss model, which is a mean-field model given by the interaction energy
$$E(\sigma) = -\frac{\nu}{V}\sum_{1\le i<j\le V}\sigma_i\sigma_j = -\frac{\nu}{2V}\Big(\sum_{i=1}^V\sigma_i\Big)^2 + \frac{\nu}{2}, \qquad \sigma \in \{-1,1\}^\Lambda,$$
where $\nu > 0$ and $\Lambda$ is any finite set with $|\Lambda| = V$. We sketch here only some explicit calculations; more on the model can be found in the books [Ell85], [Dor99], [Rei98] and [TKS92]. The partition function is given by
$$Z_\Lambda(\beta,h) = \sum_{\sigma\in\{-1,1\}^V} e^{-\beta(E(\sigma)-h\sum_{i=1}^V\sigma_i)}.$$
For $E$ as above this becomes
$$Z_\Lambda(\beta,h) = e^{-\frac{\beta\nu}{2}}\sum_{\sigma\in\{-1,1\}^V}\exp\bigg[\frac{\beta\nu}{2V}\Big(\sum_{i=1}^V\sigma_i\Big)^2 + \beta h\sum_{i=1}^V\sigma_i\bigg].$$
Note that $Z_\Lambda(\beta,h) = Z_\Lambda(\beta,-h)$. In the identity
$$\int_{-\infty}^{\infty} e^{-\frac12 y^2}\,dy = \sqrt{2\pi},$$
put $y = x - a$. This gives
$$\int_{-\infty}^{\infty} e^{-\frac12 x^2 + ax}\,dx = \sqrt{2\pi}\,e^{\frac12 a^2} \qquad\text{or}\qquad e^{\frac12 a^2} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac12 x^2 + ax}\,dx.$$
Using this identity with $a = \sqrt{\frac{\beta\nu}{V}}\sum_{i=1}^V\sigma_i$ we get
$$Z_\Lambda(\beta,h) = e^{-\frac{\beta\nu}{2}}\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac12 x^2}\sum_{\sigma\in\{-1,1\}^V}\exp\bigg[\Big(x\sqrt{\tfrac{\beta\nu}{V}} + \beta h\Big)\sum_{i=1}^V\sigma_i\bigg]\,dx.$$
Now
$$\sum_{\sigma\in\{-1,1\}^V}\exp\Big(\eta\sum_{i=1}^V\sigma_i\Big) = (2\cosh\eta)^V.$$
Therefore
$$Z_\Lambda(\beta,h) = e^{-\frac{\beta\nu}{2}}\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} e^{-\frac12 x^2}\Big[2\cosh\Big(x\sqrt{\tfrac{\beta\nu}{V}} + \beta h\Big)\Big]^V dx.$$
Putting $s = \frac{x}{\sqrt{\beta\nu V}}$, we get
$$Z_\Lambda(\beta,h) = e^{-\frac{\beta\nu}{2}}\Big(\frac{\beta\nu V}{2\pi}\Big)^{\frac12}\int_{-\infty}^{\infty} e^{-\frac{\beta\nu V}{2}s^2}\big[2\cosh(\beta(\nu s + h))\big]^V ds = e^{-\frac{\beta\nu}{2}}\,2^V\Big(\frac{\beta\nu V}{2\pi}\Big)^{\frac12}\int_{-\infty}^{\infty} e^{V G(h,s)}\,ds,$$
where
$$G(h,s) = -\frac{\beta\nu}{2}s^2 + \log\cosh\big(\beta(\nu s + h)\big).$$
The free energy per lattice site is
$$f_\Lambda(\beta,h) = -\frac{1}{\beta V}\log Z_\Lambda(\beta,h) = \frac{\nu}{2V} - \frac{1}{\beta}\log 2 - \frac{1}{2\beta V}\log\frac{\beta\nu V}{2\pi} - \frac{1}{\beta V}\log\int_{-\infty}^{\infty} e^{V G(h,s)}\,ds.$$
Therefore, by Laplace's theorem (see for example [Ell85] or [Dor99]), the free energy per lattice site in the thermodynamic limit is
$$f(\beta,h) = -\frac{1}{\beta}\log 2 - \frac{1}{\beta}\lim_{V\to\infty}\frac{1}{V}\log\int_{-\infty}^{\infty} e^{V G(h,s)}\,ds = -\frac{1}{\beta}\log 2 - \frac{1}{\beta}\sup_{s\in\mathbb{R}} G(h,s).$$
Suppose that the supremum of $G(h,\cdot)$ is attained at $\mu^*(h)$. Then
$$f(\beta,h) = -\frac{1}{\beta}\log 2 - \frac{1}{\beta} G(h,\mu^*(h))$$
and
$$\frac{\partial G}{\partial s}(h,\mu^*(h)) = -\beta\nu\mu^*(h) + \beta\nu\tanh\big(\beta(\nu\mu^*(h) + h)\big) = 0,$$
or
$$\mu^*(h) = \tanh\big(\beta(\nu\mu^*(h) + h)\big).$$

Figure 5: graphical solution of the mean-field equation for $h > 0$, $\beta\nu > 1$ (intersection of $y = s$ with $y = \tanh(\beta(\nu s + h))$ at $\mu^*(h)$; as $h \downarrow 0$ the intersection tends to $m_0$).
The mean magnetisation per site in the thermodynamic limit is
$$m(\beta,h) = -\frac{\partial}{\partial h}f(\beta,h) = \frac{1}{\beta}\frac{\partial}{\partial h}G(h,\mu^*(h)) = \tanh\big(\beta(\nu\mu^*(h)+h)\big) + \frac{1}{\beta}\frac{\partial G}{\partial s}(h,\mu^*(h))\,\frac{\partial\mu^*}{\partial h}(h) = \mu^*(h),$$
since $\frac{\partial G}{\partial s}(h,\mu^*(h)) = 0$. Since
$$f(\beta,-h) = f(\beta,h) \qquad\text{and}\qquad m(\beta,-h) = -m(\beta,h),$$
it is sufficient to consider the case $h \ge 0$ (see Figures 7 and 8). The expression $m_0 = \lim_{h\downarrow 0} m(\beta,h)$ is called the spontaneous magnetisation; this is the mean magnetisation as the magnetic field is decreased to zero,
$$m_0 = \lim_{h\downarrow 0} m(\beta,h) = \lim_{h\downarrow 0}\mu^*(h).$$
We have from above
$$m_0 \begin{cases} = 0 & \text{if } \beta\nu \le 1,\\ > 0 & \text{if } \beta\nu > 1.\end{cases}$$

Figure 6: graphical solution of the mean-field equation for $h > 0$, $\beta\nu \le 1$.

Figure 7: $-f(\beta,h)$ as a function of $h$ for $\beta\nu > 1$, with value $\frac{1}{\beta}\log 2$ and slope $\pm m_0$ at $h = 0$.

Figure 8: $-f(\beta,h)$ as a function of $h$ for $\beta\nu \le 1$, with value $\frac{1}{\beta}\log 2$ at $h = 0$.

Let $T_c = \frac{\nu}{k}$; $T_c$ is called the Curie point. $T \ge T_c$ corresponds to $\beta\nu \le 1$ (see Figure 8) and $T < T_c$ to $\beta\nu > 1$ (see Figure 7):
$$m_0 \begin{cases} > 0 & \text{when } T < T_c,\\ = 0 & \text{when } T \ge T_c.\end{cases}$$
We have a phase transition at the Curie point, corresponding to the onset of spontaneous magnetisation.
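The onset of spontaneous magnetisation at $\beta\nu = 1$ can be seen by iterating the mean-field fixed-point equation $\mu = \tanh(\beta\nu\mu)$ at $h = 0$ (a standard iteration sketch; the starting point and iteration count are choices of the demonstration):

```python
import math

def magnetisation(beta_nu, iterations=10_000):
    """Iterate mu -> tanh(beta_nu * mu) at h = 0, starting from mu = 1,
    to find the largest solution of the mean-field equation."""
    mu = 1.0
    for _ in range(iterations):
        mu = math.tanh(beta_nu * mu)
    return mu

print(magnetisation(0.5))   # beta*nu <= 1: only the trivial solution mu = 0
print(magnetisation(2.0))   # beta*nu > 1: positive spontaneous magnetisation
```

For $\beta\nu \le 1$ the map is a contraction towards $0$, while for $\beta\nu > 1$ the slope of $\tanh(\beta\nu s)$ at the origin exceeds $1$ and two symmetric non-trivial solutions $\pm m_0$ appear; the iteration from $\mu = 1$ converges to the positive one.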

We can consider this model from the point of view of a lattice gas. Consider a lattice gas with potential energy
$$H_\Lambda(t) = -\frac{\nu}{V}\sum_{1\le i<j\le V} t_it_j = -\frac{\nu}{2V}\Big(\sum_{i=1}^V t_i\Big)^2 + \frac{\nu}{2V}\sum_{i=1}^V t_i.$$
Let $t_i = \frac{\sigma_i+1}{2}$. Then
$$\sum_{i=1}^V t_i = \frac12\Big(\sum_{i=1}^V\sigma_i + V\Big).$$
Therefore
$$H_\Lambda(t) = -\frac{\nu}{8V}\Big(\sum_{i=1}^V\sigma_i\Big)^2 - \frac{\nu}{4}\sum_{i=1}^V\sigma_i - \frac{\nu V}{8} + \frac{\nu}{4V}\sum_{i=1}^V\sigma_i + \frac{\nu}{4}.$$
We can neglect the last two terms, because $\frac{\nu}{V}$ is small and the expectation of a single spin is zero, and we take
$$H_\Lambda(t) = -\frac{\nu}{8V}\Big(\sum_{i=1}^V\sigma_i\Big)^2 - \frac{\nu}{4}\sum_{i=1}^V\sigma_i - \frac{\nu V}{8}.$$

Then
$$H_\Lambda(t) - \mu\sum_{i=1}^V t_i = -\frac{\nu}{8V}\Big(\sum_{i=1}^V\sigma_i\Big)^2 - \Big(\frac{\nu}{4}+\frac{\mu}{2}\Big)\sum_{i=1}^V\sigma_i - \Big(\frac{\nu}{8}+\frac{\mu}{2}\Big)V = E(\sigma) - \Big(\frac{\nu}{4}+\frac{\mu}{2}\Big)\sum_{i=1}^V\sigma_i - \Big(\frac{\nu}{8}+\frac{\mu}{2}\Big)V,$$
with the Curie-Weiss energy $E$ taken at coupling $\frac{\nu}{4}$, and
$$p(\beta,\mu) = \Big(\frac{\nu}{8}+\frac{\mu}{2}\Big) - f\Big(\beta,\,\frac{\nu}{4}+\frac{\mu}{2}\Big)$$
and
$$\rho(\beta,\mu) = \frac12\Big(1 + m\Big(\beta,\,\frac{\nu}{4}+\frac{\mu}{2}\Big)\Big).$$
Let $\mu_0 = -\frac{\nu}{2}$; then
$$p(\beta,\mu) = \Big(\frac{\nu}{8}+\frac{\mu}{2}\Big) - f\Big(\beta,\,\frac12(\mu-\mu_0)\Big)$$
and
$$\rho(\beta,\mu) = \frac12\Big(1 + m\Big(\beta,\,\frac12(\mu-\mu_0)\Big)\Big).$$
If $\beta\nu > 4$, then $p(\beta,\mu)$ has a discontinuity in its derivative at $\mu_0$ and $\rho(\beta,\mu)$ has a discontinuity at $\mu_0$ (see Figure 9).

Figure 9: $\rho(\beta,\mu)$ and $p(\beta,\mu)$ for $\beta\nu > 4$, with the discontinuity at $\mu_0$.

9.4 Continuous Ising model

In the continuous Ising model the state space $E = \{-1,+1\}$ is replaced by the real numbers $\mathbb{R}$. Let $\Omega = \mathbb{R}^{\mathbb{Z}^d}$ denote the space of configurations. Due to the non-compactness of the state space severe mathematical difficulties arise. We note that the continuous Ising model can be seen as an effective model describing the height of an interface: here the functions $\phi \in \Omega$ give the height of an interface relative to some reference height, and any collection $(\phi_x)_{x\in\mathbb{Z}^d}$ or probability measure $P \in \mathcal{P}(\Omega,\mathcal{F})$ is called a random field of heights. Details about this model can be found in [Gia00] and [Fun05]. One first considers the so-called massive model, where there is a mass $m > 0$ implying a self-interaction. Let $\Lambda \in \mathcal{S}$, $\psi \in \Omega$ and $m > 0$. We write synonymously $\phi_x = \phi(x)$ for $\phi \in \Omega$. Nearest-neighbour heights interact with an elastic interaction potential $V : \mathbb{R} \to \mathbb{R}$, which we assume to be strictly convex with quadratic growth, and which depends only on the difference in the heights of the nearest neighbours. In the simplest case $V(r) = \frac{r^2}{2}$ one gets the Hamiltonian
$$H_\Lambda^\psi(\phi) = \sum_{x\in\Lambda}\frac{m^2}{2}\phi_x^2 + \frac{1}{4d}\sum_{\substack{x,y\colon\\ |x-y|=1}}(\phi_x - \phi_y)^2,$$
with $\phi_x = \psi_x$ for $x \in \Lambda^c$. The interface here is said to be anchored at $\psi$ outside of $\Lambda$. A random interface anchored at $\psi$ outside of $\Lambda$ is given by the Gibbs distribution
$$\gamma_\Lambda^\psi(d\phi) = \frac{1}{Z_\Lambda(\psi)}\,e^{-H_\Lambda^\psi(\phi)}\,\nu_\Lambda^\psi(d\phi),$$
where
$$\nu_\Lambda^\psi(d\phi) = \prod_{x\in\Lambda} d\phi_x \prod_{x\notin\Lambda} \delta_{\psi_x}(d\phi_x)$$
is the product of the Lebesgue measure at each single site in $\Lambda$ and the Dirac measure at $\psi_x$ for $x \in \Lambda^c$. The term $\nu_\Lambda^\psi$ is called the reference measure in $\Lambda$ with boundary $\psi$. The thermodynamic limit exists for the model with $m > 0$ in any dimension. However, for the most interesting case $m = 0$ it exists only for $d \ge 3$. These models are called massless models or harmonic crystals. The interesting feature of these models is that there are infinitely many Gibbs measures due to the continuous symmetry; hence we are in a regime of phase transitions (see [BD93] for some rigorous results for this regime). The massless models have been studied intensively during the last fifteen years (see [Gia00] for an overview). The main technique applied is the random walk representation. This can be achieved when one employs summation by parts to obtain a discrete elliptic problem.

Figure 10: height functions $\phi : \mathbb{Z}^d \to \mathbb{R}$


This also hints at why we need d ≥ 3: the random walk representation requires transience of the random walk. Fortunately, if one passes to the random field of gradients, i.e. the field derived from the random field of heights via the discrete gradient mapping, one has existence of infinite-volume Gibbs measures in any dimension ([Gia00], [Fun05]). However, one loses the product structure of the reference measure and has to deal with the curl-free condition. The fundamental result concerning these gradient Gibbs measures is given in [FS97]. For a recent review see [Fun05].
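The curl-free condition can be stated concretely: the discrete gradient field η of a height field φ has vanishing signed sum around every plaquette of Z², and only such edge fields arise as gradients of heights. A minimal numerical check on a hypothetical 5×5 grid (illustrative only, not from the text):

```python
import numpy as np

rng = np.random.default_rng(1)
phi = rng.standard_normal((5, 5))      # heights phi : {0,...,4}^2 -> R

# Discrete gradients along the two coordinate directions.
eta_x = phi[1:, :] - phi[:-1, :]       # eta(x, x + e1)
eta_y = phi[:, 1:] - phi[:, :-1]       # eta(x, x + e2)

# Signed sum around each plaquette x -> x+e1 -> x+e1+e2 -> x+e2 -> x.
curl = eta_x[:, :-1] + eta_y[1:, :] - eta_x[:, 1:] - eta_y[:-1, :]
assert np.allclose(curl, 0.0)          # gradient fields are curl-free
```

Conversely, an edge field failing this plaquette condition cannot be integrated to a height field, which is the constraint one must carry along when working with gradient Gibbs measures.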


References
[AA68] V.I. Arnold and A. Avez. Ergodic Problems of Classical Mechanics. Benjamin, New York, 1968.

[Ada01] S. Adams. Complete equivalence of the Gibbs ensembles for one-dimensional Markov systems. Journal Stat. Phys., 105(5/6), 2001.

[AGL78] M. Aizenman, S. Goldstein, and J.L. Lebowitz. Conditional equilibrium and the equivalence of microcanonical and grandcanonical ensembles in the thermodynamic limit. Commun. Math. Phys., 62:279-302, 1978.

[Aiz80] M. Aizenman. Instability of phase coexistence and translation invariance in two dimensions. Number 116 in Lecture Notes in Physics. Springer, 1980.

[AL06] S. Adams and J.L. Lebowitz. About fluctuations of the kinetic energy in the microcanonical ensemble. In preparation, 2006.

[Bal91] R. Balian. From Microphysics to Macrophysics - Methods and Applications of Statistical Physics. Springer-Verlag, Berlin, 1991.

[Bal92] R. Balian. From Microphysics to Macrophysics - Methods and Applications of Statistical Physics. Springer-Verlag, Berlin, 1992.

[BD93] E. Bolthausen and J.D. Deuschel. Critical large deviations for Gaussian fields in the phase transition regime, I. Ann. Probab., 21(4):1876-1920, 1993.

[Bir31] G.D. Birkhoff. Proof of the ergodic theorem. Proc. Nat. Acad. Sci. USA, 17:656-660, 1931.

[BL99a] G.M. Bell and D.A. Lavis. Statistical Mechanics of Lattice Systems, volume I. Springer-Verlag, 2nd edition, 1999.

[BL99b] G.M. Bell and D.A. Lavis. Statistical Mechanics of Lattice Systems, volume II. Springer-Verlag, 1999.

[Bol84] L. Boltzmann. Über die Eigenschaften monozyklischer und anderer damit verwandter Systeme, volume III of Wissenschaftliche Abhandlungen. Chelsea, New York, 1884. Reprint 1968.

[Bol74] L. Boltzmann. Theoretical Physics and Philosophical Problems: Selected Writings. Reidel, Dordrecht, 1974.

[Bre57] L. Breiman. The individual ergodic theorem of information theory. Ann. Math. Stat., 28:809-811, 1957.

[CK81] I. Csiszár and J. Körner. Information Theory: Coding Theorems for Discrete Memoryless Systems. Akadémiai Kiadó, Budapest, 1981.

[dH00] F. den Hollander. Large Deviations. American Mathematical Society, 2000.

[Dob68a] R.L. Dobrushin. The description of a random field by means of conditional probabilities and conditions of its regularity. Theor. Prob. Appl., 13:197-224, 1968.

[Dob68b] R.L. Dobrushin. Gibbsian random fields for lattice systems with pairwise interactions. Funct. Anal. Appl., 2:292-301, 1968.

[Dob68c] R.L. Dobrushin. The problem of uniqueness of a Gibbs random field and the problem of phase transition. Funct. Anal. Appl., 2:302-312, 1968.

[Dob73] R.L. Dobrushin. Investigation of Gibbsian states for three-dimensional lattice systems. Theor. Prob. Appl., 18:253-271, 1973.

[Dor99] T.C. Dorlas. Statistical Mechanics: Fundamentals and Model Solutions. IOP, 1999.

[DZ98] A. Dembo and O. Zeitouni. Large Deviations Techniques and Applications. Springer-Verlag, 1998.

[EL02] G. Emch and C. Liu. The Logic of Thermostatistical Physics. Springer, 2002.

[Ell85] R.S. Ellis. Entropy, Large Deviations and Statistical Mechanics. Springer-Verlag, 1985.

[FO88] H. Föllmer and S. Orey. Large deviations for the empirical field of a Gibbs measure. Ann. Probab., 16(3):961-977, 1988.

[Fol73] H. Föllmer. On entropy and information gain in random fields. Probab. Theory Relat. Fields, 53:147-156, 1973.

[FS97] T. Funaki and H. Spohn. Motion by mean curvature from the Ginzburg-Landau interface model. Commun. Math. Phys., 185:1-36, 1997.

[Fun05] T. Funaki. Stochastic Interface Models, volume 1869 of Lecture Notes in Mathematics, pages 1-178. Springer, 2005.

[Gal99] G. Gallavotti. Statistical Mechanics: A Short Treatise. Springer-Verlag, 1999.

[Geo79] H.O. Georgii. Canonical Gibbs Measures. Lecture Notes in Mathematics. Springer, 1979.

[Geo88] H.O. Georgii. Gibbs Measures and Phase Transitions. De Gruyter, 1988.

[Geo93] H.O. Georgii. Large deviations and maximum entropy principle for interacting random fields on Zd. Ann. Probab., 21:1845-1875, 1993.

[Geo95] H.O. Georgii. The equivalence of ensembles for classical systems of particles. Journal Stat. Phys., 80(5/6):1341-1378, 1995.

[GHM00] H.O. Georgii, O. Häggström, and C. Maes. The random geometry of equilibrium phases, volume 18 of Phase Transitions and Critical Phenomena, pages 1-142. Academic Press, London, 2000.

[Gia00] G. Giacomin. Anharmonic lattices, random walks and random interfaces, volume I of Recent Research Developments in Statistical Physics, pages 97-118. Transworld Research Network, 2000.

[Gib02] J.W. Gibbs. Elementary Principles of Statistical Mechanics, Developed with Special Reference to the Rational Foundations of Thermodynamics. Scribner, New York, 1902.

[GM67] G. Gallavotti and S. Miracle-Solé. Statistical mechanics of lattice systems. Commun. Math. Phys., 5:317-324, 1967.

[Hua87] K. Huang. Statistical Mechanics. Wiley, 1987.

[Isi24] E. Ising. Beitrag zur Theorie des Ferro- und Paramagnetismus. Dissertation, Mathematisch-Naturwissenschaftliche Fakultät der Universität Hamburg, 1924.

[Isr79] R.B. Israel. Convexity in the Theory of Lattice Gases. Princeton University Press, 1979.

[Jay89] E.T. Jaynes. Papers on Probability, Statistics and Statistical Physics. Kluwer, Dordrecht, 2nd edition, 1989.

[Khi49] A.I. Khinchin. Mathematical Foundations of Statistical Mechanics. Dover Publications, 1949.

[Khi57] A.I. Khinchin. Mathematical Foundations of Information Theory. Dover Publications, 1957.

[Kur60] R. Kurth. Axiomatics of Classical Statistical Mechanics. Pergamon Press, 1960.

[KW41] H.A. Kramers and G.H. Wannier. Statistics of the two-dimensional ferromagnet I-II. Phys. Rev., 60:252-276, 1941.

[Len20] W. Lenz. Beitrag zum Verständnis der magnetischen Erscheinungen in festen Körpern. Physik. Zeitschrift, 21:613-615, 1920.

[LL72] J.L. Lebowitz and A. Martin-Löf. On the uniqueness of the equilibrium state for Ising spin systems. Commun. Math. Phys., 25:276-282, 1972.

[LPS95] J.T. Lewis, C.E. Pfister, and W.G. Sullivan. Entropy, concentration of probability and conditional limit theorems. Markov Process. Related Fields, 1(3):319-386, 1995.

[LR69] O.E. Lanford and D. Ruelle. Observables at infinity and states with short range correlations in statistical mechanics. Commun. Math. Phys., 13:194-215, 1969.

[McM53] B. McMillan. The basic theorem of information theory. Ann. Math. Stat., 24:196-214, 1953.

[Min00] R.A. Minlos. Introduction to Mathematical Statistical Physics. AMS, 2000.

[Oll88] S. Olla. Large deviations for Gibbs random fields. Probab. Th. Rel. Fields, 77:343-357, 1988.

[Pei36] R. Peierls. On Ising's model of ferromagnetism. Proc. Cambridge Phil. Soc., 32:477-481, 1936.

[Rei98] L.E. Reichl. A Modern Course in Statistical Physics. Wiley, New York, 2nd edition, 1998.

[RR67] D.W. Robinson and D. Ruelle. Mean entropy of states in classical statistical mechanics. Commun. Math. Phys., 5:288-300, 1967.

[Rue69] D. Ruelle. Statistical Mechanics: Rigorous Results. Addison-Wesley, 1969.

[Rue78] D. Ruelle. Thermodynamic Formalism: The Mathematical Structures of Classical Equilibrium. Addison-Wesley, 1978.

[Sha48] C.E. Shannon. A mathematical theory of communication. Bell System Techn. J., 27:379-423, 1948.

[Shl83] S.B. Shlosman. Non-translation-invariant states in two dimensions. Commun. Math. Phys., 87:497-504, 1983.

[SW49] C.E. Shannon and W. Weaver. The Mathematical Theory of Communication. University of Illinois Press, 1949.

[Tho74] R.L. Thompson. Equilibrium States on Thin Energy Shells. Memoirs of the American Mathematical Society. AMS, 1974.

[Tho79] C.J. Thompson. Mathematical Statistical Mechanics. Princeton University Press, 1979.

[Tho88] C.J. Thompson. Classical Equilibrium Statistical Mechanics. Clarendon, 1988.

[TKS92] M. Toda, R. Kubo, and N. Saito. Statistical Physics I - Equilibrium Statistical Mechanics. Number 30 in Solid-State Sciences. Springer, New York, 1992.
