
4. Canonical ensemble
In this chapter we will formulate statistical physics for subsystems that are held at constant temperature. In the next chapter we will work at constant temperature and constant chemical potential. The calculations are different from those we have done so far, and often far simpler. For systems at fixed temperature there are very useful numerical techniques, Monte Carlo methods, that we will explain.

Ensembles: Gibbs gave odd names to methods involving subsystems in contact with reservoirs.

- In a closed system with fixed energy we estimate time averages by sampling uniformly over the energy shell, as we have done until now. Gibbs called this the microcanonical ensemble.
- In a system which can exchange energy at fixed temperature we average using the Boltzmann factor of Eq. (3.49). This is called the canonical ensemble.
- In a system which can exchange both energy and number with reservoirs, we have a different method of averaging that we will derive below. This is called the grand canonical ensemble.

4.1. Averages and the partition function


In this section we study systems which are subsystems of a larger system: they are held at constant temperature by contact with a (much larger) heat bath. For a system in contact with a heat bath we have already derived the probability of finding a given energy or a given state, Eq. (3.49). To review, we say:

W_{tot} = e^{S_r(E - E_s)/k_B} W_s(E_s) \propto e^{-\beta E_s} W_s(E_s),

P(E_i) \propto \exp(-\beta E_i).   (4.1)

Here E_i labels a particular quantum state, and \beta = 1/k_B T. Note that this formula holds in general: classical systems and quantum systems obey this law^1.



Any probability satisfies \sum_j P(j) = 1, where the sum is over all the alternatives. Thus we can write, for the canonical ensemble:

P(E_i) = \frac{e^{-\beta E_i}}{Z}; \qquad Z = \sum_i e^{-\beta E_i}.   (4.2)

The normalization, Z, is called the partition function. The notation, Z, comes from the German Zustandssumme = sum over states. In the chemical literature Z is called Q. For a classical system we replace the sum over states by an integral in the same way as for the microcanonical ensemble:

P(E) = \frac{1}{N!h^{3N}} \frac{e^{-\beta H(r,p)}}{Z}, \qquad Z = \frac{1}{N!h^{3N}} \int d^N r \, d^N p \; e^{-\beta H(r,p)}.   (4.3)

4.2. Canonical averages and equipartition


There is a very useful special case of the formalism above.

4.2.1. Distributions of additive degrees of freedom


Let us look at the canonical probability more closely. Suppose that we have a classical Hamiltonian in which certain degrees of freedom (coordinates or momenta) appear additively. For example, for an interacting fluid,

H = p_1^2/2m + \sum_{i=2}^{N} p_i^2/2m + \Phi(\{r\}).

We can find the probability distribution for that variable (the distribution for p_1 in this case) by considering it as a marginal distribution. We will use the relation from probability theory that if P(AB) is the probability of A and B, then:

P(A) = \sum_B P(AB).

The distribution, P(A), is the marginal distribution, the distribution of A ignoring B. We can find P(A) by summing over all the possibilities of B.

^1 This may seem to contradict the form of the Fermi and Bose distributions of Eq. (2.41), Eq. (2.43). It does not, as we will see below.

For our case, we integrate the normalized probability over p_j, j = 2, 3, \ldots, N and r_j, j = 1, 2, \ldots, N:

P(p_1) = \frac{e^{-\beta p_1^2/2m} \int d^{N-1}p \; e^{-\beta \sum_2^N p_i^2/2m} \int d^N r \; e^{-\beta\Phi}}{\int d^N p \; e^{-\beta \sum_1^N p_i^2/2m} \int d^N r \; e^{-\beta\Phi}} = \frac{e^{-\beta p_1^2/2m}}{\int dp_1 \, e^{-\beta p_1^2/2m}}.   (4.4)

This is the Maxwell-Boltzmann distribution which we have seen above for the ideal gas. Note that the interactions cancel out in numerator and denominator. Any classical system, gas, liquid, or solid (or polymer, glass, etc.) has this distribution for the momentum of any particle.

4.2.2. Boltzmann equipartition theorem


Continuing in the same vein, consider the average of any quantity that appears in the Hamiltonian as an additive term which is quadratic in a degree of freedom:

H = H'(\{q, p\}) + CQ^2.

Here, \{q, p\} denotes all the other variables. Then, in the same way as in Eq. (4.4) we have:

\langle CQ^2 \rangle = \frac{\int dQ \, CQ^2 e^{-\beta CQ^2} \int dq \, dp \, e^{-\beta H'}}{\int dQ \, e^{-\beta CQ^2} \int dq \, dp \, e^{-\beta H'}}
= \frac{\int dQ \, CQ^2 e^{-\beta CQ^2}}{\int dQ \, e^{-\beta CQ^2}}
= -\frac{\partial}{\partial\beta} \ln\left[\int dQ \, e^{-\beta CQ^2}\right]
= \frac{k_B T}{2}.   (4.5)

This theorem quickly gives many results. For example, the average internal energy of the ideal gas is:

E = \left\langle \sum_1^N p_i^2/2m \right\rangle = 3N k_B T/2.

The heat capacity follows: C_V = \partial E/\partial T|_V = 3N k_B/2.
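As a quick numerical illustration of Eq. (4.5), here is a sketch in Python; the values of kT and the stiffnesses C are arbitrary choices for the check, not taken from the text. Direct quadrature over a single quadratic degree of freedom gives k_B T/2 whatever C is.

import numpy as np

kT = 1.7                                    # arbitrary temperature (energy units)
Q = np.linspace(-60.0, 60.0, 400001)        # integration grid for the quadrature
for C in (0.3, 2.0, 50.0):                  # several different stiffnesses
    w = np.exp(-C * Q**2 / kT)              # Boltzmann weight exp(-beta C Q^2)
    avg = np.sum(C * Q**2 * w) / np.sum(w)  # <C Q^2>; the grid spacing cancels
    print(C, avg, kT / 2)                   # the average is kT/2, independent of C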


As another example, air is composed, almost entirely, of the diatomic molecules O2 and N2. We can model them as rigid rotors with moment of inertia I. Then the kinetic energy has extra degrees of freedom. For each type of molecule we can write:

E = \sum_j [p_j^2/2m + L_j^2/2I].   (4.6)

The angular momentum vector L should be regarded as having two components: in our classical picture we neglect the moment of inertia around the molecular axis. Thus we have two more degrees of freedom, and

E = 3N(k_B T/2) + 2N(k_B T/2) = (5/2)N k_B T.

The first term is the translational part and the second rotational. Thus C_V = (5/2)N k_B, C_p = (7/2)N k_B. (The last identity comes from the thermodynamic relation for ideal gases, C_p = C_V + N k_B. The proof is left as an exercise.)
This is simple, and gives answers in agreement with experiment.
However, it is quite puzzling in one respect: if we think of diatomic
molecules classically, we should consider that they can vibrate. The
simplest model for a diatomic molecule with a chemical bond would
be two masses connected by a spring. The spring should give another
quadratic degree of freedom, and another kB T /2 per molecule in the
energy. This is not observed. This fact was extremely troubling to the
founders of statistical thermodynamics; see Gibbs (1902). The expla-
nation is that the vibrations must be treated in quantum mechanics,
and equipartition does not hold in quantum theory.
We can apply these ideas to a solid as well. The simplest model for the vibrations of a monatomic solid is due to A. Einstein. Einstein noted that in a solid in equilibrium each atom sits at the minimum of the potential well due to all of the other atoms. Expanding about the minimum means that each atom is in a quadratic potential, and if an atom moves away from equilibrium, the force is of the form -K\mathbf{r} (assuming isotropy). Then the potential energy is Kr^2/2, where K is a spring constant and m is the atomic mass, so that the frequency of the vibration is \omega_E = \sqrt{K/m}. There are three more quadratic degrees of freedom. We find, as before:

C_V = \frac{3N k_B}{2}\,(\text{kinetic}) + \frac{3N k_B}{2}\,(\text{vibrational}) = 3N k_B.   (4.7)
This is called the law of Dulong and Petit; it was discovered empirically
by P. L. Dulong and A. T. Petit in 1819.
The law is quite a lot more general than this derivation seems to imply. In fact, if we let N denote all the atoms in a molecular solid, it applies as well. The reason is this: in the Einstein model each atom is thought to vibrate independently. In fact, all of them are coupled so that the real excitations are waves of vibration, e.g. sound waves. The Hamiltonian is quadratic in the amplitude and momentum of the waves, and we have as many different waves as we have degrees of freedom. How this works out, via the normal mode transformation, will be treated in Chapter 6.

Figure 4.1.: The Einstein model for C_V as a function of T/T_E, where k_B T_E = \hbar\omega_E. The upper line is the Dulong-Petit law, and the lower dotted line schematically depicts the deviation of the theory from experiment at very low T.

However, well below room temperature, the Dulong-Petit law fails, and

\lim_{T \to 0} C_V = 0,

as required by the third law. A quantum mechanical treatment of the Einstein model gives this result; see Figure 4.1. We will do this derivation below. The real situation in solids is qualitatively like the Einstein model with a deviation at very low T.

4.3. The partition function and the free energy

The partition function is of physical significance itself, as shown by the following theorem:

Z = e^{-\beta F}; \qquad F = -k_B T \ln Z.   (4.8)

4.3.1. First proof: average of F

We will show this in two ways. First we need to interpret F in the equation. Remember F = E - TS(E), and E varies as heat flows in and out of the system from the heat bath. In fact, the quantity of interest is F = \langle E \rangle - T_r S. In order to emphasize that the temperature is fixed by the reservoir, we write T_r. Now from Eq. (4.2) and Eq. (3.5) we have:

F = \langle E \rangle - T_r S = \frac{1}{Z} \sum_i (E_i + k_B T_r \ln P(E_i)) e^{-\beta E_i}
= \frac{1}{Z} \sum_i (E_i - k_B T_r [\beta E_i + \ln Z]) e^{-\beta E_i}
= -k_B T_r \ln Z.   (4.9)

4.3.2. Second proof: energy fluctuations

It is interesting to derive Eq. (4.8) another way. Note that for a large system we can expect that the energy levels will be close together (or, in a classical system, the energy is continuous). We can convert the sum in Z into an integral by introducing the many-particle density of states, i.e. the number of states per energy interval:

\rho(E) = \sum_i \delta(E - E_i).   (4.10)

Clearly:

Z = \sum_i e^{-\beta E_i} = \int dE \, \rho(E) e^{-\beta E}.   (4.11)

However, W is the number of states for which E < E_i < E + \Delta, i.e.

W = \sum_i [\theta(E + \Delta - E_i) - \theta(E - E_i)],

where \theta is the step function: \theta(x) = 1, x \ge 0; \theta(x) = 0, x < 0; d\theta/dx = \delta(x). For \Delta \ll E we can expand:

W \approx \Delta \sum_i \delta(E - E_i) = \rho\Delta.   (4.12)

Thus:

S = k_B \ln(\rho\Delta).   (4.13)

Now we use this result to deal with the integral in Eq. (4.11):

Z = \int dE \, e^{-\beta(E - T_r S(E))}/\Delta = \int dE \, e^{-\beta F(E)}/\Delta.   (4.14)

Note that F is extensive, so the argument of the exponential is very large. We can use the Laplace method; see Appendix A. We need to find the energy at which F(E) is a minimum, and the second derivative at that point:

\frac{dF}{dE} = 0 = 1 - T_r \frac{dS}{dE}; \qquad \frac{dS}{dE} = \frac{1}{T_r}

\frac{d^2F}{dE^2} = -T_r \frac{d^2S}{dE^2} = -T_r \left.\frac{d(1/T)}{dE}\right|_{T=T_r} = \frac{1}{C_V T_r}

F(E) = F(\bar{E}) + \frac{1}{2 C_V T_r}(E - \bar{E})^2 + \ldots   (4.15)
2CV Tr
The first equation determines an energy, \bar{E}, at which F is a minimum. At that point the system temperature is equal to the reservoir temperature, and we can write F(\bar{E}) = F(T_r). The second derivative must be positive in order that F be a minimum. This gives a stability condition: C_V > 0. If this does not hold, we are dealing with a system that cannot come to equilibrium.
Using Eq. (A.9), we have:

Z \approx e^{-\beta F(T_r)} \int e^{-(E-\bar{E})^2/2k_B C_V T_r^2} \, dE/\Delta
= e^{-\beta F(T_r)} \sqrt{2\pi k_B T^2 C_V}/\Delta.   (4.16)

The correction to Eq. (4.8) is now explicit:

F = -k_B T \ln Z + k_B T \ln\left[\sqrt{2\pi k_B T^2 C_V}\right] + O(1).

The leading correction is of order ln(N).


We have actually proved more: the integrand in Eq. (4.16) is the probability distribution for energy fluctuations. It is a gaussian centered on \bar{E}; the width is \delta E = \sqrt{2 k_B T^2 C_V}. The gaussian is very narrow: \delta E/E = O(N^{-1/2}) \approx 10^{-10} for a macroscopic system. This shows that the canonical ensemble will give the same results as the microcanonical: the energy fluctuations are very small.

4.4. The canonical recipe

With these results in hand, we can write down a procedure for deriving a fundamental relation from statistical physics. This is the most popular method in statistical thermodynamics.

4.4.1. Recipe

Here are the steps to do a calculation:

1. For a quantum system calculate:

Z(T, X, N) = \sum_i e^{-\beta E_i}.

For a classical system calculate:

Z(T, X, N) = \frac{1}{N!h^{3N}} \int d^{3N}q \, d^{3N}p \; e^{-\beta H(q,p)}.

2. Calculate F(T, X, N) = -k_B T \ln Z.

3. Calculate S = -(\partial F/\partial T)_{X,N}, f = -(\partial F/\partial X)_{T,N}. For example, for a fluid:

S = -(\partial F/\partial T)_{V,N}, \qquad p = -(\partial F/\partial V)_{T,N}.

These are the equations of state.

4. To average any quantity of interest, R(q, p), in a classical system compute:

\langle R \rangle = \frac{\int d^{3N}q \, d^{3N}p \; e^{-\beta H(q,p)} R(q,p)}{\int d^{3N}q \, d^{3N}p \; e^{-\beta H(q,p)}}.

In a quantum system, for any operator \hat{R} we need its diagonal matrix elements, \langle i|\hat{R}|i \rangle. Then the average is:

\langle R \rangle = \frac{\sum_i \langle i|\hat{R}|i \rangle e^{-\beta E_i}}{\sum_i e^{-\beta E_i}}.

4.4.2. Phase space density and density matrix

We can rewrite our results in terms of the phase space density. For the canonical ensemble, our results amount to saying:

\rho_c(q, p) = \frac{1}{N!h^{3N}} \frac{e^{-\beta H(r,p)}}{Z}.   (4.17)

The density matrix is:

\rho_{j,k} = \delta_{j,k} \frac{e^{-\beta E_j}}{Z}.   (4.18)

The density operator is:

\hat{\rho} = \frac{e^{-\beta \hat{H}}}{\mathrm{Tr}(e^{-\beta \hat{H}})}.   (4.19)

Averages are, as usual:

\langle R \rangle = \mathrm{Tr}(\hat{\rho}\hat{R}).   (4.20)
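To make Eq. (4.19) and Eq. (4.20) concrete, here is a small sketch in Python; the 2x2 Hamiltonian and the observable below are arbitrary illustrative choices, not taken from the text.

import numpy as np

beta = 2.0
H = np.array([[0.0, 0.3],
              [0.3, 1.0]])                  # an illustrative Hamiltonian matrix
w, U = np.linalg.eigh(H)                    # eigenvalues and eigenvectors of H
rho = U @ np.diag(np.exp(-beta * w)) @ U.T  # e^{-beta H} via diagonalization
rho /= np.trace(rho)                        # normalize: Eq. (4.19)
R = np.array([[1.0, 0.0],
              [0.0, -1.0]])                 # an illustrative observable
print(np.trace(rho @ R))                    # Eq. (4.20): <R> = Tr(rho R)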

4.5. Examples

With the recipe we can do a large number of very interesting computations. Here are a few examples.

4.5.1. Ideal paramagnet

For the paramagnet E = -\sum_j \mu\sigma_j h. The first step of the recipe is:

Z = \sum_{\{\sigma_i = \pm 1\}} e^{\beta\mu h \sum_j \sigma_j}
= \left[\sum_{\sigma_1 = \pm 1} e^{\beta\mu h \sigma_1}\right] \left[\sum_{\sigma_2 = \pm 1} e^{\beta\mu h \sigma_2}\right] \cdots
= \left(e^{\beta\mu h} + e^{-\beta\mu h}\right)^N.   (4.21)

The last equation is important: the partition function is the Nth power of a partition function for each spin: Z = (Z_1)^N. This always happens for non-interacting systems.

Now, the second step:

F = -N k_B T \ln(e^{\beta\mu h} + e^{-\beta\mu h}).   (4.22)

The equation of state is:

M = -\left(\frac{\partial F}{\partial h}\right)_T = N\mu \tanh \beta\mu h,   (4.23)

as in Eq. (3.7).

For small field we have:

M \approx N\mu^2 \beta h; \qquad N\chi = \frac{\partial M}{\partial h} = \frac{N\mu^2}{k_B T}.   (4.24)

This is the Curie law. In more conventional units the Curie law reads (for spin-1/2 particles):

M \approx \frac{N\mu^2}{k_B T} H.   (4.25)

Here M is the magnetic moment, and \mu the magnetic moment per particle.
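A quick numerical check of Eq. (4.23) and the Curie law follows; this is a sketch, with \mu = k_B = 1 and arbitrary example values of T and h.

import numpy as np

N, T = 1000, 2.0
M = lambda h: N * np.tanh(h / T)   # Eq. (4.23) with mu = 1
h = 1e-4                           # small field
chi = M(h) / (N * h)               # susceptibility per spin
print(chi, 1.0 / T)                # Curie law: chi = mu^2/(k_B T)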

4.5.2. Ideal gas

For the ideal gas H = \sum_j p_j^2/2m. The first step is:

Z = \frac{V^N}{N!h^{3N}} \int dp_1 \, dp_2 \ldots e^{-\beta \sum p_j^2/2m}
= \frac{V^N}{N!h^{3N}} \left[\int dp_1 \, e^{-\beta p_1^2/2m}\right] \left[\int dp_2 \, e^{-\beta p_2^2/2m}\right] \cdots
= \frac{1}{N!} \left[\frac{V}{h^3} \int dp \, e^{-\beta p^2/2m}\right]^N
= \left[\frac{Ve}{N\lambda^3}\right]^N, \qquad \lambda = (h^2/2\pi m k_B T)^{1/2}.   (4.26)

In the last step Stirling's approximation, N! \approx N^N e^{-N}, was used. Again, note the factorization of the partition function into the Nth power of a one-particle quantity.

The second step is to find the free energy:

F = -k_B T \ln Z = N k_B T (\ln(n\lambda^3) - 1), \qquad n = N/V.   (4.27)

The third step:

S = -\left(\frac{\partial F}{\partial T}\right)_{V,N} = N k_B \left(\frac{5}{2} + \ln(V/N) - \ln(\lambda^3)\right)

p = -\left(\frac{\partial F}{\partial V}\right)_{T,N} = N k_B T/V.   (4.28)

The first line is the Sackur-Tetrode equation, Eq. (3.12), expressed as a function of T.
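As a check of the Sackur-Tetrode formula against a real gas, here is a sketch in Python for argon at 300 K and 1 atm; the constants are standard SI values.

import numpy as np

h, kB, u = 6.626e-34, 1.381e-23, 1.661e-27   # Planck, Boltzmann, atomic mass unit
m, T, p = 39.95 * u, 300.0, 101325.0         # argon at room conditions
n = p / (kB * T)                             # number density from the ideal gas law
lam = h / np.sqrt(2 * np.pi * m * kB * T)    # thermal wavelength, Eq. (4.26)
print(2.5 - np.log(n * lam**3))              # S/(N k_B)

The result, about 18.6 k_B per atom, is close to the measured standard molar entropy of argon, roughly 155 J/(mol K).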

4.5.3. Ideal gas with internal degrees of freedom: diatomic molecules

We now return to the gas of molecules with internal degrees of freedom (such as rotations) which was discussed in 4.2. Consider, again, diatomic molecules. Now the Hamiltonian has several parts:

H = H_{trans} + H_{rot} + H_{vib},

corresponding to translation of the molecule as a whole, rotation about two axes with moment of inertia I, and vibrations, i.e. stretching of the chemical bond between the atoms. As usual when we have sums of independent parts of the Hamiltonian, the partition function factors:

Z = Z_1^N; \qquad Z_1 = Z_{trans} Z_{rot} Z_{vib}.

The first factor is the familiar integral arising from the momentum terms: Z_{trans} = Ve/N\lambda^3. Correspondingly, the free energy is a sum:

F = F_{trans} + F_{rot} + F_{vib}
= -N k_B T (\ln Z_{trans} + \ln Z_{rot} + \ln Z_{vib}).   (4.29)

And the specific heat, -T(\partial^2 F/\partial T^2), has three parts as well. We now discuss the last two factors.

Rotations

The rotations arise from the rotational kinetic energy L^2/2I. It is an elementary result of quantum theory (Messiah (1968), Sakurai & Napolitano (2010)) that for such a rotation the energy levels and degeneracies are:

\epsilon_J = \hbar^2 J(J+1)/2I, \qquad g(J) = 2J + 1, \qquad J = 0, 1, \ldots   (4.30)

To give a scale for the energies here we define the rotational temperature, T_R, by k_B T_R = \hbar^2/2I. For O2, T_R = 2.1 K; for N2, 2.9 K; well below the boiling point of these gases and very far below room temperature. The largest T_R's are for H2, 85.4 K, and HD (D stands for deuterium), 64 K.

For a diatomic molecule of different species such as HCl we can write the rotational partition function at once:

Z_{rot} = \sum_{J=0}^{\infty} (2J+1) e^{-\beta\hbar^2 J(J+1)/2I} = \sum_{J=0}^{\infty} (2J+1) e^{-J(J+1)T_R/T}.   (4.31)

At room temperature, we note that T_R/T \ll 1, and the terms in the sum differ very little on increasing J by unity. It is a very good approximation to replace the sum by an integral:

Z_{rot} \approx \int_0^{\infty} 2J e^{-T_R J^2/T} \, dJ = 2k_B T I/\hbar^2.

Now the free energy is F_{rot} = -N k_B T \ln(2k_B T I/\hbar^2), and the heat capacity is:

C_v^{rot} = -T(\partial^2 F_{rot}/\partial T^2) = N k_B,

as above.

For T < T_R we have a different situation. As a practical matter this only applies to HD. Now the sum in Eq. (4.31) is dominated by the first few terms, Z \approx 1 + 3e^{-2T_R/T}, and the heat capacity vanishes exponentially.

For a homo-nuclear molecule all this isn't quite right because of symmetry requirements. There results an interesting behavior for the heat capacity of H2 at low temperatures, an effect first explained by D. Dennison; see Problem 7.
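A quick numerical comparison of the exact sum in Eq. (4.31) with the classical integral Z_rot ~ T/T_R follows; this is a sketch, and the sample temperatures are arbitrary choices.

import numpy as np

def Z_rot(t, Jmax=400):                 # t = T/T_R; Jmax is a generous cutoff
    J = np.arange(Jmax)
    return np.sum((2 * J + 1) * np.exp(-J * (J + 1) / t))

for t in (0.5, 2.0, 10.0, 100.0):
    print(t, Z_rot(t))                  # compare with the classical value, t = T/T_R;
                                        # the agreement improves as t grows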

Vibrations

Diatomic molecules are bound by a chemical bond whose length we call r. The equilibrium value, r = L, is set by the atoms sitting at the minimum of the mutual potential energy well; compare Figure 1.1. Near the bottom of the well, we can expand, and replace the well with a quadratic approximation: V(r) \approx V(L) + K(r-L)^2/2 + \ldots. If we set u = r - L we have a quadratic degree of freedom, H = H' + Ku^2/2, which is the Hamiltonian for a simple harmonic oscillator.

We know how to quantize a simple harmonic oscillator (Messiah (1968), Sakurai & Napolitano (2010)). The energy levels are:

\epsilon_n = \hbar\omega(n + 1/2), \qquad \omega = \sqrt{K/m}, \qquad n = 0, 1, \ldots   (4.32)

To give an idea of the numbers here, we define a vibrational temperature, T_v = \hbar\omega/k_B. For O2, T_v = 2230 K. Since T_v is far above room temperature, we will always be considering small n, for which the quadratic approximation of the potential is adequate.
Now we can write:

Z_{vib} = e^{-\beta\hbar\omega/2} \sum_{n=0}^{\infty} e^{-n\beta\hbar\omega} = \frac{e^{-\beta\hbar\omega/2}}{1 - e^{-\beta\hbar\omega}}.   (4.33)

Differentiating, we find the heat capacity:

C_{vib} = N k_B \left(\frac{T_v}{T}\right)^2 \frac{e^{T_v/T}}{(e^{T_v/T} - 1)^2}.   (4.34)

This function is very small at room temperature since it is proportional to e^{-T_v/T} for T_v/T \gg 1. The missing degree of freedom which troubled Gibbs (see section 4.2) is frozen out by the large spacing of the vibrational energy levels.

4.5.4. Low temperature solids in the Einstein model

We now have the analysis we need to see how the Einstein model works at low T. We take over the result of Eq. (4.34) for the 3N oscillators of frequency \omega_E = k_B T_E/\hbar. This is the result plotted in Figure 4.1. The deviations from the Einstein model at low temperatures are due to the fact that in a real solid the frequencies are not all the same. This theory (the Debye theory) will be given in Chapter 6.

4.5.5. Interacting fluid

For the interacting fluid we have H = \sum_j p_j^2/2m + \Phi(\{r\}). The partition function divides into two factors:

Z = \frac{1}{N!h^{3N}} \int d^{3N}p \, e^{-\beta\sum p_j^2/2m} \int d^{3N}r \, e^{-\beta\Phi}
= \frac{V^N}{N!\lambda^{3N}} \cdot \frac{1}{V^N} \int d^{3N}r \, e^{-\beta\Phi}
= \frac{V^N}{N!\lambda^{3N}} Q_N \approx \left[\frac{Ve}{N\lambda^3}\right]^N Q_N.   (4.35)

The first factor is the ideal gas result, and the second is called^2 the configuration integral,

Q_N = \frac{1}{V^N} \int d^{3N}r \, e^{-\beta\Phi}.   (4.36)

^2 Our definition differs from the more usual one by the factor V^N.

The configuration integral cannot be evaluated exactly. There are a number of ways to approximate it analytically. For numerical methods, we see that it would be useful to evaluate it alone, since the translational degrees of freedom are treated exactly. One goal in evaluating Q_N would be to get insight into the non-trivial parts of the phase diagram such as the critical point.

Virial expansion

In the 1930s considerable effort was expended in finding an expansion of Q_N around the ideal gas result. When converted to an equation of state this takes the form of an expansion in density:

p/k_B T = n(1 + B_2(T)n + B_3(T)n^2 + \ldots).   (4.37)

The B_i are called virial coefficients.

A clever method of deriving the entire series of virial coefficients was devised by H. Ursell and J. E. Mayer. It is summarized in most modern books on statistical physics, and in Mayer's own book, Mayer & Mayer (1940). Unfortunately, the series is quite useless since its sum fails to get the critical point correctly; see Mayer & Mayer (1940). We will limit ourselves here to showing how to get the first correction, B_2, which is useful at low density.
We follow the method of Landau & Lifshitz (1980). First we recall the form of \Phi:

\Phi = \phi(r_{1,2}) + \phi(r_{1,3}) + \phi(r_{2,3}) + \ldots   (4.38)

The free energy can be written:

F = F_{trans} - k_B T \ln\left[\frac{1}{V^N} \int d^{3N}r \, e^{-\beta\Phi}\right]
= F_{trans} - k_B T \ln\left[\frac{1}{V^N} \int d^{3N}r \, \left((e^{-\beta\Phi} - 1) + 1\right)\right]
= F_{trans} - k_B T \ln\left[1 + \frac{1}{V^N} \int d^{3N}r \, (e^{-\beta\Phi} - 1)\right].   (4.39)

Now we use the fact that we are considering a density expansion. We suppose that at a given time in the whole sample there is only one pair that is close together, and that there are no higher-order collisions at all. This means that we are looking at a relatively small system. In this case only one of the terms in Eq. (4.38) is large enough to make the integrand much different from zero. However, we must sum over all the possible pairs that could be close, and take e^{-\beta\phi(r_{j,k})} = 1 for all the rest. Thus:

\frac{1}{V^N} \int d^{3N}r \, (e^{-\beta\Phi} - 1) \approx \frac{N(N-1)}{2V^2} \int dr_1 \, dr_2 \, (e^{-\beta\phi(r_{1,2})} - 1)
\approx \frac{N^2}{2V} \int ds \, (e^{-\beta\phi(s)} - 1).   (4.40)

In the last line we have gone to the center of mass frame of the two atoms.
Now we can expand the logarithm, assuming the correction is small:

F = F_{trans} - k_B T \frac{N^2}{2V} \int ds \, (e^{-\beta\phi(s)} - 1).   (4.41)

To get the pressure we differentiate:

p = -\left.\frac{\partial F}{\partial V}\right|_T = \frac{N k_B T}{V} + \frac{N^2 k_B T}{2V^2} \int ds \, (1 - e^{-\beta\phi(s)}).   (4.42)

Comparing to Eq. (4.37) gives:

B_2 = \frac{1}{2} \int ds \, (1 - e^{-\beta\phi(s)}).   (4.43)

Now we can try to approximate the integral. We can write:

B_2 = 2\pi \int_0^{\infty} r^2 \, dr \, f(r); \qquad f = 1 - e^{-\beta\phi}.

Recall that \phi = 4\epsilon[(\sigma/r)^{12} - (\sigma/r)^6] for the Lennard-Jones case. Note that the function f is approximately unity for r < \sigma because of the repulsive part of the potential. For r > \sigma we have only the small attractive part. We assume that \phi_{att}/k_B T < 1. So, crudely, we put:

B_2 \approx 2\pi \int_0^{\sigma} r^2 \, dr - 2\pi\beta \int_{\sigma}^{\infty} 4\epsilon\sigma^6 r^2 \, dr/r^6 = \frac{2\pi\sigma^3}{3}\left[1 - \frac{4\epsilon}{k_B T}\right].   (4.44)

Now B_2 is of the form b - a/k_B T. Thus, from Eq. (4.37):

p = n k_B T (1 + [b - a/k_B T]n)
= n k_B T (1 + bn) - an^2 \approx \frac{n k_B T}{1 - nb} - an^2.   (4.45)

This is the van der Waals equation, Eq. (2.18), with an approximate microscopic identification of a, b:

a = 8\pi\epsilon\sigma^3/3; \qquad b = 2\pi\sigma^3/3.

The temperature dependence of B_2 is qualitatively correct. The temperature at which B_2 = 0 is called the Boyle temperature.

4.6. Simulations in the canonical ensemble

Numerical solutions for systems at fixed temperature are widely used. In fact, the most common method in numerical computation for statistical systems is the one that we will describe here, the Metropolis Monte Carlo technique of Metropolis, Rosenbluth, Rosenbluth, Teller & Teller (1953). It was developed at Los Alamos in the late 1940s and early 50s, and arose out of the work of Stanislaw Ulam and John von Neumann. The term Monte Carlo was coined by Ulam: it refers to the famous gambling casino in Monaco; the method makes copious use of random sampling and random numbers.

There are numerous references on this subject, e.g. Newman & Barkema (1999), Gould et al. (2006), and chapters in Chandler (1987), Gould & Tobochnik (2010), Pathria & Beale (2011), and Peliti (2011). For some of the mathematics of Markov chains see Sethna (2006) and Van Kampen (2007).

4.6.1. Statistical sampling at fixed T

Suppose we want to compute the canonical average of a quantity, R, for example, the magnetic moment for an Ising system. According to Eq. (4.2) we need to sum over the states of the system, which we label by j:

\langle R \rangle = \frac{\sum_j R(j) e^{-\beta E_j}}{\sum_j e^{-\beta E_j}}.   (4.46)

In this expression, R(j) is the value of R in state j. For a small system we could simply sum over all the states. For 9 Ising spins we have to sum over 2^9 = 512 states, which is perfectly doable. However, for 256 spins there are 2^{256} \approx 1.2 \times 10^{77} states, far beyond the capacity of any computer. We must sample, i.e., pick a very much smaller subset of states, \{J\}, and estimate the result. If we pick the J at random we estimate:

\langle R \rangle \approx \frac{\sum_J R(J) e^{-\beta E_J}}{\sum_J e^{-\beta E_J}}.

However, this is likely to give very bad estimates. For example, at low T most of the states we pick will have very small Boltzmann factors and will be essentially discarded from the sum.

The trick which Metropolis et al. devised is to use a set \{\Gamma\} which prefers more important states. As a result, the method is called importance sampling. We will see, in a moment, how to generate the \Gamma's.

First we need to see how to use them to estimate averages. Suppose the \Gamma's generate a biased sample so that a state has a probability P(\Gamma). Then more probable states will occur more often than they should in Eq. (4.46), and we need to correct for this: if in a sample R(\Gamma) occurs twice as often as it should, we need to divide its contribution by 2. In general, to get a good estimate, we should put:

\langle R \rangle = \frac{\sum_\Gamma P^{-1}(\Gamma) R(\Gamma) e^{-\beta E_\Gamma}}{\sum_\Gamma P^{-1}(\Gamma) e^{-\beta E_\Gamma}}.   (4.47)

Now the P(\Gamma)'s are at our disposal. The best choice is to make every term in the sum equally important: presumably it will take computer resources to generate a term in the sum, so we want to use all of them. A nice choice will be: P(\Gamma) \propto e^{-\beta E_\Gamma}. In this case we have, for a sample of size N:

\langle R \rangle = \frac{\sum_\Gamma R(\Gamma)}{N}.   (4.48)

Our estimate is the simple numerical average over the sequence of \Gamma's. This is not the only choice for P which has proven useful. We may want to emphasize regions of the state space where R has values we are interested in. Methods like this are called umbrella sampling; see Chandler (1987). We will only discuss the simplest method.

4.6.2. Markov processes

We need to pick the set \{\Gamma\}. This sounds hard since the important states might be a very small fraction of all the possibilities. However, we know that the problem of finding the high probability regions of the state space cannot be too hard, since real dynamics does go to equilibrium in the real world. The strategy is to impose a simplified dynamics on the system. That is, \{\Gamma\} will be generated as a sequence of states arising from a fictitious dynamics. The equilibrium attained will be real; the way we get there need not be.

We will choose the \Gamma as states of a Markov chain. A Markov chain is a sequence that is generated by picking states one after another at discrete time steps according to prescribed time-independent transition probabilities, W. If we are in \Gamma at step t the probability of being in state \Gamma' at t + 1 is W(\Gamma \to \Gamma'). (We will usually abbreviate this to W(\Gamma, \Gamma').) Since we pick new states according to a probability, if we rerun the process we will, in general, get a different sequence. These different sequences are called sample paths.

We need to require that the W be interpreted as a probability, and that something happens at each step. This means that we require:

W(\Gamma, \Gamma') \geq 0; \qquad \sum_{\Gamma'} W(\Gamma, \Gamma') = 1.   (4.49)

It is quite possible for the state to stay the same at a step: that is, W(\Gamma, \Gamma) \neq 0. We do not require that the matrix of the W be symmetric. Note that the transition probabilities at a given step do not depend on how the path got to state \Gamma. Markov processes have no memory.

Master equation

Since the dynamics we are representing is ergodic, we require our process to be ergodic: that is, we must choose the W so that every state in our finite set of states can be reached in a finite number of steps from any other. If we start in some state, at time t later there will be a probability, P(\Gamma, t), for being in state \Gamma at time t. At the next time step:

P(\Gamma, t+1) = \sum_{\Gamma'} W(\Gamma', \Gamma) P(\Gamma', t).   (4.50)

This is called the master equation. Subtracting P(\Gamma, t) from both sides, and using Eq. (4.49), gives:

P(\Gamma, t+1) - P(\Gamma, t) = \sum_{\Gamma'} [W(\Gamma', \Gamma) P(\Gamma', t) - W(\Gamma, \Gamma') P(\Gamma, t)].   (4.51)

Sometimes it is useful to consider a continuous time process. In this case we replace the unit time step in Eq. (4.51) by a time step \tau, divide both sides of the equation by \tau, and replace the W's by transition rates \mathcal{W}. This gives:

dP(\Gamma)/dt = \sum_{\Gamma'} [\mathcal{W}(\Gamma', \Gamma) P(\Gamma', t) - \mathcal{W}(\Gamma, \Gamma') P(\Gamma, t)].

We will continue here with the discrete time version.

Approach to equilibrium

Our first goal must be to find out under what conditions the process
we invent actually does go to equilibrium. This discussion is a bit
technical.

If we write the P's as a column vector, \mathbf{P}, whose components are P(\Gamma), then we can read Eq. (4.50) as a matrix equation:

\mathbf{P}(t+1) = \mathsf{W}\mathbf{P}(t), \quad \mathbf{P}(t+2) = \mathsf{W}^2\mathbf{P}(t), \quad \ldots, \quad \mathbf{P}(t+n) = \mathsf{W}^n\mathbf{P}(t).   (4.52)

We are looking for equilibrium:

\mathbf{P}_e(t+1) = \mathbf{P}_e(t) = \mathsf{W}\mathbf{P}_e(t).

The equilibrium vector, if it exists, is an eigenvector of \mathsf{W} with eigenvalue 1:

[\mathsf{W} - \mathsf{I}]\mathbf{P}_e = 0; \qquad \det[\mathsf{W} - \mathsf{I}] = 0,   (4.53)

where \mathsf{I} is the unit matrix.

The existence of \mathbf{P}_e is easy to prove. The transpose of \mathsf{W} has an eigenvector with unit eigenvalue, namely the vector with all ones as components: this is just Eq. (4.49). The eigenvector of the transpose matrix is called a left eigenvector. Left eigenvectors and right eigenvectors do not need to be equal. But, since the determinant of \mathsf{W}^T is the same as the determinant of \mathsf{W}, a right eigenvector with the same eigenvalue must exist.
Further, no other eigenvalue is larger than 1, by the triangle inequality. For any eigenvector, \mathsf{W}\mathbf{P} = \lambda\mathbf{P}:

\sum_\Gamma |\lambda||P(\Gamma)| = \sum_\Gamma \left|\sum_{\Gamma'} W(\Gamma', \Gamma) P(\Gamma')\right| \leq \sum_{\Gamma,\Gamma'} W(\Gamma', \Gamma)|P(\Gamma')| = \sum_{\Gamma'} |P(\Gamma')|.

In the last step we summed over \Gamma and used Eq. (4.49). Thus |\lambda| \leq 1.
Thus equilibrium exists; the remaining question is whether it is unique, and whether we approach it properly. Ergodicity guarantees uniqueness. However, in the general case there is the possibility of limit cycles, namely states where we cycle through some set:

\Gamma_1 \to \Gamma_2 \to \ldots \to \Gamma_k \to \Gamma_1.   (4.54)

If we get caught in a limit cycle, we will not go to \mathbf{P}_e. If we have no cycles, then there is a famous theorem of Perron and Frobenius that guarantees approach to equilibrium. We do not need this strong result because we will impose a further condition on the process, that of detailed balance.

Detailed balance

Let us return to the master equation, Eq. (4.51), and note that for \mathbf{P}_e we must have:

\sum_{\Gamma'} W(\Gamma', \Gamma) P_e(\Gamma') = \sum_{\Gamma'} W(\Gamma, \Gamma') P_e(\Gamma).

This is certainly guaranteed if we make another assumption: we postulate that the two sums are equal term by term, so that we have a probability distribution such that:

W(\Gamma', \Gamma) P_e(\Gamma') = W(\Gamma, \Gamma') P_e(\Gamma).   (4.55)

This is called detailed balance. Real mechanics has this property, as do quantum transitions; see Van Kampen (2007). If we impose it on our artificial dynamics, it makes things easier. Of course, this \mathbf{P}_e has unit eigenvalue.

We can see that detailed balance eliminates cycles automatically. In equilibrium we cannot have Eq. (4.54) because we would be equally likely to traverse the sequence backwards.

Also, since P_e(\Gamma) > 0 we can symmetrize \mathsf{W}. Set:

V(\Gamma, \Gamma') = [P_e(\Gamma)]^{-1/2} \, W(\Gamma', \Gamma) \, [P_e(\Gamma')]^{1/2}.
A problem shows that \mathsf{V} is symmetric and that its eigenvectors obey:

S_i(\Gamma) = [P_e(\Gamma)]^{-1/2} P_i(\Gamma); \qquad \mathsf{V}\mathbf{S}_i = \lambda_i \mathbf{S}_i; \qquad \mathsf{W}\mathbf{P}_i = \lambda_i \mathbf{P}_i.   (4.56)

Since the eigenvectors of a real symmetric matrix span the space, we can expand any vector in eigenvectors of \mathsf{V}, and use this result to expand in eigenvectors of \mathsf{W}.

Result: approach to equilibrium

We have shown that if \mathsf{W} is ergodic and obeys detailed balance then for any starting point we can write:

\mathbf{P}(t) = c_1 \mathbf{P}_e + \sum_i c_i \mathbf{P}_i; \qquad \mathsf{W}\mathbf{P}_e = \mathbf{P}_e; \qquad \mathsf{W}\mathbf{P}_i = \lambda_i \mathbf{P}_i.   (4.57)

From Eq. (4.52):

\mathbf{P}(t+n) = c_1 \mathbf{P}_e + \sum_i c_i \lambda_i^n \mathbf{P}_i.   (4.58)

Since the eigenvalues obey |\lambda_i| < 1, the sum becomes negligible for large n, and we must have c_1 = 1, \mathbf{P}(t+n) \to \mathbf{P}_e. If we write \lambda_i = e^{-\gamma_i} we identify the \gamma's as relaxation rates.
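These results are easy to see in action. Here is a minimal sketch in Python iterating Eq. (4.52) for a two-level system with \beta\epsilon = 1 (the same example is solved analytically in section 4.6.4); \mathbf{P}(t) relaxes to \mathbf{P}_e:

import numpy as np

w = np.exp(-1.0)                      # Boltzmann factor, beta*eps = 1
W = np.array([[1.0 - w, 1.0],
              [w,       0.0]])        # W[i,j] = prob(j -> i); columns sum to 1
P = np.array([1.0, 0.0])              # start entirely in the ground state
for t in range(20):
    P = W @ P                         # one step of the master equation
print(P, np.array([1.0, w]) / (1.0 + w))  # compare with the equilibrium eigenvector

The second eigenvalue here is -e^{-\beta\epsilon}, so the approach to equilibrium is exponential in the number of steps, as in Eq. (4.58).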

4.6.3. Metropolis algorithm

We need to devise a Markov process that is ergodic and obeys detailed balance. Detailed balance allows us to build in the equilibrium we want, namely the Boltzmann distribution. Thus we need, from Eq. (4.55):

\frac{P_e(\Gamma)}{P_e(\Gamma')} = \frac{e^{-\beta E_\Gamma}}{e^{-\beta E_{\Gamma'}}} = \frac{W(\Gamma' \to \Gamma)}{W(\Gamma \to \Gamma')}.   (4.59)

There are many algorithms that satisfy this equation. The most popular and most efficient (see Newman & Barkema (1999)) is that of Metropolis et al. (1953). It involves several steps:

1. Start in state n, and try a move to state m.
2. Compute \Delta = E_m - E_n.
3. If \Delta \leq 0 accept the move, i.e. put the system in state m.
4. If \Delta > 0 accept the move with probability e^{-\beta\Delta}. That is, choose a random number u \in [0, 1], and if u < e^{-\beta\Delta} move to state m; otherwise do nothing.

This does satisfy Eq. (4.59) because if \Delta > 0, E_m > E_n, the probability of moving n \to m is e^{-\beta(E_m - E_n)} and the probability of moving m \to n is 1.

4.6.4. Monte Carlo simulations

Here is a very simple example to illustrate the technique. Consider a two-level system, n = 0, 1, with energies E_0 = 0 and E_1 = \epsilon. An elementary application of the canonical recipe gives \langle n \rangle = 1/(e^{\beta\epsilon} + 1). Now we implement Monte Carlo: if you start in n = 0 you go to n = 1 with probability e^{-\beta\epsilon}. If you start in n = 1 you go to n = 0 with probability 1. Here is the matrix \mathsf{W}:

\begin{pmatrix} 1 - e^{-\beta\epsilon} & 1 \\ e^{-\beta\epsilon} & 0 \end{pmatrix}

It is easy to see that this vector is an eigenvector with eigenvalue unity:

\begin{pmatrix} 1 \\ e^{-\beta\epsilon} \end{pmatrix}

Figure 4.2.: Monte Carlo results for a two-level system: \langle n \rangle as a function of T. The solid line is the exact result.

Here is a code fragment to calculate \langle n \rangle:

from numpy import exp, arange
from numpy.random import random

T = 1.0                      # temperature in units of eps/k_B (example value)
maxsteps = 100000
n = 1                        # start in the excited state
nsum = 0.0
w = exp(-1.0 / T)            # Boltzmann factor exp(-beta*eps), with eps = 1
for j in arange(maxsteps):
    if n == 1:               # the downward move is always accepted
        nn = 0
    if n == 0:               # the upward move is accepted with probability w
        u = random()
        if u <= w:
            nn = 1
        else:
            nn = 0
    n = nn
    nsum = nsum + n
nave = nsum / maxsteps       # estimate of <n>

Results for various T are plotted in Figure 4.2.

There is now an enormous industry in doing Monte Carlo simulations of the type we have described. The first application (Metropolis et al. (1953)) was to the hard-sphere gas in 2d and the second (Rosenbluth & Rosenbluth (1954)) was to 3d hard spheres, and to the Lennard-Jones gas.

A modern version of the second computation goes as follows. From above we see that we need only compute the configuration integral, Q_N. To do this, start with some configuration of atoms, and try a move; in this case, simply pick another place for one of the atoms, e.g. at random. Then compute \Delta and proceed as above. It is wise to monitor the energy to guess if the system has approached equilibrium, and then start to average. To get the pressure use the virial equation, Eq. (2.11). A program that does this is in Appendix B. We will use this method in Chapter 8.

For the Ising model the procedure is still simpler. Start with an arbitrary arrangement of spins. Pick a spin at random, and flip it according to the Metropolis scheme. After relaxation has occurred, monitor the energy and magnetic moment. From problem 3, below, compute the heat capacity and magnetic susceptibility. This is the subject of problem 5.
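A minimal sketch of this procedure for the 1d Ising model in zero field follows; the parameters are arbitrary choices, and this is a starting point for problem 5, not a full solution.

import numpy as np

rng = np.random.default_rng()
N, J, T, steps = 100, 1.0, 2.0, 200000
s = rng.choice([-1, 1], size=N)            # random initial spin configuration
Esum = 0.0
for t in range(steps):
    i = rng.integers(N)                    # pick a spin at random
    dE = 2.0 * J * s[i] * (s[i - 1] + s[(i + 1) % N])   # cost of flipping it
    if dE <= 0 or rng.random() < np.exp(-dE / T):
        s[i] = -s[i]                       # Metropolis acceptance step
    Esum += -J * np.sum(s * np.roll(s, 1)) # accumulate the energy
print(Esum / (steps * N))                  # energy per spin; exact: -J*tanh(J/T)

(For simplicity the average here starts immediately; in practice one discards an initial equilibration period, as noted above.)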

Suggested reading

The canonical ensemble is treated in all the general references, e.g.:
  Huang (1987)
For the rotational and vibrational heat capacities and the low temperature behavior of H2 see:
  Pathria & Beale (2011)
  Landau & Lifshitz (1980)
For the Mayer expansion the original source is:
  Mayer & Mayer (1940)
The modern treatment in:
  Pathria & Beale (2011)
is particularly detailed. For a quick treatment of B_2 alone, see:
  Landau & Lifshitz (1980)
Monte Carlo methods are in:
  Newman & Barkema (1999)
  Gould et al. (2006)
  Chandler (1987)
  Gould & Tobochnik (2010)
For Markov chains see:
  Sethna (2006)
  Van Kampen (2007)

Problems

1. Consider a tall cylinder with a classical ideal gas of N particles inside. Because of gravity, the gas is denser at the bottom. Figure out the partition function, the free energy, and the heat capacity.

2. Derive the formula for the vibrational heat capacity, Eq. (4.34). Plot it as a function of T/T_v. It should resemble Figure 4.1.

3. a.) Prove that Var(E) = k_B T^2 C, where C is the heat capacity.

b.) Prove for the Ising system that the magnetic susceptibility, \chi \equiv (1/N)\,\partial M/\partial h|_T, is given by:

N\chi = \frac{1}{k_B T} Var(M).

c.) Show that for the zero-field susceptibility for the Ising model in the disordered phase you can write:

\chi = \frac{1}{k_B T} \sum_j \langle \sigma_j \sigma_0 \rangle_0.

The average uses the Hamiltonian with h = 0. Use Eq. (1.27).

d.) For the Ising case express \chi in terms of the spin-spin correlation function, Eq. (1.33). This is a simple example of a fluctuation-dissipation relation.
4. Consider the Joule-Thomson process, pushing a gas through a porous plug; see Figure 4.3.

a.) It is a standard result that the enthalpy is conserved in the process. Prove this by figuring out the work done on both sides of the plug to move a parcel of gas from left to right.

b.) Prove that the temperature change per unit pressure change can be written:

\partial T/\partial p|_H = (N/C_p)(T \, dB_2/dT - B_2),

where B_2 is the second virial coefficient, and we neglect higher order terms.

Hints: You should compute \partial T/\partial p|_H. It will help to prove:

\left.\frac{\partial H}{\partial T}\right|_p = C_p, \qquad \left.\frac{\partial H}{\partial p}\right|_T = V + T \left.\frac{\partial S}{\partial p}\right|_T.

Figure 4.3.: The Joule-Thomson effect. Gas is forced through a porous plug (center); p_1 > p_2. The flow is slow enough that both sides of the system can be considered to be in equilibrium.

Also prove that, for small density, from the virial expansion:

V \approx N k_B T/p + N B_2.

c.) Show that for B_2 = b - a/k_B T, Eq. (4.44), the gas cools at low T and heats at high T.
5. a.) Write Metropolis Monte Carlo code for the 1d Ising model, and verify the result in Eq. (4.23).

b.) Write a 2d code and use the result of Problem 3a to find the heat capacity as a function of T. Be sure you go through the phase transition.

6. Use detailed balance to prove that \mathsf{V} is symmetric and that Eq. (4.56) is correct.

7. An interesting application of quantum statistics is to the rotational states of H2 gas (see Landau & Lifshitz (1980)).

a.) The two nuclei, protons in this case, are identical and have spin 1/2. Quantum mechanics says that if the total nuclear spin is 1 (orthohydrogen) then only odd J are allowed in the rotations, and for spin 0 (parahydrogen) only even J are allowed. Explain this from quantum theory.

b.) Show that the equilibrium ratio of ortho (o) to para (p) is:

x = N_o/N_p = 3Z_{odd}/Z_{even}

Z_{even,odd} = \sum (2J+1) e^{-(\beta\hbar^2/2I) J(J+1)}

where the sum is over even and odd J's respectively, and I is the moment of inertia.

c.) What is the low and high temperature limit of x?

d.) David Dennison, late of the University of Michigan physics department, pointed out that the heat capacity of H2 gas at T's low compared to \hbar^2/2Ik_B could be understood by assuming that x takes a long time to equilibrate, so that when you cool the gas it retains the high-T value. Show that this means that C^{rot} for H2 is much smaller than it would be in equilibrium. Find the leading temperature dependence of C^{rot} at low T for the two cases. (This calculation was the first solid evidence that the proton has spin.)

8. a.) Consider a system of N 1d classical anharmonic oscillators for which the potential is V(x) = cx^2 + fx^4. Suppose that the extra term is small. Show that the approximate heat capacity is C \approx N k_B (1 - [3f k_B T/2c^2]).

b.) Suppose two molecules are bound by the Lennard-Jones potential. Find the equilibrium distance between the molecules, r^*. Expand the potential around r^* to third order, and figure out the shift of the equilibrium with temperature. Assume the shift is small. (Use the same technique as in part a.) This is the origin of thermal expansion of most solids.

9. Classical theory of paramagnetism: suppose a molecule has magnetic moment m and is in a magnetic field B which is in the z-direction. Use the Hamiltonian H = -\mathbf{m} \cdot \mathbf{B} = -mB\cos(\theta), where \theta is the polar angle. Figure out the average magnetic moment along the z direction as a function of T, i.e., \langle m\cos(\theta) \rangle. Expand the result for high temperature, and find the magnetic susceptibility, \lim_{B \to 0} \partial\langle m \rangle/\partial B.

10. Classical theory of diamagnetism (Bohr - van Leeuwen theorem): consider a classical gas of charged particles in a constant, uniform external field in the z-direction, B_z. We might expect that the field would induce eddy currents by Lenz's law, and give rise to a diamagnetic moment. Show that this is false: the magnetic moment of the gas is zero for any field.

Hint: Show that the external field can be taken to arise from a vector potential, \mathbf{A} = (0, xB_z, 0). Write the Hamiltonian from Eq. (1.2). Then, by a change of variables, show that Z is independent of B_z.
