Stat Phys 2010
G. Falkovich
http://www.weizmann.ac.il/home/fnfal/papers/statphys2010.pdf
More is different
P.W. Anderson
Contents
1 Thermodynamics (brief reminder)
  1.1 Basic notions
  1.2 Legendre transform
  1.3 Stability of thermodynamic systems

4 Gases
  4.1 Ideal Gases
    4.1.1 Boltzmann (classical) gas
  4.2 Fermi and Bose gases
    4.2.1 Degenerate Fermi Gas
    4.2.2 Photons
    4.2.3 Phonons
    4.2.4 Bose gas of particles and Bose-Einstein condensation

5 Non-ideal gases
  5.1 Cluster and virial expansions
  5.2 Van der Waals equation of state
  5.3 Coulomb interaction and screening

6 Phase transitions
  6.1 Thermodynamic approach
    6.1.1 Necessity of the thermodynamic limit
    6.1.2 First-order phase transitions
    6.1.3 Second-order phase transitions
    6.1.4 Landau theory
  6.2 Ising model
    6.2.1 Ferromagnetism
    6.2.2 Impossibility of phase coexistence in one dimension
    6.2.3 Equivalent models

7 Fluctuations
  7.1 Thermodynamic fluctuations
  7.2 Spatial correlation of fluctuations
  7.3 Different order parameters
    7.3.1 Goldstone mode and Mermin-Wagner theorem
    7.3.2 Berezinskii-Kosterlitz-Thouless phase transition
    7.3.3 Higgs mechanism
  7.4 Universality classes and renormalization group

9 Response and fluctuations
  9.1 Static response
  9.2 Temporal correlation of fluctuations
  9.3 Spatio-temporal correlation function
  9.4 General fluctuation-dissipation relation
  9.5 Central limit theorem and large deviations
This is a graduate one-semester course. Chapters 1, 2 and 4 briefly recall what is supposed to be known from undergraduate courses, using somewhat more sophisticated language.
to measure energy changes in principle, we need adiabatic processes where
there is no heat exchange. We wish to establish the energy of a given system
in equilibrium states which are those that can be completely characterized
by the static values of extensive parameters like energy E, volume V and mole number N (number of particles divided by the Avogadro number 6.02 × 10²³). Other extensive quantities may include numbers of different sorts of particles, electric and magnetic moments etc., i.e. everything whose value for
a composite system is a direct sum of the values for the components. For
a given system, any two equilibrium states A and B can be related by an
adiabatic process either A → B or B → A, which allows one to measure the
difference in the internal energy by the work W done by the system. Now,
if we encounter a process where the energy change is not equal to minus the
work done by the system, we call the difference the heat flux into the system:
dE = δQ − δW . (1)
E(S, X). Then (∂S/∂X)E = 0 ⇒ (∂E/∂X)S = −(∂S/∂X)E (∂E/∂S)X = 0. Differentiating the last relation once more we get (∂²E/∂X²)S = −(∂²S/∂X²)E (∂E/∂S)X, since the term containing the derivative of the second factor vanishes at the extremum where (∂S/∂X)E = 0. We thus see that the equilibrium is defined by the energy minimum instead of the entropy maximum (very much like a circle can be defined as the figure of either maximal area for a given perimeter or of minimal perimeter for a given area). On the figure, unconstrained equilibrium states lie on the curve while all other states lie below. One can reach the state A either maximizing entropy at a given energy or minimizing energy at a given entropy:

[Figure: the curve S(E) of equilibrium states; the state A is reached either by maximizing S at fixed E or by minimizing E at fixed S]
derivatives (2) are defined only in equilibrium. Therefore, δQ = T dS and δW = P dV − µdN for quasi-static processes, i.e. such that the system is close to equilibrium at every point of the process. A process can be considered quasi-static if its typical time of change is larger than the relaxation times (which for pressure can be estimated as L/c and for temperature as L²/κ, where L is the system size, c the sound velocity and κ the thermal conductivity). Finite deviations from equilibrium make dS > δQ/T because entropy can increase without heat transfer.
Let us give an example of how the entropy maximum principle solves the basic problem. Consider two simple systems separated by a rigid wall which is impermeable to anything but heat. The whole composite system is closed, that is, E1 + E2 = const. The entropy change under the energy exchange,

dS = (∂S1/∂E1) dE1 + (∂S2/∂E2) dE2 = dE1/T1 + dE2/T2 = (1/T1 − 1/T2) dE1 ,
must be positive which means that energy flows from the hot subsystem to
the cold one (T1 > T2 ⇒ ∆E1 < 0). We see that our definition (2) is in
agreement with our intuitive notion of temperature. When equilibrium is
reached, dS = 0, which requires T1 = T2. If the fundamental relation is known, then so is the function T(E, V). Two equations, T(E1, V1) = T(E2, V2) and
E1 + E2 =const completely determine E1 and E2 . In the same way one can
consider movable wall and get P1 = P2 in equilibrium. If the wall allows for
particle penetration we get µ1 = µ2 in equilibrium.
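A minimal numerical sketch of this procedure (all parameters are illustrative, not from the text): for two ideal monatomic gases with Ti = 2Ei/3Ni (units k = 1), the root of T1(E1) = T2(E − E1) fixes the energy partition.

```python
# Energy exchange between two ideal monatomic gases through a rigid
# diathermal wall: solve T1(E1) = T2(E - E1) for the equilibrium E1.
from scipy.optimize import brentq

N1, N2 = 2.0, 3.0          # particle numbers of the two subsystems (arbitrary)
E = 10.0                   # total energy, E1 + E2 = E = const

T = lambda Ei, Ni: 2.0 * Ei / (3.0 * Ni)   # ideal gas: E = (3/2) N T

# Root of T1(E1) - T2(E - E1) = 0 determines the energy partition.
E1 = brentq(lambda e: T(e, N1) - T(E - e, N2), 1e-9, E - 1e-9)
print(E1, E * N1 / (N1 + N2))  # numerical root vs the obvious answer E*N1/(N1+N2)
```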
Both energy and entropy are homogeneous first-order functions of their vari-
ables: S(λE, λV, λN ) = λS(E, V, N ) and E(λS, λV, λN ) = λE(S, V, N )
(here V and N stand for the whole set of extensive macroscopic parame-
ters). Differentiating the second identity with respect to λ and taking it at
λ = 1 one gets the Euler equation
E = T S − P V + µN . (4)
Let us show that there are only two independent parameters for a simple one-
component system, so that chemical potential µ, for instance, can be found
as a function of T and P . Indeed, differentiating (4) and comparing with (3)
one gets the so-called Gibbs-Duhem relation (in the energy representation)
N dµ = −SdT + V dP or for quantities per mole, s = S/N and v = V /N :
dµ = −sdT + vdP . In other words, one can choose λ = 1/N and use
first-order homogeneity to get rid of N variable, for instance, E(S, V, N ) =
N E(s, v, 1) = N e(s, v). In the entropy representation the Gibbs-Duhem relation again states that the sum of products of the extensive parameters and the differentials of the corresponding intensive parameters vanishes: E d(1/T) + V d(P/T) − N d(µ/T) = 0.
One uses µ(P, T ), for instance, when considering systems in the external
field. One then adds the potential energy (per particle) u(r) to the chemical
potential so that the equilibrium condition is µ(P, T ) + u(r) =const. Par-
ticularly, in the gravity field u(r) = mgz and differentiating µ(P, T ) under
T = const one gets vdP = −mgdz. Introducing density ρ = m/v one gets
the well-known hydrostatic formula P = P0 − ρgz. For composite systems,
the number of independent intensive parameters (thermodynamic degrees of
freedom) is the number of components plus one.
Processes. While thermodynamics is fundamentally about states, it is also used for describing processes that connect states. Particularly important questions concern the performance of engines and heaters/coolers. A heat engine works by delivering heat from a reservoir with some higher T1 via some system to another reservoir with T2, doing some work in the process. If the entropy
of the hot reservoir decreases by some ∆S1 then the entropy of the cold one
must increase by some ∆S2 ≥ ∆S1 . The work ∆W is the difference between
the heat given by the hot reservoir T1 ∆S1 and the heat absorbed by the cold
one T2 ∆S2 . It is clear that maximal work is achieved for minimal entropy
change ∆S2 = ∆S1 , which happens for reversible (quasi-static) processes —
if, for instance, the system is a gas which works by moving a piston then the
pressure of the gas and the work are less for a fast-moving piston than in
equilibrium. Engine efficiency is the fraction of heat used for work, that is

∆W/∆Q1 = (∆Q1 − ∆Q2)/∆Q1 = 1 − T2∆S2/T1∆S1 ≤ 1 − T2/T1 .
4) adiabatic compression until the temperature reaches T1 . The auxiliary
system does work during 1 and 2 increasing the energy of our system which
then decreases its energy by working on the auxiliary system during 3 and 4.
The total work is the area in the graph. For heat transfer, one reverses the
direction.
[Figure: the Carnot cycle in the T-S plane (a rectangle 1-2-3-4 between T1 and T2) and in the P-V plane; the enclosed area is the total work]
PV = NRT , E = 3NRT/2 . (6)
Here e0 , v0 are parameters of the state of zero internal energy used to deter-
mine the temperature units, and s0 is the constant of integration.
To fix the shift one may consider the curve as the envelope of the family of tangent lines characterized by the slope P and the position ψ of the intercept of the Y-axis. The function ψ(P) = Y[X(P)] − P X(P) completely defines the curve; here one substitutes X(P) found from P = ∂Y(X)/∂X (which is possible only when ∂P/∂X = ∂²Y/∂X² ≠ 0). The function ψ(P) is referred
to as a Legendre transform of Y (X). From dψ = −P dX − XdP + dY =
−XdP one gets −X = ∂ψ/∂P i.e. the inverse transform is the same up to
a sign: Y = ψ + XP . In mechanics, we use the Legendre transform to pass
from Lagrangian to Hamiltonian description.
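A small numerical illustration of this construction (the curve Y = X² is an arbitrary choice, not from the text):

```python
# Legendre transform psi(P) = Y - P X for Y(X) = X^2 (so P = 2X, psi = -P^2/4),
# plus a check that the inverse transform recovers Y.
Y = lambda X: X**2
X_of_P = lambda P: P / 2.0                       # invert P = dY/dX = 2X
psi = lambda P: Y(X_of_P(P)) - P * X_of_P(P)     # psi(P) = -P^2/4

P, h = 3.0, 1e-6
X_back = -(psi(P + h) - psi(P - h)) / (2 * h)    # -X = dpsi/dP, numerically
print(X_back, X_of_P(P))                         # both 1.5
print(psi(P) + X_back * P, Y(X_back))            # Y = psi + X P recovered
```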
[Figure: the curve Y(X) with a tangent line of slope P intercepting the Y-axis at ψ = Y − XP]
V dP + µdN + . . . and is minimal in equilibrium at constant temperature
and pressure. From (4) we get (remember, they all are functions of different
variables):
F = E − TS = µN − PV , G = E − TS + PV = µN , H = E + PV = TS + µN .

[Diagram: the thermodynamic square with the potentials E, F, G, H on the sides between their natural variables S, V, T, P]
Maxwell relations are given by the corners of the diagram, for example,
(∂V/∂S)P = (∂T/∂P)S etc. If we consider constant N then any fundamental relation of a single-component system is a function of only two variables and therefore has only three independent second derivatives. Traditionally, all derivatives are expressed via three basic ones (those of the Gibbs potential): the specific heat and the coefficient of thermal expansion, both at constant
pressure, and isothermal compressibility:
cP = T (∂S/∂T)P = −T (∂²G/∂T²)P , α = V⁻¹ (∂V/∂T)P , κT = −V⁻¹ (∂V/∂P)T .
has a definite sign if the determinant is positive: SEE SVV − SEV² ≥ 0. Manipulating derivatives one can show that this is equivalent to (∂P/∂V)S ≤ 0. Alternatively, one may consider the energy representation, where stability requires the energy minimum, which gives ESS = T/cv ≥ 0, EVV = −(∂P/∂V)S ≥ 0. Considering both variations one can diagonalize d²E = ESS(dS)² + EVV(dV)² + 2ESV dS dV by introducing the temperature differential dT = ESS dS + ESV dV, so that 2d²E = ESS⁻¹(dT)² + (EVV − ESV² ESS⁻¹)(dV)². It is thus clear that EVV − ESV² ESS⁻¹ = (∂²E/∂V²)T = −(∂P/∂V)T, and we conclude that (∂P/∂V)T ≤ 0 as well.
2 Basic statistical physics (brief reminder)
Here we introduce microscopic statistical description in the phase space and
describe three principal ways (microcanonical, canonical and grand canoni-
cal) to derive thermodynamics from statistical mechanics.
The main idea is that ρ(p, q) for a subsystem does not depend on the initial
states of this and other subsystems so it can be found without actually solving
equations of motion. We define statistical equilibrium as a state where macroscopic quantities are equal to their mean values. Assuming short-range forces we
conclude that different macroscopic subsystems interact weakly and are sta-
tistically independent so that the distribution for a composite system ρ12 is
factorized: ρ12 = ρ1 ρ2 .
Now, we take the ensemble of identical systems starting from different
points in phase space. In a flow with the velocity v = (ṗ, q̇) the density
changes according to the continuity equation: ∂ρ/∂t + div (ρv) = 0. If the
motion is considered for not very large time it is conservative and can be
described by the Hamiltonian dynamics: q̇i = ∂H/∂pi and ṗi = −∂H/∂qi .
Hamiltonian flow in the phase space is incompressible: div v = ∂ q̇i /∂qi +
∂ ṗi /∂pi = 0. That gives the Liouville theorem: dρ/dt = ∂ρ/∂t + (v · ∇)ρ = 0
that is the statistical distribution is conserved along the phase trajectories
of any subsystem. As a result, equilibrium ρ must be expressed solely via
the integrals of motion. Since ln ρ is an additive quantity, it must be expressed linearly via the additive integrals of motion, which for a general mechanical system are the energy E(p, q), the momentum P(p, q) and the angular momentum M(p, q):

ln ρa = αa + βEa(p, q) + c · Pa(p, q) + d · Ma(p, q) . (12)
Here αa is the normalization constant for a given subsystem while the seven
constants β, c, d are the same for all subsystems (to ensure additivity) and
are determined by the values of the seven integrals of motion for the whole
system. We thus conclude that the additive integrals of motion are all we need to get the statistical distribution of a closed system (and any subsystem); those integrals replace all the enormous microscopic information. Considering a system which neither moves nor rotates, we are down to a single integral, the energy. For any subsystem (or any system in contact with a thermostat) we get the Gibbs canonical distribution

ρ(p, q) = Z⁻¹ exp[−E(p, q)/T] . (13)
Usually one considers the energy fixed with the accuracy ∆ so that the mi-
crocanonical distribution is
ρ = 1/Γ for E ∈ (E0, E0 + ∆), ρ = 0 otherwise , (15)

where Γ is the volume of the phase space occupied by the system:

Γ(E, V, N, ∆) = ∫_{E<H<E+∆} d^{3N}p d^{3N}q . (16)
For example, for N noninteracting particles (ideal gas) the states with the energy E = Σ p²/2m are in the p-space near the hyper-sphere with the radius √(2mE). Recall that the surface area of the hyper-sphere with the radius R in 3N-dimensional space is 2π^{3N/2}R^{3N−1}/(3N/2 − 1)!, and we have

Γ(E, V, N, ∆) = π^{3N/2} (2mE)^{3N/2} V^N ∆ / (3N/2 − 1)! E . (17)
To link statistical physics with thermodynamics one must define the fundamental relation, i.e. a thermodynamic potential as a function of the respective variables. It can be done using either the canonical or the microcanonical distribution. We start from the latter and introduce the entropy as

S(E, V, N) = ln Γ(E, V, N) . (18)
This is one of the most important formulas in physics⁸ (on a par with F = ma, E = mc² and E = h̄ω).

⁸ It is inscribed on Boltzmann's gravestone.
Noninteracting subsystems are statistically independent so that the sta-
tistical weight of the composite system is a product and entropy is a sum.
For interacting subsystems, this is true only for short-range forces in the
thermodynamic limit N → ∞. Consider two subsystems, 1 and 2, that
can exchange energy. Assume that the indeterminacy in the energy of any
subsystem, ∆, is much less than the total energy E. Then
Γ(E) = Σ_{i=1}^{E/∆} Γ1(Ei) Γ2(E − Ei) . (19)
We denote Ē1 , Ē2 = E − Ē1 the values that correspond to the maximal
term in the sum (19). The derivative of it is proportional to (∂Γ1/∂E1)Γ2 − (∂Γ2/∂E2)Γ1 = Γ1Γ2 [(∂S1/∂E1)Ē1 − (∂S2/∂E2)Ē2]. Then the extremum
condition is evidently (∂S1 /∂E1 )Ē1 = (∂S2 /∂E2 )Ē2 , that is the extremum
corresponds to the thermal equilibrium where the temperatures of the subsys-
tems are equal. The equilibrium is thus where the maximum of probability is.
It is obvious that Γ1(Ē1)Γ2(Ē2) ≤ Γ(E) ≤ Γ1(Ē1)Γ2(Ē2)E/∆. If the system consists of N particles and N1, N2 → ∞ then S(E) = S1(Ē1) + S2(Ē2) + O(ln N)
where the last term is negligible.
Identification with the thermodynamic entropy can be done considering any system, for instance, an ideal gas. The problem is that the logarithm of (17) contains the non-extensive term N ln V. The resolution of this controversy is that we need to treat the particles as indistinguishable; otherwise we would need to account for the entropy of mixing different species. We however implicitly assume that mixing different parts of the same gas is a reversible process, which presumes that the particles are identical. For identical particles, one needs to divide Γ (17) by the number of permutations N!, which makes the resulting entropy of the ideal gas extensive: S(E, V, N) = (3N/2) ln(E/N) + N ln(V/N) + const. Note that quantum particles (atoms and molecules) are indeed indistinguishable, which is expressed
by a proper symmetrization of the wave function. One can only wonder
at the genius of Gibbs who introduced N ! long before quantum mechan-
ics (see, L&L 40 or Pathria 1.5 and 6.1). Defining temperature in a usual
way, T −1 = ∂S/∂E = 3N/2E, we get the correct expression E = 3N T /2.
We express here temperature in the energy units. To pass to Kelvin de-
grees, one transforms T → kT and S → kS where the Boltzmann constant
k = 1.38 · 10⁻²³ J/K. The value of the classical entropy (18) depends on the units.
Proper quantitative definition comes from quantum physics with Γ being the
number of microstates that correspond to a given value of macroscopic pa-
rameters. In the quasi-classical limit the number of states is obtained by
dividing the phase space into units with ∆p∆q = 2πh̄.
The same definition (entropy as a logarithm of the number of states)
is true for any system with a discrete set of states. For example, consider
the set of N two-level systems with levels 0 and ϵ. If the energy of the set is E then there are L = E/ϵ upper levels occupied. The statistical weight is
determined by the number of ways one can choose L out of N : Γ(N, L) =
CNL = N !/L!(N − L)!. We can now define entropy (i.e. find the fundamental
relation): S(E, N ) = ln Γ. Considering N ≫ 1 and L ≫ 1 we can use the
Stirling formula in the form d ln L!/dL = ln L and derive the equation of
state (temperature-energy relation),
T⁻¹ = ∂S/∂E = ϵ⁻¹ (∂/∂L) ln[N!/L!(N − L)!] = ϵ⁻¹ ln[(N − L)/L] ,
and the specific heat C = dE/dT = (N/4)(ϵ/T)² cosh⁻²(ϵ/2T). Note that the ratio of the number of particles on the upper level to those on the lower level is L/(N − L) = exp(−ϵ/T) (the Boltzmann relation). The specific heat turns into zero both at low temperatures (too small portions of energy are "in circulation") and at high temperatures (the occupation numbers of the two levels are already close to equal).
The derivation of thermodynamic fundamental relation S(E, . . .) in the
microcanonical ensemble is thus via the number of states or phase volume.
consisting of infinitely many copies of our system — this is the so-called canonical ensemble, characterized by N, V, T. Here our system can have any energy and the question arises what is the probability W(E). Let us first find the probability for the system to be in a given microstate a with the energy E. Assuming that all the states of the thermostat are equally likely to occur, we see that the probability should be directly proportional to the statistical weight of the thermostat Γ0(E0 − E), where we evidently assume that E ≪ E0, expand Γ0(E0 − E) = exp[S0(E0 − E)] ≈ exp[S0(E0) − E/T] and obtain

wa = Z⁻¹ exp(−Ea/T) . (20)
Note that there is no trace of the thermostat left except for the temperature. The normalization factor Z(T, V, N) = Σa exp(−Ea/T) is a sum over all states accessible to the system and is called the partition function.
The probability to have a given energy is the probability of the state (20) times the number of states:

W(E) = Γ(E) Z⁻¹ exp(−E/T) . (22)
Here Γ(E) grows fast while exp(−E/T ) decays fast when the energy E grows.
As a result, W (E) is concentrated in a very narrow peak and the energy
fluctuations around Ē are very small (see Sect. 2.4 below for more details).
For example, for an ideal gas W (E) ∝ E 3N/2 exp(−E/T ). Let us stress again
that the Gibbs canonical distribution (20) tells that the probability of a given
microstate exponentially decays with the energy of the state while (22) tells
that the probability of a given energy has a peak.
An alternative and straightforward way to derive the canonical distri-
bution is to use consistently the Gibbs idea of the canonical ensemble as a
virtual set, of which the single member is the system under consideration
and the energy of the total set is fixed. The probability to have our system
in the state a with the energy Ea is then given by the average number of
systems n̄a in this state divided by the total number of systems N . The set
of occupation numbers {na } = (n0 , n1 , n2 . . .) satisfies obvious conditions
Σa na = N , Σa Ea na = E = ϵN . (23)
Any given set is realized in W{na} = N!/(n0! n1! n2! . . .) ways and
the probability to realize the set is proportional to the respective W:

n̄a = Σ na W{na} / Σ W{na} , (24)
where summation goes over all the sets that satisfy (23). We assume that
in the limit when N, na → ∞ the main contribution into (24) is given by
the most probable distribution which is found by looking at the extremum
of ln W − α Σa na − β Σa Ea na. Using the Stirling formula ln n! = n ln n − n we write ln W = N ln N − Σa na ln na, and the extremum n*a corresponds to ln n*a = −α − 1 − βEa, which gives

n*a / N = exp(−βEa) / Σa exp(−βEa) . (25)
Of course, physically ϵ(β) is usually more relevant than β(ϵ). See Pathria,
Sect 3.2.
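As a sketch, one can verify (25) numerically: maximize ln W under the constraints (23) with a generic optimizer and compare with the Boltzmann form. The five equidistant levels and the value of ϵ below are illustrative choices, not from the text.

```python
# Most probable occupation numbers: minimize sum n ln n (equivalent to
# maximizing ln W) under the constraints (23), then compare with exp(-beta*Ea)/Z.
import numpy as np
from scipy.optimize import minimize, brentq

Ea = np.arange(5.0)        # energies of the states: 0, 1, 2, 3, 4 (arbitrary)
N, eps = 1000.0, 1.2       # number of systems and energy per system (arbitrary)

res = minimize(lambda n: np.sum(n * np.log(n)), x0=np.full(5, N / 5),
               method="SLSQP",
               constraints=[{"type": "eq", "fun": lambda n: n.sum() - N},
                            {"type": "eq", "fun": lambda n: n @ Ea - eps * N}],
               bounds=[(1e-8, None)] * 5)

# Boltzmann prediction: beta fixed by the mean-energy constraint.
mean_E = lambda b: Ea @ np.exp(-b * Ea) / np.exp(-b * Ea).sum()
beta = brentq(lambda b: mean_E(b) - eps, -10, 10)
print(res.x / N)                                      # n*_a / N from optimization
print(np.exp(-beta * Ea) / np.exp(-beta * Ea).sum())  # exp(-beta Ea)/Z
```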
To get thermodynamics from the Gibbs distribution one needs to define
the free energy because we are under a constant temperature. This is done
via the partition function Z (which is of central importance since macroscopic
quantities are generally expressed via the derivatives of it):
F (T, V, N ) = −T ln Z(T, V, N ) . (27)
To prove that, differentiate the identity Σa exp[(F − Ea)/T] = 1 with respect to temperature, which gives

F = Ē + T (∂F/∂T)V ,
equivalent to F = E − T S in thermodynamics.
One can also come to this by defining entropy. Recall that for a closed system we defined S = ln Γ while the probability of a state was wa = 1/Γ. For a system in contact with a thermostat that has the Gibbs distribution, ln wa is linear in Ea, so that

S(Ē) = −ln wa(Ē) = −⟨ln wa⟩ = −Σa wa ln wa = Σa wa (Ea/T + ln Z) = Ē/T + ln Z . (28)
Even though we derived the formula for entropy, S = −Σa wa ln wa, for equilibrium, this definition can be used for any set of probabilities wa, since it provides a useful measure of our ignorance about the system, as we shall see later.
See Landau & Lifshitz (Sects 31,36).
Here we used ∂S/∂E = 1/T , ∂S/∂N = −µ/T and introduced the grand
canonical potential which can be expressed through the grand partition func-
tion:

Ω(T, V, µ) = −T ln ΣN exp(µN/T) Σa exp(−EaN/T) . (30)
To describe fluctuations one needs to expand the respective thermody-
namic potential around the mean value, using the second derivatives ∂²S/∂E² and ∂²S/∂N² (which must be negative for stability). That will give Gaus-
sian distributions of E − Ē and N − N̄ . A straightforward way to find the
energy variance ⟨(E − Ē)²⟩ is to differentiate with respect to β the identity ⟨E − Ē⟩ = 0. For this purpose one can use the canonical distribution and get

∂/∂β Σa (Ea − Ē) e^{β(F−Ea)} = Σa (Ea − Ē)(F + β ∂F/∂β − Ea) e^{β(F−Ea)} − ∂Ē/∂β = 0 ,

⟨(E − Ē)²⟩ = −∂Ē/∂β = T² CV . (32)
The magnitude of fluctuations is thus determined by the second derivative of the respective thermodynamic potential (here CV). Since both Ē and CV are proportional to N, the relative fluctuations are indeed small: ⟨(E − Ē)²⟩/Ē² ∝ N⁻¹. In what follows (as in most of what preceded) we do not distinguish between E and Ē.
Let us now discuss the fluctuations of the particle number. One gets the probability to have N particles by summing (29) over a:

W(N) ∝ exp{β[µN − F(T, V, N)]} ,
where F (T, V, N ) is the free energy calculated from the canonical distribu-
tion for N particles in volume V and temperature T . The mean value N̄
is determined by the extremum of probability: (∂F/∂N)N̄ = µ. The second derivative determines the width of the distribution over N, that is the variance:

⟨(N − N̄)²⟩ = T (∂²F/∂N²)⁻¹ = −T N v⁻² (∂P/∂v)⁻¹ ∝ N . (33)
uses the grand canonical ensemble in such cases (as we shall do in Chapter 4 below). Note that any extensive quantity f = Σ_{i=1}^{N} fi which is a sum over independent subsystems (i.e. ⟨fi fk⟩ = f̄i f̄k for i ≠ k) has a small relative fluctuation: (⟨f²⟩ − f̄²)/f̄² ∝ 1/N.
See also Landau & Lifshitz 35 and Huang 8.3-5.
[Figure: entropy of the two-level system vs energy, vanishing at E = 0 and E = Nϵ and maximal at E = Nϵ/2; the slope gives T = +0 near E = 0 and T = −0 near E = Nϵ]
Indeed, entropy is zero at E = 0, N ϵ when all the particles are in the same
state. The entropy is symmetric about E = N ϵ/2. We see that when
E > N ϵ/2 then the population of the higher level is larger than of the
lower one (inverse population as in a laser) and the temperature is negative.
Negative temperature may happen only in systems with the upper limit of
energy levels and simply means that by adding energy beyond some level we
actually decrease the entropy i.e. the number of accessible states. Available
(non-equilibrium) states lie below the S(E) plot, notice that the entropy
maximum corresponds to the energy minimum for positive temperatures and
to the energy maximum for the negative-temperature part. A glance at
the figure also shows that when the system with a negative temperature is
brought into contact with the thermostat (having positive temperature) then
our system gives away energy (a laser generates and emits light) decreasing
the temperature further until it passes through infinity to positive values
and eventually reaches the temperature of the thermostat. That is, negative temperatures are actually "hotter" than positive ones.
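A short numerical sketch of this picture (N = 100 and ϵ = 1 are arbitrary choices): tabulating S(E) = ln C_N^L via Stirling and T from the relation derived above shows T passing through ±∞ at E = Nϵ/2.

```python
# S(E) and T(E) for N two-level systems: negative-T branch at E > N*eps/2.
import numpy as np

N, eps = 100, 1.0
L = np.arange(1, N)                      # number of occupied upper levels
E = L * eps
xlnx = lambda x: x * np.log(x)
S = xlnx(N) - xlnx(L) - xlnx(N - L)      # ln C_N^L via Stirling
T = eps / np.log((N - L) / L)            # from T^{-1} = dS/dE derived above

for i in (9, 44, 54, 89):                # sample points on both sides of E = N eps/2
    print(f"E={E[i]:5.1f}  S={S[i]:6.2f}  T={T[i]:8.2f}")
# T is large and positive just below E = N eps/2, large and negative just above.
```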
Let us stress that there is no volume in S(E, N): we consider only a subsystem or only part of the degrees of freedom. Indeed, real particles have kinetic energy unbounded from above and can correspond only to positive temperatures [negative temperature and infinite energy would give an infinite Gibbs factor exp(−E/T)]. Apart from the laser, an example of a two-level system is a spin 1/2 in the magnetic field H. Because the interaction between the spins and atomic motions (spin-lattice relaxation) is weak, the spin system keeps its separate temperature for a long time (tens of minutes) and can be considered separately.
External fields are parameters (like volume and chemical potential) that
determine the energy levels of the system. They are sometimes called gen-
eralized thermodynamic coordinates, and the derivatives of the energy with
respect to them are called respective forces. Let us derive the generalized
force M that corresponds to the magnetic field and determines the work done
under the change of magnetic field: dE = T dS − M dH. Since the projection
of every magnetic moment on the direction of the field can take two values ±µ
then the magnetic energy of the particle is ∓µH and E = −µ(N+ − N− )H.
The force (the partial derivative of the energy with respect to the field at a
fixed entropy) is called magnetization or magnetic moment of the system:
M = −(∂E/∂H)S = µ(N+ − N−) = Nµ [exp(µH/T) − exp(−µH/T)] / [exp(µH/T) + exp(−µH/T)] . (35)
At weak fields and positive temperature, µH ≪ T, (35) gives the formula for the so-called Pauli paramagnetism:

M/Nµ = µH/T . (36)
Para means that the majority of moments point in the direction of the ex-
ternal field. This formula shows in particular a remarkable property of the
spin system: adiabatic change of magnetic field (which keeps N+ , N− and
thus M ) is equivalent to the change of temperature even though spins do not
exchange energy. One can say that under the change of the value of the ho-
mogeneous magnetic field the relaxation is instantaneous in the spin system.
This property is used in cooling the substances that contain paramagnetic
impurities. Note that the entropy of the spin system does not change when
the field changes slowly compared to the spin-spin relaxation and fast compared to the spin-lattice relaxation.
To conclude let us treat the two-level system by the canonical approach
where we calculate the partition function and the free energy:
Z(T, N) = Σ_{L=0}^{N} C_N^L exp(−Lϵ/T) = [1 + exp(−ϵ/T)]^N , (37)
F (T, N ) = −T ln Z = −N T ln[1 + exp(−ϵ/T )] . (38)
We can now re-derive the entropy as S = −∂F/∂T and derive the (mean)
energy and specific heat:
Ē = Z⁻¹ Σa Ea exp(−βEa) = −∂ ln Z/∂β = T² ∂ ln Z/∂T (39)
  = Nϵ/[1 + exp(ϵ/T)] , (40)

C = dE/dT = N (ϵ/T)² exp(ϵ/T) [1 + exp(ϵ/T)]⁻² . (41)
Note that (39) is a general formula which we shall use in the future. Specific
heat turns into zero both at low temperatures (too small portions of energy are "in circulation") and at high temperatures (the occupation numbers of the two levels are already close to equal).
[Figure: specific heat per particle C/N of the two-level system vs T/ϵ, vanishing in both limits with a maximum below 1/2]
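A quick numerical check of (41) (ϵ = 1 and the sample temperatures are arbitrary): differentiating the mean energy (40) by finite differences reproduces the formula and exhibits the peak (the Schottky anomaly) between the two vanishing limits.

```python
# Specific heat of the two-level system: finite-difference dE/dT vs formula (41).
import numpy as np

eps = 1.0
Ebar = lambda T: eps / (1.0 + np.exp(eps / T))   # mean energy per particle, (40)
C_formula = lambda T: (eps / T)**2 * np.exp(eps / T) / (1 + np.exp(eps / T))**2

for T in (0.1, 0.417, 1.0, 5.0):
    h = 1e-5
    C_num = (Ebar(T + h) - Ebar(T - h)) / (2 * h)
    print(f"T={T}:  dE/dT={C_num:.6f}  formula={C_formula(T):.6f}")
# C vanishes at both ends and peaks near T/eps ~ 0.42.
```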
Using kinetic energy and simply replacing q → p/ω one obtains a similar
formula dwp = (2πT )−1/2 exp(−p2 /2T )dp which is the Maxwell distribution.
For a quantum case, the energy levels are given by En = h̄ω(n + 1/2).
The single-oscillator partition function is

Z1(T) = Σ_{n=0}^{∞} exp[−h̄ω(n + 1/2)/T] = [2 sinh(h̄ω/2T)]⁻¹ , (45)

so that the free energy per oscillator is F1 = −T ln Z1 = h̄ω/2 + T ln[1 − exp(−h̄ω/T)],
where one sees the contribution of zero quantum oscillations and the break-
down of classical equipartition. The specific heat is as follows: CP = CV =
N (h̄ω/T )2 exp(h̄ω/T )[exp(h̄ω/T ) − 1]−2 . Comparing with (41) we see the
same behavior at T ≪ h̄ω: CV ∝ exp(−h̄ω/T ) because “too small energy
portions are in circulation” and they cannot move system to the next level.
At large T the specific heat of two-level system turns into zero because the
occupation numbers of both levels are almost equal while for oscillator we
have classical equipartition (every oscillator has two degrees of freedom so it
has T in energy and 1 in CV ).
[Figure: specific heat per oscillator C/N vs T, rising to the classical value 1 at high temperature]
one gets:

dwq = [(ω/πh̄) tanh(h̄ω/2T)]^{1/2} exp[−q² (ω/h̄) tanh(h̄ω/2T)] dq . (47)
At h̄ω ≪ T it coincides with (44) while at the opposite (quantum) limit gives
dwq = (ω/πh̄)1/2 exp(−q 2 ω/h̄)dq which is a purely quantum formula |ψ0 |2 for
the ground state of the oscillator.
See also Pathria Sect. 3.7 for more details.
the position and v = (ṗ, q̇) is the velocity in the phase space. The trace
of the tensor σij is the rate of the volume change which must be zero ac-
cording to the Liouville theorem (that is a Hamiltonian dynamics imposes
an incompressible flow in the phase space). We can decompose the tensor
of velocity derivatives into an antisymmetric part (which describes rotation)
and a symmetric part (which describes deformation). We are interested here
in deformation because it is the mechanism of the entropy growth. The sym-
metric tensor, Sij = (∂vi /∂xj + ∂vj /∂xi )/2, can be always transformed into
a diagonal form by an orthogonal transformation (i.e. by the rotation of
the axes). The diagonal components are the rates of stretching in different
directions. Indeed, the equation for the distance between two points along a
principal direction has a form: ṙi = δvi = ri Sii (no summation over i). The
solution is as follows:

ri(t) = ri(0) exp[∫₀ᵗ Sii(t′) dt′] . (48)
Of course, as the system moves in the phase space, both the strain val-
ues and the orientation of the principal directions change, so that expand-
ing direction may turn into a contracting one and vice versa. The ques-
tion is whether averaging over all possibilities gives a zero net result. One
can show that in a general case an exponential stretching persists on av-
erage and the majority of trajectories separate. To see that, consider as
an example two-dimensional incompressible saddle-point flow (pure strain):
vx = λx, vy = −λy. The vector r = (x, y) (which is supposed to char-
acterize the distance between two close trajectories) satisfies the equations
ẋ = vx and ẏ = vy . Whether the vector is stretched or contracted after
some time T depends on its orientation and on T . Since x(t) = x0 exp(λt)
and y(t) = y0 exp(−λt) = x0 y0 /x(t) then every trajectory is a hyperbole.
A unit vector initially forming an angle φ with the x axis will have its
length [cos2 φ exp(2λT ) + sin2 φ exp(−2λT )]1/2 after time T . The vector will
be stretched if cos φ ≥ [1 + exp(2λT)]^{−1/2}, which is less than 1/√2, i.e. the fraction of stretched directions is larger than half. When along the motion all orientations are equally probable, the net effect is stretching.
Figure 2: The distance of the point from the origin increases if the angle is
less than φ0 = arccos[1 + exp(2λT )]−1/2 > π/4. Note that for φ = φ0 the
initial and final points are symmetric relative to the diagonal.
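One can check this counting by a simple Monte Carlo over random orientations (λT = 1 is an arbitrary choice); the analytic fraction 1 − (2/π) arcsin[1 + exp(2λT)]^{−1/2} follows directly from the threshold angle φ0.

```python
# Fraction of unit vectors stretched by the saddle-point flow after time T.
import numpy as np

rng = np.random.default_rng(0)
lam_T = 1.0                                   # lambda*T, illustrative value
phi = rng.uniform(0, 2 * np.pi, 100_000)
length = np.sqrt(np.cos(phi)**2 * np.exp(2 * lam_T)
                 + np.sin(phi)**2 * np.exp(-2 * lam_T))
print((length > 1).mean())                    # Monte Carlo fraction, > 0.5
print(1 - (2 / np.pi) * np.arcsin((1 + np.exp(2 * lam_T))**-0.5))  # analytic
```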
ellipsoid axes will change. The limiting eigenvalues
define the so-called Lyapunov exponents. The sum of the exponents is zero due to
the Liouville theorem so there exists at least one positive exponent which corre-
sponds to stretching. Mathematical lesson to learn is that multiplying N random
matrices with unit determinant (recall that the determinant is the product of
eigenvalues), one generally gets some eigenvalues growing (and some decreasing)
exponentially with N .
The probability to find a ball turning into an exponentially stretching
ellipse goes to unity as time increases. The physical reason for it is that
substantial deformation appears sooner or later. To reverse it, one needs
to contract the long direction, that is the direction of contraction must be
inside the narrow angle defined by the ellipse eccentricity, which is unlikely.
Randomly oriented deformations on average continue to increase the eccen-
tricity. After the strip length reaches the scale of the velocity change (when
one already cannot approximate the phase-space flow by a linear profile σ̂r),
strip starts to fold, continuing locally the exponential stretching. Eventually,
one can find the points from the initial ball everywhere which means that
the flow is mixing.
On the figure, one can see how the black square of initial conditions (at
the central box) is stretched in one (unstable) direction and contracted in an-
other (stable) direction so that it turns into a long narrow strip (left and right
boxes). Rectangles in the right box show finite resolution (coarse-graining).
Viewed with such resolution, our set of points occupies larger phase volume
(i.e. corresponds to larger entropy) at t = ±T than at t = 0. Time reversibil-
ity of any particular trajectory in the phase space does not contradict the
time-irreversible filling of the phase space by the set of trajectories considered
with a finite resolution. By reversing time we exchange stable and unstable
directions but the fact of space filling persists.
[Figure: the initial square of phase-space points (center, t = 0) turned into a long narrow strip at t = ±T; rectangles indicate the coarse-graining resolution]
When the density spreads, entropy grows. For example, consider a simple
case, when the spread of the probability density ρ(r, t) can be described by
the simple diffusion (in Sect 8 below we show that it is possible on a timescale
where one can consider the motion as a series of uncorrelated random walks):
∂ρ/∂t = κ∆ρ. Entropy increases monotonically under diffusion:

dS/dt = −(d/dt) ∫ ρ(r, t) ln ρ(r, t) dr = −κ ∫ ∆ρ ln ρ dr = κ ∫ (∇ρ)² ρ⁻¹ dr ≥ 0 . (51)
We see that the force is equal to the derivative of the thermodynamic energy
at constant entropy because the probabilities wa do not change. Note that
in an adiabatic process all wa are assumed to be constant i.e. the entropy of
any subsystem is conserved. This is more restrictive than the condition of
reversibility which requires only the total entropy to be conserved. In other
words, the process can be reversible but not adiabatic. See Landau & Lifshitz
(Section 11) for more details.
The last statement we make here about entropy is the third law of thermo-
dynamics (Nernst theorem) which claims that S → 0 as T → 0. A standard
argument is that since stability requires the positivity of the specific heat
cv then the energy must monotonously increase with the temperature and
zero temperature corresponds to the ground state. If the ground state is
non-degenerate (unique) then S = 0. Since generally the degeneracy of the
ground state grows slower than exponentially with N , then the entropy per
particle is zero in the thermodynamic limit. While this argument is correct
it is relevant only for temperatures less than the energy difference between
the first excited state and the ground state. As such, it has nothing to do
with the third law established generally for much higher temperatures and
related to the density of states as function of energy. We shall discuss it later
considering Debye theory of solids. See Huang (Section 9.4) for more details.
That it must be a logarithm is clear also from obtaining the missing information by asking a sequence of questions of the form "in which half is the box with the candy?"; one then needs log₂ n such questions and respective one-bit
answers. We can easily generalize the definition (54) for non-integer ratio-
nal numbers by I(n/l) = I(n) − I(l) and for all positive real numbers by
considering limits of the series and using monotonicity.
If we have an alphabet with n symbols then a message of the length N can potentially be one of n^N possibilities, so that it brings the information kN ln n, or k ln n per symbol. If all the 26 letters of the English alphabet were used with the same frequency then the word "love" would bring the information equal to 4k ln 26.
[Figure: locating one letter of an n-letter alphabet by successive halvings of the list A . . . Z takes log₂ n yes/no questions]
In reality though it brings even less information (no matter how emotional
we can get) since we know that letters are used with different frequencies.
Indeed, consider the situation when there is a probability wi assigned to
each letter (or box) i = 1, . . . , n. Now if we want to evaluate the missing
information (or, the information that one symbol brings us on average) we
ought to think about repeating our choice N times. As N → ∞ we know
that the candy is in the i-th box in Nwi cases but we do not know the order in
which different possibilities appear. Total number of orders is N !/ Πi (N wi )!
and the missing information is
IN = k ln[N!/Πi(Nwi)!] ≈ −Nk Σi wi ln wi + O(ln N) . (55)
The missing information per problem (or per symbol in the language) coin-
cides with the entropy (28):
I(w1 . . . wn) = lim_{N→∞} IN/N = −k Σ_{i=1}^{n} wi ln wi . (56)
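A toy illustration of (56) with k = 1 (the sample sentence is an arbitrary choice): the information per symbol computed from empirical letter frequencies is below the equal-frequency bound ln 26.

```python
# Empirical letter entropy of a sample string, in nats (k = 1).
import math
from collections import Counter

text = "statistical physics connects microscopic motion with thermodynamics"
letters = [c for c in text if c.isalpha()]
counts = Counter(letters)
w = [n / len(letters) for n in counts.values()]

I = -sum(wi * math.log(wi) for wi in w)       # -sum w_i ln w_i
print(I, math.log(26))                        # empirical entropy < ln 26 ~ 3.26
```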
The information (56) is zero for the delta-distribution wi = δij; it is generally less than the information (54) and coincides with it only for equal probabilities, wi = 1/n, when the entropy is maximal. Indeed, we ascribe equal probabilities when there is no extra information, i.e. in a state of maximum ignorance. Mathematically, the property

I(w1 . . . wn) ≤ I(1/n . . . 1/n) = k ln n (57)
is called convexity. It follows from the fact that the function of a single
variable s(w) = −w ln w is strictly downward convex (concave) since its
second derivative, −1/w, is everywhere negative for positive w. For a concave
function, the average over the set of points wi is less or equal to the function
at the average value (so-called Jensen inequality):
(1/n) Σ_{i=1}^{n} s(wi) ≤ s[(1/n) Σ_{i=1}^{n} wi] . (58)
The relation (58) can be proven by induction using the convexity condition, which states that the linear interpolation between two points a, b lies everywhere below the function graph: s(λa + (1 − λ)b) ≥ λs(a) + (1 − λ)s(b) for any λ ∈ [0, 1]. Now we can choose λ = (n − 1)/n, a = (n − 1)⁻¹ Σ_{i=1}^{n−1} wi and b = wn to see that
s[(1/n) Σ_{i=1}^{n} wi] = s[((n − 1)/n) (n − 1)⁻¹ Σ_{i=1}^{n−1} wi + wn/n]
 ≥ ((n − 1)/n) s[(n − 1)⁻¹ Σ_{i=1}^{n−1} wi] + (1/n) s(wn)
 ≥ (1/n) Σ_{i=1}^{n−1} s(wi) + (1/n) s(wn) = (1/n) Σ_{i=1}^{n} s(wi) . (60)
In the last line we used the truth of (58) for n − 1 to prove it for n.
Note that when n → ∞ then (54) diverges while (56) may well be finite. We can generalize (56) for a continuous distribution by dividing it into cells
(that is considering a limit of discrete points). Here, different choices of
variables to define equal cells give different definitions of information. It is in
such a choice that physics enters. We use canonical coordinates in the phase
space and write the missing information in terms of the density which may
also depend on time:
I(t) = − ∫ ρ(p, q, t) ln[ρ(p, q, t)] dp dq . (61)
where P (B|A) is the so-called conditional probability (of B in the presence
of A). The conditional probability is related to the joint probability P (A, B)
by the evident formula P (A, B) = P (B|A)P (A), which allows one to write
the information in a symmetric form
I(A, B) = ln[P(A, B)/P(A)P(B)] .
we obtain the distribution
ρ(p, q, t) = Z⁻¹ exp[−Σj λj Rj(p, q, t)] , (64)
We can explicitly solve this for k ≪ 1 ≪ kn when one can approximate the
sum by the integral so that Z(λ) ≈ nI0 (λ) where I0 is the modified Bessel
function. Equation I0′ (λ) = 0.3I0 (λ) has an approximate solution λ ≈ 0.63.
Note in passing that the set of equations (65) may be self-contradictory
or insufficient, so that the data do not allow one to define the distribution or allow it non-uniquely. If, however, the solution exists then (61,64) define the missing information I{ri} which is analogous to thermodynamic entropy as a function of (measurable) macroscopic parameters. It is clear that I has a tendency to increase whenever a constraint is removed (when we measure fewer quantities Ri).
If we know the given information at some time t1 and want to make
guesses about some other time t2 then our information generally gets less
relevant as the distance |t1 − t2 | increases. In the particular case of guessing
the distribution in the phase space, the mechanism of losing information
is due to separation of trajectories described in Sect. 3. Indeed, if we know
that at t1 the system was in some region of the phase space, the set of
trajectories started at t1 from this region generally fills larger and larger
regions as |t1 − t2 | increases. Therefore, missing information (i.e. entropy)
increases with |t1 − t2 |. Note that it works both into the future and into the
past. Information approach allows one to see clearly that there is really no
contradiction between the reversibility of equations of motion and the growth
of entropy. Also, the concept of entropy as missing information¹⁰ allows
one to understand that entropy does not really decrease in the system with
Maxwell demon or any other information-processing device (indeed, if at the
beginning one has an information on position or velocity of any molecule,
then the entropy was less by this amount from the start; after using and
processing the information the entropy can only increase). Consider, for
instance, a particle in the box. If we know that it is in one half then entropy
(the logarithm of available states) is ln(V /2). That also teaches us that
information has thermodynamic (energetic) value: by placing a piston at the
half of the box and allowing particle to hit and move it we can get the work
T ∆S = T ln 2 done; on the other hand, to get such an information one must
make a measurement whose minimum energetic cost is T ∆S = T ln 2 (Szilard
1929).
Yet there is one class of quantities where information does not age. They
are integrals of motion. A situation in which only integrals of motion are
known is called equilibrium. The distribution (64) takes the canonical form
(12,13) in equilibrium. On the other hand, taking micro-canonical as constant
over the constant-energy surface corresponds to the same approach of not
adding any additional information to what is known (energy).
From the information point of view, the statement that systems approach
equilibrium is equivalent to saying that all information is forgotten except the
integrals of motion. If, however, we possess the information about averages
of quantities that are not integrals of motion and those averages do not
coincide with their equilibrium values then the distribution (64) deviates
from equilibrium. Examples are currents, velocity or temperature gradients
such as those considered in kinetics.
More details can be found in Katz, Sects. 2-5, Sethna Sect. 5.3 and Kardar
I, Problem 2.6.
¹⁰ That is, entropy is not a property of the system but of our knowledge about the system.
4 Gases
We now go on to apply the general theory given in Chapter 2. Here we consider systems with the kinetic energy exceeding the potential energy of inter-particle interactions: ⟨U(r1 − r2)⟩ ≪ ⟨mv²/2⟩. Boltzmann statistics applies when the mean occupation numbers are small, which requires

T ≫ h̄²n^{2/3}/m . (68)

To get the feeling of the order of magnitudes, one can make an estimate with m = 1.6 · 10⁻²⁴ g (proton) and n = 10²¹ cm⁻³, which gives T ≫ 0.5 K. Another
way to interpret (68) is to say that the mean distance between molecules
n−1/3 must be much larger than the wavelength h/p. In this case, one can
pass from the distribution over the quantum states to the distribution in the
phase space:

n̄(p, q) = exp{[µ − ϵ(p, q)]/T} . (69)
In particular, the distribution over momenta is always quasi-classical for the
Boltzmann gas. Indeed, the distance between energy levels is determined by
the size of the box, ∆E ≃ h2 m−1 V −2/3 ≪ h2 m−1 (N/V )2/3 which is much less
than temperature according to (68). To put it simply, if the thermal quantum
wavelength h/p ≃ h(mT )−1/2 is less than the distance between particles it is
also less than the size of the box. We conclude that the Boltzmann gas has the
Maxwell distribution over momenta. If such is the case even in the external
field then n(q, p) = exp{[µ − ϵ(p, q)]/T } = exp{[µ − U (q) − p2 /2m]/T }. That
gives, in particular, the particle density in space n(r) = n0 exp[−U (r)/T ]
where n0 is the concentration without field. In the uniform gravity field we
get the barometric formula n(z) = n(0) exp(−mgz/T ).
Since now molecules do not interact then we can treat them as members
of the Gibbs canonical ensemble. The partition function of the Boltzmann
gas can be obtained from the partition function of a single particle (like we
did for two-level system and oscillator) with the only difference that particles
are now real and indistinguishable so that we must divide the sum by the
number of permutations:

Z = (1/N!) [Σa exp(−ϵa/T)]^N .
Using the Stirling formula ln N ! ≈ N ln(N/e) we write the free energy
F = −NT ln[(e/N) Σa exp(−ϵa/T)] . (70)
N a
Since the motion of the particle as a whole is always quasi-classical for the
Boltzmann gas, one can single out the kinetic energy: ϵa = p2 /2m + ϵ′a .
If in addition there is no external field (so that ϵ′a describes rotation and
the internal degrees of freedom of the particle) then one can integrate over
d3 pd3 q/h3 and get for the ideal gas:
F = −NT ln[(eV/N)(mT/2πh̄²)^{3/2} Σa exp(−ϵ′a/T)] . (71)
To complete the computation we need to specify the internal structure of the particle. Note though that Σa exp(−ϵ′a/T) depends only on temperature, so that we can already get the equation of state P = −∂F/∂V = NT/V.
Mono-atomic gas. At the temperatures much less than the distance to
the first excited state all the atoms will be in the ground state (we put ϵ0 = 0).
That means that the energies are much less than Rydberg ε0 = e2 /aB =
me4 /h̄2 ≃ 4 · 10−11 erg and the temperatures are less than ε0 /k ≃ 3 · 105 K
(otherwise atoms are ionized).
If there is neither orbital angular momentum nor spin (L = S = 0 — such are the atoms of noble gases) we get Σa exp(−ϵ′a/T) = 1, as the ground state is non-degenerate, and
F = −NT ln[(eV/N)(mT/2πh̄²)^{3/2}] = −NT ln(eV/N) − NcvT ln T − NζT , (72)

cv = 3/2 , ζ = (3/2) ln(m/2πh̄²) . (73)
Here ζ is called the chemical constant. Note that for F = AT − BT ln T the energy is linear in temperature, E = F − T ∂F/∂T = BT, that is the specific heat, Cv = B, is independent of temperature. The formulas thus derived allow one to obtain the conditions for the Boltzmann statistics to be applicable, which requires n̄a ≪ 1. Evidently, it is enough to require exp(µ/T) ≪ 1 where
µ = (E − TS + PV)/N = (F + PV)/N = (F + NT)/N = T ln[(N/V)(2πh̄²/mT)^{3/2}] .
Without actually specifying ϵJ we can determine this sum in two limits of
large and small temperature. If ∀J one has T ≫ ϵJ , then exp(−ϵJ /T ) ≈ 1
and z = (2S + 1)(2L + 1) which is the total number of components of the
fine level structure. In this case
ζSL = ln(2S + 1)(2L + 1) .
In the opposite limit of temperature smaller than all the fine structure level
differences, only the ground state with ϵJ0 = 0 contributes and one gets
ζJ = ln(2J0 + 1) ,
where J0 is the total angular momentum in the ground state.
[Figure: the chemical constant ζ(T) rising from ζJ to ζSL, and the specific heat cv(T) equal to 3/2 at both low and high temperatures with a maximum in between]
Note that cv = 3/2 in both limits, that is, the specific heat is constant at low and high temperatures (no contribution of electron degrees of freedom), having some maximum in between (due to the contribution of the electrons). We have already seen this in considering the two-level system, and the lesson is
general: if one has a finite number of levels then they do not contribute to
the specific heat both at low and high temperatures.
Specific heat of diatomic molecules. We need to calculate the sum
over the internal degrees of freedom in (71). We assume the temperature to
be smaller than the energy of dissociation (which is typically of the order
of electronic excited states). Since most molecules have S = L = 0 in the
ground state we disregard electronic states in what follows. The internal
excitations of the molecule are thus vibrations and rotations with the energy
ϵ′a characterized by two quantum numbers, j and K:
ϵjK = h̄ω(j + 1/2) + (h̄²/2I) K(K + 1) . (74)
be Bohr radius aB = h̄2 /me2 ≃ 0.5 · 10−8 cm and the typical energy to be
Rydberg ε0 = e2 /aB = me4 /h̄2 ≃ 4 · 10−11 erg. Note that m = 9 · 10−28 g is
the electron mass here. Now the frequency of the atomic oscillations is given
by the ratio of the Coulomb restoring force and the mass of the ion:
ω ≃ √(ε0/a²B M) = √(e²/a³B M) .
The vibrational partition function per molecule is the same as for the oscillator (45), so that fvib = h̄ω/2 + T ln[1 − exp(−h̄ω/T)], where one sees the contribution of zero quantum oscillations and the breakdown of classical equipartition. The specific heat (per molecule) of vibrations is thus as follows: cvib = (h̄ω/T)² exp(h̄ω/T)[exp(h̄ω/T) − 1]⁻². At T ≪ h̄ω we have cvib ∝ exp(−h̄ω/T). At large T we have classical equipartition (every oscillator has two degrees of freedom, so it has T in energy and 1 in cv).
To calculate the contribution of rotations one ought to calculate the par-
tition function
zrot = ΣK (2K + 1) exp[−h̄² K(K + 1)/2IT] . (77)
Again, when temperature is much smaller than the distance to the first
level, T ≪ h̄2 /2I, the specific heat must be exponentially small. Indeed,
retaining only the first two terms in the sum (77), we get zrot = 1 + 3 exp(−h̄²/IT), which gives in the same approximation Frot = −3NT exp(−h̄²/IT) and crot = 3(h̄²/IT)² exp(−h̄²/IT). We thus see that at low temperatures a diatomic gas behaves as a mono-atomic one.
At large temperatures, T ≫ h̄2 /2I, the terms with large K give the main
contribution to the sum (77). They can be treated quasi-classically replacing
the sum by the integral:
zrot = ∫₀^∞ dK (2K + 1) exp[−h̄² K(K + 1)/2IT] = 2IT/h̄² . (78)
That gives the constant specific heat crot = 1. The resulting specific heat of
the diatomic molecule, cv = 3/2 + crot + cvibr , is shown on the figure:
[Figure: specific heat cv(T) of a diatomic gas with plateaus at 3/2, 5/2 and 7/2, the crossovers occurring near T ∼ h̄²/I and T ∼ h̄ω]
Note that for h̄2 /I < T ≪ h̄ω the specific heat (weakly) decreases be-
cause the distance between rotational levels increases so that the level density
(which is actually cv ) decreases.
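A numerical sketch of crot(T) obtained directly from the sum (77), in units h̄²/2I = 1 (the level cutoff K < 200 is an assumption, adequate for the temperatures used):

```python
# Rotational specific heat c_rot = d<E>/dT from the partition sum (77).
import numpy as np

K = np.arange(200)                   # enough rotational levels for these T
g = 2 * K + 1                        # degeneracy of level K
EK = K * (K + 1)                     # energies in units hbar^2/2I = 1

def mean_E(T):
    w = g * np.exp(-EK / T)
    return (EK * w).sum() / w.sum()

for T in (0.1, 0.5, 1.0, 5.0, 50.0):
    h = 1e-3 * T
    print(f"T={T:6.1f}   c_rot={(mean_E(T + h) - mean_E(T - h)) / (2 * h):.4f}")
# c_rot is exponentially small at low T and approaches 1 at high T.
```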
For (non-linear) molecules with N > 2 atoms we have 3 translations, 3 rotations and 3N − 6 vibrational degrees of freedom, i.e. 6N − 6 quadratic variables in the energy (3N momenta, while out of the total 3N coordinates one subtracts 3 for the motion as a whole and 3 for rotations). That makes for the high-temperature specific heat cv = ctr + crot + cvib = 3/2 + 3/2 + 3N − 6 = 3N − 3. Indeed, every variable (i.e. every degree
of freedom) that enters ϵ(p, q), which is quadratic in p, q, contributes 1/2
to cv . Translation and rotation each contributes only momentum and thus
gives 1/2 while each vibration contributes both momentum and coordinate
(i.e. kinetic and potential energy) and gives 1.
Landau & Lifshitz, Sects. 47, 49, 51.
Ωa = −T ln Σ_{na} exp[β(µ − ϵa)na] . (79)

Here the sum is over all possible occupation numbers na. For fermions, there are only two terms in the sum, with na = 0, 1, so that

Ωa = −T ln {1 + exp[β(µ − ϵa)]} .
For bosons, one must sum the infinite geometric progression (which converges
when µ < 0) to get Ωa = T ln {1 − exp[β(µ − ϵa)]}. Recall that Ω depends
on T, V, µ. The average number of particles in the state with the energy ϵ is
thus
n̄(ϵ) = −∂Ωa/∂µ = 1 / {exp[β(ϵ − µ)] ± 1} . (80)
The upper sign here and in the subsequent formulas corresponds to the Fermi statistics, the lower to the Bose one. Note that at exp[β(ϵ − µ)] ≫ 1 both distributions turn into the Boltzmann distribution (67). The thermodynamic potential of the
whole system is obtained by summing over the states
Ω = ∓T Σa ln[1 ± e^{β(µ−ϵa)}] . (81)
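A minimal sketch comparing the three occupation numbers of (80) and (67), at illustrative values of µ and T:

```python
# Fermi, Bose and Boltzmann occupation numbers at the same mu and T.
import numpy as np

mu, T = 0.0, 1.0
eps = np.array([0.5, 1.0, 2.0, 4.0])
x = (eps - mu) / T
for name, n in (("Fermi", 1 / (np.exp(x) + 1)),
                ("Bose", 1 / (np.exp(x) - 1)),
                ("Boltzmann", np.exp(-x))):
    print(f"{name:10s}", np.round(n, 4))
# For eps - mu >> T all three coincide; differences appear at low energies.
```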
now be comparable to the distance between particles). In this case we may pass from summation to integration over the phase space, with the only addition that particles are also distinguished by the direction of the spin s, so that there are g = 2s + 1 particles in the elementary cell of the phase space. We thus replace (80) by

dN(p, q) = g h⁻³ dpx dpy dpz dx dy dz / {exp[β(ϵ − µ)] ± 1} . (82)
Integrating over the volume we get the quantum analog of the Maxwell distribution:

dN(ϵ) = (gV m^{3/2}/√2 π²h̄³) √ϵ dϵ / {exp[β(ϵ − µ)] ± 1} . (83)
In the same way we rewrite (81):

Ω = ∓(gV T m^{3/2}/√2 π²h̄³) ∫₀^∞ √ϵ ln[1 ± e^{β(µ−ϵ)}] dϵ
 = −(2/3)(gV m^{3/2}/√2 π²h̄³) ∫₀^∞ ϵ^{3/2} dϵ / {exp[β(ϵ − µ)] ± 1} = −(2/3)E . (84)
Since also Ω = −PV we get the equation of state

PV = (2/3) E . (85)
We see that this relation is the same as for a classical gas, it actually is true for
any non-interacting particles with ϵ = p2 /2m in 3-dimensional space. Indeed,
consider a cube with the side l. Every particle hits a wall |px |/2ml times per
unit time transferring the momentum 2|px | in every hit. The pressure is the
total momentum transferred per unit time p2x /ml divided by the wall area l2
(see Kubo, p. 32):
P = Σ_{i=1}^{N} p²ix/ml³ = Σ_{i=1}^{N} p²i/3ml³ = 2E/3V . (86)
In the limit of Boltzmann statistics we have E = 3N T /2 so that (85)
reproduces P V = N T . Let us obtain the (small) quantum corrections to the
pressure assuming exp(µ/T) ≪ 1. Expanding the integral in (84),

∫₀^∞ ϵ^{3/2} dϵ/[e^{β(ϵ−µ)} ± 1] ≈ ∫₀^∞ ϵ^{3/2} e^{β(µ−ϵ)} [1 ∓ e^{β(µ−ϵ)}] dϵ = (3√π/4) β^{−5/2} e^{βµ} (1 ∓ 2^{−5/2} e^{βµ}) ,
and substituting the Boltzmann expression for µ we get

PV = NT [1 ± (π^{3/2}/2g) N h̄³/V(mT)^{3/2}] . (87)
Unsurprisingly, the small parameter here is the cube of the ratio of the thermal wavelength to the distance between particles. We see that quantum effects give some effective attraction between bosons and repulsion between fermions.
Landau & Lifshitz, Sects. 53, 54, 56.
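The small-fugacity expansion used above is easy to verify numerically (the fugacity value 0.05 is an arbitrary choice):

```python
# Check: integral of x^{3/2}/(exp(x - eta) ± 1) vs (3 sqrt(pi)/4) e^eta (1 ∓ 2^{-5/2} e^eta).
import numpy as np
from scipy.integrate import quad

eta = np.log(0.05)                       # fugacity exp(eta) = 0.05 << 1
for sign, name in ((1, "Fermi"), (-1, "Bose")):
    # integrand is negligible beyond x = 50, which avoids exp overflow
    exact, _ = quad(lambda x: x**1.5 / (np.exp(x - eta) + sign), 0, 50)
    approx = 0.75 * np.sqrt(np.pi) * np.exp(eta) * (1 - sign * 2**-2.5 * np.exp(eta))
    print(f"{name}: exact={exact:.6f}  expansion={approx:.6f}")
```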
[Figure: the Fermi distribution n̄(ϵ) at T = 0, a step filled up to the Fermi energy ϵF]
At T = 0 electrons fill all the momenta up to pF that can be expressed
via the concentration (g = 2 for s = 1/2):
N/V = 2 (4π/h³) ∫₀^{pF} p² dp = p³F/3π²h̄³ , (88)
which gives the Fermi energy

ϵF = (3π²)^{2/3} (h̄²/2m) (N/V)^{2/3} . (89)
The chemical potential at T = 0 coincides with the Fermi energy (putting already one electron per unit cell one obtains ϵF/k ≃ 10⁴ K). The condition T ≪ ϵF is evidently opposite to (68). The chemical potential of the Fermi gas decreases with temperature and changes sign at T ≃ ϵF. Note that the
condition of ideality requires that the electrostatic energy Ze2 /a is much
less than ϵF where Ze is the charge of ion and a ≃ (ZV /N )1/3 is the mean
distance between electrons and ions. We see that the condition of ideality,
N/V ≫ (e2 m/h̄2 )3 Z 2 , surprisingly improves with increasing concentration.
Note nevertheless that in most metals the interaction is substantial; why one can still use the Fermi distribution (only introducing an effective electron mass) is the subject of the Landau theory of Fermi liquids, to be described in the course of condensed matter physics (in a nutshell, it is because the main effect of the interaction reduces to some mean effective periodic field).
To obtain the specific heat, Cv = (∂E/∂T )V,N one must find E(T, V, N )
i.e. exclude µ from two relations, (83) and (84):
N = (2V m^{3/2}/√2 π²h̄³) ∫₀^∞ √ϵ dϵ / {exp[β(ϵ − µ)] + 1} ,
E = (2V m^{3/2}/√2 π²h̄³) ∫₀^∞ ϵ^{3/2} dϵ / {exp[β(ϵ − µ)] + 1} .
At T ≪ µ ≈ ϵF this can be done perturbatively using the formula
∫₀^∞ f(ϵ) dϵ / {exp[β(ϵ − µ)] + 1} ≈ ∫₀^µ f(ϵ) dϵ + (π²/6) T² f′(µ) , (90)
which gives

N = (2V m^{3/2}/√2 π²h̄³) (2/3) µ^{3/2} (1 + π²T²/8µ²) ,
E = (2V m^{3/2}/√2 π²h̄³) (2/5) µ^{5/2} (1 + 5π²T²/8µ²) .
From the first equation we find µ(N, T) perturbatively:

µ = ϵF (1 − π²T²/8ϵ²F)^{2/3} ≈ ϵF (1 − π²T²/12ϵ²F) ,
and substitute it into the second equation:

E = (3/5) N ϵF (1 + 5π²T²/12ϵ²F) , (91)

CV = (π²/2) N T/ϵF . (92)
We see that CV ≪ N. Another important point to stress is that the energy (and PV) are much larger than NT; the consequence is that the fermionic nature of electrons is what actually determines the resistance of metals (and neutron stars) to compression. For a typical electron density in metals, n ≃ 10²² cm⁻³, we get ϵF/k ≃ 10⁴ K.
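A quick numerical check of the Sommerfeld expansion (90) for f(ϵ) = √ϵ (the values µ = 1 and T = 0.05 are illustrative):

```python
# Exact Fermi integral vs Sommerfeld expansion for f(eps) = sqrt(eps):
# (2/3) mu^{3/2} + (pi^2/6) T^2 * f'(mu), with f'(mu) = 1/(2 sqrt(mu)).
import numpy as np
from scipy.integrate import quad

mu, T = 1.0, 0.05                                   # T << mu
# the tail beyond mu + 50 T is negligible (avoids exp overflow in quad)
exact, _ = quad(lambda e: np.sqrt(e) / (np.exp((e - mu) / T) + 1), 0, mu + 50 * T)
approx = (2 / 3) * mu**1.5 + (np.pi**2 / 6) * T**2 * 0.5 / np.sqrt(mu)
print(exact, approx)                                # agree to O(T^4)
```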
4.2.2 Photons
Consider electromagnetic radiation in an empty cavity kept at the temperature T. Since electromagnetic waves are linear (i.e. they do not interact), thermalization of radiation comes from interaction with walls (absorption and re-emission)¹¹. One can derive the equation of state without all the formalism of the partition function. Indeed, consider a plane electromagnetic wave with the fields having amplitudes E and B. The average energy density is (E² + B²)/2 = E², while the momentum flux modulus is |E × B| = E². The radiation field in the box can be considered as an incoherent superposition of plane waves propagating in all directions. Since all waves contribute to the energy density but only one-third of them contribute to the radiation pressure on any given wall, then
P V = E/3 . (93)
In a quantum consideration we treat electromagnetic waves as photons, which are massless particles with spin 1 that can have only two independent orientations (corresponding to the two independent polarizations of a classical electromagnetic wave). The energy is related to the momentum by
¹¹ It is meaningless to take perfect mirror walls, which do not change the frequency of light under reflection and formally correspond to zero T.
ϵ = cp. Now, exactly as we did for particles [where the law ϵ = p²/2m gave PV = 2E/3 — see (86)] we can derive (93), considering that every incident photon brings momentum 2p cos θ to the wall, that the normal velocity is c cos θ, and integrating ∫cos²θ sin θ dθ. Photon pressure is relevant inside stars, particularly inside the Sun.
Let us now apply the Bose distribution to the system of photons in a cavity. Since the number of photons is not fixed, a minimum of the free energy, F(T, V, N), requires zero chemical potential: (∂F/∂N)_{T,V} = µ = 0. The Bose distribution over the quantum states with fixed polarization, momentum ℏk and energy ϵ = ℏω = ℏck is called the Planck distribution:
$$\bar{n}_k = \frac{1}{e^{\hbar\omega/T}-1}\,. \qquad (94)$$
At T ≫ ℏω it gives the Rayleigh-Jeans distribution ℏωn̄_k = T, which is classical equipartition. Assuming the cavity is large, we consider the distribution over wave vectors as continuous. Multiplying by 2 (the number of polarizations) we get the spectral distribution of energy
$$dE_\omega = \hbar ck\,\frac{2V\cdot4\pi k^2\,dk}{(2\pi)^3\left(e^{\hbar ck/T}-1\right)} = \frac{V\hbar}{\pi^2c^3}\,\frac{\omega^3\,d\omega}{e^{\hbar\omega/T}-1}\,. \qquad (95)$$
Nernst law is satisfied: S → 0 when T → 0. Under adiabatic compres-
sion or expansion of radiation, entropy constancy requires V T 3 = const and
P V 4/3 = const.
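A minimal numerical check of (95): integrating the spectral density over all frequencies gives the Stefan-Boltzmann T⁴ law with the standard coefficient π²/15 (a sketch in units ℏ = c = 1, with temperature in energy units):

    import numpy as np
    from scipy.integrate import quad

    def energy_density(T):
        # E/V from (95): (1/pi^2) * integral of w^3/(e^{w/T} - 1) over w
        val, _ = quad(lambda w: w**3 / np.expm1(w / T), 0.0, np.inf)
        return val / np.pi**2

    T = 1.0
    print(energy_density(T), np.pi**2 / 15 * T**4)   # both 0.6580: E/V = (pi^2/15) T^4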
If one makes a small orifice in the cavity, then it absorbs all the incident light like a black body. Therefore, what comes out of such a hole is called black-body radiation. The energy flux from a unit surface is the energy density times c and times the geometric factor:
$$I = \frac{cE}{V}\int_0^{\pi/2}\cos\theta\,\sin\theta\,\frac{d\theta}{2} = \frac{cE}{4V} = \sigma T^4\,. \qquad (97)$$
Landau & Lifshitz, Sect. 63 and Huang, Sect. 12.1.
4.2.3 Phonons
The specific heat of a crystal lattice can be calculated using the powerful
idea of quasi-particles: turning the set of strongly interacting atoms into a
set of weakly interacting waves. In this way one considers the oscillations of
the atoms as acoustic waves with three branches (two transversal and one
longitudinal) ωi = ui k where ui is the respective sound velocity. Debye took
this expression for the spectrum and imposed a maximal frequency ωmax so
that the total number of degrees of freedom is equal to 3 times the number
of atoms:
3 ω∫max 2
4πV ∑ ω dω 3
V ωmax
= = 3N . (98)
(2π)3 i=1 u3i 2π 2 u3
0
At T ≪ Θ, for the specific heat we have the same cubic law as for photons:
$$C = N\,\frac{12\pi^4}{5}\,\frac{T^3}{\Theta^3}\,. \qquad (101)$$
For liquids, there is only one (longitudinal) branch of phonons, so C = N(4π⁴/5)(T/Θ)³, which works well for liquid helium at low temperatures.
At T ≫ Θ we have classical specific heat (Dulong-Petit law) C = 3N .
Debye temperatures of different solids are between 100 and 1000 degrees
Kelvin. We can also write the free energy of the phonons as a sum/integral over frequencies of the single-oscillator expression:
$$F = 9NT\left(\frac{T}{\Theta}\right)^3\int_0^{\Theta/T}z^2\ln\left(1-e^{-z}\right)dz = NT\left[3\ln\left(1-e^{-\Theta/T}\right)-D(\Theta/T)\right], \qquad (102)$$
where $D(x) = \frac{3}{x^3}\int_0^x\frac{z^3\,dz}{e^z-1}$ is the Debye function.
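Differentiating (102) gives the standard Debye interpolation for the specific heat, C = 9N(T/Θ)³∫₀^{Θ/T} z⁴eᶻ(eᶻ−1)⁻² dz, which reproduces both the cubic law (101) at T ≪ Θ and the Dulong-Petit value 3N at T ≫ Θ; a minimal numerical sketch (k_B = 1):

    import numpy as np
    from scipy.integrate import quad

    def C_debye(t, N=1.0):
        # specific heat of the Debye model at t = T/Theta
        integrand = lambda z: z**4 * np.exp(z) / np.expm1(z)**2
        val, _ = quad(integrand, 0.0, 1.0 / t)
        return 9.0 * N * t**3 * val

    print(C_debye(0.05), 12 * np.pi**4 / 5 * 0.05**3)   # low T: cubic law (101)
    print(C_debye(20.0))                                # high T: -> 3N (Dulong-Petit)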
One may ask why we didn't account for zero-point oscillations when considering photons in (95,96). Since the frequency of photons is not restricted from above, the respective contribution seems to be infinite. How to make sense out of such infinities is considered in quantum electrodynamics; note that the zero oscillations of the electromagnetic field are real and manifest themselves, for example, in the Lamb shift of the levels of the hydrogen atom. In thermodynamics, zero oscillations of photons are of no importance.
Landau & Lifshitz, Sects. 64–66; Huang, Sect. 12.2
4.2.4 Bose gas of particles and Bose-Einstein condensation
We introduced the thermal wavelength λ = (2πℏ²/mT)^{1/2} and the function
$$g_a(z) = \frac{1}{\Gamma(a)}\int_0^\infty\frac{x^{a-1}\,dx}{z^{-1}e^x-1} = \sum_{i=1}^\infty\frac{z^i}{i^a}\,. \qquad (103)$$
One may wonder why we single out the contribution of the zero-energy level, as it is not supposed to contribute in the thermodynamic limit V → ∞. Yet this is not true at sufficiently low temperatures. Indeed, let us rewrite it, denoting by n₀ = z/(1 − z) the number of particles at p = 0:
$$\frac{n_0}{V} = \frac{1}{v} - \frac{g_{3/2}(z)}{\lambda^3}\,. \qquad (104)$$
The function g_{3/2}(z) behaves as shown in the figure: it monotonically grows while z changes from zero (µ = −∞) to unity (µ = 0). Recall that the chemical potential of bosons is non-positive (otherwise one would have infinite occupation numbers). At z = 1, the value is g_{3/2}(1) = ζ(3/2) ≈ 2.6 and the derivative is infinite. When the temperature and the specific volume
v = V/N are such that λ³/v > g_{3/2}(1) (notice that the thermal wavelength is now larger than the inter-particle distance), then there is a finite fraction of particles that occupies the zero-energy level. The graphic solution of (104) for a finite V can be seen in the figure below by plotting λ³/v (broken line) and g_{3/2}(z) (solid line). When V → ∞ we have a sharp transition at λ³/v = g_{3/2}(1), i.e. at T = T_c = 2πℏ²/m[vg_{3/2}(1)]^{2/3}: at T ≤ T_c we have z ≡ 1, that is µ ≡ 0. At T > T_c we obtain z by solving λ³/v = g_{3/2}(z).
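This solution is easy to carry out numerically using the series in (103) (a minimal sketch; the series truncation limits accuracy very close to z = 1):

    import numpy as np
    from scipy.optimize import brentq
    from scipy.special import zeta

    def g32(z, terms=5000):
        # g_{3/2}(z) = sum over i of z^i / i^{3/2}, the series in (103)
        i = np.arange(1, terms + 1)
        return np.sum(z**i / i**1.5)

    def fugacity(lam3_over_v):
        # graphic solution of (104): find z with g_{3/2}(z) = lambda^3/v
        if lam3_over_v >= zeta(1.5):     # 2.612...: condensed phase, z = 1 (mu = 0)
            return 1.0
        return brentq(lambda z: g32(z) - lam3_over_v, 0.0, 1.0 - 1e-9)

    print(fugacity(1.0))   # above Tc: z < 1
    print(fugacity(3.0))   # below Tc: z = 1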
[Figure: graphic solution of (104) — g_{3/2}(z) (solid line) and λ³/v (broken line); g_{3/2}(1) ≈ 2.6, with O(1/V) corrections near z = 1.]
Therefore, at the thermodynamic limit we put n0 = 0 at T > Tc and
n0 /N = 1 − (T /Tc )3/2 as it follows from (104). All thermodynamic relations
have now different expressions above and below Tc (upper and lower cases
respectively):
$$E = \frac{3}{2}PV = \frac{2\pi V}{mh^3}\int_0^\infty\frac{p^4\,dp}{z^{-1}\exp(p^2/2mT)-1} = \begin{cases}(3VT/2\lambda^3)\,g_{5/2}(z)\\(3VT/2\lambda^3)\,g_{5/2}(1)\end{cases} \qquad (105)$$
$$c_v = \begin{cases}(15v/4\lambda^3)\,g_{5/2}(z) - 9g_{3/2}(z)/4g_{1/2}(z)\\(15v/4\lambda^3)\,g_{5/2}(1)\end{cases} \qquad (106)$$
[Figure: isotherms of the ideal Bose gas; the transition line v_c(T) ∝ T^{−3/2}.]
At T < T_c the pressure is independent of the volume, which prompts the analogy with a first-order phase transition. Indeed, this is reminiscent of the properties of saturated vapor (particles with nonzero energy) in contact with liquid (particles with zero energy): changing the volume at fixed temperature, we change the fraction of the particles in the liquid but not the pressure. This is why the phenomenon is called Bose-Einstein condensation. Increasing temperature, we cause evaporation (particles leaving the condensate in our case), which increases c_v; after all the liquid evaporates (at T = T_c), c_v starts to decrease. It is sometimes said that it is a “condensation in momentum space”, but if we put the system in a gravity field, then there will be a spatial separation of two phases just like in a gas-liquid condensation (liquid at the bottom).
We can also obtain the entropy: above T_c by the usual formulas that follow from (84), and below T_c by just integrating the specific heat, $S = \int dE/T = N\int c_v(T)\,dT/T = 5E/3T = 2Nc_v/3$:
$$\frac{S}{N} = \begin{cases}(5v/2\lambda^3)\,g_{5/2}(z) - \log(z)\\(5v/2\lambda^3)\,g_{5/2}(1)\end{cases} \qquad (107)$$
The entropy is zero at T = 0, which means that the condensed phase has no entropy. At finite T all the entropy is due to the gas phase. Below T_c we can write S/N = (T/T_c)^{3/2}s = (v/v_c)s, where s is the entropy per gas particle: s = 5g_{5/2}(1)/2g_{3/2}(1). The latent heat of condensation per particle is Ts, so that it is indeed a phase transition of the first order.
To conclude, we have seen in this Section how quantum effects lead to
switching off degrees of freedom at low temperatures. Fermi and Bose sys-
tems reach the zero-entropy state at T = 0 in different ways. It is also
instructive to compare their chemical potentials:
[Figure: chemical potential µ(T) for Fermi and Bose gases.]
5 Non-ideal gases
Here we take into account a weak interaction between particles. There are
two limiting cases when the consideration is simplified:
i) when the typical range of interaction is much smaller than the mean dis-
tance between particles so that it is enough to consider only two-particle
interactions,
ii) when the interaction is long-range so that every particle effectively interacts with many other particles and one can apply some mean-field description.
Integrating over momenta, we get the partition function Z and the grand partition function $\mathcal{Z}$ as
$$Z(N,V,T) = \frac{1}{N!\lambda_T^{3N}}\int dr_1\ldots dr_N\,\exp[-U(r_1,\ldots,r_N)/T] \equiv \frac{Z_N(V,T)}{N!\lambda_T^{3N}}\,,$$
$$\mathcal{Z}(z,V,T) = \sum_{N=0}^{\infty}\frac{z^NZ_N}{N!\lambda_T^{3N}}\,. \qquad (108)$$
simultaneous interaction of three particles, etc. We can now write the grand potential Ω = −PV = −T ln 𝒵 and expand the logarithm in powers of z/λ_T³:
$$P = \frac{T}{\lambda_T^3}\sum_{l=1}^\infty b_lz^l\,, \qquad (110)$$
where
$$b_1 = 1\,,\qquad b_2 = \frac{1}{2}\lambda_T^{-3}\int f_{12}\,dr_{12}\,,$$
$$b_3 = \frac{1}{6}\lambda_T^{-6}\int\left(e^{-U_{123}/T}-e^{-U_{12}/T}-e^{-U_{23}/T}-e^{-U_{13}/T}+2\right)dr_{12}\,dr_{13} = \frac{1}{6}\lambda_T^{-6}\int\left(3f_{12}f_{13}+f_{12}f_{13}f_{23}\right)dr_{12}\,dr_{13}\,. \qquad (111)$$
In the square brackets here stand integrals like
$$\int dr = V\ \ \text{for}\ l=1\,,\qquad \int f(r_{12})\,dr_1\,dr_2 = V\int f(r)\,dr\ \ \text{for}\ l=2\,,\ \text{etc.}$$
Using the cluster expansion we can now show that the cluster integrals b_l indeed appear in the expansion (110). For l = 1, 2, 3 we saw that this is indeed so.
Denote by m_l the number of l-clusters and by {m_l} the whole set of m₁, .... In calculating Z_N we need to include the number of ways to distribute N particles over clusters, which is $N!/\prod_l(l!)^{m_l}$. We then must multiply it by the sum of all possible clusters in the power m_l divided by m_l! (since an exchange of all particles of one cluster with another cluster of the same size does not matter). Since the sum of all l-clusters is $b_l\,l!\,\lambda_T^{3(l-1)}V$, then
$$Z_N = N!\,\lambda_T^{3N}\sum_{\{m_l\}}\prod_l\left(b_l\lambda_T^{-3}V\right)^{m_l}\!/m_l!\,.$$
Here we used $N = \sum_l lm_l$. The problem here is that the sum over different partitions {m_l} needs to be taken under this restriction too, and this is technically very cumbersome. Yet when we pass to calculating the grand canonical partition function¹³ and sum over all possible N, we obtain an unrestricted summation over all possible {m_l}. Writing $z^N = z^{\sum_l lm_l} = \prod_l(z^l)^{m_l}$ we get
$$\mathcal{Z} = \sum_{m_1,m_2,\ldots=0}^{\infty}\prod_{l=1}^{\infty}\left(\frac{Vb_lz^l}{\lambda_T^3}\right)^{m_l}\frac{1}{m_l!} = \left[\sum_{m_1=0}^\infty\frac{1}{m_1!}\left(\frac{Vb_1z}{\lambda_T^3}\right)^{m_1}\right]\left[\sum_{m_2=0}^\infty\frac{1}{m_2!}\left(\frac{Vb_2z^2}{\lambda_T^3}\right)^{m_2}\right]\cdots = \exp\left(\frac{V}{\lambda_T^3}\sum_{l=1}^\infty b_lz^l\right). \qquad (114)$$
We can now reproduce (110) and write the total number of particles:
$$PV = -\Omega = T\ln\mathcal{Z}(z,V) = \frac{TV}{\lambda_T^3}\sum_{l=1}^\infty b_lz^l\,, \qquad (115)$$
$$\frac{1}{v} = \frac{z}{V}\,\frac{\partial\ln\mathcal{Z}}{\partial z} = \lambda_T^{-3}\sum_{l=1}^\infty l\,b_lz^l\,. \qquad (116)$$
¹³ Sometimes the choice of the ensemble is dictated by the physical situation, sometimes by a technical convenience, like now. The equation of state must be the same in the canonical and grand canonical ensembles, as we expect the pressure on the wall restricting the system to be equal to the pressure measured inside.
To get the equation of state one must now express z via v/λ³ from (116) and substitute into (115). That generates the series called the virial expansion:
$$\frac{Pv}{T} = \sum_{l=1}^\infty a_l(T)\left(\frac{\lambda_T^3}{v}\right)^{l-1}. \qquad (117)$$
The dimensionless virial coefficients can be expressed via the cluster coefficients, i.e. they depend on the interaction potential and temperature:
$$a_1 = b_1 = 1\,,\quad a_2 = -b_2\,,\quad a_3 = 4b_2^2 - 2b_3 = -\lambda_T^{-6}\int f_{12}f_{13}f_{23}\,dr_{12}\,dr_{13}/3\,,\ \ldots$$
can be estimated by splitting the integral into two parts: from 0 to r₀ (where we can neglect the exponent, assuming u large and positive) and from r₀ to ∞ (where we can assume a small negative energy, |u| ≪ T, and expand the exponent). That gives
$$B(T) = b - \frac{a}{T}\,,\qquad b = \frac{2\pi r_0^3}{3}\,,\qquad a \equiv -2\pi\int_{r_0}^\infty u(r)r^2\,dr > 0\,. \qquad (119)$$
Generally, B(T) is negative at low and positive at high temperatures. We shall see that for Coulomb interaction the correction to pressure (128) is always negative, while in this case it is positive at high temperatures, where molecules hit each other often, and negative at low temperatures, when the long-range attraction between molecules decreases the pressure. Since NB/V < Nb/V ≪ 1, the correction is small. Note that a/bT ≪ 1 since we assume weak interaction, u ≪ T.
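The sign change of B(T) is easy to see numerically; the sketch below evaluates B(T) = −2π∫(e^{−u/T}−1)r²dr for a Lennard-Jones potential (the potential and reduced units ε = σ = 1 are assumptions for illustration):

    import numpy as np
    from scipy.integrate import quad

    def B(T):
        # second virial coefficient for u(r) = 4[(1/r)^12 - (1/r)^6]
        u = lambda r: 4 * ((1 / r)**12 - (1 / r)**6)
        f = lambda r: (np.exp(-u(r) / T) - 1.0) * r**2
        val, _ = quad(f, 1e-6, 50.0, limit=200)
        return -2 * np.pi * val

    for T in (0.5, 1.0, 3.4, 10.0):
        print(T, B(T))   # negative at low T, positive at high T (Boyle point ~ 3.4)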
While by its very derivation the formula (120) applies only to a dilute gas, one may desire to change it a bit so that it can (at least qualitatively) describe the limit of incompressible liquid. That would require the pressure to go to infinity when the density reaches some value. This is usually done by replacing in (120) 1 + bn by (1 − bn)⁻¹, which is equivalent for bn ≪ 1 but for bn → 1 gives P → ∞. The resulting equation of state is called the van der Waals equation:
$$\left(P + an^2\right)(1 - nb) = nT\,. \qquad (121)$$
There is though an alternative way to obtain (121) without assuming the gas dilute. This is some variant of the mean field, even though it is not a first step of any consistent procedure. Namely, we assume that every molecule moves in some effective field U_e(r) which is a strong repulsion (U_e → +∞) in some region of volume bN and an attraction of order −aN/V outside:
$$F - F_{id} \approx -TN\ln\left\{\int e^{-U_e(r)/T}\,dr/V\right\} = -TN\left[\ln(1-bn) + \frac{aN}{VT}\right]. \qquad (122)$$
Differentiating (122) with respect to V gives (121). That “derivation” also
helps understand better the role of the parameters b (excluded volume) and
a (mean interaction energy per molecule). From (122) one can also find the
entropy of the van der Waals gas S = −(∂F/∂T )V = Sid + N ln(1 − nb)
and the energy E = Eid − N 2 a/V , which are both lower than those for an
ideal gas, while the sign of the correction to the free energy depends on the
temperature. Since the correction to the energy is T -independent then CV
is the same as for the ideal gas.
Let us now look closer at the equation of state (121). The set of isotherms
is shown on the figure:
62
P V µ J
C E
D L Q
E D
N
L J
Q J E C
D
N L C N
V Q P P
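In reduced variables p = P/P_c, v = V/V_c, t = T/T_c, equation (121) takes the standard universal form (p + 3/v²)(3v − 1) = 8t, and one can check directly that an unstable branch with (∂P/∂V)_T > 0 appears only below T_c (a minimal sketch):

    import numpy as np

    def p_vdw(v, t):
        # reduced van der Waals isotherm: (p + 3/v^2)(3v - 1) = 8t
        return 8 * t / (3 * v - 1) - 3 / v**2

    v = np.linspace(0.5, 5.0, 2000)
    for t in (0.85, 1.0, 1.1):
        rising = np.any(np.diff(p_vdw(v, t)) > 0)   # (dP/dV)_T > 0 somewhere?
        print(f"T/Tc = {t}: unstable branch: {rising}")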
Since it is expected to describe both gas and liquid, it must show a phase transition. Indeed, we see the region with (∂P/∂V)_T > 0 at the lower isotherm in the first figure. When the pressure corresponds to the level NLC, it is clear that L is an unstable point and cannot be realized. But which stable point is realized, N or C? To get the answer, one must minimize the Gibbs potential G(T, P, N) = Nµ(T, P), since we have T and P fixed. For one mole, integrating the relation dµ(T, P) = −sdT + vdP at constant temperature, we find $G = \mu = \int v(P)\,dP$. It is clear that the pressure that corresponds to D (having equal areas below and above the horizontal line) separates the absolute minimum at the left branch Q (liquid-like) from that on the right one C (gas-like). The states E (over-cooled or over-compressed gas) and N (overheated or overstretched liquid) are metastable, that is, they are stable with respect to small perturbations, but they do not correspond to the global minimum of the chemical potential. We thus conclude that the true equation of state must have isotherms that look as follows:
[Figure: corrected isotherms with horizontal coexistence segments below T_c.]
The horizontal part of each such isotherm lies at the pressure P(T) that corresponds to the point D, where the two phases coexist. On the other hand, we see that if the temperature is higher than some T_c (critical point), the isotherm is monotonic and there is no phase transition. The critical point was discovered by Mendeleev (1860), who also built the periodic table of elements. At the critical temperature the dependence P(V) has an inflection point: (∂P/∂V)_T = (∂²P/∂V²)_T = 0. According to (33), the fluctuations must be large at the critical point (more detail in the next chapter).
5.3 Coulomb interaction and screening
We can now estimate the electrostatic contribution to the energy of the system of N particles (what is called the correlation energy):
$$\bar{U} \simeq -N\frac{e^2}{r_D} \simeq -\frac{N^{3/2}e^3}{\sqrt{VT}} = -\frac{A}{\sqrt{VT}}\,. \qquad (125)$$
The (positive) addition to the specific heat is
$$\Delta C_V = \frac{A}{2V^{1/2}T^{3/2}} \simeq N\frac{e^2}{r_DT} \ll N\,. \qquad (126)$$
One can get the correction to the entropy by integrating the specific heat:
$$\Delta S = -\int_T^\infty\frac{C_V(T')\,dT'}{T'} = -\frac{A}{3V^{1/2}T^{3/2}}\,. \qquad (127)$$
We set the limits of integration here so as to assure that the effect of screening disappears at large temperatures. We can now get the correction to the free energy and the pressure:
$$\Delta F = \bar{U} - T\Delta S = -\frac{2A}{3V^{1/2}T^{1/2}}\,,\qquad \Delta P = -\frac{A}{3V^{3/2}T^{1/2}}\,. \qquad (128)$$
The total pressure is P = NT/V − A/3V^{3/2}T^{1/2}; the decrease at small V (see figure) hints at the possibility of a phase transition, which indeed happens (droplet creation) for electron-hole plasma in semiconductors, even though our calculation does not work at those concentrations.
[Figure: P(V) compared with the ideal-gas isotherm.]
Now, we can do all the consideration in a more consistent way, calculating exactly the value of the constant A. To calculate the correlation energy of electrostatic interaction, one needs to multiply every charge by the potential created by the other charges at its location. In the estimates, we took the Coulomb law for the potential around every charge, while it must differ as the distance increases. Indeed, the electrostatic potential ϕ(r) around an ion determines the distribution of ions (+) and electrons (−) by the Boltzmann formula n±(r) = n₀ exp[∓eϕ(r)/T], while the charge density e(n₊ − n₋) in its turn determines the potential by the Poisson equation:
$$\Delta\phi = -4\pi e(n_+-n_-) = -4\pi en_0\left(e^{-e\phi/T}-e^{e\phi/T}\right) \approx \frac{8\pi e^2n_0}{T}\,\phi\,, \qquad (129)$$
where we expanded the exponents assuming the weakness of the interaction. This equation has a central-symmetric solution ϕ(r) = (e/r)exp(−κr), where κ² = 8πe²n₀/T ≡ r_D⁻². We are interested in this potential near the ion, i.e. at small r: ϕ(r) ≈ e/r − eκ, where the first term is the field of the ion itself while the second term is precisely what we need, i.e. the contribution of all other charges.
We can now write the energy of every ion and electron as −e²κ and get the total electrostatic energy multiplying by the number of particles (N = 2n₀V) and dividing by 2 so as not to count every couple of interacting charges twice:
$$\bar{U} = -n_0V\kappa e^2 = -\sqrt{\pi}\,\frac{N^{3/2}e^3}{\sqrt{VT}}\,. \qquad (130)$$
Comparing with the rough estimate (125), we see that we just added the factor √π.
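One can verify symbolically that the screened potential solves the linearized equation (129) away from the origin (a minimal sketch):

    import sympy as sp

    r, kappa, e = sp.symbols('r kappa e', positive=True)
    phi = e * sp.exp(-kappa * r) / r

    # radial Laplacian in 3d, valid for r > 0
    lap = sp.diff(r**2 * sp.diff(phi, r), r) / r**2
    print(sp.simplify(lap - kappa**2 * phi))   # 0: Delta(phi) = kappa^2 phi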
The Debye-Hückel consideration is the right way to account for the first-order corrections in the small parameter e²n₀^{1/3}/T. One cannot, though, get the next corrections within this method [by further expanding the exponents in (129)]. That would miss multi-point correlations, which contribute to the next orders. Indeed, the existence of an ion at some point influences not only the probability of having an electron at some other point; together they influence the probability of the charge distribution in the rest of the space. To account for multi-point correlations, one needs Bogolyubov's method of correlation functions. Such functions are multi-point joint probabilities to find simultaneously particles at given places. The correlation energy is expressed via the two-point correlation function w_{ab}, where the indices mark both the type of particles (electrons or ions) and the positions r_a and r_b:
$$E = \frac{1}{2}\sum_{a,b}\frac{N_aN_b}{V^2}\int\int u_{ab}\,w_{ab}\,dV_a\,dV_b\,. \qquad (131)$$
Here u_{ab} is the energy of the interaction. The pair correlation function is determined by the Gibbs distribution integrated over the positions of all particles except the given pair:
$$w_{ab} = V^{2-N}\int\exp\left[\frac{F - F_{id} - U(r_1\ldots r_N)}{T}\right]dV_1\ldots dV_{N-2}\,. \qquad (132)$$
Here
$$U = u_{ab} + \sum_c\left(u_{ac}+u_{bc}\right) + \sum_{c,d\neq a,b}u_{cd}\,.$$
Expanding (132) in U/T, we get terms like u_{ab}w_{ab} and in addition (u_{ac} + u_{bc})w_{abc}, which involves the third particle c and the triple correlation function that one can express via the integral similar to (132):
$$w_{abc} = V^{3-N}\int\exp\left[\frac{F - F_{id} - U(r_1\ldots r_N)}{T}\right]dV_1\ldots dV_{N-3}\,. \qquad (133)$$
One then observes that the equation on w_{ab} is not closed: it contains w_{abc}; the similar equation on w_{abc} will contain w_{abcd}, etc. The Debye-Hückel approximation corresponds to closing this hierarchical system of equations already at the level of the first equation (134), putting w_{abc} ≈ w_{ab}w_{bc}w_{ac} and assuming ω_{ab} = w_{ab} − 1 ≪ 1, that is, assuming that two particles rarely come close while three particles never come together:
$$\frac{\partial\omega_{ab}}{\partial r_b} = -\frac{1}{T}\frac{\partial u_{ab}}{\partial r_b} - (VT)^{-1}\sum_cN_c\int\omega_{ac}\frac{\partial u_{bc}}{\partial r_b}\,dV_c\,. \qquad (135)$$
For other contributions to w_{abc}, the integral turns into zero due to isotropy. This is the general equation, valid for any form of interaction. For Coulomb interaction, we can turn the integral equation (135) into a differential equation by using ∆r⁻¹ = −4πδ(r). For that we differentiate (135) once more:
$$\Delta\omega_{ab}(r) = \frac{4\pi z_az_be^2}{T}\,\delta(r) + \frac{4\pi z_be^2}{TV}\sum_cN_cz_c\,\omega_{ac}(r)\,. \qquad (136)$$
The dependence on ion charges and types is trivial, ω_{ab}(r) = z_az_bω(r), and we get ∆ω = 4πe²δ(r)/T + κ²ω, which is (129) with the delta-function enforcing the condition at zero. We see that the pair correlation function satisfies the same equation as the potential. Substituting the solution ω(r) = −(e²/rT)exp(−κr) into w_{ab}(r) = 1 + z_az_bω(r), and that into (131), one finds that the contribution of 1 vanishes because of electro-neutrality, while the term linear in ω gives (130). To get to the next order, one considers (134) together with the equation for w_{abc}, where one expresses w_{abcd} via w_{abc}.
After seeing how screening works, it is appropriate to ask what happens
when there is no screening in a system with a long-range interaction. One
example of that is gravity. Indeed, thermodynamics is very peculiar in this
case. Arguably, the most dramatic manifestation is the Jeans instability of
sufficiently large systems which leads to gravitational collapse, creation of
stars and planets and, eventually, appearance of life.
For more details on Coulomb interaction, see Landau & Lifshitz, Sects. 78, 79.
The quantum (actually quasi-classical) variant of such a mean-field consideration is called the Thomas-Fermi method (1927) and is traditionally studied in the courses of quantum mechanics, as it is applied to the electron distribution in large atoms (such placement is probably right — despite the method being stat-physical — because objects of study are more important than methods). In this method we consider the effect of electrostatic interaction on a degenerate electron gas at zero temperature. According to the Fermi distribution (88), the maximal kinetic energy (Fermi energy) is related to the local concentration n(r) by p₀²/2m = (3π²n)^{2/3}ℏ²/2m, which is also the expression for the chemical potential at zero temperature. We need to find the electrostatic potential ϕ(r) which determines the interaction energy for every electron, −eϕ. The sum of the chemical potential and the interaction energy, p₀²/2m − eϕ = −eϕ₀, must be space-independent, otherwise the electrons would drift to the places with lower energy. The constant ϕ₀ is zero for neutral atoms and negative for ions. We can now relate the local electron density n(r) to the local potential ϕ(r): p₀²/2m = eϕ − eϕ₀ = (3π²n)^{2/3}ℏ²/2m — that relation one must now substitute into the Poisson equation ∆ϕ = 4πen ∝ (ϕ − ϕ₀)^{3/2}. The boundary condition is ϕ → Ze/r at r → 0, where Z is the charge of the nucleus. This equation has a power-law solution at large distances, ϕ ∝ r⁻⁴, n ∝ r⁻⁶, which does not make much sense, as atoms are supposed to have finite sizes. Indeed, at large distances (∝ Z^{1/3}), the quasi-classical description breaks down as the quantum wavelength becomes comparable to the distance r. The description
also is inapplicable below the Bohr radius. The Thomas-Fermi approximation works well for large atoms, where there is an intermediate interval of distances (Landau & Lifshitz, Quantum Mechanics, Sect. 70).
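As a quick check of the power-law tail, one can verify symbolically that χ = 144/x³ is an exact solution of the dimensionless Thomas-Fermi equation χ'' = χ^{3/2}/√x (the standard reduction of ∆ϕ ∝ (ϕ − ϕ₀)^{3/2}, which itself is assumed rather than derived above) — this is precisely the ϕ ∝ r⁻⁴ asymptote:

    import sympy as sp

    x = sp.symbols('x', positive=True)
    chi = 144 / x**3
    # Thomas-Fermi equation chi'' = chi^{3/2}/sqrt(x): the residual is zero
    print(sp.simplify(sp.diff(chi, x, 2) - chi**sp.Rational(3, 2) / sp.sqrt(x)))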
6 Phase transitions
6.1 Thermodynamic approach
The main theme of this Chapter is the competition (say, in minimizing the free energy) between the interaction energy, which tends to order systems, and the entropy, which brings disorder. Upon the change of some parameters, systems can undergo a phase transition from more to less ordered states. We start this Chapter from the phenomenological approach to the transitions of both first and second orders. We then proceed to develop a microscopic statistical theory based on the Ising model.
6.1.1 Necessity of the thermodynamic limit
The equation of state is obtained by eliminating z from the equations that give P(v) in a parametric form — see (115,116):
$$\frac{P}{T} = \frac{1}{V}\ln\mathcal{Z}(z,V) = \frac{1}{V}\ln\left[1 + zZ(1,V,T) + \ldots + z^{N_m}Z(N_m,V,T)\right],$$
$$\frac{1}{v} = \frac{z}{V}\,\frac{\partial\ln\mathcal{Z}(z,V)}{\partial z} = \frac{zZ(1,V,T) + 2z^2Z(2,V,T) + \ldots + N_mz^{N_m}Z(N_m,V,T)}{V\left[1 + zZ(1,V,T) + \ldots + z^{N_m}Z(N_m,V,T)\right]}\,.$$
[Figure: P(z) and 1/v(z) developing a singularity at z₀ as V → ∞, and the resulting isotherms P(v) for first-order and second-order transitions.]
[Figure: P–T phase diagram with triple and critical points and the coexistence lines G–L, L–S, G–S; the corresponding T–V diagram; chemical potentials µ₁(T), µ₂(T) crossing at the transition (L > 0).]
The Clausius-Clapeyron equation relates the slope of the coexistence line to the latent heat L:
$$\frac{dP}{dT} = \frac{s_1-s_2}{v_1-v_2} = \frac{L}{T(v_2-v_1)}\,. \qquad (139)$$
Since the entropy of a liquid is usually larger than that of a solid, L > 0; that is, heat is absorbed upon melting and released upon freezing. Most substances also expand upon melting, so the solid-liquid equilibrium line has dP/dT > 0, as on the P-T diagram above. Water, on the contrary, contracts upon melting, so the slope of the melting curve is negative (fortunate for fish and unfortunate for the Titanic, ice floats on water). Note that the symmetries of the solid and liquid states are different, so that one cannot continuously transform a solid into a liquid. That means that the melting line starts on another line and goes to infinity, since it cannot end in a critical point (like the liquid-gas line does).
The Clausius-Clapeyron equation allows one, in particular, to obtain the pressure of vapor in equilibrium with liquid or solid. In this case, v₁ ≪ v₂. We may treat the vapor as an ideal gas, so that v₂ = T/P, and (139) gives d ln P/dT = L/T². We may further assume that L is approximately independent of T and obtain P ∝ exp(−L/T), which is a fast-increasing function of temperature. Landau & Lifshitz, Sects. 81–83.
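For example, integrating d ln P/dT = L/T² with constant L from a reference point (T₀, P₀) gives P(T) = P₀ exp(L/T₀ − L/T); a minimal sketch for water vapor (the latent heat 40.7 kJ/mol and the 373 K, 1 atm reference are assumed, illustrative numbers):

    import numpy as np

    L_over_k = 40.7e3 / 8.314     # latent heat in temperature units, ~4900 K
    T0, P0 = 373.0, 1.0           # reference: boiling at 1 atm

    def P_vap(T):
        return P0 * np.exp(L_over_k / T0 - L_over_k / T)

    for T in (353.0, 373.0, 393.0):
        print(T, P_vap(T))        # ~0.47, 1.0, ~1.95 atm: steep growth with T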
[Figure: chemical potentials µ₁(P), µ₂(P) near a continuous transition.]
Other examples of continuous phase transitions (i.e. such that correspond to a continuous change in the system) are all related to a change of symmetry upon the change of P or T. Since symmetry is a qualitative characteristic, it can change even upon an infinitesimal change (for example, a ferromagnetic magnetization, however small, breaks isotropy). Here too, every phase can exist only on one side of the transition point. A transition with continuous first derivatives of the thermodynamic potentials is called a second-order phase transition. Because the phases end in the point of transition, such a point must be singular for the thermodynamic potential, and indeed second derivatives, like the specific heat, are generally discontinuous. One set of such transitions is related to shifts of atoms in a crystal lattice: while close to the transition such a shift is small (i.e. the state of matter is almost the same), the symmetry of the lattice changes abruptly at the transition point. Another set is the spontaneous appearance of a macroscopic magnetization (i.e. ferromagnetism) below the Curie temperature. The transition to superconductivity is of the second order. A variety of second-order phase transitions happens in liquid crystals, etc. Let us stress that some transitions with a symmetry change are first-order (like melting), but all second-order phase transitions correspond to a symmetry change.
6.1.4 Landau theory
To describe general properties of second-order phase transitions, Landau suggested characterizing the symmetry breaking by some order parameter η, which is zero in the symmetric phase and nonzero in the non-symmetric phase. An example of an order parameter is magnetization. The choice of order parameter is non-unique; to reduce arbitrariness, it is usually required to transform linearly under the symmetry transformation. The thermodynamic potential can be formally considered as G(P, T, η), even though η is not an independent parameter and must be found as a function of P, T from requiring the minimum of G. We can now consider the thermodynamic potential near the transition as a series in small η:
G(P, T, η) = G0 + A(P, T )η 2 + B(P, T )η 4 . (140)
The linear term is absent to keep the first derivative continuous at η = 0. The
coefficient B must be positive since arbitrarily large values of η must cost a
lot of energy. The coefficient A must be positive in the symmetric phase when
minimum in G corresponds to η = 0 (left figure below) and negative in the
non-symmetric phase where η ̸= 0. Therefore, at the transition Ac (P, T ) = 0
and Bc (P, T ) > 0:
[Figure: G(η) for A > 0 (single minimum at η = 0) and A < 0 (two symmetric minima at η ≠ 0).]
In the lowest order in η the entropy is S = −∂G/∂T = S0 + a2 (T − Tc )/2B at
T < Tc and S = S0 at T > Tc . Entropy is lower at lower-temperature phase
(which is generally less symmetric). Specific heat Cp = T ∂S/∂T has a jump
at the transitions: ∆Cp = a2 Tc /2B. Specific heat increases when symmetry
is broken since more types of excitations are possible.
If symmetries allow the cubic term C(P, T)η³ (like in a gas or liquid near the critical point discussed in Sect. 7.2 below), then one generally has a first-order transition, say, when A < 0 and C changes sign:
[Figure: G(η) with a cubic term; a second minimum develops and becomes the global one, giving a first-order transition.]
has one solution η(h) above the transition and may have three solutions (two stable, one unstable) below the transition:
[Figure: η(h) above, at, and below T_c; the multivalued branch below T_c gives hysteresis.]
The similarity to the van der Waals isotherm is not accidental: changing the field at T < T_c, one encounters a first-order phase transition at h = 0, where the two phases with η = ±√(a(T_c − T)/2B) coexist. We see that p − p_c is analogous to h and 1/v − 1/v_c to the order parameter (magnetization) η.
The susceptibility is
$$\chi = \left(\frac{\partial\eta}{\partial h}\right)_{h=0} = \frac{1}{2a(T-T_c)+12B\eta^2} = \begin{cases}[2a(T-T_c)]^{-1} & \text{at } T>T_c\\ [4a(T_c-T)]^{-1} & \text{at } T<T_c\end{cases} \qquad (144)$$
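One can reproduce these mean-field results by direct numerical minimization of (140) with A = a(T − T_c) (a minimal sketch with a = B = T_c = 1):

    import numpy as np
    from scipy.optimize import minimize_scalar

    a, B, Tc = 1.0, 1.0, 1.0

    def eta_min(T):
        # order parameter minimizing G - G0 = a(T - Tc) eta^2 + B eta^4
        G = lambda eta: a * (T - Tc) * eta**2 + B * eta**4
        return abs(minimize_scalar(G, bounds=(0, 10), method='bounded').x)

    for T in (0.5, 0.9, 0.99, 1.5):
        print(T, eta_min(T), np.sqrt(max(a * (Tc - T) / (2 * B), 0)))  # ~0 above Tc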
6.2 Ising model
of different nature and the technical difficulties related to the commutation relations of spin operators. It is remarkable that there exists one highly simplified approach that allows one to study systems as diverse as ferromagnetism, condensation and melting, order-disorder transitions in alloys, phase separation in binary solutions, and also to model phenomena in economics, sociology, genetics, and to analyze the spread of forest fires, etc. This approach is based on the consideration of lattice sites with a nearest-neighbor interaction that depends upon the manner of occupation of the neighboring sites. We shall formulate it initially in the language of ferromagnetism and then establish the correspondence to some other phenomena.
6.2.1 Ferromagnetism
Experiments show that ferromagnetism is associated with the spins of electrons (not with their orbital motion). Spin 1/2 may have two possible projections, so we consider lattice sites with elementary magnetic moments ±µ. We already considered (Sect. 2.5.1) this system in an external magnetic field H without any interaction between the moments and got the magnetization (35):
$$M = n\mu\,\frac{\exp(\mu H/T)-\exp(-\mu H/T)}{\exp(\mu H/T)+\exp(-\mu H/T)}\,. \qquad (145)$$
Of course, this formula gives no first-order phase transition upon the change of the sign of H, as in the Landau theory (143). The reason is that we did not account for the interaction (i.e. the B-term). The first phenomenological treatment of the interacting system was done by Weiss, who assumed that there appears some extra magnetic field proportional to the magnetization, which one adds to H and thus describes the influence that M causes upon itself:
$$M = N\mu\tanh\frac{\mu(H+\xi M)}{T}\,. \qquad (146)$$
Now put the external field to zero, H = 0. The resulting equation can be written as
$$\eta = \tanh\frac{T_c\eta}{T}\,, \qquad (147)$$
where we denoted η = M/µN and T_c = ξµ²N. At T > T_c there is a single solution, η = 0, while at T < T_c there are two more nonzero solutions, which exactly means the appearance of spontaneous magnetization. At T_c − T ≪ T_c one has η² = 3(T_c − T)/T_c, exactly as in the Landau theory (142).
[Figure: spontaneous magnetization η(T/T_c), vanishing at T = T_c.]
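Solving (147) numerically shows both the nonzero root below T_c and its square-root vanishing near T_c (a minimal sketch with T_c = 1):

    import numpy as np
    from scipy.optimize import brentq

    def eta(T, Tc=1.0):
        # nonzero solution of eta = tanh(Tc*eta/T) from (147); zero above Tc
        if T >= Tc:
            return 0.0
        return brentq(lambda e: e - np.tanh(Tc * e / T), 1e-12, 1.0)

    for T in (0.5, 0.9, 0.99):
        print(T, eta(T), np.sqrt(3 * (1 - T)))   # compare with sqrt(3(Tc-T)/Tc)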
One can compare T_c with experiments and find a surprisingly high ξ ∼ 10³–10⁴. That means that the real interaction between the moments is much higher than the interaction between neighboring dipoles, µ²n = µ²/a³. Frenkel and Heisenberg solved this puzzle (in 1928): it is not the magnetic energy but the difference of electrostatic energies of electrons with parallel and antiparallel spins, the so-called exchange energy, which is responsible for the interaction (parallel spins have an antisymmetric coordinate wave function and a much lower energy of interaction than antiparallel spins).
We can now at last write down the Ising model (formulated by Lenz in 1920 and solved in one dimension by his student Ising in 1925): we have the variable σᵢ = ±1 at every lattice site. The energy includes the interaction with the external field and between neighboring spins:
$$\mathcal{H} = -\mu H\sum_i^N\sigma_i + \frac{J}{4}\sum_{ij}\left(1-\sigma_i\sigma_j\right). \qquad (148)$$
we get:
$$\gamma J\,\frac{N-2N_+}{N} - T\ln\frac{N-N_+}{N_+} = 0\,. \qquad (149)$$
Here we can again introduce the variables η = M/µN and T_c = γJ/2 and reduce (149) to (147). We thus see that the Weiss approximation is indeed equivalent to the mean field. The only addition is that now we have the expression for the free energy, $2F/N = T_c(1-\eta^2) + T(1+\eta)\ln\frac{1+\eta}{2} + T(1-\eta)\ln\frac{1-\eta}{2}$, so that we can indeed make sure that the nonzero η at T < T_c correspond to minima. The figure shows the free energy plotted as a function of magnetization: it has exactly the form we assumed in the Landau theory (which, as we see, corresponds near T_c to the mean-field approximation). The energy is symmetrical with respect to flipping all the spins simultaneously, and the free energy is symmetric with respect to η ↔ −η. But the system at T < T_c lives in one of the minima (positive or negative η). When the symmetry of the state is less than the symmetry of the potential (or Hamiltonian), it is called spontaneous symmetry breaking.
[Figure: free energy F(η) for T < T_c, T = T_c, T > T_c.]
parallel to the previous one or not. If the next spin is opposite, it gives the energy J, and if it is parallel, the energy is zero. There are N − 1 links. The partition function is that of N − 1 two-level systems (37):
$$Z = 2\left(1+e^{-J/T}\right)^{N-1}. \qquad (150)$$
Here the factor 2 appears because there are two possible orientations of the first spin.
One can do it for a more general Hamiltonian (as in Exercise 1.2):
$$\mathcal{H} = -H\sum_{i=1}^N\sigma_i - J\sum_{i=1}^N\sigma_i\sigma_{i+1} = -\sum_{i=1}^N\left\{\frac{H}{2}(\sigma_i+\sigma_{i+1}) + J\sigma_i\sigma_{i+1}\right\}. \qquad (151)$$
Every factor in the product can be written as a matrix element $\langle\sigma_j|\hat{T}|\sigma_{j+1}\rangle = T_{\sigma_j\sigma_{j+1}} = \exp[\beta\{H(\sigma_j+\sigma_{j+1})/2 + J\sigma_j\sigma_{j+1}\}]$ of the transfer matrix
$$T = \begin{pmatrix}T_{1,1} & T_{1,-1}\\ T_{-1,1} & T_{-1,-1}\end{pmatrix} \qquad (154)$$
where $T_{1,1} = e^{\beta(J+H)}$, $T_{-1,-1} = e^{\beta(J-H)}$, $T_{-1,1} = T_{1,-1} = e^{-\beta J}$. The sum over σⱼ = ±1 corresponds to taking the trace of the matrix:
$$Z = \sum_{\{\sigma_i\}}T_{\sigma_1\sigma_2}T_{\sigma_2\sigma_3}\cdots T_{\sigma_N\sigma_1} = \sum_{\sigma_1=\pm1}\langle\sigma_1|\hat{T}^N|\sigma_1\rangle = \mathrm{trace}\;T^N. \qquad (155)$$
Therefore
$$Z = \lambda_1^N + \lambda_2^N\,. \qquad (156)$$
Therefore
$$F = -k_BT\log\left(\lambda_1^N+\lambda_2^N\right) = -k_BT\left[N\log\lambda_1 + \log\left(1+\left(\frac{\lambda_2}{\lambda_1}\right)^{\!N}\right)\right] \to -Nk_BT\log\lambda_1 \ \ \text{as}\ N\to\infty\,. \qquad (158)$$
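Equations (154)-(156) are easy to verify numerically: for a small chain with periodic boundary conditions, trace T^N coincides with the brute-force sum over all 2^N configurations (a minimal sketch; the parameter values are arbitrary):

    import numpy as np
    from itertools import product

    beta, J, H, N = 0.7, 1.0, 0.3, 10

    T = np.array([[np.exp(beta * (J + H)), np.exp(-beta * J)],
                  [np.exp(-beta * J),      np.exp(beta * (J - H))]])
    lam = np.linalg.eigvalsh(T)
    Z_transfer = np.sum(lam**N)          # lambda_1^N + lambda_2^N, eq. (156)

    Z_brute = 0.0
    for s in product((1, -1), repeat=N):
        # periodic chain: s[i-1] pairs site 0 with site N-1
        E = -sum(H * s[i] + J * s[i] * s[i - 1] for i in range(N))
        Z_brute += np.exp(-beta * E)

    print(Z_transfer, Z_brute)           # identical up to rounding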
We can improve the mean-field approximation by accounting exactly for the interaction of a given spin σ₀ with its γ nearest neighbors and replacing the interaction with the rest of the lattice by a new mean field H′ (this is called the Bethe-Peierls or BP approximation):
$$\mathcal{H}_{\gamma+1} = -\mu H'\sum_{j=1}^{\gamma}\sigma_j - \frac{J}{2}\,\sigma_0\sum_{j=1}^{\gamma}\sigma_j\,. \qquad (159)$$
we obtain
$$\eta = \frac{\gamma-1}{2}\ln\left[\frac{\cosh(\eta+\nu)}{\cosh(\eta-\nu)}\right] \qquad (160)$$
instead of (147) or (149). The condition that the derivatives with respect to η at zero are the same, (γ − 1)tanh ν = 1, gives the critical temperature:
$$T_c = J\,\ln^{-1}\left(\frac{\gamma}{\gamma-2}\right),\qquad \gamma\geq2\,. \qquad (161)$$
It is lower than the mean-field value γJ/2 and tends to it when γ → ∞ — the mean field is exact in an infinite-dimensional space. More importantly, it shows that there is no phase transition in 1d, where γ = 2 and T_c = 0 (in fact, BP is exact in 1d). Note that η is now not a magnetization, which is given by the mean spin σ̄₀ = sinh(2η)/[cosh(2η) + exp(−2ν)]. While at T > T_c both η = σ̄₀ = 0, BP gives nonzero energy and specific heat. Indeed, now Z = 2^{γ+1}cosh^γ(βJ/2) and F = −T ln Z = −(γ/β)ln cosh(βJ/2). The energy is E = ∂(Fβ)/∂β = (γJ/2)tanh(βJ/2) and the specific heat is C = (γJ²/8T²)cosh⁻²(J/2T), such that C ∝ T⁻² at T → ∞ (see Pathria 11.6 for more details):
[Figure: specific heat C(T/J) in the mean-field, BP, and exact 2d solutions, with T_c/J = 2, 1.44, 1.13 respectively.]
Bragg-Williams and Bethe-Peierls approximations are the first and the
second steps of some consistent procedure. When the space dimensionality
is large, then 1/d is a small parameter whose powers determine the contri-
butions of the subsequent approximations. Mean-field corresponds to the
total neglect of fluctuations, while BP accounts for them in the first approx-
imation. One can also say that it corresponds to the account of correlations:
indeed, correlations make fluctuations (like having many spins with the same
direction in a local neighborhood) more probable and require us to account
for them. The two-dimensional Ising model was solved exactly by Onsager
(1944). The exact solution shows the phase transition in two dimensions.
The main qualitative difference from the mean field is the divergence of the
specific heat at the transition: C ∝ − ln |1 − T /Tc |. This is the result of
fluctuations: the closer one is to T_c, the stronger are the fluctuations. The singularity of the specific heat is integrable, that is, for instance, the entropy change $S(T_1)-S(T_2) = \int_{T_1}^{T_2}C(T)\,dT/T$ is finite across the transition (and goes to zero when T₁ → T₂), and so is the energy change. Note also that the true T_c = J/2ln[(√2 − 1)⁻¹] is less than both the mean-field value T_c = γJ/2 = 2J and the BP value T_c = J/ln 2, also because of fluctuations (one needs a lower temperature to “freeze” the fluctuations and establish the long-range order).
6.2.2 Impossibility of phase coexistence in one dimension
finite in 1d at a finite temperature as in a disordered state. Let us stress
that for the above arguments it is important that the ground state is non-
degenerate so that the first excited state has a higher energy (degeneracy
leads to criticality).
6.2.3 Equivalent models
The grand partition function of the lattice gas (163) gives the equation of state in the implicit form (like in Sect. 6.1.1): P = T ln 𝒵/N and 1/v = (z/V)∂ln 𝒵/∂z. The correspondence with the Ising model can be established by saying that an occupied site has σ = 1 and
an unoccupied one has σ = −1. Then N_a = N₊ and N_{aa} = N₊₊. Recall that for the Ising model we had E = −µH(N₊ − N₋) + JN₊₋ = µHN + (Jγ − 2µH)N₊ − 2JN₊₊. Here we used the identity γN₊ = 2N₊₊ + N₊₋, which one derives by counting the number of lines drawn from every up spin to its nearest neighbors. The partition function of the Ising model can be written similarly to (163) with z = exp[(γJ − 2µH)/T]. Further correspondence can be established: the pressure P of the lattice gas can be expressed via the free energy per site of the Ising model, P ↔ −F/N + µH, and the inverse specific volume 1/v = N_a/N of the lattice gas is equivalent to N₊/N = (1 + M/µN)/2 = (1 + η)/2. We see that generally (for given N and T) the lattice gas corresponds to the Ising model with a nonzero field H, so that the transition is generally of the first order in this model. Indeed, when H = 0 we know that η = 0 for T > T_c, which gives a single point v = 2; to get the whole isotherm one needs to consider nonzero H, i.e. a fugacity different from exp(γJ/T). In the same way, the solutions of the zero-field Ising model at T < T_c give us two values of η, that is, two values of the specific volume for a given pressure P. Those two values, v₁ and v₂, precisely correspond to the two phases in coexistence at the given pressure. Since v = 2/(1 + η), then as T → 0 we have two roots: η → 1, which corresponds to v₁ → 1, and η → −1, which corresponds to v₂ → ∞. For example, in the mean-field approximation (149) we get (denoting B = µH)
$$P = B - \frac{\gamma J}{4}\left(1+\eta^2\right) - \frac{T}{2}\ln\frac{1-\eta^2}{4}\,,\qquad B = \frac{\gamma J}{2} - \frac{T}{2}\ln z\,,$$
$$v = \frac{2}{1+\eta}\,,\qquad \eta = \tanh\left(\frac{B}{T}+\frac{\gamma J\eta}{2T}\right). \qquad (164)$$
As usual in a grand canonical description, to get the equation of state one
expresses v(z) [in our case B(η)] and substitutes it into the equation for the
pressure. On the figure, the solid line corresponds to B = 0 at T < Tc where
we have a first-order phase transition with the jump of the specific volume,
the isotherms are shown by broken lines. The right figure gives the exact
two-dimensional solution.
[Figure: lattice-gas isotherms P/T_c versus v in the mean-field approximation (left) and from the exact 2d solution (right).]
gives the power close to 1/3. Also, lattice theories always give (for any T) 1/v₁ + 1/v₂ = 1, which is also a good approximation of the real behavior (the sum of vapor and liquid densities decreases linearly with the temperature increase, but very slowly). One can improve the lattice gas model by considering the continuous limit with the lattice constant going to zero and adding the pressure of the ideal gas.
Another equivalent model is that of the binary alloy, that is, a lattice consisting of two types of atoms. X-ray scattering shows that below some transition temperature there are two crystal sublattices, while there is only one lattice at higher temperatures. Here we need three different energies of inter-atomic interaction: E = ϵ₁N₁₁ + ϵ₂N₂₂ + ϵ₁₂N₁₂ = (ϵ₁ + ϵ₂ − 2ϵ₁₂)N₁₁ + γ(ϵ₁₂ − ϵ₂)N₁ + γϵ₂N/2. This model, described canonically, is equivalent to the Ising model with the free energy shifted by γ(ϵ₁₂ − ϵ₂)N₁ + γϵ₂N/2. We are interested in the case when ϵ₁ + ϵ₂ > 2ϵ₁₂, so that it is indeed preferable to have alternating atoms, and two sublattices may exist, at least at low temperatures. The phase transition is of the second order, with the specific heat observed to increase as the temperature approaches the critical value. Huang, Chapter 16 and Pathria, Chapter 12.
As we have seen, to describe the phase transitions of the second order
near Tc we need to describe strongly fluctuating systems. We shall study
fluctuations more systematically in the next section and return to critical
phenomena in Sects. 7.2 and 7.4.
7 Fluctuations
7.1 Thermodynamic fluctuations
Consider fluctuations of energy and volume of a given (small) subsystem. The
probability of a fluctuation is determined by the entropy change of the whole
system w ∝ exp(∆S0 ) which is determined by the minimal work needed for
a reversible creation of such a fluctuation: T ∆S0 = −Rmin . For example,
if the fluctuation is that the system starts moving as a whole with the ve-
locity v then the minimal work is the kinetic energy of the system, so that
the probability of such a fluctuation is w(v) ∝ exp(−M v 2 /2T ). Generally,
Rmin = ∆E + P0 ∆V − T0 ∆S, where ∆S, ∆E, ∆V relate to the subsystem.
Indeed, the energy change of the subsystem ∆E is equal to the work R done
on it (by something from outside the system) plus the work done by the rest
of the system −P0 ∆V0 = P0 ∆V plus the heat received from the rest of the
system T0 ∆S0 . Minimal work corresponds to ∆S0 = −∆S. In calculating
variations we also assume P, T equal to their mean values which are P0 , T0 .
We stress that we only assumed the subsystem to be small, i.e. ∆S₀ ≪ S₀, E ≪ E₀, V ≪ V₀, while the fluctuations can be substantial, i.e. ∆E can be comparable with E.
[Figure: small subsystem (V) inside a closed system (V₀, E₀, S₀); creating the fluctuation requires the minimal work R_min and lowers the total entropy by ∆S₀.]
(∂P/∂T)_V = (∂S/∂V)_T. That means the absence of cross-correlation, i.e. the respective quantities fluctuate independently¹⁶: ⟨∆T∆V⟩ = ⟨∆P∆S⟩ = 0. Indeed, choosing T and V as independent variables, we must express
$$\Delta S = \left(\frac{\partial S}{\partial T}\right)_V\Delta T + \left(\frac{\partial S}{\partial V}\right)_T\Delta V = \frac{C_v}{T}\Delta T + \left(\frac{\partial P}{\partial T}\right)_V\Delta V \qquad (167)$$
and obtain
$$w \propto \exp\left[-\frac{C_v}{2T^2}(\Delta T)^2 + \frac{1}{2T}\left(\frac{\partial P}{\partial V}\right)_T(\Delta V)^2\right]. \qquad (168)$$
In particular, $\langle(\Delta v)^2\rangle = N^{-2}\langle(\Delta V)^2\rangle$, which can be converted into the mean squared fluctuation of the number of particles in a fixed volume:
$$\Delta v = \Delta\frac{V}{N} = V\Delta\frac{1}{N} = -\frac{V}{N^2}\Delta N\,,\qquad \langle(\Delta N)^2\rangle = -T\frac{N^2}{V^2}\left(\frac{\partial V}{\partial P}\right)_T. \qquad (169)$$
We see that the fluctuations cannot be considered small near the critical
point where ∂V /∂P is large.
For a classical ideal gas with V = NT/P it gives ⟨(∆N)²⟩ = N. In this case, we can do more than considering small fluctuations (or large volumes). Namely, we can find the probability of fluctuations comparable to the mean value N̄ = N₀V/V₀. The probability for N (noninteracting) particles to be inside some volume V out of the total volume V₀ is
$$w_N = \frac{N_0!}{N!(N_0-N)!}\left(\frac{V}{V_0}\right)^{\!N}\left(\frac{V_0-V}{V_0}\right)^{\!N_0-N} \approx \frac{\bar{N}^N}{N!}\left(1-\frac{\bar{N}}{N_0}\right)^{\!N_0} \approx \frac{\bar{N}^N\exp(-\bar{N})}{N!}\,. \qquad (170)$$
¹⁶ Recall that the Gaussian probability distribution w(x, y) ∼ exp(−ax² − 2bxy − cy²) corresponds to the second moments ⟨x²⟩ = c/2(ac − b²), ⟨y²⟩ = a/2(ac − b²) and to the cross-correlation ⟨xy⟩ = b/2(b² − ac).
Here we assumed that N₀ ≫ N and N₀! ≈ (N₀ − N)!N₀^N. Note that N₀ disappeared from (170). The distribution (170) is called the Poisson law, which holds for independent events. The mean squared fluctuation is the same as for small fluctuations:
$$\langle(\Delta N)^2\rangle = \langle N^2\rangle - \bar{N}^2 = \exp(-\bar{N})\sum_{N=1}^\infty\frac{N\bar{N}^N}{(N-1)!} - \bar{N}^2 = \exp(-\bar{N})\left[\sum_{N=2}^\infty\frac{\bar{N}^N}{(N-2)!} + \sum_{N=1}^\infty\frac{\bar{N}^N}{(N-1)!}\right] - \bar{N}^2 = \bar{N}\,. \qquad (171)$$
Recall that the measurement volume is proportional to N̄. In particular, the probability that a given volume is empty (N = 0) decays exponentially with the volume. On the other hand, the probability to cram many more than the average number of particles into the volume decays with N in a factorial way, i.e. faster than exponentially: w_N ∝ exp[−N ln(N/N̄)]. One can check that near the maximum, at |N − N̄| ≪ N̄, the Poisson distribution coincides with the Gaussian distribution: w_N = (2πN̄)⁻¹/² exp[−(N − N̄)²/2N̄].
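Both the variance (171) and the Gaussian limit are easy to confirm numerically (a minimal sketch with N̄ = 50):

    import numpy as np
    from math import factorial

    Nbar = 50.0
    N = np.arange(0, 120)
    w = np.array([Nbar**n * np.exp(-Nbar) / factorial(n) for n in N])

    print(np.sum(N * w), np.sum(N**2 * w) - Nbar**2)   # mean and variance: both ~ Nbar
    gauss = np.exp(-(N - Nbar)**2 / (2 * Nbar)) / np.sqrt(2 * np.pi * Nbar)
    print(np.max(np.abs(w - gauss)))                   # small deviation near the maximum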
Of course, real molecules do interact, so that the statistics of their density
fluctuations deviate from the Poisson law, particularly near the critical point
where the interaction energy is getting comparable to the entropy contribu-
tion into the free energy.
Landau & Lifshitz, Sects. 20, 110–112, 114.
7.2 Spatial correlation of fluctuations
One can derive that by recalling (from Sect. 5.3) that (κ² − ∆)exp(−κr)/r = 4πδ(r), or directly: extend the integral to −∞ and then close the contour in the complex upper half-plane for the first exponent and in the lower half-plane for the second, so that the integral is determined by the respective poles k = ±iκ = ±ir_c⁻¹. We defined the correlation radius of fluctuations r_c = [2g(T)/ϕ₀(T)]^{1/2}. Far from any phase transition, the correlation radius is typically the mean distance between molecules.
Not only for the concentration, but also for other quantities, (175) is the general form of the correlation function at long distances. Near the critical point, ϕ₀(T) ∝ A(T)/V = α(T − T_c) decreases and the correlation radius increases. To generalize the Landau theory for an inhomogeneous η(r), one writes the space density of the thermodynamic potential, uniting (141) and (174).
Landau theory predicts r_c ∝ (T − T_c)⁻¹/² and the correlation function approaching the power law 1/r as T → T_c in 3d. Of course, those scalings are valid under the condition that the criterion (177) is satisfied, that is, not very close to T_c. As we have seen from the exact solution of the 2d Ising model, the true asymptotics at T → T_c (i.e. inside the fluctuation region) are different: r_c ∝ (T − T_c)⁻¹ and φ(r) = ⟨σ(0)σ(r)⟩ ∝ r⁻¹/⁴ at T = T_c in that case. Yet the fact of the radius divergence remains. It means the breakdown of the Gaussian approximation for the probability of fluctuations, since we cannot divide the system into independent subsystems. Indeed, far from the critical point, the probability distribution of the density has two approximately Gaussian peaks, one at the density of liquid n_l, another at the density of gas n_g. As we approach the critical point and the distance between the peaks becomes comparable to their widths, the distribution is non-Gaussian. In other words, one needs to describe a strongly interacting system near the critical point, which makes it similar to other great problems of physics (quantum field theory, turbulence).
[Figure: density distribution w(n) with peaks at n_g and n_l merging near the critical point.]
7.3 Different order parameters
7.3.1 Goldstone mode and Mermin-Wagner theorem
Consider the thermodynamic potential $g\sum_i|\nabla\eta_i|^2 + \alpha(T-T_c)\sum_i\eta_i^2 + b\left(\sum_i\eta_i^2\right)^2$. Here we assumed that the interaction is short-range and used the Ornstein-Zernike approximation for the spatial dependence. When T < T_c the minimum corresponds to breaking the O(n) symmetry, for example, by taking the first component [α(T_c − T)/2b]^{1/2} and the other components zero. Considering fluctuations, we put ([α(T_c − T)/2b]^{1/2} + η₁, η₂, ..., η_n) and obtain the thermodynamic potential $g\sum_i|\nabla\eta_i|^2 + 2\alpha(T_c-T)\eta_1^2$ + higher-order terms. That form means that only the longitudinal mode η₁ has a finite correlation length r_c = [2α(T_c − T)]⁻¹/². Almost uniform fluc-
tuations of the transverse modes do not cost any energy. This is an example
of the Goldstone theorem which claims that whenever continuous symmetry
is spontaneously broken (i.e. the symmetry of the state is less than the sym-
metry of the thermodynamic potential or Hamiltonian) then the mode must
exist with the energy going to zero with the wavenumber. This statement is
true beyond the mean-field approximation or Landau theory as long as the
force responsible for symmetry breaking is short-range. For a spin system,
the broken symmetry is rotational and the Goldstone mode is the excited
state where the spin turns as the location changes, as shown in the Figure.
That excitation propagates as a spin wave. For a solid, the broken symmetry
is translational and the Goldstone mode is a phonon.
[Figure: equivalent ground states of the ferromagnet; the Goldstone mode — a spin wave.]
Goldstone modes are easily excited by thermal fluctuations, and they destroy long-range order for d ≤ 2. Indeed, in less than three dimensions, the integral (175) at ϕ₀ = 0 describes a correlation function which grows with the distance:
$$\langle\Delta\eta(0)\Delta\eta(r)\rangle \propto \int\left(1-e^{ikr}\right)k^{-2}\,d^dk \approx \int_{1/L}^{1/r}k^{-2}\,d^dk \propto \frac{r^{2-d}-L^{2-d}}{d-2}\,. \qquad (178)$$
larger. That means that the state is actually disordered despite r_c = ∞: soft (Goldstone) modes with no energy price for long fluctuations (ϕ₀ = 0) destroy long-range order (this statement is called the Mermin-Wagner theorem). One can state this in another way by saying that the mean variance of the order parameter, ⟨(∆η)²⟩ ∝ L^{2−d}, diverges with the system size L → ∞ at d ≤ 2. In exactly the same way, phonons with ω_k ∝ k make 2d crystals impossible: the energy of the lattice vibrations is proportional to the squared atom velocity (which is the frequency ω_k times the displacement u_k), $T \simeq \omega_k^2u_k^2$; that makes the mean squared displacement proportional to $\langle u^2\rangle \propto \int d^dk\,u_k^2 = \int d^dk\,T/\omega_k^2 \propto L^{2-d}$ — in large enough samples the amplitude of displacement becomes comparable to the distance between atoms in the lattice, which means the absence of long-range order.
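The logarithmic divergence in d = 2 is easy to see numerically by summing the Gaussian (spin-wave) modes on an L × L lattice (a minimal sketch in arbitrary units, T = g = 1):

    import numpy as np

    def variance(L):
        # <(Delta eta)^2> ~ (1/L^2) * sum over modes of 1/k^2, zero mode excluded
        k = 2 * np.pi * np.fft.fftfreq(L)
        kx, ky = np.meshgrid(k, k)
        k2 = kx**2 + ky**2
        k2[0, 0] = np.inf
        return np.sum(1.0 / k2) / L**2

    for L in (16, 64, 256, 1024):
        print(L, variance(L))   # grows ~ ln(L): no long-range order in 2d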
Let us consider the so-called XY model, which describes a system of two-dimensional spins s = (s cos φ, s sin φ) with ferromagnetic interaction, i.e. with the Hamiltonian
$$\mathcal{H} = -J\sum_{i,j}\mathbf{s}(i)\cdot\mathbf{s}(j) = -Js^2\sum_{i,j}\cos(\varphi_i-\varphi_j)\,. \qquad (179)$$
$$\approx \exp\left[-(2\pi\beta Js^2)^{-1}\ln(\pi r/a)\right] = \left(\frac{\pi r}{a}\right)^{-1/2\pi\beta Js^2}. \qquad (181)$$
Here we used the formula of Gaussian integration
$$\int_{-\infty}^\infty d\varphi\,e^{-A\varphi^2/2+iJ\varphi} = \sqrt{2\pi/A}\;e^{-J^2A^{-1}/2}\,. \qquad (182)$$
longitudinal sound exists in fluids as well, so it is transverse sound which is a defining property of a solid). Recall that our consideration (179-181) was for sufficiently low temperatures. One then asks if the power-law behavior disappears and a finite correlation radius appears above some temperature. The answer is that, apart from the Goldstone modes, there is another set of excitations which destroy the order and lead to a finite correlation radius. For a 2d crystal, the order is destroyed by randomly placed dislocations, while for a spin system or a condensate — by vortices. As opposed to the initial spins or atoms that create the crystal or condensate and interact locally, vortices and dislocations have a long-range (logarithmic) interaction, and the energy E of a single vortex is proportional to the logarithm of the sample size L. Indeed, for the XY model in the spin-wave approximation one can minimize the free energy:
$$\frac{\delta}{\delta\varphi(\mathbf{r})}\int|\nabla\varphi(\mathbf{r}')|^2\,d\mathbf{r}' = \Delta\varphi = 0\,.$$
The vortex is the solution of this equation, which in the polar coordinates r, θ is simply φ(r, θ) = mθ, so that (∇φ)_θ = m/r and the energy is $E = (Js^2/2)\int|\nabla\varphi(\mathbf{r})|^2\,d^2r = \pi Js^2m^2\ln(L/a)$. But the entropy S associated with a single vortex is also logarithmic: S = ln(L/a)². That means that there
exists some sufficiently high temperature, T = πJs²m²/2, at which the energy contribution to the free energy E − TS becomes less than the entropy contribution TS. That temperature corresponds to the so-called Berezinskii-Kosterlitz-Thouless (BKT) phase transition from a high-T state of free dislocations/vortices to a low-T state where there are no isolated vortices. Free vortices provide for a screening (a direct analog of the Debye screening of charges), so that the high-T state has a finite correlation length equal to the Debye radius. Actually, in the low-T state (with power-law correlations), those vortices that exist are bound into dipole vortex pairs, whose energy does not depend on the sample size.
Let us now describe a vortex as an inhomogeneous state that corresponds to an extremum of the thermodynamic Landau-Ginzburg potential, which contains nonlinearity. Varying the potential (183) with respect to Ψ*, we get the equation
$$\delta F/\delta\Psi^* = -g\Delta\Psi + 2b\left(|\Psi|^2-\phi_0^2\right)\Psi = 0\,. \qquad (184)$$
In polar coordinates, the Laplacian takes the form ∆ = r−1 ∂r r∂r + r−2 ∂θ2 .
Vortex corresponds to Ψ = A exp(imθ) where A can be chosen real and
the integer m determines the circulation. Going around the vortex point in 2d (vortex line in 3d), the phase acquires 2mπ. The dependence on the angle θ makes the condensate amplitude turn into zero on the axis: A ∝ r^m as r → 0. At infinity, we have a homogeneous condensate: lim_{r→∞}(A − ϕ₀) ∝ r⁻². That this is a vortex is also clear from the fact that there is a current J (and a velocity) around it:
$$J_\theta \propto i\left(\psi\frac{\partial\psi^*}{r\partial\theta} - \psi^*\frac{\partial\psi}{r\partial\theta}\right) = \frac{A^2}{r}\,,$$
so that the circulation is independent of the distance from the vortex. Second, notice that the energy of a vortex indeed diverges logarithmically with the sample size (as well as the energy of dislocations and 2d Coulomb charges): the velocity around the vortex decays as v ∝ 1/r, so that the kinetic energy diverges as $\int A^2v^2\,d^2r \propto \int d^2r/r^2$. A single vortex/charge has its kinetic/electrostatic energy growing logarithmically with the sample size.
with mi = ±1. Since we are interested only in the mutual positions of
the vortices, we accounted only for the energy of their interaction. One
can understand it better if we recall that the Hamiltonian describes also
dynamics. The coordinates ri = (xi , yi ) of the point vortices (charges) satisfy
the Hamiltonian equations
$$m_i\dot{x}_i = \frac{\partial\mathcal{H}}{\partial y_i} = \sum_jm_im_j\frac{y_i-y_j}{|\mathbf{r}_i-\mathbf{r}_j|^2}\,,\qquad m_i\dot{y}_i = -\frac{\partial\mathcal{H}}{\partial x_i} = \sum_jm_im_j\frac{x_j-x_i}{|\mathbf{r}_i-\mathbf{r}_j|^2}\,,$$
which simply tell that every vortex moves in the velocity field induced by all
other vortices.
For example, the statistics of the XY-model at low temperatures can be
considered with the Hamiltonian which is a sum of (180) and (185), and
the partition function respectively a product of those due to spin waves and
vortices. In the same way, condensate perturbations are sound waves and
vortices. Since the Gaussian partition function of the waves is analytic,
the phase transition can only originate from the vortices; near such phase
transition the vortex degrees of freedom can be treated separately.
In the canonical approach, one writes the Gibbs distribution P = Z⁻¹exp(−βH). The partition function is
$$Z(\beta) = \int\exp(-\beta\mathcal{H})\,\Pi_id^2r_i = V\prod_{i\neq j}\int r_{ij}^{\beta m_im_j}\,d^2r_{ij}\,.$$
Note that a neutral system with N vortices must have the following number of pairs: N₊₋ = (N/2)² and N₊₊ = N₋₋ = N(N − 2)/8, so that the integral converges only for β < 4(N − 1)/N. In the thermodynamic limit N → ∞, the convergence condition is β < 4. When β → 4, with overwhelming probability the system of vortices collapses, which corresponds to the BKT phase transition.
In the microcanonical approach one considers vortices in a domain with the area A. The density of states,
$$g(E,A) = \int\delta(E-\mathcal{H})\,\Pi_i\,dz_i\,, \qquad (186)$$
must have a maximum, since the phase volume $\Gamma(E) = \int^Eg(E')\,dE'$ increases monotonically with E from zero to A^N. The entropy, S(E) = ln g(E), therefore gives a negative temperature T = g/g′ at sufficiently high energy. Since
$g(E,A) = A^Ng(E',1)$, where $E' = E + 2\ln A\sum_{i<j}^Nm_im_j$, we can get the equation of state
$$P = T\left(\frac{\partial S}{\partial A}\right)_E = \frac{NT}{A}\left(1 + \frac{1}{2NT}\sum_{i<j}^Nm_im_j\right). \qquad (187)$$
For a neutral system with mᵢ = ±1 we find $\sum_{i<j}^Nm_im_j = N_{++}+N_{--}-N_{+-} = -N/2$ and
$$P = \frac{NT}{A}\left(1-\frac{\beta}{4}\right), \qquad (188)$$
which shows that the pressure turns to zero at the BKT transition temperature.
which shows that pressure turns to zero at the BKT transition temperature.
Note that the initial system had a short-range interaction, which allowed us to use the potential (183) in the long-wave limit. In other words, the particles that create the condensate interact locally. However, condensate vortices have a long-range interaction (185). While the BKT transition may seem exotic and particular to 2d, the lesson is quite general: when there are entities (like vortices, monopoles, quarks) whose interaction energy depends on the distance similarly to the entropy, then a phase transition from free entities to pairs is possible at some temperature. In particular, it is argued that a similar transition took place upon the cooling of the early universe, in which quarks (whose interaction grows with the distance, just like the entropy) bound together to form the hadrons and mesons we see today.
See Kardar, Fields, sect. 8.2.
phase shift into the vector potential: A → A + ∇φ/eϕ0. The quadratic part of the thermodynamic potential is now such that the correlation radii of both modes h, A are finite. The correlation radius of the gauge-field mode is r_c = 1/2eϕ0 = (2e)^{−1}√(b/α(Tc − T)). This case
does not belong to the validity domain of the Mermin-Wagner theorem since
the Coulomb interaction is long-range.
Thinking dynamically, if the energy of a perturbation contains a nonzero
homogeneous term quadratic in the perturbation amplitude, then the spec-
trum of excitations has a gap at k → 0. For example, the (quadratic part of the) Hamiltonian H = g|∇ψ|² + ϕ0²|ψ|² gives the dynamical equation ıψ̇ = δH/δψ* = −g∆ψ + ϕ0²ψ, which gives ψ ∝ exp(ıkx − ıω_k t) with ω_k = ϕ0² + gk². Therefore, what we have just seen is that the Coulomb
interaction leads to a gap in the plasmon spectrum in superconducting tran-
sition. In other words, the photon is massless in free space but has a mass if it has to move through the condensate of Cooper pairs in a superconductor (Anderson, 1963). Another manifestation of this simple effect comes via the analogy between quantum mechanics and statistical physics (which we discuss in more detail in subsection 8.2). In the language of relativistic quantum field theory, the system Lagrangian plays the role of the thermodynamic potential and the vacuum plays the role of the equilibrium state. Here, the presence of a constant term means a finite energy (mc²) at zero momentum, which means a finite mass. Interaction with the gauge field described by (189) is called the Higgs mechanism, by which particles acquire mass in quantum field theory; the excitation analogous to that described by A is called the vector Higgs boson (Nambu, 1960; Englert, Brout, Higgs, 1964). What is dynami-
cally a finite inertia is translated statistically into a finite correlation radius.
When you hear about the infinite correlation radius, gapless excitations or
massless particles, remember that people talk about the same phenomenon.
See also Kardar, Fields, p. 285-6.
Let us conclude this section by restating the relation between the space
dimensionality and the possibility of phase transition and ordered state for
a short-range interaction. At d ≥ 3 an ordered phase is always possible
for a continuous as well as discrete broken symmetry. For d = 2, one type
of a possible phase transition is that which breaks a discrete symmetry and
corresponds to a real scalar order parameter like in Ising model; another type
is the BKT phase transition between power-law and exponential decrease of
correlations. At d = 1 no symmetry breaking and ordered state is possible
for system with a short-range interaction.
Landau & Lifshitz, Sects. 116, 152.
One can now express the quantities of interest via the derivatives at h = 0: C = ∂²F/∂t², η = ∂F/∂h, χ = ∂²F/∂h², and relate β = (d − b)/a etc.
A general formalism which describes how to make a coarse-graining that keeps only the most salient features in the description is called the renormalization group (RG). It consists in subsequently eliminating small-scale degrees of freedom and looking for fixed points of such a procedure. For the Ising model, it is achieved with the help of a block spin transformation, that is, dividing all the spins into groups (blocks) with side k so that there are k^d spins in every block (d is the space dimensionality). We then assign to any block a new
variable σ ′ which is ±1 when respectively the spins in the block are predom-
inantly up or down. We assume that the phenomena very near critical point
can be described equally well in terms of block spins with the energy of the
same form as the original, E′ = −h′ ∑_i σ_i′ + (J′/4) ∑_{ij} (1 − σ_i′ σ_j′), but with different parameters J′ and h′. Let us demonstrate how it works using the 1d Ising model with h = 0 and J/2T ≡ K. Let us transform the partition function ∑_{σ} exp[K ∑_i σ_i σ_{i+1}] by the procedure (called decimation¹⁸) of eliminating
degrees of freedom by ascribing (undemocratically) to every block of k = 3
spins the value of the central spin. Consider two neighboring blocks σ1 , σ2 , σ3
and σ4 , σ5 , σ6 and sum over all values of σ3 , σ4 keeping σ1′ = σ2 and σ2′ = σ5
fixed. The respective factors in the partition function can be written as fol-
lows: exp[Kσ3σ4] = cosh K + σ3σ4 sinh K. Denote x = tanh K. Then only the terms with even powers of σ3, σ4 contribute to the factors in the partition function that involve these degrees of freedom:

∑_{σ3,σ4=±1} exp[K(σ1′σ3 + σ3σ4 + σ4σ2′)] = cosh³K ∑_{σ3,σ4=±1} (1 + xσ1′σ3)(1 + xσ3σ4)(1 + xσ2′σ4) = 4 cosh³K (1 + x³σ1′σ2′) . (191)

The expression (191) has the form of the Boltzmann factor exp(K′σ1′σ2′) with the re-normalized constant K′ = tanh^{−1}(tanh³K), or x′ = x³ — this formula and (192) are called recursion relations. The partition function of the whole system in the new variables can be written as

∑_{σ′} exp[N g(K) − K′ ∑_i σ_i′ σ_{i+1}′] .
The term proportional to g(K) represents the contribution into the free en-
ergy of the short-scale degrees of freedom which have been averaged out.
This term does not affect the calculation of any spin correlation function.
Yet the renormalization of the constant, K → K ′ , influences the correlation
functions. Let us discuss this renormalization. Since K ∝ 1/T then T → ∞
¹⁸The term initially meant putting to death every tenth soldier of a Roman army regiment that ran from a battlefield.
corresponds to x → 0+ and T → 0 to x → 1−. One is interested in the set of
the parameters which does not change under the RG, i.e. represents a fixed
point of this transformation. Both x = 0 and x = 1 are fixed points, the first
one stable and the second one unstable. Indeed, after iterating the process
we see that x approaches zero and effective temperature infinity. That means
that large-scale degrees of freedom are described by the partition function
where the effective temperature is high so the system is in a paramagnetic
state. At this limit we have K, K ′ → 0 so that the contribution of the
small-scale degrees of freedom is getting independent of the temperature:
g(K) → − ln 4.
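The flow to this fixed point can be watched directly; a minimal numerical sketch (Python, with an arbitrarily chosen initial coupling):

    import math

    K = 2.0                      # initial K = J/2T, i.e. a low temperature
    x = math.tanh(K)
    for step in range(8):
        x = x**3                 # decimation recursion x' = x^3
        K = math.atanh(x)        # renormalized coupling K' = tanh^{-1}(x^3)
        print(step, K)           # K flows to 0: the paramagnetic fixed point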
We see that there is no phase transition since there is no long-range order
for any T (except exactly for T = 0). RG can be useful even without critical
behavior; for example, the correlation length measured in lattice units must satisfy r_c(x′) = r_c(x³) = r_c(x)/3, which has the solution r_c(x) ∝ 1/ln x, an exact result for the 1d Ising model. It diverges at x → 1 (T → 0) as r_c ∝ (1 − x)^{−1} ∝ e^{J/T}.

(Figure: decimation of the chain σ1 . . . σ6 in 1d and the block-spin transformation in 2d.)

If n(K) is the number of RG iterations needed to come from K to some reference K0, and n(K) → ∞ as K → Kc, then rc → ∞ as T → Tc. The critical exponent ν = −d ln rc/d ln t is expressed via the derivative of the RG at Tc. Indeed, denote dK′/dK = k^y at K = Kc, so that K′ − Kc ≈ k^y(K − Kc). Since k rc(K′) = rc(K), we may write

rc(K) ∝ (K − Kc)^{−ν} = k(K′ − Kc)^{−ν} = k[k^y(K − Kc)]^{−ν} ,
which requires ν = 1/y. We see that in general, the RG transformation of
the set of parameters K is nonlinear. Linearizing it near the fixed point one
can find the critical exponents from the eigenvalues of the linearized RG and,
more generally, classify different types of behavior. That requires generally
the consideration of RG flows in multi-dimensional spaces.
(Figure: RG flow with two couplings K1, K2; the flow in the (K1, K2) plane defines a critical surface.)

By changing the temperature in a system with only nearest-neighbor coupling, we move along the line K2 = 0. The point where this line meets the critical surface defines K1c and the respective Tc1. At that temperature, the large-scale behavior of the system is determined by the RG flow, i.e. by the fixed point. In another system with nonzero K2, by changing T we move along some other path in the parameter space, indicated by the broken line in the figure. The intersection of this line with the critical surface defines some other critical temperature Tc2. But the long-distance properties of this system are again determined by the same fixed point, i.e. all the critical exponents are the same. For example, the critical exponents of a simple fluid are the same as those of a uniaxial ferromagnet. In this case (of the Ising model in an external field),
RG changes both t and h, the free energy (per block) is then transformed
as f(t, h) = g(t, h) + k^{−d} f(t′, h′). The part g originates from the degrees
of freedom within each block so it is an analytic function even at criticality.
Therefore, if we are interested in extracting the singular behavior (like critical
exponents) near the critical point, we consider only the singular part, which has
the form (190) with a, b determined by the derivatives of the RG.
See Cardy, Sect 3 and http://www.weizmann.ac.il/home/fedomany/
P(x, t + τ) = (1/2d) ∑_{i=1}^d P(x ± e_i, t) . (193)

The first (painful) way to solve this equation is to turn it into averaging exponents, as we always do in statistical physics. This is done using the Fourier transform, P(x) = (a/2π)^d ∫ e^{ikx} P(k) d^d k, which gives

P(k, t + τ) = (1/d) ∑_{i=1}^d cos(ak_i) P(k, t) . (194)
The initial condition for (193) is P(x, 0) = δ(x), which gives P(k, 0) = 1 and P(k, t) = ((1/d) ∑_{i=1}^d cos ak_i)^{t/τ}. That gives the solution in space as an integral:

P(x, t) = (a/2π)^d ∫ e^{ikx} ((1/d) ∑_{i=1}^d cos ak_i)^{t/τ} d^d k . (195)

We are naturally interested in the continuous limit a → 0, τ → 0. If we take τ → 0 first, the integral tends to zero, and if we take a → 0 first, the answer remains a delta-function. A non-trivial evolution appears when the lattice constant and the jump time go to zero simultaneously. Consider the cosine expansion,

((1/d) ∑_{i=1}^d cos ak_i)^{t/τ} = (1 − a²k²/2d + . . .)^{t/τ} ,

where k² = ∑_{i=1}^d k_i². The finite answer exp(−κtk²) appears only if one takes the limit keeping constant the ratio κ = a²/2dτ. In this limit, the space
density of the probability stays finite and is given by the saddle-point calcu-
lation of the integral:
ρ(x, t) = P(x, t) a^{−d} ≈ (2π)^{−d} ∫ e^{ikx − tκk²} d^d k = (4πκt)^{−d/2} exp(−x²/4κt) . (196)

The second (painless) way to get this answer is to pass to the continuum limit already in equation (193):

P(x, t+τ) − P(x, t) = (1/2d) ∑_{i=1}^d [P(x + e_i, t) + P(x − e_i, t) − 2P(x, t)] , (197)

which in the same limit turns into the diffusion equation ∂P/∂t = κ∆P . (198)
Of course, ρ satisfies the same equation, and (196) is its solution. Note that
(196,197) are isotropic and translation invariant while the discrete version
respected only cubic symmetries. Also, the diffusion equation conserves the total probability, ∫ ρ(x, t) dx, because it has the form of a continuity equation, ∂_t ρ(x, t) = −div j with the current j = −κ∇ρ.
Another way to describe it is to treat e_i as a random variable with ⟨e_i⟩ = 0 and ⟨e_i e_j⟩ = a²δ_ij, so that x = ∑_{i=1}^{t/τ} e_i.
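The diffusive limit is easy to check numerically; a sketch (Python, with the assumed units a = τ = 1, so that 2dκt = t) averaging x² over many walkers:

    import numpy as np

    rng = np.random.default_rng(0)
    d, walkers, steps = 2, 20000, 400
    x = np.zeros((walkers, d))
    for _ in range(steps):
        axis = rng.integers(0, d, size=walkers)       # random lattice axis
        sign = rng.choice([-1.0, 1.0], size=walkers)  # random jump direction
        x[np.arange(walkers), axis] += sign           # jump by a = 1
    print((x**2).sum(axis=1).mean())  # ~ 2*d*kappa*t = a^2 t/tau = 400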
Random walk is a basis of the statistical theory of fields. To see the
relation, consider the Green function which is the mean time spent on a
given site x:
G(x) = ∑_{t=0}^∞ P(x, t) . (199)

The most natural question is whether this time is finite or infinite. It depends on the space dimensionality. Indeed, summing (195) over t as a geometric series, one gets

G(x) = ∫ e^{ikx} d^d k / (1 − d^{−1} ∑_i cos ak_i) . (200)
It diverges at k → 0 when d ≤ 2. In this limit one can also use the continuous
limit, where it has a form
g(x) = lim_{a→0} (a^{2−d}/2d) G(x/a) = ∫_0^∞ dt ∫ e^{ikx − tk²} d^d k = ∫ (e^{ikx}/k²) d^d k/(2π)^d . (201)

We have seen this integral when calculating the correlation function of fluctuations (178). The divergence of this integral in one and two dimensions meant before that the fluctuations are so strong that they destroy any order. Now, (201) suggests another interpretation: the divergence of the integral means that the mean time spent on any given site is infinite. In other words, it means that the random walker in 1d and 2d returns to any point an infinite number of times. The short-correlated nature of a random walk corresponds to a short-range interaction.
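The recurrence can be observed directly; a sketch (Python, walk length chosen arbitrarily) counting visits to the origin in different dimensions:

    import numpy as np

    rng = np.random.default_rng(1)
    steps = 100000
    for d in (1, 2, 3):
        x = np.zeros(d, dtype=int)
        returns = 0
        for _ in range(steps):
            x[rng.integers(0, d)] += rng.choice((-1, 1))
            returns += not x.any()   # count visits to the origin
        # the count keeps growing with the walk length for d = 1, 2,
        # while it saturates for d = 3 (the walk is transient there)
        print(d, returns)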
A path of a random walker behaves more like a surface than a line. Indeed, surfaces generally intersect along curves in 3d, meet at isolated points in 4d and do not meet at d > 4. That is reflected in the special properties of critical phenomena in 2d and 4d. The two-dimensionality of the random walk is a reflection of the square-root diffusion law: ⟨|x|⟩ ∝ √t. Indeed, one can define the dimensionality of a geometric object as a relation between its size R and the number of standard-size elements (i.e. volume or area) N. For a line, N ∝ R; generally, N ∝ R^d. For a random walk, N ∝ t while R ∝ √t, so that d = 2.
A slight generalization gives the generating function for the probabilities:
G(x, λ) = ∑_{t=0}^∞ λ^{t/τ} P(x, t) = ∫ e^{ikx} d^d k / (λ^{−1} − d^{−1} ∑_i cos ak_i) . (202)
We now divide the time interval t into an arbitrarily large number of intervals and, using (196), write

ρ(x, t; 0, 0) = ∫ ∏_{i=0}^n dx_{i+1} [4πκ(t_{i+1} − t_i)]^{−d/2} exp[−(x_{i+1} − x_i)²/4κ(t_{i+1} − t_i)]
→ ∫ Dx(t′) exp[−(1/4κ) ∫_0^t dt′ ẋ²(t′)] . (206)
The last expression is an integral over paths that start at zero and end up at
x at t. By virtue of (204) it leads to a path-integral expression for the Green
function:
g(x, m) = ∫_0^∞ dt ∫ Dx(t′) exp{−∫_0^t dt′ [m² + ẋ²(t′)/4]} . (207)
Comparison of the Green functions (178,203) shows the relation between
the random walk and a free field. This analogy goes beyond the correlation
function to all the statistics. Indeed, much in the same way, the partition
function of a fluctuating field η(x) that takes continuous values in a con-
tinuum limit can be written as a path integral over all realizations of the
field:
Z = ∫ Dη exp[−βH{η(x)}] . (208)

For a quantum particle of mass M in the external potential U(x), the transition amplitude T(x, t; 0, 0) from
zero to x during t is given by the sum over all possible paths connecting
the points. Every path is weighted by the factor exp(iS/h̄) where S is the
classical action:
T(x, t; 0, 0) = ∫ Dx(t′) exp[(i/h̄) ∫_0^t dt′ (Mẋ²/2 − U(x))] . (212)

Analytic continuation to imaginary time relates such quantum-mechanical amplitudes to thermal averages:

⟨Q⟩ = trace [T(iβ)Q] / Z(β) ,    Z(β) = trace T(iβ) = ∑_a e^{−βE_a} . (213)

If the inverse "temperature" β goes to infinity then all the sums are dominated by the ground state, Z(β) ≈ exp(−βE0), and the averages in (213) are just expectation values in the ground state.
On the other hand, recall the one-dimensional ring chain (151) with a nearest-neighbor interaction, whose partition function was expressed via the trace of the transfer matrix (154,155). Here trace means the sum (or integral) over all possible values on the site. In particular, for the Ising model, it is the sum over two values. If there are m values on the site, then T is an m × m matrix. For the O(n) model, trace means integrating over the orientations of the spin in n-dimensional space. We see that the translations along the chain
are analogous to quantum-mechanical translations in (imaginary) time. This
analogy is not restricted to 1d systems, one can consider 2d strips that way
too.
is also Gaussian with a zero mean. To get its second moment we need the different-time correlation function of the velocities, ⟨v(t1) · v(t2)⟩ = (3T/M) exp(−λ|t1 − t2|), which can be obtained from (215). Note that the friction makes the velocity correlated on a longer timescale than the force. That gives

⟨(∆r)²⟩ = ∫_0^{t′} ∫_0^{t′} dt1 dt2 ⟨v(t1) · v(t2)⟩ = (6T/Mλ²)(λt′ + e^{−λt′} − 1) .
The mean squared distance initially grows quadratically (the so-called ballistic regime at λt′ ≪ 1). In the limit of a long time (compared to the relaxation time λ^{−1} rather than to the force correlation time τ) we have the diffusive growth ⟨(∆r)²⟩ ≈ 6T t′/Mλ. Generally, ⟨(∆r)²⟩ = 2dDt, where d is the space dimensionality and D the diffusivity. In our case d = 3 and the diffusivity is D = T/Mλ — the Einstein relation¹⁹. The probability distribution of displacement is Gaussian and satisfies the diffusion equation, ∂n/∂t = D∇²n.
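This crossover can be checked numerically; a sketch (Python, 1d for simplicity, with the noise amplitude fixed by equipartition), in which the displacement variance approaches 2Dt′ with D = T/Mλ at λt′ ≫ 1:

    import numpy as np

    rng = np.random.default_rng(2)
    T, M, lam = 1.0, 1.0, 1.0
    dt, steps, ntraj = 1e-2, 2000, 5000
    v = rng.normal(0.0, np.sqrt(T / M), ntraj)  # equilibrium initial velocities
    x = np.zeros(ntraj)
    for _ in range(steps):
        # Langevin step: dv = -lam*v*dt + sqrt(2*lam*T*dt/M)*xi
        v += -lam * v * dt + np.sqrt(2 * lam * T / M * dt) * rng.normal(size=ntraj)
        x += v * dt
    t = steps * dt
    print(x.var(), 2 * T / (M * lam) * t)  # diffusive regime: var ~ 2 D t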
In the external field V(q), the particle satisfies the equations

q̇ = p/M ,    ṗ = −∂_q V − λp + f .

Note that these equations describe a system with the Hamiltonian H = p²/2M + V(q) that interacts with a thermostat, which provides friction −λp and agitation f; the balance between these two terms expressed by (217) means that the thermostat is in equilibrium.
Consider now the over-damped limit, λp ≫ ṗ, which gives the first-order
equation:
λp = λM q̇ = f − ∂q V . (220)
¹⁹With temperature in degrees, it contains the Boltzmann constant, k = DMλ/T, which was actually determined by this relation and indeed found to be constant, i.e. independent of the medium and the type of particle. That proved the reality of atoms - after all, kT is the kinetic energy of a single atom.
Let us derive the equation on ρ(q, t), that is, pass from considering individual trajectories to the description of the "cloud" of trajectories. We know that without V, the density satisfies the diffusion equation. Consider now the dynamic equation without any randomness, λM q̇ = −∂_q V; it corresponds to a flow in q-space with the velocity w = −∂_q V/λM. In a flow, the density satisfies the continuity equation ∂_t ρ = −div ρw. Together, diffusion and advection give the so-called Fokker-Planck equation

∂ρ/∂t = D∇²ρ + (1/λM) ∂/∂q_i (ρ ∂V/∂q_i) = −div J . (221)

More formally, one can derive this equation by considering the Langevin equation q̇ − w = η, taking the noise Gaussian and delta-correlated: ⟨η_i(0)η_j(t)⟩ = 2Dδ_ij δ(t). In this case the path-integral representation (206) changes into

ρ(q, t; 0, 0) = ∫ Dq(t′) exp[−(1/4D) ∫_0^t dt′ |q̇ − w|²] , (222)
since it is the quantity q̇ − w which is Gaussian now. To describe the time
change, consider the convolution identity (205) for an infinitesimal time shift
ϵ:
ρ(q, t) = ∫ dq′ (4πDϵ)^{−d/2} exp[−(q − q′ − ϵw(q′))²/4Dϵ] ρ(q′, t − ϵ) . (223)

What is written here is simply that the transition probability is the Gaussian probability of finding the noise η with the right magnitude to provide for the transition from q′ to q. We now change the integration variable, y = q − q′ − ϵw(q′), and keep only the first term in ϵ: dq′ = dy[1 − ϵ∂_q · w(q)]. Here ∂_q · w = ∂_i w_i = div w. In the resulting expression, we expand the last factor:

ρ(q, t) ≈ (1 − ϵ∂_q · w) ∫ dy (4πDϵ)^{−d/2} e^{−y²/4Dϵ} ρ(q + y − ϵw, t − ϵ)
≈ (1 − ϵ∂_q · w) ∫ dy (4πDϵ)^{−d/2} e^{−y²/4Dϵ} [ρ(q, t) + (y − ϵw) · ∂_q ρ(q, t) + (y_i y_j − 2ϵy_i w_j + ϵ²w_i w_j) ∂_i ∂_j ρ(q, t)/2 − ϵ∂_t ρ(q, t)]
= (1 − ϵ∂_q · w)[ρ − ϵw · ∂_q ρ + ϵD∆ρ − ϵ∂_t ρ + O(ϵ²)] , (224)
and obtain (221) collecting terms linear in ϵ. Note that it was necessary to
expand until the quadratic terms in y, which gave the contribution linear in
ϵ, namely the Laplacian, i.e. the diffusion operator.
The Fokker-Planck equation has a stationary solution which corresponds to the particle in an external field and in thermal equilibrium with the surrounding molecules:

ρ(q) ∝ exp[−V(q)/T] . (225)

Apparently it has the Boltzmann-Gibbs form, and it turns the probability current into zero: J = −ρ(∂V/∂q)/λM − D ∂ρ/∂q = 0, by virtue of the Einstein relation D = T/λM. We shall return to the Fokker-
Planck equation in the next Chapter for the consideration of the detailed
balance and fluctuation-dissipation relations.
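The stationary solution is easy to verify by simulating the overdamped dynamics (220) and histogramming the positions; a sketch (Python, with an assumed double-well potential V(q) = q⁴/4 − q²/2 and λM = 1):

    import numpy as np

    rng = np.random.default_rng(3)
    T = 0.5
    dV = lambda q: q**3 - q          # V(q) = q^4/4 - q^2/2
    dt, steps = 1e-3, 400000
    q, samples = 0.0, np.empty(steps)
    for i in range(steps):
        # overdamped Langevin: dq = -V'(q) dt + sqrt(2 T dt) xi
        q += -dV(q) * dt + np.sqrt(2 * T * dt) * rng.normal()
        samples[i] = q
    hist, edges = np.histogram(samples, bins=40, density=True)
    c = 0.5 * (edges[1:] + edges[:-1])
    boltz = np.exp(-(c**4 / 4 - c**2 / 2) / T)
    boltz /= boltz.sum() * (edges[1] - edges[0])  # normalized exp(-V/T)
    print(np.abs(hist - boltz).max())  # small: histogram matches Boltzmann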
Ma, Sect. 12.7; Kardar Fields, Sect 9.1.
χ ≡ ∂B̄/∂f = (⟨Bx⟩ − B̄x̄)/T ≡ ⟨Bx⟩_c/T . (226)

Here the cumulant (also called the irreducible correlation function) is defined for quantities with the subtracted mean values, ⟨xy⟩_c ≡ ⟨(x − x̄)(y − ȳ)⟩; it is thus the measure of the statistical correlation between x and y. We thus learn that the susceptibility is a measure of the statistical coherence of the system, increasing with the statistical dependence of the variables. Consider a few examples of this relation.
Example 1. If x = H is energy itself then f represents the fractional
increase in the temperature: H(1 − f )/T ≈ H/(1 + f )T . Formula (226) then
gives the relation between the specific heat (which is a kind of susceptibility)
and the squared energy fluctuation which can be written via the irreducible
correlation function of the energy density ϵ(r):
∂H̄/∂f = T ∂E/∂T = T C_v = ⟨(∆E)²⟩/T = (1/T) ∫ ⟨ϵ(r)ϵ(r′)⟩_c dr dr′ = (V/T) ∫ ⟨ϵ(r)ϵ(0)⟩_c dr . (227)
The growth of the specific heat when the temperature approaches criticality is related to the growth of the correlation function of fluctuations. As we discussed before, the specific heat is extensive, i.e. proportional to the volume (or the number of particles), but the coefficient of proportionality actually tells us how many degrees of freedom are effective in absorbing energy at a given temperature (recall the two-level system, where the specific heat was small for high and low temperatures). We see from (227) that the higher the correlation, the larger the specific heat (that is, the more energy one needs to spend to raise the temperature by one degree).
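For the two-level system this can be verified in a few lines; a sketch (Python, assumed unit level spacing) comparing the numerically differentiated mean energy with the energy variance over T²:

    import numpy as np

    eps = 1.0                              # level spacing (assumed value)
    T = np.linspace(0.05, 5.0, 5)
    p = 1.0 / (np.exp(eps / T) + 1.0)      # occupation of the upper level
    E = eps * p
    varE = eps**2 * p * (1 - p)            # <(Delta E)^2> for two levels
    dT = 1e-5
    Cv = (eps / (np.exp(eps / (T + dT)) + 1) - E) / dT  # numerical dE/dT
    print(np.allclose(Cv, varE / T**2, rtol=1e-3))      # True: Cv = <(dE)^2>/T^2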
Example 2. If f = h is a magnetic field then the coordinate x = M is the
magnetization and (226) gives the magnetic susceptibility
χ = ∂M/∂h = ⟨M²⟩_c/T = (V/T) ∫ ⟨m(r)m(0)⟩_c dr .

The divergence of χ near the Curie point means the growth of correlations between distant spins, i.e. the growth of the correlation length. For example, the Ornstein-Zernike correlation function (175) gives ⟨m(r)m(0)⟩_c ∝ r^{2−d}, so that in the mean-field approximation χ ∝ ∫_a^{rc} d^d r r^{2−d} ∝ rc² ∝ |T − Tc|^{−1}.
General remark. These fluctuation-response relations can be related to
the change of the thermodynamic potential (free energy) under the action of
the force:
F = −T ln Z = −T ln ∑ exp[(xf − H)/T] = −T ln Z0 − T ln⟨exp(xf/T)⟩_0 = F_0 − f⟨x⟩_0 − (f²/2T)⟨x²⟩_{0c} + . . . (228)

⟨x⟩ = −∂F/∂f ,    ⟨x²⟩_c/T = ∂⟨x⟩/∂f = −∂²F/∂f² . (229)
The mean linear response can be written as an integral of the force with the
response (Green) function:
ā(r) = ∫ G(r − r′) f(r′) dr′ ,    ā_k = G_k f_k . (230)

One relates the Fourier components of the Green function and the correlation function of the coordinate fluctuations by choosing B = a_k, x = a_{−k} in (226):

V G_k = (1/T) ∫ ⟨a(r)a(r′)⟩_c e^{ik·(r′−r)} dr dr′ = (V/T) ∫ ⟨a(r)a(0)⟩_c e^{−ik·r} dr ,

i.e. T G_k = (a²)_k . (231)
This formula coincides with (169) if one accounts for

−n² (∂V/∂P)_{T,N} = N (∂n/∂P)_{T,N} = n (∂N/∂P)_{T,V} = (∂P/∂µ)_{T,V} (∂N/∂P)_{T,V} = (∂N/∂µ)_{T,V} . (232)

Here we used the facts that Ω(T, µ) = PV and N = ∂Ω/∂µ. We conclude that the response of the density to the pressure is expressed via the density fluctuations.
In the simplest case of an ideal gas with ⟨n(r)n(0)⟩c = nδ(r), (231,232)
give dn/dP = 1/T . For the pair interaction energy U (r) the first ap-
proximation (neglecting multiple correlations) gives ⟨n(r)n(0)⟩c = n{δ(r) +
n[e−U (r)/T − 1]} and the correction to the equation of state
P = nT + n²T ∫ [1 − e^{−U(r)/T}] dr .
Here d/dt denotes the derivative in the moving reference frame, like in
Sect. 2.1. The motion is along an unperturbed trajectory determined by
H0 . Recall now that ∂H0 /∂p = dx/dt (calculated at f = 0 i.e. also along
an unperturbed trajectory). The formal solution of (234) is written as an
integral over the past:
ρ_1 = βρ_0 ∫_{−∞}^t (dx(τ)/dτ) f(τ) dτ . (235)

We now use (235) to derive the relation between the fluctuations and the response in the time-dependent case. Indeed, the linear response of the coordinate to the force is as follows:

⟨x(t)⟩ ≡ ∫_{−∞}^t α(t, t′) f(t′) dt′ = ∫ x ρ_1(x, t) dx , (236)
is analytic in the upper half-plane of complex ω under the assumption that
α(t) is finite everywhere. Since α(t) is real then α(−ω ∗ ) = α∗ (ω). Let us
show that the imaginary part α′′ determines the energy dissipation,
dE/dt = dH/dt = ∂H/∂t = (∂H/∂f)(df/dt) = −x̄ df/dt . (239)

For a purely monochromatic perturbation, f(t) = f_ω exp(−iωt) + f_ω* exp(iωt), x̄ = α(ω)f_ω exp(−iωt) + α(−ω)f_ω* exp(iωt), the dissipation averaged over a period is as follows:

dE/dt = (ω/2π) ∫_0^{2π/ω} (−x̄ df/dt) dt = [α(−ω) − α(ω)] ıω|f_ω|² = 2ωα″(ω)|f_ω|² . (240)
We can now calculate the average dissipation using (235)
dE/dt = −∫ x ḟ ρ_1 dp dx = −β ∫ x(t) ḟ(t) ρ_0 dp dx ∫_{−∞}^t ẋ(τ − t) f(τ) dτ
= −iω|f_ω|² β ∫_{−∞}^∞ ⟨x(t)ẋ(t′)⟩ e^{iω(t−t′)} dt′ = βω²|f_ω|² (x²)_ω . (241)

Comparison with (240) relates the response to the correlation function of equilibrium fluctuations:

T α(ω)/iω = ∫_0^∞ ⟨x(0)x(t)⟩ exp(iωt) dt ,

(x²)_ω = ∫_0^∞ ⟨x(0)x(t)⟩ exp(iωt) dt + ∫_{−∞}^0 ⟨x(0)x(t)⟩ exp(iωt) dt
= T[α(ω) − α(−ω)]/iω = 2Tα″(ω)/ω . (242)
This truly amazing formula relates the dissipation coefficient that governs
non-equilibrium kinetics under the external force with the equilibrium fluc-
tuations. The physical idea is that to know how a system reacts to a force
one might as well wait until the fluctuation appears which is equivalent to
the result of that force. Note that the force f disappeared from the final
result, which means that the relation is true even when the (equilibrium) fluctuations of x are not small. Integrating (242) over frequencies we get

⟨x²⟩ = ∫_{−∞}^∞ (x²)_ω dω/2π = (T/π) ∫_{−∞}^∞ α″(ω) dω/ω = (T/ıπ) ∫_{−∞}^∞ α(ω) dω/ω = T α(0) . (243)
The last step used the Cauchy integration formula (for the contour that
runs along the real axis, avoids zero by a small semi-circle and closes at the
upper half plane). We thus see that the mean squared fluctuation is the
zero-frequency response, which is the integral of the response over time:
α(ω = 0) = ∫_0^∞ α(t) dt .
We have seen the simplest case of the FDT before: the relation (217) relating the friction coefficient of the Brownian particle to the variance of the fluctuating molecular forces and the temperature of the medium. The same Langevin equation (214) is satisfied by many systems, in particular by the current I flowing through an electric L-R circuit at a finite temperature:

L dI/dt = −RI + V(t) ,

where the fluctuating voltage V(t) is due to thermal fluctuations. Similarly to (217), we can then relate the resistance to the equilibrium voltage fluctuations on a resistor R:

∫_{−∞}^∞ ⟨V(0)V(t)⟩ dt = 2RT .

This relation is called the Nyquist theorem; note that L does not enter it. Equipartition requires L⟨I²⟩/2 = T/2, so that similarly to (218) we can write the current auto-correlation function ⟨I(0)I(t)⟩ = (T/L) exp(−Rt/L), which corresponds to the Lorentzian spectral density: (I²)_ω = 2RT/(L²ω² + R²). At low frequencies it corresponds to a constant spectral density (white noise).
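This spectrum is easy to reproduce numerically; a sketch (Python, assumed unit R, L, T) integrating the circuit Langevin equation with white-noise voltage and comparing the averaged periodogram with 2RT/(L²ω² + R²):

    import numpy as np

    rng = np.random.default_rng(4)
    R, L, T = 1.0, 1.0, 1.0
    dt, n = 1e-2, 2**16
    I, cur = np.empty(n), 0.0
    for k in range(n):
        # L dI = -R I dt + sqrt(2 R T dt) xi  (Nyquist voltage noise)
        cur += (-R * cur * dt + np.sqrt(2 * R * T * dt) * rng.normal()) / L
        I[k] = cur
    seg = I.reshape(64, 1024)                      # average over segments
    spec = (np.abs(np.fft.rfft(seg, axis=1))**2 * dt / 1024).mean(axis=0)
    omega = 2 * np.pi * np.fft.rfftfreq(1024, dt)
    for k in (1, 10, 100):
        print(f"{omega[k]:.2f} {spec[k]:.4f} {2*R*T/(L**2*omega[k]**2 + R**2):.4f}")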
Generally, the spectral density has a universal Lorentzian form in the
low-frequency limit when the period of the force is much longer than the
relaxation time for establishing the partial equilibrium characterized by the
given value x̄ = α(0)f . In this case, the evolution of x is the relaxation
towards x̄:
ẋ = −λ(x − x̄) . (244)
For harmonics, (244) gives α(ω) = α(0)λ/(λ − ıω), so that (242) yields the Lorentzian spectral density (x²)_ω = 2α(0)Tλ/(λ² + ω²).
in the phase space). Let us stress that this is different from the susceptibil-
ities in equilibrium which were symmetric for the simple reason that they
were second derivatives of the thermodynamic potential; for instance, dielec-
tric susceptibility χij = ∂Pi /∂Ej = χji where P is the polarization of the
medium - this symmetry is analogous to the (trivial) symmetry of β̂, not the
(non-trivial) symmetry of γ̂.
See Landau & Lifshitz, Sects. 119-120 for the details and Sect. 124 for the quantum case. Also Kittel, Sects. 33-34.

⟨n(r, t) n(r′, t′)⟩_c = n|t − t′|^{−d} (m/2πT)^{d/2} exp(−m|r − r′|²/2T|t − t′|²) . (250)

That function determines the response of the concentration to a change in the chemical potential. In particular, when t′ → t it tends to nδ(r − r′), which determines the static response described in Sect. 9.1. For coinciding points it decays by the diffusion law, ⟨n(r, t)n(r, t′)⟩_c ∝ |t − t′|^{−d}, so that the response decays as |t − t′|^{−d−1}.
ẋ = −∂x V + η . (251)
which allows one to exploit another instance of the analogy between quantum mechanics and statistical physics. We may say that the probability density is the ψ-function in the x-representation, ρ(x, t) = ⟨x|ψ(t)⟩. In other words, we consider evolution in the Hilbert space of functions so that we may rewrite (252) in a Schrödinger representation as d|ψ⟩/dt = −Ĥ_FP|ψ⟩, which has a formal solution |ψ(t)⟩ = exp(−tĤ_FP)|ψ(0)⟩. The transition probability is given by the matrix element:
also satisfies the detailed balance: the probability current is the (Gibbs) probability density at the starting point times the transition probability; the forward and backward currents must be equal in equilibrium:

ρ(x′, t; x, 0) e^{−V(x)/T} = ρ(x, t; x′, 0) e^{−V(x′)/T} . (254)

In the operator language this reads

⟨x′|e^{−tĤ_FP} e^{−V/T}|x⟩ = ⟨x|e^{−tĤ_FP} e^{−V/T}|x′⟩ = ⟨x′|e^{−V/T} e^{−tĤ†_FP}|x⟩ .

Since this must be true for any x, x′, then

e^{−tĤ†_FP} = e^{V/T} e^{−tĤ_FP} e^{−V/T} . (255)

Consider now a potential that changes in time, V(x, t), a system that starts from the Gibbs distribution with the partition function Z_0, and the work W_t = ∫_0^t dt′ ∂_{t′}V(x(t′), t′) done on the system during the time t. The Jarzynski relation states that

⟨exp(−βW_t)⟩ = Z_t/Z_0 . (257)
Here the bracket means double averaging, over the initial distribution ρ(x, 0)
and over the different realizations of the Gaussian noise η(t) during the time
interval (0, t). Remark that the entropy is defined only in equilibrium, yet the
work divided by temperature is an analog of the entropy change (production),
and the exponent of it is an analog of the phase volume change.
To prove (257) we consider the generalized Fokker-Planck equation for
the joint probability ρ(x, W, t). Similar to the argument preceding (221),
we note that the flow along W in x − W space proceeds with the velocity dW/dt = ∂_t V, so that the respective component of the current is ρ∂_t V and the equation takes the form

∂_t ρ = β^{−1} ∂_x² ρ + ∂_x (ρ ∂_x V) − ∂_W (ρ ∂_t V) . (258)

Since W_0 = 0, the initial condition for (258) is

ρ(x, W, 0) = Z_0^{−1} exp[−βV(x, 0)] δ(W) . (259)

While we cannot find ρ(x, W, t) for an arbitrary V(t), we can multiply (258) by exp(−βW) and integrate over dW. Since V(x, t) does not depend on W, we get the closed equation for f(x, t) = ∫ dW ρ(x, W, t) exp(−βW):

∂_t f = β^{−1} ∂_x² f + ∂_x (f ∂_x V) − βf ∂_t V . (260)
Now, this equation does have an exact time-dependent solution, f(x, t) = Z_0^{−1} exp[−βV(x, t)], where the factor is chosen to satisfy the initial condition
(259). In other words, the distribution weighted by exp(−βWt ) looks like
Gibbs state, adjusted to the time-dependent potential at every moment of
time. To get (257), what remains is to integrate f (x, t) over x.
Let us reflect. We started from a Gibbs distribution but considered ar-
bitrary temporal evolution of the potential. Therefore, our distribution was
arbitrarily far from equilibrium during the evolution. Still, to obtain the
mean exponent of the work done, it is enough to know the partition func-
tions of the equilibrium Gibbs distributions corresponding to the potential at
the beginning and at the end (even though the system is not in equilibrium
at the end). Remarkable.
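A minimal numerical illustration (Python, with an assumed protocol: a harmonic potential V(x, t) = (x − vt)²/2 dragged at speed v, for which Z_t = Z_0): the work fluctuates and ⟨W⟩ > 0, yet ⟨e^{−βW}⟩ stays at Z_t/Z_0 = 1.

    import numpy as np

    rng = np.random.default_rng(5)
    beta, v = 1.0, 1.0
    dt, steps, ntraj = 1e-3, 1000, 20000
    x = rng.normal(0.0, 1.0 / np.sqrt(beta), ntraj)  # Gibbs start in V(x, 0)
    W = np.zeros(ntraj)
    for k in range(steps):
        t = k * dt
        W += -v * (x - v * t) * dt   # dW = (dV/dt) dt = -v (x - vt) dt
        # overdamped Langevin in the moving potential, diffusivity 1/beta
        x += -(x - v * t) * dt + np.sqrt(2 * dt / beta) * rng.normal(size=ntraj)
    print(W.mean(), np.exp(-beta * W).mean())  # <W> > 0, <exp(-beta W)> ~ 1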
One can obtain different particular results from the general fluctuation-dissipation relation (257). For example, using the Jensen inequality ⟨e^A⟩ ≥ e^{⟨A⟩} and introducing the free energy F_t = −T ln Z_t, one can obtain the second law of thermodynamics:

⟨W⟩ ≥ F_t − F_0 .

Moreover, the Jarzynski relation is a generalization of the fluctuation-dissipation theorem, which can be derived from it for small deviations from equilibrium. Namely, we can consider V(x, t) = V_0(x) − f(t)x, take the limit f → 0, expand (257) up to second-order terms in f, and get (237).
In the multi-dimensional case, there is another way to drive the system away from equilibrium - to apply a non-potential force f(q, t) (which is not a gradient of any scalar):

q̇ = f − ∂_q V + η . (261)

The new Fokker-Planck equation has an extra term:

∂ρ/∂t = ∂/∂q_i [ρ (∂V/∂q_i − f_i) + T ∂ρ/∂q_i] = −Ĥ_FP ρ . (262)
Again, there is no Gibbs steady state, and the detailed balance (254,255) is now violated by an extra term (263), which is again the power divided by temperature, i.e. the entropy production rate. A close analog of the Jarzynski relation can be formulated for the production rate measured during the time t:
σ_t = −(1/tT) ∫_0^t (f · q̇) dt . (264)

This quantity fluctuates from realization to realization (of the noise η). The probabilities P(σ_t) satisfy the following relation, which we give without derivation (see Kurchan for details):

P(σ_t)/P(−σ_t) ∝ e^{tσ_t} . (265)

The second law of thermodynamics states that to keep the system away from equilibrium, the external force f must on average do positive work. Over a long time we thus expect σ_t to be overwhelmingly positive, yet fluctuations do happen. The relation (265) shows how small the probability to observe a negative entropy production rate is: this probability decays exponentially with the time of observation.
A relation similar to (265) can be derived for any system symmetric with respect to some transformation, to which we add an anti-symmetric perturbation. Consider a system with the variables s_1, . . . , s_N and the even
energy: E_0(s) = E_0(−s). Consider the energy perturbed by an odd term, E = E_0 − hM/2, where M(s) = ∑_i s_i = −M(−s). The probability of the perturbation P[M(s)] satisfies the direct analog of (265), which is obtained by changing the integration variable s → −s:

P(a) = ∫ ds δ[M(s) − a] e^{−βE_0 + βha/2} = e^{βha} ∫ ds δ[M(s) + a] e^{−βE_0 − βha/2} = P(−a) e^{βha} . (266)
9.5 Central limit theorem and large deviations
The mathematical statement underlying most of statistical physics in the thermodynamic limit is the central limit theorem, which states that the sum of many independent random numbers has a Gaussian probability distribution. Recently, however, there has been more and more interest in the statistics of not very large systems or in the statistics of really large fluctuations. To answer such questions, we here discuss the sum of random numbers in more detail.
Consider the variable X which is a sum of many independent identically distributed (iid) random numbers, X = ∑_1^N y_i. Its mean value ⟨X⟩ = N⟨y⟩ grows linearly with N. Its fluctuations X − ⟨X⟩ on scales up to O(N^{1/2}) are governed by the Central Limit Theorem, which states that (X − ⟨X⟩)/N^{1/2} becomes for large N a Gaussian random variable with the variance ⟨y²⟩ − ⟨y⟩² ≡ ∆. Finally, its fluctuations on the larger scale O(N) are governed by the Large Deviation Theorem, which states that the PDF of X has asymptotically the form

P(X) ∝ exp[−N H(X/N)] , (267)

where the rate function H is quadratic near the mean (reproducing the Gaussian statistics) and deviates from the quadratic form for (large) deviations of X/N from the mean of the order of ∆/S′′′(0), where S(z) = ln⟨exp(zy)⟩ is the cumulant generating function whose Legendre transform gives H.
A simple example is provided by tossing a coin and attributing y = 0 to heads and y = 1 to tails. X then gives the overall number of tails, and it appears with the probability 2^{−N} N!/X!(N − X)!. In this example, S(z) = ln(1/2 + e^z/2), ⟨X⟩ = N/2, ∆ = 1/4. The Large Deviation form (267) may be easily obtained from the Stirling formula, and the entropy H is equal to (1 − X/N) ln(1 − X/N) + (X/N) ln(X/N) + ln 2 for X/N between 0 and 1 and to +∞ otherwise (X is never bigger than N).
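The convergence to the large-deviation form can be checked directly; a sketch (Python) comparing the exact binomial log-probability with −NH(X/N):

    import math

    def lnP(N, X):   # exact: ln[ 2^{-N} N!/(X! (N-X)!) ]
        return (math.lgamma(N + 1) - math.lgamma(X + 1)
                - math.lgamma(N - X + 1) - N * math.log(2))

    def H(x):        # the entropy function of the fair coin
        return x * math.log(x) + (1 - x) * math.log(1 - x) + math.log(2)

    N = 1000
    for X in (500, 600, 750, 900):
        print(X, lnP(N, X) / N, -H(X / N))   # agree up to O(ln N / N)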
Another example is the statistics of the kinetic energy, E = ∑_{i=1}^N p_i²/2, of N classical identical unit-mass particles. We considered that back in Section 2.2 in the micro-canonical approach and the thermodynamic limit N → ∞. Let us now look at it using the canonical Gibbs distribution, which is Gaussian in momenta: ρ(p_1, . . . , p_N) = (2πT)^{−N/2} exp(−∑_1^N p_i²/2T). The energy probability distribution is then

ρ(E, N) ∝ E^{N/2−1} e^{−E/T} . (268)

Plotting it for different N, one can appreciate how the thermodynamic limit appears. Taking the logarithm and using the Stirling formula, one gets the large-deviation form for the energy R = E/Ē, normalized by the mean energy Ē = NT/2:

ln ρ(E, N) = (N/2) ln(RN/2) − ln(N/2)! − RN/2 ≈ (N/2)(1 − R + ln R) . (269)

This expression has a maximum at R = 1, i.e. the most probable value is the mean energy. The probability of R is Gaussian near the maximum when R − 1 ≤ N^{−1/2} and non-Gaussian for larger deviations. Notice that this function is not symmetric with respect to the maximum: it has a logarithmic asymptotic at zero and a linear asymptotic at infinity.
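The approach to this form can be seen by comparing the exact log-probability, given by the normalized version of (268), with the asymptotic expression (269); a sketch (Python, T = 1):

    import math

    T = 1.0
    for N in (10, 100, 1000):
        for R in (0.5, 1.0, 2.0):
            E = R * N * T / 2
            ln_exact = ((N / 2 - 1) * math.log(E) - E / T
                        - math.lgamma(N / 2) - (N / 2) * math.log(T))
            ln_ld = (N / 2) * (1 - R + math.log(R))   # the form (269)
            print(N, R, round(ln_exact / N, 4), round(ln_ld / N, 4))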
Basic books
L. D. Landau and E. M. Lifshitz, Statistical Physics Part 1.
R. K. Pathria, Statistical Mechanics.
R. Kubo, Statistical Mechanics.
K. Huang, Statistical Mechanics.
C. Kittel, Elementary Statistical Physics.
Additional reading
S.-K. Ma, Statistical Mechanics.
A. Katz, Principles of Statistical Mechanics.
J. Cardy, Scaling and renormalization in statistical physics.
M. Kardar, Statistical Physics of Particles, Statistical Physics of Fields.
J. Sethna, Entropy, Order Parameters and Complexity.
J. Kurchan, Six out of equilibrium lectures, http://xxx.tau.ac.il/abs/0901.1271
Exam 2007
1. A lattice in one dimension has N sites and is at temperature T. At each site there is an atom which can be in either of two energy states: E_i = ±ϵ. When L consecutive atoms are in the +ϵ state, we say that they form a cluster of length L (provided that the atoms adjacent to the ends of the cluster are in the state −ϵ). In the limit N → ∞,
a) Compute the probability P_L that a given site belongs to a cluster of length L (don't forget to check that ∑_{L=0}^∞ P_L = 1);
b) Calculate the mean length of a cluster ⟨L⟩ and determine its low- and
high-temperature limits.
2. Consider a box containing an ideal classical gas at pressure P and
temperature T. The walls of the box have N0 absorbing sites, each of which
can absorb at most two molecules of the gas. Let −ϵ be the energy of an
absorbed molecule. Find the mean number of absorbed molecules ⟨N ⟩. The
dimensionless ratio ⟨N ⟩/N0 must be a function of a dimensionless parameter.
Find this parameter and consider the limits when it is small and large.
3. Consider the spin-1 Ising model on a cubic lattice in d dimensions,
given by the Hamiltonian
H = −J ∑_{⟨i,j⟩} S_i S_j − ∆ ∑_i S_i² − h ∑_i S_i ,

where S_i = 0, ±1, ∑_{⟨ij⟩} denotes a sum over the z nearest-neighbor sites, and J, ∆ > 0.
(a) Write down the equation for the magnetization m = ⟨Si ⟩ in the mean-
field approximation.
(b) Calculate the transition line in the (T, ∆) plane (take h = 0) which
separates the paramagnetic and the ferromagnetic phases. Here T is
the temperature.
(d) Show that the susceptibility in the paramagnetic phase is given by

χ = 1 / [k_B T (1 + e^{−β∆}/2) − Jz] .
Solutions
Problem 1.
a) The probabilities for any site to have energy ±ϵ are

P_± = e^{∓βϵ} (e^{βϵ} + e^{−βϵ})^{−1} .

The probability for a given site to belong to an L-cluster is P_L = L P_+^L P_−² for L ≥ 1, since the sites are independent and we also need the two adjacent sites to have −ϵ. The cluster of zero length corresponds to a site having −ϵ, so that P_L = P_− for L = 0. We ignore the possibility that a given site is within L of the ends of the lattice, which is legitimate at N → ∞.

∑_{L=0}^∞ P_L = P_− + P_−² ∑_{L=1}^∞ L P_+^L = P_− + P_−² P_+ (∂/∂P_+) ∑_{L=1}^∞ P_+^L = P_− + P_−² P_+/(1 − P_+)² = P_− + P_+ = 1 .

b)

⟨L⟩ = ∑_{L=0}^∞ L P_L = P_−² P_+ (∂/∂P_+) ∑_{L=1}^∞ L P_+^L = P_+(1 + P_+)/P_− = e^{−2βϵ} (e^{βϵ} + 2e^{−βϵ})/(e^{βϵ} + e^{−βϵ}) .

At T = 0 all sites are in the lower level and ⟨L⟩ = 0. As T → ∞, the probabilities P_+ and P_− become equal and the mean length approaches its maximum ⟨L⟩ = 3/2.
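A quick Monte Carlo check of these formulas (a Python sketch with an assumed βϵ = 0.3):

    import numpy as np

    rng = np.random.default_rng(6)
    beta_eps, N = 0.3, 200000
    p_plus = np.exp(-beta_eps) / (2 * np.cosh(beta_eps))  # P_+ of the +eps state
    up = rng.random(N) < p_plus
    L_of_site = np.zeros(N)
    i = 0
    while i < N:                 # label every +eps run with its length
        if up[i]:
            j = i
            while j < N and up[j]:
                j += 1
            L_of_site[i:j] = j - i
            i = j
        else:
            i += 1
    pp, pm = p_plus, 1 - p_plus
    print(L_of_site.mean(), pp * (1 + pp) / pm)  # simulation vs formula for <L>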
Problem 2.
Since each absorbing site is in equilibrium with the gas, the site and the gas must have the same chemical potential µ and the same temperature T. The fugacity of the gas, z = exp(βµ), can be expressed via the pressure from the grand canonical partition function:

Z_g(T, V, µ) = exp[zV(2πmT)^{3/2} h^{−3}] ,    PV = Ω = T ln Z_g = zV T^{5/2} (2πm)^{3/2} h^{−3} .

The grand canonical partition function of an absorbing site, Z_site = 1 + ze^{βϵ} + z²e^{2βϵ}, gives the average number of absorbed molecules per site:

⟨N⟩/N_0 = (z/Z_site) ∂Z_site/∂z = (x + 2x²)/(1 + x + x²) ,

where the dimensionless parameter is x = ze^{βϵ} = P T^{−5/2} e^{βϵ} h³ (2πm)^{−3/2}. The limits are ⟨N⟩/N_0 → 0 as x → 0 and ⟨N⟩/N_0 → 2 as x → ∞.
Problem 3.
a) H_eff(S) = −JzmS − ∆S² − hS, S = 0, ±1, which gives

m = e^{β∆} [e^{β(Jzm+h)} − e^{−β(Jzm+h)}] / {1 + e^{β∆} [e^{β(Jzm+h)} + e^{−β(Jzm+h)}]} .

b) For h = 0,

m ≈ e^{β∆} [2βJzm + (βJzm)³/3] / {1 + 2e^{β∆} [1 + (βJzm)²/2]} .

The transition line is β_c Jz = 1 + (1/2) e^{−β_c ∆}. At ∆ → ∞ it turns into the Ising result.
c)

m² = (β − β_c) Jz / [(β_c Jz)²/2 − (β_c Jz)³/6] .

d)

m ≈ e^{β∆} (2βJzm + 2βh) / (1 + 2e^{β∆}) ,    m ≈ 2βh (2 + e^{−β∆} − 2βJz)^{−1} ,    χ = ∂m/∂h .
1 + 2eβ∆
Problem 4.
Since there are 2⁵ = 32 different characters, every character brings 5 bits, and the entropy decrease is 5 × 3000/ log₂ e. The energy emitted by the lamp, Pt, brings the entropy increase Pt/kT, which is 100 × 100 × 10²³ × log₂ e/(1.38 × 300 × 5 × 3000) ≃ 10²⁰ times larger.
Exam 2009
1. Consider a classical ideal gas of atoms whose mass is m moving in an
attractive potential of an impenetrable wall: V (x) = Ax2 /2 with A > 0 for
x > 0 and V (x) = ∞ for x < 0. Atoms move freely along y and z. Let T
be the temperature and n be the number of atoms per unit area of the wall.
Consider the thermodynamic limit.
a) Calculate the concentration n(x) — the number of atoms per unit volume as a function of the distance x from the wall. Note that n = ∫_0^∞ n(x) dx.
b) Calculate the energy and specific heat per unit area of the wall.
c) Find the free energy and chemical potential.
d) Find the pressure the atoms exert on the wall (i.e. at x = 0).
4. One may think of certain networks, such as the internet, as a directed
graph G of N vertices. Every pair of vertices, say i and j, can be connected by
multiple edges (e.g. hyperlinks) and loops may connect edges to themselves.
The graph can therefore be described in terms of an adjacency matrix Aij
with N 2 elements; each Aij counts the number of edges connecting i and j
and can be any non-negative integer, 0, 1, 2...
The entropy of the ensemble of all possible graphs is S = −∑_G p_G ln p_G = −∑_{{A_ij}} p(A_ij) ln p(A_ij). Consider such an ensemble with a fixed average number of edges per vertex ⟨k⟩.
(i) Write an expression for the number of edges per vertex k for a given
graph Aij . Use the maximum entropy principle to calculate pG (Aij ) and the
partition function Z (denote the Lagrange multiplier that accounts for fixed
⟨k⟩ by τ ). What is the equivalent of the Hamiltonian? What are the degrees
of freedom? What kind of “particles” are they? Are they interacting?
(ii) Calculate the free energy F = − ln Z, and express it in terms of τ . Is
it extensive with respect to any number?
(iii) Write down an expression for the mean occupation number ⟨A_ij⟩ as a function of τ. What is the name of this statistics? What is the "chemical potential" and why?
(iv) Express F via N and ⟨k⟩. Express pG for a given graph as a function
of k, ⟨k⟩ and N .
Solutions
Problem 1.
a) n(x) = n(0) exp(−βAx2 /2) = 2n(Aβ/2π)1/2 exp(−βAx2 /2).
b) For x_i > 0,

H = ∑_i (p_i²/2m + Ax_i²/2) .

Equipartition gives E = 4(T/2)N = 2TN and C_v = 2N.
c)

Z_N = (N! h^{3N})^{−1} ∫ ∏_i dp_i dx_i exp[−β ∑_i (p_i²/2m + Ax_i²/2)] = (L^{2N}/N! h^{3N}) (2πmT)^{3N/2} (√(2πT/A)/2)^N ,

F = −T [2N ln L − 3N ln h − N ln N + N + N ln(2π²T²m^{3/2}/A^{1/2})] ,

µ = ∂F/∂N = −T [2 ln L − 3 ln h − ln N + ln(2π²T²m^{3/2}/A^{1/2})] = −T [ln(2π²T²m^{3/2}/A^{1/2}) − 3 ln h − ln n] .
Problem 2
Our main task is to find the number of particles per unit time escaping
from the cavity. To this end, we first suppose that the cavity is a cube of side
L and that single-particle wavefunctions satisfy periodic boundary conditions
at its walls. Then the allowed momentum eigenvalues are pi = hni /L (i =
1, ..., 3), where the ni are positive or negative integers. Then (allowing for
two spin polarizations) the number of states with momentum in the range d3 p
is (2V /h3 )d3 p, where V = L3 is the volume. (To an excellent approximation,
this result is independent of the shape of the cavity.) Multiplying by the
grand canonical occupation numbers for these states, we find that the number
of particles per unit volume with momentum in the range d3 p near p is
n(p) d³p, where

n(p) = (2/h³) {exp[β(ϵ(p) − η)] + 1}^{−1} , (270)

with ϵ(p) = |p|²/2m. Now we consider, in particular, a small volume enclos-
ing the hole in the cavity wall, and adopt a polar coordinate system with
the usual angles θ and ϕ, such that the axis θ = 0 (the z axis, say) is the
outward normal to the hole (see corresponding figure).
The number of electrons per unit volume whose momentum has a magnitude between p and p + dp and is directed into a solid angle dΩ = sin θ dθ dϕ surrounding the direction (θ, ϕ) is

n(p) p² sin θ dp dθ dϕ , (271)

and, since these electrons have a speed p/m, the number per unit time crossing a unit area normal to the (θ, ϕ) direction is

(1/m) n(p) p³ sin θ dp dθ dϕ . (272)

The hole subtends an area δA cos θ normal to this direction, so the number of electrons per unit time passing through the hole with momentum between p and p + dp into the solid angle dΩ is

(δA/m) n(p) p³ sin θ cos θ dp dθ dϕ . (273)
It is useful to check this for the case of a classical gas, with

n(p) = n (2πmk_B T)^{−3/2} exp(−p²/2mk_B T) ,

where n is the total number density, and with V = 0. Bearing in mind that only those particles escape for which 0 ≤ θ < π/2, we then find that the total number of particles escaping per unit time is

[δA n / m(2πmk_B T)^{3/2}] ∫_0^∞ dp ∫_0^{π/2} dθ ∫_0^{2π} dϕ p³ exp(−p²/2mk_B T) sin θ cos θ = (1/4) n c̄ δA , (274)

where c̄ = √(8k_B T/πm) is the mean speed. This standard result can be obtained by several methods in elementary kinetic theory. For the case in
hand, the number of electrons escaping per unit time is

dN/dt = (δA/m) ∫_{√(2mV)}^∞ dp ∫_0^{π/2} dθ ∫_0^{2π} dϕ p³ n(p) sin θ cos θ (275)
= (2πδA/mh³) ∫_{√(2mV)}^∞ p³ [e^{β(ϵ(p)−µ)} + 1]^{−1} dp . (276)
Problem 3
In general, particles flow from a region of higher chemical potential to a
region of lower chemical potential. We therefore need to find out in which
region the chemical potential is higher, and we do this by considering the
grand canonical expression for the number of particles per unit volume. In
the presence of a magnetic field, the single-particle energy is ϵ ± τH, where ϵ is the kinetic energy, depending on whether the magnetic moment is parallel or antiparallel to the field. The total number of particles is then given by

N = ∫_0^∞ dϵ g(ϵ) {exp[β(ϵ − η − τH)] + 1}^{−1} + ∫_0^∞ dϵ g(ϵ) {exp[β(ϵ − η + τH)] + 1}^{−1} . (279)
For non-relativistic particles in a d-dimensional volume V, the density of states is g(ϵ) = γV ϵ^{d/2−1}, where γ is a constant. At T = 0, the Fermi distribution function is

lim_{β→∞} [e^{β(ϵ−µ±τH)} + 1]^{−1} = θ(µ ∓ τH − ϵ) , (280)

where θ(·) is the step function, so the integrals are easily evaluated with the result

N/V = (2γ/d) [(µ + τH)^{d/2} + (µ − τH)^{d/2}] . (281)

At the moment that the wall is removed, N/V is the same in regions A and B; so (with H = 0 in region B) we have

(µ_A + τH)^{d/2} + (µ_A − τH)^{d/2} = 2µ_B^{d/2} . (282)
Expanding for small H, we obtain

(µ_B/µ_A)^{d/2} = 1 + [d(d − 2)/8] (τH/µ_A)² + . . . (284)
We see that, for d = 2, the chemical potentials are equal, so there is no flow of particles. For d > 2, we have µ_B > µ_A, so particles flow towards the magnetic field in region A while, for d < 2, the opposite is true. We can prove that the same result holds for any magnetic field strength as follows. For compactness, we write λ = τH. Since our basic equation (µ_A + λ)^{d/2} + (µ_A − λ)^{d/2} = 2µ_B^{d/2} is unchanged if we change λ to −λ, we can take λ > 0 without loss of generality. Bearing in mind that µ_B is fixed, we calculate dµ_A/dλ as

dµ_A/dλ = −[(µ_A + λ)^{d/2−1} − (µ_A − λ)^{d/2−1}] / [(µ_A + λ)^{d/2−1} + (µ_A − λ)^{d/2−1}] . (285)

Since µ_A + λ > µ_A − λ, we have (µ_A + λ)^{d/2−1} > (µ_A − λ)^{d/2−1} if d > 2 and vice versa. Therefore, if d > 2, then dµ_A/dλ is negative and, as the field is increased, µ_A decreases from its zero-field value µ_B and is always smaller than µ_B. Conversely, if d < 2, then µ_A is always greater than µ_B. For d = 2, we have µ_A = µ_B independent of the field.
Problem 4
(i) k = N^{−1} ∑_{i,j} A_ij. The thermodynamic potential (the Lagrangian in mechanical terms) to be minimized is therefore

L = −∑_{{A_ij}} p(A_ij) ln p(A_ij) + λ_0 ∑_{{A_ij}} p(A_ij) − τ N^{−1} ∑_{{A_ij}} p(A_ij) ∑_{i,j} A_ij . (286)
(ii) F is extensive in N².
(iii) ⟨A_ij⟩ = (e^{τ/N} − 1)^{−1}, which is the Bose-Einstein statistics with zero chemical potential, since additional edges are free to form.
(iv) ⟨k⟩ = N^{−1} ∑_{i,j} ⟨A_ij⟩ = N (e^{τ/N} − 1)^{−1} ⇒ τ/N = ln(N/⟨k⟩ + 1). It follows that the free energy is F = N² ln(1 − e^{−τ/N}) = −N² ln(1 + ⟨k⟩/N). Similarly, p_G(A_ij) = (N/⟨k⟩ + 1)^{−Nk} (1 + ⟨k⟩/N)^{−N²}.