Physics 12c: Introduction To Statistical Mechanics: Course Website
Physics 12c: Introduction To Statistical Mechanics: Course Website
Course website
https://courses.caltech.edu/course/view.php?id=3823.
1
http://www.damtp.cam.ac.uk/user/tong/statphys.html
2
http://users.physics.harvard.edu/~schwartz/teaching
1
Ph 12c 2024 Contents
Contents
1 Counting 7
1.1 Macrostates and microstates . . . . . . . . . . . . . . . . . . . . . . . 7
1.2 Large numbers and sharp distributions . . . . . . . . . . . . . . . . . 8
1.3 Probability and expectation values . . . . . . . . . . . . . . . . . . . 11
1.4 Spin system again . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.5 Connection to random walks . . . . . . . . . . . . . . . . . . . . . . 14
2
Ph 12c 2024 Contents
7 Debye theory 59
10 Ideal Gas 74
10.1 Spin and statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
10.1.1 Bosons vs. Fermions . . . . . . . . . . . . . . . . . . . . . . . 74
10.1.2 Spin review . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
10.1.3 The spin-statistics theorem . . . . . . . . . . . . . . . . . . . 76
10.2 Statistics of Fermions and Bosons . . . . . . . . . . . . . . . . . . . . 77
10.2.1 Fermi-Dirac statistics . . . . . . . . . . . . . . . . . . . . . . 77
10.2.2 Bose-Einstein statistics . . . . . . . . . . . . . . . . . . . . . 77
10.3 The classical limit . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3
Ph 12c 2024 Contents
11 Fermi Gases 80
11.1 Gibbs sum and energy . . . . . . . . . . . . . . . . . . . . . . . . . . 80
11.2 Zero temperature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
11.3 Heat capacity at low temperature . . . . . . . . . . . . . . . . . . . . 83
4
Ph 12c 2024 Contents
5
Ph 12c 2024 List of definitions
List of definitions
Definition (microstate/state) . . . . . . . . . . . . . . . . . . . . . . 7
Definition (macrostate) . . . . . . . . . . . . . . . . . . . . . . . . . 8
Definition (expectation value) . . . . . . . . . . . . . . . . . . . . . . 11
Definition (entropy) . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
Definition (temperature) . . . . . . . . . . . . . . . . . . . . . . . . . 20
Definition (heat capacity) . . . . . . . . . . . . . . . . . . . . . . . . 23
Definition (ensemble) . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Definition (microcanonical ensemble) . . . . . . . . . . . . . . . . . . 26
Definition (reservoir) . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Definition (Boltzmann factor) . . . . . . . . . . . . . . . . . . . . . . 28
Definition (partition function) . . . . . . . . . . . . . . . . . . . . . . 28
Definition (canonical ensemble) . . . . . . . . . . . . . . . . . . . . . 28
Definition (thermal density matrix) . . . . . . . . . . . . . . . . . . . 33
Definition (Gibbs/Shannon entropy) . . . . . . . . . . . . . . . . . . 35
Definition (von-Neumann entropy) . . . . . . . . . . . . . . . . . . . 35
Definition (thermodynamic limit) . . . . . . . . . . . . . . . . . . . . 40
Definition (free energy) . . . . . . . . . . . . . . . . . . . . . . . . . 41
Definition (constrained derivative) . . . . . . . . . . . . . . . . . . . 43
Definition (orbital) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Definition (black) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Definition (absorptivity) . . . . . . . . . . . . . . . . . . . . . . . . . 56
Definition (emissivity) . . . . . . . . . . . . . . . . . . . . . . . . . . 56
Definition (isentropic) . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Definition (diffusive contact) . . . . . . . . . . . . . . . . . . . . . . 68
Definition (chemical potential) . . . . . . . . . . . . . . . . . . . . . 69
Definition (general reservoir) . . . . . . . . . . . . . . . . . . . . . . 72
Definition (grand canonical ensemble) . . . . . . . . . . . . . . . . . 72
Definition (grand canonical partition function/Gibbs sum) . . . . . . 73
Definition (Fermi energy) . . . . . . . . . . . . . . . . . . . . . . . . 82
Definition (reversible process) . . . . . . . . . . . . . . . . . . . . . . 103
Definition (heat) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
Definition (work) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
Definition (heat engine) . . . . . . . . . . . . . . . . . . . . . . . . . 104
Definition (mechanical contact) . . . . . . . . . . . . . . . . . . . . . 111
Definition (Landauer’s principle) . . . . . . . . . . . . . . . . . . . . 144
6
Ph 12c 2024 1 Counting
1 Counting
1.1 Macrostates and microstates
Classical mechanics and quantum mechanics give precise predictions for almost any
physical observable, within their respective realms of validity. For example, to
make a prediction using quantum mechanics, one must determine the initial state
|ψ(t = 0)⟩ of a system and the Hamiltonian H that generates time evolution. Then
one simply evolves the state forward in time using the Schrödinger equation
∂
iℏ |ψ(t)⟩ = H|ψ(t)⟩. (1.1)
∂t
This is already an interesting problem for an individual Hydrogen atom. In the
approximation that the proton is stationary and (and ignoring the spin of the pro-
ton), we can write the state as a linear combination of position and spin eigenstates
for the electron
X Z
|ψsingle-electron (t)⟩ = dd x ψs (x, t)|x, s⟩, (1.2)
s=↑,↓
where s =↑, ↓ is the spin of the electron, and d = 3 is the number of spatial di-
mensions. The Schrodinger equation becomes a partial differential equation (PDE)
for two functions ψ↑,↓ (x, t) of d + 1 = 4 variables (that is, the number of spatial
dimensions plus the time).
Now let us use this framework to describe the weather. The number of molecules
in the atmosphere is approximately 1044 . So all we need to do is solve the Schrödinger
equation, which is a PDE for a function ψs1 ,...,s1044 (x1 , . . . , x1044 , t) of 3 × 1044 + 1 ≈
3 × 1044 variables.
Even to achieve a more modest goal of exactly modeling a handful of a typical
substance, we would need to solve the Schrödinger equation for a wavefunction with
O(NA ) = O(1023 ) variables, where NA is Avagodro’s number.
Of course, these are impossible computational tasks. But that doesn’t stop
us from doing physics. The reason is that the complete wavefunction of a system,
including the positions and spins of all the constituent particles, contains much more
information than we usually care about. We can still make predictions without all
that detailed information.
Let us introduce some terminology.
Definition (microstate/state). A microstate is a complete specification of the
physical state of a system. We often use the term state to mean a microstate.
In the context of quantum mechanics, a microstate is a complete specification of
the wavefunction of the system. In the context of classical mechanics, a microstate
is a complete list of the positions and momenta (and other degrees of freedom) of
all the constituent particles {x1 , . . . , xN , p1 , . . . , pN }.
7
Ph 12c 2024 1 Counting
↓↓↓↓↓↑↑↓↑↓↓↑↑↓↑↑↑↑↓↓↓↑↑↑↓↑↑↓↑↑↓↑↑↑↓↓↑↓↓↑↓↑↑↑↓↓↓↑↓↑↑↑↓↓↑↑↓↓↓↓ (1.3)
8
Ph 12c 2024 1 Counting
where N↑ and N↓ are the number of spins pointing up and down, respectively. Using
the fact that N = N↑ + N↓ , we find
1 1
N↑ = N + s, N↓ = N − s. (1.6)
2 2
Note that s can take N + 1 different values s ∈ {− N2 , − N2 + 1, . . . , N2 − 1, N2 }.
If N is large, the number of possible values of s (macrostates) is much smaller than
the total number of microstates of the system, which is 2N . This means that some
macrostates correspond to a huge number of microstates. We see that the most
extreme macrostates s = ± N2 have one microstate each. The next most extreme
N
± 2 − 1 have N microstates each. You may be able to guess that most of the
microstates are bunched near middle values of s. Let us count them more precisely.
The number of states with spin excess s is given by the number of ways to choose
N↑ objects from a set of size N :
N N N! N!
g(N, s) = = = = 1 . (1.7)
N↑ N↓ N↑ !N↓ ! ( 2 N + s)!( 21 N − s)!
Another way to derive this result is using generating functions. First let us fix a
given state. For each site, associate a symbolic factor x to ↑ and a factor x−1 to ↓.
After multiplying the factors associated to all the sites, the result is xN↑ −N↓ = x2s .
Now, to count all possible states, we can associate to each site an expression x+x−1 .
We claim that the coefficient of x2s in
is the number of states with spin excess s. To see this, note that each configuration
(1.3) is in one-to-one correspondence with a term in the expansion of Z(x)
Z(x) = xx · · · x
+ xx−1 x · · · x
+ xxx−1 · · · x + (2N −3 more terms). (1.9)
9
Ph 12c 2024 1 Counting
Each configuration with spin excess s contributes x2s , so the coefficient of x2s is the
total number of states with spin excess s. We can compute (1.8) with the binomial
theorem
Z(x) = (x + x−1 )N
X N!
= 1 1 x2s
s
( 2 N + s)!( 2 N − s)!
X
2s
= g(N, s)x . (1.10)
s
You will derive Stirling’s approximation on the first problem set. Plugging this into
g(N, s), we have
√
log g(N, s) ≈ (N + 12 ) log N − N + log 2π
√
− (N↑ + 12 ) log N↑ + N↑ − log 2π
√
− (N↓ + 12 ) log N↓ + N↓ − log 2π (1.12)
1
now adding and subtracting 2 log N and using that N = N↑ + N↓ ,
1
log g(N, s) ≈ (N↓ + 2+ N↑ + 12 ) log N − (N↑ + 21 ) log N↑ − (N↓ + 21 ) log N↓
√
− 12 log N − log 2π
N↑ N↓ 1 √
= −(N↑ + 12 ) log − (N↓ + 12 ) log − 2 log N − log 2π. (1.13)
N N
We have already anticipated that the largest number of microstates will be concen-
trated at small s. Thus, let us make the approximation that s/N is small. We can
then Taylor expand the logarithms,
2s 2s2
N↑,↓ 1 2s
log = log 1± = − log 2 ± − 2 + ... (1.14)
N 2 N N N
4
In this course, log x will always denote the correct definition of the logarithm, which is log base
e.
10
Ph 12c 2024 1 Counting
Keeping only the terms up to quadratic order in s2 /N 2 (we justify this in a moment),
our expression becomes
2s 2s2
N 1
− +s+ − log 2 + − 2
2 2 N N
2s 2s2 √
N 1 1
− −s+ − log 2 − − 2 − log N − log 2π. (1.15)
2 2 N N 2
Because of symmetry under s ↔ −s, the odd-order terms in s will drop out. We
only need to keep the constant and quadratic terms which are
2s2
1 1 Nπ
log g(N, s) ≈ − 1− + N log 2 − log
N N 2 2
2s 2 1 Nπ
≈− + N log 2 − log . (1.16)
N 2 2
2
In the last step, we discarded a term of the form 2s N2
. To justify this, we should
make sure that this term is subleading compared to all the others — in other words,
it should be smaller than constants like − 12 log π2 in the large N limit. We confirm
this in a moment. The result can be written as
2s2
g(N, s) ≈ g(N, 0)e− N , (1.17)
where
1/2
N 2
g(N, 0) ≈ 2 . (1.18)
πN
Note that expectation values are linear ⟨af + b⟩ = a⟨f ⟩ + b, where a and b are
constants independent of s.
11
Ph 12c 2024 1 Counting
P N2 P
For convenience, we will abbreviate s=− N
→ s.
2
We can now compute expectation values of various measurements. Firstly, we
have
X
⟨s⟩ = sP (N, s) = 0. (1.23)
s
We don’t have to do any computation to get this result — it follows from the fact
that sP (N, s) is odd under s ↔ −s. A more interesting question is
X
⟨s2 ⟩ = s2 P (N, s). (1.24)
s
Here, we used another trick. Even though the maximal values of s are ± N2 , we
extended the integral from −∞ to ∞. This is OK because the integrand is extremely
tiny at extreme values of s, so we don’t make
q a very big error by changing the integral
N
there. We can now change variables s → 2 x,
1/2 3/2 Z ∞
2 N 2
2
⟨s ⟩ = dx x2 e−x (1.26)
πN 2 −∞
12
Ph 12c 2024 1 Counting
13
Ph 12c 2024 1 Counting
might think: Hey, 10−6 is also pretty small. Perhaps we might sometimes measure
t ∈ [10−6 , 2 × 10−6 ]! You would be wrong. The probability of this happening is
Z 2×10−6 1 10−6 2 1010 9
dsP (N, s) ≈ 10−6 e− 2 ( 10−11 ) ≈ 10−6 e− 2 ≈ 10−2×10 . (1.33)
10−6
If you do observe a spin excess of 10−6 , here is a list of more likely scenarios:
• Your eyes are not working (i.e. your personal measurement equipment is bro-
ken).
• You accidentally drove to the wrong university and measured the wrong spin
system.
• You are actually living in the matrix and the machines are doing an experiment
to see if you ever studied statistical mechanics.
The point is that the distribution (1.21) becomes ridiculously sharply peaked
near s = 0 when N is large, at least if we measure deviations relative to macroscopic
quantities (i.e. quantities that scale linearly with N ). For all intents and purposes,
even though we have very little knowledge of the spin system and its connection to
the environment, we can confidently predict t = 0, based on counting.
14
Ph 12c 2024 1 Counting
which has solution ⟨Sn ⟩ = ⟨S0 ⟩ = 0. This is just the statement that the average
spin excess vanishes.
Now consider the expectation value of the square of (1.34),
The middle term vanishes because the value of sn does not depend on Sn−1 . We
say that the two quantities are uncorrelated. We have ⟨Sn−1 sn ⟩ = ⟨Sn−1 ⟩⟨sn ⟩ = 0.
More precisely,
X
⟨Sn−1 sn ⟩ = P (s1 , . . . , sn )Sn−1 (s1 , . . . , sn−1 )sn
s1 ,...,sn
X
= P (s1 ) · · · P (sn )Sn−1 (s1 , . . . , sn−1 )sn
s1 ,...,sn
!
X X
= P (s1 ) · · · P (sn )Sn−1 (s1 , . . . , sn−1 ) P (sn )sn
s1 ,...,sn−1 sn
= ⟨Sn−1 ⟩⟨sn ⟩
= 0. (1.38)
In the second line, we used the fact that the probability of a configuration of spins is
the product of the probabilities for each individual spin (P (si ) = 1/2 for si = ±1).
This is true in our simple model. However, in more general cases, like in real
magnets, we might not be able to factorize P (s1 , . . . , sn ) and the above computation
would not be correct.
However, the last term is interesting because sn = ±1, so s2n is always 1. Thus,
we find
⟨Sn2 ⟩ = ⟨Sn−1
2
⟩ + 41 . (1.39)
15
Ph 12c 2024 2 Entropy and Temperature
• Time scale. Which states are accessible also depends on the time scale
being considered. Glass is a famous example of a substance that exhibits
dynamics on very long time scales — its relaxation time can be millions of
years. However, for practical purposes on human time scales, we shouldn’t
consider states where glass flows into other configurations.
16
Ph 12c 2024 2 Entropy and Temperature
respect conservation laws, so the system will wander through the space of states in
a random way while still respecting, e.g. conservation of energy.
Two properties of typical systems support this picture
All this intuition does not amount to a proof, and in fact the fundamental
assumption is only an approximation. There are some properties of a state that are
very difficult to measure, and therefore not important for our purposes, but take a
very long time to randomize. An example is the “computational complexity” of a
state.
17
Ph 12c 2024 2 Entropy and Temperature
18
Ph 12c 2024 2 Entropy and Temperature
19
Ph 12c 2024 2 Entropy and Temperature
√
deviation δU1 /U1 ∼ O(1/ N ) is small. We should think of (2.9) as being extremely
sharply-peaked as a function of U1 .
σ = log g. (2.14)
20
Ph 12c 2024 2 Entropy and Temperature
We use 1/τ on the left-hand side so that τ has units of energy. (Note that the
entropy σ is unitless.) With this definition, the condition for thermal equilibrium
becomes
τ1 = τ2 . (2.17)
For historical reasons, the temperature that we usually measure with a ther-
mometer is defined by
τ
T = , (2.18)
kB
S = kB log g, (2.19)
so that we have
1 ∂S
= . (2.20)
T ∂U
These conventions date from a time when the relationship between temperature,
entropy, and the number of states was not understood. From a modern point of
view, kB is a silly constant. Entropy should be unit-less, and temperature should
have units of energy. In this course, we will almost always work with τ and σ instead
of T and S.
σ = log(g1 g2 ) = σ1 + σ2 . (2.21)
21
Ph 12c 2024 2 Entropy and Temperature
then the number of accessible states will roughly double, g → g ′ = 2g.8 Thus, the
actual value of g depends sensitively on our precise energy resolution. However, the
entropy will change by σ → σ ′ = σ + log 2. If σ ∼ 1022 is large, the difference
between σ and σ ′ is insignificant. Thus, we can reliably discuss the entropy of a
system without having to give a detailed definition of the space of accessible states.
All of this supports the mantra that when you have a ridiculously large number,
it’s better to think about it’s logarithm. You’ll notice that in our examples with
spin systems and gaussian integrals, we were manipulating logarithms.
Clearly we have
with equality if Ub1 = U10 (i.e. the system S1 was already at the equilibrium tem-
perature of the combined system).
This is the statement that entropy increases. The entropy of a closed system will
remain constant or increase when a constraint internal to the system is removed.
Some examples of operations that increase entropy are adding particles, adding
energy, increasing volume, decomposing molecules.
It is extremely unlikely to observe a system spontaneously lower its entropy. For
example, suppose we have a gas of particles in a room. A low entropy configuration
would be one where all the particles are contained in the top left corner. (This is
low entropy because the number of such states is much smaller than the number of
states where the particles roam freely.) However, the chances of the room evolving
into this state are ridiculously tiny. Suppose the particles are non-interacting. The
probability at any given time of observing a particle in 1/2 of the room is 1/2.
The probability of observing all the particles simultaneously in 1/2 the room is
23
1/2N ∼ 1/210 — i.e. it will never happen.
8
This is true assuming the density of states is roughly constant as a function of energy over
scales of size ∆U .
22
Ph 12c 2024 2 Entropy and Temperature
Consider two systems brought into thermal contact that have not yet reached
equilibrium. They will exchange some energy U1 → U1 + δU and U2 → U2 − δU in
such a way that entropy increases. Thus, we have
∂σ1 ∂σ2
δU − δU ≥ 0
∂U1 ∂U2
1 1
− δU ≥ 0. (2.25)
τ1 τ2
In other words, δU will be positive and S1 will get hotter if τ2 > τ1 , and vice versa
if τ1 > τ2 . Thus, energy flows in the way one would expect: from hotter to cooler.
23
Ph 12c 2024 2 Entropy and Temperature
The entropy at zero temperature is the log of the ground state multiplicity
σ(0) = log g(0). For most systems, g(0) is small, so σ(0) is effectively zero compared
to macroscopic entropies at nonzero temperature (which are O(N ), where N is
the number of particles). In such systems, we can drop the last term above. For
glassy systems, σ(0) can still have macroscopic size — such systems have frozen-in
disorder, so that there are many different (near) ground states where the atoms are
distributed in many different ways.
Equation (2.31) is an awesome formula! You might think that the entropy, i.e.
the log of the number of accessible states, is a pretty abstract concept. Furthermore,
it’s not easy to imagine computing it directly: how can we count the microstates of
a system of 1023 particles? But using (2.31) you can actually figure out what it is!
Another property we should mention is that heat capacity is usually positive:
adding energy usually increases temperature. However, there are some exotic sys-
tems that possess negative heat capacity. The most famous example is a black hole:
24
Ph 12c 2024 2 Entropy and Temperature
adding energy increases the size of the black hole, which lowers the temperature
(and lowers the rate of Hawking radiation). Conversely, removing energy decreases
the size of the black hole and raises the temperature. This leads to a runaway effect:
the black hole will radiate more and more, its temperature rising and rising, until
it decays away to nothing.
0th law: If two systems are in thermal equilibrium with a third system, then
they are in thermal equilibrium with each other. This is the definition of
equilibrium τ1 = τ2 , together with the transitive property of equality.
From the point of view of statistical mechanics, most of these aren’t really funda-
mental laws — they’re consequences of statistics and properties of typical systems.
Honestly, I never learned which law had which number, just like I never bothered
to learn the difference between class 1, class 2, and class 3 levers. I’m supposedly a
professional physicist, and it’s almost never been a problem, although people talk a
lot about the 2nd law when discussing black hole physics (for reasons that we’ll see
later), so I kind of know which one that is. To avoid hypocrisy, I will try not to test
you on which law is which.
9
Here’s a song about the laws of thermodynamics: https://www.youtube.com/watch?v=VnbiVw_
1FNs.
25
Ph 12c 2024 3 The Boltzmann distribution
In the last lecture, we studied the ensemble for an isolated system with a known
fixed energy U . According to the fundamental assumption of statistical mechanics,
it is given by
(
1
if s has energy U ,
Pmicrocanonical (s) = g(U ) (3.2)
0 otherwise,
where g(U ) is the number of states with energy U — i.e., the number of accessible
states.
Definition (microcanonical ensemble). For historical reasons, the probability
distribution (3.2) is called the microcanonical ensemble.
• S is a few air molecules in the room and R is all the air in the room
• S is a dark matter particle and R is the dark matter halo in the galaxy
26
Ph 12c 2024 3 The Boltzmann distribution
What is the probability that the system S will be in a state s with energy Es ?
Let the total energy of the combined system S + R be Etot . By the fundamental
assumption of statistical mechanics,
The first factor gS (S is in state s) is simply 1 because we are demanding that the
microstate of S be exactly s. In the second factor gR (Etot − Es ), we have used
conservation of energy.
The constant of proportionality in (3.3) is the inverse of the number of accessible
states of S + R. So that we don’t have to discuss this quantity, consider the ratio
of probabilities associated to two microstates s, s′ ,
P (s) gR (Etot − Es )
′
= . (3.4)
P (s ) gR (Etot − Es′ )
∂σR 1 2 ∂ 2 σR
σR (Etot − Es ) = σR (Etot ) − Es + Es + ...
∂U 2 ∂U 2
Es 1 2 ∂ 2 σR
= σR (Etot ) − + Es + .... (3.6)
τ 2 ∂U 2
In the second line, we have assumed that R is at temperature τ and we have used
the definition of temperature.
We claim that in the limit of a very large reservoir, the second and higher
derivative terms above can be dropped. Let us first consider the second derivative
term. It is given by
∂ 2 σR
∂ 1 1 ∂τ 1 1
2
= =− 2 =− 2 , (3.7)
∂U ∂U τ τ ∂U τ C
where C is the heat capacity of the reservoir. Remember that a reservoir has ex-
tremely large heat capacity, so this term is negligible.
Let us understand this result in a cruder way that will help with the higher-
order terms in (3.6). The largeness of C comes from the fact that energy U is
extensive and temperature τ is intensive. Thus, C = ∂U ∂τ scales like O(N ). By the
∂ k σR
same large-N scaling analysis, we expect ∂U k
to scale like O(N/N k ) = O(N 1−k ):
27
Ph 12c 2024 3 The Boltzmann distribution
the N in the numerator comes from the fact that σR is extensive, and the N k in
the denominator comes from the fact that U is extensive and we have k factors of
U . Thus, in the limit of a large reservoir, the higher-order terms in (3.6) should
be dropped as well. On your homework, you will make this result rigorous in the
case where the environment can be modeled as N non-interacting copies of a single
system.
Thus, let us drop the higher-derivative terms in (3.6). The quantity σR (Etot )
cancels between the numerator and denominator, and we are left with
P (s) e−Es /τ
= =⇒ P (s) = Ae−Es /τ , (3.8)
P (s′ ) e−Es′ /τ
where A is independent of s. Using the fact that probabilities must sum to 1, we
can solve for A,
X 1
Ae−Es /τ = 1 =⇒ A= , (3.9)
s
Z(τ )
where
X
Z(τ ) ≡ e−Es /τ . (3.10)
s
Overall, we find
e−Es /τ
P (s) = , (3.11)
Z(τ )
Note that almost all reference to the reservoir has dropped out, with the only re-
maining dependence on R through its temperature τ . Equation (3.11) is the most
important equation in statistical mechanics. It’s so important that the numerator,
denominator, and ratio on the right-hand side all have their own names:
Definition (Boltzmann factor). The quantity e−Es /τ in the numerator of (3.11)
is called a Boltzmann factor.
Definition (partition function). The quantity Z(τ ) in (3.10) is called the parti-
tion function.
28
Ph 12c 2024 3 The Boltzmann distribution
of states in (3.4) and keep the first two terms of that expansion instead? Instead of
an exponential, that procedure would have given a linear function of Es , which is
the wrong answer. The reason is that Taylor expanding a logarithm gives a more
efficient way to approximate quickly-growing functions.
As an example, consider the model function g(U ) = U N , where N = 1022 is
enormous. If we Taylor expand g(U ) itself, then we have
N (N − 1) Es2
N Es
g(Etot − Es ) = Etot 1 − N + 2 + ... (3.12)
Etot 2 Etot
The problem with this series compared to (3.6) is that even though each subsequent
term comes with an extra 1/Etot (which goes like 1/N and is small), it also comes
with an extra factor of N (which is large). In the large-N limit, we cannot drop
all the higher-derivative terms. Instead, the best we can do is drop subleading
quantities in N in each term. This leads to
N 2 Es2
N Es
g(Etot − Es ) ∼ Etot 1−N + 2 + . . .
Etot 2 Etot
−Es EN
N
= Etot e tot = g(Etot )e−Es /τ (3.13)
so we get an exponential again. Here, we used the fact that τ = g(Etot )/g ′ (Etot ) =
Etot /N in this example. Taking the logarithm first gives a more efficient way to sum
all these terms.
29
Ph 12c 2024 3 The Boltzmann distribution
We can also compute the expectation value of the square of the energy
1 ∂2 ∂2 1 ∂Z(β) 2
2
⟨E ⟩ = Z(β) = log Z(β) + . (3.16)
Z(β) ∂β 2 ∂β 2 Z(β) ∂β
In particular, the standard deviation of the energy is
∂2
∆E 2 = ⟨(E − ⟨E⟩)2 ⟩ = ⟨E 2 ⟩ − ⟨E⟩2 = log Z(β). (3.17)
∂β 2
As an example, consider a two-state system with ground-state energy 0 and
excited state energy ε. The partition function is
τ 2 ∂Z(τ )
U = ⟨E⟩ =
Z(τ ) ∂τ
εe−ε/τ
= (3.19)
1 + e−ε/τ
When τ /ε is small, this is close to zero (in fact, it dies incredibly quickly as τ gets
small). When τ /ε is large, this asymptotes to ε/2. The heat capacity is
∂U ε2 e−ε/τ
C= = 2 . (3.20)
∂τ τ (1 + e−ε/τ )2
This has the surprising feature that it spikes near τ /ε ∼ 0.5, a phenomenon known
as the Schottky anomaly. It is “anomalous” because in most materials, the heat
capacity is monotonically increasing with temperature. The reason the Schottky
anomaly appears here is that our system has a maximum energy ε. As τ → ∞,
the largest U = ⟨E⟩ can ever get is ε/2 (coming from an equal probability of being
in the excited and ground state). Typical macroscopic systems have no maximum
energy (or a maximum energy that is way bigger than typical temperatures). In
that case, as we increase τ , the probability becomes nontrivial for more and more
high energy states and ⟨E⟩ can keep increasing.
[End of lecture 3(2024)]
Consider now a pair of two-state systems, each with zero ground state energy,
and with excited state energies ε1 , ε2 . The partition function is
30
Ph 12c 2024 3 The Boltzmann distribution
〈E〉,C
0.5
0.4
0.3
0.2
0.1
τ/E0
0.0 0.5 1.0 1.5 2.0 2.5 3.0
At low temperatures, we have ⟨E⟩ = −N mB as all the spins align with the magnetic
field. At high temperatures, we have ⟨E⟩ = 0 as thermal fluctuations dominate over
the effects of the applied field.
The standard deviation in energy is
∂ 1
∆E 2 = − (−N mB tanh(βmB)) = N (mB)2 2 . (3.26)
∂β cosh (βmB)
31
Ph 12c 2024 3 The Boltzmann distribution
Note that the fluctuations are of size ∆E 2 ∼ N , whereas the energy itself
√ is also
O(N ). Thus, the fractional fluctuations are small, of order ∆E/⟨E⟩ ∼ 1/ N .
There is a more general result relating the size of fluctuations in energy to the
heat capacity. Recall that
−1
∂⟨E⟩ ∂τ ∂⟨E⟩ ∂2 ∆E 2
C= =− = β 2 2 log Z = 2 (3.27)
∂τ ∂β ∂β ∂β τ
We know that heat capacity scales linearly with the size of the system N , so this
shows that ∆E 2 also scales like N .
⟨X⟩
b = ⟨ψ|X|ψ⟩.
b (3.28)
More generally, we may not know the precise state of a system, but only a
probability distribution on the space of states. For example, we may have a lab
partner that randomly decides whether to prepare an up state or down state with
probabilities p and 1 − p. In this case, the expectation value of an observable is
⟨X⟩
b = p⟨↑ |X|
b ↑⟩ + (1 − p)⟨↓ |X|
b ↓⟩. (3.29)
32
Ph 12c 2024 3 The Boltzmann distribution
33
Ph 12c 2024 4 Entropy in general ensembles
• If there are g equally likely accessible states, then we should have σ = log g,
as before.
• σ should be extensive.
If N is large enough, the number of systems in state si will be P (si )N . That is, the
string above will contain P (s1 )N s1 ’s, P (s2 )N s2 ’s, etc.. By taking a large number of
systems, we have turned probabilities into eventualities. A string where s1 , s2 , etc. do
not appear in the correct proportions should be considered inaccessible in the large-
N limit. To determine the entropy, we count the number of accessible configurations
gtotal of the total system and apply the familiar definition σtotal = log gtotal . The
entropy of a single system is then σ = σtotal /N .
The problem of counting the number of strings (4.1) with P (s1 )N s1 ’s, P (s2 )N
s2 ’s, etc. was on the homework. The answer is
eN σ , (4.2)
log 2
where σ is log e times the Shannon entropy of the probability distribution,11
X
σ=− P (s) log P (s). (4.3)
s
34
Ph 12c 2024 4 Entropy in general ensembles
σ = −Tr(b
ρ log ρb), (4.4)
35
Ph 12c 2024 4 Entropy in general ensembles
36
Ph 12c 2024 4 Entropy in general ensembles
of m < n bits. However, the number of such strings is only 2m < 2n , so this cannot
be done in a 1-1 manner.
However, if the data has some structure, compression is possible. An extreme
example is if we know the file is just a sequence of exactly n 0’s. This can be encoded
using zero bits. A slightly less extreme example is if we know the file is a constant
sequence of length n, containing either all 0’s or all 1’s. Such files can be encoded
using 1 bit: 0 if the file is all 0’s or 1 if the file is all 1’s.
In general, when data has structure, the presence of that structure decreases
the number of possible files, making compression possible. For example, an XML
file always starts with <xml> and ends with </xml>. In total, this is 88 bits of
information that we already know is there. Consequently, the number of valid XML
files of length n is less than 2n−88 < 2n .
A more interesting example of structured data is human language. A naive way
to store a document of English text uses ASCII encoding (https://en.wikipedia.
org/wiki/ASCII), where each character is stored using 8 bits. For example, a is
ASCII code 01100001, b is ASCII code 01100010, etc.. This is not very efficient
for English text because the English language contains only 26 letters — perhaps
together with punctuation and spaces, we can say that there are 25 = 32 symbols.
Thus, we should be able to compress English using no more than 5 bits per character.
However, not every random sequence of characters is a good English sentence.
For example, the letter e is more likely to appear than z. We can exploit this extra
structure to make a better compression scheme.
Let us understand how this works in a toy language with only two characters a
and b. First, suppose that a,b occur equally often. From the point of view of letter
frequency, our text file is like a random set of 0’s and 1’s, and we cannot compress
it. (This statement ignores possible correlations between letters, which we discuss
in a moment.) Now suppose instead that a occurs 90% of the time and b occurs
10% of the time. To compress a sequence, it’s a good idea to use fewer bits for a
than for b. One way to accomplish this is the following scheme:
Sequence Code
aa 0
(4.9)
a 10
b 11
Because a occurs very often, we use 0 to stand for a pair of a’s. For example, the
sequence aaaaaabaaa, which is 10 characters, becomes 00011010, which is 8 bits.
The probabilities of all possible two-letter sequences are
Sequence aa ab ba bb
Probability 81% 9% 9% 1% (4.10)
Code 0 1011 1110 1111
37
Ph 12c 2024 4 Entropy in general ensembles
38
Ph 12c 2024 4 Entropy in general ensembles
39
Ph 12c 2024 5 The thermodynamic limit and free energy
Here, we include the subscript ‘c’ to emphasize that Zc is the partition function in
the canonical ensemble. Let us organize the states into energy levels with degeneracy
g(U ) = eσmc (U ) , where σmc (U ) is the entropy of the microcanonical ensemble. The
sum (5.1) becomes
X
Zc (τ ) = eσmc (U )−U/τ . (5.2)
U
The summand involves a competition between two effects: (1) the increasing number
of states eσmc (U ) as energy increases, and (2) the exponentially dying Boltzmann
factors.13 The crossover point where these effects balance out is the solution U∗ of
the equation
∂
0= (σmc (U ) − U/τ ) . (5.3)
∂U U =U∗
In the thermodynamic limit, the summand in (5.2) becomes sharply peaked at the
“most probable configuration” U = U∗ , so we can approximate it by the value of
the summand at that point:
The right-hand side of (5.4) is the partition function of the microcanonical ensemble
(i.e. of a system with eσmc (U∗ ) states, all with energy U∗ ). Applying the formula
⟨E⟩ = τ 2 ∂ log Z
∂τ , we see that ⟨E⟩ ≈ U∗ in the thermodynamic limit.
13
Student question: how do we know that the Boltzmann factor eventually balances the growth
of g(U ) — i.e. that g(U ) doesn’t grow superexponentially? Answer: A system where g(U ) grows
superexponentially forever would have an infinite partition function. If this happens, it means that
some approximation we made in describing the system is breaking down.
40
Ph 12c 2024 5 The thermodynamic limit and free energy
⟨E⟩
σc = + log Z
τ
U∗
≈ + log eσmc (U∗ )−U∗ /τ (thermodynamic limit)
τ
= σmc (U∗ ). (5.5)
F = U − τ σ = −τ log Z. (5.6)
In section (5.1), we showed that in the thermodynamic limit, a system will evolve
toward the “most probable configuration,” where σ(U ) − U/τ = − Fτ is maximized
(equation 5.3). Equivalently, in the thermodynamic limit, a system at fixed tem-
perature will seek to minimize F .
The fact that F is minimized in the most probable configuration is not a new
principle. Recall that the entropy of a reservoir is given by
Thus, we have
F = U − τσ
= τ σR (Utot ) − τ σR (Utot − U ) − τ σ
= −τ σS+R (Utot , U ) + τ σR (Utot )
= −τ σS+R (Utot , U ) + independent of U (5.8)
41
Ph 12c 2024 5 The thermodynamic limit and free energy
the system to seek high entropy. These effects balance each other at the minimum
of the free energy.
[End of lecture 5(2024)]
A nice example is a polymer that can straighten out or curl up. There are many
more configurations where the polymer is curled up than where it is straight. Thus,
the polymer wants to curl up to increase its entropy. We can make it energetically
favorable for the polymer straighten out by pulling on it with an external force Fext .
The free energy is then
∂σ
Fext = −τ . (5.10)
∂ℓ
Thus, the entropy term −τ σ(ℓ) leads to a noticeable force that we can measure. It
is called an “entropic force.” If we increase the temperature, the entropic force gets
stronger. For example, if you heat a rubber band, it contracts.
5.3 Pressure
Consider a system in an energy eigenstate s with energy Es , and let us suppose
that Es is a function of the volume V of the system. The volume can be changed
V → V −∆V by applying an external force. Let us suppose that the volume changes
sufficiently slowly that the system remains in an energy eigenstate.15 The change
in energy is
dEs
Es (V − ∆V ) = Es (V ) − ∆V + . . . . (5.11)
dV
Suppose that the system is compressed by applying pressure p to a surface of
area A. The force is pA. If the force acts over a distance ∆x, then the work imparted
to the system is
42
Ph 12c 2024 5 The thermodynamic limit and free energy
The work ∆W is equal to the change in energy of the system. Comparing (5.11)
and (5.12), we see that the pressure in state s is
dEs
ps = − (5.13)
dV
Averaging over all states in an ensemble, we find
X dEs
⟨p⟩ = P (s) −
s
dV
d X
=− P (s)Es
dV s
∂⟨E⟩ ∂U
=− =− . (5.14)
∂V σ ∂V σ
The subscript σ means that we take the derivative, keeping the entropy fixed. The
entropy is held fixed because the probability distribution on the space of states is
fixed during the compression. Specifically, the compression is slow enough that if
the system starts out in energy eigenstate state s, it will remain in state s. Thus,
the occupation probabilities don’t change and the entropy is unchanged.
43
Ph 12c 2024 5 The thermodynamic limit and free energy
∂x dx
To compute the constrained derivative ∂y z , we can set dz = 0 and solve for dy :
∂x ∂f /∂y
=− . (5.16)
∂y z ∂f /∂x
Similarly, we find
∂y ∂f /∂z
=− ,
∂z x ∂f /∂y
∂z ∂f /∂x
=− . (5.17)
∂x y ∂f /∂z
which we will use in a moment. The nice thing about (5.18) is that it no longer
makes reference to the constraint f (x, y, z). This is useful because we don’t have to
bother writing down f (x, y, z) or choosing conventions for it.
In the second line, we used the definition of temperature. The second term involves a
constrained derivative that we don’t a-priori know, but it can be related to pressure
using the identity (5.18) for U, V, σ:
∂σ ∂V ∂U
= −1
∂V U ∂U σ ∂σ V
∂σ 1
− τ = −1
∂V U p
∂σ p
= (5.20)
∂V U τ
44
Ph 12c 2024 5 The thermodynamic limit and free energy
In the second line, we used the definition of pressure (5.14) and the definition of
temperature.
Substituting this result into (5.19), we find
1 p
dσ = dU + dV
τ τ
τ dσ = dU + pdV. (5.21)
Let us now compute the change in free energy corresponding to dU, dV . We have
dF = dU − τ dσ − σdτ
dF = −σdτ − pdV, (5.22)
from which we derive a relationship between pressure and free energy, together with
another useful expression for entropy
∂F
p=−
∂V τ
∂F
σ=− . (5.23)
∂τ V
ℏ2
− ∇2 ψ = Eψ (5.24)
2m
are
nx πx ny πy nz πz
ψ(x, y, z) = A sin sin sin , (5.25)
L L L
45
Ph 12c 2024 5 The thermodynamic limit and free energy
ℏ2 π 2 2
En = (nx + n2y + n2z ). (5.26)
2m L
We neglect the spin and other structure of the particle.
The partition function is the sum
∞
ℏ2 π 2 2 2 2
e− 2mL2 τ (nx +ny +nz )
X
Z1 =
nx ,ny ,nz =1
∞
2 (n2 +n2 +n2 )
X
= e−α x y z , (5.27)
nx ,ny ,nz =1
2 2
ℏ π
where α2 = 2mL 2 τ , and we exclude nx = 0, ny = 0, and nz = 0 terms as the
wavefunction vanishes with those values.
If α2 is sufficiently small, then the quantity inside the sum is slowly varying and
we can approximate the sum as an integral,
Z ∞ Z ∞ Z ∞
2 2 2 2
Z1 ≈ dnx dny dnz e−α (nx +ny +nz )
0 0 0
Z ∞ 3
2 2
= dne−α n
0
Z ∞ 3
1 −x2
= dxe
α 0
!3
π 1/2
=
2α
mτ 3/2 nQ
= 2
V = nQ V = , (5.28)
2πℏ n
mτ 3/2
where n = 1/V and nQ = 2πℏ 2 is called the quantum concentration. Note we
have set the integration lower bound to 0 with negligible error, as the integrand is
slowly varying—the difference between starting from 0 and 1 is small. It is sometimes
more common to define the thermal wavelength
r
1 2πℏ2
nQ = 3 , λ= . (5.29)
λ mτ
The thermal wavelength is approximately the de Broglie wavelength of an atom with
energy ⟨E⟩ ∼ τ . Specifically λ ∼ ⟨p2ℏ⟩1/2 = (2m⟨E⟩)
ℏ
1/2 ∼ (mτ )1/2 . Note that λ is an
ℏ
46
Ph 12c 2024 5 The thermodynamic limit and free energy
classically. Thus, nQ is the concentration associated with one atom in a cube of size
equal to the thermal average de Broglie wavelength.
The smallness of α is equivalent to the statement
L ≫ λ, (5.30)
i.e. the box is big compared to the thermal wavelength. When this condition holds,
we say that the gas is in the “classical regime.” An ideal gas is a gas of noninteracting
particles in the classical regime.
The average energy of the atom in a box is
2 ∂ 2 ∂ 3 3
U =τ log Z1 = τ log τ = τ. (5.31)
∂τ ∂τ 2 2
In D spatial dimensions, the answer would be
D
U= τ. (5.32)
2
This is a special case of a more general result called “equipartition of energy” that
we derive in section 8.
This will turn out to be an incorrect expression for the ideal gas, for reasons
that we will explain. However, let us continue with it for now. The energy of the
ideal gas follows from the N -particle partition function,
∂
U = τ2 log ZN
∂τ
3
= N τ. (5.34)
2
The free energy is
distinguishable
F distinguishable = −τ log ZN
= −τ N log(nQ V ). (5.35)
47
Ph 12c 2024 5 The thermodynamic limit and free energy
This is called the ideal gas law. Note that the thermal wavelength λ has dropped
out of this formula. In more traditional units,
48
Ph 12c 2024 5 The thermodynamic limit and free energy
Here, F = 0 if the particles are bosons and F = 1 if the particles are fermions.
Thus, for indistinguishable particles, the sum (5.33) over-counts the states. To
correctly count the states, we must divide by the number of permutations of the
labels s1 , . . . , sN . For example, if we have three bosonic particles in the follow-
ing states, then the distinguishable partition function overcounts by the following
factors:
|1, 1, 1⟩ → 1
|1, 1, 2⟩ → 3
|1, 2, 3⟩ → 6
|123, 14, 999⟩ → 6. (5.41)
If the particles are fermionic, then the first two states above actually vanish, while
the last two are again overcounted in Z distinguishable by a factor of 6.
If the number of particles is much smaller than the number of states si , then most
states have all labels si different (i.e. very few orbitals will be multiply occupied), so
the distinguishable partition function overcounts by a factor of N !. This is a good
approximation when we have a dilute gas, where multiple-occupancy is very rare.
We will see how to deal correctly with multiple-occupancy later.
Let us assume a dilute gas. Taking into account the indistinguishability of
fundamental particles, the correct partition function is
indistinguishable 1 N
ZN = Z (dilute gas). (5.42)
N! 1
The free energy is
F = −τ log ZN
= −τ N log (nQ V ) + τ log N !
= −τ N log (nQ V ) + τ (N log N − N ), (5.43)
where we have used Stirling’s approximation and only kept terms of size N or
N log N . The pressure computation is the same as before. However, now the result
for the entropy is different
∂F
σ=−
∂τ V
3
= N log(nQ V ) + N − N log N + N
2
V 5
= N log nQ + . (5.44)
N 2
If we fix the concentration N/V , then this formula is now extensive. The extra factor
of 1/N ! has solved the Gibbs paradox. Equation (5.44) called the Sackur-Tetrode
equation, and it does agree with experiment.
49
Ph 12c 2024 6 Blackbody radiation and the Planck distribution
50
Ph 12c 2024 6 Blackbody radiation and the Planck distribution
51
Ph 12c 2024 6 Blackbody radiation and the Planck distribution
Because E and B must be real, we have the condition E(x, e ω)∗ = E(x,
e −ω) and
similarly for B.
e
Maxwell’s equations are also translationally-invariant, so the functions E(x,
e ω)
and B(x,
e ik·x
ω) can be written as linear combinations of plane waves e . However,
we must impose boundary conditions, and this will require us to take special linear
combinations of eik·x that become sines and cosines. We have two kinds of boundary
conditions:
A linear combination of plane waves that satisfies the above boundary conditions
is
ex (x, ω) = Ex0 cos nx πx sin ny πy sin nz πz
E
L L L
n x πx n y πy n z πz
E
ey (x, ω) = Ey0 sin cos sin
L L L
ez (x, ω) = Ez0 sin nx πx sin ny πy cos nz πz .
E (6.11)
L L L
where n = (nx , ny , nz ) is a vector of positive integers. It is hopefully easy to see
that (6.11) obeys the boundary condition E∥ = 0. To see that it obeys B⊥ = 0, we
can use (6.10). For example,
nx πx ny πy nz πz
Ḃx = ∂y Ez − ∂z Ey ∝ sin cos cos , (6.12)
L L L
52
Ph 12c 2024 6 Blackbody radiation and the Planck distribution
which vanishes at x = 0.
The components of the electric field are not independent. We also have to impose
∂Ex ∂Ey ∂Ez
∇·E= + + = 0. (6.13)
∂x ∂y ∂z
This gives the condition
Ex0 nx + Ey0 ny + Ez0 nz = E0 · n = 0, (6.14)
where we have defined E0 = (Ex0 , Ey0 , Ez0 ). For any given n, equation (6.14) says
that fluctuations of the electric field must be perpendicular to n. The two perpen-
dicular directions are the two possible polarizations of electromagnetic radiation.
Finally, we also have the wave equation, which says
∂2E
c2 ∇2 E = . (6.15)
∂t2
Plugging in (6.9) with E(x,
e ω) given by (6.11), we find a relationship between ω and
n:
c2 π 2 n2
= ω2. (6.16)
L2
We return to this quantity in a moment. For now, let us study the energy U of
photons in the cavity. We have
∂
U = τ2 log Z
∂τ
X ℏωn
=2 . (6.19)
n
e n /τ − 1
ℏω
53
Ph 12c 2024 6 Blackbody radiation and the Planck distribution
We now make a “big box” approximation, so that the quantity ℏωn /τ is slowly-
varying with n. In more detail, our assumption is ℏπcLτ ≪ 1. The quantity τ can
ℏc
1 ∞
Z
ℏωn
U ≈2 4πn2 dn ℏω /τ . (6.20)
8 0 e n −1
πc
Let us change variables to the frequency ω = L n. We find
∞
ω3
Z
U ℏ
= dω , (6.21)
V 0 π 2 c3 eℏω/τ − 1
ℏ ω3
uω = (6.22)
π 2 c3 eℏω/τ − 1
has the interpretation as the energy density of electromagnetic radiation per unit
frequency range. This is the Planck radiation law.
[End of lecture 7(2024)]
In the small ω limit, the Planck law becomes
ω2τ
uω ≈ ℏω ≪ τ (6.23)
π 2 c3
This is the Rayleigh-Jeans law. The ℏ’s have cancelled and the resulting formula
can be understood purely classically. A problem with the Rayleigh-Jeans law is
that if we extrapolate it up to high energies, the spectral density continues to grow,
and we find that the energy density is divergent. This is the so-called “ultraviolet
catastrophe” that was fixed by the advent of quantum mechanics. The catastrophe
is avoided in the Planck distribution because high frequency modes are unoccupied
due to the fact that they have a minimum energy ℏω (the energy of a single photon
in the mode). This leads to exponential suppression by the Boltzmann factor.
Let us finally integrate (6.21) to get the total energy density
ℏ τ 4 ∞ x3
Z
U
= 2 3 dx x (6.24)
V π c ℏ 0 e −1
54
Ph 12c 2024 6 Blackbody radiation and the Planck distribution
The integral is
∞ ∞
x3
Z Z
dx x = dxx3 (e−x + e−2x + e−3x + . . . )
0 e −1 0
1 1
= Γ(4) 1 + 4 + 4 + . . .
2 3
= 6ζ(4)
π4
=6 , (6.25)
90
where ζ(s) is the Riemann-ζ function.16 Altogether, we have
U ℏ τ 4 π 4
= 2 3 6
V π c ℏ 90
π2
= τ4
15c3 ℏ3
= ατ 4 . (6.26)
The fact that the energy density of radiation is proportional to τ 4 is the Stefan-
Boltzmann law.
55
Ph 12c 2024 6 Blackbody radiation and the Planck distribution
Thus, the flux of energy per unit area per unit time is
Vescape U c π2τ 4
JU = = ατ 4 = = σB T 4
A∆t V 4 60ℏ3 c3
π 2 kB
4
σB = . (6.29)
60ℏ3 c2
Definition (emissivity). An object has emissivity e ∈ [0, 1] if the energy flux from
the object is e times that of a black body.
Realistic objects are not black and have some absorptivity/emissivity less than
1. The above argument about radiation flux from a black body generalizes to show
that absorptivity and emissivity are always equal, a = e. Consider covering the hole
in a cavity with an object with absorptivity a at temperature τ . The object absorbs
an energy flux aJUblack . In order for thermal equilibrium to be maintained, it must
emit the same energy flux aJUblack , implying a = e. This is called Kirchhoff’s law.
The law holds as a function of frequency as well, a(ω) = e(ω).
56
Ph 12c 2024 6 Blackbody radiation and the Planck distribution
The integration constant vanishes because when τ = 0, all modes are in their ground
57
Ph 12c 2024 6 Blackbody radiation and the Planck distribution
state, and we see explicitly from (6.17) that Z(τ = 0) = 1. The free energy is
αV τ 4
F = −τ log Z = − . (6.31)
3
The entropy is
4αV τ 3
∂F
σ=− = . (6.32)
∂τ V 3
Together with the energy density VU , this quantity plays an important role in cosmol-
ogy because it controls the way that radiation affects the evolution of the geometry
of the universe.
The factor of 1/3 in (6.33) reflects an interesting property of electromagnetic
radiation: conformal symmetry. A conformally-symmetric system has a traceless
stress-energy tensor. The stress-energy tensor of a stationary perfect fluid is given
by
−ρ 0 0 0
0 p 0 0
Tµ ν = 0 0 p 0 ,
(6.34)
0 0 0 p
58
Ph 12c 2024 7 Debye theory
7 Debye theory
A solid has vibrational modes. A quantum of energy of a vibrational mode of a
solid is called a phonon. In this section, we study the thermodynamics of a solid
by modeling it as a gas of phonons. Phonons have some similarities to photons,
so certain parts of the analysis of phonons will parallel our analysis of a photons.
However, there are also some important differences.
One similarity is that in the limit of an infinitely large system, there can be
phonons with very low energy. In the case of the electromagnetic field, this followed
from the dispersion relation ω = c|k|, where k = Lπ n is the wavevector. In the limit
that L is very large, there are wavevectors k with very small magnitude (i.e. modes
with very long wavelength), and the energy E = ℏω|k| of quanta in those modes is
small.
Phonons also satisfy a dispersion relation of the same form ω = v|k|, but this
time v is the speed of sound in the solid. In particular, note that we have arbitrarily
low-energy phonons in an arbitrarily large solid L → ∞, since then the wave number
|k| can get arbitrarily tiny in that case. This is a consequence of translational
symmetry. The argument is as follows: As the wavelength of a phonon gets longer,
it looks locally more and more like a translation of the lattice. Translation doesn’t
cost any energy because it is a symmetry. Thus, the energy of a phonon should
approach zero as k → 0. An excitation whose long-wavelength limit is controlled by
a symmetry is called a goldstone boson, and a phonon is an example of a goldstone
boson for translational symmetry.
[End of lecture 8(2024)]
An important difference between photons and phonons is as follows. There are
an infinite number of electromagnetic modes in any fixed volume. However, the
number of phonon modes in a solid is bounded. The reason is that a solid is made
up of N atoms. So the number of position coordinates of the atoms is 3N and
the number of momentum coordinates is 3N . The normal modes are obtained by
writing the Hamiltonian
X p2 1 X
i
Hsolid = + m Ωij xi xj (7.1)
2m 2
i i,j
and diagonalizing the matrix Ωij . The dimension of the space of modes is unchanged
by the diagonalization, and hence the total number of modes is still 3N .
Example 7.1 (1d lattice). Another way to say this is that wavelengths smaller
than the lattice spacing are redundant with other wavelengths. For simplicity, con-
sider a 1-dimensional lattice with lattice spacing a. A mode with number n is
59
Ph 12c 2024 7 Debye theory
Following Debye, let us suppose that n has a uniform cutoff nD and determine nD
by solving the equation
3 nD
Z
π
4πn2 dn = n3D = 3N
8 0 2
6N 1/3
nD = . (7.4)
π
The calculation of the energy is the same as for photons. Now we have ω = v|k|,
where v is the speed of sound. The speed of sound can actually be different for
transverse and longditudinal polarizations in general, but we will assume it is the
same for simplicity. As before k = πn/L is a wavevector. The energy is
X ℏω
U=
modes
eℏω/τ −1
Z nD
3 ℏωn
= 4π dn n2 ℏω /τ . (7.5)
8 0 e n −1
We have ω = πv L n, where v is the velocity of sound (which we assume is the same
for all polarizations, for simplicity). Thus, the energy can be written
3π 2 ℏv τ L 4 xD x3
Z
= dx x , (7.6)
2L πℏv 0 e −1
where
θ
xD = ,
τ
1/3
6π 2 N
θ = ℏv (7.7)
V
60
Ph 12c 2024 7 Debye theory
is the Debye temperature. The integral can be done analytically at low tempera-
tures, where xD → ∞. The result is
3π 4 N τ 4
U≈ . (7.8)
5θ3
The heat capacity is
12π 4 N τ 3
∂U
CV = = , (7.9)
∂τ V 5 θ
which matches experiment pretty well. This is called the Debye τ 3 law.
61
Ph 12c 2024 8 Classical Statistical Mechanics
−∞
[b
x, pb] = iℏ. (8.3)
17
As a sanity check on this formula, let us expand the left-hand side
b+ 1 A
eA+B = 1 + A b+B b2 + AbBb+B bAb+B b2 + . . .
b b
2
=1+A b+B b + 1A b2 + AbBb + 1Bb 2 − 1 [A,
b B]b + ...
2 2 2
1 b2 + . . . 1 b2 + . . . 1 b b
= 1+A b+ A 1+Bb+ B 1 − [A, B] + . . . (8.5)
2 2 2
The claim is that if we include higher-order terms, we can always group the discrepancy between
eA+B and eA eB into nested commutators.
b b b b
62
Ph 12c 2024 8 Classical Statistical Mechanics
63
Ph 12c 2024 8 Classical Statistical Mechanics
Thus, a little bit of quantum mechanics is still necessary to make sense of a sum
over states. However, once we know the density of states in phase space, we can
compute various quantities in a classical approximation.
Example 8.1 (Classical ideal gas). As an application, let us recover our formula
for the partition function of the ideal gas in the classical regime. In this case, we
have V (x) = 0, so the single-particle partition function is
Z
1 3 3 −β 2m ⃗2
p
Z1 = d ⃗
x d p
⃗ e
(2πℏ)3
Z ∞ 3
V p2
−β 2m
= dp e
(2πℏ)3 −∞
2πm 3/2
V mτ 3/2
= = V . (8.12)
(2πℏ)3 β 2πℏ2
Once we have the single-particle partition function, we can compute the full partition
function via
1 N
ZN = Z , (8.13)
N! 1
and this agrees with our previous analysis.
A nice property of the classical phase space integral is that it is simple and
flexible: we could have a box with a complicated shape, and we don’t have to worry
about finding the quantum wavefunctions inside this shape and figuring out the
number of states at each energy level. The contribution from the d3 x integral is
always V . This means that we have actually improved our previous analysis where
we had to assume a cubical box. We now see that the thermodynamics of the gas
doesn’t depend on the shape of the box, in the classical regime.
Example 8.2 (Density of eigenvalues of the Laplacian). We can turn this
observation around and deduce an interesting mathematical result. Consider a box
with an arbitrary shape. The wavefunctions for energy eigenstates inside the con-
tainer are solutions of the time-independent Schrodinger equation
⃗2
ℏ2 ∇
− ψ(x) = Eψ(x) (8.14)
2m
with the boundary condition ψ(x)|boundary = 0 (called a Dirichlet boundary condi-
tion). Thus, the energies are given by Ei = ℏ2 λi /2m, where λi are the eigenvalues
of the Laplacian −∇ ⃗ 2 with Dirichlet boundary conditions. For a general region,
⃗
the spectrum of −∇2 can be quite complicated, see e.g. https://en.wikipedia.
org/wiki/Hearing_the_shape_of_a_drum. However, our result (8.12) encodes a
universal answer for the asymptotic density of large eigenvalues.
64
Ph 12c 2024 8 Classical Statistical Mechanics
To see this, let us write the single-particle partition function as an integral over
eigenvalues of the Laplacian
Z
−ℏ2 λi /(2mτ ) 2
X X
−Ei /τ
Z1 = e = e = dλ ρ(λ)e−ℏ λ/(2mτ ) . (8.15)
i i
P
In the last line, we introduced the density of eigenvalues ρ(λ) = i δ(λ − λi ). In the
classical limit ℏ → 0, the exponential decays slowly and the integral is dominated
by large λ. Let us suppose that ρ(λ) behaves like a power law at large λ,18
V 1/2
ρ(λ) ∼ λ (large λ). (8.18)
4π 2
This is Weyl’s law https://en.wikipedia.org/wiki/Weyl_law in 3-dimensions.
Can you derive the d-dimensional version?
65
Ph 12c 2024 8 Classical Statistical Mechanics
• 3 momenta,
• 2 rotation angles,
66
Ph 12c 2024 8 Classical Statistical Mechanics
Thus, the heat capacity of a diatomic gas as a function of temperature looks like
figure 3.
Figure 3: Heat capacity per molecule C b = C/N for a diatomic gas, as a function of
temperature. At low temperatures, only translational degrees of freedom are in the
classical regime and C b = 3 . At intermediate temperatures, both translations and
2
rotations are classical and Cb = 3 + 1 = 5 , and at high temperatures, translations,
2 2
b = 3 + 1 + 1 = 7 . Figure
rotations, and vibrations are in the classical regime and C 2 2
from Wikipedia https://en.wikipedia.org/wiki/Molar_heat_capacity.
67
Ph 12c 2024 9 Chemical potential and the Gibbs Distribution
• Conservation of energy
U = U1 + U2 . (9.2)
From these ingredients, we derived the Boltzmann distribution, defined the partition
function, and computed lots of interesting things.
Analogous reasoning applies whenever there is a conserved quantity that the
entropy depends on. Such a conserved quantity gives rise to a generalization of the
Boltzmann distribution, a generalization of the partition function, and lots more
interesting results. In this section, we will develop these methods for an important
(approximately) conserved quantity: particle number.
68
Ph 12c 2024 9 Chemical potential and the Gibbs Distribution
Consider two systems S1 , S2 in thermal and diffusive contact. Let the number
of particles in system Si be Ni , and let the energy of Si be Ui . Let the total number
of particles be N and the total energy be U . The entropy of each system depends
on the number of particles and the energy of that system:
σ1 (U1 , N1 ) and σ2 (U2 , N2 ). (9.3)
By conservation of energy and particle number, the total entropy is
σ = σ1 (U1 , N1 ) + σ2 (U2 , N2 )
= σ1 (U1 , N1 ) + σ2 (U − U1 , N − N1 ) (9.4)
The systems will reach thermal and diffusive equilibrium in the most probable
configuration, where (9.4) is maximized. This leads to the conditions
(∂σ_1/∂U_1)_N = (∂σ_2/∂U_2)_N,
(∂σ_1/∂N_1)_U = (∂σ_2/∂N_2)_U.   (9.5)
The first condition is simply τ_1 = τ_2. To write the second condition, let us introduce the chemical potential of species j,

µ_j ≡ −τ (∂σ/∂N_j)_U.

(Here, the partial derivative is taken while holding all particle numbers except N_j fixed.) The condition for diffusive equilibrium becomes
(µj )1 = (µj )2 for all species j. (9.9)
dF = d(U − τ σ) = dU − τ dσ − σdτ
= τ dσ + µdN − τ dσ − σdτ
= µdN − σdτ. (9.11)
These expressions are useful because F is simple to compute, via its relationship to
Z.
Example 9.1 (Ideal gas). We previously computed the free energy of the ideal
gas
F = −τ log((n_Q V)^N / N!)
  = −τ (N log(n_Q V) − N log N + N).   (9.14)

Differentiating with respect to N gives the chemical potential µ = (∂F/∂N)_{τ,V} = τ log(n/n_Q), where n = N/V is the concentration. Typical gases are dilute relative to the quantum concentration, n/n_Q ≪ 1, so that µ is negative.
Example 9.2 (Voltage barrier). Consider two systems S1 , S2 with chemical po-
tentials µ1 , µ2 . Using (9.12), we have
∂F/∂N_1 = µ_1 − µ_2.   (9.16)
That is, the derivative of the total free energy F = F1 + F2 with respect to N1 is
the difference of chemical potentials.
Suppose also that the particles are charged with charge q, and imagine turning
on a voltage difference between the two systems so that particles in system Si have
potential energy qVi .20 The potential energy will give a new contribution to the
total energy and hence the free energy:
F ′ = F + N1 qV1 + N2 qV2 (9.17)
We now have

∂F′/∂N_1 = µ_1 − µ_2 + q(V_1 − V_2).   (9.18)
Thus, the new difference of chemical potentials after turning on the voltages is
µ1 − µ2 + q(V1 − V2 ). We can achieve diffusive equilibrium by tuning q(V1 − V2 ) =
−(µ1 − µ2 ).
Thus, another definition of chemical potential is the electrical potential difference
required to maintain diffusive equilibrium. This gives a way of measuring chemical
potential for charged particles.
In general, when an external potential is present, the total chemical potential of
a system is
µ = µext + µint , (9.19)
where µext is the potential energy per particle in the external potential, and the
internal chemical potential µint is the chemical potential that would be present if
the external potential were zero.
[End of lecture 10(2024)]
Example 9.3 (Isothermal model of the atmosphere). We can model the at-
mosphere as an ideal gas at temperature τ that is in thermal and diffusive equilib-
rium. The isothermal/equilibrium assumptions are actually not true, but this is a
reasonable model. The total chemical potential at height h is
µ = µint + µext
= τ log(n(h)/nQ ) + M gh, (9.20)
where M is the mass of a gas particle, h is the height, and we have allowed the
concentration n(h) to depend on height. In equilibrium, this must be constant, so
we have
τ log(n(h)/nQ ) + M gh = τ log(n(0)/nQ )
n(h) = n(0)e−M gh/τ . (9.21)
20 Don’t confuse V_i with volume in this example.
An alternative fun way to derive this result (that you’ll do on the homework) is
to demand that the pressure of a slab of gas be enough to hold up the slab above it.
Here’s a third slick way that I really like. Consider a single molecule of gas as a
subsystem and the whole atmosphere as a reservoir at temperature τ . The density of
orbitals at height h is independent of h. Thus, the probability that the molecule will
be at height h is proportional to a Boltzmann factor P (h) ∝ e−E(h)/τ = e−M gh/τ .
The concentration n(h) is proportional to this probability.
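Numerically, (9.21) says the atmosphere thins by a factor of e over a scale height h_0 = τ/(Mg). A rough estimate for N₂ at a typical atmospheric temperature (illustrative numbers, not from the notes):

    # Scale height of an isothermal N2 atmosphere, n(h) = n(0) exp(-h/h0)
    kB = 1.381e-23        # J/K
    g = 9.8               # m/s^2
    m_N2 = 28 * 1.66e-27  # kg
    T = 290.0             # K
    h0 = kB * T / (m_N2 * g)
    print(h0 / 1e3)  # ~8.8 km, roughly the observed scale height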
Let us suppose our system is in a state with energy E_s and particle number N_s. The reservoir has energy U_0 − E_s and particle number N_0 − N_s. The probability of the state is proportional to the number of accessible reservoir states; expanding the reservoir entropy to first order in E_s and N_s gives

P(s) ∝ e^(σ_R(U_0 − E_s, N_0 − N_s)) ∝ e^((µN_s − E_s)/τ).

To compute the constant of proportionality, we can impose that the sum of probabilities is equal to 1. We find
P(s) = e^((µN_s − E_s)/τ) / Z(τ, µ),   (9.24)

Z(τ, µ) = Σ_s e^((N_s µ − E_s)/τ) = Σ_{N=0}^∞ Σ_{s(N)} e^((Nµ − E_{s(N)})/τ),   (9.25)
10 Ideal Gas
10.1 Spin and statistics
10.1.1 Bosons vs. Fermions
Quantum mechanical particles are indistinguishable. This means that probabili-
ties and expectation values of operators are invariant under exchanging the labels
associated to a pair of particles. Consider a two-particle state
|x1 , x2 ⟩. (10.1)
Here, the particles are labeled by their positions. If they also had spin or other inter-
nal quantum numbers, then they would carry those labels as well. Under relabeling
x_1 ↔ x_2, observables must be unchanged. This means that we must have

|x_2, x_1⟩ = e^(iθ) |x_1, x_2⟩   (10.2)

for some phase e^(iθ).
Note that we don’t simply have 1 on the right-hand side because we are free to
rotate amplitudes by a phase without changing expectation values and probabilities.
However, exchanging the labels twice should bring us back to the same state. Thus, the phase should be a plus or minus sign:

|x_2, x_1⟩ = ±|x_1, x_2⟩.   (10.3)
If the particles get a ± sign under exchange according to (10.3), then the wavefunction satisfies ψ(x_2, x_1) = ±ψ(x_1, x_2): symmetric for bosons and antisymmetric for fermions.
Given single-particle states a and b, we can build two-particle states from the symmetrized or antisymmetrized combinations |a, b⟩ ± |b, a⟩, and these will automatically have the correct property under exchange (10.5). However, consider the case where a = b: the antisymmetric (fermionic) combination vanishes identically, |a, a⟩ − |a, a⟩ = 0. Two fermions can never occupy the same single-particle state; this is the Pauli exclusion principle. A rotation by 2π is implemented on quantum states by the operator

R(2π) = e^(−2πi n·J),
where n is a unit vector for the axis of rotation, and J = (Jx , Jy , Jz ) are angular
momentum generators. By studying the commutation relations of the J’s, one can
show that the possible eigenvalues for any of the J's are

m = −j, −j + 1, . . . , j − 1, j,   (10.9)

where j is a non-negative integer or half-integer.
The generators J get contributions from both the orbital angular momentum
and the intrinsic spin of a particle. In particular, if a spin-s particle is in a position
eigenstate at the origin, then R(2π) acts on it by ±1, depending on whether the
spin is integer or half-integer.
The fact that R(2π) is not necessarily equal to 1 is surprising from the classical
point of view. Classically, a 2π rotation brings a system back to itself. However, this
is not necessarily true for quantum states. Only probabilities are preserved under
a 2π rotation, and this leaves room for the appearance of a phase. You might ask:
why can the phase only be ±1, and not something more general? The answer is
related to the topology of the rotation group.
• They keep track of the history of rotations applied to each particle. To model this, we can thicken each line into a “belt.”
21 Topologically, it is equivalent to S³/Z_2.
f(E) = ⟨N_orbital⟩ = λ e^(−E/τ) / Z = 1/(e^((E−µ)/τ) + 1).   (10.13)
This is the Fermi-Dirac distribution. It is equal to 1 when E ≪ µ, 0 when E ≫ µ,
and 1/2 when E = µ. At low temperatures, it approaches a step function at
E = µ(τ = 0) = EF (the Fermi energy).
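Here is a small numerical sketch (not from the notes) of (10.13), showing the step function emerge as τ → 0:

    import numpy as np

    def f_FD(E, mu, tau):
        # Fermi-Dirac occupancy, equation (10.13)
        return 1.0 / (np.exp((E - mu) / tau) + 1.0)

    mu = 1.0
    E = np.array([0.5, 0.9, 1.0, 1.1, 1.5])
    for tau in [0.5, 0.1, 0.01]:
        print(tau, f_FD(E, mu, tau).round(3))
    # As tau -> 0, the occupancy approaches 1 for E < mu, 1/2 at E = mu,
    # and 0 for E > mu: a step at the Fermi energy.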
f(E) = λ (∂/∂λ) log Z_orbital = λ e^(−E/τ) / (1 − λ e^(−E/τ))
     = 1/(e^((E−µ)/τ) − 1).   (10.15)
This is the Bose-Einstein distribution. The only difference with the Fermi-Dirac
distribution is the minus sign in the denominator, but that difference is dramatic.
Note that the occupancy blows up when E ≈ µ — bosons love to live together in
low energy orbitals.
In the classical regime, the occupancy of every orbital is small, which requires

e^((E−µ)/τ) ≫ 1,   (10.16)
so that

N = λ n_Q V  =⇒  λ = n/n_Q  =⇒  µ = τ log(n/n_Q),   (10.19)

where n = N/V is the concentration. This recovers our result from before.
From the chemical potential µ = (∂F/∂N)_{τ,V}, we can recover the free energy

F = ∫_0^N dN′ µ(N′)
  = ∫_0^N dN′ τ (log N′ + log(1/(n_Q V)))
  = τ ((N log N − N) + N log(1/(n_Q V)))
  = τ N (log(n/n_Q) − 1).   (10.20)
The pressure is

p = −(∂F/∂V)_{τ,N} = Nτ/V,   (10.21)
22 A way where the errors are more easily quantified.
11 Fermi Gases
In the previous section, we considered gases in the classical regime n ≪ nQ . The
behavior of gases in the quantum regime n ≫ nQ is rich and interesting, and in
particular depends in a crucial way on whether the gas is composed of fermions or
bosons. In this section, we will study a Fermi gas (gas of fermions). Some examples
of Fermi gases are
• Liquid 3 He,
• Nuclear matter.
Consider spin-1/2 fermions of mass m in a cubical box of side L. The single-particle orbitals are labeled by a mode number n ∈ Z³_{≥0} and a spin s = ±1/2, with energies

E_n = (ℏ²/2m)(πn/L)².   (11.1)
Each orbital can be occupied or unoccupied, so the partition function of a single
orbital is
Z_n = 1 + e^(β(µ−E_n)),   (11.2)
where β = 1/τ .
The Gibbs sum of the entire system is a product of the Gibbs sums for each
orbital
Z = ∏_{n∈Z³_{≥0}, s=±1/2} Z_n = ∏_{n∈Z³_{≥0}} Z_n².   (11.3)
Note that we have a square because of the possibility of having two different spins.
⟨N⟩ = Σ_s N_s e^(β(µN_s − E_s)) / Z
    = (1/β) (∂/∂µ) log Z
    = (1/β) (∂/∂µ) Σ_n 2 log(1 + e^(β(µ−E_n)))
    = 2 Σ_n e^(β(µ−E_n)) / (1 + e^(β(µ−E_n))) = 2 Σ_n f_FD(E_n, τ, µ),   (11.4)
where we have abbreviated Σ_n = Σ_{n∈Z³_{≥0}}. This is an intuitive formula: the Fermi-
Dirac distribution gives the expected occupancy of each mode number, so the ex-
pected value of N is the sum of the expected occupancies, times 2 to account for
spin. We abbreviate fFD (E, τ, µ) as f (E, τ, µ) or sometimes f (E), throughout this
section.
Let us make the “big box” assumption that the thermal wavelength is much smaller than the size of the box. The sum over modes Σ_n can then be approximated as a phase space integral. For any function f, we have
Σ_n f(E_n) ≈ ∫ (d³x d³p / (2πℏ)³) f(p²/2m)
          = (4πV/(2πℏ)³) ∫_0^∞ dp p² f(p²/2m)
          = (4πV/(2πℏ)³) ∫_0^∞ d(√(2mE)) (2mE) f(E)
          = (V/4π²) (2m/ℏ²)^(3/2) ∫_0^∞ dE E^(1/2) f(E)
          = ∫_0^∞ dE D_0(E) f(E),   (11.5)

where

D_0(E) = (V/4π²) (2m/ℏ²)^(3/2) E^(1/2).   (11.6)
You may recognize this as just Weyl's law (8.18). Remember that the density of eigenvalues of the Laplacian −∇² is given by

ρ(λ) dλ = (V/(4π²)) λ^(1/2) dλ.   (11.7)

To get D_0(E) dE, we can just plug in λ = 2mE/ℏ².
where we have used that D(E) ∝ E^(1/2). This gives a connection between the Fermi energy and the concentration.
The total energy at zero temperature is

U = ∫_0^(E_F) dE E D(E) = (2/5) E_F² D(E_F) = (3/5) E_F N,   (11.14)
where we have used that ED(E) ∝ E 3/2 . This is a striking result. The energy
at zero temperature is enormous when N is large or the concentration n is large,
coming from the fact that the Pauli exclusion principle forces orbitals with high
energy to be occupied.
Because E_F ∝ n^(2/3) ∼ V^(−2/3), the energy increases with decreasing volume. This leads to an outward pressure sometimes called degeneracy pressure. The pressure at
zero temperature is given by
∂F ∂(U − τ σ) ∂U
p=− =− =−
∂V τ,N τ =0 ∂V τ,N τ =0 ∂V τ,N
2 U
=− −
3 V
2
= EF n. (11.15)
5
The degeneracy pressure is cancelled in metals by Coulomb attraction between electrons and ions, and in white dwarf stars by gravitational attraction.
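To get a sense of the scale: for conduction electrons in copper (standard textbook density, assumed here for illustration), the Fermi energy and degeneracy pressure come out to

    import numpy as np

    hbar = 1.055e-34  # J s
    m_e = 9.109e-31   # kg
    eV = 1.602e-19    # J

    n = 8.5e28  # conduction electrons per m^3 in copper
    E_F = hbar**2 / (2 * m_e) * (3 * np.pi**2 * n) ** (2 / 3)
    p = 0.4 * E_F * n        # degeneracy pressure, equation (11.15)
    print(E_F / eV)          # ~7 eV
    print(p / 1e9, "GPa")    # ~38 GPa

so the electron gas pushes outward with a pressure of order 10⁵ atmospheres, balanced by the Coulomb attraction of the lattice.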
∂f/∂µ = (1/τ) e^((E−µ)/τ) / (e^((E−µ)/τ) + 1)² = (1/τ) · 1/(4 cosh²((E−µ)/2τ)),
Note that µ just shifts the Fermi-Dirac distribution over, so ∂f/∂µ looks like the derivative of a step function, which is a delta function δ(E − µ). Meanwhile, ∂f/∂τ looks like two nearby bumps: one negative just below µ and one positive just above µ. This should remind you of the derivative of a delta function δ′(E − µ).23
Let us turn this into a quantitative approximation. Let us guess
∂f/∂µ ≈ A δ(E − µ)   (τ ≪ µ, E),
∂f/∂τ ≈ B δ′(E − µ)   (τ ≪ µ, E).   (11.21)
We can determine the coefficients A, B by integrating both sides against 1 and
E − µ. Note that we can integrate from −∞ to ∞ instead of from 0 to ∞ because
23 The Dirac delta function is defined by

∫_{−∞}^∞ dx δ(x − a) f(x) = f(a).   (11.18)

Technically speaking, the delta function is not really a function, but a distribution — something that can be integrated against a function to get a number. Derivatives of delta functions are similar: to define them we must explain how to integrate them against a function. Specifically, they are defined by integrating by parts:

∫_{−∞}^∞ dx δ′(x − a) f(x) = −∫_{−∞}^∞ dx δ(x − a) f′(x) = −f′(a).   (11.19)

Similarly, we can compute an integral against the n-th derivative δ^(n)(x − a) by integrating by parts n times:

∫_{−∞}^∞ dx δ^(n)(x − a) f(x) = (−1)^n f^(n)(a).   (11.20)
the functions are very small when E ≪ µ, and we will assume that µ is large. For
example,
∫_{−∞}^∞ dE ∂f/∂µ = ∫_{−∞}^∞ dE (−∂f/∂E) = f(−∞) − f(∞) = 1.   (11.22)

(In this case, we didn't actually have to do an integral.) Comparing to ∫ dE δ(E − µ) = 1, we conclude

∂f/∂µ ≈ δ(E − µ)   (τ ≪ µ, E).   (11.23)
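The coefficient B can be fixed the same way, by integrating against E − µ; the answer is B = −π²τ/3, which is the coefficient used in (11.31) below. A quick numerical check of this integral (my sketch):

    import numpy as np
    from scipy.integrate import quad

    mu, tau = 50.0, 1.0  # tau << mu

    def df_dtau(E):
        x = (E - mu) / tau
        return (E - mu) / tau**2 * np.exp(x) / (np.exp(x) + 1.0) ** 2

    val = quad(lambda E: (E - mu) * df_dtau(E), mu - 40 * tau, mu + 40 * tau)[0]
    print(val, np.pi**2 * tau / 3)  # both ~3.2899
    # Since the integral of delta'(E - mu) against (E - mu) is -1, this
    # confirms df/dtau ~ -(pi^2 tau / 3) delta'(E - mu).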
∂µ/∂τ = −(π²τ/3) D′(E_F)/D(E_F) = −π²τ/(6E_F),

µ(τ) = E_F − π²τ²/(12E_F)   (τ ≪ E_F).   (11.30)
Let us now study the energy at small τ. We have

C_V = ∂U/∂τ = ∫_0^∞ dE E D(E) (∂f/∂τ + (∂µ/∂τ)(∂f/∂µ))
    = ∫_0^∞ dE E D(E) (−(π²τ/3) δ′(E − E_F) + (∂µ/∂τ) δ(E − E_F))
    = (π²τ/3)(D(E_F) + E_F D′(E_F)) + (∂µ/∂τ) E_F D(E_F)
    = (π²τ/3) D(E_F) = (π²/2) Nτ/E_F,   (11.31)
where we cancelled the term involving D′ (EF ) using the first line of (11.30).
Note that the heat capacity is proportional to τ : it is small at small temperature,
which is another striking prediction. (Compare to an ideal gas in the classical regime
whose heat capacity is independent of τ .) The physical interpretation is that when
we increase τ , particles in orbitals with energy EF − τ get lifted to have energy
EF + τ . The change in energy of each particle is proportional to τ , but the number
of particles whose orbitals are changing is also proportional to τ . Thus, the total
energy changes quadratically with τ , and the heat capacity is linear in τ .
Example 11.1 (Metals). The heat capacity of many metals can be written as a
sum of an electronic contribution from a free electron gas and a contribution from
phonons. The former goes like τ , and the latter goes like τ 3 according to Debye
theory. Overall, we have
C_V^metal = γτ + Aτ³,   (11.32)

which fits the data extremely well. In more detail, we expect the form

C_V^metal = (π²N/(2E_F)) τ + (12π⁴N/(5θ³)) τ³.   (11.33)
At low temperatures, the free electron term will be dominant. The crossover tem-
perature occurs at
τ = (5θ³/(24π²E_F))^(1/2),   (11.34)
which in metals is about 1 K.
In practice, the heat capacity in metals is linear at low temperatures. However,
the slope γ does not precisely agree with the prediction (11.33). For example, in potassium, the observed slope γ satisfies γ/γ_0 ≈ 1.23, where γ_0 = π²N/(2E_F) is the ideal gas prediction. The reason is that electrons in a metal are not really free — they experience Coulomb interactions with each other and with the lattice of ions.
In fact, it is initially surprising that the linear slope prediction works at all. The
reason is that the important excitations in a metal aren’t really individual electrons,
but “quasiparticles” — an electron together with accompanying deformations and
vibrations of the lattice of ions (called “dressing”). The dressing changes the way
electrons interact with the lattice and with each other. Firstly, it tends to screen
Coulomb interactions, so that the dressed electrons interact more weakly with each
other and the lattice than undressed electrons. Secondly, it changes the effective
mass of the excitations m → m∗ , so that the Fermi energy of dressed electrons EF ∗
is different.
This is an example of the phenomenon of “renormalization.” We are given a
microscopic theory of particles with some masses and interactions. Interactions
dress the particles, so that the quasiparticles relevant for long-distance dynamics
have different properties from their microscopic constituents. Thus, effective masses
and interactions change with distance scale. This phenomenon is ubiquitous in
particle physics — in fact, many of the so-called “fundamental” particles might be
quasiparticles of some underlying more fundamental theory. Even the fine-structure
constant α, which controls the strength of electromagnetic interactions, changes
with distance scale.
Example 11.2 (White dwarf stars). White dwarf stars have masses approx-
imately equal to the mass of the sun, but radii 100 times smaller. The density
of the sun is approximately 1 g/cm³. The density of a white dwarf star is 10⁴ to 10⁷ g/cm³. All the atoms in a white dwarf star are ionized and the electrons form an approximately free electron gas.
Typical interatomic spacings are 1 Bohr radius, or 10⁻⁸ cm. When the atoms are 100 times closer together, their typical spacing is 10⁻¹⁰ cm, so the concentration of electrons is approximately n = 10³⁰ electrons/cm³. The corresponding Fermi energy is^25

E_F = (ℏ²/2m)(3π²n)^(2/3) ≈ 3.6 × 10⁵ eV ≈ 0.36 MeV.   (11.35)
The Fermi temperature is 10⁹ K. The actual temperature inside a white dwarf is expected to be on the order of 10⁷ K, thus we are in the same highly quantum regime that we just analyzed.
One important point is that the Fermi energy is comparable to the rest mass of
an electron, which is me c2 = 0.5 MeV. Thus, relativistic effects are nontrivial, but
don’t completely invalidate our analysis.
What determines the size of a white dwarf star? Very roughly, the energy due
to gravity is

U_grav = −GM²/r.   (11.36)

The kinetic energy of the Fermi gas is

U_fermi = (3/5) E_F N ≈ (ℏ²/m_e)(N/r³)^(2/3) N = ℏ²N^(5/3)/(m_e r²).   (11.37)
We have N = M/mp , where mp is the proton mass. Minimizing the energy, we find
0 = (∂/∂r)(U_grav + U_fermi) = GM²/r² − ℏ²M^(5/3)/(m_e m_p^(5/3) r³),   (11.38)

which gives

r ≈ ℏ²/(G m_e m_p^(5/3) M^(1/3)).   (11.39)
So more massive white dwarf stars are actually smaller. Plugging in a solar mass M⊙ ≈ 10³³ g, we find r ∼ 10⁹ cm, about 100 times smaller than the radius of the sun, in agreement with observations.
For relativistic electrons, E ≈ ℏcπn/L, and the number of particles is

N = 2 · (1/8) ∫_0^(n_F) 4πn² dn
  = π (L/(ℏcπ))³ ∫_0^(E_F) dE E² = (π/(3(ℏcπ)³)) V E_F³.   (11.41)
The corresponding kinetic energy is U_fermi ∼ E_F N ∼ ℏc N^(4/3)/r, which depends on 1/r in the same way that gravitational potential energy does. Thus, if the relativistic dwarf star is heavy enough, gravitational potential energy will dominate kinetic energy and the star will collapse. The condition for collapse is
G M² ≳ ℏc N^(4/3), i.e. N ≳ (ℏc/(G m_p²))^(3/2) ∼ 10⁵⁷, corresponding to a mass of order the solar mass. In the collapse, electrons and protons are squeezed together and convert to neutrons via

p + e → n + ν_e,   (11.44)
The core of a newly-forming neutron star blasts away a shell of other collapsing matter,
creating a supernova. Most of the energy is released in the form of neutrinos.26
The Schwarzschild radius of a black hole is

r_S = 2GM/c²,   (11.45)
which for M = M⊙ is about 3 kilometers. Thus, a solar-mass neutron star is close to
being a black hole. Indeed, if the neutron star has mass about 3M⊙ , it will collapse
into a black hole.
[End of lecture 13(2024)]
26 The Wikipedia article on neutron stars (https://en.wikipedia.org/wiki/Neutron_star) is pretty mindblowing. One sentence I like is: “A neutron star is so dense that one teaspoon (5 milliliters) of its material would have a mass over 5.5 × 10¹² kg, about 900 times the mass of the Great Pyramid of Giza.”
12 Bose Gases and Bose-Einstein Condensation
∆E = (ℏ²/2m)(π/L)² (n²_excited − n²_ground)
   = (ℏ²/2m)(π/L)² (4 + 1 + 1 − (1 + 1 + 1))
   = 3 (ℏ²/2m)(π/L)².   (12.1)
The mass of ⁴He is 4 GeV/c² (4 times the mass of a nucleon). For a box of size L = 1 cm, we have ∆E ≈ 2.5 × 10⁻³⁷ J, corresponding to a temperature ∆E/k_B ∼ 10⁻¹⁴ K.
where in this section f stands for the Bose-Einstein distribution. The occupancy of
the ground orbital E = 0 is

f(0, µ, τ) = 1/(e^(−µ/τ) − 1).   (12.4)
Let us begin by calculating the chemical potential at very small τ ≪ ∆E. As
we emphasized above, τ ≪ ∆E is not necessary for Bose-Einstein condensation.
However, it will help us understand how small µ is. When τ ≪ ∆E, all N particles
are in the ground orbital, and we have

N ≈ f(0, µ, τ) = 1/(e^(−µ/τ) − 1)   (τ ≪ ∆E),

µ = −τ log(1 + 1/N) ≈ −τ/N   (τ ≪ ∆E, 1 ≪ N).   (12.5)
For example, if we have 10²² particles, then the chemical potential when τ ≪ ∆E is −τ/10²².
More generally, let us split the particle number into the number N_0(τ) of particles in the ground orbital and the number N_e(τ) in excited orbitals:

N = N_0(τ) + N_e(τ),   (12.6)
where

E_n = (ℏ²/2m)(πn/L)² − E_ground.   (12.9)

Here, we have shifted the energy by E_ground = 3(ℏ²/2m)(π/L)² so that the ground orbital has energy 0. For each excited orbital, we have E_n ≫ |µ|, so we can safely replace the above sum with

N_e(τ) = Σ_{n≠0} f(E_n, 0, τ) = Σ_{n≠0} 1/(e^(E_n/τ) − 1).   (12.10)
We can now analyze this sum using our usual tools. When L ≫ ℏ/√(mτ), we can approximate the sum as an integral. The density of orbitals is half of what we computed before, because now we have spin-0 particles:

D_B(E) ≡ (V/4π²)(2m/ℏ²)^(3/2) E^(1/2).   (12.11)
We use the subscript “B” to indicate spin-0 bosons.
The combination D_B(E) f(E, µ, τ) behaves like E^(−1/2) at small E. Even though this function has a singularity at E = 0, it is an integrable singularity — the area under the curve E^(−1/2) is finite. This guarantees that the sum N_e(τ) can be approximated by an integral

N_e(τ) ≈ ∫_0^∞ dE D_B(E) f(E, µ, τ).   (12.12)
There are some confusing things about this integral: it is supposed to represent only
the contribution of excited states, and yet the integral goes from 0 to ∞. Doesn’t
this include the ground state? And if so, aren’t we double-counting the ground
state?
Example 12.1. The answer is no, we are not double-counting. To see why, let us study the following toy model of the sum over energy levels:

S_{0,ϵ}(a) = (1/a) Σ_{n=0}^∞ (n/a + ϵ)^(1/2) / (e^(n/a+ϵ) − 1),   (12.13)
in the limit ϵ ≪ 1/a ≪ 1. The important feature of this toy model is that the
small quantity ϵ is only important in the first term n = 0. Thus, we will have
to treat this term differently from the others. We will see that the other terms
can be approximated as an integral from n = 0 to ∞, and this integral does not
double-count the n = 0 term.
To begin, consider the following simplified version of the sum over the “excited”
levels from n = 1 to ∞,
S_1(a) = (1/a) Σ_{n=1}^∞ (n/a)^(1/2) / (e^(n/a) − 1).   (12.14)
The sum is the area under the red curve in the following picture (here, we have
a = 10)
[Figure: plot of the function x^(1/2)/(e^x − 1) for 0 ≤ x ≤ 1, with the sum S_1(a) represented as an area under the red curve.]
The reason is that n/a ≫ ϵ for all n = 1, 2, . . . , by our assumption that ϵ ≪ 1/a. Thus every individual term in the sum is well-approximated by replacing n/a + ϵ → n/a.
Now consider the sum including the “ground” level n = 0. We can evaluate it
by separating out the n = 0 term:
S_{0,ϵ}(a) = (1/a) Σ_{n=0}^∞ (n/a + ϵ)^(1/2) / (e^(n/a+ϵ) − 1)
         = (1/a) ϵ^(1/2)/(e^ϵ − 1) + S_{1,ϵ}(a)
         ≈ ϵ^(−1/2)/a + I   (ϵ ≪ 1/a ≪ 1).   (12.17)
The sum S0,ϵ (a) is analogous to the one we must do: because the ground energy is so
close to a singularity of the Bose-Einstein distribution, we must treat it separately.
The remaining energy levels are not too close to the singularity and they can be
approximated in terms of an integral.
Let us now return to the sum (12.8). Approximating it as an integral, we have

N_e(τ) = ∫_0^∞ dE D_B(E) 1/(e^(E/τ) − 1)
      = (V/4π²)(2m/ℏ²)^(3/2) ∫_0^∞ dE E^(1/2)/(e^(E/τ) − 1)
      = (V/4π²)(2m/ℏ²)^(3/2) τ^(3/2) ∫_0^∞ dx x^(1/2)/(e^x − 1)
      = (V/4π²)(2mτ/ℏ²)^(3/2) Γ(3/2) ζ(3/2)
      = ζ(3/2) n_Q V,   (12.18)

where n_Q = (mτ/2πℏ²)^(3/2) is the quantum concentration. Above, we have used Γ(3/2) = √π/2. Numerically, ζ(3/2) ≈ 2.61238. Note that the shift E_n → E_n − E_ground is not important in the integral approximation.
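The integral appearing in (12.18) is easy to check numerically (a quick sketch):

    import numpy as np
    from scipy.integrate import quad
    from scipy.special import gamma, zeta

    val = quad(lambda x: np.sqrt(x) / np.expm1(x), 0, np.inf)[0]
    print(val, gamma(1.5) * zeta(1.5))  # both ~2.31516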
Thus, the fraction of particles in excited orbitals is

n_e/n = ζ(3/2) n_Q/n,   (12.19)

where n = N/V. Because particle number is conserved, the fraction of particles in the ground orbital is

n_0/n = 1 − ζ(3/2) n_Q/n = −τ/(Nµ).   (12.20)
This finally determines the temperature dependence of µ (for τ < τE — see below).
allow the object to excite the liquid. If this occurs, the object won’t experience
resistance, and the liquid will appear to have zero viscosity.
Suppose the excitations of the liquid have momentum p and energy Ep . Then
during a scattering process with the object, conservation of energy and momentum
imply
(1/2)MV² = (1/2)MV′² + E_p,
MV = MV′ + p,   (12.23)
where V and V′ are the initial and final velocity of the object. Subtracting p from both sides of the second equation, squaring, and dividing by 2M, we get

(1/2)MV² − V·p + p²/(2M) = (1/2)MV′².   (12.24)
Subtracting this from the first equation, we find

V·p − p²/(2M) = E_p.   (12.25)
Suppose that M is very large, so that we have
V · p = Ep . (12.26)
The left-hand side is at most |V|p, so the object can only excite the liquid if |V| ≥ E_p/p. In liquid ⁴He, the lowest-momentum excitations are phonons, with

E_p = v_s p,   (12.28)
where v_s ≈ 237 m/s is the speed of sound. The actual value of the critical velocity in liquid ⁴He is much lower, about 50 m/s. This is because the curve of E_p vs. p dips back down at larger p, corresponding to a wavelength of about 10⁻⁸ cm.
The energy of an atom with magnetic moment µ in a magnetic field B is

E = −µ · B.   (12.29)
Alkali atoms have a single valence electron with zero orbital angular momentum
L = 0, so their magnetic moment is entirely due to the spin of the valence electron
µ = −gs µB S (12.30)
where gs ≈ 2 is the electron spin g-factor, S is the electron spin angular momentum,
and µB = eℏ/2me is the Bohr magneton.
An atom whose magnetic moment is aligned with B experiences a lower potential
at high values of |B|, and hence seeks regions of large |B|. An atom whose magnetic
moment is anti-aligned will seek regions of small |B|. Earnshaw’s theorem says
that it is not possible to have a local maximum of |B|. Thus, magnetic traps
work by setting up a local minimum of |B|. If the field is sufficiently strong, the
magnetic moments of atoms moving around in the trap will adiabatically follow the
orientation of B — i.e. high-field-seekers will remain high-field-seekers and low-
field-seekers will remain low-field-seekers. The low-field-seekers will be attracted to
the local minimum of |B|, and the high-field-seekers will be expelled from the trap.
The magnetic traps used in the first BEC experiments were quadrupole traps,
whose local magnetic field looks like
B ∝ (x, y, −2z).

(Note that this indeed satisfies ∇ · B = 0.) This can be achieved locally by
placing two finite-length solenoids end-to-end with opposite fields. Unfortunately,
quadrupole traps have a zero of B in the middle. Atoms can experience spin-flips
at this location, resulting in them being lost from the trap. The MIT team dealt
with this problem by “plugging” the hole at (x, y, z) = 0 using a laser with fre-
quency tuned to repel the atoms.29 A nicer choice is the Ioffe-Pritchard trap, which
has a nonzero magnetic field at its center and gives rise to a harmonic (though not
isotropic) potential.
Laser cooling relies on the rate at which an atom scatters photons, which has the Lorentzian form

Γ ∝ A / ((ω − ν)² + γ²),   (12.32)

where ω is the resonant frequency of the atom, ν is the applied frequency of light,
γ is the rate of spontaneous emission of photons, and A depends on the squared
amplitude of the applied light. You can think of the atom as being like a harmonic
oscillator with frequency ω and damping rate 2γ. Then (12.32) is the expression
for the squared amplitude of a damped-driven oscillator with driving frequency ν.
29 The laser is tuned to a frequency ω + δ, where ω is a characteristic oscillation frequency of the atom and δ is a large positive “detuning.” The laser creates an oscillating electric field that drives oscillations of the electric dipole moment of the atoms. The energy of an electric dipole in a field E is −E · d. When δ is positive, oscillations of d occur π out of phase with the driving electric field, causing the atom to have higher energy where the oscillating electric field is stronger.
The source of this effective potential is the Zeeman splitting between spin states,

E_± = ±µ_B|B|.   (12.36)

Atoms whose magnetic moments are anti-aligned with the magnetic field experience the plus sign in (12.36), and thus seek low values of |B| (and remain in the trap),
while atoms whose magnetic moments are aligned seek high values of |B| (and get
ejected).
One can induce oscillations between energy levels ∆E = ℏω by applying an
oscillating electric field with frequency ν ≈ ω. Thus, by applying such a field with
ν = ωZeeman , one can turn low-field-seeking atoms into high-field-seeking atoms and
kick them out of the trap. The key trick is to match the oscillation frequency ν to
the value of ωZeeman for atoms at the top of the trap. In this way, only atoms at the
top of the trap get ejected.
[End of lecture 14(2024)]
Last time, the energy depended on n_x² + n_y² + n_z², so it was useful to introduce a variable n = √(n_x² + n_y² + n_z²) representing the radial direction in mode number space. We can use the same trick here. This time, our “radial” variable is s = n_x + n_y + n_z. By dimensional analysis, the measure for the radial direction will be as² ds for some constant a. We can fix the constant by requiring that ∫_0^1 as² ds = 1/6, the volume of a unit simplex. This gives a = 1/2.
30 For typical BEC’s in the laboratory, this is not the case — the trap is typically elongated in some direction.
Overall, we get

N_e = (1/2) ∫_0^∞ s² ds 1/(e^((ℏω_0 s − µ)/τ) − 1)
   ≈ (1/2) (τ³/(ℏω_0)³) ∫_0^∞ dx x²/(e^x − 1)
   = ζ(3) τ³/(ℏω_0)³.   (12.40)
13.3 Refrigerators
A refrigerator is a heat engine run in reverse. The refrigerator takes input heat
Ql from a reservoir at low temperature τl , together with input work, and outputs
heat Qh to a reservoir at temperature τh . In a reversible refrigerator, we again have
Qh /τh = Ql /τl so that total entropy is unchanged. Conservation of energy implies
W = Qh − Ql . The efficiency of a refrigerator is measured by
γ = Q_l/W = Q_l/(Q_h − Q_l).   (13.7)
In the case of reversible operation, this is
γ_C = τ_l/(τ_h − τ_l).   (13.8)
For an irreversible refrigerator, we have Qh /τh ≥ Ql /τl , so that the coefficient γ is
smaller than γC .
An air conditioner is a refrigerator that cools the inside of a building. Note that
the efficiency of an air conditioner is highest when the inside temperature τl is close
to the outside temperature τh . A heat pump is an air conditioner with the input and
output switched, so it uses work to heat up a building. The efficiency of a reversible
heat pump is
Q_h/W = Q_h/(Q_h − Q_l) = τ_h/(τ_h − τ_l) > 1,   (13.9)
which is better than directly converting W into heat and injecting it into the room
(as with a heater).
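For concreteness (illustrative temperatures assumed):

    # Ideal (Carnot) coefficients from (13.8)-(13.9)
    tau_h, tau_l = 295.0, 275.0        # indoors vs. outdoors, in kelvin
    print(tau_l / (tau_h - tau_l))     # refrigerator: ~13.8
    print(tau_h / (tau_h - tau_l))     # heat pump: ~14.8, >> 1

A reversible heat pump operating across a 20 K difference delivers roughly 15 units of heat per unit of work; real devices achieve only a fraction of this, but still beat direct heating.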
To keep the entropy constant, we must have V τ^(3/2) = const. So as the system
expands, the temperature drops. The energy is proportional to the temperature, so
the energy drops as well.
Because no heat can flow, we have ∆U = −W . Consider a small amount of
expansion V → V + dV . The change in energy due to work is
dU = −p dV = −(Nτ/V) dV = −(N/V)(2U/3N) dV = −(2/3) U dV/V,

dU/U = −(2/3) dV/V  =⇒  U = C V^(−2/3).   (13.14)
Thus, we have τ V^(2/3) ∝ U V^(2/3) = const., which agrees with our direct analysis using
the formula for the entropy.
Example 13.3 (Irreversible expansion into vacuum). If a gas suddenly ex-
pands, without exchanging energy with its environment, the process is irreversible.
The total energy doesn’t change, and so the temperature doesn’t either. The en-
tropy thus changes by N log(V2 /V1 ). For example, if we double the size of the box,
the entropy changes by N log 2. We can think of this as adding 1 bit of information
per particle, since now to specify the state of the system, we must say whether it is
in the left- or right-half of the box.
Thus, we see that σH − σL = N log(V2 /V1 ) = N log(V3 /V4 ), so that we must have
V2 /V1 = V3 /V4 . In more detail, the process is32
32 Following Kittel and Kroemer, we use conventions where we always associate a positive number and a direction (i.e. from the reservoir to the system) to heat and work, as opposed to using negative numbers to indicate direction. Thus, when combining different sources of heat and work, we will have to keep in mind the direction of flow to know what signs to use.
W12 = Qh . (13.18)
This is simply the area of the rectangle in the τ -σ plane. This simple result is a
consequence of the fact that the integral of dU over a closed cycle in configuration
space must vanish
∮ dU = 0.   (13.24)

Recall that

dU = d̄Q + d̄W = τ dσ − p dV.   (13.25)

Thus, we find

∮ p dV = ∮ τ dσ = ∫_rectangle dτ dσ,   (13.26)
are both nonzero. This shows that Q cannot be a function of the state of the system — i.e. there cannot be a function Q(σ, V) such that d̄Q is the differential of that function.33 If such a function existed, then we'd have ∮ d̄Q = Q(σ_f, V_f) − Q(σ_i, V_i) = 0 for any closed cycle. The object d̄Q is something that must be integrated along a path. The resulting heat transfer Q depends on that path — not just the initial and final points. This is why we use the funny d̄. The regular d is reserved for something whose integral along a closed path is zero, as in (13.24).
33 The variables we use to parametrize the system are not important as long as they completely characterize the macrostate. For example, there can’t be a function Q(τ, V) either.
from the low temperature reservoir and requires work W1 − ∆W . The heat engine
deposits heat
d̄W = dU − d̄Q = dU − τ dσ = dU − d(τσ) = dF,   (14.1)
where F = U − τ σ is the free energy. Recall also that the condition for thermody-
namic equilibrium for a system at constant temperature τ is that F is minimized.
This is because the number of accessible states of a system+reservoir in thermal
contact is
g = gS gR = eσ(U ) eσR (Utot −U ) = eσ(U ) eσR (Utot )−U/τ = eσR (Utot ) e−F/τ , (14.2)
G = F + pV = U + pV − τ σ (14.6)
Thus, the minimum of G is the “most probable configuration” for a system and
reservoir in thermal and mechanical contact.
One of the equilibrium conditions for thermal and mechanical contact is
0 = (∂σ_S/∂V)_U − (∂σ_R/∂V)_U,   (14.7)
which says that the pressure of the system and reservoir must be equal.
Recall that the free energy F = U − τσ has an interpretation as the energy, where we subtract off the “useless” contribution from heat τσ. The Gibbs free energy has a similar interpretation. When a system is in mechanical contact with a reservoir, an amount of energy −pV goes into changing the volume of the reservoir. This energy is “useless” in the same way as heat, and it is useful to subtract it off. Thus G represents the “useful” work that can be extracted from a system at constant temperature and pressure.
dG = dU + pdV + V dp − τ dσ − σdτ
= (τ dσ − pdV + µdN ) + pdV + V dp − τ dσ − σdτ
= V dp − σdτ + µdN. (14.9)
Example 14.1 (Gibbs free energy of ideal gas). Recall that the entropy of an
ideal gas is
σ = N (log(n_Q/n) + 5/2) = N (log(V n_Q/N) + 5/2).   (14.13)
so that we have

dG = Σ_j µ_j dN_j − σ dτ + V dp.   (14.20)
A general chemical reaction can be written as

Σ_j ν_j A_j = 0,   (14.21)

where A_j represent chemical species, and ν_j are coefficients. For example, for the reaction H₂ + Cl₂ = 2 HCl, we have

A_1 = H₂, A_2 = Cl₂, A_3 = HCl,
ν_1 = 1, ν_2 = 1, ν_3 = −2.   (14.22)
dN_j = ν_j dN̂,   (14.24)

where dN̂ is the change in the number of times the reaction has taken place. Plugging this into the expression for dG, we have

dG = Σ_j µ_j ν_j dN̂.   (14.25)
Here, the energies appearing in the sum come from the center-of-mass momentum of
the atoms, and we have written the result after approximating the sum over modes
as an integral in the usual way. If the gas is not monatomic, but the particles have
some internal structure, then the partition function becomes
Z_1 = Σ_{n,internal} e^(−(E_n + E_int)/τ) = (Σ_n e^(−E_n/τ)) (Σ_internal e^(−E_int/τ))
   = n_Q V Z_int(τ),   (14.28)
where Zint (τ ) is the partition function associated with the internal degrees of freedom
— for instance, rotational or vibrational motion.
As an example, rotational degrees of freedom contribute

Z_rot(τ) = Σ_{j=0}^∞ (2j + 1) e^(−ℏ²j(j+1)/(2Iτ)),   (14.29)
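The two limits of (14.29) are easy to see numerically (a sketch, in units where ℏ²/2I = 1 so that the dimensionless temperature is t = 2Iτ/ℏ²):

    import numpy as np

    def Z_rot(t):
        # rotational partition function (14.29), truncated sum
        j = np.arange(0, 200)
        return np.sum((2 * j + 1) * np.exp(-j * (j + 1) / t))

    for t in [0.1, 1.0, 10.0, 100.0]:
        print(t, Z_rot(t))
    # Z_rot -> 1 at low temperature (only j = 0 contributes), and
    # Z_rot ~ t at high temperature (the classical, equipartition regime).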
The total internal partition function for a diatomic molecule is a product of these
factors
Zint (τ ) = Zrot (τ )Zvib (τ ). (14.31)
The Gibbs sum for a gas of such molecules is

Z = Σ_{N=0}^∞ (λZ_1)^N / N! = e^(λZ_1),

where λ = e^(µ/τ), and we have assumed λ ≪ 1 so that we are in the classical regime.
The expectation value of N is

N = ⟨N⟩ = λ (∂/∂λ) log Z ≈ λ (∂/∂λ)(λZ_1) = λZ_1.   (14.33)
Thus, the chemical potential is given by

µ = τ log(N/Z_1) = τ (log n − log c),   (14.34)

where c = n_Q Z_int. This is a generalization of our previous results for monatomic gases.
Suppose that we have multiple species, so that

µ_j = τ (log n_j − log c_j),   (14.35)

where c_j = n_{Q,j} Z_{int,j}(τ). The equilibrium condition (14.26) can be written

log ∏_j n_j^(ν_j) = Σ_j ν_j log n_j = Σ_j ν_j log c_j = log K(τ),

K(τ) ≡ ∏_j c_j^(ν_j) = ∏_j (n_{Q,j} e^(−F_int,j/τ))^(ν_j),   (14.36)

where F_int,j = −τ log Z_int,j is the internal free energy. Equation (14.36) is called the law of mass action.
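As a toy illustration (my sketch, with K treated as a given input rather than computed from first principles), the law of mass action for a dissociation reaction A₂ = 2A can be solved for the equilibrium concentrations:

    import numpy as np

    def atom_fraction(K, n_tot):
        # Solve [A2]/[A]^2 = K with [A] + 2[A2] = n_tot.
        # Substituting gives 2*K*x**2 + x - n_tot = 0 with x = [A]:
        x = (-1 + np.sqrt(1 + 8 * K * n_tot)) / (4 * K)
        return x / n_tot

    for K in [1e-6, 1.0, 1e6]:
        print(K, atom_fraction(K, n_tot=1.0))
    # Small K -> fully dissociated; large K -> almost entirely molecules.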
Consider, for example, the dissociation reaction H₂ = 2H, i.e. ν_H₂ = 1 and ν_H = −2, for which the law of mass action reads

[H₂]/[H]² = K(τ).   (14.38)

Here, the notation [A] = n_A is shorthand for the concentration of species A. Since H is monatomic, it has Z_int,H = 1. What about H₂? One important point is that there is a nontrivial binding energy E_B = 4.476 eV for two H's inside an H₂ molecule. When computing Z_int,H₂, we must use the same conventions for the zero of energy as we used for H. Thus, we have

Z_int,H₂ = e^(E_B/τ) Z_rot,H₂ Z_vib,H₂,   (14.39)
where Zrot,H2 and Zvib,H2 are the partition functions associated with rotational mo-
tion and vibrations. The equilibrium constant is
K(τ) = n_{Q,H₂} Z_int,H₂ / n_{Q,H}² = 2^(3/2) e^(E_B/τ) Z_rot,H₂ Z_vib,H₂ / n_{Q,H}.   (14.40)
Here, we used n_{Q,H₂}/n_{Q,H} = m_{H₂}^(3/2)/m_H^(3/2) = 2^(3/2). At low temperatures, the vibrational and rotational partition functions become 1, and the dominant term above is the binding energy term e^(E_B/τ). We can write our result as
[H₂]/[H] = 2^(3/2) ([H]/n_{Q,H}) e^(E_B/τ) Z_rot,H₂ Z_vib,H₂.   (14.41)
15 Phase transitions
You will notice that these lecture notes are lacking in pictures, especially in this
section. For the pictures, you can (a) copy them down in class, (b) consult the
textbook (which we’re following relatively closely), (c) check out some of the other
notes linked from the website.
Sources: K&K chapter 10, David Tong’s notes on statistical physics
A phase transition is a discontinuity in thermodynamic quantities or their deriva-
tives. For example, the heat capacity and the compressibility of water jump between
the liquid phase and the gas phase. A small change in temperature or pressure leads
to a big change in the equilibrium configuration.
15.0.1 Nonanalyticities
Discontinuities in thermodynamic quantities can only occur in infinite-size systems.
To see this, consider a system in the canonical ensemble at temperature τ . The
partition function is
Z(τ) = Σ_s e^(−E_s/τ).   (15.1)

Quantities like, e.g., the heat capacity can be computed from derivatives of Z, C_V ∝ β² (∂²/∂β²) log Z. If the sum over states s is finite, then C_V can never be discontinuous, because Z(τ) will be a finite sum of smooth curves e^(−E_s/τ) as a function of τ, and hence itself a smooth curve. However, as the number of states goes to infinity, Z(τ)
can potentially develop non-smooth features, called “nonanalyticities.”
In fact, it’s not sufficient simply for the number of states to be infinite — for
example, a single harmonic oscillator will not exhibit a phase transition. In prac-
tice, the number of degrees of freedom must be infinite as well. This means that
phase transitions only actually occur in the infinite volume limit. In practice, the
rounding out of thermodynamic quantities due to finite system size is unobservable
for macroscopic systems.
To see how nonanalyticities can emerge, consider a density of states of the form
f(E) = E^(αN−1)/Γ(Nα) + E^(βN−1)/Γ(Nβ),   (15.2)

where β > α. (The important thing is the two different power laws — the coefficients are chosen for convenience.) The partition function is

Z(τ) = ∫_0^∞ dE e^(−E/τ) f(E) = τ^(Nα) + τ^(Nβ).   (15.3)
Consider the limit N → ∞. Note that the free energy grows like F ∼ N, so this is like the limit of a large number of degrees of freedom. If τ < 1, then the first term dominates and we have F ≈ −αNτ log τ. If τ > 1, the second term dominates, and we have F ≈ −βNτ log τ. Thus, in the large-N limit, we have

F = −αNτ log τ (τ < 1),   F = −βNτ log τ (τ > 1)   (N ≫ 1).
The derivative of the free energy becomes discontinuous at τ = 1. The free energy
itself is not discontinuous — it is equal to zero at τ = 1 in both expressions.
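One can watch the kink form as N grows (a numerical sketch of the toy model, with assumed exponents a = 1 and b = 2):

    import numpy as np

    a, b = 1.0, 2.0

    def F_per_N(tau, N):
        # F/N = -(tau/N) log(tau^(N a) + tau^(N b)), via log-sum-exp
        x = np.array([N * a, N * b]) * np.log(tau)
        return -tau / N * (x.max() + np.log(np.exp(x - x.max()).sum()))

    for N in [5, 50, 5000]:
        below = (F_per_N(0.99, N) - F_per_N(0.97, N)) / 0.02
        above = (F_per_N(1.03, N) - F_per_N(1.01, N)) / 0.02
        print(N, below, above)
    # As N grows, the slopes on either side of tau = 1 approach the two
    # branches -a(log tau + 1) and -b(log tau + 1): a kink develops.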
This type of discontinuity in the derivative of F is characteristic of a first-order
phase transition. First-order transitions are typically cases where two distinct con-
tributions to Z are switching places in importance, as is happening in our toy model.
Physically, one contribution τ N α comes from states in one kind of configuration, and
the other contribution τ N β comes from states in a different type of configuration.
When their contributions switch places, it means that thermodynamic quantities go
from being dominated by one type of configuration to the other.
τl = τg , µl = µg , pl = pg . (15.5)
In particular, the chemical potentials must be equal, and this determines a curve in
the τ -p plane
µ_g(p + dp, τ + dτ) = µ_l(p + dp, τ + dτ),
(∂µ_g/∂p)_τ dp + (∂µ_g/∂τ)_p dτ = (∂µ_l/∂p)_τ dp + (∂µ_l/∂τ)_p dτ.   (15.7)

Let us define the volume and entropy per particle in each phase, v = (∂µ/∂p)_τ and s = −(∂µ/∂τ)_p (these follow from G = Nµ and dG = V dp − σ dτ + µ dN). Thus, we can write our equation for the slope of the coexistence curve as
dp/dτ = ∆s/∆v,   ∆s = s_g − s_l,   ∆v = v_g − v_l.   (15.12)
The quantity in the numerator is the change in entropy per particle in going from
liquid to gas, and the quantity in the denominator is the change in volume per
particle in going from liquid to gas.
The quantity ∆s is related to the amount of heat needed to boil a molecule —
i.e. to transfer a molecule reversibly from the liquid to the gas while keeping the
temperature constant. The heat during the transfer is
d̄Q = τ∆s ≡ L,   (15.13)
where L is called the latent heat of vaporization. With this definition, we have
dp/dτ = L/(τ∆v),   (15.14)
which is called the Clausius-Clapeyron equation or vapor pressure equation.
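As a rough numerical illustration (standard values assumed, not from the notes): for water boiling at atmospheric pressure, treat the vapor as ideal so that ∆v ≈ v_g = RT/p per mole, and use the measured molar latent heat:

    # Clausius-Clapeyron slope for water at its normal boiling point
    R = 8.314        # J / (mol K)
    T = 373.0        # K
    p = 101325.0     # Pa
    L_mol = 40.7e3   # J / mol, latent heat of vaporization

    dv = R * T / p             # molar volume of the vapor (ideal gas)
    print(L_mol / (T * dv))    # ~3.5e3 Pa/K, from (15.14)

So the boiling point shifts by about 1 K for every 3.5 kPa change in pressure, which is why water boils noticeably cooler at altitude.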
V = Ng vg + Nl vl = Ng vg + (N − Ng )vl , (15.18)
make small corrections to it. We will talk more about the validity of mean field
approximations later. Plugging in n(x) = n, we have
δU = (n²/2) ∫ d³x d³y ϕ(x − y) = (n²/2) V ∫ d³x ϕ(x)
   = (n²/2) V (−2a) = −a N²/V,   (15.22)

where we simply gave the integral ∫ d³x ϕ(x) a name, −2a. The negative sign is because the potential should be attractive. This gives a correction to the pressure via

δp ≈ −∂(δU)/∂V = −a N²/V².   (15.23)
This derivation was a reasonable approximation in the limit of low density and
weak interactions. However, we are going to now take the Van der Waals equation
of state and use it outside of its regime of validity because it gives a very useful toy
model in which to understand some properties of phase transitions. The quantitative
results will be incorrect, but some of the qualitative features will be correct.
15.2.1 Isotherms
Let us fix τ , and plot p vs V in the Van der Waals model. Such curves are called
isotherms. If τ is large, the term −a/v 2 is unimportant, and the curves closely follow
the ideal gas law p ≈ τ /v. However, as τ gets smaller, the curve p(v) develops a
wiggle. The value τ = τc below which the wiggles occur is the one where the curve
has an inflection point
∂p/∂v = ∂²p/∂v² = 0.   (15.24)
A nice way to find the inflection point is to write the VdW equation (p + a/v²)(v − b) = τ as a cubic in v,

p v³ − (pb + τ) v² + a v − ab = 0.   (15.25)

This is a cubic in v with three solutions for τ < τ_c. When τ = τ_c and p = p_c, the three solutions coincide, so that the equation has the form

p_c (v − v_c)³ = 0.   (15.26)

Matching coefficients gives the critical point v_c = 3b, p_c = a/(27b²), τ_c = 8a/(27b).
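A quick symbolic check of the critical point (a sketch, assuming the per-particle VdW form p = τ/(v − b) − a/v²):

    import sympy as sp

    v, tau, a, b = sp.symbols("v tau a b", positive=True)
    p_vdw = tau / (v - b) - a / v**2

    vc, tauc = 3 * b, sp.Rational(8, 27) * a / b
    # both derivatives in (15.24) vanish at the critical point:
    print(sp.simplify(sp.diff(p_vdw, v).subs({v: vc, tau: tauc})))        # 0
    print(sp.simplify(sp.diff(p_vdw, v, 2).subs({v: vc, tau: tauc})))     # 0
    # and the critical pressure is a/(27 b^2):
    print(sp.simplify(p_vdw.subs({v: vc, tau: tauc}) - a / (27 * b**2)))  # 0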
Let us consider τ < τc . There is a range of p for which there are three possible
solutions v1 , v2 , v3 for the equation of state p(v). What is happening in this range?
Note that the middle solution v = v2 has the feature
∂p/∂v > 0.   (15.28)
This means that as the system grows in volume, the pressure grows. As the system
shrinks in volume, the pressure decreases. The result is an instability — the system
will evolve away from this configuration toward some other state.
Example 15.1 (Negative heat capacity). Incidentally, a similar instability con-
dition applies to heat transfer. Recall that the heat capacity is defined as
C_V = ∂U/∂τ.   (15.29)
If the heat capacity were negative, then increasing the energy would decrease the
temperature. This would cause more energy to flow into the system, further de-
creasing the temperature. Meanwhile, decreasing the energy would increase the
temperature, causing more energy to flow out, causing the temperature to increase
further. A famous example of a system with negative heat capacity is a black hole.
1
The Hawking temperature is τ ∝ M , which means it decreases as the mass increases.
The instability results in the black Hawking evaporating to nothing.
Back to our liquid-gas system. What configuration replaces the unstable one?
Well, the Van der Waals equation of state was derived assuming the system has a
fixed density n/V . We can move off this curve by making the density inhomogeneous
— i.e. by turning the system into an inhomogeneous mixture of coexisting gas and
liquid. As we argued before, in the inhomogeneous phase, the pressure and temper-
ature are constant as a function of volume. Thus, the physically correct isotherm
does not follow the curve pVdW (τ ) between v1 and v3 , but rather is a straight line
in that region.
The pressure of the straight line is fixed by the coexistence condition

µ_l(p, τ) = µ_g(p, τ).   (15.30)

This equation can be solved for p once we know µ for the Van der Waals gas. Remember that we have the Gibbs-Duhem relation

N dµ = V dp − σ dτ.   (15.31)
We can integrate this equation along any path in configuration space to compute
the chemical potential µ. A convenient choice is to consider paths of constant N
and τ , in which case we have
N dµ = V dp   (dτ = 0),

µ_g = µ_l + (1/N) ∫_{p_l}^{p_g} V dp.   (15.32)
To find two points with equal µ, we should find two points such that the integral of V dp between them vanishes. This says that the areas enclosed above and below the straight line should be equal, which tells us where to draw the isotherm in the coexistence region. This is called the Maxwell construction.
Actually, there is something problematic about the Maxwell construction: it
requires us to integrate dµ through the unstable region, where the physics described
by the Van der Waals curve is completely wrong! A better derivation would be to
choose a path between the gas point and liquid point that does not pass through
the unstable region. Ultimately this would be more mathematically complicated,
and will give the same answer. Anyway, the Maxwell construction is not really very
important — it is a trick to help us visualize where the condition (15.30) holds.
Let us consider changing the pressure a little bit away from its value at the coexistence curve p_coexistence at some temperature τ. If p is slightly above p_coexistence, then we follow the liquid part of the curve, (∂µ/∂p)_τ = v_l. If p is slightly below p_coexistence, then we follow the gas part of the curve, (∂µ/∂p)_τ = v_g. Thus, there is a discontinuity in the first derivative of the chemical potential (and also the Gibbs free energy) as we move across the coexistence pressure at fixed τ. Note that there is no discontinuity in the Gibbs free energy itself, since the chemical potentials have to be equal for coexistence. A discontinuity in the first derivative of the free energy is characteristic of a first order transition. This is the same thing that happened in our toy model of the partition function Z(τ) = τ^(Nα) + τ^(Nβ).
When we're near the critical point, we can write v_l = v_c − ϵ/2 and v_g = v_c + ϵ/2, where ϵ is small. Plugging this in, we find that the coexistence conditions fix τ − τ_c ∝ ϵ² + O(ϵ³), where O(ϵ³) means terms of order ϵ³ and higher. When ϵ = 0, we must have τ = τ_c, so we can rewrite this as

τ_c − τ ∼ (v_g − v_l)²,
(τ_c − τ)^(1/2) ∼ v_g − v_l.   (15.36)
This is our first example of a critical exponent — the exponent in a power law relationship between quantities as one approaches a critical point.
Recall that (∂µ/∂p)_gas = v_g and (∂µ/∂p)_liquid = v_l. Below the critical temperature, we noticed that these quantities jumped as we dialed the pressure past its coexistence value. Thus, the Gibbs free energy (G = Nµ) had a discontinuous first derivative. However, at the critical temperature, the difference between v_l and v_g vanishes. The Gibbs free energy no longer has a discontinuous first derivative as a function of pressure. Instead, we have ∂²∆µ/∂p² = ∂(v_g − v_l)/∂p ∼ (p − p_c)^(−2/3), which diverges. This is characteristic of a second-order phase transition.
Let us compute a few more critical exponents in the Van der Waals model. We can ask how the volume changes with pressure as we move along the critical isotherm. Remember that the critical point was an inflection point, ∂p/∂v = ∂²p/∂v² = 0. Thus, the Taylor expansion starts with a cubic term:

p − p_c ∼ (v − v_c)³   (τ = τ_c).   (15.37)
The isothermal compressibility κ ∝ −∂v/∂p then diverges as we approach the critical temperature from above,

κ ∼ (τ − τ_c)^(−1).   (15.40)

In real water, by contrast, the measured behavior near the critical point is

v_g − v_l ∼ (τ_c − τ)^β,   β = 0.326419(3),
p − p_c ∼ (v − v_c)^δ,   δ = 4.78984(1),
κ ∼ (τ − τ_c)^(−γ),   γ = 1.23708(7).   (15.41)
So the Van der Waals model does not actually get the critical exponents right. This
is perhaps not surprising, since it is a very crude model and doesn’t incorporate
many of the properties of actual water. However, a very surprising fact is that the
above values of the critical exponents don’t actually depend on special properties of
water at all. If you measure critical exponents in any other liquid-vapor transition,
you find the same crazy numbers. The above critical exponents are actually some
kinds of universal “constants of nature” like e and π. The failure of the Van der
Waals model has nothing to do with the failure to build in details of water molecules
— rather it is missing some essential feature of critical points that leads to the
emergence of these universal numbers.
I have to show you so many decimal digits for β, δ and γ because the current most precise values for these quantities are actually due to me — your instructor — together with some collaborators. See, for example https://arxiv.org/abs/1603.04436, which computes precise values for the numbers ∆_σ = 0.5181489(10), ∆_ϵ = 1.412625(10), which are related to the critical exponents above by β = ∆_σ/(3 − ∆_ϵ), δ = (3 − ∆_σ)/∆_σ, γ = (3 − 2∆_σ)/(3 − ∆_ϵ). These numbers are computed using a method called the “conformal bootstrap,” which is based on techniques from quantum field theory. In the last few lectures of the term, I hope to explain a little bit about why critical exponents are universal and why quantum field theory is a good tool for describing them.
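The quoted relations are easy to verify (a quick sketch):

    # Critical exponents from the bootstrap values of Delta_sigma, Delta_eps
    d_sig, d_eps = 0.5181489, 1.412625
    print(d_sig / (3 - d_eps))            # beta  ~ 0.32642
    print((3 - d_sig) / d_sig)            # delta ~ 4.7898
    print((3 - 2 * d_sig) / (3 - d_eps))  # gamma ~ 1.23708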
The Ising model consists of spins s_i = ±1 on the sites of a lattice, with energy

E[s] = −J Σ_{⟨ij⟩} s_i s_j − B Σ_i s_i.   (15.42)

The interpretation of the second term is that the spins sit in an applied external magnetic field B, causing it to be energetically favorable to point in the direction of B. The energy of a magnetic moment ms_i in a magnetic field is −Bms_i, and we have defined Bm → B for brevity.
In the first term, the notation Σ_{⟨ij⟩} means that the sum runs over nearest-neighbor pairs of spins. Its precise meaning depends on the dimension d and the
type of lattice. We write q for the number of nearest neighbors of a given lattice
site. For a square lattice in d-dimensions, each spin has q = 2d nearest neighbors,
corresponding to the sites reached by moving by ±1 along each of the d axes. The
Ising model can be defined on any type of lattice, but for concreteness we will mostly
consider a square lattice. The cases d = 1, 2, 3 are relevant for real-world materials
— they describe magnetic strings, surfaces, and bulk materials.
The term −J Σ_{⟨ij⟩} s_i s_j models interactions between neighboring spins, and the
interaction strength J is a parameter. When J > 0, this term makes it energetically
favorable for the spins to align with each other. This case is called a ferromagnet.
The case J < 0 is called an antiferromagnet. For the purposes of the present
discussion, the distinction is not so important, but we will imagine that we have a
ferromagnet.
Let us work in the canonical ensemble and introduce the partition function
Z = Σ_{{s_i}} e^(−βE[s_i])
  = Σ_{s_1=±1} Σ_{s_2=±1} ··· Σ_{s_N=±1} e^(−βE(s_1, s_2, ..., s_N)),   (15.43)

where β = 1/τ. The notation Σ_{{s_i}} means we sum over all configurations of spins
si ∈ ±1, and we have written it out more explicitly on the second line. An important
observable is average spin or magnetization, which can be written in terms of a
However, this system is related to the Ising model by a simple change of variable:

s_i = 2n_i − 1.   (15.46)
Let m = ⟨s_i⟩ denote the average spin, and split each spin into its average plus a fluctuation:

s_i s_j = [(s_i − m) + m][(s_j − m) + m]
       = (s_i − m)(s_j − m) + m(s_i + s_j) − m².   (15.47)
In the mean field approximation, we assume that fluctuations of spins away from
the average are small, so that we can neglect the first term. The energy is then
E ≈ −J Σ_{⟨ij⟩} [m(s_i + s_j) − m²] − B Σ_i s_i
  = (1/2) JNq m² − (Jqm + B) Σ_i s_i,   (15.48)
i
where q is the number of nearest neighbors for a given lattice site. The factor of
1/2 in the first term is to avoid overcounting pairs of lattice sites.
The mean field approximation has removed the interactions. We now have a
bunch of non-interacting spins in the effective magnetic field
Beff = B + Jqm, (15.49)
which includes the external applied field and also a field generated by nearest neigh-
bors. We analyzed a system of noninteracting spins a long time ago in section ??.
The partition function is
Z = Σ_{{s_i}} e^(−β((1/2)JNqm² − B_eff Σ_i s_i))
  = e^(−(1/2)βJNqm²) (e^(βB_eff) + e^(−βB_eff))^N
  = e^(−(1/2)βJNqm²) 2^N cosh^N(βB_eff).   (15.50)
This expression for Z depends separately on m and Beff . However, we can determine
m from (15.44), and this gives us a consistency equation
m = tanh(βBeff ) = tanh(βB + βJqm). (15.51)
This is a transcendental equation that we can solve for m for various values of β, J, B.
We can visualize the solution by plotting the curves m and tanh(βB + βJqm) and
asking that they intersect. Let us consider some different cases.
• B = 0, βJq < 1 (zero applied magnetic field, high temperature/low spin-spin
interaction). When B = 0, the curve tanh(βB + βJqm) is symmetric around
m = 0. When βJq < 1, there is only one solution to (15.51) at m = 0. The
interpretation is that at high temperatures and zero applied field, the spins
fluctuate a lot and average out to zero.
• B = 0, βJq > 1 (zero applied magnetic field, low temperature/high spin-
spin interaction). There are now three solutions to equation (15.51), m =
−m0 , 0, m0 . The middle solution is unstable. To see this, we can expand the
free energy near m = 0 when B = 0,
1 N log 2 1
F =− log Z ≈ − − (JN q(Jqβ − 1))m2 + O(m4 ) (B = 0).
β β 2
(15.52)
We see that m = 0 is a local maximum of the free energy and thus does not
dominate the canonical ensemble. Thus, the system settles into the states
m = ±m. The interpretation is that the spin-spin interactions overwhelm the
effects of thermal fluctuations and the spins overwhelmingly want to point in
the same direction.
Precisely which direction they choose depends on the history of the system (hysteresis). If more spins start out up than down, then the system is most likely to evolve to a state where most of the spins are up (and vice-versa). This
is an example of a phenomenon called spontaneous symmetry breaking. When
B = 0, the system has a Z2 symmetry under flipping all the spins si → −si ,
which acts as m → −m. However, the actual state of the system at low
temperature is not invariant under m → −m. (When the temperature is high,
the symmetry is unbroken.)
• B ̸= 0. In this case, there is always a solution for m with the same sign
as B. At sufficiently high values of βJq, there are two additional solutions.
The middle one is always unstable. The other one, with opposite sign to B is
metastable — it is a local but not global minimum of the free energy. Thus,
for the stable solution, the magnetization always points in the direction of B.
The applied magnetic field breaks the Z2 symmetry. If we slowly change the
applied magnetic field from positive to negative, the system will move into
the metastable state and then at some point decay into the stable state where
most spins are flipped.
Expanding (15.51) at B = 0 for small m, m = tanh(βJqm) ≈ βJqm − (βJqm)³/3, gives m_0² ≈ 3(βJq − 1)/(βJq)³, so the spontaneous magnetization vanishes like m_0 ∼ (τ_c − τ)^(1/2) as τ → τ_c = Jq. We see that we get the same exponent as in our computation of n_g − n_l in the Van
der Waals model.
We can also set τ = τc and ask how the magnetization changes as we approach
B = 0. We then have βJq = 1 and our equation is
m = tanh(B/(Jq) + m).   (15.55)

Expanding the tanh for small B and m, the linear terms cancel and we are left with B/(Jq) ≈ m³/3, so that

m ∼ B^(1/3).   (15.57)
This is analogous to our result vg − vl ∼ (p − pc )1/3 in the Van der Waals model.
Finally, let us consider the magnetic susceptibility
χ = N (∂m/∂B)_τ.   (15.58)
This is the analog of compressibility for a gas. Let us ask how χ changes as we
approach τ → τc from above at B = 0. Differentiating (15.51) with respect to B,
we find
χ = (Nβ/cosh²(βJqm)) (1 + (Jq/N) χ).   (15.59)
Setting m = 0, B = 0, we find

χ = Nβ/(1 − Jqβ) ∼ (τ − τ_c)^(−1).   (15.60)
We again get the same critical exponent as in the Van der Waals model.
The dictionary between the liquid-gas transition and the Ising magnet is:

gas            magnet
τ              τ
p              B
τ_c            τ_c
p = p_c        B = 0
n              m
κ              χ
n = n_g        m = m_0
n = n_l        m = −m_0
On the other hand, there are some differences — the Ising model has an inbuilt Z2
symmetry under B, m → −B, −m, whereas there is no obvious symmetry relating
the liquid and gas phases.
Just like with liquid-vapor transitions, the critical exponents of the Ising model
turn out to be different from those predicted by mean field theory. However, amaz-
ingly they are the same critical exponents as in liquid vapor transitions.
The very same β, γ, δ that we discussed before appear not only in myriad liquid-
vapor transitions, but at critical points of ferromagnets, and many other settings.
We say that all these systems are in the same universality class — their critical
points are controlled by the same critical exponents. The particular universality
class we have been discussing is called the 3d Ising universality class; the name reflects only the fact that the 3d Ising model is the simplest model in the class. Other than being simple, there is nothing special about it: it is just one of many models with the same emergent critical behavior.
What does mean field theory get wrong in all of these systems, and what is
shared between different critical points? A defining feature of critical points is that
fluctuations are important. In fact, the structure of fluctuations at critical points is
extremely special. As an example, in a ferromagnet with τ < τc and B = 0, there are two phases m = ±m0. In practice, these phases will coexist, so that we will have large pockets of up spins and pockets of down spins. As we approach the critical point, the distinction between m = +m0 and m = −m0 starts to disappear and thermal fluctuations become more important. Inside the pockets of up spins, there are smaller pockets of down spins. Inside those pockets are even smaller pockets of up spins, etc. At the
critical point, this nested structure goes on ad infinitum, and we get (in the limit
of an infinite lattice) infinitely nested pockets of up and down spins at all distance
scales.
In liquid-vapor transitions, something similar happens. When τ < τc the liquid
can boil. Boiling happens by a bubble of gas appearing inside the liquid. As we
approach the critical temperature, fluctuations in density become more important
and we end up with droplets inside the bubbles, and bubbles inside the droplets,
etc. until there are bubbles and droplets at all distance scales.
This infinitely nested structure of different phases at a critical point is reflective
of an emergent symmetry: scale invariance. The system looks the same at all
distance scales, like a fractal. There is strong evidence that scale-invariant theories
are extremely tightly mathematically constrained, and this is the reason why the
same scale-invariant theories show up in so many different physical systems.
As an example, the tight mathematical constraints on scale-invariant theories in two dimensions have led to an exact solution for the critical exponents of the 2d Ising model:

m_0 \sim (\tau_c - \tau)^{\beta}, \qquad \beta = \frac{1}{8},
m \sim B^{1/\delta}, \qquad \delta = 15,
\chi \sim (\tau - \tau_c)^{-\gamma}, \qquad \gamma = \frac{7}{4}. \qquad (15.61)
The critical exponents in 3 dimensions are not known exactly, but they can be
computed numerically using conformal bootstrap techniques to give the values in
section ??.
Let us now develop a bit more technology to understand the significance of
fluctuations and scaling symmetry in the Ising model. Along the way, we will solve
the 1d Ising model (though unfortunately we won’t earn a Ph.D. for doing so like
Ising did).
The 1d Ising model has the energy functional

S[s] = -K \sum_i s_i s_{i+1} - h \sum_i s_i, \qquad Z = \sum_{\{s_i\}} e^{-S[s]}, \qquad (15.62)

where K = βJ and h = βB. I will sometimes refer to S[s] as the “action,” even though it is equal to βE, where E is the classical energy.
We’re going to solve this theory using an analogy with quantum mechanics. The
partition sum can be thought of as a discrete version of a path-integral. This path
integral can be computed by relating it to a quantum-mechanical theory. This is an
example of the notion of quantization.
The key idea is to build up the partition sum by moving along the lattice site-
by-site. Forget about periodicity for the moment, and consider the contribution to
the partition function from spins j < i for some fixed i,
Z_{\text{partial}}(i, s_i) = \sum_{\{s_j \,:\, j < i\}} e^{K \sum_{j<i} s_j s_{j+1} \,+\, h \sum_{j<i} s_j}. \qquad (15.63)
Because of the interaction term s_{i−1} s_i, we cannot do the sum over {s_1, . . . , s_{i−1}} without specifying the spin s_i. Thus, we have a function of s_i. In short, Z_partial(i, s) is the partition function of the theory on the lattice 1 . . . i, with fixed boundary condition s at site i.37

37: We must also impose some boundary condition at site 1. The precise choice is not important for this discussion, so we have left it implicit.
Note that Zpartial (i + 1, si+1 ) can be related to Zpartial (i, si ) by inserting the
remaining Boltzmann weights that depend on si and performing the sum over si =
±1,
Z_{\text{partial}}(i+1, s_{i+1}) = \sum_{s_i = \pm 1} T(s_{i+1}, s_i)\, Z_{\text{partial}}(i, s_i), \qquad (15.64)
where

T(s', s) = \langle s'|\hat{T}|s\rangle = e^{K s' s + h s}, \qquad \hat{T} = \begin{pmatrix} e^{K+h} & e^{-K-h} \\ e^{-K+h} & e^{K-h} \end{pmatrix}. \qquad (15.66)
The matrix \hat{T} is called the “transfer matrix”, and it plays the role of a discrete time-translation operator. Here, i should be thought of as a discrete imaginary time coordinate.
To be explicit, the (integrated) Schrödinger equation in a quantum theory in imaginary time is

|\psi(t_E + \Delta t_E)\rangle = e^{-\hat{H}\,\Delta t_E}\, |\psi(t_E)\rangle,
where t_E is the imaginary time coordinate, ∆t_E is some imaginary time-step, and \hat{H} is the quantum Hamiltonian. Thus, the 1-dimensional Ising lattice model is equivalent to a 2-state quantum theory with Hamiltonian

\hat{H} = -\frac{1}{\Delta t_E} \log \hat{T}. \qquad (15.70)
When the lattice is periodic with length M , the partition function is related to
the transfer matrix by
Z = \sum_{\{s_i\}} \langle s_M|\hat{T}|s_{M-1}\rangle \langle s_{M-1}|\hat{T}|s_{M-2}\rangle \cdots \langle s_1|\hat{T}|s_M\rangle = \text{Tr}(\hat{T}^M). \qquad (15.71)
Evaluating the trace in the eigenbasis of \hat{T},

\text{Tr}(\hat{T}^M) = \lambda_+^M + \lambda_-^M, \qquad (15.72)

where

\lambda_\pm = e^K \cosh h \pm \sqrt{e^{2K} \sinh^2 h + e^{-2K}} \;\to\; \begin{cases} 2\cosh K \\ 2\sinh K \end{cases} \qquad (\text{when } h = 0). \qquad (15.73)

In the thermodynamic limit M → ∞, the larger eigenvalue dominates:

Z = \lambda_+^M \left(1 + (\lambda_-/\lambda_+)^M\right) \approx \lambda_+^M. \qquad (15.74)
This is an interesting result: the 1d Ising model has a partition function that is completely smooth as a function of the parameters K, h. Thus, there is no phase transition in the 1d Ising model. The interpretation is that the spin-spin interactions aren't strong enough to create distinct high- and low-temperature phases. To see a phase transition, we will have to look in higher dimensions.
There is a nice interpretation of (15.74) in terms of our quantum mechanics
analogy: the state with the largest eigenvalue of \hat{T} has the smallest eigenvalue of \hat{H} — i.e. it is the ground state, and we should call it |0⟩. We have shown that the
ground state dominates the thermodynamic limit. Contributions from the excited
state are exponentially suppressed in the size of the system
\left(\frac{\lambda_-}{\lambda_+}\right)^{M} = e^{-M/\xi}, \qquad \frac{1}{\xi} \equiv -\log(\lambda_-/\lambda_+). \qquad (15.75)
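All of these statements are easy to check numerically. A minimal sketch (assuming numpy; the function names are ours) builds the 2×2 matrix (15.66), verifies Tr(T̂^M) against a brute-force sum over all 2^M periodic spin configurations, and evaluates the correlation length (15.75):

    import itertools
    import numpy as np

    def transfer_matrix(K, h):
        """The 2x2 transfer matrix (15.66); basis order s = +1, s = -1."""
        return np.array([[np.exp( K + h), np.exp(-K - h)],
                         [np.exp(-K + h), np.exp( K - h)]])

    def Z_brute_force(K, h, M):
        """Sum exp(K sum_i s_i s_{i+1} + h sum_i s_i) over all 2^M
        periodic configurations."""
        return sum(np.exp(K * sum(s[i] * s[(i + 1) % M] for i in range(M))
                          + h * sum(s))
                   for s in itertools.product([1, -1], repeat=M))

    K, h, M = 0.7, 0.3, 10
    T = transfer_matrix(K, h)
    print(np.trace(np.linalg.matrix_power(T, M)))   # Tr(T^M), eq. (15.71)
    print(Z_brute_force(K, h, M))                   # the same number

    lam = np.sort(np.linalg.eigvals(T).real)        # lambda_-, lambda_+
    print(-1.0 / np.log(lam[0] / lam[1]))           # correlation length xi, eq. (15.75)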
\text{Tr}\!\left(e^{-T\hat{H}/\hbar}\right) = \int_{x(0)=x(T)} Dx(t)\, \exp\!\left(-\frac{1}{\hbar}\int_0^T dt\, L(x(t), \dot{x}(t))\right), \qquad (15.81)
where here the integral is over periodic paths. Note that the left-hand side is the
partition function at inverse temperature β = 1/τ = T /ℏ. Thus, the partition
function is a path integral over paths that are periodic in imaginary time with
periodicity T = ℏβ, or ∆t = −iℏβ.
This transformation is analogous to the one we just did for the 1d Ising model,
but backwards. The analogy is as follows. Firstly, the sum over a spin is analogous
to the integral over x
\sum_{s_i = \pm 1} \;\Longleftrightarrow\; \int_{-\infty}^{\infty} dx_i. \qquad (15.82)
The energy functional S[s] in the Ising model is analogous to the action
S[s] \;\Longleftrightarrow\; \frac{1}{\hbar}\int_0^T dt\, L(x(t), \dot{x}(t)). \qquad (15.83)
The partition function of the Ising model is analogous to the path integral
\sum_{\{s_i\}} e^{-S[s]} \;\Longleftrightarrow\; A_{T/N} \int_{-\infty}^{\infty} dx_0 \cdots A_{T/N} \int_{-\infty}^{\infty} dx_{N-1}\, \exp\!\left(-\frac{1}{\hbar}\int_0^T dt\, L(x(t), \dot{x}(t))\right), \qquad (15.84)

where A_{T/N} is the normalization constant associated with each time-step,
and the transfer matrix is analogous to the operator that evolves with the Hamilto-
nian from one time to the next
\hat{T} \;\Longleftrightarrow\; e^{-T\hat{H}/(\hbar N)} = 1 - \frac{T\hat{H}}{\hbar N} + O\!\left(\frac{1}{N^2}\right). \qquad (15.86)
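Equation (15.86) is a first-order approximation to the imaginary-time evolution operator, and composing N such steps reproduces e^{−TĤ/ℏ} as N → ∞. A minimal numerical sketch with a random 2-state Hamiltonian (assuming numpy and scipy; we set ℏ = 1):

    import numpy as np
    from scipy.linalg import expm

    rng = np.random.default_rng(0)
    A = rng.standard_normal((2, 2))
    H = (A + A.T) / 2                   # a random Hermitian "Hamiltonian"
    T = 1.0                             # total imaginary time, with hbar = 1

    exact = expm(-T * H)
    for N in (10, 100, 1000):
        step = np.eye(2) - (T / N) * H  # the right-hand side of (15.86)
        err = np.abs(np.linalg.matrix_power(step, N) - exact).max()
        print(N, err)                   # the error falls off like 1/N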
In the last line, we split the action into contributions from pairs of neighboring rows.
The notation sj represents the configuration of spins in the j-th row,
(sj )i = si,j . (15.88)
The action associated with a pair of neighboring rows is given by
L(s', s) = \frac{1}{2}K \sum_{i=1}^{M} (s'_i - s_i)^2 - \frac{1}{2}K \sum_{i=1}^{M} (s_{i+1} s_i + s'_{i+1} s'_i). \qquad (15.89)
These row configurations can be thought of as states in a Hilbert space of M qubits,

\mathcal{H}_M = \mathcal{H}_1 \otimes \mathcal{H}_2 \otimes \cdots \otimes \mathcal{H}_M, \qquad (15.90)

where \mathcal{H}_i is a 1-qubit Hilbert space for each site i. \mathcal{H}_M is the quantum Hilbert space of M qubits, and is 2^M-dimensional. This is sometimes called a (quantum-mechanical) spin-chain.
The transfer matrix between successive time slices is a 2^M × 2^M matrix with entries
\langle s'|\hat{T}|s\rangle = e^{-L(s', s)}. \qquad (15.91)
The partition function on ZM × ZN is then
Z(\mathbb{Z}_M \times \mathbb{Z}_N) = \text{Tr}_{\mathcal{H}_M}(\hat{T}^N). \qquad (15.92)
To compute correlation functions, we need an operator that measures the spin at
site i. This is simply the Pauli spin matrix \sigma^z_i associated with the i-th site,
\sigma^z_i\, |s_1, \ldots, s_i, \ldots, s_M\rangle = s_i\, |s_1, \ldots, s_i, \ldots, s_M\rangle. \qquad (15.93)
Correlation functions are then traces with insertions of these operators; for spins at (i_1, j_1) and (i_2, j_2) with j_1 ≤ j_2,

\langle s_{i_1, j_1}\, s_{i_2, j_2} \rangle = \frac{1}{Z}\, \text{Tr}_{\mathcal{H}_M}\!\left( \hat{T}^{N + j_1 - j_2}\, \sigma^z_{i_1}\, \hat{T}^{j_2 - j_1}\, \sigma^z_{i_2} \right). \qquad (15.94)
Let us write \hat{T} in a more familiar way as an operator on a spin chain. First split L into contributions from vertical and horizontal bonds,

L(s', s) = L_v(s', s) + L_h(s) + L_h(s'), \qquad L_v(s', s) = \frac{1}{2}K \sum_i (s'_i - s_i)^2, \qquad L_h(s) = -\frac{1}{2}K \sum_i s_{i+1} s_i. \qquad (15.95)

Note that
e^{\frac{1}{2}K \sum_i \sigma^z_i \sigma^z_{i+1}}\, |s\rangle = e^{-L_h(s)}\, |s\rangle. \qquad (15.96)
Meanwhile, Lv only involves spins at a single site, so let us imagine that we have
only one site. Note that
\langle s'|(1 + e^{-2K}\sigma^x)|s\rangle = e^{-\frac{1}{2}K(s' - s)^2}. \qquad (15.97)
We also have
1 + e^{-2K}\sigma^x = e^{A + K'\sigma^x}, \quad \text{where} \quad \tanh K' = e^{-2K}, \qquad e^A = \sqrt{1 - e^{-4K}}, \qquad (15.98)
which follows by expanding out the Taylor series for e^{A + K'\sigma^x} and matching the coefficients of 1 and \sigma^x. Thus,

e^{AM}\, \langle s'|e^{K' \sum_i \sigma^x_i}|s\rangle = e^{-L_v(s', s)}. \qquad (15.99)
The constant e^{AM} will cancel in correlation functions, so we will ignore it. Putting everything together, we find

\hat{T} \propto \exp\!\left(\frac{1}{2}K \sum_i \sigma^z_i \sigma^z_{i+1}\right) \exp\!\left(K' \sum_i \sigma^x_i\right) \exp\!\left(\frac{1}{2}K \sum_i \sigma^z_i \sigma^z_{i+1}\right). \qquad (15.100)
We could now obtain a quantum Hamiltonian by taking \hat{H} \propto -\log\hat{T} as in (15.70), but because the three factors in (15.100) do not commute with each other, the resulting \hat{H} would be very complicated.
Onsager’s 1944 solution of the 2d Ising model consisted of diagonalizing the above
matrix. The solution has an interesting and important feature: when the parameter
K is dialed to a very special value, the correlation length ξ (which is the inverse of the gap between the two lowest eigenvalues of the associated Hamiltonian \hat{H} \propto -\log\hat{T}) diverges — i.e. the low-lying eigenvalues of \hat{H} collapse toward each other. This is the critical point of the theory, and
it results in lots of interesting and rich physics. In fact, the right definition of a
critical point/second-order phase transition is a point where the correlation length
of a system diverges. A divergent correlation length means the system is in some
sense coupled to itself on all distance scales, and can display interesting emergent
phenomena.
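We can watch this collapse on a small chain. The sketch below (assuming numpy and scipy; the helper names, the periodic horizontal bonds, and the very small chain size M = 8 are our own choices, so the numbers are only qualitative) builds the operator (15.100) with Kronecker products and evaluates ξ from the two largest eigenvalues:

    import numpy as np
    from scipy.linalg import expm

    def site_op(op, i, M):
        """Embed a 2x2 single-site operator at site i of an M-site chain."""
        out = np.array([[1.0]])
        for j in range(M):
            out = np.kron(out, op if j == i else np.eye(2))
        return out

    def transfer_matrix_2d(K, M):
        """The operator (15.100) on M qubits, with periodic bonds (i, i+1)."""
        sz = np.diag([1.0, -1.0])
        sx = np.array([[0.0, 1.0], [1.0, 0.0]])
        Hzz = sum(site_op(sz, i, M) @ site_op(sz, (i + 1) % M, M)
                  for i in range(M))
        Hx = sum(site_op(sx, i, M) for i in range(M))
        Kp = np.arctanh(np.exp(-2.0 * K))   # tanh K' = e^{-2K}, eq. (15.98)
        half = expm(0.5 * K * Hzz)          # exp((K/2) sum_i sz_i sz_{i+1})
        return half @ expm(Kp * Hx) @ half

    M = 8
    for K in (0.30, 0.44, 0.60):
        lam = np.sort(np.linalg.eigvalsh(transfer_matrix_2d(K, M)))
        print(K, -1.0 / np.log(lam[-2] / lam[-1]))   # correlation length xi

One finds that ξ grows sharply as K increases through Onsager's critical value K_c = ½ log(1 + √2) ≈ 0.4407. On a finite chain ξ continues to grow for K > K_c, because there the two leading eigenvalues become exponentially close in M, reflecting the two coexisting ordered phases.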
We have started with a 2d lattice model, and in the end obtained a quantum
mechanical system that has a spatial direction (in addition to time). In the con-
tinuum limit, where we zoom out to distance scales much larger than the lattice
spacing, this system becomes a quantum field theory. In general, a d-dimensional
lattice model in the continuum limit can be rewritten as a quantum field theory
with d − 1 spatial directions and 1 time direction. In fact, one of our best definitions
of the strong nuclear force is as the continuum limit of a 4d lattice model.
A critical point is described by a quantum field theory with an infinite correlation
length. The fact that the correlation length is infinite means that the theory doesn’t
have any intrinsic length scales — it becomes invariant under rescaling.
16 Maxwell’s demon
Maxwell’s demon is a deceptively simple paradox about entropy that was proposed
in 1867 and resolved more than 100 years later.
Imagine a box of gas with a divider separating it into two halves. Both halves
initially have the same temperature, volume, and number of particles. On the
divider, there is a small door operated by a machine that we call the “demon.” The
demon looks at each gas molecule approaching the door. By selectively choosing to
open or close the door, it allows only fast-moving molecules to enter the right half,
and only slow-moving molecules to enter the left half. In this way, the temperature
on the left decreases and the temperature on the right increases. The demon has
apparently violated the second law of thermodynamics, since the configuration where
both sides have different temperatures has lower entropy than the configuration
where they have the same temperature (i.e. thermal equilibrium).
The demon does not need to be conscious — it could be a computer with a laser
sensor, or even something as simple as a spring-loaded door that only opens to the
right and can only be knocked open by a sufficiently fast molecule. There are also
versions of the paradox that violate the second law in other ways. For example,
starting with a mixture of two gases A and B, the demon could selectively let A
molecules pass to the left and B molecules pass to the right, separating the mixtures.
The separated gases have lower entropy than the mixed gas.
No matter what, the demon has to perform some kind of computation: it has
to measure a molecule, make a decision, and act accordingly. The resolution to the
paradox comes from understanding some thermodynamic properties of computation:
specifically, which kinds of computations are reversible and which ones are not.
Let us start with a very concrete model of a bit of information: a ball in a double-
well potential V (x) = (x2 − a2 )2 . We say that the ball in the left well represents
0, and the ball in the right well represents 1. Given a ball in the 1 state, we can
move it to the 0 state in a completely reversible manner without doing any work or
dissipating any heat. For example, we can hook the ball up via a pulley to another
ball in an exactly opposite potential −V(x). Now an infinitesimally small nudge will cause the ball to move from state 1 to state 0.
Suppose, however, that we do not know the initial state of the ball, and we
wish to move it to the 0 state. Importantly, we want an operation that works no
matter what the initial state is. Let us call this operation “SetToZero.” Almost by
definition, SetToZero cannot be reversible. The problem is that it maps two different
initial states (0 and 1) to a single final state 0. The laws of physics can always be
run backwards, so how can we achieve an irreversible operation? We must use the
environment.
As a concrete example, suppose that the ball experiences some friction. Then
our SetToZero operation can be: swing a hammer at the right well. If the ball is
already on the left, nothing will happen. If the ball is on the right, it will get kicked
143
Ph 12c 2024 16 Maxwell’s demon
over the barrier, and settle on the other side due to friction. Friction is important for
this, since otherwise the ball will just come back over the barrier. More abstractly,
in this example, information about the initial state has moved via heat dissipation
into the environment, where it is forgotten. In general, this must happen any time
we want to erase information, since erasing is an irreversible operation. This is
Definition (Landauer’s principle). Erasing information requires energy to be
dissipated as heat.
The heat that’s dissipated is at least τ ∆σ, where ∆σ is the information entropy
associated with the erased bits.
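For a single bit, the erased information entropy is ∆σ = log 2, so the minimum dissipated heat is τ log 2, or k_B T ln 2 in conventional units. A quick numerical sketch of the scale (the choice of room temperature, T = 300 K, is ours):

    import math

    k_B = 1.380649e-23            # Boltzmann constant in J/K (exact SI value)
    T = 300.0                     # room temperature in kelvin
    print(k_B * T * math.log(2))  # ~ 2.9e-21 J dissipated per erased bit

This is tiny compared to the energy dissipated per operation in present-day computers, but it is a fundamental floor for any demon, door, or memory bank.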
We can now see a problem with Maxwell’s demon. During operation, the demon
must measure a molecule. This measurement requires storing information, like “the
molecule is moving fast” in some kind of memory bank. The demon makes a decision
based on this information. Then before the next molecule arrives, the demon must
erase its memory bank. Erasure requires dissipating heat, which means the entropy
of that information is put into the environment.
As an example, consider the spring-loaded door. In order to function correctly,
the door must settle back down after each molecule comes through. This is a form
of erasing information: the door is more excited if a fast molecule hits it and less
excited if a slow molecule hits it, but either way it must settle down to the same
state. However, settling down requires sending away energy, and hence entropy, into
the environment.
Thus, the demon is ultimately taking entropy from the gas and sending it into
the environment. The total entropy of system plus environment is non-decreasing,
and the paradox is resolved.
One could ask: why must the demon erase its memory? Couldn’t it just leave
the information in memory? To understand the answer, let us imagine a memory
bank given by a ticker tape that can store a string of 0’s and 1’s. To avoid erasing
its memory, the demon must start filling up the ticker tape. After it has separated
a large number of molecules, it has created a long ticker tape with an essentially
random sequence of 0’s and 1’s. The entropy of the gas has turned into information
entropy of the ticker tape. An important insight is that we should count the informa-
tion entropy of the tape as physical entropy. After all, the tape is a physical system.
A tape with length N has at least 2^N different microstates, and these should be
treated on the same footing as the microstates of any other physical system. Thus,
in this case, the demon has transferred entropy from the gas to the tape, but it has
not decreased the total entropy of the combined gas-tape system.
The insight that information entropy should be counted as thermodynamic en-
tropy, and that this gives the resolution to the Maxwell’s demon paradox, was largely
due to Bennett in 1982.
A Review: trace
Let us review the trace and some of its properties. Consider a linear operator A.
Its matrix elements A_{ji} in an n-dimensional basis \vec{e}_i are defined by38

A\,\vec{e}_i = \sum_{j=1}^{n} A_{ji}\, \vec{e}_j. \qquad (A.1)

38: If \vec{e}_i is an orthonormal basis, we have A_{ji} = \vec{e}_j \cdot (A\vec{e}_i), or in Dirac notation A_{ji} = ⟨j|A|i⟩. However, what we say in this section holds in any basis.
The trace is the sum of the diagonal matrix elements, Tr(A) = \sum_i A_{ii}. Its key property is cyclicity, Tr(A_1 A_2 \cdots A_k) = Tr(A_2 \cdots A_k A_1); actually, this follows from cyclicity for a product of two matrices, Tr(AB) = Tr(BA) (exercise!). Cyclicity implies that the trace is independent of the choice of basis. In particular, suppose that A is diagonalizable,

A = U\, \text{diag}(\lambda_1, \ldots, \lambda_n)\, U^{-1}, \qquad (A.7)
where λ_1, . . . , λ_n are the eigenvalues of A. Then the trace is given by the sum of the eigenvalues:

\text{Tr}(A) = \text{Tr}(\text{diag}(\lambda_1, \ldots, \lambda_n)) = \sum_i \lambda_i. \qquad (A.8)
More generally, for any function f defined by a power series, f(A) = U\, \text{diag}(f(\lambda_1), \ldots, f(\lambda_n))\, U^{-1}. In particular,

\text{Tr}(f(A)) = \sum_i f(\lambda_i). \qquad (A.10)
This is the equation used in writing (3.38), in the case f(H) = e^{-H/\tau}.
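A quick numerical sketch of (A.10) with f(H) = e^{−H/τ} (assuming numpy and scipy; the matrix size and the value of τ are arbitrary choices):

    import numpy as np
    from scipy.linalg import expm

    rng = np.random.default_rng(1)
    A = rng.standard_normal((4, 4))
    H = (A + A.T) / 2                                  # a random Hermitian H
    tau = 0.7

    print(np.trace(expm(-H / tau)))                    # Tr f(H)
    print(np.exp(-np.linalg.eigvalsh(H) / tau).sum())  # sum_i f(lambda_i), same number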