Lectures On Kinetic Theory of Gases and Statistical Physics
Alexander A. Schekochihin†
The Rudolf Peierls Centre for Theoretical Physics, University of Oxford, Oxford OX1 3NP, UK
Merton College, Oxford OX1 4JD, UK
These are the notes for my lectures on Kinetic Theory and Statistical Physics, being part
of the 2nd -year course (Paper A1) at Oxford. I taught the course in 2011-18, jointly with
Professors Andrew Boothroyd (2011-15) and Julien Devriendt (2015-18). Only my part
of the course is covered in these notes. I will be grateful for any feedback from students,
tutors or sympathisers.
CONTENTS
[Part I. Basic Thermodynamics] 5
Part II. Kinetic Theory 5
1. Statistical Description of a Gas 5
1.1. Introduction 5
1.2. Energy 8
1.3. Thermodynamic Limit 9
1.4. Kinetic Calculation of Pressure 11
1.5. Isotropic Distributions 13
2. Classical Ideal Gas in Equilibrium 15
2.1. Maxwell’s Distribution 16
2.2. Equation of State and Temperature 19
2.3. Validity of the Classical Limit 21
2.3.1. Nonrelativistic Limit 21
2.3.2. No Quantum Correlations 21
3. Effusion 21
4. Collisions 25
4.1. Cross-section 25
4.2. Collision Time 26
4.3. Mean Free Path 26
4.4. Relative Speed 26
5. From Local to Global Equilibrium (Transport Equations) 28
5.1. Inhomogeneous Distributions 28
5.2. Local Maxwellian Equilibrium 29
5.3. Conservation Laws 30
5.3.1. Temperature 31
5.3.2. Velocity 33
5.4. Thermal Conductivity and Viscosity 35
5.5. Transport Equations 37
† E-mail: [email protected]
5.6. Relaxation to Global Equilibrium 37
5.6.1. Initial-Value Problem: Fourier Decomposition 38
5.6.2. Dimensional Estimate of Transport Coefficients 39
5.6.3. Separation of Scales 39
5.6.4. Sources, Sinks and Boundaries 39
5.6.5. Steady-State Solutions 40
5.6.6. Time-Periodic Solutions 41
5.7. Diffusion 43
5.7.1. Derivation of the Diffusion Equation 43
5.7.2. Random-Walk Model 43
5.7.3. Diffusive Spreading 44
6. Kinetic Calculation of Transport Coefficients 45
6.1. A Nice but Dodgy Derivation 45
6.1.1. Viscosity 45
6.1.2. Thermal Conductivity 47
6.1.3. Why This Derivation is Dodgy 47
6.2. Kinetic Expressions for Fluxes 48
6.3. Kinetic Equation 49
6.4. Conservation Laws and Fluid Equations 50
6.4.1. Number Density 50
6.4.2. Momentum Density 51
6.4.3. Energy Density 52
6.5. Collision Operator 55
6.6. Solution of the Kinetic Equation 56
6.7. Calculation of Fluxes 57
6.7.1. Momentum Flux 58
6.7.2. Heat Flux 58
6.8. Calculation of Fluxes in 3D 60
6.9. Kinetic Theory of Brownian Particles 61
6.9.1. Langevin Equation 61
6.9.2. Diffusion in Velocity Space 62
6.9.3. Brownian Motion 63
6.9.4. Kinetic Equation for Brownian Particles 64
6.9.5. Diffusion in Position Space 65
Part III. Foundations of Statistical Mechanics 66
7. From Microphysics to Macrophysics 66
7.1. What Are We Trying to Do? 66
7.2. The System and Its States 67
7.3. Pressure 67
8. Principle of Maximum Entropy 68
8.1. Quantifying Ignorance 68
8.1.1. Complete Ignorance 69
8.1.2. Some Knowledge 69
8.1.3. Assignment of Likelihoods 69
8.1.4. Some properties of Gibbs–Shannon Entropy 71
8.1.5. Shannon’s Theorem 72
8.2. Method of Lagrange Multipliers 74
8.3. Test of the Method: Isolated System 76
9. Canonical Ensemble 77
9.1. Gibbs Distribution 77
9.2. Construction of Thermodynamics 78
9.3. Some Mathematical Niceties 79
9.4. Third Law 80
9.5. Part I Obviated, Road Ahead Clear 81
10. Thermodynamic Equilibria and Stability 83
10.1. Additivity of Entropy 83
10.2. Thermal Equilibrium 84
10.3. Physical Interpretation of the Canonical Ensemble 86
10.4. Mechanical and Dynamical Equilibria 86
10.4.1. Thermal Equilibrium 88
10.4.2. Mechanical Equilibrium 88
10.4.3. Dynamical Equilibrium 89
10.5. Stability 90
10.5.1. Thermal Stability 90
10.5.2. Dynamical Stability 91
11. Statistical Mechanics of Classical Monatomic Ideal Gas 92
11.1. Single-Particle States 92
11.2. Down the Garden Path… 93
11.3. Single-Particle Partition Function 93
11.4. Digression: Density of States 94
11.5. Disaster Strikes 94
11.6. Gibbs Paradox 94
11.7. Distinguishability 95
11.8. Correct Partition Function 96
11.9. Thermodynamics of Classical Ideal Gas 97
11.10. Maxwell’s Distribution 98
12. P.S. Entropy, Ensembles and the Meaning of Probabilities 99
12.1. Boltzmann Entropy and the Ensembles 99
12.1.1. Boltzmann’s Formula 99
12.1.2. Microcanonical Ensemble 100
12.1.3. Alternative (Original) Construction of the Canonical Ensemble 103
12.2. Gibbs vs. Boltzmann and the Meaning of Probabilities 104
12.3. Whose Uncertainty? 106
12.4. Second Law 106
13. P.P.S. Density Matrix and Entropy in Quantum Mechanics 108
13.1. Statistical and Quantum Uncertainty 108
13.2. Density Matrix 109
13.3. Quantum Entropy and Canonical Ensemble 110
13.4. Time Evolution and the Second Law 110
13.5. How Information Is Lost 111
[Part IV. Statistical Mechanics of Simple Systems] 112
Part V. Open Systems 113
14. Grand Canonical Ensemble 113
14.1. Grand Canonical Distribution 113
14.2. Thermodynamics of Open Systems and the Meaning of Chemical Potential 115
14.3. Particle Equilibrium 117
14.4. Grand Partition Function and Chemical Potential of Classical Ideal Gas 118
14.5. Equilibria of Inhomogeneous Systems 120
14.6. Chemical Potential and Thermodynamic Potentials 122
14.6.1. Free Energy 122
14.6.2. Gibbs Free Energy 122
14.6.3. Meaning of Grand Potential 123
15. Multi-Species (Multi-Component) Systems 125
15.1. Generalisation of the Grand Canonical Formalism to Many Species 125
15.1.1. Gibbs Free Energy vs. µs 126
15.1.2. Fractional Concentrations 127
15.2. Particle Equilibrium and Gibbs Phase Rule 127
15.3. Chemical Equilibrium 127
15.4. Chemical Equilibrium in a Mixture of Classical Ideal Gases: Law of Mass Action 129
Part VI. Quantum Gases 131
16. Quantum Ideal Gases 131
16.1. Fermions and Bosons 131
16.2. Partition Function 132
16.3. Occupation Number Statistics and Thermodynamics 133
16.4. Calculations in Continuum Limit 134
16.4.1. From Sums to Integrals 134
16.4.2. Chemical Potential of a Quantum Ideal Gas 136
16.4.3. Classical Limit 136
16.4.4. Mean Energy of a Quantum Ideal Gas 137
16.4.5. Grand Potential of a Quantum Ideal Gas 138
16.4.6. Equation of State of a Quantum Ideal Gas 138
16.4.7. Entropy and Adiabatic Processes 138
16.5. Degeneration 139
17. Degenerate Fermi Gas 141
17.1. Fermi Energy 142
17.2. Mean Energy and Equation of State at T = 0 142
17.3. Heat Capacity 145
17.3.1. Qualitative Calculation 145
17.3.2. Equation of State at T > 0 146
17.3.3. Quantitative Calculation: Sommerfeld Expansion 146
18. Degenerate Bose Gas 149
18.1. Bose-Einstein Condensation 149
18.2. Thermodynamics of Degenerate Bose Gas 152
18.2.1. Mean Energy 153
18.2.2. Heat Capacity 153
18.2.3. Equation of State 153
19. Thermal Radiation (Photon Gas) 155
[Part VII. Thermodynamics of Real Gases] 155
To play the good family doctor who warns about reading something prematurely,
simply because it would be premature for him his whole life long—I’m not the man
for that. And I find nothing more tactless and brutal than constantly trying
to nail talented youth down to its “immaturity,” with every other sentence
a “that’s nothing for you yet.” Let him be the judge of that! Let him
keep an eye out for how he manages.
Thomas Mann, Doctor Faustus
PART I
Basic Thermodynamics
This part of the course was taught by Professors Andrew Boothroyd and Julien Devriendt.
PART II
Kinetic Theory
1. Statistical Description of a Gas
1.1. Introduction
You have so far encountered two basic types of physics:
1) Physics of single objects (or of groups of just a few such objects). For classical
(macroscopic) objects, we had a completely deterministic description based on Newton’s
2nd Law: given initial positions and velocities of all participating objects and the forces
acting on them (or between them), we could predict their behaviour forever. In the case of
microscopic objects, this failed and had to be replaced by Quantum Mechanics—where,
however, we again typically deal with single (or not very numerous) objects and can
solve differential equations that determine, eventually, probabilities of quantum states
(generalising the classical-mechanical notions of momentum, energy, angular momentum, etc.).
2) Physics of systems composed of very many particles—the subject of Thermodynamics (Part I)—described by macroscopic quantities (P, V, T, U, S, etc.) and general laws relating them, with no reference to the systems' microscopic constituents.
It is clear that a link between the two must exist—and we would like to understand how it works, both for our general peace of mind and for the purposes of practical
calculation: for example, whereas the relationship between energy, heat, pressure and
volume could be established and then the notions of temperature and entropy introduced
without specifying what the system under consideration was made of, we had, in order to
make practical quantitative predictions, to rely on experimentally determined empirical
relations between P, V, and T (the equation of state) and, U being internal energy, between U, V, and T (the energy equation).
Statistical Mechanics (which we will study from Part III onwards) will deal with the
question of how, given some basic microphysical information about properties of a system
under consideration and some very general principles that a system in equilibrium must
respect, we can derive the thermodynamics of the system (including, typically, U(V, T), the equation of state P(V, T), the entropy S(V, T), and hence heat capacities, etc.).
Kinetic Theory (which we are about to study for the simple case of classical monatomic
ideal gas) is concerned not just with the properties of systems in equilibrium but
also—indeed, primarily—with how the equilibrium is reached and so how the collective
properties of a system evolve with time. This will require both a workable model of
the constituent particles of the system and of their interaction (collisions). Equilibrium
properties will also be derived, but with less generality than in Statistical Mechanics.
We study Kinetic Theory first because it is somewhat less abstract and more intuitive
than Statistical Mechanics (and we will recover all our equilibrium results later on in
Statistical Mechanics). Also, it is convenient, in formulating Statistical Mechanics, to
refer to some basic knowledge of Quantum Mechanics, whereas our treatment of Kinetic
Theory will be completely classical.
…imprecisions[1] would quickly change the solution, because any error in the initial conditions grows exponentially fast with time.
Let me give you an example to illustrate the last point.[2] Imagine we have a set of
billiard balls on a frictionless table, we set them in motion (at t = 0) and want to observe
them as time goes on. We could, in principle, solve their equations of motion and predict
where they will all be and how fast they will be moving at any time t > 0. It turns
out that if someone enters the room during this experiment, the small deflections of the
balls due to the intruder’s gravitational pull will accumulate to alter their trajectories
completely after only ∼ 10 collisions!
Proof (Fig. 1). For simplicity of this very rough estimate, let us consider all the balls to be fixed in space, except for one, which moves and collides with them. Assume that the balls have radius r and are separated by a typical distance l ≫ r; the intruder's gravitational pull produces a small initial displacement ∆x of the moving ball's trajectory, i.e., an initial angular error ∆θ₀ ∼ ∆x/l, and each collision off a sphere of radius r amplifies the angular error by a factor ∼ l/r.
[1] And of course any saved data will always have finite precision!
[2] I am grateful to G. Hammett for pointing out this example to me.
Therefore, after n collisions, it will be
$$\Delta\theta_n \sim \left(\frac{l}{r}\right)^n \Delta\theta_0 \sim \frac{\Delta x}{l}\left(\frac{l}{r}\right)^n. \tag{1.5}$$
In order to estimate the number of collisions after which the trajectory changes significantly, we calculate n such that ∆θₙ ∼ 1:
$$n \sim \frac{\ln(l/\Delta x)}{\ln(l/r)} \sim 10, \quad \text{q.e.d.} \tag{1.6}$$
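To put numbers into Eq. (1.6) (illustrative values, not from the text): if the balls are separated by about ten times their radius, l/r ∼ 10, then even an initial perturbation as small as ∆x/l ∼ 10⁻¹⁰ gives n ∼ ln(10¹⁰)/ln 10 = 10 collisions before the trajectory is completely altered.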
The basic idea is that if errors grow exponentially with the number of collisions that a
particle undergoes, you do not need very many collisions to amplify to order unity even
very tiny initial perturbations (this is sometimes referred to as the “butterfly effect,” after
the butterfly that flaps its wings in India, producing a small perturbation that eventually
precipitates a hurricane in Britain; cf. Bradbury 1952). A particle of air at 1 atm at room
temperature has ∼ 10⁹ collisions per second (we will derive this in §4). Therefore, particle
motion becomes essentially random—meaning chaotic, deterministically unpredictable in
practice even for a classical system.
Thus, a particle-by-particle deterministic description [Eq. (1.1)] is useless. Is this a setback? In fact, this is fine, because we really are only interested in the bulk properties of our system, not the motion of individual particles.[3] If we can relate those bulk properties to averages over particle motion, we will determine everything we wish to know.
Let us see how this is done.
1.2. Energy
So, we model our gas as a collection of moving point particles of mass m, whose
positions r and velocities v are random variables. If we consider a volume of such a gas
with no spatial inhomogeneities, then all positions r are equiprobable.
The mean energy of the N particles comprising this system is
$$\langle E\rangle = N\left\langle\frac{m v^2}{2}\right\rangle, \tag{1.7}$$
where ⟨mv²/2⟩ is the mean energy of a particle and we assume that all particles have the same statistical distribution of velocities. In general, particles may have a mean velocity, i.e., the whole system may be moving at some speed in some direction:
$$\langle \boldsymbol v\rangle = \boldsymbol u. \tag{1.8}$$
Let v = u + w, where w is the peculiar velocity, for which ⟨w⟩ = 0 by definition. Then
$$\langle E\rangle = N\left\langle\frac{m}{2}\,|\boldsymbol u + \boldsymbol w|^2\right\rangle = \underbrace{\frac{M u^2}{2}}_{\equiv\,K} + \underbrace{N\left\langle\frac{m w^2}{2}\right\rangle}_{\equiv\,U}, \tag{1.9}$$
where M = N m. The energy consists of the kinetic energy of the system as a whole, K,
and the internal energy, U . It is U that appears in thermodynamics (“heat”)—the mean
energy of the disordered motion of the particles (“invisible motion,” as they called it in
the 19th century). The motion is disordered in the sense that it is random and has zero
[3] We will learn in §11.7 that, in fact, talking about the behaviour of individual particles in a gas is often meaningless anyway, because particles can be indistinguishable.
mean: ⟨w⟩ = 0. For now, we will assume u = 0[4] and so
$$U = \langle E\rangle = N\left\langle\frac{m v^2}{2}\right\rangle. \tag{1.10}$$
1.3. Thermodynamic Limit
How well does the mean energy U represent the actual energy of the system? The latter is a random variable,
$$E = \sum_i \frac{m v_i^2}{2}, \tag{1.11}$$
where the index i runs through all N particles in the system and v_i is the velocity of the ith particle. Then the mean square energy fluctuation is
$$\begin{aligned}
\langle(E-U)^2\rangle = \langle E^2\rangle - U^2 &= \left\langle\sum_{i,j}\frac{m v_i^2}{2}\frac{m v_j^2}{2}\right\rangle - \left(\sum_i\left\langle\frac{m v_i^2}{2}\right\rangle\right)^2 \\
&= \sum_i\left\langle\frac{m^2 v_i^4}{4}\right\rangle + \sum_{i\neq j}\left\langle\frac{m v_i^2}{2}\right\rangle\left\langle\frac{m v_j^2}{2}\right\rangle - \left(\sum_i\left\langle\frac{m v_i^2}{2}\right\rangle\right)^2 \\
&= N\left\langle\frac{m^2 v^4}{4}\right\rangle + N(N-1)\left\langle\frac{m v^2}{2}\right\rangle^2 - N^2\left\langle\frac{m v^2}{2}\right\rangle^2 \\
&= N\,\frac{m^2}{4}\left(\langle v^4\rangle - \langle v^2\rangle^2\right).
\end{aligned} \tag{1.12}$$
Note that, in the second line of this calculation, we are only allowed to write ⟨v_i² v_j²⟩ = ⟨v_i²⟩⟨v_j²⟩ for i ≠ j if we assume that velocities of different particles are independent random variables, an important caveat. From Eq. (1.12), we find that the relative root-mean-square fluctuation of energy is
$$\frac{\Delta E_{\rm rms}}{U} \equiv \frac{\langle(E-U)^2\rangle^{1/2}}{U} = \frac{\left[N\,(m^2/4)\left(\langle v^4\rangle - \langle v^2\rangle^2\right)\right]^{1/2}}{N\langle m v^2/2\rangle} = \left(\frac{\langle v^4\rangle}{\langle v^2\rangle^2} - 1\right)^{1/2}\frac{1}{\sqrt{N}} \ll 1. \tag{1.13}$$
This is very small for N ≫ 1 because the prefactor in the above formula is clearly independent of N, as it depends only on single-particle properties, viz., the moments ⟨v²⟩ and ⟨v⁴⟩ of a particle's velocity.
Exercise 1.1. If you like mathematical exercises, figure out how to prove that ⟨v⁴⟩ > ⟨v²⟩², whatever the distribution of v—so we are not taking the square root of a negative number!
[4] Since we have already assumed that the system is homogeneous, we must have u = const across the system and so, if u is also constant in time, we can just go to a frame moving with velocity u. We will relax the homogeneity assumption in §5.
The result (1.13) implies that the distribution of the system's total energy E (which is a random variable[5] because particle velocities are random variables) is very sharply peaked around its mean U = ⟨E⟩: the width of this peak is ∼ ∆E_rms/U ∼ 1/√N ≪ 1 for N ≫ 1 (Fig. 2). This is called the thermodynamic limit—the statement that mean quantities for systems of very many particles approximate extremely well the exact properties of the system.[6]
I hope to have convinced you that averages do give us a good representation of the
actual state of the system, at least when the number of constituent particles is large.
Exercise 1.2. Consider a volume V within a large vessel of ideal gas with mean number density n, and let p_N be the probability that this volume contains exactly N particles.
a) Show that the mean particle number is ⟨N⟩ = nV.
b) Show that the relative fluctuation of the particle number is
$$\frac{\Delta N_{\rm rms}}{\langle N\rangle} = \frac{1}{\sqrt{\langle N\rangle}} \tag{1.14}$$
(so fluctuations around the average are very small if ⟨N⟩ ≫ 1).
c) Show that, if ⟨N⟩ ≫ 1, p_N has its maximum at N ≈ ⟨N⟩ = nV; then show that in the vicinity of this maximum, the distribution of N is Gaussian:
$$p_N \approx \frac{1}{\sqrt{2\pi n V}}\, e^{-(N-nV)^2/2nV}. \tag{1.15}$$
Hint. Use Stirling's formula for N! and Taylor-expand ln p_N around N = nV.
The result of (a) is, of course, intuitively obvious, but it is nice to be able to prove it
[5] Unless the system is completely isolated, in which case E = const (see §12.1.2). However, completely isolated systems do not really exist (or are, at any rate, inaccessible to observation, on account of being completely isolated) and it tends to be more interesting and more useful to think of systems in which some exchange with the outside world is permitted and only mean quantities, in particular the mean energy, are fixed (§9).
[6] This may break down if there are strong correlations between particles, i.e., ⟨v_i² v_j²⟩ ≠ ⟨v_i²⟩⟨v_j²⟩: indeed, as I noted after Eq. (1.12), our result is only valid if the averages can be split. Fluctuations in strongly coupled systems, where the averages cannot be split, can be very strong. This is why we focus on the "ideal gas" (non-interacting particles; see §2).
Figure 3. Kinetic calculation of pressure. Particles within the volume Avz t will hit area A
during time t and bounce, each delivering momentum 2mvz to the wall.
mathematically and even to work out with what precision it holds, as you have done in (b)—
another demonstration that the world is constructed in a sensible way.
Our objective now is to work out how an important bulk property of a volume of
gas—pressure P felt by the walls of a container (or by a body immersed in the gas, or by
an imaginary surface separating one part of the gas from another)—is related to average
properties of the velocity distribution of the moving particles.
Particles hit a surface (wall) and bounce off; we assume that they do it elastically.
Recall that pressure is force per unit area, and force is the rate of change of momentum.
Therefore, pressure on the wall is the momentum delivered to the wall by the bouncing
particles per unit time per unit area (“momentum flux ”).
Let z be the direction perpendicular to the wall (Fig. 3). When a particle bounces off the wall, the projection of its velocity on the z axis changes sign,
$$v_z^{\rm (after)} = -v_z^{\rm (before)}, \tag{1.16}$$
while the two other components of the velocity (v_x and v_y) are unchanged. Therefore, the momentum delivered by the particle to the wall is
$$\Delta p = 2 m v_z. \tag{1.17}$$
Consider the particles the z component of whose velocity lies in a small interval [v_z, v_z + dv_z], where dv_z ≪ v_z. Then the contribution of these particles to pressure is
$$dP(v_z) = \Delta p\, d\Phi(v_z) = 2 m v_z\, d\Phi(v_z), \tag{1.18}$$
where dΦ(vz ) is the differential particle flux, i.e., the number of particles with velocities
in the interval [vz , vz + dvz ] hitting the wall per unit time per unit area. In other words,
if we consider a wall area A and time t, then
$$d\Phi(v_z) = \frac{dN(v_z)}{A t}. \tag{1.19}$$
Here dN(v_z) is the number of particles with velocity in the interval [v_z, v_z + dv_z] that hit area A over time t:
$$dN(v_z) = A v_z t \cdot n \cdot f(v_z)\, dv_z, \tag{1.20}$$
where A v_z t is the volume where a particle with velocity v_z must be to hit the wall during time t, n = N/V is the number density of particles in the gas and f(v_z) dv_z is, by definition, the fraction of particles whose velocities are in the interval [v_z, v_z + dv_z]. The differential particle flux is, therefore,
$$d\Phi(v_z) = n v_z f(v_z)\, dv_z \tag{1.21}$$
(perhaps this is just obvious without this lengthy explanation).
We have found that we need to know the particle distribution function (“pdf”) f (vz ),
which is the probability density function (also “pdf”) of the velocity distribution for a
single particle—i.e., the fraction of particles in our infinitesimal interval, f (vz )dvz , is
the probability for a single particle to have its velocity in this interval.[7] As always in
probability theory, the normalisation of the pdf is
$$\int_{-\infty}^{+\infty} f(v_z)\, dv_z = 1 \tag{1.22}$$
(the probability for a particle to have some velocity between −∞ and +∞ is 1). We
assume that all particles have the same velocity pdf: there is nothing special, statistically,
about any given particle or subset of particles and they are all in equilibrium with each
other.
From Eqs. (1.18) and (1.21), we have
$$dP(v_z) = 2 m n v_z^2 f(v_z)\, dv_z. \tag{1.23}$$
To get the total pressure, we integrate this over all particles with v_z > 0 (those that are moving towards the wall rather than away from it):
$$P = \int_0^\infty 2 m n v_z^2 f(v_z)\, dv_z. \tag{1.24}$$
Let us further assume that f(v_z) = f(−v_z), i.e., there is no preference for motion in any particular direction (e.g., the wall is not attractive). Then
$$P = m n \int_{-\infty}^{+\infty} v_z^2 f(v_z)\, dv_z = m n \langle v_z^2\rangle. \tag{1.25}$$
The pdf that I have introduced was in 1D, describing particle velocities in one direction only. It is easily generalised to 3D: let me introduce f(v_x, v_y, v_z), which I will abbreviate as f(v), such that f(v) dv_x dv_y dv_z is the probability for the particle velocity to be in the "cube" v ∈ [v_x, v_x + dv_x] × [v_y, v_y + dv_y] × [v_z, v_z + dv_z] (mathematically speaking, this is a joint probability for three random variables v_x, v_y and v_z). Then the 1D pdf of v_z is simply
$$f(v_z) = \int_{-\infty}^{+\infty} dv_x \int_{-\infty}^{+\infty} dv_y\, f(v_x, v_y, v_z). \tag{1.26}$$
[7] In §12.2, we will examine somewhat more critically this "frequentist" interpretation of probabilities. A more precise statistical-mechanical definition of f(v_z) will be given in §11.10.
Therefore, the pressure is
$$P = m n \int d^3 v\, v_z^2 f(\boldsymbol v) = m n \langle v_z^2\rangle\,. \tag{1.27}$$
So the pressure on a wall is simply proportional to the mean square z component of the velocity of the particles, where z, by definition, is the direction perpendicular to the wall on which we are calculating the pressure.[8]
If the distribution is isotropic, ⟨v_z²⟩ = ⟨v²⟩/3, and so
$$P = \frac{1}{3}\, m n \langle v^2\rangle = \frac{2U}{3V}\,, \tag{1.29}$$
where V is the volume of the system and U is its mean internal energy (defined in §1.3). We have discovered the interesting result that in isotropic, 3D systems, pressure is equal to 2/3 of the mean internal energy density (Exercise: what is it in an isotropic 2D system?). This relationship between pressure and the energy of the particles makes physical sense: pressure is to do with how vigorously particles bombard the wall and that depends on how fast they are, on average.
How large are the particle velocities? In view of Eq. (1.29) for pressure, we can relate them to a macroscopic quantity that you might have encountered before: the sound speed in a medium of pressure P and mass density ρ = mn is (omitting constants of order unity) c_s ∼ √(P/ρ) ∼ ⟨v²⟩^{1/2} ∼ 300 m/s [cf. Eq. (2.17)].
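For instance (illustrative numbers): for air at room temperature, m ≈ 4.8 × 10⁻²⁶ kg and T ≈ 300 K, so √(P/ρ) = √(k_B T/m) ≈ 290 m/s, indeed comparable to the measured sound speed of ≈ 340 m/s.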
For future use, let us see what isotropy implies for the pdf. Obviously, f in an isotropic system must be independent of the direction of v; it is a function of the speed v = |v| alone:
$$f(\boldsymbol v) = f(v). \tag{1.30}$$
This amounts to the system being spherically symmetric in v space, so it is convenient to change the v-space variables to polar coordinates (Fig. 4):
$$(v_x, v_y, v_z) \to (v, \theta, \phi). \tag{1.31}$$
If we know f(v_x, v_y, v_z), what is the joint pdf of v, θ, φ, which we will denote f̃(v, θ, φ)? Here is how pdfs transform under change of variables:
$$f(\boldsymbol v)\, dv_x\, dv_y\, dv_z = f(\boldsymbol v)\underbrace{\left|\frac{\partial(v_x, v_y, v_z)}{\partial(v, \theta, \phi)}\right|}_{\text{Jacobian}} dv\, d\theta\, d\phi = \underbrace{f(\boldsymbol v)\, v^2 \sin\theta}_{\tilde f(v,\theta,\phi)}\, dv\, d\theta\, d\phi. \tag{1.32}$$
[8] This raises the interesting possibility that pressure need not, in general, be the same in all directions—a possibility that we will eliminate under the additional assumptions of §1.5, but resurrect in Exercise 1.4.
Figure 4. Polar coordinates in velocity space. The factor of sin θ in Eq. (1.33) accounts for the
fact that, if the particles are uniformly distributed over a sphere |v| = v, there will be fewer of
them in azimuthal bands at low θ than at high θ (the radius of an azimuthal band is v sin θ).
Thus,
$$\tilde f(v, \theta, \phi) = f(\boldsymbol v)\, v^2 \sin\theta = f(v)\, v^2 \sin\theta. \tag{1.33}$$
The last equality is a consequence of isotropy [Eq. (1.30)]. It implies that an isotropic distribution of particle velocities is uniform in φ, but not in θ, and the pdf of particle speeds is[9]
$$\tilde f(v) = \int_0^\pi d\theta \int_0^{2\pi} d\phi\, \tilde f(v, \theta, \phi) = 4\pi v^2 f(v)\,. \tag{1.34}$$
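Equation (1.34) is easy to check numerically; the following sketch (my own illustration, with f taken to be a Gaussian of unit variance in each velocity component) histograms sampled speeds against 4πv²f(v):

```python
import numpy as np

rng = np.random.default_rng(1)
# Isotropic f(v): each component Gaussian with unit variance, so that
# f(v) = (2*pi)**(-3/2) * exp(-v^2/2).
v = rng.normal(size=(1_000_000, 3))
speed = np.linalg.norm(v, axis=1)

hist, edges = np.histogram(speed, bins=50, range=(0.0, 5.0), density=True)
centres = 0.5 * (edges[:-1] + edges[1:])
predicted = 4 * np.pi * centres**2 * (2 * np.pi) ** -1.5 * np.exp(-centres**2 / 2)
# The histogrammed speed pdf agrees with 4*pi*v^2*f(v) to sampling accuracy:
print(np.max(np.abs(hist - predicted)))
```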
Exercise 1.3. a) Prove, using v_x = v cos φ sin θ, v_y = v sin φ sin θ and directly calculating integrals in v-space polar coordinates, that
$$\langle v_x^2\rangle = \langle v_y^2\rangle = \frac{1}{3}\langle v^2\rangle. \tag{1.36}$$
b) Calculate also ⟨v_x v_y⟩, ⟨v_x v_z⟩, ⟨v_y v_z⟩. Could you have worked out the outcome of this last calculation from symmetry arguments?
The answer to the last question is yes. Here is a smart way of computing ⟨v_i v_j⟩, where i, j = x, y, z (in fact, you can do all this not just in 3D, but in any number of dimensions, i, j = 1, 2, …, d). Clearly, ⟨v_i v_j⟩ is a symmetric rank-2 tensor (i.e., a tensor, or matrix, with two indices, that remains the same if these indices are swapped). Since the velocity distribution is isotropic, this tensor must be rotationally invariant (i.e., not change under rotations of the
[9] Blundell & Blundell (2009) call the distribution of speeds f and the distribution of vector velocities g, so my f is their g and my f̃ is their f. This variation in notation should help you keep alert and avoid mechanical copying of formulae from textbooks.
coordinate frame). The only symmetric rank-2 tensor that has this property is a constant times the Kronecker delta δ_ij. So it must be the case that
$$\langle v_i v_j\rangle = C\,\delta_{ij}, \tag{1.37}$$
where C can only depend on the distribution of speeds v (not vectors v). Work out what C is. Is it the same in 2D and in 3D? This is a much simpler derivation than doing velocity integrals directly, but it was worth checking the result by direct integration, as you did above, to convince yourself that the symmetry magic works.
c*) Now that you know that it works, calculate ⟨v_i v_j v_k v_l⟩ in terms of averages of moments of v (i.e., averages of powers of v such as ⟨v²⟩ or ⟨v⁴⟩).
Hint. Doing this by direct integration would be a lot of work. Generalise the symmetry argument given above: see what symmetric rotationally invariant rank-4 tensors (i.e., tensors with 4 indices) you can cook up: it turns out that they have to be products of Kronecker deltas, e.g., δ_ij δ_kl; what other combinations are there? Then ⟨v_i v_j v_k v_l⟩ must be a linear combination of these tensors, with coefficients that depend on moments of v. By examining the symmetry properties of ⟨v_i v_j v_k v_l⟩, work out what these coefficients are. How does the answer depend on the dimensionality of the world (2D, 3D, dD)?
Exercise 1.4. Consider an anisotropic system, where there exists one (and only one) special
direction in space (call it z), which affects the distribution of particle velocities (an example of
such a situation is a gas of charged particles—plasma—in a straight magnetic field along z).
a) How many variables does the velocity distribution function now depend on? (Recall that
in the isotropic case, it depended only on one, v.) Write down the most general form of
the distribution function under these symmetries—what is the appropriate transformation of
variables from (vx , vy , vz )?
b) In terms of averages of these new velocity variables, what is the expression for the pressure P_∥ that the gas will exert on a wall perpendicular to the z axis? (It is called P_∥ because it is due to particles whose velocities have non-zero projections onto the special direction z.) What is P_⊥, the pressure on any wall parallel to z?
c) Now consider a wall the normal to which, n̂, is at an angle θ to z. What is the pressure on this wall in terms of P_∥ and P_⊥?
Exercise 1.5. Consider an insulated cylindrical vessel filled with monatomic ideal gas. The
cylinder is closed on one side and plugged by a piston on the other side. The piston is very
slowly pulled out (its velocity u is much smaller than the typical velocities of the gas molecules).
Show, using kinetic theory, not thermodynamics, that during this process the pressure P and volume V of the gas inside the vessel are related by P V^{5/3} = const.
Hint. Consider how the energy of a gas particle changes after each collision with the piston
and hence calculate the rate of change of the internal energy of the gas inside the vessel.
[Ginzburg et al. 2006, #307]
2. Classical Ideal Gas in Equilibrium
Our model gas is defined by the following assumptions.
• Particles do not interact (e.g., they do not attract or repel each other), except for having elastic binary collisions, during which they conserve total momentum and energy, and do not fracture or stick.
• They are point particles, i.e., they do not occupy a significant fraction of the
system’s volume, however many of them there are. This assumption is necessary
to ensure that a particle’s ability to be anywhere in space is not restricted by being
crowded out by other particles. We will relax this assumption for “real gases” in Part VII.
• They are classical particles, so there are no quantum correlations (which would
jeopardise a particle’s ability to have a particular momentum if the corresponding
quantum state(s) is(are) already occupied by other particles). We will relax this
assumption for “quantum gases” in Part VI.
• They are non-relativistic particles, i.e., their speeds are v c. You will have an
opportunity to play around with relativistic gases later on (e.g., Exercise 11.3).
In practice, all this is satisfied if the gas is sufficiently dilute (low enough number den-
sity n) and sufficiently hot (high enough temperature T ) to avoid Quantum Mechanics,
but not so hot as to run into Relativity. I will make these constraints quantitative after
I define T (see §2.3).
2.1. Maxwell's Distribution
Consider our model gas in a container of volume V and assume that there are no
changes to external (boundary) conditions or fields—everything is homogeneous in time
and space.
Let us wait long enough for a sufficient number (=a few) of collisions to occur
so all memory of initial conditions is lost (recall the discussion in §1.1 of how that
happens; roughly how long we must wait we will be able to estimate after we discuss
collisions in §4).
We will call the resulting state an equilibrium in the sense that it will be statistically
stationary, i.e., the particles in the gas will settle into some velocity distribution inde-
pendent of time, position or initial conditions (NB: it is essential to have collisions to
achieve this!). How the gas attains such a state will be the subject of §§5–6.
Since the distribution function f (v) does not depend on anything, we must be able to
work out what it is from some general principles.
First of all, if there are no special directions in the system, the pdf must be isotropic
[Eq. (1.30)]:
$$f(\boldsymbol v) = f(v) = g(v^2), \tag{2.1}$$
where g is some function of v² (introduced for the convenience of the upcoming derivation).
Exercise 2.1. In the real world, you might object, there are always special directions. For
example, gravity (particles have mass!). After we have finished deriving f (v), think under what
condition gravity can be ignored.
Also, the Earth is rotating in a definite direction, so particles in the atmosphere are subject
to Coriolis and centrifugal forces. Under what condition can these forces be ignored?
Maxwell (1860) argued (or conjectured) that the three components of the velocity vector must be independent random variables.[10] Then
$$f(\boldsymbol v) = h(v_x^2)\, h(v_y^2)\, h(v_z^2), \tag{2.2}$$
[10] It is possible to prove this for the classical ideal gas either from Statistical Mechanics (see §11.10) or by analysing elastic binary collisions (Boltzmann 1995; Chapman & Cowling 1991), but here we will simply assume that this is true.
where all three distributions are the same because of isotropy and depend only on squares
of velocity components assuming mirror symmetry of the distribution (invariance with
respect to the transformation v → −v; this means there are no flows or fluxes in the
system).
But in view of isotropy [Eq. (2.1)], Eq. (2.2) implies
$$h(v_x^2)\, h(v_y^2)\, h(v_z^2) = g(v^2) = g(v_x^2 + v_y^2 + v_z^2). \tag{2.3}$$
Denoting further
$$\varphi(v_x^2) \equiv \ln h(v_x^2) \quad\text{and}\quad \psi(v^2) \equiv \ln g(v^2), \tag{2.4}$$
we find
$$\varphi(v_x^2) + \varphi(v_y^2) + \varphi(v_z^2) = \psi(v_x^2 + v_y^2 + v_z^2). \tag{2.5}$$
Such a functional relationship can only be satisfied if φ and ψ are linear functions of their arguments:
$$\varphi(v_x^2) = -\alpha v_x^2 + \beta \quad\text{and}\quad \psi(v^2) = -\alpha v^2 + 3\beta. \tag{2.6}$$
Here α and β are as yet undetermined integration constants and the minus sign is purely a matter of convention (α will turn out to be positive).
Proof. Differentiate Eq. (2.5) with respect to v_x², keeping v_y² and v_z² constant: φ′(v_x²) = ψ′(v_x² + v_y² + v_z²). The left-hand side is independent of v_y and v_z, so ψ′ must be a constant; call it −α. Integrating then gives Eq. (2.6), with β an integration constant.
Exponentiating, the pdf is f(v) = e^{ψ(v²)} = C e^{−αv²}, where C ≡ e^{3β} is fixed by normalisation [the 3D analogue of Eq. (1.22)]:
$$1 = C\int d^3 v\, e^{-\alpha v^2} = C\int dv_x\, e^{-\alpha v_x^2}\int dv_y\, e^{-\alpha v_y^2}\int dv_z\, e^{-\alpha v_z^2} = C\left(\sqrt{\frac{\pi}{\alpha}}\right)^3. \tag{2.13}$$
Therefore,
$$C = \left(\frac{\alpha}{\pi}\right)^{3/2} \quad\Rightarrow\quad f(\boldsymbol v) = \left(\frac{\alpha}{\pi}\right)^{3/2} e^{-\alpha v^2}. \tag{2.14}$$
Thus, we have expressed f (v) in terms of only one scalar parameter α! Have we derived
something from nothing? Not quite: the functional form of the pdf followed from a set
of assumptions about (statistical) symmetries of the equilibrium state.
It is claimed (by Kapitsa 1974) that the problem of finding the distribution of particle velocities in a gas was routinely set by Stokes at a graduate exam in Cambridge in the mid-19th century—the answer was unknown and Stokes' purpose was to check whether the examinee had the erudition to realise this. To Stokes' astonishment, a student called James Clerk Maxwell solved the problem during his exam.
All these manipulations are well and good, but to understand what v_th ≡ α^{−1/2} (the width of the Maxwellian) really is, we need to relate it to something measurable. What is measurable about a gas in a box? The two most obviously measurable quantities are
—pressure (we can measure force on a wall),
—temperature (we can stick in a thermometer, as it is defined in Thermodynamics).
We will see in the next section how to relate vth , T and P .
Exercise 2.2. a) Work out a general formula for ⟨vⁿ⟩ (n is an arbitrary positive integer) in terms of v_th, for a Maxwellian gas (Hint: it is useful to consider separately odd and even n). If n < m, what is larger, ⟨vⁿ⟩^{1/n} or ⟨v^m⟩^{1/m}? Why is this, qualitatively?
b*) What is the distribution of speeds f̃(v) in a Maxwellian d-dimensional gas?
Hint. This involves calculating the area of a d-dimensional unit sphere in velocity space.
c) Obtain the exact formula for the rms energy fluctuation in a Maxwellian gas (see §1.3).
2.2. Equation of State and Temperature
For the Maxwellian distribution (2.14), ⟨v_z²⟩ = 1/(2α) = v_th²/2, so the pressure (1.27) is
$$P = m n \langle v_z^2\rangle = \frac{n m v_{\rm th}^2}{2}\,. \tag{2.17}$$
This provides us with a clear relationship between v_th and the thermodynamic quantities P and n = N/V. Furthermore, we know empirically[11] that, for 1 mole of ideal gas (N = N_A = 6.022140857 × 10²³, the Avogadro number of particles),
$$PV = RT, \quad R = 8.31\ {\rm J\,K^{-1}\,mol^{-1}}\ \text{(the gas constant)}, \tag{2.18}$$
and T here is the absolute temperature as defined in Thermodynamics (via the Zeroth Law etc.; see Part I). Another, equivalent, form of this equation of state is
$$P = n k_B T, \quad\text{where}\ k_B = \frac{R}{N_A} = 1.3807\times 10^{-23}\ {\rm J/K}\ \text{(the Boltzmann constant)}. \tag{2.19}$$
Comparing Eqs. (2.19) and (2.17), we can extract the relationship between v_th and the thermodynamic temperature:
$$\frac{m v_{\rm th}^2}{2} = k_B T\,. \tag{2.20}$$
Thus, temperature in Kinetic Theory is simply the kinetic energy of a particle moving at the most probable speed in the Maxwellian velocity distribution,[12] or, vice versa, the width of the Maxwellian is related to temperature via
$$v_{\rm th} = \sqrt{\frac{2 k_B T}{m}}\,. \tag{2.21}$$
[11] From the thermodynamic experiments of Boyle 1662, Mariotte 1676 (P ∝ 1/V at constant T), Charles 1787 (V ∝ T at constant P), Gay-Lussac 1809 (P ∝ T at constant V) and Amontons 1699 (who anticipated the latter two by about a century). To be precise, what we know empirically is that Eq. (2.18) holds for the thermodynamically defined quantities P and T in most gases as long as they are measured in parameter regimes in which we expect the ideal gas approximation to hold.
[12] The Boltzmann constant k_B is just a dimensional conversion coefficient owing its existence to the fact that historically T is measured in K rather than in units of energy (as it should have been).
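To get a feel for the numbers (an illustrative aside): at T = 300 K, Eq. (2.21) gives v_th ≈ 420 m/s for nitrogen (m ≈ 4.7 × 10⁻²⁶ kg) and v_th ≈ 1.1 km/s for helium (m ≈ 6.6 × 10⁻²⁷ kg).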
Two other, equivalent, statements of this sort are that (Exercise: prove them)
$$\frac{1}{2}\, k_B T = \frac{m\langle v_x^2\rangle}{2}, \tag{2.22}$$
the mean energy per particle per degree of freedom, and, recalling the definition of U, Eq. (1.10), that
$$\frac{3}{2}\, k_B T = \frac{U}{N}, \tag{2.23}$$
the mean energy per particle.[13] From Eq. (2.23), the heat capacity of the monatomic classical ideal gas is
$$C_V = \frac{3}{2}\, k_B N. \tag{2.24}$$
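For one mole of gas, this is C_V = (3/2) N_A k_B = (3/2) R ≈ 12.5 J/K.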
Finally, using our expression (2.21) for v_th, we arrive at the traditional formula for the Maxwellian: Eq. (2.15) becomes
$$f(\boldsymbol v) = \left(\frac{m}{2\pi k_B T}\right)^{3/2} \exp\left(-\frac{m v^2}{2 k_B T}\right). \tag{2.25}$$
This is a particular case (which we have here derived for our model gas) of a much more
general statistical-mechanical result known as the Gibbs distribution—exactly how to
recover Maxwell from Gibbs will be explained in §11.10.
The above treatment has not just given us the particle-velocity pdf in equilibrium—
we have also learned something new and important about the physical meaning of
temperature, which has turned out to measure how energetic, on average, microscopic
particles are. This is progress compared to Thermodynamics, where T was a purely
macroscopic and rather mysterious (if indispensable) quantity: recall that the defining
property of T was that it was some quantity that would equalise across a system in
equilibrium (e.g., if two systems with initially different temperatures were brought into
contact); in Thermodynamics, we were able to prove that such a quantity must exist,
but we could not explain exactly what it was or how the equalisation happened. It is
now clear how it happens for two volumes of gas when they are mixed together: particles
collide and eventually attain a global Maxwellian distribution with a single parameter
α ⇔ vth ⇔ T . When a gas touches a hot or cold wall, particles of the gas collide with the
vibrating molecules of the wall—the energy of this vibration is also proportional to T ,
as we will see in Statistical Mechanics—and again attain a Maxwellian with the same T .
To summarise, we now have the full thermodynamics of classical monatomic ideal gas:
specific formulae for energy U = U (N, T ), Eq. (2.23), heat capacity CV = CV (N ),
Eq. (2.24), equation of state P = P (N, V, T ), Eq. (2.19), etc. In addition, we know the
full velocity distribution, Eq. (2.25), and so can calculate other interesting things, which
thermodynamics is ignorant of (effusion, §3, will be the first example of that, followed
by the great and glorious theory of heat and momentum transport, §§5–6).
[13] Note that one sometimes defines temperature in Kinetic Theory via Eqs. (2.23), (2.22) or (2.20) and then proves the equivalence of this "kinetic temperature" and the thermodynamic temperature (see, e.g., Chapman & Cowling 1991).
2.3. Validity of the Classical Limit
Here are two very quick estimates for the range of temperatures in which the classical results derived above should hold.
2.3.1. Nonrelativistic Limit
The particle motions must be nonrelativistic: v_th ≪ c, i.e., k_B T ≪ mc².
2.3.2. No Quantum Correlations
Quantum correlations must be negligible: the mean interparticle distance must be large compared with the particles' thermal de Broglie wavelength, n^{−1/3} ≫ λ_dB = h/mv_th.[14]
3. Effusion
Let us practice our newly acquired knowledge of particle distributions (§2) and calcu-
lations of fluxes (§1.4) on a simple, but interesting problem.
Consider a container filled with ideal gas and make a small hole in it (Fig. 7). Suppose the hole is so small that its diameter satisfies
$$d \ll \lambda_{\rm mfp}, \tag{3.1}$$
where λ_mfp is the particle mean free path (the typical distance that particles travel
[14] Another way to get this is to demand that the volume per particle should contain many de Broglie wavelengths λ_dB = h/p associated with the thermal motion: n λ_dB³ ∼ n(h/mv_th)³ ≪ 1.
between collisions—we will calculate it in §4). Then macroscopically the gas does not "know" about the hole—this is a way to abduct particles without changing their distribution.[15] This can be a way to find out, non-invasively, what the velocity distribution is
inside the container, provided we have a way of measuring the velocities of the escaping
particles. On an even more applied note, we might be interested in what happens in this
set up because we are concerned about gas leaks through small holes in some industrially
important walls or partitions.
There are two obviously interesting quantitative questions we can ask:
(i) Given some distribution of particles inside the container, f (v), what will be the
distribution of the particles emerging from the hole?
(ii) Given the area A of the hole, how many particles escape through it per unit time?
(i.e., what is the particle flux through the hole?)
The answers are quite easy to obtain. Indeed, this is just like the calculation of pressure
(§1.4): there we needed to calculate the flux of momentum carried by the particles hitting
an area of the wall; here we need the flux of particles themselves that hit an area of the
wall (hole of area A)—these particles will obviously be the ones that escape through the
hole. Taking, as in §1.4, z to be the direction perpendicular to the wall, we find that the
(differential) particle flux, i.e., the number per unit time per unit area of particles with
velocities in the 3D cube [v, v + d³v], is [see Eq. (1.21)]
$$d\Phi(\boldsymbol v) = n v_z f(\boldsymbol v)\, d^3 v = n\,\underbrace{v^3 f(v)\, dv}_{\text{speed distribution}}\,\underbrace{\cos\theta\,\sin\theta\, d\theta\, d\phi}_{\text{angular distribution}}, \tag{3.2}$$
where, in the second expression, we assumed that the distribution is isotropic, f(v) = f(v), and used v_z = v cos θ and d³v = v² sin θ dv dθ dφ.
Thus, we have the answer to our question (i) and conclude that the distribution of the emerging particles is neither isotropic nor Maxwellian (even if the gas inside the container is Maxwellian). The angle distribution is not isotropic (has an extra cos θ factor) because particles travelling nearly perpendicularly to the wall (small θ) escape with greater probability.[16] The speed distribution is not Maxwellian (has an extra factor of v; Fig. 8) because faster particles get out with greater probability (somewhat like the
[15] In §5, we will learn what happens when the gas does "know" and why the hole has to be larger than λ_mfp for that.
[16] However, there are fewer of these particles in the original isotropic angle distribution ∝ sin θ dθ dφ, so, statistically, it is θ = 45° that is the most probable angle for the effusing particles.
Figure 8. Speed distribution of effusing particles: favours faster particles more than the
Maxwellian; see Eq. (3.3).
smarter students passing with greater probability through the narrow admissions filter
into Oxford—not an entirely deterministic process though, just like effusion).
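One can see the bias towards faster particles in a small simulation (an illustrative sketch of mine: the escape probability of a particle is taken proportional to v_z for v_z > 0, as in the flux (3.2)):

```python
import numpy as np

rng = np.random.default_rng(2)
vth = 1.0
v = rng.normal(0.0, vth / np.sqrt(2), size=(2_000_000, 3))  # Maxwellian gas
speed = np.linalg.norm(v, axis=1)

w = np.clip(v[:, 2], 0.0, None)  # flux weighting: effusion favours large v_z > 0
mean_inside = speed.mean()                   # <v> = 2*vth/sqrt(pi) ~ 1.13*vth
mean_effusing = (w * speed).sum() / w.sum()  # <v^2>/<v> = 3*sqrt(pi)/4 ~ 1.33*vth
print(mean_inside, mean_effusing)  # the effusing particles are faster on average
```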
Exercise 3.1. a) Consider a gas effusing out through a small hole into an evacuated sphere,
with the particles sticking to the internal surface of the sphere once they hit it. Show that this
would produce a uniform coating of the surface.
b) Show that the distribution of the speeds of the particles that might be found in transit
between the effusion hole and the surface at any given time is the same as for a Maxwellian gas.
If we are only interested in the distribution of speeds, we can integrate out the angular dependence in Eq. (3.2): the flux through the hole of particles with speeds in the interval [v, v + dv] is
$$d\tilde\Phi(v) = n v^3 f(v)\, dv \int_0^{\pi/2}\! d\theta\, \cos\theta\,\sin\theta \int_0^{2\pi}\! d\phi = \pi n v^3 f(v)\, dv = \frac{1}{4}\, n v \tilde f(v)\, dv, \tag{3.3}$$
where f̃(v) is the distribution of speeds inside the container, related to f(v) via Eq. (1.34). Note the upper limit of integration with respect to θ: it is π/2 and not π because only particles moving toward the hole (v_z = v cos θ > 0) will escape through it.
Finally, the total flux of effusing particles (number of particles per unit time per unit area escaping through the hole, no matter what their speed) is
$$\Phi = \int_0^\infty dv\, \frac{1}{4}\, n v \tilde f(v) = \frac{1}{4}\, n \langle v\rangle, \tag{3.4}$$
where ⟨v⟩ is the average particle speed inside the container. For a Maxwellian distribution, f̃(v) is given by Eq. (2.16) and so ⟨v⟩ can be readily computed:
$$\Phi = \frac{1}{4}\, n \sqrt{\frac{8 k_B T}{\pi m}} = \frac{P}{\sqrt{2\pi m k_B T}}\,. \tag{3.5}$$
The fact that, given P and T, the effusion flux Φ ∝ m^{−1/2} implies that if we put a mixture of two particle species into a box with a small hole and let them effuse, the lighter species will effuse at a larger rate than the heavier one, so the composition of the blend emerging on the
other side of the hole will favour the lighter particles. This has applications to separation of
isotopes that are strictly on a need-to-know basis.
Exercise 3.2. Show that the condition of no mass flow between two insulated chambers
containing ideal gas at pressures P1,2 and temperatures T1,2 and connected by a tiny hole is
$$\frac{P_1}{\sqrt{T_1}} = \frac{P_2}{\sqrt{T_2}}\,. \tag{3.6}$$
What would be the condition for no flow if the hole between the chambers were large (d ≫ λ_mfp)?
Exercise 3.3. What is the energy flux through the hole? (i.e., what is the energy lost by the
gas in the container per unit time, as particles leave by a hole of area A?)
Exercise 3.4. Consider a thermally insulated container of volume V with a small hole of area
A, containing a gas with molecular mass m. At time t = 0, the density is n0 and the temperature
is T0 . As gas effuses out through a small hole, both density and temperature inside the container
will drop. Work out their time dependence, n(t) and T (t), in terms of the quantities given above.
What is the characteristic time over which they will change significantly?
Hint. Temperature is related to the total energy of the particles in the container. The flux of
energy of the effusing particles will determine the rate of change of energy inside the container
in the same way as the particle flux determines the rate of change of the particle number (and,
therefore, their density). Based on this principle, you should be able to derive two differential
(with respect to time) equations for two unknowns, n and T . Having derived them, solve them.
Exercise 3.5. A festive helium balloon of radius R = 20 cm made of a soft but unstretchable
material is tied to a lamppost. The material is not perfect and can have microholes of approximate radius r = 10⁻⁵ cm, through which helium will be leaking out. As this happens, the balloon shrinks under atmospheric pressure.
a) Assuming the balloon material is a good thermal conductor, calculate how many microholes
per cm2 the balloon can have if it is to lose no more than 10% of its initial volume over one
festive week.
b) Now suppose the balloon material is a perfect thermal insulator. Repeat the calculation.
Exercise 3.6. Consider two chambers of equal volume separated by an insulating wall and
containing an ideal gas maintained at two distinct temperatures T1 < T2 . Initially the chambers
are connected by a long tube (Fig. 9) whose diameter is much larger than the mean free path in either chamber, and equilibrium is established (while maintaining T₁ and T₂). Then the tube is removed, the chambers are sealed, but a small hole is opened in the insulating wall, with diameter d ≪ λ_mfp (where the mean free path is for either gas).
a) In what direction will the gas flow through the hole, from cold to hot or from hot to cold?
b) If the total mass of the gas in both chambers is M , show that the mass ∆M transferred
Figure 10. Cross section, collision time and mean free path.
through the hole from one chamber to the other before a new equilibrium is established is
$$\Delta M = \frac{\sqrt{T_1 T_2}}{\sqrt{T_1}+\sqrt{T_2}}\,\frac{\sqrt{T_2}-\sqrt{T_1}}{T_1+T_2}\, M. \tag{3.7}$$
[Ginzburg et al. 2006, #427]
4. Collisions
We argued (on plausible symmetry grounds) that in equilibrium, we should expect the
pdf to be Maxwellian for an ideal gas. “In equilibrium” meant that initial conditions were
forgotten, i.e., that particles had collided a sufficient number of times. There are certain
constraints on the time scales on which the gas is likely to be in equilibrium (how long
do we wait for the gas to “Maxwellianise”?) and on the spatial scales of the system if we
are to describe it in these terms. Namely,
• t ≫ τ_c, the collision time, or the typical time that a particle spends in free flight between collisions (it is also convenient to define the collision rate ν_c = 1/τ_c, the typical number of collisions a particle has per unit time);
• l ≫ λ_mfp, the mean free path, or the typical distance a particle travels between collisions.
In order to estimate τc and λmfp , we will have to bring in some information and some
assumptions about the microscopic properties of the gas and the nature of collisions.
4.1. Cross-section
Assume that particles are hard spheres of diameter d. Then they can be considered to
collide if their centres approach each other within the distance d. Think of a particle with
velocity v moving through a cylinder (Fig. 10) whose axis is v and whose cross-section is
$$\sigma = \pi d^2. \tag{4.1}$$
As the particle will necessarily collide with any other particle whose centre is within this cylinder, σ is called the collisional cross-section.
A useful way of parametrising the more general situation, in which particles are not hard spheres but instead interact with each other via some smooth potential (e.g., charged particles feeling each other's Coulomb potential), is to introduce an "effective cross-section," in which case d tells you how close they have to get to have a "collision," i.e., to be significantly deflected from a straight path.
Exercise 4.1. Coulomb Collisions. For particles with charge e, mass m and temperature T ,
estimate d.
4.2. Collision Time
Moving through the imaginary cylinder of cross-section σ, a particle sweeps the volume σvt over time t. The average number of other particles in this volume is σvtn. If this is > 1, then there will be at least one collision during the time t. Thus, we define the collision time t = τ_c so that
$$\sigma v \tau_c n = 1 \quad\Rightarrow\quad \tau_c = \frac{1}{\sigma n v}\,, \qquad \nu_c = \frac{1}{\tau_c} = \sigma n v. \tag{4.2}$$
As we are interested in a "typical" particle, v here is some typical speed. For a Maxwellian distribution, we may pick any of these:
$$v \sim \langle v\rangle \sim v_{\rm rms} \sim v_{\rm th}. \tag{4.3}$$
All these speeds have different numerical coefficients (viz., ⟨v⟩ = 2v_th/√π, v_rms = √(3/2) v_th), but we are in the realm of order-of-magnitude estimates here, so it does not really matter which we choose. To fix the notation, let us define
$$\tau_c = \frac{1}{\nu_c} = \frac{1}{\sigma n v_{\rm th}} = \frac{1}{\sigma n}\sqrt{\frac{m}{2 k_B T}}\,. \tag{4.4}$$
4.3. Mean Free Path
The typical distance a particle travels between collisions is then
$$\lambda_{\rm mfp} = v_{\rm th}\,\tau_c = \frac{1}{\sigma n}\,. \tag{4.5}$$
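To attach numbers to Eqs. (4.2)–(4.5) (an illustrative estimate of mine; the effective molecular diameter d ≈ 3 × 10⁻¹⁰ m for air is an assumed value, not given in the text):

```python
import numpy as np

kB = 1.3807e-23        # J/K
T, P = 300.0, 1.013e5  # room temperature, atmospheric pressure
m = 4.7e-26            # kg, roughly an N2 molecule
d = 3.0e-10            # m, assumed effective molecular diameter

n = P / (kB * T)               # number density, from Eq. (2.19)
vth = np.sqrt(2 * kB * T / m)  # thermal speed, Eq. (2.21)
sigma = np.pi * d**2           # cross-section, Eq. (4.1)
nu_c = sigma * n * vth         # collision rate, Eq. (4.2)
print(n, vth, nu_c, vth / nu_c)
# ~2.4e25 m^-3, ~420 m/s, ~3e9 s^-1 and lambda_mfp ~ 1.4e-7 m,
# consistent with the ~1e9 collisions per second quoted in Section 1.1.
```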
4.4. Relative Speed
If you have a suspicious mind, you might worry that the arguments above are somewhat dodgy: indeed, we effectively assumed that while our chosen particle moved through its σvt cylinder, all other particles just sat there waiting to be collided with. Surely what matters is, in fact, the relative speed of colliding particles? This might prompt us to introduce the following definition for the mean collision rate, which is conventional:
$$\nu_c = \sigma n \langle v_r\rangle, \tag{4.6}$$
where v_r = |v₁ − v₂| is the relative speed of a pair of particles and ⟨v_r⟩ is its mean. It is more or less obvious that ⟨v_r⟩ ∼ v_th, just like any other speed in a Maxwellian distribution (what else could it possibly be?!), but let us convince ourselves of this anyway (it is also an instructive exercise to calculate ⟨v_r⟩).
By definition,
$$\langle v_r\rangle = \int d^3 v_1 \int d^3 v_2\, |\boldsymbol v_1 - \boldsymbol v_2|\, f(\boldsymbol v_1, \boldsymbol v_2), \tag{4.7}$$
where f(v₁, v₂) is the joint two-particle distribution function (i.e., the pdf that the first velocity is in a d³v₁ interval around v₁ and the second in d³v₂ around v₂). Now we make a key assumption:
$$f(\boldsymbol v_1, \boldsymbol v_2) = f(\boldsymbol v_1)\, f(\boldsymbol v_2), \tag{4.8}$$
i.e., the two particles' velocities are independent. This makes sense as long as we are considering
them before they have undergone a collision—remember that particles are non-interacting in an ideal gas, except for collisions.[17] Taking the single-particle pdfs f to be Maxwellian, we get
$$\langle v_r\rangle = \int d^3 v_1 \int d^3 v_2\, |\boldsymbol v_1 - \boldsymbol v_2|\, \frac{1}{(\pi v_{\rm th}^2)^3}\exp\left(-\frac{v_1^2}{v_{\rm th}^2} - \frac{v_2^2}{v_{\rm th}^2}\right).$$
Changing variables to the relative velocity v_r = v₁ − v₂ and the centre-of-mass velocity V = (v₁ + v₂)/2, so that v₁² + v₂² = 2V² + v_r²/2,
$$\langle v_r\rangle = \int d^3 v_r\, v_r \int d^3 V\, \frac{1}{(\pi v_{\rm th}^2)^3}\exp\left(-\frac{2V^2}{v_{\rm th}^2} - \frac{v_r^2}{2 v_{\rm th}^2}\right) = \int d^3 v_r\, v_r\, \frac{1}{(\sqrt{2\pi}\, v_{\rm th})^3}\exp\left(-\frac{v_r^2}{2 v_{\rm th}^2}\right) = \sqrt{2}\,\langle v\rangle = \frac{2\sqrt{2}\, v_{\rm th}}{\sqrt{\pi}} = 4\sqrt{\frac{k_B T}{\pi m}}\,. \tag{4.9}$$
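The result (4.9) is also easy to verify by direct sampling (a minimal sketch of mine, in arbitrary units):

```python
import numpy as np

rng = np.random.default_rng(3)
vth = 1.0
v1 = rng.normal(0.0, vth / np.sqrt(2), size=(1_000_000, 3))  # Maxwellian
v2 = rng.normal(0.0, vth / np.sqrt(2), size=(1_000_000, 3))  # independent Maxwellian

mean_vr = np.linalg.norm(v1 - v2, axis=1).mean()  # <|v1 - v2|>
mean_v = np.linalg.norm(v1, axis=1).mean()        # <v>
print(mean_vr / mean_v)  # ~ sqrt(2) ~ 1.414, as in Eq. (4.9)
```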
Exercise 4.3. Consider a gas that is a mixture of two species of molecules: type-1 with diameter
d1 , mass m1 and mean number density n1 and type-2 with diameter d2 , mass m2 and mean
number density n2 . If we let them collide with each other for a while, they will eventually settle
into a Maxwellian equilibrium and the temperatures of the two species will be the same.
a) What will be the rms speeds of each of the two species?
b) Show that the combined pressure of the mixture will be P = P1 + P2 (Dalton’s law).
c) What is the cross-section for the collisions between type-1 and type-2 molecules?
d) What is the mean collision rate of type-1 molecules with type-2 molecules? Is it the same
as the collision rate of type-2 molecules with type-1 molecules? (Think carefully about what
exactly you mean when you define these rates.)
Hint. In (d), you will need to find the mean relative speed of the two types of particles, a calculation analogous to the one in §4.4. Note, however, that as the masses of the particles of the two different types can be very different, the distinction between ⟨v_r⟩ and ⟨v₁⟩ or ⟨v₂⟩ can now be much more important than in the case of like-particle collisions.
[17] It certainly would not be sensible to assume that they are independent right after a collision. The assumption of independence of particle velocities before a collision is a key one in the derivation of Boltzmann's collision integral (Boltzmann 1995; Chapman & Cowling 1991) and is known as Boltzmann's Stosszahlansatz. Boltzmann's derivation would be a central topic in a more advanced course on Kinetic Theory (e.g., Dellar 2015).
5. From Local to Global Equilibrium (Transport Equations)
5.1. Inhomogeneous Distributions
We have so far discussed a very simple situation in which the gas was homogeneous, so
the velocity pdf f (v) described the state of affairs at any point in space and quantities
such as n, P , T were constants in space. This also meant that we could assume that there
were no flows (if there was a constant mean flow u, we could always go to the frame
moving with it). This is obviously not the most general situation: thus, we know from
experience that if we open a window from a warm room onto a cold Oxford autumn,
it will be colder near the window than far away from it (so T will be a function of
space), a draft may develop (mean flow u of air, with some gradients across the room),
etc. Clearly such systems will have a particle velocity distribution that is different in
different places. Let us therefore generalise our notion of the velocity pdf and introduce
the particle distribution function in the position and velocity space (“phase space”):
F(t, r, v) d³r d³v = average number of particles with velocities in the 3D v-space volume [v_x, v_x + dv_x] × [v_y, v_y + dv_y] × [v_z, v_z + dv_z] finding themselves in the spatial cube [x, x + dx] × [y, y + dy] × [z, z + dz] at time t.
I have followed convention in choosing the normalisation
$$\int d^3 r \int d^3 v\, F(t, \boldsymbol r, \boldsymbol v) = N, \tag{5.1}$$
the total number of particles (rather than 1). Clearly, the 0th velocity moment of F is the (position- and time-dependent) particle number density:
$$\int d^3 v\, F(t, \boldsymbol r, \boldsymbol v) = n(t, \boldsymbol r) \tag{5.2}$$
(the r integrals are always over the system's volume V). Note that in a homogeneous system,
$$F(t, \boldsymbol r, \boldsymbol v) = n f(\boldsymbol v) = \frac{N}{V}\, f(\boldsymbol v), \tag{5.3}$$
which gets us back to our old familiar homogeneous velocity pdf f(v) (which integrates to 1 over the velocity space).
If we know F (t, r, v), we can calculate other bulk properties of the gas, besides its
density (5.2), by taking moments of F , i.e., integrals over velocity space of various powers
of v multiplied by F .
Thus, the first moment,
$$\int d^3 v\, m \boldsymbol v\, F(t, \boldsymbol r, \boldsymbol v) = m n(t, \boldsymbol r)\, \boldsymbol u(t, \boldsymbol r), \tag{5.5}$$
is the mean momentum density, where u(t, r) is the mean velocity of the gas flow (without the factor of m, this expression, nu, is the mean particle flux).
A second moment gives the mean energy density:
$$\int d^3 v\, \frac{m v^2}{2}\, F(t, \boldsymbol r, \boldsymbol v) = \int d^3 w\, \frac{m|\boldsymbol u + \boldsymbol w|^2}{2}\, F = \frac{m u^2}{2}\underbrace{\int d^3 w\, F}_{=\,n(t,\boldsymbol r)} + m\boldsymbol u\cdot\underbrace{\int d^3 w\, \boldsymbol w\, F}_{=\,0\ \text{by definition of}\ \boldsymbol w} + \int d^3 w\, \frac{m w^2}{2}\, F = \underbrace{\frac{m n u^2}{2}}_{\substack{\text{energy density of mean motions;}\\ \boldsymbol u(t,\boldsymbol r)\ \text{given by Eq. (5.5)}}} + \underbrace{\left\langle\frac{m w^2}{2}\right\rangle n}_{\substack{\equiv\,\varepsilon(t,\boldsymbol r),\ \text{internal-energy}\\ \text{density (motions around the mean)}}}, \tag{5.6}$$
where we have utilised the decomposition of particle velocities into mean and peculiar parts, v = u(t, r) + w (cf. §1.2), where u is defined by Eq. (5.5). The total "ordered" energy and the total internal ("disordered") energy are [cf. Eq. (1.9)]
$$K = \int d^3 r\, \frac{m n u^2}{2} \quad\text{and}\quad U = \int d^3 r\, \varepsilon(t, \boldsymbol r), \tag{5.7}$$
respectively.
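A minimal numerical illustration of the decomposition (5.6) (my own sketch; the flow velocity, thermal speed and units are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
m, n = 1.0, 1.0                 # arbitrary units; a unit-density gas
u = np.array([0.3, 0.0, 0.0])   # mean flow velocity
vth = 1.0                       # thermal speed of the peculiar motion

w = rng.normal(0.0, vth / np.sqrt(2), size=(1_000_000, 3))  # peculiar velocities
v = u + w                                                   # total velocities

total = 0.5 * m * n * (v**2).sum(axis=1).mean()     # mean energy density
bulk = 0.5 * m * n * (u**2).sum()                   # m*n*u^2/2, "ordered" part
internal = 0.5 * m * n * (w**2).sum(axis=1).mean()  # epsilon = n*<m*w^2/2>
print(total, bulk + internal)  # equal up to sampling noise, as in Eq. (5.6)
```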
So how do we calculate F (t, r, v)?
[18] So we are now treating the limit opposite to what we considered when discussing effusion (§3).
5.2. Local Maxwellian Equilibrium
Everything is as before, but now locally: e.g., the pressure is [cf. Eq. (1.29)]
$$P(t, \boldsymbol r) = n(t, \boldsymbol r)\, k_B T(t, \boldsymbol r) = \frac{2}{3}\,\varepsilon(t, \boldsymbol r) \tag{5.11}$$
and, therefore, the local temperature is, by definition, 2/3 of the mean internal energy per particle:
$$k_B T(t, \boldsymbol r) = \frac{2}{3}\,\frac{\varepsilon(t, \boldsymbol r)}{n(t, \boldsymbol r)} = \frac{2}{3}\left\langle\frac{m w^2}{2}\right\rangle = \langle m w_x^2\rangle \tag{5.12}$$
[cf. Eqs. (2.22) and (2.23)].
It is great progress to learn that only three functions on a 3D space (r), viz., n, u and
T, completely describe the particle distribution in the 6D phase space (v, r).¹⁹ How then
do we determine these three functions?
Thermodynamics gives us a hint as to how they will evolve in time. We know that if we
put in contact two systems with different T , their temperatures will tend to equalise—so
temperature gradients between fluid elements must tend to relax—and this should be a
collisional process because that is how contact between particles with different energies
is made. The same is true of velocity gradients (we will prove this thermodynamically
in §10.4). But Thermodynamics just tells us that everything must tend from local to
global equilibrium (no gradients)—not how fast that happens or what the intermediate
stages in this evolution look like. Kinetic Theory will allow us to describe this route to
equilibrium quantitatively. We will also see what happens when systems are constantly
driven out of equilibrium (§§5.6.4–5.6.6).
But before bringing the full power of Kinetic Theory to bear on this problem (in §6),
we will first consider what can be said a priori about the evolution of n, u and T.²⁰
(0 because we can work in the frame moving with the centre of mass of the system), and
of the total energy:

∫d³r [mnu²/2 + ε] = K + U = const, where ε = (3/2) nkB T. (5.15)
¹⁹ NB: in §6.2, we will learn that, in fact, fluxes of momentum and energy—and, therefore,
transport phenomena—arise from small deviations of F from the local Maxwellian. Thus, the
local Maxwellian is not the whole story and even to determine the three functions that specify
this local Maxwellian, we will need to calculate small deviations of the particle distribution
from it.
²⁰ In the words of J. B. Taylor, one always ought to know the answer before doing the calculation:
"we don't do the bloody calculation because we don't know the answer, we do it because we
have a conscience!"
Without knowing any Kinetic Theory, can we establish from these constraints the general
form of the evolution equations for n, u and T ? Yes, we can!
5.3.1. Temperature
For simplicity, let us first consider a situation in which nothing moves on average
(u = 0) and n = const globally. Then all energy in the system is internal energy, with
density nc1T, and only temperature is inhomogeneous. Here c1 is the heat capacity per
particle: for a monatomic ideal gas, c1 = 3kB/2, but I will use c1 in what follows to mark
the results that are valid also for gases or other substances with different values of
c1—because these results rely on conservation of energy and little else.²¹
To simplify even further, consider a 1D problem, where T = T (t, z) varies in one
direction only. Internal energy (heat) will flow from hot to cold regions (as we know from
Thermodynamics), so there will be a heat flux :
Jz (z) = internal energy flowing along z per unit time through unit area perpendicular
to the z axis.
Then the rate of change of internal energy in a small volume A × [z − dz/2, z + dz/2] (A
is area; see Fig. 11) is²²

(∂/∂t)[nc1T · Adz] = Jz(z − dz/2) · A − Jz(z + dz/2) · A, (5.17)

where the left-hand side is the energy in the volume Adz and the two terms on the right
are the energy flowing in and flowing out.
²¹ Furthermore, n = const is a very good approximation for liquids and solids, but, in fact, quite
a bad one for a gas, even if all its motions are subsonic. There is a subtlety here, related to the
gas wanting to be in pressure balance—this is discussed at the end of §6.4.2 [around Eq. (6.25)],
but we will ignore it for now, for the sake of simplicity and to minimise the amount of algebra in
this initial derivation. Obviously, everything can be derived without these simplifications: the full
correct temperature equation is derived on general energy-conservation grounds in Exercise 5.3
[Eq. (5.37)] and systematically as part of the kinetic theory of transport in §6.4.3 [Eq. (6.39)].
²² Note that incompressibility (n = const) is useful here as it allows us not to worry about the
net flux of matter into (or out of) our volume. In the more general, compressible, case, this
contribution to the rate of change of internal energy turns up in the form of the ∇ · u term
in Eq. (5.37).
nc1 ∂T/∂t = −∇ · J. (5.20)

This is of course just a local statement of energy conservation.
Thus, if we can calculate the heat flux, J, we can determine the evolution of T.
Exercise 5.2. Continuity Equation. Now consider a gas with some mean flow velocity u(t, r)
and density n(t, r), both varying in (3D) space and time. What is the flux of particles through
a surface within such a system? Use the requirement of particle conservation to derive the
continuity equation
∂n/∂t = −∇ · (nu). (5.23)
5.3.2. Velocity
We can handle momentum conservation in a similar fashion. Let us again assume
n = const, but allow a z-dependent flow velocity in the x direction (this is called a shear
flow ):
u = ux(t, z) x̂. (5.25)
In this system, momentum will flow from fast- to slow-moving layers of the gas (because,
as we will learn below, they experience friction against each other, due to particle
collisions). We define momentum flux
Πzx (z) = momentum in the x direction flowing along z per unit time through unit area
perpendicular to the z axis.
Then, analogously to Eq. (5.17) (see Fig. 13),

(∂/∂t)[mnux · Adz] = Πzx(z − dz/2) · A − Πzx(z + dz/2) · A, (5.26)

where the left-hand side is the momentum in the volume Adz and the right-hand side
contains the momentum flowing in and flowing out, whence

mn ∂ux/∂t = −∂Πzx/∂z. (5.27)
Thus, in order to determine the evolution of velocity, we must calculate the momentum
flux.
Let us generalise this calculation. Let n(t, r) and u(t, r) both be functions of space and time.
Considering an arbitrary volume V of the gas, we can write the rate of change of momentum in
it as

(∂/∂t) ∫_V d³r mnu = −∫_∂V dS · Π, (5.28)

or, in tensor notation,

(∂/∂t) ∫_V d³r mnuj = −∫_∂V dSi Πij. (5.29)

The momentum flux is now a tensor (also known as the stress tensor): Πij is the flux of the
j-th component of momentum in the i direction (in the case of the shear flow, this tensor only
had one non-zero component, Πzx). Application of Gauss's Theorem gives us

(∂/∂t) mnuj = −∂i Πij. (5.30)
The momentum flux consists of three parts:
—one ("convective") due to the fact that the boundary of a fluid element containing the same
particles itself moves with velocity u: the flux of the j-th component of the momentum, mnuj,
due to this effect is mnuj u and so

Πij^(convective) = mnui uj, (5.31)

i.e., momentum "carries itself" (just like it carries particle density: recall the flux of particles
being nu in Eq. (5.23));
—one due to the fact that there is pressure in the system and pressure is also momentum flux,
viz., the flux of each component of the momentum in the direction of that component (recall
§1.4: particles with velocity component vz transfer momentum in the z direction to the wall
perpendicular to z—in our current calculation, this pressure acts on the boundary of our chosen
volume V); thus, the pressure part of the momentum flux is diagonal:

Πij^(pressure) = P δij; (5.32)

—and, finally, one due to friction between layers of gas moving at different velocities; as we
have seen in §5.3.2, this part of the momentum-flux tensor, Πij^(viscous), will contain off-diagonal
elements, but we have not yet worked out how to calculate them.
Substituting these three contributions, viz.,

Πij = mnui uj + P δij + Πij^(viscous), (5.33)
into Eq. (5.30), we get

(∂/∂t) mnu = −∇ · (mnuu) − ∇P − ∇ · Π^(viscous), (5.34)

or, after using Eq. (5.23) to express ∂n/∂t,

mn [∂u/∂t + u · ∇u] = −∇P − ∇ · Π^(viscous). (5.35)

This is the desired generalisation of Eq. (5.27)—the evolution equation for the mean flow velocity
u(t, r). This equation says that fluid elements move around at their own velocity (the convective
time derivative on the left-hand side) and are subject to forces arising from pressure gradients
and friction (the right-hand side); if there are any other forces in the system, e.g., gravity,
those have to be put into the right-hand side of Eq. (5.35). Obviously, we still need to calculate
Π^(viscous) in order for this equation to be useful in actual calculations.
Eq. (5.35) will be derived from kinetic theory in §6.4.2.
Exercise 5.3. Energy Flows. Generalise Eq. (5.20) to the case of non-zero flow velocity
u(t, r) ≠ 0 and non-constant n(t, r). Consider the total energy density of the fluid,

mnu²/2 + (3/2) nkB T, (5.36)

and calculate the rate of change of the total energy inside a volume V due to two contributions:
energy being carried by the flow u through the boundary ∂V + work done by pressure on that
boundary. If you use Eqs. (5.23) and (5.35) to work out the time derivatives of n and u, in the
end you should be left with the following evolution equation for T:

(3/2) nkB [∂T/∂t + u · ∇T] = −∇ · J − nkB T ∇ · u − Πij^(viscous) ∂i uj. (5.37)

Interpret all the terms and identify the conditions under which the gas behaves adiabatically,
i.e., satisfies

[∂/∂t + u · ∇] (P/n^(5/3)) = 0. (5.38)

Eq. (5.37) will be derived from kinetic theory in §6.4.3.
Figure 14. (Gedanken) experiment to define and determine viscosity; see Eq. (5.43).
where I is a unit matrix. The latter expression is not immediately obvious—I will derive it
(extracurricularly) in §6.8 [Eq. (6.74)].
The proportionalities between fluxes and gradients expressed by Eqs. (5.39) and (5.40)
do indeed turn out to hold, experimentally, in a good range of physical parameters (n,
P , T ) and for very many substances (gases, fluids, or, in the case of Eq. (5.39), even
solids). The coefficients κ and η can be experimentally measured and tabulated even
if we know nothing of kinetics or microphysics. It is thus that physics—and certainly
engineering!—very often progresses to workable models without necessarily achieving
complete understanding right away.
For example, viscosity can be introduced and measured as follows. Set up an experiment with
two horizontal plates of area A at a vertical distance d from each other and a fluid (or gas)
between them, the lower plate stationary, the upper one being moved at a horizontal velocity
ux (Fig. 14). If one measures the force F that one needs to apply to the upper plate in order to
maintain a constant ux, one discovers that, for small enough d,

F/A = η ux/d ≈ η ∂ux/∂z, (5.43)

where the proportionality coefficient η is, by definition, the (dynamical) viscosity of the fluid.
Note that the relationships (5.39) and (5.40) are valid much more broadly than will
be the upcoming expressions for κ and η that we will derive for ideal gas. Thus, we can
talk about the viscosity of water or thermal conductivity of a metal, although neither
obviously can be viewed as a collection of non-interacting billiard-ball particles on any
level of simplification.
5.5. Transport Equations
If we now substitute Eqs. (5.39) and (5.40) into Eqs. (5.18) and (5.27), we obtain closed
equations for T and ux:

nc1 ∂T/∂t = κ ∂²T/∂z², (5.44)

mn ∂ux/∂t = η ∂²ux/∂z². (5.45)
These are the transport equations that we were after.
Note that in pulling κ and η out of the z derivative, we assumed them to be independent
of z: this is fine even though they do depend on T (which depends on z) as long as the
temperature gradients and, therefore, the temperature differences are not large on the
scales that we are considering and so κ and η can be approximated by constant values
taken at some reference temperature.
Let us make this quantitative. Let κ = κ(T) and assume that T = T0 + δT, where T0 = const
and all the temperature variation is contained in the small perturbation δT(t, z) ≪ T0. This
is indeed a commonplace situation: temperature variations in our everyday environment rarely
exceed ∼ 10% of the absolute temperature T ∼ 300 K. Then

κ(T) ≈ κ(T0) + κ′(T0) δT (5.46)

and so, from Eqs. (5.18) and (5.39),

nc1 ∂T/∂t = (∂/∂z)[κ(T) ∂T/∂z] ≈ κ(T0) ∂²δT/∂z² + κ′(T0) (∂/∂z)[δT ∂δT/∂z], (5.47)

but the second term is quadratic in the small quantity δT and so can be neglected, giving us
back Eq. (5.44) (after δT in the diffusion term is replaced by T, which is legitimate because the
constant part T0 vanishes under gradients).
satisfies²³

∂T̂/∂t = −DT k² T̂ ⇒ T̂(t, k) = T̂0(k) e^(−DT k² t). (5.54)

Thus, spatial variations (k ≠ 0) of temperature relax exponentially fast in time on the
diffusion time scale:

T̂(t, k) ∝ e^(−t/τdiff), τdiff = 1/(DT k²) ∼ l²/DT, (5.55)

where l ∼ k⁻¹ is the typical spatial scale of the variation and τdiff is, therefore, its typical
time scale.
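This exponential relaxation is easy to verify numerically. Here is a minimal Python sketch (an added illustration, with assumed parameter values) that integrates the diffusion equation with an explicit finite-difference scheme and checks the decay factor of Eq. (5.54).

```python
import numpy as np

# Minimal sketch: explicit finite-difference integration of the heat
# equation (5.44), written as dT/dt = DT*d2T/dz2 with DT = kappa/(n*c1).
# Parameter values are illustrative, not physical.
DT, L, Nz = 1.0, 1.0, 200
dz = L / Nz
dt = 0.25 * dz**2 / DT                 # stable for dt <= dz^2/(2*DT)
z = np.linspace(0.0, L, Nz, endpoint=False)
T = 300.0 + 10.0 * np.sin(2 * np.pi * z / L)   # small perturbation, dT << T0

for _ in range(2000):
    lap = (np.roll(T, -1) - 2 * T + np.roll(T, 1)) / dz**2  # periodic Laplacian
    T += dt * DT * lap

# the k = 2*pi/L mode decays by exp(-DT*k^2*t), cf. Eq. (5.54)
t, k = 2000 * dt, 2 * np.pi / L
print(np.ptp(T) / 20.0, np.exp(-DT * k**2 * t))  # the two numbers should agree
```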
The velocity diffusion governed by Eq. (5.49) is entirely analogous to the temperature
diffusion.
Recall that in arguing for a local Maxwellian, we required the assumption that these
scales were much greater than the spatial and time scales of particle collisions, λmfp and
τc [see Eq. (5.8)]. Are they? Yes, but to show this (and to be able to solve practical
problems), we still have to derive explicit expressions for DT and ν.
In what follows, we will do this not once but four times, in four different ways (which
highlight different aspects of the problem):
• a dimensional guess, a scoundrel’s last (or, in our case, first) refuge (§5.6.2),
• an estimate based on modelling collisions as particle diffusion, a physically important
insight (§5.7),
• a “pseudo-kinetic” derivation, dodgy but nice and simple (§6.1),
• a “real” kinetic derivation, more involved, but also more systematic, mathematically
appealing and showing how more complicated problems are solved (the rest of §6).
²³ For the full reconstruction of the solution T(t, z), see §5.7.3.
5.6.2. Dimensional Estimate of Transport Coefficients
As often happens, the quickest way to get the answer (or an answer) is a dimensional
guess. The dimensionality of diffusion coefficients is

[DT] = [ν] = length²/time. (5.56)

Clearly, transport of energy and momentum from one part of the system to another is due
to particles colliding. Therefore, both the energy- and momentum-diffusion coefficients
must depend on some quantities characterising particle collisions. We need a length and
a time: well, obviously, the mean free path λmfp and the collision time τc. Then [using
Eq. (4.5)]

DT ∼ ν ∼ λmfp²/τc ∼ vth² τc ∼ vth λmfp. (5.57)

This is indeed true (as properly proved in §6), although of course we cannot determine
numerical prefactors from dimensional analysis.
centrifugal forces in rotating systems, Lorentz force in conducting media, buoyancy force
in stratified media, etc.
Sources or sinks of heat and momentum can also take the form of boundary condi-
tions, e.g.,
—a surface kept at some fixed temperature,
—a given heat flux constantly pumped through a surface (perhaps via the latter being
in contact with a heat source generating heat at a given rate),
—a rate of cooling at a surface specified in terms of its temperature (e.g., Newton’s law
of cooling: cooling rate proportional to the temperature difference between the surface
of a body and the environment),
—a surface moving at a given velocity, etc.
5.6.5. Steady-State Solutions
Steady-state solutions arise when sources, sinks and/or boundary conditions are con-
stant in time and so cause time-independent temperature or velocity profiles to emerge.
For example, the force balance

η ∂²ux/∂z² + fx = 0 (5.61)

will imply some profile ux(z), given the spatial dependence of the force fx(z) and some
boundary conditions on ux (since the diffusion equation is second-order in z, two of those
are needed). The simplest case is fx = 0, ux(0) = 0, ux(L) = U, which instantly implies,
for z ∈ [0, L],

ux(z) = U z/L, (5.62)

a solution known as linear shear flow (Fig. 15).
Similarly, looking for steady-state solutions of Eq. (5.44) subject to both ends of the
domain being kept at fixed temperatures, T(0) = T1 and T(L) = T2 (Fig. 16a), we find

∂²T/∂z² = 0 ⇒ T(z) = T1 + (T2 − T1) z/L. (5.63)

Note that the simple linear profiles (5.62) and (5.63) are entirely independent of the
transport coefficients κ and η.
A slightly more sophisticated example is a set-up where, say, the bottom surface of the
system is heated at some known fixed rate, i.e., the heat flux through the z = 0 boundary
is specified, Jz(0) = J1, while the top surface is in contact with a fixed-temperature
thermostat, T(L) = T2 (Fig. 16b). Then Eq. (5.44) or, indeed, already Eq. (5.18) gives,
in steady state,

∂Jz/∂z = 0 ⇒ Jz = const = J1. (5.64)
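Combining this with Fourier's law (5.39), the resulting profile is linear, T(z) = T2 + (J1/κ)(L − z), with its slope set by the pumped flux. Here is a minimal Python sketch (an added illustration, with assumed parameter values) that relaxes Eq. (5.44) to this steady state with the flux boundary condition:

```python
import numpy as np

# Minimal sketch: relax the heat equation (5.44) to steady state with a
# fixed heat flux Jz(0) = J1 pumped in at the bottom and a thermostat
# T(L) = T2 at the top (Fig. 16b). In steady state, Jz = -kappa*dT/dz = J1
# everywhere, so T(z) = T2 + (J1/kappa)*(L - z). Values are illustrative.
kappa, J1, T2, L, Nz = 2.0, 5.0, 300.0, 1.0, 50
dz = L / Nz
z = np.linspace(0.0, L, Nz + 1)
T = np.full(Nz + 1, T2)              # arbitrary initial condition
dt = 0.2 * dz**2 / kappa             # stable explicit step (taking n*c1 = 1)

for _ in range(25_000):
    T[1:-1] += dt * kappa * (T[2:] - 2 * T[1:-1] + T[:-2]) / dz**2
    T[0] = T[1] + J1 * dz / kappa    # enforces -kappa*(T[1]-T[0])/dz = J1
    T[-1] = T2                       # thermostatted top surface

print(np.abs(T - (T2 + (J1 / kappa) * (L - z))).max())   # -> ~0 when converged
```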
Figure 16. Boundary conditions for the heat diffusion equation: (a) two thermostatted
surfaces, (b) a thermostatted surface and a heat source.
Exercise 5.4. Work out the steady-state temperature profile T (r) that will be maintained at
the radii r ∈ [r1 , r2 ] in an axisymmetric system where T (r1 ) = T1 and T (r2 ) = T2 .
Note that steady-state profiles of the kind described above, even though they are solutions of the
transport equations, are not necessarily stable solutions. Time-dependent motions can develop
as a result of small perturbations of the steady state (e.g., for convection, given large enough
temperature contrasts, the so-called Rayleigh-Bénard problem; see, e.g., Chandrasekhar 2003).
Indeed, it is very common for Nature to find such ways of relaxing gradients via instabilities
and resulting motions (turbulence) when the gradients (deviations from global equilibrium) are
strong and collisional/diffusive transport is relatively slow—Nature tends to be impatient with
non-equilibrium set-ups.
Just how impatient can be estimated very crudely in the following way. We might think of
mean fluid motions that develop in a system as carrying heat and momentum in a way somewhat
similar to what random-walking particles do (§5.7.2), but now moving parcels of fluid travel at
the typical flow velocity u and "collide" after some distance l representing the typical scale of
the motions. This gives rise to "turbulent diffusion" with diffusivity Dturb ∼ ul, analogous to
DT ∼ ν ∼ vth λmfp. Which of these is larger determines which controls transport. Their ratio,

Re = Dturb/ν ∼ (u/vth)(l/λmfp), (5.66)

known as the Reynolds number, is a product of, typically, a small number (u/vth) and a large
number (l/λmfp). The latter usually wins, except in fairly small systems or when flows are
very slow. In turbulent systems (Re ≫ 1), the heat and momentum transport is "anomalous",
meaning much faster than collisional.
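For a sense of the numbers, here is a back-of-the-envelope evaluation of Eq. (5.66) in Python (an added illustration, using rough textbook values for air, all assumed):

```python
# Minimal sketch: order-of-magnitude Reynolds number, Eq. (5.66), for a
# draft in a room. Rough values for air at room temperature (assumed).
u, v_th = 1.0, 500.0     # flow speed and thermal speed [m/s]
l, lam = 1.0, 7e-8       # flow scale and mean free path [m]
Re = (u / v_th) * (l / lam)
print(f"Re ~ {Re:.0e}")  # ~ 3e+04 >> 1: the large l/lam wins, the flow goes turbulent
```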
Figure 17. Temperature perturbation with given frequency ω penetrates ∼ a skin depth δω
into heat-conducting medium, Eq. (5.71).
The treatment of such cases is analogous to what we did with the relaxation of an
initial inhomogeneity in §5.6.1, but now the Fourier transform is in time rather than in
space. So, consider a semi-infinite domain, z ∈ [0, ∞), with the boundary condition
T(t, z = 0) = Σ_ω T̂0(ω) e^(−iωt) (5.67)

(say a building with the outer wall at z = 0 exposed to the elements, with ω's being the
frequencies of daily, annual, millennial etc. temperature variations). Then the solution of
Eq. (5.48) can be sought in the form

T(t, z) = Σ_ω T̂(ω, z) e^(−iωt), (5.68)
where δω is the typical scale on which temperature perturbations with frequency ω decay,
known as the skin depth—the further away from the boundary (and the higher the
frequency), the more feeble is the temperature variation that manages to penetrate there
(Fig. 17). Note that it also arrives at z > 0 with a time delay ∆t = z/(δω|ω|).
This was an example of “relaxation to equilibrium” effectively occurring in space
rather than in time.
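For concreteness, here is a small numerical sketch (an added illustration; it assumes the standard skin-depth expression δω = √(2DT/ω) for Eq. (5.48) with an oscillating boundary, cf. Eq. (5.71), and a rough soil diffusivity):

```python
import numpy as np

# Minimal sketch: skin depths for daily and annual temperature variations
# penetrating the ground, using delta_omega = sqrt(2*DT/omega).
# DT ~ 1e-6 m^2/s is a rough thermal diffusivity for soil (assumed value).
DT = 1e-6
for name, period in [("daily", 86_400.0), ("annual", 3.15e7)]:
    omega = 2 * np.pi / period
    print(f"{name}: delta ~ {np.sqrt(2 * DT / omega):.2f} m")
# daily ~ 0.17 m, annual ~ 3.2 m: why cellars stay cool in summer
```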
You see that once we have a diffusion equation for heat or momentum, solving it—and,
therefore, working out how systems return (or strive) to global equilibrium—becomes a
problem in applied mathematics rather than in physics (although interpreting the answer
still requires some physical insight). Returning to physics, the key piece of unfinished
business that remains is to calculate the diffusion coefficients DT and ν (or κ and η)
based on some theory of particle motion and collisions in an ideal gas (and we will
restrict these calculations to ideal gas only).
5.7. Diffusion
Before we make good on our promise of a proper kinetic calculation, it is useful to discuss what
fundamental property of moving particles in a collisional gas the diffusion equations encode.
Let us forget about transport equations for a moment, consider an ideal-gas system and
imagine that there is a sub-population of particles in this gas, with number density n∗ , that
carry some identifiable property: e.g., they might be labelled in some way (e.g., be particles of
a different species than the rest). Non-rigorously, we will argue that the property they carry
might also be mean energy or momentum, and so the evolution equation for n∗ that we are
about to derive should have the same form as the evolution equations for the mean momentum
or energy density (temperature) of the gas.
where δzi are independent random displacements with mean ⟨δzi⟩ = 0 and variance ⟨δzi²⟩ = λmfp²,
and N = ∆t/τc is the number of collisions over time ∆t. By the Central Limit Theorem (see,
e.g., Sinai 1992), in the limit N → ∞, the quantity

X = √N [(1/N) Σ_{i=1}^N δzi − ⟨δzi⟩] (5.78)

will have a normal (Gaussian) distribution with zero mean and variance ⟨δzi²⟩ − ⟨δzi⟩² = λmfp²:

f(X) = [1/(λmfp√(2π))] e^(−X²/2λmfp²). (5.79)

Since ∆z = X√N, we conclude that, for N = ∆t/τc ≫ 1,

D = ⟨∆z²⟩/2∆t = ⟨X²⟩/2τc = λmfp²/2τc, (5.80)

so we recover the dimensional guess (5.57), up to a numerical factor, of course.
The model of the particle motion that we have used to obtain this result—a sequence of
independent random increments—is known as Brownian motion, or random walk, and describes
random meandering of a particle being bombarded by other particles of the gas and thus
undergoing a sequence of random kicks. The density of such particles—or of any quantity
they carry, such as energy or momentum—always satisfies a diffusion equation, as follows from
the above derivation (in §6.9, the full kinetic theory of Brownian particles is developed more
rigorously and systematically).
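A minimal Monte Carlo sketch of this random walk (an added illustration, with unit λmfp and τc assumed) confirms the diffusion coefficient (5.80):

```python
import numpy as np

# Minimal sketch: Monte Carlo check of the random-walk result (5.80),
# D = <(Delta z)^2>/(2*Delta t) = lambda_mfp^2/(2*tau_c), modelling each
# collision as an independent displacement of rms size lambda_mfp.
rng = np.random.default_rng(1)
lam, tau_c = 1.0, 1.0                 # mean free path, collision time (units)
Np, Ncoll = 20_000, 400               # particles, collisions per particle

dz = lam * rng.choice([-1.0, 1.0], size=(Np, Ncoll))  # <dz>=0, <dz^2>=lam^2
z = dz.cumsum(axis=1)                                  # z after each collision
t = tau_c * np.arange(1, Ncoll + 1)

D = (z**2).mean(axis=0) / (2 * t)     # flattens at lam^2/(2*tau_c) = 0.5
print(D[-1], lam**2 / (2 * tau_c))
```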
In the context of a diffusive spreading of an admixture of particles of a distinct species in
an ambient gas, Eq. (5.76) is called Fick’s law. In the expression for the diffusion coefficient,
Eq. (5.80), λmfp and τc are the mean free path and the collision time of the labelled species
(which, if this species has different mass than the ambient one, are not the same as the ambient
mean free path and collision time; see Exercise 4.3).
If we were to use the model above to understand transport of energy or momentum, while this
is fine qualitatively, we ought to be cognizant of an important nuance. Implicitly, if we treat n∗ as
energy density (nc1 T ) or momentum density (mnux ) and carry out exactly the same calculation,
we are assuming that particles that have random-walked through many collisions from z − ∆z to
z have not, through all these collisions, changed their energy or momentum. This is, of course,
incorrect—in fact, in each collision, energy and momentum are exchanged and so the velocity
of each particle receives a random kick uncorrelated with the particle’s previous history. Thus,
the particle random-walks not just in position space z but also in velocity space v. The reason
the above calculation is still fine is that we can think of the particles it describes not literally
as particles but as units of energy or momentum random-walking from place to place—and also
from particle to particle!—and thus effectively diffusing from regions with higher average ux or
T to regions with lower such averages.
Here T̂0(k) can be expressed as the inverse Fourier transform of T(t = 0, z), from Eq. (5.52):

T̂0(k) = (1/L) ∫dz′ T(t = 0, z′) e^(−ikz′), (5.82)

where L is the length of the domain in z. Substituting this into Eq. (5.81) and replacing the
sum over k with an integral (which we can do if we notice that, in a periodic domain of size L,
the "mesh" size in k is 2π/L),

Σ_k = (L/2π) ∫dk, (5.83)

we get

T(t, z) = (1/2π) ∫dz′ T(t = 0, z′) ∫dk e^(ik(z−z′) − DT k² t)
        = ∫dz′ T(t = 0, z′) [1/√(4πDT t)] exp[−(z − z′)²/4DT t], (5.84)

where we have done the k integral by completing the square in the exponential. This formula
(which is an example of a Green's-function solution of a partial differential equation) describes
precisely what we anticipated: a diffusive (random-walk-like) spreading of the initial perturbation
with z − z′ ∼ √(DT t). The easiest way to see this is to imagine that the initial perturbation is a
sharp spike at the origin, T(t = 0, z) = δ(z). After time t, this spike turns into a Gaussian-shaped
profile with rms width √(2DT t) (Fig. 18).
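A short numerical check of this statement (an added illustration with assumed values; the Green's function is evaluated directly from Eq. (5.84)):

```python
import numpy as np

# Minimal sketch: the Green's-function solution (5.84) applied to an initial
# delta-function spike, T(0, z) = delta(z). The profile stays normalised and
# its rms width is sqrt(2*DT*t), as in Fig. 18. Values are illustrative.
DT, t = 1.0, 0.5
z = np.linspace(-10.0, 10.0, 4001)
dz = z[1] - z[0]
G = np.exp(-z**2 / (4 * DT * t)) / np.sqrt(4 * np.pi * DT * t)

norm = G.sum() * dz                          # total "heat" is conserved: = 1
rms = np.sqrt((z**2 * G).sum() * dz / norm)  # rms width of the profile
print(norm, rms, np.sqrt(2 * DT * t))        # rms matches sqrt(2*DT*t) = 1.0
```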
6.1.1. Viscosity
Given a shear flow profile, ux (z), we wish to calculate the momentum flux Πzx through
the plane defined by a fixed value of the coordinate z (Fig. 19). The number of particles
with velocity v that cross that plane per unit time per unit area is given by Eq. (3.2):
dΦ(v) = n vz f(v) d³v = n v³ f(v) dv cosθ sinθ dθ dφ. (6.1)
Figure 19. Physics of transport: particles wander from faster-moving (or hotter) regions to
slower (or colder) ones, bringing with them extra momentum (or energy). This gives rise to a net
momentum (or heat) flux and so to the viscosity (thermal conductivity) of the gas.
These particles have travelled the distance λmfp since their last collision—i.e., since they
last “communicated” with the gas as a collective. This was at the position z − ∆z, where
∆z = λmfp cos θ (because they are flying at angle θ to the z axis). But, since ux is a
function of z, the mean momentum of the particles at z − ∆z is different than it is at z
and so a particle that last collided at z − ∆z brings with it to z some extra momentum:
∆p = mux(z − ∆z) − mux(z) ≈ −m (∂ux/∂z) ∆z = −m (∂ux/∂z) λmfp cosθ, (6.2)

assuming that ∆z ≪ l (l is the scale of variation of ux). The flux of momentum through
z is then simply

Πzx = ∫dΦ(v) ∆p = −mn (∂ux/∂z) λmfp ∫₀^∞ dv v³ f(v) ∫₀^π dθ cos²θ sinθ ∫₀^{2π} dφ
    = −(1/3) mn λmfp ⟨v⟩ ∂ux/∂z, (6.3)

where the three integrals are, respectively, ⟨v⟩/4π, 2/3 and 2π.
Note that, unlike in our effusion (§3) or pressure (§1.4) calculations, the integral is over
all θ because particles come from z − ∆z, where ∆z = λmfp cosθ can be positive or
negative.
Comparing Eq. (6.3) with Eq. (5.40), we read off the expression for dynamical viscosity:

η = (1/3) mn λmfp ⟨v⟩ = (2/3√π) mn λmfp vth = (2/3σ) √(2mkB T/π). (6.4)
We have recovered the dimensional guess (5.57), with a particular numerical coefficient
(which is, however, wrong, as I am about to explain). Note that the assumption ∆z ∼
λmfp ≪ l is justified a posteriori: once we have Eq. (6.4), we can confirm scale separation
as in §5.6.3.
The last expression in Eq. (6.4), to obtain which we used Eqs. (4.5) and (2.21),
emphasises the fact that the dynamical viscosity depends on the temperature but not
the number density of the gas.
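To get a feel for the numbers, here is Eq. (6.4) evaluated in Python for a nitrogen-like gas (an added illustration; the cross-section σ is a rough assumed value), which recovers the measured viscosity of air, ∼ 1.8 × 10⁻⁵ Pa s, to within the expected order-unity accuracy:

```python
import numpy as np

# Minimal sketch: evaluating Eq. (6.4) for a nitrogen-like gas. The point of
# Eq. (6.4) is that the number density n drops out. sigma is a rough kinetic
# cross-section (assumed value, not from the notes).
kB = 1.380649e-23        # J/K
m = 4.65e-26             # kg, mass of an N2 molecule
sigma = 4.3e-19          # m^2, rough collisional cross-section (assumption)
T = 300.0                # K

eta = (2.0 / (3.0 * sigma)) * np.sqrt(2 * m * kB * T / np.pi)
print(f"eta ~ {eta:.1e} Pa s")   # ~ 2e-05 Pa s, the right order of magnitude
```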
Exercise 6.1. What is going on physically? Why does it make sense that the rate of momentum
transport should be independent of the density of particles that transport it? Robert Boyle
discovered this in 1660 when he put a pendulum inside a vessel from which he proceeded to
pump out the air. The rate at which the pendulum motion was damped did not change.
If Boyle had had a really good vacuum pump and continued pumping the air out, at what
pressure would he have started detecting a change in the pendulum's damping rate? Below
that pressure, estimate the momentum flux from the pendulum, given the pendulum’s typical
velocity u and any other parameters that you might reasonably expect to know.
Exercise 6.2. Fick’s Law of Diffusion. Given the number density n∗ (z) and the mean free
path λmfp of an admixture of labelled particles, as well as the temperature of the ambient gas,
calculate the flux of the labelled species, Φ∗z , and derive Fick’s Law of Diffusion, Eq. (5.76).
You might object that the assumption of a constant λmfp was in fact more plausible: indeed, we
saw in §4.3 that λmfp , at least when estimated very roughly, was independent of the particles’
velocity (except via possible v dependence of the collisional cross section σ, for “squishy”
particles). On the other hand, imagining the extreme case of a particle sitting still, one might
argue that it would remain still until hit by some other particle, after some characteristic
collision time τc , so perhaps a constant τc [Eq. (6.8)] is not an entirely unreasonable model
either. The correct v dependence of λmfp, or, equivalently, of the collision time τc, can be
worked out systematically for any particular model of collisions: e.g., for the "hard-spheres"
model, τc ∼ const when v ≪ vth and τc ∼ λmfp/v, λmfp ∼ const when v ≫ vth, with a more
nontrivial behaviour in between the two limits (see, e.g., Dellar 2015). This is because the faster
particles can be thought of as rushing around amongst an almost immobile majority population,
as envisioned by the arguments of §4, whereas the slower ones are better modelled as sitting
still and waiting to be hit. Thus, both λmfp = const and τc = const are plausible, but not
quantitatively correct, simplifications for the majority of the particles (for which v ∼ vth).
Thus, the derivation given in this section is in fact no more rigorous than the random-
walk model of §5.7.2 or even the dimensional estimate of §5.6.2—although it does
highlight the essential fact that we need some sort of kinetic (meaning based on the
velocity distribution) calculation of the fluxes.
Another, somewhat more formalistic, objection to our last derivation is that the
homogeneous Maxwellian f(v) was used, despite the fact that we had previously made
quite a lot of fuss about only having a local Maxwellian F(t, r, v) [Eq. (5.10)] depending
on z via T(z) and ux(z). In fact, this was OK because the scale of inhomogeneities was
long (l ≫ λmfp) and the flow velocity small (ux ≪ vth), but we certainly did not set up
systematically whatever expansion around a homogeneous distribution might have
justified this approach.
You will find some further critique of the derivation above, as well as the quantitatively correct
formulae for the transport coefficients, in Blundell & Blundell (2009), §9.4. The derivation of
these formulae can be found, e.g., in Chapman & Cowling (1991).
Clearly, if we are to claim that we really can do better than our unapologetically
qualitative arguments in §5, we must develop a more systematic algorithm for calcu-
lating transport coefficients. We shall do this now and, in the process, learn how to
solve (kinetic) problems involving scale separation—a useful piece from the toolbox of
theoretical physics.
Jz(z) = ∫d³v (mv²/2) vz F(z, v) (6.10)
(in the latter expression, we assumed u = 0 for simplicity, a restriction that will be lifted
in §6.4.3). But if F is a local Maxwellian,

FM(z, v) = n/[√π vth(z)]³ exp{−[(vx − ux(z))² + vy² + vz²]/vth²(z)}, vth(z) = √(2kB T(z)/m), (6.11)

then Πzx = 0 and Jz = 0 because they both have a single power of vz under the integral
and FM is even in vz! This means that non-zero fluxes come from the distribution function
in fact not being exactly a local Maxwellian:

F(z, v) = FM(z, v) + δF(z, v), (6.12)

and we must now find δF.
In order to do this, we need an evolution equation for F; the argument for a local
Maxwellian (§5.2) is no longer enough.
∂F/∂t + v · ∇F = C[F], (6.14)

where the right-hand side, C[F] = lim_{∆t→0} ∆Fc/∆t, is called the collision operator,
whereas the left-hand side expresses conservation of particle density in phase space:
indeed, our equation can be written as ∂F/∂t = −∇ · (vF) + C[F], where vF is the flux
of particles with velocity v.
Exercise 6.3. Kinetic Equation for a Plasma. We have assumed that no forces act on
particles, apart from collisions. Work out the form of the kinetic equation if some external force
ma acts on each particle, e.g., gravity a = g, or Lorentz force a = (q/m)(E + v × B/c) (q
is the particle charge). The kinetic equation for the latter case is the Vlasov–Landau equation
describing an ionised particle species in a plasma (see, e.g., lecture notes by Schekochihin 2019,
and references therein).
Eq. (6.14) might appear rather less than satisfactory as we have not specified what
C[F ] is. Thinking about what it might be is depressing as it is clearly quite a complicated
object:
—collisions leading to a change in the local number of particles with velocity v must
have involved particles that had other velocities v′ before they collided, so C[F] is likely
to be an integral operator depending on F(t, r, v′) integrated over a range of v′;
—assuming predominantly binary collisions, C[F ] is also likely to be a quadratic (and
so nonlinear!) operator in F because the probability of getting a particle with velocity v
after a collision must depend on the joint probability of two particles with some suitable
velocities meeting.
In §6.5, we will happily avoid these complications by introducing a very simple model
of C[F],²⁵ but first let us show what can be done without knowing the explicit form of
C[F] (in the process, we will also learn of some important properties that any collision
operator must have).
The second term vanishes because, whatever the explicit form of the collision operator
is, it cannot lead to any change in the number of particles—elastic collisions conserve
particle number:²⁶

∫d³v C[F] = 0. (6.16)

This leaves us with the continuity equation

∂n/∂t + ∇ · (nu) = 0, (6.17)

which you have already had the opportunity to derive on general particle-conservation
grounds in Exercise 5.2 [Eq. (5.23)]. It is good to know that our kinetic equation allows
us to recover such non-negotiable results. We are about to show that it will also allow us
to recover Eqs. (5.35) and (5.37), but we will this time work out what Π^(viscous) and J
are.
are.
²⁵ See, e.g., lecture notes by Dellar (2015) for the derivation of Boltzmann's full collision operator.
See also §6.9.2 for a simple derivation of a collision operator describing a particular kind of
particles.
²⁶ In Eq. (6.13), ∆Fc represents collisions between particles at the point r in space. The only
effect of these collisions is a redistribution of particle velocities—any movements of particles
between different points in space are accounted for in the v · ∇F term. Therefore, ∆Fc cannot
change the total number of particles at r and so ∫d³v ∆Fc = 0. Similar considerations apply
to the conservation of momentum, Eq. (6.19), and energy, Eq. (6.31).
6.4.2. Momentum Density
The first moment of Eq. (6.14) is

(∂/∂t) mnu = ∫d³v mv ∂F/∂t = ∫d³v mv (−v · ∇F + C[F]) = −∇ · ∫d³v mvvF + ∫d³v mv C[F]. (6.18)

Similarly to Eq. (6.15), the collisional term vanishes because, again, whatever the explicit
form of the collision operator might be, it cannot lead to any change in the mean
momentum of particles—elastic collisions conserve momentum:

∫d³v mv C[F] = 0. (6.19)

We now have to do some technical work separating the mean flow from the random
motions, v = u + w:

(∂/∂t) mnu = −∇ · ∫d³w m(u + w)(u + w)F
  = −∇ · [muu ∫d³w F + ∫d³w m(uw + wu)F + ∫d³w mwwF]
  = −mu ∇ · (nu) − mnu · ∇u − ∇ · ∫d³w mwwFM − ∇ · ∫d³w mww δF, (6.20)

where ∫d³w F = n and ∫d³w m(uw + wu)F = 0 by definition of w; the first term equals
mu ∂n/∂t by Eq. (6.17); the third term is ∇[(1/3) ∫d³w mw² FM] = ∇P [Eq. (1.29), cf.
Eq. (6.9)]; and the last term defines the viscous stress, ∇ · Π.
Now combining the left-hand side of Eq. (6.20) with the first term on its right-hand side,
we arrive at the evolution equation for the mean velocity of the gas:

mn [∂u/∂t + u · ∇u] = −∇P − ∇ · Π. (6.21)

Thus, we have recovered Eq. (5.35), which, when specialised to the case of a shear flow
with 1D spatial dependence, u = ux(z)x̂, gives us back the momentum-conservation
equation (5.27):

mn ∂ux/∂t = −∂Πzx/∂z. (6.22)

The momentum flux, which will become viscous stress once we are done with this
extended calculation, is, by definition, the matrix

Πij = ∫d³w mwi wj δF, (6.23)

and, in particular, the element of this matrix already familiar to us from previous
derivations is

Πzx = m ∫d³w wz wx δF. (6.24)
Note that Eq. (6.21) teaches us that we cannot, technically speaking, restrict the gas
flow just to u = ux(z)x̂ (or to zero) and density to n = const if we also want there to
be a non-constant temperature profile T = T(z). Indeed, P = nkBT, so a temperature
gradient in the z direction will produce a pressure gradient in the same direction and
that will drive a flow uz. The flow will then change the density of the gas according to
Eq. (6.17), that will change ∇P, etc.—it is clear that, whatever the detailed dynamics,
the system will strive towards pressure balance, ∇P = 0, and thus we will end up with

∇n/n = −∇T/T, (6.25)

so there will be a density gradient to compensate the temperature gradient. This will
normally happen much faster than the heat or momentum diffusion because the pressure-
gradient force acts dynamically, without being limited by the smallness of the collisional
mean free path.²⁷ Therefore, as the slower evolution of T due to heat diffusion proceeds at
its own snail pace, we can assume n to be adjusting instantaneously to satisfy Eq. (6.25).
The flows that are required to effect this adjustment are very small: from Eq. (6.17), we can
estimate

∇ · u ∼ (1/n0) ∂δn/∂t ∼ (∂/∂t)(δT/T0) ∼ DT δT/(l²T0) ∼ vth λmfp δT/(l²T0) ⇒ uz/vth ∼ (λmfp/l)(δT/T0), (6.26)

where δn and δT are typical sizes of the density and temperature perturbations from their
constant spatial means n0 and T0; note that δn/n0 ∼ δT/T0 because of Eq. (6.25). In principle,
nothing stops the shear flow ux(z) from being much greater than this, even if still subsonic
(ux ≪ vth).
Namely, consider any arbitrary pdf F. Let FM be a local Maxwellian [Eq. (5.10)] such that its
density n, mean velocity u and mean energy ε = 3nkBT/2 are the same as the density, mean
velocity and mean energy of F (as defined in §5.1). Then we can always write

F = FM + (F − FM) ≡ FM + δF, (6.28)

where δF contains no particle-, momentum- or energy-density perturbation.
²⁷ To wit, pressure gradients will be wiped out on the time scale ∼ l/vth of sound propagation
across the typical scale l of any (briefly) arising pressure inhomogeneity.
²⁸ Note that this implies that the viscous stress tensor (6.23) is traceless.
Eq. (6.27) implies that the rate of change of the internal energy is

(∂/∂t)(3/2) nkBT = ∂⟨E⟩/∂t − (∂/∂t)(mnu²/2). (6.29)

Let us calculate both of these contributions. The second one will follow from Eqs. (6.17)
and (6.21), but for the first, we shall need the kinetic equation again.
Taking the mv²/2 moment of Eq. (6.14), we get

∂⟨E⟩/∂t = ∫d³v (mv²/2) ∂F/∂t = ∫d³v (mv²/2)(−v · ∇F + C[F])
        = −∇ · ∫d³v (mv²/2) vF + ∫d³v (mv²/2) C[F]. (6.30)
Similarly to Eqs. (6.15) and (6.20), the second term vanishes because the collision
operator cannot lead to any change in the mean energy of particles—elastic collisions
conserve energy:

∫d³v (mv²/2) C[F] = 0. (6.31)

The first term in Eq. (6.30) looks very much like the divergence of the heat flux,
Eq. (6.10), but we must be careful as heat is only the random part of the motions,
whereas v now also contains the mean flow u. Breaking up v = u + w as before, where
∫d³v wF = 0, we get

∂⟨E⟩/∂t = −∇ · [u ∫d³v (mv²/2) F + ∫d³v (mv²/2) wF]
  = −∇ · [u⟨E⟩ + ∫d³w (m|u + w|²/2) wF]
  = −∇ · [u⟨E⟩ + (mu²/2) ∫d³w wF + (∫d³w mwwF) · u + ∫d³w (mw²/2) wF]
  = −∇ · [u (mnu²/2 + (3/2) nkBT) + uP + Π · u + J], (6.32)

where ∫d³w wF = 0, ∫d³w mwwF = P I + Π as in Eq. (6.20), and the last term is the
heat flux. We have now extracted the heat flux:

J = ∫d³w (mw²/2) w δF, (6.33)

or, in the familiar 1D form,

Jz = ∫d³w (mw²/2) wz δF, (6.34)

where only δF is left because J = 0 for F = FM, the local Maxwellian distribution being
even in w (see §6.2). It remains to mop up the rest of the terms.
Recall that, to get the rate of change of internal energy, we need to subtract from
the rate of change of the total energy (6.32) the rate of change of the kinetic energy of
the mean motions [see Eq. (6.29)]. The latter quantity can be calculated by substituting
for ∂n/∂t and for mn ∂u/∂t the continuity equation (6.17) and the momentum equation
(6.21), respectively:

(∂/∂t)(mnu²/2) = (mu²/2) ∂n/∂t + mnu · ∂u/∂t
  = −(mu²/2) ∇ · (nu) − mnu · ∇(u²/2) − u · ∇P − (∇ · Π) · u. (6.35)

When this is subtracted from Eq. (6.32), all these terms happily cancel with various bits
that come out when we work out the divergence in the right-hand side of Eq. (6.32).
Namely, keeping terms in the same order as they appeared originally in Eq. (6.32),

∂⟨E⟩/∂t = −(mu²/2) ∇ · (nu) − mnu · ∇(u²/2) − ∇ · [u (3/2) nkBT]
  − P∇ · u − u · ∇P − (∇ · Π) · u − Πij ∂i uj − ∇ · J, (6.36)

of which the first, second, fifth and sixth terms cancel with the corresponding terms in
Eq. (6.35). Therefore,

(∂/∂t)(3/2) nkBT = ∂⟨E⟩/∂t − (∂/∂t)(mnu²/2)
  = −∇ · [u (3/2) nkBT] − P∇ · u − Πij ∂i uj − ∇ · J, (6.37)

where the four terms on the right-hand side are, respectively, the internal-energy flux
due to the mean flow, the compressional heating, the viscous heating and the heat flux.
Our old energy-conservation equation (5.18) is recovered if we set u = 0 and n = const
(which is the assumption under which we derived it in §5.3.1), but we now know
better and see that if we do retain the flow, a number of new terms appear, all with
straightforward physical meaning (so our algebra is vindicated).
As we argued in §6.4.2 [see discussion around Eq. (6.25)], we cannot really assume
n = const and so we need to use the continuity equation (6.17) to split off the rate
of change of n from the rate of change of T in the left-hand side of Eq. (6.37). After
unpacking also the first term on the right-hand side, this gives us a nice cancellation:

(∂/∂t)(3/2) nkBT = (3/2) nkB ∂T/∂t + (3/2) kBT ∂n/∂t = −(3/2) kBT ∇ · (nu) − (3/2) nkB u · ∇T + the rest of the terms, (6.38)

in which the ∇ · (nu) terms on the two sides cancel by Eq. (6.17). Hence, finally, we get
the desired equation for the evolution of temperature:

(3/2) nkB [∂T/∂t + u · ∇T] = −P∇ · u − Πij ∂i uj − ∇ · J. (6.39)
If one derives the collision operator based on an explicit microphysical model of particle collisions,
one can then prove that C[F ] = 0 implies F = FM and also that collisions always drive the
distribution towards FM (a simple example of such a calculation, involving deriving a collision
operator from “first principles” of particle motion, can be found in §6.9). This property is
associated with the so-called Boltzmann’s H-Theorem, which is the law of entropy increase for
kinetic systems. This belongs to a more advanced course of kinetic theory (e.g., Dellar 2015).
• Secondly, the relaxation to the local Maxwellian must occur on the collisional time
scale τc = (σnvth)⁻¹ [see Eq. (4.4)]. This depends on n and T, so, in general, τc is a
function of r. In a more quantitative theory, it transpires that it can also be a function
of v (see discussion in §6.1.3).
• Thirdly, as we have already explained in §6.4, elastic collisions must not change the
total number, momentum or energy density of the particles and so the collision operator
satisfies the conservation properties (6.16), (6.19) and (6.31).
Arguably the simplest possible form of the collision operator that satisfies these criteria
is the so-called Krook operator (also known as the BGK operator, after Bhatnagar–Gross–
Krook):

C[F] = −(F − FM)/τc = −δF/τc. (6.43)

To satisfy the conservation laws (6.16), (6.19) and (6.31), we must have

∫d³v δF = 0, ∫d³v mv δF = 0, ∫d³v (mv²/2) δF = 0. (6.44)
These conditions are indeed satisfied because, as argued at the beginning of §6.4.3, we
are, without loss of generality, committed to considering only such deviations from the
local Maxwellian that contain no perturbation of n, u or energy.
³⁰ Note its interpretation suggested by the last part of that exercise: parcels of gas move around
at velocity u behaving adiabatically except for heat fluxes and viscous heating.
The Krook operator is, of course, grossly simplified and inadequate for many kinetic
calculations—and it certainly will not give us quantitatively precise values of transport
coefficients. However, where it loses in precision it compensates in analytical simplicity
and it is amply sufficient for demonstrating the basic idea of the calculation of these
coefficients. The process of enlightened guesswork (also known as modelling) that we
followed in devising it is also quite instructive as an illustration of how one comes up
with a simple physically sensible model where the exact nature of the underlying process
(in this case, collisions) might be unknown or too difficult to incorporate precisely, but
it is clear what criteria must be respected by any sensible model.
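As a demonstration of this modelling in action, here is a minimal numerical sketch (an added illustration: 1D, m = kB = 1, all values assumed) of the Krook operator (6.43) relaxing a non-Maxwellian distribution while conserving its n, u and T:

```python
import numpy as np

# Minimal sketch: the Krook operator (6.43) relaxing a bimodal 1D
# distribution toward the Maxwellian FM with the same n, u and T. Because
# FM is rebuilt from the moments of F itself, those moments are conserved
# while delta F = F - FM decays on the time scale tau_c.
v = np.linspace(-6.0, 6.0, 601)
dv = v[1] - v[0]
F = np.exp(-(v - 1.5)**2) + np.exp(-(v + 1.5)**2)   # two counter-streams
tau_c, dt = 1.0, 0.05

def maxwellian_same_moments(F):
    n = F.sum() * dv
    u = (v * F).sum() * dv / n
    T = ((v - u)**2 * F).sum() * dv / n    # m = kB = 1; in 1D, T = <w^2>
    return n * np.exp(-(v - u)**2 / (2 * T)) / np.sqrt(2 * np.pi * T)

for _ in range(200):                       # integrate to t = 10*tau_c
    F += dt * (-(F - maxwellian_same_moments(F)) / tau_c)

print(np.abs(F - maxwellian_same_moments(F)).max())  # ~ e^{-10}: Maxwellian
```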
Thus, we have solved the kinetic equation and found the small deviation of the particle
distribution function from the local Maxwellian caused by mean velocity and temperature
gradients. The first line of Eq. (6.51) is perhaps the most transparent as to the mechanism
of this deviation: δF is simply the result of taking a local Maxwellian and letting it
evolve ballistically for a time τc , with all particles flying in straight lines at their initial
velocities. Because τc is small, they only have an opportunity to do this for a short time,
before collisions restore local equilibrium, and so the local Maxwellian gets only slightly
perturbed.
Note that δF is neither Maxwellian nor isotropic—as indeed ought to be the case as
it arises from the global equilibrium being broken by the presence of flows (which have a
direction, in our case, x) and gradients (which also have a direction, in our case, z). The
deviation from the Maxwellian is small because the departures from the equilibrium—
the gradients—are macroscopic (i.e., the corresponding time and spatial scales are long
compared to collisional scales τc and λmfp ).
If our collision operator had been a more realistic and, therefore, much more complicated,
integral operator than the Krook model one, solving the kinetic equation would have involved
quite a lot of hard work inverting this operator—while with the Krook operator, that inversion
was simply multiplication by τc , which took us painlessly from Eq. (6.45) to Eq. (6.47). You
will find the strategies for dealing with the true Boltzmann collision operator in Chapman &
Cowling (1991) or Lifshitz & Pitaevskii (1981) and a simple example of inverting a differential
collision operator in §6.9.5.
Exercise 6.4. Check that the solution (6.52) satisfies the particle, momentum and energy
conservation conditions (6.44).
where the term involving ∂T/∂z vanished because its integrand was odd in wx. Satisfyingly,
we have found that the momentum flux is proportional to the mean-velocity
gradient, as I have previously argued it must be [see Eq. (5.40)]. The coefficient of
proportionality between them is, by definition, the dynamical viscosity, the expression
for which is, therefore,

η = (2mτc/vth²) ∫d³w wz² wx² FM(w)
  = (2mτc/vth²) ∫₀^∞ dw w⁶ [n/(√π vth)³] e^(−w²/vth²) ∫₀^π dθ sin³θ cos²θ ∫₀^{2π} dφ cos²φ
  = (1/2) mnvth² τc = (1/2) mnλmfp vth, (6.54)

where the three integrals evaluate, respectively, to 15nvth⁴/16π, 4/15 and π.
No surprises here: the same dependence on λmfp and temperature (via vth) as in Eq. (6.4),
but a different numerical coefficient.³³ This coefficient depends on the form of the collision
operator and so, since the collision operator that we used is only a crude model, the
coefficient is order-unity wrong. It is progress, however, that we now know what to do
to calculate viscosity precisely for any given model of collisions. You will find many such
precise calculations in, e.g., Chapman & Cowling (1991).
³³ Note that the angle dependence of the integrand in Eq. (6.3) that we so proudly worked out
in §6.1 was in fact wrong. However, the derivation in §6.1, while "dodgy," was not useless: it
highlighted much better than the present, more systematic, one that momentum and energy are
transported because of particles wandering between regions of gas with different ux and T.
Jz = ∫d³w (mw²/2) wz δF
   = −(mτc/2) ∫d³w wz² w² [(w²/vth² − 5/2)(1/T) ∂T/∂z + (2wx/vth²) ∂ux/∂z] FM(w)
   = −(mτc/2T) ∫d³w wz² w² (w²/vth² − 5/2) FM(w) ∂T/∂z ≡ −κ ∂T/∂z, (6.55)
where the term involving ∂ux/∂z vanished because its integrand was odd in wx. The
heat flux turns out to be proportional to the temperature gradient, as expected [see
Eq. (5.39)]. The expression for the thermal conductivity is, therefore,
κ = (mτc/2T) ∫d³w wz² w² (w²/vth² − 5/2) FM(w)
  = (kBτc/vth²) ∫₀^∞ dw w⁶ (w²/vth² − 5/2) [n/(√π vth)³] e^(−w²/vth²) ∫₀^π dθ sinθ cos²θ ∫₀^{2π} dφ
  = (5/4) nkB vth² τc = (5/6) nc1 λmfp vth, c1 = (3/2) kB, (6.56)

where the three factors evaluate, respectively, to 15nvth⁴/16π, 2/3 and 2π.
Again, we have the same kind of expression as in Eq. (6.7), but with a different prefactor.
You now have enough experience to spot that these prefactors come from the averaging of
various angle and speed dependences over the underlying Maxwellian distribution—and
the prefactors are nontrivial basically because of intrinsic correlations between, e.g., in
this case, particle energy, the speed and angle at which it moves (transport), and the
form of the non-Maxwellian correction to the local equilibrium which is caused by the
temperature gradient and enables heat to flow on average.
Since we now have the heat equation including also viscous heating, Eq. (6.41), it is
worth writing out its final form: using Eqs. (6.56) and (6.53), we have

(3/2) nkB ∂T/∂t = κ ∂²T/∂z² + η (∂ux/∂z)². (6.57)

The viscous term is manifestly positive, so does indeed represent heating.
In terms of diffusivities, DT = 2κ/3nkB [Eq. (5.50)] and ν = η/mn [Eq. (5.51)],

∂T/∂t = DT ∂²T/∂z² + (2m/3kB) ν (∂ux/∂z)². (6.58)

Eq. (6.58) and the momentum equation (6.22) combined with Eq. (6.53),

∂ux/∂t = ν ∂²ux/∂z², (6.59)

form a closed system, completely describing the evolution of the gas.
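As a check on this closed system, here is a minimal Python sketch (an added illustration with assumed parameters; units m = kB = n = 1, periodic domain) integrating Eqs. (6.58) and (6.59): the shear flow decays and its kinetic energy reappears as heat, conserving the total.

```python
import numpy as np

# Minimal sketch: integrating the closed system (6.58)-(6.59) on a periodic
# domain. A decaying shear flow deposits its kinetic energy into heat via
# the nu*(du_x/dz)^2 term, so the integral of 3/2*T + u_x^2/2 stays constant
# (units m = kB = n = 1). All parameter values are illustrative.
DT, nu, L, Nz = 1.0, 1.0, 1.0, 200
dz = L / Nz
dt = 0.2 * dz**2 / max(DT, nu)
z = np.linspace(0.0, L, Nz, endpoint=False)
T = np.full(Nz, 1.0)
ux = 0.1 * np.sin(2 * np.pi * z / L)      # subsonic shear flow, ux << vth

def lap(f):   # periodic Laplacian
    return (np.roll(f, -1) - 2 * f + np.roll(f, 1)) / dz**2

def grad(f):  # centred first derivative, periodic
    return (np.roll(f, -1) - np.roll(f, 1)) / (2 * dz)

E0 = (1.5 * T + 0.5 * ux**2).sum() * dz
for _ in range(20_000):
    T, ux = T + dt * (DT * lap(T) + (2.0 / 3.0) * nu * grad(ux)**2), \
            ux + dt * nu * lap(ux)

E1 = (1.5 * T + 0.5 * ux**2).sum() * dz
print(E1 - E0)   # ~ 0 (up to discretisation error): heating balances the flow's decay
```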
Exercise 6.5. Fick's Law of Diffusion. a) Starting from the kinetic equation for the
distribution function F∗(t, z, v) of some labelled particle admixture in a gas, derive the diffusion
equation

∂n∗/∂t = D ∂²n∗/∂z² (6.60)

for the number density n∗(t, z) = ∫d³v F∗(t, z, v) of the labelled particles (assuming n∗ changes
only in the z direction). Derive also the expression for the diffusion coefficient D, given
—the molecular mass m∗ of the labelled particles,
—the temperature T of the ambient gas (assume T is uniform),
—the collision frequency νc∗ of the labelled particles with the ambient ones.
Assume that the ambient gas is static (no mean flows), that the density of the labelled particles
is so low that they only collide with the unlabelled particles (and not each other) and that
the frequency of these collisions is much larger than the rate of change of any mean quantities.
Use the Krook collision operator, assuming that collisions relax the distribution of the labelled
particles to a Maxwellian FM∗ with density n∗ and the same velocity (zero) and temperature (T)
as the ambient unlabelled gas.
Hint. Is the momentum of the labelled particles conserved by collisions? You should discover
that self-diffusion is related to the mean velocity uz∗ of the labelled particles (you can assume
uz∗ ≪ vth). You can calculate this velocity either directly from δF∗ = F∗ − FM∗ or from the
momentum equation for the labelled particles.
b) Derive the momentum equation for the mean flow u∗z of the labelled particles and obtain
the result you have known since school: that the friction force (the collisional drag exerted on
labelled particles by the ambient population) is proportional to the mean velocity of the labelled
particles. What is the proportionality coefficient (the “drag coefficient”)? This, by the way, is
the “Aristotelian equation of motion”—Aristotle thought force was generally proportional to
velocity. It took a while for another brilliant man to figure out the more general formula.
Show from the momentum equation that you have derived that the flux of the labelled particles
is proportional to their pressure gradient:

Φz∗ = n∗ uz∗ = −(1/m∗νc∗) ∂P∗/∂z. (6.61)
Exercise 6.6. Repeat the calculation that follows without employing this ruse and convince
yourself that the same result obtains.
Using P = nkBT where opportune, Eqs. (6.17), (6.39) and (6.21) then give us

(1/n) ∂n/∂t = −∇ · u, (6.66)

(1/T) ∂T/∂t = −(2/3) ∇ · u − (2/3)(Πij ∂i uj + ∇ · J)/(nkBT), (6.67)

(2w/vth²) · ∂u/∂t = −w · (∇P + ∇ · Π)/P. (6.68)

The terms involving Π and J in Eqs. (6.67) and (6.68) are negligible in comparison with the
ones that are retained (this can be ascertained a posteriori, once J and Π are known) and so
will be dropped. Assembling the rest according to Eq. (6.65), we have

∂ ln FM/∂t = −(2/3)(w²/vth²) ∇ · u − w · (∇n/n + ∇T/T). (6.69)
Finally, substituting Eqs. (6.64) and (6.69) into Eq. (6.63), we arrive at

δF = −τc [(w²/vth² − 5/2)(w · ∇T)/T + (2wk wl/vth²)(∂k ul − (1/3) δkl ∇ · u)] FM (6.70)

(where we have replaced v by w where necessary because we are at a point where u = 0).
Exercise 6.7. Check that this δF contains no density, momentum or energy perturbation.
Now we are ready to calculate the fluxes, according to Eqs. (6.23) and (6.33). Similarly to
what happened in §§6.7.1 and 6.7.2, the part of δF containing ∇T only contributes to the heat
flux because it is odd in w and the part containing ∂k ul only contributes to the momentum flux
because it is even in w.
The heat flux is the easier calculation:

J = ∫d³w (mw²/2) w δF = −(mτc/2T) ∫d³w ww w² (w²/vth² − 5/2) FM(w) · ∇T. (6.71)

Since the angle average is ⟨wi wj⟩ = w² δij/3 (recall Exercise 1.3b), this becomes

J = −(mnτc/2T)(4π/3) ∫₀^∞ dw w⁶ (w²/vth² − 5/2) e^(−w²/vth²)/(√π vth)³ ∇T = −κ∇T, (6.72)

where the integral (including the 4π/3) evaluates to (5/4)vth⁴ and so κ = (5/4) nkB vth² τc,
in gratifying agreement with Eq. (6.56).
The momentum flux is a little more work because it is a matrix:

Πij = ∫d³w mwi wj δF = −(2mτc/vth²) ∫d³w wi wj wk wl FM(w) (∂k ul − (1/3) δkl ∇ · u). (6.73)

The angle average is ⟨wi wj wk wl⟩ = w⁴ (δij δkl + δik δjl + δil δjk)/15 (Exercise 1.3c). Therefore,

Πij = −(2mnτc/vth²)(4π/15) ∫₀^∞ dw w⁶ e^(−w²/vth²)/(√π vth)³ [∂i uj + ∂j ui − (2/3) δij ∇ · u]
    = −η [∂i uj + ∂j ui − (2/3) δij ∇ · u], (6.74)

where the integral (including the 4π/15) evaluates to vth⁴/4 and η = mnvth²τc/2, the same as
found in Eq. (6.54). Besides the expression for the dynamical viscosity, we have now also worked
out the tensor structure of the viscous stress, as promised earlier [after Eq. (5.42)].
v̇ + νv = χ(t). (6.75)

Here ν is some effective damping rate representing the slowing down of our particle due to friction
with the particles of the ambient gas and χ(t) is a random force representing the random kicks
that our particle receives from them. This is a good model not for a gas molecule but for
some macroscopic alien particle moving about in the gas—e.g., a particle of pollen in air. It is
called a Brownian particle and its motion Brownian motion, after the pioneering researcher who
discovered it.
The frictional force proportional to velocity is simply the Stokes drag on a body moving
through a viscous medium. The force χ(t) is postulated to be a Gaussian random process
with zero average, hχ(t)i = 0, and zero correlation time (Gaussian white noise), i.e., its time
correlation function is taken to be
hχ(t)χ(t0 )i = Aδ(t − t0 ), (6.76)
where A is some (known) constant. We can relate this constant and the drag rate ν to the
temperature of the ambient gas (with which we shall assume the Brownian particles to be in
thermal equilibrium) by noticing that Eq. (6.75) implies, after multiplication by v and averaging,
Z t
d hv 2 i
2 0 0 0
+ νhv i = hv(t)χ(t)i = v(0) + dt −νv(t ) + χ(t ) χ(t)
dt 2 0
Z t
A
dt0 −ν 0 )χ(t)i + hχ(t0 )χ(t)i = . (6.77)
hv(0)χ(t)i hv(t
= +
0 2
Here the two terms that vanished did so because they are correlations between the force at time
t and the velocity at an earlier time—so the latter cannot depend on the former, the average of
the product is the product of averages and we use hχ(t)i = 0. The only term that did not vanish
was calculated using Eq. (6.76) (the factor of 1/2 appeared because the integration was up to
t: only half of the delta function). In the statistical steady state (equilibrium), dhv 2 i/dt = 0, so
Eq. (6.77) gives us
$$\langle v^2 \rangle = \frac{A}{2\nu} = \frac{k_B T}{m}. \qquad (6.78)$$
The last equality is inferred from the fact that, statistically, in 1D, $m\langle v^2\rangle = k_B T$, where T is the temperature of the gas and m the mass of the particle [see Eq. (2.22)]. Thus, we will henceforth write
$$A = \nu\,\frac{2 k_B T}{m} = \nu v_{\rm th}^2. \qquad (6.79)$$
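Eq. (6.78) is a fluctuation–dissipation relation and can be verified by direct simulation of Eq. (6.75). Here is a minimal Euler–Maruyama sketch (an added illustration with arbitrary parameter values; the white noise enters as Gaussian increments of variance A dt):

import numpy as np

rng = np.random.default_rng(0)
nu, A = 2.0, 3.0                  # arbitrary illustrative values
dt, nsteps, npart = 1e-3, 10000, 100000

v = np.zeros(npart)
for _ in range(nsteps):           # run to nu*t = 20, long enough to equilibrate
    v += -nu * v * dt + np.sqrt(A * dt) * rng.standard_normal(npart)

print(np.mean(v**2), A / (2 * nu))   # both close to 0.75, as per Eq. (6.78)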
6.9.2. Diffusion in Velocity Space
We shall now derive the evolution equation for f, the pdf of the particle's velocity, f(t, v) = ⟨δ(v − v(t))⟩ [Eq. (6.80)]. First, the unaveraged delta function satisfies, formally,
$$\frac{\partial}{\partial t}\,\delta(v - v(t)) = -\delta'(v - v(t))\,\dot{v}(t) = -\frac{\partial}{\partial v}\,\delta(v - v(t))\,\dot{v}(t) = -\frac{\partial}{\partial v}\,\delta(v - v(t))\left[ -\nu v(t) + \chi(t) \right] = \frac{\partial}{\partial v}\left[ \nu v - \chi(t) \right]\delta(v - v(t)). \qquad (6.82)$$
Averaging this and using Eq. (6.80), we get
$$\frac{\partial f}{\partial t} = \frac{\partial}{\partial v}\left[ \nu v f - \langle \chi(t)\,\delta(v - v(t))\rangle \right]. \qquad (6.83)$$
To find the average in the second term, we formally integrate Eq. (6.82):
$$\langle \chi(t)\,\delta(v - v(t))\rangle = \left\langle \chi(t)\,\delta(v - v(0)) \right\rangle + \int_0^t dt'\,\frac{\partial}{\partial v}\left\langle \left[ \nu v - \chi(t') \right]\chi(t)\,\delta(v - v(t'))\right\rangle = -\frac{\nu v_{\rm th}^2}{2}\,\frac{\partial}{\partial v} f(t, v). \qquad (6.84)$$
To obtain this result, we took δ(v − v(t′)) to be independent of either χ(t) or χ(t′), again by the causality principle: v(t′) can only depend on the force at times previous to t′. As a result of this, the first two terms vanished because ⟨χ(t)⟩ = 0 and in the last term we used Eqs. (6.76)
and (6.79) and did the integral similarly to Eq. (6.77).
Finally, substituting Eq. (6.84) into Eq. (6.83), we get
$$\frac{\partial f}{\partial t} = \nu\,\frac{\partial}{\partial v}\left( v f + \frac{v_{\rm th}^2}{2}\,\frac{\partial f}{\partial v} \right). \qquad (6.85)$$
This is very obviously a diffusion equation in velocity space, with an additional drag (the vf term). The steady-state (∂f/∂t = 0) solution of Eq. (6.85) that normalises to unity is
$$f = \frac{1}{\sqrt{\pi}\,v_{\rm th}}\, e^{-v^2/v_{\rm th}^2}, \qquad (6.86)$$
a 1D Maxwellian, as it ought to be, in equilibrium.
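As a quick symbolic check (an added aside using sympy, not part of the original notes), the Maxwellian (6.86) is indeed annihilated by the right-hand side of Eq. (6.85) and is normalised to unity:

import sympy as sp

v = sp.symbols('v', real=True)
vth, nu = sp.symbols('v_th nu', positive=True)
f = sp.exp(-v**2 / vth**2) / (sp.sqrt(sp.pi) * vth)                # Eq. (6.86)

collision = nu * sp.diff(v * f + vth**2 / 2 * sp.diff(f, v), v)    # RHS of (6.85)
print(sp.simplify(collision))                                      # 0: steady state
print(sp.integrate(f, (v, -sp.oo, sp.oo)))                         # 1: normalised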
It is at this point that we should be struck by the realisation that what we have just derived
is the collision operator for Brownian particles. In this simple model, it is the differential
operator in the right-hand side of Eq. (6.85). As a collision operator must do, it pushes the
particle distribution towards a Maxwellian—since we derived the collision operator from “first
principles” of particle motion, we are actually able to conclude that the equilibrium distribution
is Maxwellian simply by solving Eq. (6.85) in steady state (rather than having to bring the
Maxwellian in as a requirement for constructing a model of collisions, as we did in §6.5).
There is one important difference between the collision operator in Eq. (6.85) and the kind
of collision operator, discussed in §6.5, that would be suitable for gas molecules: whereas the
Brownian particles’ collision operator does conserve both their number and their energy, it
certainly does not conserve momentum (Exercise: check these statements). This is not an error:
since the Brownian particles experience a drag force from the ambient gas, it is not surprising
that they should lose momentum as a result (cf. Exercise 6.5).
Eq. (6.85) is clearly the kinetic equation for Brownian particles. Where, then, you might ask, is the spatial dependence of this distribution—i.e., where is the v · ∇F term that appears in our prototypical kinetic equation (6.14)? This will be recovered in §6.9.4.
Exercise 6.8. Particle Heating. What happens to our particles if ν = 0 and A is fixed to
some constant? Explain the following statement: the drag on the particles limits how much their
distribution can be heated.
6.9.3. Brownian Motion
Consider now the displacement of our particle, $z(t) = \int_0^t dt'\, v(t')$, whose mean square is
$$\langle z^2(t)\rangle = \int_0^t dt' \int_0^t dt''\, \langle v(t')v(t'')\rangle. \qquad (6.88)$$
Thus, in order to calculate ⟨z²⟩, we need to know the time-correlation function ⟨v(t′)v(t″)⟩ of the particle velocities.
This is easy to work out because we can solve Eq. (6.75) explicitly:
$$v(t) = v(0)\,e^{-\nu t} + \int_0^t d\tau\, \chi(\tau)\, e^{-\nu(t-\tau)}. \qquad (6.89)$$
This says that the “memory” of the initial condition decays exponentially and so, for νt ≫ 1, we can simply omit the first term (or formally consider our particle to have started from rest at t = 0). The mean square displacement (6.88) becomes in this long-time limit
$$\langle z^2(t)\rangle = \int_0^t dt' \int_0^t dt'' \int_0^{t'} d\tau' \int_0^{t''} d\tau''\, \langle \chi(\tau')\chi(\tau'')\rangle\, e^{-\nu(t'-\tau'+t''-\tau'')} = \frac{v_{\rm th}^2}{\nu}\, t, \qquad (6.90)$$
where we have again used Eqs. (6.76) and (6.79) and integrated the exponentials, carefully paying attention to the integration limits, to what happens when t′ > t″ vs. t′ < t″, and finally retaining only the largest term in the limit νt ≫ 1.
Thus, the mean square displacement of our particle is proportional to time. It might be
illuminating at this point for you to compare this particular model of diffusion with the model
discussed in §5.7.2 and think about why the two are similar.
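To see Eq. (6.90) emerge numerically, one can integrate Eq. (6.75) together with ż = v for an ensemble of particles (an added sketch, not from the notes; parameters are arbitrary, with A = ν v_th² as per Eq. (6.79)):

import numpy as np

rng = np.random.default_rng(1)
nu, vth = 1.0, 1.0
A = nu * vth**2                          # Eq. (6.79)
dt, nsteps, npart = 1e-3, 50000, 10000   # final time t = 50, so nu*t >> 1

v, z = np.zeros(npart), np.zeros(npart)
for _ in range(nsteps):
    z += v * dt
    v += -nu * v * dt + np.sqrt(A * dt) * rng.standard_normal(npart)

t = nsteps * dt
print(np.mean(z**2) / t, vth**2 / nu)    # close: diffusive spreading, Eq. (6.90)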
Exercise 6.9. Calculate ⟨v(t′)v(t″)⟩ carefully and show that the correlation time of the particle velocity is 1/ν (i.e., argue that this is the typical time over which the particle “remembers” its history).
Exercise 6.10. Work out ⟨z²(t)⟩ without assuming νt ≫ 1 and find what it is when νt ≪ 1. Does this answer make physical sense?
6.9.4. Kinetic Equation for Brownian Particles
Now let the distribution depend on position as well as velocity, F(t, z, v) [the joint pdf of z(t) and v(t)]. Repeating the calculation of §6.9.2 for δ(z − z(t)) δ(v − v(t)), one finds
$$\frac{\partial F}{\partial t} + v\,\frac{\partial F}{\partial z} = \nu\,\frac{\partial}{\partial v}\left( v F + \frac{v_{\rm th}^2}{2}\,\frac{\partial F}{\partial v} \right) \equiv C[F]. \qquad (6.94)$$
This is the kinetic equation for Brownian particles, analogous to Eq. (6.14), with the collision
operator that we already derived in §6.9.2. Eq. (6.85) is, of course, just Eq. (6.94) integrated
over all particle positions z.
6.9.5. Diffusion in Position Space
The collision operator in Eq. (6.94) is still pushing our pdf towards a Maxwellian, but it is, in general, only a local Maxwellian, with particle number density that can depend on t and z:
$$F_M(t, z, v) = \frac{n(t, z)}{\sqrt{\pi}\, v_{\rm th}}\, e^{-v^2/v_{\rm th}^2}. \qquad (6.95)$$
This is the Brownian-gas analog of the local Maxwellian (5.10). Note that we are assuming that
the temperature of the ambient gas is spatially homogeneous and constant in time, i.e., that
vth = const. Clearly, the pdf (6.95) represents the local equilibrium that will be achieved provided
the right-hand side of Eq. (6.94) is dominant, i.e., provided that n(t, z) changes sufficiently slowly
in time compared to the collision rate ν and has a sufficiently long gradient scale length compared
to vth /ν (the mean free path of Brownian particles).
We may now complete the kinetic theory of Brownian particles by deriving the evolution
equation for their density n(t, z). Let us do the same thing as we did in §6.4.1 and obtain
this equation by integrating the kinetic equation (6.94) over all velocities. Expectedly, we get a
continuity equation:
$$\frac{\partial n}{\partial t} + \frac{\partial}{\partial z}\,(n u) = 0, \qquad (6.96)$$
where $nu(t, z) = \int dv\, v F(t, z, v)$ is the particle flux. Since the equilibrium solution (6.95) has no mean flow in it, all of the particle flux must be due to the (small) deviation of F from FM, just like the momentum and heat fluxes in §6.2 arose due to such a deviation.
We shall solve for δF = F − FM using the same method as in §6.6: Assuming δF ≪ FM and ν ≫ v∂/∂z ≫ ∂/∂t, we conclude from Eq. (6.94) that δF must satisfy, approximately,
$$\frac{\partial}{\partial v}\left( v\,\delta F + \frac{v_{\rm th}^2}{2}\,\frac{\partial\,\delta F}{\partial v} \right) = \frac{v}{\nu}\,\frac{\partial F_M}{\partial z} = \frac{v}{\nu n}\,\frac{\partial n}{\partial z}\, F_M. \qquad (6.97)$$
Inverting the collision operator, which is now a differential one, is a less trivial operation than with the Krook operator in §6.6, but only slightly less: noticing that $vF_M = -(v_{\rm th}^2/2)\,\partial F_M/\partial v$, we may integrate Eq. (6.97) once, reducing it to a first-order ODE:
$$\frac{\partial\,\delta F}{\partial v} + \frac{2v}{v_{\rm th}^2}\,\delta F = -\frac{1}{\nu n}\,\frac{\partial n}{\partial z}\, F_M. \qquad (6.98)$$
The solution of this is
$$\delta F = -\frac{v}{\nu n}\,\frac{\partial n}{\partial z}\, F_M. \qquad (6.99)$$
The integration constants are what they are because δF must vanish at v → ±∞ and because we require the density n of the Maxwellian (6.95) to be the exact density, i.e., $\int dv\, \delta F = 0$ (the logic of this was explained at the beginning of §6.4.3).
Finally, the particle flux is
$$nu = \int dv\, v\,\delta F = -\frac{v_{\rm th}^2}{2\nu}\,\frac{\partial n}{\partial z} \qquad (6.100)$$
and Eq. (6.96) becomes the diffusion equation for Brownian particles:
$$\frac{\partial n}{\partial t} = D\,\frac{\partial^2 n}{\partial z^2}, \qquad D = \frac{v_{\rm th}^2}{2\nu}. \qquad (6.101)$$
This is nothing but Fick’s Law of Diffusion, which already made an appearance in §5.7 and in
Exercises 6.2 and 6.5 and which we have now formally derived for Brownian particles.
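The inversion of the collision operator above is easy to double-check symbolically (an added sketch; the gradient ∂n/∂z is treated as a constant parameter dndz): δF of Eq. (6.99) satisfies the ODE (6.98), and its first moment gives the flux (6.100).

import sympy as sp

v = sp.symbols('v', real=True)
vth, nu, n, dndz = sp.symbols('v_th nu n dndz', positive=True)

FM = n * sp.exp(-v**2 / vth**2) / (sp.sqrt(sp.pi) * vth)   # local Maxwellian (6.95)
dF = -v / (nu * n) * dndz * FM                             # Eq. (6.99)

# LHS minus RHS of Eq. (6.98) vanishes:
print(sp.simplify(sp.diff(dF, v) + 2 * v / vth**2 * dF + dndz / (nu * n) * FM))
# Particle flux, Eq. (6.100):
print(sp.simplify(sp.integrate(v * dF, (v, -sp.oo, sp.oo))))  # -dndz*vth**2/(2*nu)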
Exercise 6.11. Work out the kinetic theory of Brownian particles in 3D by generalising the above calculations to vector velocities v and positions r. You may assume the vector components of the random force χ(t) to be uncorrelated with each other, ⟨χi(t)χj(t′)⟩ = A δij δ(t − t′).
PART III
Foundations of Statistical Mechanics
7. From Microphysics to Macrophysics
7.1. What Are We Trying to Do?
Thermodynamics was all about flows of energy, which we formalised in two ways:
$$dU = \underbrace{\delta Q}_{\text{heat}} - \underbrace{\delta W}_{\text{work}} = T\,dS - P\,dV. \qquad (7.1)$$
Note that T and S were introduced via their relationship with heat in reversible
processes. All this was completely general. But to calculate anything specific, we needed two further pieces of information: the equation of state and the energy of the system as a function of its temperature and volume [Eqs. (7.2)–(7.6)].
7.2. The System and Its States
Consider a system whose exact microscopic states (“microstates”) we enumerate by
$$\alpha = 1, 2, 3, \ldots, \Omega \gg 1 \qquad (7.7)$$
(the total number of possible microstates is huge for a large system). For each such state, there is a certain probability of the system being in it:
$$p_1, p_2, p_3, \ldots, p_\alpha, \ldots, p_\Omega, \qquad \sum_{\alpha=1}^{\Omega} p_\alpha = 1. \qquad (7.8)$$
The microstates also have certain energies,
$$E_1, E_2, E_3, \ldots, E_\alpha, \ldots, E_\Omega. \qquad (7.9)$$
They might also have momenta, angular momenta, spins, and other quantum numbers.
If we knew all these things, we could then calculate various macrophysical quantities as averages over the distribution {pα}, e.g., the mean energy
$$U = \langle E_\alpha \rangle = \sum_\alpha p_\alpha E_\alpha. \qquad (7.10)$$
7.3. Pressure
The concept of pressure arises in connection with changing the volume of the system.
In most of what follows (but not in Exercise 14.6), I will treat volume as an exact
external parameter (as opposed to some mean property to be measured). Let us consider
deformations that occur very very slowly. We know from Quantum Mechanics (e.g.,
Binney & Skinner 2013, §12.1) that if an external parameter (here volume) is changed
slowly in an otherwise isolated system, the system will stay in the same eigenstate (say,
α) with its energy, Eα (V ), changing slowly. This process is called adiabatic (we will learn
soon that this meaning of “adiabatic” is equivalent to the familiar thermodynamical one).
Since the system’s microstates {α} do not change in an adiabatic process, neither do
their probabilities {pα}. The corresponding change in the mean energy is then
$$dU_{\rm ad} = \left( \frac{\partial U}{\partial V} \right)_{p_1,\ldots,p_\Omega} dV = \sum_\alpha p_\alpha\, \frac{\partial E_\alpha}{\partial V}\, dV. \qquad (7.11)$$
But a slow change of energy in a system due exclusively to a change in its volume can be
related to the work done on the system by whatever force is applied to effect the change.
This work is, of course, equal to minus the work done by the system against that force:
dUad = dWad = −P dV, (7.12)
and so we may define pressure as
$$P = -\sum_\alpha p_\alpha\, \frac{\partial E_\alpha}{\partial V} = -\left\langle \frac{\partial E_\alpha}{\partial V} \right\rangle. \qquad (7.13)$$
Thus, if we know {pα } and {Eα } (the latter as functions of V or other external
parameters), then we can calculate pressure and/or its non-P V analogs.
It is clear that we cannot make any progress calculating {Eα } without specifying what
our system is made of and how it is constituted. So the determination of the energies
is a job for the microphysical (in general, quantum) theory. Normally, an exact solution will only be possible for simple models (like the ideal gas). The amazing thing, however, is that in equilibrium, we will be able to determine {pα} as functions of {Eα} in a completely general way—without having to solve an Ω-dimensional Schrödinger equation for our system (which would clearly be a hopeless quest).
NB: When I say “determine {pα},” what I really mean is find a set of probabilities {pα} such that upon their insertion into averages such as (7.10) or (7.13), correct (experimentally verifiable) macroscopic quantities will be obtained. This does not mean
that these probabilities will literally be solutions of the Schrödinger equation for our
system (many different sets of probabilities give the same averages, so, e.g., getting the
correct mean energy does not imply—or, indeed, require—that the true probabilities
be used).
To learn how to determine these {pα }, we will make a philosophical leap and learn to
calculate things not on the basis of what we know, but on the basis of what we don’t
know!
8. Principle of Maximum Entropy
Clearly, any particular measured value of U will be consistent with lots of different microstates, so knowing U, while not generally consistent with equal probabilities [pα = 1/Ω, Eq. (8.1)], will not constrain the values of pα’s very strongly: indeed, there are Ω ≫ 1 pα’s and only one equation [(8.2), requiring that the pα’s reproduce the measured U] that they must satisfy (plus the normalisation $\sum_\alpha p_\alpha = 1$). We may be able to measure other quantities and so have more information
in the form of equations like Eq. (8.2), but it is clear that the amount of information we
are ever likely to have (or want) falls hugely short of uniquely fixing every pα . This is
good: it means that we do not need to know these probabilities well—just well enough
to recover our measurable quantities.
In order to make progress we must find a way of assigning values to {pα } systematically,
34 Adopting the view of probabilities as likelihoods—as opposed to frequencies—with which the
system is supposed to visit those microstates (“gambler’s statistics,” rather than “accountant’s
statistics”) is a controversial move, which will be further discussed in §12.2.
taking into account strictly the information we have and nothing more. We shall adopt
the following algorithm (Jaynes 2003, §11.4).
We have Ω microstates and need to assign probabilities p1, . . . , pΩ to them, subject to $\sum_\alpha p_\alpha = 1$ and whatever constraints are imposed by our information.
Let us distribute N “probability quanta,” each worth 1/N of probability (N ≫ 1), randomly between the Ω microstates; if microstate α receives Nα quanta, we assign it the probability
$$p_\alpha = \frac{N_\alpha}{N}. \qquad (8.3)$$
What is the most likely outcome of this game? The number of ways W in which an assignment (8.3) can be obtained is the number of ways of choosing N1, . . . , NΩ quanta out of a set of N, viz.,
$$W = \frac{N!}{N_1!\cdots N_\Omega!}. \qquad (8.4)$$
All outcomes are equiprobable, so the most likely assignment {Nα} is the one that maximises W subject to the constraints imposed by the available information.35
Note that we were at liberty to choose N as large as we liked and so we may assume that all Nα ≫ 1 and use Stirling’s formula to evaluate factorials:
$$\ln N! = N\ln N - N + O(\ln N). \qquad (8.5)$$
Then, using also $\sum_\alpha N_\alpha = N$,
$$\ln W = N\ln N - N + O(\ln N) - \sum_\alpha \left[ N_\alpha \ln N_\alpha - N_\alpha + O(\ln N_\alpha) \right] = -\sum_\alpha N_\alpha \ln\frac{N_\alpha}{N} + O(\ln N) = -N\left[ \sum_\alpha p_\alpha \ln p_\alpha + O\!\left( \frac{\ln N}{N} \right) \right]. \qquad (8.6)$$
35 It is possible to prove that this maximum is very sharp for large N (for a simple case of Ω = 2, this is done in Exercise 8.1; for the more general case, see Schrödinger 1990).
The quantity
$$S_G(p_1,\ldots,p_\Omega) \equiv -\sum_\alpha p_\alpha \ln p_\alpha \qquad (8.7)$$
is called the Gibbs entropy, or, in the context of information theory, the Shannon entropy (the “amount of ignorance” associated with the set of probabilities {pα}).
8.1.4. Some properties of Gibbs–Shannon Entropy
1) Maximising W is the same as maximising SG, so the role of this quantity is that the “fairest” assignment of probabilities {pα} subject to some information will correspond to the maximum of SG subject to the constraints imposed by that information.
2) Since 0 < pα ≤ 1, SG ≥ 0 always. Note that pα > 0 because pα = 0 would mean that α is not an allowed state of the system; pα = 1 means that there is only one state that the system can be in, so it must be in it and then SG = 0—we have perfect knowledge ⇔ zero ignorance.
3) Entropy is additive: essentially, when two systems are put together, the entropy of the composite system is the sum of the entropies of its two parts. This will be discussed carefully in §10.1.
4) What is the maximum possible value of SG? The number of all possible distributions of N probability quanta over Ω microstates is ΩN, which is, therefore, the maximum value that W can take:36
$$W_{\max} = \Omega^N. \qquad (8.8)$$
Then the maximum possible value of SG is
$$S_{G,\max} = \frac{1}{N}\,\ln W_{\max} = \ln\Omega. \qquad (8.9)$$
This value is attained when our ignorance about the system is total, which means that all microstates are, as far as we are concerned, equiprobable:
$$p_\alpha = \frac{1}{\Omega} \quad\Rightarrow\quad S_G = -\sum_\alpha \frac{1}{\Omega}\,\ln\frac{1}{\Omega} = \ln\Omega = S_{G,\max}. \qquad (8.10)$$
In this context, the Shannon (1948) definition of the information content of a probability distribution is
$$I(p_1,\ldots,p_\Omega) = S_{G,\max} - S_G(p_1,\ldots,p_\Omega) = \ln\Omega + \sum_\alpha p_\alpha \ln p_\alpha. \qquad (8.11)$$
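The relation ln W ≈ N·S_G and the bound S_G ≤ ln Ω are easy to illustrate numerically (an added sketch, not from the notes; the exact ln W of Eq. (8.4) is computed via log-gamma functions):

import numpy as np
from math import lgamma

rng = np.random.default_rng(2)
Omega, N = 10, 10**6

p = rng.random(Omega); p /= p.sum()       # a random distribution {p_alpha}
SG = -np.sum(p * np.log(p))               # Gibbs-Shannon entropy, Eq. (8.7)

Na = np.round(p * N).astype(int); Na[0] += N - Na.sum()  # quanta, summing to N
lnW = lgamma(N + 1) - sum(lgamma(x + 1) for x in Na)     # exact ln of Eq. (8.4)

print(SG, lnW / N)            # nearly equal, as per Eq. (8.6)
print(SG <= np.log(Omega))    # True: Eqs. (8.9)-(8.10)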
Exercise 8.1. Tossing a Coin. This example illustrates the scheme for assignment of a priori
probabilities to microstates discussed in §8.1.3.
Suppose we have a system that only has two states, α = 1, 2, and no further information about it is available. We shall assign probabilities to these states in a fair and balanced way: by flipping a coin N ≫ 1 times, recording the number of heads N1 and tails N2 and declaring that the probabilities of the two states are p1 = N1/N and p2 = N2/N.
36 At finite N, this is not a sharp bound for (8.4), but it gets sharper for N ≫ 1.
a) Calculate the number of ways, W , in which a given outcome {N1 , N2 } can happen, find
its maximum and prove therefore that the most likely assignment of probabilities will be p1 =
p2 = 1/2. What is the Gibbs entropy of this system?
b) Show that for a large number of coin tosses, this maximum is sharp. Namely, show that the number of ways W(m) in which you can get an outcome with N/2 ± m heads (where N ≫ m ≫ 1) is
$$\frac{W(m)}{W(0)} \approx \exp\!\left( -\frac{2m^2}{N} \right). \qquad (8.12)$$
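A quick numerical check of Eq. (8.12) (an added sketch; exact factorials via log-gamma):

from math import lgamma, exp

N = 10**6
lnW = lambda k: lgamma(N + 1) - lgamma(N // 2 + k + 1) - lgamma(N // 2 - k + 1)
for m in (100, 300, 1000):
    print(m, exp(lnW(m) - lnW(0)), exp(-2 * m**2 / N))   # the two columns agree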
8.1.5. Shannon’s Theorem
In §8.1.3, I argued that, in order to achieve the “fairest” and most unbiased assignment of probabilities pα to microstates α, one must maximise the function
$$S_G(p_1,\ldots,p_\Omega) = -\sum_\alpha p_\alpha \ln p_\alpha \qquad (8.13)$$
(called Gibbs entropy, Shannon entropy, “information entropy,” measure of uncertainty, etc.).
I did this by presenting a reasonable and practical scheme for assigning probabilities, which I
asked you to agree was the fairest imaginable. In the spirit of formalistic nit-picking, you might
be tempted to ask whether the function (8.13) is in any sense unique—could we have invented
other “fair games” leading to different definitions of entropy? Here is an argument that addresses
this question.
Faced with some set of probabilities {pα } (“a distribution”), let us seek to define a function
H(p1 , . . . , pΩ ) that would measure the uncertainty associated with this distribution. In order
to be a suitable such measure, H must satisfy certain basic properties:
1) H should be a continuous function of its arguments p1, . . . , pΩ;
2) H should be symmetric with respect to permutations of {pα} (i.e., it should not matter in what order we list the microstates);
3) for any set of probabilities {pα} that are not all equal,
$$H(p_1,\ldots,p_\Omega) < H\!\left( \frac{1}{\Omega},\ldots,\frac{1}{\Omega} \right) \equiv H_\Omega; \qquad (8.14)$$
4) HΩ should be a monotonically increasing function of Ω (more available microstates, more uncertainty);
5) H should be additive and independent of how we count the microstates, in the following sense. If the choice of a microstate is broken down into two successive choices—first a subgroup, then the individual state—the total H should be a weighted sum of individual values of H associated with each subgroup. Namely, split the microstates into M groups,
$$\alpha = \underbrace{1,\ldots,m_1}_{\text{group } i=1},\ \underbrace{m_1+1,\ldots,m_1+m_2}_{\text{group } i=2},\ \ldots,\ \underbrace{\sum_{i=1}^{M-1} m_i + 1,\ldots,\sum_{i=1}^{M} m_i = \Omega}_{\text{group } i=M}, \qquad (8.15)$$
with probabilities w1, w2, . . . , wM assigned to the groups. Clearly, wi is the sum of pα’s for the states that are in the group i. Within each group, we can assign conditional probabilities to all microstates in that group, viz., the probability for the system to be in microstate α within group i if it is given that the system is in one of the microstates in that group, is
$$p_\alpha^{(i)} = \frac{p_\alpha}{w_i}. \qquad (8.16)$$
We then want H to satisfy
$$\underbrace{H(p_1,\ldots,p_\Omega)}_{\text{total uncertainty}} = \underbrace{H(w_1,\ldots,w_M)}_{\substack{\text{uncertainty in the}\\ \text{distribution of groups}}} + \underbrace{w_1 H(p_1^{(1)},\ldots,p_{m_1}^{(1)})}_{\substack{\text{uncertainty}\\ \text{within group 1}}} + \underbrace{w_2 H(p_{m_1+1}^{(2)},\ldots,p_{m_1+m_2}^{(2)})}_{\substack{\text{uncertainty}\\ \text{within group 2}}} + \ldots$$
$$= H(w_1,\ldots,w_M) + w_1 H\!\left( \frac{p_1}{w_1},\ldots,\frac{p_{m_1}}{w_1} \right) + w_2 H\!\left( \frac{p_{m_1+1}}{w_2},\ldots,\frac{p_{m_1+m_2}}{w_2} \right) + \ldots \qquad (8.17)$$
Theorem. The only function H with these properties is
$$H(p_1,\ldots,p_\Omega) = -k \sum_\alpha p_\alpha \ln p_\alpha, \qquad (8.18)$$
where k is a positive constant.
Proof (sketch). Let f(n) ≡ Hn for equiprobable distributions; the first part of the proof [Eqs. (8.19)–(8.25), using the grouping property (8.17)] establishes that f(nr) = rf(n). For any integers m, n > 1, we can always find integers r and (an arbitrarily large) s such that
$$\frac{r}{s} < \frac{\ln m}{\ln n} < \frac{r+1}{s} \quad\Rightarrow\quad n^r < m^s < n^{r+1}. \qquad (8.26)$$
As f is a monotonically increasing function,
$$f(n^r) < f(m^s) < f(n^{r+1}). \qquad (8.27)$$
But Eq. (8.24) implies f(nr) = rf(n), so the above inequality becomes
$$r f(n) < s f(m) < (r+1) f(n) \quad\Rightarrow\quad \frac{r}{s} < \frac{f(m)}{f(n)} < \frac{r+1}{s}. \qquad (8.28)$$
The inequalities (8.26) and (8.28) imply
$$\left| \frac{f(m)}{f(n)} - \frac{\ln m}{\ln n} \right| < \frac{1}{s} \quad\Rightarrow\quad \left| \frac{f(m)}{\ln m} - \frac{f(n)}{\ln n} \right| < \frac{f(n)}{s\ln m} \to 0 \quad \text{as } s \to \infty. \qquad (8.29)$$
Hence f(m)/ln m is the same constant, k > 0, for all m, i.e., f(m) = k ln m; the grouping property (8.17) and continuity then extend Eq. (8.18) from equiprobable to general distributions.
8.2. Method of Lagrange Multipliers
This, then, is the method for conditionally maximising (extremising) a function subject to a constraint: add to it the constraint multiplied by −λ and maximise the resulting function unconditionally, with respect to the original variables and λ. The additional variable λ is called the Lagrange multiplier.
The method is easily generalised to the case of several constraints: suppose, instead of one constraint (8.34), we have m of them:
$$F_i(p_1,\ldots,p_\Omega) = 0, \qquad i = 1,\ldots,m. \qquad (8.44)$$
To maximise SG subject to these, introduce m Lagrange multipliers λ1, . . . , λm and maximise unconditionally
$$S_G - \sum_i \lambda_i F_i \to \max. \qquad (8.45)$$
8.3. Test of the Method: Isolated System
Let us test this machinery on a system about which we know nothing except the normalisation of its probabilities, $\sum_\alpha p_\alpha = 1$ [Eq. (8.46)]: viz., we maximise unconditionally $S_G - \lambda\left( \sum_\alpha p_\alpha - 1 \right)$ [Eq. (8.47)]. This gives
$$dS_G - \lambda \sum_\alpha dp_\alpha - \left( \sum_\alpha p_\alpha - 1 \right) d\lambda = 0. \qquad (8.48)$$
Since $dS_G = -\sum_\alpha (\ln p_\alpha + 1)\,dp_\alpha$, setting the coefficient in front of each dpα to zero gives ln pα = −(1 + λ), whence
$$p_\alpha = e^{-(1+\lambda)}. \qquad (8.51)$$
Setting also the coefficient in front of dλ to zero (this is just the constraint (8.46)), we find
$$\sum_\alpha e^{-(1+\lambda)} = \Omega\, e^{-(1+\lambda)} = 1 \quad\Rightarrow\quad e^{-(1+\lambda)} = \frac{1}{\Omega}. \qquad (8.52)$$
Thus, we recover the equal-probabilities distribution (8.1), with SG for this distribution taking the maximum possible value [Eq. (8.10)]:
$$p_\alpha = \frac{1}{\Omega}, \qquad S_G = \ln\Omega, \qquad (8.53)$$
the state of maximum ignorance. Our method works.
9. Canonical Ensemble
9.1. Gibbs Distribution
We are now going to implement the programme of deriving the probability distribution
resulting from maximising entropy subject to a single physical constraint: a fixed value
of mean energy,
$$\sum_\alpha p_\alpha E_\alpha = U. \qquad (9.1)$$
The set of realisations of a system described by this probability distribution is called the
canonical ensemble, introduced by J. W. Gibbs (1839–1903), a great American physicist
whose name will loom large in everything that follows. Constraints other than (or in
addition to) (9.1) will define different ensembles, some of which will be discussed later
(see §14 and Exercise 14.6).
As explained in §8.2, in order to find {pα}, we must maximise $S_G = -\sum_\alpha p_\alpha \ln p_\alpha$ subject to the constraint (9.1) and to $\sum_\alpha p_\alpha = 1$ [Eq. (8.46)]. This means that we need two Lagrange multipliers, which we will call λ and β, and an unconditional maximum
$$S_G - \lambda\left( \sum_\alpha p_\alpha - 1 \right) - \beta\left( \sum_\alpha p_\alpha E_\alpha - U \right) \to \max. \qquad (9.2)$$
Carrying out this maximisation in the same way as in §8.3 [Eqs. (9.3)–(9.7)], and denoting by $Z(\beta) = \sum_\alpha e^{-\beta E_\alpha}$ the normalising factor (the partition function39), we find
$$p_\alpha = \frac{e^{-\beta E_\alpha}}{Z(\beta)}, \qquad (9.8)$$
39 Note the upcoming physical interpretation of the partition function as the number of microstates effectively available to the system at a given temperature (see §11.8).
known as the Gibbs (canonical) distribution. Finally, the second Lagrange multiplier β is
found from the constraint (9.1):
$$\sum_\alpha p_\alpha E_\alpha = \frac{1}{Z(\beta)} \sum_\alpha E_\alpha\, e^{-\beta E_\alpha} = -\frac{\partial \ln Z}{\partial \beta} = U. \qquad (9.9)$$
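In practice, Eq. (9.9) is an implicit equation for β given a measured U. The sketch below (an added illustration with a made-up five-level spectrum) solves it by root finding and confirms that the resulting Gibbs distribution reproduces the required mean energy:

import numpy as np
from scipy.optimize import brentq

E = np.array([0.0, 0.3, 1.0, 1.7, 2.5])   # made-up energy levels
U = 0.8                                    # target mean energy

def mean_energy(beta):
    w = np.exp(-beta * (E - E.min()))      # shifted for numerical stability
    return (E * w).sum() / w.sum()

beta = brentq(lambda b: mean_energy(b) - U, -50.0, 50.0)   # solve Eq. (9.9)
p = np.exp(-beta * E); p /= p.sum()        # Gibbs distribution, Eq. (9.8)
print(beta, (p * E).sum())                 # mean energy comes out equal to U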
I am going to show you that we have solved the problem posed in §7: how to work out
all thermodynamically relevant quantities (in particular, free energy) and relationships
from just knowing the energy levels {Eα } of a given system. To do this, we first need to
establish what β means and then how to calculate the thermodynamical entropy S and
pressure P .
The Gibbs entropy in the equilibrium given by the Gibbs distribution (9.8) is
$$S_G = -\sum_\alpha p_\alpha \ln p_\alpha = -\sum_\alpha p_\alpha \left( -\beta E_\alpha - \ln Z \right) = \beta U + \ln Z. \qquad (9.10)$$
Therefore, in equilibrium,40
$$dS_G = \beta\,dU + U\,d\beta + \frac{dZ}{Z} = \beta\,dU + U\,d\beta + \sum_\alpha \underbrace{\frac{e^{-\beta E_\alpha}}{Z}}_{=\,p_\alpha} \left( -\beta\,dE_\alpha - E_\alpha\,d\beta \right) = \beta\left( dU - \sum_\alpha p_\alpha\,dE_\alpha \right). \qquad (9.11)$$
Since Eα = Eα (V ) (we will hold N to be unchangeable for now), dEα = (∂Eα /∂V )dV .
Recalling Eq. (7.13), we then identify the second term inside the bracket in Eq. (9.11) as
P dV , so
dSG = β(dU + P dV ) = βdQrev , (9.12)
where dQrev = dU − dWad is the definition of reversible heat, the difference between the
change in internal energy and the adiabatic work dWad = −P dV done on the system. The
left-hand side of Eq. (9.12) is a full differential of SG , which is clearly a function of state.
40 Here the differential of SG is between different equilibrium states, i.e., we vary external
parameters and constraints, viz., V and U —not the probability distribution, as we did in
Eq. (9.3) in order to find the equilibrium state. The SG that we vary here, given by Eq. (9.10),
is already the maximum SG (for any given V , U ) that we found in §9.1.
So we have found that β is an integrating factor of heat in thermal equilibrium—Kelvin’s
definition of (inverse) thermodynamical temperature!
Thus, it must be the case that
$$\beta = \frac{1}{k_B T}, \qquad (9.13)$$
i.e., 1/β differs from the thermodynamical temperature at most by a constant factor,
which we choose to be the Boltzmann constant simply to convert from energy units (β
multiplies Eα in the exponentials, so its units are inverse energy) to degrees Kelvin, a
historical (in)convenience. Then Eq. (9.12) immediately implies the relationship between
the thermodynamical entropy S and the Gibbs–Shannon entropy SG :
S = kB SG . (9.14)
(see §§9.3 and 9.4 for a more formal proof of these results).
With Eqs. (9.13) and (9.14), Eq. (9.12) turns into the familiar fundamental equation
of thermodynamics:
T dS = dU + P dV . (9.15)
We are done: introducing as usual the free energy
F = U − T S, (9.16)
we can calculate everything (see §7.1): equation of state, entropy, energy, etc.:
$$P = -\left( \frac{\partial F}{\partial V} \right)_T, \qquad S = -\left( \frac{\partial F}{\partial T} \right)_V, \qquad U = F + TS, \ \ldots \qquad (9.17)$$
The progress we have made is that we now know the explicit expression for F in terms of the energy levels of the system: namely, combining Eqs. (9.10), (9.13) and (9.14), we get
$$\frac{S}{k_B} = \frac{U}{k_B T} + \ln Z, \qquad (9.18)$$
whence, via Eq. (9.16),
$$F = -k_B T \ln Z, \qquad \text{where} \qquad Z = \sum_\alpha e^{-E_\alpha/k_B T}. \qquad (9.19)$$
This means, by the way, that if we know the partition function, we know about the system
everything that is needed to describe its equilibrium thermodynamics.
Note that from Eq. (9.19) follows a nice way to write the Gibbs distribution (9.8):
$$Z = e^{-\beta F} \quad\Rightarrow\quad p_\alpha = \frac{e^{-\beta E_\alpha}}{Z} = e^{\beta(F - E_\alpha)}. \qquad (9.20)$$
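For instance (an added sketch; a two-level system with level spacing 1, in units kB = 1), all of Eq. (9.17) really does follow from Z alone:

import numpy as np

def thermo(T):
    Z = 1.0 + np.exp(-1.0 / T)     # Z for levels {0, 1}, as in Eq. (9.19)
    F = -T * np.log(Z)             # free energy, Eq. (9.19)
    U = np.exp(-1.0 / T) / Z       # U = -d(ln Z)/d(beta), done analytically
    S = (U - F) / T                # from F = U - TS, Eq. (9.16)
    return F, U, S

T, dT = 0.7, 1e-6
F, U, S = thermo(T)
F2, _, _ = thermo(T + dT)
print(S, -(F2 - F) / dT)           # S = -dF/dT, Eq. (9.17): the two agree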
9.3. Some Mathematical Niceties
If you thought the derivation of Eqs. (9.13) and (9.14) in §9.2 was a little cavalier, mathematically, here is a more formal proof.
We had derived, using only the principle of maximum entropy, Eq. (8.7) (Gibbs–Shannon
entropy, which at that point had nothing to do with the thermodynamic entropy, heat engines
or any of that), and the definition of pressure, Eq. (7.13), that [Eq. (9.12)]
dSG = βdQrev . (9.21)
From Thermodynamics, we knew the thermodynamic entropy S, thermodynamic temperature
T and the reversible heat to be related by
$$dS = \frac{1}{T}\,dQ_{\rm rev}. \qquad (9.22)$$
Therefore,
$$dS = \frac{1}{\beta T}\,dS_G. \qquad (9.23)$$
Since the left-hand side of this equation is a full differential, so is the right-hand side. Therefore, 1/βT is a function of SG only:
$$\frac{1}{\beta T} = f(S_G) \quad\Rightarrow\quad dS = f(S_G)\,dS_G \quad\Rightarrow\quad S = \varphi(S_G), \qquad (9.24)$$
where φ is some function. Now apply this to two independent systems and to their composite, for which both S = S1 + S2 and SG = SG,1 + SG,2 are additive [Eqs. (9.25)–(9.28)]. Therefore,
$$\varphi_1'(S_{G,1}) = \varphi_2'(S_{G,2}) = \text{const} \equiv k_B \qquad (9.29)$$
(“separation constant”), giving
$$\varphi'(S_G) = f(S_G) = k_B \quad\Rightarrow\quad \frac{1}{\beta T} = k_B \quad\Rightarrow\quad \beta = \frac{1}{k_B T}, \qquad (9.30)$$
the desired Eq. (9.13), q.e.d. This implies, finally [see Eq. (9.24)],
$$S = k_B S_G + \text{const}. \qquad (9.31)$$
Setting const = 0 gives Eq. (9.14), q.e.d. It remains to discuss this choice of the integration constant, which has a physical meaning.
9.4. Third Law
Consider what happens to this quantity in the limit T → 0, or β → ∞. Suppose the lowest energy level is E1 and the lowest m microstates have this energy, viz., E1 = · · · = Em < Em+1 ≤ · · · . Then, as β → ∞, the Gibbs weights (9.8) of all higher-energy states become exponentially small compared to those of the first m states, so pα → 1/m for α ≤ m and pα → 0 otherwise, whence
$$S_G \to \ln m \quad \text{as} \quad T \to 0, \qquad (9.35)$$
where m is the degeneracy of the lowest energy level. Physically, this makes sense: at zero
temperature, the system will be in one of its m available lowest-energy states, all of which have
equal probability.
Setting const = 0 in Eq. (9.31) means that also the thermodynamic entropy
S → kB ln m as T → 0. (9.36)
Recall that the 3-rd Law of Thermodynamics said that S → 0 as T → 0. This is not a
contradiction because kB ln m is very small compared to typical values that S can have: indeed,
since S is additive, it will generally be proportional to the number of particles in the system,
S ∝ kB N (see §11.9), whereas obviously ln m ≪ N except for very strange systems. Thus, the
choice const = 0 in Eq. (9.31) is basically the statement of the 3-rd Law. You will find further
discussion of this topic in Chapter III of Schrödinger (1990).
NB: In any event, these details do not matter very much because what is important is that
the constant in Eq. (9.31) is a constant, independent of the parameters of the system, so all
entropy differences are independent of it—and related via kB when expressed in terms of S
and SG .
Exercise 9.1. Elastic Chain. A very simplistic model of an elastic chain is illustrated in
Fig. 20. This is a 1D chain consisting of N segments, each of which can be in one of two (non-
degenerate) states: horizontal (along the chain) or vertical. Let the length of the segment be
a when it is horizontal and 0 when it is vertical. Let the chain be under fixed tension γ and
so let the energy of each segment be 0 when it is horizontal and γa when it is vertical. The
temperature of the chain is T .
a) What are the microstates of the chain? Using the canonical ensemble, work out the single-
segment partition function and hence the partition function of the entire chain.
b) Entropic force. Work out the relationship between mean energy U and mean length L of the
chain and hence calculate the mean length as a function of γ and T . Under what approximation
do we obtain Hooke’s law
γ = AkB T (L − L0 ) , (9.38)
where L0 and A are constants? What is the physical meaning of L0 ? Physically, why is the
tension required to stretch the chain to the mean length L greater when the temperature is
higher?
c) Calculate the heat capacity for this chain and sketch it as a function of temperature
(pay attention to what quantity is held constant for the calculation of the heat capacity). Why
physically does the heat capacity vanish both at small and large temperatures?
d) Negative temperature. If you treat the mean energy U of the chain as given and temperature
as the quantity to be found, you will find that temperature can be negative! Sketch T as a
function of U and determine under what conditions T < 0. Why is this possible in this system
and not, say, for the ideal gas? Why does the stability argument from §10.5.2 not apply here?
e) Superfluous constraints. This example illustrates that if you have more measurements and
so more constraints, you do not necessarily get different statistical mechanics (so the maximum-
entropy principle is less subjective than it might seem to be; see §12.3).
So far we have treated our chain as a canonical ensemble, i.e., we assumed that the only
constraint on probabilities would be the mean energy U . Suppose now that we have both a
thermometer and a ruler and so wish to maximise entropy subject to two constraints: the mean
energy is U and the mean length of the chain is L. Do this and find the probabilities of the
microstates α of the chain as functions of their energies Eα and corresponding chain lengths
`α . Show that the maximisation problem only has a solution when U and L are in a specific
relationship with each other—so the new constraint is not independent and does not bring in any
new physics. Show that in this case one of the Lagrange multipliers is arbitrary (and so can be
set to 0—e.g., the one corresponding to the constraint of fixed L; this constraint is superfluous
so we are back to the canonical ensemble).
f) It is obviously a limitation of our model that the energy and the length of the chain are
in one-to-one correspondence: thus, you would not be able to construct from this model the
standard thermodynamics based on tension force and chain length, with the latter changeable
independently from the energy. Invent your own model in which U and L can be varied inde-
pendently and work out its statistical mechanics (partition function) and its thermodynamics
(entropy, energy, heat capacity, Hooke’s law, etc.).42 One possibility might be to allow the
segments to have more than two states, with some states having the same energy but contributing
to the total length in a different way (or vice versa), e.g., to enable the segments to fold back
onto each other.
The tension force (9.38) is an example of an entropic force. To be precise, the entropic force
is the equal and oppositely directed counterforce with which the elastic chain responds to an
externally applied force of magnitude γ required to keep the chain at mean length L. There is no
fundamental interaction associated with this force43 —indeed this force only exists if temperature
is non-zero and results from the statistical tendency for the chain to maximise its entropy,
so the segments of the chain cannot all be in the horizontal state and the chain wants to
shrink if stretched beyond its natural tension-free equilibrium length (which is N a/2). In the
currently very fashionable language, such a force is called emergent, being a member of the
class of emergent phenomena, i.e., phenomena that result from collective behaviour of many
simple entities embedded in an environment (e.g., a heat bath setting T ; see §10.3) but have no
fundamental prototype in the individual physics of these simple entities.
Verlinde (2011) recently made a splash by proposing that gravity was not a fundamental
force but an emergent entropic one, somewhat analogous to our γ = −T ∂S/∂L, but with
entropy measuring (in a certain rather ingenious way) the information associated with positions
of material bodies in space.
10. Thermodynamic Equilibria and Stability
10.1. Additivity of Entropy
Consider two systems, 1 and 2, with microstates α and α′, probabilities $p_\alpha^{(1)}$ and $p_{\alpha'}^{(2)}$, and energy levels $E_\alpha^{(1)}$ and $E_{\alpha'}^{(2)}$. Now put them together into a composite system, but in such a way that the two constituent systems are in “loose” thermal contact, meaning that the microstates of the two systems are independent.44 Then the microstates of the composite system are
42 Exercise 14.6 is the P V analog of this calculation.
43 In our model, on the microscopic level, it costs γa amount of energy to put a link into the vertical state, thus shortening the chain. Nevertheless, a chain of N links in contact with a thermal bath will resist stretching!
44 In the language of Quantum Mechanics, the eigenstates of a composite system are products of the eigenstates of its (two) parts. This works, e.g., for gases or fluids, but not for solids, where states are fully collective. You will find further discussion of this in Binney & Skinner (2013), §6.1.
(α, α′), with energy levels $E_{\alpha\alpha'} = E_\alpha^{(1)} + E_{\alpha'}^{(2)}$ and probabilities $p_{\alpha\alpha'} = p_\alpha^{(1)}\, p_{\alpha'}^{(2)}$.
Note that in fact, in equilibrium, everything can be derived from the additivity of the energy levels: indeed, $E_{\alpha\alpha'} = E_\alpha^{(1)} + E_{\alpha'}^{(2)}$ implies that partition functions multiply: for the composite system at a single temperature (otherwise it would not be in equilibrium; see §10.2),
$$Z(\beta) = \sum_{\alpha\alpha'} e^{-\beta E_{\alpha\alpha'}} = \sum_{\alpha\alpha'} e^{-\beta\left[ E_\alpha^{(1)} + E_{\alpha'}^{(2)} \right]} = \left( \sum_\alpha e^{-\beta E_\alpha^{(1)}} \right)\left( \sum_{\alpha'} e^{-\beta E_{\alpha'}^{(2)}} \right) = Z_1(\beta)\, Z_2(\beta). \qquad (10.4)$$
Therefore, the canonical equilibrium probabilities are
$$p_{\alpha\alpha'} = \frac{e^{-\beta E_{\alpha\alpha'}}}{Z} = \frac{e^{-\beta E_\alpha^{(1)}}}{Z_1}\,\frac{e^{-\beta E_{\alpha'}^{(2)}}}{Z_2} = p_\alpha^{(1)}\, p_{\alpha'}^{(2)} \qquad (10.5)$$
and also
$$F = -k_B T \ln Z = -k_B T \ln(Z_1 Z_2) = F_1 + F_2, \qquad (10.6)$$
$$S = -\left( \frac{\partial F}{\partial T} \right)_V = -\left( \frac{\partial F_1}{\partial T} \right)_V - \left( \frac{\partial F_2}{\partial T} \right)_V = S_1 + S_2, \qquad (10.7)$$
$$U = -\frac{\partial \ln Z}{\partial\beta} = -\frac{\partial \ln(Z_1 Z_2)}{\partial\beta} = U_1 + U_2. \qquad (10.8)$$
45 Note that if in constructing the expression for entropy we followed the formal route offered by Shannon’s Theorem (§8.1.5), this would be guaranteed automatically (requirement 5 imposed on SG in §8.1.5).
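A direct numerical check of Eqs. (10.4)–(10.8) (an added sketch with two made-up level sets): build the composite spectrum out of all pairwise sums of levels and verify that Z multiplies while U adds.

import numpy as np

beta = 1.3
E1 = np.array([0.0, 0.5, 1.1])              # made-up levels of system 1
E2 = np.array([0.0, 0.2, 0.9, 2.0])         # made-up levels of system 2
E12 = (E1[:, None] + E2[None, :]).ravel()   # composite levels E_alpha + E_alpha'

Z = lambda E: np.exp(-beta * E).sum()
U = lambda E: (E * np.exp(-beta * E)).sum() / Z(E)

print(Z(E12), Z(E1) * Z(E2))   # equal: Eq. (10.4)
print(U(E12), U(E1) + U(E2))   # equal: Eq. (10.8)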
Now let the two systems exchange energy and ask what the equilibrium state of the composite is: we hold the total energy constant and maximise entropy:
U = U1 + U2 = const, (10.9)
S = S1 + S2 → max . (10.10)
These conditions are implemented by setting the differentials of both the total energy
and the total entropy to zero while allowing changes in the energies and entropies of the
two sub-systems:
dU = dU1 + dU2 = 0 ⇒ dU2 = −dU1 , (10.11)
$$dS = dS_1 + dS_2 = \frac{\partial S_1}{\partial U_1}\,dU_1 + \frac{\partial S_2}{\partial U_2}\,dU_2 = \left( \frac{\partial S_1}{\partial U_1} - \frac{\partial S_2}{\partial U_2} \right) dU_1 = 0. \qquad (10.12)$$
From the fundamental equation of thermodynamics [Eq. (9.15)],46
$$dS = \frac{1}{T}\,dU + \frac{P}{T}\,dV, \qquad (10.13)$$
we get
$$\frac{1}{T} = \frac{\partial S}{\partial U}, \qquad (10.14)$$
so Eq. (10.12) is
$$dS = \left( \frac{1}{T_1} - \frac{1}{T_2} \right) dU_1 = 0 \quad\Rightarrow\quad T_1 = T_2. \qquad (10.15)$$
Thus, in equilibrium, two systems in loose thermal contact will have equal temperatures.
This is called thermal equilibrium.
Note also that, if initially T1 ≠ T2, the direction of change is set by dS > 0, so T1 < T2 ⇔ dU1 > 0, i.e., energy flows from hot to cold.
What we have done can be recast formally as a Lagrange multiplier calculation: we are
maximising S1 + S2 subject to U1 + U2 = U , so, unconditionally,
S1 + S2 − λ(U1 + U2 − U ) → max . (10.16)
This gives
$$\left( \frac{\partial S_1}{\partial U_1} - \lambda \right) dU_1 + \left( \frac{\partial S_2}{\partial U_2} - \lambda \right) dU_2 + (U_1 + U_2 - U)\,d\lambda = 0 \quad\Rightarrow\quad \frac{\partial S_1}{\partial U_1} = \frac{\partial S_2}{\partial U_2} = \lambda = \frac{1}{T}. \qquad (10.17)$$
NB: The validity of Eq. (10.14) does not depend on the identification of S and
T with the entropy and temperature from empirical thermodynamics, the equation
holds for the statistical-mechanical entropy (measure of uncertainty in the distribution
{pα }) and statistical-mechanical temperature (Lagrange multiplier associated with
fixed mean energy in the canonical ensemble). The above argument therefore shows
that the statistical-mechanical temperature is a sensible definition of temperature: it
is a scalar function that is the same across a composite system in equilibrium. This
property then allows one to introduce a thermometer based on this temperature and
hence a temperature scale (recall that in Thermodynamics, temperature was introduced
either via the 0-th Law, as just such a function, which, however, did not have to be
unique, or as the universal integrating factor of dQrev —Kelvin’s definition, which we
46 This equation is only valid for equilibrium states, so its use here means that we are assuming
the two subsystems and their composite all to be in equilibrium at the beginning and at the end
of this experiment.
used in §9.2 when proving the equivalence between thermodynamical and statistical-
mechanical temperatures). I am stressing this to re-emphasise the point, made in §9.5,
that Thermodynamics can be derived entirely from Statistical Mechanics.
One can explicitly construct the Gibbs distribution on this basis if one starts from a (fictional)
“closed system” with equal probabilities for all its microstates (the “microcanonical ensemble”)
and then considers a small part of it. This will be discussed in detail in §12.1.2 (or see, e.g.,
Blundell & Blundell 2009, §4.6, Landau & Lifshitz 1980, §28).
Consider a collection of (sub)systems i, each characterised by its total energy Ei, mass mi, velocity ui, centre-of-mass position ri and volume Vi.
47 To make statistical inferences about the state of a system, you can maximise entropy subject to whatever constraints you like—but you are not necessarily guaranteed to get a useful result. If you want to get some sensible physics out, you have to choose your constraints judiciously. We now see that mean energy is indeed such a judicious choice for a system in a heat bath—this is not particularly surprising, since energy is what is exchanged when systems settle in thermal equilibrium. As we shall see in §10.4, it is generally a good strategy to use conserved quantities as constraints.
We now join them all together (in “loose contact,” as explained in §10.1, so their
microstates remain independent) and allow them to exchange energy, momentum, angular
momentum and also to push on each other (“exchange volume,” but not merge). If
we now isolate them and confine them within some volume,48 the equilibrium state of
the combined system must be the state of maximum entropy subject to the following
conservation laws:
$$\sum_i E_i = E \qquad \text{total energy}, \qquad (10.19)$$
$$\sum_i m_i \boldsymbol{u}_i = \boldsymbol{p} \qquad \text{total momentum}, \qquad (10.20)$$
$$\sum_i m_i\, \boldsymbol{r}_i \times \boldsymbol{u}_i = \boldsymbol{L} \qquad \text{total angular momentum}, \qquad (10.21)$$
$$\sum_i V_i = V \qquad \text{total volume}. \qquad (10.22)$$
To maximise the total entropy $\sum_i S_i$ subject to the constraints (10.19)–(10.22), we thus consider the unconditional maximisation of
$$\sum_i S_i - \lambda\left( \sum_i E_i - E \right) - \boldsymbol{a}\cdot\left( \sum_i m_i\boldsymbol{u}_i - \boldsymbol{p} \right) - \boldsymbol{b}\cdot\left( \sum_i m_i\,\boldsymbol{r}_i\times\boldsymbol{u}_i - \boldsymbol{L} \right) - \sigma\left( \sum_i V_i - V \right), \qquad (10.23)$$
where λ, a, b and σ are Lagrange multipliers. The variables with respect to which we
must maximise this expression are {Ei , ui , Vi } (and λ, a, b and σ). We do not include
the masses {mi } in this set because we are assuming that our systems cannot exchange
matter—we will see in §14 how to handle the possibility that they might.49 We also
48 Equivalently, we can simply say that the combined system will have some total energy,
momentum, angular momentum and volume, which we expect to be able to measure.
49 However, if we allowed such an exchange, we would have to disallow something else, for example exchange of volume—otherwise, how would we define where one system ends and another begins? Cf. Exercise 14.7.
do not include the positions {r i } amongst the variables because the entropy Si cannot
depend on where the system i is—this is because Si depends only on the probabilities
of the system’s microstates {pα }, which clearly depend only on the internal workings of
the system, not on its position in space (unless there is some inhomogeneous external
potential in which this entire assemblage resides and which would then affect energy
levels—we will not consider this possibility until §14.5).
By the same token, the entropy of each subsystem can depend only on its internal
energy, not on that of its macroscopic motion, because the probabilities {pα } are, by
Galilean invariance, the same in any inertial frame. The internal energy is
$$U_i = E_i - \frac{m_i u_i^2}{2} \qquad (10.24)$$
(because the total energy Ei consists of the internal one, Ui, and the kinetic energy of the system’s macroscopic motion, mi u_i²/2). Therefore,
$$S_i = S_i(U_i, V_i) = S_i\!\left( E_i - \frac{m_i u_i^2}{2},\, V_i \right). \qquad (10.25)$$
Thus, Si depends on both Ei and ui via its internal-energy dependence.
Thus, Si depends on both Ei and ui via its internal-energy dependence.
NB: We treat Ei , not Ui , as variables with respect to which we will be maximising
entropy because only the total energy of the system is constrained by the energy
conservation law—it is perfectly fine for energy to be transferred between internal and
kinetic as the system seeks equilibrium.
Differentiating the expression (10.23) with respect to Ei, ui and Vi, and demanding that all these derivatives vanish, we find
$$\frac{\partial S_i}{\partial E_i} - \lambda = 0 \qquad \text{thermal equilibrium}, \qquad (10.26)$$
$$\frac{\partial S_i}{\partial \boldsymbol{u}_i} - m_i\left( \boldsymbol{a} + \boldsymbol{b}\times\boldsymbol{r}_i \right) = 0 \qquad \text{dynamical equilibrium}, \qquad (10.27)$$
$$\frac{\partial S_i}{\partial V_i} - \sigma = 0 \qquad \text{mechanical equilibrium}. \qquad (10.28)$$
10.4.1. Thermal Equilibrium
Using again Eq. (10.14), we find that Eq. (10.26) tells us that in equilibrium, the temperatures of all subsystems must be equal to the same Lagrange multiplier and, therefore, to each other:
$$\frac{\partial S_i}{\partial E_i} = \frac{\partial S_i}{\partial U_i} = \frac{1}{T_i} \quad\Rightarrow\quad \frac{1}{T_i} = \lambda \equiv \frac{1}{T}. \qquad (10.29)$$
This is simply the generalisation to more than two subsystems of the result already
obtained in §10.2.
But we already know that all Ti = T and, from Eq. (10.13), ∂Si/∂Vi = Pi/Ti, so Eq. (10.28) implies that in equilibrium, all pressures are equal as well:
$$\frac{P_i}{T} = \sigma \equiv \frac{P}{T} \qquad (10.31)$$
(note that for ideal gas, this Lagrange multiplier is particle density: σ = nkB ; cf.
Exercise 14.6). Physically, this says that in equilibrium, everything is in pressure balance
(otherwise volumes will expand or shrink to make it so).
The main implication of these results is that in a system in equilibrium, there cannot
be any temperature or pressure gradients or any internal macroscopic motions (velocity
gradients). Statistical Mechanics does not tell us how this is achieved, but we know from
our experience with Kinetic Theory that temperature and velocity gradients will relax
to global equilibrium via thermal diffusivity and viscosity, respectively (see §§5–6).
A few further observations are in order.
1) In practice, mechanical equilibrium (pressure balance) is often achieved faster than
the thermal and dynamical ones are, at least in incompressible systems: pressure imbal-
ances will create uncompensated macroscopic forces, which will give rise to macroscopic
motions, which will iron out pressure differences on dynamical time scales (recall the
discussion of this topic at the end of §6.4.2).
2) All the arguments above are generalised in an obvious way to non-P V systems.
3) Another type of equilibrium that we might have considered is particle equilibrium—
by allowing our subsystems to exchange particles, subject to the overall conservation
of their total number. This leads to the equalisation of the chemical potential across
all subsystems—another Lagrange multiplier, which will be introduced in §14, when we
study “open systems.” Yet further generalisation will be to phase and chemical equilibria,
discussed in §15.
4) In considering quantities other than energy as measurable constraints (momentum, angular momentum, volume), we went beyond the canonical ensemble—and indeed,
other ensembles can be constructed to handle situations where, besides energy, other
quantities are considered known: e.g., mean angular momentum (“rotational ensemble”;
see Gibbs 1902), mean volume (“pressure ensemble”; see Exercise 14.6), mean particle
number (“grand canonical ensemble”; see §14), etc. There is no ensemble based on
the momentum of translational motion: indeed, if we consider non-rotating systems,
Eq. (10.33) says that ui = u and we can always go to the frame of reference in which
u = 0 and the system is at rest.
10.5. Stability
How do we know that when we extremised S, the solution that we found was a
maximum, not a minimum (or a saddle point)? This is equivalent to asking whether
the equilibria that we found were stable. To check for stability, we need to calculate
second derivatives of the entropy.
10.5.1. Thermal Stability
From Eqs. (10.26) and (10.29),
$$\frac{\partial^2 S_i}{\partial E_i^2} = \frac{\partial}{\partial E_i}\,\frac{1}{T} = -\frac{1}{T^2}\,\frac{\partial T}{\partial E_i} = -\frac{1}{T^2 C_{Vi}} < 0 \qquad (10.34)$$
is a necessary condition for stability. Here
$$\frac{\partial E_i}{\partial T} = \frac{\partial U_i}{\partial T} = C_{Vi} \qquad (10.35)$$
is the heat capacity and so, in physics language, the inequality (10.34) is the requirement that the heat capacity should always be positive:
$$C_V > 0. \qquad (10.36)$$
That this is always so can actually be proven directly by calculating CV = ∂U/∂T from
U = −∂ ln Z/∂β and using the explicit Gibbs formula for Z.
Exercise 10.1. Heat Capacity from Canonical Ensemble. Prove the inequality (10.36)
by showing that
$$C_V = \frac{\langle \Delta E^2 \rangle}{k_B T^2}, \qquad (10.37)$$
where ⟨ΔE²⟩ is the mean square fluctuation of the system’s energy around its mean energy U.
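If you want to see Eq. (10.37) at work before proving it, here is an added sketch (a made-up spectrum, kB = 1): the heat capacity computed as a numerical dU/dT agrees with the mean square energy fluctuation.

import numpy as np

E = np.array([0.0, 0.4, 1.0, 2.3, 3.1])   # made-up energy levels

def stats(T):
    p = np.exp(-E / T); p /= p.sum()       # Gibbs distribution at temperature T
    U = (p * E).sum()
    return U, (p * (E - U)**2).sum()       # mean energy and <Delta E^2>

T, dT = 0.9, 1e-6
U1, var = stats(T)
U2, _ = stats(T + dT)
print((U2 - U1) / dT, var / T**2)          # C_V both ways: the numbers agree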
A curious example of the failure of thermal stability is the thermodynamics of black holes. A
classical Schwarzschild black hole of mass M has energy U = M c2 and a horizon whose radius
is R = 2GM/c² and area is
$$A = 4\pi R^2 = \frac{16\pi G^2 M^2}{c^4}. \qquad (10.38)$$
Hawking famously showed that such a black hole would emit radiation as if it were a black body (see §19) with temperature
$$T = \frac{\hbar c^3}{8\pi k_B G M}. \qquad (10.39)$$
If we take all this on faith and integrate dS/dU = 1/T, the entropy of a black hole turns out to be proportional to the area of its horizon:
$$S = \frac{4\pi k_B G M^2}{\hbar c} = k_B\, \frac{A}{4\ell_P^2}, \qquad \ell_P = \sqrt{\frac{G\hbar}{c^3}}, \qquad (10.40)$$
where ℓP is the Planck length.
where `P is the Planck length. This entropy accounts for the disappearance of the entropy
of objects that fall into the black hole (or indeed of any knowledge that we might have of
them), thus preventing violation of the second law of thermodynamics—even in the absence of
Hawking’s result, this would be reasonable grounds for expecting black holes to have entropy;
indeed, Bekenstein (1973) had argued that this entropy should be proportional to the area of
the horizon before Hawking discovered his radiation.
Eqs. (10.39) and (10.40) imply that, if M is increased, T goes down while S goes up and so
the heat capacity is negative. This can be interpreted to mean that a black hole is not really
in equilibrium (indeed, we know that it evaporates, even if slowly) and that a population of
black holes is an unstable system: they would merge with each other, producing ever larger but
“colder” black holes.
How to construct the statistical mechanics of a black hole remains an active research question
because we do not really know what the “microstates” are [although string theorists do have
models of these microstates from which they are able to calculate S and recover Eq. (10.40)].
I like and, therefore, recommend the paper by Gour (1999), where, with certain assumptions
about these microstates, the black hole is treated via the maximum-entropy principle starting
from the expectation of an observer being able to measure the black hole’s mass and the area
of its horizon (you can also follow the paper trail from there to various alternative schemes).
Exercise 14.8 is a somewhat vulgarised version of this paper.
Similarly, differentiating the entropy twice with respect to the velocities ui and evaluating at the equilibrium (ui = 0), one gets ∂²Si/∂u²i = −mi/T [Eq. (10.41)], so stability requires
$$T > 0. \qquad (10.42)$$
Thus, we have proven that temperature must be positive! Systems with negative temperature are unstable.
Another, more qualitative way of arguing this is as follows. The entropy of the composite system is
$$S = \sum_i S_i(U_i) = \sum_i S_i\!\left( E_i - \frac{m_i u_i^2}{2} \right). \qquad (10.43)$$
If temperature were negative,
$$\frac{\partial S_i}{\partial U_i} = \frac{1}{T} < 0, \qquad (10.44)$$
then all Si’s would be maximised by decreasing their argument as much as possible, i.e., by increasing all ui’s subject to $\sum_i m_i \boldsymbol{u}_i = 0$. This means that all the parts of the system would fly in opposite directions (the system would blow up).
NB: The prohibition on negative temperatures can be relaxed if bits of the system are
not allowed to move and/or if the system’s allowed range of energies is bounded (see
Exercise 9.1).50
Note that a similar argument can be made for the positivity of pressure: if pressure is negative,
$$P = T\left( \frac{\partial S}{\partial V} \right)_U < 0, \qquad (10.45)$$
then entropy in a (closed) system can increase if volume goes down, i.e., the system will
50 Note, however, a recent objection to the idea of negative temperatures: Dunkel & Hilbert
(2014). This paper also has all the relevant references on the subject; note that what they call
“Gibbs entropy” is not the same thing as our Gibbs–Shannon entropy. If you are going to explore
this literature, you may want to read §§12 and 13 first.
shrink to nothing. In contrast, if P > 0, then entropy increases as V increases (system
expands)—but this is checked by walls or whatever external circumstances maintain the
fixed total volume. This argument militates strongly against negative pressures, but it
is not, in fact, completely prohibitive: negative pressures can exist (although usually in
metastable states, to be discussed in Part VII)—this happens, for example, when cavities
form or substances separate from walls, etc.
11. Statistical Mechanics of Classical Monatomic Ideal Gas
We now apply all this machinery to the classical monatomic ideal gas, whose partition function is
$$Z = \sum_\alpha e^{-\beta E_\alpha}, \qquad (11.1)$$
where {Eα} are the energy levels of our gas—i.e., of N non-interacting particles in a
box of volume V —corresponding to all possible states {α} in which these particles can
collectively find themselves. Thus, in order to compute Z, we must start by working out
what are {α} and {Eα }.
Suppose first that we label the microstates by the wave numbers of the individual particles, α = {k1, . . . , kN}, with Eα = εk1 + · · · + εkN [Eq. (11.5)]. This counting scheme will turn out to be very wrong, but let us explore where it leads—we will learn some useful things and later fix it without much extra work.
Under this scheme, the partition function is
$$Z = \sum_{\{k_1,\ldots,k_N\}} e^{-\beta(\varepsilon_{k_1} + \cdots + \varepsilon_{k_N})} = \left( \sum_k e^{-\beta\varepsilon_k} \right)^{\!N} = Z_1^N, \qquad (11.6)$$
where Z1 is the single-particle partition function. So, if we can calculate Z1 , we are done.
In a large box, the sum over the discrete wave numbers can be approximated by an integral:
$$Z_1 = \sum_k e^{-\beta\hbar^2 k^2/2m} \approx \frac{V}{(2\pi)^3} \int d^3k\, e^{-\beta\hbar^2 k^2/2m}, \qquad (11.7)$$
where ∆kx,y,z = 2π/Lx,y,z are the spacings between discrete points in the “grid” in k space [see Eq. (11.3)] and V = Lx Ly Lz. The continuous approximation is good as long as the typical scale of variation of k in the integrand is much larger than the k-grid spacing:
$$k \sim \sqrt{\frac{2m}{\beta\hbar^2}} = \frac{\sqrt{2m k_B T}}{\hbar} \gg \Delta k_{x,y,z} = \frac{2\pi}{L_{x,y,z}} \sim \frac{2\pi}{V^{1/3}} \quad\Leftrightarrow\quad T \gg \frac{\hbar^2}{m k_B V^{2/3}} = \frac{\hbar^2 n^{2/3}}{m k_B N^{2/3}} = \frac{T_{\rm deg}}{N^{2/3}}, \qquad (11.8)$$
where Tdeg is the degeneration temperature—the lower limit to the temperatures at which the classical approximation can be used, given by Eq. (2.29). The condition (11.8) is easily satisfied, of course, because T ≫ Tdeg and N ≫ 1.
The triple Gaussian integral in Eq. (11.7) is instantly calculable:
$$Z_1 = \frac{V}{(2\pi)^3}\left[ \int dk_x\, e^{-\beta\hbar^2 k_x^2/2m} \right]^3 = \frac{V}{(2\pi)^3}\left( \frac{2m\pi}{\beta\hbar^2} \right)^{3/2} = V\left( \frac{m k_B T}{2\pi\hbar^2} \right)^{3/2} \equiv \frac{V}{\lambda_{\rm th}^3}, \qquad (11.9)$$
where we have introduced the thermal wavelength
$$\lambda_{\rm th} = \hbar\sqrt{\frac{2\pi}{m k_B T}}, \qquad (11.10)$$
a quantity that is obviously (dimensionally) convenient here, will continue to prove convenient further on and will acquire a modicum of physical meaning in Eq. (11.27).
11.4. Digression: Density of States
When we calculate partition functions based on the canonical distribution, only microstates
with different energies give different contributions to the sum over states [Eq. (11.1)], whereas
microstates whose energies are the same (“degenerate” energy levels) all have the same probabil-
ities and so contribute similarly. Therefore, we can write Z as a weighted integral over energies
or over some variable that is in one-to-one correspondence with energy—in the case of energy
levels of the ideal gas, k = |k| [Eq. (11.4)]. In this context, there arises the quantity called the
density of states—the number of microstates per k, or per ε.
For the classical monatomic ideal gas, we can determine this quantity by transforming the integration in Eq. (11.7) to polar coordinates and integrating out the angles in k space:
$$Z_1 = \frac{V}{(2\pi)^3}\int_0^\infty dk\, 4\pi k^2\, e^{-\beta\hbar^2 k^2/2m} \equiv \int_0^\infty dk\, g(k)\, e^{-\beta\hbar^2 k^2/2m}, \qquad g(k) = \frac{V k^2}{2\pi^2}, \qquad (11.11)$$
where g(k) is the density of states (per k). The fact that g(k) grows with k says that energy levels are increasingly more degenerate as k goes up (the number of states in a spherical shell of width dk in k space, g(k)dk, goes up).
Similarly, transforming the integration variable in Eq. (11.11) to ε = ℏ²k²/2m, we can write
$$Z_1 = \int_0^\infty d\varepsilon\, g(\varepsilon)\, e^{-\beta\varepsilon}, \qquad g(\varepsilon) = \frac{2}{\sqrt{\pi}}\,\frac{V}{\lambda_{\rm th}^3}\,\frac{\sqrt{\varepsilon}}{(k_B T)^{3/2}}, \qquad (11.12)$$
where g(ε) is the density of states per ε (not the same function as g(k), despite, somewhat sloppily, being denoted by the same letter).
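Consistency of Eq. (11.12) with Eq. (11.9) reduces, in the variable x = ε/kBT, to the identity (2/√π)∫₀^∞ dx √x e^(−x) = 1, checked below (an added one-liner):

import numpy as np
from scipy.integrate import quad

I, _ = quad(lambda x: np.sqrt(x) * np.exp(-x), 0, np.inf)
print(2 / np.sqrt(np.pi) * I)   # 1.0: so Z_1 = V/lambda_th^3, as in Eq. (11.9)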
Note that the functional form of g(k) or g(ε) depends on the dimension of space.
Exercise 11.1. Density of States in d Dimensions. a) Calculate g(k) and g(ε) for a classical
monatomic ideal gas in d dimensions (also do the d = 1 and d = 2 cases separately and check
that your general formula reduces to the right expressions in 1D, 2D and 3D).
b) Do the same calculation for an ultrarelativistic (i.e., ε ≫ mc²) monatomic ideal gas.
11.6. Gibbs Paradox
Consider a box divided by a partition into two equal volumes, one filled with gas 1, the other with gas 2, N particles of each. Remove the partition and let the gases mix (Fig. 22). Each gas expands into vacuum (Joule expansion), so each picks up kB N ln 2 of entropy and so52
$$\Delta S = 2 k_B N \ln 2. \qquad (11.16)$$
This is certainly true if the two gases are different. If, on the other hand, the two gases
are the same, surely we must have
∆S = 0, (11.17)
because, if we reinserted the partition, we would be back to status quo ante! This
inconsistency is called the Gibbs Paradox.
As often happens, realising there is a paradox helps resolve it.
11.7. Distinguishability
It is now clear where the problem came from: when we counted the states of the
system (§11.2), we distinguished between individual particles: e.g., swapping momenta
[ki and kj, assuming ki ≠ kj, in Eq. (11.5)] between two particles would give a different
microstate in our accounting scheme. In the Gibbs set up in §11.6, we got the spurious
entropy increase after mixing identical gases by moving “individual” particles from one
chamber to another.
In Quantum Mechanics, this problem does not arise because particles are in fact
indistinguishable (interchanging them amounts to permuting the arguments of some big
symmetric wave-function amplitude). One way of explaining this intuitively is to say
that distinguishing particles amounts to pointing at them: “this one” or “that one,” i.e.,
identifying their positions. But since their momenta are definite, their positions are in fact
completely undeterminable, by the uncertainty principle: they are just waves in a box!53
In Part IV, you will see that in systems where individual particles are distinguishable,
they are often fixed in some spatial positions (e.g., magnetisable spins in a lattice).
Thus, the microstates of a gas in a box should be designated not by lists of momenta of individual particles [Eq. (11.5)], but by
$$\alpha = \{ n_{k_1}, n_{k_2}, n_{k_3}, \ldots \}, \qquad \sum_k n_k = N, \qquad (11.18)$$
where nki are occupation numbers of the single-particle microstates: nk1 particles with
52 Another way to derive this result is by arguing (pretending these are classical particles) that after the partition is removed, there is additional uncertainty for each particle as to whether it ends up in chamber 1 or in chamber 2. These outcomes have equal probabilities 1/2, so the additional entropy per particle is, as per Eq. (8.7), ∆S1 = −kB(½ ln ½ + ½ ln ½) = kB ln 2 and so, for 2N particles, we get Eq. (11.16).
53 A formal way of defining indistinguishability of particles without invoking Quantum Mechanics is to stipulate that all realistically measurable physical quantities are symmetric with respect to permutations of particles.
wave number k1 , nk2 particles with wave number k2 , etc., up to the total of N particles.
The corresponding collective energy levels are
$$E_\alpha = \sum_k n_k\, \varepsilon_k. \qquad (11.19)$$
The partition function is then
$$Z = \sum_{\{n_k\}} e^{-\beta E_\alpha}, \qquad (11.20)$$
where the sum is over all possible sequences {nk} of occupation numbers, subject to $\sum_k n_k = N$. Calculating this sum is a somewhat tricky combinatorial problem—we will solve it in §16.2, but for our current purposes, we can use a convenient shortcut.
Suppose we are allowed to neglect all those collective microstates in which more than
one particle occupies the same single-particle microstate, i.e.,
for any k, nk = 0 or 1. (11.21)
Then the correct, collective microstates (11.18) are the same as our old, wrong ones (11.5)
(“particle 1 has wave number k_1, particle 2 has wave number k_2, etc.”; cases where k_1,
k_2, . . . are not different are assumed to contribute negligibly to Σ_α in the partition
function), except the order in which we list the particles ought not to matter. Thus,
we must correct our previous formula for Z, Eq. (11.6), to eliminate the overcounting
of the microstates in which the particles were simply permuted—as the particles are
indistinguishable, these are in fact not different microstates. The necessary correction is,
therefore,54
Z = Z_1^N / N! . (11.22)
Using Eq. (11.9), we have for the classical monatomic ideal gas,
Z = (1/N!) (V/λ_th³)^N,   λ_th = ℏ √(2π/(m k_B T)). (11.23)
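As a quick numerical illustration of what the N! correction does and when it fails (a brute-force sketch in Python, with arbitrary units, not part of the original argument): for N = 2 particles on M equally spaced levels, the exact sum over occupation numbers can be compared with Z_1²/2!. The two agree when many levels are populated [the situation (11.21)] and disagree when multiply occupied states matter (see footnote 54 below).

```python
# Brute-force check of Z = Z_1^N/N!, Eq. (11.22), for N = 2 particles on M
# equally spaced levels eps_i = i (arbitrary units, k_B = 1). The exact sum over
# collective microstates {n_k}, Sum_k n_k = 2, weights doubly occupied states
# correctly; Z_1^2/2! gives them weight 1/2 instead of 1.
import numpy as np

def partition_functions(beta, M=2000):
    w = np.exp(-beta * np.arange(M))          # single-particle weights e^{-beta*eps}
    Z1 = w.sum()
    Z_exact = (w**2).sum() + (Z1**2 - (w**2).sum()) / 2   # n_i = 2, plus i < j pairs
    return Z_exact, Z1**2 / 2                 # exact vs Z_1^N/N! for N = 2

for beta in [0.01, 0.1, 1.0, 10.0]:
    Ze, Za = partition_functions(beta)
    print(f"beta = {beta:5.2f}:  Z_exact = {Ze:10.4g}   Z_1^2/2! = {Za:10.4g}")
```

At small β (high temperature), the ratio of the two tends to 1; at large β, Z_1²/2! undercounts by up to a factor of 2, which is exactly the overcorrection discussed in footnote 54.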
Before we use this new formula to calculate everything, let us assess how good the
assumption (11.21) is. In order for it to hold, we need that
the number of available single-particle states ≫ the number of particles N. (11.24)
The single-particle partition function, Eq. (11.9), gives a decent estimate of the former
54 Our old formula, Z = Z_1^N, is still fine for systems consisting of distinguishable elementary
units. It might not be immediately obvious why the validity of the corrected formula (11.22)
is restricted to the case (11.21), but breaks down if there are non-negligibly many multiply
occupied states. The reason is that our original counting scheme (11.5) distinguished between
cases such as “particle 1 has wave number k_1, particle 2 has wave number k_2, . . . ” vs. “particle 1
has wave number k_2, particle 2 has wave number k_1, . . . ” when k_1 ≠ k_2—this was wrong and
is corrected by the N! factor, which removes all permutations of the particles; however, the
scheme (11.5) did not distinguish between such cases for k_1 = k_2 and so, if they were present
in abundance, the factor N! would overcorrect.
quantity because the typical energy of the system will be ε_k ∼ k_B T and the summand in
Z_1 = Σ_k e^{−ε_k/k_B T} (11.25)
stays order unity roughly up to this energy, so the sum is simply of order of the number
of microstates in the interval ε_k ≲ k_B T.55 Then the condition (11.24) becomes
V/λ_th³ ≫ N  ⇔  nλ_th³ ≪ 1. (11.26)
Another popular way of expressing this condition is by stating that the number density
of the particles must be much smaller than the “quantum concentration” n_Q:
n ≪ n_Q ≡ 1/λ_th³. (11.27)
Physically, the quantum concentration is the number of single-particle states per unit
volume (this is meaningful because the number of states is an extensive quantity: in
larger volumes, there are more wave numbers available, so there are more states).
The condition (11.27) is actually the condition for the classical limit to hold, T ≫ T_deg
[see Eq. (2.29)], guaranteeing the absence of quantum correlations (which have to do with
precisely the situation that we wish to neglect: more than one particle trying to be in
the same single-particle state; see Part VI). When n ∼ n_Q or larger, we can no longer
use Eq. (11.22) and are in the realm of quantum gases. Substituting the numbers, which,
e.g., for air at 1 atm and room temperature, gives n ∼ 10^25 m^−3 vs. n_Q ∼ 10^30 m^−3, will
convince you that we can usefully stay out of that realm for a little longer.
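To attach numbers to this (a short script assuming standard SI constants and a mean molecular mass of 29 u for air; the exact value of n_Q depends on which mass one takes):

```python
# Evaluate n and n_Q = 1/lambda_th^3 for air at P = 1 atm, T = 300 K.
import numpy as np

hbar, kB, u = 1.0546e-34, 1.3807e-23, 1.6605e-27    # SI units
T, P, m = 300.0, 1.013e5, 29 * u                    # mean molecular mass of air ~ 29 u

lam = hbar * np.sqrt(2 * np.pi / (m * kB * T))      # thermal wavelength, Eq. (11.23)
n, nQ = P / (kB * T), lam**-3
print(f"lambda_th ~ {lam:.2g} m,  n ~ {n:.2g} m^-3,  n_Q ~ {nQ:.2g} m^-3,  n/n_Q ~ {n/nQ:.1g}")
```

The ratio n/n_Q comes out at ~10^−7, comfortably inside the classical regime.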
The free energy is now
F = −k_B T ln Z = −k_B T (N ln Z_1 − N ln N + N) = N k_B T [ln(nλ_th³) − 1], (11.28)
where, upon application of Stirling’s formula, the non-additive terms have happily
cancelled.
The entropy is, therefore,
S = −(∂F/∂T)_V = k_B N [5/2 − ln(nλ_th³)], (11.29)
the formula known as the Sackur–Tetrode Equation. It is nice and additive, no paradoxes.
The mean energy of the gas is
U = F + T S = (3/2) k_B T N, (11.30)
the same as the familiar Eq. (2.23), and hence the heat capacity is
C_V = (∂U/∂T)_V = (3/2) k_B N, (11.31)
55 In other words, using Eq. (11.12), the number of states that are not exponentially unlikely is
∼ ∫₀^{k_B T} dε g(ε) ∼ V/λ_th³.
the same as Eq. (2.24).
NB: This formula is for monatomic gases. In Part IV, you will learn how to handle
diatomic gases, where molecules can have additional energy levels due to rotational
and vibrational degrees of freedom.
Finally, the equation of state is
P = −(∂F/∂V)_T = N k_B T/V = n k_B T, (11.32)
the same as Eq. (2.19).
NB: The only property of the theory that matters for the equation of state is the
fact that Z ∝ V N , so neither the (in)distinguishability of particles nor the precise
form of the single-particle energy levels [Eq. (11.4)] affect the outcome—this will only
change when particles start crowding each other out of parts of the volume, as happens
for “real” gases (Part VII), or of parts of phase space, as happens for quantum ones
(Part VI).
Thus, we have recovered from Statistical Mechanics the same thermodynamics for the
ideal gas as was constructed empirically in Part I or kinetically in Part II. Note that
Eqs. (11.30) and (11.32) constitute the proof that the kinetic temperature [Eq. (2.20)]
and kinetic pressure [Eq. (1.27)] are the same as the statistical mechanical temperature
(= 1/kB β) and statistical mechanical pressure [Eq. (7.13)].
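A quick numerical sanity check of the additivity of Eq. (11.29) (an illustration with arbitrarily chosen helium parameters, not part of the original notes):

```python
# Sackur-Tetrode entropy, Eq. (11.29): S(2N, 2V, T) = 2 S(N, V, T) (no paradox),
# while mixing two *different* gases (each doubling its volume) adds 2 N kB ln 2.
import numpy as np

hbar, kB, u = 1.0546e-34, 1.3807e-23, 1.6605e-27

def S(N, V, T, m=4*u):                        # monatomic ideal gas, Eq. (11.29)
    lam = hbar * np.sqrt(2*np.pi/(m*kB*T))
    return kB * N * (2.5 - np.log((N/V) * lam**3))

N, V, T = 1e23, 1e-3, 300.0
print(S(2*N, 2*V, T) - 2*S(N, V, T))          # = 0: identical gases, Eq. (11.17)
print(2*(S(N, 2*V, T) - S(N, V, T)) - 2*N*kB*np.log(2))   # = 0: Eq. (11.16)
```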
Exercise 11.2. Adiabatic Law. Using the Sackur–Tetrode equation, show that for a classical
monatomic ideal gas undergoing an adiabatic process,
P V^{5/3} = const. (11.33)
Exercise 11.3. Relativistic Ideal Gas. a) Show that the equation of state of an ideal gas is
still
P V = N kB T (11.34)
even when the gas is heated to such a high temperature that the particles are moving at
relativistic speeds. Why is the equation of state unchanged?
b) Although the equation of state does not change, show, by explicit calculation of the
expression for the entropy, that in the ultrarelativistic limit (i.e., in the limit in which the
rest energy of the particles is negligible compared to their kinetic energy), the formula for an
adiabat is
P V^{4/3} = const. (11.35)
c) Show that the pressure of an ultrarelativistic monatomic ideal gas is
P = ε/3, (11.36)
where ε is the internal energy density. Why is this relationship different for a nonrelativistic
gas?
Indeed, N f(v) d³v is the mean number of particles with velocities in the cube d³v around v, whereas
⟨n_k⟩ = mean number of particles in the microstate with wave number k = mv/ℏ.
Therefore,
f(v) d³v = (⟨n_k⟩/N) [V/(2π)³] d³k = (m/2πℏ)³ (⟨n_k⟩/n) d³v  ⇒  f(v) = (m/2πℏ)³ ⟨n_k⟩/n. (11.37)
We are not yet ready to calculate hnk i from Statistical Mechanics—we will do this in §16.3, but
in the meanwhile, the anticipation of the Maxwellian f (v) tells us what the result ought to be:
⟨n_k⟩ = n (2πℏ/m)³ e^{−mv²/2k_B T}/(2πk_B T/m)^{3/2} = (nλ_th³) e^{−βε_k}. (11.38)
We shall verify this formula in due course (see §16.4.3).
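As a consistency check of Eq. (11.38) (a small numerical sketch in units ℏ = m = k_B T = 1), the sum of ⟨n_k⟩ over all wave numbers must return the total particle number N:

```python
# Sum_k <n_k> -> [V/(2 pi)^3] Int d^3k (n lam_th^3) e^{-k^2/2} must equal n V = N.
import numpy as np
from scipy.integrate import quad

lam3 = (2*np.pi)**1.5                         # lambda_th^3 with hbar = m = kB*T = 1
I, _ = quad(lambda k: 4*np.pi*k**2 * lam3 * np.exp(-k**2/2), 0, np.inf)
print(I / (2*np.pi)**3)                       # prints 1.0, i.e., Sum_k <n_k> = N
```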
NB: It is a popular hand-waving shortcut to argue that Maxwell’s distribution is the Gibbs
distribution for one particle—a system in thermal contact (via collisions) with the rest of the
particles, forming the heat bath and thus determining the particle’s mean energy.
S = k log W, (12.1)
or, in the notation used here, with W = Ω the number of microstates,
S = k_B ln Ω. (12.2)
Boltzmann wanted this entropy to be
1) larger for a larger number of states, viz., S(Ω′) > S(Ω) for Ω′ > Ω,
2) additive for several systems when they are put together (an essential property, as
we saw in §10), i.e., the number of states in a combined system being Ω₁₂ = Ω₁Ω₂,
Boltzmann wanted
S(Ω₁Ω₂) = S(Ω₁) + S(Ω₂). (12.3)
The proof that the only such function is given by Eq. (12.2) is the proof of the Lemma
within the proof of Shannon’s Theorem in §8.1.5.56 Thus, Boltzmann’s entropy simply
appears to be a particular case of the Gibbs–Shannon entropy for isolated systems
(systems with equiprobable states).
In fact, as we shall see in §§12.1.2 and 12.1.3, it is possible to turn the argument around
and get the Gibbs entropy (and the Gibbs distribution) from the Boltzmann entropy.
For an isolated system, whose Ω(E) microstates at fixed energy E are equiprobable,
p_α = 1/Ω(E) [Eq. (12.4)], the Boltzmann entropy is then (§12.1.1):
S = k_B ln Ω(E). (12.5)
Now, to get the canonical (Gibbs) distribution from Eq. (12.4), pick a small part of
the system (Fig. 23) and ask what is the probability for it to have energy ε (≪ E)?
Using Eq. (12.5) and denoting by
Ωpart () the number of microstates of the small part of the system that have energy ,
Ωres (E − ) the number of microstates of the rest of the system (the reservoir, the heat
bath; cf. §10.3) that have energy E − ,
we can express the desired probability as follows:
p(ε) = Ω_part(ε) Ω_res(E − ε)/Ω(E) = [Ω_part(ε)/Ω(E)] exp[S_res(E − ε)/k_B]
     ≈ [Ω_part(ε)/Ω(E)] exp{[S_res(E) − ε ∂S_res/∂E + . . . ]/k_B}
     = [e^{S_res(E)/k_B}/Ω(E)] Ω_part(ε) e^{−ε/k_B T}, (12.6)
where we have identified ∂S_res/∂E ≡ 1/T, with T, by definition, the temperature of
the reservoir. The prefactor in front of this distribution is independent of ε and can be
found by normalisation. Thus, we have
obtained a variant of the Gibbs distribution (also known as the Boltzmann distribution):
p(ε) = Ω_part(ε) e^{−ε/k_B T}/Z,   Z = Σ_ε Ω_part(ε) e^{−ε/k_B T}, (12.7)
where the normalisation constant has been cast in the familiar form of a partition
function, Z. The reason this formula, unlike Eq. (9.8), has the prefactor Ω_part(ε) is that
this is the probability for the system to have the energy ε, not to occupy a particular
single state α. Many such states can have the same energy ε—to be precise, Ω_part(ε) of them
will—all with the same probability, so we recover the more familiar formula as follows:
isolated system will also conserve its linear and angular momentum. In fact, it is possible to
show that in steady state (and so, in equilibrium), the distribution can only be a function of
globally conserved quantities. As it is usually possible to consider the system in a frame in which
it is at rest, E is what matters most in Statistical Mechanics (see, e.g., Landau & Lifshitz 1980,
§4).
for α such that the energy of the subsystem is E_α = ε,
p_α = p(ε)/Ω_part(ε) = e^{−ε/k_B T}/Z = e^{−βE_α}/Z,   Z = Σ_α e^{−βE_α}. (12.8)
We are done now, as we can again calculate everything from this: energy via the usual
formula
U = −∂ ln Z/∂β (12.9)
and entropy either by showing, as in §9.2, that
dQ_rev = dU + P dV = T d(U/T + k_B ln Z) = T dS_part, (12.10)
so T is the thermodynamic temperature and
S_part = U/T + k_B ln Z (12.11)
is the thermodynamic entropy of the small subsystem in contact with a reservoir of
temperature T,
or by calculating S_part directly as the total Boltzmann entropy minus the mean entropy
of the reservoir:
S_part = k_B ln Ω(E) − Σ_ε p(ε) k_B ln Ω_res(E − ε)
       = −k_B Σ_ε p(ε) ln[Ω_res(E − ε)/Ω(E)] = −k_B Σ_α p_α ln p_α, (12.13)
where we have used k_B ln Ω_res(E − ε) = S_res(E − ε), Ω_res(E − ε)/Ω(E) = p(ε)/Ω_part(ε) = p_α,
and p(ε) = Ω_part(ε) p_α,
and we have thus recovered the Gibbs entropy.
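The emergence of the exponential weight in Eq. (12.6) from pure counting is easy to see numerically. Here is a sketch based on an illustrative model that is not in the lectures: an “Einstein solid” of N oscillators sharing q energy quanta, for which Ω(q, N) = (q + N − 1)!/[q!(N − 1)!]; one oscillator plays the role of the “part”, the remaining N − 1 that of the reservoir.

```python
# p(eps) ∝ Omega_res(E - eps) for one oscillator out of N sharing q quanta,
# compared with the Boltzmann distribution at 1/kB T = d ln(Omega_res)/dE.
from math import lgamma
import numpy as np

def logOmega(q, N):                           # ln C(q + N - 1, N - 1)
    return lgamma(q + N) - lgamma(q + 1) - lgamma(N)

N, q = 1000, 5000                             # reservoir is N - 1 oscillators
ks = np.arange(20)                            # energy of the chosen oscillator, in quanta
logp = np.array([logOmega(q - k, N - 1) for k in ks])
p = np.exp(logp - logp.max()); p /= p.sum()

beta = logOmega(q, N - 1) - logOmega(q - 1, N - 1)   # discrete dS_res/dE (kB = 1)
pB = np.exp(-beta * ks); pB /= pB.sum()
print(np.abs(p - pB).max())                   # small: p(eps) is the Gibbs distribution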
To see that the quantity T defined for an isolated system by 1/T = ∂S/∂E [Eq. (12.14)]
deserves to be called temperature,
either we can repeat the argument of §10.2 replacing mean energies U_1, U_2, U with exact
energies E_1, E_2, E and maximising the Boltzmann entropy of two conjoint systems to
show that in equilibrium the quantity T defined by Eq. (12.14) must equalise between
them—and thus T is a good definition of temperature,
or we note that T defined via Eq. (12.14) is the width of the distribution p(ε) [Eq. (12.7)]
and hence enters Eq. (12.10)—thus, 1/T is manifestly the integrating factor of reversible
heat, so T is the thermodynamic temperature (same argument as in §9.2).
Finally, let me outline yet another scheme for constructing the Gibbs canonical ensemble.
This is not, in fact, how the Gibbs construction has traditionally been thought about (e.g., by
Gibbs 1902—or by Schrödinger 1990, who has a very cogent explanation of the more traditional
approach in his lectures). Rather,
• one makes N mental copies of the system that one is interested in and joins them together
into one isolated über-system (an “ensemble”). The states (“complexions”) of this über-system
are characterised by how many copies, N_α, find themselves in each microstate α of the original
system, and so the number of all possible such über-states is W, given by Eq. (12.15).
• Since the über-system is isolated, all these states are equiprobable and the entropy of the
über-system is the Boltzmann entropy,
SN = kB ln W, (12.17)
which, if maximised, will give the most probable über-state—this is the equilibrium state of the
über-system. Maximising entropy per system,
S = S_N/N, (12.18)
which is the same as Gibbs entropy (12.16), is equivalent to maximising SN .
• If {N1 , . . . , NΩ } is the most probable über-state, then
p_α = N_α/N (12.19)
are the desired probabilities of the microstates in which one might find a copy of the original
system if one picks it randomly from the über-system (the ensemble).
• To complete this construction, one proves that the fluctuations around the most probable
über-state vanish as N → ∞, which is always a good limit because N is in our head and so can
be chosen arbitrarily large (for details, see Schrödinger 1990, Chapters V and VI).
Recall that to get the canonical (Gibbs) distribution (§9.1), we maximised Gibbs entropy
[Eq. (12.18), or Eq. (12.16)] subject to fixed mean energy
Σ_α p_α E_α = U. (12.20)
i.e., the (isolated) über-system has the exact total energy E = N U. Thus, seeking the equilibrium
of a system at fixed mean energy U (or, equivalently/consequently, temperature) is the same as
seeking the most likely way in which the exact energy N U would distribute itself between very many,
N ≫ 1, copies of the system, if they were all in thermal contact with each other and isolated
from the rest of the world.
Thus, the canonical ensemble of Gibbs, if interpreted in terms of one “über-system” containing
N copies of the original system with exact total energy N U, is basically a case of the
microcanonical distribution being applied to this (imaginary) assemblage.
Clearly, a system with mean energy U inside our über-system is a case of a system in contact
with a heat bath (see §10.3)—in the above construction, the bath is a strange one, as it is made
of N − 1 copies of the system itself, but that does not matter because the nature of the heat
bath does not matter—what does matter is only the value of the temperature (or, equivalently,
mean energy) that it sets for the system.
The “Gibbs” and “Shannon” schemes really are versions of one another: whereas the language
is different, both the mathematics (cf. §§8.1.3 and 12.1.3) and the philosophy (probabilities as
likelihoods of finding the system of interest in any given microstate) are the same (one might
even argue that the “Shannon” construction is what Gibbs really had in mind). So I will refer
to this entire school of thought as “Gibbsian” (perhaps the “Gibbsian heresy”).
The Boltzmann scheme (the “Boltzmannite orthodoxy”) is philosophically different: we are
invited to think of every step in the construction as describing some form of objective reality,
whereas under the Gibbsian approach, we are effectively just trying to come up with the best
possible guess, given limited information.
The reality of the Boltzmannite construction is, however, somewhat illusory:
1) An isolated system with a fixed energy is a fiction:
—it is impossible to set up practically;
—if set up, it is inaccessible to measurement (because it is isolated!).
So it is in fact just as imaginary as, say, the Gibbsian ensemble of N identical systems.
2) What is the basis for assuming equal probabilities?
The usual view within this school of thought is as follows. As the isolated system in question
evolves in time, it samples (repeatedly) its entire phase space—i.e., it visits all possible mi-
crostates consistent with its conservation laws (E = const). Thus, the probability for it to be in
any given microstate or set of microstates is simply the fraction of time that it spends in those
states. In other words, time averages of any quantities of interest are equal to the statistical
averages, i.e., to the averages over all microstates:
lim_{t→∞} (1/t) ∫₀^t dt′ (quantity)(t′) = Σ_α p_α (quantity)_α. (12.22)
This last statement is known as the ergodic hypothesis. To be precise, the assumption is that the
fraction of time spent in any subset of microstates (a subvolume of phase space)
= (number of microstates in this subset)/(total number of microstates). (12.23)
So the idea is that we do all our practical calculations via statistical averages (with pα = 1/Ω
etc.), but the physical justification for that is that the system is time-averaging itself (we cannot
directly calculate time averages because we cannot calculate precise dynamics).58
The objection to this view that I find the most compelling is simply that the size of the phase
space (the number of microstates) of any macroscopic system is so enormous that it is in fact
quite impossible for the system to visit all of it over a reasonable time (see Jaynes 2003).
The key divide here is rooted in the old argument about the meaning of probabilities:
—probabilities as frequencies, or “objective” probabilities, measuring how often something
actually happens
vs.
—probabilities as (a priori) likelihoods, or “subjective” probabilities, measuring our (lack of)
knowledge about what happens (this view has quite a respectable intellectual pedigree: Laplace,
Bayes, Keynes, Jeffreys, Jaynes... and Binney!).
58 A further mathematical nuance is as follows. Formally speaking, the system over which we
are calculating the averages, e.g., in the case of the ideal gas, often consists of a number of
non-interacting particles—since they are non-interacting, each of them is conserving its energy
and the system is most definitely not ergodic: its phase space is foliated into many subspaces
defined by the constancy of the energy of each particle and the system cannot escape from any
of these subspaces. To get around this problem, one must assume that the particles in fact do
interact (indeed, they collide!), but rarely, so their interaction energy is small. If we calculate
the time average in the left-hand side of Eq. (12.22) for this weakly interacting system, then
the resulting average taken in the limit of vanishing interaction will be equal to the statistical
average on the right-hand side of Eq. (12.22) calculated for the system with no interaction (see,
e.g., Berezin 2007, §2; he also makes the point that as the interaction energy tends to zero, the
rate of convergence of the time average to a finite value as t → ∞ may become very slow, in
which case the physical value of the ergodic hypothesis becomes rather limited—this reinforces
Jaynes’s objection articulated in the next paragraph).
NB: In choosing to go with the latter view and putting up all these objections to the former,
I am not suggesting that one is “right” and the other “wrong.” Remember that the falsifiable
(and examinable) content of the theory is the same either way, so the issue is which of the
logical constructions leading to it makes more sense to me or to you—and I urge you to
explore the literature on your own and decide for yourselves whether you are a Gibbsian or a
Boltzmannite (either way, you are in good company)—or, indeed, whether you wish to invent
a third way!59
To pre-empt some of the inevitable confusion about the “subjective” nature of maximising
uncertainty (whose uncertainty?!), let me deal with the common objection that, surely, if two
observers (call them A and B) have different amounts of information about the same system
and so arrive at two different entropy-maximising sets of pα ’s, it would be disastrous if those
different sets gave different testable predictions about the system! (Heat capacity of a room
filled with air cannot depend on who is looking!)
There are three possible scenarios.
• If Mrs B has more constraints (i.e., more knowledge) than Mr A, but her additional
constraints are, in fact, derivable from Mr A’s, then both Mr A and Mrs B will get the same
probability distribution {pα } because Mrs B’s additional Lagrange multipliers will turn out to
be arbitrary and so can be set to 0 (this is easy to see if you work through an example: e.g.,
Exercise 9.1e).
• If Mrs B’s additional constraints are incompatible with Mr A’s, the method of Lagrange
multipliers will produce a set of equations for λ’s that has no real solutions—telling us that the
system of constraints is logically contradictory and so no theory exists (this basically means
that one of them got their constraints wrong).
• Finally, if Mrs B’s additional constraints are neither incompatible with nor derivable
from Mr A’s, that means that she has discovered new physics: Mrs B’s additional constraints
will bring in new Lagrange multipliers, which will turn out to have some interesting physical
interpretation—usually as some macroscopic thermodynamical quantities (we will see an
example of this when we discover chemical potential in §14).
So far in this part of the course, we have not involved time in our considerations:
we have always been interested in some eventual equilibrium and the way to calculate
it was to maximise SG subject to constraints representing some measurable properties
of this equilibrium. This maximisation of SG is not the same thing as the 2nd Law
of Thermodynamics, which states, effectively, that the thermodynamic entropy S of the
world (or a closed, isolated system) must either increase or stay constant in any process—
and so in time.
This statement is famously replete with hard metaphysical questions (even though
it is quite straightforward when it comes to calculating entropy changes in mundane
situations)—so it is perhaps useful to see how it emerges within the conceptual framework
that I am advocating here. The following proof is what I believe to be an acceptable
59 This is a bit like the thorny matter of the interpretations of Quantum Mechanics: everyone
agrees on the results, but not on why the theory works.
vulgarisation of an argument due to Jaynes (1965).
Time t:
Consider a closed system (the world) in equilibrium, subject to some set of its properties
having just been measured and no other information available. Then our best guess as
to its state at this time t is obtained by maximising SG subject to those properties that
are known at time t. This gives a set of probabilities {pα } that describe this equilibrium.
In this equilibrium, the maximum value of SG that we have obtained is equal to the
thermodynamical entropy (see proof in §9.2):
S(t) = kB SG,max (t). (12.24)
Time t′ > t:
Now consider the evolution of this system from time t to a later time t′, starting from
the set of states {α} and their probabilities {p_α} that we inferred at time t and using
Hamilton’s equations (if the system is classical) or the time-dependent Schrödinger’s
equation (if it is quantum, as it always really is; see §13.4). During this evolution, the
Gibbs entropy stays constant:
S_G(t′) = −Σ_α p_α ln p_α = S_G(t). (12.25)
Indeed, the Schrödinger equation evolves the states {α}, but if the system was in some
state α(t) at time t with probability p_α, it will be in the descendant α(t′) of that state
at t′ with exactly the same probability; this is like changing labels in the expression for
S_G while the p_α’s stay the same—and so does S_G. Thus,
S_G(t′) = S_G(t) = S_{G,max}(t) = S(t)/k_B. (12.26)
Now forget all previous information, make a new set of measurements at time t′, work
out a new set of probabilities {p_α} at t′ subject only to these new constraints, by
maximising Gibbs entropy, and from it infer the new thermodynamical (equilibrium)
entropy:
S(t′) = k_B S_{G,max}(t′) > k_B S_G(t′) = k_B S_G(t) = S(t), (12.27)
where S_{G,max}(t′) is the new S_G, maximised at time t′, and S_G(t′) is the “true” S_G,
evolved from time t.
Thus,
S(t′) > S(t) at t′ > t, q.e.d., Second Law. (12.28)
The meaning of this is that the increase of S reflects our insistence to forget most of the
detailed knowledge that we possess as a result of evolving in time any earlier state (even
if based on an earlier guess) and to re-apply at every later time the rules of statistical
inference based on the very little knowledge that we can obtain in our measurements at
those later times.
If you are sufficiently steeped in quantum ways of thinking by now, you will pounce
and ask: who is doing all these measurements?
If it is an external observer or apparatus, then the system is not really closed and, in
particular, the measurement at the later time t′ will potentially destroy the identification
of all those microstates with their progenitors at time t, so the equality (12.25) no longer
holds.60
A further objection is: what if your measurements at t′ are much better than at the
technologically backward time t? You might imagine an extreme case in which you
determine the state of the system at t′ precisely and so S_G(t′) = 0!
• Clearly, the observer is, in fact, not external, but lives inside the system.
• As he/she/it performs the measurement, not just the entropy of the object of
measurement (a subsystem) but also of the observer and their apparatus changes. The
argument above implies that a very precise measurement leading to a decrease in the
entropy of the measured subsystem must massively increase the entropy of the observer
and his kit, to compensate and ensure that the total entropy increases [Eq. (12.28)].61
We will return to these arguments in a slightly more quantitative (or, at any rate,
more quantum) manner in §§13.4–13.5.
So far the only way in which the quantum-mechanical nature of the world has figured in
our discussion is via the sums of states being discrete and also in the interpretation of the
indistinguishability of particles. Now I want to show you how one introduces the uncertainty
about the quantum state of the system into the general quantum mechanical formalism.
Ō = Σ_α p_α ⟨α|Ô|α⟩, (13.1)
where p_α is the a priori probability that the system is in the state |α⟩ and ⟨α|Ô|α⟩ is the
expectation value of Ô if the system is in the state |α⟩ (e.g., E_α if Ô = Ĥ). The states
{|α⟩} are not necessarily eigenstates of Ô. Since, written in terms of its eigenstates and
eigenvalues, this
60 In a classical world, this would not be a problem because you can make measurements without
altering the system, but in Quantum Mechanics, you cannot.
61 This sort of argument was the basis of the exorcism of Maxwell’s Demon by Szilard (1929).
62 It is an interesting question whether it is important that the system really is in one of
the states {|αi}. Binney & Skinner (2013) appear to think it is important to conjecture
this, but I am unconvinced. Indeed, in the same way that probabilities pα are not the true
quantum probabilities but rather a set of probabilities that would produce correct predictions
for measurement outcomes (expectation values Ō), it seems natural to allow {|αi} to be any
complete set, with pα then chosen so that measurement outcomes are correctly predicted. This
does raise the possibility that if our measurement were so precise as to pin down the true state of
the system unambiguously, it might not be possible to accommodate such information with any
set of pα ’s. However, such a situation would correspond to complete certainty anyway, obviating
statistical approach.
operator is
Ô = Σ_µ O_µ |O_µ⟩⟨O_µ|, (13.2)
it is natural to encode the a priori probabilities in the density operator
ρ̂ = Σ_α p_α |α⟩⟨α|. (13.4)
This looks analogous to Eq. (13.2), except note that ρ̂ is not an observable because the p_α’s are
subjective. In the context of this definition, one refers to the system being in a pure state if for
some α, p_α = 1 and so ρ̂ = |α⟩⟨α|, or an impure state if all p_α < 1.
The density operator is useful because, knowing ρ̂, we can express expectation values of
observables as
Ō = Tr(ρ̂ Ô). (13.5)
Indeed, the above expression reduces to Eq. (13.1):
Tr(ρ̂ Ô) = Σ_{α′} ⟨α′|ρ̂Ô|α′⟩ = Σ_{α′α} p_α ⟨α′|α⟩⟨α|Ô|α′⟩ = Σ_α p_α ⟨α|Ô|α⟩ = Ō, q.e.d., (13.6)
where ⟨α′|α⟩ = δ_{α′α}.
It is useful to look at the density operator in the {|O_µ⟩} representation: since
|α⟩ = Σ_µ ⟨O_µ|α⟩ |O_µ⟩, (13.7)
we have
ρ̂ = Σ_α Σ_{µν} p_α ⟨O_µ|α⟩⟨α|O_ν⟩ |O_µ⟩⟨O_ν| ≡ Σ_{µν} p_{µν} |O_µ⟩⟨O_ν|, (13.8)
Thus, whereas ρ̂ is diagonal in the “information basis” {|αi}, it is, in general, not diagonal in any
given basis associated with the eigenstates of an observable, {|Oµ i}—in other words, the states
to which we assign a priori probabilities are not necessarily the eigenstates of the observable
that we then wish to calculate.
Let us express the expectation value of Ô in terms of the density matrix: using Eq. (13.8),
Ō = Tr(ρ̂ Ô) = Σ_{µ′} ⟨O_{µ′}|ρ̂Ô|O_{µ′}⟩ = Σ_{µ′} Σ_{µν} O_{µ′} p_{µν} ⟨O_{µ′}|O_µ⟩⟨O_ν|O_{µ′}⟩ = Σ_µ O_µ p_{µµ}, (13.10)
where ⟨O_{µ′}|O_µ⟩ = δ_{µ′µ} and ⟨O_ν|O_{µ′}⟩ = δ_{νµ′}.
the same expression as Eq. (13.3), seeing that
p_{µµ} = Σ_α p_α |⟨O_µ|α⟩|². (13.11)
Thus, the diagonal elements of the density matrix in the Ô representation are the combined
quantum and a priori (statistical) probabilities of the observable giving eigenvalues Oµ as
measurement outcomes.
The off-diagonal elements have no classical interpretation. They measure quantum correla-
tions and come into play when, e.g., we want the expectation value of an observable other than
the one in whose representation we chose to write ρ̂: for an observable P̂ , the expectation value is
P̄ = Tr(ρ̂ P̂) = Σ_{µ′} ⟨O_{µ′}| [Σ_{µν} p_{µν} |O_µ⟩⟨O_ν|] P̂ |O_{µ′}⟩ = Σ_{µν} p_{µν} ⟨O_ν|P̂|O_µ⟩, (13.12)
where the operator in the square brackets is ρ̂.
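All of these manipulations can be verified numerically. Here is a small random example (an illustration with NumPy; the basis, the probabilities and the observable are generated arbitrarily) of Eqs. (13.1), (13.5) and (13.10):

```python
# Build rho = Sum_a p_a |a><a| for a random orthonormal basis, pick a random
# Hermitian observable O, and check that Tr(rho O) = Sum_a p_a <a|O|a>
# = Sum_mu O_mu p_mumu, with p_mumu the diagonal of rho in O's eigenbasis.
import numpy as np

rng = np.random.default_rng(0)
d = 4
Q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
p = rng.random(d); p /= p.sum()               # a priori probabilities p_alpha
rho = (Q * p) @ Q.conj().T                    # Sum_a p_a |a><a|

A = rng.normal(size=(d, d)); O = (A + A.T)/2  # Hermitian observable
Omu, V = np.linalg.eigh(O)                    # O = Sum_mu O_mu |O_mu><O_mu|

print(np.trace(rho @ O).real)                                             # Eq. (13.5)
print(sum(p[a] * (Q[:, a].conj() @ O @ Q[:, a]).real for a in range(d)))  # Eq. (13.1)
print((Omu * np.diag(V.conj().T @ rho @ V).real).sum())                   # Eq. (13.10)
```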
The density operator then evolves according to
iℏ dρ̂/dt = [Ĥ, ρ̂]. (13.18)
Note that the probabilities p_α do not change with time: if the system was in a state |α(0)⟩
initially, it will be in its descendant state |α(t)⟩ at any later time t.
So, we may envision a situation in which we are uncertain about a system’s initial conditions,
work out ρ̂(t = 0) via the maximum-entropy principle, constrained by some measurements, and
then evolve ρ̂(t) forever if we know the Hamiltonian precisely. Since pα ’s do not change, the
Gibbs–Shannon–von Neumann entropy of the system stays the same during this time evolution—
the only uncertainty was in the initial conditions.
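A companion sketch (same caveats as the example above) of this constancy: unitary evolution under a known Hamiltonian leaves the eigenvalues of ρ̂, and hence its entropy, unchanged.

```python
# Unitary evolution rho(t) = U rho(0) U^dagger preserves the probabilities
# p_alpha and therefore the Gibbs-Shannon-von Neumann entropy.
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(1)
d = 4
H = rng.normal(size=(d, d)); H = (H + H.T)/2        # Hamiltonian (hbar = 1)
p = rng.random(d); p /= p.sum()
rho0 = np.diag(p).astype(complex)                   # maximum-entropy guess at t = 0

def S_vN(rho):
    w = np.linalg.eigvalsh(rho); w = w[w > 1e-12]
    return -(w * np.log(w)).sum()

U = expm(-1j * H * 3.7)                             # evolve for an arbitrary time
print(S_vN(rho0), S_vN(U @ rho0 @ U.conj().T))      # equal
```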
What if we do not know the Hamiltonian (or choose to forget)? This was discussed in §12.4:
then, at a later time, we may make another measurement and construct the new density matrix
ρ̂new (t) via another application of the maximum-entropy principle. Both ρ̂new (t) and ρ̂old (t)—
which is our ρ̂(0) evolved via Eq. (13.18) with the (unknown to us) precise Ĥ—are consistent
with the new measurement. But ρ̂new (t) corresponds to the maximum possible value of the
entropy consistent with this measurement, while ρ̂old (t) has the same entropy as ρ̂(0) did at
t = 0. Therefore,
Snew (t) > Sold (0). (13.19)
This is the Second Law and the argument above is the same as the argument already given
in §12.4.
where p_{αα′} are the probabilities of |E_α^{(sys)}(0)⟩, indifferent to |E_{α′}^{(env)}(0)⟩.
Now evolve this density matrix according to the time-dependent Schrödinger equation (13.18):
the p_{αα′}’s will stay the same, while the states will evolve:
ρ̂^{(old)}(t) = Σ_{µµ′νν′} p^{(old)}_{µµ′νν′}(t) |µµ′, new⟩⟨νν′, new|, (13.25)
where the old density matrix in the new representation is [cf. Eq. (13.9)]:
p^{(old)}_{µµ′νν′}(t) = Σ_{αα′} p^{(old)}_{αα′} ⟨µµ′, new|αα′, t⟩⟨αα′, t|νν′, new⟩. (13.26)
However, the measured energy of the system at time t only depends on the diagonal elements
p^{(old)}_{αα′αα′}(t) of this matrix:
All information about correlations between the system and the environment is lost in this
measurement.
When we maximise entropy and thus make a new statistical inference about the system, the
new entropy will be higher than the old for two reasons:
1) all off-diagonal elements from the old density matrix are lost,
2) the diagonal elements p^{(old)}_{αα′αα′}(t) are in general not the ones that maximise entropy (see
the argument in §12.4):
p^{(new)}_{αα′} ≠ p^{(old)}_{αα′αα′}(t). (13.28)
Thus, the new density matrix
ρ̂^{(new)} = Σ_{αα′} p^{(new)}_{αα′} |αα′, new⟩⟨αα′, new|, (13.29)
has a higher entropy than the old one:
S^{(new)}_{vN}(t) > S^{(old)}_{vN}(t) = S_{vN}(0) = −Tr[ρ̂(0) ln ρ̂(0)], (13.30)
where the old entropy did not change because the p_{αα′}’s did not change.
So information is lost and we move forward to an ever more boring world... (which is a very
interesting fact, so don’t despair!)
You might think of what has happened as our total ignorance about the environment having
polluted our knowledge about the system as a result of the latter getting entangled with the former.
PART IV
Statistical Mechanics of Simple Systems
This part of the course was taught by Professors Andrew Boothroyd and Julien Devriendt.
PART V
Open Systems
14. Grand Canonical Ensemble
So you know what to do if you are interested in a system whose quantum states you
know and whose probabilities for being in any one of these states you have to guess based
on (the expectation of) the knowledge of some measurable mean quantities associated
with the system. So far (except in §10.4) the measurable quantity has always been mean
energy—and the resulting canonical distribution gave a good statistical description of a
physical system in contact with a heat bath at some fixed temperature.
Besides the measurable mean energy U , our system depended on a number of exactly
fixed external parameters: the volume V , the number of particles N —these were not
constraints, they did not need to be measured, they were just there, set in stone (a box
of definite volume, with impenetrable walls, containing a definite number of particles).
Mathematically speaking, the microstates of the system depended parametrically on V
and N,63 and so did their energies:
α = α(V, N),   E_α = E_α(V, N). (14.1)
There are good reasons to recast N as a measurable mean quantity rather than a fixed
parameter. This will allow us to treat systems that are not entirely closed and so can
exchange particles with other systems. For example:
—inhomogeneous systems in some external potential (gravity, electric field, rotation,
etc.), in which parts of the system can be thought of as exchanging particles with other
parts where the external potential has a different value (§14.5);
—multiphase systems, where different phases (e.g., gaseous, liquid, solid) can exchange
particles via evaporation, condensation, sublimation, solidification, etc. (§15.2, Part VII);
—systems containing different substances that can react with each other and turn into
each other (§15), e.g., chemically reacting mixtures (§§15.3 and 15.4), partially ionised
plasmas subject to ionisation/recombination (Exercise 15.2);
—systems in which the number of particles is not fixed at all and is determined by
the requirements of thermodynamical equilibrium, e.g., pair production/annihilation
(Exercise 16.7), thermal radiation (§19), etc.;
—systems where N might be fixed, but, for greater ease of counting microstates, it is
convenient formally to allow it to vary (Fermi and Bose statistics for quantum gases, §16).
Thus, let the mean number of particles be an additional measurable constraint:
Σ_α p_α N_α = N̄. (14.4)
Both U and N̄ are measurable; measuring N̄ is equivalent to measuring the mean density
n = N̄/V (14.5)
(note that V remains an exactly fixed external parameter).
We know the routine: maximise entropy subject to these two constraints:
S_G − λ(Σ_α p_α − 1) − β(Σ_α p_α E_α − U) + βµ(Σ_α p_α N_α − N̄) → max, (14.6)
where −βµ is the new Lagrange multiplier responsible for enforcing the new constraint
(14.4); the factor of −β is introduced to follow the conventional definition of µ, which is
called the chemical potential and whose physical meaning will shortly emerge. Carrying
out the maximisation in the same manner as in §9.1, we find
ln pα + 1 + λ + βEα − βµNα = 0. (14.7)
This gives us the grand canonical distribution:
p_α = e^{−β(E_α − µN_α)}/Z(β, µ), (14.8)
where the normalisation factor (arising from the Lagrange multiplier λ) is the grand
partition function:
Z(β, µ) = Σ_α e^{−β(E_α − µN_α)}. (14.9)
and so
N̄(β, µ) = (1/β) ∂ ln Z/∂µ. (14.13)
values of Nα will be more probable than others, so the mean number of particles N̄ will depend
on V .
Note that the canonical distribution and the canonical partition function (§9.1) can be recovered
as a special case of our new theory: suppose that for all α, the number of particles is the same,
Nα = N for all α. (14.14)
Then Eq. (14.8) becomes
p_α = (e^{βµN}/Z) e^{−βE_α} = e^{−βE_α}/Z, (14.15)
which is our old canonical distribution [Eq. (9.8)], where, using Eq. (14.9),
Z = e^{−βµN} Z = e^{−βµN} Σ_α e^{−β(E_α − µN)} = Σ_α e^{−βE_α} (14.16)
is the familiar non-grand partition function [Eq. (9.7)]. The relationship between the grand and
non-grand partition functions, when written in the form
Z = (e^{βµ})^N Z(β), (14.17)
highlights the quantity sometimes referred to as the “fugacity”, e^{βµ}.
Substituting the grand canonical distribution (14.8) into S_G = −Σ_α p_α ln p_α gives
S_G = ln Z + β(U − µN̄). (14.18)
Its differential is
dS_G = β(dU − N̄ dµ − µ dN̄) + (U − µN̄) dβ + dZ/Z
     = β(dU − N̄ dµ − µ dN̄) + (U − µN̄) dβ + Σ_α [e^{−β(E_α − µN_α)}/Z] [−β(dE_α − N_α dµ) − (E_α − µN_α) dβ]
     = β(dU − µ dN̄ − Σ_α p_α dE_α)
     = β(dU + P dV − µ dN̄) = β dQ_rev, (14.19)
where we have recognised e^{−β(E_α − µN_α)}/Z = p_α, the terms containing dµ and dβ have
cancelled, and, in the last step, Σ_α p_α dE_α = Σ_α p_α (∂E_α/∂V) dV = −P dV.
We have taken Eα = Eα (V ) (energy levels are a function of the single remaining external
parameter V , the volume of the system) but dNα = 0 (Nα is not a function of V ; see
footnote 64 in §14.1); we have also used our standard definition of pressure (7.13).
The right-hand side of Eq. (14.19) has to be identified as βdQrev because we would like
to keep the correspondence between SG and the thermodynamical entropy [Eq. (9.14)]
and between β and the thermodynamical temperature [Eq. (9.13)]:
S_G = S/k_B,   β = 1/k_B T. (14.20)
This implies the physical interpretation of µ: in a reversible process where U and V stay
the same but N̄ changes, adding each particle generates −µ amount of heat. In other
words,
µ = −T (∂S/∂N̄)_{U,V}. (14.21)
Intuitively, adding particles should increase entropy (systems with more particles
usually have a larger number of microstates available to them, so the uncertainty as
to which of these microstates they are in is likely to be greater)—therefore, we expect
µ to be a negative quantity, under normal circumstances. Equivalently, one might argue
that a positive value of µ would imply that entropy increased with diminishing N̄ and
so, in its quest to maximise entropy, a system with positive µ would be motivated to lose
all its particles and thus cease to be a system. This logic is mostly correct, although we
will encounter an interesting exception in the case of degenerate Fermi gas (§17).
It is in fact possible to derive Eq. (14.19) and the resulting variable-particle-number thermody-
namics from the canonical ensemble. Go back to Eq. (9.11) and treat the number of particles
N as a variable parameter, in the same way as volume was treated. Then
Σ_α p_α dE_α = Σ_α p_α [(∂E_α/∂V) dV + (∂E_α/∂N) dN] = −P dV + µ dN, (14.22)
where we used the definition (7.13) of pressure and introduced the chemical potential in an
analogous way as being, by definition,
µ = Σ_α p_α ∂E_α/∂N (the mean of ∂E_α/∂N), (14.23)
where pα are the canonical probabilities (9.8). In this scheme, µ is explicitly defined as the
energy cost of an extra particle [cf. Eq. (14.25)], in the same way that −P is the energy cost of
an extra piece of volume.
This illustrates that, in constructing various ensembles, we have some degree of choice as
to which quantities we treat as measurable constraints (U in the canonical ensemble, U and
N̄ in the grand canonical one) and which as exactly fixed external parameters that can be
varied between different equilibria (V in the grand canonical ensemble, V and N in the version
of the canonical ensemble that we have just outlined). In Exercise 14.6, this point is further
illustrated with an ensemble in which volume becomes a measurable constraint and pressure the
corresponding Lagrange multiplier.
dU = T dS − P dV + µdN̄ . (14.24)
Define also
Φ ≡ −k_B T ln Z = U − T S − µN̄, (14.26)
called the grand potential (its physical meaning will become clear in §14.6.3). The
usefulness of this quantity for open systems is the same as the usefulness of F for closed
ones: it is the function by differentiating which one gets all the relevant thermodynamical
quantities and equations. Indeed, using Eq. (14.24), we get
dΦ = −SdT − P dV − N̄ dµ , (14.27)
and so,
∂Φ
S=− , (14.28)
∂T V,µ
∂Φ N̄
N̄ = − equivalently, equation for density n = , (14.29)
∂µ T,V V
U = Φ + T S + µN̄ , (14.30)
∂Φ
P =− , equation of state (14.31)
∂V T,µ
(note that the equation of state will, in fact, turn out to be obtainable in an even simpler
way than this: see §14.6.3).
Similarly to the case of fixed number of particles, we have found that all we need to
do, pragmatically, is calculate the (grand) partition function Z(β, µ), which incorporates
all the microphysics relevant to the thermodynamical description, infer from it the grand
potential Φ, and then take derivatives of it—and we get to know everything we care
about.
What is the role that µ plays in all this? Eq. (14.24) suggests that −µ is to N̄ what
P is to V or 1/T is to U, i.e., it regulates the way in which some form of equilibrium is
achieved across a system.65
Consider two systems, 1 and 2, that can exchange energy and particles with each other
but are otherwise isolated:
U = U_1 + U_2 = const, (14.32)
N̄ = N̄_1 + N̄_2 = const, (14.33)
S = S_1 + S_2 → max. (14.34)
65 Let me reiterate the point that has (implicitly) been made in several places before. Extensive
thermodynamic variables like U , V , N̄ have intensive conjugate variables associated with
them: 1/T , P/T , −µ/T . They represent “entropic” costs of changing the extensive variables;
equivalently, T , −P and µ are energetic costs of changing the system’s entropy, volume and
particle number, respectively [see Eq. (14.24)]. It turns out that these costs cannot vary across
the free-trade zone that a system in equilibrium is.
Taking differentials (and using dU_2 = −dU_1, dN̄_2 = −dN̄_1),
dS = (∂S_1/∂U_1)_{N̄_1,V_1} dU_1 + (∂S_1/∂N̄_1)_{U_1,V_1} dN̄_1 + (∂S_2/∂U_2)_{N̄_2,V_2} dU_2 + (∂S_2/∂N̄_2)_{U_2,V_2} dN̄_2
   = [(∂S_1/∂U_1)_{N̄_1,V_1} − (∂S_2/∂U_2)_{N̄_2,V_2}] dU_1 + [(∂S_1/∂N̄_1)_{U_1,V_1} − (∂S_2/∂N̄_2)_{U_2,V_2}] dN̄_1 = 0, (14.35)
in which the first bracket equals 1/T_1 − 1/T_2 and the second equals −µ_1/T_1 + µ_2/T_2,
where we have used Eq. (14.21) to identify the derivatives in the second term. Setting
the first term to zero gives T1 = T2 = T (thermal equilibrium). Then setting the second
term to zero implies that
µ1 = µ2 , (14.36)
i.e., µ = const across a system in equilibrium. We also see that, if initially µ_1 ≠ µ_2, the
direction of change, set by dS > 0, is µ_1 < µ_2 ⇔ dN̄_1 > 0, so matter flows from larger
to smaller µ.
Thus, if we figure out how to calculate µ, we should be able to predict equilibrium
states: how many particles, on average, there will be in each part of a system in
equilibrium.
Exercise 14.1. Microcanonical Ensemble Revisited. Derive the grand canonical distri-
bution starting from the microcanonical distribution (i.e., by considering a small subsystem
exchanging particles and energy with a large, otherwise isolated system). This is a generalisation
of the derivation in §12.1.2.
14.4. Grand Partition Function and Chemical Potential of Classical Ideal Gas
So let us then learn how to calculate µ for our favorite special case of a classical
monatomic ideal gas.
As always, the key question is what are the microstates? The answer is that they
are the same as before [Eq. (11.18)], except now we can have an arbitrary number of
particles, so
α = (αN , N ), (14.37)
where
α_N = {n_{k_1}, n_{k_2}, . . . },   Σ_k n_k = N, (14.38)
are the microstates of a gas of N particles and nk are occupation numbers of the single-
particle states designated by the wave vectors k.
The grand partition function is, therefore,
Z = Σ_α e^{−β(E_α − µN_α)} = Σ_N e^{βµN} Σ_{α_N} e^{−βE_{α_N}} = Σ_N e^{βµN} Z_N, (14.39)
where ZN is the familiar partition function of a gas of N particles, for which, neglecting
quantum correlations, we may use Eq. (11.22):
Z ≈ Σ_N e^{βµN} Z_1^N/N! = Σ_N (e^{βµ} Z_1)^N/N! = e^{Z_1 e^{βµ}}. (14.40)
The grand potential (14.26) is, therefore,
Φ = −k_B T ln Z = −k_B T Z_1 e^{βµ}, (14.41)
whence N̄ = −(∂Φ/∂µ)_{T,V} = Z_1 e^{βµ}, (14.42)
and, solving this for µ with Z_1 = (V/λ_th³) Z_1^{(internal)},
µ = k_B T ln[nλ_th³/Z_1^{(internal)}], (14.46)
where n = N̄/V is the (mean) number density of the gas. Note that, as nλ_th³ ≪ 1 in the
classical limit [Eq. (11.26)] and Z_1^{(internal)} ≥ 1 (because the number of internal states is
at least 1), the formula (14.46) gives µ < 0, as anticipated in §14.2.
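For a feel for the numbers (an illustrative evaluation for helium at 1 atm and 300 K, taking Z_1^{(internal)} = 1; these parameters are chosen for the example only):

```python
# Chemical potential of a classical monatomic ideal gas, Eq. (14.46).
import numpy as np

hbar, kB, u = 1.0546e-34, 1.3807e-23, 1.6605e-27
T, P, m = 300.0, 1.013e5, 4 * u                     # helium at 1 atm, 300 K
n = P / (kB * T)
lam = hbar * np.sqrt(2 * np.pi / (m * kB * T))
mu = kB * T * np.log(n * lam**3)                    # Z_1^(internal) = 1
print(f"n lam^3 = {n * lam**3:.2g}  =>  mu = {mu/(kB*T):.1f} kB T")
```

µ comes out at roughly −13 k_B T: strongly negative, as expected in the classical regime.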
Finally, using Eqs. (14.40)–(14.42), we get two remarkably simple formulae:
Z = e^{N̄}, (14.47)
Φ = −k_B T N̄. (14.48)
Exercise 14.2. Particle Number Distribution. Consider a volume V of classical ideal gas
with mean number density n = N̄ /V , where N̄ is the mean number of particles in this volume.
Starting from the grand canonical distribution, show that the probability to find exactly N
particles in this volume is a Poisson distribution (thus, you will have recovered by a different
method the result of Exercise 1.2a).
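A quick numerical confirmation of the result quoted in this exercise (a sketch with arbitrary illustrative values of Z_1 and βµ): the grand canonical weights e^{βµN} Z_1^N/N!, normalised over N, coincide with a Poisson distribution of mean N̄ = Z_1 e^{βµ} [Eq. (14.42)].

```python
# p_N ∝ e^{beta mu N} Z_1^N / N!, normalised over N; compare with Poisson(Nbar).
import numpy as np
from math import lgamma

Z1, betamu = 120.0, np.log(7.3 / 120.0)       # illustrative; Nbar = Z1 e^{beta mu} = 7.3
Nbar = Z1 * np.exp(betamu)
Ns = np.arange(80)
logfact = np.array([lgamma(N + 1) for N in Ns])
logw = Ns * (betamu + np.log(Z1)) - logfact   # log of grand canonical weights
p = np.exp(logw - logw.max()); p /= p.sum()
poisson = np.exp(Ns * np.log(Nbar) - Nbar - logfact)
print(np.abs(p - poisson).max(), (p * Ns).sum())   # ~0, and <N> = Nbar
```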
Exercise 14.3. Rotating Gas. a) A cylindrical container of radius R is filled with ideal gas
at temperature T and rotating around the axis with angular velocity Ω. The molecular mass
is m. The mean density of the gas without rotation is n̄. Assuming the gas is in isothermal
equilibrium, what is the gas density at the edge of the cylinder, n(R)? Discuss the high and low
temperature limits of your result.
b) Putting a mixture of two gases with different particle masses into a rotating container (a
centrifuge) is a way to separate heavier from lighter particles (e.g., separate isotopes). Another
method of doing this was via effusion (see comment at the end of §3). Making a set of sensible
assumptions about all the parameters you need, assess the relative merits of the two methods.
c) In the calculation of the “isothermal atmosphere” we imagined subdividing it into thin
horizontal layers, each at constant z, and treated each layer as a homogeneous system. Similarly,
here, you had to imagine subdividing the cylinder into thin annular layers, each at constant
radius. Why can we use for this system the results originally derived in a rectangular box (from
§11.1 onwards)? Does it matter that we might not be describing quite correctly the particles
with low wave numbers (say, k ∼ R−1 )?
Exercise 14.4. Debye Screening. a) Consider a charged plate kept at the potential ϕ0 and
bounding a semi-infinite hydrogen plasma (an ideal gas consisting of ions and electrons with
charges e and −e, respectively). Assume that the plasma is in isothermal equilibrium with
temperature k_B T ≫ eϕ_0. The electrostatic potential satisfies Gauss’s law:
−d²ϕ/dx² = 4πe[n_i(x) − n_e(x)], (14.59)
where x is the distance from the plate and ni and ne are number densities of ions and electrons,
respectively. Assume that at x → ∞, ϕ → 0 and the number densities of ions and electrons are
equal: n_{i,e} → n_∞ = const (i.e., the plasma is neutral). Show that
ϕ(x) = ϕ_0 e^{−x/λ_D},   where λ_D = √(k_B T/8πe²n_∞). (14.60)
Thus, the plasma screens (or shields) the plate’s charge over a typical distance λ_D, known as the
Debye length.
b) The same mechanism is responsible for charges of all particles in a plasma being screened.
Considering, for example, a charged hydrogen ion in an infinite homogeneous isothermal hydro-
gen plasma, and using the same logic as above, show that the ion’s potential as a function of
distance r from the ion’s location is
ϕ(r) = (e/r) e^{−r/λ_D}, (14.61)
i.e., that individual charged particles cannot “see” each other behind the crowd of other particles
beyond distances ∼ λD .
14.6. Chemical Potential and Thermodynamic Potentials
Finally, we derive a few important general results concerning the relationship between
µ and various thermodynamical quantities.
i.e., µ is the Gibbs free energy cost of adding a particle in systems with fixed temperature and pressure.
This result leads to a remarkable further simplification. Since G = G(P, T, N̄ ) is an
extensive quantity, P and T are intensive and N̄ extensive, if we change N̄ by a factor
of λ, G must change by the same factor while P and T stay the same:
G(P, T, λN̄ ) = λG(P, T, N̄ ). (14.66)
Differentiate this with respect to λ, then set λ = 1:
[∂G/∂(λN̄)]_{P,T} N̄ = G,  λ = 1  ⇒  (∂G/∂N̄)_{P,T} N̄ = G, (14.67)
and, since (∂G/∂N̄)_{P,T} = µ [Eq. (14.65)],
µ = G/N̄. (14.68)
Exercise 14.5. Calculate G = U − T S + P V for the ideal gas using the results of §11.9 and
compare the outcome with Eq. (14.46).
Eq. (14.68) implies that µ is an intensive quantity (this was, of course, already obvious)
and so
∀λ, µ(P, T, λN̄ ) = µ(P, T, N̄ ) ⇒ µ = µ(P, T ), (14.69)
chemical potential is a function of pressure and temperature only. Indeed, for the ideal
gas [Eq. (14.46)], using P = nk_B T,
µ = k_B T ln[nλ_th³/Z_1^{(internal)}] = k_B T ln P + k_B T ln[λ_th³/(k_B T Z_1^{(internal)})], (14.70)
where the second term is a function of T only.
P = −Φ/V, (14.72)
a simpler formula than Eq. (14.31), promised at the end of §14.2. Eq. (14.72) tells us
that pressure is minus the grand-potential density, a way to give physical meaning to the
thus far formal quantity Φ (viz., Φ is minus “the total amount of pressure in the whole
volume of the system”).
Exercise 14.6. Pressure Ensemble. Throughout this course, we have repeatedly discussed
systems whose volume is not fixed, but allowed to come to some equilibrium value under pressure.
Yet, in both canonical (§9) and grand canonical (§14) ensembles, we treated volume as an
external parameter, not as a quantity only measurable in the mean. In this Exercise, your
objective is to construct an ensemble in which the volume is not fixed.
a) Consider a system with (discrete) microstates α to each of which corresponds some energy
Eα and some volume Vα . Maximise the Gibbs entropy subject to the measured mean energy
being U and the mean volume V̄ , with the number of particles N exactly fixed, and find
the probabilities p_α. Show that the (“grandish”) partition function for this ensemble can be
defined as
Z = Σ_α e^{−βE_α − σV_α}, (14.73)
where β and σ are Lagrange multipliers. How are β and σ calculated?
b) Show that if we demand that the Gibbs entropy SG for those probabilities be equal to
S/kB , where S is the thermodynamic entropy, then the Lagrange multiplier arising from the
mean-volume constraint is
σ = βP = P/k_B T, (14.74)
where P is pressure. Thus, this ensemble describes a system under pressure set by the environ-
ment.
c) Prove that
dU = T dS − P dV̄ . (14.75)
d) Show that
− kB T ln Z = G, (14.76)
where G is the Gibbs free energy defined in the usual way. How does one derive the equation of
state for this ensemble?
e) Calculate the partition function Z for a classical monatomic ideal gas in a container of
changeable volume but impermeable to particles (e.g., a balloon made of inelastic material).
You will find it useful to consider microstates of an ideal gas at fixed volume V and then sum
up over all possible values of V. This sum (assumed discrete) can be converted to an integral
via Σ_V = ∫₀^∞ dV/∆V, where ∆V is the “quantum of volume” (an artificial quantity shortly to
be eliminated from the theory; how small must ∆V be in order for the sum and the integral to
be good approximations of each other?).
Hint. You will need to use the formula ∫₀^∞ dx x^N e^{−x} = N!
f) Calculate G and find what conditions ∆V must satisfy in order for the resulting expression
to coincide with the standard formula for the ideal gas (derived in Exercise 14.5) and be
independent of ∆V (assume N ≫ 1). If you can argue that the unphysical quantity ∆V does
not affect any physically testable results, then your theory is sensible.
g) Show that the equation of state is
P = nk_B T,   n = N/V̄. (14.77)
[cf. Lewis & Siegert 1956]
Exercise 14.7. Expansio ad absurdum. Try constructing the “grandiose” ensemble, where
all three of mean energy, mean volume and mean number of particles are treated as measurable
constraints. Why is such a theory impossible/meaningless?
Exercise 14.8. Statistical Mechanics of a Black Hole. Here we pick up from our earlier
digression on the thermodynamics of black holes (see §10.5.1).
Consider the following model of a Schwarzschild black hole’s quantum states. Assume that
its horizon’s area is quantised according to
A_n = a_0 n,   n = 1, 2, 3, . . . ,   a_0 = 4ℓ_P² ln k,   ℓ_P = √(Gℏ/c³), (14.78)
where `P is the Planck length and ln k is some constant. Assume further that there are many
equiprobable microstates corresponding to each value of the area and use Bekenstein’s entropy
(10.40) to guess what the number Ωn of such states is:
S_n/k_B = A_n/4ℓ_P² = ln Ω_n  ⇒  Ω_n = k^n. (14.79)
Finally, assume that the mass of the black hole corresponding to each value of An is given (at
least approximately, for black holes much larger than the Planck length) by Schwarzschild’s
formula (10.38):
M_n = m_0 √n,   m_0 = (c²/G) √(a_0/16π) = m_P √(ln k/4π),   m_P = √(ℏc/G), (14.80)
where mP is the Planck mass.
a) Assume that the only measurable constraint in the problem is the mean mass of the
black hole, M̄ = ⟨M_n⟩ (equivalently, ⟨√n⟩). Attempt to use the maximum-entropy principle
to calculate probabilities of microstates. Are you able to calculate the partition function? Why
not? If you study the literature, you will see that a lot of other people have grappled with the
same problem, some less convincingly than others.
b) Try instead a kind of “grand canonical” approach, applying the maximum-entropy principle
with two constraints: the mean area of the horizon Ā = hAn i (equivalently, hni) and the mean
mass M̄ = hMn i. Why is one of the constraints in this scheme not a priori superfluous?
c) Show that the resulting partition function is
Z = Σ_n k^n e^{−µn + χ√n}, (14.81)
where µ and −χ are Lagrange multipliers (one could interpret µ as a kind of chemical potential).
Argue that we can obtain a finite Z and a black hole with large area and mass (compared with
the Planck area and mass) if χ ≫ γ ≡ µ − ln k > 0. Assuming that this is the case, calculate the
partition function approximately, by expanding the exponent in Eq. (14.81) around the value
n = n_0 where it is at its maximum. You should find that
Z ≈ √(4πn_0/γ) e^{γn_0} [1 + O(1/√(γn_0))],   n_0 = (χ/2γ)². (14.82)
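The saddle-point formula (14.82) is easy to test against a direct summation (a numerical sketch with arbitrarily chosen values of γ and χ satisfying χ ≫ γ > 0; note that k^n e^{−µn} = e^{−γn}):

```python
# Compare ln Z from the direct sum (14.81) with the saddle-point estimate (14.82).
import numpy as np

gamma, chi = 0.05, 2.0                        # so n_0 = (chi/2 gamma)^2 = 400 >> 1
n = np.arange(1, 200001)
logZ_direct = np.log(np.sum(np.exp(-gamma * n + chi * np.sqrt(n))))
n0 = (chi / (2 * gamma))**2
logZ_saddle = gamma * n0 + 0.5 * np.log(4 * np.pi * n0 / gamma)
print(logZ_direct, logZ_saddle)               # agree up to O(1/sqrt(gamma n0))
```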
d) Find expressions for Ā and M̄ in terms of γ and n0 (or, equivalently, γ and χ), keeping
the dominant and the largest subdominant terms in the large-n0 expansion. Hence show that
Ā and M̄ satisfy the Schwarzschild relation (10.38) to lowest order and also that the entropy
(calculated for the distribution that you have obtained) and Ā satisfy the Bekenstein formula
(10.40) in the same limit, up to a logarithmic correction, viz.,
S/k_B = Ā/4ℓ_P² + (1/2) ln(Ā/4ℓ_P²) + O(1). (14.83)
e) Confirm that neither of the two constraints that we have imposed is superfluous. However,
would any arbitrary values of Ā and M̄ lead to valid thermodynamics, with definite values of
the Lagrange multipliers obtainable?
f) Finally, work out the relationship between the entropy and the mean energy (U = M̄ c2 )
and show that the temperature, defined by 1/T = dS/dU , is the Hawking temperature (10.39).
Why is the temperature not just the Lagrange multiplier −χ and, therefore, negative?
g) Show that the heat capacity of a black hole is negative and that the mean square fluctuation
of the black hole’s mass around its mean is
⟨(M_n − M̄)²⟩ = m_P² ln k/(8πγ). (14.84)
Why is there not a relationship between the heat capacity and the mean square fluctuation of
energy (equivalently, mass) analogous to Eq. (10.37)?
[cf. Gour 1999]
and the multispecies thermodynamics, i.e., the expressions for S, N̄s , U and P , can be
read off from this in the same manner as Eqs. (14.28–14.31) were. The differentials of
the thermodynamic potentials (defined in the usual way) are, therefore
Eq. (14.24) → dU = T dS − P dV + Σ_s µ_s dN̄_s, (15.7)
Eq. (14.62) → dF = −S dT − P dV + Σ_s µ_s dN̄_s, (15.8)
Eq. (14.64) → dG = −S dT + V dP + Σ_s µ_s dN̄_s, (15.9)
whence follow the expressions for the chemical potential of species s, analogous to
Eqs. (14.21), (14.25), (14.63) and (14.68):
µ_s = −T (∂S/∂N̄_s)_{U,V,N̄_{s′≠s}} = (∂U/∂N̄_s)_{S,V,N̄_{s′≠s}} = (∂F/∂N̄_s)_{T,V,N̄_{s′≠s}} = (∂G/∂N̄_s)_{T,P,N̄_{s′≠s}}, (15.10)
where all these derivatives are taken at constant N̄_{s′}, s′ = 1, . . . , s − 1, s + 1, . . . , m.
which, upon differentiation with respect to λ and then setting λ = 1, gives us [cf.
Eq. (14.67)]
Σ_s (∂G/∂N̄_s)_{T,P,N̄_{s′≠s}} N̄_s = G, (15.12)
where each derivative equals µ_s [Eq. (15.10)],
whence it follows that the Gibbs free energy of a multispecies system is “the total amount
of chemical potential” amongst all species:
G = Σ_s µ_s N̄_s. (15.13)
Φ = U − T S − G = −P V, (15.14)
and the equation of state can again be obtained from this [Eq. (14.72)].
15.1.2. Fractional Concentrations
Since µs are all intensive [follows from Eq. (15.10)], they do not depend on the total
number of particles, but only on other intensive quantities, viz., pressure, temperature
and the fractional concentrations of all the species:
µs = µs (P, T, c1 , . . . , cm−1 ), (15.15)
where
c_s = N̄_s/N̄,   N̄ = Σ_s N̄_s. (15.16)
There are only m − 1 independent fractional concentrations as, obviously, Σ_s c_s = 1.
This is called the Gibbs phase rule. It implies, for example, that a single species (m = 1)
can only support an equilibrium state with r ≤ 3 coexisting phases (e.g., gas, liquid,
solid).
Eqs. (15.17) are the starting point for the theory of phase transitions, of which more
will be said in Part VII.
The proof of this is the standard so-called “availability” argument, which is as follows. Consider
a system in contact with environment. As it equilibrates, the total energy is conserved,
d(U + Uenv ) = 0, (15.25)
whereas the total entropy must grow,
d(S + Senv ) > 0. (15.26)
From Eq. (15.25),
dU = −dUenv = −Tenv dSenv + Penv dVenv . (15.27)
Note that the number of particles in the environment does not change: we assume that all the
exchanges/transmutations of matter occur within the system. Since dVenv = −dV (the volume
of the world is constant), this gives
Tenv dSenv = −dU − Penv dV. (15.28)
Now, from this and Eq. (15.26),
0 ⩽ Tenv(dS + dSenv) = Tenv dS − dU − Penv dV = −d(U − Tenv S + Penv V) = −dG. (15.29)
Thus, dG ⩽ 0, so the final equilibrium is achieved at the minimum value of G (the same argument mandates dF ⩽ 0 when V = const and, unsurprisingly, dS ⩾ 0 when also U = const, i.e., when
the system is isolated).
Minimising G at constant P and T then leads to
Σs νs µs = 0. (15.31)
This is the equation of chemical equilibrium. There will be an equation like this for
each reaction that the system is capable of (each specified by a set of numbers {νs }). All
these equations together give a set of constraints on fractional concentrations c1 , . . . , cm−1
because these are the only variables that µs depends on, at constant P and T [Eq. (15.15)].
Note that the number of equations (15.31) is not necessarily equal to the number of unknowns, so solutions need not exist and, if they do, need not be unique.
15.4. Chemical Equilibrium in a Mixture of Classical Ideal Gases: Law of Mass Action
In order to apply Eq. (15.31), we need explicit expressions for µs (P, T, c1 , . . . , cm−1 ).
We can get them from, e.g., Eq. (15.4), which we can rewrite so:
cs = (kB T/N̄) ∂ln𝒵/∂µs (15.32)
(a system of m equations for s = 1, . . . , m). This means that we need the grand partition
function for our mixture. If the mixture is of classical ideal gases, we can calculate it by
direct generalisation of the relevant results of §14.4. Since the gases are ideal, there are
no interactions between particles and so each species within a gas behaves as a separate
subsystem,66 in equilibrium with the rest. Therefore,
𝒵 = Πs 𝒵s = exp(Σs Z1s e^{βµs}), (15.33)
where 𝒵s is the grand partition function of the species s, we have used Eq. (14.40), and
Z1s = (V/λ³th,s) Z1s^{(internal)}, λth,s = ℏ √(2π/ms kB T), (15.34)
is the single-particle partition function of species s.
Exercise 15.1. Derive Eq. (15.33) directly, by constructing the microstates of a mixture of
ideal gases and then summing over all these microstates to get 𝒵.
Carrying out the differentiation in Eq. (15.32) with ln𝒵 given by Eq. (15.33), we get
cs = Z1s e^{βµs}/N̄ (15.35)
and, after using Eq. (15.34), we get [cf. Eqs. (14.43) and (14.46)]
µs = −kB T ln[Z1s/(cs N̄)] = kB T ln[cs n λ³th,s/Z1s^{(internal)}] = kB T ln[cs P λ³th,s/(Z1s^{(internal)} kB T)], (15.36)
where n = N̄ /V is the overall number density of the mixture and we have used Dalton’s
law: total pressure is the sum of the pressures of individual species,
P = Σs ns kB T = n kB T (15.37)
(see Exercise 4.3b or convince yourself, starting from Eq. (15.14), that this is true).
Finally, inserting Eq. (15.36) into Eq. (15.31), we find
" #
X cs P λ3ths
kB T νs ln (internal) k T
= 0. (15.38)
s Z 1s
B
66 This is not true in general for multicomponent chemical systems as they can, in principle, interpenetrate, be strongly interacting and have collective energy levels not simply equal to sums of the energy levels of individual components.
Thus, the fractional concentrations must obey
Σs νs ln cs = −Σs νs ln[P λ³th,s/(Z1s^{(internal)} kB T)], (15.39)
or, to write this in the commonly used form highlighting pressure and temperature
dependence,
Πs cs^{νs} = P^{−Σs νs} Πs [kB T Z1s^{(internal)}/λ³th,s]^{νs} ≡ K(P, T), (15.40)
where the product over s is a function of T only.
The right-hand side of this equation is called the chemical equilibrium constant, which,
for any given reaction (defined by νs ’s), is a known function of P , T and the microphysics
of the participating particles. The equation itself is known as the Law of Mass Action
(because of the particle masses ms entering K(P, T) via λth,s).
Eq. (15.40), together with the requirement that Σs cs = 1, constrains fractional
concentrations in chemical equilibrium. It also allows one to determine in which direction
the reaction will go from some initial non-equilibrium state:
—if Πs cs^{νs} > K(P, T), the reaction is direct, i.e., the concentrations cs of the species with νs > 0, which are on the left-hand side of Eq. (15.20), will go down, while those of the species with νs < 0, on the right-hand side of Eq. (15.20), will go up;
—if Πs cs^{νs} < K(P, T), the reaction is reverse.
This is all the chemistry you need to know! (At least in this course.)
Exercise 15.2. Partially Ionised Plasma. Consider atomic-hydrogen gas at high enough
temperature that ionisation and recombination are occurring. The reaction is given by
Eq. (15.23). Our goal is to find, as a function of density and temperature (or pressure and
temperature), the degree of ionisation χ = np /n, where np is the proton number density,
n = nH + np is the total number density of hydrogen, ionised or not, and nH is the number
density of the un-ionised H atoms. Note that n is fixed (conservation of nucleons). Assume
overall charge neutrality of the system.
a) What is the relation between chemical potentials of the H, p and e gases if the system is
in chemical equilibrium?
b) Treating all three species as classical ideal gases, show that in equilibrium,
ne np/nH = (me kB T/2πℏ²)^{3/2} e^{−R/kB T}, (15.41)
where R = 13.6 eV (1 Rydberg) is the ionisation energy of hydrogen. This formula is known as the Saha equation.
Hint. Remember that you have to include the internal energy levels into the partition function
for the hydrogen atom. You may assume that only the ground state energy level −R matters
(i.e., neglect all excited states).
c) Find the degree of ionisation χ = np /n as a function of n and T . Does χ go up or down
as density is decreased? Why? Consider a cloud of hydrogen with n ∼ 1 cm−3 . Roughly at
what temperature would most of it be ionised? These are approximately the conditions in the so-called "warm" phase of the interstellar medium—the stuff that much of the Galaxy is filled with (although, admittedly, the Law of Mass Action is not thought to be a very good approximation for the interstellar medium, because it is not exactly in equilibrium).
d) Now find an expression for χ as a function of total gas pressure P and temperature T .
Sketch χ as a function of T at several constant values of P .
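If you want to see numbers for parts (b) and (c), here is a minimal numerical sketch (mine, not part of the problem set; SI constants throughout). With ne = np = χn and nH = (1 − χ)n, Eq. (15.41) becomes the quadratic χ²n/(1 − χ) = S(T), which is solved below for the interstellar density n ∼ 1 cm⁻³ suggested in (c):

import math

kB   = 1.3807e-23       # J/K
hbar = 1.0546e-34       # J s
me   = 9.109e-31        # kg
R    = 13.6*1.602e-19   # ionisation energy of hydrogen, J

def chi(n, T):
    # Saha equation (15.41) rewritten as chi^2 n/(1 - chi) = S(T)
    S = (me*kB*T/(2*math.pi*hbar**2))**1.5*math.exp(-R/(kB*T))
    # n chi^2 + S chi - S = 0, positive root:
    return (-S + math.sqrt(S**2 + 4*n*S))/(2*n)

n = 1e6                 # m^-3, i.e. 1 cm^-3
for T in [2500, 3000, 3500, 4000]:
    print(f"T = {T} K: chi = {chi(n, T):.3g}")

The printout shows the gas going from mostly neutral to mostly ionised between roughly 3000 and 4000 K, far below R/kB ≈ 1.6 · 10⁵ K.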
PART VI
Quantum Gases
16. Quantum Ideal Gases
So far, in all our calculations of partition functions for gases, we have stayed within
the classical limit, where the key assumption was that the number of single-particle
states available to particles was much greater than the number of these particles, so the
probability of any one particle occupying any given single-particle state was small and,
therefore, the probability of more than one particle laying claim to the same state could
be completely discounted. The time has now come to relax this assumption, but first, let me explain what those quantum correlations, which we have so far been so determined to avoid, actually are.
Exchanging two identical particles can change their joint wave function by at most a phase; applying the exchange twice shows that this phase can only be ±1, leaving two possibilities:
1) ψ(2, 1) = +ψ(1, 2), (16.3)
called bosons—they can be proven to be particles with integer spin, e.g., photons (spin 1), ⁴He atoms (spin 0);
2) ψ(2, 1) = −ψ(1, 2), (16.4)
called fermions—these are particles with half-integer spin, e.g., e, n, p, ³He (spin 1/2).
The fermions are subject to the Pauli exclusion principle: if the states 1 and 2 are the
same, then
ψ(1, 1) = −ψ(1, 1) = 0, (16.5)
so no two fermions can be in the same state. This is precisely an example of quantum
correlations: even though the gas is ideal and so the fermions are non-interacting, the
system as a whole “knows” which single-particle states are occupied and so unavailable
to other particles.
67 See Landau & Lifshitz (1981), §61–62 for a rigorous generalisation of this argument to N-particle wave functions and the derivation of the connection between a particle's spin and the exchange symmetry.
What does all this mean for the statistical mechanics of systems composed of bosons
or fermions? Recall that the microstates of a box of ideal gas were specified in terms of
occupation numbers ni of single-particle states i.68 What we have just inferred from the
exchange symmetries determines what values these occupation numbers can take:
—for bosons, ni = 0, 1, 2, 3, . . . (any integer),
—for fermions, ni = 0 or 1 (no more than 1 particle in each state).
Armed with this knowledge, we are ready to start computing.
The energy of the microstate α = {n1, n2, n3, …} is Eα = Σi ni εi [Eq. (16.6)] (εi are the energies of the single-particle states i), and the particle number in state α is
Nα = Σi ni. (16.7)
Then
𝒵 = Σα e^{−β(Eα − µNα)} = Σ_{{ni}} e^{−β Σi ni(εi − µ)} = Σ_{n1} Σ_{n2} Σ_{n3} ⋯ Πi e^{−β ni(εi − µ)} = Πi Σ_{ni} e^{−β ni(εi − µ)}, (16.8)
where the multiple sums run over all possible values of {ni}.
For fermions, ni = 0 or 1, so the sum Σ_{ni} has only two terms and so
𝒵 = Πi [1 + e^{−β(εi − µ)}]. (16.9)
For bosons, ni = 0, 1, 2, 3, …, so, summing the geometric series,
𝒵 = Πi Σ_{ni=0}^{∞} [e^{−β(εi − µ)}]^{ni} = Πi 1/[1 − e^{−β(εi − µ)}]. (16.10)
68 In §11, i was k, but in general, single-particle states will depend on other quantum numbers as well, e.g., spin, angular momentum, vibrational levels, etc.—but they are still discrete, so we simply index them by i.
16.3. Occupation Number Statistics and Thermodynamics
The probability for a given set of occupation numbers to occur is given by the grand
canonical distribution (14.8):
pα ≡ p(n1, n2, n3, …) = (1/𝒵) e^{−β Σi ni(εi − µ)}. (16.12)
Therefore, the mean occupation number of a single-particle state j is
n̄j ≡ ⟨nj⟩ = Σ_{{ni}} nj p(n1, n2, n3, …) = (1/𝒵) Σ_{{ni}} nj e^{−β Σi ni(εi − µ)} = −(1/β) ∂ln𝒵/∂εj. (16.13)
Carrying out this differentiation with ln𝒵 given by Eq. (16.9) for fermions and Eq. (16.10) for bosons, we get
n̄j = 1/[e^{β(εj − µ)} ± 1], (16.14)
where the upper sign is for fermions (the Fermi–Dirac distribution) and the lower for bosons (the Bose–Einstein distribution).
Exercise 16.1. Entropy of Fermi and Bose Gases. a) Show that the Gibbs entropy of an ideal gas with a set of mean occupation numbers {n̄i} is
SG = −kB Σi [n̄i ln n̄i ± (1 ∓ n̄i) ln(1 ∓ n̄i)], (16.16)
where the upper sign is for fermions and the lower for bosons.
Hint. Observe that ΩN(N1, N2, …) = Πi Ωi, where Ωi is the number of ways to assign the Ni particles available for the microstate i to the N copies in the ensemble.
Note that Eq. (16.16) certainly holds for Fermi and Bose gases in equilibrium, i.e., if the
occupation numbers n̄i are given by (16.14) (convince yourself that this is the case), but you
have shown now that it also holds out of equilibrium, i.e., for arbitrary sets of occupation numbers
(arbitrary particle distributions).
b) Considering a system with fixed mean energy and number of particles and maximising
SG , derive from Eq. (16.16) the Fermi–Dirac and Bose–Einstein formulae (16.14) for the mean
occupation numbers in equilibrium.
c) Devise a way to treat a classical ideal gas by the same method.
The machinery you have learned from Exercise 16.1 can be used in a somewhat unexpected
way to think of the statistics of self-gravitating systems (e.g., distribution of energies of stars in
a galaxy) or of collisionless plasmas (cf. Exercise 6.3)—generally, systems of many particles
interacting via some field (gravitational, electromagnetic) but not experiencing particle-on-
particle collisions. It turns out one can argue that, subject to certain assumptions, these
(classical!) systems strive towards a variant of the Fermi–Dirac distribution known as the
Lynden-Bell distribution (after the seminal paper by Lynden-Bell 1967). If you are intrigued
by this, read §8.5 of Schekochihin (2019) and do Exercise 8.3 in that section.
The mean total number of particles is N̄ = Σi n̄i (16.17) [equivalent to Eq. (14.13)]. From this point on, I will drop the bars on N as we really are interested in the case with a fixed number of particles again and the use of the grand canonical ensemble was a matter of analytical convenience. As I explained at the end of
§14.1 and around Eq. (14.23), canonical results are recoverable from the grand canonical
ones because they correspond to the special case of Nα = N for all α (with N treated as
a parameter, akin to V ).
Exercise 16.2. Show that using Eq. (16.11) in Eq. (14.13) gives the same result as using
Eq. (16.14) in Eq. (16.17).
Similarly, we can compute the mean energy U = Σi n̄i εi [equivalent to Eq. (14.11)], the grand potential and the equation of state [Eqs. (14.26) and (14.72)],
Φ = −kB T ln𝒵 ⇒ P = −Φ/V, (16.19)
and the entropy
S = (U − Φ − µN)/T (16.20)
[equivalent to Eq. (14.28)], whence we can get the heat capacities, etc.
Since n̄i only depends on k = |k|, via ε(k), we can approximate the sum over single-
particle states with an integral as follows, using the same trick as in Eq. (11.7),
Σi = (2s + 1) Σk = [(2s + 1)V/(2π)³] ∫ d³k = [(2s + 1)V/(2π)³] ∫₀^∞ 4πk² dk = [(2s + 1)V/2π²] ∫₀^∞ dk k² ≡ ∫₀^∞ dk g(k), (16.25)
e^{βµ} ≈ nλ³th/(2s + 1) ⇒ µ ≈ kB T ln[nλ³th/(2s + 1)], (16.35)
which is precisely the classical expression (14.46) with Z1^{(internal)} = 2s + 1, q.e.d.!
Note that we have also confirmed that the classical limit is achieved when
n/nQ = nλ³th/(2s + 1) ≪ 1, (16.36)
as anticipated in our derivation of the partition function for the classical ideal gas [see
Eq. (11.26)].
Let us be thorough and confirm that we can recover our familiar expression for the grand and
ordinary partition functions of an ideal gas in the classical limit. As we now know, we must take
e^{βµ} ≪ 1. From Eq. (16.11), we get in this limit
ln𝒵 ≈ e^{βµ} Σi e^{−βεi} = e^{βµ} Z1 ⇒ 𝒵 ≈ e^{Z1 e^{βµ}}, (16.37)
which is Eq. (14.40), the classical grand partition function. Note that, in the classical limit, using Eq. (16.35),
Z1 = (2s + 1) V/λ³th = N e^{−βµ} ⇒ 𝒵 = e^N. (16.38)
Furthermore, if N is fixed, we find from Eq. (14.16) that the ordinary partition function is
Z = 𝒵/(e^{βµ})^N ≈ e^N [(2s + 1)/nλ³th]^N = Z1^N/(N^N e^{−N}) ≈ Z1^N/N! (16.39)
[using (2s + 1)/nλ³th = Z1/N, by Eq. (16.38), and Stirling's formula],
Finally, as anticipated in §11.10, we can also recover Maxwell’s distribution from the
occupation-number statistics in the classical limit: from Eq. (16.14) and using Eq. (16.35),
n̄i = 1/[e^{β(εi − µ)} ± 1] ≈ e^{βµ} e^{−βεi} = [nλ³th/(2s + 1)] e^{−βεi}. (16.40)
This is exactly the expression (11.38) that we expected for the occupation numbers in a classical ideal gas, so as to recover the Maxwellian distribution. Note that Eq. (16.40) makes it obvious that in the classical limit, n̄i ≪ 1, i.e., all microstates are mostly unoccupied—just as we argued (in §11.8) must be the case in order for quantum correlations to be negligible.
Obviously, none of this is a great surprise, but it is nice how neatly it all works out.
U = Σi n̄i εi = ∫₀^∞ dε g(ε) ε/[e^{β(ε−µ)} ± 1] = [2(2s + 1)/√π] (V/λ³th) kB T β^{5/2} ∫₀^∞ dε ε^{3/2}/[e^{β(ε−µ)} ± 1]
⇒ U = N kB T [(nQ/n)(2/√π) ∫₀^∞ dx x^{3/2}/(e^{x−βµ} ± 1)]. (16.41)
Exercise 16.3. Via a calculation analogous to what was done in §16.4.3, check that the
expression in brackets in Eq. (16.41) is equal to 3/2 in the classical limit [as it ought to be;
see Eq. (11.30)].
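A quick numerical version of this check (mine, assuming scipy is available): in the classical limit, e^{βµ} = n/nQ [Eq. (16.35)], so the expression in brackets in Eq. (16.41) is e^{−βµ}(2/√π) ∫₀^∞ dx x^{3/2}/(e^{x−βµ} ± 1), which should tend to 3/2 as βµ → −∞ for either sign:

import numpy as np
from scipy.integrate import quad

def bracket(beta_mu, sign):            # sign = +1 fermions, -1 bosons
    def integrand(x):
        t = np.exp(-(x - beta_mu))     # <= 1 for x >= 0 > beta_mu: overflow-safe
        return x**1.5*t/(1.0 + sign*t)
    return np.exp(-beta_mu)*(2/np.sqrt(np.pi))*quad(integrand, 0, np.inf)[0]

for bm in [-1.0, -3.0, -10.0]:
    print(bm, bracket(bm, +1), bracket(bm, -1))   # both columns tend to 1.5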
16.4.5. Grand Potential of a Quantum Ideal Gas
From Eqs. (16.19) and (16.11),
Φ = −kB T ln𝒵 = ∓kB T Σi ln[1 ± e^{−β(εi−µ)}] = ∓kB T ∫₀^∞ dε g(ε) ln[1 ± e^{−β(ε−µ)}]
= ∓N kB T (nQ/n)(2/√π) ∫₀^∞ dx √x ln(1 ± e^{−x+βµ}).
Integrating by parts [√x = (2/3)(d/dx)x^{3/2}],
∫₀^∞ dx √x ln(1 ± e^{−x+βµ}) = −(2/3) ∫₀^∞ dx x^{3/2} (∓e^{−x+βµ})/(1 ± e^{−x+βµ}) = ±(2/3) ∫₀^∞ dx x^{3/2}/(e^{x−βµ} ± 1),
whence, comparing with Eq. (16.41),
Φ = −(2/3) U. (16.42)
Combining this with Eq. (16.19), we get the equation of state
P = 2U/3V, (16.43)
i.e., pressure is 2/3 of the energy density, completely generally for a non-relativistic quantum ideal gas in 3D [not just in the classical limit, cf. Eq. (1.29)].
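Incidentally, the integration by parts in Eq. (16.42) is easy to spot-check numerically; a minimal sketch, assuming scipy is available:

import numpy as np
from scipy.integrate import quad

# Int_0^inf dx sqrt(x) ln(1 +- e^{-x+bm}) = +-(2/3) Int_0^inf dx x^{3/2}/(e^{x-bm} +- 1)
for bm in [-1.5, -0.3]:                # bm = beta*mu < 0 (legitimate for bosons too)
    for sign in (+1, -1):
        lhs = quad(lambda x: np.sqrt(x)*np.log1p(sign*np.exp(-x + bm)), 0, np.inf)[0]
        def rhs_int(x):
            t = np.exp(-(x - bm))      # overflow-safe form of 1/(e^{x-bm} +- 1)
            return x**1.5*t/(1.0 + sign*t)
        rhs = sign*(2.0/3.0)*quad(rhs_int, 0, np.inf)[0]
        print(sign, lhs, rhs)          # each pair agrees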
Exercise 16.4. What happens in 2D? Trace back the way in which the dimensionality of space
entered into all these calculations.
Exercise 16.5. Check that, in the classical limit, the expression in brackets in Eq. (16.44)
asymptotes to unity and the familiar classical equation of state (11.32) is thus recovered.
16.5. Degeneration
We have seen above (§16.4.3) that for nλ³th ≪ 1 (hot, dilute gas), we recover the classical limit. Obviously, we did not go to all the trouble of calculating quantum statistics just to get back to the classical world. The new and exciting things will happen when the classical limit breaks down, viz., nλ³th ≳ 1.
Under what conditions does this happen? Let us start from the classical limit, use P =
nkB T , and estimate:
nλ³th = [P/(kB T)] [ℏ √(2π/m kB T)]³ ≈ 2.5 · 10⁻⁵ (P/1 atm) (T/300 K)^{−5/2} (m/mp)^{−3/2}. (16.51)
This gives us:
—⁴He at 4 K and 1 atm: nλ³th ∼ 0.15, getting dangerous...;
—electrons in metals: nλ³th ∼ 10⁴ ≫ 1 at T = 300 K (here we used n ∼ 10²⁸ m⁻³, not P = nkB T). Thus, they are completely degenerate even in everyday conditions! It does
indeed turn out that you cannot correctly calculate heat capacity of metals solely based
on classical models (see Exercise 19.2). This will be a clear application of Fermi statistics
in the quantum (degenerate) limit.
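These estimates are easy to reproduce; a minimal numerical sketch of Eq. (16.51) and of the two examples above (the conduction-electron density used for the metal, n ≈ 8.5 · 10²⁸ m⁻³ for copper, is an assumed textbook value, not taken from the text):

import math

hbar, kB, atm = 1.0546e-34, 1.3807e-23, 101325.0
mp, me, mHe = 1.6726e-27, 9.109e-31, 6.646e-27    # kg

def lam_th(m, T):
    # thermal de Broglie wavelength, as in Eq. (15.34)
    return hbar*math.sqrt(2*math.pi/(m*kB*T))

n = atm/(kB*300)                       # P = 1 atm, T = 300 K, m = mp:
print(n*lam_th(mp, 300)**3)            # ~ 2.5e-5, the prefactor in Eq. (16.51)

n = atm/(kB*4)                         # 4He at 4 K and 1 atm
print(n*lam_th(mHe, 4)**3)             # ~ 0.15

print(8.5e28*lam_th(me, 300)**3)       # electrons in copper at 300 K: ~ 1e4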
Note that this teaches us that "low-" and "high-"temperature limits do not necessarily apply at temperatures naïvely appearing to be low or high from our everyday perspective. For example, for electrons in metals, temperature would stop being "low" (i.e., the classical limit would be approached) when nλ³th ∼ 1, or T ∼ Tdeg ∼ 2πn^{2/3}ℏ²/me kB ∼ 10⁴ K. The "degeneration temperature" is high because density is high and the particles (electrons) are light. Of course, most metals in fact would melt and, indeed, evaporate, dissociate
and ionise at such temperatures. Thus, the world is more quantum than you might have
thought.
Another famous application of the theory of degenerate Fermi gases is to the admittedly
less mundane environments of white dwarves and neutron stars, where densities are so
high that even relativistic temperatures (T ≳ mc²/kB) can be "low" from the point of
view of quantum effects being dominant (some elements of Chandrasekhar’s theory of
the stability of stars will appear in Exercise 17.1).
Exercise 16.6. Ultrarelativistic Quantum Gas. Consider an ideal quantum gas (Bose or
Fermi) in the ultrarelativistic limit and reproduce the calculations of §16.4 as follows.
a) Find the equation that determines its chemical potential (implicitly) as a function of density
n and temperature T .
b) Calculate the energy U and grand potential Φ and hence prove that the equation of state
can be written as
P V = (1/3) U, (16.52)
regardless of whether the gas is in the classical limit, degenerate limit or in between.
c) Consider an adiabatic process with the number of particles held fixed and show that P V^{4/3} = const (16.53) for any temperature and density (not just in the classical limit, as in Exercise 11.3).
d) Show that in the hot, dilute limit (large T, small n), e^{µ/kB T} ≪ 1. Find the specific condition
on n and T that must hold in order for the classical limit to be applicable. Hence derive the
condition for the gas to cease to be classical and become degenerate.
e) Estimate the minimum density for which an electron gas can be simultaneously degenerate
and ultrarelativistic.
Exercise 16.7. Pair Plasma. At relativistic temperatures, the number of particles can stop
being a fixed number, with production and annihilation of electron-positron pairs providing the
number of particles required for thermal equilibrium. The reaction is
e+ + e− ⇔ photon(s). (16.54)
a) What is the condition for the “chemical” equilibrium for this system?
b) Assume that the numbers of electrons and positrons are the same (i.e., ignore the fact
that there is ordinary matter and, therefore, a surplus of electrons). This allows you to treat the
situation as fully symmetric and conclude that the chemical potentials of electrons and positrons
are the same. What are they equal to? Hence calculate the density of electrons and positrons as a function of the temperature T.
T ≪ TF ≡ εF/kB ∼ ℏ²n^{2/3}/m kB ∼ Tdeg, (17.5)
precisely the degeneration temperature that we already derived in §§16.5, 11.8 and 2.3.2 (e.g., TF ∼ 10⁴ K for electrons in metals).69
69 I stress again that "low T" in this context just means T ≪ TF, even though TF can be very high for systems with large density and low particle mass. For example, for electrons in white dwarves (Exercise 17.1), εF ∼ MeV and so TF ∼ 10¹⁰ K ∼ me c²/kB, so in fact they are not just hot but relativistically hot—and all our calculations must be redone with the relativistic formula for ε(k) [see Eq. (16.24)].
At T = 0, the mean energy is U = (3/5) N εF [Eq. (17.6)], whence the equation of state [Eq. (16.43)] is
P = 2U/3V = (2/5) n εF = (ℏ²/5m) [6π²/(2s + 1)]^{2/3} n^{5/3}. (17.7)
This is, of course, independent of T (indeed, T = 0) and so the gas might be said to
behave as a “pure mechanism” (changes in volume and pressure are hard-coupled, with
no heat exchange involved).
Note that Eq. (17.7) is equivalent to P V 5/3 = const, the general adiabatic law for a
quantum gas [Eq. (16.46)]. This is perhaps not surprising as we are at T = 0 and expect
S = 0 = const (although we will only prove this in the next section).
Exercise 17.1. White Dwarves, Neutron Stars and Black Holes. This question deals
with the states into which stars collapse under gravity when they run out of nuclear fuel. As
matter is compressed, the density of electrons will eventually become so large as to turn them
into a degenerate gas, effectively at zero temperature, while the nuclei supply gravity and enforce
charge neutrality (any local deviation from zero charge density is quickly ironed out by large
electric forces). Let us assume that the total mass of matter per electron is m (typically, for
each electron, there is one proton and roughly one neutron, so m ≈ me + mp + mn ≈ 2mn ). Our
objective is, given the total mass M of the star, to determine its radius R. They are related by
M = 4π ∫₀^R dr r² ρ(r), (17.8)
where ρ(r) = mne (r) is the mass density and ne (r) is the electron number density. Thus, we
need to work out the density profile of a spherically symmetrical cloud of degenerate (T = 0)
electron gas in a gravitational field determined by that same density profile, the gravitational
potential ϕ satisfying
∇2 ϕ = 4πGρ. (17.9)
a) Assuming particle equilibrium and arguing that the effective potential energy associated
with placing an electron at the location r is mϕ(r) (cf. §14.5), show that the chemical potential
of the electron gas can be expressed as
µ(r) = [1/(Λ²R⁴)] f(r/R), where Λ = 8√2 G m² me^{3/2}/3πℏ³, (17.10)
and f(x) is a dimensionless function of a dimensionless argument satisfying the boundary-value problem
(1/x²)(d/dx)(x² df/dx) = −f^{3/2}, f(1) = 0, f′(0) = 0. (17.11)
While this equation can only be solved numerically, you should not find it a difficult task to
sketch its solution. Sketch also the resulting density profile.
b) On the basis of Eqs. (17.8) and (17.10), argue (dimensionally) that the radius of a white
dwarf R ∝ M −1/3 . Indeed, using Eq. (17.9), you should be able to show precisely that
M R³ = −f′(1)/m G Λ². (17.12)
Numerical solution of Eq. (17.11) gives f′(1) ≈ −132. Hence show that the radius of a solar-mass white dwarf (M ≈ 2 · 10³⁰ kg) is of the order of the radius of the Earth.
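The quoted numbers are easy to check; here is a minimal numerical sketch (mine, assuming scipy). The substitution f(x) = ξ₁⁴ θ(ξ₁x) maps Eq. (17.11) onto the standard Lane–Emden equation of index 3/2, θ″ + (2/ξ)θ′ = −θ^{3/2} with θ(0) = 1, θ′(0) = 0, where ξ₁ is the first zero of θ; then f(0) = ξ₁⁴ and f′(1) = ξ₁⁵ θ′(ξ₁).

import numpy as np
from scipy.integrate import solve_ivp

def lane_emden(xi, y):
    theta, dtheta = y
    return [dtheta, -max(theta, 0.0)**1.5 - 2.0*dtheta/xi]

def hit_zero(xi, y):                   # stop at the first zero of theta
    return y[0]
hit_zero.terminal = True

xi0 = 1e-6                             # start just off the singular point xi = 0,
y0 = [1.0 - xi0**2/6.0, -xi0/3.0]      # using the series theta ~ 1 - xi^2/6
sol = solve_ivp(lane_emden, [xi0, 10.0], y0, events=hit_zero,
                rtol=1e-10, atol=1e-12)

xi1 = sol.t_events[0][0]
dtheta1 = sol.y_events[0][0][1]
print(xi1)                             # ~ 3.654
print(xi1**4)                          # f(0)  ~ 178
print(xi1**5*dtheta1)                  # f'(1) ~ -132

With the exponent 3/2 replaced by 3 and the scaling f(x) = ξ₁ θ(ξ₁x), the same integrator handles the ultrarelativistic case of part (d) below and gives f′(1) = ξ₁² θ′(ξ₁) ≈ −2.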
The existence of this equilibrium solution is easy to interpret. The gravitational energy pulling
the white dwarf together is, obviously, ∝ −M²/R, whereas the internal (Fermi) energy pushing it apart is ∝ N(N/V)^{2/3} ∝ M^{5/3}/R². Their sum, the total energy
E = const M^{5/3}/R² − const M²/R, (17.13)
Figure 26. Energy of a white dwarf (or neutron star): (a) non-relativistic, see Eq. (17.13); (b) relativistic, see Eq. (17.14).
has a minimum at R ∝ M^{−1/3}, where the equilibrium solution will sit (Fig. 26a). Equivalently,
this is a balance between gravity and pressure.
The situation changes if the electron gas is ultrarelativistic: the Fermi energy is then ∝ N(N/V)^{1/3} ∝ M^{4/3}/R (see Exercise 17.4), which has the same R dependence as the gravitational energy and so can only be balanced with the latter at a single value of M = M0, at which
the total energy
E = const M^{4/3}/R − const M²/R (17.14)
is zero. When M < M0 , E > 0, so the gas will want to expand until it becomes non-relativistic;
when M > M0 , it will contract to ever smaller R (Fig. 26b). In the next part of this Exercise,
we discover how this result is reflected in a formal calculation.
c) Show that the non-relativistic approximation (εF ≪ me c²) breaks down for
M ≳ (1/m²)(cℏ/G)^{3/2}. (17.15)
For this estimate, you may use the mean density 3M/4πR3 of the white dwarf or the density
at its centre; if you use the latter, you will need to know that f (0) ≈ 178 (how different are
the mean density and the density at the centre?). How does the mass threshold that you have
obtained compare with the mass of our Sun?
d) Redo the above calculations for an ultrarelativistic gas and show that
µ(r) = [1/(√Λ R)] f(r/R), where Λ = 4G m²/3πc³ℏ³ and (1/x²)(d/dx)(x² df/dx) = −f³ (17.16)
(with the same boundary conditions as before). Show that there is a single value of mass, M = M0, compatible with such an equilibrium. Using the fact (which can be obtained numerically) that f′(1) ≈ −2, show that M0 ≈ 1.45 M⊙. This is called the Chandrasekhar limit (he discovered it at the age of 19, during his voyage from India to England in 1930).
As explained above, when M > M0 , the white dwarf collapses. As density goes up, electrons
are captured by protons and everything turns into neutrons. The result is again a Fermi gas,
but now consisting of neutrons. If its Fermi energy is smaller than mn c2 , the non-relativistic
calculation done in (a)–(b) applies, but with me → mn and m → mn . The stable solution
obtained this way is called a neutron star. For masses large enough that neutrons become
relativistic, this too is unstable and collapses into a black hole. The corresponding mass limit is
a few solar masses. You may estimate it yourself, working in the same vein as you did in (c)–
(d). Note, however, that things are, in fact, more complicated: as neutrons become relativistic,
Newton’s equation (17.9) is no longer valid, you have to use GR and also work with the general
relativistic energy-momentum relation (16.23) because the ultrarelativistic limit is, in fact, never
quite reached. The quantitative details are messy, but the qualitative conclusion is the same:
there is an order-unity interval of masses around M⊙ in which neutron stars can exist; lighter stars end up as white dwarves, heavier ones collapse into black holes.
[Literature: Landau & Lifshitz (1980), Ch. XI]
Thus, the heat capacity starts off linearly in T at low T and eventually asymptotes to a constant (= 3N kB/2) at high T (Fig. 27a). In metals at sufficiently low temperatures, this
heat capacity due to electrons is the dominant contribution because the heat capacity
due to lattice vibrations is ∝ T 3 (see Exercise 19.2).
70 Note that we have not even proven yet that S = 0 at T = 0: in Eq. (16.45), the numerator and the denominator are both 0 at T = 0, but finding the limit of their ratio requires knowledge of the derivatives of U and µ with respect to T.
Figure 27. Thermodynamics of a Fermi gas: (a) heat capacity CV(T), see Eq. (17.28); (b) equation of state P(T), see Eq. (17.29).
Eq. (17.20), via P = (2/3)U/V [Eq. (16.43)], also gives us the general form of the
equation of state for a Fermi gas at low temperatures: P grows quadratically from the
T = 0 value [Eq. (17.7)], asymptoting to P = nkB T at T ≫ εF/kB (Fig. 27b).
This highlights a key thermodynamical (and, indeed, mechanical) difference between a Fermi gas and a classical gas: at low T, the Fermi gas exerts a much larger pressure than it would have done had it been classical. This is of course due to the stacking of particles in the energy levels up to εF and the consequently larger energy density than would have been achieved at low temperature had Pauli's exclusion principle not been in operation.
At low but finite temperatures, the quantities of interest are integrals of the form
I = ∫₀^∞ dε f(ε)/[e^{β(ε−µ)} + 1], (17.22)
where f(ε) = g(ε) ∝ √ε in Eq. (16.31), f(ε) = g(ε)ε ∝ ε^{3/2} in Eq. (16.41), and it can also scale with other powers of ε in other limits and regimes (e.g., in 2D, or for the ultrarelativistic calculations in Exercise 17.4).
Figure 28. Chemical potential µ(T ) of a Fermi gas; see Eq. (17.25).
I = ∫₀^µ dε f(ε) + 2(kB T)² f′(µ) ∫₀^∞ dx x/(e^x + 1) + O[(kB T/µ)⁴]
= ∫₀^µ dε f(ε) + (π²/6) f′(µ)(kB T)² + …, (17.23)
where we have used ∫₀^∞ dx x/(e^x + 1) = π²/12.
This is called the Sommerfeld expansion. It allows us to calculate finite-T corrections to anything
we like by substituting the appropriate form of f (ε).
First, we calculate the chemical potential from Eq. (16.31), to which we apply Eq. (17.23)
with
f(ε) = g(ε) = [N/((2/3)εF^{3/2})] √ε. (17.24)
This gives
N = [N/((2/3)εF^{3/2})] [(2/3)µ^{3/2} + (π²/6)(kB T)²/2√µ + …] ⇒ µ = εF [1 − (π²/12)(kB T/εF)² + …], (17.25)
where µ = εF + … has been used in the correction term.
Thus, µ falls off with T —eventually, it must become large and negative in the classical limit, as
per Eq. (16.35) (Fig. 28).
Now we turn to mean energy: in Eq. (16.41), use Eq. (17.23) with
f(ε) = g(ε) ε = [N/((2/3)εF^{3/2})] ε^{3/2} (17.26)
to get
U = [N/((2/3)εF^{3/2})] [(2/5)µ^{5/2} + (π²/6)(3/2)√µ (kB T)² + …] = (3/5) N εF [1 + (5π²/12)(kB T/εF)² + …], (17.27)
where Eq. (17.25) has been used for µ in the first term and µ = εF + … in the second.
In the lowest order, this gives us back Eq. (17.6), while the next-order correction is precisely
the δU (T ) that we need to calculate heat capacity:
CV = (∂U/∂T)_V = N kB (π²/2)(kB T/εF) + … . (17.28)
Finally, substituting Eqs. (17.25) and (17.27) into Eq. (16.45), we find the entropy of a Fermi
gas at low temperature:
S = (1/T)[(5/3)U − µN] = N kB (π²/2)(kB T/εF) + ⋯ → 0 as T → 0. (17.30)
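To see how accurate these low-temperature formulae are, here is a minimal numerical sketch (mine, assuming scipy; units with εF = kB = 1), which solves the normalisation condition (16.31) for µ(T) and compares with Eq. (17.25):

import math
from scipy.integrate import quad
from scipy.optimize import brentq

def fermi(x):                          # 1/(e^x + 1), overflow-safe
    if x > 0:
        return math.exp(-x)/(1.0 + math.exp(-x))
    return 1.0/(1.0 + math.exp(x))

def norm(mu, T):
    # 1 = (3/2) Int sqrt(e) de/(exp((e - mu)/T) + 1); the integrand is
    # negligible beyond e ~ 10 for the temperatures used here
    return 1.5*quad(lambda e: math.sqrt(e)*fermi((e - mu)/T), 0.0, 10.0)[0] - 1.0

for T in [0.05, 0.1, 0.2]:
    mu = brentq(norm, -2.0, 2.0, args=(T,))
    print(f"T = {T}: mu = {mu:.5f}, Eq. (17.25): {1 - math.pi**2/12*T**2:.5f}")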
Exercise 17.2. Ratio of Heat Capacities for a Fermi Gas. Show that the ratio of heat
capacities for a Fermi gas CP /CV → 1 as T → 0. Can you show this without the need to
use the detailed calculation of §17.3.3? Sketch CP /CV as a function of T from T = 0 to the
high-temperature limit.
Exercise 17.3. We have seen that µ > 0 for a Fermi gas at low temperatures. In §14.2, we
argued, on the basis of Eq. (14.21), that adding particles to a system (at constant U and V )
would increase entropy and so µ would have to be negative. Why does this line of reasoning fail
for a degenerate Fermi gas?
Exercise 17.4. Heat Capacity of an Ultrarelativistic Electron Gas. Find the Fermi
energy εF of an ultrarelativistic electron gas and show that when kB T ≪ εF, its energy density is
U/V = (3/4) n εF (17.31)
and its heat capacity is
CV = N kB π² kB T/εF. (17.32)
Sketch the heat capacity CV of an ultrarelativistic electron gas as a function of temperature, from T ≪ εF/kB to T ≫ εF/kB.
The strangeness of the degenerate Fermi gas, compared to classical gas, was, in a
sense, that it behaved as if there were more of it than there actually was (§17.3.2). The
strangeness of the degenerate Bose gas will be that it behaves as if there were less of it.
Figure 29. Chemical potential µ(T ) of a Bose gas; see Eq. (18.4).
Clearly, as T → 0 (β → ∞), the lower the energy, the larger the occupation number, and so at T = +0 we expect all particles to drop to the ground state:
n̄0 = 1/(e^{−βµ} − 1) → N as β → ∞ (18.3)
⇒ µ(T → +0) ≈ −kB T ln(1 + 1/N) ≈ −kB T/N → −0. (18.4)
The chemical potential of a Bose gas starts off at µ = 0, eventually decaying further with
increasing T to its classical value (16.35) (Fig. 29).
Thus, at low temperatures, the lowest-energy state becomes macroscopically occupied:
n̄0 (T = 0) = N and, clearly, n̄0 ∼ some significant fraction of N for T just above
zero. This is a serious problem for the calculations in §16.4, which were all done in the
continuous limit. Indeed, we replaced the sum over states i with an integral over energies weighted by the density of states, Eq. (16.30), but the latter was g(ε) ∝ √ε [Eq. (16.29)],
so the ε = 0 state always gave us a vanishing contribution to our integrals! This is not
surprising as the continuous approximation of a sum over states can obviously only be
reasonable if the number of particles in each state is small compared to the total number
N . As we have just seen, this is patently wrong for a Bose gas at sufficiently low T , so
we must adjust our theory. In order to adjust it, let us first see how, mathematically
speaking, it breaks down as T → 0 (and break down it must, otherwise we would be in
grave trouble, with spurious results emerging!).
Recall that the first step in any treatment of a quantum gas is to calculate µ(n, T )
from Eq. (16.32), a transcendental equation that has the form, for a Bose gas,
f(βµ) ≡ (2/√π) ∫₀^∞ dx √x/(e^{x−βµ} − 1) = n/nQ ∝ n/T^{3/2}. (18.5)
The solution to this equation (see Fig. 30) certainly exists at low n and high T (small
right-hand side)—that was the classical limit (§16.4.3). The solution is there because
in the limit βµ → −∞, the function f (βµ) ≈ eβµ is monotonic and one can always
find the value of µ for which Eq. (18.5) would be satisfied. The solution also always
exists in the opposite (low-T ) limit for Fermi gases, with µ(T → 0) being the Fermi
energy: again, this is because, in the limit βµ → ∞, the Fermi version of our function
f(βµ) ≈ (2/√π) ∫₀^{βµ} dx √x = [4/(3√π)](βµ)^{3/2} [this is Eq. (17.3), µ = εF] is monotonic.
In contrast, for Bose gas, as we saw in Eq. (18.2), there are no physically legitimate
positive values of µ and so f (βµ) has a finite upper limit:
f(βµ) ⩽ f(0) = (2/√π) ∫₀^∞ dx √x/(e^x − 1) = ζ(3/2) ≈ 2.612, (18.6)
where ζ is Riemann's zeta function (it does not matter how the integral is calculated; the important thing is that it is a finite number).
Therefore, if n/nQ > f (0) (and there is no reason why that cannot be, at low enough
T and/or high enough n), Eq. (18.5) no longer has a solution! The temperature below
which this happens is T = Tc such that
n/nQ = nλ³th/(2s + 1) = f(0) ≈ 2.612 ⇒ Tc ≈ (2πℏ²/mkB) [n/2.612(2s + 1)]^{2/3}. (18.7)
Thus,
—for T > Tc, all is well and we can always find µ(n, T); as T → Tc + 0, we will have µ → −0;
—for T < Tc, we must set µ = 0,72 but this means that now Eq. (18.5) no longer determines µ, but rather the number of particles in the excited states (ε > 0):
Nexcited = nQ V f(0) < N. (18.8)
Equivalently,
Nexcited/N ≈ 2.612(2s + 1)/nλ³th = (T/Tc)^{3/2}, (18.9)
whence the occupation number of the ground state is
n̄0 = N − Nexcited = N [1 − (T/Tc)^{3/2}]. (18.10)
The ground state is macroscopically occupied at T < Tc and n̄0 = N at T = 0 (Fig. 31).
The phenomenon of a macroscopic number of particles collecting in the lowest-energy
state is called Bose–Einstein condensation. This is a kind of phase transition (which
occurs at T = Tc ), but the condensation is not like ordinary condensation of vapour:
it occurs in the momentum space! When the condensate is present (T < Tc ), Bose gas
72 From Eq. (18.4), we know that it is a tiny bit below zero, but for the purposes of the continuous approximation, this is 0, because N ≫ 1 in the thermodynamic limit.
behaves as a system in which the number of particles is not conserved at all because
particles can always leave the excited population (Nexcited ) and drop into the condensate
(n̄0), or vice versa, and the number of the excited particles is determined by thermodynamical parameters (temperature and total mean density). This is rather similar to the
way a photon gas behaves in the sense that for the latter too, the number of photons is
set by the temperature (mean energy) of the system and, appropriately, µ = 0, a generic
feature of systems in which the number of particles is not conserved (see §19 and Exercise
19.1).73
As might have been expected, the critical temperature Tc ∼ Tdeg, the degeneration temperature (i.e., at T ≫ Tc, we are back in the classical limit). For ⁴He, Tc ≈ 3 K, quite cold, and this is a typical value under normal conditions, so not many gases are still gases at these temperatures and Bose condensates tend to be quite exotic objects.74
In 2001, Cornell, Wieman and Ketterle got the Nobel Prize for the first experimental
observation of Bose condensation, one of those triumphs of physics in which mathematical
reasoning predicting strange and whimsical phenomena is proven right as those strange
and whimsical phenomena are found to be real. We have become used to this, but do
pause and ponder what an extraordinary thing this is.
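As a small numerical illustration of Eqs. (18.6) and (18.7), here is a minimal sketch (mine, assuming scipy; the liquid-helium number density below is an assumed value, the text quotes only Tc ≈ 3 K):

import math
from scipy.integrate import quad

# f(0) = zeta(3/2), Eq. (18.6)
f0 = (2/math.sqrt(math.pi))*quad(lambda x: math.sqrt(x)/math.expm1(x), 0, math.inf)[0]
print(f0)                                        # ~ 2.612

hbar, kB = 1.0546e-34, 1.3807e-23
m = 6.646e-27                                    # mass of a 4He atom, kg
n = 2.2e28                                       # m^-3, liquid 4He (assumed)
Tc = 2*math.pi*hbar**2/(m*kB)*(n/f0)**(2/3)      # Eq. (18.7) with s = 0
print(f"Tc = {Tc:.1f} K")                        # ~ 3 K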
Exercise 18.1. Low Energy Levels in Degenerate Bose Gas. In a degenerate Bose gas,
the lowest energy level (particle energy ε0 = 0) is macroscopically occupied, in the sense that its
occupation number n̄0 is comparable with the total number of particles N . Is the first energy
level (particle energy ε1 , the next one above the lowest) also macroscopically occupied? In order
to answer this question, estimate the occupation number of the first level and work out how it
scales with N (you will find that n̄1 ∝ a fractional power of N ). What is the significance of this
result: do the particles in the first level require special consideration as a condensate the same
way the zeroth-level ones did?
73 Another system of this ilk is the ultrarelativistic pair plasma, which you encountered in Exercise 16.7.
74 Superfluidity and superconductivity are related phenomena, although the systems involved are not really non-interacting ideal gases and one needs to do quite a bit more theory to understand them (see, e.g., Lifshitz & Pitaevskii 1980).
18.2.1. Mean Energy
Using the results obtained in the continuous approximation (which is fine for the
excited particles), we get, from Eq. (16.41) at T < Tc and, therefore, with µ = 0,
U = [(2s + 1)V/λ³th] kB T (2/√π) ∫₀^∞ dx x^{3/2}/(e^x − 1) ≈ 0.128 (2s + 1)V m^{3/2}(kB T)^{5/2}/ℏ³, (18.11)
where (2/√π) ∫₀^∞ dx x^{3/2}/(e^x − 1) = (3/2)ζ(5/2) ≈ (3/2) · 1.341 and 3 · 1.341/[2(2π)^{3/2}] ≈ 0.128.
Note, however, that Eq. (18.11) perhaps better emphasises the fact that the mean energy
depends on T and V , but not on the number of particles (which is adjustable by the
system depending on what volume the particles are called upon to occupy and at what
temperature they are doing it); in Eq. (18.13), this fact is hidden in the dependence of
Tc on n = N/V .
Note that 1.93 > 3/2, so CV at T = Tc is larger than it is in the classical limit. It turns
out that at T = Tc , CV has a maximum and a discontinuous derivative (Fig. 32a). The
jump in the derivative can be calculated by expanding around T = Tc . This is done, e.g.,
in Landau & Lifshitz (1980, §62). The answer is
(∂CV/∂T)_{T=Tc−0} ≈ 2.89 N kB/Tc, (18.15)
(∂CV/∂T)_{T=Tc+0} ≈ −0.77 N kB/Tc. (18.16)
Thus, Bose condensation is a 3rd-order phase transition (meaning that a third derivative
of Φ is discontinuous).
Figure 32. Thermodynamics of a Bose gas: (a) heat capacity CV(T), see Eqs. (18.14), (18.15) and (18.16); (b) equation of state P(T), see Eq. (18.17).
The salient fact here is that pressure (equivalently, the energy density) is independent of
particle density and depends on temperature only. Obviously, at T Tc , the equation of
state must asymptote to the classical ideal gas law (Fig. 32b).
Note that, as I promised at the beginning of §18, a degenerate Bose gas exerts
less pressure at low T than it would have done had it been classical (in contrast to the Fermi gas, which punches above its weight; §17.3.2). This is, of course, again because of
the energetic invisibility of the part of the gas that has dropped into the Bose condensate.
Such is the weird and wonderful quantum world. We must stop here. Enjoy!
Exercise 18.2. Degenerate Bose Gas in 2D. a) Show that Bose condensation does not
occur in 2D.
Hint. The integral that you will get when you write the formula for N is doable in elementary
functions. You should find that N ∝ ln(1 − e^{βµ}).
b) Calculate the chemical potential as a function of n and T in the limit of small T . Sketch
µ(T ) from small to large T .
c) Show that the heat capacity (at constant area) is C ∝ T at low temperatures and sketch
C(T ) from small to large T .
Exercise 18.3. Paramagnetism of Degenerate Bose Gas. Consider a gas of bosons with
spin 1 in a weak magnetic field, with energy levels
ε(k) = ℏ²k²/2m − 2µB sz B, sz = −1, 0, 1, (18.18)
where µB = eℏ/2me c is the Bohr magneton.
a) Derive an expression for the magnetic susceptibility of this system. Show that Curie’s law
(χ ∝ 1/T ) is recovered in the classical limit.
b) What happens to χ(T ) as the temperature tends to the critical Bose-Einstein condensation
temperature from above (T → Tc + 0)? Sketch χ(T ).
c) At T < Tc and for a given B, which quantum state will be macroscopically occupied?
Taking B → +0 (i.e., infinitesimally small), calculate the spontaneous magnetisation of the
system,
M0(n, T) = lim_{B→+0} M(n, T, B), (18.19)
as a function of n and T . Explain why the magnetisation is non-zero even though B is vanishingly
small. Does the result of (b) make sense in view of what you have found?
19. Thermal Radiation (Photon Gas)
[Literature: Landau & Lifshitz (1980), §63]
This part of the course was taught by Professors Andrew Boothroyd and Julien
Devriendt.
Exercise 19.1. Work out the theory of thermal radiation using the results of Exercise 16.6.
Exercise 19.2. Heat Capacity of Metals. The objective here is to find at what temperature
the heat capacity of the electron gas in a metal dominates over the heat capacity associated
with the vibrations of the crystal lattice.
a) Calculate the heat capacity of electrons in aluminium as a function of temperature for
T TF .
b) To estimate the heat capacity due to the vibrations of the lattice, you will need to use the
so-called Debye model. Derive it from the results you obtained in Exercise 16.6 as follows.
The vibrations of the lattice can be modelled as sound waves propagating through the metal.
These in turn can be thought of as massless particles (“phonons”) with energies ε = ~ω and
frequencies ω = cs k, where cs is the speed of sound in a given metal and k is the wave number
(a discrete set of allowed wave numbers is determined by the size of the system, as usual). Thus,
the statistical mechanics for the phonons is the same as for photons, with two exceptions: (i)
they have 3 possible polarisations in 3D (1 longitudinal, 2 transverse) and (ii) the wave number
cannot be larger, roughly, than the inverse spacing of the atoms in the lattice (do you see why
this makes sense?).
Given these assumptions,
— derive an expression for the density of states g(ε) [or g(ω)];
— derive an expression for the mean energy of a slab of metal of volume V ;
— figure out the condition on temperature T that has to be satisfied in order for it to be
possible to consider the maximum wave number effectively infinite;
— calculate the heat capacity in this limit as a function of T; you may need to use the fact that ∫₀^∞ dx x³/(e^x − 1) = π⁴/15.
Hint. You already did all the required maths in Exercise 16.6, so all you need is to figure out
how to modify it to describe the phonon gas. You will find it convenient to define the Debye
temperature
ΘD = ℏ cs (6π²n)^{1/3}/kB, (19.1)
where n is the number density of the metal. This is the temperature associated with the maximal
wave number in the lattice, which Debye defined by stipulating that the total number of possible
phonon modes was equal to 3 times the number of atoms:
∫₀^{kmax} dk g(k) = 3N. (19.2)
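For orientation, here is a minimal numerical sketch of where these steps lead (mine, assuming scipy): the standard Debye result that they assemble into is CV/N kB = 9(T/ΘD)³ ∫₀^{ΘD/T} dx x⁴ eˣ/(eˣ − 1)², which interpolates between the low-T T³ law and the Dulong–Petit constant 3; the Debye temperature of aluminium, ΘD ≈ 428 K, is an assumed textbook value.

import numpy as np
from scipy.integrate import quad

def cv_debye(T, Theta_D):
    # heat capacity per atom in units of kB
    integrand = lambda x: x**4*np.exp(x)/np.expm1(x)**2
    return 9.0*(T/Theta_D)**3*quad(integrand, 0.0, Theta_D/T)[0]

Theta_D = 428.0                        # K, aluminium (assumed textbook value)
for T in [5, 50, 200, 1000]:
    print(f"T = {T} K: C_V/(N kB) = {cv_debye(T, Theta_D):.4f}")
# low T: (12 pi^4/5)(T/Theta_D)^3; high T: -> 3 (Dulong-Petit)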
PART VII
Thermodynamics of Real Gases
[Literature: Landau & Lifshitz (1980), Ch. VII and VIII]
This part of the course was taught by Professors Andrew Boothroyd and Julien Devriendt.
IUCUNDI ACTI LABORES ("labours accomplished are pleasant").
Acknowledgments
I am grateful to many at Oxford who have over the last five years given advice,
commented, pointed out errors, asked challenging questions and generally helped me get a
grip. A partial list is: tutors Michael Barnes, James Binney (who convinced me to handle
entropy in the way I do and on whose unpublished lecture notes on Statistical Mechanics I
have drawn in some of my own lectures), Steve Blundell, Andrew Boothroyd, Archie Bott,
Radu Coldea, Julien Devriendt, Hadleigh Frost, John Gregg, Dieter Jaksch, Nick Jelley,
Minhyong Kim, Adrian Menssen, Federico Nova, Sid Parameswaran, Armin Reichold,
Steve Simon, students Bruno Balthazar, Matias Janvin, Jakob Kastelic, Matthew Elliot,
Harry Plumley, Guillermo Valle, Marjolein van der Linden, Ziyan Li, Leander Thiele,
Yungsong Tao, Krystof Kolar, Marco Fabus, Robert Ewart (as well as many others who
asked good questions but never introduced themselves) and especially the generations of
Merton College 2nd-year undergraduates who had to endure not just my lecturing but
also my tutoring—and in responding vigorously to the latter helped improve (I hope) the
former: Will Bennett, Jon Burr, Matt Constable, Richard Fern (who later returned as a
brilliant teaching assistant), Alessandro Geraldini, Luke Hughes, Laurie McClymont,
Michael Adamer, Joel Devine, Gregory Farquhar, Sammy Gillespie, Catherine Hale,
Toby Smith, Chris Staines, Tiffany Brydges, Ravin Jain, Sergejs Lukanihins, James
Matthew, Alex Moore, Jasper Russell, Liz Traynor, Mantas Abazorius, David Felce,
Chris Hamilton, Tom Hornigold, Isabelle Naylor, Adam Stanway, Glenn Wagner, Toby
Adkins (who produced a set of rival course notes to explain what I was really talking
about), David Hosking, Ilya Lapan, Ewan McCullogh, Caleb Rich, Robert Stemmons,
Jacob White, Matthew Davis, Pablo Espinoza, Joey Li, Oliver Paulin, Max Plummer,
Uros Ristivojevic, Kirill Sasin, Georgia Acton, Lucy Biddle, Jules Desai, Roshan Dodhia,
Andrew Doyle, Catherine Felce, Jack McIntyre, Jacob Robertson, Leonie Woodland,
Richard Chatterjee, Maeve Dai, Rayhan Mahmud, Oskar Maatta, Ross McDonald.
REFERENCES
Bekenstein, J. D. 1973 Black holes and entropy. Physical Review D 7, 2333.
Berezin, F. A. 2007 Lectures on Statistical Physics. Moscow: MCCME Publishing.
Binney, J. & Skinner, D. 2013 The Physics of Quantum Mechanics. Oxford University Press.
Blundell, S. J. & Blundell, K. M. 2009 Concepts in Thermal Physics. Oxford University
Press, 2nd Edition.
Boltzmann, L. 1995 Lectures on Gas Theory. Dover.
Bradbury, R. 1952 A Sound of Thunder. Collier.
Chandrasekhar, S. 2003 Hydrodynamic and Hydromagnetic Stability. Dover.
Chapman, S. & Cowling, T. G. 1991 The Mathematical Theory of Non-uniform Gases.
Cambridge University Press, 3rd Edition.
Dellar, P. J. 2015 Kinetic Theory of Gases. Lecture Notes for the Oxford MMathPhys course
on Kinetic Theory; URL: http://people.maths.ox.ac.uk/dellar/MMPkinetic.html.
Dunkel, J. & Hilbert, S. 2014 Consistent thermostatistics forbids negative absolute
temperatures. Nature Physics 10, 67.
Ford, I. 2013 Statistical Physics. An Entropic Approach. Wiley.
Gibbs, J. W. 1902 Elementary Principles in Statistical Mechanics Developed with Especial
Reference to the Rational Foundation of Thermodynamics. New York: Charles Scribner’s
Sons.
Ginzburg, V. L., Levin, L. M., Sivukhin, D. V. & Yakovlev, I. A. 2006 Problems
for a Course of General Physics II. Thermodynamics and Molecular Physics. Moscow:
Fizmatlit.
Gour, G. 1999 Schwarzschild black hole as a grand canonical ensemble. Physical Review D 61,
021501(R).
Jaynes, E. T. 1965 Gibbs vs. Boltzmann entropies. American Journal of Physics 33, 391.
Jaynes, E. T. 2003 Probability Theory: The Logic of Science. Cambridge: Cambridge University
Press.
Kapitsa, P. L. 1974 Experiment, Theory, Practice. Moscow: Nauka.
Kardar, M. 2007 Statistical Physics of Particles. Cambridge: Cambridge University Press.
Landau, L. D. & Lifshitz, E. M. 1980 Statistical Physics, Part 1 (L. D. Landau and
E. M. Lifshitz’s Course of Theoretical Physics, Volume 5). Pergamon Press.
Landau, L. D. & Lifshitz, E. M. 1981 Quantum Mechanics: Non-Relativistic Theory
(L. D. Landau and E. M. Lifshitz’s Course of Theoretical Physics, Volume 3). Elsevier.
Lewis, M. B. & Siegert, A. J. F. 1956 Extension of the condensation theory of Yang and Lee
to the pressure ensemble. Physical Review 101, 1227.
Lifshitz, E. M. & Pitaevskii, L. P. 1980 Statistical Physics, Part 2: Theory of the Condensed
State (L. D. Landau and E. M. Lifshitz’s Course of Theoretical Physics, Volume 9).
Pergamon Press.
Lifshitz, E. M. & Pitaevskii, L. P. 1981 Physical Kinetics (L. D. Landau and E. M. Lifshitz’s
Course of Theoretical Physics, Volume 10). Elsevier.
Lynden-Bell, D. 1967 Statistical mechanics of violent relaxation in stellar systems. Mon. Not.
R. Astron. Soc. 136, 101.
Maxwell, J. C. 1860 Illustrations of the dynamical theory of gases.—Part I. On the motions
and collisions of perfectly elastic spheres. The London, Edinburgh and Dublin Philosophical
Magazine and Journal of Science 19, 19.
Pauli, W. 2003 Thermodynamics and the Kinetic Theory of Gases (Pauli Lectures on Physics,
Volume 3). Dover.
Schekochihin, A. A. 2019 Kinetic Theory of Plasmas. Lecture Notes for the Oxford
MMathPhys course on Kinetic Theory; URL: http://www-thphys.physics.ox.ac.uk/
people/AlexanderSchekochihin/KT/2015/KTLectureNotes.pdf.
Schrödinger, E. 1990 Statistical Thermodynamics. Dover.
Shannon, C. 1948 A mathematical theory of communication. The Bell System Technical Journal
27, 379.
Sinai, Ya. G. 1992 Probability Theory: An Introductory Course. Springer.
Szilard, L. 1929 On the decrease of entropy in a thermodynamic system by the intervention
of intelligent beings. Zeitschrift für Physik 53, 840, English translation in Behavioral
Science 9, 301 (1964).
Verlinde, E. 2011 On the origin of gravity and the laws of Newton. Journal of High Energy
Physics 4, 029.