Crash Course Statistical Mechanics
Noah Miller
December 27, 2018
Abstract
A friendly introduction to statistical mechanics, geared towards
covering the powerful methods physicists have developed for working
in the subject.
Contents
1 Statistical Mechanics
1.1 Entropy
1.2 Temperature and Equilibrium
1.3 The Partition Function
1.4 Free energy
1.5 Phase Transitions
1.6 Example: Box of Gas
1.7 Shannon Entropy
1.8 Quantum Mechanics, Density Matrices
1.9 Example: Two state system
1.10 Entropy of Mixed States
1.11 Classicality from environmental entanglement
1.12 The Quantum Partition Function
1 Statistical Mechanics
1.1 Entropy
Statistical Mechanics is a branch of physics that pervades all other
branches. Statistical mechanics is relevant to Newtonian mechanics,
relativity, quantum mechanics, and quantum field theory.
Figure 1: Statistical mechanics applies to all realms of physics.
$$(x_1, y_1, z_1, p_{x,1}, p_{y,1}, p_{z,1}, \ldots, x_N, y_N, z_N, p_{x,N}, p_{y,N}, p_{z,N}) \in \mathbb{R}^{6N}$$
microstates. For example, the “macrostate” of a box of gas labelled by
(U, V, N ) would be the set of all microstates with energy U , volume V ,
and particle number N . The idea is that if you know what macrostate
your system is in, you know that your system is equally likely to truly
be in any of the microstates it contains.
Figure 2: You may know the macrostate, but only God knows the mi-
crostate.
S ≡ k log Ω. (1)
The only reason that we need k to define S is because the human race
defined units of temperature before they defined entropy. (We’ll see
how temperature factors into any of this soon.) Otherwise, we probably
would have set k = 1 and temperature would have the same units as
energy.
You might be wondering how we actually count Ω. As you probably
noticed, the phase space $\mathbb{R}^{6N}$ is not discrete. In that situation, we
integrate over a phase space volume with the measure
$$d^3x_1 \, d^3p_1 \ldots d^3x_N \, d^3p_N.$$
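To make the counting concrete, here is a tiny numerical sketch (a toy example of my own, not from the text, with k = 1): for N spins that can each point up or down, the macrostate "n_up spins point up" contains $\Omega = \binom{N}{n_{\rm up}}$ microstates, and $S = k \log \Omega$ grows roughly in proportion to N.

    # Toy illustration: S = k log(Omega) for N two-state spins, with k = 1.
    from math import comb, log

    def entropy(N, n_up, k=1.0):
        omega = comb(N, n_up)      # number of microstates in this macrostate
        return k * log(omega)      # Eq. (1)

    for N in (10, 100, 1000):
        print(N, entropy(N, N // 2))   # grows roughly linearly in N: entropy is extensive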
1.2 Temperature and Equilibrium
Let’s say we label our macrostates by their total internal energy
U and some other macroscopic variables like V and N . (Obviously,
these other macroscopic variables V and N can be replaced by different
quantities in different situations, but let’s just stick with this for now.)
Our entropy S depends on all of these variables.
S = S(U, V, N ) (4)
$$\frac{1}{T} \equiv \left(\frac{\partial S}{\partial U}\right)_{V,N}.$$ (5)
The crucial realization of statistical mechanics is that, all else being
equal, a system is most likely to find itself in a macrostate corresponding
to the largest number of microstates. This is the so-called “Second law
of thermodynamics”: for all practical intents and purposes, the entropy
of a closed system always increases over time. It is not really a physical
“law” in the regular sense, it is more like a profound realization.
Therefore, the entropy SAB of our joint AB system will increase as
time goes on until it reaches its maximum possible value. In other words,
A and B trade energy in a seemingly random fashion that increases SAB
on average. When SAB is finally maximized, we say that our systems
are in “thermal equilibrium.”
Let’s say that the internal energy of system A is UA and the internal
energy of system B is UB . Crucially, note that the total energy of
combined system
UAB = UA + UB
is constant over time! This is because energy of the total system is
conserved. Therefore,
dUA = −dUB .
Now, the combined system will maximize its entropy when UA and UB
have some particular values. Knowing the value of UA is enough though,
because UB = UAB − UA . Therefore, entropy is maximized when
$$0 = \frac{\partial S_{AB}}{\partial U_A}.$$ (7)
However, we can rewrite this as
$$0 = \frac{\partial S_{AB}}{\partial U_A}
= \frac{\partial S_A}{\partial U_A} + \frac{\partial S_B}{\partial U_A}
= \frac{\partial S_A}{\partial U_A} - \frac{\partial S_B}{\partial U_B}
= \frac{1}{T_A} - \frac{1}{T_B}.$$
Therefore, our two systems are in equilibrium if they have the same
temperature!
$$T_A = T_B$$ (8)
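Here is a hedged numerical sketch of that claim (a toy model of my own, not from the text): give each system an ideal-gas-like entropy $S_i(U) = \frac{3}{2} N_i \log U$ with k = 1, scan over the possible splits of the total energy, and check that the split maximizing $S_{AB}$ is exactly the one where $T_A = T_B$.

    # Two systems trading a fixed total energy; joint entropy is maximal where T_A = T_B.
    import numpy as np

    N_A, N_B = 100, 300           # "sizes" of the two systems
    U_tot = 1000.0                # total energy, conserved

    def S(U, N):                  # toy entropy, k = 1
        return 1.5 * N * np.log(U)

    U_A = np.linspace(1.0, U_tot - 1.0, 100001)
    S_AB = S(U_A, N_A) + S(U_tot - U_A, N_B)

    U_A_star = U_A[np.argmax(S_AB)]          # energy split that maximizes S_AB
    T_A = U_A_star / (1.5 * N_A)             # from 1/T = dS/dU = 1.5 N / U
    T_B = (U_tot - U_A_star) / (1.5 * N_B)
    print(U_A_star, T_A, T_B)                # T_A and T_B come out (nearly) equal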
If there are other macroscopic variables we are using to define our
macrostates, like volume V or particle number N, then there will be
other quantities that must be equal in equilibrium, assuming our two
systems compete for volume or trade particles back and forth. In these
cases, we define the quantities P and µ to be
$$\frac{P}{T} \equiv \left(\frac{\partial S}{\partial V}\right)_{U,N} \qquad \frac{\mu}{T} \equiv -\left(\frac{\partial S}{\partial N}\right)_{U,V}.$$ (9)
$$P_A = P_B \qquad \mu_A = \mu_B.$$ (10)
(You might object that pressure has another definition, namely force di-
vided by area. It would be incumbent on us to check that this definition
matches that definition in the relevant situation where both definitions
have meaning. Thankfully it does.)
1.3 The Partition Function
Figure 6: A large environment E and system S have a fixed total energy
Etot . E is called a “heat bath” because it is very big. The combined
system has a temperature T .
$$E_{\rm tot} = E + E_E$$ (11)
Here is the important part. Say that our heat bath has a lot of energy:
$E_{\rm tot} \gg E$. As far as the heat bath is concerned, $E$ is a very small
amount of energy. Therefore,
$$\Omega_E(E_{\rm tot} - E) = \exp\left(\frac{1}{k} S_E(E_{\rm tot} - E)\right) \approx \exp\left(\frac{1}{k} S_E(E_{\rm tot}) - \frac{E}{kT}\right)$$
by Taylor expanding SE in E and using the definition of temperature.
We now have
$$\mathrm{Prob}(E) \propto \Omega_S(E) \exp\left(-\frac{E}{kT}\right).$$
ΩS (E) is sometimes called the “degeneracy” of E. In any case, we can
easily see what the ratio of Prob(E1 ) and Prob(E2 ) must be.
$$\frac{\mathrm{Prob}(E_1)}{\mathrm{Prob}(E_2)} = \frac{\Omega_S(E_1)\, e^{-E_1/kT}}{\Omega_S(E_2)\, e^{-E_2/kT}}$$
Furthermore, we can use the fact that all probabilities must sum to 1 in
order to calculate the absolute probability. We define
$$Z(T) \equiv \sum_E \Omega_S(E)\, e^{-E/kT} = \sum_s e^{-E_s/kT}$$ (14)
where $\sum_s$ is a sum over all states of $S$. Finally, we have
$$\mathrm{Prob}(E) = \frac{\Omega_S(E)\, e^{-E/kT}}{Z(T)}$$ (15)
However, more than being a mere proportionality factor, Z(T ) takes
on a life of its own, so it is given the special name of the “partition
function.” Interestingly, Z(T ) is a function that depends on T and
not E. It is not a function that has anything to do with a particular
macrostate. Rather, it is a function that has to do with every microstate
at some temperature. Oftentimes, we also define
$$\beta \equiv \frac{1}{kT}$$
and write
$$Z(\beta) = \sum_s e^{-\beta E_s}.$$ (16)
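As a concrete example (with made-up energy levels and k = 1), here is how you would tabulate Z and the Boltzmann probabilities for a small system:

    # Partition function and Boltzmann probabilities for a handful of microstates.
    import numpy as np

    E_s = np.array([0.0, 1.0, 1.0, 2.0])   # hypothetical energies of four microstates
    T = 0.5
    beta = 1.0 / T                          # beta = 1/(kT) with k = 1

    Z = np.sum(np.exp(-beta * E_s))         # Eq. (16)
    probs = np.exp(-beta * E_s) / Z         # Eq. (15), one term per microstate
    print(Z, probs, probs.sum())            # the probabilities sum to 1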
The partition function Z(β) has many amazing properties. For one,
it can be used to write an endless number of clever identities. Here is
one. Say you want to compute the expected energy $\langle E \rangle$ your system
has at temperature T .
$$\langle E \rangle = \sum_s E_s\, \mathrm{Prob}(E_s)
= \frac{\sum_s E_s\, e^{-\beta E_s}}{Z(\beta)}
= -\frac{1}{Z}\frac{\partial Z}{\partial \beta}
= -\frac{\partial}{\partial \beta} \log Z$$
This expresses the expected energy $\langle E \rangle$ as a function of temperature.
(We could also calculate $\langle E^n \rangle$ for any $n$ if we wanted to.)
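Here is a quick numerical check of that identity (reusing the made-up energy levels from the sketch above, k = 1): compute $\langle E \rangle$ directly and compare it against a finite-difference derivative of log Z.

    # Check that <E> = -d(log Z)/d(beta) for a toy set of energy levels.
    import numpy as np

    E_s = np.array([0.0, 1.0, 1.0, 2.0])
    beta = 2.0

    def logZ(b):
        return np.log(np.sum(np.exp(-b * E_s)))

    probs = np.exp(-beta * E_s) / np.sum(np.exp(-beta * E_s))
    E_direct = np.sum(E_s * probs)                              # sum_s E_s Prob(E_s)

    h = 1e-6
    E_identity = -(logZ(beta + h) - logZ(beta - h)) / (2 * h)   # -d(log Z)/d(beta)
    print(E_direct, E_identity)                                 # the two agree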
Where the partition function really shines is in the “thermodynamic
limit.” Usually, people define the thermodynamic limit as
$$S_S = N S_1 \qquad E = N E_1$$
The thing to really gawk at in the above equation is that the probability
that S has some energy E is given by
$$\mathrm{Prob}(E) \propto e^{N(\ldots)}.$$
Prob(E) will change radically. Therefore, Prob(E) will be extremely
concentrated at some particular energy, and deviating slightly from that
maximum will cause Prob(E) to plummet.
Let’s just appreciate this for a second. Our original definition of
S(U ) was
S(U ) = k log(Ω(U ))
and our original definition of temperature was
$$\frac{1}{T} = \frac{\partial S}{\partial U}.$$
In other words, T is a function of U . However, we totally reversed logic
when we coupled our system to a larger environment. We no longer
knew what the exact energy of our system was. I am now telling you
that instead of calculating T as a function of U , when N is large we are
actually able to calculate U as a function of T ! Therefore, instead of
having to calculate Ω(U ), we can just calculate Z(T ) instead.
I should stress, however, that Z(T ) is still a perfectly worthwhile
thing to calculate even when your system S isn’t “big.” It will still give
you the exact average energy $\langle E \rangle$ when your system is in equilibrium
with a bigger environment at some temperature. What’s special about
the thermodynamic limit is that you no longer have to imagine the heat
bath is there in order to interpret your results, because any “average
quantity” will basically just be an actual, sharply defined, “quantity.” In
short,
$$Z(\beta) = \Omega(U)\, e^{-\beta U} \qquad \text{(thermodynamic limit)}$$ (19)
It’s worth mentioning that the other contributions to Z(β) will also be
absolutely huge; they just won’t be as stupendously huge as the term due
to U.
Okay, enough adulation for the partition function. Let’s do some-
thing with it again. Using the above equation there is a very easy way
to figure out what SS (U ) is in terms of Z(β).
$$S_S(U) = k \log \Omega_S(U)
= k \log\left(Z e^{\beta U}\right)
= k \log Z + k\beta U
= k\left(1 - \beta \frac{\partial}{\partial \beta}\right) \log Z \qquad \text{(thermodynamic limit)}$$
(Gah. Another amazing identity, all thanks to the partition function.)
This game that we played, coupling our system S to a heat bath so
we could calculate U as a function of T instead of T as a function of
U , can be replicated with other quantities like the chemical potential µ
(defined in Eq. 10). We could now imagine that S is trading particles
with a larger environment. Our partition function would then be a
function of µ in addition to T .
Z = Z(µ, T )
In the thermodynamic limit, we could once again use our old tricks to
find N in terms of µ and T .
1.4 Free energy
$$F \equiv U - TS$$ (20)
(This is also called the “Helmholtz Free Energy.”) F is defined for any
system with some well defined internal energy U and entropy S when
present in a larger environment which has temperature T . Crucially,
the system does not need to be in thermal equilibrium with the environ-
ment. In other words, free energy is a quantity associated with some
system which may or may not be in equilibrium with an environment at
temperature T .
can let the second law of thermodynamics do all the hard work,
transferring energy into our system at no cost to us! I should warn
you that ∆F is actually not equal to the change in internal energy ∆U
that occurs during this equilibration. This is apparent just from its
definition. (Although it does turn out that F is equal to the “useful
work” you can extract from such a system.)
The reason I’m telling you about F is because it is a useful quantity
for determining what will happen to a system at temperature T.
Namely, in the thermodynamic limit, the system will minimize F by
equilibrating with the environment.
Recall Eq. 19 (reproduced below).
$$Z(\beta) = \exp\left(\frac{1}{k} S - \beta U\right) = \exp(-\beta F) \qquad \text{(at equilibrium in thermodynamic limit)}$$
First off, we just derived another amazing identity of the partition func-
tion. More importantly, recall that U , as written in Eq. 19, is defined
to be the energy that maximizes Ω(U )e−βU , A.K.A. the energy that
maximizes the entropy of the world. Because we know that the entropy
of the world always wants to be maximized, we can clearly see that F
wants to be minimized, as claimed.
Therefore, F is a very useful quantity! It always wants to be min-
imized at equilibrium. It can therefore be used to detect interesting
phenomena, such as phase transitions.
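Here is a numerical sketch of that claim (my own construction, not from the text; it uses the Gibbs/Shannon form of the entropy, $S = -k\sum_s p_s \log p_s$, which shows up again in the Shannon entropy section, with k = 1): among all probability distributions over a fixed set of energy levels, the Boltzmann distribution gives the smallest $F = \langle E \rangle - TS$, and that minimum value is $-T \log Z$.

    # The Boltzmann distribution minimizes F = <E> - T S over trial distributions.
    import numpy as np

    rng = np.random.default_rng(0)
    E_s = np.array([0.0, 1.0, 2.0, 3.0])    # made-up energy levels
    T = 1.0

    def free_energy(p):
        S = -np.sum(p * np.log(p))          # Gibbs/Shannon entropy, k = 1
        return np.sum(p * E_s) - T * S

    p_boltz = np.exp(-E_s / T) / np.sum(np.exp(-E_s / T))
    F_boltz = free_energy(p_boltz)
    print(F_boltz, -T * np.log(np.sum(np.exp(-E_s / T))))   # F_boltz equals -T log Z

    for _ in range(5):
        p = rng.random(4)
        p /= p.sum()                        # a random trial distribution
        print(free_energy(p) >= F_boltz)    # always True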
Figure 9: A phase transition, right below the critical temperature Tc ,
at Tc , and right above Tc .
This can indeed happen, and is in fact what a physicist would call a
“first order phase transition.” We can see that there will be a discontinuity in
the first derivative of Z(T) at Tc. You might be wondering how this is
possible, given the fact that from its definition, Z is clearly an analytic
function as it is a sum of analytic functions. The thing to remember is
that we are using the thermodynamic limit, and the sum of an infinite
number of analytic functions may not be analytic.
Because there is a discontinuity in the first derivative of Z(β), there
will be a discontinuity in $E = -\frac{\partial}{\partial \beta} \log Z$. This is just the “latent heat”
you learned about in high school. In real life systems, it takes some
time for enough energy to be transferred into a system to overcome
the latent heat energy barrier. This is why it takes so long for a pot
of water to boil or a block of ice to melt. Furthermore, during these
lengthy phase transitions, the pot of water or block of ice will actually
be at a constant temperature, the “critical temperature” (100°C and 0°C
respectively). Once the phase transition is complete, the temperature
can start changing again.
1.6 Example: Box of Gas
For concreteness, I will compute the partition function for an ideal
gas. By ideal, I mean that the particles do not interact with each other.
Let N be the number of particles in the box and m be the mass of
each particle. Suppose the particles exist in a box of volume V. The
positions and momenta of the particles are $\vec{x}_i$ and $\vec{p}_i$ for $i = 1 \ldots N$. The
energy is given by the sum of kinetic energies of all particles.
$$E = \sum_{i=1}^N \frac{\vec{p}_i^{\,2}}{2m}.$$ (21)
Therefore,
$$Z(\beta) = \sum_s e^{-\beta E_s}
= \frac{1}{N!} \frac{1}{h^{3N}} \int \prod_{i=1}^N d^3x_i\, d^3p_i\, \exp\left(-\beta \sum_{i=1}^N \frac{\vec{p}_i^{\,2}}{2m}\right)
= \frac{1}{N!} \frac{V^N}{h^{3N}} \prod_{i=1}^N \int d^3p_i\, \exp\left(-\beta \frac{\vec{p}_i^{\,2}}{2m}\right)
= \frac{1}{N!} \frac{V^N}{h^{3N}} \left(\frac{2m\pi}{\beta}\right)^{3N/2}$$
If N is large, the thermodynamic limit is satisfied. Therefore,
$$U = -\frac{\partial}{\partial \beta} \log Z
= -\frac{\partial}{\partial \beta} \log\left(\frac{1}{N!} \frac{V^N}{h^{3N}} \left(\frac{2m\pi}{\beta}\right)^{3N/2}\right)
= \frac{3N}{2\beta}
= \frac{3}{2} N k T.$$
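As a sanity check (with made-up values for N, m, V, h and k = 1), we can differentiate the closed form for log Z numerically and compare against (3/2)NkT:

    # Numerically verify U = -d(log Z)/d(beta) = (3/2) N k T for the ideal gas.
    import numpy as np
    from math import lgamma                 # lgamma(N + 1) = log(N!)

    N, m, V, h = 50, 1.0, 1.0, 1.0          # hypothetical values, k = 1

    def logZ(beta):
        # log[ (1/N!) (V^N / h^(3N)) (2 m pi / beta)^(3N/2) ]
        return (-lgamma(N + 1) + N * np.log(V) - 3 * N * np.log(h)
                + 1.5 * N * np.log(2.0 * m * np.pi / beta))

    T = 2.0
    beta, eps = 1.0 / T, 1e-6
    U = -(logZ(beta + eps) - logZ(beta - eps)) / (2 * eps)
    print(U, 1.5 * N * T)                   # both come out to about 150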
You could add interactions between the particles by adding some potential
energy V between each pair of particles (unrelated to the volume V).
$$E = \sum_{i=1}^N \frac{\vec{p}_i^{\,2}}{2m} + \frac{1}{2}\sum_{i,j} V(|\vec{x}_i - \vec{x}_j|)$$ (22)
Figure 11: An example for an interaction potential V between particles
as a function of distance r.
It turns out that entropy is maximized when all the probabilities $p_s$ are
equal to each other. Say there are Ω states and each $p_s = \Omega^{-1}$. Then
$$S = \log \Omega$$ (24)
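Here is a quick numerical illustration (the distributions are my own examples; the entropy used is the standard Shannon form $S = -\sum_s p_s \log p_s$ with k = 1): the uniform distribution has the largest entropy, equal to log Ω, and a perfectly certain distribution has zero entropy.

    # Shannon entropy of a few distributions over Omega = 4 states.
    import numpy as np

    def shannon_entropy(p):
        p = p[p > 0]                        # use the convention 0 log 0 = 0
        return -np.sum(p * np.log(p))

    Omega = 4
    uniform = np.full(Omega, 1.0 / Omega)
    skewed = np.array([0.7, 0.1, 0.1, 0.1])
    peaked = np.array([1.0, 0.0, 0.0, 0.0])

    print(shannon_entropy(uniform), np.log(Omega))   # equal: log 4, as in Eq. (24)
    print(shannon_entropy(skewed))                   # smaller
    print(shannon_entropy(peaked))                   # 0: no uncertainty at all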
One tiny technicality when dealing with the Shannon entropy is in-
terpreting the value of
0 log 0.
It is a bit troublesome because log 0 = −∞. However, it turns out that
the correct value to assign the above quantity is
$$0 \log 0 \equiv 0.$$
This is because
$$\lim_{x \to 0} x \log x = 0.$$
2. Uncertainty due to the fact that you may not know the exact
quantum state your system is in anyway. (This is sometimes called
“classical uncertainty.”)
ρ : H → H. (25)
which one it is in. This would be an example of a “classical superposi-
tion” of quantum states. Usually, we think of classical superpositions as
having a thermodynamical nature, but that doesn’t have to be the case.
Anyway, say that your lab mate thinks there’s a 50% chance the
system could be in either state. The density matrix corresponding to
this classical superposition would be
$$\rho = \frac{1}{2} |\psi_1\rangle\langle\psi_1| + \frac{1}{2} |\psi_2\rangle\langle\psi_2|.$$
More generally, if you have a set of N quantum states |ψi i each with a
classical probability pi , then the corresponding density matrix would be
$$\rho = \sum_{i=1}^N p_i\, |\psi_i\rangle\langle\psi_i|.$$ (26)
We can therefore see that for our state $|\psi\rangle$,
$$\langle \hat{O} \rangle = \mathrm{Tr}\left(|\psi\rangle\langle\psi|\, \hat{O}\right).$$ (30)
A density matrix of the form
$$\rho = |\psi\rangle\langle\psi|$$
for some $|\psi\rangle$ is said to represent a “pure state,” because you know with
100% certainty which quantum state your system is in. Note that for a
pure state,
$$\rho^2 = \rho \qquad \text{(for pure state)}.$$
It turns out that the above condition is a necessary and sufficient con-
dition for determining if a density matrix represents a pure state.
If a density matrix is instead a combination of different states in a
classical superposition, it is said to represent a “mixed state.” This is
sort of bad terminology, because a mixed state is not a “state” in the
Hilbert space H, but whatever.
$$H = \mathbb{C}^2$$
The pure state density matrix is different from the mixed state because
of the non-zero off diagonal terms. These are sometimes called “inter-
ference terms.” The reason is that states in a quantum superposition
can “interfere” with each other, while states in a classical superposition
can’t.
Let’s now look at the expectation value of the following operators
for both density matrices.
$$\sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$$
They are given by
$$\langle \sigma_z \rangle_{\rm Mixed} = \mathrm{Tr}\left[\begin{pmatrix} \tfrac{1}{2} & 0 \\ 0 & \tfrac{1}{2} \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\right] = 0$$
$$\langle \sigma_z \rangle_{\rm Pure} = \mathrm{Tr}\left[\begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix}\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}\right] = 0$$
$$\langle \sigma_x \rangle_{\rm Mixed} = \mathrm{Tr}\left[\begin{pmatrix} \tfrac{1}{2} & 0 \\ 0 & \tfrac{1}{2} \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\right] = 0$$
$$\langle \sigma_x \rangle_{\rm Pure} = \mathrm{Tr}\left[\begin{pmatrix} \tfrac{1}{2} & \tfrac{1}{2} \\ \tfrac{1}{2} & \tfrac{1}{2} \end{pmatrix}\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}\right] = 1$$
So we can see that a measurement given by $\sigma_z$ cannot distinguish between
$\rho_{\rm Mixed}$ and $\rho_{\rm Pure}$, while a measurement given by $\sigma_x$ can distinguish
between them! There really is a difference between classical superpositions
and quantum superpositions, but you can only see this difference
if you exploit the off-diagonal terms!
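Here is the same computation done numerically (a sketch; the pure state assumed here is $|\psi\rangle = (|0\rangle + |1\rangle)/\sqrt{2}$, which reproduces the $\rho_{\rm Pure}$ matrix written above):

    # sigma_z cannot tell the mixed and pure states apart, but sigma_x can.
    import numpy as np

    rho_mixed = np.array([[0.5, 0.0], [0.0, 0.5]])   # 50/50 classical mixture of |0>, |1>
    psi = np.array([1.0, 1.0]) / np.sqrt(2)          # quantum superposition (|0> + |1>)/sqrt(2)
    rho_pure = np.outer(psi, psi.conj())

    sigma_z = np.array([[1, 0], [0, -1]])
    sigma_x = np.array([[0, 1], [1, 0]])

    for name, rho in [("mixed", rho_mixed), ("pure", rho_pure)]:
        print(name,
              np.trace(rho @ sigma_z).real,          # 0 for both
              np.trace(rho @ sigma_x).real)          # 0 for mixed, 1 for pure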
$$|\psi\rangle_A |\psi\rangle_B$$
for some $|\psi\rangle_A \in H_A$ and $|\psi\rangle_B \in H_B$.
So, for example, if $H_A = \mathbb{C}^2$ and $H_B = \mathbb{C}^2$, then the state
$$|0\rangle \left(\frac{1}{\sqrt{2}} |0\rangle - \frac{i}{\sqrt{2}} |1\rangle\right)$$
would not be entangled, while the state
$$\frac{1}{\sqrt{2}} \left(|0\rangle |0\rangle + |1\rangle |1\rangle\right)$$
would be entangled.
Let’s say a state starts out unentangled. How would it then become
entangled over time? Well, say the two systems A and B have Hamilto-
nians ĤA and ĤB . If we want the systems to interact weakly, i.e. “trade
energy,” we’ll also need to add an interaction term to the Hamiltonian.
Figure 12: Air molecules bumping up against a quantum system S will
entangle with it.
Notice that the experimentalist will not have access to the observ-
ables in the environment. Associated with HS is a set of observables
ÔS . If you tensor these observables together with the identity,
ÔS ⊗ 1E
you now have an observable which only measures quantities in the HS
subsector of the full Hilbert space. The thing is that entanglement
within the environment gets in the way of measuring ÔS ⊗ 1E in the
way the experimenter would like.
Say, for example, $H_S = \mathbb{C}^2$ and $H_E = \mathbb{C}^N$ for some very big N. Any
state in $H_S \otimes H_E$ will be of the form
$$c_0 |0\rangle |\psi_0\rangle + c_1 |1\rangle |\psi_1\rangle$$ (32)
for some $c_0, c_1 \in \mathbb{C}$ and $|\psi_0\rangle, |\psi_1\rangle \in H_E$. The expectation value for our
observable is
$$\langle \hat{O}_S \otimes 1_E \rangle = \left(c_0^* \langle 0| \langle \psi_0| + c_1^* \langle 1| \langle \psi_1|\right) \left(\hat{O}_S \otimes 1_E\right) \left(c_0 |0\rangle |\psi_0\rangle + c_1 |1\rangle |\psi_1\rangle\right)$$
$$= |c_0|^2 \langle 0| \hat{O}_S |0\rangle + |c_1|^2 \langle 1| \hat{O}_S |1\rangle + 2\,\mathrm{Re}\left(c_0^* c_1 \langle 0| \hat{O}_S |1\rangle \langle \psi_0 | \psi_1 \rangle\right)$$
The thing is that, if the environment E is very big, then any two random
given vectors |ψ0 i , |ψ1 i ∈ HE will generically have almost no overlap.
$$\langle \psi_0 | \psi_1 \rangle \approx e^{-N}$$
(This is just a fact about random vectors in high dimensional vector
spaces.) Therefore, the expectation value of this observable will be
$$\langle \hat{O}_S \otimes 1_E \rangle \approx |c_0|^2 \langle 0| \hat{O}_S |0\rangle + |c_1|^2 \langle 1| \hat{O}_S |1\rangle.$$
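Here is a small numerical check of that statement (my own sketch; for the Gaussian-random vectors used here the typical overlap shrinks like $1/\sqrt{N}$, and the environment states that show up in decoherence are typically even closer to orthogonal, as the $e^{-N}$ estimate above suggests):

    # Random unit vectors in a high-dimensional space are nearly orthogonal.
    import numpy as np

    rng = np.random.default_rng(1)

    def random_state(N):
        v = rng.normal(size=N) + 1j * rng.normal(size=N)
        return v / np.linalg.norm(v)

    for N in (10, 1000, 100000):
        psi0, psi1 = random_state(N), random_state(N)
        print(N, abs(np.vdot(psi0, psi1)))   # overlap shrinks as N grows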
Because there is no cross term between |0i and |1i, we can see that
when we measure our observable, our system S seems to be in a classical
superposition, A.K.A a mixed state!
This can be formalized by what is called a “partial trace.” Say that
$|\phi_i\rangle_E$ comprises an orthonormal basis of $H_E$. Say we have some density
matrix ρ representing a state in the full Hilbert space. We can “trace
over the E degrees of freedom” to receive a density matrix in the S
Hilbert space.
$$\rho_S \equiv \mathrm{Tr}_E(\rho) \equiv \sum_i {}_E\langle \phi_i | \rho | \phi_i \rangle_E.$$ (33)
You might be wondering why anyone would want to take this partial trace.
Well, I would say that if you can’t perform measurements on the E degrees
of freedom, why are you describing them? It turns out that the partially
traced density matrix gives us the expectation values for any observables in
S. Once we compute $\rho_S$, by tracing over E, we can then calculate the
expectation value of any observable $\hat{O}_S$ by just calculating the trace over
S of $\rho_S \hat{O}_S$:
$$\mathrm{Tr}\left(\rho\, \hat{O}_S \otimes 1_E\right) = \mathrm{Tr}_S(\rho_S \hat{O}_S).$$
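Here is a minimal numpy sketch of Eq. (33) and the trace identity above (the helper names and dimensions are my own choices, not from the text):

    # Partial trace over the environment, and the identity Tr(rho O_S x 1_E) = Tr_S(rho_S O_S).
    import numpy as np

    dim_S, dim_E = 2, 50
    rng = np.random.default_rng(2)

    # A random entangled pure state, stored so that psi[a, b] multiplies |a>_S |b>_E.
    psi = rng.normal(size=(dim_S, dim_E)) + 1j * rng.normal(size=(dim_S, dim_E))
    psi /= np.linalg.norm(psi)

    rho = np.outer(psi.ravel(), psi.ravel().conj())    # full density matrix on H_S x H_E
    rho_S = psi @ psi.conj().T                         # Eq. (33): trace over the E index

    O_S = np.array([[1, 0], [0, -1]])                  # some observable on S (sigma_z, say)
    full_op = np.kron(O_S, np.eye(dim_E))              # O_S tensor 1_E
    print(np.trace(rho @ full_op).real, np.trace(rho_S @ O_S).real)   # the two traces agree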
Even though the whole world is in some particular state in HS ⊗ HE ,
when you only perform measurements on one part of it, that part might
as well only be in a mixed state for all you know! Entanglement looks
like a mixed state when you only look at one part of a Hilbert space.
Furthermore, when the environment is very large, the off diagonal “in-
terference terms” in the density matrix are usually very close to zero,
meaning the state looks very mixed.
This is the idea of “entanglement entropy.” If you have an entangled
state, then trace out over the states in one part of the Hilbert space,
you will receive a mixed density matrix. That density matrix will have
some Von Neumann entropy, and in this context we would call it “en-
tanglement entropy.” The more entanglement entropy your state has,
the more entangled it is! And, as we can see, when you can only look
at one tiny part of a state when it is heavily entangled, it appears to be
in a classical superposition instead of a quantum superposition!
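Here is a small sketch of the entanglement entropy (my own examples; the Von Neumann entropy used is $S = -\mathrm{Tr}(\rho_S \log \rho_S)$): a product state has zero entanglement entropy, while the Bell state $(|0\rangle|0\rangle + |1\rangle|1\rangle)/\sqrt{2}$ from earlier has the maximal value log 2.

    # Entanglement entropy = Von Neumann entropy of the reduced density matrix.
    import numpy as np

    def entanglement_entropy(psi_matrix):
        # psi_matrix[a, b] is the amplitude on |a>_S |b>_E
        rho_S = psi_matrix @ psi_matrix.conj().T       # trace over E
        evals = np.linalg.eigvalsh(rho_S)
        evals = evals[evals > 1e-12]                   # convention 0 log 0 = 0
        return -np.sum(evals * np.log(evals))

    product = np.array([[1.0, 0.0], [0.0, 0.0]])       # |0>|0>, unentangled
    bell = np.array([[1.0, 0.0], [0.0, 1.0]]) / np.sqrt(2)
    print(entanglement_entropy(product))               # 0
    print(entanglement_entropy(bell))                  # log 2, about 0.693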
The process by which quantum states in real life become entangled
with the surrounding environment is called “decoherence.” It is one of
the most viciously efficient processes in all of physics, and is the reason
why it took the human race so long to discover quantum mechanics. It’s
very ironic that entanglement, a quintessentially quantum phenomenon,
when taken to dramatic extremes, hides quantum mechanics from view
entirely!
I would like to point out an important difference between a clas-
sical macrostate and a quantum mixed state. In classical mechanics,
the subtle perturbing effects of the environment on the system make it
difficult to keep track of the exact microstate a system is in. However,
in principle you can always just re-measure your system very precisely
and figure out what the microstate is all over again. This isn’t the case
in quantum mechanics when your system becomes entangled with the
environment. The problem is that once your system entangles with the
environment, that entanglement is almost certainly never going to undo
itself. In fact, it’s just going to spread from the air molecules in your
laboratory to the surrounding building, then the whole university, then
the state, the country, the planet, the solar system, the galaxy, and then
the universe! And unless you “undo” all of that entanglement, the show’s
over! You’d just have to start from scratch and prepare your system in
a pure state all over again.
Obviously, this is just the same Z(T ) that we saw in classical mechanics!
They are really not different at all. However, there is something very
interesting in the above expression. The operator
$$\exp\left(-\hat{H}/kT\right)$$
looks just like the time evolution operator $\exp\left(-i\hat{H}t/\hbar\right)$ if we just replace
$$-\frac{i}{\hbar} t \longrightarrow -\beta.$$
It seems as though β is, in some sense, an “imaginary time.” Rotating the
time variable into the imaginary direction is called a “Wick Rotation,”
and is one of the most simple, mysterious, and powerful tricks in the
working physicist’s toolbelt. There’s a whole beautiful story here with
the path integral, but I won’t get into it.
Anyway, a mixed state is said to be “thermal” if it is of the form
$$\rho_{\rm Thermal} = \frac{1}{Z(T)} \sum_s e^{-E_s/kT} |E_s\rangle\langle E_s| = \frac{1}{Z(\beta)} e^{-\beta \hat{H}}$$ (35)
for some temperature T where $|E_s\rangle$ are the energy eigenstates with
eigenvalues $E_s$. If you let your system equilibrate with an environment at
some temperature T, and then trace out the environmental degrees
of freedom, you will find your system in the thermal mixed state.
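As a final sketch (with a made-up 2x2 Hamiltonian and k = 1), here is Eq. (35) in code: build the thermal density matrix from the energy eigenstates and check that it is properly normalized.

    # Thermal density matrix rho = exp(-beta H) / Z for a toy two-level Hamiltonian.
    import numpy as np

    H = np.array([[0.0, 0.3], [0.3, 1.0]])   # hypothetical Hamiltonian
    T = 0.5
    beta = 1.0 / T

    E, V = np.linalg.eigh(H)                 # eigenvalues E_s, eigenvectors |E_s>
    Z = np.sum(np.exp(-beta * E))
    rho_thermal = sum(np.exp(-beta * E[s]) / Z * np.outer(V[:, s], V[:, s].conj())
                      for s in range(len(E)))

    print(np.trace(rho_thermal))             # 1, as it should be for a density matrix
    print(np.trace(rho_thermal @ H))         # <E> at temperature T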