
A Crash Course in Statistical Mechanics

Noah Miller
December 27, 2018

Abstract
A friendly introduction to statistical mechanics, geared towards
covering the powerful methods physicists have developed for working
in the subject.

Contents
1 Statistical Mechanics
  1.1 Entropy
  1.2 Temperature and Equilibrium
  1.3 The Partition Function
  1.4 Free energy
  1.5 Phase Transitions
  1.6 Example: Box of Gas
  1.7 Shannon Entropy
  1.8 Quantum Mechanics, Density Matrices
  1.9 Example: Two state system
  1.10 Entropy of Mixed States
  1.11 Classicality from environmental entanglement
  1.12 The Quantum Partition Function

1 Statistical Mechanics
1.1 Entropy
Statistical Mechanics is a branch of physics that pervades all other
branches. Statistical mechanics is relevant to Newtonian mechanics,
relativity, quantum mechanics, and quantum field theory.

Figure 1: Statistical mechanics applies to all realms of physics.

Its exact incarnation is a little different in each quadrant, but the
basic details are identical.
The most important quantity in statistical mechanics is called “en-
tropy,” which we label by S. People sometimes say that entropy is a
measure of the “disorder” of a system, but I don’t think this is a good way
to think about it. But before we define entropy, we need to discuss two
different notions of state: “microstates” and “macrostates.”
In physics, we like to describe the real world as mathematical objects.
In classical physics, states are points in a “phase space.” Say for example
you had N particles moving around in 3 dimensions. It would take 6N
real numbers to specify the physical state of this system at a given
instant: 3 numbers for each particle’s position and 3 numbers for each
particle’s momentum. The phase space for this system would therefore
just be R^{6N}.

(x_1, y_1, z_1, p_{x1}, p_{y1}, p_{z1}, . . . , x_N, y_N, z_N, p_{xN}, p_{yN}, p_{zN}) ∈ R^{6N}

(In quantum mechanics, states are vectors in a Hilbert space H instead
of points in a phase space. We’ll return to the quantum case a bit later.)

A “microstate” is a state of the above form. It contains absolutely
all the physical information that an omniscient observer could know. If
you were to know the exact microstate of a system and knew all of the
laws of physics, you could in principle deduce what the microstate will
be at all future times and what the microstate was at all past times.
However, practically speaking, we can never know the true microstate
of a system. For example, you could never know the positions and mo-
menta of every damn particle in a box of gas. The only things we can
actually measure are macroscopic variables such as internal energy, vol-
ume, and particle number (U, V, N). A “macrostate” is just a set of
microstates. For example, the “macrostate” of a box of gas labelled by
(U, V, N ) would be the set of all microstates with energy U , volume V ,
and particle number N . The idea is that if you know what macrostate
your system is in, you know that your system is equally likely to truly
be in any of the microstates it contains.

Figure 2: You may know the macrostate, but only God knows the mi-
crostate.

I am now ready to define what entropy is. Entropy is a quantity asso-
ciated with a macrostate. If a macrostate is just a set of Ω microstates,
then the entropy S of the system is

S ≡ k log Ω. (1)

Here, k is Boltzmann’s constant. It is a physical constant with units of
energy / temperature.

k ≡ 1.38065 × 10^{-23} Joules/Kelvin    (2)

The only reason that we need k to define S is because the human race
defined units of temperature before they defined entropy. (We’ll see
how temperature factors into any of this soon.) Otherwise, we probably
would have set k = 1 and temperature would have the same units as
energy.
You might be wondering how we actually count Ω. As you probably
noticed, the phase space R6N is not discrete. In that situation, we
integrate over a phase space volume with the measure

d^3x_1 d^3p_1 . . . d^3x_N d^3p_N.

However, this isn’t completely satisfactory because position and mo-
mentum are dimensionful quantities while Ω should be a dimensionless
number. We should therefore divide by a constant with units of posi-
tion times momentum. Notice, however, that because S only depends
on log Ω, any constant rescaling of Ω will only alter S by a constant and
will therefore never affect the change in entropy ∆S of some process. So
while we have to divide by a constant, whichever constant we divide by
doesn’t affect the physics.
Anyway, even though we are free to choose whatever dimensionful
constant we want, the “best” is actually Planck’s constant h! Therefore,
for a classical macrostate that occupies a phase space volume Vol,
Ω = (1/N!) (1/h^{3N}) ∫_Vol ∏_{i=1}^{N} d^3x_i d^3p_i.    (3)

(The prefactor 1/N! is necessary if all N particles are indistinguishable.
It is the cause of some philosophical consternation but I don’t want to
get into any of that.)
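
To make the counting concrete, here is a minimal sketch in Python (my own toy example, not from the original notes): a system of N two-state spins whose macrostate only fixes how many spins point up. Ω is then a binomial coefficient, and S = k log Ω follows directly.

    from math import comb, log

    k = 1.38065e-23   # Boltzmann's constant, J/K

    def entropy(N, n_up):
        # Omega = number of microstates compatible with the macrostate "n_up spins up"
        omega = comb(N, n_up)
        return k * log(omega)     # S = k log(Omega)

    print(entropy(100, 50))   # the most "spread out" macrostate has the largest S
    print(entropy(100, 0))    # exactly one compatible microstate, so S = 0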
Let me now explain why I think saying entropy is “disorder” is not
such a good idea. Different observers might describe reality with differ-
ent macrostates. For example, say your room is very messy and disor-
ganized. This isn’t a problem for you, because you spend a lot of time
in there and know where everything is. Therefore, the macrostate you
use to describe your room contains very few microstates and has a small
entropy. However, according to your mother who has not studied your
room very carefully, the entropy of your room is very large. The point
is that while everyone might agree your room is messy, the entropy of
your room really depends on how little you know about it.

1.2 Temperature and Equilibrium
Let’s say we label our macrostates by their total internal energy
U and some other macroscopic variables like V and N . (Obviously,
these other macroscopic variables V and N can be replaced by different
quantities in different situations, but let’s just stick with this for now.)
Our entropy S depends on all of these variables.

S = S(U, V, N ) (4)

The temperature T of the (U, V, N) macrostate is then defined to be

1/T ≡ (∂S/∂U)_{V,N}.    (5)

The partial derivative above means that we just differentiate S(U, V, N)
with respect to U while keeping V and N fixed.
If your system has a high temperature and you add a bit of energy
dU to it, then the entropy S will not change much. If your system has a
small temperature and you add a bit of energy, the entropy will increase
a lot.
Next, say you have two systems A and B which are free to trade
energy back and forth.

Figure 3: Two systems A and B trading energy. UA + UB is fixed.

Say system A could be in one of Ω_A possible microstates and system
B could be in Ω_B possible microstates. Therefore, the total AB system
could be in Ω_A Ω_B possible microstates. Therefore, the entropy S_AB of
both systems combined is just the sum of entropies of both sub-systems.

S_AB = k log(Ω_A Ω_B) = k log Ω_A + k log Ω_B = S_A + S_B    (6)

The crucial realization of statistical mechanics is that, all else being
equal, a system is most likely to find itself in a macrostate corresponding
to the largest number of microstates. This is the so-called “Second law
of thermodynamics”: for all practical intents and purposes, the entropy
of a closed system always increases over time. It is not really a physical
“law” in the regular sense, it is more like a profound realization.
Therefore, the entropy SAB of our joint AB system will increase as
time goes on until it reaches its maximum possible value. In other words,
A and B trade energy in a seemingly random fashion that increases SAB
on average. When SAB is finally maximized, we say that our systems
are in “thermal equilibrium.”

Figure 4: SAB is maximized when UA has some particular value.


(It should be noted that there will actually be tiny random "thermal"
fluctuations around this maximum.)

Let’s say that the internal energy of system A is U_A and the internal
energy of system B is U_B. Crucially, note that the total energy of the
combined system

U_AB = U_A + U_B

is constant over time! This is because energy of the total system is
conserved. Therefore,

dU_A = −dU_B.
Now, the combined system will maximize its entropy when U_A and U_B
have some particular values. Knowing the value of U_A is enough though,
because U_B = U_AB − U_A. Therefore, entropy is maximized when

0 = ∂S_AB/∂U_A.    (7)

However, we can rewrite this as

0 = ∂S_AB/∂U_A
  = ∂S_A/∂U_A + ∂S_B/∂U_A
  = ∂S_A/∂U_A − ∂S_B/∂U_B
  = 1/T_A − 1/T_B.
Therefore, our two systems are in equilibrium if they have the same
temperature!
T_A = T_B    (8)
If there are other macroscopic variables we are using to define our
macrostates, like volume V or particle number N, then there will be
other quantities that must be equal in equilibrium, assuming our two sys-
tems compete for volume or trade particles back and forth. In these
cases, we define the quantities P and µ to be

P/T ≡ (∂S/∂V)_{U,N}        µ/T ≡ −(∂S/∂N)_{U,V}.    (9)

P is called “pressure” and µ is called “chemical potential.” In equilib-
rium, we would also have

P_A = P_B        µ_A = µ_B.    (10)

(You might object that pressure has another definition, namely force di-
vided by area. It would be incumbent on us to check that this definition
matches that definition in the relevant situation where both definitions
have meaning. Thankfully it does.)
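
To see equilibration in action, here is a small sketch (my own toy example, assuming the standard Einstein-solid counting Ω(N, q) = C(q + N − 1, q) for N oscillators sharing q energy quanta): the total entropy S_A + S_B is maximized at the energy split where the two slopes ∂S/∂U, i.e. the two inverse temperatures, agree.

    from math import comb, log

    NA, NB, q_tot = 300, 200, 100        # two solids sharing q_tot energy quanta

    def S(N, q):                         # entropy in units of k
        return log(comb(q + N - 1, q))   # Omega(N, q) = C(q + N - 1, q)

    S_AB = [S(NA, qA) + S(NB, q_tot - qA) for qA in range(q_tot + 1)]
    qA_star = max(range(q_tot + 1), key=lambda qA: S_AB[qA])
    print(qA_star)   # ~60, i.e. q_tot * NA / (NA + NB): energy spreads evenly per oscillator

    # the slopes dS/dU (inverse temperatures) of A and B agree at the maximum:
    print(S(NA, qA_star + 1) - S(NA, qA_star),
          S(NB, q_tot - qA_star + 1) - S(NB, q_tot - qA_star))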

1.3 The Partition Function

Figure 5: If you want to do statistical mechanics, you really should
know about the partition function.

Explicitly calculating Ω for a given macrostate is usually very hard.
Practically speaking, it can only be done for simple systems you under-
stand very well. However, physicists have developed an extremely pow-
erful way of doing statistical mechanics even for complicated systems.
It turns out that there is a function of temperature called the “partition
function” that contains all the information you’d care to know about
your macrostate when you are working in the “thermodynamic limit.”
This function is denoted Z(T ). Once you compute Z(T ) (which is usu-
ally much easier than computing Ω) it is a simple matter to extract the
relevant physics.
Before defining the partition function, I would like to talk a bit about
heat baths. Say you have some system S in a very large environment E.
Say you can measure the macroscopic variables of S, including its energy
E at any given moment. (We use E here to denote energy instead of
U when talking about the partition function.) The question I ask is: if
the total system has a temperature T , what’s the probability that S has
some particular energy E?

Figure 6: A large environment E and system S have a fixed total energy
Etot . E is called a “heat bath” because it is very big. The combined
system has a temperature T .

We should be picturing that S and E are evolving in some compli-
cated way we can’t understand. However, their total energy

E_tot = E + E_E    (11)

is conserved. We now define

Ω_S(E) ≡ num. microstates of S with energy E    (12)
Ω_E(E_E) ≡ num. microstates of E with energy E_E.

Therefore, the probability that S has some energy E is proportional
to the number of microstates where S has energy E and E has energy
E_tot − E.

Prob(E) ∝ Ω_S(E) Ω_E(E_tot − E)    (13)

Here is the important part. Say that our heat bath has a lot of energy:
E_tot ≫ E. As far as the heat bath is concerned, E is a very small
amount of energy. Therefore,

Ω_E(E_tot − E) = exp( (1/k) S_E(E_tot − E) )
              ≈ exp( (1/k) S_E(E_tot) − E/(kT) )

by Taylor expanding S_E in E and using the definition of temperature.
We now have

Prob(E) ∝ Ω_S(E) exp(−E/(kT)).

Ω_S(E) is sometimes called the “degeneracy” of E. In any case, we can
easily see what the ratio of Prob(E_1) and Prob(E_2) must be.

Prob(E_1) / Prob(E_2) = Ω_S(E_1) e^{−E_1/kT} / ( Ω_S(E_2) e^{−E_2/kT} )

Furthermore, we can use the fact that all probabilities must sum to 1 in
order to calculate the absolute probability. We define

Z(T) ≡ Σ_E Ω_S(E) e^{−E/kT}    (14)
     = Σ_s e^{−E_s/kT}

where Σ_s is a sum over all states of S. Finally, we have

Prob(E) = Ω_S(E) e^{−E/kT} / Z(T)    (15)
However, more than being a mere proportionality factor, Z(T) takes
on a life of its own, so it is given the special name of the “partition
function.” Interestingly, Z(T) is a function that depends on T and
not E. It is not a function that has anything to do with a particular
macrostate. Rather, it is a function that has to do with every microstate
at some temperature. Oftentimes, we also define

β ≡ 1/(kT)

and write

Z(β) = Σ_s e^{−βE_s}.    (16)
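
As a quick sanity check, here is a minimal sketch computing Z and the resulting Boltzmann probabilities for a made-up three-level system; the level spacings and the temperature are arbitrary choices, not anything from the text.

    import numpy as np

    k = 1.38065e-23                                      # J/K
    levels = np.array([0.0, 0.02, 0.05]) * 1.602e-19     # three made-up energies (eV -> J)
    T = 300.0                                            # kelvin
    beta = 1.0 / (k * T)

    boltzmann = np.exp(-beta * levels)
    Z = boltzmann.sum()            # Z(beta) = sum_s exp(-beta E_s)
    probs = boltzmann / Z          # Prob(E_s) = exp(-beta E_s) / Z
    print(Z, probs, probs.sum())   # the probabilities sum to 1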
The partition function Z(β) has many amazing properties. For one,
it can be used to write an endless number of clever identities. Here is
one. Say you want to compute the expected energy ⟨E⟩ your system
has at temperature T .
⟨E⟩ = Σ_s E_s Prob(E_s)
    = ( Σ_s E_s e^{−βE_s} ) / Z(β)
    = −(1/Z) ∂Z/∂β
    = −∂/∂β log Z

This expresses the expected energy ⟨E⟩ as a function of temperature.
(We could also calculate ⟨E^n⟩ for any n if we wanted to.)
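
Here is a quick numerical check of that identity on a toy spectrum (arbitrary energy levels, units where k = 1): a finite-difference derivative of log Z reproduces the directly computed average energy.

    import numpy as np

    levels = np.array([0.0, 0.5, 1.3, 2.0])    # arbitrary toy spectrum

    def logZ(beta):
        return np.log(np.sum(np.exp(-beta * levels)))

    beta, h = 2.0, 1e-6
    E_from_Z = -(logZ(beta + h) - logZ(beta - h)) / (2 * h)   # -d(log Z)/d(beta)

    p = np.exp(-beta * levels) / np.sum(np.exp(-beta * levels))
    E_direct = np.sum(p * levels)                             # sum_s E_s Prob(E_s)
    print(E_from_Z, E_direct)                                 # the two agree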
Where the partition function really shines is in the “thermodynamic
limit.” Usually, people define the thermodynamic limit as

N → ∞ (thermodynamic limit) (17)

where N is the number of particles. However, sometimes you might
be interested in more abstract systems like a spin chain (the so-called
“Ising model”) or something else. There are no “particles” in such a
system, however there is still something you would justifiably call the
thermodynamic limit. This would be when the number of sites in your
spin chain becomes very large. So N should really just be thought of
as the number of variables you need to specify a microstate. When
someone is “working in the thermodynamic limit,” it just means that
they are considering very “big” systems.
Of course, in real life N is never infinite. However, I think we can
all agree that 1023 is close enough to infinity for all practical purposes.
Whenever an equation is true “in the thermodynamic limit,” you can
imagine that there are extra terms of order 1/N unwritten in your equation
and laugh at them.
What is special about the thermodynamic limit is that Ω_S becomes,
like, really big...

Ω_S = (something)^N

Furthermore, the entropy and energy will scale with N:

S_S = N S_1        E = N E_1

In the above equation, S_1 and E_1 can be thought of as the average
amounts of entropy and energy per particle.
Therefore, we can rewrite

Prob(E) ∝ Ω_S(E) exp(−E/kT)
        = exp( (1/k) S_S − E/(kT) )
        = exp( N [ (1/k) S_1 − E_1/(kT) ] ).

The thing to really gawk at in the above equation is that the probability
that S has some energy E is given by

Prob(E) ∝ e^{N(...)}.

I want you to appreciate how insanely big e^{N(...)} is in the thermody-
namic limit. Furthermore, if there is even a minuscule change in (. . .),

Prob(E) will change radically. Therefore, Prob(E) will be extremely
concentrated at some particular energy, and deviating slightly from that
maximum will cause Prob(E) to plummet.

Figure 7: In the thermodynamic limit, the system S will have a well
defined energy.

We can therefore see that if the energy U maximizes Prob(E), we
will essentially have

Prob(E) ≈ 1 if E = U,   0 if E ≠ U.

Let’s now think back to our previously derived equation

⟨E⟩ = −∂/∂β log Z(β).
Recall that ⟨E⟩ is the expected energy of S when it is coupled to a heat
bath at some temperature. The beauty is that in the thermodynamic
limit where our system S becomes very large, we don’t even have to
think about the heat bath anymore! Our system S is basically just in
the macrostate where all microstates with energy U are equally likely.
Therefore,
⟨E⟩ = U    (thermodynamic limit)
and

U = −∂/∂β log Z(β)    (18)
is an exact equation in the thermodynamic limit.

Let’s just appreciate this for a second. Our original definition of
S(U ) was
S(U ) = k log(Ω(U ))
and our original definition of temperature was

1/T = ∂S/∂U.
In other words, T is a function of U . However, we totally reversed logic
when we coupled our system to a larger environment. We no longer
knew what the exact energy of our system was. I am now telling you
that instead of calculating T as a function of U , when N is large we are
actually able to calculate U as a function of T ! Therefore, instead of
having to calculate Ω(U ), we can just calculate Z(T ) instead.
I should stress, however, that Z(T ) is still a perfectly worthwhile
thing to calculate even when your system S isn’t “big.” It will still give
you the exact average energy ⟨E⟩ when your system is in equilibrium
with a bigger environment at some temperature. What’s special about
the thermodynamic limit is that you no longer have to imagine the heat
bath is there in order to interpret your results, because any “average
quantity” will basically just be an actual, sharply defined, “quantity.” In
short,
Z(β) = Ω(U) e^{−βU}    (thermodynamic limit)    (19)

It’s worth mentioning that the other contributions to Z(β) will also be
absolutely huge; they just won’t be as stupendously huge as the term due
to U .
Okay, enough adulation for the partition function. Let’s do some-
thing with it again. Using the above equation there is a very easy way
to figure out what SS (U ) is in terms of Z(β).

S_S(U) = k log Ω_S(U)
       = k log( Z e^{βU} )    (thermodynamic limit)
       = k log Z + kβU
       = k (1 − β ∂/∂β) log Z
(Gah. Another amazing identity, all thanks to the partition function.)
This game that we played, coupling our system S to a heat bath so
we could calculate U as a function of T instead of T as a function of
U , can be replicated with other quantities like the chemical potential µ
(defined in Eq. 9). We could now imagine that S is trading particles

with a larger environment. Our partition function would then be a
function of µ in addition to T .

Z = Z(µ, T )

In the thermodynamic limit, we could once again use our old tricks to
find N in terms of µ and T .

1.4 Free energy


Now that we’re on an unstoppable victory march of introductory
statistical mechanics, I think I should define a quantity closely related
to the partition function: the “free energy” F .

F ≡ U − TS (20)

(This is also called the “Helmholtz Free Energy.”) F is defined for any
system with some well defined internal energy U and entropy S when
present in a larger environment which has temperature T . Crucially,
the system does not need to be in thermal equilibrium with the environ-
ment. In other words, free energy is a quantity associated with some
system which may or may not be in equilibrium with an environment at
temperature T .

Figure 8: A system with internal energy U and entropy S in a heat
bath at temperature T has free energy F = U − T S.

Okay. So why did we define this quantity F? The hint is in the
name “free energy.” Over time, the system will equilibrate with the
environment in order to maximize the entropy of the whole world. While
doing so, the energy U of the system will change. So if we cleverly leave
our system in a larger environment, under the right circumstances we

can let the second law of thermodynamics do all the hard work,
transferring energy into our system at no cost to us! I should warn
you that ∆F is actually not equal to the change in internal energy ∆U
that occurs during this equilibration. This is apparent just from its
definition. (Although it does turn out that F is equal to the “useful
work” you can extract from such a system.)
The reason I’m telling you about F is because it is a useful quan-
tity for determining what will happen to a system at temperature T .
Namely, in the thermodynamic limit, the system will minimize F by
equilibrating with the environment.
Recall Eq. 19 (reproduced below).

Z(β) = Ω(U) e^{−βU}    (thermodynamic limit)

If our system S is in equilibrium with the heat bath, then

Z(β) = exp( (1/k) S − βU )    (at equilibrium in thermodynamic limit)
      = exp(−βF).

First off, we just derived another amazing identity of the partition func-
tion. More importantly, recall that U , as written in Eq. 19, is defined
to be the energy that maximizes Ω(U )e−βU , A.K.A. the energy that
maximizes the entropy of the world. Because we know that the entropy
of the world always wants to be maximized, we can clearly see that F
wants to be minimized, as claimed.
Therefore, F is a very useful quantity! It always wants to be min-
imized at equilibrium. It can therefore be used to detect interesting
phenomena, such as phase transitions.
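
A small numerical sketch of this (same toy spectrum as before, units where k = 1 so T = 1/β): computing F as U − TS from the identities above and comparing it with −(1/β) log Z gives the same number.

    import numpy as np

    levels = np.array([0.0, 0.5, 1.3, 2.0])
    beta, h = 2.0, 1e-6

    def logZ(b):
        return np.log(np.sum(np.exp(-b * levels)))

    U = -(logZ(beta + h) - logZ(beta - h)) / (2 * h)   # U = -d(log Z)/d(beta)
    S = logZ(beta) + beta * U                          # S = (1 - beta d/d(beta)) log Z
    print(U - S / beta)                                # F = U - T S
    print(-logZ(beta) / beta)                          # F = -(1/beta) log Z, same number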

1.5 Phase Transitions


Let’s back up a bit and think about a picture we drew, Fig. 7. It’s
a very suggestive picture that begs a very interesting question. What
if, at some critical temperature Tc , a new peak grows and overtakes our
first peak?

Figure 9: A phase transition, right below the critical temperature Tc ,
at Tc , and right above Tc .

This can indeed happen, and is in fact what a physicist would call a
“first order phase transition.” We can see that there will be a discontinuity in
the first derivative of Z(T ) at Tc . You might be wondering how this is
possible, given the fact that from its definition, Z is clearly an analytic
function as it is a sum of analytic functions. The thing to remember is
that we are using the thermodynamic limit, and the sum of an infinite
number of analytic functions may not be analytic.
Because there is a discontinuity in the first derivative of Z(β), there
will be a discontinuity in E = −∂/∂β log Z. This is just the “latent heat”
you learned about in high school. In real life systems, it takes some
time for enough energy to be transferred into a system to overcome
the latent heat energy barrier. This is why it takes so long for a pot
of water to boil or a block of ice to melt. Furthermore, during these
lengthy phase transitions, the pot of water or block of ice will actually
be at a constant temperature, the “critical temperature” (100◦ C and 0◦ C
respectively). Once the phase transition is complete, the temperature
can start changing again.

Figure 10: A discontinuity in the first derivative of Z corresponds
to a first order phase transition. This means that you must put a fi-
nite amount of energy into the system called “latent heat” at the phase
transition before the temperature of the system will rise again.

1.6 Example: Box of Gas
For concreteness, I will compute the partition function for an ideal
gas. By ideal, I mean that the particles do not interact with each other.
Let N be the number of particles in the box and m be the mass of
each particle. Suppose the particles exist in a box of volume V . The
positions and momenta of the particles are x_i and p_i for i = 1 . . . N. The
energy is given by the sum of kinetic energies of all the particles.

E = Σ_{i=1}^{N} p_i^2 / (2m)    (21)

Therefore,

Z(β) = Σ_s e^{−βE_s}
     = (1/N!) (1/h^{3N}) ∫ ∏_{i=1}^{N} d^3x_i d^3p_i exp( −β Σ_{i=1}^{N} p_i^2/(2m) )
     = (1/N!) (V^N/h^{3N}) ∏_{i=1}^{N} ∫ d^3p_i exp( −β p_i^2/(2m) )
     = (1/N!) (V^N/h^{3N}) (2mπ/β)^{3N/2}
If N is large, the thermodynamic limit is satisfied. Therefore,

U = −∂/∂β log Z
  = −∂/∂β [ log( V^N / (N! h^{3N}) ) + (3N/2) log(2mπ/β) ]
  = 3N/(2β)
  = (3/2) N kT.
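
Here is a rough Monte Carlo sanity check of this result (a sketch in units where k = m = 1): in equilibrium each momentum component is Gaussian, since Prob ∝ exp(−p²/2mkT), so sampled kinetic energies should average to (3/2)NkT.

    import numpy as np

    rng = np.random.default_rng(0)
    N, T, samples = 1000, 2.5, 200
    energies = []
    for _ in range(samples):
        p = rng.normal(0.0, np.sqrt(T), size=(N, 3))   # each component ~ exp(-p^2 / 2mkT)
        energies.append(np.sum(p**2) / 2.0)            # E = sum_i p_i^2 / 2m
    print(np.mean(energies), 1.5 * N * T)              # both ~ 3750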
You could add interactions between the particles by adding some po-
tential energy V between each pair of particles (unrelated to the volume
V).

E = Σ_{i=1}^{N} p_i^2/(2m) + (1/2) Σ_{i,j} V(|x_i − x_j|)    (22)

The form of V (r) might look something like this.

Figure 11: An example for an interaction potential V between particles
as a function of distance r.

The calculation of Z(β) then becomes more difficult, although you
could approximate it pretty well using something called the “cluster
decomposition.” This partition function would then exhibit a phase
transition at a critical temperature between a gas phase and a liquid
phase. It is an interesting exercise to try to pin down for yourself where
all the new states are coming from at the critical temperature which
make Z(β) discontinuous. (Hint: condensation.)
Obviously, the attractions real life particles experience cannot be
written in terms of such a simple central potential V (r). It’s just a sim-
plified model. For example, there should be some angular dependence
to the potential energy as well which is responsible for the chemical
structures we see in nature. If we wanted to model the liquid-to-solid
transition, we’d have to take that into account.

1.7 Shannon Entropy


So far, we have been imagining that all microstates in a macrostate
are equally likely to be the “true” microstate. However, what if you as-
sign a different probability ps to each microstate s? What is the entropy
then?
There is a more general notion of entropy in computer science called
“Shannon entropy.” It is given by
S = −Σ_s p_s log p_s.    (23)

It turns out that entropy is maximized when all the probabilities p_s are
equal to each other. Say there are Ω states and each p_s = Ω^{−1}. Then

S = log Ω (24)

matching the physicist’s definition (up to the Boltzmann constant).

One tiny technicality when dealing with the Shannon entropy is in-
terpreting the value of
0 log 0.
It is a bit troublesome because log 0 = −∞. However, it turns out that
the correct value to assign the above quantity is

0 log 0 ≡ 0.

This isn’t too crazy though, because

lim_{x→0} x log x = 0.
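
A minimal sketch of the Shannon entropy, with the 0 log 0 = 0 convention handled explicitly; the probability vectors are arbitrary examples.

    import numpy as np

    def shannon(p):
        p = np.asarray(p, dtype=float)
        p = p[p > 0]                       # drop zeros: the 0 log 0 = 0 convention
        return -np.sum(p * np.log(p))

    print(shannon([0.25, 0.25, 0.25, 0.25]))   # log 4 ~ 1.386, the maximum for 4 states
    print(shannon([0.7, 0.1, 0.1, 0.1]))       # smaller: the distribution is less spread out
    print(shannon([1.0, 0.0, 0.0, 0.0]))       # 0: no uncertainty at all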

1.8 Quantum Mechanics, Density Matrices


So far I have only told you about statistical mechanics in the context
of classical mechanics. Now it’s time to talk about quantum mechanics.
There is something very interesting about quantum mechanics: states
can be in superpositions. Because of this, even if you know the exact
quantum state your system is in, you can still only predict the proba-
bilities that any observable (such as energy) will have a particular value
when measured. Therefore, there are two notions of uncertainty in quan-
tum statistical mechanics:

1. Fundamental quantum uncertainty

2. Uncertainty due to the fact that you may not know the exact
quantum state your system is in anyway. (This is sometimes called
“classical uncertainty.”)

It would be nice if we could capture these two different notions of un-
certainty in one unified mathematical object. This object is called the
“density matrix.”
Say the quantum states for your system live in a Hilbert space H.
A density matrix ρ is an operator

ρ : H → H. (25)

Each density matrix is meant to represent a so-called “classical super-
position” of quantum states.
For example, say that you are a physics PhD student working in a lab
and studying some quantum system. Say your lab mate has prepared
the system in one of two states |ψ1 i or |ψ2 i, but unprofessionally forgot

which one it is in. This would be an example of a “classical superposi-
tion” of quantum states. Usually, we think of classical superpositions as
having a thermodynamical nature, but that doesn’t have to be the case.
Anyway, say that your lab mate thinks there’s a 50% chance the
system could be in either state. The density matrix corresponding to
this classical superposition would be
ρ = (1/2) |ψ_1⟩⟨ψ_1| + (1/2) |ψ_2⟩⟨ψ_2|.
More generally, if you have a set of N quantum states |ψ_i⟩ each with a
classical probability p_i, then the corresponding density matrix would be

ρ = Σ_{i=1}^{N} p_i |ψ_i⟩⟨ψ_i|.    (26)

This is useful to define because it allows us to extract expectation values
of observables Ô in a classical superposition. But before I prove that,
I’ll have to explain a very important operation: “tracing.”
Say you have a quantum state |ψ⟩ and you want to calculate the ex-
pectation value of Ô. This is just equal to

⟨Ô⟩ = ⟨ψ| Ô |ψ⟩.    (27)

Now, say we have an orthonormal basis |φ_s⟩ ∈ H. We then have

1 = Σ_s |φ_s⟩⟨φ_s|.    (28)

Therefore, inserting the identity, we have

⟨Ô⟩ = ⟨ψ| Ô |ψ⟩
    = Σ_s ⟨ψ| Ô |φ_s⟩⟨φ_s|ψ⟩
    = Σ_s ⟨φ_s|ψ⟩⟨ψ| Ô |φ_s⟩.

This motivates us to define something called the “trace operation” for
any operator H → H. While we are using an orthonormal basis of H
to define it, it is actually independent of which basis you choose.

Tr(. . .) ≡ Σ_s ⟨φ_s| . . . |φ_s⟩    (29)

We can therefore see that for our state |ψ⟩,

⟨Ô⟩ = Tr( |ψ⟩⟨ψ| Ô ).    (30)

Returning to our classical superposition and density matrix ρ, we are
now ready to see how to compute the expectation values.

⟨Ô⟩ = Σ_i p_i ⟨ψ_i| Ô |ψ_i⟩
    = Σ_i p_i Tr( |ψ_i⟩⟨ψ_i| Ô )
    = Tr( ρÔ )

So I have now proved my claim that we can use density matrices to
extract expectation values of observables.
Now that I have told you about these density matrices, I should
introduce some terminology. A density matrix that is of the form

ρ = |ψ⟩⟨ψ|

for some |ψ⟩ is said to represent a “pure state,” because you know with
100% certainty which quantum state your system is in. Note that for a
pure state,
ρ² = ρ    (for a pure state).
It turns out that the above condition is a necessary and sufficient con-
dition for determining if a density matrix represents a pure state.
If a density matrix is instead a combination of different states in a
classical superposition, it is said to represent a “mixed state.” This is
sort of bad terminology, because a mixed state is not a “state” in the
Hilbert space H, but whatever.

1.9 Example: Two state system


Consider the simplest Hilbert space, representing a two state system.

H = C²

Let us investigate the difference between a quantum superposition and
a classical superposition. An orthonormal basis for this Hilbert space
is given by

|0⟩ = (0, 1)ᵀ        |1⟩ = (1, 0)ᵀ
Say you have a classical superposition of these two states where you
have a 50% probability that your state is in either state. Then
ρ_Mixed = (1/2) |0⟩⟨0| + (1/2) |1⟩⟨1|
        = [[1/2, 0], [0, 1/2]].
Let’s compare this to the pure state of the quantum superposition

|ψ⟩ = (1/√2) |0⟩ + (1/√2) |1⟩.

The density matrix would be

ρ_Pure = ( (1/√2) |0⟩ + (1/√2) |1⟩ ) ( (1/√2) ⟨0| + (1/√2) ⟨1| )
       = (1/2) ( |0⟩⟨0| + |1⟩⟨1| + |0⟩⟨1| + |1⟩⟨0| )
       = [[1/2, 1/2], [1/2, 1/2]]

The pure state density matrix is different from the mixed state because
of the non-zero off diagonal terms. These are sometimes called “inter-
ference terms.” The reason is that states in a quantum superposition
can “interfere” with each other, while states in a classical superposition
can’t.
Let’s now look at the expectation value of the following operators
for both density matrices.
   
σ_z = [[1, 0], [0, −1]]        σ_x = [[0, 1], [1, 0]]

They are given by

⟨σ_z⟩_Mixed = Tr( [[1/2, 0], [0, 1/2]] [[1, 0], [0, −1]] ) = 0
⟨σ_z⟩_Pure  = Tr( [[1/2, 1/2], [1/2, 1/2]] [[1, 0], [0, −1]] ) = 0
⟨σ_x⟩_Mixed = Tr( [[1/2, 0], [0, 1/2]] [[0, 1], [1, 0]] ) = 0
⟨σ_x⟩_Pure  = Tr( [[1/2, 1/2], [1/2, 1/2]] [[0, 1], [1, 0]] ) = 1
So we can see that a measurement given by σz cannot distinguish be-
tween ρMixed and ρPure , while a measurement given by σx can distinguish
between them! There really is a difference between classical superpo-
sitions and quantum superpositions, but you can only see this difference
if you exploit the off-diagonal terms!
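
Here is a short numerical check of this subsection (same basis conventions as above, written as a sketch rather than anything canonical): build ρ_Mixed and ρ_Pure, test purity via ρ² = ρ, and compute the four expectation values as Tr(ρO).

    import numpy as np

    ket0 = np.array([[0.0], [1.0]])            # |0> = (0, 1)^T
    ket1 = np.array([[1.0], [0.0]])            # |1> = (1, 0)^T
    psi = (ket0 + ket1) / np.sqrt(2)

    rho_mixed = 0.5 * ket0 @ ket0.T + 0.5 * ket1 @ ket1.T
    rho_pure = psi @ psi.T
    sigma_z = np.array([[1.0, 0.0], [0.0, -1.0]])
    sigma_x = np.array([[0.0, 1.0], [1.0, 0.0]])

    print(np.trace(rho_pure @ rho_pure), np.trace(rho_mixed @ rho_mixed))   # 1.0 vs 0.5
    for name, rho in [("mixed", rho_mixed), ("pure", rho_pure)]:
        print(name, np.trace(rho @ sigma_z), np.trace(rho @ sigma_x))
    # sigma_z gives 0 for both; sigma_x gives 0 for the mixture and 1 for the pure state.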

1.10 Entropy of Mixed States


In quantum mechanics, pure states are microstates and mixed states
are the macrostates. We can define the entropy of a mixed state drawing
inspiration from the definition of Shannon entropy.

S = −k Tr(ρ log ρ) (31)


This is called the von Neumann entropy. If ρ represents a classical
superposition of orthonormal states |ψ_i⟩, each with some probability
p_i, then the above definition exactly matches the definition of Shannon
entropy (up to the factor of k).
One thing should be explained, though. How do you take the log-
arithm of a matrix? This is actually pretty easy. Just diagonalize the
matrix and take the log of the diagonal entries. Thankfully, density ma-
trices can always be diagonalized (they are manifestly self-adjoint and
therefore diagonalizable by the spectral theorem) so you don’t have to
do anything more complicated.
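
A minimal sketch of that recipe (k = 1): diagonalize ρ and sum −p log p over its eigenvalues, dropping zero eigenvalues just as in the Shannon case.

    import numpy as np

    def von_neumann(rho):
        evals = np.linalg.eigvalsh(rho)        # rho is self-adjoint, so diagonalize it
        evals = evals[evals > 1e-12]           # the 0 log 0 = 0 convention
        return -np.sum(evals * np.log(evals))

    rho_pure = np.array([[0.5, 0.5], [0.5, 0.5]])    # the pure state from section 1.9
    rho_mixed = np.array([[0.5, 0.0], [0.0, 0.5]])   # the 50/50 classical mixture
    print(von_neumann(rho_pure))     # 0: a pure state carries no entropy
    print(von_neumann(rho_mixed))    # log 2 ~ 0.693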

1.11 Classicality from environmental entanglement


Say you have two quantum systems A and B with Hilbert spaces HA
and HB . If you combine the two systems, states will live in the Hilbert
space
HA ⊗ HB .
Say that |φ_i⟩_A ∈ H_A comprise a basis for the state space of H_A and
|φ_j⟩_B ∈ H_B comprise a basis for the state space H_B. All states in
H_A ⊗ H_B will be of the form

|Ψ⟩ = Σ_{i,j} c_ij |φ_i⟩_A |φ_j⟩_B

for some c_ij ∈ C.


States are said to be “entangled” if they cannot be written as

|ψ⟩_A |ψ⟩_B

for some |ψ⟩_A ∈ H_A and |ψ⟩_B ∈ H_B.
So, for example, if H_A = C² and H_B = C², then the state

|0⟩ ( (1/√2) |0⟩ − (i/√2) |1⟩ )

would not be entangled, while the state

(1/√2) ( |0⟩ |0⟩ + |1⟩ |1⟩ )
would be entangled.
Let’s say a state starts out unentangled. How would it then become
entangled over time? Well, say the two systems A and B have Hamilto-
nians ĤA and ĤB . If we want the systems to interact weakly, i.e. “trade
energy,” we’ll also need to add an interaction term to the Hamiltonian.

Ĥ = ĤA ⊗ ĤB + Ĥint .

It doesn’t actually matter what the interaction term is, or if it is very
small. All that matters, if we really want the systems to interact, is
that the interaction term is there at all. Once we add an
interaction term, we will generically see that states which start out un-
entangled become heavily entangled over time as A and B interact.
Say for example you had a system S described by a Hilbert space
HS coupled to a large environment E described by a Hilbert space HE .
Now, maybe you are an experimentalist and you are really interested in
studying the quantum dynamics of S. You then face a very big prob-
lem: E. Air molecules in your laboratory will be constantly bumping up
against your system, for example. This is just intuitively what I mean
by having some non-zero Ĥint . The issue is that, if you really want to
study S, you desperately don’t want it to entangle with the environ-
ment, because you have no control over the environment! This is why
people who study quantum systems are always building these big com-
plicated vacuum chambers and cooling their system down to fractions of
a degree above absolute zero: they want to prevent entanglement with
the environment so they can study S in peace!

Figure 12: Air molecules bumping up against a quantum system S will
entangle with it.

Notice that the experimentalist will not have access to the observ-
ables in the environment. Associated with HS is a set of observables
ÔS . If you tensor these observables together with the identity,
ÔS ⊗ 1E
you now have an observable which only measures quantities in the HS
subsector of the full Hilbert space. The thing is that entanglement
within the environment gets in the way of measuring ÔS ⊗ 1E in the
way the experimenter would like.
Say, for example, H_S = C² and H_E = C^N for some very big N. Any
state in H_S ⊗ H_E will be of the form

c_0 |0⟩ |ψ_0⟩ + c_1 |1⟩ |ψ_1⟩    (32)

for some c_0, c_1 ∈ C and |ψ_0⟩, |ψ_1⟩ ∈ H_E. The expectation value for our
observable is

⟨Ô_S ⊗ 1_E⟩ = ( c_0* ⟨0|⟨ψ_0| + c_1* ⟨1|⟨ψ_1| ) ( Ô_S ⊗ 1_E ) ( c_0 |0⟩|ψ_0⟩ + c_1 |1⟩|ψ_1⟩ )
            = |c_0|² ⟨0|Ô_S|0⟩ + |c_1|² ⟨1|Ô_S|1⟩ + 2 Re( c_0* c_1 ⟨0|Ô_S|1⟩ ⟨ψ_0|ψ_1⟩ )


The thing is that, if the environment E is very big, then any two randomly
chosen vectors |ψ_0⟩, |ψ_1⟩ ∈ H_E will generically have almost no overlap.

⟨ψ_0|ψ_1⟩ ≈ e^{−N}

(This is just a fact about random vectors in high dimensional vector
spaces.) Therefore, the expectation value of this observable will be

⟨Ô_S ⊗ 1_E⟩ ≈ |c_0|² ⟨0|Ô_S|0⟩ + |c_1|² ⟨1|Ô_S|1⟩.

Because there is no cross term between |0i and |1i, we can see that
when we measure our observable, our system S seems to be in a classical
superposition, A.K.A a mixed state!
This can be formalized by what is called a “partial trace.” Say that
|φ_i⟩_E comprises an orthonormal basis of H_E. Say we have some density
matrix ρ representing a state in the full Hilbert space. We can “trace
over the E degrees of freedom” to receive a density matrix in the S
Hilbert space.

ρ_S ≡ Tr_E(ρ) ≡ Σ_i ⟨φ_i|_E ρ |φ_i⟩_E.    (33)
You might be wondering why anyone would want to take this partial trace.
Well, I would say that if you can’t perform measurements on the E degrees
of freedom, why are you describing them? It turns out that the partially
traced density matrix gives us the expectation values for any observables in
S. Once we compute ρ_S by tracing over E, we can then calculate the
expectation value of any observable Ô_S by just calculating the trace over
S of ρ_S Ô_S:

Tr( ρ (Ô_S ⊗ 1_E) ) = Tr_S( ρ_S Ô_S ).
Even though the whole world is in some particular state in HS ⊗ HE ,
when you only perform measurements on one part of it, that part might
as well be in a mixed state for all you know! Entanglement looks
like a mixed state when you only look at one part of a Hilbert space.
Furthermore, when the environment is very large, the off diagonal “in-
terference terms” in the density matrix are usually very close to zero,
meaning the state looks very mixed.
This is the idea of “entanglement entropy.” If you have an entangled
state and trace out the states in one part of the Hilbert space, you
will receive a mixed density matrix. That density matrix will have
some Von Neumann entropy, and in this context we would call it “en-
tanglement entropy.” The more entanglement entropy your state has,
the more entangled it is! And, as we can see, when you can only look
at one tiny part of a state when it is heavily entangled, it appears to be
in a classical superposition instead of a quantum superposition!
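
Here is a small sketch of that procedure for the entangled state (|0⟩|0⟩ + |1⟩|1⟩)/√2 used above, writing the state as a coefficient matrix c_ij: tracing out E leaves a maximally mixed ρ_S with entanglement entropy log 2.

    import numpy as np

    c = np.array([[1.0, 0.0],
                  [0.0, 1.0]]) / np.sqrt(2)   # c_ij for |Psi> = sum_ij c_ij |i>_S |j>_E

    rho_S = c @ c.conj().T        # partial trace over E: (rho_S)_ik = sum_j c_ij c*_kj
    print(rho_S)                  # [[0.5, 0], [0, 0.5]]: the interference terms are gone

    p = np.linalg.eigvalsh(rho_S)
    print(-np.sum(p * np.log(p))) # entanglement entropy = log 2 ~ 0.693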
The process by which quantum states in real life become entangled
with the surrounding environment is called “decoherence.” It is one of
the most viciously efficient processes in all of physics, and is the reason
why it took the human race so long to discover quantum mechanics. It’s
very ironic that entanglement, a quintessentially quantum phenomenon,
when taken to dramatic extremes, hides quantum mechanics from view

entirely!
I would like to point out an important difference between a clas-
sical macrostate and a quantum mixed state. In classical mechanics,
the subtle perturbing effects of the environment on the system make it
difficult to keep track of the exact microstate a system is in. However,
in principle you can always just re-measure your system very precisely
and figure out what the microstate is all over again. This isn’t the case
in quantum mechanics when your system becomes entangled with the
environment. The problem is that once your system entangles with the
environment, that entanglement is almost certainly never going to undo
itself. In fact, it’s just going to spread from the air molecules in your
laboratory to the surrounding building, then the whole university, then
the state, the country, the planet, the solar system, the galaxy, and then
the universe! And unless you “undo” all of that entanglement, the show’s
over! You’d just have to start from scratch and prepare your system in
a pure state all over again.

1.12 The Quantum Partition Function


The quantum analog of the partition function is very straightforward.
The partition function is defined to be
 
Z(T) ≡ Tr exp(−Ĥ/kT)    (34)
      = Σ_s e^{−βE_s}.

Obviously, this is just the same Z(T ) that we saw in classical mechanics!
They are really not different at all. However, there is something very
interesting in the above expression. The operator

exp(−Ĥ/kT)

looks an awful lot like the time evolution operator

exp(−iĤt/ℏ)

if we just replace

−(i/ℏ) t −→ −β.
It seems as though β is, in some sense, an “imaginary time.” Rotating the
time variable into the imaginary direction is called a “Wick Rotation,”

and is one of the most simple, mysterious, and powerful tricks in the
working physicist’s toolbelt. There’s a whole beautiful story here with
the path integral, but I won’t get into it.
Anyway, a mixed state is said to be “thermal” if it is of the form
ρ_Thermal = (1/Z(T)) Σ_s e^{−E_s/kT} |E_s⟩⟨E_s|    (35)
          = (1/Z(β)) e^{−βĤ}

for some temperature T, where |E_s⟩ are the energy eigenstates with eigen-
values E_s. If you let your system equilibrate with an environment at
some temperature T, and then trace out the environmental degrees
of freedom, you will find your system in the thermal mixed state.
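
A minimal sketch of this thermal state for a made-up two-level Hamiltonian (k = 1 units): build ρ from the energy eigenstates and check that Tr(ρĤ) matches the ⟨E⟩ computed directly from the Boltzmann weights.

    import numpy as np

    H = np.array([[0.0, 0.3],
                  [0.3, 1.0]])                # a made-up Hermitian Hamiltonian
    beta = 2.0

    E, V = np.linalg.eigh(H)                  # energy eigenvalues and eigenvectors
    w = np.exp(-beta * E)
    Z = w.sum()                               # Z(beta) = sum_s exp(-beta E_s)
    rho = (V * (w / Z)) @ V.conj().T          # rho = sum_s e^{-beta E_s}/Z |E_s><E_s|

    print(np.trace(rho))                          # 1.0
    print(np.trace(rho @ H), np.sum(E * w) / Z)   # <E> two ways, identical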
