2018F 217 Lectures
Fall 2018
Lecturer: McGreevy
These lecture notes live here. Please email corrections to mcgreevy at physics dot
ucsd dot edu.
Contents
0.1 Introductory comments . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
0.2 Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2 Random walks 18
2.1 Biased gaussian walk . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.2 Universality class of the (unrestricted) random walk . . . . . . . . . . . 19
2.3 Self-avoiding walks have their own universality class . . . . . . . . . . . 21
3 Ising models 26
3.1 Decimation RG for 1d nearest-neighbor Ising model . . . . . . . . . . . 31
3.2 High-temperature expansion . . . . . . . . . . . . . . . . . . . . . . . . 36
3.3 RG evaluation of physical quantities . . . . . . . . . . . . . . . . . . . . 37
3.4 Need for other schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.5 Low-temperature expansion, and existence of phase transition in d>1 42
3.6 A word from our sponsor . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.7 Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.8 Block spins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.1 Landau-Ginzburg theory . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.2 Correlations; Ginzburg criterion for MFT breakdown . . . . . . . . . . 59
5 Festival of rigor 68
5.1 Extensivity of the free energy . . . . . . . . . . . . . . . . . . . . . . . 68
5.2 Long-range interactions . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.3 (Anti-)convexity of the free energy . . . . . . . . . . . . . . . . . . . . 72
5.4 Spontaneous symmetry breaking . . . . . . . . . . . . . . . . . . . . . . 75
5.5 Phase coexistence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6 Field Theory 82
6.1 Beyond mean field theory . . . . . . . . . . . . . . . . . . . . . . . . . 83
6.2 Momentum shells . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.3 Gaussian fixed point . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.4 Perturbations of the Gaussian model . . . . . . . . . . . . . . . . . . . 88
6.5 Field theory without Feynman diagrams . . . . . . . . . . . . . . . . . 91
6.6 Perturbative momentum-shell RG . . . . . . . . . . . . . . . . . . . . . 100
7 Scaling 106
7.1 Crossover phenomena . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
7.2 Finite-size scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
0.1 Introductory comments
The ‘renormalization group’ (RG) is a poor name for the central concept in many-body
physics. It is a framework for addressing the question: what is the relationship between
microscopic laws and macroscopic observations?
Or, closer to home, it allows us to answer questions such as: Why don’t you need
to understand nuclear physics to make your tea in the morning?1
Briefly, the RG is the realization that systems of many degrees of freedom (especially
when they have local interactions) should be understood hierarchically, i.e. scale by
scale.
There is a lot more to say to contextualize the RG, which, as you can see from the
previous question, is really a piece of metaphysics, that is, it is a framework for how
to do physics. But since it is such a broad and far-reaching concept, in order to avoid
being vague and useless, it will be better to start with some concrete and simple ideas,
before discussing some of its many consequences.
A word about prerequisites: The official prerequisite for this course is graduate
statistical mechanics. I think you would be able to get by with a good undergrad class.
The historical origins of the RG (at least its name) are tied up with high-energy
particle physics and quantum field theory. That stuff involves quantum mechanics in
a serious way. Much of the content of this course can be understood without quantum
mechanics; the fluctuations could all be thermal. At various points along the way I
will point out the connections with quantum field theory.
So this is mostly a course in statistical field theory (≡ statistical mechanics of many
degrees of freedom). But there are many other applications of the RG which don’t quite
fit in this category which I also hope to discuss.
Also, I think our discussion will all be non-relativistic, v ≪ c.
3. Ising models
5. RG treatment of iterative maps and the period-doubling approach to chaos
As the title indicates, this is a very rough guess for what we’ll do. An early target
will be a renormalization-group understanding of the central limit theorem.
0.2 Conventions
The convention that repeated indices are summed is always in effect unless otherwise
indicated.
A useful generalization of the shorthand $\hbar \equiv \frac{h}{2\pi}$ is
$$\bar{d}k \equiv \frac{dk}{2\pi}.$$
I will also write $\slashed{\delta}(q) \equiv (2\pi)^d \delta^d(q)$.
I will try to be consistent about writing Fourier transforms as
$$\int \frac{d^d k}{(2\pi)^d}\, e^{ikx} \tilde f(k) \equiv \int \bar{d}^d k\, e^{ikx} \tilde f(k) \equiv f(x).$$
1 Scaling and self-similarity
[This discussion largely follows the early chapters of the book by Creswick et al.]
First some somewhat-vague definitions to get us started. An object is self-similar if
its parts, when magnified by a suitable scale factor λ, look like the whole. (Here is
an example.) Something is scale-invariant if this is true for every λ. (Self-similarity
is sometimes called ‘discrete scale invariance’.) An important generalization is the
notion of statistical self-similarity – something which is sampled from a distribution
which is self-similar.
The point in life of the renormalization group is to provide a way of thinking about
(and ideally relating quantitatively) what’s going on at different scales of magnification.
So something which is self-similar or scale-invariant is a simple special case for the RG.
As we’ll see, a symptom of scale invariance is a power law.
The word ‘dimension’ is used in many ways in this business. Let’s consider a set of
points in d-dimensional Euclidean space, Rd . In the previous sentence ‘dimension’ is
the minimum number of coordinates needed to specify the location of a point (this is
usually called ‘Euclidean dimension’). It’s an integer.
A subset of R^d specified by some algebraic equations on the coordinates (we can call this an algebraic set) generically has a Euclidean dimension which is an integer (though it may not be the same integer for every point). That is, locally around almost every solution of the equations, the object will look like a piece of R^{d_T} for some d_T ≤ d (sometimes this notion is called ‘topological dimension’).
[Figure: the algebraic set {y(y − x²) = 0} ⊂ R² has d_T = 1.]
Here is a different, RG-inflected definition of the dimension of an object O ⊂ R^d, called fractal dimension or Hausdorff dimension: cover the object O with d-balls of diameter a,
$$B_{\vec r_0}(a) \equiv \{\vec r \in R^d \ \text{such that}\ |\vec r - \vec r_0| \leq a/2\}. \qquad (1.1)$$
Let
N(a) ≡ the minimum number of such balls required to cover O,
minimizing over the locations of their centers. Do this for various values of a. Then, if this function is a power law,
$$N(a) \sim a^{-D} \qquad (1.2)$$
then D is the fractal dimension of O. Even if N(a) is not a power law, we can define $D \equiv -\frac{\log N(a)}{\log a}$.
A few observations:
• Notice that D may itself depend on the range of ball-sizes a we consider, that is, the same scaling relation may not hold for all a. Often (always) there is a short-distance (“UV”) cutoff on the regime where the scaling relation (1.2) holds – if our object is the coastline of France, it is maybe not so useful to consider femtometer-sized balls. Also, there is often a long-distance (“IR”) cutoff – in the same example, Earth-sized balls will not give an interesting power law (it just gives N(r_⊕) = 1).
$$N(a) = 2^{n(a)} = 2^{\log(a_0/a)/\log 3} = \left(\frac{a}{a_0}\right)^{-\log 2/\log 3},$$
which gives fractal dimension
$$D = \frac{\log 2}{\log 3} \simeq 0.63 \in (0, 1). \qquad (1.3)$$
Notice that this object is self-similar with scale factor λ = 3: the two remaining thirds are identical to the original up to a rescaling of the length by a factor of three. This fact can be used to infer the power law, since it means N(a) = 2N(3a). So if N(a) ∼ a^{−D}, we must have a^{−D} = 2(3a)^{−D} ⟹ 1 = 2 · 3^{−D}, which is (1.3).
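This box-counting recipe can be checked numerically. A minimal sketch (my own, not from the notes): build the midpoints of the level-k intervals of the middle-thirds Cantor set, count occupied bins of diameter a for two ball sizes inside the scaling regime, and extract D from the slope.

```python
import math

def cantor_midpoints(depth):
    """Midpoints of the 2**depth intervals at level `depth` of the
    middle-thirds Cantor set construction."""
    intervals = [(0.0, 1.0)]
    for _ in range(depth):
        intervals = [piece
                     for (lo, hi) in intervals
                     for piece in ((lo, lo + (hi - lo) / 3.0),
                                   (hi - (hi - lo) / 3.0, hi))]
    return [(lo + hi) / 2.0 for (lo, hi) in intervals]

def box_count(points, a):
    """N(a): how many bins (balls) of diameter a are occupied."""
    return len({int(x / a) for x in points})

points = cantor_midpoints(10)        # resolved down to scale 3**-10
a1, a2 = 3.0 ** -4, 3.0 ** -7        # ball sizes inside the scaling regime
D = -(math.log(box_count(points, a1)) - math.log(box_count(points, a2))) / \
     (math.log(a1) - math.log(a2))
print(round(D, 3))                   # 0.631, i.e. log 2 / log 3
```

At these aligned ball sizes the counts are exactly 2^n per level, so the extracted slope reproduces D = log 2/log 3 to machine precision.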
1.2 Fractal dimension of a random walk
With $\vec R_M \equiv \sum_{i=1}^M \vec r_i$ the endpoint of a walk of M independent steps of length a_0, the cross terms average to zero and
$$\left\langle |\vec R_M|^2 \right\rangle_M = \sum_{i,j} \langle \vec r_i \cdot \vec r_j \rangle_M = \sum_{j=1}^{M} \left\langle |\vec r_j|^2 \right\rangle = M a_0^2.$$
The RMS displacement $R(M) \equiv \sqrt{\left\langle |\vec R_M|^2 \right\rangle_M} = \sqrt{M}\, a_0$ goes like the square root of the number of steps, a probably-familiar result on which we are going to get some new perspective now.
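The √M law is easy to verify by direct simulation. A quick Monte Carlo sketch (an illustration of the result above, with d = 2 and fixed step length a_0 = 1):

```python
import math
import random

def rms_displacement(M, trials=4000, a0=1.0):
    """Monte Carlo estimate of R(M) for a walk of M independent steps of
    fixed length a0 in d = 2, with isotropically random directions."""
    total = 0.0
    for _ in range(trials):
        x = y = 0.0
        for _ in range(M):
            theta = random.uniform(0.0, 2.0 * math.pi)
            x += a0 * math.cos(theta)
            y += a0 * math.sin(theta)
        total += x * x + y * y
    return math.sqrt(total / trials)

random.seed(0)
R100, R400 = rms_displacement(100), rms_displacement(400)
print(R100, R400, R400 / R100)   # ratio should be near sqrt(400/100) = 2
```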
What is the fractal dimension of a random walk?
A walk of M steps can be regarded as M/n subwalks of n steps (choose M so
that these are integers). By the above result, the RMS displacement of the subwalks is
√
r(n) = na0 ; choose M big enough so that this is a good approximation. This suggests
that we may think of a random walk (RW) of M steps of length a0 as a RW of M/n
√
steps each of length (approximately) a1 ≡ na0 . Notice that this ‘coarse-grained’ step
size is not actually the same for each subwalk. (We are relying on the central limit
theorem here to say that the distribution of subwalk sizes is well-peaked around the
central value. We’ll give an RG proof of that result next.)
This perspective allows us to estimate the fractal dimension of an unrestricted RW. Let N(a) be, as above, the number of balls of diameter a needed to cover a walk (probably) of M microscopic steps of size a_0. When the ball-size is about the same as the stepsize, we need one ball for each step (this is overcounting, but should give the right scaling), so we'll have
$$N(a) \sim M, \qquad \text{for } a \sim a_0.$$
2. As a practical physicist, why should you care about this result? Here’s one kind
of answer: suppose you have in your hands some object which is locally one-
dimensional, but squiggles around in a seemingly random way. It is governed
by some microscopic dynamics which are mysterious to you, and you would like
to know if you can model it as an unrestricted random walk. One diagnostic
you might do is to measure its fractal dimension; if it’s not D = 2 then for sure
something else is going on in there. (If it is D = 2 something else still might be
going on.)
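As a sketch of this diagnostic in practice (my own toy version, using d = 3 so that the walk sits comfortably below space-filling): generate one long unrestricted walk and box-count it between the UV cutoff (the step size) and the IR cutoff (the overall extent).

```python
import math
import random

# Build one long unrestricted walk in d = 3 with unit steps in random directions.
random.seed(3)
M = 200000
pts, x, y, z = [], 0.0, 0.0, 0.0
for _ in range(M):
    cos_t = random.uniform(-1.0, 1.0)      # isotropic direction on the sphere
    sin_t = math.sqrt(1.0 - cos_t * cos_t)
    phi = random.uniform(0.0, 2.0 * math.pi)
    x += sin_t * math.cos(phi)
    y += sin_t * math.sin(phi)
    z += cos_t
    pts.append((x, y, z))

def n_boxes(a):
    """N(a): number of cubes of side a touched by the walk."""
    return len({(int(px // a), int(py // a), int(pz // a)) for (px, py, pz) in pts})

# Fit D between the UV cutoff (step size 1) and the IR cutoff (~ sqrt(M) ~ 450)
a1, a2 = 4.0, 32.0
D = (math.log(n_boxes(a1)) - math.log(n_boxes(a2))) / (math.log(a2) - math.log(a1))
print(round(D, 2))   # roughly 2 for an unrestricted walk
```

A single walk and a single decade of ball sizes gives only a rough D, but it is already far from 1 (a smooth curve) and close to 2.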
[End of Lecture 1]
3. For some statistically self-similar sets, a single fractal dimension does not capture
the full glory of their fractaliciousness, and it is useful to introduce a whole
spectrum of fractal dimensions. Such a thing is called multifractal.
I hope to say more about both of the previous points later on in the course.
Now we’ll study the random walk a bit more precisely, and use it to introduce the RG
machinery. To be specific, suppose that each microscopic step is sampled from the
(Gaussian) distribution
$$p(\vec r) = \mathcal{N} e^{-\frac{|\vec r|^2}{2\sigma_0^2}}, \qquad \mathcal{N} = (2\pi\sigma_0^2)^{-d/2}.$$
As before, the detailed form of the single-step distribution will be unimportant for the questions of interest to us – the technical term is ‘irrelevant’; this will be an outcome of the RG analysis. In this case, we have $\langle \vec r \rangle = 0$, $\langle \vec r \cdot \vec r \rangle = \sigma_0^2$.
Let $\vec r\,' \equiv \sum_{i=1}^n \vec r_i$. Think of this as a ‘coarse-grained step’ – imagine that the single steps (of RMS size σ_0) are too small to see, but for n big enough, n of them can get somewhere. The distribution for the coarse-grained step is:
$$P(\vec r\,') = \int d^d r_1 \cdots d^d r_n\, p(\vec r_1) \cdots p(\vec r_n)\, \underbrace{\delta\Big(\vec r\,' - \sum_{i=1}^n \vec r_i\Big)}_{= \int \bar d^d k\, e^{i \vec k \cdot (\vec r\,' - \sum_i \vec r_i)}}$$
(do n · d Gaussian integrals)
$$= \mathcal{N}' \exp\left(-\frac{|\vec r\,'|^2}{2 n \sigma_0^2}\right), \qquad \mathcal{N}' \equiv \left(2\pi n \sigma_0^2\right)^{-d/2}. \qquad (1.4)$$
This is the same form of the distribution, with the replacement $\sigma_0 \to \sigma' \equiv \sqrt{n}\,\sigma_0$. We can make it actually the same distribution if we rescale our units (the second half of the RG transformation): rescale $\vec r\,' \equiv \sqrt{n}\, \vec r\,''$, where the zoom factor is chosen to keep the width of the distribution the same after the coarse-graining step. Remembering that distributions transform under change of variables by
$$P(\vec r\,')\, d^d r' = P(\vec r\,'')\, d^d r''$$
we have
$$P(\vec r\,'') = \frac{1}{(2\pi\sigma_0^2)^{d/2}}\, e^{-\frac{|\vec r\,''|^2}{2\sigma_0^2}}$$
– the same distribution as we had for a single step. Therefore, a random walk is (probably) a fractal – it is self-similar on average.
The two steps above – (1) coarse graining and (2) rescaling – constitute a renormalization group transformation (more on the general notion next). The ‘coupling constant’ σ_0 transforms under this transformation, in this case as
$$\sigma_0 \mapsto \sigma_{\rm renormalized} = \sigma_0,$$
i.e. it maps to itself; such a parameter is called marginal and is a special case.
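The marginality of σ_0 can be checked directly on samples (a trivial numerical illustration, my own sketch): coarse-grain by summing n steps, rescale by 1/√n, and measure the width of the result.

```python
import random

# Direct check that sigma_0 is marginal under the two-step RG on the
# Gaussian walk: coarse-grain (sum n steps), then rescale (divide by
# sqrt(n)); the width of the resulting distribution equals sigma_0 again.
random.seed(4)
sigma0, n, samples = 1.0, 16, 50000
rescaled = [sum(random.gauss(0.0, sigma0) for _ in range(n)) / n ** 0.5
            for _ in range(samples)]
var = sum(v * v for v in rescaled) / samples
print(round(var, 2))  # close to sigma0**2 = 1.0
```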
Consider the RMS distance covered by a walk in M steps,
$$R(M)^2 \equiv \left\langle \Big| \sum_{i=1}^M \vec r_i \Big|^2 \right\rangle_M.$$
It depends on M and the typical step size, which is σ (since $\sigma^2 = \int d^d r\, |\vec r|^2\, p(\vec r)$). Dimensional analysis tells us that we must have R(M) ∝ σ, and the statistical self-similarity we've just found suggests a power-law dependence on M:
$$R(M) \sim \sigma M^\nu,$$
which scaling relation defines the exponent ν. The coarse-grained walk (no rescaling) takes M' = M/n steps. Demanding the same outcome for the RMS displacement in both the microscopic description and in the coarse-grained description says
$$\sigma M^\nu = \underbrace{\sigma'}_{=\sqrt{n}\,\sigma} (M')^\nu = \sqrt{n}\, \sigma \left(\frac{M}{n}\right)^\nu = n^{\frac{1}{2} - \nu} \sigma M^\nu. \qquad (1.5)$$
(In the context of quantum field theory, a relation with the same logical content is called a Callan-Symanzik equation.) In order for this to be true for all n, we must have
$$\nu = \frac{1}{2}.$$
Recalling that the fractal dimension D = 2 also came from $\sigma' = \sqrt{n}\, \sigma_0 = n^{1/D} \sigma_0$, we've shown that an unrestricted random walk in d ≥ 2 has a relationship between the fractal dimension and the RMS displacement exponent: ν = 1/D.
Measurability of the fractal dimension. I’ve spoken above about the fractal
dimension of a random walk, for example of a random polymer configuration, as an
‘observable’. How could you measure it?
Suppose the monomers making up the polymer scatter light (elastically). The
fractal dimension can be extracted from the structure factor S(k), as measured by the
intensity of scattering of light off the object, as a function of the wavenumber k of the
light. (This statement is related to the open-ended question on the first homework.)
1. Coarse-graining or decimation: The idea of this step is familiar from the cen-
tral idea of how thermodynamics emerges from statistical mechanics: we should
average over the stuff we can’t keep track of (microscopic configurations of the
system), holding fixed the stuff we do keep track of (the thermodynamic variables
like energy and volume). In the connection mentioned in the previous sentence,
we do it all at once.
The key new idea of the RG is to do it a little bit at a time. That is: Integrate
out or average over some set of short-distance/fast degrees of freedom, holding
fixed a set of long-wavelength/slow degrees of freedom.
Notice that this step is not necessarily reversible: the value of a definite integral
(or sum) does not uniquely determine the integrand (or summand). We lose
information in this step. This means that a set of transformations defined this
way is not in fact a group in the mathematical sense, since there is no inverse
element (it is a semigroup). So much for that.
The idea is that we are squinting, so that the smallest distance ∆x we can resolve gets a little bigger: say before the coarse-graining we had a resolution ∆x = ε, and afterwards we only keep track of stuff that varies on distances larger than ∆x = λε for some scale factor λ > 1.
2. Rescaling: Now we change units to map the coarse-grained system back onto the original one, so that λε ↦ ε. We do this so that we can compare them.
Now we're going to think about the space on which this transformation is acting. Its coordinates are the parameters of the system, such as the parameters defining the probability distribution (σ_0 for the random walk), or the couplings in the Hamiltonian if p = e^{−βH}/Z. Let's call the set of such parameters {h_j}, where j is an index which runs over as many parameters as we need to consider.^{2,3} These parameters get transformed according to
$$\{h_j\} \overset{\text{steps 1, 2}}{\mapsto} \{h'_j \equiv R_j(\{h\})\}.$$
This map is something we can do over and over, coarse-graining (zooming out) by
a factor of λ each time, until we get to macroscopic sizes. The repeated application of
^2 For example, in the random walk case, other parameters we could include are b, c, ... in
$$p(\vec r) = \exp\left(-\left(\vec b \cdot \vec r + \frac{r^2}{2\sigma^2} + c\, r^4 + \cdots\right)\right).$$
^3 One of the many crucial contributions of Ken Wilson to this subject was (I think) allowing for the possibility of including arbitrarily many parameters. The terror you are feeling at this possibility of an infinite-dimensional space of coupling parameters will be allayed when we discover the correct way to organize them two pages from now.
the map $h'_j \equiv R_j(h)$ describes a dynamical system on the space of parameters. If we are interested in macroscopic physics, we care about what happens when we do it lots of times:
$$h \mapsto R(h) \mapsto R(R(h)) \equiv R^2(h) \mapsto R^3(h) \mapsto \cdots$$
(When studying such a possibly-nonlinear dynamical system more generally, it is a good idea to ask first about the possible late-time behavior.)
What can happen? There are three possibilities:
The first case, where there is a fixed point, is the one about which we have a lot to say,
and fortunately is what seems to happen usually.
A crucial point: the distribution described by such a fixed point of the RG is self-
similar, by the definition of the RG transformation. (If this is true when our zooming
size λ → 1, then it is actually scale-invariant.)
Linearizing about the fixed point, let $h_j \equiv h^\star_j + \delta_j$, where |δ| ≪ 1 will be our small parameter. This maps under the RG step according to
$$h_j \equiv h^\star_j + \delta_j \ \mapsto\ h'_j = R_j(h^\star + \delta) \overset{\text{Taylor}}{=} \underbrace{R_j(h^\star)}_{= h^\star_j} + \delta_k \underbrace{\frac{\partial h'_j}{\partial h_k}\Big|_{h^\star}}_{\equiv R_{jk}} + \mathcal{O}(\delta^2)$$
where in the last step we assumed that the RG map R is analytic in the neighborhood of the fixed point, i.e. that it has a Taylor expansion. How could it not be? We got it by doing some finite sums of analytic functions. By $+\mathcal{O}(\delta^2)$ I mean plus terms that go like δ² and higher powers of δ, which are small and we will ignore them. If we ignore them, then the map on the deviation from the fixed point δ is a linear map:
$$\delta_j \mapsto \delta'_j = R_{jk}\, \delta_k.$$
We know what to do with a linear map: find its eigenvalues and eigenvectors:
$$R_{jk}\, \phi^{(n)}_k = \rho_n\, \phi^{(n)}_j. \qquad (1.6)$$
Notice that nothing we've said guarantees that $R_{jk}$ is a symmetric matrix, so its right and left eigenvectors need not be the same (the eigenvalues are), so we'll also need
$$\tilde\phi^{(n)}_j R_{jk} = \rho_n\, \tilde\phi^{(n)}_k.$$
About the eigenvalues, notice the following. We've defined the RG transformation R ≡ R_λ to accomplish a coarse-graining by a scale factor λ. We can imagine defining such a transformation for any λ, and these operations form a semigroup under composition
$$R_\lambda R_{\lambda'} = R_{\lambda\lambda'}.$$
This is useful because it says that the eigenvalues of the linearized operators
$$R_\lambda \phi^{(n)} = \rho_n(\lambda)\, \phi^{(n)}$$
must satisfy the same multiplication law^4
$$\rho_n(\lambda)\, \rho_n(\lambda') = \rho_n(\lambda\lambda'). \qquad (1.8)$$
But a function which satisfies this rule must have the form^5
$$\rho_n(\lambda) = \lambda^{y_n}. \qquad (1.9)$$
The eigenvectors of R give a preferred coordinate basis near the fixed point:
$$\delta_j = \sum_n g_n \phi^{(n)}_j, \qquad g_n = \sum_k \tilde\phi^{(n)}_k \delta_k, \qquad (1.7)$$
which we will use from now on. $y_n$ is called the scaling dimension of the coupling $g_n$.
Now we can see the crucial RG dichotomy which tames the infinitely many couplings: If |ρ_n| < 1 (y_n < 0) then as we act with R many times to get to long wavelengths, then g_n → 0. Such a coupling is called irrelevant: it goes away upon repeated RG transformations and its effects on macroscopic physics can be ignored. Notice that since the perturbation is getting smaller, the approximation |δ| ≪ 1 becomes better and better in this case.
In contrast, if |ρ_n| > 1 (y_n > 0) then as we act with R many times to get to long wavelengths, then g_n grows. (Notice that the eigenvectors need not be orthogonal.) Such
^4 Why do R_λ for different λ have the same eigenvectors? It really follows from the semigroup property. The eigenvectors are physical things: an eigenvector determines some operator O with the following property: if I add O to the fixed-point Hamiltonian, H* + gO, an RG transformation does not generate any other operators, i.e. it gives H = H* + αgO for some α.
On the other hand, the choice of by how much to zoom out (λ) is an arbitrary one. Doing the RG step by λ twice should give the same result as doing it once by λ². So in particular either one should give the same set of special directions.
^5 The function y(λ) ≡ log ρ_n(λ) then satisfies y(λ) + y(λ') = y(λλ'). First this implies y(1) = 0. If we consider λ' = 1 + ε, we have y(λ(1+ε)) = y(λ) + y(1+ε), which to first order in ε says λ y'(λ) = y'(1), i.e.
$$y(\lambda) = y'(1) \ln \lambda.$$
I'm not sure if the statement (1.9) follows if we only know (1.8) for discrete values of λ. Does it?
a parameter is called relevant, and represents an instability
of the fixed point: our linearization breaks down after repeated applications of R and
we leave the neighborhood of the fixed point.
The case of a coupling with $y_n = 0$, which doesn't change, is called marginal.
In these terms, the critical surface (actually its tangent space near the fixed point) is determined by
$$S(h^\star) = \{g_n = 0 \ \text{for all}\ n\ \text{with}\ y_n > 0\}.$$
In particular, the codimension of the critical surface in the space of couplings is the
number of relevant perturbations of the fixed point.
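The linearization recipe can be made concrete on a toy example. The two-coupling map R below is invented for illustration (it is not one of the RG maps in these notes); it has a fixed point at h★ = (1, 0), and we compute R_jk by finite differences and extract ρ_n and y_n = ln ρ_n / ln λ:

```python
import math

LAM = 2.0  # zoom factor assigned to one application of the toy map

def R(h):
    """Invented two-coupling RG map with a fixed point at h* = (1, 0)."""
    h1, h2 = h
    return (h1 ** 2 / (0.5 + 0.5 * h1),
            0.25 * h2 + 0.1 * (h1 - 1.0) * h2)

def linearize(f, hstar, eps=1e-6):
    """Finite-difference Jacobian R_jk = dh'_j / dh_k at the fixed point."""
    f0 = f(hstar)
    J = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(2):
        hp = list(hstar)
        hp[k] += eps
        fk = f(tuple(hp))
        for j in range(2):
            J[j][k] = (fk[j] - f0[j]) / eps
    return J

J = linearize(R, (1.0, 0.0))
tr = J[0][0] + J[1][1]
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
disc = math.sqrt(tr * tr - 4.0 * det)
rhos = ((tr + disc) / 2.0, (tr - disc) / 2.0)       # eigenvalues rho_n
ys = [math.log(r) / math.log(LAM) for r in rhos]    # scaling dimensions y_n
print(rhos, ys)   # one y_n > 0 (relevant), one y_n < 0 (irrelevant)
```

Here ρ = (1.5, 0.25), so y ≈ (0.585, −2): the first eigendirection is relevant (an instability of the fixed point), the second irrelevant.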
[End of Lecture 2]
2 Random walks
Next we generalize our ensemble of random walks to illustrate some features of the RG
that were missing from our simple pure Gaussian example above.
First, we can see an example of a relevant operator if we study a biased walk, with
$$p(\vec r) = \left(2\pi\sigma^2\right)^{-d/2} \exp\left(-\frac{|\vec r - \vec r_0|^2}{2\sigma^2}\right). \qquad (2.1)$$
Again define the distribution for the coarse-grained step to be
$$P(\vec r\,') = \int \prod_{i=1}^n d^d r_i\, p(\vec r_i)\, \delta\Big(\vec r\,' - \sum_i \vec r_i\Big)$$
(more Gaussian integrals)
$$= \left(2\pi n \sigma^2\right)^{-d/2} \exp\left(-\frac{|\vec r\,' - n \vec r_0|^2}{2 n \sigma^2}\right). \qquad (2.2)$$
After the rescaling step, to keep the width of the distribution fixed, we have
$$\sigma^{(R)} = \sigma, \qquad \vec r_0^{\,(R)} = \sqrt{n}\, \vec r_0.$$
So R is diagonal already. This says that the bias of the walk is a relevant operator of dimension y_0 = 1/2 > 0.
We have here an explicit example of an RG map R. Let’s study its fixed points.
There’s one at (σ, ~r0 = 0) (for any σ, so actually it is a family of fixed points parametrized
by the marginal coupling σ) which is the unbiased walk we studied earlier. This fixed
point is unstable because if we turn on a little r0 it will grow indefinitely.
And there's another fixed point at (σ, r⃗_0 = ∞). This is where we end up if we perturb the unbiased fixed point. The distribution (2.1) says (by direct calculation) that
$$R(M) = \sqrt{\left\langle |\vec R_M|^2 \right\rangle_M} = \sqrt{M^2 |\vec r_0|^2 + M \sigma^2}\ \overset{M \gg 1}{\longrightarrow}\ M |\vec r_0|.$$
This means that for large a, we'll need N(a) ∼ 1/a spheres of diameter a to cover the walk – it will be one dimensional.
This means that a system defined by some microscopic distribution of the form (2.1) with some value of r⃗_0 and σ will look like a Brownian walk of the type described above, with fractal dimension D = 2, if you look at it closely, with a resolution δx ≪ σ. But from a distance (resolution δx ≫ σ), it will look like a one-dimensional path (D = 1) in the r⃗_0 direction. For example, the number of balls defining the fractal dimension behaves as
$$N(a) \sim \begin{cases} a^{-2}, & a \ll \sigma \\ a^{-1}, & a \gg \sigma \end{cases}.$$
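A d = 1 numerical check of this crossover (my own sketch): with step mean r_0 = 0.1 and width σ = 1, the crossover sits at M* = (σ/r_0)² = 100, and ⟨R_M²⟩ = M² r_0² + M σ² on both sides of it.

```python
import random

# A d = 1 check of the biased-walk crossover: <R_M^2> = M^2 r0^2 + M sigma^2,
# diffusive for M << (sigma/r0)^2 = 100 and ballistic for M >> 100.
random.seed(2)
r0, sigma = 0.1, 1.0

def mean_R2(M, trials=2000):
    """Monte Carlo estimate of <R_M^2> for M biased Gaussian steps."""
    total = 0.0
    for _ in range(trials):
        x = sum(random.gauss(r0, sigma) for _ in range(M))
        total += x * x
    return total / trials

r2_small, r2_large = mean_R2(10), mean_R2(1000)
print(r2_small, 10 ** 2 * r0 ** 2 + 10 * sigma ** 2)      # ~ 11: diffusion dominates
print(r2_large, 1000 ** 2 * r0 ** 2 + 1000 * sigma ** 2)  # ~ 11000: drift dominates
```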
Now let the distribution from which a single step is sampled be any rotation-invariant distribution p(r⃗) = p(|r⃗|) with finite moments. For example, the fixed-step-length distribution $p(\vec r) = \frac{1}{4\pi a^2}\, \delta(|\vec r| - a)$ is a good one to keep in mind. (This is still not the most general walk, since we're still assuming the steps are independent. More on that next.) The distribution for the coarse-grained step is
$$P(\vec r\,') = \int \prod_{i=1}^n d^d r_i\, p(\vec r_i)\, \delta\Big(\vec r\,' - \sum_i \vec r_i\Big) = \int \bar d^d k\, e^{-i \vec k \cdot \vec r\,'} \left\langle e^{i \vec k \cdot \vec r} \right\rangle^n. \qquad (2.3)$$
The quantity
$$\left\langle e^{i \vec k \cdot \vec r} \right\rangle = \int d^d r\, p(\vec r)\, e^{i \vec k \cdot \vec r} \equiv g(k)$$
is called the characteristic function of the distribution p(r⃗), and is a generating function for its moments:
$$\langle r^m \rangle = (-i \partial_k)^m g(k)\big|_{k=0}.$$
The Taylor expansion in k of its logarithm is the cumulant expansion:
$$\log g(k) = \sum_m \frac{(ik)^m}{m!} C_m, \qquad C_m = (-i \partial_k)^m \log g\big|_{k=0}.$$
If we don't truncate the sum $\sum_m \frac{(ik)^m}{m!} C_m$, then the {C_m} are just another set of coordinates on the space of couplings for the walk. Why should we treat the integration variable k in (2.3)
$$P(\vec r\,') = \int \bar d^d k\, e^{-i \vec k \cdot \vec r\,'}\, e^{-\frac{n}{2} \sigma_0^2 |k|^2 + \mathcal{O}(n k^3)}$$
as small? Because the integrand is suppressed by the Gaussian factor. If the Gaussian bit dominates, then the integrand has support at $k \lesssim \frac{1}{\sqrt{n}\,\sigma_0}$, at which the mth term in the cumulant expansion contributes to the exponent in (2.3) as
$$n k^m C_m \sim n^{1 - \frac{m}{2}} \overset{n \to \infty}{\longrightarrow} 0 \quad \text{for } m > 2,$$
where the important thing for getting zero is just that C_m is finite and independent of n and k. This is the statement that the couplings C_m for m > 2 are irrelevant. Then we can do the remaining Gaussian integral (ignoring the small corrections which are suppressed by $e^{-n^{1-\frac{m}{2}} C_m}$):
$$P(\vec r\,') = \left(2\pi n \sigma_0^2\right)^{-d/2} \exp\left(-\frac{|\vec r\,' - n \langle \vec r \rangle|^2}{2 n \sigma_0^2}\right).$$
What's this? This is the Gaussian we used at the beginning, with $\vec r_0 = n \langle \vec r \rangle$.
This result – that the distribution for a sum of many random variables, independently distributed according to some distribution with finite moments, approaches a Gaussian – is usually called the Central Limit Theorem or the Law of Large Numbers. (For more on the derivation I recommend the discussion in Kardar volume 1.)
In the framework of the RG it is an example of universality: all such probability distributions are in the basin of attraction of the Gaussian random walk – they are said to be in the same universality class, meaning that they have the same long-wavelength
physics. In particular, their RMS displacement goes like $R_M \sim M^{1/2}$ for large number of steps M, and (for d ≥ 2) their fractal dimension is D = 2.
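Here is a minimal numerical illustration of the irrelevance of the higher cumulants (my own sketch, with uniformly distributed single steps): the excess kurtosis of the coarse-grained step, proportional to C_4, decays like 1/n.

```python
import random

# Illustration of universality: draw single steps from a decidedly
# non-Gaussian distribution (uniform on [-1, 1]) and look at the
# coarse-grained step, the sum of n of them. Its excess kurtosis, which
# vanishes for a Gaussian, shrinks like 1/n.
random.seed(1)

def excess_kurtosis_of_sums(n, samples=20000):
    data = [sum(random.uniform(-1.0, 1.0) for _ in range(n))
            for _ in range(samples)]
    m2 = sum(v * v for v in data) / samples
    m4 = sum(v ** 4 for v in data) / samples
    return m4 / m2 ** 2 - 3.0

k1, k30 = excess_kurtosis_of_sums(1), excess_kurtosis_of_sums(30)
print(k1, k30)   # -1.2 for the uniform distribution; near 0 after summing 30
```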
Notice that we did not prove that the Gaussian fixed point is the only one: we had
to assume that we were in its neighborhood in order to use the k ∼ n−1/2 scaling – this
scaling is a property of the neighborhood of the fixed point, just like the exponents y
we got by linearizing about the general fixed point in §1.5.
We could try to find other fixed points in the space of d-dimensional walk distributions. For example, we could have chosen the scaling to fix the coefficient C_m for any m. In that case we would find that the m − 1 perturbations $C_{l<m}$ are relevant and all the $C_{l>m}$ are irrelevant. The special case where we fix C_1 (i.e. choose k ∼ 1/n) gives the same fixed point we reached for the biased walk. The fixed points with $C_{m>2}$ fixed would have more than one relevant operator (we will learn to call this ‘multicritical’), which means reaching them requires tuning several parameters. For better or worse, these fixed-point distributions with m > 2 don't seem to exist as probability distributions, because they would have to have zero variance^6.
Also, the assumption in the statement of the CLT has an RG analog: if the initial distribution does not have finite moments, then our expansion in terms of cumulants is no good. An example is a Lorentzian distribution, $p(r) = \frac{\sigma/\pi}{r^2 + \sigma^2}$. In fact, in a certain sense the Lorentzian is a fixed point (if we set n = 2, where n is the parameter in the coarse-graining transformation as above).
(We will see another fixed point next when we include interactions between the
steps of the walk.)
One lesson which does generalize, however, is that most of the possible perturbations
of the fixed point are irrelevant, and there is only a small number of relevant or marginal
perturbations.
2.3 Self-avoiding walks have their own universality class
[Still from Creswick! I like this book. According to Amazon, Dover has put out a second edition.] Suppose that the random 1d objects we are studying are actually
polymers – long chain molecules made of ‘monomers’ which cannot be in the same
place, i.e. they have some short-ranged repulsion from each other. We can model this
as lattice paths without self-intersection, or self-avoiding walks (SAWs). Does this
microscopic modification of our ensemble change the long-wavelength physics?
It certainly changes our ability to do all the sums. If our polymer has n monomers, we'd like to know about the numbers
$$M_n(\vec R) \equiv \#\{\text{SAWs of } n \text{ steps whose endpoints are separated by } \vec R\}.$$
^6 Thanks to Tarun Grover for pointing this out to me. Maybe they do exist as simple analogs of ‘complex fixed points,’ where we drop some positivity assumptions.
Then we could figure out the RMS displacement from head-to-tail of the n-step polymer (actually we are not distinguishing between head and tail):
$$R(n)^2 \equiv \left\langle |\vec R|^2 \right\rangle_n = \frac{\sum_{\vec R} M_n(\vec R)\, |\vec R|^2}{M_n}.$$
The denominator here is $M_n \equiv \sum_{\vec R} M_n(\vec R)$. As with the unrestricted random walk, we might expect to have (we will) a scaling relation
$$R(n) \sim n^\nu. \qquad (2.5)$$
In (2.6), $G(K) \equiv \sum_{\vec R} G(K, \vec R)$. In this ensemble, for K < 1, the falloff of K^n with n fights against the growth of M_n to produce a sharp peak at some n_0(K).
There is a value of K where this peak step-length diverges,
since it is finite for K → 0 and infinite for K ≥ 1.
The weights are related by
$$W_{\Gamma'}(K') = \sum_{\Gamma \in \Gamma'} W_\Gamma(K).$$
3. K = Kc ' 0.297. This third one is where we go from finite walks at K slightly
below Kc to infinite walks at K > Kc .
The jagged line between K 0 = K and the curve defined by (2.8) depicts the repeated
action of the map with an initial condition near (but slightly below) the fixed point
at K = Kc . As you can see from the jagged line, the fixed point Kc is unstable –
the perturbation parametrized by K − Kc is relevant. Its dimension determines the
exponent ν defined in (2.5) as follows.
Because we are zooming out by a factor of λ, the typical size will rescale as
ξ(K) = λξ 0 (K 0 ).
Near the critical point,
$$\xi(K) \overset{K \to K_c}{\sim} |K - K_c|^{-\nu} = \underbrace{\lambda}_{=2}\, \underbrace{\xi'(K')}_{=\xi(K')} = 2\, \Big|\underbrace{K'(K) - K_c}_{= \frac{\partial K'}{\partial K}\big|_{K_c}(K - K_c)}\Big|^{-\nu}.$$
Therefore
$$|K - K_c|^{-\nu} = \lambda \left(\frac{\partial K'}{\partial K}\Big|_{K_c}\right)^{-\nu} |K - K_c|^{-\nu},$$
from which we conclude
$$\nu = \frac{\ln \lambda}{\ln \frac{\partial K'}{\partial K}\big|_{K_c}} = 0.771.$$
Numerical simulations give Kc = 0.379 and ν = 0.74.
Where are we making an approximation in the above? For example, some con-
figurations on the fine lattice have no counterpart on the coarse lattice (an example
is a walk which enters the cell and leaves again the same way). We are hoping that
these don’t make an important contribution to the sum. The real-space RG can be
systematically improved by increasing the zoom factor λ (clearly if we coarse-grain the
whole lattice at once, we’ll get the exact answer).
The important conclusion, however, is pretty robust: the d = 2 SAW has a different exponent than the unrestricted walk: $\nu_{\rm SAW} \simeq 0.77 > \frac{1}{2} = \nu_{\rm RW}$. This makes sense, since it means that R_RMS(SAW) > R_RMS(unrestricted) for many steps – the SAW takes up more space (for a fixed number of steps) since it can't backtrack. The fractal dimension is therefore smaller: $D_{\rm SAW} = \frac{1}{\nu} \simeq 1.3 < 2$.
Different exponents for the same observable near the critical point means different
universality class.
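The quantities M_n and ⟨|R⃗|²⟩ can be enumerated exactly for small n (a sketch of the counting problem above; serious estimates of ν need far longer walks, or the RG, so the exponent below is only an effective finite-size value):

```python
import math

# Exact enumeration of self-avoiding walks on the square lattice.
def saw_r2(n):
    """Squared end-to-end distances |R|^2 of all n-step SAWs from the origin."""
    out = []
    def extend(x, y, occupied, left):
        if left == 0:
            out.append(x * x + y * y)
            return
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (nx, ny) not in occupied:
                occupied.add((nx, ny))
                extend(nx, ny, occupied, left - 1)
                occupied.remove((nx, ny))
    extend(0, 0, {(0, 0)}, n)
    return out

m4, m8 = saw_r2(4), saw_r2(8)
print(len(m4), len(m8))            # M_4 = 100, M_8 = 5916 walks
r2_4 = sum(m4) / len(m4)
r2_8 = sum(m8) / len(m8)
nu_eff = 0.5 * math.log(r2_8 / r2_4) / math.log(2.0)
print(round(nu_eff, 2))            # between 1/2 (unrestricted) and 1 (ballistic)
```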
Teaser: This ensemble of self-avoiding walks is the n → 0 limit of the O(n) model! More specifically, the critical point in temperature of the latter model maps to the large-walk limit: $T - T_c \sim M^{-1}$. This realization will allow us to apply the same technology we will use for the Ising model (which we could call the O(1) model) and its O(n) generalizations to this class of models.
3 Ising models
Words about the role of models, solvable and otherwise, and universality:
Fixed points of the RG are valuable. Each one describes a possible long-wavelength
behavior, and each one has its own basin of attraction. That basin of attraction includes
lots of models which are in some sense ‘different’: they differ in microscopic details of
values of couplings, and sometimes even more dramatically. Two important examples:
(1) a lattice model and a continuum model can both flow to the same fixed point. The
idea is that if the correlation length is much longer than the lattice spacing, the lattice
variable looks like a continuous field, and we can interpolate between the lattice points.
And at a fixed point scale invariance requires that the correlation length be infinity (or
zero).
(2) a model with two states per site (like an Ising magnet, the subject of this
section) and a model with infinitely many states at each site can flow to the same fixed
point. Here’s a picture of how that might come about. Suppose we have at each site a
variable called S which lives on the real line, and it is governed by the potential energy
function V(S) = g(S² − 1)². (So for example the Boltzmann distribution is e^{−βV(S)}.)
The parameter g might be relevant, in the sense that g → ∞ at long wavelengths. This
process of making g larger is depicted in the following figures (left to right g = 1, 11, 21):
As you can see, it becomes more and more energetically favorable to restrict S to just
the two values S = ±1 as g grows.
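This concentration of the Boltzmann weight can be checked numerically. (A quick illustration of mine, not from the notes; the window half-width 0.2 and the integration cutoff L = 3 are arbitrary choices.)

```python
import math

def prob_near_minima(g, beta=1.0, window=0.2, n=20001, L=3.0):
    """Riemann-sum estimate of the Boltzmann probability that S lies within
    `window` of a minimum S = +-1, for the double well V(S) = g*(S**2 - 1)**2."""
    dS = 2 * L / (n - 1)
    Z = near = 0.0
    for k in range(n):
        S = -L + k * dS
        w = math.exp(-beta * g * (S * S - 1.0) ** 2) * dS
        Z += w
        if min(abs(S - 1.0), abs(S + 1.0)) < window:
            near += w
    return near / Z
```

As g grows from 1 to 21 this probability approaches 1: the continuous variable S effectively becomes a two-state Ising spin.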
I’ve just made a big deal about universality and the worship of fixed points of the
RG. Part of the reason for the big deal is that universality greatly increases the power
of simple models: if you can understand the physics of some simple (even ridiculously
over-idealized) model and show that it’s in the same universality class as a system of
interest, then you win.
[Goldenfeld §2.5, Creswick §5, lots of other places] The Ising model is an important
common ground of many fields of science. At each site i ∈ Λ (Λ may be a chain,
or the square lattice, or an arbitrary graph, and i = 1...|Λ| ≡ N (Λ) = N is the
number of sites), we have a binary variable si = ±1 called a spin, whose two states are
sometimes called up and down. There are 2N configurations altogether. (Although I
will sometimes call these ‘states’ I emphasize that we are doing classical physics.)
The name ‘Ising model’ connotes the following family of energy functionals (also
known as Hamiltonians):
\[
-H(s) = \sum_{i\in\Lambda} h_i s_i + \sum_{ij} J_{ij}\, s_i s_j + \sum_{ijk} K_{ijk}\, s_i s_j s_k + \cdots \tag{3.1}
\]
where this sum could go on forever with terms involving more and more spins at once.
(The RG will generically generate all such terms, with coefficients that we can hope do
not cause too much trouble.) With this definition, the model may describe magnetic
dipoles in a solid, a lattice gas (where si = ±1 correspond to presence or absence of
a particle at i), constrained satisfaction problems, neural networks, ... anything with
bits distributed over space. This list also could go on forever.7,8
Equilibrium statistical mechanics. Why might we care about H(s)? We can
use it to study the equilibrium thermodynamics of the system, at some temperature
T ≡ 1/β. Let’s spend a few moments reminding ourselves about the machinery of
equilibrium statistical mechanics. The key ‘bridge’ equation between the microscopic
world (stat mech) and the macroscopic world (thermo) in thermal equilibrium is
\[
e^{-\beta F} = \sum_s e^{-\beta H(s)} \equiv Z, \qquad
\sum_s \equiv \sum_{s_1=\pm1}\sum_{s_2=\pm1}\sum_{s_3=\pm1}\cdots\sum_{s_N=\pm1} \equiv \prod_{i=1}^{N}\sum_{s_i=\pm1} \equiv \mathrm{tr},
\]
and we will sometimes write tr for ‘trace’. I emphasize that we are doing classical
physics here.
Why do we care about the free energy F ? For one thing, it encodes the thermody-
namics of the system: the average energy is
\[
E \equiv \langle H\rangle \equiv \frac{1}{Z}\,\mathrm{tr}\, H\, e^{-\beta H} = -\partial_\beta \log Z,
\]
7 Here is an example I learned of recently of how an Ising model is used for data clustering.
8 Sometimes the word 'Ising' is used to indicate the presence of the Z₂ symmetry under s → −s,
which is present when only even terms appear in H (h = 0, K = 0).
the entropy is
S = −∂T F,
the heat capacity is
\[
C_V = \partial_T E = \frac{1}{T^2}\left(\langle H^2\rangle - \langle H\rangle^2\right),
\]
which is a dimensionless measure of the number of degrees of freedom. Notice that the familiar
thermodynamic identity F = E − TS follows by calculus.
More ambitiously, if we knew how F depended on all the coupling parameters
{hi , Jij , Kijk ...} in (3.1), we would know all of the correlation functions of the spins,
for example
\[
\partial_{h_i} F = -T\,\partial_{h_i}\log Z = -T\,\frac{1}{Z}\,\mathrm{tr}\,\frac{s_i}{T}\, e^{-\beta H} = -\langle s_i\rangle.
\]
And similarly,
\[
\partial_{h_i}\partial_{h_j} F = -\left(\langle s_i s_j\rangle - \langle s_i\rangle\langle s_j\rangle\right) T^{-1} \equiv -G_{ij}\, T^{-1}.
\]
It is a generating function for these (connected) correlation functions.
Clean and local Ising models. Two important specializations of (3.1) are quite
important in physics (not always in the other applications of the Ising model). We will
(usually) restrict to the important special case with the following two assumptions.
1. the couplings (Jij and friends) are local in the sense that the coupling between
two sites goes away (Jij → 0) if the sites are far apart (|ri − rj | → ∞).
A reason to care about the two point function in the case where there is a notion
of locality, then, is that it allows us to define a correlation length, ξ:
\[
G_{ij} \overset{r_{ij}\gg a}{\sim} e^{-r_{ij}/\xi}
\]
– here a is the range of the interactions, or the lattice spacing, and r_ij ≡ |r_i − r_j|
is the distance between the locations of spins i and j. The correlation length
will depend on the parameters in H and on the temperature, and it measures
the distance beyond which the spin orientations are uncorrelated. More formally,
ξ^{−1} ≡ −lim_{r→∞} ∂_r ln G_{i,i+r} (but of course the ∞ here has to stay within the box
containing the system in question).
2. the couplings are translation invariant: Jij = Jf (|ri − rj |) for some function of
the distance f (r). (If one thinks of variations of Jij with i, j as coming from
some kind of microscopic disorder, one refers to this case as clean.) We will often
consider the case where f (r) only has support when r = a is one lattice spacing.
(Notice that s2 = 1 means that we can ignore the case when r = 0.)
These two assumptions are independent, but we will usually make both. So: on
any graph (of N sites), the nearest-neighbor ‘clean’ Ising model has energy functional
\[
-H = h\sum_i s_i + J\sum_{\langle ij\rangle} s_i s_j
\]
When J > 0, the energy of a configuration is lower if neighboring spins point the
same way; in this 'ferromagnetic' case everybody can be happy (and M ≠ 0). In
the antiferromagnetic case J < 0, neighbors want to disagree. All spins can agree
to disagree if the graph has no loops. Any loop with an odd number of sites, like
a triangle, leads to a frustration of the antiferromagnetic interaction, which requires
compromise and leads to drama.
Lack of drama for bipartite lattices. A bipartite lattice is one which can be
divided into two distinct sublattices A, B each of which only neighbors sites of the other
lattice. That is hiji contains only pairs, one from A and one from B. For example,
hypercubic lattices are all bipartite: let the A lattice be those sites (x, y, ...) whose
(integer) coordinates add up to an even number x + y + ... ∈ 2Z. The honeycomb
lattice is also bipartite. The triangular lattice is not. 9
[End of Lecture 4]
A consequence of bipartiteness is that any loop traverses an even number of sites,
since it must alternate between the two sublattices. Hence there is no frustration for
a (nearest-neighbor!) Ising antiferromagnet on a bipartite lattice. In fact, a stronger
statement is true. Since
\[
H_{h=0,J}(s^A, s^B) = -J \sum_{\langle ij\rangle} s_i^A s_j^B
\]
9 Notice, by the way, that bipartite does not require that A and B be isomorphic or even that they
have the same number of sites. For example, if we simply removed a (periodic) subset of sites (and
all the associated links) from the A sublattice of a lattice, we would still have a bipartite lattice. You
can find an example by googling 'Lieb lattice'. Beware confusion in the literature on this point.
if we flip the spins of one sublattice, we also reverse J:
\[
H_{h=0,J}(s^A, -s^B) = +J\sum_{\langle ij\rangle} s_i^A s_j^B = H_{h=0,-J}(s^A, s^B).
\]
– flipping all the spins and flipping the coefficients of odd powers of the spins preserves
the energy. In particular, if h = 0, K = 0, all odd powers do not appear, and flipping
the spins is a symmetry of the Hamiltonian. What consequence does this have for
thermodynamics?
\[
Z(-h, J, -K, T) = \sum_{\{s\}} e^{-\beta H_{-h,J,-K}(s)} = \sum_{\{s\}} e^{-\beta H_{h,J,K}(-s)} = Z(h, J, K, T). \tag{3.2}
\]
And therefore the free energy in particular satisfies F (−h, J, −K, T ) = F (h, J, K, T ).
Let’s set K = 0 from now on. This operation si → −si is a Z2 transformation in the
sense that doing it twice is the same as doing nothing. It is a symmetry when h = 0.
(Only when h = 0 does the transformation map the ensemble to itself.)
Question: does this mean that when h = 0 we must have zero magnetization,
\[
M = \frac{1}{N}\sum_i \langle s_i\rangle \propto \partial_h F \overset{?}{=} 0\ ?
\]
Answer: It would if F (h) had to be a smooth, differ-
entiable function. In order for hsih=0 to be nonzero, F (h)
must have a different derivative coming from positive and
negative h, as in the figure at right. This phenomenon is
called spontaneous symmetry breaking because the symme-
try reverses the sign of the magnetization M → −M .
But this phenomenon, of ∂_h F|_{h=0⁺} ≠ ∂_h F|_{h=0⁻}, requires the function F(h) to be
non-analytic in h at h = 0. This is to be contrasted with the behavior for a finite
system (N < ∞), where
\[
Z(h) = \sum_{\{s\}} e^{-\beta H(s)} = c_1 e^{-\beta h N m_1} + c_2 e^{-\beta h N m_2} + \cdots + c_n e^{-\beta h N m_n}
\]
(here m_k are the possible values of the magnetization and c_k count the configurations realizing
them) is a finite sum of analytic functions of h, and hence analytic at h = 0. Spontaneous
symmetry breaking requires the thermodynamic limit N → ∞.
Let’s step back from these grand vistas and apply the RG for the Ising model in one
dimension. Consider a chain of sites i = 1...N , arranged in a line with spacing a, and
with an even number of sites, N ∈ 2Z. And for definiteness, if you must, take periodic
boundary conditions sN +1 = s1 . Turn off the magnetic field, so
\[
H = -J\sum_{i=1}^{N} s_i s_{i+1}.
\]
We’ll speak about the ferromagnetic case, J > 0 (though the same results apply to
J < 0 since the chain is bipartite). The partition function
Z = tre−βH = Z(βJ)
is calculable exactly in many ways, each of which instructive. Since the partition
function only depends on the combination βJ, let us set β = 1.
In the spirit of the RG, let us proceed by a hierarchical route, by decimating the
even sites:
\[
\sum_{\{s_i\},\ i\ \text{even}} e^{-H(s)} = e^{-H_{\rm eff}(s_{\rm odd})}
\]
On the right hand side, we have defined the effective hamiltonian for the spins at the
odd sites. The odd sites are separated by distance a0 = 2a and there are half as many
of them. We can use this as the first half of an RG implementation (the second half is
rescaling). We’ve zoomed by a factor of λ = a0 /a = 2.
In this 1d case we can actually do these sums:
\[
\sum_{s_2=\pm1} e^{+J s_2 (s_1+s_3)} = 2\cosh\left(J(s_1+s_3)\right) \equiv \Delta\, e^{J' s_1 s_3}
\]
The ∆ business just adds a constant to the (free) energy, which divides out of the
partition function and we don’t care about it here.
We can figure out what the new parameters are by checking cases, of which only
two classes are distinct:
\[
\text{if } s_1 = s_3:\ \ 2\cosh 2J = \Delta e^{J'}, \qquad
\text{if } s_1 = -s_3:\ \ 2 = \Delta e^{-J'}.
\]
The product and ratio of these two conditions give
\[
\Delta^2 = 4\cosh 2J, \qquad e^{2J'} = \cosh 2J. \tag{3.3}
\]
Using hyperbolic trig identities, the second condition can be rewritten as
\[
v' = v^2 \tag{3.4}
\]
where v ≡ tanh J ∈ [0, 1]. The map (3.4) is another
explicit example of an RG map on the parameters. In this case, unlike the previous
SAW example, it happens to be exact.
The RG preserves symmetries. Why is the effective hamiltonian of the same
form as the original one? The couplings like the magnetic field multiplying odd numbers
of spins vanish by the Ising spin-flip symmetry of the original model. (More precisely:
because of the locality of H, we can determine Heff by decimating only a finite number of
spins. This rules out generation of nonzero h0 by some version of spontaneous symmetry
breaking. This requires locality of the interactions.) This line of thinking leads us to
expect that the effective hamiltonian should generally have the same symmetries as
the original one.
The 4-spin interaction vanishes because in 1d, each site has only two neighbors with
whom it interacts, each of which has only one other neighbor. So that was a bit of an
accident.
This map has two fixed points. One is v⋆ = 0, which is βJ = 0, meaning infinite
temperature, or no interactions; this one is 'boring' from the point of view of the
study of many-body physics and collective phenomena, since the spins don't care about
each other at all. The other fixed point is v⋆ = 1, which is βJ = ∞, meaning zero
temperature or infinite interaction strength. This is a ferromagnetic fixed point where
it is very urgent for the spins to agree with each other. The fact that there is no
fixed point at a finite temperature means that there is no critical behavior in the 1d
nearest-neighbor Ising model; only at T = 0 do the spins align with each other.
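A two-line numerical check of this flow (mine, not from the notes): iterating the decimation map v → v² sends any v < 1 to the trivial fixed point, while v = 1 stays put.

```python
def rg_flow(v, steps=30):
    """Iterate the 1d decimation map v -> v**2 and return the trajectory."""
    traj = [v]
    for _ in range(steps):
        v = v * v
        traj.append(v)
    return traj
```

Even a coupling as strong as v = 0.999 collapses to the infinite-temperature fixed point after a few dozen steps, illustrating the absence of a finite-temperature critical point in 1d.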
More explicitly, how does the correlation length behave? In zooming out by a factor
of λ, it changes by
\[
\xi(v) = \lambda\,\xi(v') = 2\,\xi(v^2) \implies \xi = -\frac{K}{\log v} \overset{T\to0}{\to} \frac{K}{2}\, e^{2J/T} \tag{3.5}
\]
(where K is a constant not determined by this argument) which is finite for T > 0.10,11
Why did it happen that there is no critical point at T > 0? A point of view
which illuminates the distinction between 1d and d > 1 (and is due to Peierls and
now permeates theoretical condensed matter physics) is to think about the statistical
mechanics of defects in the ordered configuration.
Consider a favored configuration at low-temperature, where all spins point the same
way. Small deviations from this configuration require reversing some of the spins and
will cost energy 2J above the aligned configuration for each dissatisfied bond. In 1d,
a single dissatisfied bond separates two happy regions, and is called a kink or domain
wall. Notice that the energy is independent of the size of each happy region (which is
called a domain). n domains of reversed spins cost energy 4Jn, since each domain has
two boundary links.
In 1d, each region of spins that we re-
verse has two boundaries, a kink and an
antikink.
At T = 0, the state minimizes the en-
ergy and there is no reason to have any kinks. But at T > 0, we care about (i.e. the
macroscopic equilibrium configuration minimizes) the free energy F = E − T S, and
the fact that there are many kink configurations matters.
10 A log is a special case of a power law: Taylor expand v^ν in ν about 0.
11 Preview: near less weird fixed points, the correlation length will diverge like a power law,
ξ(T) ∼ (T − T_c)^{−ν} as T → T_c, instead of this weird function.
How many are there? If there are n segments of s = −1 in a sea of s = +1 then we
must decide where to place 2n endpoints. The number of ways to do this is:
\[
\Omega(n) \simeq \binom{N}{2n} = \frac{N!}{(2n)!\,(N-2n)!} \overset{1\ll n\ll N}{\sim} e^{N\log N - 2n\log 2n - (N-2n)\log(N-2n)}
\]
where in the last step we used Stirling's formula. So the free energy for 2n kinks is
\[
F(n) \simeq 4Jn - T\left(N\log N - 2n\log 2n - (N-2n)\log(N-2n)\right).
\]
In equilibrium, the free energy is minimized with respect to any variational parameters12
such as n, which happens when
\[
0 = \partial_n F \simeq 4J - 2T\log\frac{N-2n}{2n} \implies \frac{2n_{\rm eq}}{N-2n_{\rm eq}} = e^{-2J/T}, \quad \text{i.e.}\ \ \frac{2n_{\rm eq}}{N} \simeq e^{-2J/T}\ \text{for}\ T\ll J.
\]
The equilibrium density of kinks is therefore nonzero at any T > 0: in 1d the entropy gain
always beats the energy cost of a kink, and the order is destroyed.
Now let us compute the spin-spin correlation function of the 1d chain. Using s_j s_k = ±1,
each nearest-neighbor Boltzmann factor can be rewritten as e^{βJ s_j s_k} = cosh(βJ)(1 + v s_j s_k),
so that
\[
G(r) \equiv \langle s_i s_{i+r}\rangle = \frac{\mathrm{tr}\; s_i s_{i+r} \prod_{\langle jk\rangle}(1 + v s_j s_k)}{\mathrm{tr}\; \prod_{\langle jk\rangle}(1 + v s_j s_k)} \tag{3.6}
\]
where v ≡ tanh βJ (as above). Think about expanding this product over links into a
sum. Each term in the sum gets either a 1 or a v s_j s_k from each link. Any term in the
sum can be visualized by coloring the links which contribute a v s_j s_k.
When we multiply this out, the dependence on any one of the spins si can be only
two things: 1 if the term has an even number of factors of si , or si if it has an odd
number. Here’s the Ising model integration table:
\[
\sum_{s_i} 1 = 1 + 1 = 2, \qquad \sum_{s_i} s_i = 1 - 1 = 0. \tag{3.7}
\]
In the last two paragraphs, we haven’t used the restriction to 1d at all. (This will
be useful in §3.2.) Consider a single spin s2 of an infinite 1d chain; if it is not one of
the two sites i or i + r in (3.6) the factors which matter to it are13:
\[
\sum_{s_2} (1 + v s_1 s_2)(1 + v s_2 s_3) \overset{\rm FOIL!}{=} \sum_{s_2} \left(1 + v s_2(s_1+s_3) + v^2 s_1 s_3\right) = 2\left(1 + v^2 s_1 s_3\right).
\]
This is just of the same form as if we had a direct link between 1 and 3 with weight v²
(up to the overall prefactor). Therefore, doing this repeatedly (r times) for the sites in
between i and i + r,
\[
G(r) = \frac{\mathrm{tr}\; s_i s_{i+r}\, 2^r \left(1 + v^r s_i s_{i+r}\right)}{\mathrm{tr}\; 2^r \left(1 + v^r s_i s_{i+r}\right)} = v^r
\]
(the terms containing v^r s_i s_{i+r} are the ones that survive the trace in the numerator).
Therefore G(r) = v^r = e^{−r/ξ} with ξ = −1/log v, in units of the lattice spacing,
consistent with (3.5).
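This prediction is easy to test by brute-force enumeration (a check I added; the chain length and coupling are arbitrary). For an open chain with free ends, G(r) = v^r holds exactly:

```python
import itertools
import math

def correlator(N, J, i, j):
    """<s_i s_j> for an open chain of N spins with -H = J * sum_k s_k s_{k+1}, beta = 1,
    computed by summing over all 2**N configurations."""
    num = den = 0.0
    for s in itertools.product([-1, 1], repeat=N):
        w = math.exp(J * sum(s[k] * s[k + 1] for k in range(N - 1)))
        num += s[i] * s[j] * w
        den += w
    return num / den
```

Comparing against tanh(J)^{|i−j|} confirms the decimation result to machine precision.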
3.2 High-temperature expansion
Return now to the moment at (3.7), right before we restricted our discussion to one
dimension. We had written the partition function of the nearest-neighbor Ising model
(on any graph) as a product over links
\[
Z = \cosh^{N_\ell}(\beta J) \sum_s \prod_{\langle ij\rangle} (1 + v s_i s_j) \tag{3.8}
\]
(one factor of cosh βJ per link; N_ℓ denotes the number of links)
and argued that expanding this binomial gives a sum over paths in the graph. More
explicitly, we think of the two terms in each link factor in (3.8) as a sum over another
dynamical variable, nhiji = 0, 1:
\[
1 + v s_i s_j = \sum_{n_{ij}=0,1} (v s_i s_j)^{n_{ij}}.
\]
Now we can do the sums over the spins using our ‘integration table’ above (3.7).
For each spin, the sum is
\[
\sum_{s_i=\pm1} s_i^{\sum_{\langle i|j\rangle} n_{ij}} = 2\,\delta\!\left(\sum_{\langle i|j\rangle} n_{ij} \equiv 0 \bmod 2\right)
\]
where ⟨i|j⟩ denotes the links ending at site i.
\[
Z = \cosh^{N_\ell}(\beta J) \sum_s \prod_{\langle ij\rangle}(1 + v s_i s_j) = 2^N \cosh^{N_\ell}(\beta J) \sum_{C} v^{\sum_l n_l(C)}
\tag{3.9}
\]
That is: we sum over lattice curves which have an even number of links going into each
site. The contribution of a curve C (which is not necessarily connected) is weighted by
v length(C) .
This rewriting of the Ising partition sum will be useful below.
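As a sanity check (mine, not from the notes), on a graph small enough to enumerate — a single square, whose only even subgraphs are the empty set and the full loop — the loop expansion can be compared against the direct sum:

```python
import itertools
import math

LINKS = [(0, 1), (1, 2), (2, 3), (3, 0)]  # a single square: 4 sites, 4 links

def Z_direct(J):
    """Brute-force partition sum over the 2**4 spin configurations (beta = 1)."""
    return sum(math.exp(J * sum(s[a] * s[b] for a, b in LINKS))
               for s in itertools.product([-1, 1], repeat=4))

def Z_loops(J):
    """High-temperature expansion (3.9): 2^N cosh^{N_l}(J) * sum over closed curves.
    For the 4-cycle the only even subgraphs are C = {} and the full loop of length 4."""
    v = math.tanh(J)
    return 2 ** 4 * math.cosh(J) ** 4 * (1 + v ** 4)
```

The two agree identically at any coupling, as they must.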
Behavior of the correlation length under RG. We’ve defined the correlation
length using the spin-spin correlator G(r), in terms of its rate of falloff for large r. Let
us use this to examine its behavior under the RG more directly. To do this, denote
more explicitly
\[
G_H(r) \equiv \frac{\mathrm{tr}\; s_i s_{i+r}\, e^{-H}}{\mathrm{tr}\; e^{-H}}.
\]
Now suppose that i and i + r are both odd sites (so that they survive our decimation);
in that case we can still do all the decimation as in the partition function :
\[
G_H(r) \equiv \frac{\mathrm{tr}_{e,o}\; s_i s_{i+r}\, e^{-H(s_e,s_o)}}{\mathrm{tr}_{e,o}\; e^{-H(s_e,s_o)}}
= \frac{\mathrm{tr}_o\; s_i s_{i+r}\, \mathrm{tr}_e\, e^{-H(s_e,s_o)}}{\mathrm{tr}_o\, \mathrm{tr}_e\, e^{-H(s_e,s_o)}}.
\]
I emphasize that the argument of G_H is measured in units of the lattice spacing, i.e. the
number of lattice sites between the spins. But recall that e^{−H'(s_o)} ∝ tr_e e^{−H(s_e,s_o)} defines
the effective Hamiltonian for the remaining odd sites, so this is precisely
\[
G_{H'}(r/2) \equiv \frac{\mathrm{tr}_o\; s_i s_{i+r/2}\, e^{-H'(s_o)}}{\mathrm{tr}_o\; e^{-H'(s_o)}},
\]
where now there are only half as many sites in between the spins in the new coarser
lattice. Under this RG, we are zooming out by a factor of 2. Altogether, GH 0 (r/2) =
GH (r). Combining this with the definition of ξ, we have
\[
\xi_{H'} = \frac{1}{2}\,\xi_H \tag{3.10}
\]
(as we said earlier).
The notation ξH is to emphasize that the correlation length is completely deter-
mined by the Hamiltonian (I am assuming thermal equilibrium here). At a fixed point,
the Hamiltonian does not change under the RG, so the correlation length can’t either.
This can be consistent with (3.10) in one of two ways
ξ⋆ = 0 or ξ⋆ = ∞.
The first case means that spins at different sites do not care about each other, as at
T = ∞. I’ve already disparaged this case as boring. The second case of a divergent
correlation length characterizes critical behavior and we define it to be interesting.
[Cardy §3.4, Domany, RG notes, chapter 1] Free energy density. Next I want to
show how to calculate the free energy from an 'RG trajectory sum'. It is a reason to care
about the constants in the effective hamiltonian, as in the constant a' in
\[
e^{-H'(s')} = e^{-a'N'}\, e^{J'\sum s'_i s'_j}.
\]
In the example above, we found e^{−a'N'} = Δ^{N/2}, i.e. a' = −log Δ, where Δ was some function of the
microscopic J.
Let the free energy density (free energy per site) be
\[
f \equiv -\frac{T}{N}\log Z_N(K).
\]
Here I am denoting by K the collection of all couplings, and labelling the partition
function Z_N by the number of sites. More explicitly, split off the constant piece of the Hamiltonian,
\[
H(s) = NC + \tilde H(s),
\]
so that H̃ has no constant piece (for quantum mechanical folks: it is like a 'normal-
ordered' Hamiltonian). And Z̃_N ≡ Σ_s e^{−βH̃(s)}, and naturally we'll denote
\[
\tilde f \equiv -\frac{T}{N}\log \tilde Z_N.
\]
This last expression is a little less innocent than it seems: I am anticipating here
that the free energy is extensive – has a leading piece at large N that grows like N,
F \overset{N\gg1}{=} Nf + O(N⁰) – so that f̃ is independent of N in the thermodynamic limit.
(We'll give an RG-based proof of this statement in §5.) Then f(K) = C + f̃(K).
Now some RG content: the partition function is invariant under the RG:
\[
Z_N(K) = e^{-\frac{NC}{T}}\,\tilde Z_N(K) = e^{-\frac{NC}{T}}\, e^{-\frac{N^{(1)} a^{(1)}}{T}}\,\tilde Z_{N/b}(K^{(1)})
\]
Here we’ve defined N (n) to be the number of sites decimated at step n, and N/bn is
the number of sites remaining. For the example above, these are the same, and b = 2:
N (n) = N/2n . As above K (n) = Rn (K) is the image of the couplings under n-times
repeated RG transformation. (Notice that if we were in d dimensions, we would have
b = λd , where λ is the linear zoom factor, and the number of sites decimated would not
equal the number remaining even for λ = 2.) Taking logs of the BHS of the previous
equation,
\[
f(K) = C + \sum_{k=1}^{n} \frac{N^{(k)}}{N}\, a^{(k)} + \frac{1}{b^n}\,\tilde f(K^{(n)}). \tag{3.12}
\]
If we iterate the RG transformation enough times, and f̃^{(n)} is finite, its contribution is
suppressed by b^{−n} → 0.
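For the 1d chain the whole trajectory sum can be carried out numerically (my check, with β = 1): the constants come from (3.3), via a^{(k)} = −log Δ(J^{(k−1)}), and the result can be compared against the exact transfer-matrix answer f = −log(2 cosh J).

```python
import math

def rg_free_energy(J, steps=60):
    """Free energy per site of the 1d nearest-neighbor Ising chain (beta = 1),
    accumulated along the decimation RG trajectory as in (3.12)."""
    f = 0.0
    weight = 1.0  # fraction of the original N sites the current chain represents
    for _ in range(steps):
        Delta = 2.0 * math.sqrt(math.cosh(2.0 * J))  # Delta^2 = 4 cosh 2J, from (3.3)
        f -= 0.5 * weight * math.log(Delta)          # half the surviving sites decimated
        weight *= 0.5
        J = 0.5 * math.log(math.cosh(2.0 * J))       # J' from e^{2J'} = cosh 2J
    # the leftover, nearly-free spins contribute f_tilde -> -log 2
    return f - weight * math.log(2.0)
```

Since the 1d decimation is exact, the trajectory sum reproduces −log(2 cosh J) to machine precision at any coupling.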
Magnetization. The magnetization can be calculated by taking derivatives of the
previous result:
\[
M \propto \partial_h f = \langle s_i\rangle,
\]
but here is some cleverness. By translation invariance the BHS is independent of i.
Therefore, we can choose i to be a site that survives all the decimation. Then
\[
\langle s_i\rangle_H = \frac{\sum_s s_i\, e^{-H}}{\sum_s e^{-H}}
= \frac{\sum_{s_o} s_i \overbrace{\sum_{s_e} e^{-H(s_o,s_e)}}^{=\,e^{-H'(s_o)}}}{\sum_{s_o}\sum_{s_e} e^{-H(s_o,s_e)}}
= \frac{\sum_{s_o} s_i\, e^{-H'(s_o)}}{\sum_{s_o} e^{-H'(s_o)}} = \langle s_i\rangle_{H'}.
\]
We have just shown that the magnetization is an RG invariant. This result required
that we are using a decimation scheme, where the spins surviving the RG are a subset
of the initial spins. I will come back to alternatives soon, and we will see why we need
them. This means we can compute the magnetization for a macroscopic system just
by following the flow to the end:
\[
\langle s_i\rangle = \frac{\sum_{s_i=\pm1} s_i\, e^{-H^\infty(s_i)}}{\sum_{s_i=\pm1} e^{-H^\infty(s_i)}}
\]
but H^∞(s_i) = a^∞ + h^∞ s_i (these are the only two possible terms) and h^∞ is the fixed-
point value of the Zeeman field. So
\[
\langle s_i\rangle = \frac{\sum_{s_i=\pm1} s_i\, e^{-h^\infty s_i}}{\sum_{s_i=\pm1} e^{-h^\infty s_i}}
= \frac{-e^{+h^\infty} + e^{-h^\infty}}{e^{+h^\infty} + e^{-h^\infty}} = -\tanh h^\infty.
\]
I emphasize again that this works only for decimation schemes.
Let’s think about decimation of the Ising model on the square lattice. Again it is
bipartite, and we can do the sum of each spin on one of the sublattices fixing the spins
on the other, one at a time:
\[
\sum_{s_x=\pm1} e^{J s_x (s_1+s_2+s_3+s_4)} \equiv \psi(s_1+s_2+s_3+s_4).
\]
The argument of the function ψ defined by this equation only takes the values 0, ±2, ±4.
We’ve set the Zeeman field h = 0, so it is even ψ(−x) = ψ(x), and there are only three
values of the argument we care about. For these values, it can be written as
\[
\psi(s_1+s_2+s_3+s_4) = e^{a' + J'(s_1 s_2 + s_2 s_3 + s_3 s_4 + s_4 s_1 + s_1 s_3 + s_2 s_4) + M' s_1 s_2 s_3 s_4}
\]
with values of a', J', M' determined by J, which you can figure out. The first term a' is
just a constant. The first four terms multiplied by J' are nearest-neighbor interactions
on the new (square) lattice with lattice spacing √2 a (rotated by π/4). This means
λ = √2; the number of remaining spins is N/2, so b = λ^{d=2} = 2 as expected in two
dimensions. The next two terms are next-nearest-neighbor exchange couplings (s_1 and
s_3 are separated by 2a) of the same size. Finally, M' multiplies a qualitatively-new
4-spin interaction, proportional to J⁴. Ick!
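The three constants can be extracted explicitly by evaluating ψ on the three distinct cases of the spin sum (a short computation I added, assuming the parametrization in the equation above):

```python
import math

def block_couplings(J):
    """Solve psi(S) = exp(a' + J'*(six pair products) + M'*s1*s2*s3*s4)
    for a', J', M', using the three distinct values of S = s1+s2+s3+s4."""
    A = math.log(2 * math.cosh(4 * J))  # S = 4: pair sum = 6, quartic = +1
    B = math.log(2 * math.cosh(2 * J))  # S = 2: pair sum = 0, quartic = -1
    C = math.log(2.0)                   # S = 0: pair sum = -2, quartic = +1
    Jp = (A - C) / 8.0                  # from (a'+6J'+M') - (a'-2J'+M') = A - C
    Mp = Jp - (B - C) / 2.0
    ap = B + Mp                         # from a' - M' = B
    return ap, Jp, Mp
```

Note that the sum of the six pair products equals (S² − 4)/2, so ψ really depends only on S, and a small-J expansion gives M' = O(J⁴), the advertised four-spin coupling.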
This isn’t so bad if we think of the initial Hamiltonian as sitting in a special corner
of the large and high-dimensional space of possible couplings, and the RG just moves
us to a more generic point:
\[
(J, 0, 0, \cdots) \overset{R}{\mapsto} (J', K', M', \cdots).
\]
That’s just a little ugly. But there’s a reason why it’s objectively bad: we can’t repeat
this RG step. After the first iteration, we generate couplings between spins of the
same sublattice of the remaining square lattice. This means we can’t just sum them
independently anymore. We could do some uncontrolled truncation, or we can find a
better scheme. There are 2d lattices for which a decimation scheme can work (i.e. can
be iterated).
We can nevertheless persevere by truncating the generation
of couplings. For example, if we keep terms only to order
J² and order K, we do not generate any further couplings
beyond J, K, and we find a closed set of RG recursion equations:
\[
J' = K + 2J^2, \qquad K' = J^2.
\]
These equations have three fixed points: (J, K) =
(0, 0), (∞, ∞) and (1/3, 1/9). The nearby flow diagram is in-
dicated at right. Fixing the couplings and varying T amounts
to the replacement (J, K) to (J/T, K/T ). Increasing the tem-
perature corresponds to scaling J, K down towards K0 =
(0, 0), the infinite-temperature fixed point, where everyone is
decoupled. This point and the zero-temperature fixed point
(K∞ , where all couplings are infinite) are separated by a new
fixed point with a single relevant perturbation. Let’s focus on
just the relevant dimension (which is not orthogonal to the
temperature direction), so we can draw a one-dimensional
plot (after all, we are already ignoring infinitely many other
irrelevant directions). We see that there is a critical value
Tc below which we flow to K∞ , and above which we flow to
K0 . A fixed point with a single relevant operator describes
a critical point, a continuous phase transition between two
phases.
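These recursion relations are easy to iterate numerically (my sketch, not from the notes; the escape thresholds are arbitrary). Bisecting along the K = 0 axis locates the separatrix between the two basins of attraction:

```python
def flows_to_strong_coupling(J, K=0.0, steps=200):
    """Iterate J' = K + 2J^2, K' = J^2 and report which trivial fixed point wins."""
    for _ in range(steps):
        J, K = K + 2 * J * J, J * J
        if J > 10.0:   # headed to (infinity, infinity): the ordered phase
            return True
        if J < 1e-9:   # headed to (0, 0): the disordered phase
            return False
    return J > 1.0 / 3.0  # undecided after many steps: compare to the fixed point

def critical_J(lo=0.0, hi=1.0, iters=60):
    """Bisect along the K = 0 axis for the boundary between the two basins."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if flows_to_strong_coupling(mid) else (mid, hi)
    return 0.5 * (lo + hi)
```

Initial conditions slightly above the critical value flow to strong coupling and those slightly below flow to zero — the numerical signature of a single relevant direction at the nontrivial fixed point (1/3, 1/9).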
3.5 Low-temperature expansion, and existence of phase tran-
sition in d > 1
Maybe you still don’t believe me that there has to be a phase transition in the nearest-
neighbor Ising model, even in d = 2. At arbitrarily high temperatures, there is definitely
no spontaneous symmetry breaking, since each spin is just looking out for itself and
there can be no collective behavior, and hsi = m = 0. At T = 0, the spins all align
(as they do in d = 1, too). Here is an argument (due to Peierls, still) that the ordered
state survives to some finite temperature for d ≥ 2.
A configuration of lowest energy, say all si = +, has energy E0 = −JNl , where Nl
is the number of links of the graph (this is 2N for the square lattice since there are
two links in each unit cell, one up and one right). The minimal excitation above the
ordered configuration flips one spin and has energy E0 + 2zJ where z is the number
of neighbors of the flipped spin. We can estimate the entropy of a dilute gas of n such
flipped spins, with energy E(n) ∼ E_0 + 2Jzn; the number of configurations is again
approximately Ω(n) = \binom{N}{n}, and so their free energy is
\[
F \overset{\rm Stirling}{\sim} 2zJn - T\left(N\log N - (N-n)\log(N-n) - n\log n\right).
\]
(Actually, the flipped spins have a short-ranged attraction because if they are adjacent
they share a happy bond. We ignore this; think about why we can get away with it.)
This is minimized by an equilibrium density of flipped spins
\[
\frac{n_{\rm eq}}{N} \simeq e^{-2zJ/T}.
\]
All this so far is just like in the 1d argument, except we replaced 2 neighbors with z
neighbors, and counted spin flips rather than domain walls.14
Here’s the catch: The magnetization is not so strongly affected by a flipped spin as
it is by a domain wall. It is only decreased from the maximum (m = 1) to
neq
m=1−2 ' 1 − 2e−2zJ/T ' 1 if T zJ.
N
So this means that at low (but nonzero) temperature, the magnetization survives. And
therefore something interesting has to happen at some intermediate temperature.
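The minimization in this dilute-gas estimate can be done by brute force (my check, not from the notes; N = 10⁴ is an arbitrary system size), confirming n_eq/N ≃ e^{−2zJ/T}:

```python
import math

def equilibrium_flip_density(T, J, z, N=10000):
    """Scan n to minimize F(n) = 2zJ*n - T*ln binom(N, n), in its Stirling form."""
    def F(n):
        S = N * math.log(N) - (N - n) * math.log(N - n) - n * math.log(n)
        return 2 * z * J * n - T * S
    return min(range(1, N), key=F) / N
```

The exact stationarity condition is n/(N − n) = e^{−2zJ/T}, which reduces to the quoted density when the gas of flipped spins is dilute.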
[End of Lecture 6]
14 Why did we count domain walls in d = 1? Because in d = 1, the energy of a row of k flipped spins
is the same for any k. The elementary dynamical object is really the kink itself in d = 1.
This is the tip of an iceberg called 'fractionalization'.
3.6 A word from our sponsor
We’ve been spending a lot of time talking about Ising models. Let’s take a break and
talk about another role it plays in physics.
Lattice gas. Suppose our dynamical variables are the locations r1 ..rN of a collec-
tion of point particles. The grand canonical partition function is
X ζN Z P
Ξ(ζ) = dd r1 · · · dd rN e−β i<j V (ri −rj ) (3.13)
N
N!
where ζ is a fugacity for particle number, and V (r) is an interparticle potential, which
usually has a short-range repulsion and long-range attraction (most kinds of particles
find each other vaguely attractive from far away...). The kinetic energy was Σ_i p⃗_i²/(2m),
but we did the p⃗ integrals already: ∫ d^d p e^{−β p⃗²/(2m)} = (2πmT)^{d/2}.
These integrals in (3.13) are hard. If our interest is in critical behavior, we can
zoom out, and take the particles to live at the sites of a lattice r ∈ Λ, so our dynamical
variables are instead the number of particles at site r, n(r). To implement the short-
range repulsion, we take n(r) = 0, 1. Then we study
\[
\Xi_\Lambda(\zeta) = \sum_{\{n(r)=0,1\}} \zeta^{\sum_r n(r)}\; e^{-\frac{1}{2}\beta \sum_{r,r'} J_{r,r'}\, n(r) n(r')}
\]
where J(r−r0 ) implements the long-ranged part of the potential. If we change variables
to s(r) ≡ 2n(r) − 1 = ±1, we have
\[
\beta H(s) = -\frac{\beta}{2}\sum_{r,r'} J_{r,r'}\, s_r s_{r'} - \beta\sum_r h_r s_r + \text{const}
\]
with βh_r = ½ log ζ + β Σ_{r'} J_{r,r'}. This is an Ising model. The ferromagnetic ordering
transition is the liquid-gas transition! Recalling that this occurs at h = 0, we see that
the s → −s symmetry of the Ising model (with h = 0) is a symmetry of the lattice gas
only near the critical point – it is an ‘emergent symmetry’.
Another useful interpretation of the same model is as a ‘binary fluid’, where n = 0, 1
represent occupation by two kinds of fluid elements.
3.7 Duality
The dual lattice Λ̂ has a site at the center of each face of Λ (the square lattice is its own
dual). The domain walls of a spin configuration on the sites of Λ cover a set of links of Λ̂:
But our description of the low-temperature expansion on Λ as
\[
Z_\Lambda(T) = 2 \sum_{C} e^{-2\beta J\,\ell(C)} \tag{3.14}
\]
(up to a smooth prefactor; ℓ(C) is the total length of the domain walls C) has exactly
the same form as our high-temperature expansion (3.9) if we identify
\[
e^{-2\beta J} = \hat v \equiv \tanh \hat\beta J.
\]
The dual of the honeycomb lattice is the triangular lattice (and vice versa). To
learn their critical temperature, we add one more maneuver, called star-triangle trans-
formation: The honeycomb lattice is bipartite, and the two sublattices are triangular
lattices. By decimating one of the two sublattices, we can relate
\[
Z^{N}_{\rm hex}(J) = \Delta^{N/2}\, Z^{N/2}_{\rm tri}(K)
\]
Combining this with the duality relation we can relate the critical temperature of the
Ising model on the triangular lattice to itself.
Here is a table of the critical values of βJ for various lattices. z is the coordination
number, the number of neighbors of each site.

Λ            z    T_c/J
chain        2    0
honeycomb    3    1.52
square       4    2.27
triangular   6    3.64
The first entry is the 1d chain. You can see that the critical temperature rises with
coordination number.
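On the square lattice, which is self-dual, the critical point (assuming there is exactly one) must sit at the self-dual coupling, e^{−2K} = tanh K with K = βJ. A quick bisection (my check, not from the notes) reproduces the table entry T_c/J ≈ 2.27:

```python
import math

def dual_coupling(K):
    """Kramers-Wannier: exp(-2*Khat) = tanh(K), i.e. Khat = -0.5*log(tanh(K))."""
    return -0.5 * math.log(math.tanh(K))

def self_dual_point(lo=0.1, hi=2.0, iters=80):
    """Bisect for K* with dual_coupling(K*) = K*; dual_coupling is decreasing in K."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if dual_coupling(mid) > mid else (lo, mid)
    return 0.5 * (lo + hi)
```

The fixed point of the duality map is K⋆ = ½ log(1 + √2) ≈ 0.4407, i.e. T_c/J = 1/K⋆ ≈ 2.269, matching Onsager's exact result.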
Notice that the disordered (high-temperature) phase is dual to the ordered (low-
temperature) phase. That this is not a contradiction is related to the factor of 2 in
front of the partition sum in (3.14): the description in terms of domain walls doesn’t
really know about the magnetization.
If you can’t wait to learn more about the many generalizations of Kramers-Wannier
duality, here are some references: Kogut, Savit.
There is more to be said about this sum over curves. They can be used to solve the
2d Ising model exactly. They are the worldlines of free fermions.
3.8 Block spins

Previously, in the decimation schemes, the coarse-grained variables {s′} ⊂ {s} were a
subset of the microscopic variables. This is a special case of the more general blocking
rule
\[
e^{-H'(s')} \equiv \sum_s \prod_{\text{blocks } b} T(s'_b;\, s_{i\in b})\; e^{-H(s)}
\]
where T is a function which decides how the block spin s'_b depends on the spins s_{i∈b} in
the block. Decimation is the special case where we weight the opinion of one of the
spins over all the others:
\[
T_{\rm decimate}(s'_b;\, s_{i\in b}) = \delta_{s'_b, s_2}.
\]
Another option is majority rule:
\[
T(s'_b;\, s_{i\in b}) = \begin{cases} 1, & \text{if } s'_b \sum_{i\in b} s_i > 0 \\ 0, & \text{otherwise.} \end{cases}
\]
Notice that for each block, Σ_{s'=±1} T(s'; s) = 1 guarantees (3.15), i.e. that the partition
function is preserved, Σ_{s'} e^{−H'(s')} = Σ_s e^{−H(s)}. Furthermore, it is
4 Mean Field Theory
Mean field theory (MFT) is always simple and sometimes right, and it is all around
us in physics departments, so we must understand well when to believe it. We will see
that it goes bad near critical points, and the RG will come to our rescue. It is great
for getting a big picture of the phase diagram.
We’ll give three roads toward MFT, in order of decreasing squiggliness. For defi-
niteness, consider the Ising model, on any graph Λ:
\[
Z = \sum_s e^{-H(s)}, \qquad H(s) = -\frac{1}{2}\sum_{i,j\in\Lambda} J_{ij}\, s_i s_j - h\sum_i s_i.
\]
(I’ve put the 12 to agree with our previous definition of J, because here the sum is over
all i, j.) Mean field theory is an attempt to fulfill the urge everyone has to be able to
do the sums over the spins one at a time. If only J were zero, we could do this, for
example to compute the magnetization:
\[
m = \langle s\rangle = \frac{\sum_{s=\pm1} s\, e^{\beta h s}}{\sum_{s=\pm1} e^{\beta h s}} = \tanh\beta h. \tag{4.1}
\]
Now focus on one spin s_i; the terms in −H which contain it are s_i (h + Σ_j J_ij s_j) (the ½
is cancelled because each pair appears twice in the sum over i, j). From its point of view,
this is just like some external magnetic field depending on
what its neighbors are doing. What's s_j? Well, it's probably equal to its average
value, ⟨s_j⟩ = m. So let's just forget everyone else, and assume they are average and
incorporate them into an effective magnetic field:
\[
h_{\rm eff} \equiv \sum_j J_{ij}\, m + h.
\]
If we pretend that there is only one spin in the world, and this is the field it sees, then
we can compute m using (4.1):
\[
m = \tanh\beta h_{\rm eff} \overset{(4.1)}{=} \tanh\beta\left(zJm + h\right).
\]
Here I defined zJ ≡ Σ_j J_ij (for a nearest-neighbor model, z is the coordination number
and J the bond strength). This is an equation for m! We can solve it!
At least graphically or numerically we can solve it. Here is m (yellow) and tanh(zJm+
h) (blue) plotted versus m for two values of J (large and small compared to T , with
some small h)
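The graphical solution can be reproduced with a few lines of damped fixed-point iteration (mine, not from the notes; β = 1, so the only parameters are zJ and h):

```python
import math

def solve_mft(zJ, h, tol=1e-12, max_iter=100000):
    """Solve m = tanh(zJ*m + h) by damped fixed-point iteration (beta = 1).
    Converges to the solution favored by the sign of the applied field h."""
    m = 0.5 if h >= 0 else -0.5
    for _ in range(max_iter):
        m_new = math.tanh(zJ * m + h)
        if abs(m_new - m) < tol:
            break
        m = 0.5 * (m + m_new)  # damping helps convergence near zJ ~ 1
    return m
```

For zJ < 1 (high temperature) the only solution is m = 0; for zJ > 1 a nonzero magnetization survives even as h → 0⁺, which is the mean-field picture of the ordered phase.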
Here’s our second approach to MFT. Basically, here we will be more explicit about
what we’re leaving out (but it is the same as the previous discussion). We rewrite the
interaction term in the Ising hamiltonian as
We are going to treat the fluctuation about the mean δs as small. Then
1X X
Jij m(si + sj ) − m2 + h si + O(δs)2
−H =
2 ij i
1 2
X
= − N Jm + (zJm + h) si + O(δs)2 . (4.3)
2 i
N is the number of sites, and $zJ \equiv \sum_j J_{ij}$. The contribution $zJm$ to the external field
from the neighbors is sometimes called the ‘molecular field’. What we are neglecting
here (when we drop the O(δs)2 in a moment) is the correlations between the spins at
different sites i, j. This is not small if |ri − rj | < ξ, by definition of the correlation
length ξ. Brutally ignoring the correlations, then, we can do all the sums, and we have
$$Z \simeq e^{-\frac{1}{2}N\beta zJ m^2}\left(2\cosh\beta(zJm+h)\right)^N \equiv Z_{\rm MFT}.$$
Defining $f_{\rm MFT}(m) \equiv -\frac{T}{N}\log Z_{\rm MFT}$, I claim, and will prove next, that $f_{\rm MFT}(m) \geq f$: it is an upper bound on the correct free energy. This is true for every m, and so the best bound comes from minimizing over m. That condition gives back the equation for m (namely $m = \tanh\beta(zJm+h)$) that we got from self-consistency above. (And it will tell us what to do in the case of $zJ > T$, where there are three solutions.)
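To see concretely that minimizing reproduces the self-consistency condition, we can minimize the per-site free energy $f_{\rm MFT}(m) = \frac{zJ}{2}m^2 - T\log\left(2\cosh\beta(zJm+h)\right)$ (read off from $Z_{\rm MFT}$ above) on a grid. A sketch, with my own helper names:

```python
import math

def f_mft(m, T, zJ, h=0.0):
    """Per-site mean-field free energy: f = (zJ/2) m^2 - T ln(2 cosh(beta (zJ m + h)))."""
    beta = 1.0 / T
    return 0.5 * zJ * m * m - T * math.log(2.0 * math.cosh(beta * (zJ * m + h)))

T, zJ = 1.0, 2.0
grid = [i * 1e-4 for i in range(-10000, 10001)]
m_star = min(grid, key=lambda m: f_mft(m, T, zJ))
# the minimizer satisfies the self-consistency condition m = tanh(beta*(zJ*m + h))
residual = abs(m_star - math.tanh(zJ * m_star / T))
```
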
Our third approach is the variational method. [There is a good discussion of this
in Parisi’s book.] It will give our proof that fMFT (m) upper bounds f . The idea can
be found from a Bayesian viewpoint on statistical mechanics. Let's put this in a box:

Find the distribution $P_\star(s)$ which maximizes the entropy $S[P] \equiv -\sum_s P(s)\log P(s)$, subject to the constraint that the average energy is fixed: $E = \langle H \rangle_{P_\star} \equiv E[P_\star]$. The distribution should also be normalized, $\sum_s P(s) = 1$. We can impose these conditions with lagrange multipliers:
$$\Phi[P] \equiv S[P] + b\left(E[P]-E\right) + a\Big(\sum_s P(s)-1\Big) = -\sum_s P(s)\log P(s) + \sum_s \left(bH(s)+a\right)P(s) - bE - a$$
$$\frac{\delta\Phi[P]}{\delta P(s)} = -\log P(s) - 1 + bH(s) + a \implies P_\star(s) = e^{bH(s)+a-1}$$
where a, b must be determined to satisfy the two constraints.
If instead of fixing the average energy, we want to fix the temperature 1/β, what do we do? We should instead find the distribution $P_\star(s)$ which minimizes the free energy $F[P] \equiv E[P] - T S[P]$ as a functional of P. It is still normalized, so we need to use a lagrange multiplier again, and minimize
$$F_\lambda[P] \equiv F[P] + \lambda\Big(\sum_s P(s) - 1\Big)$$
from which we again recover the Boltzmann distribution, P (s) = e−βH(s) /Z (the mul-
tiplier λ is eliminated in favor of Z by normalizing).
This derivation is useful philosophically (for example, it evades all the vexing ques-
tions about ergodicity), and it also implies a variational bound on the free energy F .
That is, if we pick some arbitrary other distribution $P_{\text{off-the-street}}(s)$, then we know that its free energy is bigger than the correct equilibrium free energy:
$$F[P_{\text{off-the-street}}] \geq F[P_{\rm Boltzmann}] = F.$$
[End of Lecture 7]
So: to recover mean field theory, we choose a distribution which we like because we
know how to calculate its averages, that is, one which factorizes:
$$P_{\rm MFT}(s) = \prod_i p_i(s_i) \qquad (4.4)$$
Now we apply the variational bound. The free energy $F(m_i) \equiv F[P_{\rm MFT}^{(m_i)}]$ upper bounds the true free energy for any $m_i$, so we do best by minimizing it:
$$0 = \partial_{m_i} F = -\sum_j J_{ij} m_j - h_i + T\, {\rm arctanh}\, m_i$$
A better ansatz factorizes the distribution over blocks of sites, $P(s) = \prod_b p_b$, where b represents some blocks of sites. Such a state is more general than the MFT ansatz, and will have more variational parameters, and necessarily gives a better estimate of the correct free energy. Further thinking in this direction leads to cluster mean field theory and belief propagation algorithms.
On the form of the mean-field free energy. The most important conclusion from the mean field theory is that (for h = 0) there are two phases distinguished by whether or not the Z2 symmetry is spontaneously broken – at high T, we have m = 0, and at low T, $m \neq 0$. In between there is a phase transition$^{17}$, where m suddenly grows from zero. If we set h = 0 and study small m, we can expand $f_{\rm MFT}$ in m and find
$$f_{\rm MFT}(m) \simeq a + \frac{1}{2}Bm^2 + c\, m^4 + \ldots \qquad (4.5)$$
where a, c are constants. The coefficient B is
$$B \propto (1 - \beta zJ) \equiv b\, t,$$
where $t \equiv \frac{T-T_c}{T_c}$ is the "reduced" temperature. If c > 0, this function looks like one of the figures at right, where the top left figure is for $T > T_c^{\rm MF} = zJ$ and the bottom left
$^{17}$ In case I forgot to say so, a phase transition occurs when physical quantities are non-analytic in the parameters at some point in the parameter space – it means that Taylor expanding physics on one side of the phase transition gives the wrong answer (for something) on the other side.
figure is for T < the critical temperature. If c < 0, then we have to keep more terms in the
expansion to know what happens. (The right column is with h < 0.) So you can see
that the minimum of f occurs at m = 0 for $T > T_c$ (disordered phase) and $m \neq 0$ for $T < T_c$ (ordered phase). This figure makes it clear that the third solution of the MF
equations (at m = 0) that exists for T < Tc is a maximum of the free energy – it is
unstable.
[Parisi §4.3, 5.2] Before drawing any further physical conclusions from the MFT free
energy we just derived, let me say some words in defense of this form of the free energy
(4.5). These are the words (the idea is due to Landau; this is a paraphrase):
If the free energy is analytic near m = 0, it looks like this. So all that song and dance
about justifying mean field theory is really irrelevant to the conclusions we draw about
the phase transition from m = 0 (at T > Tc ) to m 6= 0 (at T < Tc ). The dependence
of B on T − Tc follows from (4.5) itself! With this assumption,
( fMFT (m) is the most
m 7→ −m
general answer, consistent with the symmetry under (at the same time).
h 7→ −h
So: the only real assumption leading to (4.5) is the analyticity of f (m). Some points:
(1) we will see immediately below that analytic f (m) does not mean that the physics
is analytic in external parameters – we can get critical behavior from this framework.
(2) When we find out that MFT gives wrong predictions for critical exponents, we will
have to find out how and why we get an f (m) which is not analytic. (3) The fact
that the coefficient of m2 is proportional to the deviation from the critical temperature
follows from our analysis of (4.5). The only input from the microscopic calculation
(with all the approximations above) is how the coefficients a, b, c, d depend on the
microscopic couplings. Notice that the actual magnetization $m = N^{-1}\sum_{i=1}^N \langle s_i \rangle$ is an average of numbers each ±1, and therefore lies between these two numbers. The
minimum of f (m) will not satisfy this constraint for all values of a, b, c, d... consistent
with the input above: this is a “UV constraint on IR physics” of the kind that the
string theorists dream about.
Types of phase transitions. A first order phase transition is one where the
minimum of the free energy jumps from one value to another, distant value, like if the
potential evolves as in this comic strip as a function of the parameter in question:
The two configurations need have nothing to do with each other, and there is no
notion of universal properties of such a transition. The correlation length need not
grow. This is what happens when we vary h from positive to negative, at nonzero
t < 0. The correlation length stays fixed, but the minimum jumps from −m0 to +m0
as h goes through zero (as in the comic strip above).
The alternative is a continuous phase transition which is more interesting, because
then, as we will see, there is a field theory which encodes a collection of universal
phenomena at and near the critical point.
(Sometimes, one hears about ‘nth-order’ phase transitions, where the nth derivative
of the free energy is discontinuous for various n ≥ 2, but I haven’t found the need to
distinguish between these. Moreover, it is only in mean field theory that the free
energy goes like integer powers of t (as in (4.6) below); more generally, taking enough
derivatives of the free energy will give a divergent (not just discontinuous) behavior
at the transition. So this more detailed ‘classification’ (due to Ehrenfest) is both
incomplete and not useful.)
Notice that when we say that ‘a transition is continuous’ it can depend on what
parameter we are varying: at T < Tc , as a function of the magnetic field, the transition
from one minimum to the other of the Ising model is first order. (This is what’s
illustrated in the comic above). But at h = 0, there is a continuous transition as T is
varied through Tc .
Here are some simple examples of the power of the LG point of view: If we break
the Ising symmetry the transition should generically be first order. This allows a cubic
term in the potential, and it means that as we cool from high temperatures, one of the
two minima at m 6= 0 will have f (m) < f (0) before (at a higher temperature than the
one where) f 00 (0) becomes negative.
A continuous transition is, however, not an inevitable consequence of Ising symmetry: if c < 0, then we must consider the $m^6$ term. Depending on the signs, there is a regime where the minima at $m \neq 0$ descend before $f''(0)$ goes negative.
Usually (but not always) TcMF > Tc , since the fluctuations we
are ignoring disfavor the ordered state. (Sometimes in fact Tc ≤ 0.)
Mean field critical exponents. The very fact that there is a notion of Tc in
MFT is worth remarking on. Lots of stuff is non-analytic at Tc !
Near $T_c$, we can expand
$$f(m) \simeq a + b\, t\, m^2 + c\, m^4 - \mu h\, m + \ldots$$
where $t \equiv \frac{T-T_c}{T_c}$ is the non-dimensionalized deviation from the critical temperature. Notice that a, b, c, µ really do depend on T, but only weakly (i.e., $a = a_0 + a_1 t + \cdots$).
When h = 0, the free energy is minimized when:
$$m = \begin{cases} 0, & t > 0 \\ \pm\sqrt{\frac{b}{2c}(-t)}, & t < 0. \end{cases}$$
(The exponent in $m \sim (-t)^{1/2}$ is called β, so $\beta_{\rm MFT} = 1/2$.) At t = 0 and small $h \neq 0$, minimizing instead gives $m \sim h^{1/3}$. This exponent is called δ (defined by $h \sim m^\delta$ at $t = 0$), and $\delta_{\rm MFT} = 3$. (I'm mentioning this botany of greek letters because there are people for whom these letters are close friends.)
Finally, the free energy density evaluated at the minimum, at h = 0, is
$$f(t) = \begin{cases} a, & t > 0 \\ a - \frac{(bt)^2}{4c}, & t < 0 \end{cases} \qquad (4.6)$$
which means that ∂t2 f jumps at the transition; this jump is actually an artifact of
MFT.
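These two exponents can be checked by brute-force minimization of the quartic free energy. A sketch (the choices b = c = 1 and the grid parameters are arbitrary illustrative values):

```python
def landau_f(m, t, h, b=1.0, c=1.0):
    """Quartic Landau free energy f = b*t*m^2 + c*m^4 - h*m (constant a dropped)."""
    return b * t * m * m + c * m ** 4 - h * m

def minimize(t, h):
    grid = [i * 1e-4 for i in range(-15000, 15001)]
    return min(grid, key=lambda m: landau_f(m, t, h))

# beta = 1/2: at h = 0 and t < 0, the minimum sits at m* = sqrt(-t/2) (for b = c = 1)
m_beta = abs(minimize(t=-0.01, h=0.0))    # expect sqrt(0.005) ~ 0.0707
# delta = 3: at t = 0, a small field gives m* = (h/4)^(1/3)
m_delta = minimize(t=0.0, h=1e-3)         # expect (2.5e-4)^(1/3) ~ 0.0630
```
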
Otherwise, the behavior in general predicted by MFT is good, but we’ll see that
the values of these exponents aren’t always right (and why and when, and then we’ll
understand how to fix them). In particular, mean-field critical exponents are always
rational numbers. In contrast, for the 3d Ising model, β = 0.326419(3), which isn’t
looking very rational. This value comes from the conformal bootstrap program to solve
and classify fixed points.
Notice that the critical exponents do not depend on the particular values of the
parameters a, b, c, µ · · · . This is one reason to hope that they can be understood, and
that they are universal in the sense defined earlier.
It is worth thinking about what the
extrema of this potential do as we vary
the parameters. At right is a plot of the
free energy evaluated at all of the critical
points of f (m) as h varies (the other cou-
plings are fixed to T < Tc ). (This picture
is sometimes called a ‘swallowtail’.) In-
set in red is the shape of the potential
at the corresponding value of h. Plot-
ted below is the corresponding magneti-
zation. Notice that the number of (real)
critical points goes from 1 to 3 as |h| is
decreased below some value; the two new
extrema are pair-produced from the com-
plex plane, that is, the new extrema come
in pairs and have a larger free energy. No-
tice further that ∂h2 f > 0 along the top
trajectory – this is the maximum near the
origin. The other one is actually a local minimum – a metastable state, responsible for
hysteresis phenomena at the first-order transition. More on the physics of this in §5.5.
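The pair-production of the extra extrema as |h| decreases can be seen by counting the real roots of f'(m). A sketch (t, b, c are fixed to illustrative values; for these values the pair appears at |h| ≈ 0.54):

```python
def count_critical_points(h, t=-1.0, b=1.0, c=1.0):
    """Count real roots of f'(m) = 2*b*t*m + 4*c*m^3 - h by sign changes on a grid."""
    def fprime(m):
        return 2.0 * b * t * m + 4.0 * c * m ** 3 - h
    grid = [i * 1e-3 for i in range(-3000, 3001)]
    vals = [fprime(m) for m in grid]
    return sum(1 for u, v in zip(vals, vals[1:]) if u * v < 0)

n_small_h = count_critical_points(h=0.1)   # below |h| ~ 0.54: three critical points
n_large_h = count_critical_points(h=1.0)   # above it: only one
```
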
LG Theory for other symmetries. Here is another illustration of the Power of Landau. We've been studying models with a Z2 symmetry acting by $m \mapsto -m$, $h \mapsto -h$. Suppose instead of this, we made the replacement Z2 → O(n), a rotation symmetry acting on a generalization of the magnetization with n components, $m \to m^a$; in that case the external field would be $h \to h^a$, and the transformation rule would be
$$m^a \mapsto R^{ab} m^b, \qquad h^a \mapsto R^{ab} h^b.$$
– the quadratic terms are completely independent of the other n − 1 components of the fluctuations $\delta m_2, \ldots, \delta m_n$! We'll see in a moment that this absence of a restoring force means that those degrees of freedom have infinite correlation length, everywhere in the ordered phase. They are called Goldstone modes.
[End of Lecture 8]
$^{18}$ Dachuan Lu reminds me that for some values of n, there can sometimes be extra invariants, such as $\epsilon_{i_1\cdots i_n} m^{i_1}\cdots m^{i_n}$.
‘Microscopic’ Landau-Ginzburg Theory. In our variational derivation of mean
field theory, we actually derived a stronger bound, since we allowed for spatially-varying
magnetization. Let’s combine the Landau point of view with the knowledge that the
free energy is extensive$^{19}$ to learn the answer without doing any work. Because F is
extensive, we can write the free energy as a sum over a contribution associated to each
lattice site, or patch of the lattice, $F = \sum_i f_i$, where $f_i$ depends on the magnetization
mi at site i and nearby sites. (Think about assembling the system from big enough
chunks.) If the correlation length is not so small, fi will vary smoothly and we can
approximate this as an integral: $\sum_i f(x_i) \simeq a^{-d}\int d^dx\, f(x)$. The integrand, in turn,
depends locally on the field and its derivatives. Translation invariance forbids any
explicit dependence on x:
$$F[m] = \int d^d x\, f\big(m(x), \vec\nabla m(x), \nabla^2 m(x), \ldots\big).$$
The expansion of the integrand in powers of m and derivatives begins
$$f_{LG} = V(m) + \kappa\, \vec\nabla m \cdot \vec\nabla m + \kappa'(\nabla^2 m)^2 + \ldots \qquad (4.7)$$
where $V(m) = a + Bm^2 + cm^4 + dm^6 + \ldots$ is the value when m is constant in space – it
contains all the information about the mean field treatment of phase transitions, some
of which we discussed above.
We will have a lot more to say about how to organize this expansion. So far it
is an expansion in powers of m (since we know that in the neighborhood of the critical
point m is small). It is also an expansion in the number of derivatives, something like
the dimensionless quantity a∇m, where a is the lattice spacing. If this quantity is
not small then we are asking the wrong question, because the ‘field’ we are treating
as continuous is varying rapidly on the scale of the lattice spacing a. The RG will
give us a better understanding of this expansion: we’ll see that operators with more
derivatives are more irrelevant (near any of the fixed points under discussion here).
The equation (4.7) contains an enormous amount of information. To better appre-
ciate it, let’s first discuss the mean-field treatment of the correlation function.
By the way, what exactly is the LG free energy? It is not convex in m, so how can
it be the actual free energy?
$^{19}$ I owe you some discussion of why this is the case. This happens in §5.1.
[Goldenfeld §5.6] The answer to this is that it is the free energy with the constraint
that the (coarse-grained) magnetization is fixed to be m(r):
$$e^{-\beta F_{LG}[m]} \equiv \sum_s e^{-\beta H(s)} \prod_{\text{blocks } r}\, \delta\Big(\sum_{i\in r} s_i - m(r)\, N_\Lambda(r)\Big). \qquad (4.8)$$
Here r denotes a block, and NΛ (r) is the number of sites in the block r. This is just
like the construction of the block-spin effective Hamiltonian. It is only more ambitious
in that we are hoping that m(r) is smoothly varying in r, which will be true if ξ > a.
So the LG free energy S can be regarded as (a parametrization of) the coarse-grained
free energy.
It is indeed analytic in m, since we need to do only a finite number of sums in (4.8).
And, also because there is only a finite number of sums, it need not be convex.
How do we get the actual, thermodynamic free energy (which is convex and need not be analytic in its arguments) from $F_{LG}$? We have to do the rest of the sums, the ones over m:
$$e^{-\beta F} = \sum_{\{s\}} e^{-\beta H(s)} = \sum_m e^{-\beta F_{LG}[m]}.$$
Because m(r) is a continuous variable, '$\sum_m$' is actually an integral, one for every block r:
$$\sum_m = \prod_r \int dm(r) \equiv \int [Dm]$$
where the right equation defines what we mean by such a ‘functional integral.’
Altogether, we have
$$Z = \int [Dm]\, e^{-\beta F_{LG}[m]}$$
– we have rewritten the partition function (in a regime of moderately large correlation length) in terms of a field theory functional integral. The quantity appearing in the exponent of such an integral,
$$Z = \int [Dm]\, e^{-S[m]},$$
is called the action.
4.2 Correlations; Ginzburg criterion for MFT breakdown
[Goldenfeld §5.7] You might think that the spirit of mean field theory is antithetical
to obtaining information about correlations between the spins, since after all that was
precisely what we ignored to do the sums. Not so!
Here's a first pass. The connected correlator (assume translation invariance) is $G(r) \equiv \langle s_r s_0 \rangle - \langle s_r \rangle\langle s_0 \rangle$, and summing it over r gives the susceptibility: $T\chi_T = \sum_r G(r)$. This is called the static susceptibility sum rule. It relates a thermodynamic quantity $\chi_T$ to an (integrated) correlation function. If the correlation length is big enough, ξ > a, then we can approximate the sum by an integral
$$\chi_T = \frac{1}{T a^d}\int d^d r\, G(r).$$
Is the integral well-defined? The lower limit of integration, the UV, is fine because
we are talking about a lattice model. When ξ is finite, the fact that the correlations fall off rapidly, $G(r) \overset{r \gg a}{\sim} e^{-r/\xi}$, means that the integral converges in the IR (the upper limit of integration) as well.
But: $\chi_T \to \infty$ at the critical point; in fact we saw above that $\chi_T \overset{\rm MFT}{\sim} \frac{1}{T-T_c} + \text{regular terms}$ as $T \to T_c$.$^{20}$ The only way this can happen consistently with the susceptibility sum rule is if ξ → ∞ as well at the transition. We'll see in a moment with what power it diverges.
it diverges.
MFT for G(r). We can actually do better and find the form of G(r) within the
mean field approximation. This is because G(r) is a response function. Here’s what
this means.
When h = 0, the correlation function is
$$\langle s_r s_0 \rangle = \frac{\sum_s s_r s_0\, e^{-H(s)} \cdot 1}{\sum_s e^{-H(s)} \cdot 1}$$
$^{20}$ If I keep chanting 'γ = 1' maybe I will remember these letters someday.
where we can write 1 cleverly as a sum over the values of the spin at the origin. Let $\sum'$ mean that we sum over all the spins but the one at 0, with $s_0 = +1$ fixed, and let $\langle \ldots \rangle_0$ denote expectation in this ensemble. Then the correlation function $\langle s_r s_0 \rangle$ is
just the magnetization at r, m(r) in response to an (infinite) applied field (completely)
localized to r = 0. In the presence of this localized source, m(r) will certainly depend
on its distance from the source. But the mean field equation (for r 6= 0) still takes the
form
!
X
m(r) = tanh β Jrr0 m(r0 )
r0
m1 X
' β Jrr0 m(r0 ) (r 6= 0) .
r0
In the second line, we retreated to small m, which is useful for T > J. (Otherwise
maybe we need some numerics.) We can do better and include the corrections at the
origin, by including a source:
$$m(r) = \beta\sum_{r'} J_{rr'}\, m(r') + A\,\delta_{r,0}.$$
Fourier transforming,
$$(1 - \beta\tilde J(k))\,\tilde m_k = A,$$
where
$$\tilde m_k \equiv \sum_{r\in\Lambda} e^{i\vec k\cdot\vec r}\, m(r), \qquad m(r) = \int_{BZ} \bar d^d k\; e^{-i\vec k\cdot\vec r}\, \tilde m_k.$$
In the inversion formula, the integral is over the Brillouin zone of the lattice Λ; for a cubic lattice, this just means each component $k_\mu \in (-\pi/a, \pi/a]$. The Fourier transform of the coupling is
$$\tilde J(k) \equiv \sum_r e^{i\vec k\cdot\vec r}\, J_{r,0}.$$
For example, for a cubic lattice with nearest-neighbor coupling J, this is $\tilde J_{\rm cubic}(k) = J\sum_{\mu=x,y,\ldots} 2\cos k_\mu a$, where a is the lattice spacing.
A necessary condition for its self-consistency is that the expected value of this term, calculated within MFT, is small compared to the MFT energy:
$$\langle \Delta H \rangle_{\rm MFT} \ll E_{\rm MFT}.$$
[End of Lecture 9]
We assume that $J_{rr'}$ has a smaller range than $G_{rr'}$ (i.e. R < ξ), so that we may approximate the RHS as
$$zJ\, G(0) = A\int_{BZ} \frac{\bar d^d k}{1 - \beta\tilde J(k)} \simeq \frac{A}{R^2\beta}\int_{|k|<a^{-1}} \frac{\bar d^d k}{k^2 + \xi^{-2}}. \qquad (4.10)$$
In a lattice model, the integral is over the Brillouin zone. The dangerous bit, where
the RHS can become big, though, comes from k → 0, which doesn’t care about your
lattice details. We used this in replacing G̃k with its long-wavelength approximation
in the last step of (4.10). In making this approximation, we may as well replace the
BZ integral with a simple cutoff |k| < a−1 since the form of the integrand is wrong for
|k| ∼ a−1 anyway.
To separate out the UV physics ($k \sim \frac{2\pi}{a}$) from the IR physics ($k \sim \frac{2\pi}{L}$), let's use the partial-fractions trick familiar from calculus:
$$\frac{1}{k^2+\xi^{-2}} = \frac{1}{k^2} - \frac{\xi^{-2}}{k^2(k^2+\xi^{-2})}$$
so that
$$I \equiv \int_{|k|<a^{-1}} \frac{\bar d^d k}{k^2+\xi^{-2}} = \underbrace{\int_{|k|<a^{-1}} \frac{\bar d^d k}{k^2}}_{\text{ind. of } T} - \xi^{-2}\int_{|k|<a^{-1}} \frac{\bar d^d k}{k^2(k^2+\xi^{-2})}.$$
The first term is a (possibly big, honking) constant, which doesn’t care about the
temperature or the correlation length. The second term is finite as a → 0 if d < 4
(finding that this integral is infinite as a → 0 just means that the short-distance stuff
at the lattice matters). (Note that the integral is finite as L → ∞ if d > 2.) When the
integral is finite, we can scale out the dependence on ξ (define x ≡ |k|ξ):
$$I \overset{\xi \gg a}{=} \text{const} + \xi^{2-d}\, K_d \int_0^\infty \frac{x^{d-3}\, dx}{x^2+1}$$
where
$$K_d \equiv \frac{\Omega_{d-1}}{(2\pi)^d}$$
is a ubiquitous combination of angular factors; Ωd is the volume of the unit d-sphere.
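The claimed $\xi^{2-d}$ scaling of the ξ-dependent piece of I is easy to confirm numerically in d = 3, where the radial integral has the closed form $I = \Lambda - \xi^{-1}\arctan(\Lambda\xi)$ (cutoff Λ, angular factors dropped). A sketch:

```python
import math

def I_of_xi(xi, d=3, cutoff=1.0, steps=200000):
    """Midpoint rule for the radial integral of k^{d-1}/(k^2 + xi^-2) up to the cutoff.
    (The angular factor K_d is dropped; it cancels in the comparison below.)"""
    s = xi ** -2
    dk = cutoff / steps
    total = 0.0
    for i in range(steps):
        k = (i + 0.5) * dk
        total += k ** (d - 1) / (k * k + s)
    return total * dk

# In d = 3: I = cutoff - (1/xi)*arctan(cutoff*xi), a xi-independent UV constant
# plus a piece scaling as xi^(2-d) = 1/xi.
xi = 100.0
dev = (1.0 - I_of_xi(xi)) * xi   # should approach arctan(cutoff*xi) ~ pi/2
```
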
So: the demand that the things we ignored be small corrections to the MFT energy computed within MFT requires
$$\frac{A T_c\, \xi^{2-d}}{R^2} \ll J t.$$
Remembering that we derived $\xi_{MF} = R\, t^{-1/2}$, we can write this condition purely in terms of the mean field correlation length. If the condition
$$\xi^{4-d} \ll R^4$$
is violated then mean field theory is wrong. (The $R^4$ on the RHS stands in for some quantities with the right dimensions which do not vary with t near the transition.)
So for sure this condition is violated if ever ξ → ∞ in d < 4. (Remember that d is
the number of space dimensions.)
Note that the condition depends on the range R of the interactions: MFT works
better for longer-range interactions, and in more dimensions.
Why does MFT improve with dimension? Recall the key step: we approximate the values of the neighboring spins by their average, $s_j \overset{?}{=} \langle s_j \rangle$, and treat the coefficient of $s_i$ as an effective 'molecular' field $h_i^{\rm eff} = J\sum_{\langle i|j\rangle} \langle s_j \rangle + h_i$.
More dimensions or longer range means more neighbors (for example, for the hypercubic lattice in d dimensions, each site has 2d neighbors); more neighbors means that there are more terms in the sum $J\sum_{\langle i|j\rangle} s_j + h_i$. If the correlations between the terms in the sum are small enough, the central limit theorem tells us that the fractional error
decays with the number of terms in the sum. And this assumption is self-consistent,
since in MFT the spins sj are statistically independent (the probability distribution
factorizes).
The preceding argument says that at asymptotically large d, MFT becomes more
and more correct. You saw on the homework that when the number of neighbors grows
with N (namely with all-to-all interactions), then MFT is exact. When d = 1 MFT
is completely wrong, since there is no ordering at all at finite T . So something must
happen somewhere in between. We’ve just learned that that somewhere is d = 4.
d = 4 is maybe not so exciting for statistical mechanics applications. However, the
same machinery can be used with one of the dimensions interpreted as time. For more
on this, I refer you to references on QFT (such as my 215C notes).
d = 4 = dc is called the upper critical dimension (in the sense that mean field theory
is correct for larger dimensions) for the Ising critical behavior (since we’ve been talking
about the case with Ising symmetry). More generally, the upper critical dimension can
be efficiently determined from the zoo of critical exponents as follows. The fractional
error in mean field theory can be rewritten as
$$\text{error} \sim \frac{\int_V d^d r\, G(r)}{\int_V d^d r\, m(r)^2} \qquad (4.11)$$
where V is a 'correlation volume', a region of space whose linear size is ξ. The numerator is $\int_V d^d r\, G(r) = T\chi_T \sim t^{-\gamma}$. The denominator is $\xi^d |t|^{2\beta} \sim t^{2\beta-\nu d}$, so the condition that (4.11) is small is
$$1 \gg t^{-\gamma-2\beta+\nu d} \implies d_c = \frac{2\beta+\gamma}{\nu}.$$
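Plugging in numbers (a trivial sketch): the mean-field exponents give $d_c = 4$, while the measured 3d Ising exponents make $(2\beta+\gamma)/\nu$ come out ≈ 3, reflecting the hyperscaling relation $2\beta+\gamma = \nu d$:

```python
def d_c(beta_exp, gamma, nu):
    """Upper critical dimension from the Ginzburg argument: d_c = (2*beta + gamma)/nu."""
    return (2.0 * beta_exp + gamma) / nu

# mean-field exponents: beta = 1/2, gamma = 1, nu = 1/2 give d_c = 4
dc_mf = d_c(0.5, 1.0, 0.5)
# measured 3d Ising exponents (beta ~ 0.326, gamma ~ 1.237, nu ~ 0.630) give ~3,
# i.e. the hyperscaling relation 2*beta + gamma = nu*d is satisfied at d = 3
dc_3d = d_c(0.326, 1.237, 0.630)
```
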
Continuum field theory
Along the way in the preceding discussion of correlation functions in mean field
theory, we showed the following, which is a useful summary of the whole discussion,
and makes contact with the microscopic Landau-Ginzburg theory. Consider the simple
case where
$$J_{ij} = \begin{cases} J, & r_{ij} \leq R \\ 0, & r_{ij} > R. \end{cases}$$
Then we showed that the contribution to the mean-field free energy from the interaction
term is
$$-\Delta f_{MF}[m] = \sum_{ij} J_{ij}\, m_i m_j$$
$$= -J\frac{a^2}{4}\sum_i \sum_{|\delta|\leq R}\left[\left(\frac{m_{i+\delta}-m_i}{a}\right)^2 - \left(\frac{m_{i+\delta}+m_i}{a}\right)^2\right]$$
$$= -J\frac{a^2}{4}\sum_i \sum_{|\delta|\leq R}\left(\frac{m(r_i+\delta)-m(r_i)}{a}\right)^2 + \underbrace{O(m^2)}_{\text{correction to } V(m)}$$
$$\overset{\rm Taylor}{\simeq} -J\frac{a^2}{4}\sum_i \sum_{|\delta|\leq R}\left(\frac{\vec\delta\cdot\vec\nabla m(r_i)}{a}\right)^2 + O(m^2)$$
$$\simeq -\frac{zJR^2}{4}\int \frac{d^d r}{a^d}\left(\vec\nabla m\right)^2 + O(m^2)$$
where z is the coordination number of the lattice. Comparing this to our 'local' Landau-Ginzburg expression (4.7), we've learned that the constant in front is
$$\kappa \simeq \frac{R^2 zJ}{4 a^d} = \frac{R^2 T_c^{MF}}{4 a^d}.$$
For the case of a localized source, h(x) = δ(x), (and ignoring the interaction terms $m^{n>1}$) the solution in Fourier space
$$\tilde m_k = \frac{(2\kappa)^{-1}}{k^2 + bt/\kappa}$$
gives back $\xi^{-1} = \sqrt{bt/\kappa}$. You might think that ignoring the higher powers of m is OK near the critical point, since m is small; this assumption gives back mean field theory (which we've already seen is not always correct).
In case you’re not comfortable with this derivation of the continuum field theory
description of Ising models with large correlation length, another approach is outlined
on the problem set.
$^{21}$ For those of you who are not at home with variational calculus, please see the sidebar on the subject at §4.2.1.
Return for a moment to our discussion of the LG theory of a system with an O(n)
symmetry. Recall that in the ordered phase, we found that n − 1 of the modes did not
appear in the quadratic term of the LG free energy. Now you can see why I said that
the existence of these Goldstone modes implied that the correlation length was infinite
everywhere in the ordered phase.
4.2.1 Sidebar on Calculus of Variations
$^{22}$ If you are unhappy with thinking of what we just did as a use of the chain rule, think of time as taking on a discrete set of values $t_i$ (this is what you have to do to define calculus anyway) and let $x(t_i) \equiv x_i$. Now instead of a functional $S_V[x(t)]$ we just have a function of several variables $S_V(x_i) = \sum_i V(x_i)$. The basic equation of calculus of variations is even more obvious now:
$$\frac{\partial x_i}{\partial x_j} = \delta_{ij}$$
and the manipulation we did above is
$$\delta S_V = \sum_j \delta x_j\, \partial_{x_j} S_V = \sum_j \delta x_j\, \partial_{x_j} \sum_i V(x_i) = \sum_j \sum_i \delta x_j\, V'(x_i)\, \delta_{ij} = \sum_i \delta x_i\, V'(x_i).$$
5 Festival of rigor
Let us pause in our assault on field theory to collect some Facts that we know for sure
about the free energy of short-ranged lattice models. As with any rigorous, formal
results in physics, it will be crucial to understand the hypotheses.
[Parisi pp. 41-42] The Ising model free energy is extensive, F/N = f + terms which
go to zero as the number of sites N → ∞. In particular, in the thermodynamic limit,
the bulk free energy density f doesn’t care about boundary conditions. This assumes
that J is short-ranged: Jr,0 is either of finite support (system-size-independent range),
or falling off sufficiently rapidly in r.
Here is an RG-inspired proof of this result. We begin with a finite system, with N
sites.
First, notice that the hamiltonian H(s) is bounded: $|H(s)| \leq N D$ for some constant D (for the near-neighbor Ising model on a cubic lattice it's J for each link, so D = dJ).
We can bound the free energy, too, by realizing that the number of configurations
is finite – for a finite lattice with N sites, there are only 2N of them. Each one
contributes an energy below the maximum value, and above the minimum value. If
all 2N configurations achieved the max/min value, we get the smallest/biggest possible
values of the partition function:
$$2^N e^{-\beta N D} \leq Z_N \leq 2^N e^{\beta N D}.$$
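The bound can be checked by brute-force enumeration for a small 1d ring (a sketch; the parameter values are arbitrary):

```python
import math
from itertools import product

# 1d nearest-neighbor Ising ring: H = -J sum_i s_i s_{i+1}, so |H| <= N*D with D = d*J = J.
N, J, beta = 8, 1.0, 0.7
D = J  # d = 1

def energy(s):
    return -J * sum(s[i] * s[(i + 1) % N] for i in range(N))

Z = sum(math.exp(-beta * energy(s)) for s in product([1, -1], repeat=N))
lower = 2 ** N * math.exp(-beta * N * D)
upper = 2 ** N * math.exp(+beta * N * D)
```
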
exist.) Take L ≫ R, the range of the interactions. Let $Z_L^F$ be the partition function for this chunk.
Now we try to double the (linear) size of the system, by gluing together the right
number (2d ) of smaller chunks of size L. Gluing just means that we add the terms in
the hamiltonian which couple the sites across the interface. The number of terms we
have to add is Ld−1 R for each interface (each pair of chunks) we glue, and we have to
glue 2d interfaces. The magnitude of the contribution of each term is bounded by D.
Therefore
$$\left(Z_L^F\right)^{2^d} e^{-\beta D\, 2d\, L^{d-1} R} \leq Z_{2L}^F \leq \left(Z_L^F\right)^{2^d} e^{+\beta D\, 2d\, L^{d-1} R}.$$
Taking the log and dividing by $(2L)^d$ gives
$$\left| f_{2L} - f_L \right| \leq \frac{2d\, D\, R}{2^d\, L},$$
so the free energy density converges as we iterate this doubling. Again when we take the log and divide by the volume $L^d$, the terms proportional to $\tilde\Delta \equiv \Delta + T\ln 2$ are suppressed by a factor of L.
Thermodynamic limit
We conclude that in a system in d dimensions of linear size L, with short-range interactions, the free energy takes the form:
$$F = L^d f_b + L^{d-1} f_\partial + O(L^{d-2}),$$
$$f_b = \lim_{L\to\infty} \frac{F}{L^d}, \qquad f_\partial = \lim_{L\to\infty} \frac{F - L^d f_b}{L^{d-1}}.$$
$f_\partial$ is a boundary free energy density.
Two questions to ponder:
1. What should we hold fixed in the limit L → ∞? In a fluid, we might want to fix the density of particles, $\rho = N_{\rm particles}/L^d$. If we instead fix $N_{\rm particles}$, we get a boring (zero-density) fluid.
2. How can the thermodynamic limit fail to exist? We consider a class of examples
where it might fail next.
In vacuum, φ(r) would be $\frac{e}{r}$. We will determine it self-consistently. The electron
number density is proportional to the probability p(r), and must approach the average
density far away (where φ → 0), so
n(r) = n∞ e−βeφ(r) .
This is just the equation we solved in (4.12) to find the correlation function G(r) away
from the critical point, at finite $\xi^{-2} = 4\pi\beta e^2 n_\infty$, and the solution is
$$\phi(r) = \frac{e}{r}\, e^{-r/\ell_D} \equiv \frac{e_{\rm eff}(r)}{r}. \qquad (5.3)$$
The name of the correlation length in this case is
$$\ell_D \equiv \sqrt{\frac{T}{4\pi e^2 n_\infty}},$$
the Debye screening length.
the Debye screening length. In the second equality in (5.3) I introduced a distance-
dependent effective charge eeff (r): how much charge you see depends how closely you
look.
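One can check directly that the screened potential (5.3) solves the linearized equation away from the origin: writing $u(r) = r\phi(r)$, screening means $u'' = u/\ell_D^2$ for r > 0. A finite-difference sketch (the parameter values are arbitrary):

```python
import math

e_charge, ell_D = 1.0, 2.0   # arbitrary illustrative values

def u(r):
    """u(r) = r*phi(r) = e * exp(-r/ell_D); screening means u'' = u/ell_D^2 for r > 0."""
    return e_charge * math.exp(-r / ell_D)

# central finite difference for u'' at r = 1
r, dr = 1.0, 1e-3
u_second = (u(r + dr) - 2.0 * u(r) + u(r - dr)) / dr ** 2
mismatch = abs(u_second - u(r) / ell_D ** 2)
```
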
The continuum approximation we've used here is consistent with classical corpuscles if the average interparticle distance is small compared to the screening length:
$$n_\infty^{-1/3} \ll \ell_D,$$
which is true when $e^3\sqrt{n_\infty} \ll T^{3/2}$, i.e. at high enough temperature, consistent with our approximation in (5.2).
You might worry that a collection of charges of both signs, once we let them all
move around, might either implode or explode. This paper by Lieb, called The Stability
of Matter, is very interesting and not too forbidding. The early sections are about the
stability of matter to implosion, which is a short-distance issue (whose resolution cru-
cially involves quantum mechanics and the Pauli principle and hence is off-limits here);
but Section V contains a ‘rigorous version of screening’ which removes the concern that
matter should want to explode like in (5.1).
Other power laws. Suppose instead of Coulomb interactions in d = 3, we have
particles interacting pairwise via a potential $U(r) = \frac{A}{r^\sigma}$ in d dimensions. Then the energy of a collection of particles with density ρ(r), in a ball of radius R, $B_R$, is
$$E(R) = \frac{1}{2}\int_{B_R} d^d r \int_{B_R} d^d r'\, \rho(r)\, U(r-r')\, \rho(r')$$
$$\overset{\text{uniform } \rho}{\simeq} A\,\frac{\rho^2}{2}\int_{B_R} d^d r \int_{B_R} d^d r' \frac{1}{|r-r'|^\sigma}$$
$$= A\,\frac{\rho^2}{2}\, R^{2d-\sigma}\, C(d,\sigma) \qquad (5.4)$$
where
$$C(d,\sigma) \equiv \int_{B_1} \frac{d^d x\, d^d y}{|x-y|^\sigma}.$$
In the last step we scaled out the system-size dependence of the integral by defining $r \equiv Rx$, $r' \equiv Ry$. This C is just a dimensionless number – if it's finite. In that case, the 'bulk energy density' (free energy density at T = 0) is $E(R)/R^d \propto R^{d-\sigma}$, which survives the thermodynamic limit only if this does not grow with R.
[Goldenfeld §2.6] We’re going to prove some facts about the nearest-neighbor Ising
model, with Hamiltonian
$$H(s) = -J\sum_{\langle ij\rangle} s_i s_j - h\sum_i s_i. \qquad (5.5)$$
(1) With the additive normalization in (5.5), the bulk free energy density is negative: f < 0. Note that this normalization means
$$\sum_s H(s) = 0 \qquad (5.6)$$
– it is normal-ordered.
Proof of (1): Begin with N < ∞ sites. The free energy density is $f = F/N = -\frac{T}{N}\log Z$, so the claim f < 0 means Z > 1. The partition function $Z = \sum_s e^{-\beta H(s)}$ is a sum of $2^N$ positive terms (for 0 < T < ∞). And Z > 1 because there exists a configuration $s^\star$ which by itself contributes a term $e^{-\beta H(s^\star)} > 1$. For example, for J > 0, h > 0, it happens when $s_i^\star = 1, \forall i$. But more generally, it follows from the normal-ordering condition (5.6): since H(s) is not identically zero, there must be configurations with both signs of H(s), and at least one which has $H(s^\star) < 0$.
(2) The entropy density is
s = −∂T f ≥ 0.
∂x f is non-increasing (and in particular the derivative ex-
ists almost everywhere). f can have cusps.
up. On the other hand, as a function of T = 1/β, the free energy f (T ) = −T ln Z(T )
is indeed anti-convex.
A useful alternative viewpoint: anticonvexity follows by showing that all second
derivatives of f are negative. For example,
$$\partial_\beta^2 f = -\frac{1}{\beta N}\left\langle (H - \langle H \rangle)^2 \right\rangle \leq 0$$
is proportional to minus the specific heat, aka the variance of the energy. Similar
statements hold for other variations, such as the magnetic susceptibility
$$\partial_h^2 f = -c\left\langle (s - \langle s \rangle)^2 \right\rangle \leq 0.$$
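Anti-convexity in h can be checked by exact enumeration of a small system (a sketch; the discrete second difference of f(h) is negative, as the variance formula requires):

```python
import math
from itertools import product

# 4-site Ising ring: f(h) = -(T/N) ln Z(h) should be anti-convex in h.
N, J, T = 4, 1.0, 1.5

def f(h):
    beta = 1.0 / T
    Z = sum(math.exp(beta * (J * sum(s[i] * s[(i + 1) % N] for i in range(N))
                             + h * sum(s)))
            for s in product([1, -1], repeat=N))
    return -T * math.log(Z) / N

# discrete second derivative in h: minus the susceptibility, so it should be negative
eps = 1e-3
second_diff = (f(0.3 + eps) - 2.0 * f(0.3) + f(0.3 - eps)) / eps ** 2
```
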
which interpolates between the two ensembles. By similar steps as above, ln Z(t) is convex in t. Convexity of a function implies that it lies above any of its tangents, and in particular$^{24}$
$$\ln Z(1) \geq \ln Z(0) + \partial_t \ln Z(t)\big|_{t=0}, \quad \text{i.e.} \quad F \leq F_0 + \langle H - H_0 \rangle_0.$$
On the right hand side we then have a bound on the free energy in terms only of easy-to-compute quantities. (Consider what happens in the case of the Ising model, if we take $H_0 = \sum_i s_i h_i$.)
so that the magnetization is
$$m = -\partial_h f = \begin{cases} m_s + O(h^{\sigma-1}), & h > 0, \\ -m_s + O(h^{\sigma-1}), & h < 0. \end{cases}$$
(If σ were not larger than one, the magnetization would diverge as h → 0 and that’s
not happening, since it’s bounded (|m| ≤ 1). I also imposed f (h) = f (−h) by Ising
symmetry.)
But before the thermodynamic limit, f (h) is a smooth function. This means the
two limits h → 0, N → ∞ are clashing violently:
$$\lim_{N\to\infty}\lim_{h\to 0}\frac{1}{N}\partial_h F = 0 \qquad\text{but}\qquad \lim_{h\to 0}\lim_{N\to\infty}\frac{1}{N}\partial_h F = \pm m_s.$$
Yang-Lee singularities. Here is a toy model of how this can come about. Suppose
our system of volume V is so tightly bound that only two configurations matter, the one
where all N spins point up, m = +V , and the one where they all point down, m = −V .
(All the rest of the configurations have such a large energy that we can ignore their
contributions to Z.) So a single spin s = ±1 determines the whole configuration.
Then, in a field, we have
$$Z(h) = \sum_{s=\pm 1} e^{\beta h V s} = 2\cosh\beta hV$$
and
$$f(h) = -\frac{T}{V}\log\left(2\cosh\beta hV\right), \qquad m(h) = -\partial_h f = \tanh\beta hV \xrightarrow{V\to\infty} m(h) = \mathrm{sign}(h).$$
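The clash of limits is easy to see numerically in this toy model (a quick sketch, not part of the notes):

```python
import math

def m(h, V, beta=1.0):
    # magnetization of the two-configuration toy model: m = tanh(beta h V)
    return math.tanh(beta * h * V)

# at fixed small h > 0, m -> sign(h) = +1 as V grows ...
for V in (10, 100, 1000, 10000):
    print(V, m(0.01, V))
# ... but at fixed V, m -> 0 as h -> 0: the limits do not commute
print(m(1e-12, 1000))
assert m(0.01, 10_000) > 0.999
assert abs(m(1e-12, 1000)) < 1e-6
```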
p0 (s) = Z −1 e−βH(s)
then the Ising symmetry H(s) = H(−s) implies directly that the magnetization van-
ishes:
$$m = \langle s\rangle \stackrel{?}{=} \langle s\rangle_0 \equiv \sum_s p_0(s)\,s = 0.$$
What gives? Consider, at small h > 0 and finite N , the ratio of the probabilities
of two configurations: a reference configuration s, and the one related to it by a global
spin reversal. If $m(s) \equiv \frac{1}{N}\sum_i s_i$ is the magnetization in this configuration, then
‘locality’. SSB means that cluster decomposition fails for the symmetric distribution.
Only the non-symmetric ‘pure states’ with q = 0, 1 satisfy this demand (this is the
definition of ‘pure state’ in this context).
[Goldenfeld, §4] First, let's recall some thermodynamics facts. I will speak in the language of fluids, but with appropriate substitutions of letters, it can be translated into the physics of magnets or other examples. At fixed volume, the free energy which is minimized in equilibrium is the Helmholtz one (the one we've been talking about), $F(T, V, N) = E - TS$. If instead we fix the pressure $P$, the quantity which is minimized in equilibrium is the Legendre transform of $F$, named for Gibbs:
$$G(T, P, N) = F + PV,$$
in terms of which the first law of thermodynamics is
$$dG = -S\,dT + V\,dP + \mu\,dN.$$
The Gibbs-Duhem relation (basically, integrating the first law) says $E = -PV + TS + \mu N$, so that in fact $G = \mu N$ is just proportional to the chemical potential.
Let’s consider a situation at fixed P where there is
a first order transition, between two phases I, II (for
example, liquid and gas) where the order parameter
is the volume, or the density (equivalently at fixed N ,
since V = N/ρ). Along the phase boundary, where
they exchange dominance, we must have
GI = GII . (5.8)
Hence also µI = µII ; this is a condition for chemical equilibrium of the two phases.
and therefore we get the Clausius-Clapeyron equation for the slope of the coexistence curve
$$\left.\frac{dP}{dT}\right|_{\text{coexistence}} = \frac{S_I - S_{II}}{V_I - V_{II}}.$$
The difference in the numerator is propor-
tional to the latent heat of the transition, T ∆S =
T (SI − SII ). If phases I and II are not somehow
topologically distinguished (for example, by a dif-
ferent symmetry-breaking pattern), then there can
be a critical endpoint of the line of first-order transitions, where ∆S → 0, ∆V → 0, at
some (Tc , Pc ).
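For scale, here is a back-of-envelope check of Clausius-Clapeyron (a sketch, not from the notes, using standard textbook values for water at its normal boiling point):

```python
# water at its normal boiling point (standard textbook values)
L_vap = 2.26e6   # latent heat of vaporization, J/kg
T_boil = 373.0   # K
dV = 1.67        # V_gas - V_liquid per kg of water, m^3/kg

# dP/dT = Delta S / Delta V, with Delta S = L / T
dPdT = (L_vap / T_boil) / dV
print(dPdT)      # ~3.6e3 Pa/K, consistent with steam tables
assert 3.0e3 < dPdT < 4.0e3
```

So the boiling temperature shifts by about 1 K per 3.6 kPa of pressure change, which is why altitude noticeably changes cooking times.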
The consequence of a first-order transition de-
pends on what is held fixed as the transition is
traversed. If we heat a fluid at constant pres-
sure P < Pc (for example atmospheric pressure),
starting from T < Tc (moving along the red verti-
cal line in the figure, and doing so slowly enough
that we stay in the equilibrium phase diagram
at every step) then first the fluid expands and
warms up. When it reaches the coexistence curve
Tcoexistence (P ), it starts to boil. While this hap-
pens, the energy goes into the latent heat convert-
ing I into II, and the temperature stays fixed: we
are sitting at the point (Tcoexistence (P ), P ) on the
coexistence curve in the $(P, T)$ phase diagram, while the fraction $x$ of the fluid which is gas grows:
$$V = x V_g + (1 - x)V_l,$$
where $x = x(t)$ is some protocol-dependent function. Although $V_l \neq V_g$, the volume of the fluid itself does not jump. How do I know this? Bear with me a moment; the proof is at Eq. (5.9).
If instead we compress the fluid at constant $T$, starting at $T > T_c$ in the gas phase:
$$-\frac{1}{V}\left.\frac{\partial V}{\partial P}\right|_T \equiv \kappa_T > 0$$
– a positive compressibility says that it fights back. It fights back until the volume reaches $V = V_g(T)$, which is when $P = P_{\text{coexistence}}(T)$, beyond which the fluid starts to condense.
What do these isothermal curves look like? Let $v = V/N = 1/\rho$ be the volume per particle. For an ideal gas, recall that $Pv = T$. This is correct in general at high temperature. For lower temperatures, van der Waals suggested some appealingly simple corrections which account for an interparticle interaction described by a potential like we discussed in §3.6:
• each particle wants some amount of personal space, and therefore excludes some
fixed volume b: v → v − b.
• the energy per particle is decreased by the long-range attractive part of the potential by an amount proportional to the density:
$$\frac{E}{N} \to \frac{E}{N} - a\rho \implies P = -\partial_V F \to P - \frac{a}{v^2}.$$
• It has a critical T = Tc below which there is a line of first order phase transitions.
The critical point appears when $P(v) = \text{const}$ goes from having one solution ($T > T_c$, like the ideal gas) to having three. When this happens, $\partial_v P = \partial_v^2 P = 0$, so that locally $P \sim (v_c - v)^3$ is cubic. In fact, for the vdW equation of state, this condition is exactly a cubic equation for $v$: $P_0 v^3 - v^2(bP_0 + T) + av - ab = 0$.
• (Relatedly), it has regions where $\kappa_T = -\frac{1}{V}\left.\frac{\partial V}{\partial P}\right|_T < 0$, which says that if you try to squeeze it, it doesn't fight back, but rather tries to help you squeeze it further. Creepy! (The same thing happened in our study of the Landau-Ginzburg free energy in §4.1, and this led to the picture of the swallowtail.)
• Note by the way that the vdW equation is a masterpiece of estimation: a, b can
be determined from high-temperature data and they give a (not bad) estimate
of the location of the critical point.
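These critical values can be verified directly. A sketch (mine, not from the notes; $a, b$ are arbitrary illustrative values): the standard vdW results $v_c = 3b$, $T_c = 8a/(27b)$, $P_c = a/(27b^2)$ make both $\partial_v P$ and $\partial_v^2 P$ vanish:

```python
a, b = 2.0, 0.5   # illustrative vdW parameters (any positive values work)

def P(v, T):
    # van der Waals equation of state, k_B = 1
    return T / (v - b) - a / v**2

vc, Tc, Pc = 3 * b, 8 * a / (27 * b), a / (27 * b**2)
assert abs(P(vc, Tc) - Pc) < 1e-12   # the critical isotherm passes through (vc, Pc)

# both dP/dv and d2P/dv2 vanish at the critical point (central differences)
eps = 1e-4
dP = (P(vc + eps, Tc) - P(vc - eps, Tc)) / (2 * eps)
d2P = (P(vc + eps, Tc) - 2 * P(vc, Tc) + P(vc - eps, Tc)) / eps**2
print(dP, d2P)   # both ~0: locally P - Pc ~ (vc - v)^3
assert abs(dP) < 1e-6 and abs(d2P) < 1e-4
```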
so the area under the $V(P)$ curve is zero (and is the change in the Gibbs free energy), along any path in equilibrium. This is true even for infinitesimal paths. Therefore, the actual equilibrium trajectory of the free energy is a straight line between $V_g$ and $V_l$. This is the Maxwell construction. It saves the convexity of the free energy.
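The equal-area condition can be implemented numerically. Here is a minimal sketch (mine, not from the notes), using the vdW equation of state in reduced units, $P = 8T/(3v-1) - 3/v^2$, for which the critical point sits at $v = T = P = 1$: pick $T < 1$ and solve for the pressure $P_\star$ at which $\int_{v_l}^{v_g}(P(v) - P_\star)\,dv = 0$.

```python
def P(v, T):
    # van der Waals EOS in reduced units: critical point at v = T = P = 1
    return 8 * T / (3 * v - 1) - 3 / v**2

def bisect(f, lo, hi, n=100):
    # simple bisection; assumes f changes sign on [lo, hi]
    for _ in range(n):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def outer_roots(Pstar, T):
    # liquid and gas volumes with P(v) = Pstar, bracketing the unstable region
    vl = bisect(lambda v: P(v, T) - Pstar, 0.40, 0.71)
    vg = bisect(lambda v: P(v, T) - Pstar, 1.50, 30.0)
    return vl, vg

def area(Pstar, T, steps=5000):
    # equal-area condition: integral of (P - Pstar) between the outer roots
    vl, vg = outer_roots(Pstar, T)
    dv = (vg - vl) / steps
    return sum((P(vl + (i + 0.5) * dv, T) - Pstar) * dv for i in range(steps))

T = 0.9
Pstar = bisect(lambda p: area(p, T), 0.45, 0.70, n=40)
print(Pstar, outer_roots(Pstar, T))  # coexistence pressure ~0.65 in reduced units
```

The brackets for the root-finders are chosen by hand for $T = 0.9$ (they must separate the liquid branch, the unstable region, and the gas branch); a production version would locate the spinodals first.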
The creepy self-squeezing regions of the equation-of-state curve are
exactly the ones which are removed by the phase-coexistence region.
At left here, I’ve made some pictures where a decreasing fraction
of the dots are colored red, in an attempt to depict the history of the
volume fraction of one phase in the other as the coexistence region is
traversed. What’s wrong with this picture? How could you make it
more realistic?
Notice that we are making a strong demand of equilibrium here, ef-
fectively taking t → ∞ before N → ∞. This failure of commutativity of
these limits is the same issue as in our discussion of ergodicity-breaking
above.
6 Field Theory
Now we are going to try to see where Landau and Ginzburg could have gone wrong
near the critical point.
Here is a hint, from experiment. The hard thing about the critical point, which
mean field theory misses, is that fluctuations at all scales are important. I know this
because I’ve seen it, e.g. here and (with better soundtrack) here. Critical opalescence
is a phenomenon whereby a two-fluid mixture which is otherwise transparent becomes
opaque at a continuous phase transition. (The difference in densities of the two fluids
plays the role of the order parameter.) It is explained by the scattering of light by the
density fluctuations at all scales, at least at all the wavelengths in the visible spectrum.
These are the fluctuations we’re leaving out in mean field theory.
At this point I want to remind you about the derivation of field theory that you
made for homework 5. There, you studied the Legendre transform of the free energy
$F[h]$ at fixed field:
$$S[m] = F[h] - \sum_r m_r h_r\,\Big|_{m = +\partial_h F}.$$
It’s easy to get confused about Legendre transforms and all that stuff, so it’s very
helpful to appeal to a simpler narrative of the origin of field theory, by exploiting
universality. Recall at the beginning of our discussion of Ising models in §3, I mentioned
the many avatars of the Ising model. One I mentioned arose by considering a real-valued
variable φx at each point in space (or on some lattice).
That is: suppose we replace each spin sx by such a real variable, a factor in whose
probability distribution is
p0 (φx ) ∝ e−βV (φx ) (6.1)
where V (φ) ∼ g(φ2 − 1)2 for large g. This probability distribution is basically zero
unless φ = ±1, so this is no change at all if g is big enough. Important piece of
foreshadowing: we are going to see that a large g at the lattice scale is not at all the
same as a large gφ4 term in the coarse-grained action.
So we replace
$$\sum_s (\cdots) \equiv \prod_x \sum_{s_x = \pm 1} (\cdots) \;\rightsquigarrow\; \prod_x \int d\phi_x\, p_0(\phi_x)(\cdots) \equiv \int D\phi\; e^{-\beta\sum_x V(\phi(x))}(\cdots)$$
The nearest-neighbor ferromagnetic Ising Hamiltonian becomes (up to an additive constant, using $s^2 = 1$)
$$-J\sum_x\sum_{\mu=1}^d\left(s_{x+\hat\mu}s_x - 1\right) = \frac{J}{2}\underbrace{\sum_x}_{\simeq a^{-d}\int d^dx}\sum_{\mu=1}^d (s_{x+\hat\mu}-s_x)^2 \;\rightsquigarrow\; \frac{J}{2}\sum_x\sum_{\mu=1}^d \underbrace{(\phi_{x+\hat\mu}-\phi_x)^2}_{\simeq a^2(\partial_\mu\phi)^2}.$$
That is: the ferromagnetic coupling makes the nearby spins want to agree, so it adds
a term to the energy which grows when the nearby φx s disagree.
Altogether, we are going to replace the Ising partition function with
$$Z = \sum_s e^{-\beta H(s)} \;\rightsquigarrow\; \int[D\phi]\, e^{-\int d^dx\,\mathcal L(\phi)}$$
where (I am calling the LG free energy density L for ‘Landau’ or for ‘Lagrangian’.)
$$\mathcal L(\phi) = \frac{\kappa}{2}(\nabla\phi)^2 + \frac{r}{2}\phi^2 + \frac{g}{4!}\phi^4 + h\phi + \cdots$$
Our hope is that the operation does not take us out of the basin of attraction of the
Ising critical point. The constants κ, r, g are related in some way (roughly determinable
but not the point here) to the microscopic parameters. For some physical situations
(such as high energy particle physics!) this is a better starting point than the lattice
model. There is some coarse-graining involved in the operation, and therefore the
dependence of κ, r, g on β needn’t be linear, but it should be analytic. After all, the
miraculous phenomenon we are trying to understand is how physics can be non-analytic
in T at some finite value of T ; we don’t want to assume the answer.
Mean field theory arises by making a saddle point approximation: find the $m$ which minimizes $S[\phi]$, $0 = \frac{\delta S}{\delta\phi}\big|_{\phi=m}$, and make a (functional) Taylor expansion of the exponent about the minimum:
$$Z = \int[D\phi]\,e^{-S[\phi = m+\varphi]} = \int[D\varphi]\,\exp\left(-\left(S[m] + \frac{\delta S}{\delta\phi_x}\Big|_{\phi=m}\varphi_x + \frac12\,\frac{\delta^2 S}{\delta\phi_x\delta\phi_y}\Big|_{\phi=m}\varphi_x\varphi_y + \cdots\right)\right) \tag{6.3}$$
In the second line I used the fact that the change of variables φ = m + ϕ has unit
Jacobian. I also used a matrix notation, where the position indices x, y are repeated
indices, and hence are summed. The saddle point condition means that the term in
the exponent linear in ϕx vanishes.
The mean field theory answer is just Z0 = e−S[m] . The first correction to mean field
theory comes by keeping the quadratic term and doing the gaussian integral:
$$Z_1 = Z_0\int[D\varphi]\, e^{-\frac12\int_x\int_y \varphi_x K_{xy}\varphi_y}$$
$$K_{xy} \equiv \frac{\delta^2 S}{\delta\phi_x\delta\phi_y}\Big|_{\phi=m} = \left(r + \frac{g}{2}m^2 - \kappa\nabla^2\right)\delta^d(x-y).$$
I absorbed the constant C into the − log λ0 which we can choose to our advantage. So
the leading correction to the mean-field free energy gives
$$F^{(1)}[h] = F_{MF}[h] + \frac12\sum_\lambda \log\frac{\lambda}{\lambda_0}.$$
Who are the eigenvalues of the kinetic operator $K$? If $h$ and hence $m$ are constant, the problem is translation invariant, and they are plane waves, $u_q(x) = \frac{1}{\sqrt V}e^{i\vec q\cdot\vec x}$ – the eigenvalue equation (6.4) is
$$\int_y \delta(x-y)\left(r + \frac g2 m^2 - \nabla^2\right)u_q(y) = \underbrace{\left(r + \frac g2 m^2 + q^2\right)}_{=\lambda_q} u_q(x).$$
So
$$F^{(1)}[h] = F_{MF}[h] + \frac V2\int \bar d^d q\,\log\frac{r + \frac g2 m^2(h) + q^2}{r + q^2}$$
where I made a choice of λ0 to be λ(m = 0).
Making the Legendre transform (a little tricky, and requiring us to ignore terms
of the same size as the corrections to the first order approximation), we have Γ[m] =
V γ(m) with the answer to this order
$$\gamma^{(1)} = \frac12 r m^2 + \frac{g}{4!}m^4 + \frac12\int\bar d^dq\,\log\frac{r + \frac g2 m^2 + q^2}{r + q^2}. \tag{6.5}$$
Shift of critical point, Ginzburg criterion revisited. So what? First let’s
use this to recover the Ginzburg criterion. The susceptibility, at h = 0, for T > Tc is
χ = ∂h m|h=0 which (as you’ll verify on the homework) is related to the curvature of
the effective potential γ by
$$\frac{1}{\chi}\Big|_{m=0} = \partial_m^2\gamma\big|_{m=0} = r + \frac g2\int\bar d^dq\,\frac{1}{q^2+r}.$$
The phase transition happens when the correlation length goes to infinity; we showed
by the susceptibility sum rule (4.9) that ξ → ∞ is required by χ → ∞. So, while
in mean field theory the critical point occurs when r → 0, the fluctuation corrections
we’ve just derived shift the location of the critical point to
$$0 \stackrel{!}{=} \chi^{-1}(T_c) = r(T_c) + \frac g2\int\bar d^dq\,\frac{1}{q^2 + r(T_c)}.$$
You’ll show on the homework that we can eliminate the (annoying, non-universal any-
way) parameter r from the discussion and relate the susceptibility near the transition
to the non-dimensionalized temperature t = T −T Tc
c
:
Z
1 g d 1
= c1 t 1 − d̄ q 2 2 .
χ 4 q (q + r)
for some constant c1 . Everywhere here we are ignoring terms which are as small as
the corrections to the gaussian approximation. Since if g were zero, the integral would
be exactly gaussian (ignoring even higher order terms like φ6 for now), the corrections
must come with powers of g.
When is the correction to MFT actually small? The shift in the critical point is of order $g\,G(0) = g\int\bar d^dq\,\frac{1}{q^2(q^2+t)} + \text{const}$, which is the same quantity we found in our
6.2 Momentum shells
So the analog of the partition function after a single blocking step is the following:
Break up the configurations into pieces:
$$\phi(x) = \int\bar d^dk\, e^{ikx}\phi_k \equiv \phi_< + \phi_>.$$
Here φ< has nonzero fourier components only for |k| ≤ Λ/b for some b > 1 and φ> has
nonzero fourier components only in the shell Λ/b ≤ |k| ≤ Λ. These two parts of the
field could be called respectively ‘slow’ and ‘fast’, or ‘light’ and ‘heavy’, or ‘smooth’
and ‘wiggly’. We want to do the integral over the heavy/wiggly/fast modes to develop
an effective action for the light/smooth/slow modes:
$$e^{-S_{\rm eff}[\phi_<]} \equiv e^{-\int d^dx\,\mathcal L(\phi_<)}\int[D\phi_>]\,e^{-\int d^dx\,\mathcal L_1(\phi_<,\phi_>)}, \qquad Z_\Lambda = \int_{\Lambda/b}[D\phi_<]\,e^{-S_{\rm eff}[\phi_<]}$$
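The split is just a sharp filter in Fourier space. A one-dimensional numerical sketch (assuming numpy; not from the notes): the decomposition is exact, and the slow and fast pieces are orthogonal because their supports in $k$-space are disjoint.

```python
import numpy as np

N, b = 128, 2.0
rng = np.random.default_rng(0)
phi = rng.normal(size=N)            # a field configuration on N lattice sites
phik = np.fft.fft(phi)
k = np.fft.fftfreq(N)               # mode frequencies; Lambda = max |k| = 1/2
Lam = np.abs(k).max()

slow = np.abs(k) <= Lam / b         # 'smooth' modes: |k| <= Lambda / b
phi_slow = np.fft.ifft(np.where(slow, phik, 0)).real
phi_fast = np.fft.ifft(np.where(~slow, phik, 0)).real

# the decomposition phi = phi_< + phi_> is exact ...
assert np.allclose(phi, phi_slow + phi_fast)
# ... and the two pieces are orthogonal (disjoint supports in k-space)
assert abs(np.sum(phi_slow * phi_fast)) < 1e-8
```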
6.3 Gaussian fixed point
In the special case where the action is quadratic in φ, not only can we do the integrals,
but the quadratic action is form-invariant under our coarse-graining procedure.
Consider
$$S_0[\phi] = \int d^dx\,\frac12\,\phi(x)\left(r_0 - r_2\partial^2\right)\phi(x) = \int_0^\Lambda\bar d^dk\,\frac12\,\phi(k)\phi(-k)\left(r_0 + r_2k^2\right).$$
The coefficient r2 of the kinetic term (I called it κ earlier) is a book-keeping device that
we may set to 1 by rescaling the field variable φ if we choose. Why set this particular
coefficient to one? One good reason is that then our coarse-graining scheme will map
Ising models to Ising models, in the sense that the kinetic term is the continuum
P
representation of the near-neighbor Ising interaction J hiji si sj .
We can add a source $\sum_q^\Lambda h_q\phi_{-q}$ to compute
$$Z[h] = \left\langle e^{-\sum_q^\Lambda h_q\phi_{-q}}\right\rangle = Z[0]\,e^{\frac12\sum_q\frac{h_qh_{-q}}{q^2+r}}$$
and
$$\langle\phi_q\phi_{q'}\rangle = \frac1Z\frac{\partial}{\partial h_{-q}}\frac{\partial}{\partial h_{-q'}}Z[h]\Big|_{h=0} = \frac{1}{q^2+r}\,\delta_{q+q'} = G(q)\,\delta_{q+q'}.$$
We can relate the parameter r to a physical quantity by our friend the susceptibility
sum rule:
$$\chi \stackrel{\rm Gaussian}{=} \int d^dx\,G(x) = G(q=0) = \frac1r.$$
Here’s what I mean by form-invariant: because S0 does not mix modes of different
wavenumber, the integrals over the fast and slow modes simply factorize:
$$e^{-S_{\rm eff}[\phi_<]} = \int[D\phi_>]\,e^{-S_0[\phi_>]-S_0[\phi_<]} = Z_>\,e^{-S_0[\phi_<]}$$
– the effective action for the slow modes doesn’t change at all, except that the cutoff
changes by $\Lambda \to \Lambda/b$. To make the two systems comparable, we do a change of rulers:
$$\Lambda' \equiv b\Lambda, \qquad \phi'_q \equiv b^{\frac{d-2}{2}}\phi_{bq}$$
so that
$$S_{\rm eff} = \int_0^\Lambda \bar d^dq\,\frac12\,\phi'_q\phi'_{-q}\left(q^2 + r'\right)$$
where $r' = b^2 r$.
What we just showed is that this RG we've constructed maps the quadratic action to itself. There are two fixed points, $r_\star = \infty$ and $r_\star = 0$. The former is the high-temperature disordered state. Near the latter fixed point, the parameter $r$ is relevant and grows as we iterate the RG. No other terms (besides a constant) are generated. We could say there is another fixed point at $r_\star = -\infty$, which could describe the ordered phase, but with $g = 0$, the integral is not well-defined for $r < 0$.
This is the same calculation we did of the random walk, the very first calculation
we did, with a lot more labels! The linear term in φ (the external magnetic field here)
would be relevant, just like the bias term in the random walk that we introduced in
§2.1. It is forbidden by the Ising symmetry.
Following the general RG strategy, once we find a fixed point, we must study the
neighborhood of the fixed point.
Just as with the spin sums, the integrals are hard to actually do, except in a gaussian
theory. But again we don’t need to do them to understand the form of the result. We
use it to make an RG. As usual there are two steps: coarse-graining and rescaling.
First give it a name:
$$e^{-\int d^dx\,\delta\mathcal L(\phi_<)} \equiv \int[D\phi_>]\,e^{-\int d^dx\,\mathcal L_1(\phi_<,\phi_>)} \tag{6.7}$$
where we include all possible terms consistent with the symmetries ($\phi_\lessgtr \to -\phi_\lessgtr$, $h \to -h$, rotation invariance^{27}). Then we can find an explicit expression for $\mathcal L_1$:
$$\int d^dx\,\mathcal L_1(\phi_<,\phi_>) = \int d^dx\left(\frac12\kappa(\partial\phi_>)^2 + \frac12 m^2(\phi_>)^2 + g_4(\phi_>)^3\phi_< + \ldots\right)$$
27
Why impose rotation invariance here? For now, it’s for simplicity. But (preview) we will see
that the fixed points we find are stable to rotation-symmetry breaking perturbations. It's an emergent
symmetry.
(I write the integral so that I can ignore terms that integrate to zero, such as ∂φ< ∂φ> .)
This is the action for a scalar field φ> interacting with itself and with a (slowly-varying)
background field $\phi_<$. But what can the result $\delta\mathcal L$ of integrating out $\phi_>$ be but something of the form (6.9) again, with different coefficients?^{28} The result is to shift the couplings
gn → gn + δgn . (This includes the coefficient of the kinetic term and also of the higher-
derivative terms which are hidden in the ... in (6.9). You will see in a moment the logic
behind which terms I hid.)
Finally, so that we can compare steps of the procedure to each other, we rescale our rulers. We'd like to change units so that $\int^{\Lambda/b}$ is an $\int^\Lambda$ with different couplings; we accomplish this by changing variables: $k' = bk$, so now $|k'| < \Lambda$. So $x' = x/b$, $\partial' \equiv \partial/\partial x' = b\,\partial_x$, and the Fourier kernel is preserved: $e^{ikx} = e^{ik'x'}$. Plug this into the action^{29}:
$$S_{\rm eff}[\phi_<] = \int d^dx\,\left(\mathcal L(\phi_<) + \delta\mathcal L(\phi_<)\right) = \int d^dx'\,b^d\left(\frac12 b^{-2}(\partial'\phi_<)^2 + \sum_n\,(g_n + \delta g_n)\,(\phi_<)^n + \ldots\right)$$
We can make this look like $\mathcal L$ again (with $r_2 = 1$) by rescaling the field variable: $b^{d-2}(\partial'\phi_<)^2 \equiv (\partial'\phi')^2$ (i.e. $\phi' \equiv b^{\frac12(d-2)}\phi_<$):
$$S_{\rm eff}[\phi'] = \int d^dx'\left(\frac12(\partial'\phi')^2 + \sum_n\,(g_n+\delta g_n)\,b^{d - \frac{n(d-2)}{2}}(\phi')^n + \ldots\right)$$
where the power of $b$ multiplying the $n$th coupling is $b^{\Delta_n}$ with
$$\Delta_n \equiv \frac{n(2-d)}{2} + d.$$
Ignore the interaction corrections, δgn , for a moment. Then we can keep doing this
and take b → ∞ to reach macroscopic scales. Then, as b grows, the couplings with
∆n < 0 get smaller and smaller as we integrate out more shells. If we are interested
in only the longest-wavelength modes, we can ignore these terms. They are irrelevant.
Couplings (‘operators’) with ∆n > 0 get bigger and are relevant.
The ‘mass term’ $r\phi^2$ has $n = 2$, and $r' = b^2 r$ is always relevant for any $d < \infty$.
28
Again we apply the Landau-Ginzburg-Wilson logic. The idea is the same as in our discussion of
blocking for the Ising model. The result is local in space because the interactions between the slow
modes mediated by the fast modes have a range of order b/Λ. The result is analytic in φ< at small
φ< and there is no symmetry-breaking because we only integrate the short-wavelength modes.
^{29} Really, the coefficient of $(\partial'\phi_<)^2$ should be $b^{-2}(1+\delta\kappa)$. But $\delta\kappa$ turns out to be $O(g^2)$, so let's ignore it for now.
This counting is the same as dimensional analysis: demand that βH is dimension-
less, and demand that the kinetic term (∂φ)2 stays fixed. Naive (length) dimensions:
– couplings with negative length dimension are relevant. This result is the same as
engineering dimensional analysis because we’ve left out the interaction terms. This is
actually correct when gn = 0, n ≥ 3, which is the gaussian fixed point.
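The counting can be tabulated in a few lines (a quick sketch, not from the notes):

```python
def Delta(n, d):
    # Gaussian scaling dimension of the coupling g_n multiplying phi^n
    return n * (2 - d) / 2 + d

for d in (3, 4, 5):
    print(d, [(n, Delta(n, d)) for n in (2, 4, 6)])

assert all(Delta(2, d) == 2 for d in (2, 3, 4, 5))  # the mass term: always relevant
assert Delta(4, 3) == 1    # phi^4 relevant below d = 4 ...
assert Delta(4, 4) == 0    # ... marginal at d = 4 ...
assert Delta(4, 5) == -1   # ... irrelevant above
assert Delta(6, 3) == 0    # phi^6 is marginal in d = 3
```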
An important conclusion from this discussion is that there is only a finite number
of marginal and relevant couplings that we must include to parametrize the physics.
Further, if the interactions produce small corrections, they will not change a very
irrelevant operator to a relevant operator. This should mitigate some of the terror you
felt when we introduced the horrible infinite-dimensional space of hamiltonians M at
the beginning of the course.
Another important conclusion is that the gaussian Ising critical point is stable to
interactions in d > 4. It is of course unstable in the sense that rφ2 is relevant. And it is
unstable if we allow terms with odd powers of φ which break the Ising symmetry. But
what is the smallest-dimension operator which we haven’t added and which respects
the Ising symmetry? According to our Gaussian counting, each derivative counts for
+1, and each power of φ counts for 2−d 2
. If we demand rotation invariance (or even
just parity) so we can’t have a single derivative, the next most important perturbation
is g4 φ4 . Its dimension is ∆4 = 4 − d – it is irrelevant if d > 4 and relevant if d < 4. We
could have expected this, since it coincides with the breakdown of mean field theory
– above the upper critical dimension, the interactions are irrelevant and MFT gives a
correct accounting of the fixed point. In d = 4, the φ4 term is marginal, and it is an
opportunity for small interaction corrections to decide its fate.
[End of Lecture 12]
6.5 Field theory without Feynman diagrams
$$\gamma^{(1)}[m,b] = \frac12 r_0m^2 + \frac{g_0}{4!}m^4 + \frac12\int_{\Lambda/b}^{\Lambda}\bar d^dq\,\log\frac{r_0 + \frac{g_0}{2}m^2 + q^2}{r_0 + q^2} \tag{6.10}$$
$$\equiv \frac12 r(b)\,m^2 + \frac{g(b)}{4!}m^4 + \ldots$$
I also added some subscripts on the couplings to emphasize that r0 , g0 are parameters
in some particular zeroth-order accounting we are making of the physics, not some holy
symbols whose values we can measure. In the last line, we’ve defined running couplings
r(b), g(b).
From this expression we can read off
$$r(b) = r_0 + \frac{g_0}{2}\int_{\Lambda/b}^{\Lambda}\frac{\bar d^dq}{q^2+r}.$$
A slightly more useful parameter is the deviation from the critical coupling. The critical point occurs when $\chi^{-1} = \partial_m^2\gamma\big|_{m=0} \to 0$, which happens when $r_0$ is
$$r_0^c = -\frac{g_0}{2}\int\frac{\bar d^dq}{q^2} + O(g_0^2).$$
On the RHS here, we ignored the r in the denominator because it is O(g). This gives
the deviation in temperature from the critical point, by subtracting the previous two
displayed equations:
$$t(b) \equiv r_0 - r_0^c = r_0\Bigg(1 - \frac{g_0}{2}\underbrace{\int\frac{\bar d^dq}{q^2(q^2+r)}}_{\equiv I_d(r,b)} + O(g_0^2)\Bigg).$$
(Note that t = t(b) is merely a convenient relabelling of the coordinate r0 ; the relation
between them is analytic and t depends on our zoom factor b.)
Now we must study the integral I. We’ve observed that Id (r, b → ∞) blows up (by
taking b → ∞ we include all the fluctuations) when r → 0 for d ≤ 4. Let’s start at
$d = 4$, where
$$I_4(r,b) = K_4\int_{\Lambda/b}^{\Lambda}\frac{q^3\,dq}{q^2(q^2+r)} = \frac{K_4}{2}\log\frac{\Lambda^2+r}{\Lambda^2/b^2+r} \;\simeq\; K_4\log b \quad \left(r \ll \Lambda^2/b^2\right). \tag{6.11}$$
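A quick numerical check of this logarithm (a sketch, not from the notes; units with $\Lambda = K_4 = 1$):

```python
import math

def I4(r, b, steps=200_000):
    # midpoint rule for the shell integral, with Lambda = K4 = 1
    lo, hi = 1.0 / b, 1.0
    dq = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        q = lo + (i + 0.5) * dq
        total += q**3 / (q**2 * (q**2 + r)) * dq
    return total

for b in (10.0, 100.0):
    print(b, I4(1e-12, b), math.log(b))   # at r -> 0, I4 -> log b
    assert abs(I4(1e-12, b) - math.log(b)) < 1e-4
# a nonzero r cuts the logarithm off
assert I4(0.1, 100.0) < I4(1e-12, 100.0)
```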
where $t_0 \equiv t(b = 1)$. If there exists a fixed point, $g = g_\star$ with $\kappa(g_\star) \neq 0$, then its contribution to the exponent (the upper limit dominates) is
$$-\int_1^b\frac{d\mu}{\mu}\,\kappa(g(\mu)) \xrightarrow{b\to\infty} -\kappa(g_\star)\underbrace{\int_1^b\frac{d\mu}{\mu}}_{=\log b}.$$
Hence, in this case
$$t(b) = t_0\,b^{-\kappa(g_\star)} \tag{6.17}$$
– that is κ(g? ) determines the critical exponent with which the IR value of t(b) diverges.
Why do we care about the IR value of t(b)? It determines the correlation length! We’ll
come back to this.
What is the solution of the beta function equation for the coupling in $d = 4$? To save writing, let's redefine $\tilde g_0 \equiv K_4 g_0$ and drop the tilde. The equation is
$$-b\,\partial_b\,g = \frac32 g^2 + O(g^3)$$
which is solved by
$$g(b) = \frac{2g_0}{2 + 3g_0\log b} \stackrel{b\gg1}{\longrightarrow} \frac{2}{3\log b} \xrightarrow{b\to\infty} 0. \tag{6.18}$$
There is an attractive IR fixed point at $g = 0$. This is part of the way towards justifying
my claim that perturbation theory would be useful to study the long-wavelength physics
in this problem.
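A quick consistency check of (6.18) (a sketch, not from the notes): integrate the flow $dg/ds = -\frac32 g^2$ (with $s = \log b$) by small Euler steps and compare with the closed form.

```python
import math

def g_exact(g0, b):
    # closed-form solution (6.18)
    return 2 * g0 / (2 + 3 * g0 * math.log(b))

# Euler-integrate dg/ds = -(3/2) g^2 with s = log b
g0, b = 0.5, 100.0
g, s, ds = g0, 0.0, 1e-5
while s < math.log(b):
    g -= 1.5 * g * g * ds
    s += ds
print(g, g_exact(g0, b))   # the two agree to roughly 1e-4
assert abs(g - g_exact(g0, b)) < 1e-3
```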
In the case of $d = 4$, then, the interesting physics comes from the slow approach to the free theory in the IR. To get something interesting we must include the flow, for example in the solution for $t$, Eq. (6.16): since the flow of $g_0$ (6.18) never stops, we can parametrize the flow by $g_0$ and use the chain rule to write $\frac{d\mu}{\mu} = \frac{dg_0}{\beta(g_0)}$, so that
$$\int_1^b\frac{d\mu}{\mu}\,\kappa(g_0(\mu)) = \int_{g_0}^{g_0(b)} dg\,\underbrace{\frac{\kappa(g)}{\beta(g)}}_{=\frac{1}{3g}(1+O(g))} \;\stackrel{b\gg1}{\simeq}\; \frac13\log\frac{g(b)}{g_0}$$
– this is what I called the Callan-Symanzik equation during the random-walk discussion
§1.3. Second, we use ordinary engineering dimensional analysis:
This implies that the RHS of (6.20) is
When does the zoom factor hit the sweet spot (6.22)? The answer is different in $d = 4$ and $d < 4$. Using (6.19), this happens when
$$\Lambda/b_\star = \sqrt{t(b_\star)} = \sqrt{t_0}\,(\log b_\star)^{-1/6} \quad\leftrightarrow\quad (b_\star)^{-2}(\log b_\star)^{1/3} = \frac{t_0}{\Lambda^2},$$
which we can solve for $b_\star$ in the limit $t \ll \Lambda^2$ (closer to the critical point than the lattice scale):
$$(b_\star)^{-2} \simeq \frac{t}{\Lambda^2}\,\frac{1}{\left(\log(t/\Lambda^2)\right)^{1/3}}. \tag{6.23}$$
Putting this information back into the Callan-Symanzik equation for the suscepti-
bility (6.20), we have
A comment on active versus passive RG. I’ve presented the condensed-matter
perspective on the RG here: there is a fixed, real cutoff, and the couplings run as we
integrate out longer and longer wavelength modes, i.e. vary the resolution with which
we look at the degrees of freedom.
Another perspective (which leads to the same conclusions!), taken by high-energy
physicists, is that the cutoff Λ is an artificial device. We should be able to vary
this cutoff without changing the physics, at the cost of changing the values of the
couplings at the cutoff. That is we regard the couplings at the cutoff (what I called
r0 , g0 above, the ones appearing in the Lagrangian) as depending on the cutoff Λ. To
make this precise, we must ask how the couplings in γ(Λ) need to depend on Λ to keep
the physics from depending on this fictional division we are making between UV and
IR. We can think about the RG transformation as replacing the cutoff Λ with a new
(smaller) cutoff Λ/b.
Something we can measure, and which should not depend on our accounting pa-
rameter b, is the susceptibility (for T > Tc ):
$$r \equiv \chi^{-1} = \partial_m^2\gamma\big|_{m=0} = r_0 + \frac{g_0}{2}\int^{\Lambda/b}\frac{\bar d^dq}{q^2+r_0}.$$
(Such an equation, relating a physical quantity like χ to something we can compute in
terms of the running couplings gn (b), is sometimes called a renormalization condition.)
We can invert this equation to figure out r0 (b):
$$r_0 = r - \frac{g_0}{2}\int^{\Lambda/b}\frac{\bar d^dq}{q^2+r} + O(g_0^2).$$
Again we subtract the critical value of r0 to get
$$t_0 \equiv r_0 - r_0^c = r\left(1 + \frac{g_0}{2}\int\frac{\bar d^dq}{q^2(q^2+r)}\right) + O(g_0^2).$$
Near $d = 4$, this is
$$t_0 = r\left(1 + \frac{g_0}{2}K_4\log\frac{\Lambda/b}{\sqrt r}\right) = r\left(1 - \frac{g_0}{2}K_4\log b\right) + \cdots$$
(where the ellipsis is independent of b).
Another quantity we can imagine measuring is the coupling $g$, a non-linear susceptibility:
$$g \equiv \partial_m^4\gamma\big|_{m=0} = g_0 - \frac{3g_0^2}{2}\int^{\Lambda/b}\frac{\bar d^dq}{(q^2+r_0)^2} + O(g_0^3)$$
– notice that this is the same equation as (6.12), but the BHS is interpreted differently:
now the LHS is a physical, fixed, measurable thing, and g0 is a fake thing that depends
on the artificial parameter b. We can invert this equation to find g0 in terms of g and
b:
$$g_0(b) = g + \frac{3g^2}{2}\int^{\Lambda/b}\frac{\bar d^dq}{(q^2+r_0)^2} + O(g^3)$$
(where we studiously neglect higher order things). Near $d = 4$ this is
$$g_0(b) \stackrel{d\to4}{\simeq} g + \frac{3g^2}{2}K_4\log\frac{\Lambda}{b\sqrt r} + O(g^3) = g - \frac{3g^2}{2}K_4\log b + O(g^3) \tag{6.25}$$
2 2 b r 2
(where the ellipsis is independent of b). This reproduces the same beta functions as
above.
Two important generalizations. Now we make two easy but crucial generalizations of the $d = 4$ Ising calculation we've just done: namely $\mathbb Z_2 \to O(n)$ and $d \to 4 - \epsilon$.
$O(n)$: [Goldenfeld, §11.1] By the LG logic, an $O(n)$-invariant and translation-invariant free energy at fixed magnetization $m^a$ must look like
$$S[\phi^a] = \int d^dx\left(\frac12\vec\nabla\phi^a\cdot\vec\nabla\phi^a + \frac12 r_0\,\phi^a\phi^a + \frac{g_0}{4!}(\phi^a\phi^a)^2\right)$$
For n > 1, in expanding about the mean field configuration φa = ma +ϕa , we encounter
a distinction between the one (longitudinal) fluctuation in the direction of $m^a \equiv m\,\hat e_0^a$ and the $n-1$ transverse fluctuations. The quadratic part of this action comes from the kernel
$$K^{ab}_{xy} = \frac{\delta^2 S}{\delta\phi^a_x\,\delta\phi^b_y}\Big|_{\phi=m} = \left[\left(-\nabla^2 + r_0 + \frac{g_0}{6}m^2\right)\delta^{ab} + \frac{g_0}{3}m^am^b\right]\delta_{xy}.$$
This matrix is made of one copy of the n = 1 Ising case with coefficient of m2 equal to
g0 /2, and n − 1 degenerate copies of the same thing with g0 /6. So the sum of the logs
of the eigenvalues is
$$\mathrm{tr}_{x,a}\log K = V\int\bar d^dq\left[\log\left(r_0 + \frac{g_0}{2}m^2 + q^2\right) + (n-1)\log\left(r_0 + \frac{g_0}{6}m^2 + q^2\right)\right] + \text{const}.$$
$$g = g_0 - \left(\frac{3}{2} + \frac{n-1}{6}\right)g_0^2\int\frac{\bar d^dq}{(q^2+r)^2} + O(g_0^3) \;\stackrel{d\to4}{\simeq}\; g_0 - \left(\frac{3}{2} + \frac{n-1}{6}\right)g_0^2\,K_4\log b$$
Anticipating the result a bit, we are going to treat $g_0$ and $\epsilon$ as being of the same order in our expansion, so $O(g_0) = O(\epsilon)$ and $O(g_0^2) = O(g_0\epsilon)$ et cetera. Thinking of $\epsilon$ as small, then, the only change in (6.12) is
$$\partial_m^4\gamma(b)\big|_{m=0} = \Lambda^{\epsilon}\left(g_0 - b_0\,g_0^2\int\frac{\bar d^dq}{(q^2+r)^2}\right) \tag{6.26}$$
where $b_0 \equiv \frac{3}{2} + \frac{n-1}{6}$.
The coefficient of $\phi^4$ in the effective action at scale $\Lambda/b$ is
$$(\Lambda/b)^{\epsilon}\,g(b) \equiv \partial_m^4\gamma(b)\big|_{m=0} = \Lambda^{\epsilon}\left(g_0 - b_0K_d\,\Lambda^{-\epsilon}g_0^2\log b + O(g_0^2)\right).$$
Here comes the magic: the key fact is roughly that "$\Lambda^\epsilon = 1 + \epsilon\log\Lambda + O(\epsilon^2)$"; I put that in quotes because it is distasteful to take the log of a dimensionful quantity. Systematically ignoring things that can be ignored (including the $\Lambda^{-\epsilon}$ which is needed in the previous equation for dimensions to work), this is:
(Again we absorb the factors of $K_d$ into $g, g_0$.) The crucial extra term, proportional to $\epsilon g_0$, comes from the engineering dimensions of $g_0$.
Where are the fixed points? There is still one at $g_0 = 0$, our old friend the Gaussian fixed point. But there is another, at
$$g_\star = \frac{\epsilon}{b_0} + O(\epsilon^2) = \frac{6\epsilon}{n+8} + O(\epsilon^2).$$
This is the Wilson-Fisher fixed point (really one for every $n$ and $d \lesssim 4$). As was foretold, $g_\star$ is of order $\epsilon$.
The WF fixed point and the Gaussian critical point exchange roles as we decrease
d through four. For d > 4, the Gaussian critical point is IR attractive and governs
the critical behavior at long wavelengths: MFT is right. At d = 4, they collide and
this produces the weird logarithms in the approach to g = 0 that we saw above. For
d < 4, the Gaussian fixed point is unstable to the interaction term: the g0 φ4 term is a
relevant perturbation, since g0 grows as we zoom out.
– the correlation length is a length and so zooms like a length. From this, we deduce
that (the RHS of (6.27) is )
Now we can choose a convenient zoom factor, $b$. Again, we choose $b = b_\star$ so that the arguments of the logs are all 1 and they go away:
$$\frac{t(b_\star)}{(\Lambda/b_\star)^2} = 1. \tag{6.28}$$
If $b_\star \to \infty$, then $g(b_\star) \to g_\star$, the IR fixed point value, where
$$t(b_\star) \stackrel{(6.17)}{\simeq} (b_\star)^{-\kappa(g_\star)}\,t,$$
which indeed blows up in the critical region $t \ll \Lambda^2$ – that is: this is an IR fixed point, a fixed point we reach by zooming out.
Therefore
$$\xi(t, g_0, \Lambda) = b_\star\,\xi\big(t(b_\star)(b_\star)^2 = \Lambda^2,\, g_\star,\, \Lambda\big) \sim \left(\frac{t}{\Lambda^2}\right)^{-\frac{1}{2-\kappa(g_\star)}} \equiv \left(\frac{t}{\Lambda^2}\right)^{-\nu} \tag{6.29}$$
Explicitly, $\kappa(g_0) = \frac{n+2}{6}g_0 + O(g_0^2)$ means $\kappa(g_\star) = \frac{n+2}{n+8}\epsilon + O(\epsilon^2)$, so that
$$\nu = \frac12 + \frac{n+2}{4(n+8)}\epsilon + O(\epsilon^2). \tag{6.30}$$
Notice that all the information about the short-distance stuff has dropped out of (6.29)
(except for the stuff hidden in the twiddle, i.e. the overall coefficient) – only the physics
at the fixed point matters for the exponent.
We can do remarkably well by setting $\epsilon = 1$ in (6.30) and comparing to numerical simulations in $d = 3$.
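Concretely (a sketch, not from the notes; the quoted value $\nu \approx 0.63$ for the $d = 3$ Ising class is the standard numerical result):

```python
def nu(n, eps=1.0):
    # first-order epsilon-expansion result (6.30)
    return 0.5 + (n + 2) / (4 * (n + 8)) * eps

for n in (1, 2, 3):
    print(n, round(nu(n), 4))
# n = 1: nu = 0.5833..., vs nu ~ 0.63 from d = 3 Ising numerics (mean field: 1/2)
assert abs(nu(1) - 7 / 12) < 1e-12
```

Half of the gap between mean field theory and the true exponent is already captured at first order in $\epsilon$.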
6.6 Perturbative momentum-shell RG
[Kardar, Fields, §5.5, 5.6] I will say a bit about how to develop this perturbative RG
more systematically. We’ll end up at the same place, but with more context. This
calculation is important enough that it’s worth doing many ways.
We'll do $n$-component fields, $\phi^a$, $a = 1..n$, with $O(n)$ symmetry, in $d = 4-\epsilon$ dimensions. Let's decompose the action as
S[φ] = S0 [φ] + U,
with $S_0$ the gaussian terms, as above. For $n$-component fields, the gaussian term looks like
$$S_0[\phi] = \int_0^\Lambda\bar d^dk\,\frac12\underbrace{\phi^a(k)\phi^a(-k)}_{\equiv|\phi|^2(k)}\left(r_0 + r_2k^2\right).$$
(If it is not diagonal, do a field redefinition to make it so.) We assume the model has an $O(n)$ symmetry which acts by $\phi^a \to R^a_b\phi^b$, with $R^tR = 1_{n\times n}$. The most relevant, symmetric interaction term (non-Gaussian perturbation) is the $\phi^4$ term
$$U = u_0\int d^dx\,\left(\phi^a(x)\phi^a(x)\right)^2 = u_0\int\prod_{i=1}^4\bar d^dk_i\sum_{a_{1,2,3,4}=1}^n\phi^{a_1}(k_1)\phi^{a_2}(k_2)\phi^{a_3}(k_3)\phi^{a_4}(k_4)\,\bar\delta\Big(\sum_ik_i\Big)\,\delta^{a_1a_2}\delta^{a_3a_4}.$$
The h...i0,> means averaging over the fast modes with their Gaussian measure, and Z0,>
is an irrelevant normalization factor, independent of the objects of our fascination, the
slow modes φ< .
The corrections to the effective action for φ< can be organized as a cumulant expansion:
$$\log\left\langle e^{-U}\right\rangle_{0,>} = \underbrace{-\left\langle U\right\rangle_{0,>}}_{1} + \underbrace{\frac12\left(\left\langle U^2\right\rangle_{0,>} - \left\langle U\right\rangle_{0,>}^2\right)}_{2} + O(U^3).$$
Let’s focus on the first-order term first:
$$1 = \left\langle U[\phi_<,\phi_>]\right\rangle_{0,>} = u_0\int\prod_{i=1}^4\bar d^dk_i\ \bar\delta\!\Big(\sum_i k_i\Big)\left\langle\prod_i\left(\phi_< + \phi_>\right)_i\right\rangle_{0,>}.$$
Expanding the product $(\phi_< + \phi_>)^4$ produces $2^4 = 16$ terms. It is useful to introduce a diagrammatic notation in which these 16 terms decompose as in Fig. 1.
We can compute the averages over the fast modes by doing Wick contractions. This is a fact about Gaussian integrals, which can be summarized by noting that
$$\left\langle e^{h_A\phi_A}\right\rangle_0 = e^{\frac12 h_A\left\langle\phi_A\phi_B\right\rangle_0 h_B}$$
where A is a multi-index over space and flavor labels and whatever else (to prove it, complete the square). Then expand both sides to learn that
$$\left\langle\phi_{A_1}\cdots\phi_{A_m}\right\rangle_0 = \begin{cases} 0, & \text{if } m \text{ is odd,}\\ \text{sum of all pairwise contractions}, & \text{if } m \text{ is even.}\end{cases}$$
By ‘pairwise contraction’ I just mean a way of replacing a pair of φs on the LHS with ⟨φ_A φ_B⟩. Each pairwise contraction is given by the ‘propagator’, which in our case is
$$\left\langle\phi^a_>(q_1)\,\phi^b_>(q_2)\right\rangle_{0,>} = \frac{\delta^{ab}\,\bar\delta(q_1+q_2)}{r_0 + q_1^2 r_2}.$$
In the figure, these are denoted by wiggly lines. The slow modes are denoted by straight
lines. The 4-point interaction is denoted by a dotted line connecting two pairs of lines
(straight or wiggly):
$$u_0\,\delta^{a_1a_2}\delta^{a_3a_4}\,\bar\delta\!\Big(\sum_i q_i\Big)\ =\ \text{(the dotted-line vertex of Fig.\ 1)}.$$
Although the four fields must be at the same point in space we separate the two pairs
whose flavor indices are contracted, so that we can follow the conserved flavor index
around the diagrams.
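The combinatorics behind Wick's theorem can be checked mechanically. A minimal sketch: enumerate all pairwise contractions of m copies of a single unit-variance Gaussian variable; each contraction contributes 1, so the m-th moment is just the number of pairings, (m − 1)!!:

```python
def pairings(idx):
    """Enumerate all ways to contract a list of field labels pairwise."""
    if not idx:
        yield []
        return
    first, rest = idx[0], idx[1:]
    for i in range(len(rest)):
        for tail in pairings(rest[:i] + rest[i+1:]):
            yield [(first, rest[i])] + tail

def moment(m):
    """<x^m> for a unit-variance Gaussian: each contraction contributes 1."""
    if m % 2:
        return 0
    return sum(1 for _ in pairings(list(range(m))))

assert moment(2) == 1    # <x^2> = 1
assert moment(4) == 3    # <x^4> = 3 = (4-1)!!
assert moment(6) == 15   # <x^6> = 15 = (6-1)!!
```

The same enumeration, decorated with momentum and flavor labels, is what the diagrams of Fig. 1 organize.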
Let’s analyze the results of the first order correction. The interesting terms are
$$1_3 = -u_0\ \underbrace{2}_{\text{symmetry}}\ \underbrace{n}_{=\delta^{aa}}\ \int_0^{\Lambda/b}\bar d^dk\,|\phi_<(k)|^2\int_{\Lambda/b}^{\Lambda}\bar d^dq\,\frac{1}{r_0+r_2q^2}$$
and
$$1_4 = \frac{4\cdot 1}{2\cdot n}\,1_3,$$
which has a bigger symmetry factor but no closed flavor index loop. The result through O(u) is then just what we found previously:
$$r_0 \to r_0 + \delta r_0 = r_0 + 4u_0(n+2)\int_{\Lambda/b}^{\Lambda}\bar d^dq\,\frac{1}{r_0+r_2q^2} + O(u_0^2)\,.$$
Figure 1: 1st order corrections from the quartic perturbation of the Gaussian fixed point of the O(N )
model. Naturally, wiggly lines denote propagation of fast modes φ> , straight lines denote (external)
slow modes φ< . A further refinement of the notation is that we split apart the 4-point vertex to
indicate how the flavor indices are contracted; the dotted line denotes a direction in which no flavor
flows, i.e. it represents a coupling between the two flavor singlets, φa φa and φb φb . The numbers at
left are multiplicities with which these diagrams appear. (The relative factor of 2 between 13 and 14
can be understood as arising from the fact that 13 has a symmetry which exchanges the fast lines but
not the slow lines, while 1₄ does not.) Notice that closed loops of the wiggly lines produce factors of n, since we must sum over which flavor is propagating in the loop – the flavor of a field running in a closed loop is not determined by the external lines, just like the momentum.
Next we rescale to restore the original action: we must choose $\zeta = b^{1+d/2}$ to keep $\tilde r_2 = r_2$ (the unfamiliar power is because $\phi(k) = \int d^dx\,\phi(x)e^{ikx}$ scales differently from φ(x)).
The second-order-in-$u_0$ terms are displayed in Fig. 2.
Figure 2: 2nd order corrections from the quartic perturbation of the Gaussian fixed point of the O(N )
model. The left column of diagrams are corrections to the quartic interaction, and the right column
correct quadratic terms. In fact the top right diagram is independent of the external momentum and
hence only corrects r0 ; the bottom right diagram (that looks like a sheep) also corrects the kinetic
term.
Notice that the diagram at top right has two closed flavor loops, and hence goes like n2 , and it
comes with two powers of u0 . You can convince yourself by drawing some diagrams that this pattern
continues at higher orders. If you wanted to define a model with large n you could therefore consider
taking a limit where n → ∞, u0 → 0, holding u0 n fixed. The quantity u0 n is often called the ’t Hooft
coupling.
The interesting part of the second-order correction to the quartic coupling involves the integral
$$f(k_1+k_2) = \int\bar d^dq\,\frac{1}{\left(r_0+r_2q^2\right)\left(r_0+r_2(k_1+k_2-q)^2\right)} \simeq \int\bar d^dq\,\frac{1}{\left(r_0+r_2q^2\right)^2}\left(1 + O(k_1+k_2)\right)$$
– the bits that depend on the external momenta give irrelevant derivative corrections, like $\phi_<^2\,\partial^2\phi_<^2$. We ignore them. This leaves behind just the correction to u we found before.
There are also two-loop corrections to the quadratic term (diagrams with two straight lines sticking out). Altogether, the full result through $O(u_0^2)$ is then the original action, with the parameter replacement
$$\begin{pmatrix} r_2\\ r_0\\ u_0\end{pmatrix} \mapsto \begin{pmatrix}\tilde r_2\\ \tilde r_0\\ \tilde u_0\end{pmatrix} = \begin{pmatrix} b^{-d-2}\,\zeta^2\left(r_2 + \delta r_2\right)\\ b^{-d}\,\zeta^2\left(r_0 + \delta r_0\right)\\ b^{-3d}\,\zeta^4\left(u_0 + \delta u_0\right)\end{pmatrix} + O(u_0^3).$$
The shifts are:
$$\delta r_2 = u_0^2\,\partial_k^2 A(0), \qquad \delta r_0 = 4u_0(n+2)\int_{\Lambda/b}^{\Lambda}\bar d^dq\,\frac{1}{r_0+r_2q^2} - A(0)\,u_0^2, \qquad \delta u_0 = -\frac12\,u_0^2\,(8n+64)\int_{\Lambda/b}^{\Lambda}\bar d^dq\,\frac{1}{\left(r_0+r_2q^2\right)^2}\,.$$
Keeping $\tilde r_2 = r_2$ fixes
$$\zeta^2 = \frac{b^{d+2}}{1 + u_0^2\,\partial_k^2 A(0)/r_2} = b^{d+2}\left(1 + O(u_0^2)\right).$$
Taking an infinitesimal step $b = e^{\ell} \simeq 1 + \delta\ell$ gives the flow equations
$$\begin{cases} \dfrac{dr_0}{d\ell} = 2r_0 + \dfrac{4(n+2)K_d\Lambda^d}{r_0+r_2\Lambda^2}\,u_0 - A\,u_0^2 + O(u_0^3)\\[3mm] \dfrac{du_0}{d\ell} = (4-d)\,u_0 - \dfrac{4(n+8)K_d\Lambda^d}{\left(r_0+r_2\Lambda^2\right)^2}\,u_0^2 + O(u_0^3)\end{cases} \tag{6.31}$$
To see how the previous thing arises, and how the integrals all went away, let’s consider just the $O(u_0)$ correction to the mass:
$$\tilde r_0 = r_0 + \frac{dr_0}{d\ell}\,\delta\ell = b^2\left(r_0 + 4u_0(n+2)\int_{\Lambda/b}^{\Lambda}\frac{\bar d^dq}{r_0+r_2q^2}\right) + O(u_0^2)$$
$$= (1+2\delta\ell)\left(r_0 + 4u_0(n+2)\,K_d\Lambda^d\,\delta\ell\,\frac{1}{r_0+r_2\Lambda^2}\right) + O(u_0^2)$$
$$= r_0 + \left(2r_0 + \frac{4u_0(n+2)}{r_0+r_2\Lambda^2}\,K_d\Lambda^d\right)\delta\ell + O(u_0^2). \tag{6.32}$$
Now we are home. (6.31) has two fixed points. One is the free fixed point at the origin where nothing happens. The other (Wilson-Fisher) fixed point is at
$$\begin{cases} r_0^\star = -\dfrac{2u_0^\star(n+2)K_d\Lambda^d}{r_0^\star + r_2\Lambda^2} \overset{d=4-\epsilon}{=} -\dfrac12\,\dfrac{n+2}{n+8}\,r_2\Lambda^2\,\epsilon + O(\epsilon^2)\\[3mm] u_0^\star = \dfrac{\left(r_0^\star + r_2\Lambda^2\right)^2}{4(n+8)K_d\Lambda^d}\,\epsilon \overset{d=4-\epsilon}{=} \dfrac14\,\dfrac{r_2^2}{(n+8)K_4}\,\epsilon + O(\epsilon^2)\end{cases}$$
Figure 3: The φ⁴ phase diagram, for ε > 0. If $r_0(\ell = \infty) > 0$, the effective potential for the uniform ‘magnetization’ has a minimum at the origin; this is the disordered phase, where there is no magnetization. If $r_0(\ell = \infty) = V''_{\rm eff}(0) < 0$, the effective potential has minima away from the origin, and the groundstate breaks the O(n) symmetry; this is the ordered phase. Too far to the right, $u_0$ is too large for us to trust our perturbative analysis. Experimental and numerical evidence suggests, however, that there are no other fixed points nearby, i.e. that there are actually no dragons.
which is at positive $u_0^\star$ if ε > 0. In the second step we keep only leading order in ε = 4 − d.
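To see the approach to this fixed point explicitly, one can integrate a stripped-down version of the $u_0$ equation in (6.31) by brute force. This is a sketch: $r_0$ is frozen at zero and the nonuniversal factor $K_d\Lambda^d/(r_2\Lambda^2)^2$ is set to 1, both simplifications made only for illustration:

```python
# Euler integration of the schematic flow du0/dl = eps*u0 - 4*(n+8)*u0**2,
# i.e. the second line of (6.31) with the nonuniversal constants set to 1.
# Any small positive u0 is driven to the Wilson-Fisher value u* = eps/(4(n+8)).
n, eps = 1.0, 0.1
u = 1e-4                        # weak initial coupling
dl = 1e-2                       # RG "time" step
for _ in range(20_000):         # total RG time l = 200
    u += dl * (eps*u - 4*(n + 8)*u**2)

u_star = eps / (4*(n + 8))
print(u, u_star)                # both ~ 2.78e-3
assert abs(u - u_star) / u_star < 1e-3
```

The free fixed point at $u_0 = 0$ is unstable for ε > 0: the linear term pushes any small positive coupling out toward $u^\star$, where the quadratic term stops the growth.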
Now we follow protocol and linearize near the W-F fixed point:
$$\frac{d}{d\ell}\begin{pmatrix}\delta r_0\\ \delta u_0\end{pmatrix} = M\begin{pmatrix}\delta r_0\\ \delta u_0\end{pmatrix}.$$
The matrix M is a 2 × 2 matrix whose eigenvalues describe the flows near the fixed point. It looks like
$$M = \begin{pmatrix} 2 - \frac{n+2}{n+8}\,\epsilon & \cdots\\ O(\epsilon^2) & -\epsilon\end{pmatrix}.$$
Its eigenvalues (which don’t care about the off-diagonal terms, because the lower-left entry is $O(\epsilon^2)$) are
$$y_r = 2 - \frac{n+2}{n+8}\,\epsilon + O(\epsilon^2) > 0,$$
which determines the instability of the fixed point, and $y_u = -\epsilon + O(\epsilon^2) < 0$, the irrelevant direction. The relevant eigenvalue controls the divergence of the correlation length, $\xi \sim (\delta r_0(0))^{-1/y_r} \equiv (\delta r_0(0))^{-\nu}$. This last equality is the definition of the correlation length exponent (how does the correlation length scale with our deviation from the critical point $\delta r_0(0)$). Therefore
$$\nu = \frac{1}{y_r} = \frac12\left(1 - \frac12\,\frac{n+2}{n+8}\,\epsilon + O(\epsilon^2)\right)^{-1} \simeq \frac12\left(1 + \frac{n+2}{2(n+8)}\,\epsilon\right) + O(\epsilon^2)\,.$$
7 Scaling
Scaling functions from the RG. [Goldenfeld, §9.4, 9.2] Consider a renormalization group near a fixed point with one relevant parameter, which transforms as
$$T' = R_b(T).$$
Near the fixed-point temperature $T_\star$ we can linearize:
$$T' - T_\star = R_b(T) - R_b(T_\star) = R_b'(T_\star)\,(T - T_\star) + O\!\left((T - T_\star)^2\right).$$
Letting $t_0 = \frac{T - T_\star}{T_\star}$, the reduced temperature transforms under a single RG step as $t' = t_0\,b^{y_t}$ (this defines $y_t$), and under n RG steps as
$$t^{(n)} = t_0\,\left(b^{y_t}\right)^n,$$
which says, since lengths shrink by $b^n$ after n steps,
$$\xi(t_0) = b^n\,\xi\!\left(t^{(n)}\right) = b^n\,\xi\!\left(t_0\,b^{ny_t}\right).$$
Choosing n so that $t_0 b^{ny_t} = 1$, i.e. $b^n = t_0^{-1/y_t}$, gives $\xi(t_0) = t_0^{-1/y_t}\,\xi(1)$. Here ξ(1) is the high-temperature correlation length, far from the critical point, which we can regard as a constant, independent of $t_0$. Comparing to the definition of the correlation length critical exponent, $\xi(t_0) \sim t_0^{-\nu}$, this says
$$\nu = \frac{1}{y_t}. \tag{7.1}$$
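The logic behind (7.1) can be mimicked numerically: iterate the rescaling $t \to t\,b^{y_t}$, accumulating a factor of b for the length, until the rescaled temperature is of order one, then read off the exponent. A sketch, with an arbitrarily chosen $y_t$ and the O(1) correlation length set to 1:

```python
# Iterate xi(t) = b * xi(t * b**yt): zoom out until the rescaled temperature
# is O(1); xi is then the accumulated zoom factor times an O(1) number.
import math

def xi(t0, yt, b=1.5):
    t, zoom = t0, 1.0
    while t < 1.0:              # keep zooming until we are far from criticality
        t *= b**yt
        zoom *= b
    return zoom                 # xi at O(1) rescaled temperature set to 1

yt = 1.6                        # an arbitrarily chosen thermal eigenvalue
t1, t2 = 1e-6, 1e-8
slope = (math.log(xi(t1, yt)) - math.log(xi(t2, yt))) / (math.log(t1) - math.log(t2))
# the measured slope of log(xi) vs log(t) is -nu = -1/yt, up to discreteness:
assert abs(slope + 1/yt) < 0.05
```

The small mismatch comes from the discreteness of the RG steps; shrinking b toward 1 makes the measured slope approach $-1/y_t$.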
Similarly, the singular part of the free energy density transforms as $f(t_0) = b^{-nd}\,f(t_0 b^{ny_t})$, so $f \sim t_0^{d/y_t}$, and the specific heat
$$c_V \sim \partial_t^2 f \sim t^{-\alpha}$$
has $\alpha = 2 - \frac{d}{y_t}$. Comparing to the expression (7.1) for ν, we have the Josephson hyperscaling relation
$$2 - \alpha = \nu d.$$
All the variables. More generally, there are many more couplings. For example,
for the Wilson-Fisher fixed point (in 2 < d < 4) there are two relevant couplings
t, h (the latter of which breaks symmetries), and a long list of irrelevant operators
which I’ll call K3 , K4 . Let us suppose that we have already diagonalized the matrix
Rαβ = ∂K α Rβ |K=K? at the fixed point, and t, h, K3 , K4 ... are the coordinates in the
eigendirections, with scaling exponents yt , yh , y3 , y4 .... The statements about relevance
above say yt , yh > 0, y3 , y4 < 0. (Note that the quartic coupling is in the list of irrelevant
perturbations of the WF fixed point for 2 < d < 4.) Then
$$f_s(t, h, K_3, \dots) = b^{-d}\,f_s\!\left(tb^{y_t},\ hb^{y_h},\ K_3 b^{y_3},\ \dots\right) = |t|^{2-\alpha}\,F\!\left(h\,|t|^{-y_h/y_t},\ K_3\,|t|^{-y_3/y_t},\ \dots\right). \tag{7.5}$$
The last expression for the free energy density is in terms of a scaling function F – which basically just means a function of dimensionless arguments. The existence of this function implies all the so-called hyperscaling relations (such as the Josephson one above) which relate various exponents.
An important disclaimer about the t → 0 limit in (7.5): the limit of $f(t, h, K_3)$ as $K_3 \to 0$ may not exist. In that case, $K_3$ is called a dangerous irrelevant variable. An
example where this happens is at the gaussian fixed point in d > 4, with K3 = g, the
quartic coupling. If the quartic coupling is zero, then for t ≤ 0, the partition function
blows up, so the limit t → 0 does not commute with g → 0. Despite the fact that g is an
irrelevant perturbation, g > 0 is crucial for determining the saddle point configuration
of m when t ≤ 0.
Corrections to scaling. In experiments or simulations, t isn’t really zero. So the
contributions from irrelevant operators are not exactly zero. For example, by a similar
argument to (7.5), the susceptibility is
$$\chi_T(t, h, \cdots) = |t|^{-\gamma}\,F^{\pm}_\chi\!\left(\frac{h}{t^{\Delta}},\ K_3\,t^{-y_3/y_t},\ \cdots\right)$$
and even though $y_3/y_t < 0$, the second argument of the scaling function is not actually zero. On the other hand, if $K_3$ is not a dangerous irrelevant variable which affects the vacuum structure, then F(x, y) will be analytic in y near 0, so we can Taylor expand:
$$\chi_T(t, h, \cdots) = |t|^{-\gamma}\left(A_\pm\!\left(\frac{h}{t^\Delta}\right) + B_\pm\!\left(\frac{h}{t^\Delta}\right)K_3\,t^{-y_3/y_t} + \cdots\right).$$
For h = 0, $A_\pm, B_\pm$ are non-universal constants. The term with A gives the leading singularity $|t|^{-\gamma}$, but the term with B, which goes like $|t|^{-\gamma - y_3/y_t}$, can also be singular as t → 0 and if so must be included in a comparison with experiment or simulation.
Two relevant couplings. If we also keep track of the external field, the singular
part of the free energy is
$$f_s(t, h) = |t|^{2-\alpha}\,F_\pm\!\left(\frac{h}{|t|^{\Delta}}\right), \qquad \Delta \equiv y_h/y_t.$$
The ± label is to allow for the possibility of different scaling functions for t > 0 and
t < 0. F± (0), which describes h = 0, fixed t, must be a constant so that e.g. cV ∼
∂t2 fs |h=0 ∼ |t|−α . F± (∞) describes the behavior t → 0 at fixed h, which is constrained
by
$$M = -T^{-1}\partial_h f_s \sim t^{2-\alpha-\Delta}\,F'_\pm\!\left(\frac{h}{|t|^{\Delta}}\right). \tag{7.7}$$
When h = 0, we require $M \sim t^\beta$, so we learn that
$$\beta = 2 - \alpha - \Delta = 2 - \alpha - y_h/y_t. \tag{7.8}$$
When t = 0, we must have $M \sim h^{1/\delta}$. If $F'_\pm(x) \overset{x\to\infty}{\sim} x^\lambda$, then (7.7) becomes
$$M \sim h^\lambda\,|t|^{2-\alpha-\Delta-\Delta\lambda} \overset{(7.8)}{=} h^\lambda\,|t|^{\beta-\Delta\lambda},$$
which requires both $\beta = \Delta\lambda$ and $\lambda = \frac{1}{\delta}$.
[End of Lecture 15]
Scaling for the correlation function. A similar scaling argument can be made for the spin-spin correlation function, $G(r, \{K\}) \equiv \langle m(r)\,m(0)\rangle_c$. On the one hand, G transforms as
$$G' = G\!\left(r/b,\ \{K'\}\right) = G\!\left(r/b,\ tb^{y_t},\ hb^{y_h},\ \dots\right).$$
Notice that the separation between the points is just like another coupling with dimensions of length. On the other hand, I claim that
$$G' = b^{2(d-y_h)}\,G\!\left(r, \{K\}\right).$$
This follows if we regard G′ as the correlation function of the block spins (see Goldenfeld §9.8). The input is: $Z(K') = Z(K)$, $G(r, K) = \delta_{h_r}\delta_{h_0}\ln Z(h)$ (hence the factor of $b^{-2y_h}$ comes from the rescaling of h), and finally the factor of $b^{2d}$ comes from the fact that a block spin contains $b^d$ spins.
If we choose $b = t^{-1/y_t}$ then
$$G(r, t, h, \dots) = t^{2(d-y_h)/y_t}\,G\!\left(rt^{1/y_t},\ 1,\ ht^{-y_h/y_t}\right) \equiv \left(rt^{1/y_t}\right)^{-2(d-y_h)}\,F_G\!\left(rt^{1/y_t},\ ht^{-y_h/y_t}\right) \tag{7.9}$$
from which we learn that $2(d - y_h) = d - 2 + \eta$ (as long as $F_G(x, y)$ is smooth as $x, y \to 0$).
Of all the greek letters we defined (α, β, γ, δ, ν, η, ∆) only two combinations are
independent – they all depend on the two exponents yt , yh associated with the two
relevant perturbations of the fixed point in question.
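The web of relations can be sanity-checked against the exactly known 2d Ising data $y_t = 1$, $y_h = 15/8$ (a sketch; those two input values are standard results quoted, not derived, here):

```python
# Exactly known 2d Ising inputs: yt = 1, yh = 15/8 (and d = 2).
d, yt, yh = 2, 1.0, 15/8

nu    = 1/yt                 # (7.1)
alpha = 2 - d/yt             # so that 2 - alpha = nu*d (Josephson)
Delta = yh/yt
beta  = 2 - alpha - Delta    # (7.8)
delta = Delta/beta           # from beta = Delta*lambda with lambda = 1/delta
eta   = 2*(d - yh) - (d - 2) # from 2(d - yh) = d - 2 + eta

# the familiar exact 2d Ising exponents come out:
assert (nu, alpha, beta, delta, eta) == (1.0, 0.0, 0.125, 15.0, 0.25)
```

Two inputs, six exponents: exactly the statement that only two combinations are independent.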
Data collapse. The main reason to care about scaling functions is the phenomenon
of data collapse. If we plot, say, the magnetization as a function of temperature, for
various values of the external field, we’ll get a different curve M (t) for each h. On the
other hand, (7.7) presents M (t, h) as tβ times a function of the single variable h/|t|∆ .
This formula is valid for small |h|, |t|, but arbitrary h/t. It
implies that if we plot M/|t|β as a function of |h|/|t|∆ , all
of the data will lie on two curves, one for t < 0 and one for
t > 0. At right is a cartoon of what this looks like.
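Data collapse can be demonstrated with the mean-field equation of state $h = tm + m^3$ (so β = 1/2, ∆ = 3/2), used here as an illustrative stand-in for real data: solving for m at several different (t, h) sharing the same $h/|t|^{\Delta}$, the rescaled magnetization $m/|t|^{\beta}$ comes out identical.

```python
def m_of(t, h):
    """Solve h = t*m + m**3 for the positive root by bisection (t, h > 0)."""
    lo, hi = 0.0, 10.0
    for _ in range(200):
        mid = 0.5*(lo + hi)
        if t*mid + mid**3 < h:
            lo = mid
        else:
            hi = mid
    return lo

beta, Delta = 0.5, 1.5           # mean-field exponents
x = 2.0                          # fixed value of the scaling variable h/|t|^Delta
collapsed = []
for t in [1e-1, 1e-2, 1e-3]:     # approach the critical point t -> 0+
    h = x * t**Delta
    collapsed.append(m_of(t, h) / t**beta)

# rescaled data fall on a single curve; here x = 2 corresponds to m/t^beta = 1
assert max(collapsed) - min(collapsed) < 1e-6
```

For this equation of state the collapse is exact: substituting $m = t^{1/2}\mu$ gives $h/t^{3/2} = \mu + \mu^3$, with all t-dependence gone.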
[Figure: data collapse of the Binder cumulant, from some small Monte Carlo simulations of the 2d Ising model; plotted against $(T - T_c)L^{1/\nu}$ the curves for different L collapse, and plotted against T they cross at $T_c$.]
Crossovers. [Cardy, chapter 4; Goldenfeld §9.9] A crossover refers to a smooth change between two behaviors as a parameter is varied, the opposite of a sharp phase transition. There are many reasons why critical behavior might be absent; examples include symmetry breaking, finite volume, disorder.
Suppose we take a critical system with some symmetry and perturb it by a small
symmetry-breaking term. For example, consider the Ising fixed point, perturbed by
a small magnetic field. There will no longer be a sharp transition between high and
low temperature, but what does the critical theory say about the behavior of physical
quantities?
Our scaling function expression for the singular free energy,
$$f_s(h, t) = |t|^{2-\alpha}\,F_\pm\!\left(\frac{h}{|t|^{\Delta}}\right),$$
tells us that when h = 0 we see the expected critical behavior, e.g. $c_V \sim |t|^{-\alpha}$. On the other hand, when $h \sim t^{\Delta}$, something else happens – the dependence on t in the argument of the scaling function matters; $t_X \equiv h^{1/\Delta}$ is called the crossover temperature. When $h \gg t^{\Delta}$, we instead see the limit $F_\pm(x) \overset{x\to\infty}{\sim} x^{1+\frac1\delta}$, which gives $c_V \sim t^{-\alpha - \Delta(1+1/\delta)}$, a completely ‘wrong’ critical exponent:
$$c_V \sim \begin{cases} t^{-\alpha}, & t_X \ll t \ll 1\,,\\ t^{-\alpha-\Delta(1+1/\delta)}, & t \ll t_X\,.\end{cases}$$
So watch out for residual magnetic field when trying to measure critical exponents.
O(3) → O(2) × Z₂. [Here I am really just re-typing Cardy §4.2] Above we considered a Z₂-symmetric fixed point, perturbed by something which broke all the symmetry. Consider instead an O(3)-symmetric fixed point which breaks the symmetry down to O(2) × Z₂. A definite lattice model is provided by 3-component rotors $\vec S_i = (S_i^x, S_i^y, S_i^z)$ with $\vec S_i\cdot\vec S_i = 1$ on a lattice in 2 < d < 4, with
$$-H = \sum_{ij} J_{ij}\,\vec S_i\cdot\vec S_j + D\sum_i \left(S_i^z\right)^2.$$
The O(3)-symmetric fixed point is named after Heisenberg. The Landau theory looks like
$$\mathcal L = \frac12\left(\vec\nabla\phi\right)^2 + \frac12\,t\,\phi^a\phi^a + u\left(\phi^a\phi^a\right)^2 + D\,\phi_z^2.$$
Near D = 0, $T = T_c^H$, let $t = \frac{T - T_c^H}{T_c^H}$, and we can write a scaling function using the scaling near the Heisenberg fixed point:
$$f_s(t, D) = b^{-nd}\,f_s\!\left(tb^{ny_t^H},\ Db^{ny_D}\right) = |t|^{2-\alpha_H}\,\Psi\!\left(D\,|t|^{-y_D/y_t^H}\right) \tag{7.11}$$
where $y_t^H$ is the dimension of t as a perturbation of the Heisenberg fixed point, H, and $y_D > 0$ is the dimension of D, which is relevant at that fixed point. In the second expression we did the familiar step of choosing $tb^{ny_t^H}$ = some order-one number. $\phi \equiv y_D/y_t^H$ is called the crossover exponent; notice that it is determined by data of the Heisenberg (UV) fixed point.
So at D = 0, $c_V \sim \partial_t^2 f_s \sim |t|^{-\alpha_H}$ where $\alpha_H$ is the specific heat exponent at H. The crossover happens when $D|t|^{-\phi} \sim 1$, i.e. $|t| \sim t_X \equiv D^{1/\phi}$, the crossover temperature. What happens for larger D?
Here’s a piece of physics input: for $|t| \ll t_X$, we should see Ising behavior, $c_V \sim |t|^{-\alpha_I}$. This is because the trajectory (A in the figure above) will spend a long time near the Ising fixed point. This input constrains the scaling function Ψ in (7.11):
$$c_V \sim |t|^{-\alpha_H}\,\Psi\!\left(D|t|^{-\phi}\right) = D^{-\alpha_H/\phi}\left(D|t|^{-\phi}\right)^{\alpha_H/\phi}\Psi\!\left(D|t|^{-\phi}\right) \equiv D^{-\alpha_H/\phi}\,\tilde\Psi\!\left(tD^{-1/\phi}\right)$$
where $\tilde\Psi$ is another scaling function. Demanding the Ising singularity at $t_c(D)$, we have
$$c_V \overset{!}{\sim} A(D)\left(t - t_c(D)\right)^{-\alpha_I}$$
which requires
$$\tilde\Psi\!\left(tD^{-1/\phi}\right) \sim a\left(tD^{-1/\phi} - b\right)^{-\alpha_I}, \qquad (a, b \text{ constants})$$
Finite-size scaling. The system size L enters through the combination L = Na, where N is the number of sites on a side. An RG step replaces a → ba, holding L fixed. Therefore N → N/b. Assuming that N does not appear explicitly in the RG map (this is violated by long-range interactions), we can add the parameter N to the list of arguments of the singular free energy density:
$$f_s\!\left(t, N^{-1}\right) = b^{-d}\,f_s\!\left(tb^{y_t},\ bN^{-1}\right),$$
i.e. $N^{-1}$ acts just like a relevant parameter with dimension $y_N = +1$. Goldenfeld gives a nice definition: a relevant parameter is one that an experimenter has to adjust to reach the critical point; the inverse system size $N^{-1}$ certainly must be adjusted to zero to have any critical behavior.
Then, for simplicity at h = 0, we can write a scaling function by the hopefully-standard-by-now trick of running b to an appropriate value, $b = |t|^{-1/y_t}$:
$$f_s\!\left(t, N^{-1}\right) = |t|^{2-\alpha}\,F\!\left(N^{-1}|t|^{-\nu}\right).$$
At fixed L, the argument of F blows up as t → 0. To understand this regime, let’s rewrite the scaling function so that the analytic dependence on t is manifest:
$$f_s\!\left(t, L^{-1}\right) = L^{-(2-\alpha)/\nu}\,\tilde F\!\left(tL^{1/\nu}\right),$$
where $\tilde F$ is a new scaling function. The fact that N is finite means that $\tilde F(x)$ is analytic in its argument. This leads to powerful conclusions. In particular, it enormously constrains the functions which blow up at a critical point, such as the susceptibility $\chi_T(t, L^{-1}) \sim \partial_h^2 f_s \sim L^{\gamma/\nu}\,\psi\!\left(tL^{1/\nu}\right)$, or the correlation length ξ, or the specific heat $c_V \sim L^{\alpha/\nu}\,\tilde F''\!\left(tL^{1/\nu}\right)$.
Instead of a divergence, each of these functions will have
a maximum at some value of t, determined entirely by the
scaling function F̃ 00 – say x0 is the location of the maximum
of F̃ 00 (x). This means that the location of the peak in t is t0 =
x0 /L1/ν ∼ L−1/ν – we know its dependence on system size!
Furthermore, the height of the (finite!) peak as a function of
L is determined by the prefactor Lα/ν .
$$\xi\!\left(t, N^{-1}\right) = b\,\xi\!\left(tb^{y_t},\ bN^{-1}\right) \tag{7.14}$$
$$= t^{-\nu}\,F_\xi\!\left(N^{-1}t^{-\nu}\right) \tag{7.15}$$
$$= t^{-\nu}\left(Lt^{\nu}\right)\tilde F\!\left(tL^{1/\nu}\right) \tag{7.16}$$
$$= L\,\tilde F\!\left(tL^{1/\nu}\right). \tag{7.17}$$
For L → ∞ at fixed t ≪ 1, $\xi \sim t^{-\nu}$ requires $\tilde F(x) \overset{x\to\infty}{\sim} x^{-\nu}$. For fixed L, t → 0, we have ξ ∼ L, and $\tilde F(x) = A + Bx + \cdots$ is analytic near 0. But this implies that
$$\frac{\xi\!\left(t, L^{-1}\right)}{L} = A + BtL^{1/\nu} + \cdots$$
so that at the critical value of the couplings, K, for any L, this is $\xi(0, L^{-1})/L = A$. If we plot ξ/L as a function of K for various L, all the curves will cross at the critical coupling.
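The crossing construction can be illustrated with synthetic data built from an assumed analytic scaling function (a sketch; both the function and the value ν = 0.63 are made up for illustration):

```python
# xi(t, L)/L = F(t * L**(1/nu)) with an assumed smooth scaling function F.
nu = 0.63
def F(x):
    return 0.9 - 0.4*x - 0.05*x**3

def xi_over_L(t, L):
    return F(t * L**(1/nu))

def crossing(L1, L2, lo=-0.1, hi=0.1):
    """Bisect for the t at which the curves xi/L for sizes L1 and L2 meet."""
    def g(t):
        return xi_over_L(t, L1) - xi_over_L(t, L2)
    for _ in range(100):
        mid = 0.5*(lo + hi)
        if g(lo)*g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5*(lo + hi)

for pair in [(8, 16), (16, 32), (8, 32)]:
    assert abs(crossing(*pair)) < 1e-6   # every pair of sizes crosses at t = 0
```

Because the L-dependence enters only through $tL^{1/\nu}$, all pairs of system sizes agree at t = 0, which is how the critical coupling is located in practice.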
A similar argument applies to the magnetization (which is easier to measure):
$$M\!\left(t, h = 0, L^{-1}\right) = t^{\frac{d-y_h}{y_t}}\,F_M\!\left(t^{-\nu}L^{-1}\right) = L^{-d+y_h}\,\tilde F_M\!\left(tL^{y_t}\right)$$
where $\tilde F_M(y) = A + By + \cdots$ is analytic near y = 0, since this is the finite-size limit. Thus
$$\frac{M\!\left(t, L^{-1}\right)}{L^{y_h-d}} = A + BtL^{y_t} + \cdots$$
– the curves will cross at the critical coupling $K = K_c$ (i.e. t = 0).
This allows us to determine the critical value of the coupling, and $y_h$. (Probably, keeping track of the leading irrelevant operator is a good idea.) We can then determine $\nu = 1/y_t$ as well by
$$\partial_K\,\frac{M}{L^{y_h-d}} \sim B\,L^{y_t},$$
i.e. $\log(\text{LHS}) = \log B + \frac1\nu\log L$.
• The finite-size scaling analysis (up to (7.13)) applies equally well if the system
geometry is L × L × ∞ or L × ∞ × ∞. The difference is that the scaling function
F 00 will be different. In the former case, the system is effectively one-dimensional,
and F 00 is still smooth. It can be determined by a transfer matrix calculation.
In the latter case, F 00 is determined by the critical behavior of an auxiliary 2d
system. Cardy has a bit more detail on this point.
• This account of scaling and scaling functions is completely ahistorical. The idea
of data collapse was known and used long before it was justified by the RG in
the way I’ve described.
8 Operator product expansion
Consider a correlation function of the form $\langle\phi_i(x_1)\phi_j(x_2)\,\Phi\rangle$, where {Φ} is a collection of other local operators at locations $\{x_l\}$; suppose that the two operators we’ve picked out are closer to each other than to any of the others: $|x_1 - x_2| \ll |x_1 - x_l|,\ \forall l$. Then from the point of view of the collection Φ, $\phi_i\phi_j$ looks like a single local operator. But which one? Well, it looks like some sum over all of them:
$$\langle\phi_i(x_1)\phi_j(x_2)\,\Phi\rangle = \sum_k C_{ijk}(x_1 - x_2)\,\langle\phi_k(x_1)\,\Phi\rangle$$
where $\{\phi_k\}$ is some basis of local operators. By Taylor expanding, we can move all the space-dependence of the operators to one point, e.g.:
$$\phi(x_2) = e^{(x_2-x_1)^\mu\,\partial_{x_1^\mu}}\,\phi(x_1) = \phi(x_1) + (x_2 - x_1)^\mu\,\partial_\mu\phi(x_1) + \cdots.$$
The result is the operator product expansion (OPE),
$$\phi_i(x_1)\,\phi_j(x_2) \sim \sum_k C_{ijk}(x_1 - x_2)\,\phi_k(x_1),$$
which is to be understood as an operator equation: true for all states, but only up to collisions with other operator insertions (hence the ∼ rather than =).
This is an attractive concept, but is useless unless we can find a good basis of local operators. At a fixed point of the RG, it becomes much more useful, because of scale invariance. This means that we can organize our operators according to their scaling dimension. Roughly it means two wonderful simplifications:
• The two-point functions of scaling operators can be diagonalized and normalized: for scalar operators,
$$\langle\mathcal O_i(x_1)\,\mathcal O_j(x_2)\rangle = \frac{\delta_{ij}}{|x_1 - x_2|^{2\Delta_i}}. \tag{8.2}$$
• Further, the form of $C_{ijk}$ is fixed up to a number. Again for scalar operators,
$$\mathcal O_i(x_1)\,\mathcal O_j(x_2) \sim \sum_k \frac{c_{ijk}}{|x_1 - x_2|^{\Delta_i + \Delta_j - \Delta_k}}\,\mathcal O_k(x_1) \tag{8.3}$$
where $c_{ijk}$ is now a set of pure numbers, the OPE coefficients (or structure constants).
The structure constants are universal data about the fixed point: they transcend perturbation theory. How do I know this? Because they can be computed from correlation functions of scaling operators at the fixed point: multiply the BHS of (8.3) by $\mathcal O_k(x_3)$ and take the expectation value at the fixed point:
$$\langle\mathcal O_i(x_1)\mathcal O_j(x_2)\mathcal O_k(x_3)\rangle_\star \overset{(8.3)}{=} \sum_{k'}\frac{c_{ijk'}}{|x_1-x_2|^{\Delta_i+\Delta_j-\Delta_{k'}}}\,\langle\mathcal O_{k'}(x_1)\mathcal O_k(x_3)\rangle_\star \overset{(8.2)}{=} \frac{c_{ijk}}{|x_1-x_2|^{\Delta_i+\Delta_j-\Delta_k}}\,\frac{1}{|x_1-x_3|^{2\Delta_k}}. \tag{8.4}$$
(There is a better way to organize the RHS here, but let me not worry about that here.) The point here is that by evaluating the LHS at the fixed point, with some known positions $x_{1,2,3}$, we can extract $c_{ijk}$.
Confession: I (and Cardy) have used a tiny little extra assumption of conformal
invariance to help constrain the situation here. It is difficult to have scale invariance
without conformal invariance, so this is not a big loss of generality. We can say more
about this later but for now it is a distraction.
Conformal perturbation theory. Suppose we find a fixed point of the RG, H? .
(For example, it could be the gaussian fixed point of N scalar fields.) Let us study its
neighborhood. (For example, we could seek out the nearby interacting Wilson-Fisher
fixed point in D < 4 in this way.) For definiteness and simplicity let’s think about the
equilibrium partition function
Z = tre−H
– we set the temperature equal to 1 and include it in the couplings, so H is dimensionless. We can parametrize it as
$$H = H_\star + \sum_x\sum_i g_i\,a^{\Delta_i}\,\mathcal O_i(x) \tag{8.5}$$
where a is the short distance cutoff (e.g. the lattice spacing), and Oi has dimensions of
length$^{-\Delta_i}$, as you can check from (8.2). So $g_i$ are de-dimensionalized couplings which we will treat as small and expand in.³²
Then
$$Z = \underbrace{Z_\star}_{\equiv\,{\rm tr}\,e^{-H_\star}}\left\langle e^{-\sum_x\sum_i g_i a^{\Delta_i}\mathcal O_i(x)}\right\rangle_\star$$
$$\overset{\sum_x \simeq \frac{1}{a^d}\int d^dx}{\simeq} Z_\star\Bigg(1 - \sum_i g_i\int\frac{d^dx}{a^{d-\Delta_i}}\,\langle\mathcal O_i(x)\rangle_\star + \frac12\sum_{ij}g_ig_j\int\frac{d^dx_1\,d^dx_2}{a^{2d-\Delta_i-\Delta_j}}\,\langle\mathcal O_i(x_1)\mathcal O_j(x_2)\rangle_\star - \frac{1}{3!}\sum_{ijk}g_ig_jg_k\int\frac{\prod_{a=1}^3 d^dx_a}{a^{3d-\Delta_i-\Delta_j-\Delta_k}}\,\langle\mathcal O_i(x_1)\mathcal O_j(x_2)\mathcal O_k(x_3)\rangle_\star + \cdots\Bigg).$$
Comments:
32
Don’t be put off by the word ‘conformal’ in the name ‘conformal perturbation theory’ – it just
means doing perturbation theory about a general fixed point, not necessarily the gaussian one.
• We used the fact that near the fixed point, the correlation length is much larger than the lattice spacing to replace $\sum_x \simeq \frac{1}{a^d}\int d^dx$.
• There is still a UV cutoff on all the integrals – the operators can’t get within a
lattice spacing of each other: |xi − xj | > a.
• The integrals over space are also IR divergent; we cut this off by putting the
whole story in a big box of size L. This is a physical size which should be
RG-independent.
• The structure of this expansion does not require the initial fixed point to be a
free fixed point; it merely requires us to be able to say something about the
correlation functions. As we will see, the OPE structure constants cijk are quite
enough to learn something.
Now let’s do the RG dance. We’ll take the high-energy point of view here: while preserving Z, we make an infinitesimal change of the cutoff,
$$a \to ba = (1+\delta\ell)\,a, \qquad 0 < \delta\ell \ll 1\,.$$
The price for preserving Z is letting the couplings run, $g_i = g_i(b)$. Where does a appear?
(1) in the integration measure factors $a^{d-\Delta_i}$.
(2) in the cutoffs on $\int d^dx_1\,d^dx_2$ which enforce $|x_1 - x_2| > a$.
(3) not in the IR cutoff – L is fixed during the RG transformation, independent of b.
The leading-in-δℓ effects of (1) and (2) are additive and so may be considered separately:
$$(1)\colon\quad \tilde g_i = (1+\delta\ell)^{d-\Delta_i}\,g_i \simeq g_i + (d-\Delta_i)\,g_i\,\delta\ell \equiv g_i + \delta_1 g_i\,.$$
The effect of (2) first appears in the O(g²) term, the change in which is
$$(2)\colon\quad \frac12\sum_{ij}g_ig_j\int_{|x_1-x_2|\in(a,\,a(1+\delta\ell))}\frac{d^dx_1\,d^dx_2}{a^{2d-\Delta_i-\Delta_j}}\underbrace{\langle\mathcal O_i(x_1)\mathcal O_j(x_2)\rangle_\star}_{=\sum_k c_{ijk}|x_1-x_2|^{\Delta_k-\Delta_i-\Delta_j}\langle\mathcal O_k\rangle_\star} = \frac{\delta\ell}{2}\,\Omega_{d-1}\sum_{ijk}g_ig_j\,c_{ijk}\int\frac{d^dx}{a^{d-\Delta_k}}\,\langle\mathcal O_k(x)\rangle_\star\,,$$
where the O(g³) terms come from triple collisions which we haven’t considered here. Therefore we arrive at the following expression for the evolution of the couplings, $\frac{dg}{d\ell} = (\delta_1 g + \delta_2 g)/\delta\ell$:
$$\frac{dg_k}{d\ell} = (d-\Delta_k)\,g_k - \frac12\,\Omega_{d-1}\sum_{ij}c_{ijk}\,g_ig_j + O(g^3)\,. \tag{8.6}$$
At g = 0, the linearized solution is $dg_k/g_k = (d-\Delta_k)\,d\ell \implies g_k \sim e^{(d-\Delta_k)\ell}$, which translates our understanding of relevant and irrelevant at the initial fixed point into terms of the scaling dimensions $\Delta_k$: $g_k$ is relevant if $\Delta_k < d$.³³
(8.6) says that to find the interaction bit of the beta function for $g_k$, we look at all the OPEs between operators in the perturbed hamiltonian (8.5) which produce $\mathcal O_k$ on the RHS.
Let’s reconsider the Ising model from this point of view:
$$H = -\frac12\sum_{x,x'}J(x-x')\,S(x)S(x') - h\sum_x S(x)$$
$$\simeq -\frac12\sum_{x,x'}J(x-x')\,S(x)S(x') - h\sum_x S(x) + \lambda\sum_x\left(S(x)^2 - 1\right)^2$$
$$\simeq \int d^dx\left(\frac12\left(\vec\nabla\phi\right)^2 + r_0\,a^{-2}\phi^2 + u_0\,a^{d-4}\phi^4 + h\,a^{-1-d/2}\phi\right) \tag{8.7}$$
In the first step I wrote a lattice model of spins S = ±1; in the second step I used the freedom imparted by universality to relax the S = ±1 constraint, and replace it with a potential which merely discourages other values of S; in the final step we took a continuum limit.
In (8.7) I’ve temporarily included a Zeeman-field term hS which breaks the φ → −φ symmetry. If we set it to zero, it stays zero (i.e. it will not be generated by the RG) because of the symmetry. This situation is called technically natural.
Now, consider for example as our starting fixed point the Gaussian fixed point, with
$$H_{\star,0} = \int d^dx\,\frac12\left(\vec\nabla\phi\right)^2.$$
Since this is quadratic in φ, all the correlation functions (and hence the OPEs, which we’ll write below) are determined by Wick contractions using
$$\langle\phi(x_1)\phi(x_2)\rangle_{\star,0} = \frac{N}{|x_1 - x_2|^{d-2}}.$$
³³ In the preceding discussion we considered the partition function Z. If you look carefully you will see that in fact it was not really necessary to take the expectation values $\langle\ \rangle_\star$ to obtain the result (8.6). Because the OPE is an operator equation, we can just consider the running of the operator $e^{-H}$
and the calculation is identical. A reason you might consider doing this instead is that expectation
values of scaling operators on the plane actually vanish hOi (x)i? = 0. However, if we consider the
partition function in finite volume (say on a torus of side length L), then the expectation values
of scaling operators are not zero. You can check these statements explicitly for the normal-ordered
operators at the gaussian fixed point introduced below. Thanks to Sridip Pal for bringing these issues
to my attention.
It is convenient to rescale the couplings of the perturbing operators by $g_i \to \frac{2}{\Omega_{d-1}}\,g_i$ to remove the annoying $\Omega_{d-1}/2$ factor from the beta function equation. Then the RG equations (8.6) say
$$\frac{dh}{d\ell} = \left(1+\frac d2\right)h - \sum_{ij}c_{ijh}\,g_ig_j\,, \qquad \frac{dr_0}{d\ell} = 2r_0 - \sum_{ij}c_{ijr_0}\,g_ig_j\,, \qquad \frac{du_0}{d\ell} = \epsilon\,u_0 - \sum_{ij}c_{iju_0}\,g_ig_j\,.$$
So we just need to know a few numbers, which we can compute by doing Wick contractions with free fields.
The good basis is given by the normal-ordered operators, with the self-contractions subtracted:
$$\mathcal O_n \equiv\ :\!\phi^n\!:\ = \phi^n - (\text{self-contractions}),$$
e.g.
$$\mathcal O_2 = \phi^2 - \langle\phi^2\rangle, \qquad \mathcal O_4 = \phi^4 - 6\langle\phi^2\rangle\phi^2 + \langle\phi^4\rangle. \tag{8.10}$$
To see where the coefficients come from, write $\mathcal O_4 = \phi^4 + a\langle\phi^2\rangle\phi^2 + b\langle\phi^4\rangle$, and let $G_0 \equiv \langle\phi^2\rangle$, $G_x \equiv \langle\phi(x)\phi(0)\rangle$. First we demand
$$0 \overset{!}{=} \langle\mathcal O_4\rangle = 3G_0^2 + aG_0^2 + 3bG_0^2,$$
which requires $0 = 3 + a + 3b$. The next demand is that
$$0 \overset{!}{=} \langle\mathcal O_4(x)\mathcal O_2(0)\rangle = \left(\langle\phi^4(x)\phi^2(0)\rangle - \langle\phi^4\rangle G_0\right) + aG_0\left(\langle\phi^2(x)\phi^2(0)\rangle - G_0^2\right) + 3bG_0^2\,\underbrace{\langle\mathcal O_2(0)\rangle}_{=0}$$
$$= 3G_0^3 + 12G_x^2G_0 - 3G_0^3 + a\left(2G_x^2G_0 + G_0^3 - G_0^3\right) = G_x^2G_0\,(12 + 2a),$$
which requires a = −6, and hence b = +1. Notice, however, that this changes nothing about the operational definition (omit self-contractions). Thanks to Aria Yom for questioning the expression in (8.10).
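These coefficients can be checked by brute-force Wick enumeration. A minimal sketch: track, for each pairing, how many contractions are at coincident points (a factor $G_0$) versus between x and 0 (a factor $G_x$); the labels 'x', '0' and the symbolic propagators are bookkeeping devices for illustration:

```python
def pairings(idx):
    """All perfect pairings (Wick contractions) of a list of field labels."""
    if not idx:
        yield []
        return
    first, rest = idx[0], idx[1:]
    for i in range(len(rest)):
        for tail in pairings(rest[:i] + rest[i+1:]):
            yield [(first, rest[i])] + tail

def correlator(points):
    """<phi(p1)...phi(pm)>_0 as {(power of G0, power of Gx): multiplicity}."""
    out = {}
    for p in pairings(list(range(len(points)))):
        g0 = sum(1 for i, j in p if points[i] == points[j])  # coincident pair
        key = (g0, len(p) - g0)
        out[key] = out.get(key, 0) + 1
    return out

assert correlator(['x']*4) == {(2, 0): 3}                       # <phi^4> = 3 G0^2
assert correlator(['x', 'x', '0', '0']) == {(2, 0): 1, (0, 2): 2}
assert correlator(['x']*4 + ['0']*2) == {(3, 0): 3, (1, 2): 12}

# Check <O4(x) O2(0)> = 0 for O4 = phi^4 - 6 G0 phi^2 + 3 G0^2, numerically
# (the constant piece of O4 drops against <O2> = 0):
G0, Gx = 0.7, 0.3              # arbitrary test values of the two propagators
def ev(c):
    return sum(m * G0**a * Gx**b for (a, b), m in c.items())
val = (ev(correlator(['x']*4 + ['0']*2)) - G0*ev(correlator(['x']*4))
       - 6*G0*(ev(correlator(['x', 'x', '0', '0'])) - G0*ev(correlator(['x']*2))))
assert abs(val) < 1e-12
```

With any coefficient other than a = −6 the combination $G_x^2 G_0(12+2a)$ survives, exactly as in the hand computation.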
To compute their OPEs, we consider correlators of the form above. Notice that the symmetric operators (the ones we might add to the action preserving the symmetry) form a closed subalgebra of the operator algebra.
Linearizing the RG flow about the new fixed point,
$$\frac{dr_0}{d\ell} = 2r_0 - 24\,u_0^\star\,r_0 + \cdots$$
gives
$$\frac{dr_0}{r_0} = \left(2 - \frac{24}{72}\,\epsilon\right)d\ell \implies r_0 \sim e^{\left(2-\frac{24}{72}\epsilon\right)\ell} \equiv e^{\ell/\nu},$$
which gives $\nu = \frac12 + \frac{\epsilon}{12} + O(\epsilon^2)$.
[End of Lecture 17]
9 Lower dimensions and continuous symmetries
[Cardy §6, Goldenfeld §11]
Mean field theory gets better as the number of dimensions grows, so naturally it gets
worse when the number of dimensions shrinks. For low enough d, the fluctuations
completely destroy the order at any finite temperature. For the Ising model, this lower
critical dimension is d = 1; that is, Tc is zero for an Ising chain. Recall our (Peierls’)
understanding of this: if we fix the spins to be up at one end of the chain, then the
free energy cost for making a region of down spins is
$$\Delta F_1 = E - TS \simeq 4J - 2T\ln L$$
where L is the system size – each of the two domain walls can be in any of L places. For any T > 0, for large enough L, $\Delta F_1$ is negative (hence favorable). In contrast, in d = 2, the energy of the domain wall is $2JL^{d-1} = 2JL$, while the entropy is of order $\ln\mu^L = L\ln\mu$ (the domain wall is a self-avoiding but closed random walk of length of order L; at each of ∼ L steps it has of order µ = z − 1 choices of direction to go), so
$$\Delta F_2 \sim L\left(2J - T\ln\mu\right)$$
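Numerically, the d = 1 estimate changes sign at $L_\star = e^{2J/T}$, which makes the entropy-versus-energy competition concrete (a sketch):

```python
import math

def delta_F(J, T, L):
    """Peierls estimate: free energy cost of one flipped domain in the Ising chain."""
    return 4*J - 2*T*math.log(L)

J, T = 1.0, 0.1
L_star = math.exp(2*J/T)              # sign change at L* = e^(2J/T) ~ 4.9e8 sites
assert delta_F(J, T, 0.9*L_star) > 0  # shorter chains: the domain costs free energy
assert delta_F(J, T, 1.1*L_star) < 0  # longer chains: entropy wins at any T > 0
```

However large $L_\star$ is at low temperature, it is finite for any T > 0, so the thermodynamic limit always disorders the chain.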
doesn’t change the energy. Translating different parts of the solid by slightly different
amounts will therefore cost a small energy, proportional to the gradient of the trans-
lation. The excitations of the solid therefore include large-correlation-length modes
(Goldstone bosons) ~ui (x) which appear in the energy only through their derivatives.
When experiencing such an elastic deformation, the solid will exert a restoring force, encoded in the energy functional for u by terms like $\int\partial_i u_j\,C^{ijkl}\,\partial_k u_l$ (analogous to $K\int(\vec\nabla\theta)^2$). The fact that a solid is rigid is a consequence of spontaneous symmetry
breaking, and this concept of rigidity generalizes to other cases of SSB. We’ll have more
to say about the stiffness of magnets.
Hohenberg-Mermin-Wagner-Coleman Theorem. Consider an O(n) model, with n-component rotors $\vec S_r$ at each site, $\vec S_r\cdot\vec S_r = 1,\ \forall r$. If the system orders, we can write the spin as $\vec S_r = \left(\sqrt{1-\sigma_r^2},\ \vec\sigma_r\right)$, where $\vec\sigma_r$ is an (n−1)-component vector pointing transversely to the ordering direction, describing the fluctuations about a particular ordered state – we will assume $\sigma^2 \ll 1$. The action for these fluctuations (known as spin waves in the context of magnetism) is
$$S = -\frac{1}{2T}\sum_{rr'}J_{rr'}\,\vec\sigma_r\cdot\vec\sigma_{r'} \simeq \text{const} + \frac{K}{2}\int d^dr\left(\vec\nabla\sigma\right)^2 + O\!\left(\nabla\sigma^4\right). \tag{9.1}$$
And now here’s the crucial point: in d = 2, this fluctuation correction to the magnetization goes like $\int\frac{\bar d^2k}{k^2} \sim \ln(L\Lambda)$. It diverges with system size, which clearly means it’s not a small correction to the leading term. (Notice that the form of the integrand is not exactly correct far from k = 0 in the Brillouin zone, but it is the infrared divergence at k = 0 which is the story here.) The singularity from the long-wavelength fluctuations is only worse if d < 2. The way out is that our assumption that there was ordering in the first place was wrong in d ≤ 2. We conclude that it is not possible to spontaneously break a continuous symmetry in d ≤ 2. A more proofy proof of this statement is on the homework.
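The logarithmic divergence can be seen directly by summing $1/k^2$ over the discrete momenta of an L × L lattice: doubling L adds $\ln 2/(2\pi)$. A sketch, using the continuum $1/k^2$ on the discrete momentum grid:

```python
import math

def S(L):
    """(1/L^2) * sum over k = 2*pi*n/L in the Brillouin zone, k != 0, of 1/k^2."""
    total = 0.0
    for n1 in range(-L//2, L//2):
        for n2 in range(-L//2, L//2):
            if n1 == 0 and n2 == 0:
                continue                        # omit the would-be zero mode
            total += 1.0 / ((2*math.pi*n1/L)**2 + (2*math.pi*n2/L)**2)
    return total / L**2

# S(L) ~ (1/2pi) ln L + const, so doubling L adds ln(2)/(2pi) ~ 0.110:
diff = S(128) - S(64)
assert abs(diff - math.log(2)/(2*math.pi)) < 0.01
```

No matter how small the prefactor 1/K multiplying this sum, the correction eventually exceeds any assumed small $\langle\sigma^2\rangle$ as L grows, which is the content of the theorem.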
In the second step, we focussed on the universal physics by considering $u_0$ large, with fixed $r_0/u_0$. This has the effect of making the longitudinal excitations very costly – the walls of the potential become very steep about the circle of minima. Writing
$$\Phi(r) = e^{i\theta(r)}\left(\sqrt{\frac{r_0}{u_0}} + \delta(r)\right),$$
the longitudinal excitation δ(r) is very hard to excite, and we can forget about it. We defined $K = \frac{r_0}{u_0}$, but recall that the overall coefficient of the action is J/T, and this is what determines K. The angular variable θ is the Goldstone mode – it only appears in the action via its derivatives.
The spin Green's function is
G(r) = ⟨S(r)·S(0)⟩ ≈ ⟨ e^{i(θ(r)−θ(0))} ⟩ = e^{−(1/2)⟨(θ(r)−θ(0))²⟩},
where we used Wick’s theorem in the last step. This correlation function of the Gold-
stones is
⟨(θ(r) − θ(0))²⟩ = 2 ( ⟨θ(0)²⟩ − ⟨θ(r)θ(0)⟩ )   (9.7)
= 2 ∫₀^{Λ=1/a} d̄²k ⟨|θ̃_k|²⟩ (1 − e^{ikr}) ≃ (1/(2πK)) log(r/a)   for r ≫ a,   (9.8)
using ⟨|θ̃_k|²⟩ = 1/(K k²).
Therefore
G(r) = r^{−η},   η = 1/(2πK) = T/(2πJ).   (9.9)
At the last step, we restored the factor of 1/T in the action. Important comments:
• η is indeed the anomalous dimension of the spin operator, defined as usual for a
critical theory by G(r) ∼ r^{2−d−η}. But here we are not at a critical point: this behavior occurs everywhere in a whole phase.
On the other hand, at high temperatures, we know that the correlations must be short-ranged (for example, using the (convergent!) high-temperature expansion): for T ≫ J, G(r) ∼ e^{−r/ξ}. The distinction between these two asymptotic behaviors of G(r) is sharp, and they represent different phases.
The low-temperature phase is consistent with the Mermin-Wagner theorem – there is no disconnected piece: G = G_connected. It is called algebraic order or
quasi-long-range order. In between there must be a phase transition of some kind,
which we will understand below.
• And indeed, the exponent η varies with K and hence with temperature! K is an
exactly marginal perturbation of a scale-invariant theory– it parametrizes a line
of (different!) fixed points.
Why isn’t (9.9) an exact statement for all temperatures? In our computation of
(9.9) we neglected the important fact that θ ≃ θ + 2π: θ is compact. This means that in
addition to the smooth configurations which lead to (9.9), there are other, topologically
distinct, configurations where as we move around in a loop in space, θ wanders around
on the circle, and only returns to itself up to a multiple of 2π. That is, there can be
configurations of θ(r) and loops C for which
∮_C dr · ∇θ = 2πn,   n ∈ ℤ.   (9.10)
We say that the loop C encloses n vortices. The presence of a vortex is topological
because the winding number n is an integer, which therefore cannot vary continuously.
(9.10) says that |∇θ| ∼ 1/r, from which we can estimate that the energy of a vortex is
E_one vortex = (K/2) ∫ d²r (∇θ)² = πJ log(L/a)
where L is the system size. Notice that this diverges in the thermodynamic limit: a
net number of vortices is not a finite energy configuration. To have finite energy, the
largest loops must contain a net number zero of vortices. However smaller regions may
contain vortices (n > 0) and antivortices (n < 0). The energy of a vortex-antivortex
pair (a vortex dipole) separated by distance R is
E_{v v̄} ≃ ∫_a^R dr/r ∼ log(R/a),
finite in the thermodynamic limit. We estimated above the energy of a single vortex,
but what is its free energy? Its entropy is
S_one vortex = log( # of possible locations ) = log (L/a)²
so that
F_one vortex = E_one vortex − T S_one vortex = (πJ − 2T) log(L/a) → { +∞, T < πJ/2 ;  −∞, T > πJ/2 ≡ T_KT }.
This gives us an estimate (it turns out to be exactly correct) for the transition tem-
perature between the low-temperature, algebraically-ordered phase, and the high-
temperature disordered phase.
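The sign change in the single-vortex free energy is trivial to check numerically; here is a minimal Python restatement of the estimate (J = a = 1 are arbitrary choices):

```python
import math

# F_one_vortex = (pi*J - 2*T) * ln(L/a) changes sign at T_KT = pi*J/2,
# and its magnitude grows with system size on both sides of the transition.
def f_one_vortex(T, L, J=1.0):
    return (math.pi * J - 2 * T) * math.log(L)

T_KT = math.pi / 2
F_below = [f_one_vortex(0.9 * T_KT, L) for L in (10, 1e3, 1e6)]  # positive: vortices suppressed
F_above = [f_one_vortex(1.1 * T_KT, L) for L in (10, 1e3, 1e6)]  # negative: vortices proliferate
```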
[End of Lecture 18]
KT transition. To give a more quantitative account of the transition, we must
explicitly include the vortices in our calculation. To this end we deform our theory of
the Goldstone field θ by introducing a fugacity for vortices – adding a vortex lowers
the energy by y0 . Formally we can do this by changing the action to
S = ∫ d²x (K/2)(∇θ)² − y₀ ∫ (d²x/a²) ( V(x) + V†(x) ).
In this expression V (x) is an operator which creates a vortex at position x, and V † (x)
creates an antivortex at x. V (x) is an example of a disorder operator – it is defined by
its effects on the spins: for example
⟨V(x) ···⟩₀ ≡ ∫_{θ with ∮_{C_x} dr·∇θ = 2π} [Dθ] e^{−S[θ]} ···
where C_x is any small curve encircling the point x. (Here ⟨···⟩₀ denotes an expectation value in the theory with y₀ = 0.) By some cleverness (following Cardy), we will figure out what we need without finding an explicit expression for V; such an expression can be found as part of a duality map (see Cardy §3.3 or Herbut §6.3). Notice that an expectation value with only a single vortex will be ⟨V(x)···⟩₀ ∝ e^{−E₁/T} ∼ e^{−πK log(L/a)} → 0 as L → ∞, but expectations with zero total vortex number, such as ⟨V(x)V†(0)···⟩₀, will be finite.
Granting this starting point, the partition sum is now a function of two variables:
Z(K, y₀) = ⟨ e^{y₀ ∫ (V + V†)} ⟩₀   (9.11)
= 1 + y₀ ∫ ⟨V + V†⟩₀ + (y₀²/2) ∫∫ ⟨(V + V†)(V + V†)⟩₀ + ···   (9.12)
= 1 + y₀² ∫ (d²r₁ d²r₂ / a⁴) ⟨V(r₁)V†(r₂)⟩₀ + ···   (9.13)
The terms with odd powers of y0 vanish by the fact that they have a net number of
vortices, and therefore e−E = e−∞ = 0, zero Boltzmann weight.
In the last line (9.13), the vortex-antivortex correlator ⟨V(r₁)V†(r₂)⟩₀ ≡ e^{−E(r₁,r₂)} is just the partition function for the spin waves in the presence of a vortex at r₁ and an antivortex at r₂. We can find the resulting free energy E(r₁, r₂) by saddle point (since the integral over smooth configurations of θ is gaussian), i.e. just solve the equations of motion
∇²θ = 0   (9.14)
(away from the vortices) with boundary conditions demanding the appropriate winding number around r₁, r₂. The solution is
θ(r) = Θ(r − r₁) − Θ(r − r₂),   where Θ(r) ≡ the angle between r and x̂,
and plugging it back into the action gives E(r₁, r₂) ≃ 2πK log(r₁₂/a) + const, with r₁₂ ≡ |r₁ − r₂|. The additive constant, which is associated with the energy in the core of the vortices, we can absorb into a rescaling of y₀: y₀ → y₀ e^{−πK C̃} ≡ y. Therefore we conclude that
⟨V(r₁)V†(r₂)⟩₀ = (r₁₂/a)^{−2πK}
– we learn that the scaling dimension of the operator V is ∆_V = πK. The scaling behavior of y is then y(b) = b^{y_V} y(1) with y_V = d − ∆_V = 2 − πK:
β_y = dy/dℓ = (2 − πK) y   (9.15)
– y is irrelevant for x ≡ 2 − πK ∝ T − T_KT < 0, and relevant for x > 0. Here T_KT ≡ πJ/2 as in the estimate above.
To complete the RG equations we need to know how the temperature variable x runs. Either T or x is the coupling associated with the 'energy operator' (∂θ)². We know a few things a priori: if y = 0, it doesn't run. Only even powers of y can appear, since the total number of vortices and antivortices must be zero (hence zero mod 2). Therefore, near x = 0, y = 0, the RG equation for x must have the form
β_x = dx/dℓ = A y² + O(y⁴).   (9.16)
A fancier argument (in Cardy's book) uses the OPE: Comparing (9.15) with our general form in terms of OPE coefficients (8.6), we see that the OPE between V and the energy operator has the form V · (∂θ)² ∼ V + ···. A general fact of CFT (Cardy §11.2) says that the OPE coefficients c_{ijk} are completely symmetric, and this means that we must have V · V* ∼ (∂θ)² + ···. Comparing to (8.6) then implies (9.16).
Equation (9.16) has a nice physical interpretation. Notice that the equation we solved for the behavior of the θ field in response to the vortices, (9.14), is Coulomb's law in d = 2: θ plays the role of the electrostatic potential A₀, the (anti)vortices are ± charges, and K plays the role of the dielectric constant of the medium. The 2πK log r interaction is the d = 2 Coulomb potential ∫ d̄²k e^{ik·r}/k², and χ_V in (9.17) is the polarizability χ_E. The running of x (equivalently K) in (9.16) is dielectric screening of the Coulomb field θ by the charge-anticharge (vortex-antivortex) pairs.
The Coulomb potential here is the same integral as we saw in (9.8):
∫ (d̄²k/k²) (1 − e^{ik·r}) ≃ (1/2π) ln(r/a) + const   for r ≫ a.
The combined RG equations
dy/dℓ = x y,   dx/dℓ = A y²
imply dy/dx = x/(Ay), which integrates to Ay² − x² = a constant determined by the initial conditions. The flow lines in the xy plane are hyperbolae, except for the special initial condition where the constant is zero, which gives the lines y = ±x/√A. The line y = −x/√A is the critical surface of the KT critical point at x = y = 0. Any initial condition above this line flows off to the upper right, to large x and y.
Figure 4: The Kosterlitz-Thouless phase diagram. The red line is the critical surface of the KT fixed
point; to its right is the disordered phase, where the flows end up at large x, y. The thin blue line is
a cartoon of a family of initial conditions for different values of the temperature, including the fact
that the bare value of y goes like e−πK C̃ = e−#/T , and hence is small for small T .
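These flow equations are easy to integrate numerically; the Python sketch below (the choice A = 1 and the initial conditions are arbitrary) checks the conserved quantity Ay² − x² and the two fates of the flow:

```python
def flow(x, y, A=1.0, dl=1e-4, steps=200_000):
    """Explicit-Euler integration of dx/dl = A y^2, dy/dl = x y."""
    for _ in range(steps):
        x, y = x + A * y * y * dl, y + x * y * dl
        if abs(x) > 10 or abs(y) > 10:
            break
    return x, y

# below the separatrix (A y^2 - x^2 = -0.16 < 0): flows onto the fixed line y = 0
x1, y1 = flow(-0.5, 0.3)
# above it (A y^2 - x^2 = +0.24 > 0): runs away to large x, y (the disordered phase)
x2, y2 = flow(-0.5, 0.7)
```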
Infinite polarizability means free charges: it means an arbitrarily small electric field E moves all the ± charges to opposite ends of the sample. By the same token, it means an external charge is completely screened beyond the correlation length ξ. Everywhere in the previous sentences 'charge' means 'vortex'.
Let’s return to the spin stiffness, the free energy cost for twisted boundary con-
ditions. More precisely, consider the system on a torus, and consider the boundary
conditions θ(x, L) = θ(x, 0), θ(L, y) = θ(0, y) + α. We can relate this to the periodic
BC problem by defining
θ(x, y) = θ0 (x, y) − αx/L
where θ₀(x, y) is periodic. In the gaussian approximation, the free energy density is
(K₀/2)(∂θ)² = (K₀/2)(α/L)² + ···
where the · · · comes from the periodic bit, which does not care about α. The spin
stiffness is defined to be
κ ≡ L² ∂²_α f,
which in the gaussian approximation equals K₀.
Including the effects of fluctuations and vortices and all that – in the low-temperature
phase – the stiffness is κ = K(` = ∞), the running coupling evaluated in the far
infrared. This is because, in the low temperature phase, the vortex fugacity flows to
y(∞) = 0, and we return to the gaussian model, with a renormalized coupling K(∞).
Therefore:
T < T_c : κ = K(∞), which varies with T
T = T_c : κ = K(∞) = K_c = 2/π, a universal value
T > T_c : κ = 0
where we know the answer for T > Tc because the finite correlation length means
∂α f ∝ e−L/ξ – the influence of the boundary conditions is short-ranged, and so the
leading bit of the free energy doesn’t care about α.
Screening by vortices. Above I may have made the β function equation for x
seem mysterious. Actually, it can be directly calculated by considering the renormal-
ization of the stiffness, i.e. its screening by vortex-antivortex pairs. Here we go:
[Chaikin-Lubensky §9.4] We’ll compute the running of the stiffness parameter K
by computing the stiffness in the presence of nonzero y: that is, we compute the free
energy in the presence of a uniform gradient of θ:
θ(x) = θ₀(x) + α · x
where α parametrizes the twist around the two directions and θ₀ is periodic. We can further decompose the periodic bit into a smooth piece θ_s, which satisfies 0 = ε_ij ∂_i∂_j θ_s, and a vortex piece, which satisfies ε_ij ∂_i∂_j θ_v = 2πn_v (but 0 = δ_ij ∂_i∂_j θ_v). Then the free energy defines the renormalized stiffness K^R:
F(α) − F(0) ≡ (1/2) L² K^R α²   (9.18)
= −ln tr e^{−H} − F(0)   (9.19)
= (1/2) L² K α² − ln tr ( e^{−H(α=0)} e^{−K α·∫d²x ∇θ₀} ) − F(0)   (9.20)
= (1/2) L² K α² − (K²/2) ∫d²x ∫d²x′ ⟨∂_iθ(x) ∂_jθ(x′)⟩ α_i α_j + O(α⁴).   (9.21)
(Note that we are still working in the convention where T = 1.) In the second line we expanded out S = (K/2) ∫d²x (∂θ₀ + α)². Since ∫d²x ∂_iθ_s(x) = 0 for all configurations, only the vortex part θ_v contributes to the correlator.
Therefore³⁷:
K^R = K − K² ∫ d²x ⟨∂_iθ(x) ∂_iθ(0)⟩   (9.22)
= K − (2πK)² lim_{q→0} ⟨n_v(q) n_v(−q)⟩ / q².   (9.23)
Net vortex neutrality implies ⟨n_v(q)⟩ → 0 as q → 0, which, together with rotation invariance, implies that
⟨n_v(q) n_v(−q)⟩ = (1/2) χ_v q² + O(q⁴)
where χ_v is exactly the vortex polarizability defined above. Therefore
K^R = K − (1/2)(2πK)² y² ∫ (d²x/a²) (|x|/a)^{−2πK+2} + O(y⁴)   (9.24)
= K − (1/2)(2πK)² y² · 2π ∫_a^∞ dr r^{3−2πK} / a^{4−2πK}.   (9.25)
Let's take the high-energy point of view on the RG: we change the cutoff a → ae^ℓ and demand that the physics (K^R) is invariant. This is accomplished by replacing
K → K(ℓ) = K − c y² ∫_a^{ae^ℓ} dr r^{3−2πK} / a^{4−2πK}
(where c > 0 is a constant) and
y → y(ℓ) = y e^{ℓ(2−πK)}.
Differentiating with respect to ℓ reproduces the flow equations:
−π ∂_ℓ K = ∂_ℓ x = A y²   (9.26)
∂_ℓ y = (2 − πK(ℓ)) y(ℓ) + O(y³) = x y.   (9.27)
³⁷ A little bit more detail which justifies the first line in (9.22): Claim 1:
⟨(∂_iθ_v)(q₁)(∂_jθ_v)(q₂)⟩ = f(q)(2π)² δ²(q₁ + q₂) (δ_ij − q̂_i q̂_j) ≡ G_ij.
Claim 2:
(∂_iθ_v)(q)(∂_iθ_v)(−q) = ((2π)²/q²) n_v(q) n_v(−q).
Claim 1 follows from 0 = ∂_i∂_iθ_v, which implies q_{1i}G_ij = q_{2j}G_ij = 0. Translation invariance implies G_ij ∝ δ(q₁ + q₂), and rotation invariance implies G_ij = A(q²)δ_ij + B(q²)q_i q_j.
Claim 2 follows from ε_ij ∂_i∂_jθ_v = 2πn_v, i.e. ∇ × ∇θ_v = 2πn_v ẑ. Taking the curl of the BHS gives ∇ × (∇ × ∇θ_v) = −∇²(∇θ_v) = ∇ × (2πn_v ẑ), which says
(∂_jθ_v)(q) = (−i ε_ji q_i / q²) 2π n_v(q).
10 RG approach to walking
[Brézin, ch 8; Cardy ch 9; the original reference is (brief!) P. de Gennes, Phys. Lett. A38
(1972) 339.]
At each site i of a lattice (actually, it could be an arbitrary graph), place an n-component vector s_i; we'll normalize them so that for each site i, n = s_i · s_i ≡ Σ_{a=1}^n (s_i^a)², and take H(s) = −K Σ_{⟨ij⟩} s_i · s_j (I have named the coupling K to make contact with our previous discussion of SAWs). Denote by dΩ(s) the round (i.e. O(n)-invariant) measure on an (n−1)-sphere, normalized to ∫ dΩ(s) = 1. The partition sum is
Z = ∫ ∏_i dΩ(s_i) e^{−H(s)}
= ∫ ∏_i dΩ(s_i) ∏_{⟨ij⟩} Σ_{k=0}^∞ (K^k/k!) (s_i · s_j)^k
= Σ_{graphs G} K^{N_l(G)} ∫ ∏_i dΩ(s_i) ∏_{⟨ij⟩∈G} (s_i · s_j).   (10.1)
Here we are doing the high-temperature expansion, and further expanding the product
of factors of the Hamiltonian; we interpret each such term as a graph G covering a
subset of links of the lattice. Nl (G) is the number of links covered by the graph G.
Now we can do the spin integrals. The integral table is
∫ dΩ(s) = 1
∫ dΩ(s) s^a s^b = δ^{ab}
∫ dΩ(s) s^a s^b s^c s^d = (n/(n+2)) (δ^{ab}δ^{cd} + 2 perms)   (10.2)
where the second follows by O(n) invariance and taking partial traces (recall s · s = n). The generating function is useful:
f_n(x) ≡ ∫ dΩ(s) e^{x·s} = [∫₀^π dθ sin^{n−2}θ e^{x cosθ}] / [∫₀^π dθ sin^{n−2}θ]
= Σ_{p=0}^∞ (x^p/p!) [∫₀^π dθ sin^{n−2}θ cos^pθ] / [∫₀^π dθ sin^{n−2}θ]   (terms with p odd vanish)
= 1 + Σ_{p=1}^∞ [x^{2p}/(2^p p!)] [n^p / (n(n+2)···(n+2p−2))] → f₀(x) = 1 + x²/2 as n → 0.   (10.3)
Let's interpret this last result: it says that in the limit n → 0, each site is covered either zero times or two times. This means that the graphs which contribute at n → 0 avoid themselves.³⁸
Returning to n > 0: since ∫ dΩ s^a s^b = 0 for a ≠ b, the spin component is conserved along the closed loops. We get a factor of n from the spin sum Σ_{a=1}^n for each closed loop. Only closed loops contribute to Z. So Z → 1 as n → 0, yay. Less trivially, however, consider
G^{ab=11}(r, K) ≡ ⟨s_0^1 s_r^1⟩ ≡ Z^{−1} ∫ ∏_i dΩ(s_i) e^{−H(s)} s_0^1 s_r^1.
As n → 0, the only surviving graphs are single self-avoiding paths from 0 to r, so
G^{11}(r, K) → Σ_p K^p M_p(r),   (10.4)
where M_p(r) is (as in (2.4)) the number of SAWs going from 0 to r in p steps. This is the generating function we considered earlier! The quantity G in (2.6) is actually the correlation function of the O(n → 0) magnet!
Summing the BHS of (10.4) over r, the LHS is Σ_r G^{11}(r, K) = χ^{11}(K) ∼ (K_c − K)^{−γ} as K → K_c (from below), from which we concluded earlier that for large walks, M_p ∼ p^{γ−1} a^p for p ≫ 1 (with a = 1/K_c, a non-universal constant which is sometimes fetishized by mathematicians).
Furthermore, the quantity ξ in (2.7) is actually the correlation length, G^{11}(r, K) ∼ e^{−r/ξ}. Near the critical point, ξ ∼ (K_c − K)^{−ν} means that R_p ∼ p^ν, which determines the fractal dimension of the SAW in d dimensions to be D_SAW = lim_{n→0} 1/ν(n, d), where ν(n, d) is the correlation-length critical exponent for the O(n) Wilson-Fisher fixed point in d dimensions.
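As a check on the identification of M_p, the first few SAW counts on the square lattice can be obtained by brute-force enumeration (a Python illustration, not part of the original argument):

```python
def count_saws(p):
    """Count p-step self-avoiding walks from the origin on Z^2 by depth-first search."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def extend(path, visited, remaining):
        if remaining == 0:
            return 1
        x, y = path[-1]
        total = 0
        for dx, dy in steps:
            nxt = (x + dx, y + dy)
            if nxt not in visited:       # self-avoidance constraint
                visited.add(nxt)
                path.append(nxt)
                total += extend(path, visited, remaining - 1)
                path.pop()
                visited.remove(nxt)
        return total

    return extend([(0, 0)], {(0, 0)}, p)

counts = [count_saws(p) for p in range(1, 6)]   # M_1..M_5 = 4, 12, 36, 100, 284
```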
[End of Lecture 19]
³⁸ Cardy has a clever way of avoiding these spherical integrations by starting with a microscopic model with a nice high-temperature expansion (namely H(s) = −Σ_{⟨ij⟩} log(1 + K s_i · s_j)) and appealing to universality.
10.1.1 SAW:RW::WF:Gaussian
In the same way, the Gaussian fixed point determines the fractal dimension of the
unrestricted walk. This can be seen by a high-temperature expansion of the Gaussian
model. (For more on this point of view, see Parisi §4.3 - 4.4.) Alternatively, consider
unrestricted walks on a graph with adjacency matrix Aij , starting from the origin 0.
Denote the probability of being at site r after n steps by Pn (r). Starting at 0 means
P₀(r) = δ_{r,0}. For an unrestricted walk, we have the one-step (Markov) recursion:
P_{n+1}(r) = (1/z) Σ_{r′} A_{r′r} P_n(r′)   (10.5)
where the normalization factor z ≡ Σ_{r′} A_{r′r} is the number of neighbors (more generally, the matrix A could be a weighted adjacency matrix and z could depend on r). Defining the generating function
G(r|q) ≡ Σ_{n=0}^∞ q^n P_n(r),
the recursion implies
Σ_{r′} ( δ_{rr′} − (q/z) A_{rr′} ) G(r′|q) = δ_{r,0},   (10.6)
so G is a Green's function for the lattice laplacian.
The long-wavelength properties of the Gaussian model near its critical point at q → 1 (for which purposes the denominator may be replaced by p² + r, with r ∼ 1 − q) determine the behavior of large unrestricted walks; in particular the RMS size is ∼ √n and the fractal dimension is 2.
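The recursion (10.5) can be iterated directly; a minimal Python sketch on the 1d chain (z = 2; the lattice size 801 is an arbitrary choice, large enough that no weight reaches the edges) confirms the √n growth:

```python
import math

def rms_after(n, size=801):
    """Evolve P_n under P_{n+1}(r) = (P_n(r-1) + P_n(r+1))/2 and return the RMS displacement."""
    mid = size // 2
    P = [0.0] * size
    P[mid] = 1.0
    for _ in range(n):
        P = [0.0] + [0.5 * (P[i - 1] + P[i + 1]) for i in range(1, size - 1)] + [0.0]
    return math.sqrt(sum(p * (i - mid) ** 2 for i, p in enumerate(P)))

vals = [rms_after(n) for n in (16, 64, 256)]   # -> 4.0, 8.0, 16.0 = sqrt(n)
```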
And the Gaussian answer is the right answer even for a SAW in d > 4. We could anticipate this based on our understanding of the fate of the WF fixed point as d → 4 from below. How can we see the correctness of mean field theory for SAWs in d > 4 directly from the walk?
There is a simple answer, and also a more involved, quantitative answer (next). The simple answer is: the random walk has fractal dimension D = 2 (if it is embedded in two or more dimensions and is unrestricted). Two-dimensional subspaces of R^d will generically intersect (each other or themselves) if d ≤ 4: generic intersection happens when the sum of the codimensions is at most d, and here the condition is (d−2) + (d−2) ≤ d, i.e. d ≤ 4. For d > 4, they generically miss each other, and the self-avoidance condition does not have a big effect.
The first term insists that neighboring monomers be spaced by a distance approxi-
mately a. The second term penalizes a configuration where any two monomers collide.
We used factors of the chain-spacing a to render the coupling u dimensionless.
Now zoom out. Suppose that a ≪ ξ, so that we may treat the polymer as a continuous chain, r(t_i ≡ ia²) ≡ r_i. In taking the continuum limit we must take t ∼ a² in order to keep the coefficient of the ṙ² term independent of a. The exponent becomes the Edwards hamiltonian:
H_E[r] = ∫ dt (dr/dt)² + u a^{d−4} ∫ dt₁ ∫ dt₂ δ^d(r₁ − r₂).
Consider the rescaling
a → ba,   r → b^{−x} r,   t → t.
Here is a clever (though approximate) argument (due to Flory) that suggests a value for x. At a fixed point, the two terms in H must conspire, and so should scale the same way. For general x, the kinetic term and the potential scale respectively as
KE → KE b^{−2x},   V → V b^{d−4+dx},
suggesting that x = (4−d)/(2+d). Dimensional analysis says
r(t) = a f(t/a²) ∼ t^{(1+x)/2}
and therefore the RMS walk size is
R = r(t = N) ∼ N^ν,   ν = (1+x)/2 |_Flory = 3/(d+2).
This isn’t too bad; in fact it’s exactly right in d = 2. See Cardy Chapter 9 for more
on this, and see the homework for a more quantitative approach to the value of ν.
with c_n independent of g. So if [g] > 0, c_n must have more and more powers of some inverse length as n increases. What dimensionful quantity makes up the difference? The dimensions are made up by dependence on the short-distance cutoff Λ = 2π/a, which has [Λ] = −1. Generically
c_n = c̃_n Λ^{n[g]},
where c̃_n is dimensionless, and n[g] > 0 – it's higher and higher powers of the cutoff. But this means that if we integrate out shells down to Λ/b, in order for physics to be independent of the zoom parameter b, the microscopic coupling g(b) will have to depend on b to cancel this factor. In particular, we'll have to have
g(b) = g₀ b^{−[g]} → 0 as b → ∞.
10.2 RG approach to unrestricted lattice walk
We showed above that the generating function G(r|q) for unrestricted walks on a
lattice (from 0 to r) satisfies (10.6), which says that it’s a Green’s function for the
lattice laplacian. The data of the Green’s function is encoded in the spectrum of the
adjacency matrix,
A_ij v_j^ε = ε v_i^ε.   (10.8)
This determines G via
G(i|q) = Σ_ε [1/(1 − qε/z)] v_0^ε v_i^ε.
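A small numerical check (Python, on an 8-site ring; the parameters are arbitrary) that the series Σ_n q^n P_n really builds the Green's function property (10.6):

```python
def ring_green(N=8, q=0.5, terms=200):
    """Build G(r|q) = sum_n q^n P_n(r) for a nearest-neighbor ring of N sites (z = 2)."""
    P = [1.0 if r == 0 else 0.0 for r in range(N)]   # P_0(r) = delta_{r,0}
    G = [0.0] * N
    qn = 1.0
    for _ in range(terms):
        G = [g + qn * p for g, p in zip(G, P)]
        P = [0.5 * (P[(r - 1) % N] + P[(r + 1) % N]) for r in range(N)]
        qn *= q
    return G

G = ring_green()
# residual of (10.6): G(r) - (q/z) sum_r' A_{rr'} G(r') - delta_{r,0}, with q/z = 0.25
resid = [G[r] - 0.25 * (G[(r - 1) % 8] + G[(r + 1) % 8]) - (1.0 if r == 0 else 0.0)
         for r in range(8)]
```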
The eigensystem of A encodes the solution to many physics problems. For example, we could consider a continuous-time random walk, where the probability p_i(t) for a walker to be at site i at time t satisfies the master equation (with unit hopping rate)
∂_t p_i(t) = Σ_j ( (1/z) A_ij − δ_ij ) p_j(t).
Alternatively, we could think of these as the equations for the normal modes of the
lattice vibrations of a collection of springs stretched along the bonds of the lattice. In
that case, this spectrum determines the (phonon contribution to the) heat capacity of
a solid with this microstructure.
Previously, we solved this problem using translation symmetry of the lattice, by
going to momentum space. Here I would like to illustrate an RG solution to this
eigenvalue problem which is sometimes available. It takes advantage of the scaling
symmetry of the lattice. Sometimes scaling symmetry and translation symmetry are both present, but they don't commute.
Sometimes, as for most fractals, only the self-similarity is present.
So this method is useful for developing an analytic understanding
of walks on fractal graphs, or more generally the spectrum of their
adjacency matrix. I believe the original references are this paper
and this one. Roughly, we are going to learn how to compute the
phonon contribution to the heat capacity of the broccoflower!
Let’s solve (10.8) for the case of a chain, with Aij = t(δi,j+1 +δi,j−1 ). I’ve introduced
a ‘hopping amplitude’ t which can be regarded as related to the length of the bonds.
The eigenvalue equation can be rewritten as
ε v_i = t (v_{i−1} + v_{i+1}).   (10.10)
Notice that if i is odd, then the entries on the RHS only involve even sites. So this equation eliminates v_i at the odd sites in terms of the values at the even sites. Plugging this back into the equation for an even site gives
ε v_{2l} = t (v_{2l−1} + v_{2l+1}) = (t²/ε) (v_{2l−2} + 2v_{2l} + v_{2l+2})
⟹ ε v_{2l} = [ε t²/(ε² − 2t²)] (v_{2l−2} + v_{2l+2}) ≡ t′ (v_{2l−2} + v_{2l+2}).
This is the same equation as (10.10), but with half as many sites, i.e. the zoom factor is b = 2. t′ is a renormalized hopping amplitude:
t′/ε = t²/(ε² − 2t²) = (t/ε)² / (1 − 2(t/ε)²).
Let's cheat and remind ourselves of the known answer for the spectrum using translation invariance: E(k) = 2t cos(ka) ranges from −2t to 2t as k varies over the BZ from 0 to 2π/a. Let's use this to learn how to understand the iteration map.
For the chain, the map has three fixed points, at x = 0, 1/2, −1. Let's think of fixing E and varying the initial hopping rate. If t₀ ∈ (−E/2, E/2) (that is, if |E| > 2t is in the band gap) then t_{n→∞} → t⋆ = 0: the iteration eventually reaches the fixed point at x = 0 (as in the left figure). More precisely, for n ≫ 1 it goes like t_n ∼ E e^{−2^n λ} for some λ.
Such an orbit which asymptotes to t → 0 can be described by decoupled clusters – the wavefunction is localized. I learned about this from this paper.
In contrast, one with finite or infinite asymptotic t is associated with an extended
state. This happens if |t0 | > E/2 (so that E ∈ (−2t, 2t) is in the band). Then
tn > |E|/2 for all n, and we have a nonzero effective hopping even between two sites
that are arbitrarily far-separated.
The fixed point at t? = E/2 is the state with k = 0, i.e. the uniform state.
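Here is a minimal Python iteration of the chain map (the cutoff of 50 iterations is an arbitrary choice):

```python
def iterate(x0, n=50):
    """Iterate x -> x^2 / (1 - 2 x^2), the chain decimation map for x = t/eps."""
    x = x0
    for _ in range(n):
        x = x * x / (1 - 2 * x * x)
    return x

gap_state = iterate(0.3)    # |x0| < 1/2: flows to x* = 0 (decoupled clusters, localized)
band_edge = iterate(0.5)    # x* = 1/2 is exactly fixed: the uniform k = 0 state
band_state = iterate(0.7)   # wanders chaotically, but (in exact arithmetic) |x_n| > 1/2 forever
```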
Eliminating the B sites by solving the previous six equations for them in terms of the A sites and plugging into (10.11) gives an equation of the same form on a coarser lattice:
ε A₁ = t′ (A₂ + A₃ + A₄ + A₅),   t′ = t²/(ε − 3t).
The zoom factor is b = 2. In terms of the dimensionless ratio x ≡ t/ε,
x → x²/(1 − 3x).
Here's a way to visualize the huge qualitative difference of this map relative to the result for the chain. Plot, as a function of some initial x = t/ε, the value of the nth iterate, for some large value of n (here n = 2·10⁵). For the chain (shown at the top), every x which starts in the band stays in the band³⁹ (x_n > 1/2 if x₀ > 1/2), and vice versa. For the Sierpinski case, we get this Cantor-like set of localized states. Here the spacing on the x-axis is 10⁻²; if we scan more closely, we'll find more structure.
Here's one more notion of dimension, for a graph embedded in R^d, following Toulouse et al. Think of the graph as a Debye solid; that is, put springs on the links of the graph, each with natural frequency ω₀² = K/m. The normal modes of this collection of springs have frequencies ω with ω_n²/ω₀² determined by the eigenvalues of the adjacency matrix. The density of states of such modes for small ω is an ingredient in the heat capacity of the resulting model solid. Denote by ρ(ω)dω the number of modes with frequency in the interval (ω, ω + dω).
For a translation-invariant system in d dimensions, the modes can be labelled by wavenumber, and ρ(ω)dω = L^d d̄^d k, which at ω → 0 (in the thermodynamic limit) is governed by Goldstone's acoustic phonon with ω = v_s k, and therefore ρ(ω) ∝ ω^{d−1}.
More generally, we define the spectral dimension d_s of the graph by the power-law relation
ρ(ω) ∼ ω^{d_s − 1}   (N → ∞, then ω → 0).
Sometimes it’s called the diffusion dimension. It is a useful idea! One cool application
is to figuring out how many dimensions your average spacetime has when you do a
simulation involving dynamical triangulations. (See §5.2 of this paper.)
Now suppose that instead of translation-invariance, we have dilatation invariance,
i.e. self-similarity. The number of sites for a graph Γ of linear size L scales as
N (L) ∼ LDΓ
where DΓ is the fractal dimension. This means that if we assemble a scaled up version
whose linear size is scaled up by b, we have N (bL) = bDΓ N (L) sites. And it means,
³⁹ Thanks to Daniel Ben-Zion for help with these figures.
just by counting eigenvalues, that the density of states must scale like
ρ_{L/b}(ω) = b^{−D_Γ} ρ_L(ω).   (10.12)
Consider L finite so that the spectrum {ω_n} is discrete, and focus on the nth eigenvalue from the bottom, for some fixed n. If we knew that this eigenvalue scaled with system size like
ω(L/b) = b^x ω(L)
then
ρ_{L/b}(ω) = b^{−x} ρ_L(ω b^{−x}).   (10.13)
Combining (10.12) and (10.13) gives ρ_L(ω) = b^{D_Γ−x} ρ_L(ω b^{−x}); choosing b = ω^{1/x} then gives ρ_L(ω) ∼ ω^{(D_Γ−x)/x}, i.e. d_s = D_Γ/x.
Claim: The smooth part of the spectrum of the Sierpinski fractal solid does scale like b^x for some x which we can determine. A more earnest pursuit of the equations (10.11) implies that
ω²/ω₀² ↦ (ω′)²/ω₀² = (ω²/ω₀²) ( d + 3 − ω²/ω₀² ),
so at small frequency ω′² ≃ (d+3) ω², i.e. b^x = √(d+3) with b = 2.
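Assembling these ingredients for the Sierpinski d-gasket gives a concrete number; the Python sketch below assumes (as above) b = 2, D_Γ = log(d+1)/log 2, and the small-ω limit of the frequency map:

```python
import math

def spectral_dimension(d):
    """d_s for the Sierpinski d-gasket, from rho(omega) ~ omega^(D_Gamma/x - 1)."""
    D_gamma = math.log(d + 1) / math.log(2)   # fractal dimension (zoom factor b = 2)
    x = math.log(d + 3) / (2 * math.log(2))   # from omega'^2 ~ (d+3) omega^2 at small omega
    return D_gamma / x

ds2 = spectral_dimension(2)   # the usual (d = 2) gasket: 2 ln 3 / ln 5 ~ 1.365
```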
The resistor network on a Sierpinski d-gasket is studied here. The scaling with size
of the conductivity of stuff made from such a graph can be related to its spectral
dimension.
Unlike the paragon of nerd-sniping problems (the resistor network on the square
lattice), this problem cannot be solved by going to momentum space.
Consider sending a current I into one corner of a Sierpinski gasket. By symmetry,
a current I/d must emerge from the other d corners.
Call ρ(a) the resistance of one bond at lattice spacing a. Now we want to compute the effective, coarse-grained resistance ρ(ba) for b > 1. The symmetry of the problem forbids current from crossing the middle of the triangle, and this allows us to compute the voltage drop between the input corner and any of the others. Specifically, this voltage drop is preserved if
ρ(ba)|_{b=2} = [(d+3)/(d+1)] ρ(a) ≡ b^ζ ρ(a),
ζ = log((d+3)/(d+1)) / log 2.
Now if we iterate this map ℓ times so that b^ℓ = L/a for some macroscopic L, then the resistance of the whole chunk of stuff is
ρ(L) ∼ L^ζ,
and the conductivity of stuff made from such a graph scales as
σ(L) = L^{2−d}/ρ(L) ∼ L^{−t},   t ≡ d − 2 + ζ.
11 RG sampler platter
I didn't get this far in lecture, but the goal of this section is to convey some more of the huge range of applications of the renormalization group perspective.
11.1 Disorder
Each construction step replaces every link by two parallel paths of two links each; iterating builds the (diamond) hierarchical lattice. I denote the new sites in black. The beauty of this construction for our purposes is that decimating the black sites precisely undoes the construction step. The generalization which replaces each link with q segments is called the Berker lattice.
Let v_{⟨ij⟩} ≡ tanh βJ_{⟨ij⟩}. Consider tracing over the black sites A and B in the figure at right. Using the high-temperature-expansion formula, this isn't hard:
e^{−∆H_eff(s_C, s_D)} = Σ_{s_A,s_B=±1} e^{−H(s)} = Σ_{s_A,s_B=±1} ∏_{links ⟨ij⟩} (1 + v_{⟨ij⟩} s_i s_j)
= 2² (1 + v₁v₂ s_C s_D)(1 + v₃v₄ s_C s_D)
= 2² ( (1 + v₁v₂v₃v₄) + (v₁v₂ + v₃v₄) s_C s_D )
= 2² (1 + v₁v₂v₃v₄)(1 + v′ s_C s_D)
with
v′(v₁..v₄) = (v₁v₂ + v₃v₄)/(1 + v₁v₂v₃v₄).   (11.1)
In the clean limit where all couplings are the same, this is
v′ = 2v²/(1 + v⁴).
This has fixed points at
v⋆ = 0, 1, 0.5437… .   (11.2)
Just as we did for the Ising chain in §3, we can study the behavior near the nontrivial fixed point and find (here b = 2) that ν ≃ 1.338. Redoing this analysis to include also a magnetic field, we would find y_h = 1.758 for the magnetization exponent.
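These numbers are easy to reproduce numerically; the Python sketch below assumes the length rescaling factor b = 2 per decimation step (which reproduces the quoted ν ≈ 1.338):

```python
import math

def f(v):
    """The clean-limit recursion v' = 2 v^2 / (1 + v^4)."""
    return 2 * v * v / (1 + v ** 4)

# bisection for the nontrivial fixed point, the root of f(v) - v in (0.1, 0.9)
lo, hi = 0.1, 0.9
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if f(mid) - mid > 0:
        hi = mid
    else:
        lo = mid
v_star = 0.5 * (lo + hi)

# thermal eigenvalue lambda = f'(v*) and nu = ln b / ln lambda with b = 2
h = 1e-6
lam = (f(v_star + h) - f(v_star - h)) / (2 * h)
nu = math.log(2) / math.log(lam)
```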
But now suppose that the couplings are chosen from some initial product distribution, independently and identically distributed – for example a dilute ('percolation-type') distribution of the form xδ(v − v₀) + (1 − x)δ(v), which will appear below.
After the decimation step, the distribution for any link evolves according to the
usual formula for changing variables in a probability distribution, using the RG relation
(11.1):
P′(v′) = ∫ dv₁ dv₂ dv₃ dv₄ δ(v′ − v′(v₁..v₄)) P(v₁) ⋯ P(v₄).
The preceding relation is then an RG recursion equation for the distribution of couplings, P(v) ↦ (R(P))(v). As usual when confronted with such a recursion, we should ask about its fixed points, in this case fixed distributions:
P⋆(v) = ∫ dv₁ dv₂ dv₃ dv₄ δ(v − v′(v₁..v₄)) P⋆(v₁) ⋯ P⋆(v₄).   (11.3)
We know some solutions of this equation. One is the clean fixed point
P⋆(v) = δ(v − v⋆).
Another comes from the dilute initial condition P_x(v) = xδ(v−1) + (1−x)δ(v), which evolves to a distribution of the same form:
P_{x′}(v) = [x⁴ + 4x³(1−x) + 2x²(1−x)²] δ(v−1) + [4x²(1−x)² + 4x(1−x)³ + (1−x)⁴] δ(v),
where the terms come from enumerating which of the four bonds are zero. So: the distribution is self-similar, but the bond-placing probability x evolves according to
x ↦ x′ = 2x² − x⁴.
So each fixed point of this map gives a solution of the fixed-distribution equation (11.3). They occur at
x⋆ = 0 (nobody's home),   x⋆ = 1 (everybody's home),   x⋆ = (√5 − 1)/2 (the percolation threshold on the DHL).
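The recursion for x can be verified by exact enumeration of the 2⁴ bond configurations (a Python check, not in the notes):

```python
from itertools import product

def x_prime(x):
    """Probability that v' = 1 when each of the four bonds is present (v_i = 1) with prob. x."""
    total = 0.0
    for v1, v2, v3, v4 in product((0, 1), repeat=4):
        weight = 1.0
        for v in (v1, v2, v3, v4):
            weight *= x if v else (1 - x)
        vp = (v1 * v2 + v3 * v4) / (1 + v1 * v2 * v3 * v4)   # always 0 or 1 here
        total += weight * (1.0 if vp >= 1 else 0.0)
    return total

golden = (5 ** 0.5 - 1) / 2
# agreement with x' = 2 x^2 - x^4, including at the percolation fixed point
checks = [abs(x_prime(x) - (2 * x * x - x ** 4)) for x in (0.2, 0.5, golden)]
```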
We can study (a subset of) flows between these fixed points if we make the more
general ansatz
p(v) = xδ(v − v0 ) + (1 − x)δ(v)
with two parameters v0 , x. Then we get a 2d map, much more manageable, if the
evolution preserves the form. It almost does. The evolution rule can be estimated by
x′ = 2x² − x⁴,
x′ v₀′ = ⟨v⟩_{P′} ≡ ∫ dv v P′(v).   (11.4)
The result is
x′ v₀′ = ∫ dv ∏_{i=1}^4 dv_i δ(v − v′(v₁..v₄)) v ∏_i p(v_i)
= ∫ ∏_i dv_i [(v₁v₂ + v₃v₄)/(1 + v₁v₂v₃v₄)] ∏_i p(v_i)
= x⁴ [2v₀²/(1 + v₀⁴)] + 4x³(1−x) v₀² + 2x²(1−x)² v₀².
Now consider a chain with random hoppings t_i chosen from some broad distribution, so that individual t_i will be very different from each other. Consider the largest t_i ≡ T, and assume that it is much bigger than all the others, including its neighbors. Then we can eliminate the two sites connected by the strong bond by solving the 2×2 problem
( 0 T ; T 0 ) (v_{i−1}, v_i)ᵀ ≃ ε (v_{i−1}, v_i)ᵀ.
More precisely, we can eliminate these two sites vi−1 , vi in terms of their neighbors
149
using their two rows of the eigenvalue equation,
ε v_{i−1} = t_ℓ v_ℓ + T v_i,   ε v_i = T v_{i−1} + t_r v_r,
⟹ (v_{i−1}, v_i)ᵀ = ( ε, −T ; −T, ε )^{−1} (t_ℓ v_ℓ, t_r v_r)ᵀ.
For ε ≪ T this generates an effective hopping between the surviving neighbors ℓ and r,
t′ ≃ t_ℓ t_r / T.   (11.5)
This RG rule (11.5) (which we could name for Dasgupta and Ma in a slightly more fancy context) is very simple in terms of the logs of the couplings, ζ ≡ log(T/t):
ζ′ = ζ_ℓ + ζ_r.
Imagine we start the RG at some initial strongest bond T₀. Then Γ ≡ log(T₀/T) says how much RGing we've done so far. The second, rescaling step puts the distribution back in the original range, which requires shifting everyone:
ζ_i = log(T/t_i) ↦ log((T − dT)/t_i) ≃ ζ_i − dT/T = ζ_i − dΓ.
This moves the whole distribution to the left: P(ζ) ↦ P(ζ + dΓ) = P(ζ) + dΓ P′(ζ) + O(dΓ²), i.e.
d_rescale P(ζ) = (dP(ζ)/dζ) dΓ.
The change in the full distribution from adding in the new bonds is
d_new P(ζ) = dΓ P(0) ∫ dζ_ℓ dζ_r P(ζ_ℓ) P(ζ_r) δ(ζ − ζ_ℓ − ζ_r),
since a decimation occurs with probability P(0)dΓ and replaces the decimated pair of bonds with one at ζ = ζ_ℓ + ζ_r. The resulting flow has the self-similar strong-disorder solution P_Γ(ζ) = e^{−ζ/Γ}/Γ, whose width grows without bound.
As an application, consider again the spring problem, with normal-mode frequencies
ω_n² = ω₀² (1 + ε_n),
where ε_n are the eigenvalues of A. And we take A_ij = (δ_{i,i+1} + δ_{i,i−1}) t_i and choose t_i from the strong-disorder distribution found above.
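Here is a crude Python simulation of this RG (an approximation-flagged sketch: I drop the ring geometry when reinserting renormalized bonds, a random-neighbor-style simplification, and the initial uniform distribution is an arbitrary choice), showing the ζ distribution broadening – the flow is toward infinite disorder:

```python
import math, random

random.seed(0)
# beta_i = log(T0 / t_i); the strongest bond has the smallest beta.
beta = [random.uniform(0.0, 1.0) for _ in range(2001)]

def rel_width(b):
    """Standard deviation of zeta = beta - Gamma, measured from the running cutoff."""
    gamma = min(b)
    rel = [x - gamma for x in b]
    mean = sum(rel) / len(rel)
    return math.sqrt(sum((r - mean) ** 2 for r in rel) / len(rel))

w_start = rel_width(beta)
while len(beta) > 101:
    m = min(range(len(beta)), key=beta.__getitem__)      # strongest bond
    left = beta[(m - 1) % len(beta)]
    right = beta[(m + 1) % len(beta)]
    new = left + right - beta[m]                         # i.e. zeta' = zeta_l + zeta_r
    for i in sorted({m, (m - 1) % len(beta), (m + 1) % len(beta)}, reverse=True):
        beta.pop(i)
    beta.append(new)                                     # positions scrambled (see lead-in)
w_end = rel_width(beta)
```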
To find the heat capacity at temperature 1/β, we should run the RG from the initial UV cutoff T₀ down to the bond scale associated with the temperature T, which is of order (T/ω₀)². Because of the breadth of the distribution, the bonds with t < T² are likely to have t ≪ T², and we can ignore them. Any site not participating in a bond produces a simple equipartition contribution ∆E = k_B T (i.e. it adds a constant to C_V) as long as 1/β > Ω₀. Sites participating in a bond have ω ≫ T and are frozen out. So the heat capacity is
C_V(β) = N(T),
where N(T) is the number of undecimated sites when the temperature is T, i.e. when the RG cutoff scale is ∼ T². So this model produces a crazy dependence on the temperature,
C_V ∼ 1/log²(T²).