2018F 217 Lectures
Fall 2018
Lecturer: McGreevy
These lecture notes live here. Please email corrections to mcgreevy at physics dot
ucsd dot edu.
Contents
0.1 Introductory comments . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
0.2 Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
2 Random walks 18
2.1 Biased gaussian walk . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.2 Universality class of the (unrestricted) random walk . . . . . . . . . . . 19
2.3 Self-avoiding walks have their own universality class . . . . . . . . . . . 21
3 Ising models 26
3.1 Decimation RG for 1d nearest-neighbor Ising model . . . . . . . . . . . 31
3.2 High-temperature expansion . . . . . . . . . . . . . . . . . . . . . . . . 36
3.3 RG evaluation of physical quantities . . . . . . . . . . . . . . . . . . . . 37
3.4 Need for other schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.5 Low-temperature expansion, and existence of phase transition in d>1 42
3.6 A word from our sponsor . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.7 Duality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.8 Block spins . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
4.1 Landau-Ginzburg theory . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.2 Correlations; Ginzburg criterion for MFT breakdown . . . . . . . . . . 59
5 Festival of rigor 68
5.1 Extensivity of the free energy . . . . . . . . . . . . . . . . . . . . . . . 68
5.2 Long-range interactions . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
5.3 (Anti-)convexity of the free energy . . . . . . . . . . . . . . . . . . . . 72
5.4 Spontaneous symmetry breaking . . . . . . . . . . . . . . . . . . . . . . 75
5.5 Phase coexistence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
6 Field Theory 82
6.1 Beyond mean field theory . . . . . . . . . . . . . . . . . . . . . . . . . 83
6.2 Momentum shells . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.3 Gaussian fixed point . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6.4 Perturbations of the Gaussian model . . . . . . . . . . . . . . . . . . . 88
6.5 Field theory without Feynman diagrams . . . . . . . . . . . . . . . . . 91
6.6 Perturbative momentum-shell RG . . . . . . . . . . . . . . . . . . . . . 100
7 Scaling 106
7.1 Crossover phenomena . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
7.2 Finite-size scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
0.1 Introductory comments
The ‘renormalization group’ (RG) is a poor name for the central concept in many-body
physics. It is a framework for addressing the question: what is the relationship between
microscopic laws and macroscopic observations?
Or, closer to home, it allows us to answer questions such as: Why don’t you need
to understand nuclear physics to make your tea in the morning?1
Briefly, the RG is the realization that systems of many degrees of freedom (especially
when they have local interactions) should be understood hierarchically, i.e. scale by
scale.
There is a lot more to say to contextualize the RG, which, as you can see from the
previous question, is really a piece of metaphysics, that is, it is a framework for how
to do physics. But since it is such a broad and far-reaching concept, in order to avoid
being vague and useless, it will be better to start with some concrete and simple ideas,
before discussing some of its many consequences.
A word about prerequisites: The official prerequisite for this course is graduate
statistical mechanics. I think you would be able to get by with a good undergrad class.
The historical origins of the RG (at least its name) are tied up with high-energy
particle physics and quantum field theory. That stuff involves quantum mechanics in
a serious way. Much of the content of this course can be understood without quantum
mechanics; the fluctuations could all be thermal. At various points along the way I
will point out the connections with quantum field theory.
So this is mostly a course in statistical field theory (≡ statistical mechanics of many
degrees of freedom). But there are many other applications of the RG which don’t quite
fit in this category which I also hope to discuss.
Also, I think our discussion will all be non-relativistic, v ≪ c.
3. Ising models
5. RG treatment of iterative maps and the period-doubling approach to chaos
As the title indicates, this is a very rough guess for what we’ll do. An early target
will be a renormalization-group understanding of the central limit theorem.
0.2 Conventions
The convention that repeated indices are summed is always in effect unless otherwise
indicated.
A useful generalization of the shorthand $\hbar \equiv \frac{h}{2\pi}$ is
$$\bar{d}k \equiv \frac{dk}{2\pi}.$$
I will also write $\slashed{\delta}(q) \equiv (2\pi)^d \delta^d(q)$.
I will try to be consistent about writing Fourier transforms as
$$\int \frac{d^d k}{(2\pi)^d}\, e^{ikx} \tilde f(k) \equiv \int \bar{d}^d k\, e^{ikx} \tilde f(k) \equiv f(x).$$
1 Scaling and self-similarity
[This discussion largely follows the early chapters of the book by Creswick et al.]
First some somewhat-vague definitions to get us started. An object is self-similar if
its parts, when magnified by a suitable scale factor λ, look like the whole. (Here is
an example.) Something is scale-invariant if this is true for every λ. (Self-similarity
is sometimes called ‘discrete scale invariance’.) An important generalization is the
notion of statistical self-similarity – something which is sampled from a distribution
which is self-similar.
The point in life of the renormalization group is to provide a way of thinking about
(and ideally relating quantitatively) what’s going on at different scales of magnification.
So something which is self-similar or scale-invariant is a simple special case for the RG.
As we’ll see, a symptom of scale invariance is a power law.
The word ‘dimension’ is used in many ways in this business. Let’s consider a set of
points in d-dimensional Euclidean space, Rd . In the previous sentence ‘dimension’ is
the minimum number of coordinates needed to specify the location of a point (this is
usually called ‘Euclidean dimension’). It’s an integer.
A subset of R^d specified by some algebraic equations on the coordinates (we can call this an algebraic set) generically has a Euclidean dimension which is an integer (though it may not be the same integer for every point). That is, locally around almost every solution of the equations, the object will look like a piece of R^{d_T} for some d_T ≤ d (sometimes this notion is called ‘topological dimension’).
[Figure: the algebraic set {y(y − x²) = 0} ⊂ R² has d_T = 1.]
Here is a different, RG-inflected definition of the dimension of an object O ⊂ R^d, called fractal dimension or Hausdorff dimension: cover the object O with d-balls of diameter a,
$$B_{\vec r_0}(a) \equiv \{\vec r \in R^d \ \text{such that}\ |\vec r - \vec r_0| \leq a/2\}. \qquad (1.1)$$
Let
N(a) ≡ the minimum number of such balls required to cover O,
minimizing over the locations of their centers. Do this for various values of a. Then, if this function is a power law,
$$N(a) \sim a^{-D} \qquad (1.2)$$
then D is the fractal dimension of O. Even if N(a) is not a power law, we can define $D \equiv -\frac{\log N(a)}{\log a}$.
A few observations:
• Notice that D may itself depend on the range of ball-sizes a we consider, that is, the same scaling relation may not hold for all a. Often (always) there is a short-distance (“UV”) cutoff on the regime where the scaling relation (1.2) holds – if our object is the coastline of France, it is maybe not so useful to consider femtometer-sized balls. Also, there is often a long-distance (“IR”) cutoff – in the same example, Earth-sized balls will not give an interesting power law (it just gives N(r_⊕) = 1).
$$N(a) = 2^{n(a)} = 2^{\log(a_0/a)/\log 3} = \left(\frac{a}{a_0}\right)^{-\log 2/\log 3},$$
which gives fractal dimension
$$D = \frac{\log 2}{\log 3} \simeq 0.63 \in (0, 1). \qquad (1.3)$$
Notice that this object is self-similar with scale factor λ = 3: the two remaining thirds are identical to the original up to a rescaling of the length by a factor of three. This fact can be used to infer the power law, since it means N(a) = 2N(3a). So if N(a) ∼ a^{−D}, we must have a^{−D} = 2(3a)^{−D} ⟹ 1 = 2 · 3^{−D}, which is (1.3).
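This box-counting recipe can be checked numerically. A minimal sketch (my own, not from the notes): build the midpoints of the level-k intervals of the middle-thirds Cantor set, count occupied bins of diameter a for two ball sizes inside the scaling regime, and extract D from the slope.

```python
import math

def cantor_midpoints(depth):
    """Midpoints of the 2**depth intervals at level `depth` of the
    middle-thirds Cantor set construction."""
    intervals = [(0.0, 1.0)]
    for _ in range(depth):
        intervals = [piece
                     for (lo, hi) in intervals
                     for piece in ((lo, lo + (hi - lo) / 3.0),
                                   (hi - (hi - lo) / 3.0, hi))]
    return [(lo + hi) / 2.0 for (lo, hi) in intervals]

def box_count(points, a):
    """N(a): how many bins (balls) of diameter a are occupied."""
    return len({int(x / a) for x in points})

points = cantor_midpoints(10)        # resolved down to scale 3**-10
a1, a2 = 3.0 ** -4, 3.0 ** -7        # ball sizes inside the scaling regime
D = -(math.log(box_count(points, a1)) - math.log(box_count(points, a2))) / \
     (math.log(a1) - math.log(a2))
print(round(D, 3))                   # 0.631, i.e. log 2 / log 3
```

At these aligned ball sizes the counts are exactly 2^n per level, so the extracted slope reproduces D = log 2/log 3 to machine precision.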
1.2 Fractal dimension of a random walk
With $\vec R_M \equiv \sum_{i=1}^M \vec r_i$ the endpoint of a walk of M independent steps of length a_0, the cross terms average to zero and
$$\left\langle |\vec R_M|^2 \right\rangle_M = \sum_{i,j} \langle \vec r_i \cdot \vec r_j \rangle_M = \sum_{j=1}^{M} \left\langle |\vec r_j|^2 \right\rangle = M a_0^2.$$
The RMS displacement $R(M) \equiv \sqrt{\left\langle |\vec R_M|^2 \right\rangle_M} = \sqrt{M}\, a_0$ goes like the square root of the number of steps, a probably-familiar result on which we are going to get some new perspective now.
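The √M law is easy to verify by direct simulation. A quick Monte Carlo sketch (an illustration of the result above, with d = 2 and fixed step length a_0 = 1):

```python
import math
import random

def rms_displacement(M, trials=4000, a0=1.0):
    """Monte Carlo estimate of R(M) for a walk of M independent steps of
    fixed length a0 in d = 2, with isotropically random directions."""
    total = 0.0
    for _ in range(trials):
        x = y = 0.0
        for _ in range(M):
            theta = random.uniform(0.0, 2.0 * math.pi)
            x += a0 * math.cos(theta)
            y += a0 * math.sin(theta)
        total += x * x + y * y
    return math.sqrt(total / trials)

random.seed(0)
R100, R400 = rms_displacement(100), rms_displacement(400)
print(R100, R400, R400 / R100)   # ratio should be near sqrt(400/100) = 2
```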
What is the fractal dimension of a random walk?
A walk of M steps can be regarded as M/n subwalks of n steps (choose M so
that these are integers). By the above result, the RMS displacement of the subwalks is
√
r(n) = na0 ; choose M big enough so that this is a good approximation. This suggests
that we may think of a random walk (RW) of M steps of length a0 as a RW of M/n
√
steps each of length (approximately) a1 ≡ na0 . Notice that this ‘coarse-grained’ step
size is not actually the same for each subwalk. (We are relying on the central limit
theorem here to say that the distribution of subwalk sizes is well-peaked around the
central value. We’ll give an RG proof of that result next.)
This perspective allows us to estimate the fractal dimension of an unrestricted RW. Let N(a) be, as above, the number of balls of diameter a needed to cover a walk (probably) of M microscopic steps of size a_0. When the ball-size is about the same as the stepsize, we need one ball for each step (this is overcounting, but should give the right scaling), so we'll have
$$N(a) \sim M, \qquad \text{for } a \sim a_0.$$
2. As a practical physicist, why should you care about this result? Here’s one kind
of answer: suppose you have in your hands some object which is locally one-
dimensional, but squiggles around in a seemingly random way. It is governed
by some microscopic dynamics which are mysterious to you, and you would like
to know if you can model it as an unrestricted random walk. One diagnostic
you might do is to measure its fractal dimension; if it’s not D = 2 then for sure
something else is going on in there. (If it is D = 2 something else still might be
going on.)
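As a sketch of this diagnostic in practice (my own toy version, using d = 3 so that the walk sits comfortably below space-filling): generate one long unrestricted walk and box-count it between the UV cutoff (the step size) and the IR cutoff (the overall extent).

```python
import math
import random

# Build one long unrestricted walk in d = 3 with unit steps in random directions.
random.seed(3)
M = 200000
pts, x, y, z = [], 0.0, 0.0, 0.0
for _ in range(M):
    cos_t = random.uniform(-1.0, 1.0)      # isotropic direction on the sphere
    sin_t = math.sqrt(1.0 - cos_t * cos_t)
    phi = random.uniform(0.0, 2.0 * math.pi)
    x += sin_t * math.cos(phi)
    y += sin_t * math.sin(phi)
    z += cos_t
    pts.append((x, y, z))

def n_boxes(a):
    """N(a): number of cubes of side a touched by the walk."""
    return len({(int(px // a), int(py // a), int(pz // a)) for (px, py, pz) in pts})

# Fit D between the UV cutoff (step size 1) and the IR cutoff (~ sqrt(M) ~ 450)
a1, a2 = 4.0, 32.0
D = (math.log(n_boxes(a1)) - math.log(n_boxes(a2))) / (math.log(a2) - math.log(a1))
print(round(D, 2))   # roughly 2 for an unrestricted walk
```

A single walk and a single decade of ball sizes gives only a rough D, but it is already far from 1 (a smooth curve) and close to 2.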
[End of Lecture 1]
3. For some statistically self-similar sets, a single fractal dimension does not capture
the full glory of their fractaliciousness, and it is useful to introduce a whole
spectrum of fractal dimensions. Such a thing is called multifractal.
I hope to say more about both of the previous points later on in the course.
Now we’ll study the random walk a bit more precisely, and use it to introduce the RG
machinery. To be specific, suppose that each microscopic step is sampled from the
(Gaussian) distribution
$$p(\vec r) = \mathcal{N} e^{-\frac{|\vec r|^2}{2\sigma_0^2}}, \qquad \mathcal{N} = (2\pi\sigma_0^2)^{-d/2}.$$
As before, the detailed form of the single-step distribution will be unimportant for the questions of interest to us – the technical term is ‘irrelevant’; this will be an outcome of the RG analysis. In this case, we have $\langle \vec r \rangle = 0$, $\langle \vec r \cdot \vec r \rangle = \sigma_0^2$.
Let $\vec r\,' \equiv \sum_{i=1}^n \vec r_i$. Think of this as a ‘coarse-grained step’ – imagine that the single steps (of RMS size σ_0) are too small to see, but for n big enough, n of them can get somewhere. The distribution for the coarse-grained step is:
$$P(\vec r\,') = \int d^d r_1 \cdots d^d r_n\, p(\vec r_1) \cdots p(\vec r_n)\, \underbrace{\delta\Big(\vec r\,' - \sum_{i=1}^n \vec r_i\Big)}_{= \int \bar d^d k\, e^{i \vec k \cdot (\vec r\,' - \sum_i \vec r_i)}}$$
(do n · d Gaussian integrals)
$$= \mathcal{N}' \exp\left(-\frac{|\vec r\,'|^2}{2 n \sigma_0^2}\right), \qquad \mathcal{N}' \equiv \left(2\pi n \sigma_0^2\right)^{-d/2}. \qquad (1.4)$$
This is the same form of the distribution, with the replacement $\sigma_0 \to \sigma' \equiv \sqrt{n}\,\sigma_0$. We can make it actually the same distribution if we rescale our units (the second half of the RG transformation): rescale $\vec r\,' \equiv \sqrt{n}\, \vec r\,''$, where the zoom factor is chosen to keep the width of the distribution the same after the coarse-graining step. Remembering that distributions transform under change of variables by
$$P(\vec r\,')\, d^d r' = P(\vec r\,'')\, d^d r''$$
we have
$$P(\vec r\,'') = \frac{1}{(2\pi\sigma_0^2)^{d/2}}\, e^{-\frac{|\vec r\,''|^2}{2\sigma_0^2}}$$
– the same distribution as we had for a single step. Therefore, a random walk is (probably) a fractal – it is self-similar on average.
The two steps above – (1) coarse graining and (2) rescaling – constitute a renormalization group transformation (more on the general notion next). The ‘coupling constant’ σ_0 transforms under this transformation, in this case as
$$\sigma_0 \mapsto \sigma_{\rm renormalized} = \sigma_0,$$
i.e. it maps to itself; such a parameter is called marginal and is a special case.
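The marginality of σ_0 can be checked directly on samples (a trivial numerical illustration, my own sketch): coarse-grain by summing n steps, rescale by 1/√n, and measure the width of the result.

```python
import random

# Direct check that sigma_0 is marginal under the two-step RG on the
# Gaussian walk: coarse-grain (sum n steps), then rescale (divide by
# sqrt(n)); the width of the resulting distribution equals sigma_0 again.
random.seed(4)
sigma0, n, samples = 1.0, 16, 50000
rescaled = [sum(random.gauss(0.0, sigma0) for _ in range(n)) / n ** 0.5
            for _ in range(samples)]
var = sum(v * v for v in rescaled) / samples
print(round(var, 2))  # close to sigma0**2 = 1.0
```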
Consider the RMS distance covered by a walk in M steps,
$$R(M)^2 \equiv \left\langle \Big| \sum_{i=1}^M \vec r_i \Big|^2 \right\rangle_M.$$
It depends on M and the typical step size, which is σ (since $\sigma^2 = \int d^d r\, |\vec r|^2\, p(\vec r)$). Dimensional analysis tells us that we must have R(M) ∝ σ, and the statistical self-similarity we've just found suggests a power-law dependence on M:
$$R(M) \sim \sigma M^\nu,$$
which scaling relation defines the exponent ν. The coarse-grained walk (no rescaling) takes M' = M/n steps. Demanding the same outcome for the RMS displacement in both the microscopic description and in the coarse-grained description says
$$\sigma M^\nu = \underbrace{\sigma'}_{=\sqrt{n}\,\sigma} (M')^\nu = \sqrt{n}\, \sigma \left(\frac{M}{n}\right)^\nu = n^{\frac{1}{2} - \nu} \sigma M^\nu. \qquad (1.5)$$
(In the context of quantum field theory, a relation with the same logical content is called a Callan-Symanzik equation.) In order for this to be true for all n, we must have
$$\nu = \frac{1}{2}.$$
Recalling that the fractal dimension D = 2 also came from $\sigma' = \sqrt{n}\, \sigma_0 = n^{1/D} \sigma_0$, we've shown that an unrestricted random walk in d ≥ 2 has a relationship between the fractal dimension and the RMS displacement exponent: ν = 1/D.
Measurability of the fractal dimension. I’ve spoken above about the fractal
dimension of a random walk, for example of a random polymer configuration, as an
‘observable’. How could you measure it?
Suppose the monomers making up the polymer scatter light (elastically). The
fractal dimension can be extracted from the structure factor S(k), as measured by the
intensity of scattering of light off the object, as a function of the wavenumber k of the
light. (This statement is related to the open-ended question on the first homework.)
1. Coarse-graining or decimation: The idea of this step is familiar from the cen-
tral idea of how thermodynamics emerges from statistical mechanics: we should
average over the stuff we can’t keep track of (microscopic configurations of the
system), holding fixed the stuff we do keep track of (the thermodynamic variables
like energy and volume). In the connection mentioned in the previous sentence,
we do it all at once.
The key new idea of the RG is to do it a little bit at a time. That is: Integrate
out or average over some set of short-distance/fast degrees of freedom, holding
fixed a set of long-wavelength/slow degrees of freedom.
Notice that this step is not necessarily reversible: the value of a definite integral
(or sum) does not uniquely determine the integrand (or summand). We lose
information in this step. This means that a set of transformations defined this
way is not in fact a group in the mathematical sense, since there is no inverse
element (it is a semigroup). So much for that.
The idea is that we are squinting, so that the smallest distance ∆x we can resolve gets a little bigger: say before the coarse-graining we had a resolution ∆x = ε, and afterwards we only keep track of stuff that varies on distances larger than ∆x = λε for some scale factor λ > 1.
2. Rescaling: Now we change units to map the coarse-grained system back onto the original one, so that λε ↦ ε. We do this so that we can compare them.
Now we're going to think about the space on which this transformation is acting. Its coordinates are the parameters of the system, such as the parameters defining the probability distribution (σ_0 for the random walk), or the couplings in the Hamiltonian if p = e^{−βH}/Z. Let's call the set of such parameters {h_j}, where j is an index which runs over as many parameters as we need to consider.^{2,3} These parameters get transformed according to
$$\{h_j\} \overset{\text{steps 1, 2}}{\mapsto} \{h'_j \equiv R_j(\{h\})\}.$$
This map is something we can do over and over, coarse-graining (zooming out) by
a factor of λ each time, until we get to macroscopic sizes. The repeated application of
^2 For example, in the random walk case, other parameters we could include are b, c, ... in
$$p(\vec r) = \exp\left(-\left(\vec b \cdot \vec r + \frac{r^2}{2\sigma^2} + c\, r^4 + \cdots\right)\right).$$
^3 One of the many crucial contributions of Ken Wilson to this subject was (I think) allowing for the possibility of including arbitrarily many parameters. The terror you are feeling at this possibility of an infinite-dimensional space of coupling parameters will be allayed when we discover the correct way to organize them two pages from now.
the map $h'_j \equiv R_j(h)$ describes a dynamical system on the space of parameters. If we are interested in macroscopic physics, we care about what happens when we do it lots of times:
$$h \mapsto R(h) \mapsto R(R(h)) \equiv R^2(h) \mapsto R^3(h) \mapsto \cdots$$
(When studying such a possibly-nonlinear dynamical system more generally, it is a good idea to ask first about the possible late-time behavior.)
What can happen? There are three possibilities:
The first case, where there is a fixed point, is the one about which we have a lot to say,
and fortunately is what seems to happen usually.
A crucial point: the distribution described by such a fixed point of the RG is self-
similar, by the definition of the RG transformation. (If this is true when our zooming
size λ → 1, then it is actually scale-invariant.)
Linearizing about the fixed point, let $h_j \equiv h^\star_j + \delta_j$, where |δ| ≪ 1 will be our small parameter. This maps under the RG step according to
$$h_j \equiv h^\star_j + \delta_j \ \mapsto\ h'_j = R_j(h^\star + \delta) \overset{\text{Taylor}}{=} \underbrace{R_j(h^\star)}_{= h^\star_j} + \delta_k \underbrace{\frac{\partial h'_j}{\partial h_k}\Big|_{h^\star}}_{\equiv R_{jk}} + \mathcal{O}(\delta^2)$$
where in the last step we assumed that the RG map R is analytic in the neighborhood of the fixed point, i.e. that it has a Taylor expansion. How could it not be? We got it by doing some finite sums of analytic functions. By $+\mathcal{O}(\delta^2)$ I mean plus terms that go like δ² and higher powers of δ, which are small and we will ignore them. If we ignore them, then the map on the deviation from the fixed point δ is a linear map:
$$\delta_j \mapsto \delta'_j = R_{jk}\, \delta_k.$$
We know what to do with a linear map: find its eigenvalues and eigenvectors:
$$R_{jk}\, \phi^{(n)}_k = \rho_n\, \phi^{(n)}_j. \qquad (1.6)$$
Notice that nothing we've said guarantees that $R_{jk}$ is a symmetric matrix, so its right and left eigenvectors need not be the same (the eigenvalues are), so we'll also need
$$\tilde\phi^{(n)}_j R_{jk} = \rho_n\, \tilde\phi^{(n)}_k.$$
About the eigenvalues, notice the following. We've defined the RG transformation R ≡ R_λ to accomplish a coarse-graining by a scale factor λ. We can imagine defining such a transformation for any λ, and these operations form a semigroup under composition
$$R_\lambda R_{\lambda'} = R_{\lambda\lambda'}.$$
This is useful because it says that the eigenvalues of the linearized operators
$$R_\lambda \phi^{(n)} = \rho_n(\lambda)\, \phi^{(n)}$$
must satisfy the same multiplication law^4
$$\rho_n(\lambda)\, \rho_n(\lambda') = \rho_n(\lambda\lambda'). \qquad (1.8)$$
But a function which satisfies this rule must have the form^5
$$\rho_n(\lambda) = \lambda^{y_n}. \qquad (1.9)$$
The eigenvectors of R give a preferred coordinate basis near the fixed point:
$$\delta_j = \sum_n g_n \phi^{(n)}_j, \qquad g_n = \sum_k \tilde\phi^{(n)}_k \delta_k, \qquad (1.7)$$
which we will use from now on. $y_n$ is called the scaling dimension of the coupling $g_n$.
Now we can see the crucial RG dichotomy which tames the infinitely many couplings: If |ρ_n| < 1 (y_n < 0) then as we act with R many times to get to long wavelengths, then g_n → 0. Such a coupling is called irrelevant: it goes away upon repeated RG transformations and its effects on macroscopic physics can be ignored. Notice that since the perturbation is getting smaller, the approximation |δ| ≪ 1 becomes better and better in this case.
In contrast, if |ρ_n| > 1 (y_n > 0) then as we act with R many times to get to long wavelengths, then g_n grows. (Notice that the eigenvectors need not be orthogonal.) Such
^4 Why do R_λ for different λ have the same eigenvectors? It really follows from the semigroup property. The eigenvectors are physical things: an eigenvector determines some operator O with the following property: if I add O to the fixed-point Hamiltonian, H* + gO, an RG transformation does not generate any other operators, i.e. it gives H = H* + αgO for some α.
On the other hand, the choice of by how much to zoom out (λ) is an arbitrary one. Doing the RG step by λ twice should give the same result as doing it once by λ². So in particular either one should give the same set of special directions.
^5 The function y(λ) ≡ log ρ_n(λ) then satisfies y(λ) + y(λ') = y(λλ'). First this implies y(1) = 0. If we consider λ' = 1 + ε, we have y(λ(1+ε)) = y(λ) + y(1+ε), which to first order in ε says λ y'(λ) = y'(1), i.e.
$$y(\lambda) = y'(1) \ln \lambda.$$
I'm not sure if the statement (1.9) follows if we only know (1.8) for discrete values of λ. Does it?
a parameter is called relevant, and represents an instability
of the fixed point: our linearization breaks down after repeated applications of R and
we leave the neighborhood of the fixed point.
The case of a coupling with $y_n = 0$, which doesn't change, is called marginal.
In these terms, the critical surface (actually its tangent space near the fixed point) is determined by
$$S(h^\star) = \{g_n = 0 \ \text{for all}\ n\ \text{with}\ y_n > 0\}.$$
In particular, the codimension of the critical surface in the space of couplings is the
number of relevant perturbations of the fixed point.
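The linearization recipe can be made concrete on a toy example. The two-coupling map R below is invented for illustration (it is not one of the RG maps in these notes); it has a fixed point at h★ = (1, 0), and we compute R_jk by finite differences and extract ρ_n and y_n = ln ρ_n / ln λ:

```python
import math

LAM = 2.0  # zoom factor assigned to one application of the toy map

def R(h):
    """Invented two-coupling RG map with a fixed point at h* = (1, 0)."""
    h1, h2 = h
    return (h1 ** 2 / (0.5 + 0.5 * h1),
            0.25 * h2 + 0.1 * (h1 - 1.0) * h2)

def linearize(f, hstar, eps=1e-6):
    """Finite-difference Jacobian R_jk = dh'_j / dh_k at the fixed point."""
    f0 = f(hstar)
    J = [[0.0, 0.0], [0.0, 0.0]]
    for k in range(2):
        hp = list(hstar)
        hp[k] += eps
        fk = f(tuple(hp))
        for j in range(2):
            J[j][k] = (fk[j] - f0[j]) / eps
    return J

J = linearize(R, (1.0, 0.0))
tr = J[0][0] + J[1][1]
det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
disc = math.sqrt(tr * tr - 4.0 * det)
rhos = ((tr + disc) / 2.0, (tr - disc) / 2.0)       # eigenvalues rho_n
ys = [math.log(r) / math.log(LAM) for r in rhos]    # scaling dimensions y_n
print(rhos, ys)   # one y_n > 0 (relevant), one y_n < 0 (irrelevant)
```

Here ρ = (1.5, 0.25), so y ≈ (0.585, −2): the first eigendirection is relevant (an instability of the fixed point), the second irrelevant.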
[End of Lecture 2]
2 Random walks
Next we generalize our ensemble of random walks to illustrate some features of the RG
that were missing from our simple pure Gaussian example above.
First, we can see an example of a relevant operator if we study a biased walk, with
$$p(\vec r) = \left(2\pi\sigma^2\right)^{-d/2} \exp\left(-\frac{|\vec r - \vec r_0|^2}{2\sigma^2}\right). \qquad (2.1)$$
Again define the distribution for the coarse-grained step to be
$$P(\vec r\,') = \int \prod_{i=1}^n d^d r_i\, p(\vec r_i)\, \delta\Big(\vec r\,' - \sum_i \vec r_i\Big)$$
(more Gaussian integrals)
$$= \left(2\pi n \sigma^2\right)^{-d/2} \exp\left(-\frac{|\vec r\,' - n \vec r_0|^2}{2 n \sigma^2}\right). \qquad (2.2)$$
After the rescaling step, to keep the width of the distribution fixed, we have
$$\sigma^{(R)} = \sigma, \qquad \vec r_0^{\,(R)} = \sqrt{n}\, \vec r_0.$$
So R is diagonal already. This says that the bias of the walk is a relevant operator of dimension y_0 = 1/2 > 0.
We have here an explicit example of an RG map R. Let’s study its fixed points.
There’s one at (σ, ~r0 = 0) (for any σ, so actually it is a family of fixed points parametrized
by the marginal coupling σ) which is the unbiased walk we studied earlier. This fixed
point is unstable because if we turn on a little r0 it will grow indefinitely.
And there's another fixed point at (σ, r⃗_0 = ∞). This is where we end up if we perturb the unbiased fixed point. The distribution (2.1) says (by direct calculation) that
$$R(M) = \sqrt{\left\langle |\vec R_M|^2 \right\rangle_M} = \sqrt{M^2 |\vec r_0|^2 + M \sigma^2}\ \overset{M \gg 1}{\longrightarrow}\ M |\vec r_0|.$$
This means that for large a, we'll need N(a) ∼ 1/a spheres of diameter a to cover the walk – it will be one dimensional.
This means that a system defined by some microscopic distribution of the form (2.1) with some value of r⃗_0 and σ will look like a Brownian walk of the type described above, with fractal dimension D = 2, if you look at it closely, with a resolution δx ≪ σ. But from a distance (resolution δx ≫ σ), it will look like a one-dimensional path (D = 1) in the r⃗_0 direction. For example, the number of balls defining the fractal dimension behaves as
$$N(a) \sim \begin{cases} a^{-2}, & a \ll \sigma \\ a^{-1}, & a \gg \sigma \end{cases}.$$
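A d = 1 numerical check of this crossover (my own sketch): with step mean r_0 = 0.1 and width σ = 1, the crossover sits at M* = (σ/r_0)² = 100, and ⟨R_M²⟩ = M² r_0² + M σ² on both sides of it.

```python
import random

# A d = 1 check of the biased-walk crossover: <R_M^2> = M^2 r0^2 + M sigma^2,
# diffusive for M << (sigma/r0)^2 = 100 and ballistic for M >> 100.
random.seed(2)
r0, sigma = 0.1, 1.0

def mean_R2(M, trials=2000):
    """Monte Carlo estimate of <R_M^2> for M biased Gaussian steps."""
    total = 0.0
    for _ in range(trials):
        x = sum(random.gauss(r0, sigma) for _ in range(M))
        total += x * x
    return total / trials

r2_small, r2_large = mean_R2(10), mean_R2(1000)
print(r2_small, 10 ** 2 * r0 ** 2 + 10 * sigma ** 2)      # ~ 11: diffusion dominates
print(r2_large, 1000 ** 2 * r0 ** 2 + 1000 * sigma ** 2)  # ~ 11000: drift dominates
```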
Now let the distribution from which a single step is sampled be any rotation-invariant distribution p(r⃗) = p(|r⃗|) with finite moments. For example, the fixed-step-length distribution $p(\vec r) = \frac{1}{4\pi a^2}\, \delta(|\vec r| - a)$ is a good one to keep in mind. (This is still not the most general walk, since we're still assuming the steps are independent. More on that next.) The distribution for the coarse-grained step is
$$P(\vec r\,') = \int \prod_{i=1}^n d^d r_i\, p(\vec r_i)\, \delta\Big(\vec r\,' - \sum_i \vec r_i\Big) = \int \bar d^d k\, e^{-i \vec k \cdot \vec r\,'} \left\langle e^{i \vec k \cdot \vec r} \right\rangle^n. \qquad (2.3)$$
The quantity
$$\left\langle e^{i \vec k \cdot \vec r} \right\rangle = \int d^d r\, p(\vec r)\, e^{i \vec k \cdot \vec r} \equiv g(k)$$
is called the characteristic function of the distribution p(r⃗), and is a generating function for its moments:
$$\langle r^m \rangle = (-i \partial_k)^m g(k)\big|_{k=0}.$$
The Taylor expansion in k of its logarithm is the cumulant expansion:
$$\log g(k) = \sum_m \frac{(ik)^m}{m!} C_m, \qquad C_m = (-i \partial_k)^m \log g\big|_{k=0}.$$
If we don't truncate the sum $\sum_m \frac{(ik)^m}{m!} C_m$, then the {C_m} are just another set of coordinates on the space of couplings for the walk. Why should we treat the integration variable k in (2.3)
$$P(\vec r\,') = \int \bar d^d k\, e^{-i \vec k \cdot \vec r\,'}\, e^{-\frac{n}{2} \sigma_0^2 |k|^2 + \mathcal{O}(n k^3)}$$
as small? Because the integrand is suppressed by the Gaussian factor. If the Gaussian bit dominates, then the integrand has support at $k \lesssim \frac{1}{\sqrt{n}\,\sigma_0}$, at which the mth term in the cumulant expansion contributes to the exponent in (2.3) as
$$n k^m C_m \sim n^{1 - \frac{m}{2}} \overset{n \to \infty}{\longrightarrow} 0 \quad \text{for } m > 2,$$
where the important thing for getting zero is just that C_m is finite and independent of n and k. This is the statement that the couplings C_m for m > 2 are irrelevant. Then we can do the remaining Gaussian integral (ignoring the small corrections which are suppressed by $e^{-n^{1-\frac{m}{2}} C_m}$):
$$P(\vec r\,') = \left(2\pi n \sigma_0^2\right)^{-d/2} \exp\left(-\frac{|\vec r\,' - n \langle \vec r \rangle|^2}{2 n \sigma_0^2}\right).$$
What's this? This is the Gaussian we used at the beginning, with $\vec r_0 = n \langle \vec r \rangle$.
This result – that the distribution for a sum of many random variables, independently distributed according to some distribution with finite moments, approaches a Gaussian – is usually called the Central Limit Theorem or the Law of Large Numbers. (For more on the derivation I recommend the discussion in Kardar volume 1.)
In the framework of the RG it is an example of universality: all such probability distributions are in the basin of attraction of the Gaussian random walk – they are said to be in the same universality class, meaning that they have the same long-wavelength
physics. In particular, their RMS displacement goes like $R_M \sim M^{1/2}$ for large number of steps M, and (for d ≥ 2) their fractal dimension is D = 2.
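Here is a minimal numerical illustration of the irrelevance of the higher cumulants (my own sketch, with uniformly distributed single steps): the excess kurtosis of the coarse-grained step, proportional to C_4, decays like 1/n.

```python
import random

# Illustration of universality: draw single steps from a decidedly
# non-Gaussian distribution (uniform on [-1, 1]) and look at the
# coarse-grained step, the sum of n of them. Its excess kurtosis, which
# vanishes for a Gaussian, shrinks like 1/n.
random.seed(1)

def excess_kurtosis_of_sums(n, samples=20000):
    data = [sum(random.uniform(-1.0, 1.0) for _ in range(n))
            for _ in range(samples)]
    m2 = sum(v * v for v in data) / samples
    m4 = sum(v ** 4 for v in data) / samples
    return m4 / m2 ** 2 - 3.0

k1, k30 = excess_kurtosis_of_sums(1), excess_kurtosis_of_sums(30)
print(k1, k30)   # -1.2 for the uniform distribution; near 0 after summing 30
```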
Notice that we did not prove that the Gaussian fixed point is the only one: we had
to assume that we were in its neighborhood in order to use the k ∼ n−1/2 scaling – this
scaling is a property of the neighborhood of the fixed point, just like the exponents y
we got by linearizing about the general fixed point in §1.5.
We could try to find other fixed points in the space of d-dimensional walk distributions. For example, we could have chosen the scaling to fix the coefficient C_m for any m. In that case we would find that the m − 1 perturbations $C_{l<m}$ are relevant and all the $C_{l>m}$ are irrelevant. The special case where we fix C_1 (i.e. choose k ∼ 1/n) gives the same fixed point we reached for the biased walk. The fixed points with $C_{m>2}$ fixed would have more than one relevant operator (we will learn to call this ‘multicritical’), which means reaching them requires tuning several parameters. For better or worse, these fixed-point distributions with m > 2 don't seem to exist as probability distributions, because they would have to have zero variance^6.
Also, the assumption in the statement of the CLT has an RG analog: if the initial distribution does not have finite moments, then our expansion in terms of cumulants is no good. An example is a Lorentzian distribution, $p(r) = \frac{\sigma/\pi}{r^2 + \sigma^2}$. In fact, in a certain sense the Lorentzian is a fixed point (if we set n = 2, where n is the parameter in the coarse-graining transformation as above).
(We will see another fixed point next when we include interactions between the
steps of the walk.)
One lesson which does generalize, however, is that most of the possible perturbations
of the fixed point are irrelevant, and there is only a small number of relevant or marginal
perturbations.
2.3 Self-avoiding walks have their own universality class
[Still from Creswick! I like this book. According to Amazon, Dover has put out a second edition.] Suppose that the random 1d objects we are studying are actually
polymers – long chain molecules made of ‘monomers’ which cannot be in the same
place, i.e. they have some short-ranged repulsion from each other. We can model this
as lattice paths without self-intersection, or self-avoiding walks (SAWs). Does this
microscopic modification of our ensemble change the long-wavelength physics?
It certainly changes our ability to do all the sums. If our polymer has n monomers, we'd like to know about the numbers
$$M_n(\vec R) \equiv \#\{\text{SAWs of } n \text{ steps whose endpoints are separated by } \vec R\}.$$
^6 Thanks to Tarun Grover for pointing this out to me. Maybe they do exist as simple analogs of ‘complex fixed points,’ where we drop some positivity assumptions.
Then we could figure out the RMS displacement from head-to-tail of the n-step polymer (actually we are not distinguishing between head and tail):
$$R(n)^2 \equiv \left\langle |\vec R|^2 \right\rangle_n = \frac{\sum_{\vec R} M_n(\vec R)\, |\vec R|^2}{M_n}.$$
The denominator here is $M_n \equiv \sum_{\vec R} M_n(\vec R)$. As with the unrestricted random walk, we might expect to have (we will) a scaling relation
$$R(n) \sim n^\nu. \qquad (2.5)$$
In (2.6), $G(K) \equiv \sum_{\vec R} G(K, \vec R)$. In this ensemble, for K < 1, the falloff of K^n with n fights against the growth of M_n to produce a sharp peak at some n_0(K).
There is a value of K where this peak step-length diverges,
since it is finite for K → 0 and infinite for K ≥ 1.
The weights are related by
$$W_{\Gamma'}(K') = \sum_{\Gamma \in \Gamma'} W_\Gamma(K).$$
3. K = Kc ' 0.297. This third one is where we go from finite walks at K slightly
below Kc to infinite walks at K > Kc .
The jagged line between K 0 = K and the curve defined by (2.8) depicts the repeated
action of the map with an initial condition near (but slightly below) the fixed point
at K = Kc . As you can see from the jagged line, the fixed point Kc is unstable –
the perturbation parametrized by K − Kc is relevant. Its dimension determines the
exponent ν defined in (2.5) as follows.
Because we are zooming out by a factor of λ, the typical size will rescale as
ξ(K) = λξ 0 (K 0 ).
Near the critical point,
$$\xi(K) \overset{K \to K_c}{\sim} |K - K_c|^{-\nu} = \underbrace{\lambda}_{=2}\, \underbrace{\xi'(K')}_{=\xi(K')} = 2\, \Big|\underbrace{K'(K) - K_c}_{= \frac{\partial K'}{\partial K}\big|_{K_c}(K - K_c)}\Big|^{-\nu}.$$
Therefore
$$|K - K_c|^{-\nu} = \lambda \left(\frac{\partial K'}{\partial K}\Big|_{K_c}\right)^{-\nu} |K - K_c|^{-\nu},$$
from which we conclude
$$\nu = \frac{\ln \lambda}{\ln \frac{\partial K'}{\partial K}\big|_{K_c}} = 0.771.$$
Numerical simulations give Kc = 0.379 and ν = 0.74.
Where are we making an approximation in the above? For example, some con-
figurations on the fine lattice have no counterpart on the coarse lattice (an example
is a walk which enters the cell and leaves again the same way). We are hoping that
these don’t make an important contribution to the sum. The real-space RG can be
systematically improved by increasing the zoom factor λ (clearly if we coarse-grain the
whole lattice at once, we’ll get the exact answer).
The important conclusion, however, is pretty robust: the d = 2 SAW has a different exponent than the unrestricted walk: $\nu_{\rm SAW} \simeq 0.77 > \frac{1}{2} = \nu_{\rm RW}$. This makes sense, since it means that R_RMS(SAW) > R_RMS(unrestricted) for many steps – the SAW takes up more space (for a fixed number of steps) since it can't backtrack. The fractal dimension is therefore smaller: $D_{\rm SAW} = \frac{1}{\nu} \simeq 1.3 < 2$.
Different exponents for the same observable near the critical point means different
universality class.
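The quantities M_n and ⟨|R⃗|²⟩ can be enumerated exactly for small n (a sketch of the counting problem above; serious estimates of ν need far longer walks, or the RG, so the exponent below is only an effective finite-size value):

```python
import math

# Exact enumeration of self-avoiding walks on the square lattice.
def saw_r2(n):
    """Squared end-to-end distances |R|^2 of all n-step SAWs from the origin."""
    out = []
    def extend(x, y, occupied, left):
        if left == 0:
            out.append(x * x + y * y)
            return
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if (nx, ny) not in occupied:
                occupied.add((nx, ny))
                extend(nx, ny, occupied, left - 1)
                occupied.remove((nx, ny))
    extend(0, 0, {(0, 0)}, n)
    return out

m4, m8 = saw_r2(4), saw_r2(8)
print(len(m4), len(m8))            # M_4 = 100, M_8 = 5916 walks
r2_4 = sum(m4) / len(m4)
r2_8 = sum(m8) / len(m8)
nu_eff = 0.5 * math.log(r2_8 / r2_4) / math.log(2.0)
print(round(nu_eff, 2))            # between 1/2 (unrestricted) and 1 (ballistic)
```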
Teaser: This ensemble of self-avoiding walks is the n → 0 limit of the O(n) model! More specifically, the critical point in temperature of the latter model maps to the large-walk limit: $T - T_c \sim M^{-1}$. This realization will allow us to apply the same technology we will use for the Ising model (which we could call the O(1) model) and its O(n) generalizations to this class of models.
3 Ising models
Words about the role of models, solvable and otherwise, and universality:
Fixed points of the RG are valuable. Each one describes a possible long-wavelength
behavior, and each one has its own basin of attraction. That basin of attraction includes
lots of models which are in some sense ‘different’: they differ in microscopic details of
values of couplings, and sometimes even more dramatically. Two important examples:
(1) a lattice model and a continuum model can both flow to the same fixed point. The
idea is that if the correlation length is much longer than the lattice spacing, the lattice
variable looks like a continuous field, and we can interpolate between the lattice points.
And at a fixed point scale invariance requires that the correlation length be infinity (or
zero).
(2) a model with two states per site (like an Ising magnet, the subject of this
section) and a model with infinitely many states at each site can flow to the same fixed
point. Here’s a picture of how that might come about. Suppose we have at each site a
variable called S which lives on the real line, and it is governed by the potential energy
function V(S) = g(S² − 1)². (So for example the Boltzmann distribution is e^{−βV(S)}.)
The parameter g might be relevant, in the sense that g → ∞ at long wavelengths. This
process of making g larger is depicted in the following figures (left to right g = 1, 11, 21):
As you can see, it becomes more and more energetically favorable to restrict S to just
the two values S = ±1 as g grows.
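This concentration of the Boltzmann weight can be checked numerically. (A quick illustration of mine, not from the notes; the window half-width 0.2 and the integration cutoff L = 3 are arbitrary choices.)

```python
import math

def prob_near_minima(g, beta=1.0, window=0.2, n=20001, L=3.0):
    """Riemann-sum estimate of the Boltzmann probability that S lies within
    `window` of a minimum S = +-1, for the double well V(S) = g*(S**2 - 1)**2."""
    dS = 2 * L / (n - 1)
    Z = near = 0.0
    for k in range(n):
        S = -L + k * dS
        w = math.exp(-beta * g * (S * S - 1.0) ** 2) * dS
        Z += w
        if min(abs(S - 1.0), abs(S + 1.0)) < window:
            near += w
    return near / Z
```

As g grows from 1 to 21 this probability approaches 1: the continuous variable S effectively becomes a two-state Ising spin.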
I’ve just made a big deal about universality and the worship of fixed points of the
RG. Part of the reason for the big deal is that universality greatly increases the power
of simple models: if you can understand the physics of some simple (even ridiculously
over-idealized) model and show that it’s in the same universality class as a system of
interest, then you win.
[Goldenfeld §2.5, Creswick §5, lots of other places] The Ising model is an important
common ground of many fields of science. At each site i ∈ Λ (Λ may be a chain,
or the square lattice, or an arbitrary graph, and i = 1...|Λ| ≡ N (Λ) = N is the
number of sites), we have a binary variable si = ±1 called a spin, whose two states are
sometimes called up and down. There are 2N configurations altogether. (Although I
will sometimes call these ‘states’ I emphasize that we are doing classical physics.)
The name ‘Ising model’ connotes the following family of energy functionals (also
known as Hamiltonians):
\[
-H(s) = \sum_{i\in\Lambda} h_i s_i + \sum_{ij} J_{ij}\, s_i s_j + \sum_{ijk} K_{ijk}\, s_i s_j s_k + \cdots \tag{3.1}
\]
where this sum could go on forever with terms involving more and more spins at once.
(The RG will generically generate all such terms, with coefficients that we can hope do
not cause too much trouble.) With this definition, the model may describe magnetic
dipoles in a solid, a lattice gas (where si = ±1 correspond to presence or absence of
a particle at i), constrained satisfaction problems, neural networks, ... anything with
bits distributed over space. This list also could go on forever.7,8
Equilibrium statistical mechanics. Why might we care about H(s)? We can
use it to study the equilibrium thermodynamics of the system, at some temperature
T ≡ 1/β. Let’s spend a few moments reminding ourselves about the machinery of
equilibrium statistical mechanics. The key ‘bridge’ equation between the microscopic
world (stat mech) and the macroscopic world (thermo) in thermal equilibrium is
\[
e^{-\beta F} = \sum_s e^{-\beta H(s)} \equiv Z, \qquad
\sum_s \equiv \sum_{s_1=\pm1}\sum_{s_2=\pm1}\sum_{s_3=\pm1}\cdots\sum_{s_N=\pm1} \equiv \prod_{i=1}^{N}\sum_{s_i=\pm1} \equiv \mathrm{tr},
\]
and we will sometimes write tr for ‘trace’. I emphasize that we are doing classical
physics here.
Why do we care about the free energy F ? For one thing, it encodes the thermody-
namics of the system: the average energy is
\[
E \equiv \langle H\rangle \equiv \frac{1}{Z}\,\mathrm{tr}\, H\, e^{-\beta H} = -\partial_\beta \log Z,
\]
7 Here is an example I learned of recently of how an Ising model is used for data clustering.
8 Sometimes the word 'Ising' is used to indicate the presence of the Z₂ symmetry under s → −s,
which is present when only even terms appear in H (h = 0, K = 0).
the entropy is
S = −∂T F,
the heat capacity is
\[
C_V = \partial_T E = \frac{1}{T^2}\left(\langle H^2\rangle - \langle H\rangle^2\right),
\]
which is a dimensionless measure of the number of degrees of freedom. Notice that the familiar
thermodynamic identity F = E − TS follows by calculus.
More ambitiously, if we knew how F depended on all the coupling parameters
{hi , Jij , Kijk ...} in (3.1), we would know all of the correlation functions of the spins,
for example
\[
\partial_{h_i} F = -T\,\partial_{h_i}\log Z = -T\,\frac{1}{Z}\,\mathrm{tr}\,\frac{s_i}{T}\, e^{-\beta H} = -\langle s_i\rangle.
\]
And similarly,
\[
\partial_{h_i}\partial_{h_j} F = -\left(\langle s_i s_j\rangle - \langle s_i\rangle\langle s_j\rangle\right) T^{-1} \equiv -G_{ij}\, T^{-1}.
\]
It is a generating function for these (connected) correlation functions.
Clean and local Ising models. Two important specializations of (3.1) are quite
important in physics (not always in the other applications of the Ising model). We will
(usually) restrict to the important special case with the following two assumptions.
1. the couplings (Jij and friends) are local in the sense that the coupling between
two sites goes away (Jij → 0) if the sites are far apart (|ri − rj | → ∞).
A reason to care about the two point function in the case where there is a notion
of locality, then, is that it allows us to define a correlation length, ξ:
\[
G_{ij} \overset{r_{ij}\gg a}{\sim} e^{-r_{ij}/\xi}
\]
– here a is the range of the interactions, or the lattice spacing, and r_ij ≡ |r_i − r_j|
is the distance between the locations of spins i and j. The correlation length
will depend on the parameters in H and on the temperature, and it measures
the distance beyond which the spin orientations are uncorrelated. More formally,
ξ^{−1} ≡ −lim_{r→∞} ∂_r ln G_{i,i+r} (but of course the ∞ here has to stay within the box
containing the system in question).
2. the couplings are translation invariant: Jij = Jf (|ri − rj |) for some function of
the distance f (r). (If one thinks of variations of Jij with i, j as coming from
some kind of microscopic disorder, one refers to this case as clean.) We will often
consider the case where f (r) only has support when r = a is one lattice spacing.
(Notice that s2 = 1 means that we can ignore the case when r = 0.)
These two assumptions are independent, but we will usually make both. So: on
any graph (of N sites), the nearest-neighbor ‘clean’ Ising model has energy functional
\[
-H = h\sum_i s_i + J\sum_{\langle ij\rangle} s_i s_j
\]
When J > 0, the energy of a configuration is lower if neighboring spins point the
same way; in this 'ferromagnetic' case everybody can be happy (and M ≠ 0). In
the antiferromagnetic case J < 0, neighbors want to disagree. All spins can agree
to disagree if the graph has no loops. Any loop with an odd number of sites, like
a triangle, leads to a frustration of the antiferromagnetic interaction, which requires
compromise and leads to drama.
Lack of drama for bipartite lattices. A bipartite lattice is one which can be
divided into two distinct sublattices A, B each of which only neighbors sites of the other
lattice. That is hiji contains only pairs, one from A and one from B. For example,
hypercubic lattices are all bipartite: let the A lattice be those sites (x, y, ...) whose
(integer) coordinates add up to an even number x + y + ... ∈ 2Z. The honeycomb
lattice is also bipartite. The triangular lattice is not. 9
[End of Lecture 4]
A consequence of bipartiteness is that any loop traverses an even number of sites,
since it must alternate between the two sublattices. Hence there is no frustration for
a (nearest-neighbor!) Ising antiferromagnet on a bipartite lattice. In fact, a stronger
statement is true. Since
\[
H_{h=0,J}(s^A, s^B) = -J \sum_{\langle ij\rangle} s_i^A s_j^B
\]
9 Notice, by the way, that bipartite does not require that A and B be isomorphic or even that they
have the same number of sites. For example, if we simply removed a (periodic) subset of sites (and
all the associated links) from the A sublattice of a lattice, we would still have a bipartite lattice. You
can find an example by googling 'Lieb lattice'. Beware confusion in the literature on this point.
if we flip the spins of one sublattice, we also reverse J:
\[
H_{h=0,J}(s^A, -s^B) = +J\sum_{\langle ij\rangle} s_i^A s_j^B = H_{h=0,-J}(s^A, s^B).
\]
– flipping all the spins and flipping the coefficients of odd powers of the spins preserves
the energy. In particular, if h = 0, K = 0, all odd powers do not appear, and flipping
the spins is a symmetry of the Hamiltonian. What consequence does this have for
thermodynamics?
\[
Z(-h, J, -K, T) = \sum_{\{s\}} e^{-\beta H_{-h,J,-K}(s)} = \sum_{\{s\}} e^{-\beta H_{h,J,K}(-s)} = Z(h, J, K, T). \tag{3.2}
\]
And therefore the free energy in particular satisfies F (−h, J, −K, T ) = F (h, J, K, T ).
Let’s set K = 0 from now on. This operation si → −si is a Z2 transformation in the
sense that doing it twice is the same as doing nothing. It is a symmetry when h = 0.
(Only when h = 0 does the transformation map the ensemble to itself.)
Question: does this mean that when h = 0 we must have zero magnetization,
\[
M = \frac{1}{N}\sum_i \langle s_i\rangle \propto \partial_h F \overset{?}{=} 0\ ?
\]
Answer: It would if F (h) had to be a smooth, differ-
entiable function. In order for hsih=0 to be nonzero, F (h)
must have a different derivative coming from positive and
negative h, as in the figure at right. This phenomenon is
called spontaneous symmetry breaking because the symme-
try reverses the sign of the magnetization M → −M .
But this phenomenon, of ∂_h F|_{h=0⁺} ≠ ∂_h F|_{h=0⁻}, requires the function F(h) to be
non-analytic in h at h = 0. This is to be contrasted with the behavior for a finite
system (N < ∞), where
\[
Z(h) = \sum_{\{s\}} e^{-\beta H(s)} = c_1 e^{-\beta h N m_1} + c_2 e^{-\beta h N m_2} + \cdots + c_n e^{-\beta h N m_n}
\]
(here m_k are the possible values of the magnetization and c_k count the configurations realizing
them) is a finite sum of analytic functions of h, and hence analytic at h = 0. Spontaneous
symmetry breaking requires the thermodynamic limit N → ∞.
Let’s step back from these grand vistas and apply the RG for the Ising model in one
dimension. Consider a chain of sites i = 1...N , arranged in a line with spacing a, and
with an even number of sites, N ∈ 2Z. And for definiteness, if you must, take periodic
boundary conditions sN +1 = s1 . Turn off the magnetic field, so
\[
H = -J\sum_{i=1}^{N} s_i s_{i+1}.
\]
We’ll speak about the ferromagnetic case, J > 0 (though the same results apply to
J < 0 since the chain is bipartite). The partition function
Z = tre−βH = Z(βJ)
is calculable exactly in many ways, each of which instructive. Since the partition
function only depends on the combination βJ, let us set β = 1.
In the spirit of the RG, let us proceed by a hierarchical route, by decimating the
even sites:
\[
\sum_{\{s_i\},\ i\ \text{even}} e^{-H(s)} = e^{-H_{\rm eff}(s_{\rm odd})}
\]
On the right hand side, we have defined the effective hamiltonian for the spins at the
odd sites. The odd sites are separated by distance a0 = 2a and there are half as many
of them. We can use this as the first half of an RG implementation (the second half is
rescaling). We’ve zoomed by a factor of λ = a0 /a = 2.
In this 1d case we can actually do these sums:
\[
\sum_{s_2=\pm1} e^{+J s_2 (s_1+s_3)} = 2\cosh\left(J(s_1+s_3)\right) \equiv \Delta\, e^{J' s_1 s_3}
\]
The ∆ business just adds a constant to the (free) energy, which divides out of the
partition function and we don’t care about it here.
We can figure out what the new parameters are by checking cases, of which only
two classes are distinct:
\[
\text{if } s_1 = s_3:\ \ 2\cosh 2J = \Delta e^{J'}, \qquad
\text{if } s_1 = -s_3:\ \ 2 = \Delta e^{-J'}.
\]
The product and ratio of these two conditions give
\[
\Delta^2 = 4\cosh 2J, \qquad e^{2J'} = \cosh 2J. \tag{3.3}
\]
Using hyperbolic trig identities, the second condition can be rewritten as
\[
v' = v^2 \tag{3.4}
\]
where v ≡ tanh J ∈ [0, 1]. The map (3.4) is another
explicit example of an RG map on the parameters. In this case, unlike the previous
SAW example, it happens to be exact.
The RG preserves symmetries. Why is the effective hamiltonian of the same
form as the original one? The couplings like the magnetic field multiplying odd numbers
of spins vanish by the Ising spin-flip symmetry of the original model. (More precisely:
because of the locality of H, we can determine Heff by decimating only a finite number of
spins. This rules out generation of nonzero h0 by some version of spontaneous symmetry
breaking. This requires locality of the interactions.) This line of thinking leads us to
expect that the effective hamiltonian should generally have the same symmetries as
the original one.
The 4-spin interaction vanishes because in 1d, each site has only two neighbors with
whom it interacts, each of which has only one other neighbor. So that was a bit of an
accident.
This map has two fixed points. One is v⋆ = 0, which is βJ = 0, meaning infinite
temperature, or no interactions; this one is 'boring' from the point of view of the
study of many-body physics and collective phenomena, since the spins don't care about
each other at all. The other fixed point is v⋆ = 1, which is βJ = ∞, meaning zero
temperature or infinite interaction strength. This is a ferromagnetic fixed point where
it is very urgent for the spins to agree with each other. The fact that there is no
fixed point at a finite temperature means that there is no critical behavior in the 1d
nearest-neighbor Ising model; only at T = 0 do the spins align with each other.
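A two-line numerical check of this flow (mine, not from the notes): iterating the decimation map v → v² sends any v < 1 to the trivial fixed point, while v = 1 stays put.

```python
def rg_flow(v, steps=30):
    """Iterate the 1d decimation map v -> v**2 and return the trajectory."""
    traj = [v]
    for _ in range(steps):
        v = v * v
        traj.append(v)
    return traj
```

Even a coupling as strong as v = 0.999 collapses to the infinite-temperature fixed point after a few dozen steps, illustrating the absence of a finite-temperature critical point in 1d.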
More explicitly, how does the correlation length behave? In zooming out by a factor
of λ, it changes by
\[
\xi(v) = \lambda\,\xi(v') = 2\,\xi(v^2) \implies \xi = -\frac{K}{\log v} \overset{T\to0}{\to} \frac{K}{2}\, e^{2J/T} \tag{3.5}
\]
(where K is a constant not determined by this argument) which is finite for T > 0.10,11
Why did it happen that there is no critical point at T > 0? A point of view
which illuminates the distinction between 1d and d > 1 (and is due to Peierls and
now permeates theoretical condensed matter physics) is to think about the statistical
mechanics of defects in the ordered configuration.
Consider a favored configuration at low-temperature, where all spins point the same
way. Small deviations from this configuration require reversing some of the spins and
will cost energy 2J above the aligned configuration for each dissatisfied bond. In 1d,
a single dissatisfied bond separates two happy regions, and is called a kink or domain
wall. Notice that the energy is independent of the size of each happy region (which is
called a domain). n domains of reversed spins cost energy 4Jn, since each domain has
two boundary links.
In 1d, each region of spins that we re-
verse has two boundaries, a kink and an
antikink.
At T = 0, the state minimizes the en-
ergy and there is no reason to have any kinks. But at T > 0, we care about (i.e. the
macroscopic equilibrium configuration minimizes) the free energy F = E − T S, and
the fact that there are many kink configurations matters.
10 A log is a special case of a power law: Taylor expand v^ν in ν about 0.
11 Preview: near less weird fixed points, the correlation length will diverge like a power law,
ξ(T) ∼ (T − T_c)^{−ν} as T → T_c, instead of this weird function.
How many are there? If there are n segments of s = −1 in a sea of s = +1 then we
must decide where to place 2n endpoints. The number of ways to do this is:
\[
\Omega(n) \simeq \binom{N}{2n} = \frac{N!}{(2n)!\,(N-2n)!} \overset{1\ll n\ll N}{\sim} e^{N\log N - 2n\log 2n - (N-2n)\log(N-2n)}
\]
where in the last step we used Stirling's formula. So the free energy for 2n kinks is
\[
F(n) \simeq 4Jn - T\left(N\log N - 2n\log 2n - (N-2n)\log(N-2n)\right).
\]
In equilibrium, the free energy is minimized with respect to any variational parameters12
such as n, which happens when
\[
0 = \partial_n F \simeq 4J - 2T\log\frac{N-2n}{2n} \implies \frac{2n_{\rm eq}}{N-2n_{\rm eq}} = e^{-2J/T}, \quad \text{i.e.}\ \ \frac{2n_{\rm eq}}{N} \simeq e^{-2J/T}\ \text{for}\ T\ll J.
\]
The equilibrium density of kinks is therefore nonzero at any T > 0: in 1d the entropy gain
always beats the energy cost of a kink, and the order is destroyed.
Now let us compute the spin-spin correlation function of the 1d chain. Using s_j s_k = ±1,
each nearest-neighbor Boltzmann factor can be rewritten as e^{βJ s_j s_k} = cosh(βJ)(1 + v s_j s_k),
so that
\[
G(r) \equiv \langle s_i s_{i+r}\rangle = \frac{\mathrm{tr}\; s_i s_{i+r} \prod_{\langle jk\rangle}(1 + v s_j s_k)}{\mathrm{tr}\; \prod_{\langle jk\rangle}(1 + v s_j s_k)} \tag{3.6}
\]
where v ≡ tanh βJ (as above). Think about expanding this product over links into a
sum. Each term in the sum gets either a 1 or a v s_j s_k from each link. Any term in the
sum can be visualized by coloring the links which contribute a v s_j s_k.
When we multiply this out, the dependence on any one of the spins si can be only
two things: 1 if the term has an even number of factors of si , or si if it has an odd
number. Here’s the Ising model integration table:
\[
\sum_{s_i} 1 = 1 + 1 = 2, \qquad \sum_{s_i} s_i = 1 - 1 = 0. \tag{3.7}
\]
In the last two paragraphs, we haven’t used the restriction to 1d at all. (This will
be useful in §3.2.) Consider a single spin s2 of an infinite 1d chain; if it is not one of
the two sites i or i + r in (3.6) the factors which matter to it are13:
\[
\sum_{s_2} (1 + v s_1 s_2)(1 + v s_2 s_3) \overset{\rm FOIL!}{=} \sum_{s_2} \left(1 + v s_2(s_1+s_3) + v^2 s_1 s_3\right) = 2\left(1 + v^2 s_1 s_3\right).
\]
This is just of the same form as if we had a direct link between 1 and 3 with weight v²
(up to the overall prefactor). Therefore, doing this repeatedly (r times) for the sites in
between i and i + r,
\[
G(r) = \frac{\mathrm{tr}\; s_i s_{i+r}\, 2^r \left(1 + v^r s_i s_{i+r}\right)}{\mathrm{tr}\; 2^r \left(1 + v^r s_i s_{i+r}\right)} = v^r
\]
(the terms containing v^r s_i s_{i+r} are the ones that survive the trace in the numerator).
Therefore G(r) = v^r = e^{−r/ξ} with ξ = −1/log v, in units of the lattice spacing,
consistent with (3.5).
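This prediction is easy to test by brute-force enumeration (a check I added; the chain length and coupling are arbitrary). For an open chain with free ends, G(r) = v^r holds exactly:

```python
import itertools
import math

def correlator(N, J, i, j):
    """<s_i s_j> for an open chain of N spins with -H = J * sum_k s_k s_{k+1}, beta = 1,
    computed by summing over all 2**N configurations."""
    num = den = 0.0
    for s in itertools.product([-1, 1], repeat=N):
        w = math.exp(J * sum(s[k] * s[k + 1] for k in range(N - 1)))
        num += s[i] * s[j] * w
        den += w
    return num / den
```

Comparing against tanh(J)^{|i−j|} confirms the decimation result to machine precision.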
3.2 High-temperature expansion
Return now to the moment at (3.7), right before we restricted our discussion to one
dimension. We had written the partition function of the nearest-neighbor Ising model
(on any graph) as a product over links
\[
Z = \cosh^{N_\ell}(\beta J) \sum_s \prod_{\langle ij\rangle} (1 + v s_i s_j) \tag{3.8}
\]
(one factor of cosh βJ per link; N_ℓ denotes the number of links)
and argued that expanding this binomial gives a sum over paths in the graph. More
explicitly, we think of the two terms in each link factor in (3.8) as a sum over another
dynamical variable, nhiji = 0, 1:
\[
1 + v s_i s_j = \sum_{n_{ij}=0,1} (v s_i s_j)^{n_{ij}}.
\]
Now we can do the sums over the spins using our ‘integration table’ above (3.7).
For each spin, the sum is
\[
\sum_{s_i=\pm1} s_i^{\sum_{\langle i|j\rangle} n_{ij}} = 2\,\delta\!\left(\sum_{\langle i|j\rangle} n_{ij} \equiv 0 \bmod 2\right)
\]
where ⟨i|j⟩ denotes the links ending at site i.
\[
Z = \cosh^{N_\ell}(\beta J) \sum_s \prod_{\langle ij\rangle}(1 + v s_i s_j) = 2^N \cosh^{N_\ell}(\beta J) \sum_{C} v^{\sum_l n_l(C)}
\tag{3.9}
\]
That is: we sum over lattice curves which have an even number of links going into each
site. The contribution of a curve C (which is not necessarily connected) is weighted by
v length(C) .
This rewriting of the Ising partition sum will be useful below.
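As a sanity check (mine, not from the notes), on a graph small enough to enumerate — a single square, whose only even subgraphs are the empty set and the full loop — the loop expansion can be compared against the direct sum:

```python
import itertools
import math

LINKS = [(0, 1), (1, 2), (2, 3), (3, 0)]  # a single square: 4 sites, 4 links

def Z_direct(J):
    """Brute-force partition sum over the 2**4 spin configurations (beta = 1)."""
    return sum(math.exp(J * sum(s[a] * s[b] for a, b in LINKS))
               for s in itertools.product([-1, 1], repeat=4))

def Z_loops(J):
    """High-temperature expansion (3.9): 2^N cosh^{N_l}(J) * sum over closed curves.
    For the 4-cycle the only even subgraphs are C = {} and the full loop of length 4."""
    v = math.tanh(J)
    return 2 ** 4 * math.cosh(J) ** 4 * (1 + v ** 4)
```

The two agree identically at any coupling, as they must.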
Behavior of the correlation length under RG. We’ve defined the correlation
length using the spin-spin correlator G(r), in terms of its rate of falloff for large r. Let
us use this to examine its behavior under the RG more directly. To do this, denote
more explicitly
\[
G_H(r) \equiv \frac{\mathrm{tr}\; s_i s_{i+r}\, e^{-H}}{\mathrm{tr}\; e^{-H}}.
\]
Now suppose that i and i + r are both odd sites (so that they survive our decimation);
in that case we can still do all the decimation as in the partition function :
\[
G_H(r) \equiv \frac{\mathrm{tr}_{e,o}\; s_i s_{i+r}\, e^{-H(s_e,s_o)}}{\mathrm{tr}_{e,o}\; e^{-H(s_e,s_o)}}
= \frac{\mathrm{tr}_o\; s_i s_{i+r}\, \mathrm{tr}_e\, e^{-H(s_e,s_o)}}{\mathrm{tr}_o\, \mathrm{tr}_e\, e^{-H(s_e,s_o)}}.
\]
I emphasize that the argument of G_H is measured in units of the lattice spacing, i.e. the
number of lattice sites between the spins. But recall that e^{−H'(s_o)} ∝ tr_e e^{−H(s_e,s_o)} defines
the effective Hamiltonian for the remaining odd sites, so this is precisely
\[
G_{H'}(r/2) \equiv \frac{\mathrm{tr}_o\; s_i s_{i+r/2}\, e^{-H'(s_o)}}{\mathrm{tr}_o\; e^{-H'(s_o)}},
\]
where now there are only half as many sites in between the spins in the new coarser
lattice. Under this RG, we are zooming out by a factor of 2. Altogether, GH 0 (r/2) =
GH (r). Combining this with the definition of ξ, we have
\[
\xi_{H'} = \frac{1}{2}\,\xi_H \tag{3.10}
\]
(as we said earlier).
The notation ξH is to emphasize that the correlation length is completely deter-
mined by the Hamiltonian (I am assuming thermal equilibrium here). At a fixed point,
the Hamiltonian does not change under the RG, so the correlation length can’t either.
This can be consistent with (3.10) in one of two ways
ξ⋆ = 0 or ξ⋆ = ∞.
The first case means that spins at different sites do not care about each other, as at
T = ∞. I’ve already disparaged this case as boring. The second case of a divergent
correlation length characterizes critical behavior and we define it to be interesting.
[Cardy §3.4, Domany, RG notes, chapter 1] Free energy density. Next I want to
show how to calculate the free energy from an 'RG trajectory sum'. It is a reason to care
about the constants in the effective hamiltonian, as in the constant a' in
\[
e^{-H'(s')} = e^{-a'N'}\, e^{J'\sum s'_i s'_j}.
\]
In the example above, we found e^{−a'N'} = Δ^{N/2}, i.e. a' = −log Δ, where Δ was some function of the
microscopic J.
Let the free energy density (free energy per site) be
\[
f \equiv -\frac{T}{N}\log Z_N(K).
\]
Here I am denoting by K the collection of all couplings, and labelling the partition
function Z_N by the number of sites. More explicitly, split off the constant piece of the Hamiltonian,
\[
H(s) = NC + \tilde H(s),
\]
so that H̃ has no constant piece (for quantum mechanical folks: it is like a 'normal-
ordered' Hamiltonian). And Z̃_N ≡ Σ_s e^{−βH̃(s)}, and naturally we'll denote
\[
\tilde f \equiv -\frac{T}{N}\log \tilde Z_N.
\]
This last expression is a little less innocent than it seems: I am anticipating here
that the free energy is extensive – has a leading piece at large N that grows like N,
F \overset{N\gg1}{=} Nf + O(N⁰) – so that f̃ is independent of N in the thermodynamic limit.
(We'll give an RG-based proof of this statement in §5.) Then f(K) = C + f̃(K).
Now some RG content: the partition function is invariant under the RG:
\[
Z_N(K) = e^{-\frac{NC}{T}}\,\tilde Z_N(K) = e^{-\frac{NC}{T}}\, e^{-\frac{N^{(1)} a^{(1)}}{T}}\,\tilde Z_{N/b}(K^{(1)})
\]
Here we’ve defined N (n) to be the number of sites decimated at step n, and N/bn is
the number of sites remaining. For the example above, these are the same, and b = 2:
N (n) = N/2n . As above K (n) = Rn (K) is the image of the couplings under n-times
repeated RG transformation. (Notice that if we were in d dimensions, we would have
b = λd , where λ is the linear zoom factor, and the number of sites decimated would not
equal the number remaining even for λ = 2.) Taking logs of the BHS of the previous
equation,
\[
f(K) = C + \sum_{k=1}^{n} \frac{N^{(k)}}{N}\, a^{(k)} + \frac{1}{b^n}\,\tilde f(K^{(n)}). \tag{3.12}
\]
If we iterate the RG transformation enough times, and f̃^{(n)} is finite, its contribution is
suppressed by b^{−n} → 0.
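For the 1d chain the whole trajectory sum can be carried out numerically (my check, with β = 1): the constants come from (3.3), via a^{(k)} = −log Δ(J^{(k−1)}), and the result can be compared against the exact transfer-matrix answer f = −log(2 cosh J).

```python
import math

def rg_free_energy(J, steps=60):
    """Free energy per site of the 1d nearest-neighbor Ising chain (beta = 1),
    accumulated along the decimation RG trajectory as in (3.12)."""
    f = 0.0
    weight = 1.0  # fraction of the original N sites the current chain represents
    for _ in range(steps):
        Delta = 2.0 * math.sqrt(math.cosh(2.0 * J))  # Delta^2 = 4 cosh 2J, from (3.3)
        f -= 0.5 * weight * math.log(Delta)          # half the surviving sites decimated
        weight *= 0.5
        J = 0.5 * math.log(math.cosh(2.0 * J))       # J' from e^{2J'} = cosh 2J
    # the leftover, nearly-free spins contribute f_tilde -> -log 2
    return f - weight * math.log(2.0)
```

Since the 1d decimation is exact, the trajectory sum reproduces −log(2 cosh J) to machine precision at any coupling.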
Magnetization. The magnetization can be calculated by taking derivatives of the
previous result:
\[
M \propto \partial_h f = \langle s_i\rangle,
\]
but here is some cleverness. By translation invariance the BHS is independent of i.
Therefore, we can choose i to be a site that survives all the decimation. Then
\[
\langle s_i\rangle_H = \frac{\sum_s s_i\, e^{-H}}{\sum_s e^{-H}}
= \frac{\sum_{s_o} s_i \overbrace{\sum_{s_e} e^{-H(s_o,s_e)}}^{=\,e^{-H'(s_o)}}}{\sum_{s_o}\sum_{s_e} e^{-H(s_o,s_e)}}
= \frac{\sum_{s_o} s_i\, e^{-H'(s_o)}}{\sum_{s_o} e^{-H'(s_o)}} = \langle s_i\rangle_{H'}.
\]
We have just shown that the magnetization is an RG invariant. This result required
that we are using a decimation scheme, where the spins surviving the RG are a subset
of the initial spins. I will come back to alternatives soon, and we will see why we need
them. This means we can compute the magnetization for a macroscopic system just
by following the flow to the end:
\[
\langle s_i\rangle = \frac{\sum_{s_i=\pm1} s_i\, e^{-H^\infty(s_i)}}{\sum_{s_i=\pm1} e^{-H^\infty(s_i)}}
\]
but H^∞(s_i) = a^∞ + h^∞ s_i (these are the only two possible terms) and h^∞ is the fixed-
point value of the Zeeman field. So
\[
\langle s_i\rangle = \frac{\sum_{s_i=\pm1} s_i\, e^{-h^\infty s_i}}{\sum_{s_i=\pm1} e^{-h^\infty s_i}}
= \frac{-e^{+h^\infty} + e^{-h^\infty}}{e^{+h^\infty} + e^{-h^\infty}} = -\tanh h^\infty.
\]
I emphasize again that this works only for decimation schemes.
Let’s think about decimation of the Ising model on the square lattice. Again it is
bipartite, and we can do the sum of each spin on one of the sublattices fixing the spins
on the other, one at a time:
\[
\sum_{s_x=\pm1} e^{J s_x (s_1+s_2+s_3+s_4)} \equiv \psi(s_1+s_2+s_3+s_4).
\]
The argument of the function ψ defined by this equation only takes the values 0, ±2, ±4.
We’ve set the Zeeman field h = 0, so it is even ψ(−x) = ψ(x), and there are only three
values of the argument we care about. For these values, it can be written as
\[
\psi(s_1+s_2+s_3+s_4) = e^{a' + J'(s_1 s_2 + s_2 s_3 + s_3 s_4 + s_4 s_1 + s_1 s_3 + s_2 s_4) + M' s_1 s_2 s_3 s_4}
\]
with values of a', J', M' determined by J, which you can figure out. The first term a' is
just a constant. The first four terms multiplied by J' are nearest-neighbor interactions
on the new (square) lattice with lattice spacing √2 a (rotated by π/4). This means
λ = √2; the number of remaining spins is N/2, so b = λ^{d=2} = 2 as expected in two
dimensions. The next two terms are next-nearest-neighbor exchange couplings (s_1 and
s_3 are separated by 2a) of the same size. Finally, M' multiplies a qualitatively-new
4-spin interaction, proportional to J⁴. Ick!
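The three constants can be extracted explicitly by evaluating ψ on the three distinct cases of the spin sum (a short computation I added, assuming the parametrization in the equation above):

```python
import math

def block_couplings(J):
    """Solve psi(S) = exp(a' + J'*(six pair products) + M'*s1*s2*s3*s4)
    for a', J', M', using the three distinct values of S = s1+s2+s3+s4."""
    A = math.log(2 * math.cosh(4 * J))  # S = 4: pair sum = 6, quartic = +1
    B = math.log(2 * math.cosh(2 * J))  # S = 2: pair sum = 0, quartic = -1
    C = math.log(2.0)                   # S = 0: pair sum = -2, quartic = +1
    Jp = (A - C) / 8.0                  # from (a'+6J'+M') - (a'-2J'+M') = A - C
    Mp = Jp - (B - C) / 2.0
    ap = B + Mp                         # from a' - M' = B
    return ap, Jp, Mp
```

Note that the sum of the six pair products equals (S² − 4)/2, so ψ really depends only on S, and a small-J expansion gives M' = O(J⁴), the advertised four-spin coupling.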
This isn’t so bad if we think of the initial Hamiltonian as sitting in a special corner
of the large and high-dimensional space of possible couplings, and the RG just moves
us to a more generic point:
\[
(J, 0, 0, \cdots) \overset{R}{\mapsto} (J', K', M', \cdots).
\]
That’s just a little ugly. But there’s a reason why it’s objectively bad: we can’t repeat
this RG step. After the first iteration, we generate couplings between spins of the
same sublattice of the remaining square lattice. This means we can’t just sum them
independently anymore. We could do some uncontrolled truncation, or we can find a
better scheme. There are 2d lattices for which a decimation scheme can work (i.e. can
be iterated).
We can nevertheless persevere by truncating the generation
of couplings. For example, if we keep terms only to order
J² and order K, we do not generate any further couplings
beyond J, K, and we find a closed set of RG recursion equations:
\[
J' = K + 2J^2, \qquad K' = J^2.
\]
These equations have three fixed points: (J, K) =
(0, 0), (∞, ∞) and (1/3, 1/9). The nearby flow diagram is in-
dicated at right. Fixing the couplings and varying T amounts
to the replacement (J, K) to (J/T, K/T ). Increasing the tem-
perature corresponds to scaling J, K down towards K0 =
(0, 0), the infinite-temperature fixed point, where everyone is
decoupled. This point and the zero-temperature fixed point
(K∞ , where all couplings are infinite) are separated by a new
fixed point with a single relevant perturbation. Let’s focus on
just the relevant dimension (which is not orthogonal to the
temperature direction), so we can draw a one-dimensional
plot (after all, we are already ignoring infinitely many other
irrelevant directions). We see that there is a critical value
Tc below which we flow to K∞ , and above which we flow to
K0 . A fixed point with a single relevant operator describes
a critical point, a continuous phase transition between two
phases.
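These recursion relations are easy to iterate numerically (my sketch, not from the notes; the escape thresholds are arbitrary). Bisecting along the K = 0 axis locates the separatrix between the two basins of attraction:

```python
def flows_to_strong_coupling(J, K=0.0, steps=200):
    """Iterate J' = K + 2J^2, K' = J^2 and report which trivial fixed point wins."""
    for _ in range(steps):
        J, K = K + 2 * J * J, J * J
        if J > 10.0:   # headed to (infinity, infinity): the ordered phase
            return True
        if J < 1e-9:   # headed to (0, 0): the disordered phase
            return False
    return J > 1.0 / 3.0  # undecided after many steps: compare to the fixed point

def critical_J(lo=0.0, hi=1.0, iters=60):
    """Bisect along the K = 0 axis for the boundary between the two basins."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (lo, mid) if flows_to_strong_coupling(mid) else (mid, hi)
    return 0.5 * (lo + hi)
```

Initial conditions slightly above the critical value flow to strong coupling and those slightly below flow to zero — the numerical signature of a single relevant direction at the nontrivial fixed point (1/3, 1/9).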
3.5 Low-temperature expansion, and existence of phase tran-
sition in d > 1
Maybe you still don’t believe me that there has to be a phase transition in the nearest-
neighbor Ising model, even in d = 2. At arbitrarily high temperatures, there is definitely
no spontaneous symmetry breaking, since each spin is just looking out for itself and
there can be no collective behavior, and hsi = m = 0. At T = 0, the spins all align
(as they do in d = 1, too). Here is an argument (due to Peierls, still) that the ordered
state survives to some finite temperature for d ≥ 2.
A configuration of lowest energy, say all si = +, has energy E0 = −JNl , where Nl
is the number of links of the graph (this is 2N for the square lattice since there are
two links in each unit cell, one up and one right). The minimal excitation above the
ordered configuration flips one spin and has energy E0 + 2zJ where z is the number
of neighbors of the flipped spin. We can estimate the entropy of a dilute gas of n such
flipped spins, with energy E(n) ∼ E_0 + 2Jzn; the number of configurations is again
approximately Ω(n) = \binom{N}{n}, and so their free energy is
\[
F \overset{\rm Stirling}{\sim} 2zJn - T\left(N\log N - (N-n)\log(N-n) - n\log n\right).
\]
(Actually, the flipped spins have a short-ranged attraction because if they are adjacent
they share a happy bond. We ignore this; think about why we can get away with it.)
This is minimized by an equilibrium density of flipped spins
\[
\frac{n_{\rm eq}}{N} \simeq e^{-2zJ/T}.
\]
All this so far is just like in the 1d argument, except we replaced 2 neighbors with z
neighbors, and counted spin flips rather than domain walls.14
Here’s the catch: The magnetization is not so strongly affected by a flipped spin as
it is by a domain wall. It is only decreased from the maximum (m = 1) to
neq
m=1−2 ' 1 − 2e−2zJ/T ' 1 if T zJ.
N
So this means that at low (but nonzero) temperature, the magnetization survives. And
therefore something interesting has to happen at some intermediate temperature.
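The minimization in this dilute-gas estimate can be done by brute force (my check, not from the notes; N = 10⁴ is an arbitrary system size), confirming n_eq/N ≃ e^{−2zJ/T}:

```python
import math

def equilibrium_flip_density(T, J, z, N=10000):
    """Scan n to minimize F(n) = 2zJ*n - T*ln binom(N, n), in its Stirling form."""
    def F(n):
        S = N * math.log(N) - (N - n) * math.log(N - n) - n * math.log(n)
        return 2 * z * J * n - T * S
    return min(range(1, N), key=F) / N
```

The exact stationarity condition is n/(N − n) = e^{−2zJ/T}, which reduces to the quoted density when the gas of flipped spins is dilute.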
[End of Lecture 6]
14 Why did we count domain walls in d = 1? Because in d = 1, the energy of a row of k flipped spins
is the same for any k. The elementary dynamical object is really the kink itself in d = 1.
This is the tip of an iceberg called 'fractionalization'.
3.6 A word from our sponsor
We’ve been spending a lot of time talking about Ising models. Let’s take a break and
talk about another role it plays in physics.
Lattice gas. Suppose our dynamical variables are the locations r1 ..rN of a collec-
tion of point particles. The grand canonical partition function is
X ζN Z P
Ξ(ζ) = dd r1 · · · dd rN e−β i<j V (ri −rj ) (3.13)
N
N!
where ζ is a fugacity for particle number, and V (r) is an interparticle potential, which
usually has a short-range repulsion and long-range attraction (most kinds of particles
find each other vaguely attractive from far away...). The kinetic energy was Σ_i p⃗_i²/(2m),
but we did the p⃗ integrals already: ∫ d^d p e^{−β p⃗²/(2m)} = (2πmT)^{d/2}.
These integrals in (3.13) are hard. If our interest is in critical behavior, we can
zoom out, and take the particles to live at the sites of a lattice r ∈ Λ, so our dynamical
variables are instead the number of particles at site r, n(r). To implement the short-
range repulsion, we take n(r) = 0, 1. Then we study
\[
\Xi_\Lambda(\zeta) = \sum_{\{n(r)=0,1\}} \zeta^{\sum_r n(r)}\; e^{-\frac{1}{2}\beta \sum_{r,r'} J_{r,r'}\, n(r) n(r')}
\]
where J(r−r0 ) implements the long-ranged part of the potential. If we change variables
to s(r) ≡ 2n(r) − 1 = ±1, we have
\[
\beta H(s) = -\frac{\beta}{2}\sum_{r,r'} J_{r,r'}\, s_r s_{r'} - \beta\sum_r h_r s_r + \text{const}
\]
with βh_r = ½ log ζ + β Σ_{r'} J_{r,r'}. This is an Ising model. The ferromagnetic ordering
transition is the liquid-gas transition! Recalling that this occurs at h = 0, we see that
the s → −s symmetry of the Ising model (with h = 0) is a symmetry of the lattice gas
only near the critical point – it is an ‘emergent symmetry’.
Another useful interpretation of the same model is as a ‘binary fluid’, where n = 0, 1
represent occupation by two kinds of fluid elements.
3.7 Duality
The dual lattice Λ̂ has a site at the center of each face of Λ (the square lattice is its own
dual). The domain walls of a spin configuration on the sites of Λ cover a set of links of Λ̂:
But our description of the low-temperature expansion on Λ as
\[
Z_\Lambda(T) = 2 \sum_{C} e^{-2\beta J\,\ell(C)} \tag{3.14}
\]
(up to a smooth prefactor; ℓ(C) is the total length of the domain walls C) has exactly
the same form as our high-temperature expansion (3.9) if we identify
\[
e^{-2\beta J} = \hat v \equiv \tanh \hat\beta J.
\]
The dual of the honeycomb lattice is the triangular lattice (and vice versa). To
learn their critical temperature, we add one more maneuver, called star-triangle trans-
formation: The honeycomb lattice is bipartite, and the two sublattices are triangular
lattices. By decimating one of the two sublattices, we can relate
\[
Z^{N}_{\rm hex}(J) = \Delta^{N/2}\, Z^{N/2}_{\rm tri}(K)
\]
Combining this with the duality relation we can relate the critical temperature of the
Ising model on the triangular lattice to itself.
Here is a table of the critical values of βJ for various lattices. z is the coordination
number, the number of neighbors of each site.

Λ            z    T_c/J
chain        2    0
honeycomb    3    1.52
square       4    2.27
triangular   6    3.64
The first entry is the 1d chain. You can see that the critical temperature rises with
coordination number.
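On the square lattice, which is self-dual, the critical point (assuming there is exactly one) must sit at the self-dual coupling, e^{−2K} = tanh K with K = βJ. A quick bisection (my check, not from the notes) reproduces the table entry T_c/J ≈ 2.27:

```python
import math

def dual_coupling(K):
    """Kramers-Wannier: exp(-2*Khat) = tanh(K), i.e. Khat = -0.5*log(tanh(K))."""
    return -0.5 * math.log(math.tanh(K))

def self_dual_point(lo=0.1, hi=2.0, iters=80):
    """Bisect for K* with dual_coupling(K*) = K*; dual_coupling is decreasing in K."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        lo, hi = (mid, hi) if dual_coupling(mid) > mid else (lo, mid)
    return 0.5 * (lo + hi)
```

The fixed point of the duality map is K⋆ = ½ log(1 + √2) ≈ 0.4407, i.e. T_c/J = 1/K⋆ ≈ 2.269, matching Onsager's exact result.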
Notice that the disordered (high-temperature) phase is dual to the ordered (low-
temperature) phase. That this is not a contradiction is related to the factor of 2 in
front of the partition sum in (3.14): the description in terms of domain walls doesn’t
really know about the magnetization.
If you can’t wait to learn more about the many generalizations of Kramers-Wannier
duality, here are some references: Kogut, Savit.
There is more to be said about this sum over curves. They can be used to solve the
2d Ising model exactly. They are the worldlines of free fermions.
3.8 Block spins

Previously, in the decimation schemes, the coarse-grained variables {s′} ⊂ {s} were a
subset of the microscopic variables. This is a special case of the more general blocking
rule
\[
e^{-H'(s')} \equiv \sum_s \prod_{\text{blocks } b} T(s'_b;\, s_{i\in b})\; e^{-H(s)}
\]
where T is a function which decides how the block spin s'_b depends on the spins s_{i∈b} in
the block. Decimation is the special case where we weight the opinion of one of the
spins over all the others:
\[
T_{\rm decimate}(s'_b;\, s_{i\in b}) = \delta_{s'_b, s_2}.
\]
Another option is majority rule:
\[
T(s'_b;\, s_{i\in b}) = \begin{cases} 1, & \text{if } s'_b \sum_{i\in b} s_i > 0 \\ 0, & \text{otherwise.} \end{cases}
\]
Notice that for each block, Σ_{s'=±1} T(s'; s) = 1 guarantees (3.15), i.e. that the partition
function is preserved, Σ_{s'} e^{−H'(s')} = Σ_s e^{−H(s)}. Furthermore, it is
4 Mean Field Theory
Mean field theory (MFT) is always simple and sometimes right, and it is all around
us in physics departments, so we must understand well when to believe it. We will see
that it goes bad near critical points, and the RG will come to our rescue. It is great
for getting a big picture of the phase diagram.
We’ll give three roads toward MFT, in order of decreasing squiggliness. For defi-
niteness, consider the Ising model, on any graph Λ:
\[
Z = \sum_s e^{-H(s)}, \qquad H(s) = -\frac{1}{2}\sum_{i,j\in\Lambda} J_{ij}\, s_i s_j - h\sum_i s_i.
\]
(I’ve put the 12 to agree with our previous definition of J, because here the sum is over
all i, j.) Mean field theory is an attempt to fulfill the urge everyone has to be able to
do the sums over the spins one at a time. If only J were zero, we could do this, for
example to compute the magnetization:
\[
m = \langle s\rangle = \frac{\sum_{s=\pm1} s\, e^{\beta h s}}{\sum_{s=\pm1} e^{\beta h s}} = \tanh\beta h. \tag{4.1}
\]
Now focus on one spin s_i; the terms in −H which contain it are s_i (h + Σ_j J_ij s_j) (the ½
is cancelled because each pair appears twice in the sum over i, j). From its point of view,
this is just like some external magnetic field depending on
what its neighbors are doing. What's s_j? Well, it's probably equal to its average
value, ⟨s_j⟩ = m. So let's just forget everyone else, and assume they are average and
incorporate them into an effective magnetic field:
\[
h_{\rm eff} \equiv \sum_j J_{ij}\, m + h.
\]
If we pretend that there is only one spin in the world, and this is the field it sees, then
we can compute m using (4.1):
\[
m = \tanh\beta h_{\rm eff} \overset{(4.1)}{=} \tanh\beta\left(zJm + h\right).
\]
Here I defined zJ ≡ Σ_j J_ij (for a nearest-neighbor model, z is the coordination number
and J the bond strength). This is an equation for m! We can solve it!
At least graphically or numerically we can solve it. Here is m (yellow) and tanh(zJm+
h) (blue) plotted versus m for two values of J (large and small compared to T , with
some small h)
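The graphical solution can be reproduced with a few lines of damped fixed-point iteration (mine, not from the notes; β = 1, so the only parameters are zJ and h):

```python
import math

def solve_mft(zJ, h, tol=1e-12, max_iter=100000):
    """Solve m = tanh(zJ*m + h) by damped fixed-point iteration (beta = 1).
    Converges to the solution favored by the sign of the applied field h."""
    m = 0.5 if h >= 0 else -0.5
    for _ in range(max_iter):
        m_new = math.tanh(zJ * m + h)
        if abs(m_new - m) < tol:
            break
        m = 0.5 * (m + m_new)  # damping helps convergence near zJ ~ 1
    return m
```

For zJ < 1 (high temperature) the only solution is m = 0; for zJ > 1 a nonzero magnetization survives even as h → 0⁺, which is the mean-field picture of the ordered phase.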
Here’s our second approach to MFT. Basically, here we will be more explicit about
what we’re leaving out (but it is the same as the previous discussion). We rewrite the
interaction term in the Ising hamiltonian as
We are going to treat the fluctuation about the mean δs as small. Then
1X X
Jij m(si + sj ) − m2 + h si + O(δs)2
−H =
2 ij i
1 2
X
= − N Jm + (zJm + h) si + O(δs)2 . (4.3)
2 i
N is the number of sites, and $zJ \equiv \sum_j J_{ij}$. The contribution $zJm$ to the external field
from the neighbors is sometimes called the ‘molecular field’. What we are neglecting
here (when we drop the O(δs)2 in a moment) is the correlations between the spins at
different sites i, j. This is not small if |ri − rj | < ξ, by definition of the correlation
length ξ. Brutally ignoring the correlations, then, we can do all the sums, and we have
$$Z \simeq e^{-\frac{1}{2}N\beta zJ m^2}\left(2\cosh\beta(zJm+h)\right)^N \equiv Z_{\rm MFT}.$$
Defining $f_{\rm MFT}(m) \equiv -\frac{T}{N}\log Z_{\rm MFT}$, I claim, and will prove next, that $f_{\rm MFT}(m) \geq f$: it is an upper bound on the correct free energy. This is true for every m, and so the best bound comes from minimizing over m. That condition gives back the equation for m (namely $m = \tanh\beta(zJm+h)$) that we got from self-consistency above. (And it will tell us what to do in the case of $zJ > T$, where there are three solutions.)
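To see concretely that minimizing reproduces the self-consistency condition, we can minimize the per-site free energy $f_{\rm MFT}(m) = \frac{zJ}{2}m^2 - T\log\left(2\cosh\beta(zJm+h)\right)$ (read off from $Z_{\rm MFT}$ above) on a grid. A sketch, with my own helper names:

```python
import math

def f_mft(m, T, zJ, h=0.0):
    """Per-site mean-field free energy: f = (zJ/2) m^2 - T ln(2 cosh(beta (zJ m + h)))."""
    beta = 1.0 / T
    return 0.5 * zJ * m * m - T * math.log(2.0 * math.cosh(beta * (zJ * m + h)))

T, zJ = 1.0, 2.0
grid = [i * 1e-4 for i in range(-10000, 10001)]
m_star = min(grid, key=lambda m: f_mft(m, T, zJ))
# the minimizer satisfies the self-consistency condition m = tanh(beta*(zJ*m + h))
residual = abs(m_star - math.tanh(zJ * m_star / T))
```
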
Our third approach is the variational method. [There is a good discussion of this
in Parisi’s book.] It will give our proof that fMFT (m) upper bounds f . The idea can
be found from a Bayesian viewpoint on statistical mechanics. Let's put this in a box:

Find the distribution $P_\star(s)$ which maximizes the entropy $S[P] \equiv -\sum_s P(s)\log P(s)$, subject to the constraint that the average energy is fixed: $E = \langle H \rangle_{P_\star} \equiv E[P_\star]$. The distribution should also be normalized, $\sum_s P(s) = 1$. We can impose these conditions with lagrange multipliers:
$$\Phi[P] \equiv S[P] + b\left(E[P]-E\right) + a\Big(\sum_s P(s)-1\Big) = -\sum_s P(s)\log P(s) + \sum_s \left(bH(s)+a\right)P(s) - bE - a$$
$$\frac{\delta\Phi[P]}{\delta P(s)} = -\log P(s) - 1 + bH(s) + a \implies P_\star(s) = e^{bH(s)+a-1}$$
where a, b must be determined to satisfy the two constraints.
If instead of fixing the average energy, we want to fix the temperature 1/β, what do we do? We should instead find the distribution $P_\star(s)$ which minimizes the free energy $F[P] \equiv E[P] - T S[P]$ as a functional of P. It is still normalized, so we need to use a lagrange multiplier again, and minimize
$$F_\lambda[P] \equiv F[P] + \lambda\Big(\sum_s P(s) - 1\Big)$$
from which we again recover the Boltzmann distribution, P (s) = e−βH(s) /Z (the mul-
tiplier λ is eliminated in favor of Z by normalizing).
This derivation is useful philosophically (for example, it evades all the vexing ques-
tions about ergodicity), and it also implies a variational bound on the free energy F .
That is, if we pick some arbitrary other distribution $P_{\text{off-the-street}}(s)$, then we know that its free energy is bigger than the correct equilibrium free energy:
$$F[P_{\text{off-the-street}}] \geq F[P_{\rm Boltzmann}] = F.$$
[End of Lecture 7]
So: to recover mean field theory, we choose a distribution which we like because we
know how to calculate its averages, that is, one which factorizes:
$$P_{\rm MFT}(s) = \prod_i p_i(s_i) \qquad (4.4)$$
Now we apply the variational bound. The free energy $F(m_i) \equiv F[P_{\rm MFT}^{(m_i)}]$ upper bounds the true free energy for any $m_i$, so we do best by minimizing it:
$$0 = \partial_{m_i} F = -\sum_j J_{ij} m_j - h_i + T\, {\rm arctanh}\, m_i$$
A better ansatz factorizes the distribution over blocks of sites, $P(s) = \prod_b p_b$, where b represents some blocks of sites. Such a state is more general than the MFT ansatz, and will have more variational parameters, and necessarily gives a better estimate of the correct free energy. Further thinking in this direction leads to cluster mean field theory and belief propagation algorithms.
On the form of the mean-field free energy. The most important conclusion from the mean field theory is that (for h = 0) there are two phases distinguished by whether or not the Z2 symmetry is spontaneously broken – at high T, we have m = 0, and at low T, $m \neq 0$. In between there is a phase transition$^{17}$, where m suddenly grows from zero. If we set h = 0 and study small m, we can expand $f_{\rm MFT}$ in m and find
$$f_{\rm MFT}(m) \simeq a + \frac{1}{2}Bm^2 + c\, m^4 + \ldots \qquad (4.5)$$
where a, c are constants. The coefficient B is
$$B \propto (1 - \beta zJ) \equiv b\, t,$$
where $t \equiv \frac{T-T_c}{T_c}$ is the "reduced" temperature. If c > 0, this function looks like one of the figures at right, where the top left figure is for $T > T_c^{\rm MF} = zJ$ and the bottom left
$^{17}$ In case I forgot to say so, a phase transition occurs when physical quantities are non-analytic in the parameters at some point in the parameter space – it means that Taylor expanding physics on one side of the phase transition gives the wrong answer (for something) on the other side.
figure is for T < the critical temperature. If c < 0, then we have to keep more terms in the
expansion to know what happens. (The right column is with h < 0.) So you can see
that the minimum of f occurs at m = 0 for $T > T_c$ (disordered phase) and $m \neq 0$ for $T < T_c$ (ordered phase). This figure makes it clear that the third solution of the MF
equations (at m = 0) that exists for T < Tc is a maximum of the free energy – it is
unstable.
[Parisi §4.3, 5.2] Before drawing any further physical conclusions from the MFT free
energy we just derived, let me say some words in defense of this form of the free energy
(4.5). These are the words (the idea is due to Landau; this is a paraphrase):
If the free energy is analytic near m = 0, it looks like this. So all that song and dance
about justifying mean field theory is really irrelevant to the conclusions we draw about
the phase transition from m = 0 (at T > Tc ) to m 6= 0 (at T < Tc ). The dependence
of B on T − Tc follows from (4.5) itself! With this assumption,
( fMFT (m) is the most
m 7→ −m
general answer, consistent with the symmetry under (at the same time).
h 7→ −h
So: the only real assumption leading to (4.5) is the analyticity of f (m). Some points:
(1) we will see immediately below that analytic f (m) does not mean that the physics
is analytic in external parameters – we can get critical behavior from this framework.
(2) When we find out that MFT gives wrong predictions for critical exponents, we will
have to find out how and why we get an f (m) which is not analytic. (3) The fact
that the coefficient of m2 is proportional to the deviation from the critical temperature
follows from our analysis of (4.5). The only input from the microscopic calculation
(with all the approximations above) is how the coefficients a, b, c, d depend on the
microscopic couplings. Notice that the actual magnetization $m = N^{-1}\sum_{i=1}^N \langle s_i \rangle$ is an average of numbers each ±1, and therefore lies between these two numbers. The
minimum of f (m) will not satisfy this constraint for all values of a, b, c, d... consistent
with the input above: this is a “UV constraint on IR physics” of the kind that the
string theorists dream about.
Types of phase transitions. A first order phase transition is one where the
minimum of the free energy jumps from one value to another, distant value, like if the
potential evolves as in this comic strip as a function of the parameter in question:
The two configurations need have nothing to do with each other, and there is no
notion of universal properties of such a transition. The correlation length need not
grow. This is what happens when we vary h from positive to negative, at nonzero
t < 0. The correlation length stays fixed, but the minimum jumps from −m0 to +m0
as h goes through zero (as in the comic strip above).
The alternative is a continuous phase transition which is more interesting, because
then, as we will see, there is a field theory which encodes a collection of universal
phenomena at and near the critical point.
(Sometimes, one hears about ‘nth-order’ phase transitions, where the nth derivative
of the free energy is discontinuous for various n ≥ 2, but I haven’t found the need to
distinguish between these. Moreover, it is only in mean field theory that the free
energy goes like integer powers of t (as in (4.6) below); more generally, taking enough
derivatives of the free energy will give a divergent (not just discontinuous) behavior
at the transition. So this more detailed ‘classification’ (due to Ehrenfest) is both
incomplete and not useful.)
Notice that when we say that ‘a transition is continuous’ it can depend on what
parameter we are varying: at T < Tc , as a function of the magnetic field, the transition
from one minimum to the other of the Ising model is first order. (This is what’s
illustrated in the comic above). But at h = 0, there is a continuous transition as T is
varied through Tc .
Here are some simple examples of the power of the LG point of view: If we break
the Ising symmetry the transition should generically be first order. This allows a cubic
term in the potential, and it means that as we cool from high temperatures, one of the
two minima at m 6= 0 will have f (m) < f (0) before (at a higher temperature than the
one where) f 00 (0) becomes negative.
A continuous transition is, however, not an inevitable consequence of Ising symmetry: if c < 0, then we must consider the $m^6$ term. Depending on the signs, there is a regime where the minima at $m \neq 0$ descend before $f''(0)$ goes negative.
Usually (but not always) TcMF > Tc , since the fluctuations we
are ignoring disfavor the ordered state. (Sometimes in fact Tc ≤ 0.)
Mean field critical exponents. The very fact that there is a notion of Tc in
MFT is worth remarking on. Lots of stuff is non-analytic at Tc !
Near $T_c$, we can expand
$$f(m) \simeq a + b\, t\, m^2 + c\, m^4 - \mu h\, m + \ldots$$
where $t \equiv \frac{T-T_c}{T_c}$ is the non-dimensionalized deviation from the critical temperature. Notice that a, b, c, µ really do depend on T, but only weakly (i.e., $a = a_0 + a_1 t + \cdots$).
When h = 0, the free energy is minimized when:
$$m = \begin{cases} 0, & t > 0 \\ \pm\sqrt{\frac{b}{2c}(-t)}, & t < 0. \end{cases}$$
(The exponent in $m \sim (-t)^{1/2}$ is called β, so $\beta_{\rm MFT} = 1/2$.) At t = 0 and small $h \neq 0$, minimizing instead gives $m \sim h^{1/3}$. This exponent is called δ (defined by $h \sim m^\delta$ at $t = 0$), and $\delta_{\rm MFT} = 3$. (I'm mentioning this botany of greek letters because there are people for whom these letters are close friends.)
Finally, the free energy density evaluated at the minimum, at h = 0, is
$$f(t) = \begin{cases} a, & t > 0 \\ a - \frac{(bt)^2}{4c}, & t < 0 \end{cases} \qquad (4.6)$$
which means that ∂t2 f jumps at the transition; this jump is actually an artifact of
MFT.
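These two exponents can be checked by brute-force minimization of the quartic free energy. A sketch (the choices b = c = 1 and the grid parameters are arbitrary illustrative values):

```python
def landau_f(m, t, h, b=1.0, c=1.0):
    """Quartic Landau free energy f = b*t*m^2 + c*m^4 - h*m (constant a dropped)."""
    return b * t * m * m + c * m ** 4 - h * m

def minimize(t, h):
    grid = [i * 1e-4 for i in range(-15000, 15001)]
    return min(grid, key=lambda m: landau_f(m, t, h))

# beta = 1/2: at h = 0 and t < 0, the minimum sits at m* = sqrt(-t/2) (for b = c = 1)
m_beta = abs(minimize(t=-0.01, h=0.0))    # expect sqrt(0.005) ~ 0.0707
# delta = 3: at t = 0, a small field gives m* = (h/4)^(1/3)
m_delta = minimize(t=0.0, h=1e-3)         # expect (2.5e-4)^(1/3) ~ 0.0630
```
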
Otherwise, the behavior in general predicted by MFT is good, but we’ll see that
the values of these exponents aren’t always right (and why and when, and then we’ll
understand how to fix them). In particular, mean-field critical exponents are always
rational numbers. In contrast, for the 3d Ising model, β = 0.326419(3), which isn’t
looking very rational. This value comes from the conformal bootstrap program to solve
and classify fixed points.
Notice that the critical exponents do not depend on the particular values of the
parameters a, b, c, µ · · · . This is one reason to hope that they can be understood, and
that they are universal in the sense defined earlier.
It is worth thinking about what the
extrema of this potential do as we vary
the parameters. At right is a plot of the
free energy evaluated at all of the critical
points of f (m) as h varies (the other cou-
plings are fixed to T < Tc ). (This picture
is sometimes called a ‘swallowtail’.) In-
set in red is the shape of the potential
at the corresponding value of h. Plot-
ted below is the corresponding magneti-
zation. Notice that the number of (real)
critical points goes from 1 to 3 as |h| is
decreased below some value; the two new
extrema are pair-produced from the com-
plex plane, that is, the new extrema come
in pairs and have a larger free energy. No-
tice further that ∂h2 f > 0 along the top
trajectory – this is the maximum near the
origin. The other one is actually a local minimum – a metastable state, responsible for
hysteresis phenomena at the first-order transition. More on the physics of this in §5.5.
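The pair-production of the extra extrema as |h| decreases can be seen by counting the real roots of f'(m). A sketch (t, b, c are fixed to illustrative values; for these values the pair appears at |h| ≈ 0.54):

```python
def count_critical_points(h, t=-1.0, b=1.0, c=1.0):
    """Count real roots of f'(m) = 2*b*t*m + 4*c*m^3 - h by sign changes on a grid."""
    def fprime(m):
        return 2.0 * b * t * m + 4.0 * c * m ** 3 - h
    grid = [i * 1e-3 for i in range(-3000, 3001)]
    vals = [fprime(m) for m in grid]
    return sum(1 for u, v in zip(vals, vals[1:]) if u * v < 0)

n_small_h = count_critical_points(h=0.1)   # below |h| ~ 0.54: three critical points
n_large_h = count_critical_points(h=1.0)   # above it: only one
```
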
LG Theory for other symmetries. Here is another illustration of the Power of Landau. We've been studying models with a Z2 symmetry acting by $m \mapsto -m$, $h \mapsto -h$. Suppose instead of this, we made the replacement Z2 → O(n), a rotation symmetry acting on a generalization of the magnetization with n components, $m \to m^a$; in that case the external field would be $h \to h^a$, and the transformation rule would be
$$m^a \mapsto R^{ab} m^b, \qquad h^a \mapsto R^{ab} h^b.$$
– the quadratic terms are completely independent of the other n − 1 components of the fluctuations $\delta m_2, \ldots, \delta m_n$! We'll see in a moment that this absence of a restoring force means that those degrees of freedom have infinite correlation length, everywhere in the ordered phase. They are called Goldstone modes.
[End of Lecture 8]
$^{18}$ Dachuan Lu reminds me that for some values of n, there can sometimes be extra invariants, such as $\epsilon_{i_1\cdots i_n} m^{i_1}\cdots m^{i_n}$.
‘Microscopic’ Landau-Ginzburg Theory. In our variational derivation of mean
field theory, we actually derived a stronger bound, since we allowed for spatially-varying
magnetization. Let’s combine the Landau point of view with the knowledge that the
free energy is extensive$^{19}$ to learn the answer without doing any work. Because F is
extensive, we can write the free energy as a sum over a contribution associated to each
lattice site, or patch of the lattice, $F = \sum_i f_i$, where $f_i$ depends on the magnetization
mi at site i and nearby sites. (Think about assembling the system from big enough
chunks.) If the correlation length is not so small, fi will vary smoothly and we can
approximate this as an integral: $\sum_i f(x_i) \simeq a^{-d}\int d^dx\, f(x)$. The integrand, in turn,
depends locally on the field and its derivatives. Translation invariance forbids any
explicit dependence on x:
$$F[m] = \int d^d x\, f\big(m(x), \vec\nabla m(x), \nabla^2 m(x), \ldots\big).$$
The expansion of the integrand in powers of m and derivatives begins
$$f_{LG} = V(m) + \kappa\, \vec\nabla m \cdot \vec\nabla m + \kappa'(\nabla^2 m)^2 + \ldots \qquad (4.7)$$
where $V(m) = a + Bm^2 + cm^4 + dm^6 + \ldots$ is the value when m is constant in space – it
contains all the information about the mean field treatment of phase transitions, some
of which we discussed above.
We will have a lot more to say about how to organize this expansion. So far it
is an expansion in powers of m (since we know that in the neighborhood of the critical
point m is small). It is also an expansion in the number of derivatives, something like
the dimensionless quantity a∇m, where a is the lattice spacing. If this quantity is
not small then we are asking the wrong question, because the ‘field’ we are treating
as continuous is varying rapidly on the scale of the lattice spacing a. The RG will
give us a better understanding of this expansion: we’ll see that operators with more
derivatives are more irrelevant (near any of the fixed points under discussion here).
The equation (4.7) contains an enormous amount of information. To better appre-
ciate it, let’s first discuss the mean-field treatment of the correlation function.
By the way, what exactly is the LG free energy? It is not convex in m, so how can
it be the actual free energy?
$^{19}$ I owe you some discussion of why this is the case. This happens in §5.1.
[Goldenfeld §5.6] The answer to this is that it is the free energy with the constraint
that the (coarse-grained) magnetization is fixed to be m(r):
$$e^{-\beta F_{LG}[m]} \equiv \sum_s e^{-\beta H(s)} \prod_{\text{blocks } r}\, \delta\Big(\sum_{i\in r} s_i - m(r)\, N_\Lambda(r)\Big). \qquad (4.8)$$
Here r denotes a block, and NΛ (r) is the number of sites in the block r. This is just
like the construction of the block-spin effective Hamiltonian. It is only more ambitious
in that we are hoping that m(r) is smoothly varying in r, which will be true if ξ > a.
So the LG free energy S can be regarded as (a parametrization of) the coarse-grained
free energy.
It is indeed analytic in m, since we need to do only a finite number of sums in (4.8).
And, also because there is only a finite number of sums, it need not be convex.
How do we get the actual, thermodynamic free energy (which is convex and need not be analytic in its arguments) from $F_{LG}$? We have to do the rest of the sums, the ones over m:
$$e^{-\beta F} = \sum_{\{s\}} e^{-\beta H(s)} = \sum_m e^{-\beta F_{LG}[m]}.$$
Because m(r) is a continuous variable, '$\sum_m$' is actually an integral, one for every block r:
$$\sum_m = \prod_r \int dm(r) \equiv \int [Dm]$$
where the right equation defines what we mean by such a ‘functional integral.’
Altogether, we have
$$Z = \int [Dm]\, e^{-\beta F_{LG}[m]}$$
– we have rewritten the partition function (in a regime of moderately large correlation length) in terms of a field theory functional integral. The quantity appearing in the exponent of such an integral,
$$Z = \int [Dm]\, e^{-S[m]},$$
is called the action.
4.2 Correlations; Ginzburg criterion for MFT breakdown
[Goldenfeld §5.7] You might think that the spirit of mean field theory is antithetical
to obtaining information about correlations between the spins, since after all that was
precisely what we ignored to do the sums. Not so!
Here's a first pass. The connected correlator (assume translation invariance) is $G(r) \equiv \langle s_r s_0 \rangle - \langle s_r \rangle\langle s_0 \rangle$, and summing it over r gives the susceptibility: $T\chi_T = \sum_r G(r)$. This is called the static susceptibility sum rule. It relates a thermodynamic quantity $\chi_T$ to an (integrated) correlation function. If the correlation length is big enough, ξ > a, then we can approximate the sum by an integral
$$\chi_T = \frac{1}{T a^d}\int d^d r\, G(r).$$
Is the integral well-defined? The lower limit of integration, the UV, is fine because
we are talking about a lattice model. When ξ is finite, the fact that the correlations fall off rapidly, $G(r) \overset{r \gg a}{\sim} e^{-r/\xi}$, means that the integral converges in the IR (the upper limit of integration) as well.
But: $\chi_T \to \infty$ at the critical point; in fact we saw above that $\chi_T \overset{\rm MFT}{\sim} \frac{1}{T-T_c} + \text{regular terms}$ as $T \to T_c$.$^{20}$ The only way this can happen consistently with the susceptibility sum rule is if ξ → ∞ as well at the transition. We'll see in a moment with what power it diverges.
it diverges.
MFT for G(r). We can actually do better and find the form of G(r) within the
mean field approximation. This is because G(r) is a response function. Here’s what
this means.
When h = 0, the correlation function is
$$\langle s_r s_0 \rangle = \frac{\sum_s s_r s_0\, e^{-H(s)} \cdot 1}{\sum_s e^{-H(s)} \cdot 1}$$
$^{20}$ If I keep chanting 'γ = 1' maybe I will remember these letters someday.
where we can write 1 cleverly as a sum over the values of the spin at the origin. Let $\sum'$ mean that we sum over all the spins but the one at 0, with $s_0 = +1$ fixed, and let $\langle \ldots \rangle_0$ denote expectation in this ensemble. Then the correlation function $\langle s_r s_0 \rangle$ is
just the magnetization at r, m(r) in response to an (infinite) applied field (completely)
localized to r = 0. In the presence of this localized source, m(r) will certainly depend
on its distance from the source. But the mean field equation (for r 6= 0) still takes the
form
!
X
m(r) = tanh β Jrr0 m(r0 )
r0
m1 X
' β Jrr0 m(r0 ) (r 6= 0) .
r0
In the second line, we retreated to small m, which is useful for T > J. (Otherwise
maybe we need some numerics.) We can do better and include the corrections at the
origin, by including a source:
$$m(r) = \beta\sum_{r'} J_{rr'}\, m(r') + A\,\delta_{r,0}.$$
Fourier transforming,
$$(1 - \beta\tilde J(k))\,\tilde m_k = A,$$
where
$$\tilde m_k \equiv \sum_{r\in\Lambda} e^{i\vec k\cdot\vec r}\, m(r), \qquad m(r) = \int_{BZ} \bar d^d k\; e^{-i\vec k\cdot\vec r}\, \tilde m_k.$$
In the inversion formula, the integral is over the Brillouin zone of the lattice Λ; for a cubic lattice, this just means each component $k_\mu \in (-\pi/a, \pi/a]$. The Fourier transform of the coupling is
$$\tilde J(k) \equiv \sum_r e^{i\vec k\cdot\vec r}\, J_{r,0}.$$
For example, for a cubic lattice with nearest-neighbor coupling J, this is $\tilde J_{\rm cubic}(k) = J\sum_{\mu=x,y,\ldots} 2\cos k_\mu a$, where a is the lattice spacing.
A necessary condition for its self-consistency is that the expected value of this term, calculated within MFT, is small compared to the MFT energy:
$$\langle \Delta H \rangle_{\rm MFT} \ll E_{\rm MFT}.$$
[End of Lecture 9]
We assume that $J_{rr'}$ has a smaller range than $G_{rr'}$ (i.e. R < ξ), so that we may approximate the RHS as
$$zJ\, G(0) = A\int_{BZ} \frac{\bar d^d k}{1 - \beta\tilde J(k)} \simeq \frac{A}{R^2\beta}\int_{|k|<a^{-1}} \frac{\bar d^d k}{k^2 + \xi^{-2}}. \qquad (4.10)$$
In a lattice model, the integral is over the Brillouin zone. The dangerous bit, where
the RHS can become big, though, comes from k → 0, which doesn’t care about your
lattice details. We used this in replacing G̃k with its long-wavelength approximation
in the last step of (4.10). In making this approximation, we may as well replace the
BZ integral with a simple cutoff |k| < a−1 since the form of the integrand is wrong for
|k| ∼ a−1 anyway.
To separate out the UV physics ($k \sim \frac{2\pi}{a}$) from the IR physics ($k \sim \frac{2\pi}{L}$), let's use the partial-fractions trick familiar from calculus:
$$\frac{1}{k^2+\xi^{-2}} = \frac{1}{k^2} - \frac{\xi^{-2}}{k^2(k^2+\xi^{-2})}$$
so that
$$I \equiv \int_{|k|<a^{-1}} \frac{\bar d^d k}{k^2+\xi^{-2}} = \underbrace{\int_{|k|<a^{-1}} \frac{\bar d^d k}{k^2}}_{\text{ind. of } T} - \xi^{-2}\int_{|k|<a^{-1}} \frac{\bar d^d k}{k^2(k^2+\xi^{-2})}.$$
The first term is a (possibly big, honking) constant, which doesn’t care about the
temperature or the correlation length. The second term is finite as a → 0 if d < 4
(finding that this integral is infinite as a → 0 just means that the short-distance stuff
at the lattice matters). (Note that the integral is finite as L → ∞ if d > 2.) When the
integral is finite, we can scale out the dependence on ξ (define x ≡ |k|ξ):
$$I \overset{\xi \gg a}{=} \text{const} + \xi^{2-d}\, K_d \int_0^\infty \frac{x^{d-3}\, dx}{x^2+1}$$
where
$$K_d \equiv \frac{\Omega_{d-1}}{(2\pi)^d}$$
is a ubiquitous combination of angular factors; Ωd is the volume of the unit d-sphere.
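The claimed $\xi^{2-d}$ scaling of the ξ-dependent piece of I is easy to confirm numerically in d = 3, where the radial integral has the closed form $I = \Lambda - \xi^{-1}\arctan(\Lambda\xi)$ (cutoff Λ, angular factors dropped). A sketch:

```python
import math

def I_of_xi(xi, d=3, cutoff=1.0, steps=200000):
    """Midpoint rule for the radial integral of k^{d-1}/(k^2 + xi^-2) up to the cutoff.
    (The angular factor K_d is dropped; it cancels in the comparison below.)"""
    s = xi ** -2
    dk = cutoff / steps
    total = 0.0
    for i in range(steps):
        k = (i + 0.5) * dk
        total += k ** (d - 1) / (k * k + s)
    return total * dk

# In d = 3: I = cutoff - (1/xi)*arctan(cutoff*xi), a xi-independent UV constant
# plus a piece scaling as xi^(2-d) = 1/xi.
xi = 100.0
dev = (1.0 - I_of_xi(xi)) * xi   # should approach arctan(cutoff*xi) ~ pi/2
```
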
So: the demand that the things we ignored be small corrections to the MFT energy computed within MFT requires
$$\frac{A T_c\, \xi^{2-d}}{R^2} \ll J t.$$
Remembering that we derived $\xi_{MF} = R\, t^{-1/2}$, we can write this condition purely in terms of the mean field correlation length. If the condition
$$\xi^{4-d} \ll R^4$$
is violated then mean field theory is wrong. (The $R^4$ on the RHS stands in for some quantities with the right dimensions which do not vary with t near the transition.)
So for sure this condition is violated if ever ξ → ∞ in d < 4. (Remember that d is
the number of space dimensions.)
Note that the condition depends on the range R of the interactions: MFT works
better for longer-range interactions, and in more dimensions.
Why does MFT improve with dimension? Recall the key step: we approximate the values of the neighboring spins by their average, $s_j \overset{?}{=} \langle s_j \rangle$, and treat the coefficient of $s_i$ as an effective 'molecular' field $h_i^{\rm eff} = J\sum_{\langle i|j\rangle} \langle s_j \rangle + h_i$.
More dimensions or longer range means more neighbors (for example, for the hypercubic lattice in d dimensions, each site has 2d neighbors); more neighbors means that there are more terms in the sum $J\sum_{\langle i|j\rangle} s_j + h_i$. If the correlations between the terms in the sum are small enough, the central limit theorem tells us that the fractional error
decays with the number of terms in the sum. And this assumption is self-consistent,
since in MFT the spins sj are statistically independent (the probability distribution
factorizes).
The preceding argument says that at asymptotically large d, MFT becomes more
and more correct. You saw on the homework that when the number of neighbors grows
with N (namely with all-to-all interactions), then MFT is exact. When d = 1 MFT
is completely wrong, since there is no ordering at all at finite T . So something must
happen somewhere in between. We’ve just learned that that somewhere is d = 4.
d = 4 is maybe not so exciting for statistical mechanics applications. However, the
same machinery can be used with one of the dimensions interpreted as time. For more
on this, I refer you to references on QFT (such as my 215C notes).
d = 4 = dc is called the upper critical dimension (in the sense that mean field theory
is correct for larger dimensions) for the Ising critical behavior (since we’ve been talking
about the case with Ising symmetry). More generally, the upper critical dimension can
be efficiently determined from the zoo of critical exponents as follows. The fractional
error in mean field theory can be rewritten as
$$\text{error} \sim \frac{\int_V d^d r\, G(r)}{\int_V d^d r\, m(r)^2} \qquad (4.11)$$
where V is a 'correlation volume', a region of space whose linear size is ξ. The numerator is $\int_V d^d r\, G(r) = T\chi_T \sim t^{-\gamma}$. The denominator is $\xi^d |t|^{2\beta} \sim t^{2\beta-\nu d}$, so the condition that (4.11) is small is
$$1 \gg t^{-\gamma-2\beta+\nu d} \implies d_c = \frac{2\beta+\gamma}{\nu}.$$
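Plugging in numbers (a trivial sketch): the mean-field exponents give $d_c = 4$, while the measured 3d Ising exponents make $(2\beta+\gamma)/\nu$ come out ≈ 3, reflecting the hyperscaling relation $2\beta+\gamma = \nu d$:

```python
def d_c(beta_exp, gamma, nu):
    """Upper critical dimension from the Ginzburg argument: d_c = (2*beta + gamma)/nu."""
    return (2.0 * beta_exp + gamma) / nu

# mean-field exponents: beta = 1/2, gamma = 1, nu = 1/2 give d_c = 4
dc_mf = d_c(0.5, 1.0, 0.5)
# measured 3d Ising exponents (beta ~ 0.326, gamma ~ 1.237, nu ~ 0.630) give ~3,
# i.e. the hyperscaling relation 2*beta + gamma = nu*d is satisfied at d = 3
dc_3d = d_c(0.326, 1.237, 0.630)
```
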
Continuum field theory
Along the way in the preceding discussion of correlation functions in mean field
theory, we showed the following, which is a useful summary of the whole discussion,
and makes contact with the microscopic Landau-Ginzburg theory. Consider the simple
case where
$$J_{ij} = \begin{cases} J, & r_{ij} \leq R \\ 0, & r_{ij} > R. \end{cases}$$
Then we showed that the contribution to the mean-field free energy from the interaction
term is
$$-\Delta f_{MF}[m] = \sum_{ij} J_{ij}\, m_i m_j$$
$$= -J\frac{a^2}{4}\sum_i \sum_{|\delta|\leq R}\left[\left(\frac{m_{i+\delta}-m_i}{a}\right)^2 - \left(\frac{m_{i+\delta}+m_i}{a}\right)^2\right]$$
$$= -J\frac{a^2}{4}\sum_i \sum_{|\delta|\leq R}\left(\frac{m(r_i+\delta)-m(r_i)}{a}\right)^2 + \underbrace{O(m^2)}_{\text{correction to } V(m)}$$
$$\overset{\rm Taylor}{\simeq} -J\frac{a^2}{4}\sum_i \sum_{|\delta|\leq R}\left(\frac{\vec\delta\cdot\vec\nabla m(r_i)}{a}\right)^2 + O(m^2)$$
$$\simeq -\frac{zJR^2}{4}\int \frac{d^d r}{a^d}\left(\vec\nabla m\right)^2 + O(m^2)$$
where z is the coordination number of the lattice. Comparing this to our 'local' Landau-Ginzburg expression (4.7), we've learned that the constant in front is
$$\kappa \simeq \frac{R^2 zJ}{4 a^d} = \frac{R^2 T_c^{MF}}{4 a^d}.$$
For the case of a localized source, h(x) = δ(x), (and ignoring the interaction terms $m^{n>1}$) the solution in Fourier space
$$\tilde m_k = \frac{(2\kappa)^{-1}}{k^2 + bt/\kappa}$$
gives back $\xi^{-1} = \sqrt{bt/\kappa}$. You might think that ignoring the higher powers of m is OK near the critical point, since m is small; this assumption gives back mean field theory (which we've already seen is not always correct).
In case you’re not comfortable with this derivation of the continuum field theory
description of Ising models with large correlation length, another approach is outlined
on the problem set.
$^{21}$ For those of you who are not at home with variational calculus, please see the sidebar on the subject at §4.2.1.
Return for a moment to our discussion of the LG theory of a system with an O(n)
symmetry. Recall that in the ordered phase, we found that n − 1 of the modes did not
appear in the quadratic term of the LG free energy. Now you can see why I said that
the existence of these Goldstone modes implied that the correlation length was infinite
everywhere in the ordered phase.
4.2.1 Sidebar on Calculus of Variations
$^{22}$ If you are unhappy with thinking of what we just did as a use of the chain rule, think of time as taking on a discrete set of values $t_i$ (this is what you have to do to define calculus anyway) and let $x(t_i) \equiv x_i$. Now instead of a functional $S_V[x(t)]$ we just have a function of several variables $S_V(x_i) = \sum_i V(x_i)$. The basic equation of calculus of variations is even more obvious now:
$$\frac{\partial x_i}{\partial x_j} = \delta_{ij}$$
and the manipulation we did above is
$$\delta S_V = \sum_j \delta x_j\, \partial_{x_j} S_V = \sum_j \delta x_j\, \partial_{x_j} \sum_i V(x_i) = \sum_j \sum_i \delta x_j\, V'(x_i)\, \delta_{ij} = \sum_i \delta x_i\, V'(x_i).$$
5 Festival of rigor
Let us pause in our assault on field theory to collect some Facts that we know for sure
about the free energy of short-ranged lattice models. As with any rigorous, formal
results in physics, it will be crucial to understand the hypotheses.
[Parisi pp. 41-42] The Ising model free energy is extensive, F/N = f + terms which
go to zero as the number of sites N → ∞. In particular, in the thermodynamic limit,
the bulk free energy density f doesn’t care about boundary conditions. This assumes
that J is short-ranged: Jr,0 is either of finite support (system-size-independent range),
or falling off sufficiently rapidly in r.
Here is an RG-inspired proof of this result. We begin with a finite system, with N
sites.
First, notice that the hamiltonian H(s) is bounded: $|H(s)| \leq N D$ for some constant D (for the near-neighbor Ising model on a cubic lattice it's J for each link, so D = dJ).
We can bound the free energy, too, by realizing that the number of configurations
is finite – for a finite lattice with N sites, there are only 2N of them. Each one
contributes an energy below the maximum value, and above the minimum value. If
all 2N configurations achieved the max/min value, we get the smallest/biggest possible
values of the partition function:
$$2^N e^{-\beta N D} \leq Z_N \leq 2^N e^{\beta N D}.$$
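The bound can be checked by brute-force enumeration for a small 1d ring (a sketch; the parameter values are arbitrary):

```python
import math
from itertools import product

# 1d nearest-neighbor Ising ring: H = -J sum_i s_i s_{i+1}, so |H| <= N*D with D = d*J = J.
N, J, beta = 8, 1.0, 0.7
D = J  # d = 1

def energy(s):
    return -J * sum(s[i] * s[(i + 1) % N] for i in range(N))

Z = sum(math.exp(-beta * energy(s)) for s in product([1, -1], repeat=N))
lower = 2 ** N * math.exp(-beta * N * D)
upper = 2 ** N * math.exp(+beta * N * D)
```
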
exist.) Take L ≫ R, the range of the interactions. Let $Z_L^F$ be the partition function for this chunk.
Now we try to double the (linear) size of the system, by gluing together the right
number (2d ) of smaller chunks of size L. Gluing just means that we add the terms in
the hamiltonian which couple the sites across the interface. The number of terms we
have to add is Ld−1 R for each interface (each pair of chunks) we glue, and we have to
glue 2d interfaces. The magnitude of the contribution of each term is bounded by D.
Therefore
$$\left(Z_L^F\right)^{2^d} e^{-\beta D\, 2d\, L^{d-1} R} \leq Z_{2L}^F \leq \left(Z_L^F\right)^{2^d} e^{+\beta D\, 2d\, L^{d-1} R}.$$
Taking the log and dividing by $(2L)^d$ gives
$$\left| f_{2L} - f_L \right| \leq \frac{2d\, D\, R}{2^d\, L},$$
so the free energy density converges as we iterate this doubling. Again when we take the log and divide by the volume $L^d$, the terms proportional to $\tilde\Delta \equiv \Delta + T\ln 2$ are suppressed by a factor of L.
Thermodynamic limit
We conclude that in a system in d dimensions of linear size L, with short-range interactions, the free energy takes the form:
$$F = L^d f_b + L^{d-1} f_\partial + O(L^{d-2}),$$
$$f_b = \lim_{L\to\infty} \frac{F}{L^d}, \qquad f_\partial = \lim_{L\to\infty} \frac{F - L^d f_b}{L^{d-1}}.$$
$f_\partial$ is a boundary free energy density.
Two questions to ponder:
1. What should we hold fixed in the limit L → ∞? In a fluid, we might want to fix the density of particles, $\rho = N_{\rm particles}/L^d$. If we instead fix $N_{\rm particles}$, we get a boring (zero-density) fluid.
2. How can the thermodynamic limit fail to exist? We consider a class of examples
where it might fail next.
In vacuum, φ(r) would be $\frac{e}{r}$. We will determine it self-consistently. The electron
number density is proportional to the probability p(r), and must approach the average
density far away (where φ → 0), so
n(r) = n∞ e−βeφ(r) .
This is just the equation we solved in (4.12) to find the correlation function G(r) away
from the critical point, at finite $\xi^{-2} = 4\pi\beta e^2 n_\infty$, and the solution is
$$\phi(r) = \frac{e}{r}\, e^{-r/\ell_D} \equiv \frac{e_{\rm eff}(r)}{r}. \qquad (5.3)$$
The name of the correlation length in this case is
$$\ell_D \equiv \sqrt{\frac{T}{4\pi e^2 n_\infty}},$$
the Debye screening length.
the Debye screening length. In the second equality in (5.3) I introduced a distance-
dependent effective charge eeff (r): how much charge you see depends how closely you
look.
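One can check directly that the screened potential (5.3) solves the linearized equation away from the origin: writing $u(r) = r\phi(r)$, screening means $u'' = u/\ell_D^2$ for r > 0. A finite-difference sketch (the parameter values are arbitrary):

```python
import math

e_charge, ell_D = 1.0, 2.0   # arbitrary illustrative values

def u(r):
    """u(r) = r*phi(r) = e * exp(-r/ell_D); screening means u'' = u/ell_D^2 for r > 0."""
    return e_charge * math.exp(-r / ell_D)

# central finite difference for u'' at r = 1
r, dr = 1.0, 1e-3
u_second = (u(r + dr) - 2.0 * u(r) + u(r - dr)) / dr ** 2
mismatch = abs(u_second - u(r) / ell_D ** 2)
```
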
The continuum approximation we've used here is consistent with classical corpuscles if the average interparticle distance is small compared to the screening length:
$$n_\infty^{-1/3} \ll \ell_D,$$
which is true when $e^3\sqrt{n_\infty} \ll T^{3/2}$, i.e. at high enough temperature, consistent with our approximation in (5.2).
You might worry that a collection of charges of both signs, once we let them all
move around, might either implode or explode. This paper by Lieb, called The Stability
of Matter, is very interesting and not too forbidding. The early sections are about the
stability of matter to implosion, which is a short-distance issue (whose resolution cru-
cially involves quantum mechanics and the Pauli principle and hence is off-limits here);
but Section V contains a ‘rigorous version of screening’ which removes the concern that
matter should want to explode like in (5.1).
Other power laws. Suppose instead of Coulomb interactions in d = 3, we have
particles interacting pairwise via a potential $U(r) = \frac{A}{r^\sigma}$ in d dimensions. Then the energy of a collection of particles with density ρ(r), in a ball of radius R, $B_R$, is
$$E(R) = \frac{1}{2}\int_{B_R} d^d r \int_{B_R} d^d r'\, \rho(r)\, U(r-r')\, \rho(r')$$
$$\overset{\text{uniform } \rho}{\simeq} A\,\frac{\rho^2}{2}\int_{B_R} d^d r \int_{B_R} d^d r' \frac{1}{|r-r'|^\sigma}$$
$$= A\,\frac{\rho^2}{2}\, R^{2d-\sigma}\, C(d,\sigma) \qquad (5.4)$$
where
$$C(d,\sigma) \equiv \int_{B_1} \frac{d^d x\, d^d y}{|x-y|^\sigma}.$$
In the last step we scaled out the system-size dependence of the integral by defining $r \equiv Rx$, $r' \equiv Ry$. This C is just a dimensionless number – if it's finite. In that case, the 'bulk energy density' (free energy density at T = 0) is $E(R)/R^d \propto R^{d-\sigma}$, which survives the thermodynamic limit only if this does not grow with R.
[Goldenfeld §2.6] We’re going to prove some facts about the nearest-neighbor Ising
model, with Hamiltonian
$$H(s) = -J\sum_{\langle ij\rangle} s_i s_j - h\sum_i s_i. \qquad (5.5)$$
(1) With the additive normalization in (5.5), the bulk free energy density is negative: f < 0. Note that this normalization means
$$\sum_s H(s) = 0 \qquad (5.6)$$
– it is normal-ordered.
Proof of (1): Begin with N < ∞ sites. The free energy density is $f = F/N = -\frac{T}{N}\log Z$, so the claim f < 0 means Z > 1. The partition function $Z = \sum_s e^{-\beta H(s)}$ is a sum of $2^N$ positive terms (for 0 < T < ∞). And Z > 1 because there exists a configuration $s^\star$ which by itself contributes a term $e^{-\beta H(s^\star)} > 1$. For example, for J > 0, h > 0, it happens when $s_i^\star = 1, \forall i$. But more generally, it follows from the normal-ordering condition (5.6): since H(s) is not identically zero, there must be configurations with both signs of H(s), and at least one which has $H(s^\star) < 0$.
(2) The entropy density is
s = −∂T f ≥ 0.
∂x f is non-increasing (and in particular the derivative ex-
ists almost everywhere). f can have cusps.
up. On the other hand, as a function of T = 1/β, the free energy f (T ) = −T ln Z(T )
is indeed anti-convex.
A useful alternative viewpoint: anticonvexity follows by showing that all second
derivatives of f are negative. For example,
$$\partial_\beta^2 f = -\frac{1}{\beta N}\left\langle (H - \langle H \rangle)^2 \right\rangle \leq 0$$
is proportional to minus the specific heat, aka the variance of the energy. Similar
statements hold for other variations, such as the magnetic susceptibility
$$\partial_h^2 f = -c\left\langle (s - \langle s \rangle)^2 \right\rangle \leq 0.$$
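Anti-convexity in h can be checked by exact enumeration of a small system (a sketch; the discrete second difference of f(h) is negative, as the variance formula requires):

```python
import math
from itertools import product

# 4-site Ising ring: f(h) = -(T/N) ln Z(h) should be anti-convex in h.
N, J, T = 4, 1.0, 1.5

def f(h):
    beta = 1.0 / T
    Z = sum(math.exp(beta * (J * sum(s[i] * s[(i + 1) % N] for i in range(N))
                             + h * sum(s)))
            for s in product([1, -1], repeat=N))
    return -T * math.log(Z) / N

# discrete second derivative in h: minus the susceptibility, so it should be negative
eps = 1e-3
second_diff = (f(0.3 + eps) - 2.0 * f(0.3) + f(0.3 - eps)) / eps ** 2
```
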
which interpolates between the two ensembles. By similar steps as above, ln Z(t) is convex in t. Convexity of a function implies that it lies above any of its tangents, and in particular$^{24}$
$$\ln Z(1) \geq \ln Z(0) + \partial_t \ln Z(t)\big|_{t=0}, \quad \text{i.e.} \quad F \leq F_0 + \langle H - H_0 \rangle_0.$$
On the right hand side we then have a bound on the free energy in terms only of easy-to-compute quantities. (Consider what happens in the case of the Ising model, if we take $H_0 = \sum_i s_i h_i$.)
so that the magnetization is
$$m = -\partial_h f = \begin{cases} m_s + O(h^{\sigma-1}), & h > 0, \\ -m_s + O(h^{\sigma-1}), & h < 0. \end{cases}$$
(If σ were not larger than one, the magnetization would diverge as h → 0 and that’s
not happening, since it’s bounded (|m| ≤ 1). I also imposed f (h) = f (−h) by Ising
symmetry.)
But before the thermodynamic limit, f (h) is a smooth function. This means the
two limits h → 0, N → ∞ are clashing violently:
$$\lim_{N\to\infty}\lim_{h\to 0}\frac{1}{N}\partial_h F = 0 \qquad\text{but}\qquad \lim_{h\to 0}\lim_{N\to\infty}\frac{1}{N}\partial_h F = \pm m_s.$$
Yang-Lee singularities. Here is a toy model of how this can come about. Suppose
our system of volume V is so tightly bound that only two configurations matter, the one
where all N spins point up, m = +V , and the one where they all point down, m = −V .
(All the rest of the configurations have such a large energy that we can ignore their
contributions to Z.) So a single spin s = ±1 determines the whole configuration.
Then, in a field, we have
$$Z(h) = \sum_{s=\pm 1} e^{\beta h V s} = 2\cosh\beta hV$$
and
$$f(h) = -\frac{T}{V}\log\left(2\cosh\beta hV\right), \qquad m(h) = -\partial_h f = \tanh\beta hV \xrightarrow{V\to\infty} m(h) = \mathrm{sign}(h).$$
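The clash of limits is easy to see numerically in this toy model (a quick sketch, not part of the notes):

```python
import math

def m(h, V, beta=1.0):
    # magnetization of the two-configuration toy model: m = tanh(beta h V)
    return math.tanh(beta * h * V)

# at fixed small h > 0, m -> sign(h) = +1 as V grows ...
for V in (10, 100, 1000, 10000):
    print(V, m(0.01, V))
# ... but at fixed V, m -> 0 as h -> 0: the limits do not commute
print(m(1e-12, 1000))
assert m(0.01, 10_000) > 0.999
assert abs(m(1e-12, 1000)) < 1e-6
```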
p0 (s) = Z −1 e−βH(s)
then the Ising symmetry H(s) = H(−s) implies directly that the magnetization van-
ishes:
$$m = \langle s\rangle \stackrel{?}{=} \langle s\rangle_0 \equiv \sum_s p_0(s)\,s = 0.$$
What gives? Consider, at small h > 0 and finite N , the ratio of the probabilities
of two configurations: a reference configuration s, and the one related to it by a global
spin reversal. If $m(s) \equiv \frac{1}{N}\sum_i s_i$ is the magnetization in this configuration, then
‘locality’. SSB means that cluster decomposition fails for the symmetric distribution.
Only the non-symmetric ‘pure states’ with q = 0, 1 satisfy this demand (this is the
definition of ‘pure state’ in this context).
[Goldenfeld, §4] First, let's recall some thermodynamics facts. I will speak in the language of fluids, but with appropriate substitutions of letters, it can be translated into the physics of magnets or other examples. At fixed volume, the free energy which is minimized in equilibrium is the Helmholtz one (the one we've been talking about), $F(T, V, N) = E - TS$. If instead we fix the pressure $P$, the quantity which is minimized in equilibrium is the Legendre transform of $F$, named for Gibbs:
$$G(T, P, N) = F + PV,$$
in terms of which the first law of thermodynamics is
$$dG = -S\,dT + V\,dP + \mu\,dN.$$
The Gibbs-Duhem relation (basically, integrating the first law) says $E = -PV + TS + \mu N$, so that in fact $G = \mu N$ is just proportional to the chemical potential.
Let’s consider a situation at fixed P where there is
a first order transition, between two phases I, II (for
example, liquid and gas) where the order parameter
is the volume, or the density (equivalently at fixed N ,
since V = N/ρ). Along the phase boundary, where
they exchange dominance, we must have
GI = GII . (5.8)
Hence also µI = µII ; this is a condition for chemical equilibrium of the two phases.
and therefore we get the Clausius-Clapeyron equation for the slope of the coexistence curve
$$\left.\frac{dP}{dT}\right|_{\text{coexistence}} = \frac{S_I - S_{II}}{V_I - V_{II}}.$$
The difference in the numerator is propor-
tional to the latent heat of the transition, T ∆S =
T (SI − SII ). If phases I and II are not somehow
topologically distinguished (for example, by a dif-
ferent symmetry-breaking pattern), then there can
be a critical endpoint of the line of first-order transitions, where ∆S → 0, ∆V → 0, at
some (Tc , Pc ).
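For scale, here is a back-of-envelope check of Clausius-Clapeyron (a sketch, not from the notes, using standard textbook values for water at its normal boiling point):

```python
# water at its normal boiling point (standard textbook values)
L_vap = 2.26e6   # latent heat of vaporization, J/kg
T_boil = 373.0   # K
dV = 1.67        # V_gas - V_liquid per kg of water, m^3/kg

# dP/dT = Delta S / Delta V, with Delta S = L / T
dPdT = (L_vap / T_boil) / dV
print(dPdT)      # ~3.6e3 Pa/K, consistent with steam tables
assert 3.0e3 < dPdT < 4.0e3
```

So the boiling temperature shifts by about 1 K per 3.6 kPa of pressure change, which is why altitude noticeably changes cooking times.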
The consequence of a first-order transition de-
pends on what is held fixed as the transition is
traversed. If we heat a fluid at constant pres-
sure P < Pc (for example atmospheric pressure),
starting from T < Tc (moving along the red verti-
cal line in the figure, and doing so slowly enough
that we stay in the equilibrium phase diagram
at every step) then first the fluid expands and
warms up. When it reaches the coexistence curve
Tcoexistence (P ), it starts to boil. While this hap-
pens, the energy goes into the latent heat convert-
ing I into II, and the temperature stays fixed: we
are sitting at the point (Tcoexistence (P ), P ) on the
coexistence curve in the $(P, T)$ phase diagram, while the fraction $x$ of the fluid which is gas grows:
$$V = x V_g + (1 - x)V_l,$$
where $x = x(t)$ is some protocol-dependent function. Although $V_l \neq V_g$, the volume of the fluid itself does not jump. How do I know this? Bear with me a moment; the proof is at Eq. (5.9).
If instead we compress the fluid at constant $T$, starting at $T > T_c$ in the gas phase:
$$-\frac{1}{V}\left.\frac{\partial V}{\partial P}\right|_T \equiv \kappa_T > 0$$
– a positive compressibility says that it fights back. It fights back until the volume reaches $V = V_g(T)$, which is when $P = P_{\text{coexistence}}(T)$, beyond which the fluid starts to condense.
What do these isothermal curves look like? Let $v = V/N = 1/\rho$ be the volume per particle. For an ideal gas, recall that $Pv = T$. This is correct in general at high temperature. For lower temperatures, van der Waals suggested some appealingly simple corrections which account for an interparticle interaction described by a potential like we discussed in §3.6:
• each particle wants some amount of personal space, and therefore excludes some
fixed volume b: v → v − b.
• the energy per particle is decreased by the long-range attractive part of the potential by an amount proportional to the density:
$$\frac{E}{N} \to \frac{E}{N} - a\rho \implies P = -\partial_V F \to P - \frac{a}{v^2}.$$
• It has a critical T = Tc below which there is a line of first order phase transitions.
The critical point appears when $P(v) = \text{const}$ goes from having one solution ($T > T_c$, like the ideal gas) to having three. When this happens, $\partial_v P = \partial_v^2 P = 0$, so that locally $P \sim (v_c - v)^3$ is cubic. In fact, for the vdW equation of state, this condition is exactly a cubic equation for $v$: $P_0 v^3 - v^2(bP_0 + T) + av - ab = 0$.
• (Relatedly), it has regions where $\kappa_T = -\frac{1}{V}\left.\frac{\partial V}{\partial P}\right|_T < 0$, which says that if you try to squeeze it, it doesn't fight back, but rather tries to help you squeeze it further. Creepy! (The same thing happened in our study of the Landau-Ginzburg free energy in §4.1, and this led to the picture of the swallowtail.)
• Note by the way that the vdW equation is a masterpiece of estimation: a, b can
be determined from high-temperature data and they give a (not bad) estimate
of the location of the critical point.
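These critical values can be verified directly. A sketch (mine, not from the notes; $a, b$ are arbitrary illustrative values): the standard vdW results $v_c = 3b$, $T_c = 8a/(27b)$, $P_c = a/(27b^2)$ make both $\partial_v P$ and $\partial_v^2 P$ vanish:

```python
a, b = 2.0, 0.5   # illustrative vdW parameters (any positive values work)

def P(v, T):
    # van der Waals equation of state, k_B = 1
    return T / (v - b) - a / v**2

vc, Tc, Pc = 3 * b, 8 * a / (27 * b), a / (27 * b**2)
assert abs(P(vc, Tc) - Pc) < 1e-12   # the critical isotherm passes through (vc, Pc)

# both dP/dv and d2P/dv2 vanish at the critical point (central differences)
eps = 1e-4
dP = (P(vc + eps, Tc) - P(vc - eps, Tc)) / (2 * eps)
d2P = (P(vc + eps, Tc) - 2 * P(vc, Tc) + P(vc - eps, Tc)) / eps**2
print(dP, d2P)   # both ~0: locally P - Pc ~ (vc - v)^3
assert abs(dP) < 1e-6 and abs(d2P) < 1e-4
```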
so the area under the $V(P)$ curve is zero (and is the change in the Gibbs free energy), along any path in equilibrium. This is true even for infinitesimal paths. Therefore, the actual equilibrium trajectory of the free energy is a straight line between $V_g$ and $V_l$. This is the Maxwell construction. It saves the convexity of the free energy.
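The equal-area condition can be implemented numerically. Here is a minimal sketch (mine, not from the notes), using the vdW equation of state in reduced units, $P = 8T/(3v-1) - 3/v^2$, for which the critical point sits at $v = T = P = 1$: pick $T < 1$ and solve for the pressure $P_\star$ at which $\int_{v_l}^{v_g}(P(v) - P_\star)\,dv = 0$.

```python
def P(v, T):
    # van der Waals EOS in reduced units: critical point at v = T = P = 1
    return 8 * T / (3 * v - 1) - 3 / v**2

def bisect(f, lo, hi, n=100):
    # simple bisection; assumes f changes sign on [lo, hi]
    for _ in range(n):
        mid = 0.5 * (lo + hi)
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

def outer_roots(Pstar, T):
    # liquid and gas volumes with P(v) = Pstar, bracketing the unstable region
    vl = bisect(lambda v: P(v, T) - Pstar, 0.40, 0.71)
    vg = bisect(lambda v: P(v, T) - Pstar, 1.50, 30.0)
    return vl, vg

def area(Pstar, T, steps=5000):
    # equal-area condition: integral of (P - Pstar) between the outer roots
    vl, vg = outer_roots(Pstar, T)
    dv = (vg - vl) / steps
    return sum((P(vl + (i + 0.5) * dv, T) - Pstar) * dv for i in range(steps))

T = 0.9
Pstar = bisect(lambda p: area(p, T), 0.45, 0.70, n=40)
print(Pstar, outer_roots(Pstar, T))  # coexistence pressure ~0.65 in reduced units
```

The brackets for the root-finders are chosen by hand for $T = 0.9$ (they must separate the liquid branch, the unstable region, and the gas branch); a production version would locate the spinodals first.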
The creepy self-squeezing regions of the equation-of-state curve are
exactly the ones which are removed by the phase-coexistence region.
At left here, I’ve made some pictures where a decreasing fraction
of the dots are colored red, in an attempt to depict the history of the
volume fraction of one phase in the other as the coexistence region is
traversed. What’s wrong with this picture? How could you make it
more realistic?
Notice that we are making a strong demand of equilibrium here, ef-
fectively taking t → ∞ before N → ∞. This failure of commutativity of
these limits is the same issue as in our discussion of ergodicity-breaking
above.
6 Field Theory
Now we are going to try to see where Landau and Ginzburg could have gone wrong
near the critical point.
Here is a hint, from experiment. The hard thing about the critical point, which
mean field theory misses, is that fluctuations at all scales are important. I know this
because I’ve seen it, e.g. here and (with better soundtrack) here. Critical opalescence
is a phenomenon whereby a two-fluid mixture which is otherwise transparent becomes
opaque at a continuous phase transition. (The difference in densities of the two fluids
plays the role of the order parameter.) It is explained by the scattering of light by the
density fluctuations at all scales, at least at all the wavelengths in the visible spectrum.
These are the fluctuations we’re leaving out in mean field theory.
At this point I want to remind you about the derivation of field theory that you
made for homework 5. There, you studied the Legendre transform of the free energy
$F[h]$ at fixed field:
$$S[m] = F[h] - \sum_r m_r h_r\,\Big|_{m = +\partial_h F}.$$
It’s easy to get confused about Legendre transforms and all that stuff, so it’s very
helpful to appeal to a simpler narrative of the origin of field theory, by exploiting
universality. Recall at the beginning of our discussion of Ising models in §3, I mentioned
the many avatars of the Ising model. One I mentioned arose by considering a real-valued
variable φx at each point in space (or on some lattice).
That is: suppose we replace each spin sx by such a real variable, a factor in whose
probability distribution is
p0 (φx ) ∝ e−βV (φx ) (6.1)
where V (φ) ∼ g(φ2 − 1)2 for large g. This probability distribution is basically zero
unless φ = ±1, so this is no change at all if g is big enough. Important piece of
foreshadowing: we are going to see that a large g at the lattice scale is not at all the
same as a large gφ4 term in the coarse-grained action.
So we replace
$$\sum_s (\cdots) \equiv \prod_x \sum_{s_x = \pm 1} (\cdots) \;\rightsquigarrow\; \prod_x \int d\phi_x\, p_0(\phi_x)(\cdots) \equiv \int D\phi\; e^{-\beta\sum_x V(\phi(x))}(\cdots)$$
The nearest-neighbor ferromagnetic Ising Hamiltonian becomes (up to an additive constant, using $s^2 = 1$)
$$-J\sum_x\sum_{\mu=1}^d\left(s_{x+\hat\mu}s_x - 1\right) = \frac{J}{2}\underbrace{\sum_x}_{\simeq a^{-d}\int d^dx}\sum_{\mu=1}^d (s_{x+\hat\mu}-s_x)^2 \;\rightsquigarrow\; \frac{J}{2}\sum_x\sum_{\mu=1}^d \underbrace{(\phi_{x+\hat\mu}-\phi_x)^2}_{\simeq a^2(\partial_\mu\phi)^2}.$$
That is: the ferromagnetic coupling makes the nearby spins want to agree, so it adds
a term to the energy which grows when the nearby φx s disagree.
Altogether, we are going to replace the Ising partition function with
$$Z = \sum_s e^{-\beta H(s)} \;\rightsquigarrow\; \int[D\phi]\, e^{-\int d^dx\,\mathcal L(\phi)}$$
where (I am calling the LG free energy density L for ‘Landau’ or for ‘Lagrangian’.)
$$\mathcal L(\phi) = \frac{\kappa}{2}(\nabla\phi)^2 + \frac{r}{2}\phi^2 + \frac{g}{4!}\phi^4 + h\phi + \cdots$$
Our hope is that the operation does not take us out of the basin of attraction of the
Ising critical point. The constants κ, r, g are related in some way (roughly determinable
but not the point here) to the microscopic parameters. For some physical situations
(such as high energy particle physics!) this is a better starting point than the lattice
model. There is some coarse-graining involved in the operation, and therefore the
dependence of κ, r, g on β needn’t be linear, but it should be analytic. After all, the
miraculous phenomenon we are trying to understand is how physics can be non-analytic
in T at some finite value of T ; we don’t want to assume the answer.
Mean field theory arises by making a saddle point approximation: find the $m$ which minimizes $S[\phi]$, $0 = \frac{\delta S}{\delta\phi}\big|_{\phi=m}$, and make a (functional) Taylor expansion of the exponent about the minimum:
$$Z = \int[D\phi]\,e^{-S[\phi = m+\varphi]} = \int[D\varphi]\,\exp\left(-\left(S[m] + \frac{\delta S}{\delta\phi_x}\Big|_{\phi=m}\varphi_x + \frac12\,\frac{\delta^2 S}{\delta\phi_x\delta\phi_y}\Big|_{\phi=m}\varphi_x\varphi_y + \cdots\right)\right) \tag{6.3}$$
In the second line I used the fact that the change of variables φ = m + ϕ has unit
Jacobian. I also used a matrix notation, where the position indices x, y are repeated
indices, and hence are summed. The saddle point condition means that the term in
the exponent linear in ϕx vanishes.
The mean field theory answer is just Z0 = e−S[m] . The first correction to mean field
theory comes by keeping the quadratic term and doing the gaussian integral:
$$Z_1 = Z_0\int[D\varphi]\, e^{-\frac12\int_x\int_y \varphi_x K_{xy}\varphi_y}$$
$$K_{xy} \equiv \frac{\delta^2 S}{\delta\phi_x\delta\phi_y}\Big|_{\phi=m} = \left(r + \frac{g}{2}m^2 - \kappa\nabla^2\right)\delta^d(x-y).$$
I absorbed the constant C into the − log λ0 which we can choose to our advantage. So
the leading correction to the mean-field free energy gives
$$F^{(1)}[h] = F_{MF}[h] + \frac12\sum_\lambda \log\frac{\lambda}{\lambda_0}.$$
Who are the eigenvalues of the kinetic operator $K$? If $h$ and hence $m$ are constant, the problem is translation invariant, and they are plane waves, $u_q(x) = \frac{1}{\sqrt V}e^{i\vec q\cdot\vec x}$ – the eigenvalue equation (6.4) is
$$\int_y \delta(x-y)\left(r + \frac g2 m^2 - \nabla^2\right)u_q(y) = \underbrace{\left(r + \frac g2 m^2 + q^2\right)}_{=\lambda_q} u_q(x).$$
So
$$F^{(1)}[h] = F_{MF}[h] + \frac V2\int \bar d^d q\,\log\frac{r + \frac g2 m^2(h) + q^2}{r + q^2}$$
where I made a choice of λ0 to be λ(m = 0).
Making the Legendre transform (a little tricky, and requiring us to ignore terms
of the same size as the corrections to the first order approximation), we have Γ[m] =
V γ(m) with the answer to this order
$$\gamma^{(1)} = \frac12 r m^2 + \frac{g}{4!}m^4 + \frac12\int\bar d^dq\,\log\frac{r + \frac g2 m^2 + q^2}{r + q^2}. \tag{6.5}$$
Shift of critical point, Ginzburg criterion revisited. So what? First let’s
use this to recover the Ginzburg criterion. The susceptibility, at h = 0, for T > Tc is
χ = ∂h m|h=0 which (as you’ll verify on the homework) is related to the curvature of
the effective potential γ by
$$\frac{1}{\chi}\Big|_{m=0} = \partial_m^2\gamma\big|_{m=0} = r + \frac g2\int\bar d^dq\,\frac{1}{q^2+r}.$$
The phase transition happens when the correlation length goes to infinity; we showed
by the susceptibility sum rule (4.9) that ξ → ∞ is required by χ → ∞. So, while
in mean field theory the critical point occurs when r → 0, the fluctuation corrections
we’ve just derived shift the location of the critical point to
$$0 \stackrel{!}{=} \chi^{-1}(T_c) = r(T_c) + \frac g2\int\bar d^dq\,\frac{1}{q^2 + r(T_c)}.$$
You’ll show on the homework that we can eliminate the (annoying, non-universal any-
way) parameter r from the discussion and relate the susceptibility near the transition
to the non-dimensionalized temperature t = T −T Tc
c
:
Z
1 g d 1
= c1 t 1 − d̄ q 2 2 .
χ 4 q (q + r)
for some constant c1 . Everywhere here we are ignoring terms which are as small as
the corrections to the gaussian approximation. Since if g were zero, the integral would
be exactly gaussian (ignoring even higher order terms like φ6 for now), the corrections
must come with powers of g.
When is the correction to MFT actually small? The shift in the critical point is of order $g\,G(0) = g\int\bar d^dq\,\frac{1}{q^2(q^2+t)} + \text{const}$, which is the same quantity we found in our
6.2 Momentum shells
So the analog of the partition function after a single blocking step is the following:
Break up the configurations into pieces:
$$\phi(x) = \int\bar d^dk\, e^{ikx}\phi_k \equiv \phi_< + \phi_>.$$
Here φ< has nonzero fourier components only for |k| ≤ Λ/b for some b > 1 and φ> has
nonzero fourier components only in the shell Λ/b ≤ |k| ≤ Λ. These two parts of the
field could be called respectively ‘slow’ and ‘fast’, or ‘light’ and ‘heavy’, or ‘smooth’
and ‘wiggly’. We want to do the integral over the heavy/wiggly/fast modes to develop
an effective action for the light/smooth/slow modes:
$$e^{-S_{\rm eff}[\phi_<]} \equiv e^{-\int d^dx\,\mathcal L(\phi_<)}\int[D\phi_>]\,e^{-\int d^dx\,\mathcal L_1(\phi_<,\phi_>)}, \qquad Z_\Lambda = \int_{\Lambda/b}[D\phi_<]\,e^{-S_{\rm eff}[\phi_<]}$$
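The split is just a sharp filter in Fourier space. A one-dimensional numerical sketch (assuming numpy; not from the notes): the decomposition is exact, and the slow and fast pieces are orthogonal because their supports in $k$-space are disjoint.

```python
import numpy as np

N, b = 128, 2.0
rng = np.random.default_rng(0)
phi = rng.normal(size=N)            # a field configuration on N lattice sites
phik = np.fft.fft(phi)
k = np.fft.fftfreq(N)               # mode frequencies; Lambda = max |k| = 1/2
Lam = np.abs(k).max()

slow = np.abs(k) <= Lam / b         # 'smooth' modes: |k| <= Lambda / b
phi_slow = np.fft.ifft(np.where(slow, phik, 0)).real
phi_fast = np.fft.ifft(np.where(~slow, phik, 0)).real

# the decomposition phi = phi_< + phi_> is exact ...
assert np.allclose(phi, phi_slow + phi_fast)
# ... and the two pieces are orthogonal (disjoint supports in k-space)
assert abs(np.sum(phi_slow * phi_fast)) < 1e-8
```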
6.3 Gaussian fixed point
In the special case where the action is quadratic in φ, not only can we do the integrals,
but the quadratic action is form-invariant under our coarse-graining procedure.
Consider
$$S_0[\phi] = \int d^dx\,\frac12\,\phi(x)\left(r_0 - r_2\partial^2\right)\phi(x) = \int_0^\Lambda\bar d^dk\,\frac12\,\phi(k)\phi(-k)\left(r_0 + r_2k^2\right).$$
The coefficient r2 of the kinetic term (I called it κ earlier) is a book-keeping device that
we may set to 1 by rescaling the field variable φ if we choose. Why set this particular
coefficient to one? One good reason is that then our coarse-graining scheme will map
Ising models to Ising models, in the sense that the kinetic term is the continuum
P
representation of the near-neighbor Ising interaction J hiji si sj .
We can add a source $\sum_q^\Lambda h_q\phi_{-q}$ to compute
$$Z[h] = \left\langle e^{-\sum_q^\Lambda h_q\phi_{-q}}\right\rangle = Z[0]\,e^{\frac12\sum_q\frac{h_qh_{-q}}{q^2+r}}$$
and
$$\langle\phi_q\phi_{q'}\rangle = \frac1Z\frac{\partial}{\partial h_{-q}}\frac{\partial}{\partial h_{-q'}}Z[h]\Big|_{h=0} = \frac{1}{q^2+r}\,\delta_{q+q'} = G(q)\,\delta_{q+q'}.$$
We can relate the parameter r to a physical quantity by our friend the susceptibility
sum rule:
$$\chi \stackrel{\rm Gaussian}{=} \int d^dx\,G(x) = G(q=0) = \frac1r.$$
Here’s what I mean by form-invariant: because S0 does not mix modes of different
wavenumber, the integrals over the fast and slow modes simply factorize:
$$e^{-S_{\rm eff}[\phi_<]} = \int[D\phi_>]\,e^{-S_0[\phi_>]-S_0[\phi_<]} = Z_>\,e^{-S_0[\phi_<]}$$
– the effective action for the slow modes doesn’t change at all, except that the cutoff
changes by $\Lambda \to \Lambda/b$. To make the two systems comparable, we do a change of rulers:
$$\Lambda' \equiv b\Lambda, \qquad \phi'_q \equiv b^{\frac{d-2}{2}}\phi_{bq}$$
so that
$$S_{\rm eff} = \int_0^\Lambda \bar d^dq\,\frac12\,\phi'_q\phi'_{-q}\left(q^2 + r'\right)$$
where $r' = b^2 r$.
What we just showed is that this RG we've constructed maps the quadratic action to itself. There are two fixed points, $r_\star = \infty$ and $r_\star = 0$. The former is the high-temperature disordered state. Near the latter fixed point, the parameter $r$ is relevant and grows as we iterate the RG. No other terms (besides a constant) are generated. We could say there is another fixed point at $r_\star = -\infty$, which could describe the ordered phase, but with $g = 0$, the integral is not well-defined for $r < 0$.
This is the same calculation we did of the random walk, the very first calculation
we did, with a lot more labels! The linear term in φ (the external magnetic field here)
would be relevant, just like the bias term in the random walk that we introduced in
§2.1. It is forbidden by the Ising symmetry.
Following the general RG strategy, once we find a fixed point, we must study the
neighborhood of the fixed point.
Just as with the spin sums, the integrals are hard to actually do, except in a gaussian
theory. But again we don’t need to do them to understand the form of the result. We
use it to make an RG. As usual there are two steps: coarse-graining and rescaling.
First give it a name:
$$e^{-\int d^dx\,\delta\mathcal L(\phi_<)} \equiv \int[D\phi_>]\,e^{-\int d^dx\,\mathcal L_1(\phi_<,\phi_>)} \tag{6.7}$$
where we include all possible terms consistent with the symmetries ($\phi_\lessgtr \to -\phi_\lessgtr$, $h \to -h$, rotation invariance^{27}). Then we can find an explicit expression for $\mathcal L_1$:
$$\int d^dx\,\mathcal L_1(\phi_<,\phi_>) = \int d^dx\left(\frac12\kappa(\partial\phi_>)^2 + \frac12 m^2(\phi_>)^2 + g_4(\phi_>)^3\phi_< + \ldots\right)$$
27
Why impose rotation invariance here? For now, it’s for simplicity. But (preview) we will see
that the fixed points we find are stable to rotation-symmetry breaking perturbations. It's an emergent
symmetry.
(I write the integral so that I can ignore terms that integrate to zero, such as ∂φ< ∂φ> .)
This is the action for a scalar field φ> interacting with itself and with a (slowly-varying)
background field $\phi_<$. But what can the result $\delta\mathcal L$ of integrating out $\phi_>$ be but something of the form (6.9) again, with different coefficients?^{28} The result is to shift the couplings
gn → gn + δgn . (This includes the coefficient of the kinetic term and also of the higher-
derivative terms which are hidden in the ... in (6.9). You will see in a moment the logic
behind which terms I hid.)
Finally, so that we can compare steps of the procedure to each other, we rescale our rulers. We'd like to change units so that $\int^{\Lambda/b}$ is an $\int^\Lambda$ with different couplings; we accomplish this by changing variables: $k' = bk$, so now $|k'| < \Lambda$. So $x' = x/b$, $\partial' \equiv \partial/\partial x' = b\,\partial_x$, and the Fourier kernel is preserved: $e^{ikx} = e^{ik'x'}$. Plug this into the action^{29}:
$$S_{\rm eff}[\phi_<] = \int d^dx\,\left(\mathcal L(\phi_<) + \delta\mathcal L(\phi_<)\right) = \int d^dx'\,b^d\left(\frac12 b^{-2}(\partial'\phi_<)^2 + \sum_n\,(g_n + \delta g_n)\,(\phi_<)^n + \ldots\right)$$
We can make this look like $\mathcal L$ again (with $r_2 = 1$) by rescaling the field variable: $b^{d-2}(\partial'\phi_<)^2 \equiv (\partial'\phi')^2$ (i.e. $\phi' \equiv b^{\frac12(d-2)}\phi_<$):
$$S_{\rm eff}[\phi'] = \int d^dx'\left(\frac12(\partial'\phi')^2 + \sum_n\,(g_n+\delta g_n)\,b^{d - \frac{n(d-2)}{2}}(\phi')^n + \ldots\right)$$
where the power of $b$ multiplying the $n$th coupling is $b^{\Delta_n}$ with
$$\Delta_n \equiv \frac{n(2-d)}{2} + d.$$
Ignore the interaction corrections, δgn , for a moment. Then we can keep doing this
and take b → ∞ to reach macroscopic scales. Then, as b grows, the couplings with
∆n < 0 get smaller and smaller as we integrate out more shells. If we are interested
in only the longest-wavelength modes, we can ignore these terms. They are irrelevant.
Couplings (‘operators’) with ∆n > 0 get bigger and are relevant.
The ‘mass term’ $r\phi^2$ has $n = 2$, and $r' = b^2 r$ is always relevant for any $d < \infty$.
28
Again we apply the Landau-Ginzburg-Wilson logic. The idea is the same as in our discussion of
blocking for the Ising model. The result is local in space because the interactions between the slow
modes mediated by the fast modes have a range of order b/Λ. The result is analytic in φ< at small
φ< and there is no symmetry-breaking because we only integrate the short-wavelength modes.
^{29} Really, the coefficient of $(\partial'\phi_<)^2$ should be $b^{-2}(1+\delta\kappa)$. But $\delta\kappa$ turns out to be $O(g^2)$, so let's ignore it for now.
This counting is the same as dimensional analysis: demand that βH is dimension-
less, and demand that the kinetic term (∂φ)2 stays fixed. Naive (length) dimensions:
– couplings with negative length dimension are relevant. This result is the same as
engineering dimensional analysis because we’ve left out the interaction terms. This is
actually correct when gn = 0, n ≥ 3, which is the gaussian fixed point.
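The counting can be tabulated in a few lines (a quick sketch, not from the notes):

```python
def Delta(n, d):
    # Gaussian scaling dimension of the coupling g_n multiplying phi^n
    return n * (2 - d) / 2 + d

for d in (3, 4, 5):
    print(d, [(n, Delta(n, d)) for n in (2, 4, 6)])

assert all(Delta(2, d) == 2 for d in (2, 3, 4, 5))  # the mass term: always relevant
assert Delta(4, 3) == 1    # phi^4 relevant below d = 4 ...
assert Delta(4, 4) == 0    # ... marginal at d = 4 ...
assert Delta(4, 5) == -1   # ... irrelevant above
assert Delta(6, 3) == 0    # phi^6 is marginal in d = 3
```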
An important conclusion from this discussion is that there is only a finite number
of marginal and relevant couplings that we must include to parametrize the physics.
Further, if the interactions produce small corrections, they will not change a very
irrelevant operator to a relevant operator. This should mitigate some of the terror you
felt when we introduced the horrible infinite-dimensional space of hamiltonians M at
the beginning of the course.
Another important conclusion is that the gaussian Ising critical point is stable to
interactions in d > 4. It is of course unstable in the sense that rφ2 is relevant. And it is
unstable if we allow terms with odd powers of φ which break the Ising symmetry. But
what is the smallest-dimension operator which we haven’t added and which respects
the Ising symmetry? According to our Gaussian counting, each derivative counts for
+1, and each power of φ counts for 2−d 2
. If we demand rotation invariance (or even
just parity) so we can’t have a single derivative, the next most important perturbation
is g4 φ4 . Its dimension is ∆4 = 4 − d – it is irrelevant if d > 4 and relevant if d < 4. We
could have expected this, since it coincides with the breakdown of mean field theory
– above the upper critical dimension, the interactions are irrelevant and MFT gives a
correct accounting of the fixed point. In d = 4, the φ4 term is marginal, and it is an
opportunity for small interaction corrections to decide its fate.
[End of Lecture 12]
6.5 Field theory without Feynman diagrams
$$\gamma^{(1)}[m,b] = \frac12 r_0m^2 + \frac{g_0}{4!}m^4 + \frac12\int_{\Lambda/b}^{\Lambda}\bar d^dq\,\log\frac{r_0 + \frac{g_0}{2}m^2 + q^2}{r_0 + q^2} \tag{6.10}$$
$$\equiv \frac12 r(b)\,m^2 + \frac{g(b)}{4!}m^4 + \ldots$$
I also added some subscripts on the couplings to emphasize that r0 , g0 are parameters
in some particular zeroth-order accounting we are making of the physics, not some holy
symbols whose values we can measure. In the last line, we’ve defined running couplings
r(b), g(b).
From this expression we can read off
$$r(b) = r_0 + \frac{g_0}{2}\int_{\Lambda/b}^{\Lambda}\frac{\bar d^dq}{q^2+r}.$$
A slightly more useful parameter is the deviation from the critical coupling. The critical point occurs when $\chi^{-1} = \partial_m^2\gamma\big|_{m=0} \to 0$, which happens when $r_0$ is
$$r_0^c = -\frac{g_0}{2}\int\frac{\bar d^dq}{q^2} + O(g_0^2).$$
On the RHS here, we ignored the r in the denominator because it is O(g). This gives
the deviation in temperature from the critical point, by subtracting the previous two
displayed equations:
$$t(b) \equiv r_0 - r_0^c = r_0\Bigg(1 - \frac{g_0}{2}\underbrace{\int\frac{\bar d^dq}{q^2(q^2+r)}}_{\equiv I_d(r,b)} + O(g_0^2)\Bigg).$$
(Note that t = t(b) is merely a convenient relabelling of the coordinate r0 ; the relation
between them is analytic and t depends on our zoom factor b.)
Now we must study the integral I. We’ve observed that Id (r, b → ∞) blows up (by
taking b → ∞ we include all the fluctuations) when r → 0 for d ≤ 4. Let’s start at
$d = 4$, where
$$I_4(r,b) = K_4\int_{\Lambda/b}^{\Lambda}\frac{q^3\,dq}{q^2(q^2+r)} = \frac{K_4}{2}\log\frac{\Lambda^2+r}{\Lambda^2/b^2+r} \;\simeq\; K_4\log b \quad \left(r \ll \Lambda^2/b^2\right). \tag{6.11}$$
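A quick numerical check of this logarithm (a sketch, not from the notes; units with $\Lambda = K_4 = 1$):

```python
import math

def I4(r, b, steps=200_000):
    # midpoint rule for the shell integral, with Lambda = K4 = 1
    lo, hi = 1.0 / b, 1.0
    dq = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        q = lo + (i + 0.5) * dq
        total += q**3 / (q**2 * (q**2 + r)) * dq
    return total

for b in (10.0, 100.0):
    print(b, I4(1e-12, b), math.log(b))   # at r -> 0, I4 -> log b
    assert abs(I4(1e-12, b) - math.log(b)) < 1e-4
# a nonzero r cuts the logarithm off
assert I4(0.1, 100.0) < I4(1e-12, 100.0)
```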
where $t_0 \equiv t(b = 1)$. If there exists a fixed point, $g = g_\star$ with $\kappa(g_\star) \neq 0$, then its contribution to the exponent (the upper limit dominates) is
$$-\int_1^b\frac{d\mu}{\mu}\,\kappa(g(\mu)) \xrightarrow{b\to\infty} -\kappa(g_\star)\underbrace{\int_1^b\frac{d\mu}{\mu}}_{=\log b}.$$
Hence, in this case
$$t(b) = t_0\,b^{-\kappa(g_\star)} \tag{6.17}$$
– that is κ(g? ) determines the critical exponent with which the IR value of t(b) diverges.
Why do we care about the IR value of t(b)? It determines the correlation length! We’ll
come back to this.
What is the solution of the beta function equation for the coupling in $d = 4$? To save writing, let's redefine $\tilde g_0 \equiv K_4 g_0$ and drop the tilde. The equation is
$$-b\,\partial_b\,g = \frac32 g^2 + O(g^3)$$
which is solved by
$$g(b) = \frac{2g_0}{2 + 3g_0\log b} \stackrel{b\gg1}{\longrightarrow} \frac{2}{3\log b} \xrightarrow{b\to\infty} 0. \tag{6.18}$$
There is an attractive IR fixed point at $g = 0$. This is part of the way towards justifying
my claim that perturbation theory would be useful to study the long-wavelength physics
in this problem.
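A quick consistency check of (6.18) (a sketch, not from the notes): integrate the flow $dg/ds = -\frac32 g^2$ (with $s = \log b$) by small Euler steps and compare with the closed form.

```python
import math

def g_exact(g0, b):
    # closed-form solution (6.18)
    return 2 * g0 / (2 + 3 * g0 * math.log(b))

# Euler-integrate dg/ds = -(3/2) g^2 with s = log b
g0, b = 0.5, 100.0
g, s, ds = g0, 0.0, 1e-5
while s < math.log(b):
    g -= 1.5 * g * g * ds
    s += ds
print(g, g_exact(g0, b))   # the two agree to roughly 1e-4
assert abs(g - g_exact(g0, b)) < 1e-3
```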
In the case of $d = 4$, then, the interesting physics comes from the slow approach to the free theory in the IR. To get something interesting we must include the flow, for example in the solution for $t$, Eq. (6.16): since the flow of $g_0$ (6.18) never stops, we can parametrize the flow by $g_0$ and use the chain rule to write $\frac{d\mu}{\mu} = \frac{dg_0}{\beta(g_0)}$, so that
$$\int_1^b\frac{d\mu}{\mu}\,\kappa(g_0(\mu)) = \int_{g_0}^{g_0(b)} dg\,\underbrace{\frac{\kappa(g)}{\beta(g)}}_{=\frac{1}{3g}(1+O(g))} \;\stackrel{b\gg1}{\simeq}\; \frac13\log\frac{g(b)}{g_0}$$
– this is what I called the Callan-Symanzik equation during the random-walk discussion
§1.3. Second, we use ordinary engineering dimensional analysis:
This implies that the RHS of (6.20) is
When does the zoom factor hit the sweet spot (6.22)? The answer is different in $d = 4$ and $d < 4$. Using (6.19), this happens when
$$\Lambda/b_\star = \sqrt{t(b_\star)} = \sqrt{t_0}\,(\log b_\star)^{-1/6} \quad\leftrightarrow\quad (b_\star)^{-2}(\log b_\star)^{1/3} = \frac{t_0}{\Lambda^2},$$
which we can solve for $b_\star$ in the limit $t \ll \Lambda^2$ (closer to the critical point than the lattice scale):
$$(b_\star)^{-2} \simeq \frac{t}{\Lambda^2}\,\frac{1}{\left(\log(t/\Lambda^2)\right)^{1/3}}. \tag{6.23}$$
Putting this information back into the Callan-Symanzik equation for the suscepti-
bility (6.20), we have
A comment on active versus passive RG. I’ve presented the condensed-matter
perspective on the RG here: there is a fixed, real cutoff, and the couplings run as we
integrate out longer and longer wavelength modes, i.e. vary the resolution with which
we look at the degrees of freedom.
Another perspective (which leads to the same conclusions!), taken by high-energy
physicists, is that the cutoff Λ is an artificial device. We should be able to vary
this cutoff without changing the physics, at the cost of changing the values of the
couplings at the cutoff. That is we regard the couplings at the cutoff (what I called
r0 , g0 above, the ones appearing in the Lagrangian) as depending on the cutoff Λ. To
make this precise, we must ask how the couplings in γ(Λ) need to depend on Λ to keep
the physics from depending on this fictional division we are making between UV and
IR. We can think about the RG transformation as replacing the cutoff Λ with a new
(smaller) cutoff Λ/b.
Something we can measure, and which should not depend on our accounting pa-
rameter b, is the susceptibility (for T > Tc ):
$$r \equiv \chi^{-1} = \partial_m^2\gamma\big|_{m=0} = r_0 + \frac{g_0}{2}\int^{\Lambda/b}\frac{\bar d^dq}{q^2+r_0}.$$
(Such an equation, relating a physical quantity like χ to something we can compute in
terms of the running couplings gn (b), is sometimes called a renormalization condition.)
We can invert this equation to figure out r0 (b):
$$r_0 = r - \frac{g_0}{2}\int^{\Lambda/b}\frac{\bar d^dq}{q^2+r} + O(g_0^2).$$
Again we subtract the critical value of r0 to get
$$t_0 \equiv r_0 - r_0^c = r\left(1 + \frac{g_0}{2}\int\frac{\bar d^dq}{q^2(q^2+r)}\right) + O(g_0^2).$$
Near $d = 4$, this is
$$t_0 = r\left(1 + \frac{g_0}{2}K_4\log\frac{\Lambda/b}{\sqrt r}\right) = r\left(1 - \frac{g_0}{2}K_4\log b\right) + \cdots$$
(where the ellipsis is independent of b).
Another quantity we can imagine measuring is the coupling $g$, a non-linear susceptibility:
$$g \equiv \partial_m^4\gamma\big|_{m=0} = g_0 - \frac{3g_0^2}{2}\int^{\Lambda/b}\frac{\bar d^dq}{(q^2+r_0)^2} + O(g_0^3)$$
– notice that this is the same equation as (6.12), but the BHS is interpreted differently:
now the LHS is a physical, fixed, measurable thing, and g0 is a fake thing that depends
on the artificial parameter b. We can invert this equation to find g0 in terms of g and
b:
$$g_0(b) = g + \frac{3g^2}{2}\int^{\Lambda/b}\frac{\bar d^dq}{(q^2+r_0)^2} + O(g^3)$$
(where we studiously neglect higher order things). Near $d = 4$ this is
$$g_0(b) \stackrel{d\to4}{\simeq} g + \frac{3g^2}{2}K_4\log\frac{\Lambda}{b\sqrt r} + O(g^3) = g - \frac{3g^2}{2}K_4\log b + O(g^3) \tag{6.25}$$
2 2 b r 2
(where the ellipsis is independent of b). This reproduces the same beta functions as
above.
Two important generalizations. Now we make two easy but crucial generalizations of the $d = 4$ Ising calculation we've just done: namely $\mathbb Z_2 \to O(n)$ and $d \to 4 - \epsilon$.
$O(n)$: [Goldenfeld, §11.1] By the LG logic, an $O(n)$-invariant and translation-invariant free energy at fixed magnetization $m^a$ must look like
$$S[\phi^a] = \int d^dx\left(\frac12\vec\nabla\phi^a\cdot\vec\nabla\phi^a + \frac12 r_0\,\phi^a\phi^a + \frac{g_0}{4!}(\phi^a\phi^a)^2\right)$$
For n > 1, in expanding about the mean field configuration φa = ma +ϕa , we encounter
a distinction between the one (longitudinal) fluctuation in the direction of $m^a \equiv m\,\hat e_0^a$ and the $n-1$ transverse fluctuations. The quadratic part of this action comes from the kernel
$$K^{ab}_{xy} = \frac{\delta^2 S}{\delta\phi^a_x\,\delta\phi^b_y}\Big|_{\phi=m} = \left[\left(-\nabla^2 + r_0 + \frac{g_0}{6}m^2\right)\delta^{ab} + \frac{g_0}{3}m^am^b\right]\delta_{xy}.$$
This matrix is made of one copy of the n = 1 Ising case with coefficient of m2 equal to
g0 /2, and n − 1 degenerate copies of the same thing with g0 /6. So the sum of the logs
of the eigenvalues is
$$\mathrm{tr}_{x,a}\log K = V\int\bar d^dq\left[\log\left(r_0 + \frac{g_0}{2}m^2 + q^2\right) + (n-1)\log\left(r_0 + \frac{g_0}{6}m^2 + q^2\right)\right] + \text{const}.$$
$$g = g_0 - \left(\frac{3}{2} + \frac{n-1}{6}\right)g_0^2\int\frac{\bar d^dq}{(q^2+r)^2} + O(g_0^3) \;\stackrel{d\to4}{\simeq}\; g_0 - \left(\frac{3}{2} + \frac{n-1}{6}\right)g_0^2\,K_4\log b$$
Anticipating the result a bit, we are going to treat $g_0$ and $\epsilon$ as being of the same order in our expansion, so $O(g_0) = O(\epsilon)$ and $O(g_0^2) = O(g_0\epsilon)$ et cetera. Thinking of $\epsilon$ as small, then, the only change in (6.12) is
$$\partial_m^4\gamma(b)\big|_{m=0} = \Lambda^{\epsilon}\left(g_0 - b_0\,g_0^2\int\frac{\bar d^dq}{(q^2+r)^2}\right) \tag{6.26}$$
where $b_0 \equiv \frac{3}{2} + \frac{n-1}{6}$.
The coefficient of $\phi^4$ in the effective action at scale $\Lambda/b$ is
$$(\Lambda/b)^{\epsilon}\,g(b) \equiv \partial_m^4\gamma(b)\big|_{m=0} = \Lambda^{\epsilon}\left(g_0 - b_0K_d\,\Lambda^{-\epsilon}g_0^2\log b + O(g_0^2)\right).$$
Here comes the magic: the key fact is roughly that "$\Lambda^\epsilon = 1 + \epsilon\log\Lambda + O(\epsilon^2)$"; I put that in quotes because it is distasteful to take the log of a dimensionful quantity. Systematically ignoring things that can be ignored (including the $\Lambda^{-\epsilon}$ which is needed in the previous equation for dimensions to work), this is:
(Again we absorb the factors of $K_d$ into $g, g_0$.) The crucial extra term, proportional to $\epsilon g_0$, comes from the engineering dimensions of $g_0$.
Where are the fixed points? There is still one at $g_0 = 0$, our old friend the Gaussian fixed point. But there is another, at
$$g_\star = \frac{\epsilon}{b_0} + O(\epsilon^2) = \frac{6\epsilon}{n+8} + O(\epsilon^2).$$
This is the Wilson-Fisher fixed point (really one for every $n$ and $d \lesssim 4$). As was foretold, $g_\star$ is of order $\epsilon$.
The WF fixed point and the Gaussian critical point exchange roles as we decrease
d through four. For d > 4, the Gaussian critical point is IR attractive and governs
the critical behavior at long wavelengths: MFT is right. At d = 4, they collide and
this produces the weird logarithms in the approach to g = 0 that we saw above. For
d < 4, the Gaussian fixed point is unstable to the interaction term: the g0 φ4 term is a
relevant perturbation, since g0 grows as we zoom out.
– the correlation length is a length and so zooms like a length. From this, we deduce
that (the RHS of (6.27) is )
Now we can choose a convenient zoom factor, $b$. Again, we choose $b = b_\star$ so that the arguments of the logs are all 1 and they go away:
$$\frac{t(b_\star)}{(\Lambda/b_\star)^2} = 1. \tag{6.28}$$
If $b_\star \to \infty$, then $g(b_\star) \to g_\star$, the IR fixed point value, where
$$t(b_\star) \stackrel{(6.17)}{\simeq} (b_\star)^{-\kappa(g_\star)}\,t,$$
which indeed blows up in the critical region $t \ll \Lambda^2$ – that is: this is an IR fixed point, a fixed point we reach by zooming out.
Therefore
$$\xi(t, g_0, \Lambda) = b_\star\,\xi\big(t(b_\star)(b_\star)^2 = \Lambda^2,\, g_\star,\, \Lambda\big) \sim \left(\frac{t}{\Lambda^2}\right)^{-\frac{1}{2-\kappa(g_\star)}} \equiv \left(\frac{t}{\Lambda^2}\right)^{-\nu} \tag{6.29}$$
Explicitly, $\kappa(g_0) = \frac{n+2}{6}g_0 + O(g_0^2)$ means $\kappa(g_\star) = \frac{n+2}{n+8}\epsilon + O(\epsilon^2)$, so that
$$\nu = \frac12 + \frac{n+2}{4(n+8)}\epsilon + O(\epsilon^2). \tag{6.30}$$
Notice that all the information about the short-distance stuff has dropped out of (6.29)
(except for the stuff hidden in the twiddle, i.e. the overall coefficient) – only the physics
at the fixed point matters for the exponent.
We can do remarkably well by setting $\epsilon = 1$ in (6.30) and comparing to numerical simulations in $d = 3$.
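Concretely (a sketch, not from the notes; the quoted value $\nu \approx 0.63$ for the $d = 3$ Ising class is the standard numerical result):

```python
def nu(n, eps=1.0):
    # first-order epsilon-expansion result (6.30)
    return 0.5 + (n + 2) / (4 * (n + 8)) * eps

for n in (1, 2, 3):
    print(n, round(nu(n), 4))
# n = 1: nu = 0.5833..., vs nu ~ 0.63 from d = 3 Ising numerics (mean field: 1/2)
assert abs(nu(1) - 7 / 12) < 1e-12
```

Half of the gap between mean field theory and the true exponent is already captured at first order in $\epsilon$.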
6.6 Perturbative momentum-shell RG
[Kardar, Fields, §5.5, 5.6] I will say a bit about how to develop this perturbative RG
more systematically. We’ll end up at the same place, but with more context. This
calculation is important enough that it’s worth doing many ways.
We'll do $n$-component fields, $\phi^a$, $a = 1..n$, with $O(n)$ symmetry, in $d = 4-\epsilon$ dimensions. Let's decompose the action as
S[φ] = S0 [φ] + U,
with $S_0$ the gaussian terms, as above. For $n$-component fields, the gaussian term looks like
$$S_0[\phi] = \int_0^\Lambda\bar d^dk\,\frac12\underbrace{\phi^a(k)\phi^a(-k)}_{\equiv|\phi|^2(k)}\left(r_0 + r_2k^2\right).$$
(If it is not diagonal, do a field redefinition to make it so.) We assume the model has an $O(n)$ symmetry which acts by $\phi^a \to R^a_b\phi^b$, with $R^tR = 1_{n\times n}$. The most relevant, symmetric interaction term (non-Gaussian perturbation) is the $\phi^4$ term
$$U = u_0\int d^dx\,\left(\phi^a(x)\phi^a(x)\right)^2 = u_0\int\prod_{i=1}^4\bar d^dk_i\sum_{a_{1,2,3,4}=1}^n\phi^{a_1}(k_1)\phi^{a_2}(k_2)\phi^{a_3}(k_3)\phi^{a_4}(k_4)\,\bar\delta\Big(\sum_ik_i\Big)\,\delta^{a_1a_2}\delta^{a_3a_4}.$$
The h...i0,> means averaging over the fast modes with their Gaussian measure, and Z0,>
is an irrelevant normalization factor, independent of the objects of our fascination, the
slow modes φ< .
The corrections to the effective action for φ< can be organized as a cumulant expansion:
$$\log\left\langle e^{-U}\right\rangle_{0,>} = \underbrace{-\left\langle U\right\rangle_{0,>}}_{1} + \underbrace{\frac12\left(\left\langle U^2\right\rangle_{0,>} - \left\langle U\right\rangle_{0,>}^2\right)}_{2} + O(U^3).$$
Let’s focus on the first-order term first:
$$1 = \left\langle U[\phi_<,\phi_>]\right\rangle_{0,>} = u_0\int\prod_{i=1}^4\bar d^dk_i\ \bar\delta\!\Big(\sum_i k_i\Big)\left\langle\prod_i\left(\phi_< + \phi_>\right)_i\right\rangle_{0,>}.$$
Expanding the product $(\phi_< + \phi_>)^4$ produces $2^4 = 16$ terms. It is useful to introduce a diagrammatic notation in which these 16 terms decompose as in Fig. 1.
We can compute the averages over the fast modes by doing Wick contractions. This is a fact about Gaussian integrals, which can be summarized by noting that
$$\left\langle e^{h_A\phi_A}\right\rangle_0 = e^{\frac12 h_A\left\langle\phi_A\phi_B\right\rangle_0 h_B}$$
where A is a multi-index over space and flavor labels and whatever else (to prove it, complete the square). Then expand both sides to learn that
$$\left\langle\phi_{A_1}\cdots\phi_{A_m}\right\rangle_0 = \begin{cases} 0, & \text{if } m \text{ is odd,}\\ \text{sum of all pairwise contractions}, & \text{if } m \text{ is even.}\end{cases}$$
By ‘pairwise contraction’ I just mean a way of replacing a pair of φs on the LHS with ⟨φ_A φ_B⟩. Each pairwise contraction is given by the ‘propagator’, which in our case is
$$\left\langle\phi^a_>(q_1)\,\phi^b_>(q_2)\right\rangle_{0,>} = \frac{\delta^{ab}\,\bar\delta(q_1+q_2)}{r_0 + q_1^2 r_2}.$$
In the figure, these are denoted by wiggly lines. The slow modes are denoted by straight
lines. The 4-point interaction is denoted by a dotted line connecting two pairs of lines
(straight or wiggly):
$$u_0\,\delta^{a_1a_2}\delta^{a_3a_4}\,\bar\delta\!\Big(\sum_i q_i\Big)\ =\ \text{(the dotted-line vertex of Fig.\ 1)}.$$
Although the four fields must be at the same point in space we separate the two pairs
whose flavor indices are contracted, so that we can follow the conserved flavor index
around the diagrams.
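The combinatorics behind Wick's theorem can be checked mechanically. A minimal sketch: enumerate all pairwise contractions of m copies of a single unit-variance Gaussian variable; each contraction contributes 1, so the m-th moment is just the number of pairings, (m − 1)!!:

```python
def pairings(idx):
    """Enumerate all ways to contract a list of field labels pairwise."""
    if not idx:
        yield []
        return
    first, rest = idx[0], idx[1:]
    for i in range(len(rest)):
        for tail in pairings(rest[:i] + rest[i+1:]):
            yield [(first, rest[i])] + tail

def moment(m):
    """<x^m> for a unit-variance Gaussian: each contraction contributes 1."""
    if m % 2:
        return 0
    return sum(1 for _ in pairings(list(range(m))))

assert moment(2) == 1    # <x^2> = 1
assert moment(4) == 3    # <x^4> = 3 = (4-1)!!
assert moment(6) == 15   # <x^6> = 15 = (6-1)!!
```

The same enumeration, decorated with momentum and flavor labels, is what the diagrams of Fig. 1 organize.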
Let’s analyze the results of the first order correction. The interesting terms are
$$1_3 = -u_0\ \underbrace{2}_{\text{symmetry}}\ \underbrace{n}_{=\delta^{aa}}\ \int_0^{\Lambda/b}\bar d^dk\,|\phi_<(k)|^2\int_{\Lambda/b}^{\Lambda}\bar d^dq\,\frac{1}{r_0+r_2q^2}$$
and
$$1_4 = \frac{4\cdot 1}{2\cdot n}\,1_3,$$
which has a bigger symmetry factor but no closed flavor index loop. The result through O(u) is then just what we found previously:
$$r_0 \to r_0 + \delta r_0 = r_0 + 4u_0(n+2)\int_{\Lambda/b}^{\Lambda}\bar d^dq\,\frac{1}{r_0+r_2q^2} + O(u_0^2)\,.$$
Figure 1: 1st order corrections from the quartic perturbation of the Gaussian fixed point of the O(N )
model. Naturally, wiggly lines denote propagation of fast modes φ> , straight lines denote (external)
slow modes φ< . A further refinement of the notation is that we split apart the 4-point vertex to
indicate how the flavor indices are contracted; the dotted line denotes a direction in which no flavor
flows, i.e. it represents a coupling between the two flavor singlets, φa φa and φb φb . The numbers at
left are multiplicities with which these diagrams appear. (The relative factor of 2 between 13 and 14
can be understood as arising from the fact that 13 has a symmetry which exchanges the fast lines but
not the slow lines, while 1₄ does not.) Notice that closed loops of the wiggly lines produce factors of n, since we must sum over which flavor is propagating in the loop – the flavor of a field running in a closed loop is not determined by the external lines, just like the momentum.
Next we rescale to restore the original action: we must choose $\zeta = b^{1+d/2}$ to keep $\tilde r_2 = r_2$ (the unfamiliar power is because $\phi(k) = \int d^dx\,\phi(x)e^{ikx}$ scales differently from φ(x)).
The second-order-in-$u_0$ terms are displayed in Fig. 2.
Figure 2: 2nd order corrections from the quartic perturbation of the Gaussian fixed point of the O(N )
model. The left column of diagrams are corrections to the quartic interaction, and the right column
correct quadratic terms. In fact the top right diagram is independent of the external momentum and
hence only corrects r0 ; the bottom right diagram (that looks like a sheep) also corrects the kinetic
term.
Notice that the diagram at top right has two closed flavor loops, and hence goes like n2 , and it
comes with two powers of u0 . You can convince yourself by drawing some diagrams that this pattern
continues at higher orders. If you wanted to define a model with large n you could therefore consider
taking a limit where n → ∞, u0 → 0, holding u0 n fixed. The quantity u0 n is often called the ’t Hooft
coupling.
The interesting part of the second-order correction to the quartic coupling involves the integral
$$f(k_1+k_2) = \int\bar d^dq\,\frac{1}{\left(r_0+r_2q^2\right)\left(r_0+r_2(k_1+k_2-q)^2\right)} \simeq \int\bar d^dq\,\frac{1}{\left(r_0+r_2q^2\right)^2}\left(1 + O(k_1+k_2)\right)$$
– the bits that depend on the external momenta give irrelevant derivative corrections, like $\phi_<^2\,\partial^2\phi_<^2$. We ignore them. This leaves behind just the correction to u we found before.
There are also two-loop corrections to the quadratic term (diagrams with two straight lines sticking out). Altogether, the full result through $O(u_0^2)$ is then the original action, with the parameter replacement
$$\begin{pmatrix} r_2\\ r_0\\ u_0\end{pmatrix} \mapsto \begin{pmatrix}\tilde r_2\\ \tilde r_0\\ \tilde u_0\end{pmatrix} = \begin{pmatrix} b^{-d-2}\,\zeta^2\left(r_2 + \delta r_2\right)\\ b^{-d}\,\zeta^2\left(r_0 + \delta r_0\right)\\ b^{-3d}\,\zeta^4\left(u_0 + \delta u_0\right)\end{pmatrix} + O(u_0^3).$$
The shifts are:
$$\delta r_2 = u_0^2\,\partial_k^2 A(0), \qquad \delta r_0 = 4u_0(n+2)\int_{\Lambda/b}^{\Lambda}\bar d^dq\,\frac{1}{r_0+r_2q^2} - A(0)\,u_0^2, \qquad \delta u_0 = -\frac12\,u_0^2\,(8n+64)\int_{\Lambda/b}^{\Lambda}\bar d^dq\,\frac{1}{\left(r_0+r_2q^2\right)^2}\,.$$
Keeping $\tilde r_2 = r_2$ fixes
$$\zeta^2 = \frac{b^{d+2}}{1 + u_0^2\,\partial_k^2 A(0)/r_2} = b^{d+2}\left(1 + O(u_0^2)\right).$$
Taking an infinitesimal step $b = e^{\ell} \simeq 1 + \delta\ell$ gives the flow equations
$$\begin{cases} \dfrac{dr_0}{d\ell} = 2r_0 + \dfrac{4(n+2)K_d\Lambda^d}{r_0+r_2\Lambda^2}\,u_0 - A\,u_0^2 + O(u_0^3)\\[3mm] \dfrac{du_0}{d\ell} = (4-d)\,u_0 - \dfrac{4(n+8)K_d\Lambda^d}{\left(r_0+r_2\Lambda^2\right)^2}\,u_0^2 + O(u_0^3)\end{cases} \tag{6.31}$$
To see how the previous thing arises, and how the integrals all went away, let’s consider just the $O(u_0)$ correction to the mass:
$$\tilde r_0 = r_0 + \frac{dr_0}{d\ell}\,\delta\ell = b^2\left(r_0 + 4u_0(n+2)\int_{\Lambda/b}^{\Lambda}\frac{\bar d^dq}{r_0+r_2q^2}\right) + O(u_0^2)$$
$$= (1+2\delta\ell)\left(r_0 + 4u_0(n+2)\,K_d\Lambda^d\,\delta\ell\,\frac{1}{r_0+r_2\Lambda^2}\right) + O(u_0^2)$$
$$= r_0 + \left(2r_0 + \frac{4u_0(n+2)}{r_0+r_2\Lambda^2}\,K_d\Lambda^d\right)\delta\ell + O(u_0^2). \tag{6.32}$$
Now we are home. (6.31) has two fixed points. One is the free fixed point at the origin where nothing happens. The other (Wilson-Fisher) fixed point is at
$$\begin{cases} r_0^\star = -\dfrac{2u_0^\star(n+2)K_d\Lambda^d}{r_0^\star + r_2\Lambda^2} \overset{d=4-\epsilon}{=} -\dfrac12\,\dfrac{n+2}{n+8}\,r_2\Lambda^2\,\epsilon + O(\epsilon^2)\\[3mm] u_0^\star = \dfrac{\left(r_0^\star + r_2\Lambda^2\right)^2}{4(n+8)K_d\Lambda^d}\,\epsilon \overset{d=4-\epsilon}{=} \dfrac14\,\dfrac{r_2^2}{(n+8)K_4}\,\epsilon + O(\epsilon^2)\end{cases}$$
Figure 3: The φ⁴ phase diagram, for ε > 0. If $r_0(\ell = \infty) > 0$, the effective potential for the uniform ‘magnetization’ has a minimum at the origin; this is the disordered phase, where there is no magnetization. If $r_0(\ell = \infty) = V''_{\rm eff}(0) < 0$, the effective potential has minima away from the origin, and the groundstate breaks the O(n) symmetry; this is the ordered phase. Too far to the right, $u_0$ is too large for us to trust our perturbative analysis. Experimental and numerical evidence suggests, however, that there are no other fixed points nearby, i.e. that there are actually no dragons.
which is at positive $u_0^\star$ if ε > 0. In the second step we keep only leading order in ε = 4 − d.
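To see the approach to this fixed point explicitly, one can integrate a stripped-down version of the $u_0$ equation in (6.31) by brute force. This is a sketch: $r_0$ is frozen at zero and the nonuniversal factor $K_d\Lambda^d/(r_2\Lambda^2)^2$ is set to 1, both simplifications made only for illustration:

```python
# Euler integration of the schematic flow du0/dl = eps*u0 - 4*(n+8)*u0**2,
# i.e. the second line of (6.31) with the nonuniversal constants set to 1.
# Any small positive u0 is driven to the Wilson-Fisher value u* = eps/(4(n+8)).
n, eps = 1.0, 0.1
u = 1e-4                        # weak initial coupling
dl = 1e-2                       # RG "time" step
for _ in range(20_000):         # total RG time l = 200
    u += dl * (eps*u - 4*(n + 8)*u**2)

u_star = eps / (4*(n + 8))
print(u, u_star)                # both ~ 2.78e-3
assert abs(u - u_star) / u_star < 1e-3
```

The free fixed point at $u_0 = 0$ is unstable for ε > 0: the linear term pushes any small positive coupling out toward $u^\star$, where the quadratic term stops the growth.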
Now we follow protocol and linearize near the W-F fixed point:
$$\frac{d}{d\ell}\begin{pmatrix}\delta r_0\\ \delta u_0\end{pmatrix} = M\begin{pmatrix}\delta r_0\\ \delta u_0\end{pmatrix}.$$
The matrix M is a 2 × 2 matrix whose eigenvalues describe the flows near the fixed point. It looks like
$$M = \begin{pmatrix} 2 - \frac{n+2}{n+8}\,\epsilon & \cdots\\ O(\epsilon^2) & -\epsilon\end{pmatrix}.$$
Its eigenvalues (which don’t care about the off-diagonal terms, because the lower-left entry is $O(\epsilon^2)$) are
$$y_r = 2 - \frac{n+2}{n+8}\,\epsilon + O(\epsilon^2) > 0,$$
which determines the instability of the fixed point, and $y_u = -\epsilon + O(\epsilon^2) < 0$, the irrelevant direction. The relevant eigenvalue controls the divergence of the correlation length, $\xi \sim (\delta r_0(0))^{-1/y_r} \equiv (\delta r_0(0))^{-\nu}$. This last equality is the definition of the correlation length exponent (how does the correlation length scale with our deviation from the critical point $\delta r_0(0)$). Therefore
$$\nu = \frac{1}{y_r} = \frac12\left(1 - \frac12\,\frac{n+2}{n+8}\,\epsilon + O(\epsilon^2)\right)^{-1} \simeq \frac12\left(1 + \frac{n+2}{2(n+8)}\,\epsilon\right) + O(\epsilon^2)\,.$$
7 Scaling
Scaling functions from the RG. [Goldenfeld, §9.4, 9.2] Consider a renormalization group near a fixed point with one relevant parameter, which transforms as
$$T' = R_b(T).$$
Near the fixed-point temperature $T_\star$ we can linearize:
$$T' - T_\star = R_b(T) - R_b(T_\star) = R_b'(T_\star)\,(T - T_\star) + O\!\left((T - T_\star)^2\right).$$
Letting $t_0 = \frac{T - T_\star}{T_\star}$, the reduced temperature transforms under a single RG step as $t' = t_0\,b^{y_t}$ (this defines $y_t$), and under n RG steps as
$$t^{(n)} = t_0\,\left(b^{y_t}\right)^n,$$
which says, since lengths shrink by $b^n$ after n steps,
$$\xi(t_0) = b^n\,\xi\!\left(t^{(n)}\right) = b^n\,\xi\!\left(t_0\,b^{ny_t}\right).$$
Choosing n so that $t_0 b^{ny_t} = 1$, i.e. $b^n = t_0^{-1/y_t}$, gives $\xi(t_0) = t_0^{-1/y_t}\,\xi(1)$. Here ξ(1) is the high-temperature correlation length, far from the critical point, which we can regard as a constant, independent of $t_0$. Comparing to the definition of the correlation length critical exponent, $\xi(t_0) \sim t_0^{-\nu}$, this says
$$\nu = \frac{1}{y_t}. \tag{7.1}$$
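The logic behind (7.1) can be mimicked numerically: iterate the rescaling $t \to t\,b^{y_t}$, accumulating a factor of b for the length, until the rescaled temperature is of order one, then read off the exponent. A sketch, with an arbitrarily chosen $y_t$ and the O(1) correlation length set to 1:

```python
# Iterate xi(t) = b * xi(t * b**yt): zoom out until the rescaled temperature
# is O(1); xi is then the accumulated zoom factor times an O(1) number.
import math

def xi(t0, yt, b=1.5):
    t, zoom = t0, 1.0
    while t < 1.0:              # keep zooming until we are far from criticality
        t *= b**yt
        zoom *= b
    return zoom                 # xi at O(1) rescaled temperature set to 1

yt = 1.6                        # an arbitrarily chosen thermal eigenvalue
t1, t2 = 1e-6, 1e-8
slope = (math.log(xi(t1, yt)) - math.log(xi(t2, yt))) / (math.log(t1) - math.log(t2))
# the measured slope of log(xi) vs log(t) is -nu = -1/yt, up to discreteness:
assert abs(slope + 1/yt) < 0.05
```

The small mismatch comes from the discreteness of the RG steps; shrinking b toward 1 makes the measured slope approach $-1/y_t$.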
Similarly, the singular part of the free energy density transforms as $f(t_0) = b^{-nd}\,f(t_0 b^{ny_t})$, so $f \sim t_0^{d/y_t}$, and the specific heat
$$c_V \sim \partial_t^2 f \sim t^{-\alpha}$$
has $\alpha = 2 - \frac{d}{y_t}$. Comparing to the expression (7.1) for ν, we have the Josephson hyperscaling relation
$$2 - \alpha = \nu d.$$
All the variables. More generally, there are many more couplings. For example,
for the Wilson-Fisher fixed point (in 2 < d < 4) there are two relevant couplings
t, h (the latter of which breaks symmetries), and a long list of irrelevant operators
which I’ll call K3 , K4 . Let us suppose that we have already diagonalized the matrix
Rαβ = ∂K α Rβ |K=K? at the fixed point, and t, h, K3 , K4 ... are the coordinates in the
eigendirections, with scaling exponents yt , yh , y3 , y4 .... The statements about relevance
above say yt , yh > 0, y3 , y4 < 0. (Note that the quartic coupling is in the list of irrelevant
perturbations of the WF fixed point for 2 < d < 4.) Then
$$f_s(t, h, K_3, \dots) = b^{-d}\,f_s\!\left(tb^{y_t},\ hb^{y_h},\ K_3 b^{y_3},\ \dots\right) = |t|^{2-\alpha}\,F\!\left(h\,|t|^{-y_h/y_t},\ K_3\,|t|^{-y_3/y_t},\ \dots\right). \tag{7.5}$$
The last expression for the free energy density is in terms of a scaling function F – which basically just means a function of dimensionless arguments. The existence of this function implies all the so-called hyperscaling relations (such as the Josephson one above) which relate various exponents.
An important disclaimer about the t → 0 limit in (7.5): the limit of $f(t, h, K_3)$ as $K_3 \to 0$ may not exist. In that case, $K_3$ is called a dangerous irrelevant variable. An
example where this happens is at the gaussian fixed point in d > 4, with K3 = g, the
quartic coupling. If the quartic coupling is zero, then for t ≤ 0, the partition function
blows up, so the limit t → 0 does not commute with g → 0. Despite the fact that g is an
irrelevant perturbation, g > 0 is crucial for determining the saddle point configuration
of m when t ≤ 0.
Corrections to scaling. In experiments or simulations, t isn’t really zero. So the
contributions from irrelevant operators are not exactly zero. For example, by a similar
argument to (7.5), the susceptibility is
$$\chi_T(t, h, \cdots) = |t|^{-\gamma}\,F^{\pm}_\chi\!\left(\frac{h}{t^{\Delta}},\ K_3\,t^{-y_3/y_t},\ \cdots\right)$$
and even though $y_3/y_t < 0$, the second argument of the scaling function is not actually zero. On the other hand, if $K_3$ is not a dangerous irrelevant variable which affects the vacuum structure, then F(x, y) will be analytic in y near 0, so we can Taylor expand:
$$\chi_T(t, h, \cdots) = |t|^{-\gamma}\left(A_\pm\!\left(\frac{h}{t^\Delta}\right) + B_\pm\!\left(\frac{h}{t^\Delta}\right)K_3\,t^{-y_3/y_t} + \cdots\right).$$
For h = 0, $A_\pm, B_\pm$ are non-universal constants. The term with A gives the leading singularity $|t|^{-\gamma}$, but the term with B, which goes like $|t|^{-\gamma - y_3/y_t}$, can also be singular as t → 0 and if so must be included in a comparison with experiment or simulation.
Two relevant couplings. If we also keep track of the external field, the singular
part of the free energy is
$$f_s(t, h) = |t|^{2-\alpha}\,F_\pm\!\left(\frac{h}{|t|^{\Delta}}\right), \qquad \Delta \equiv y_h/y_t.$$
The ± label is to allow for the possibility of different scaling functions for t > 0 and
t < 0. F± (0), which describes h = 0, fixed t, must be a constant so that e.g. cV ∼
∂t2 fs |h=0 ∼ |t|−α . F± (∞) describes the behavior t → 0 at fixed h, which is constrained
by
$$M = -T^{-1}\partial_h f_s \sim t^{2-\alpha-\Delta}\,F'_\pm\!\left(\frac{h}{|t|^{\Delta}}\right). \tag{7.7}$$
When h = 0, we require $M \sim t^\beta$, so we learn that
$$\beta = 2 - \alpha - \Delta = 2 - \alpha - y_h/y_t. \tag{7.8}$$
When t = 0, we must have $M \sim h^{1/\delta}$. If $F'_\pm(x) \overset{x\to\infty}{\sim} x^\lambda$, then (7.7) becomes
$$M \sim h^\lambda\,|t|^{2-\alpha-\Delta-\Delta\lambda} \overset{(7.8)}{=} h^\lambda\,|t|^{\beta-\Delta\lambda},$$
which requires both $\beta = \Delta\lambda$ and $\lambda = \frac{1}{\delta}$.
[End of Lecture 15]
Scaling for the correlation function. A similar scaling argument can be made for the spin-spin correlation function, $G(r, \{K\}) \equiv \langle m(r)\,m(0)\rangle_c$. On the one hand, G transforms as
$$G' = G\!\left(r/b,\ \{K'\}\right) = G\!\left(r/b,\ tb^{y_t},\ hb^{y_h},\ \dots\right).$$
Notice that the separation between the points is just like another coupling with dimensions of length. On the other hand, I claim that
$$G' = b^{2(d-y_h)}\,G\!\left(r, \{K\}\right).$$
This follows if we regard G′ as the correlation function of the block spins (see Goldenfeld §9.8). The input is: $Z(K') = Z(K)$, $G(r, K) = \delta_{h_r}\delta_{h_0}\ln Z(h)$ (hence the factor of $b^{-2y_h}$ comes from the rescaling of h), and finally the factor of $b^{2d}$ comes from the fact that a block spin contains $b^d$ spins.
If we choose $b = t^{-1/y_t}$ then
$$G(r, t, h, \dots) = t^{2(d-y_h)/y_t}\,G\!\left(rt^{1/y_t},\ 1,\ ht^{-y_h/y_t}\right) \equiv \left(rt^{1/y_t}\right)^{-2(d-y_h)}\,F_G\!\left(rt^{1/y_t},\ ht^{-y_h/y_t}\right) \tag{7.9}$$
from which we learn that $2(d - y_h) = d - 2 + \eta$ (as long as $F_G(x, y)$ is smooth as $x, y \to 0$).
Of all the greek letters we defined (α, β, γ, δ, ν, η, ∆) only two combinations are
independent – they all depend on the two exponents yt , yh associated with the two
relevant perturbations of the fixed point in question.
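The web of relations can be sanity-checked against the exactly known 2d Ising data $y_t = 1$, $y_h = 15/8$ (a sketch; those two input values are standard results quoted, not derived, here):

```python
# Exactly known 2d Ising inputs: yt = 1, yh = 15/8 (and d = 2).
d, yt, yh = 2, 1.0, 15/8

nu    = 1/yt                 # (7.1)
alpha = 2 - d/yt             # so that 2 - alpha = nu*d (Josephson)
Delta = yh/yt
beta  = 2 - alpha - Delta    # (7.8)
delta = Delta/beta           # from beta = Delta*lambda with lambda = 1/delta
eta   = 2*(d - yh) - (d - 2) # from 2(d - yh) = d - 2 + eta

# the familiar exact 2d Ising exponents come out:
assert (nu, alpha, beta, delta, eta) == (1.0, 0.0, 0.125, 15.0, 0.25)
```

Two inputs, six exponents: exactly the statement that only two combinations are independent.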
Data collapse. The main reason to care about scaling functions is the phenomenon
of data collapse. If we plot, say, the magnetization as a function of temperature, for
various values of the external field, we’ll get a different curve M (t) for each h. On the
other hand, (7.7) presents M (t, h) as tβ times a function of the single variable h/|t|∆ .
This formula is valid for small |h|, |t|, but arbitrary h/t. It
implies that if we plot M/|t|β as a function of |h|/|t|∆ , all
of the data will lie on two curves, one for t < 0 and one for
t > 0. At right is a cartoon of what this looks like.
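Data collapse can be demonstrated with the mean-field equation of state $h = tm + m^3$ (so β = 1/2, ∆ = 3/2), used here as an illustrative stand-in for real data: solving for m at several different (t, h) sharing the same $h/|t|^{\Delta}$, the rescaled magnetization $m/|t|^{\beta}$ comes out identical.

```python
def m_of(t, h):
    """Solve h = t*m + m**3 for the positive root by bisection (t, h > 0)."""
    lo, hi = 0.0, 10.0
    for _ in range(200):
        mid = 0.5*(lo + hi)
        if t*mid + mid**3 < h:
            lo = mid
        else:
            hi = mid
    return lo

beta, Delta = 0.5, 1.5           # mean-field exponents
x = 2.0                          # fixed value of the scaling variable h/|t|^Delta
collapsed = []
for t in [1e-1, 1e-2, 1e-3]:     # approach the critical point t -> 0+
    h = x * t**Delta
    collapsed.append(m_of(t, h) / t**beta)

# rescaled data fall on a single curve; here x = 2 corresponds to m/t^beta = 1
assert max(collapsed) - min(collapsed) < 1e-6
```

For this equation of state the collapse is exact: substituting $m = t^{1/2}\mu$ gives $h/t^{3/2} = \mu + \mu^3$, with all t-dependence gone.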
[Figure: data collapse of the Binder cumulant, from some small Monte Carlo simulations of the 2d Ising model; plotted against $(T - T_c)L^{1/\nu}$ the curves for different L collapse, and plotted against T they cross at $T_c$.]
Crossovers. [Cardy, chapter 4; Goldenfeld §9.9] A crossover refers to a smooth change between two behaviors as a parameter is varied, the opposite of a sharp phase transition. There are many reasons why critical behavior might be absent; examples include symmetry breaking, finite volume, disorder.
Suppose we take a critical system with some symmetry and perturb it by a small
symmetry-breaking term. For example, consider the Ising fixed point, perturbed by
a small magnetic field. There will no longer be a sharp transition between high and
low temperature, but what does the critical theory say about the behavior of physical
quantities?
Our scaling function expression for the singular free energy,
$$f_s(h, t) = |t|^{2-\alpha}\,F_\pm\!\left(\frac{h}{|t|^{\Delta}}\right),$$
tells us that when h = 0 we see the expected critical behavior, e.g. $c_V \sim |t|^{-\alpha}$. On the other hand, when $h \sim t^{\Delta}$, something else happens – the dependence on t in the argument of the scaling function matters; $t_X \equiv h^{1/\Delta}$ is called the crossover temperature. When $h \gg t^{\Delta}$, we instead see the limit $F_\pm(x) \overset{x\to\infty}{\sim} x^{1+\frac1\delta}$, which gives $c_V \sim t^{-\alpha - \Delta(1+1/\delta)}$, a completely ‘wrong’ critical exponent:
$$c_V \sim \begin{cases} t^{-\alpha}, & t_X \ll t \ll 1\,,\\ t^{-\alpha-\Delta(1+1/\delta)}, & t \ll t_X\,.\end{cases}$$
So watch out for residual magnetic field when trying to measure critical exponents.
O(3) → O(2) × Z₂. [Here I am really just re-typing Cardy §4.2] Above we considered a Z₂-symmetric fixed point, perturbed by something which broke all the symmetry. Consider instead an O(3)-symmetric fixed point which breaks the symmetry down to O(2) × Z₂. A definite lattice model is provided by 3-component rotors $\vec S_i = (S_i^x, S_i^y, S_i^z)$ with $\vec S_i\cdot\vec S_i = 1$ on a lattice in 2 < d < 4, with
$$-H = \sum_{ij} J_{ij}\,\vec S_i\cdot\vec S_j + D\sum_i \left(S_i^z\right)^2.$$
The O(3)-symmetric fixed point is named after Heisenberg. The Landau theory looks like
$$\mathcal L = \frac12\left(\vec\nabla\phi\right)^2 + \frac12\,t\,\phi^a\phi^a + u\left(\phi^a\phi^a\right)^2 + D\,\phi_z^2.$$
Near D = 0, $T = T_c^H$, let $t = \frac{T - T_c^H}{T_c^H}$, and we can write a scaling function using the scaling near the Heisenberg fixed point:
$$f_s(t, D) = b^{-nd}\,f_s\!\left(tb^{ny_t^H},\ Db^{ny_D}\right) = |t|^{2-\alpha_H}\,\Psi\!\left(D\,|t|^{-y_D/y_t^H}\right) \tag{7.11}$$
where $y_t^H$ is the dimension of t as a perturbation of the Heisenberg fixed point, H, and $y_D > 0$ is the dimension of D, which is relevant at that fixed point. In the second expression we did the familiar step of choosing $tb^{ny_t^H}$ = some order-one number. $\phi \equiv y_D/y_t^H$ is called the crossover exponent; notice that it is determined by data of the Heisenberg (UV) fixed point.
So at D = 0, $c_V \sim \partial_t^2 f_s \sim |t|^{-\alpha_H}$ where $\alpha_H$ is the specific heat exponent at H. The crossover happens when $D|t|^{-\phi} \sim 1$, i.e. $|t| \sim t_X \equiv D^{1/\phi}$, the crossover temperature. What happens for larger D?
Here’s a piece of physics input: for $|t| \ll t_X$, we should see Ising behavior, $c_V \sim |t|^{-\alpha_I}$. This is because the trajectory (A in the figure above) will spend a long time near the Ising fixed point. This input constrains the scaling function Ψ in (7.11):
$$c_V \sim |t|^{-\alpha_H}\,\Psi\!\left(D|t|^{-\phi}\right) = D^{-\alpha_H/\phi}\left(D|t|^{-\phi}\right)^{\alpha_H/\phi}\Psi\!\left(D|t|^{-\phi}\right) \equiv D^{-\alpha_H/\phi}\,\tilde\Psi\!\left(tD^{-1/\phi}\right)$$
where $\tilde\Psi$ is another scaling function. Demanding the Ising singularity at $t_c(D)$, we have
$$c_V \overset{!}{\sim} A(D)\left(t - t_c(D)\right)^{-\alpha_I}$$
which requires
$$\tilde\Psi\!\left(tD^{-1/\phi}\right) \sim a\left(tD^{-1/\phi} - b\right)^{-\alpha_I}, \qquad (a, b \text{ constants})$$
Finite-size scaling. The system size L enters through the combination L = Na, where N is the number of sites on a side. An RG step replaces a → ba, holding L fixed. Therefore N → N/b. Assuming that N does not appear explicitly in the RG map (this is violated by long-range interactions), we can add the parameter N to the list of arguments of the singular free energy density:
$$f_s\!\left(t, N^{-1}\right) = b^{-d}\,f_s\!\left(tb^{y_t},\ bN^{-1}\right),$$
i.e. $N^{-1}$ acts just like a relevant parameter with dimension $y_N = +1$. Goldenfeld gives a nice definition: a relevant parameter is one that an experimenter has to adjust to reach the critical point; the inverse system size $N^{-1}$ certainly must be adjusted to zero to have any critical behavior.
Then, for simplicity at h = 0, we can write a scaling function by the hopefully-standard-by-now trick of running b to an appropriate value, $b = |t|^{-1/y_t}$:
$$f_s\!\left(t, N^{-1}\right) = |t|^{2-\alpha}\,F\!\left(N^{-1}|t|^{-\nu}\right).$$
At fixed L, the argument of F blows up as t → 0. To understand this regime, let’s rewrite the scaling function so that the analytic dependence on t is manifest:
$$f_s\!\left(t, L^{-1}\right) = L^{-(2-\alpha)/\nu}\,\tilde F\!\left(tL^{1/\nu}\right),$$
where $\tilde F$ is a new scaling function. The fact that N is finite means that $\tilde F(x)$ is analytic in its argument. This leads to powerful conclusions. In particular, it enormously constrains the functions which blow up at a critical point, such as the susceptibility $\chi_T(t, L^{-1}) \sim \partial_h^2 f_s \sim L^{\gamma/\nu}\,\psi\!\left(tL^{1/\nu}\right)$, or the correlation length ξ, or the specific heat $c_V \sim L^{\alpha/\nu}\,\tilde F''\!\left(tL^{1/\nu}\right)$.
Instead of a divergence, each of these functions will have
a maximum at some value of t, determined entirely by the
scaling function F̃ 00 – say x0 is the location of the maximum
of F̃ 00 (x). This means that the location of the peak in t is t0 =
x0 /L1/ν ∼ L−1/ν – we know its dependence on system size!
Furthermore, the height of the (finite!) peak as a function of
L is determined by the prefactor Lα/ν .
$$\xi\!\left(t, N^{-1}\right) = b\,\xi\!\left(tb^{y_t},\ bN^{-1}\right) \tag{7.14}$$
$$= t^{-\nu}\,F_\xi\!\left(N^{-1}t^{-\nu}\right) \tag{7.15}$$
$$= t^{-\nu}\left(Lt^{\nu}\right)\tilde F\!\left(tL^{1/\nu}\right) \tag{7.16}$$
$$= L\,\tilde F\!\left(tL^{1/\nu}\right). \tag{7.17}$$
For L → ∞ at fixed t ≪ 1, $\xi \sim t^{-\nu}$ requires $\tilde F(x) \overset{x\to\infty}{\sim} x^{-\nu}$. For fixed L, t → 0, we have ξ ∼ L, and $\tilde F(x) = A + Bx + \cdots$ is analytic near 0. But this implies that
$$\frac{\xi\!\left(t, L^{-1}\right)}{L} = A + BtL^{1/\nu} + \cdots$$
so that at the critical value of the couplings, K, for any L, this is $\xi(0, L^{-1})/L = A$. If we plot ξ/L as a function of K for various L, all the curves will cross at the critical coupling.
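The crossing construction can be illustrated with synthetic data built from an assumed analytic scaling function (a sketch; both the function and the value ν = 0.63 are made up for illustration):

```python
# xi(t, L)/L = F(t * L**(1/nu)) with an assumed smooth scaling function F.
nu = 0.63
def F(x):
    return 0.9 - 0.4*x - 0.05*x**3

def xi_over_L(t, L):
    return F(t * L**(1/nu))

def crossing(L1, L2, lo=-0.1, hi=0.1):
    """Bisect for the t at which the curves xi/L for sizes L1 and L2 meet."""
    def g(t):
        return xi_over_L(t, L1) - xi_over_L(t, L2)
    for _ in range(100):
        mid = 0.5*(lo + hi)
        if g(lo)*g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5*(lo + hi)

for pair in [(8, 16), (16, 32), (8, 32)]:
    assert abs(crossing(*pair)) < 1e-6   # every pair of sizes crosses at t = 0
```

Because the L-dependence enters only through $tL^{1/\nu}$, all pairs of system sizes agree at t = 0, which is how the critical coupling is located in practice.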
A similar argument applies to the magnetization (which is easier to measure):
$$M\!\left(t, h = 0, L^{-1}\right) = t^{\frac{d-y_h}{y_t}}\,F_M\!\left(t^{-\nu}L^{-1}\right) = L^{-d+y_h}\,\tilde F_M\!\left(tL^{y_t}\right)$$
where $\tilde F_M(y) = A + By + \cdots$ is analytic near y = 0, since this is the finite-size limit. Thus
$$\frac{M\!\left(t, L^{-1}\right)}{L^{y_h-d}} = A + BtL^{y_t} + \cdots$$
– the curves will cross at the critical coupling $K = K_c$ (i.e. t = 0).
This allows us to determine the critical value of the coupling, and $y_h$. (Probably, keeping track of the leading irrelevant operator is a good idea.) We can then determine $\nu = 1/y_t$ as well by
$$\partial_K\,\frac{M}{L^{y_h-d}} \sim B\,L^{y_t},$$
i.e. $\log(\text{LHS}) = \log B + \frac1\nu\log L$.
• The finite-size scaling analysis (up to (7.13)) applies equally well if the system
geometry is L × L × ∞ or L × ∞ × ∞. The difference is that the scaling function
F 00 will be different. In the former case, the system is effectively one-dimensional,
and F 00 is still smooth. It can be determined by a transfer matrix calculation.
In the latter case, F 00 is determined by the critical behavior of an auxiliary 2d
system. Cardy has a bit more detail on this point.
• This account of scaling and scaling functions is completely ahistorical. The idea
of data collapse was known and used long before it was justified by the RG in
the way I’ve described.
8 Operator product expansion
Consider a correlation function of the form $\langle\phi_i(x_1)\phi_j(x_2)\,\Phi\rangle$, where {Φ} is a collection of other local operators at locations $\{x_l\}$; suppose that the two operators we’ve picked out are closer to each other than to any of the others: $|x_1 - x_2| \ll |x_1 - x_l|,\ \forall l$. Then from the point of view of the collection Φ, $\phi_i\phi_j$ looks like a single local operator. But which one? Well, it looks like some sum over all of them:
$$\langle\phi_i(x_1)\phi_j(x_2)\,\Phi\rangle = \sum_k C_{ijk}(x_1 - x_2)\,\langle\phi_k(x_1)\,\Phi\rangle$$
where $\{\phi_k\}$ is some basis of local operators. By Taylor expanding, we can move all the space-dependence of the operators to one point, e.g.:
$$\phi(x_2) = e^{(x_2-x_1)^\mu\,\partial_{x_1^\mu}}\,\phi(x_1) = \phi(x_1) + (x_2 - x_1)^\mu\,\partial_\mu\phi(x_1) + \cdots.$$
The result is the operator product expansion (OPE),
$$\phi_i(x_1)\,\phi_j(x_2) \sim \sum_k C_{ijk}(x_1 - x_2)\,\phi_k(x_1),$$
which is to be understood as an operator equation: true for all states, but only up to collisions with other operator insertions (hence the ∼ rather than =).
This is an attractive concept, but is useless unless we can find a good basis of local operators. At a fixed point of the RG, it becomes much more useful, because of scale invariance. This means that we can organize our operators according to their scaling dimension. Roughly it means two wonderful simplifications:
• The two-point functions of scaling operators can be diagonalized and normalized: for scalar operators,
$$\langle\mathcal O_i(x_1)\,\mathcal O_j(x_2)\rangle = \frac{\delta_{ij}}{|x_1 - x_2|^{2\Delta_i}}. \tag{8.2}$$
• Further, the form of $C_{ijk}$ is fixed up to a number. Again for scalar operators,
$$\mathcal O_i(x_1)\,\mathcal O_j(x_2) \sim \sum_k \frac{c_{ijk}}{|x_1 - x_2|^{\Delta_i + \Delta_j - \Delta_k}}\,\mathcal O_k(x_1) \tag{8.3}$$
where $c_{ijk}$ is now a set of pure numbers, the OPE coefficients (or structure constants).
The structure constants are universal data about the fixed point: they transcend perturbation theory. How do I know this? Because they can be computed from correlation functions of scaling operators at the fixed point: multiply the BHS of (8.3) by $\mathcal O_k(x_3)$ and take the expectation value at the fixed point:
$$\langle\mathcal O_i(x_1)\mathcal O_j(x_2)\mathcal O_k(x_3)\rangle_\star \overset{(8.3)}{=} \sum_{k'}\frac{c_{ijk'}}{|x_1-x_2|^{\Delta_i+\Delta_j-\Delta_{k'}}}\,\langle\mathcal O_{k'}(x_1)\mathcal O_k(x_3)\rangle_\star \overset{(8.2)}{=} \frac{c_{ijk}}{|x_1-x_2|^{\Delta_i+\Delta_j-\Delta_k}}\,\frac{1}{|x_1-x_3|^{2\Delta_k}}. \tag{8.4}$$
(There is a better way to organize the RHS here, but let me not worry about that here.) The point here is that by evaluating the LHS at the fixed point, with some known positions $x_{1,2,3}$, we can extract $c_{ijk}$.
Confession: I (and Cardy) have used a tiny little extra assumption of conformal
invariance to help constrain the situation here. It is difficult to have scale invariance
without conformal invariance, so this is not a big loss of generality. We can say more
about this later but for now it is a distraction.
Conformal perturbation theory. Suppose we find a fixed point of the RG, H? .
(For example, it could be the gaussian fixed point of N scalar fields.) Let us study its
neighborhood. (For example, we could seek out the nearby interacting Wilson-Fisher
fixed point in D < 4 in this way.) For definiteness and simplicity let’s think about the
equilibrium partition function
Z = tre−H
– we set the temperature equal to 1 and include it in the couplings, so H is dimensionless. We can parametrize it as
$$H = H_\star + \sum_x\sum_i g_i\,a^{\Delta_i}\,\mathcal O_i(x) \tag{8.5}$$
where a is the short distance cutoff (e.g. the lattice spacing), and Oi has dimensions of
length$^{-\Delta_i}$, as you can check from (8.2). So $g_i$ are de-dimensionalized couplings which we will treat as small and expand in.³²
Then
$$Z = \underbrace{Z_\star}_{\equiv\,{\rm tr}\,e^{-H_\star}}\left\langle e^{-\sum_x\sum_i g_i a^{\Delta_i}\mathcal O_i(x)}\right\rangle_\star$$
$$\overset{\sum_x \simeq \frac{1}{a^d}\int d^dx}{\simeq} Z_\star\Bigg(1 - \sum_i g_i\int\frac{d^dx}{a^{d-\Delta_i}}\,\langle\mathcal O_i(x)\rangle_\star + \frac12\sum_{ij}g_ig_j\int\frac{d^dx_1\,d^dx_2}{a^{2d-\Delta_i-\Delta_j}}\,\langle\mathcal O_i(x_1)\mathcal O_j(x_2)\rangle_\star - \frac{1}{3!}\sum_{ijk}g_ig_jg_k\int\frac{\prod_{a=1}^3 d^dx_a}{a^{3d-\Delta_i-\Delta_j-\Delta_k}}\,\langle\mathcal O_i(x_1)\mathcal O_j(x_2)\mathcal O_k(x_3)\rangle_\star + \cdots\Bigg).$$
Comments:
32
Don’t be put off by the word ‘conformal’ in the name ‘conformal perturbation theory’ – it just
means doing perturbation theory about a general fixed point, not necessarily the gaussian one.
• We used the fact that near the fixed point, the correlation length is much larger than the lattice spacing to replace $\sum_x \simeq \frac{1}{a^d}\int d^dx$.
• There is still a UV cutoff on all the integrals – the operators can’t get within a
lattice spacing of each other: |xi − xj | > a.
• The integrals over space are also IR divergent; we cut this off by putting the
whole story in a big box of size L. This is a physical size which should be
RG-independent.
• The structure of this expansion does not require the initial fixed point to be a
free fixed point; it merely requires us to be able to say something about the
correlation functions. As we will see, the OPE structure constants cijk are quite
enough to learn something.
Now let’s do the RG dance. We’ll take the high-energy point of view here: while preserving Z, we make an infinitesimal change of the cutoff,
$$a \to ba = (1+\delta\ell)\,a, \qquad 0 < \delta\ell \ll 1\,.$$
The price for preserving Z is letting the couplings run, $g_i = g_i(b)$. Where does a appear?
(1) in the integration measure factors $a^{d-\Delta_i}$.
(2) in the cutoffs on $\int d^dx_1\,d^dx_2$ which enforce $|x_1 - x_2| > a$.
(3) not in the IR cutoff – L is fixed during the RG transformation, independent of b.
The leading-in-δℓ effects of (1) and (2) are additive and so may be considered separately:
$$(1)\colon\quad \tilde g_i = (1+\delta\ell)^{d-\Delta_i}\,g_i \simeq g_i + (d-\Delta_i)\,g_i\,\delta\ell \equiv g_i + \delta_1 g_i\,.$$
The effect of (2) first appears in the O(g²) term, the change in which is
$$(2)\colon\quad \frac12\sum_{ij}g_ig_j\int_{|x_1-x_2|\in(a,\,a(1+\delta\ell))}\frac{d^dx_1\,d^dx_2}{a^{2d-\Delta_i-\Delta_j}}\underbrace{\langle\mathcal O_i(x_1)\mathcal O_j(x_2)\rangle_\star}_{=\sum_k c_{ijk}|x_1-x_2|^{\Delta_k-\Delta_i-\Delta_j}\langle\mathcal O_k\rangle_\star} = \frac{\delta\ell}{2}\,\Omega_{d-1}\sum_{ijk}g_ig_j\,c_{ijk}\int\frac{d^dx}{a^{d-\Delta_k}}\,\langle\mathcal O_k(x)\rangle_\star\,,$$
where the O(g³) terms come from triple collisions which we haven’t considered here. Therefore we arrive at the following expression for the evolution of the couplings, $\frac{dg}{d\ell} = (\delta_1 g + \delta_2 g)/\delta\ell$:
$$\frac{dg_k}{d\ell} = (d-\Delta_k)\,g_k - \frac12\,\Omega_{d-1}\sum_{ij}c_{ijk}\,g_ig_j + O(g^3)\,. \tag{8.6}$$
At g = 0, the linearized solution is $dg_k/g_k = (d-\Delta_k)\,d\ell \implies g_k \sim e^{(d-\Delta_k)\ell}$, which translates our understanding of relevant and irrelevant at the initial fixed point into terms of the scaling dimensions $\Delta_k$: $g_k$ is relevant if $\Delta_k < d$.³³
(8.6) says that to find the interaction bit of the beta function for $g_k$, we look at all the OPEs between operators in the perturbed hamiltonian (8.5) which produce $\mathcal O_k$ on the RHS.
Let’s reconsider the Ising model from this point of view:
$$H = -\frac12\sum_{x,x'}J(x-x')\,S(x)S(x') - h\sum_x S(x)$$
$$\simeq -\frac12\sum_{x,x'}J(x-x')\,S(x)S(x') - h\sum_x S(x) + \lambda\sum_x\left(S(x)^2 - 1\right)^2$$
$$\simeq \int d^dx\left(\frac12\left(\vec\nabla\phi\right)^2 + r_0\,a^{-2}\phi^2 + u_0\,a^{d-4}\phi^4 + h\,a^{-1-d/2}\phi\right) \tag{8.7}$$
In the first step I wrote a lattice model of spins S = ±1; in the second step I used the freedom imparted by universality to relax the S = ±1 constraint, and replace it with a potential which merely discourages other values of S; in the final step we took a continuum limit.
In (8.7) I’ve temporarily included a Zeeman-field term hS which breaks the φ → −φ symmetry. If we set it to zero, it stays zero (i.e. it will not be generated by the RG) because of the symmetry. This situation is called technically natural.
Now, consider for example as our starting fixed point the Gaussian fixed point, with
$$H_{\star,0} = \int d^dx\,\frac12\left(\vec\nabla\phi\right)^2.$$
Since this is quadratic in φ, all the correlation functions (and hence the OPEs, which we’ll write below) are determined by Wick contractions using
$$\langle\phi(x_1)\phi(x_2)\rangle_{\star,0} = \frac{N}{|x_1 - x_2|^{d-2}}.$$
³³ In the preceding discussion we considered the partition function Z. If you look carefully you will see that in fact it was not really necessary to take the expectation values $\langle\ \rangle_\star$ to obtain the result (8.6). Because the OPE is an operator equation, we can just consider the running of the operator $e^{-H}$
and the calculation is identical. A reason you might consider doing this instead is that expectation
values of scaling operators on the plane actually vanish hOi (x)i? = 0. However, if we consider the
partition function in finite volume (say on a torus of side length L), then the expectation values
of scaling operators are not zero. You can check these statements explicitly for the normal-ordered
operators at the gaussian fixed point introduced below. Thanks to Sridip Pal for bringing these issues
to my attention.
It is convenient to rescale the couplings of the perturbing operators by $g_i \to \frac{2}{\Omega_{d-1}}\,g_i$ to remove the annoying $\Omega_{d-1}/2$ factor from the beta function equation. Then the RG equations (8.6) say
$$\frac{dh}{d\ell} = \left(1+\frac d2\right)h - \sum_{ij}c_{ijh}\,g_ig_j\,, \qquad \frac{dr_0}{d\ell} = 2r_0 - \sum_{ij}c_{ijr_0}\,g_ig_j\,, \qquad \frac{du_0}{d\ell} = \epsilon\,u_0 - \sum_{ij}c_{iju_0}\,g_ig_j\,.$$
So we just need to know a few numbers, which we can compute by doing Wick contractions with free fields.
The good basis is given by the normal-ordered operators, with the self-contractions subtracted:
$$\mathcal O_n \equiv\ :\!\phi^n\!:\ = \phi^n - (\text{self-contractions}),$$
e.g.
$$\mathcal O_2 = \phi^2 - \langle\phi^2\rangle, \qquad \mathcal O_4 = \phi^4 - 6\langle\phi^2\rangle\phi^2 + \langle\phi^4\rangle. \tag{8.10}$$
To see where the coefficients come from, write $\mathcal O_4 = \phi^4 + a\langle\phi^2\rangle\phi^2 + b\langle\phi^4\rangle$, and let $G_0 \equiv \langle\phi^2\rangle$, $G_x \equiv \langle\phi(x)\phi(0)\rangle$. First we demand
$$0 \overset{!}{=} \langle\mathcal O_4\rangle = 3G_0^2 + aG_0^2 + 3bG_0^2,$$
which requires $0 = 3 + a + 3b$. The next demand is that
$$0 \overset{!}{=} \langle\mathcal O_4(x)\mathcal O_2(0)\rangle = \left(\langle\phi^4(x)\phi^2(0)\rangle - \langle\phi^4\rangle G_0\right) + aG_0\left(\langle\phi^2(x)\phi^2(0)\rangle - G_0^2\right) + 3bG_0^2\,\underbrace{\langle\mathcal O_2(0)\rangle}_{=0}$$
$$= 3G_0^3 + 12G_x^2G_0 - 3G_0^3 + a\left(2G_x^2G_0 + G_0^3 - G_0^3\right) = G_x^2G_0\,(12 + 2a),$$
which requires a = −6, and hence b = +1. Notice, however, that this changes nothing about the operational definition (omit self-contractions). Thanks to Aria Yom for questioning the expression in (8.10).
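These coefficients can be checked by brute-force Wick enumeration. A minimal sketch: track, for each pairing, how many contractions are at coincident points (a factor $G_0$) versus between x and 0 (a factor $G_x$); the labels 'x', '0' and the symbolic propagators are bookkeeping devices for illustration:

```python
def pairings(idx):
    """All perfect pairings (Wick contractions) of a list of field labels."""
    if not idx:
        yield []
        return
    first, rest = idx[0], idx[1:]
    for i in range(len(rest)):
        for tail in pairings(rest[:i] + rest[i+1:]):
            yield [(first, rest[i])] + tail

def correlator(points):
    """<phi(p1)...phi(pm)>_0 as {(power of G0, power of Gx): multiplicity}."""
    out = {}
    for p in pairings(list(range(len(points)))):
        g0 = sum(1 for i, j in p if points[i] == points[j])  # coincident pair
        key = (g0, len(p) - g0)
        out[key] = out.get(key, 0) + 1
    return out

assert correlator(['x']*4) == {(2, 0): 3}                       # <phi^4> = 3 G0^2
assert correlator(['x', 'x', '0', '0']) == {(2, 0): 1, (0, 2): 2}
assert correlator(['x']*4 + ['0']*2) == {(3, 0): 3, (1, 2): 12}

# Check <O4(x) O2(0)> = 0 for O4 = phi^4 - 6 G0 phi^2 + 3 G0^2, numerically
# (the constant piece of O4 drops against <O2> = 0):
G0, Gx = 0.7, 0.3              # arbitrary test values of the two propagators
def ev(c):
    return sum(m * G0**a * Gx**b for (a, b), m in c.items())
val = (ev(correlator(['x']*4 + ['0']*2)) - G0*ev(correlator(['x']*4))
       - 6*G0*(ev(correlator(['x', 'x', '0', '0'])) - G0*ev(correlator(['x']*2))))
assert abs(val) < 1e-12
```

With any coefficient other than a = −6 the combination $G_x^2 G_0(12+2a)$ survives, exactly as in the hand computation.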
To compute their OPEs, we consider correlators of the form above. Notice that the symmetric operators (the ones we might add to the action preserving the symmetry) form a closed subalgebra of the operator algebra.
Linearizing the RG flow about the new fixed point,
$$\frac{dr_0}{d\ell} = 2r_0 - 24\,u_0^\star\,r_0 + \cdots$$
gives
$$\frac{dr_0}{r_0} = \left(2 - \frac{24}{72}\,\epsilon\right)d\ell \implies r_0 \sim e^{\left(2-\frac{24}{72}\epsilon\right)\ell} \equiv e^{\ell/\nu},$$
which gives $\nu = \frac12 + \frac{\epsilon}{12} + O(\epsilon^2)$.
[End of Lecture 17]
9 Lower dimensions and continuous symmetries
[Cardy §6, Goldenfeld §11]
Mean field theory gets better as the number of dimensions grows, so naturally it gets
worse when the number of dimensions shrinks. For low enough d, the fluctuations
completely destroy the order at any finite temperature. For the Ising model, this lower
critical dimension is d = 1; that is, Tc is zero for an Ising chain. Recall our (Peierls’)
understanding of this: if we fix the spins to be up at one end of the chain, then the
free energy cost for making a region of down spins is
$$\Delta F_1 = E - TS \simeq 4J - 2T\ln L$$
where L is the system size – each of the two domain walls can be in any of L places. For any T > 0, for large enough L, $\Delta F_1$ is negative (hence favorable). In contrast, in d = 2, the energy of the domain wall is $2JL^{d-1} = 2JL$, while the entropy is of order $\ln\mu^L = L\ln\mu$ (the domain wall is a self-avoiding but closed random walk of length of order L; at each of ∼ L steps it has of order µ = z − 1 choices of direction to go), so
$$\Delta F_2 \sim L\left(2J - T\ln\mu\right)$$
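Numerically, the d = 1 estimate changes sign at $L_\star = e^{2J/T}$, which makes the entropy-versus-energy competition concrete (a sketch):

```python
import math

def delta_F(J, T, L):
    """Peierls estimate: free energy cost of one flipped domain in the Ising chain."""
    return 4*J - 2*T*math.log(L)

J, T = 1.0, 0.1
L_star = math.exp(2*J/T)              # sign change at L* = e^(2J/T) ~ 4.9e8 sites
assert delta_F(J, T, 0.9*L_star) > 0  # shorter chains: the domain costs free energy
assert delta_F(J, T, 1.1*L_star) < 0  # longer chains: entropy wins at any T > 0
```

However large $L_\star$ is at low temperature, it is finite for any T > 0, so the thermodynamic limit always disorders the chain.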
doesn’t change the energy. Translating different parts of the solid by slightly different
amounts will therefore cost a small energy, proportional to the gradient of the trans-
lation. The excitations of the solid therefore include large-correlation-length modes
(Goldstone bosons) ~ui (x) which appear in the energy only through their derivatives.
When experiencing such an elastic deformation, the solid will exert a restoring force, encoded in the energy functional for u by terms like $\int\partial_i u_j\,C^{ijkl}\,\partial_k u_l$ (analogous to $K\int(\vec\nabla\theta)^2$). The fact that a solid is rigid is a consequence of spontaneous symmetry
breaking, and this concept of rigidity generalizes to other cases of SSB. We’ll have more
to say about the stiffness of magnets.
Hohenberg-Mermin-Wagner-Coleman Theorem. Consider an O(n) model, with n-component rotors $\vec S_r$ at each site, $\vec S_r\cdot\vec S_r = 1,\ \forall r$. If the system orders, we can write the spin as $\vec S_r = \left(\sqrt{1-\sigma_r^2},\ \vec\sigma_r\right)$, where $\vec\sigma_r$ is an (n−1)-component vector pointing transversely to the ordering direction, describing the fluctuations about a particular ordered state – we will assume $\sigma^2 \ll 1$. The action for these fluctuations (known as spin waves in the context of magnetism) is
$$S = -\frac{1}{2T}\sum_{rr'}J_{rr'}\,\vec\sigma_r\cdot\vec\sigma_{r'} \simeq \text{const} + \frac{K}{2}\int d^dr\left(\vec\nabla\sigma\right)^2 + O\!\left(\nabla\sigma^4\right). \tag{9.1}$$
And now here’s the crucial point: in d = 2, this fluctuation correction to the magnetization goes like $\int\frac{\bar d^2k}{k^2} \sim \ln(L\Lambda)$. It diverges with system size, which clearly means it’s not a small correction to the leading term. (Notice that the form of the integrand is not exactly correct far from k = 0 in the Brillouin zone, but it is the infrared divergence at k = 0 which is the story here.) The singularity from the long-wavelength fluctuations is only worse if d < 2. The way out is that our assumption that there was ordering in the first place was wrong in d ≤ 2. We conclude that it is not possible to spontaneously break a continuous symmetry in d ≤ 2. A more proofy proof of this statement is on the homework.
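The logarithmic divergence can be seen directly by summing $1/k^2$ over the discrete momenta of an L × L lattice: doubling L adds $\ln 2/(2\pi)$. A sketch, using the continuum $1/k^2$ on the discrete momentum grid:

```python
import math

def S(L):
    """(1/L^2) * sum over k = 2*pi*n/L in the Brillouin zone, k != 0, of 1/k^2."""
    total = 0.0
    for n1 in range(-L//2, L//2):
        for n2 in range(-L//2, L//2):
            if n1 == 0 and n2 == 0:
                continue                        # omit the would-be zero mode
            total += 1.0 / ((2*math.pi*n1/L)**2 + (2*math.pi*n2/L)**2)
    return total / L**2

# S(L) ~ (1/2pi) ln L + const, so doubling L adds ln(2)/(2pi) ~ 0.110:
diff = S(128) - S(64)
assert abs(diff - math.log(2)/(2*math.pi)) < 0.01
```

No matter how small the prefactor 1/K multiplying this sum, the correction eventually exceeds any assumed small $\langle\sigma^2\rangle$ as L grows, which is the content of the theorem.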
In the second step, we focussed on the universal physics by considering $u_0$ large, with fixed $r_0/u_0$. This has the effect of making the longitudinal excitations very costly – the walls of the potential become very steep about the circle of minima. Writing
$$\Phi(r) = e^{i\theta(r)}\left(\sqrt{\frac{r_0}{u_0}} + \delta(r)\right),$$
the longitudinal excitation δ(r) is very hard to excite, and we can forget about it. We defined $K = \frac{r_0}{u_0}$, but recall that the overall coefficient of the action is J/T, and this is what determines K. The angular variable θ is the Goldstone mode – it only appears in the action via its derivatives.
The spin Green's function is
G(r) = ⟨S(r)·S(0)⟩ ≈ ⟨ e^{i(θ(r)−θ(0))} ⟩ = e^{−(1/2)⟨(θ(r)−θ(0))²⟩},
where we used Wick’s theorem in the last step. This correlation function of the Gold-
stones is
⟨(θ(r) − θ(0))²⟩ = 2 ( ⟨θ(0)²⟩ − ⟨θ(r)θ(0)⟩ )   (9.7)
= 2 ∫₀^{Λ=1/a} d̄²k ⟨|θ̃_k|²⟩ (1 − e^{ikr}) ≃ (1/(2πK)) log(r/a)   for r ≫ a,   (9.8)
using ⟨|θ̃_k|²⟩ = 1/(K k²).
Therefore
G(r) = r^{−η},   η = 1/(2πK) = T/(2πJ).   (9.9)
At the last step, we restored the factor of 1/T in the action. Important comments:
• η is indeed the anomalous dimension of the spin operator, defined as usual for a
critical theory by G(r) ∼ r^{2−d−η}. But here we are not at a critical point: this behavior occurs everywhere in a whole phase.
On the other hand, at high temperatures, we know that the correlations must be short-ranged (for example, using the (convergent!) high-temperature expansion): for T ≫ J, G(r) ∼ e^{−r/ξ}. The distinction between these two asymptotic behaviors of G(r) is sharp, and they represent different phases.
The low-temperature phase is consistent with the Mermin-Wagner theorem – there is no disconnected piece: G = G_connected. It is called algebraic order or
quasi-long-range order. In between there must be a phase transition of some kind,
which we will understand below.
• And indeed, the exponent η varies with K and hence with temperature! K is an
exactly marginal perturbation of a scale-invariant theory– it parametrizes a line
of (different!) fixed points.
Why isn’t (9.9) an exact statement for all temperatures? In our computation of
(9.9) we neglected the important fact that θ ≃ θ + 2π: θ is compact. This means that in
addition to the smooth configurations which lead to (9.9), there are other, topologically
distinct, configurations where as we move around in a loop in space, θ wanders around
on the circle, and only returns to itself up to a multiple of 2π. That is, there can be
configurations of θ(r) and loops C for which
∮_C dr · ∇θ = 2πn,   n ∈ ℤ.   (9.10)
We say that the loop C encloses n vortices. The presence of a vortex is topological
because the winding number n is an integer, which therefore cannot vary continuously.
(9.10) says that |∇θ| ∼ 1/r, from which we can estimate that the energy of a vortex is
E_one vortex = (K/2) ∫ d²r (∇θ)² = πJ log(L/a)
where L is the system size. Notice that this diverges in the thermodynamic limit: a
net number of vortices is not a finite energy configuration. To have finite energy, the
largest loops must contain a net number zero of vortices. However smaller regions may
contain vortices (n > 0) and antivortices (n < 0). The energy of a vortex-antivortex
pair (a vortex dipole) separated by distance R is
E_{v v̄} ≃ ∫_a^R dr/r ∼ log(R/a),
finite in the thermodynamic limit. We estimated above the energy of a single vortex,
but what is its free energy? Its entropy is
S_one vortex = log( # of possible locations ) = log (L/a)²
so that
F_one vortex = E_one vortex − T S_one vortex = (πJ − 2T) log(L/a) → { +∞, T < πJ/2 ;  −∞, T > πJ/2 ≡ T_KT }.
This gives us an estimate (it turns out to be exactly correct) for the transition tem-
perature between the low-temperature, algebraically-ordered phase, and the high-
temperature disordered phase.
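The sign change in the single-vortex free energy is trivial to check numerically; here is a minimal Python restatement of the estimate (J = a = 1 are arbitrary choices):

```python
import math

# F_one_vortex = (pi*J - 2*T) * ln(L/a) changes sign at T_KT = pi*J/2,
# and its magnitude grows with system size on both sides of the transition.
def f_one_vortex(T, L, J=1.0):
    return (math.pi * J - 2 * T) * math.log(L)

T_KT = math.pi / 2
F_below = [f_one_vortex(0.9 * T_KT, L) for L in (10, 1e3, 1e6)]  # positive: vortices suppressed
F_above = [f_one_vortex(1.1 * T_KT, L) for L in (10, 1e3, 1e6)]  # negative: vortices proliferate
```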
[End of Lecture 18]
KT transition. To give a more quantitative account of the transition, we must
explicitly include the vortices in our calculation. To this end we deform our theory of
the Goldstone field θ by introducing a fugacity for vortices – adding a vortex lowers
the energy by y0 . Formally we can do this by changing the action to
S = ∫ d²x (K/2)(∇θ)² − y₀ ∫ (d²x/a²) ( V(x) + V†(x) ).
In this expression V (x) is an operator which creates a vortex at position x, and V † (x)
creates an antivortex at x. V (x) is an example of a disorder operator – it is defined by
its effects on the spins: for example
⟨V(x) ···⟩₀ ≡ ∫_{θ with ∮_{C_x} dr·∇θ = 2π} [Dθ] e^{−S[θ]} ···
where C_x is any small curve encircling the point x. (Here ⟨···⟩₀ denotes an expectation value in the theory with y₀ = 0.) By some cleverness (following Cardy), we will figure out what we need without finding an explicit expression for V; such an expression can be found as part of a duality map (see Cardy §3.3 or Herbut §6.3). Notice that an expectation value with only a single vortex will be ⟨V(x)···⟩₀ ∝ e^{−E₁/T} ∼ e^{−πK log(L/a)} → 0 as L → ∞, but expectations with zero total vortex number, such as ⟨V(x)V†(0)···⟩₀, will be finite.
Granting this starting point, the partition sum is now a function of two variables:
Z(K, y₀) = ⟨ e^{y₀ ∫ (V + V†)} ⟩₀   (9.11)
= 1 + y₀ ∫ ⟨V + V†⟩₀ + (y₀²/2) ∫∫ ⟨(V + V†)(V + V†)⟩₀ + ···   (9.12)
= 1 + y₀² ∫ (d²r₁ d²r₂ / a⁴) ⟨V(r₁)V†(r₂)⟩₀ + ···   (9.13)
The terms with odd powers of y0 vanish by the fact that they have a net number of
vortices, and therefore e−E = e−∞ = 0, zero Boltzmann weight.
In the last line (9.13), the vortex-antivortex correlator ⟨V(r₁)V†(r₂)⟩₀ ≡ e^{−E(r₁,r₂)} is just the partition function for the spin waves in the presence of a vortex at r₁ and an antivortex at r₂. We can find the resulting free energy E(r₁, r₂) by saddle point (since the integral over smooth configurations of θ is gaussian), i.e. just solve the equations of motion
∇²θ = 0   (9.14)
(away from the vortices) with boundary conditions demanding the appropriate winding number around r₁, r₂. The solution is
θ(r) = Θ(r − r₁) − Θ(r − r₂),   where Θ(r) ≡ the angle between r and x̂,
and plugging it back into the action gives E(r₁, r₂) ≃ 2πK log(r₁₂/a) + const, with r₁₂ ≡ |r₁ − r₂|. The additive constant, which is associated with the energy in the core of the vortices, we can absorb into a rescaling of y₀: y₀ → y₀ e^{−πK C̃} ≡ y. Therefore we conclude that
⟨V(r₁)V†(r₂)⟩₀ = (r₁₂/a)^{−2πK}
– we learn that the scaling dimension of the operator V is ∆_V = πK. The scaling behavior of y is then y(b) = b^{y_V} y(1) with y_V = d − ∆_V = 2 − πK:
β_y = dy/dℓ = (2 − πK) y   (9.15)
– y is irrelevant for x ≡ 2 − πK ∝ T − T_KT < 0, and relevant for x > 0. Here T_KT ≡ πJ/2 as in the estimate above.
To complete the RG equations we need to know how the temperature variable x runs. Either T or x is the coupling associated with the 'energy operator' (∂θ)². We know a few things a priori: if y = 0, it doesn't run. Only even powers of y can appear, since the total number of vortices and antivortices must be zero (hence zero mod 2). Therefore, near x = 0, y = 0, the RG equation for x must have the form
β_x = dx/dℓ = A y² + O(y⁴).   (9.16)
A fancier argument (in Cardy's book) uses the OPE: Comparing (9.15) with our general form in terms of OPE coefficients (8.6), we see that the OPE between V and the energy operator has the form V · (∂θ)² ∼ V + ···. A general fact of CFT (Cardy §11.2) says that the OPE coefficients c_{ijk} are completely symmetric, and this means that we must have V · V* ∼ (∂θ)² + ···. Comparing to (8.6) then implies (9.16).
Equation (9.16) has a nice physical interpretation. Notice that the equation we solved for the behavior of the θ field in response to the vortices, (9.14), is Coulomb's law in d = 2: θ plays the role of the electrostatic potential A₀, the (anti)vortices are ± charges, and K plays the role of the dielectric constant of the medium. The 2πK log r interaction is the d = 2 Coulomb potential ∫ d̄²k e^{ik·r}/k², and χ_V in (9.17) is the polarizability χ_E. The running of x (equivalently K) in (9.16) is dielectric screening of the Coulomb field θ by the charge-anticharge (vortex-antivortex) pairs.
The Coulomb potential here is the same integral as we saw in (9.8):
∫ (d̄²k/k²) (1 − e^{ik·r}) ≃ (1/2π) ln(r/a) + const   for r ≫ a.
The combined RG equations
dy/dℓ = x y,   dx/dℓ = A y²
imply dy/dx = x/(Ay), which integrates to Ay² − x² = a constant determined by the initial conditions. The flow lines in the xy plane are hyperbolae, except for the special initial condition where the constant is zero, which gives the lines y = ±x/√A. The line y = −x/√A is the critical surface of the KT critical point at x = y = 0. Any initial condition above this line flows off to the upper right, to large x and y.
Figure 4: The Kosterlitz-Thouless phase diagram. The red line is the critical surface of the KT fixed
point; to its right is the disordered phase, where the flows end up at large x, y. The thin blue line is
a cartoon of a family of initial conditions for different values of the temperature, including the fact
that the bare value of y goes like e−πK C̃ = e−#/T , and hence is small for small T .
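These flow equations are easy to integrate numerically; the Python sketch below (the choice A = 1 and the initial conditions are arbitrary) checks the conserved quantity Ay² − x² and the two fates of the flow:

```python
def flow(x, y, A=1.0, dl=1e-4, steps=200_000):
    """Explicit-Euler integration of dx/dl = A y^2, dy/dl = x y."""
    for _ in range(steps):
        x, y = x + A * y * y * dl, y + x * y * dl
        if abs(x) > 10 or abs(y) > 10:
            break
    return x, y

# below the separatrix (A y^2 - x^2 = -0.16 < 0): flows onto the fixed line y = 0
x1, y1 = flow(-0.5, 0.3)
# above it (A y^2 - x^2 = +0.24 > 0): runs away to large x, y (the disordered phase)
x2, y2 = flow(-0.5, 0.7)
```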
Infinite polarizability means free charges: it means an arbitrarily small electric field E moves all the ± charges to opposite ends of the sample. By the same token, it means an external charge is completely screened beyond the correlation length ξ. Everywhere in the previous sentences 'charge' means 'vortex'.
Let’s return to the spin stiffness, the free energy cost for twisted boundary con-
ditions. More precisely, consider the system on a torus, and consider the boundary
conditions θ(x, L) = θ(x, 0), θ(L, y) = θ(0, y) + α. We can relate this to the periodic
BC problem by defining
θ(x, y) = θ0 (x, y) − αx/L
where θ₀(x, y) is periodic. In the gaussian approximation, the free energy density is
(K₀/2)(∂θ)² = (K₀/2)(α/L)² + ···
where the · · · comes from the periodic bit, which does not care about α. The spin
stiffness is defined to be
κ ≡ L² ∂²_α f,
which in the gaussian approximation equals K₀.
Including the effects of fluctuations and vortices and all that – in the low-temperature
phase – the stiffness is κ = K(` = ∞), the running coupling evaluated in the far
infrared. This is because, in the low temperature phase, the vortex fugacity flows to
y(∞) = 0, and we return to the gaussian model, with a renormalized coupling K(∞).
Therefore:
T < T_c : κ = K(∞), which varies with T
T = T_c : κ = K(∞) = K_c = 2/π, a universal value
T > T_c : κ = 0
where we know the answer for T > Tc because the finite correlation length means
∂α f ∝ e−L/ξ – the influence of the boundary conditions is short-ranged, and so the
leading bit of the free energy doesn’t care about α.
Screening by vortices. Above I may have made the β function equation for x
seem mysterious. Actually, it can be directly calculated by considering the renormal-
ization of the stiffness, i.e. its screening by vortex-antivortex pairs. Here we go:
[Chaikin-Lubensky §9.4] We’ll compute the running of the stiffness parameter K
by computing the stiffness in the presence of nonzero y: that is, we compute the free
energy in the presence of a uniform gradient of θ:
θ(x) = θ₀(x) + α · x
where α parametrizes the twist around the two directions and θ₀ is periodic. We can further decompose the periodic bit into a smooth piece θ_s, which satisfies 0 = ε_ij ∂_i∂_j θ_s, and a vortex piece, which satisfies ε_ij ∂_i∂_j θ_v = 2πn_v (but 0 = δ_ij ∂_i∂_j θ_v). Then the free energy defines the renormalized stiffness K^R:
F(α) − F(0) ≡ (1/2) L² K^R α²   (9.18)
= −ln tr e^{−H} − F(0)   (9.19)
= (1/2) L² K α² − ln tr ( e^{−H(α=0)} e^{−K α·∫d²x ∇θ₀} ) − F(0)   (9.20)
= (1/2) L² K α² − (K²/2) ∫d²x ∫d²x′ ⟨∂_iθ(x) ∂_jθ(x′)⟩ α_i α_j + O(α⁴).   (9.21)
(Note that we are still working in the convention where T = 1.) In the second line we expanded out S = (K/2) ∫d²x (∂θ₀ + α)². Since ∫d²x ∂_iθ_s(x) = 0 for all configurations, only the vortex part θ_v contributes to the correlator.
Therefore³⁷:
K^R = K − K² ∫ d²x ⟨∂_iθ(x) ∂_iθ(0)⟩   (9.22)
= K − (2πK)² lim_{q→0} ⟨n_v(q) n_v(−q)⟩ / q².   (9.23)
Net vortex neutrality implies ⟨n_v(q)⟩ → 0 as q → 0, which, together with rotation invariance, implies that
⟨n_v(q) n_v(−q)⟩ = (1/2) χ_v q² + O(q⁴)
where χ_v is exactly the vortex polarizability defined above. Therefore
K^R = K − (1/2)(2πK)² y² ∫ (d²x/a²) (|x|/a)^{−2πK+2} + O(y⁴)   (9.24)
= K − (1/2)(2πK)² y² · 2π ∫_a^∞ dr r^{3−2πK} / a^{4−2πK}.   (9.25)
Let's take the high-energy point of view on the RG: we change the cutoff a → ae^ℓ and demand that the physics (K^R) is invariant. This is accomplished by replacing
K → K(ℓ) = K − c y² ∫_a^{ae^ℓ} dr r^{3−2πK} / a^{4−2πK}
(where c > 0 is a constant) and
y → y(ℓ) = y e^{ℓ(2−πK)}.
Differentiating with respect to ℓ reproduces the flow equations:
−π ∂_ℓ K = ∂_ℓ x = A y²   (9.26)
∂_ℓ y = (2 − πK(ℓ)) y(ℓ) + O(y³) = x y.   (9.27)
³⁷ A little bit more detail which justifies the first line in (9.22): Claim 1:
⟨(∂_iθ_v)(q₁)(∂_jθ_v)(q₂)⟩ = f(q)(2π)² δ²(q₁ + q₂) (δ_ij − q̂_i q̂_j) ≡ G_ij.
Claim 2:
(∂_iθ_v)(q)(∂_iθ_v)(−q) = ((2π)²/q²) n_v(q) n_v(−q).
Claim 1 follows from 0 = ∂_i∂_iθ_v, which implies q_{1i}G_ij = q_{2j}G_ij = 0. Translation invariance implies G_ij ∝ δ(q₁ + q₂), and rotation invariance implies G_ij = A(q²)δ_ij + B(q²)q_i q_j.
Claim 2 follows from ε_ij ∂_i∂_jθ_v = 2πn_v, i.e. ∇ × ∇θ_v = 2πn_v ẑ. Taking the curl of the BHS gives ∇ × (∇ × ∇θ_v) = −∇²(∇θ_v) = ∇ × (2πn_v ẑ), which says
(∂_jθ_v)(q) = (−i ε_ji q_i / q²) 2π n_v(q).
10 RG approach to walking
[Brézin, ch 8; Cardy ch 9; the original reference is (brief!) P. de Gennes, Phys. Lett. A38
(1972) 339.]
At each site i of a lattice (actually, it could be an arbitrary graph), place an n-component vector s_i; we'll normalize them so that for each site i, n = s_i · s_i ≡ Σ_{a=1}^n (s_i^a)², and take H(s) = −K Σ_{⟨ij⟩} s_i · s_j (I have named the coupling K to make contact with our previous discussion of SAWs). Denote by dΩ(s) the round (i.e. O(n)-invariant) measure on an (n−1)-sphere, normalized to ∫ dΩ(s) = 1. The partition sum is
Z = ∫ ∏_i dΩ(s_i) e^{−H(s)}
= ∫ ∏_i dΩ(s_i) ∏_{⟨ij⟩} Σ_{k=0}^∞ (K^k/k!) (s_i · s_j)^k
= Σ_{graphs G} K^{N_l(G)} ∫ ∏_i dΩ(s_i) ∏_{⟨ij⟩∈G} (s_i · s_j).   (10.1)
Here we are doing the high-temperature expansion, and further expanding the product
of factors of the Hamiltonian; we interpret each such term as a graph G covering a
subset of links of the lattice. Nl (G) is the number of links covered by the graph G.
Now we can do the spin integrals. The integral table is
∫ dΩ(s) = 1
∫ dΩ(s) s^a s^b = δ^{ab}
∫ dΩ(s) s^a s^b s^c s^d = (n/(n+2)) (δ^{ab}δ^{cd} + 2 perms)   (10.2)
where the second follows by O(n) invariance and taking partial traces (recall s · s = n). The generating function is useful:
f_n(x) ≡ ∫ dΩ(s) e^{x·s} = [∫₀^π dθ sin^{n−2}θ e^{x cosθ}] / [∫₀^π dθ sin^{n−2}θ]
= Σ_{p=0}^∞ (x^p/p!) [∫₀^π dθ sin^{n−2}θ cos^pθ] / [∫₀^π dθ sin^{n−2}θ]   (terms with p odd vanish)
= 1 + Σ_{p=1}^∞ [x^{2p}/(2^p p!)] [n^p / (n(n+2)···(n+2p−2))] → f₀(x) = 1 + x²/2 as n → 0.   (10.3)
Let's interpret this last result: it says that in the limit n → 0, each site is covered either zero times or two times. This means that the graphs which contribute at n → 0 avoid themselves.³⁸
Returning to n > 0: since ∫ dΩ s^a s^b = 0 for a ≠ b, the spin component is conserved along the closed loops. We get a factor of n from the spin sum Σ_{a=1}^n for each closed loop. Only closed loops contribute to Z. So Z → 1 as n → 0, yay. Less trivially, however, consider
G^{ab=11}(r, K) ≡ ⟨s_0^1 s_r^1⟩ ≡ Z^{−1} ∫ ∏_i dΩ(s_i) e^{−H(s)} s_0^1 s_r^1.
As n → 0, the only surviving graphs are single self-avoiding paths from 0 to r, so
G^{11}(r, K) → Σ_p K^p M_p(r),   (10.4)
where M_p(r) is (as in (2.4)) the number of SAWs going from 0 to r in p steps. This is the generating function we considered earlier! The quantity G in (2.6) is actually the correlation function of the O(n → 0) magnet!
Summing the BHS of (10.4) over r, the LHS is Σ_r G^{11}(r, K) = χ^{11}(K) ∼ (K_c − K)^{−γ} as K → K_c (from below), from which we concluded earlier that for large walks, M_p ∼ p^{γ−1} a^p for p ≫ 1 (with a = 1/K_c, a non-universal constant which is sometimes fetishized by mathematicians).
Furthermore, the quantity ξ in (2.7) is actually the correlation length, G^{11}(r, K) ∼ e^{−r/ξ}. Near the critical point, ξ ∼ (K_c − K)^{−ν} means that R_p ∼ p^ν, which determines the fractal dimension of the SAW in d dimensions to be D_SAW = lim_{n→0} 1/ν(n, d), where ν(n, d) is the correlation-length critical exponent for the O(n) Wilson-Fisher fixed point in d dimensions.
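As a check on the identification of M_p, the first few SAW counts on the square lattice can be obtained by brute-force enumeration (a Python illustration, not part of the original argument):

```python
def count_saws(p):
    """Count p-step self-avoiding walks from the origin on Z^2 by depth-first search."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def extend(path, visited, remaining):
        if remaining == 0:
            return 1
        x, y = path[-1]
        total = 0
        for dx, dy in steps:
            nxt = (x + dx, y + dy)
            if nxt not in visited:       # self-avoidance constraint
                visited.add(nxt)
                path.append(nxt)
                total += extend(path, visited, remaining - 1)
                path.pop()
                visited.remove(nxt)
        return total

    return extend([(0, 0)], {(0, 0)}, p)

counts = [count_saws(p) for p in range(1, 6)]   # M_1..M_5 = 4, 12, 36, 100, 284
```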
[End of Lecture 19]
³⁸ Cardy has a clever way of avoiding these spherical integrations by starting with a microscopic model with a nice high-temperature expansion (namely H(s) = −Σ_{⟨ij⟩} log(1 + K s_i · s_j)) and appealing to universality.
10.1.1 SAW:RW::WF:Gaussian
In the same way, the Gaussian fixed point determines the fractal dimension of the
unrestricted walk. This can be seen by a high-temperature expansion of the Gaussian
model. (For more on this point of view, see Parisi §4.3 - 4.4.) Alternatively, consider
unrestricted walks on a graph with adjacency matrix Aij , starting from the origin 0.
Denote the probability of being at site r after n steps by Pn (r). Starting at 0 means
P₀(r) = δ_{r,0}. For an unrestricted walk, we have the one-step (Markov) recursion:
P_{n+1}(r) = (1/z) Σ_{r′} A_{r′r} P_n(r′)   (10.5)
where the normalization factor z ≡ Σ_{r′} A_{r′r} is the number of neighbors (more generally, the matrix A could be a weighted adjacency matrix and z could depend on r). Defining the generating function
G(r|q) ≡ Σ_{n=0}^∞ q^n P_n(r),
the recursion implies
Σ_{r′} ( δ_{rr′} − (q/z) A_{rr′} ) G(r′|q) = δ_{r,0},   (10.6)
so G is a Green's function for the lattice laplacian.
The long-wavelength properties of the Gaussian model near its critical point at q → 1 (for which purposes the denominator may be replaced by p² + r, with r ∼ 1 − q) determine the behavior of large unrestricted walks; in particular the RMS size is ∼ √n and the fractal dimension is 2.
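The recursion (10.5) can be iterated directly; a minimal Python sketch on the 1d chain (z = 2; the lattice size 801 is an arbitrary choice, large enough that no weight reaches the edges) confirms the √n growth:

```python
import math

def rms_after(n, size=801):
    """Evolve P_n under P_{n+1}(r) = (P_n(r-1) + P_n(r+1))/2 and return the RMS displacement."""
    mid = size // 2
    P = [0.0] * size
    P[mid] = 1.0
    for _ in range(n):
        P = [0.0] + [0.5 * (P[i - 1] + P[i + 1]) for i in range(1, size - 1)] + [0.0]
    return math.sqrt(sum(p * (i - mid) ** 2 for i, p in enumerate(P)))

vals = [rms_after(n) for n in (16, 64, 256)]   # -> 4.0, 8.0, 16.0 = sqrt(n)
```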
And the Gaussian answer is the right answer even for a SAW in d > 4. We could anticipate this based on our understanding of the fate of the WF fixed point as d → 4 from below. How can we see the correctness of mean field theory for SAWs in d > 4 directly from the walk?
There is a simple answer, and also a more involved, quantitative answer (next). The simple answer is: the random walk has fractal dimension D = 2 (if it is embedded in two or more dimensions and is unrestricted). Two-dimensional subspaces of R^d will generically intersect (each other or themselves) if d ≤ 4: generic intersection happens when the sum of the codimensions is at most d, and here the condition is (d−2) + (d−2) ≤ d, i.e. d ≤ 4. For d > 4, they generically miss each other, and the self-avoidance condition does not have a big effect.
The first term insists that neighboring monomers be spaced by a distance approxi-
mately a. The second term penalizes a configuration where any two monomers collide.
We used factors of the chain-spacing a to render the coupling u dimensionless.
Now zoom out. Suppose that a ≪ ξ, so that we may treat the polymer as a continuous chain, r(t_i ≡ ia²) ≡ r_i. In taking the continuum limit we must take t ∼ a² in order to keep the coefficient of the ṙ² term independent of a. The exponent becomes the Edwards hamiltonian:
H_E[r] = ∫ dt (dr/dt)² + u a^{d−4} ∫ dt₁ ∫ dt₂ δ^d(r₁ − r₂).
Consider the rescaling
a → ba,   r → b^{−x} r,   t → t.
Here is a clever (though approximate) argument (due to Flory) that suggests a value for x. At a fixed point, the two terms in H must conspire, and so should scale the same way. For general x, the kinetic term and the potential scale respectively as
KE → KE b^{−2x},   V → V b^{d−4+dx},
suggesting that x = (4−d)/(2+d). Dimensional analysis says
r(t) = a f(t/a²) ∼ t^{(1+x)/2}
and therefore the RMS walk size is
R = r(t = N) ∼ N^ν,   ν = (1+x)/2 |_Flory = 3/(d+2).
This isn’t too bad; in fact it’s exactly right in d = 2. See Cardy Chapter 9 for more
on this, and see the homework for a more quantitative approach to the value of ν.
with c_n independent of g. So if [g] > 0, c_n must have more and more powers of some inverse length as n increases. What dimensionful quantity makes up the difference? The dimensions are made up by dependence on the short-distance cutoff Λ = 2π/a, which has [Λ] = −1. Generically
c_n = c̃_n Λ^{n[g]},
where c̃_n is dimensionless, and n[g] > 0 – it's higher and higher powers of the cutoff. But this means that if we integrate out shells down to Λ/b, in order for physics to be independent of the zoom parameter b, the microscopic coupling g(b) will have to depend on b to cancel this factor. In particular, we'll have to have
g(b) = g₀ b^{−[g]} → 0 as b → ∞.
10.2 RG approach to unrestricted lattice walk
We showed above that the generating function G(r|q) for unrestricted walks on a
lattice (from 0 to r) satisfies (10.6), which says that it’s a Green’s function for the
lattice laplacian. The data of the Green’s function is encoded in the spectrum of the
adjacency matrix,
A_ij v_j^ε = ε v_i^ε.   (10.8)
This determines G via
G(i|q) = Σ_ε [1/(1 − qε/z)] v_0^ε v_i^ε.
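A small numerical check (Python, on an 8-site ring; the parameters are arbitrary) that the series Σ_n q^n P_n really builds the Green's function property (10.6):

```python
def ring_green(N=8, q=0.5, terms=200):
    """Build G(r|q) = sum_n q^n P_n(r) for a nearest-neighbor ring of N sites (z = 2)."""
    P = [1.0 if r == 0 else 0.0 for r in range(N)]   # P_0(r) = delta_{r,0}
    G = [0.0] * N
    qn = 1.0
    for _ in range(terms):
        G = [g + qn * p for g, p in zip(G, P)]
        P = [0.5 * (P[(r - 1) % N] + P[(r + 1) % N]) for r in range(N)]
        qn *= q
    return G

G = ring_green()
# residual of (10.6): G(r) - (q/z) sum_r' A_{rr'} G(r') - delta_{r,0}, with q/z = 0.25
resid = [G[r] - 0.25 * (G[(r - 1) % 8] + G[(r + 1) % 8]) - (1.0 if r == 0 else 0.0)
         for r in range(8)]
```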
The eigensystem of A encodes the solution to many physics problems. For example, we could consider a continuous-time random walk, where the probability p_i(t) for a walker to be at site i at time t satisfies the master equation (with unit hopping rate)
∂_t p_i(t) = Σ_j ( (1/z) A_ij − δ_ij ) p_j(t).
Alternatively, we could think of these as the equations for the normal modes of the
lattice vibrations of a collection of springs stretched along the bonds of the lattice. In
that case, this spectrum determines the (phonon contribution to the) heat capacity of
a solid with this microstructure.
Previously, we solved this problem using translation symmetry of the lattice, by
going to momentum space. Here I would like to illustrate an RG solution to this
eigenvalue problem which is sometimes available. It takes advantage of the scaling
symmetry of the lattice. Sometimes scaling symmetry and translation symmetry are both present, but they don't commute.
Sometimes, as for most fractals, only the self-similarity is present.
So this method is useful for developing an analytic understanding
of walks on fractal graphs, or more generally the spectrum of their
adjacency matrix. I believe the original references are this paper
and this one. Roughly, we are going to learn how to compute the
phonon contribution to the heat capacity of the broccoflower!
Let’s solve (10.8) for the case of a chain, with Aij = t(δi,j+1 +δi,j−1 ). I’ve introduced
a ‘hopping amplitude’ t which can be regarded as related to the length of the bonds.
The eigenvalue equation can be rewritten as
ε v_i = t (v_{i−1} + v_{i+1}).   (10.10)
Notice that if i is odd, then the entries on the RHS only involve even sites. So this equation eliminates v_i at the odd sites in terms of the values at the even sites. Plugging this back into the equation for an even site gives
ε v_{2l} = t (v_{2l−1} + v_{2l+1}) = (t²/ε) (v_{2l−2} + 2v_{2l} + v_{2l+2})
⟹ ε v_{2l} = [ε t²/(ε² − 2t²)] (v_{2l−2} + v_{2l+2}) ≡ t′ (v_{2l−2} + v_{2l+2}).
This is the same equation as (10.10), but with half as many sites, i.e. the zoom factor is b = 2. t′ is a renormalized hopping amplitude:
t′/ε = t²/(ε² − 2t²) = (t/ε)² / (1 − 2(t/ε)²).
Let's cheat and remind ourselves of the known answer for the spectrum using translation invariance: E(k) = 2t cos(ka) ranges from −2t to 2t as k varies over the BZ from 0 to 2π/a. Let's use this to learn how to understand the iteration map.
For the chain, the map has three fixed points, at x = 0, 1/2, −1. Let's think of fixing E and varying the initial hopping rate. If t₀ ∈ (−E/2, E/2) (that is, if |E| > 2t is in the band gap) then t_{n→∞} → t⋆ = 0: the iteration eventually reaches the fixed point at x = 0 (as in the left figure). More precisely, for n ≫ 1 it goes like t_n ∼ E e^{−2^n λ} for some λ.
Such an orbit which asymptotes to t → 0 can be described by decoupled clusters – the wavefunction is localized. I learned about this from this paper.
In contrast, one with finite or infinite asymptotic t is associated with an extended
state. This happens if |t0 | > E/2 (so that E ∈ (−2t, 2t) is in the band). Then
tn > |E|/2 for all n, and we have a nonzero effective hopping even between two sites
that are arbitrarily far-separated.
The fixed point at t? = E/2 is the state with k = 0, i.e. the uniform state.
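Here is a minimal Python iteration of the chain map (the cutoff of 50 iterations is an arbitrary choice):

```python
def iterate(x0, n=50):
    """Iterate x -> x^2 / (1 - 2 x^2), the chain decimation map for x = t/eps."""
    x = x0
    for _ in range(n):
        x = x * x / (1 - 2 * x * x)
    return x

gap_state = iterate(0.3)    # |x0| < 1/2: flows to x* = 0 (decoupled clusters, localized)
band_edge = iterate(0.5)    # x* = 1/2 is exactly fixed: the uniform k = 0 state
band_state = iterate(0.7)   # wanders chaotically, but (in exact arithmetic) |x_n| > 1/2 forever
```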
Eliminating the B sites by solving the previous six equations for them in terms of the A sites and plugging into (10.11) gives an equation of the same form on a coarser lattice:
ε A₁ = t′ (A₂ + A₃ + A₄ + A₅),   t′ = t²/(ε − 3t).
The zoom factor is b = 2. In terms of the dimensionless ratio x ≡ t/ε,
x → x²/(1 − 3x).
Here's a way to visualize the huge qualitative difference of this map relative to the result for the chain. Plot, as a function of some initial x = t/ε, the value of the nth iterate, for some large value of n (here n = 2·10⁵). For the chain (shown at the top), every x which starts in the band stays in the band³⁹ (x_n > 1/2 if x₀ > 1/2), and vice versa. For the Sierpinski case, we get this Cantor-like set of localized states. Here the spacing on the x-axis is 10⁻²; if we scan more closely, we'll find more structure.
Here's one more notion of dimension, for a graph embedded in R^d, following Toulouse et al. Think of the graph as a Debye solid; that is, put springs on the links of the graph, each with natural frequency ω₀² = K/m. The normal modes of this collection of springs have frequencies ω with ω_n²/ω₀² determined by the eigenvalues of the adjacency matrix. The density of states of such modes for small ω is an ingredient in the heat capacity of the resulting model solid. Denote by ρ(ω)dω the number of modes with frequency in the interval (ω, ω + dω).
For a translation-invariant system in d dimensions, the modes can be labelled by wavenumber, and ρ(ω)dω = L^d d̄^d k, which at ω → 0 (in the thermodynamic limit) is governed by Goldstone's acoustic phonon with ω = v_s k, and therefore ρ(ω) ∝ ω^{d−1}.
More generally, we define the spectral dimension d_s of the graph by the power-law relation
ρ(ω) ∼ ω^{d_s − 1}   (N → ∞, then ω → 0).
Sometimes it’s called the diffusion dimension. It is a useful idea! One cool application
is to figuring out how many dimensions your average spacetime has when you do a
simulation involving dynamical triangulations. (See §5.2 of this paper.)
Now suppose that instead of translation-invariance, we have dilatation invariance,
i.e. self-similarity. The number of sites for a graph Γ of linear size L scales as
N (L) ∼ LDΓ
where DΓ is the fractal dimension. This means that if we assemble a scaled up version
whose linear size is scaled up by b, we have N (bL) = bDΓ N (L) sites. And it means,
³⁹ Thanks to Daniel Ben-Zion for help with these figures.
just by counting eigenvalues, that the density of states must scale like
ρ_{L/b}(ω) = b^{−D_Γ} ρ_L(ω).   (10.12)
Consider L finite so that the spectrum {ω_n} is discrete, and focus on the nth eigenvalue from the bottom, for some fixed n. If we knew that this eigenvalue scaled with system size like
ω(L/b) = b^x ω(L)
then
ρ_{L/b}(ω) = b^{−x} ρ_L(ω b^{−x}).   (10.13)
Combining (10.12) and (10.13) gives ρ_L(ω) = b^{D_Γ−x} ρ_L(ω b^{−x}); choosing b = ω^{1/x} then gives ρ_L(ω) ∼ ω^{(D_Γ−x)/x}, i.e. d_s = D_Γ/x.
Claim: The smooth part of the spectrum of the Sierpinski fractal solid does scale like b^x for some x which we can determine. A more earnest pursuit of the equations (10.11) implies that
ω²/ω₀² ↦ (ω′)²/ω₀² = (ω²/ω₀²) ( d + 3 − ω²/ω₀² ),
so at small frequency ω′² ≃ (d+3) ω², i.e. b^x = √(d+3) with b = 2.
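Assembling these ingredients for the Sierpinski d-gasket gives a concrete number; the Python sketch below assumes (as above) b = 2, D_Γ = log(d+1)/log 2, and the small-ω limit of the frequency map:

```python
import math

def spectral_dimension(d):
    """d_s for the Sierpinski d-gasket, from rho(omega) ~ omega^(D_Gamma/x - 1)."""
    D_gamma = math.log(d + 1) / math.log(2)   # fractal dimension (zoom factor b = 2)
    x = math.log(d + 3) / (2 * math.log(2))   # from omega'^2 ~ (d+3) omega^2 at small omega
    return D_gamma / x

ds2 = spectral_dimension(2)   # the usual (d = 2) gasket: 2 ln 3 / ln 5 ~ 1.365
```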
The resistor network on a Sierpinski d-gasket is studied here. The scaling with size
of the conductivity of stuff made from such a graph can be related to its spectral
dimension.
Unlike the paragon of nerd-sniping problems (the resistor network on the square
lattice), this problem cannot be solved by going to momentum space.
Consider sending a current I into one corner of a Sierpinski gasket. By symmetry,
a current I/d must emerge from the other d corners.
Call ρ(a) the resistance of one bond at lattice spacing a. Now we want to compute the effective, coarse-grained resistance ρ(ba) for b > 1. The symmetry of the problem forbids current from crossing the middle of the triangle, and this allows us to compute the voltage drop between the input corner and any of the others. Specifically, this voltage drop is preserved if
ρ(ba)|_{b=2} = [(d+3)/(d+1)] ρ(a) ≡ b^ζ ρ(a),
ζ = log((d+3)/(d+1)) / log 2.
Now if we iterate this map ℓ times so that b^ℓ = L/a for some macroscopic L, then the resistance of the whole chunk of stuff is
ρ(L) ∼ L^ζ,
and the conductivity of stuff made from such a graph scales as
σ(L) = L^{2−d}/ρ(L) ∼ L^{−t},   t ≡ d − 2 + ζ.
11 RG sampler platter
I didn't get this far in lecture, but the goal of this section is to convey some more of the huge range of applications of the renormalization group perspective.
11.1 Disorder
Each construction step replaces every link by two parallel paths of two links each; iterating builds the (diamond) hierarchical lattice. I denote the new sites in black. The beauty of this construction for our purposes is that decimating the black sites precisely undoes the construction step. The generalization which replaces each link with q segments is called the Berker lattice.
Let v_{⟨ij⟩} ≡ tanh βJ_{⟨ij⟩}. Consider tracing over the black sites A and B in the figure at right. Using the high-temperature-expansion formula, this isn't hard:
e^{−∆H_eff(s_C, s_D)} = Σ_{s_A,s_B=±1} e^{−H(s)} = Σ_{s_A,s_B=±1} ∏_{links ⟨ij⟩} (1 + v_{⟨ij⟩} s_i s_j)
= 2² (1 + v₁v₂ s_C s_D)(1 + v₃v₄ s_C s_D)
= 2² ( (1 + v₁v₂v₃v₄) + (v₁v₂ + v₃v₄) s_C s_D )
= 2² (1 + v₁v₂v₃v₄)(1 + v′ s_C s_D)
with
v′(v₁..v₄) = (v₁v₂ + v₃v₄)/(1 + v₁v₂v₃v₄).   (11.1)
In the clean limit where all couplings are the same, this is
v′ = 2v²/(1 + v⁴).
This has fixed points at
v⋆ = 0, 1, 0.5437… .   (11.2)
Just as we did for the Ising chain in §3, we can study the behavior near the nontrivial fixed point and find (here b = 2) that ν ≃ 1.338. Redoing this analysis to include also a magnetic field, we would find y_h = 1.758 for the magnetization exponent.
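These numbers are easy to reproduce numerically; the Python sketch below assumes the length rescaling factor b = 2 per decimation step (which reproduces the quoted ν ≈ 1.338):

```python
import math

def f(v):
    """The clean-limit recursion v' = 2 v^2 / (1 + v^4)."""
    return 2 * v * v / (1 + v ** 4)

# bisection for the nontrivial fixed point, the root of f(v) - v in (0.1, 0.9)
lo, hi = 0.1, 0.9
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if f(mid) - mid > 0:
        hi = mid
    else:
        lo = mid
v_star = 0.5 * (lo + hi)

# thermal eigenvalue lambda = f'(v*) and nu = ln b / ln lambda with b = 2
h = 1e-6
lam = (f(v_star + h) - f(v_star - h)) / (2 * h)
nu = math.log(2) / math.log(lam)
```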
But now suppose that the couplings are chosen from some initial product distribution, independently and identically distributed – for example a dilute ('percolation-type') distribution of the form xδ(v − v₀) + (1 − x)δ(v), which will appear below.
After the decimation step, the distribution for any link evolves according to the
usual formula for changing variables in a probability distribution, using the RG relation
(11.1):
P′(v′) = ∫ dv₁ dv₂ dv₃ dv₄ δ(v′ − v′(v₁..v₄)) P(v₁) ⋯ P(v₄).
The preceding relation is then an RG recursion equation for the distribution of couplings, P(v) ↦ (R(P))(v). As usual when confronted with such a recursion, we should ask about its fixed points, in this case fixed distributions:
P⋆(v) = ∫ dv₁ dv₂ dv₃ dv₄ δ(v − v′(v₁..v₄)) P⋆(v₁) ⋯ P⋆(v₄).   (11.3)
We know some solutions of this equation. One is the clean fixed point
P⋆(v) = δ(v − v⋆).
Another comes from the dilute initial condition P_x(v) = xδ(v−1) + (1−x)δ(v), which evolves to a distribution of the same form:
P_{x′}(v) = [x⁴ + 4x³(1−x) + 2x²(1−x)²] δ(v−1) + [4x²(1−x)² + 4x(1−x)³ + (1−x)⁴] δ(v),
where the terms come from enumerating which of the four bonds are zero. So: the distribution is self-similar, but the bond-placing probability x evolves according to
x ↦ x′ = 2x² − x⁴.
So each fixed point of this map gives a solution of the fixed-distribution equation (11.3). They occur at
x⋆ = 0 (nobody's home),   x⋆ = 1 (everybody's home),   x⋆ = (√5 − 1)/2 (the percolation threshold on the DHL).
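The recursion for x can be verified by exact enumeration of the 2⁴ bond configurations (a Python check, not in the notes):

```python
from itertools import product

def x_prime(x):
    """Probability that v' = 1 when each of the four bonds is present (v_i = 1) with prob. x."""
    total = 0.0
    for v1, v2, v3, v4 in product((0, 1), repeat=4):
        weight = 1.0
        for v in (v1, v2, v3, v4):
            weight *= x if v else (1 - x)
        vp = (v1 * v2 + v3 * v4) / (1 + v1 * v2 * v3 * v4)   # always 0 or 1 here
        total += weight * (1.0 if vp >= 1 else 0.0)
    return total

golden = (5 ** 0.5 - 1) / 2
# agreement with x' = 2 x^2 - x^4, including at the percolation fixed point
checks = [abs(x_prime(x) - (2 * x * x - x ** 4)) for x in (0.2, 0.5, golden)]
```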
We can study (a subset of) flows between these fixed points if we make the more
general ansatz
p(v) = xδ(v − v0 ) + (1 − x)δ(v)
with two parameters v0 , x. Then we get a 2d map, much more manageable, if the
evolution preserves the form. It almost does. The evolution rule can be estimated by
x′ = 2x² − x⁴,
x′ v₀′ = ⟨v⟩_{P′} ≡ ∫ dv v P′(v).   (11.4)
The result is
x′ v₀′ = ∫ dv ∏_{i=1}^4 dv_i δ(v − v′(v₁..v₄)) v ∏_i p(v_i)
= ∫ ∏_i dv_i [(v₁v₂ + v₃v₄)/(1 + v₁v₂v₃v₄)] ∏_i p(v_i)
= x⁴ [2v₀²/(1 + v₀⁴)] + 4x³(1−x) v₀² + 2x²(1−x)² v₀².
Now consider a chain with random hoppings t_i chosen from some broad distribution, so that individual t_i will be very different from each other. Consider the largest t_i ≡ T, and assume that it is much bigger than all the others, including its neighbors. Then we can eliminate the two sites connected by the strong bond by solving the 2×2 problem
( 0 T ; T 0 ) (v_{i−1}, v_i)ᵀ ≃ ε (v_{i−1}, v_i)ᵀ.
More precisely, we can eliminate these two sites vi−1 , vi in terms of their neighbors
149
using their two rows of the eigenvalue equation,
ε v_{i−1} = t_ℓ v_ℓ + T v_i,   ε v_i = T v_{i−1} + t_r v_r,
⟹ (v_{i−1}, v_i)ᵀ = ( ε, −T ; −T, ε )^{−1} (t_ℓ v_ℓ, t_r v_r)ᵀ.
For ε ≪ T this generates an effective hopping between the surviving neighbors ℓ and r,
t′ ≃ t_ℓ t_r / T.   (11.5)
This RG rule (11.5) (which we could name for Dasgupta and Ma in a slightly more fancy context) is very simple in terms of the logs of the couplings, ζ ≡ log(T/t):
ζ′ = ζ_ℓ + ζ_r.
Imagine we start the RG at some initial strongest bond T₀. Then Γ ≡ log(T₀/T) says how much RGing we've done so far. The second, rescaling step puts the distribution back in the original range, which requires shifting everyone:
ζ_i = log(T/t_i) ↦ log((T − dT)/t_i) ≃ ζ_i − dT/T = ζ_i − dΓ.
This moves the whole distribution to the left: P(ζ) ↦ P(ζ + dΓ) = P(ζ) + dΓ P′(ζ) + O(dΓ²), i.e.
d_rescale P(ζ) = (dP(ζ)/dζ) dΓ.
The change in the full distribution from adding in the new bonds is
d_new P(ζ) = dΓ P(0) ∫ dζ_ℓ dζ_r P(ζ_ℓ) P(ζ_r) δ(ζ − ζ_ℓ − ζ_r),
since a decimation occurs with probability P(0)dΓ and replaces the decimated pair of bonds with one at ζ = ζ_ℓ + ζ_r. The resulting flow has the self-similar strong-disorder solution P_Γ(ζ) = e^{−ζ/Γ}/Γ, whose width grows without bound.
As an application, consider again the spring problem, with normal-mode frequencies
ω_n² = ω₀² (1 + ε_n),
where ε_n are the eigenvalues of A. And we take A_ij = (δ_{i,i+1} + δ_{i,i−1}) t_i and choose t_i from the strong-disorder distribution found above.
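Here is a crude Python simulation of this RG (an approximation-flagged sketch: I drop the ring geometry when reinserting renormalized bonds, a random-neighbor-style simplification, and the initial uniform distribution is an arbitrary choice), showing the ζ distribution broadening – the flow is toward infinite disorder:

```python
import math, random

random.seed(0)
# beta_i = log(T0 / t_i); the strongest bond has the smallest beta.
beta = [random.uniform(0.0, 1.0) for _ in range(2001)]

def rel_width(b):
    """Standard deviation of zeta = beta - Gamma, measured from the running cutoff."""
    gamma = min(b)
    rel = [x - gamma for x in b]
    mean = sum(rel) / len(rel)
    return math.sqrt(sum((r - mean) ** 2 for r in rel) / len(rel))

w_start = rel_width(beta)
while len(beta) > 101:
    m = min(range(len(beta)), key=beta.__getitem__)      # strongest bond
    left = beta[(m - 1) % len(beta)]
    right = beta[(m + 1) % len(beta)]
    new = left + right - beta[m]                         # i.e. zeta' = zeta_l + zeta_r
    for i in sorted({m, (m - 1) % len(beta), (m + 1) % len(beta)}, reverse=True):
        beta.pop(i)
    beta.append(new)                                     # positions scrambled (see lead-in)
w_end = rel_width(beta)
```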
To find the heat capacity at temperature 1/β, we should run the RG from the initial UV cutoff T₀ down to the bond scale associated with the temperature T, which is of order (T/ω₀)². Because of the breadth of the distribution, the bonds with t < T² are likely to have t ≪ T², and we can ignore them. Any site not participating in a bond produces a simple equipartition contribution ∆E = k_B T (i.e. it adds a constant to C_V) as long as 1/β > Ω₀. Sites participating in a bond have ω ≫ T and are frozen out. So the heat capacity is
C_V(β) = N(T),
where N(T) is the number of undecimated sites when the temperature is T, i.e. when the RG cutoff scale is ∼ T². So this model produces a crazy dependence on the temperature,
C_V ∼ 1/log²(T²).