QFT1

These lecture notes for Physics 8.323 at MIT cover the fundamentals of Relativistic Quantum Field Theory, including the integration of quantum mechanics and special relativity, Lagrangian field theory, and quantization of fields. The document outlines various topics such as symmetries, path integrals, and perturbative calculations, along with homework assignments for practical application. It serves as a comprehensive resource for students studying advanced concepts in quantum field theory.


Daniel Harlow

Massachusetts Institute of Technology


Lecture notes for Physics 8.323: Relativistic Quantum Field Theory I, spring 2024

Contents
1 Why quantum field theory?
1.1 Combining quantum mechanics and special relativity
1.1.1 A notational aside
1.1.2 Relativistic propagator and causality
1.1.3 Creation and annihilation operators on multi-particle Hilbert space
1.1.4 Quantum fields
1.2 Many-body quantum systems with local interactions
1.3 Quantum field theory in quantum gravity
1.4 Mathematical difficulties
1.5 What this course is and is not
1.6 Homework

2 Lagrangian field theory
2.1 Particle Lagrangians
2.2 Field Lagrangians
2.3 Relativistic notation
2.4 Symmetries and currents
2.5 Noether currents for Poincaré symmetry
2.6 Homework

3 Quantization of a free scalar field
3.1 Canonical commutation relations and wave functionals
3.2 Heisenberg fields and particle states
3.3 Non-locality of the annihilation operator in position space
3.4 Lorentz transformations and microcausality revisited
3.5 Quantization of a complex scalar field, antiparticles
3.6 Correlation functions I: Definition and physical meaning
3.7 Correlation functions II: Calculation
3.8 Homework

4 Algebras and symmetries in quantum field theory
4.1 The algebraic approach to field theory
4.2 Symmetry in quantum mechanics
4.3 Internal symmetries in quantum field theory
4.4 Spacetime symmetries in quantum field theory
4.5 Correlation functions of tensor fields
4.6 Correlation functions involving conserved currents
4.7 Homework

5 Path integrals in quantum mechanics and quantum field theory
5.1 Hamiltonian path integral in quantum mechanics
5.2 Ground state preparation and the iϵ prescription
5.3 An aside on Gaussian integrals
5.4 Lagrangian path integral in quantum mechanics
5.5 Path integral calculation of the harmonic oscillator ground state
5.6 Path integral calculation of the Feynman propagator in field theory
5.7 Euclidean path integrals
5.8 Homework

6 CRT, spin-statistics, and all that
6.1 The CRT theorem
6.2 Spin and statistics
6.3 The structure of vacuum entanglement
6.4 Unruh Effect
6.5 Reeh-Schlieder property
6.6 Homework

7 Perturbative calculation of correlation functions in interacting theories
7.1 Perturbation series for an integral
7.2 Feynman diagrams for Gaussian integrals
7.3 Feynman diagrams for an “interacting” integral
7.4 Exponentiation of connected diagrams
7.5 Perturbative computation of correlation functions
7.6 Feynman diagrams for perturbative correlation functions in ϕ^4 theory
7.7 Feynman rules in momentum space for correlation functions in ϕ^4 theory
7.8 Homework

8 Particles and Scattering
8.1 One-particle states
8.2 Multiparticle states in non-interacting theories
8.3 Multiparticle states in interacting theories
8.4 Cross sections and decay rates
8.5 Unitarity and the optical theorem
8.6 Homework

9 Scattering from correlation functions in quantum field theory
9.1 Exact two-point function in interacting quantum field theory
9.2 Matrix elements
9.3 Back to the two-point function
9.4 The LSZ reduction formula
9.5 Homework

10 Scattering in perturbation theory
10.1 Self-energy and the two-point function
10.2 Perturbative calculation of the S-matrix
10.3 Computing the cross section
10.4 Homework

11 Loop diagrams
11.1 Self-energy at one loop
11.1.1 Lattice regulator
11.1.2 Hard momentum cutoff regulator
11.1.3 Pauli-Villars regularization
11.1.4 Dimensional regularization
11.2 Two-to-two scattering at one loop
11.3 Homework

12 Renormalizability and the Renormalization Group
12.1 Power counting and renormalizability
12.2 Cancellation of divergences in renormalizable theories
12.3 The Wilsonian approach
12.4 Polchinski’s theorem
12.5 Why renormalizability?
12.6 Effective field theory
12.7 Fixed points and conformal symmetry
12.8 Critical phenomena

1 Why quantum field theory?
The goal of this class is to teach you quantum field theory (QFT), which is the central foundation
(together with general relativity) of most of contemporary theoretical physics. In 2024 you cannot claim to
understand the laws of physics without knowing some QFT, and here is an opportunity for you to learn it.
The full class is three semesters long, so in this semester we are just getting started.
QFT is not an easy subject to study: there are many subtle arguments and long calculations, and
moreover, as we will see throughout the class, QFT rests on somewhat shaky mathematical ground, which
can make it difficult to know which results are really solid. It is often said that nobody truly understands
QFT, and many current research seminars around the world are devoted to trying to understand how to
formulate it better. A consequence of this state of affairs is that, unlike for older subjects such as classical
electromagnetism, there is no settled way to teach QFT and the various textbooks are all written from quite
different perspectives. I’ll say a bit more about the perspective of this class at the end of the lecture.
Given the difficulty of the subject, it is important to understand at the outset where we are going and
why. The goal of this lecture and the following one is to present the basic conceptual motivations for thinking
about QFT, aiming to get some intuition for why it is a good idea to think about the quantum mechanics
of fields. There are three main motivations that we will consider:
• Quantum field theory is (likely) the only way to create a quantum theory of interacting relativistic
particles. This is why quantum field theory is of great importance in particle physics: the standard
model of particle physics, which governs the interactions of elementary particles through the
electromagnetic, strong, and weak forces, is a quantum field theory. For example, one of the great triumphs of
the standard model of particle physics is its successful description of something called the anomalous
magnetic moment of the electron:

$$\begin{aligned}
\text{Theory}: \quad & a_e = 0.001159652181643(764) \\
\text{Experiment}: \quad & a_e = 0.00115965218073(28).
\end{aligned} \tag{1.1}$$

This theory calculation is a tour-de-force of quantum field theory, and we will compute the first few
digits of it next semester in QFT 2.
• Quantum field theory is the natural language for describing the low-energy physics of many-body
quantum systems with local interactions. This is why quantum field theory is of great importance in
condensed matter physics: many important solid-state phenomena such as superconductivity, phase
transitions in magnets, and the fractional quantum Hall effect are quantitatively understood using the
machinery of quantum field theory. For example, in an Ising magnet the spontaneous magnetization
$M$ scales as
$$M \propto (T_c - T)^\beta \tag{1.2}$$
for temperatures just below the critical temperature $T_c$, with the “critical exponent” $\beta$ being given by

$$\beta = \begin{cases} \frac{1}{8} & d_{\rm spatial} = 2 \\ 0.326419(3) & d_{\rm spatial} = 3 \\ \frac{1}{2} & d_{\rm spatial} = 4. \end{cases} \tag{1.3}$$

These exponents can be computed in quantum field theory: by the end of this semester we will be able
to understand the $d_{\rm spatial} = 2$ and $d_{\rm spatial} = 4$ cases, while the $d_{\rm spatial} = 3$ case (the hardest) is an
area of active ongoing research!
• Quantum field theory arises ubiquitously in our most promising approach to combining quantum
mechanics and gravity, which consists of a set of related ideas under the general umbrella of “string
theory”. There it arises both as the low-energy description of brane systems and also as the
“holographic dual” of non-perturbative quantum gravity in spacetimes with negative cosmological constant.
One of the big successes of the latter is its confirmation in many cases of the Bekenstein-Hawking
formula for the entropy of a black hole:

$$S = \frac{A_{\rm horizon} c^3}{4 G \hbar}. \tag{1.4}$$

We will now discuss each of these motivations in turn, focusing on the first two since the third is mostly
beyond the scope of this class. If you are new to QFT (as I hope many of you are), the arguments may go
by a bit fast for you. If that is the case do not worry: we will do most of these manipulations again in much
more detail in later lectures. Our goal here is to paint in broad strokes, getting a flavor of what is to come
in the weeks and months ahead!

1.1 Combining quantum mechanics and special relativity


By the end of the 1920s non-relativistic quantum mechanics was on a fairly firm mathematical foundation.
For example for a system of $N$ particles of mass $m$ interacting via a potential $V(\vec{x}_1, \ldots, \vec{x}_N)$ we can (at least
in principle) determine everything we want to know about the system by solving the many-body Schrödinger
equation
$$\left[ -\frac{\hbar^2}{2m} \sum_{i=1}^N \nabla_i^2 - i\hbar \partial_t + V(\vec{x}_1, \ldots, \vec{x}_N) \right] \psi(\vec{x}_1, \ldots, \vec{x}_N; t) = 0. \tag{1.5}$$
This equation however has two problems from the point of view of special relativity:
(i) The kinetic terms are non-relativistic, being compatible with $E = \frac{|p|^2}{2m}$ instead of $E = \sqrt{|p|^2 c^2 + m^2 c^4}$.

(ii) The potential interactions are instantaneous, which is not compatible with the relativistic principle
that nothing can move faster than light.
Problem (i) is not too difficult to solve: it isn’t pretty, but we can just make the replacement
$$-\frac{\hbar^2}{2m} \sum_{i=1}^N \nabla_i^2 \to \sum_{i=1}^N \sqrt{-\hbar^2 c^2 \nabla_i^2 + m^2 c^4} \tag{1.6}$$
in equation (1.5). In this way it is fairly straightforward to make a relativistic theory of non-interacting
quantum particles. For example, in the one-particle case the solutions of the Schrödinger equation can be
expanded in a basis of energy eigenstates
$$\psi(\vec{x}, t) = e^{i \vec{p} \cdot \vec{x} - i \sqrt{|p|^2 c^2 + m^2 c^4}\, t}. \tag{1.7}$$

Problem (ii) however is more serious: in order for quantum dynamics to be compatible with special relativity,
we need to make sure that all interactions are local in spacetime. Based on our experience with electromag-
netism, which is after all a relativistic theory, we can guess that the natural way to incorporate spacetime
locality is to introduce fields. When we move an electric charge here in Cambridge, it is not really true
that there is a physically-detectable Coulomb potential that immediately adjusts what is going on in the
Andromeda galaxy. What happens instead is that we create a ripple in the electromagnetic field which then
propagates outwards at the speed of light, updating the Coulomb field as it goes along. What is perhaps
more surprising is that it turns out that we need to introduce fields for the charges as well: an electron field,
a proton field, and so on.

1.1.1 A notational aside


Figure 1: Propagation inside the lightcone in 1 + 1 dimensions: in a theory where nothing is faster than light
a disturbance at $(x_i, t_i)$ should not be able to reach a point $(x_f, t_f)$ which is spacelike separated.

Before proceeding further, it is time for the deep realization that the factors of ℏ and c in the previous
paragraph are unnecessary and distracting. We can get rid of the former by measuring time in units of
inverse energy, with ℏ as the conversion factor, and we can get rid of the latter by measuring distance in
units of seconds, with c as the conversion factor. From now on we will therefore work in units where
$$\hbar = c = 1, \tag{1.8}$$
with all dimensionful quantities having units which are some power of energy. In particular length and time
are both measured in units of inverse energy, while mass is measured in units of energy. For example the
radius of the Earth is
$$\frac{R_\oplus}{\hbar c} = \frac{1}{4.9 \times 10^{-33}\ {\rm J}} \tag{1.9}$$
and the acceleration due to gravity at the Earth’s surface is
$$\frac{g \hbar}{c} = 3.43 \times 10^{-42}\ {\rm J}. \tag{1.10}$$
These units are clearly not so practical for daily life, but in situations where both relativity and quantum
mechanics are important they are indispensable.
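
The conversions in (1.9) and (1.10) are easy to reproduce. The following is a minimal sketch, assuming the CODATA values of ℏ and c together with a mean Earth radius of 6.371 × 10^6 m and standard gravity 9.80665 m/s^2 (these two inputs and all function names are choices made here for illustration, not values stated in the notes). With these inputs the results agree with the numbers quoted above to within about a percent.

```python
# Constants in SI units. HBAR and C are CODATA values; R_EARTH and G_SURFACE
# are conventional reference values ASSUMED for this illustration.
HBAR = 1.054571817e-34   # J s
C = 299792458.0          # m / s
R_EARTH = 6.371e6        # m, mean radius of the Earth (assumed input)
G_SURFACE = 9.80665      # m / s^2, standard gravity (assumed input)


def length_to_inverse_energy(length_m):
    """Convert a length in metres to natural units (1/J) by dividing by hbar*c."""
    return length_m / (HBAR * C)


def acceleration_to_energy(accel_m_s2):
    """Convert an acceleration in m/s^2 to natural units (J) by multiplying by hbar/c."""
    return accel_m_s2 * HBAR / C


radius_natural = length_to_inverse_energy(R_EARTH)    # units of 1/J
gravity_natural = acceleration_to_energy(G_SURFACE)   # units of J

print(f"R_earth / (hbar c) = 1 / ({1.0 / radius_natural:.2e} J)")
print(f"g hbar / c         = {gravity_natural:.2e} J")
```

The same two helper functions convert any other length or acceleration, which makes it easy to get a feeling for how far everyday scales are from the energies where both ℏ and c matter.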

1.1.2 Relativistic propagator and causality


Let’s now try to understand in more detail why particles need to be replaced by fields. Although the non-
interacting theory based on the wave functions (1.7) is both relativistic and quantum, there is a sense in
which it allows information to propagate faster than light. To be concrete, let’s work in 1 + 1 dimensions
and consider the propagator
$$G(x_f, t_f; x_i, t_i) := \langle x_f | e^{-i H_0 (t_f - t_i)} | x_i \rangle. \tag{1.11}$$
Here $|x_i\rangle$ and $|x_f\rangle$ are eigenstates of the particle position: in the $p$ basis the wave functions of such eigenstates
are given by
$$\langle p | x \rangle = e^{-ipx}. \tag{1.12}$$
The physical meaning of the propagator is that its absolute value squared is the probability to find the particle
at position xf at time tf given that it was located at position xi at time ti (its phase has information about
the same question for initial and final momenta). We’ll focus on the situation where xf > xi , tf > ti ,
and tf − ti < xf − xi , in which case (xf , tf ) and (xi , ti ) are spacelike separated so no signal which is not
faster than light can propagate between them. More geometrically, the point (xf , tf ) is outside of the future
lightcone of $(x_i, t_i)$ (see figure 1). We’ll now show that the propagator in this situation is nonzero, which
shows that the information that there is a particle located at position $x_i$ at time $t_i$ propagates faster than
light!

Figure 2: Deforming the contour in the complex p-plane. The defining contour is the lower dashed line, which
can be smoothly deformed via two large circle segments at infinity to the upper contour which wraps the
branch cut along the positive imaginary axis.

We can evaluate the propagator (1.11) by inserting a complete set of momentum eigenstates:
$$G(x_f, t_f; x_i, t_i) = \int_{-\infty}^\infty \frac{dp}{2\pi}\, e^{i p (x_f - x_i) - i \sqrt{p^2 + m^2}\, (t_f - t_i)}. \tag{1.13}$$
Going forward we might as well use translation invariance to set $t_i = x_i = 0$ and relabel $t_f = t$ and $x_f = x$,
in which case we can consider the simpler function
$$G(x, t) := \int_{-\infty}^\infty \frac{dp}{2\pi}\, e^{i p x - i \sqrt{p^2 + m^2}\, t}. \tag{1.14}$$

This integral is not so easy to evaluate analytically, and one can worry if it even converges due to the
oscillatory behavior at infinity. Since we are assuming that 0 < t < x, the convergence of this integral is
controlled by the ipx term in the exponent. To make sure it is convergent, we can slightly rotate the phase
of the p integral so that it goes off to infinity at a small angle ϵ above the real p-axis in both directions (see
figure 2). This is convergent since at large positive p we have

$$e^{i e^{i\epsilon} p x} \approx e^{-\epsilon p x + i p x} \tag{1.15}$$
while at large negative p we have
$$e^{i e^{-i\epsilon} p x} \approx e^{\epsilon p x + i p x}. \tag{1.16}$$
Moreover by Cauchy’s theorem the answer is independent of ϵ since we can rotate the contour freely from
one ϵ to the next (the circle segments at infinity do not contribute since the integrand is exponentially
suppressed), and so we can take ϵ → 0 to recover the propagator. On the other hand to estimate the value
of the propagator, it is more convenient to instead rotate the contour up to wrap around the branch cut
that runs along the imaginary axis from p = im off to p = i∞ (see figure 2). Evaluating the integral on this
contour, we see that we have
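$$\begin{aligned}
G(x, t) &= i \int_m^\infty \frac{d\lambda}{2\pi}\, e^{-\lambda x} \left( e^{\sqrt{\lambda^2 - m^2}\, t} - e^{-\sqrt{\lambda^2 - m^2}\, t} \right) \\
&= \frac{i}{\pi} \int_m^\infty d\lambda\, e^{-\lambda x} \sinh\left( \sqrt{\lambda^2 - m^2}\, t \right).
\end{aligned} \tag{1.17}$$
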
The integrand here (ignoring the factor of i) is strictly positive for λ > m, and so we see that the propagator
is indeed nonzero outside of the lightcone! On the other hand it isn’t very nonzero: by using the monotonicity
of sinh y for y > 0 and then ignoring the negative exponential we have
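$$\int_m^\infty d\lambda\, e^{-\lambda x} \sinh\left( \sqrt{\lambda^2 - m^2}\, t \right) < \frac{1}{2} \int_m^\infty d\lambda\, e^{-\lambda (x - t)} = \frac{e^{-m(x - t)}}{2(x - t)}, \tag{1.18}$$
and thus
$$0 < |G(x, t)| < \frac{e^{-m(x - t)}}{2\pi (x - t)}. \tag{1.19}$$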
Therefore the propagator of a massive relativistic particle is suppressed exponentially outside the lightcone,
but it isn’t zero.
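
The bound (1.19) can be checked numerically. The following is a minimal sketch, assuming the representation (1.17) of the propagator; the substitution λ = m cosh u (chosen here so that the integrand is smooth at the branch point λ = m), the cutoff on u, the use of Simpson's rule, and all function names and sample values are illustrative choices rather than anything taken from the notes.

```python
import math


def propagator_abs(x, t, m, n=4000):
    """Estimate |G(x, t)| from (1.17) for 0 < t < x using Simpson's rule.

    With lambda = m*cosh(u) the integral in (1.17) becomes
    int_0^inf du m*sinh(u) * exp(-m*x*cosh(u)) * sinh(m*t*sinh(u)).
    """
    if not (0.0 < t < x) or m <= 0.0:
        raise ValueError("this sketch assumes 0 < t < x and m > 0")
    if n % 2:
        raise ValueError("Simpson's rule needs an even number of intervals")
    # The integrand falls off roughly like exp(-m (x - t) e^u / 2), so this
    # cutoff leaves a negligible tail (of relative size about e^-60).
    u_max = math.log(120.0 / (m * (x - t)) + 1.0)

    def integrand(u):
        ch, sh = math.cosh(u), math.sinh(u)
        # exp(-lambda x) * sinh(sqrt(lambda^2 - m^2) t), combined to avoid overflow
        growing = math.exp(-m * x * ch + m * t * sh)
        decaying = math.exp(-m * x * ch - m * t * sh)
        return 0.5 * (growing - decaying) * m * sh

    h = u_max / n
    total = integrand(0.0) + integrand(u_max)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * integrand(k * h)
    return total * h / 3.0 / math.pi   # the prefactor |i / pi| from (1.17)


def lightcone_bound(x, t, m):
    """Right-hand side of (1.19)."""
    return math.exp(-m * (x - t)) / (2.0 * math.pi * (x - t))


for x, t, m in [(2.0, 1.0, 1.0), (3.0, 1.0, 1.0), (1.5, 1.0, 2.0)]:
    print(x, t, m, propagator_abs(x, t, m), lightcone_bound(x, t, m))
```

For each sample point the estimate should come out strictly between zero and the bound, illustrating that the propagator is small but nonzero outside the lightcone.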
In a relativistic theory, there is something deeply wrong with being able to send information faster than
light. Indeed by doing a boost we can change the time ordering of any pair of spacelike-separated events, so
if we can communicate faster than light then we can also communicate backwards in time. In the presence
of interactions it is even worse: by sending a message to a point outside of your future lightcone and then
receiving a message back you can communicate directly with points in your own past lightcone. Such
things are called violations of causality, which is the principle that you shouldn’t be able to send signals to
your own past. Physics seems unlikely to make much sense in situations where causality is violated, so we
had better find a way to fix this.
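
The statement that a boost can reverse the time ordering of spacelike-separated events (but not of timelike-separated ones) can be seen in a small example. This is a sketch in 1 + 1 dimensions with c = 1; the function names and the sample separations and velocities are illustrative assumptions, and the sign convention for the interval is the one in which spacelike separation is positive.

```python
import math


def boost(t, x, v):
    """Lorentz boost with velocity v (|v| < 1, units with c = 1) in 1 + 1 dimensions."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (t - v * x), gamma * (x - v * t)


def interval(t, x):
    """Invariant interval -t^2 + x^2: positive means spacelike separation."""
    return -t * t + x * x


# Spacelike separation: the second event is later by dt = 1 but dx = 2 away.
dt, dx = 1.0, 2.0
dt_boosted, dx_boosted = boost(dt, dx, 0.8)
print("spacelike:", (dt, dx), "->", (dt_boosted, dx_boosted))

# Timelike separation: no boost with |v| < 1 changes the sign of dt.
timelike_signs = [boost(2.0, 1.0, v)[0] > 0.0 for v in (-0.99, -0.5, 0.0, 0.5, 0.99)]
print("timelike ordering preserved:", all(timelike_signs))
```

In the boosted frame the time separation of the spacelike pair is negative, so the event that was later is now earlier, while the interval itself is unchanged.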

1.1.3 Creation and annihilation operators on multi-particle Hilbert space


There is a useful way of re-organizing the above discussion of relativistic particles, historically called second
quantization, which helps point the way towards how to restore causality. The idea is to introduce a larger
Hilbert space, called Fock space, where any number of relativistic particles, including zero, is allowed. The
nicest way to do this is by introducing an annihilation operator a(x), which removes a particle from the
system at point x if one exists and otherwise annihilates the state. Its adjoint a† (x) creates a particle at x.
The algebra of creation and annihilation operators is given by

[a(x), a(x′ )] = 0
[a† (x), a† (x′ )] = 0
[a(x), a† (x′ )] = δ(x − x′ ), (1.20)

and the zero-particle state |Ω⟩ is defined to be the one which is annihilated by all a(x). Other states are
created from |Ω⟩ by acting with creation operators, for example a one-particle state with wave function ψ(x)
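is represented in this language by
$$|\psi\rangle = \int dx\, \psi(x) a^\dagger(x) |\Omega\rangle. \tag{1.21}$$
For practice we can check that the norm works out:
$$\begin{aligned}
\langle \psi | \psi \rangle &= \int dx'\, \psi^*(x') \int dx\, \psi(x) \langle \Omega | a(x') a^\dagger(x) | \Omega \rangle \\
&= \int dx'\, \psi^*(x') \int dx\, \psi(x) \langle \Omega | [a(x'), a^\dagger(x)] | \Omega \rangle \\
&= \int dx'\, \psi^*(x') \int dx\, \psi(x) \delta(x - x') \\
&= \int dx\, |\psi(x)|^2 \\
&= 1.
\end{aligned} \tag{1.22}$$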

More generally a multi-particle state $\psi(x_1, x_2, \ldots, x_N)$ is represented by
$$|\psi\rangle = \frac{1}{\sqrt{N!}} \int dx_1 \ldots dx_N\, \psi(x_1, \ldots, x_N) a^\dagger(x_1) \ldots a^\dagger(x_N) |\Omega\rangle. \tag{1.23}$$
We note in passing that since the $a^\dagger(x_i)$ all commute with each other, the particles we are describing are
bosons:¹
$$a^\dagger(x) a^\dagger(x') |\Omega\rangle = a^\dagger(x') a^\dagger(x) |\Omega\rangle. \tag{1.24}$$
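
The algebra (1.20) and the norm computation (1.22) can be made concrete on a lattice. The following is a minimal sketch, assuming space is replaced by three sites with unit spacing so that δ(x − x′) becomes a Kronecker delta; the representation of states as dictionaries from occupation numbers to amplitudes, and every function name, is a choice made here for illustration and is not taken from the notes.

```python
import math


def create(site, state):
    """Apply the creation operator at `site` to a state {occupation tuple: amplitude}."""
    out = {}
    for occ, amp in state.items():
        new = list(occ)
        new[site] += 1
        key = tuple(new)
        out[key] = out.get(key, 0.0) + amp * math.sqrt(new[site])
    return out


def annihilate(site, state):
    """Apply the annihilation operator at `site`; empty sites give zero."""
    out = {}
    for occ, amp in state.items():
        if occ[site] == 0:
            continue
        new = list(occ)
        new[site] -= 1
        key = tuple(new)
        out[key] = out.get(key, 0.0) + amp * math.sqrt(occ[site])
    return out


def combine(state_a, state_b, coeff_b=1.0):
    """Return state_a + coeff_b * state_b."""
    out = dict(state_a)
    for occ, amp in state_b.items():
        out[occ] = out.get(occ, 0.0) + coeff_b * amp
    return out


def inner(state_a, state_b):
    """Inner product <a|b>, conjugating the amplitudes of the first argument."""
    return sum(complex(amp).conjugate() * state_b.get(occ, 0.0)
               for occ, amp in state_a.items())


def states_close(state_a, state_b, tol=1e-12):
    keys = set(state_a) | set(state_b)
    return all(abs(state_a.get(k, 0.0) - state_b.get(k, 0.0)) < tol for k in keys)


def commutator_on(i, j, state):
    """Act with [a_i, a_j^dagger] on `state`."""
    return combine(annihilate(i, create(j, state)), create(j, annihilate(i, state)), -1.0)


vacuum = {(0, 0, 0): 1.0}

# One-particle state with a normalized lattice wave function, as in (1.21).
psi = [0.6, 0.0, 0.8j]
one_particle = {}
for site, amplitude in enumerate(psi):
    one_particle = combine(one_particle, create(site, vacuum), amplitude)
print("norm of one-particle state:", inner(one_particle, one_particle))

# A sample multi-particle state on which to test the commutator.
sample = create(0, create(0, create(2, vacuum)))
print("[a_0, a_0^dagger] acts as the identity:", states_close(commutator_on(0, 0, sample), sample))
print("[a_0, a_2^dagger] acts as zero:", states_close(commutator_on(0, 2, sample), {}))
```

The last check in the notes, (1.24), corresponds here to `create(0, create(2, vacuum))` and `create(2, create(0, vacuum))` being the same state, which is the lattice version of the statement that the particles are bosons.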
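The Hamiltonian is given by
$$\begin{aligned}
H_0 &= \int dx\, a^\dagger(x) \sqrt{-\partial_x^2 + m^2}\, a(x) \\
&= \int \frac{dp}{2\pi} \sqrt{p^2 + m^2}\, a^\dagger(p) a(p),
\end{aligned} \tag{1.25}$$
where in the second line we have introduced the Fourier-transformed annihilation operator
$$a(p) = \int dx\, e^{-ipx} a(x). \tag{1.26}$$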

It may feel like we have suddenly introduced an entire new form of quantum mechanics, but except for
introducing a rule that the particles are bosons (which we couldn’t see before since we only considered one-
particle states), this is really just a different bookkeeping for the same old multi-particle quantum mechanics.
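In particular the second expression for the Hamiltonian shows that the energy eigenstates are states of the
form
$$a^\dagger(p_1) \ldots a^\dagger(p_N) |\Omega\rangle, \tag{1.27}$$
with the total energy just being
$$E = \sum_{i=1}^N \sqrt{p_i^2 + m^2}. \tag{1.28}$$
In this language we can rewrite the propagator (1.14) as
$$\begin{aligned}
G(x, t) &= \langle \Omega | a(x) e^{-i H_0 t} a^\dagger(0) | \Omega \rangle \\
&= \langle \Omega | a(x, t) a^\dagger(0) | \Omega \rangle \\
&= \langle \Omega | [a(x, t), a^\dagger(0)] | \Omega \rangle,
\end{aligned} \tag{1.29}$$
where the second line uses $H_0 |\Omega\rangle = 0$, the third uses $a(x, t) |\Omega\rangle = 0$, and we have introduced the
Heisenberg picture annihilation operator
$$a(x, t) = e^{i H_0 t} a(x) e^{-i H_0 t}. \tag{1.30}$$
In fact this is true as an operator equation:
$$[a(x, t), a^\dagger(0)] = G(x, t). \tag{1.31}$$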
What we have learned from our discussion of the propagator is therefore that creation and annihilation
operators in the Heisenberg picture do not commute at spacelike separation. It is a bit tedious to work out,
but this also implies that the number operator
$$N(x, t) = a^\dagger(x, t) a(x, t), \tag{1.32}$$
which counts how many particles there are at position x and time t, does not commute with itself at spacelike
separation. Unlike the creation/annihilation operators, the number operator is hermitian and thus should be
observable. If we are going to save causality, we thus need to argue that the number of particles at position
x and time t cannot actually be measured by someone in the vicinity of x and t!
¹ If we want to get fermions, we should instead impose anticommutation relations $\{a(x), a(x')\} = \{a^\dagger(x), a^\dagger(x')\} = 0$ and
$\{a(x), a^\dagger(x')\} = \delta(x - x')$, where $\{A, B\} = AB + BA$ is called the anticommutator of A and B. We will discuss fermions in
more detail later in the semester.

Figure 3: Translation of a function f (x) by a. Note that here f ′ is the transformed function, not the
derivative, and that it is the inverse of the translation which appears in the argument of the function.

1.1.4 Quantum fields


A good way to proceed is to consider what kind of interactions we could add to the Hamiltonian (1.25) that
wouldn’t violate causality. We’d like to somehow build the interaction Hamiltonian V out of the creation
and annihilation operators, but in a way where (now working in $d$ spacetime dimensions) it is an integral
$$V(t) = \int d^{d-1}x\, H_{\rm int}(t, \vec{x}) \tag{1.33}$$
of an interaction density $H_{\rm int}(t, \vec{x})$ that commutes with itself at spacelike separation (otherwise we could
violate causality by measuring the energy density at spacelike separation). The easiest way to achieve this is
to construct an interaction density which commutes with itself at spatial separation, and then also demand
that it transform as a Lorentz scalar in the sense that²
$$U(\Lambda, a)^\dagger H_{\rm int}(x) U(\Lambda, a) = H_{\rm int}(\Lambda^{-1}(x - a)), \tag{1.34}$$
where $U(\Lambda, a)$ is the unitary operator on the Hilbert space of the theory which implements the Poincaré
transformation
$$x'^\mu = \Lambda^\mu{}_\nu x^\nu + a^\mu. \tag{1.35}$$
This ensures commutativity at spacelike separation since if x and y are
spacelike separated there is always a Poincaré transformation that sends them to the same time slice, and
we have assumed that $H_{\rm int}(x)$ commutes with itself at spatial separation. It may be puzzling that we used
the inverse Poincaré transformation in the argument of $H_{\rm int}$. The idea behind this is shown in figure 3: we
want to define the symmetry transformation to “move the scalar along with the symmetry”, meaning that
the “new” scalar at x should be equal to the “old” scalar at the point $\Lambda^{-1}(x - a)$ where x “came from”.
You will also show in the homework that defining things this way is necessary for us to have two successive
Poincaré transformations combine in the natural way.
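
The role of the inverse transformation can be explored numerically for ordinary functions, which is the classical analogue of the picture in figure 3 rather than the operator statement (1.34) itself. The following is a sketch in 1 + 1 dimensions, assuming boosts are parametrized by a rapidity and that a transformation is stored as a pair (rapidity, translation); these conventions, the function names, and the test function are illustrative assumptions made here.

```python
import math


def apply(g, point):
    """Act with g = (rapidity, (a_t, a_x)) on a spacetime point (t, x): x -> Lambda x + a."""
    eta, (a_t, a_x) = g
    t, x = point
    ch, sh = math.cosh(eta), math.sinh(eta)
    return (ch * t + sh * x + a_t, sh * t + ch * x + a_x)


def inverse(g):
    """Inverse transformation: x -> Lambda^{-1} (x - a)."""
    eta, (a_t, a_x) = g
    shift = apply((-eta, (0.0, 0.0)), (-a_t, -a_x))   # equals -Lambda^{-1} a
    return (-eta, shift)


def compose(g2, g1):
    """g2 after g1: x -> Lambda_2 (Lambda_1 x + a_1) + a_2."""
    return (g2[0] + g1[0], apply(g2, g1[1]))


def transform(g, f):
    """Move the scalar along with the symmetry: f'(x) = f(g^{-1} x)."""
    g_inv = inverse(g)
    return lambda point: f(apply(g_inv, point))


def transform_naive(g, f):
    """For comparison only: the convention f'(x) = f(g x) without the inverse."""
    return lambda point: f(apply(g, point))


def scalar(point):
    """An arbitrary test function with no special symmetry."""
    t, x = point
    return math.sin(t) + 0.5 * x * x + 0.3 * t * x


g1 = (0.4, (1.0, -2.0))
g2 = (-0.7, (0.5, 3.0))
p = (0.3, -1.2)

with_inverse = transform(g2, transform(g1, scalar))(p)
combined = transform(compose(g2, g1), scalar)(p)
naive_successive = transform_naive(g2, transform_naive(g1, scalar))(p)
naive_combined = transform_naive(compose(g2, g1), scalar)(p)

print("with inverse:   ", with_inverse, combined)
print("without inverse:", naive_successive, naive_combined)
```

With the inverse in the argument, applying g1 and then g2 agrees with applying the single combined transformation; without the inverse the two generally differ, because the order of composition comes out reversed.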
I’ll note in passing that by using time-dependent perturbation theory we can write a formula for the
particle scattering matrix in a theory with an interaction of this form (see chapter three of Weinberg volume
I) as
$$S = 1 + \sum_{n=1}^\infty \frac{(-i)^n}{n!} \int d^d x_1 \ldots d^d x_n\, T\{ H_{\rm int}(x_1) \ldots H_{\rm int}(x_n) \}. \tag{1.36}$$
If $H_{\rm int}$ is a Lorentz scalar then this is manifestly Lorentz-invariant except for the time-ordering symbol
$T$. As long as $H_{\rm int}$ commutes with itself at spacelike separation, however, the time ordering is also
independent of Lorentz frame and so $S$ will indeed be Lorentz-invariant:
$$U(\Lambda, a)^\dagger S U(\Lambda, a) = S. \tag{1.37}$$
We will discuss scattering theory in more detail later in the class.

² Here I’ll introduce a standard notation for the rest of the class: when I write $\vec{x}$ I mean a point in space, while when I write
$x$ I mean a point $(t, \vec{x})$ in spacetime.

How then can we build an $H_{\rm int}$ which is a Lorentz scalar that commutes with itself at spacelike separation?
The only idea which seems to work is that it should be built out of fields: linear combinations of the creation
and annihilation operators of the form³
$$\phi_i(x) = \sum_{\sigma, n} \int \frac{d^{d-1}p}{(2\pi)^{d-1}} \left[ u_i(x; p, \sigma, n) a(p, \sigma, n) + v_i(x; p, \sigma, n) a^\dagger(p, \sigma, n) \right], \tag{1.38}$$
with the coefficient functions $u_i$ and $v_i$ carefully constructed to ensure that⁴
$$[\phi_i(x), \phi_j(y)]_\pm = [\phi_i(x), \phi_j^\dagger(y)]_\pm = 0 \qquad (x - y)^2 > 0, \tag{1.39}$$
and also that under Poincaré transformations we have a simple transformation law
$$U(\Lambda, a)^\dagger \phi_i(x) U(\Lambda, a) = \sum_j D_{ij}(\Lambda) \phi_j(\Lambda^{-1}(x - a)). \tag{1.40}$$
Here we have allowed for multiple species of particle labeled by $n$, and also for the particles to have spin $\sigma$,
in which case the creation and annihilation operators need to be labeled by $n$ and $\sigma$ in addition to $p$. You
will show in the homework that consistently composing Poincaré transformations requires the matrices
$D_{ij}(\Lambda)$ to furnish a representation of the Lorentz group in the sense that
$$\sum_j D_{ij}(\Lambda_1) D_{jk}(\Lambda_2) = D_{ik}(\Lambda_1 \Lambda_2). \tag{1.41}$$
The requirement (1.39) of (anti)commutativity at spacelike separation is sometimes called microcausality.

Given fields obeying (1.39) and (1.40), it is then a straightforward matter to construct interaction
Hamiltonians which are Lorentz scalars. For example given a vector field $V^\mu(x)$ transforming as
$$U(\Lambda, a)^\dagger V^\mu(x) U(\Lambda, a) = \Lambda^\mu{}_\nu V^\nu(\Lambda^{-1}(x - a)), \tag{1.42}$$
some local interactions we could write down which are Lorentz scalars are
$$(V^\mu V_\mu)^2, \qquad V^\mu V^\nu \partial_\mu V_\nu, \qquad V^\mu V_\mu \partial_\nu V^\nu, \tag{1.43}$$
which all commute with each other at spacelike separation since $V^\mu(x)$ does.
So far you might be tempted to view this construction as just more bookkeeping: we are still working in
our old multi-particle Hilbert space and constructing things using creation and annihilation operators (albeit
in nice linear combinations). How can bookkeeping fix a problem with causality? The key point is that we
now make a fundamental shift in how we physically interpret all of the above equations:
⋆ In quantum field theory, we postulate that the observables that can be measured in the vicinity of a
spacetime point x are those constructed from the fields at x, not those constructed from the position-
space creation and annihilation operators a† (x) and a(x).
We will see in a few lectures that the a(x) and a† (x) are non-local when expressed in terms of the fields,
so the apparent failure of causality we saw above is really just a consequence of failing to identify the right
physical degrees of freedom. If we build a detector here in this room right now, the claim is that what it
really couples to are the fields and not the particles.
We are already in a position to see two of the most remarkable consequences of relativistic quantum field
theory:
3 Here we work in the “interaction picture”, where operators evolve under the free Hamiltonian $H_0$. Heisenberg picture fields
in interacting theories cannot be decomposed in this way, and when interactions are strong the interaction picture is not useful
so this motivation needs some revisiting (see the next section).
4 Here ± indicates that for fields which create fermions we actually want anticommutativity instead of commutativity at
spacelike separation.

• In interacting quantum field theories, the number of particles is not conserved: we have
not yet specified the functions $u_i$ and $v_i$, but we will see soon that both must be nonzero in order to
preserve commutativity at spacelike separation. Interactions which are polynomials of the fields will
thus always heuristically have the form $(a + a^\dagger)^n$, which necessarily includes terms that do not have
the same number of creation and annihilation operators and thus do not conserve particle number. We
should therefore expect that particle scattering in field theory can create any set of particles for which
there is sufficient energy, at least as long as the final particles have the same symmetry charges as the
initial particles. The idea that energy can be freely converted into particles is quite natural from the
point of view of Einstein’s equation $E = mc^2$.
• Every particle must have an antiparticle of equal mass and opposite charge: we will see
soon that the time-dependence of $u_i$ and $v_i$ is given by $e^{\mp i\sqrt{p^2+m^2}\,t}$, so to preserve commutativity at
spacelike separation for all times, which requires a cancellation between terms involving both $u_i$ and
$v_i$, these must have the same time-dependence and thus multiply creation/annihilation operators for
particles of the same mass. On the other hand they must have opposite charge under any continuous
internal symmetry. This is because in order to have an internal symmetry of the Lagrangian we need
the field to have a simple transformation law
$$e^{-iQ\theta}\,\phi_i(x)\,e^{iQ\theta} = e^{i\theta q}\,\phi_i(x), \tag{1.44}$$
where Q is the charge operator for the symmetry and q is the charge of the field, which means that
the annihilation and creation operators appearing in $\phi_i$ must both transform by a factor $e^{i\theta q}$. This
means that the particles created by the creation operator have opposite charge of those annihilated by
the annihilation operator (you will show this in the homework). A particle can be its own antiparticle,
but only if it has q = 0 for all continuous symmetries.
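
The heuristic claim that a polynomial interaction of the form $(a + a^\dagger)^n$ cannot conserve particle number can be checked numerically for a single mode. The sketch below is only an illustration: the truncation of the Fock space at `n_max` quanta, the choice n = 4, and all of the names are assumptions made for the example and do not come from the notes.

```python
import numpy as np

def ladder_operators(n_max):
    """Annihilation and creation matrices on a Fock space truncated at n_max quanta."""
    a = np.diag(np.sqrt(np.arange(1, n_max + 1)), k=1)  # a|n> = sqrt(n)|n-1>
    return a, a.T.copy()

def commutator(x, y):
    return x @ y - y @ x

n_max = 12  # truncation level: an assumption of this sketch, not of the theory
a, a_dag = ladder_operators(n_max)
number = a_dag @ a  # N = a^dagger a, diagonal in the number basis

# Toy single-mode "interaction" (a + a^dagger)^4.
interaction = np.linalg.matrix_power(a + a_dag, 4)

number_conserves_itself = np.allclose(commutator(number, number), 0.0)
interaction_conserves_n = np.allclose(commutator(interaction, number), 0.0)

# Matrix element <2|(a + a^dagger)^4|0>: the vertex creates two quanta from the vacuum.
vacuum_to_two = interaction[2, 0]
print(interaction_conserves_n, vacuum_to_two)
```

The commutator with the number operator is nonzero, and the matrix element between the vacuum and the two-quantum state does not vanish, which is the single-mode version of the statement that such interactions change the number of particles.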

There are other important general consequences we will understand later, including:
• Spin-statistics theorem: in constructing $u_i$ and $v_i$, it turns out that commutativity at spacelike
separation is only possible when the particles that are created/annihilated have integer spin. For
particles of half-integer spin, we instead need to impose anticommutativity at spacelike separation.
As mentioned above, commutativity leads to bosons and anticommutativity leads to fermions. Hence
we see that bosons must have integer spin and fermions must have half-integer spin.
• CRT theorem: In any relativistic quantum field theory, it turns out that there is always a symmetry
that exchanges particles and antiparticles (C), reflects a spatial direction (R), and reverses time (T ).
All of these predictions have been confirmed to remarkable precision by experiment. For example: colliding two
photons at high energy can produce an electron-positron pair; a neutron decays to a proton, an electron, and
an antineutrino; the antiparticle of the electron is the positron; electrons are fermions of spin 1/2 while photons
are bosons of spin one; and it was recently confirmed that hydrogen and antihydrogen have the same rate
for the 2s → 1s transition, as required by CRT symmetry.

1.2 Many-body quantum systems with local interactions


There is another way to motivate quantum field theory. Let’s imagine a physical system with a large number
of independent degrees of freedom that are arrayed in a lattice pattern in space, as in figure 4. What it
means to say that the degrees of freedom are independent is that the Hilbert space of the theory has a tensor
product form
$$\mathcal{H} = \bigotimes_i \mathcal{H}_i, \tag{1.45}$$

Figure 4: A lattice system with two spatial dimensions. There are independent degrees of freedom at each
of the red sites, and each term in the Hamiltonian only couples degrees of freedom on nearby sites.

where i labels the sites on the lattice.5 We say that an operator is a local operator at site i if it is the tensor
product of an operator on Hi with the identity operator on all of the other sites. We are then interested in
Hamiltonians of the form
$$H = \sum_i O_i, \tag{1.46}$$

where each Oi is built from local operators at sites in the vicinity of i, meaning sites that are an O(1) number
of links away (as opposed to something that grows with the total size of the system). Such Hamiltonians are
called local Hamiltonians. For example our lattice could be ions in a crystal, and the degrees of freedom
at the sites could describe local displacements of the ions. Another example we will come back to repeatedly
is the quantum Ising model in a transverse field, where each Hi is a two-level system and the Hamiltonian
is given by
$$H = -\sum_i \sigma_x(i) - \lambda \sum_{\langle ij\rangle} \sigma_z(i)\,\sigma_z(j), \tag{1.47}$$

where ⟨ij⟩ indicates nearest neighbors on the lattice.
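
To make the tensor-product structure (1.45) and the local Hamiltonian (1.47) concrete, here is a minimal sketch that builds the transverse-field Ising Hamiltonian on a short open chain by brute force. The open boundary conditions, the chain length, the values of λ and the helper names are all assumptions chosen for illustration.

```python
import numpy as np
from functools import reduce

SX = np.array([[0.0, 1.0], [1.0, 0.0]])
SZ = np.array([[1.0, 0.0], [0.0, -1.0]])
ID = np.eye(2)

def site_operator(op, site, n_sites):
    """Local operator: `op` on one tensor factor, the identity on all the others."""
    factors = [op if i == site else ID for i in range(n_sites)]
    return reduce(np.kron, factors)

def ising_hamiltonian(n_sites, lam):
    """H = -sum_i sx(i) - lam * sum_<ij> sz(i) sz(j) on an open chain (assumed geometry)."""
    dim = 2 ** n_sites
    h = np.zeros((dim, dim))
    for i in range(n_sites):
        h -= site_operator(SX, i, n_sites)
    for i in range(n_sites - 1):
        h -= lam * site_operator(SZ, i, n_sites) @ site_operator(SZ, i + 1, n_sites)
    return h

n_sites = 6
for lam in (0.0, 0.5, 1.0, 2.0):
    energies = np.linalg.eigvalsh(ising_hamiltonian(n_sites, lam))
    print(f"lambda={lam:3.1f}  E0={energies[0]: .4f}  gap={energies[1] - energies[0]:.4f}")
```

The matrix dimension grows as $2^N$, so this brute-force construction is only practical for a handful of sites; it is meant to show what "local operator" and "local Hamiltonian" mean, not to be an efficient solver.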


In studying many-body local quantum systems we are usually not interested in the details of what is
happening on the lattice scale. For example if you look at the atomic scale a superconductor is a giant
mess: it is only when you zoom out and look at the long-distance behavior that you can see that something
remarkable is going on. As we do this zooming out process, it becomes harder and harder to see that there
is really a lattice and the system starts to look continuous in space. In other words it starts to look like a
system whose Hamiltonian has the form
$$H = \int d^{d-1}x\, \mathcal{H}(\vec{x}), \tag{1.48}$$

with the energy density $\mathcal{H}(\vec{x})$ being built out of operators localized at $\vec{x}$. Moreover local operators at $\vec{x}$
will commute with local operators at $\vec{y}$, since they live on different tensor factors of the microscopic Hilbert
space (1.45). Thus it starts to look like a quantum field theory! This zooming out process is called the
renormalization group, and it is an idea of fundamental importance for any dynamical system (including
classical systems) with local interactions at short distances.
It is often the case that the interesting long-distance excitations of a many-body quantum system look
rather different than the fundamental lattice degrees of freedom. For example:
• In a crystal, the fundamental degrees of freedom are protons and electrons interacting through Coulomb
forces but at long distances the excitations are phonons, which are ripples made out of vibrations in
the lattice structure.
5 This is not the most general possibility, as we could also add degrees of freedom on the links of the lattice, faces, etc, and

also perhaps constrain the physical states by imposing some kind of local constraint. We will see these generalizations arise
later when we consider gauge theories.

• In quantum chromodynamics (QCD), which is the fundamental theory of strong interactions, the
fundamental degrees of freedom are quarks and gluons but the long-distance excitations are hadrons
such as protons, neutrons, and pions.
• In 1 + 1 dimensions the fundamental degrees of freedom of the quantum Ising model (1.47) are Pauli
spins, but at the “critical point” λ = 1 the long-distance excitations are pairs of non-interacting massless
fermions. This is also the essence of why the two-dimensional classical Ising model is solvable.
These examples illustrate an important weak point of our above argument that we need quantum field
theory to combine special relativity and quantum mechanics: when the interactions of the fundamental
fields appearing in the Lagrangian are strong, such as in QCD at low energies, there need not be any simple
relationship between these fundamental fields and the low-energy particle excitations. What the argument
leading to (1.38) really constructs is a “low-energy effective field theory”, whose fields create and annihilate
the low-energy excitations. In quantum field theory the basic question we are often really trying to answer
is the following: given some short-distance formulation of the theory using local fields, what are the long-
distance excitations and how do they interact?
It is also worth emphasizing that, although we began this discussion by talking about particles, not all
quantum field theories lead to particles. Field theories without particles include “conformal field theories”,
which are more naturally understood in terms of correlation functions of local operators with simple scaling
transformations, and “topological field theories”, which are more naturally understood in terms of the algebra
of extended “surface” operators which can be freely deformed in spacetime. Moreover these are not weird
esoteric theories: the long-distance description of any second-order phase transition is a conformal field
theory, and the fractional quantum Hall effect is described by a topological quantum field theory. Even in
the standard model of particle physics, there are “infrared divergences” arising from the presence of massless
particles such as the photon and dealing with these correctly requires us to consider asymptotic states which
are clouds of infinite numbers of particles rather than individual particles. In quantum field theory it is the
fields that are essential, not the particles.
There is an important caveat to mention here: we are quite confident that the laws of nature are relativis-
tic, so in high-energy physics we are for the most part only interested in relativistic quantum field theories.
In condensed matter physics however Lorentz invariance can be broken by the existence of the material we
are studying, so the field theories that show up in condensed matter physics do not need to be relativistic.
Sometimes they are however, for example in the case of the quantum Ising model or the fractional quantum
Hall system, and the methods you learn in this class generalize to the non-relativistic case without much
difficulty.

1.3 Quantum field theory in quantum gravity


We have seen that quantum field theory gives a way to successfully combine quantum mechanics and special
relativity. The next frontier in fundamental physics is learning how to combine quantum field theory and
general relativity, which is Einstein’s theory of gravity. General relativity has been quite successful in
explaining gravitational phenomena in astrophysics and cosmology, but it is a classical theory and so far
attempts to “quantize” it in the conventional way (which we will review next time) have not been successful.
A number of lines of reasoning have led to the idea that a theory which combines gravity and quantum
mechanics will need to be “holographic”, in the sense that its fundamental formulation lives in fewer spacetime
dimensions than are perceived by long-distance observers such as ourselves. The most concrete example of
this phenomenon is the “AdS/CFT” correspondence, which says that quantum gravity in a universe with
negative cosmological constant is equivalent to a standard quantum field theory (in fact a conformal field
theory, hence “CFT”) living in one fewer spacetime dimension. This idea has been realized (and in fact was
discovered) within the broader framework of “string theory”, which is a speculative proposal for a theory
of quantum gravity based on dynamical objects called “branes” (short for membrane), which have spatial
volumes of various dimension. Often these branes have the feature that at long distances the gravity in the
ambient space they live in can be ignored, in which case their long-distance excitations are again controlled
by quantum field theory. In fact the AdS/CFT correspondence arises in string theory in precisely this way. This
also gives a novel way of constructing interesting quantum field theories that so far are not accessible by the
more conventional techniques based on Lagrangians that we will use in this class.

1.4 Mathematical difficulties


One of the difficulties of learning quantum field theory is that many of the standard manipulations are
difficult to justify in a mathematically rigorous manner. These mathematical problems arise because in
quantum field theory there is formally an infinite number of degrees of freedom: at least one for each point
in space. This leads to two different kinds of divergences in mathematical expressions, short-distance or
“UV” divergences and long-distance or “IR” divergences. The former arise because in a finite volume of
space there are infinitely many degrees of freedom due to the continuous nature of space. These divergences
can be regularized by working on a spatial lattice as in figure 4. Conventionally the lattice spacing is called
a, and its inverse
$$\Lambda := \frac{1}{a} \tag{1.49}$$
is called the UV cutoff. IR divergences instead arise because the volume of flat space is infinite: these
divergences are present even in the presence of a spatial lattice, so to regulate them we need to make the
spatial volume V finite. In most cases we are only really able to make sense of quantum field theory in a
mathematically rigorous way when both Λ and V are finite. We then need to master the art of constructing
appropriate observables that stay finite in the limit that Λ and V both go to infinity. For example at finite
temperature the total energy goes to infinity as V → ∞ but the energy density stays finite. Similarly the
fluctuations of a field ϕ(x) go to infinity as Λ → ∞ but the fluctuations of a “smeared” field
$$\phi_f = \int d^dx\, f(x)\,\phi(x) \tag{1.50}$$

stay finite. In the latter f is taken to be a smooth function of compact support.
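
The contrast between a field at a point and a smeared field can be illustrated numerically. The sketch below assumes several things that are not derived in the notes at this stage: a free scalar field of mass m in 1 + 1 dimensions, the standard vacuum result $\langle\phi(x)^2\rangle = \int_{-\Lambda}^{\Lambda}\frac{dk}{2\pi}\frac{1}{2\sqrt{k^2+m^2}}$, smearing over space only at a fixed time, and a Gaussian smearing function (which is smooth but not compactly supported) chosen purely for convenience.

```python
import numpy as np

def trapezoid(y, x):
    """Plain trapezoid rule, written out to avoid depending on a NumPy version."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def field_variance(cutoff, mass=1.0, width=None, n_points=200001):
    """Vacuum variance of a free scalar in 1+1 dimensions with momentum cutoff `cutoff`.

    width=None gives the field at a point; otherwise the field is smeared in space
    with a unit-normalized Gaussian of the given width (an assumption of this sketch),
    whose squared Fourier transform is exp(-(k * width)**2).
    """
    k = np.linspace(-cutoff, cutoff, n_points)
    omega = np.sqrt(k**2 + mass**2)
    weight = 1.0 if width is None else np.exp(-(k * width) ** 2)
    return trapezoid(weight / (2.0 * omega), k) / (2.0 * np.pi)

cutoffs = (10.0, 100.0, 1000.0)
point_values = [field_variance(c) for c in cutoffs]
smeared_values = [field_variance(c, width=1.0) for c in cutoffs]
for c, p, s in zip(cutoffs, point_values, smeared_values):
    print(f"cutoff={c:7.1f}  point={p:.4f}  smeared={s:.6f}")
```

Under these assumptions the pointwise variance keeps growing (logarithmically) as the cutoff is raised, while the smeared variance settles to a cutoff-independent value.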

1.5 What this course is and is not


Finally I’ll make a few comments about the philosophy of this class. The traditional approach to teaching
quantum field theory is based on getting to perturbative calculations of scattering processes in the standard
model of particle physics as quickly as possible. For this approach see e.g. the books of Peskin and Schroeder
or Schwartz. This will not be the approach we take here. Although particle physics was the original arena of
interest for quantum field theory, today it has grown far beyond these beginnings. Quantum field theorists
today study many different quantum field theories in a variety of spacetime dimensions, and it would be a
mistake to focus so narrowly on one particular quantum field theory in one particular spacetime dimension.6
Indeed the traditional approach to teaching field theory is a bit like designing an electromagnetism class
to get to capacitors as quickly as possible, and then staying there for months. Our approach will instead
be to study quantum field theory as a general framework for analyzing many-body quantum systems. In
illustrating quantum field theory phenomena we will typically go for the simplest model that exhibits them,
and although we will sometimes use perturbation theory we will work non-perturbatively whenever it is
possible. Many of the most exciting quantum field theory phenomena such as confinement and duality
are fundamentally non-perturbative, and an overly perturbative class along the traditional lines would miss
them. We will eventually talk about the standard model of particle physics, since after all it is good to know
the fundamental laws of nature, but we will view it as one application among many rather than the main
goal of our labors.
6 In fact even for particle physics applications it turns out to be useful to work in d spacetime dimensions, as this enables us

to use ’t Hooft and Veltman’s loony (but brilliant) dimensional regularization method for computing Feynman diagrams.

1.6 Homework
1. Let’s get some practice using natural units:
(a) What is the mass of the sun measured in Joules? What about in electron volts? Recall that
$1\,\mathrm{eV} = 1.6 \times 10^{-19}\,\mathrm{J}$.
(b) What is one meter in inverse electron volts?
(c) The mass of a proton is $1.67 \times 10^{-27}$ kg. What is this in electron volts? What do we get if we
convert it to an inverse length? How does this length compare to the size of a nucleus?
(d) The mass of an electron is $9.1 \times 10^{-31}$ kg. What is this in electron volts? What do we get if
we convert it to an inverse length? How does this length compare to the size of an atom? Any
thoughts about how this comparison went versus the one for the nucleus?
(e) Say that a force is quoted to you in units of $\mathrm{eV}^2$. What factors of c and ℏ should you supply to
convert it back to Newtons?
(f) Say that an energy flux is quoted to you in units of $\mathrm{eV}^4$. What factors of c and ℏ should you
supply to convert it back to Joules per meter squared per second?

If you are having trouble with these, a good way to proceed is to remember that ℏ has units of energy
times time and c has units of length over time. So you can use c to convert all lengths to times, and
then use ℏ to convert all times to energies. Masses can be converted to energy by multiplying by $c^2$.
2. Show that the multiparticle states (1.23) are normalized correctly. You will need to use the cre-
ation/annihilation algebra. If you are having trouble I recommend showing it recursively.

3. If an operator a annihilates particles of charge q, what is the commutator of the symmetry charge Q
with $a$ and $a^\dagger$? What are $e^{-iQ\theta}\, a\, e^{iQ\theta}$ and $e^{-iQ\theta}\, a^\dagger\, e^{iQ\theta}$?
4. Show that the second line of (1.25) follows from the first.

5. Show that if we apply the Poincaré transformations $(\Lambda_1, a_1)$ and then $(\Lambda_2, a_2)$ in succession, the
resulting Poincaré transformation is $(\Lambda_2\Lambda_1, \Lambda_2 a_1 + a_2)$. Then show that the field transformation (1.40)
is consistent with the composition rule $U(\Lambda_2, a_2)U(\Lambda_1, a_1) = U(\Lambda_2\Lambda_1, \Lambda_2 a_1 + a_2)$ provided that the
matrix D obeys the Lorentz representation condition (1.41).
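
For the unit conversions in problem 1, the recipe of using c to trade lengths for times and ℏ to trade times for energies can be sketched in a few lines. The function names are hypothetical, and the worked number uses the charged pion mass, which is an example chosen here and is not one of the homework cases.

```python
HBAR_C_EV_M = 1.973269804e-7   # hbar * c in eV * m
C_M_PER_S = 299_792_458.0      # speed of light in m / s

def energy_ev_to_length_m(energy_ev):
    """Length in meters equal to 1/E in natural units (hbar = c = 1)."""
    return HBAR_C_EV_M / energy_ev

def energy_ev_to_time_s(energy_ev):
    """Time in seconds equal to 1/E in natural units."""
    return energy_ev_to_length_m(energy_ev) / C_M_PER_S

# Example not taken from the problem set: the charged pion, m = 139.57 MeV.
pion_mass_ev = 139.57e6
pion_length_fm = energy_ev_to_length_m(pion_mass_ev) / 1e-15
pion_time_s = energy_ev_to_time_s(pion_mass_ev)
print(f"1/m_pi = {pion_length_fm:.4f} fm = {pion_time_s:.3e} s")
```

The resulting length, a bit over one femtometer, is the familiar statement that the pion mass sets the range of the nuclear force.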

2 Lagrangian field theory
Having motivated the idea of quantum fields from various directions, we now commence studying them in
detail. We will begin with the classical theory of fields, starting from the Lagrangian point of view.7

2.1 Particle Lagrangians


We’ll first briefly consider the Lagrangian mechanics of interacting particles, whose trajectories are parametrized
by functions xa (t) with a = 1, 2, . . . , N . Depending on how we interpret a we can think of this as N particles
moving in one spatial dimension or as N/(d − 1) particles moving in (d − 1) spatial dimensions. We can
think of $x^a(t)$ as an N-component vector evolving in time, which we will notate as $x(t)$. The dynamics are
determined by the Lagrangian function $L(x, \dot{x}; t)$, with the rule being that physical trajectories are those
around which the action functional
$$S[x] := \int_{t_i}^{t_f} dt\, L\big(x(t), \dot{x}(t); t\big) \tag{2.1}$$

is stationary up to terms at the future/past boundaries.8 Note that the Lagrangian is local in time: at time
t it only depends on the positions and velocities of the particles at time t. We have included t as a separate
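argument in the Lagrangian to allow it to have some explicit time-dependence, for example through a
time-dependent background field that the particle is moving in. To study stationarity, we insert an infinitesimal
variation
$$x'(t) = x(t) + \delta x(t) \tag{2.2}$$
into the action:
$$\begin{aligned}
S[x'] &= S[x] + \sum_a \int_{t_i}^{t_f} dt \left( \frac{\partial L}{\partial x^a}\,\delta x^a(t) + \frac{\partial L}{\partial \dot{x}^a}\,\delta\dot{x}^a(t) \right) \\
&= S[x] + \sum_a \int_{t_i}^{t_f} dt \left[ \frac{\partial L}{\partial x^a} - \frac{d}{dt}\frac{\partial L}{\partial \dot{x}^a} \right] \delta x^a(t) + \left. \left( \sum_a \frac{\partial L}{\partial \dot{x}^a}\,\delta x^a(t) \right) \right|_{t_i}^{t_f}.
\end{aligned} \tag{2.3}$$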

The third term in the second line consists of a future boundary term and a past boundary term, so stationarity
means that the second term should vanish for all variations δxa (t). In other words the Euler-Lagrange
equations
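$$\frac{\partial L}{\partial x^a} = \frac{d}{dt}\frac{\partial L}{\partial \dot{x}^a} \tag{2.4}$$
must hold. For example if we have
$$L = \frac{m}{2}\dot{x}^2 - V(x), \tag{2.5}$$
then we must have
$$m\ddot{x}^a = -\frac{\partial V}{\partial x^a}. \tag{2.6}$$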
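
The statement that the action is stationary on solutions of the Euler-Lagrange equations, up to boundary terms, can be tested numerically. The sketch below is an illustration under stated assumptions: a single harmonic oscillator with $V = \frac{1}{2}m\omega^2x^2$, a simple midpoint discretization of the action, and a variation that vanishes at both endpoints so that the boundary term in (2.3) drops out.

```python
import numpy as np

def action(x, dt, m=1.0, omega=1.0):
    """Discretized action for L = (m/2) xdot^2 - (m omega^2 / 2) x^2."""
    velocity = np.diff(x) / dt
    midpoint = 0.5 * (x[1:] + x[:-1])
    lagrangian = 0.5 * m * velocity**2 - 0.5 * m * omega**2 * midpoint**2
    return np.sum(lagrangian) * dt

def directional_derivative(x, eta, dt, eps=1e-4):
    """First-order change of the action along the variation eta (central difference)."""
    return (action(x + eps * eta, dt) - action(x - eps * eta, dt)) / (2.0 * eps)

t = np.linspace(0.0, 2.0, 2001)
dt = t[1] - t[0]

x_classical = np.cos(t)                          # solves m xddot = -m omega^2 x
x_straight = 1.0 + (np.cos(2.0) - 1.0) * t / 2.0  # same endpoints, not a solution
eta = np.sin(np.pi * t / 2.0)                    # variation vanishing at t = 0 and t = 2

d_classical = directional_derivative(x_classical, eta, dt)
d_straight = directional_derivative(x_straight, eta, dt)
print(f"dS along eta: classical path {d_classical:.2e}, straight line {d_straight:.2e}")
```

On the classical path the first-order variation vanishes up to discretization error, while on the straight-line path with the same endpoints it does not.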
We can pass to the Hamiltonian formalism by introducing the canonical momenta
$$p_a = \frac{\partial L}{\partial \dot{x}^a}, \tag{2.7}$$
7 I’ll present the traditional approach that assumes the Lagrangian depends only on the fields and their first derivatives.

Later in the semester we will also be interested in theories with more derivatives in the Lagrangian: the traditional method for
dealing with this is to introduce auxiliary fields to rewrite the Lagrangian in a way that only involves first derivatives. For a
more modern approach that works directly with the original fields see my paper 1906.08616 with Jie-qiang Wu.
8 You may have been taught that the action should be stationary without qualification. This is true if we fix boundary

conditions at tf /ti , but doing that amounts to singling out some particular set of initial/final conditions. We are trying to
characterize the theory as a whole, so we shouldn’t bias the discussion by picking out some particular state of the system.

and the Hamiltonian is given by
$$H = \sum_a p_a \dot{x}^a - L. \tag{2.8}$$

To quantize we replace the Poisson bracket
$$\{x^a, p_b\} = \delta^a_b \tag{2.9}$$
by a commutator
$$[x^a, p_b] = i\delta^a_b, \tag{2.10}$$
which we represent on a Hilbert space spanned by eigenstates $|x\rangle$ of the $X^a$ operator:
$$X^a|x\rangle = x^a|x\rangle. \tag{2.11}$$

2.2 Field Lagrangians


Let’s now generalize from particles to fields. We will consider fields living in d-dimensional Minkowski space,
with spatial points labeled by a (d − 1)-vector

$$\vec{x} = (x^1, x^2, \ldots, x^{d-1}). \tag{2.12}$$

The field trajectories are given by functions ϕa (t, ⃗x), where a is a label that runs over some finite number
of fields. This can be viewed as a generalization of the previous subsection in two different ways. The first
way is that we are now allowing the trajectories to depend on space as well as time, in which case we go
back to the particle case by taking d = 1. The second way is that we can think of each field at each point in
space as a distinct particle, in which case we have generalized the previous subsection to an infinite number
of particles. Our notation is more closely aligned to the former interpretation, but the latter is valuable
conceptually because it makes clear that fundamentally we shouldn’t have to do anything for fields that we
didn’t already do for particles.
To specify the field dynamics we need a Lagrangian. As we are interested in constructing field theories
which respect microcausality (i.e. commutativity at spacelike separation), we should take this Lagrangian
to be an integral over space of a local Lagrangian density:
$$L[\phi; t] = \int d^{d-1}x\, \mathcal{L}\big(\phi(t,\vec{x}), \dot{\phi}(t,\vec{x}), \vec{\nabla}\phi(t,\vec{x}); t, \vec{x}\big). \tag{2.13}$$

Here $\mathcal{L}(\phi, \dot{\phi}, \vec{\nabla}\phi; t, \vec{x})$ is constructed out of the fields and their derivatives at $(t, \vec{x})$, and we have allowed for
explicit dependence on space and time. I’ve written $L[\phi; t]$ with square brackets to emphasize that it is a
functional: it is a function of the functions $\phi^a$ and $\dot{\phi}^a$ throughout the timeslice at time t. A simple example of
a Lagrangian density is the free scalar field Lagrangian, where we have a single field $\phi(t, \vec{x})$ with
$$\mathcal{L}(\phi, \dot{\phi}, \vec{\nabla}\phi) = \frac{1}{2}\left( \dot{\phi}^2 - \vec{\nabla}\phi\cdot\vec{\nabla}\phi - m^2\phi^2 \right), \tag{2.14}$$
where m is a parameter that we will see next time gives the mass of the particles created by this field. We
could introduce explicit space and time dependence by letting m depend on t and ⃗x.
To find the equations of motion we adopt the same principle as before: the action
$$S := \int_{t_i}^{t_f} dt\, d^{d-1}x\, \mathcal{L}\big(\phi(t,\vec{x}), \dot{\phi}(t,\vec{x}), \vec{\nabla}\phi(t,\vec{x}); t, \vec{x}\big) \tag{2.15}$$
should be stationary about physical field configurations up to future and past boundary terms. Computing
the variation, we find
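$$\begin{aligned}
\delta S &= \sum_a \int_{t_i}^{t_f} dt \int d^{d-1}x \left[ \frac{\partial\mathcal{L}}{\partial\phi^a}\,\delta\phi^a(t,\vec{x}) + \frac{\partial\mathcal{L}}{\partial\dot{\phi}^a}\,\delta\dot{\phi}^a(t,\vec{x}) + \frac{\partial\mathcal{L}}{\partial\vec{\nabla}\phi^a}\cdot\vec{\nabla}\delta\phi^a(t,\vec{x}) \right] \\
&= \sum_a \int_{t_i}^{t_f} dt \int d^{d-1}x \left[ \frac{\partial\mathcal{L}}{\partial\phi^a} - \partial_t\!\left(\frac{\partial\mathcal{L}}{\partial\dot{\phi}^a}\right) - \vec{\nabla}\cdot\left(\frac{\partial\mathcal{L}}{\partial\vec{\nabla}\phi^a}\right) \right] \delta\phi^a(t,\vec{x}) \\
&\quad + \sum_a \left. \left( \int d^{d-1}x\, \frac{\partial\mathcal{L}}{\partial\dot{\phi}^a}\,\delta\phi^a \right) \right|_{t_i}^{t_f} + \sum_a \int_{t_i}^{t_f} dt \int_{S^{d-2}_\infty} d\vec{A}\cdot\frac{\partial\mathcal{L}}{\partial\vec{\nabla}\phi^a}\,\delta\phi^a(t,\vec{x}).
\end{aligned} \tag{2.16}$$

The third line consists of future/past boundary terms and a spatial boundary term at the (d − 2)-sphere
$S^{d-2}_\infty$ at spatial infinity. The former are acceptable, but the latter needs to vanish in order for the theory to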
make sense. The usual way to deal with this is to impose spatial boundary conditions requiring the fields to
vanish at infinity, in which case the variations δϕa must also vanish at infinity and so this term vanishes.9
We therefore see that the action will be stationary (up to future/past terms) if the Euler-Lagrange equations

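$$\partial_t\!\left(\frac{\partial\mathcal{L}}{\partial\dot{\phi}^a}\right) + \vec{\nabla}\cdot\left(\frac{\partial\mathcal{L}}{\partial\vec{\nabla}\phi^a}\right) = \frac{\partial\mathcal{L}}{\partial\phi^a} \tag{2.17}$$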

are satisfied throughout spacetime. For example for our free scalar field Lagrangian we have
$$\ddot{\phi} - \nabla^2\phi = -m^2\phi, \tag{2.18}$$
which is a massive version of the wave equation known as the Klein-Gordon equation.
As in the particle case we can also introduce a canonical momentum
$$\pi_a \equiv \frac{\partial\mathcal{L}}{\partial\dot{\phi}^a}, \tag{2.19}$$
in terms of which the Hamiltonian is given by
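$$H[\phi; t] = \int d^{d-1}x\, \mathcal{H}\big(\phi(t,\vec{x}), \dot{\phi}(t,\vec{x}), \vec{\nabla}\phi(t,\vec{x}); t, \vec{x}\big) \tag{2.20}$$

with Hamiltonian density
$$\mathcal{H}(\phi, \dot{\phi}, \vec{\nabla}\phi; t, \vec{x}) = \sum_a \pi_a\dot{\phi}^a - \mathcal{L}(\phi, \dot{\phi}, \vec{\nabla}\phi; t, \vec{x}). \tag{2.21}$$

To complete the construction of the Hamiltonian formalism we need to solve (2.19) to determine $\dot{\phi}$ in terms
of $\phi$ and $\pi$. Sometimes this is not possible due to constraints, in which case more sophisticated methods are
needed that we will return to later. For the free scalar field there is no problem: we simply have
$$\pi = \dot{\phi} \tag{2.22}$$
and
$$\mathcal{H} = \frac{1}{2}\left( \pi^2 + \vec{\nabla}\phi\cdot\vec{\nabla}\phi + m^2\phi^2 \right). \tag{2.23}$$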
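
As a numerical companion to (2.18), (2.22) and (2.23), the sketch below evolves a free scalar field in 1 + 1 dimensions in first-order form, $\dot{\phi} = \pi$ and $\dot{\pi} = \nabla^2\phi - m^2\phi$, and monitors the lattice version of the energy. The periodic lattice, the leapfrog integrator, the Gaussian initial profile and all parameter values are assumptions made for illustration.

```python
import numpy as np

def lattice_laplacian(phi, dx):
    """Second difference on a periodic lattice."""
    return (np.roll(phi, -1) - 2.0 * phi + np.roll(phi, 1)) / dx**2

def energy(phi, pi, dx, m):
    """Lattice version of H = (1/2) * integral of (pi^2 + (grad phi)^2 + m^2 phi^2)."""
    grad = (np.roll(phi, -1) - phi) / dx
    return 0.5 * np.sum(pi**2 + grad**2 + m**2 * phi**2) * dx

def evolve(phi, pi, dx, dt, m, n_steps):
    """Leapfrog integration of phi_dot = pi, pi_dot = laplacian(phi) - m^2 phi."""
    for _ in range(n_steps):
        pi = pi + 0.5 * dt * (lattice_laplacian(phi, dx) - m**2 * phi)
        phi = phi + dt * pi
        pi = pi + 0.5 * dt * (lattice_laplacian(phi, dx) - m**2 * phi)
    return phi, pi

n_sites, box, m = 128, 2.0 * np.pi, 1.0
dx = box / n_sites
x = dx * np.arange(n_sites)
phi0 = np.exp(-((x - np.pi) ** 2) / (2.0 * 0.5**2))  # Gaussian bump, initially at rest
pi0 = np.zeros(n_sites)

e_initial = energy(phi0, pi0, dx, m)
phi1, pi1 = evolve(phi0, pi0, dx, dt=0.005, m=m, n_steps=2000)
e_final = energy(phi1, pi1, dx, m)
print(f"relative energy drift: {abs(e_final - e_initial) / e_initial:.2e}")
```

The field profile changes substantially over the run while the energy stays constant up to a small integrator error, as expected for a time-independent Hamiltonian.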
9 It is also interesting to consider field theories in finite volume, in which case there is more to say about this term. For
example we could impose Neumann boundary conditions $\hat{n}\cdot\frac{\partial\mathcal{L}}{\partial\vec{\nabla}\phi^a} = 0$ instead of Dirichlet boundary conditions $\delta\phi^a = 0$
and it would still vanish. More generally we can take the action to include additional boundary terms at spatial infinity,
whose variation is designed to cancel the spatial boundary term in (2.16) when we impose the boundary conditions of interest.
Ultimately the choice of spatial boundary conditions is part of the definition of the theory: a box with Dirichlet boundary
conditions is a different physical system from one with Neumann boundary conditions.

Once the Hamiltonian formalism is constructed, we can then quantize the theory by converting the
equal-time Poisson brackets

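$$\begin{aligned}
\{\phi^a(t,\vec{x}), \phi^b(t,\vec{y})\} &= 0 \\
\{\pi_a(t,\vec{x}), \pi_b(t,\vec{y})\} &= 0 \\
\{\phi^a(t,\vec{x}), \pi_b(t,\vec{y})\} &= \delta^a_b\, \delta^{d-1}(\vec{x} - \vec{y})
\end{aligned} \tag{2.24}$$

to commutators in the usual way and then representing them on a Hilbert space spanned by field eigenstates
$|\phi\rangle$ obeying
$$\Phi^a(0, \vec{x})|\phi\rangle = \phi^a(\vec{x})|\phi\rangle. \tag{2.25}$$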
We will carry out this procedure in detail for the free scalar field next time, but we note that in a Lorentz-
invariant theory the commutativity at spatial separation we see here extends to commutativity at spacelike
separation.

2.3 Relativistic notation


It is now convenient to introduce relativistic notation by combining space and time into a d-vector

x = (t, ⃗x). (2.26)

We will write the components of $\vec{x}$ as $x^i$, with $i = 1, 2, \ldots, d-1$, and the components of x as $x^\mu$, with
$\mu = 0, 1, \ldots, d-1$. By definition we have
$$x^0 = t. \tag{2.27}$$
If we were mathematicians we would be zealous in adhering to using x for the vector and $x^\mu$ for its components,
but in physics there is a longstanding tradition of conflating the two since writing $x^\mu$ instead of x has
the convenient feature of reminding us what kind of object we are talking about (and after all if you know
its components then you know the vector).10 The inner product of two d-vectors $v^\mu$ and $u^\mu$ is given by
$$u\cdot v = u^\mu v^\nu \eta_{\mu\nu}, \tag{2.28}$$

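where11
$$\eta_{\mu\nu} := \begin{pmatrix}
-1 & 0 & 0 & \cdots & 0 \\
0 & 1 & 0 & \cdots & 0 \\
0 & 0 & 1 & \cdots & 0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
0 & 0 & 0 & \cdots & 1
\end{pmatrix} \tag{2.29}$$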
is the d-dimensional Minkowski metric and we are using the Einstein summation convention that sums
over pairs of repeated indices automatically. This inner product is preserved under Lorentz transformations

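$$u'^\mu = \Lambda^\mu{}_\nu u^\nu, \qquad v'^\mu = \Lambda^\mu{}_\nu v^\nu,$$

where $\Lambda^\mu{}_\nu$ is any d × d matrix that obeys
$$\Lambda^\mu{}_\alpha \Lambda^\nu{}_\beta\, \eta_{\mu\nu} = \eta_{\alpha\beta}. \tag{2.30}$$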


10 This abuse of notation is sometimes formalized by using “abstract index notation”, which writes the abstract vector as $x^a$
and its components as $x^\mu$. This is done for example in Wald’s book. We already have enough kinds of index to be getting on
with however, so we will stick to being somewhat cavalier about the difference between x and $x^\mu$ (and analogously $\vec{x}$ and $x^i$).
A similar remark applies about the difference between a function f and the evaluation f (x) of that function on an element x
of its domain, which we have already conflated several times.
11 Some benighted particle theorists use a horrid “mostly-minus” convention for $\eta_{\mu\nu}$ that reverses its overall sign, and in this
context our convention is called “mostly-plus”. In general life is not improved by increasing the number of minus signs, and
that is absolutely the case here.

Indeed we have

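$$\begin{aligned}
u'\cdot v' &= \Lambda^\mu{}_\alpha u^\alpha\, \Lambda^\nu{}_\beta v^\beta\, \eta_{\mu\nu} \\
&= \eta_{\alpha\beta}\, u^\alpha v^\beta \\
&= u\cdot v.
\end{aligned} \tag{2.31}$$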

The set of d × d matrices obeying (2.30) is called the Lorentz group, and we will have lots to say about it
later.
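
The defining condition (2.30) and the invariance (2.31) are easy to check numerically. The sketch below assumes d = 4 and uses one boost and one rotation with arbitrary parameters; the function names and the sample vectors are illustrative only. In matrix form, (2.30) reads $\Lambda^T\eta\Lambda = \eta$.

```python
import numpy as np

ETA = np.diag([-1.0, 1.0, 1.0, 1.0])  # mostly-plus Minkowski metric, d = 4 assumed

def boost_x(rapidity):
    """Boost along x^1, written as the matrix Lambda^mu_nu."""
    lam = np.eye(4)
    lam[0, 0] = lam[1, 1] = np.cosh(rapidity)
    lam[0, 1] = lam[1, 0] = np.sinh(rapidity)
    return lam

def rotation_xy(angle):
    """Rotation in the x^1-x^2 plane."""
    lam = np.eye(4)
    lam[1, 1] = lam[2, 2] = np.cos(angle)
    lam[1, 2] = -np.sin(angle)
    lam[2, 1] = np.sin(angle)
    return lam

def is_lorentz(lam):
    """Check Lambda^T eta Lambda = eta, the matrix form of (2.30)."""
    return np.allclose(lam.T @ ETA @ lam, ETA)

def inner(u, v):
    """Inner product u . v = u^mu v^nu eta_mu_nu."""
    return u @ ETA @ v

lam = boost_x(0.7) @ rotation_xy(0.3)  # products of Lorentz matrices are again Lorentz
u = np.array([2.0, 1.0, -0.5, 3.0])
v = np.array([1.0, 0.2, 4.0, -1.0])
print(is_lorentz(lam), inner(u, v), inner(lam @ u, lam @ v))
```

A generic matrix, for example a uniform rescaling of all coordinates, fails the same test, which is what singles out the Lorentz group among all d × d matrices.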
It is also convenient to introduce d-component objects with a lowered Lorentz index, called one-forms,
which transform as
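$$\omega'_\mu = \Lambda_\mu{}^\nu \omega_\nu. \tag{2.32}$$
Here $\Lambda_\nu{}^\mu$ indicates the transpose of the inverse of $\Lambda^\nu{}_\mu$, meaning that it obeys
$$\Lambda^\nu{}_\mu \Lambda_\nu{}^\alpha = \delta^\alpha_\mu \tag{2.33}$$
with $\delta^\alpha_\mu$ being the Kronecker delta that is equal to one if α = µ and zero otherwise. A simple example of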
a one-form is the scalar gradient

$$\partial_\mu\phi = (\dot{\phi}, \vec{\nabla}\phi), \tag{2.34}$$
which transforms with the inverse-transpose of Λ because the partial derivative transforms opposite to the
spacetime coordinates. We can compute the inner product of two one-forms by using the inverse metric $\eta^{\mu\nu}$,
which again is diagonal with diagonal elements (−1, 1, . . . , 1):
$$\omega\cdot\sigma = \omega_\mu \sigma_\nu \eta^{\mu\nu}. \tag{2.35}$$
You will show on the homework that we can use the metric to turn a vector into a one-form and the inverse
metric to turn a one-form into a vector by “lowering” and “raising” the indices
$$u_\mu := \eta_{\mu\nu}u^\nu, \qquad \omega^\mu := \eta^{\mu\nu}\omega_\nu, \tag{2.36}$$
and also that the inverse-transpose Lorentz transformation $\Lambda_\nu{}^\mu$ is indeed obtained by raising/lowering the
indices of the original Lorentz transformation $\Lambda^\nu{}_\mu$ in this way.
Using this notation we can write the free scalar field Lagrangian density more elegantly in a few different
ways,
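$$\begin{aligned}
\mathcal{L}(\phi, \partial\phi) &= -\frac{1}{2}\left( \partial_\mu\phi\,\partial_\nu\phi\,\eta^{\mu\nu} + m^2\phi^2 \right) \\
&= -\frac{1}{2}\left( \partial_\mu\phi\,\partial^\mu\phi + m^2\phi^2 \right) \\
&= -\frac{1}{2}\left( \partial\phi\cdot\partial\phi + m^2\phi^2 \right),
\end{aligned} \tag{2.37}$$
and the Klein-Gordon equation becomes
$$\left( \partial^2 - m^2 \right)\phi = 0. \tag{2.38}$$
More generally we can write the action as
$$S[\phi; x] = \int d^dx\, \mathcal{L}\big(\phi(x), \partial\phi(x); x\big) \tag{2.39}$$
and the Euler-Lagrange equations as
$$\partial_\mu\left( \frac{\partial\mathcal{L}}{\partial\partial_\mu\phi^a} \right) = \frac{\partial\mathcal{L}}{\partial\phi^a}. \tag{2.40}$$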

2.4 Symmetries and currents
One of the most important advantages of the Lagrangian formalism is the close relationship between sym-
metries and conserved quantities. By definition a symmetry in the Lagrangian formalism is an invertible
change of variables
$$\phi'^a(x) = F[\phi] \tag{2.41}$$
which leaves the action invariant up to future/past boundary terms. A particularly interesting kind of
symmetry is a continuous symmetry, which is a family of symmetries Fθ labeled by a continuous parameter
θ such that when θ = 0 the transformation Fθ is the identity. In particular we can take θ to be infinitesimal,
in which case we’ll call it ϵ, and we then have a field theory version of Noether’s theorem: any infinitesimal
transformation of the fields that leaves the action invariant up to future/past boundary terms leads to a
conserved current. Indeed consider an infinitesimal transformation12
ϕ′a (x) = ϕa (x) + ϵδS ϕa (x) (2.42)
that to first order in ϵ leaves the action (2.39) invariant up to future/past boundary terms.13 Here δS ϕa
is some function of ϕa and its derivatives at x, and possibly also x itself explicitly. One way for this to
happen is for the Lagrangian density itself to be invariant, but more generally its transformation could be
the divergence of a vector since that would still integrate to a future/past boundary term (assuming that the
spatial boundary terms vanish). More explicitly, in order for the transformation (2.42) to be a symmetry we
need
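$$\delta_S\mathcal{L} = \sum_a\left(\frac{\partial\mathcal{L}}{\partial\phi^a}\,\delta_S\phi^a + \frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi^a)}\,\partial_\mu\delta_S\phi^a\right) = \partial_\mu\alpha^\mu, \tag{2.43}$$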
where αµ is some local function of ϕ and ∂ϕ. We can rewrite this expression as
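$$\partial_\mu\left(\sum_a\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi^a)}\,\delta_S\phi^a - \alpha^\mu\right) = \sum_a\left(\partial_\mu\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi^a)} - \frac{\partial\mathcal{L}}{\partial\phi^a}\right)\delta_S\phi^a, \tag{2.44}$$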

and then observe that the right-hand side vanishes for field configurations ϕa (x) that obey the Euler-Lagrange
equations (2.40). In other words we see that the Noether current
$$J^\mu(x) := -\sum_a\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi^a)}\,\delta_S\phi^a(x) + \alpha^\mu(x) \tag{2.45}$$

obeys the conservation law


∂µ J µ = 0. (2.46)
Writing this equation in non-relativistic notation we have
$$\vec\nabla\cdot\vec J = -\dot J^0, \tag{2.47}$$
which is precisely the continuity equation familiar from electromagnetism. In the usual way it implies that
the charge
$$Q(t) = \int d^{d-1}x\,J^0(t,\vec x) \tag{2.48}$$

is independent of time. We have chosen the overall sign and normalization of J µ so that Q is the generator
of the symmetry in the sense that for any observable O we have the Poisson bracket14
{Q, O} = δS O. (2.49)
12 It is not obvious, but every infinitesimal symmetry can be “exponentiated” to produce a continuous symmetry so the two
ideas are equivalent.
13 It is important here that the action needs to be invariant for any ϕa (x), not just solutions of the equations of motion. In

the latter case the action is always invariant to first order under any continuous transformation!
14 This Poisson bracket is easy to derive when δS ϕ depends only on ϕ and not its derivatives and α = 0. The general case is
tricky and I haven’t found a textbook discussion; the only derivation I know is given in section 4.2 of 1906.08616.

After quantization this becomes
[Q, O] = iδS O. (2.50)
As a simple example of this construction let’s return to our free scalar theory and now set the mass m
to zero. The Lagrangian density is then invariant under the shift symmetry

ϕ′ (x) = ϕ(x) + ϵ, (2.51)

so we have a continuous symmetry with

δS ϕ = 1
αµ = 0. (2.52)

The Noether current is given by


J µ = ∂ µ ϕ, (2.53)
and the conservation law follows immediately from the (massless) wave equation ∂ 2 ϕ = 0.

2.5 Noether currents for Poincaré symmetry


A more sophisticated example of a continuous symmetry in field theory is Poincaré symmetry, which is
the full set of symmetries obtained by combining Lorentz transformations and spacetime translations. A
general Poincaré transformation can be put in the form

$$x'^\mu = \Lambda^\mu{}_\nu\,x^\nu + a^\mu, \tag{2.54}$$

where Λ is a Lorentz transformation obeying (2.30) and a is an arbitrary vector, but to interpret it as a
dynamical symmetry for Noether’s theorem we need to recast it as a transformation of the fields rather than
the coordinates.15 On a scalar field the transformation is simple to write down: we have

ϕ′ (x) = ϕ(Λ−1 (x − a)). (2.55)

Here the inverse transformation appears inside the field so that the composition of Poincaré transformations
works out correctly, as you showed on the previous homework.
To apply Noether’s theorem we need to understand infinitesimal Lorentz transformations, meaning we
should write
Λµν = δνµ + ϵω µν (2.56)
and substitute into (2.30) to see what the constraints are on ω µν . Indeed we have

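$$\left(\delta^\mu_\alpha + \epsilon\,\omega^\mu{}_\alpha\right)\left(\delta^\nu_\beta + \epsilon\,\omega^\nu{}_\beta\right)\eta_{\mu\nu} = \eta_{\alpha\beta} + \epsilon\left(\omega_{\alpha\beta} + \omega_{\beta\alpha}\right) + O(\epsilon^2), \tag{2.57}$$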

so for (2.30) to hold we need ω with both indices down to be antisymmetric:

ωβα = −ωαβ . (2.58)

Including also an infinitesimal translation aµ = ϵbµ we therefore have

ϕ(Λ−1 (x − a)) = ϕ(x − ϵ(b + ωx) + O(ϵ2 ))


= ϕ(x) − ϵ(bν + ω να xα )∂ν ϕ(x) + O(ϵ2 ), (2.59)
15 The viewpoint where the fields transform and the coordinates stay the same is sometimes called the active viewpoint, to be

distinguished from a passive viewpoint where the coordinates transform and the fields stay the same. As in many situations,
here it is better to be active.

Figure 5: Killing vector fields for a spacetime translation, a spatial rotation, and a boost.

and thus

δS ϕ(x) = −(bµ + ω µα xα )∂µ ϕ(x)


= −ξ µ (x)∂µ ϕ(x), (2.60)

where in the second line we have introduced a Killing vector field

ξ µ (x) := bµ + ω µα xα (2.61)

that we can view as pointing in the direction of the infinitesimal Poincaré transformation in question. For
example an infinitesimal boost in the x1 direction has ω 01 = ω 10 = 1 and thus

ξ µ = (x1 , x0 , 0, . . . , 0), (2.62)

while a pure rotation in the 12 plane has ω 21 = −ω 12 = 1, and thus

ξ µ = (0, −x2 , x1 , 0, . . . , 0). (2.63)

These Killing vector fields are illustrated in figure 5. More generally a Killing vector field is by definition a
vector field for which
∂µ ξν + ∂ν ξµ = 0, (2.64)
as you can easily check is the case here.16 Contracting this equation with the inverse metric, we also see that

∂µ ξ µ = 0. (2.65)

In Minkowski space (2.61) gives the full set of Killing vectors. It is spanned by d − 1 infinitesimal boosts,
(d − 1)(d − 2)/2 infinitesimal rotations, and d infinitesimal spacetime translations. For d = 4 this gives three
boosts (in the x, y, and z directions), three rotations (in the xy, yz, and zx planes), three space translations
(in the x, y, and z directions), and one time translation.
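The counting above and the Killing equation (2.64) can be checked numerically. The sketch below is only illustrative: the function names, the random choice of ω and b, and the finite-difference step are assumptions made for the example.

```python
import numpy as np

def killing_field(b, omega_mixed):
    """Return xi^mu(x) = b^mu + omega^mu_alpha x^alpha as a function of x."""
    return lambda x: b + omega_mixed @ x

def num_killing_vectors(d):
    """d translations + (d-1) boosts + (d-1)(d-2)/2 rotations."""
    return d + (d - 1) + (d - 1) * (d - 2) // 2

d = 4
rng = np.random.default_rng(0)
eta = np.diag([-1.0] + [1.0] * (d - 1))

# Random antisymmetric omega_{mu nu}; raise the first index with eta^{mu nu}.
A = rng.normal(size=(d, d))
omega_lower = A - A.T
omega_mixed = np.linalg.inv(eta) @ omega_lower
b = rng.normal(size=d)
xi = killing_field(b, omega_mixed)

# Central differences: grad[mu, nu] approximates d_mu xi_nu, xi_nu = eta_{nu a} xi^a.
x0, h = rng.normal(size=d), 1e-5
grad = np.empty((d, d))
for mu in range(d):
    step = np.zeros(d)
    step[mu] = h
    grad[mu] = eta @ (xi(x0 + step) - xi(x0 - step)) / (2 * h)

killing_residual = np.max(np.abs(grad + grad.T))   # d_mu xi_nu + d_nu xi_mu
divergence = np.trace(np.linalg.inv(eta) @ grad)    # d_mu xi^mu
print(num_killing_vectors(d), killing_residual, divergence)
```

For d = 4 the count is ten, matching the three boosts, three rotations, and four translations listed above.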
By definition a theory which is Poincaré-invariant is one whose Lagrangian density is a scalar under
Poincaré transformations, meaning that
δS L = −ξ µ ∂µ L (2.66)
for any Killing vector ξ µ . You will check this equation (somewhat laboriously) for our free scalar theory in
the homework. Since ξ µ is a Killing vector, by equation (2.65) we have

ξ µ ∂µ L = ∂µ (ξ µ L) (2.67)
16 The motivation for this definition is that an infinitesimal coordinate transformation x′µ = xµ + ξ µ (x) leaves the spacetime

metric ηµν invariant if and only if ξ µ obeys (2.64).

and thus
δS L = ∂µ αµ (2.68)
with
αµ = −ξ µ L. (2.69)
For any Killing vector ξ µ we therefore have a conserved Noether current
$$J^\mu_\xi(x) = -\sum_a\frac{\partial\mathcal{L}}{\partial(\partial_\mu\phi^a)}\,\delta_S\phi^a(x) - \xi^\mu\mathcal{L}(x). \tag{2.70}$$

In the free scalar theory we can evaluate this, giving


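$$
\begin{aligned}
J^\mu_\xi &= -\xi^\alpha\,\partial^\mu\phi\,\partial_\alpha\phi + \frac{1}{2}\xi^\mu\left(\partial_\alpha\phi\,\partial^\alpha\phi + m^2\phi^2\right) \\
&= -\xi_\nu T^{\mu\nu},
\end{aligned}
\tag{2.71}
$$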

where
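$$T^{\mu\nu} := \partial^\mu\phi\,\partial^\nu\phi - \frac{1}{2}\eta^{\mu\nu}\left(\partial_\alpha\phi\,\partial^\alpha\phi + m^2\phi^2\right) \tag{2.72}$$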
is called the energy-momentum tensor. It has two nice properties:
(1) Symmetry:
T µν = T νµ (2.73)

(2) Conservation:
∂µ T µν = 0. (2.74)

Indeed any tensor obeying these two properties has the feature that contracting it with a Killing vector gives
a conserved current:
∂µ (ξν T µν ) = ∂µ ξν T µν + ξν ∂µ T µν = 0. (2.75)
It is not obvious from (2.70) that we can in general write Jξµ in terms of a symmetric conserved energy-momentum
tensor in this way, since when there are fields that are not scalars δS ϕa can involve derivatives
of ξ µ , but it turns out that when such derivatives appear they can always be removed by shifting Jξµ by a
local term whose divergence is identically zero.17 The resulting energy-momentum tensor has a more elegant
equivalent definition as the derivative of the action with respect to the spacetime metric:
$$S[\phi, \eta_{\mu\nu} + \epsilon h_{\mu\nu}] = S[\phi, \eta_{\mu\nu}] + \frac{\epsilon}{2}\int d^dx\,T^{\mu\nu}(x)\,h_{\mu\nu}(x) + O(\epsilon^2). \tag{2.76}$$
The metric is a symmetric tensor so this T µν obeys condition (1) automatically, and with a little more
differential geometry than we are requiring for this class you can also show that it obeys condition (2)
provided that there are no Lorentz-violating background fields.
We can understand the physical meaning of the energy momentum tensor by looking at the Noether
currents for pure spacetime translations with ωµν = 0. By definition the total momentum vector P λ , which
is the generator of spacetime translations, is given by
$$\int d^{d-1}x\,J^0_\xi = -\xi_\lambda P^\lambda. \tag{2.77}$$
Therefore we have
$$P^\mu = \int d^{d-1}x\,T^{0\mu}, \tag{2.78}$$

17 See section 7.4 of Weinberg for the case where the Lagrangian has only first derivatives, as we’ve been considering here, or

appendix A of 2108.04841 for the general case.

so we can think of T 00 as the energy density and T 0i as the momentum density (hence the name of the
tensor). And indeed for our free scalar theory, from (2.72) we have
$$T^{00} = \frac{1}{2}\left(\dot\phi^2 + \vec\nabla\phi\cdot\vec\nabla\phi + m^2\phi^2\right), \tag{2.79}$$
consistent with the Hamiltonian density (2.23). We can also define generators of pure Lorentz transformations
(with bµ = 0) via
$$\int d^{d-1}x\,J^0_\xi = \frac{1}{2}\omega_{\mu\nu}J^{\mu\nu}, \tag{2.80}$$
which gives
$$J^{\mu\nu} = \int d^{d-1}x\left(x^\mu T^{0\nu} - x^\nu T^{0\mu}\right). \tag{2.81}$$

Here J ij is the angular momentum for a rotation in the ij plane, while J i0 is the generator of a boost in the
i direction.

2.6 Homework
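1. Show that $\Lambda_\mu{}^\nu = \eta_{\mu\alpha}\eta^{\nu\beta}\Lambda^\alpha{}_\beta$ is indeed the inverse of $\Lambda^\mu{}_\nu$, in the sense that $\Lambda^\mu{}_\lambda\Lambda_\nu{}^\lambda = \delta^\mu_\nu$.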
2. Show that the gradient ∂µ ϕ transforms as a one-form under the Poincaré transformation ϕ′ (x) =
ϕ(Λ−1 (x − a)).
3. Show that if V µ is a vector then Vµ = ηµν V ν transforms as a one-form, and also that if ωµ is a one-form
then ω µ = η µν ων transforms as a vector.
4. The Lagrangian density for Maxwell theory is
$$\mathcal{L} = -\frac{1}{4}F^{\mu\nu}F_{\mu\nu},$$
where
Fµν = ∂µ Aν − ∂ν Aµ
is the field strength tensor and Aµ is a one-form usually called the gauge potential or gauge field. The
relationship between Aµ and the usual scalar potential ϕ and vector potential $\vec A$ is that $A_\mu = (-\phi, \vec A)$.

(a) Write out the Euler-Lagrange field equations which follow from the Maxwell Lagrangian. Use the
relativistic variables Aµ and Fµν .
(b) For d = 4, give expressions for the components of Fµν in terms of the usual electric and magnetic
fields $\vec E$ and $\vec B$, and use these to rewrite the equations of motion in terms of $\vec E$ and $\vec B$. How
do these relate to Maxwell’s equations? Did you get all four equations, and if not where do the
others come from?
(c) Now add a term Aµ J µ to the Lagrangian density, where $J^\mu = (\rho, \vec J)$ is the spacetime electric
current. Show how this modifies the equations of motion, and check that for d = 4 it gives the
correct charge and current terms in Maxwell’s equations. In this part you can view J µ as a
“background” current, meaning that when you compute the variation of the action you can take
its variation to be zero. Eventually we will build J µ out of other fields which create charged
particles, but this does not affect the equation of motion obtained by varying Aµ .
5. The Lagrangian density for a complex free scalar field is given by

L = −∂ µ ϕ∗ ∂µ ϕ − m2 ϕ∗ ϕ.

(a) Find the Euler-Lagrange equations for this action. In principle in computing variations you should
treat the independent fields as the real and imaginary parts of ϕ, but your life will be easier if you
can convince yourself that you can instead treat ϕ and ϕ∗ as the independent variables. Convince
yourself that you indeed can do this for a general Lagrangian density L(ϕ, ϕ∗ , ∂ϕ, ∂ϕ∗ ).
(b) Show that the transformation ϕ′ (x) = eiθ ϕ(x) is a symmetry for any θ, write out its infinitesimal
version (i.e. to linear order in θ), and construct the associated Noether current. Confirm explicitly
that this current is conserved as a consequence of the equations of motion. You again will do
better to view ϕ and ϕ∗ as the independent fields.
(c) Write an expression for the conserved symmetry charge Q, and check that it indeed generates the
symmetry transformation as in equation (2.49).
6. Show explicitly that the free real scalar Lagrangian density obeys the invariance condition (2.66) under
the infinitesimal transformation δS ϕ = −ξ µ ∂µ ϕ for any Killing vector ξ µ .
7. (extra credit) The action of a free scalar field in a general metric gµν is given by
$$S = -\frac{1}{2}\int d^dx\,\sqrt{-g}\left(\partial_\mu\phi\,\partial_\nu\phi\,g^{\mu\nu} + m^2\phi^2\right),$$

where g indicates the determinant of the matrix gµν and g µν is its inverse. Show that if we take
gµν = ηµν + ϵhµν , the energy momentum tensor we construct as in equation (2.76) is the same one we
found from the Noether current. To do this you need to look up or derive how the determinant and
inverse of a matrix respond to a small change in the matrix.
8. (extra credit) The Maxwell action in a general metric is

$$S = -\frac{1}{4}\int d^dx\,\sqrt{-g}\,F_{\mu\nu}F_{\alpha\beta}\,g^{\mu\alpha}g^{\nu\beta}.$$
What is the energy-momentum tensor which follows from varying this action with respect to gµν ? For
d = 4 write T00 in terms of the electric and magnetic fields; does the answer look familiar?

3 Quantization of a free scalar field
The previous lecture was rather formal. Formalism is good for organizing one’s thinking, but to really
understand things you need to get your hands dirty. In this lecture and the following one we will carry out
in detail the canonical quantization of a free scalar field in d spacetime dimensions, with Lagrangian density
$$\mathcal{L} = -\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \frac{m^2}{2}\phi^2. \tag{3.1}$$
The word “free” here means that the Lagrangian is quadratic in the fields - we’ll see that this implies that
the particles in this theory are non-interacting. For now we will take ϕ to be real-valued; we will discuss
soon how to generalize to the case of complex ϕ. The free scalar field is both simple and profound: it is
exactly solvable, and yet it illustrates many of the deep aspects of quantum field theory that we will return
to again and again. Before beginning it is worth emphasizing that this model is not only of interest as an
example: it has many physical realizations. Some examples in various dimensions:
• The Higgs boson in the Standard Model of particle physics, discovered in 2012 at the LHC, is to first
approximation described by a free scalar field with d = 4 and m = 125 GeV.
• Helium-4 (⁴He) at low temperature and standard pressure is a special kind of liquid, called a superfluid,
which flows with zero viscosity. The low-energy excitations of this liquid are density waves called
phonons, and they are described by a free scalar field theory with d = 4 and m = 0. If we confine
Helium-4 to a two-dimensional surface, then it is described by a free scalar field with d = 3 and m = 0.
• The protons and neutrons in nuclei are held together by exchanging particles called pions, and these
pions are governed at low energy by free scalar fields with d = 4 and m ≈ 135 MeV (for the π0) and
m ≈ 140 MeV (for the π±). The π0 is a real scalar field, while the π± are complex (as we will introduce
below).
• In string theory the embedding of the string worldsheet into spacetime is described using free scalar
fields with d = 2 and m = 0.
In fact the 2016 Nobel prize in physics was awarded in substantial part for understanding the d = 2 version
of this theory!

3.1 Canonical commutation relations and wave functionals


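Let’s first recall the Hamiltonian formulation of the free scalar: defining a canonical conjugate momentum
$$\pi = \frac{\partial\mathcal{L}}{\partial\dot\phi} = \dot\phi, \tag{3.2}$$
we have the Hamiltonian density
$$\mathcal{H} = \frac{1}{2}\pi^2 + \frac{1}{2}\vec\nabla\phi\cdot\vec\nabla\phi + \frac{m^2}{2}\phi^2. \tag{3.3}$$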
Lifting the classical fields ϕ(x) and π(x) to quantum operators Φ(x) and Π(x) and their Poisson brackets to
commutators, we have the algebra18
[Φ(t, ⃗x), Φ(t, ⃗y )] = 0
[Π(t, ⃗x), Π(t, ⃗y )] = 0
[Φ(t, ⃗x), Π(t, ⃗y )] = iδ d−1 (⃗x − ⃗y ). (3.4)
18 You may wonder why the third commutator has a δ-function on the right-hand side instead of some kind of Kronecker-δ

with continuous indices. This is because we defined π as a partial derivative of the Lagrangian density, as opposed to a partial
derivative of the Lagrangian. The latter actually vanishes since it is multiplied by the infinitesimal dd−1 x, so in field theory
it is better to use the former. With a lattice regulator they are related by a power of the lattice spacing a, as we will see in a
moment.

The first step of canonical quantization is to represent this algebra on a Hilbert space, which in the particle
case we take to be the vector space of square-normalizable wave functions. We can do the same thing here,
but we need to introduce a space of normalizable wave functionals

Ψ[ϕ] = ⟨ϕ|Ψ⟩, (3.5)

where |ϕ⟩ labels a complete eigenbasis for Φ(⃗x) := Φ(0, ⃗x):

Φ(⃗x)|ϕ⟩ = ϕ(⃗x)|ϕ⟩. (3.6)

Note that the states |ϕ⟩ are labeled by functions ϕ : Rd−1 → R, so Ψ is indeed a functional (a function of
a function). In order to compute the inner product between two wave functionals Ψ1 and Ψ2 , we need to
compute a functional integral
$$\langle\Psi_2|\Psi_1\rangle := \int\mathcal{D}\phi\,\Psi_2[\phi]^*\,\Psi_1[\phi]. \tag{3.7}$$

Functional integrals are rather delicate mathematical objects, as we will discuss in more detail when we get
to path integrals. Roughly speaking the idea is to define the measure as
$$\mathcal{D}\phi := \prod_{\vec x} d\phi(\vec x), \tag{3.8}$$

so in other words we integrate independently over the value of ϕ at each point in space.
To represent the algebra (3.4) on this Hilbert space, imitating nonrelativistic quantum mechanics we can
take
$$\Pi(\vec x) := \Pi(0,\vec x) = -i\frac{\delta}{\delta\phi(\vec x)}, \tag{3.9}$$
where the quantity appearing on the right-hand side is the functional derivative defined by
$$\frac{\delta}{\delta\phi(\vec x)}\phi(\vec y) = \delta^{d-1}(\vec x-\vec y). \tag{3.10}$$

We can easily check the canonical commutation relation:


   
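$$\phi(\vec x)\cdot\left(-i\frac{\delta}{\delta\phi(\vec y)}\right) - \left(-i\frac{\delta}{\delta\phi(\vec y)}\right)\cdot\phi(\vec x) = i\delta^{d-1}(\vec x-\vec y). \tag{3.11}$$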

Proceeding as in non-relativistic quantum mechanics, the next step is then to construct energy eigenstates
by solving the functional Schrödinger equation

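$$\frac{1}{2}\int d^{d-1}x\left(-\frac{\delta^2}{\delta\phi(\vec x)^2} + \vec\nabla\phi\cdot\vec\nabla\phi(\vec x) + m^2\phi(\vec x)^2\right)\Psi[\phi] = E\,\Psi[\phi]. \tag{3.12}$$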

In principle solving this equation (including interactions and other types of fields) is “all there is” to quantum
field theory.19
We can make the functional Schrödinger formalism more rigorous by regularizing the theory using a
spatial lattice, so that the field variable is only defined on a discrete set of spatial points ⃗x which are part of
a lattice L. Taking L to be a cubic lattice, this more explicitly looks like
 !2 
1 X d−1  X Φ(⃗x + ⃗δ) − Φ(⃗x)
H= a Π(⃗x)2 + + m2 Φ(⃗x)2  , (3.13)
2 a
x∈L
⃗ ⃗
δ

19 More carefully this is all there is to field theories which are constructed from Lagrangians. There are some exotic field

theories that do not seem constructable in this way, and studying them requires techniques that are mostly beyond the scope
of this class.

with
$$[\Phi(\vec x),\Pi(\vec y)] = i\,a^{-(d-1)}\,\delta_{\vec x,\vec y} \tag{3.14}$$
and thus
$$\Pi(\vec x) = -i\,a^{-(d-1)}\frac{\partial}{\partial\phi(\vec x)}. \tag{3.15}$$
Here a is the lattice spacing, and ⃗δ ranges over the orthogonal lattice displacements ax̂1 , ax̂2 , . . . , ax̂d−1 . The
functional Schrödinger equation then becomes
 !2 
1 X d−1  −2(d−1) ∂ 2 X ϕ(⃗x + ⃗δ) − ϕ(⃗x)
a −a + + m2 ϕ(⃗x)2  Ψ[ϕ] = EΨ[ϕ], (3.16)
2 ∂ϕ(⃗x)2 a
x∈L
⃗ ⃗
δ

which is now just a second-order partial differential equation in many variables. If we also work in finite
volume, so that the total number of points is finite, then we can (at least in principle) try solving this
equation on a computer. In free theories this is not necessary since the theory can be solved exactly (see
below), but if we include interactions (such as say a ϕ4 term in the Hamiltonian) then this approach can be
viable.20
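To illustrate how the free lattice theory can be solved exactly, here is a minimal sketch for the simplest case of one spatial dimension (d = 2) with periodic boundary conditions; both of these choices, and the function name `lattice_frequencies`, are assumptions made for the example rather than part of the notes. The Heisenberg equations of the lattice Hamiltonian (3.13) reduce to coupled oscillators with Φ̈ = −KΦ, so the normal-mode frequencies are the square roots of the eigenvalues of K, which can be compared with the lattice dispersion relation ω² = m² + (4/a²) sin²(πn/N).

```python
import numpy as np

def lattice_frequencies(N, a, m):
    """Normal-mode frequencies for N sites, spacing a, mass m (periodic, d = 2)."""
    shift = np.roll(np.eye(N), 1, axis=1)          # (shift @ phi)[x] = phi[x + 1]
    laplacian = shift + shift.T - 2 * np.eye(N)    # discrete second difference
    K = m**2 * np.eye(N) - laplacian / a**2        # equation of motion: Phi'' = -K Phi
    return np.sqrt(np.linalg.eigvalsh(K))

N, a, m = 16, 0.5, 1.0
omega_numeric = np.sort(lattice_frequencies(N, a, m))

# Expected lattice dispersion relation for comparison.
n = np.arange(N)
omega_exact = np.sort(np.sqrt(m**2 + (4 / a**2) * np.sin(np.pi * n / N) ** 2))

# Each mode is a harmonic oscillator, so the ground state energy is sum(omega) / 2.
ground_state_energy = 0.5 * omega_numeric.sum()
print(np.allclose(omega_numeric, omega_exact), ground_state_energy)
```

For small momenta the lattice frequencies approach the continuum result ω² = k² + m² with k = 2πn/(Na).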

3.2 Heisenberg fields and particle states


Wave functionals are conceptually important in quantum field theory because they make it clear that ulti-
mately we are still doing the same quantum mechanics we learned in the non-relativistic case. Unfortunately
however they are somewhat unwieldy objects, as we have already seen, and indeed in quantum field theory
the wave functional approach is not so useful in practice. It turns out to be a better idea to study the field
operators directly, rather than the states, especially in the Heisenberg picture.
Let’s first recall that by definition in the Heisenberg picture the time-dependence of an operator is given
by
O(t) = eiHt O(0)e−iHt . (3.17)
Taking the time derivative (and being a bit cavalier about operator ordering in the second step) we see that

Ȯ(t) = i[H, O(t)] = −{H, O(t)}, (3.18)

which is precisely the classical equation of motion (in Hamiltonian form) for O(t). Thus in the free scalar
theory (where there is no issue of operator ordering since there are no terms in the Hamiltonian involving
both Φ and Π) the Heisenberg field
Φ(t, ⃗x) = eiHt Φ(⃗x)e−iHt (3.19)
should obey its classical equation of motion, namely the Klein-Gordon equation

(∂ 2 − m2 )Φ = 0. (3.20)

As discussed in the last lecture we will impose boundary conditions requiring the fields to vanish at spatial
infinity, and any solution of the Klein-Gordon equation which vanishes at spatial infinity can be expanded
in terms of a plane-wave basis set of solutions given by
$$f_{\vec k}(t,\vec x) = \frac{1}{\sqrt{2\omega_{\vec k}}}\,e^{i\vec k\cdot\vec x - i\omega_{\vec k}t} \tag{3.21}$$
and its complex conjugate. Here we have defined
$$\omega_{\vec k} = \sqrt{|k|^2 + m^2}, \tag{3.22}$$
20 In practice there are often much better numerical techniques available however, with “monte carlo” evaluation of the path

integral being the long-standing champion for many theories (including this one). Newer approaches which are gaining ground
are the “numerical bootstrap” and quantum simulation.

where $|k| = \sqrt{\vec k\cdot\vec k}$, and we have included the factor of $1/\sqrt{2\omega_{\vec k}}$ for future convenience (it ensures that we
end up with properly-normalized annihilation/creation operators below). Defining a spacetime momentum
vector
$$k^\mu = (\omega_{\vec k}, \vec k), \tag{3.23}$$
in relativistic notation we have
$$f_{\vec k}(x) = \frac{1}{\sqrt{2k^0}}\,e^{ik\cdot x}. \tag{3.24}$$
Expanding the Heisenberg field in terms of these solutions we have

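$$
\begin{aligned}
\Phi(x) &= \int\frac{d^{d-1}k}{(2\pi)^{d-1}}\left[f_{\vec k}(x)\,a_{\vec k} + f^*_{\vec k}(x)\,a^\dagger_{\vec k}\right] \\
&= \int\frac{d^{d-1}k}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec k}}}\left[e^{ik\cdot x}a_{\vec k} + e^{-ik\cdot x}a^\dagger_{\vec k}\right],
\end{aligned}
\tag{3.25}
$$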
where $a_{\vec k}$ and $a^\dagger_{\vec k}$ are operator coefficients in the mode expansion of the operator Φ(x). The operator
coefficients of $f_{\vec k}$ and $f^*_{\vec k}$ are hermitian conjugates because ϕ is a real field and so Φ needs to be a hermitian
operator. The factor of $1/(2\pi)^{d-1}$ is included as a matter of convenience: it has to appear somewhere due to
the way that Fourier transforms work, and this turns out to be the best place to put it. There is a mantra
for remembering where it goes which we’ll call Coleman’s rule:
⋆ Whenever you integrate over momentum there is a factor of 1/(2π) for each component, and whenever
you have a momentum-conserving δ-function then it comes with a factor of 2π for each component.
So far we haven’t actually done much, but let’s now see what the canonical commutation relations (3.4)
have to say about the algebra of $a_{\vec k}$ and $a^\dagger_{\vec k}$. The easiest way to do this is to use the Fourier transform to
extract $a_{\vec k}$ and $a^\dagger_{\vec k}$ from the t = 0 fields Φ(⃗x) and Π(⃗x). In doing such calculations there are two crucial
identities:
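$$
\begin{aligned}
\int d^{d-1}x\,e^{-i\vec k\cdot\vec x} &= (2\pi)^{d-1}\delta^{d-1}(\vec k) \\
\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,e^{i\vec k\cdot\vec x} &= \delta^{d-1}(\vec x),
\end{aligned}
\tag{3.26}
$$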
where we have placed the factors of 2π in accordance with Coleman’s rule. Using the first of these we have

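$$
\begin{aligned}
\int d^{d-1}x\,e^{-i\vec p\cdot\vec x}\,\Phi(\vec x) &= \int d^{d-1}x\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec k}}}\,e^{-i\vec p\cdot\vec x}\left[e^{i\vec k\cdot\vec x}a_{\vec k} + e^{-i\vec k\cdot\vec x}a^\dagger_{\vec k}\right] \\
&= \int\frac{d^{d-1}k}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec k}}}\left[(2\pi)^{d-1}\delta^{d-1}(\vec k-\vec p)\,a_{\vec k} + (2\pi)^{d-1}\delta^{d-1}(\vec k+\vec p)\,a^\dagger_{\vec k}\right] \\
&= \frac{1}{\sqrt{2\omega_{\vec p}}}\left(a_{\vec p} + a^\dagger_{-\vec p}\right)
\end{aligned}
\tag{3.27}
$$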

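and
$$\int d^{d-1}x\,e^{-i\vec p\cdot\vec x}\,\Pi(\vec x) = \int d^{d-1}x\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\frac{-i\omega_{\vec k}}{\sqrt{2\omega_{\vec k}}}\,e^{-i\vec p\cdot\vec x}\left[e^{i\vec k\cdot\vec x}a_{\vec k} - e^{-i\vec k\cdot\vec x}a^\dagger_{\vec k}\right] \tag{3.28}$$
$$
\begin{aligned}
{} &= -i\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\sqrt{\frac{\omega_{\vec k}}{2}}\left[(2\pi)^{d-1}\delta^{d-1}(\vec k-\vec p)\,a_{\vec k} - (2\pi)^{d-1}\delta^{d-1}(\vec k+\vec p)\,a^\dagger_{\vec k}\right] \\
&= -i\sqrt{\frac{\omega_{\vec p}}{2}}\left(a_{\vec p} - a^\dagger_{-\vec p}\right),
\end{aligned}
\tag{3.29}
$$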

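and thus
$$
\begin{aligned}
a_{\vec p} &= \frac{1}{\sqrt 2}\int d^{d-1}x\,e^{-i\vec p\cdot\vec x}\left(\sqrt{\omega_{\vec p}}\,\Phi(\vec x) + \frac{i}{\sqrt{\omega_{\vec p}}}\,\Pi(\vec x)\right) \\
a^\dagger_{\vec p} &= \frac{1}{\sqrt 2}\int d^{d-1}x\,e^{i\vec p\cdot\vec x}\left(\sqrt{\omega_{\vec p}}\,\Phi(\vec x) - \frac{i}{\sqrt{\omega_{\vec p}}}\,\Pi(\vec x)\right).
\end{aligned}
\tag{3.30}
$$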

We can then use these expressions together with the canonical commutation relations (3.4) to show that:
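$$
\begin{aligned}
[a_{\vec p}, a_{\vec p\,'}] &= \frac{i}{2}\int d^{d-1}x\int d^{d-1}y\,e^{-i\vec p\cdot\vec x - i\vec p\,'\cdot\vec y}\Big([\Phi(\vec x),\Pi(\vec y)] - [\Phi(\vec y),\Pi(\vec x)]\Big) = 0 \\
[a^\dagger_{\vec p}, a^\dagger_{\vec p\,'}] &= -[a_{\vec p}, a_{\vec p\,'}]^\dagger = 0 \\
[a_{\vec p}, a^\dagger_{\vec p\,'}] &= -\frac{i}{2}\int d^{d-1}x\int d^{d-1}y\,e^{-i\vec p\cdot\vec x + i\vec p\,'\cdot\vec y}\Big([\Phi(\vec x),\Pi(\vec y)] + [\Phi(\vec y),\Pi(\vec x)]\Big) \\
&= \int d^{d-1}x\,e^{i(\vec p\,'-\vec p)\cdot\vec x} \\
&= (2\pi)^{d-1}\delta^{d-1}(\vec p-\vec p\,').
\end{aligned}
\tag{3.31}
$$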

These results should look familiar: they are the algebra of creation and annihilation operators for an infinite
number of harmonic oscillators, with the oscillators labeled by the spatial momentum p⃗. They are also
the momentum space version of the creation/annihilation operators on multi-particle Fock space that we
introduced back in the first lecture. Defining a vacuum state |Ω⟩ by the property that

ap⃗ |Ω⟩ = 0 (3.32)

for all p⃗ (we will show in a moment that this is indeed the ground state of the Hamiltonian), we have
one-particle states of the form
$$a^\dagger_{\vec p}|\Omega\rangle, \tag{3.33}$$
two-particle states of the form
$$a^\dagger_{\vec p}\,a^\dagger_{\vec p\,'}|\Omega\rangle, \tag{3.34}$$
and so on.
To justify the words “vacuum” and “particle” here however, we need to study the Hamiltonian. This is
given by
$$H = \frac{1}{2}\int d^{d-1}x\left[\Pi(\vec x)^2 + |\vec\nabla\Phi(\vec x)|^2 + m^2\Phi(\vec x)^2\right], \tag{3.35}$$
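into which we should substitute our expression (3.25) for the Heisenberg field. This calculation is a bit
tedious, so I’ll compute the first term here and you’ll do the other two in the homework:
$$
\begin{aligned}
\frac{1}{2}\int d^{d-1}x\,\Pi(\vec x)^2 &= \frac{1}{2}\int d^{d-1}x\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\sqrt{\omega_{\vec k}\omega_{\vec p}}}\left(-i\omega_{\vec k}e^{i\vec k\cdot\vec x}a_{\vec k} + i\omega_{\vec k}e^{-i\vec k\cdot\vec x}a^\dagger_{\vec k}\right)\left(-i\omega_{\vec p}e^{i\vec p\cdot\vec x}a_{\vec p} + i\omega_{\vec p}e^{-i\vec p\cdot\vec x}a^\dagger_{\vec p}\right) \\
&= \frac{1}{4}\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\omega_{\vec k}\left(a^\dagger_{\vec k}a_{\vec k} + a_{\vec k}a^\dagger_{\vec k} - a_{\vec k}a_{-\vec k} - a^\dagger_{\vec k}a^\dagger_{-\vec k}\right).
\end{aligned}
\tag{3.36}
$$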

Combining all three terms, we find

$$H = \frac{1}{2}\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\omega_{\vec k}\left(a^\dagger_{\vec k}a_{\vec k} + a_{\vec k}a^\dagger_{\vec k}\right). \tag{3.37}$$

This looks quite a bit like the harmonic oscillator Hamiltonian, and we can make it look more so by using
the algebra (3.31):

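$$
\begin{aligned}
H &= \frac{1}{2}\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\omega_{\vec k}\left(a^\dagger_{\vec k}a_{\vec k} + a_{\vec k}a^\dagger_{\vec k}\right) \\
&= \frac{1}{2}\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\omega_{\vec k}\left(a^\dagger_{\vec k}a_{\vec k} + [a_{\vec k}, a^\dagger_{\vec k}] + a^\dagger_{\vec k}a_{\vec k}\right) \\
&= \int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\omega_{\vec k}\,a^\dagger_{\vec k}a_{\vec k} + \frac{1}{2}\int d^{d-1}k\,\omega_{\vec k}\,\delta^{d-1}(0).
\end{aligned}
\tag{3.38}
$$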

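The first term here is just what we would like: the operator $a^\dagger_{\vec k}a_{\vec k}$ is the number operator that counts how
many particles there are of momentum ⃗k, so this term says that each particle of momentum ⃗k contributes
ω⃗k to the energy. For example if we act on a one-particle state we have
$$\left(\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\omega_{\vec k}\,a^\dagger_{\vec k}a_{\vec k}\right)a^\dagger_{\vec p}|\Omega\rangle = \int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\omega_{\vec k}\,a^\dagger_{\vec k}\,(2\pi)^{d-1}\delta^{d-1}(\vec k-\vec p)|\Omega\rangle = \omega_{\vec p}\,a^\dagger_{\vec p}|\Omega\rangle, \tag{3.39}$$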

so one-particle states $a^\dagger_{\vec p}|\Omega\rangle$ are eigenstates of this term with eigenvalue ωp⃗ . Ignoring the second term, we
thus have succeeded in finding the eigenstates of the Hamiltonian!
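The single-mode version of this reordering can be checked with explicit matrices. The sketch below truncates one harmonic oscillator to a finite number of levels, which is an assumption of the example (the truncation spoils only the highest level); the names `truncated_annihilation`, `n_max`, and the value of `omega` are illustrative.

```python
import numpy as np

def truncated_annihilation(n_max):
    """Annihilation operator on the levels |0>, ..., |n_max - 1> (hypothetical helper)."""
    return np.diag(np.sqrt(np.arange(1, n_max)), k=1)

n_max, omega = 8, 2.0
a = truncated_annihilation(n_max)
a_dag = a.T

# Single-mode analogue of (3.37): H = (omega / 2) (a^dagger a + a a^dagger).
H = 0.5 * omega * (a_dag @ a + a @ a_dag)
energies = np.diag(H)

# Away from the truncation edge, H = omega (N + 1/2): the number operator
# plus a constant zero-point term, as in (3.38).
expected = omega * (np.arange(n_max - 1) + 0.5)
print(np.allclose(energies[:-1], expected))
```

In the field theory there is one such zero-point term for every momentum mode, and summing over all modes produces the constant second term in (3.38).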
What however are we to say about the second term in (3.38)? On the one hand it does not involve the
creation/annihilation operators and thus is proportional to the identity, which means that the eigenstates
we just found are also eigenstates of the full Hamiltonian. On the other hand it is embarrassingly infinite,
for two different reasons. The first reason is the δ-function evaluated at zero, which is an “infrared (IR)
divergence” arising because the momentum ⃗k is a continuous parameter. If we were to work in finite volume
V , then the momentum would be discrete and we would find δ d−1 (0) ∼ V . The second reason is the integral
over ⃗k, which diverges at large ⃗k since in continuum field theory we can have particles of arbitrarily high
momentum. This is called an “ultraviolet (UV) divergence”, and it is regulated if we introduce a lattice
with lattice spacing a since then it does not make sense to consider momenta larger than of order the “UV
cutoff”
$$\Lambda := \frac{1}{a}. \tag{3.40}$$
a
With both cutoffs in place we therefore have
$$\frac{1}{2}\int d^{d-1}k\,\omega_{\vec k}\,\delta^{d-1}(0) \sim V\Lambda^d, \tag{3.41}$$
which you can check indeed has units of energy. What are we to make of this term? The essential point is
that since it is proportional to V , we can write it as a local integral of a constant over space:
$$\frac{1}{2}\int d^{d-1}k\,\omega_{\vec k}\,\delta^{d-1}(0) \sim \Lambda^d\int d^{d-1}x. \tag{3.42}$$
We would thus precisely get a term of this form if from the beginning we had taken the Lagrangian to include
a “cosmological constant” term
∆L = −ρ0 , (3.43)
and so the term (3.41) is usually called a renormalization of the cosmological constant. Somehow the
dynamics of our free scalar field have generated a gigantic energy density filling the universe! This is a quite
remarkable prediction, but unfortunately it is also quite inconsistent with our understanding of the world.
In the absence of gravity such an energy density would have no measurable effect, but gravity responds to
the total energy density and such a gigantic positive energy density would lead to a universe that tore itself
apart via exponential expansion on a timescale of order 1/Λ. We don’t quite know what the scale of Λ should
be, but from the Large Hadron Collider it should at least be bigger than ∼ 10 TeV and this already tells us
that 1/Λ ≲ ℏ/(10 TeV) ∼ 6 × 10−29 s. Not good. No es bueno. 很不好.
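The order of magnitude of this timescale can be reproduced with a one-line unit conversion. The sketch below assumes the standard value ℏ ≈ 6.582 × 10⁻¹⁶ eV·s and takes the cutoff to be exactly 10 TeV; the variable names are illustrative.

```python
# Convert an energy cutoff into a timescale via t = hbar / E.
HBAR_EV_S = 6.582119569e-16   # reduced Planck constant in eV * s
cutoff_ev = 10e12             # assumed cutoff of 10 TeV, expressed in eV

timescale_s = HBAR_EV_S / cutoff_ev
print(f"{timescale_s:.2e} s")
```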
What should we do? There is only one way out: we need to introduce an additional “bare” cosmological
constant term in the original Lagrangian,
Lct ∼ Λd , (3.44)
called a counterterm, whose coefficient is precisely tuned to cancel the cosmological constant generated by
our free scalar field. The full Hamiltonian is then just given by
$$H_{\rm ren} = \int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\omega_{\vec k}\,a^\dagger_{\vec k}a_{\vec k}, \tag{3.45}$$
so the vacuum has zero energy as hoped. This is our first example of a procedure called renormalization,
by which we carefully tune the coefficients in the Lagrangian in a Λ-dependent way to cancel UV divergences.
This may seem like a rather ugly fix. Why should the Lagrangian be fine-tuned in this way? How do we
know that there won’t be other UV divergences that can’t be canceled in this way? These are excellent
questions, and we will discuss them in considerable detail in the lectures to come.

3.3 Non-locality of the annihilation operator in position space


In the first lecture we tried (and failed) to build a quantum theory of relativistic particles using annihilation
and creation operators labeled by position. We can now straightforwardly see why this did not work: taking
the Fourier transform of (3.30), we see that
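$$a_{\vec x} = \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\,e^{i\vec p\cdot\vec x}\,a_{\vec p} = \frac{1}{\sqrt 2}\int\frac{d^{d-1}p}{(2\pi)^{d-1}}\int d^{d-1}y\,e^{i\vec p\cdot(\vec x-\vec y)}\left(\sqrt{\omega_{\vec p}}\,\Phi(\vec y) + \frac{i}{\sqrt{\omega_{\vec p}}}\,\Pi(\vec y)\right). \tag{3.46}$$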

This is not a local function of Φ and Π since the inverse Fourier transforms of $\sqrt{\omega_{\vec p}}$ and $1/\sqrt{\omega_{\vec p}}$ do not vanish
away from zero, and this non-locality is the origin of the apparent acausality we found in the first lecture.
In the non-relativistic limit however these functions become constants, in which case their inverse Fourier
transforms are δ-functions so a⃗x indeed becomes local:
$$a_{\vec x} \to \sqrt{\frac{m}{2}}\left(\Phi(\vec x) + \frac{i}{m}\,\Pi(\vec x)\right). \tag{3.47}$$

For this reason non-relativistic systems are often formulated using a⃗x and a⃗†x instead of Φ(⃗x) and Π(⃗x).
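The non-locality described in this section can be made visible on a one-dimensional periodic lattice, where the inverse Fourier transform becomes a discrete FFT. This is only a sketch: the lattice size, spacing, mass, and the helper name `position_kernel` are assumptions of the example, and the lattice stands in for the continuum integral in (3.46).

```python
import numpy as np

def position_kernel(weights):
    """Inverse discrete Fourier transform of a momentum-space weight function."""
    return np.fft.ifft(weights).real

N, a, m = 64, 1.0, 1.0
p = 2 * np.pi * np.fft.fftfreq(N, d=a)     # lattice momenta
omega = np.sqrt(p**2 + m**2)

# Relativistic case: the kernel multiplying Phi in (3.46) is the transform of sqrt(omega).
kernel_rel = position_kernel(np.sqrt(omega))

# Non-relativistic limit: sqrt(omega) -> sqrt(m) is constant, so the kernel is local.
kernel_nonrel = position_kernel(np.full(N, np.sqrt(m)))

print(kernel_rel[:4])      # nonzero away from the origin
print(kernel_nonrel[:4])   # supported only at the origin
```

The relativistic kernel has support on neighboring sites, while the constant weight gives a Kronecker delta, in line with the limit (3.47).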

3.4 Lorentz transformations and microcausality revisited


We’ll now make an aside to see more explicitly how the scalar field theory we have constructed avoids
the problems we saw in the first lecture with a particle-based relativistic quantum mechanics. There we
motivated fields by looking for linear combinations of creation and annihilation operators that
(1) have simple Lorentz transformation properties
(2) commute at spacelike separation.
In the free field theory we have been studying these conditions follow automatically from the canonical
commutation relations together with the invariance of the action under Lorentz transformations which act
on Φ as a scalar, but it is instructive to see how they arise from the point of view of the creation and
annihilation operators.
Let’s first consider Lorentz transformations. In a few lectures we will show that any Lorentz transforma-
tion Λ that does not reverse the direction of time must be implemented on the Hilbert space by a unitary
operator U (Λ), which we will take to leave the ground state invariant:
U (Λ)|Ω⟩ = |Ω⟩. (3.48)

To understand how U (Λ) acts on the rest of the Hilbert space, we need to understand its action on the
creation and annihilation operators. Before deciding this it is convenient to first understand the Lorentz
transformation properties of the measure $\frac{d^{d-1}p}{(2\pi)^{d-1}}$. The easiest way to do this is to note that the full measure
$\frac{d^dp}{(2\pi)^d}$, which integrates over p0 as well as p⃗, is Lorentz-invariant, since Lorentz transformations preserve the
Minkowski metric ηµν . We however only want to integrate over Lorentz vectors pµ which obey the on-shell
condition p0 = ωp⃗ . We can implement this using a Lorentz-invariant δ-function, leading to a manifestly
Lorentz-invariant measure
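$$\frac{d^dp}{(2\pi)^d}\,2\pi\delta(p^2+m^2)\,\Theta(p^0). \tag{3.49}$$
The Heaviside Θ function here is one for p0 > 0 and zero for p0 < 0, and is there to make sure that the δ
function only picks out p0 = ωp⃗ (as opposed to p0 = −ωp⃗ ). Θ(p0 ) is Lorentz invariant because we are only
considering Lorentz transformations that do not reverse time. We can then relate this measure to $\frac{d^{d-1}p}{(2\pi)^{d-1}}$
via (recall that $p^2+m^2 = -(p^0)^2+\omega_{\vec p}^2$)
$$\begin{aligned}
\frac{d^dp}{(2\pi)^d}\,2\pi\delta(p^2+m^2)\,\Theta(p^0) &= \frac{d^{d-1}p}{(2\pi)^{d-1}}\,\frac{dp^0}{2\pi}\,\frac{2\pi}{2p^0}\,\delta(p^0-\omega_{\vec p}) \\
&= \frac{d^{d-1}p}{(2\pi)^{d-1}}\,\frac{1}{2\omega_{\vec p}},
\end{aligned} \tag{3.50}$$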

so if we define
$$\Lambda^\mu{}_\nu\, p^\nu = (p^0_\Lambda, \vec p_\Lambda) \tag{3.51}$$
then
$$\frac{d^{d-1}p_\Lambda}{(2\pi)^{d-1}}\,\frac{1}{2\omega_{\vec p_\Lambda}} = \frac{d^{d-1}p}{(2\pi)^{d-1}}\,\frac{1}{2\omega_{\vec p}}. \tag{3.52}$$
This also shows that we have
$$\omega_{\vec p_\Lambda}\,(2\pi)^{d-1}\delta^{d-1}(\vec p\,'_\Lambda - \vec p_\Lambda) = \omega_{\vec p}\,(2\pi)^{d-1}\delta^{d-1}(\vec p\,' - \vec p). \tag{3.53}$$
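The invariance (3.52) is easy to check numerically in the simplest case. The following is a minimal sketch, assuming d = 2 (one spatial dimension) and a boost of velocity v along that direction; the function names and the finite-difference step are illustrative choices, not part of the notes. It estimates the Jacobian dp_Λ/dp and confirms that it equals ω_{p_Λ}/ω_p, which is the content of (3.52) and (3.53).

```python
import math

def omega(p, m):
    """On-shell energy for spatial momentum p in d = 2 (one spatial dimension)."""
    return math.sqrt(p * p + m * m)

def boost(p, m, v):
    """Spatial momentum of the boosted on-shell vector, for boost velocity v with |v| < 1."""
    gamma = 1.0 / math.sqrt(1.0 - v * v)
    return gamma * (p + v * omega(p, m))

def jacobian(p, m, v, h=1e-6):
    """Central finite-difference estimate of d p_Lambda / d p."""
    return (boost(p + h, m, v) - boost(p - h, m, v)) / (2.0 * h)

def measure_ratio(p, m, v):
    """(dp_Lambda / omega_Lambda) divided by (dp / omega); (3.52) says this is 1."""
    p_boosted = boost(p, m, v)
    return jacobian(p, m, v) * omega(p, m) / omega(p_boosted, m)

if __name__ == "__main__":
    for p in (-2.0, 0.0, 0.7, 5.0):
        print(p, measure_ratio(p, m=1.0, v=0.6))
```

The printed ratios should all be one up to finite-difference error, independent of p, m and v.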

Proceeding to consider the action of U (Λ), we can guess that

U (Λ)a†p⃗ |Ω⟩ = Np⃗,Λ a†p⃗Λ |Ω⟩ (3.54)

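for some constant Np⃗,Λ that we can determine by requiring U (Λ) to be unitary. Indeed we want that
$$\begin{aligned}
|N_{\vec p,\Lambda}|^2\,(2\pi)^{d-1}\delta^{d-1}(\vec p\,'_\Lambda - \vec p_\Lambda) &= \langle\Omega|a_{\vec p\,'}\,U(\Lambda)^\dagger U(\Lambda)\,a^\dagger_{\vec p}|\Omega\rangle \\
&= \langle\Omega|a_{\vec p\,'}\,a^\dagger_{\vec p}|\Omega\rangle \\
&= (2\pi)^{d-1}\delta^{d-1}(\vec p\,' - \vec p),
\end{aligned} \tag{3.55}$$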

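so from (3.53) we see that
$$N_{\vec p,\Lambda} = \sqrt{\frac{\omega_{\vec p_\Lambda}}{\omega_{\vec p}}} \tag{3.56}$$
is consistent with unitarity. We therefore have the Lorentz transformations
$$\begin{aligned}
U(\Lambda)\,a_{\vec p}\,U(\Lambda)^\dagger &= \sqrt{\frac{\omega_{\vec p_\Lambda}}{\omega_{\vec p}}}\;a_{\vec p_\Lambda} \\
U(\Lambda)\,a^\dagger_{\vec p}\,U(\Lambda)^\dagger &= \sqrt{\frac{\omega_{\vec p_\Lambda}}{\omega_{\vec p}}}\;a^\dagger_{\vec p_\Lambda}.
\end{aligned} \tag{3.57}$$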

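We can use this to work out the Lorentz transformations of the field:
$$\begin{aligned}
U(\Lambda)\Phi(x)U(\Lambda)^\dagger &= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\left[e^{ip\cdot x}\,U(\Lambda)a_{\vec p}U(\Lambda)^\dagger + e^{-ip\cdot x}\,U(\Lambda)a^\dagger_{\vec p}U(\Lambda)^\dagger\right] \\
&= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{\sqrt{\omega_{\vec p_\Lambda}}}{\sqrt{2}\,\omega_{\vec p}}\left[e^{ip\cdot x}\,a_{\vec p_\Lambda} + e^{-ip\cdot x}\,a^\dagger_{\vec p_\Lambda}\right] \\
&= \int\frac{d^{d-1}p_\Lambda}{(2\pi)^{d-1}}\frac{\sqrt{\omega_{\vec p_\Lambda}}}{\sqrt{2}\,\omega_{\vec p_\Lambda}}\left[e^{ip\cdot x}\,a_{\vec p_\Lambda} + e^{-ip\cdot x}\,a^\dagger_{\vec p_\Lambda}\right] \\
&= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\left[e^{i(\Lambda^{-1}p)\cdot x}\,a_{\vec p} + e^{-i(\Lambda^{-1}p)\cdot x}\,a^\dagger_{\vec p}\right] \\
&= \Phi(\Lambda x).
\end{aligned} \tag{3.58}$$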

Going from the first to the second line we used (3.57), going from the second to the third we used (3.52),
going from the third to the fourth we relabeled the integration variable p⃗Λ → p⃗, and in going from the fourth
to the fifth we used that
$$(\Lambda^{-1}p)\cdot x = \Lambda_\alpha{}^\beta\,p^\alpha x_\beta = p_\alpha\,\Lambda^\alpha{}_\beta\,x^\beta = p\cdot(\Lambda x). \tag{3.59}$$
Thus we see that indeed we have succeeded in constructing a Lorentz scalar out of creation/annihilation
operators which themselves have more complicated transformations, at least for Lorentz transformations
that do not reverse time. We will discuss time-reversal symmetry in a few lectures, where we will see that
it needs to be represented on Hilbert space by an antiunitary operator instead of a unitary operator.
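Turning now to microcausality, let’s compute the commutator of Φ at spatial separation:
$$\begin{aligned}
{}[\Phi(\vec x),\Phi(\vec y)] &= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\frac{1}{2\sqrt{\omega_{\vec p}\,\omega_{\vec k}}}\left(e^{i\vec p\cdot\vec x - i\vec k\cdot\vec y}\,[a_{\vec p},a^\dagger_{\vec k}] - e^{-i\vec p\cdot\vec x + i\vec k\cdot\vec y}\,[a_{\vec k},a^\dagger_{\vec p}]\right) \\
&= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\omega_{\vec p}}\left(e^{i\vec p\cdot(\vec x-\vec y)} - e^{-i\vec p\cdot(\vec x-\vec y)}\right) \\
&= 0,
\end{aligned} \tag{3.60}$$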

where in going from the first to the second line we used (3.31) and in going from the second to the third
we flipped the sign of the integration variable in the second term. The point to notice however is that the
vanishing of this commutator required a nontrivial cancellation between two terms. For example if we had
tried to make the field Φ using only annihilation operators, then its commutator with its hermitian conjugate
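would not vanish at spatial separation:
$$\left[\int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\,e^{i\vec p\cdot\vec x}\,a_{\vec p}\,,\ \int\frac{d^{d-1}k}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec k}}}\,e^{-i\vec k\cdot\vec y}\,a^\dagger_{\vec k}\right] = \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\omega_{\vec p}}\,e^{i\vec p\cdot(\vec x-\vec y)} \neq 0. \tag{3.61}$$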

It is the requirement of microcausality that requires us to use fields that involve both creation and annihilation
operators, leading to the distinctive predictions of particle number non-conservation and the existence of
antiparticles as discussed in the first lecture.

3.5 Quantization of a complex scalar field, antiparticles


There is a simple but instructive generalization of the free scalar field we have been discussing so far, where
ϕ is taken to be complex and the Lagrangian density to be

L = −∂ µ ϕ∗ ∂µ ϕ − m2 ϕ∗ ϕ. (3.62)

You will show on the homework that the equation of motion for this theory is again just

∂ 2 ϕ = m2 ϕ, (3.63)

but when we expand the field in terms of solutions there is no longer a reason for the creation and annihilation
operators to be related. We thus should write
$$\Phi(x) = \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\left[e^{ip\cdot x}\,a_{\vec p} + e^{-ip\cdot x}\,b^\dagger_{\vec p}\right], \tag{3.64}$$

where ap⃗ and bp⃗ are not related. The canonical commutation relations follow from the observation that

Π(x) = Φ̇† (x), (3.65)

so we have
[Φ(⃗x), Φ̇† (⃗y )] = [Φ† (⃗x), Φ̇(⃗y )] = iδ d−1 (⃗x − ⃗y ) (3.66)
with all other commutators vanishing. In the homework you will show that these imply that ap⃗ , a†p⃗ and bp⃗ , b†p⃗
give two independent sets of annihilation/creation operators. This theory thus has two species of particles,
both with mass m. You will also show that these particles have opposite charge under the symmetry
ϕ′ (x) = eiθ ϕ(x), and indeed one is the antiparticle of the other.

3.6 Correlation functions I: Definition and physical meaning


We have now solved the theory of a free scalar field. What we haven’t done however is compute anything
interesting with it. We’ve acknowledged that the functional Schrödinger formalism is not so useful in practice,
so what kinds of questions are interesting in quantum field theory? Long experience has shown that the
physics of quantum fields is most elegantly packaged into vacuum expectation values of products of Heisenberg
fields, otherwise known as correlation functions.
The simplest correlation function for any field O(x) is its one-point function in the ground state:

⟨O(x)⟩ := ⟨Ω|O(x)|Ω⟩. (3.67)

If O(x) is hermitian then the physical interpretation of this quantity is clear: it is the expectation value for
what we get if we measure O(x) in the ground state. In quantum field theory it is often (but not always) the
case that the one-point functions of the fields vanish. Usually more interesting is the two-point function:
for any two fields O1 (x1 ) and O2 (x2 ) we have

⟨O2 (x2 )O1 (x1 )⟩ := ⟨Ω|O2 (x2 )O1 (x1 )|Ω⟩. (3.68)

The two point function is important for many physical questions. Perhaps the most direct physical interpre-
tation is that when x1 and x2 are spacelike separated and O1 and O2 have vanishing one-point functions, the
two-point function is a measure of how correlated the fluctuations are in measurements of the independent
observables O1 and O2 (we need to assume spacelike separation to ensure the operators are independent,
i.e. commuting). More generally if their one-point functions don’t vanish we can still quantify the amount
of correlation using the connected two-point function

⟨O2 (x2 )O1 (x1 )⟩c := ⟨O2 (x2 )O1 (x1 )⟩ − ⟨O2 (x2 )⟩⟨O1 (x1 )⟩. (3.69)

The two-point function also has a physical interpretation when x1 and x2 are not spacelike separated:
it tells us about the linear response of the theory to an external source. Indeed let’s say we have a field
theory with Hamiltonian H0 , and then we turn on a position-dependent source J(x) for a field O1 (x) such
that the Schrödinger picture Hamiltonian becomes:

H(t) = H0 + V (t) (3.70)

with
$$V(t) := \lambda\int d^{d-1}x\,J(t,\vec x)\,O_1(\vec x). \tag{3.71}$$

Here λ is a parameter controlling the strength of the source that we will take to be small. The question we
will ask is the following: assuming that J goes to zero at early times, if we start in the ground state of H0
at early times, what is the expectation value of a field O2 as a function of space and time? We can answer
this question using time-dependent perturbation theory. Indeed including this interaction we have
$$\begin{aligned}
\langle\Omega|O_2(t_2,\vec x_2)|\Omega\rangle &= \langle\Omega|\left(T e^{-i\int_{-\infty}^{t_2}dt'\,H(t')}\right)^\dagger O_2(\vec x_2)\;T e^{-i\int_{-\infty}^{t_2}dt'\,H(t')}|\Omega\rangle \\
&= \langle\Omega|U_I(t_2)^\dagger\,e^{iH_0t_2}\,O_2(\vec x_2)\,e^{-iH_0t_2}\,U_I(t_2)|\Omega\rangle,
\end{aligned} \tag{3.72}$$
where
$$U_I(t) = e^{iH_0t}\;T e^{-i\int_{-\infty}^{t}dt'\,H(t')} = T e^{-i\int_{-\infty}^{t}dt'\,e^{iH_0t'}V(t')e^{-iH_0t'}} \tag{3.73}$$
is the interaction picture time-evolution operator (you can check that these two expressions are equivalent by showing they have the same time derivative and obey the same initial condition at t = −∞). The
letter T here is the time-ordering symbol; it means that earlier operators go to the right. Expanding in
λ we have
$$U_I(t) = 1 - i\lambda\int_{-\infty}^{t}dt'\int d^{d-1}x'\,J(t',\vec x\,')\,e^{iH_0t'}\,O_1(\vec x\,')\,e^{-iH_0t'} + O(\lambda^2), \tag{3.74}$$

and thus to linear order in λ we have (assuming that O2 has vanishing one-point function in the unperturbed
theory)
$$\langle\Omega|O_2(t_2,\vec x_2)|\Omega\rangle = -i\lambda\int_{-\infty}^{t_2}dt'\int d^{d-1}x'\,J(t',\vec x\,')\,\langle[O_2(t_2,\vec x_2),O_1(t',\vec x\,')]\rangle_0. \tag{3.75}$$

Here ⟨⟩0 indicates the vacuum expectation value of Heisenberg operators in the unperturbed theory. In
particular if we take J to be a delta function localized at (t1 , ⃗x1 ), then we have

⟨Ω|O2 (t2 , ⃗x2 )|Ω⟩ = −iλΘ(t2 − t1 )⟨[O2 (t2 , ⃗x2 ), O1 (t1 , ⃗x1 )]⟩0 . (3.76)

Thus we see that the response of a quantum field theory to a local perturbation is determined by a difference
of two-point functions at arbitrary separation. The Θ function arises because if t1 > t2 then the source is
outside of the region of t′ integration so the δ-function never contributes. This response vanishes unless x2
is in the future lightcone of x1 , as it had better, which by the way is another illustration of the fact that by
introducing fields we have solved the causality problems of relativistic particle quantum mechanics.
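The step from (3.74) to (3.75) is short, but the sign is easy to lose, so here is a sketch of it. It assumes that J is real and O1 is hermitian (so that the adjoint of (3.74) simply flips the sign of the O(λ) term), and it writes O(t, ⃗x) := e^{iH0 t} O(⃗x) e^{−iH0 t} for Heisenberg operators of the unperturbed theory.

```latex
\begin{aligned}
U_I(t_2)^\dagger\, O_2(t_2,\vec x_2)\, U_I(t_2)
&= \Big(1 + i\lambda\!\int_{-\infty}^{t_2}\! dt'\!\int\! d^{d-1}x'\, J(t',\vec x\,')\, O_1(t',\vec x\,')\Big)\,
   O_2(t_2,\vec x_2)\,
   \Big(1 - i\lambda\!\int_{-\infty}^{t_2}\! dt'\!\int\! d^{d-1}x'\, J(t',\vec x\,')\, O_1(t',\vec x\,')\Big)
   + O(\lambda^2) \\
&= O_2(t_2,\vec x_2)
   - i\lambda\!\int_{-\infty}^{t_2}\! dt'\!\int\! d^{d-1}x'\, J(t',\vec x\,')\,
   \big[O_2(t_2,\vec x_2),\, O_1(t',\vec x\,')\big] + O(\lambda^2).
\end{aligned}
```

Taking the expectation value in |Ω⟩ and dropping the first term, which vanishes by the assumption on the one-point function of O2, reproduces (3.75).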
A simple example of an application of this calculation is the following: we can create a source for the
scalar field theory describing liquid helium-4 by firing a high-energy neutron at a bubble of liquid helium,
and then (3.76) describes how the local density of helium atoms in the bubble responds. In the homework
you will play with this and see how the response depends on whether the sample has two or three spatial
dimensions.
Higher-point correlation functions are also interesting. At spacelike separation they quantify conditional
fluctuations such as knowing how likely we are to see correlation between two operators given that we
measured a third to have some value, while at timelike separation they give more information about how
the theory responds to perturbations. We will also see later that for quantum field theories with particles at
low energies, higher-point correlation functions can be used to extract the scattering matrix.

3.7 Correlation functions II: Calculation


Having introduced the idea of correlation functions, let’s compute some in our free scalar field theory.
The one-point function of the scalar field Φ is easy:

⟨Φ(x)⟩ = 0 (3.77)

since we can view the ap⃗ in Φ as annihilating |Ω⟩ and the a†p⃗ as annihilating ⟨Ω|.
The two-point function
G(x2 , x1 ) := ⟨Φ(x2 )Φ(x1 )⟩ (3.78)

is more interesting. Note that the two-point function we have defined has Φ(x2 ) to the left of Φ(x1 ) regardless
of the time-ordering of x1 and x2 : it is to be distinguished from the Feynman propagator, which is defined
to include a time-ordering symbol
GF (x2 , x1 ) := ⟨T Φ(x2 )Φ(x1 )⟩. (3.79)
In quantum field theory correlation functions without time ordering such as (3.78) are sometimes called
Wightman functions to distinguish them from correlation functions that are time-ordered. We can easily
write the Feynman propagator in terms of the Wightman two-point function:

GF (x2 , x1 ) = Θ(t2 − t1 )G(x2 , x1 ) + Θ(t1 − t2 )G(x1 , x2 ). (3.80)

It is harder to go the other way (you need to do some nontrivial analytic continuation), so in quantum field
theory it is usually a good idea to view the Wightman functions as the fundamental objects of the theory. In
particular we emphasize that the linear response (3.76) involves two-point functions with both time orderings
and thus requires the Wightman two-point function. The Feynman propagator is important in perturbative
calculations, as we will see in later lectures.
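In the free scalar field theory we can compute the (Wightman) two-point function:
$$\begin{aligned}
G(x_2,x_1) &= \int\frac{d^{d-1}p_1}{(2\pi)^{d-1}}\frac{d^{d-1}p_2}{(2\pi)^{d-1}}\frac{1}{2\sqrt{\omega_{\vec p_1}\omega_{\vec p_2}}}\,e^{ip_2\cdot x_2 - ip_1\cdot x_1}\,\langle\Omega|a_{\vec p_2}a^\dagger_{\vec p_1}|\Omega\rangle \\
&= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\omega_{\vec p}}\,e^{ip\cdot(x_2-x_1)},
\end{aligned} \tag{3.81}$$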
where in the first line we observed that the only non-vanishing term involves an annihilation operator to the
left of a creation operator and in the second line we used the algebra (3.31). We won’t spend valuable class
time doing this integral since we will later have a better way to compute the same quantity using the path
integral,21 but the result is
$$G(x_2,x_1) = \frac{1}{(2\pi)^{d/2}}\left(\frac{m}{\sqrt{(x_2-x_1)^2+is_{21}\epsilon}}\right)^{\frac{d-2}{2}} K_{\frac{d-2}{2}}\!\left(m\sqrt{(x_2-x_1)^2+is_{21}\epsilon}\right) \tag{3.82}$$

where s21 is equal to one if t2 − t1 is positive and minus one if it is negative and ϵ is a small positive quantity
whose purpose is to define the branch of the square root when (x2 − x1 )2 < 0 but should otherwise be taken
to zero. This is an example of what is called an “iϵ prescription”, which we will see again and again. Kα (x)
is a modified Bessel function of the second kind: the only things worth knowing about it at the moment are
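its asymptotics:22
$$K_\alpha(x) \approx \begin{cases} \dfrac{\Gamma(\alpha)\,2^{\alpha-1}}{x^\alpha} & 0<|x|\ll 1 \\[2ex] \sqrt{\dfrac{\pi}{2x}}\;e^{-x} & x\gg 1 \end{cases}. \tag{3.84}$$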
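The asymptotics (3.84) can be checked without any special-function library. The sketch below assumes the standard integral representation K_α(x) = ∫_0^∞ dt e^{−x cosh t} cosh(αt) for x > 0 and evaluates it with a plain trapezoid rule; the function name `bessel_k` and the cutoff and step parameters are illustrative choices. It compares the result with both limits in (3.84) and with the closed form of K_{1/2} quoted in footnote 22.

```python
import math

def bessel_k(alpha, x, t_max=25.0, n=50000):
    """K_alpha(x) for x > 0 from the integral of exp(-x cosh t) cosh(alpha t) over t in [0, t_max].

    Trapezoid rule; the integrand is negligible at t_max for the x values used here.
    """
    h = t_max / n
    total = 0.5 * math.exp(-x)  # t = 0 endpoint carries weight 1/2
    for i in range(1, n + 1):
        t = i * h
        total += math.exp(-x * math.cosh(t)) * math.cosh(alpha * t)
    return total * h

def small_x_form(alpha, x):
    """First line of (3.84)."""
    return math.gamma(alpha) * 2.0 ** (alpha - 1.0) / x ** alpha

def large_x_form(x):
    """Second line of (3.84)."""
    return math.sqrt(math.pi / (2.0 * x)) * math.exp(-x)

if __name__ == "__main__":
    print("small x ratio:", bessel_k(1.0, 0.01) / small_x_form(1.0, 0.01))
    print("large x ratio:", bessel_k(1.0, 50.0) / large_x_form(50.0))
    print("K_1/2 ratio  :", bessel_k(0.5, 2.0) / large_x_form(2.0))
```

The first two ratios approach one only in the respective limits, while the third should equal one at any x because the large-x form is exact for α = 1/2.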

In particular at general separations which are small compared to the inverse mass we have
$$G(x_2,x_1) \approx \frac{\Gamma(d/2-1)}{4\pi^{d/2}}\,\frac{1}{\left((x_2-x_1)^2+is_{21}\epsilon\right)^{\frac{d-2}{2}}}, \tag{3.85}$$
21 If you want to try it, I recommend first considering the d = 2 case. You can deform the p contour to wrap around one of
the cuts on the imaginary p axis as in figure two from lecture one, which leads to one of the standard integral representations
of K0 (m|x2 − x1 |). In the general case you need to first do an angular integral, after which you can do the same manipulation.
22 Another thing that is perhaps worth knowing is that it simplifies when α is a half-integer, which here means that d is odd.
For example for d = 3 we simply have $K_{1/2}(x) = \sqrt{\frac{\pi}{2x}}\,e^{-x}$ and thus
$$G(x_2,x_1) = \frac{e^{-m\sqrt{(x_2-x_1)^2+is_{21}\epsilon}}}{4\pi\sqrt{(x_2-x_1)^2+is_{21}\epsilon}}. \tag{3.83}$$

Figure 6: Closing the contour in the complex p0 plane for the integral (3.90).

while at spacelike separations which are large compared to the inverse mass we have
$$G(x_2,x_1) \approx \frac{m^{d-2}}{2^{\frac{d+1}{2}}\,\pi^{\frac{d-1}{2}}\,(m|x_2-x_1|)^{\frac{d-1}{2}}}\;e^{-m|x_2-x_1|}. \tag{3.86}$$

There is quite a bit of physics in these expressions; here are some key points:

(1) The two-point function is nonzero at spacelike separation, so independent fields are correlated with
each other in the ground state. Correlation between independent (i.e. commuting) degrees of freedom
in a pure quantum state is called entanglement, so what we are seeing is that in quantum field
theory the vacuum is a highly-entangled state. Indeed since the two-point function diverges in the
limit x2 → x1 , the amount of entanglement is infinite!

(2) In the massless limit (3.85) becomes exact so the correlation function (for d > 2) decays as an inverse
power of the distance between the points. You will study the d = 2 case in the homework.
(3) In the massive case the correlation decays exponentially with distance at spacelike separations which
are large compared to m−1 .

This discussion illustrates something of a general maxim about correlation functions in quantum field theory:
the physics is more clear in position space, but the formulas are simpler in momentum space. More pithily,
in quantum field theory you should think in position space but compute in momentum space.
The short-distance divergence of the two-point function also has an important mathematical consequence:
it shows that the field Φ(x) is not actually a good quantum operator, since acting on the vacuum (or indeed
any other state of finite energy) we get a state of infinite norm. In order to get something which is a good
operator, we need to smear Φ(x) against a smooth function of compact support:
$$\Phi_f = \int d^dx\,f(x)\,\Phi(x). \tag{3.87}$$

This statement is sometimes formalized by saying that in quantum field theory the fields themselves are
operator-valued distributions. We will show in the next lecture that this smearing indeed produces a
well-defined operator.
You may have found it annoying that our expression (3.81) for the two-point function involves integrals
over only the spatial components of momentum; wouldn’t it be nice to have a more manifestly covariant
expression? Of course we did already show that the measure $\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\omega_{\vec p}}$ is Lorentz-invariant, but there our
demonstration involved the non-analytic objects Θ(p0 ) and δ(p2 + m2 ). It turns out to be a very good idea

to come up with an expression for the two-point function that is manifestly both covariant and analytic in
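momentum. We can do this by showing that
$$\frac{1}{2\omega_{\vec p}}\,e^{-i\omega_{\vec p}(t_2-t_1)} = \lim_{\epsilon\to0^+}\int_{-\infty}^{\infty}\frac{dp^0}{2\pi}\,\frac{-is_{21}}{p^2+m^2-i\epsilon s_{21}}\,e^{-ip^0(t_2-t_1)}, \tag{3.88}$$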

where s21 again is one if t2 − t1 is positive and minus one if it is negative, since from (3.81) we then have
$$G(x_2,x_1) = \lim_{\epsilon\to0^+}\int\frac{d^dp}{(2\pi)^d}\,\frac{-is_{21}}{p^2+m^2-i\epsilon s_{21}}\,e^{ip\cdot(x_2-x_1)}. \tag{3.89}$$

The appearance of the vanishingly small quantity ϵ > 0 here is another example of an iϵ prescription. To
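demonstrate (3.88), it is convenient to rewrite the integral on the right-hand side as
$$-s_{21}\int_{-\infty}^{\infty}\frac{dp^0}{2\pi i}\,\frac{e^{-ip^0(t_2-t_1)}}{\left(p^0-(\omega_{\vec p}-i\epsilon s_{21})\right)\left(p^0+(\omega_{\vec p}-i\epsilon s_{21})\right)}, \tag{3.90}$$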

where we have used that

(p0 − (ωp⃗ − iϵs21 ))(p0 + (ωp⃗ − iϵs21 )) = −(p2 + m2 − 2iωp⃗ ϵs21 ) + O(ϵ2 ) (3.91)

and then redefined 2ωp⃗ ϵ → ϵ since the only thing we care about ϵ is that it is small and positive. The integral
(3.90) can be computed using the residue theorem. Indeed recall that if f (z) is an analytic function in a
region R containing a point z0 , then we have23
$$\frac{1}{2\pi i}\oint_{\partial R}dz\,\frac{f(z)}{z-z_0} = f(z_0) \tag{3.92}$$
where the integral is taken in the counter-clockwise direction about z0 . Said differently, the function $\frac{f(z)}{z-z_0}$
has a simple pole at z = z0 and the integral around this pole extracts the residue f (z0 ). The integrand
(3.90) has two simple poles, at
p0 = ± (ωp⃗ − iϵs21 ) . (3.93)
We can evaluate the integral using the residue theorem by closing the integration contour along the real axis
at infinity in the lower or upper half plane depending on whether s21 is positive or negative respectively (see
figure 6). Either way the integral picks up the residue of the pole at p0 = ωp⃗ − iϵs21 , but there is a sign
difference since in the former case the integral is clockwise while in the latter case it is counter clockwise.
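We therefore have
$$\int_{-\infty}^{\infty}\frac{dp^0}{2\pi i}\,\frac{e^{-ip^0(t_2-t_1)}}{\left(p^0-(\omega_{\vec p}-i\epsilon s_{21})\right)\left(p^0+(\omega_{\vec p}-i\epsilon s_{21})\right)} = -s_{21}\,\frac{1}{2\omega_{\vec p}}\,e^{-i\omega_{\vec p}(t_2-t_1)}, \tag{3.94}$$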

so multiplying by −s21 we recover (3.88).
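The contour argument can be cross-checked by brute force. The sketch below evaluates the p0 integral on the right-hand side of (3.88) at small but finite ϵ with a midpoint rule on a truncated interval, and compares it with what the residue theorem gives at the same finite ϵ, namely e^{−ia(t2−t1)}/(2a) with a² = ω² − iϵ s21, which reduces to the left-hand side of (3.88) as ϵ → 0. The function names, the cutoff and the step size are illustrative assumptions.

```python
import cmath
import math

def rhs_388(omega, dt, eps, cutoff=200.0, step=0.002):
    """Midpoint-rule estimate of the p^0 integral in (3.88) at finite eps, truncated to |p^0| < cutoff."""
    s21 = 1.0 if dt > 0 else -1.0
    n = int(round(2.0 * cutoff / step))
    total = 0.0 + 0.0j
    for i in range(n):
        p0 = -cutoff + (i + 0.5) * step
        # p^2 + m^2 = omega^2 - (p^0)^2 in the mostly-plus signature used in these notes
        denom = omega * omega - p0 * p0 - 1j * eps * s21
        total += (-1j * s21 / denom) * cmath.exp(-1j * p0 * dt)
    return total * step / (2.0 * math.pi)

def pole_value(omega, dt, eps):
    """Residue-theorem result at finite eps: exp(-i a dt) / (2 a) with a^2 = omega^2 - i eps s21."""
    s21 = 1.0 if dt > 0 else -1.0
    a = cmath.sqrt(omega * omega - 1j * eps * s21)  # principal branch, Re(a) > 0
    return cmath.exp(-1j * a * dt) / (2.0 * a)

if __name__ == "__main__":
    for dt in (1.5, -1.5):
        print(dt, rhs_388(1.0, dt, 0.05), pole_value(1.0, dt, 0.05))
```

Both signs of t2 − t1 are included because the contour closes in opposite half planes in the two cases, and the agreement in both is what the factor s21 in (3.88) is designed to achieve.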


Finally it will be convenient later to have a formula similar to (3.89) for the Feynman propagator.
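Proceeding as in the derivation of (3.81), we have
$$\begin{aligned}
G_F(x_2,x_1) &= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\omega_{\vec p}}\,e^{is_{21}\,p\cdot(x_2-x_1)} \\
&= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\omega_{\vec p}}\,e^{i\vec p\cdot(\vec x_2-\vec x_1)-is_{21}\omega_{\vec p}(t_2-t_1)}
\end{aligned} \tag{3.95}$$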
23 The intuition for this is essentially the divergence theorem in two dimensions, although to make it rigorous the logic goes

the other way since the divergence theorem requires continuous partial derivatives and showing that an analytic function has
continuous partial derivatives is usually done using the residue theorem.

where in the first line there is an s21 in the exponent because depending on the time-ordering which field
contributes an ap⃗ and which contributes an a†p⃗ switches and in going from the first line to the second line we
flipped the direction of the integral over p⃗. The quantity s21 (t2 − t1 ) is always positive, so in this integral we
can use the identity (3.88) replacing (t2 − t1 ) → s21 (t2 − t1 ) and setting s21 to one on the right hand side.
Flipping the direction of the p0 integral we get
$$G_F(x_2,x_1) = \lim_{\epsilon\to0^+}\int\frac{d^dp}{(2\pi)^d}\,\frac{-i}{p^2+m^2-i\epsilon}\,e^{ip\cdot(x_2-x_1)}, \tag{3.96}$$

which is a bit simpler than the expression (3.89) for the two-point function. In particular the Feynman
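propagator has the nice property that it is a Green’s function for the Klein-Gordon operator:
$$\begin{aligned}
(\partial_2^2-m^2)\,G_F(x_2,x_1) &= \lim_{\epsilon\to0^+}\int\frac{d^dp}{(2\pi)^d}\,\frac{i(p^2+m^2)}{p^2+m^2-i\epsilon}\,e^{ip\cdot(x_2-x_1)} \\
&= i\int\frac{d^dp}{(2\pi)^d}\,e^{ip\cdot(x_2-x_1)} \\
&= i\delta^d(x_2-x_1).
\end{aligned} \tag{3.97}$$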

This would not have worked for the Wightman function since the derivative acting on s21 would have
generated additional terms.
You may be wondering why we stopped with two-point functions: what about three-point functions,
four-point functions, and so on? In free field theory the answer is simple: these end up either vanishing or
just being combinations of two-point functions. Indeed the n-point function

⟨Φ(x1 )Φ(x2 ) . . . Φ(xn )⟩ (3.98)

vanishes when n is odd since there are no terms with an equal number of creation and annihilation operators.
When n is even we simply pair them up to get a sum of products of two-point functions. For example to
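compute the four-point function we introduce annihilation and creation parts of Φ(x) as
$$\begin{aligned}
\Phi_-(x) &= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\,e^{ip\cdot x}\,a_{\vec p} \\
\Phi_+(x) &= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\,e^{-ip\cdot x}\,a^\dagger_{\vec p},
\end{aligned} \tag{3.99}$$
observe that
$$[\Phi_-(x),\Phi_+(y)] = \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\omega_{\vec p}}\,e^{ip\cdot(x-y)} = G(x,y), \tag{3.100}$$
and then compute
$$\begin{aligned}
\big\langle\Phi(x_4)\Phi(x_3)\Phi(x_2)\Phi(x_1)\big\rangle &= \big\langle\Phi_-(x_4)\big(\Phi_+(x_3)+\Phi_-(x_3)\big)\big(\Phi_+(x_2)+\Phi_-(x_2)\big)\Phi_+(x_1)\big\rangle \\
&= \big\langle\big([\Phi_-(x_4),\Phi_+(x_3)]+\Phi_-(x_4)\Phi_-(x_3)\big)\big([\Phi_-(x_2),\Phi_+(x_1)]+\Phi_+(x_2)\Phi_+(x_1)\big)\big\rangle \\
&= G(x_4,x_3)G(x_2,x_1) + \big\langle\Phi_-(x_4)\Phi_-(x_3)\Phi_+(x_2)\Phi_+(x_1)\big\rangle \\
&= G(x_4,x_3)G(x_2,x_1) + \big\langle\Phi_-(x_4)\big([\Phi_-(x_3),\Phi_+(x_2)]+\Phi_+(x_2)\Phi_-(x_3)\big)\Phi_+(x_1)\big\rangle \\
&= G(x_4,x_3)G(x_2,x_1) + G(x_3,x_2)G(x_4,x_1) + G(x_4,x_2)G(x_3,x_1).
\end{aligned} \tag{3.101}$$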

This pattern continues to higher orders: the n-point function with even n is given by the sum, over all ways
of grouping the n operators into pairs, of the product of the two-point functions of the pairs, with the order
of the operators in each pair given by their order in the full n-point function. The same is true for the
time-ordered n-point function, but with the two-point function replaced by the Feynman propagator.
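The pairing rule just stated is simple to turn into code. The following sketch enumerates all pairings of n operators, keeping within each pair the order in which the two operators appear in the full correlator, and sums the corresponding products of two-point functions. The names `pairings` and `free_n_point` are illustrative, and the two-point function is passed in as an arbitrary callable; the one used in the demonstration is a made-up, deliberately asymmetric toy (not the physical G) so that the ordering within pairs is visible.

```python
def pairings(indices):
    """Yield every way of splitting `indices` into pairs.

    Within each pair the element that appears earlier in `indices` comes first.
    """
    if not indices:
        yield []
        return
    first, rest = indices[0], indices[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def free_n_point(points, two_point):
    """Free-field vacuum correlator <Phi(points[0]) ... Phi(points[-1])> as a sum over pairings."""
    if len(points) % 2 == 1:
        return 0.0  # odd n-point functions vanish
    total = 0.0
    for pairing in pairings(list(range(len(points)))):
        term = 1.0
        for i, j in pairing:
            term *= two_point(points[i], points[j])
        total += term
    return total

if __name__ == "__main__":
    def toy_g(a, b):
        # hypothetical asymmetric stand-in for G(a, b), chosen only to expose operator ordering
        return a + 10.0 * b

    print(len(list(pairings([0, 1, 2, 3]))))      # number of pairings for n = 4
    print(free_n_point([4.0, 3.0, 2.0, 1.0], toy_g))
```

For n = 4 this reproduces the three terms of (3.101); for the time-ordered correlator one would pass the Feynman propagator as `two_point` instead.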

3.8 Homework
1. Evaluate the other two terms in our expression (3.35) for the free scalar Hamiltonian, confirming that
this leads to (3.37).
2. Find the vacuum wave functional for a free scalar field. Hint: the answer has the form
$$\Psi[\phi] \propto \exp\left(-\frac{1}{2}\int d^{d-1}x\,d^{d-1}y\,K(\vec x-\vec y)\,\phi(\vec x)\phi(\vec y)\right),$$

so you just need to find the function K(⃗x − ⃗y ). The condition you need to satisfy is that this wave
functional is annihilated by ap⃗ for all momenta p⃗, and you can use the expression (3.30) for ap⃗ and also
the definition (3.9) of the canonical momenta acting on wave functionals. Your life will be easiest if you
transform K and ϕ to momentum space, but extra credit if you can give a position-space expression
for K in d = 4 (Bessel functions are involved).
3. The response of superfluid liquid helium to a localized perturbation with source O1 = ϕ(t1 , ⃗x1 ) is given
by equation (3.76), with the two Wightman functions appearing in the commutator given by (3.85).
Taking the perturbation at t1 = 0 and ⃗x1 = 0 and taking the measured operator O2 to be ϕ(t, ⃗x), plot
the response ⟨Ω|ϕ(t, ⃗x)|Ω⟩ as a function of t and the spatial radius r = |x| for d = 3 and d = 4. Is there
a qualitative difference between the two cases?

4. Starting from the expression (3.64) for a complex scalar field and the canonical commutators (3.66),
calculate the commutators of the operators ap⃗ , bp⃗ , a†p⃗ , bp†⃗ . Derive an expression for the Hamiltonian H
in terms of these creation/annihilation operators, and also give an expression for the symmetry charge
Q for the symmetry ϕ′ = eiθ ϕ that you derived in the last homework. What are the charges of the
particles in this theory?

5. Expand the massless two-point function (3.85) in the limit d → 2. You will find a series in (d − 2)
that begins with a divergence that goes like 1/(d − 2) followed by a term that is finite and nonzero as
d → 2. What is this correction term? Do you see anything strange about it?
6. Extra credit: evaluate the momentum integral (3.81) for the two-point function assuming that x1 and
x2 are spacelike separated in the cases d = 2, d = 3, and d = 4. You will likely need to consult some
reference on Bessel integrals, e.g. Gradshteyn and Ryzhik or Abramowitz and Stegun, both of which
are available as pdfs online. If the experience leaves you enthusiastic you can try the case of timelike
separation as well; this is actually a bit easier since you can go to a frame where ⃗x2 − ⃗x1 = 0.

4 Algebras and symmetries in quantum field theory
In this lecture we return to formalism, introducing a general algebraic language that we can use to precisely
define the idea of symmetry in quantum field theory. We will learn about the difference between internal
symmetries and spacetime symmetries, learn more about global structure of the Lorentz group, and study
how correlation functions in quantum field theory are constrained by global symmetries.

4.1 The algebraic approach to field theory


In the Lagrangian approach to field theory we have been pursuing thus far, there is a set of “fundamental”
fields ϕa (x) appearing as dynamical variables in the Lagrangian. Other local operators such as ϕ2 and ∂µ ϕ∂ν ϕ
are constructed out of these fundamental fields and their derivatives. In strongly-interacting theories however
it is often the case that the fundamental fields are not so closely related to the interesting physics at long
distances. Indeed sometimes the same quantum field theory has multiple presentations in terms of different
choices of fundamental fields, which is a phenomenon called duality. It is therefore sometimes useful to
adopt a language for quantum field theory that de-emphasizes the fundamental fields and treats all local
operators on equal footing. This is the algebraic approach to quantum field theory.
The basic idea of algebraic field theory is that for each open spatial region R there is an algebra of
operators A[R] associated to that region. Roughly speaking A[R] consists of all the operators made out of
sums and products of the fields in R and their derivatives. There are various opinions about how general
the spatial regions R should be; in this class we will require that each R lies within a constant time slice in
some Lorentz frame.24 The algebras obey three natural axioms:
• Nesting: If R1 ⊂ R2 , then A[R1 ] ⊂ A[R2 ].

• Causality: If R1 and R2 are spacelike separated, then A[R1 ] ⊂ A′ [R2 ]. Here the symbol A′ [R] indicates
the commutant of A[R], meaning the set of (bounded) operators that commute (or anticommute
in the case of fermions) with everything in A[R].
• Haag Duality: For any region R we have A′ [R] = A[R̄], where R̄ is the interior of the spatial
complement of R in the time slice it lives in.
Nesting, also sometimes called “isotony”, formalizes the idea that you cannot make more operators by
restricting which fields you can use, causality is a consequence of the (anti)commutativity of fields at spacelike
separation, and Haag duality expresses the idea that the algebra is “complete” in the sense that A[R] contains
everything you can build out of the fields.25
Conceptually these axioms are all we will need from the algebraic approach to field theory, but there are
some mathematical subtleties in making the definition of A[R] precise which are worth discussing. Don’t
worry if the rest of this section goes by too fast; the goal is to make you aware of these things rather than
to turn you into a master practitioner. The first problem is that we saw in the last lecture that the fields
themselves are not actually genuine operators. For example if we act with a free scalar field on the vacuum
we get a state of infinite norm:
⟨Ω|Φ(x)Φ(x)|Ω⟩ = G(x, x) = ∞. (4.1)
To get a good operator we need to smear against a smooth (meaning infinitely-differentiable) function
f : Rd → R of compact support:
$$\Phi_f = \int d^dx\,f(x)\,\Phi(x). \tag{4.2}$$

24 We do this to avoid needing to discuss quantization on curved slices. More generally R can be any open achronal set.
25 Haag duality should not be confused with the “duality” mentioned in the previous paragraph, whereby the same quantum
field theory can have two seemingly different presentations. Unfortunately both usages are completely standard.

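To see that this makes the norm finite, we can first note that we have
$$\begin{aligned}
\Phi_f &= \int d^dx\,f(x)\int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\left[e^{ip\cdot x}\,a_{\vec p} + e^{-ip\cdot x}\,a^\dagger_{\vec p}\right] \\
&= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\left[\tilde f(\omega_{\vec p},\vec p)^*\,a_{\vec p} + \tilde f(\omega_{\vec p},\vec p)\,a^\dagger_{\vec p}\right],
\end{aligned} \tag{4.3}$$
where
$$\tilde f(k) = \int d^dx\,e^{-ik\cdot x}\,f(x) \tag{4.4}$$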

is the (d-dimensional) Fourier transform of f . It is useful to recall two facts about Fourier transforms:
• If f : Rd → R is a smooth function that is bounded in absolute value by $\frac{C}{1+|x|^{d+1}}$ for some C > 0 (here
|x| is the Euclidean length on Rd ), and which moreover has the property that when acted on by any
finite number of partial derivatives it continues to obey this bound (possibly with different C for different
sets of derivatives), then the Fourier transform $\tilde f(k)$ exists and decays faster than any power at large
|k|. The proof of this is fairly simple: by differentiating under the integral sign and integrating by
parts we have
$$\begin{aligned}
k_{\mu_1}\cdots k_{\mu_m}\,\tilde f(k) &= \int d^dx\,k_{\mu_1}\cdots k_{\mu_m}\,e^{-ik\cdot x}\,f(x) \\
&= i^m\int d^dx\,\partial_{\mu_1}\cdots\partial_{\mu_m}\!\left(e^{-ik\cdot x}\right)f(x) \\
&= (-i)^m\int d^dx\,e^{-ik\cdot x}\,\partial_{\mu_1}\cdots\partial_{\mu_m}f(x),
\end{aligned} \tag{4.5}$$
and the third line vanishes at large |k| by the Riemann-Lebesgue lemma (see Wikipedia), since by
assumption ∂µ1 . . . ∂µm f (x) is integrable: it is smooth and bounded in absolute value by $\frac{C}{1+|x|^{d+1}}$.

• If f : Rd → R is a continuous function of compact support then its Fourier transform $\tilde f(k)$ is an entire
function, meaning that it is analytic for arbitrary complex k. This is because we can simply define the
derivative of the Fourier transform by
$$\frac{\partial\tilde f}{\partial k_\mu} = \int_S d^dx\,(-ix^\mu)\,e^{-ik\cdot x}\,f(x), \tag{4.6}$$
which is convergent since f is continuous and S (the support of f ) is compact.
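The link between smoothness and Fourier decay can be seen in a one-dimensional numerical experiment. The sketch below compares the Fourier transform of a standard smooth bump function supported on |x| < 1 with that of a discontinuous box on the same interval, whose transform is 2 sin(k)/k and so decays only like 1/|k|. The choice of test functions, the function names and the grid size are illustrative assumptions rather than anything taken from the notes.

```python
import math

def bump(x):
    """Smooth (infinitely differentiable) function supported on |x| < 1."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def box(x):
    """Discontinuous function supported on |x| <= 1."""
    return 1.0 if abs(x) <= 1.0 else 0.0

def fourier_even(f, k, n=20000):
    """Fourier transform (integral of exp(-i k x) f(x)) of an even f supported on [-1, 1].

    Midpoint rule; for even f only the cosine part contributes, so the result is real.
    """
    h = 2.0 / n
    total = 0.0
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        total += math.cos(k * x) * f(x)
    return total * h

if __name__ == "__main__":
    for k in (0.0, 10.0, 30.0, 60.0):
        print(k, fourier_even(bump, k), fourier_even(box, k))
```

At large k the bump column drops off far more quickly than the box column, in line with the statement that the transform of a smooth compactly supported function decays faster than any power.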


Results of this type illustrate the general maxim that continuity/differentiability properties of an integrable
function f translate into statements about the decay of its Fourier transform at infinity.26 In particular we
have learned that the Fourier transform $\tilde f$ of a smooth function of compact support is a very well-behaved
function: it is analytic for all k and decays faster than any power at infinity. These properties ensure that
Φf is a better-behaved operator than Φ(x). For example we can compute the squared norm of the state Φf |Ω⟩:
$$\langle\Omega|\Phi_f\Phi_f|\Omega\rangle = \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\omega_{\vec p}}\,|\tilde f(\omega_{\vec p},\vec p)|^2. \tag{4.7}$$
This integral is now convergent at large |p| due to the fast decay of $\tilde f$, and for m > 0 it is also convergent
at p = 0 since ωp⃗ is finite there and $\tilde f$ is analytic. When m = 0 there is an apparent singularity at p = 0
due to the ωp⃗ in the denominator, but as long as d > 2 this is compensated by the volume measure $\frac{d^{d-1}p}{(2\pi)^{d-1}}$

26 Another useful result which is intermediate between these two is that an integrable function which is analytic in a strip of

finite thickness about the real axis has a Fourier transform which decays exponentially at large k.

Figure 7: Domains of dependence for spatial regions in Minkowski space. The regions R1 and R2 are blue,
while their domains of dependence are the green diamond-shaped spacetime regions.

so the integral is still finite. When d = 2 there is a logarithmic divergence at p = 0 in the massless case,
which shows that there is indeed an infrared pathology for a massless scalar in d = 2 that cannot be removed
by smearing.27 It is important to emphasize that the introduction of smeared fields Φf is not purely a
mathematical convenience; no real detector has perfect spatial resolution, so this smearing is really physical
- the function f describes the spacetime profile of the detector which couples to Φ(x).
Which smeared operators ϕf can be associated to which spatial regions R? The answer to this question
is not completely obvious, since in order to get a good operator we need the support of f to have nontrivial
extent in time. On the other hand we should expect that in a relativistic field theory the operators at a
location x which lies to the future or past of a timeslice should be expressible solely in terms of the fields on
that timeslice which are not spacelike separated from x. Given a spatial region R we therefore introduce the
idea of its domain of dependence D[R], which is the set of spacetime points x with the property that every
inextendible timelike curve which passes through x also intersects R. In Minkowski space this is equivalent to the set of points
which are spacelike-separated from all points of the timeslice containing R that lie outside of R, see figure 7 for an illustration.28 Moreover this definition
has the property that if R1 ⊂ R2 then D[R1 ] ⊂ D[R2 ]. Operators Φf with the support of f contained in
D[R] will thus obey nesting and causality, and are therefore natural candidates for elements of A[R].
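For a concrete picture of D[R], here is a minimal sketch under an assumption not made in the text: R is taken to be a ball of radius r centered at the origin of the t = 0 slice, in units with c = 1. In that case D[R] is the diamond |t| + |x⃗| < r (up to boundary points, whose treatment depends on whether R is taken to be open or closed). The function name is hypothetical.

```python
import math

def in_domain_of_dependence(t, x, radius):
    """True if the point (t, x) lies inside D[R], where R is assumed to be a
    ball of the given radius centered at the origin of the t = 0 slice."""
    spatial_distance = math.sqrt(sum(xi * xi for xi in x))
    return abs(t) + spatial_distance < radius

# Nesting: enlarging the ball can only enlarge its domain of dependence.
point = (0.3, (0.5, 0.0, 0.0))
print(in_domain_of_dependence(*point, radius=1.0))  # inside the small diamond
print(in_domain_of_dependence(*point, radius=2.0))  # hence inside the large one
print(in_domain_of_dependence(1.5, (0.0, 0.0, 0.0), radius=1.0))  # too late in time
```

Any point accepted for the smaller radius is automatically accepted for the larger one, which is the nesting property D[R1] ⊂ D[R2] in this special case.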
There is one further issue however that needs to be addressed: although the operator Φf is better-
behaved than Φ(x), it still can in general have arbitrarily large eigenvalues. An operator whose eigenvalues
are unbounded can have rather strong restrictions on its domain, which makes it difficult to include in
an algebra since products of unbounded operators are complicated to handle. For example in the simple
harmonic oscillator the state
$$|\psi\rangle = \frac{\sqrt{6}}{\pi}\sum_{n=1}^{\infty}\frac{1}{n}\,|n\rangle \tag{4.8}$$
has unit norm, but if we act on this state with the Hamiltonian $H = \sum_n \omega(n + 1/2)|n\rangle\langle n|$ we get a state
of infinite norm and the expectation value of H in the state |ψ⟩ is also infinity. This kind of divergence is
usually viewed as unphysical however, as given a detector of finite size we can’t actually measure an observable
with an infinite number of distinct possible outcomes. It is thus standard to restrict A[R] to only contain
operators O which are bounded in the sense that there is some constant C such that $\sqrt{\langle\psi|O^\dagger O|\psi\rangle} \le C$
for all normalized states |ψ⟩. Given smeared fields Φf it is not difficult to create bounded operators, for
27 This has interesting physical consequences, with perhaps the most important being that there cannot be spontaneous

breaking of a continuous symmetry in d = 2. This statement is called the Mermin-Wagner-Coleman theorem, and we will say
more about it when we get to spontaneous symmetry breaking later in the semester.
28 Another way to motivate the definition of the domain of dependence is that it is the region in which the wave equation (or

more generally any well-behaved hyperbolic PDE) should have a unique solution given initial data specified on R. Outside of
D[R] the solution will depend also on the initial data outside of R.

example if Φ(x) is hermitian then $e^{i\Phi_f}$ and $\frac{1}{1+\Phi_f^2}$ are both bounded, and so is the spectral projection onto
the eigenstates of Φf which lie between any two distinct real numbers.
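To see the harmonic oscillator example concretely, here is a small numerical sketch (not part of the original argument) of the partial sums of the norm and of the expectation value of H in the state (4.8). It assumes the sum starts at n = 1 so that every coefficient is finite, it sets ω = 1 by default, and the helper name partial_sums is hypothetical.

```python
import math

def partial_sums(n_max, omega=1.0):
    """Partial sums up to n_max of <psi|psi> and <psi|H|psi> for the state
    |psi> = (sqrt(6)/pi) * sum_{n>=1} |n>/n, with H|n> = omega*(n + 1/2)|n>."""
    norm_sq = 0.0
    energy = 0.0
    for n in range(1, n_max + 1):
        probability = 6.0 / (math.pi ** 2 * n ** 2)  # |<n|psi>|^2
        norm_sq += probability
        energy += probability * omega * (n + 0.5)
    return norm_sq, energy

for n_max in (10 ** 2, 10 ** 4, 10 ** 6):
    norm_sq, energy = partial_sums(n_max)
    print(n_max, norm_sq, energy)
```

The norm converges to one, while the energy keeps growing like (6/π²) ω log n_max, so the state is normalizable but lies outside the domain on which H has a finite expectation value.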
The algebra A[R] associated to a spatial region R in quantum field theory gives an example of a famous
mathematical notion:
Definition 1 Let H be a Hilbert space. A set A of bounded operators on H is a von Neumann algebra
if the following things are true:
(1) A is closed under addition, multiplication, and hermitian conjugation.

(2) A contains λI for any λ ∈ C, where I is the identity operator.


(3) A is closed under “weak limits”, meaning that if On ∈ A is a sequence of operators such that the
sequences ⟨ψ|On |ϕ⟩ are convergent for all states |ψ⟩, |ϕ⟩ ∈ H then there exists an operator O ∈ A such
that ⟨ψ|O|ϕ⟩ = limn→∞ ⟨ψ|On |ϕ⟩ for all |ψ⟩, |ϕ⟩ ∈ H.

Elements of A[R] are bounded for the reasons discussed in the previous paragraph, they obey (1) because if
we can measure two hermitian operators O1 and O2 then we can measure simple functions of them such as
O1 + O2 and O1 O2 + O2 O1 and i(O1 O2 − O2 O1 ), they obey (2) because we can always measure the identity
by doing nothing, and they obey (3) because a limit of measurements should be a measurement. There are
many powerful mathematical results about von Neumann algebras with interesting implications for quantum
field theory, and in particular there is a classification of von Neumann algebras under which the algebras
associated to bounded regions are “type III1 ”, but since this is not a class in mathematical physics we will stop
here.

4.2 Symmetry in quantum mechanics


What is a symmetry in quantum field theory? At the classical level we already discussed this in the context
of Noether’s theorem, where we defined a symmetry as a local transformation of the dynamical fields which
leaves the action invariant up to future/past boundary terms. From the path integral point of view (which
we have not yet introduced) we could just continue to apply this definition quantum mechanically, but it is
useful to also consider how to define symmetries in quantum mechanics directly from the Hilbert space point
of view.
A rather minimal requirement for a symmetry in quantum mechanics is that it should at least preserve
the probabilistic interpretation of the inner product, meaning that it should be an invertible transformation
f : H → H of Hilbert space that preserves instantaneous transition amplitudes

|(f (ψ), f (ϕ))|² = |(ψ, ϕ)|². (4.9)

Here we have temporarily dispensed with Dirac notation and instead used the mathematician notation (·, ·)
for the inner product on H.29 We also require that the inverse transformation preserves amplitudes in the
same way. It is a fundamental theorem of Wigner (see section 2.A of Weinberg) that the only transformations
obeying these requirements arise from unitary or antiunitary operators on H. In other words we must either
have a linear operator U obeying
(U ψ, U ϕ) = (ψ, ϕ) (4.10)
for all ψ and ϕ such that
f (ψ) = U ψ, (4.11)
or else an antilinear operator Θ obeying

(Θψ, Θϕ) = (ϕ, ψ) (4.12)


29 The reasons for this notational change are 1) to write equation (4.9) in Dirac notation we’d need to introduce a dual action

of f on bras and 2) Dirac notation is confusing when antilinear operators are involved.

for all ψ and ϕ such that
f (ψ) = Θψ. (4.13)
A linear operator L is one for which
L(aψ + bϕ) = aLψ + bLϕ, (4.14)
while an antilinear operator A is one for which

A(aψ + bϕ) = a∗ Aψ + b∗ Aϕ. (4.15)

Defining the adjoints of linear/antilinear operators by

(ψ, L† ϕ) = (Lψ, ϕ)
(ψ, A† ϕ) = (ϕ, Aψ), (4.16)

we see that a linear operator U is unitary if and only if U † U = I and an antilinear operator Θ is antiunitary
if and only if Θ† Θ = I.
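A concrete finite-dimensional model may help here: on C^N any antiunitary operator can be written as a unitary matrix acting after complex conjugation of the components in a fixed basis. The following sketch checks (4.10), (4.12), and (4.15) numerically for N = 2. The particular matrix U, the vectors, and the helper names are arbitrary choices made for illustration.

```python
import cmath
import math

def inner(psi, phi):
    """Inner product (psi, phi), antilinear in the first argument."""
    return sum(a.conjugate() * b for a, b in zip(psi, phi))

def mat_vec(m, v):
    """Ordinary (linear) action of the matrix m on the vector v."""
    return [sum(m[i][j] * v[j] for j in range(len(v))) for i in range(len(m))]

def antiunitary(u, psi):
    """Theta = U K: conjugate the components of psi, then apply the unitary u."""
    return mat_vec(u, [complex(c).conjugate() for c in psi])

# An arbitrary 2x2 unitary matrix (its columns are orthonormal).
angle = 0.7
U = [[cmath.exp(0.3j) * math.cos(angle), -math.sin(angle)],
     [cmath.exp(0.3j) * math.sin(angle), math.cos(angle)]]

psi = [1 + 2j, 0.5 - 1j]
phi = [-0.3j, 2 + 1j]

unitary_overlap = inner(mat_vec(U, psi), mat_vec(U, phi))
antiunitary_overlap = inner(antiunitary(U, psi), antiunitary(U, phi))

print(unitary_overlap, inner(psi, phi))      # these agree, as in (4.10)
print(antiunitary_overlap, inner(phi, psi))  # these agree, as in (4.12)
```

Both operators preserve |(ψ, ϕ)|², which is all that (4.9) asks for; the antiunitary one does so while complex-conjugating the amplitude itself.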
Although preserving instantaneous transition amplitudes is a necessary condition to have a symmetry in
quantum mechanics, it is clearly not sufficient: otherwise any unitary or antiunitary operator would be a
symmetry! There must also be a sense in which the unitaries/antiunitaries which are genuine symmetries
preserve more of the structure of the theory. In particular any symmetry of quantum theory should be
compatible with its dynamics. This requirement is easiest to formalize when the symmetry in question does
not affect the direction of time evolution: we then simply require that

e−iHt U = U e−iHt , (4.17)

i.e. that transforming and then evolving is the same as evolving and then transforming. Multiplying by U †
on the left, we can also write this as
U † e−iHt U = e−iHt . (4.18)
Since either of these equations must be true for all t, they are equivalent to requiring that

(iH)U = U (iH). (4.19)

So far we have not decided whether U is unitary or antiunitary. Let’s first try antiunitary: then (4.19) is
equivalent to
HU = −U H. (4.20)
This however leads to trouble: if ψE is an energy eigenstate of energy E, then we have

HU ψE = −U HψE = −EU ψE (4.21)

and thus we see that U ψE is an energy eigenstate of energy −E. Most Hamiltonians of physical interest
do not have the property that their spectrum is symmetric about H = 0, and in particular in quantum
field theory the Hamiltonian is usually bounded from below but not from above. Thus we have learned
that any symmetry which does not affect the direction of time evolution is implemented by a unitary (NOT
antiunitary) operator on Hilbert space. Equation (4.19) then tells us that

HU = U H, (4.22)

which is the usual maxim that a symmetry in quantum mechanics is a unitary operator that commutes with
the Hamiltonian.
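The only place where the unitary and antiunitary cases differ in this argument is in moving the factor of i past U. Spelled out using the definitions (4.14) and (4.15), the step reads as follows (a short worked step added for clarity):

```latex
% Linear (unitary) U:       U(i\psi) = +i\,U\psi
% Antilinear (antiunitary): U(i\psi) = -i\,U\psi
\begin{align}
\text{unitary:}\qquad     (iH)U = U(iH) = +i\,UH \quad &\Longrightarrow\quad HU = UH, \\
\text{antiunitary:}\qquad (iH)U = U(iH) = -i\,UH \quad &\Longrightarrow\quad HU = -UH .
\end{align}
```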
The set of all distinct unitaries U that commute with the Hamiltonian forms what mathematicians call
a group, which is a set G whose elements can be multiplied together in such a way that the following
conditions are true:

• Associativity: For any g1 , g2 , g3 ∈ G we have (g1 g2 )g3 = g1 (g2 g3 ).

• Identity: There exists an element e ∈ G such that eg = ge = g for all g ∈ G.

• Inverses: For each g ∈ G, there exists g⁻¹ ∈ G such that gg⁻¹ = g⁻¹g = e.

These axioms imply that e and g⁻¹ are unique. They are obeyed here because if U1 and U2 commute with
the Hamiltonian then
U1 U2 H = U1 HU2 = HU1 U2 , (4.23)
and if U H = HU then
U † H = U † HU U † = U † U HU † = HU † . (4.24)
Hopefully this is not your first time seeing the definition of a group, but if it is then I assure you groups are
ubiquitous in physics so best to get started learning about them. Simple examples of groups are the real
numbers R under addition, the group U (1) of complex phases eiθ under multiplication, the group U (N ) of
N × N unitary matrices under matrix multiplication, and the group SU (N ) of N × N unitary matrices of
determinant one (again under matrix multiplication).30 A group G is called abelian if it is commutative,
meaning that g1 g2 = g2 g1 for all g1 , g2 ∈ G. R and U (1) are abelian, while U (N ) and SU (N ) are non-abelian
for N ≥ 2.
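The non-abelian statement can be checked by hand in the smallest case. The sketch below multiplies two elements of SU(2) in both orders; the two matrices (i times the Pauli matrices σx and σz) are my choice for illustration, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Product of two square matrices stored as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def det2(m):
    """Determinant of a 2x2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

g1 = [[0, 1j], [1j, 0]]    # i * sigma_x, unitary with determinant one
g2 = [[1j, 0], [0, -1j]]   # i * sigma_z, unitary with determinant one

print(det2(g1), det2(g2))  # both determinants equal one
print(matmul(g1, g2))      # one ordering of the product
print(matmul(g2, g1))      # the other ordering differs by an overall sign
```

The two products differ, so SU(2) is non-abelian, while their determinants are again one, as closure under multiplication requires.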
What about symmetries that do affect the direction of time evolution? In relativistic theories there are
only two such symmetries: we can mix time and space translations using a Lorentz boost, or we can reverse
the direction of time using time-reversal symmetry.31 We have already seen in our free scalar theory that any
Lorentz transformation which does not reverse time can be represented by a unitary operator U (Λ) which
acts on the annihilation operators as

$$U(\Lambda)\,a_{\vec p}\,U(\Lambda)^\dagger = \sqrt{\frac{\omega_{\vec p_\Lambda}}{\omega_{\vec p}}}\;a_{\vec p_\Lambda}, \tag{4.25}$$

so in particular this is true for Lorentz boosts. More generally in any quantum field theory we expect that
a Lorentz boost in the n̂ direction of rapidity η acts on the Hamiltonian as

U † HU = cosh η H + sinh η n̂ · P⃗ , (4.26)

which is a consequence of the fact that the spacetime momentum P µ transforms as a spacetime vector.
Since we are (momentarily) considering the possibility that U could be antiunitary however, we should
really require that
U † (iH)U = i(cosh η H + sinh η n̂ · P⃗ ). (4.27)
If U is unitary this is equivalent to (4.26), but if it is antiunitary then we should instead require that

U † HU = −(cosh η H + sinh η n̂ · P⃗ ) (4.28)

This equation however is not continuous as η → 0, so this would be a rather pathological representation of
Lorentz symmetry. Moreover it would again have a problem with the spectrum of the Hamiltonian: given a
simultaneous eigenstate ψE,⃗p of H and P⃗ , we would have

HU ψE,⃗p = − (cosh ηE + sinh ηn̂ · p⃗) U ψE,⃗p . (4.29)

In any quantum field theory which can be interpreted as a scattering theory of particles it is quite natural
to impose the following requirement:
30 These examples may misleadingly suggest that all groups are matrix groups, meaning groups that can be represented

with finite-dimensional matrices. This is true for groups which are topologically compact, but it isn’t true in general.
31 Time-reversal symmetry may not actually be a symmetry by itself, for example in the Standard Model of particle physics

it isn’t, but we will see in a few lectures that there is a combination of time reversal with other transformations, called CRT ,
which is always a symmetry in any relativistic quantum field theory.

• Spectrum condition: In any relativistic quantum field theory we have

H ≥ n̂ · P⃗ , (4.30)

where n̂ is any unit vector and H is defined so that the energy of the ground state is zero. The operator
inequality means that H − n̂ · P⃗ is a positive semidefinite operator.
This condition should hold because each particle has energy $\omega = \sqrt{|p|^2 + m^2} \ge |p|$, and when we add up
energies there are no cancellations while when we add up momenta there can be.32 We then have (for η > 0)

cosh η E + sinh η n̂ · p⃗ ≥ cosh η E − sinh η |p| ≥ E(cosh η − sinh η), (4.31)

and so assuming that H is unbounded from above we can again generate energy eigenstates of arbitrarily
negative energy by acting with U . From now on we will therefore assume that boosts are implemented by
unitary operators.
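The bound (4.31) is easy to probe numerically. In the sketch below the lower bound is written as E e^{−η}, using cosh η − sinh η = e^{−η}; the function names and the sample momenta are hypothetical choices made for illustration.

```python
import math

def boosted_energy(energy, momentum, n_hat, eta):
    """Value of cosh(eta)*E + sinh(eta)*(n . p) for a state with energy E and
    momentum p, as it appears on the right-hand side of (4.26)."""
    n_dot_p = sum(n * p for n, p in zip(n_hat, momentum))
    return math.cosh(eta) * energy + math.sinh(eta) * n_dot_p

def lower_bound(energy, eta):
    """E*(cosh(eta) - sinh(eta)) = E*exp(-eta), the last expression in (4.31)."""
    return energy * math.exp(-eta)

# A particle of mass 1 moving against the boost direction (the worst case).
mass = 1.0
n_hat = (1.0, 0.0, 0.0)
for px in (-1.0, -10.0, -100.0):
    energy = math.sqrt(px * px + mass * mass)
    for eta in (0.5, 2.0):
        print(px, eta, boosted_energy(energy, (px, 0.0, 0.0), n_hat, eta),
              lower_bound(energy, eta))
```

With a unitary boost the transformed energy therefore stays positive, whereas the extra minus sign in (4.29) would turn the same numbers into energies below −E e^{−η}, which become arbitrarily negative as E grows.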
Finally we can consider time-reversal, which we will take to be represented by an operator ΘT . This
should act on the time evolution operator as

Θ†T e−iHt ΘT = eiHt , (4.32)

and thus obey


Θ†T (iH)ΘT = −iH. (4.33)
If we assume ΘT is unitary then we have

Θ†T HΘT = −H, (4.34)

which we can discard as before since it would require the spectrum of H to be symmetric about zero. We
therefore see that we want ΘT to be antiunitary, since this gives the more reasonable condition

Θ†T HΘT = H. (4.35)

For example in the simple harmonic oscillator time reversal is implemented by an antiunitary operator which
acts on the X basis as
ΘT |x⟩ = |x⟩, (4.36)
leading to

Θ†T XΘT = X
Θ†T P ΘT = −P. (4.37)

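The energy eigenstates |n⟩ have real wave functions in the X basis, and thus are invariant under time-reversal:
$$\Theta_T|n\rangle = \Theta_T\int dx\,\langle x|n\rangle|x\rangle = \int dx\,\langle x|n\rangle^{*}\,\Theta_T|x\rangle = \int dx\,\langle x|n\rangle|x\rangle = |n\rangle. \tag{4.38}$$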
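In a discretized X basis this becomes very explicit: an antiunitary operator obeying ΘT |x⟩ = |x⟩ is just complex conjugation of the wave function components, so conjugating an operator by ΘT replaces its matrix elements in the X basis by their complex conjugates. The sketch below assumes a lattice of spacing a and a central-difference momentum operator; both are my choices for illustration rather than anything used in the lecture, and the function names are hypothetical.

```python
def position_matrix(n, a):
    """Diagonal matrix of lattice positions, centered on zero."""
    return [[(i - (n - 1) / 2) * a if i == j else 0.0 for j in range(n)]
            for i in range(n)]

def momentum_matrix(n, a):
    """Central-difference version of P = -i d/dx: hermitian and purely imaginary."""
    p = [[0j] * n for _ in range(n)]
    for j in range(n - 1):
        p[j][j + 1] = -1j / (2 * a)
        p[j + 1][j] = 1j / (2 * a)
    return p

def time_reverse(op):
    """Conjugation by Theta_T in the X basis: complex-conjugate every matrix element."""
    return [[complex(entry).conjugate() for entry in row] for row in op]

X = position_matrix(5, 0.1)
P = momentum_matrix(5, 0.1)
minus_P = [[-entry for entry in row] for row in P]

print(time_reverse(X) == X)        # X is unchanged
print(time_reverse(P) == minus_P)  # P flips sign
```

The real matrix X is unchanged and the purely imaginary matrix P flips sign, which is the lattice version of (4.37).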

4.3 Internal symmetries in quantum field theory


In quantum field theory there is additional structure which is not present in general quantum systems: the
operators are organized into the local algebras A[R] obeying nesting, causality, and duality. In order for a
symmetry in quantum field theory to be useful, it needs to respect this local structure. The simplest kind of
symmetry that respects this structure is an internal symmetry, which roughly speaking is a symmetry that
maps any local (Heisenberg) operator O(x) to another local operator which is located at the same spacetime
point. More formally we have a definition:
32 If we are willing to just assume that boosts are unitary, for example because we reject the discontinuity at η = 0 in the

antiunitary case, then we can give a simpler and more rigorous argument for the spectrum condition: it must be true so that
the Hamiltonian in any Lorentz frame is a positive operator.

Definition 2 An internal symmetry of a quantum field theory in d-dimensional Minkowski space is a
unitary operator U such that
(1) For any spatial region R the algebra A[R] is preserved by conjugation by U and U † , meaning that for
any O ∈ A[R] we have U † OU ∈ A[R] and U OU † ∈ A[R].

(2) For any spacetime point x the energy-momentum tensor Tµν (x) is invariant under conjugation by U :

U † Tµν (x)U = Tµν (x). (4.39)

The first requirement here expresses the idea that the symmetry should preserve the local algebra. The
second is a strengthening of the idea that U should commute with the Hamiltonian: it expresses the idea
of local conservation of the symmetry charge. More concretely, it says that symmetry charge cannot
leave a region of space without passing through its edges (see figure ). This is not obvious, and showing
it is a consequence of (4.39) requires more differential geometry than we are using in this class.33 We can
also motivate (4.39) in a more mundane way: a generic quantum field theory shouldn’t have more than one
energy-momentum tensor, and whatever an internal symmetry sends the energy-momentum tensor to is an
equally valid candidate for an energy-momentum tensor and therefore must be the original one. The set of
internal symmetries in a quantum field theory forms a group, as you can easily check.
There is an important further classification of internal symmetries based on what kinds of operators they
act nontrivially on. In the simplest quantum field theories all operators are built out of the local operators,
in which case any nontrivial internal symmetry U must act nontrivially on some local operator O(x). Such
internal symmetries are called global internal symmetries. An example of a global internal symmetry is the
phase rotation of a free complex scalar

U (θ)† Φ(x)U (θ) = eiθ Φ(x), (4.40)

whose symmetry group is clearly isomorphic to the group U (1). Conventionally we say that this theory
has a U (1) global symmetry. This semester we will only discuss theories where all operators are built from
local operators, so all internal symmetries are global. Next semester we will discuss gauge theories such
as quantum electrodynamics, where there can be extended operators that are not built from local operators.
The reason for this is familiar from Maxwell theory: we cannot create an electrically charged particle without
also creating an electric field sourced by it that satisfies Gauss’s law, and this electric field must extend out
to spatial infinity. Therefore there are no local operators that carry nonzero electric charge. On the other
hand there are clearly states of nonzero electric charge, such as a state with one electron in the center of
space. These are created by acting on the vacuum with extended operators that create both the electron
and its Coulomb field, and it is these extended operators which carry nonzero electric charge.34
Another important question about any internal symmetry in quantum field theory is whether or not
the ground state |Ω⟩ is invariant. If it is not, then we say that the symmetry is spontaneously broken.
Spontaneously broken global internal symmetries are very interesting in quantum field theory, for example
being essential to our understanding of magnets, superfluidity, and nuclear physics. There is also a sense
33 More formally local conservation is expressed as the requirement that we can continuously deform the slice on which U

is defined without changing the operator. This is often described by saying that the symmetry operator U is a topological
surface operator. In the continuous case this is a consequence of Noether’s theorem: the charge $Q = \int d^{d-1}x\, J^0(t, \vec x)$ can be
written as $Q = \int_\Sigma n^\mu J_\mu$ where Σ is the surface t = 0 and $n^\mu$ is its normal vector, and then the fact that we can continuously
deform Σ without changing Q is a consequence of the divergence theorem and the current conservation equation ∂µ J µ = 0.
The basic idea in showing that the invariance of Tµν implies this deformability in general is to use that the stress tensor is the
functional derivative of the action with respect to the metric and that the action is invariant under arbitrary diffeomorphisms
which act on both the dynamical fields and also the background spacetime metric.
34 The distinction between gauge and global symmetry defined here is not the way this distinction is traditionally presented.

The conventional definition is that in terms of the fundamental fields a global symmetry is one which acts the same way at all
points in space while a gauge symmetry is one where the symmetry transformation can vary from point to point. This definition
is problematic however, as most of the gauge transformations defined this way are mere redundancies of description and for
discrete symmetries it isn’t clear what the difference is. The algebraic definition I’ve given here isolates the physical distinction
between the two without introducing confusing historical baggage.

in which gauge symmetries can be spontaneously broken, called the Anderson-Higgs mechanism, although
the concept is somewhat less well-defined than for global symmetries. We will have more to say about
spontaneous symmetry breaking in later lectures. If an internal global symmetry is unbroken, meaning
that the ground state is invariant, then it implies a powerful constraint on the correlation functions of the
theory. Indeed if we define
O′ (x) = U † O(x)U, (4.41)
then we must have

⟨O1′ (x1 ) . . . On′ (xn )⟩ = ⟨Ω|U † O1 (x1 )U . . . U † On (xn )U |Ω⟩ = ⟨O1 (x1 ) . . . On (xn )⟩. (4.42)

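For example if we have a U (1) global symmetry under which each operator Oi has charge qi , meaning that
$U(\theta)^\dagger O_i(x) U(\theta) = e^{i q_i \theta} O_i(x)$, this tells us that for all θ ∈ [0, 2π] we have

$$e^{i(q_1+\ldots+q_n)\theta}\,\langle O_1(x_1)\ldots O_n(x_n)\rangle = \langle O_1(x_1)\ldots O_n(x_n)\rangle, \tag{4.43}$$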

which shows that this correlation function obeys the selection rule that it must vanish unless the sum of the
operator charges vanishes. For example this explains why you will find that ⟨Φ(x)Φ(y)⟩ = ⟨Φ† (x)Φ† (y)⟩ = 0
in the free complex scalar theory.
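One way to see the selection rule at work is to average the phase in (4.43) over the group: the average is one when the total charge vanishes and zero otherwise, so only neutral correlation functions can survive. The sketch below does this average numerically for small integer charges. It assumes, following (4.40), that Φ carries charge +1 and Φ† carries charge −1; the function name and the number of sample angles are hypothetical choices.

```python
import cmath
import math

def u1_average(total_charge, samples=360):
    """Average of exp(i*Q*theta) over equally spaced angles theta in [0, 2*pi)."""
    phases = (cmath.exp(1j * total_charge * 2 * math.pi * k / samples)
              for k in range(samples))
    return sum(phases) / samples

charge = {"Phi": 1, "Phi_dagger": -1}

for operators in (("Phi", "Phi"), ("Phi_dagger", "Phi_dagger"), ("Phi_dagger", "Phi")):
    total = sum(charge[name] for name in operators)
    print(operators, total, abs(u1_average(total)))
```

Only the neutral combination of Φ† with Φ has a nonvanishing average, matching the statement that ⟨Φ(x)Φ(y)⟩ and ⟨Φ†(x)Φ†(y)⟩ vanish.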

4.4 Spacetime symmetries in quantum field theory


We now turn to symmetries that act nontrivially on the spacetime coordinates, which are called spacetime
symmetries. In terms of their action on operators, these are symmetries that move local operators around.
In relativistic field theory the most familiar of these are Poincaré transformations, which for example act on
a scalar field as
U (Λ, a)† Φ(x)U (Λ, a) = Φ(Λ−1 (x − a)). (4.44)
There are three other kinds of spacetime symmetries that can show up in relativistic theories, which I’ll
mention here but not discuss further:
• Conformal symmetry: In quantum field theories which do not possess any dimensionful parameter,
such as the massless free scalar theory, in addition to Poincaré symmetry we also have a scaling
symmetry xµ′ = λxµ for any λ > 0. It is not obvious, but Poincaré symmetry plus scaling symmetry
seems to imply the existence of a broader spacetime symmetry called conformal symmetry, which
consists of arbitrary angle-preserving coordinate transformations. Field theories with this enhanced
symmetry are called conformal field theories, and conformal field theories are very important to the
logical structure of quantum field theory: any quantum field theory is supposed to asymptote to a
conformal field theory in the limit of short or long distance.
• Supersymmetry: Supersymmetries are fermionic symmetries that exchange fermionic and bosonic
fields. The spin-statistics theorem (which we will prove in a few lectures) shows that bosons must
have integer spin and fermions must have half-integer spin, so a symmetry which exchanges them
must transform nontrivially under rotations. Supersymmetries thus mix nontrivially with Poincaré
transformations, and must thus be spacetime symmetries themselves. Supersymmetric field theories
have many nice properties, and in particular many interesting quantities can be computed exactly. They
are thus a source of solvable examples of interesting field theory phenomena. There is also
some hope that supersymmetry will be relevant in the real world, for example to address the hierarchy
problem in particle physics (as we will discuss later), and also in string theory where supersymmetry
seems to be necessary for the consistency of the theory.
• Diffeomorphism symmetry: There is a particularly simple kind of quantum field theory called a
topological field theory, for which arbitrary coordinate transformations are symmetries. These theories
arise in some interesting condensed matter systems such as those exhibiting the fractional quantum
Hall effect, and they also appear in some corners of string theory. One can think of topological field
theory as a special kind of conformal field theory.

Returning to Poincaré symmetry, the full set of Poincaré transformations forms a group called the Poincaré
group and it is useful to now make a few general comments about its global structure. Recall that this is
defined to be the set of coordinate transformations

xµ′ = Λµν xν + aµ , (4.45)

with aµ arbitrary and Λ obeying


Λµ α Λν β ηµν = ηαβ . (4.46)
The subgroup of the Poincaré group with a = 0 is called the Lorentz group, and it is denoted O(d − 1, 1).
Taking the determinant of (4.46) we see that

(det Λ)2 = 1, (4.47)

and splitting the time and space terms of the 00 component of (4.46) we see that
$$(\Lambda^0{}_0)^2 = 1 + \sum_i (\Lambda^i{}_0)^2 \tag{4.48}$$

and thus
(Λ00 )2 ≥ 1. (4.49)
We therefore can split up the Lorentz group into four connected components labeled by the signs of det Λ
and Λ00 . The simplest of these components is the one containing the identity transformation, which is called
the identity component and denoted SO+ (d − 1, 1) (here “S” indicates unit determinant and “+” indicates
Λ00 ≥ 1). Any element of the other components can be written as an element of SO+ (d − 1, 1) multiplied
by one of the following three Lorentz transformations:

R : (t, x1 , x2 , . . . , xd−1 ) ↦ (t, −x1 , x2 , . . . , xd−1 )
T : (t, x1 , x2 , . . . , xd−1 ) ↦ (−t, x1 , x2 , . . . , xd−1 )
RT : (t, x1 , x2 , . . . , xd−1 ) ↦ (−t, −x1 , x2 , . . . , xd−1 ). (4.50)

The transformation R reflects the spatial x1 coordinate, the transformation T reverses time, and the transfor-
mation RT does both. Due to our general discussion above we should expect that T and RT are represented
by antiunitary operators ΘT and ΘRT , while R is represented by a unitary operator UR . Therefore two of
the connected components of the Lorentz group are unitary and two are antiunitary. When d is even it is
conventional to replace R by an operation P, called parity, that reflects all spatial coordinates. When d is
odd however P is in SO+ (d − 1, 1), so in general it is best to stick with R.
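The labeling of the four components by the signs of det Λ and Λ00 can be turned into a few lines of code. This is a minimal sketch with hypothetical helper names; it assumes its input already satisfies (4.46) and only reads off the two signs.

```python
import math

def det(m):
    """Determinant by Laplace expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def diag(entries):
    """Diagonal matrix with the given entries."""
    n = len(entries)
    return [[entries[i] if i == j else 0.0 for j in range(n)] for i in range(n)]

def boost_x(eta, d=4):
    """Boost of rapidity eta along x1 in d spacetime dimensions."""
    lam = diag([1.0] * d)
    lam[0][0] = lam[1][1] = math.cosh(eta)
    lam[0][1] = lam[1][0] = math.sinh(eta)
    return lam

def component(lam):
    """Name the connected component of the Lorentz group containing lam,
    using the signs of det(lam) and of the time-time entry lam[0][0]."""
    det_sign = 1 if det(lam) > 0 else -1
    time_sign = 1 if lam[0][0] > 0 else -1
    labels = {(1, 1): "identity", (-1, 1): "R", (-1, -1): "T", (1, -1): "RT"}
    return labels[(det_sign, time_sign)]

print(component(boost_x(0.8)))          # a boost lies in the identity component
print(component(diag([1, -1, -1, -1]))) # parity in d = 4 lies in the R component
print(component(diag([1, -1, -1])))     # parity in d = 3 lies in the identity component
```

The last two lines illustrate why R is the safer convention: reflecting all spatial coordinates has determinant −1 only when the number of spatial dimensions is odd.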
The fact that the Poincaré group has four connected components suggests the possibility that there
could be relativistic field theories where only some of these components give genuine symmetries. We should
always include the identity component SO+ (d − 1, 1) (otherwise what would we mean by “relativistic field
theory”), but there are indeed interesting theories where some of the other components are not symmetries.
In fact this possibility is realized in the Standard Model of particle physics, which has neither parity nor
time-reversal symmetry. On the other hand we will see in a few lectures that there is a way of combining
RT with an internal transformation C, called charge conjugation, that gives a combined transformation
CRT which is always a symmetry in any relativistic field theory (even if C, R, and T separately are not
symmetries). Thus we always at least have a spacetime symmetry group SO(d − 1, 1), where the absence of
the + indicates that we have included the RT component of the Lorentz group but the S indicates that we
have not included the R and T components.
The existence of a unitary representation of SO+ (d−1, 1) obeying U (Λ, a)† A[R]U (Λ, a) = A[Λ−1 (R−a)],
the spectrum condition, nesting, and causality together form what are called the Haag-Kastler axioms
for algebraic quantum field theory. It is widely agreed that these axioms are necessary for any reasonable
definition of relativistic quantum field theory. There is less agreement on what else is needed; two things
I personally would also include are duality and the existence of a conserved symmetric energy-momentum
tensor that generates SO+ (d − 1, 1).

4.5 Correlation functions of tensor fields
Just as in the case of internal symmetries, spacetime symmetries imply powerful constraints on correlation
functions. First considering elements of the Poincaré group with Λ00 ≥ 1, we can define

O′ (x) = U † (Λ, a)O(x)U (Λ, a) (4.51)

with U (Λ, a) being unitary. Assuming the ground state is invariant under Poincaré symmetry, we then have

⟨Ω|O1′ (x1 ) . . . On′ (xn )|Ω⟩ = ⟨Ω|O1 (x1 ) . . . On (xn )|Ω⟩ (4.52)

just as in the internal case. In particular let’s say that the operators O(x) are tensor fields, meaning that
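they come with some number of raised and lowered indices such that their Poincaré transformation is

$$O'^{\mu_1\ldots\mu_n}{}_{\nu_1\ldots\nu_m}(x) = \Lambda^{\mu_1}{}_{\alpha_1}\cdots\Lambda^{\mu_n}{}_{\alpha_n}\,\Lambda_{\nu_1}{}^{\beta_1}\cdots\Lambda_{\nu_m}{}^{\beta_m}\,O^{\alpha_1\ldots\alpha_n}{}_{\beta_1\ldots\beta_m}(\Lambda^{-1}(x-a)). \tag{4.53}$$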

By taking Λ to be the identity we see that the correlation function must be invariant under translating all of
the coordinates $x_1^\mu, \ldots, x_n^\mu$ by an arbitrary vector $a^\mu$, and thus that the correlation function can only depend
on differences of these coordinates. When Λ is not the identity further constraints are imposed, for example
the two-point function of a vector operator V µ (x) must obey

⟨V µ (x1 )V ν (x2 )⟩ = Λµ α Λν β ⟨V α (Λ−1 x1 )V β (Λ−1 x2 )⟩, (4.54)

which determines the form of the two-point function to be

$$\langle V^\mu(x_1)V^\nu(x_2)\rangle = \eta^{\mu\nu} f\big((x_1-x_2)^2\big) + (x_1^\mu - x_2^\mu)(x_1^\nu - x_2^\nu)\, g\big((x_1-x_2)^2\big) \tag{4.55}$$

with f and g being functions of a single variable.


We can also consider Poincaré transformations with Λ00 ≤ −1, which are implemented by antiunitary
operators Θ(Λ, a). The local operators transform as

O′ (x) = Θ† (Λ, a)O(x)Θ(Λ, a) (4.56)

as before, but the constraint on correlation functions is now a bit trickier to derive. Assuming that the
ground state is invariant under Θ† , we have

⟨O1′ (x1 ) . . . On′ (xn )⟩ = (Ω, O1′ (x1 ) . . . On′ (xn )Ω)
= (Θ† Ω, Θ† O1 (x1 ) . . . On (xn )Ω)
= (O1 (x1 ) . . . On (xn )Ω, Ω)
= (Ω, (O1 (x1 ) . . . On (xn ))† Ω)
= ⟨On (xn )† . . . O1 (x1 )† ⟩. (4.57)

Here we have switched to mathematician notation in the middle to handle the antiunitary operators. Thus
we see that an antiunitary symmetry reverses the order of the operators in a correlation function and
takes their hermitian conjugates. This has the nice feature that it sends time-ordered correlation functions
to time-ordered correlation functions.

4.6 Correlation functions involving conserved currents


We saw in an earlier lecture that from the Lagrangian point of view, Noether’s theorem tells us that any
continuous symmetry in field theory leads to a conserved current J µ (x).35 The current conservation equation
35 This theorem has not quite been proven from the abstract point of view taken in this lecture (i.e. using only the Haag-

Kastler axioms), and indeed in my long paper with Hirosi Ooguri we give some counterexamples. These counterexamples are in
somewhat pathological theories however, and so far it seems likely that Noether’s theorem is true for sufficiently well-behaved
theories.

imposes interesting constraints on correlation functions that contain such currents, since inserting ∂µ J µ into
any (Wightman) correlation function must give zero. For example you will show on the homework that
imposing the conservation equation ∂µ V µ = 0 on the vector field appearing in (4.55) implies that the
functions f and g obey the constraint
$$f'(x) + x\,g'(x) + \frac{d+1}{2}\,g(x) = 0. \tag{4.58}$$
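As a quick sanity check of the constraint f′(x) + x g′(x) + (d+1)/2 g(x) = 0, one can try a simple power-law ansatz. The specific choice of f and g below is an assumption made purely for illustration (it is not taken from the lecture), with x standing for the invariant (x1 − x2)² as in (4.55):

```latex
% Illustrative ansatz (an assumption, not from the notes):
%   f(x) = x^{-(d-1)},   g(x) = -2\,x^{-d}
\begin{align}
f'(x) &= -(d-1)\,x^{-d}, \\
x\,g'(x) &= 2d\,x^{-d}, \\
\frac{d+1}{2}\,g(x) &= -(d+1)\,x^{-d}, \\
f'(x) + x\,g'(x) + \frac{d+1}{2}\,g(x) &= \big[-(d-1) + 2d - (d+1)\big]\,x^{-d} = 0 .
\end{align}
```

So this pair of functions gives a two-point function compatible with current conservation in any dimension d, while a generic pair f, g would not.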
There is also an interesting constraint on time-ordered correlation functions of a conserved current J µ . We
can illustrate the idea using a two-point function:

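$$\begin{aligned}
\partial_\mu\langle T J^\mu(x)O(y)\rangle &= \partial_\mu\Big(\Theta(x^0-y^0)\langle J^\mu(x)O(y)\rangle + \Theta(y^0-x^0)\langle O(y)J^\mu(x)\rangle\Big) \\
&= \delta(x^0-y^0)\langle[J^0(x),O(y)]\rangle,
\end{aligned} \tag{4.59}$$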

where the term on the right-hand side comes from the derivative acting on the Heaviside Θ function. More
generally we have
$$\partial_\mu\langle T J^\mu(x)O_1(y_1)\ldots O_n(y_n)\rangle = \sum_{m=1}^{n}\delta(x^0-y_m^0)\,\langle T O_1(y_1)\ldots[J^0(x),O_m(y_m)]\ldots O_n(y_n)\rangle. \tag{4.60}$$

Note that the commutators appearing on the right-hand side are at equal time due to the δ-function, and
thus by causality vanish when ⃗x ̸= ⃗y . We can therefore expand them in the δ function and its derivatives:

[J 0 (y 0 , ⃗x), O(y 0 , ⃗y )] = A(y 0 , ⃗y )δ d−1 (⃗x − ⃗y ) + B i (y 0 , ⃗y )∂i δ d−1 (⃗x − ⃗y ) + . . . (4.61)

Integrating this equation over ⃗x we see that

A(y) = [Q, O(y)] = iδS O(y), (4.62)

and so we see that the divergence of a time-ordered correlation function involving a conserved current obeys
the Ward identity:
$$\partial_\mu\langle T J^\mu(x)O_1(y_1)\ldots O_n(y_n)\rangle = i\sum_{m=1}^{n}\delta^d(x-y_m)\,\langle T O_1(y_1)\ldots\delta_S O_m(y_m)\ldots O_n(y_n)\rangle + \ldots, \tag{4.63}$$

where the “. . .” indicates terms proportional to derivatives of δ d (x − ym ) with respect to x. In quantum


field theory terms in correlation functions which vanish unless the operators are at the same point are
called contact terms, and usually they have ambiguities depending on how the theory is regulated at short
distance. The leading term in (4.61) is an exception, as we were able to determine it from the symmetry
algebra.

4.7 Homework
1. Compute the two-point functions ⟨Φ(x)Φ(y)⟩ and ⟨Φ† (x)Φ(y)⟩ for a complex scalar field, giving each
answer both as a covariant integral over spacetime momenta and also directly in position space in
terms of a Bessel function. You are free to use our results for the real scalar field, so you shouldn’t
need to evaluate any new integrals.
2. Show that if R1 and R2 are open spatial regions (which recall for us means that each lies in a constant
time slice in some Lorentz frame) obeying R1 ⊂ R2 , then their domains of dependence obey D[R1 ] ⊂
D[R2 ].
3. Show that SU (N ) is indeed a group, meaning that it is closed under matrix multiplication and matrix
inverse.
4. Show that every Lorentz transformation is indeed a product of an element of SO+ (d − 1, 1) with 1,
R, T , or RT . Hint: this shouldn’t require any detailed calculation or explicit parameterization of the
Lorentz group.

5. Argue that the vector two-point function indeed has the form (4.55), and also show that if ∂µ V µ = 0
then (4.58) follows.
6. Check that the two-point functions we computed for real and complex scalar fields are consistent with
the time-reversal constraint (4.57).
7. Extra credit: Antiunitary operators may seem somewhat counter-intuitive, but there is an elegant
characterization of any antiunitary operator due to Wigner that you will work out in this problem.
First argue that if Θ is antiunitary then Θ2 is unitary. There therefore must be a basis |i⟩ in which
we have Θ2 |i⟩ = e−2iθi |i⟩ and (Θ† )2 |i⟩ = (Θ2 )† |i⟩ = e2iθi |i⟩ for some θi ∈ (−π/2, π/2]. Work out how
Θ and Θ† act in this basis, and then argue that their action on arbitrary superpositions follows from
antilinearity. Hint: you want to show that up to phase redefinitions you can take this basis to consist
of states which are invariant and pairs of states which are exchanged up to a phase by acting with Θ.
You might start by showing that Θ|i⟩ is also an eigenstate of Θ2 .

5 Path integrals in quantum mechanics and quantum field theory
So far we have discussed quantum field theory in the Hamiltonian formalism. This formalism has many
advantages, for example it is where the physical interpretation of a quantum system in terms of measurements
and counting degrees of freedom is most clear, but it obscures the full symmetry of relativistic theories since
one needs to pick a Lorentz frame to define the canonical momenta and the Hamiltonian.36 Giving up on
manifest Lorentz invariance makes it harder to demonstrate some of the deeper consequences of Lorentz
invariance, such as the CRT and spin-statistics theorems, and it also makes practical calculations more
difficult since each intermediate step seems to depend on the Lorentz frame but the end result doesn’t. In
classical mechanics there is a clear way to handle this problem: we can think more about the Lagrangian and
less about the Hamiltonian. The goal of the path integral approach to quantum mechanics, first suggested
by Dirac and then greatly expanded by Feynman, is to give an independent (but equivalent) formulation of
quantum mechanics that is based on the Lagrangian instead of the Hamiltonian. We will spend the rest of this
lecture developing this approach.

5.1 Hamiltonian path integral in quantum mechanics


We will first discuss the path integral for a finite number of quantum degrees of freedom, which we will refer
to as Qa . They have canonical conjugate momenta Pa , and these obey the canonical commutation relations

\[
[Q^a, P_b] = i\delta^a_b, \qquad [Q^a, Q^b] = 0, \qquad [P_a, P_b] = 0. \tag{5.1}
\]

We will take the Hamiltonian H(Q, P ) to be a polynomial in Q and P whose terms are ordered in such a
way that all P ’s appear to the right of all Q’s (using the canonical commutation relations we can always
write any product of P s and Qs as a sum of terms with this ordering), and we will work in the Heisenberg
picture so that both Q and P are functions of time. For convenience we will take the Hamiltonian to be
time-independent, but there is no real difficulty in repeating the argument for a time-dependent Hamiltonian.
Let’s say we are interested in computing the propagator G(qf , qi ; tf , ti ) in the Q basis. In the Schrödinger
picture this is given by
G(qf , qi ; tf , ti ) = ⟨qf |e−iH(tf −ti ) |qi ⟩, (5.2)
but since we are working in the Heisenberg picture we’ll instead write it as

G(qf , qi ; tf , ti ) = ⟨qf , tf |qi , ti ⟩, (5.3)

where |q, t⟩ is a simultaneous eigenstate of the Qa (t):

Qa (t)|q, t⟩ = q a (t)|q, t⟩. (5.4)

Explicitly we have |q, t⟩ = eiHt |q, 0⟩.


The idea behind the path integral formalism is to break up the propagator into a repeated integral over
propagators with smaller time separation by inserting complete sets of states:
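\[
\langle q_f, t_f|q_i, t_i\rangle = \prod_{m=1}^{N-1} \int dq_m\, \langle q_f, t_f|q_{N-1}, t_f - \epsilon\rangle \langle q_{N-1}, t_f - \epsilon|q_{N-2}, t_f - 2\epsilon\rangle \ldots \langle q_2, t_i + 2\epsilon|q_1, t_i + \epsilon\rangle \langle q_1, t_i + \epsilon|q_i, t_i\rangle. \tag{5.5}
\]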
Figure 8: Discretizing a particle trajectory from $(q_i, t_i)$ to $(q_f, t_f)$. The dashed lines show the positions
which are integrated over in the intermediate steps.

36 There actually is an approach to Hamiltonian mechanics that does not require a choice of Lorentz frame, which is called the
covariant phase space approach. See my first paper with Jie-qiang for a review. Quantization in this approach is somewhat
subtle, and we won’t pursue the topic here.

Here we have split the time interval $t_f - t_i$ into N pieces of size ϵ. We can think of the integration variables
$q_1^a, \ldots, q_{N-1}^a$ as giving a discretization of possible trajectories the system could follow from $q_i^a$ at time $t_i$ to
$q_f^a$ at time $t_f$; see figure 8 for an illustration in the case of a single particle moving in one dimension. The
integral is therefore a sum over (discretized) intermediate trajectories: a path integral. The expression
(5.5) however is not so useful: we need some way to compute the propagators. At finite ϵ this of course isn’t
any easier than computing the full propagator, but in the limit of small ϵ a simplification is possible:

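\[
\begin{aligned}
\langle q', t + \epsilon|q, t\rangle &= \langle q', t|e^{-iH\epsilon}|q, t\rangle \\
&\approx \langle q', t|\big(1 - i\epsilon H(Q(t), P(t))\big)|q, t\rangle \\
&= \int \frac{dp}{2\pi}\, \langle q', t|\big(1 - i\epsilon H(q', p)\big)|p, t\rangle \langle p, t|q, t\rangle \\
&= \int \frac{dp}{2\pi}\, \big(1 - i\epsilon H(q', p)\big)\, e^{i\sum_a p_a (q'^a - q^a)} \\
&\approx \int \frac{dp}{2\pi}\, \exp\left[i\epsilon\left(\sum_a p_a \frac{q'^a - q^a}{\epsilon} - H(q', p)\right)\right].
\end{aligned} \tag{5.6}
\]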

Here in going from the first to second and fourth to fifth lines we have neglected terms which are O(ϵ2 ),
in going from the second to the third line we have inserted a complete set of states and used that in H
the momenta are ordered to the right, and in going from the third to the fourth line we have used the
momentum-space wave function
\[
\langle q, t|p, t\rangle = e^{i\sum_a p_a q^a}. \tag{5.7}
\]
We can then use this repeatedly in (5.5) and take the limit ϵ → 0, which gives
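\[
\begin{aligned}
\langle q_f, t_f|q_i, t_i\rangle &= \lim_{\epsilon \to 0} \prod_{m=1}^{N-1} \int dq_m \prod_{n=0}^{N-1} \int \frac{dp_n}{2\pi}\, \exp\left[i\epsilon \sum_{\ell=0}^{N-1}\left(\sum_a p_{\ell,a} \frac{q^a_{\ell+1} - q^a_\ell}{\epsilon} - H(q_{\ell+1}, p_\ell)\right)\right] \\
&:= \int \mathcal{D}q\big|_{q_i}^{q_f} \int \mathcal{D}p\, \exp\left[i \int_{t_i}^{t_f} dt \left(\sum_a p_a(t)\dot q^a(t) - H(q(t), p(t))\right)\right].
\end{aligned} \tag{5.8}
\]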

Here we have defined $q_0 = q_i$ and $q_N = q_f$, and $\int \mathcal{D}q|_{q_i}^{q_f}$ indicates a functional integral over paths $q^a(t)$
obeying $q^a(t_i) = q_i^a$ and $q^a(t_f) = q_f^a$. $\int \mathcal{D}p$ indicates a functional integral over paths $p_a(t)$ in momentum
space with no restrictions at $t_i$ and $t_f$. Equation (5.8) is called a Hamiltonian path integral expression
for the propagator. The quantity appearing in the exponent is essentially i times the action, i.e. the time
integral of the Lagrangian, except that p is treated as an independent variable instead of being related to q and q̇.

In quantum field theory we are particularly interested in expectation values of products of Heisenberg
operators, and these also have a useful path integral representation. Indeed we can consider the quantity
 
\[
\langle q_f, t_f|O_M\big(Q(\bar t_M), P(\bar t_M)\big) \ldots O_1\big(Q(\bar t_1), P(\bar t_1)\big)|q_i, t_i\rangle, \tag{5.9}
\]
where I’ve put a line over the times of the Heisenberg operators to distinguish them from the timesteps
appearing in the path integral discretization. We will assume that the operators are time-ordered, meaning
that
\[
\bar t_1 \leq \bar t_2 \leq \ldots \leq \bar t_M, \tag{5.10}
\]
and we will also take these operators to be ordered so that all canonical momenta appear to the left (note
that this is the opposite of the ordering we chose for the Hamiltonian). We can evaluate this quantity by
inserting complete sets of states as before, except now we occasionally need to evaluate
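\[
\begin{aligned}
\langle q', t + \epsilon|O(Q(t), P(t))|q, t\rangle &= \int \frac{dp}{2\pi}\, \langle q', t|e^{-i\epsilon H(Q(t), P(t))}|p, t\rangle \langle p, t|O(Q(t), P(t))|q, t\rangle \\
&\approx \int \frac{dp}{2\pi}\, \exp\left[i\epsilon\left(\sum_a p_a \frac{q'^a - q^a}{\epsilon} - H(q', p)\right)\right] O(q(t), p(t)).
\end{aligned} \tag{5.11}
\]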

Thus we see that the only effect of time-ordered operator insertions is to insert these operators evaluated as
functions of q and p into the path integral:
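\[
\begin{aligned}
\langle q_f, t_f|T O_1(Q(\bar t_1), P(\bar t_1)) \ldots O_M(Q(\bar t_M), P(\bar t_M))|q_i, t_i\rangle = {} & \int \mathcal{D}q\big|_{q_i}^{q_f} \int \mathcal{D}p\; O_1(q(\bar t_1), p(\bar t_1)) \ldots O_M(q(\bar t_M), p(\bar t_M)) \\
& \times \exp\left[i \int_{t_i}^{t_f} dt \left(\sum_a p_a(t)\dot q^a(t) - H(q(t), p(t))\right)\right].
\end{aligned} \tag{5.12}
\]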

Here we have used the time-ordering symbol T on the left-hand side to ensure that operators are time-ordered,
so we no longer need to impose (5.10).

5.2 Ground state preparation and the iϵ prescription


In quantum field theory the canonical coordinates Qa (t) become Heisenberg fields, and it isn’t so useful
to consider expectation values in eigenstates of these fields. What we really want are vacuum expectation
values, so for the path integral formulation to be useful in quantum field theory we need a path integral way
to prepare the ground state. Fortunately there is a fairly simple way of doing this. Let’s first recall that the
eigenstates |q, t⟩ of Q obey
|q, t⟩ = eiHt |q, 0⟩. (5.13)
As with any state in the Hilbert space, we can expand |q, 0⟩ in terms of energy eigenstates:
\[
|q, 0\rangle = \sum_i C_i(q)|i\rangle, \tag{5.14}
\]

with H|i⟩ = Ei |i⟩. The idea is then to give t a small imaginary part via

t = e−iϵ τ, (5.15)

with τ real and 0 < ϵ ≪ 1, and then take τ to be large and negative. Working to leading order in ϵ we then
have
\[
|q, e^{-i\epsilon}\tau\rangle \approx |q, (1 - i\epsilon)\tau\rangle = e^{(i+\epsilon)\tau H}|q, 0\rangle = \sum_i C_i(q)\, e^{(i+\epsilon)E_i\tau}|i\rangle, \tag{5.16}
\]
so if we take τ to −∞ this gives us a state which is proportional to the ground state (which we renormalize
to have zero energy):
\[
|q_i, -(1 - i\epsilon)\infty\rangle = C_0(q_i)|\Omega\rangle. \tag{5.17}
\]

Figure 9: The iϵ prescription for computing correlation functions in quantum field theory. Here $\bar t_1, \bar t_2, \ldots$ are
the locations of the operators and the time contour is shown in red. In practice it simplifies calculations if
we also analytically continue the operator times as $\bar t_m = e^{-i\epsilon}\bar\tau_m$, as then we can straighten the contour to
the dashed one.
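The projection in (5.16) and (5.17) can be illustrated numerically. The following is a minimal sketch, assuming a made-up four-level spectrum with $E_0 = 0$ and arbitrary coefficients $C_i(q)$ (the names `energies`, `coeffs` and `projected_weights` are hypothetical, not from the lecture): it computes the normalized weights $|C_i e^{(i+\epsilon)E_i\tau}|^2$ and shows that the ground state dominates as τ becomes large and negative.

```python
import cmath

def projected_weights(energies, coeffs, tau, eps):
    """Normalized weights |C_i exp((i + eps) E_i tau)|^2 of each energy eigenstate."""
    amps = [c * cmath.exp((1j + eps) * e * tau) for e, c in zip(energies, coeffs)]
    norm = sum(abs(a) ** 2 for a in amps)
    return [abs(a) ** 2 / norm for a in amps]

# Hypothetical spectrum (ground state energy set to zero) and coefficients C_i(q).
energies = [0.0, 1.0, 2.5, 4.0]
coeffs = [0.3, 0.8, -0.4, 0.5j]
eps = 0.05

for tau in (0.0, -20.0, -200.0):
    weights = projected_weights(energies, coeffs, tau, eps)
    print(tau, [round(w, 6) for w in weights])

final_weights = projected_weights(energies, coeffs, -200.0, eps)
```

The phases $e^{iE_i\tau}$ drop out of the weights; only the damping factor $e^{\epsilon E_i \tau}$ matters, which is why any state with $C_0(q) \neq 0$ ends up proportional to the ground state.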
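Therefore we can write a (Hamiltonian) path integral expression for the ground state wave function:
\[
\langle q_f, 0|\Omega\rangle = \frac{1}{C_0(q_i)} \int \mathcal{D}q\big|_{q_i}^{q_f} \int \mathcal{D}p\, \exp\left[i \int_{-(1-i\epsilon)\infty}^{0} dt \left(\sum_a p_a(t)\dot q^a(t) - H(q(t), p(t))\right)\right]. \tag{5.18}
\]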

We can also use (5.17) to give a path integral expression for the time-ordered correlation functions:
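\[
\begin{aligned}
\langle\Omega|T O_1\big(Q(\bar t_1), P(\bar t_1)\big) \ldots O_M\big(Q(\bar t_M), P(\bar t_M)\big)|\Omega\rangle = {} & \frac{1}{|C_0(0)|^2} \int \mathcal{D}q\big|_0^0 \int \mathcal{D}p\; O_1(q(\bar t_1), p(\bar t_1)) \ldots O_M(q(\bar t_M), p(\bar t_M)) \\
& \times \exp\left[i \int_{-(1-i\epsilon)\infty}^{(1-i\epsilon)\infty} dt \left(\sum_a p_a(t)\dot q^a(t) - H(q(t), p(t))\right)\right],
\end{aligned} \tag{5.19}
\]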
where for convenience we have arbitrarily taken qia = qfa = 0. The contour for the t integral is shown in figure
9. This contour prescription is the path integral version of the iϵ prescription, and we will soon see that it
gives rise to the same iϵ prescription in the Feynman propagator that we found from the canonical approach
a few lectures ago. You may worry that this formula still requires us to know |C0 (0)|, but by removing the
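operator insertions we can also use it to give us a path integral formula for this,
\[
|C_0(0)|^2 = \int \mathcal{D}q\big|_0^0 \int \mathcal{D}p\, \exp\left[i \int_{-(1-i\epsilon)\infty}^{(1-i\epsilon)\infty} dt \left(\sum_a p_a(t)\dot q^a(t) - H(q(t), p(t))\right)\right], \tag{5.20}
\]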

so the correlation function is really a ratio of two path integrals. This is convenient because ambiguities in
the normalization of the path integral measure cancel between the numerator and denominator.

5.3 An aside on Gaussian integrals


To proceed further, we now need to remember (or learn) a few things about Gaussian integrals. You hopefully
haven’t made it this far in your education without knowing that
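\[
\int_{-\infty}^{\infty} dx\, e^{-\frac{x^2}{2}} = \sqrt{2\pi}, \tag{5.21}
\]
but just in case, the proof is to look at the square of this integral and change to polar coordinates:
\[
\left(\int_{-\infty}^{\infty} dx\, e^{-\frac{x^2}{2}}\right)^2 = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{-\frac{x^2 + y^2}{2}} = 2\pi \int_0^{\infty} dr\, r\, e^{-\frac{r^2}{2}} = 2\pi \int_0^{\infty} dr\, \frac{d}{dr}\left(-e^{-\frac{r^2}{2}}\right) = 2\pi. \tag{5.22}
\]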
Once we have this basic result we can derive others, for example for any A > 0 and any complex B we have
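\[
\int_{-\infty}^{\infty} dx\, e^{-\frac{A}{2}x^2 + Bx} = \int_{-\infty}^{\infty} dx\, e^{-\frac{A}{2}\left(x - \frac{B}{A}\right)^2 + \frac{B^2}{2A}} = \frac{1}{\sqrt{A}}\, e^{\frac{B^2}{2A}} \int_{-\infty}^{\infty} dz\, e^{-\frac{z^2}{2}} = \sqrt{\frac{2\pi}{A}}\, e^{\frac{B^2}{2A}}. \tag{5.23}
\]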
By differentiating this expression with respect to B we can compute all the moments of the Gaussian
distribution, for example
\[
\sqrt{\frac{A}{2\pi}} \int_{-\infty}^{\infty} dx\, x^2 e^{-\frac{A}{2}x^2} = \frac{d^2}{dB^2}\, e^{\frac{B^2}{2A}}\bigg|_{B=0} = \frac{1}{A}. \tag{5.24}
\]
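As a quick sanity check of (5.23) and (5.24), the integrals can be evaluated by direct quadrature. This is only an illustrative sketch: the values of `A` and `B`, the helper `integrate`, and the cutoff of the integration range are assumptions chosen for the example, and B is taken real for simplicity.

```python
import math

def integrate(f, lo, hi, n):
    """Composite trapezoid rule with n equal subintervals."""
    h = (hi - lo) / n
    total = 0.5 * (f(lo) + f(hi))
    for k in range(1, n):
        total += f(lo + k * h)
    return total * h

A, B = 2.0, 0.7  # hypothetical values with A > 0 and real B

# Equation (5.23): Gaussian integral with a linear source term.
lhs_523 = integrate(lambda x: math.exp(-0.5 * A * x * x + B * x), -20.0, 20.0, 40000)
rhs_523 = math.sqrt(2 * math.pi / A) * math.exp(B * B / (2 * A))

# Equation (5.24): normalized second moment.
lhs_524 = math.sqrt(A / (2 * math.pi)) * integrate(
    lambda x: x * x * math.exp(-0.5 * A * x * x), -20.0, 20.0, 40000)
rhs_524 = 1.0 / A

print(lhs_523, rhs_523)
print(lhs_524, rhs_524)
```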
We can also consider multiple integrals: given a symmetric matrix A which we will at first assume to be real
and positive, and a complex vector B, we have the integral
\[
Z[A, B] := \int dx\, e^{-\frac{1}{2}x^T A x + B^T x} \tag{5.25}
\]

where x is a real vector. We can diagonalize A as A = OT DO, where O is orthogonal and D is diagonal
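with positive elements $d_1, d_2, \ldots$. We can then change variables to $\tilde x = Ox$, giving
\[
\begin{aligned}
Z[A, B] &= \int d\tilde x\, e^{-\frac{1}{2}\tilde x^T D \tilde x + (OB)^T \tilde x} \\
&= \prod_i \int d\tilde x_i\, e^{-\frac{1}{2}\tilde x_i d_i \tilde x_i + \sum_j O_{ij} B_j \tilde x_i} \\
&= \prod_i \left(\sqrt{\frac{2\pi}{d_i}}\, e^{\frac{\left(\sum_j O_{ij} B_j\right)^2}{2 d_i}}\right) \\
&= \frac{1}{\sqrt{\mathrm{Det}\left(\frac{A}{2\pi}\right)}}\, e^{\frac{1}{2}B^T A^{-1} B}.
\end{aligned} \tag{5.26}
\]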

There is an easy way to remember this result: up to a determinant factor, we can evaluate a Gaussian integral
by evaluating its integrand on the value of x for which its exponent is stationary. Indeed the exponent in
(5.25) has a stationary point at
x = A−1 B, (5.27)
and we then have
\[
-\frac{1}{2}x^T A x + B^T x = \frac{1}{2}B^T A^{-1} B. \tag{5.28}
\]
We can also use this result to compute correlation functions:
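\[
\frac{\int dx\, x_{i_1} \ldots x_{i_n}\, e^{-\frac{1}{2}x^T A x}}{\int dx\, e^{-\frac{1}{2}x^T A x}} = \frac{\partial}{\partial B_{i_1}} \ldots \frac{\partial}{\partial B_{i_n}}\, e^{\frac{1}{2}B^T A^{-1} B}\bigg|_{B=0}, \tag{5.29}
\]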

in particular the two-point function is given by


\[
\frac{\int dx\, x_i x_j\, e^{-\frac{1}{2}x^T A x}}{\int dx\, e^{-\frac{1}{2}x^T A x}} = (A^{-1})_{ij}. \tag{5.30}
\]
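The two-point formula (5.30) is easy to test in the smallest nontrivial case. The sketch below assumes a particular real, symmetric, positive 2×2 matrix `A` (chosen arbitrarily for illustration) and evaluates the ratio of integrals on a uniform grid, comparing against the inverse matrix computed by hand.

```python
import math

# Hypothetical real, symmetric, positive-definite 2x2 matrix.
A = [[2.0, 0.6], [0.6, 1.0]]
det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
A_inv = [[A[1][1] / det, -A[0][1] / det],
         [-A[1][0] / det, A[0][0] / det]]

def two_point_matrix(L=8.0, n=321):
    """Ratio of integrals in (5.30) for all i, j, using a uniform n x n grid on [-L, L]^2."""
    h = 2 * L / (n - 1)
    num = [[0.0, 0.0], [0.0, 0.0]]
    den = 0.0
    for a in range(n):
        x0 = -L + a * h
        for b in range(n):
            x1 = -L + b * h
            w = math.exp(-0.5 * (A[0][0] * x0 * x0 + 2 * A[0][1] * x0 * x1
                                 + A[1][1] * x1 * x1))
            x = (x0, x1)
            for i in range(2):
                for j in range(2):
                    num[i][j] += x[i] * x[j] * w
            den += w
    return [[num[i][j] / den for j in range(2)] for i in range(2)]

G = two_point_matrix()
for i in range(2):
    print(G[i], A_inv[i])
```

The grid spacing and box size cancel between numerator and denominator, which mirrors the way normalization ambiguities cancel in ratios of path integrals.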

In quantum mechanics we are not only interested in the situation where A is real and positive. We can
extend our result (5.26) to more general A by analytic continuation; a minimal condition for the convergence
of the integral (5.25) is that A has positive real part, meaning that A + A† is positive, and (5.26) will apply
for any such matrix provided that we are careful to define the sign of the square root by analytic continuation
from real positive A.

5.4 Lagrangian path integral in quantum mechanics
So far the path integrals we have discussed have independent integrals over the trajectories q(t) and p(t).
These manifestly rely on the Hamiltonian formalism, and thus are not manifestly covariant in relativistic
theories. To get covariant expressions we need to get rid of p(t). The best way to do this is to integrate
it out, meaning to simply evaluate the functional integral over p(t). In many theories of physical interest,
including in particular the standard model of particle physics and also general relativity, the Hamiltonian
is a quadratic function of the canonical momenta. The functional integral over p(t) is therefore a Gaussian
integral, and we can thus evaluate it using the methods of the previous subsection. Indeed the stationarity
condition is simply Hamilton’s equation
\[
\dot q^a = \frac{\partial H}{\partial p_a}, \tag{5.31}
\]
so evaluating the Gaussian integral over p(t) has precisely the effect of converting the exponent in the path
integral into the Lagrangian! More explicitly, considering expectation values of operators that depend only
on Q (and not P ) we have
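\[
\langle q_f, t_f|T O_1(Q(\bar t_1)) \ldots O_M(Q(\bar t_M))|q_i, t_i\rangle = \int \frac{\mathcal{D}q\big|_{q_i}^{q_f}}{\sqrt{\mathrm{Det}(2\pi A[q])}}\; O_1(q(\bar t_1)) \ldots O_M(q(\bar t_M))\, e^{i\int_{t_i}^{t_f} dt\, L(q(t), \dot q(t))}. \tag{5.32}
\]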
Here A[q] is the “matrix” appearing in the term in the Hamiltonian which is quadratic in P , as in equation
(5.25). In simple theories (such as the harmonic oscillator or the standard model of particle physics) A is
independent of q, in which case the determinant factor is a field-independent constant and can be absorbed
into a rescaling of the measure.37 Equation (5.32) is called the Lagrangian path integral, and unlike the
Hamiltonian path integral it manifestly has (up to possible regularization issues) all the symmetries of the
classical Lagrangian L. Using the iϵ prescription we can also give a Lagrangian path integral expression for
time-ordered correlation functions:
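\[
\langle\Omega|T O_1(Q(\bar t_1)) \ldots O_M(Q(\bar t_M))|\Omega\rangle = \frac{\int \frac{\mathcal{D}q|_0^0}{\sqrt{\mathrm{Det}(2\pi A[q])}}\; O_1(q(\bar t_1)) \ldots O_M(q(\bar t_M))\, e^{i\int_{-\infty(1-i\epsilon)}^{\infty(1-i\epsilon)} dt\, L(q(t), \dot q(t))}}{\int \frac{\mathcal{D}q|_0^0}{\sqrt{\mathrm{Det}(2\pi A[q])}}\; e^{i\int_{-\infty(1-i\epsilon)}^{\infty(1-i\epsilon)} dt\, L(q(t), \dot q(t))}}. \tag{5.34}
\]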

This expression, together with its Euclidean continuation we will introduce soon, is the starting point for
many (most?) standard calculations in quantum field theory.
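To see the step of integrating out p concretely, consider a single time step of the Hamiltonian path integral for $H = p^2/2m + V(q)$. The sketch below is an illustration under stated assumptions, not part of the derivation above: the time step is rotated as $\epsilon \to dt\,(1 - i\eta)$ so that the integral converges, and the values of `dq`, `dt`, `m`, `eta` are arbitrary. It compares a direct quadrature of the p integral with the closed form from (5.23), whose exponent is $i\,dt\,\frac{m}{2}(dq/dt)^2$ up to the rotation, i.e. the kinetic term of the Lagrangian.

```python
import cmath
import math

def p_integral(dq, dt, m, eta, p_max=40.0, n=80000):
    """Trapezoid estimate of int dp/(2 pi) exp(i p dq - i dt (1 - i eta) p^2 / (2 m))."""
    h = 2 * p_max / n

    def f(p):
        return cmath.exp(1j * p * dq - 1j * dt * (1 - 1j * eta) * p * p / (2 * m))

    total = 0.5 * (f(-p_max) + f(p_max))
    for k in range(1, n):
        total += f(-p_max + k * h)
    return total * h / (2 * math.pi)

def closed_form(dq, dt, m, eta):
    """Equation (5.23) with A = i dt (1 - i eta) / m (positive real part) and B = i dq."""
    A = 1j * dt * (1 - 1j * eta) / m
    B = 1j * dq
    # cmath.sqrt uses the principal branch, which is the continuation from real positive A.
    return cmath.sqrt(2 * math.pi / A) * cmath.exp(B * B / (2 * A)) / (2 * math.pi)

dq, dt, m, eta = 0.3, 0.5, 1.0, 0.5  # hypothetical values for illustration
numeric = p_integral(dq, dt, m, eta)
exact = closed_form(dq, dt, m, eta)
print(numeric, exact)
```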
The restriction to operator insertions that don’t depend on P is not so serious, as we can differentiate
both sides of equation (5.32) with respect to the operator times t1 , t2 , . . . to get path integral expressions for
correlation functions involving time derivatives of q. The restriction to Hamiltonians which are quadratic in
P is more concerning. In general the best that can be said is that by integrating out p we will always get some
local Lagrangian which has whatever symmetries the theory has, but it won’t in general be the Legendre
transform of the Hamiltonian we started with. On the other hand in quantum field theory we usually end
up writing down the most general local Lagrangian that is consistent with the symmetries in question (see
our discussion of effective field theories in later lectures), and the new Lagrangian resulting from integrating
out p will differ from the one resulting from the Legendre transformation only by shifts of the values of
the parameters in this Lagrangian. By starting with the Lagrangian approach we therefore land on the
same class of theories as we did starting from the Hamiltonian approach, but now with a more complicated
relationship between the two approaches. These comments also apply to the somewhat arbitrary choices we
made for the operator ordering of H and O: other choices would just differ by shifting the coefficients of
the local terms appearing in H and O. In general shifts of this type are called renormalizations, and in
defining path integrals we always give ourselves some leeway in how to renormalize both the Hamiltonian
and the operators appearing in expectation values.

37 An example of a theory which is not “simple” in this regard is the “non-linear σ-model”, which is a theory of multiple
scalar fields $\phi^n$ with Lagrangian density
\[
\mathcal{L} = -\frac{1}{2} g_{mn}(\phi)\, \partial_\mu \phi^m \partial^\mu \phi^n - V(\phi). \tag{5.33}
\]
Here $g_{mn}(\phi)$ is a Euclidean metric on the target space of the fields $\phi^m(x)$. This theory shows up in the low-energy description
of pions in nuclear physics.

5.5 Path integral calculation of the harmonic oscillator ground state


As a first illustration of using a path integral for a practical calculation we can compute the ground state
wave function of the simple harmonic oscillator. Up to normalization this is given by
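\[
\langle x_f|\Omega\rangle \propto \int \mathcal{D}x\big|_0^{x_f}\, e^{i\int_{-(1-i\epsilon)\infty}^{0} dt\, \frac{1}{2}\left(\dot x^2 - m^2 x^2\right)}. \tag{5.35}
\]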

This integral is Gaussian, so we can evaluate it using our formula (5.26): we are supposed to find the saddle
point of the exponent and then evaluate the integrand on it. The saddle point equation is

\[
\frac{d^2 x}{dt^2} = -m^2 x, \tag{5.36}
\]
but it is more convenient to rewrite this in terms of τ = (1 + iϵ)t:
\[
\frac{d^2 x}{d\tau^2} = -m^2 (1 - 2i\epsilon)\, x. \tag{5.37}
\]
We are interested in finding the saddle point which vanishes at τ = −∞ and is equal to xf at τ = 0; this is

x(τ ) = xf eim(1−iϵ)τ . (5.38)

Evaluating the exponent of the integrand we have

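\[
i(1 - i\epsilon) \int_{-\infty}^{0} d\tau\, \frac{x_f^2}{2}\left(-m^2(1 - 2i\epsilon) - m^2\right) e^{2im(1-i\epsilon)\tau} = -\frac{i x_f^2}{2}\, \frac{2m^2(1 - i\epsilon)}{2im(1 - i\epsilon)} = -\frac{m x_f^2}{2}, \tag{5.39}
\]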

which is indeed the exponent for the correct harmonic oscillator ground state.
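The integral in (5.39) can also be checked numerically at small but finite ϵ. In the sketch below the cutoff `T`, the grid size, and the parameter values are assumptions made for the illustration. At finite ϵ the left-hand side evaluates exactly to $-(1 - i\epsilon)\, m x_f^2/2$, which reduces to the quoted $-m x_f^2/2$ once O(ϵ) corrections are dropped.

```python
import cmath

def exponent_539(x_f, m, eps, T=400.0, n=400000):
    """Trapezoid estimate of the left-hand side of (5.39), with the tau integral cut off at -T."""
    coeff = 1j * (1 - 1j * eps) * (x_f ** 2 / 2) * (-m ** 2 * (1 - 2j * eps) - m ** 2)
    h = T / n

    def f(tau):
        return cmath.exp(2j * m * (1 - 1j * eps) * tau)

    total = 0.5 * (f(-T) + f(0.0))
    for k in range(1, n):
        total += f(-T + k * h)
    return coeff * total * h

x_f, m, eps = 1.0, 1.0, 0.05  # hypothetical values; the damping needs 2*m*eps*T >> 1
value = exponent_539(x_f, m, eps)
print(value, -(1 - 1j * eps) * m * x_f ** 2 / 2)
```

The real part is negative, so the resulting wave function $e^{-m x_f^2/2}$ is normalizable, and the ϵ damping is what makes the contribution from τ → −∞ vanish.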

5.6 Path integral calculation of the Feynman propagator in field theory


We can evaluate the Feynman propagator for a free scalar field along similar lines. Taking into account the
iϵ prescription, integrating by parts we can write the exponent as
\[
\frac{i}{2} \int_{-\infty}^{\infty} d\tau \int d^{d-1}x\; \phi \left(-(1 + 2i\epsilon)\partial_\tau^2 + \nabla^2 - m^2\right) \phi. \tag{5.40}
\]

Thus the “matrix” A for this Gaussian integral is
\[
A(x_1, x_2) = -i\left(-(1 + 2i\epsilon)\partial_\tau^2 + \nabla^2 - m^2\right), \tag{5.41}
\]

which we can easily invert in momentum space:

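\[
\begin{aligned}
A^{-1}(x_1, x_2) &= \int \frac{d^d p}{(2\pi)^d}\, \frac{i\, e^{ip(x_2 - x_1)}}{(1 + 2i\epsilon)(p^0)^2 - |\vec p|^2 - m^2} \\
&= \int \frac{d^d p}{(2\pi)^d}\, \frac{-i\, e^{ip(x_2 - x_1)}}{p^2 + m^2 - i\epsilon},
\end{aligned} \tag{5.42}
\]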

where in the second line we rescaled ϵ by the positive quantity (p0 )2 . By equation (5.30) this should be equal
to the Feynman propagator GF (x1 , x2 ), and indeed it matches the expression we found a few lectures ago
using the canonical formalism.

We can also use this approach to independently derive a position space expression for the Feynman
propagator. Indeed from equation (5.41) the Feynman propagator should obey

\[
\left(-(1 + 2i\epsilon)\partial_{\tau_2}^2 + \nabla_2^2 - m^2\right) G_F(x_1, x_2) = i\delta(x_1 - x_2). \tag{5.43}
\]

By Lorentz invariance GF should really only be a function of


\[
s = \sqrt{(\vec x_2 - \vec x_1)^2 - (t_2 - t_1)^2} = \sqrt{(\vec x_2 - \vec x_1)^2 - (1 - 2i\epsilon)(\tau_2 - \tau_1)^2}, \tag{5.44}
\]

and substituting this into (5.43) we find that away from s = 0 the Feynman propagator obeys (up to terms
of order ϵ2 )
\[
G_F''(s) + \frac{d - 1}{s}\, G_F'(s) - m^2 G_F(s) = 0, \tag{5.45}
\]
which is a standard ordinary differential equation whose solutions can be expressed in terms of Bessel
functions (as you can easily check in Mathematica). The solution which goes to zero at large positive s is
\[
G_F \propto s^{-\frac{d-2}{2}}\, K_{\frac{d-2}{2}}(ms), \tag{5.46}
\]
and we can fix the coefficient of proportionality either by requiring that this obeys

(∇22 − m2 )GF (x2 , x1 ) = iδ d (x2 − x1 ) (5.47)

or else by matching to the integral (5.42) in the massless limit that ms ≪ 1 where the integral is easier to
compute. This is the same position-space two-point function we quoted in lecture five, except that now the
iϵ prescription we are using gives us the Feynman propagator instead of the two-point function.
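For readers without Mathematica at hand, the claim that (5.46) solves (5.45) can be checked with a few lines of standard-library Python. This sketch assumes real positive s and uses the integral representation $K_\nu(z) = \int_0^\infty dt\, e^{-z\cosh t}\cosh(\nu t)$ for the modified Bessel function; the function names, step sizes, and sample values of d, m, s are choices made for the illustration.

```python
import math

def bessel_k(nu, z, t_max=12.0, n=6000):
    """K_nu(z) for z > 0 from the integral of exp(-z cosh t) cosh(nu t) over t in [0, inf)."""
    h = t_max / n
    total = 0.5 * math.exp(-z)  # t = 0 endpoint; the t_max endpoint is negligibly small
    for k in range(1, n):
        t = k * h
        total += math.exp(-z * math.cosh(t)) * math.cosh(nu * t)
    return total * h

def G(s, d, m):
    """Candidate solution (5.46), with unit proportionality constant."""
    nu = (d - 2) / 2
    return s ** (-nu) * bessel_k(nu, m * s)

def ode_residual(s, d, m, h=1e-3):
    """Left-hand side of (5.45) by central differences, relative to m^2 G."""
    g0, gp, gm = G(s, d, m), G(s + h, d, m), G(s - h, d, m)
    d1 = (gp - gm) / (2 * h)
    d2 = (gp - 2 * g0 + gm) / h ** 2
    return (d2 + (d - 1) / s * d1 - m ** 2 * g0) / (m ** 2 * g0)

residual_d4 = ode_residual(1.5, 4, 1.0)
residual_d3 = ode_residual(1.5, 3, 1.0)
print(residual_d4, residual_d3)

# For d = 3 the Bessel function is elementary: G = sqrt(pi / (2 m)) * exp(-m s) / s.
yukawa = math.sqrt(math.pi / 2.0) * math.exp(-1.5) / 1.5
print(G(1.5, 3, 1.0), yukawa)
```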

5.7 Euclidean path integrals


We’ve seen that it is convenient to analytically continue t slightly into the complex plane via the iϵ prescrip-
tion t = e−iϵ τ . In fact it is an even better idea to continue all the way to ϵ = π/2, i.e. to

t = −iτ. (5.48)

The path integral on this contour is called the Euclidean path integral, and for many questions the
Euclidean path integral is the best way to think about them. Given its importance, it is worth repeating the
derivation we gave in Lorentzian signature directly in Euclidean signature. The idea is to define Euclidean
Heisenberg operators by38
O(τ ) = eτ H O(0)e−τ H , (5.49)
with eigenstates |q, −iτ ⟩ = eHτ |q, 0⟩. Proceeding as in the Lorentzian case, we can note that
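\[
\begin{aligned}
\langle q', -i(\tau + \epsilon)|O(Q(\tau), P(\tau))|q, -i\tau\rangle &= \int \frac{dp}{2\pi}\, \langle q', -i\tau|e^{-\epsilon H(Q(\tau), P(\tau))}|p, -i\tau\rangle \langle p, -i\tau|O(Q(\tau), P(\tau))|q, -i\tau\rangle \\
&\approx \int \frac{dp}{2\pi}\, \exp\left[\epsilon\left(i\sum_a p_a \frac{q'^a - q^a}{\epsilon} - H(q', p)\right)\right] O(q(\tau), p(\tau))
\end{aligned} \tag{5.50}
\]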

and therefore by inserting complete sets of states we have
\[
\begin{aligned}
\langle q_f, -i\tau_f|T O_1(Q(\bar\tau_1), P(\bar\tau_1)) \ldots O_M(Q(\bar\tau_M), P(\bar\tau_M))|q_i, -i\tau_i\rangle = {} & \int \mathcal{D}q\big|_{q_i}^{q_f} \int \mathcal{D}p\; O_1(q(\bar\tau_1), p(\bar\tau_1)) \ldots O_M(q(\bar\tau_M), p(\bar\tau_M)) \\
& \times \exp\left[\int_{\tau_i}^{\tau_f} d\tau \left(i\sum_a p_a(\tau)\dot q^a(\tau) - H(q(\tau), p(\tau))\right)\right].
\end{aligned} \tag{5.51}
\]

38 These operators are somewhat delicate mathematically due to the presence of $e^{\tau H}$, which has a very limited domain. It is
always ok to use them in time-ordered vacuum correlators however, which in the end is the only place we will use them.

Taking τi → −∞ and τf → ∞ now automatically projects onto the ground state, so no analytic continuation
is needed to convert this into a vacuum expectation value:
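\[
\begin{aligned}
\langle\Omega|T O_1(Q(\bar\tau_1), P(\bar\tau_1)) \ldots O_M(Q(\bar\tau_M), P(\bar\tau_M))|\Omega\rangle = {} & \frac{1}{|C_0(0)|^2} \int \mathcal{D}q\big|_0^0 \int \mathcal{D}p\; O_1(q(\bar\tau_1), p(\bar\tau_1)) \ldots O_M(q(\bar\tau_M), p(\bar\tau_M)) \\
& \times \exp\left[\int_{-\infty}^{\infty} d\tau \left(i\sum_a p_a(\tau)\dot q^a(\tau) - H(q(\tau), p(\tau))\right)\right].
\end{aligned} \tag{5.52}
\]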

Converting this into a Lagrangian path integral (with the same caveats as before), we end up with
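\[
\langle\Omega|T O_1(Q(\bar\tau_1)) \ldots O_M(Q(\bar\tau_M))|\Omega\rangle = \frac{\int \frac{\mathcal{D}q|_0^0}{\sqrt{\mathrm{Det}(2\pi A)}}\; O_1(q(\bar\tau_1)) \ldots O_M(q(\bar\tau_M))\, e^{-\int_{-\infty}^{\infty} d\tau\, L_E(q, \dot q)}}{\int \frac{\mathcal{D}q|_0^0}{\sqrt{\mathrm{Det}(2\pi A)}}\; e^{-\int_{-\infty}^{\infty} d\tau\, L_E(q, \dot q)}}, \tag{5.53}
\]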

where LE is the Euclidean Lagrangian defined in terms of the Lorentzian Lagrangian by


   
\[
L_E\left(q, \frac{dq}{d\tau}\right) := -L\left(q, i\frac{dq}{d\tau}\right). \tag{5.54}
\]

For example for the simple harmonic oscillator the Euclidean Lagrangian is
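\[
L_E = \frac{1}{2}\left(\dot q^2 + m^2 q^2\right), \tag{5.55}
\]
while for a free scalar field the Euclidean Lagrangian is the spatial integral of the Euclidean Lagrangian
density
\[
\mathcal{L}_E = \frac{1}{2}\left(\dot\phi^2 + \nabla\phi \cdot \nabla\phi + m^2\phi^2\right). \tag{5.56}
\]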
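The statement that long Euclidean time evolution projects onto the ground state can be seen directly for the harmonic oscillator Lagrangian (5.55). The sketch below is one possible discretization and not the construction used in the text: it builds the one-step kernel $e^{-a L_E}$ on a position grid (with the potential term split symmetrically between the two endpoints), applies it repeatedly to an arbitrary starting wave function, and reads off the width of the resulting Gaussian. The grid parameters and the helper names are assumptions.

```python
import math

def ground_state(m=1.0, a=0.1, L=5.0, n=101, steps=200):
    """Power-iterate the discretized Euclidean one-step kernel; returns (xs, psi)."""
    dx = 2 * L / (n - 1)
    xs = [-L + i * dx for i in range(n)]
    # Kernel exp(-a * L_E) with L_E = (xdot^2 + m^2 x^2) / 2 and xdot ~ (x - y) / a.
    K = [[math.exp(-(x - y) ** 2 / (2 * a) - a * m * m * (x * x + y * y) / 4)
          for y in xs] for x in xs]
    psi = [1.0] * n  # arbitrary even starting wave function
    for _ in range(steps):
        psi = [sum(K[i][j] * psi[j] for j in range(n)) for i in range(n)]
        top = max(psi)
        psi = [p / top for p in psi]  # renormalize to avoid underflow
    return xs, psi

xs, psi = ground_state()
i0 = len(xs) // 2   # grid point at x = 0
i1 = i0 + 10        # grid point at x = 1 for the default grid
# If psi ~ exp(-alpha x^2 / 2), this ratio isolates alpha; the continuum answer is alpha = m.
alpha = -2.0 * math.log(psi[i1] / psi[i0]) / xs[i1] ** 2
print(alpha)
```

At finite step a the fitted width differs from m by a small discretization correction of relative order $a^2 m^2$, which disappears in the continuum limit.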
There are a few essential points to make about the Euclidean path integral:
• Mathematically it is much better behaved than the Lorentzian path integral. The Euclidean action
$S_E = \int_{-\infty}^{\infty} d\tau\, L_E$ is often real and bounded from below, as you can see from the harmonic oscillator
and the free scalar, so the integrand $e^{-S_E}$ exponentially suppresses field configurations which aren’t
near ϕ = 0. This makes it possible to give it a mathematically rigorous formulation (at least in the
case of a finite number of degrees of freedom); look up “Wiener measure” if you want to learn about
it.
• In situations where $S_E$ is real and bounded from below we can interpret the Euclidean path integral
(5.53) as computing expectation values in a classical probability distribution. Many famous classical
statistical systems arise in this way, for example the Euclidean path integral for a free particle is the
classical theory of Brownian motion and the Euclidean path integral for a free scalar field with d = 2
is the classical theory of random surfaces. The critical point in the phase diagram of water is also
described by an (interacting) Euclidean scalar field theory, as are the fluctuations of magnets at the
Curie temperature. Euclidean path integrals also arise in quantitative finance: the prices of options as
a function of time are fluctuating variables which can be characterized by a Euclidean path integral.
• In situations where the Euclidean path integral has a probabilistic interpretation it is amenable to
explicit numerical evaluation. The standard approach to this is called the Monte Carlo method, which
samples configurations from the probability distribution and estimates the expectation value by
averaging over these typical samples. This is a very powerful method for evaluating high-dimensional
integrals. For example in QCD, the theory of the strong nuclear force, my colleagues here in the
Center for Theoretical Physics use this method to compute the masses of hadrons such as the proton
and neutron to quite good accuracy. The computational resources involved are somewhat terrifying,
for example in a recent calculation my colleague Will Detmold used the fastest publicly-available
supercomputer in the world, Frontier at Oak Ridge National Laboratory, to evaluate the Euclidean
path integral of QCD on a Euclidean spacetime lattice with 72 × 72 × 72 × 192 sites, consuming of
order $10^{11}$ Joules of energy in the process.
• In relativistic theories something particularly nice happens: if we have $SO^+(d-1, 1)$ symmetry in
Lorentzian signature then we have SO(d) rotational symmetry in Euclidean signature. This Euclidean
rotation invariance is at the heart of many famous results in quantum field theory, as we will see in
the next lecture.
• We can also use the Euclidean path integral to compute Lorentzian correlation functions: to compute a
time-ordered correlator of operators $O_1(\bar t_1), O_2(\bar t_2), \ldots$, we simply compute their Euclidean correlation
function as a function of $\bar\tau_1, \bar\tau_2, \ldots$ and then analytically continue the time of each operator as
$\bar\tau = i(1 - i\epsilon)\bar t$. This analytic continuation is called Wick rotation; essentially we are approaching the
iϵ contour shown in figure 9 from the Euclidean contour instead of the Lorentzian one. You will
check in the homework that this continuation again gives the correct iϵ prescription for the Feynman
propagator.
• Euclidean path integrals arise naturally in the context of quantum statistical mechanics. For example
to compute the partition function $Z(\beta) = \mathrm{Tr}\left(e^{-\beta H}\right)$, we evaluate the Euclidean path integral with
periodic boundary conditions in time, with periodicity β.
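The Monte Carlo idea mentioned in the list above can be tried on a toy scale. The following is a minimal sketch, assuming a periodic time lattice with spacing a and the discretized harmonic oscillator action $S_E = \sum_i \left[\frac{(x_{i+1} - x_i)^2}{2a} + \frac{a m^2 x_i^2}{2}\right]$; the Metropolis update, the parameter values, and the function names are illustrative choices and have nothing to do with the production lattice QCD codes referred to in the text. The estimate of ⟨x²⟩ is compared with the exact value for the same lattice action, which is close to the continuum ground state value 1/(2m).

```python
import math
import random

def lattice_x2_exact(m, a, n):
    """Exact <x^2> for the periodic lattice action, from its momentum-space propagator."""
    return sum(1.0 / (a * (m * m + (4.0 / a ** 2) * math.sin(math.pi * k / n) ** 2))
               for k in range(n)) / n

def metropolis_x2(m=1.0, a=0.5, n=50, sweeps=6000, burn=500, step=1.0, seed=0):
    """Metropolis estimate of <x^2> for the Euclidean harmonic oscillator on n sites."""
    rng = random.Random(seed)
    x = [0.0] * n
    total, count = 0.0, 0
    for sweep in range(sweeps):
        for i in range(n):
            left, right = x[i - 1], x[(i + 1) % n]  # periodic neighbours
            old = x[i]
            new = old + rng.uniform(-step, step)
            s_old = ((old - left) ** 2 + (right - old) ** 2) / (2 * a) + a * m * m * old * old / 2
            s_new = ((new - left) ** 2 + (right - new) ** 2) / (2 * a) + a * m * m * new * new / 2
            d_s = s_new - s_old
            # Accept with probability min(1, exp(-dS)), so configurations are sampled ~ exp(-S_E).
            if d_s <= 0 or rng.random() < math.exp(-d_s):
                x[i] = new
        if sweep >= burn:
            total += sum(v * v for v in x) / n
            count += 1
    return total / count

estimate = metropolis_x2()
exact = lattice_x2_exact(1.0, 0.5, 50)
print(estimate, exact)
```

Because the weight $e^{-S_E}$ is real and positive, it can be treated as a probability distribution; the Lorentzian weight $e^{iS}$ has no such interpretation, which is why this method is tied to the Euclidean formulation.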

5.8 Homework
1. Rewrite the operator P QP Q as a sum of operators with all P to the right and all Q to the left.
2. Use the path integral to find the propagator $\langle q', t'|q, t\rangle$ of a free quantum particle moving on a line
with Hamiltonian $H = \frac{P^2}{2m}$. Hint: use the discretized version of the Lagrangian path integral.

3. Use the path integral to find the propagator for the simple harmonic oscillator, with Hamiltonian
$H = \frac{P^2}{2m} + \frac{k}{2}Q^2$. Hint: you should expand the function q(t) you are integrating over as a classical
solution qcl plus a fluctuating piece δq, and then expand δq in Fourier modes and integrate over the
coefficients of these modes. I recommend first doing the calculation neglecting any prefactors which are
independent of k: you can find the k-independent prefactor at the end by comparing to your answer
for the previous problem in the limit k → 0.

4. Use the Lorentzian path integral with an iϵ prescription to find the Feynman propagator of a free
massive complex scalar field (remember that this is the time-ordered two-point function of Φ and Φ† ).
5. Use the Euclidean path integral followed by a Wick rotation to compute the Feynman propagator of a
real free scalar field with mass m. Hint: you should find that the Euclidean Feynman propagator is a
Green’s function for the Euclidean Klein-Gordon operator, obeying $(\nabla_x^2 - m^2)G_F(x, y) = -\delta^d(x - y)$.
It is ok to leave your expression for it in terms of a spacetime momentum integral, but you should
make sure that after Wick rotation you get the right iϵ prescription for the Feynman propagator in
Lorentzian signature.

6 CRT , spin-statistics, and all that
We’ve now developed two powerful formalisms for thinking about quantum field theory: the operator ap-
proach based on algebras acting on Hilbert spaces and the path integral approach, both in Lorentzian and
Euclidean signature. In this lecture we will put the pieces together to prove some of the famous results
in relativistic quantum field theory: the CRT theorem, the relation between spin and statistics, and the
thermal nature of vacuum entanglement (the Unruh effect). All of these results are true non-perturbatively
in any relativistic quantum field theory, as the arguments will hopefully make clear. The title of this lecture
is shamelessly adapted from a famous book by Streater and Wightman, which discusses the first two of these
from a rigorous (but somewhat out-dated) approach.

6.1 The CRT theorem


Let’s first recall our Euclidean path integral expression for correlation functions in quantum field theory:

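\[
\langle\Omega|T O_1[\Phi] \ldots O_M[\Phi]|\Omega\rangle = \frac{\int \mathcal{D}\phi\; O_1[\phi] \ldots O_M[\phi]\, e^{-S_E[\phi]}}{\int \mathcal{D}\phi\; e^{-S_E[\phi]}} \tag{6.1}
\]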

Here I have switched from the particle notation we used in the last lecture to field notation, and also absorbed
the determinant factor coming from integrating out the momenta into the measure Dϕ. In any relativistic
quantum field theory this path integral is invariant under Euclidean rotation symmetry, in the sense that if
FΛ is a transformation of field space which implements a Euclidean rotation Λ ∈ SO(d), i.e.

FΛ ϕa (x) = DE (Λ)ab ϕb (Λ−1 x) (6.2)

on the dynamical fields, then the combination of the path integral measure and action are invariant:

D(FΛ ϕ)e−SE [FΛ ϕ] = Dϕe−SE [ϕ] . (6.3)

The invariance of the action is the classical statement of having a symmetry, while the invariance of the
measure reflects the statement that the regularization of the theory implicit in the path integral does not
destroy the symmetry (much later we will see examples of situations where this happens). Using this
invariance we can derive a constraint on correlation functions:
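\[
\begin{aligned}
\langle\Omega|T O_1[\Phi] \ldots O_M[\Phi]|\Omega\rangle &= \frac{\int \mathcal{D}\phi\; O_1[\phi] \ldots O_M[\phi]\, e^{-S_E[\phi]}}{\int \mathcal{D}\phi\; e^{-S_E[\phi]}} \\
&= \frac{\int \mathcal{D}(F_\Lambda\phi)\; O_1[F_\Lambda\phi] \ldots O_M[F_\Lambda\phi]\, e^{-S_E[F_\Lambda\phi]}}{\int \mathcal{D}\phi\; e^{-S_E[\phi]}} \\
&= \frac{\int \mathcal{D}\phi\; O_1[F_\Lambda\phi] \ldots O_M[F_\Lambda\phi]\, e^{-S_E[\phi]}}{\int \mathcal{D}\phi\; e^{-S_E[\phi]}} \\
&= \langle\Omega|T O_1[F_\Lambda\Phi] \ldots O_M[F_\Lambda\Phi]|\Omega\rangle.
\end{aligned} \tag{6.4}
\]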

In going from the first to the second line here we changed variables in the path integral, in going from the
second to the third we used the symmetry condition (6.3), and in going from the third to the fourth we used
(6.1).
To prove the CRT theorem we are interested in the Euclidean rotation Λ = RT , which acts as39

\[
RT: (\tau, x^1, x^2, \ldots, x^{d-1}) \mapsto (-\tau, -x^1, x^2, \ldots, x^{d-1}) \tag{6.5}
\]


39 When d is even we can combine RT with spatial rotations to define an operation PT which simply acts as PT : x 7→ −x.

This then leads to a symmetry called CPT , which is a symmetry of any relativistic field theory when d is even. Historically the
theorem discussed in this section has thus usually been called the CPT theorem, especially by particle physicists who only care
about the case of d = 4, while the terminology CRT is of more recent origin. We have focused on CRT nonetheless because 1)
it is the thing which works in any spacetime dimension and 2) it is what naturally arises from the proof of the theorem.

I emphasize that RT is indeed an element of SO(d): it is a rotation by π in the plane of τ and x1 . This
transformation reverses the direction of Euclidean time, so it also reverses the order of the operators in the
Euclidean correlation function. To be more concrete, if O1 lives at time τ1 , O2 at time τ2 , etc, and for
simplicity we assume that τ1 ≤ τ2 ≤ . . ., then the Euclidean statement of this symmetry is that
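\begin{align}
\langle\Omega|O_M[\Phi]\ldots O_1[\Phi]|\Omega\rangle
&= (-1)^{\frac{f(f-1)}{2}}\langle\Omega|O_1[F_{RT}\Phi]\ldots O_M[F_{RT}\Phi]|\Omega\rangle \nonumber\\
&= (-1)^{f/2}\langle\Omega|O_1[F_{RT}\Phi]\ldots O_M[F_{RT}\Phi]|\Omega\rangle, \tag{6.6}
\end{align}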

where
\[ f = f_{O_1} + \ldots + f_{O_M} \tag{6.7} \]
is the total number of fermionic operators appearing in O1 . . . OM . This pesky minus sign arises from
something we haven’t discussed yet, which is that when you time-order fermionic operators the process is
antisymmetric instead of symmetric. We’ll see this in more detail when we discuss the fermionic path integral
in a month or so, but for now the basic idea is that since fermionic fields anticommute instead of commute
at spacelike separation it must be that the degrees of freedom which represent them in the path integral are
also anticommuting. The second line of (6.6) follows from the first because correlation functions involving
fermions vanish unless the total number of fermions is even, which is a consequence of the fact that the
Lagrangian density is always bosonic (this is called fermion parity symmetry).
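The step from the first to the second line of (6.6) can be spelled out as a short parity check. This is only a sketch, and it assumes nothing beyond f being even (written here as f = 2k, with k a label introduced for illustration), which is what fermion parity guarantees for nonvanishing correlators:

```latex
\frac{f(f-1)}{2} - \frac{f}{2} = \frac{f(f-2)}{2} = 2k(k-1) \quad (f = 2k)
\qquad\Longrightarrow\qquad
(-1)^{\frac{f(f-1)}{2}} = (-1)^{f/2}.
```

Since 2k(k−1) is even, the two exponents agree mod 2 and the two signs coincide.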
The CRT theorem is what we get when we analytically continue (6.6) to Lorentzian signature. We can
formalize the analytic continuation by introducing a Wick rotation operation W , whose action on dynamical
fields is defined to perform the analytic continuation τ = it. On Euclidean scalar fields we have

W Φ(t, ⃗x) = Φ(it, ⃗x), (6.8)

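while for tensor fields each raised τ index gets a factor of i and each lowered τ index gets a factor of −i. So
for example a vector field $V^\mu$ has
\[ W\begin{pmatrix} V^0(t,\vec{x}) \\ V^j(t,\vec{x}) \end{pmatrix} = \begin{pmatrix} iV^0(it,\vec{x}) \\ V^j(it,\vec{x}) \end{pmatrix} \tag{6.9} \]
while a one-form field $\omega_\mu$ has
\[ W\begin{pmatrix} \omega_0(t,\vec{x}) \\ \omega_j(t,\vec{x}) \end{pmatrix} = \begin{pmatrix} -i\omega_0(it,\vec{x}) \\ \omega_j(it,\vec{x}) \end{pmatrix}. \tag{6.10} \]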
These factors are necessary because we’d like to preserve e.g. the expressions V = V µ ∂µ and ω = ωµ dxµ , so
we should rotate V 0 in the same way as we rotate τ and ω0 in the opposite way.40 Analytic continuation of
(6.6) thus gives

\[ \langle\Omega|WO_M[\Phi]\ldots WO_1[\Phi]|\Omega\rangle = (-1)^{(f_{O_1}+\ldots+f_{O_M})/2}\,\langle\Omega|WO_1[F_{RT}\Phi]\ldots WO_M[F_{RT}\Phi]|\Omega\rangle. \tag{6.11} \]

In order to give this a symmetry interpretation in Lorentzian signature, we can first observe that the sym-
metry must be antiunitary since it reverses time. To see what the antiunitary is, we need to first recall that
for any antiunitary operator Θ that preserves the ground state we have the constraint
\[ \langle\Omega|O_1\ldots O_M|\Omega\rangle = \langle\Omega|O_M'^{\dagger}\ldots O_1'^{\dagger}|\Omega\rangle \tag{6.12} \]

Thus we can interpret (6.11) as indicating that our Lorentzian theory has an antiunitary symmetry ΘCRT
whose action on the dynamical fields is41

\[ \Theta_{CRT}^\dagger\, W\Phi_a(x)\,\Theta_{CRT} = i^{f_a}\left(WF_{RT}\Phi_a(x)\right)^\dagger, \tag{6.13} \]


40 For future reference I’ll mention that spinor fields do not pick up any phase factors under Wick rotation, but the γ-matrix

γ 0 transforms as W γ τ = iγ t . Don’t worry about this if you don’t yet know what it means.
41 Note that this definition does not require or use independent definitions of C, R, and T . In general these are not symmetries,

and even when they are there is some freedom in how they are defined. The name CRT is thus in some sense a historical
anachronism, the whole is better-defined than its parts.

where fa = 1 if Φa is fermionic and fa = 0 if Φa is bosonic. In particular note that we need to take the
complex conjugate of the analytic continuation of the Euclidean rotation to match (6.12); this is the origin
of the “C” in CRT . For example the action of CRT on a (Lorentzian) complex scalar Φ or complex vector
V µ is42

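\begin{align}
\Theta_{CRT}^\dagger\,\Phi(x)\,\Theta_{CRT} &= \Phi(RT\,x)^\dagger \nonumber\\
\Theta_{CRT}^\dagger\,V^\mu(x)\,\Theta_{CRT} &= (RT)^\mu{}_\nu\, V^\nu(RT\,x)^\dagger. \tag{6.14}
\end{align}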

Once we have understood spinor fields we will also see that a Dirac spinor transforms as

\[ \Theta_{CRT}^\dagger\,\Psi(x)\,\Theta_{CRT} = \gamma^{0*}\gamma^{1*}\Psi^*(RT\,x). \tag{6.15} \]

In general we can write the CRT transformation in Lorentzian signature as


\[ \Theta_{CRT}^\dagger\,\Phi_a(x)\,\Theta_{CRT} = i^{f_a}\left(D_E(RT)_{ab}\,\Phi_b(RT\,x)\right)^\dagger, \tag{6.16} \]

since any factors of i and −i from the Wick rotation of any τ indices cancel between the two sides. In the
homework you will show that this equation together with the spin-statistics theorem imply that

\[ \Theta_{CRT}^2 = 1, \tag{6.17} \]

in any quantum field theory.43


The CRT theorem is quite remarkable from the point of view of the topology of the Lorentz group. In Lorentzian
signature RT lives in a component of O(d−1, 1) which is disconnected from the identity component SO+ (d−
1, 1). If we just assume a relativistic theory has SO+ (d − 1, 1) symmetry, there is no particular reason why
we should expect any version of RT to be a symmetry. In Euclidean signature however RT is in the identity
component SO(d) of O(d), and thus must be a symmetry. So far there does not seem to be any nice proof
of the CRT theorem that doesn’t involve analytic continuation away from Lorentzian signature. The only
exception is a brute-force argument, given e.g. in Weinberg, that one simply can’t make a polynomial
Lagrangian out of tensor and spinor fields that isn’t CRT invariant - the proof is just to check this for
all possible terms. Any experimental observation of CRT -violation would be a very big deal, as it would
mean that we have to give up either locality or special relativity. And at least to the extent that quantum
mechanics + special relativity implies locality, we’d really need to give up either on quantum mechanics or
relativity!44

6.2 Spin and statistics


In non-relativistic quantum mechanics we learn that each type of particle has a spin s which can take integer
or half-integer values. We also learn that each type of particle should be a boson or a fermion, meaning
that if we exchange two of them the wave function should be symmetric or antisymmetric. A priori there
does not seem to be any reason why these two should be related, and indeed in non-relativistic quantum
mechanics it is easy to write down theories of particles with arbitrary spin and statistics. On the other hand
in relativistic quantum field theory there is a very simple rule:45
42 Here we have slightly abused notation to use the same symbol RT for the Lorentzian map RT : (t, x1 , x2 , . . . , xd−1 ) →

(−t, −x1 , x2 , . . . , xd−1 ).


43 In theories with extra global symmetries people sometimes combine CRT with those symmetries to get something that

doesn’t square to one. The CRT we have constructed here however is the only unbreakable one, up to the possibility of
multiplying it by the fermion parity operator (−1)F which acts as one on all bosonic states and minus one on all fermionic
states.
44 In quantum gravity there are good reasons to think that we can have quantum mechanics and (general) relativity without

having locality, but as far as we can tell CRT continues to be a good symmetry even in quantum gravity. See my recent paper
with Numasawa for more on this.
45 The fully non-perturbative proof of the spin-statistics theorem we give here is due to Schwinger. In most quantum field

theory books the theorem is proven in a more banal way that applies only to free fields. Essentially one tries to construct free
fields for particles of various spin, and then finds that it only works if the fields commute for integer spin and anticommute for
half-integer spin.

• Spin-statistics relation: Particles with integer spin are bosons, while particles with half-integer spin
are fermions.
The idea behind this rule is quite easy to understand, but we first need to discuss the subtle fact (which
hopefully you have seen before) that a rotation by 2π acts on objects of half-integer spin as −1. For example
in the context of a spin 1/2 particle in three spatial dimensions a rotation by θ about the z axis is implemented
on the Hilbert space by
\[ U(\theta) = e^{i\theta\sigma_z/2}, \tag{6.18} \]
so $U(2\pi) = e^{i\pi\sigma_z} = -1$. Mathematically we can express this by saying that the action of rotations on a
spin 1/2 particle does not give a genuine representation of the rotation group SO(3) in the sense of a set
of unitary operators U (g) such that U (g1 )U (g2 ) = U (g1 g2 ), since a rotation by 2π is equal to nothing in
the rotation group but apparently it isn’t equal to nothing acting on a spin 1/2 particle. We will discuss
this in more detail later in the semester, but the right way to understand this is that in a relativistic theory
with half-integer spin particles the spacetime symmetry group isn’t really SO+ (d − 1, 1), but instead what is
called its double cover Spin+ (d − 1, 1). Locally Spin+ (d − 1, 1) looks just like SO+ (d − 1, 1), but globally
it is different in that each element of SO+ (d − 1, 1) corresponds to two elements of Spin+ (d − 1, 1) which
differ by a rotation by 2π. The rotation part of Spin+ (3, 1), which is the double cover of SO(3), is precisely
given by the set of matrices of the form $e^{i\vec{\theta}\cdot\vec{\sigma}/2}$, which is nothing but the matrix group SU (2). We will see
how to extend this to a double cover of the full Lorentz group later in the semester when we discuss spinors.
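The statement that a 2π rotation acts as −1 on spin 1/2 but as the identity on integer spin is easy to check numerically. The following is a minimal sketch using (6.18) for spin 1/2 and the ordinary 3×3 rotation matrix about the z axis for spin 1; the function names are illustrative and not part of the notes.

```python
import numpy as np

# Pauli matrix sigma_z, the generator appearing in (6.18).
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

def spin_half_rotation(theta):
    """U(theta) = exp(i theta sigma_z / 2).

    Uses exp(i a sigma_z) = cos(a) 1 + i sin(a) sigma_z, which holds
    because sigma_z squares to the identity.
    """
    half = theta / 2.0
    return np.cos(half) * np.eye(2) + 1j * np.sin(half) * sigma_z

def vector_rotation(theta):
    """Ordinary SO(3) rotation by theta about the z axis (spin 1)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

U_2pi = spin_half_rotation(2 * np.pi)   # acts as -1 on the spinor
U_4pi = spin_half_rotation(4 * np.pi)   # only a 4 pi rotation is trivial
R_2pi = vector_rotation(2 * np.pi)      # trivial on a vector

print(np.round(U_2pi.real, 12))
print(np.round(U_4pi.real, 12))
print(np.round(R_2pi, 12))
```

The spin 1/2 matrices therefore only close into a group once rotations by 2π and 4π are distinguished, which is the double cover described above.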
Turning now to spin and statistics, the basic ingredient we will need is to understand in more detail how
the Euclidean rotation matrix DE (RT ) acts on the fields Φa and their complex conjugates. This is a bit
tricky, so hold on tight! Let’s first recall that in Lorentzian signature we have

U (Λ)† Φa (x)U (Λ) = D(Λ)ab Φb (Λ−1 x), (6.19)

and thus
U (Λ)† Φa (x)† U (Λ) = D∗ (Λ)ab Φb (Λ−1 x)† . (6.20)
In particular when Λ is a boost of rapidity η in the $x^1$ direction we have
\[ D(\Lambda) = e^{i\eta J^{01}}, \tag{6.21} \]

where the matrix J 01 is the boost generator in the representation D of the Lorentz group. To turn a boost
into a Euclidean rotation, we want to analytically continue t = −iτ and η = −iθ such that

t′ = cosh(η)t + sinh(η)x
x′ = cosh(η)x + sinh(η)t (6.22)

become

τ ′ = cos(θ)τ + sin(θ)x
x′ = cos(θ)x − sin(θ)τ. (6.23)

We therefore have
\[ D_E(\Lambda) = e^{\theta J^{01}} \tag{6.24} \]
for a Euclidean rotation by θ in the τ, $x^1$ plane. In Euclidean signature the rotation group SO(d) is a
compact group, and the finite-dimensional representations of such groups are always unitary. We therefore
see that $J^{01}$ must be anti-hermitian. D∗ (Λ) therefore analytically continues to
\[ e^{-i\eta (J^{01})^*} = e^{i\eta (J^{01})^T} = e^{\theta (J^{01})^T} = D_E(\Lambda)^T. \tag{6.25} \]

In particular this applies to RT , which is just a Euclidean rotation by π.
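The continuation from the boost (6.22) to the rotation (6.23) can be verified numerically by feeding complex arguments into the boost formula. This is a small sketch; the helper names `boost` and `euclidean_rotation` and the sample values are chosen here for illustration.

```python
import cmath
import math

def boost(eta, t, x):
    """Lorentz boost (6.22), allowing complex rapidity and time."""
    t_new = cmath.cosh(eta) * t + cmath.sinh(eta) * x
    x_new = cmath.cosh(eta) * x + cmath.sinh(eta) * t
    return t_new, x_new

def euclidean_rotation(theta, tau, x):
    """Euclidean rotation (6.23) in the (tau, x) plane."""
    tau_new = math.cos(theta) * tau + math.sin(theta) * x
    x_new = math.cos(theta) * x - math.sin(theta) * tau
    return tau_new, x_new

theta, tau, x = 0.7, 1.3, -0.4

# Continue t = -i tau and eta = -i theta, apply the boost,
# then read off tau' from t' = -i tau', i.e. tau' = i t'.
t_new, x_boosted = boost(-1j * theta, -1j * tau, x)
tau_from_boost = 1j * t_new

tau_rot, x_rot = euclidean_rotation(theta, tau, x)
print(tau_from_boost, tau_rot)
print(x_boosted, x_rot)

# The special case theta = pi is RT: both tau and x flip sign.
tau_rt, x_rt = euclidean_rotation(math.pi, tau, x)
print(tau_rt, x_rt)
```

The imaginary parts of the continued boost vanish up to rounding, so the complexified boost really is a real rotation of (τ, x).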

We next need to understand how the hermitian conjugate of fields works in Euclidean signature. In
Lorentzian signature we have the convenient fact that we can take the hermitian conjugate before or after
time evolution and end up with the same thing:

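\[ O(t)^\dagger = (e^{iHt}Oe^{-iHt})^\dagger = e^{iHt}O^\dagger e^{-iHt} = O^\dagger(t). \tag{6.26} \]

In Euclidean signature we aren’t so lucky: the hermitian conjugate now gives

\[ O(\tau)^\dagger = (e^{H\tau}Oe^{-H\tau})^\dagger = e^{-H\tau}O^\dagger e^{H\tau} \neq e^{H\tau}O^\dagger e^{-H\tau}. \tag{6.27} \]

To deal with this it is conventional to define a Euclidean adjoint

\[ O^*(\tau) = e^{H\tau}O^\dagger e^{-H\tau} = O(-\tau)^\dagger, \tag{6.28} \]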

as this is the quantity which analytically continues to O† (t) in Lorentzian signature. Therefore from the pre-
vious paragraph, in Euclidean signature we have the somewhat counter-intuitive symmetry transformations

\begin{align}
\Phi'(x) &= D_E(RT)\,\Phi(RT\,x) \nonumber\\
\Phi^{*\prime}(x) &= D_E(RT)^T\,\Phi^*(RT\,x). \tag{6.29}
\end{align}

To proceed further, we now change our basis of fields Φa (x) to diagonalize DE (RT ). Recall that this
is a unitary matrix, and since all fields have integer or half-integer spin it must obey $D_E(RT)^4 = 1$. Its
eigenvalues are therefore ±1 on fields of integer spin and ±i on fields of half-integer spin. We may then
observe that

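\begin{align}
\langle\Omega|\Phi^*(\tau,\vec{0})\Phi(-\tau,\vec{0})|\Omega\rangle
&= (-1)^{2j_\phi}\langle\Omega|T\,\Phi^*(-\tau,\vec{0})\Phi(\tau,\vec{0})|\Omega\rangle \nonumber\\
&= (-1)^{2j_\phi+f_\phi}\langle\Omega|\Phi(\tau,\vec{0})\Phi^*(-\tau,\vec{0})|\Omega\rangle, \tag{6.30}
\end{align}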

where in the first line jϕ is the spin of Φ and we have used our Euclidean rotation rule (6.4) and also that if
Φ has integer spin both rotations contribute ±1 while if Φ has half-integer spin then they both contribute
±i. Note that (6.29) here is crucial in getting the factor of $(-1)^{2j_\phi}$, as it ensures Φ and Φ∗ contribute with
the same sign in front of i in the fermionic case. The second line then follows from the antisymmetry of the
time-ordered product for fermions, as explained below (6.6).
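The eigenvalue claim used above (±1 for integer spin, ±i for half-integer spin) can be illustrated with the two simplest representations. This is a sketch under stated assumptions: the vector case is taken in d = 4, and the spin 1/2 case is modeled by exp(iπσ/2) for a single Pauli matrix σ, standing in for the actual spinor representation of RT, which the notes only construct later.

```python
import numpy as np

# Vector (spin 1) representation in d = 4 Euclidean dimensions:
# a rotation by pi in the (tau, x^1) plane flips both coordinates.
D_vector = np.diag([-1.0, -1.0, 1.0, 1.0])

# Spin 1/2 model of a rotation by pi in one plane (an assumption):
# exp(i pi sigma / 2) = cos(pi/2) 1 + i sin(pi/2) sigma = i sigma.
sigma = np.array([[1, 0], [0, -1]], dtype=complex)
D_spinor = np.cos(np.pi / 2) * np.eye(2) + 1j * np.sin(np.pi / 2) * sigma

eig_vector = np.linalg.eigvals(D_vector)
eig_spinor = np.linalg.eigvals(D_spinor)

print("vector eigenvalues:", eig_vector)
print("spinor eigenvalues:", np.round(eig_spinor, 12))
# A pair of fields each contributing the same eigenvalue picks up its
# square, which is +1 for integer spin and -1 for half-integer spin.
print("vector squared:", np.round(eig_vector ** 2, 12))
print("spinor squared:", np.round(eig_spinor ** 2, 12))
```

The squares of the eigenvalues reproduce the factor $(-1)^{2j_\phi}$ appearing in (6.30).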
Finally we can complete the proof by showing that the correlation functions on both sides of (6.30) are
strictly positive: this implies the theorem because then we need

\[ (-1)^{2j_\phi+f_\phi} = 1, \tag{6.31} \]

which means that when jϕ is an integer we must have fϕ = 0 while when jϕ is a half-integer we must have
fϕ = 1. It is easy to show that they are positive semidefinite, as they are the squared norms of states:

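\begin{align}
\langle\Omega|\Phi^*(\tau,\vec{0})\Phi(-\tau,\vec{0})|\Omega\rangle &= \left\|\Phi(-\tau,\vec{0})|\Omega\rangle\right\|^2 \geq 0 \nonumber\\
\langle\Omega|\Phi(\tau,\vec{0})\Phi^*(-\tau,\vec{0})|\Omega\rangle &= \left\|\Phi^*(-\tau,\vec{0})|\Omega\rangle\right\|^2 \geq 0, \tag{6.32}
\end{align}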

where here we have used (6.28). We will show later in the lecture that these norms cannot vanish, so granting
that result the theorem is proved!46
It is instructive to consider how a naive version of this argument which doesn’t use special relativity can
fail. The idea of the naive argument is to do the same manipulation using a spatial rotation by π instead
of RT . We can derive the relation (6.30) just as before (except with the fields now being at ±xx̂), but the
failure mode is that we can no longer show that the correlators aren’t zero! Indeed in non-relativistic field
theory you can have a field Φ that only has an annihilation part and such a field can indeed annihilate the
vacuum.
46 More precisely what we showed is that Φ and Φ∗ commute/anticommute at spacelike separation if they have integer/half-

integer spin. In the homework you will show that this implies the same for Φ with Φ and Φ∗ with Φ∗ .

Figure 10: Re-interpreting the ground state wave function as a Euclidean transition amplitude in half of
space.

It is also instructive to compare this argument to the more conventional one given e.g. in Weinberg.
There one tries to construct free fields that create particles of arbitrary spin, finding out by brute force
that it is impossible to choose the coefficient functions ui and vi such that the field both transforms in
a valid representation of Spin+ (d − 1, 1) and commutes/anticommutes at spacelike separation unless the
spin-statistics relation is satisfied. The proof given here by contrast does not rely on free fields and is also
more intuitive. As in the case of the CRT theorem, any experimental demonstration of a violation of the
spin-statistics connection would be catastrophic for quantum mechanics and special relativity.

6.3 The structure of vacuum entanglement


There is a nice way to use the ideas we have been discussing to analyze the structure of the ground state
wave function in relativistic quantum field theory. The idea is to decompose space into a “left” region with
x1 < 0 and a “right” region with x1 > 0. To simplify our analysis we will restrict to bosonic theories where
all fields commute at spacelike separation, in which case the fields in the L region are independent of the
fields in the R region, so at least in the presence of a cutoff we can write the Hilbert space as a tensor product

\[ \mathcal{H} = \mathcal{H}_L \otimes \mathcal{H}_R. \tag{6.33} \]

The operators in the algebra A(L) are product operators of the form OL ⊗ IR , while the operators in the
algebra A(R) = A′ (L) are product operators of the form IL ⊗ OR .47 The ground state wave function is
computed by the Euclidean path integral in the region τ < 0:
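\[ \langle\phi_L\phi_R|\Omega\rangle \propto \int \mathcal{D}\phi\,\Big|_{\phi(\tau=0)=(\phi_L,\phi_R)}\; e^{-\int_{-\infty}^{0}d\tau\int d^{d-1}x\,\mathcal{L}_E(\phi,\partial\phi)}. \tag{6.34} \]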

The idea is to change our interpretation of this path integral from being split up on horizontal slices to being
split up on radial slices, as shown in figure 10. We thus have

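\begin{align}
\langle\phi_L\phi_R|\Omega\rangle &\propto \langle\phi_R|e^{-\pi K_R}|F_{RT}\phi_L\rangle \nonumber\\
&= \sum_n e^{-\pi\omega_n}\langle\phi_R|n\rangle\langle n|F_{RT}\phi_L\rangle \tag{6.35}
\end{align}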

where KR is the right-sided boost operator


\[ K_R = \int_{x^1>0} d^{d-1}x\; x^1\, T_{00}(x), \tag{6.36} \]
47 In fermionic theories this structure is more complicated because the fermionic fields in L and R need to anticommute instead

of commute; we haven’t introduced enough fermion technology to deal with this yet so for now we’ll stick to bosons.

also sometimes called the Rindler Hamiltonian, and |n⟩ is a complete basis of KR eigenstates with
eigenvalues ωn . You can think of (6.35) as arising from applying our usual path integral derivation to
Euclidean evolution by the Rindler Hamiltonian, which generates rotation in the τ x1 plane. To turn (6.35)
into an expression for the ground state however we need to find a way to get ϕL into a bra instead of a ket.
We can do this by introducing a “partial CRT ” operator $\Theta^R_{CRT}: \mathcal{H}_R \to \mathcal{H}_L$ which acts as

\[ \Theta^R_{CRT}|\phi_R\rangle = |\phi_L'^{*}\rangle. \tag{6.37} \]

Here ϕ′L indicates the CRT transformation of ϕL , which is indeed a function of ϕR . This operator implements
CRT on operators in the left region, as we can check by noting that if x is in L we have:
 
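\begin{align}
\Phi(x)\,\Theta^R_{CRT}|\phi_R\rangle &= \Theta^R_{CRT}\left(\Theta^{R\dagger}_{CRT}\,\Phi(x)\,\Theta^R_{CRT}\right)|\phi_R\rangle \nonumber\\
&= \Theta^R_{CRT}\,\Phi'(x)|\phi_R\rangle \nonumber\\
&= \Theta^R_{CRT}\,\phi_L'(x)|\phi_R\rangle \nonumber\\
&= \phi_L'^{*}(x)\,\Theta^R_{CRT}|\phi_R\rangle. \tag{6.38}
\end{align}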

In the first line here we used that $\Theta^R_{CRT}$ is antiunitary, the second line is just implementing CRT on Φ, in
the third line we use that for bosonic theories Φ and Φ† are commuting so an eigenstate of Φ is also an
eigenstate of Φ′ , and in the fourth line we used that $\Theta^R_{CRT}$ is antilinear. From (6.13) we can rewrite (6.37)
as
\[ \Theta^R_{CRT}|\phi_R\rangle = |F_{RT}\phi_R\rangle, \tag{6.39} \]
and making the substitution $\phi_R = F_{RT}\phi_L$ and using that $F_{RT}^2 = 1$ on bosons we have

\[ \Theta^{R\dagger}_{CRT}|\phi_L\rangle = |F_{RT}\phi_L\rangle. \tag{6.40} \]

This then implies that


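\[ \langle n|F_{RT}\phi_L\rangle = \langle n|\Theta^{R\dagger}_{CRT}|\phi_L\rangle = \langle\phi_L|\Theta^R_{CRT}|n\rangle, \tag{6.41} \]
and thus
\[ \langle\phi_L\phi_R|\Omega\rangle \propto \sum_n e^{-\pi\omega_n}\langle\phi_L|\Theta^R_{CRT}|n\rangle\langle\phi_R|n\rangle. \tag{6.42} \]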

We therefore have shown that in any relativistic quantum field theory the ground state has the simple
entangled form
\[ |\Omega\rangle \propto \sum_n e^{-\pi\omega_n}\,\Theta^R_{CRT}|n\rangle\otimes|n\rangle, \tag{6.43} \]

which is called the Rindler decomposition. Stated heuristically, the Rindler eigenstates in the right region
are entangled with their CRT conjugates in the left region.48
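A finite-dimensional toy model makes the structure of (6.43) concrete. The sketch below assumes a made-up four-level "Rindler spectrum" `omega` and absorbs the CRT conjugation into the choice of basis for the left factor, so the state is simply a diagonal coefficient matrix; it then checks that tracing out the left factor leaves a Boltzmann distribution in the boost eigenvalues.

```python
import numpy as np

# Hypothetical boost eigenvalues omega_n, chosen only for illustration.
omega = np.array([0.0, 0.1, 0.25, 0.4])

# |Omega> ∝ sum_n exp(-pi omega_n) |n>_L ⊗ |n>_R, stored as the
# coefficient matrix psi[l, r]. The CRT conjugate of |n> on the left
# is taken as the n-th basis vector of the left factor (toy assumption).
psi = np.diag(np.exp(-np.pi * omega)).astype(complex)
psi = psi / np.linalg.norm(psi)

# Partial trace over the left factor:
# rho_R[r, r'] = sum_l psi[l, r] * conj(psi[l, r']).
rho_R = psi.T @ psi.conj()

# Thermal density matrix exp(-2 pi K_R) / Z with K_R = diag(omega).
thermal = np.diag(np.exp(-2 * np.pi * omega))
thermal = thermal / np.trace(thermal)

print(np.round(rho_R.real, 6))
print(np.round(thermal, 6))
```

The factor of two between the exponent in the state and the exponent in the density matrix is the usual squaring of Schmidt coefficients.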

6.4 Unruh Effect


The Rindler decomposition has two very important consequences, the first of which is the Unruh effect:
an observer moving at constant acceleration a in the vacuum of a relativistic field theory feels a temperature
\[ T_{Unruh} = \frac{\hbar a}{2\pi c k_B}, \tag{6.45} \]
48 There were several points in this argument which need to be revisited if there are fermions. We don’t yet have the tools to

do so, but I’ll mention that the result in that case becomes
\[ |\Omega\rangle \propto \sum_n e^{-\pi\omega_n}\, i^{-F_L}\,\Theta^R_{CRT}|n\rangle\otimes|n\rangle, \tag{6.44} \]

where FL is the number of fermions in the left region.

where I have temporarily restored the unsightly dimensionful constants ℏ, c, and kB . This is a quite
remarkable statement, although not one which is easy to experience yourself. For example if we take a to be
9.8 m/s² we get
\[ T_{Unruh} \approx 4\times 10^{-20}\ \mathrm{K}. \tag{6.46} \]
To derive this, we first note that an observer living in the right region can take the partial trace over the left
region, leading to a vacuum density matrix
\[ \rho_R \propto \sum_n e^{-2\pi\omega_n}|n\rangle\langle n| = e^{-2\pi K_R}. \tag{6.47} \]

This is nothing but a thermal density matrix, but with “Hamiltonian” $K_R$ and “temperature” $T_K = \frac{1}{2\pi}$.
The world should therefore look thermal to someone whose proper time is proportional to the boost rapidity
η. From equation (6.22), we see that such a person should be moving on a trajectory

t(η) = x0 sinh η
x(η) = x0 cosh η. (6.48)

Note that this trajectory is the boost image of the point (0, x0 ). The proper time along this trajectory is
related to η by
τ = ηx0 , (6.49)
and we can compute the proper acceleration:
\[ a = \sqrt{\left(\frac{d^2x}{d\tau^2}\right)^2 - \left(\frac{d^2t}{d\tau^2}\right)^2} = \frac{1}{x_0}. \tag{6.50} \]

Therefore the proper temperature seen by this observer is


\[ T_{Unruh} = T_K\frac{d\eta}{d\tau} = \frac{1}{2\pi x_0} = \frac{a}{2\pi}. \tag{6.51} \]
This effect is the essence of Hawking’s calculation showing that black holes evaporate into thermal radiation,
and in fact Unruh discovered it by way of trying to come up with an intuitive interpretation of Hawking’s
paper.
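Both the numerical estimate (6.46) and the proper acceleration (6.50) are quick to check. The sketch below plugs SI values of ℏ, c, and k_B into (6.45), and separately differentiates the trajectory (6.48) by finite differences in units with c = 1; the function names and the sample values x0 = 2, η = 0.5 are illustrative choices.

```python
import math

# Physical constants in SI units.
HBAR = 1.054571817e-34   # J s
C = 2.99792458e8         # m / s
K_B = 1.380649e-23       # J / K

def unruh_temperature(acceleration):
    """Equation (6.45): T = hbar a / (2 pi c k_B), in kelvin."""
    return HBAR * acceleration / (2 * math.pi * C * K_B)

def proper_acceleration(x0, eta, h=1e-3):
    """Finite-difference check of (6.50) along the trajectory (6.48).

    Proper time is tau = eta * x0, so a step h in eta is a step h * x0
    in tau. Units with c = 1 are assumed here.
    """
    dtau = h * x0
    t = lambda e: x0 * math.sinh(e)
    x = lambda e: x0 * math.cosh(e)
    d2t = (t(eta + h) - 2 * t(eta) + t(eta - h)) / dtau ** 2
    d2x = (x(eta + h) - 2 * x(eta) + x(eta - h)) / dtau ** 2
    return math.sqrt(d2x ** 2 - d2t ** 2)

T_earth = unruh_temperature(9.8)
a_check = proper_acceleration(x0=2.0, eta=0.5)

print(f"Unruh temperature at 9.8 m/s^2: {T_earth:.3e} K")
print(f"proper acceleration for x0 = 2: {a_check:.6f} (expected 1/x0 = 0.5)")
```

The acceleration comes out independent of η, as it must for a trajectory that is the boost orbit of a single point.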

6.5 Reeh-Schlieder property


There is a second important consequence of the Rindler decomposition, which we will call the Reeh-Schlieder
property:
• In relativistic quantum field theory there are no nonzero local operators which annihilate the vacuum.
In free field theory this statement is quite intuitive: since any field is the sum of an annihilation and a
creation part, to project onto the annihilation part we need to use a Fourier transform which is an integral
over all of space. The proof for general field theories is quite simple: for any operator O localized in the
right region R (or more carefully any element of A[R]), using the Rindler decomposition the squared norm
of the state created by acting with O on the vacuum is given by
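\begin{align}
\langle\Omega|O^\dagger O|\Omega\rangle &\propto \sum_n e^{-2\pi\omega_n}\langle n|O^\dagger O|n\rangle \nonumber\\
&= \sum_{n,m} e^{-2\pi\omega_n}\langle n|O^\dagger|m\rangle\langle m|O|n\rangle \nonumber\\
&= \sum_{n,m} e^{-2\pi\omega_n}|\langle m|O|n\rangle|^2. \tag{6.52}
\end{align}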

The final expression here is a sum of positive semi-definite terms, so it can vanish only if each term vanishes.
Therefore if O annihilates the vacuum, all of its matrix elements must vanish - in other words O must itself
be zero. We note in passing that the Reeh-Schlieder property is precisely what we needed to complete our
proof of the spin-statistics theorem, so that theorem is now proved as well. We also note that this argument
actually proves something stronger: it shows that no operator which is an element of A[R] in some Lorentz
frame can annihilate the vacuum. For example any nonzero product of a finite number of local operators at
arbitrary points also cannot annihilate the vacuum, since by an appropriate spacetime translation we can put
all the operators into the domain of dependence of the right region R and the vacuum is translation-invariant.
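The identity (6.52) can be checked in a finite-dimensional toy version of the Rindler decomposition. This sketch assumes a made-up spectrum `omega`, represents the vacuum by a diagonal coefficient matrix, and acts with a random matrix on the right factor only; all names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical boost eigenvalues, chosen only for illustration.
omega = np.array([0.0, 0.1, 0.25, 0.4])
dim = len(omega)
weights = np.exp(-2 * np.pi * omega)

# Unnormalized coefficients psi[l, r] of sum_n e^{-pi omega_n} |n>_L |n>_R.
psi = np.diag(np.exp(-np.pi * omega)).astype(complex)

def norm_squared_after_acting(O):
    """|| (1_L ⊗ O) |Omega> ||^2, computed directly from the state."""
    new_psi = psi @ O.T          # O acts on the right index only
    return float(np.sum(np.abs(new_psi) ** 2))

def weighted_matrix_elements(O):
    """Right-hand side of (6.52): sum_{n,m} e^{-2 pi omega_n} |<m|O|n>|^2."""
    return float(np.sum(weights[np.newaxis, :] * np.abs(O) ** 2))

O_random = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))

print(norm_squared_after_acting(O_random))
print(weighted_matrix_elements(O_random))
```

Because every weight is strictly positive, the sum can vanish only when every matrix element of O does, which is the toy version of the separating property.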
The Reeh-Schlieder property has a rather surprising consequence: it implies that any state in the Hilbert
space can be obtained by acting on the vacuum with an operator which is supported only in the left region
L (by symmetry the same is of course true for the right region R, or more generally for the left or right
region in any Lorentz frame). The proof goes like this: suppose by contradiction that there is a nontrivial
subspace S ⊂ H which is orthogonal to all the states which can be written as O|Ω⟩ for some O ∈ A[L]. We
will argue that the projection PS is a nonzero element of A[R] which annihilates |Ω⟩. By the Reeh-Schlieder
property this is not allowed, and so the subspace S must be zero-dimensional. The idea is to first consider
PS ⊥ = 1 − PS , which is the projection onto the subspace of states which can be created by acting on |Ω⟩
with elements of A[L]. For any O in A[L] we have

OPS ⊥ = PS ⊥ OPS ⊥ (6.53)

and
O† PS ⊥ = PS ⊥ O† PS ⊥ , (6.54)
where in both cases the argument is that both sides of the equation act as zero on S and as O/O† on S ⊥ .

Taking the dagger of the second equation and combining them, we see that

OPS ⊥ = PS ⊥ O, (6.55)

and thus that PS ⊥ is in the commutant of A[L]. By Haag duality this is equal to A[R], and so we have
PS = 1 − PS ⊥ ∈ A[R]. Moreover PS clearly annihilates |Ω⟩ since |Ω⟩ ∈ S ⊥ .
We only proved the Reeh-Schlieder property for half-space regions, but in fact it is true for any region
which is not a complete time slice.49 In other words any operator which annihilates the vacuum cannot be in
A[R] for any region R that is not a complete time slice. The argument just given then implies an even more
shocking consequence: for any open spatial region R and any quantum state |ψ⟩, we can find an element O
of A[R] such that50
|ψ⟩ = O|Ω⟩. (6.56)
For example we can instantaneously create the moon by acting on the vacuum with an operator that has
support only in this classroom! This is a rather extreme example of what is called quantum teleportation.51
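In finite dimensions the cyclic property reduces to a matrix inversion, which also shows why the required operator is far from unitary. The sketch below assumes a toy vacuum with made-up Schmidt coefficients e^{-πω_n} and solves for an operator O on the right factor alone that produces an arbitrary target state; the names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical boost eigenvalues, chosen only for illustration.
omega = np.array([0.0, 0.1, 0.25, 0.4])
dim = len(omega)
schmidt = np.exp(-np.pi * omega)

# Toy vacuum: coefficients vac[l, r] of sum_n e^{-pi omega_n} |n>_L |n>_R.
vac = np.diag(schmidt).astype(complex)

# An arbitrary target state on the full tensor product.
target = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))

# (1_L ⊗ O)|Omega> has coefficients vac @ O.T, so we solve
# vac @ O.T = target. This is possible because no Schmidt coefficient
# vanishes, but the inverse weights e^{+pi omega_n} make O large.
O = (np.diag(1.0 / schmidt) @ target).T

reconstructed = vac @ O.T
is_unitary = np.allclose(O.conj().T @ O, np.eye(dim))

print("target reproduced:", np.allclose(reconstructed, target))
print("O is unitary:", is_unitary)
```

In the continuum the weights e^{πω_n} are unbounded, which is one way to see why the exact statement is only about approximating a state arbitrarily well.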
It is worth briefly mentioning some standard mathematical terminology which is used in discussing the
Reeh-Schlieder property. In the theory of von Neumann algebras, a state |Ω⟩ with the property that it is not annihilated
by any nonzero element of a von Neumann algebra A is said to be separating for that algebra. Similarly a
state |Ω⟩ with the property that A|Ω⟩ is a dense set of states in the Hilbert space H is said to be a cyclic
state for A. What the Reeh-Schlieder property says is that in quantum field theory the vacuum is both
cyclic and separating for the algebra A[R] associated to any spatial region which isn’t a complete time slice.

49 Unfortunately I’m not aware of a simple proof of this generalization, except in the special case of conformal field theories.
50 This statement isn’t actually quite true: if we are careful about infinite-dimensional Hilbert spaces, what we find from the
proof in the previous paragraph is that we can create a state which is as close to |ψ⟩ as we like in the Hilbert space norm. A
mathematician would describe this situation by saying that the set A[R]|Ω⟩ is dense in the Hilbert space H.
51 To be clear, the operator which does this is not unitary so we can’t use it to communicate faster than light. This seeming

non-locality is of the EPR variety, rather than the worse non-locality we found in the first lecture by trying to quantize a
relativistic quantum particle.

6.6 Homework
1. Using (6.16) and also the spin-statistics theorem, show that $\Theta_{CRT}^2 = 1$.
2. Check that the complex scalar action is invariant under CRT .

3. Check that the massive (real) vector action with Lagrangian

\[ \mathcal{L} = -\frac{1}{4}(\partial_\mu V_\nu - \partial_\nu V_\mu)(\partial^\mu V^\nu - \partial^\nu V^\mu) - \frac{m^2}{2}V^\mu V_\mu \tag{6.57} \]
is also invariant under CRT .
4. Let’s model the hydrogen atom by a classical electron orbiting the proton in a circle whose radius is
the Bohr radius $a_0 = 5\times 10^{-11}$ m. What is the Unruh temperature experienced by the electron? How
does it compare to the binding energy of hydrogen?

5. Show that if Φ(x)Φ† (y) ± Φ† (y)Φ(x) = 0 at spacelike separation, then we also have Φ(x)Φ(y) ±
Φ(y)Φ(x) = 0 and Φ† (x)Φ† (y) ± Φ† (y)Φ† (x) = 0 at spacelike separation. Hint: you should assume
that Φ(x)Φ(y) + sΦ(y)Φ(x) = 0 with either s = 1 or s = −1, and then show that s needs to be the
same sign as appears in Φ(x)Φ† (y) ± Φ† (y)Φ(x) = 0. I recommend considering the norm of the state
Φ(x)Φ(y)|Ω⟩, and you will need to use the Reeh-Schlieder property and also that as $(x-y)^2 \to +\infty$
we have
⟨Φ† (x)Φ(x)Φ† (y)Φ(y)⟩ → ⟨Φ† (x)Φ(x)⟩⟨Φ† (y)Φ(y)⟩, (6.58)
which is an example of what is called cluster decomposition. In general cluster decomposition says
that the connected correlation functions of local operators should always decay at large separation, in
this case the connected two-point function of the composite operator Φ† Φ.

7 Perturbative calculation of correlation functions in interacting
theories
So far our results in this class have fallen into two categories:

• Explicit calculations in free field theory

• General formal results (such as the CRT and spin-statistics theorems) which are valid in any relativistic
quantum field theory.
Free field theory is quite useful for getting an initial picture of how quantum field theory works, and formal
results are of course important for understanding the general structure of quantum field theory, but at the
end of the day most field theories are not free and formal results won’t get us to detailed predictions that
can be quantitatively compared to experiment. It is time for us to learn how to do some explicit calculations
in field theories that are not free.
The simplest interacting field theory is called $\phi^4$ theory, and its Lagrangian density is given by

\[ \mathcal{L} = -\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \frac{m^2}{2}\phi^2 - \frac{\lambda}{4!}\phi^4. \tag{7.1} \]
It must be acknowledged from the outset that no analytic solution of this theory is known. It is not difficult
to see why it cannot be solved using the methods we have discussed so far: the Heisenberg equation of
motion
\[ (\nabla^2 - m^2)\Phi = \frac{\lambda}{6}\Phi^3 \tag{7.2} \]
is non-linear, and thus cannot be solved using the Fourier transform, and the path integral
\[ \int \mathcal{D}\phi\; e^{i\int d^dx\,\mathcal{L}} \tag{7.3} \]

is not Gaussian so we can’t compute it using our Gaussian tricks. In fact there is an even more severe problem:
for d ≥ 4 this model is widely expected to not even have a continuum limit: it can only be defined precisely
in the presence of a finite UV cutoff such as a lattice. Nonetheless there is much to be gained by studying
this model, and the key idea that will allow us to make progress is perturbation theory: we treat the
parameter λ, called the coupling constant, as small, and then we compute interacting correlation functions
as power series in λ about their free field values. There is a beautiful diagrammatic way of organizing such
calculations, called Feynman diagrams, which we will meet for the first time in this lecture. Perturbative
calculations using Feynman diagrams are the central focus of a large fraction of the practicing quantum field
theorists in the world, especially those working in particle physics, and developing a good intuition for them
is essential for any aspiring theoretical physicist (or any aspiring particle experimentalist).

7.1 Perturbation series for an integral


As a first illustration of the perturbative method, we’ll consider the integral
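\[ f(\lambda) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}dx\; e^{-\frac{1}{2}x^2-\frac{\lambda}{4!}x^4} \tag{7.4} \]
with λ > 0. This integral can be evaluated in closed form; according to Mathematica we have
\[ f(\lambda) = \sqrt{\frac{3}{2\pi\lambda}}\;e^{\frac{3}{4\lambda}}\,K_{1/4}\!\left(\frac{3}{4\lambda}\right), \tag{7.5} \]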

but our approach here will be to ignore this and try to approximate f (λ) when λ ≪ 1. The idea is to Taylor
expand the “interaction” term, which allows us to rewrite the integral as a sum over Gaussian moments:
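\begin{align}
f(\lambda) &= \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}dx\; e^{-\frac{1}{2}x^2}\sum_{n=0}^{\infty}\frac{1}{n!}\left(-\frac{\lambda x^4}{4!}\right)^n \nonumber\\
&\text{“=”}\; \frac{1}{\sqrt{2\pi}}\sum_{n=0}^{\infty}\frac{1}{n!}\left(-\frac{\lambda}{4!}\right)^n\int_{-\infty}^{\infty}dx\; x^{4n}e^{-\frac{1}{2}x^2} \tag{7.6}
\end{align}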

I’ve put the equality in quotes in the second line since we have recklessly exchanged the order of summation
and integration, a sin for which we will shortly pay a price. Proceeding boldly ahead in the meantime, we
can be encouraged by the fact that the terms in the sum are suppressed by higher powers of λ as n increases,
and so we can hope that truncating this sum to the first few terms gives a good approximation to f (λ) when
λ is small. The easiest way to evaluate these Gaussian moments is to remember the integral definition
\[ \Gamma(y) = \int_0^\infty ds\; s^{y-1}e^{-s} \tag{7.7} \]

of the Euler Γ-function and change variables $x^2 = 2s$, which gives

\[ \int_{-\infty}^{\infty}dx\; x^{4n}e^{-\frac{1}{2}x^2} = 2^{2n+1/2}\,\Gamma(2n+1/2), \tag{7.8} \]

so we can write the perturbative expansion as


\[ f(\lambda)\;\text{“=”}\;\frac{1}{\sqrt{\pi}}\sum_{n=0}^{\infty}\frac{1}{n!}\,\Gamma(2n+1/2)\left(-\frac{\lambda}{6}\right)^n. \tag{7.9} \]

The first few terms in the sum are given by


\[ f(\lambda)\;\text{“=”}\;1 - \frac{\lambda}{8} + \frac{35\lambda^2}{384} - \frac{385\lambda^3}{3072} + \frac{25025\lambda^4}{98304} + \ldots. \tag{7.10} \]
In figure 11 we show how this approximation does against the exact expression (7.5): at least in the range
0 < λ < .3 including higher order terms indeed seems to give us a better and better approximation to f (λ).
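The comparison described here is easy to reproduce. The sketch below evaluates the integral (7.4) with a plain trapezoid rule instead of the Bessel-function form (7.5), and builds the partial sums from (7.9); the function names, the integration window, and the step count are choices made for this example.

```python
import math

def f_numeric(lam, half_width=12.0, steps=20000):
    """Evaluate (7.4) with the trapezoid rule on [-half_width, half_width].

    The integrand is smooth and decays faster than a Gaussian, so the
    truncation and discretization errors are negligible here.
    """
    h = 2 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(-0.5 * x * x - lam * x ** 4 / 24.0)
    return total * h / math.sqrt(2 * math.pi)

def series_term(n, lam):
    """n-th term of (7.9): Gamma(2n + 1/2) (-lam/6)^n / (sqrt(pi) n!)."""
    coefficient = math.gamma(2 * n + 0.5) / (math.sqrt(math.pi) * math.factorial(n))
    return coefficient * (-lam / 6.0) ** n

def partial_sum(order, lam):
    """Sum of the terms of (7.9) up to and including lam^order."""
    return sum(series_term(n, lam) for n in range(order + 1))

lam = 0.1
exact = f_numeric(lam)
for order in range(5):
    approx = partial_sum(order, lam)
    print(f"order {order}: {approx:.8f}  ratio to exact: {approx / exact:.8f}")
```

Setting `lam = 1.0` in `series_term` reproduces the coefficients 1, −1/8, 35/384, … of (7.10) directly.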
Unfortunately things are not so simple as this plot might suggest. Recalling Stirling’s approximation
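that at large x we have
\[ \Gamma(x) = \exp\left[x\log x - x + O(\log x)\right], \tag{7.11} \]
we see that the coefficients of $\lambda^n$ in the series (7.9) eventually grow like $e^{n\log n}$ at large n (the factor
Γ(2n + 1/2) grows like $e^{2n\log n}$, while the 1/n! only removes $e^{n\log n}$), which is faster
than $\left(\frac{\lambda}{6}\right)^n$ is decreasing no matter the size of λ: the perturbation series (7.9) is divergent! This is the price we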
pay for our earlier illegal exchange of an integral and an infinite sum. Another way to anticipate this trouble
is that the integral for f (λ) is badly divergent for λ < 0, so asking for a convergent power series at λ = 0 is
asking for too much. You may be desperately hoping that this problem is special to this particular example,
but I assure you that it isn’t: almost any perturbation series in quantum field theory (or even in non-relativistic
quantum mechanics) is divergent. We therefore need to decide what to do. Discarding the method altogether
is too drastic given the impressive success shown in figure 11, but we’d like to get a better sense of when
it succeeds and when it doesn’t. The key idea to remember is that perturbation theory is an asymptotic
series, which means that if we sum the first N terms in the series we get an approximation to the function
whose error is of order $\lambda^N$ for sufficiently small λ. The reason the series doesn’t converge is that as N gets
larger, we need to go to smaller λ for this approximation to be good. Asymptotic series are written using
the “∼” symbol, so we can rewrite (7.9) as
\[ f(\lambda) \sim \frac{1}{\sqrt{\pi}}\sum_{n=0}^{\infty}\frac{1}{n!}\,\Gamma(2n+1/2)\left(-\frac{\lambda}{6}\right)^n. \tag{7.12} \]

[Figure 11 plot: ratio of the 0th- through 4th-order partial sums to the exact answer (vertical axis, roughly 0.99 to 1.02) against $\lambda$ from 0 to 0.30 (horizontal axis).]

Figure 11: Comparing the first few terms in perturbation theory to the exact answer. What is plotted here
is the ratio of the partial sum of the first few terms to the exact answer; for $\lambda<0.3$ the first order result
already brings us within a percent of the right answer, and including higher order terms gets us even closer.

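To show that this series is indeed asymptotic, note that we can legally move a finite number of the terms in
the sum past the integral to get
\[
f(\lambda)=\frac{1}{\sqrt{\pi}}\sum_{n=0}^{N-1}\frac{1}{n!}\,\Gamma(2n+1/2)\left(-\frac{\lambda}{6}\right)^{n}
+\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}dx\,e^{-\frac{1}{2}x^{2}}\sum_{n=N}^{\infty}\frac{1}{n!}\left(-\frac{\lambda x^{4}}{4!}\right)^{n}, \tag{7.13}
\]
and therefore
\[
f(\lambda)-\frac{1}{\sqrt{\pi}}\sum_{n=0}^{N-1}\frac{1}{n!}\,\Gamma(2n+1/2)\left(-\frac{\lambda}{6}\right)^{n}
=\left(-\frac{\lambda}{4!}\right)^{N}\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}dx\,e^{-\frac{1}{2}x^{2}}\sum_{m=0}^{\infty}\frac{1}{(m+N)!}\left(-\frac{\lambda}{4!}\right)^{m}x^{4(m+N)}. \tag{7.14}
\]
In (7.14) we relabeled the sum to pull out an overall factor of $\left(-\frac{\lambda}{4!}\right)^{N}$. The thing it multiplies
approaches a constant as $\lambda\to0$, so the error of the series is indeed of order $\lambda^{N}$ at small $\lambda$.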
We can understand the implications of the asymptotic nature of this series as follows: the series will not
begin to diverge until we get to large enough $n$ that
\[
e^{\log n}\lambda\sim1, \tag{7.15}
\]
or in other words
\[
n\sim\frac{1}{\lambda}. \tag{7.16}
\]
At this point the terms are of order
\[
\epsilon_{\min}=c^{\frac{1}{\lambda}}=e^{-\frac{\log(1/c)}{\lambda}}, \tag{7.17}
\]
where $c$ is some $O(1)$ constant which is less than one. $\epsilon_{\min}$ is the best accuracy that the perturbation series
can achieve; after this point including more terms only causes the error to get larger. We illustrate this qualitative
behavior in figure 12. Effects which are of order $\epsilon_{\min}$ or smaller are typically referred to as non-perturbative
effects, and in situations where they are of interest we need to use methods that go beyond perturbation
theory. For reasonable values of $\lambda$ however this minimal error can be quite small. For example in quantum
electrodynamics we have $\lambda\approx\frac{1}{137}$, so the QED perturbation series should be good up to an unrecoverable
error which is of order
\[
\epsilon_{\min}\sim e^{-137}. \tag{7.18}
\]
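The qualitative story above can be checked numerically for the toy integral. The following is a minimal sketch, assuming the definition of $f(\lambda)$ as $(2\pi)^{-1/2}\int dx\,e^{-x^2/2-\lambda x^4/4!}$ that is implicit in (7.13); the function names, the integration window and the step count are illustrative choices, not part of the notes.

```python
import math

def exact_f(lam, half_width=12.0, steps=4000):
    """Simpson's-rule estimate of f(lam) = (2 pi)^(-1/2) Int dx exp(-x^2/2 - lam x^4/4!)."""
    h = 2.0 * half_width / steps
    total = 0.0
    for k in range(steps + 1):
        x = -half_width + k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * math.exp(-0.5 * x * x - lam * x**4 / 24.0)
    return total * h / 3.0 / math.sqrt(2.0 * math.pi)

def series_term(n, lam):
    """n-th term of (7.12): Gamma(2n + 1/2) / (sqrt(pi) n!) * (-lam/6)^n."""
    return (math.gamma(2 * n + 0.5)
            / (math.sqrt(math.pi) * math.factorial(n))
            * (-lam / 6.0) ** n)

def partial_sum(num_terms, lam):
    """Sum of the first num_terms terms of the perturbation series."""
    return sum(series_term(n, lam) for n in range(num_terms))

lam = 0.1
exact = exact_f(lam)
# Error of the partial sum with N = 1, ..., 30 terms.
errors = [abs(partial_sum(N, lam) - exact) for N in range(1, 31)]
best_N = 1 + min(range(len(errors)), key=errors.__getitem__)
print(f"exact f({lam}) = {exact:.12f}")
print(f"smallest error {min(errors):.2e} reached with N = {best_N} terms")
print(f"error with 30 terms: {errors[-1]:.2e}")
```

The error first shrinks as terms are added, bottoms out at an $N$ of order $1/\lambda$, and then grows again, which is the behavior sketched in figure 12.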

Figure 12: The qualitative behavior of perturbation theory: adding more terms to the series increases the
accuracy until we get to $N\sim\frac{1}{\lambda}$ terms, at which point the error of the series is of order $e^{-\#/\lambda}$. After this
the series begins to diverge and the approximation gets worse and worse. In the plot label $a_{n}$ indicates the
coefficient of $\lambda^{n}$ in the perturbative expansion for $f(\lambda)$.

I’d say this is close enough for most practical purposes! From now on we will therefore use perturbation
theory without further handwringing about its validity, except in non-perturbative situations where we are
indeed interested in effects of order $\epsilon_{\min}$.[52]

7.2 Feynman diagrams for Gaussian integrals


In the previous section we took advantage of our knowledge of Γ-functions to immediately compute the
coefficients in the perturbation series. In more general examples this is not possible, so we need another
method. The idea which always works is to compute the integral (7.8) using our Gaussian integral technology.
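Indeed recall that we have
\[
\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}dx\,e^{-\frac{1}{2}x^{2}+Bx}=e^{\frac{B^{2}}{2}}, \tag{7.19}
\]
and therefore
\[
\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}dx\,x^{m}e^{-\frac{1}{2}x^{2}}=\left(\frac{d}{dB}\right)^{m}e^{\frac{B^{2}}{2}}\bigg|_{B=0}. \tag{7.20}
\]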
At first the combinatorics of computing these derivatives is somewhat intimidating; after all, it has to give the
series (7.10), whose coefficients are not so simple-looking. Life becomes simple, however, once we realize that each derivative
can only do one of two things: bring down a factor of $B$ from the exponent or compute the derivative of the
existing prefactor. We therefore have
\[
\left(\frac{d}{dB}\right)^{m}e^{\frac{B^{2}}{2}}=e^{\frac{B^{2}}{2}}\left(B+\frac{d}{dB}\right)\left(B+\frac{d}{dB}\right)\cdots\left(B+\frac{d}{dB}\right)\times1, \tag{7.21}
\]
where there are $m$ copies of $B+\frac{d}{dB}$. In order to get a term which survives when we set $B=0$, there must
be an equal number of $B$s and $\frac{d}{dB}$s, with each derivative appearing to the left of the $B$ that it acts on. There
are no such terms when $m$ is odd, so we see that the integral (7.20) vanishes unless $m$ is even. When $m$ is
even, the number of terms is equal to the number of pairings of $m$ objects, since each derivative needs to be
paired with the $B$ it acts on. The number of such pairings is
\[
N_{m}:=\frac{m!}{2^{m/2}\,(m/2)!}, \tag{7.22}
\]
since we can choose the first element of the first pair, the second element of the first pair, and so on down to
the second element of the $(m/2)$th pair, and then we need to divide by a factor of two for each pair since the
order doesn't matter and also divide by the number of permutations of the pairs. Therefore we have
\[
\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty}dx\,x^{m}e^{-\frac{1}{2}x^{2}}=
\begin{cases}
N_{m} & m\ \text{even}\\
0 & m\ \text{odd}
\end{cases}. \tag{7.23}
\]
This of course is equal to what we found using the $\Gamma$ function (with the replacement $m=4n$), as you will
check on the homework.

[52] Of course if $\lambda$ is not small then neither is $\epsilon_{\min}$, so in that case we are obviously interested in effects of order $\epsilon_{\min}$! More
interesting are situations where $\lambda$ is small but we nonetheless still care about some non-perturbatively small effect. For example
there could be some process whose rate is zero to all orders in perturbation theory but not exactly zero, in which case non-perturbative
effects give the leading contribution. The possible decay of the Higgs vacuum is an example of such an effect.
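As a quick numerical sanity check (not a substitute for the analytic argument), the pairing count $N_{m}$ of (7.22) can be compared with the $\Gamma$-function expression (7.8) for $m=4n$. This is a minimal sketch; the function names are illustrative.

```python
import math

def pairing_count(m):
    """N_m from (7.22): number of ways to split m objects into m/2 unordered pairs."""
    if m % 2:
        return 0
    return math.factorial(m) // (2 ** (m // 2) * math.factorial(m // 2))

def gamma_moment(n):
    """(2 pi)^(-1/2) Int dx x^(4n) exp(-x^2/2), evaluated using (7.8)."""
    return 2.0 ** (2 * n + 0.5) * math.gamma(2 * n + 0.5) / math.sqrt(2.0 * math.pi)

# Compare the two expressions for the first few values of n.
for n in range(5):
    print(n, pairing_count(4 * n), gamma_moment(n))
```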
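In quantum field theory we are really interested in multi-dimensional Gaussian integrals, which we found
obey
\[
\sqrt{\mathrm{Det}\left(\frac{A}{2\pi}\right)}\int dx\,\exp\left(-\frac{1}{2}x^{T}Ax+B^{T}x\right)=\exp\left(\frac{1}{2}B^{T}A^{-1}B\right) \tag{7.24}
\]
and thus
\[
\sqrt{\mathrm{Det}\left(\frac{A}{2\pi}\right)}\int dx\,x_{i_{1}}\ldots x_{i_{m}}e^{-\frac{1}{2}x^{T}Ax}
=\frac{\partial}{\partial B_{i_{1}}}\ldots\frac{\partial}{\partial B_{i_{m}}}e^{\frac{1}{2}B^{T}A^{-1}B}\bigg|_{B=0}. \tag{7.25}
\]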
We can think of this as the $m$-point correlation function in the Gaussian distribution. To compute it we
observe that each derivative again does one of two things, which now are to bring down a factor
of $A^{-1}B$ or to take the derivative of the existing prefactor, so we have
\[
\frac{\partial}{\partial B_{i_{1}}}\ldots\frac{\partial}{\partial B_{i_{m}}}e^{\frac{1}{2}B^{T}A^{-1}B}
=e^{\frac{1}{2}B^{T}A^{-1}B}\left(\sum_{j_{1}}A^{-1}_{i_{1}j_{1}}B_{j_{1}}+\frac{\partial}{\partial B_{i_{1}}}\right)\ldots\left(\sum_{j_{m}}A^{-1}_{i_{m}j_{m}}B_{j_{m}}+\frac{\partial}{\partial B_{i_{m}}}\right)\times1. \tag{7.26}
\]
As before we can only get a term that survives taking $B=0$ if each partial derivative is paired with a $B$ to
its right, so the integral again vanishes for odd $m$ while for even $m$ we have
\[
\sqrt{\mathrm{Det}\left(\frac{A}{2\pi}\right)}\int dx\,x_{i_{1}}\ldots x_{i_{m}}e^{-\frac{1}{2}x^{T}Ax}=\sum_{P}\prod_{(j,k)\in P}A^{-1}_{i_{j}i_{k}}. \tag{7.27}
\]

Here $P$ indicates pairings of $1,\ldots,m$. As before there are $N_{m}$ such pairings, but now they can make different
contributions to the integral. For example for $m=4$ we have
\[
\sqrt{\mathrm{Det}\left(\frac{A}{2\pi}\right)}\int dx\,x_{i_{1}}x_{i_{2}}x_{i_{3}}x_{i_{4}}e^{-\frac{1}{2}x^{T}Ax}
=A^{-1}_{i_{1}i_{2}}A^{-1}_{i_{3}i_{4}}+A^{-1}_{i_{1}i_{3}}A^{-1}_{i_{2}i_{4}}+A^{-1}_{i_{1}i_{4}}A^{-1}_{i_{2}i_{3}}. \tag{7.28}
\]

We are now ready for our first meeting with Feynman diagrams. These are simply a graphical way of
representing the different pairings appearing on the right side of (7.27). The idea is quite trivial: we
draw a dot for each $x_{i}$ appearing in the correlation function, and then we draw lines connecting them to
indicate the pairing. Each pair contributes a "propagator" $A^{-1}$. The $m=4$ case is shown in figure 13.
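The sum over pairings in (7.27) is easy to enumerate by brute force for small $m$. The following is a minimal sketch; the helper names and the sample matrix `a_inv` are hypothetical and chosen only for illustration.

```python
def pairings(points):
    """Yield every way of splitting `points` into unordered pairs."""
    points = list(points)
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    # Pair the first point with each possible partner, then pair up the remainder.
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

def gaussian_moment(a_inv, indices):
    """Right-hand side of (7.27) for the correlation function of x_{i_1} ... x_{i_m}."""
    total = 0.0
    for pairing in pairings(range(len(indices))):
        term = 1.0
        for j, k in pairing:
            term *= a_inv[indices[j]][indices[k]]  # one propagator per pair
        total += term
    return total

# Hypothetical 2x2 propagator matrix A^{-1}, used only as an example.
a_inv = [[2.0, 0.5],
         [0.5, 1.0]]
print(len(list(pairings(range(4)))))        # number of diagrams in figure 13
print(gaussian_moment(a_inv, [0, 0, 1, 1]))  # the three terms of (7.28)
```

For an odd number of points the generator yields nothing, so the moment vanishes, in line with the discussion above.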

7.3 Feynman diagrams for an “interacting” integral


We can explore the idea of Feynman diagrams further by considering an “interacting” integral
\[
f(\lambda)=\sqrt{\mathrm{Det}\left(\frac{A}{2\pi}\right)}\int dx\,e^{-\frac{1}{2}x^{T}Ax-\frac{\lambda}{4!}\sum_{i}x_{i}^{4}}, \tag{7.29}
\]

Figure 13: Feynman diagrams for the four-point function in the Gaussian distribution.

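which you can think of as a simple model of the interacting $\phi^{4}$ theory we began the lecture with. The
perturbative expansion for this integral is
\[
f(\lambda)\sim\sqrt{\mathrm{Det}\left(\frac{A}{2\pi}\right)}\sum_{n=0}^{\infty}\frac{1}{n!}\left(-\frac{\lambda}{4!}\right)^{n}\sum_{i_{1}\ldots i_{n}}\int dx\,x_{i_{1}}^{4}\ldots x_{i_{n}}^{4}e^{-\frac{1}{2}x^{T}Ax}, \tag{7.30}
\]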

and we can evaluate these integrals using our pairing formula (7.27). We now meet a new phenomenon however,
which is that many of the pairings give the same answer due to the repeated indices in the interaction.
For example the first order $n=1$ contribution to the series is
\[
-\frac{\lambda}{4!}\sum_{i}\sqrt{\mathrm{Det}\left(\frac{A}{2\pi}\right)}\int dx\,x_{i}^{4}e^{-\frac{1}{2}x^{T}Ax}
=-\frac{\lambda}{4!}\sum_{i}\times3\times(A^{-1}_{ii})^{2}
=-\frac{\lambda}{8}\sum_{i}(A^{-1}_{ii})^{2}, \tag{7.31}
\]

where all three pairings appearing in (7.28) contribute equally. The second order contribution has three
distinct kinds of pairings: those where each interaction has two self-pairings, those where each interaction
has one self-pairing, and those where there are no self-pairings. These lead to
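\[
\begin{aligned}
\frac{\lambda^{2}}{2\times(4!)^{2}}\sum_{ij}\sqrt{\mathrm{Det}\left(\frac{A}{2\pi}\right)}\int dx\,x_{i}^{4}x_{j}^{4}e^{-\frac{1}{2}x^{T}Ax}
&=\frac{\lambda^{2}}{2\times(4!)^{2}}\sum_{ij}\left(9(A^{-1}_{ii})^{2}(A^{-1}_{jj})^{2}+72A^{-1}_{ii}A^{-1}_{jj}(A^{-1}_{ij})^{2}+24(A^{-1}_{ij})^{4}\right)\\
&=\lambda^{2}\sum_{i,j}\left(\frac{1}{128}(A^{-1}_{ii})^{2}(A^{-1}_{jj})^{2}+\frac{1}{16}A^{-1}_{ii}A^{-1}_{jj}(A^{-1}_{ij})^{2}+\frac{1}{48}(A^{-1}_{ij})^{4}\right),
\end{aligned} \tag{7.32}
\]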
where the factors of 9, 72, and 24 count how many pairings there are of each type (together they account for all $N_{8}=105$ pairings of the eight points). Counting these pairings
takes a bit of practice to get used to; we illustrate the idea in figure 14.
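The counts 9, 72 and 24 can also be confirmed by brute force: enumerate all pairings of the eight points in $x_{i}^{4}x_{j}^{4}$ and sort them by how many links join the two vertices. This is a minimal sketch with illustrative names.

```python
from collections import Counter

def pairings(points):
    """Yield every way of splitting `points` into unordered pairs."""
    points = list(points)
    if not points:
        yield []
        return
    first, rest = points[0], points[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

# Points 0-3 carry the index i, points 4-7 carry the index j.
vertex_of = [0, 0, 0, 0, 1, 1, 1, 1]

counts = Counter()
for pairing in pairings(range(8)):
    # Number of propagators A^{-1}_{ij} connecting the two different vertices.
    cross_links = sum(1 for a, b in pairing if vertex_of[a] != vertex_of[b])
    counts[cross_links] += 1

# Keys: 0 cross links (two self-pairings each), 2 (one self-pairing each), 4 (none).
print(dict(counts))
```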
The diagrams in figure 14 are useful for counting pairings, but it is also useful to have a simpler set
of diagrams which are designed so that the same diagram automatically represents all the pairings in each
equivalence class. Following Feynman, the idea is to combine all the dots appearing in each factor of the
interaction $\sum_{i}x_{i}^{4}$ into a single interaction vertex, giving us the Feynman diagram expansion. See figure 15
for the set of Feynman diagrams contributing to $f(\lambda)$ up through order $\lambda^{2}$. In terms of these diagrams we
can rewrite our asymptotic series for $f(\lambda)$ as
\[
f(\lambda)\sim1+\sum_{D}\frac{1}{s_{D}}(-\lambda)^{n_{D}}\sum_{i_{1}\ldots i_{n_{D}}}\prod_{(m,\ell)\in L_{D}}A^{-1}_{i_{m}i_{\ell}}, \tag{7.33}
\]
where $m$ and $\ell$ label the interaction vertices of the diagram, $n_{D}$ indicates the number of interaction vertices
in $D$, $L_{D}$ indicates the set of (unoriented) links in $D$, and $s_{D}$ is called the symmetry factor of the diagram
and is given by
\[
s_{D}=\frac{n_{D}!\,(4!)^{n_{D}}}{p_{D}}, \tag{7.34}
\]

Figure 14: Counting pairings at first and second order. For the $n=1$ pairings, we need to pick which of
the three other $i$s to pair with the first $i$. For the $n=2$ pairings where each interaction has two self-pairings, we
need to make this choice independently for each interaction. For the $n=2$ pairings where both interactions
have a single self-pairing, for each interaction we need to pick which two of the four $i$s are self-paired, and
then there are two ways to do the remaining pairings. For the $n=2$ pairings with no self-pairings, we need
to pick which of the four $j$s pairs with the first $i$, which of the remaining three $j$s pairs with the second $i$,
and which of the remaining two $j$s pairs with the third $i$.

Figure 15: Feynman diagrams contributing to $f(\lambda)$ up through order $\lambda^{2}$. As we saw above, the symmetry
factors for these diagrams are $s_{D}=1$, 8, 128, 16, and 48.

with $p_{D}$ the number of pairings which give rise to this diagram as in figure 14. Except for $s_{D}$ all factors in
(7.33) are easy to read off by visual inspection of $D$, so Feynman diagrams give a powerful way of immediately
seeing what is going on at each order in perturbation theory. There is actually also a way to compute $s_{D}$
directly from the diagram: it is the size of the automorphism group of the diagram. As long as you
do not intend to become a high-order amplitudes expert, however, it is easy enough (and perhaps safer) to just use
the method of figure 14 to compute $p_{D}$.[53] In more realistic theories where the interaction vertices are less
symmetric we conveniently often have $s_{D}=1$.
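As a worked check of (7.34), the pairing counts $p_{D}=3$ at first order and $p_{D}=9,\,72,\,24$ at second order, read off from (7.31) and (7.32), reproduce the symmetry factors quoted in the caption of figure 15:

```latex
\begin{aligned}
n_D = 1,\ p_D = 3:  &\qquad s_D = \frac{1!\,(4!)^{1}}{3}  = \frac{24}{3}    = 8,   \\
n_D = 2,\ p_D = 9:  &\qquad s_D = \frac{2!\,(4!)^{2}}{9}  = \frac{1152}{9}  = 128, \\
n_D = 2,\ p_D = 72: &\qquad s_D = \frac{2!\,(4!)^{2}}{72} = \frac{1152}{72} = 16,  \\
n_D = 2,\ p_D = 24: &\qquad s_D = \frac{2!\,(4!)^{2}}{24} = \frac{1152}{24} = 48.
\end{aligned}
```

These are the reciprocals of the coefficients $1/8$, $1/128$, $1/16$ and $1/48$ appearing in (7.31) and (7.32).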

7.4 Exponentiation of connected diagrams


You may have already noticed in figure 15 that at second order we started getting diagrams which are
topologically disconnected. This makes the computation of the perturbation series for $f(\lambda)$ somewhat redundant,
as diagrams from lower orders are constantly reappearing at higher orders. In fact there is a beautiful
combinatoric simplification: the sum (7.33) over all Feynman diagrams, connected and disconnected, is actually the exponential of
the sum of connected diagrams only! In other words we have
\[
\log f(\lambda)\sim\sum_{C}\frac{1}{s_{C}}(-\lambda)^{n_{C}}\sum_{i_{1}\ldots i_{n_{C}}}\prod_{(m,\ell)\in L_{C}}A^{-1}_{i_{m}i_{\ell}}, \tag{7.35}
\]
where $C$ runs over the set of connected Feynman diagrams.


To derive (7.35), we need to understand how to evaluate a disconnected diagram in terms of its connected
components. We will indicate by
\[
V_{D}=\frac{1}{s_{D}}(-\lambda)^{n_{D}}\sum_{i_{1}\ldots i_{n_{D}}}\prod_{(m,\ell)\in L_{D}}A^{-1}_{i_{m}i_{\ell}} \tag{7.36}
\]
the "value" of a Feynman diagram. If $D$ is disconnected then most of the factors in (7.36) are just products of
the analogous factors for its connected components, but we need to be careful about the symmetry factor.
Indeed let's say that a disconnected diagram $D$ has connected components $C_{1},C_{2},\ldots,C_{M}$, which we will
momentarily take to be all distinct from each other. We can write the pairing number $p_{D}$ of the full
disconnected diagram as
\[
\begin{aligned}
p_{D}&=\binom{n_{D}}{n_{C_{1}}}\binom{n_{D}-n_{C_{1}}}{n_{C_{2}}}\cdots\binom{n_{D}-n_{C_{1}}-\ldots-n_{C_{M-1}}}{n_{C_{M}}}\times p_{C_{1}}\ldots p_{C_{M}}\\
&=\frac{n_{D}!}{n_{C_{1}}!\ldots n_{C_{M}}!}\times p_{C_{1}}\ldots p_{C_{M}},
\end{aligned} \tag{7.37}
\]
where the combinatoric factors account for the number of ways we can choose which interaction vertices get
assigned to which connected components, and we then multiply by the number of pairings we can do within
each component. If the diagrams appear with repetitions, say $m_{a}$ repetitions of $C_{a}$, then we need to divide
by additional factors of $m_{a}!$ since exchanging identical connected components of a pairing gives the same
pairing. We thus in general have
\[
p_{D}=\frac{n_{D}!}{(n_{C_{1}}!)^{m_{1}}\ldots(n_{C_{M}}!)^{m_{M}}}\times\frac{1}{m_{1}!\ldots m_{M}!}\times(p_{C_{1}})^{m_{1}}\ldots(p_{C_{M}})^{m_{M}}, \tag{7.38}
\]
[53] This interpretation of $s_{D}$ is actually the reason we included the factor of $1/4!$ in the interaction vertex. The basic idea is
that $n_{D}!\,(4!)^{n_{D}}$ gives an "estimate" of how many pairings there are with a given diagram topology, since permuting the $n_{D}$
vertices and permuting which of the four dots at each vertex get attached to other dots can't change the graph topology. This
sometimes is an overestimate however, as whenever the graph has an automorphism then acting on a pairing with it gives the
same pairing. Therefore $s_{D}$ is precisely counting the number of such automorphisms. For example in the second diagram of
figure 15 there are three $\mathbb{Z}_{2}$ automorphisms: one that reflects the top lobe, one that reflects the bottom lobe, and one that
exchanges the two lobes. We therefore have $s_{D}=2^{3}=8$. Similarly for the fifth diagram there is a four-fold permutation symmetry
of the links, as well as a $\mathbb{Z}_{2}$ symmetry that exchanges the two vertices, so we have $s_{D}=4!\times2=48$.

Figure 16: Feynman diagrams for computing the numerator of (7.42) with two external points. Dividing
by the denominator of (7.42) removes all disconnected diagrams. The symmetry factors of the connected
diagrams here are sD = 1, 2, 6, 4, and 4.

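and therefore
\[
\frac{1}{s_{D}}=\frac{p_{D}}{n_{D}!\,(4!)^{n_{D}}}=\frac{1}{s_{C_{1}}^{m_{1}}\ldots s_{C_{M}}^{m_{M}}}\times\frac{1}{m_{1}!\ldots m_{M}!}. \tag{7.39}
\]
We can therefore write the value of $D$ as
\[
V_{D}=\prod_{C}\frac{(V_{C})^{m_{C}}}{m_{C}!}. \tag{7.40}
\]
Finally we can observe that these are precisely the coefficients that these values appear with in
\[
e^{\sum_{C}V_{C}}=\prod_{C}e^{V_{C}}=\prod_{C}\left(\sum_{m_{C}}\frac{(V_{C})^{m_{C}}}{m_{C}!}\right), \tag{7.41}
\]
which completes the proof of (7.35).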
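The exponentiation result can be checked at low order in the simplest case of a single variable with $A=1$, where every propagator equals one and the value of a diagram is just $(-\lambda)^{n_{D}}/s_{D}$. The sketch below (illustrative names, exact rational arithmetic) takes the logarithm of the series for $f(\lambda)$ and compares the $\lambda^{2}$ coefficient with the two connected second-order diagrams, whose symmetry factors are 16 and 48.

```python
from fractions import Fraction
from math import factorial

def double_factorial_odd(k):
    """k!! for odd k, with the convention (-1)!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def f_coefficient(n):
    """Coefficient of lambda^n in f for one variable with A = 1: (-1/4!)^n N_{4n} / n!."""
    return Fraction((-1) ** n * double_factorial_odd(4 * n - 1), factorial(n) * 24 ** n)

def log_series(coeffs):
    """Power-series coefficients of log f, given those of f with coeffs[0] == 1."""
    logs = [Fraction(0)] * len(coeffs)
    for n in range(1, len(coeffs)):
        # From f' = f (log f)':  n c_n = sum_{k=1}^{n} k l_k c_{n-k}.
        acc = n * coeffs[n] - sum(k * logs[k] * coeffs[n - k] for k in range(1, n))
        logs[n] = acc / n
    return logs

coeffs = [f_coefficient(n) for n in range(4)]
logs = log_series(coeffs)
print("f coefficients:    ", coeffs[:3])
print("log f coefficients:", logs[:3])
print("connected diagrams at order lambda^2:", Fraction(1, 16) + Fraction(1, 48))
```

The disconnected diagram with symmetry factor 128 drops out of $\log f$, as (7.35) requires.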

7.5 Perturbative computation of correlation functions


Let’s now see how to use Feynman diagrams to compute perturbative corrections to the Gaussian correlation
functions (7.27). We want to evaluate
\[
\langle x_{i_{1}}\ldots x_{i_{M}}\rangle=\frac{\int dx\,x_{i_{1}}\ldots x_{i_{M}}e^{-\frac{1}{2}x^{T}Ax-\frac{\lambda}{4!}\sum_{i}x_{i}^{4}}}{\int dx\,e^{-\frac{1}{2}x^{T}Ax-\frac{\lambda}{4!}\sum_{i}x_{i}^{4}}}. \tag{7.42}
\]
We already know how to compute the denominator perturbatively: it is the exponential of the sum of
connected Feynman diagrams with only interaction vertices (divided by a factor of $\sqrt{\mathrm{Det}\left(\frac{A}{2\pi}\right)}$ that will
cancel with the same factor in the numerator). Let's think about how to compute the numerator. The
perturbation series for the numerator is
\[
\begin{aligned}
\sqrt{\mathrm{Det}\left(\frac{A}{2\pi}\right)}\int dx\,x_{i_{1}}\ldots x_{i_{M}}e^{-\frac{1}{2}x^{T}Ax-\frac{\lambda}{4!}\sum_{i}x_{i}^{4}}
&\sim\sqrt{\mathrm{Det}\left(\frac{A}{2\pi}\right)}\sum_{n}\frac{1}{n!}\left(-\frac{\lambda}{4!}\right)^{n}\int dx\,x_{i_{1}}\ldots x_{i_{M}}\left(\sum_{i}x_{i}^{4}\right)^{n}e^{-\frac{1}{2}x^{T}Ax}\\
&=\sum_{n}\frac{1}{n!}\left(-\frac{\lambda}{4!}\right)^{n}\sum_{i_{M+1}\ldots i_{M+n}}\sum_{P}\prod_{(j,k)\in P}A^{-1}_{i_{j}i_{k}},
\end{aligned} \tag{7.43}
\]
where in the second line I've labeled the $n$ interaction vertices as $i_{M+1},\ldots,i_{M+n}$. In such calculations the
$x_{i_{a}}$ with $a\in(1,M)$ are referred to as "external" and the $i_{a}$ with $a\in(M+1,M+n)$ are referred to as
"internal" or "interaction".[54]

[54] In this terminology the denominator of (7.42) (multiplied by $\sqrt{\mathrm{Det}\left(\frac{A}{2\pi}\right)}$) is the exponential of the sum of connected
diagrams with no external legs, also sometimes called the sum over "vacuum bubbles".

Figure 17: Feynman diagrams contributing to the four-point function up through $O(\lambda)$. Note however that
the second and third rows are all really just incorporating corrections to the two-point functions in the first row;
it is only the fourth row that is a "genuinely four-point" contribution.

As in the previous subsection we can group this sum over pairings into equivalence classes labeled by
Feynman diagrams, with the diagrams contributing to the two-point function through second order shown
in figure 16. In general in terms of diagrams we have
\[
\sqrt{\mathrm{Det}\left(\frac{A}{2\pi}\right)}\int dx\,x_{i_{1}}\ldots x_{i_{M}}e^{-\frac{1}{2}x^{T}Ax-\frac{\lambda}{4!}\sum_{i}x_{i}^{4}}
\sim\sum_{D}\frac{1}{s_{D}}(-\lambda)^{n_{D}}\sum_{i_{M+1}\ldots i_{M+n_{D}}}\prod_{(m,\ell)\in L_{D}}A^{-1}_{i_{m}i_{\ell}}, \tag{7.44}
\]
where now $n_{D}$ is the number of interaction vertices, $m$ and $\ell$ run over the links of the diagram including
links to external points, and $s_{D}$ is again the symmetry factor
\[
s_{D}=\frac{n_{D}!\,(4!)^{n_{D}}}{p_{D}} \tag{7.45}
\]
with $p_{D}$ the number of pairings of the $M+4n_{D}$ points that give rise to the diagram $D$. As before we can
interpret $s_{D}$ as counting the automorphisms of the diagram, now restricting to those automorphisms which
keep the external points fixed. We also have an exponentiation result: the numerator of (7.42) is equal to
the sum over diagrams where all interaction vertices are connected to at least one external point times the
exponential of the sum over connected vacuum bubbles. The second factor just cancels the denominator
of (7.42), so we then have
\[
\langle x_{i_{1}}\ldots x_{i_{M}}\rangle=\sum_{\hat{C}}\frac{1}{s_{\hat{C}}}(-\lambda)^{n_{\hat{C}}}\sum_{i_{M+1}\ldots i_{M+n_{\hat{C}}}}\prod_{(m,\ell)\in L_{\hat{C}}}A^{-1}_{i_{m}i_{\ell}}, \tag{7.46}
\]
where $\hat{C}$ runs over the set of diagrams where each interaction vertex is connected to at least one external
point. The first few such diagrams for the four-point function are shown in figure 17. Note that these
diagrams still are not all connected, essentially because there are diagrams which amount to just correcting

Figure 18: Feynman diagrams for computing the connected four-point function, including all contributions
up through $\lambda^{2}$. The symmetry factor of the first diagram is one and the symmetry factors for the others are
all two.

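the two-point functions appearing in figure 13 rather than giving "genuinely four-point" contributions. To
focus on the latter, we should look at the connected four-point function, which is defined by
\[
\langle x_{i_{1}}\ldots x_{i_{4}}\rangle_{c}=\langle x_{i_{1}}\ldots x_{i_{4}}\rangle-\langle x_{i_{1}}x_{i_{2}}\rangle\langle x_{i_{3}}x_{i_{4}}\rangle-\langle x_{i_{1}}x_{i_{3}}\rangle\langle x_{i_{2}}x_{i_{4}}\rangle-\langle x_{i_{1}}x_{i_{4}}\rangle\langle x_{i_{2}}x_{i_{3}}\rangle. \tag{7.47}
\]
More generally the connected $M$-point function $\langle x_{i_{1}}\ldots x_{i_{M}}\rangle_{c}$ is defined recursively by[55]
\[
\langle x_{i_{1}}\ldots x_{i_{M}}\rangle=\sum_{S}\Big\langle\prod_{j\in S_{1}}x_{i_{j}}\Big\rangle_{c}\ldots\Big\langle\prod_{j\in S_{L}}x_{i_{j}}\Big\rangle_{c}, \tag{7.48}
\]
where the sum is over partitions $S$ of $\{1,\ldots,M\}$ into parts $S_{1},\ldots,S_{L}$. This defines $\langle x_{i_{1}}\ldots x_{i_{M}}\rangle_{c}$ in terms of lower-point
connected correlation functions and the full correlation function $\langle x_{i_{1}}\ldots x_{i_{M}}\rangle$. Recursing down to the
lowest level, we take $\langle x_{i}\rangle_{c}=\langle x_{i}\rangle$. Forgetting for a moment that in this theory the odd moments of $x_{i}$ vanish,
the first few explicit solutions of this definition are
\[
\begin{aligned}
\langle x_{i}\rangle_{c}&=\langle x_{i}\rangle\\
\langle x_{i}x_{j}\rangle_{c}&=\langle x_{i}x_{j}\rangle-\langle x_{i}\rangle\langle x_{j}\rangle\\
\langle x_{i}x_{j}x_{k}\rangle_{c}&=\langle x_{i}x_{j}x_{k}\rangle-\langle x_{i}x_{j}\rangle\langle x_{k}\rangle-\langle x_{i}x_{k}\rangle\langle x_{j}\rangle-\langle x_{j}x_{k}\rangle\langle x_{i}\rangle+2\langle x_{i}\rangle\langle x_{j}\rangle\langle x_{k}\rangle.
\end{aligned} \tag{7.49}
\]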
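The recursive definition (7.48) translates directly into code: subtract from the full moment every partition into more than one part, each part contributing its own connected function. The following is a minimal sketch with illustrative names. As a test case it uses the raw moments of a Poisson distribution with mean 2, an example that is not from these notes but whose cumulants are all known to equal the mean.

```python
from itertools import combinations

def set_partitions(items):
    """Yield all partitions of `items` into non-empty blocks (tuples)."""
    items = list(items)
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    # Choose which other elements share a block with `first`.
    for size in range(len(rest) + 1):
        for others in combinations(rest, size):
            remaining = [x for x in rest if x not in others]
            for tail in set_partitions(remaining):
                yield [(first,) + others] + tail

def connected(moment, labels):
    """Connected correlation function obtained by solving (7.48) recursively."""
    labels = tuple(labels)
    total = moment(labels)
    for partition in set_partitions(range(len(labels))):
        if len(partition) == 1:
            continue  # the single-block term is the connected function itself
        term = 1
        for block in partition:
            term *= connected(moment, tuple(labels[j] for j in block))
        total -= term
    return total

# Raw moments E[X^k] of a Poisson variable with mean 2 (single variable, so only
# the number of insertions matters).
POISSON_RAW = {1: 2, 2: 6, 3: 22, 4: 94}

def poisson_moment(labels):
    return POISSON_RAW[len(labels)]

for k in range(1, 5):
    print(k, connected(poisson_moment, (0,) * k))
```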

In practice however the definition (7.48) is more useful, as it shows that what the connected correlation
function is really doing is removing all parts of the full correlation function which are mere products of
lower-point correlation functions. Said differently, it builds up the full correlation function out of connected
components in precisely the same way as Feynman diagrams do. We therefore can express the connected
correlation function as a sum over connected diagrams only:
\[
\langle x_{i_{1}}\ldots x_{i_{M}}\rangle_{c}=\sum_{C}\frac{1}{s_{C}}(-\lambda)^{n_{C}}\sum_{i_{M+1}\ldots i_{M+n_{C}}}\prod_{(m,\ell)\in L_{C}}A^{-1}_{i_{m}i_{\ell}}, \tag{7.50}
\]
where now the sum is over genuinely connected diagrams $C$. We show the first few diagrams contributing
to the connected four-point function in figure 18.
Already some patterns may be apparent in the diagrams we have discussed. Let's emphasize two of them:
• The number of diagrams grows quite rapidly as we go to higher orders in $\lambda$. Roughly speaking it grows
like some power of $n_{D}!$, since the total number of pairings grows like this and the symmetry factors grow
too slowly to make up for it (after all generic diagrams should have few symmetries). This growth is
consistent with the idea that the series should be divergent, since a power of $n_{D}!$ will always eventually
beat $\lambda^{n_{D}}$. It also means that computing higher-order Feynman diagrams is a rather laborious process,
requiring many clever tricks to make progress.
• For a fixed number of external legs, as we go to higher order the number of loops in the diagram
increases by one for each power of $\lambda$. Diagrams are thus often classified by the number of loops
rather than the number of interaction vertices, as it is really the number of loops that determines the
complexity of evaluating individual diagrams. Connected diagrams with interaction vertices but no
loops are called tree diagrams, while higher loop diagrams are referred to as one-loop diagrams,
two-loop diagrams, and so on. Most theoretical physicists these days never need to evaluate a
diagram with more than one loop, so in this class our focus will be on computing tree and one-loop
diagrams rather than developing machinery for higher loop computations.

[55] In other contexts the connected correlation functions are called "cumulants" or "Ursell functions".

7.6 Feynman diagrams for perturbative correlation functions in ϕ4 theory


Having now set up all of our machinery, it is quite easy (at least formally) to generalize from the
integral (7.42) to perturbative correlation functions in field theory: we just replace $\int dx$ by the Euclidean path integral $\int\mathcal{D}\phi$, leading
to
\[
\langle T\phi(x_{1})\ldots\phi(x_{M})\rangle=\frac{\int\mathcal{D}\phi\,\phi(x_{1})\ldots\phi(x_{M})\,e^{-S_{0}-\frac{\lambda}{4!}\int d^{d}x\,\phi^{4}}}{\int\mathcal{D}\phi\,e^{-S_{0}-\frac{\lambda}{4!}\int d^{d}x\,\phi^{4}}}. \tag{7.51}
\]
Here $S_{0}$ is the Euclidean action
\[
S_{0}=\frac{1}{2}\int d^{d}x\left(\dot{\phi}^{2}+\vec{\nabla}\phi\cdot\vec{\nabla}\phi+m^{2}\phi^{2}\right) \tag{7.52}
\]
of the free massive scalar field ($\dot{\phi}$ indicates the derivative with respect to the Euclidean time $\tau$). The
perturbative evaluation of this path integral is precisely the same as in the previous section, leading to
\[
\langle T\phi(x_{1})\ldots\phi(x_{M})\rangle_{c}\sim\sum_{C}(-\lambda)^{n_{C}}\frac{1}{s_{C}}\int d^{d}x_{M+1}\ldots\int d^{d}x_{M+n_{C}}\prod_{(m,\ell)\in L_{C}}G_{E}(x_{m}-x_{\ell}), \tag{7.53}
\]
where
\[
G_{E}(x)=\int\frac{d^{d}p}{(2\pi)^{d}}\frac{e^{ip\cdot x}}{p^{2}+m^{2}} \tag{7.54}
\]
is the Euclidean propagator. The two-point function is thus computed by the sum of the connected subset
of the diagrams appearing in figure 16, while the four-point function is computed by the sum of connected
diagrams appearing in figure 18. So for example the first few terms in the expansion for the Euclidean
two-point function are
\[
\begin{aligned}
\langle T\phi(x_{1})\phi(x_{2})\rangle={}&G_{E}(x_{2}-x_{1})-\frac{\lambda}{2}\int d^{d}x_{3}\,G_{E}(x_{1}-x_{3})G_{E}(x_{2}-x_{3})G_{E}(0)\\
&+\lambda^{2}\int d^{d}x_{3}\,d^{d}x_{4}\bigg(\frac{1}{6}G_{E}(x_{1}-x_{3})G_{E}(x_{2}-x_{4})G_{E}(x_{3}-x_{4})^{3}\\
&\qquad+\frac{1}{4}G_{E}(x_{1}-x_{3})G_{E}(x_{3}-x_{4})G_{E}(x_{2}-x_{4})G_{E}(0)^{2}+\frac{1}{4}G_{E}(x_{1}-x_{3})G_{E}(x_{2}-x_{3})G_{E}(x_{3}-x_{4})^{2}G_{E}(0)\bigg)\\
&+\ldots
\end{aligned} \tag{7.55}
\]

You may be alarmed by the factors of $G_{E}(0)=\infty$ in the one-loop and two-loop contributions to this
formula. These are further "UV divergences" of the type we met already in computing the Hamiltonian in free
field theory. There we saw the divergence could be absorbed into a redefinition of the cosmological constant
via a process we called renormalization. We'll eventually see that we can also absorb the divergences here
into a redefinition of the particle mass $m$ and a rescaling of the field $\phi$. To get a first sense of the former we
can observe that a change $\delta m^{2}$ in the mass squared corrects the Euclidean propagator as
\[
\delta G_{E}(x)=-\int\frac{d^{d}p}{(2\pi)^{d}}\frac{\delta m^{2}}{(p^{2}+m^{2})^{2}}e^{ip\cdot x}, \tag{7.56}
\]
while the one-loop contribution to the propagator is
\[
-\frac{\lambda}{2}G_{E}(0)\int d^{d}x_{3}\int\frac{d^{d}p_{1}}{(2\pi)^{d}}\int\frac{d^{d}p_{2}}{(2\pi)^{d}}\frac{e^{ip_{1}\cdot(x_{1}-x_{3})+ip_{2}\cdot(x_{2}-x_{3})}}{(p_{1}^{2}+m^{2})(p_{2}^{2}+m^{2})}
=-\frac{\lambda G_{E}(0)}{2}\int\frac{d^{d}p}{(2\pi)^{d}}\frac{e^{ip\cdot(x_{2}-x_{1})}}{(p^{2}+m^{2})^{2}}, \tag{7.57}
\]
so the only effect of this diagram is to shift the mass-squared of the free propagator by
\[
\delta m^{2}=\frac{\lambda G_{E}(0)}{2}. \tag{7.58}
\]
Said differently, if we call the mass in the Lagrangian $m_{\text{bare}}$ then the "true" mass squared is
\[
m_{\text{true}}^{2}=m_{\text{bare}}^{2}+\frac{\lambda G_{E}(0)}{2}+O(\lambda^{2}). \tag{7.59}
\]
In particular if we want $m_{\text{true}}$ to be finite, then we should tune $m_{\text{bare}}^{2}$ to cancel this divergent contribution
and leave a leftover finite piece. This is another example of the process of renormalization. It is important
to emphasize that the fact that there is a nontrivial relationship between $m_{\text{true}}$ and $m_{\text{bare}}$ has nothing to do
with the fact that the relationship involves UV divergent objects. Even in a theory where all momentum
integrals were convergent, such as a lattice theory or a theory in low spacetime dimensions, it would still be
true that in interacting theories $m_{\text{true}}\neq m_{\text{bare}}$. This may make you wonder what we even mean by $m_{\text{true}}$;
we will answer this question in the next lecture.
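A case where everything is finite makes the mass shift concrete. The sketch below assumes $d=1$, where the momentum integral in (7.54) converges and gives the standard closed form $G_{E}(x)=e^{-m|x|}/(2m)$, so that $G_{E}(0)=1/(2m)$. It compares the propagator evaluated at the shifted mass with the first-order correction (7.56). The function names and the sample values of $m^{2}$, $\lambda$ and $x$ are illustrative assumptions.

```python
import math

def g_euclid_1d(x, m_squared):
    """Euclidean propagator (7.54) in d = 1: e^{-m|x|} / (2m)."""
    m = math.sqrt(m_squared)
    return math.exp(-m * abs(x)) / (2.0 * m)

def delta_g_1d(x, m_squared, delta_m_squared):
    """First-order shift (7.56) in d = 1: -delta_m^2 (1 + m|x|) e^{-m|x|} / (4 m^3)."""
    m = math.sqrt(m_squared)
    return -delta_m_squared * (1.0 + m * abs(x)) * math.exp(-m * abs(x)) / (4.0 * m**3)

m_squared, lam, x = 1.0, 0.1, 1.0
delta_m_squared = lam * g_euclid_1d(0.0, m_squared) / 2.0   # equation (7.58)

# Exact change of the free propagator under m^2 -> m^2 + delta m^2 ...
shifted = g_euclid_1d(x, m_squared + delta_m_squared) - g_euclid_1d(x, m_squared)
# ... versus the one-loop expression, which is linear in delta m^2.
first_order = delta_g_1d(x, m_squared, delta_m_squared)
print(delta_m_squared, shifted, first_order)
```

The two numbers agree up to terms of order $(\delta m^{2})^{2}$, which is the statement that the one-loop diagram is the first term in the expansion of a propagator with shifted mass.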
The fine-tuning of $m_{\text{bare}}^{2}$ which is needed to cancel the large correction to the mass of a scalar particle is sometimes viewed with
suspicion, and the fact that we need to do it for the mass of the Higgs boson (a scalar) in the standard model
of particle physics is often called the hierarchy problem. We will discuss this more in a few lectures.
We can also compute higher-point correlation functions, for example the leading contribution to the
connected Euclidean four-point function is
\[
\langle T\phi(x_{1})\phi(x_{2})\phi(x_{3})\phi(x_{4})\rangle_{c}=-\lambda\int d^{d}x_{5}\,G_{E}(x_{1}-x_{5})G_{E}(x_{2}-x_{5})G_{E}(x_{3}-x_{5})G_{E}(x_{4}-x_{5})+\ldots \tag{7.60}
\]
To get perturbative expressions for time-ordered Lorentzian correlation functions, we should rotate $\tau=i(1-i\epsilon)t$
in all external locations $x_{1},\ldots,x_{M}$ and also in all interaction locations $x_{M+1},\ldots,x_{M+n_{C}}$. This
has two effects: it converts all Euclidean propagators to Feynman propagators
\[
G_{F}(x)=\int\frac{d^{d}p}{(2\pi)^{d}}\frac{-ie^{ip\cdot x}}{p^{2}+m^{2}-i\epsilon}, \tag{7.61}
\]
and provides an extra factor of $i^{n_{C}}$ from the $d\tau$ factors in the integrals over interaction locations. Thus we
have the Lorentzian formula
\[
\langle T\phi(x_{1})\ldots\phi(x_{M})\rangle_{c}\sim\sum_{C}(-i\lambda)^{n_{C}}\frac{1}{s_{C}}\int d^{d}x_{M+1}\ldots\int d^{d}x_{M+n_{C}}\prod_{(m,\ell)\in L_{C}}G_{F}(x_{m}-x_{\ell}). \tag{7.62}
\]
In these calculations the exponentiated sum over connected bubble diagrams canceled between the
numerator and denominator of (7.51). It is worth mentioning however that this sum does have a physical
interpretation: it renormalizes the cosmological constant. To see this, note that any connected bubble
diagram will be proportional to the volume of spacetime since there is a symmetry of translating all interaction
vertices by the same amount. We can therefore view each connected diagram with no external legs as giving
a contribution to the cosmological constant.

Figure 19: Momentum labels for a one-loop contribution to the four-point function.

7.7 Feynman rules in momentum space for correlation functions in ϕ4 theory


Earlier in the semester we met the QFT mantra that one should think in position space but compute
in momentum space. This mantra certainly applies to perturbative correlation functions. Working now
in Lorentzian signature, the basic idea is to replace $G_{F}$ in (7.62) by its momentum representation (7.61)
and then evaluate all of the integrals over interaction vertex locations to get momentum-conserving delta
functions. The only subtlety in doing this is that although $G_{F}(x_{2}-x_{1})$ is symmetric under exchanging
$x_{1}$ and $x_{2}$, the momentum representation isn't manifestly symmetric, so in assigning a momentum to a link
in the graph we need to pick a direction. This is typically indicated on the diagram by drawing a small
arrow next to the link which is labeled by the momentum, as in figure 19. For this diagram the momentum
representation is
\[
\begin{aligned}
\langle T\phi(x_{1})\ldots\phi(x_{4})\rangle_{c}\supset{}&\frac{(-i\lambda)^{2}}{2}\int\frac{d^{d}p_{1}}{(2\pi)^{d}}\ldots\frac{d^{d}p_{4}}{(2\pi)^{d}}\frac{d^{d}p}{(2\pi)^{d}}\frac{d^{d}q}{(2\pi)^{d}}\,e^{ip_{1}\cdot x_{1}+\ldots+ip_{4}\cdot x_{4}}\\
&\times\frac{-i}{p_{1}^{2}+m^{2}-i\epsilon}\,\frac{-i}{p_{2}^{2}+m^{2}-i\epsilon}\,\frac{-i}{p_{3}^{2}+m^{2}-i\epsilon}\,\frac{-i}{p_{4}^{2}+m^{2}-i\epsilon}\,\frac{-i}{p^{2}+m^{2}-i\epsilon}\,\frac{-i}{q^{2}+m^{2}-i\epsilon}\\
&\times(2\pi)^{d}\delta^{d}(p_{1}+p_{2}+p+q)\,(2\pi)^{d}\delta^{d}(p_{3}+p_{4}-p-q).
\end{aligned} \tag{7.63}
\]
This looks a bit nicer if we take the Fourier transform and use one of the $\delta$-functions to evaluate the $q$
integral, giving us
\[
\begin{aligned}
\langle T\phi(p_{1})\ldots\phi(p_{4})\rangle_{c}\supset{}&(2\pi)^{d}\delta^{d}(p_{1}+p_{2}+p_{3}+p_{4})\,\frac{(-i\lambda)^{2}}{2}\,\frac{-i}{p_{1}^{2}+m^{2}-i\epsilon}\ldots\frac{-i}{p_{4}^{2}+m^{2}-i\epsilon}\\
&\times\int\frac{d^{d}p}{(2\pi)^{d}}\frac{-i}{p^{2}+m^{2}-i\epsilon}\,\frac{-i}{(p-p_{3}-p_{4})^{2}+m^{2}-i\epsilon}.
\end{aligned} \tag{7.64}
\]
Here the "time-ordered correlator in momentum space" just means the Fourier transform of the time-ordered
position space correlator, and the $\delta$-function in front is called a momentum-conserving $\delta$-function. Such
a $\delta$-function appears in every momentum-space correlation function, and is a consequence of the fact that
the correlation functions in position space only depend on relative positions due to spacetime translation
invariance. The "hard" part of computing this diagram is evaluating the integral over the loop momentum
on the second line; we will learn how to evaluate such integrals in a few weeks.
The procedure employed in the previous paragraph is easily formalized into an algorithm for evaluating
any Feynman diagram for a momentum-space correlation function. This algorithm is called the Feynman
rules, and given a connected Feynman diagram C contributing to the connected M -point function it goes
like this:
1. Write a factor of $(-i\lambda)^{n_{C}}$, where $n_{C}$ is the number of interaction vertices in $C$.
2. Divide by the symmetry factor $s_{C}$.
3. Label the momenta of all propagators, with external momenta pointed outwards.
4. Multiply by an overall momentum-conserving $\delta$-function $(2\pi)^{d}\delta^{d}(p_{1}+\ldots+p_{M})$.
5. Multiply by a factor of $\frac{-i}{p^{2}+m^{2}-i\epsilon}$ for each propagator, both internal and external, imposing momentum
conservation at all interaction vertices (e.g. imposing $q=p_{3}+p_{4}-p$ in the previous diagram).
6. Integrate over any internal momenta which are not determined by the $\delta$-functions.
This algorithm is the daily routine of every perturbative quantum field theorist, although as we already saw
in the previous section some care is needed to deal with UV (and eventually IR) divergences and at higher
loops one has to be rather organized.
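As the simplest illustration of the algorithm, here is a sketch of the rules applied to the tree-level diagram with a single vertex and four external legs, i.e. the momentum-space version of the Lorentzian continuation of (7.60), using the same Fourier convention as in (7.63) and (7.64). Rule 1 gives $(-i\lambda)$, rule 2 gives $s_{C}=1$, rule 4 gives the overall $\delta$-function, rule 5 gives four external propagators, and rule 6 is empty since there are no internal momenta:

```latex
\langle T\phi(p_1)\phi(p_2)\phi(p_3)\phi(p_4)\rangle_c \supset
(-i\lambda)\,(2\pi)^d\,\delta^d(p_1+p_2+p_3+p_4)
\prod_{a=1}^{4}\frac{-i}{p_a^2+m^2-i\epsilon}.
```

Comparing with (7.64), the one-loop diagram differs by one more power of $(-i\lambda)$, a symmetry factor of 2, and a single undetermined loop momentum $p$ running through two internal propagators.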

Figure 20: Diagrams for problem 3.

Figure 21: Diagram for problem 4.

7.8 Homework
1. Using Mathematica (or your favorite competitor), for $f(\lambda)$ given by (7.5) make plots of $\log\left|\sum_{n=0}^{N-1}a_{n}\lambda^{n}-f(\lambda)\right|$
as a function of $N$ for $\lambda=0.5$, $0.2$, $0.1$, and $0.05$. Here $a_{n}$ are the coefficients in the perturbation series
(7.12). Are your plots consistent with the qualitative story in figure 12? In particular note the maximal
accuracy and the value of $N$ at which the series begins to diverge.
2. Starting from the definition (7.7), show that the $\Gamma$-function obeys $\Gamma(x+1)=x\Gamma(x)$ (to use (7.7) you
can assume $\mathrm{Re}\,x>0$, but if you are comfortable with analytic continuation then you should also argue
that this identity holds for all complex $x$). Also show that $\Gamma(1/2)=\sqrt{\pi}$. Using these results, show
that (7.8) and (7.23) are compatible.

3. Check the symmetry factors quoted in the captions of figures 16 and 18, and also compute the symmetry
factors for the two diagrams shown in figure 20.
4. Using the momentum-space Feynman rules, write down an expression for the contribution of the two-
loop diagram in figure 21 to the Fourier transform of the Lorentzian time-ordered four-point function in
λϕ4 theory. You should evaluate all momentum integrals which can be evaluated using the momentum-
conserving δ-functions at the vertices, but you can leave any remaining integrals unevaluated. Make
sure to label the directions of the momenta on your diagram.

Figure 22: Feynman diagrams for the four-point function in a Gaussian integral over complex degrees of
freedom, note that there is one fewer diagram than in the real case.

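5. Show that
\[
\int dx\,dx^{*}\,e^{-x^{\dagger}Ax+B^{T}x+\tilde{B}^{T}x^{*}}=\frac{1}{\mathrm{Det}\left(\frac{iA}{2\pi}\right)}\,e^{\tilde{B}^{T}A^{-1}B}, \tag{7.65}
\]
where $x$ is a vector with complex components, $A$ is a positive symmetric matrix, and the measure is
defined by $dx\,dx^{*}=-2i\,d\mathrm{Re}(x)\,d\mathrm{Im}(x)$. Use this to show that correlation functions of the form
\[
\langle x_{i_{1}}\ldots x_{i_{M}}x^{*}_{j_{1}}\ldots x^{*}_{j_{N}}\rangle=\mathrm{Det}\left(\frac{iA}{2\pi}\right)\int dx\,dx^{*}\,x_{i_{1}}\ldots x_{i_{M}}x^{*}_{j_{1}}\ldots x^{*}_{j_{N}}\,e^{-x^{\dagger}Ax} \tag{7.66}
\]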
are given by a sum over pairings as in (7.27), but where now each pair must contain one x and one
x∗ (so in particular they vanish unless M = N ). Feynman diagrams for a complex degree of freedom
therefore include an arrow on each propagator that points from x to x∗ , as in figure 22.

6. Consider now an interacting complex scalar field with Lagrangian56


$$\mathcal{L} = -\partial_\mu \phi^* \partial^\mu \phi - m^2 \phi^* \phi - \frac{\lambda}{4} |\phi|^4. \tag{7.67}$$
Using the results of the previous problem, what are the Feynman rules for time-ordered connected
correlation functions in momentum space? Using your rules, write out an expression in momentum
space for the contribution of the diagram in figure 19 to the four-point function where the two left dots
are ϕs and the two right dots are ϕ∗ s (you will need to add arrows to the propagators). You don’t need
to evaluate the final momentum integral.
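
The counting behind figure 22 can be illustrated with a short enumeration. This is only a sketch of the combinatorics of pairings, not a solution to problems 5 or 6: the function names `real_pairings` and `complex_pairings` are hypothetical, and the labels are placeholder strings rather than actual field insertions.

```python
from itertools import permutations

def real_pairings(labels):
    """All ways to split an even-length list into unordered pairs (real case)."""
    if not labels:
        return [[]]
    first, rest = labels[0], labels[1:]
    result = []
    for idx, partner in enumerate(rest):
        remaining = rest[:idx] + rest[idx + 1:]
        for tail in real_pairings(remaining):
            result.append([(first, partner)] + tail)
    return result

def complex_pairings(xs, xstars):
    """Pairings in which every pair joins one x to one x*.

    There are none unless the number of x's equals the number of x*'s.
    """
    if len(xs) != len(xstars):
        return []
    return [list(zip(xs, perm)) for perm in permutations(xstars)]

four_real = real_pairings(["x1", "x2", "x3", "x4"])
four_complex = complex_pairings(["x1", "x2"], ["x3*", "x4*"])
print(len(four_real), len(four_complex))  # 3 2
```

For the four-point function this gives three pairings in the real case and two in the complex case, which is the "one fewer diagram" noted in the caption of figure 22.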

56 Note that the denominator in the interaction term is 4, not 4!. This is because the symmetry of the vertex is now only

exchanging two ϕs and two ϕ∗ s instead of permuting four ϕs.

8 Particles and Scattering
So far we have organized our discussion of quantum field theory in terms of correlation functions - vacuum
expectation values of products of Heisenberg fields. At least in free field theory however we saw that we
could also interpret correlation functions in terms of particles: we found a basis of eigenstates of the
Hamiltonian whose elements each have some definite number of non-interacting bosons each carrying a
definite spacetime momentum. It is natural to ask if interacting field theories also have a description in
terms of particles. In general they don’t, which is why we have focused on correlation functions so far, but
many of the interacting quantum field theories of most interest in physics do indeed give rise to particles and
it is therefore worthwhile for us to spend some time understanding when this happens and how to relate it
to our knowledge of correlation functions.

8.1 One-particle states


What do we mean by a particle? Our first approach to answering this question is to say what we mean
by a one-particle state. We have already met particles of spin zero in our free scalar theory, but now it is
time to understand the general case. To motivate the problem I’ll mention a question that bothered me as
a student: it is often said that photons have spin one, but if so then why do they only have two polarization
states instead of three? We’ll answer this question at the end of this section.
The definition we propose is the following:57
• In any relativistic quantum system, a one-particle state is an eigenstate of the total spacetime
momentum operator $P^\mu$ which is part of an irreducible representation of the (identity component of
the) Poincaré group whose basis element labels are discrete and finite except for the eigenvalue of $P^\mu$.

In other words we can expand the set of one-particle states in a basis of P µ eigenstates

$$P^\mu |p, \sigma\rangle = p^\mu |p, \sigma\rangle, \tag{8.1}$$

where σ runs over a finite set and Poincaré transformations U (Λ, a) act within the subspace of the full Hilbert
space of the theory that is spanned by this basis. As in previous lectures we will normalize these states so
that
$$\langle p', \sigma' | p, \sigma \rangle = (2\pi)^{d-1} \delta^{d-1}(\vec{p}\,' - \vec{p})\, \delta_{\sigma'\sigma}, \tag{8.2}$$
where ensuring diagonality in the σ indices may require some Gram-Schmidt procedure. The action of
spacetime translations in this basis is simple: we have
$$e^{-i a_\mu P^\mu} |p, \sigma\rangle = e^{-i a_\mu p^\mu} |p, \sigma\rangle, \tag{8.3}$$

so what we need to understand is the action of Lorentz transformations U (Λ). These obey

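$$\begin{aligned} P^\mu U(\Lambda)|p,\sigma\rangle &= U(\Lambda)\, U(\Lambda^{-1}) P^\mu U(\Lambda) |p,\sigma\rangle \\ &= U(\Lambda)\, \Lambda^\mu{}_\nu P^\nu |p,\sigma\rangle \\ &= \Lambda^\mu{}_\nu p^\nu\, U(\Lambda)|p,\sigma\rangle, \end{aligned} \tag{8.4}$$

so we see that we need to have
$$U(\Lambda)|p,\sigma\rangle = \sum_{\sigma'} C_{\sigma',\sigma}(\Lambda,p)\, |\Lambda p, \sigma'\rangle \tag{8.5}$$

for some matrices $C_{\sigma',\sigma}(\Lambda,p)$.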


57 Recall that a representation π of a group G is a set of linear maps $D_\pi(g)$ on a vector space V, one for each g ∈ G, obeying
$D_\pi(g_1) D_\pi(g_2) = D_\pi(g_1 g_2)$. A representation is irreducible if the only subspaces S ⊂ V which are preserved by all elements of π are V itself
and the zero subspace {0}. It is unfortunately standard to also refer to V itself as the representation, as we have already done in
the above definition of one-particle states.

To work out the structure of the Cσ′ ,σ (Λ, p), it is useful to first consider the special case where Λµν pν = pµ .
Given a spacetime momentum pµ , the subgroup of the Lorentz group which preserves pµ is called the little
group for pµ . Given a Lorentz transformation W which is in the little group for some spacetime momentum
k µ , acting on |k, σ⟩ the transformation (8.5) simplifies to
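$$U(W)|k,\sigma\rangle = \sum_{\sigma'} D_{\sigma',\sigma}(W)\, |k,\sigma'\rangle, \tag{8.6}$$

where we’ve defined
$$D_{\sigma',\sigma}(W) = C_{\sigma',\sigma}(W,k). \tag{8.7}$$
These D-matrices give a finite-dimensional representation of the little group for $k^\mu$:
$$\sum_{\sigma'} D_{\sigma,\sigma'}(W_1)\, D_{\sigma',\sigma''}(W_2) = D_{\sigma,\sigma''}(W_1 W_2). \tag{8.8}$$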

A warning:
• The D-matrices appearing here are operators which represent the little group on the Hilbert space
of quantum mechanics. They are NOT the same as the D(Λ) matrices we met in earlier lectures,
which represent the full Lorentz group acting on the components of the fields in the theory via
U(Λ⁻¹)Φa(x)U(Λ) = Σ_b Dab(Λ)Φb(Λ⁻¹x). Many people have wasted a lot of time being confused
about the difference between the (typically infinite-dimensional) representation U(Λ) of Lorentz
symmetry acting on Hilbert space and the (typically finite-dimensional) representation D(Λ) of Lorentz
symmetry acting on fields. The D-matrices we are introducing now are involved in the part of the
former which acts on one-particle states, not the latter.

The key point is then that once we have decided on a representation for the little group, the representation
of the full Lorentz group is determined as well. The idea is that the set of possible spacetime momenta pµ for
a particle are all related by Lorentz transformations, so we can write each pµ as a Lorentz transformation Lp
of some fixed reference momentum k µ . The detailed form of k µ depends on whether the particle is massive
or massless, and will be considered below. So far we have not said anything about how the
σ indices at different momenta are related; we can fix this by simply adopting a convention where
the state |p, σ⟩ is related to the state |k, σ⟩ by

|p, σ⟩ = N (p)U (Lp )|k, σ⟩. (8.9)

Here N(p) is a normalization factor that we include to maintain the normalization (8.2). We showed back
in lecture four that this requires
$$N(p) = \sqrt{\frac{k^0}{p^0}}, \tag{8.10}$$
with the idea being that the object $\frac{d^d p}{(2\pi)^d}\, 2\pi \delta(p^2 + m^2)\, \Theta(p^0) = \frac{d^{d-1}p}{(2\pi)^{d-1}} \frac{1}{2\omega_{\vec p}}$ defines a Lorentz-invariant measure
on spatial momenta and this implies a Lorentz transformation $\delta^{d-1}(\vec p\,'_\Lambda - \vec p_\Lambda) = \frac{\omega_{\vec p}}{\omega_{\vec p_\Lambda}}\, \delta^{d-1}(\vec p\,' - \vec p)$. For general
Λ we then have

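$$\begin{aligned} U(\Lambda)|p,\sigma\rangle &= N(p)\, U(\Lambda)\, U(L_p) |k,\sigma\rangle \\ &= N(p)\, U(L_{\Lambda p})\, U(L_{\Lambda p}^{-1} \Lambda L_p) |k,\sigma\rangle \\ &= N(p) \sum_{\sigma'} D_{\sigma',\sigma}(L_{\Lambda p}^{-1} \Lambda L_p)\, U(L_{\Lambda p}) |k,\sigma'\rangle \\ &= \frac{N(p)}{N(\Lambda p)} \sum_{\sigma'} D_{\sigma',\sigma}(L_{\Lambda p}^{-1} \Lambda L_p)\, |\Lambda p,\sigma'\rangle, \end{aligned} \tag{8.11}$$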

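where we have observed that $L_{\Lambda p}^{-1} \Lambda L_p$ is in the little group with respect to $k^\mu$, and thus that
$$C_{\sigma',\sigma}(\Lambda,p) = \sqrt{\frac{(\Lambda p)^0}{p^0}}\; D_{\sigma',\sigma}(L_{\Lambda p}^{-1} \Lambda L_p). \tag{8.12}$$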

This way of building a representation of a group out of a representation for one of its subgroups is called the
method of induced representations.
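
As a concrete check of the construction, here is a minimal numerical sketch of the little-group element $L_{\Lambda p}^{-1} \Lambda L_p$ for a massive particle. It assumes d = 4, the mostly-plus metric, and the common choice of $L_p$ as a pure boost (the notes leave $L_p$ as a convention, so this choice is an assumption). The helper names `boost`, `standard_boost` and `wigner_rotation` are hypothetical.

```python
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(len(v))) for i in range(len(A))]

def boost(u):
    """Pure boost taking the rest four-velocity (1,0,0,0) to (sqrt(1+u.u), u)."""
    gamma = math.sqrt(1.0 + sum(x * x for x in u))
    L = [[0.0] * 4 for _ in range(4)]
    L[0][0] = gamma
    for i in range(3):
        L[0][i + 1] = L[i + 1][0] = u[i]
        for j in range(3):
            # (gamma - 1) u_i u_j / |u|^2 rewritten as u_i u_j / (gamma + 1)
            L[i + 1][j + 1] = (1.0 if i == j else 0.0) + u[i] * u[j] / (gamma + 1.0)
    return L

def standard_boost(p, m):
    """Assumed convention for L_p: the pure boost with L_p k = p, k = (m,0,0,0)."""
    return boost([x / m for x in p])

def wigner_rotation(Lam, p, m):
    """W = L_{Lam p}^{-1} Lam L_p for spatial momentum p and mass m."""
    energy = math.sqrt(m * m + sum(x * x for x in p))
    boosted = matvec(Lam, [energy] + list(p))              # the four-vector Lam p
    inverse = standard_boost([-x for x in boosted[1:]], m)  # inverse of a pure boost
    return matmul(inverse, matmul(Lam, standard_boost(p, m)))

m = 1.5
p = [0.3, -1.2, 0.7]
Lam = boost([0.9, 0.4, -0.2])   # an example Lorentz transformation (a boost in another direction)
W = wigner_rotation(Lam, p, m)
k = [m, 0.0, 0.0, 0.0]
print(matvec(W, k))             # numerically equal to k
```

The result W preserves the reference momentum and its spatial block is an orthogonal matrix, so it is a rotation, as expected for an element of the massive little group SO(3). It is not the identity here because the two boosts are not collinear.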
To discuss the structure of the little group in more detail, we need to be more explicit about which
Lorentz group we are considering: do we include its non-identity components, and do we go to the double
cover where a rotation by 2π acts as −1 on particles of half-integer spin? Let’s first consider the more
familiar case where we take 2π to be the identity, in which case we are interested in the identity component
SO+ (d − 1, 1) of the Lorentz group. The little group depends on whether pµ is timelike, null, or spacelike,
and in the timelike and null cases it also depends on whether it is future or past pointing. There are no
known particles whose momentum is spacelike (these would be called “tachyons” and would allow causality
violation), nor are there any known particles with p0 < 0 (these would have negative energy and destabilize
the vacuum). We will thus focus on the cases where pµ is timelike or null with p0 > 0, which describe massive
and massless particles respectively.
In the massive case we have p · p = −m² for some m > 0, and by going to the rest frame of the particle
we can choose our reference momentum to be

k µ = (m, 0, . . . 0). (8.13)

The little group thus consists of those elements of SO+ (d − 1, 1) which preserve the vector (m, 0, . . . , 0).
No Lorentz transformation which involves a boost can do this, so the little group of a massive particle is
just the spatial rotation group SO(d − 1). Therefore each massive particle is characterized by an irreducible
representation of the spatial rotation group, which of course is what we call the spin of the particle. In
particular for d = 4 the irreducible representations of SO(3) are labeled by integers j ≥ 0, with the spin-j
representation having dimension 2j + 1 as you hopefully know. If we generalize SO+ (d − 1, 1) to its double
cover Spin+ (d − 1, 1), then the little group becomes Spin(d − 1) so more representations are allowed. In
particular for d = 4 the little group becomes Spin(3) = SU (2), which allows for half-integer j.
The massless case is perhaps more novel. A massless particle has no rest frame, so the best we can do is
choose the reference momentum k µ to point in the positive x1 direction:

k µ = (κ, κ, 0, . . . 0) (8.14)

with κ > 0. To find the little group the easiest way to proceed is to find the set of Lorentz generators which
annihilate $k^\mu$. We can write a general Lorentz generator as
$$J = -i \begin{pmatrix} 0 & b_1 & b_2 & \cdots & b_{d-1} \\ b_1 & 0 & -c_2 & \cdots & -c_{d-1} \\ b_2 & c_2 & & & \\ \vdots & \vdots & & A & \\ b_{d-1} & c_{d-1} & & & \end{pmatrix}, \tag{8.15}$$
where the bi multiply boost generators in the i direction and the ci multiply rotation generators in the
1 − i plane. The matrix A is an arbitrary real antisymmetric matrix, which we can think of as generating
a rotation that doesn’t involve the x1 direction. Demanding that this annihilates (8.14) tells us that we
need b1 = 0 and bi = −ci for all i ≥ 2, so we see that the little group of a massless particle moving in the
x1 direction is generated by rotations involving the x2 , . . . , xd−1 directions and combinations of boosts and
rotations whose generators have the form
Ai = J i0 + J i1 . (8.16)
These generators are mutually commuting, and are rotated into each other by rotations involving the
x2 , . . . , xd−1 directions, so the little group of a massless particle is isomorphic to the group ISO(d − 2)

of Euclidean rotations and translations of Rd−2 . Since the Ai are mutually commuting we can simultane-
ously diagonalize them, but this leads to a problem: since their eigenvalues give a vector in Rd−2 which can
be continuously rotated, if we do not have Ai = 0 for all i ≥ 2 then the index σ cannot run only over a finite
(or even discrete) range. Such representations are called “continuous spin particles”, and they are typically
viewed as pathological. For example such a particle could never be in thermal equilibrium, and seems quite
hard to reconcile with quantum gravity. There has also never been any evidence for a continuous spin par-
ticle. More pedantically, continuous spin particles don’t actually obey our definition of particle state since
σ is not finite. For any of these reasons, we will from now on restrict to representations with Ai = 0. This
reduces the little group to SO(d − 2), and the choice of an irreducible SO(d − 2) representation associated to
a particle is called its helicity. In particular for d = 4 helicity is determined by an irreducible representation
of SO(2), and these are one-dimensional and labeled by an integer j. Indeed the representation is simply

$$D_j(\theta) = e^{ij\theta}, \tag{8.17}$$

where θ is the rotation angle in the 2 − 3 plane. If we generalize SO+ (d − 1, 1) to Spin+ (d − 1, 1), then the
little group of a massless particle (again with Ai = 0) is Spin(d − 2). For d = 4 this is again isomorphic
to SO(2), but now with half-integer j allowed (so that we have to rotate by 4π to get back to where we
started).
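
The algebra of the massless little group can be checked directly from the matrix form (8.15). The sketch below assumes d = 4 and works with the real matrix iJ (so the overall factor of −i is dropped). The function name `generator` and the particular normalization of the two "translation" generators are illustrative choices, not taken from the notes.

```python
def generator(b, c, a):
    """Real 4x4 matrix iJ from (8.15) in d = 4.

    b = (b1, b2, b3) multiply boosts, c = (c2, c3) multiply rotations in the
    1-i planes, and a is the single entry of the antisymmetric block A.
    """
    b1, b2, b3 = b
    c2, c3 = c
    return [
        [0,  b1,  b2,  b3],
        [b1, 0,  -c2, -c3],
        [b2, c2,  0,   a],
        [b3, c3, -a,   0],
    ]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def commutator(A, B):
    AB, BA = matmul(A, B), matmul(B, A)
    return [[AB[i][j] - BA[i][j] for j in range(4)] for i in range(4)]

def matvec(A, v):
    return [sum(A[i][k] * v[k] for k in range(4)) for i in range(4)]

kappa = 1
k = [kappa, kappa, 0, 0]                 # reference momentum (8.14)

# b_1 = 0 and b_i = -c_i, as required for the generator to annihilate k
A2 = generator((0, 1, 0), (-1, 0), 0)
A3 = generator((0, 0, 1), (0, -1), 0)
R = generator((0, 0, 0), (0, 0), 1)      # rotation in the 2-3 plane

zero = [[0] * 4 for _ in range(4)]
print(matvec(A2, k), matvec(A3, k), matvec(R, k))  # all zero vectors
print(commutator(A2, A3) == zero)                  # True: the A_i commute
print(commutator(R, A3) == A2)                     # True: the rotation mixes A_2 and A_3
```

With these conventions one also finds [R, A2] = −A3, so the three generators close into the algebra of ISO(2): two commuting translations rotated into each other by R.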
Now we can return to the question of why photons do not have three spin states. The reason is that they
don’t really have spin one; spin is a property of massive particles. What they have is helicity one! Photons with right
circular polarization have helicity j = 1, while photons with left circular polarization have helicity j = −1.
These states transform in distinct irreducible representations of the Lorentz group, and in particular they
are not mixed together by Lorentz transformations (which is quite different from the situation for a massive
particle of spin one).
There is a somewhat confusing terminology issue with helicity that is related to the idea of spatial
reflection symmetry R (or parity P in even dimensions). Photons with helicity one and helicity minus one
are not mixed by the action of the connected Lorentz group SO+ (d − 1, 1), so strictly speaking according to
our definition we should view them as different types of particle. On the other hand all interactions which
involve photons also preserve R symmetry, and R symmetry does mix photons of opposite helicity. It is
therefore conventional to refer to both helicities as photons. The same is true for gravitons, which have
helicity j = ±2. In the standard model of particle physics however there are particles called neutrinos, which
are involved in nuclear reactions, and which are treated as massless with helicity j = −1/2. The standard
model also has massless particles with helicity j = 1/2, which are called antineutrinos. The interactions of
these particles do not respect R symmetry, and so they are given different names.58

8.2 Multiparticle states in non-interacting theories


We now consider states with more than one particle. In non-interacting theories this is a fairly straightforward
matter: a multiparticle state has the form

|p1 , σ1 , n1 ; p2 , σ2 , n2 ; . . . ; pM , σM , nM ⟩, (8.18)

where the new label ni tells us the type of the ith particle (i.e. is it a photon, an electron, etc). The Poincaré
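transformation of such a state is just the product transformation
$$U(\Lambda,a)|p_1,\sigma_1,n_1;\ldots;p_M,\sigma_M,n_M\rangle = \prod_{i=1}^{M} \left( e^{-ia\cdot\Lambda p_i} \sqrt{\frac{(\Lambda p_i)^0}{p_i^0}} \sum_{\sigma_i'} D^{n_i}_{\sigma_i',\sigma_i}\!\left(L_{\Lambda p_i}^{-1} \Lambda L_{p_i}\right) \right) |\Lambda p_1,\sigma_1',n_1;\ldots;\Lambda p_M,\sigma_M',n_M\rangle. \tag{8.19}$$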
58 There is clear evidence that at least two of the three known types of neutrino are massive. Most particle physicists expect
that in fact they all are, but so far this has not been confirmed experimentally. Understanding the nature of the neutrino mass
matrix is one of the main goals of current particle physics research.

The normalization of these states is a bit trickier since we need to account for identical particles. For example
in our free scalar theory we have

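$$\begin{aligned} \langle\Omega| a_{\vec p\,'_2}\, a_{\vec p\,'_1}\, a^\dagger_{\vec p_1} a^\dagger_{\vec p_2} |\Omega\rangle = {} & (2\pi)^{d-1}\delta^{d-1}(\vec p\,'_1 - \vec p_1) \times (2\pi)^{d-1}\delta^{d-1}(\vec p\,'_2 - \vec p_2) \\ & + (2\pi)^{d-1}\delta^{d-1}(\vec p\,'_1 - \vec p_2) \times (2\pi)^{d-1}\delta^{d-1}(\vec p\,'_2 - \vec p_1). \end{aligned} \tag{8.20}$$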

In general we impose

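$$\langle p_1',\sigma_1',n_1';\ldots;p_M',\sigma_M',n_M' | p_1,\sigma_1,n_1;\ldots;p_M,\sigma_M,n_M\rangle = \sum_\pi (-1)^{f_\pi} \prod_{i=1}^{M} (2\pi)^{d-1} \delta^{d-1}(\vec p\,'_{\pi(i)} - \vec p_i)\, \delta_{\sigma'_{\pi(i)}\sigma_i}\, \delta_{n'_{\pi(i)} n_i}, \tag{8.21}$$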

where the sum is over permutations π of M objects and fπ indicates the number of fermions which are
exchanged by the permutation. In free field theory this sum is automatically generated by the algebra of
creation/annihilation operators, as we saw in (8.20). It is also useful to introduce the idea of a complete set
of multiparticle states, which we can write as
$$I = \sum_{M=0}^{\infty} \prod_{i=1}^{M} \left( \int \frac{d^{d-1}p_i}{(2\pi)^{d-1}} \sum_{\sigma_i} \sum_{n_i} \right) \frac{1}{S(n)}\, |p_1,\sigma_1,n_1;\ldots;p_M,\sigma_M,n_M\rangle\langle p_1,\sigma_1,n_1;\ldots;p_M,\sigma_M,n_M|, \tag{8.22}$$

where the “symmetry factor” S(n) counts the number of possible permutations of identical particles in
|p1, σ1, n1; . . . ; pM, σM, nM⟩ (so in particular if none of the ni are equal then S(n) = 1). Here by convention
the state with zero particles is of course the vacuum |Ω⟩.
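
To make the symmetry factor concrete, here is a minimal sketch that computes S(n) from a list of species labels, reading "the number of permutations of identical particles" as the product over species of (number of particles of that species)!. The function name `symmetry_factor` and the string labels are hypothetical.

```python
from collections import Counter
from math import factorial

def symmetry_factor(species):
    """S(n): product over particle species of (multiplicity)!."""
    result = 1
    for multiplicity in Counter(species).values():
        result *= factorial(multiplicity)
    return result

print(symmetry_factor(["photon", "photon", "electron"]))  # 2
print(symmetry_factor(["photon", "electron", "proton"]))  # 1
print(symmetry_factor([]))                                 # 1 (the vacuum)
```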
As you can already see the notation for multiparticle states is somewhat tedious, so following Weinberg
we’ll adopt an abbreviated notation where a multiparticle state is simply called |α⟩, the inner product (8.21)
is written as
⟨α|β⟩ = δ(α − β), (8.23)
and the resolution of the identity (8.22) is written as
$$I = \int d\alpha\, |\alpha\rangle\langle\alpha|. \tag{8.24}$$

8.3 Multiparticle states in interacting theories


Now we come to the crucial question: to what extent do interacting quantum field theories have particles?
It is clear that asking for a complete basis of states of the form (8.18) with Poincaré transformation (8.19) is
asking for too much: in particular taking Λ = 1 and a to be a time translation, (8.19) would imply that the
energy of a multiparticle state is just the sum of the single-particle energies while we know that this isn’t
true e.g. because of potential energy between particles. (8.19) would also imply that the number of particles
is conserved with time, while we know clearly from experiment that it is not. On the other hand there is no
such objection to interacting theories having one-particle states obeying (8.5). For example the hydrogen
atom is a completely stable one-particle state in quantum electrodynamics, as is the electron and the proton.
I emphasize that from the point of view of scattering theory there is no difference between “fundamental”
particles such as the electron, that correspond to fields in the Lagrangian, and “bound state” particles such
as the hydrogen atom. After all one can never be sure that there isn’t a “more fundamental” theory in which
electrons are also bound states, as indeed is true for protons (which are bound states of quarks and gluons).
Given the existence of one-particle states it is natural to hope for the existence of multiparticle states,
at least in the limit that the particles live in wave packets that are localized far away from each other.
Moreover in this limit we can hope for the energy and momentum of the particles to indeed be additive. In
order to realize these hopes however, we need the interactions between the particles to fall off sufficiently
fast with distance. This is always true in theories where all particles are massive, and that is the regime

Figure 23: In and out states in scattering: in an in state |α, +⟩ we have a definite particle configuration at
t → −∞, while in an out state |β, −⟩ we have a definite particle configuration at t → ∞. In general an in
state evolves to a complicated superposition of out states, with the coefficients being given by the S-matrix.

where scattering theory is most clearly established. In theories with massless particles one can still try to use
scattering theory, but one often encounters “infrared divergences” when doing so and these typically need to
be dealt with on a somewhat case-by-case basis. Those of you who have studied scattering in non-relativistic
quantum mechanics should already be familiar with this problem, as attempts to treat scattering off of a
Coulomb potential using standard methods break down due to logarithmic divergences. Our approach will
be simply to proceed with the assumption that additive multiparticle states exist, with the understanding
that this will sometimes lead to trouble with massless particles that we will need to address when it arises.
We formalize this as follows:
• A quantum mechanical theory with Hamiltonian H has a scattering description if H has a complete
set of “in state” eigenstates, denoted |α, +⟩, and also a complete set of “out state” eigenstates |α, −⟩,
both with eigenvalues
$$H|\alpha,\pm\rangle = E_\alpha |\alpha,\pm\rangle, \tag{8.25}$$
where $E_\alpha$ are the eigenvalues of a non-interacting multiparticle Hamiltonian $H_0$ with eigenstates |α⟩,
such that we have
$$\lim_{t\to\mp\infty} \int d\alpha\, g(\alpha)\, e^{-iE_\alpha t} |\alpha,\pm\rangle = \lim_{t\to\mp\infty} \int d\alpha\, g(\alpha)\, e^{-iE_\alpha t} |\alpha\rangle \tag{8.26}$$
for arbitrary smooth (and integrable) wave packets g(α) which respect the exchange symmetry of any
identical particles.
What this definition says is that wave packets of the in states look like non-interacting multiparticle eigen-
states at early times, while wave packets of the out states look like non-interacting multiparticle eigenstates
at late times. The basic idea is illustrated in figure 23.
One immediate consequence of this definition is that the inner product of in states with in states and out
states with out states is the same as for non-interacting particles:

⟨β, ±|α, ±⟩ = δ(β − α), (8.27)

which follows because the inner product is time-independent so we can compute at early/late times for in/out
states where they coincide with the non-interacting eigenstates. More interesting is the overlap between in
and out states, which by definition is called the S-matrix:

Sβα := ⟨β, −|α, +⟩. (8.28)

The S-matrix is the primary object of interest in scattering theory; it tells us the quantum amplitude to
find the system in an out state |β, −⟩ given that it started in an in state |α, +⟩. More earthily, the S-matrix

provides the answer to a question well known to children everywhere: if you take some stuff and slam it
together, what comes out? Many physicists, who after all have much in common with children, spend their
days studying precisely this question.
A very important property of this S-matrix is that it is unitary, which follows immediately from its
definition since it is a change of basis between two complete sets of orthonormal states. We can also check
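this explicitly:
$$\int d\beta\, S^*_{\beta\alpha} S_{\beta\gamma} = \int d\beta\, \langle\alpha,+|\beta,-\rangle\langle\beta,-|\gamma,+\rangle = \langle\alpha,+|\gamma,+\rangle = \delta(\alpha-\gamma). \tag{8.29}$$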

One reason why the unitarity of the S-matrix is interesting is that if we have a perturbative expansion
for S then the unitarity constraint mixes different orders in perturbation theory, which sometimes lets us
determine higher-order contributions from lower-order ones.
It will be useful in what follows to write down the Lorentz transformation of the S-matrix: from (8.19)
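we have (in more explicit notation)
$$\begin{aligned} \langle p_1',\sigma_1',n_1';\ldots;p_N',\sigma_N',n_N',-|p_1,\sigma_1,n_1;\ldots;p_M,\sigma_M,n_M,+\rangle = {} & \prod_{i=1}^{N} \left( \sqrt{\frac{(\Lambda p_i')^0}{p_i'^0}} \sum_{\bar\sigma_i'} D^{n_i'}_{\bar\sigma_i',\sigma_i'}(W_{p_i'})^* \right) \prod_{j=1}^{M} \left( \sqrt{\frac{(\Lambda p_j)^0}{p_j^0}} \sum_{\bar\sigma_j} D^{n_j}_{\bar\sigma_j,\sigma_j}(W_{p_j}) \right) \\ & \times \langle \Lambda p_1',\bar\sigma_1',n_1';\ldots;\Lambda p_N',\bar\sigma_N',n_N',-|\Lambda p_1,\bar\sigma_1,n_1;\ldots;\Lambda p_M,\bar\sigma_M,n_M,+\rangle, \end{aligned} \tag{8.30}$$
with
$$W_p = L_{\Lambda p}^{-1} \Lambda L_p. \tag{8.31}$$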
Due to the unitarity of the little group representations this formula simplifies if we take the absolute value
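squared and sum over all initial and final spins/helicities:
$$\begin{aligned} \sum_{\sigma_1'\ldots\sigma_N',\,\sigma_1\ldots\sigma_M} |\langle p_1',\sigma_1',n_1';\ldots;p_N',\sigma_N',n_N',-|p_1,\sigma_1,n_1;\ldots;p_M,\sigma_M,n_M,+\rangle|^2 = {} & \prod_{i=1}^{N} \left( \frac{(\Lambda p_i')^0}{p_i'^0} \right) \prod_{j=1}^{M} \left( \frac{(\Lambda p_j)^0}{p_j^0} \right) \\ & \times \sum_{\sigma_1'\ldots\sigma_N',\,\sigma_1\ldots\sigma_M} |\langle \Lambda p_1',\sigma_1',n_1';\ldots;\Lambda p_N',\sigma_N',n_N',-|\Lambda p_1,\sigma_1,n_1;\ldots;\Lambda p_M,\sigma_M,n_M,+\rangle|^2. \end{aligned} \tag{8.32}$$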

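In other words, in our condensed notation, if we define
$$\tilde S_{\beta\alpha} = \left( \prod_{i=1}^{N_\beta} \sqrt{2E_{\beta,i}} \right) \left( \prod_{j=1}^{N_\alpha} \sqrt{2E_{\alpha,j}} \right) S_{\beta\alpha}, \tag{8.33}$$
where $E_{\alpha,j}$ is the energy of the jth ingoing particle and $E_{\beta,i}$ is the energy of the ith outgoing particle, then
the quantity
$$\sum_{\text{spin/helicity}} |\tilde S_{\beta\alpha}|^2 \tag{8.34}$$
is Lorentz invariant.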

8.4 Cross sections and decay rates


Before considering how to compute the S-matrix in quantum field theory, we will first make an aside to
explain how to use it. Experimentalists typically do not report the outcomes of scattering experiments
directly in terms of the S-matrix, but rather in terms of cross sections and decay rates. In this section we
will work out how to relate these to the S-matrix.59
The first thing we need to understand is why experimentalists do not directly measure the S-matrix. The
basic issue is that the S-matrix is always proportional to a momentum-conserving δ-function, so the quantity
59 Cross sections will not be much of a focus in this class, but it is still good to know what they are!

|Sβα |2 , which naively is the probability to find an out state |β, −⟩ given that we start in an in state |α, +⟩,
is infinite. To see the necessity of this δ-function, we can observe that

$$S_{\beta\alpha} = \langle\beta,-|e^{iP\cdot a} e^{-iP\cdot a}|\alpha,+\rangle = e^{i(p_\beta - p_\alpha)\cdot a} S_{\beta\alpha}, \tag{8.35}$$

for all spacetime vectors a, which is only possible if Sβα vanishes whenever pα ≠ pβ. Here pα and pβ are the total
spacetime momenta of the in and out states (α and β label the states, they are NOT Lorentz indices).
Moreover if we integrate Sβα against generic normalized wave packets we expect a finite and nonzero answer
since we are computing an overlap of normalized states which have no reason to be orthogonal in general, so
whatever support Sβα has when pα = pβ must be strong enough to integrate to a nonzero result - in other
words there must be a δ-function. It is convenient to extract this δ-function, and also the non-interacting
contribution δ(β − α), in hopes that the remaining part of Sβα is nonsingular:

$$S_{\beta\alpha} = \delta(\beta - \alpha) + i (2\pi)^d \delta^d(p_\beta - p_\alpha)\, \mathcal{M}_{\beta\alpha}. \tag{8.36}$$

The factor of i here is conventional; I offer no explanation for it. Mβα can still have further δ-function
singularities, but only when a subset of the particles in α has exactly the same spacetime momenta as a
subset of the particles in β. These additional singularities can be removed by defining a “connected” S-matrix
in exactly the same way we did for correlation functions. To avoid doing this, we will just restrict to studying the
S-matrix away from these special kinematic points.
To understand what to do about the divergence of |Sβα |2 , we first need to realize that it is an infrared
divergence: the quantity δ d−1 (0) is infinity in momentum space because we are working in infinite volume.
In finite volume the inner product of our one-particle states is
$$\langle p',\sigma'|p,\sigma\rangle = \delta_{\sigma\sigma'} \int d^{d-1}x\, e^{i(\vec p - \vec p\,')\cdot\vec x} = V \delta_{\sigma\sigma'} \delta_{\vec p\, \vec p\,'}, \tag{8.37}$$

so we can formally interpret $\delta^{d-1}(0)$ as the volume of space. The square of the S-matrix is therefore diverging
because we defined it as an overlap of states whose norms are of order $\sqrt{V}$ instead of states whose norms are
one. The principled way to fix this is to only consider scattering of normalizable wave packets. In particular
the quantity
$$\left| \int d\alpha\, g(\alpha)\, S_{\beta\alpha} \right|^2 \tag{8.38}$$
with
$$\int d\alpha\, |g(\alpha)|^2 = 1 \tag{8.39}$$
should be finite. A somewhat lazier approach, but which also works and leads more quickly to the same
answer, is to just work in finite volume and stick to momentum eigenstates. Indeed in finite volume we can
define properly normalized in and out states
$$|\alpha,\pm\rangle_V = \frac{1}{V^{N_\alpha/2}} |\alpha,\pm\rangle, \tag{8.40}$$
in terms of which the differential transition probability from α to β is
$$dP(\alpha\to\beta) = |\langle\beta,-|\alpha,+\rangle_V|^2\, dN_\beta = \frac{|S_{\beta\alpha}|^2}{V^{N_\alpha}}\, d\beta, \tag{8.41}$$
where
$$dN_\beta = V^{N_\beta} d\beta \tag{8.42}$$
is the number of states in the infinitesimal phase space window dβ. To derive this, recall that for a single
particle momentum in finite volume we have
$$dN = V \frac{d^{d-1}p}{(2\pi)^{d-1}} \tag{8.43}$$

since momenta in the box are quantized (for example if we take it to be a square torus of length L) as
$$\vec p = \frac{2\pi \vec n}{L} \tag{8.44}$$
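with $\vec n$ a spatial vector of integers. If we avoid choices of α and β where Mβα has additional δ-functions (or
just focus on connected scattering), then we can write this as
$$dP(\alpha\to\beta) = V^{-N_\alpha} \left[ (2\pi)^d \delta^d(p_\beta - p_\alpha) \right]^2 |\mathcal{M}_{\beta\alpha}|^2\, d\beta. \tag{8.45}$$

To deal with the square of the δ-function, we notice that in finite volume we can write it as
$$\begin{aligned} \left[ (2\pi)^d \delta^d(p_\beta - p_\alpha) \right]^2 &= \left[ V \delta_{\vec p_\alpha \vec p_\beta}\, T \delta_{E_\alpha E_\beta} \right]^2 \\ &= VT \times VT\, \delta_{\vec p_\alpha \vec p_\beta} \delta_{E_\alpha E_\beta} \\ &= VT\, (2\pi)^d \delta^d(p_\beta - p_\alpha), \end{aligned} \tag{8.46}$$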

where T is the total time elapsed in the scattering process (i.e. we work in a “time box” as well as a spatial
box). The transition rate, which is the transition probability per unit time, is therefore given by

$$d\Gamma(\alpha\to\beta) = V^{1-N_\alpha} (2\pi)^d \delta^d(p_\beta - p_\alpha)\, |\mathcal{M}_{\beta\alpha}|^2\, d\beta. \tag{8.47}$$

We’ve now succeeded in pushing all infrared divergences into an overall power of the volume, as everything
else appearing here is sensible in the large volume limit.
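
The state counting (8.43) that underlies these volume factors is easy to check numerically. The sketch below assumes one spatial dimension, so that V = L and the allowed momenta are p = 2πn/L with n an integer. The function names `count_states` and `continuum_estimate` are hypothetical.

```python
import math

def count_states(L, p_max):
    """Number of allowed momenta p = 2*pi*n/L (n an integer) with |p| <= p_max."""
    n_max = math.floor(p_max * L / (2 * math.pi))
    return 2 * n_max + 1

def continuum_estimate(L, p_max):
    """V times the integral of dp/(2*pi) over |p| <= p_max, with V = L."""
    return L * (2 * p_max) / (2 * math.pi)

for L in (10.0, 100.0, 1000.0):
    print(L, count_states(L, 3.0), continuum_estimate(L, 3.0))
```

The exact count and the continuum estimate never differ by more than one state, so their ratio approaches one as the box grows. This is the sense in which sums over box momenta become V times momentum integrals at large volume.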
To proceed further, we need to think a bit about how to connect this setup to what experimentalists
actually do. The easiest case is Nα = 1, for which the power of V just cancels. We are then studying the
decay of an unstable particle, whose differential decay rate into a final state β is apparently given by

$$d\Gamma(\alpha\to\beta) = (2\pi)^d \delta^d(p_\beta - p_\alpha)\, |\mathcal{M}_{\beta\alpha}|^2\, d\beta. \tag{8.48}$$

I must confess however that this formula (although correct when interpreted properly) is a bit of a cheat: an
unstable particle isn’t really a one-particle state of the theory in infinite volume, so we can’t really interpret
Mβα as part of the S-matrix. On the other hand if the total decay rate
$$\Gamma = \int d\beta\, (2\pi)^d \delta^d(p_\beta - p_\alpha)\, |\mathcal{M}_{\beta\alpha}|^2 \tag{8.49}$$

is small compared to the inverse of our time interval T then we can effectively treat the particle as stable
in our box setup. This formula should therefore be correct as long as the lifetime of the particle is long
compared to all other scales in the problem.60
A somewhat more complicated (but also better defined) case is Nα = 2, which is the scattering of two
particles to many. The classic setup for this experiment is shown in figure 24: we have a beam of incident
identical particles with momentum p1 aimed at a target particle with momentum p2 = (m2 , ⃗0). The natural
thing to measure is the differential rate for scattering into the out state β divided by the incident flux, which
is called the differential cross section:
$$d\sigma(\alpha\to\beta) = \frac{d\Gamma(\alpha\to\beta)}{f_\alpha}. \tag{8.50}$$
Since we are working in the rest frame of the target particle, the incident flux fα is given by

$$f_\alpha = |\vec v_1|\, \rho_1 \tag{8.51}$$


60 There is a more rigorous justification for this based on defining an unstable particle as a resonance in the scattering of

stable particles and Mαβ in terms of the residue of this resonance, but we won’t explore it here. If you just compute Mαβ using
the Feynman rules we’ll find below extrapolated to the case of one ingoing particle then you will get the right answer.

Figure 24: Fixed-target scattering: an incident beam of identical particles (shown in red) with identical
momenta is scattered off of a single target particle (shown in blue). What is the rate at which each out
state β is produced per unit incident flux? The answer to this question is the differential cross section dσ/dβ.

where ⃗v1 is the velocity of incident particles and ρ1 is their density. In a general Lorentz frame (which after
all we had better include since the target particle could be massless) the flux is instead defined to be
fα = uα ρ1 , (8.52)
where
$$u_\alpha = \frac{\sqrt{(p_1\cdot p_2)^2 - m_1^2 m_2^2}}{E_1 E_2} \tag{8.53}$$
is called the relative velocity. You will show in the homework that when p⃗1 and p⃗2 are collinear (i.e.
proportional to each other) then we have
uα = |⃗v1 − ⃗v2 |. (8.54)
For general p1 , p2 the motivation for this definition of flux is that it makes the spin-summed differential
cross section Lorentz invariant, as we will see in a moment. Returning now to our box setup with Nα = 2,
our one-particle box states |α, ∓⟩V are properly normalized so the number of particles in the box in such a
state is one. The particle density is therefore 1/V , and we can think of the transition rate (8.47) as arising
from a beam of particle one with density ρ1 = 1/V and flux uα/V scattering off of the other particle in the
box (wherever it is) just as in figure 24. The differential cross section is therefore given by
$$d\sigma(\alpha\to\beta) = u_\alpha^{-1} (2\pi)^d \delta^d(p_\beta - p_\alpha)\, |\mathcal{M}_{\beta\alpha}|^2\, d\beta. \tag{8.55}$$
This formula is used anytime someone wants to compare a theoretical calculation of two-particle scattering
to experiment!
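
As a numerical spot check (not a proof) of the relative velocity (8.53), the sketch below evaluates it for collinear momenta and compares with |v1 − v2| as in (8.54), and also for a target at rest, where it should reduce to |v1|. It assumes three spatial dimensions and the mostly-plus metric. The masses, momenta and function names are arbitrary illustrative choices.

```python
import math

def energy(m, p):
    return math.sqrt(m * m + sum(x * x for x in p))

def minkowski_dot(m1, p1, m2, p2):
    """p1.p2 in the mostly-plus signature, for on-shell momenta."""
    return -energy(m1, p1) * energy(m2, p2) + sum(a * b for a, b in zip(p1, p2))

def relative_velocity(m1, p1, m2, p2):
    """u_alpha as defined in (8.53)."""
    dot = minkowski_dot(m1, p1, m2, p2)
    return math.sqrt(dot * dot - (m1 * m2) ** 2) / (energy(m1, p1) * energy(m2, p2))

def velocity(m, p):
    e = energy(m, p)
    return [x / e for x in p]

def norm(v):
    return math.sqrt(sum(x * x for x in v))

m1, m2 = 0.5, 2.0
p1 = [1.3, 0.0, 0.0]
p2 = [-0.4, 0.0, 0.0]                      # collinear with p1
v1, v2 = velocity(m1, p1), velocity(m2, p2)
u_collinear = relative_velocity(m1, p1, m2, p2)
v_difference = norm([a - b for a, b in zip(v1, v2)])
u_fixed_target = relative_velocity(m1, p1, m2, [0.0, 0.0, 0.0])
print(u_collinear, v_difference)
print(u_fixed_target, norm(v1))
```

Note that for a head-on collision this "relative velocity" can exceed one; it is a bookkeeping quantity for the flux, not the velocity of one particle as seen from the rest frame of the other.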
Let’s briefly consider the Lorentz transformation properties of the differential cross section. Mβα has
the same Lorentz transformation properties as Sβα, and we saw in equation (8.32) that the Lorentz transformation
of |Sβα|² is simple once we sum over spins/helicities and multiply by the product of initial and
final energies of each particle. More concretely, if we define
$$\tilde{\mathcal{M}}_{\beta\alpha} = \left( \prod_{i=1}^{N_\beta} \sqrt{2E_{\beta,i}} \right) \left( \prod_{j=1}^{N_\alpha} \sqrt{2E_{\alpha,j}} \right) \mathcal{M}_{\beta\alpha}, \tag{8.56}$$
then
$$\sum_{\text{spin/helicity}} |\tilde{\mathcal{M}}_{\beta\alpha}|^2 \tag{8.57}$$
is Lorentz invariant. We also saw in lecture four (and mentioned below equation (8.10)) that the quantity
$$\widetilde{d\beta} = \frac{d\beta}{\prod_{i=1}^{N_\beta} 2E_{\beta,i}} \tag{8.58}$$

is Lorentz invariant, where again Eβ,i is the energy of the ith particle in the final state. We therefore are
motivated to sum (8.55) over initial and final spins/helicities and then rewrite it as
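$$\sum_{\text{spin/helicity}} d\sigma(\alpha\to\beta) = \frac{1}{4\sqrt{(p_1\cdot p_2)^2 - m_1^2 m_2^2}} \times (2\pi)^d \delta^d(p_\beta - p_\alpha) \times \sum_{\text{spin/helicity}} |\tilde{\mathcal{M}}_{\beta\alpha}|^2 \times \widetilde{d\beta}, \tag{8.59}$$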

where the right-hand side is now a product of manifestly Lorentz-invariant quantities (in particular because
of our definition (8.53) of the relative velocity). Thus we see that the spin-summed differential cross section
is Lorentz invariant!
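The invariance of the flux factor in (8.59) can be checked numerically. The following is a minimal sketch in 1+1 dimensions, assuming the mostly-plus signature; the helper names (`boost_x`, `dot`, `on_shell`, `flux_factor`) and the sample masses, momenta, and boost velocity are illustrative choices, not part of the lecture. It also compares against the collinear form 4 E1 E2 |v1 − v2|, which is what one would expect when the relative velocity reduces to |v1 − v2|.

```python
import math

def boost_x(p, beta):
    """Boost a (1+1)-dimensional momentum p = (E, px) with velocity beta along x."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    E, px = p
    return (gamma * (E - beta * px), gamma * (px - beta * E))

def dot(p, q):
    """Minkowski product in mostly-plus signature: p.q = -E_p E_q + p_x q_x."""
    return -p[0] * q[0] + p[1] * q[1]

def on_shell(m, px):
    """On-shell momentum (E, px) for a particle of mass m."""
    return (math.sqrt(m * m + px * px), px)

def flux_factor(p1, p2, m1, m2):
    """The combination 4*sqrt((p1.p2)^2 - m1^2 m2^2) from (8.59)."""
    return 4.0 * math.sqrt(dot(p1, p2) ** 2 - (m1 * m2) ** 2)

# Illustrative (hypothetical) kinematics.
m1, m2 = 1.0, 2.0
p1, p2 = on_shell(m1, 3.0), on_shell(m2, -1.5)

f_lab = flux_factor(p1, p2, m1, m2)
f_boosted = flux_factor(boost_x(p1, 0.6), boost_x(p2, 0.6), m1, m2)

# Collinear form 4 E1 E2 |v1 - v2| with v = px / E.
v1, v2 = p1[1] / p1[0], p2[1] / p2[0]
f_collinear = 4.0 * p1[0] * p2[0] * abs(v1 - v2)

print(f_lab, f_boosted, f_collinear)
```

All three printed numbers should agree up to rounding, since the first two differ only by a boost and the third is the same quantity rewritten in terms of velocities.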
You perhaps are wondering about the physical motivation for summing over initial and final spins/helicities.
In fact what we really should do is sum over final spins/helicities and average over initial spins/helicities,
for the following reasons:
• Typically the method for preparing a beam of particles does not preferentially treat one spin/helicity
state over another. We therefore should expect the initial state in a scattering process to be a mixed
quantum state where all spin/helicity configurations are equally likely, in which case we should average
over initial spins/helicities in the transition rate.
• Typically in measuring the final state we do not get a good measurement of the spins/helicities of the
particles. We should therefore sum over these to compute the transition rate which does not distinguish
between different spins/helicities.
In situations where either of these statements is not the case, then we need to deal with the full differential
cross section.

8.5 Unitarity and the optical theorem


The unitarity of the S-matrix has some nice consequences for cross sections and decay rates. We can first
observe that
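\[
\begin{aligned}
\delta(\gamma-\alpha) &= \int d\beta\, S^*_{\beta\gamma} S_{\beta\alpha}\\
&= \int d\beta\,\Big[\delta(\beta-\gamma) - i(2\pi)^d\delta^d(p_\beta-p_\gamma)\mathcal{M}^*_{\beta\gamma}\Big]\Big[\delta(\beta-\alpha) + i(2\pi)^d\delta^d(p_\beta-p_\alpha)\mathcal{M}_{\beta\alpha}\Big]\\
&= \delta(\gamma-\alpha) + (2\pi)^d\delta^d(p_\gamma-p_\alpha)\left[i\mathcal{M}_{\gamma\alpha} - i\mathcal{M}^*_{\alpha\gamma} + \int d\beta\,(2\pi)^d\delta^d(p_\beta-p_\alpha)\mathcal{M}^*_{\beta\gamma}\mathcal{M}_{\beta\alpha}\right].
\end{aligned}\tag{8.60}
\]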

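The δ(γ − α) terms cancel on both sides, so the remaining equality tells us that for any states α and γ such
that pγ = pα we should have
\[
i\mathcal{M}_{\gamma\alpha} - i\mathcal{M}^*_{\alpha\gamma} + \int d\beta\,(2\pi)^d\delta^d(p_\beta-p_\alpha)\mathcal{M}^*_{\beta\gamma}\mathcal{M}_{\beta\alpha} = 0. \tag{8.61}
\]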

In particular this is true if α = γ, in which case we find


\[
\mathrm{Im}\,\mathcal{M}_{\alpha\alpha} = \frac{1}{2}\int d\beta\,(2\pi)^d\delta^d(p_\beta-p_\alpha)\,|\mathcal{M}_{\beta\alpha}|^2. \tag{8.62}
\]
From equation (8.47) we can rewrite the right-hand side of this equation in terms of the total transition rate

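\[
\Gamma(\alpha) = \int d\beta\,\frac{d\Gamma(\alpha\to\beta)}{d\beta}, \tag{8.63}
\]
which gives
\[
\Gamma(\alpha) = 2V^{1-N_\alpha}\,\mathrm{Im}\,\mathcal{M}_{\alpha\alpha}. \tag{8.64}
\]
In particular for Nα = 1 we have
\[
\Gamma(\alpha) = 2\,\mathrm{Im}\,\mathcal{M}_{\alpha\alpha}, \tag{8.65}
\]
so the decay rate (the inverse lifetime) of an unstable particle is just two times the imaginary part of its forward scattering amplitude.
For Nα = 2 we can rewrite things in terms of the total cross section
\[
\sigma(\alpha) = \int d\beta\,\frac{d\sigma(\alpha\to\beta)}{d\beta} = \frac{V\,\Gamma(\alpha)}{u_\alpha}, \tag{8.66}
\]
which gives
\[
\sigma(\alpha) = \frac{2}{u_\alpha}\,\mathrm{Im}\,\mathcal{M}_{\alpha\alpha}. \tag{8.67}
\]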

This last result is called the optical theorem. Both (8.65) and (8.67) express the idea that by unitarity
any decay or scattering which is possible must decrease the probability that no scattering happens, which is
certainly a reasonable thing to expect!
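The logic behind (8.62) can be illustrated in a finite-dimensional toy model. The sketch below is an assumption-laden caricature: it replaces the continuum of states by two discrete states, drops the momentum-conserving δ-functions and volume factors, and defines M through S = 1 + iM. The function names and the sample angle are hypothetical. Unitarity of S then forces 2 Im M_αα = Σ_β |M_βα|², the discrete analogue of the optical theorem.

```python
import math

def toy_s_matrix(theta):
    """A 2x2 unitary matrix standing in for the S-matrix of a two-state toy model."""
    c, s = math.cos(theta), math.sin(theta)
    return [[complex(c, 0.0), complex(0.0, s)],
            [complex(0.0, s), complex(c, 0.0)]]

def amplitude(S):
    """Define M by S = 1 + i M (discrete analogue; no delta functions or volume factors)."""
    n = len(S)
    return [[(S[b][a] - (1.0 if a == b else 0.0)) / 1j for a in range(n)]
            for b in range(n)]

def optical_sides(M, alpha):
    """Return (2 Im M_aa, sum_b |M_ba|^2), which unitarity requires to be equal."""
    lhs = 2.0 * M[alpha][alpha].imag
    rhs = sum(abs(M[beta][alpha]) ** 2 for beta in range(len(M)))
    return lhs, rhs

S = toy_s_matrix(0.7)   # illustrative mixing angle
M = amplitude(S)
lhs, rhs = optical_sides(M, 0)
print(lhs, rhs)
```

The two printed numbers agree: any probability that flows into "something happened" must be removed from the forward amplitude.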

8.6 Homework
1. The helicity of a photon in general dimensions is that of the vector representation of SO(d − 2), so a
photon in d dimensions has d − 2 independent values of σ (i.e. independent polarization states). How
would you interpret this in 1+1 and 2+1 dimensions? Hint: think about how the classical polarization
of an EM wave should work in these dimensions.
2. The helicity of a graviton in general dimensions is that of the representation of SO(d − 2) furnished
by a symmetric traceless two-tensor hij . How many independent polarizations does a graviton have in
d spacetime dimensions?
3. Check that our resolution (8.22) of the identity indeed acts as the identity on free scalar two-particle
states of the form $a^\dagger_{\vec p}\,a^\dagger_{\vec p\,'}|\Omega\rangle$.
4. Consider the scattering of a non-relativistic quantum particle off of a δ-function potential, with Hamiltonian
\[
H = \frac{p^2}{2m} + V_0\,\delta(x). \tag{8.68}
\]
You can assume V0 > 0. Give explicit formulas for the In and Out states of this theory, and compute
the S-matrix. Hint: to get a complete basis you need to consider incident waves from both the left and
the right, and you need to make sure your states are eigenstates of the full Hamiltonian. This theory
arises from the more general scattering theory we’ve considered in the limit where one of the incident
particles is infinitely massive and its interaction with the other particle has infinitely-short range.
5. Confirm that (8.32) follows from (8.30) by the unitarity of the little group representations.
6. Show that the relative velocity (8.53) becomes (8.54) in the “collinear” situation where the spatial
momenta are proportional to each other (possibly with opposite sign).

9 Scattering from correlation functions in quantum field theory
In the last lecture we studied scattering theory in quantum mechanics. In particular we encountered the
idea of “in” and “out” states |α, ±⟩ and the S-matrix

\[
S_{\beta\alpha} = \langle\beta,-|\alpha,+\rangle = \delta(\beta-\alpha) + i(2\pi)^d\delta^d(p_\beta-p_\alpha)\,\mathcal{M}_{\beta\alpha}. \tag{9.1}
\]

We also learned how to convert the S-matrix into observable transition rates such as the differential cross-
section
\[
d\sigma(\alpha \to \beta) = u_\alpha^{-1}\,(2\pi)^d\,\delta^d(p_\beta - p_\alpha)\,|\mathcal{M}_{\beta\alpha}|^2\,d\beta \tag{9.2}
\]
for two-particle scattering. In this lecture we return to quantum field theory, looking to answer two questions:
1. How can we tell when an interacting quantum field theory has a scattering description in terms of
particles?
2. In quantum field theories which do have a scattering description, how can we compute the S-matrix
starting from the correlation functions?
We will see that the answer to the first of these questions is that the existence of particles in a quantum
field theory leads to poles in the Fourier transform of its two-point functions, and the answer to the second
is given by the LSZ reduction formula. Once we establish these tools, we will finally be in a position to
compute genuine observables in interacting quantum field theories for comparison with experiment!61

9.1 Exact two-point function in interacting quantum field theory


As a warmup to our discussion of scattering, we’ll begin with a general discussion of the structure of the
two-point function of Heisenberg operators in interacting quantum field theory. The idea is to consider
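\[
\langle T O_2^{a_2}(k_2)\,O_1^{a_1}(k_1)\rangle_\epsilon = \int dx_1\int dx_2\; e^{-i(k_2\cdot x_2+k_1\cdot x_1)}\,e^{-\epsilon|t_2-t_1|}\,\langle\Omega|T O_2^{a_2}(x_2)\,O_1^{a_1}(x_1)|\Omega\rangle, \tag{9.3}
\]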

where O1a1 and O2a2 are local operators that transform in irreducible representations of the Lorentz group:
\[
U^\dagger(\Lambda)\,O_i^{a_i}(x)\,U(\Lambda) = \sum_{b_i} D_i^{a_i b_i}(\Lambda)\,O_i^{b_i}(\Lambda^{-1}x), \tag{9.4}
\]

with i = 1, 2. We will assume that


⟨Ω|Oiai (x)|Ω⟩ = 0, (9.5)
which follows from Lorentz symmetry unless Oi is a scalar and which we can achieve in the scalar case by
redefining Oi by an additive shift. The convergence factor e−ϵ|t2 −t1 | is useful to include, as we will soon see.
In a theory with a scattering description we can evaluate this two-point function by inserting a complete
set of in/out states:
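\[
\begin{aligned}
\langle T O_2^{a_2}(k_2)O_1^{a_1}(k_1)\rangle_\epsilon = \int d\alpha\int dx_1\int dx_2\; & e^{-i(k_2\cdot x_2+k_1\cdot x_1)}\,e^{-\epsilon|t_2-t_1|}\,\Big[\Theta(t_2-t_1)\,\langle\Omega|O_2^{a_2}(x_2)|\alpha,\pm\rangle\langle\alpha,\pm|O_1^{a_1}(x_1)|\Omega\rangle\\
&+(-1)^{f_O}\,\Theta(t_1-t_2)\,\langle\Omega|O_1^{a_1}(x_1)|\alpha,\pm\rangle\langle\alpha,\pm|O_2^{a_2}(x_2)|\Omega\rangle\Big],
\end{aligned}\tag{9.6}
\]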
where fO = 1 if O1 and O2 are fermionic and fO = 0 if they are bosonic. We can simplify the matrix
elements appearing here by noting that

\[
O_i^{a_i}(x_i) = e^{-iP\cdot x_i}\,O_i^{a_i}(0)\,e^{iP\cdot x_i}, \tag{9.7}
\]


61 My apologies that it is taking such a long time to get there. The basic curse of quantum field theory pedagogy is that if

you want to get to practical applications quickly you won’t understand what you are doing, and in this class we’ve decided to
take our time and learn things properly.

and thus
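\[
\begin{aligned}
\langle\Omega|O_i^{a_i}(x_i)|\alpha,\pm\rangle &= e^{ip_\alpha\cdot x_i}\,\langle\Omega|O_i^{a_i}(0)|\alpha,\pm\rangle\\
\langle\alpha,\pm|O_i^{a_i}(x_i)|\Omega\rangle &= e^{-ip_\alpha\cdot x_i}\,\langle\alpha,\pm|O_i^{a_i}(0)|\Omega\rangle,
\end{aligned}\tag{9.8}
\]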
from which we have
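\[
\begin{aligned}
\langle T O_2^{a_2}(k_2)O_1^{a_1}(k_1)\rangle_\epsilon = \int d\alpha\,\Bigg[&\int_{t_2>t_1}dx_1\,dx_2\; e^{-i(k_1+p_\alpha)\cdot x_1-i(k_2-p_\alpha)\cdot x_2-\epsilon(t_2-t_1)}\,\langle\Omega|O_2^{a_2}(0)|\alpha,\pm\rangle\langle\alpha,\pm|O_1^{a_1}(0)|\Omega\rangle\\
&+(-1)^{f_O}\int_{t_2<t_1}dx_1\,dx_2\; e^{-i(k_1-p_\alpha)\cdot x_1-i(k_2+p_\alpha)\cdot x_2+\epsilon(t_2-t_1)}\,\langle\Omega|O_1^{a_1}(0)|\alpha,\pm\rangle\langle\alpha,\pm|O_2^{a_2}(0)|\Omega\rangle\Bigg].
\end{aligned}\tag{9.9}
\]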
The spatial integrals here give simple δ-functions, but the integrals over t1 and t2 are a bit trickier:
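\[
\begin{aligned}
\int_{-\infty}^{\infty}dt_1\int_{t_1}^{\infty}dt_2\; e^{i(k_1^0+p_\alpha^0-i\epsilon)t_1+i(k_2^0-p_\alpha^0+i\epsilon)t_2} &= \int_{-\infty}^{\infty}dt_1\; e^{i(k_1^0+k_2^0)t_1}\,\frac{i}{k_2^0-p_\alpha^0+i\epsilon}\\
&= 2\pi\delta(k_1^0+k_2^0)\times\frac{i}{k_2^0-p_\alpha^0+i\epsilon}
\end{aligned}\tag{9.10}
\]
and
\[
\begin{aligned}
\int_{-\infty}^{\infty}dt_2\int_{t_2}^{\infty}dt_1\; e^{i(k_1^0-p_\alpha^0+i\epsilon)t_1+i(k_2^0+p_\alpha^0-i\epsilon)t_2} &= \int_{-\infty}^{\infty}dt_2\; e^{i(k_1^0+k_2^0)t_2}\,\frac{i}{k_1^0-p_\alpha^0+i\epsilon}\\
&= 2\pi\delta(k_1^0+k_2^0)\times\frac{i}{k_1^0-p_\alpha^0+i\epsilon}.
\end{aligned}\tag{9.11}
\]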
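For readers who want the intermediate step in (9.10), here is a short worked version of the inner integral. It assumes only that ϵ > 0, so that the contribution from the upper limit vanishes.

```latex
\int_{t_1}^{\infty} dt_2\, e^{i(k_2^0 - p_\alpha^0 + i\epsilon)t_2}
  = \left[\frac{e^{i(k_2^0 - p_\alpha^0 + i\epsilon)t_2}}{i\,(k_2^0 - p_\alpha^0 + i\epsilon)}\right]_{t_1}^{\infty}
  = \frac{i\, e^{i(k_2^0 - p_\alpha^0 + i\epsilon)t_1}}{k_2^0 - p_\alpha^0 + i\epsilon},
\qquad
e^{i(k_1^0 + p_\alpha^0 - i\epsilon)t_1}\, e^{i(k_2^0 - p_\alpha^0 + i\epsilon)t_1} = e^{i(k_1^0 + k_2^0)t_1}.
```

The factors of $p_\alpha^0$ and $\epsilon$ cancel in the remaining exponent, which is why the final integral over $t_1$ produces a clean $2\pi\delta(k_1^0+k_2^0)$. The same steps with the roles of $t_1$ and $t_2$ exchanged give (9.11).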
We thus have
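\[
\begin{aligned}
\langle T O_2^{a_2}(k_2)O_1^{a_1}(k_1)\rangle_\epsilon = \int d\alpha\,\Bigg[&(2\pi)^{d-1}\delta^{d-1}(\vec k_2-\vec p_\alpha)\,(2\pi)^{d-1}\delta^{d-1}(\vec k_1+\vec p_\alpha)\,2\pi\delta(k_1^0+k_2^0)\,\frac{i}{k_2^0-p_\alpha^0+i\epsilon}\,\langle\Omega|O_2^{a_2}(0)|\alpha,\pm\rangle\langle\alpha,\pm|O_1^{a_1}(0)|\Omega\rangle\\
&+(-1)^{f_O}(2\pi)^{d-1}\delta^{d-1}(\vec k_2+\vec p_\alpha)\,(2\pi)^{d-1}\delta^{d-1}(\vec k_1-\vec p_\alpha)\,2\pi\delta(k_1^0+k_2^0)\,\frac{i}{k_1^0-p_\alpha^0+i\epsilon}\,\langle\Omega|O_1^{a_1}(0)|\alpha,\pm\rangle\langle\alpha,\pm|O_2^{a_2}(0)|\Omega\rangle\Bigg].
\end{aligned}\tag{9.12}
\]
The key things to notice here are the pole factors $\frac{i}{k_2^0-p_\alpha^0+i\epsilon}$ and $\frac{i}{k_1^0-p_\alpha^0+i\epsilon}$: what we will now show is that the
contribution to the α integral coming from one-particle states turns these poles into poles of the momentum-space
correlation function ⟨T O2a2 (k2 )O1a1 (k1 )⟩ϵ . For one-particle states we simply have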
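\[
\int d\alpha = \sum_{\sigma,n}\int\frac{d^{d-1}p}{(2\pi)^{d-1}} \tag{9.13}
\]
and
\[
p_\alpha = (\omega_{n,\vec p},\,\vec p), \tag{9.14}
\]
with
\[
\omega_{n,\vec p} = \sqrt{|\vec p|^2+m_n^2}. \tag{9.15}
\]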
Evaluating the momentum integrals we thus find
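\[
\begin{aligned}
\langle T O_2^{a_2}(k_2)O_1^{a_1}(k_1)\rangle_\epsilon \supset (2\pi)^d\delta(k_2+k_1)\sum_{n,\sigma}\Bigg(&\frac{i}{k_2^0-\omega_{n,\vec k_2}+i\epsilon}\,\langle\Omega|O_2^{a_2}(0)|k_2,\sigma,n\rangle\langle k_2,\sigma,n|O_1^{a_1}(0)|\Omega\rangle\\
&+(-1)^{f_O}\,\frac{i}{k_1^0-\omega_{n,\vec k_1}+i\epsilon}\,\langle\Omega|O_1^{a_1}(0)|k_1,\sigma,n\rangle\langle k_1,\sigma,n|O_2^{a_2}(0)|\Omega\rangle\Bigg).
\end{aligned}\tag{9.16}
\]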
The two-point function therefore has a pole whenever the external momenta go “on-shell” for any particle
species n for which the matrix elements do not vanish. This may seem technical, but in fact it is profound:
• The way we tell if a quantum field theory has particles is we look for on-shell poles in the Fourier
transform of the time-ordered two-point function: they exist if and only if the theory has one-particle
states, and we can determine the masses of the particles from the locations of the poles.

In particular I want to emphasize that nothing in this derivation assumed that the particles are “fundamental”
in the sense of being associated with fields in the Lagrangian: for example, in QED the hydrogen atom
contributes poles to two-point functions, and the same is true for protons in QCD.
What about other contributions to the α integral? The vacuum contribution vanishes because of (9.5).
We will not try to show it systematically, but the multi-particle states only contribute branch cuts, which in
massive theories are away from the “on-shell” poles at $k_2^0 = \pm\omega_{\vec k_2}$, so the pole contributions come only from
one-particle states. The basic idea is that the integral
\[
\int_{z_{\min}}^{z_{\max}}\frac{dz}{k^0-z+i\epsilon} = \log\left(\frac{k^0-z_{\min}+i\epsilon}{k^0-z_{\max}+i\epsilon}\right) \tag{9.17}
\]

has only logarithmic branch singularities, where here z stands for the integral over the additional momenta
in a multi-particle state and we’d have zmin = mn + ωn,⃗k so the branch point is at k 0 = mn + ωn,⃗k which
is different from ωn,⃗k unless mn = 0. When mn = 0 some more care is needed, but in essence we can still
distinguish a pole from a branch point even if they are right on top of each other.
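The closed form (9.17), and the fact that it describes a cut rather than a pole, can be checked with a few lines of code. This is only a numerical sketch: the function names, the midpoint rule, and the sample values of k0, zmin, zmax, and ϵ are illustrative choices. For k0 inside (zmin, zmax) the imaginary part tends to −π as ϵ → 0 (the discontinuity across the cut), while for k0 below zmin it tends to zero.

```python
import cmath
import math

def integral_numeric(k0, zmin, zmax, eps, steps=100000):
    """Midpoint-rule estimate of the integral of dz / (k0 - z + i*eps) from zmin to zmax."""
    h = (zmax - zmin) / steps
    total = 0j
    for j in range(steps):
        z = zmin + (j + 0.5) * h
        total += h / (k0 - z + 1j * eps)
    return total

def integral_closed_form(k0, zmin, zmax, eps):
    """Right-hand side of (9.17), using the principal branch of the logarithm."""
    return cmath.log((k0 - zmin + 1j * eps) / (k0 - zmax + 1j * eps))

# Compare the two at a moderate epsilon.
numeric = integral_numeric(1.5, 1.0, 3.0, 0.05)
closed = integral_closed_form(1.5, 1.0, 3.0, 0.05)

# Small-epsilon behaviour inside and outside the range of z.
inside = integral_closed_form(1.5, 1.0, 3.0, 1e-6)
outside = integral_closed_form(0.5, 1.0, 3.0, 1e-6)
print(numeric, closed, inside.imag, outside.imag)
```

The singular behaviour at the endpoints is only logarithmic, which is what distinguishes this multi-particle contribution from the simple poles produced by one-particle states.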

9.2 Matrix elements


Let’s now say something about the matrix elements ⟨Ω|Oa (0)|⃗k, σ, n⟩ and ⟨⃗k, σ, n|Oa (0)|Ω⟩ appearing in the
residues of the one-particle poles we just found (to make life simpler we have dropped the i indices for now
since we are only considering the Os one at a time). These are strongly constrained by Lorentz symmetry,
as we will now show.62 Taking Λ to be an element of the little group for k, i.e. a Lorentz transformation for
which Λk = k, we have
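\[
\langle\Omega|O^a(0)|k,\sigma,n\rangle = \langle\Omega|U(\Lambda)O^a(0)U^\dagger(\Lambda)U(\Lambda)|k,\sigma,n\rangle = \sum_{a',\sigma'}D^{aa'}(\Lambda^{-1})\,\hat D^n_{\sigma'\sigma}(\Lambda)\,\langle\Omega|O^{a'}(0)|k,\sigma',n\rangle. \tag{9.18}
\]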

Here I’ve put a hat on the little group representation matrices D̂n to distinguish them from the operator
representation matrices D appearing in (9.4), and used the little group transformation
\[
U(\Lambda)|k,\sigma,n\rangle = \sum_{\sigma'}\hat D^n_{\sigma'\sigma}(\Lambda)\,|k,\sigma',n\rangle. \tag{9.19}
\]

Viewing ⟨Ω|Oa (0)|k, σ, n⟩ as a matrix T aσ , we can write this as

T = D(Λ−1 )T D̂n (Λ), (9.20)

or equivalently
D(Λ)T = T D̂n (Λ). (9.21)
Similarly we have
\[
\langle k,\sigma,n|O^a(0)|\Omega\rangle = \sum_{a',\sigma'}D^{aa'}(\Lambda^{-1})\,\hat D^{n*}_{\sigma'\sigma}(\Lambda)\,\langle k,\sigma',n|O^{a'}(0)|\Omega\rangle, \tag{9.22}
\]

and so the matrix


\[
\tilde T^{a\sigma} = \langle k,\sigma,n|O^a(0)|\Omega\rangle \tag{9.23}
\]
62 This argument is somewhat mathematical, the answer is in equation (9.41) if you want to just trust me.

obeys
\[
D(\Lambda)\,\tilde T = \tilde T\,\hat D^{n*}(\Lambda). \tag{9.24}
\]
We have taken both $D$ and $\hat D^n$ to form irreducible representations of the little group, and equations (9.21)
and (9.24) can be interpreted in terms of group theory as saying that the matrices T and Te are intertwiners
between these irreducible representations. More precisely, T is an intertwiner from the D̂ representation to
the D representation and Te is an intertwiner from the conjugate of the D̂ representation to the D repre-
sentation. It is a theorem in group theory that nonzero intertwiners between finite-dimensional irreducible
representations exist only if the representations are equivalent by a similarity transformation, and moreover
that even in this case the intertwiner is unique up to an overall constant factor.63
In fact we have already discussed these intertwiners, in the context of free field theory. Given a particle
type n, we argued in the first two lectures that to make a relativistic quantum theory we should begin by
constructing a free field which annihilates that particle with the form
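\[
\Phi^a(x) = \sum_\sigma\int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{n,\vec p}}}\left[u^a(\vec p,\sigma,n)\,e^{ip\cdot x}\,a_{\vec p\sigma n} + v^a(\vec p,\sigma,n^c)\,e^{-ip\cdot x}\,a^\dagger_{\vec p\sigma n^c}\right]. \tag{9.25}
\]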

Here nc is the antiparticle of n (which may coincide with n or not), and the functions ua and v a are chosen
so that the field commutes/anticommutes at spacelike separation and we have the Lorentz transformation
\[
U^\dagger(\Lambda)\,\Phi^a(x)\,U(\Lambda) = \sum_{a'}D^{aa'}(\Lambda)\,\Phi^{a'}(\Lambda^{-1}x). \tag{9.26}
\]

Due to our little group transformation (9.19) we see that the creation and annihilation operators in this field
must transform as
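\[
\begin{aligned}
U(\Lambda)\,a^\dagger_{\vec p\sigma n^c}\,U(\Lambda^{-1}) &= \sum_{\sigma'}\hat D^{n^c}_{\sigma'\sigma}(\Lambda)\,a^\dagger_{\vec p\sigma' n^c}\\
U(\Lambda)\,a_{\vec p\sigma n}\,U(\Lambda^{-1}) &= \sum_{\sigma'}\hat D^{n*}_{\sigma'\sigma}(\Lambda)\,a_{\vec p\sigma' n},
\end{aligned}\tag{9.27}
\]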

where Λ is in the little group for p. In order for the field to have the Lorentz transformation (9.26), these
transformations must combine with ua and v a to give
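\[
\begin{aligned}
\sum_{\sigma'}u^a(\vec p,\sigma',n)\,\hat D^n_{\sigma'\sigma}(\Lambda) &= \sum_{a'}D^{aa'}(\Lambda)\,u^{a'}(\vec p,\sigma,n)\\
\sum_{\sigma'}v^a(\vec p,\sigma',n^c)\,\hat D^{n^c*}_{\sigma'\sigma}(\Lambda) &= \sum_{a'}D^{aa'}(\Lambda)\,v^{a'}(\vec p,\sigma,n^c).
\end{aligned}\tag{9.28}
\]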

In other words ua and v a are intertwiners, so by the uniqueness of intertwiners they must be proportional
to T and Te respectively:
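\[
\begin{aligned}
\langle\Omega|O^a(0)|k,\sigma,n\rangle &= A_n(\vec k)\,u^a(\vec k,\sigma,n)\\
\langle k,\sigma,n^c|O^a(0)|\Omega\rangle &= \tilde A_{n^c}(\vec k)\,v^a(\vec k,\sigma,n^c).
\end{aligned}\tag{9.29}
\]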

We can learn more about the proportionality functions An and A enc by considering general Lorentz
transformations.64 For general Λ instead of (9.18) we have
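\[
\langle\Omega|O^a(0)|k,\sigma,n\rangle = \sqrt{\frac{\omega_{n,\vec k_\Lambda}}{\omega_{n,\vec k}}}\sum_{a',\sigma'}D^{aa'}(\Lambda^{-1})\,\hat D^n_{\sigma'\sigma}(L^{-1}_{\Lambda k}\Lambda L_k)\,\langle\Omega|O^{a'}(0)|\Lambda k,\sigma',n\rangle, \tag{9.30}
\]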

63 If you know a little representation theory the proof is a fairly straightforward application of Schur’s lemmas, see theorem

A.5 in my long paper with Ooguri.


64 You may wonder why we restricted to the little group above if we need the general transformation anyways here: the reason

is that the uniqueness result for intertwiners only applies to finite-dimensional irreducible representations, and it is only the
little group which acts in a finite-dimensional representation on particle states.

which from (9.29) tells us that
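\[
A_n(\vec k)\,u^a(\vec k,\sigma,n) = \sqrt{\frac{\omega_{n,\vec k_\Lambda}}{\omega_{n,\vec k}}}\;A_n(\vec k_\Lambda)\sum_{a',\sigma'}D^{aa'}(\Lambda^{-1})\,\hat D^n_{\sigma'\sigma}(L^{-1}_{\Lambda k}\Lambda L_k)\,u^{a'}(\vec k_\Lambda,\sigma',n). \tag{9.31}
\]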

Here we are using the notation that Λ(ωn,⃗k , ⃗k) = (ωn,⃗kΛ , ⃗kΛ ), and as in the previous lecture Lp is the
Lorentz transformation which maps a reference momentum to p. Similarly from the transformation of
⟨k, σ, nc |Oa (0)|Ω⟩ we have
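\[
\tilde A_{n^c}(\vec k)\,v^a(\vec k,\sigma,n^c) = \sqrt{\frac{\omega_{n,\vec k_\Lambda}}{\omega_{n,\vec k}}}\;\tilde A_{n^c}(\vec k_\Lambda)\sum_{a',\sigma'}D^{aa'}(\Lambda^{-1})\,\hat D^{n^c*}_{\sigma'\sigma}(L^{-1}_{\Lambda k}\Lambda L_k)\,v^{a'}(\vec k_\Lambda,\sigma',n^c). \tag{9.32}
\]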

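We can simplify these by noting that in general the transformations of the creation and annihilation operators
are
\[
\begin{aligned}
U(\Lambda)\,a^\dagger_{\vec p\sigma n^c}\,U(\Lambda^{-1}) &= \sqrt{\frac{\omega_{n,\vec p_\Lambda}}{\omega_{n,\vec p}}}\sum_{\sigma'}\hat D^{n^c}_{\sigma'\sigma}(L^{-1}_{\Lambda p}\Lambda L_p)\,a^\dagger_{\vec p_\Lambda\sigma' n^c}\\
U(\Lambda)\,a_{\vec p\sigma n}\,U(\Lambda^{-1}) &= \sqrt{\frac{\omega_{n,\vec p_\Lambda}}{\omega_{n,\vec p}}}\sum_{\sigma'}\hat D^{n*}_{\sigma'\sigma}(L^{-1}_{\Lambda p}\Lambda L_p)\,a_{\vec p_\Lambda\sigma' n},
\end{aligned}\tag{9.33}
\]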

and that the Lorentz transformation (9.26) for the free field then implies (extra credit homework) that we
have
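\[
\begin{aligned}
\sum_{\sigma'}u^a(\vec k_\Lambda,\sigma',n)\,\hat D^n_{\sigma'\sigma}(L^{-1}_{\Lambda k}\Lambda L_k) &= \sum_{a'}D^{aa'}(\Lambda)\,u^{a'}(\vec k,\sigma,n)\\
\sum_{\sigma'}v^a(\vec k_\Lambda,\sigma',n^c)\,\hat D^{n^c*}_{\sigma'\sigma}(L^{-1}_{\Lambda k}\Lambda L_k) &= \sum_{a'}D^{aa'}(\Lambda)\,v^{a'}(\vec k,\sigma,n^c).
\end{aligned}\tag{9.34}
\]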

Using these in (9.31), (9.32) we see that we must have

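\[
\begin{aligned}
\sqrt{\omega_{n,\vec k}}\;A_n(\vec k) &= \sqrt{\omega_{n,\vec k_\Lambda}}\;A_n(\vec k_\Lambda)\\
\sqrt{\omega_{n,\vec k}}\;\tilde A_{n^c}(\vec k) &= \sqrt{\omega_{n,\vec k_\Lambda}}\;\tilde A_{n^c}(\vec k_\Lambda),
\end{aligned}\tag{9.35}
\]
and thus
\[
\begin{aligned}
A_n(\vec k) &= \frac{Z_n}{\sqrt{2\omega_{n,\vec k}}}\\
\tilde A_{n^c}(\vec k) &= \frac{\tilde Z_{n^c}}{\sqrt{2\omega_{n,\vec k}}}
\end{aligned}\tag{9.36}
\]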

where $Z_n$ and $\tilde Z_{n^c}$ are pure numbers. Moreover $Z_n$ and $\tilde Z_{n^c}$ are related by CRT symmetry: by CRT we
have
\[
\langle\Omega|O^a(0)|k,\sigma,n\rangle = \langle\Theta^\dagger_{CRT}\,k\sigma n|\,(\Theta^\dagger_{CRT}\,O^a(0)\,\Theta_{CRT})^\dagger\,|\Omega\rangle, \tag{9.37}
\]
where $\langle\Theta^\dagger_{CRT}\,k\sigma n|$ is the bra dual to the ket $\Theta^\dagger_{CRT}|k\sigma n\rangle$ and we have used that $\Theta_{CRT}$ is antiunitary and
leaves the vacuum unchanged. Recalling that for CRT we have
\[
(\Theta^\dagger_{CRT}\,O^a(0)\,\Theta_{CRT})^\dagger = i^{-f_O}\,D_E(RT)^{ab}\,O^b(0), \tag{9.38}
\]
and also that CRT maps particles to antiparticles, we see that (9.37) gives a proportionality relation between
$Z_n$ and $\tilde Z_{n^c}$. Working out the proportionality coefficient this way is a bit tricky (we’d need to sort out how
CRT acts on one-particle states), but once we know such a relation exists we can instead just determine the
coefficient by comparing to free field theory. There we have
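\[
\begin{aligned}
\langle\Omega|\Phi^a(0)|k,\sigma,n\rangle &= \frac{1}{\sqrt{2\omega_{n,\vec k}}}\,u^a(\vec k,\sigma,n)\\
\langle k,\sigma,n^c|\Phi^a(0)|\Omega\rangle &= \frac{1}{\sqrt{2\omega_{n,\vec k}}}\,v^a(\vec k,\sigma,n^c),
\end{aligned}\tag{9.39}
\]
so the coefficient of proportionality is one:
\[
Z_n = \tilde Z_{n^c}. \tag{9.40}
\]
Thus we at last have
\[
\begin{aligned}
\langle\Omega|O^a(0)|k,\sigma,n\rangle &= \frac{Z_n}{\sqrt{2\omega_{n,\vec k}}}\,u^a(\vec k,\sigma,n)\\
\langle k,\sigma,n^c|O^a(0)|\Omega\rangle &= \frac{Z_n}{\sqrt{2\omega_{n,\vec k}}}\,v^a(\vec k,\sigma,n^c),
\end{aligned}\tag{9.41}
\]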

which up to the overall factor of Zn are rather remarkably the same as we would have obtained simply from
replacing Oa by Φa and using free field theory!
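The step from (9.31) and (9.34) to (9.35) and (9.36) can be spelled out as follows. This is a sketch for the massive case, where we may additionally assume that the reference momentum is the rest-frame momentum; the shorthand $W$ for the little-group element is introduced here only for brevity.

```latex
% Shorthand (not in the text): W := L^{-1}_{\Lambda k} \Lambda L_k.
% Insert (9.34) into the right-hand side of (9.31):
\sum_{a',\sigma'} D^{aa'}(\Lambda^{-1})\, \hat D^{n}_{\sigma'\sigma}(W)\, u^{a'}(\vec k_\Lambda,\sigma',n)
  = \sum_{a',a''} D^{aa'}(\Lambda^{-1})\, D^{a'a''}(\Lambda)\, u^{a''}(\vec k,\sigma,n)
  = u^{a}(\vec k,\sigma,n).
% Hence (9.31) reduces to
A_n(\vec k)\, u^a(\vec k,\sigma,n)
  = \sqrt{\frac{\omega_{n,\vec k_\Lambda}}{\omega_{n,\vec k}}}\; A_n(\vec k_\Lambda)\, u^a(\vec k,\sigma,n)
\quad\Longrightarrow\quad
\sqrt{\omega_{n,\vec k}}\, A_n(\vec k) = \sqrt{\omega_{n,\vec k_\Lambda}}\, A_n(\vec k_\Lambda).
% Choosing \Lambda to boost k to rest (massive case, m_n > 0):
\sqrt{\omega_{n,\vec k}}\, A_n(\vec k) = \sqrt{m_n}\, A_n(\vec 0)
\quad\Longrightarrow\quad
A_n(\vec k) = \frac{Z_n}{\sqrt{2\omega_{n,\vec k}}}, \qquad Z_n := \sqrt{2 m_n}\, A_n(\vec 0).
```

The same manipulation applied to (9.32) with the second line of (9.34) gives the statement for $\tilde A_{n^c}$. The point is that $\sqrt{\omega}\,A$ is a Lorentz-invariant function of an on-shell momentum, and the only invariant available is the mass, so it must be a constant.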

9.3 Back to the two-point function


We can now at last go back to our expression (9.16) for the two-point function. Given our new knowledge
about the matrix elements, and now taking O1† = O2 = O, we have

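\[
\begin{aligned}
\langle T O^{a_2}(k_2)\,O^{a_1\dagger}(k_1)\rangle_\epsilon \supset (2\pi)^d\delta(k_2+k_1)\sum_n|Z_n|^2\Bigg(&\frac{i}{k_2^0-\omega_{n,\vec k_2}+i\epsilon}\,\frac{1}{2\omega_{n,\vec k_2}}\sum_\sigma u^{a_2}(\vec k_2,\sigma,n)\,u^{a_1*}(\vec k_2,\sigma,n)\\
&+(-1)^{f_O}\,\frac{i}{k_1^0-\omega_{n,\vec k_1}+i\epsilon}\,\frac{1}{2\omega_{n,\vec k_1}}\sum_\sigma v^{a_2}(\vec k_1,\sigma,n^c)\,v^{a_1*}(\vec k_1,\sigma,n^c)\Bigg),
\end{aligned}\tag{9.42}
\]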

where I again remind you that we are focusing on the contribution of the on-shell pole. You will show in
the homework that up to the factor of |Zn |2 , this is precisely the Fourier transform of the time-ordered two
point function of the free field (9.25):65

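\[
\langle T O^{a_2}(k_2)\,O^{a_1\dagger}(k_1)\rangle_\epsilon \xrightarrow{\;k_i^0\to\omega_{n,\vec k_i}\;} |Z_n|^2\,\langle T\Phi^{a_2}(k_2)\,\Phi^{a_1\dagger}(k_1)\rangle_\epsilon. \tag{9.43}
\]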

Thus we see that in the vicinity of an on-shell pole, the exact two-point function in any quantum field theory
with a particle description just becomes that of free field theory up to an overall factor! It is important to
make several comments about this however:
• In general the particles appearing here have nothing to do with the fields appearing in the Lagrangian.
The free fields we are discussing here may thus look nothing like the “true” fields appearing in the
Lagrangian.
65 If there are multiple types of particle with exactly the same mass and spin/helicity (besides just n and nc ) then O a could

create a superposition of them, in which case there could still be a sum over some subset of n here. In this case however we
can just redefine our basis of particle types to treat this superposition as its own type of particle. Typically this situation only
arises when there is a global symmetry to enforce the degeneracy, in which case this redefinition will just be a global symmetry
transformation.

• On the other hand in situations when the interactions are weak and we are interested in particles which
do correspond to fundamental fields, we indeed can (and will) choose Oa to just be the fundamental
field for the particle in question. It cannot be emphasized enough however that the mass mn appearing in
this formula is the genuine particle mass, not the mass parameter appearing in the Lagrangian. We
saw already at one loop in ϕ4 theory that these are not the same.
• Moreover the two-point function of a field whose kinetic term in the Lagrangian is normalized as
$-\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi$ (in the scalar case) will NOT have Zn = 1 in the interacting theory. We have not yet
computed enough loop diagrams to see this happen, but we eventually will (unfortunately in ϕ4 theory
one needs to go to two loops to see it). Rescaling the fundamental field to give us something whose
two-point function doesn’t have a factor of |Zn |2 is (for historical reasons) called wave function
renormalization.
• In a situation where the particle we want to create is not fundamental, it may not seem so clear which
operator Oa we should use to get a two-point function with a non-vanishing Zn . In fact it is easy: we
simply look for any local operator with the same symmetry charges as a free field which annihilates that
particle should have. So for example in QCD it is easy to construct a local operator out of quark and
gluon fields with the same symmetry transformations as a field that would annihilate the proton, and
we can just use that one even though it undoubtedly will create all sorts of mess in the multiparticle
states which do not contribute to the pole.

9.4 The LSZ reduction formula


Having worked at some length with the exact two-point function in interacting theories, we are now at last
ready to give a general algorithm for extracting the S-matrix from correlation functions in quantum field
theories with a particle description. The argument is somewhat delicate, and I must confess that some steps
will be motivated by analogy to the above discussion rather than justified in detail, but the final answer is
quite simple and forms the basis for all scattering computations in quantum field theory. The idea is that
the S-matrix element for M particles going to N particles is the coefficient of a multi-dimensional pole in
the Fourier transform of a time-ordered M + N -point correlation function.
We’ll begin with the object
\[
\begin{aligned}
\langle\Omega|T O_N^{\prime a_N}(k'_N)\ldots O_1^{\prime a_1}(k'_1)\,O_1^{b_1\dagger}(k_1)\ldots O_M^{b_M\dagger}(k_M)|\Omega\rangle_\epsilon := {}&\int dx_1\ldots dx_{N+M}\; e^{-i(k_1\cdot x_1+\ldots+k_M\cdot x_M+k'_1\cdot x_{M+1}+\ldots+k'_N\cdot x_{M+N})}\\
&\times\langle\Omega|T O_N^{\prime a_N}(x_{M+N})\ldots O_1^{\prime a_1}(x_{M+1})\,O_1^{b_1\dagger}(x_1)\ldots O_M^{b_M\dagger}(x_M)|\Omega\rangle\\
&\times e^{-\epsilon(t_{\max}-t_{\min})}
\end{aligned}\tag{9.44}
\]
where $O_1^{b_1},\ldots,O_M^{b_M}$ and $O_1^{\prime a_1},\ldots,O_N^{\prime a_N}$ are Heisenberg operators in some quantum field theory with a scattering
description that transform in irreducible representations of the Lorentz group. In the ϵ-regulator,
tmin and tmax are the least and greatest of the times t1 , . . . , tM +N . We will eventually arrange so that
$k_1,\ldots,k_M$ are the spacetime momenta of the ingoing particles and $k'_1,\ldots,k'_N$ are the spacetime momenta
of the outgoing particles. Very roughly we’ll see that you can think of the $O_i^\dagger(k_i)$ as creation operators for
particles in an “in” state and the $O'_i(k'_i)$ as annihilation operators for particles in an “out” state. Our goal is
to show that this object has a multi-dimensional pole as we take the external momenta to be on-shell, and
in particular when we do this so that
\[
\begin{aligned}
k_i^0 &\to -\omega_{n_i,\vec k_i}\\
k_i^{\prime 0} &\to \omega_{n'_i,\vec k'_i}.
\end{aligned}\tag{9.45}
\]

This pole arises from the region of integration where we have

t1 ≤ t2 ≤ . . . ≤ tM +N , (9.46)

and is a generalization of the first term in (9.42) (other regions of integration give poles where some of the
$k_i^0$ are equal to $\omega_{n_i,\vec k_i}$ and some of the $k_i^{\prime 0}$ are equal to $-\omega_{n'_i,\vec k'_i}$). Focusing on this region of the integral, and
defining
\[
G_\epsilon := \langle\Omega|T O_N^{\prime a_N}(k'_N)\ldots O_1^{\prime a_1}(k'_1)\,O_1^{b_1\dagger}(k_1)\ldots O_M^{b_M\dagger}(k_M)|\Omega\rangle_\epsilon \tag{9.47}
\]
to save space, we can insert complete sets of scattering states to get
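\[
\begin{aligned}
G_\epsilon \supset {}&\int_{t_1\le t_2\ldots\le t_{M+N}}dx_1\ldots dx_{N+M}\int d\alpha_1\ldots d\alpha_M\,d\beta_1\ldots d\beta_N\; e^{-i(k_1\cdot x_1+\ldots+k_M\cdot x_M+k'_1\cdot x_{M+1}+\ldots+k'_N\cdot x_{M+N})-\epsilon(t_{M+N}-t_1)}\\
&\times\langle\Omega|O_N^{\prime a_N}(x_{M+N})|\beta_1\rangle\ldots\langle\beta_{N-1}|O_1^{\prime a_1}(x_{M+1})|\beta_N\rangle\,\langle\beta_N|\alpha_M\rangle\,\langle\alpha_M|O_1^{b_1\dagger}(x_1)|\alpha_{M-1}\rangle\ldots\langle\alpha_1|O_M^{b_M\dagger}(x_M)|\Omega\rangle,
\end{aligned}\tag{9.48}
\]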

where to save more space I’ve here adopted a convention that α states are “in” states and β states are “out”
states. Note the appearance of the M -particle to N -particle S-matrix ⟨βN |αM ⟩; our goal is now to show that
this can be extracted by isolating the on-shell pole.
As before we can extract the position dependence of the matrix elements in scattering states as

\[
\langle\gamma|O(x)|\delta\rangle = e^{-i(p_\gamma-p_\delta)\cdot x}\,\langle\gamma|O(0)|\delta\rangle, \tag{9.49}
\]

so we can rewrite things as


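\[
\begin{aligned}
G_\epsilon \supset {}&\int d\alpha_1\ldots d\alpha_M\,d\beta_1\ldots d\beta_N\;\langle\beta_N|\alpha_M\rangle\int_{t_1\le t_2\ldots\le t_{M+N}}dx_1\ldots dx_{N+M}\\
&\times e^{-i\left[(k_1+p_{\alpha_M}-p_{\alpha_{M-1}})\cdot x_1+\ldots+(k_M+p_{\alpha_1})\cdot x_M+(k'_1-p_{\beta_N}+p_{\beta_{N-1}})\cdot x_{M+1}+\ldots+(k'_N-p_{\beta_1})\cdot x_{M+N}\right]}\,e^{-\epsilon(t_{M+N}-t_1)}\\
&\times\langle\Omega|O_N^{\prime a_N}(0)|\beta_1\rangle\ldots\langle\beta_{N-1}|O_1^{\prime a_1}(0)|\beta_N\rangle\,\langle\alpha_M|O_1^{b_1\dagger}(0)|\alpha_{M-1}\rangle\ldots\langle\alpha_1|O_M^{b_M\dagger}(0)|\Omega\rangle.
\end{aligned}\tag{9.50}
\]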

The integrals over spatial positions now give simple δ-functions, but the integrals over time require a bit
more work. Defining
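\[
\begin{aligned}
T := {}&\int_{-\infty}^{\infty}dt_1\,e^{i\left(k_1^0+p^0_{\alpha_M}-p^0_{\alpha_{M-1}}-i\epsilon\right)t_1}\int_{t_1}^{\infty}dt_2\,e^{i\left(k_2^0+p^0_{\alpha_{M-1}}-p^0_{\alpha_{M-2}}\right)t_2}\ldots\int_{t_{M-1}}^{\infty}dt_M\,e^{i(k_M^0+p^0_{\alpha_1})t_M}\\
&\times\int_{t_M}^{\infty}dt_{M+1}\,e^{i\left(k_1^{\prime 0}-p^0_{\beta_N}+p^0_{\beta_{N-1}}\right)t_{M+1}}\ldots\int_{t_{M+N-2}}^{\infty}dt_{M+N-1}\,e^{i\left(k_{N-1}^{\prime 0}-p^0_{\beta_2}+p^0_{\beta_1}\right)t_{M+N-1}}\int_{t_{M+N-1}}^{\infty}dt_{M+N}\,e^{i(k_N^{\prime 0}-p^0_{\beta_1}+i\epsilon)t_{M+N}},
\end{aligned}\tag{9.51}
\]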

we can evaluate the integrals from right to left to find


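\[
\begin{aligned}
T = {}&2\pi\delta(k^0_{\rm tot}+k^{\prime 0}_{\rm tot}+p^0_{\alpha_M}-p^0_{\beta_N})\\
&\times\frac{-i}{k_1^0+p^0_{\alpha_M}-p^0_{\alpha_{M-1}}-i\epsilon}\;\frac{-i}{k_1^0+k_2^0+p^0_{\alpha_M}-p^0_{\alpha_{M-2}}-i\epsilon}\;\ldots\;\frac{-i}{k^0_{\rm tot}-k_M^0+p^0_{\alpha_M}-p^0_{\alpha_1}-i\epsilon}\\
&\times\frac{i}{k^{\prime 0}_{\rm tot}-p^0_{\beta_N}+i\epsilon}\;\ldots\;\frac{i}{k^{\prime 0}_{N-1}+k^{\prime 0}_N-p^0_{\beta_2}+i\epsilon}\;\frac{i}{k^{\prime 0}_N-p^0_{\beta_1}+i\epsilon}.
\end{aligned}\tag{9.52}
\]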

Here we’ve defined

\[
\begin{aligned}
k_{\rm tot} &= k_1+\ldots+k_M\\
k'_{\rm tot} &= k'_1+\ldots+k'_N.
\end{aligned}\tag{9.53}
\]
Evaluating the spatial integrals, we thus have
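\[
\begin{aligned}
G_\epsilon \supset {}&\int d\alpha_1\ldots d\alpha_M\,d\beta_1\ldots d\beta_N\;\langle\beta_N|\alpha_M\rangle\;T\\
&\times(2\pi)^{d-1}\delta^{d-1}\!\left(\vec k_1+\vec p_{\alpha_M}-\vec p_{\alpha_{M-1}}\right)\ldots(2\pi)^{d-1}\delta^{d-1}\!\left(\vec k_M+\vec p_{\alpha_1}\right)\\
&\times(2\pi)^{d-1}\delta^{d-1}\!\left(\vec k'_1-\vec p_{\beta_N}+\vec p_{\beta_{N-1}}\right)\ldots(2\pi)^{d-1}\delta^{d-1}\!\left(\vec k'_N-\vec p_{\beta_1}\right)\\
&\times\langle\Omega|O_N^{\prime a_N}(0)|\beta_1\rangle\ldots\langle\beta_{N-1}|O_1^{\prime a_1}(0)|\beta_N\rangle\,\langle\alpha_M|O_1^{b_1\dagger}(0)|\alpha_{M-1}\rangle\ldots\langle\alpha_1|O_M^{b_M\dagger}(0)|\Omega\rangle.
\end{aligned}\tag{9.54}
\]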

The effect of these δ-functions is simply to impose that

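\[
\begin{aligned}
\vec p_{\alpha_1} &= -\vec k_M\\
\vec p_{\alpha_2} &= -(\vec k_M+\vec k_{M-1})\\
&\;\;\vdots\\
\vec p_{\alpha_M} &= -\vec k_{\rm tot},
\end{aligned}\tag{9.55}
\]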

and also that

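\[
\begin{aligned}
\vec p_{\beta_1} &= \vec k'_N\\
\vec p_{\beta_2} &= (\vec k'_N+\vec k'_{N-1})\\
&\;\;\vdots\\
\vec p_{\beta_N} &= \vec k'_{\rm tot}.
\end{aligned}\tag{9.56}
\]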

To make sure we get a pole of maximum strength we should choose the multiparticle states appearing in
(9.54) to ensure that the answer has no remaining momentum integrals. The way to do this is to take α1 and
β1 to be one-particle states, α2 and β2 to be two-particle states, and so on.
To proceed further we need to say something about the matrix elements of the O operators. For each O
matrix element the bra has one fewer particle than the ket, so we are interested in the part of O which is
proportional to an annihilation operator. In the previous section we saw that we can write this as

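\[
O^a(0) = \sum_{n,\sigma}Z_n\int\frac{d^{d-1}p}{(2\pi)^{d-1}}\,\frac{u^a(\vec p,\sigma,n)}{\sqrt{2\omega_{n,\vec p}}}\;a_{\vec p,\sigma,n}. \tag{9.57}
\]
Similarly for each of the O† s the bra has one more particle than the ket, so we are interested in the part of O†
which is proportional to a creation operator, which is given by
\[
O^{b\dagger}(0) = \sum_{n,\sigma}Z_n^*\int\frac{d^{d-1}p}{(2\pi)^{d-1}}\,\frac{u^{b*}(\vec p,\sigma,n)}{\sqrt{2\omega_{n,\vec p}}}\;a^\dagger_{\vec p,\sigma,n}. \tag{9.58}
\]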

To simplify life we’ll assume that we’ve chosen either our particle basis or our operators O such that Zn
is nonzero for only one n with a given mass and spin, in which case we can drop the sum on n in these
expressions. There is then only one way to satisfy all of the spatial δ-functions: $O_M^{b_M\dagger}$ must create a particle
of spatial momentum $-\vec k_M$, $O_{M-1}^{b_{M-1}\dagger}$ must create a particle of spatial momentum $-\vec k_{M-1}$, and so on, and
similarly $O_1^{\prime a_1}$ must annihilate a particle of momentum $\vec k'_1$, $O_2^{\prime a_2}$ must annihilate a particle of momentum $\vec k'_2$,
and so on. We can therefore simplify the quantity T from the time integrals:
\[
T = 2\pi\delta(k_M^0+\omega_{n_M,\vec k_M})\times\frac{-i}{k_1^0+\omega_{n_1,\vec k_1}-i\epsilon}\;\ldots\;\frac{-i}{k_{M-1}^0+\omega_{n_{M-1},\vec k_{M-1}}-i\epsilon}\;\frac{i}{k_1^{\prime 0}-\omega_{n'_1,\vec k'_1}+i\epsilon}\;\ldots\;\frac{i}{k_N^{\prime 0}-\omega_{n'_N,\vec k'_N}+i\epsilon}. \tag{9.59}
\]
It may seem strange that we have treated $k_M^0$ differently than all the other energies; this is because momentum
conservation doesn’t let us really vary all the momenta independently. We can restore the symmetry by using
\[
2\pi\delta(k_M^0+\omega_{n_M,\vec k_M}) = \frac{i}{k_M^0+\omega_{n_M,\vec k_M}+i\epsilon}-\frac{i}{k_M^0+\omega_{n_M,\vec k_M}-i\epsilon}, \tag{9.60}
\]
with the understanding that we are interested in the pole where $k_M^0 = -\omega_{n_M,\vec k_M} - i\epsilon$, in which case we can
simply write
\[
T \supset \frac{-i}{k_1^0+\omega_{n_1,\vec k_1}-i\epsilon}\;\ldots\;\frac{-i}{k_M^0+\omega_{n_M,\vec k_M}-i\epsilon}\;\frac{i}{k_1^{\prime 0}-\omega_{n'_1,\vec k'_1}+i\epsilon}\;\ldots\;\frac{i}{k_N^{\prime 0}-\omega_{n'_N,\vec k'_N}+i\epsilon}. \tag{9.61}
\]
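The representation (9.60) of the δ-function can be checked numerically by smearing it against a smooth test function. The sketch below is illustrative only: the Gaussian test function, the midpoint rule, the integration range, and the function names are all choices made here. Since i/(x + iϵ) − i/(x − iϵ) = 2ϵ/(x² + ϵ²), the smeared integral should approach 2π times the test function at x = 0 as ϵ → 0.

```python
import math

def delta_representation(x, eps):
    """Right-hand side of (9.60) with x = k_M^0 + omega: i/(x + i eps) - i/(x - i eps)."""
    return 1j / (x + 1j * eps) - 1j / (x - 1j * eps)

def smeared_against_gaussian(eps, half_range=10.0, steps=20000):
    """Midpoint-rule integral of the representation times exp(-x^2 / 2)."""
    h = 2.0 * half_range / steps
    total = 0.0
    for j in range(steps):
        x = -half_range + (j + 0.5) * h
        # The representation is real (equal to 2 eps / (x^2 + eps^2)), so take .real.
        total += h * delta_representation(x, eps).real * math.exp(-0.5 * x * x)
    return total

coarse = smeared_against_gaussian(0.1)
fine = smeared_against_gaussian(0.01)
target = 2.0 * math.pi  # 2*pi times the test function evaluated at x = 0
print(coarse, fine, target)
```

The value for the smaller ϵ lies closer to 2π, as expected for a nascent δ-function of width ϵ.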

Having now done all of the momentum integrals, we arrive at last at the LSZ formula:66

   
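\[
\begin{aligned}
G_\epsilon \xrightarrow[\substack{k_i^0\,\to\,-\omega_{n_i,\vec k_i}\\ k_i^{\prime 0}\,\to\,\omega_{n'_i,\vec k'_i}}]{} {}&\prod_{j=1}^{N}\left(\sum_{\sigma'_j}Z_{n'_j}\,\frac{u^{b_j}(\vec k'_j,\sigma'_j,n'_j)}{\sqrt{2\omega_{n'_j,\vec k'_j}}}\times\frac{i}{k_j^{\prime 0}-\omega_{n'_j,\vec k'_j}+i\epsilon}\right)\\
&\times\prod_{i=1}^{M}\left(\sum_{\sigma_i}Z^*_{n_i}\,\frac{u^{a_i*}(-\vec k_i,\sigma_i,n_i)}{\sqrt{2\omega_{n_i,\vec k_i}}}\times\frac{-i}{k_i^0+\omega_{n_i,\vec k_i}-i\epsilon}\right)\\
&\times\langle k'_1,\sigma'_1,n'_1;\ldots;k'_N,\sigma'_N,n'_N,-|-k_1,\sigma_1,n_1;\ldots;-k_M,\sigma_M,n_M,+\rangle.
\end{aligned}\tag{9.62}
\]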

The third line here is just the S-matrix with arbitrary external particles, so by stripping off the factors in
the first two lines we can directly extract it! The only remaining difficulty is how to “undo” the sums over
spin/helicity; there is a standard way to do this but we will postpone discussion of it until we discuss fields
with nonzero spin more explicitly.
The LSZ formula is often written in a slightly more covariant way by noting that near the poles we have

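\[
\frac{i}{k^0-\omega_{\vec k}+i\epsilon} = \frac{-i(k^0+\omega_{\vec k}-i\epsilon)}{(\omega_{\vec k}-i\epsilon)^2-(k^0)^2} \approx \frac{-i\,2\omega_{\vec k}}{k^2+m^2-i\epsilon} \tag{9.63}
\]
and
\[
\frac{-i}{k^0+\omega_{\vec k}-i\epsilon} = \frac{-i(\omega_{\vec k}-i\epsilon-k^0)}{(\omega_{\vec k}-i\epsilon)^2-(k^0)^2} \approx \frac{-i\,2\omega_{\vec k}}{k^2+m^2-i\epsilon}, \tag{9.64}
\]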
and thus67
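\[
\begin{aligned}
G_\epsilon \xrightarrow[\substack{k_i^0\,\to\,-\omega_{n_i,\vec k_i}\\ k_i^{\prime 0}\,\to\,\omega_{n'_i,\vec k'_i}}]{} {}&\prod_{j=1}^{N}\left(\sum_{\sigma'_j}Z_{n'_j}\,u^{b_j}(\vec k'_j,\sigma'_j,n'_j)\times\frac{-i\sqrt{2\omega_{n'_j,\vec k'_j}}}{k_j^{\prime 2}+m^2_{n'_j}-i\epsilon}\right)\\
&\times\prod_{i=1}^{M}\left(\sum_{\sigma_i}Z^*_{n_i}\,u^{a_i*}(-\vec k_i,\sigma_i,n_i)\times\frac{-i\sqrt{2\omega_{n_i,\vec k_i}}}{k_i^2+m^2_{n_i}-i\epsilon}\right)\\
&\times\langle k'_1,\sigma'_1,n'_1;\ldots;k'_N,\sigma'_N,n'_N,-|-k_1,\sigma_1,n_1;\ldots;-k_M,\sigma_M,n_M,+\rangle.
\end{aligned}\tag{9.65}
\]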

Due to the pesky signs in the momenta of the “in” state here, one sometimes defines Gϵ to have the opposite
sign for ki (anyways the signs in the Fourier transform are a matter of convention).
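The near-pole rewriting in (9.63) and (9.64) can be checked numerically. The sketch below is illustrative: the function names and sample numbers are choices made here, the covariant denominator is written as k² + m² = ω² − (k⁰)² in mostly-plus signature, and the infinitesimal in the covariant form is taken to be 2ωϵ, on the assumption that a positive rescaling of ϵ is immaterial.

```python
import math

def out_pole(k0, omega, eps):
    """Left-hand side of (9.63): i / (k0 - omega + i eps)."""
    return 1j / (k0 - omega + 1j * eps)

def in_pole(k0, omega, eps):
    """Left-hand side of (9.64): -i / (k0 + omega - i eps)."""
    return -1j / (k0 + omega - 1j * eps)

def covariant_pole(k0, omega, eps):
    """-2 i omega / (k^2 + m^2 - i eps'), with k^2 + m^2 = omega^2 - k0^2.

    Assumption: eps' = 2 * omega * eps, a positive rescaling of the regulator.
    """
    return -2j * omega / (omega ** 2 - k0 ** 2 - 2j * omega * eps)

omega = 1.25     # e.g. m = 1 and |k| = 0.75 (illustrative values)
delta = 1e-3     # distance from the pole
eps = 1e-4       # regulator

# Ratios of the covariant form to the exact pole factor close to each pole.
ratio_out = covariant_pole(omega + delta, omega, eps) / out_pole(omega + delta, omega, eps)
ratio_in = covariant_pole(-omega + delta, omega, eps) / in_pole(-omega + delta, omega, eps)
print(ratio_out, ratio_in)
```

Both ratios are close to one, with corrections of relative size δ/(2ω) that vanish as the momentum approaches the mass shell, which is all that the "≈" in (9.63) and (9.64) asserts.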
There are a few points which are worth making about this formula:
• The LSZ formula is completely non-perturbative, computing the exact S-matrix of the true asymptotic
states of the theory. Choosing the operators O and O′ in general requires you to know enough about
the theory to be able to find a local operator that creates/annihilates each particle type n with nonzero
Zn . As mentioned above this is usually not difficult however: you just find an operator that has the
right symmetry charges and then a nonzero Zn is generic.
66 LSZ stands for Lehmann, Symanzik, and Zimmermann. Their original paper from 1954 only treats the scalar case and
assumes weak coupling. It is written in German, Ein Prosit if you can read it!
67 In many textbooks the “in” and “out” states are normalized in a more covariant way by absorbing the factors of
$\sqrt{2\omega_{n'_j,\vec k'_j}}$ and $\sqrt{2\omega_{n_i,\vec k_i}}$ into their definitions, which makes this formula look even more covariant.
• Comparison to (9.42) may have you worried that we are only computing the S-matrix for particles and
not antiparticles, but of course it is arbitrary which particles we view as antiparticles. Given a free
field Φa (x) as in equation (9.25), we can exchange the role of particles and antiparticles by taking the
adjoint of the field. We have implicitly done this in our presentation of the LSZ formula since we have
only the ua intertwiners appearing, so in particular if the amplitude involves both a particle and its
antiparticle then we have used an O analogous to Φ for the former and an O analogous to Φ† for the
latter. If we instead want to adhere to some pre-existing convention for which particles are antiparticles
(for example if we want positrons to be antiparticles), and we want to take all Os to be analogous to
Φ, then in the LSZ formula we should make the replacements ua∗ → v a for each antiparticle in the
initial state and ua → v a∗ for each antiparticle in the final state.
• Since we have related S-matrix elements to correlation functions, all symmetry constraints on correlation
functions must imply symmetry constraints on S-matrix elements. For example if there is a U (1)
global symmetry under which the incoming particles have charges q1 , q2 , . . . and the outgoing particles
have charges q1′ , q2′ , . . ., then we must have
\[
q_1+q_2+\ldots+q_M = q'_1+q'_2+\ldots+q'_N. \tag{9.66}
\]

• The masses appearing in the poles are again the physical masses, not the bare masses that appear in the
Lagrangian. After all the latter do not even make sense for composite particles. The factors of $Z_{n'}$
and $Z_n$ are again called wave function renormalization.
In the next lecture we will learn how to use the LSZ formula to compute the S-matrix in weakly interacting
theories using Feynman diagrams.

9.5 Homework
1. Show that the expression (9.16) is compatible with our expression $(2\pi)^d \delta^d(k_2 + k_1)\frac{-i}{k_2^2 + m^2 - i\epsilon}$ for the momentum-space Feynman propagator in free scalar field theory. You should take $O_1(x) = O_2(x) = \Phi(x)$, where $\Phi$ is a real free scalar field.
2. Consider the derivative $\partial^\mu\Phi(x)$ of a free scalar field. From the point of view of this lecture this is just as good a candidate for a field that creates a free scalar particle as $\Phi(x)$ itself is. What are $u^\mu$ and $v^\mu$ for the free field $\partial^\mu\Phi$? Show that these $u^\mu$ and $v^\mu$ obey the intertwiner equations (9.28) and (9.34).

3. Evaluate the Fourier transform of the time-ordered two-point function $\langle\Omega|T\Phi^{a_2}(x_2)\Phi^{a_1\dagger}(x_1)|\Omega\rangle$ of a general free field as in equation (9.25), and show that it gives the right hand side of (9.42) but without the $\sum_n |Z_n|^2$.
4. Evaluate the integrals in T and show that they lead to (9.52). Make sure to go from right to left, and
be prepared to use the δ-function at the end to rewrite the poles involving ki0 .

5. Extra credit: starting from the general Lorentz transformation properties of one-particle states, the free-field expression (9.25), and the field transformation (9.26), derive the creation and annihilation operator transformations (9.33) and the $u^a$ and $v^a$ transformations (9.34).

6. Extra extra credit: derive Zn = Zenc directly from (9.37) and (9.38), without using free field theory. You
will need to figure out how one-particle states transform under CRT , which requires you to think about
how to analytically continue the machinery of the little group to Euclidean signature. (Disclosure: I
tried this myself, but there was a sign I so far couldn’t get to work out in the fermionic case.)

10 Scattering in perturbation theory
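In the last lecture we met the LSZ formula relating the S-matrix to the Fourier transform of time-ordered correlation functions:
$$
\begin{aligned}
\langle\Omega|T O'^{a_N}_N(k'_N)\ldots O'^{a_1}_1(k'_1)O^{b_1\dagger}_1(k_1)\ldots O^{b_M\dagger}_M(k_M)|\Omega\rangle
\;\xrightarrow[\;k'^0_i\to\omega_{n'_i,\vec{k}'_i}\;]{\;k^0_i\to-\omega_{n_i,\vec{k}_i}\;}\;
&\prod_{j=1}^N\left(\sum_{\sigma'_j}Z_{n'_j}\,u^{a_j}(\vec{k}'_j,\sigma'_j,n'_j)\times\frac{-i\sqrt{2\omega_{n'_j,\vec{k}'_j}}}{k_j'^2+m^2_{n'_j}-i\epsilon}\right)\\
&\times\prod_{i=1}^M\left(\sum_{\sigma_i}Z^*_{n_i}\,u^{b_i*}(-\vec{k}_i,\sigma_i,n_i)\times\frac{-i\sqrt{2\omega_{n_i,\vec{k}_i}}}{k_i^2+m^2_{n_i}-i\epsilon}\right)\\
&\times\langle k'_1,\sigma'_1,n'_1;\ldots;k'_N,\sigma'_N,n'_N,-|-k_1,\sigma_1,n_1;\ldots;-k_M,\sigma_M,n_M,+\rangle.
\end{aligned}
\tag{10.1}
$$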

In this lecture we will learn how to use this formula in perturbation theory to compute the S-matrix. To simplify expressions we will restrict to particles with zero spin/helicity and take the operators $O$ and $O'$ to be scalars, in which case the formula simplifies to [68]
$$
\begin{aligned}
\langle\Omega|T O'_N(k'_N)\ldots O'_1(k'_1)O_1^\dagger(k_1)\ldots O_M^\dagger(k_M)|\Omega\rangle
\;\xrightarrow[\;k'^0_i\to\omega_{n'_i,\vec{k}'_i}\;]{\;k^0_i\to-\omega_{n_i,\vec{k}_i}\;}\;
&\prod_{j=1}^N\left(\frac{-iZ_{n'_j}\sqrt{2\omega_{n'_j,\vec{k}'_j}}}{k_j'^2+m^2_{n'_j}-i\epsilon}\right)\times\prod_{i=1}^M\left(\frac{-iZ^*_{n_i}\sqrt{2\omega_{n_i,\vec{k}_i}}}{k_i^2+m^2_{n_i}-i\epsilon}\right)\\
&\times\langle k'_1,n'_1;\ldots;k'_N,n'_N,-|-k_1,n_1;\ldots;-k_M,n_M,+\rangle.
\end{aligned}
\tag{10.2}
$$

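We can invert this to get a formula directly for the S-matrix:
$$
\begin{aligned}
\langle k'_1,n'_1;\ldots;k'_N,n'_N,-|k_1,n_1;\ldots;k_M,n_M,+\rangle_c
=&\prod_{j=1}^N\left(\frac{k_j'^2+m^2_{n'_j}-i\epsilon}{-iZ_{n'_j}\sqrt{2\omega_{n'_j,\vec{k}'_j}}}\right)\prod_{i=1}^M\left(\frac{k_i^2+m^2_{n_i}-i\epsilon}{-iZ^*_{n_i}\sqrt{2\omega_{n_i,\vec{k}_i}}}\right)\\
&\times\langle\Omega|T O'_N(k'_N)\ldots O'_1(k'_1)O_1^\dagger(-k_1)\ldots O_M^\dagger(-k_M)|\Omega\rangle_c\Big|_{k^0_i\to\omega_{n_i,\vec{k}_i},\;k'^0_i\to\omega_{n'_i,\vec{k}'_i}},
\end{aligned}
\tag{10.3}
$$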

where I’ve taken the liberty of flipping the sign of the ingoing momenta in the Fourier transform. I’ve also
taken the connected part of the S-matrix, which is defined in just the same way as the connected part of
the correlation functions and therefore can be computed by using the connected correlation function on
the right-hand side. Let’s study this formula specifically in the context of our interacting $\phi^4$ theory, with Lagrangian density
$$\mathcal{L} = -\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \frac{m_0^2}{2}\phi^2 - \frac{\lambda}{4!}\phi^4. \tag{10.4}$$
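We will take all of the $O$s and $O'$s to just be $\Phi$. Since this theory has only one kind of particle and $\Phi$ is hermitian, we can further simplify (10.3) to
$$
\langle k'_1;\ldots;k'_N,-|k_1;\ldots;k_M,+\rangle_c
=\prod_{j=1}^N\left(\frac{k_j'^2+m^2-i\epsilon}{-iZ\sqrt{2\omega_{\vec{k}'_j}}}\right)\prod_{i=1}^M\left(\frac{k_i^2+m^2-i\epsilon}{-iZ\sqrt{2\omega_{\vec{k}_i}}}\right)
\langle\Omega|T\Phi(k'_N)\ldots\Phi(k'_1)\Phi(-k_1)\ldots\Phi(-k_M)|\Omega\rangle_c\Big|_{k^0_i\to\omega_{\vec{k}_i},\;k'^0_i\to\omega_{\vec{k}'_i}}.
\tag{10.5}
$$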
68 It is easy to put back the spin; we will do so later when we consider particles of spin/helicity 1/2 and 1.

We’ve written m0 for the “bare” mass in the Lagrangian to distinguish it from the genuine physical mass m
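which appears in equation (10.5). Recall that $Z$ here is defined by
$$\langle\Omega|\Phi(0)|k\rangle = \langle k|\Phi(0)|\Omega\rangle = \frac{Z}{\sqrt{2\omega_{\vec{k}}}}, \tag{10.6}$$
with
$$\omega_{\vec{k}} = \sqrt{|k|^2+m^2}. \tag{10.7}$$
$Z$ must be real by the hermiticity of $\Phi$.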
We learned a few lectures ago that in perturbation theory we can compute the Fourier transform of the
connected time-ordered correlation functions of Φ using the momentum-space Feynman rules:

1. Write down a connected Feynman diagram $C$ with $N + M$ external points.
2. Write a factor of $-i\lambda$ for each interaction vertex.
3. Divide by the symmetry factor $s_C$.
4. Label all internal and external momenta (with external momenta outgoing), enforcing momentum conservation at all interaction vertices.
5. Write down an overall momentum-conserving δ-function for the external momenta.
6. For each propagator supply a factor of $\frac{-i}{p^2+m_0^2-i\epsilon}$.
7. Integrate over all remaining loop momenta.
8. Sum over connected diagrams $C$.


What we need to figure out now is how these rules are modified by the pole factors in (10.5). Naively these just remove the propagators attached to the external points and multiply by factors of $Z^{-1}$, but we need to be careful about the difference between $m$ and $m_0$.

10.1 Self-energy and the two-point function


To clarify the difference between $m$ and $m_0$ we can be a little more organized about the perturbative calculation of the two-point function. Let’s first define the self-energy $\Sigma(p^2)$ of the scalar particle by
$$\langle T\Phi(p)\Phi(p')\rangle = (2\pi)^d\delta^d(p+p')\frac{-i}{p^2+m_0^2+\Sigma(p^2)-i\epsilon}. \tag{10.8}$$
The δ-function here is a consequence of translation invariance, and the fact that $\Sigma$ depends only on $p^2$ is a consequence of Lorentz invariance. We saw in the previous lecture that this correlation function has a pole at $p^2 = -m^2$, so the relationship between the bare and physical masses is determined by solving
$$m^2 = m_0^2 + \Sigma(-m^2). \tag{10.9}$$
We also saw that the residue of this pole is $-i(2\pi)^d\delta^d(p+p')Z^2$, so apparently we have
$$Z^2 = \frac{1}{1+\Sigma'(-m^2)}. \tag{10.10}$$
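To make equations (10.9) and (10.10) concrete, here is a minimal numerical sketch. It assumes a made-up toy self-energy that is linear in $p^2$ (the coefficients `a` and `b` are purely illustrative and do not come from any diagram in these notes), solves (10.9) by fixed-point iteration, and then evaluates $Z$ from (10.10). The function name `pole_mass_and_z` is likewise just a label chosen here.

```python
def pole_mass_and_z(m0_sq, sigma, dsigma, tol=1e-12, max_iter=200):
    """Solve m^2 = m0^2 + Sigma(-m^2) by fixed-point iteration (eq. 10.9),
    then return (m^2, Z) with Z^2 = 1 / (1 + Sigma'(-m^2)) (eq. 10.10).

    sigma(p2) and dsigma(p2) are user-supplied callables for Sigma and its
    derivative with respect to p^2.
    """
    m_sq = m0_sq  # start the iteration from the bare mass
    for _ in range(max_iter):
        new_m_sq = m0_sq + sigma(-m_sq)
        converged = abs(new_m_sq - m_sq) < tol
        m_sq = new_m_sq
        if converged:
            break
    z = (1.0 / (1.0 + dsigma(-m_sq))) ** 0.5
    return m_sq, z


# Hypothetical toy self-energy Sigma(p^2) = a + b p^2 (illustration only).
a, b = 0.3, 0.05
m_sq, z = pole_mass_and_z(1.0, lambda p2: a + b * p2, lambda p2: b)
```

For this linear toy model the answer can be checked by hand: $m^2 = (m_0^2 + a)/(1 + b)$ and $Z^2 = 1/(1+b)$. The iteration only converges when $\Sigma$ depends weakly on $p^2$, which is the perturbative situation we have in mind.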

Figure 25: The one-particle-irreducible (1PI) decomposition of the two-point function. The full two-point function is built out of a sum of increasing numbers of 1PI bubbles chained together by propagators, leading to a geometric series.

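We can then consider how to compute the self-energy perturbatively. In the free theory with $\lambda = 0$ we have $\Sigma = 0$, so we can rewrite the momentum space propagator perturbatively:
$$
\begin{aligned}
\frac{-i}{p^2+m_0^2+\Sigma(p^2)-i\epsilon}
&=\frac{-i}{p^2+m_0^2-i\epsilon}\times\frac{p^2+m_0^2-i\epsilon}{p^2+m_0^2+\Sigma(p^2)-i\epsilon}\\
&=\frac{-i}{p^2+m_0^2-i\epsilon}\times\frac{1}{1+\frac{\Sigma(p^2)}{p^2+m_0^2-i\epsilon}}\\
&=\frac{-i}{p^2+m_0^2-i\epsilon}\times\left(1+\left(-i\Sigma(p^2)\right)\frac{-i}{p^2+m_0^2-i\epsilon}+\left(\left(-i\Sigma(p^2)\right)\frac{-i}{p^2+m_0^2-i\epsilon}\right)^2+\ldots\right).
\end{aligned}
\tag{10.11}
$$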
In the last line here we have “unsummed” a geometric series to get a Taylor expansion in Σ, which we should
think of as being O(λ). We can then compare this expression to what we get from the Feynman diagram
expansion for the two-point function, for which the first few diagrams are shown in the first line of figure 25.
To isolate the contribution of Σ, we note that we can organize this series as a geometric sum by splitting it
up into “one-particle irreducible” (1PI) pieces. By definition a 1PI Feynman diagram is a connected diagram
with at least one interaction vertex and also the property that there is no internal link such that removing
that link splits the diagram into two connected components. The second, third, and fourth diagrams in the
first line of figure 25 are 1PI but the first and fifth are not. Comparing the 1PI decomposition to the last
line of equation (10.11), we see the following rule:
$-i\Sigma(p^2)$ = sum over two-point 1PI diagrams with external propagators and the momentum-conserving δ-function removed.
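The geometric-series structure of (10.11) is easy to check numerically. The sketch below treats $\Sigma$ as a fixed number at a single spacelike momentum (so no $i\epsilon$ is needed), and compares the resummed propagator with a truncated chain of 1PI insertions; the sample values and the function names are illustrative assumptions, not part of the notes.

```python
def full_propagator(p2, m0_sq, sigma):
    """Resummed propagator -i / (p^2 + m0^2 + Sigma), with the i*epsilon dropped."""
    return -1j / (p2 + m0_sq + sigma)


def chained_propagator(p2, m0_sq, sigma, n_terms):
    """Free propagator times the first n_terms of the series in eq. (10.11)."""
    free = -1j / (p2 + m0_sq)
    # One 1PI insertion (-i Sigma) followed by one more free propagator.
    insertion = (-1j * sigma) * free
    return free * sum(insertion ** n for n in range(n_terms))


# Illustrative numbers only: the series converges when |Sigma| < |p^2 + m0^2|.
exact = full_propagator(2.0, 1.0, 0.3)
approx = chained_propagator(2.0, 1.0, 0.3, n_terms=40)
```

Each extra term corresponds to one more 1PI bubble in the chain of figure 25, and the partial sums approach the resummed propagator geometrically.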
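In particular at one loop the only contribution to $\Sigma$ is the “snail” diagram (the first diagram in the third line of figure 25), so we have
$$
\begin{aligned}
-i\Sigma(p^2) &= -\frac{i\lambda}{2}\int\frac{d^dq}{(2\pi)^d}\frac{-i}{q^2+m_0^2-i\epsilon}+O(\lambda^2)\\
&= -\frac{i\lambda}{2}G_F(0)+O(\lambda^2).
\end{aligned}
\tag{10.12}
$$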

Figure 26: Momentum labels for the sunset diagram.

Equation (10.9) then shows us that
$$m^2 = m_0^2 + \lambda G_F(0)/2 + O(\lambda^2), \tag{10.13}$$
just as we found back in lectures 10-11. We have now done better than we did then however, as we have seen by resumming an infinite sum of diagrams that this mass shift is indeed a shift of the pole location to all orders in perturbation theory.
You may have already noticed that at one loop Σ(p2 ) is actually independent of p2 , so from (10.10) we
see that
Z = 1 + O(λ2 ). (10.14)
To get a nonzero contribution to $Z$ we need a 1PI diagram that has nontrivial $p^2$ dependence once the external propagators are removed, and the first diagram with this property is the two-loop “sunset” diagram (the second diagram in the third line of figure 25). Choosing momentum labels as in figure 26, the contribution of this diagram to $\Sigma(p^2)$ is
$$-i\Sigma(p^2)\supset-\frac{\lambda^2}{6}\int\frac{d^dq}{(2\pi)^d}\int\frac{d^d\ell}{(2\pi)^d}\,\frac{-i}{q^2+m_0^2-i\epsilon}\,\frac{-i}{\ell^2+m_0^2-i\epsilon}\,\frac{-i}{(p-\ell-q)^2+m_0^2-i\epsilon}. \tag{10.15}$$

You can see the explicit p2 dependence here in the third propagator. For now we won’t try to evaluate
the loop integrals. In the homework you’ll meet another scalar field theory which already has a nonzero
contribution to Z at one loop.

10.2 Perturbative calculation of the S-matrix


Returning now to the perturbative computation of the connected S-matrix, from the LSZ formula (10.5)
we want to start with the Fourier transform of the time-ordered N + M -point function and send all of the
external momenta on-shell, dividing by an exact momentum-space two-point function for each external leg as
we do. To get a sense of what this means at the level of Feynman diagrams, we show the first contributions to
the four-point function (appropriate for computing 2 → 2 scattering) in figure 27. In this diagram we should
view the second row of diagrams as simply providing the one-loop corrections to the external propagators in
the single diagram in the first row, and so these diagrams will be canceled when we divide out by the exact
two-point functions on the legs. We can make this cancellation automatic by only summing over “pruned”
connected Feynman diagrams, meaning connected diagrams with at least one interaction vertex and also the
property that there is no internal link such that removing that link would result in one external point being in a different connected component from all of the other external points.[69] This pruning removes all diagrams
69 1PI diagrams are always pruned, but the converse is not true since we could have an internal line whose removal splits the diagram into pieces each of which contains multiple external points. A simple example is the six-point tree level diagram where three external legs are connected to one interaction vertex and three external legs are connected to another interaction vertex, with one link connecting the two vertices.

Figure 27: Contributions to the four-point function up to O(λ2 ). The diagrams in the second row are not
pruned, so we should remove them in computing the S-matrix.

which would give corrections to the external propagators, so to finish removing the exact two-point functions on the external legs we just need to divide by the free propagators. We thus have the following rule:[70]
$$\sqrt{2\omega_{\vec{k}'_1}}\ldots\sqrt{2\omega_{\vec{k}'_N}}\sqrt{2\omega_{\vec{k}_1}}\ldots\sqrt{2\omega_{\vec{k}_M}}\;\langle k'_1;\ldots;k'_N,-|k_1;\ldots;k_M,+\rangle_c = \text{sum over all pruned connected Feynman diagrams with external propagators removed.}$$


In our discussion of scattering theory it was convenient to remove the overall momentum conserving δ-function, defining the quantity $M_{\beta\alpha}$ by
$$S_{\beta\alpha} = \delta(\beta-\alpha) + i\times(2\pi)^d\delta^d(p_\beta-p_\alpha)M_{\beta\alpha}, \tag{10.16}$$
and we can also get rid of the annoying factors of the energy in our formula by defining (as we did a few lectures ago)
$$\widetilde{M}_{\beta\alpha} = \sqrt{2\omega_{\vec{k}'_1}}\ldots\sqrt{2\omega_{\vec{k}'_N}}\sqrt{2\omega_{\vec{k}_1}}\ldots\sqrt{2\omega_{\vec{k}_M}}\;M_{\beta\alpha}. \tag{10.17}$$
To give us more room to describe the states we’ll also write this as $\widetilde{M}(\alpha\to\beta)$, in which case we have
$i\widetilde{M}_c(k_1,\ldots,k_M\to k'_1,\ldots,k'_N)$ = sum over all pruned connected Feynman diagrams with external propagators and the momentum-conserving δ-function removed.
As a first example we can consider the contributions to the 2 → 2 S-matrix. At tree level we just have the single diagram in the first row of figure 28, which simply gives
$$i\widetilde{M}_c(k_1,k_2\to k'_1,k'_2) = -i\lambda + O(\lambda^2). \tag{10.18}$$
At one loop we then have the three diagrams in the third row, whose momenta we label as in figure 28, which add up to
$$
\begin{aligned}
i\widetilde{M}_c(k_1,k_2\to k'_1,k'_2)\supset-\frac{\lambda^2}{2}\int\frac{d^d\ell}{(2\pi)^d}\,\frac{-i}{\ell^2+m_0^2-i\epsilon}\Bigg[&\frac{-i}{(\ell+k'_1-k_1)^2+m_0^2-i\epsilon}+\frac{-i}{(\ell-k_1-k_2)^2+m_0^2-i\epsilon}\\
&+\frac{-i}{(\ell+k'_2-k_1)^2+m_0^2-i\epsilon}\Bigg].
\end{aligned}
\tag{10.19}
$$
We will learn how to evaluate this integral in the next lecture.
70 Pruned diagrams are usually instead called “amputated”, but pruning feels less gruesome to me.

Figure 28: Momentum labels for the pruned diagrams contributing to the 2 → 2 S-matrix at tree level and
one loop.

10.3 Computing the cross section


We’ll conclude this lecture by computing the tree level differential cross section for 2 → 2 scattering in ϕ4
theory. Let’s first consider the general differential cross section with two particles in the final state and potentially different masses for all four particles, which we saw earlier is given by
$$d\sigma(\alpha\to\beta) = u_\alpha^{-1}(2\pi)^d\delta^d(p_\beta-p_\alpha)|M(\alpha\to\beta)|^2d\beta \tag{10.20}$$
with
$$u_\alpha = \frac{\sqrt{(k_1\cdot k_2)^2-m_{n_1}^2m_{n_2}^2}}{\omega_{n_1,\vec{k}_1}\omega_{n_2,\vec{k}_2}}. \tag{10.21}$$
Writing out these factors more explicitly with two particles in the final state and restricting to the situation where the individual ingoing momenta are not equal to the individual outgoing momenta we have
$$
\begin{aligned}
d\sigma(k_1,\sigma_1,n_1;k_2,\sigma_2,n_2\to k'_1,\sigma'_1,n'_1;k'_2,\sigma'_2,n'_2) =&\;\frac{1}{4\sqrt{(k_1\cdot k_2)^2-m_{n_1}^2m_{n_2}^2}}(2\pi)^d\delta^d(k'_1+k'_2-k_1-k_2)\\
&\times\frac{|\widetilde{M}_c(\alpha\to\beta)|^2}{4\omega_{n'_1,\vec{k}'_1}\omega_{n'_2,\vec{k}'_2}}\times\frac{1}{2^{I_{final}}}\frac{d^{d-1}k'_1}{(2\pi)^{d-1}}\frac{d^{d-1}k'_2}{(2\pi)^{d-1}}.
\end{aligned}
\tag{10.22}
$$

Here $I_{final}$ is equal to zero if the final state particles are distinguishable (i.e. if $n'_1\neq n'_2$) and equal to one if they are indistinguishable (i.e. if $n'_1 = n'_2$); such a factor was part of the definition of $d\beta$. We can use the spatial momentum-conserving δ-function to evaluate the integral over $\vec{k}'_1$, so we are left with
$$
\begin{aligned}
d\sigma(k_1,\sigma_1,n_1;k_2,\sigma_2,n_2\to k_1+k_2-k'_2,\sigma'_1,n'_1;k'_2,\sigma'_2,n'_2) =&\;\frac{1}{2^{I_{final}}}\frac{1}{4\sqrt{(k_1\cdot k_2)^2-m_{n_1}^2m_{n_2}^2}}\\
&\times2\pi\delta\left(-\omega_{n_1,\vec{k}_1}-\omega_{n_2,\vec{k}_2}+\omega_{n'_1,\vec{k}'_1}+\omega_{n'_2,\vec{k}'_2}\right)\frac{|\widetilde{M}_c|^2}{4\omega_{n'_1,\vec{k}'_1}\omega_{n'_2,\vec{k}'_2}}\frac{d^{d-1}k'_2}{(2\pi)^{d-1}},
\end{aligned}
\tag{10.23}
$$
where now we set
$$\vec{k}'_1 = \vec{k}_1+\vec{k}_2-\vec{k}'_2 \tag{10.24}$$

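and the differential cross section is to be integrated over only $\vec{k}'_2$. We will study this in the “center of mass frame” where
$$\vec{k}_2 = -\vec{k}_1 := \vec{k} \tag{10.25}$$
and
$$\vec{k}'_2 = -\vec{k}'_1 := \vec{k}'. \tag{10.26}$$
In this frame the spacetime momenta are
$$
\begin{aligned}
k_1 &= \left(\sqrt{|k|^2+m_{n_1}^2},\,-\vec{k}\right)\\
k_2 &= \left(\sqrt{|k|^2+m_{n_2}^2},\,\vec{k}\right)\\
k'_1 &= \left(\sqrt{|k'|^2+m_{n'_1}^2},\,-\vec{k}'\right)\\
k'_2 &= \left(\sqrt{|k'|^2+m_{n'_2}^2},\,\vec{k}'\right),
\end{aligned}
$$
with total center of mass energy
$$E_{tot} := \sqrt{|k|^2+m_{n_1}^2}+\sqrt{|k|^2+m_{n_2}^2} \tag{10.27}$$
and
$$\sqrt{(k_1\cdot k_2)^2-m_{n_1}^2m_{n_2}^2} = |k|E_{tot}. \tag{10.28}$$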
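For readers who want to see where (10.28) comes from, here is a short derivation sketch. It assumes the mostly-plus metric signature used in these notes (so that $k^2 = -m^2$ on shell), and $\omega_1$, $\omega_2$ are shorthand introduced only here for the two incoming energies $\sqrt{|k|^2+m_{n_1}^2}$ and $\sqrt{|k|^2+m_{n_2}^2}$.

```latex
\begin{aligned}
k_1\cdot k_2 &= -\omega_1\omega_2 + (-\vec{k})\cdot\vec{k} = -\left(\omega_1\omega_2+|k|^2\right),\\
(k_1\cdot k_2)^2 - m_{n_1}^2m_{n_2}^2
&= \omega_1^2\omega_2^2 + 2\omega_1\omega_2|k|^2 + |k|^4 - m_{n_1}^2m_{n_2}^2\\
&= |k|^2\left(m_{n_1}^2+m_{n_2}^2\right) + 2|k|^4 + 2\omega_1\omega_2|k|^2\\
&= |k|^2\left(\omega_1^2+\omega_2^2+2\omega_1\omega_2\right) = |k|^2E_{tot}^2 .
\end{aligned}
```

The second-to-last step uses $\omega_1^2\omega_2^2 = (|k|^2+m_{n_1}^2)(|k|^2+m_{n_2}^2)$, and the last one uses $\omega_1^2+\omega_2^2 = 2|k|^2+m_{n_1}^2+m_{n_2}^2$. Taking the positive square root gives (10.28).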
The energy conserving δ-function sets
$$E_{tot} = \sqrt{|k'|^2+m_{n'_1}^2}+\sqrt{|k'|^2+m_{n'_2}^2}, \tag{10.29}$$
which has no solution if $E_{tot} < m_{n'_1}+m_{n'_2}$ and has solution
$$|k'| = \frac{\sqrt{\left(E_{tot}^2-m_{n'_1}^2-m_{n'_2}^2\right)^2-4m_{n'_1}^2m_{n'_2}^2}}{2E_{tot}} \tag{10.30}$$
if $E_{tot}\geq m_{n'_1}+m_{n'_2}$. In order to use the δ-function to simplify the differential cross section we need to rewrite it by noting that (using (10.29) in the last step)
$$\frac{d}{d|k'|}\left(\sqrt{|k'|^2+m_{n'_1}^2}+\sqrt{|k'|^2+m_{n'_2}^2}\right) = \frac{|k'|}{\sqrt{|k'|^2+m_{n'_1}^2}}+\frac{|k'|}{\sqrt{|k'|^2+m_{n'_2}^2}} = \frac{|k'|E_{tot}}{\sqrt{|k'|^2+m_{n'_1}^2}\sqrt{|k'|^2+m_{n'_2}^2}} \tag{10.31}$$
and thus
$$2\pi\delta\left(-E_{tot}+\sqrt{|k'|^2+m_{n'_1}^2}+\sqrt{|k'|^2+m_{n'_2}^2}\right) = 2\pi\frac{\sqrt{|k'|^2+m_{n'_1}^2}\sqrt{|k'|^2+m_{n'_2}^2}}{|k'|E_{tot}}\,\delta\left(|k'|-\frac{\sqrt{\left(E_{tot}^2-m_{n'_1}^2-m_{n'_2}^2\right)^2-4m_{n'_1}^2m_{n'_2}^2}}{2E_{tot}}\right). \tag{10.32}$$
The integration measure is
$$\frac{d^{d-1}k'}{(2\pi)^{d-1}} = (2\pi)^{-(d-1)}|k'|^{d-2}\,d|k'|\,d\Omega_{d-2}, \tag{10.33}$$
where $d\Omega_{d-2}$ is the volume measure on a unit $S^{d-2}$. Putting this all together we see that the differential cross section (now only to be integrated over the angular coordinates on $S^{d-2}$) is
$$\frac{d\sigma}{d\Omega_{d-2}} = \frac{1}{2^{I_{final}}}\frac{1}{(2\pi)^{d-2}}\frac{|k'|^{d-3}}{16|k|E_{tot}^2}|\widetilde{M}_c|^2, \tag{10.34}$$

with |k ′ | and Etot being given in terms of |k| by equations (10.30) and (10.27).
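As a sanity check on the kinematics, the following sketch evaluates (10.27) and (10.30) for arbitrary masses and confirms that the resulting $|k'|$ satisfies the energy conservation condition (10.29). The function names and the sample masses are illustrative choices, not anything fixed by the notes.

```python
import math


def total_energy(k, m1, m2):
    """Center-of-mass energy sqrt(k^2 + m1^2) + sqrt(k^2 + m2^2), as in eq. (10.27)."""
    return math.hypot(k, m1) + math.hypot(k, m2)


def final_momentum(e_tot, m1_out, m2_out):
    """Outgoing momentum |k'| from eq. (10.30); returns None below threshold."""
    if e_tot < m1_out + m2_out:
        return None
    disc = (e_tot ** 2 - m1_out ** 2 - m2_out ** 2) ** 2 - 4 * m1_out ** 2 * m2_out ** 2
    # Clamp tiny negative values from rounding exactly at threshold.
    return math.sqrt(max(disc, 0.0)) / (2 * e_tot)


# Illustrative example: unequal incoming and outgoing masses.
e_tot = total_energy(1.0, 0.5, 2.0)
k_out = final_momentum(e_tot, 0.3, 1.2)
```

Feeding `k_out` back into `total_energy` with the outgoing masses reproduces `e_tot`, and for equal masses everywhere the function returns $|k'| = |k|$, which is the special case used for $\phi^4$ theory next.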
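Returning now to our $\phi^4$ theory, since all external masses are equal equation (10.30) simplifies to
$$|k'| = |k| \tag{10.35}$$
and so we have
$$\frac{d\sigma}{d\Omega_{d-2}} = \frac{1}{2}\frac{1}{(2\pi)^{d-2}}\frac{|k|^{d-4}}{16E_{tot}^2}|\widetilde{M}_c|^2. \tag{10.36}$$
At tree level we simply have $|\widetilde{M}_c|^2 = \lambda^2$, so the differential cross section is therefore given by
$$\frac{d\sigma}{d\Omega_{d-2}} = \frac{1}{2}\frac{1}{(2\pi)^{d-2}}\frac{|k|^{d-4}}{16E_{tot}^2}\left(\lambda^2+O(\lambda^3)\right). \tag{10.37}$$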

This answer is independent of angle, so the outgoing particles are equally likely to come out in any direction. The total cross section $\sigma$ is therefore just the differential cross section times the volume [71]
$$\Omega_{d-2} = \frac{2\pi^{(d-1)/2}}{\Gamma\left(\frac{d-1}{2}\right)} \tag{10.38}$$
of a unit $S^{d-2}$:
$$\sigma = \frac{\Omega_{d-2}}{32(2\pi)^{d-2}}\frac{|k|^{d-4}}{E_{tot}^2}\left(\lambda^2+O(\lambda^3)\right). \tag{10.39}$$
At one loop the differential cross section becomes angle-dependent, leading to a more interesting differential
cross section. In the homework you will consider another scalar field theory which already at tree level has
an angle-dependent differential cross section.
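To get a feel for the numbers, here is a small sketch that evaluates the tree-level formulas (10.37)-(10.39) for general $d$. It simply transcribes those equations (with the $O(\lambda^3)$ corrections dropped); the function names and sample inputs are illustrative assumptions.

```python
import math


def sphere_volume(n):
    """Volume of a unit n-sphere S^n, i.e. 2 pi^((n+1)/2) / Gamma((n+1)/2)."""
    return 2 * math.pi ** ((n + 1) / 2) / math.gamma((n + 1) / 2)


def phi4_tree_cross_section(d, k, m, lam):
    """Tree-level total 2 -> 2 cross section in phi^4 theory, eq. (10.39).

    d is the spacetime dimension, k the center-of-mass momentum |k|,
    m the physical mass, and lam the coupling.
    """
    e_tot = 2 * math.sqrt(k * k + m * m)
    # Angle-independent differential cross section, eq. (10.37).
    dsigma_domega = 0.5 * k ** (d - 4) * lam ** 2 / ((2 * math.pi) ** (d - 2) * 16 * e_tot ** 2)
    return sphere_volume(d - 2) * dsigma_domega


sigma_4d = phi4_tree_cross_section(4, 1.5, 1.0, 0.1)
```

In $d = 4$ we have $\Omega_2 = 4\pi$, so the result reduces to $\lambda^2/(32\pi E_{tot}^2)$, which is a useful cross-check on the factors of 2 and $\pi$.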
It is interesting to consider the high- and low-energy limits of the tree-level cross section as a function of incident energy. This dependence is given by
$$\sigma\propto\frac{|k|^{d-4}}{|k|^2+m^2}\lambda^2. \tag{10.40}$$
To get a sense of the real strength of the interactions we should compare $\sigma$ to some other quantity with units of area, and at high energies the only such quantity available is $|k|^{-(d-2)}$. We thus can get a rough estimate of the interaction strength at high energy by
$$\sigma|k|^{d-2}\sim\lambda^2|k|^{2(d-4)}. \tag{10.41}$$

Thus for d > 4 the interaction strength grows with energy, and one might worry whether the theory really
makes sense at short distances (it probably doesn’t). In the massless limit this scaling also controls the
theory at low energies, and so when d < 4 the interaction strength grows at low energies in the massless
case. Thus for d < 4 perturbation theory will not be valid at low energy and we will need to use some more
exotic technique. In particular this is true for d = 3, and the strongly-interacting theory one reaches at low
energy in that case governs the behavior of classical Ising magnets in three spatial dimensions. In the case
of d = 4 the interaction strength is constant, and then we need to go to higher order in perturbation theory
to see what happens. We will soon see that at one loop the interactions grow logarithmically with energy in d = 4. This kind of argument is made more precise using the idea of the renormalization group, which we will return to soon.

71 You will derive this volume in the homework.

10.4 Homework
1. Derive equation (10.38) for the volume of a sphere in general dimensions. Hint: The easiest way to do this is to evaluate the multi-dimensional Gaussian integral $\int d^{d-1}x\,e^{-|x|^2}$ in both cartesian and spherical coordinates and then compare the answers.


2. Another simple scalar field theory we can study is
$$\mathcal{L} = -\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \frac{m_0^2}{2}\phi^2 - \frac{g_0}{3!}\phi^3 \tag{10.42}$$
with $g_0 > 0$. Non-perturbatively this theory is rather sick, as the naive ground state near $\phi = 0$ can decay by tunneling through the barrier and then rolling down to $\phi = -\infty$. It is also rather fine-tuned,
as we arbitrarily didn’t write down a linear term proportional to ϕ in the potential that would have
been consistent with all the symmetries of the theory. Nonetheless it is a useful model for playing with
Feynman diagrams, which are not sophisticated enough to see the non-perturbative instability. In fact
some textbooks (such as Srednicki) use this theory as their primary example of an interacting field
theory, as the Feynman diagrams are more similar to those of QED.

(a) Make a sketch of the potential for the field in this theory.
(b) Draw the Feynman diagrams which contribute to the self energy Σ(p2 ) up through two loops.
(c) Draw the Feynman diagrams which contribute to the 2 → 2 scattering amplitude up through one
loop.
(d) Evaluate the tree-level 2 → 2 scattering amplitude and differential cross section. In what space-
time dimension is the cross section measured in units of the wavelength roughly constant at large
energies?

11 Loop diagrams
We’ve now learned how to compute the perturbative S-matrix and perturbative correlation functions in
quantum field theory. In particular we wrote down several one- and two-loop Feynman integrals, but so far
we have not attempted to actually integrate over any of the loop momenta. The goal of this lecture is to
rectify that.

11.1 Self-energy at one loop


Let’s first return to our one-loop expression for the self-energy of the scalar field with interacting Lagrangian72

$$\mathcal{L} = -\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \frac{m_0^2}{2}\phi^2 - \frac{\lambda_0}{4!}\phi^4, \tag{11.1}$$
which we found in the previous lecture to be
$$\Sigma(p^2) = \frac{\lambda_0}{2}\int\frac{d^dq}{(2\pi)^d}\frac{-i}{q^2+m_0^2-i\epsilon}. \tag{11.2}$$
The first thing we will do with this integral is rotate the $q^0$ contour to Euclidean signature by substituting $q^0_L = iq^0_E$, leading to
$$\Sigma(p^2) = \frac{\lambda_0}{2}\int\frac{d^dq}{(2\pi)^d}\frac{1}{q^2+m_0^2}, \tag{11.3}$$
where we can drop the $i\epsilon$ since the denominator of the propagator is now positive-definite. We can rewrite this in radial coordinates as
$$\Sigma(p^2) = \frac{\lambda_0}{2}\frac{\Omega_{d-1}}{(2\pi)^d}\int_0^\infty\frac{q^{d-1}dq}{q^2+m_0^2}, \tag{11.4}$$
which is an integral that diverges at large q for d ≥ 2. To make sense of the integral we therefore need
to regulate it in some way. We will consider four regulators in turn, understanding the advantages and
disadvantages of each.

11.1.1 Lattice regulator


The most physical way to regulate a quantum field theory at short distance is by using a spacetime lattice. For example in Euclidean signature we could say that the field $\phi(x)$ only lives at the points
$$(x^0_E, x^1, \ldots, x^{d-1}) = (an^0, an^1, \ldots, an^{d-1}), \tag{11.5}$$
where $a$ is some short distance scale called the lattice spacing and the $n^\mu$ are integers. This is called a cubic spacetime lattice. When we introduce the Fourier transform on a lattice, the momenta which can appear are restricted in an interesting way. This is because when $x\in a\mathbb{Z}$ we have
$$e^{i\left(p+\frac{2\pi m}{a}\right)x} = e^{ipx} \tag{11.6}$$
for any integer $m$. To get a genuinely independent set of Fourier modes, we should therefore restrict to $p$ in the range
$$p\in\left(-\frac{\pi}{a},\frac{\pi}{a}\right). \tag{11.7}$$
What this does in loop integrals is that it restricts each component of $q^\mu_E$ to lie in this range. Since at finite $a$ this range is finite, this rule assigns a finite value to all loop integrals.
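The periodicity (11.6) can be seen directly in a few lines of code. This sketch evaluates plane waves on the sites of a one-dimensional lattice and folds an arbitrary momentum back into the fundamental range; the helper names and the half-open convention $[-\pi/a, \pi/a)$ used in the folding function are choices made here for illustration.

```python
import cmath
import math


def plane_wave(p, x):
    """The Fourier mode e^{i p x}."""
    return cmath.exp(1j * p * x)


def fold_to_fundamental_range(p, a):
    """Map p to the equivalent lattice momentum in [-pi/a, pi/a)."""
    width = 2 * math.pi / a
    return (p + math.pi / a) % width - math.pi / a


a = 0.5   # lattice spacing (illustrative value)
p = 1.3   # some momentum inside the fundamental range
# Largest mismatch between shifted and unshifted modes over a few lattice sites.
max_mismatch = max(
    abs(plane_wave(p + 2 * math.pi * m / a, a * n) - plane_wave(p, a * n))
    for n in range(-3, 4)
    for m in (-2, 1, 5)
)
```

On lattice sites the shifted and unshifted modes agree up to rounding error, so momenta differing by $2\pi m/a$ really are the same Fourier mode. Off the lattice (for example at $x = a/2$) they would differ, which is why the restriction is special to the lattice.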
72 I’ve relabeled the bare coupling constant to $\lambda_0$, anticipating that some renormalization will be necessary to convert this to a “physical” coupling $\lambda$.

Figure 29: Momentum integration regions for lattice (on the left) and hard momentum cutoff (on the right)
regulators in 1 + 1 dimensions.

In most cases lattice regularization is by far the best way to do non-perturbative calculations in interacting
quantum field theories. You “simply” put the Euclidean path integral on a lattice and then evaluate it with
a big computer using the Monte Carlo method. It is also the best way to think about regularization in
quantum field theory, as there is a clear physical picture of what is going on. Unfortunately however the
lattice regulator is rather awkward for concrete calculations in perturbation theory, as the region over which
the momentum integral is evaluated breaks most of the Euclidean symmetry of the problem (see figure
29). Lattice regulators are therefore rarely used in perturbative calculations. For example even the d = 2 lattice-regulated self-energy integral
$$\Sigma(p^2) = \frac{\lambda_0}{2}\int_{-\pi/a}^{\pi/a}\frac{dk^0_E}{2\pi}\int_{-\pi/a}^{\pi/a}\frac{dk^1}{2\pi}\frac{1}{(k^0_E)^2+(k^1)^2+m_0^2} \tag{11.8}$$
is too hard for me to evaluate.[73]

11.1.2 Hard momentum cutoff regulator


There is an obvious way to “improve” the lattice regulation to be more symmetric: we can simply cut off the momentum integral in a spherically-symmetric way,
$$q^2\leq\Lambda^2, \tag{11.9}$$
where $\Lambda$ is some fixed large energy scale. This makes the integral much easier, as we can now go to radial coordinates:
$$\Sigma(p^2) = \frac{\lambda_0}{2}\frac{\Omega_{d-1}}{(2\pi)^d}\int_0^\Lambda\frac{q^{d-1}dq}{q^2+m_0^2}. \tag{11.10}$$
This integral can be done for general d ≥ 0 in terms of a hypergeometric function, but it is perhaps more instructive to just give the answers for d = 1, 2, 3, 4:
$$\Sigma(p^2) = \frac{\lambda_0}{2}\frac{\Omega_{d-1}}{(2\pi)^d}\times\begin{cases}\frac{1}{m_0}\arctan\left(\frac{\Lambda}{m_0}\right) & d=1\\[4pt] \log\left(\frac{\Lambda}{m_0}\right)+\frac{1}{2}\log\left(1+\frac{m_0^2}{\Lambda^2}\right) & d=2\\[4pt] \Lambda-m_0\arctan\left(\frac{\Lambda}{m_0}\right) & d=3\\[4pt] \frac{\Lambda^2}{2}+\frac{m_0^2}{2}\log\left(\frac{m_0^2}{\Lambda^2+m_0^2}\right) & d=4\end{cases}. \tag{11.11}$$
73 Mathematica did give me some terrifying expression, but when I asked it to expand this answer for small $a$ it gave me 1.5 MB of garbage.

Expanding these at large $\Lambda$ we have
$$\Sigma(p^2) = \frac{\lambda_0}{2}\frac{\Omega_{d-1}}{(2\pi)^d}\times\begin{cases}\frac{\pi}{2m_0}+\ldots & d=1\\[4pt] \log\frac{\Lambda}{m_0}+\ldots & d=2\\[4pt] \Lambda-\frac{m_0\pi}{2}+\ldots & d=3\\[4pt] \frac{\Lambda^2}{2}-m_0^2\log\frac{\Lambda}{m_0}+\ldots & d=4\end{cases}, \tag{11.12}$$
where in each case “. . .” indicates terms which vanish as $\Lambda\to\infty$. The d = 1 case gives a (finite) quantum correction to the frequency of the quartic anharmonic oscillator, while for d = 2, 3, 4 we see that we have increasingly divergent corrections to the mass.
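If you would like to double-check the closed forms in (11.11) without a computer algebra system, a simple numerical integration is enough. The sketch below uses a hand-written composite Simpson rule (an implementation choice made here, not something from the notes) to evaluate the radial integral in (11.10) and compares it with the four closed-form expressions.

```python
import math


def simpson(f, lo, hi, n=20000):
    """Composite Simpson rule on [lo, hi] with an even number n of subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3


def cutoff_integral_numeric(d, cutoff, m0):
    """Numerically evaluate int_0^cutoff q^(d-1) dq / (q^2 + m0^2), as in eq. (11.10)."""
    return simpson(lambda q: q ** (d - 1) / (q * q + m0 * m0), 0.0, cutoff)


def cutoff_integral_closed_form(d, cutoff, m0):
    """The closed forms listed in eq. (11.11) for d = 1, 2, 3, 4."""
    if d == 1:
        return math.atan(cutoff / m0) / m0
    if d == 2:
        return math.log(cutoff / m0) + 0.5 * math.log(1 + m0 ** 2 / cutoff ** 2)
    if d == 3:
        return cutoff - m0 * math.atan(cutoff / m0)
    if d == 4:
        return cutoff ** 2 / 2 + 0.5 * m0 ** 2 * math.log(m0 ** 2 / (cutoff ** 2 + m0 ** 2))
    raise ValueError("closed form only tabulated for d = 1, 2, 3, 4")
```

Increasing `cutoff` at fixed `m0` then shows the pattern in (11.12): the d = 1 value saturates, d = 2 grows logarithmically, and d = 3, 4 grow like $\Lambda$ and $\Lambda^2$.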

11.1.3 Pauli-Villars regularization


The hard momentum cutoff has made our life somewhat easier, but for more complicated loop integrals it still leads to several problems. In particular if our loop diagram has several propagators where the same loop momentum appears shifted in different ways, such as we had for the one-loop contribution to 2 → 2 scattering at the end of the previous lecture, then a hard momentum regulator still isn’t rotationally invariant. It also leads to difficulties once we introduce gauge fields such as the electromagnetic field. It turns out to be nicer to “gradually” turn off the contributions of large momenta rather than brutally discarding them completely. This is called Pauli-Villars regularization, with the canonical approach being to modify all propagators via the replacement
$$\frac{-i}{p^2+m_0^2-i\epsilon}\to\frac{-i}{p^2+m_0^2-i\epsilon}+\frac{i}{p^2+\Lambda^2-i\epsilon}. \tag{11.13}$$
In Euclidean signature this becomes
$$\frac{1}{p^2+m_0^2}\to\frac{1}{p^2+m_0^2}-\frac{1}{p^2+\Lambda^2}. \tag{11.14}$$
The new term in the propagator here is small when $p^2\ll\Lambda^2$, but for $p^2\gg\Lambda^2$ it improves the high-momentum behavior of the propagator from $1/p^2$ to $1/p^4$. This improves the convergence of loop integrals.
Unfortunately the canonical Pauli-Villars regulation (11.13) doesn’t always render loop integrals finite. For example our self-energy integral in d = 4 is still logarithmically divergent at high momentum, and in higher dimensions things are only worse. To deal with this we will instead consider an “improved” Pauli-Villars regulator, which in Euclidean signature modifies the propagator as
$$\frac{1}{p^2+m_0^2}\to\frac{e^{-p^2/\Lambda^2}}{p^2+m_0^2}. \tag{11.15}$$
The exponential factor is close to one when $p^2\ll\Lambda^2$ just as before, but now for $p^2\gg\Lambda^2$ the exponential suppression ensures that all loop integrals will be finite in any dimension and for any number of propagators. Making use of Mathematica, with this regulator the self-energy of our scalar field theory becomes
$$\Sigma(p^2) = \frac{\lambda_0}{2}\frac{\Omega_{d-1}}{(2\pi)^d}\times\begin{cases}\frac{\pi}{2m_0}+\ldots & d=1\\[4pt] \log\frac{\Lambda}{m_0}-\frac{\gamma}{2}+\ldots & d=2\\[4pt] \frac{\sqrt{\pi}}{2}\Lambda-\frac{m_0\pi}{2}+\ldots & d=3\\[4pt] \frac{\Lambda^2}{2}-m_0^2\log\frac{\Lambda}{m_0}+\frac{\gamma}{2}m_0^2+\ldots & d=4\end{cases}, \tag{11.16}$$
where again “. . .” indicates terms which vanish as $\Lambda\to\infty$ and
$$\gamma := \lim_{n\to\infty}\left(\sum_{k=1}^n\frac{1}{k}-\log n\right)\approx 0.577 \tag{11.17}$$

is called the Euler-Mascheroni constant.
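The limit definition (11.17) converges slowly but is easy to evaluate directly. As a small illustration (the function name is made up here), the following sketch computes the partial expression at finite $n$; the leading error is known to be $1/(2n)$, so subtracting that term gives a much sharper estimate.

```python
import math


def euler_mascheroni_estimate(n):
    """Finite-n value of sum_{k=1}^n 1/k - log(n) from eq. (11.17)."""
    harmonic = math.fsum(1.0 / k for k in range(1, n + 1))
    return harmonic - math.log(n)


n = 10 ** 5
raw = euler_mascheroni_estimate(n)
# Remove the leading 1/(2n) correction to the limit.
improved = raw - 1.0 / (2 * n)
```

With $n = 10^5$ the raw value is already within about $5\times10^{-6}$ of $\gamma$, and the improved one agrees to roughly eleven digits.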
Already a pattern is hopefully apparent: the power-law divergent contributions to $\Sigma(p^2)$ are different for different choices of regulator, but when there is a logarithmic divergence its coefficient is universal and when there is no logarithmic divergence the finite term is also universal. The finite piece cannot be universal when there is a logarithmic divergence since we can always rescale the cutoff:
$$a\log\frac{b\Lambda}{m_0} = a\log\frac{\Lambda}{m_0}+a\log b. \tag{11.18}$$
As you can see this changes the finite piece but doesn’t change the coefficient of the logarithm. Such rescalings of course also change the coefficients of any power-law divergences.
Something else is also hopefully clear: the relationship between the bare and physical mass is regularization-dependent. For example, for d = 4 we have at one loop
$$m^2 = m_0^2+\frac{\lambda_0}{16\pi^2}\times\begin{cases}\frac{\Lambda^2}{2}-m_0^2\log\frac{\Lambda}{m_0} & \text{hard momentum cutoff}\\[4pt] \frac{\Lambda^2}{2}-m_0^2\log\frac{\Lambda}{m_0}+\frac{\gamma}{2}m_0^2 & \text{improved Pauli-Villars}\end{cases}. \tag{11.19}$$
Thus the value of $m_0$ we should choose to match the observed value of $m$ depends on which regularization scheme we use. This is sometimes described by saying that bare masses are scheme-dependent.

11.1.4 Dimensional regularization


There is one more regularization scheme we will consider. It is simultaneously the most useful and the least
physical - the dimensional regularization method of ’t Hooft and Veltman. The basic problem with
choosing a regulator is that it is usually hard to choose a regulator which preserves all symmetries of your
theory. What ’t Hooft and Veltman proposed to get around this is the following hack:
• Evaluate all loop integrals in low enough spacetime dimension d that they are convergent. Once you have the result, analytically continue back to the dimension you are actually interested in.
It is hopefully clear that this is only a hack. For example as far as I am aware it doesn’t actually make
sense to think about quantum field theory in 3.5 spacetime dimensions. On the other hand this procedure
clearly will give some kind of well-defined answer, and as long as we only assign physical interpretations to
universal quantities we can hope that this answer is the same as what we’d have gotten from some more
physical regularization such as a lattice.
Let’s see how this works in practice for the self-energy. The integral we need to evaluate is
$$\int_0^\infty\frac{q^{d-1}dq}{q^2+m_0^2}, \tag{11.20}$$
which is convergent for 0 < d < 2 and gives [74]
$$\int_0^\infty\frac{q^{d-1}dq}{q^2+m_0^2} = \frac{\pi m_0^{d-2}}{2\sin\frac{d\pi}{2}}. \tag{11.21}$$
Combining this with the expression
$$\Omega_{d-1} = \frac{2\pi^{d/2}}{\Gamma(d/2)} \tag{11.22}$$
that we derived last time, we have
$$\Sigma(p^2) = \frac{\lambda_0}{2}\frac{2\pi^{d/2}}{\Gamma(d/2)}\frac{1}{(2\pi)^d}\frac{\pi m_0^{d-2}}{2\sin\frac{d\pi}{2}}. \tag{11.23}$$
74 This is a special case of the integral (11.47) below, as you can check using the Γ function reflection formula $\Gamma(z)\Gamma(1-z) = \frac{\pi}{\sin(\pi z)}$.

Let’s now think a bit about the analytic structure of this expression. Most of the factors are well-behaved for positive d, but the $\sin\frac{d\pi}{2}$ in the denominator leads to poles at each even value of d. Let’s first therefore consider the odd values - in particular we have
$$\frac{\pi m_0^{d-2}}{2\sin\frac{d\pi}{2}} = \begin{cases}\frac{\pi}{2m_0} & d=1\\[4pt] -\frac{\pi m_0}{2} & d=3\end{cases}. \tag{11.24}$$

These are precisely the universal finite contributions we found using hard momentum cutoff and improved
Pauli-Villars above! For d = 1 this is no mystery, since anyways the integral is convergent so the regulator
can’t matter, but for d = 3 the dimensional regularization method has automatically removed the linear
divergence but kept the correct finite piece.
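The convergent-regime formula (11.21) can also be checked numerically for non-integer $d$, which makes the "analytic continuation in $d$" feel a bit less mysterious. In this sketch we substitute $q = e^t$ so that the integrand decays exponentially in both directions for $0 < d < 2$, and then apply a plain trapezoid rule. The substitution, the truncation range `t_max`, and the step `h` are choices made here for illustration.

```python
import math


def radial_integral_numeric(d, m0, t_max=80.0, h=0.01):
    """Trapezoid estimate of int_0^inf q^(d-1) dq / (q^2 + m0^2) for 0 < d < 2.

    With q = e^t the integral becomes int dt e^(d t) / (e^(2 t) + m0^2),
    truncated to -t_max <= t <= t_max.
    """
    n = int(round(2 * t_max / h))
    total = 0.0
    for i in range(n + 1):
        t = -t_max + i * h
        weight = 0.5 if i in (0, n) else 1.0  # endpoint weights of the trapezoid rule
        total += weight * math.exp(d * t) / (math.exp(2 * t) + m0 * m0)
    return total * h


def radial_integral_closed_form(d, m0):
    """Right-hand side of eq. (11.21)."""
    return math.pi * m0 ** (d - 2) / (2 * math.sin(d * math.pi / 2))
```

The closed form agrees with the numerics for, say, d = 0.5, 1, 1.5, and evaluating it at d = 3 (where the original integral diverges) returns $-\pi m_0/2$, the value quoted in (11.24).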
In even dimensions we need to be more careful due to the poles. The basic idea is to work in d = 2(n − ϵ)
dimensions, in which case the pole at d = 2n will show up as a factor of 1/ϵ. To expand (11.23) near d = 2n
we need two pieces of information. The first is the behavior of the sin factor near d = 2n, which is easily
shown to be
1 (−1)n+1
1 + O(ϵ2 ) .

= (11.25)
sin(π(n − ϵ)) πϵ
We also need to know how to deal with the Γ(n + ϵ) in the denominator of (11.23). This is a bit trickier:
from the Taylor expansion we have

Γ(n − ϵ) = Γ(n) 1 − ψ(n)ϵ + O(ϵ2 ) ,



(11.26)

where
Γ′ (z)
ψ(z) = (11.27)
Γ(z)
is called the digamma function. For our purposes it is useful to know that by taking the logarithm of the
Γ-function recursion relation Γ(z + 1) = zΓ(z) and then differentiating we see that the digamma function
obeys
1
ψ(z + 1) = ψ(z) + , (11.28)
z
and thus
n−1
X1
ψ(n) = + ψ(1) (11.29)
k
k=1

for any positive integer n. Computing $\psi(1) = \Gamma'(1)$ is a bit tricky, but after a nasty integral evaluation (or
more elegantly by using the Weierstrass product representation of Γ) one finds

ψ(1) = −γ (11.30)

and thus
\[
\psi(n) = \sum_{k=1}^{n-1}\frac{1}{k} - \gamma. \tag{11.31}
\]
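The relation between the digamma function at positive integers and the harmonic sum can be checked numerically. The sketch below approximates ψ(z) by a central finite difference of log Γ (using the standard-library `math.lgamma`) and compares it with the harmonic sum minus γ. The helper names, the step size `h`, and the hard-coded value of the Euler-Mascheroni constant are assumptions made for illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def digamma_numeric(z, h=1e-5):
    """Approximate psi(z) = d/dz log Gamma(z) by a central difference."""
    return (math.lgamma(z + h) - math.lgamma(z - h)) / (2.0 * h)

def digamma_integer(n):
    """psi(n) = sum_{k=1}^{n-1} 1/k - gamma for a positive integer n."""
    return sum(1.0 / k for k in range(1, n)) - EULER_GAMMA

for n in range(1, 6):
    print(n, digamma_numeric(n), digamma_integer(n))
```

The two columns should agree up to the finite-difference error, which is far below the precision needed here.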

We will also at several points need to use the fact that for a > 0 we have

\[
a^\epsilon = e^{\epsilon\log a} = 1 + \epsilon\log a + O(\epsilon^2). \tag{11.32}
\]

Using all these we have


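\[
\Sigma(p^2) = \frac{\lambda_0}{2}\,\frac{\Omega_{2n-1}}{(2\pi)^{2n}} \times \frac{(-1)^{n+1} m_0^{2n-2}}{2}\left(\frac{1}{\epsilon} - \log(4\pi^3) - \gamma + \sum_{k=1}^{n-1}\frac{1}{k} - \log m_0^2 + O(\epsilon)\right), \tag{11.33}
\]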

so again we see that the logarithmic term in m0 matches the logarithmic term we got from the hard cutoff
and improved Pauli-Villars regulators.

You may be puzzled about how we were able to get a dimensionful quantity ($m_0^2$) inside of a logarithm
in (11.33). To understand this we need to account for the dimensions of the bare coupling constant λ0. In d
spacetime dimensions a scalar field needs to have units of energy to the (d − 2)/2, since we need the kinetic
term $\partial_\mu\phi\,\partial^\mu\phi$ to have energy dimension d so that integrating it against $d^dx$ gives a dimensionless quantity.
The interaction term $\lambda_0\phi^4$ must also have energy dimension d, which means that the energy dimension of λ0
is (4 − d). Energy dimensions are a very useful notion in quantum field theory, so there is a special notation
for them: if a quantity O has units of energy to the ∆, then we write

[O] = ∆. (11.34)

For example in our scalar theory we have

\[
[\mathcal{L}] = d, \qquad
[\phi] = \frac{d-2}{2}, \qquad
[m_0^2] = 2, \qquad
[\lambda_0] = 4-d. \tag{11.35}
\]

When doing dimensional regularization we wish to expand things around d = 2n, so we can write the bare
coupling “near” d = 2n in terms of the bare coupling “at” d = 2n as
\[
\lambda_0^{\{2(n-\epsilon)\}} = \mu^{2\epsilon}\lambda_0^{\{2n\}} = \lambda_0^{\{2n\}}\left(1 + \epsilon\log\mu^2 + O(\epsilon^2)\right), \tag{11.36}
\]

where µ is an arbitrary quantity with energy dimension one. Substituting this into (11.33) we get
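\[
\Sigma(p^2) = \frac{\lambda_0^{\{2n\}}}{2}\,\frac{\Omega_{2n-1}}{(2\pi)^{2n}} \times \frac{(-1)^{n+1} m_0^{2n-2}}{2}\left(\frac{1}{\epsilon} - \log(4\pi^3) - \gamma + \sum_{k=1}^{n-1}\frac{1}{k} + \log\frac{\mu^2}{m_0^2} + O(\epsilon)\right), \tag{11.37}
\]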

which looks more sensible dimensionally. The scale µ is called the renormalization scale; we will discuss
its physical interpretation below. Putting everything together we have the expressions
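\[
m^2 = m_0^2 +
\begin{cases}
\frac{\lambda_0}{4m_0} & d=1 \\
\frac{\lambda_0}{8\pi}\left(\frac{1}{\epsilon} - \gamma + \log\frac{\mu^2}{4\pi^3 m_0^2}\right) & d=2 \\
-\frac{\lambda_0 m_0}{8\pi} & d=3 \\
-\frac{\lambda_0 m_0^2}{32\pi^2}\left(\frac{1}{\epsilon} - \gamma + 1 + \log\frac{\mu^2}{4\pi^3 m_0^2}\right) & d=4
\end{cases} \tag{11.38}
\]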

for determining the physical mass at one-loop in terms of the bare mass and coupling in dimensional regularization.
In practice, however, this expression is usually used in the opposite direction: the physical mass m
is measured and then we use this formula to determine m0.

11.2 Two-to-two scattering at one loop


Let’s now see how we can use these ideas to study the one-loop contribution to 2 → 2 scattering in ϕ4 theory
that we computed last time:
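\[
i\widetilde{\mathcal{M}}_c(k_1,k_2\to k_1',k_2') \supset -\frac{\lambda_0^2}{2}\int\frac{d^d\ell}{(2\pi)^d}\,\frac{-i}{\ell^2+m_0^2-i\epsilon}\left[\frac{-i}{(\ell+k_1'-k_1)^2+m_0^2-i\epsilon} + \frac{-i}{(\ell-k_1-k_2)^2+m_0^2-i\epsilon} + \frac{-i}{(\ell+k_2-k_1')^2+m_0^2-i\epsilon}\right]. \tag{11.39}
\]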

The external momenta ki , ki′ here should be taken on-shell to give a genuine scattering matrix element,
but for now it is convenient to allow them to take general values so that we can analytically continue to
Euclidean signature:
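\[
i\widetilde{\mathcal{M}}_c(k_1,k_2\to k_1',k_2') \supset \frac{i\lambda_0^2}{2}\int\frac{d^d\ell}{(2\pi)^d}\,\frac{1}{\ell^2+m_0^2}\left[\frac{1}{(\ell+k_1'-k_1)^2+m_0^2} + \frac{1}{(\ell-k_1-k_2)^2+m_0^2} + \frac{1}{(\ell+k_2-k_1')^2+m_0^2}\right]. \tag{11.40}
\]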
This integral is convergent for 0 < d < 4 and logarithmically divergent for d = 4. We could study it using
any of the regulators we discussed above, but we will stick to dimensional regularization so for now we are
assuming that d is in the convergent range. The three terms are the same integral evaluated at three different
momenta, so we just need to figure out how to compute
\[
I(q) = \int\frac{d^d\ell}{(2\pi)^d}\,\frac{1}{\ell^2+m_0^2}\,\frac{1}{(\ell+q)^2+m_0^2} \tag{11.41}
\]
for general Euclidean q (we will eventually analytically continue back to on-shell Lorentzian q). This integral
may look difficult, but there is a clever trick due to Feynman which makes it tractable: we use the identity
(that you will derive in the homework)
\[
\frac{1}{AB} = \int_0^1\frac{dx}{\left(xA+(1-x)B\right)^2}, \tag{11.42}
\]

which is valid for A, B > 0, to combine the denominators:


\begin{align}
x(\ell^2+m_0^2) + (1-x)\left((\ell+q)^2+m_0^2\right) &= \ell^2 + 2(1-x)\,\ell\cdot q + (1-x)q^2 + m_0^2 \nonumber\\
&= \left(\ell+(1-x)q\right)^2 + x(1-x)q^2 + m_0^2, \tag{11.43}
\end{align}
and thus
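\begin{align}
I(q) &= \int_0^1 dx\int\frac{d^d\ell}{(2\pi)^d}\,\frac{1}{\left((\ell+(1-x)q)^2 + x(1-x)q^2 + m_0^2\right)^2} \nonumber\\
&= \int_0^1 dx\int\frac{d^d\ell}{(2\pi)^d}\,\frac{1}{\left(\ell^2 + x(1-x)q^2 + m_0^2\right)^2} \tag{11.44}\\
&= \frac{\Omega_{d-1}}{(2\pi)^d}\int_0^1 dx\int_0^\infty\frac{\ell^{d-1}\,d\ell}{\left(\ell^2 + x(1-x)q^2 + m_0^2\right)^2}. \tag{11.45}
\end{align}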
In going to the second line we have made an additive shift of the integration variable, and in going to the
third we changed to radial coordinates. Here x is called a Feynman parameter. The propagators in more
general loop diagrams can be combined using multiple Feynman parameters; for example we have
\[
\frac{1}{ABC} = \int_0^1 dx\int_0^{1-x} dy\,\frac{2}{\left(xA + yB + (1-x-y)C\right)^3}. \tag{11.46}
\]
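The two-denominator Feynman parameter identity (11.42) is easy to test numerically before deriving it. The sketch below integrates the right-hand side with a composite Simpson rule and compares with 1/(AB); the quadrature helper, the number of subintervals, and the sample values of A and B are illustrative assumptions.

```python
def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def feynman_rhs(A, B):
    """Right-hand side of (11.42): integral over x of 1/(xA + (1-x)B)^2."""
    return simpson(lambda x: 1.0 / (x * A + (1.0 - x) * B) ** 2, 0.0, 1.0)

A, B = 2.0, 5.0  # any positive values should work
print(feynman_rhs(A, B), 1.0 / (A * B))
```

For A, B > 0 the integrand is smooth on [0, 1], so a modest number of Simpson subintervals already reproduces 1/(AB) to high accuracy.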
The remaining momentum integral in (11.45) can then be evaluated using the general formula (that you
will derive in the homework)
\[
\int_0^\infty d\ell\,\frac{\ell^{a-1}}{(\ell^2+\sigma^2)^b} = \frac{\sigma^{a-2b}}{2}\,\frac{\Gamma(a/2)\,\Gamma(b-a/2)}{\Gamma(b)}, \tag{11.47}
\]
which is valid for σ > 0 and 0 < a < 2b, giving us
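\begin{align}
I(q) &= \frac{\Omega_{d-1}}{(2\pi)^d}\,\frac{1}{2}\int_0^1 dx\,\left(m_0^2 + x(1-x)q^2\right)^{\frac{d-4}{2}}\frac{\Gamma(d/2)\,\Gamma(2-d/2)}{\Gamma(2)} \nonumber\\
&= \frac{\Gamma(2-d/2)}{(4\pi)^{d/2}}\int_0^1 dx\,\left(m_0^2 + x(1-x)q^2\right)^{\frac{d-4}{2}}. \tag{11.48}
\end{align}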
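The radial integral formula (11.47) can also be spot-checked numerically. The sketch below uses the substitution ℓ = σ tan θ (an assumption of this example, different from the strategy suggested in the homework), which maps the integral to a finite interval, and then applies a midpoint rule. The function names and sample parameters are illustrative.

```python
import math

def radial_integral_numeric(a, b, sigma, n=20000):
    """Integral of l^(a-1)/(l^2+sigma^2)^b over l in (0, inf).

    With l = sigma*tan(theta) the integrand becomes
    sigma^(a-2b) * sin(theta)^(a-1) * cos(theta)^(2b-a-1) on (0, pi/2).
    """
    h = (math.pi / 2.0) / n
    total = 0.0
    for i in range(n):
        theta = (i + 0.5) * h  # midpoints avoid the endpoints
        total += math.sin(theta) ** (a - 1) * math.cos(theta) ** (2 * b - a - 1)
    return sigma ** (a - 2 * b) * total * h

def radial_integral_formula(a, b, sigma):
    """Right-hand side of (11.47), valid for sigma > 0 and 0 < a < 2b."""
    return (sigma ** (a - 2 * b) / 2.0
            * math.gamma(a / 2.0) * math.gamma(b - a / 2.0) / math.gamma(b))

for a, b, sigma in [(3, 2, 1.3), (1, 1, 0.7), (2, 2, 2.0)]:
    print(a, b, sigma,
          radial_integral_numeric(a, b, sigma),
          radial_integral_formula(a, b, sigma))
```

The case a = 3, b = 2 is the one needed for d = 3 in (11.45).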

Defining the Mandelstam variables
\begin{align}
s &= -(k_1+k_2)^2 \nonumber\\
t &= -(k_1'-k_1)^2 \nonumber\\
u &= -(k_1'-k_2)^2, \tag{11.49}
\end{align}
we can then write the one-loop contribution to the scattering amplitude as
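\[
i\widetilde{\mathcal{M}}_c(k_1,k_2\to k_1',k_2') \supset \frac{i\lambda_0^2}{2}\,\frac{\Gamma(2-d/2)}{(4\pi)^{d/2}}\int_0^1 dx\left[\left(m_0^2 - x(1-x)s\right)^{\frac{d-4}{2}} + \left(m_0^2 - x(1-x)t\right)^{\frac{d-4}{2}} + \left(m_0^2 - x(1-x)u\right)^{\frac{d-4}{2}}\right]. \tag{11.50}
\]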
We will now focus on the cases of d = 3 and d = 4.
For d = 3 we are in the convergent region, so we simply have
\[
I(q) = \frac{1}{8\pi}\int_0^1\frac{dx}{\sqrt{m_0^2 + x(1-x)q^2}}. \tag{11.51}
\]
For $q^2 > 0$ this integral gives
\[
I(q) = \frac{1}{8\pi}\,\frac{\pi - 2\arctan\left(\frac{2m_0}{\sqrt{q^2}}\right)}{\sqrt{q^2}}, \tag{11.52}
\]
which is the regime we need for the terms involving t and u since on shell we always have t < 0 and u < 0.
To compute the integral involving s we need to restore the iϵ by taking $q^2 = -(s+i\epsilon)$, which leads to
\[
I(q) = \frac{1}{8\pi}\,\frac{i\pi + \log\left(\frac{s+2m_0\sqrt{s}}{s-2m_0\sqrt{s}}\right)}{\sqrt{s}} \tag{11.53}
\]
where we have used that $s \ge 4m_0^2$. We won’t have too much to say about these results, but one comment is
that at this order we can replace m0 → m since the difference between the two is higher-order in λ0, so the
one loop contribution to 2 → 2 scattering is finite without any further renormalization. It also decays with
energy, since you will show in the homework that s is essentially just the center of mass energy squared.
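For spacelike momenta the closed form (11.52) can be compared directly against a numerical evaluation of the Feynman parameter integral (11.51). The sketch below does this with a composite Simpson rule; the helper names and the sample values of q² and m0 are illustrative assumptions.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson rule on [lo, hi] with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4.0 if i % 2 else 2.0) * f(lo + i * h)
    return total * h / 3.0

def loop_integral_d3_numeric(q2, m0):
    """I(q) in d = 3 from the Feynman parameter integral (11.51), for q^2 > 0."""
    integrand = lambda x: 1.0 / math.sqrt(m0 ** 2 + x * (1.0 - x) * q2)
    return simpson(integrand, 0.0, 1.0) / (8.0 * math.pi)

def loop_integral_d3_closed(q2, m0):
    """I(q) in d = 3 from the closed form (11.52), for q^2 > 0."""
    q = math.sqrt(q2)
    return (math.pi - 2.0 * math.atan(2.0 * m0 / q)) / (8.0 * math.pi * q)

print(loop_integral_d3_numeric(3.0, 1.0), loop_integral_d3_closed(3.0, 1.0))
# As q^2 -> 0 both expressions approach 1/(8 pi m0).
print(loop_integral_d3_closed(1e-8, 1.0), 1.0 / (8.0 * math.pi))
```

The large-q² falloff of the closed form, roughly like one over the square root of q², is the decay with energy mentioned above.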
For d = 4 we need to be more careful due to the pole in Γ(2 − d/2). Setting d = 4 − 2ϵ, we have
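\[
\Gamma(\epsilon) = \frac{\Gamma(1+\epsilon)}{\epsilon} = \frac{1}{\epsilon}\left(1 + \psi(1)\epsilon + \ldots\right) = \frac{1}{\epsilon} - \gamma + O(\epsilon). \tag{11.54}
\]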
Using this in (11.48), together with (11.32), we have
\[
I(q) = \frac{1}{16\pi^2}\left(\frac{1}{\epsilon} - \gamma + \log(4\pi) - \int_0^1 dx\,\log\left(m_0^2 + x(1-x)q^2\right)\right). \tag{11.55}
\]

To turn this into an expression for $\widetilde{\mathcal{M}}_c$ we need to again be careful about units near 4 dimensions: noting
that
\[
[\widetilde{\mathcal{M}}_c] = [\lambda_0] = 4-d, \tag{11.56}
\]
with factors of $\mu^\epsilon$ inserted so that $\widetilde{\mathcal{M}}_c$ and λ0 have their d = 4 units we have
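\[
i\widetilde{\mathcal{M}}_c = -i\lambda_0 + \frac{i\lambda_0^2}{32\pi^2}\left(\frac{3}{\epsilon} - 3\gamma + 3\log(4\pi) + \int_0^1 dx\left[\log\frac{\mu^2}{m_0^2 - sx(1-x)} + \log\frac{\mu^2}{m_0^2 - tx(1-x)} + \log\frac{\mu^2}{m_0^2 - ux(1-x)}\right]\right). \tag{11.57}
\]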
This expression is indeed UV divergent as ϵ → 0. Before discussing how to fix this, I’ll mention that in a
more physical regularization scheme (such as a lattice or Pauli-Villars) we’d instead have found
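\[
i\widetilde{\mathcal{M}}_c = -i\lambda_0 + \frac{i\lambda_0^2}{32\pi^2}\int_0^1 dx\left[\log\frac{\Lambda^2}{m_0^2 - sx(1-x)} + \log\frac{\Lambda^2}{m_0^2 - tx(1-x)} + \log\frac{\Lambda^2}{m_0^2 - ux(1-x)}\right] \tag{11.58}
\]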

where Λ is the UV cutoff. Either way we are now ready for the key question: what are we supposed to do
about this UV divergence?
There is only one sensible thing to do: we absorb this divergence into a redefinition of the bare coupling
constant λ0 . In the context of the bare mass m0 we had a physical motivation for doing this: we wanted
to write things in terms of the physical mass m instead of the scheme-dependent bare mass m0 . Is there a
similar justification here? Indeed there is - the bare coupling λ0 is no more directly measurable than the
bare mass m0 . What is measurable is the 2 → 2 S-matrix, so the simplest thing we can do is simply define a
physical coupling λ so that the exact 2 → 2 S-matrix is equal to its tree-level value at some preferred choice
for the initial momenta. More concretely we will impose that
\[
i\widetilde{\mathcal{M}}_c\Big|_{s=4m^2,\,t=0,\,u=0} = -i\lambda. \tag{11.59}
\]

Looking at our above expressions, this requires that


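\[
\lambda = \lambda_0 - \frac{\lambda_0^2}{32\pi^2}\left(\frac{3}{\epsilon} - 3\gamma + 3\log(4\pi) + 2\log\frac{\mu^2}{m_0^2} + \int_0^1 dx\,\log\frac{\mu^2}{m_0^2 - 4m^2x(1-x)}\right) \tag{11.60}
\]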
in dimensional regularization and
\[
\lambda = \lambda_0 - \frac{\lambda_0^2}{32\pi^2}\left(2\log\frac{\Lambda^2}{m_0^2} + \int_0^1 dx\,\log\frac{\Lambda^2}{m_0^2 - 4m^2x(1-x)}\right) \tag{11.61}
\]
in a more physical regularization scheme. At the order we are working in λ0 we can easily rewrite these as
expressions for the bare coupling in terms of the physical mass and the physical coupling:
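\begin{align}
\lambda_0 &= \lambda + \frac{\lambda^2}{32\pi^2}\left(\frac{3}{\epsilon} - 3\gamma + 3\log(4\pi) + 2\log\frac{\mu^2}{m^2} + \int_0^1 dx\,\log\frac{\mu^2}{m^2 - 4m^2x(1-x)}\right) \nonumber\\
&= \lambda + \frac{\lambda^2}{32\pi^2}\left(\frac{3}{\epsilon} - 3\gamma + 2 + 3\log(4\pi) + 3\log\frac{\mu^2}{m^2}\right) \tag{11.62}
\end{align}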
and
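\begin{align}
\lambda_0 &= \lambda + \frac{\lambda^2}{32\pi^2}\left(2\log\frac{\Lambda^2}{m^2} + \int_0^1 dx\,\log\frac{\Lambda^2}{m^2 - 4m^2x(1-x)}\right) \nonumber\\
&= \lambda + \frac{\lambda^2}{32\pi^2}\left(3\log\frac{\Lambda^2}{m^2} + 2\right) \tag{11.63}
\end{align}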
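The constant 2 that appears when the Feynman parameter integral is carried out at the threshold point s = 4m² comes from the integral of log(1/(1 − 4x(1 − x))) over x between 0 and 1. The sketch below checks this numerically with a midpoint rule; the integrand has an integrable logarithmic singularity at x = 1/2, so an even number of subintervals is used to keep the sample points away from it. The function names and the number of subintervals are illustrative assumptions.

```python
import math

def midpoint(f, lo, hi, n):
    """Midpoint rule on [lo, hi] with n subintervals."""
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h) for i in range(n))

def threshold_integral(n=200000):
    """Integral over x in [0, 1] of log(1 / (1 - 4x(1-x))).

    Uses 1 - 4x(1-x) = (1 - 2x)^2; n must be even so that no midpoint
    lands exactly on the singular point x = 1/2.
    """
    assert n % 2 == 0
    return midpoint(lambda x: -math.log((1.0 - 2.0 * x) ** 2), 0.0, 1.0, n)

print(threshold_integral())  # should be close to 2
```

The convergence is slow because of the logarithmic singularity, but the result is accurate enough to confirm the constant.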

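Now the moment of truth: using either (11.62) or (11.63) we can rewrite the scattering amplitude $\widetilde{\mathcal{M}}_c$ for
general s, t, u in terms of the physical mass and coupling:
\[
i\widetilde{\mathcal{M}}_c = -i\lambda + \frac{i\lambda^2}{32\pi^2}\int_0^1 dx\left[\log\left(\frac{m^2 - 4m^2x(1-x)}{m^2 - sx(1-x)}\right) + \log\left(\frac{m^2}{m^2 - tx(1-x)}\right) + \log\left(\frac{m^2}{m^2 - ux(1-x)}\right)\right]. \tag{11.64}
\]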
All UV divergences are gone, and the answer is now independent of which regularization scheme we used!
We have to fit two parameters (λ and m) to experiment, but this expression gives a function’s worth of
predictions in exchange. The integrals can again be evaluated in terms of inverse trig functions but I won’t
bother.
This argument leading to the finite and scheme-independent scattering amplitude (11.64) may have
seemed a bit like magic. Why did this happen? Does it continue to happen at higher loops and for more
complicated scattering amplitudes? Are there more parameters we need to tune, or is it just λ0 and m0 ? It
is far from obvious, but the answers to the latter two questions are “yes it continues” and “no it is just λ0
and m0 ”. Understanding why is our next order of business.
As a first indication that things may not be so mysterious, I’ll mention that the derivative of λ0 with
respect to the logarithm of either the renormalization scale µ (in dim reg) or the cutoff Λ (in a more physical
scheme) holding the physical coupling λ fixed is a very useful quantity, usually called the β-function. Here
it is given by
\[
\beta(\lambda) := \Lambda\frac{d\lambda_0}{d\Lambda} = \frac{3\lambda^2}{16\pi^2}. \tag{11.65}
\]
Note in particular that λ0 grows with energy, so when we reach a regime where $\frac{3\lambda}{16\pi^2}\log\frac{\Lambda}{m} \sim 1$, or in other
words the cutoff reaches
\[
\Lambda_{\rm strong} \sim m\,e^{\frac{16\pi^2}{3\lambda}}, \tag{11.66}
\]
then the theory becomes strongly coupled and our perturbative approach breaks down. This is usually
viewed as evidence that the continuum limit does not really exist for ϕ4 theory in d = 4. Fortunately if λ
is small this is a rather high energy scale. For example, in the standard model of particle physics the Higgs
boson is described by a scalar field whose mass is 125 GeV and whose quartic coupling is of order λ ∼ 0.1, so the
scale where the Higgs becomes strongly coupled is

\[
\Lambda_{\rm strong} \sim (125\ {\rm GeV}) \times e^{525}, \tag{11.67}
\]

which is a far higher energy scale than the Planck scale of $M_p \sim 10^{18}$ GeV where quantum gravity effects
are expected to become important. If we view our theory as having a genuine cutoff Λ at some scale which
is large compared to where we do experiments but small compared to Λstrong , then these UV divergences
start looking less scary and perhaps we will be able to tame them more systematically. Doing so is the goal
of the next lecture.
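The arithmetic behind the strong-coupling estimate is short enough to reproduce directly. The sketch below evaluates the exponent 16π²/(3λ) and the base-10 logarithm of the resulting scale for the quoted inputs (a mass of 125 GeV and λ of order 0.1); the function names are illustrative, and the inputs are the rough values used in the text rather than precise measurements.

```python
import math

def strong_coupling_exponent(lam):
    """Exponent 16 pi^2 / (3 lambda) appearing in (11.66)."""
    return 16.0 * math.pi ** 2 / (3.0 * lam)

def log10_strong_scale_gev(mass_gev, lam):
    """log10 of Lambda_strong in GeV, using Lambda_strong ~ m exp(16 pi^2 / (3 lambda))."""
    return math.log10(mass_gev) + strong_coupling_exponent(lam) / math.log(10.0)

exponent = strong_coupling_exponent(0.1)
log10_scale = log10_strong_scale_gev(125.0, 0.1)
print(exponent)     # roughly 5.3e2, consistent with the e^525 quoted above
print(log10_scale)  # number of decimal digits of Lambda_strong in GeV
print(log10_scale - 18.0)  # orders of magnitude above a Planck scale of 10^18 GeV
```

Since the scale depends exponentially on 1/λ, even a modest change in the assumed coupling moves the estimate by many orders of magnitude, so only the qualitative conclusion should be taken seriously.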

11.3 Homework
1. Check the Feynman parameter identity (11.42) for A, B > 0.
2. Evaluate the general loop integral (11.47). One strategy is the following: 1) rescale ℓ to extract the
overall power of σ, 2) rewrite the integral in terms of the Euler β-function
\[
\beta(z_1,z_2) = \int_0^1 dt\,t^{z_1-1}(1-t)^{z_2-1} \tag{11.68}
\]
using the change of variables $t = \frac{1}{1+\ell^2}$, and 3) use a famous expression for the β function,
\[
\beta(z_1,z_2) = \frac{\Gamma(z_1)\Gamma(z_2)}{\Gamma(z_1+z_2)}. \tag{11.69}
\]
To derive this last expression, start with $\Gamma(z_1)\Gamma(z_2) = \int_0^\infty ds_1\int_0^\infty ds_2\,s_1^{z_1-1}s_2^{z_2-1}e^{-s_1-s_2}$ and then use
the change of variables $s_1 = st$ and $s_2 = s(1-t)$.
3. Show that on shell the Mandelstam variables (11.49) obey $s+t+u = 4m^2$ and $s = E_{\rm tot}^2$, where $E_{\rm tot}$
is the total energy in the center-of-mass frame. Also show that t, u ≤ 0.
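Before attempting the last problem analytically, it can help to see the claimed relations hold for explicit numbers. The sketch below builds on-shell momenta for equal-mass 2 → 2 scattering in the center-of-mass frame and evaluates (11.49). It assumes d = 4, a mostly-plus metric signature (so that on shell k² = −m²), and a particular parametrization of the momenta by a magnitude p and a scattering angle θ; these choices and all function names are illustrative, and the check is not a substitute for the derivation.

```python
import math

def minkowski_square(k):
    """k^2 with mostly-plus signature: -(k^0)^2 + |k_vec|^2."""
    return -k[0] ** 2 + sum(c * c for c in k[1:])

def combine(a, b, sign):
    """Componentwise a + sign * b."""
    return tuple(x + sign * y for x, y in zip(a, b))

def mandelstam(m, p, theta):
    """Return (s, t, u, E_tot) for center-of-mass kinematics."""
    energy = math.sqrt(m * m + p * p)
    k1 = (energy, 0.0, 0.0, p)
    k2 = (energy, 0.0, 0.0, -p)
    k1_out = (energy, p * math.sin(theta), 0.0, p * math.cos(theta))
    s = -minkowski_square(combine(k1, k2, +1.0))
    t = -minkowski_square(combine(k1_out, k1, -1.0))
    u = -minkowski_square(combine(k1_out, k2, -1.0))
    return s, t, u, 2.0 * energy

s, t, u, e_tot = mandelstam(m=1.0, p=2.5, theta=0.7)
print(s + t + u, 4.0 * 1.0 ** 2)
print(s, e_tot ** 2)
print(t, u)
```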

12 Renormalizability and the Renormalization Group
In the previous lecture we saw that once we expressed the one-loop 2 → 2 scattering amplitude of ϕ4 theory
in terms of the physical mass and coupling parameters m and λ, the amplitude was independent of the
short-distance cutoff Λ (or the renormalization scale µ in dimensional regularization). In this lecture we will
sketch a general understanding of why this is true, starting with a more “old-fashioned” approach based
on showing that the divergences cancel in certain “renormalizable” theories such as the ϕ4 theory and then
moving to a more modern “Wilsonian” approach based on viewing the cutoff Λ as being physical and then
seeking to understand physics at energy scales which are low compared to Λ.

12.1 Power counting and renormalizability


Let’s now consider in a rather general way the possible divergences of Feynman diagrams. We’ll consider a
general quantum field theory with a set of fields Φa and a set of interaction vertices labeled by i, with each
interaction vertex involving $N_{ia}$ powers of $\Phi_a$ and $d_i$ derivatives. We’ll write the interaction Lagrangian as
\[
\mathcal{L} = \sum_i \lambda_i O_i, \tag{12.1}
\]

where Oi is some power of the fields and their derivatives and λi is a coupling constant. We will discuss
below how to normalize the fields so that Oi and λi are separately well-defined. In this section we will study
the convergence of a general one-particle irreducible diagram with $E_a$ external $\Phi_a$ legs, $I_a$ internal $\Phi_a$ legs,
and Vi vertices of type i. We will focus on the particular region of the loop integration space where all loop
momenta become large at the same rate. This is not the only region a divergence can come from, but our
results will be indicative of the general case.
Before beginning we need to think a bit about the large-momentum behavior of the propagator. For
a scalar field this is easy, it just goes like 1/k 2 . In lecture 14 you showed on the homework that the
momentum-space Feynman propagator of a field of general spin is
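\[
G_F(k) = \frac{i}{k^0 - \omega_{n,\vec{k}} + i\epsilon}\,\frac{1}{2\omega_{n,\vec{k}}}\sum_\sigma u^a(\vec{k},\sigma,n)\,u^{b*}(\vec{k},\sigma,n) - (-1)^F\frac{i}{k^0 + \omega_{n,\vec{k}} - i\epsilon}\,\frac{1}{2\omega_{n,\vec{k}}}\sum_\sigma v^a(-\vec{k},\sigma,n^c)\,v^{b*}(-\vec{k},\sigma,n^c). \tag{12.2}
\]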
We haven’t discussed the spin sums of the intertwiners yet, but they always give polynomials of k so at large
k this propagator will go as
\[
G_F \sim k^{2s_a - 2}, \tag{12.3}
\]
where $2s_a$ is the highest power of k appearing in the spin sums. Roughly speaking $s_a$ is the spin of the ath field;
for example for a spin 1/2 field we will see that $s_a = 1/2$ and for a massive vector field we’ll have $s_a = 1$. In
the massless case however sometimes $s_a$ is lower than expected due to gauge symmetry, for example $s_a = 0$
for photons and gravitons. I will adopt a convention where the field is normalized so that the highest power
of k in $G_F$ has coefficient one (perhaps multiplied by some dimensionless tensor such as $\eta^{\mu\nu}$ to make up the
a, b∗ indices), in which case (12.3) tells us that the energy dimension of the field obeys

\[
2[\Phi_a] - d = 2s_a - 2 \tag{12.4}
\]

and thus
\[
[\Phi_a] = s_a + \frac{d-2}{2}. \tag{12.5}
\]
Turning now to the question of the divergence of the loop integrals, each internal propagator contributes
an integral $\int\frac{d^dk}{(2\pi)^d}$. The total number of momentum integrals is thus $d\sum_a I_a$, but each vertex contributes a
momentum-conserving δ function so the number of independent loop integrals is
\[
d\left(\sum_a I_a - \sum_i V_i + 1\right) \tag{12.6}
\]

since there will always be a single momentum-conserving δ function left over which doesn’t constrain the loop
momenta. Going to spherical coordinates in this full space of loop integrals thus gives a radial momentum
integral of the form
\[
\int dk\,k^{\,d\left(\sum_a I_a - \sum_i V_i + 1\right) + 2\sum_a I_a(s_a - 1) + \sum_i V_i d_i - 1} \tag{12.7}
\]
which will be divergent if the degree of divergence
\[
D := \sum_a I_a(d + 2s_a - 2) + \sum_i V_i(d_i - d) + d \tag{12.8}
\]

is greater than or equal to zero. We can simplify this expression by observing that since each internal line of
type a connects two vertices and each external line is connected to one vertex we have
\[
\sum_i V_i N_{ia} = 2I_a + E_a, \tag{12.9}
\]

and thus
\[
D = d - \sum_a E_a\left(s_a + \frac{d-2}{2}\right) - \sum_i V_i\left(d - d_i - \sum_a N_{ia}\left(s_a + \frac{d-2}{2}\right)\right). \tag{12.10}
\]
We can write this more simply by observing that the quantity multiplying $V_i$ in the sum over i is just d
minus the energy dimension
\[
\Delta_i := d_i + \sum_a N_{ia}[\Phi_a] \tag{12.11}
\]

of the operator appearing in the ith interaction vertex. We therefore have


\[
D = d - \sum_a E_a[\Phi_a] - \sum_i V_i(d - \Delta_i). \tag{12.12}
\]

The qualitative behavior of this formula depends very strongly on ∆i : if all interactions obey ∆i ≤ d, then
adding additional interaction vertices cannot increase the degree of divergence. In this case the theory is
said to be renormalizable. More generally we can classify interaction vertices into three groups:
• Vertices with ∆i < d are called super-renormalizable. For example the ϕ4 interaction in d = 3
obeys
\[
[\phi^4] = 2 < 3, \tag{12.13}
\]
and is thus super-renormalizable.

• Vertices with ∆i ≤ d are called renormalizable. For example the ϕ4 interaction in d = 4 obeys
\[
[\phi^4] = 4, \tag{12.14}
\]
and is thus renormalizable but not super-renormalizable.

• Vertices with ∆i > d are called non-renormalizable. For example the ϕ4 interaction in d = 5 obeys
\[
[\phi^4] = 6 > 5, \tag{12.15}
\]
and is thus non-renormalizable.
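The classification above is mechanical enough to put into a few lines of code. The sketch below handles only interactions built from a single real scalar field, using [Φ] = (d − 2)/2 and ∆ = (number of derivatives) + (number of fields) × [Φ]; the function names are illustrative. Since super-renormalizable vertices are also renormalizable in the terminology above, the function returns the most restrictive label that applies.

```python
from fractions import Fraction

def scalar_field_dimension(d):
    """Energy dimension [Phi] = (d - 2)/2 of a scalar field in d dimensions."""
    return Fraction(d - 2, 2)

def operator_dimension(d, n_fields, n_derivatives=0):
    """Energy dimension Delta of an operator with n_fields scalars and n_derivatives derivatives."""
    return n_derivatives + n_fields * scalar_field_dimension(d)

def classify_vertex(d, n_fields, n_derivatives=0):
    """Most restrictive renormalizability label for the vertex."""
    delta = operator_dimension(d, n_fields, n_derivatives)
    if delta < d:
        return "super-renormalizable"
    if delta == d:
        return "renormalizable"
    return "non-renormalizable"

for d in (3, 4, 5):
    print(d, operator_dimension(d, 4), classify_vertex(d, 4))
```

Exact rational arithmetic (via `Fraction`) avoids spurious floating-point comparisons in the borderline case ∆ = d.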


A theory with at least one non-renormalizable interaction is said to be non-renormalizable, as it has the
property that diagrams become more and more divergent as we go to higher and higher orders in perturbation
theory. On the other hand in a renormalizable theory there are only a finite number of amplitudes which
have D ≥ 0, namely those with
\[
\sum_a E_a[\Phi_a] \le d. \tag{12.16}
\]

So for example a real scalar field has
\[
[\Phi] = \frac{d-2}{2}, \tag{12.17}
\]
so a scattering amplitude with E external particles can have D ≥ 0 only if
\[
E \le \frac{2d}{d-2}. \tag{12.18}
\]
For d = 4 this is E ≤ 4, while for d = 3 this is E ≤ 6. We will now argue that this translates into
the statement that in a renormalizable theory UV divergences can be removed by absorbing them into a
finite number of coupling constants. This is just what we found at one-loop in ϕ4 theory in d ≤ 4. In a
theory where all interactions are super-renormalizable something even stronger is true: there are only a finite
number of diagrams which are UV divergent. The UV divergences in a super-renormalizable theory can thus
be completely removed by coupling constant shifts which are polynomials in the coupling, and that can be
computed at some fixed order in perturbation theory. For example in the ϕ4 theory in d = 3 we found no
divergence at one loop in 2 → 2 scattering.
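The power counting formula can be turned into a small calculator for a single scalar field with a non-derivative interaction. The sketch below evaluates the degree of divergence in two ways, once from (12.12) and once from (12.8) with the number of internal lines fixed by (12.9), and the two should agree. It assumes s = 0, no derivatives at the vertices, and a single vertex type; the function names and the default of four fields per vertex are illustrative.

```python
from fractions import Fraction

def degree_from_dimensions(d, external_legs, vertices, fields_per_vertex=4):
    """D = d - E [Phi] - V (d - Delta), i.e. (12.12) for a scalar Phi^n vertex."""
    phi_dim = Fraction(d - 2, 2)
    delta = fields_per_vertex * phi_dim
    return d - external_legs * phi_dim - vertices * (d - delta)

def degree_from_internal_lines(d, external_legs, vertices, fields_per_vertex=4):
    """D = I (d - 2) - V d + d, i.e. (12.8) with s = 0 and no derivatives.

    The number of internal lines I follows from (12.9): V n = 2 I + E.
    """
    internal = Fraction(vertices * fields_per_vertex - external_legs, 2)
    return internal * (d - 2) - vertices * d + d

# One-loop self-energy (E = 2, V = 1) and one-loop 2 -> 2 scattering (E = 4, V = 2).
for d in (3, 4):
    print(d, degree_from_dimensions(d, 2, 1), degree_from_dimensions(d, 4, 2))
```

In d = 4 this gives D = 2 for the self-energy and D = 0 for 2 → 2 scattering, while in d = 3 it gives D = 1 and D = −1, in line with the linear divergence and the convergent scattering integral found in the previous lecture.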
Before continuing it is worth mentioning that there is a simple interpretation of the condition for a
vertex to be non-renormalizable: the quantity d − ∆i is precisely the energy dimension of the coupling λi
which appears in front of the interaction operator $O_i$. Non-renormalizable interactions are thus those with
coupling constants that have negative energy dimension, while super-renormalizable interactions are those
with positive energy dimension.

12.2 Cancellation of divergences in renormalizable theories


Continuing with our study of the divergence of a general 1PI diagram in the integration region where all
momenta go to infinity together, we can consider what happens when we differentiate the diagram with
respect to some external momentum p. Each time we do this it decreases the degree of divergence of the
diagram by one since the derivative acts on the propagators, for example we have

\[
\frac{d}{dp}\int\frac{d\ell}{2\pi}\,\frac{p+\ell+a}{(p+\ell)^2+b} = -\int\frac{d\ell}{2\pi}\,\frac{(p+\ell)^2 + 2a(p+\ell) - b}{\left((p+\ell)^2+b\right)^2}, \tag{12.19}
\]

where the integral on the left is logarithmically divergent at large ℓ but the integral on the right is convergent.
More generally if we differentiate a diagram with D ≥ 0 a total number of (D + 1) times it becomes
convergent. The divergent part of the diagram must therefore consist of a polynomial in the external
momenta, heuristically of the form

\[
\Lambda^D p^{M_D} + \Lambda^{D-1} p^{M_{D-1}} + \ldots + \log\Lambda\,p^{M_0}, \tag{12.20}
\]

where we have written $p^{M_n}$ to represent any product of $M_n$ components of the external momenta. These
powers will also be multiplied by various mass scales from the coupling constants of the theory to make sure
they have the right units (in a massless theory the units will need to work out without this). These however
are precisely the forms of divergence which can be removed by adding local terms to the Lagrangian! More
concretely, to remove a divergence of the heuristic form $\Lambda^{D-n}p^{M_{D-n}}$ in a diagram with $E_a$ external $\Phi_a$ legs,
we introduce a term with $E_a$ factors of each field and $M_{D-n}$ derivatives acting on those fields with the same
index structure as in the divergence. For example let’s say we are computing the self-energy of a scalar in
d = 4 and we find the divergences
\[
\Sigma(p^2) \supset a\Lambda^2 + (b + cp^2)\log\frac{\Lambda}{m}. \tag{12.21}
\]
We can absorb the $a\Lambda^2 + b\log\frac{\Lambda}{m}$ divergence into a shift of the bare mass term, and we can absorb the $cp^2\log\frac{\Lambda}{m}$
divergence into a shift of the kinetic term $\partial_\mu\Phi\,\partial^\mu\Phi$, i.e. into a wave function renormalization. In the previous
lecture we computed a and b at one loop, and we pointed out that c is also nonzero at two loops. More
generally we only need to do this subtraction for diagrams with D ≥ 0, and thus we only need to include
shifts of interaction terms with ∆i ≤ d.

Figure 30: Canceling a divergent subdiagram with a counterterm. For d = 2, 3 the full diagram has D < 0,
but the subdiagram is still divergent. Here the dot with an x through it indicates an insertion of the mass
renormalization term $-\frac{Z^2m_0^2 - m^2}{2}\Phi_R^2$.
In the traditional way of describing this process one rewrites the bare Lagrangian in terms of the physical
mass and coupling m and λ, and also defines a rescaled field
\[
\Phi_R = \frac{\Phi}{Z} \tag{12.22}
\]
which has a finite two-point function and in particular whose on-shell residue is the same as that of a free
field. We thus have
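\begin{align}
\mathcal{L} &= -\frac{1}{2}\partial_\mu\Phi\,\partial^\mu\Phi - \frac{m_0^2}{2}\Phi^2 - \frac{\lambda_0}{4!}\Phi^4 \tag{12.23}\\
&= -\frac{Z^2}{2}\partial_\mu\Phi_R\,\partial^\mu\Phi_R - \frac{Z^2m_0^2}{2}\Phi_R^2 - \frac{\lambda_0Z^4}{4!}\Phi_R^4 \tag{12.24}\\
&= -\frac{1}{2}\partial_\mu\Phi_R\,\partial^\mu\Phi_R - \frac{m^2}{2}\Phi_R^2 - \frac{\lambda}{4!}\Phi_R^4 + \mathcal{L}_{ct}, \tag{12.25}
\end{align}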
where
\[
\mathcal{L}_{ct} = -\frac{Z^2-1}{2}\partial_\mu\Phi_R\,\partial^\mu\Phi_R - \frac{Z^2m_0^2 - m^2}{2}\Phi_R^2 - \frac{\lambda_0Z^4 - \lambda}{4!}\Phi_R^4 \tag{12.26}
\]
is called the counterterm Lagrangian and its individual terms are called counterterms. In the old-
fashioned approach to renormalization one views these counterterms as “corrections” to the original theory
which are included to cancel the infinities. They are treated as additional interaction vertices, providing
corrections to a free theory whose mass is now the physical mass and whose interaction vertex is now −iλ
instead of −iλ0 . This is not actually different from what we did in the previous lecture, where we followed
the Wilsonian approach (to be developed further in the next section) of tuning the bare couplings so that
the physical couplings have their observed values: the counterterms are just an alternative way of describing
this tuning.
Before proceeding to the Wilsonian approach, we need to confront the fact that so far we have only
considered the region of momentum integration where all loop momenta go to infinity together. This of
course is an important contribution to the integrals, but we also need to consider the possibility of divergences
that arise when only a subset of the momenta go to infinity. This is a subtle and difficult problem, whose
traditional solution we won’t describe in detail since the Wilsonian approach deals with it in a much cleaner
(but more abstract) manner. We will instead content ourselves with a few remarks about the ingredients
which go into the traditional proof that the same renormalization which removes the divergences in the
integration region we have considered so far also removes them for the full range of momentum integration.

Figure 31: A Feynman diagram with overlapping divergences. The red and blue dashed lines surround
four-point subdiagrams which each are logarithmically divergent in d = 4, but we can only use a four-point
counterterm to cancel the divergence from one of them. The remaining divergence must be canceled by an
additional two-point counterterm.

• The first step in proving renormalizability is Weinberg’s theorem, which says that a multi-loop
integral will be convergent if and only if its degree of divergence is negative as we take any linear
combination of the loop momenta to infinity. This means that we can show convergence using a
generalization of the method employed so far.

• One can think of the various options for which momenta go to infinity together in terms of subdiagrams
of the full Feynman diagram. For example a diagram whose degree of divergence as defined
above is negative can still diverge due to a subdiagram whose degree of divergence is positive. See
figure 30 for an example.

• In simple cases there is a simple fix to the presence of a divergent subdiagram: we can simply ignore
the rest of the diagram, in which case we have already seen that the divergence can be canceled by
including an appropriate counterterm. At least to the extent that the propagators involved in the
divergent subdiagram are not involved in other divergent subdiagrams, this cancellation works also in
the full diagram (see figure 30).

• The key technical problem with this approach however is the possibility of overlapping divergences,
meaning situations where we have multiple divergent subdiagrams with propagators in common. See
figure 31 for an example. In such a case it isn’t so clear that we can cancel both divergences with
counterterms, as once we replace one of the subdiagrams by a counterterm we have lost part of the
other subdiagram. The systematic approach to dealing with this goes under the name “BPHZ”, for
Bogoliubov, Parasiuk, Hepp, and Zimmermann, and it requires a detailed analysis of the structure of
the diagrams using the infamous “forest formula”. In the end everything does work though, and the
renormalization which fixes the divergences in the region where all momenta scale together indeed
removes the divergences from subdiagrams as well.

12.3 The Wilsonian approach


At least to my taste, the above discussion of renormalizability leaves something to be desired. We started
with very simple ideas, essentially based on dimensional analysis, but then to turn these into a full proof of
renormalizability we found that we needed to address some annoying technical problems. Shouldn’t there
be a better way that makes the physical meaning of renormalization obvious? Fortunately there is: the
Wilsonian approach to renormalization.75
75 Kenneth G. Wilson’s academic biography is an interesting one: during his PhD and also for eight years after he wrote almost
no papers (in particular he wrote zero papers as a graduate student and his 1961 thesis still has zero citations). Somehow he
managed to get a faculty position anyways, and also tenure at Cornell. He then proceeded to revolutionize physics, explaining
the real meaning of renormalization in the process, and ended up with a Nobel Prize. I do not recommend trying to replicate
this trajectory.

The first essential idea for the Wilsonian approach is to view the cutoff as physical. In a condensed
matter system this is self-explanatory: at the atomic scale in a solid there is a genuine lattice of ions, with
electrons constrained to be near the ions, and at shorter distances there is nothing. In high-energy physics it is
less obvious that there needs to be a genuine cutoff at short distances (or equivalently high energies), but the
quantization of gravity seems to require major modifications of the laws of physics at the (absurdly small)
Planck length:
\[
\ell_p = \sqrt{\frac{\hbar G}{c^3}} \approx 10^{-35}\,{\rm m}. \tag{12.27}
\]
Moreover there are several indications from particle physics (such as the mass of neutrinos, the existence of
dark matter, and the small baryon-to-photon ratio of the universe) that some kind of modification of the
standard model of particle physics is necessary at sufficiently short distances.
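The order of magnitude quoted for the Planck length follows from a one-line computation. The sketch below uses SI values for ħ, G, and c (CODATA 2018 values, hard-coded here as an assumption of the example) and evaluates the square root of ħG/c³.

```python
import math

HBAR = 1.054571817e-34   # reduced Planck constant, J s
NEWTON_G = 6.67430e-11   # Newton's constant, m^3 kg^-1 s^-2
LIGHT_SPEED = 299792458  # speed of light, m / s (exact by definition)

def planck_length():
    """Planck length sqrt(hbar G / c^3) in meters."""
    return math.sqrt(HBAR * NEWTON_G / LIGHT_SPEED ** 3)

print(planck_length())  # about 1.6e-35 m
```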
The second essential idea for the Wilsonian approach is decoupling. This means that the details of
what is going on at large energies/short distances do not affect what is going on at low energies/long distances.
For example if we regulate a scalar field theory by putting it on a lattice, when we look at the low-energy
physics of the system we cannot tell whether the lattice has a cubic structure or a hexagonal structure. We
also cannot detect the existence of very heavy particles by doing low-energy experiments.
The third essential idea for the Wilsonian approach is integrating out. The idea here is that since
low-energy physics does not depend on the details of high-energy physics, rather than carrying around all
that high-energy physics for no reason we can simply sum over it in the path integral once and for all. This
produces a “low-energy effective field theory”, where all effects of the high-energy modes are repackaged into
the values of the low-energy coupling constants.
Indeed let’s consider a rather general-looking quantum field theory with an explicit cutoff Λ, with action
\[
S_\Lambda = \sum_i\int d^dx\,g_i(\Lambda)\,\Lambda^{d-\Delta_i}O_i. \tag{12.28}
\]

Here Oi are some basis for all the scalar local operators in the theory, and ∆i are their energy dimensions.
In general there are infinitely many such operators, so the sum over i here needs to be viewed somewhat
heuristically. We have chosen to extract a power of the cutoff Λ from the coupling constants, which is chosen
so that the quantities gi (Λ) are dimensionless. The idea of the Wilsonian approach is that if we lower the
cutoff from Λ0 to Λ (with Λ < Λ0 ), we should tune the Λ-dependence of the couplings so that the low-energy
physics is not affected. You may worry whether or not we can do this, but in fact there is a simple path
integral method: we split all fields into a “high-energy” part ΦH , consisting of the modes which exist for
cutoff Λ0 but not for cutoff Λ, and a “low-energy” part ΦL , consisting of the modes which exist for both
cutoffs. For any observable OL [ΦL ] built only out of the low-energy modes we then have

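\begin{align}
\langle O_L[\Phi_L]\rangle &= \frac{\int\mathcal{D}\phi_L\mathcal{D}\phi_H\,O_L[\phi_L]\,e^{-S_{\Lambda_0}[\phi_L,\phi_H]}}{\int\mathcal{D}\phi_L\mathcal{D}\phi_H\,e^{-S_{\Lambda_0}[\phi_L,\phi_H]}} \nonumber\\
&= \frac{\int\mathcal{D}\phi_L\,O_L[\phi_L]\,e^{-S_\Lambda[\phi_L]}}{\int\mathcal{D}\phi_L\,e^{-S_\Lambda[\phi_L]}}, \tag{12.29}
\end{align}
where
\[
e^{-S_\Lambda[\phi_L]} := \int\mathcal{D}\phi_H\,e^{-S_{\Lambda_0}[\phi_L,\phi_H]}. \tag{12.30}
\]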

In other words, the low-energy effective action is obtained by starting with the full action and then integrating
out the high-energy modes. This process gives a flow in the space of actions (or equivalently a flow in the
space of coupling constants) which is called renormalization group flow.76
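A zero-dimensional toy version of integrating out may make the idea concrete. In the sketch below the "fields" are just two real numbers, x playing the role of the low-energy mode and y the high-energy mode, with a Gaussian action S(x, y) = a x²/2 + b y²/2 + c x y. Doing the y integral exactly gives an effective action for x with the shifted coefficient a − c²/b (plus an x-independent constant), so the expectation value of x² should be 1/(a − c²/b). This toy model, the brute-force grid integration, and all names are assumptions made for illustration; it is not the field-theory computation itself.

```python
import math

def full_expectation_x2(a, b, c, half_width=8.0, n=400):
    """<x^2> computed from the full two-variable 'path integral' on a grid."""
    h = 2.0 * half_width / n
    numerator = 0.0
    denominator = 0.0
    for i in range(n):
        x = -half_width + (i + 0.5) * h
        for j in range(n):
            y = -half_width + (j + 0.5) * h
            weight = math.exp(-(0.5 * a * x * x + 0.5 * b * y * y + c * x * y))
            numerator += x * x * weight
            denominator += weight
    return numerator / denominator

def effective_expectation_x2(a, b, c):
    """<x^2> from the effective action S_eff(x) = (a - c^2/b) x^2 / 2."""
    a_eff = a - c * c / b  # requires a*b > c^2 for a normalizable weight
    return 1.0 / a_eff

a, b, c = 2.0, 5.0, 1.0
print(full_expectation_x2(a, b, c), effective_expectation_x2(a, b, c))
```

The effect of the integrated-out variable is entirely captured by the shift of the coefficient of x², which is the toy analogue of repackaging high-energy physics into the low-energy couplings.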
The operation (12.30) has an important defect: in general there is no reason for the action SΛ to be
local even if we start with a local action SΛ0 . On the other hand since we only integrated out modes whose
76 The name is misleading, as renormalization group flow is not invertible (how would you “un-integrate”?) so there isn’t

really a group structure. A more accurate name would be “renormalization semigroup”, but unfortunately we are stuck with
this one.

wavelengths are at most of order $1/\Lambda$, any non-localities we generate should be constrained to this scale. We
therefore can Taylor expand them to express $S_\Lambda$ as a local action order by order in $1/\Lambda$. This suppression is
already built into our expression (12.28), as each derivative increases the dimension of $O_i$ and thus costs a
power of Λ. As a simple example of this, we can consider a non-local term
\[
\int d^dx\,d^dy\,K(x-y)\,\phi(x)\phi(y). \tag{12.31}
\]

Since this came from integrating out short-distance modes with momenta roughly between Λ and Λ0, the
Fourier transform of K(x − y) should be a reasonably smooth function with compact support in k. K will
therefore be an analytic function that decays rapidly at separations which are large compared to $1/\Lambda$. For ϕ
configurations which vary only on scales which are large compared to $1/\Lambda$, we can therefore approximate K as
a sum of δ-functions and their derivatives. In this way given a local action $S_{\Lambda_0}$ with couplings

\[
g_i(\Lambda_0) := g_i^0
\tag{12.32}
\]
we can construct a local action SΛ with couplings gi (Λ) that gives the same low-energy physics. At first
order in Λ0 − Λ the new couplings are functions of the old couplings only, so they must obey Wilson’s
renormalization group (RG) equation
\[
\Lambda \frac{dg_i}{d\Lambda} = \beta_i(g(\Lambda)).
\tag{12.33}
\]

In other words we can think of the renormalization group flow as being the integral curves generated by
a vector field βi on the space of couplings. In the last lecture we computed the β function for the scalar
coupling (λ/4!) ϕ⁴ in d = 4 at one loop, finding
\[
\beta_\lambda(\lambda) = \frac{3\lambda^2}{16\pi^2}.
\tag{12.34}
\]
It must be emphasized that this flow generically generates all possible couplings which are allowed by the
symmetries of the theory - it does NOT only generate renormalizable couplings with ∆i ≤ d.77
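To make the idea of "integral curves of β" concrete, here is a small sketch that integrates the RG equation (12.33) for the single coupling λ using the one-loop β function (12.34), ignoring all other couplings (an assumption made purely for illustration). With t = log(Λ/Λ0) the equation dλ/dt = 3λ²/16π² can also be solved in closed form, λ(t) = λ0 /(1 − 3λ0 t/16π²), which the numerical flow should reproduce. The function names and the choice of a Runge-Kutta integrator are ours.

```python
import math

B0 = 3.0 / (16.0 * math.pi ** 2)  # one-loop coefficient from (12.34)

def beta(lam):
    return B0 * lam * lam

def run_coupling(lam0, t_final, steps=1000):
    """Integrate d(lambda)/dt = beta(lambda), with t = log(Lambda/Lambda0), using RK4."""
    dt = t_final / steps
    lam = lam0
    for _ in range(steps):
        k1 = beta(lam)
        k2 = beta(lam + 0.5 * dt * k1)
        k3 = beta(lam + 0.5 * dt * k2)
        k4 = beta(lam + dt * k3)
        lam += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return lam

def closed_form(lam0, t):
    # Solution of d(1/lambda)/dt = -B0.
    return lam0 / (1.0 - B0 * lam0 * t)

lam0 = 1.0
t_ir = math.log(1e-3)  # lower the cutoff by a factor of 1000
lam_numeric = run_coupling(lam0, t_ir)
lam_exact = closed_form(lam0, t_ir)
print(lam_numeric, lam_exact)
```

Since β is positive, the coupling decreases (slowly, logarithmically) as Λ is lowered.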

12.4 Polchinski’s theorem


Using Wilson’s ideas, there is a beautiful argument due to Polchinski which explains in a deep way the
renormalizability results sketched in the previous sections of this lecture. The idea is that in theories which
start out weakly-coupled at the initial cutoff scale Λ0 , the RG equation (12.33) has a powerful focusing
behavior that suppresses any information about what is going on at the scale Λ0 . More precisely, there is a
finite-dimensional manifold in the space of coupling constants, whose dimensionality is equal to the number
of renormalizable couplings, which is an attractor for the renormalization group. Wherever we start out, we
eventually end up near this manifold (at least as long as we stay within the region of validity for perturbation
theory). The only remaining high-energy information about where we started is where on this manifold we
end up. The space of low-energy theories therefore can be parametrized by the renormalizable couplings
only.
To see this focusing behavior, we can study how the renormalization group equation behaves under a
small change δgi in the trajectory. Working to first order in δgi , we have
\[
\Lambda \frac{d\,\delta g_i}{d\Lambda} = \sum_j M_{ij}\,\delta g_j,
\tag{12.35}
\]
with
\[
M_{ij} = \frac{\partial \beta_i}{\partial g_j}.
\tag{12.36}
\]

77 The only exceptions I know of to this rule are free theories, conformal field theories for which all βi vanish, and
supersymmetric theories.


Introducing matrix notation and also using a prime ′ to indicate a derivative with respect to log Λ, we can rewrite
this equation as
\[
\delta g' = M\,\delta g.
\tag{12.37}
\]
So far this equation does not distinguish between renormalizable and non-renormalizable couplings. To
distinguish them, it is useful to introduce a projection matrix
\[
P_{ij} =
\begin{cases}
\delta_{ij} & i \text{ renormalizable} \\
0 & \text{otherwise,}
\end{cases}
\tag{12.38}
\]
and also a matrix
\[
D_{ij} = \frac{\partial g_i}{\partial g_j^0},
\tag{12.39}
\]
where in Dij the derivative is computed for the particular trajectory gi (Λ) which obeys gi (Λ0 ) = gi0 . Following
Polchinski we can then introduce a clever second projection

Π = 1 − DP (P DP )−1 P, (12.40)

which is designed to decouple the renormalizable and non-renormalizable couplings in the RG equation.78
Π is indeed a projection in the linear algebra sense of obeying

Π2 = Π, (12.41)

but it is not orthogonal in the sense that it doesn’t obey Π† = Π. P and Π are related by the equations79

PΠ = 0
Π(1 − P ) = (1 − P ). (12.42)
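The algebraic properties (12.41) and (12.42) are easy to check numerically. The sketch below assumes a toy truncation to four couplings, of which the first two are renormalizable, and an arbitrary made-up matrix D; none of these numbers come from an actual RG flow. Following footnote 78, the inverse (P DP )−1 is taken only on the image of P, which here means inverting the upper-left 2 × 2 block of D.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def subtract(a, b):
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]

def max_abs(a):
    return max(abs(x) for row in a for x in row)

N_COUPLINGS, N_RENORM = 4, 2  # toy truncation: 2 renormalizable + 2 non-renormalizable

IDENTITY = [[1.0 if i == j else 0.0 for j in range(N_COUPLINGS)]
            for i in range(N_COUPLINGS)]
P = [[1.0 if (i == j and i < N_RENORM) else 0.0 for j in range(N_COUPLINGS)]
     for i in range(N_COUPLINGS)]

# Made-up stand-in for D_ij = d g_i / d g_j^0.
D = [[1.0, 0.3, 0.2, -0.1],
     [0.4, 0.9, 0.1, 0.5],
     [0.7, -0.2, 1.1, 0.3],
     [0.2, 0.6, -0.4, 0.8]]

# Inverse of P D P on the image of P: invert the upper-left 2x2 block, pad with zeros.
a, b, c, d = D[0][0], D[0][1], D[1][0], D[1][1]
det = a * d - b * c
PDP_INV = [[0.0] * N_COUPLINGS for _ in range(N_COUPLINGS)]
PDP_INV[0][0], PDP_INV[0][1] = d / det, -b / det
PDP_INV[1][0], PDP_INV[1][1] = -c / det, a / det

# Pi = 1 - D P (P D P)^{-1} P, equation (12.40).
PI = subtract(IDENTITY, matmul(matmul(matmul(D, P), PDP_INV), P))

one_minus_p = subtract(IDENTITY, P)
idempotency_error = max_abs(subtract(matmul(PI, PI), PI))                      # Pi^2 = Pi
p_pi_error = max_abs(matmul(P, PI))                                            # P Pi = 0
complement_error = max_abs(subtract(matmul(PI, one_minus_p), one_minus_p))     # Pi(1-P) = 1-P
asymmetry = max_abs(subtract(PI, [list(col) for col in zip(*PI)]))             # Pi is not symmetric

print(idempotency_error, p_pi_error, complement_error, asymmetry)
```

The first three numbers vanish up to rounding, while the last one is of order one: Π is a projection, but an oblique one.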

We can then define a projected coupling variation

ξ = Πδg, (12.43)

which by construction obeys


Pξ = 0 (12.44)
so all renormalizable couplings have been removed. To compute the derivative of ξ with respect to scale we
first need to understand the scale dependence of Π. Differentiating both sides of the RG equation (12.33)
with respect to gj0 and using that partials of gi with respect to Λ and gj0 commute, we have
\[
\frac{\partial g_i'}{\partial g_j^0} = \sum_k \frac{\partial \beta_i}{\partial g_k}\,\frac{\partial g_k}{\partial g_j^0}
\tag{12.45}
\]
and thus
\[
D' = M D.
\tag{12.46}
\]
Moreover since for any matrix N we have
\[
(N^{-1})' = -N^{-1} N' N^{-1},
\tag{12.47}
\]
we also have (using that P is independent of scale)
\[
\begin{aligned}
\Pi' &= -M D P (P D P)^{-1} P + D P (P D P)^{-1} P M D P (P D P)^{-1} P \\
     &= -\Pi M D P (P D P)^{-1} P.
\end{aligned}
\tag{12.48}
\]

78 The inverse matrix (P DP )−1 here should only be used on vectors which are in the image of P . Otherwise the inverse does
not exist.
79 The relationship between P and Π is interesting from a linear algebra point of view. If Π were hermitian then equations
(12.42) would imply that Π = 1 − P . Since Π is not hermitian, we can only conclude that (1 − P )v = v ⇔ Πv = v. We will see
in a moment however that what we are really interested in is the null space of Π, and this need not coincide with the null space
of 1 − P .


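We thus straightforwardly have
\[
\begin{aligned}
\xi' &= \Pi\,\delta g' + \Pi'\,\delta g \\
     &= \Pi M\,\delta g - \Pi M D P (P D P)^{-1} P\,\delta g \\
     &= \Pi M \xi,
\end{aligned}
\tag{12.49}
\]
so the projection Π has succeeded in decoupling the RG equation. Rewriting this in terms of P and D we
have
\[
\xi' = \left( M - D P (P D P)^{-1} P M \right) \xi.
\tag{12.50}
\]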

So far our discussion has been non-perturbative. In a situation where perturbation theory is valid, we
can usefully approximate the matrix M using free field theory. In free field theory the action should not
have any cutoff dependence since there are no loop diagrams, so we need the quantities

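\[
g_i(\Lambda)\,\Lambda^{d-\Delta_i}
\tag{12.51}
\]
to be cutoff-independent as the interactions go to zero. We therefore have
\[
g_i' \approx (\Delta_i - d)\, g_i,
\tag{12.52}
\]
and thus
\[
\beta_i \approx (\Delta_i - d)\, g_i
\tag{12.53}
\]
and
\[
M_{ij} = \partial_j \beta_i \approx (\Delta_i - d)\,\delta_{ij}.
\tag{12.54}
\]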
The key point is then the following. The renormalizable components of ξ are zero by construction, and to
the extent that M is diagonal in the same basis as P we can ignore the second term in (12.50) since then

P M ξ ≈ M P ξ = 0. (12.55)

We therefore have
\[
\xi_i' \approx
\begin{cases}
0 & i \text{ renormalizable} \\
(\Delta_i - d)\,\xi_i & i \text{ non-renormalizable.}
\end{cases}
\tag{12.56}
\]
The non-renormalizable couplings are precisely those for which ∆i − d > 0, so we thus see that the entire
vector ξ vanishes like a power of Λ/Λ0 as we flow to Λ ≪ Λ0 ! Moreover this conclusion is preserved under
perturbative corrections as long as these are small compared to ∆i − d (which they always will be for small
enough coupling). Once this suppression is complete, the full set of coupling variations needs to obey

Πδg = 0, (12.57)

or more explicitly
δg = DP (P DP )−1 P δg. (12.58)
In other words we can determine the change in all of the infinitely many non-renormalizable couplings by
looking at the change in the renormalizable couplings alone. Said differently, if we know the values of
all of the renormalizable couplings in the low-energy action then the non-renormalizable couplings are all
determined. This, in essence, is the statement of renormalizability!
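The focusing behavior can be seen in a two-coupling toy model. As an assumption made purely for illustration, take one marginal coupling g with β_g = b g² and one irrelevant coupling h with ∆ − d = 2 and β_h = 2h − c g²; the values of b and c below are arbitrary. Two flows which start with the same g but very different h should end up with the same h at low cutoff, up to a difference suppressed by (Λ/Λ0 )², and that common value should be fixed by g alone (to leading order, h ≈ c g²/2).

```python
import math

B_COEF = 0.02  # toy marginal coupling: beta_g = B_COEF * g^2
C_COEF = 0.5   # toy irrelevant coupling: beta_h = 2 h - C_COEF * g^2  (Delta - d = 2)

def beta(g, h):
    return B_COEF * g * g, 2.0 * h - C_COEF * g * g

def flow(g, h, t_final, steps=6000):
    """RK4 integration of dg/dt = beta_g, dh/dt = beta_h with t = log(Lambda/Lambda0)."""
    dt = t_final / steps
    for _ in range(steps):
        k1g, k1h = beta(g, h)
        k2g, k2h = beta(g + 0.5 * dt * k1g, h + 0.5 * dt * k1h)
        k3g, k3h = beta(g + 0.5 * dt * k2g, h + 0.5 * dt * k2h)
        k4g, k4h = beta(g + dt * k3g, h + dt * k3h)
        g += dt * (k1g + 2.0 * k2g + 2.0 * k3g + k4g) / 6.0
        h += dt * (k1h + 2.0 * k2h + 2.0 * k3h + k4h) / 6.0
    return g, h

t_ir = -6.0  # Lambda / Lambda0 = exp(-6)
g_a, h_a = flow(1.0, 2.0, t_ir)  # two very different initial irrelevant couplings
g_b, h_b = flow(1.0, 0.5, t_ir)

spread = abs(h_a - h_b)
expected_spread = (2.0 - 0.5) * math.exp(2.0 * t_ir)  # (Lambda/Lambda0)^(Delta - d)
attractor_ratio = h_a / (0.5 * C_COEF * g_a ** 2)     # close to 1 on the attractor

print(spread, expected_spread, attractor_ratio)
```

In this toy model β_g does not depend on h, so the initial value of h leaves no imprint at all on g; in a more realistic flow it would shift where on the attractor manifold we end up.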

To see more closely the connection between (12.58) and renormalizability, we first should note that
although the matrix D, which depends on the cutoff Λ0 and initial couplings gi0 , appears in equation (12.58),
the relationship determining the non-renormalizable couplings in terms of the renormalizable ones actually
can’t depend on these. This is because the focusing behavior of the RG equation (12.33) is a purely local
affair in the space of couplings: we are solving a first-order differential equation, and we only need to
know what is going on in the vicinity of where we are solving it. The finite-dimensional attractor manifold
therefore cannot depend on where the flows started. This is the essential point: all low-energy observables
can be computed using only the low-energy action SΛ [ϕL ], and thus expressed entirely in terms of where
we are on the attractor manifold. We can parametrize where we are on this manifold using the low-energy
renormalizable couplings, in which case all results will depend only on these low-energy couplings and the
(low) cutoff scale Λ, NOT on the initial cutoff Λ0 or initial couplings gi0 . But this is precisely the statement
of renormalizability: all observables can be expressed as functions of the low-energy couplings and kinematic
variables without any dependence on the cutoff or the bare couplings.

12.5 Why renormalizability?


In the traditional approach to renormalizability there is a preferred set of theories, the renormalizable
theories, where only renormalizable terms in the action have nonzero coupling constants. This is a powerful
constraint on theories, as it forbids most of the potential terms one could write in the Lagrangian. At any
point you could have asked me why I did not add terms like ϕ⁴² or (∂µ ϕ∂ µ ϕ)² to the Lagrangian of our
interacting scalar theory, and in the end renormalizability is the reason. For example the standard model
of particle physics is a renormalizable theory (at least until we include neutrino masses and gravity), and
this seems essential for it to be predictive. In particular in the first lecture we mentioned as a great success
of quantum field theory that we can compute the anomalous magnetic moment of the electron to many
significant figures and it agrees with experiment. On the other hand there is a simple non-renormalizable
term we could add to the Lagrangian which would allow us to tune this quantity to be whatever we want,
and if this were allowed then we could no longer say that the standard model predicts a definite value.
In the Wilsonian approach however, we generically study actions with nonzero coupling constants for
all possible terms in the action. Does this mean that we have given up on the spectacular predictivity of
renormalizable theories? In fact we have not: what Polchinski’s theorem shows is that even if we allow
non-renormalizable terms to be turned on, at low energies we still only have a finite number of parameters for the
theory; in fact we have precisely the same number of parameters as we’d have had by restricting to
renormalizable terms alone. From the Wilsonian point of view, the point is not that we need to restrict to theories
with renormalizable terms only: instead the right statement is that even if we allow non-renormalizable
terms in the bare action, at low energies the theory will still look like a renormalizable theory!
In particular we do not lose any low-energy information by setting the non-renormalizable couplings to zero
in the bare theory, so we might as well do so: we are back to the old-fashioned renormalized perturbation
theory we constructed above.
How did non-renormalizable terms become so much less threatening than they seemed in the traditional
approach? The reason is that our formula (12.12) for the degree of divergence of a Feynman diagram assumed
that the coupling constants λi are independent of the cutoff scale Λ, while in the Wilsonian approach we take
λi = gi Λ^{d−∆i} . The increasing degree of divergence as we bring down more powers of a non-renormalizable
vertex is thus offset by the factor of Λ^{d−∆i} which multiplies gi . For this reason Wilson invented a new
terminology for classifying operators to replace the old one of renormalizable vs. non-renormalizable:

• An operator Oi with dimension ∆i < d is said to be relevant.

• An operator Oi with dimension ∆i = d is said to be marginal.

• An operator Oi with dimension ∆i > d is said to be irrelevant.
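As a quick illustration of this classification, the sketch below assigns each operator its free-field dimension, assuming the standard counting in which a scalar field has dimension (d − 2)/2 and each derivative adds one. This is only the leading approximation near a free fixed point; anomalous dimensions are ignored, and the helper names are ours.

```python
from fractions import Fraction

def free_dimension(n_fields, n_derivatives, d):
    """Free-field dimension of an operator built from n_fields scalars and n_derivatives derivatives."""
    return n_fields * Fraction(d - 2, 2) + n_derivatives

def classify(delta, d):
    if delta < d:
        return "relevant"
    if delta == d:
        return "marginal"
    return "irrelevant"

examples = {
    "phi^2 in d=4": classify(free_dimension(2, 0, 4), 4),
    "phi^4 in d=4": classify(free_dimension(4, 0, 4), 4),
    "phi^4 in d=3": classify(free_dimension(4, 0, 3), 3),
    "phi^6 in d=3": classify(free_dimension(6, 0, 3), 3),
    "(d phi d phi)^2 in d=4": classify(free_dimension(4, 4, 4), 4),
}

for name, label in examples.items():
    print(name, "->", label)
```

Exact fractions are used so that the marginal case ∆ = d is detected without floating-point ambiguity.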

In particular note the demotion of operators with ∆i > d from “non-renormalizable” to “irrelevant”: if
we change the dimensionless coupling for an irrelevant operator by an O(1) amount at short distance, the
only effect at low energies is a shift of where we are on the attractor manifold that could just as well have
been achieved by changing the coefficients of the relevant and marginal operators alone. These days most
non-ancient theoretical physicists prefer this terminology for classifying operators to the old one, and in fact
it has been something of a chore for me to not use it thus far. From now on I will switch to using it.

Figure 32: Integrating out a heavy particle of mass M creates new interactions for the light fields which are
suppressed by the mass of the particle.

12.6 Effective field theory


We’ve now seen that the RG flow equation tends to suppress information about high-energy physics. If
E is the energy scale where we are doing experiments, then high-energy details at some cutoff scale Λ are
suppressed by powers of the dimensionless ratio E/Λ. This is both a blessing and a curse: it means that we
can figure out a theory of low-energy physics without needing to know what is happening at high energy,
but it also makes it hard to figure out what is going on at high energy!
Fortunately for us there are two situations where this suppression of high-energy physics is not complete.
The obvious situation is when E/Λ isn’t that small: there is some new physics at an energy scale Λ which
is high but not too high. For example there could be a heavy particle with mass M interacting
with a massless particle at energy scale E ≪ M , in which case the massless particle would have irrelevant
interactions suppressed by powers of M arising from Feynman diagrams where the massive particle was
exchanged (see figure 32). As long as we work only up to a fixed order in E/M we only need to include a
finite number of diagrams, since including more propagators of the heavy particle gives us higher and higher
inverse powers of M .
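A simple way to see this expansion at work is to look at a single tree-level exchange. As an illustrative assumption, suppose the light field couples to the heavy one with strength g, so that the exchange contributes g²/(p² + M²) in Euclidean signature. Expanding in p²/M² gives a tower of local interactions with more and more derivatives, and truncating at order n leaves a relative error of (p²/M²)^(n+1).

```python
def exact_exchange(p2, mass, g=1.0):
    """Tree-level heavy-particle exchange, g^2 / (p^2 + M^2), in Euclidean signature."""
    return g * g / (p2 + mass * mass)

def truncated_exchange(p2, mass, order, g=1.0):
    """Local expansion (g^2 / M^2) * sum_n (-p^2 / M^2)^n, kept up to the given order."""
    x = p2 / (mass * mass)
    return (g * g / (mass * mass)) * sum((-x) ** n for n in range(order + 1))

mass = 10.0
p2 = 1.0  # so that (E/M)^2 = 0.01

exact = exact_exchange(p2, mass)
relative_errors = [abs(truncated_exchange(p2, mass, n) - exact) / exact for n in range(4)]

for n, err in enumerate(relative_errors):
    print("order", n, "relative error", err)
```

Each additional term corresponds to an operator with two more derivatives and two more inverse powers of M, so a fixed accuracy requires only finitely many of them.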
The less obvious, but still very important, situation where irrelevant interactions can’t be neglected is
when they are the only kind of interactions. The canonical example of this situation is in general relativity,
which is Einstein’s theory of gravity. Its Lagrangian density (with zero cosmological constant) is
\[
\mathcal{L} = \frac{1}{16\pi G} R,
\tag{12.59}
\]
where G is Newton’s constant and R is the Ricci scalar. To make this look more like quantum field theory
we define
\[
g_{\mu\nu} = \eta_{\mu\nu} + \sqrt{G}\, h_{\mu\nu},
\tag{12.60}
\]
in terms of which we heuristically have
\[
\mathcal{L} = \partial h\, \partial h \left( 1 + \sqrt{G}\, h + G h^2 + \ldots \right).
\tag{12.61}
\]
Here we have not indicated how the indices are contracted or attempted to compute O(1) factors. The key
point however is that all interaction terms are suppressed by powers of the Planck mass
\[
M_p = \sqrt{\frac{\hbar c}{G}} \approx 2.2 \times 10^{-8}\ \mathrm{kg}.
\tag{12.62}
\]
This may not seem like a large mass compared to your own mass, but it is a gigantic energy scale for
an elementary particle. For example it is about 10^19 times the mass of a proton, which is about 1 GeV.
Nonetheless gravity is a part of our everyday experience, since the tiny gravitational force of each proton
in the earth on each atom in our bodies adds up and there is no competing force to overwhelm it.
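The numbers in (12.62) and in the comparison with the proton are easy to reproduce. The sketch below uses CODATA values for ħ, c, G, the proton mass and the electron-volt, which are inputs we supply rather than values quoted in these notes.

```python
import math

HBAR = 1.054571817e-34        # J s
C_LIGHT = 299792458.0         # m / s
G_NEWTON = 6.67430e-11        # m^3 / (kg s^2)
PROTON_MASS = 1.67262192369e-27  # kg
JOULES_PER_GEV = 1.602176634e-10

planck_mass_kg = math.sqrt(HBAR * C_LIGHT / G_NEWTON)            # equation (12.62)
planck_energy_gev = planck_mass_kg * C_LIGHT ** 2 / JOULES_PER_GEV
ratio_to_proton = planck_mass_kg / PROTON_MASS

print(planck_mass_kg, planck_energy_gev, ratio_to_proton)
```

The output is roughly 2.18 × 10^-8 kg, or about 1.2 × 10^19 GeV, which is a bit more than 10^19 proton masses.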
Another example of a field with only irrelevant interactions is a real scalar field with a shift symmetry
ϕ′ = ϕ + a. The Lagrangian for this theory can only be made out of derivatives of ϕ, and the only relevant or
marginal term of this type is the massless kinetic term −(1/2) ∂µ ϕ∂ µ ϕ. We therefore need to include irrelevant
operators such as
\[
\mathcal{L} \supset -\frac{g}{\Lambda^d} \left( \partial_\mu \phi\, \partial^\mu \phi \right)^2
\tag{12.63}
\]
to get nontrivial scattering. This example has great physical relevance, as it arises whenever there is a
“spontaneously-broken continuous global symmetry”. For example this happens in nuclear physics, where
the pion fields are the scalars, and also in condensed matter physics systems such as liquid helium at the
critical point. We will learn more about these next semester.
The unifying theme of these examples is that we can parametrize the effects of unknown high-energy
physics by including irrelevant operators in the low-energy theory suppressed by powers of the energy scale
Λ of that unknown physics. The theory including these terms is only valid when viewed as computing an
expansion in E/Λ. Such a theory is called an effective field theory. Our current best understanding of the
laws of physics is an effective field theory, as it includes irrelevant operators to explain gravity and also the
observed nonzero values of neutrino masses. Effective field theories inevitably break down when we consider
energies of order Λ, and to understand what happens then we need to know the real high-energy physics.
For pions and liquid helium we already know this, while finding it for gravity is one of the biggest problems
in physics. We will meet effective field theories again in the next semester, and in fact there is an entire class
about them taught by Iain Stewart here at MIT.80

12.7 Fixed points and conformal symmetry


One of the most important aspects of the renormalization group is the possibility of fixed points, meaning
points g* in the space of couplings where
\[
\beta_i(g^*) = 0
\tag{12.64}
\]
for all i. The theories which live at these couplings are necessarily scale-invariant, meaning that in addition
to Poincaré symmetry they also have dilation symmetry
\[
x'^\mu = \lambda x^\mu.
\tag{12.65}
\]
This is because at a fixed point the theory has no dimensionful parameters, so any dimensionless observable
must be invariant under a change of units.
It is widely expected that in relativistic field theories a fixed point must also have a larger spacetime
symmetry group called conformal symmetry, which is the set of diffeomorphisms which send the spacetime
metric to a scalar multiple of itself.81 Infinitesimally, conformal transformations are generated by conformal
Killing vectors, which in Minkowski space obey
\[
\partial_\mu \xi_\nu + \partial_\nu \xi_\mu = A\, \eta_{\mu\nu}.
\tag{12.66}
\]
We can determine A by contracting both sides of this equation with η^{µν} , leading to
\[
\partial_\mu \xi_\nu + \partial_\nu \xi_\mu = \frac{2}{d}\, \partial_\alpha \xi^\alpha\, \eta_{\mu\nu}.
\tag{12.67}
\]
The general solution of this equation is
\[
\xi_\mu(x) = a_\mu + \omega_{\mu\nu} x^\nu + b\, x_\mu + 2 c_\alpha x^\alpha x_\mu - c_\mu x^2,
\tag{12.68}
\]
where ωµν = −ωνµ . The first two terms here are infinitesimal Poincaré transformations, while b parametrizes
an infinitesimal dilation. The vector cα parametrizes the infinitesimal version of what is called a special
conformal transformation. A quantum field theory with conformal symmetry is called a conformal field
theory, so in relativistic field theory a fixed point of the renormalization group likely always corresponds to
a conformal field theory. In what follows we will not need to use conformal symmetry however, so we will
stick to the language of fixed points.

80 You can find this class on OCW, highly recommended!
81 This expectation is a theorem for d = 2, proven by our friend Polchinski, and so far no counterexamples have been found
for d > 2.
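It is a useful exercise to check that (12.68) really solves the conformal Killing equation (12.67). The sketch below does this numerically in d = 4, assuming the mostly-plus metric η = diag(−1, 1, 1, 1) and arbitrary made-up values for the parameters a, ω, b and c. Since ξ is a quadratic polynomial in x, central finite differences give its derivatives exactly up to rounding.

```python
DIM = 4
ETA = [-1.0, 1.0, 1.0, 1.0]  # diagonal Minkowski metric, mostly-plus signature (assumed)

# Arbitrary illustrative parameters, all with lower indices.
A_VEC = [0.3, -1.2, 0.7, 0.5]   # translation a_mu
B_DIL = 0.8                     # dilation b
C_VEC = [0.2, -0.4, 0.1, 0.6]   # special conformal c_mu
OMEGA = [[0.0, 0.5, -0.3, 0.2],  # antisymmetric omega_{mu nu}
         [-0.5, 0.0, 0.9, -0.7],
         [0.3, -0.9, 0.0, 0.4],
         [-0.2, 0.7, -0.4, 0.0]]

def xi_lower(x_up):
    """xi_mu(x) from equation (12.68), as a function of the coordinates x^mu."""
    x_low = [ETA[m] * x_up[m] for m in range(DIM)]
    c_dot_x = sum(C_VEC[m] * x_up[m] for m in range(DIM))
    x_squared = sum(x_low[m] * x_up[m] for m in range(DIM))
    return [A_VEC[m]
            + sum(OMEGA[m][n] * x_up[n] for n in range(DIM))
            + B_DIL * x_low[m]
            + 2.0 * c_dot_x * x_low[m]
            - C_VEC[m] * x_squared
            for m in range(DIM)]

def gradient(x_up, h=1e-3):
    """grad[m][n] = d xi_n / d x^m by central differences."""
    grad = []
    for m in range(DIM):
        plus, minus = list(x_up), list(x_up)
        plus[m] += h
        minus[m] -= h
        xi_plus, xi_minus = xi_lower(plus), xi_lower(minus)
        grad.append([(xi_plus[n] - xi_minus[n]) / (2.0 * h) for n in range(DIM)])
    return grad

def killing_residual(x_up):
    """Largest component of the left side minus the right side of (12.67)."""
    grad = gradient(x_up)
    divergence = sum(ETA[m] * grad[m][m] for m in range(DIM))  # eta^{mn} d_m xi_n
    worst = 0.0
    for m in range(DIM):
        for n in range(DIM):
            rhs = (2.0 / DIM) * divergence * (ETA[m] if m == n else 0.0)
            worst = max(worst, abs(grad[m][n] + grad[n][m] - rhs))
    return worst

residual = killing_residual([0.4, -1.1, 2.0, 0.3])
print(residual)
```

The residual vanishes up to rounding error at any point, for any choice of the parameters with ω antisymmetric.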
Fixed points are natural “starting” and “ending” points for the renormalization group flow. The typical
situation is that we begin with a “UV” fixed point, deform by a relevant operator with a small coefficient,
and then flow off in the space of couplings until we reach some other “IR” fixed point. As a simple example
we can consider our old friend the free massive scalar theory:
\[
\mathcal{L} = -\frac{1}{2} \partial_\mu \phi\, \partial^\mu \phi - \frac{m^2}{2} \phi^2.
\tag{12.69}
\]
In this theory the only nontrivial coupling is the mass m², which we will parametrize as a function of the
cutoff by
\[
m^2 = g_2 \Lambda^2
\tag{12.70}
\]
as usual. The renormalization group equation is quite simple:
\[
g_2' = -2 g_2,
\tag{12.71}
\]
so the β-function is
\[
\beta_2(g_2) = -2 g_2.
\tag{12.72}
\]
Thus we see that to get a fixed point we need g₂ = 0, or equivalently m² = 0. This is quite sensible: if m²
is not zero then the theory has a dimensionful parameter and cannot be scale invariant. This fixed point is
the simplest conformal field theory: the massless free scalar. If we now deform the action by turning on a
small nonzero value g₂⁰ for g₂ at some cutoff scale Λ0 , then we have
\[
g_2(\Lambda) = g_2^0 \left( \frac{\Lambda_0}{\Lambda} \right)^2.
\tag{12.73}
\]
When Λ ∼ Λ0 this is a small contribution, but as we lower Λ it grows and once we get to the regime where
\[
\Lambda \sim \Lambda_0 \sqrt{g_2^0}
\tag{12.74}
\]

this deformation has a large effect on the theory. Of course in this case the right-hand side of (12.74) is just
the mass m, so it is hardly news that the mass becomes important for energies E ≲ m. Indeed below this
scale there are no states except for the ground state, so in this particular renormalization group flow the IR
fixed point is the trivial conformal field theory with zero degrees of freedom.
This last example may have you worried that the IR fixed point in quantum field theory is often the
trivial CFT. Indeed this is generically the case, for a simple reason: as long as the candidate IR CFT has at
least one scalar relevant operator which is invariant under all symmetries of the UV CFT, then generically
the RG flow is repelled from the fixed point in this direction so we need to tune a continuous parameter in
the initial conditions to hit it. See figure 33 for an illustration. If the candidate IR CFT has more than one
invariant relevant operator, then we need to tune a continuous parameter for each such operator. Sometimes
however you get lucky: the IR CFT can have no invariant relevant operators or there may be some reason
why they cannot be turned on. We will meet examples of this type next semester.

12.8 Critical phenomena


Let’s now use all this technology to predict something for a real experiment. The system we will study is a
classical Ising magnet in three spatial dimensions. This has a phase transition as we vary the temperature,
exhibiting spontaneous magnetization when T < Tc , where Tc is called either the Curie temperature or the
critical temperature. In particular for T ≈ Tc this system is described by the Euclidean version of our old
friend the massive scalar field ϕ, with Lagrangian
\[
\mathcal{L}_E = \frac{1}{2} \partial_\mu \phi\, \partial^\mu \phi + \frac{m^2}{2} \phi^2 + \frac{\lambda}{4!} \phi^4.
\tag{12.75}
\]
Here ϕ is essentially the average magnetization, the Ising spin-flip symmetry is represented as ϕ′ = −ϕ,
and82
\[
m^2 \propto T - T_c
\tag{12.76}
\]
so the phase transition happens at m = 0. This tuning to m = 0 is precisely the tuning mentioned at the end
of the previous section, so at low energy this theory at m = 0 should be described by a nontrivial conformal
field theory with one relevant operator that is invariant under the spin-flip symmetry. This theory is not so
easy to compute in, as for d = 3 the operator ϕ⁴ in the free theory (the UV fixed point) is relevant so at low
energy its dimensionless coupling becomes strong. Indeed finding a reliable way to do computations in the
IR fixed point of the critical Ising model in d = 3 is one of the most famous problems in theoretical physics.83
We will now see that by using a clever trick due to Wilson and Fisher we can compute some aspects of this
theory surprisingly reliably using results we have already obtained.

Figure 33: Renormalization group flows in the vicinity of a UV fixed point (shown in blue) with two relevant
operators and an IR fixed point (shown in red) with one relevant operator. To hit the IR fixed point, we
need to tune the initial flow direction from the blue point, otherwise we flow off to what is likely a trivial
theory.
We first need to get a sense of what kind of quantity we would like to compute. The first thing to note
is that as we flow to the IR fixed point some renormalization of the operator ϕ² will typically be necessary.
In other words the operator which has cutoff-independent correlation functions will have the form
\[
\phi^2 = [\phi^2]_0\, \Lambda^{\gamma_{\phi^2}},
\tag{12.77}
\]
where [ϕ²]0 is the “bare” ϕ² operator at the UV fixed point and γϕ² is called the anomalous dimension of
ϕ². The full energy dimension of ϕ² at the IR fixed point (working for the moment in d spacetime dimensions)
is thus
\[
\Delta_{\phi^2} = d - 2 + \gamma_{\phi^2}.
\tag{12.78}
\]
We can read off the anomalous dimension of ϕ² from its Euclidean two-point function at the critical point,
since by dimensional analysis this must be given by
\[
\langle \phi^2(x)\, \phi^2(y) \rangle = \frac{C}{|x-y|^{2\Delta_{\phi^2}}}
\tag{12.79}
\]
with C a dimensionless constant. It is convenient to take the Fourier transform of this, which by dimensional
analysis must be
\[
\langle \phi^2(k_1)\, \phi^2(k_2) \rangle = (2\pi)^d\, \delta^d(k_1 + k_2)\, D\, |k_1|^{2\Delta_{\phi^2} - d},
\tag{12.80}
\]
with D again a dimensionless constant. Since the quantity m²ϕ² must have dimension d, we must have
\[
[m^2] = d - \Delta_{\phi^2} = 2 - \gamma_{\phi^2},
\tag{12.81}
\]
so we can write
\[
m^2 = \xi^{\gamma_{\phi^2} - 2}
\tag{12.82}
\]
where ξ has units of length and is called the correlation length of the system. Combining this with (12.76),
we see that we must have
\[
\frac{1}{\xi} \sim (T - T_c)^{\nu}
\tag{12.83}
\]
with
\[
\nu = \frac{1}{2 - \gamma_{\phi^2}}.
\tag{12.84}
\]
ν here is an example of what is called a critical exponent, and the relation (12.83) is easily measurable
in a real magnet. There are other critical exponents for other thermodynamic quantities, and all of them
can be related to the energy dimensions of relevant operators in the IR CFT. Computing these dimensions
is thus the central problem in understanding the Ising phase transition.84

82 The argument for this is that m² should vanish at T = Tc by scale invariance and the effective Lagrangian should be analytic
in T , so generically it should vanish linearly.
83 For d ≥ 4 the ϕ⁴ coupling is marginal or irrelevant (and in the marginal d = 4 case it still flows to zero in the IR since the
one-loop β function is positive), so the IR CFT is just the massless free scalar theory. For d = 2 the scalar description breaks
down due to infrared divergences and other methods are needed; we will show next semester that the IR fixed point for d = 2
is actually a free fermion theory.

Now, following Wilson and Fisher, let’s see how to compute the anomalous dimension γϕ². The method
we will use is called the “ϵ-expansion”, and if this is the first time you are hearing it you may think I am
crazy. The idea is to continuously connect the nontrivial IR CFT in d = 3 to the free scalar CFT in d = 4
by taking d = 4 − 2ϵ, expanding perturbatively in ϵ, and then setting ϵ = 1/2. It is not clear a priori that
this is a good thing to do, but it turns out that the O(1) coefficients in this expansion work out in such a
way that ϵ = 1/2 is small enough to get a decent approximation.85 Let’s first recall that in d = 4 we found
the expression
\[
\beta = \frac{3\lambda^2}{16\pi^2}
\tag{12.85}
\]
for the β-function of the quartic coupling in λϕ⁴ theory. For d = 4 − 2ϵ this coupling becomes dimensionful,
so following the Wilsonian approach we should introduce a dimensionless coupling g4 via
\[
\lambda = g_4 \Lambda^{2\epsilon}.
\tag{12.86}
\]
The coupling g4 thus has nontrivial scale dependence even in the free theory, scaling like g4 (Λ) ∼ Λ^{−2ϵ}. Its
β-function at one loop is thus
\[
\beta = -2\epsilon\, g_4 + \frac{3 g_4^2}{16\pi^2}.
\tag{12.87}
\]
We can therefore find a fixed point by canceling these two terms against each other, leading to
\[
g_4^* = \frac{32\pi^2 \epsilon}{3}.
\tag{12.88}
\]
If we take ϵ → 1/2 this gives a rather large coupling in d = 3, but we can boldly press ahead and see what we
find for the anomalous dimension γϕ².

84 In fact the same IR CFT governs many other physical systems, including the critical point of the phase diagram of water. All
of these systems have the same critical exponents, which is a rather remarkable convergence given the great differences in the
underlying physics of these systems. This “universality” is a beautiful illustration of the focusing power of the renormalization
group.
85 There are other more modern (and more rigorous) approaches to doing this calculation, but for the most part they require
substantial numerical work while the ϵ-expansion gives quick analytic results that already work pretty well.

Figure 34: Leading diagrams contributing to an insertion of [ϕ²]0 into a correlation function.

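The fixed point (12.88) is not just a zero of the β-function (12.87) but an attractor as the cutoff is lowered. The sketch below checks this by integrating the one-loop flow towards the IR from a coupling well below the fixed-point value. The value ϵ = 0.05, the starting point and the integrator are illustrative choices of ours, and the one-loop β-function is only trustworthy for small ϵ.

```python
import math

def beta(g4, eps):
    """One-loop beta function (12.87) in d = 4 - 2*eps."""
    return -2.0 * eps * g4 + 3.0 * g4 * g4 / (16.0 * math.pi ** 2)

def fixed_point(eps):
    """Nontrivial zero of beta, equation (12.88)."""
    return 32.0 * math.pi ** 2 * eps / 3.0

def flow(g4, eps, t_final, steps=4000):
    """RK4 integration of dg4/dt = beta(g4), with t = log(Lambda/Lambda0)."""
    dt = t_final / steps
    for _ in range(steps):
        k1 = beta(g4, eps)
        k2 = beta(g4 + 0.5 * dt * k1, eps)
        k3 = beta(g4 + 0.5 * dt * k2, eps)
        k4 = beta(g4 + dt * k3, eps)
        g4 += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return g4

eps = 0.05
g_star = fixed_point(eps)
g_ir = flow(0.1 * g_star, eps, t_final=-200.0)  # flow deep into the IR

# Slope of beta at the fixed point: positive means deviations shrink as Lambda decreases.
slope = -2.0 * eps + 6.0 * g_star / (16.0 * math.pi ** 2)

print(g_star, g_ir, slope, fixed_point(0.5))
```

The slope at the fixed point is 2ϵ, so deviations from it die off like (Λ/Λ0 )^{2ϵ}. The last printed number is the fixed-point coupling at ϵ = 1/2, about 52.6, which is why the extrapolation to d = 3 is described as bold.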
For small ϵ we can compute γϕ² by studying the renormalization of the composite operator ϕ². So far
we have not discussed how to compute correlation functions of composite operators, but the basic idea is
simple: start with the pieces of the operator at different points, and then bring them together ignoring any
diagrams with propagators connecting the pieces of the operator. In particular for ϕ² we subtract a factor
of GF (0), removing the obvious divergence proportional to the identity operator as we bring the two ϕ’s
together. This renormalization of composite operators is called normal ordering, and it must be done even
in free field theory to define a sensible composite operator. We can also understand normal ordering in the
operator approach, where the divergence arises from the term
\[
\Phi^2(x) \supset \int \frac{d^{d-1}p}{(2\pi)^{d-1}} \int \frac{d^{d-1}p'}{(2\pi)^{d-1}}\,
\frac{1}{2\sqrt{\omega_{\vec p}\,\omega_{\vec p\,'}}}\, e^{i(p-p')x}\, a_{\vec p}\, a^\dagger_{\vec p\,'},
\tag{12.89}
\]
which has the divergent vacuum expectation value
\[
\langle \Phi^2(x) \rangle = \int \frac{d^{d-1}p}{(2\pi)^{d-1}}\, \frac{1}{2\omega_{\vec p}} = G_F(0).
\tag{12.90}
\]
What normal ordering does in free field theory is re-order all products of a’s and a† ’s so that the a† ’s are
to the left of the a’s, ensuring a vanishing vacuum expectation value. It is convenient to instead compute
correlation functions of the rescaled operator (1/2)ϕ², as Feynman diagrams for these have symmetry factors
that work in the way we are familiar with (the 1/2 is similar to the 1/4! we put in front of ϕ⁴, and cancels
the two ways that incoming propagators can be attached to the operator). The leading diagrams arising
from an insertion of (1/2)[ϕ²]0 are shown in figure 34. Evaluating these diagrams we see that at this order the
only effect is to multiply the Fourier transform
\[
\frac{1}{2} [\phi^2(p)]_0 = \frac{1}{2} \int d^d x\, e^{-ip\cdot x}\, [\phi^2(x)]_0
\tag{12.91}
\]
by a factor
\[
N_{\phi^2} = \left( 1 - \frac{\lambda}{2} I(p) + O(\lambda^2) \right),
\tag{12.92}
\]
where I(p) is our old friend
\[
I(q) = \int \frac{d^d \ell}{(2\pi)^d}\, \frac{1}{\ell^2 + m^2}\, \frac{1}{(\ell+q)^2 + m^2}
= \frac{1}{16\pi^2} \left( \frac{1}{\epsilon} - \gamma + \log(4\pi) - \int_0^1 dx\, \log\!\left( m^2 + x(1-x) q^2 \right) \right).
\tag{12.93}
\]
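Here we are interested in the massless case, so rewriting things in terms of g4 we have
\[
N_{\phi^2} = \left( 1 - \frac{g_4}{32\pi^2} \left( \frac{1}{\epsilon} - \gamma + \log(4\pi) + 2 + \log\frac{\Lambda^2}{p^2} \right) + O(g_4^2) \right).
\tag{12.94}
\]
Absorbing the finite one-loop contributions into a rescaling of the cutoff via
\[
\log \Lambda'^2 = \frac{1}{\epsilon} - \gamma + \log(4\pi) + 2 + \log \Lambda^2,
\tag{12.95}
\]
we can write this as
\[
\begin{aligned}
N_{\phi^2} &= \left( 1 - \frac{g_4}{32\pi^2} \log\frac{\Lambda'^2}{p^2} + O(g_4^2) \right) \\
           &= \left( \frac{\Lambda'^2}{p^2} \right)^{-\frac{g_4}{32\pi^2}} \left( 1 + O(g_4^2) \right).
\end{aligned}
\tag{12.96}
\]
Therefore we can remove the cutoff dependence by defining the renormalized operator
\[
\phi^2 = \Lambda'^{\frac{g_4}{16\pi^2}}\, [\phi^2]_0,
\tag{12.97}
\]
which means that the anomalous dimension of ϕ² is
\[
\gamma_{\phi^2} = \frac{g_4}{16\pi^2}.
\tag{12.98}
\]
At the fixed point (12.88) we therefore have
\[
\gamma_{\phi^2} = \frac{2\epsilon}{3},
\tag{12.99}
\]
so from (12.84) the critical exponent ν is given by
\[
\nu = \frac{1}{2} + \frac{\epsilon}{6} + O(\epsilon^2).
\tag{12.100}
\]
Boldly setting ϵ = 1/2 to get to d = 3 we therefore have the prediction
\[
\nu \approx \frac{7}{12} \approx .583.
\tag{12.101}
\]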
This prediction is fairly close to the experimental value .625 ± .006, which is nice. We can do better by working
to higher order in ϵ: the state-of-the-art calculation based on this method gives .6290 ± .0025,86 which is
in quite impressive agreement with the experimental value! In the next semester we will meet many other
experimental successes of quantum field theory.
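The arithmetic leading from (12.99) to (12.101) can be reproduced exactly with rational numbers. The sketch below evaluates both the first-order expansion (12.100) and, purely as a point of comparison, the unexpanded formula (12.84) with the one-loop γϕ² inserted. The latter is not a systematic improvement, since the O(ϵ²) terms it generates are not controlled by a one-loop calculation; the function names are ours.

```python
from fractions import Fraction

def gamma_phi2(eps):
    """Anomalous dimension at the one-loop fixed point, equation (12.99)."""
    return Fraction(2, 3) * eps

def nu_first_order(eps):
    """Critical exponent expanded to first order in eps, equation (12.100)."""
    return Fraction(1, 2) + eps / 6

def nu_unexpanded(eps):
    """Equation (12.84) with the one-loop anomalous dimension inserted, not expanded."""
    return 1 / (2 - gamma_phi2(eps))

eps = Fraction(1, 2)  # d = 4 - 2*eps = 3
nu_expanded = nu_first_order(eps)
nu_resummed = nu_unexpanded(eps)

print(nu_expanded, float(nu_expanded))
print(nu_resummed, float(nu_resummed))
```

The first line reproduces 7/12 ≈ .583, while the unexpanded formula gives 3/5. The difference between the two is one rough indication of the size of the neglected O(ϵ²) terms.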

86 These values are from Zinn-Justin’s book “Quantum field theory and critical phenomena”.
