QFT1
Contents
1 Why quantum field theory? 4
1.1 Combining quantum mechanics and special relativity . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.1 A notational aside . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.2 Relativistic propagator and causality . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.1.3 Creation and annihilation operators on multi-particle Hilbert space . . . . . . . . . . . 8
1.1.4 Quantum fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.2 Many-body quantum systems with local interactions . . . . . . . . . . . . . . . . . . . . . . . 12
1.3 Quantum field theory in quantum gravity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.4 Mathematical difficulties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.5 What this course is and is not . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.6 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
5.6 Path integral calculation of the Feynman propagator in field theory . . . . . . . . . . . . . . . 64
5.7 Euclidean path integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
5.8 Homework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
12 Renormalizability and the Renormalization Group 141
12.1 Power counting and renormalizability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
12.2 Cancellation of divergences in renormalizable theories . . . . . . . . . . . . . . . . . . . . . . 143
12.3 The Wilsonian approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
12.4 Polchinski’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
12.5 Why renormalizability? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
12.6 Effective field theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
12.7 Fixed points and conformal symmetry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
12.8 Critical phenomena . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
1 Why quantum field theory?
The goal of this class is to teach you quantum field theory (QFT), which is the central foundation
(together with general relativity) of most of contemporary theoretical physics. In 2024 you cannot claim to
understand the laws of physics without knowing some QFT, and here is an opportunity for you to learn it.
The full class is three semesters long, so in this semester we are just getting started.
QFT is not an easy subject to study: there are many subtle arguments and long calculations, and
moreover, as we will see throughout the class, QFT rests on somewhat shaky mathematical ground, which
can make it difficult to know which results are really solid. It is often said that nobody truly understands
QFT, and many current research seminars around the world are devoted to trying to understand how to
formulate it better. A consequence of this state of affairs is that, unlike for older subjects such as classical
electromagnetism, there is no settled way to teach QFT and the various textbooks are all written from quite
different perspectives. I’ll say a bit more about the perspective of this class at the end of the lecture.
Given the difficulty of the subject, it is important to understand at the outset where we are going and
why. The goal of this lecture and the following one is to present the basic conceptual motivations for thinking
about QFT, aiming to get some intuition for why it is a good idea to think about the quantum mechanics
of fields. There are three main motivations that we will consider:
Quantum field theory is (likely) the only way to create a quantum theory of interacting relativistic
particles. This is why quantum field theory is of great importance in particle physics: the standard
model of particle physics, which governs the interactions of elementary particles through the electro-
magnetic, strong, and weak forces, is a quantum field theory. For example one of the great triumphs of
the standard model of particle physics is its successful description of something called the anomalous
magnetic moment of the electron:
Theory:     a_e = 0.001159652181643(764)
Experiment: a_e = 0.00115965218073(28).   (1.1)
This theory calculation is a tour-de-force of quantum field theory, and we will compute the first few
digits of it next semester in QFT 2.
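In fact the first few digits can already be previewed: the leading QFT contribution is Schwinger's celebrated one-loop result a_e = α/2π, which we will derive next semester. A quick numerical check (taking the measured fine-structure constant as an external input) shows it already lands within about 0.2% of experiment:

```python
import math

# Schwinger's one-loop QED result: a_e = alpha / (2*pi).
# The fine-structure constant is an external input here, not derived.
alpha = 1 / 137.035999
a_schwinger = alpha / (2 * math.pi)

a_experiment = 0.00115965218073  # the experimental value quoted in (1.1)

rel_error = abs(a_schwinger - a_experiment) / a_experiment
print(a_schwinger, rel_error)  # ~0.0011614, off by only ~0.15%
```

The remaining discrepancy is exactly what the higher-loop corrections computed in the full theory account for.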
Quantum field theory is the natural language for describing the low-energy physics of many-body
quantum systems with local interactions. This is why quantum field theory is of great importance in
condensed matter physics: many important solid-state phenomena such as superconductivity, phase
transitions in magnets, and the fractional quantum Hall effect are quantitatively understood using the
machinery of quantum field theory. For example in an Ising magnet the spontaneous magnetization
M scales as
M ∝ (Tc − T)^β   (1.2)

for temperatures just below the critical temperature Tc , with the “critical exponent” β being given by

β = 1/8          for d_spatial = 2,
    0.326419(3)  for d_spatial = 3,   (1.3)
    1/2          for d_spatial = 4.
These exponents can be computed in quantum field theory: by the end of this semester we will be able
to understand the dspatial = 2 and dspatial = 4 cases, while the dspatial = 3 case (the hardest) is an
area of active ongoing research!
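The d_spatial = 4 entry β = 1/2 is the mean-field value, and we can already see numerically where it comes from: minimizing a Landau free energy f(M) = a(T − Tc)M² + bM⁴ just below Tc and fitting log M against log(Tc − T) recovers the exponent. The coefficients a, b below are arbitrary illustrative choices, not taken from the text:

```python
import numpy as np

# Landau mean-field free energy f(M) = a*(T - Tc)*M**2 + b*M**4.
# Its minimizer below Tc scales as M ~ (Tc - T)**(1/2), i.e. beta = 1/2.
a, b, Tc = 1.0, 1.0, 1.0               # arbitrary illustrative coefficients
Mgrid = np.linspace(0.0, 1.0, 100001)  # grid over the order parameter

T = np.linspace(0.90, 0.999, 40)       # temperatures just below Tc
M = np.array([Mgrid[np.argmin(a * (Temp - Tc) * Mgrid**2 + b * Mgrid**4)]
              for Temp in T])

# Slope of log M vs log(Tc - T) gives the critical exponent beta.
beta_fit = np.polyfit(np.log(Tc - T), np.log(M), 1)[0]
print(beta_fit)  # ~0.5, the mean-field exponent
```

The d_spatial = 2 and 3 values require going beyond mean-field theory, which is precisely what the field-theoretic machinery of this course provides.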
Quantum field theory arises ubiquitously in our most promising approach to combining quantum me-
chanics and gravity, which consists of a set of related ideas under the general umbrella of “string
theory”. There it arises both as the low-energy description of brane systems and also as the “holo-
graphic dual” of non-perturbative quantum gravity in spacetimes with negative cosmological constant.
One of the big successes of the latter is its confirmation in many cases of the Bekenstein-Hawking
formula for the entropy of a black hole:
S = A_horizon c³ / (4Gℏ).   (1.4)
We will now discuss each of these motivations in turn, focusing on the first two since the third is mostly
beyond the scope of this class. If you are new to QFT (as I hope many of you are), the arguments may go
by a bit fast for you. If that is the case do not worry: we will do most of these manipulations again in much
more detail in later lectures. Our goal here is to paint in broad strokes, getting a flavor of what is to come
in the weeks and months ahead!
(ii) The potential interactions are instantaneous, which is not compatible with the relativistic principle
that nothing can move faster than light.
Problem (i) is not too difficult to solve: it isn’t pretty, but we can just make the replacement
−(ℏ²/2m) ∑_{i=1}^N ∇ᵢ² → ∑_{i=1}^N √(−ℏ²c²∇ᵢ² + m²c⁴)   (1.6)
in equation (1.5). In this way it is fairly straightforward to make a relativistic theory of non-interacting
quantum particles, for example in the one-particle case the solutions of the Schrodinger equation can be
expanded in a basis of energy eigenstates
ψ(⃗x, t) = e^{i ⃗p·⃗x − i√(|⃗p|²c² + m²c⁴) t}.   (1.7)
Problem (ii) however is more serious: in order for quantum dynamics to be compatible with special relativity,
we need to make sure that all interactions are local in spacetime. Based on our experience with electromag-
netism, which is after all a relativistic theory, we can guess that the natural way to incorporate spacetime
locality is to introduce fields. When we move an electric charge here in Cambridge, it is not really true
that there is a physically-detectable Coulomb potential that immediately adjusts what is going on in the
Andromeda galaxy. What happens instead is that we create a ripple in the electromagnetic field which then
propagates outwards at the speed of light, updating the Coulomb field as it goes along. What is perhaps
more surprising is that it turns out that we need to introduce fields for the charges as well: an electron field,
a proton field, and so on.
Figure 1: Propagation inside the lightcone in 1 + 1 dimensions: in a theory where nothing is faster than light
a disturbance at (xi , ti ) should not be able to reach a point (xf , tf ) which is spacelike separated.
inverse energy, with ℏ as the conversion factor, and we can get rid of the latter by measuring distance in
units of seconds, with c as the conversion factor. From now on we will therefore work in units where
ℏ = c = 1, (1.8)
with all dimensionful quantities having units which are some power of energy. In particular length and time
are both measured in units of inverse energy, while mass is measured in units of energy. For example the
radius of the earth is
R⊕/(ℏc) = 1/(4.9 × 10⁻³³ J)   (1.9)
and the acceleration due to gravity at the Earth’s surface is
gℏ/c = 3.43 × 10⁻⁴² J.   (1.10)
These units are clearly not so practical for daily life, but in situations where both relativity and quantum
mechanics are important they are indispensable.
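As a sanity check on these conversions, here is a short numerical verification of (1.9) and (1.10) with ℏ and c restored (the Earth radius and g values are standard inputs, not from the text):

```python
# Check (1.9) and (1.10) by restoring the factors of hbar and c.
hbar = 1.0545718e-34   # J s
c = 2.99792458e8       # m / s

R_earth = 6.371e6      # m   (standard value, an input here)
g = 9.81               # m / s^2

# A length in inverse-energy units: R / (hbar c), quoted as 1/E in (1.9).
E_R = 1 / (R_earth / (hbar * c))
print(E_R)             # ~4.96e-33 J, matching (1.9)

# An acceleration in energy units: g hbar / c.
E_g = g * hbar / c
print(E_g)             # ~3.45e-42 J, matching (1.10)
```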
Figure 2: Deforming the contour in the complex p-plane. The defining contour is the lower dashed line, which can be smoothly deformed via two large circle segments at infinity to the upper contour which wraps the branch cut along the positive imaginary axis.
shows that the information that there is a particle located at position xi at time ti propagates faster than
light!
We can evaluate the propagator (1.11) by inserting a complete set of momentum eigenstates:
G(x_f, t_f; x_i, t_i) = ∫_{−∞}^{∞} (dp/2π) e^{ip(x_f − x_i) − i√(p² + m²)(t_f − t_i)}.   (1.13)
Going forward we might as well use translation invariance to set ti = xi = 0 and relabel tf = t, and xf = x,
in which case we can consider the simpler function
G(x, t) := ∫_{−∞}^{∞} (dp/2π) e^{ipx − i√(p² + m²) t}.   (1.14)
This integral is not so easy to evaluate analytically, and one can worry whether it even converges due to the
oscillatory behavior at infinity. Since we are assuming that 0 < t < x, the convergence of this integral is
controlled by the ipx term in the exponent. To make sure it is convergent, we can slightly rotate the phase
of the p integral so that it goes off to infinity at a small angle ϵ above the real p-axis in both directions (see
figure 2). This is convergent since at large positive p we have
e^{i e^{iϵ} p x} ≈ e^{−ϵpx + ipx}   (1.15)
while at large negative p we have
e^{i e^{−iϵ} p x} ≈ e^{ϵpx + ipx}.   (1.16)
Moreover by Cauchy’s theorem the answer is independent of ϵ since we can rotate the contour freely from
one ϵ to the next (the circle segments at infinity do not contribute since the integrand is exponentially
suppressed), and so we can take ϵ → 0 to recover the propagator. On the other hand to estimate the value
of the propagator, it is more convenient to instead rotate the contour up to wrap around the branch cut
that runs along the imaginary axis from p = im off to p = i∞ (see figure 2). Evaluating the integral on this
contour, we see that we have
G(x, t) = i ∫_m^∞ (dλ/2π) e^{−λx} ( e^{√(λ² − m²) t} − e^{−√(λ² − m²) t} )
        = (i/π) ∫_m^∞ dλ e^{−λx} sinh(√(λ² − m²) t).   (1.17)
The integrand here (ignoring the factor of i) is strictly positive for λ > m, and so we see that the propagator
is indeed nonzero outside of the lightcone! On the other hand it isn’t very nonzero: by using the monotonicity
of sinh y for y > 0 and then ignoring the negative exponential we have
∫_m^∞ dλ e^{−λx} sinh(√(λ² − m²) t) < (1/2) ∫_m^∞ dλ e^{−λ(x−t)} = e^{−m(x−t)} / (2(x − t)),   (1.18)
and thus
0 < |G(x, t)| < e^{−m(x−t)} / (2π(x − t)).   (1.19)
Therefore the propagator of a massive relativistic particle is suppressed exponentially outside the lightcone,
but it isn’t zero.
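This behavior is easy to confirm numerically. The following sketch (assuming scipy is available) evaluates the branch-cut representation (1.17) at a spacelike point and checks it against the bound (1.19), in units where m = 1:

```python
import numpy as np
from scipy.integrate import quad

m = 1.0
x, t = 5.0, 1.0  # spacelike separated: 0 < t < x

# Branch-cut representation (1.17):
# G(x,t) = (i/pi) * Int_m^inf dlam exp(-lam*x) * sinh(sqrt(lam^2 - m^2) * t)
val, _ = quad(lambda lam: np.exp(-lam * x) * np.sinh(np.sqrt(lam**2 - m**2) * t),
              m, np.inf)
G_abs = val / np.pi

bound = np.exp(-m * (x - t)) / (2 * np.pi * (x - t))  # the bound (1.19)
print(G_abs, bound)  # nonzero, but exponentially small and below the bound
```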
In a relativistic theory, there is something deeply wrong with being able to send information faster than
light. Indeed by doing a boost we can change the time ordering of any pair of spacelike-separated events, so
if we can communicate faster than light then we can also communicate backwards in time. In the presence
of interactions it is even worse: by sending a message to a point outside of your future lightcone and then
receiving a message back you can communicate directly with points in your own past lightcone. Such
things are called violations of causality, which is the principle that you shouldn’t be able to send signals to
your own past. Physics seems unlikely to make much sense in situations where causality is violated, so we
had better find a way to fix this.
[a(x), a(x′ )] = 0
[a† (x), a† (x′ )] = 0
[a(x), a† (x′ )] = δ(x − x′ ), (1.20)
and the zero-particle state |Ω⟩ is defined to be the one which is annihilated by all a(x). Other states are
created from |Ω⟩ by acting with creation operators, for example a one-particle state with wave function ψ(x)
is represented in this language by

|ψ⟩ = ∫ dx ψ(x) a†(x)|Ω⟩.   (1.21)
= ∫ dx′ ψ*(x′) ∫ dx ψ(x) δ(x − x′)
= ∫ dx |ψ(x)|²
= 1.   (1.22)
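This bookkeeping can be checked concretely by putting the theory on a spatial lattice, where δ(x − x′) becomes a Kronecker delta and a(x), a†(x) become truncated ladder-operator matrices. A minimal sketch with two sites and at most two quanta per site:

```python
import numpy as np

# Single-site ladder operator truncated to occupation numbers 0, 1, 2.
d = 3
a_site = np.diag(np.sqrt(np.arange(1, d)), k=1)

# Two lattice sites: each a(x) acts on its own tensor factor.
I = np.eye(d)
a = [np.kron(a_site, I), np.kron(I, a_site)]

vac = np.zeros(d * d)
vac[0] = 1.0  # |Omega>, annihilated by every a(x)

# Discrete version of (1.21): |psi> = sum_x psi(x) a^dag(x) |Omega>.
psi = np.array([0.6, 0.8])  # satisfies sum_x |psi(x)|^2 = 1
state = sum(psi[i] * a[i].conj().T @ vac for i in range(2))

print(np.vdot(state, state))  # 1.0, reproducing the norm computation (1.22)
```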
More generally a multi-particle state ψ(x₁, x₂, . . . , x_N) is represented by

|ψ⟩ = (1/√N!) ∫ dx₁ . . . dx_N ψ(x₁, . . . , x_N) a†(x₁) . . . a†(x_N)|Ω⟩.   (1.23)
We note in passing that since the a† (xi ) all commute with each other, the particles we are describing are
bosons:1
a† (x)a† (x′ )|Ω⟩ = a† (x′ )a† (x)|Ω⟩. (1.24)
The Hamiltonian is given by
H₀ = ∫ dx a†(x) √(−∂ₓ² + m²) a(x)
   = ∫ (dp/2π) √(p² + m²) a†(p) a(p),   (1.25)
2π
where in the second line we have introduced the Fourier-transformed annihilation operator
a(p) = ∫ dx e^{−ipx} a(x).   (1.26)
It may feel like we have suddenly introduced an entire new form of quantum mechanics, but except for
introducing a rule that the particles are bosons (which we couldn’t see before since we only considered one-
particle states), this is really just a different bookkeeping for the same old multi-particle quantum mechanics.
In particular the second expression for the Hamiltonian shows that the energy eigenstates are states of the
form
a† (p1 ) . . . a† (pN )|Ω⟩, (1.27)
with the total energy just being
E = ∑_{i=1}^N √(pᵢ² + m²).   (1.28)
In this language we can rewrite the propagator (1.14) as
G(x, t) = ⟨Ω|a(x)e−iH0 t a† (0)|Ω⟩
= ⟨Ω|a(x, t)a† (0)|Ω⟩
= ⟨Ω|[a(x, t), a† (0)]|Ω⟩,   (1.29)
where we have introduced the Heisenberg picture annihilation operator
a(x, t) = eiH0 t a(x)e−iH0 t . (1.30)
In fact this is true as an operator equation:
[a(x, t), a† (0)] = G(x, t). (1.31)
What we have learned from our discussion of the propagator is therefore that creation and annihilation
operators in the Heisenberg picture do not commute at spacelike separation. It is a bit tedious to work out,
but this also implies that the number operator
N (x, t) = a† (x, t)a(x, t), (1.32)
which counts how many particles there are at position x and time t, does not commute with itself at spacelike
separation. Unlike the creation/annihilation operators, the number operator is hermitian and thus should be
observable. If we are going to save causality, we thus need to argue that the number of particles at position
x and time t cannot actually be measured by someone in the vicinity of x and t!
1 If we want to get fermions, we should instead impose anticommutation relations {a(x), a(x′)} = {a†(x), a†(x′)} = 0 and
{a(x), a† (x′ )} = δ(x − x′ ), where {A, B} = AB + BA is called the anticommutator of A and B. We will discuss fermions in
more detail later in the semester.
Figure 3: Translation of a function f (x) by a. Note that here f ′ is the transformed function, not the
derivative, and that it is the inverse of the translation which appears in the argument of the function.
of an interaction density Hint (t, ⃗x) that commutes with itself at spacelike separation (otherwise we could
violate causality by measuring the energy density at spacelike separation). The easiest way to achieve this is
to construct an interaction density which commutes with itself at spatial separation, and then also demand
that it transform as a Lorentz scalar in the sense that2
U (Λ, a)† Hint (x)U (Λ, a) = Hint (Λ−1 (x − a)), (1.34)
where U (Λ, a) is the unitary operator on Hilbert space which implements the Poincaré transformation
x′µ = Λµν xν + aµ (1.35)
on the Hilbert space of the theory. This ensures commutativity at spacelike separation since if x and y are
spacelike separated there is always a Poincaré transformation that sends them to the same time slice and
we have assumed that Hint (x) commutes with itself at spatial separation. It may be puzzling that we used
the inverse Poincaré transformation in the argument of Hint ; the idea behind this is shown in figure 3: we
want to define the symmetry transformation to “move the scalar along with the symmetry”, meaning that
the “new” scalar at x should be equal to the “old” scalar at the point Λ−1 (x − a) where x “came from”.
You will also show in the homework that defining things this way is necessary for us to have two successive
Poincaré transformations combine in the natural way.
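The composition rule in question, (Λ₂, a₂) ∘ (Λ₁, a₁) = (Λ₂Λ₁, Λ₂a₁ + a₂), is easy to check with explicit matrices before doing the homework; here is a sketch in 1 + 1 dimensions with arbitrarily chosen boosts and translations:

```python
import numpy as np

def boost(eta):
    """1+1 dimensional Lorentz boost with rapidity eta, acting on (t, x)."""
    return np.array([[np.cosh(eta), np.sinh(eta)],
                     [np.sinh(eta), np.cosh(eta)]])

def poincare(Lam, a, x):
    """The transformation (1.35): x' = Lam x + a."""
    return Lam @ x + a

L1, a1 = boost(0.3), np.array([1.0, -2.0])   # arbitrary test transformations
L2, a2 = boost(-0.7), np.array([0.5, 4.0])
x = np.array([1.3, 2.1])

seq = poincare(L2, a2, poincare(L1, a1, x))   # apply (L1,a1), then (L2,a2)
composed = poincare(L2 @ L1, L2 @ a1 + a2, x) # the claimed single composition

print(np.allclose(seq, composed))  # True
```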
I’ll note in passing that by using time-dependent perturbation theory we can write a formula for the
particle scattering matrix in a theory with an interaction of this form (see chapter three of Weinberg volume
I) as
S = 1 + ∑_{n=1}^∞ ((−i)ⁿ/n!) ∫ dᵈx₁ . . . dᵈxₙ T{Hint(x₁) . . . Hint(xₙ)}.   (1.36)
If Hint is a Lorentz scalar then this is manifestly Lorentz-invariant except for the time-ordering symbol
T . As long as Hint commutes with itself at spacelike separation however, then the time ordering is also
independent of Lorentz frame and so S will indeed be Lorentz-invariant:

U(Λ, a)† S U(Λ, a) = S.   (1.37)
2 Here I’ll introduce a standard notation for the rest of the class: when I write ⃗x I mean a point in space, while when I write x I mean a point (t, ⃗x) in spacetime.
We will discuss scattering theory in more detail later in the class.
How then can we build an Hint which is a Lorentz scalar that commutes with itself at spacelike separation?
The only idea which seems to work is that it should be built out of fields: linear combinations of the creation
and annihilation operators of the form3
ϕᵢ(x) = ∑_{σ,n} ∫ (d^{d−1}p / (2π)^{d−1}) [ uᵢ(x; p, σ, n) a(p, σ, n) + vᵢ(x; p, σ, n) a†(p, σ, n) ],   (1.38)
[ϕᵢ(x), ϕⱼ(y)]± = [ϕᵢ(x), ϕ†ⱼ(y)]± = 0   for (x − y)² > 0,   (1.39)
and also that under Poincaré transformations we have a simple transformation law
U(Λ, a)† ϕᵢ(x) U(Λ, a) = ∑ⱼ Dᵢⱼ(Λ) ϕⱼ(Λ⁻¹(x − a)).   (1.40)
j
Here we have allowed for multiple species of particle labeled by n, and also for the particles to have spin σ, in which case the creation and annihilation operators need to be labeled by n and σ in addition to p. You will show in the homework that consistently composing Poincaré transformations requires the matrices
Dij (Λ) to furnish a representation of the Lorentz group in the sense that
∑ⱼ Dᵢⱼ(Λ₁) Dⱼₖ(Λ₂) = Dᵢₖ(Λ₁Λ₂).   (1.41)
some local interactions we could write down which are Lorentz scalars are
(V^µ V_µ)²,   V^µ V^ν ∂_µ V_ν,   V^µ V_µ ∂_ν V^ν,   (1.43)
which all commute with each other at spacelike separation since V µ (x) does.
So far you might be tempted to view this construction as just more bookkeeping: we are still working in
our old multi-particle Hilbert space and constructing things using creation and annihilation operators (albeit
in nice linear combinations). How can bookkeeping fix a problem with causality? The key point is that we
now make a fundamental shift in how we physically interpret all of the above equations:
⋆ In quantum field theory, we postulate that the observables that can be measured in the vicinity of a
spacetime point x are those constructed from the fields at x, not those constructed from the position-
space creation and annihilation operators a† (x) and a(x).
We will see in a few lectures that the a(x) and a† (x) are non-local when expressed in terms of the fields,
so the apparent failure of causality we saw above is really just a consequence of failing to identify the right
physical degrees of freedom. If we build a detector here in this room right now, the claim is that what it
really couples to are the fields and not the particles.
We are already in a position to see two of the most remarkable consequences of relativistic quantum field
theory:
3 Here we work in the “interaction picture”, where operators evolve under the free Hamiltonian H₀. Heisenberg picture fields in interacting theories cannot be decomposed in this way, and when interactions are strong the interaction picture is not useful, so this motivation needs some revisiting (see the next section).
4 Here ± indicates that for fields which create fermions we actually want anticommutativity instead of commutativity at spacelike separation.
In interacting quantum field theories, the number of particles is not conserved: we have
not yet specified the functions ui and vi , but we will see soon that both must be nonzero in order to
preserve commutativity at spacelike separation. Interactions which are polynomials of the fields will
thus always heuristically have the form (a + a† )n , which necessarily includes terms that do not have
the same number of creation and annihilation operators and thus do not conserve particle number. We
should therefore expect that particle scattering in field theory can create any set of particles for which
there is sufficient energy, at least as long as the final particles have the same symmetry charges as the
initial particles. The idea that energy can be freely converted into particles is quite natural from the
point of view of Einstein’s equation E = mc2 .
Every particle must have an antiparticle of equal mass and opposite charge: we will see
soon that the time-dependence of uᵢ and vᵢ is given by e^{∓i√(p² + m²) t}, so to preserve commutativity at
spacelike separation for all times, which requires a cancellation between terms involving both ui and
vi , these must have the same time-dependence and thus multiply creation/annihilation operators for
particles of the same mass. On the other hand they must have opposite charge under any continuous
internal symmetry. This is because in order to have an internal symmetry of the Lagrangian we need
the field to have a simple transformation law

e^{−iθQ} ϕᵢ(x) e^{iθQ} = e^{iθq} ϕᵢ(x),   (1.44)

where Q is the charge operator for the symmetry and q is the charge of the field, which means that
the annihilation and creation operators appearing in ϕi must both transform by a factor eiθq . This
means that the particles created by the creation operator have opposite charge of those annihilated by
the annihilation operator (you will show this in the homework). A particle can be its own antiparticle,
but only if it has q = 0 for all continuous symmetries.
There are other important general consequences we will understand later, including:
Spin-statistics theorem: in constructing ui and vi , it turns out that commutativity at spacelike
separation is only possible when the particles that are created/annihilated have integer spin. For
particles of half-integer spin, we instead need to impose anticommutativity at spacelike separation.
As mentioned above, commutativity leads to bosons and anticommutativity leads to fermions. Hence
we see that bosons must have integer spin and fermions must have half-integer spin.
CRT theorem: In any relativistic quantum field theory, it turns out that there is always a symmetry
that exchanges particles and antiparticles (C), reflects a spatial direction (R), and reverses time (T ).
All of these predictions have been confirmed to remarkable precision by experiment, for example colliding two
photons at high energy can produce an electron-positron pair, a neutron decays to a proton, an electron, and
an antineutrino, the antiparticle of the electron is the positron, electrons are fermions of spin 1/2 while photons
are bosons of spin one, and it was recently confirmed that hydrogen and antihydrogen have the same rate
for the 2s → 1s transition, as required by CRT symmetry.
Figure 4: A lattice system with two spatial dimensions. There are independent degrees of freedom at each
of the red sites, and each term in the Hamiltonian only couples degrees of freedom on nearby sites.
where i labels the sites on the lattice.5 We say that an operator is a local operator at site i if it is the tensor
product of an operator on Hi with the identity operator on all of the other sites. We are then interested in
Hamiltonians of the form

H = ∑ᵢ Oᵢ,   (1.46)
where each Oi is built from local operators at sites in the vicinity of i, meaning sites that are an O(1) number
of links away (as opposed to something that grows with the total size of the system). Such Hamiltonians are
called local Hamiltonians. For example our lattice could be ions in a crystal, and the degrees of freedom
at the sites could describe local displacements of the ions. Another example we will come back to repeatedly
is the quantum Ising model in a transverse field, where each Hi is a two-level system and the Hamiltonian
is given by

H = −∑ᵢ σₓ(i) − λ ∑_{⟨ij⟩} σ_z(i) σ_z(j),   (1.47)
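For a short chain this Hamiltonian can be diagonalized exactly by brute force. A sketch (N = 8 sites, periodic boundary conditions, parameters chosen for illustration) showing the energy gap above the ground state closing as λ approaches the critical point λ = 1:

```python
import numpy as np

sx = np.array([[0., 1.], [1., 0.]])
sz = np.array([[1., 0.], [0., -1.]])

def op_at(op, i, N):
    """Embed a single-site operator at site i of an N-site chain."""
    out = np.array([[1.0]])
    for j in range(N):
        out = np.kron(out, op if j == i else np.eye(2))
    return out

def ising_gap(lam, N=8):
    """Energy gap of the transverse-field Ising model (1.47), periodic chain."""
    H = sum(-op_at(sx, i, N)
            - lam * op_at(sz, i, N) @ op_at(sz, (i + 1) % N, N)
            for i in range(N))
    E = np.linalg.eigvalsh(H)
    return E[1] - E[0]

print(ising_gap(0.0), ising_gap(0.5), ising_gap(1.0))  # gap shrinks toward lam = 1
```

The vanishing of the gap at λ = 1 (up to finite-size corrections) is what allows long-distance, field-theoretic excitations to emerge there.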
with the energy density H(⃗x) being built out of operators localized at ⃗x. Moreover local operators at ⃗x
will commute with local operators at ⃗y , since they live on different tensor factors of the microscopic Hilbert
space (1.46). Thus it starts to look like a quantum field theory! This zooming out process is called the
renormalization group, and it is an idea of fundamental importance for any dynamical system (including
classical systems) with local interactions at short distances.
It is often the case that the interesting long-distance excitations of a many-body quantum system look
rather different than the fundamental lattice degrees of freedom. For example:
In a crystal, the fundamental degrees of freedom are protons and electrons interacting through Coulomb
forces but at long distances the excitations are phonons, which are ripples made out of vibrations in
the lattice structure.
5 This is not the most general possibility, as we could also add degrees of freedom on the links of the lattice, faces, etc, and
also perhaps constrain the physical states by imposing some kind of local constraint. We will see these generalizations arise
later when we consider gauge theories.
In quantum chromodynamics (QCD), which is the fundamental theory of strong interactions, the
fundamental degrees of freedom are quarks and gluons but the long-distance excitations are hadrons
such as protons, neutrons, and pions.
In 1 + 1 dimensions the fundamental degrees of freedom of the quantum Ising model (1.47) are Pauli
spins, but at the “critical point” λ = 1 the long-distance excitations are pairs of non-interacting massless
fermions. This is also the essence of why the two-dimensional classical Ising model is solvable.
These examples illustrate an important weak point of our above argument that we need quantum field
theory to combine special relativity and quantum mechanics: when the interactions of the fundamental
fields appearing in the Lagrangian are strong, such as in QCD at low energies, there need not be any simple
relationship between these fundamental fields and the low-energy particle excitations. What the argument
leading to (1.38) really constructs is a “low-energy effective field theory”, whose fields create and annihilate
the low-energy excitations. In quantum field theory the basic question we are often really trying to answer
is the following: given some short-distance formulation of the theory using local fields, what are the long-
distance excitations and how do they interact?
It is also worth emphasizing that, although we began this discussion by talking about particles, not all
quantum field theories lead to particles. Field theories without particles include “conformal field theories”,
which are more naturally understood in terms of correlation functions of local operators with simple scaling
transformations, and “topological field theories”, which are more naturally understood in terms of the algebra
of extended “surface” operators which can be freely deformed in spacetime. Moreover these are not weird
esoteric theories: the long-distance description of any second-order phase transition is a conformal field
theory, and the fractional quantum Hall effect is described by a topological quantum field theory. Even in
the standard model of particle physics, there are “infrared divergences” arising from the presence of massless
particles such as the photon and dealing with these correctly requires us to consider asymptotic states which
are clouds of infinite numbers of particles rather than individual particles. In quantum field theory it is the
fields that are essential, not the particles.
There is an important caveat to mention here: we are quite confident that the laws of nature are relativis-
tic, so in high-energy physics we are for the most part only interested in relativistic quantum field theories.
In condensed matter physics however Lorentz invariance can be broken by the existence of the material we
are studying, so the field theories that show up in condensed matter physics do not need to be relativistic.
Sometimes they are however, for example in the case of the quantum Ising model or the fractional quantum
Hall system, and the methods you learn in this class generalize to the non-relativistic case without much
difficulty.
by quantum field theory. In fact the AdS/CFT correspondence arises in string theory in precisely this way. This
also gives a novel way of constructing interesting quantum field theories that so far are not accessible by the
more conventional techniques based on Lagrangians that we will use in this class.
to use ’t Hooft and Veltman’s loony (but brilliant) dimensional regularization method for computing Feynman diagrams.
1.6 Homework
1. Let’s get some practice using natural units:
(a) What is the mass of the sun measured in Joules? What about in electron volts? Recall that
1 eV = 1.6 × 10−19 J.
(b) What is one meter in inverse electron volts?
(c) The mass of a proton is 1.67 × 10−27 kg. What is this in electron volts? What do we get if we
convert it to an inverse length? How does this length compare to the size of a nucleus?
(d) The mass of an electron is 9.1 × 10−31 kg. What is this in electron volts? What do we get if
we convert it to an inverse length? How does this length compare to the size of an atom? Any
thoughts about how this comparison went versus the one for the nucleus?
(e) Say that a force is quoted to you in units of eV2 . What factors of c and ℏ should you supply to
convert it back to Newtons?
(f) Say that an energy flux is quoted to you in units of eV4 . What factors of c and ℏ should you
supply to convert it back to Joules per meter squared per second?
If you are having trouble with these, a good way to proceed is to remember that ℏ has units of energy
times time and c has units of length over time. So you can use c to convert all lengths to times, and
then use ℏ to convert all times to energies. Masses can be converted to energy by multiplying by c².
2. Show that the multiparticle states (1.23) are normalized correctly. You will need to use the cre-
ation/annihilation algebra. If you are having trouble I recommend showing it recursively.
3. If an operator a annihilates particles of charge q, what is the commutator of the symmetry charge Q
with a and a† ? What are e−iQθ aeiQθ and e−iQθ a† eiQθ ?
4. Show that the second line of (1.25) follows from the first.
5. Show that if we apply the Poincaré transformations (Λ1 , a1 ) and then (Λ2 , a2 ) in succession, the
resulting Poincaré transformation is (Λ2 Λ1 , Λ2 a1 + a2 ). Then show that the field transformation (1.40)
is consistent with the composition rule U (Λ2 , a2 )U (Λ1 , a1 ) = U (Λ2 Λ1 , Λ2 a1 + a2 ) provided that the
matrix D obeys the Lorentz representation condition (1.41).
2 Lagrangian field theory
Having motivated the idea of quantum fields from various directions, we now commence studying them in
detail. We will begin with the classical theory of fields, starting from the Lagrangian point of view.7
In particle mechanics the dynamics are determined by requiring that the action
S[x] = \int_{t_i}^{t_f} dt\, L\big(x(t), \dot x(t); t\big) (2.1)
is stationary up to terms at the future/past boundaries.8 Note that the Lagrangian is local in time: at time t it only depends on the positions and velocities of the particles at time t. We have included t as a separate argument in the Lagrangian to allow it to have some explicit time-dependence, for example through a time-dependent background field that the particle is moving in. To study stationarity, we insert an infinitesimal variation
x′ (t) = x(t) + δx(t) (2.2)
into the action:
S[x'] = S[x] + \sum_a \int_{t_i}^{t_f} dt \left[ \frac{\partial L}{\partial x^a}\,\delta x^a(t) + \frac{\partial L}{\partial \dot x^a}\,\delta \dot x^a(t) \right]
     = S[x] + \sum_a \int_{t_i}^{t_f} dt \left[ \frac{\partial L}{\partial x^a} - \frac{d}{dt}\frac{\partial L}{\partial \dot x^a} \right]\delta x^a(t) + \left. \left( \sum_a \frac{\partial L}{\partial \dot x^a}\,\delta x^a(t) \right) \right|_{t_i}^{t_f}. (2.3)
The third term in the second line consists of a future boundary term and a past boundary term, so stationarity
means that the second term should vanish for all variations δxa (t). In other words the Euler-Lagrange
equations
\frac{\partial L}{\partial x^a} = \frac{d}{dt}\frac{\partial L}{\partial \dot x^a} (2.4)
must hold. For example if we have
L = \frac{m}{2}\,\dot x^2 - V(x), (2.5)
then we must have
m\,\ddot x^a = -\frac{\partial V}{\partial x^a}. (2.6)
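If you like, you can check this variational calculus symbolically. The sketch below (Python with sympy; the harmonic potential is an arbitrary illustrative choice, not something fixed by the notes) derives the equation of motion (2.6) from a Lagrangian of the form (2.5):

```python
import sympy as sp

# Verify the Euler-Lagrange equation (2.4) for L = (m/2) xdot^2 - V(x),
# taking the harmonic potential V = k x^2 / 2 as an arbitrary example.
t, m, k = sp.symbols('t m k', positive=True)
x = sp.Function('x')(t)
xdot = sp.diff(x, t)

L = sp.Rational(1, 2) * m * xdot**2 - sp.Rational(1, 2) * k * x**2

# d/dt (dL/dxdot) - dL/dx, which must vanish on physical trajectories
eom = sp.diff(sp.diff(L, xdot), t) - sp.diff(L, x)
print(sp.simplify(eom))  # m x'' + k x, i.e. (2.6) for this potential
```

The same few lines work for any potential you substitute for V, which is a quick way to sanity-check hand computations.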
We can pass to the Hamiltonian formalism by introducing the canonical momenta
p_a = \frac{\partial L}{\partial \dot x^a}, (2.7)
7 I’ll present the traditional approach that assumes the Lagrangian depends only on the fields and their first derivatives.
Later in the semester we will also be interested in theories with more derivatives in the Lagrangian: the traditional method for
dealing with this is to introduce auxiliary fields to rewrite the Lagrangian in a way that only involves first derivatives. For a
more modern approach that works directly with the original fields see my paper 1906.08616 with Jie-qiang Wu.
8 You may have been taught that the action should be stationary without qualification. This is true if we fix boundary
conditions at tf /ti , but doing that amounts to singling out some particular set of initial/final conditions. We are trying to
characterize the theory as a whole, so we shouldn’t bias the discussion by picking out some particular state of the system.
and the Hamiltonian is given by
H = \sum_a p_a \dot x^a - L. (2.8)
by a commutator
[x^a, p_b] = i\,\delta^a_b, (2.10)
which we represent on a Hilbert space spanned by eigenstates |x\rangle of the position operators.
The field trajectories are given by functions ϕa (t, ⃗x), where a is a label that runs over some finite number
of fields. This can be viewed as a generalization of the previous subsection in two different ways. The first
way is that we are now allowing the trajectories to depend on space as well as time, in which case we go
back to the particle case by taking d = 1. The second way is that we can think of each field at each point in
space as a distinct particle, in which case we have generalized the previous subsection to an infinite number
of particles. Our notation is more closely aligned to the former interpretation, but the latter is valuable
conceptually because it makes clear that fundamentally we shouldn’t have to do anything for fields that we
didn’t already do for particles.
To specify the field dynamics we need a Lagrangian. As we are interested in constructing field theories
which respect microcausality (i.e. commutativity at spacelike separation), we should take this Lagrangian
to be an integral over space of a local Lagrangian density:
L[\phi; t] = \int d^{d-1}x\, \mathcal{L}\big(\phi(t, \vec x), \dot\phi(t, \vec x), \vec\nabla\phi(t, \vec x); t, \vec x\big). (2.13)
Here \mathcal{L}(\phi, \dot\phi, \vec\nabla\phi; t, \vec x) is constructed out of the fields and their derivatives at (t, \vec x), and we have allowed for explicit dependence on space and time. I've written L[\phi; t] with square brackets to emphasize that it is a functional: it is a function of the functions \phi^a and \dot\phi^a throughout the timeslice at time t. A simple example of
a Lagrangian density is the free scalar field Lagrangian, where we have a single field ϕ(t, ⃗x) with
\mathcal{L}(\phi, \dot\phi, \vec\nabla\phi) = \frac{1}{2}\left(\dot\phi^2 - \vec\nabla\phi\cdot\vec\nabla\phi - m^2\phi^2\right), (2.14)
where m is a parameter that we will see next time gives the mass of the particles created by this field. We
could introduce explicit space and time dependence by letting m depend on t and ⃗x.
To find the equations of motion we adopt the same principle as before: the action
S := \int_{t_i}^{t_f} dt \int d^{d-1}x\, \mathcal{L}\big(\phi(t, \vec x), \dot\phi(t, \vec x), \vec\nabla\phi(t, \vec x); t, \vec x\big) (2.15)
should be stationary about physical field configurations up to future and past boundary terms. Computing
the variation, we find
\delta S = \sum_a \int_{t_i}^{t_f} dt \int d^{d-1}x \left[ \frac{\partial\mathcal L}{\partial\phi^a}\,\delta\phi^a(t,\vec x) + \frac{\partial\mathcal L}{\partial\dot\phi^a}\,\delta\dot\phi^a(t,\vec x) + \frac{\partial\mathcal L}{\partial\vec\nabla\phi^a}\cdot\vec\nabla\delta\phi^a(t,\vec x) \right]
 = \sum_a \int_{t_i}^{t_f} dt \int d^{d-1}x \left[ \frac{\partial\mathcal L}{\partial\phi^a} - \frac{d}{dt}\frac{\partial\mathcal L}{\partial\dot\phi^a} - \vec\nabla\cdot\frac{\partial\mathcal L}{\partial\vec\nabla\phi^a} \right]\delta\phi^a(t,\vec x)
 + \sum_a \int d^{d-1}x \left.\frac{\partial\mathcal L}{\partial\dot\phi^a}\,\delta\phi^a\right|_{t_i}^{t_f} + \sum_a \int_{t_i}^{t_f} dt \int_{S^{d-2}_\infty} d\vec A\cdot\frac{\partial\mathcal L}{\partial\vec\nabla\phi^a}\,\delta\phi^a(t,\vec x). (2.16)
The third line consists of future/past boundary terms and a spatial boundary term at the (d-2)-sphere S^{d-2}_\infty at spatial infinity. The former are acceptable, but the latter needs to vanish in order for the theory to
make sense. The usual way to deal with this is to impose spatial boundary conditions requiring the fields to
vanish at infinity, in which case the variations δϕa must also vanish at infinity and so this term vanishes.9
We therefore see that the action will be stationary (up to future/past terms) if the Euler-Lagrange equations
\frac{d}{dt}\frac{\partial\mathcal L}{\partial\dot\phi^a} + \vec\nabla\cdot\frac{\partial\mathcal L}{\partial\vec\nabla\phi^a} = \frac{\partial\mathcal L}{\partial\phi^a} (2.17)
are satisfied throughout spacetime. For example for our free scalar field Lagrangian we have
ϕ̈ − ∇2 ϕ = −m2 ϕ, (2.18)
which is a massive version of the wave equation known as the Klein-Gordon equation.
As in the particle case we can also introduce a canonical momentum
\pi_a \equiv \frac{\partial\mathcal L}{\partial\dot\phi^a}, (2.19)
in terms of which the Hamiltonian is given by
H[\phi; t] = \int d^{d-1}x\, \mathcal{H}\big(\phi(t,\vec x), \pi(t,\vec x), \vec\nabla\phi(t,\vec x); t, \vec x\big), (2.20)
where
\mathcal{H} = \sum_a \pi_a \dot\phi^a - \mathcal{L}. (2.21)
To complete the construction of the Hamiltonian formalism we need to solve (2.19) to determine \dot\phi in terms of \phi and \pi. Sometimes this is not possible due to constraints, in which case more sophisticated methods are needed that we will return to later. For the free scalar field there is no problem: we simply have
π = ϕ̇ (2.22)
and
\mathcal{H} = \frac{1}{2}\left(\pi^2 + \vec\nabla\phi\cdot\vec\nabla\phi + m^2\phi^2\right). (2.23)
9 It is also interesting to consider field theories in finite volume, in which case there is more to say about this term. For
Once the Hamiltonian formalism is constructed, we can then quantize the theory by converting the equal-time Poisson brackets
\{\phi^a(\vec x), \pi_b(\vec y)\} = \delta^a_b\,\delta^{d-1}(\vec x - \vec y) (2.24)
to commutators in the usual way and then representing them on a Hilbert space spanned by field eigenstates |\phi\rangle obeying
\Phi^a(0, \vec x)\,|\phi\rangle = \phi^a(\vec x)\,|\phi\rangle. (2.25)
We will carry out this procedure in detail for the free scalar field next time, but we note that in a Lorentz-
invariant theory the commutativity at spatial separation we see here extends to commutativity at spacelike
separation.
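It can be instructive to see these formulas at work numerically before quantizing. The sketch below (my own illustrative setup: a Gaussian initial profile, a periodic lattice, and a leapfrog integrator, none of which are specified in the notes) evolves the classical Klein-Gordon equation (2.18) in one spatial dimension and checks that the energy built from the Hamiltonian density (2.23) stays essentially constant:

```python
import numpy as np

# Evolve the 1+1 dimensional Klein-Gordon equation, phi'' = laplacian(phi) - m^2 phi,
# on a periodic spatial lattice; N, a, m, dt are arbitrary illustrative choices.
N, a, m, dt = 64, 1.0, 0.5, 0.05
x = a * np.arange(N)
phi = np.exp(-((x - 0.5 * N * a) ** 2) / 10.0)  # Gaussian initial profile
pi = np.zeros(N)                                 # start at rest

def laplacian(f):
    return (np.roll(f, -1) - 2 * f + np.roll(f, 1)) / a**2

def energy(phi, pi):
    # discretization of H = (1/2) integral (pi^2 + (grad phi)^2 + m^2 phi^2)
    grad = (np.roll(phi, -1) - phi) / a
    return 0.5 * a * np.sum(pi**2 + grad**2 + m**2 * phi**2)

E0 = energy(phi, pi)
for _ in range(1000):
    pi += 0.5 * dt * (laplacian(phi) - m**2 * phi)   # half step for pi
    phi += dt * pi                                    # full step for phi
    pi += 0.5 * dt * (laplacian(phi) - m**2 * phi)   # half step for pi
drift = abs(energy(phi, pi) / E0 - 1.0)
print(drift)  # small: the symplectic integrator nearly conserves energy
```

The leapfrog scheme is chosen because it respects the Hamiltonian structure of the equations, so the energy error stays bounded rather than drifting.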
u · v = uµ v ν ηµν , (2.28)
where11
\eta_{\mu\nu} := \begin{pmatrix} -1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & 1 \end{pmatrix} (2.29)
is the d-dimensional Minkowski metric and we are using the Einstein summation convention that sums
over pairs of repeated indices automatically. This inner product is preserved under Lorentz transformations
u'^\mu = \Lambda^\mu{}_\nu u^\nu
v'^\mu = \Lambda^\mu{}_\nu v^\nu,
which by definition are the linear transformations obeying
\Lambda^\mu{}_\alpha\,\Lambda^\nu{}_\beta\,\eta_{\mu\nu} = \eta_{\alpha\beta}. (2.30)
and its components as xµ . This is done for example in Wald’s book. We already have enough kinds of index to be getting on
with however, so we will stick to being somewhat cavalier about the difference between x and xµ (and analogously ⃗ x and xi ).
A similar remark applies about the difference between a function f and the evaluation f (x) of that function on an element x
of its domain, which we have already conflated several times.
11 Some benighted particle theorists use a horrid “mostly-minus” convention for η
µν that reverses its overall sign, and in this
context our convention is called “mostly-plus”. In general life is not improved by increasing the number of minus signs, and
that is absolutely the case here.
Indeed we have
u' \cdot v' = \Lambda^\mu{}_\alpha u^\alpha\,\Lambda^\nu{}_\beta v^\beta\,\eta_{\mu\nu} = \eta_{\alpha\beta}\, u^\alpha v^\beta = u \cdot v. (2.31)
The set of d × d matrices obeying (2.30) is called the Lorentz group, and we will have lots to say about it
later.
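As a quick numerical sanity check (the rapidity value below is an arbitrary choice), one can verify that a boost matrix satisfies the defining condition (2.30) in the mostly-plus convention:

```python
import numpy as np

# Check that a boost in the x^1 direction preserves the mostly-plus
# Minkowski metric, i.e. Lambda^T eta Lambda = eta, as in (2.30).
eta = np.diag([-1.0, 1.0, 1.0, 1.0])
r = 0.7                                   # rapidity, arbitrary
Lam = np.eye(4)
Lam[0, 0] = Lam[1, 1] = np.cosh(r)
Lam[0, 1] = Lam[1, 0] = np.sinh(r)
print(np.allclose(Lam.T @ eta @ Lam, eta))  # True
```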
It is also convenient to introduce d-component objects with a lowered Lorentz index, called one-forms,
which transform as
\omega'_\mu = \Lambda_\mu{}^\nu\,\omega_\nu. (2.32)
Here \Lambda_\mu{}^\nu indicates the transpose of the inverse of \Lambda^\mu{}_\nu, meaning that it obeys
\Lambda^\nu{}_\mu\,\Lambda_\nu{}^\alpha = \delta_\mu^\alpha (2.33)
with \delta_\mu^\alpha being the Kronecker delta that is equal to one if \alpha = \mu and zero otherwise. A simple example of
a one-form is the scalar gradient
\partial_\mu\phi = (\dot\phi, \vec\nabla\phi), (2.34)
which transforms with the inverse-transpose of Λ because the partial derivative transforms opposite to the
spacetime coordinates. We can compute the inner product of two one-forms by using the inverse metric η µν ,
which again is diagonal with diagonal elements (−1, 1, . . . , 1):
ω · σ = ωµ σν η µν . (2.35)
You will show on the homework that we can use the metric to turn a vector into a one-form and the inverse
metric to turn a one-form into a vector by “lowering” and “raising” the indices
u_\mu := \eta_{\mu\nu}\, u^\nu
\omega^\mu := \eta^{\mu\nu}\,\omega_\nu, (2.36)
and also that the inverse-transpose Lorentz transformation \Lambda_\mu{}^\nu is indeed obtained by raising/lowering the indices of the original Lorentz transformation \Lambda^\mu{}_\nu in this way.
Using this notation we can write the free scalar field Lagrangian density more elegantly in a few different
ways,
\mathcal L(\phi, \partial\phi) = -\frac{1}{2}\left(\partial_\mu\phi\,\partial_\nu\phi\,\eta^{\mu\nu} + m^2\phi^2\right)
 = -\frac{1}{2}\left(\partial_\mu\phi\,\partial^\mu\phi + m^2\phi^2\right)
 = -\frac{1}{2}\left(\partial\phi\cdot\partial\phi + m^2\phi^2\right), (2.37)
and the Klein-Gordon equation becomes
\left(\partial^2 - m^2\right)\phi = 0. (2.38)
More generally we can write the action as
S[\phi] = \int d^d x\, \mathcal L\big(\phi(x), \partial\phi(x); x\big), (2.39)
in terms of which the Euler-Lagrange equations (2.17) take the covariant form
\partial_\mu\frac{\partial\mathcal L}{\partial\,\partial_\mu\phi^a} = \frac{\partial\mathcal L}{\partial\phi^a}. (2.40)
2.4 Symmetries and currents
One of the most important advantages of the Lagrangian formalism is the close relationship between sym-
metries and conserved quantities. By definition a symmetry in the Lagrangian formalism is an invertible
change of variables
ϕ′a (x) = F [ϕ] (2.41)
which leaves the action invariant up to future/past boundary terms. A particularly interesting kind of
symmetry is a continuous symmetry, which is a family of symmetries Fθ labeled by a continuous parameter
θ such that when θ = 0 the transformation Fθ is the identity. In particular we can take θ to be infinitesimal,
in which case we’ll call it ϵ, and we then have a field theory version of Noether’s theorem: any infinitesimal
transformation of the fields that leaves the action invariant up to future/past boundary terms leads to a
conserved current. Indeed consider an infinitesimal transformation12
ϕ′a (x) = ϕa (x) + ϵδS ϕa (x) (2.42)
that to first order in ϵ leaves the action (2.39) invariant up to future/past boundary terms.13 Here δS ϕa
is some function of ϕa and its derivatives at x, and possibly also x itself explicitly. One way for this to
happen is for the Lagrangian density itself to be invariant, but more generally its transformation could be
the divergence of a vector since that would still integrate to a future/past boundary term (assuming that the
spatial boundary terms vanish). More explicitly, in order for the transformation (2.42) to be a symmetry we
need
\delta_S\mathcal L = \sum_a\left[\frac{\partial\mathcal L}{\partial\phi^a}\,\delta_S\phi^a + \frac{\partial\mathcal L}{\partial\,\partial_\mu\phi^a}\,\partial_\mu\delta_S\phi^a\right] = \partial_\mu\alpha^\mu, (2.43)
where αµ is some local function of ϕ and ∂ϕ. We can rewrite this expression as
\partial_\mu\left(\sum_a\frac{\partial\mathcal L}{\partial\,\partial_\mu\phi^a}\,\delta_S\phi^a - \alpha^\mu\right) = \sum_a\left(\partial_\mu\frac{\partial\mathcal L}{\partial\,\partial_\mu\phi^a} - \frac{\partial\mathcal L}{\partial\phi^a}\right)\delta_S\phi^a, (2.44)
and then observe that the right-hand side vanishes for field configurations ϕa (x) that obey the Euler-Lagrange
equations (2.40). In other words we see that the Noether current
J^\mu(x) := -\sum_a\frac{\partial\mathcal L}{\partial\,\partial_\mu\phi^a}\,\delta_S\phi^a(x) + \alpha^\mu(x) (2.45)
is conserved, \partial_\mu J^\mu = 0, when the equations of motion hold, so the associated charge Q := \int d^{d-1}x\, J^0 is independent of time. We have chosen the overall sign and normalization of J^\mu so that Q is the generator of the symmetry in the sense that for any observable O we have the Poisson bracket14
{Q, O} = δS O. (2.49)
12 It is not obvious, but every infinitesimal symmetry can be “exponentiated” to produce a continuous symmetry so the two
ideas are equivalent.
13 It is important here that the action needs to be invariant for any ϕa (x), not just solutions of the equations of motion. In
the latter case the action is always invariant to first order under any continuous transformation!
14 This Poisson bracket is easy to derive when \delta_S\phi depends only on \phi and not its derivatives and \alpha^\mu = 0. The general case is tricky and I haven't found a textbook discussion; the only derivation I know is given in section 4.2 of 1906.08616.
After quantization this becomes
[Q, O] = iδS O. (2.50)
As a simple example of this construction let’s return to our free scalar theory and now set the mass m
to zero. The Lagrangian density is then invariant under the shift symmetry
\phi'(x) = \phi(x) + \theta, (2.51)
whose infinitesimal version has
\delta_S\phi = 1, \qquad \alpha^\mu = 0. (2.52)
where Λ is a Lorentz transformation obeying (2.30) and a is an arbitrary vector, but to interpret it as a
dynamical symmetry for Noether’s theorem we need to recast it as a transformation of the fields rather than
the coordinates.15 On a scalar field the transformation is simple to write down: we have
\phi'(x) = \phi\big(\Lambda^{-1}(x - a)\big). (2.55)
Here the inverse transformation appears inside the field so that the composition of Poincaré transformations works out correctly, as you showed on the previous homework.
To apply Noether’s theorem we need to understand infinitesimal Lorentz transformations, meaning we
should write
\Lambda^\mu{}_\nu = \delta^\mu_\nu + \epsilon\,\omega^\mu{}_\nu (2.56)
and substitute into (2.30) to see what the constraints are on \omega^\mu{}_\nu. Indeed we have
\eta_{\alpha\beta} = \left(\delta^\mu_\alpha + \epsilon\,\omega^\mu{}_\alpha\right)\left(\delta^\nu_\beta + \epsilon\,\omega^\nu{}_\beta\right)\eta_{\mu\nu} = \eta_{\alpha\beta} + \epsilon\left(\omega_{\alpha\beta} + \omega_{\beta\alpha}\right) + O(\epsilon^2),
so \omega_{\mu\nu} with lowered indices must be antisymmetric.
distinguished from a passive viewpoint where the coordinates transform and the fields stay the same. As in many situations,
here it is better to be active.
Figure 5: Killing vector fields for a spacetime translation, a spatial rotation, and a boost.
and thus each infinitesimal Poincaré transformation defines a vector field
\xi^\mu(x) := b^\mu + \omega^\mu{}_\alpha\, x^\alpha (2.61)
that we can view as pointing in the direction of the infinitesimal Poincaré transformation in question. For
example an infinitesimal boost in the x^1 direction has \omega^0{}_1 = \omega^1{}_0 = 1 and thus
\xi^\mu = (x^1, x^0, 0, \ldots, 0). (2.62)
These Killing vector fields are illustrated in figure 5. More generally a Killing vector field is by definition a
vector field for which
∂µ ξν + ∂ν ξµ = 0, (2.64)
as you can easily check is the case here.16 Contracting this equation with the inverse metric, we also see that
∂µ ξ µ = 0. (2.65)
In Minkowski space (2.61) gives the full set of Killing vectors. It is spanned by d − 1 infinitesimal boosts,
(d − 1)(d − 2)/2 infinitesimal rotations, and d infinitesimal spacetime translations. For d = 4 this gives three
boosts (in the x, y, and z directions), three rotations (in the xy, yz, and zx planes), three space translations
(in the x, y, and z directions), and one time translation.
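The claim that (2.61) satisfies the Killing equation (2.64) whenever \omega_{\mu\nu} is antisymmetric can be checked symbolically; here is a sketch for d = 4 (the symbol names are my own, not the notes'):

```python
import sympy as sp

# Check that xi^mu = b^mu + omega^mu_nu x^nu obeys the Killing
# equation (2.64) when omega_{mu nu} is antisymmetric (d = 4).
d = 4
eta = sp.diag(-1, 1, 1, 1)
x = sp.Matrix(sp.symbols('x0:4'))
b = sp.Matrix(sp.symbols('b0:4'))
a01, a02, a03, a12, a13, a23 = sp.symbols('a01 a02 a03 a12 a13 a23')
w_low = sp.Matrix([[0,    a01,  a02,  a03],
                   [-a01, 0,    a12,  a13],
                   [-a02, -a12, 0,    a23],
                   [-a03, -a13, -a23, 0]])   # omega_{mu nu}, antisymmetric
w_up = eta.inv() * w_low                     # omega^mu_nu = eta^{mu alpha} omega_{alpha nu}
xi_up = b + w_up * x                         # xi^mu as in (2.61)
xi_low = eta * xi_up                         # lower the index: xi_mu
killing = sp.Matrix(d, d, lambda mu, nu:
                    sp.diff(xi_low[nu], x[mu]) + sp.diff(xi_low[mu], x[nu]))
print(killing == sp.zeros(d, d))  # True
```

The cancellation is exactly \partial_\mu\xi_\nu = \omega_{\nu\mu}, so the Killing equation reduces to the antisymmetry of \omega_{\mu\nu}.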
By definition a theory which is Poincaré-invariant is one whose Lagrangian density is a scalar under
Poincaré transformations, meaning that
δS L = −ξ µ ∂µ L (2.66)
for any Killing vector ξ µ . You will check this equation (somewhat laboriously) for our free scalar theory in
the homework. Since ξ µ is a Killing vector, by equation (2.65) we have
ξ µ ∂µ L = ∂µ (ξ µ L) (2.67)
16 The motivation for this definition is that an infinitesimal coordinate transformation x'^\mu = x^\mu + \xi^\mu(x) leaves the spacetime metric invariant to first order in \xi.
and thus
δS L = ∂µ αµ (2.68)
with
αµ = −ξ µ L. (2.69)
For any Killing vector ξ µ we therefore have a conserved Noether current
J_\xi^\mu(x) = -\sum_a\frac{\partial\mathcal L}{\partial\,\partial_\mu\phi^a}\,\delta_S\phi^a(x) - \xi^\mu\,\mathcal L(x). (2.70)
For our free scalar field, where \delta_S\phi = -\xi^\nu\partial_\nu\phi, this becomes
J_\xi^\mu = -\xi_\nu T^{\mu\nu}, (2.71)
where
T^{\mu\nu} := \partial^\mu\phi\,\partial^\nu\phi - \frac{1}{2}\,\eta^{\mu\nu}\left(\partial_\alpha\phi\,\partial^\alpha\phi + m^2\phi^2\right) (2.72)
is called the energy-momentum tensor. It has two nice properties:
(1) Symmetry:
T µν = T νµ (2.73)
(2) Conservation:
∂µ T µν = 0. (2.74)
Indeed any tensor obeying these two properties has the feature that contracting it with a Killing vector gives
a conserved current:
∂µ (ξν T µν ) = ∂µ ξν T µν + ξν ∂µ T µν = 0. (2.75)
It is not obvious from (2.70) that we can in general write J_\xi^\mu in terms of a symmetric conserved energy-momentum tensor in this way, since when there are fields that are not scalars \delta_S\phi^a can involve derivatives of \xi^\mu, but it turns out that when such derivatives appear they can always be removed by shifting J_\xi^\mu by a local term whose divergence is identically zero.17 The resulting energy-momentum tensor has a more elegant equivalent definition as the derivative of the action with respect to the spacetime metric:
S[\phi, \eta_{\mu\nu} + \epsilon h_{\mu\nu}] = S[\phi, \eta_{\mu\nu}] + \frac{\epsilon}{2}\int d^d x\, T^{\mu\nu}(x)\, h_{\mu\nu}(x) + O(\epsilon^2). (2.76)
The metric is a symmetric tensor so this T µν obeys condition (1) automatically, and with a little more
differential geometry than we are requiring for this class you can also show that it obeys condition (2)
provided that there are no Lorentz-violating background fields.
We can understand the physical meaning of the energy momentum tensor by looking at the Noether
currents for pure spacetime translations with ωµν = 0. By definition the total momentum vector P λ , which
is the generator of spacetime translations, is given by
\int d^{d-1}x\, J_\xi^0 = -\xi_\lambda P^\lambda. (2.77)
Therefore we have
P^\mu = \int d^{d-1}x\, T^{0\mu}, (2.78)
17 See section 7.4 of Weinberg for the case where the Lagrangian has only first derivatives, as we’ve been considering here, or
so we can think of T 00 as the energy density and T 0i as the momentum density (hence the name of the
tensor). And indeed for our free scalar theory, from (2.72) we have
T^{00} = \frac{1}{2}\left(\dot\phi^2 + \vec\nabla\phi\cdot\vec\nabla\phi + m^2\phi^2\right), (2.79)
consistent with the Hamiltonian density (2.23). We can also define generators of pure Lorentz transformations
(with b^\mu = 0) via
\int d^{d-1}x\, J_\xi^0 = \frac{1}{2}\,\omega_{\mu\nu}\, J^{\mu\nu}, (2.80)
which gives
J^{\mu\nu} = \int d^{d-1}x\left(x^\mu\, T^{0\nu} - x^\nu\, T^{0\mu}\right). (2.81)
Here J ij is the angular momentum for a rotation in the ij plane, while J i0 is the generator of a boost in the
i direction.
2.6 Homework
1. Show that \Lambda_\mu{}^\nu = \eta_{\mu\alpha}\,\eta^{\nu\beta}\,\Lambda^\alpha{}_\beta is indeed the inverse-transpose of \Lambda^\mu{}_\nu, in the sense that \Lambda^\mu{}_\lambda\,\Lambda_\nu{}^\lambda = \delta^\mu_\nu.
2. Show that the gradient ∂µ ϕ transforms as a one-form under the Poincaré transformation ϕ′ (x) =
ϕ(Λ−1 (x − a)).
3. Show that if V µ is a vector then Vµ = ηµν V ν transforms as a one-form, and also that if ωµ is a one-form
then ω µ = η µν ων transforms as a vector.
4. The Lagrangian density for Maxwell theory is
\mathcal L = -\frac{1}{4}\, F^{\mu\nu} F_{\mu\nu},
where
Fµν = ∂µ Aν − ∂ν Aµ
is the field strength tensor and A_\mu is a one-form usually called the gauge potential or gauge field. The relationship between A_\mu and the usual scalar potential \phi and vector potential \vec A is that A_\mu = (-\phi, \vec A).
(a) Write out the Euler-Lagrange field equations which follow from the Maxwell Lagrangian. Use the
relativistic variables Aµ and Fµν .
(b) For d = 4, give expressions for the components of F_{\mu\nu} in terms of the usual electric and magnetic fields \vec E and \vec B, and use these to rewrite the equations of motion in terms of \vec E and \vec B. How do these relate to Maxwell's equations? Did you get all four equations, and if not where do the others come from?
(c) Now add a term A_\mu J^\mu to the Lagrangian density, where J^\mu = (\rho, \vec J) is the spacetime electric current. Show how this modifies the equations of motion, and check that for d = 4 it gives the correct charge and current terms in Maxwell's equations. In this part you can view J^\mu as a "background" current, meaning that when you compute the variation of the action you can take its variation to be zero. Eventually we will build J^\mu out of other fields which create charged particles, but this does not affect the equation of motion obtained by varying A_\mu.
5. The Lagrangian density for a complex free scalar field is given by
\mathcal L = -\partial^\mu\phi^*\,\partial_\mu\phi - m^2\,\phi^*\phi.
(a) Find the Euler-Lagrange equations for this action. In principle in computing variations you should
treat the independent fields as the real and imaginary parts of ϕ, but your life will be easier if you
can convince yourself that you can instead treat ϕ and ϕ∗ as the independent variables. Convince
yourself that you indeed can do this for a general Lagrangian density L(ϕ, ϕ∗ , ∂ϕ, ∂ϕ∗ ).
(b) Show that the transformation ϕ′ (x) = eiθ ϕ(x) is a symmetry for any θ, write out its infinitesimal
version (i.e. to linear order in θ), and construct the associated Noether current. Confirm explicitly
that this current is conserved as a consequence of the equations of motion. You again will do
better to view ϕ and ϕ∗ as the independent fields.
(c) Write an expression for the conserved symmetry charge Q, and check that it indeed generates the
symmetry transformation as in equation (2.49).
6. Show explicitly that the free real scalar Lagrangian density obeys the invariance condition (2.66) under
the infinitesimal transformation δS ϕ = −ξ µ ∂µ ϕ for any Killing vector ξ µ .
7. (extra credit) The action of a free scalar field in a general metric gµν is given by
S = -\frac{1}{2}\int d^d x\,\sqrt{-g}\left(\partial_\mu\phi\,\partial_\nu\phi\, g^{\mu\nu} + m^2\phi^2\right),
where g indicates the determinant of the matrix gµν and g µν is its inverse. Show that if we take
gµν = ηµν + ϵhµν , the energy momentum tensor we construct as in equation (2.76) is the same one we
found from the Noether current. To do this you need to look up or derive how the determinant and
inverse of a matrix respond to a small change in the matrix.
8. (extra credit) The Maxwell action in a general metric is
S = -\frac{1}{4}\int d^d x\,\sqrt{-g}\, F_{\mu\nu} F_{\alpha\beta}\, g^{\mu\alpha} g^{\nu\beta}.
What is the energy-momentum tensor which follows from varying this action with respect to g_{\mu\nu}? For
d = 4 write T00 in terms of the electric and magnetic fields; does the answer look familiar?
3 Quantization of a free scalar field
The previous lecture was rather formal. Formalism is good for organizing one’s thinking, but to really
understand things you need to get your hands dirty. In this lecture and the following one we will carry out
in detail the canonical quantization of a free scalar field in d spacetime dimensions, with Lagrangian density
\mathcal L = -\frac{1}{2}\,\partial_\mu\phi\,\partial^\mu\phi - \frac{m^2}{2}\,\phi^2. (3.1)
The word “free” here means that the Lagrangian is quadratic in the fields; we'll see that this implies that the particles in this theory are non-interacting. For now we will take ϕ to be real-valued; we will discuss soon how to generalize to the case of complex ϕ. The free scalar field is both simple and profound: it is
exactly solvable, and yet it illustrates many of the deep aspects of quantum field theory that we will return
to again and again. Before beginning it is worth emphasizing that this model is not only of interest as an
example: it has many physical realizations. Some examples in various dimensions:
The Higgs boson in the Standard Model of particle physics, discovered in 2012 at the LHC, is to first approximation described by a free scalar field with d = 4 and m = 125 GeV.
Helium 4 (He4 ) at low temperature and standard pressure is a special kind of liquid, called a superfluid,
which flows with zero viscosity. The low-energy excitations of this liquid are density waves called
phonons, and they are described by a free scalar field theory with d = 4 and m = 0. If we confine
Helium-4 to a two-dimensional surface, then it is described by a free scalar field with d = 3 and m = 0.
The protons and neutrons in nuclei are held together by exchanging particles called pions, and these
pions are governed at low-energy by free scalar fields with d = 4 and m = 134 MeV (for the π 0 ) and
m = 139 MeV (for the π ± ). The π 0 is a real scalar field, while the π ± are complex (as we will introduce
below).
In string theory the embedding of the string worldsheet into spacetime is described using free scalar
fields with d = 2 and m = 0.
In fact the 2016 Nobel prize in physics was awarded in substantial part for understanding the d = 2 version
of this theory!
with continuous indices. This is because we defined π as a partial derivative of the Lagrangian density, as opposed to a partial
derivative of the Lagrangian. The latter actually vanishes since it is multiplied by the infinitesimal dd−1 x, so in field theory
it is better to use the former. With a lattice regulator they are related by a power of the lattice spacing a, as we will see in a
moment.
The first step of canonical quantization is to represent this algebra on a Hilbert space, which in the particle
case we take to be the vector space of square-normalizeable wave functions. We can do the same thing here,
but we need to introduce a space of normalizeable wave functionals \Psi[\phi] = \langle\phi|\Psi\rangle.
Note that the states |\phi\rangle are labeled by functions \phi : \mathbb{R}^{d-1} \to \mathbb{R}, so \Psi is indeed a functional (a function of
a function). In order to compute the inner product between two wave functionals Ψ1 and Ψ2 , we need to
compute a functional integral
\langle\Psi_2|\Psi_1\rangle := \int D\phi\,\Psi_2[\phi]^*\,\Psi_1[\phi]. (3.7)
Functional integrals are rather delicate mathematical objects, as we will discuss in more detail when we get
to path integrals. Roughly speaking the idea is to define the measure as
D\phi := \prod_{\vec x} d\phi(\vec x), (3.8)
x
so in other words we integrate independently over the value of ϕ at each point in space.
To represent the algebra (3.4) on this Hilbert space, imitating nonrelativistic quantum mechanics we can
take
\Pi(\vec x) := \Pi(0, \vec x) = -i\,\frac{\delta}{\delta\phi(\vec x)}, (3.9)
where the quantity appearing on the right-hand side is the functional derivative defined by
\frac{\delta}{\delta\phi(\vec x)}\,\phi(\vec y) = \delta^{d-1}(\vec x - \vec y). (3.10)
Proceeding as in non-relativistic quantum mechanics, the next step is then to construct energy eigenstates
by solving the functional Schrödinger equation
\frac{1}{2}\int d^{d-1}x\left[-\frac{\delta^2}{\delta\phi(\vec x)^2} + \vec\nabla\phi(\vec x)\cdot\vec\nabla\phi(\vec x) + m^2\,\phi(\vec x)^2\right]\Psi[\phi] = E\,\Psi[\phi]. (3.12)
In principle solving this equation (including interactions and other types of fields) is “all there is” to quantum
field theory.19
We can make the functional Schrödinger formalism more rigorous by regularizing the theory using a
spatial lattice, so that the field variable is only defined on a discrete set of spatial points ⃗x which are part of
a lattice L. Taking L to be a cubic lattice, this more explicitly looks like
H = \frac{1}{2}\sum_{\vec x\in L} a^{d-1}\left[\Pi(\vec x)^2 + \sum_{\vec\delta}\left(\frac{\Phi(\vec x + \vec\delta) - \Phi(\vec x)}{a}\right)^2 + m^2\,\Phi(\vec x)^2\right], (3.13)
19 More carefully this is all there is to field theories which are constructed from Lagrangians. There are some exotic field
theories that do not seem constructable in this way, and studying them requires techniques that are mostly beyond the scope
of this class.
with
[Φ(⃗x), Π(⃗y )] = ia−(d−1) δ⃗x,⃗y (3.14)
and thus
\Pi(\vec x) = -i\,a^{-(d-1)}\,\frac{\partial}{\partial\phi(\vec x)}. (3.15)
Here a is the lattice spacing, and ⃗δ ranges over the orthogonal lattice displacements ax̂1 , ax̂2 , . . . , ax̂d−1 . The
functional Schrödinger equation then becomes
\frac{1}{2}\sum_{\vec x\in L} a^{d-1}\left[-a^{-2(d-1)}\frac{\partial^2}{\partial\phi(\vec x)^2} + \sum_{\vec\delta}\left(\frac{\phi(\vec x + \vec\delta) - \phi(\vec x)}{a}\right)^2 + m^2\,\phi(\vec x)^2\right]\Psi[\phi] = E\,\Psi[\phi], (3.16)
which is now just a second-order partial differential equation in many variables. If we also work in finite
volume, so that the total number of points is finite, then we can (at least in principle) try solving this
equation on a computer. In free theories this is not necessary since the theory can be solved exactly (see
below), but if we include interactions (such as say a ϕ4 term in the Hamiltonian) then this approach can be
viable.20
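In the free theory we can see this exact solvability concretely: the quadratic lattice Hamiltonian (3.13) is just a set of coupled harmonic oscillators, so diagonalizing the coupling matrix gives its normal-mode frequencies. A minimal sketch in d = 2 (one spatial dimension, with arbitrary illustrative values of N, a, and m):

```python
import numpy as np

# The d = 2 lattice Hamiltonian (3.13) describes N coupled oscillators.
# Diagonalizing the coupling matrix K gives the normal-mode frequencies,
# which match the lattice dispersion omega_k^2 = m^2 + (4/a^2) sin^2(k a / 2).
N, a, m = 16, 1.0, 0.7                  # arbitrary illustrative values
K = np.zeros((N, N))
for i in range(N):
    K[i, i] = m**2 + 2.0 / a**2
    K[i, (i + 1) % N] -= 1.0 / a**2     # periodic boundary conditions
    K[i, (i - 1) % N] -= 1.0 / a**2
omega_numeric = np.sort(np.sqrt(np.linalg.eigvalsh(K)))

k = 2 * np.pi * np.arange(N) / (N * a)  # allowed lattice momenta
omega_exact = np.sort(np.sqrt(m**2 + (4 / a**2) * np.sin(0.5 * k * a) ** 2))
print(np.max(np.abs(omega_numeric - omega_exact)))  # ~ machine precision
E0 = 0.5 * omega_numeric.sum()          # zero-point energy of the chain
```

The zero-point energy E0 grows with both the volume and the cutoff 1/a, a point we will meet again at the end of this lecture.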
which is precisely the classical equation of motion (in Hamiltonian form) for O(t). Thus in the free scalar
theory (where there is no issue of operator ordering since there are no terms in the Hamiltonian involving
both Φ and Π) the Heisenberg field
Φ(t, ⃗x) = eiHt Φ(⃗x)e−iHt (3.19)
should obey its classical equation of motion, namely the Klein-Gordon equation
(∂ 2 − m2 )Φ = 0. (3.20)
As discussed in the last lecture we will impose boundary conditions requiring the fields to vanish at spatial
infinity, and any solution of the Klein-Gordon equation which vanishes at spatial infinity can be expanded
in terms of a plane-wave basis set of solutions given by
f_{\vec k}(t, \vec x) = \frac{1}{\sqrt{2\omega_{\vec k}}}\, e^{i\vec k\cdot\vec x - i\omega_{\vec k} t} (3.21)
integral being the long-standing champion for many theories (including this one). Newer approaches which are gaining ground
are the “numerical bootstrap” and quantum simulation.
where
\omega_{\vec k} := \sqrt{\vec k\cdot\vec k + m^2}, (3.22)
and we have included the factor of \frac{1}{\sqrt{2\omega_{\vec k}}} for future convenience (it ensures that we end up with properly-normalized annihilation/creation operators below). Defining a spacetime momentum vector
k µ = (ω⃗k , ⃗k), (3.23)
in relativistic notation we have
f_{\vec k}(x) = \frac{1}{\sqrt{2k^0}}\, e^{ik\cdot x}. (3.24)
Expanding the Heisenberg field in terms of these solutions we have
\Phi(x) = \int\frac{d^{d-1}k}{(2\pi)^{d-1}}\left[f_{\vec k}(x)\, a_{\vec k} + f^*_{\vec k}(x)\, a^\dagger_{\vec k}\right]
 = \int\frac{d^{d-1}k}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec k}}}\left[e^{ik\cdot x}\, a_{\vec k} + e^{-ik\cdot x}\, a^\dagger_{\vec k}\right], (3.25)
where a_{\vec k} and a^\dagger_{\vec k} are operator coefficients in the mode expansion of the operator \Phi(x). The operator coefficients of f_{\vec k} and f^*_{\vec k} are hermitian conjugates because \phi is a real field and so \Phi needs to be a hermitian operator. The factor of \frac{1}{(2\pi)^{d-1}} is included as a matter of convenience: it has to appear somewhere due to
the way that Fourier transforms work, and this turns out to be the best place to put it. There is a mantra
for remembering where it goes which we’ll call Coleman’s rule:
⋆ Whenever you integrate over momentum there is a factor of 1/(2π) for each component, and whenever
you have a momentum-conserving δ-function then it comes with a factor of 2π for each component.
So far we haven't actually done much, but let's now see what the canonical commutation relations (3.4) have to say about the algebra of a_{\vec k} and a^\dagger_{\vec k}. The easiest way to do this is to use the Fourier transform to extract a_{\vec k} and a^\dagger_{\vec k} from the t = 0 fields \Phi(\vec x) and \Pi(\vec x). In doing such calculations there are two crucial identities:
\int d^{d-1}x\, e^{-i\vec k\cdot\vec x} = (2\pi)^{d-1}\,\delta^{d-1}(\vec k)
\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\, e^{i\vec k\cdot\vec x} = \delta^{d-1}(\vec x), (3.26)
where we have placed the factors of 2π in accordance with Coleman’s rule. Using the first of these we have
\int d^{d-1}x\, e^{-i\vec p\cdot\vec x}\,\Phi(\vec x) = \int d^{d-1}x\, e^{-i\vec p\cdot\vec x}\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec k}}}\left[e^{i\vec k\cdot\vec x}\, a_{\vec k} + e^{-i\vec k\cdot\vec x}\, a^\dagger_{\vec k}\right]
 = \int\frac{d^{d-1}k}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec k}}}\left[(2\pi)^{d-1}\delta^{d-1}(\vec k - \vec p)\, a_{\vec k} + (2\pi)^{d-1}\delta^{d-1}(\vec k + \vec p)\, a^\dagger_{\vec k}\right]
 = \frac{1}{\sqrt{2\omega_{\vec p}}}\left(a_{\vec p} + a^\dagger_{-\vec p}\right) (3.27)
and
\int d^{d-1}x\, e^{-i\vec p\cdot\vec x}\,\Pi(\vec x) = \int d^{d-1}x\, e^{-i\vec p\cdot\vec x}\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,(-i)\sqrt{\frac{\omega_{\vec k}}{2}}\left[e^{i\vec k\cdot\vec x}\, a_{\vec k} - e^{-i\vec k\cdot\vec x}\, a^\dagger_{\vec k}\right] (3.28)
 = -i\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\sqrt{\frac{\omega_{\vec k}}{2}}\left[(2\pi)^{d-1}\delta^{d-1}(\vec k - \vec p)\, a_{\vec k} - (2\pi)^{d-1}\delta^{d-1}(\vec k + \vec p)\, a^\dagger_{\vec k}\right]
 = -i\sqrt{\frac{\omega_{\vec p}}{2}}\left(a_{\vec p} - a^\dagger_{-\vec p}\right), (3.29)
and thus
a_{\vec p} = \frac{1}{\sqrt 2}\int d^{d-1}x\, e^{-i\vec p\cdot\vec x}\left[\sqrt{\omega_{\vec p}}\,\Phi(\vec x) + \frac{i}{\sqrt{\omega_{\vec p}}}\,\Pi(\vec x)\right]
a^\dagger_{\vec p} = \frac{1}{\sqrt 2}\int d^{d-1}x\, e^{i\vec p\cdot\vec x}\left[\sqrt{\omega_{\vec p}}\,\Phi(\vec x) - \frac{i}{\sqrt{\omega_{\vec p}}}\,\Pi(\vec x)\right]. (3.30)
We can then use these expressions together with the canonical commutation relations (3.4) to show that:
[a_{\vec p}, a_{\vec p\,'}] = \frac{i}{2}\int d^{d-1}x\int d^{d-1}y\, e^{-i\vec p\cdot\vec x - i\vec p\,'\cdot\vec y}\left([\Phi(\vec x), \Pi(\vec y)] - [\Phi(\vec y), \Pi(\vec x)]\right) = 0
[a^\dagger_{\vec p}, a^\dagger_{\vec p\,'}] = -[a_{\vec p}, a_{\vec p\,'}]^\dagger = 0
[a_{\vec p}, a^\dagger_{\vec p\,'}] = -\frac{i}{2}\int d^{d-1}x\int d^{d-1}y\, e^{-i\vec p\cdot\vec x + i\vec p\,'\cdot\vec y}\left([\Phi(\vec x), \Pi(\vec y)] + [\Phi(\vec y), \Pi(\vec x)]\right)
 = \int d^{d-1}x\, e^{i(\vec p\,' - \vec p)\cdot\vec x}
 = (2\pi)^{d-1}\,\delta^{d-1}(\vec p - \vec p\,'). (3.31)
These results should look familiar: they are the algebra of creation and annihilation operators for an infinite
number of harmonic oscillators, with the oscillators labeled by the spatial momentum p⃗. They are also
the momentum space version of the creation/annihilation operators on multi-particle Fock space that we
introduced back in the first lecture. Defining a vacuum state |\Omega\rangle by the property that
a_{\vec p}\,|\Omega\rangle = 0 (3.32)
for all \vec p (we will show in a moment that this is indeed the ground state of the Hamiltonian), we have
one-particle states of the form
a†p⃗ |Ω⟩, (3.33)
two-particle states of the form
a^\dagger_{\vec p}\, a^\dagger_{\vec p\,'}\,|\Omega\rangle, (3.34)
and so on.
To justify the words “vacuum” and “particle” here however, we need to study the Hamiltonian. This is
given by
H = \frac{1}{2}\int d^{d-1}x\left[\Pi(\vec x)^2 + |\vec\nabla\Phi(\vec x)|^2 + m^2\,\Phi(\vec x)^2\right], (3.35)
into which we should substitute our expression (3.25) for the Heisenberg field. This calculation is a bit tedious; I'll compute the first term here and you'll do the other two in the homework:
\frac{1}{2}\int d^{d-1}x\,\Pi(\vec x)^2 = \frac{1}{2}\int d^{d-1}x\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\sqrt{\omega_{\vec k}\,\omega_{\vec p}}}\left[-i\omega_{\vec k}\, e^{i\vec k\cdot\vec x} a_{\vec k} + i\omega_{\vec k}\, e^{-i\vec k\cdot\vec x} a^\dagger_{\vec k}\right]\left[-i\omega_{\vec p}\, e^{i\vec p\cdot\vec x} a_{\vec p} + i\omega_{\vec p}\, e^{-i\vec p\cdot\vec x} a^\dagger_{\vec p}\right]
 = \frac{1}{4}\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\omega_{\vec k}\left[a^\dagger_{\vec k} a_{\vec k} + a_{\vec k} a^\dagger_{\vec k} - a_{\vec k} a_{-\vec k} - a^\dagger_{\vec k} a^\dagger_{-\vec k}\right]. (3.36)
Combining this with the other two terms, which you will find cancel the a\,a and a^\dagger a^\dagger pieces, we arrive at
H = \frac{1}{2}\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\omega_{\vec k}\left(a^\dagger_{\vec k} a_{\vec k} + a_{\vec k} a^\dagger_{\vec k}\right). (3.37)
This looks quite a bit like the harmonic oscillator Hamiltonian, and we can make it look more so by using
the algebra (3.31):
H = \frac{1}{2}\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\omega_{\vec k}\left(a^\dagger_{\vec k} a_{\vec k} + a_{\vec k} a^\dagger_{\vec k}\right)
 = \frac{1}{2}\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\omega_{\vec k}\left(a^\dagger_{\vec k} a_{\vec k} + [a_{\vec k}, a^\dagger_{\vec k}] + a^\dagger_{\vec k} a_{\vec k}\right)
 = \int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\omega_{\vec k}\, a^\dagger_{\vec k} a_{\vec k} + \frac{1}{2}\int d^{d-1}k\,\omega_{\vec k}\,\delta^{d-1}(0). (3.38)
The first term here is just what we would like: the operator a^\dagger_{\vec k} a_{\vec k} is the number operator that counts how many particles there are of momentum \vec k, so this term says that each particle of momentum \vec k contributes \omega_{\vec k} to the energy. For example if we act on a one-particle state we have
\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\omega_{\vec k}\, a^\dagger_{\vec k} a_{\vec k}\, a^\dagger_{\vec p}|\Omega\rangle = \int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\omega_{\vec k}\, a^\dagger_{\vec k}\,(2\pi)^{d-1}\delta^{d-1}(\vec k - \vec p)|\Omega\rangle = \omega_{\vec p}\, a^\dagger_{\vec p}|\Omega\rangle, (3.39)
so one-particle states ap†⃗ |Ω⟩ are eigenstates of this term with eigenvalue ωp⃗ . Ignoring the second term, we
thus have succeeding in finding the eigenstates of the Hamiltonian!
What however are we to say about the second term in (3.38)? On the one hand it does not involve the
creation/annihilation operators and thus is proportional to the identity, which means that the eigenstates
we just found are also eigenstates of the full Hamiltonian. On the other hand it is embarrassingly infinite,
for two different reasons. The first reason is the δ-function evaluated at zero, which is an “infrared (IR)
divergence” arising because the momentum ⃗k is a continuous parameter. If we were to work in finite volume
V , then the momentum would be discrete and we would find δ d−1 (0) ∼ V . The second reason is the integral
over ⃗k, which diverges at large ⃗k since in continuum field theory we can have particles of arbitrarily high
momentum. This is called an “ultraviolet (UV) divergence”, and it is regulated if we introduce a lattice
with lattice spacing a since then it does not make sense to consider momenta larger than of order the “UV
cutoff”
$$\Lambda := \frac{1}{a}. \tag{3.40}$$
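Before combining the two cutoffs, the IR statement above ($\delta^{d-1}(0)\sim V$) is easy to see numerically. The following is an illustrative sketch (the function name and the standard $1{+}1$-dimensional lattice dispersion $\omega_k=\sqrt{m^2+4\sin^2(k/2)}$ are our own choices, not from the lecture): the zero-point energy $\frac12\sum_k\omega_k$ on a ring of $N$ sites is proportional to $N$, i.e. to the volume.

```python
import math

def zero_point_energy(N, m=1.0):
    """Vacuum energy (1/2) sum_k omega_k for a ring of N lattice sites,
    using the lattice dispersion omega_k = sqrt(m^2 + 4 sin^2(k/2))."""
    total = 0.0
    for n in range(N):
        k = 2 * math.pi * n / N   # allowed momenta on the ring
        total += 0.5 * math.sqrt(m * m + 4 * math.sin(k / 2) ** 2)
    return total

# Doubling the volume doubles the vacuum energy: the delta^{d-1}(0) ~ V divergence
ratio = zero_point_energy(2000) / zero_point_energy(1000)
print(abs(ratio - 2.0) < 1e-3)  # True
```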
With both cutoffs in place we therefore have
$$\frac{1}{2}\int d^{d-1}k\,\omega_{\vec k}\,\delta^{d-1}(0) \sim V\,\Lambda^d, \tag{3.41}$$
which you can check indeed has units of energy. What are we to make of this term? The essential point is
that since it is proportional to V , we can write it as a local integral of a constant over space:
$$\frac{1}{2}\int d^{d-1}k\,\omega_{\vec k}\,\delta^{d-1}(0) \sim \Lambda^d\int d^{d-1}x. \tag{3.42}$$
We would thus precisely get a term of this form if from the beginning we had taken the Lagrangian to include
a “cosmological constant” term
∆L = −ρ0 , (3.43)
and so the term (3.41) is usually called a renormalization of the cosmological constant. Somehow the
dynamics of our free scalar field have generated a gigantic energy density filling the universe! This is a quite
remarkable prediction, but unfortunately it is also quite inconsistent with our understanding of the world.
In the absence of gravity such an energy density would have no measurable effect, but gravity responds to
the total energy density and such a gigantic positive energy density would lead to a universe that tore itself
apart via exponential expansion on a timescale of order $\frac{1}{\Lambda}$. We don’t quite know what the scale of $\Lambda$ should be, but from the Large Hadron Collider it should at least be bigger than $\sim 10\,\mathrm{TeV}$, and this already tells us that $\frac{1}{\Lambda} \lesssim \frac{\hbar}{10\,\mathrm{TeV}} \sim 6\times 10^{-29}\,\mathrm{s}$. Not good. No es bueno. 很不好.
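The UV side of the scaling in (3.41) can be checked the same way. This is an illustrative numerical sketch with names of our own choosing: cutting the $d=4$ vacuum energy density $\frac12\int\frac{d^3k}{(2\pi)^3}\,\omega_{\vec k}$ off at $|\vec k|=\Lambda$ gives something growing like $\Lambda^4$, so doubling the cutoff should multiply it by roughly $2^4=16$.

```python
import math

def energy_density(cutoff, m=1.0, n=4000):
    """(1/2) * integral_{|k|<cutoff} d^3k/(2pi)^3 sqrt(k^2 + m^2), midpoint rule."""
    dk = cutoff / n
    total = 0.0
    for i in range(n):
        k = (i + 0.5) * dk
        total += 0.5 * k * k * math.sqrt(k * k + m * m) * dk
    return total / (2 * math.pi ** 2)   # angular integral gives 4 pi / (2 pi)^3

ratio = energy_density(100.0) / energy_density(50.0)
print(abs(ratio - 16.0) < 0.1)  # True: the vacuum energy density scales like Lambda^4
```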
What should we do? There is only one way out: we need to introduce an additional “bare” cosmological
constant term in the original Lagrangian,
$$\mathcal{L}_{\rm ct} \sim \Lambda^d, \tag{3.44}$$
called a counterterm, whose coefficient is precisely tuned to cancel the cosmological constant generated by
our free scalar field. The full Hamiltonian is then given by
$$H_{\rm ren} = \int\frac{d^{d-1}k}{(2\pi)^{d-1}}\,\omega_{\vec k}\,a^\dagger_{\vec k}a_{\vec k}, \tag{3.45}$$
so the vacuum has zero energy as hoped. This is our first example of a procedure called renormalization,
by which we carefully tune the coefficients in the Lagrangian in a Λ-dependent way to cancel UV divergences.
This may seem like a rather ugly fix. Why should the Lagrangian be fine-tuned in this way? How do we
know that there won’t be other UV divergences that can’t be canceled in this way? These are excellent
questions, and we will discuss them in considerable detail in the lectures to come.
For this reason non-relativistic systems are often formulated using $a_{\vec x}$ and $a^\dagger_{\vec x}$ instead of $\Phi(\vec x)$ and $\Pi(\vec x)$.
To understand how U (Λ) acts on the rest of the Hilbert space, we need to understand its action on the
creation and annihilation operators. Before doing this it is convenient to first understand the Lorentz transformation properties of the measure $\frac{d^{d-1}p}{(2\pi)^{d-1}}$. The easiest way to do this is to note that the full measure $\frac{d^dp}{(2\pi)^d}$, which integrates over $p^0$ as well as $\vec p$, is Lorentz-invariant, since Lorentz transformations preserve the Minkowski metric $\eta_{\mu\nu}$. We however only want to integrate over Lorentz vectors $p^\mu$ which obey the on-shell condition $p^0 = \omega_{\vec p}$. We can implement this using a Lorentz-invariant $\delta$-function, leading to a manifestly Lorentz-invariant measure
$$\frac{d^dp}{(2\pi)^d}\,2\pi\,\delta(p^2+m^2)\,\Theta(p^0). \tag{3.49}$$
The Heaviside Θ function here is one for p0 > 0 and zero for p0 < 0, and is there to make sure that the δ
function only picks out p0 = ωp⃗ (as opposed to p0 = −ωp⃗ ). Θ(p0 ) is Lorentz invariant because we are only
considering Lorentz transformations that do not reverse time. We can then relate this measure to $\frac{d^{d-1}p}{(2\pi)^{d-1}}$ via
$$\frac{d^dp}{(2\pi)^d}\,2\pi\,\delta(p^2+m^2)\,\Theta(p^0) = \frac{d^{d-1}p}{(2\pi)^{d-1}}\,\frac{dp^0}{2\pi}\,\frac{2\pi}{2p^0}\,\delta(p^0-\omega_{\vec p}) = \frac{d^{d-1}p}{(2\pi)^{d-1}}\,\frac{1}{2\omega_{\vec p}}, \tag{3.50}$$
so if we define
$$\Lambda^\mu{}_\nu\, p^\nu = (p^0_\Lambda, \vec p_\Lambda) \tag{3.51}$$
then
$$\frac{d^{d-1}p_\Lambda}{(2\pi)^{d-1}}\,\frac{1}{2\omega_{\vec p_\Lambda}} = \frac{d^{d-1}p}{(2\pi)^{d-1}}\,\frac{1}{2\omega_{\vec p}}. \tag{3.52}$$
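This invariance is easy to check numerically in $1{+}1$ dimensions (an illustrative sketch with names of our own choosing): under a boost with velocity $v$ the spatial momentum goes to $p_\Lambda = \gamma(p+v\,\omega_p)$, and the Jacobian $dp_\Lambda/dp = \omega_{\vec p_\Lambda}/\omega_{\vec p}$ exactly compensates the change of $\frac{1}{2\omega_{\vec p}}$.

```python
import math

m, v = 1.0, 0.6
gamma = 1.0 / math.sqrt(1.0 - v * v)

def omega(p):
    return math.sqrt(p * p + m * m)

def boosted(p):
    """Spatial component of the boosted on-shell momentum in 1+1 dimensions."""
    return gamma * (p + v * omega(p))

p, dp = 0.7, 1e-6
jacobian = (boosted(p + dp) - boosted(p - dp)) / (2 * dp)  # d p_Lambda / d p
lhs = jacobian / (2 * omega(boosted(p)))   # measure after the boost
rhs = 1.0 / (2 * omega(p))                 # measure before the boost
print(abs(lhs - rhs) < 1e-9)  # True: the measure is boost-invariant
```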
This also shows that we have
$$U(\Lambda)\,a^\dagger_{\vec p}\,U(\Lambda)^\dagger = N_{\vec p,\Lambda}\,a^\dagger_{\vec p_\Lambda}$$
for some constant $N_{\vec p,\Lambda}$ that we can determine by requiring $U(\Lambda)$ to be unitary. Indeed we want that
$$U(\Lambda)\,a_{\vec p}\,U(\Lambda)^\dagger = \sqrt{\frac{\omega_{\vec p_\Lambda}}{\omega_{\vec p}}}\,a_{\vec p_\Lambda},\qquad U(\Lambda)\,a^\dagger_{\vec p}\,U(\Lambda)^\dagger = \sqrt{\frac{\omega_{\vec p_\Lambda}}{\omega_{\vec p}}}\,a^\dagger_{\vec p_\Lambda}. \tag{3.57}$$
We can use this to work out the Lorentz transformations of the field:
\begin{align*}
U(\Lambda)\Phi(x)U(\Lambda)^\dagger &= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\left[e^{ip\cdot x}\,U(\Lambda)a_{\vec p}U(\Lambda)^\dagger + e^{-ip\cdot x}\,U(\Lambda)a^\dagger_{\vec p}U(\Lambda)^\dagger\right]\\
&= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{\sqrt{\omega_{\vec p_\Lambda}}}{\sqrt{2}\,\omega_{\vec p}}\left[e^{ip\cdot x}\,a_{\vec p_\Lambda} + e^{-ip\cdot x}\,a^\dagger_{\vec p_\Lambda}\right]\\
&= \int\frac{d^{d-1}p_\Lambda}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p_\Lambda}}}\left[e^{ip\cdot x}\,a_{\vec p_\Lambda} + e^{-ip\cdot x}\,a^\dagger_{\vec p_\Lambda}\right]\\
&= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\left[e^{i(\Lambda^{-1}p)\cdot x}\,a_{\vec p} + e^{-i(\Lambda^{-1}p)\cdot x}\,a^\dagger_{\vec p}\right]\\
&= \Phi(\Lambda x). \tag{3.58}
\end{align*}
Going from the first to the second line we used (3.57), going from the second to the third we used (3.52),
going from the third to the fourth we relabeled the integration variable p⃗Λ → p⃗, and in going from the fourth
to the fifth we used that
$$(\Lambda^{-1}p)\cdot x = (\Lambda^{-1})^\alpha{}_\beta\,p^\beta x_\alpha = p^\beta\,\Lambda_\beta{}^\alpha\,x_\alpha = p\cdot(\Lambda x). \tag{3.59}$$
Thus we see that indeed we have succeeded in constructing a Lorentz scalar out of creation/annihilation
operators which themselves have more complicated transformations, at least for Lorentz transformations
that do not reverse time. We will discuss time-reversal symmetry in a few lectures, where we will see that
it needs to be represented on Hilbert space by an antiunitary operator instead of a unitary operator.
Turning now to microcausality, let’s compute the commutator of Φ at spatial separation:
\begin{align*}
[\Phi(\vec x),\Phi(\vec y)] &= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\int\frac{d^{d-1}k}{(2\pi)^{d-1}}\frac{1}{2\sqrt{\omega_{\vec p}\,\omega_{\vec k}}}\left(e^{i\vec p\cdot\vec x - i\vec k\cdot\vec y}\,[a_{\vec p},a^\dagger_{\vec k}] - e^{-i\vec p\cdot\vec x + i\vec k\cdot\vec y}\,[a_{\vec k},a^\dagger_{\vec p}]\right)\\
&= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\omega_{\vec p}}\left(e^{i\vec p\cdot(\vec x-\vec y)} - e^{-i\vec p\cdot(\vec x-\vec y)}\right)\\
&= 0, \tag{3.60}
\end{align*}
where in going from the first to the second line we used (3.31) and in going from the second to the third
we flipped the sign of the integration variable in the second term. The point to notice however is that the
vanishing of this commutator required a nontrivial cancellation between two terms. For example if we had
tried to make the field Φ using only annihilation operators, then its commutator with its hermitian conjugate
would not vanish at spatial separation:
$$\left[\int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\,e^{i\vec p\cdot\vec x}\,a_{\vec p}\,,\ \int\frac{d^{d-1}k}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec k}}}\,e^{-i\vec k\cdot\vec y}\,a^\dagger_{\vec k}\right] = \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\omega_{\vec p}}\,e^{i\vec p\cdot(\vec x-\vec y)} \neq 0. \tag{3.61}$$
It is thus microcausality that requires us to use fields that involve both creation and annihilation
operators, leading to the distinctive predictions of particle number non-conservation and the existence of
antiparticles as discussed in the first lecture.
So far we have considered a single real scalar field; another simple example is a complex scalar field $\phi$, with Lagrangian density
$$\mathcal{L} = -\partial^\mu\phi^*\,\partial_\mu\phi - m^2\,\phi^*\phi. \tag{3.62}$$
You will show on the homework that the equation of motion for this theory is again just
$$\partial^2\phi = m^2\phi, \tag{3.63}$$
but when we expand the field in terms of solutions there is no longer a reason for the creation and annihilation
operators to be related. We thus should write
$$\Phi(x) = \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\left[e^{ip\cdot x}\,a_{\vec p} + e^{-ip\cdot x}\,b^\dagger_{\vec p}\right], \tag{3.64}$$
where $a_{\vec p}$ and $b_{\vec p}$ are not related. The canonical commutation relations follow from the observation that the momentum conjugate to $\Phi$ is $\Pi = \dot\Phi^\dagger$, so we have
$$[\Phi(\vec x),\dot\Phi^\dagger(\vec y)] = [\Phi^\dagger(\vec x),\dot\Phi(\vec y)] = i\,\delta^{d-1}(\vec x-\vec y), \tag{3.66}$$
with all other commutators vanishing. In the homework you will show that these imply that $a_{\vec p}, a^\dagger_{\vec p}$ and $b_{\vec p}, b^\dagger_{\vec p}$
give two independent sets of annihilation/creation operators. This theory thus has two species of particles,
both with mass m. You will also show that these particles have opposite charge under the symmetry
ϕ′ (x) = eiθ ϕ(x), and indeed one is the antiparticle of the other.
We turn now to correlation functions, which are the basic observables of quantum field theory. The simplest is the one-point function of a Heisenberg-picture field $O(x)$ in the vacuum,
$$\langle O(x)\rangle := \langle\Omega|O(x)|\Omega\rangle. \tag{3.67}$$
If O(x) is hermitian then the physical interpretation of this quantity is clear: it is the expectation value for
what we get if we measure O(x) in the ground state. In quantum field theory it is often (but not always) the
case that the one-point functions of the fields vanish. Usually more interesting is the two-point function:
for any two fields O1 (x1 ) and O2 (x2 ) we have
⟨O2 (x2 )O1 (x1 )⟩ := ⟨Ω|O2 (x2 )O1 (x1 )|Ω⟩. (3.68)
The two point function is important for many physical questions. Perhaps the most direct physical interpre-
tation is that when x1 and x2 are spacelike separated and O1 and O2 have vanishing one-point functions, the
two-point function is a measure of how correlated the fluctuations are in measurements of the independent
observables O1 and O2 (we need to assume spacelike separation to ensure the operators are independent,
i.e. commuting). More generally if their one-point functions don’t vanish we can still quantify the amount
of correlation using the connected two-point function
⟨O2 (x2 )O1 (x1 )⟩c := ⟨O2 (x2 )O1 (x1 )⟩ − ⟨O2 (x2 )⟩⟨O1 (x1 )⟩. (3.69)
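The content of the connected correlator is easy to see in a toy Monte Carlo model (purely illustrative; none of these names or numbers come from the lecture): give two "operators" nonzero means plus a shared fluctuation, and the subtraction in (3.69) isolates the variance of the shared piece.

```python
import random

random.seed(1)
N = 200_000
samples = []
for _ in range(N):
    z = random.gauss(0, 1)              # shared fluctuation, Var(z) = 1
    o1 = 3.0 + z                        # <O1> = 3
    o2 = 5.0 + z + random.gauss(0, 1)   # <O2> = 5, plus independent noise
    samples.append((o1, o2))

mean1 = sum(o1 for o1, _ in samples) / N
mean2 = sum(o2 for _, o2 in samples) / N
two_point = sum(o1 * o2 for o1, o2 in samples) / N
connected = two_point - mean1 * mean2   # the subtraction in (3.69)
print(abs(connected - 1.0) < 0.05)      # True: only the shared fluctuation survives
```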
The two-point function also has a physical interpretation when x1 and x2 are not spacelike separated:
it tells us about the linear response of the theory to an external source. Indeed let’s say we have a field
theory with Hamiltonian H, and then we turn on a position-dependent source J(x) for a field O1 (x) such
that the Schrödinger picture Hamiltonian becomes
$$H(t) = H_0 + V(t), \tag{3.70}$$
with
$$V(t) := \lambda\int d^{d-1}x\,J(t,\vec x)\,O_1(\vec x). \tag{3.71}$$
Here λ is a parameter controlling the strength of the source that we will take to be small. The question we
will ask is the following: assuming that J goes to zero at early times, if we start in the ground state of H0
at early times, what is the expectation value of a field O2 as a function of space and time? We can answer
this question using time-dependent perturbation theory. Indeed including this interaction we have
\begin{align*}
\langle\Omega|O_2(t_2,\vec x_2)|\Omega\rangle &= \langle\Omega|\left(T e^{-i\int_{-\infty}^{t_2}dt'\,H(t')}\right)^\dagger O_2(\vec x_2)\,T e^{-i\int_{-\infty}^{t_2}dt'\,H(t')}|\Omega\rangle\\
&= \langle\Omega|U_I(t_2)^\dagger\,e^{iH_0t_2}\,O_2(\vec x_2)\,e^{-iH_0t_2}\,U_I(t_2)|\Omega\rangle, \tag{3.72}
\end{align*}
where
$$U_I(t) = e^{iH_0t}\,T e^{-i\int_{-\infty}^{t}dt'\,H(t')} = T e^{-i\int_{-\infty}^{t}dt'\,e^{iH_0t'}V(t')e^{-iH_0t'}} \tag{3.73}$$
is the interaction picture time-evolution operator (you can check that these two expressions are equivalent by showing they have the same time derivative and obey the same initial condition at $t=-\infty$). The letter $T$ here is the time-ordering symbol; it means that earlier operators go to the right. Expanding in
$\lambda$ we have
$$U_I(t) = 1 - i\lambda\int_{-\infty}^t dt'\int d^{d-1}x'\,J(t',\vec x\,')\,e^{iH_0t'}O_1(\vec x\,')e^{-iH_0t'} + O(\lambda^2), \tag{3.74}$$
and thus to linear order in λ we have (assuming that O2 has vanishing one-point function in the unperturbed
theory)
$$\langle\Omega|O_2(t_2,\vec x_2)|\Omega\rangle = -i\lambda\int_{-\infty}^{t_2}dt'\int d^{d-1}x'\,J(t',\vec x\,')\,\langle[O_2(t_2,\vec x_2),O_1(t',\vec x\,')]\rangle_0. \tag{3.75}$$
Here $\langle\,\cdot\,\rangle_0$ indicates the vacuum expectation value of Heisenberg operators in the unperturbed theory. In
particular if we take J to be a delta function localized at (t1 , ⃗x1 ), then we have
⟨Ω|O2 (t2 , ⃗x2 )|Ω⟩ = −iλΘ(t2 − t1 )⟨[O2 (t2 , ⃗x2 ), O1 (t1 , ⃗x1 )]⟩0 . (3.76)
Thus we see that the response of a quantum field theory to a local perturbation is determined by a difference
of two-point functions at arbitrary separation. The Θ function arises because if t1 > t2 then the source is
outside of the region of $t'$ integration so the $\delta$-function never contributes. This response vanishes unless $x_2$
is in the future lightcone of x1 , as it had better, which by the way is another illustration of the fact that by
introducing fields we have solved the causality problems of relativistic particle quantum mechanics.
A simple example of an application of this calculation is the following: we can create a source for the
scalar field theory describing liquid helium-4 by firing a high-energy neutron at a bubble of liquid helium,
and then (3.76) describes how the local density of helium atoms in the bubble responds. In the homework
you will play with this and see how the response depends on whether the sample has two or three spatial
dimensions.
Higher-point correlation functions are also interesting. At spacelike separation they quantify conditional
fluctuations such as knowing how likely we are to see correlation between two operators given that we
measured a third to have some value, while at timelike separation they give more information about how
the theory responds to perturbations. We will also see later that for quantum field theories with particles at
low energies, higher-point correlation functions can be used to extract the scattering matrix.
Returning to our free scalar field, the one-point function vanishes,
$$\langle\Phi(x)\rangle = 0, \tag{3.77}$$
since we can view the $a_{\vec p}$ in $\Phi$ as annihilating $|\Omega\rangle$ and the $a^\dagger_{\vec p}$ as annihilating $\langle\Omega|$.
The two-point function
G(x2 , x1 ) := ⟨Φ(x2 )Φ(x1 )⟩ (3.78)
is more interesting. Note that the two-point function we have defined has Φ(x2 ) to the left of Φ(x1 ) regardless
of the time-ordering of x1 and x2 : it is to be distinguished from the Feynman propagator, which is defined
to include a time-ordering symbol
GF (x2 , x1 ) := ⟨T Φ(x2 )Φ(x1 )⟩. (3.79)
In quantum field theory correlation functions without time ordering such as (3.78) are sometimes called
Wightman functions to distinguish them from correlation functions that are time-ordered. We can easily
write the Feynman propagator in terms of the Wightman two-point function:
$$G_F(x_2,x_1) = \Theta(t_2-t_1)\,G(x_2,x_1) + \Theta(t_1-t_2)\,G(x_1,x_2). \tag{3.80}$$
It is harder to go the other way (you need to do some nontrivial analytic continuation), so in quantum field
theory it is usually a good idea to view the Wightman functions as the fundamental objects of the theory. In
particular we emphasize that the linear response (3.76) involves two-point functions with both time orderings
and thus requires the Wightman two-point function. The Feynman propagator is important in perturbative
calculations, as we will see in later lectures.
In the free scalar field theory we can compute the (Wightman) two-point function:
\begin{align*}
G(x_2,x_1) &= \int\frac{d^{d-1}p_1}{(2\pi)^{d-1}}\int\frac{d^{d-1}p_2}{(2\pi)^{d-1}}\frac{1}{2\sqrt{\omega_{\vec p_1}\omega_{\vec p_2}}}\,e^{ip_2\cdot x_2 - ip_1\cdot x_1}\,\langle\Omega|a_{\vec p_2}a^\dagger_{\vec p_1}|\Omega\rangle\\
&= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\omega_{\vec p}}\,e^{ip\cdot(x_2-x_1)}, \tag{3.81}
\end{align*}
where in the first line we observed that the only non-vanishing term involves an annihilation operator to the
left of a creation operator and in the second line we used the algebra (3.31). We won’t spend valuable class
time doing this integral since we will later have a better way to compute the same quantity using the path
integral,21 but the result is
$$G(x_2,x_1) = \frac{1}{(2\pi)^{d/2}}\left(\frac{m}{\sqrt{(x_2-x_1)^2+is_{21}\epsilon}}\right)^{\frac{d-2}{2}}K_{\frac{d-2}{2}}\!\left(m\sqrt{(x_2-x_1)^2+is_{21}\epsilon}\right), \tag{3.82}$$
where s21 is equal to one if t2 − t1 is positive and minus one if it is negative and ϵ is a small positive quantity
whose purpose is to define the branch of the square root when (x2 − x1 )2 < 0 but should otherwise be taken
to zero. This is an example of what is called an “iϵ prescription”, which we will see again and again. Kα (x)
is a modified Bessel function of the second kind: the only things worth knowing about it at the moment are
its asymptotics:22
$$K_\alpha(x) \approx \begin{cases}\dfrac{\Gamma(\alpha)\,2^{\alpha-1}}{x^\alpha} & 0<|x|\ll 1\\[2mm] \sqrt{\dfrac{\pi}{2x}}\,e^{-x} & x\gg 1\end{cases}. \tag{3.84}$$
In particular at general separations which are small compared to the inverse mass we have
$$G(x_2,x_1) \approx \frac{\Gamma\!\left(\frac{d}{2}-1\right)}{4\pi^{d/2}}\,\frac{1}{\left[(x_2-x_1)^2+is_{21}\epsilon\right]^{\frac{d-2}{2}}}, \tag{3.85}$$
21 If you want to try it, I recommend first considering the d = 2 case. You can deform the p contour to wrap around one of
the cuts on the imaginary p axis as in figure two from lecture one, which leads to one of the standard integral representations
of K0 (m|x2 − x1 |). In the general case you need to first do an angular integral, after which you can do the same manipulation.
22 Another thing that is perhaps worth knowing is that it simplifies when $\alpha$ is a half-integer, which here means that $d$ is odd. For example for $d=3$ we simply have $K_{1/2}(x) = \sqrt{\frac{\pi}{2x}}\,e^{-x}$ and thus
$$G(x_2,x_1) = \frac{e^{-m\sqrt{(x_2-x_1)^2+is_{21}\epsilon}}}{4\pi\sqrt{(x_2-x_1)^2+is_{21}\epsilon}}. \tag{3.83}$$
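If you want to sanity-check the asymptotics (3.84) numerically, here is an illustrative sketch. It uses the standard integral representation $K_\alpha(x)=\int_0^\infty e^{-x\cosh t}\cosh(\alpha t)\,dt$, which is not derived in these notes, and function names of our own choosing.

```python
import math

def bessel_k(alpha, x, tmax=30.0, n=40000):
    """K_alpha(x) from K_alpha(x) = int_0^inf e^{-x cosh t} cosh(alpha t) dt,
    evaluated by the midpoint rule on [0, tmax]."""
    dt = tmax / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * dt
        total += math.exp(-x * math.cosh(t)) * math.cosh(alpha * t) * dt
    return total

# the half-integer case is exact: K_{1/2}(x) = sqrt(pi/(2x)) e^{-x}
x = 2.0
exact = math.sqrt(math.pi / (2 * x)) * math.exp(-x)
print(abs(bessel_k(0.5, x) - exact) < 1e-6)  # True

# large-x behaviour of (3.84): K_1(x) -> sqrt(pi/(2x)) e^{-x} up to 1/x corrections
x = 20.0
asym = math.sqrt(math.pi / (2 * x)) * math.exp(-x)
print(abs(bessel_k(1.0, x) / asym - 1) < 0.05)  # True
```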
Figure 6: Closing the contour in the complex p0 plane for the integral (3.90).
while at spacelike separations which are large compared to the inverse mass we have
$$G(x_2,x_1) \approx \frac{m^{d-2}}{2^{\frac{d+1}{2}}\,\pi^{\frac{d-1}{2}}\,(m|x_2-x_1|)^{\frac{d-1}{2}}}\,e^{-m|x_2-x_1|}. \tag{3.86}$$
There is quite a bit of physics in these expressions; here are some key points:
(1) The two-point function is nonzero at spacelike separation, so independent fields are correlated with
each other in the ground state. Correlation between independent (i.e. commuting) degrees of freedom
in a pure quantum state is called entanglement, so what we are seeing is that in quantum field
theory the vacuum is a highly-entangled state. Indeed since the two-point function diverges in the
limit x2 → x1 , the amount of entanglement is infinite!
(2) In the massless limit (3.85) becomes exact so the correlation function (for d > 2) decays as an inverse
power of the distance between the points. You will study the d = 2 case in the homework.
(3) In the massive case the correlation decays exponentially with distance at spacelike separations which
are large compared to m−1 .
This discussion illustrates something of a general maxim about correlation functions in quantum field theory:
the physics is more clear in position space, but the formulas are simpler in momentum space. More pithily,
in quantum field theory you should think in position space but compute in momentum space.
The short-distance divergence of the two-point function also has an important mathematical consequence:
it shows that the field Φ(x) is not actually a good quantum operator, since acting on the vacuum (or indeed
any other state of finite energy) we get a state of infinite norm. In order to get something which is a good
operator, we need to smear Φ(x) against a smooth function of compact support:
$$\Phi_f = \int d^dx\,f(x)\,\Phi(x). \tag{3.87}$$
This statement is sometimes formalized by saying that in quantum field theory the fields themselves are
operator-valued distributions. We will show in the next lecture that this smearing indeed produces a
well-defined operator.
You may have found it annoying that our expression (3.81) for the two-point function involves integrals
over only the spatial components of momentum; wouldn’t it be nice to have a more manifestly covariant
expression? Of course we did already show that the measure $\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\omega_{\vec p}}$ is Lorentz-invariant, but there our demonstration involved the non-analytic objects $\Theta(p^0)$ and $\delta(p^2+m^2)$. It turns out to be a very good idea
to come up with an expression for the two-point function that is manifestly both covariant and analytic in
momentum. We can do this by showing that
$$\frac{1}{2\omega_{\vec p}}\,e^{-i\omega_{\vec p}(t_2-t_1)} = \lim_{\epsilon\to 0^+}\int_{-\infty}^\infty\frac{dp^0}{2\pi}\,\frac{-is_{21}}{p^2+m^2-i\epsilon s_{21}}\,e^{-ip^0(t_2-t_1)}, \tag{3.88}$$
where s21 again is one if t2 − t1 is positive and minus one if it is negative, since from (3.81) we then have
$$G(x_2,x_1) = \lim_{\epsilon\to 0^+}\int\frac{d^dp}{(2\pi)^d}\,\frac{-is_{21}}{p^2+m^2-i\epsilon s_{21}}\,e^{ip\cdot(x_2-x_1)}. \tag{3.89}$$
The appearance of the vanishingly small quantity ϵ > 0 here is another example of an iϵ prescription. To
demonstrate (3.88), it is convenient to rewrite the integral on the right-hand side as
$$-s_{21}\int_{-\infty}^\infty\frac{dp^0}{2\pi i}\,\frac{e^{-ip^0(t_2-t_1)}}{\left(p^0-(\omega_{\vec p}-i\epsilon s_{21})\right)\left(p^0+(\omega_{\vec p}-i\epsilon s_{21})\right)}, \tag{3.90}$$
where we have used that
$$\left(p^0-(\omega_{\vec p}-i\epsilon s_{21})\right)\left(p^0+(\omega_{\vec p}-i\epsilon s_{21})\right) = -\left(p^2+m^2-2i\omega_{\vec p}\epsilon s_{21}\right) + O(\epsilon^2) \tag{3.91}$$
(3.90) can be computed using the residue theorem. Indeed recall that if f (z) is an analytic function in a
region R containing a point z0 , then we have23
$$\frac{1}{2\pi i}\oint_{\partial R}\frac{f(z)}{z-z_0}\,dz = f(z_0), \tag{3.92}$$
where the integral is taken in the counter-clockwise direction about $z_0$. Said differently, the function $\frac{f(z)}{z-z_0}$ has a simple pole at $z=z_0$ and the integral around this pole extracts the residue $f(z_0)$. The integrand of
(3.90) has two simple poles, at
$$p^0 = \pm\left(\omega_{\vec p} - i\epsilon s_{21}\right). \tag{3.93}$$
We can evaluate the integral using the residue theorem by closing the integration contour along the real axis
at infinity in the lower or upper half plane depending on whether s21 is positive or negative respectively (see
figure 6). Either way the integral picks up the residue of the pole at p0 = ωp⃗ − iϵs21 , but there is a sign
difference since in the former case the integral is clockwise while in the latter case it is counter-clockwise.
We therefore have
$$\int_{-\infty}^\infty\frac{dp^0}{2\pi i}\,\frac{e^{-ip^0(t_2-t_1)}}{\left(p^0-(\omega_{\vec p}-i\epsilon s_{21})\right)\left(p^0+(\omega_{\vec p}-i\epsilon s_{21})\right)} = -s_{21}\,\frac{1}{2\omega_{\vec p}}\,e^{-i\omega_{\vec p}(t_2-t_1)}, \tag{3.94}$$
from which (3.88) follows.
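Since this whole calculation hinges on (3.92), here is a quick numerical check of the residue theorem itself (an illustrative sketch; the names are ours): integrating $f(z)/(z-z_0)$ around a circle recovers $f(z_0)$.

```python
import cmath, math

def residue_integral(f, z0, radius=1.0, n=2000):
    """(1/(2 pi i)) * contour integral of f(z)/(z - z0) around a circle at z0."""
    total = 0j
    for j in range(n):
        theta = 2 * math.pi * j / n
        z = z0 + radius * cmath.exp(1j * theta)
        dz = 1j * radius * cmath.exp(1j * theta) * (2 * math.pi / n)
        total += f(z) / (z - z0) * dz
    return total / (2j * math.pi)

z0 = 0.3 + 0.2j
value = residue_integral(cmath.exp, z0)
print(abs(value - cmath.exp(z0)) < 1e-10)  # True: the pole extracts f(z0)
```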
the other way since the divergence theorem requires continuous partial derivatives and showing that an analytic function has
continuous partial derivatives is usually done using the residue theorem.
Turning now to the Feynman propagator, combining the two time orderings we can write
\begin{align*}
G_F(x_2,x_1) &= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\omega_{\vec p}}\,e^{is_{21}\,p\cdot(x_2-x_1)}\\
&= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\omega_{\vec p}}\,e^{i\vec p\cdot(\vec x_2-\vec x_1)}\,e^{-i\omega_{\vec p}s_{21}(t_2-t_1)}, \tag{3.95}
\end{align*}
where in the first line there is an s21 in the exponent because depending on the time-ordering which field
contributes an ap⃗ and which contributes an a†p⃗ switches and in going from the first line to the second line we
flipped the direction of the integral over p⃗. The quantity s21 (t2 − t1 ) is always positive, so in this integral we
can use the identity (3.88) replacing (t2 − t1 ) → s21 (t2 − t1 ) and setting s21 to one on the right hand side.
Flipping the direction of the p0 integral we get
$$G_F(x_2,x_1) = \lim_{\epsilon\to 0^+}\int\frac{d^dp}{(2\pi)^d}\,\frac{-i}{p^2+m^2-i\epsilon}\,e^{ip\cdot(x_2-x_1)}, \tag{3.96}$$
which is a bit simpler than the expression (3.89) for the two-point function. In particular the Feynman
propagator has the nice property that it is a Green’s function for the Klein-Gordon operator:
\begin{align*}
(\partial_2^2 - m^2)\,G_F(x_2,x_1) &= \lim_{\epsilon\to 0}\int\frac{d^dp}{(2\pi)^d}\,\frac{i(p^2+m^2)}{p^2+m^2-i\epsilon}\,e^{ip\cdot(x_2-x_1)}\\
&= i\int\frac{d^dp}{(2\pi)^d}\,e^{ip\cdot(x_2-x_1)}\\
&= i\,\delta^d(x_2-x_1). \tag{3.97}
\end{align*}
This would not have worked for the Wightman function since the derivative acting on s21 would have
generated additional terms.
You may be wondering why we stopped with two-point functions: what about three-point functions,
four-point functions, and so on? In free field theory the answer is simple: these end up either vanishing or
just being combinations of two-point functions. Indeed the $n$-point function
$$\langle\Phi(x_n)\cdots\Phi(x_1)\rangle \tag{3.98}$$
vanishes when $n$ is odd since there are no terms with an equal number of creation and annihilation operators.
When n is even we simply pair them up to get a sum of products of two-point functions. For example to
compute the four-point function we introduce annihilation and creation parts of Φ(x) as
$$\Phi_-(x) = \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\,e^{ip\cdot x}\,a_{\vec p}, \qquad \Phi_+(x) = \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\,e^{-ip\cdot x}\,a^\dagger_{\vec p}, \tag{3.99}$$
observe that
$$[\Phi_-(x),\Phi_+(y)] = \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\omega_{\vec p}}\,e^{ip\cdot(x-y)} = G(x,y), \tag{3.100}$$
and then compute
\begin{align*}
\langle\Phi(x_4)\Phi(x_3)\Phi(x_2)\Phi(x_1)\rangle &= \left\langle\Phi_-(x_4)\left(\Phi_+(x_3)+\Phi_-(x_3)\right)\left(\Phi_+(x_2)+\Phi_-(x_2)\right)\Phi_+(x_1)\right\rangle\\
&= \left\langle\left([\Phi_-(x_4),\Phi_+(x_3)]+\Phi_-(x_4)\Phi_-(x_3)\right)\left([\Phi_-(x_2),\Phi_+(x_1)]+\Phi_+(x_2)\Phi_+(x_1)\right)\right\rangle\\
&= G(x_4,x_3)G(x_2,x_1) + \left\langle\Phi_-(x_4)\Phi_-(x_3)\Phi_+(x_2)\Phi_+(x_1)\right\rangle\\
&= G(x_4,x_3)G(x_2,x_1) + \left\langle\Phi_-(x_4)\left([\Phi_-(x_3),\Phi_+(x_2)]+\Phi_+(x_2)\Phi_-(x_3)\right)\Phi_+(x_1)\right\rangle\\
&= G(x_4,x_3)G(x_2,x_1) + G(x_3,x_2)G(x_4,x_1) + G(x_4,x_2)G(x_3,x_1). \tag{3.101}
\end{align*}
This pattern continues to higher orders: the $n$-point function with even $n$ is given by the sum over all pairings of the $n$ points of the product of two-point functions of the pairs, with the order of the operators in each pair given
by their order in the full n-point function. The same is true for the time-ordered n-point function, but with
the two-point function replaced by the Feynman propagator.
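The combinatorics of this statement is easy to automate (an illustrative sketch with our own function name): enumerating all pairings of $n$ points reproduces the three terms of (3.101) for $n=4$, and $(n-1)!!$ terms in general.

```python
def pairings(points):
    """All ways of pairing up an even-length list of points (Wick contractions)."""
    if not points:
        return [[]]
    first, rest = points[0], points[1:]
    result = []
    for i, partner in enumerate(rest):
        remaining = rest[:i] + rest[i + 1:]
        for sub in pairings(remaining):
            result.append([(first, partner)] + sub)
    return result

print(len(pairings([1, 2, 3, 4])))        # 3, matching the three terms in (3.101)
print(len(pairings([1, 2, 3, 4, 5, 6])))  # 15 = 5!!
```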
3.8 Homework
1. Evaluate the other two terms in our expression (3.35) for the free scalar Hamiltonian, confirming that
this leads to (3.37).
2. Find the vacuum wave functional for a free scalar field. Hint: the answer has the form
$$\Psi[\phi] \propto \exp\left(-\frac{1}{2}\int d^{d-1}x\,d^{d-1}y\,K(\vec x-\vec y)\,\phi(\vec x)\phi(\vec y)\right),$$
so you just need to find the function K(⃗x − ⃗y ). The condition you need to satisfy is that this wave
functional is annihilated by ap⃗ for all momenta p⃗, and you can use the expression (3.30) for ap⃗ and also
the definition (3.9) of the canonical momenta acting on wave functionals. Your life will be easiest if you
transform K and ϕ to momentum space, but extra credit if you can give a position-space expression
for K in d = 4 (Bessel functions are involved).
3. The response of superfluid liquid helium to a localized perturbation with source O1 = ϕ(t1 , ⃗x1 ) is given
by equation (3.76), with the two Wightman functions appearing in the commutator given by (3.85).
Taking the perturbation at t1 = 0 and ⃗x1 = 0 and taking the measured operator O2 to be ϕ(t, ⃗x), plot
the response ⟨Ω|ϕ(t, ⃗x)|Ω⟩ as a function of t and the spatial radius r = |x| for d = 3 and d = 4. Is there
a qualitative difference between two cases?
4. Starting from the expression (3.64) for a complex scalar field and the canonical commutators (3.66),
calculate the commutators of the operators $a_{\vec p}, b_{\vec p}, a^\dagger_{\vec p}, b^\dagger_{\vec p}$. Derive an expression for the Hamiltonian $H$
in terms of these creation/annihilation operators, and also give an expression for the symmetry charge
Q for the symmetry ϕ′ = eiθ ϕ that you derived in the last homework. What are the charges of the
particles in this theory?
5. Expand the massless two-point function (3.85) in the limit d → 2. You will find a series in (d − 2)
that begins with a divergence that goes like 1/(d − 2) followed by a term that is finite and nonzero as
d → 2. What is this correction term? Do you see anything strange about it?
6. Extra credit: evaluate the momentum integral (3.81) for the two-point function assuming that x1 and
x2 are spacelike separated in the cases d = 2, d = 3, and d = 4. You will likely need to consult some
reference on Bessel integrals, e.g. Gradshteyn and Ryzhik or Abramowitz and Stegun, both of which
are available as pdfs online. If the experience leaves you enthusiastic you can try the case of timelike
separation as well; this is actually a bit easier since you can go to a frame where ⃗x2 − ⃗x1 = 0.
4 Algebras and symmetries in quantum field theory
In this lecture we return to formalism, introducing a general algebraic language that we can use to precisely
define the idea of symmetry in quantum field theory. We will learn about the difference between internal
symmetries and spacetime symmetries, learn more about global structure of the Lorentz group, and study
how correlation functions in quantum field theory are constrained by global symmetries.
Causality: If R1 and R2 are spacelike separated, then A[R1 ] ⊂ A′ [R2 ]. Here the symbol A′ [R] indi-
cates the commutant of A[R], meaning the set of (bounded) operators that commute (or anticommute
in the case of fermions) with everything in A[R].
Haag Duality: For any region $R$ we have $A'[R] = A[\bar R]$, where $\bar R$ is the interior of the spatial complement of $R$ in the time slice it lives in.
Nesting, also sometimes called “isotony”, formalizes the idea that you cannot make more operators by restricting which fields you can use. Causality is a consequence of the (anti)commutativity of fields at spacelike separation, and Haag duality expresses the idea that the algebra is “complete” in the sense that A[R] contains everything you can build out of the fields.25
Conceptually these axioms are all we will need from the algebraic approach to field theory, but there are
some mathematical subtleties in making the definition of A[R] precise which are worth discussing. Don’t worry if the rest of this section goes by too fast; the goal is to make you aware of these things rather than to turn you into a master practitioner. The first problem is that, as we saw in the last lecture, the fields
themselves are not actually genuine operators. For example if we act with a free scalar field on the vacuum
we get a state of infinite norm:
⟨Ω|Φ(x)Φ(x)|Ω⟩ = G(x, x) = ∞. (4.1)
To get a good operator we need to smear against a smooth (meaning infinitely-differentiable) function
$f:\mathbb{R}^d\to\mathbb{R}$ of compact support:
$$\Phi_f = \int d^dx\,f(x)\,\Phi(x). \tag{4.2}$$
24 We do this to avoid needing to discuss quantization on curved slices. More generally R can be any open achronal set.
25 Haag duality should not be confused with the “duality” mentioned in the previous paragraph, whereby the same quantum
field theory can have two seemingly different presentations. Unfortunately both usages are completely standard.
To see that this makes the norm finite, we can first note that we have
\begin{align*}
\Phi_f &= \int d^dx\,f(x)\int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\left[e^{ip\cdot x}\,a_{\vec p} + e^{-ip\cdot x}\,a^\dagger_{\vec p}\right]\\
&= \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{\sqrt{2\omega_{\vec p}}}\left[\tilde f(\omega_{\vec p},\vec p)^*\,a_{\vec p} + \tilde f(\omega_{\vec p},\vec p)\,a^\dagger_{\vec p}\right], \tag{4.3}
\end{align*}
where
$$\tilde f(k) = \int d^dx\,e^{-ik\cdot x}\,f(x) \tag{4.4}$$
is the (d-dimensional) Fourier transform of f . It is useful to recall two facts about Fourier transforms:
If $f:\mathbb{R}^d\to\mathbb{R}$ is a smooth function that is bounded in absolute value by $\frac{C}{1+|x|^{d+1}}$ for some $C>0$ (here $|x|$ is the Euclidean length on $\mathbb{R}^d$), and moreover which has the property that when acted on by any finite number of partial derivatives it continues to obey this bound (possibly with different $C$ for different sets of derivatives), then the Fourier transform $\tilde f(k)$ exists and decays faster than any power at large $|k|$. The proof of this is fairly simple: by differentiating under the integral sign and integrating by parts we have
\begin{align*}
k_{\mu_1}\cdots k_{\mu_m}\,\tilde f(k) &= \int d^dx\,k_{\mu_1}\cdots k_{\mu_m}\,e^{-ik\cdot x}f(x)\\
&= i^m\int d^dx\,\partial_{\mu_1}\cdots\partial_{\mu_m}\!\left(e^{-ik\cdot x}\right)f(x)\\
&= (-i)^m\int d^dx\,e^{-ik\cdot x}\,\partial_{\mu_1}\cdots\partial_{\mu_m}f(x), \tag{4.5}
\end{align*}
and the third line vanishes at large $|k|$ by the Riemann-Lebesgue lemma (see Wikipedia), since by assumption $\partial_{\mu_1}\cdots\partial_{\mu_m}f(x)$ is integrable, being smooth and bounded in absolute value by $\frac{C}{1+|x|^{d+1}}$.
If $f:\mathbb{R}^d\to\mathbb{R}$ is a continuous function of compact support then its Fourier transform $\tilde f(k)$ is an entire function, meaning that it is analytic for arbitrary complex $k$. This is because we can simply define the derivative of the Fourier transform by
$$\frac{\partial\tilde f}{\partial k_\mu} = \int d^dx\,(-ix^\mu)\,e^{-ik\cdot x}f(x), \tag{4.6}$$
which converges for any complex $k$ since $f$ has compact support.
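The integration-by-parts identity in (4.5) is easy to check numerically for a single derivative. This is an illustrative sketch (the Gaussian test function and all names are our own choices; a Gaussian satisfies the decay assumptions above): $k\,\tilde f(k)$ should equal $(-i)$ times the Fourier transform of $f'$.

```python
import math, cmath

def fourier(g, k, a=10.0, n=8000):
    """\int dx e^{-ikx} g(x), truncated to [-a, a] and done by the midpoint rule."""
    dx = 2 * a / n
    total = 0j
    for i in range(n):
        x = -a + (i + 0.5) * dx
        total += cmath.exp(-1j * k * x) * g(x) * dx
    return total

f = lambda x: math.exp(-x * x)             # smooth and rapidly decaying
df = lambda x: -2 * x * math.exp(-x * x)   # its derivative

k = 2.0
lhs = k * fourier(f, k)        # one factor of k ...
rhs = -1j * fourier(df, k)     # ... trades for one derivative, as in (4.5)
print(abs(lhs - rhs) < 1e-8)   # True
```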
Applying these facts to (4.3), we then have
$$\langle\Omega|\Phi_f\Phi_f|\Omega\rangle = \int\frac{d^{d-1}p}{(2\pi)^{d-1}}\frac{1}{2\omega_{\vec p}}\,\big|\tilde f(\omega_{\vec p},\vec p)\big|^2. \tag{4.7}$$
This integral is now convergent at large |p| due to the fast decay of fe, and for m > 0 it is also convergent
at $p=0$ since $\omega_{\vec p}$ is finite there and $\tilde f$ is analytic. When $m=0$ there is an apparent singularity at $p=0$ due to the $\omega_{\vec p}$ in the denominator, but as long as $d>2$ this is compensated by the volume measure $\frac{d^{d-1}p}{(2\pi)^{d-1}}$
26 Another useful result which is intermediate between these two is that an integrable function which is analytic in a strip of
finite thickness about the real axis has a Fourier transform which decays exponentially at large k.
Figure 7: Domains of dependence for spatial regions in Minkowski space. The regions R1 and R2 are blue,
while their domains of dependence are the green diamond-shaped spacetime regions.
so the integral is still finite. When $d=2$ there is a logarithmic divergence at $p=0$ in the massless case, which shows that there is indeed an infrared pathology for a massless scalar in $d=2$ that cannot be removed by smearing.27 It is important to emphasize that the introduction of smeared fields $\Phi_f$ is not purely a mathematical convenience; no real detector has perfect spatial resolution, so this smearing is really physical: the function $f$ describes the spacetime profile of the detector which couples to $\Phi(x)$.
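The $d=2$ massless divergence can be made vivid numerically. This is an illustrative sketch with a Gaussian stand-in for $|\tilde f|^2$ and names of our own choosing: the small-$p$ part of the integral in (4.7) contributes the same amount, $\frac12\ln 2$, from every octave of momentum, so it diverges logarithmically as the infrared cutoff is removed.

```python
import math

def small_p_contribution(a, b, n=20000):
    """\int_a^b dp e^{-p^2}/(2p), integrated on a logarithmic grid in p."""
    du = (math.log(b) - math.log(a)) / n
    total = 0.0
    for i in range(n):
        p = math.exp(math.log(a) + (i + 0.5) * du)
        total += math.exp(-p * p) / (2 * p) * p * du  # dp = p du on the log grid
    return total

octave_high = small_p_contribution(1e-4, 2e-4)
octave_low = small_p_contribution(1e-8, 2e-8)
print(abs(octave_high - 0.5 * math.log(2)) < 1e-4)  # True
print(abs(octave_low - octave_high) < 1e-6)  # True: every octave contributes alike
```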
Which smeared operators $\Phi_f$ can be associated to which spatial regions $R$? The answer to this question
is not completely obvious, since in order to get a good operator we need the support of f to have nontrivial
extent in time. On the other hand we should expect that in a relativistic field theory the operators at a
location x which lies to the future or past of a timeslice should be expressible solely in terms of the fields on
that timeslice which are not spacelike separated from x. Given a spatial region R we therefore introduce the
idea of its domain of dependence D[R], which is the set of spacetime points x with the property that every
timelike curve which intersects $x$ also intersects $R$. In Minkowski space this is equivalent to the set of points which are spacelike-separated from all points in $\bar R$, see figure 7 for an illustration.28 Moreover this definition
has the property that if R1 ⊂ R2 then D[R1] ⊂ D[R2]. Operators ϕf with the support of f contained in D[R] will thus obey nesting and causality, and are therefore natural candidates for elements of A[R].
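In 1+1 dimensions the domain of dependence is simple enough to compute by brute force. The sketch below (an illustration, not from the notes; the criterion that (t, x) ∈ D[R] iff the interval [x − |t|, x + |t|] lies inside R is specific to 1+1d Minkowski space) also checks the nesting property:

```python
# Illustration: domains of dependence in 1+1d Minkowski space.  A point (t, x)
# is in D[R] iff every timelike curve through it crosses R, which in 1+1d
# means the whole interval [x - |t|, x + |t|] fits inside R.
import numpy as np

def in_domain_of_dependence(t, x, R):
    """R is a list of open intervals (a, b) on the t = 0 slice."""
    lo, hi = x - abs(t), x + abs(t)
    # the dependence interval must fit inside a single component of R
    return any(a < lo and hi < b for (a, b) in R)

R1 = [(0.0, 1.0)]    # small region
R2 = [(-1.0, 2.0)]   # larger region containing R1
ts = np.linspace(-1.5, 1.5, 61)
xs = np.linspace(-2.0, 3.0, 101)
D1 = {(t, x) for t in ts for x in xs if in_domain_of_dependence(t, x, R1)}
D2 = {(t, x) for t in ts for x in xs if in_domain_of_dependence(t, x, R2)}

# D[R1] is the expected diamond with tip at t = ±0.5, x = 0.5
assert in_domain_of_dependence(0.49, 0.5, R1)
assert not in_domain_of_dependence(0.51, 0.5, R1)
# nesting: R1 ⊂ R2 implies D[R1] ⊂ D[R2]
assert D1 <= D2
```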
There is one further issue however that needs to be addressed: although the operator Φf is better-
behaved than Φ(x), it still can in general have arbitrary large eigenvalues. An operator whose eigenvalues
are unbounded can have rather strong restrictions on its domain, which makes it difficult to include in
an algebra since products of unbounded operators are complicated to handle. For example in the simple
harmonic oscillator the state
|ψ⟩ = (√6/π) Σ_{n=1}^∞ (1/n) |n⟩ (4.8)
has unit norm, but if we act on this state with the Hamiltonian H = Σ_n ω(n + 1/2)|n⟩⟨n| we get a state
of infinite norm and the expectation value of H in the state |ψ⟩ is also infinity. This kind of divergence is
usually viewed as unphysical however, as given a detector of finite size we can’t actually measure an observable
with an infinite number of distinct possible outcomes. It is thus standard to restrict A[R] to only contain operators O which are bounded in the sense that there is some constant C such that √(⟨ψ|O†O|ψ⟩) ≤ C for all normalizable states |ψ⟩. Given smeared fields Φf it is not difficult to create bounded operators, for
27 This has interesting physical consequences, with perhaps the most important being that there cannot be spontaneous
breaking of a continuous symmetry in d = 2. This statement is called the Mermin-Wagner-Coleman theorem, and we will say
more about it when we get to spontaneous symmetry breaking later in the semester.
28 Another way to motivate the definition of the domain of dependence is that it is the region in which the wave equation (or
more generally any well-behaved hyperbolic PDE) should have a unique solution given initial data specified on R. Outside of D[R] the solution will also depend on initial data specified away from R.
example if Φ(x) is hermitian then e^{iΦ_f} and 1/(1 + Φ_f²) are both bounded, and so is the spectral projection onto the eigenstates of Φf which lie between any two distinct real numbers.
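The divergence in the oscillator example above is easy to see numerically. The following sketch (an illustration, not from the notes) computes partial sums for the state (4.8) with a cutoff N on the sum:

```python
# Illustration: partial sums for |ψ⟩ = (√6/π) Σ_{n≥1} (1/n)|n⟩.  The norm
# converges to 1 (since Σ 1/n² = π²/6) while ⟨H⟩ grows like ln N, i.e.
# diverges as the cutoff N is removed.
import numpy as np

def partial_sums(N, omega=1.0):
    n = np.arange(1, N + 1, dtype=float)
    c2 = (6 / np.pi**2) / n**2            # |coefficient|² in the |n⟩ basis
    return np.sum(c2), np.sum(c2 * omega * (n + 0.5))

for N in (10**2, 10**4, 10**6):
    print(N, *partial_sums(N))

assert abs(partial_sums(10**6)[0] - 1.0) < 1e-5
# each factor of 100 in N adds roughly (6/π²) ln(100) ≈ 2.8 to the energy
assert partial_sums(10**6)[1] - partial_sums(10**4)[1] > 2.0
```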
The algebra A[R] associated to a spatial region R in quantum field theory gives an example of a famous
mathematical notion:
Definition 1 Let H be a Hilbert space. A set A of bounded operators on H is a von Neumann algebra
if the following things are true:
(1) A is closed under addition, multiplication, and hermitian conjugation.
(2) A contains the identity operator.
(3) A is closed in the weak operator topology, meaning that limits of elements of A (in the sense of matrix elements) are also in A.
Elements of A[R] are bounded for the reasons discussed in the previous paragraph, they obey (1) because if
we can measure two hermitian operators O1 and O2 then we can measure simple functions of them such as
O1 + O2 and O1 O2 + O2 O1 and i(O1 O2 − O2 O1 ), they obey (2) because we can always measure the identity
by doing nothing, and they obey (3) because a limit of measurements should be a measurement. There are
many powerful mathematical results about von Neumann algebras with interesting implications for quantum
field theory, and in particular there is a classification of von Neumann algebras under which the algebras
associated to bounded regions are “type III1”, but as this is not a class in mathematical physics we will stop here.
Here we have temporarily dispensed with Dirac notation and instead used the mathematician notation (·, ·)
for the inner product on H.29 We also require that the inverse transformation preserves amplitudes in the
same way. It is a fundamental theorem of Wigner (see section 2.A of Weinberg) that the only transformations
obeying these requirements arise from unitary or antiunitary operators on H. In other words we must either
have a linear operator U obeying
(U ψ, U ϕ) = (ψ, ϕ) (4.10)
for all ψ and ϕ such that
f (ψ) = U ψ, (4.11)
or else an antilinear operator Θ obeying
(Θψ, Θϕ) = (ϕ, ψ) (4.12)
of f on bras and 2) Dirac notation is confusing when antilinear operators are involved.
for all ψ and ϕ such that
f (ψ) = Θψ. (4.13)
A linear operator L is one for which
L(aψ + bϕ) = aLψ + bLϕ, (4.14)
while an antilinear operator A is one for which
A(aψ + bϕ) = a∗Aψ + b∗Aϕ. (4.15)
Defining the adjoints of linear and antilinear operators via
(ψ, L†ϕ) = (Lψ, ϕ)
(ψ, A†ϕ) = (ϕ, Aψ), (4.16)
we see that a linear operator U is unitary if and only if U † U = I and an antilinear operator Θ is antiunitary
if and only if Θ† Θ = I.
Although preserving instantaneous transition amplitudes is a necessary condition to have a symmetry in
quantum mechanics, it is clearly not sufficient: otherwise any unitary or antiunitary operator would be a
symmetry! There must also be a sense in which the unitaries/antiunitaries which are genuine symmetries
preserve more of the structure of the theory. In particular any symmetry of quantum theory should be
compatible with its dynamics. This requirement is easiest to formalize when the symmetry in question does
not affect the direction of time evolution: we then simply require that
e^{−iHt} U = U e^{−iHt}, (4.17)
i.e. that transforming and then evolving is the same as evolving and then transforming. Multiplying by U †
on the left, we can also write this as
U † e−iHt U = e−iHt . (4.18)
Since either of these equations must be true for all t, they are equivalent to requiring that
(−iH) U = U (−iH). (4.19)
So far we have not decided whether U is unitary or antiunitary. Let’s first try antiunitary: then (4.19) is
equivalent to
HU = −U H. (4.20)
This however leads to trouble: if ψE is an energy eigenstate of energy E, then we have
H UψE = −U HψE = −E UψE, (4.21)
and thus we see that U ψE is an energy eigenstate of energy −E. Most Hamiltonians of physical interest
do not have the property that their spectrum is symmetric about H = 0, and in particular in quantum
field theory the Hamiltonian is usually bounded from below but not from above. Thus we have learned
that any symmetry which does not affect the direction of time evolution is implemented by a unitary (NOT
antiunitary) operator on Hilbert space. Equation (4.19) then tells us that
HU = U H, (4.22)
which is the usual maxim that a symmetry in quantum mechanics is a unitary operator that commutes with
the Hamiltonian.
The set of all distinct unitaries U that commute with the Hamiltonian form what mathematicians call
a group, which is a set G whose elements can be multiplied together in such a way that the following
conditions are true:
49
Associativity: For any g1, g2, g3 ∈ G we have (g1 g2)g3 = g1(g2 g3).
Identity: There is an element e ∈ G such that eg = ge = g for all g ∈ G.
Inverses: For each g ∈ G there is an element g−1 ∈ G such that gg−1 = g−1g = e.
These axioms imply that e and g −1 are unique. They are obeyed here because if U1 and U2 commute with
the Hamiltonian then
U1 U2 H = U1 HU2 = HU1 U2 , (4.23)
and if U H = HU then
U † H = U † HU U † = U † U HU † = HU † . (4.24)
Hopefully this is not your first time seeing the definition of a group, but if it is then I assure you groups are
ubiquitous in physics so best to get started learning about them. Simple examples of groups are the real
numbers R under addition, the group U (1) of complex phases eiθ under multiplication, the group U (N ) of
N × N unitary matrices under matrix multiplication, and the group SU (N ) of N × N unitary matrices of
determinant one (again under matrix multiplication).30 A group G is called abelian if it is commutative,
meaning that g1 g2 = g2 g1 for all g1 , g2 ∈ G. R and U (1) are abelian, while U (N ) and SU (N ) are non-abelian
for N ≥ 2.
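A quick numerical illustration of these examples (a sketch with randomly generated group elements; building SU(2) matrices by exponentiating Pauli matrices is a choice made here for the example):

```python
# Illustration: U(1) is abelian while SU(2) is not.
import numpy as np

rng = np.random.default_rng(0)

def random_su2():
    # random SU(2) element exp(iH) with H a random traceless hermitian matrix
    sx = np.array([[0, 1], [1, 0]], dtype=complex)
    sy = np.array([[0, -1j], [1j, 0]])
    sz = np.array([[1, 0], [0, -1]], dtype=complex)
    a = rng.normal(size=3)
    H = a[0]*sx + a[1]*sy + a[2]*sz
    w, V = np.linalg.eigh(H)
    return V @ np.diag(np.exp(1j * w)) @ V.conj().T

U1, U2 = random_su2(), random_su2()
# group properties: unitary, det = 1, closed under multiplication
for U in (U1, U2, U1 @ U2):
    assert np.allclose(U.conj().T @ U, np.eye(2))
    assert np.isclose(np.linalg.det(U), 1.0)
# non-abelian: U1 U2 ≠ U2 U1 for generic elements
assert not np.allclose(U1 @ U2, U2 @ U1)
# U(1) phases commute
th1, th2 = 0.3, 1.1
assert np.isclose(np.exp(1j*th1)*np.exp(1j*th2), np.exp(1j*th2)*np.exp(1j*th1))
```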
What about symmetries that do affect the direction of time evolution? In relativistic theories there are
only two such symmetries: we can mix time and space translations using a Lorentz boost, or we can reverse
the direction of time using time-reversal symmetry.31 We have already seen in our free scalar theory that any
Lorentz transformation which does not reverse time can be represented by a unitary operator U (Λ) which
acts on the annihilation operators as
U(Λ) a_{p⃗} U(Λ)† = √(ω_{p⃗_Λ}/ω_{p⃗}) a_{p⃗_Λ}, (4.25)
so in particular this is true for Lorentz boosts. More generally in any quantum field theory we expect that
a Lorentz boost in the n̂ direction of rapidity η acts on the Hamiltonian as
U† H U = cosh η H + sinh η n̂ · P⃗, (4.26)
which is a consequence of the fact that the spacetime momentum P µ transforms as a spacetime vector.
Since we are (momentarily) considering the possibility that U could be antiunitary however, we should
really require that
U † (iH)U = i(cosh η H + sinh η n̂ · P⃗ ). (4.27)
If U is unitary this is equivalent to (4.26), but if it is antiunitary then we should instead require that
U† H U = −(cosh η H + sinh η n̂ · P⃗). (4.28)
This equation however is not continuous as η → 0, so this would be a rather pathological representation of Lorentz symmetry. Moreover it would again have a problem with the spectrum of the Hamiltonian: given a simultaneous eigenstate ψ_{E,p⃗} of H and P⃗, we would have
H U ψ_{E,p⃗} = −(cosh η E + sinh η n̂ · p⃗) U ψ_{E,p⃗}. (4.29)
In any quantum field theory which can be interpreted as a scattering theory of particles it is quite natural
to impose the following requirement:
30 These examples may misleadingly suggest that all groups are matrix groups, meaning groups that can be represented
with finite-dimensional matrices. This is true for groups which are topologically compact, but it isn’t true in general.
31 Time-reversal symmetry may not actually be a symmetry by itself, for example in the Standard Model of particle physics
it isn’t, but we will see in a few lectures that there is a combination of time reversal with other transformations, called CRT ,
which is always a symmetry in any relativistic quantum field theory.
Spectrum condition: In any relativistic quantum field theory we have
H ≥ n̂ · P⃗ , (4.30)
where n̂ is any unit vector and H is defined so that the energy of the ground state is zero. The operator
inequality means that H − n̂ · P⃗ is a positive semidefinite operator.
This condition should hold because each particle has energy ω = √(|p⃗|² + m²) ≥ |p⃗|, and when we add up energies there are no cancellations while when we add up momenta there can be.32 We then have (for η > 0)
cosh η E + sinh η n̂ · p⃗ ≥ (cosh η − sinh η)E = e^{−η} E ≥ 0,
and so assuming that H is unbounded from above we can again generate energy eigenstates of arbitrarily
negative energy by acting with U . From now on we will therefore assume that boosts are implemented by
unitary operators.
Finally we can consider time-reversal, which we will take to be represented by an operator ΘT . This
should act on the time evolution operator as
Θ_T† e^{−iHt} Θ_T = e^{iHt}.
If ΘT is unitary this requires Θ_T† H Θ_T = −H, which we can discard as before since it would require the spectrum of H to be symmetric about zero. We therefore see that we want ΘT to be antiunitary, since this gives the more reasonable condition
Θ_T† H Θ_T = H.
For example in the simple harmonic oscillator time reversal is implemented by an antiunitary operator which acts on the X basis as
ΘT |x⟩ = |x⟩, (4.36)
leading to
Θ†T XΘT = X
Θ†T P ΘT = −P. (4.37)
The energy eigenstates |n⟩ have real wave functions in the X basis, and thus are invariant under time-reversal:
Θ_T |n⟩ = Θ_T ∫dx ⟨x|n⟩ |x⟩ = ∫dx ⟨x|n⟩∗ Θ_T |x⟩ = ∫dx ⟨x|n⟩ |x⟩ = |n⟩. (4.38)
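The action of Θ_T is easy to visualize on a position grid, where it is just complex conjugation of the wavefunction. A minimal numerical sketch (an illustration with a finite-difference momentum operator, not from the notes):

```python
# Illustration: in the position basis Θ_T acts by complex conjugation of
# wavefunctions, so conjugating an operator matrix O gives Θ† O Θ.  The matrix
# of X is real while the matrix of P = -i d/dx is purely imaginary, which
# reproduces Θ† X Θ = X and Θ† P Θ = -P.
import numpy as np

n, L = 201, 20.0
x = np.linspace(-L/2, L/2, n)
dx = x[1] - x[0]

X = np.diag(x)
P = (-1j / (2*dx)) * (np.eye(n, k=1) - np.eye(n, k=-1))  # central difference

assert np.allclose(np.conj(X), X)    # Θ† X Θ = X
assert np.allclose(np.conj(P), -P)   # Θ† P Θ = -P
```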
antiunitary case, then we can give a simpler and more rigorous argument for the spectrum condition: it must be true so that
the Hamiltonian in any Lorentz frame is a positive operator.
Definition 2 An internal symmetry of a quantum field theory in d-dimensional Minkowski space is a
unitary operator U such that
(1) For any spatial region R the algebra A[R] is preserved by conjugation by U and U † , meaning that for
any O ∈ A[R] we have U † OU ∈ A[R] and U OU † ∈ A[R].
(2) For any spacetime point x the energy-momentum tensor Tµν(x) is invariant under conjugation by U:
U† Tµν(x) U = Tµν(x). (4.39)
The first requirement here expresses the idea that the symmetry should preserve the local algebra. The
second is a strengthening of the idea that U should commute with the Hamiltonian: it expresses the idea
of local conservation of the symmetry charge. More concretely, it says that symmetry charge cannot
leave a region of space without passing through its edges (see figure ). This is not obvious, and showing
it is a consequence of (4.39) requires more differential geometry than we are using in this class.33 We can
also motivate (4.39) in a more mundane way: a generic quantum field theory shouldn’t have more than one
energy-momentum tensor, and whatever an internal symmetry sends the energy-momentum tensor to is an
equally valid candidate for an energy-momentum tensor and therefore must be the original one. The set of
internal symmetries in a quantum field theory forms a group, as you can easily check.
There is an important further classification of internal symmetries based on what kinds of operators they
act nontrivially on. In the simplest quantum field theories all operators are built out of the local operators,
in which case any nontrivial internal symmetry U must act nontrivially on some local operator O(x). Such
internal symmetries are called global internal symmetries. An example of a global internal symmetry is the
phase rotation of a free complex scalar,
Φ(x) → e^{iθ} Φ(x),
whose symmetry group is clearly isomorphic to the group U(1). Conventionally we say that this theory
has a U (1) global symmetry. This semester we will only discuss theories where all operators are built from
local operators, so all internal symmetries are global. Next semester we will discuss gauge theories such
as quantum electrodynamics, where there can be extended operators that are not built from local operators.
The reason for this is familiar from Maxwell theory: we cannot create an electrically charged particle without
also creating an electric field sourced by it that satisfies Gauss’s law, and this electric field must extend out
to spatial infinity. Therefore there are no local operators that carry nonzero electric charge. On the other
hand there are clearly states of nonzero electric charge, such as a state with one electron in the center of
space. These are created by acting on the vacuum with extended operators that create both the electron
and its Coulomb field, and it is these extended operators which carry nonzero electric charge.34
Another important question about any internal symmetry in quantum field theory is whether or not
the ground state |Ω⟩ is invariant. If it is not, then we say that the symmetry is spontaneously broken.
Spontaneously broken global internal symmetries are very interesting in quantum field theory, for example
being essential to our understanding of magnets, superfluidity, and nuclear physics. There is also a sense
33 More formally local conservation is expressed as the requirement that we can continuously deform the slice on which U is defined without changing the operator. This is often described by saying that the symmetry operator U is a topological surface operator. In the continuous case this is a consequence of Noether’s theorem: the charge Q = ∫ d^{d−1}x J⁰(t, x⃗) can be written as Q = ∫_Σ n^µ J_µ, where Σ is the surface t = 0 and n^µ is its normal vector, and then the fact that we can continuously deform Σ without changing Q is a consequence of the divergence theorem and the current conservation equation ∂_µ J^µ = 0.
The basic idea in showing that the invariance of Tµν implies this deformability in general is to use that the stress tensor is the
functional derivative of the action with respect to the metric and that the action is invariant under arbitrary diffeomorphisms
which act on both the dynamical fields and also the background spacetime metric.
34 The distinction between gauge and global symmetry defined here is not the way this distinction is traditionally presented.
The conventional definition is that in terms of the fundamental fields a global symmetry is one which acts the same way at all
points in space while a gauge symmetry is one where the symmetry transformation can vary from point to point. This definition
is problematic however, as most of the gauge transformations defined this way are mere redundancies of description and for
discrete symmetries it isn’t clear what the difference is. The algebraic definition I’ve given here isolates the physical distinction
between the two without introducing confusing historical baggage.
in which gauge symmetries can be spontaneously broken, called the Anderson-Higgs mechanism, although
the concept is somewhat less well-defined than for global symmetries. We will have more to say about
spontaneous symmetry breaking in later lectures. If an internal global symmetry is unbroken, meaning
that the ground state is invariant, then it implies a powerful constraint on the correlation functions of the
theory. Indeed if we define
O′ (x) = U † O(x)U, (4.41)
then we must have
⟨O1′ (x1 ) . . . On′ (xn )⟩ = ⟨Ω|U † O1 (x1 )U . . . U † On (xn )U |Ω⟩ = ⟨O1 (x1 ) . . . On (xn )⟩. (4.42)
For example if we have a U (1) global symmetry, this tells us that for all θ ∈ [0, 2π] we have
e^{i(q1 + ... + qn)θ} ⟨O1(x1) . . . On(xn)⟩ = ⟨O1(x1) . . . On(xn)⟩, (4.43)
which shows that this correlation function obeys the selection rule that it must vanish unless the sum of the
operator charges vanishes. For example this explains why you will find that ⟨Φ(x)Φ(y)⟩ = ⟨Φ† (x)Φ† (y)⟩ = 0
in the free complex scalar theory.
Returning to Poincaré symmetry, the full set of Poincaré transformations forms a group called the Poincaré
group and it is useful to now make a few general comments about its global structure. Recall that this is
defined to be the set of coordinate transformations
x′^µ = Λ^µ_ν x^ν + a^µ,
where Λ preserves the Minkowski metric,
η_{µν} Λ^µ_α Λ^ν_β = η_{αβ}. (4.46)
Taking the determinant of (4.46) shows that (det Λ)² = 1, and splitting the time and space terms of the 00 component of (4.46) we see that
(Λ⁰₀)² = 1 + Σ_i (Λ^i₀)² (4.48)
and thus
(Λ⁰₀)² ≥ 1. (4.49)
We therefore can split up the Lorentz group into four connected components labeled by the signs of det Λ
and Λ00 . The simplest of these components is the one containing the identity transformation, which is called
the identity component and denoted SO+ (d − 1, 1) (here “S” indicates unit determinant and “+” indicates
Λ00 ≥ 1). Any element of the other components can be written as an element of SO+ (d − 1, 1) multiplied
by one of the following three Lorentz transformations:
R = diag(1, −1, 1, . . . , 1), T = diag(−1, 1, . . . , 1), RT = diag(−1, −1, 1, . . . , 1).
The transformation R reflects the spatial x1 coordinate, the transformation T reverses time, and the transfor-
mation RT does both. Due to our general discussion above we should expect that T and RT are represented
by antiunitary operators ΘT and ΘRT , while R is represented by a unitary operator UR . Therefore two of
the connected components of the Lorentz group are unitary and two are antiunitary. When d is even it is
conventional to replace R by an operation P, called parity, that reflects all spatial coordinates. When d is
odd however P is in SO+ (d − 1, 1), so in general it is best to stick with R.
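These statements are easy to check numerically. The sketch below (an illustration, assuming d = 4 and a boost of rapidity 0.7 chosen arbitrarily) verifies that multiplying a boost by R, T, and RT preserves the metric and lands in the four components labeled by (det Λ, sign Λ⁰₀):

```python
# Illustration: the four connected components of the Lorentz group in d = 4.
import numpy as np

eta = np.diag([-1.0, 1.0, 1.0, 1.0])

def boost(rapidity):
    ch, sh = np.cosh(rapidity), np.sinh(rapidity)
    L = np.eye(4)
    L[0, 0], L[0, 1], L[1, 0], L[1, 1] = ch, sh, sh, ch
    return L

R = np.diag([1.0, -1.0, 1.0, 1.0])   # reflect x^1
T = np.diag([-1.0, 1.0, 1.0, 1.0])   # reverse time
B = boost(0.7)                        # element of SO+(3,1)

for Lam in (B, R @ B, T @ B, R @ T @ B):
    # all four preserve the metric: Λᵀ η Λ = η
    assert np.allclose(Lam.T @ eta @ Lam, eta)

# the components are distinguished by (det Λ, sign of Λ^0_0)
labels = {(round(np.linalg.det(Lam)), np.sign(Lam[0, 0]))
          for Lam in (B, R @ B, T @ B, R @ T @ B)}
assert labels == {(1, 1.0), (-1, 1.0), (-1, -1.0), (1, -1.0)}
```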
The fact that the Poincaré group has four connected components suggests the possibility that there
could be relativistic field theories where only some of these components give genuine symmetries. We should
always include the identity component SO+ (d − 1, 1) (otherwise what would we mean by “relativistic field
theory”), but there are indeed interesting theories where some of the other components are not symmetries.
In fact this possibility is realized in the Standard Model of particle physics, which has neither parity nor
time-reversal symmetry. On the other hand we will see in a few lectures that there is a way of combining
RT with an internal transformation C, called charge conjugation, that gives a combined transformation
CRT which is always a symmetry in any relativistic field theory (even if C, R, and T separately are not
symmetries). Thus we always at least have a spacetime symmetry group SO(d − 1, 1), where the absence of
the + indicates that we have included the RT component of the Lorentz group but the S indicates that we
have not included the R and T components.
The existence of a unitary representation of SO+ (d−1, 1) obeying U (Λ, a)† A[R]U (Λ, a) = A[Λ−1 (R−a)],
the spectrum condition, nesting, and causality together form what are called the Haag-Kastler axioms
for algebraic quantum field theory. It is widely agreed that these axioms are necessary for any reasonable
definition of relativistic quantum field theory. There is less agreement on what else is needed; two things I personally would also include are duality and the existence of a conserved symmetric energy-momentum
tensor that generates SO+ (d − 1, 1).
4.5 Correlation functions of tensor fields
Just as in the case of internal symmetries, spacetime symmetries imply powerful constraints on correlation
functions. First considering elements of the Poincare group with Λ⁰₀ ≥ 1, we can define
O′(x) = U(Λ, a)† O(x) U(Λ, a), (4.51)
with U(Λ, a) being unitary. Assuming the ground state is invariant under Poincare symmetry, we then have
⟨Ω|O1′ (x1 ) . . . On′ (xn )|Ω⟩ = ⟨Ω|O1 (x1 ) . . . On (xn )|Ω⟩ (4.52)
just as in the internal case. In particular let’s say that the operators O(x) are tensor fields, meaning that
they come with some number of raised and lowered indices such that their Poincare transformation is
O′^{µ1...µn}_{ν1...νm}(x) = Λ^{µ1}_{α1} · · · Λ^{µn}_{αn} Λ_{ν1}^{β1} · · · Λ_{νm}^{βm} O^{α1...αn}_{β1...βm}(Λ⁻¹(x − a)). (4.53)
By taking Λ to be the identity we see that the correlation function must be invariant under translating all of
the coordinates xµ1 , . . . , xµn by an arbitrary vector aµ , and thus that the correlation function can only depend
on differences of these coordinates. When Λ is not the identity further constraints are imposed, for example
the two-point function of a vector operator V^µ(x) must obey
⟨V^µ(x) V^ν(0)⟩ = f(x²) η^{µν} + g(x²) x^µ x^ν (4.55)
for some functions f and g. For an antiunitary symmetry Θ we can define O′(x) = Θ† O(x) Θ as before, but the constraint on correlation functions is now a bit trickier to derive. Assuming that the ground state is invariant under Θ†, we have
⟨O1′ (x1 ) . . . On′ (xn )⟩ = (Ω, O1′ (x1 ) . . . On′ (xn )Ω)
= (Θ† Ω, Θ† O1 (x1 ) . . . On (xn )Ω)
= (O1 (x1 ) . . . On (xn )Ω, Ω)
= (Ω, (O1 (x1 ) . . . On (xn ))† Ω)
= ⟨On (xn )† . . . O1 (x1 )† ⟩. (4.57)
Here we have switched to mathematician notation in the middle to handle the antiunitary operators. Thus
we see that an antiunitary symmetry reverses the order of the operators in a correlation function and
takes their hermitian conjugates. This has the nice feature that it sends time-ordered correlation functions
to time-ordered correlation functions.
Kastler axioms), and indeed in my long paper with Hirosi Ooguri we give some counterexamples. These counterexamples are in
somewhat pathological theories however, and so far it seems likely that Noether’s theorem is true for sufficiently well-behaved
theories.
imposes interesting constraints on correlation functions that contain such currents, since inserting ∂µ J µ into
any (Wightman) correlation function must give zero. For example you will show on the homework that
imposing the conservation equation ∂µ V µ = 0 on the vector field appearing in (4.55) implies that the
functions f and g obey the constraint
f′(x) + x g′(x) + ((d+1)/2) g(x) = 0. (4.58)
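Constraints of this kind can be verified symbolically. The sketch below (an illustration in d = 4; the trial functions are chosen by hand) picks g(s) = s, solves the constraint for f, and checks that the divergence of the resulting vector two-point function vanishes identically:

```python
# Illustration (d = 4): with the ansatz ⟨V^µ(x) V^ν(0)⟩ = f(s) η^{µν}
# + g(s) x^µ x^ν, s = x·x, the constraint f'(s) + s g'(s) + (d+1)/2 g(s) = 0
# with g(s) = s gives f(s) = -(7/4) s², and the divergence then vanishes.
import sympy as sp

d = 4
xs = sp.symbols('x0:4', real=True)
eta = sp.diag(-1, 1, 1, 1)                       # mostly-plus Minkowski metric
s = sum(eta[m, m] * xs[m]**2 for m in range(d))  # s = x·x

g_s = s                                  # trial choice g(s) = s
f_s = -sp.Rational(7, 4) * s**2          # solves f'(s) = -s g'(s) - (d+1)/2 g(s)

def G(mu, nu):
    # the two-point function ansatz with both indices up
    return eta[mu, nu] * f_s + xs[mu] * xs[nu] * g_s

for nu in range(d):
    div = sum(sp.diff(G(m, nu), xs[m]) for m in range(d))  # ∂_µ G^{µν}
    assert sp.expand(div) == 0
```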
There is also an interesting constraint on time-ordered correlation functions of a conserved current J µ . We
can illustrate the idea using a two-point function:
∂_µ ⟨T J^µ(x) O(y)⟩ = δ(x⁰ − y⁰) ⟨[J⁰(x), O(y)]⟩, (4.59)
where the term on the right-hand side comes from the derivative acting on the Heaviside Θ function. More
generally we have
∂_µ ⟨T J^µ(x) O1(y1) . . . On(yn)⟩ = Σ_{m=1}^n δ(x⁰ − y⁰_m) ⟨T O1(y1) . . . [J⁰(x), Om(ym)] . . . On(yn)⟩. (4.60)
Note that the commutators appearing on the right-hand side are at equal time due to the δ-function, and thus vanish when x⃗ ≠ y⃗_m. We can therefore expand them in the δ function and its derivatives:
δ(x⁰ − y⁰_m) [J⁰(x), Om(ym)] = i δ^d(x − ym) δS Om(ym) + . . . ,
and so we see that the divergence of a time-ordered correlation function involving a conserved current obeys
the Ward identity:
∂_µ ⟨T J^µ(x) O1(y1) . . . On(yn)⟩ = i Σ_{m=1}^n δ^d(x − ym) ⟨T O1(y1) . . . δS O(ym) . . . On(yn)⟩ + . . . , (4.63)
4.7 Homework
1. Compute the two-point functions ⟨Φ(x)Φ(y)⟩ and ⟨Φ† (x)Φ(y)⟩ for a complex scalar field, giving each
answer both as a covariant integral over spacetime momenta and also directly in position space in
terms of a Bessel function. You are free to use our results for the real scalar field, so you shouldn’t
need to evaluate any new integrals.
2. Show that if R1 and R2 are open spatial regions (which recall for us means that each lies in a constant
time slice in some Lorentz frame) obeying R1 ⊂ R2 , then their domains of dependence obey D[R1 ] ⊂
D[R2 ].
3. Show that SU (N ) is indeed a group, meaning that it is closed under matrix multiplication and matrix
inverse.
4. Show that every Lorentz transformation is indeed a product of an element of SO+ (d − 1, 1) with 1,
R, T , or RT . Hint: this shouldn’t require any detailed calculation or explicit parameterization of the
Lorentz group.
5. Argue that the vector two-point function indeed has the form (4.55), and also show that if ∂µ V µ = 0
then (4.58) follows.
6. Check that the two-point functions we computed for real and complex scalar fields are consistent with
the time-reversal constraint (4.57).
7. Extra credit: Antiunitary operators may seem somewhat counter-intuitive, but there is an elegant
characterization of any antiunitary operator due to Wigner that you will work out in this problem.
First argue that if Θ is antiunitary then Θ2 is unitary. There therefore must be a basis |i⟩ in which
we have Θ2 |i⟩ = e−2iθi |i⟩ and (Θ† )2 |i⟩ = (Θ2 )† |i⟩ = e2iθi |i⟩ for some θi ∈ (−π/2, π/2]. Work out how
Θ and Θ† act in this basis, and then argue that their action on arbitrary superpositions follows from
antilinearity. Hint: you want to show that up to phase redefinitions you can take this basis to consist
of states which are invariant and pairs of states which are exchanged up to a phase by acting with Θ.
You might start by showing that Θ|i⟩ is also an eigenstate of Θ2 .
5 Path integrals in quantum mechanics and quantum field theory
So far we have discussed quantum field theory in the Hamiltonian formalism. This formalism has many
advantages, for example it is where the physical interpretation of a quantum system in terms of measurements
and counting degrees of freedom is most clear, but it obscures the full symmetry of relativistic theories since
one needs to pick a Lorentz frame to define the canonical momenta and the Hamiltonian.36 Giving up on
manifest Lorentz invariance makes it harder to demonstrate some of the deeper consequences of Lorentz
invariance, such as the CRT and spin-statistics theorems, and it also makes practical calculations more
difficult since each intermediate step seems to depend on the Lorentz frame but the end result doesn’t. In
classical mechanics there is a clear way to handle this problem: we can think more about the Lagrangian and
less about the Hamiltonian. The goal of the path integral approach to quantum mechanics, first suggested
by Dirac and then greatly expanded by Feynman, is to give an independent (but equivalent) formulation of
quantum mechanics that is based on the Lagrangian instead of the Hamiltonian. We will spend the rest of this
lecture developing this approach.
Consider then a quantum system with position operators Q^a and conjugate momenta P_a obeying the canonical commutation relations
[Q^a, P_b] = i δ^a_b
[Q^a, Q^b] = 0
[P_a, P_b] = 0. (5.1)
We will take the Hamiltonian H(Q, P ) to be a polynomial in Q and P whose terms are ordered in such a
way that all P ’s appear to the right of all Q’s (using the canonical commutation relations we can always
write any product of P s and Qs as a sum of terms with this ordering), and we will work in the Heisenberg
picture so that both Q and P are functions of time. For convenience we will take the Hamiltonian to be
time-independent, but there is no real difficulty in repeating the argument for a time-dependent Hamiltonian.
Let’s say we are interested in computing the propagator G(qf , qi ; tf , ti ) in the Q basis. In the Schrödinger
picture this is given by
G(qf , qi ; tf , ti ) = ⟨qf |e−iH(tf −ti ) |qi ⟩, (5.2)
but since we are working in the Heisenberg picture we’ll instead write it as
G(qf, qi; tf, ti) = ⟨qf, tf |qi, ti⟩, (5.3)
where |q, t⟩ := e^{iHt}|q⟩ are the eigenstates of the Heisenberg operators Q^a(t).
covariant phase space approach. See my first paper with Jie-qiang for a review. Quantization in this approach is somewhat
subtle, and we won’t pursue the topic here.
Figure 8: Discretizing a particle trajectory from (qi , ti ) to (qf , tf ). The dashed lines show the positions
which are integrated over in the intermediate steps.
Inserting complete sets of position eigenstates at the N − 1 intermediate times t_ℓ = t_i + ℓϵ, with ϵ = (t_f − t_i)/N, we have
⟨qf, tf |qi, ti⟩ = ∫ Π_{m=1}^{N−1} dq_m Π_{ℓ=0}^{N−1} ⟨q_{ℓ+1}, t_{ℓ+1}|q_ℓ, t_ℓ⟩, (5.5)
where each trajectory starts at q^a_i at time t_i and ends at q^a_f at time t_f, see figure 8 for an illustration in the case of a single particle moving in one dimension. The integral is therefore a sum over (discretized) intermediate trajectories; a path integral. The expression
(5.5) however is not so useful: we need some way to compute the propagators. At finite ϵ this of course isn’t
any easier than computing the full propagator, but in the limit of small ϵ a simplification is possible:
Here in going from the first to second and fourth to fifth lines we have neglected terms which are O(ϵ2 ),
in going from the second to the third line we have inserted a complete set of states and used that in H
the momenta are ordered to the right, and in going from the third to the fourth line we have used the
momentum-space wave function
⟨q, t|p, t⟩ = e^{i Σ_a p_a q^a}. (5.7)
We can then use this repeatedly in (5.5) and take the limit ϵ → 0, which gives
⟨qf, tf |qi, ti⟩ = lim_{ϵ→0} ∫ Π_{m=1}^{N−1} dq_m ∫ Π_{n=0}^{N−1} (dp_n/2π) exp[ iϵ Σ_{ℓ=0}^{N−1} ( Σ_a p_{ℓ,a} (q^a_{ℓ+1} − q^a_ℓ)/ϵ − H(q_{ℓ+1}, p_ℓ) ) ]
:= ∫ Dq|_{q_i}^{q_f} ∫ Dp exp[ i ∫_{t_i}^{t_f} dt ( Σ_a p_a(t) q̇^a(t) − H(q(t), p(t)) ) ]. (5.8)
Here we have defined q0 = qi and qN = qf, and ∫Dq|_{q_i}^{q_f} indicates a functional integral over paths q^a(t) obeying q^a(ti) = q^a_i and q^a(tf) = q^a_f. ∫Dp indicates a functional integral over paths p_a(t) in momentum
space with no restrictions at ti and tf . Equation (5.8) is called a Hamiltonian path integral expression
for the propagator. The quantity appearing in the exponent is essentially i times the Lagrangian, except
that p is treated as an independent variable instead of being related to q and q̇.
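To see the discretized expression at work, here is a numerical sketch (an illustration, not from the notes). It works in imaginary time, anticipating the Euclidean path integrals of section 5.7, since there the short-time kernels are damped Gaussians and their composition is numerically stable; the assumed system is a harmonic oscillator H = P²/2 + X²/2, whose ground state energy 1/2 is recovered from the largest eigenvalue of the transfer matrix:

```python
# Illustration: discretized path integral in imaginary time for H = P²/2 + X²/2.
# Compose the short-time Euclidean kernel on a position grid; the largest
# eigenvalue of the transfer matrix is ≈ e^{-eps E0} with E0 = 1/2.
import numpy as np

n, L, eps = 401, 20.0, 0.05
x = np.linspace(-L/2, L/2, n)
dx = x[1] - x[0]

# free-particle Euclidean kernel ⟨x|e^{-eps P²/2}|x'⟩, with the dx measure
K_free = dx * np.exp(-(x[:, None] - x[None, :])**2 / (2*eps)) / np.sqrt(2*np.pi*eps)
# symmetric Trotter step e^{-eps V/2} e^{-eps P²/2} e^{-eps V/2}, V = x²/2
half = np.exp(-eps * x**2 / 4)
T = half[:, None] * K_free * half[None, :]

lam_max = np.linalg.eigvalsh(T)[-1]   # eigvalsh returns ascending eigenvalues
E0 = -np.log(lam_max) / eps
assert abs(E0 - 0.5) < 0.01
```

The Trotter splitting introduces errors of order ϵ², which is why a fairly small step ϵ = 0.05 already reproduces E0 = 1/2 to better than a percent.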
In quantum field theory we are particularly interested in expectation values of products of Heisenberg
operators, and these also have a useful path integral representation. Indeed we can consider the quantity
⟨qf, tf |OM(Q(t̄M), P(t̄M)) . . . O1(Q(t̄1), P(t̄1))|qi, ti⟩, (5.9)
where I’ve put a line over the times of the Heisenberg operators to distinguish them from the timesteps
appearing in the path integral discretization. We will assume that the operators are time-ordered, meaning
that
t̄1 ≤ t̄2 ≤ . . . ≤ t̄M, (5.10)
and we will also take these operators to be ordered so that all canonical momenta appear to the left (note
that this is the opposite of the ordering we chose for the Hamiltonian). We can evaluate this quantity by
inserting complete sets of states as before, except now we occasionally need to evaluate
⟨q′, t + ϵ|O(Q(t), P(t))|q, t⟩ = ∫ (dp/2π) ⟨q′, t|e^{−iϵH(Q(t),P(t))}|p, t⟩ ⟨p, t|O(Q(t), P(t))|q, t⟩
≈ ∫ (dp/2π) e^{iϵ( Σ_a p_a (q′^a − q^a)/ϵ − H(q′, p) )} O(q, p). (5.11)
Thus we see that the only effect of time-ordered operator insertions is to insert these operators evaluated as
functions of q and p into the path integral:
⟨qf, tf |T O1(Q(t̄1), P(t̄1)) . . . OM(Q(t̄M), P(t̄M))|qi, ti⟩ = ∫ Dq|_{q_i}^{q_f} ∫ Dp O1(q(t̄1), p(t̄1)) . . . OM(q(t̄M), p(t̄M))
× exp[ i ∫_{t_i}^{t_f} dt ( Σ_a p_a(t) q̇^a(t) − H(q(t), p(t)) ) ]. (5.12)
Here we have used the time-ordering symbol T on the left-hand side to ensure that operators are time-ordered,
so we no longer need to impose (5.10).
We can also use the path integral to project onto the ground state: expanding the position eigenstate in energy eigenstates we have
|q, 0⟩ = Σ_i Ci(q)|i⟩,
with H|i⟩ = Ei|i⟩. The idea is then to give t a small imaginary part via
t = e−iϵ τ, (5.15)
with τ real and 0 < ϵ ≪ 1, and then take τ to be large and negative. Working to leading order in ϵ we then
have
|q, e^{−iϵ}τ⟩ ≈ |q, (1 − iϵ)τ⟩ = e^{(i+ϵ)τH} |q, 0⟩ = Σ_i Ci(q) e^{(i+ϵ)Ei τ} |i⟩, (5.16)
Figure 9: The iϵ prescription for computing correlation functions in quantum field theory. Here t1 , t2 , . . . are
the locations of the operators and the time contour is shown in red. In practice it simplifies calculations if
we also analytically continue the operator times as t̄_m = e^{−iϵ} τ̄_m, as then we can straighten the contour to the dashed one.
so if we take τ → −∞ this gives us a state which is proportional to the ground state (which we renormalize to have zero energy):
|qi, −(1 − iϵ)∞⟩ = C0(qi)|Ω⟩. (5.17)
Therefore we can write a (Hamiltonian) path integral expression for the ground state wave function:
⟨qf, 0|Ω⟩ = (1/C0(qi)) ∫ Dq|_{q_i}^{q_f} ∫ Dp exp[ i ∫_{−(1−iϵ)∞}^{0} dt ( Σ_a p_a(t) q̇^a(t) − H(q(t), p(t)) ) ]. (5.18)
We can also use (5.17) to give a path integral expression for the time-ordered correlation functions:
⟨Ω|T O1(Q(t̄1), P(t̄1)) . . . OM(Q(t̄M), P(t̄M))|Ω⟩ = (1/|C0(0)|²) ∫ Dq|_0^0 ∫ Dp O1(q(t̄1), p(t̄1)) . . . OM(q(t̄M), p(t̄M))
× exp[ i ∫_{−(1−iϵ)∞}^{(1−iϵ)∞} dt ( Σ_a p_a(t) q̇^a(t) − H(q(t), p(t)) ) ], (5.19)
where for convenience we have arbitrarily taken qia = qfa = 0. The contour for the t integral is shown in figure
9. This contour prescription is the path integral version of the iϵ prescription, and we will soon see that it
gives rise to the same iϵ prescription in the Feynman propagator that we found from the canonical approach
a few lectures ago. You may worry that this formula still requires us to know |C0 (0)|, but by removing the
operator insertions we can also use it to give us a path integral formula for this,
$$|C_0(0)|^2 = \int \mathcal{D}q\big|_0^0 \int \mathcal{D}p\, \exp\left[i\int_{-(1-i\epsilon)\infty}^{(1-i\epsilon)\infty} dt\left(\sum_a p_a(t)\dot q^a(t) - H(q(t),p(t))\right)\right], \quad (5.20)$$
so the correlation function is really a ratio of two path integrals. This is convenient because ambiguities in
the normalization of the path integral measure cancel between the numerator and denominator.
but just in case, the proof is to look at the square of this integral and change to polar coordinates:
$$\left(\int_{-\infty}^{\infty} dx\, e^{-\frac{x^2}{2}}\right)^2 = \int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dy\, e^{-\frac{x^2+y^2}{2}} = 2\pi\int_0^\infty dr\, r\, e^{-\frac{r^2}{2}} = 2\pi\int_0^\infty dr\, \frac{d}{dr}\left(-e^{-\frac{r^2}{2}}\right) = 2\pi. \quad (5.22)$$
Once we have this basic result we can derive others, for example for any A > 0 and any complex B we have
$$\int_{-\infty}^{\infty} dx\, e^{-\frac{A}{2}x^2 + Bx} = \int_{-\infty}^{\infty} dx\, e^{-\frac{A}{2}\left(x-\frac{B}{A}\right)^2 + \frac{B^2}{2A}} = \frac{e^{\frac{B^2}{2A}}}{\sqrt{A}}\int_{-\infty}^{\infty} dz\, e^{-\frac{z^2}{2}} = \sqrt{\frac{2\pi}{A}}\, e^{\frac{B^2}{2A}}. \quad (5.23)$$
By differentiating this expression with respect to B we can compute all the moments of the Gaussian
distribution, for example
$$\sqrt{\frac{A}{2\pi}}\int_{-\infty}^{\infty} dx\, x^2\, e^{-\frac{A}{2}x^2} = \left.\frac{d^2}{dB^2}\, e^{\frac{B^2}{2A}}\right|_{B=0} = \frac{1}{A}. \quad (5.24)$$
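These one-dimensional formulas are easy to check numerically; here is a minimal sketch, where the values of A and B are arbitrary test choices:

```python
import numpy as np

A, B = 1.7, 0.4                       # arbitrary test values, A > 0
x = np.linspace(-40.0, 40.0, 400001)  # wide grid so the Gaussian tails are negligible
dx = x[1] - x[0]

# left and right sides of (5.23)
lhs = np.sum(np.exp(-0.5 * A * x**2 + B * x)) * dx
rhs = np.sqrt(2 * np.pi / A) * np.exp(B**2 / (2 * A))
assert abs(lhs - rhs) < 1e-8 * rhs

# second moment, equation (5.24)
moment = np.sqrt(A / (2 * np.pi)) * np.sum(x**2 * np.exp(-0.5 * A * x**2)) * dx
assert abs(moment - 1 / A) < 1e-8
```

The trapezoid-like Riemann sum converges extremely fast here because the integrand is analytic and decays rapidly, so the agreement is essentially at machine precision.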
We can also consider multiple integrals: given a symmetric matrix A which we will at first assume to be real
and positive, and a complex vector B, we have the integral
$$Z[A,B] := \int dx\, e^{-\frac{1}{2}x^T A x + B^T x}, \quad (5.25)$$
where x is a real vector. We can diagonalize A as $A = O^T D O$, where O is orthogonal and D is diagonal with positive elements $d_1, d_2, \ldots$. We can then change variables to $\tilde x = Ox$, giving
$$\begin{aligned} Z[A,B] &= \int d\tilde{x}\, e^{-\frac{1}{2}\tilde{x}^T D \tilde{x} + (OB)^T \tilde{x}} \\ &= \prod_i \int d\tilde{x}_i\, e^{-\frac{1}{2}d_i \tilde{x}_i^2 + \sum_j O_{ij}B_j \tilde{x}_i} \\ &= \prod_i \sqrt{\frac{2\pi}{d_i}}\, e^{\frac{\left(\sum_j O_{ij}B_j\right)^2}{2d_i}} \\ &= \frac{1}{\sqrt{\mathrm{Det}\,\frac{A}{2\pi}}}\, e^{\frac{1}{2}B^T A^{-1} B}. \end{aligned} \quad (5.26)$$
There is an easy way to remember this result: up to a determinant factor, we can evaluate a Gaussian integral
by evaluating its integrand on the value of x for which its exponent is stationary. Indeed the exponent in
(5.25) has a stationary point at
x = A−1 B, (5.27)
and we then have
$$-\frac{1}{2}x^T A x + B^T x = \frac{1}{2}B^T A^{-1} B. \quad (5.28)$$
We can also use this result to compute correlation functions:
$$\frac{\int dx\; x_{i_1}\cdots x_{i_n}\, e^{-\frac{1}{2}x^T A x}}{\int dx\; e^{-\frac{1}{2}x^T A x}} = \left.\frac{\partial}{\partial B_{i_1}}\cdots\frac{\partial}{\partial B_{i_n}}\, e^{\frac{1}{2}B^T A^{-1} B}\right|_{B=0}. \quad (5.29)$$
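The multivariate formula (5.26) and the two-point function $\langle x_i x_j\rangle = (A^{-1})_{ij}$ that follows from (5.29) can likewise be checked numerically; here is a sketch with an arbitrary 2×2 positive test matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
A = np.array([[2.0, 0.5], [0.5, 1.0]])   # arbitrary positive test matrix
B = np.array([0.3, -0.2])
Ainv = np.linalg.inv(A)

# closed form (5.26)
Z_formula = np.exp(0.5 * B @ Ainv @ B) / np.sqrt(np.linalg.det(A / (2 * np.pi)))

# brute-force quadrature of (5.25) on a grid
xs = np.linspace(-8.0, 8.0, 801)
X, Y = np.meshgrid(xs, xs, indexing="ij")
pts = np.stack([X, Y], axis=-1)
expo = -0.5 * np.einsum("...i,ij,...j->...", pts, A, pts) + pts @ B
Z_grid = np.exp(expo).sum() * (xs[1] - xs[0]) ** 2
assert abs(Z_grid - Z_formula) < 1e-6 * Z_formula

# two-point function: sampling x ~ exp(-x^T A x / 2) and comparing <x_i x_j> to A^{-1}
samples = rng.multivariate_normal(np.zeros(2), Ainv, size=200_000)
cov = samples.T @ samples / len(samples)
assert np.allclose(cov, Ainv, atol=0.02)
```

The Monte Carlo part is the finite-dimensional prototype of how correlation functions are evaluated numerically in the Euclidean path integral later in this lecture.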
In quantum mechanics we are not only interested in the situation where A is real and positive. We can
extend our result (5.26) to more general A by analytic continuation; a minimal condition for the convergence
of the integral (5.25) is that A has positive real part, meaning that A + A† is positive, and (5.26) will apply
for any such matrix provided that we are careful to define the sign of the square root by analytic continuation
from real positive A.
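The one-dimensional version of this continuation is easy to test numerically; a sketch with an arbitrary a of positive real part, where `np.sqrt` takes the principal branch (the continuation from real positive a):

```python
import numpy as np

a = 1.0 + 3.0j                          # arbitrary test value with Re(a) > 0
x = np.linspace(-30.0, 30.0, 600001)
integral = np.sum(np.exp(-0.5 * a * x**2)) * (x[1] - x[0])

# the oscillatory integral still converges to sqrt(2*pi/a), principal branch
assert abs(integral - np.sqrt(2 * np.pi / a)) < 1e-6
```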
5.4 Lagrangian path integral in quantum mechanics
So far the path integrals we have discussed have independent integrals over the trajectories q(t) and p(t).
These manifestly rely on the Hamiltonian formalism, and thus are not manifestly covariant in relativistic
theories. To get covariant expressions we need to get rid of p(t). The best way to do this is to integrate
it out, meaning to simply evaluate the functional integral over p(t). In many theories of physical interest,
including in particular the standard model of particle physics and also general relativity, the Hamiltonian
is a quadratic function of the canonical momenta. The functional integral over p(t) is therefore a Gaussian
integral, and we can thus evaluate it using the methods of the previous subsection. Indeed the stationarity
condition is simply Hamilton’s equation
$$\dot q^a = \frac{\partial H}{\partial p_a}, \quad (5.31)$$
so evaluating the Gaussian integral over p(t) has precisely the effect of converting the exponent in the path
integral into the Lagrangian! More explicitly, considering expectation values of operators that depend only
on Q (and not P ) we have
$$\langle q_f,t_f|T\,O_1(Q(t_1))\cdots O_M(Q(t_M))|q_i,t_i\rangle = \int \frac{\mathcal{D}q\big|_{q_i}^{q_f}}{\sqrt{\mathrm{Det}\,(2\pi A[q])}}\; O_1(q(t_1))\cdots O_M(q(t_M))\, e^{i\int_{t_i}^{t_f} dt\, L(q(t),\dot q(t))}. \quad (5.32)$$
Here A[q] is the “matrix” appearing in the term in the Hamiltonian which is quadratic in P , as in equation
(5.25). In simple theories (such as the harmonic oscillator or the standard model of particle physics) A is
independent of q, in which case the determinant factor is a field-independent constant and can be absorbed
into a rescaling of the measure.37 Equation (5.32) is called the Lagrangian path integral, and unlike the
Hamiltonian path integral it manifestly has (up to possible regularization issues) all the symmetries of the
classical Lagrangian L. Using the iϵ prescription we can also give a Lagrangian path integral expression for
time-ordered correlation functions:
$$\langle\Omega|T\,O_1(Q(t_1))\cdots O_M(Q(t_M))|\Omega\rangle = \frac{\int \frac{\mathcal{D}q|_0^0}{\sqrt{\mathrm{Det}(2\pi A[q])}}\; O_1(q(t_1))\cdots O_M(q(t_M))\, e^{i\int_{-\infty(1-i\epsilon)}^{\infty(1-i\epsilon)} dt\, L(q(t),\dot q(t))}}{\int \frac{\mathcal{D}q|_0^0}{\sqrt{\mathrm{Det}(2\pi A[q])}}\, e^{i\int_{-\infty(1-i\epsilon)}^{\infty(1-i\epsilon)} dt\, L(q(t),\dot q(t))}}. \quad (5.34)$$
This expression, together with its Euclidean continuation we will introduce soon, is the starting point for
many (most?) standard calculations in quantum field theory.
The restriction to operator insertions that don’t depend on P is not so serious, as we can differentiate
both sides of equation (5.32) with respect to the operator times t1 , t2 , . . . to get path integral expressions for
correlation functions involving time derivatives of q. The restriction to Hamiltonians which are quadratic in
P is more concerning. In general the best that can be said is that by integrating out p we will always get some
local Lagrangian which has whatever symmetries the theory has, but it won’t in general be the Legendre
transform of the Hamiltonian we started with. On the other hand in quantum field theory we usually end
up writing down the most general local Lagrangian that is consistent with the symmetries in question (see
our discussion of effective field theories in later lectures), and the new Lagrangian resulting from integrating
out p will differ from the one resulting from the Legendre transformation only by shifts of the values of
the parameters in this Lagrangian. By starting with the Lagrangian approach we therefore land on the
same class of theories as we did starting from the Hamiltonian approach, but now with a more complicated
relationship between the two approaches. These comments also apply to the somewhat arbitrary choices we
made for the operator ordering of H and O: other choices would just differ by shifting the coefficients of
37 An example of a theory which is not “simple” in this regard is the “non-linear σ-model”, which is a theory of multiple
the local terms appearing in H and O. In general shifts of this type are called renormalizations, and in
defining path integrals we always give ourselves some leeway in how to renormalize both the Hamiltonian
and the operators appearing in expectation values.
This integral is Gaussian, so we can evaluate it using our formula (5.26): we are supposed to find the saddle
point of the exponent and then evaluate the integrand on it. The saddle point equation is
$$\frac{d^2x}{dt^2} = -m^2 x, \quad (5.36)$$
but it is more convenient to rewrite this in terms of τ = (1 + iϵ)t:
$$\frac{d^2x}{d\tau^2} = -m^2(1-2i\epsilon)x. \quad (5.37)$$
We are interested in finding the saddle point which vanishes at τ = −∞ and is equal to xf at τ = 0; this is
which is indeed the exponent for the correct harmonic oscillator ground state.
$$\begin{aligned} A^{-1}(x_1,x_2) &= \int \frac{d^dp}{(2\pi)^d}\, \frac{i\, e^{ip(x_2-x_1)}}{(1+2i\epsilon)(p^0)^2 - |\vec p\,|^2 - m^2} \\ &= \int \frac{d^dp}{(2\pi)^d}\, \frac{-i\, e^{ip(x_2-x_1)}}{p^2 + m^2 - i\epsilon}, \end{aligned} \quad (5.42)$$
where in the second line we rescaled ϵ by the positive quantity (p0 )2 . By equation (5.30) this should be equal
to the Feynman propagator GF (x1 , x2 ), and indeed it matches the expression we found a few lectures ago
using the canonical formalism.
We can also use this approach to independently derive a position space expression for the Feynman
propagator. Indeed from equation (5.41) the Feynman propagator should obey
and substituting this into (5.43) we find that away from s = 0 the Feynman propagator obeys (up to terms
of order ϵ2 )
$$G_F''(s) + \frac{d-1}{s}\,G_F'(s) - m^2 G_F(s) = 0, \quad (5.45)$$
which is a standard ordinary differential equation whose solutions can be expressed in terms of Bessel functions (as you can easily check in Mathematica). The solution which goes to zero at large positive s is
$$G_F \propto s^{-\frac{d-2}{2}}\, K_{\frac{d-2}{2}}(ms), \quad (5.46)$$
and we can fix the coefficient of proportionality either by requiring that this obeys
or else by matching to the integral (5.42) in the limit ms ≪ 1, where the integral is easier to compute. This is the same position-space two-point function we quoted in lecture five, except that now the iϵ prescription we are using gives us the Feynman propagator instead of the two-point function.
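One can check numerically that (5.46) indeed solves (5.45); here is a minimal sketch using scipy, where the values of d, m, and the sample points are arbitrary test choices:

```python
import numpy as np
from scipy.special import kv  # modified Bessel function K_nu

d, m, h = 4.0, 1.3, 1e-4      # dimension, mass, finite-difference step
nu = (d - 2) / 2

def G(s):
    # the candidate solution (5.46), up to overall normalization
    return s**(-nu) * kv(nu, m * s)

for s in (0.5, 1.0, 2.7):
    G1 = (G(s + h) - G(s - h)) / (2 * h)
    G2 = (G(s + h) - 2 * G(s) + G(s - h)) / h**2
    residual = G2 + (d - 1) / s * G1 - m**2 * G(s)   # left side of (5.45)
    assert abs(residual) < 1e-5 * abs(G2)
```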
t = −iτ. (5.48)
The path integral on this contour is called the Euclidean path integral, and for many questions the
Euclidean path integral is the best way to think about it. Given its importance, it is worth repeating the
derivation we gave in Lorentzian signature directly in Euclidean signature. The idea is to define Euclidean
Heisenberg operators by38
$$O(\tau) = e^{\tau H}\, O(0)\, e^{-\tau H}, \quad (5.49)$$
with eigenstates $|q,-i\tau\rangle = e^{H\tau}|q,0\rangle$. Proceeding as in the Lorentzian case, we can note that
$$\begin{aligned} \langle q',-i(\tau+\epsilon)|O(Q(\tau),P(\tau))|q,-i\tau\rangle &= \int \frac{dp}{2\pi}\,\langle q',-i\tau|e^{-\epsilon H}|p,-i\tau\rangle\langle p,-i\tau|O(Q(\tau),P(\tau))|q,-i\tau\rangle \\ &\approx \int \frac{dp}{2\pi}\, e^{\epsilon\left(i\sum_a p_a \frac{q'^a-q^a}{\epsilon} - H(q',p)\right)}\, O(q(\tau),p(\tau)) \end{aligned} \quad (5.50)$$
and therefore by inserting complete sets of states we have
$$\langle q_f,-i\tau_f|T\,O_1(Q(\tau_1),P(\tau_1))\cdots O_M(Q(\tau_M),P(\tau_M))|q_i,-i\tau_i\rangle = \int \mathcal{D}q\big|_{q_i}^{q_f}\int \mathcal{D}p\; O_1(q(\tau_1),p(\tau_1))\cdots O_M(q(\tau_M),p(\tau_M))\,\exp\left[\int_{\tau_i}^{\tau_f} d\tau\left(i\sum_a p_a(\tau)\dot q^a(\tau) - H(q(\tau),p(\tau))\right)\right]. \quad (5.51)$$
38 These operators are somewhat delicate mathematically due to the presence of eτ H , which has a very limited domain. It is
always ok to use them in time-ordered vacuum correlators however, which in the end is the only place we will use them.
Taking τi → −∞ and τf → ∞ now automatically projects onto the ground state, so no analytic continuation
is needed to convert this into a vacuum expectation value:
$$\langle\Omega|T\,O_1(Q(\tau_1),P(\tau_1))\cdots O_M(Q(\tau_M),P(\tau_M))|\Omega\rangle = \frac{1}{|C_0(0)|^2}\int \mathcal{D}q\big|_0^0\int \mathcal{D}p\; O_1(q(\tau_1),p(\tau_1))\cdots O_M(q(\tau_M),p(\tau_M))\,\exp\left[\int_{-\infty}^{\infty} d\tau\left(i\sum_a p_a(\tau)\dot q^a(\tau) - H(q(\tau),p(\tau))\right)\right]. \quad (5.52)$$
Converting this into a Lagrangian path integral (with the same caveats as before), we end up with
$$\langle\Omega|T\,O_1(Q(\tau_1))\cdots O_M(Q(\tau_M))|\Omega\rangle = \frac{\int \frac{\mathcal{D}q|_0^0}{\sqrt{\mathrm{Det}(2\pi A)}}\; O_1(q(\tau_1))\cdots O_M(q(\tau_M))\, e^{-\int_{-\infty}^{\infty} d\tau\, L_E(q,\dot q)}}{\int \frac{\mathcal{D}q|_0^0}{\sqrt{\mathrm{Det}(2\pi A)}}\, e^{-\int_{-\infty}^{\infty} d\tau\, L_E(q,\dot q)}}. \quad (5.53)$$
For example for the simple harmonic oscillator the Euclidean Lagrangian is
$$L_E = \frac{1}{2}\left(\dot q^2 + m^2 q^2\right), \quad (5.55)$$
while for a free scalar field the Euclidean Lagrangian is the spatial integral of the Euclidean Lagrangian
density
$$\mathcal{L}_E = \frac{1}{2}\left(\dot\phi^2 + \nabla\phi\cdot\nabla\phi + m^2\phi^2\right). \quad (5.56)$$
There are a few essential points to make about the Euclidean path integral:
Mathematically it is much better behaved than the Lorentzian path integral. The Euclidean action $S_E = \int_{-\infty}^{\infty} d\tau\, L_E$ is often real and bounded from below, as you can see from the harmonic oscillator and the free scalar, so the integrand $e^{-S_E}$ exponentially suppresses field configurations which aren't near ϕ = 0. This makes it possible to give the path integral a mathematically rigorous formulation (at least in the case of a finite number of degrees of freedom); look up “Wiener measure” if you want to learn about it.
In situations where SE is real and bounded from below we can interpret the Euclidean path integral
(5.53) as computing expectation values in a classical probability distribution. Many famous classical
statistical systems arise in this way, for example the Euclidean path integral for a free particle is the
classical theory of Brownian motion and the Euclidean path integral for a free scalar field with d = 2
is the classical theory of random surfaces. The critical point in the phase diagram of water is also
described by a (interacting) Euclidean scalar field theory, as are the fluctuations of magnets at the
Curie temperature. Euclidean path integrals also arise in quantitative finance: the prices of options as
a function of time are fluctuating variables which can be characterized by a Euclidean path integral.
In situations where the Euclidean path integral has a probabilistic interpretation it is amenable to
explicit numerical evaluation. The standard approach to this is called the Monte Carlo method, which
samples from the probability distribution and then assumes that the expectation value is dominated
by its value on a typical instance. This is a very powerful method for evaluating high-dimensional
integrals. For example in QCD, the theory of the strong nuclear force, my colleagues here in the
Center for Theoretical Physics use this method to compute the masses of hadrons such as the proton
and neutron to quite good accuracy. The computational resources involved are somewhat terrifying:
for example in a recent calculation my colleague Will Detmold used the fastest publicly-available
supercomputer in the world, Frontier at Oak Ridge National Laboratory, to evaluate the Euclidean
path integral of QCD on a Euclidean spacetime lattice with 72 × 72 × 72 × 192 sites, consuming of
order $10^{11}$ Joules of energy in the process.
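To make the Monte Carlo method concrete, here is a toy sketch (nothing like the lattice QCD setup described above): a Metropolis simulation of the Euclidean path integral for the harmonic oscillator of (5.55) on a periodic time lattice, checking $\langle q^2\rangle$ against the exact ground-state value $1/(2m)$. All lattice parameters below are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 1.0        # oscillator frequency; the exact continuum <q^2> is 1/(2m) = 0.5
a = 0.25       # lattice spacing
N = 100        # number of lattice sites, periodic boundary conditions

q = np.zeros(N)

def delta_S(q, i, new):
    """Change in the lattice action S = sum_i [(q_{i+1}-q_i)^2/(2a) + a m^2 q_i^2/2]
    when site i is moved from q[i] to new."""
    ip, im = (i + 1) % N, (i - 1) % N
    old = q[i]
    kin = ((q[ip] - new) ** 2 + (new - q[im]) ** 2
           - (q[ip] - old) ** 2 - (old - q[im]) ** 2) / (2 * a)
    pot = a * m**2 * (new**2 - old**2) / 2
    return kin + pot

measurements = []
for sweep in range(8000):
    for i in range(N):
        new = q[i] + rng.uniform(-0.5, 0.5)
        if rng.random() < np.exp(-delta_S(q, i, new)):  # Metropolis accept/reject
            q[i] = new
    if sweep >= 500:                 # discard thermalization sweeps
        measurements.append(np.mean(q**2))

q2 = np.mean(measurements)
assert abs(q2 - 1 / (2 * m)) < 0.05  # agrees up to O(a^2) and statistical error
```

Sampling from $e^{-S_E}$ and averaging the observable over sampled configurations is exactly the strategy used in lattice QCD, just with vastly more degrees of freedom.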
5.8 Homework
1. Rewrite the operator P QP Q as a sum of operators with all P to the right and all Q to the left.
2. Use the path integral to find the propagator ⟨q′, t′|q, t⟩ of a free quantum particle moving on a line with Hamiltonian $H = \frac{P^2}{2m}$. Hint: use the discretized version of the Lagrangian path integral.
3. Use the path integral to find the propagator for the simple harmonic oscillator, with Hamiltonian $H = \frac{P^2}{2m} + \frac{k}{2}Q^2$. Hint: you should expand the function q(t) you are integrating over as a classical
solution qcl plus a fluctuating piece δq, and then expand δq in Fourier modes and integrate over the
coefficients of these modes. I recommend first doing the calculation neglecting any prefactors which are
independent of k: you can find the k-independent prefactor at the end by comparing to your answer
for the previous problem in the limit k → 0.
4. Use the Lorentzian path integral with an iϵ prescription to find the Feynman propagator of a free
massive complex scalar field (remember that this is the time-ordered two-point function of Φ and Φ† ).
5. Use the Euclidean path integral followed by a Wick rotation to compute the Feynman propagator of a
real free scalar field with mass m. Hint: you should find that the Euclidean Feynman propagator is a
Green's function for the Euclidean Klein-Gordon operator, obeying $(\nabla_x^2 - m^2)G_F(x,y) = -\delta^d(x-y)$.
It is ok to leave your expression for it in terms of a spacetime momentum integral, but you should
make sure that after Wick rotation you get the right iϵ prescription for the Feynman propagator in
Lorentzian signature.
6 CRT , spin-statistics, and all that
We’ve now developed two powerful formalisms for thinking about quantum field theory: the operator ap-
proach based on algebras acting on Hilbert spaces and the path integral approach, both in Lorentzian and
Euclidean signature. In this lecture we will put the pieces together to prove some of the famous results
in relativistic quantum field theory: the CRT theorem, the relation between spin and statistics, and the
thermal nature of vacuum entanglement (the Unruh effect). All of these results are true non-perturbatively
in any relativistic quantum field theory, as the arguments will hopefully make clear. The title of this lecture
is shamelessly adapted from a famous book by Streater and Wightman, which discusses the first two of these
from a rigorous (but somewhat outdated) approach.
Here I have switched from the particle notation we used in the last lecture to field notation, and also absorbed
the determinant factor coming from integrating out the momenta into the measure Dϕ. In any relativistic
quantum field theory this path integral is invariant under Euclidean rotation symmetry, in the sense that if
FΛ is a transformation of field space which implements a Euclidean rotation Λ ∈ SO(d), i.e.
on the dynamical fields, then the combination of the path integral measure and action are invariant:
The invariance of the action is the classical statement of having a symmetry, while the invariance of the
measure reflects the statement that the regularization of the theory implicit in the path integral does not
destroy the symmetry (much later we will see examples of situations where this happens). Using this
invariance we can derive a constraint on correlation functions:
$$\begin{aligned} \langle\Omega|T\,O_1[\Phi]\cdots O_M[\Phi]|\Omega\rangle &= \frac{\int \mathcal{D}\phi\; O_1[\phi]\cdots O_M[\phi]\, e^{-S_E[\phi]}}{\int \mathcal{D}\phi\, e^{-S_E[\phi]}} \\ &= \frac{\int \mathcal{D}(F_\Lambda\phi)\; O_1[F_\Lambda\phi]\cdots O_M[F_\Lambda\phi]\, e^{-S_E[F_\Lambda\phi]}}{\int \mathcal{D}\phi\, e^{-S_E[\phi]}} \\ &= \frac{\int \mathcal{D}\phi\; O_1[F_\Lambda\phi]\cdots O_M[F_\Lambda\phi]\, e^{-S_E[\phi]}}{\int \mathcal{D}\phi\, e^{-S_E[\phi]}} \\ &= \langle\Omega|T\,O_1[F_\Lambda\Phi]\cdots O_M[F_\Lambda\Phi]|\Omega\rangle. \end{aligned} \quad (6.4)$$
In going from the first to the second line here we changed variables in the path integral, in going from the
second to the third we used the symmetry condition (6.3), and in going from the third to the fourth we used
(6.1).
To prove the CRT theorem we are interested in the Euclidean rotation Λ = RT , which acts as39
This then leads to a symmetry called CPT , which is a symmetry of any relativistic field theory when d is even. Historically the
theorem discussed in this section has thus usually been called the CPT theorem, especially by particle physicists who only care
about the case of d = 4, while the terminology CRT is of more recent origin. We have focused on CRT nonetheless because 1)
it is the thing which works in any spacetime dimension and 2) it is what naturally arises from the proof of the theorem.
I emphasize that RT is indeed an element of SO(d): it is a rotation by π in the plane of τ and x¹. This transformation reverses the direction of Euclidean time, so it also reverses the order of the operators in the Euclidean correlation function. To be more concrete, if $O_1$ lives at time $\tau_1$, $O_2$ at time $\tau_2$, etc., and for simplicity we assume that $\tau_1 \leq \tau_2 \leq \ldots$, then the Euclidean statement of this symmetry is that
$$\begin{aligned} \langle\Omega|O_M[\Phi]\cdots O_1[\Phi]|\Omega\rangle &= (-1)^{\frac{f(f-1)}{2}}\,\langle\Omega|O_1[F_{RT}\Phi]\cdots O_M[F_{RT}\Phi]|\Omega\rangle \\ &= (-1)^{f/2}\,\langle\Omega|O_1[F_{RT}\Phi]\cdots O_M[F_{RT}\Phi]|\Omega\rangle, \end{aligned} \quad (6.6)$$
where
$$f = f_{O_1} + \ldots + f_{O_M} \quad (6.7)$$
is the total number of fermionic operators appearing in O1 . . . OM . This pesky minus sign arises from
something we haven’t discussed yet, which is that when you time-order fermionic operators the process is
antisymmetric instead of symmetric. We’ll see this in more detail when we discuss the fermionic path integral
in a month or so, but for now the basic idea is that since fermionic fields anticommute instead of commute
at spacelike separation it must be that the degrees of freedom which represent them in the path integral are
also anticommuting. The second line of (6.6) follows from the first because correlation functions involving
fermions vanish unless the total number of fermions is even, which is a consequence of the fact that the
Lagrangian density is always bosonic (this is called fermion parity symmetry).
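As a quick consistency check of the sign manipulation in (6.6), we can verify that $(-1)^{f(f-1)/2}$ and $(-1)^{f/2}$ agree for every even f (the range below is an arbitrary choice):

```python
# Reversing f fermionic operators costs (-1)^(f(f-1)/2); for even f, which is
# the only case with a nonzero correlator, this equals (-1)^(f/2).
for f in range(0, 40, 2):
    assert (-1) ** (f * (f - 1) // 2) == (-1) ** (f // 2)
```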
The CRT theorem is what we get when we analytically continue (6.6) to Lorentzian signature. We can formalize the analytic continuation by introducing a Wick rotation operation W, whose action on dynamical fields is defined to perform the analytic continuation τ = it. On Euclidean scalar fields we have
while for tensor fields each raised τ index gets a factor of i and each lowered τ index gets a factor of −i. So for example a vector field $V^\mu$ has
$$W\begin{pmatrix} V^0(t,\vec x) \\ V^j(t,\vec x)\end{pmatrix} = \begin{pmatrix} iV^0(it,\vec x) \\ V^j(it,\vec x)\end{pmatrix}, \quad (6.9)$$
while a one-form field $\omega_\mu$ has
$$W\begin{pmatrix} \omega_0(t,\vec x) \\ \omega_j(t,\vec x)\end{pmatrix} = \begin{pmatrix} -i\omega_0(it,\vec x) \\ \omega_j(it,\vec x)\end{pmatrix}. \quad (6.10)$$
These factors are necessary because we’d like to preserve e.g. the expressions V = V µ ∂µ and ω = ωµ dxµ , so
we should rotate V 0 in the same way as we rotate τ and ω0 in the opposite way.40 Analytic continuation of
(6.6) thus gives
$$\langle\Omega|W O_M[\Phi]\cdots W O_1[\Phi]|\Omega\rangle = (-1)^{(f_{O_1}+\ldots+f_{O_M})/2}\,\langle\Omega|W O_1[F_{RT}\Phi]\cdots W O_M[F_{RT}\Phi]|\Omega\rangle. \quad (6.11)$$
In order to give this a symmetry interpretation in Lorentzian signature, we can first observe that the sym-
metry must be antiunitary since it reverses time. To see what the antiunitary is, we need to first recall that
for any antiunitary operator Θ that preserves the ground state we have the constraint
$$\langle\Omega|O_1\cdots O_M|\Omega\rangle = \langle\Omega|O_M'^\dagger\cdots O_1'^\dagger|\Omega\rangle. \quad (6.12)$$
Thus we can interpret (6.11) as indicating that our Lorentzian theory has an antiunitary symmetry ΘCRT
whose action on the dynamical fields is41
γ 0 transforms as W γ τ = iγ t . Don’t worry about this if you don’t yet know what it means.
41 Note that this definition does not require or use independent definitions of C, R, and T . In general these are not symmetries,
and even when they are there is some freedom in how they are defined. The name CRT is thus in some sense a historical
anachronism, the whole is better-defined than its parts.
where fa = 1 if Φa is fermionic and fa = 0 if Φa is bosonic. In particular note that we need to take the
complex conjugate of the analytic continuation of the Euclidean rotation to match (6.12), this is the origin
of the “C” in CRT . For example the action of CRT on a (Lorentzian) complex scalar Φ or complex vector
V µ is42
Once we have understood spinor fields we will also see that a Dirac spinor transforms as
since any factors of i and −i from the Wick rotation of any τ indices cancel between the two sides. In the
homework you will show that this equation together with the spin-statistics theorem imply that
$$\Theta_{CRT}^2 = 1, \quad (6.17)$$
doesn’t square to one. The CRT we have constructed here however is the only unbreakable one, up to the possibility of
multiplying it by the fermion parity operator (−1)F which acts as one on all bosonic states and minus one on all fermionic
states.
44 In quantum gravity there are good reasons to think that we can have quantum mechanics and (general) relativity without
having locality, but as far as we can tell CRT continues to be a good symmetry even in quantum gravity. See my recent paper
with Numasawa for more on this.
45 The fully non-perturbative proof of the spin-statistics theorem we give here is due to Schwinger. In most quantum field
theory books the theorem is proven in a more banal way that applies only to free fields. Essentially one tries to construct free
fields for particles of various spin, and then finds that it only works if the fields commute for integer spin and anticommute for
half-integer spin.
Spin-statistics relation: Particles with integer spin are bosons, while particles with half-integer spin
are fermions.
The idea behind this rule is quite easy to understand, but we first need to discuss the subtle fact (which
hopefully you have seen before) that a rotation by 2π acts on objects of half-integer spin as −1. For example
in the context of a spin 1/2 particle in three spatial dimensions a rotation by θ about the z axis is implemented
on the Hilbert space by
$$U(\theta) = e^{i\theta\sigma_z/2}, \quad (6.18)$$
so $U(2\pi) = e^{i\pi\sigma_z} = -1$. Mathematically we can express this by saying that the action of rotations on a
spin 1/2 particle does not give a genuine representation of the rotation group SO(3) in the sense of a set
of unitary operators U (g) such that U (g1 )U (g2 ) = U (g1 g2 ), since a rotation by 2π is equal to nothing in
the rotation group but apparently it isn’t equal to nothing acting on a spin 1/2 particle. We will discuss
this in more detail later in the semester, but the right way to understand this is that in a relativistic theory
with half-integer spin particles the spacetime symmetry group isn’t really SO+ (d − 1, 1), but instead what is
called its double cover Spin+ (d − 1, 1). Locally Spin+ (d − 1, 1) looks just like SO+ (d − 1, 1), but globally
it is different in that each element of SO+ (d − 1, 1) corresponds to two elements of Spin+ (d − 1, 1) which
differ by a rotation by 2π. The rotation part of Spin+ (3, 1), which is the double cover of SO(3), is precisely
given by the set of matrices of the form $e^{i\vec\theta\cdot\vec\sigma/2}$, which is nothing but the matrix group SU(2). We will see
how to extend this to a double cover of the full Lorentz group later in the semester when we discuss spinors.
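This minus sign is easy to verify by direct matrix exponentiation; a minimal sketch using scipy's `expm` with $\sigma_z$ as in (6.18):

```python
import numpy as np
from scipy.linalg import expm

sigma_z = np.diag([1.0, -1.0]).astype(complex)
U = lambda theta: expm(1j * theta * sigma_z / 2)   # equation (6.18)

assert np.allclose(U(2 * np.pi), -np.eye(2))       # a 2*pi rotation acts as -1...
assert np.allclose(U(4 * np.pi), np.eye(2))        # ...while a 4*pi rotation is trivial
```

The second assertion illustrates the double-cover statement: only rotations by 4π return a spin 1/2 state to itself.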
Turning now to spin and statistics, the basic ingredient we will need is to understand in more detail how
the Euclidean rotation matrix DE (RT ) acts on the fields Φa and their complex conjugates. This is a bit
tricky, so hold on tight! Let’s first recall that in Lorentzian signature we have
and thus
$$U(\Lambda)^\dagger\, \Phi_a(x)^\dagger\, U(\Lambda) = D^*(\Lambda)_a{}^b\, \Phi_b(\Lambda^{-1}x)^\dagger. \quad (6.20)$$
In particular when Λ is a boost of rapidity η in the x¹ direction we have
$$D(\Lambda) = e^{i\eta J^{01}}, \quad (6.21)$$
where the matrix $J^{01}$ is the boost generator in the representation D of the Lorentz group. To turn a boost into a Euclidean rotation, we want to analytically continue t = −iτ and η = −iθ such that
t′ = cosh(η)t + sinh(η)x
x′ = cosh(η)x + sinh(η)t (6.22)
become
τ ′ = cos(θ)τ + sin(θ)x
x′ = cos(θ)x − sin(θ)τ. (6.23)
We therefore have
$$D_E(\Lambda) = e^{\theta J^{01}} \quad (6.24)$$
for a Euclidean rotation by θ in the τ, x¹ plane. In Euclidean signature the rotation group SO(d) is a
compact group, and the finite-dimensional representations of such groups are always unitary. We therefore
see that $J^{01}$ must be anti-hermitian. $D^*(\Lambda)$ therefore analytically continues to
$$e^{-i\eta (J^{01})^*} = e^{i\eta (J^{01})^T} = e^{\theta (J^{01})^T} = D_E(\Lambda)^T. \quad (6.25)$$
We next need to understand how the hermitian conjugate of fields works in Euclidean signature. In
Lorentzian signature we have the convenient fact that we can take the hermitian conjugate before or after
time evolution and end up with the same thing:
as this is the quantity which analytically continues to O† (t) in Lorentzian signature. Therefore from the pre-
vious paragraph, in Euclidean signature we have the somewhat counter-intuitive symmetry transformations
To proceed further, we now change our basis of fields Φa (x) to diagonalize DE (RT ). Recall that this
is a unitary matrix, and since all fields have integer or half-integer spin it must obey DE (RT )4 = 1. Its
eigenvalues are therefore ±1 on fields of integer spin and ±i on fields of half-integer spin. We may then
observe that
where in the first line jϕ is the spin of Φ and we have used our Euclidean rotation rule (6.4) and also that if
Φ has integer spin both rotations contribute ±1 while if Φ has half-integer spin then they both contribute
±i. Note that (6.29) here is crucial in getting the factor of (−1)2jϕ , as it ensures Φ and Φ∗ contribute with
the same sign in front of i in the fermionic case. The second line then follows from the antisymmetry of the
time-ordered product for fermions, as explained below (6.6).
Finally we can complete the proof by showing that the correlation functions on both sides of (6.30) are
strictly positive: this implies the theorem because then we need
which means that when jϕ is an integer we must have fϕ = 0 while when jϕ is a half-integer we must have
fϕ = 1. It is easy to show that they are positive semidefinite, as they are the squared norms of states:
where here we have used (6.28). We will show later in the lecture that these norms cannot vanish, and once that is established the theorem is proved!46
It is instructive to consider how a naive version of this argument which doesn’t use special relativity can
fail. The idea of the naive argument is to do the same manipulation using a spatial rotation by π instead
of RT . We can derive the relation (6.30) just as before (except with the fields now being at ±xx̂), but the
failure mode is that we can no longer show that the correlators aren’t zero! Indeed in non-relativistic field
theory you can have a field Φ that only has an annihilation part and such a field can indeed annihilate the
vacuum.
46 More precisely what we showed is that Φ and Φ∗ commute/anticommute at spacelike separation if they have integer/half-
integer spin. In the homework you will show that this implies the same for Φ with Φ and Φ∗ with Φ∗ .
Figure 10: Re-interpreting the ground state wave function as a Euclidean transition amplitude in half of
space.
It is also instructive to compare this argument to the more conventional one given e.g. in Weinberg.
There one tries to construct free fields that create particles of arbitrary spin, finding out by brute force
that it is impossible to choose the coefficient functions ui and vi such that the field both transforms in
a valid representation of Spin+ (d − 1, 1) and commutes/anticommutes at spacelike separation unless the
spin-statistics relation is satisfied. The proof given here by contrast does not rely on free fields and is also
more intuitive. As in the case of the CRT theorem, any experimental demonstration of a violation of the
spin-statistics connection would be catastrophic for quantum mechanics and special relativity.
$$\mathcal{H} = \mathcal{H}_L \otimes \mathcal{H}_R. \quad (6.33)$$
The operators in the algebra A(L) are product operators of the form OL ⊗ IR , while the operators in the
algebra A(R) = A′ (L) are product operators of the form IL ⊗ OR .47 The ground state wave function is
computed by the Euclidean path integral in the region τ < 0:
$$\langle \phi_L \phi_R|\Omega\rangle \propto \int \mathcal{D}\phi\big|^{\phi_L,\phi_R} \exp\left[-\int_{-\infty}^{0} d\tau \int d^{d-1}x\, \mathcal{L}_E(\phi,\partial\phi)\right]. \quad (6.34)$$
The idea is to change our interpretation of this path integral from being split up on horizontal slices to being
split up on radial slices, as shown in figure 10. We thus have
of commute; we haven’t introduced enough fermion technology to deal with this yet so for now we’ll stick to bosons.
also sometimes called the Rindler Hamiltonian, and |n⟩ is a complete basis of KR eigenstates with
eigenvalues ωn . You can think of (6.35) as arising from applying our usual path integral derivation to
Euclidean evolution by the Rindler Hamiltonian, which generates rotation in the τ x1 plane. To turn (6.35)
into an expression for the ground state however we need to find a way to get ϕL into a bra instead of a ket.
We can do this by introducing a “partial CRT” operator $\Theta^R_{CRT}: \mathcal{H}_R \to \mathcal{H}_L$ which acts as
$$\Theta^R_{CRT}|\phi_R\rangle = |\phi_L'^*\rangle. \quad (6.37)$$
Here ϕ′L indicates the CRT transformation of ϕL , which is indeed a function of ϕR . This operator implements
CRT on operators in the left region, as we can check by noting that if x is in L we have:
$$\begin{aligned} \Phi(x)\,\Theta^R_{CRT}|\phi_R\rangle &= \Theta^R_{CRT}\,\Theta^{R\dagger}_{CRT}\,\Phi(x)\,\Theta^R_{CRT}|\phi_R\rangle \\ &= \Theta^R_{CRT}\,\Phi'(x)|\phi_R\rangle \\ &= \Theta^R_{CRT}\,\phi_L'(x)|\phi_R\rangle \\ &= \phi_L'^*(x)\,\Theta^R_{CRT}|\phi_R\rangle. \end{aligned} \quad (6.38)$$
In the first line here we used that $\Theta^R_{CRT}$ is antiunitary, the second line is just implementing CRT on Φ, in the third line we use that for bosonic theories Φ and Φ† are commuting so an eigenstate of Φ is also an eigenstate of Φ′, and in the fourth line we used that $\Theta^R_{CRT}$ is antilinear. From (6.13) we can rewrite (6.37) as
$$\Theta^R_{CRT}|\phi_R\rangle = |F_{RT}\phi_R\rangle, \quad (6.39)$$
and making the substitution $\phi_R = F_{RT}\phi_L$ and using that $F_{RT}^2 = 1$ on bosons we have
$$\Theta^{R\dagger}_{CRT}|\phi_L\rangle = |F_{RT}\phi_L\rangle. \quad (6.40)$$
We therefore have shown that in any relativistic quantum field theory the ground state has the simple
entangled form
$$|\Omega\rangle \propto \sum_n e^{-\pi\omega_n}\, \Theta^R_{CRT}|n\rangle \otimes |n\rangle, \quad (6.43)$$
which is called the Rindler decomposition. Stated heuristically, the Rindler eigenstates in the right region
are entangled with their CRT conjugates in the left region.48
do so, but I’ll mention that the result in that case becomes
$$|\Omega\rangle \propto \sum_n e^{-\pi\omega_n}\, i^{-F_L}\, \Theta^R_{CRT}|n\rangle \otimes |n\rangle, \quad (6.44)$$
n
where I have temporarily restored the unsightly dimensionful constants ℏ, c, and kB . This is a quite
remarkable statement, although not one which is easy to experience yourself. For example if we take a to be 9.8 m/s² we get
$$T_{\mathrm{Unruh}} \approx 4\times 10^{-20}\,\mathrm{K}. \quad (6.46)$$
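With the dimensionful constants restored, the Unruh temperature is $T_{\mathrm{Unruh}} = \hbar a/(2\pi c k_B)$; a quick numerical check that a = 9.8 m/s² reproduces (6.46):

```python
import math
from scipy.constants import hbar, c, Boltzmann

a = 9.8  # proper acceleration in m/s^2 (Earth's surface gravity)
T_unruh = hbar * a / (2 * math.pi * c * Boltzmann)
assert 3.9e-20 < T_unruh < 4.1e-20  # roughly 4e-20 K, as in (6.46)
```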
To derive this, we first note that an observer living in the right region can take the partial trace over the left
region, leading to a vacuum density matrix
$$\rho_R \propto \sum_n e^{-2\pi\omega_n}|n\rangle\langle n| = e^{-2\pi K_R}. \quad (6.47)$$
This is nothing but a thermal density matrix, but with “Hamiltonian” $K_R$ and “temperature” $T_K = \frac{1}{2\pi}$.
The world should therefore look thermal to someone whose proper time is proportional to the boost rapidity
η. From equation (6.22), we see that such a person should be moving on a trajectory
t(η) = x0 sinh η
x(η) = x0 cosh η. (6.48)
Note that this trajectory is the boost image of the point (0, x0 ). The proper time along this trajectory is
related to η by
τ = ηx0 , (6.49)
and we can compute the proper acceleration:

    a = √[(d²x/dτ²)² − (d²t/dτ²)²] = 1/x_0.    (6.50)
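We can also verify (6.50) numerically by finite-differencing the trajectory (6.48), using (6.49) to write η = τ/x_0. A small sketch (the value x_0 = 2 is an arbitrary choice for the check):

```python
import numpy as np

x0 = 2.0
tau = np.linspace(-1.0, 1.0, 20001)
t = x0 * np.sinh(tau / x0)   # boost trajectory (6.48) with eta = tau / x0
x = x0 * np.cosh(tau / x0)

d2t = np.gradient(np.gradient(t, tau), tau)
d2x = np.gradient(np.gradient(x, tau), tau)

# drop the endpoints, where np.gradient is only first-order accurate
a = np.sqrt(d2x[2:-2] ** 2 - d2t[2:-2] ** 2)
print(a.min(), a.max())  # both ~ 1/x0 = 0.5, constant along the trajectory
```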
The final expression here is a sum of positive semi-definite terms, so it can vanish only if each term vanishes.
Therefore if O annihilates the vacuum, all of its matrix elements must vanish; in other words O must itself
be zero. We note in passing that the Reeh-Schlieder property is precisely what we needed to complete our
proof of the spin-statistics theorem, so that theorem is now proved as well. We also note that this argument
actually proves something stronger: it shows that no operator which is an element of A[R] in some Lorentz
frame can annihilate the vacuum. For example any nonzero product of a finite number of local operators at
arbitrary points also cannot annihilate the vacuum, since by an appropriate spacetime translation we can put
all the operators into the domain of dependence of the right region R and the vacuum is translation-invariant.
The Reeh-Schlieder property has a rather surprising consequence: it implies that any state in the Hilbert
space can be obtained by acting on the vacuum with an operator which is supported only in the left region
L (by symmetry the same is of course true for the right region R, or more generally for the left or right
region in any Lorentz frame). The proof goes like this: suppose by contradiction that there is a nontrivial
subspace S ⊂ H which is orthogonal to all the states which can be written as O|Ω⟩ for some O ∈ A[L]. We
will argue that the projection P_S is a nonzero element of A[R] which annihilates |Ω⟩. By the Reeh-Schlieder
property this is not allowed, and so the subspace S must be zero-dimensional. The idea is to first consider
P_{S⊥} = 1 − P_S, which is the projection onto the subspace of states which can be created by acting on |Ω⟩
with elements of A[L]. For any O in A[L] we have

    O P_{S⊥} = P_{S⊥} O P_{S⊥},    (6.53)

and

    O† P_{S⊥} = P_{S⊥} O† P_{S⊥},    (6.54)

where in both cases the argument is that both sides of the equation act as zero on S and as O/O† on S⊥.
Taking the dagger of the second equation and combining them, we see that

    O P_{S⊥} = P_{S⊥} O,    (6.55)

and thus that P_{S⊥} is in the commutant of A[L]. By Haag duality this is equal to A[R], and so we have
P_S = 1 − P_{S⊥} ∈ A[R]. Moreover P_S clearly annihilates |Ω⟩ since |Ω⟩ ∈ S⊥.
We only proved the Reeh-Schlieder property for half-space regions, but in fact it is true for any region
which is not a complete time slice.49 In other words any operator which annihilates the vacuum cannot be in
A[R] for any region R that is not a complete time slice. The argument just given then implies an even more
shocking consequence: for any open spatial region R and any quantum state |ψ⟩, we can find an element O
of A[R] such that50
|ψ⟩ = O|Ω⟩. (6.56)
For example we can instantaneously create the moon by acting on the vacuum with an operator that has
support only in this classroom! This is a rather extreme example of what is called quantum teleportation.51
It is worth briefly mentioning some standard mathematical terminology which is used in discussing the
Reeh-Schlieder property. In von Neumann algebra language, a state |Ω⟩ with the property that it is not annihilated
by any nonzero element of a von Neumann algebra A is said to be separating for that algebra. Similarly a
state |Ω⟩ with the property that A|Ω⟩ is a dense set of states in the Hilbert space H is said to be a cyclic
state for A. What the Reeh-Schlieder property says is that in quantum field theory the vacuum is both
cyclic and separating for the algebra A[R] associated to any spatial region which isn't a complete time slice.
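These two notions are easy to illustrate in a finite-dimensional toy model (an aside, not a QFT computation): take H = C² ⊗ C², let the algebra of operators acting on the first factor play the role of A[R], and take a state of full Schmidt rank as the analogue of the entangled vacuum (6.43). A quick numpy sketch:

```python
import numpy as np
from itertools import product

# |Omega> = sum_n c_n |n>|n> with all Schmidt coefficients c_n nonzero
c = np.array([0.8, 0.6])
omega = np.zeros(4)
for n in range(2):
    e = np.zeros(2)
    e[n] = 1.0
    omega += c[n] * np.kron(e, e)

# act with the matrix units E_ij (a basis of the algebra) on the first factor
vectors = []
for i, j in product(range(2), range(2)):
    a = np.zeros((2, 2))
    a[i, j] = 1.0
    vectors.append(np.kron(a, np.eye(2)) @ omega)

# cyclic: the vectors (a (x) 1)|Omega> span all of H
rank = np.linalg.matrix_rank(np.array(vectors))
print("cyclic:", rank == 4)  # True

# separating: (a (x) 1)|Omega> has components a[i, n] c_n, which cannot all
# vanish for nonzero a since every c_n is nonzero
a = np.array([[0.3, -1.2], [0.7, 0.5]])
v = np.kron(a, np.eye(2)) @ omega
print("separating (nonzero image):", np.linalg.norm(v) > 1e-12)  # True
```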
49 Unfortunately I’m not aware of a simple proof of this generalization, except in the special case of conformal field theories.
50 This statement isn’t actually quite true: if we are careful about infinite-dimensional Hilbert spaces, what we find from the
proof in the previous paragraph is that we can create a state which is as close to |ψ⟩ as we like in the Hilbert space norm. A
mathematician would describe this situation by saying that the set A[R]|Ω⟩ is dense in the Hilbert space H.
51 To be clear, the operator which does this is not unitary so we can’t use it to communicate faster than light. This seeming
non-locality is of the EPR variety, rather than the worse non-locality we found in the first lecture by trying to quantize a
relativistic quantum particle.
6.6 Homework
1. Using (6.16) and also the spin-statistics theorem, show that Θ²_CRT = 1.
2. Check that the complex scalar action is invariant under CRT .
3. Show that the Lagrangian density

    L = −(1/4)(∂_μ V_ν − ∂_ν V_μ)(∂^μ V^ν − ∂^ν V^μ) − (m²/2) V^μ V_μ    (6.57)

is also invariant under CRT.
4. Let’s model the hydrogen atom by a classical electron orbiting the proton in a circle whose radius is
the Bohr radius a0 = 5 × 10−11 m. What is the Unruh temperature experienced by the electron? How
does it compare to the binding energy of hydrogen?
5. Show that if Φ(x)Φ† (y) ± Φ† (y)Φ(x) = 0 at spacelike separation, then we also have Φ(x)Φ(y) ±
Φ(y)Φ(x) = 0 and Φ† (x)Φ† (y) ± Φ† (y)Φ† (x) = 0 at spacelike separation. Hint: you should assume
that Φ(x)Φ(y) + sΦ(y)Φ(x) = 0 with either s = 1 or s = −1, and then show that s needs to be the
same sign as appears in Φ(x)Φ† (y) ± Φ† (y)Φ(x) = 0. I recommend considering the norm of the state
Φ(x)Φ(y)|Ω⟩, and you will need to use the Reeh-Schlieder property and also that as (x − y)2 → +∞
we have
⟨Φ† (x)Φ(x)Φ† (y)Φ(y)⟩ → ⟨Φ† (x)Φ(x)⟩⟨Φ† (y)Φ(y)⟩, (6.58)
which is an example of what is called cluster decomposition. In general cluster decomposition says
that the connected correlation functions of local operators should always decay at large separation, in
this case the connected two-point function of the composite operator Φ† Φ.
7 Perturbative calculation of correlation functions in interacting
theories
So far our results in this class have fallen into two categories:

• General formal results (such as the CRT and spin-statistics theorems) which are valid in any relativistic
quantum field theory.

• Exact calculations which are possible only in free field theory.

Free field theory is quite useful for getting an initial picture of how quantum field theory works, and formal
results are of course important for understanding the general structure of quantum field theory, but at the
end of the day most field theories are not free and formal results won't get us to detailed predictions that
can be quantitatively compared to experiment. It is time for us to learn how to do some explicit calculations
in field theories that are not free.
The simplest interacting field theory is called ϕ4 theory, and its Lagrangian density is given by
    L = −(1/2) ∂_μϕ ∂^μϕ − (m²/2) ϕ² − (λ/4!) ϕ⁴.    (7.1)
It must be acknowledged from the outset that no analytic solution of this theory is known. It is not difficult
to see why it cannot be solved using the methods we have discussed so far: the Heisenberg equation of
motion
    (∇² − m²)Φ = (λ/6) Φ³    (7.2)
is non-linear, and thus cannot be solved using the Fourier transform, and the path integral
    ∫ Dϕ e^{i ∫ d^dx L}    (7.3)
is not Gaussian so we can’t compute it using our Gaussian tricks. In fact there is an even more severe problem:
for d ≥ 4 this model is widely expected to not even have a continuum limit: it can only be defined precisely
in the presence of a finite UV cutoff such as a lattice. Nonetheless there is much to be gained by studying
this model, and the key idea that will allow us to make progress is perturbation theory: we treat the
parameter λ, called the coupling constant, as small, and then we compute interacting correlation functions
as power series in λ about their free field values. There is a beautiful diagrammatic way of organizing such
calculations, called Feynman diagrams, which we will meet for the first time in this lecture. Perturbative
calculations using Feynman diagrams are the central focus of a large fraction of the practicing quantum field
theorists in the world, especially those working in particle physics, and developing a good intuition for them
is essential for any aspiring theoretical physicist (or any aspiring particle experimentalist).
As a warm-up, consider the one-dimensional analogue of this integral,

    f(λ) = (1/√(2π)) ∫_{−∞}^{∞} dx e^{−x²/2 − λx⁴/4!}.    (7.5)

This integral diverges for any λ < 0, so f(λ) cannot be analytic at λ = 0, but our approach here will be to ignore
this and try to approximate f(λ) when λ ≪ 1. The idea is to Taylor
expand the “interaction” term, which allows us to rewrite the integral as a sum over Gaussian moments:
    f(λ) = (1/√(2π)) ∫_{−∞}^{∞} dx e^{−x²/2} Σ_{n=0}^{∞} (1/n!)(−λx⁴/4!)^n

       “=” (1/√(2π)) Σ_{n=0}^{∞} (1/n!)(−λ/4!)^n ∫_{−∞}^{∞} dx x^{4n} e^{−x²/2}    (7.6)
I’ve put the equality in quotes in the second line since we have recklessly exchanged the order of summation
and integration, a sin for which we will shortly pay a price. Proceeding boldly ahead in the meantime, we
can be encouraged by the fact that the terms in the sum are suppressed by higher powers of λ as n increases,
and so we can hope that truncating this sum to the first few terms gives a good approximation to f (λ) when
λ is small. The easiest way to evaluate these Gaussian moments is to remember the integral definition
    Γ(y) = ∫_0^∞ ds s^{y−1} e^{−s}    (7.7)
Figure 11: Comparing the first few terms in perturbation theory to the exact answer. What is plotted here
is the ratio of the partial sum of the first few terms to the exact answer; for λ < .3 the first order result
already brings us within a percent of the right answer, and including higher order terms gets us even closer.
To show that this series is indeed asymptotic, note that we can legally move a finite number of the terms in
the sum past the integral to get

    f(λ) = (1/√π) Σ_{n=0}^{N−1} (1/n!) Γ(2n + 1/2)(−λ/6)^n + (1/√(2π)) ∫_{−∞}^{∞} dx e^{−x²/2} Σ_{n=N}^{∞} (1/n!)(−λx⁴/4!)^n,    (7.13)

and therefore

    f(λ) − (1/√π) Σ_{n=0}^{N−1} (1/n!) Γ(2n + 1/2)(−λ/6)^n = (−λ/4!)^N (1/√(2π)) ∫_{−∞}^{∞} dx e^{−x²/2} Σ_{m=0}^{∞} (1/(m+N)!)(−λ/4!)^m x^{4(m+N)}.    (7.14)
In (7.14) we relabeled the sum to pull out an overall factor of (−λ/4!)^N. The thing it multiplies
approaches a constant as λ → 0, so the error of the series is indeed of order λ^N at small λ.
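This behavior is easy to see numerically. The following sketch compares the partial sums against a direct numerical evaluation of the integral; the coefficient formula a_n = Γ(2n + 1/2)(−1/6)^n/(√π n!) is read off from the expansion above, and the cutoff of 12 terms is an arbitrary choice for the illustration:

```python
import math
from scipy.integrate import quad

def f_exact(lam):
    # direct numerical evaluation of the toy integral being expanded
    val, _ = quad(lambda x: math.exp(-x * x / 2 - lam * x**4 / 24),
                  -math.inf, math.inf)
    return val / math.sqrt(2 * math.pi)

def term(n, lam):
    # n-th term of the asymptotic series, as in (7.13)
    return (math.gamma(2 * n + 0.5) / (math.sqrt(math.pi) * math.factorial(n))
            * (-lam / 6) ** n)

lam = 0.3
exact = f_exact(lam)
partial, errors = 0.0, []
for n in range(12):
    partial += term(n, lam)
    errors.append(abs(partial - exact))
print(errors)  # shrinks at first, then starts growing again
```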
We can understand the implications of the asymptotic nature of this series as follows: the series will not
begin to diverge until we get to large enough n that

    nλ ∼ 1,    (7.15)

or in other words

    n ∼ 1/λ.    (7.16)
At this point the terms are of order

    ϵ_min = c^{1/λ} = e^{−log(1/c)/λ},    (7.17)

where c is some O(1) constant which is less than one. ϵ_min is the most accurate that the perturbation series
can be; after this, including more terms only causes the error to get larger. We illustrate this qualitative
behavior in figure 12. Effects which are of order ϵmin or smaller are typically referred to as non-perturbative
effects, and in situations where they are of interest we need to use methods that go beyond perturbation
theory. For reasonable values of λ however this minimal error can be quite small, for example in quantum
electrodynamics we have λ ≈ 1/137 so the QED perturbation series should be good up to an unrecoverable
error which is of order

    ϵ_min ∼ e^{−137}.    (7.18)
Figure 12: The qualitative behavior of perturbation theory: adding more terms to the series increases the
accuracy until we get to N ∼ 1/λ terms, at which point the error of the series is of order e^{−#/λ}. After this
the series begins to diverge and the approximation gets worse and worse. In the plot label a_n indicates the
coefficient of λ^n in the perturbative expansion for f(λ).
I’d say this is close enough for most practical purposes! From now on we will therefore use perturbation
theory without further handwringing about its validity, except in non-perturbative situations where we are
indeed interested in effects of order ϵmin .52
paired with the B it acts on. The number of such pairings is
    N_m := m!/(2^{m/2} (m/2)!),    (7.22)
since we can choose the first element of the first pair, then the second element of the first pair, and so on
down to the second element of the (m/2)nd pair (m! ordered choices in all), and then we need to divide by a
factor of two for each pair since the order within a pair doesn't matter, and also divide by the number of
permutations of the pairs. Therefore we have
    (1/√(2π)) ∫_{−∞}^{∞} dx x^m e^{−x²/2} = N_m for m even, and 0 for m odd.    (7.23)
This of course is equal to what we found using the Γ function (with the replacement m = 4n), as you will
check on the homework.
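The counting (7.22) can also be checked by brute force for small m. A small illustrative sketch (the enumeration helper is hypothetical checking code, not from the text):

```python
import math

def pairings(items):
    """Enumerate all perfect matchings of a list, recursively."""
    if not items:
        return [[]]
    first, rest = items[0], items[1:]
    out = []
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for p in pairings(remaining):
            out.append([(first, partner)] + p)
    return out

for m in (2, 4, 6, 8):
    n_brute = len(pairings(list(range(m))))
    n_formula = math.factorial(m) // (2 ** (m // 2) * math.factorial(m // 2))
    print(m, n_brute, n_formula)  # counts agree: 1, 3, 15, 105
```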
In quantum field theory we are really interested in multi-dimensional Gaussian integrals, which we found
obey

    √(Det(A/2π)) ∫ dx exp(−(1/2)x^T A x + B^T x) = exp((1/2)B^T A^{-1} B)    (7.24)

and thus

    √(Det(A/2π)) ∫ dx x_{i_1} ... x_{i_m} e^{−(1/2)x^T A x} = [∂/∂B_{i_1} ... ∂/∂B_{i_m} e^{(1/2)B^T A^{-1} B}]|_{B=0}.    (7.25)
We can think of this as the m-point correlation function in the Gaussian distribution. To compute it we
again can observe that each derivative again does one of two things, which now are to bring down a factor
of A−1 B or to take the derivative of the existing prefactor, so we have
    ∂/∂B_{i_1} ... ∂/∂B_{i_m} e^{(1/2)B^T A^{-1} B} = e^{(1/2)B^T A^{-1} B} (Σ_{j_1} A^{-1}_{i_1 j_1} B_{j_1} + ∂/∂B_{i_1}) ... (Σ_{j_m} A^{-1}_{i_m j_m} B_{j_m} + ∂/∂B_{i_m}) × 1.    (7.26)
As before we can only get a term that survives taking B = 0 if each partial derivative is paired with a B to
its right, so the integral again vanishes for odd m while for even m we have
    √(Det(A/2π)) ∫ dx x_{i_1} ... x_{i_m} e^{−(1/2)x^T A x} = Σ_P Π_{(j,k)∈P} A^{-1}_{i_j i_k}.    (7.27)
Here P indicates pairings of 1, . . . , m. As before there are Nm such pairings, but now they can make different
contributions to the integral. For example for m = 4 we have
    √(Det(A/2π)) ∫ dx x_{i_1} x_{i_2} x_{i_3} x_{i_4} e^{−(1/2)x^T A x} = A^{-1}_{i_1 i_2} A^{-1}_{i_3 i_4} + A^{-1}_{i_1 i_3} A^{-1}_{i_2 i_4} + A^{-1}_{i_1 i_4} A^{-1}_{i_2 i_3}.    (7.28)
We are now ready for our first meeting with Feynman diagrams. These are simply a graphical way of
representing the different pairings appearing on the right side of equation 7.27. The idea is quite trivial: we
draw a dot for each x_i appearing in the correlation function, and then we draw lines connecting them to
indicate the pairing. Each line connecting a pair contributes a “propagator” A^{-1}. The m = 4 case is shown in figure 13.
Figure 13: Feynman diagrams for the four-point function in the Gaussian distribution.
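As a sanity check of (7.28), we can compare the three-pairing formula against a Monte Carlo estimate of a fourth moment; the 2 × 2 matrix A below is an arbitrary toy choice for the check:

```python
import numpy as np

rng = np.random.default_rng(1)
A = np.array([[2.0, 0.5], [0.5, 1.5]])
Ainv = np.linalg.inv(A)  # covariance of the Gaussian distribution

# sample <x_{i1} x_{i2} x_{i3} x_{i4}> with indices (0, 0, 1, 1)
samples = rng.multivariate_normal(np.zeros(2), Ainv, size=2_000_000)
i1, i2, i3, i4 = 0, 0, 1, 1
mc = np.mean(samples[:, i1] * samples[:, i2] * samples[:, i3] * samples[:, i4])

# sum over the three pairings of (7.28)
wick = (Ainv[i1, i2] * Ainv[i3, i4] + Ainv[i1, i3] * Ainv[i2, i4]
        + Ainv[i1, i4] * Ainv[i2, i3])
print(mc, wick)  # agree to Monte Carlo accuracy
```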
Now let's add a quartic “interaction” to our multi-dimensional Gaussian integral, defining

    f(λ) = √(Det(A/2π)) ∫ dx e^{−(1/2)x^T A x − (λ/4!)Σ_i x_i^4},    (7.29)

which you can think of as a simple model of the interacting ϕ⁴ theory we began the lecture with. The
perturbative expansion for this integral is

    f(λ) ∼ √(Det(A/2π)) Σ_{n=0}^{∞} (1/n!)(−λ/4!)^n Σ_{i_1...i_n} ∫ dx x_{i_1}^4 ... x_{i_n}^4 e^{−(1/2)x^T A x},    (7.30)
and we can evaluate these integrals using our pairing formula (7.27). We now meet a new phenomenon however,
which is that many of the pairings give the same answer due to the repeated indices in the interaction.
For example the first order n = 1 contribution to the series is
    −(λ/4!) Σ_i √(Det(A/2π)) ∫ dx x_i^4 e^{−(1/2)x^T A x} = −(λ/4!) × 3 × Σ_i (A^{-1}_{ii})²
                                                          = −(λ/8) Σ_i (A^{-1}_{ii})²,    (7.31)
where all three pairings appearing in (7.28) contribute equally. The second order contribution has three
distinct kinds of pairings: those where each interaction has two self-pairings, those where each interaction
has one self-pairing, and those where there are no self-pairings. These lead to
    (λ²/(2(4!)²)) Σ_{ij} √(Det(A/2π)) ∫ dx x_i^4 x_j^4 e^{−(1/2)x^T A x}
        = (λ²/(2(4!)²)) Σ_{ij} [9(A^{-1}_{ii})²(A^{-1}_{jj})² + 72 A^{-1}_{ii} A^{-1}_{jj} (A^{-1}_{ij})² + 24(A^{-1}_{ij})⁴]
        = λ² Σ_{i,j} [(1/128)(A^{-1}_{ii})²(A^{-1}_{jj})² + (1/16) A^{-1}_{ii} A^{-1}_{jj} (A^{-1}_{ij})² + (1/48)(A^{-1}_{ij})⁴],    (7.32)
where the factors of 9, 72, and 24 count how many pairings there are of each type. Counting these pairings
takes a bit of practice to get used to; we illustrate the idea in figure 14.
The diagrams in figure 14 are useful for counting pairings, but it is also useful to have a simpler set
of diagrams which are designed so that the same diagram automatically represents all the pairings in each
equivalence class. Following Feynman, the idea is to combine all the dots appearing in each factor of the
interaction Σ_i x_i^4 into a single interaction vertex, giving us the Feynman diagram expansion. See figure 15
for the set of Feynman diagrams contributing to f(λ) up through order λ². In terms of these diagrams we
can rewrite our asymptotic series for f(λ) as
    f(λ) ∼ 1 + Σ_D (−λ)^{n_D} (1/s_D) Σ_{i_1...i_{n_D}} Π_{(m,ℓ)∈L_D} A^{-1}_{i_m i_ℓ},    (7.33)

where m and ℓ label the interaction vertices of the diagram, n_D indicates the number of interaction vertices
in D, L_D indicates the set of (unoriented) links in D, and s_D is called the symmetry factor of the diagram
and is given by

    s_D = n_D!(4!)^{n_D}/p_D,    (7.34)
Figure 14: Counting pairings at first and second order. For the n = 1 pairings, we need to pick which of the
three other i's to pair the first i with. For the n = 2 pairings where each interaction has two self-pairings, we
need to make this choice independently for each interaction. For the n = 2 pairings where both interactions
have a single self-pairing, for each interaction we need to pick which two of the four i's are self-paired, and
then there are two ways to do the remaining pairings. For the n = 2 pairings with no self-pairings, we need
to pick which of the four j's pairs with the first i, which of the remaining three j's pairs with the second i,
and which of the remaining two j's pairs with the third i.
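The counts 9, 72, and 24 can be verified by brute force: enumerate all pairings of the eight dots of two x⁴ interactions and classify them by the number of links connecting the two vertices. A small illustrative script (not from the text):

```python
def pairings(items):
    """Enumerate all perfect matchings of a list, recursively."""
    if not items:
        return [[]]
    first, rest = items[0], items[1:]
    out = []
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for p in pairings(remaining):
            out.append([(first, partner)] + p)
    return out

# dots 0-3 belong to vertex i, dots 4-7 to vertex j
counts = {}
for p in pairings(list(range(8))):
    cross = sum(1 for (a, b) in p if (a < 4) != (b < 4))
    counts[cross] = counts.get(cross, 0) + 1
print(counts)  # {0: 9, 2: 72, 4: 24} -- 105 pairings in total
```

Plugging these counts into (7.34) reproduces the symmetry factors quoted in figure 15, for example 2!(4!)²/72 = 16 and 2!(4!)²/24 = 48.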
Figure 15: Feynman diagrams contributing to f (λ) up through order λ2 . As we saw above, the symmetry
factors for these diagrams are sD = 1, 8, 128, 16, and 48.
with p_D the number of pairings which give rise to this diagram as in figure 14. Except for s_D, all factors in
(7.33) are easy to read off by visual inspection of D, so Feynman diagrams give a powerful way of immediately
seeing what is going on at each order in perturbation theory. There is actually also a way to compute s_D
directly from the diagram: it is the size of the automorphism group of the diagram. But as long as you
do not intend to become a high-order amplitudes expert it is easy enough (and perhaps safer) to just use
the method of figure 14 to compute p_D.53 In more realistic theories where the interaction vertices are less
symmetric we conveniently often have s_D = 1.
Let us call the total contribution of a diagram D to (7.33), including the coupling and symmetry factors,
the “value” V_D of the Feynman diagram. If D is disconnected then most of the factors in V_D are just products of
the analogous terms for its connected components, but we need to be careful about the symmetry factor.
Indeed let’s say that a disconnected diagram D has connected components C1 , C2 , . . . , CM , which we will
momentarily take to be all distinct from each other. We can write the pairing number pD of the full
disconnected diagram as
    p_D = C(n_D, n_{C_1}) C(n_D − n_{C_1}, n_{C_2}) · · · C(n_D − n_{C_1} − · · · − n_{C_{M−1}}, n_{C_M}) × p_{C_1} · · · p_{C_M}
        = (n_D!/(n_{C_1}! · · · n_{C_M}!)) × p_{C_1} · · · p_{C_M},    (7.37)
where the combinatoric factors account for the number of ways we can choose which interaction vertices get
assigned to which connected components, and we then multiply by the number of pairings we can do within
each component. If the diagrams appear with repetitions, say m_a repetitions of C_a, then we need to divide
by additional factors of m_a! since exchanging identical connected components of a pairing gives the same
pairing. We thus in general have

    p_D = (n_D!/((n_{C_1}!)^{m_1} · · · (n_{C_M}!)^{m_M})) × (1/(m_1! · · · m_M!)) × (p_{C_1})^{m_1} · · · (p_{C_M})^{m_M},    (7.38)
53 This interpretation of S_D is actually the reason we included the factor of 1/4! in the interaction vertex. The basic idea is
that n_D!(4!)^{n_D} gives an “estimate” of how many pairings there are with a given diagram topology, since permuting the n_D
vertices and permuting which of the four dots at each vertex get attached to other dots can't change the graph topology. This
sometimes is an overestimate however, as whenever the graph has an automorphism then acting on a pairing with it gives the
same pairing. Therefore S_D is precisely counting the number of such automorphisms. For example in the second diagram of
figure 15 there are three Z₂ automorphisms: one that reflects the top lobe, one that reflects the bottom lobe, and one that
exchanges the two lobes. We therefore have S_D = 8. Similarly for the fifth diagram there is a four-fold permutation symmetry
of the links, as well as a Z₂ symmetry that exchanges the two vertices, so we have S_D = 4! × 2 = 48.
Figure 16: Feynman diagrams for computing the numerator of (7.42) with two external points. Dividing
by the denominator of (7.42) removes all disconnected diagrams. The symmetry factors of the connected
diagrams here are sD = 1, 2, 6, 4, and 4.
and therefore

    1/S_D = p_D/(n_D!(4!)^{n_D}) = (1/(S_{C_1}^{m_1} · · · S_{C_M}^{m_M})) × (1/(m_1! · · · m_M!)).    (7.39)
We can therefore write the value of D as

    V_D = Π_C (V_C)^{m_C}/m_C!.    (7.40)
Finally we can observe that these are precisely the coefficients with which these values appear in

    e^{Σ_C V_C} = Π_C e^{V_C} = Π_C Σ_{m_C} (V_C)^{m_C}/m_C!,    (7.41)

so the sum over all diagrams exponentiates: f(λ) is asymptotic to the exponential of the sum of the values
of the connected diagrams. The same machinery applies to perturbative correlation functions in this model,
defined by

    ⟨x_{i_1} ... x_{i_M}⟩ = [∫ dx x_{i_1} ... x_{i_M} e^{−(1/2)x^T A x − (λ/4!)Σ_i x_i^4}] / [∫ dx e^{−(1/2)x^T A x − (λ/4!)Σ_i x_i^4}].    (7.42)

We already know how to compute the denominator perturbatively: it is the exponential of the sum of
connected Feynman diagrams with only interaction vertices (divided by a factor of √(Det(A/2π)) that will
cancel with the same factor in the numerator). Let's think about how to compute the numerator. The
perturbation series for the numerator is

    √(Det(A/2π)) ∫ dx x_{i_1} ... x_{i_M} e^{−(1/2)x^T A x − (λ/4!)Σ_i x_i^4}
        ∼ √(Det(A/2π)) Σ_n (1/n!)(−λ/4!)^n ∫ dx x_{i_1} ... x_{i_M} (Σ_i x_i^4)^n e^{−(1/2)x^T A x}
        = Σ_n (1/n!)(−λ/4!)^n Σ_{i_{M+1}...i_{M+n}} Σ_P Π_{(j,k)∈P} A^{-1}_{i_j i_k},    (7.43)
where in the second line I've labeled the n interaction vertices as i_{M+1}, ..., i_{M+n}. In such calculations the
x_{i_a} with a ∈ (1, M) are referred to as “external” and the i_a with a ∈ (M + 1, M + n) are referred to as
“internal” or “interaction”.54
54 In this terminology the denominator of (7.42) (multiplied by √(Det(A/2π))) is the exponential of the sum of connected
diagrams with no external legs, also sometimes called the sum over “vacuum bubbles”.
Figure 17: Feynman diagrams contributing to the four-point function up through O(λ). Note however that
the second two rows are all really just incorporating corrections to the two point functions in the first row;
it is only the fourth row that is a “genuinely four-point” contribution.
As in the previous subsection we can group this sum over pairings into equivalence classes labeled by
Feynman diagrams, with the diagrams contributing to the two-point function through second order shown
in figure 16. In general in terms of diagrams we have
    √(Det(A/2π)) ∫ dx x_{i_1} ... x_{i_M} e^{−(1/2)x^T A x − (λ/4!)Σ_i x_i^4} ∼ Σ_D (−λ)^{n_D} (1/S_D) Σ_{i_{M+1}...i_{M+n_D}} Π_{(m,ℓ)∈L_D} A^{-1}_{i_m i_ℓ},    (7.44)
where now nD is the number of interaction vertices, m and ℓ run over the links of the diagram including
links to external points, and SD is again the symmetry factor
    S_D = n_D!(4!)^{n_D}/p_D    (7.45)
with p_D the number of pairings of the external and interaction dots that give rise to the diagram D. As before we can
interpret SD as counting the automorphisms of the diagram, now restricting to those automorphisms which
keep the external points fixed. We also have an exponentiation result: the numerator of (7.42) is equal to
the sum over diagrams where all interaction vertices are connected to at least one external point times the
exponential of the sum over connected vacuum bubbles. The second factor just cancels the denominator
(7.42), so we then have
    ⟨x_{i_1} ... x_{i_M}⟩ = Σ_{Ĉ} (−λ)^{n_Ĉ} (1/S_Ĉ) Σ_{i_{M+1}...i_{M+n_Ĉ}} Π_{(m,ℓ)∈L_Ĉ} A^{-1}_{i_m i_ℓ}    (7.46)
where Ĉ runs over the set of diagrams where each interaction vertex is connected to at least one external
point. The first few such diagrams for the four-point function are shown in figure 17. Note that these
diagrams still are not all connected, essentially because there are diagrams which amount to just correcting
Figure 18: Feynman diagrams for computing the connected four-point function, including all contributions
up through λ2 . The symmetry factor of the first diagram is one and the symmetry factors for the others are
all two.
the two-point functions appearing in figure 13 rather than giving “genuinely four-point” contributions. To
focus on the latter, we should look at the connected four-point function, which is defined by
⟨xi1 . . . xi4 ⟩c = ⟨xi1 . . . xi4 ⟩ − ⟨xi1 xi2 ⟩⟨xi3 xi4 ⟩ − ⟨xi1 xi3 ⟩⟨xi2 xi4 ⟩ − ⟨xi1 xi4 ⟩⟨xi2 xi3 ⟩. (7.47)
More generally the connected M -point function ⟨xi1 . . . xiM ⟩c is defined recursively by55
    ⟨x_{i_1} ... x_{i_M}⟩ = Σ_S ⟨Π_{j∈S_1} x_j⟩_c ... ⟨Π_{j∈S_L} x_j⟩_c,    (7.48)
where the sum is over partitions S of M into parts S1 , . . . , SL . This defines ⟨xi1 . . . xiM ⟩c in terms of lower-
point connected correlation functions and the full correlation function ⟨xi1 . . . xiM ⟩. Recursing down to the
lowest level, we take ⟨xi ⟩c = ⟨xi ⟩. Forgetting for a moment that in this theory the odd moments of xi vanish,
the first few explicit solutions of this definition are
    ⟨x_i⟩_c = ⟨x_i⟩
    ⟨x_i x_j⟩_c = ⟨x_i x_j⟩ − ⟨x_i⟩⟨x_j⟩
    ⟨x_i x_j x_k⟩_c = ⟨x_i x_j x_k⟩ − ⟨x_i x_j⟩⟨x_k⟩ − ⟨x_i x_k⟩⟨x_j⟩ − ⟨x_j x_k⟩⟨x_i⟩ + 2⟨x_i⟩⟨x_j⟩⟨x_k⟩.    (7.49)
In practice however the definition (7.48) is more useful, as it shows that what the connected correlation
function is really doing is removing all parts of the full correlation function which are mere products of
lower-point correlation functions. Said differently, it builds up the full correlation function out of connected
components in precisely the same way as Feynman diagrams do. We therefore can express the connected
correlation function as a sum over connected diagrams only:
    ⟨x_{i_1} ... x_{i_M}⟩_c = Σ_C (−λ)^{n_C} (1/S_C) Σ_{i_{M+1}...i_{M+n_C}} Π_{(m,ℓ)∈L_C} A^{-1}_{i_m i_ℓ},    (7.50)
where now the sum is over genuinely connected diagrams C. We show the first few diagrams contributing
to the connected four-point function in figure 18.
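The recursive definition can be checked concretely in the single-variable case, using the equivalent standard moment/cumulant recursion κ_n = m_n − Σ_{k=1}^{n−1} C(n−1, k−1) κ_k m_{n−k} (this particular form of the recursion is standard but not derived in the text). For a Gaussian, all connected functions above the second must vanish:

```python
from math import comb

def cumulants(moments):
    """moments[n-1] = <x^n>; returns the list of cumulants kappa_n."""
    kappa = []
    for n in range(1, len(moments) + 1):
        k_n = moments[n - 1]
        for k in range(1, n):
            k_n -= comb(n - 1, k - 1) * kappa[k - 1] * moments[n - k - 1]
        kappa.append(k_n)
    return kappa

# Gaussian with mean 2 and variance 3: moments <x>, <x^2>, <x^3>, <x^4>
mu, s2 = 2, 3
moments = [mu, mu**2 + s2, mu**3 + 3 * mu * s2, mu**4 + 6 * mu**2 * s2 + 3 * s2**2]
print(cumulants(moments))  # [2, 3, 0, 0]: connected 3- and 4-point functions vanish
```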
Already some patterns may be apparent in the diagrams we have discussed. Let's emphasize two of them:

• The number of diagrams grows quite rapidly as we go to higher orders in λ. Roughly speaking it grows
like some power of n_D!, since the total number of pairings grows like this and the symmetry factors grow
too slowly to make up for it (after all, generic diagrams should have few symmetries). This growth is
consistent with the idea that the series should be divergent, since a power of n_D! will always eventually
beat λ^{n_D}. It also means that computing higher-order Feynman diagrams is a rather laborious process,
requiring many clever tricks to make progress.

• For a fixed number of external legs, as we go to higher order the number of loops in the diagram
increases by one for each power of λ. Diagrams are thus often classified by the number of loops
rather than the number of interaction vertices, as it is really the number of loops that determines the
complexity of evaluating individual diagrams. Connected diagrams with interaction vertices but no
loops are called tree diagrams, while higher loop diagrams are referred to as one-loop diagrams,
two-loop diagrams, and so on. Most theoretical physicists these days never need to evaluate a
diagram with more than one loop, so in this class our focus will be on computing tree and one-loop
diagrams rather than developing machinery for higher loop computations.

55 In other contexts the connected correlation functions are called “cumulants” or “Ursell functions”.
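The loop counting in the second point can be made quantitative with the standard graph identity L = I − V + 1 for a connected diagram (a standard fact, not derived in the text): a ϕ⁴ vertex has four legs, so a connected diagram with V vertices and E external legs has I = (4V − E)/2 internal lines. A one-line check:

```python
def loops(V, E):
    """Loop count of a connected phi^4 diagram with V vertices, E external legs."""
    I = (4 * V - E) // 2   # internal lines: each vertex has 4 legs, each line 2 ends
    return I - V + 1       # standard Euler-type identity for connected graphs

print([loops(V, 4) for V in (1, 2, 3)])  # [0, 1, 2]: one more loop per vertex
```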
We now apply this machinery to the Euclidean path integral of the interacting theory, with the λϕ⁴/4! interaction
added to the Euclidean action of the free massive scalar field (ϕ̇ indicates the derivative with respect to the Euclidean time τ). The
perturbative evaluation of this path integral is precisely the same as in the previous section, leading to
    ⟨T ϕ(x_1) ... ϕ(x_M)⟩_c ∼ Σ_C (−λ)^{n_C} (1/S_C) ∫ d^dx_{M+1} ... ∫ d^dx_{M+n_C} Π_{(m,ℓ)∈L_C} G_E(x_m − x_ℓ),    (7.53)
where

    G_E(x) = ∫ (d^dp/(2π)^d) e^{ip·x}/(p² + m²)    (7.54)
is the Euclidean propagator. The two-point function is thus computed by the sum of the connected subset
of the diagrams appearing in figure 16, while the four-point function is computed by the sum of connected
diagrams appearing in figure 18. So for example the first few terms in the expansion for the Euclidean
two-point function are
    ⟨T ϕ(x_1)ϕ(x_2)⟩ = G_E(x_2 − x_1) − (λ/2) ∫ d^dx_3 G_E(x_1 − x_3) G_E(x_2 − x_3) G_E(0)
        + λ² ∫ d^dx_3 d^dx_4 [ (1/6) G_E(x_1 − x_3) G_E(x_2 − x_4) G_E(x_3 − x_4)³
        + (1/4) G_E(x_1 − x_3) G_E(x_3 − x_4) G_E(x_2 − x_4) G_E(0)²
        + (1/4) G_E(x_1 − x_3) G_E(x_2 − x_3) G_E(x_3 − x_4)² G_E(0) ]
        + ....    (7.55)
You may be alarmed by the factors of GE (0) = ∞ in the one-loop and two-loop contributions to this
formula. These are further “UV divergences” of the type we met already in computing the Hamiltonian in free
field theory. There we saw the divergence could be absorbed into a redefinition of the cosmological constant
via a process we called renormalization. We’ll eventually see that we can also absorb the divergences here
into a redefinition of the particle mass m and a rescaling of the field ϕ. To get a first sense of the former we
can observe that a change δm² in the mass squared corrects the Euclidean propagator as

    δG_E(x) = −∫ (d^dp/(2π)^d) (δm²/(p² + m²)²) e^{ip·x},    (7.56)
To get perturbative expressions for time-ordered Lorentzian correlation functions, we should rotate τ =
i(1 − iϵ)t in all external locations x_1, ..., x_M and also in all interaction locations x_{M+1}, ..., x_{M+n_C}. This
has two effects: it converts all Euclidean propagators to Feynman propagators

    G_F(x) = ∫ (d^dp/(2π)^d) (−i e^{ip·x})/(p² + m² − iϵ),    (7.61)
and provides an extra factor of i^{n_C} from the dτ factors in the integrals over interaction locations. Thus we
have the Lorentzian formula

    ⟨T ϕ(x_1) ... ϕ(x_M)⟩_c ∼ Σ_C (−iλ)^{n_C} (1/S_C) ∫ d^dx_{M+1} ... ∫ d^dx_{M+n_C} Π_{(m,ℓ)∈L_C} G_F(x_m − x_ℓ).    (7.62)
In these calculations the exponentiated sum over connected bubble diagrams canceled between the numerator
and denominator of (7.51). It is worth mentioning however that this sum does have a physical
interpretation: it renormalizes the cosmological constant. To see this, note that any connected bubble dia-
gram will be proportional to the volume of spacetime since there is a symmetry of translating all interaction
vertices by the same amount. We can therefore view each connected diagram with no external legs as giving
a contribution to the cosmological constant.
Figure 19: Momentum labels for a one-loop contribution to the four-point function.
As an example, the diagram of figure 19 gives the following contribution to the connected four-point function:

    ⟨T ϕ(x_1) ... ϕ(x_4)⟩_c ⊃ ((−iλ)²/2) ∫ (d^dp_1/(2π)^d) ... (d^dp_4/(2π)^d) (d^dp/(2π)^d) (d^dq/(2π)^d) e^{ip_1·x_1 + ... + ip_4·x_4}
        × (−i/(p_1² + m² − iϵ)) (−i/(p_2² + m² − iϵ)) (−i/(p_3² + m² − iϵ)) (−i/(p_4² + m² − iϵ)) (−i/(p² + m² − iϵ)) (−i/(q² + m² − iϵ))
        × (2π)^d δ^d(p_1 + p_2 + p + q) (2π)^d δ^d(p_3 + p_4 − p − q).    (7.63)
This looks a bit nicer if we take the Fourier transform and use one of the δ-functions to evaluate the q
integral, giving us
    ⟨T ϕ(p_1) ... ϕ(p_4)⟩_c ⊃ ((−iλ)²/2) (2π)^d δ^d(p_1 + p_2 + p_3 + p_4) (−i/(p_1² + m² − iϵ)) ... (−i/(p_4² + m² − iϵ))
        × ∫ (d^dp/(2π)^d) (−i/(p² + m² − iϵ)) (−i/((p − p_3 − p_4)² + m² − iϵ)).    (7.64)
Here the “time-ordered correlator in momentum space” just means the Fourier transform of the time-ordered
position space correlator, and the δ-function in front is called a momentum-conserving δ-function. Such
a δ-function appears in every momentum-space correlation function, and is a consequence of the fact that
the correlation functions in position space only depend on relative positions due to spacetime translation
invariance. The “hard” part of computing this diagram is evaluating the integral over the loop momentum
on the second line; we will learn how to evaluate such integrals in a few weeks.
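To see what such a loop integral looks like in a case where everything can be done explicitly, here is a Euclidean d = 1 toy analogue of the loop integral in (7.64); the closed form 1/(m(k² + 4m²)) is a standard residue computation, not taken from the text:

```python
import math
from scipy.integrate import quad

m, k = 1.0, 0.7   # toy mass and external momentum

def integrand(p):
    # Euclidean "bubble": two propagators sharing the loop momentum p
    return 1.0 / ((p * p + m * m) * ((p - k) ** 2 + m * m))

val, _ = quad(integrand, -math.inf, math.inf)
bubble = val / (2 * math.pi)
closed = 1.0 / (m * (k * k + 4 * m * m))
print(bubble, closed)  # agree: ~ 0.2227
```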
The procedure employed in the previous paragraph is easily formalized into an algorithm for evaluating
any Feynman diagram for a momentum-space correlation function. This algorithm is called the Feynman
rules, and given a connected Feynman diagram C contributing to the connected M -point function it goes
like this:
1. Write a factor of (−iλ)nC , where nC is the number of interaction vertices in C.
2. Divide by the symmetry factor SC .
3. Label the momenta of all propagators, with external momenta pointed outwards.
4. Multiply by an overall momentum-conserving δ-function (2π)d δ d (p1 + . . . + pM ).
5. Multiply by a factor of −i/(p² + m² − iϵ) for each propagator, both internal and external, imposing momentum
conservation at each vertex.
Figure 20: Diagrams for problem 3.
7.8 Homework
1. Using Mathematica (or your favorite competitor), for f(λ) given by (7.5) make plots of log|Σ_{n=0}^{N−1} a_n λ^n −
f(λ)| as a function of N for λ = .5, .2, .1, and .05. Here a_n are the coefficients in the perturbation series
(7.12). Are your plots consistent with the qualitative story in figure 12? In particular note the maximal
accuracy and the value of N at which the series begins to diverge.
2. Starting from the definition (7.7), show that the Γ-function obeys Γ(x + 1) = xΓ(x) (to use (7.7) you can assume Re x > 0, but if you are comfortable with analytic continuation then you should also argue that this identity holds for all complex x). Also show that Γ(1/2) = √π. Using these results, show that (7.8) and (7.23) are compatible.
3. Check the symmetry factors quoted in the captions of figures 16 and 18, and also compute the symmetry
factors for the two diagrams shown in figure 20.
4. Using the momentum-space Feynman rules, write down an expression for the contribution of the two-
loop diagram in figure 21 to the Fourier transform of the Lorentzian time-ordered four-point function in
λϕ4 theory. You should evaluate all momentum integrals which can be evaluated using the momentum-
conserving δ-functions at the vertices, but you can leave any remaining integrals unevaluated. Make
sure to label the directions of the momenta on your diagram.
Figure 22: Feynman diagrams for the four-point function in a Gaussian integral over complex degrees of freedom; note that there is one fewer diagram than in the real case.
5. Show that
∫ dx dx^* e^{−x^†Ax + B^T x + B̃^T x^*} = (1/Det(iA/2π)) e^{B̃^T A^{−1} B}, (7.65)
where x is a vector with complex components, A is a positive symmetric matrix, and the measure is defined by dxdx^* = −2i dRe(x) dIm(x). Use this to show that correlation functions of the form
⟨x_{i_1} . . . x_{i_M} x^*_{j_1} . . . x^*_{j_N}⟩ = Det(iA/2π) ∫ dx dx^* x_{i_1} . . . x_{i_M} x^*_{j_1} . . . x^*_{j_N} e^{−x^†Ax} (7.66)
are given by a sum over pairings as in (7.27), but where now each pair must contain one x and one x^* (so in particular they vanish unless M = N). Feynman diagrams for a complex degree of freedom therefore include an arrow on each propagator that points from x to x^*, as in figure 22.
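As a hedged numeric illustration of the divergence pattern asked about in problem 1, the sketch below uses a stand-in toy integral f(λ) = (1/√(2π)) ∫ dx e^{−x²/2 − λx⁴}, whose series coefficients are a_n = (−1)ⁿ(4n − 1)!!/n!; the actual problem should of course use the definitions (7.5) and (7.12):

```python
import math

def f(lam, xmax=10.0, n_steps=200000):
    """Evaluate f(lambda) = (1/sqrt(2*pi)) * int dx exp(-x^2/2 - lam*x^4)
    by a midpoint Riemann sum (the integrand decays fast, so [-xmax, xmax] suffices)."""
    dx = 2*xmax/n_steps
    total = sum(math.exp(-x*x/2 - lam*x**4)
                for i in range(n_steps)
                for x in [-xmax + (i + 0.5)*dx])
    return total*dx/math.sqrt(2*math.pi)

def a(n):
    """Perturbative coefficients a_n = (-1)^n (4n-1)!!/n! of the toy integral."""
    double_fact = 1
    for k in range(1, 4*n, 2):
        double_fact *= k
    return (-1)**n * double_fact / math.factorial(n)

def partial_sum_error(lam, N):
    """|sum_{n=0}^{N-1} a_n lam^n - f(lam)|: the quantity plotted in problem 1."""
    return abs(sum(a(n)*lam**n for n in range(N)) - f(lam))

# The error first decreases with N, reaches a floor, then grows without bound:
# the series is asymptotic, not convergent.
errors = {N: partial_sum_error(0.01, N) for N in (2, 8, 40)}
```

For λ = 0.01 the terms shrink until n ≈ 7 and then grow factorially, so the best achievable accuracy is set by the smallest term, exactly the qualitative story of figure 12.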
56 Note that the denominator in the interaction term is 4, not 4!. This is because the symmetry of the vertex is now only 2! × 2! = 4, as the x lines and x^* lines at a vertex can only be permuted among themselves.
8 Particles and Scattering
So far we have organized our discussion of quantum field theory in terms of correlation functions - vacuum
expectation values of products of Heisenberg fields. At least in free field theory however we saw that we
could also interpret correlation functions in terms of particles: we found a basis of eigenstates of the
Hamiltonian whose elements each have some definite number of non-interacting bosons each carrying a
definite spacetime momentum. It is natural to ask if interacting field theories also have a description in
terms of particles. In general they don’t, which is why we have focused on correlation functions so far, but
many of the interacting quantum field theories of most interest in physics do indeed give rise to particles and
it is therefore worthwhile for us to spend some time understanding when this happens and how to relate it
to our knowledge of correlation functions.
In other words we can expand the set of one-particle states in a basis of P µ eigenstates
where σ runs over a finite set and Poincaré transformations U (Λ, a) act within the subspace of the full Hilbert
space of the theory that is spanned by this basis. As in previous lectures we will normalize these states so
that
⟨p′, σ′|p, σ⟩ = (2π)^{d−1} δ^{d−1}(p⃗′ − p⃗) δ_{σ′σ}, (8.2)
where ensuring diagonality in the σ indices may require some Gram-Schmidt procedure. The action of
spacetime translations in this basis is simple; we have
e^{−ia_µ P^µ}|p, σ⟩ = e^{−ia_µ p^µ}|p, σ⟩, (8.3)
so what we need to understand is the action of Lorentz transformations U (Λ). These obey
U(Λ)|p, σ⟩ = Σ_{σ′} C_{σ′,σ}(Λ, p)|Λp, σ′⟩. (8.5)
To work out the structure of the Cσ′ ,σ (Λ, p), it is useful to first consider the special case where Λµν pν = pµ .
Given a spacetime momentum pµ , the subgroup of the Lorentz group which preserves pµ is called the little
group for pµ . Given a Lorentz transformation W which is in the little group for some spacetime momentum
k^µ, acting on |k, σ⟩ the transformation (8.5) simplifies to
U(W)|k, σ⟩ = Σ_{σ′} D_{σ′,σ}(W)|k, σ′⟩, (8.6)
A warning:
The D-matrices appearing here are operators which represent the little group on the Hilbert space
of quantum mechanics. They are NOT the same as the D(Λ) matrices we met in earlier lectures,
which represent the full Lorentz group acting on the components of the fields in the theory via
U(Λ^{−1})Φ_a(x)U(Λ) = Σ_b D_{ab}(Λ)Φ_b(Λ^{−1}x). Many people have wasted a lot of time being confused
about the difference between the (typically infinite-dimensional) representation U (Λ) of Lorentz sym-
metry acting on Hilbert space and the (typically finite-dimensional) representation D(Λ) of Lorentz
symmetry acting on fields. The D-matrices we are introducing now are involved in the part of the
former which acts on one-particle states, not the latter.
The key point is then that once we have decided on a representation for the little group, the representation
of the full Lorentz group is determined as well. The idea is that the set of possible spacetime momenta pµ for
a particle are all related by Lorentz transformations, so we can write each pµ as a Lorentz transformation Lp
of some fixed reference momentum k µ . The detailed form of k µ depends on whether the particle is massive
or massless, and will be considered in the next paragraph. So far we have not said anything about how the
σ indices at different momenta are related, we can determine this by simply adopting a convention where
the state |p, σ⟩ is related to the state |k, σ⟩ by
|p, σ⟩ = N(p) U(L_p)|k, σ⟩. (8.9)
Here N(p) is a normalization factor that we include to maintain the normalization (8.2). We showed back in lecture four that this requires
N(p) = √(k⁰/p⁰), (8.10)
with the idea being that the object (d^d p/(2π)^d) 2πδ(p² + m²)Θ(p⁰) = (d^{d−1}p/(2π)^{d−1}) (1/(2ω_{p⃗})) defines a Lorentz-invariant measure on spatial momenta, and this implies the Lorentz transformation rule δ^{d−1}(p⃗′_Λ − p⃗_Λ) = (ω_{p⃗}/ω_{p⃗_Λ}) δ^{d−1}(p⃗′ − p⃗). For general Λ we then have
U(Λ)|p, σ⟩ = N(p) U(ΛL_p)|k, σ⟩ = (N(p)/N(Λp)) Σ_{σ′} D_{σ′,σ}(L^{−1}_{Λp}ΛL_p)|Λp, σ′⟩, (8.11)
where we have observed that L^{−1}_{Λp}ΛL_p is in the little group with respect to k^µ, and thus that
C_{σ′,σ}(Λ, p) = √((Λp)⁰/p⁰) D_{σ′,σ}(L^{−1}_{Λp}ΛL_p). (8.12)
This way of building a representation of a group out of a representation for one of its subgroups is called the
method of induced representations.
To discuss the structure of the little group in more detail, we need to be more explicit about which
Lorentz group we are considering: do we include its non-identity components, and do we go to the double
cover where a rotation by 2π acts as −1 on particles of half-integer spin? Let’s first consider the more
familiar case where we take 2π to be the identity, in which case we are interested in the identity component
SO+ (d − 1, 1) of the Lorentz group. The little group depends on whether pµ is timelike, null, or spacelike,
and in the timelike and null cases it also depends on whether it is future or past pointing. There are no
known particles whose momentum is spacelike (these would be called “tachyons” and would allow causality
violation), nor are there any known particles with p⁰ < 0 (these would have negative energy and destabilize
the vacuum). We will thus focus on the cases where pµ is timelike or null with p0 > 0, which describe massive
and massless particles respectively.
In the massive case we have p · p = −m² for some m > 0, and by going to the rest frame of the particle we can choose our reference momentum to be
k^µ = (m, 0, . . . , 0). (8.13)
The little group thus consists of those elements of SO+ (d − 1, 1) which preserve the vector (m, 0, . . . , 0).
No Lorentz transformation which involves a boost can do this, so the little group of a massive particle is
just the spatial rotation group SO(d − 1). Therefore each massive particle is characterized by an irreducible
representation of the spatial rotation group, which of course is what we call the spin of the particle. In
particular for d = 4 the irreducible representations of SO(3) are labeled by integers j ≥ 0, with the spin-j
representation having dimension 2j + 1 as you hopefully know. If we generalize SO+ (d − 1, 1) to its double
cover Spin+ (d − 1, 1), then the little group becomes Spin(d − 1) so more representations are allowed. In
particular for d = 4 the little group becomes Spin(3) = SU (2), which allows for half-integer j.
The massless case is perhaps more novel. A massless particle has no rest frame, so the best we can do is
choose the reference momentum k µ to point in the positive x1 direction:
k µ = (κ, κ, 0, . . . 0) (8.14)
with κ > 0. To find the little group the easiest way to proceed is to find the set of Lorentz generators which
annihilate k µ . We can write a general Lorentz generator as
J = -i\begin{pmatrix} 0 & b_1 & b_2 & \cdots & b_{d-1} \\ b_1 & 0 & -c_2 & \cdots & -c_{d-1} \\ b_2 & c_2 & & & \\ \vdots & \vdots & & A & \\ b_{d-1} & c_{d-1} & & & \end{pmatrix}, (8.15)
where the bi multiply boost generators in the i direction and the ci multiply rotation generators in the
1 − i plane. The matrix A is an arbitrary real antisymmetric matrix, which we can think of as generating
a rotation that doesn’t involve the x1 direction. Demanding that this annihilates (8.14) tells us that we
need b1 = 0 and bi = −ci for all i ≥ 2, so we see that the little group of a massless particle moving in the
x1 direction is generated by rotations involving the x2 , . . . , xd−1 directions and combinations of boosts and
rotations whose generators have the form
Ai = J i0 + J i1 . (8.16)
These generators are mutually commuting, and are rotated into each other by rotations involving the
x2 , . . . , xd−1 directions, so the little group of a massless particle is isomorphic to the group ISO(d − 2)
of Euclidean rotations and translations of Rd−2 . Since the Ai are mutually commuting we can simultane-
ously diagonalize them, but this leads to a problem: since their eigenvalues give a vector in Rd−2 which can
be continuously rotated, if we do not have Ai = 0 for all i ≥ 2 then the index σ cannot run only over a finite
(or even discrete) range. Such representations are called “continuous spin particles”, and they are typically
viewed as pathological. For example such a particle could never be in thermal equilibrium, and seems quite
hard to reconcile with quantum gravity. There has also never been any evidence for a continuous spin particle. More pedantically, continuous spin particles don’t actually obey our definition of particle state since
σ is not finite. For any of these reasons, we will from now on restrict to representations with Ai = 0. This
reduces the little group to SO(d − 2), and the choice of an irreducible SO(d − 2) representation associated to
a particle is called its helicity. In particular for d = 4 helicity is determined by an irreducible representation
of SO(2), and these are one-dimensional and labeled by an integer j. Indeed the representation is simply
D(θ) = e^{ijθ}, (8.17)
where θ is the rotation angle in the 2 − 3 plane. If we generalize SO⁺(d − 1, 1) to Spin⁺(d − 1, 1), then the
little group of a massless particle (again with Ai = 0) is Spin(d − 2). For d = 4 this is again isomorphic
to SO(2), but now with half-integer j allowed (so that we have to rotate by 4π to get back to where we
started).
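The little-group algebra just described is easy to verify directly. Below is a small numpy sketch for d = 4 in the mostly-plus signature, where the relative sign between the boost and rotation pieces of A_i is fixed by demanding that k = (κ, κ, 0, 0) be annihilated (the sign conventions here are chosen for illustration and may differ from those used elsewhere in these notes):

```python
import numpy as np

def E(a, b):
    """Matrix unit: 1 in row a, column b of a 4x4 matrix."""
    M = np.zeros((4, 4))
    M[a, b] = 1.0
    return M

# Generators acting on vectors x^mu, signature (-,+,+,+), d = 4.
def boost(i):
    """Boost generator mixing x^0 and x^i."""
    return E(0, i) + E(i, 0)

def rotation(i, j):
    """Rotation generator in the i-j plane (sign convention chosen for convenience)."""
    return E(i, j) - E(j, i)

# A_i = J^{i0} + J^{i1}: a boost along x^i combined with a rotation in the
# 1-i plane, with relative sign such that A_i annihilates k = (kappa, kappa, 0, 0).
A2 = boost(2) + rotation(1, 2)
A3 = boost(3) + rotation(1, 3)

kappa = 1.0
k = np.array([kappa, kappa, 0.0, 0.0])
eta = np.diag([-1.0, 1.0, 1.0, 1.0])  # mostly-plus Minkowski metric
```

One can then check that A₂ and A₃ annihilate k, commute with each other, and satisfy the Lorentz-algebra condition AᵀΞ· + Ξ·A = 0 with Ξ· = η, confirming the ISO(2) structure of the massless little group.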
Now we can return to the question of why photons do not have three spin states. The reason is that they
don’t really have spin one, spins are for massive particles. What they have is helicity one! Photons with right
circular polarization have helicity j = 1, while photons with left circular polarization have helicity j = −1.
These states transform in distinct irreducible representations of the Lorentz group, and in particular they
are not mixed together by Lorentz transformations (which is quite different from the situation for a massive
particle of spin one).
There is a somewhat confusing terminology issue with helicity that is related to the idea of spatial
reflection symmetry R (or parity P in even dimensions). Photons with helicity one and helicity minus one
are not mixed by the action of the connected Lorentz group SO+ (d − 1, 1), so strictly speaking according to
our definition we should view them as different types of particle. On the other hand all interactions which
involve photons also preserve R symmetry, and R symmetry does mix photons of opposite helicity. It is
therefore conventional to refer to both helicities as photons. The same is true for gravitons, which have
helicity j = ±2. In the standard model of particle physics however there are particles called neutrinos, which
are involved in nuclear reactions, and which are treated as massless with helicity j = −1/2. The standard
model also has massless particles with helicity j = 1/2, which are called antineutrinos. The interactions of
these particles do not respect R symmetry, and so they are given different names.58
Having classified the possible one-particle states, we can now consider multi-particle states
|p_1, σ_1, n_1; p_2, σ_2, n_2; . . . ; p_M, σ_M, n_M⟩, (8.18)
where the new label ni tells us the type of the ith particle (i.e. is it a photon, an electron, etc). The Poincaré
transformation of such a state is just the product transformation
U(Λ, a)|p_1, σ_1, n_1; . . . ; p_M, σ_M, n_M⟩ = Π_{i=1}^{M} [ e^{−ia·Λp_i} √((Λp_i)⁰/p_i⁰) Σ_{σ′_i} D^{n_i}_{σ′_i,σ_i}(L^{−1}_{Λp_i}ΛL_{p_i}) ] |Λp_1, σ′_1, n_1; . . . ; Λp_M, σ′_M, n_M⟩. (8.19)
58 There is clear evidence that at least two of the three known types of neutrino are massive. Most particle physicists expect
that in fact they all are, but so far this has not been confirmed experimentally. Understanding the nature of the neutrino mass
matrix is one of the main goals of current particle physics research.
The normalization of these states is a bit trickier since we need to account for identical particles. For example
in our free scalar theory we have
⟨p⃗′_1, p⃗′_2|p⃗_1, p⃗_2⟩ = (2π)^{2(d−1)} [δ^{d−1}(p⃗′_1 − p⃗_1)δ^{d−1}(p⃗′_2 − p⃗_2) + δ^{d−1}(p⃗′_1 − p⃗_2)δ^{d−1}(p⃗′_2 − p⃗_1)]. (8.20)
In general we impose
⟨p′_1, σ′_1, n′_1; . . . ; p′_M, σ′_M, n′_M|p_1, σ_1, n_1; . . . ; p_M, σ_M, n_M⟩ = Σ_π (−1)^{f_π} Π_{i=1}^{M} (2π)^{d−1} δ^{d−1}(p⃗′_{π(i)} − p⃗_i) δ_{σ′_{π(i)}σ_i} δ_{n′_{π(i)}n_i}, (8.21)
where the sum is over permutations π of M objects and fπ indicates the number of fermions which are
exchanged by the permutation. In free field theory this sum is automatically generated by the algebra of
creation/annihilation operators, as we saw in (8.20). It is also useful to introduce the idea of a complete set
of multiparticle states, which we can write as
I = Σ_{M=0}^{∞} ∫ (Π_{i=1}^{M} d^{d−1}p_i/(2π)^{d−1}) Σ_{σ_i} Σ_{n_i} (1/S(n)) |p_1, σ_1, n_1; . . . ; p_M, σ_M, n_M⟩⟨p_1, σ_1, n_1; . . . ; p_M, σ_M, n_M|, (8.22)
where the “symmetry factor” S(n) counts the number of possible permutations of identical particles in |p_1, σ_1, n_1; . . . ; p_M, σ_M, n_M⟩ (so in particular if none of the n_i are equal then S(n) = 1). Here by convention the state with zero particles is of course the vacuum |Ω⟩.
As you can already see the notation for multiparticle states is somewhat tedious, so following Weinberg
we’ll adopt an abbreviated notation where a multiparticle state is simply called |α⟩, the inner product (8.21)
is written as
⟨α|β⟩ = δ(α − β), (8.23)
and the resolution of the identity (8.22) is written as
I = ∫ dα |α⟩⟨α|. (8.24)
Figure 23: In and out states in scattering: in the in state |α, +⟩ we have a definite particle configuration at t → −∞, while in an out state |β, −⟩ we have a definite particle configuration at t → ∞. In general an in state evolves to a complicated superposition of out states, with the coefficients being given by the S-matrix.
where scattering theory is most clearly established. In theories with massless particles one can still try to use
scattering theory, but one often encounters “infrared divergences” when doing so and these typically need to
be dealt with on a somewhat case-by-case basis. Those of you who have studied scattering in non-relativistic
quantum mechanics should already be familiar with this problem, as attempts to treat scattering off of a
Coulomb potential using standard methods break down due to logarithmic divergences. Our approach will
be simply to proceed with the assumption that additive multiparticle states exist, with the understanding
that this will sometimes lead to trouble with massless particles that we will need to address when it arises.
We formalize this as follows:
A quantum mechanical theory with Hamiltonian H has a scattering description if H has a complete
set of “in state” eigenstates, denoted |α, +⟩, and also a complete set of “out state” eigenstates |α, −⟩,
both with eigenvalues
H|α, ±⟩ = Eα |α, ±⟩, (8.25)
where Eα are the eigenvalues of a non-interacting multiparticle Hamiltonian H0 with eigenstates |α⟩,
such that we have
lim_{t→∓∞} ∫ dα g(α) e^{−iE_α t}|α, ±⟩ = lim_{t→∓∞} ∫ dα g(α) e^{−iE_α t}|α⟩ (8.26)
for arbitrary smooth (and integrable) wave packets g(α) which respect the exchange symmetry of any
identical particles.
What this definition says is that wave packets of the in states look like non-interacting multiparticle eigen-
states at early times, while wave packets of the out states look like non-interacting multiparticle eigenstates
at late times. The basic idea is illustrated in figure 23.
One immediate consequence of this definition is that the inner product of in states with in states and out
states with out states is the same as for non-interacting particles:
⟨β, ±|α, ±⟩ = δ(β − α), (8.27)
which follows because the inner product is time-independent so we can compute at early/late times for in/out
states where they coincide with the non-interacting eigenstates. More interesting is the overlap between in
and out states, which by definition is called the S-matrix:
S_{βα} ≡ ⟨β, −|α, +⟩. (8.28)
The S-matrix is the primary object of interest in scattering theory; it tells us the quantum amplitude to
find the system in an out state |β, −⟩ given that it started in an in state |α, +⟩. More earthily, the S-matrix
provides the answer to a question well known to children everywhere: if you take some stuff and slam it
together, what comes out? Many physicists, who after all have much in common with children, spend their
days studying precisely this question.
A very important property of this S-matrix is that it is unitary, which follows immediately from its
definition since it is a change of basis between two complete sets of orthonormal states. We can also check
this explicitly:
∫ dβ S*_{βα} S_{βγ} = ∫ dβ ⟨α, +|β, −⟩⟨β, −|γ, +⟩ = ⟨α, +|γ, +⟩ = δ(α − γ). (8.29)
One reason why the unitarity of the S-matrix is interesting is that if we have a perturbative expansion
for S then the unitarity constraint mixes different orders in perturbation theory, which sometimes lets us
determine higher-order contributions from lower-order ones.
It will be useful in what follows to write down the Lorentz transformation of the S-matrix: from (8.19)
we have (in more explicit notation)
⟨p′_1, σ′_1, n′_1; . . . ; p′_N, σ′_N, n′_N, −|p_1, σ_1, n_1; . . . ; p_M, σ_M, n_M, +⟩ = Π_{i=1}^{N} √((Λp′_i)⁰/p′_i⁰) Π_{j=1}^{M} √((Λp_j)⁰/p_j⁰)
× Σ_{σ̄′_i, σ̄_j} Π_{i=1}^{N} D^{n′_i}_{σ̄′_i,σ′_i}(W_{p′_i})* Π_{j=1}^{M} D^{n_j}_{σ̄_j,σ_j}(W_{p_j}) ⟨Λp′_1, σ̄′_1, n′_1; . . . , −|Λp_1, σ̄_1, n_1; . . . , +⟩, (8.30)
with
W_p = L^{−1}_{Λp} Λ L_p. (8.31)
Due to the unitarity of the little group representations this formula simplifies if we take the absolute value
squared and sum over all initial and final spins/helicities:
Σ_{σ′_1...σ′_N, σ_1...σ_M} |⟨p′_1, σ′_1, n′_1; . . . ; p′_N, σ′_N, n′_N, −|p_1, σ_1, n_1; . . . ; p_M, σ_M, n_M, +⟩|² = Π_{i=1}^{N} ((Λp′_i)⁰/p′_i⁰) Π_{j=1}^{M} ((Λp_j)⁰/p_j⁰)
× Σ_{σ′_1...σ′_N, σ_1...σ_M} |⟨Λp′_1, σ′_1, n′_1; . . . ; Λp′_N, σ′_N, n′_N, −|Λp_1, σ_1, n_1; . . . ; Λp_M, σ_M, n_M, +⟩|². (8.32)
If we define
S̃_{βα} = Π_{i=1}^{N_α} √(2E_{α,i}) Π_{j=1}^{N_β} √(2E_{β,j}) S_{βα}, (8.33)
where E_{α,i} is the energy of the ith ingoing particle and E_{β,j} is the energy of the jth outgoing particle, then the quantity
Σ_{spin/helicity} |S̃_{βα}|² (8.34)
is Lorentz invariant.
Because of the momentum-conserving δ-function in S_{βα}, the quantity |S_{βα}|², which naively is the probability to find an out state |β, −⟩ given that we start in an in state |α, +⟩,
is infinite. To see the necessity of this δ-function, we can observe that
Sβα = ⟨β, −|eiP ·a e−iP ·a |α, +⟩ = ei(pβ −pα )·a Sβα , (8.35)
for all spacetime vectors a, which is only possible if Sβα vanishes if pα ̸= pβ . Here pα and pβ are the total
spacetime momenta of the in and out states (α and β label the states, they are NOT Lorentz indices).
Moreover if we integrate Sβα against generic normalized wave packets we expect a finite and nonzero answer
since we are computing an overlap of normalized states which have no reason to be orthogonal in general, so
whatever support Sβα has when pα = pβ must be strong enough to integrate to a nonzero result - in other
words there must be a δ-function. It is convenient to extract this δ-function, and also the non-interacting contribution δ(β − α), in hopes that the remaining part of S_{βα} is nonsingular:
S_{βα} = δ(β − α) + i(2π)^d δ^d(p_β − p_α) M_{βα}. (8.36)
The factor of i here is conventional; I offer no explanation for it. M_{βα} can still have further δ-function
singularities, but only when a subset of the particles in α has exactly the same spacetime momenta as a
subset of the particles in β. These additional singularities can be removed by defining a “connected” S-matrix
in exactly the same way we did for correlation functions, to avoid this we will just restrict to studying the
S-matrix away from these special kinematic points.
To understand what to do about the divergence of |Sβα |2 , we first need to realize that it is an infrared
divergence: the quantity δ d−1 (0) is infinity in momentum space because we are working in infinite volume.
In finite volume the inner product of our one-particle states is
⟨p′, σ′|p, σ⟩ = δ_{σσ′} ∫ d^{d−1}x e^{i(p⃗−p⃗′)·x⃗} = V δ_{σσ′} δ_{p⃗p⃗′}, (8.37)
so we can formally interpret δ^{d−1}(0) as the volume of space. The square of the S-matrix is therefore diverging
because we defined it as an overlap of states whose norms are order √V instead of states whose norms are
one. The principled way to fix this is to only consider scattering of normalizable wave packets. In particular
the quantity
|∫ dα g(α) S_{βα}|² (8.38)
with
∫ dα |g(α)|² = 1 (8.39)
should be finite. A somewhat lazier approach, but which also works and leads more quickly to the same
answer, is to just work in finite volume and stick to momentum eigenstates. Indeed in finite volume we can
define properly normalized in and out states
|α, ±⟩_V = V^{−N_α/2} |α, ±⟩, (8.40)
in terms of which the differential transition probability from α to β is
dP(α → β) = |⟨β, −|α, +⟩_V|² dN_β = (|S_{βα}|²/V^{N_α}) dβ, (8.41)
where
dN_β = V^{N_β} dβ (8.42)
is the number of states in the infinitesimal phase space window dβ. To derive this, recall that for a single particle momentum in finite volume we have
dN = V d^{d−1}p/(2π)^{d−1} (8.43)
since momenta in the box are quantized (for example if we take it to be a square torus of side length L) as
p⃗ = 2π⃗n/L, (8.44)
with ⃗n a spatial vector of integers. If we avoid choices of α and β where M_{βα} has additional δ-functions (or
just focus on connected scattering), then we can write this as
dP(α → β) = V^{−N_α} [(2π)^d δ^d(p_β − p_α)]² |M_{βα}|² dβ. (8.45)
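The state counting (8.43)–(8.44) underlying these manipulations can be tested numerically: in a spatial box of side L, the number of quantized momenta with |p⃗| < P should approach V times the momentum-space ball volume divided by (2π)^{d−1}. A sketch for d − 1 = 3:

```python
import math

def box_state_count(L, P):
    """Count momenta p = 2*pi*n/L (n a vector of integers) with |p| < P,
    in 3 spatial dimensions."""
    nmax = int(P*L/(2*math.pi)) + 1
    count = 0
    for nx in range(-nmax, nmax + 1):
        for ny in range(-nmax, nmax + 1):
            for nz in range(-nmax, nmax + 1):
                p2 = (2*math.pi/L)**2 * (nx*nx + ny*ny + nz*nz)
                if p2 < P*P:
                    count += 1
    return count

def continuum_estimate(L, P):
    """dN = V d^3p/(2*pi)^3 integrated over the ball |p| < P."""
    V = L**3
    return V * (4.0/3.0)*math.pi*P**3 / (2*math.pi)**3

# As L*P grows, the ratio of the exact count to the continuum estimate -> 1.
ratio = box_state_count(L=20.0, P=2*math.pi) / continuum_estimate(L=20.0, P=2*math.pi)
```

The residual deviation is a boundary effect of relative size O(1/(LP)), which is exactly why these finite-volume manipulations become exact in the large-volume limit.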
To deal with the square of the δ-function, we notice that in finite volume we can write it as
[(2π)^d δ^d(p_β − p_α)]² = (V δ_{p⃗_α p⃗_β} T δ_{E_α E_β})²
= V T × V T δ_{p⃗_α p⃗_β} δ_{E_α E_β}
= V T (2π)^d δ^d(p_β − p_α), (8.46)
where T is the total time elapsed in the scattering process (i.e. we work in a “time box” as well as a spatial box). The transition rate, which is the transition probability per unit time, is therefore given by
dΓ(α → β) = V^{1−N_α} (2π)^d δ^d(p_β − p_α)|M_{βα}|² dβ. (8.47)
We’ve now succeeded in pushing all infrared divergences into an overall power of the volume, as everything
else appearing here is sensible in the large volume limit.
To proceed further, we need to think a bit about how to connect this setup to what experimentalists
actually do. The easiest case is N_α = 1, for which the power of V just cancels. We are then studying the decay of an unstable particle, whose differential decay rate into a final state β is apparently given by
dΓ(α → β) = (2π)^d δ^d(p_β − p_α)|M_{βα}|² dβ. (8.48)
I must confess however that this formula (although correct when interpreted properly) is a bit of a cheat: an
unstable particle isn’t really a one-particle state of the theory in infinite volume, so we can’t really interpret
M_{βα} as part of the S-matrix. On the other hand if the total decay rate
Γ = ∫ dβ (2π)^d δ^d(p_β − p_α)|M_{βα}|² (8.49)
is small compared to the inverse of our time interval T then we can effectively treat the particle as stable
in our box setup. This formula should therefore be correct as long as the lifetime of the particle is long
compared to all other scales in the problem.60
A somewhat more complicated (but also better defined) case is Nα = 2, which is the scattering of two
particles to many. The classic setup for this experiment is shown in figure 24: we have a beam of incident
identical particles with momentum p1 aimed at a target particle with momentum p2 = (m2 , ⃗0). The natural
thing to measure is the differential rate for scattering into the out state β divided by the incident flux, which
is called the differential cross section:
dσ(α → β) = dΓ(α → β)/f_α. (8.50)
Since we are working in the rest frame of the target particle, the incident flux f_α is given by
f_α = |⃗v_1| ρ_1, (8.51)
stable particles and Mαβ in terms of the residue of this resonance, but we won’t explore it here. If you just compute Mαβ using
the Feynman rules we’ll find below extrapolated to the case of one ingoing particle then you will get the right answer.
Figure 24: Fixed-target scattering: an incident beam of identical particles (shown in red) with identical momenta is scattered off of a single target particle (shown in blue). What is the rate at which each out state β is produced per unit incident flux? The answer to this question is the differential cross section dσ/dβ.
where ⃗v1 is the velocity of incident particles and ρ1 is their density. In a general Lorentz frame (which after
all we had better include since the target particle could be massless) the flux is instead defined to be
fα = uα ρ1 , (8.52)
where
u_α = √((p_1 · p_2)² − m_1² m_2²)/(E_1 E_2) (8.53)
is called the relative velocity. You will show in the homework that when p⃗1 and p⃗2 are collinear (i.e.
proportional to each other) then we have
uα = |⃗v1 − ⃗v2 |. (8.54)
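The collinear relation (8.54), which is homework problem 6, can also be spot-checked numerically; the sketch below works in one spatial dimension with the mostly-plus convention p_1 · p_2 = −E_1E_2 + q_1q_2 used in these notes:

```python
import math

def u_rel(m1, q1, m2, q2):
    """Relative velocity (8.53) for collinear spatial momenta q1, q2
    (one spatial dimension for simplicity), signature (-,+,...,+)
    so that p1.p2 = -E1*E2 + q1*q2."""
    E1 = math.sqrt(m1*m1 + q1*q1)
    E2 = math.sqrt(m2*m2 + q2*q2)
    dot = -E1*E2 + q1*q2
    return math.sqrt(dot*dot - (m1*m2)**2) / (E1*E2)

def speed_difference(m1, q1, m2, q2):
    """|v1 - v2| with v = q/E for each particle."""
    E1 = math.sqrt(m1*m1 + q1*q1)
    E2 = math.sqrt(m2*m2 + q2*q2)
    return abs(q1/E1 - q2/E2)
```

Algebraically the agreement is exact: (p_1 · p_2)² − m_1²m_2² = (E_1q_2 − E_2q_1)² for collinear momenta, which is the identity the homework problem asks you to establish.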
For general p1 , p2 the motivation for this definition of flux is that it makes the spin-summed differential
cross section Lorentz invariant, as we will see in a moment. Returning now to our box setup with Nα = 2,
our one-particle box states |α, ∓⟩V are properly normalized so the number of particles in the box in such a
state is one. The particle density is therefore 1/V , and we can think of the transition rate (8.47) as arising
from a beam of particle one with density ρ1 = V1 and flux uα /V scattering off of the other particle in the
box (wherever it is) just as in figure 24. The differential cross section is therefore given by
dσ(α → β) = u_α^{−1} (2π)^d δ^d(p_β − p_α)|M_{βα}|² dβ. (8.55)
This formula is used anytime someone wants to compare a theoretical calculation of two-particle scattering
to experiment!
Let’s briefly consider the Lorentz transformation properties of the differential cross section. M_{βα} has
the same Lorentz transformation properties as S_{βα}, and we saw in equation (8.32) that the Lorentz transformation of |S_{βα}|² is simple once we sum over spins/helicities and multiply by the product of initial and
final energies of each particle. More concretely, if we define
M̃_{βα} = Π_{i=1}^{N_β} √(2E_{β,i}) Π_{j=1}^{N_α} √(2E_{α,j}) M_{βα}, (8.56)
then
Σ_{spin/helicity} |M̃_{βα}|² (8.57)
is Lorentz invariant. We also saw in lecture four (and mentioned below equation (8.10)) that the quantity
d̃β = dβ / Π_{i=1}^{N_β} 2E_{β,i} (8.58)
is Lorentz invariant, where again Eβ,i is the energy of the ith particle in the final state. We therefore are
motivated to sum (8.55) over initial and final spins/helicities and then rewrite it as
Σ_{spin/helicity} dσ(α → β) = (1/(4√((p_1 · p_2)² − m_1² m_2²))) × (2π)^d δ^d(p_β − p_α) × Σ_{spin/helicity} |M̃_{βα}|² × d̃β, (8.59)
where the right-hand side is now a product of manifestly Lorentz-invariant quantities (in particular because
of our definition (8.53) of the relative velocity). Thus we see that the spin-summed differential cross section
is Lorentz invariant!
You perhaps are wondering about the physical motivation for summing over initial and final spins/helicities.
In fact what we really should do is sum over final spins/helicities and average over initial spins/helicities,
for the following reasons:
• Typically the method for preparing a beam of particles does not preferentially treat one spin/helicity state over another. We therefore should expect the initial state in a scattering process to be a mixed quantum state where all spin/helicity configurations are equally likely, in which case we should average over initial spins/helicities in the transition rate.
• Typically in measuring the final state we do not get a good measurement of the spins/helicities of the particles. We should therefore sum over these to compute the transition rate which does not distinguish between different spins/helicities.
In situations where either of these statements is not the case, then we need to deal with the full differential
cross section.
Plugging (8.36) into the unitarity relation (8.29), we find
δ(γ − α) = δ(γ − α) + (2π)^d δ^d(p_γ − p_α) [iM_{γα} − iM*_{αγ} + ∫ dβ (2π)^d δ^d(p_β − p_α) M*_{βγ} M_{βα}]. (8.60)
The δ(γ − α) terms cancel on both sides, so the remaining equality tells us that for any states α and γ such that p_γ = p_α we should have
iM_{γα} − iM*_{αγ} + ∫ dβ (2π)^d δ^d(p_β − p_α) M*_{βγ} M_{βα} = 0. (8.61)
Setting γ = α, this says that
2 Im M_{αα} = ∫ dβ (2π)^d δ^d(p_β − p_α) |M_{βα}|². (8.62)
The right-hand side is proportional to the total transition rate
Γ(α) = ∫ dβ (dΓ(α → β)/dβ), (8.63)
which gives
Γ(α) = 2V^{1−N_α} Im M_{αα}. (8.64)
In particular for N_α = 1 we have
Γ(α) = 2 Im M_{αα}, (8.65)
so the decay rate of an unstable particle is just two times the imaginary part of its forward scattering amplitude.
For N_α = 2 we can rewrite things in terms of the total cross section
σ(α) = ∫ dβ (dσ(α → β)/dβ) = V Γ(α)/u_α, (8.66)
which gives
σ(α) = (2/u_α) Im M_{αα}. (8.67)
This last result is called the optical theorem. Both (8.65) and (8.67) express the idea that by unitarity
any decay or scattering which is possible must decrease the probability that no scattering happens, which is
certainly a reasonable thing to expect!
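The relation (8.61) with γ = α has a finite-dimensional toy analogue that is easy to check numerically: for a random unitary matrix S = 1 + iT, with matrix sums over β replacing the integrals and δ-functions, unitarity forces 2 Im T_αα = Σ_β |T_βα|². A sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

# Build a random unitary S = exp(i*H) with H Hermitian, via eigendecomposition.
n = 6
A = rng.normal(size=(n, n)) + 1j*rng.normal(size=(n, n))
H = A + A.conj().T
w, V = np.linalg.eigh(H)
S = V @ np.diag(np.exp(1j*w)) @ V.conj().T

# Write S = 1 + i*T. Unitarity S^dag S = 1 implies T - T^dag = i T^dag T,
# whose diagonal entries read 2*Im(T[a,a]) = sum_b |T[b,a]|^2:
# the discrete counterpart of the optical theorem.
T = (S - np.eye(n)) / 1j
alpha = 0
lhs = 2*np.imag(T[alpha, alpha])
rhs = np.sum(np.abs(T[:, alpha])**2)
```

This also illustrates the remark above that unitarity mixes orders in perturbation theory: the left-hand side is first order in T while the right-hand side is second order.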
8.6 Homework
1. The helicity of a photon in general dimensions is that of the vector representation of SO(d − 2), so a
photon in d dimensions has d − 2 independent values of σ (i.e. independent polarization states). How
would you interpret this in 1+1 and 2+1 dimensions? Hint: think about how the classical polarization
of an EM wave should work in these dimensions.
2. The helicity of a graviton in general dimensions is that of the representation of SO(d − 2) furnished
by a symmetric traceless two-tensor hij . How many independent polarizations does a graviton have in
d spacetime dimensions?
3. Check that our resolution (8.22) of the identity indeed acts as the identity on free scalar two-particle states of the form a†_{p⃗} a†_{p⃗′}|Ω⟩.
4. Consider the scattering of a non-relativistic quantum particle off of a δ-function potential, with Hamiltonian
H = p²/(2m) + V_0 δ(x). (8.68)
You can assume V0 > 0. Give explicit formulas for the In and Out states of this theory, and compute
the S-matrix. Hint: to get a complete basis you need to consider incident waves from both the left and
the right, and you need to make sure your states are eigenstates of the full Hamiltonian. This theory
arises from the more general scattering theory we’ve considered in the limit where one of the incident
particles is infinitely massive and its interaction with the other particle has infinitely-short range.
5. Confirm that (8.32) follows from (8.30) by the unitarity of the little group representations.
6. Show that the relative velocity (8.53) becomes (8.54) in the “collinear” situation where the spatial
momenta are proportional to each other (possibly with opposite sign).
9 Scattering from correlation functions in quantum field theory
In the last lecture we studied scattering theory in quantum mechanics. In particular we encountered the
idea of “in” and “out” states |α, ±⟩ and the S-matrix
S_{βα} = ⟨β, −|α, +⟩. (9.1)
We also learned how to convert the S-matrix into observable transition rates such as the differential cross-
section
dσ(α → β) = u_α^{−1} (2π)^d δ^d(p_β − p_α)|M_{βα}|² dβ (9.2)
for two-particle scattering. In this lecture we return to quantum field theory, looking to answer two questions:
1. How can we tell when an interacting quantum field theory has a scattering description in terms of
particles?
2. In quantum field theories which do have a scattering description, how can we compute the S-matrix
starting from the correlation functions?
We will see that the answer to the first of these questions is that the existence of particles in a quantum
field theory leads to poles in the Fourier transform of its two-point functions, and the answer to the second
is given by the LSZ reduction formula. Once we establish these tools, we will finally be in a position to
compute genuine observables in interacting quantum field theories for comparison with experiment!61

We begin by studying the ε-regulated Fourier transform of the time-ordered two-point function,

⟨T O₂^{a₂}(k₂) O₁^{a₁}(k₁)⟩_ε := ∫ dx₁ dx₂ e^{−i(k₁·x₁ + k₂·x₂)} e^{−ε|t₂−t₁|} ⟨Ω|T O₂^{a₂}(x₂) O₁^{a₁}(x₁)|Ω⟩,   (9.3)

where O₁^{a₁} and O₂^{a₂} are local operators that transform in irreducible representations of the Lorentz group:

U†(Λ) O_i^{a_i}(x) U(Λ) = Σ_{b_i} D_i^{a_i b_i}(Λ) O_i^{b_i}(Λ⁻¹x),   (9.4)

61 …you want to get to practical applications quickly you won't understand what you are doing, and in this class we've decided to take our time and learn things properly.
and thus

⟨Ω|O_i^{a_i}(x_i)|α, ±⟩ = e^{i p_α·x_i} ⟨Ω|O_i^{a_i}(0)|α, ±⟩
⟨α, ±|O_i^{a_i}(x_i)|Ω⟩ = e^{−i p_α·x_i} ⟨α, ±|O_i^{a_i}(0)|Ω⟩,   (9.8)
from which we have

⟨T O₂^{a₂}(k₂)O₁^{a₁}(k₁)⟩_ε = ∫ dα [ ∫_{t₂>t₁} dx₁ dx₂ e^{−i(k₁+p_α)·x₁ − i(k₂−p_α)·x₂ − ε(t₂−t₁)} ⟨Ω|O₂^{a₂}(0)|α, ±⟩⟨α, ±|O₁^{a₁}(0)|Ω⟩
   + (−1)^{f_O} ∫_{t₂<t₁} dx₁ dx₂ e^{−i(k₁−p_α)·x₁ − i(k₂+p_α)·x₂ + ε(t₂−t₁)} ⟨Ω|O₁^{a₁}(0)|α, ±⟩⟨α, ±|O₂^{a₂}(0)|Ω⟩ ].   (9.9)
The spatial integrals here give simple δ-functions, but the integrals over t1 and t2 are a bit trickier:
∫_{−∞}^{∞} dt₁ ∫_{t₁}^{∞} dt₂ e^{i(k₁⁰+p_α⁰−iϵ)t₁ + i(k₂⁰−p_α⁰+iϵ)t₂} = ∫_{−∞}^{∞} dt₁ e^{i(k₁⁰+k₂⁰)t₁} · i/(k₂⁰ − p_α⁰ + iϵ)
   = 2πδ(k₁⁰ + k₂⁰) × i/(k₂⁰ − p_α⁰ + iϵ)   (9.10)
and

∫_{−∞}^{∞} dt₂ ∫_{t₂}^{∞} dt₁ e^{i(k₁⁰−p_α⁰+iϵ)t₁ + i(k₂⁰+p_α⁰−iϵ)t₂} = ∫_{−∞}^{∞} dt₂ e^{i(k₁⁰+k₂⁰)t₂} · i/(k₁⁰ − p_α⁰ + iϵ)
   = 2πδ(k₁⁰ + k₂⁰) × i/(k₁⁰ − p_α⁰ + iϵ).   (9.11)
We thus have

⟨T O₂^{a₂}(k₂)O₁^{a₁}(k₁)⟩_ε = ∫ dα [ (2π)^{d−1}δ^{d−1}(k⃗₂ − p⃗_α)(2π)^{d−1}δ^{d−1}(k⃗₁ + p⃗_α) 2πδ(k₁⁰+k₂⁰) · i/(k₂⁰ − p_α⁰ + iϵ) · ⟨Ω|O₂^{a₂}(0)|α, ±⟩⟨α, ±|O₁^{a₁}(0)|Ω⟩
   + (−1)^{f_O} (2π)^{d−1}δ^{d−1}(k⃗₂ + p⃗_α)(2π)^{d−1}δ^{d−1}(k⃗₁ − p⃗_α) 2πδ(k₁⁰+k₂⁰) · i/(k₁⁰ − p_α⁰ + iϵ) · ⟨Ω|O₁^{a₁}(0)|α, ±⟩⟨α, ±|O₂^{a₂}(0)|Ω⟩ ].   (9.12)
The key things to notice here are the pole factors i/(k₂⁰ − p_α⁰ + iϵ) and i/(k₁⁰ − p_α⁰ + iϵ): what we will now show is that the contribution to the α integral coming from one-particle states turns these poles into poles of the momentum-space correlation function ⟨T O₂^{a₂}(k₂)O₁^{a₁}(k₁)⟩_ε. For one-particle states we simply have
∫ dα = Σ_{σ,n} ∫ d^{d−1}p/(2π)^{d−1}   (9.13)

and

p_α = (ω_{n,p⃗}, p⃗),   (9.14)

with

ω_{n,p⃗} = √(|p⃗|² + m_n²).   (9.15)
Evaluating the momentum integrals we thus find

⟨T O₂^{a₂}(k₂)O₁^{a₁}(k₁)⟩_ε ⊃ (2π)^d δ^d(k₂ + k₁) Σ_{n,σ} ( i/(k₂⁰ − ω_{n,k⃗₂} + iϵ) · ⟨Ω|O₂^{a₂}(0)|k₂, σ, n⟩⟨k₂, σ, n|O₁^{a₁}(0)|Ω⟩
   + (−1)^{f_O} i/(k₁⁰ − ω_{n,k⃗₁} + iϵ) · ⟨Ω|O₁^{a₁}(0)|k₁, σ, n⟩⟨k₁, σ, n|O₂^{a₂}(0)|Ω⟩ ).   (9.16)
The two-point function therefore has a pole whenever the external momenta go “on-shell” for any particle
species n for which the matrix elements do not vanish. This may seem technical, but in fact it is profound:
The way we tell if a quantum field theory has particles is we look for on-shell poles in the Fourier
transform of the time-ordered two-point function: they exist if and only if the theory has one-particle
states, and we can determine the masses of the particles from the locations of the poles.
In particular I want to emphasize that nothing in this derivation assumed that the particles are “fundamental”
in the sense of being associated with fields in the Lagrangian, for example in QED the hydrogen atom
contributes poles to two-point functions and the same is true for protons in QCD.
What about other contributions to the α integral? The vacuum contribution vanishes because of (9.5). We will not try to show it systematically, but the multi-particle states contribute only branch cuts, which in massive theories are away from the “on-shell” poles at k₂⁰ = ±ω_{k⃗₂}, so the pole contributions come only from one-particle states. The basic idea is that the integral

∫_{z_min}^{z_max} dz/(k⁰ − z + iϵ) = log( (k⁰ − z_min + iϵ)/(k⁰ − z_max + iϵ) )   (9.17)

has only logarithmic branch singularities, where here z stands for the energy integrated over the additional momenta in a multi-particle state; we'd have z_min = m_n + ω_{n,k⃗}, so the branch point is at k⁰ = m_n + ω_{n,k⃗}, which is different from ω_{n,k⃗} unless m_n = 0. When m_n = 0 some more care is needed, but in essence we can still distinguish a pole from a branch point even if they are right on top of each other.
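The logarithm in (9.17) can be checked numerically; here is a minimal sketch with arbitrary test values, choosing k⁰ inside the integration range so that the iϵ prescription actually matters:

```python
import numpy as np

# Numerical check of (9.17): integrating a pole 1/(k0 - z + iε) over a
# continuum of states produces only a logarithmic branch cut, not a pole.
k0, eps, zmin, zmax = 1.2, 1e-3, 0.5, 3.0
z = np.linspace(zmin, zmax, 2_000_001)         # step << eps resolves the peak
numeric = np.trapz(1.0 / (k0 - z + 1j * eps), z)
exact = np.log((k0 - zmin + 1j * eps) / (k0 - zmax + 1j * eps))
assert abs(numeric - exact) < 1e-3
```

As ϵ → 0 the imaginary part of this expression tends to a finite −iπ jump across the cut, which is qualitatively milder than the 1/ϵ blow-up of an on-shell pole; that difference is what lets us separate one-particle poles from multi-particle cuts.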
Here I've put a hat on the little group representation matrices D̂^n to distinguish them from the operator representation matrices D appearing in (9.4), and used the little group transformation

U(Λ)|k, σ, n⟩ = Σ_{σ'} D̂^n_{σ'σ}(Λ)|k, σ', n⟩.   (9.19)
or equivalently

D(Λ)T = T D̂^n(Λ).   (9.21)

Similarly we have

⟨k, σ, n|O^a(0)|Ω⟩ = Σ_{a',σ'} D^{aa'}(Λ⁻¹) D̂^{n*}_{σ'σ}(Λ) ⟨k, σ', n|O^{a'}(0)|Ω⟩,   (9.22)
obeys

D(Λ)T̃ = T̃ D̂^{n*}(Λ).   (9.24)

We have taken both D and D̂^n to form irreducible representations of the little group, and equations (9.21)
and (9.24) can be interpreted in terms of group theory as saying that the matrices T and Te are intertwiners
between these irreducible representations. More precisely, T is an intertwiner from the D̂ representation to
the D representation and Te is an intertwiner from the conjugate of the D̂ representation to the D repre-
sentation. It is a theorem in group theory that nonzero intertwiners between finite-dimensional irreducible
representations exist only if the representations are equivalent by a similarity transformation, and moreover
that even in this case the intertwiner is unique up to an overall constant factor.63
In fact we have already discussed these intertwiners, in the context of free field theory. Given a particle
type n, we argued in the first two lectures that to make a relativistic quantum theory we should begin by
constructing a free field which annihilates that particle with the form
Φ^a(x) = Σ_σ ∫ d^{d−1}p/(2π)^{d−1} · 1/√(2ω_{n,p⃗}) [ u^a(p⃗, σ, n) e^{ip·x} a_{p⃗σn} + v^a(p⃗, σ, n^c) e^{−ip·x} a†_{p⃗σn^c} ].   (9.25)
Here nc is the antiparticle of n (which may coincide with n or not), and the functions ua and v a are chosen
so that the field commutes/anticommutes at spacelike separation and we have the Lorentz transformation
U†(Λ) Φ^a(x) U(Λ) = Σ_{a'} D^{aa'}(Λ) Φ^{a'}(Λ⁻¹x).   (9.26)
Due to our little group transformation (9.19) we see that the creation and annihilation operators in this field
must transform as
U(Λ) a†_{p⃗σn^c} U(Λ⁻¹) = Σ_{σ'} D̂^{n^c}_{σ'σ}(Λ) a†_{p⃗σ'n^c}
U(Λ) a_{p⃗σn} U(Λ⁻¹) = Σ_{σ'} D̂^{n*}_{σ'σ}(Λ) a_{p⃗σ'n},   (9.27)
where Λ is in the little group for p. In order for the field to have the Lorentz transformation (9.26), these
transformations must combine with ua and v a to give
Σ_{σ'} u^a(p⃗, σ', n) D̂^n_{σ'σ}(Λ) = Σ_{a'} D^{aa'}(Λ) u^{a'}(p⃗, σ, n)
Σ_{σ'} v^a(p⃗, σ', n^c) D̂^{n^c*}_{σ'σ}(Λ) = Σ_{a'} D^{aa'}(Λ) v^{a'}(p⃗, σ, n^c).   (9.28)
In other words u^a and v^a are intertwiners, so by the uniqueness of intertwiners they must be proportional to T and T̃ respectively:

⟨Ω|O^a(0)|k, σ, n⟩ = A_n(k⃗) u^a(k⃗, σ, n)
⟨k, σ, n^c|O^a(0)|Ω⟩ = Ã_{n^c}(k⃗) v^a(k⃗, σ, n^c).   (9.29)
We can learn more about the proportionality functions A_n and Ã_{n^c} by considering general Lorentz transformations.64 For general Λ instead of (9.18) we have

⟨Ω|O^a(0)|k, σ, n⟩ = √(ω_{n,k⃗_Λ}/ω_{k⃗}) Σ_{a',σ'} D^{aa'}(Λ⁻¹) D̂^n_{σ'σ}(L⁻¹_{Λk} Λ L_k) ⟨Ω|O^{a'}(0)|Λk, σ', n⟩,   (9.30)
63 If you know a little representation theory the proof is a fairly straightforward application of Schur's lemmas; see theorem […]. The point is that the uniqueness result for intertwiners only applies to finite-dimensional irreducible representations, and it is only the little group which acts in a finite-dimensional representation on particle states.
which from (9.29) tells us that

A_n(k⃗) u^a(k⃗, σ, n) = √(ω_{n,k⃗_Λ}/ω_{k⃗}) A_n(k⃗_Λ) Σ_{a',σ'} D^{aa'}(Λ⁻¹) D̂^n_{σ'σ}(L⁻¹_{Λk} Λ L_k) u^{a'}(k⃗_Λ, σ', n).   (9.31)
Here we are using the notation that Λ(ωn,⃗k , ⃗k) = (ωn,⃗kΛ , ⃗kΛ ), and as in the previous lecture Lp is the
Lorentz transformation which maps a reference momentum to p. Similarly from the transformation of
⟨k, σ, n^c|O^a(0)|Ω⟩ we have

Ã_{n^c}(k⃗) v^a(k⃗, σ, n^c) = √(ω_{n,k⃗_Λ}/ω_{k⃗}) Ã_{n^c}(k⃗_Λ) Σ_{a',σ'} D^{aa'}(Λ⁻¹) D̂^{n^c*}_{σ'σ}(L⁻¹_{Λk} Λ L_k) v^{a'}(k⃗_Λ, σ', n^c).   (9.32)
We can simplify these by noting that in general the transformations of the creation and annihilation operators are

U(Λ) a†_{p⃗σn^c} U(Λ⁻¹) = √(ω_{n,p⃗_Λ}/ω_{p⃗}) Σ_{σ'} D̂^{n^c}_{σ'σ}(L⁻¹_{Λp} Λ L_p) a†_{p⃗_Λσ'n^c}
U(Λ) a_{p⃗σn} U(Λ⁻¹) = √(ω_{n,p⃗_Λ}/ω_{p⃗}) Σ_{σ'} D̂^{n*}_{σ'σ}(L⁻¹_{Λp} Λ L_p) a_{p⃗_Λσ'n},   (9.33)
and that the Lorentz transformation (9.26) for the free field then implies (extra credit homework) that we
have
Σ_{σ'} u^a(k⃗_Λ, σ', n) D̂^n_{σ'σ}(L⁻¹_{Λk} Λ L_k) = Σ_{a'} D^{aa'}(Λ) u^{a'}(k⃗, σ, n)
Σ_{σ'} v^a(k⃗_Λ, σ', n^c) D̂^{n^c*}_{σ'σ}(L⁻¹_{Λk} Λ L_k) = Σ_{a'} D^{aa'}(Λ) v^{a'}(k⃗, σ, n^c).   (9.34)
Substituting these into (9.31) and (9.32), we learn that

√(ω_{n,k⃗}) A_n(k⃗) = √(ω_{n,k⃗_Λ}) A_n(k⃗_Λ),   √(ω_{n,k⃗}) Ã_{n^c}(k⃗) = √(ω_{n,k⃗_Λ}) Ã_{n^c}(k⃗_Λ),   (9.35)

and thus

A_n(k⃗) = Z_n/√(2ω_{n,k⃗}),   Ã_{n^c}(k⃗) = Z̃_{n^c}/√(2ω_{n,k⃗}),   (9.36)
where Z_n and Z̃_{n^c} are pure numbers. Moreover Z_n and Z̃_{n^c} are related by CRT symmetry: by CRT we have

⟨Ω|O^a(0)|k, σ, n⟩ = ⟨Θ†_{CRT} kσn|(Θ†_{CRT} O^a(0) Θ_{CRT})†|Ω⟩,   (9.37)

where ⟨Θ†_{CRT} kσn| is the bra dual to the ket Θ†_{CRT}|kσn⟩ and we have used that Θ_{CRT} is antiunitary and leaves the vacuum unchanged. Recalling the CRT transformation (9.38) of O^a, and also that CRT maps particles to antiparticles, we see that (9.37) gives a proportionality relation between Z_n and Z̃_{n^c}. Working out the proportionality coefficient this way is a bit tricky (we'd need to sort out how
CRT acts on one-particle states), but once we know such a relation exists we can instead just determine the
coefficient by comparing to free field theory. There we have
⟨Ω|Φ^a(0)|k, σ, n⟩ = (1/√(2ω_{n,k⃗})) u^a(k⃗, σ, n)
⟨k, σ, n^c|Φ^a(0)|Ω⟩ = (1/√(2ω_{n,k⃗})) v^a(k⃗, σ, n^c),   (9.39)
which up to the overall factor of Zn are rather remarkably the same as we would have obtained simply from
replacing Oa by Φa and using free field theory!
⟨T O^{a₂}(k₂) O^{a₁†}(k₁)⟩_ε ⊃ (2π)^d δ^d(k₂ + k₁) Σ_n |Z_n|² ( i/(k₂⁰ − ω_{n,k⃗₂} + iϵ) · (1/(2ω_{n,k⃗₂})) Σ_σ u^{a₂}(k⃗₂, σ, n) u^{a₁*}(k⃗₂, σ, n)
   + (−1)^{f_O} i/(k₁⁰ − ω_{n,k⃗₁} + iϵ) · (1/(2ω_{n,k⃗₁})) Σ_σ v^{a₂}(k⃗₁, σ, n^c) v^{a₁*}(k⃗₁, σ, n^c) ),   (9.42)
where I again remind you that we are focusing on the contribution of the on-shell pole. You will show in the homework that up to the factor of |Z_n|², this is precisely the Fourier transform of the time-ordered two-point function of the free field (9.25).65
Thus we see that in the vicinity of an on-shell pole, the exact two-point function in any quantum field theory
with a particle description just becomes that of free field theory up to an overall factor! It is important to
make several comments about this however:
• In general the particles appearing here have nothing to do with the fields appearing in the Lagrangian. The free fields we are discussing here may thus look nothing like the “true” fields appearing in the Lagrangian.
65 If there are multiple types of particle with exactly the same mass and spin/helicity (besides just n and nc ) then O a could
create a superposition of them, in which case there could still be a sum over some subset of n here. In this case however we
can just redefine our basis of particle types to treat this superposition as its own type of particle. Typically this situation only
arises when there is a global symmetry to enforce the degeneracy, in which case this redefinition will just be a global symmetry
transformation.
• On the other hand, in situations where the interactions are weak and we are interested in particles which do correspond to fundamental fields, we indeed can (and will) choose O^a to just be the fundamental field for the particle in question. It cannot be emphasized enough however that the mass m_n appearing in this formula is the genuine particle mass, not the mass parameter appearing in the Lagrangian. We saw already at one loop in ϕ⁴ theory that these are not the same.
• Moreover the two-point function of a field whose kinetic term in the Lagrangian is normalized as −½∂_μϕ∂^μϕ (in the scalar case) will NOT have Z_n = 1 in the interacting theory. We have not yet computed enough loop diagrams to see this happen, but we eventually will (unfortunately in ϕ⁴ theory one needs to go to two loops to see it). Rescaling the fundamental field to give us something whose two-point function doesn't have a factor of |Z_n|² is (for historical reasons) called wave function renormalization.
• In a situation where the particle we want to create is not fundamental, it may not seem so clear which operator O^a we should use to get a two-point function with a non-vanishing Z_n. In fact it is easy: we simply look for any local operator with the same symmetry charges that a free field annihilating that particle would have. So for example in QCD it is easy to construct a local operator out of quark and gluon fields with the same symmetry transformations as a field that would annihilate the proton, and we can just use that operator, even though it undoubtedly will also create all sorts of mess in the multiparticle states, which does not contribute to the pole.
We now consider the ε-regulated Fourier transform of a time-ordered (M+N)-point function,

⟨Ω|T O'_N^{a_N}(k'_N) … O'_1^{a_1}(k'_1) O_1^{b_1†}(k_1) … O_M^{b_M†}(k_M)|Ω⟩_ε := ∫ dx_1 … dx_{M+N} e^{−i(k_1·x_1 + … + k_M·x_M + k'_1·x_{M+1} + … + k'_N·x_{M+N})}
   × ⟨Ω|T O'_N^{a_N}(x_{M+N}) … O'_1^{a_1}(x_{M+1}) O_1^{b_1†}(x_1) … O_M^{b_M†}(x_M)|Ω⟩
   × e^{−ϵ(t_max − t_min)},   (9.44)

where O_1^{b_1}, …, O_M^{b_M} and O'_1^{a_1}, …, O'_N^{a_N} are Heisenberg operators in some quantum field theory with a scattering description that transform in irreducible representations of the Lorentz group. In the ϵ-regulator,
t_min and t_max are the least and greatest of the times t_1, …, t_{M+N}. We will eventually arrange so that k_1, …, k_M are the spacetime momenta of the ingoing particles and k'_1, …, k'_N are the spacetime momenta of the outgoing particles. Very roughly, we'll see that you can think of the O_i†(k_i) as creation operators for particles in an “in” state and the O'_i(k'_i) as annihilation operators for particles in an “out” state. Our goal is to show that this object has a multi-dimensional pole as we take the external momenta to be on-shell. The pole of interest arises from the region of integration where

t_1 ≤ t_2 ≤ … ≤ t_{M+N},   (9.46)

and is a generalization of the first term in (9.42) (other regions of integration give poles where some of the k_i⁰ are equal to ω_{n_i,k⃗_i} and some of the k'_i⁰ are equal to −ω_{n'_i,k⃗'_i}). Focusing on this region of the integral, and
defining
G_ε := ⟨Ω|T O'_N^{a_N}(k'_N) … O'_1^{a_1}(k'_1) O_1^{b_1†}(k_1) … O_M^{b_M†}(k_M)|Ω⟩_ε   (9.47)
to save space, we can insert complete sets of scattering states to get
G_ε ⊃ ∫_{t_1≤t_2≤…≤t_{M+N}} dx_1 … dx_{M+N} ∫ dα_1 … dα_M dβ_1 … dβ_N e^{−i(k_1·x_1 + … + k_M·x_M + k'_1·x_{M+1} + … + k'_N·x_{M+N}) − ϵ(t_{M+N}−t_1)}
   × ⟨Ω|O'_N^{a_N}(x_{M+N})|β_1⟩ … ⟨β_{N−1}|O'_1^{a_1}(x_{M+1})|β_N⟩ ⟨β_N|α_M⟩ ⟨α_M|O_1^{b_1†}(x_1)|α_{M−1}⟩ … ⟨α_1|O_M^{b_M†}(x_M)|Ω⟩,   (9.48)
where to save more space I’ve here adopted a convention that α states are “in” states and β states are “out”
states. Note the appearance of the M -particle to N -particle S-matrix ⟨βN |αM ⟩; our goal is now to show that
this can be extracted by isolating the on-shell pole.
As before (cf. (9.8)) we can extract the position dependence of the matrix elements in scattering states.
The integrals over spatial positions now give simple δ-functions, but the integrals over time require a bit
more work. Defining
T := ∫_{−∞}^{∞} dt_1 e^{i(k_1⁰ + p⁰_{α_M} − p⁰_{α_{M−1}} − iϵ)t_1} ∫_{t_1}^{∞} dt_2 e^{i(k_2⁰ + p⁰_{α_{M−1}} − p⁰_{α_{M−2}})t_2} … ∫_{t_{M−1}}^{∞} dt_M e^{i(k_M⁰ + p⁰_{α_1})t_M}
   × ∫_{t_M}^{∞} dt_{M+1} e^{i(k'_1⁰ − p⁰_{β_N} + p⁰_{β_{N−1}})t_{M+1}} … ∫_{t_{M+N−2}}^{∞} dt_{M+N−1} e^{i(k'⁰_{N−1} − p⁰_{β_2} + p⁰_{β_1})t_{M+N−1}} ∫_{t_{M+N−1}}^{∞} dt_{M+N} e^{i(k'⁰_N − p⁰_{β_1} + iϵ)t_{M+N}},   (9.51)
k_tot = k_1 + … + k_M
k'_tot = k'_1 + … + k'_N.   (9.53)
Evaluating the spatial integrals, we thus have
G_ε ⊃ ∫ dα_1 … dα_M dβ_1 … dβ_N ⟨β_N|α_M⟩ T
   × (2π)^{d−1}δ^{d−1}(k⃗_1 + p⃗_{α_M} − p⃗_{α_{M−1}}) … (2π)^{d−1}δ^{d−1}(k⃗_M + p⃗_{α_1})
   × (2π)^{d−1}δ^{d−1}(k⃗'_1 − p⃗_{β_N} + p⃗_{β_{N−1}}) … (2π)^{d−1}δ^{d−1}(k⃗'_N − p⃗_{β_1})
   × ⟨Ω|O'_N^{a_N}(0)|β_1⟩ … ⟨β_{N−1}|O'_1^{a_1}(0)|β_N⟩ ⟨α_M|O_1^{b_1†}(0)|α_{M−1}⟩ … ⟨α_1|O_M^{b_M†}(0)|Ω⟩.   (9.54)
The spatial δ-functions set

p⃗_{α_1} = −k⃗_M
p⃗_{α_2} = −(k⃗_M + k⃗_{M−1})
   ⋮
p⃗_{α_M} = −k⃗_tot,   (9.55)

and

p⃗_{β_1} = k⃗'_N
p⃗_{β_2} = k⃗'_N + k⃗'_{N−1}
   ⋮
p⃗_{β_N} = k⃗'_tot.   (9.56)
To make sure we get a pole of maximum strength we should choose the multiparticle states appearing in
(9.54) to ensure that the answer has no remaining momentum integrals. The way to do this is to take α_1 and β_1 to be one-particle states, α_2 and β_2 to be two-particle states, and so on.
To proceed further we need to say something about the matrix elements of the O operators. For each O
matrix element the bra has one fewer particle than the ket, so we are interested in the part of O which is
proportional to an annihilation operator. In the previous section we saw that we can write this as
O^a(0) = Σ_{n,σ} Z_n ∫ d^{d−1}p/(2π)^{d−1} · u^a(p⃗, σ, n)/√(2ω_{n,p⃗}) · a_{p⃗,σ,n}.   (9.57)
Similarly for each of the O†s the bra has one more particle than the ket, so we are interested in the part of O† which is proportional to a creation operator, which is given by

O^{b†}(0) = Σ_{n,σ} Z_n^* ∫ d^{d−1}p/(2π)^{d−1} · u^{b*}(p⃗, σ, n)/√(2ω_{n,p⃗}) · a†_{p⃗,σ,n}.   (9.58)
To simplify life we'll assume that we've chosen either our particle basis or our operators O such that Z_n is nonzero for only one n with a given mass and spin, in which case we can drop the sum on n in these expressions. There is then only one way to satisfy all of the spatial δ-functions: O_M^{b_M†} must create a particle of spatial momentum −k⃗_M, O_{M−1}^{b_{M−1}†} must create a particle of spatial momentum −k⃗_{M−1}, and so on, and similarly O'_1^{a_1} must annihilate a particle of momentum k⃗'_1, O'_2^{a_2} must annihilate a particle of momentum k⃗'_2, and so on. We can therefore simplify the quantity T from the time integrals:
T = 2πδ(k_M⁰ + ω_{n_M,k⃗_M}) × (−i)/(k_1⁰ + ω_{n_1,k⃗_1} − iϵ) … (−i)/(k⁰_{M−1} + ω_{n_{M−1},k⃗_{M−1}} − iϵ) × i/(k'_1⁰ − ω_{n'_1,k⃗'_1} + iϵ) … i/(k'⁰_N − ω_{n'_N,k⃗'_N} + iϵ).   (9.59)
It may seem strange that we have treated k_M⁰ differently than all the other energies; this is because momentum conservation doesn't let us really vary all the momenta independently. We can restore the symmetry by using

2πδ(k_M⁰ + ω_{n_M,k⃗_M}) = i/(k_M⁰ + ω_{n_M,k⃗_M} + iϵ) − i/(k_M⁰ + ω_{n_M,k⃗_M} − iϵ),   (9.60)
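The δ-function identity (9.60) is just the ϵ → 0 limit of a Lorentzian, which can be verified by smearing against a test function; a minimal sketch (the Gaussian test function and the value of ϵ are arbitrary choices):

```python
import numpy as np

# Check (9.60):  i/(x+iε) - i/(x-iε) = 2ε/(x²+ε²)  →  2πδ(x) as ε → 0,
# by integrating against f(x) = e^{-x²}, which should give ≈ 2π f(0) = 2π.
eps = 1e-3
x = np.linspace(-50.0, 50.0, 4_000_001)        # step << eps resolves the peak
lorentzian = (1j / (x + 1j * eps) - 1j / (x - 1j * eps)).real  # = 2ε/(x²+ε²)
smeared = np.trapz(np.exp(-x**2) * lorentzian, x)
assert abs(smeared - 2 * np.pi) < 0.05
```

The small residual error is O(ϵ), consistent with the Lorentzian approaching a δ-function only in the ϵ → 0 limit.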
with the understanding that we are interested in the pole where k_M⁰ = −ω_{n_M,k⃗_M} − iϵ, in which case we can simply write

T ⊃ (−i)/(k_1⁰ + ω_{n_1,k⃗_1} − iϵ) … (−i)/(k_M⁰ + ω_{n_M,k⃗_M} − iϵ) × i/(k'_1⁰ − ω_{n'_1,k⃗'_1} + iϵ) … i/(k'⁰_N − ω_{n'_N,k⃗'_N} + iϵ).   (9.61)
Having now done all of the momentum integrals, we arrive at last at the LSZ formula:66

G_ε → [∏_{j=1}^{N} Σ_{σ'_j} Z_{n'_j} u^{b_j}(k⃗'_j, σ'_j, n'_j)/√(2ω_{n'_j,k⃗'_j}) × i/(k'_j⁰ − ω_{n'_j,k⃗'_j} + iϵ)]
   × [∏_{i=1}^{M} Σ_{σ_i} Z*_{n_i} u^{a_i*}(−k⃗_i, σ_i, n_i)/√(2ω_{n_i,k⃗_i}) × (−i)/(k_i⁰ + ω_{n_i,k⃗_i} − iϵ)]
   × ⟨k'_1, σ'_1, n'_1; …; k'_N, σ'_N, n'_N, −| −k_1, σ_1, n_1; …; −k_M, σ_M, n_M, +⟩

in the limit k_i⁰ → −ω_{n_i,k⃗_i}, k'_i⁰ → ω_{n'_i,k⃗'_i}.
The third line here is just the S-matrix with arbitrary external particles, so by stripping off the factors in
the first two lines we can directly extract it! The only remaining difficulty is how to “undo” the sums over
spin/helicity; there is a standard way to do this but we will postpone discussion of it until we discuss fields
with nonzero spin more explicitly.
The LSZ formula is often written in a slightly more covariant way by noting that near the poles we have

i/(k'_j⁰ − ω_{n'_j,k⃗'_j} + iϵ) ≈ −i · 2ω_{n'_j,k⃗'_j}/(k'_j² + m²_{n'_j} − iϵ),   −i/(k_i⁰ + ω_{n_i,k⃗_i} − iϵ) ≈ −i · 2ω_{n_i,k⃗_i}/(k_i² + m²_{n_i} − iϵ).

Due to the pesky signs in the momenta of the “in” state here, one sometimes defines G_ε to have the opposite sign for k_i (anyways the signs in the Fourier transform are a matter of convention).
There are a few points which are worth making about this formula:
• The LSZ formula is completely non-perturbative, computing the exact S-matrix of the true asymptotic states of the theory. Choosing the operators O and O' in general requires you to know enough about the theory to be able to find a local operator that creates/annihilates each particle type n with nonzero Z_n. As mentioned above this is usually not difficult however: you just find an operator that has the right symmetry charges, and then a nonzero Z_n is generic.
66 LSZ stands for Lehmann, Symanzik, and Zimmermann. Their original paper from 1954 only treats the scalar case and assumes weak coupling. It is written in German, Ein Prosit if you can read it!
67 In many textbooks the “in” and “out” states are normalized in a more covariant way by absorbing the factors of √(2ω_{n'_j,k⃗'_j}) and √(2ω_{n_i,k⃗_i}) into their definitions, which makes this formula look even more covariant.
• Comparison to (9.42) may have you worried that we are only computing the S-matrix for particles and not antiparticles, but of course it is arbitrary which particles we view as antiparticles. Given a free field Φ^a(x) as in equation (9.25), we can exchange the roles of particles and antiparticles by taking the adjoint of the field. We have implicitly done this in our presentation of the LSZ formula since we have only the u^a intertwiners appearing, so in particular if the amplitude involves both a particle and its antiparticle then we have used an O analogous to Φ for the former and an O analogous to Φ† for the latter. If we instead want to adhere to some pre-existing convention for which particles are antiparticles (for example if we want positrons to be antiparticles), and we want to take all Os to be analogous to Φ, then in the LSZ formula we should make the replacements u^{a*} → v^a for each antiparticle in the initial state and u^a → v^{a*} for each antiparticle in the final state.
• Since we have related S-matrix elements to correlation functions, all symmetry constraints on correlation functions must imply symmetry constraints on S-matrix elements. For example if there is a U(1) global symmetry under which the incoming particles have charges q_1, q_2, … and the outgoing particles have charges q'_1, q'_2, …, then we must have

q_1 + q_2 + … + q_M = q'_1 + q'_2 + … + q'_N.   (9.66)
• The masses appearing in the poles are again the physical masses, not bare masses that appear in the Lagrangian. After all the latter do not even make sense for composite particles. The factors of Z_{n'} and Z_n are again called wave function renormalization.
In the next lecture we will learn how to use the LSZ formula to compute the S-matrix in weakly interacting
theories using Feynman diagrams.
9.5 Homework
1. Show that the expression (9.16) is compatible with our expression (2π)^d δ^d(k₂ + k₁) · (−i)/(k₂² + m² − iϵ) for the momentum-space Feynman propagator in free scalar field theory. You should take O₁(x) = O₂(x) = Φ(x), where Φ is a real free scalar field.
2. Consider the derivative ∂ µ Φ(x) of a free scalar field. From the point of view of this lecture this is just
as good of a candidate for a field that creates a free scalar particle as Φ(x) itself is. What are uµ and
v µ for the free field ∂ µ Φ? Show that these uµ and v µ obey the intertwiner equations (9.28) and (9.34).
3. Evaluate the Fourier transform of the time-ordered two-point function ⟨Ω|T Φ^{a₂}(x₂)Φ^{a₁†}(x₁)|Ω⟩ of a general free field as in equation (9.25), and show that it gives the right-hand side of (9.42) but without the Σ_n |Z_n|².
4. Evaluate the integrals in T and show that they lead to (9.52). Make sure to go from right to left, and
be prepared to use the δ-function at the end to rewrite the poles involving ki0 .
5. Extra credit: starting from the general Lorentz transformation properties of one-particle states, the
free-field expression (9.25), and the field transformation (9.26), derive the creation and annihilation
operator transformations (9.33) and the ua and v a transformations (9.34).
6. Extra extra credit: derive Z_n = Z̃_{n^c} directly from (9.37) and (9.38), without using free field theory. You will need to figure out how one-particle states transform under CRT, which requires you to think about how to analytically continue the machinery of the little group to Euclidean signature. (Disclosure: I tried this myself, but there was a sign I so far couldn't get to work out in the fermionic case.)
10 Scattering in perturbation theory
In the last lecture we met the LSZ formula relating the S-matrix to the Fourier transform of time-ordered
correlation functions:
⟨Ω|T O'_N^{a_N}(k'_N) … O'_1^{a_1}(k'_1) O_1^{b_1†}(k_1) … O_M^{b_M†}(k_M)|Ω⟩ → [∏_{j=1}^{N} Σ_{σ'_j} Z_{n'_j} u^{b_j}(k⃗'_j, σ'_j, n'_j) × (−i√(2ω_{n'_j,k⃗'_j}))/(k'_j² + m²_{n'_j} − iϵ)]
   × [∏_{i=1}^{M} Σ_{σ_i} Z*_{n_i} u^{a_i*}(−k⃗_i, σ_i, n_i) × (−i√(2ω_{n_i,k⃗_i}))/(k_i² + m²_{n_i} − iϵ)]
   × ⟨k'_1, σ'_1, n'_1; …; k'_N, σ'_N, n'_N, −| −k_1, σ_1, n_1; …; −k_M, σ_M, n_M, +⟩   (10.1)

in the limit k_i⁰ → −ω_{n_i,k⃗_i}, k'_i⁰ → ω_{n'_i,k⃗'_i}.
In this lecture we will learn how to use this formula in perturbation theory to compute the S-matrix. To simplify expressions we will restrict to particles with zero spin/helicity and take the operators O and O' to be scalars, in which case the formula simplifies to68

⟨Ω|T O'_N(k'_N) … O'_1(k'_1) O†_1(k_1) … O†_M(k_M)|Ω⟩ → [∏_{j=1}^{N} (−i Z_{n'_j} √(2ω_{n'_j,k⃗'_j}))/(k'_j² + m²_{n'_j} − iϵ)] × [∏_{i=1}^{M} (−i Z*_{n_i} √(2ω_{n_i,k⃗_i}))/(k_i² + m²_{n_i} − iϵ)]
   × ⟨k'_1, n'_1; …; k'_N, n'_N, −| −k_1, n_1; …; −k_M, n_M, +⟩   (10.2)

in the limit k_i⁰ → −ω_{n_i,k⃗_i}, k'_i⁰ → ω_{n'_i,k⃗'_i}.
Inverting this relation to solve for the S-matrix, we have

⟨k'_1, n'_1; …; k'_N, n'_N, −|k_1, n_1; …; k_M, n_M, +⟩_c = [∏_{j=1}^{N} (k'_j² + m²_{n'_j} − iϵ)/(−i Z_{n'_j}√(2ω_{n'_j,k⃗'_j}))] [∏_{i=1}^{M} (k_i² + m²_{n_i} − iϵ)/(−i Z*_{n_i}√(2ω_{n_i,k⃗_i}))]
   × ⟨Ω|T O'_N(k'_N) … O'_1(k'_1) O†_1(−k_1) … O†_M(−k_M)|Ω⟩_c,   evaluated at k_i⁰ → ω_{n_i,k⃗_i}, k'_i⁰ → ω_{n'_i,k⃗'_i},   (10.3)
where I’ve taken the liberty of flipping the sign of the ingoing momenta in the Fourier transform. I’ve also
taken the connected part of the S-matrix, which is defined in just the same way as the connected part of
the correlation functions and therefore can be computed by using the connected correlation function on
the right-hand side. Let’s study this formula specifically in the context of our interacting ϕ4 theory, with
Lagrangian density

L = −½ ∂_μϕ ∂^μϕ − (m_0²/2) ϕ² − (λ/4!) ϕ⁴.   (10.4)
We will take all of the Os and O′ s to just be Φ. Since this theory has only one kind of particle and Φ is
hermitian, we can further simplify (10.3) to

⟨k'_1; …; k'_N, −|k_1; …; k_M, +⟩_c = [∏_{j=1}^{N} (k'_j² + m² − iϵ)/(−iZ√(2ω_{k⃗'_j}))] [∏_{i=1}^{M} (k_i² + m² − iϵ)/(−iZ√(2ω_{k⃗_i}))]
   × ⟨Ω|T Φ(k'_N) … Φ(k'_1) Φ(−k_1) … Φ(−k_M)|Ω⟩_c,   evaluated at k_i⁰ → ω_{k⃗_i}, k'_i⁰ → ω_{k⃗'_i}.   (10.5)
68 It is easy to put back the spin, we will do it later when we consider particles of spin/helicity 1/2 and 1.
We’ve written m0 for the “bare” mass in the Lagrangian to distinguish it from the genuine physical mass m
which appears in equation (10.5). Recall that Z here is defined by

⟨Ω|Φ(0)|k⟩ = ⟨k|Φ(0)|Ω⟩ = Z/√(2ω_{k⃗}),   (10.6)

with

ω_{k⃗} = √(|k⃗|² + m²).   (10.7)
Z must be real by the hermiticity of Φ.
We learned a few lectures ago that in perturbation theory we can compute the Fourier transform of the connected time-ordered correlation functions of Φ using the momentum-space Feynman rules; the exact two-point function can be written as

⟨T Φ(p)Φ(p')⟩ = (2π)^d δ^d(p + p') · (−i)/(p² + m_0² + Σ(p²) − iϵ),   (10.8)

where Σ(p²) is called the self-energy. The δ-function here is a consequence of translation invariance, and the fact that Σ depends only on p² is a consequence of Lorentz invariance. We saw in the previous lecture that this correlation function has a pole at p² = −m², so the relationship between the bare and physical masses is determined by solving

m² = m_0² + Σ(−m²).   (10.9)

We also saw that the residue of this pole is −i(2π)^d δ^d(p + p')Z², so apparently we have

Z² = 1/(1 + Σ'(−m²)).   (10.10)
Figure 25: The one-particle-irreducible (1PI) decomposition of the two-point function. The full two-point
function is built out of a sum of increasing numbers of 1PI bubbles chained together by propagators, leading
to a geometric series.
We can then consider how to compute the self-energy perturbatively. In the free theory with λ = 0 we
have Σ = 0, so we can rewrite the momentum space propagator perturbatively:
−i/(p² + m_0² + Σ(p²) − iϵ) = (−i/(p² + m_0² − iϵ)) × (p² + m_0² − iϵ)/(p² + m_0² + Σ(p²) − iϵ)
   = (−i/(p² + m_0² − iϵ)) × 1/(1 + Σ(p²)/(p² + m_0² − iϵ))
   = (−i/(p² + m_0² − iϵ)) × [1 + (−iΣ(p²))(−i/(p² + m_0² − iϵ)) + ((−iΣ(p²))(−i/(p² + m_0² − iϵ)))² + …].   (10.11)
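The "unsumming" in (10.11) can be checked symbolically; a minimal sketch with D standing for p² + m₀² − iϵ:

```python
import sympy as sp

# Check the geometric series in (10.11): expanding the exact propagator
# -i/(D + Σ) in powers of Σ, with D = p² + m0² - iε, reproduces the chain
#   -i/D · Σ_k [(-iΣ)(-i/D)]^k   of 1PI self-energy insertions.
D, Sigma = sp.symbols('D Sigma')
exact = -sp.I / (D + Sigma)
chain = sum((-sp.I / D) * ((-sp.I * Sigma) * (-sp.I / D))**k for k in range(4))
# The two expressions agree up to terms of order Σ⁴:
diff = sp.simplify(sp.series(exact - chain, Sigma, 0, 4).removeO())
assert diff == 0
```

Each factor (−iΣ)(−i/D) is one 1PI bubble followed by one free propagator, so this identity is exactly the diagrammatic statement drawn in figure 25.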
In the last line here we have “unsummed” a geometric series to get a Taylor expansion in Σ, which we should
think of as being O(λ). We can then compare this expression to what we get from the Feynman diagram
expansion for the two-point function, for which the first few diagrams are shown in the first line of figure 25.
To isolate the contribution of Σ, we note that we can organize this series as a geometric sum by splitting it
up into “one-particle irreducible” (1PI) pieces. By definition a 1PI Feynman diagram is a connected diagram
with at least one interaction vertex and also the property that there is no internal link such that removing
that link splits the diagram into two connected components. The second, third, and fourth diagrams in the
first line of figure 25 are 1PI but the first and fifth are not. Comparing the 1PI decomposition to the last
line of equation (10.11), we see the following rule:
−iΣ(p²) = sum over two-point 1PI diagrams, with the external propagators and the momentum-conserving δ-function removed.
In particular at one loop the only contribution to Σ is the “snail” diagram (the first diagram in the third
line of figure 25), so we have
−iΣ(p²) = −(iλ/2) ∫ d^d q/(2π)^d · (−i)/(q² + m_0² − iϵ) + O(λ²)
   = −(iλ/2) G_F(0) + O(λ²).   (10.12)
Figure 26: Momentum labels for the sunset diagram.
just as we found back in lectures 10-11. We have now done better than we did then however, as we have seen by resumming an infinite sum of diagrams that this mass shift is indeed a shift of the pole location to all orders in perturbation theory.
You may have already noticed that at one loop Σ(p2 ) is actually independent of p2 , so from (10.10) we
see that
Z = 1 + O(λ2 ). (10.14)
To get a nonzero contribution to Z we need a 1PI diagram that has nontrivial p2 dependence once the external
propagators are removed, and the first diagram with this property is the two-loop “sunset” diagram (the
second diagram in the third line of figure 25). Choosing momentum labels as in figure 26, the contribution
of this diagram to Σ(p²) is

−iΣ(p²) ⊃ −(λ²/6) ∫ d^d q/(2π)^d ∫ d^d ℓ/(2π)^d · (−i)/(q² + m_0² − iϵ) · (−i)/(ℓ² + m_0² − iϵ) · (−i)/((p − ℓ − q)² + m_0² − iϵ).   (10.15)
You can see the explicit p2 dependence here in the third propagator. For now we won’t try to evaluate
the loop integrals. In the homework you’ll meet another scalar field theory which already has a nonzero
contribution to Z at one loop.
Figure 27: Contributions to the four-point function up to O(λ2 ). The diagrams in the second row are not
pruned, so we should remove them in computing the S-matrix.
which would give corrections to the external propagators, so to finish removing the exact two-point functions on the external legs we just need to divide by the free propagators. We thus have the following rule:70
√(2ω_{k⃗'_1}) … √(2ω_{k⃗'_N}) √(2ω_{k⃗_1}) … √(2ω_{k⃗_M}) ⟨k'_1; …; k'_N, −|k_1; …; k_M, +⟩_c = sum over all pruned connected Feynman diagrams
At one loop we then have the three diagrams in the third row, whose momenta we label as in figure 28,
which add up to
iM̃_c(k_1, k_2 → k'_1, k'_2) ⊃ −(λ²/2) ∫ d^d ℓ/(2π)^d · (−i)/(ℓ² + m_0² − iϵ) [ (−i)/((ℓ + k'_1 − k_1)² + m_0² − iϵ)
   + (−i)/((ℓ − k_1 − k_2)² + m_0² − iϵ) + (−i)/((ℓ + k'_2 − k_1)² + m_0² − iϵ) ].   (10.19)
We will learn how to evaluate this integral in the next lecture.
70 Pruned diagrams are usually instead called “amputated”, but pruning feels less gruesome to me.
Figure 28: Momentum labels for the pruned diagrams contributing to the 2 → 2 S-matrix at tree level and
one loop.
Here I_final is equal to zero if the final-state particles are distinguishable (i.e. if n'_1 ≠ n'_2) and equal to one if they are indistinguishable (i.e. if n'_1 = n'_2) (such a factor was part of the definition of dβ). We can use the spatial momentum-conserving δ-function to evaluate the integral over k⃗'_1, so we are left with

dσ(k_1, σ_1, n_1; k_2, σ_2, n_2 → k_1 + k_2 − k'_2, σ'_1, n'_1; k'_2, σ'_2, n'_2) = (1/2^{I_final}) · 1/(4√((k_1·k_2)² − m²_{n_1} m²_{n_2}))
   × 2πδ(−ω_{n_1,k⃗_1} − ω_{n_2,k⃗_2} + ω_{n'_1,k⃗'_1} + ω_{n'_2,k⃗'_2}) · (|M̃_c|²/(4ω_{n'_1,k⃗'_1} ω_{n'_2,k⃗'_2})) · d^{d−1}k'_2/(2π)^{d−1},   (10.23)
and the differential cross section is to be integrated over only k'_2. We will study this in the “center of mass frame” where

k⃗_2 = −k⃗_1 := k⃗   (10.25)

and

k⃗'_2 = −k⃗'_1 := k⃗'.   (10.26)

In this frame the total energy is

E_tot = ω_{n_1,k⃗} + ω_{n_2,k⃗},   (10.27)

and

√((k_1·k_2)² − m²_{n_1} m²_{n_2}) = |k⃗| E_tot.   (10.28)
The energy-conserving δ-function sets

E_tot = √(|k⃗'|² + m²_{n'_1}) + √(|k⃗'|² + m²_{n'_2}),   (10.29)

which has no solution if E_tot < m_{n'_1} + m_{n'_2} and has solution

|k⃗'| = √((E_tot² − m²_{n'_1} − m²_{n'_2})² − 4m²_{n'_1} m²_{n'_2}) / (2E_tot)   (10.30)

if E_tot ≥ m_{n'_1} + m_{n'_2}. In order to use the δ-function to simplify the differential cross section we need to rewrite it as a δ-function of |k⃗'|, by noting that
d/d|k⃗'| [√(|k⃗'|² + m²_{n'_1}) + √(|k⃗'|² + m²_{n'_2})] = |k⃗'|/√(|k⃗'|² + m²_{n'_1}) + |k⃗'|/√(|k⃗'|² + m²_{n'_2}) = |k⃗'| E_tot / (√(|k⃗'|² + m²_{n'_1}) √(|k⃗'|² + m²_{n'_2}))   (10.31)
and thus
\[
2\pi\delta\Big({-E_{tot}}+\sqrt{|k'|^2+m_{n'_1}^2}+\sqrt{|k'|^2+m_{n'_2}^2}\Big) = 2\pi\,\frac{\sqrt{|k'|^2+m_{n'_1}^2}\sqrt{|k'|^2+m_{n'_2}^2}}{|k'|E_{tot}}\,\delta\left(|k'|-\frac{\sqrt{\left(E_{tot}^2-m_{n'_1}^2-m_{n'_2}^2\right)^2-4m_{n'_1}^2m_{n'_2}^2}}{2E_{tot}}\right).
\tag{10.32}
\]
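As a sanity check, one can verify numerically that the root (10.30) indeed satisfies the energy-conservation condition (10.29); a small sketch with arbitrary test values:

```python
import math

# Check that |k'| from (10.30) satisfies (10.29), and that it reduces to
# sqrt(Etot^2/4 - m^2) when the final-state masses are equal. Test values
# are arbitrary.
def k_prime(Etot, m1, m2):
    num = (Etot**2 - m1**2 - m2**2)**2 - 4 * m1**2 * m2**2
    return math.sqrt(num) / (2 * Etot)

kp = k_prime(5.0, 1.0, 2.0)
energy = math.sqrt(kp**2 + 1.0) + math.sqrt(kp**2 + 4.0)
print(energy)                                           # reproduces Etot = 5
print(k_prime(10.0, 3.0, 3.0), math.sqrt(10.0**2 / 4 - 3.0**2))
```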
The integration measure is
\[
\frac{d^{d-1}k'}{(2\pi)^{d-1}} = (2\pi)^{-(d-1)}|k'|^{d-2}\,d|k'|\,d\Omega_{d-2}, \tag{10.33}
\]
where dΩ_{d−2} is the volume measure on a unit S^{d−2}. Putting this all together we see that the differential cross section (now only to be integrated over the angular coordinates on S^{d−2}) is
\[
\frac{d\sigma}{d\Omega_{d-2}} = \frac{1}{2^{I_{final}}}\,\frac{1}{(2\pi)^{d-2}}\,\frac{|k'|^{d-3}}{16|k|E_{tot}^2}\,|\widetilde{\mathcal M}_c|^2, \tag{10.34}
\]
with |k ′ | and Etot being given in terms of |k| by equations (10.30) and (10.27).
Returning now to our ϕ⁴ theory, since all external masses are equal equation (10.30) simplifies to
\[
|k'| = |k| \tag{10.35}
\]
and so we have
\[
\frac{d\sigma}{d\Omega_{d-2}} = \frac{1}{2}\,\frac{1}{(2\pi)^{d-2}}\,\frac{|k|^{d-4}}{16E_{tot}^2}\,|\widetilde{\mathcal M}_c|^2. \tag{10.36}
\]
At tree level |M̃_c|² = λ² + O(λ³), so
\[
\frac{d\sigma}{d\Omega_{d-2}} = \frac{1}{2}\,\frac{1}{(2\pi)^{d-2}}\,\frac{|k|^{d-4}}{16E_{tot}^2}\left(\lambda^2+O(\lambda^3)\right). \tag{10.37}
\]
This answer is independent of angle, so the outgoing particles are equally likely to come out in any direction.
The total cross section σ is therefore just the differential cross section times the volume⁷¹
\[
\Omega_{d-2} = \frac{2\pi^{(d-1)/2}}{\Gamma\!\left(\frac{d-1}{2}\right)} \tag{10.38}
\]
of a unit S^{d−2}:
\[
\sigma = \frac{\Omega_{d-2}\,|k|^{d-4}}{32(2\pi)^{d-2}E_{tot}^2}\left(\lambda^2+O(\lambda^3)\right). \tag{10.39}
\]
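The sphere-volume formula (10.38) is easy to check against the familiar low-dimensional cases; a quick sketch:

```python
import math

# Check of (10.38): the volume of a unit S^n is 2 pi^{(n+1)/2} / Gamma((n+1)/2).
# Omega_{d-2} in the text corresponds to omega(d - 2) here.
def omega(n):
    return 2 * math.pi ** ((n + 1) / 2) / math.gamma((n + 1) / 2)

print(omega(0))  # S^0 is two points: 2
print(omega(1))  # circumference of the unit circle: 2*pi
print(omega(2))  # area of the unit two-sphere: 4*pi
```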
At one loop the differential cross section becomes angle-dependent, leading to a more interesting differential
cross section. In the homework you will consider another scalar field theory which already at tree level has
an angle-dependent differential cross section.
It is interesting to consider the high- and low-energy limits of the tree-level cross section as a function of incident energy. This dependence is given by
\[
\sigma \propto \lambda^2\,\frac{|k|^{d-4}}{|k|^2+m^2}. \tag{10.40}
\]
To get a sense of the real strength of the interactions we should compare σ to some other quantity with units of area, and at high energies the only such quantity available is |k|^{−(d−2)}. We thus can get a rough estimate of the interaction strength at high energy by
\[
\sigma\,|k|^{d-2} \propto \lambda^2\,|k|^{2(d-4)}. \tag{10.41}
\]
Thus for d > 4 the interaction strength grows with energy, and one might worry whether the theory really
makes sense at short distances (it probably doesn’t). In the massless limit this scaling also controls the
theory at low energies, and so when d < 4 the interaction strength grows at low energies in the massless
case. Thus for d < 4 perturbation theory will not be valid at low energy and we will need to use some more
exotic technique. In particular this is true for d = 3, and the strongly-interacting theory one reaches at low
energy in that case governs the behavior of classical Ising magnets in three spatial dimensions. In the case
of d = 4 the interaction strength is constant, and then we need to go to higher order in perturbation theory to see what happens. We will soon see that at one loop the interactions grow logarithmically with energy in d = 4. This kind of argument is made more precise using the idea of the renormalization group, which we will return to soon.
10.4 Homework
1. Derive equation (10.38) for the volume of a sphere in general dimensions. Hint: The easiest way to do this is to evaluate the multi-dimensional Gaussian integral ∫ d^{d−1}x e^{−|x|²} in both Cartesian and spherical coordinates.
2. Consider the scalar field theory with Lagrangian
\[
\mathcal L = -\frac{1}{2}\partial_\mu\phi\partial^\mu\phi-\frac{m_0^2}{2}\phi^2-\frac{g_0}{3!}\phi^3 \tag{10.42}
\]
with g₀ > 0. Non-perturbatively this theory is rather sick, as the naive ground state near ϕ = 0 can decay by tunneling through the barrier and then rolling down to ϕ = −∞. It is also rather fine-tuned,
as we arbitrarily didn’t write down a linear term proportional to ϕ in the potential that would have
been consistent with all the symmetries of the theory. Nonetheless it is a useful model for playing with
Feynman diagrams, which are not sophisticated enough to see the non-perturbative instability. In fact
some textbooks (such as Srednicki) use this theory as their primary example of an interacting field
theory, as the Feynman diagrams are more similar to those of QED.
(a) Make a sketch of the potential for the field in this theory.
(b) Draw the Feynman diagrams which contribute to the self energy Σ(p2 ) up through two loops.
(c) Draw the Feynman diagrams which contribute to the 2 → 2 scattering amplitude up through one
loop.
(d) Evaluate the tree-level 2 → 2 scattering amplitude and differential cross section. In what space-
time dimension is the cross section measured in units of the wavelength roughly constant at large
energies?
11 Loop diagrams
We’ve now learned how to compute the perturbative S-matrix and perturbative correlation functions in
quantum field theory. In particular we wrote down several one- and two- loop Feynman integrals, but so far
we have not attempted to actually integrate over any of the loop momenta. The goal of this lecture is to
rectify that.
We begin with the one-loop self-energy in the scalar theory with Lagrangian⁷²
\[
\mathcal L = -\frac{1}{2}\partial_\mu\phi\partial^\mu\phi-\frac{m_0^2}{2}\phi^2-\frac{\lambda_0}{4!}\phi^4, \tag{11.1}
\]
which we found in the previous lecture to be
\[
\Sigma(p^2) = \frac{\lambda_0}{2}\int\frac{d^dq}{(2\pi)^d}\,\frac{-i}{q^2+m_0^2-i\epsilon}. \tag{11.2}
\]
The first thing we will do with this integral is rotate the q⁰ contour to Euclidean signature by substituting q⁰_L = iq⁰_E, leading to
\[
\Sigma(p^2) = \frac{\lambda_0}{2}\int\frac{d^dq}{(2\pi)^d}\,\frac{1}{q^2+m_0^2}, \tag{11.3}
\]
where we can drop the iϵ since the denominator of the propagator is now positive-definite. We can rewrite
this in radial coordinates as
\[
\Sigma(p^2) = \frac{\lambda_0\Omega_{d-1}}{2(2\pi)^d}\int_0^\infty\frac{q^{d-1}\,dq}{q^2+m_0^2}, \tag{11.4}
\]
which is an integral that diverges at large q for d ≥ 2. To make sense of the integral we therefore need
to regulate it in some way. We will consider four regulators in turn, understanding the advantages and
disadvantages of each.
Our first regulator is the lattice: we restrict the (Euclidean) spacetime points to lie on the discrete set
\[
x^\mu \in a\mathbb{Z}, \tag{11.5}
\]
where a is some short distance scale called the lattice spacing. This is called a cubic spacetime lattice.
When we introduce the Fourier transform on a lattice, the momenta which can appear are restricted in an
interesting way. This is because when x ∈ aℤ we have
\[
e^{i\left(p+\frac{2\pi m}{a}\right)x} = e^{ipx} \tag{11.6}
\]
for any integer m. To get a genuinely independent set of Fourier modes, we should therefore restrict to p in the range
\[
p \in \left(-\frac{\pi}{a},\frac{\pi}{a}\right). \tag{11.7}
\]
What this does in loop integrals is that it restricts each component of q_E^µ to lie in this range. Since at finite a this range is finite, this rule assigns a finite value to all loop integrals.
⁷² I've relabeled the bare coupling constant to λ₀, anticipating that some renormalization will be necessary to convert this to a "physical" coupling λ.
Figure 29: Momentum integration regions for lattice (on the left) and hard momentum cutoff (on the right)
regulators in 1 + 1 dimensions.
In most cases lattice regularization is by far the best way to do non-perturbative calculations in interacting
quantum field theories. You “simply” put the Euclidean path integral on a lattice and then evaluate it with
a big computer using the Monte Carlo method. It is also the best way to think about regularization in
quantum field theory, as there is a clear physical picture of what is going on. Unfortunately however the
lattice regulator is rather awkward for concrete calculations in perturbation theory, as the region over which
the momentum integral is evaluated breaks most of the Euclidean symmetry of the problem (see figure
29). Lattice regulators are therefore rarely used in perturbative calculations. For example even the d = 2
lattice-regulated self-energy integral
\[
\Sigma(p^2) = \frac{\lambda_0}{2}\int_{-\pi/a}^{\pi/a}\frac{dk_E^0}{2\pi}\int_{-\pi/a}^{\pi/a}\frac{dk^1}{2\pi}\,\frac{1}{(k_E^0)^2+(k^1)^2+m_0^2} \tag{11.8}
\]
has no simple closed-form evaluation.⁷³
Our second regulator, the hard momentum cutoff, instead simply restricts the Euclidean loop momenta to obey
\[
q^2 \le \Lambda^2, \tag{11.9}
\]
where Λ is some fixed large energy scale. This makes the integral much easier, as we can now go to radial
coordinates:
\[
\Sigma(p^2) = \frac{\lambda_0\Omega_{d-1}}{2(2\pi)^d}\int_0^\Lambda\frac{q^{d-1}\,dq}{q^2+m_0^2}. \tag{11.10}
\]
This integral can be done for general d ≥ 0 in terms of a hypergeometric function, but it is perhaps more
instructive to just give the answers for d = 1, 2, 3, 4:
\[
\Sigma(p^2) = \frac{\lambda_0\Omega_{d-1}}{2(2\pi)^d}\times
\begin{cases}
\frac{1}{m_0}\arctan\frac{\Lambda}{m_0} & d=1\\[4pt]
\log\frac{\Lambda}{m_0}+\frac{1}{2}\log\left(1+\frac{m_0^2}{\Lambda^2}\right) & d=2\\[4pt]
\Lambda-m_0\arctan\frac{\Lambda}{m_0} & d=3\\[4pt]
\frac{\Lambda^2}{2}+\frac{m_0^2}{2}\log\frac{m_0^2}{\Lambda^2+m_0^2} & d=4
\end{cases}. \tag{11.11}
\]
⁷³ Mathematica did give me some terrifying expression, but when I asked it to expand this answer for small a it gave me 1.5mb of garbage.
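The closed forms in (11.11) can be verified by direct numerical integration; a sketch with the arbitrary test values m₀ = 1, Λ = 50:

```python
import math

# Numerical check of the closed forms (11.11) for the radial integral
# int_0^Lambda q^{d-1} dq / (q^2 + m0^2), with test values m0 = 1, Lambda = 50.
def simpson(f, lo, hi, n=100000):
    h = (hi - lo) / n
    s = f(lo) + f(hi) + sum((4 if i % 2 else 2) * f(lo + i * h) for i in range(1, n))
    return s * h / 3

def exact(d, L, m):
    if d == 1: return math.atan(L / m) / m
    if d == 2: return math.log(L / m) + 0.5 * math.log(1 + m**2 / L**2)
    if d == 3: return L - m * math.atan(L / m)
    if d == 4: return L**2 / 2 + (m**2 / 2) * math.log(m**2 / (L**2 + m**2))

m0, Lam = 1.0, 50.0
results = {d: (simpson(lambda q: q**(d - 1) / (q**2 + m0**2), 0.0, Lam),
               exact(d, Lam, m0)) for d in (1, 2, 3, 4)}
for d, (num, ana) in results.items():
    print(d, num, ana)
```

In each dimension the numerical value and the closed form agree to high accuracy.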
Expanding these at large Λ we have
\[
\Sigma(p^2) = \frac{\lambda_0\Omega_{d-1}}{2(2\pi)^d}\times
\begin{cases}
\frac{\pi}{2m_0}+\ldots & d=1\\[4pt]
\log\frac{\Lambda}{m_0}+\ldots & d=2\\[4pt]
\Lambda-\frac{\pi m_0}{2}+\ldots & d=3\\[4pt]
\frac{\Lambda^2}{2}-m_0^2\log\frac{\Lambda}{m_0}+\ldots & d=4
\end{cases}, \tag{11.12}
\]
where in each case "…" indicates terms which vanish as Λ → ∞. The d = 1 case gives a (finite) quantum correction to the frequency of the quartic anharmonic oscillator, while for d = 2, 3, 4 we see increasingly divergent corrections to the mass.
Our third regulator is Pauli-Villars, which in Euclidean signature modifies the propagator as
\[
\frac{1}{p^2+m_0^2} \to \frac{1}{p^2+m_0^2}-\frac{1}{p^2+\Lambda^2}. \tag{11.13}
\]
The new term in the propagator here is small when p² ≪ Λ², but for p² ≫ Λ² it improves the high-momentum behavior of the propagator from 1/p² to 1/p⁴. This improves the convergence of loop integrals.
Unfortunately the canonical Pauli-Villars regulator (11.13) doesn't always render loop integrals finite. For example our self-energy integral in d = 4 is still logarithmically divergent at high momentum, and in higher dimensions things are only worse. To deal with this we will instead consider an "improved" Pauli-Villars regulator, which in Euclidean signature modifies the propagator as
\[
\frac{1}{p^2+m_0^2} \to \frac{e^{-p^2/\Lambda^2}}{p^2+m_0^2}. \tag{11.15}
\]
The exponential factor is close to one when p2 ≪ Λ2 just as before, but now for p2 ≫ Λ2 the exponential
suppression ensures that all loop integrals will be finite in any dimension and for any number of propagators.
Making use of Mathematica, with this regulator the self-energy of our scalar field theory becomes
\[
\Sigma(p^2) = \frac{\lambda_0\Omega_{d-1}}{2(2\pi)^d}\times
\begin{cases}
\frac{\pi}{2m_0}+\ldots & d=1\\[4pt]
\log\frac{\Lambda}{m_0}-\frac{\gamma}{2}+\ldots & d=2\\[4pt]
\frac{\sqrt{\pi}}{2}\Lambda-\frac{\pi m_0}{2}+\ldots & d=3\\[4pt]
\frac{\Lambda^2}{2}-m_0^2\log\frac{\Lambda}{m_0}+\frac{\gamma}{2}m_0^2+\ldots & d=4
\end{cases}, \tag{11.16}
\]
where
\[
\gamma := -\Gamma'(1) \approx 0.577 \tag{11.17}
\]
is called the Euler-Mascheroni constant.
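The d = 2 entry of (11.16) can itself be checked numerically; a sketch using the arbitrary test values m₀ = 1, Λ = 1000 (the substitution q = eˣ resolves the slowly decaying 1/q tail of the integrand):

```python
import math

# Check of the d = 2 case of (11.16): the Gaussian-regulated integral
# int_0^inf q e^{-q^2/Lambda^2} dq / (q^2 + m0^2) -> log(Lambda/m0) - gamma/2
# at large Lambda. Test values m0 = 1, Lambda = 1000 are arbitrary.
def simpson(f, lo, hi, n=200000):
    h = (hi - lo) / n
    s = f(lo) + f(hi) + sum((4 if i % 2 else 2) * f(lo + i * h) for i in range(1, n))
    return s * h / 3

m0, Lam = 1.0, 1000.0
g = lambda x: math.exp(2 * x) * math.exp(-math.exp(2 * x) / Lam**2) / (math.exp(2 * x) + m0**2)
num = simpson(g, -15.0, math.log(6 * Lam))    # q = e^x substitution
gamma = 0.5772156649015329
print(num, math.log(Lam / m0) - gamma / 2)
```

The agreement is good up to corrections of order m₀²/Λ².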
Already a pattern is hopefully apparent: the power-law divergent contributions to Σ(p2 ) are different for
different choices of regulator, but when there is a logarithmic divergence its coefficient is universal and when
there is no logarithmic divergence the finite term is also universal. The finite piece cannot be universal when
there is a logarithmic divergence since we can always rescale the cutoff:
\[
a\log\frac{b\Lambda}{m_0} = a\log\frac{\Lambda}{m_0}+a\log b. \tag{11.18}
\]
As you can see this changes the finite piece but doesn’t change the coefficient of the logarithm. Such rescalings
of course also change the coefficients of any power-law divergences.
Something else is also hopefully clear: the relationship between the bare and physical mass is regularization-
dependent: for example for d = 4 at one loop we have
\[
m^2 = m_0^2+\frac{\lambda_0}{16\pi^2}\times
\begin{cases}
\frac{\Lambda^2}{2}-m_0^2\log\frac{\Lambda}{m_0} & \text{hard momentum cutoff}\\[4pt]
\frac{\Lambda^2}{2}-m_0^2\log\frac{\Lambda}{m_0}+\frac{\gamma}{2}m_0^2 & \text{improved Pauli-Villars}
\end{cases}. \tag{11.19}
\]
Thus the value of m0 we should choose to match the observed value of m depends on which regularization
scheme we use. This is sometimes described by saying that bare masses are scheme-dependent.
Our final regulator is dimensional regularization: we evaluate the integral (11.4) for values of d where it converges and then define it for other d by analytic continuation. The radial integral can be done in closed form,⁷⁴ giving
\[
\Sigma(p^2) = \frac{\lambda_0}{2}\,\frac{2\pi^{d/2}}{\Gamma(d/2)}\,\frac{1}{(2\pi)^d}\,\frac{\pi m_0^{d-2}}{2\sin\frac{d\pi}{2}}. \tag{11.23}
\]
⁷⁴ This is a special case of the integral (11.47) below, as you can check using the Γ function reflection formula Γ(z)Γ(1 − z) = π/sin(πz).
Let's now think a bit about the analytic structure of this expression. Most of the factors are well-behaved for positive d, but the sin(dπ/2) in the denominator leads to poles at each even value of d. Let's first therefore consider the odd values - in particular we have
\[
\frac{\pi m_0^{d-2}}{2\sin\frac{d\pi}{2}} =
\begin{cases}
\frac{\pi}{2m_0} & d=1\\[4pt]
-\frac{\pi m_0}{2} & d=3
\end{cases}. \tag{11.24}
\]
These are precisely the universal finite contributions we found using hard momentum cutoff and improved
Pauli-Villars above! For d = 1 this is no mystery, since anyways the integral is convergent so the regulator
can’t matter, but for d = 3 the dimensional regularization method has automatically removed the linear
divergence but kept the correct finite piece.
In even dimensions we need to be more careful due to the poles. The basic idea is to work in d = 2(n − ϵ) dimensions, in which case the pole at d = 2n will show up as a factor of 1/ϵ. To expand (11.23) near d = 2n we need two pieces of information. The first is the behavior of the sin factor near d = 2n, which is easily shown to be
\[
\frac{1}{\sin(\pi(n-\epsilon))} = \frac{(-1)^{n+1}}{\pi\epsilon}\left(1+O(\epsilon^2)\right). \tag{11.25}
\]
We also need to know how to deal with the Γ(d/2) = Γ(n − ϵ) in the denominator of (11.23). This is a bit trickier: from the Taylor expansion we have
\[
\Gamma(n-\epsilon) = \Gamma(n)\left(1-\psi(n)\epsilon+O(\epsilon^2)\right), \tag{11.26}
\]
where
\[
\psi(z) = \frac{\Gamma'(z)}{\Gamma(z)} \tag{11.27}
\]
is called the digamma function. For our purposes it is useful to know that by taking the logarithm of the
Γ-function recursion relation Γ(z + 1) = zΓ(z) and then differentiating we see that the digamma function
obeys
\[
\psi(z+1) = \psi(z)+\frac{1}{z}, \tag{11.28}
\]
and thus
\[
\psi(n) = \sum_{k=1}^{n-1}\frac{1}{k}+\psi(1) \tag{11.29}
\]
for any positive integer n. Computing ψ(1) = Γ′(1) is a bit tricky, but after a nasty integral evaluation (or more elegantly by using the Weierstrass product representation of Γ) one finds
\[
\psi(1) = -\gamma \tag{11.30}
\]
and thus
\[
\psi(n) = \sum_{k=1}^{n-1}\frac{1}{k}-\gamma. \tag{11.31}
\]
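Equation (11.31) is easy to confirm numerically by approximating ψ with a finite difference of log Γ; a quick sketch:

```python
import math

# Check of (11.31): psi(n) = H_{n-1} - gamma, comparing against a central
# finite difference of lgamma (which approximates psi = (log Gamma)').
gamma = 0.5772156649015329

def psi_numeric(z, h=1e-5):
    return (math.lgamma(z + h) - math.lgamma(z - h)) / (2 * h)

for n in (1, 2, 5, 10):
    harmonic = sum(1.0 / k for k in range(1, n))
    print(n, psi_numeric(n), harmonic - gamma)
```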
We will also at several points need to use the fact that for a > 0 we have
\[
a^\epsilon = 1+\epsilon\log a+O(\epsilon^2). \tag{11.32}
\]
Putting these ingredients together, near d = 2n we find
\[
\Sigma(p^2) = \frac{\lambda_0\Omega_{2n-1}}{2(2\pi)^{2n}}\,\frac{(-1)^{n+1}m_0^{2n-2}}{2}\left(\frac{1}{\epsilon}-\gamma+\sum_{k=1}^{n-1}\frac{1}{k}+\log\frac{4\pi}{m_0^2}+O(\epsilon)\right), \tag{11.33}
\]
so again we see that the logarithmic term in m0 matches the logarithmic term we got from the hard cutoff and improved Pauli-Villars regulators.
You may be puzzled about how we were able to get a dimensionful quantity (m20 ) inside of a logarithm
in (11.33). To understand this we need to account for the dimensions of the bare coupling constant λ0 . In d
spacetime dimensions a scalar field needs to have units of energy to the (d − 2)/2, since we need the kinetic
term ∂µ ϕ∂ µ ϕ to have energy dimension d so that integrating it against dd x gives a dimensionless quantity.
The interaction term λ0 ϕ4 must also have energy dimension d, which means that the energy dimension of λ0
is (4 − d). Energy dimensions are a very useful notion in quantum field theory, so there is a special notation
for them: if a quantity O has units of energy to the ∆, then we write
[O] = ∆. (11.34)
So for example we have
\[
[\mathcal L] = d,\qquad [\phi] = \frac{d-2}{2},\qquad [m_0^2] = 2,\qquad [\lambda_0] = 4-d. \tag{11.35}
\]
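The bookkeeping in (11.35) can be packaged into a couple of lines; a minimal sketch (the function names here are ours, not the text's):

```python
# Energy-dimension bookkeeping from (11.35): [phi] = (d-2)/2, and the
# lambda0 phi^4 term must have total dimension d, so [lambda0] = 4 - d.
def dim_phi(d):
    return (d - 2) / 2

def dim_lambda0(d):
    return d - 4 * dim_phi(d)

for d in (2, 3, 4, 6):
    print(d, dim_phi(d), dim_lambda0(d))
```

In particular λ₀ is dimensionless exactly in d = 4, positive-dimension (super-renormalizable) below it, and negative-dimension above it.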
When doing dimensional regularization we wish to expand things around d = 2n, so we can write the bare coupling "near" d = 2n in terms of the bare coupling "at" d = 2n as
\[
\lambda_0^{\{2(n-\epsilon)\}} = \mu^{2\epsilon}\lambda_0^{\{2n\}} = \lambda_0^{\{2n\}}\left(1+\epsilon\log\mu^2+O(\epsilon^2)\right), \tag{11.36}
\]
where µ is an arbitrary quantity with energy dimension one. Substituting this into (11.33) we get
\[
\Sigma(p^2) = \frac{\lambda_0^{\{2n\}}\Omega_{2n-1}}{2(2\pi)^{2n}}\,\frac{(-1)^{n+1}m_0^{2n-2}}{2}\left(\frac{1}{\epsilon}-\gamma+\sum_{k=1}^{n-1}\frac{1}{k}+\log\frac{4\pi\mu^2}{m_0^2}+O(\epsilon)\right), \tag{11.37}
\]
which looks more sensible dimensionally. The scale µ is called the renormalization scale; we will discuss its physical interpretation below. Putting everything together we have the expressions
\[
m^2 = m_0^2+
\begin{cases}
\frac{\lambda_0}{4m_0} & d=1\\[4pt]
\frac{\lambda_0}{8\pi}\left(\frac{1}{\epsilon}-\gamma+\log\frac{4\pi\mu^2}{m_0^2}\right) & d=2\\[4pt]
-\frac{\lambda_0 m_0}{8\pi} & d=3\\[4pt]
-\frac{\lambda_0 m_0^2}{32\pi^2}\left(\frac{1}{\epsilon}-\gamma+1+\log\frac{4\pi\mu^2}{m_0^2}\right) & d=4
\end{cases} \tag{11.38}
\]
for determining the physical mass at one loop in terms of the bare mass and coupling in dimensional regularization. In practice this expression is usually used in the opposite direction however: the physical mass m is measured and then we use this formula to determine m0.
The external momenta ki , ki′ here should be taken on-shell to give a genuine scattering matrix element,
but for now it is convenient to allow them to take general values so that we can analytically continue to
Euclidean signature:
\[
i\widetilde{\mathcal M}_c(k_1,k_2\to k'_1,k'_2) \supset \frac{i\lambda_0^2}{2}\int\frac{d^d\ell}{(2\pi)^d}\,\frac{1}{\ell^2+m_0^2}\left[\frac{1}{(\ell+k'_1-k_1)^2+m_0^2}+\frac{1}{(\ell+k'_2-k_1)^2+m_0^2}+\frac{1}{(\ell-k_1-k_2)^2+m_0^2}\right]. \tag{11.40}
\]
This integral is convergent for 0 < d < 4 and logarithmically divergent for d = 4. We could study it using
any of the regulators we discussed above, but we will stick to dimensional regularization so for now we are
assuming that d is in the convergent range. The three terms give the same integral three times, so we just
need to figure out how to compute
\[
I(q) = \int\frac{d^d\ell}{(2\pi)^d}\,\frac{1}{\ell^2+m_0^2}\,\frac{1}{(\ell+q)^2+m_0^2} \tag{11.41}
\]
for general Euclidean q (we will eventually analytically continue back to on-shell Lorentzian q). This integral
may look difficult, but there is a clever trick due to Feynman which makes it tractable: we use the identity
(that you will derive in the homework)
\[
\frac{1}{AB} = \int_0^1\frac{dx}{\left(xA+(1-x)B\right)^2}, \tag{11.42}
\]
which gives
\[
I(q) = \int_0^1 dx\int\frac{d^d\ell}{(2\pi)^d}\,\frac{1}{\left(\ell^2+2x\,\ell\cdot q+xq^2+m_0^2\right)^2} \tag{11.43}
\]
\[
= \int_0^1 dx\int\frac{d^d\ell}{(2\pi)^d}\,\frac{1}{\left(\ell^2+x(1-x)q^2+m_0^2\right)^2} \tag{11.44}
\]
\[
= \frac{\Omega_{d-1}}{(2\pi)^d}\int_0^1 dx\int_0^\infty\frac{\ell^{d-1}\,d\ell}{\left(\ell^2+x(1-x)q^2+m_0^2\right)^2}. \tag{11.45}
\]
In going to the second line we have made an additive shift of the integration variable, and in going to the
third we changed to radial coordinates. x here is called a Feynman parameter. The propagators in more
general loop diagrams can be combined using multiple Feynman parameters, for example we have
\[
\frac{1}{ABC} = \int_0^1 dx\int_0^{1-x}dy\,\frac{2}{\left(xA+yB+(1-x-y)C\right)^3}. \tag{11.46}
\]
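These Feynman-parameter identities can be spot-checked numerically; here is a sketch for (11.42) with arbitrary A, B > 0:

```python
# Numerical check of the Feynman parameter identity (11.42) for A, B > 0.
def simpson(f, lo, hi, n=10000):
    h = (hi - lo) / n
    s = f(lo) + f(hi) + sum((4 if i % 2 else 2) * f(lo + i * h) for i in range(1, n))
    return s * h / 3

A, B = 1.7, 4.2   # arbitrary positive test values
lhs = 1 / (A * B)
rhs = simpson(lambda x: 1 / (x * A + (1 - x) * B) ** 2, 0.0, 1.0)
print(lhs, rhs)
```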
The remaining momentum integral in (11.45) can then be evaluated using the general formula (that you
will derive in the homework)
\[
\int_0^\infty\frac{\ell^{a-1}\,d\ell}{(\ell^2+\sigma^2)^b} = \frac{\sigma^{a-2b}}{2}\,\frac{\Gamma(a/2)\Gamma(b-a/2)}{\Gamma(b)}, \tag{11.47}
\]
which is valid for σ > 0 and 0 < a < 2b, giving us
\[
I(q) = \frac{\Omega_{d-1}}{(2\pi)^d}\,\frac{1}{2}\int_0^1 dx\left(m_0^2+x(1-x)q^2\right)^{\frac{d-4}{2}}\frac{\Gamma(d/2)\Gamma(2-d/2)}{\Gamma(2)} = \frac{\Gamma(2-d/2)}{(4\pi)^{d/2}}\int_0^1 dx\left(m_0^2+x(1-x)q^2\right)^{\frac{d-4}{2}}. \tag{11.48}
\]
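The master formula (11.47) can be spot-checked numerically; a sketch at the arbitrary sample point a = 3, b = 2, σ = 1.5 (the substitution ℓ = σ tan t maps the half-line to a finite interval):

```python
import math

# Numerical check of (11.47) at a = 3, b = 2, sigma = 1.5.
def simpson(f, lo, hi, n=20000):
    h = (hi - lo) / n
    s = f(lo) + f(hi) + sum((4 if i % 2 else 2) * f(lo + i * h) for i in range(1, n))
    return s * h / 3

a, b, sigma = 3.0, 2.0, 1.5

def integrand(t):                      # l = sigma * tan(t), with Jacobian
    l = sigma * math.tan(t)
    jac = sigma / math.cos(t) ** 2
    return l ** (a - 1) / (l**2 + sigma**2) ** b * jac

lhs = simpson(integrand, 0.0, math.pi / 2 - 1e-8)
rhs = sigma ** (a - 2 * b) * math.gamma(a / 2) * math.gamma(b - a / 2) / (2 * math.gamma(b))
print(lhs, rhs)
```

For these values both sides equal π/(4σ) = π/6.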
Defining the Mandelstam variables
\[
s = -(k_1+k_2)^2,\qquad t = -(k'_1-k_1)^2,\qquad u = -(k'_1-k_2)^2, \tag{11.49}
\]
we can then write the one-loop contribution to the scattering amplitude as
\[
i\widetilde{\mathcal M}_c(k_1,k_2\to k'_1,k'_2) \supset \frac{i\lambda_0^2}{2}\,\frac{\Gamma(2-d/2)}{(4\pi)^{d/2}}\int_0^1 dx\left[\left(m_0^2-x(1-x)s\right)^{\frac{d-4}{2}}+\left(m_0^2-x(1-x)t\right)^{\frac{d-4}{2}}+\left(m_0^2-x(1-x)u\right)^{\frac{d-4}{2}}\right]. \tag{11.50}
\]
We will now focus on the cases of d = 3 and d = 4.
For d = 3 we are in the convergent region, so we simply have
\[
I(q) = \frac{1}{8\pi}\int_0^1\frac{dx}{\sqrt{m_0^2+x(1-x)q^2}}. \tag{11.51}
\]
For q² > 0 this integral gives
\[
I(q) = \frac{1}{8\pi\sqrt{q^2}}\left(\pi-2\arctan\frac{2m_0}{\sqrt{q^2}}\right), \tag{11.52}
\]
which is the regime we need for the terms involving t and u since on shell we always have t < 0 and u < 0.
To compute the integral involving s we need to restore the iϵ by taking q² = −(s + iϵ), which leads to
\[
I(q) = \frac{1}{8\pi\sqrt{s}}\left(i\pi+\log\frac{s+2m_0\sqrt{s}}{s-2m_0\sqrt{s}}\right), \tag{11.53}
\]
where we have used that s ≥ 4m20 . We won’t have too much to say about these results, but one comment is
that at this order we can replace m0 → m since the difference between the two is higher-order in λ0 , so the
one loop contribution to 2 → 2 scattering is finite without any further renormalization. It also decays with
energy since you will show in the homework that s is essentially just the center of mass energy squared.
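The d = 3 result (11.51)-(11.52) can be checked at a convenient sample point; with m₀ = 1 and q² = 4 the closed form gives exactly 1/32:

```python
import math

# Check of (11.52) against direct integration of (11.51) at m0 = 1, q = 2,
# where I(q) = (pi - 2 arctan(1)) / (16 pi) = 1/32.
def simpson(f, lo, hi, n=20000):
    h = (hi - lo) / n
    s = f(lo) + f(hi) + sum((4 if i % 2 else 2) * f(lo + i * h) for i in range(1, n))
    return s * h / 3

m0, q = 1.0, 2.0
I_num = simpson(lambda x: 1 / math.sqrt(m0**2 + x * (1 - x) * q**2), 0.0, 1.0) / (8 * math.pi)
I_formula = (math.pi - 2 * math.atan(2 * m0 / q)) / (8 * math.pi * q)
print(I_num, I_formula)   # both equal 1/32 = 0.03125
```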
For d = 4 we need to be more careful due to the pole in Γ(2 − d/2). Setting d = 4 − 2ϵ, we have
\[
\Gamma(\epsilon) = \frac{\Gamma(1+\epsilon)}{\epsilon} = \frac{1}{\epsilon}\left(1+\psi(1)\epsilon+\ldots\right) = \frac{1}{\epsilon}-\gamma+O(\epsilon). \tag{11.54}
\]
Using this in (11.48), together with (11.32), we have
\[
I(q) = \frac{1}{16\pi^2}\left(\frac{1}{\epsilon}-\gamma+\log(4\pi)-\int_0^1 dx\log\left(m_0^2+x(1-x)q^2\right)\right). \tag{11.55}
\]
where Λ is the UV cutoff. Either way we are now ready for the key question: what are we supposed to do
about this UV divergence?
There is only one sensible thing to do: we absorb this divergence into a redefinition of the bare coupling
constant λ0 . In the context of the bare mass m0 we had a physical motivation for doing this: we wanted
to write things in terms of the physical mass m instead of the scheme-dependent bare mass m0 . Is there a
similar justification here? Indeed there is - the bare coupling λ0 is no more directly measurable than the
bare mass m0 . What is measurable is the 2 → 2 S-matrix, so the simplest thing we can do is simply define a
physical coupling λ so that the exact 2 → 2 S-matrix is equal to its tree-level value at some preferred choice
for the initial momenta. More concretely we will impose that
\[
i\widetilde{\mathcal M}_c\big|_{s=4m^2,\,t=0,\,u=0} = -i\lambda. \tag{11.59}
\]
Now the moment of truth: using either (11.62) or (11.63) we can rewrite the scattering amplitude M̃_c for general s, t, u in terms of the physical mass and coupling:
\[
i\widetilde{\mathcal M}_c = -i\lambda+\frac{i\lambda^2}{32\pi^2}\int_0^1 dx\left(\log\frac{m^2-4m^2x(1-x)}{m^2-sx(1-x)}+\log\frac{m^2}{m^2-tx(1-x)}+\log\frac{m^2}{m^2-ux(1-x)}\right). \tag{11.64}
\]
All UV divergences are gone, and the answer is now independent of which regularization scheme we used!
We have to fit two parameters (λ and m) to experiment, but this expression gives a function’s worth of
predictions in exchange. The integrals can again be evaluated in terms of inverse trig functions but I won’t
bother.
This argument leading to the finite and scheme-independent scattering amplitude (11.64) may have
seemed a bit like magic. Why did this happen? Does it continue to happen at higher loops and for more
complicated scattering amplitudes? Are there more parameters we need to tune, or is it just λ0 and m0 ? It
is far from obvious, but the answers to the latter two questions are “yes it continues” and “no it is just λ0
and m0 ”. Understanding why is our next order of business.
As a first indication that things may not be so mysterious, I’ll mention that the derivative of λ0 with
respect to the logarithm of either the renormalization scale µ (in dim reg) or the cutoff Λ (in a more physical
scheme) holding the physical coupling λ fixed is a very useful quantity, usually called the β-function. Here it is given by
\[
\beta(\lambda) := \Lambda\frac{d\lambda_0}{d\Lambda} = \frac{3\lambda^2}{16\pi^2}. \tag{11.65}
\]
Note in particular that λ0 grows with energy, so when we reach a regime where (3λ/16π²) log(Λ/m) ∼ 1, or in other words the cutoff reaches
\[
\Lambda_{strong} \sim m\,e^{\frac{16\pi^2}{3\lambda}}, \tag{11.66}
\]
then the theory becomes strongly coupled and our perturbative approach breaks down. This is usually
viewed as evidence that the continuum limit does not really exist for ϕ4 theory in d = 4. Fortunately if λ
is small this is a rather high energy scale; for example in the standard model of particle physics the Higgs boson is a scalar field whose mass is 125 GeV and whose quartic coupling is of order λ ∼ .1, so the scale where the Higgs becomes strongly coupled is
\[
\Lambda_{strong} \sim (125\,\text{GeV})\,e^{\frac{16\pi^2}{0.3}} \sim 10^{230}\,\text{GeV}, \tag{11.67}
\]
which is a far higher energy scale than the Planck scale of M_p ∼ 10¹⁸ GeV where quantum gravity effects
are expected to become important. If we view our theory as having a genuine cutoff Λ at some scale which
is large compared to where we do experiments but small compared to Λstrong , then these UV divergences
start looking less scary and perhaps we will be able to tame them more systematically. Doing so is the goal
of the next lecture.
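The estimate for the Higgs quoted above can be reproduced in a couple of lines; working with log₁₀ keeps the numbers manageable:

```python
import math

# Landau-pole estimate (11.66) for m = 125 GeV and lambda ~ 0.1, the values
# quoted in the text. We compute log10 of Lambda_strong to avoid huge numbers.
m_gev, lam = 125.0, 0.1
log10_strong = math.log10(m_gev) + (16 * math.pi**2 / (3 * lam)) / math.log(10)
print(log10_strong)   # log10 of Lambda_strong in GeV: vastly above the Planck scale
```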
11.3 Homework
1. Check the Feynman parameter identity (11.42) for A, B > 0.
2. Evaluate the general loop integral (11.47). One strategy is the following: 1) rescale ℓ to extract the
overall power of σ, 2) rewrite the integral in terms of the Euler β-function
\[
\beta(z_1,z_2) = \int_0^1 dt\,t^{z_1-1}(1-t)^{z_2-1} \tag{11.68}
\]
using the change of variables t = 1/(1 + ℓ²), and 3) use a famous expression for the β function,
\[
\beta(z_1,z_2) = \frac{\Gamma(z_1)\Gamma(z_2)}{\Gamma(z_1+z_2)}. \tag{11.69}
\]
To derive this last expression, start with Γ(z₁)Γ(z₂) = ∫₀^∞ ds₁ ∫₀^∞ ds₂ s₁^{z₁−1} s₂^{z₂−1} e^{−s₁−s₂} and then use the change of variables s₁ = st and s₂ = s(1 − t).
3. Show that on shell the Mandelstam variables (11.49) obey s + t + u = 4m² and s = E²_tot, where E_tot is the total energy in the center-of-mass frame. Also show that t, u ≤ 0.
12 Renormalizability and the Renormalization Group
In the previous lecture we saw that once we expressed the one-loop 2 → 2 scattering amplitude of ϕ4 theory
in terms of the physical mass and coupling parameters m and λ, the amplitude was independent of the
short-distance cutoff Λ (or the renormalization scale µ in dimensional regularization). In this lecture we will
sketch a general understanding of why this is true, starting with a more "old-fashioned" approach based on showing that the divergences cancel in certain "renormalizable" theories such as the ϕ⁴ theory and then moving to a more modern "Wilsonian" approach based on viewing the cutoff Λ as being physical and then
seeking to understand physics at energy scales which are low compared to Λ.
We consider a general interaction Lagrangian of the form
\[
\mathcal L_{int} = \sum_i\lambda_i\mathcal O_i, \tag{12.1}
\]
where O_i is some product of the fields and their derivatives and λ_i is a coupling constant. We will discuss below how to normalize the fields so that O_i and λ_i are separately well-defined. In this section we will study the convergence of a general one-particle irreducible diagram with E_a external Φ_a legs, I_a internal Φ_a legs, and V_i vertices of type i. We will focus on the particular region of the loop integration space where all loop momenta become large at the same rate. This is not the only region a divergence can come from, but our results will be indicative of the general case.
Before beginning we need to think a bit about the large-momentum behavior of the propagator. For a scalar field this is easy, it just goes like 1/k². In lecture 14 you showed on the homework that the momentum-space Feynman propagator of a field of general spin is
\[
G_F^{ab}(k) = \frac{i}{k^0-\omega_{n,\vec k}+i\epsilon}\,\frac{1}{2\omega_{n,\vec k}}\sum_\sigma u^a(\vec k,\sigma,n)u^{b*}(\vec k,\sigma,n)-(-1)^F\frac{i}{k^0+\omega_{n,\vec k}-i\epsilon}\,\frac{1}{2\omega_{n,\vec k}}\sum_\sigma v^a(-\vec k,\sigma,n_c)v^{b*}(-\vec k,\sigma,n_c). \tag{12.2}
\]
We haven't discussed the spin sums of the intertwiners yet, but they always give polynomials in k, so at large k this propagator will go as
\[
G_F \sim k^{2s_a-2}, \tag{12.3}
\]
where 2s_a is the highest power of k appearing in the spin sums. Roughly speaking s_a is the spin of the ath field, for example for a spin 1/2 field we will see that s_a = 1/2 and for a massive vector field we'll have s_a = 1. In the massless case however sometimes s_a is lower than expected due to gauge symmetry, for example s_a = 0 for photons and gravitons. I will adopt a convention where the field is normalized so that the highest power of k in G_F has coefficient one (perhaps multiplied by some dimensionless tensor such as η^{µν} to make up the a, b* indices), in which case (12.3) tells us that the energy dimension of the field obeys
\[
2[\Phi_a]-d = 2s_a-2, \tag{12.4}
\]
and thus
\[
[\Phi_a] = s_a+\frac{d-2}{2}. \tag{12.5}
\]
Turning now to the question of the divergence of the loop integrals, each internal propagator contributes an integral ∫ d^dk/(2π)^d. The naive number of momentum integrals is thus d Σ_a I_a, but each vertex contributes a momentum-conserving δ function, so the total number of loop integrals is
\[
d\left(\sum_a I_a-\sum_i V_i+1\right) \tag{12.6}
\]
since there will always be a single momentum-conserving δ function left over which doesn't constrain the loop momenta. Going to spherical coordinates in this full space of loop integrals thus gives a radial momentum integral of the form
\[
\int dk\,k^{\,d\left(\sum_a I_a-\sum_i V_i+1\right)+2\sum_a I_a(s_a-1)+\sum_i V_i d_i-1}, \tag{12.7}
\]
where d_i counts the derivatives appearing in the vertex O_i, each of which contributes a power of momentum. This diverges at large k if and only if the degree of divergence
\[
D := d\left(\sum_a I_a-\sum_i V_i+1\right)+2\sum_a I_a(s_a-1)+\sum_i V_i d_i \tag{12.8}
\]
is greater than or equal to zero. We can simplify this expression by observing that since each internal a line connects two vertices and each external line is connected to one vertex we have
\[
\sum_i V_iN_{ia} = 2I_a+E_a, \tag{12.9}
\]
where N_{ia} is the number of Φ_a legs attached to a vertex of type i, and thus
\[
D = d-\sum_a E_a\left(s_a+\frac{d-2}{2}\right)-\sum_i V_i\left(d-d_i-\sum_a N_{ia}\left(s_a+\frac{d-2}{2}\right)\right). \tag{12.10}
\]
We can write this more simply by observing that the quantity multiplying V_i in the sum over i is just d minus the energy dimension
\[
\Delta_i := d_i+\sum_a N_{ia}[\Phi_a] \tag{12.11}
\]
of the operator O_i:
\[
D = d-\sum_a E_a[\Phi_a]-\sum_i V_i\,(d-\Delta_i). \tag{12.12}
\]
The qualitative behavior of this formula depends very strongly on ∆_i: if all interactions obey ∆_i ≤ d, then adding additional interaction vertices cannot increase the degree of divergence. In this case the theory is said to be renormalizable. More generally we can classify interaction vertices into three groups:
• Vertices with ∆_i < d are called super-renormalizable. For example the ϕ⁴ interaction in d = 3 obeys
\[
[\phi^4] = 2 < 3, \tag{12.13}
\]
and is thus super-renormalizable.
• Vertices with ∆_i ≤ d are called renormalizable. For example the ϕ⁴ interaction in d = 4 obeys
\[
[\phi^4] = 4, \tag{12.14}
\]
So for example a real scalar field has
\[
[\Phi] = \frac{d-2}{2}, \tag{12.17}
\]
so a scattering amplitude with E external particles can have D ≥ 0 only if
\[
E \le \frac{2d}{d-2}. \tag{12.18}
\]
For d = 4 this is E ≤ 4, while for d = 3 this is E ≤ 6. We will now argue that this translates into
the statement that in a renormalizable theory UV divergences can be removed by absorbing them into a
finite number of coupling constants. This is just what we found at one-loop in ϕ4 theory in d ≤ 4. In a
theory where all interactions are super-renormalizable something even stronger is true: there are only a finite
number of diagrams which are UV divergent. The UV divergences in a super-renormalizable theory can thus
be completely removed by coupling constant shifts which are polynomials in the coupling, and that can be
computed at some fixed order in perturbation theory. For example in the ϕ4 theory in d = 3 we found no
divergence at one loop in 2 → 2 scattering.
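The power-counting formula is easy to experiment with; a minimal sketch of (12.12) specialized to a single scalar with a ϕ⁴ vertex (no derivatives, s = 0):

```python
# Superficial degree of divergence D = d - E*[phi] - V*(d - Delta) from (12.12),
# for a single scalar field with a phi^4 vertex and no derivatives.
def degree_of_divergence(d, E, V):
    dim_phi = (d - 2) / 2     # [phi] from (12.5) with s = 0
    delta = 4 * dim_phi       # Delta for the phi^4 vertex, with d_i = 0
    return d - E * dim_phi - V * (d - delta)

# In d = 4 the vertex is marginal (Delta = d), so D = 4 - E for any V:
print(degree_of_divergence(4, 2, 1))   # self-energy: D = 2
print(degree_of_divergence(4, 4, 2))   # one-loop 2 -> 2: D = 0
# In d = 3 the vertex is super-renormalizable, so adding vertices lowers D:
print(degree_of_divergence(3, 4, 2))   # D = -1: convergent, as found above
```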
Before continuing it is worth mentioning that there is a simple interpretation of the condition for a
vertex to be non-renormalizable: the quantity d − ∆i is precisely the energy dimension of the coupling λi
which appears in front of the interaction operator OI . Non-renormalizable interactions are thus those with
coupling constants that have negative energy dimension, while super-renormalizable interactions are those
with positive energy dimension.
12.2 Cancellation of divergences in renormalizable theories
The key point is that differentiating with respect to an external momentum improves the convergence of a loop integral. As a one-dimensional illustration,
\[
\frac{d}{dp}\int\frac{d\ell}{2\pi}\,\frac{p+\ell+a}{(p+\ell)^2+b} = -\int\frac{d\ell}{2\pi}\,\frac{(p+\ell)^2+2a(p+\ell)-b}{\left((p+\ell)^2+b\right)^2}, \tag{12.19}
\]
where the integral on the left is logarithmically divergent at large ℓ but the integral on the right is convergent.
More generally if we differentiate a diagram with D ≥ 0 a total of (D + 1) times it becomes convergent. The divergent part of the diagram must therefore consist of a polynomial in the external momenta, heuristically of the form
\[
\sum_{n=0}^{D}\Lambda^{D-n}\,p^{M_{D-n}}, \tag{12.20}
\]
where we have written p^{M_n} to represent any product of M_n components of the external momenta. These powers will also be multiplied by various mass scales from the coupling constants of the theory to make sure they have the right units (in a massless theory the units will need to work out without this). These however are precisely the form of divergence which can be removed by adding local terms to the Lagrangian! More concretely, to remove a divergence of the heuristic form Λ^{D−n} p^{M_{D−n}} in a diagram with E_a external Φ_a legs, we introduce a term with E_a factors of each field and M_{D−n} derivatives acting on those fields with the same index structure as in the divergence. For example let's say we are computing the self-energy of a scalar in d = 4 and we find the divergences
\[
\Sigma(p^2) \supset a\Lambda^2+(b+cp^2)\log\frac{\Lambda}{m}. \tag{12.21}
\]
We can absorb the aΛ² + b log(Λ/m) divergence into a shift of the bare mass term, and we can absorb the cp² log(Λ/m) divergence into a shift of the kinetic term ∂_µΦ∂^µΦ, i.e. into a wave function renormalization. In the previous
lecture we computed a and b at one loop, and we pointed out that c is also nonzero at two loops. More
Figure 30: Canceling a divergent subdiagram with a counterterm. For d = 2, 3 the full diagram has D < 0, but the subdiagram is still divergent. Here the dot with an x through it indicates an insertion of the mass renormalization term −((Z²m₀² − m²)/2) Φ_R².
generally we only need to do this subtraction for diagrams with D ≥ 0, and thus we only need to include
shifts of interaction terms with ∆i ≤ d.
In the traditional way of describing this process one rewrites the bare Lagrangian in terms of the physical
mass and coupling m and λ, and also defines a rescaled field
\[
\Phi_R = \frac{\Phi}{Z}, \tag{12.22}
\]
which has a finite two-point function and in particular whose on-shell residue is the same as that of a free
field. We thus have
\[
\mathcal L = -\frac{1}{2}\partial_\mu\Phi\partial^\mu\Phi-\frac{m_0^2}{2}\Phi^2-\frac{\lambda_0}{4!}\Phi^4 \tag{12.23}
\]
\[
= -\frac{Z^2}{2}\partial_\mu\Phi_R\partial^\mu\Phi_R-\frac{Z^2m_0^2}{2}\Phi_R^2-\frac{\lambda_0Z^4}{4!}\Phi_R^4 \tag{12.24}
\]
\[
= -\frac{1}{2}\partial_\mu\Phi_R\partial^\mu\Phi_R-\frac{m^2}{2}\Phi_R^2-\frac{\lambda}{4!}\Phi_R^4+\mathcal L_{ct}, \tag{12.25}
\]
where
\[
\mathcal L_{ct} = -\frac{Z^2-1}{2}\partial_\mu\Phi_R\partial^\mu\Phi_R-\frac{Z^2m_0^2-m^2}{2}\Phi_R^2-\frac{\lambda_0Z^4-\lambda}{4!}\Phi_R^4 \tag{12.26}
\]
2 2 4!
is called the counterterm Lagrangian and its individual terms are called counterterms. In the old-
fashioned approach to renormalization one views these counterterms as “corrections” to the original theory
which are included to cancel the infinities. They are treated as additional interaction vertices, providing
corrections to a free theory whose mass is now the physical mass and whose interaction vertex is now −iλ
instead of −iλ0 . This is not actually different from what we did in the previous lecture, where we followed
the Wilsonian approach (to be developed further in the next section) of tuning the bare couplings so that
the physical couplings have their observed values: the counterterms are just an alternative way of describing
this tuning.
Before proceeding to the Wilsonian approach, we need to confront the fact that so far we have only
considered the region of momentum integration where all loop momenta go to infinity together. This of
course is an important contribution to the integrals, but we also need to consider the possibility of divergences
that arise when only a subset of the momenta go to infinity. This is a subtle and difficult problem, whose
traditional solution we won’t describe in detail since the Wilsonian approach deals with it in a much cleaner
(but more abstract) manner. We will instead content ourselves with a few remarks about the ingredients
which go into the traditional proof that the same renormalization which removes the divergences in the
integration region we have considered so far also removes them for the full range of momentum integration.
Figure 31: A Feynman diagram with overlapping divergences. The red and blue dashed lines surround
four-point subdiagrams which each are logarithmically divergent in d = 4, but we can only use a four-point
counterterm to cancel the divergence from one of them. The remaining divergence must be canceled by an
additional two-point counterterm.
The first step in proving renormalizability is Weinberg’s theorem, which says that a multi-loop
integral will be convergent if and only if its degree of divergence is negative as we take any linear
combination of the loop momenta to infinity. This means that we can show convergence using a
generalization of the method employed so far.
One can think of the various options for which momenta go to infinity together in terms of subdia-
grams of the full Feynman diagram. For example a diagram whose degree of divergence as defined
above is negative can still diverge due to a subdiagram whose degree of divergence is positive. See
figure 30 for an example.
In simple cases there is an easy fix for the presence of a divergent subdiagram: we can simply ignore
the rest of the diagram, in which case we have already seen that the divergence can be canceled by
including an appropriate counterterm. At least to the extent that the propagators involved in the
divergent subdiagram are not involved in other divergent subdiagrams, this cancellation works also in
the full diagram (see figure 30).
The key technical problem with this approach however is the possibility of overlapping divergences,
meaning situations where we have multiple divergent subdiagrams with propagators in common. See
figure 31 for an example. In such a case it isn’t so clear that we can cancel both divergences with
counterterms, as once we replace one of the subdiagrams by a counterterm we have lost part of the
other subdiagram. The systematic approach to dealing with this goes under the name “BPHZ”, for
Bogoliubov, Parasiuk, Hepp, and Zimmerman, and it requires a detailed analysis of the structure of
the diagrams using the infamous “forest formula”. In the end everything does work though, and the
renormalization which fixes the divergences in the region where all momenta scale together indeed
removes the divergences from subdiagrams as well.
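The power counting behind these statements is easy to automate. As a rough illustration (not from the notes: this is the standard textbook formula, here specialized to pure scalar $\phi^4$ theory with no derivative interactions), a diagram with $V$ vertices and $E$ external legs has $I = (4V-E)/2$ internal lines and $L = I - V + 1$ loops, giving superficial degree of divergence $D = dL - 2I$; in $d = 4$ this reduces to $D = 4 - E$:

```python
# Superficial degree of divergence for a phi^4 diagram in d dimensions.
# Standard power counting: each vertex has 4 legs, so 4V = E + 2I, and
# each of the L = I - V + 1 loop momenta contributes d powers of momentum
# while each internal propagator removes 2.
def degree_of_divergence(V, E, d=4):
    I = (4 * V - E) // 2   # internal propagators
    L = I - V + 1          # independent loop momenta
    return d * L - 2 * I

# One-loop four-point subdiagram (as in figure 31): log divergent, D = 0
print(degree_of_divergence(V=2, E=4))  # -> 0
# One-loop mass correction: quadratically divergent, D = 2
print(degree_of_divergence(V=1, E=2))  # -> 2
# Two-loop contribution to the six-point function: convergent, D = -2
print(degree_of_divergence(V=4, E=6))  # -> -2
```

This counts only the overall scaling when all loop momenta go to infinity together; by Weinberg's theorem one must also check every subdiagram the same way.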
no papers (in particular he wrote zero papers as a graduate student and his 1961 thesis still has zero citations). Somehow he
managed to get a faculty position anyways, and also tenure at Cornell. He then proceeded to revolutionize physics, explaining
the real meaning of renormalization in the process, and ended up with a Nobel Prize. I do not recommend trying to replicate
this trajectory.
The first essential idea for the Wilsonian approach is to view the cutoff as physical. In a condensed
matter system this is self-explanatory: at the atomic scale in a solid there is a genuine lattice of ions, with
electrons constrained to be near the ions, and at shorter distances there is nothing. In high-energy physics it is
less obvious that there needs to be a genuine cutoff at short distances (or equivalently high energies), but the
quantization of gravity seems to require major modifications of the laws of physics at the (absurdly small)
Planck length:
$$\ell_p = \sqrt{\frac{\hbar G}{c^3}} \approx 10^{-35}\,\mathrm{m}. \tag{12.27}$$
Moreover there are several indications from particle physics (such as the mass of neutrinos, the existence of
dark matter, and the small baryon-to-photon ratio of the universe) that some kind of modification of the
standard model of particle physics is necessary at sufficiently short distances.
The second essential idea for the Wilsonian approach is decoupling. This means that the details of
what is going on at large energies/short distances do not affect what is going on at low energies/long distances.
For example if we regulate a scalar field theory by putting it on a lattice, when we look at the low-energy
physics of the system we cannot tell whether the lattice has a cubic structure or a hexagonal structure. We
also cannot detect the existence of very heavy particles by doing low-energy experiments.
The third essential idea for the Wilsonian approach is integrating out. The idea here is that since
low-energy physics does not depend on the details of high-energy physics, rather than carrying around all
that high-energy physics for no reason we can simply sum over it in the path integral once and for all. This
produces a “low-energy effective field theory”, where all effects of the high-energy modes are repackaged into
the values of the low-energy coupling constants.
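The mechanics of integrating out can already be seen in a zero-dimensional toy model (my example, not from the notes): take a "light" variable $x$ and a "heavy" variable $y$ with Euclidean action $S = \frac{1}{2}a x^2 + \frac{1}{2}M^2 y^2 + g\,xy$. Doing the Gaussian integral over $y$ shifts the effective coupling of $x$ to $a_{\rm eff} = a - g^2/M^2$: all effects of $y$ are repackaged into the low-energy coupling.

```python
import math

# Toy "integrating out": e^{-S_eff(x)} = integral dy e^{-S(x,y)} with
# S = a x^2/2 + M^2 y^2/2 + g x y. Completing the square in y gives
# S_eff(x) = a_eff x^2/2 + const, with a_eff = a - g^2/M^2.
a, M2, g = 1.0, 4.0, 0.8

def integrate_out(x, ny=40001, ycut=10.0):
    # crude Riemann sum over the heavy mode y
    dy = 2 * ycut / (ny - 1)
    total = 0.0
    for i in range(ny):
        y = -ycut + i * dy
        total += math.exp(-(0.5 * a * x**2 + 0.5 * M2 * y**2 + g * x * y)) * dy
    return -math.log(total)   # S_eff(x), up to an x-independent constant

a_eff = a - g**2 / M2
# S_eff(1) - S_eff(0) should equal a_eff / 2
diff = integrate_out(1.0) - integrate_out(0.0)
assert abs(diff - 0.5 * a_eff) < 1e-6
print(f"a_eff = {a_eff}, numerical = {2 * diff:.6f}")
```

In field theory the same manipulation is done mode by mode on $\Phi_H$, and the shifted couplings are the $g_i(\Lambda)$ of the effective action below.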
Indeed let’s consider a rather general-looking quantum field theory with an explicit cutoff Λ, with action
$$S_\Lambda = \sum_i \int d^d x\; g_i(\Lambda)\,\Lambda^{d-\Delta_i}\,\mathcal{O}_i. \tag{12.28}$$
Here the $\mathcal{O}_i$ form a basis for the scalar local operators in the theory, and the $\Delta_i$ are their energy dimensions. In general there are infinitely many such operators, so the sum over $i$ here needs to be viewed somewhat heuristically. We have extracted a power of the cutoff $\Lambda$ from each coupling constant, chosen so that the quantities $g_i(\Lambda)$ are dimensionless. The idea of the Wilsonian approach is that if we lower the
cutoff from Λ0 to Λ (with Λ < Λ0 ), we should tune the Λ-dependence of the couplings so that the low-energy
physics is not affected. You may worry whether or not we can do this, but in fact there is a simple path
integral method: we split all fields into a “high-energy” part ΦH , consisting of the modes which exist for
cutoff Λ0 but not for cutoff Λ, and a “low-energy” part ΦL , consisting of the modes which exist for both
cutoffs. For any observable $O_L[\Phi_L]$ built only out of the low-energy modes we then have
$$\int \mathcal{D}\Phi_L\,\mathcal{D}\Phi_H\; O_L[\Phi_L]\, e^{iS_{\Lambda_0}[\Phi_L+\Phi_H]} = \int \mathcal{D}\Phi_L\; O_L[\Phi_L]\, e^{iS_\Lambda[\Phi_L]}, \tag{12.29}$$
where
$$e^{iS_\Lambda[\Phi_L]} \equiv \int \mathcal{D}\Phi_H\; e^{iS_{\Lambda_0}[\Phi_L+\Phi_H]}. \tag{12.30}$$
In other words, the low-energy effective action is obtained by starting with the full action and then integrating
out the high-energy modes. This process gives a flow in the space of actions (or equivalently a flow in the
space of coupling constants) which is called renormalization group flow.76
The operation (12.30) has an important defect: in general there is no reason for the action SΛ to be
local even if we start with a local action SΛ0 . On the other hand since we only integrated out modes whose
76 The name is misleading, as renormalization group flow is not invertible (how would you “un-integrate”?) so there isn’t
really a group structure. A more accurate name would be “renormalization semigroup”, but unfortunately we are stuck with
this one.
wavelengths are at most of order $1/\Lambda$, any non-localities we generate should be constrained to this scale. We can therefore Taylor expand them to express $S_\Lambda$ as a local action order by order in $1/\Lambda$. This suppression is already built into our expression (12.28), as each derivative increases the dimension of $\mathcal{O}_i$ and thus costs a power of $\Lambda$. As a simple example of this, we can consider a non-local term
$$\int d^dx\, d^dy\; K(x-y)\,\phi(x)\phi(y). \tag{12.31}$$
Since this came from integrating out short-distance modes with momenta roughly between Λ and Λ0 , the
Fourier transform of K(x − y) should be a reasonably smooth function with compact support in k. K will
therefore be an analytic function that decays rapidly at separations which are large compared to $1/\Lambda$. For $\phi$ configurations which vary only on scales which are large compared to $1/\Lambda$, we can therefore approximate $K$ as a sum of $\delta$-functions and their derivatives. In this way, given a local action $S_{\Lambda_0}$ with couplings $g_i(\Lambda_0)$, we can construct a local action $S_\Lambda$ with couplings $g_i(\Lambda)$ that gives the same low-energy physics. At first order in $\Lambda_0 - \Lambda$ the new couplings are functions of the old couplings only, so they must obey Wilson's renormalization group (RG) equation
$$\Lambda\frac{dg_i}{d\Lambda} = \beta_i(g(\Lambda)). \tag{12.33}$$
In other words we can think of the renormalization group flow as being the integral curves generated by a vector field $\beta_i$ on the space of couplings. In the last lecture we computed the $\beta$ function for the scalar coupling $\frac{\lambda}{4!}\phi^4$ in $d = 4$ at one loop, finding
$$\beta_\lambda(\lambda) = \frac{3\lambda^2}{16\pi^2}. \tag{12.34}$$
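This one-loop flow can be integrated in closed form: equation (12.33) separates, giving $\lambda(\Lambda) = \lambda_0\big/\big(1 - \frac{3\lambda_0}{16\pi^2}\log\frac{\Lambda}{\Lambda_0}\big)$, so the coupling shrinks logarithmically as we lower the cutoff. A quick numerical sketch (my check, with an arbitrary initial coupling) confirms this against a naive Euler integration:

```python
import math

# One-loop running of the lambda phi^4/4! coupling in d = 4:
# Lambda dlambda/dLambda = 3 lambda^2 / (16 pi^2).
def beta(lam):
    return 3 * lam**2 / (16 * math.pi**2)

lam0, t0, t1 = 1.0, 0.0, -5.0   # arbitrary start; t = log(Lambda) decreasing toward the IR

def lam_exact(t):
    # closed-form solution of dlambda/dt = beta(lambda)
    return lam0 / (1 - 3 * lam0 * (t - t0) / (16 * math.pi**2))

# naive Euler integration of the same flow
n = 100000
dt = (t1 - t0) / n
lam = lam0
for _ in range(n):
    lam += beta(lam) * dt

assert abs(lam - lam_exact(t1)) < 1e-6
print(f"lambda(IR) = {lam:.6f}  (the coupling decreases toward the IR)")
```

The positive $\beta$ function means the coupling instead grows toward the UV, eventually leaving the perturbative regime (the Landau pole of the closed-form denominator).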
It must be emphasized that this flow generically generates all possible couplings which are allowed by the
symmetries of the theory - it does NOT only generate renormalizable couplings with ∆i ≤ d.77
To study the behavior of the flow near a given solution $\bar g(\Lambda)$, we can linearize: writing $g_i = \bar g_i + \delta g_i$, at first order in $\delta g$ the RG equation (12.33) becomes
$$\Lambda\frac{d\,\delta g_i}{d\Lambda} = \sum_j M_{ij}\,\delta g_j, \tag{12.35}$$
with
$$M_{ij} = \frac{\partial\beta_i}{\partial g_j}. \tag{12.36}$$
77 The only exceptions I know of to this rule are free theories, conformal field theories for which all β vanish, and supersym-
i
metric theories.
Introducing matrix notation and also using $'$ to indicate a derivative with respect to $\log\Lambda$, we can rewrite this equation as
$$\delta g' = M\,\delta g. \tag{12.37}$$
So far this equation does not distinguish between renormalizable and non-renormalizable couplings. To distinguish them, it is useful to introduce a projection matrix
$$P_{ij} = \begin{cases} \delta_{ij} & i \text{ renormalizable} \\ 0 & \text{otherwise}, \end{cases} \tag{12.38}$$
together with the matrix $D(\Lambda)$ which propagates a linearized coupling variation down from the initial cutoff,
$$\delta g(\Lambda) = D(\Lambda)\,\delta g(\Lambda_0), \tag{12.39}$$
and then to define
$$\Pi = 1 - DP(PDP)^{-1}P, \tag{12.40}$$
which is designed to decouple the renormalizable and non-renormalizable couplings in the RG equation.78
Π is indeed a projection in the linear algebra sense of obeying
Π2 = Π, (12.41)
but it is not orthogonal in the sense that it doesn’t obey Π† = Π. P and Π are related by the equations79
$$P\Pi = 0, \qquad \Pi(1-P) = (1-P). \tag{12.42}$$
We now define
$$\xi = \Pi\,\delta g, \tag{12.43}$$
which by (12.42) has vanishing renormalizable components. Differentiating the relation $\delta g(\Lambda) = D(\Lambda)\,\delta g(\Lambda_0)$ and using (12.37), we have
$$D' = MD. \tag{12.46}$$
Moreover since for any matrix $N$ we have
$$(N^{-1})' = -N^{-1}N'N^{-1}, \tag{12.47}$$
78 The inverse matrix (P DP )−1 here should only be used on vectors which are in the image of P . Otherwise the inverse does
not exist.
79 The relationship between P and Π is interesting from a linear algebra point of view. If Π were hermitian then equations
(12.42) would imply that Π = 1 − P . Since Π is not hermitian, we can only conclude that (1 − P )v = v ⇔ Πv = v. We will see
in a moment however that what we are really interested in is the null space of Π, and this need not coincide with the null space
of 1 − P .
we also have
$$\xi' = \Pi\,\delta g' + \Pi'\,\delta g = \Pi M\,\delta g - \Pi M DP(PDP)^{-1}P\,\delta g = \Pi M \xi, \tag{12.49}$$
so the projection $\Pi$ has succeeded in decoupling the RG equation. Rewriting this in terms of $P$ and $D$ we have
$$\xi' = \left(M - DP(PDP)^{-1}PM\right)\xi. \tag{12.50}$$
So far our discussion has been non-perturbative. In a situation where perturbation theory is valid, we can usefully approximate the matrix $M$ using free field theory. In free field theory the action should not have any cutoff dependence since there are no loop diagrams, so we need the quantities
$$g_i(\Lambda)\,\Lambda^{d-\Delta_i} \tag{12.51}$$
to be independent of $\Lambda$. This requires
$$g_i(\Lambda) \propto \Lambda^{\Delta_i - d}, \tag{12.52}$$
and thus
$$\beta_i \approx (\Delta_i - d)\,g_i \tag{12.53}$$
and
$$M_{ij} = \partial_j\beta_i \approx (\Delta_i - d)\,\delta_{ij}. \tag{12.54}$$
The key point is then the following. The renormalizable components of $\xi$ are zero by construction, and to the extent that $M$ is diagonal in the same basis as $P$ we can ignore the second term in (12.50) since then
$$PM\xi \approx MP\xi = 0. \tag{12.55}$$
We therefore have
$$\xi_i' \approx \begin{cases} 0 & i \text{ renormalizable} \\ (\Delta_i - d)\,\xi_i & i \text{ non-renormalizable}. \end{cases} \tag{12.56}$$
The non-renormalizable couplings are precisely those for which $\Delta_i - d > 0$, so we thus see that the entire vector $\xi$ vanishes like a power of $\Lambda/\Lambda_0$ as we flow to $\Lambda \ll \Lambda_0$! Moreover this conclusion is preserved under perturbative corrections as long as these are small compared to $\Delta_i - d$ (which they always will be for small enough coupling). Once this suppression is complete, the full set of coupling variations needs to obey
$$\Pi\,\delta g = 0, \tag{12.57}$$
or more explicitly
$$\delta g = DP(PDP)^{-1}P\,\delta g. \tag{12.58}$$
In other words we can determine the change in all of the infinitely many non-renormalizable couplings by
looking at the change in the renormalizable couplings alone. Said differently, if we know the values of
all of the renormalizable couplings in the low-energy action then the non-renormalizable couplings are all
determined. This, in essence, is the statement of renormalizability!
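The suppression in (12.56) is easy to see numerically. In this sketch (my toy example, not from the notes) we take a single non-renormalizable coupling with $\Delta - d = 2$ and integrate $\xi' = (\Delta - d)\,\xi$ down from $\Lambda_0$ to $\Lambda = 10^{-2}\Lambda_0$; the deviation from the renormalizable attractor is crushed by the expected power $(\Lambda/\Lambda_0)^{\Delta - d}$:

```python
import math

# xi' = (Delta - d) xi with ' = d/dlog(Lambda). For Delta - d = 2 the
# solution is xi(Lambda) = xi0 (Lambda/Lambda0)^2: irrelevant deviations
# die off as a power of the cutoff ratio as we flow toward the IR.
delta_minus_d = 2
xi0 = 1.0
ratio = 1e-2          # Lambda / Lambda0

# Euler integration in t = log(Lambda/Lambda0), from 0 down to log(ratio)
n = 200000
t_end = math.log(ratio)
dt = t_end / n
xi = xi0
for _ in range(n):
    xi += delta_minus_d * xi * dt

exact = xi0 * ratio**delta_minus_d   # power-law prediction, here 1e-4
assert abs(xi - exact) / exact < 1e-3
print(f"xi(Lambda) ~ {xi:.2e}, power-law prediction {exact:.2e}")
```

Lowering the cutoff by two decades suppresses this particular deviation by four, which is the focusing onto the finite-dimensional attractor manifold described above.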
To see more closely the connection between (12.58) and renormalizability, we first should note that
although the matrix D, which depends on the cutoff Λ0 and initial couplings gi0 , appears in equation (12.58),
the relationship determining the non-renormalizable couplings in terms of the renormalizable ones actually
can’t depend on these. This is because the focusing behavior of the RG equation (12.33) is a purely local
affair in the space of couplings: we are solving a first-order differential equation, and we only need to
know what is going on in the vicinity of where we are solving it. The finite-dimensional attractor manifold
therefore cannot depend on where the flows started. This is the essential point: all low-energy observables
can be computed using only the low-energy action SΛ [ϕL ], and thus expressed entirely in terms of where
we are on the attractor manifold. We can parametrize where we are on this manifold using the low-energy
renormalizable couplings, in which case all results will depend only on these low-energy couplings and the
(low) cutoff scale Λ, NOT on the initial cutoff Λ0 or initial couplings gi0 . But this is precisely the statement
of renormalizability: all observables can be expressed as functions of the low-energy couplings and kinematic
variables without any dependence on the cutoff or the bare couplings.
Figure 32: Integrating out a heavy particle of mass M creates new interactions for the light fields which are suppressed by the mass of the particle.
In particular note the demotion of operators with $\Delta_i > d$ from "non-renormalizable" to "irrelevant": if we change the dimensionless coupling for an irrelevant operator by an $O(1)$ amount at short distance, the only effect at low energies is a shift of where we are on the attractor manifold that could just as well have
been achieved by changing the coefficients of the relevant and marginal operators alone. These days most
non-ancient theoretical physicists prefer this terminology for classifying operators to the old one, and in fact
it has been something of a chore for me to not use it thus far. From now on I will switch to using it.
Here we have not indicated how the indices are contracted or attempted to compute O(1) factors. The key
point however is that all interaction terms are suppressed by powers of the Planck mass
$$M_p = \sqrt{\frac{\hbar c}{G}} \approx 2.2\times 10^{-8}\,\mathrm{kg}. \tag{12.62}$$
This may not seem like a large mass compared to your own mass, but it is a gigantic energy scale for
an elementary particle. For example it is about 1019 times the mass of a proton, which is about 1 GeV.
Nonetheless gravity is a part of our everyday experience, since the tiny gravitational force of each proton
in the earth on each atom in our bodies adds up and there is no competing force to overwhelm it.
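To make the numbers concrete, here is the arithmetic behind (12.62), using standard rounded values of the fundamental constants:

```python
import math

# Planck mass M_p = sqrt(hbar c / G) and the corresponding energy scale.
hbar = 1.054571817e-34   # J s
c    = 2.99792458e8      # m / s
G    = 6.67430e-11       # m^3 / (kg s^2)

M_p = math.sqrt(hbar * c / G)            # kg
E_p_GeV = M_p * c**2 / 1.602176634e-10   # 1 GeV = 1.602...e-10 J

print(f"M_p ~ {M_p:.2e} kg")       # ~ 2.18e-08 kg
print(f"E_p ~ {E_p_GeV:.2e} GeV")  # ~ 1.22e+19 GeV, about 1e19 proton masses
```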
Another example of a field with only irrelevant interactions is a real scalar field with a shift symmetry
ϕ′ = ϕ + a. The Lagrangian for this theory can only be made out of derivatives of ϕ, and the only relevant or
marginal term of this type is the massless kinetic term $-\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi$. We therefore need to include irrelevant
operators such as
$$\mathcal{L} \supset -\frac{g}{\Lambda^d}\left(\partial_\mu\phi\,\partial^\mu\phi\right)^2 \tag{12.63}$$
to get nontrivial scattering. This example has great physical relevance, as it arises whenever there is a
“spontaneously-broken continuous global symmetry”. For example this happens in nuclear physics, where
the pion fields are the scalars, and also in condensed matter physics systems such as liquid helium at the
critical point. We will learn more about these next semester.
The unifying theme of these examples is that we can parametrize the effects of unknown high-energy
physics by including irrelevant operators in the low-energy theory suppressed by powers of the energy scale
Λ of that unknown physics. The theory including these terms is only valid when viewed as computing an
expansion in $E/\Lambda$. Such a theory is called an effective field theory. Our current best understanding of the
laws of physics is an effective field theory, as it includes irrelevant operators to explain gravity and also the
observed nonzero values of neutrino masses. Effective field theories inevitably break down when we consider
energies of order Λ, and to understand what happens then we need to know the real high-energy physics.
For pions and liquid helium we already know this, while finding it for gravity is one of the biggest problems
in physics. We will meet effective field theories again in the next semester, and in fact there is an entire class
about them taught by Iain Stewart here at MIT.80
where ωµν = −ωνµ . The first two terms here are infinitesimal Poincaré transformations, while b parametrizes
an infinitesimal dilation. The vector cα parametrizes the infinitesimal version of what is called a special
conformal transformation. A quantum field theory with conformal symmetry is called a conformal field
theory, so in relativistic field theory a fixed point of the renormalization group likely always corresponds to
a conformal field theory. In what follows we will not need to use conformal symmetry however, so we will
stick to the language of fixed points.
Fixed points are natural “starting” and “ending” points for the renormalization group flow. The typical
situation is that we begin with a “UV” fixed point, deform by a relevant operator with a small coefficient,
and then flow off in the space of couplings until we reach some other “IR” fixed point. As a simple example
we can consider our old friend the free massive scalar theory:
$$\mathcal{L} = -\frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi - \frac{m^2}{2}\phi^2. \tag{12.69}$$
In this theory the only nontrivial coupling is the mass $m^2$, which we will parametrize as a function of the cutoff by
$$m^2 = g_2\Lambda^2 \tag{12.70}$$
as usual. The renormalization group equation is quite simple: holding the physical mass $m^2$ fixed, (12.70) gives
$$\Lambda\frac{dg_2}{d\Lambda} = -2g_2, \tag{12.71}$$
so the $\beta$-function is
$$\beta_2(g_2) = -2g_2. \tag{12.72}$$
Thus we see that to get a fixed point we need $g_2 = 0$, or equivalently $m^2 = 0$. This is quite sensible: if $m^2$ is not zero then the theory has a dimensionful parameter and cannot be scale invariant. This fixed point is
the simplest conformal field theory: the massless free scalar. If we now deform the action by turning on a small nonzero value $g_{20}$ for $g_2$ at some cutoff scale $\Lambda_0$, then we now have
$$g_2(\Lambda) = g_{20}\left(\frac{\Lambda_0}{\Lambda}\right)^2. \tag{12.73}$$
When $\Lambda \sim \Lambda_0$ this is a small contribution, but as we lower $\Lambda$ it grows, and once we get to the regime where
$$\Lambda \sim \Lambda_0\sqrt{g_{20}} \tag{12.74}$$
this deformation has a large effect on the theory. Of course in this case the right-hand side of (12.74) is just
the mass m, so it is hardly news that the mass becomes important for energies E ≲ m. Indeed below this
scale there are no states except for the ground state, so in this particular renormalization group flow the IR
fixed point is the trivial conformal field theory with zero degrees of freedom.
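A minimal numerical illustration of this crossover (hypothetical numbers): start with a tiny dimensionless mass coupling at a high cutoff and watch (12.73) amplify it until it becomes $O(1)$ at the crossover scale, which is just the physical mass:

```python
import math

# Flow of the dimensionless mass coupling g2(Lambda) = g20 (Lambda0/Lambda)^2.
# With m = 1 (arbitrary units) and cutoff Lambda0 = 1000, the starting value
# g20 = m^2/Lambda0^2 is tiny, but the flow amplifies it as Lambda decreases.
Lambda0 = 1000.0
m = 1.0
g20 = m**2 / Lambda0**2   # = 1e-6

def g2(Lam):
    return g20 * (Lambda0 / Lam) ** 2

# the deformation becomes O(1) at Lambda* = Lambda0 sqrt(g20), which equals m
Lambda_star = Lambda0 * math.sqrt(g20)
assert abs(Lambda_star - m) < 1e-12
assert abs(g2(Lambda_star) - 1.0) < 1e-12
print(f"g2({Lambda0:.0f}) = {g20:.1e}, crossover at Lambda* = {Lambda_star} = m")
```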
This last example may have you worried that the IR fixed point in quantum field theory is often the
trivial CFT. Indeed this is generically the case, for a simple reason: as long as the candidate IR CFT has at
least one scalar relevant operator which is invariant under all symmetries of the UV CFT, then generically
the RG flow is repelled from the fixed point in this direction so we need to tune a continuous parameter in
the initial conditions to hit it. See figure 33 for an illustration. If the candidate IR CFT has more than one
invariant relevant operator, then we need to tune a continuous parameter for each such operator. Sometimes
however you get lucky: the IR CFT can have no invariant relevant operators or there may be some reason
why they cannot be turned on. We will meet examples of this type next semester.
Figure 33: Renormalization group flows in the vicinity of a UV fixed point (shown in blue) with two relevant
operators and an IR fixed point (shown in red) with one relevant operator. To hit the IR fixed point, we
need to tune the initial flow direction from the blue point, otherwise we flow off to what is likely a trivial
theory.
exhibiting spontaneous magnetization when T < Tc where Tc is either called the Curie temperature or the
critical temperature. In particular for T ≈ Tc this system is described by the Euclidean version of our old
friend the massive scalar field ϕ, with Lagrangian
$$\mathcal{L}_E = \frac{1}{2}\partial_\mu\phi\,\partial^\mu\phi + \frac{m^2}{2}\phi^2 + \frac{\lambda}{4!}\phi^4. \tag{12.75}$$
Here ϕ is essentially the average magnetization, the Ising spin-flip symmetry is represented as ϕ′ = −ϕ,
and82
$$m^2 \propto T - T_c, \tag{12.76}$$
so the phase transition happens at m = 0. This tuning to m = 0 is precisely the tuning mentioned at the end
of the previous section, so at low energy this theory at m = 0 should be described by a nontrivial conformal
field theory with one relevant operator that is invariant under the spin-flip symmetry. This theory is not so
easy to compute in, as for d = 3 the operator ϕ4 in the free theory (the UV fixed point) is relevant so at low
energy its dimensionless coupling becomes strong. Indeed finding a reliable way to do computations in the
IR fixed point of the critical Ising model in d = 3 is one of the most famous problems in theoretical physics.83
We will now see that by using a clever trick due to Wilson and Fisher we can compute some aspects of this
theory surprisingly reliably using results we have already obtained.
We first need to get a sense of what kind of quantity we would like to compute. The first thing to note
is that as we flow to the IR fixed point some renormalization of the operator ϕ2 will typically be necessary.
In other words, the operator which has cutoff-independent correlation functions will have the form
$$\phi^2 = \Lambda^{\gamma_{\phi^2}}\,[\phi^2]_0, \tag{12.77}$$
where $[\phi^2]_0$ is the "bare" $\phi^2$ operator at the UV fixed point and $\gamma_{\phi^2}$ is called the anomalous dimension of $\phi^2$. The full energy dimension of $\phi^2$ at the IR fixed point (working for the moment in $d$ spacetime dimensions)
82 The argument for this is that m2 should vanish at T = Tc by scale invariance and the effective Lagrangian should be analytic
in T , so generically it should vanish linearly.
83 For d ≥ 4 the ϕ4 coupling is marginal or irrelevant (and in the marginal d = 4 case it still flows to zero in the IR since the
one-loop β function is positive), so the IR CFT is just the massless free scalar theory. For d = 2 the scalar description breaks
down due to infrared divergences and other methods are needed; we will show next semester that the IR fixed point for d = 2
is actually a free fermion theory.
is thus
$$\Delta_{\phi^2} = d - 2 + \gamma_{\phi^2}. \tag{12.78}$$
We can read off the anomalous dimension of $\phi^2$ from its Euclidean two-point function at the critical point, since by dimensional analysis this must be given by
$$\langle\phi^2(x)\,\phi^2(y)\rangle = \frac{C}{|x-y|^{2\Delta_{\phi^2}}} \tag{12.79}$$
with C a dimensionless constant. It is convenient to take the Fourier transform of this, which by dimensional
analysis must be
$$\langle\phi^2(k_1)\,\phi^2(k_2)\rangle = (2\pi)^d\,\delta^d(k_1+k_2)\,D\,|k_1|^{2\Delta_{\phi^2}-d}, \tag{12.80}$$
with $D$ again a dimensionless constant. Since the quantity $m^2\phi^2$ must have dimension $d$, we must have
$$\Delta_{m^2} = d - \Delta_{\phi^2} = 2 - \gamma_{\phi^2}, \tag{12.81}$$
so we can write
$$m^2 = \xi^{\gamma_{\phi^2}-2}, \tag{12.82}$$
where ξ has units of length and is called the correlation length of the system. Combining this with (12.76),
we see that we must have
$$\frac{1}{\xi} \sim (T - T_c)^{\nu} \tag{12.83}$$
with
$$\nu = \frac{1}{2 - \gamma_{\phi^2}}. \tag{12.84}$$
ν here is an example of what is called a critical exponent, and the relation (12.83) is easily measurable
in a real magnet. There are other critical exponents for other thermodynamic quantities, and all of them
can be related to the energy dimensions of relevant operators in the IR CFT. Computing these dimensions
is thus the central problem in understanding the Ising phase transition.84
Now, following Wilson and Fisher, let’s see how to compute the anomalous dimension γϕ2 . The method
we will use is called the “ϵ-expansion”, and if this is the first time you are hearing it you may think I am
crazy. The idea is to continuously connect the nontrivial IR CFT in d = 3 to the free scalar CFT in d = 4
by taking d = 4 − 2ϵ, expanding perturbatively in ϵ, and then setting ϵ = 1/2. It is not clear a priori that
this is a good thing to do, but it turns out that the O(1) coefficients in this expansion work out in such a
way that ϵ = 1/2 is small enough to get a decent approximation.85 Let’s first recall that in d = 4 we found
the expression
$$\beta = \frac{3\lambda^2}{16\pi^2} \tag{12.85}$$
for the β-function of the quartic coupling in λϕ4 theory. For d = 4 − 2ϵ this coupling becomes dimensionful,
so following the Wilsonian approach we should introduce a dimensionless coupling $g_4$ via
$$\lambda = g_4\,\Lambda^{2\epsilon}. \tag{12.86}$$
The coupling $g_4$ thus has nontrivial scale dependence even in the free theory, scaling like $g_4(\Lambda) \sim \Lambda^{-2\epsilon}$. Its $\beta$-function at one loop is thus
$$\beta = -2\epsilon\,g_4 + \frac{3g_4^2}{16\pi^2}. \tag{12.87}$$
84 In fact the same IR CFT governs many other physical systems, including the critical point of the phase diagram of water. All
of these systems have the same critical exponents, which is a rather remarkable convergence given the great differences in the
underlying physics of these systems. This “universality” is a beautiful illustration of the focusing power of the renormalization
group.
85 There are other more modern (and more rigorous) approaches to doing this calculation, but for the most part they require
substantial numerical work while the ϵ-expansion gives quick analytic results that already work pretty well.
Figure 34: Leading diagrams contributing to an insertion of [ϕ2 ]0 into a correlation function.
We can therefore find a fixed point by canceling these two terms against each other, leading to
$$g_4^* = \frac{32\pi^2\epsilon}{3}. \tag{12.88}$$
If we take $\epsilon \to \frac{1}{2}$ this gives a rather large coupling in $d = 3$, but we can boldly press ahead and see what we
find for the anomalous dimension γϕ2 .
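We can check this fixed point and its stability numerically (my sketch; the IR stability follows because the slope of the $\beta$-function at $g_4^*$ is $+2\epsilon > 0$, so with $\Lambda\,dg_4/d\Lambda = \beta$ perturbations shrink as $\Lambda$ is lowered):

```python
import math

# One-loop beta function in d = 4 - 2*eps: beta(g) = -2 eps g + 3 g^2/(16 pi^2).
eps = 0.5   # d = 3

def beta(g):
    return -2 * eps * g + 3 * g**2 / (16 * math.pi**2)

g_star = 32 * math.pi**2 * eps / 3   # Wilson-Fisher fixed point, eq. (12.88)
assert abs(beta(g_star)) < 1e-12

# slope of beta at the fixed point: beta'(g*) = -2 eps + 6 g*/(16 pi^2) = +2 eps
slope = -2 * eps + 6 * g_star / (16 * math.pi**2)
assert abs(slope - 2 * eps) < 1e-12

# flow a perturbed coupling toward the IR and watch it approach g*
g, dt = 1.5 * g_star, -1e-3   # t = log(Lambda) decreasing
for _ in range(20000):
    g += beta(g) * dt
assert abs(g - g_star) / g_star < 1e-3
print(f"g4* = {g_star:.3f}, IR-attractive with slope beta'(g*) = {slope:.2f}")
```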
For small ϵ we can compute γϕ2 by studying the renormalization of the composite operator ϕ2 . So far
we have not discussed how to compute correlation functions of composite operators, but the basic idea is
simple: start with the pieces of the operator at different points, and then bring them together ignoring any
diagrams with propagators connecting the pieces of the operator. In particular for ϕ2 we subtract a factor
of GF (0), removing the obvious divergence proportional to the identity operator as we bring the two ϕ’s
together. This renormalization of composite operators is called normal ordering, and it must be done even
in free field theory to define a sensible composite operator. We can also understand normal ordering in the
operator approach, where the divergence arises from the term
dd−1 p dd−1 p
Z Z
1 ′
Φ2 (x) ⊃ d−1 √ ei(p−p )x ap⃗ a†p⃗ ′ , (12.89)
(2π) (2π)d−1 2 ωp⃗ ωp⃗′
which has the divergent vacuum expectation value
$$\langle\Phi^2(x)\rangle = \int \frac{d^{d-1}p}{(2\pi)^{d-1}}\,\frac{1}{2\omega_{\vec p}} = G_F(0). \tag{12.90}$$
What normal ordering does in free field theory is re-order all products of a’s and a† ’s so that the a† ’s are
to the left of the a’s, ensuring a vanishing vacuum expectation value. It is convenient to instead compute correlation functions of the rescaled operator $\frac{1}{2}\phi^2$, as Feynman diagrams for these have symmetry factors that work in the way we are familiar with (the $1/2$ is similar to the $1/4!$ we put in front of $\phi^4$, and cancels the two ways that incoming propagators can be attached to the operator). The leading diagrams arising from an insertion of $\frac{1}{2}[\phi^2]_0$ are shown in figure 34. Evaluating these diagrams we see that at this order the
only effect is to multiply the Fourier transform
$$\frac{1}{2}[\phi^2(p)]_0 = \int d^dx\; e^{-ip\cdot x}\,\frac{1}{2}[\phi^2(x)]_0 \tag{12.91}$$
by a factor
$$N_{\phi^2} = 1 - \frac{\lambda}{2}\,I(p) + O(\lambda^2), \tag{12.92}$$
where I(p) is our old friend
$$I(q) = \int \frac{d^d\ell}{(2\pi)^d}\,\frac{1}{(\ell^2+m^2)\,((\ell+q)^2+m^2)} = \frac{1}{16\pi^2}\left(\frac{1}{\epsilon} - \gamma + \log(4\pi) - \int_0^1 dx\,\log\left(m^2 + x(1-x)q^2\right)\right). \tag{12.93}$$
Here we are interested in the massless case, so rewriting things in terms of $g_4$ we have
$$N_{\phi^2} = 1 - \frac{g_4}{32\pi^2}\left(\frac{1}{\epsilon} - \gamma + \log(4\pi) + 2 + \log\frac{\Lambda^2}{p^2}\right) + O(g_4^2). \tag{12.94}$$
Absorbing the finite one-loop contributions into a rescaling of the cutoff via
$$\log\Lambda'^2 = \frac{1}{\epsilon} - \gamma + \log(4\pi) + 2 + \log\Lambda^2, \tag{12.95}$$
we can write this as
$$N_{\phi^2} = 1 - \frac{g_4}{32\pi^2}\log\frac{\Lambda'^2}{p^2} + O(g_4^2) = \left(\frac{\Lambda'^2}{p^2}\right)^{-\frac{g_4}{32\pi^2}}\left(1 + O(g_4^2)\right). \tag{12.96}$$
Therefore we can remove the cutoff dependence by defining the renormalized operator
$$\phi^2 = \Lambda'^{\frac{g_4}{16\pi^2}}\,[\phi^2]_0. \tag{12.97}$$
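Putting the pieces together (a sketch of the arithmetic, using the identification $\gamma_{\phi^2} = g_4/16\pi^2$ suggested by (12.97)): evaluating at the Wilson–Fisher fixed point (12.88) gives $\gamma_{\phi^2} = 2\epsilon/3$, and then (12.84) with $\epsilon = 1/2$ yields the one-loop estimate $\nu = 0.6$, already close to the value $\nu \approx 0.63$ known for the 3d Ising universality class:

```python
import math

# One-loop epsilon-expansion estimate of the critical exponent nu.
# gamma_{phi^2} = g4/(16 pi^2) evaluated at the fixed point g4* = 32 pi^2 eps/3,
# then nu = 1/(2 - gamma_{phi^2}) as in (12.84).
eps = 0.5                                  # d = 4 - 2 eps = 3
g4_star = 32 * math.pi**2 * eps / 3
gamma_phi2 = g4_star / (16 * math.pi**2)   # = 2 eps / 3
nu = 1 / (2 - gamma_phi2)

assert abs(gamma_phi2 - 2 * eps / 3) < 1e-12
print(f"gamma_phi2 = {gamma_phi2:.4f}, nu = {nu:.4f}")   # nu = 0.6000
```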
86 These values are from Zinn-Justin’s book “Quantum field theory and critical phenomena”.