Particles Physics
Stephen P. Martin
James D. Wells
Elementary
Particles
and Their
Interactions
Graduate Texts in Physics
Series Editors
Kurt H. Becker, NYU Polytechnic School of Engineering, Brooklyn, NY, USA
Jean-Marc Di Meglio, Matière et Systèmes Complexes, Bâtiment Condorcet, Université
Paris Diderot, Paris, France
Sadri Hassani, Department of Physics, Illinois State University, Normal, IL, USA
Morten Hjorth-Jensen, Department of Physics, Blindern, University of Oslo, Oslo,
Norway
Bill Munro, NTT Basic Research Laboratories, Atsugi, Japan
Richard Needs, Cavendish Laboratory, University of Cambridge, Cambridge, UK
William T. Rhodes, Department of Electrical Engineering and Computer Science,
Florida Atlantic University, Boca Raton, FL, USA
Susan Scott, Australian National University, Acton, Australia
H. Eugene Stanley, Center for Polymer Studies, Physics Department, Boston
University, Boston, MA, USA
Martin Stutzmann, Walter Schottky Institute, Technical University of Munich,
Garching, Germany
Andreas Wipf, Institute of Theoretical Physics, Friedrich-Schiller-University Jena,
Jena, Germany
Graduate Texts in Physics publishes core learning/teaching material for graduate-
and advanced-level undergraduate courses on topics of current and emerging fields
within physics, both pure and applied. These textbooks serve students at the MS-
or PhD-level and their instructors as comprehensive sources of principles, defi-
nitions, derivations, experiments and applications (as relevant) for their mastery
and teaching, respectively. International in scope and relevance, the textbooks cor-
respond to course syllabi sufficiently to serve as required reading. Their didactic
style, comprehensiveness and coverage of fundamental material also make them
suitable as introductions or references for scientists entering, or requiring timely
knowledge of, a research field.
Stephen P. Martin, Physics Department, Northern Illinois University, DeKalb, IL, USA
James D. Wells, Physics Department, University of Michigan, Ann Arbor, MI, USA
This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface
Our most fundamental understanding of the laws of nature is embodied in the the-
ories of General Relativity and the Standard Model of elementary particle physics.
There are many excellent books about the Standard Model for students to consult.
However, the assumed background for the students is different for every book, and
the emphasis is different. For example, some authors do not assume that the stu-
dents have a good understanding of quantum field theory, and so present particle
physics without it. Other authors, on the other hand, present the material with the
assumption that the student already has a good working knowledge of quantum field theory.
In contrast, our book is intended for a one-semester course for graduate
students or advanced undergraduates that develops particle physics and quantum
field theory with equal emphasis while pursuing two goals. First, we want students
to come away with a basic and solid understanding of quantum field theory
techniques aimed at computing observables that are commonly studied by
experimentalists, such as those in pp and e+ e− collisions and particle decays. Second,
we want students to gain a comprehensive overview of the full structure of the
Standard Model of elementary particle physics. In other words, students will learn
what the basic constituents of nature are (leptons, quarks, etc.), the symmetries that
they obey, and the resulting interactions among them.
Our hope is that if a student has only one formal, structured course to devote
to particle physics, for one reason or another, this book will serve that purpose
well. We have successfully taught the material of this book at Northern Illinois
University and at the University of Michigan. At Northern Illinois
University, this book constitutes the material of the most advanced formal course
that a particle physicist usually takes before beginning guided research. The book
was originally conceived with that goal in mind. At the University of Michigan,
this course is taught as the first particle physics course to graduate students in
their first year, with no prior knowledge of quantum field theory assumed.
Furthermore, numerous graduate students outside of particle physics (mathematics,
engineering, etc.) have taken the course as part of their “graduate cognate
requirement” at the University, partly because they know that in one semester they
have the opportunity to learn both quantum field theory fundamentals and particle
physics.
Given the aim of this book, to be a one-semester course, decisions have been
made to focus the material for that purpose. We have left topics that are more
Contents
1 Introduction
1.1 Fundamental Forces
1.2 Resonances, Widths, and Lifetimes
1.3 Leptons and Quarks
1.4 Hadrons
1.5 Decays and Branching Ratios
2 Special Relativity and Lorentz Transformations
2.1 Lorentz Transformations
2.2 Relativistic Kinematics
2.3 Tensors and Lorentz Invariant Quantities
2.4 Maxwell’s Equations and Electromagnetism
Problems
3 Relativistic Quantum Mechanics of Single Particles
3.1 Klein-Gordon and Dirac Equations
3.2 Solutions of the Dirac Equation
3.3 The Weyl Equation
3.4 Majorana Fermions
Problems
4 Field Theory and Lagrangians
4.1 The Field Concept and Lagrangian Dynamics
4.2 Quantization of Free Scalar Field Theory
4.3 Quantization of Free Dirac Fermion Field Theory
4.4 Scalar Field with $\phi^4$ Coupling
4.5 Scattering Processes and Cross-Sections
4.6 Scalar Field with $\phi^3$ Coupling
4.7 Feynman Rules
Problems
5 Quantum Electro-Dynamics (QED)
5.1 QED Lagrangian and Feynman Rules
5.2 Electron-Positron Scattering
5.2.1 $e^- e^+ \to \mu^- \mu^+$
5.2.2 $e^- e^+ \to f \bar f$
Appendix
Index
1 Introduction
In this book, we will explore some of the tools necessary for attacking the fundamental
questions of elementary particle physics. These questions include:
The Standard Model of particle physics proposes some answers to these questions.
Although the Standard Model is an incomplete fundamental description of nature,
it is the benchmark against which future theories will be compared. Furthermore,
if new physics is uncovered at the CERN Large Hadron Collider (LHC), or other
future experiments, it is likely that it can be described using the same set of tools.
This Introduction contains a brief outline of the known fundamental particle con-
tent of the Standard Model, for purposes of orientation. These and many other exper-
imental results about elementary particles can be found in the Review of Particle
Properties, hereafter known as the RPP.1 Unless otherwise indicated, all experimental
data quoted in this book were obtained from the RPP.
1 R. L. Workman et al. (Particle Data Group), to be published in Prog. Theor. Exp. Phys. 2022,
083C01 (2022) with frequent updates.
The known interaction forces in nature are the universal attraction of gravity, the
electromagnetic force, the weak nuclear force, and the strong nuclear force. Among
these, gravity is special and is governed by Einstein’s theory of General Relativity.
The other forces are gauge theories. The definition of gauge theories and their prop-
erties will be explored extensively throughout this book. Here let it suffice to say that
a gauge force is one that is mediated by a spin-1 (vector) boson. The force-mediator
gauge bosons that we know about in the Standard Model are listed in Table 1.1.
The photon is the mediator of the electromagnetic force, while the W ± and Z 0
bosons mediate the weak nuclear force, which is seen primarily in decays and in
neutrino interactions. The W + and W − bosons are antiparticles of each other, so they
have exactly the same mass and lifetime. The gluon has an exact 8-fold degeneracy
due to a degree of freedom known as “color”. Color is the charge associated with the
strong nuclear force. Particles that carry net color charges are always confined by
the strong nuclear force, meaning that they can only exist in bound states. Therefore,
no value is listed for the gluon lifetime, and the entry “0” for its mass is meant to
indicate only that the classical wave equation for it has the same character as that of
the photon. Although the fundamental spin-1 bosons are often called force carriers,
that is not their only role, since they are particles in their own right.
Table 1.1 also includes information about the width of the W and Z particle
resonances, measured in units of mass, GeV/$c^2$. In general, resonances can be described
by a relativistic Breit-Wigner lineshape, which gives the probability for the kinematic
mass reconstructed from the production and decay of the particle to have a particular
value $M$, in the idealized limit of perfect detector resolution and an isolated state.
For a particle of mass $m$, the probability is:
\[ P(M) = \frac{f(M)}{(M^2 - m^2)^2 + m^2\,\Gamma(M)^2}, \tag{1.1} \]
where $f(M)$ and $\Gamma(M)$ are functions that usually vary slowly over the resonance
region $M \approx m$, and thus can be treated as constants. The resonance width $\Gamma \equiv \Gamma(m)$
carries the same information as the mean lifetime $\tau$, which appears in the next column
of Table 1.1; they are related by
\[ \tau = \frac{\hbar}{\Gamma c^2}. \tag{1.2} \]
The RPP lists the mean lifetime $\tau$ for some particles, and the width $\Gamma$ for others.
Actually, the Standard Model of particle physics predicts the width of the W boson far
more accurately than the experimentally measured width indicated in Table 1.1. The
predicted width, with uncertainties from input parameters, is $\Gamma_W = 2.091 \pm 0.002$
GeV/$c^2$.
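As a rough numerical illustration of how a width translates into a lifetime, the following Python sketch evaluates $\tau = \hbar/\Gamma$ for the predicted W width quoted above, and checks that the lineshape of Eq. (1.1) falls to half its peak value where $M^2 = m^2 \pm m\Gamma$. The function names, the numerical value of $\hbar$, and the approximate W mass are assumptions introduced here for illustration (natural units with $c = 1$); they are not taken from the text.

```python
import math

HBAR_GEV_S = 6.582119569e-25  # reduced Planck constant in GeV*s (assumed input)

def lifetime_from_width(width_gev):
    """Mean lifetime in seconds from a total width in GeV: tau = hbar / Gamma (c = 1)."""
    return HBAR_GEV_S / width_gev

def breit_wigner(M, m, width, f=1.0):
    """Unnormalized lineshape of Eq. (1.1), treating f and the width as constants."""
    return f / ((M**2 - m**2) ** 2 + m**2 * width**2)

GAMMA_W = 2.091  # GeV, predicted W width quoted in the text
M_W = 80.4       # GeV, approximate W mass (illustrative assumption)

tau_W = lifetime_from_width(GAMMA_W)
print(f"W mean lifetime ~ {tau_W:.2e} s")

# The lineshape drops to half of its peak value at M^2 = m^2 + m*Gamma.
peak = breit_wigner(M_W, M_W, GAMMA_W)
half = breit_wigner(math.sqrt(M_W**2 + M_W * GAMMA_W), M_W, GAMMA_W)
print(f"half/peak = {half / peak:.3f}")
```

The same helper can be used in reverse: dividing $\hbar$ by a lifetime in seconds gives the corresponding width in GeV.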
The fermions listed in Tables 1.2 and 1.3 are often considered as divided into
families, or generations. The first family is $e^-$, $\nu_e$, $d$, $u$, the second is $\mu^-$, $\nu_\mu$, $s$, $c$, and
the third is $\tau^-$, $\nu_\tau$, $b$, $t$. The masses of the fermions of a given charge increase with
the family. The weak interactions mediated by $W^\pm$ bosons can change quarks of one
family into those of another, but it is an experimental fact that these family-changing
reactions are highly suppressed. All of the fermions listed above also have
corresponding antiparticles, with the opposite charge and color, but the same mass and
spin. The antileptons are positively charged $e^+$, $\mu^+$, $\tau^+$ and antineutrinos $\bar\nu_e$, $\bar\nu_\mu$, $\bar\nu_\tau$.
For each quark, there is an antiquark ($\bar d$, $\bar u$, $\bar s$, $\bar c$, $\bar b$, $\bar t$) with the same mass but the
opposite charge. Antiquarks carry anticolor (anti-red, anti-blue, or anti-green).
The masses of the five lightest quarks (d, u, s, c, b) are somewhat uncertain,
and even the definition of the mass of a quark is subject to technical difficulties and
ambiguities. This is related to the fact that quarks exist only in colorless bound states,
called hadrons, due to the confining nature of the strong force. A colorless bound state
can be formed either from three quarks (a baryon), or from three antiquarks (an anti-
baryon), or from a quark with a given color and an antiquark with the corresponding
anti-color (a meson). All baryons are fermions with half-integer spin, and all mesons
are bosons with integer spin. The quark mass values shown in Table 1.3 correspond to
particular technical definitions of quark mass used by the RPP,2 but other definitions
give quite different values. The lifetimes of the d, u, s, c, b quarks are also fuzzy,
and are best described in terms of the hadrons in which they live. In contrast, the
top quark mass is relatively well-known, with an uncertainty under a percent. This is
because the top-quark mean lifetime (about $4.6 \times 10^{-25}$ s) is so short that it decays
before it can form hadronic bound states (which take roughly $3 \times 10^{-24}$ s to form).
Therefore it behaves like a free particle during its short life, and so its mass and width
can be defined in a way that is not subject to large ambiguities. Each of these quarks
has an exact 3-fold degeneracy, associated with the color that is the source charge
for the strong force. The colors are often represented by the labels red, green, and
blue, but these are just arbitrary labels; there is no experiment that could tell a red
quark from a green quark, even in principle.
There is also a Higgs boson, with spin 0 and charge 0. It was discovered in 2012,
and its mass has been measured to be 125.25 ± 0.17 GeV. Some extensions of the
Standard Model predict that this Higgs boson is not fundamental and is a composite
state of other particles. However, the data collected to date are consistent with the
Higgs boson being another elementary particle.
1.4 Hadrons
As remarked above, quarks and antiquarks are always found as part of colorless
bound states. The most common are the nucleons (the proton and the neutron), the
baryons that make up most of the directly visible mass in the universe. They and other
similar baryons with total angular momentum (including both constituent spins and
orbital angular momentum) J = 1/2 are listed in Table 1.4.
2 Here, we have quoted “$\overline{\mathrm{MS}}$ masses” for u, d, s, c, b, and the “pole mass” for t.
The quarks listed in parentheses are the valence quarks of the bound state, but there
are also virtual (or “sea”) quark-antiquark pairs and virtual gluons in each of these
and other hadrons. The proton may be absolutely stable; experiments to try to observe
its decays have not found any, resulting in only a very high lower bound on the mean
lifetime. The neutron lifetime is also relatively long, but it decays into a proton,
electron, and antineutrino ($n \to p e^- \bar\nu_e$). The other $J = 1/2$ baryons decay in times
of order $10^{-10}$ s by weak interactions, except for the $\Sigma^0$ baryon, which decays
extremely quickly by an electromagnetic interaction into the $\Lambda$, which has the same
valence quark content: $\Sigma^0 \to \Lambda\gamma$. In that sense, one can think of the $\Sigma^0$ as being an
excited state of the $\Lambda$. There are other excited states of these baryons, not listed here.
The mass of the baryons in Table 1.4 increases with the number of valence strange
quarks contained.
Note that the masses of the proton and the neutron (and all other hadrons) are
much larger than the sums of the masses of the valence quarks that make them
up. These nucleon masses come about from the strong interactions by a mechanism
known as chiral symmetry breaking. Nucleons dominate the visible mass of particles
in the universe. Therefore, it is only partially correct to say that the Higgs boson is
needed to understand the “origin of mass”. Most of the masses of the W ± and Z
bosons and the top, bottom, charm, and strange quarks and the leptons are indeed
believed to come from the Higgs mechanism, to be discussed below. However, the
Higgs mechanism is by no means necessary to understand the origin of all mass, and
in particular it is definitely not the explanation for most of the mass that is directly
observed in the universe.
There are also J = 3/2 baryons, with some of the more common ones listed in
Table 1.5. The RPP uses a slightly different notation for the $\Sigma^*$ and $\Xi^*$ $J = 3/2$
baryons. Instead of the $*$ notation to differentiate these states from the corresponding
$J = 1/2$ baryons with the same quantum numbers, the RPP chooses to denote them
by their approximate mass in MeV (as determined by older experiments, so a little off
from the present best values) in parentheses, so $\Sigma(1385)$ and $\Xi(1530)$. Very narrow
Table 1.4 Baryons with J = 1/2 made from light (u, d, s) quarks
J = 1/2 baryon      Charge   Mass (GeV/$c^2$)   Lifetime (s)
$p$ (uud)           +1       0.938272           $> 6.6 \times 10^{36}$
$n$ (udd)           0        0.939565           880.3
$\Lambda$ (uds)     0        1.11568            $2.63 \times 10^{-10}$
$\Sigma^+$ (uus)    +1       1.18937            $8.02 \times 10^{-11}$
$\Sigma^0$ (uds)    0        1.19264            $7.4 \times 10^{-20}$
$\Sigma^-$ (dds)    −1       1.19745            $1.48 \times 10^{-10}$
$\Xi^0$ (uss)       0        1.31486            $2.9 \times 10^{-10}$
$\Xi^-$ (dss)       −1       1.32171            $1.64 \times 10^{-10}$
and
0c resonances with masses ranging from 2.29 GeV/c2 to 2.7 GeV/c2 , and those
with a bottom quark include the 0b , 0b , − − + −
b ,
b , b , and b , with masses ranging
from 5.62 GeV/c2 to 5.82 GeV/c2 . More information about them can be found in
the RPP. Again, there are other baryons, generally with heavier masses, that can be
thought of as excited states of the more common ones listed above.
Bound states of a valence quark and antiquark are called mesons. They always
carry integer total angular momentum J . The most common J = 0 mesons are listed
in Table 1.6.
Here the bar over a quark name denotes the corresponding antiquark. The charged
pions $\pi^\pm$ are antiparticles of each other, as are the charged kaons $K^\pm$, so they are
exactly degenerate mass pairs with the same lifetime. However, the $K^0$ and $\bar K^0$
mesons are mixed and not quite exactly degenerate in mass, as will be discussed in
greater detail in Chap. 11. One of the interaction eigenstates ($K_L^0$) is actually much
longer-lived than the other ($K_S^0$); the mean lifetimes are respectively $5.12 \times 10^{-8}$
and $8.95 \times 10^{-11}$ s. The lifetimes (and the widths) of the other $J = 0$ mesons are
not listed here; you can find them yourself in the RPP.
Besides the J = 0 mesons listed above, there are counterparts containing a single
heavy (charm or bottom) quark or antiquark, with the other antiquark or quark light
(up, down or strange). The most common ones are listed in Table 1.7.
There are also J = 0 mesons containing only charm and bottom quarks and
antiquarks. The ones with the lowest masses are listed in Table 1.8.
Vector (J = 1) mesons are also very important. Table 1.9 lists the most common
ones that contain only light (u, d, s) valence quarks and antiquarks.
The most common J = 1 mesons containing one heavy (c or b) quark or antiquark
are likewise shown in Table 1.10. Note that these have the same charges and slightly
larger masses than the corresponding J = 0 mesons in Table 1.7. Mesons with J = 1
and with both quark and antiquark heavy are shown in Table 1.11.
In principle, there should also be $J = 1$ $B_c^{*\pm}$ mesons, but (unlike their $J = 0$
counterparts in Table 1.8) their existence has not been established experimentally.
Table 1.7 J = 0 mesons containing one heavy and one light quark and antiquark
J = 0 meson                                          Charge   Mass (GeV/$c^2$)
$D^0$, $\bar D^0$ ($c\bar u$); ($u\bar c$)           0        1.8648
$D^\pm$ ($c\bar d$); ($d\bar c$)                     ±1       1.8696
$D_s^\pm$ ($c\bar s$); ($s\bar c$)                   ±1       1.9683
$B^\pm$ ($u\bar b$); ($b\bar u$)                     ±1       5.279
$B^0$, $\bar B^0$ ($d\bar b$); ($b\bar d$)           0        5.280
$B_s^0$, $\bar B_s^0$ ($s\bar b$); ($b\bar s$)       0        5.367
Table 1.10 J = 1 mesons containing one heavy and one light quark and antiquark
J = 1 meson                                               Charge   Mass (GeV/$c^2$)
$D^{*0}$, $\bar D^{*0}$ ($c\bar u$); ($u\bar c$)          0        2.007
$D^{*\pm}$ ($c\bar d$); ($d\bar c$)                       ±1       2.010
$D_s^{*\pm}$ ($c\bar s$); ($s\bar c$)                     ±1       2.112
$B^{*0}$, $\bar B^{*0}$ ($d\bar b$); ($b\bar d$)          0        5.325
$B^{*\pm}$ ($u\bar b$); ($b\bar u$)                       ±1       5.325
$B_s^{*0}$, $\bar B_s^{*0}$ ($s\bar b$); ($b\bar s$)      0        5.415
The heavy quarkonium ($c\bar c$ and $b\bar b$) systems have other states besides the $\eta_c$, $J/\psi$
and $\eta_b$, $\Upsilon$ from Tables 1.8 and 1.11. For the $c\bar c$ system, there are $J = 0, 1, 2$ mesons
$\chi_{c0,1,2}$ that have the quark and antiquark in P-wave orbital angular momentum
states. There are also states $\eta_c(2S)$, $\psi(2S)$, $\psi(3770)$, $\psi(3872)$ that are similar to
the $\eta_c$ and $J/\psi$, but with excited radial bound-state wavefunctions. Similarly, in the
$b\bar b$ system, there are excited bottomonium states $\Upsilon(2S)$, $\Upsilon(3S)$, $\Upsilon(4S)$, $\Upsilon(10860)$,
and $\Upsilon(11020)$ with $J = 1$, and P-wave orbital angular momentum states with total
$J = 0, 1, 2$, $\chi_{b0,1,2}(1P)$ and $\chi_{b0,1,2}(2P)$. The spectroscopy of these states provides a
striking confirmation of the quark model for hadrons and of the strong force.
Much more detailed information on all of these hadronic bound states (and many
others not listed above), including the decay widths and the decay products, can be
found in the RPP. In Fig. 1.1 we show a scatter plot of the mass and lifetime of many of the
elementary particles and bound-state particles that we have discussed in this chapter.
One sees that they fill out many orders of magnitude in both mass and lifetime. Some
reasons for the ordering of masses will become clear in future chapters, such as why
the pion masses are lower than the proton and neutron masses, but other orderings
of the particle masses remain a mystery, such as why the muon mass is much lower
than the top quark mass.
Theoretically, one also expects exotic mesons that are mostly “gluonium” or
glueballs, that is, bound states of gluons. However, these states are expected to mix
with excited quark-antiquark bound states, and they will be extremely difficult to
identify experimentally.
In collider experiments, hadrons are most often produced in groups called jets.
Roughly speaking, each jet can be thought of as originating, at the shortest dis-
tance scales, from individual gluons and quarks (partons) which then hadronize by
complicated processes into collections of final state particles that share the energy
and momentum of the original parton. The hadrons in a given jet have momenta in
approximately the same direction as their parent parton.
Fig. 1.1 Mass versus lifetime for many elementary particles and composite hadrons discussed in
this chapter
1.5 Decays and Branching Ratios
In some cases, hadrons can decay through the strong interactions, with widths of
order tens or hundreds of MeV. Some examples include:
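\begin{align}
\Delta^{++} &\to p\,\pi^+ \tag{1.3} \\
\rho^- &\to \pi^0 \pi^- \tag{1.4} \\
\omega &\to \pi^+ \pi^- \pi^0 \tag{1.5} \\
\phi &\to K^+ K^-. \tag{1.6}
\end{align}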
There are also decays that are mediated by electromagnetic interactions, for example:
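\begin{align}
\pi^0 &\to \gamma\gamma \tag{1.7} \\
\Sigma^+ &\to p\,\gamma \tag{1.8} \\
\Sigma^0 &\to \Lambda\gamma \tag{1.9} \\
\rho^0 &\to \pi^+ \pi^- \gamma. \tag{1.10}
\end{align}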
The smallest decay widths for hadrons are those mediated by the weak interactions,
for example:
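\begin{align}
n &\to p\, e^- \bar\nu_e \tag{1.11} \\
\pi^- &\to \mu^- \bar\nu_\mu \tag{1.12} \\
K^+ &\to \pi^+ \pi^0 \tag{1.13} \\
B^+ &\to \bar D^0 \mu^+ \nu_\mu \tag{1.14} \\
\Omega^- &\to \Lambda K^-. \tag{1.15}
\end{align}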
The weak interactions are also entirely responsible for the decays of the charged
leptons:
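\begin{align}
\mu^- &\to \nu_\mu e^- \bar\nu_e \tag{1.16} \\
\tau^- &\to \nu_\tau e^- \bar\nu_e \tag{1.17} \\
\tau^- &\to \nu_\tau \mu^- \bar\nu_\mu \tag{1.18} \\
\tau^- &\to \nu_\tau + \text{hadrons}. \tag{1.19}
\end{align}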
Experimentally, the hadronic τ decays are classified by the number of charged
hadrons present in the final state, as either “1-prong” (if exactly one charged hadron),
“3-prong” (if exactly three charged hadrons), etc.
In most cases, a variety of different decay modes contribute to each total decay
width. The fraction that each final state contributes to the total decay width is known
as the branching ratio (or branching fraction), usually abbreviated as BR or B. As a
randomly chosen example, in the case of the ω meson the strong interaction accounts
for most, but not all, of the decays:
The sum of all of the branching ratios is equal to 1, and the sum of the partial widths
is equal to the total decay width.
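The bookkeeping between partial widths, the total width, and branching ratios can be sketched in a few lines of Python. The decay-mode labels and the partial widths below are hypothetical placeholders chosen only to make the arithmetic visible; they are not measured values for any particle, and the function name is our own.

```python
def branching_ratios(partial_widths):
    """Return (branching ratios, total width) from a dict of partial widths.

    Each branching ratio is the partial width divided by the total width,
    where the total width is the sum of all partial widths.
    """
    total = sum(partial_widths.values())
    ratios = {mode: width / total for mode, width in partial_widths.items()}
    return ratios, total

# Hypothetical partial widths in MeV, for illustration only.
partial = {"mode A": 6.0, "mode B": 3.0, "mode C": 1.0}

br, total_width = branching_ratios(partial)
for mode, value in br.items():
    print(f"BR({mode}) = {value:.2f}")
print(f"total width = {total_width} MeV, sum of BRs = {sum(br.values()):.2f}")
```

Given a measured total width and a branching ratio, the same relation run backwards yields the partial width for that mode as their product.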
There are two roads to enlightenment regarding the Standard Model and its future
replacement. The experimental road, which is highly successful as indicated by the
impressive volume and detail in the RPP, finds the answers to masses, decay rates,
branching ratios, production rates, and even more detailed information like kinematic
and angular distributions directly from data in high-energy collisions. The theoretical
road aims to match these results onto predictions of quantum field theories specified
in terms of a small number of parameters. In the case of electromagnetic interactions,
\[ (ct, x, y, z) = (x^0, x^1, x^2, x^3) = x^\mu \tag{2.1} \]
The Greek indices $\mu, \nu, \rho, \ldots$ run over the values 0, 1, 2, 3, and $c$ is the speed of
light in vacuum. As a matter of terminology, $x^\mu$ is an example of a contravariant
four-vector.
The laws of physics should not depend on what coordinate system we use, as long
as it is an “inertial reference frame”, which means that the coordinates describing the
position of a free classical particle do not accelerate. This invariance of the laws of
physics is a guiding principle in making a sensible theory. It is often useful to change
our coordinate system from one inertial reference frame to another, according to
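\[ x^\mu \to x'^\mu = L^\mu{}_\nu x^\nu. \tag{2.2} \]
\[ x'^\mu = (ct', x', y', z') \tag{2.3} \]
where
\[
\begin{aligned}
ct' &= ct \\
x' &= x \cos\alpha + y \sin\alpha \\
y' &= -x \sin\alpha + y \cos\alpha \\
z' &= z.
\end{aligned} \tag{2.4}
\]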
Alternatively, we could go to a frame moving with respect to the original frame with
velocity v along the z direction, with the origins of the two frames coinciding at time
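$t = t' = 0$. Then:
\[
\begin{aligned}
ct' &= \gamma (ct - \beta z) \\
x' &= x \\
y' &= y \\
z' &= \gamma (z - \beta ct),
\end{aligned} \tag{2.5}
\]
where
\[ \beta = v/c; \qquad \gamma = 1/\sqrt{1 - \beta^2}. \tag{2.6} \]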
\[
\begin{aligned}
x'^0 &= x^0 \cosh\rho - x^3 \sinh\rho \\
x'^1 &= x^1 \\
x'^2 &= x^2 \\
x'^3 &= -x^0 \sinh\rho + x^3 \cosh\rho.
\end{aligned} \tag{2.7}
\]
This change of coordinates is called a boost (with rapidity $\rho$ and in the $\hat z$ direction).
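The two ways of writing a boost along $z$, Eqs. (2.5) and (2.7), can be compared numerically. The Python sketch below assumes the standard identification $\tanh\rho = \beta$ (so that $\cosh\rho = \gamma$ and $\sinh\rho = \beta\gamma$), and also checks that successive boosts along the same axis simply add rapidities. The function names and the sample event coordinates are illustrative choices, not taken from the text.

```python
import math

def boost_z(x, beta):
    """Boost of Eq. (2.5) applied to x = (ct, x, y, z), with beta = v/c."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    ct, px, py, pz = x
    return (gamma * (ct - beta * pz), px, py, gamma * (pz - beta * ct))

def boost_z_rapidity(x, rho):
    """The same boost written with rapidity rho, Eq. (2.7)."""
    ct, px, py, pz = x
    return (ct * math.cosh(rho) - pz * math.sinh(rho), px, py,
            -ct * math.sinh(rho) + pz * math.cosh(rho))

def interval(x):
    """Invariant combination (ct)^2 - x^2 - y^2 - z^2."""
    ct, px, py, pz = x
    return ct**2 - px**2 - py**2 - pz**2

beta = 0.6
rho = math.atanh(beta)            # assumes tanh(rho) = beta
event = (2.0, 1.0, -3.0, 0.5)     # arbitrary sample coordinates

via_beta = boost_z(event, beta)
via_rho = boost_z_rapidity(event, rho)

# Two successive boosts along z are equivalent to one boost with summed rapidity.
twice = boost_z_rapidity(boost_z_rapidity(event, 0.3), 0.4)
once = boost_z_rapidity(event, 0.7)
```

The additivity of rapidity is the main practical reason to prefer $\rho$ over $\beta$ when composing boosts along a common axis.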
Another example of a contravariant four-vector is given by the 4-momentum
formed from the energy $E$ and spatial momentum $\vec p$ of a particle:
\[ p^\mu = (E/c, \vec p\,). \tag{2.8} \]
\[ a'^\mu = L^\mu{}_\nu a^\nu. \tag{2.9} \]
A key property of special relativity is that for any two events one can define a
proper interval, which is independent of the Lorentz frame, and which tells us how
far apart the two events are in a coordinate-independent sense. So, consider two
events occurring at $x^\mu$ and $x^\mu + d^\mu$, where $d^\mu$ is some four-vector displacement.
The proper interval between the events is
where
\[ g_{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix} \tag{2.13} \]
is known as the metric tensor. Here, and from now on, we adopt the Einstein summa-
tion convention, in which repeated indices μ, ν, . . . are taken to be summed over. It
is an assumption of special relativity that gμν is the same in every inertial reference
frame.
The existence of the metric tensor allows us to define covariant four-vectors by
lowering an index:
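where $\delta^\mu_\nu = 1$ if $\mu = \nu$, and otherwise $= 0$. It follows that
\[ g^{\mu\nu} = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & -1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{2.17} \]
\[ a_\mu = g_{\mu\nu} a^\nu; \qquad a^\mu = g^{\mu\nu} a_\nu. \tag{2.18} \]
\[ a'_\mu = L_\mu{}^\nu a_\nu \tag{2.19} \]
\[ L_\mu{}^\nu = g_{\mu\rho}\, g^{\nu\sigma} L^\rho{}_\sigma. \tag{2.20} \]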
Because one can always use the metric to go between contravariant and covariant
four-vectors, people often use a harmlessly sloppy terminology and neglect the dis-
tinction, simply referring to them as four-vectors.
If $a^\mu$ and $b^\mu$ are any four-vectors, then
\[ a^\mu b^\nu g_{\mu\nu} = a_\mu b_\nu g^{\mu\nu} = a_\mu b^\mu = a^\mu b_\mu \equiv a \cdot b \tag{2.21} \]
is a scalar quantity. For example, if $p^\mu$ and $q^\mu$ are the four-momenta of any two
particles, then $p \cdot q$ is a Lorentz-invariant; it does not depend on which inertial
reference frame it is measured in. In particular, a particle with mass $m$ satisfies the
on-shell condition
\[ p^2 = p^\mu p_\mu = E^2/c^2 - \vec p^{\,2} = m^2 c^2. \tag{2.22} \]
The Lorentz invariance of dot products of pairs of 4-momenta, plus the conservation
of total four-momentum, plus the on-shell condition (2.22), is enough to solve most
problems in relativistic kinematics.
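As a small numerical check of these statements, the Python sketch below builds two on-shell four-momenta, boosts both along $z$, and confirms that $p \cdot q$ and $p^2 = m^2$ are unchanged. Units with $c = 1$ are assumed, and the masses, momenta, boost velocity, and function names are arbitrary illustrative choices rather than values from the text.

```python
import math

METRIC_DIAG = (1.0, -1.0, -1.0, -1.0)  # diagonal entries of g_{mu nu}, Eq. (2.13)

def minkowski_dot(a, b):
    """a . b = a^mu b^nu g_{mu nu}, Eq. (2.21)."""
    return sum(g * x * y for g, x, y in zip(METRIC_DIAG, a, b))

def on_shell_momentum(m, px, py, pz):
    """Four-momentum (E, px, py, pz) satisfying the on-shell condition (2.22) with c = 1."""
    return (math.sqrt(m**2 + px**2 + py**2 + pz**2), px, py, pz)

def boost_z(p, beta):
    """Boost along z with velocity beta, in the form of Eq. (2.5)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    E, px, py, pz = p
    return (gamma * (E - beta * pz), px, py, gamma * (pz - beta * E))

# Hypothetical masses and momenta (GeV), for illustration only.
p = on_shell_momentum(0.5, 0.3, -0.2, 1.1)
q = on_shell_momentum(1.2, -0.7, 0.4, 0.9)

p_boosted = boost_z(p, 0.8)
q_boosted = boost_z(q, 0.8)

print(minkowski_dot(p, q), minkowski_dot(p_boosted, q_boosted))
print(minkowski_dot(p, p), minkowski_dot(p_boosted, p_boosted))
```

Both printed pairs should agree up to floating-point rounding, which is the numerical counterpart of the frame independence of $a \cdot b$.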
Let us pause to illustrate this with an example. Consider the situation of two particles,
each of mass m, colliding. Suppose the result of the collision is two final-state
particles each of mass M. Let us find the threshold energy and momentum 4-vectors
for this process in the COM (center-of-momentum) frame and in the frame in which
one of the initial-state particles is at rest. Throughout most of the following, we will
take c = 1, by a choice of units.
Relativistic kinematics problems are often more easily analyzed in the COM
frame, so let us consider that case first. Without loss of generality, we can take the
colliding initial-state particles to be moving along the z-axis. Then their 4-momenta
are:
\[ p_1^\mu = (E, 0, 0, \sqrt{E^2 - m^2}), \tag{2.23} \]
\[ p_2^\mu = (E, 0, 0, -\sqrt{E^2 - m^2}). \tag{2.24} \]
The spatial momenta are required to be opposite by the definition of the COM
frame, which in turn requires the energies to be the same, using (2.22) and the fact
that the masses are assumed equal. The total 4-momentum of the initial state is
$p^\mu = (2E, 0, 0, 0)$, and so this must be equal to the total 4-momentum of the final
state in the COM frame as well. Furthermore,
\[ p^2 = 4E^2 \tag{2.25} \]
The angle θ parametrizes the arbitrary direction of the scattering. Without loss of
generality, we have taken the scattering to occur within the yz plane, as shown:
[Figure: COM-frame scattering diagram with incoming momenta $p_1$, $p_2$ and outgoing momenta $k_1$, $k_2$]
The fact that we are in the COM frame again requires the spatial momenta to be
opposite, and thus the energies to be equal to a common value E f because of the
assumed equal masses M. Now, requiring conservation of total 4-momentum gives
$k_1^\mu + k_2^\mu = p_1^\mu + p_2^\mu$, so $E_f = E$. In order for the spatial momentum components
to be real, we therefore find the energy threshold condition in the COM frame
Now let us reconsider the problem in a frame where one of the initial-state particles
is at rest, corresponding to a fixed-target experiment. In the Lab frame,
\[ p_1'^\mu = (E', 0, 0, \sqrt{E'^2 - m^2}), \tag{2.29} \]
\[ p_2'^\mu = (m, 0, 0, 0) \tag{2.30} \]
are the 4-momenta of the two initial-state particles, and $E'$ is the Lab frame energy
of the moving particle. The total initial state 4-momentum is therefore
$p'^\mu = (E' + m, 0, 0, \sqrt{E'^2 - m^2})$, leading to a Lorentz invariant
This must be the same as (2.25), so the Lab frame energy is related to the COM
energy of each particle by
\[ m(E' + m) = 2E^2. \tag{2.32} \]
Because we already found $E > M$, the Lab frame threshold energy condition for the
scattering event to be possible is $m(E' + m) > 2M^2$, or
\[ E' > E'_{\rm thresh} = \frac{2M^2 - m^2}{m}. \tag{2.33} \]
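To get a feel for the numbers, the following Python sketch evaluates the Lab-frame threshold of Eq. (2.33) and checks it against the COM relation of Eq. (2.32). The beam particle is taken to be a proton with the mass from Table 1.4, while the heavy-particle mass $M = 10$ GeV is a purely hypothetical value; the function names are our own, and $c = 1$ is assumed.

```python
import math

def lab_threshold_energy(m, M):
    """Lab-frame threshold energy E'_thresh = (2 M^2 - m^2) / m, Eq. (2.33)."""
    return (2.0 * M**2 - m**2) / m

def com_energy_per_particle(E_lab, m):
    """COM energy E of each beam particle from m (E' + m) = 2 E^2, Eq. (2.32)."""
    return math.sqrt(m * (E_lab + m) / 2.0)

m_beam = 0.938272   # proton mass in GeV (Table 1.4)
M_heavy = 10.0      # hypothetical heavy-particle mass in GeV

E_thresh = lab_threshold_energy(m_beam, M_heavy)
E_com = com_energy_per_particle(E_thresh, m_beam)

print(f"fixed-target threshold: E' = {E_thresh:.1f} GeV")
print(f"equivalent COM energy per beam: E = {E_com:.3f} GeV")
```

At threshold the COM energy per particle comes out equal to $M$, as it must, while the fixed-target beam energy is larger by a factor of roughly $2M/m$.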
Let us also relate the Lab frame 4-momenta to those in the COM frame. To find the
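Lorentz transformation needed to go from the COM frame to the Lab frame, consider
the 0, 3 components of the equation $p_2'^\mu = \Lambda^\mu{}_\nu p_2^\nu$:
\[ \begin{pmatrix} m \\ 0 \end{pmatrix} = \begin{pmatrix} \gamma & \beta\gamma \\ \beta\gamma & \gamma \end{pmatrix} \begin{pmatrix} E \\ -\sqrt{E^2 - m^2} \end{pmatrix}. \tag{2.34} \]
It follows that
\[ \beta = \sqrt{1 - m^2/E^2} = \sqrt{\frac{E' - m}{E' + m}}, \tag{2.35} \]
\[ \gamma = \frac{1}{\sqrt{1 - \beta^2}} = E/m = \sqrt{\frac{E' + m}{2m}}, \tag{2.36} \]
\[ \beta\gamma = \sqrt{E^2/m^2 - 1} = \sqrt{\frac{E' - m}{2m}}. \tag{2.37} \]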
Now we can apply this Lorentz boost to the final-state momenta as found in the COM
frame to obtain the Lab frame momenta. For the first final-state particle:
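$$k_1^{\prime\mu} = \begin{pmatrix} \gamma & 0 & 0 & \beta\gamma \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ \beta\gamma & 0 & 0 & \gamma \end{pmatrix} \begin{pmatrix} E \\ 0 \\ \sin\theta \sqrt{E^2 - M^2} \\ \cos\theta \sqrt{E^2 - M^2} \end{pmatrix} \tag{2.38}$$
$$= (E^2/m) \begin{pmatrix} 1 + \cos\theta \sqrt{1 - m^2/E^2}\,\sqrt{1 - M^2/E^2} \\ 0 \\ \sin\theta\, (m/E) \sqrt{1 - M^2/E^2} \\ \sqrt{1 - m^2/E^2} + \cos\theta \sqrt{1 - M^2/E^2} \end{pmatrix}. \tag{2.39}$$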
Note that for M > m, the z-component of the momentum is always positive (in the
same direction as the incoming particle in the Lab frame), regardless of the sign of
cos θ . (The other final-state momentum is obtained by just flipping the signs of cos θ
and sin θ .) The Lab-frame scattering angle with respect to the original collision axis
(the z-axis in both frames) is determined by
$$\tan\theta' = \frac{(m/E)\sin\theta}{\sqrt{1 - m^2/E^2}\big/\sqrt{1 - M^2/E^2} + \cos\theta}. \tag{2.40}$$
For fixed $\theta$ in the COM frame, $|\theta'|$ in the Lab frame decreases with increasing $E/m$,
as the produced particles go more in the forward direction.
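The boost in (2.38) and the angle formula (2.40) can be cross-checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample values are illustrative assumptions, and natural units with c = 1 are used throughout.

```python
import math

def boost_to_lab(E, m, M, theta):
    """Boost the COM-frame final-state momentum k1 along z into the Lab frame.

    E: COM energy of each particle, m: initial-state mass, M: final-state mass,
    theta: COM scattering angle in the yz plane.
    Returns (energy, kx, ky, kz) in the Lab frame.
    """
    gamma = E / m                               # as in (2.36)
    beta_gamma = math.sqrt(gamma**2 - 1.0)      # as in (2.37)
    k = math.sqrt(E**2 - M**2)                  # COM 3-momentum magnitude
    ky, kz = k * math.sin(theta), k * math.cos(theta)
    return (gamma * E + beta_gamma * kz, 0.0, ky, beta_gamma * E + gamma * kz)

def lab_angle(E, m, M, theta):
    """Lab-frame angle from the z axis, read off the boosted components."""
    _, _, ky, kz = boost_to_lab(E, m, M, theta)
    return math.atan2(ky, kz)

# Illustrative values (assumed): E = 5, m = 1, M = 3 in arbitrary energy units.
for theta in (0.5, 1.5, 3.0):
    print(f"COM angle {theta:.2f} -> Lab angle {lab_angle(5.0, 1.0, 3.0, theta):.4f}")
```

Because M > m in this example, the boosted z-component stays positive even for backward COM angles, so every Lab angle printed is smaller than π/2.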
Notice from (2.28) and (2.33) that while the production of a pair of heavy particles
of mass $M$ requires beam energies in symmetric collisions that scale like $M$, in fixed-target
collisions the energy required scales like $2M^2/m \gg M$, where $m$ is the beam
particle mass. This is why fixed-target collisions are no longer an option for frontier
physics discoveries of very heavy particles or high-energy phenomena.
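To make the scaling concrete, the two threshold conditions can be compared numerically. This is a rough sketch under stated assumptions: the function names are hypothetical, and the beam mass and heavy-particle mass below are illustrative numbers in GeV, not values taken from the text.

```python
import math

def com_threshold(M):
    """Minimum energy per beam particle in the COM frame, from the condition E > M."""
    return M

def lab_threshold(M, m):
    """Minimum Lab-frame beam energy from (2.33): (2 M^2 - m^2) / m."""
    return (2.0 * M**2 - m**2) / m

def com_energy_per_particle(E_lab, m):
    """Solve (2.32) for the COM energy of each particle, given the Lab beam energy."""
    return math.sqrt(m * (E_lab + m) / 2.0)

# Illustrative numbers (assumed): proton-mass beam, heavy pair with M = 100 GeV.
m_beam, M_heavy = 0.938, 100.0
print(f"COM threshold per beam: {com_threshold(M_heavy):.1f} GeV")
print(f"Lab threshold:          {lab_threshold(M_heavy, m_beam):.1f} GeV")
```

With these inputs the fixed-target beam energy comes out more than two hundred times larger than the energy per beam needed in a symmetric collider.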
In collider applications, it is common to see the direction of a final-state particle
with respect to the colliding beams described either by the pseudo-rapidity η or the
longitudinal rapidity y. Suppose that the two colliding beams are oriented so that
Beam 1 is going in the ẑ direction and Beam 2 is going in the −ẑ direction. A final
state particle (or group of particles) emerging at an angle θ with respect to Beam 1
in general has a four-vector momentum given by:
where $p_T = |\vec p| \sin\theta$ is the transverse momentum, $p_z = |\vec p| \cos\theta$ is the longitudinal
momentum, and $E = \sqrt{|\vec p|^2 + m^2}$ is the energy, with $m$ the mass and $\vec p$ the three-vector
momentum. (In hadron colliders, this four-vector is generally defined in the
lab frame, not in the center-of-momentum frame of the scattering event, which is
often unknown.) Then the pseudo-rapidity is defined by
$$\eta = \frac{1}{2} \ln\left(\frac{|\vec p| + p_z}{|\vec p| - p_z}\right) = -\ln\left[\tan(\theta/2)\right]. \tag{2.42}$$
In fact, η = y in the special case of a massless particle, and they are very nearly
equal for a particle whose energy is large compared to its mass. However, in general
y does depend on the energy. For the same particle, the ordinary rapidity is given
by:
$$\rho = \frac{1}{2} \ln\left(\frac{E + |\vec p|}{E - |\vec p|}\right). \tag{2.44}$$
The quantity y is the rapidity of the boost needed to move to a frame where the
particle has no longitudinal momentum along the beam direction, while ρ is the
rapidity of the boost needed to move to the particle’s rest frame. Confusingly, it has
become a standard abuse of language among collider physicists to call y simply the
rapidity, and among non-collider physicists it is common to see the letter η used to
refer to the ordinary rapidity, called ρ here. Some care is needed to ensure that one
is using and interpreting these quantities consistently.
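A short numerical comparison may help keep the three quantities apart. The sketch below is illustrative only: the function names and sample momenta are assumptions, and the longitudinal rapidity is taken to be the usual collider definition y = (1/2) ln[(E + p_z)/(E − p_z)].

```python
import math

def pseudorapidity(p_t, p_z):
    """eta from (2.42), with |p| = sqrt(pT^2 + pz^2); requires pT > 0."""
    p = math.hypot(p_t, p_z)
    return 0.5 * math.log((p + p_z) / (p - p_z))

def longitudinal_rapidity(p_t, p_z, m):
    """y = (1/2) ln[(E + pz)/(E - pz)] (standard collider definition, assumed)."""
    energy = math.sqrt(p_t**2 + p_z**2 + m**2)
    return 0.5 * math.log((energy + p_z) / (energy - p_z))

def ordinary_rapidity(p_t, p_z, m):
    """rho from (2.44); requires m > 0."""
    p = math.hypot(p_t, p_z)
    energy = math.sqrt(p**2 + m**2)
    return 0.5 * math.log((energy + p) / (energy - p))

# Illustrative values in GeV (assumed): a pion-like mass with pT = 1, pz = 2.
p_t, p_z, mass = 1.0, 2.0, 0.1396
eta = pseudorapidity(p_t, p_z)
y = longitudinal_rapidity(p_t, p_z, mass)
rho = ordinary_rapidity(p_t, p_z, mass)
theta = math.atan2(p_t, p_z)
print(f"eta = {eta:.4f}, -ln tan(theta/2) = {-math.log(math.tan(theta / 2)):.4f}")
print(f"y   = {y:.4f}")
print(f"rho = {rho:.4f}")
```

For a massive particle with positive p_z, y lies slightly below η, and ρ is at least as large as y, since ρ uses the full momentum rather than only its longitudinal part.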
Now let us return to the study of the properties of Lorentz transformations. The
Lorentz-invariance of equation (2.21) implies that, if $a^\mu$ and $b^\mu$ are constant four-vectors,
then
so that
This is the fundamental constraint that a Lorentz transformation matrix must satisfy.
In matrix form, it could be written as $L^T g L = g$. If we contract (2.47) with $g^{\rho\kappa}$, we
obtain
$$L_\nu{}^\kappa L^\nu{}_\sigma = \delta^\kappa_\sigma. \tag{2.48}$$
Applying this to (2.2) and (2.19), we find that the inverse Lorentz transformation of
any four-vector is
$$a^\nu = a'^\mu L_\mu{}^\nu, \tag{2.49}$$
$$a_\nu = a'_\mu L^\mu{}_\nu. \tag{2.50}$$
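The matrix form of the constraint, and the inverse transformation it implies, can be checked on an explicit example. This is a minimal sketch under assumptions: the metric signature (+, −, −, −), the sign convention chosen for the boost, and all function names are illustrative choices rather than the text's own.

```python
import math

METRIC = [[1.0, 0, 0, 0], [0, -1.0, 0, 0], [0, 0, -1.0, 0], [0, 0, 0, -1.0]]
IDENTITY = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def transpose(a):
    return [list(row) for row in zip(*a)]

def boost_z(rho):
    """Boost along z with rapidity rho (sign convention assumed)."""
    c, s = math.cosh(rho), math.sinh(rho)
    return [[c, 0, 0, s], [0, 1, 0, 0], [0, 0, 1, 0], [s, 0, 0, c]]

def rotation_z(angle):
    """Rotation about the z axis."""
    c, s = math.cos(angle), math.sin(angle)
    return [[1, 0, 0, 0], [0, c, -s, 0], [0, s, c, 0], [0, 0, 0, 1]]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

# A generic proper Lorentz transformation: a boost followed by a rotation.
L = matmul(boost_z(0.7), rotation_z(0.3))
lhs = matmul(matmul(transpose(L), METRIC), L)      # L^T g L
L_inv = matmul(matmul(METRIC, transpose(L)), METRIC)  # g L^T g

print("max |L^T g L - g| =", max_abs_diff(lhs, METRIC))
print("max |L_inv L - 1| =", max_abs_diff(matmul(L_inv, L), IDENTITY))
```

The second check uses the fact that L^T g L = g together with g² = 1 gives the inverse as g L^T g, which is the matrix counterpart of lowering one index and raising the other.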
This just flips the sign of the time coordinate, and is therefore known as time reversal:
$$x'^0 = -x^0, \qquad x'^1 = x^1, \qquad x'^2 = x^2, \qquad x'^3 = x^3. \tag{2.52}$$
so that:
$$x'^0 = x^0, \qquad x'^1 = -x^1, \qquad x'^2 = -x^2, \qquad x'^3 = -x^3. \tag{2.54}$$
It was once thought that the laws of physics have to be invariant under these oper-
ations. However, it was shown experimentally in the 1950s that parity is violated
in the weak interactions, specifically in the weak decays of the $^{60}$Co nucleus and
the $K^\pm$ mesons. Likewise, experiments in the 1960s on the decays of $K^0$ mesons
showed that time-reversal invariance is violated (at least if very general properties
of quantum mechanics and special relativity are assumed).
However, all experiments up to now are consistent with invariance of the laws
of physics under the subset of Lorentz transformations that are continuously con-
nected to the identity; these are known as “proper” Lorentz transformations and have
det(L) = +1. They can be built up out of infinitesimal Lorentz transformations:
where we agree to drop everything with more than one $\omega^\mu{}_\nu$. Then, according to
(2.47),
or
Therefore
$$F'(x') = F(x). \tag{2.59}$$
Then
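$$\partial_\mu F \equiv \frac{\partial F}{\partial x^\mu} = \left(\frac{1}{c}\frac{\partial F}{\partial t}, \vec\nabla F\right) \tag{2.60}$$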
One can obtain another scalar function by acting twice with the 4-dimensional
derivative operator on F, contracting the indices on the derivatives:
$$\partial^\mu \partial_\mu F = \frac{1}{c^2}\frac{\partial^2 F}{\partial t^2} - \nabla^2 F. \tag{2.63}$$
The object $-\partial^\mu \partial_\mu F$ is a 4-dimensional generalization of the Laplacian.
A tensor is an object that can carry an arbitrary number of spacetime vector
indices, and transforms appropriately when one goes to a new reference frame. The
objects $g^{\mu\nu}$ and $g_{\mu\nu}$ and $\delta^\mu_\nu$ are constant tensors. Four-vectors and scalar functions
and 4-derivatives of them are also tensors. In general, the defining characteristic of
a tensor function $T^{\mu_1\mu_2\ldots}_{\nu_1\nu_2\ldots}(x)$ is that under a change of reference frame, it transforms
so that in the primed coordinate system, the corresponding tensor $T'$ is:
$$T'^{\mu_1\mu_2\ldots}_{\nu_1\nu_2\ldots}(x') = L^{\mu_1}{}_{\rho_1} L^{\mu_2}{}_{\rho_2} \cdots L_{\nu_1}{}^{\sigma_1} L_{\nu_2}{}^{\sigma_2} \cdots T^{\rho_1\rho_2\ldots}_{\sigma_1\sigma_2\ldots}(x). \tag{2.64}$$
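A special and useful constant tensor is the totally antisymmetric Levi-Civita tensor:
$$\epsilon_{\mu\nu\rho\sigma} = \begin{cases} +1 & \text{if } \mu\nu\rho\sigma \text{ is an even permutation of } 0123 \\ -1 & \text{if } \mu\nu\rho\sigma \text{ is an odd permutation of } 0123 \\ 0 & \text{otherwise} \end{cases} \tag{2.65}$$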
One use for the Levi-Civita tensor is in understanding the Lorentz invariance of
four-dimensional integration. Define 4 four-vectors so that in a particular frame they
are given by the infinitesimal differentials:
$$d^4x \equiv dx^0\, dx^1\, dx^2\, dx^3 = A^\mu B^\nu C^\rho D^\sigma \epsilon_{\mu\nu\rho\sigma} \tag{2.70}$$
An example of a relativistic theory that we are familiar with is electricity and mag-
netism. It is instructive to recast Maxwell’s equations into a manifestly relativistic
form. This will give us familiarity with the four-component gauge field formulation of
the relativistic wave equations governing electromagnetic fields. We will also see the
relativistic version of gauge invariance and the concept of gauge transformations for the
electromagnetic field, which will be explored more completely in later sections.
Recall that Maxwell’s equations can be written in the form:
$$\frac{\partial\rho}{\partial t} + \vec\nabla \cdot \vec J = 0. \tag{2.76}$$
To put this into a Lorentz-invariant form, we can form a four-vector charge and
current density:
$$J^\mu = (\rho, \vec J), \tag{2.77}$$
$$\partial_\mu J^\mu = 0. \tag{2.78}$$
Furthermore, (2.74) and (2.75) imply that we can write the electric and magnetic
fields as derivatives of the electric and magnetic potentials V and A:
$$\vec E = -\vec\nabla V - \frac{\partial \vec A}{\partial t}, \tag{2.79}$$
$$\vec B = \vec\nabla \times \vec A. \tag{2.80}$$
$$A^\mu = (V, \vec A), \tag{2.81}$$
then (2.79) and (2.80) mean that we can write the electric and magnetic fields as
components of an antisymmetric tensor:
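$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \tag{2.82}$$
$$= \begin{pmatrix} 0 & E_x & E_y & E_z \\ -E_x & 0 & -B_z & B_y \\ -E_y & B_z & 0 & -B_x \\ -E_z & -B_y & B_x & 0 \end{pmatrix}. \tag{2.83}$$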
Now the Maxwell equations (2.72) and (2.73) correspond to the relativistic wave
equation
$$\partial_\mu F^{\mu\nu} = e J^\nu, \tag{2.84}$$
or equivalently,
$$\partial_\mu \partial^\mu A^\nu - \partial^\nu \partial_\mu A^\mu = e J^\nu. \tag{2.85}$$
The remaining Maxwell equations (2.74) and (2.75) are equivalent to the identity:
1. Natural units are a system of units where the units of time, distance, energy,
mass and momentum are all represented in terms of GeV. This is accomplished
by rescaling constants of nature such that $\hbar = 1$ (Planck’s quantum mechanics
constant) and $c = 1$ (speed of light). Let us understand this better by giving $c$
the units of “speedy” and $\hbar$ the units of “spinny.” In other words,
Use also the fact that 1 GeV = $1.602 \times 10^{-10}$ J to find what meters, seconds and
kg are in units of speedy, spinny and GeV. Compare the numerical values you
get with the conversion factors of meters, seconds and kg to GeV in appendix
A.1.
Note: there is no problem keeping “speedy” and “spinny” units for the entire
book but since they are always multiplied by factors of $c$ and $\hbar$ in the equations,
which are numerically just 1 in these units, it is numerically safe to always drop
reference to “speedy” and “spinny” units and just keep GeV. That choice defines
“natural units”.
2. A baseball has a mass of 0.145 kg. The distance between the pitcher’s mound
and home plate is 18.44 meters. A hitter has about 150 milliseconds to make a
decision on whether to swing at a pitch. Express these three measurements in
natural units (i.e., in units of GeV).
3. Using data from the “Review of Particle Properties”, find numerical values for
the mean distance traveled by a muon, a neutral pion, a charged pion, a tau lepton,
a K + meson, a D + meson, and a B + meson produced in high-energy physics
experiments, for the following values of the particle energy: 10 GeV, 100 GeV,
1000 GeV.
4. What is the threshold energy for the initial-state proton, for the reaction p + n →
p + p + π − , assuming the neutron is initially at rest?
5. Consider a particle at rest with mass M, which undergoes a 2-body decay to
particles a, b of masses m a and m b . Assuming that all motion of the decay
products is along the z direction, find the four-momenta of the particles a, b for:
(a) the special case m b = m a = m.
problem, denote the mass of the muon as M, and its mean lifetime in its rest
frame as τ .
(a) Find the necessary inequality for the process to occur, in terms of the given
quantities. Discuss the behavior of this requirement for the two limiting cases
θ = 0 and θ = π .
(b) Now suppose that the collision occurs head-on (θ = 0), and that E 1 > E 2 .
What is the maximum energy that one of the resulting muons can have?
(c) Now suppose that θ = 0, E 1 = 9M, and E 2 = M. What will be the maxi-
mum mean lifetime of the longer-lived muon?
11. Prove that any Lorentz transformation matrix $L^\mu{}_\nu$ satisfies $\det(L) = \pm 1$.
[Hint: Recall that det(AB) = det(A)det(B) for any matrices A, B.]
12. Consider a Lorentz transformation for which $x^\mu \to x'^\mu = L^\mu{}_\nu x^\nu$, where $L^\mu{}_\nu$
is constant.
(a) Suppose that $T^\mu{}_\nu$ is a tensor. Use (2.3.20) to write down how it transforms
under a Lorentz transformation. Then, from this, prove that $T^\mu{}_\mu$ transforms
as a scalar.
(b) Let $A^\mu$ be the 4-vector potential for electromagnetism, and $\partial_\mu = \partial/\partial x^\mu$ and
$\partial^\mu = \partial/\partial x_\mu$ and let $J^\mu$ be the corresponding current density, as in Sect. 2.4.
Write down how each of these objects transforms under a Lorentz transfor-
mation.
(c) Use the results of part (b) to derive how the field-strength tensor $F^{\mu\nu}$ transforms
under the Lorentz transformation, but without further appealing to
(2.3.20). Write the result in the form
$$F^{\mu\nu} \to X^{\mu\nu}{}_{\rho\sigma} F^{\rho\sigma}, \tag{2.91}$$
Any realistic theory must be consistent with quantum mechanics. In this chapter, we
consider how to formulate a theory of quantum mechanics that is consistent with
special relativity.
Suppose that $\Phi(x)$ is the wavefunction of a free particle in 4-dimensional spacetime.
A fundamental principle of quantum mechanics is that the time dependence of
$\Phi$ is determined by a Hamiltonian operator, according to:
$$H\Phi = i\hbar \frac{\partial}{\partial t}\Phi. \tag{3.1}$$
Now, the three-momentum operator is given by
$$\vec P = -i\hbar \vec\nabla. \tag{3.2}$$
Because $H$ and $\vec P$ commute, one can take $\Phi$ to be one of the basis of wavefunctions
for eigenstates with energy and momentum eigenvalues $E$ and $\vec p$ respectively:
$$H\Phi = E\Phi; \qquad \vec P\Phi = \vec p\,\Phi. \tag{3.3}$$
One can now turn this into a relativistic Schrödinger wave equation for free particle
states, by using the fact that special relativity implies:
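$$E = \sqrt{m^2 c^4 + \vec p^{\,2} c^2}, \tag{3.4}$$
where $m$ is the mass of the particle. To make sense of this as an operator equation, we
could try expanding it in an infinite series, treating $\vec p^{\,2}$ as small compared to $m^2 c^2$:
$$i\hbar \frac{\partial}{\partial t}\Phi = mc^2 \left(1 + \frac{\vec p^{\,2}}{2 m^2 c^2} - \frac{\vec p^{\,4}}{8 m^4 c^4} + \ldots\right)\Phi \tag{3.5}$$
$$= \left(mc^2 - \frac{\hbar^2}{2m}\nabla^2 - \frac{\hbar^4}{8 m^3 c^2}(\nabla^2)^2 + \ldots\right)\Phi. \tag{3.6}$$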
If we keep only the first two terms, then we recover the standard non-relativistic
quantum mechanics of a free particle; the first term $mc^2$ is an unobservable constant
contribution to the Hamiltonian, proportional to the rest energy, and the second term is
the usual non-relativistic kinetic energy. However, the presence of an infinite number
of derivatives leads to horrible problems, including apparently non-local effects.
Instead, one can consider the operator $H^2$ acting on $\Phi$, avoiding the square root.
It follows that
so that:
$$-\frac{\partial^2}{\partial t^2}\Phi = (-\nabla^2 + m^2)\Phi. \tag{3.8}$$
Here and from now on we have set $c = 1$ and $\hbar = 1$ by a choice of units. This convention
means that mass, energy, and momentum all have the same units (GeV), while
time and distance have units of GeV$^{-1}$, and velocity is dimensionless. These conventions
greatly simplify the equations of particle physics. One can always recover
the usual metric system units using the following conversion table for energy, mass,
distance, and time, respectively:
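Conversions of this kind can be sketched in a few lines. The constants below are standard values of ħ, ħc, c and the GeV-to-joule factor quoted to limited precision; the function names are illustrative assumptions.

```python
HBAR_GEV_S = 6.582119569e-25     # hbar in GeV s
HBAR_C_GEV_M = 1.973269804e-16   # hbar * c in GeV m
C_M_PER_S = 299792458.0          # speed of light in m/s
JOULE_PER_GEV = 1.602176634e-10  # 1 GeV in joules

def seconds_to_inverse_gev(t_seconds):
    """Time in GeV^-1: divide by hbar."""
    return t_seconds / HBAR_GEV_S

def meters_to_inverse_gev(d_meters):
    """Distance in GeV^-1: divide by hbar * c."""
    return d_meters / HBAR_C_GEV_M

def kg_to_gev(mass_kg):
    """Mass in GeV: multiply by c^2 and convert joules to GeV."""
    return mass_kg * C_M_PER_S**2 / JOULE_PER_GEV

print(f"1 fm = {meters_to_inverse_gev(1e-15):.4f} GeV^-1")
print(f"1 s  = {seconds_to_inverse_gev(1.0):.4e} GeV^-1")
print(f"1 kg = {kg_to_gev(1.0):.4e} GeV")
```

Going the other way only requires multiplying by the same constants, so one table of three numbers is enough to move between natural and metric units.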
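$$(\partial^\mu \partial_\mu + m^2)\Phi = 0. \tag{3.13}$$
$$\partial^\mu \partial_\mu \Phi = -k^\mu k_\mu \Phi = -k^2 \Phi. \tag{3.15}$$
$$-\frac{\partial^2}{\partial t^2}\Psi = (-\nabla^2 + m^2)\Psi. \tag{3.17}$$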
On the other hand, expressing H in terms of the right-hand side of (3.16), we find:
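$$-\frac{\partial^2}{\partial t^2}\Psi = \left[-\sum_{j,k=1}^{3} \alpha_j \alpha_k \frac{\partial}{\partial x^j}\frac{\partial}{\partial x^k} - im \sum_{j} (\alpha_j \beta + \beta \alpha_j)\frac{\partial}{\partial x^j} + \beta^2 m^2\right]\Psi. \tag{3.18}$$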
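$$\sum_{j,k=1}^{3} \alpha_j \alpha_k \frac{\partial}{\partial x^j}\frac{\partial}{\partial x^k} = \frac{1}{2}\sum_{j,k=1}^{3} (\alpha_j \alpha_k + \alpha_k \alpha_j)\frac{\partial}{\partial x^j}\frac{\partial}{\partial x^k}. \tag{3.19}$$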
Then comparing (3.17) and (3.18), one finds that the two agree if, for j, k = 1, 2, 3:
$$\beta^2 = 1, \tag{3.20}$$
$$\alpha_j \beta + \beta \alpha_j = 0, \tag{3.21}$$
$$\alpha_j \alpha_k + \alpha_k \alpha_j = 2\delta_{jk}. \tag{3.22}$$
The simplest solution turns out to require n = 4 spinor indices. This may be
somewhat surprising, since naively one only needs n = 2 to describe a spin-1/2
particle like the electron. As we will see, the Dirac equation automatically describes
positrons as well as electrons, accounting for the doubling. It is easiest to write the
solution in terms of 2 × 2 Pauli matrices:
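$$\sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \text{and} \quad \sigma^0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{3.23}$$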
obey the required conditions. The matrices $\beta$, $\alpha_j$ are written in 2 × 2 block form,
so “0” actually denotes a 2 × 2 block of 0’s. Equation (3.16) is known as the Dirac
equation, and the 4-component object $\Psi$ is known as a Dirac spinor. Note that the fact
that Dirac spinor space is 4-dimensional, just like ordinary spacetime, is really just
a coincidence.$^2$ One must be careful not to confuse the two types of 4-dimensional
spaces!
It is convenient and traditional to rewrite the Dirac equation in a nicer way by
multiplying it on the left by the matrix β, and defining
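$$\gamma^0 = \beta, \qquad \gamma^j = \beta\alpha_j, \quad (j = 1, 2, 3). \tag{3.25}$$
The result is
$$i\left(\gamma^0 \frac{\partial}{\partial x^0} + \gamma^1 \frac{\partial}{\partial x^1} + \gamma^2 \frac{\partial}{\partial x^2} + \gamma^3 \frac{\partial}{\partial x^3}\right)\Psi - m\Psi = 0, \tag{3.26}$$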
$^2$ For example, if we lived in 10 dimensional spacetime, it turns out that Dirac spinors would have
32 components.
[The solution found above for the $\gamma^\mu$ is not unique. To see this, suppose $U$ is any
constant unitary 4 × 4 matrix satisfying $U^\dagger U = 1$. Then the Dirac equation implies:
So, the new $\gamma'^\mu$ matrices together with the new spinor $\Psi'$ are just as good as the
old pair $\gamma^\mu$, $\Psi$; there are an infinite number of different, equally valid choices. The
set we’ve given above is called the chiral or Weyl representation. Another popular
choice used by some textbooks (but not here) is the Pauli-Dirac representation.]
Many problems involving fermions in high-energy physics involve many gamma
matrices dotted into partial derivatives or momentum four-vectors. To keep the nota-
tion from getting too bloated, it is often useful to use the Feynman slash notation:
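$$\gamma^\mu a_\mu = \slashed{a} \tag{3.31}$$
for any four-vector $a^\mu$. Then the Dirac equation takes the even more compact form:
$$(i\slashed{\partial} - m)\Psi = 0. \tag{3.32}$$
$$\gamma^{0\dagger} = \gamma^0, \qquad \gamma^{j\dagger} = -\gamma^j, \quad (j = 1, 2, 3), \tag{3.33}$$
$$\gamma^0 \gamma^{\mu\dagger} \gamma^0 = \gamma^\mu, \tag{3.34}$$
$$\mathrm{Tr}(\gamma_\mu \gamma_\nu) = 4 g_{\mu\nu}, \tag{3.35}$$
$$\gamma^\mu \gamma_\mu = 4, \tag{3.36}$$
$$\gamma_\mu \gamma_\nu + \gamma_\nu \gamma_\mu = \{\gamma_\mu, \gamma_\nu\} = 2 g_{\mu\nu}. \tag{3.37}$$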
Note that on the right-hand sides of each of (3.36) and (3.37), there is an implicit
4 × 4 unit matrix. It turns out that one almost never needs to know the explicit form
of the $\gamma^\mu$. Instead, the equations above can be used to derive identities needed in
practical work.
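Although the explicit matrices are rarely needed, it is easy to confirm the identities (3.33) to (3.37) numerically in one representation. The sketch below assumes the chiral representation in the 2 × 2 block form quoted later in (3.122), with metric signature (+, −, −, −); all helper names are illustrative.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(c, a):
    return [[c * x for x in row] for row in a]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[i] + tr[i] for i in range(2)] + [bl[i] + br[i] for i in range(2)]

ZERO2 = [[0, 0], [0, 0]]
SIGMA = [[[1, 0], [0, 1]], [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
SIGMA_BAR = [SIGMA[0]] + [scale(-1, s) for s in SIGMA[1:]]
METRIC = [1, -1, -1, -1]                      # diagonal of g_{mu nu}
IDENTITY4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

# gamma^mu in the assumed chiral representation, and gamma_mu = g_{mu nu} gamma^nu.
GAMMA_UP = [block(ZERO2, SIGMA[mu], SIGMA_BAR[mu], ZERO2) for mu in range(4)]
GAMMA_DOWN = [scale(METRIC[mu], GAMMA_UP[mu]) for mu in range(4)]

def check_identities():
    g0 = GAMMA_UP[0]
    results = {}
    results["3.33"] = close(dagger(g0), g0) and all(
        close(dagger(GAMMA_UP[j]), scale(-1, GAMMA_UP[j])) for j in (1, 2, 3))
    results["3.34"] = all(
        close(matmul(matmul(g0, dagger(GAMMA_UP[mu])), g0), GAMMA_UP[mu]) for mu in range(4))
    results["3.35"] = all(
        abs(trace(matmul(GAMMA_DOWN[mu], GAMMA_DOWN[nu])) - 4 * METRIC[mu] * (mu == nu)) < 1e-12
        for mu in range(4) for nu in range(4))
    contraction = [[0] * 4 for _ in range(4)]
    for mu in range(4):
        contraction = add(contraction, matmul(GAMMA_UP[mu], GAMMA_DOWN[mu]))
    results["3.36"] = close(contraction, scale(4, IDENTITY4))
    results["3.37"] = all(
        close(add(matmul(GAMMA_DOWN[mu], GAMMA_DOWN[nu]), matmul(GAMMA_DOWN[nu], GAMMA_DOWN[mu])),
              scale(2 * METRIC[mu] * (mu == nu), IDENTITY4))
        for mu in range(4) for nu in range(4))
    return results

print(check_identities())
```

Since the identities are representation independent, a check in any single valid representation is a useful sanity test of index conventions and signs.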
How does a Dirac spinor $\Psi_a(x)$ transform under a Lorentz transformation? It
carries no vector index, so it is not a tensor. On the other hand, the fact that the
Hamiltonian “mixes up” the components of $\Psi_a(x)$ is a clue that it doesn’t transform
like an ordinary scalar function either. Instead, we might expect that the spinor
reported by an observer in the primed frame is given by
$$\Psi'(x') = \Lambda \Psi(x), \tag{3.38}$$
$$\exp(M) = 1 + M + M^2/2 + M^3/6 + \ldots. \tag{3.43}$$
So, we have found the $\Lambda$ that appears in (3.38) corresponding to the $L^\mu{}_\nu$ that appears
in (3.41):
$$\Lambda = \exp\left(-\frac{i}{2}\omega_{\mu\nu} S^{\mu\nu}\right). \tag{3.45}$$
Then
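$$\omega^2 = \begin{pmatrix} \rho^2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \rho^2 \end{pmatrix}, \qquad \omega^3 = \begin{pmatrix} 0 & 0 & 0 & -\rho^3 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -\rho^3 & 0 & 0 & 0 \end{pmatrix}, \quad \text{etc.} \tag{3.47}$$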
Therefore, this is the matrix that boosts a Dirac spinor in the z direction with rapidity
ρ, in (3.38).
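Since $\Psi$ is not a scalar, it is natural to ask whether one can use it to construct a
scalar quantity. A tempting guess is to get rid of all the pesky spinor indices by
$$\Psi^\dagger \Psi(x) \equiv \sum_{a=1}^{4} \Psi_a^\dagger \Psi_a. \tag{3.51}$$
$$\Psi'^\dagger \Psi'(x') = \Psi^\dagger \Lambda^\dagger \Lambda \Psi(x). \tag{3.52}$$
$$\Psi^\dagger \gamma^0 \Psi. \tag{3.53}$$
$$\Psi'^\dagger \gamma^0 \Psi'(x') = \Psi^\dagger \Lambda^\dagger \gamma^0 \Lambda \Psi(x). \tag{3.54}$$
$$\Lambda^\dagger \gamma^0 \Lambda = \gamma^0. \tag{3.55}$$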
One can check that this is indeed true for the special case of (3.50). More importantly,
(3.55) is true for any
$$\Lambda = 1 - \frac{i}{2}\omega_{\mu\nu} S^{\mu\nu} \tag{3.56}$$
that is infinitesimally close to the identity, using (3.33) and (3.34). Therefore, it is
true for any proper Lorentz transformation built out of infinitesimal ones.
Motivated by this, one defines, for any Dirac spinor $\Psi$,
$$\overline\Psi \equiv \Psi^\dagger \gamma^0. \tag{3.57}$$
One should think of $\Psi$ as a column vector in spinor space, and $\overline\Psi$ as a row vector.
Then their inner product
$$\overline\Psi \Psi, \tag{3.58}$$
with all spinor indices contracted, transforms as a scalar function under proper
Lorentz transformations. Similarly, one can show that
$$\overline\Psi \gamma^\mu \Psi \tag{3.59}$$
Our next task is to construct solutions to the Dirac equation. Let us separate out the
$x^\mu$-dependent part as a plane wave, by trying
$$(\slashed{p} - m)u(p, s) = 0. \tag{3.61}$$
To simplify things, first consider this equation in the rest frame of the particle,
where $p^\mu = (m, 0, 0, 0)$. In that frame,
where each “1” means a 2 × 2 unit matrix. The solutions are clearly
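$$u(p, s) = \sqrt{m} \begin{pmatrix} \chi_s \\ \chi_s \end{pmatrix}, \tag{3.64}$$
where $\chi_s$ can be any 2-vector, and the $\sqrt{m}$ normalization is a convention. In practice,
it is best to choose the $\chi_s$ orthonormal, satisfying $\chi_s^\dagger \chi_r = \delta_{rs}$ for $r, s = 1, 2$. A
particularly nice choice is:
$$\chi_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \chi_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{3.65}$$
As we will see, these just correspond to spin eigenstates $S_z = 1/2$ and $-1/2$.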
Now, to construct the corresponding solution in any other frame, one can just
boost the spinor using (3.45). For example, consider the solution
$$\Psi'(x') = u(p', 1)e^{-ip' \cdot x'} = \sqrt{m} \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} e^{-imt'} \tag{3.66}$$
in a frame where the particle is at rest; we have called it the primed frame for
convenience. We suppose the primed frame is moving with respect to the unprimed
frame with rapidity ρ in the z direction. Thus, the particle has, in the unprimed frame:
so that
$$u(p, 1) = \begin{pmatrix} \sqrt{E - p_z} \\ 0 \\ \sqrt{E + p_z} \\ 0 \end{pmatrix} \tag{3.71}$$
in this frame.
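Similarly, if we use instead $\chi_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$ in (3.64) in the rest frame, and apply the
same procedure, we find a solution:
$$\Psi(x) = \sqrt{m} \begin{pmatrix} 0 \\ e^{\rho/2} \\ 0 \\ e^{-\rho/2} \end{pmatrix} e^{-ip \cdot x} = \begin{pmatrix} 0 \\ \sqrt{E + p_z} \\ 0 \\ \sqrt{E - p_z} \end{pmatrix} e^{-ip \cdot x}, \tag{3.72}$$
so that
$$u(p, 2) = \begin{pmatrix} 0 \\ \sqrt{E + p_z} \\ 0 \\ \sqrt{E - p_z} \end{pmatrix} \tag{3.73}$$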
in this frame. Note that pz in (3.70) and (3.72) can have either sign, corresponding
to the wavefunction for a particle moving in either the +z or −z directions.
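As a consistency check, one can verify numerically that the boosted spinors (3.71) and (3.73) solve the momentum-space equation (3.61). The sketch below assumes the chiral representation of the gamma matrices and momentum along the z axis; the helper names and sample values are illustrative.

```python
import math

def p_slash(energy, p_z):
    """gamma^0 E - gamma^3 p_z in the chiral representation (assumed),
    for a momentum pointing along the z axis."""
    minus, plus = energy - p_z, energy + p_z
    return [[0, 0, minus, 0],
            [0, 0, 0, plus],
            [plus, 0, 0, 0],
            [0, minus, 0, 0]]

def u_spinor(energy, p_z, s):
    """Spinors (3.71) for s = 1 and (3.73) for s = 2."""
    a, b = math.sqrt(energy - p_z), math.sqrt(energy + p_z)
    return [a, 0.0, b, 0.0] if s == 1 else [0.0, b, 0.0, a]

def dirac_residual(energy, p_z, mass, spinor):
    """Largest component of (p_slash - m) acting on the spinor."""
    matrix = p_slash(energy, p_z)
    return max(abs(sum(matrix[i][j] * spinor[j] for j in range(4)) - mass * spinor[i])
               for i in range(4))

# Illustrative values (assumed): mass 1, rapidity 1.2 along +z.
mass, rho = 1.0, 1.2
energy, p_z = mass * math.cosh(rho), mass * math.sinh(rho)
for s in (1, 2):
    print(f"s = {s}: residual = {dirac_residual(energy, p_z, mass, u_spinor(energy, p_z, s)):.2e}")
print(f"sqrt(m) e^(rho/2) = {math.sqrt(mass) * math.exp(rho / 2):.6f}, "
      f"sqrt(E + pz) = {math.sqrt(energy + p_z):.6f}")
```

The last line illustrates why the boosted components take the square-root form: with E = m cosh ρ and p_z = m sinh ρ, one has E ± p_z = m e^{±ρ}.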
In order to make a direct connection between spin and the various components of
a Dirac spinor, let us now consider how to construct the spin operator S. To do this,
recall that by definition, spin is the difference between the total angular momentum
operator J and the orbital angular momentum operator L:
J = L + S. (3.74)
Now,
L = x × P, (3.75)
where x and P are the three-dimensional position and momentum operators. The
total angular momentum must be conserved, or in other words it must commute with
the Hamiltonian:
[H , J] = 0. (3.76)
obeys (3.78). So, it must be the spin operator acting on Dirac spinors.
In particular, the z-component of the spin operator for Dirac spinors is given by
the diagonal matrix:
$$S_z = \frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{3.80}$$
Therefore, the solutions in (3.70) and (3.72) can be identified to have spin eigenvalues
Sz = +1/2 and Sz = −1/2, respectively. In general, a Dirac spinor eigenstate with
Sz = +1/2 will have only the first and third components non-zero, and one with
Sz = −1/2 will have only the second and fourth components non-zero, regardless
of the direction of the momentum. Note that, as promised, Sz = 1/2 (−1/2) exactly
corresponds to the use of χ1 (χ2 ) in (3.64).
The helicity operator gives the relative orientation of the spin of the particle and
its momentum. It is defined to be:
$$h = \frac{\vec p \cdot \vec S}{|\vec p|}. \tag{3.81}$$
Like Sz , helicity has possible eigenvalues ±1/2 for a spin-1/2 particle. For example,
if pz > 0, then (3.70) and (3.72) represent states with helicity +1/2 and −1/2
respectively. The helicity is not invariant under Lorentz transformations for massive
particles. This is because one can always boost to a different frame in which the
3-momentum is flipped but the spin remains the same. (Also, note that unlike Sz ,
helicity is not even well-defined for a particle exactly at rest, due to the $|\vec p| = 0$ in
the denominator.) However, a massless particle moves at the speed of light in any
inertial frame, so one can never boost to a frame in which its 3-momentum direction is
flipped. This means that for massless (or very energetic, so that $E \gg m$) particles, the
helicity is fixed and invariant under Lorentz transformations. In any frame, a particle
with p and S parallel has helicity h = 1/2, and a particle with p and S antiparallel
has helicity h = −1/2.
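Helicity is particularly useful in the high-energy limit. For example, we can consider
four solutions obtained from the $E, p_z \gg m$ limits of (3.70) and (3.72), so that
$|p_z| = E$:
$$\Psi_{p_z>0,\,S_z=+1/2} = \begin{pmatrix} 0 \\ 0 \\ \sqrt{2E} \\ 0 \end{pmatrix} e^{-iE(t-z)} \qquad [\vec p \uparrow, \vec S \uparrow, h = +1/2] \tag{3.82}$$
$$\Psi_{p_z>0,\,S_z=-1/2} = \begin{pmatrix} 0 \\ \sqrt{2E} \\ 0 \\ 0 \end{pmatrix} e^{-iE(t-z)} \qquad [\vec p \uparrow, \vec S \downarrow, h = -1/2] \tag{3.83}$$
$$\Psi_{p_z<0,\,S_z=+1/2} = \begin{pmatrix} \sqrt{2E} \\ 0 \\ 0 \\ 0 \end{pmatrix} e^{-iE(t+z)} \qquad [\vec p \downarrow, \vec S \uparrow, h = -1/2] \tag{3.84}$$
$$\Psi_{p_z<0,\,S_z=-1/2} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ \sqrt{2E} \end{pmatrix} e^{-iE(t+z)} \qquad [\vec p \downarrow, \vec S \downarrow, h = +1/2]. \tag{3.85}$$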
In this high-energy limit, a Dirac spinor with h = +1/2 is called right-handed (R)
and one with h = −1/2 is called left-handed (L). Notice that a high-energy L state
is one that has the last two entries zero, while a high-energy R state always has the
first two entries zero.
It is useful to define matrices that project onto L and R states in the high-energy
or massless limit. In 2 × 2 blocks:
$$P_L = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}; \qquad P_R = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \tag{3.86}$$
where 1 and 0 mean the 2 × 2 unit and zero matrices, respectively. Then PL acting
on any Dirac spinor gives back a left-handed spinor, by just killing the last two
components. The projectors obey the rules:
Then
$$P_L = \frac{1 - \gamma_5}{2}; \qquad P_R = \frac{1 + \gamma_5}{2}. \tag{3.89}$$
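The action of the projectors on the boosted spinors can be illustrated numerically. In this sketch the diagonal form γ5 = diag(−1, −1, 1, 1) is an assumption chosen to be consistent with (3.86) and (3.89) in the chiral representation, and the helper names and sample energy are hypothetical.

```python
import math

P_L = [1, 1, 0, 0]         # diagonal entries of P_L from (3.86)
P_R = [0, 0, 1, 1]         # diagonal entries of P_R from (3.86)
GAMMA5 = [-1, -1, 1, 1]    # assumed diagonal of gamma_5 in the chiral representation

def project(diagonal, spinor):
    """Apply a diagonal projector to a 4-component spinor."""
    return [d * c for d, c in zip(diagonal, spinor)]

def norm_squared(spinor):
    return sum(abs(c) ** 2 for c in spinor)

def u_spinor(energy, p_z, s):
    """Spinors (3.71) for s = 1 and (3.73) for s = 2."""
    a, b = math.sqrt(energy - p_z), math.sqrt(energy + p_z)
    return [a, 0.0, b, 0.0] if s == 1 else [0.0, b, 0.0, a]

def left_fraction(energy, p_z, s):
    """Fraction of the spinor norm that survives the projector P_L."""
    u = u_spinor(energy, p_z, s)
    return norm_squared(project(P_L, u)) / norm_squared(u)

# Illustrative values (assumed): mass 1, energy 1000, momentum along +z.
mass, energy = 1.0, 1000.0
p_z = math.sqrt(energy**2 - mass**2)
print(f"h = +1/2 state: left fraction = {left_fraction(energy, p_z, 1):.2e}")
print(f"h = -1/2 state: left fraction = {left_fraction(energy, p_z, 2):.6f}")
print(f"at rest:        left fraction = {left_fraction(mass, 0.0, 1):.2f}")
```

The left-handed fraction of the positive-helicity state falls off like m²/(4E²), which is why helicity and handedness only coincide in the high-energy or massless limit.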
The matrix γ5 satisfies the equations:
So far, we have been considering Dirac spinor wavefunction solutions of the form
electron has charge$^3$ $-e$, so the hole corresponding to its absence effectively has
the opposite charge, $+e$. Since both electrons and holes obey $p^2 = m^2$, they have
the same mass. Dirac’s proposal therefore predicts the existence of “anti-electrons”
or positrons, with positive energy and positive charge. The positron was indeed
discovered in 1932 in cosmic ray experiments.
Feynman and Stückelberg noted that one can reinterpret the positron as a negative
energy electron moving backwards in time, so that $p^\mu \to -p^\mu$ and $\vec S \to -\vec S$.
According to this interpretation, the wavefunction for a positron with 4-momentum
$p^\mu$ with $p^0 = E > 0$ is
Now, using the Dirac equation (3.32), $v(p, s)$ must satisfy the eigenvalue equation:
$$(\slashed{p} + m)v(p, s) = 0. \tag{3.93}$$
We can now construct solutions to this equation just as before. First, in the rest
(primed) frame of the particle, we have in 2 × 2 blocks:
$$\begin{pmatrix} m & m \\ m & m \end{pmatrix} v(p, s) = 0. \tag{3.94}$$
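$$H = -i\frac{\partial}{\partial t}, \tag{3.96}$$
$$\vec P = i\vec\nabla, \tag{3.97}$$
$$\vec S = -\frac{1}{2} \begin{pmatrix} \vec\sigma & 0 \\ 0 & \vec\sigma \end{pmatrix}, \tag{3.98}$$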
$^3$ Here, $e$ is always defined to be positive, so that the electron has charge $-e$. (Some references
define $e$ to be negative.)
where H , P, and S are the operators whose eigenvalues are to be interpreted as the
energy, 3-momentum, and spin of the positive-energy positron antiparticle. There-
fore, to describe a positron with spin Sz = +1/2 or −1/2, one should use, respec-
tively,
$$\xi_1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}; \quad \text{or} \quad \xi_2 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \tag{3.99}$$
in (3.95).
Now we can boost to the unprimed frame just as before, yielding the solutions:
$$v(p, 1) = \begin{pmatrix} 0 \\ \sqrt{E + p_z} \\ 0 \\ -\sqrt{E - p_z} \end{pmatrix}, \tag{3.100}$$
$$v(p, 2) = \begin{pmatrix} \sqrt{E - p_z} \\ 0 \\ -\sqrt{E + p_z} \\ 0 \end{pmatrix}. \tag{3.101}$$
Here $v(p, 1)$ corresponds to a positron moving in the $+z$ direction with 3-momentum
$p_z$ and energy $E = \sqrt{p_z^2 + m^2}$ and $S_z = +1/2$, hence helicity $h = +1/2$ if $p_z > 0$.
Similarly, $v(p, 2)$ corresponds to a positron with the same energy and 3-momentum,
but with $S_z = -1/2$, and therefore helicity $h = -1/2$ if $p_z > 0$.
Note that for positron wavefunctions, PL projects onto states that describe right-
handed positrons in the high-energy limit, and PR projects onto states that describe
left-handed positrons in the high-energy limit. If we insist that PL projects on to left-
handed spinors, and PR projects on to right-handed spinors, then we must simply
remember that a right-handed positron is described by a left-handed spinor (annihi-
lated by PR ), and vice versa!
Later we will also need to use the Dirac row spinors:
so that
$$\overline{u}(p, 1)u(p, 1) = \begin{pmatrix} \sqrt{m} & 0 & \sqrt{m} & 0 \end{pmatrix} \begin{pmatrix} \sqrt{m} \\ 0 \\ \sqrt{m} \\ 0 \end{pmatrix} = 2m. \tag{3.104}$$
Since this quantity is a scalar, it must be true that $\overline{u}(p, 1)u(p, 1) = 2m$ in any Lorentz
frame, in other words, for any $p^\mu$. More generally, if $s, r = 1, 2$ represent orthonormal
spin state labels, then the $u$ and $v$ spinors obey:
Similarly, one can show that $\overline{u}(p, s)\gamma^\mu u(p, r) = 2(m, \vec 0)\delta_{sr}$ in the rest frame. Since
it is a four-vector, it must be that in any frame:
But the most useful identities that we will use later on are the spin-sum equations:
$$\sum_{s=1}^{2} u(p, s)\overline{u}(p, s) = \slashed{p} + m; \tag{3.109}$$
$$\sum_{s=1}^{2} v(p, s)\overline{v}(p, s) = \slashed{p} - m. \tag{3.110}$$
Here the spin state label s is summed over. These equations are to be interpreted in
the sense of a column vector times a row vector giving a 4 × 4 matrix, like:
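$$\begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix} \begin{pmatrix} b_1 & b_2 & b_3 & b_4 \end{pmatrix} = \begin{pmatrix} a_1 b_1 & a_1 b_2 & a_1 b_3 & a_1 b_4 \\ a_2 b_1 & a_2 b_2 & a_2 b_3 & a_2 b_4 \\ a_3 b_1 & a_3 b_2 & a_3 b_3 & a_3 b_4 \\ a_4 b_1 & a_4 b_2 & a_4 b_3 & a_4 b_4 \end{pmatrix}. \tag{3.111}$$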
We will use (3.109) and (3.110) often when calculating cross-sections and decay
rates involving fermions.
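The spin-sum relations (3.109) and (3.110) can be confirmed numerically for momentum along the z axis. The sketch below assumes the chiral representation, builds the row spinors as ψ†γ⁰, and uses the explicit u and v spinors quoted in (3.71), (3.73), (3.100) and (3.101); helper names and sample values are illustrative.

```python
import math

GAMMA0 = [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]  # chiral form (assumed)

def p_slash(energy, p_z):
    """gamma^0 E - gamma^3 p_z in the chiral representation, momentum along z."""
    minus, plus = energy - p_z, energy + p_z
    return [[0, 0, minus, 0], [0, 0, 0, plus], [plus, 0, 0, 0], [0, minus, 0, 0]]

def u_spinors(energy, p_z):
    a, b = math.sqrt(energy - p_z), math.sqrt(energy + p_z)
    return [[a, 0.0, b, 0.0], [0.0, b, 0.0, a]]        # (3.71), (3.73)

def v_spinors(energy, p_z):
    a, b = math.sqrt(energy - p_z), math.sqrt(energy + p_z)
    return [[0.0, b, 0.0, -a], [a, 0.0, -b, 0.0]]      # (3.100), (3.101)

def bar(spinor):
    """Row spinor psi^dagger gamma^0; the spinors used here are real."""
    return [sum(spinor[i] * GAMMA0[i][j] for i in range(4)) for j in range(4)]

def inner(row, column):
    return sum(r * c for r, c in zip(row, column))

def spin_sum(spinors):
    """Sum over spin states of (column spinor) times (row spinor), a 4x4 matrix."""
    total = [[0.0] * 4 for _ in range(4)]
    for s in spinors:
        row = bar(s)
        for i in range(4):
            for j in range(4):
                total[i][j] += s[i] * row[j]
    return total

def max_deviation(matrix, energy, p_z, signed_mass):
    """Largest entry of matrix - (p_slash + signed_mass * identity)."""
    target = p_slash(energy, p_z)
    return max(abs(matrix[i][j] - target[i][j] - (signed_mass if i == j else 0.0))
               for i in range(4) for j in range(4))

# Illustrative values (assumed): mass 1, p_z = 2.
mass, p_z = 1.0, 2.0
energy = math.sqrt(p_z**2 + mass**2)
print("u spin sum deviation:", max_deviation(spin_sum(u_spinors(energy, p_z)), energy, p_z, +mass))
print("v spin sum deviation:", max_deviation(spin_sum(v_spinors(energy, p_z)), energy, p_z, -mass))
```

The same helpers also reproduce the normalization of the row and column spinors, for instance the contraction of a u spinor with its own row spinor gives 2m.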
As a check, note that if we act on the left of (3.109) with $\slashed{p} - m$, the left hand
side vanishes because of (3.61), and the right hand side vanishes because of
$$(\slashed{p} - m)(\slashed{p} + m) = \slashed{p}\slashed{p} - m^2 = 0. \tag{3.112}$$
$$\slashed{p}\slashed{p} = p^2, \tag{3.113}$$
where (3.37) was used. A similar consistency check works if we act on the left of
(3.110) with $\slashed{p} + m$.
The Dirac spinors given above only describe electrons and positrons with both
momentum and spin aligned along the ±z direction. More generally, we could con-
struct u( p, s) and v( p, s) for states describing electrons or positrons with any p and
spin. However, in general that is quite a mess, and it turns out to be not particularly
useful in most practical applications, as we will see.
It turns out that the Dirac equation can be replaced by something simpler and more
fundamental in the special case m = 0. If we go back to Dirac’s guess for the Hamil-
tonian, we now have just
H = α · P, (3.115)
and there is no need for the matrix β. Therefore, (3.20) and (3.21) are not applicable,
and we have only the one requirement:
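$$\alpha_j \alpha_k + \alpha_k \alpha_j = 2\delta_{jk}. \tag{3.116}$$
$$\sigma^\mu = (\sigma^0, \sigma^1, \sigma^2, \sigma^3), \tag{3.118}$$
$$\overline\sigma^\mu = (\sigma^0, -\sigma^1, -\sigma^2, -\sigma^3), \tag{3.119}$$
$$i\overline\sigma^\mu \partial_\mu \psi_L = 0, \tag{3.120}$$
$$i\sigma^\mu \partial_\mu \psi_R = 0. \tag{3.121}$$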
Here we have attached labels L and R because the solutions to these equations turn
out to have left and right helicity, respectively, as we will see in a moment. Each of
these equations is called a Weyl equation. They are similar to the Dirac equation, but
only apply to massless spin-1/2 particles, and are 2 × 2 matrix equations rather than
4 × 4. The two-component objects $\psi_L$ and $\psi_R$ are called Weyl spinors.
We can understand the relationship of the Dirac equation to the Weyl equations
if we notice that the $\gamma^\mu$ matrices can be written as
$$\gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \overline\sigma^\mu & 0 \end{pmatrix}. \tag{3.122}$$
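(Compare (3.28).) If we now write a Dirac spinor in its L and R helicity components,
$$\Psi = \begin{pmatrix} \Psi_L \\ \Psi_R \end{pmatrix}, \tag{3.123}$$
$$i\sigma^\mu \partial_\mu \Psi_R = m\Psi_L, \tag{3.125}$$
$$i\overline\sigma^\mu \partial_\mu \Psi_L = m\Psi_R. \tag{3.126}$$
$$i\overline\sigma^\mu \partial_\mu \psi_R^\dagger = 0. \tag{3.128}$$
$$\psi \equiv \Psi_L = \Psi_R^\dagger. \tag{3.130}$$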
and obeys the same wave equation as a Dirac fermion, $(i\gamma^\mu \partial_\mu - m)\Psi_M = 0$. However,
it has only half as many degrees of freedom; the Majorana condition (3.130)
ensures that a Majorana fermion is its own antiparticle. From (3.125), (3.126), one
sees that in the two-component form a classical Majorana fermion obeys the wave
equation:
$$i\overline\sigma^\mu \partial_\mu \psi - m\psi^\dagger = 0, \tag{3.132}$$
$$i\sigma^\mu \partial_\mu \psi^\dagger - m\psi = 0. \tag{3.133}$$
As we will see in Sect. 10.4, the experimental fact that neutrinos have small masses
suggests that they are likely to be Majorana fermions (and thus their own antiparti-
cles), although this expectation is based partly on theoretical prejudice and it is also
quite possible that they may be Dirac. In the minimal supersymmetric extension of
the Standard Model, there are new fermions called neutralinos and the gluino, which
are predicted to be Majorana fermions.
Problems
1. Prove that the following statements are true, where the Ni are certain integers that
you will determine.
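(a) $\gamma^\mu \gamma_\nu \gamma_\mu = N_1 \gamma_\nu$.
(b) $\gamma^\mu \gamma_\nu \gamma_\rho \gamma_\mu = N_2 g_{\nu\rho}$.
(c) $\gamma^\mu \gamma_\nu \gamma_\rho \gamma_\sigma \gamma_\mu = N_3 \gamma_\sigma \gamma_\rho \gamma_\nu$.
(d) $\mathrm{Tr}(\gamma_\mu \gamma_\nu \gamma_\rho \gamma_\sigma) = N_4 (g_{\mu\nu} g_{\rho\sigma} - g_{\mu\rho} g_{\nu\sigma} + g_{\mu\sigma} g_{\nu\rho})$.
(e) $[\gamma_\rho, [\gamma_\mu, \gamma_\nu]] = N_5 (g_{\rho\mu} \gamma_\nu - g_{\rho\nu} \gamma_\mu)$.
2. Prove each of the following statements, where the numbers $N_i$ are constants that
you are to determine:
(a) $\slashed{p}\slashed{k}\slashed{p} = N_1 p^2 \slashed{k} + N_2 (k \cdot p)\slashed{p}$
(b) $\mathrm{Tr}[\slashed{p}\slashed{k}\slashed{p}\slashed{k}] = N_3 p^2 k^2 + N_4 (p \cdot k)^2$
(c) $\mathrm{Tr}[\slashed{p}\slashed{k}\slashed{q}\slashed{p}] = N_5 (p \cdot k)(q \cdot p) + N_6 p^2 (q \cdot k)$
3. By taking the Hermitian conjugates of the Dirac equations $(\slashed{p} - m)u(p, s) = 0$
and $(\slashed{p} + m)v(p, s) = 0$, show that:
$\overline{u}(p, s)(\slashed{p} - m) = 0$, and
$\overline{v}(p, s)(\slashed{p} + m) = 0$.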
6. In this problem, we will check the Lorentz invariance of the Dirac equation, and in
the process determine the Lorentz transformation rule for Dirac spinors. Suppose
that two coordinate systems are related by a Lorentz transformation
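$$x'^\mu = L^\mu{}_\nu x^\nu. \tag{3.134}$$
$$\Psi'(x') = \Lambda \Psi(x) \tag{3.135}$$
$$\Lambda^{-1} \gamma^\rho \Lambda L_\rho{}^\mu = \gamma^\mu. \tag{3.138}$$
(b) Now suppose that $L^\mu{}_\nu = \delta^\mu_\nu + \omega^\mu{}_\nu$ with $\omega^\mu{}_\nu$ infinitesimal. Prove that the
equation found in part (a) is satisfied if
$$\Lambda = 1 + \frac{1}{8}\omega^{\mu\nu}[\gamma_\mu, \gamma_\nu] \tag{3.139}$$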
7. Consider a non-infinitesimal Lorentz transformation realized on contravariant
vectors and spinors according to $a'^\mu = L^\mu{}_\nu a^\nu$ and $\psi'(x') = \Lambda\psi(x)$. Find $L^\mu{}_\nu$
and $\Lambda$ for:
(a) a boost of rapidity ρ in the x direction.
(b) a rotation of angle θ around the z axis.
(c) a parity transformation P. (Hint: parity exchanges left-handed and right-
handed spinors, and P 2 = 1.)
4 Field Theory and Lagrangians
It is now time to make a conceptual break from our earlier treatment of relativistic
quantum-mechanical wave equations for scalar and Dirac particles. There are two
reasons for doing this. First, the existence of negative energy solutions has led us
to the concept of antiparticles. Now, a hole in the Dirac sea, representing a positron,
can be removed if the state is occupied by a positive energy electron, releasing an
energy of at least 2m. This forces us to admit that the total number of particles is
not conserved. The Klein-Gordon wavefunction φ(x) and the Dirac wavefunction
Ψ(x) were designed to describe single-particle probability amplitudes, but the correct
theory of nature evidently must describe a variable number of particles. Secondly,
we note that in the electromagnetic theory, Aμ (x) are not just quantum mechanical
wavefunctions; they exist classically too. If we follow this example with scalar and
spinor particles, we are led to abandon φ(x) and Ψ(x) as quantum wavefunctions
representing single-particle states, and reinterpret them as fields that have meaning
even classically.
Specifically, a scalar particle is described by a field φ(x). Classically, this just
means that for every point $x^\mu$, the object φ(x) returns a number. Quantum mechanically,
φ(x) becomes an operator (rather than a state or wavefunction). There is a
distinct operator for each $x^\mu$. Therefore, we no longer have a position operator $x^\mu$;
instead, it is just an ordinary number label that tells us which operator φ we are
talking about.
If φ(x) is now an operator, what states will it act on? To answer this, we can start
with a vacuum state
$$ |0\rangle \tag{4.1} $$
that describes an empty universe with no particles in it. If we now act with our field
operator, we obtain a state:
$$ \phi(x)|0\rangle, \tag{4.2} $$
which, at the time $t = x^0$, contains one particle at $\mathbf{x}$. (What this state describes at
other times is a much more complicated question!) If we act again with our field
operator at a different point $y^\mu$, we get a state
$$ \phi(y)\phi(x)|0\rangle, \tag{4.3} $$
$$ S = \int_{t_i}^{t_f} L(q_n, \dot q_n)\, dt. \tag{4.4} $$
Here $t_i$ and $t_f$ are fixed initial and final times, and L is the Lagrangian. It is given in
simple systems by
$$ L = T - V, \tag{4.5} $$
4.1 The Field Concept and Lagrangian Dynamics 53
where T is the total kinetic energy and V is the total potential energy. Thus the action
S is a functional of $q_n$; if you specify a particular trajectory $q_n(t)$, then the action
returns a single number. The Lagrangian is a function of $q_n(t)$ and its first derivative.
The usefulness of the action is given by Hamilton’s principle, which states that if
$q_n(t_i)$ and $q_n(t_f)$ are held fixed as boundary conditions, then S is stationary (in simple
cases, minimized) when $q_n(t)$ satisfy the equations of motion. Since S is at an extremum,
any small variation in $q_n(t)$ leads to no change in S at first order, so that
$q_n(t) \to q_n(t) + \delta q_n(t)$ implies δS = 0, provided that $q_n(t)$ obeys the equations of
motion. Here $\delta q_n(t)$ is any small function of t that vanishes at both $t_i$ and $t_f$, as
shown in the figure below:
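[Figure: a path q(t) running from q(t1) at time t1 to q(t2) at time t2, together with a varied path q(t) + δq(t) that shares the same endpoints.]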
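Let us therefore compute δS. First, note that by the chain rule, we have:
$$ \delta L = \sum_n \left( \frac{\partial L}{\partial q_n}\, \delta q_n + \frac{\partial L}{\partial \dot q_n}\, \delta \dot q_n \right). \tag{4.6} $$
Now, since
$$ \delta \dot q_n = \frac{d}{dt}(\delta q_n), \tag{4.7} $$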
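we obtain
$$ \delta S = \sum_n \int_{t_i}^{t_f} \left[ \frac{\partial L}{\partial q_n}\, \delta q_n + \frac{\partial L}{\partial \dot q_n}\, \frac{d}{dt}(\delta q_n) \right] dt. \tag{4.8} $$
Integrating the second term by parts gives
$$ \delta S = \sum_n \int_{t_i}^{t_f} \delta q_n \left[ \frac{\partial L}{\partial q_n} - \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q_n} \right) \right] dt + \sum_n \delta q_n\, \frac{\partial L}{\partial \dot q_n} \Big|_{t=t_i}^{t=t_f}. \tag{4.9} $$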
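The last term vanishes because of the boundary conditions $\delta q_n(t_i) = \delta q_n(t_f) = 0$.
Since the variation δS is supposed to vanish for any $\delta q_n(t)$, it must be that
$$ \frac{\partial L}{\partial q_n} - \frac{d}{dt}\left( \frac{\partial L}{\partial \dot q_n} \right) = 0, \tag{4.10} $$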
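$$ L = T - V = \frac{1}{2} m \dot x^2 - V(x), \tag{4.11} $$
from which there follows:
$$ \frac{\partial L}{\partial x} = -\frac{\partial V}{\partial x} = F, \tag{4.12} $$
which we recognize as the Newtonian force, and
$$ \frac{d}{dt}\left( \frac{\partial L}{\partial \dot x} \right) = \frac{d}{dt}(m \dot x) = m \ddot x. \tag{4.13} $$
$$ F = m \ddot x. \tag{4.14} $$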
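As a numerical illustration of Hamilton's principle (a sketch, not part of the original derivation), the following Python snippet discretizes the action for a one-dimensional harmonic oscillator with V(x) = k x²/2. The values m = k = 1, the grid size, and the sine-shaped variation are illustrative assumptions. Because x(t) = sin t obeys F = m ẍ for this potential, the change in S under a variation that vanishes at both endpoints should contain no term linear in the variation, so δS divided by the squared amplitude should be nearly the same for every amplitude.

```python
import math

def action(path, dt, m=1.0, k=1.0):
    """Discretized action: sum of (T - V) * dt with V = k x^2 / 2."""
    total = 0.0
    for x0, x1 in zip(path, path[1:]):
        velocity = (x1 - x0) / dt
        x_mid = 0.5 * (x0 + x1)
        total += (0.5 * m * velocity**2 - 0.5 * k * x_mid**2) * dt
    return total

def delta_action(eps, n=2000, t_f=1.0):
    """S[x + eps * bump] - S[x] for the classical path x(t) = sin(t)."""
    dt = t_f / n
    classical = [math.sin(i * dt) for i in range(n + 1)]
    # The bump sin(pi t / t_f) vanishes at t = 0 and t = t_f.
    varied = [x + eps * math.sin(math.pi * i / n)
              for i, x in enumerate(classical)]
    return action(varied, dt) - action(classical, dt)

for eps in (1e-1, 1e-2, 1e-3):
    # A purely second-order change gives an eps-independent ratio.
    print(eps, delta_action(eps) / eps**2)
```

For this particular bump the ratio should come out close to (π² − 1)/4, the value of the second-order term in the continuum; a first-order term would instead make the ratio grow like 1/eps.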
$$ S = \int_{t_i}^{t_f} dt\, L(\phi, \dot\phi) = \int_{t_i}^{t_f} dt \int d^3\mathbf{x}\; \mathcal{L}(\phi, \dot\phi). \tag{4.15} $$
Now, since this expression depends on φ̇, it must also depend on ∇φ in order to be
Lorentz invariant. This just means that the form of the Lagrangian allows it to depend
on the differences between the field evaluated at infinitesimally nearby points. So, a
better way to write the action is:
$$ S = \int d^4x\; \mathcal{L}(\phi, \partial_\mu \phi). \tag{4.16} $$
The object L is known as the Lagrangian density. Specifying a particular form for
L defines the theory.
To find the classical equations of motion for the field, we must find φ(x) so that
S is extremized; in other words, for any small variation φ(x) → φ(x) + δφ(x), we
must have δS = 0. By a similar argument as above, this implies the equations of
motion:
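$$ \frac{\delta \mathcal{L}}{\delta \phi} - \partial_\mu \left( \frac{\delta \mathcal{L}}{\delta(\partial_\mu \phi)} \right) = 0. \tag{4.17} $$
Here, we use $\delta\mathcal{L}/\delta\phi$ to mean the partial derivative of $\mathcal{L}$ with respect to φ; δ is used rather
than ∂ to avoid confusion between derivatives with respect to φ and spacetime partial
derivatives with respect to $x^\mu$. Likewise, $\delta\mathcal{L}/\delta(\partial_\mu \phi)$ means a partial derivative of $\mathcal{L}$ with
respect to the object $\partial_\mu \phi$, upon which it depends.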
As an example, consider the choice:
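$$ \mathcal{L} = \frac{1}{2} \partial_\mu \phi\, \partial^\mu \phi - \frac{1}{2} m^2 \phi^2. \tag{4.18} $$
It follows that:
$$ \frac{\delta \mathcal{L}}{\delta \phi} = -m^2 \phi \tag{4.19} $$
and
$$ \frac{\delta \mathcal{L}}{\delta(\partial_\mu \phi)} = \frac{\delta}{\delta(\partial_\mu \phi)} \left[ \frac{1}{2} g^{\alpha\beta} \partial_\alpha \phi\, \partial_\beta \phi \right] = \frac{1}{2} g^{\mu\beta} \partial_\beta \phi + \frac{1}{2} g^{\alpha\mu} \partial_\alpha \phi = \partial^\mu \phi. \tag{4.20} $$
Therefore,
$$ \partial_\mu \left( \frac{\delta \mathcal{L}}{\delta(\partial_\mu \phi)} \right) = \partial_\mu \partial^\mu \phi, \tag{4.21} $$
so the equation of motion (4.17) becomes
$$ \partial_\mu \partial^\mu \phi + m^2 \phi = 0. \tag{4.22} $$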
This we recognize as the Klein-Gordon wave equation for a scalar particle of mass
m; compare to (3.13). This equation was originally introduced with the interpretation
as the equation governing the quantum wavefunction of a single scalar particle. Now
it has reappeared with a totally different interpretation, as the classical equation of
motion for the scalar field.
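The classical equation of motion can be checked numerically. The sketch below works in 1+1 dimensions with the metric signature implied by (4.18), so that ∂μ∂^μ = ∂t² − ∂x², and evaluates the left-hand side of the Klein-Gordon equation by central finite differences. The plane-wave form, the values m = 1 and p = 0.7, the sample point, and the step size h are all illustrative assumptions.

```python
import math

def kg_residual(phi, t, x, m, h=1e-3):
    """Finite-difference value of (d^2/dt^2 - d^2/dx^2 + m^2) phi at (t, x)."""
    d2t = (phi(t + h, x) - 2.0 * phi(t, x) + phi(t - h, x)) / h**2
    d2x = (phi(t, x + h) - 2.0 * phi(t, x) + phi(t, x - h)) / h**2
    return d2t - d2x + m**2 * phi(t, x)

m, p = 1.0, 0.7
E = math.sqrt(p**2 + m**2)  # relativistic dispersion relation

def on_shell(t, x):
    # Plane wave with E^2 = p^2 + m^2: should solve the field equation.
    return math.cos(E * t - p * x)

def off_shell(t, x):
    # Same momentum but the wrong frequency: should not solve it.
    return math.cos(2.0 * E * t - p * x)

print(kg_residual(on_shell, 0.3, 0.2, m))   # tiny, limited by the step size h
print(kg_residual(off_shell, 0.3, 0.2, m))  # of order one
```

Only the wave whose frequency satisfies E² = p² + m² makes the residual vanish up to discretization error, which is the classical-field counterpart of the relativistic energy-momentum relation.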
The previous discussion for scalar fields can be extended to other types of fields as
well. Consider a general list of fields $\Phi_j(x)$, which could include scalar fields φ(x),
Dirac or Majorana fields Ψ(x) with four components, Weyl fields ψ(x) with two
components, or vector fields Aμ (x) with four components, or several copies of any
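Now using $\delta(\partial_\mu \Phi_j) = \partial_\mu(\delta \Phi_j)$, and integrating the second term by parts (this is
where the boundary conditions come in), one obtains
$$ \delta S = \int d^4x \sum_j \delta \Phi_j \left[ \frac{\delta \mathcal{L}}{\delta \Phi_j} - \partial_\mu \left( \frac{\delta \mathcal{L}}{\delta(\partial_\mu \Phi_j)} \right) \right]. \tag{4.24} $$
If we require this to vanish for each and every arbitrary variation $\delta \Phi_j$, we obtain the
Euler-Lagrange equations of motion:
$$ \frac{\delta \mathcal{L}}{\delta \Phi_j} - \partial_\mu \left( \frac{\delta \mathcal{L}}{\delta(\partial_\mu \Phi_j)} \right) = 0, \tag{4.25} $$
for each j.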
For example, let us consider how to make a Lagrangian density for a Dirac field
Ψ(x). Under Lorentz transformations, Ψ(x) transforms exactly like the wavefunction
solution to the Dirac equation. But now it is interpreted instead as a field; classically
it is a function on spacetime, and quantum mechanically it is an operator for each
point $x^\mu$. Now, Ψ(x) is a complex 4-component object, so Ψ†(x) is also a field.
One should actually treat Ψ(x) and Ψ†(x) as independent fields, in the same way
that in complex analysis one often treats z = x + iy and z* = x − iy as independent
variables. As we found in Sect. 3.1, if we want to build Lorentz scalar quantities, it
is useful to use $\bar\Psi(x) = \Psi^\dagger \gamma^0$ as a building block.
A good (and correct) guess for the Lagrangian for a Dirac field is:
$$ \mathcal{L} = i \Psi^\dagger \gamma^0 \gamma^\mu \partial_\mu \Psi - m \Psi^\dagger \gamma^0 \Psi. \tag{4.27} $$
This is a Lorentz scalar, so that when integrated over $d^4x$ it will give a Lorentz-invariant
number. Let us now compute the equations of motion that follow from it. First, let
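us find the equations of motion obtained by varying with respect to Ψ†. For this, we
need:
$$ \frac{\delta \mathcal{L}}{\delta \Psi^\dagger} = i \gamma^0 \gamma^\mu \partial_\mu \Psi - m \gamma^0 \Psi, \tag{4.28} $$
$$ \frac{\delta \mathcal{L}}{\delta(\partial_\mu \Psi^\dagger)} = 0. \tag{4.29} $$
The second equation just reflects the fact that the Lagrangian only contains the
derivative of Ψ, not the derivative of Ψ†. So, by plugging in to the general result
(4.25), the equation of motion is simply:
$$ i \gamma^\mu \partial_\mu \Psi - m \Psi = 0, \tag{4.30} $$
where we have multiplied $\delta\mathcal{L}/\delta\Psi^\dagger = 0$ on the left by $\gamma^0$ and used the fact that $(\gamma^0)^2 = 1$.
$$ -i \partial_\mu \bar\Psi \gamma^\mu - m \bar\Psi = 0. \tag{4.33} $$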
However, this is nothing new; it is just the Hermitian conjugate of (4.30), multiplied
on the right by γ 0 and using (3.34).
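The manipulations above rely on gamma-matrix identities such as (γ⁰)² = 1. As a sketch, the snippet below builds the four gamma matrices in one common explicit choice, the chiral (Weyl) representation, and checks the defining relation {γ^μ, γ^ν} = 2g^{μν} with metric diag(+1, −1, −1, −1). The representation and the helper names are assumptions made for illustration; any other representation satisfying the same algebra would pass the same checks.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[i] + tr[i] for i in range(2)] + [bl[i] + br[i] for i in range(2)]

def scale(c, m):
    return [[c * x for x in row] for row in m]

ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]
SIGMA = [  # Pauli matrices
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

# Assumed chiral representation:
# gamma^0 = [[0, 1], [1, 0]], gamma^i = [[0, sigma_i], [-sigma_i, 0]].
GAMMA = [block(ZERO, ONE, ONE, ZERO)] + [
    block(ZERO, s, scale(-1, s), ZERO) for s in SIGMA
]
METRIC = [1, -1, -1, -1]

def anticommutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[i][j] + ba[i][j] for j in range(4)] for i in range(4)]

def satisfies_clifford_algebra():
    """True if {gamma^mu, gamma^nu} = 2 g^{mu nu} times the identity."""
    for mu in range(4):
        for nu in range(4):
            value = 2 * METRIC[mu] if mu == nu else 0
            target = [[value if i == j else 0 for j in range(4)]
                      for i in range(4)]
            if anticommutator(GAMMA[mu], GAMMA[nu]) != target:
                return False
    return True

print(satisfies_clifford_algebra())
```

The same `matmul` helper can be reused to spot-check other products of gamma matrices numerically before attempting an index-based proof.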
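The Lagrangian density for an electromagnetic field is:
$$ \mathcal{L}_{\rm EM} = -\frac{1}{4} F_{\mu\nu} F^{\mu\nu} = -\frac{1}{4} (\partial_\mu A_\nu - \partial_\nu A_\mu)(\partial^\mu A^\nu - \partial^\nu A^\mu). \tag{4.34} $$
To find the equations of motion that follow from this Lagrangian, we compute:
$$ \frac{\delta \mathcal{L}_{\rm EM}}{\delta A_\nu} = 0, \tag{4.35} $$
since $A_\mu$ doesn’t appear in the Lagrangian without a derivative acting on it, and
$$ \begin{aligned}
\frac{\delta \mathcal{L}_{\rm EM}}{\delta(\partial_\mu A_\nu)} &= \frac{\delta}{\delta(\partial_\mu A_\nu)} \left[ -\frac{1}{4} (\partial_\alpha A_\beta - \partial_\beta A_\alpha)(\partial_\rho A_\sigma - \partial_\sigma A_\rho)\, g^{\alpha\rho} g^{\beta\sigma} \right] \\
&= -\frac{1}{4} \Big[ (\partial_\rho A_\sigma - \partial_\sigma A_\rho)\, g^{\mu\rho} g^{\nu\sigma} - (\partial_\rho A_\sigma - \partial_\sigma A_\rho)\, g^{\nu\rho} g^{\mu\sigma} \\
&\qquad\quad + (\partial_\alpha A_\beta - \partial_\beta A_\alpha)\, g^{\mu\alpha} g^{\nu\beta} - (\partial_\alpha A_\beta - \partial_\beta A_\alpha)\, g^{\nu\alpha} g^{\mu\beta} \Big] \\
&= -\partial^\mu A^\nu + \partial^\nu A^\mu \\
&= -F^{\mu\nu}.
\end{aligned} \tag{4.36} $$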
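reduce to
$$ \partial_\mu F^{\mu\nu} = 0, \tag{4.38} $$
$$ \mathcal{L}_{\rm current} = -e J^\mu A_\mu \tag{4.39} $$
$$ \frac{\delta \mathcal{L}_{\rm current}}{\delta A_\nu} = -e J^\nu; \tag{4.40} $$
$$ \frac{\delta \mathcal{L}_{\rm current}}{\delta(\partial_\mu A_\nu)} = 0, \tag{4.41} $$
$$ \partial_\mu F^{\mu\nu} - e J^\nu = 0, \tag{4.42} $$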
Let us now turn to the question of quantizing a field theory. To begin, let us recall how
one quantizes a simple generic system based on variables $q_n(t)$, given the Lagrangian
$L(q_n, \dot q_n)$. First, one defines the canonical momenta conjugate to each $q_n$:
$$ p_n \equiv \frac{\partial L}{\partial \dot q_n}. \tag{4.43} $$
where the $\dot q_n$ are to be eliminated using (4.43). To go to the corresponding quantum
theory, the $q_n$ and $p_n$ and H are reinterpreted as Hermitian operators acting on
4.2 Quantization of Free Scalar Field Theory 59
$$ [p_n, p_m] = 0; \tag{4.45} $$
$$ [q_n, q_m] = 0; \tag{4.46} $$
$$ [p_n, q_m] = -i \delta_{nm}, \tag{4.47} $$
and the time evolution of the system is determined by the Hamiltonian operator H .
Let us now apply this to the theory of a scalar field φ(x) with the Lagrangian
density given by (4.18). Then $\phi(x) = \phi(t, \mathbf{x})$ plays the role of $q_n(t)$, with $\mathbf{x}$ playing
the role of the label n; there is a different field at each point in space. The momentum
conjugate to φ is:
$$ \pi(\mathbf{x}) \equiv \frac{\delta \mathcal{L}}{\delta \dot\phi} = \frac{\delta}{\delta \dot\phi} \left[ \frac{1}{2} \dot\phi^2 + \cdots \right] = \dot\phi. \tag{4.48} $$
It should be emphasized that π(x), the momentum conjugate to the field φ(x), is not
in any way the mechanical momentum of the particle. Notice that π(x) is a scalar
function, not a three-vector or a four-vector!
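The Hamiltonian is obtained by summing over the fields at each point $\mathbf{x}$:
$$ H = \int d^3\mathbf{x}\; \pi(\mathbf{x}) \dot\phi(\mathbf{x}) - L \tag{4.49} $$
$$ = \int d^3\mathbf{x}\; \pi(\mathbf{x}) \dot\phi(\mathbf{x}) - \frac{1}{2} \int d^3\mathbf{x}\, [\dot\phi^2 - (\nabla\phi)^2 - m^2 \phi^2] \tag{4.50} $$
$$ = \frac{1}{2} \int d^3\mathbf{x}\, [\pi^2 + (\nabla\phi)^2 + m^2 \phi^2]. \tag{4.51} $$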
Notice that a nice feature has emerged: this Hamiltonian is a sum of squares, and is
therefore always ≥ 0. There are no dangerous solutions with arbitrarily large negative
energy, unlike the case of the single-particle Klein-Gordon wave equation.
At any given fixed time t, the field operators φ(x) and their conjugate momenta
π(x) are Hermitian operators satisfying commutation relations exactly analogous to
(4.45)–(4.47):
As we will see, it turns out to be profitable to analyze the system in a way similar to the
way one treats the harmonic oscillator in one-dimensional nonrelativistic quantum
mechanics. In that system, one defines “creation” and “annihilation” (or “raising”
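where
$$ E_{\mathbf{p}} = \sqrt{\mathbf{p}^2 + m^2}, \tag{4.56} $$
with the positive square root always taken. The overall coefficient in front of (4.55)
reflects an arbitrary choice (and in fact it is chosen differently by various books).
Equation (4.55) defines a distinct annihilation operator for each three-momentum $\mathbf{p}$.
Taking the Hermitian conjugate yields:
$$ a_{\mathbf{p}}^\dagger = \int d^3\mathbf{x}\; e^{i\mathbf{p}\cdot\mathbf{x}} \left[ E_{\mathbf{p}}\, \phi(\mathbf{x}) - i \pi(\mathbf{x}) \right]. \tag{4.57} $$
$$ = \int d^3\mathbf{x}\, d^3\mathbf{y}\; e^{-i\mathbf{p}\cdot\mathbf{x}} e^{i\mathbf{k}\cdot\mathbf{y}}\, (E_{\mathbf{k}} + E_{\mathbf{p}})\, \delta^{(3)}(\mathbf{x} - \mathbf{y}), \tag{4.59} $$
where (4.52)–(4.54) have been used. Now performing the $\mathbf{y}$ integral using the definition
of the delta function, one obtains:
$$ [a_{\mathbf{p}}, a_{\mathbf{k}}^\dagger] = (E_{\mathbf{k}} + E_{\mathbf{p}}) \int d^3\mathbf{x}\; e^{i(\mathbf{k} - \mathbf{p})\cdot\mathbf{x}}. \tag{4.60} $$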
Here we have put $E_{\mathbf{k}} = E_{\mathbf{p}}$, using the fact that the delta function vanishes except
when $\mathbf{k} = \mathbf{p}$. In a similar way, one can check the commutators:
Up to a constant factor on the right-hand side of (4.62), these results have the
same form as the harmonic oscillator algebra familiar from non-relativistic quantum
$$ a_{\mathbf{k}} |0\rangle = 0. \tag{4.64} $$
which describes a state with a single particle with three-momentum $\mathbf{k}$ (and no definite
position). Acting multiple times with raising operators produces a state with multiple
particles. So
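used often from now on. The result is that (4.67) becomes
$$ \int d\tilde p\; e^{i\mathbf{p}\cdot\mathbf{y}} (a_{\mathbf{p}} + a_{-\mathbf{p}}^\dagger) = \int d^3\mathbf{x}\; \phi(\mathbf{x}) \left\{ \int \frac{d^3\mathbf{p}}{(2\pi)^3}\, e^{i\mathbf{p}\cdot(\mathbf{y} - \mathbf{x})} \right\}. \tag{4.70} $$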
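Since the $\mathbf{p}$ integral in braces is equal to $\delta^{(3)}(\mathbf{y} - \mathbf{x})$ (see (4.61)), one obtains after
performing the $\mathbf{x}$ integral:
$$ \phi(\mathbf{y}) = \int d\tilde p\; e^{i\mathbf{p}\cdot\mathbf{y}} (a_{\mathbf{p}} + a_{-\mathbf{p}}^\dagger), \tag{4.71} $$
This expresses the original field in terms of raising and lowering operators. Similarly,
for the conjugate momentum field, one finds:
$$ \pi(\mathbf{x}) = -i \int d\tilde p\; E_{\mathbf{p}} \left( e^{i\mathbf{p}\cdot\mathbf{x}} a_{\mathbf{p}} - e^{-i\mathbf{p}\cdot\mathbf{x}} a_{\mathbf{p}}^\dagger \right). \tag{4.73} $$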
Now we can plug the results of (4.72) and (4.73) into the expression (4.51) for
the Hamiltonian. The needed terms are:
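$$ \frac{1}{2} \int d^3\mathbf{x}\; \pi(\mathbf{x})^2 = \frac{1}{2} \int d^3\mathbf{x} \int d\tilde k \int d\tilde p\; (-i)^2 E_{\mathbf{k}} E_{\mathbf{p}} \left( e^{i\mathbf{k}\cdot\mathbf{x}} a_{\mathbf{k}} - e^{-i\mathbf{k}\cdot\mathbf{x}} a_{\mathbf{k}}^\dagger \right) \left( e^{i\mathbf{p}\cdot\mathbf{x}} a_{\mathbf{p}} - e^{-i\mathbf{p}\cdot\mathbf{x}} a_{\mathbf{p}}^\dagger \right), \tag{4.74} $$
$$ \frac{1}{2} \int d^3\mathbf{x}\; (\nabla\phi)^2 = \frac{1}{2} \int d^3\mathbf{x} \int d\tilde k \int d\tilde p\; \left( i\mathbf{k}\, e^{i\mathbf{k}\cdot\mathbf{x}} a_{\mathbf{k}} - i\mathbf{k}\, e^{-i\mathbf{k}\cdot\mathbf{x}} a_{\mathbf{k}}^\dagger \right) \cdot \left( i\mathbf{p}\, e^{i\mathbf{p}\cdot\mathbf{x}} a_{\mathbf{p}} - i\mathbf{p}\, e^{-i\mathbf{p}\cdot\mathbf{x}} a_{\mathbf{p}}^\dagger \right), \tag{4.75} $$
$$ \frac{m^2}{2} \int d^3\mathbf{x}\; \phi(\mathbf{x})^2 = \frac{m^2}{2} \int d^3\mathbf{x} \int d\tilde k \int d\tilde p\; \left( e^{i\mathbf{k}\cdot\mathbf{x}} a_{\mathbf{k}} + e^{-i\mathbf{k}\cdot\mathbf{x}} a_{\mathbf{k}}^\dagger \right) \left( e^{i\mathbf{p}\cdot\mathbf{x}} a_{\mathbf{p}} + e^{-i\mathbf{p}\cdot\mathbf{x}} a_{\mathbf{p}}^\dagger \right). \tag{4.76} $$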
where “∞” means an infinite, but constant, contribution to the energy. Since a uniform
constant contribution to the energy of all states is unobservable and commutes with
all other operators, we are free to drop it, by a redefinition of the Hamiltonian. This
is a simple example of the process known as renormalization. (In a more careful
treatment, one could “regulate” the theory by quantizing the theory confined to a
box of finite volume, and neglecting all contributions coming from momenta greater
than some very large cutoff |p|max . Then the infinite constant would be rendered
finite. Since we are going to ignore the constant anyway, we won’t bother doing
this.) So, from now on,
$$ H = \int d\tilde p\; E_{\mathbf{p}}\, a_{\mathbf{p}}^\dagger a_{\mathbf{p}} \tag{4.84} $$
$$ H|0\rangle = 0, \tag{4.85} $$
since all $a_{\mathbf{p}}$ annihilate the vacuum. This shows that the infinite constant we dropped
from H is actually the infinite energy density associated with an infinite universe of
empty space, filled with the zero-point energies of an infinite number of oscillators,
one for each possible momentum 3-vector p. But we’ve already agreed to ignore it,
so let it go. One can show that:
This proves that the one-particle state with 3-momentum $\mathbf{k}$ has energy eigenvalue
$E_{\mathbf{k}} = \sqrt{\mathbf{k}^2 + m^2}$, as expected from special relativity. More generally, a multi-particle
state
Let us now apply the wisdom obtained by the quantization of a scalar field in the
previous subsection to the problem of quantizing a Dirac fermion field that describes
electrons and positrons. A sensible strategy is to expand the fields Ψ and Ψ† in terms
of operators that act on states by creating and destroying particles with a given
3-momentum. Now, since Ψ is a spinor with four components, one must expand it in a
basis for the four-dimensional spinor space. A convenient such basis is the solutions
we found to the Dirac equation, u(p, s) and v(p, s). So we expand the Dirac field,
at a given fixed time t, as:
$$ \Psi(\mathbf{x}) = \sum_{s=1}^{2} \int d\tilde p \left[ u(p, s)\, e^{i\mathbf{p}\cdot\mathbf{x}}\, b_{\mathbf{p},s} + v(p, s)\, e^{-i\mathbf{p}\cdot\mathbf{x}}\, d_{\mathbf{p},s}^\dagger \right]. \tag{4.89} $$
Here s labels the two possible spin states in some appropriate basis (for example,
$S_z = \pm 1/2$). The operator $b_{\mathbf{p},s}$ will be interpreted as an annihilation operator, which
removes an electron, with 3-momentum $\mathbf{p}$ and spin state s, from whatever state it
acts on. The operator $d_{\mathbf{p},s}^\dagger$ is a creation operator, which adds a positron to whatever
state it acts on. We are using b, b† and d, d† rather than a, a† in order to distinguish
the fermion and antifermion creation and annihilation operators from the scalar field
versions. Taking the Hermitian conjugate of (4.89), and multiplying by $\gamma^0$ on the
right, we get:
$$ \bar\Psi(\mathbf{x}) = \sum_{s=1}^{2} \int d\tilde p \left[ \bar u(p, s)\, e^{-i\mathbf{p}\cdot\mathbf{x}}\, b_{\mathbf{p},s}^\dagger + \bar v(p, s)\, e^{i\mathbf{p}\cdot\mathbf{x}}\, d_{\mathbf{p},s} \right]. \tag{4.90} $$
The operator $b_{\mathbf{p},s}^\dagger$ creates an electron, and $d_{\mathbf{p},s}$ destroys a positron, with the
corresponding 3-momentum and spin.
More generally, if the Dirac field describes some fermions other than the electron-
positron system, then you can substitute “particle” for electron and “antiparticle” for
4.3 Quantization of Free Dirac Fermion Field Theory 65
positron. So b† , b act on states to create and destroy particles, while d † , d create and
destroy antiparticles.
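Just as in the case of a scalar field, we assume the existence of a vacuum state $|0\rangle$,
which describes a universe of empty space with no electrons or positrons present.
The annihilation operators yield 0 when acting on the vacuum state:
$$ b_{\mathbf{p},s}|0\rangle = d_{\mathbf{p},s}|0\rangle = 0 \tag{4.91} $$
for all $\mathbf{p}$ and s. To make a state describing a single electron with 3-momentum $\mathbf{p}$ and
spin state s, just act on the vacuum with the corresponding creation operator:
$$ b_{\mathbf{p},s}^\dagger |0\rangle = |e^-_{\mathbf{p},s}\rangle. \tag{4.92} $$
Similarly,
$$ b_{\mathbf{k},r}^\dagger b_{\mathbf{p},s}^\dagger |0\rangle = |e^-_{\mathbf{k},r};\, e^-_{\mathbf{p},s}\rangle \tag{4.93} $$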
A corollary of this is the Pauli exclusion principle, which states that two electrons
cannot be in exactly the same state. In the present case, that means that we cannot add
to the vacuum two electrons with exactly the same 3-momentum and spin. Taking
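$\mathbf{k} = \mathbf{p}$ and r = s in (4.94):
$$ |e^-_{\mathbf{p},s};\, e^-_{\mathbf{p},s}\rangle = -|e^-_{\mathbf{p},s};\, e^-_{\mathbf{p},s}\rangle, \tag{4.95} $$
$$ b_{\mathbf{k},r}^\dagger b_{\mathbf{p},s}^\dagger |0\rangle = -b_{\mathbf{p},s}^\dagger b_{\mathbf{k},r}^\dagger |0\rangle. \tag{4.96} $$
that one might expect from comparison with the scalar field, we must have an
anticommutation relation:
$$ b_{\mathbf{k},r}^\dagger b_{\mathbf{p},s}^\dagger + b_{\mathbf{p},s}^\dagger b_{\mathbf{k},r}^\dagger = \{ b_{\mathbf{k},r}^\dagger, b_{\mathbf{p},s}^\dagger \} = 0. \tag{4.98} $$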
Similarly, applying the same thought process to identical positron fermions, one must
have:
$$ \{ d_{\mathbf{k},r}^\dagger, d_{\mathbf{p},s}^\dagger \} = \{ d_{\mathbf{k},r}, d_{\mathbf{p},s} \} = 0. \tag{4.100} $$
Note that in the classical limit, ħ → 0, these equations are unaffected, since ħ
doesn’t appear anywhere. So it must be true that b, b†, d, and d† anticommute even
classically. So, as classical fields, one must have
Evidently, the classical Dirac field is not a normal number, but rather an anticommut-
ing or Grassmann number. Interchanging the order of any two Grassmann numbers
results in an overall minus sign.
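To make the idea of anticommuting numbers concrete, here is a minimal sketch of a Grassmann algebra in Python. The class name, the dictionary representation (sorted tuples of generator labels mapped to coefficients), and the integer labels are illustrative assumptions, not notation from the text. The two properties being demonstrated are that swapping two generators flips the sign of their product, and that the square of any generator therefore vanishes, which mirrors the Pauli exclusion argument above.

```python
class Grassmann:
    """Element of a Grassmann algebra: {sorted generator labels: coefficient}."""

    def __init__(self, terms=None):
        # Drop zero coefficients so that equal elements have equal dictionaries.
        self.terms = {k: v for k, v in (terms or {}).items() if v != 0}

    @staticmethod
    def gen(i):
        """The i-th anticommuting generator."""
        return Grassmann({(i,): 1})

    def __add__(self, other):
        out = dict(self.terms)
        for key, value in other.terms.items():
            out[key] = out.get(key, 0) + value
        return Grassmann(out)

    def __mul__(self, other):
        out = {}
        for k1, v1 in self.terms.items():
            for k2, v2 in other.terms.items():
                if set(k1) & set(k2):
                    continue  # a repeated generator squares to zero
                merged = list(k1 + k2)
                # Sign of the permutation that sorts the labels (count inversions).
                inversions = sum(
                    1
                    for a in range(len(merged))
                    for b in range(a + 1, len(merged))
                    if merged[a] > merged[b]
                )
                key = tuple(sorted(merged))
                out[key] = out.get(key, 0) + (-1) ** inversions * v1 * v2
        return Grassmann(out)


theta1, theta2 = Grassmann.gen(1), Grassmann.gen(2)
print((theta1 * theta2).terms)                    # product in one order
print((theta2 * theta1).terms)                    # opposite order: opposite sign
print((theta1 * theta2 + theta2 * theta1).terms)  # anticommutator: empty
print((theta1 * theta1).terms)                    # square: empty
```

An empty dictionary stands for zero, so the last two lines show that the anticommutator and the square both vanish.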
In order to discover how the classical equations (4.101) and (4.103) are modified
when one goes to the quantum theory, let us construct the momentum conjugate to
Ψ(x). It is:
$$ P(\mathbf{x}) = \frac{\delta \mathcal{L}}{\delta(\partial_0 \Psi)} = i \bar\Psi \gamma^0 = i \Psi^\dagger \gamma^0 \gamma^0 = i \Psi^\dagger. \tag{4.104} $$
So the momentum conjugate to the Dirac spinor field is just i times its Hermitian
conjugate. Now, naively following the path of canonical quantization, one might
expect the equal-time commutation relation:
However, this clearly cannot be correct, since these are anticommuting fields; in
the classical limit ħ → 0, (4.105) disagrees with (4.103). So, instead we postulate
a canonical anticommutation relation for the Dirac field operator and its conjugate
momentum operator:
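From this, using a strategy similar to that used for scalar fields, one can obtain:
$$ \{ b_{\mathbf{p},s}, b_{\mathbf{k},r}^\dagger \} = \{ d_{\mathbf{p},s}, d_{\mathbf{k},r}^\dagger \} = (2\pi)^3\, 2E_{\mathbf{p}}\, \delta^{(3)}(\mathbf{p} - \mathbf{k})\, \delta_{sr}, \tag{4.108} $$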
in a way very similar to the way we found the Hamiltonian for a scalar field in terms
of a, a † operators. In doing so, one must again drop an infinite constant contribution
(negative, this time) which is unobservable because it is the same for all states. Note
that H again has energy eigenvalues that are ≥ 0. One can show that:
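$$ [H, b_{\mathbf{k},s}^\dagger] = E_{\mathbf{k}}\, b_{\mathbf{k},s}^\dagger, \tag{4.110} $$
$$ [H, d_{\mathbf{k},s}^\dagger] = E_{\mathbf{k}}\, d_{\mathbf{k},s}^\dagger. \tag{4.111} $$
(Note that these equations are commutators rather than anticommutators!) It follows
that the eigenstates of energy and 3-momentum are given in general by:
$$ b_{\mathbf{p}_1,s_1}^\dagger \cdots b_{\mathbf{p}_n,s_n}^\dagger\, d_{\mathbf{k}_1,r_1}^\dagger \cdots d_{\mathbf{k}_m,r_m}^\dagger |0\rangle, \tag{4.112} $$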
So far, we have been dealing with free field theories. These are theories in which the
Lagrangian density is quadratic in the fields, so that the Euler-Lagrange equations
obtained by varying L are linear wave equations in the fields, with exact solutions
that are not too hard to find. At the quantum level, this nice feature shows up in the
simple time evolution of the states. In field theory, as in any quantum system, the
time evolution of a state $|X\rangle$ is given in the Schrödinger picture by
$$ i \frac{d}{dt} |X\rangle = H |X\rangle. \tag{4.113} $$
So, in the case of a multiparticle state with an energy eigenvalue E as described
above, the solution is just
In other words, the state at some time t is just the same as the state at some previous
time t0 , up to a phase. So nothing ever happens to the particles in a free theory; their
number does not change, and their momenta and spins remain the same.
We are interested in describing a more interesting situation where particles can
scatter off each other, perhaps inelastically to create new particles, and in which some
particles can decay into other sets of particles. To describe this, we need a Lagrangian
density that contains terms with more than two fields. At the classical level, this will
lead to non-linear equations of motion that have to be solved approximately. At the
quantum level, finding exact energy eigenstates of the Hamiltonian is not possible,
so one usually treats the non-quadratic part of the Hamiltonian as a perturbation on
the quadratic part, giving an approximate answer.
As an example, consider the free Lagrangian for a scalar field φ, as given in (4.18),
and add to it an interaction term:
$$ \mathcal{L} = \mathcal{L}_0 + \mathcal{L}_{\rm int}; \tag{4.115} $$
$$ \mathcal{L}_{\rm int} = -\frac{\lambda}{24} \phi^4. \tag{4.116} $$
Here λ is a dimensionless number, a parameter of the theory known as a coupling.
It governs the strength of interactions; if we set λ = 0, we would be back to the free
theory in which nothing interesting ever happens. The factor of 1/4! = 1/24 is a
convention, and the reason for it will be apparent later. Now canonical quantization
can proceed as before, except that now the Hamiltonian is
H = H0 + Hint , (4.117)
where
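$$ H_{\rm int} = -\int d^3\mathbf{x}\; \mathcal{L}_{\rm int} = \frac{\lambda}{24} \int d^3\mathbf{x}\; \phi(\mathbf{x})^4. \tag{4.118} $$
Let us write this in terms of creation and annihilation operators, using (4.72):
$$ \begin{aligned}
H_{\rm int} = \frac{\lambda}{24} \int d^3\mathbf{x} \int d\tilde q_1 \int d\tilde q_2 \int d\tilde q_3 \int d\tilde q_4\;
& \left( a_{\mathbf{q}_1} e^{i\mathbf{q}_1\cdot\mathbf{x}} + a_{\mathbf{q}_1}^\dagger e^{-i\mathbf{q}_1\cdot\mathbf{x}} \right) \left( a_{\mathbf{q}_2} e^{i\mathbf{q}_2\cdot\mathbf{x}} + a_{\mathbf{q}_2}^\dagger e^{-i\mathbf{q}_2\cdot\mathbf{x}} \right) \\
& \left( a_{\mathbf{q}_3} e^{i\mathbf{q}_3\cdot\mathbf{x}} + a_{\mathbf{q}_3}^\dagger e^{-i\mathbf{q}_3\cdot\mathbf{x}} \right) \left( a_{\mathbf{q}_4} e^{i\mathbf{q}_4\cdot\mathbf{x}} + a_{\mathbf{q}_4}^\dagger e^{-i\mathbf{q}_4\cdot\mathbf{x}} \right).
\end{aligned} \tag{4.119} $$
Now we can perform the $d^3\mathbf{x}$ integration, using (4.61). The result is:
$$ \begin{aligned}
H_{\rm int} = (2\pi)^3 \frac{\lambda}{24} \int d\tilde q_1 \int d\tilde q_2 \int d\tilde q_3 \int d\tilde q_4\; \Big[
& a_{\mathbf{q}_1}^\dagger a_{\mathbf{q}_2}^\dagger a_{\mathbf{q}_3}^\dagger a_{\mathbf{q}_4}^\dagger\, \delta^{(3)}(\mathbf{q}_1 + \mathbf{q}_2 + \mathbf{q}_3 + \mathbf{q}_4) \\
& + 4\, a_{\mathbf{q}_1}^\dagger a_{\mathbf{q}_2}^\dagger a_{\mathbf{q}_3}^\dagger a_{\mathbf{q}_4}\, \delta^{(3)}(\mathbf{q}_1 + \mathbf{q}_2 + \mathbf{q}_3 - \mathbf{q}_4) \\
& + 6\, a_{\mathbf{q}_1}^\dagger a_{\mathbf{q}_2}^\dagger a_{\mathbf{q}_3} a_{\mathbf{q}_4}\, \delta^{(3)}(\mathbf{q}_1 + \mathbf{q}_2 - \mathbf{q}_3 - \mathbf{q}_4) \\
& + 4\, a_{\mathbf{q}_1}^\dagger a_{\mathbf{q}_2} a_{\mathbf{q}_3} a_{\mathbf{q}_4}\, \delta^{(3)}(\mathbf{q}_1 - \mathbf{q}_2 - \mathbf{q}_3 - \mathbf{q}_4) \\
& + a_{\mathbf{q}_1} a_{\mathbf{q}_2} a_{\mathbf{q}_3} a_{\mathbf{q}_4}\, \delta^{(3)}(\mathbf{q}_1 + \mathbf{q}_2 + \mathbf{q}_3 + \mathbf{q}_4) \Big].
\end{aligned} \tag{4.120} $$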
4.4 Scalar Field with φ 4 Coupling 69
Here we have combined several like terms, by relabeling the momenta, giving rise
to the factors of 4, 6, and 4. This involves reordering the a’s and a†’s. In doing so,
we have ignored the fact that a’s do not commute with a†’s when the 3-momenta
are exactly equal. This should not cause any worry, because it just corresponds to
the situation where a particle is “scattered” without changing its momentum at all,
which is the same as no scattering, and therefore not of interest.
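The factors 1, 4, 6, 4, 1 in (4.120) are just a count of how many of the 2⁴ = 16 terms in the expanded product of four fields contain a given number of creation operators. A small sketch (the function name and string labels are illustrative) makes the counting explicit, ignoring operator ordering exactly as the text does:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

def count_by_creation_operators(n_fields=4):
    """Count terms of (a + a_dagger)^n_fields by their number of a_dagger factors."""
    counts = Counter()
    # Each field factor contributes either an annihilation or a creation operator.
    for choice in product(("a", "a_dagger"), repeat=n_fields):
        counts[choice.count("a_dagger")] += 1
    return dict(counts)

counts = count_by_creation_operators()
print(counts)                   # multiplicities for 0, 1, 2, 3, 4 creation operators
print(Fraction(counts[2], 24))  # coefficient of the two-creation, two-annihilation term
```

The multiplicity 6 of the term with two creation and two annihilation operators, combined with the 1/24 in front, gives the overall factor 1/4 that multiplies λ in the term relevant for two-to-two scattering.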
To see how to use the interaction Hamiltonian, it is useful to tackle a specific
process. For example, consider a scattering problem in which we have two scalar
particles with 4-momenta $p_a$, $p_b$ that interact, producing two scalar particles with
4-momenta $k_1$, $k_2$:
$$ p_a\, p_b \to k_1\, k_2. \tag{4.121} $$
in the far future. These are built out of creation and annihilation operators just
as before, so they are eigenstates of the free Hamiltonian H0 , but not of the full
Hamiltonian. Now, we are interested in computing the probability amplitude that the
state $|p_a, p_b\rangle_{\rm IN}$ evolves to the state $|k_1, k_2\rangle_{\rm OUT}$. According to the rules of quantum
mechanics this is given by their overlap at a common time, say in the far future:
The state $|p_a, p_b\rangle_{\rm OUT}$ is the time evolution of $|p_a, p_b\rangle_{\rm IN}$ from the far past to the far
future:
where T is the long time between the far past time when the initial state was created
and the far future time at which the overlap is computed. So we have:
The states appearing on the right-hand side are simple; see (4.122) and (4.123). The
complications are hidden in the operator $e^{-iTH}$.
In general, $e^{-iTH}$ cannot be written exactly in a useful way in terms of creation
and annihilation operators. However, we can do it perturbatively, order by order in
the coupling λ. For example, let us consider the contribution linear in λ. We use the
definition of the exponential to write:
$$ e^{-iTH} = [1 - iHT/N]^N \tag{4.127} $$
for N → ∞. Now, the part of this that is linear in $H_{\rm int}$ can be expanded as:
$$ e^{-iTH} = \sum_{n=0}^{N-1} [1 - iH_0 T/N]^{N-n-1}\, (-iH_{\rm int} T/N)\, [1 - iH_0 T/N]^n. \tag{4.128} $$
(Here we have dropped the 0th order part, $e^{-iTH_0}$, as uninteresting; it just corresponds
to the particles evolving as free particles.) We can now turn this discrete sum into an
integral, by letting t = nT/N and dt = T/N in the limit of large N:
$$ e^{-iTH} = -i \int_0^T dt\; e^{-i(T-t)H_0}\, H_{\rm int}\, e^{-itH_0}. \tag{4.129} $$
Next we can use the fact that we know what H0 is when acting on the simple states
of (4.122) and (4.123):
where
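$$ E_i = E_{\mathbf{p}_a} + E_{\mathbf{p}_b}, \qquad E_f = E_{\mathbf{k}_1} + E_{\mathbf{k}_2} \tag{4.132} $$
are the energies of the initial and final states, respectively. So we have:
$$ {}_{\rm OUT}\langle k_1, k_2| e^{-iTH} |p_a, p_b\rangle_{\rm IN} = -i \int_0^T dt\; e^{-i(T-t)E_f}\, e^{-itE_i}\; {}_{\rm OUT}\langle k_1, k_2| H_{\rm int} |p_a, p_b\rangle_{\rm IN}. \tag{4.133} $$
$$ \int_0^T dt\; e^{-i(T-t)E_f}\, e^{-itE_i} = e^{-i(E_f + E_i)T/2} \int_{-T/2}^{T/2} dt\; e^{it(E_f - E_i)} \tag{4.134} $$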
$$ \int_{-\infty}^{\infty} dx\; e^{ixA} = 2\pi\, \delta(A) \tag{4.135} $$
to obtain:
$$ \int_0^T dt\; e^{-i(T-t)E_f}\, e^{-itE_i} = 2\pi\, \delta(E_f - E_i)\; e^{-i(E_f + E_i)T/2}. \tag{4.136} $$
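The way a finite-time integral turns into an energy-conserving delta function can be seen numerically. For finite T, the integral of exp(it(E_f − E_i)) over −T/2 ≤ t ≤ T/2 equals 2 sin((E_f − E_i)T/2)/(E_f − E_i): a peak of height T at E_f = E_i whose area stays near 2π as T grows. The sketch below checks both properties; the function names, the integration window of ±20 (in arbitrary energy units), and the midpoint rule are assumptions made for illustration.

```python
import math

def time_integral(delta_e, T):
    """Closed form of the integral of exp(i t delta_e) over -T/2 <= t <= T/2."""
    if abs(delta_e) < 1e-12:
        return T  # limit delta_e -> 0
    return 2.0 * math.sin(0.5 * delta_e * T) / delta_e

def area_under_peak(T, half_width=20.0, steps=200_000):
    """Midpoint-rule integral of time_integral over |delta_e| <= half_width."""
    h = 2.0 * half_width / steps
    return h * sum(
        time_integral(-half_width + (i + 0.5) * h, T) for i in range(steps)
    )

for T in (10.0, 50.0):
    # The peak height grows like T while the area approaches 2*pi.
    print(T, time_integral(0.0, T), area_under_peak(T), 2.0 * math.pi)
```

A function that becomes arbitrarily tall and narrow while keeping a fixed area of 2π is precisely what 2π δ(E_f − E_i) denotes.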
This tells us that energy conservation will be enforced, and (dropping the phase
factor $e^{-i(E_f + E_i)T/2}$, which will just give 1 when we take the complex square of the
probability amplitude):
Now we are ready to use our expression for Hint in (4.120). The action of Hint on
eigenstates of the free Hamiltonian can be read off from the different types of terms.
The aaaa-type term will remove four particles from the state. Clearly we don’t have
to worry about that, because there were only two particles in the state to begin with!
The same goes for the a†aaa-type term. The terms of type a†a†a†a† and a†a†a†a
create more than the two particles we know to be in the final state, so we can ignore
them too. Therefore, the only term that will play a role in this example is the a†a†aa
contribution:
$$ H_{\rm int} = (2\pi)^3 \frac{\lambda}{4} \int d\tilde q_1 \int d\tilde q_2 \int d\tilde q_3 \int d\tilde q_4\; \delta^{(3)}(\mathbf{q}_1 + \mathbf{q}_2 - \mathbf{q}_3 - \mathbf{q}_4)\; a_{\mathbf{q}_1}^\dagger a_{\mathbf{q}_2}^\dagger a_{\mathbf{q}_3} a_{\mathbf{q}_4}, \tag{4.138} $$
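Therefore, we have:
$$ {}_{\rm OUT}\langle k_1, k_2 | p_a, p_b \rangle_{\rm OUT} = -i (2\pi)^4 \frac{\lambda}{4} \int d\tilde q_1 \int d\tilde q_2 \int d\tilde q_3 \int d\tilde q_4\; \delta^{(4)}(q_1 + q_2 - q_3 - q_4)\; {}_{\rm OUT}\langle k_1, k_2 | a_{\mathbf{q}_1}^\dagger a_{\mathbf{q}_2}^\dagger a_{\mathbf{q}_3} a_{\mathbf{q}_4} | p_a, p_b \rangle_{\rm IN}. \tag{4.139} $$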
Here we have combined the three-momenta delta function from (4.138) with the
energy delta function from (4.137) to give a 4-momentum delta function.
It remains to evaluate:
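$$ \langle 0 | a_{\mathbf{k}_1} a_{\mathbf{k}_2}\, a_{\mathbf{q}_1}^\dagger a_{\mathbf{q}_2}^\dagger a_{\mathbf{q}_3} a_{\mathbf{q}_4}\, a_{\mathbf{p}_a}^\dagger a_{\mathbf{p}_b}^\dagger | 0 \rangle. \tag{4.141} $$
This can be done using the commutation relations of (4.62) and (4.63). The strategy
is to commute $a_{\mathbf{q}_3}$ and $a_{\mathbf{q}_4}$ to the right, so they can give 0 when acting on $|0\rangle$, and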
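commute $a_{\mathbf{q}_1}^\dagger$ and $a_{\mathbf{q}_2}^\dagger$ to the left so they can give 0 when acting on $\langle 0|$. Along the
way, one picks up delta functions whenever the 3-momenta of an a and a† match.
One contribution occurs when $\mathbf{q}_3 = \mathbf{p}_a$ and $\mathbf{q}_4 = \mathbf{p}_b$ and $\mathbf{q}_1 = \mathbf{k}_1$ and $\mathbf{q}_2 = \mathbf{k}_2$. It
yields: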
There are 3 more similar terms. You can check that each of them gives a contribution
equal to (4.142) when put into (4.139), after relabeling momenta; this cancels the
factor of 1/4 in (4.139). Now, the factors of $(2\pi)^3\, 2E_{\mathbf{q}_3}$ etc. all neatly cancel against
the corresponding factors in the denominator of $d\tilde q_3$, etc. (See (4.69).) The
three-momentum delta functions then make the remaining $d^3\mathbf{q}_1$, $d^3\mathbf{q}_2$, $d^3\mathbf{q}_3$, and $d^3\mathbf{q}_4$
integrations trivial; they just set the four-vectors $q_3 = p_a$, $q_4 = p_b$, $q_1 = k_1$, and
$q_2 = k_2$ in the remaining 4-momentum delta function that was already present in
(4.139).
Putting it all together, we are left with the remarkably simple result:
Rather than go through this whole messy procedure every time we invent a new
interaction term for the Lagrangian density, or every time we think of a new scattering
process, one can instead summarize the procedure with a simple set of diagrammatic
rules. These rules, called Feynman rules, are useful both as a precise summary of
a matrix element calculation, and as a heuristic guide to what physical process the
calculation represents. In the present case, the Feynman diagram for the process is:
Here the two lines coming from the left represent the incoming state scalar particles,
which get “destroyed” by the annihilation operators in Hint . The vertex where the
four lines meet represents the interaction itself, and is associated with the factor −iλ.
The two lines outgoing to the right represent the two final state scalar particles, which
are resurrected by the two creation operators in Hint .
This is just the simplest of many Feynman diagrams one could write down for
the process of two particle scattering in this theory. But all other diagrams represent
contributions that are higher order in λ, so if λ is small we can ignore them.
4.5 Scattering Processes and Cross-Sections
In Sect. 4.4, we found that the matrix element corresponding to 2 particle to 2 particle
scattering in a scalar field theory with interaction Lagrangian $-\frac{\lambda}{24}\phi^4$ is:
Now we would like to learn how to translate this information into something physi-
cally meaningful that could in principle be measured in an experiment. The matrix
element itself is infinite whenever 4-momentum is conserved, and zero otherwise.
So clearly we must do some work to relate it to an appropriate physically measurable
quantity, namely the cross-section.
The cross-section is the observable that gives the expected number of scattering
events $N_S$ that will occur if two large sets of particles are allowed to collide. Suppose
that we have $N_a$ particles of type a and $N_b$ of type b, formed into large packets of
uniform density that move completely through each other as shown:
The two packets are assumed to have the same area A (shaded gray) perpendicular
to their motion. The total number of scattering events occurring while the packets
move through each other should be proportional to each of the numbers $N_a$ and $N_b$,
and inversely proportional to the area A. The equation
$$ N_S = \frac{N_a N_b}{A}\, \sigma \tag{4.145} $$
defines the cross-section σ. The rate at which the effective $N_a N_b / A$ is increasing with
time in an experiment is called the luminosity L (or instantaneous luminosity), and
the same quantity integrated over time is called the integrated luminosity. Therefore,
$$ N_S = \sigma \int L\, dt. \tag{4.146} $$
The dimensions for cross-section are the same as area, and the official unit is 1
barn = $10^{-24}$ cm$^2$ = 2568 GeV$^{-2}$. However, from the point of view of modern
high-energy experiments, a barn is a very large cross-section,1 so more commonly-used
units are obtained by using the prefixes nano-, pico-, and femto-:
1 The joke is that achieving an event with such a cross-section is “as easy as hitting the broad side
of a barn”.
although not all of that data is useable in any given analysis. The Large Hadron
Collider (LHC) at CERN is a pp machine that previously collected 23.3 fb$^{-1}$ of
integrated luminosity per experiment (ATLAS and CMS) at center-of-mass energy
$\sqrt{s} = 8$ TeV. The LHC ran at $\sqrt{s} = 13$ TeV from 2015–2018, collecting about
160 fb$^{-1}$ of integrated luminosity per experiment. The peak luminosity attained was
$2 \times 10^{34}$ cm$^{-2}$ sec$^{-1}$, exceeding the LHC design target by a factor of two.
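The unit conversions and the relation between cross-section, integrated luminosity, and event count are easy to get wrong by powers of ten, so a short sketch may help. It converts 1 barn to natural units using ħc ≈ 0.1973269804 GeV fm, and counts events as σ times the integrated luminosity. The function names are illustrative, and the 1 pb cross-section in the usage line is a purely hypothetical value, not a measured one; only the 160 fb⁻¹ figure comes from the text.

```python
HBAR_C_GEV_FM = 0.1973269804  # hbar * c in GeV * fm
FM_IN_CM = 1.0e-13            # 1 fm expressed in cm
BARN_IN_CM2 = 1.0e-24         # 1 barn expressed in cm^2

def barn_in_inverse_gev2():
    """Express 1 barn in units of GeV^-2 (natural units, hbar = c = 1)."""
    gev_inverse_in_cm = HBAR_C_GEV_FM * FM_IN_CM  # the length equal to 1 GeV^-1
    return BARN_IN_CM2 / gev_inverse_in_cm**2

def expected_events(sigma_pb, integrated_lumi_inv_fb):
    """N_S = sigma * integrated luminosity, with sigma in pb and luminosity in fb^-1."""
    # 1 pb = 1000 fb, so sigma[fb] * L[fb^-1] is a pure number of events.
    return sigma_pb * 1000.0 * integrated_lumi_inv_fb

print(round(barn_in_inverse_gev2()))  # compare with the 2568 GeV^-2 quoted in the text
print(expected_events(1.0, 160.0))    # hypothetical 1 pb process at 160 fb^-1
```

With these units a hypothetical 1 pb process would yield 160,000 events in 160 fb⁻¹ before any detector acceptance or selection efficiency is applied.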
To figure out how many scattering events one expects at a collider, one needs to
know the corresponding cross-section for that type of event, which depends on the
final state. The total cross-section for any type of scattering at hadron colliders is
quite large. By one estimate at the Tevatron it was approximately
However, this estimate is quite fuzzy, because it depends on detection variables such
as the minimum momentum transfer that one requires in order to say that a scattering
event has occurred. For arbitrarily small momentum transfer in elastic scattering of
charged particles, the cross-section actually becomes arbitrarily large due to the long-
range nature of the Coulomb force, as we will see in section 5.2.4. Also, the vast
majority of the scattering events reflected in (4.151) are extremely uninteresting,
featuring final states of well-known and well-understood hadrons.
An example of a more interesting final state would be anything involving a top
quark ($t$) and anti-top quark ($\bar t$) pair, for which the Tevatron cross-section was about
This means that about 96,000 top pairs were produced at the Tevatron. However,
only a small fraction of these were identified as such. At the LHC, the cross-section
for producing top-antitop pairs is about
Fig. 4.1 Production cross-section for various particle final states at LHC energies of 7, 8 and 13
TeV. ATLAS experiment (from “Quantum Chromodynamics”, RPP)
As an abbreviation, we can call the initial state $|i\rangle = |\vec p_a, m_a; \vec p_b, m_b\rangle_{\rm IN}$ and the
final state $|f\rangle = |\vec k_1, m_1; \ldots; \vec k_n, m_n\rangle_{\rm OUT}$. In general, all of the particles could be
different, so that a different species of creation and annihilation operators might be
used for each. They can be either fermions or bosons, provided that the process
conserves angular momentum, charge, and color, and is consistent with other symmetries
of the Standard Model. If the particles are not scalars, then $|i\rangle$ and $|f\rangle$ should also
carry labels that specify the spin of each particle. Now, because of four-momentum
conservation, we can always write:
\[
\langle f | i \rangle = \mathcal{M}_{i\to f}\, (2\pi)^4\, \delta^{(4)}\Big( p_a + p_b - \sum_{i=1}^{n} k_i \Big). \tag{4.154}
\]
Here $\mathcal{M}_{i\to f}$ is called the reduced matrix element for the process. In the example
in Sect. 4.4, the reduced matrix element we found to first order in the coupling λ
was simply a constant: $\mathcal{M}_{\phi\phi\to\phi\phi} = -i\lambda$. However, in general, $\mathcal{M}$ can be a
non-trivial Lorentz-scalar function of the various 4-momenta and spin eigenvalues of the
particles in the problem. In practice, it is computed order-by-order in perturbation
theory, so it is only known approximately.
According to the postulates of quantum mechanics, the probability of a transition
from the state $|i\rangle$ to the state $|f\rangle$ is:
\[
P_{i\to f} = \frac{|\langle f | i \rangle|^2}{\langle f | f \rangle\, \langle i | i \rangle}. \tag{4.155}
\]
The matrix element has been divided by the norms of the states, which are not unity;
they will be computed below. Now, the total number of scattering events expected to
occur is:
\[
N_S = N_a N_b \sum_f P_{i\to f}. \tag{4.156}
\]
\[
\text{density of states} = V\, \frac{d^3\vec k}{(2\pi)^3}. \tag{4.157}
\]
4.5 Scattering Processes and Cross-Sections 77
\[
\sum_f \;\to\; \prod_{i=1}^{n} \left[ V \int \frac{d^3\vec k_i}{(2\pi)^3} \right]. \tag{4.158}
\]
Putting this into (4.156) and comparing with the definition (4.145), we have for the
differential contribution to the total cross-section:
\[
d\sigma = P_{i\to f}\, A \prod_{i=1}^{n} \left[ V\, \frac{d^3\vec k_i}{(2\pi)^3} \right]. \tag{4.159}
\]
Let us now suppose that each packet of particles consists of a cylinder with a large
volume V . Then the total time T over which the particles can collide is given by the
time it takes for the two packets to move through each other:
\[
T = \frac{V}{A\,|\vec v_a - \vec v_b|}. \tag{4.160}
\]
(Assume that the volume V and area A of each bunch are very large compared
to the cube and square of the particles’ Compton wavelengths.) It follows that the
differential contribution to the cross-section is:
\[
d\sigma = P_{i\to f}\, \frac{V}{T\,|\vec v_a - \vec v_b|} \prod_{i=1}^{n} \left[ V\, \frac{d^3\vec k_i}{(2\pi)^3} \right]. \tag{4.161}
\]
The total cross-section is obtained by integrating over 3-momenta of the final state
particles. Note that the differential cross-section dσ depends only on the collision
process being studied. So in the following we expect that the arbitrary volume V and
packet collision time T should cancel out.
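To see how that happens in the example of φφ → φφ scattering with a $\phi^4$
interaction, let us first compute the normalizations of the states appearing in $P_{i\to f}$. For
the initial state $|i\rangle$ of (4.122), one has:
\[
\langle i | i \rangle = \langle 0 | a_{\vec p_b} [a_{\vec p_a}, a^\dagger_{\vec p_a}] a^\dagger_{\vec p_b} | 0 \rangle + \langle 0 | a_{\vec p_b} a^\dagger_{\vec p_a} a_{\vec p_a} a^\dagger_{\vec p_b} | 0 \rangle. \tag{4.163}
\]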
Now the incoming particle momenta $\vec p_a$ and $\vec p_b$ are always different, so in the last
term the $a^\dagger_{\vec p_a}$ just commutes with $a_{\vec p_b}$ according to (4.62), yielding 0 when acting on
$\langle 0|$. The first term can be simplified using the commutator (4.62):
Commuting $a^\dagger_{\vec p_b}$ to the left in the same way yields the norm of the state $|i\rangle$:
This result is doubly infinite, since the arguments of each delta function vanish! In
order to successfully interpret it, let us recall the origin of the 3-momentum delta
functions. One can write
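\[
(2\pi)^3 \delta^{(3)}(\vec p - \vec p) = \int d^3\vec x\; e^{i \vec 0\cdot\vec x} = V, \tag{4.166}
\]
so that the norm of the initial state is
\[
\langle i | i \rangle = 4 E_a E_b V^2. \tag{4.167}
\]
Similarly, for the final state,
\[
\langle f | f \rangle = \prod_{i=1}^{n} (2\pi)^3\, 2E_i\, \delta^{(3)}(\vec k_i - \vec k_i) = \prod_{i=1}^{n} (2 E_i V). \tag{4.168}
\]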
In doing this, there is one subtlety; unlike the colliding particles, it could be that two
identical outgoing particles have exactly the same momentum. This seemingly could
produce “extra” contributions when we commute a † operators to the left. However,
at least for massive particles, one can usually ignore this, since the probability that
two outgoing particles will have exactly the same momentum is vanishingly small
in the limit V → ∞.
Next we turn to the square of the matrix element:
\[
|\langle f | i \rangle|^2 = |\mathcal{M}_{i\to f}|^2 \Big[ (2\pi)^4 \delta^{(4)}\Big( p_a + p_b - \sum_{i=1}^{n} k_i \Big) \Big]^2. \tag{4.169}
\]
This also is apparently the square of an infinite quantity. To interpret it, we again
recall the origin of the delta functions:
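\[
2\pi\, \delta(E_f - E_i) = \int_{-T/2}^{T/2} dt\; e^{i t (E_f - E_i)} = T \quad (\text{for } E_f = E_i), \tag{4.170}
\]
\[
(2\pi)^3 \delta^{(3)}\Big( \sum \vec k - \sum \vec p \Big) = \int d^3\vec x\; e^{i \vec x\cdot(\sum \vec k - \sum \vec p)} = V \quad \Big(\text{for } \sum \vec k = \sum \vec p\Big). \tag{4.171}
\]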
So we can write:
\[
(2\pi)^4 \delta^{(4)}\Big( p_a + p_b - \sum_{i=1}^{n} k_i \Big) = T V. \tag{4.172}
\]
Now if we use this to replace one of the two 4-momentum delta functions in (4.169),
we have:
\[
|\langle f | i \rangle|^2 = |\mathcal{M}_{i\to f}|^2\, (2\pi)^4\, T V\, \delta^{(4)}\Big( p_a + p_b - \sum_{i=1}^{n} k_i \Big). \tag{4.173}
\]
Plugging the results of (4.167), (4.168) and (4.173) into (4.155), we obtain an
expression for the transition probability:
\[
P_{i\to f} = |\mathcal{M}_{i\to f}|^2\, (2\pi)^4 \delta^{(4)}\Big( p_a + p_b - \sum_{i=1}^{n} k_i \Big)\, \frac{T}{4 E_a E_b V} \prod_{i=1}^{n} \frac{1}{2 E_i V}. \tag{4.174}
\]
\[
d\sigma = \frac{|\mathcal{M}_{i\to f}|^2}{4 E_a E_b\, |\vec v_a - \vec v_b|}\; d\Phi_n, \tag{4.175}
\]
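The step from (4.174) to (4.175) can be made explicit by substituting (4.174) into (4.161). The block below is a sketch of that algebra; it assumes that $d\Phi_n$ denotes the standard $n$-body Lorentz-invariant phase space, which is consistent with the two-body expression (4.179).

```latex
\begin{align*}
d\sigma
&= \frac{V}{T\,|\vec v_a - \vec v_b|}
   \left[\prod_{i=1}^{n} V \frac{d^3\vec k_i}{(2\pi)^3}\right]
   |\mathcal{M}_{i\to f}|^2 (2\pi)^4
   \delta^{(4)}\Big(p_a + p_b - \sum_{i=1}^{n} k_i\Big)
   \frac{T}{4E_aE_bV}\prod_{i=1}^{n}\frac{1}{2E_iV} \\
&= \frac{|\mathcal{M}_{i\to f}|^2}{4E_aE_b\,|\vec v_a - \vec v_b|}\;
   (2\pi)^4\delta^{(4)}\Big(p_a + p_b - \sum_{i=1}^{n} k_i\Big)
   \prod_{i=1}^{n}\frac{d^3\vec k_i}{(2\pi)^3\,2E_i}.
\end{align*}
% Assumed definition, matching (4.179) for n = 2:
% d\Phi_n = (2\pi)^4 \delta^{(4)}(p_a + p_b - \sum_i k_i)
%           \prod_i d^3k_i / [(2\pi)^3 2E_i]
```

Every factor of $V$ and $T$ cancels, as anticipated below (4.161).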
Now, assuming that the collision is head-on so that vb is opposite to va (or 0), the
denominator in (4.175) is:
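The most common case one encounters is two-particle scattering to a final state
with two particles. In the center-of-momentum frame, $\vec p_b = -\vec p_a$, so that the 2-body
Lorentz-invariant phase space becomes:
\[
d\Phi_2 = (2\pi)^4\, \delta^{(3)}(\vec k_1 + \vec k_2)\, \delta(E_a + E_b - E_1 - E_2)\, \frac{d^3\vec k_1}{(2\pi)^3\, 2E_1}\, \frac{d^3\vec k_2}{(2\pi)^3\, 2E_2} \tag{4.179}
\]
\[
= \frac{\delta^{(3)}(\vec k_1 + \vec k_2)\; \delta\Big( \sqrt{\vec k_1^2 + m_1^2} + \sqrt{\vec k_2^2 + m_2^2} - E_{\rm CM} \Big)}{16\pi^2 \sqrt{\vec k_1^2 + m_1^2}\, \sqrt{\vec k_2^2 + m_2^2}}\; d^3\vec k_1\, d^3\vec k_2, \tag{4.180}
\]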
where
\[
E_{\rm CM} \equiv E_a + E_b \tag{4.181}
\]
is the center-of-momentum energy of the process. Now one can do the $\vec k_2$ integral;
the 3-momentum delta function just sets $\vec k_2 = -\vec k_1$ (as it must be in the CM frame).
If we define
\[
K \equiv |\vec k_1| \tag{4.182}
\]
\[
d^3\vec k_1 = K^2\, dK\, d\Omega = K^2\, dK\, d\phi\, d(\cos\theta), \tag{4.183}
\]
we find
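\[
dW = \frac{\sqrt{K^2 + m_1^2} + \sqrt{K^2 + m_2^2}}{\sqrt{K^2 + m_1^2}\, \sqrt{K^2 + m_2^2}}\; K\, dK. \tag{4.186}
\]
Noticing that the delta function predestines $\sqrt{K^2 + m_1^2} + \sqrt{K^2 + m_2^2}$ to be replaced
by $E_{\rm CM}$, we can write:
\[
\frac{K^2\, dK}{\sqrt{K^2 + m_1^2}\, \sqrt{K^2 + m_2^2}} = \frac{K\, dW}{E_{\rm CM}}. \tag{4.187}
\]
Using this in (4.184), and integrating $dW$ using the delta function $\delta(W)$, we obtain:
\[
d\Phi_2 = \frac{K}{16\pi^2 E_{\rm CM}}\; d\phi\, d(\cos\theta) \tag{4.188}
\]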
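\[
d\sigma = |\mathcal{M}_{i\to f}|^2\, \frac{|\vec k_1|}{64\pi^2 E_{\rm CM}^2\, |\vec p_a|}\; d\Omega \tag{4.190}
\]
\[
d\sigma = |\mathcal{M}_{i\to f}|^2\, \frac{|\vec k_1|}{32\pi E_{\rm CM}^2\, |\vec p_a|}\; d(\cos\theta). \tag{4.191}
\]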
If the particle masses satisfy $m_a = m_1$ and $m_b = m_2$ (or they are very small), then
one has the further simplification $|\vec k_1| = |\vec p_a|$, so that
\[
d\sigma = |\mathcal{M}_{i\to f}|^2\, \frac{1}{32\pi E_{\rm CM}^2}\; d(\cos\theta). \tag{4.192}
\]
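To evaluate (4.191) numerically one needs $|\vec k_1|$ and $|\vec p_a|$ in the center-of-momentum frame. The sketch below uses the standard two-body kinematic result $|\vec k| = \sqrt{\lambda(s, m_1^2, m_2^2)}/(2\sqrt{s})$, where $\lambda$ is the Källén function; that formula and all function names are additions for illustration and do not appear in the text.

```python
import math


def kallen(x, y, z):
    """Kallen triangle function lambda(x, y, z)."""
    return x * x + y * y + z * z - 2.0 * (x * y + y * z + z * x)


def cm_momentum(e_cm, m1, m2):
    """Magnitude of the 3-momentum of either particle of a pair in the CM frame."""
    lam = kallen(e_cm**2, m1**2, m2**2)
    if lam < 0.0:
        raise ValueError("E_CM is below the two-particle threshold")
    return math.sqrt(lam) / (2.0 * e_cm)


def dsigma_dcos(m_squared, e_cm, ma, mb, m1, m2):
    """d(sigma)/d(cos theta) from (4.191), for a given value of |M|^2."""
    k_out = cm_momentum(e_cm, m1, m2)
    p_in = cm_momentum(e_cm, ma, mb)
    return m_squared * k_out / (32.0 * math.pi * e_cm**2 * p_in)


if __name__ == "__main__":
    # Equal masses: |k| = sqrt(E_CM^2 / 4 - m^2).
    print(cm_momentum(10.0, 3.0, 3.0))
    print(dsigma_dcos(1.0, 10.0, 3.0, 3.0, 3.0, 3.0))
```

When the initial and final masses match, the momentum ratio is 1 and the function reduces to (4.192).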
\[
d\sigma_{\phi\phi\to\phi\phi} = \frac{\lambda^2}{32\pi E_{\rm CM}^2}\; d(\cos\theta). \tag{4.193}
\]
Now we can integrate over θ using $\int_{-1}^{1} d(\cos\theta) = 2$. However, there is a
double-counting problem that we must take into account. The angles (θ, φ) that we have
integrated over represent the direction of the 3-momentum of one of the final state
particles. The other particle must then have 3-momentum in the opposite direction
(π − θ, φ + π). The two possible final states with $\vec k_1$ along those two opposite
directions are therefore actually the same state, because the two particles are identical. So,
we have actually counted each state twice when integrating over all $d\Omega$. To take this
into account, we have to divide by 2, arriving at the result for the total cross-section:
\[
\sigma_{\phi\phi\to\phi\phi} = \frac{\lambda^2}{32\pi E_{\rm CM}^2}. \tag{4.194}
\]
In the system of units with $c = \hbar = 1$, energy has the same units as 1/distance. Since
λ is dimensionless, it checks that σ indeed has units of area. This is a very useful
thing to check whenever one has found a cross-section!
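The unit check can also be done numerically. The sketch below evaluates (4.194) in natural units (GeV⁻²) and converts to millibarns using the standard conversion constant $(\hbar c)^2 \approx 0.389379~\mathrm{GeV}^2\,\mathrm{mb}$; the constant and the function names are additions for illustration, not part of the text.

```python
import math

# (hbar * c)^2 in GeV^2 * mb, used to convert GeV^-2 to millibarns.
HBARC2_GEV2_MB = 0.389379


def sigma_phi4_natural(lam, e_cm_gev):
    """Total phi phi -> phi phi cross-section from (4.194), in GeV^-2."""
    return lam**2 / (32.0 * math.pi * e_cm_gev**2)


def sigma_phi4_mb(lam, e_cm_gev):
    """The same cross-section converted to millibarns."""
    return sigma_phi4_natural(lam, e_cm_gev) * HBARC2_GEV2_MB


if __name__ == "__main__":
    print(sigma_phi4_natural(1.0, 1.0))
    print(sigma_phi4_mb(1.0, 1.0))
```

Doubling the center-of-momentum energy reduces the result by a factor of four, as expected for a cross-section that scales as $1/E_{\rm CM}^2$.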
For our next example, let us consider a theory with a single scalar field as before,
but with an interaction Lagrangian that is cubic in the field:
\[
\mathcal{L}_{\rm int} = -\frac{\mu}{6}\, \phi^3, \tag{4.195}
\]
instead of (4.116). Here μ is a coupling that has the same dimensions as mass.
As before, let us compute the matrix element for 2 particle to 2 particle scattering,
φφ → φφ.
The definition and quantization of the free Hamiltonian proceeds exactly as before,
with equal time commutators given by (4.62) and (4.63), and the free Hamiltonian by
(4.84). The interaction part of the Hamiltonian can be obtained in exactly the same
way as the discussion leading up to (4.120), yielding:
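\[
H_{\rm int} = \frac{\mu}{6} (2\pi)^3 \int d\tilde q_1\, d\tilde q_2\, d\tilde q_3 \Big[ a^\dagger_{\vec q_1} a^\dagger_{\vec q_2} a^\dagger_{\vec q_3}\, \delta^{(3)}(\vec q_1 + \vec q_2 + \vec q_3)
+ 3\, a^\dagger_{\vec q_1} a^\dagger_{\vec q_2} a_{\vec q_3}\, \delta^{(3)}(\vec q_1 + \vec q_2 - \vec q_3)
+ 3\, a^\dagger_{\vec q_1} a_{\vec q_2} a_{\vec q_3}\, \delta^{(3)}(\vec q_1 - \vec q_2 - \vec q_3)
+ a_{\vec q_1} a_{\vec q_2} a_{\vec q_3}\, \delta^{(3)}(\vec q_1 + \vec q_2 + \vec q_3) \Big]. \tag{4.196}
\]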
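in the large N limit. Keeping only terms that are of order $H_{\rm int}^2$, this becomes:
\[
e^{-iTH} = \sum_{n=0}^{N-2}\; \sum_{m=0}^{N-n-2} \left[ 1 - i H_0 \frac{T}{N} \right]^{N-n-m-2} \left( -i H_{\rm int} \frac{T}{N} \right) \left[ 1 - i H_0 \frac{T}{N} \right]^{m} \left( -i H_{\rm int} \frac{T}{N} \right) \left[ 1 - i H_0 \frac{T}{N} \right]^{n}. \tag{4.199}
\]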
4.6 Scalar Field with φ 3 Coupling 83
Now, in the large N limit, we can convert the discrete sums into integrals over the
variables $t = Tn/N$ and $t' = Tm/N + t$ with $\Delta t = \Delta t' = T/N$. Since most of the
contribution comes from large n, m when N → ∞, the result becomes:
\[
e^{-iTH} = \int_0^T dt \int_t^T dt'\; e^{-iH_0 (T - t')}\, (-i H_{\rm int})\, e^{-i H_0 (t' - t)}\, (-i H_{\rm int})\, e^{-i H_0 t}. \tag{4.200}
\]
When we sandwich this between the states $\langle \vec k_1, \vec k_2|$ and $|\vec p_a, \vec p_b\rangle$, we can substitute
provided that in place of E X we will later put in the appropriate energy eigenvalue
of the state created by each particular term in −i Hint acting on the initial state. So,
we have:
where
\[
I = \int_0^T dt \int_t^T dt'\; e^{-i E_f (T - t')}\, e^{-i E_X (t' - t)}\, e^{-i E_i t}. \tag{4.205}
\]
\[
I = e^{-iT(E_i + E_f)/2} \int_{-T/2}^{T/2} d\bar t\; e^{i \bar t (E_X - E_i)} \int_{\bar t}^{T/2} d\bar t'\; e^{i \bar t' (E_f - E_X)}. \tag{4.206}
\]
Now $e^{-iT(E_i + E_f)/2}$ is just a constant phase that will go away when we take the
complex square of the matrix element, so we drop it. Then, relabeling $\bar t \to t$ and
$\bar t' \to t'$, and taking the limit of a very long time T → ∞:
\[
I = \int_{-\infty}^{\infty} dt\; e^{i t (E_X - E_i)} \int_t^{\infty} dt'\; e^{i t' (E_f - E_X + i\epsilon)}. \tag{4.207}
\]
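The $t'$ integral does not have the form of a delta function because its lower limit of
integration is $t$ rather than −∞. The infinitesimal ε > 0 in the exponent makes the integral
converge as $t' \to \infty$; it is taken to zero at the end. Doing the $t'$ integration, we get: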
\[
I = \frac{i}{E_f - E_X + i\epsilon} \int_{-\infty}^{\infty} dt\; e^{i t (E_f - E_i)} \tag{4.208}
\]
\[
= 2\pi\, \delta(E_f - E_i)\, \frac{i}{E_f - E_X + i\epsilon}. \tag{4.209}
\]
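The inner integral leading to (4.208) can be checked numerically. In the sketch below, `a` stands for $E_f - E_X$ and `eps` for a small but finite ε; the closed form is $i\, e^{it(a + i\epsilon)}/(a + i\epsilon)$, which reduces to the factor in (4.208) once the $e^{-\epsilon t}$ damping is dropped in the limit ε → 0. The function names, the truncation length, and the step count are choices made for this illustration only.

```python
import cmath


def tail_integral_numeric(t, a, eps, span=60.0, steps=60000):
    """Trapezoid estimate of the integral of exp(i t' (a + i eps)) for t' from t
    to infinity, truncated at t + span (the integrand decays like exp(-eps t'))."""
    h = span / steps

    def integrand(tp):
        return cmath.exp(1j * tp * (a + 1j * eps))

    total = 0.5 * (integrand(t) + integrand(t + span))
    for n in range(1, steps):
        total += integrand(t + n * h)
    return total * h


def tail_integral_exact(t, a, eps):
    """Closed form: i exp(i t (a + i eps)) / (a + i eps)."""
    return 1j * cmath.exp(1j * t * (a + 1j * eps)) / (a + 1j * eps)


if __name__ == "__main__":
    print(tail_integral_numeric(0.2, 1.3, 0.5))
    print(tail_integral_exact(0.2, 1.3, 0.5))
```

Without the damping factor the truncated integral would oscillate indefinitely as the upper limit grows, which is why the iε prescription is needed.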
As usual, energy conservation between the initial and final states is thus automatic.
Putting together the results above, we have so far:
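Let us now evaluate the matrix element in (4.210). To do this, we can divide the
calculation up into pieces, depending on how many $a$ and $a^\dagger$ operators are contained
in each factor of $H_{\rm int}$. First, let us consider the contribution when the right $H_{\rm int}$
contains $a^\dagger a a$ terms acting on the initial state, and the left $H_{\rm int}$ contains $a^\dagger a^\dagger a$
terms. Taking these pieces from (4.196), the contribution from (4.210) is:
\[
{}_{\rm OUT}\langle \vec k_1, \vec k_2 | \vec p_a, \vec p_b \rangle_{\rm IN} \Big|_{(a^\dagger a^\dagger a)(a^\dagger a a)\ {\rm part}} =
\left[ -i \frac{\mu}{2} (2\pi)^3 \right]^2 2\pi\, \delta(E_f - E_i) \int d\tilde r_1\, d\tilde r_2\, d\tilde r_3 \int d\tilde q_1\, d\tilde q_2\, d\tilde q_3\;
\delta^{(3)}(\vec r_1 + \vec r_2 - \vec r_3)\, \delta^{(3)}(\vec q_1 - \vec q_2 - \vec q_3)\;
\langle 0 | a_{\vec k_1} a_{\vec k_2}\, a^\dagger_{\vec r_1} a^\dagger_{\vec r_2} a_{\vec r_3} \left( \frac{i}{E_f - E_X} \right) a^\dagger_{\vec q_1} a_{\vec q_2} a_{\vec q_3}\, a^\dagger_{\vec p_a} a^\dagger_{\vec p_b} | 0 \rangle. \tag{4.211}
\]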
The factor involving E X is left inserted within the matrix element to remind us that
E X should be replaced by the eigenvalue of the free Hamiltonian H0 acting on the
state to its right.
The last line in (4.211) can be calculated using the following general strategy. We
commute $a$'s to the right and $a^\dagger$'s to the left, using (4.62) and (4.63). In doing so, we
will get a non-zero contribution with a delta function whenever the 3-momentum of
an $a$ equals that of an $a^\dagger$ with which it is commuted, removing that $a$, $a^\dagger$ pair. In the
end, every $a$ must “contract” with some $a^\dagger$ in this way (and vice versa), because an
$a$ acting on $|0\rangle$ or an $a^\dagger$ acting on $\langle 0|$ vanishes.
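This allows us to identify what $E_X$ is. The $a_{\vec q_2}$ and $a_{\vec q_3}$ operators must be contracted
with $a^\dagger_{\vec p_a}$ and $a^\dagger_{\vec p_b}$ if a non-zero result is to be obtained. There are two ways to do
this: either pair up $[a_{\vec q_2}, a^\dagger_{\vec p_a}]$ and $[a_{\vec q_3}, a^\dagger_{\vec p_b}]$, or pair up $[a_{\vec q_2}, a^\dagger_{\vec p_b}]$ and $[a_{\vec q_3}, a^\dagger_{\vec p_a}]$. In
either case, the delta function $\delta^{(3)}(\vec q_1 - \vec q_2 - \vec q_3)$ in (4.211) then sets
\[
\vec q_1 = \vec p_a + \vec p_b \equiv \vec Q. \tag{4.212}
\]
So $E_X$, the energy eigenvalue of the state $a^\dagger_{\vec q_1} a_{\vec q_2} a_{\vec q_3} a^\dagger_{\vec p_a} a^\dagger_{\vec p_b} |0\rangle$, must be replaced by
\[
E_X = E_{\vec Q} = \sqrt{|\vec p_a + \vec p_b|^2 + m^2}. \tag{4.213}
\]
The vacuum matrix element
\[
\langle 0 | a_{\vec k_1} a_{\vec k_2}\, a^\dagger_{\vec r_1} a^\dagger_{\vec r_2} a_{\vec r_3}\, a^\dagger_{\vec q_1} a_{\vec q_2} a_{\vec q_3}\, a^\dagger_{\vec p_a} a^\dagger_{\vec p_b} | 0 \rangle \tag{4.214}
\]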
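now yields four distinct non-zero terms, corresponding to the following ways of
contracting $a$'s and $a^\dagger$'s:
\[
[a_{\vec k_1}, a^\dagger_{\vec r_1}]\, [a_{\vec k_2}, a^\dagger_{\vec r_2}]\, [a_{\vec q_2}, a^\dagger_{\vec p_a}]\, [a_{\vec q_3}, a^\dagger_{\vec p_b}]\, [a_{\vec r_3}, a^\dagger_{\vec q_1}], \ \text{or} \tag{4.215}
\]
\[
[a_{\vec k_1}, a^\dagger_{\vec r_2}]\, [a_{\vec k_2}, a^\dagger_{\vec r_1}]\, [a_{\vec q_2}, a^\dagger_{\vec p_a}]\, [a_{\vec q_3}, a^\dagger_{\vec p_b}]\, [a_{\vec r_3}, a^\dagger_{\vec q_1}], \ \text{or} \tag{4.216}
\]
\[
[a_{\vec k_1}, a^\dagger_{\vec r_1}]\, [a_{\vec k_2}, a^\dagger_{\vec r_2}]\, [a_{\vec q_2}, a^\dagger_{\vec p_b}]\, [a_{\vec q_3}, a^\dagger_{\vec p_a}]\, [a_{\vec r_3}, a^\dagger_{\vec q_1}], \ \text{or} \tag{4.217}
\]
\[
[a_{\vec k_1}, a^\dagger_{\vec r_2}]\, [a_{\vec k_2}, a^\dagger_{\vec r_1}]\, [a_{\vec q_2}, a^\dagger_{\vec p_b}]\, [a_{\vec q_3}, a^\dagger_{\vec p_a}]\, [a_{\vec r_3}, a^\dagger_{\vec q_1}]. \tag{4.218}
\]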
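Now, the various factors of $(2\pi)^3\, 2E$ just cancel the factors in the denominators of the
definition of $d\tilde q_i$ and $d\tilde r_i$. One can do the $\vec q_1$, $\vec q_2$, $\vec q_3$, $\vec r_1$, $\vec r_2$, and $\vec r_3$ integrations trivially,
using the 3-momentum delta functions, resulting in the following contribution to
(4.211):
\[
\left[ -i \frac{\mu}{2} \right]^2 \left( \frac{i}{E_f - E_{\vec Q}} \right) \left( \frac{1}{2 E_{\vec Q}} \right) (2\pi)^4\, \delta(E_f - E_i)\, \delta^{(3)}(\vec k_1 + \vec k_2 - \vec p_a - \vec p_b). \tag{4.220}
\]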
The two delta functions can be combined into $\delta^{(4)}(k_1 + k_2 - p_a - p_b)$. Now, the
other three sets of contractions listed in (4.216)–(4.218) are exactly the same, after
a relabeling of momenta. This gives a factor of 4, so (replacing $E_f \to E_i$ in the
denominator, as allowed by the delta function) we have:
\[
{}_{\rm OUT}\langle \vec k_1, \vec k_2 | \vec p_a, \vec p_b \rangle_{\rm IN} \Big|_{(a^\dagger a^\dagger a)(a^\dagger a a)\ {\rm part}} =
(-i\mu)^2\, \frac{i}{(E_i - E_{\vec Q})(2 E_{\vec Q})}\, (2\pi)^4\, \delta^{(4)}(k_1 + k_2 - p_a - p_b). \tag{4.221}
\]
One can draw a simple picture illustrating what has happened in the preceding
formulas:
The initial state contains two particles, denoted by the lines on the left. Acting with
the first factor of Hint (on the right in the formula, and represented by the vertex on the
left in the figure) destroys the two particles and creates a virtual particle in their place.
The second factor of Hint destroys the virtual particle and creates the two final state
particles, represented by the lines on the right. The three-momentum carried by the
intermediate virtual particle is Q = pa + pb = k1 + k2 , so momentum is conserved
at the two vertices.
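However, there are other contributions that must be included. Another one occurs
if the $H_{\rm int}$ on the right in (4.210) contains $a^\dagger a^\dagger a^\dagger$ operators, and the other $H_{\rm int}$
contains $aaa$ operators. The corresponding picture is this:
Here the $H_{\rm int}$ carrying $a^\dagger a^\dagger a^\dagger$ (the rightmost one in the formula) is represented by
the upper left vertex in the figure, and the one carrying $aaa$ is represented by the
lower right vertex. The explicit formula corresponding to this picture is:
\[
{}_{\rm OUT}\langle \vec k_1, \vec k_2 | \vec p_a, \vec p_b \rangle_{\rm IN} \Big|_{(aaa)(a^\dagger a^\dagger a^\dagger)\ {\rm part}} =
\left[ -i \frac{\mu}{6} (2\pi)^3 \right]^2 2\pi\, \delta(E_f - E_i) \int d\tilde r_1\, d\tilde r_2\, d\tilde r_3 \int d\tilde q_1\, d\tilde q_2\, d\tilde q_3\;
\delta^{(3)}(\vec r_1 + \vec r_2 + \vec r_3)\, \delta^{(3)}(\vec q_1 + \vec q_2 + \vec q_3)\;
\langle 0 | a_{\vec k_1} a_{\vec k_2}\, a_{\vec r_1} a_{\vec r_2} a_{\vec r_3} \left( \frac{i}{E_f - E_X} \right) a^\dagger_{\vec q_1} a^\dagger_{\vec q_2} a^\dagger_{\vec q_3}\, a^\dagger_{\vec p_a} a^\dagger_{\vec p_b} | 0 \rangle. \tag{4.222}
\]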
As before, we can calculate this by commuting $a$'s to the right and $a^\dagger$'s to the left.
In the end, non-zero contributions arise only when each $a$ is contracted with some
$a^\dagger$. In doing so, we should ignore any terms that arise whenever a final state $a_{\vec k}$
is contracted with an initial state $a^\dagger_{\vec p}$. That would correspond to a situation with no
scattering, since the initial state particle and the final state particle would be exactly
the same.
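\[
E_X = E_{\vec q_1} + E_{\vec q_2} + E_{\vec q_3} + E_{\vec p_a} + E_{\vec p_b}. \tag{4.223}
\]
For example, consider the term obtained from the following contractions of $a$'s
and $a^\dagger$'s:
\[
[a_{\vec k_1}, a^\dagger_{\vec q_1}]\, [a_{\vec k_2}, a^\dagger_{\vec q_2}]\, [a_{\vec r_1}, a^\dagger_{\vec p_a}]\, [a_{\vec r_2}, a^\dagger_{\vec p_b}]\, [a_{\vec r_3}, a^\dagger_{\vec q_3}]. \tag{4.224}
\]
This leads to factors of $(2\pi)^3\, 2E$ and momentum delta functions just as before. So
we can do the 3-momentum $\vec q_{1,2}$ and $\vec r_{1,2,3}$ integrals using the delta functions, in the
process setting $\vec q_1 = \vec k_1$ and $\vec q_2 = \vec k_2$ and $\vec r_1 = \vec p_a$ and $\vec r_2 = \vec p_b$ and $\vec r_3 = \vec q_3$. Finally,
we can do the $\vec q_3$ integral using one of the delta functions already present in (4.222),
resulting in $\vec q_3 = -\vec p_a - \vec p_b = -\vec k_1 - \vec k_2 = -\vec Q$, with $\vec Q$ the same as was defined in
(4.212). This allows us to identify in this case:
\[
E_X = E_{\vec k_1} + E_{\vec k_2} + E_{\vec p_a} + E_{\vec p_b} + E_{\vec Q} = 2 E_i + E_{\vec Q}. \tag{4.225}
\]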
It is now profitable to combine the two contributions we have found. One hint that
this is a good idea is the fact that the two cartoon figures we have drawn for them are
topologically the same; the second one just has a line that moves backwards. So if
we just ignore the distinction between internal lines that move backwards and those
that move forwards, we can draw a single Feynman diagram to represent both results
combined:
The initial state is on the left, and the final state is on the right, and the flow of 4-
momentum is indicated by the arrows, with 4-momentum conserved at each vertex.
The result of combining these two contributions is called the s-channel contribu-
tion, to distinguish it from still more contributions that we will get to soon. Using a
common denominator for (E i − E Q ) and (E i + E Q ), we get:
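\[
{}_{\rm OUT}\langle \vec k_1, \vec k_2 | \vec p_a, \vec p_b \rangle_{\rm IN} \Big|_{s\text{-channel}} =
(-i\mu)^2\, \frac{i}{E_i^2 - E_{\vec Q}^2}\, (2\pi)^4\, \delta^{(4)}(k_1 + k_2 - p_a - p_b). \tag{4.227}
\]
\[
(p_a + p_b)^2 = E_i^2 - |\vec Q|^2 = E_i^2 - E_{\vec Q}^2 + m^2. \tag{4.229}
\]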
(Note that $p_a^\mu + p_b^\mu$ is not equal to $(E_{\vec Q}, \vec Q)$.) So we can rewrite the term
\[
\frac{i}{E_i^2 - E_{\vec Q}^2} = \frac{i}{(p_a + p_b)^2 - m^2}, \tag{4.230}
\]
The final result is that the s-channel contribution to the matrix element is:
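\[
{}_{\rm OUT}\langle \vec k_1, \vec k_2 | \vec p_a, \vec p_b \rangle_{\rm IN} \Big|_{s\text{-channel}} =
(-i\mu)^2\, \frac{i}{(p_a + p_b)^2 - m^2}\, (2\pi)^4\, \delta^{(4)}(k_1 + k_2 - p_a - p_b). \tag{4.231}
\]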
Note that one could just as well have put (k1 + k2 )2 in place of ( pa + pb )2 in this
expression, because of the delta function.
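The way the two time-orderings combine into a single covariant denominator can be verified numerically. The sketch below assumes, as the common-denominator step above suggests, that the second ordering carries the same $1/(2E_{\vec Q})$ prefactor as (4.221) with energy denominator $E_f - E_X = -(E_i + E_{\vec Q})$ from (4.225); the function names and the sample momenta are hypothetical.

```python
import math


def energy(p3, m):
    """On-shell energy sqrt(|p|^2 + m^2) for a 3-momentum given as a tuple."""
    return math.sqrt(sum(c * c for c in p3) + m * m)


def old_fashioned_sum(e_i, e_q):
    """Sum of the two time-ordered energy denominators, each with a 1/(2 E_Q)."""
    forward = 1.0 / ((e_i - e_q) * 2.0 * e_q)    # E_X = E_Q
    backward = 1.0 / ((-e_i - e_q) * 2.0 * e_q)  # E_X = 2 E_i + E_Q, E_f = E_i
    return forward + backward


def covariant_denominator(pa3, pb3, m):
    """1 / ((p_a + p_b)^2 - m^2) with on-shell incoming particles of mass m."""
    e_i = energy(pa3, m) + energy(pb3, m)
    q3 = tuple(a + b for a, b in zip(pa3, pb3))
    s = e_i**2 - sum(c * c for c in q3)
    return 1.0 / (s - m * m)


if __name__ == "__main__":
    pa3, pb3, m = (0.3, -1.2, 2.0), (1.1, 0.4, -0.7), 0.5
    e_i = energy(pa3, m) + energy(pb3, m)
    e_q = energy(tuple(a + b for a, b in zip(pa3, pb3)), m)
    print(old_fashioned_sum(e_i, e_q), covariant_denominator(pa3, pb3, m))
```

Neither time-ordered term is Lorentz invariant on its own; only their sum is.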
Now one can go through the same whole process with contributions that come
from the rightmost $H_{\rm int}$ (acting first on the initial state) consisting of $a^\dagger a^\dagger a$ terms,
and the leftmost $H_{\rm int}$ containing $a^\dagger a a$ terms. One can draw Feynman diagrams that
represent these terms, which look like:
These are referred to as the t-channel and u-channel contributions respectively. Here
we have combined all topologically-identical diagrams. This is a standard procedure
that is always followed; the diagrams we have drawn with dashed lines for the scalar
field are the Feynman diagrams for the process. [The solid-line diagrams appearing
between (4.221) and (4.222) above are sometimes known as “old-fashioned Feynman
diagrams”, but it is very rare to see them in the modern literature.]
After much juggling of factors of $(2\pi)^3$ and doing 3-momentum integrals using
delta functions, but using no new concepts, the contributions of the t-channel and
u-channel Feynman diagrams can be found to be simply:
\[
{}_{\rm OUT}\langle \vec k_1, \vec k_2 | \vec p_a, \vec p_b \rangle_{\rm IN} \Big|_{t\text{-channel}} =
(-i\mu)^2\, \frac{i}{(p_a - k_1)^2 - m^2}\, (2\pi)^4\, \delta^{(4)}(k_1 + k_2 - p_a - p_b), \tag{4.232}
\]
and
\[
{}_{\rm OUT}\langle \vec k_1, \vec k_2 | \vec p_a, \vec p_b \rangle_{\rm IN} \Big|_{u\text{-channel}} =
(-i\mu)^2\, \frac{i}{(p_a - k_2)^2 - m^2}\, (2\pi)^4\, \delta^{(4)}(k_1 + k_2 - p_a - p_b). \tag{4.233}
\]
The reduced matrix element can now be obtained by just stripping off the factors of
$(2\pi)^4 \delta^{(4)}(k_1 + k_2 - p_a - p_b)$, as demanded by the definition (4.154). So the total
reduced matrix element, suitable for plugging into the formula for the cross-section,
is:
\[
\mathcal{M}_{\phi\phi\to\phi\phi} = \mathcal{M}_s + \mathcal{M}_t + \mathcal{M}_u \tag{4.234}
\]
where:
\[
\mathcal{M}_s = (-i\mu)^2\, \frac{i}{(p_a + p_b)^2 - m^2}, \tag{4.235}
\]
\[
\mathcal{M}_t = (-i\mu)^2\, \frac{i}{(p_a - k_1)^2 - m^2}, \tag{4.236}
\]
\[
\mathcal{M}_u = (-i\mu)^2\, \frac{i}{(p_a - k_2)^2 - m^2}. \tag{4.237}
\]
The terminology s, t, and u comes from the standard kinematic
variables for 2→2 scattering known as Mandelstam variables:
\[
s = (p_a + p_b)^2 = (k_1 + k_2)^2, \tag{4.238}
\]
\[
t = (p_a - k_1)^2 = (k_2 - p_b)^2, \tag{4.239}
\]
\[
u = (p_a - k_2)^2 = (k_1 - p_b)^2. \tag{4.240}
\]
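The s-, t-, and u-channel diagrams are simple functions of the corresponding
Mandelstam variables:
\[
\mathcal{M}_s = (-i\mu)^2\, \frac{i}{s - m^2}, \tag{4.241}
\]
\[
\mathcal{M}_t = (-i\mu)^2\, \frac{i}{t - m^2}, \tag{4.242}
\]
\[
\mathcal{M}_u = (-i\mu)^2\, \frac{i}{u - m^2}. \tag{4.243}
\]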
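As an illustration of (4.238)–(4.243), the sketch below builds center-of-momentum 4-momenta for equal-mass 2→2 scattering, computes the Mandelstam variables, and sums the three channels. The helper names and the choice of scattering plane are arbitrary, and the check $s + t + u = 4m^2$ is a standard kinematic identity for four external particles of mass m that is not stated in the text.

```python
import math


def minkowski_sq(p):
    """p^2 = E^2 - |p|^2 for a 4-vector (E, px, py, pz)."""
    e, x, y, z = p
    return e * e - x * x - y * y - z * z


def add(p, q):
    return tuple(a + b for a, b in zip(p, q))


def sub(p, q):
    return tuple(a - b for a, b in zip(p, q))


def cm_momenta(e_cm, m, cos_theta):
    """Incoming momenta along z and outgoing momenta at angle theta in the x-z plane."""
    e = e_cm / 2.0
    k = math.sqrt(e * e - m * m)
    sin_theta = math.sqrt(1.0 - cos_theta**2)
    pa, pb = (e, 0.0, 0.0, k), (e, 0.0, 0.0, -k)
    k1 = (e, k * sin_theta, 0.0, k * cos_theta)
    k2 = (e, -k * sin_theta, 0.0, -k * cos_theta)
    return pa, pb, k1, k2


def mandelstam(pa, pb, k1, k2):
    """Return (s, t, u) as defined in (4.238)-(4.240)."""
    return (minkowski_sq(add(pa, pb)),
            minkowski_sq(sub(pa, k1)),
            minkowski_sq(sub(pa, k2)))


def amplitude_phi3(mu, m, s, t, u):
    """Tree-level M = M_s + M_t + M_u from (4.241)-(4.243)."""
    return sum((-1j * mu) ** 2 * 1j / (x - m * m) for x in (s, t, u))


if __name__ == "__main__":
    s, t, u = mandelstam(*cm_momenta(3.0, 1.0, 0.3))
    print(s, t, u, s + t + u)
    print(amplitude_phi3(2.0, 1.0, s, t, u))
```

For physical scattering angles $t$ and $u$ are negative and $s \ge 4m^2$, so none of the three denominators vanishes at tree level.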
It is now possible to abstract what we have found, to obtain the general Feynman
rules for calculating reduced matrix elements in a scalar field theory. Evidently,
the reduced matrix element M is the sum of contributions from each topologically
distinct Feynman diagram, with external lines corresponding to each initial state or
final state particle. For each term in the interaction Lagrangian
\[
\mathcal{L}_{\rm int} = -\frac{y}{n!}\, \phi^n, \tag{4.244}
\]
with coupling y, one can draw a vertex at which n lines meet. At each vertex, 4-
momentum must be conserved. Then:
• For each vertex appearing in a diagram, we should put a factor of −i y. For the
examples we have done with y = λ and y = μ, the Feynman rules are just:
Note that the conventional factor of 1/n! in the Lagrangian (4.244) makes the
corresponding Feynman rule simple in each case.
• For each internal scalar field line carrying 4-momentum $p^\mu$, we should put a
factor of $i/(p^2 - m^2 + i\epsilon)$:
4.7 Feynman Rules 91
This factor associated with internal scalar field lines is called the Feynman
propagator. Here we have added an imaginary infinitesimal term $i\epsilon$, with the
understanding that ε → 0 at the end of the calculation; this turns out to be necessary for
cases in which $p^2$ becomes very close to $m^2$. This corresponds to the particle on the
internal line being nearly “on-shell”, because $p^2 = m^2$ is the equation satisfied
by a free particle in empty space.
• For each external line, we just have a factor of 1:
This is a rather trivial rule, but it is useful to mention it because in the cases of
fermions and vector fields, external lines will turn out to carry non-trivial factors
not equal to 1. (Here the gray blobs represent the rest of the Feynman diagram.)
Those are all the rules one needs to calculate reduced matrix elements for Feynman
diagrams without closed loops, also known as tree diagrams. The result is said to
be a tree-level calculation. There are additional rules that apply to diagrams with
closed loops (loop diagrams) which have not arisen explicitly in the preceding
discussion, but could be inferred from more complicated calculations. For them,
the additional rules are:
• For each closed loop in a Feynman diagram, there is an undetermined
4-momentum $\ell^\mu$. These loop momenta should be integrated over according to:
\[
\int \frac{d^4\ell}{(2\pi)^4}. \tag{4.245}
\]
Loop diagrams quite often diverge because of the integration over all $\ell^\mu$, because
of the contribution from very large $|\ell^2|$. This can be fixed by introducing a cutoff
$|\ell^2|_{\rm max}$ in the integral, or by other more rigorous methods, which can make the
integrals finite. The techniques of getting physically meaningful answers out of
this are known as regularization (making the integrals finite) and renormalization
(redefining coupling constants and masses so that the physical observables do not
depend explicitly on the unknown cutoff).
• If a Feynman diagram with one or more closed loops can be transformed into
an exact copy of itself by interchanging any number of internal lines through a
smooth deformation, without moving the external lines, then there is an additional
factor of 1/N , where N is the number of distinct permutations of that type. (This
is known as the “symmetry factor” for the loop diagram.)
Some examples might be useful. In the φ 3 theory, there are quite a few Feynman
diagrams that will describe the scattering of 2 particles to 3 particles. One of them
is shown below:
For this diagram, according to the rules, the contribution to the reduced matrix
element is just
\[
\mathcal{M} = (-i\mu)^3\, \frac{i}{(p_a + p_b)^2 - m^2}\; \frac{i}{(k_1 + k_2)^2 - m^2}. \tag{4.246}
\]
Imagine having to calculate this starting from scratch with creation and annihilation
operators, and tremble with fear! Feynman rules are good.
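The tree-level rules translate almost directly into code. The following sketch assembles (4.246) from a vertex factor and two propagators; the function names are hypothetical, and the invariants `s` and `s12` stand for $(p_a + p_b)^2$ and $(k_1 + k_2)^2$, which the caller is assumed to supply.

```python
def vertex(coupling):
    """Feynman rule for a scalar vertex: -i times the coupling."""
    return -1j * coupling


def scalar_propagator(p_squared, m, eps=0.0):
    """Feynman propagator i / (p^2 - m^2 + i eps) for an internal scalar line."""
    return 1j / (p_squared - m * m + 1j * eps)


def amplitude_2_to_3(mu, m, s, s12):
    """Reduced matrix element (4.246): three vertices and two internal lines."""
    return vertex(mu) ** 3 * scalar_propagator(s, m) * scalar_propagator(s12, m)


if __name__ == "__main__":
    print(amplitude_2_to_3(2.0, 1.0, 10.0, 5.0))
```

Multiplying out the phases gives $(-i)^3\, i^2 = -i$, so the result is $-i\mu^3 / [(s - m^2)(s_{12} - m^2)]$.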
An example of a Feynman diagram with a closed loop in the φ 3 theory is:
There is a symmetry factor of 1/2 for this diagram, because one can smoothly
interchange the two lines carrying 4-momenta $\ell^\mu$ and $\ell^\mu - p_a^\mu - p_b^\mu$ to get back
to the original diagram, without moving the external lines. So the reduced matrix
element for this diagram is:
\[
\mathcal{M} = \frac{1}{2}\, (-i\mu)^4 \left[ \frac{i}{(p_a + p_b)^2 - m^2} \right]^2 \int \frac{d^4\ell}{(2\pi)^4}\; \frac{i}{(\ell - p_a - p_b)^2 - m^2 + i\epsilon}\; \frac{i}{\ell^2 - m^2 + i\epsilon}.
\]
Again, deriving this result starting from the creation and annihilation operators is pos-
sible, but extraordinarily unpleasant! In the future, we will simply state the Feynman
rules for any theory from staring at the Lagrangian density. The general procedure
for doing this is rather simple (although the proof is not), and is outlined below.
A Feynman diagram is a precise representation of a contribution to the reduced
matrix element M for a given physical process. The diagrams are built out of three
types of building blocks:
The Feynman rules specify a mathematical expression for each of these objects. They
follow from the Lagrangian density, which defines a particular theory.2
To generalize what we have found for scalar fields, let us consider a set of generic
fields $\Phi_i$, which can include both commuting bosons and anticommuting fermions.
They might include real or complex scalars, Dirac or Weyl fermions, and vector
fields of various types. The index i runs over a list of all the fields, and over their
spinor or vector indices. Now, it is always possible to obtain the Feynman rules by
writing an interaction Hamiltonian and computing matrix elements. Alternatively,
one can use powerful path integral techniques that are beyond the scope of this book
to derive the Feynman rules. However, in the end the rules can be summarized very
simply in a way that could be guessed from the examples of real scalar field theory
that we have already worked out. In the following, we will simply state the relevant
results; more rigorous derivations can be found in field theory textbooks.
For interactions, we have now found in two cases that the Feynman rule for n
scalar lines to meet at a vertex is equal to −i times the coupling of n scalar fields
in the Lagrangian with a factor of 1/n!. More generally, consider an interaction
Lagrangian term:
\[
\mathcal{L}_{\rm int} = -\frac{X^{i_1 i_2 \ldots i_N}}{P}\; \Phi_{i_1} \Phi_{i_2} \cdots \Phi_{i_N}, \tag{4.250}
\]
where $P$ is the product of $n!$ for each set of $n$ identical fields in the list $\Phi_{i_1}, \Phi_{i_2},
\ldots, \Phi_{i_N}$, and $X^{i_1 i_2 \ldots i_N}$ is the coupling constant that determines the strength of the
interaction. The corresponding Feynman rule attaches N lines together at a vertex.
Then the mathematical expression assigned to this vertex is $-i X^{i_1 i_2 \ldots i_N}$. The lines for
distinguishable fields among $i_1, i_2, \ldots, i_N$ should be labeled as such, or otherwise
distinguished by drawing them differently from each other.
For example, consider a theory with two real scalar fields φ and ρ. If the interaction
Lagrangian includes terms, say,
\[
\mathcal{L}_{\rm int} = -\frac{\lambda_1}{4}\, \phi^2 \rho^2 - \frac{\lambda_2}{6}\, \phi^3 \rho, \tag{4.251}
\]
then there are Feynman rules:
Here the longer-dashed lines correspond to the field φ, and the shorter-dashed lines
to the field ρ.
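The combinatorial factor $P$ in (4.250) is easy to compute mechanically. The sketch below counts identical fields in a term and recovers the vertex factor $-iX$ from the numerical coefficient multiplying the fields in $\mathcal{L}_{\rm int}$. The string labels for the fields and the function names are hypothetical, and a field and its conjugate are assumed to be given different labels.

```python
from collections import Counter
from math import factorial


def symmetry_denominator(fields):
    """P in (4.250): the product of n! over each set of n identical fields."""
    p = 1
    for count in Counter(fields).values():
        p *= factorial(count)
    return p


def vertex_rule(lagrangian_coefficient, fields):
    """Vertex factor -i X for a term c * (product of fields), where c = -X / P."""
    x = -lagrangian_coefficient * symmetry_denominator(fields)
    return -1j * x


if __name__ == "__main__":
    lam1, lam2 = 0.8, 1.2
    # The two terms of (4.251): -(lam1/4) phi^2 rho^2 and -(lam2/6) phi^3 rho.
    print(vertex_rule(-lam1 / 4.0, ["phi", "phi", "rho", "rho"]))
    print(vertex_rule(-lam2 / 6.0, ["phi", "phi", "phi", "rho"]))
```

With the conventional $1/P$ in the Lagrangian, the two vertices come out as simply $-i\lambda_1$ and $-i\lambda_2$.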
2 It is tempting to suggest that the Feynman rules themselves should be taken as the definition of the
theory. However, this would only be sufficient to describe phenomena that occur in a perturbative
weak-coupling expansion.
In this case, we must distinguish between lines for all three fields, because $\bar\Psi = \Psi^\dagger \gamma^0$
is independent of $\Psi$. For Dirac fermions, one draws solid lines with an arrow coming
in to a vertex representing $\Psi$ in $H_{\rm int}$, and an arrow coming out representing $\bar\Psi$. So
the Feynman rule for this interaction is:
Note that this Feynman rule is proportional to a 4 × 4 identity matrix in Dirac spinor
space. This is because the interaction Lagrangian can be written $-y\, \delta_a{}^b\, \phi\, \bar\Psi^a \Psi_b$,
where $a$ is the Dirac spinor index for $\bar\Psi$ and $b$ for $\Psi$. Often, one just suppresses the
spinor indices, and writes simply $-iy$ for the Feynman rule, with the identity matrix
implicit.
The interaction Lagrangian (4.252) is called a Yukawa coupling. This theory has
a real-world physical application: it is precisely the type of interaction that applies
between the Standard Model Higgs boson $\phi = h$ and each Dirac fermion $\Psi$, with the
coupling $y$ proportional to the mass of that fermion. We will return to this interaction
when we discuss the decays of the Higgs boson into fermion-antifermion pairs.
Let us turn next to the topic of internal lines in Feynman diagrams. These are
determined by the free (quadratic) part of the Lagrangian density. Recall that for a
scalar field, we can write the free Lagrangian after integrating by parts as:
\[
\mathcal{L}_0 = \frac{1}{2}\, \phi\, (-\partial_\mu \partial^\mu - m^2)\, \phi. \tag{4.253}
\]
This corresponded to a Feynman propagator rule for internal scalar lines $i/(p^2 -
m^2 + i\epsilon)$. So, up to the $i\epsilon$ factor, the propagator is just $i$ times
the inverse of the coefficient of the quadratic piece of the Lagrangian density, with
the replacement
\[
\partial_\mu \longrightarrow -i p_\mu. \tag{4.254}
\]
The free Lagrangian density for generic fields $\Phi_i$ can always be put into either the
form
\[
\mathcal{L}_0 = \frac{1}{2} \sum_{i,j} \Phi_i\, P_{ij}\, \Phi_j, \tag{4.255}
\]
for real fields, or the corresponding form (4.256)
for complex fields (including, for example, Dirac spinors). To accomplish this, one
may need to integrate the action by parts, throwing away a total derivative in $\mathcal{L}_0$ which
will not contribute to $S = \int d^4x\, \mathcal{L}$. Here $P_{ij}$ is a matrix that involves spacetime
derivatives and masses. Then it turns out that the Feynman propagator can be found
by making the replacement (4.254) and taking $i$ times the inverse of the matrix $P_{ij}$:
\[
i\, (P^{-1})_{ij}. \tag{4.257}
\]
This corresponds to an internal line in the Feynman diagram labeled by $i$ at one end
and $j$ at the other.
As an example, consider the free Lagrangian for a Dirac spinor $\Psi$, as given by
(4.26). According to the prescription of (4.256) and (4.257), the Feynman propagator
connecting vertices with spinor indices $a$ and $b$ should be:
\[
i \left[ (\slashed{p} - m)^{-1} \right]_a{}^b. \tag{4.258}
\]
In order to make sense of the inverse matrix, we can write it as a fraction, then
multiply numerator and denominator by $(\slashed{p} + m)$, and use the fact that $\slashed{p}\slashed{p} = p^2$
from (3.113):
\[
\frac{i}{\slashed{p} - m} = \frac{i (\slashed{p} + m)}{(\slashed{p} - m)(\slashed{p} + m)} = \frac{i (\slashed{p} + m)}{p^2 - m^2 + i\epsilon}. \tag{4.259}
\]
In the last line we have put in the $i\epsilon$ factor needed for loop diagrams as a prescription
for handling the possible singularity at $p^2 = m^2$. So the Feynman rule for a Dirac
fermion internal line is:
Here the arrow direction on the fermion line distinguishes the direction of particle
flow, with particles (anti-particles) moving with (against) the arrow. For electrons
and positrons, this means that the arrow on the propagator points in the direction
of the flow of negative charge. As indicated, the 4-momentum p μ appearing in the
propagator is also assigned to be in the direction of the arrow on the internal fermion
line.
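The identity behind (4.259), $(\slashed{p} - m)(\slashed{p} + m) = (p^2 - m^2)\,\mathbb{1}$, can be verified with explicit matrices. The sketch below uses the Dirac representation of the gamma matrices and the metric signature (+, −, −, −); the representation is an assumption made for this check (the identity itself is representation independent), and all helper names are hypothetical.

```python
# Pure-Python 4x4 complex matrices; the Dirac representation is an assumption.
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


def neg(m):
    return [[-x for x in row] for row in m]


ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
GAMMA0 = block(ONE2, ZERO2, ZERO2, neg(ONE2))
GAMMA_I = [block(ZERO2, s, neg(s), ZERO2) for s in PAULI]


def slash(p):
    """p-slash = gamma^0 E - gamma . p for a 4-vector (E, px, py, pz)."""
    e, px, py, pz = p
    return [[e * GAMMA0[i][j]
             - sum(c * g[i][j] for c, g in zip((px, py, pz), GAMMA_I))
             for j in range(4)] for i in range(4)]


def shifted(mat, m):
    """Return mat + m * identity."""
    return [[mat[i][j] + (m if i == j else 0) for j in range(4)]
            for i in range(4)]


def slash_product(p, m):
    """(p-slash - m)(p-slash + m), expected to equal (p^2 - m^2) * identity."""
    ps = slash(p)
    return matmul(shifted(ps, -m), shifted(ps, m))


if __name__ == "__main__":
    for row in slash_product((5, 1, 2, 3), 2):
        print(row)
```

For $p = (5, 1, 2, 3)$ and $m = 2$, $p^2 - m^2 = 25 - 14 - 4 = 7$, so the product should be 7 times the identity matrix.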
Next we turn to the question of Feynman rules for external particle and
anti-particle lines. At a fixed time $t = 0$, a generic field $\Phi$ is written as an expansion of
the form:
\[
\Phi(\vec x) = \sum_n \int d\tilde p\, \left[ i(\vec p, n)\, e^{i \vec p\cdot\vec x}\, a_{\vec p, n} + f(\vec p, n)\, e^{-i \vec p\cdot\vec x}\, b^\dagger_{\vec p, n} \right], \tag{4.260}
\]
where $a_{\vec p, n}$ and $b^\dagger_{\vec p, n}$ are annihilation and creation operators (which may or may
not be Hermitian conjugates of each other); $n$ is an index running over spins and
perhaps other labels for different particle types; and $i(\vec p, n)$ and $f(\vec p, n)$ are expansion
coefficients. In general, we build an interaction Hamiltonian out of the fields $\Phi$. When
acting on an initial state $a^\dagger_{\vec k, m} |0\rangle$ on the right, $H_{\rm int}$ will therefore produce a factor of
$i(\vec k, m)$ after commuting (or anticommuting, for fermions) the $a_{\vec p, n}$ operator in $\Phi$ to
the right, removing the $a^\dagger_{\vec k, m}$. Likewise, when acting on a final state $\langle 0| b_{\vec k, m}$ on the
left, the interaction Hamiltonian will produce a factor of $f(\vec k, m)$. Therefore, initial
and final state lines just correspond to the appropriate coefficient of annihilation and
creation operators in the Fourier mode expansion for that field.
For example, comparing (4.260) to (4.72) in the scalar case, we find that i(p, n)
and f (p, n) are both just equal to 1.
For Dirac fermions, we see from (4.89) that the coefficient for an initial state
particle (electron) carrying 4-momentum $p^\mu$ and spin state $s$ is $u(p, s)_a$, where $a$
is a spinor index. Similarly, the coefficient for a final state antiparticle (positron) is
$v(p, s)_a$. So the Feynman rules for these types of external particle lines are:
Here the blobs represent the rest of the Feynman diagram in each case. Similarly,
considering the expansion of the field $\bar\Psi$ in (4.90), we see that the coefficient for
an initial state antiparticle (positron) is $\bar v(p, s)^a$ and that for a final state particle
(electron) is $\bar u(p, s)^a$. So the Feynman rules for these external states are:
Note that in these rules, the $p^\mu$ label of an external state is always the physical
4-momentum of that particle or anti-particle; this means that with the standard
convention of initial state on the left and final state on the right, the $p^\mu$ associated with
each of $u(p, s)$, $v(p, s)$, $\bar u(p, s)$ and $\bar v(p, s)$ is always taken to be pointing to the
right. For $v(p, s)$ and $\bar v(p, s)$, this is in the opposite direction to the arrow on the
fermion line itself.
Problems
2. Prove that
3. Writing all Lorentz invariant terms for the Lagrangian of a scalar field φ with no
more than two powers of φ yields
\[
\mathcal{L} = \frac{1}{2}(\partial_\mu \phi)(\partial^\mu \phi) - \frac{1}{2} m^2 \phi^2 . \tag{4.263}
\]
However, why not add a term linear in φ such as φ, where is a constant?
Show that this extra linear term (“tadpole term” as it is often called) can be
eliminated by a suitable redefinition of the field φ.
4. In our computation of σ (φφ → φφ) within the scalar φ 4 theory the matrix ele-
ment was Mi→ f = −iλ. Let us suppose instead that it is
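\[
\mathcal{L} = \frac{1}{2}(\partial_\mu \phi_1)(\partial^\mu \phi_1) + \frac{1}{2}(\partial_\mu \phi_2)(\partial^\mu \phi_2) - \frac{1}{2} m_1^2 \phi_1^2 - \frac{1}{2} m_2^2 \phi_2^2 + m_3^2 \phi_1 \phi_2 + \mu \phi_1^3 .
\]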
Redefine this theory by rotating φ1 and φ2 such that the kinetic and mass terms
(bilinear in φi ) are canonical (i.e., diagonal). Then write down all the Feynman
rules of the theory. In other words, give all the external leg factors, the propaga-
tors, and the interaction vertices.
Hint: Expressions will be simpler if you find the rotation angle α between the
$\{\phi_1, \phi_2\}$ states and the mass eigenstates $\{\phi_1', \phi_2'\}$ in terms of $m_1$, $m_2$ and $m_3$ and
then write all interaction Feynman rules in terms of μ and the angle α. Also,
you'll need to find the mass-eigenstate masses $m_1'^2$ and $m_2'^2$ in terms of $m_1$, $m_2$
and $m_3$.
7. Draw all the Feynman diagrams corresponding to φφ → φφφφ scattering, to
order $\lambda^2$. You should find 10 distinct diagrams, which can be organized into two
distinct classes. Clearly label the particle 4-momenta for each external particle
and internal propagator line in each diagram. Use Feynman rules to write down
an expression for the matrix elements corresponding to these diagrams.
8. Consider the reduced matrix element obtained for the toy model example of
φφ → φφ scattering in $\phi^3$ theory in Sect. 4.6.
(a) Find the differential cross section $d\sigma/d\cos\theta$ as a function of μ, s, m, and cos θ.
(b) Find the total cross section in terms of μ, s, m, and simplify it as much as
possible.
(c) Show that the total cross section approaches a constant at threshold, and falls
like $1/s^2$ at high energy:
\[
\sigma_{\rm threshold} = \frac{N^2 \mu^4}{1152\pi m^6}, \tag{4.265}
\]
\[
\sigma_{s \gg m^2} = \frac{\mu^4}{16\pi m^2 s^2}, \tag{4.266}
\]
where N is a certain odd integer. [Hints: integrate directly in terms of the
variable cos θ. You are very likely to find the following definite integrals to
be useful:
\[
\int_{-1}^{1} \frac{dx}{a + bx} = \frac{1}{b} \ln\left(\frac{a+b}{a-b}\right), \tag{4.267}
\]
\[
\int_{-1}^{1} \frac{dx}{(a + bx)^2} = \frac{2}{a^2 - b^2}, \tag{4.268}
\]
\[
\int_{-1}^{1} \frac{dx}{(a + bx)(a - bx)} = \frac{1}{ab} \ln\left(\frac{a+b}{a-b}\right). \tag{4.269}
\]
]
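As a sanity check on the hinted integrals (4.267)–(4.269), the following sketch compares each closed form with a simple numerical quadrature. The helper names `simpson` and `check_integrals` and the sample values a = 3, b = 1 are illustrative choices, not part of the problem; the closed forms are assumed to apply for a > |b| > 0, so that the integrands stay finite on [−1, 1].

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(lo + k * h)
    return total * h / 3

def check_integrals(a, b):
    """Return (numeric, closed-form) pairs for (4.267)-(4.269); needs a > |b| > 0."""
    log_term = math.log((a + b) / (a - b))
    return [
        (simpson(lambda x: 1 / (a + b * x), -1, 1), log_term / b),
        (simpson(lambda x: 1 / (a + b * x) ** 2, -1, 1), 2 / (a**2 - b**2)),
        (simpson(lambda x: 1 / ((a + b * x) * (a - b * x)), -1, 1), log_term / (a * b)),
    ]

for numeric, exact in check_integrals(3.0, 1.0):
    print(f"numeric = {numeric:.10f}   closed form = {exact:.10f}")
```

Each printed pair should agree to the displayed precision; a disagreement would point to a sign or factor error in the closed form being used.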
In the theory of a free Dirac fermion field, use the anticommutation relations of
$b$, $b^\dagger$ to compute the commutators
\[
[H, b^\dagger_{\vec{k},s}] = \;? \quad \text{and} \tag{4.272}
\]
\[
[H, b_{\vec{k},s}] = \;? \tag{4.273}
\]
in the case of a free real scalar field φ. Rewrite this operator in terms of the $a_{\vec{p}}$
and $a^\dagger_{\vec{p}}$ operators, and show that the result is:
\[
\vec{P} = \int d\tilde{p}\; \vec{p}\; a^\dagger_{\vec{p}} a_{\vec{p}} \tag{4.275}
\]
(Hint: You will have to argue that certain terms vanish, including an apparently
infinite one, by carefully noting their behavior as $\vec{p} \to -\vec{p}$.) What is $\vec{P}$ acting on
the vacuum state $|0\rangle$? Compute the commutator:
Let us now see how all of the general rules we have developed so far apply in
the case of Quantum Electrodynamics. This is the quantum field theory governing
photons (quantized electromagnetic waves) and charged fermions and antifermions.
The fermions in the theory are represented by Dirac spinor fields carrying electric
charge Qe, where e is the magnitude of the charge of the electron. Thus Q = −1 for
electrons and positrons, +2/3 for up, charm and top quarks and their anti-quarks,
and −1/3 for down, strange and bottom quarks and their antiquarks. (Recall that a
single Dirac field, assigned a single value of Q, is used to describe both particles and
their anti-particles.) The free Lagrangian for the theory is:
\[
\mathcal{L}_0 = -\frac{1}{4} F^{\mu\nu} F_{\mu\nu} + \overline{\Psi}(i\gamma^\mu \partial_\mu - m)\Psi . \tag{5.1}
\]
Now, earlier we found that the electromagnetic field $A_\mu$ couples to the 4-current
density $J^\mu = (\rho, \vec{J})$ by a term in the Lagrangian $-e J^\mu A_\mu$ (see (4.39)). Since $J^\mu$
must be a four-vector built out of the charged fermion fields $\Psi$ and $\overline{\Psi}$, we can guess
that
\[
J^\mu = Q \overline{\Psi} \gamma^\mu \Psi . \tag{5.2}
\]
The interaction Lagrangian density for a fermion with charge Qe and electromagnetic
fields is therefore
\[
\mathcal{L}_{\rm int} = -e Q\, \overline{\Psi} \gamma^\mu \Psi A_\mu . \tag{5.3}
\]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 101
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_5
102 5 Quantum Electro-Dynamics (QED)
characteristic energy of the process. For very low energy experiments, the numerical
value is e ≈ 0.30282, corresponding to the experimental result for the fine structure
constant:
\[
\alpha \equiv \frac{e^2}{4\pi} \approx 1/137.036 . \tag{5.4}
\]
For experiments done at energies near 100 GeV, the appropriate value is a little larger,
more like e ≈ 0.313.
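The relation (5.4) between the coupling e and the fine structure constant is easy to check numerically. The sketch below simply evaluates α = e²/4π for the two values of e quoted above; the function name `fine_structure` is an illustrative choice.

```python
import math

def fine_structure(e):
    """Return alpha = e^2 / (4 pi), in units with hbar = c = 1."""
    return e**2 / (4 * math.pi)

alpha_low = fine_structure(0.30282)   # low-energy coupling quoted in the text
alpha_100 = fine_structure(0.313)     # approximate coupling near 100 GeV

print(f"1/alpha at low energy : {1 / alpha_low:.2f}")
print(f"1/alpha near 100 GeV  : {1 / alpha_100:.1f}")
```

The first value reproduces 1/α ≈ 137.04; the second comes out near 1/128, which shows how modest the change in e between the two energy scales is.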
Let us take a small detour to check that (5.2) really has the correct form and
normalization to be the electromagnetic current density. Consider the total charge
operator:
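\[
\widehat{Q} = \int d^3\vec{x}\; \rho(\vec{x}) = \int d^3\vec{x}\; J^0(\vec{x}) = Q \int d^3\vec{x}\; \overline{\Psi} \gamma^0 \Psi = Q \int d^3\vec{x}\; \Psi^\dagger \Psi . \tag{5.5}
\]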
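Plugging in (4.89) and (4.90), and doing the $\vec{x}$ integration, and one of the momentum
integrations using the resulting delta function, one finds
\[
\widehat{Q} = Q \sum_{s=1}^{2} \sum_{r=1}^{2} \int d\tilde{p}\; \frac{1}{2E_{\vec{p}}}\, [u^\dagger(p, s)\, b^\dagger_{\vec{p},s} + v^\dagger(p, s)\, d_{\vec{p},s}]\,[u(p, r)\, b_{\vec{p},r} + v(p, r)\, d^\dagger_{\vec{p},r}] . \tag{5.6}
\]
Taking into account that the operators $d$, $d^\dagger$ satisfy the anticommutation relation
(4.108), the result is
\[
\widehat{Q} = Q \sum_{s=1}^{2} \int d\tilde{p}\; \left[ b^\dagger_{\vec{p},s} b_{\vec{p},s} - d^\dagger_{\vec{p},s} d_{\vec{p},s} \right] . \tag{5.9}
\]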
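The vacuum carries no charge,
\[
\widehat{Q}|0\rangle = 0 , \tag{5.10}
\]
and the charge operator satisfies
\[
[\widehat{Q}, b^\dagger_{\vec{k},r}] = Q\, b^\dagger_{\vec{k},r} , \tag{5.11}
\]
\[
[\widehat{Q}, d^\dagger_{\vec{k},r}] = -Q\, d^\dagger_{\vec{k},r} . \tag{5.12}
\]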
5.1 QED Lagrangian and Feynman Rules 103
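Therefore,
\[
\widehat{Q}\, b^\dagger_{\vec{k},r}|0\rangle = Q\, b^\dagger_{\vec{k},r}|0\rangle \tag{5.13}
\]
for single-particle states, and
\[
\widehat{Q}\, d^\dagger_{\vec{k},r}|0\rangle = -Q\, d^\dagger_{\vec{k},r}|0\rangle \tag{5.14}
\]
for single anti-particle states. More generally, the eigenvalue of $\widehat{Q}$ acting on a state
with $N$ particles and $\overline{N}$ antiparticles is $(N - \overline{N})Q$. From (5.5), this verifies that the
charge density ρ is indeed the time-like component of the four-vector $Q\overline{\Psi}\gamma^\mu\Psi$,
which must therefore be equal to $J^\mu$.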
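The full QED Lagrangian is invariant under gauge transformations:
\[
A_\mu(x) \to A_\mu(x) - \frac{1}{e} \partial_\mu \theta(x), \tag{5.15}
\]
\[
\Psi(x) \to e^{iQ\theta}\, \Psi(x), \tag{5.16}
\]
\[
\overline{\Psi}(x) \to e^{-iQ\theta}\, \overline{\Psi}(x), \tag{5.17}
\]
where θ(x) is an arbitrary function of spacetime. This is most easily seen by
introducing the covariant derivative:
\[
D_\mu \Psi \equiv (\partial_\mu + iQeA_\mu)\Psi . \tag{5.18}
\]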
Here the term “covariant” refers to the gauge transformation symmetry, not the
Lorentz transformation symmetry as it did when we introduced covariant four-
vectors. Note that the covariant derivative actually depends on the charge of the
field it acts on. Now one can write the full Lagrangian density as
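\[
\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_{\rm int} = -\frac{1}{4} F^{\mu\nu} F_{\mu\nu} + i\overline{\Psi}\gamma^\mu D_\mu \Psi - m\overline{\Psi}\Psi . \tag{5.19}
\]
The ordinary derivative of the spinor transforms under the gauge transformation with
an “extra” term:
\[
\partial_\mu \Psi \to \partial_\mu (e^{iQ\theta}\Psi) = e^{iQ\theta}\left[\partial_\mu \Psi + iQ(\partial_\mu\theta)\Psi\right] . \tag{5.20}
\]
The point of the covariant derivative of $\Psi$ is that it transforms under the gauge
transformation the same way $\Psi$ does, by acquiring a phase:
\[
D_\mu \Psi \to e^{iQ\theta} D_\mu \Psi . \tag{5.21}
\]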
Here the contribution from the transformation of Aμ in Dμ cancels the extra term in
(5.20). Using (5.15)–(5.17) and (5.21), it is easy to see that L is invariant under the
gauge transformation, since the multiplicative phase factors just cancel.
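The cancellation behind (5.21) can be written out explicitly. The following worked step is a sketch that assumes the covariant derivative has the form $D_\mu = \partial_\mu + iQeA_\mu$, which is the form that reproduces the interaction term $-eQ\overline{\Psi}\gamma^\mu\Psi A_\mu$ when inserted into (5.19); it uses only the transformations (5.15) and (5.16).

```latex
\begin{aligned}
D_\mu\Psi = (\partial_\mu + iQeA_\mu)\Psi
&\;\to\; \Bigl[\partial_\mu + iQe\bigl(A_\mu - \tfrac{1}{e}\partial_\mu\theta\bigr)\Bigr]\bigl(e^{iQ\theta}\Psi\bigr) \\
&= e^{iQ\theta}\bigl[\partial_\mu\Psi + iQ(\partial_\mu\theta)\Psi\bigr]
 + e^{iQ\theta}\bigl[iQeA_\mu\Psi - iQ(\partial_\mu\theta)\Psi\bigr] \\
&= e^{iQ\theta}(\partial_\mu + iQeA_\mu)\Psi
 = e^{iQ\theta}D_\mu\Psi .
\end{aligned}
```

The two terms proportional to $\partial_\mu\theta$ cancel, so $\overline{\Psi}\gamma^\mu D_\mu\Psi$ picks up the phases $e^{-iQ\theta}e^{iQ\theta} = 1$ and is invariant, as is $\overline{\Psi}\Psi$.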
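Returning to the interaction term (5.3), we can now identify the Feynman rule for
QED interactions, by following the general prescription outlined with (4.250):
[Vertex diagram not reproduced: a photon line with vector index μ attached to a fermion line, with vertex factor $-iQe\gamma^\mu$.]
The photon field has the mode expansion:
\[
A^\mu = \sum_{\lambda=0}^{3} \int d\tilde{p}\, \left[ \epsilon^\mu(p, \lambda)\, e^{i\vec{p}\cdot\vec{x}}\, a_{\vec{p},\lambda} + \epsilon^{\mu *}(p, \lambda)\, e^{-i\vec{p}\cdot\vec{x}}\, a^\dagger_{\vec{p},\lambda} \right] . \tag{5.22}
\]
The operators $a^\dagger_{\vec{p},\lambda}$ and $a_{\vec{p},\lambda}$ act on the vacuum state by creating and destroying
photons with momentum $\vec{p}$ and polarization vector $\epsilon^\mu(p, \lambda)$.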
However, not all of the four degrees of freedom labeled by λ can be physical. From
classical electromagnetism, we know that electromagnetic waves are transversely
polarized. This means that the electric and magnetic fields are perpendicular to the
3-momentum direction of propagation. In terms of the potentials, it means that one can
always choose a gauge in which $A^0 = 0$ and the Lorenz gauge condition $\partial_\mu A^\mu = 0$
is satisfied. Therefore, physical electromagnetic wave quanta corresponding to the
classical solutions to Maxwell’s equations $A^\mu = \epsilon^\mu e^{-ip\cdot x}$ with $p^2 = 0$ can be taken
to obey:
\[
\epsilon^\mu = (0, \vec{\epsilon}\,), \qquad p_\mu \epsilon^\mu = 0
\]
(or, equivalent to the last condition, $\vec{\epsilon}\cdot\vec{p} = 0$). After imposing these two conditions,
only two of the four λ’s will survive as valid initial or final states for any given $p^\mu$.
For example, suppose that a state contains a photon with 3-momentum $\vec{p} = p\hat{z}$,
so $p^\mu = (p, 0, 0, p)$. Then we can choose a basis of transverse linearly-polarized
vectors with λ = 1, 2:
\[
\epsilon^\mu(p, 1) = (0, 1, 0, 0) \quad x\text{-polarized},
\]
\[
\epsilon^\mu(p, 2) = (0, 0, 1, 0) \quad y\text{-polarized}.
\]
However, in high-energy physics it is often more useful to instead use a basis of left-
and right-handed circular polarizations that carry definite helicities λ = R, L:
\[
\epsilon^\mu(p, R) = \frac{1}{\sqrt{2}}(0, 1, i, 0) \quad \text{right-handed}, \tag{5.28}
\]
\[
\epsilon^\mu(p, L) = \frac{1}{\sqrt{2}}(0, 1, -i, 0) \quad \text{left-handed}. \tag{5.29}
\]
In general, incoming photon lines have a Feynman rule $\epsilon^\mu(p, \lambda)$ and outgoing photon
lines have a Feynman rule $\epsilon^{\mu *}(p, \lambda)$, where λ = 1, 2 in some convenient basis of
choice. Often, we will sum or average over the polarization labels λ, so the $\epsilon^\mu(p, \lambda)$
will not need to be listed explicitly for a given momentum.
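The transversality and normalization of these polarization vectors can be checked directly. The sketch below assumes the metric signature (+,−,−,−) (consistent with $s = (p_a + p_b)^2 > 0$ used later), a photon of hypothetical energy 5 moving along +z, and linear polarizations along x and y; the names `mdot` and `POLARIZATIONS` are illustrative.

```python
import math

METRIC = (1, -1, -1, -1)  # assumed signature (+,-,-,-)

def mdot(a, b, conjugate_first=False):
    """Minkowski product of two 4-vectors, optionally conjugating the first."""
    total = 0j
    for g, x, y in zip(METRIC, a, b):
        x = complex(x)
        total += g * (x.conjugate() if conjugate_first else x) * complex(y)
    return total

p = (5.0, 0.0, 0.0, 5.0)        # massless photon along +z (hypothetical energy)
r = 1 / math.sqrt(2)
POLARIZATIONS = {
    "x": (0, 1, 0, 0),          # linear polarization along x
    "y": (0, 0, 1, 0),          # linear polarization along y
    "R": (0, r, 1j * r, 0),     # right-handed, (5.28)
    "L": (0, r, -1j * r, 0),    # left-handed, (5.29)
}

for name, eps in POLARIZATIONS.items():
    transverse = abs(mdot(p, eps)) < 1e-12           # p.eps = 0
    norm = mdot(eps, eps, conjugate_first=True)      # eps*.eps, expected -1
    print(name, "transverse:", transverse, " eps*.eps =", round(norm.real, 12))
```

With this signature each physical polarization vector has $\epsilon^*\cdot\epsilon = -1$, and the R and L vectors are mutually orthogonal, so either pair can serve as the basis for λ.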
Let us next construct the Feynman propagator for photon lines. The free
Lagrangian density given in (4.34) can be rewritten as:
\[
\mathcal{L}_0 = \frac{1}{2} A^\mu \left( g_{\mu\nu} \partial_\rho \partial^\rho - \partial_\mu \partial_\nu \right) A^\nu , \tag{5.30}
\]
where we have dropped a total derivative. (The action, obtained by integrating $\mathcal{L}_0$,
does not depend on total derivative terms.) Therefore, following the prescription of
(4.257), it appears that we ought to find the propagator by finding the inverse of the
4 × 4 matrix
\[
P_{\mu\nu} = -p^2 g_{\mu\nu} + p_\mu p_\nu . \tag{5.31}
\]
Unfortunately, however, this matrix is not invertible, since $P_{\mu\nu} p^\nu = 0$. The reason
for this can be traced to the gauge invariance of the theory; not all of the states we
are attempting to propagate are really physical.
This problem can be avoided using a trick, due to Fermi, called “gauge fixing”.
As long as we agree to stick to the Lorenz gauge, ∂μ Aμ = 0, we can add a term to
the Lagrangian density proportional to (∂μ Aμ )2 :
\[
\mathcal{L}_0^{(\xi)} = \mathcal{L}_0 - \frac{1}{2\xi} (\partial_\mu A^\mu)^2 . \tag{5.32}
\]
In Lorenz gauge, not only does the extra term vanish, but also its contribution to the
equations of motion vanishes. Here ξ is an arbitrary new gauge-fixing parameter;
it can be picked at will. Intermediate steps in a calculation may depend on it, but
physical results should not depend on the choice of ξ. The new term in the modified
free Lagrangian $\mathcal{L}_0^{(\xi)}$ is called the gauge-fixing term. Now the matrix to be inverted
is:
\[
P_{\mu\nu} = -p^2 g_{\mu\nu} + \left(1 - \frac{1}{\xi}\right) p_\mu p_\nu . \tag{5.33}
\]
To find the inverse, one notices that as a tensor, $(P^{-1})^{\nu\rho}$ can only be a linear
combination of terms proportional to $g^{\nu\rho}$ and to $p^\nu p^\rho$. So, writing the most general
possible form for the answer:
\[
(P^{-1})^{\nu\rho} = C_1\, g^{\nu\rho} + C_2\, p^\nu p^\rho \tag{5.34}
\]
and requiring $P_{\mu\nu}(P^{-1})^{\nu\rho} = \delta_\mu^\rho$, one finds $C_1 = -1/p^2$ and $C_2 = (1 - \xi)/(p^2)^2$.
It follows that the desired Feynman propagator for a photon with momentum $p^\mu$ is:
\[
\frac{i}{p^2 + i\epsilon} \left[ -g_{\mu\nu} + (1 - \xi) \frac{p_\mu p_\nu}{p^2} \right] . \tag{5.37}
\]
(Here we have put in the $i\epsilon$ factor in the denominator as usual.) In a Feynman diagram,
this propagator corresponds to an internal wavy line, labeled by μ and ν at opposite
ends, and carrying 4-momentum p. The gauge-fixing parameter ξ can be chosen at
the convenience or whim of the person computing the Feynman diagram. The most
popular choice for simple calculations is ξ = 1, called Feynman gauge. Then the
Feynman propagator for photons is simply:
\[
\frac{-i g_{\mu\nu}}{p^2 + i\epsilon} \quad \text{(Feynman gauge)}. \tag{5.38}
\]
Another common choice is ξ = 0, known as Landau gauge, for which the Feynman
propagator is:
\[
\frac{-i}{p^2 + i\epsilon} \left[ g_{\mu\nu} - \frac{p_\mu p_\nu}{p^2} \right] \quad \text{(Landau gauge)}. \tag{5.39}
\]
(Comparing to (5.32), we see that this is really obtained as a formal limit ξ → 0.)
The Landau gauge photon propagator has the nice property that it vanishes when con-
tracted with either p μ or p ν , which can make some calculations simpler (especially
certain loop diagram calculations). Sometimes it is useful to just leave ξ unspecified.
Even though this means having to calculate more terms, the payoff is that in the end
one can see if the final answer for the reduced matrix element is independent of ξ ,
providing a consistency check.
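The inversion that leads to the photon propagator can be spot-checked numerically. The sketch below builds the matrix (5.33) and the candidate inverse $(1/p^2)[-g^{\nu\rho} + (1-\xi)p^\nu p^\rho/p^2]$ for a hypothetical off-shell momentum and an arbitrary value of ξ, and multiplies them; the metric signature (+,−,−,−) and the function names are assumptions of this illustration.

```python
def photon_matrices(p, xi):
    """Return (P, Pinv): P_{mu nu} of (5.33) with lower indices, and the
    candidate inverse with upper indices, for metric signature (+,-,-,-)."""
    g = [[0.0] * 4 for _ in range(4)]
    for i, sign in enumerate((1.0, -1.0, -1.0, -1.0)):
        g[i][i] = sign
    p_up = list(p)                                   # p^mu
    p_lo = [g[i][i] * p_up[i] for i in range(4)]     # p_mu
    p2 = sum(p_up[i] * p_lo[i] for i in range(4))    # p^2, must be nonzero
    P = [[-p2 * g[m][n] + (1 - 1 / xi) * p_lo[m] * p_lo[n] for n in range(4)]
         for m in range(4)]
    Pinv = [[(-g[n][r] + (1 - xi) * p_up[n] * p_up[r] / p2) / p2 for r in range(4)]
            for n in range(4)]
    return P, Pinv

def product(P, Pinv):
    """Contract the second index of P with the first index of Pinv."""
    return [[sum(P[m][n] * Pinv[n][r] for n in range(4)) for r in range(4)]
            for m in range(4)]

P, Pinv = photon_matrices((3.0, 1.0, 2.0, 0.5), xi=0.7)
for row in product(P, Pinv):
    print([round(x, 12) for x in row])
```

The product should come out as the 4 × 4 identity for any nonzero finite ξ and any momentum with $p^2 \neq 0$; the construction fails only when the gauge-fixing term is removed altogether.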
We have now encountered most of the Feynman rules for QED. There are some
additional rules having to do with minus signs because of Fermi-Dirac statistics;
these can be understood by carefully considering the effects of anticommutation
relations for fermionic operators. In practice, one usually does not write out the
spinor indices explicitly. All of the rules are summarized in the following two pages
in a cookbook form. Of course, the best way to understand how the rules work is to
do some examples. That will be the subject of the next few sections.
To find the contributions to the reduced matrix element M for a physical process
involving charged Dirac fermions and photons:
1. Draw all topologically distinct Feynman diagrams, with wavy lines representing
photons, and solid lines with arrows representing fermions, using the rules below
for external lines, internal lines, and interaction vertices. The arrow direction is
preserved when following each fermion line. Enforce four-momentum conserva-
tion at each vertex.
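2. For external lines, write (with 4-momentum $p^\mu$ always to the right, and spin
polarization $s$ or $\lambda$ as appropriate):

    initial-state fermion (electron):      $u(p, s)$
    final-state fermion (electron):        $\overline{u}(p, s)$
    initial-state antifermion (positron):  $\overline{v}(p, s)$
    final-state antifermion (positron):    $v(p, s)$
    initial-state photon:                  $\epsilon_\mu(p, \lambda)$
    final-state photon:                    $\epsilon^*_\mu(p, \lambda)$

3. For internal fermion lines, write:
\[
\frac{i(\slashed{p} + m)}{p^2 - m^2 + i\epsilon}
\]
with 4-momentum $p^\mu$ along the arrow direction, and m the mass of the fermion.
For internal photon lines, write:
\[
\frac{i}{p^2 + i\epsilon} \left[ -g_{\mu\nu} + (1 - \xi) \frac{p_\mu p_\nu}{p^2} \right]
\]
with 4-momentum $p^\mu$ along either direction in the wavy line. (Use ξ = 1 for
Feynman gauge and ξ = 0 for Landau gauge.)
4. For interaction vertices, write:
\[
-iQe\gamma^\mu
\]
The vector index μ is to be contracted with the corresponding index on the photon
line to which it is connected. This will be either an external photon line factor of
$\epsilon_\mu$ or $\epsilon^*_\mu$, or an internal photon line propagator index. Note that the fermion f
coming into the vertex must be the same flavor as the fermion coming out of the
vertex; for example, there is no photon-muon-positron vertex.
5. For each loop momentum $\ell^\mu$ that is undetermined by four-momentum
conservation with fixed external-state momenta, perform an integration
\[
\int \frac{d^4\ell}{(2\pi)^4} . \tag{5.40}
\]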
Getting a finite answer from these loop integrations often requires that they be
regularized by introducing a cutoff or some other trick.
6. Put a factor of (−1) for each closed fermion loop.
7. To take into account suppressed spinor indices on fermion lines, write terms
involving spinors as follows. For fermion lines that go all the way through the
diagram, start at the end of each fermion line (as defined by the arrow direction)
with a factor $\overline{u}$ or $\overline{v}$, and write down factors of $\gamma^\mu$ or $(\slashed{p} + m)$ consecutively,
following the line backwards until a $u$ or $v$ spinor is reached. For closed fermion
loops, start at an arbitrary vertex on the loop, and follow the fermion line
backwards until the original point is reached; take a trace over the gamma matrices in
the closed loop.
8. If a Feynman diagram with one or more closed loops can be transformed into
an exact copy of itself by interchanging any number of internal lines through a
smooth deformation without moving the external lines, then there is an additional
symmetry factor of 1/N , where N is the number of distinct permutations of that
type.
9. After writing down the contributions from each diagram to the reduced matrix
element $\mathcal{M}$ according to the preceding rules, assign an additional relative minus
sign between different diagram contributions whenever the written ordering of
external state spinor wavefunctions $u$, $v$, $\overline{u}$, $\overline{v}$ differs by an odd permutation.
5.2 Electron-Positron Scattering
5.2.1 e− e+ → μ− μ+
In the next few sections we will study some of the basic scattering processes in QED,
using the Feynman rules found in the previous section and the general discussion
of cross-sections given in Sect. 4.5. These calculations will involve several thematic
tricks that are common to many Feynman diagram evaluations.
We begin with electron-positron annihilation into a muon-antimuon pair:
e− e+ → μ− μ+ .
We will calculate the differential and total cross-sections for this process in the
center-of-momentum frame, to leading order in the coupling e. Since the mass of
the muon (and the anti-muon) is about $m_\mu = 105.66$ MeV, this process requires a
center-of-momentum energy of at least $\sqrt{s} = 211.3$ MeV. By contrast, the mass of
the electron is only about $m_e = 0.511$ MeV, which we can therefore safely neglect.
The error made in doing so is far less than the error made by not including higher-
order corrections.
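A good first step is to label the momentum and spin data for the initial state
electron and positron and the final state muon and anti-muon:

    Particle   Momentum   Spin    Spinor
    e−         $p_a$      $s_a$   $u(p_a, s_a)$
    e+         $p_b$      $s_b$   $\overline{v}(p_b, s_b)$
    μ−         $k_1$      $s_1$   $\overline{u}(k_1, s_1)$
    μ+         $k_2$      $s_2$   $v(k_2, s_2)$

At order $e^2$, there is only one Feynman diagram for this process. Here it is:
[Feynman diagram not reproduced: the e− and e+ annihilate into a virtual photon carrying momentum $p_a + p_b$, which produces the μ−μ+ pair.]
Applying the rules for QED to turn this picture into a formula for the reduced matrix
element, we find:
\[
\mathcal{M} = \left[ \overline{v}(p_b, s_b)\, (ie\gamma^\mu)\, u(p_a, s_a) \right] \left[ \frac{-i g_{\mu\nu}}{(p_a + p_b)^2} \right] \left[ \overline{u}(k_1, s_1)\, (ie\gamma^\nu)\, v(k_2, s_2) \right] . \tag{5.42}
\]
The $\overline{v}(p_b, s_b)\, (ie\gamma^\mu)\, u(p_a, s_a)$ part is obtained by starting at the end of the
electron-positron line with the positron external state spinor, and following it backwards
to its beginning. The interaction vertex is $-iQe\gamma^\mu = ie\gamma^\mu$, since the charge of the
electron and muon is Q = −1. Likewise, the $\overline{u}(k_1, s_1)\, (ie\gamma^\nu)\, v(k_2, s_2)$ part is obtained
by starting at the end of the muon-antimuon line with the muon external state spinor,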
and following it backwards. The photon propagator is written in Feynman gauge, for
simplicity, and carries indices μ and ν that connect to the two fermion lines at their
respective interaction vertices.
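We can write this result more compactly by using abbreviations $\overline{v}(p_b, s_b) \equiv \overline{v}_b$
and $u(p_a, s_a) \equiv u_a$, etc. Writing the denominator of the photon propagator as the
Mandelstam variable $s = (p_a + p_b)^2 = (k_1 + k_2)^2$, and using the metric in the
photon propagator to lower the index on one of the gamma matrices, we get:
\[
\mathcal{M} = i \frac{e^2}{s} (\overline{v}_b \gamma_\mu u_a)(\overline{u}_1 \gamma^\mu v_2) . \tag{5.43}
\]
The complex square of the reduced matrix element is therefore:
\[
|\mathcal{M}|^2 = \frac{e^4}{s^2} (\overline{v}_b \gamma_\mu u_a)(\overline{u}_1 \gamma^\mu v_2)(\overline{v}_b \gamma_\nu u_a)^* (\overline{u}_1 \gamma^\nu v_2)^* . \tag{5.44}
\]
Evaluating the complex conjugated terms in parentheses can be done systematically
by taking the Hermitian conjugate of the Dirac spinors and matrices they are made
of, taking care to write them in the reverse order. So, for example,
\[
(\overline{v}_b \gamma_\nu u_a)^* = (v_b^\dagger \gamma^0 \gamma_\nu u_a)^* = u_a^\dagger \gamma_\nu^\dagger \gamma^0 v_b = u_a^\dagger \gamma^0 \gamma_\nu v_b = \overline{u}_a \gamma_\nu v_b . \tag{5.45}
\]
The third equality follows from the identity (A.2.5), which implies $\gamma_\nu^\dagger \gamma^0 = \gamma^0 \gamma_\nu$.
Similarly,
\[
(\overline{u}_1 \gamma^\nu v_2)^* = \overline{v}_2 \gamma^\nu u_1 . \tag{5.46}
\]
Therefore,
\[
|\mathcal{M}|^2 = \frac{e^4}{s^2} (\overline{v}_b \gamma_\mu u_a)(\overline{u}_a \gamma_\nu v_b)\, (\overline{u}_1 \gamma^\mu v_2)(\overline{v}_2 \gamma^\nu u_1) . \tag{5.48}
\]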
At this point, we could work out explicit forms for the external state spinors and
plug (5.48) into (4.190) to find the differential cross-section for any particular set of
spins. However, this is not very convenient, and fortunately it is not necessary either.
In a real experiment, the final state spins of the muon and anti-muon are typically not
measured. Therefore, to find the total cross-section for all possible final states, we
should sum over s1 and s2 . Also, if the initial-state electron spin states are unknown,
we should average over sa and sb . (One must average, not sum, over the initial-state
spins, because sa and sb cannot simultaneously take on both spin-up and spin-down
values; there is only one initial state, even if it is unknown.) These spin sums and
averages will allow us to exploit the identities (3.109) and (3.110) (also listed in
Appendix B as (A.2.29) and (A.2.30)), so that the explicit forms for the spinors are
never needed.
After doing the spin sum and average, the differential cross-section must be sym-
metric under rotations about the collision axis. This is because the only special direc-
tions in the problem are the momenta of the particles, so that the cross-section can
only depend on the angle θ between the collision axis determined by the two initial-
state particles and the scattering axis determined by the two final-state particles. So,
we can apply (4.191) to obtain:
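\[
\frac{d\sigma}{d(\cos\theta)} = \left( \frac{1}{2} \sum_{s_a} \frac{1}{2} \sum_{s_b} \sum_{s_1} \sum_{s_2} |\mathcal{M}|^2 \right) \frac{|\vec{k}_1|}{32\pi s |\vec{p}_a|} \tag{5.49}
\]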
in the center-of-momentum frame, with the effects of initial state spin averaging and
final state spin summing now included.
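We can now use (A.2.29) and (A.2.30), which in the present situation imply
\[
\sum_{s_a} u_a \overline{u}_a = \slashed{p}_a + m_e , \tag{5.50}
\]
\[
\sum_{s_2} v_2 \overline{v}_2 = \slashed{k}_2 - m_\mu . \tag{5.51}
\]
Neglecting $m_e$, we obtain:
\[
\frac{1}{2} \sum_{s_a} \frac{1}{2} \sum_{s_b} \sum_{s_1} \sum_{s_2} |\mathcal{M}|^2 = \frac{e^4}{4 s^2} \sum_{s_b, s_1} (\overline{v}_b \gamma_\mu \slashed{p}_a \gamma_\nu v_b)\, (\overline{u}_1 \gamma^\mu [\slashed{k}_2 - m_\mu] \gamma^\nu u_1) . \tag{5.52}
\]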
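Now we apply another trick. A dot product of two vectors is equal to the trace of the
vectors multiplied in the opposite order to form a matrix:
\[
\begin{pmatrix} a_1 & a_2 & a_3 & a_4 \end{pmatrix}
\begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix}
= \mathrm{Tr}\left[ \begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix}
\begin{pmatrix} a_1 & a_2 & a_3 & a_4 \end{pmatrix} \right]
\equiv \mathrm{Tr}\begin{pmatrix}
b_1 a_1 & b_1 a_2 & b_1 a_3 & b_1 a_4 \\
b_2 a_1 & b_2 a_2 & b_2 a_3 & b_2 a_4 \\
b_3 a_1 & b_3 a_2 & b_3 a_3 & b_3 a_4 \\
b_4 a_1 & b_4 a_2 & b_4 a_3 & b_4 a_4
\end{pmatrix} . \tag{5.53}
\]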
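The identity (5.53) is elementary but worth seeing once with numbers. In this minimal sketch the vectors `a` and `b` are arbitrary sample values and the helper names are illustrative.

```python
def dot(a, b):
    """Row vector a times column vector b."""
    return sum(x * y for x, y in zip(a, b))

def outer(b, a):
    """Matrix with entries b_i a_j (column vector b times row vector a)."""
    return [[bi * aj for aj in a] for bi in b]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

a = [1.0, -2.0, 0.5, 3.0]
b = [4.0, 0.25, -1.0, 2.0]

print(dot(a, b), trace(outer(b, a)))
```

Both numbers are the same because the diagonal entries of the outer product are exactly the terms $b_i a_i$ of the dot product. The spinor version works identically, with the row vector replaced by $\overline{v}_b$ or $\overline{u}_1$ and the column vector by everything that follows it.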
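Applying this to each expression in parentheses in (5.52), we move the barred spinor
(thought of as a row vector) to the end and take the trace over the resulting 4 × 4
Dirac spinor matrix. So:
\[
\overline{v}_b \gamma_\mu \slashed{p}_a \gamma_\nu v_b = \mathrm{Tr}[\gamma_\mu \slashed{p}_a \gamma_\nu v_b \overline{v}_b] ,
\]
\[
\overline{u}_1 \gamma^\mu (\slashed{k}_2 - m_\mu) \gamma^\nu u_1 = \mathrm{Tr}[\gamma^\mu (\slashed{k}_2 - m_\mu) \gamma^\nu u_1 \overline{u}_1] ,
\]
so that
\[
\frac{1}{4} \sum_{\rm spins} |\mathcal{M}|^2 = \frac{e^4}{4 s^2} \sum_{s_b} \sum_{s_1} \mathrm{Tr}[\gamma_\mu \slashed{p}_a \gamma_\nu v_b \overline{v}_b]\; \mathrm{Tr}[\gamma^\mu (\slashed{k}_2 - m_\mu) \gamma^\nu u_1 \overline{u}_1] . \tag{5.56}
\]
The reason this trick of rearranging into a trace is useful is that now we can once
again exploit the spin-sum identities (A.2.29) and (A.2.30), this time in the form:
\[
\sum_{s_b} v_b \overline{v}_b = \slashed{p}_b - m_e , \tag{5.57}
\]
\[
\sum_{s_1} u_1 \overline{u}_1 = \slashed{k}_1 + m_\mu . \tag{5.58}
\]
Neglecting $m_e$ again, we obtain:
\[
\frac{1}{4} \sum_{\rm spins} |\mathcal{M}|^2 = \frac{e^4}{4 s^2}\, \mathrm{Tr}[\gamma_\mu \slashed{p}_a \gamma_\nu \slashed{p}_b]\; \mathrm{Tr}[\gamma^\mu (\slashed{k}_2 - m_\mu) \gamma^\nu (\slashed{k}_1 + m_\mu)] . \tag{5.59}
\]
The first trace is
\[
\mathrm{Tr}[\gamma_\mu \slashed{p}_a \gamma_\nu \slashed{p}_b] = 4 \left[ p_{a\mu} p_{b\nu} - g_{\mu\nu} (p_a \cdot p_b) + p_{a\nu} p_{b\mu} \right] ,
\]
where we have used the general result for the trace of four gamma matrices listed in
(A.2.10). Similarly, making use of the fact that the trace of an odd number of gamma
matrices is zero:
\[
\mathrm{Tr}[\gamma^\mu (\slashed{k}_2 - m_\mu) \gamma^\nu (\slashed{k}_1 + m_\mu)] = \mathrm{Tr}[\gamma^\mu \slashed{k}_2 \gamma^\nu \slashed{k}_1] - m_\mu^2\, \mathrm{Tr}[\gamma^\mu \gamma^\nu]
= 4 \left[ k_2^\mu k_1^\nu - g^{\mu\nu} (k_1 \cdot k_2) + k_2^\nu k_1^\mu \right] - 4 m_\mu^2\, g^{\mu\nu} ,
\]
where (A.2.8)–(A.2.10) have been used. Taking the product of the two traces, one
finds that the answer reduces to simply:
\[
\frac{1}{4} \sum_{\rm spins} |\mathcal{M}|^2 = \frac{8 e^4}{s^2} \left[ (p_a \cdot k_2)(p_b \cdot k_1) + (p_a \cdot k_1)(p_b \cdot k_2) + (p_a \cdot p_b)\, m_\mu^2 \right] . \tag{5.65}
\]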
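The trace algebra leading to (5.65) can be cross-checked by brute force with explicit 4 × 4 matrices. The sketch below assumes the Dirac representation of the gamma matrices and the metric signature (+,−,−,−); traces do not depend on the representation, so this choice is only a convenience. It evaluates the contracted product of the two traces in (5.59) numerically and compares it with the bracket of (5.65), with the common factor $e^4/s^2$ stripped off. All function names and the sample kinematics are illustrative.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)  # assumed signature (+,-,-,-)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def chain(*ms):
    out = ms[0]
    for m in ms[1:]:
        out = matmul(out, m)
    return out

def trace(m):
    return sum(m[i][i] for i in range(4))

def gamma_matrices():
    """gamma^mu in the Dirac representation (an assumed basis)."""
    pauli = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
    gammas = [[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]]
    for sg in pauli:
        g = [[0j] * 4 for _ in range(4)]
        for i in range(2):
            for j in range(2):
                g[i][j + 2] = sg[i][j]     # upper-right block: +sigma
                g[i + 2][j] = -sg[i][j]    # lower-left block: -sigma
        gammas.append(g)
    return gammas

GAMMA = gamma_matrices()

def mdot(a, b):
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))

def slash(p, mass=0.0):
    """Return p-slash + mass * identity, with p-slash = gamma^mu p_mu."""
    return [[sum(METRIC[m] * p[m] * GAMMA[m][i][j] for m in range(4))
             + (mass if i == j else 0.0) for j in range(4)] for i in range(4)]

def trace_side(pa, pb, k1, k2, m):
    """(1/4) Tr[g_mu pa/ g_nu pb/] Tr[g^mu (k2/ - m) g^nu (k1/ + m)], indices summed."""
    total = 0
    for mu in range(4):
        for nu in range(4):
            t1 = trace(chain(GAMMA[mu], slash(pa), GAMMA[nu], slash(pb)))
            t2 = trace(chain(GAMMA[mu], slash(k2, -m), GAMMA[nu], slash(k1, m)))
            total += METRIC[mu] * METRIC[nu] * t1 * t2
    return total / 4

def closed_form(pa, pb, k1, k2, m):
    """Right-hand side of (5.65) without the factor e^4/s^2."""
    return 8 * (mdot(pa, k2) * mdot(pb, k1) + mdot(pa, k1) * mdot(pb, k2)
                + mdot(pa, pb) * m**2)

# Sample kinematics in units of the muon mass: m = 1, beam momentum P = 2.
m, P, theta = 1.0, 2.0, 0.7
K = math.sqrt(P**2 - m**2)
pa, pb = (P, 0, 0, P), (P, 0, 0, -P)
k1 = (P, 0, K * math.sin(theta), K * math.cos(theta))
k2 = (P, 0, -K * math.sin(theta), -K * math.cos(theta))
print(trace_side(pa, pb, k1, k2, m).real, closed_form(pa, pb, k1, k2, m))
```

The two printed numbers should agree to rounding error, and the imaginary part of the trace side should vanish. Because the check uses explicit matrices, it is independent of the trace identities in Appendix A.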
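Our next task is to work out the kinematic quantities appearing in (5.65). Let
us call P and K the magnitudes of the 3-momenta of the electron and the muon,
respectively. We assume that the electron is initially moving in the +z direction, and
the muon makes an angle θ with respect to the positive z axis, within the yz plane.
The on-shell conditions for the particles are:
\[
p_a^2 = p_b^2 = m_e^2 \approx 0 , \qquad k_1^2 = k_2^2 = m_\mu^2 .
\]
Then we have:
\[
p_a^\mu = (P, 0, 0, P) , \qquad p_b^\mu = (P, 0, 0, -P) ,
\]
\[
k_1^\mu = (\sqrt{K^2 + m_\mu^2},\, 0,\, K \sin\theta,\, K \cos\theta) , \qquad
k_2^\mu = (\sqrt{K^2 + m_\mu^2},\, 0,\, -K \sin\theta,\, -K \cos\theta) ,
\]
and, using $s = (p_a + p_b)^2 = 4P^2 = 4(K^2 + m_\mu^2)$,
\[
p_a \cdot p_b = 2P^2 = \frac{s}{2} , \tag{5.74}
\]
\[
p_a \cdot k_1 = p_b \cdot k_2 = P \sqrt{K^2 + m_\mu^2} - P K \cos\theta = \frac{s}{4} \left[ 1 - \cos\theta \sqrt{1 - \frac{4 m_\mu^2}{s}} \right] , \tag{5.75}
\]
\[
p_a \cdot k_2 = p_b \cdot k_1 = P \sqrt{K^2 + m_\mu^2} + P K \cos\theta = \frac{s}{4} \left[ 1 + \cos\theta \sqrt{1 - \frac{4 m_\mu^2}{s}} \right] . \tag{5.76}
\]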
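The dot products (5.74)–(5.76) can be confirmed by constructing the four-momenta explicitly. This sketch assumes a massless electron, the metric signature (+,−,−,−), and components ordered as (E, p_x, p_y, p_z); the sample values of √s and θ are arbitrary and the function names are illustrative.

```python
import math

def mdot(a, b):
    """Minkowski product with signature (+,-,-,-)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def momenta(sqrt_s, m_mu, theta):
    """CM-frame four-vectors: massless e- along +z, muon in the yz plane."""
    P = sqrt_s / 2                        # beam momentum = beam energy
    E = sqrt_s / 2                        # muon energy
    K = math.sqrt(E**2 - m_mu**2)         # muon 3-momentum magnitude
    pa = (P, 0.0, 0.0, P)
    pb = (P, 0.0, 0.0, -P)
    k1 = (E, 0.0, K * math.sin(theta), K * math.cos(theta))
    k2 = (E, 0.0, -K * math.sin(theta), -K * math.cos(theta))
    return pa, pb, k1, k2

def closed_forms(sqrt_s, m_mu, theta):
    """Right-hand sides of (5.74), (5.75), (5.76)."""
    s = sqrt_s**2
    beta = math.sqrt(1 - 4 * m_mu**2 / s)
    return s / 2, s / 4 * (1 - beta * math.cos(theta)), s / 4 * (1 + beta * math.cos(theta))

pa, pb, k1, k2 = momenta(1.0, 0.10566, 1.1)     # sqrt(s) = 1 GeV, theta = 1.1 rad
print(mdot(pa, pb), mdot(pa, k1), mdot(pa, k2))
print(closed_forms(1.0, 0.10566, 1.1))
```

The two printed triples should coincide, and `mdot(k1, k1)` returns $m_\mu^2$, confirming that the muon is on shell.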
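Putting these results into (5.65) and using $K/P = \sqrt{1 - 4m_\mu^2/s}$, we obtain
\[
\frac{d\sigma}{d(\cos\theta)} = \left( \frac{1}{4} \sum_{\rm spins} |\mathcal{M}|^2 \right) \frac{K}{32\pi s P} \tag{5.78}
\]
\[
= \frac{e^4}{32\pi s} \sqrt{1 - \frac{4 m_\mu^2}{s}} \left[ 1 + \frac{4 m_\mu^2}{s} + \left( 1 - \frac{4 m_\mu^2}{s} \right) \cos^2\theta \right] , \tag{5.79}
\]
or, rewriting in terms of $\alpha = e^2/4\pi$,
\[
\frac{d\sigma}{d(\cos\theta)} = \frac{\pi \alpha^2}{2 s} \sqrt{1 - \frac{4 m_\mu^2}{s}} \left[ 1 + \frac{4 m_\mu^2}{s} + \left( 1 - \frac{4 m_\mu^2}{s} \right) \cos^2\theta \right] . \tag{5.80}
\]
Doing the integral over cos θ using $\int_{-1}^{1} d(\cos\theta) = 2$ and $\int_{-1}^{1} \cos^2\theta\, d(\cos\theta) = 2/3$,
we find the total cross-section:
\[
\sigma = \frac{4\pi \alpha^2}{3 s} \sqrt{1 - \frac{4 m_\mu^2}{s}} \left( 1 + \frac{2 m_\mu^2}{s} \right) . \tag{5.81}
\]
It is a useful check that the cross-section has units of area; recall that when $c = \hbar = 1$,
then $s = E_{\rm CM}^2$ has units of mass² or length⁻².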
Equations (5.80) and (5.81) have been tested in many experiments, and correctly
predict the rate of production of muon-antimuon pairs at electron-positron colliders.
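To get a feeling for the numbers, the following sketch integrates the differential cross-section over cos θ numerically, compares with the closed form (5.81), and converts to nanobarns. It assumes the angular distribution $\frac{\pi\alpha^2}{2s}\sqrt{1-x}\,[1 + x + (1-x)\cos^2\theta]$ with $x = 4m_\mu^2/s$, and the standard conversion $(\hbar c)^2 \approx 0.3894$ mb GeV²; the function names are illustrative.

```python
import math

ALPHA = 1 / 137.036
M_MU = 0.10566              # muon mass in GeV
GEV2_TO_NB = 0.3894e6       # (hbar c)^2 = 0.3894 mb GeV^2, expressed in nb GeV^2

def dsigma_dcos(s, cos_theta, m=M_MU):
    """Differential cross-section in GeV^-2; s in GeV^2."""
    x = 4 * m**2 / s
    return math.pi * ALPHA**2 / (2 * s) * math.sqrt(1 - x) * (1 + x + (1 - x) * cos_theta**2)

def sigma_total(s, m=M_MU):
    """Closed form (5.81) in GeV^-2."""
    x = 4 * m**2 / s
    return 4 * math.pi * ALPHA**2 / (3 * s) * math.sqrt(1 - x) * (1 + x / 2)

def sigma_numeric(s, n=200):
    """Simpson's rule integral of dsigma/dcos over cos(theta) in [-1, 1]."""
    h = 2.0 / n
    total = dsigma_dcos(s, -1.0) + dsigma_dcos(s, 1.0)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * dsigma_dcos(s, -1.0 + k * h)
    return total * h / 3

for sqrt_s in (0.25, 1.0, 10.0):
    s = sqrt_s**2
    print(f"sqrt(s) = {sqrt_s:5.2f} GeV   sigma = {sigma_total(s) * GEV2_TO_NB:10.3f} nb"
          f"   (numeric: {sigma_numeric(s) * GEV2_TO_NB:10.3f} nb)")
```

Well above threshold the result approaches the familiar estimate of about 86.8 nb divided by s in GeV², so at √s = 10 GeV the cross-section is a little under 1 nb.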
Let us examine some special limiting cases. Near the energy threshold for μ+μ−
production, one may expand in the quantity
\[
\Delta E = \sqrt{s} - 2 m_\mu . \tag{5.82}
\]
To leading order in small ΔE, (5.81) becomes
\[
\sigma \approx \frac{\pi \alpha^2}{2 m_\mu^2} \sqrt{\frac{\Delta E}{m_\mu}} .
\]
The cross-section therefore rises like the square root of the energy excess over the
threshold. However, going to increasing energy, σ quickly levels off because of
the 1/s factors in (5.81). Maximizing with respect to s, one finds that the largest
cross-section in (5.81) is reached for $\sqrt{s} = \sqrt{1 + \sqrt{21}}\; m_\mu \approx 2.36\, m_\mu$, and is about
$0.54\, \alpha^2/m_\mu^2 \approx 1000$ nb.
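The location of the maximum can be confirmed numerically. The sketch below writes (5.81) in units of $\alpha^2/m_\mu^2$ as a function of $r = \sqrt{s}/m_\mu$ and locates its peak with a golden-section search; the search routine and the bracketing interval [2, 5] are choices made for this illustration.

```python
import math

def shape(r):
    """sigma * m^2 / alpha^2 from (5.81), as a function of r = sqrt(s)/m."""
    s = r * r
    return 4 * math.pi / (3 * s) * math.sqrt(1 - 4 / s) * (1 + 2 / s)

def golden_max(f, lo, hi, tol=1e-10):
    """Golden-section search for the maximum of a unimodal function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) > f(d):
            b = d          # maximum lies in [a, d]
        else:
            a = c          # maximum lies in [c, b]
        c, d = b - inv_phi * (b - a), a + inv_phi * (b - a)
    return (a + b) / 2

r_best = golden_max(shape, 2.0, 5.0)
print(f"sqrt(s)/m at the maximum : {r_best:.4f}")
print(f"sqrt(1 + sqrt(21))       : {math.sqrt(1 + math.sqrt(21)):.4f}")
print(f"sigma_max * m^2/alpha^2  : {shape(r_best):.4f}")
```

The search lands on the analytic value $\sqrt{1 + \sqrt{21}} \approx 2.36$, where the cross-section is roughly $0.54\,\alpha^2/m^2$; for the muon this corresponds to about a microbarn.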
5.2.2 e− e+ → f f̄
The reduced matrix element for this process has exactly the same form as for e− e+ →
μ− μ+, except that the photon-μ−-μ+ vertex is replaced by a photon-f-f̄ vertex, with:
\[
ie\gamma^\nu \longrightarrow -i Q_f e \gamma^\nu , \tag{5.86}
\]
where $Q_f$ is the charge of the fermion f. In the case of quarks, there are three
indistinguishable colors for each flavor (up, down, strange, charm, bottom, top). The
photon-quark-antiquark vertex is diagonal in color, so the three colors are simply
summed over in order to find the total cross-section for a given flavor. In general, if
we call $n_f$ the number of colors (or perhaps other non-spin degrees of freedom) of
the fermion f, then we have:
\[
\sigma_{e^- e^+ \to f \overline{f}} = n_f Q_f^2\, \frac{4\pi \alpha^2}{3 s} \sqrt{1 - \frac{4 m_f^2}{s}} \left( 1 + \frac{2 m_f^2}{s} \right) . \tag{5.88}
\]
In the high-energy limit $\sqrt{s} \gg m_f$, this becomes:
\[
\sigma_{e^- e^+ \to f \overline{f}} = n_f Q_f^2\, \frac{4\pi \alpha^2}{3 s} . \tag{5.89}
\]
Figure 5.1 compares the total cross-section for e+ e− → f f̄ (solid line) as given by
(5.88) to the asymptotic approximation (dashed line) given by (5.89).
We see that the true cross-section is always less than the asymptotic approximation,
but the two already agree fairly well when $\sqrt{s} \gtrsim 2.5\, m_f$. This means that when
several fermions contribute, the total cross-section well above threshold is just equal
to the sum of $n_f Q_f^2$ for the available states times a factor $4\pi\alpha^2/3s$. For example,
the up quark has charge $Q_u = +2/3$, and there are three colors, so the prefactor
indicated above is $3(2/3)^2 = 4/3$. The prefactors for all of the fundamental charged
fermion types with masses less than $m_Z$ are:

    e, μ, τ:    $n_f Q_f^2 = 1$
    u, c:       $n_f Q_f^2 = 4/3$
    d, s, b:    $n_f Q_f^2 = 1/3$

[Fig. 5.1: total cross-section for e+ e− → f f̄ plotted against $E_{\rm CM}/m_f$; plot not reproduced.]
However, free quarks are not seen in nature because the QCD color force confines
them within color-singlet hadrons. This means that the cross-section for the
quark-antiquark production process $e^- e^+ \to Q\overline{Q}$ cannot easily be interpreted in terms of
specific particles in the final state. Instead, one should view the quark production as
a microscopic process, occurring at a distance scale much smaller than a typical
hadron. Before we “see” them in macroscopic-sized detectors, the produced quarks
then undergo further strong interactions that end up producing hadronic jets of par-
ticles with momenta close to those of the original quarks. This always involves at
least the further production of a quark-antiquark pair in order to make the final state
hadrons color singlets. A Feynman-diagram cartoon of the situation might look as
shown in Fig. 5.2. Because the hadronic interactions are most important at the
strong-interaction energy scale of a few hundred MeV, the calculation of the cross-section
can only be trusted for energies that are significantly higher than this. When $\sqrt{s} \gg 1$
GeV, one can make the approximation:
\[
\sigma(e^- e^+ \to \text{hadrons}) \approx \sum_q \sigma(e^- e^+ \to q \overline{q}) . \tag{5.90}
\]
The final state can be quite complicated, so to test QED production of quarks, one can
just measure the total cross-section for producing hadrons. The traditional measure
of the total hadronic cross-section is the variable Rhadrons , defined as the ratio:
\[
R_{\rm hadrons} = \frac{\sigma(e^- e^+ \to \text{hadrons})}{\sigma(e^- e^+ \to \mu^- \mu^+)} . \tag{5.91}
\]
When the approximation is valid, one can always produce up, down and strange
quarks, which all have masses < 1 GeV. The threshold to produce charm-anticharm
quarks occurs roughly when $\sqrt{s} > 2 m_c \approx 3$ GeV, and that to produce
bottom-antibottom quarks is at roughly $\sqrt{s} > 2 m_b \approx 10$ GeV. As each of these thresholds
is passed, one gets a contribution to $R_{\rm hadrons}$ that is approximately a constant
proportional to $n_f Q_f^2$. So, for $\sqrt{s} < 3$ GeV, one has
\[
R_{\rm hadrons} = \frac{4}{3} + \frac{1}{3} + \frac{1}{3} = 2 . \quad (u, d, s \text{ quarks}) \tag{5.92}
\]
For 3 GeV $< \sqrt{s} <$ 10 GeV, the charm quark contributes, and the ratio is
\[
R_{\rm hadrons} = \frac{4}{3} + \frac{1}{3} + \frac{1}{3} + \frac{4}{3} = \frac{10}{3} . \quad (u, d, s, c \text{ quarks}) \tag{5.93}
\]
Finally, for $\sqrt{s} > 10$ GeV, we get
\[
R_{\rm hadrons} = \frac{4}{3} + \frac{1}{3} + \frac{1}{3} + \frac{4}{3} + \frac{1}{3} = \frac{11}{3} . \quad (u, d, s, c, b \text{ quarks}) \tag{5.94}
\]
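The step structure of $R_{\rm hadrons}$ in (5.92)–(5.94) is just a sum of $n_f Q_f^2$ over the quark flavors that are kinematically accessible. The sketch below encodes that counting with exact fractions. It uses the rough thresholds of 3 GeV and 10 GeV quoted above, and treats u, d, s as always available, which is a simplification that only makes sense in the regime $\sqrt{s} \gg 1$ GeV where the approximation is trusted.

```python
from fractions import Fraction

N_COLORS = 3
# (flavor, electric charge, rough pair-production threshold in GeV)
QUARKS = [
    ("u", Fraction(2, 3), 0.0),
    ("d", Fraction(-1, 3), 0.0),
    ("s", Fraction(-1, 3), 0.0),
    ("c", Fraction(2, 3), 3.0),
    ("b", Fraction(-1, 3), 10.0),
]

def r_hadrons(sqrt_s):
    """Continuum estimate of R: sum of n_f Q_f^2 over quarks above threshold."""
    return sum(N_COLORS * charge**2 for _, charge, threshold in QUARKS
               if sqrt_s > threshold)

for sqrt_s in (2.0, 5.0, 20.0):
    print(f"sqrt(s) = {sqrt_s:5.1f} GeV   R = {r_hadrons(sqrt_s)}")
```

Setting `N_COLORS` to 1 would give R = 2/3, 10/9 and 11/9 in the three regions, so a measurement of R discriminates sharply between one and three colors.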
Besides these “continuum” contributions to $R_{\rm hadrons}$, there are resonant contributions
that come from e− e+ → hadronic bound states. These bound states tend to have
very large, but narrow, production cross-sections when $\sqrt{s}$ is in just the right energy
range to produce them. For example, when the bound state consists of a charm
and anticharm quark, one gets the J/ψ particle resonance at $\sqrt{s} = 3.096916$ GeV,
with a width of 0.000093 GeV. These resonances contribute very sharp peaks to the
measured $R_{\rm hadrons}$. Experimentally, $R_{\rm hadrons}$ is quite hard to measure, being plagued
by systematic detector effects. Many of the older experiments at lower energy tended
to underestimate the systematic uncertainties. Figure 5.3 shows plots of the data from
RPP 2022. The approximate agreement with the predictions for $R_{\rm hadrons}$ in (5.92)–
(5.94) provides a crucial test of the quark model of hadrons, including the charges
of the quarks and the number of colors.
5.2.3 Helicities in e− e+ → μ− μ+
\[
P_R - P_L = X\% \tag{5.95}
\]
where $P_R$ and $P_L$ are the probabilities of measuring the spin pointing along and
against the 3-momentum direction, respectively. This experimental capability shows
that one needs to be able to calculate cross-sections without assuming that the initial
spin state is random and averaged over.
We could redo the calculation of the previous sections with particular spinors
$u(p, s)$ and $\overline{v}(p, s)$ for the desired specific spin states s of initial state particles.
However, then we would lose our precious trick of evaluating $\sum_s u\overline{u}$ and $\sum_s v\overline{v}$. A
nicer way is to keep the sum over spins, but eliminate the “wrong” polarization from
the sum using a projection matrix from (3.89). So, for example, we can use

    $P_L\, u(p, s)$   for an initial-state L-handed fermion,
    $P_R\, u(p, s)$   for an initial-state R-handed fermion,

in place of the usual Feynman rules for an initial state particle. Summing over the
spin s will not change the fact that the projection matrix allows only L- or R-handed
electrons to contribute to the cross-section. Now our traces over gamma matrices
will involve $\gamma_5$, because of the explicit expressions for $P_L$ and $P_R$ (see (3.89)).
Fig. 5.3 $R_{\rm had} = \sigma(e^+ e^- \to \text{hadrons})/\sigma(e^+ e^- \to \mu^+ \mu^-)$ vs. $\sqrt{s}$ (from RPP 2022)
To get the equivalent rules for an initial state antiparticle, we must remember that
the spin operator acting on $v(p, s)$ spinors is the opposite of the spin operator acting
on $u(p, s)$ spinors. Therefore, $P_L$ acting on a $v(p, s)$ spinor projects onto a R-handed
antiparticle. So if we form the object $\overline{v}(p, s) P_R = v^\dagger(p, s) \gamma^0 P_R = v^\dagger(p, s) P_L \gamma^0$,
the result must describe a R-handed positron; in this case, the bar on the spinor for an
antiparticle “corrects” the handedness. So, for an initial state antiparticle with either
helicity, we can use:

    $\overline{v}(p, s)\, P_L$   for an initial-state L-handed antifermion,
    $\overline{v}(p, s)\, P_R$   for an initial-state R-handed antifermion.
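In some cases, one can also measure the polarizations of outgoing particles, for
example by observing their decays. Tau leptons and anti-taus sometimes decay by
the weak interaction processes:
\[
\tau^- \to \ell^- \nu_\tau \overline{\nu}_\ell , \tag{5.100}
\]
\[
\tau^+ \to \ell^+ \overline{\nu}_\tau \nu_\ell , \tag{5.101}
\]
where ℓ is either e or μ, with the angular distributions of the final state directions
depending on the spin of the τ, which may be one of the final state fermions in a
scattering or decay process of interest. If the polarization of a final-state fermion is
fixed by measurement, then we need to use:

    $\overline{u}(p, s)\, P_R$   for a final-state L-handed fermion,
    $\overline{u}(p, s)\, P_L$   for a final-state R-handed fermion,
    $P_R\, v(p, s)$   for a final-state L-handed antifermion,
    $P_L\, v(p, s)$   for a final-state R-handed antifermion.

As a first example, consider the process:
\[
e^-_R\, e^+_R \to \mu^- \mu^+ , \tag{5.106}
\]
where the helicities of the initial state particles are now assumed to be known
perfectly. The reduced matrix element for this process, following from the same
Feynman diagram as before, is:
\[
\mathcal{M} = i \frac{e^2}{s} (\overline{v}_b P_R \gamma^\mu P_R u_a)\, (\overline{u}_1 \gamma_\mu v_2) . \tag{5.107}
\]
A projection matrix can be moved through a gamma matrix by changing L ↔ R:
\[
P_R \gamma^\mu = \gamma^\mu P_L , \tag{5.108}
\]
\[
P_L \gamma^\mu = \gamma^\mu P_R . \tag{5.109}
\]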
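These projector identities can be verified with explicit matrices. The sketch below assumes the Dirac representation, $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$, and $P_{L,R} = (1 \mp \gamma_5)/2$; these are common conventions that may differ in detail from (3.89), but the identities being tested are unchanged if the roles of L and R are swapped. The function names are illustrative.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def gamma_matrices():
    """gamma^mu in the Dirac representation (an assumed basis)."""
    pauli = [[[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
    gammas = [[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]]
    for sg in pauli:
        g = [[0j] * 4 for _ in range(4)]
        for i in range(2):
            for j in range(2):
                g[i][j + 2] = sg[i][j]     # upper-right block: +sigma
                g[i + 2][j] = -sg[i][j]    # lower-left block: -sigma
        gammas.append(g)
    return gammas

def chiral_projectors(gammas):
    """Return (P_L, P_R) = ((1 - gamma5)/2, (1 + gamma5)/2), an assumed convention."""
    prod = matmul(matmul(gammas[0], gammas[1]), matmul(gammas[2], gammas[3]))
    g5 = [[1j * x for x in row] for row in prod]
    one = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    p_left = [[(one[i][j] - g5[i][j]) / 2 for j in range(4)] for i in range(4)]
    p_right = [[(one[i][j] + g5[i][j]) / 2 for j in range(4)] for i in range(4)]
    return p_left, p_right

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(4) for j in range(4))

GAMMA = gamma_matrices()
PL, PR = chiral_projectors(GAMMA)
ZERO = [[0.0] * 4 for _ in range(4)]

for mu in range(4):
    moved = max_abs_diff(matmul(PR, GAMMA[mu]), matmul(GAMMA[mu], PL))
    sandwiched = max_abs_diff(matmul(matmul(PR, GAMMA[mu]), PR), ZERO)
    print(f"mu = {mu}:  |P_R g - g P_L| = {moved:.1e}   |P_R g P_R| = {sandwiched:.1e}")
```

Both columns should print zeros: moving a projector through a single gamma matrix flips its handedness, so a gamma matrix sandwiched between two identical projectors vanishes.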
In particular,
\[
P_R \gamma^\mu P_R = \gamma^\mu P_L P_R = 0 , \tag{5.111}
\]
so the reduced matrix element (5.107) vanishes. Next consider the process:
\[
e^-_L\, e^+_R \to \mu^-_L\, \mu^+_R , \tag{5.112}
\]
in which we have now assumed that all helicities are perfectly known. To simplify
matters, we will assume the high energy limit $\sqrt{s} \gg m_\mu$. The reduced matrix element
can be simply obtained from (5.43) by just putting in the appropriate L and R
projection matrices acting on each external state spinor:
\[
\mathcal{M} = i \frac{e^2}{s} (\overline{v}_b P_R \gamma^\mu P_L u_a)(\overline{u}_1 P_R \gamma_\mu P_L v_2) . \tag{5.113}
\]
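This can be simplified slightly by using the properties of the projection matrices:
\[
P_R \gamma^\mu P_L = \gamma^\mu P_L P_L = \gamma^\mu P_L , \tag{5.114}
\]
so that
\[
\mathcal{M} = i \frac{e^2}{s} (\overline{v}_b \gamma^\mu P_L u_a)(\overline{u}_1 \gamma_\mu P_L v_2) , \tag{5.115}
\]
and so
\[
|\mathcal{M}|^2 = \frac{e^4}{s^2} (\overline{v}_b \gamma^\mu P_L u_a)(\overline{u}_1 \gamma_\mu P_L v_2)(\overline{v}_b \gamma^\nu P_L u_a)^* (\overline{u}_1 \gamma_\nu P_L v_2)^* . \tag{5.116}
\]
To evaluate this, we compute:
\[
(\overline{v}_b \gamma^\nu P_L u_a)^* = \overline{u}_a P_R \gamma^\nu v_b ,
\]
\[
(\overline{u}_1 \gamma_\nu P_L v_2)^* = \overline{v}_2 P_R \gamma_\nu u_1 . \tag{5.121}
\]
Therefore,
\[
|\mathcal{M}|^2 = \frac{e^4}{s^2} (\overline{v}_b \gamma^\mu P_L u_a)(\overline{u}_a P_R \gamma^\nu v_b)(\overline{u}_1 \gamma_\mu P_L v_2)(\overline{v}_2 P_R \gamma_\nu u_1) . \tag{5.122}
\]
Because the spin projection matrices will only allow the specified set of spins to
contribute anyway, we are free to sum over the spin labels $s_a$, $s_b$, $s_1$, and $s_2$, without
changing anything. Let us do so, since it will allow us to apply the tricks
\[
\sum_{s_a} u_a \overline{u}_a = \slashed{p}_a + m_e , \tag{5.123}
\]
\[
\sum_{s_2} v_2 \overline{v}_2 = \slashed{k}_2 - m_\mu . \tag{5.124}
\]
Neglecting the masses in the high energy limit, and using $P_L \slashed{p} P_R = \slashed{p} P_R$ (which
follows from (5.109)), we get
\[
|\mathcal{M}|^2 = \frac{e^4}{s^2} \sum_{s_b} \sum_{s_1} (\overline{v}_b \gamma^\mu \slashed{p}_a P_R \gamma^\nu v_b)\, (\overline{u}_1 \gamma_\mu \slashed{k}_2 P_R \gamma_\nu u_1) . \tag{5.129}
\]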
Again using the trick of putting the barred spinor at the end and taking the trace (see
the discussion around (5.53)) for each quantity in parentheses, this becomes:
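$$|\mathcal{M}|^2 = \frac{e^4}{s^2} \sum_{s_b} \mathrm{Tr}[\gamma^\mu \slashed{p}_a P_R \gamma^\nu v_b \bar v_b]\, \sum_{s_1} \mathrm{Tr}[\gamma_\mu \slashed{k}_2 P_R \gamma_\nu u_1 \bar u_1]. \tag{5.130}$$
Doing the sums over $s_1$ and $s_b$ using the usual trick gives:
$$|\mathcal{M}|^2 = \frac{e^4}{s^2}\, \mathrm{Tr}[\gamma^\mu \slashed{p}_a P_R \gamma^\nu \slashed{p}_b]\, \mathrm{Tr}[\gamma_\mu \slashed{k}_2 P_R \gamma_\nu \slashed{k}_1]. \tag{5.131}$$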
where $\epsilon^{\mu\alpha\nu\beta}$ is the totally antisymmetric Levi-Civita tensor defined in (2.65). Putting
things together:
$$\mathrm{Tr}[\gamma^\mu \slashed{p}_a P_R \gamma^\nu \slashed{p}_b] = 2\left[ p_a^\mu p_b^\nu - g^{\mu\nu} (p_a \cdot p_b) + p_b^\mu p_a^\nu + i\, p_{a\alpha} p_{b\beta}\, \epsilon^{\mu\alpha\nu\beta} \right]. \tag{5.136}$$
Finally, we have to multiply these two traces together, contracting the indices μ and
ν. Note that the cross-terms containing only one $\epsilon$ tensor vanish, because the epsilon
tensors are antisymmetric under μ ↔ ν, while the other terms are symmetric. The
term involving two epsilon tensors can be evaluated using the useful identity
which you can verify by brute force substitution of indices. The result is simply:
so that
$$|\mathcal{M}|^2 = \frac{16 e^4}{s^2}\, (p_a \cdot k_2)(p_b \cdot k_1). \tag{5.140}$$
This result should be plugged in to the formula for the differential cross-section:
Note that one does not average over initial-state spins in this case, because they have
already been fixed. The kinematics is of course not affected by the fact that we have
fixed the helicities, and so can be taken from the discussion in Sect. 5.2.1 with $m_\mu$
replaced by 0. It follows that:
$$\frac{d\sigma_{e^-_L e^+_R \to \mu^-_L \mu^+_R}}{d(\cos\theta)} = \frac{e^4}{32\pi s}\, (1 + \cos\theta)^2 \tag{5.142}$$
$$= \frac{\pi\alpha^2}{2s}\, (1 + \cos\theta)^2. \tag{5.143}$$
The angular dependence of this result can be understood from considering the con-
servation of angular momentum in the event. Drawing a short arrow to represent the
direction of the spin:
This shows that the total spin angular momentum of the initial state is Sẑ = −1 (taking
the electron to be moving in the +z direction). The total spin angular momentum
of the final state is Sn̂ = −1, where n̂ is the direction of the μ− . This explains
why the cross-section vanishes if cos θ = −1; that corresponds to a final state with
the total spin angular momentum in the opposite direction from the initial state.
The quantum mechanical overlap for two states with measured angular momenta in
exactly opposite directions must vanish. If we describe the initial and final states as
eigenstates of angular momentum with J = 1:
$$|\langle J_{\hat n} = -1 \,|\, J_{\hat z} = -1 \rangle|^2 = \frac{(1 + \cos\theta)^2}{4}. \tag{5.146}$$
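As a quick numerical cross-check of (5.146), the sketch below builds the J = 1 overlap out of two spin-1/2 overlaps. It assumes (a standard fact, though not spelled out above) that the state with spin projection −1 along an axis is the product of two spin-1/2 states, each with projection −1/2 along that same axis. The function names are illustrative only.

```python
import math

def spin_half_overlap(theta):
    """Amplitude <down along n | down along z> for one spin-1/2 particle,
    where the axis n is tilted by the angle theta away from z."""
    return math.cos(theta / 2.0)

def j1_overlap_squared(theta):
    """|<J_n = -1 | J_z = -1>|^2, assuming the J = 1 state is a product of
    two aligned spin-1/2 states, so the amplitude is cos^2(theta/2)."""
    return spin_half_overlap(theta) ** 4

def angular_factor(cos_theta):
    """Right-hand side of (5.146): (1 + cos theta)^2 / 4."""
    return (1.0 + cos_theta) ** 2 / 4.0

# Compare the two expressions at a few scattering angles.
for cos_theta in (-1.0, -0.5, 0.0, 0.5, 1.0):
    theta = math.acos(cos_theta)
    print(cos_theta, j1_overlap_squared(theta), angular_factor(cos_theta))
```

The two columns agree, and both vanish at cos θ = −1, which is the backward direction where the cross-section (5.143) is zero.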
Similarly, one can compute:
$$\frac{d\sigma_{e^-_R e^+_L \to \mu^-_R \mu^+_L}}{d(\cos\theta)} = \frac{\pi\alpha^2}{2s}\, (1 + \cos\theta)^2, \tag{5.147}$$
with all helicities reversed compared to the previous case. If we compute the cross-
sections for the final state muon to have the opposite helicity from the initial state
electron, we get
then by exactly the same argument as before, the fermion and antifermion must have
opposite helicities, because of $\bar v P_L \gamma^\mu P_L u = \bar v P_R \gamma^\mu P_R u = 0$ and $\bar u P_L \gamma^\mu P_L v = \bar u P_R \gamma^\mu P_R v = 0$ and the rules of (5.96)–(5.99) and (5.102)–(5.105).
Moreover, if an initial state fermion (or anti-fermion) interacts with a vector and
emerges as a final state fermion (or anti-fermion):
then the fermions (or anti-fermions) must have the same helicity, because of the
identities $\bar u P_L \gamma^\mu P_L u = \bar u P_R \gamma^\mu P_R u = 0$ and $\bar v P_L \gamma^\mu P_L v = \bar v P_R \gamma^\mu P_R v = 0$. This
is true even if the interaction with the vector changes the fermion from one type to
another.
These rules embody the concept of helicity conservation in high energy scatter-
ing. They are obviously useful when the helicities of the particles are controlled or
measured by the experimenter. They are also useful because, as we will see, the weak
interactions only affect fermions with L helicity and antifermions with R helicity.
The conservation of angular momentum together with helicity conservation often
allows one to know in which direction a particle is most likely to emerge in a scat-
tering or decay experiment, and in what cases one may expect the cross-section to
vanish or be enhanced.
$$e^- e^+ \to e^- e^+. \tag{5.151}$$
For simplicity we will only consider the case of high-energy scattering, with $\sqrt{s} = E_{\rm CM} \gg m_e$, and we will consider all spins to be unknown (averaged over in the
initial state, summed over in the final state).
The first of these is called the s-channel diagram; it is exactly the same as the one we
drew for e− e+ → μ− μ+ . The second one is called the t-channel diagram. Using the
QED Feynman rules listed at the end of Sect. 5.1, the corresponding contributions to
the reduced matrix element for the process are:
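$$\mathcal{M}_s = \left[ \bar v_b (ie\gamma^\mu) u_a \right] \left[ \frac{-i g_{\mu\nu}}{(p_a + p_b)^2} \right] \left[ \bar u_1 (ie\gamma^\nu) v_2 \right], \tag{5.153}$$
and
$$\mathcal{M}_t = (-1) \left[ \bar u_1 (ie\gamma^\mu) u_a \right] \left[ \frac{-i g_{\mu\nu}}{(p_a - k_1)^2} \right] \left[ \bar v_b (ie\gamma^\nu) v_2 \right]. \tag{5.154}$$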
The additional (−1) factor in Mt is due to Rule 9 in the QED Feynman rules at the
end of Sect. 5.1. It arises because the order of spinors in the written expression for
Ms is b, a, 1, 2, but that in Mt is 1, a, b, 2, and these differ from each other by an
odd permutation. We could have just as well assigned the minus sign to Ms instead;
only the relative phases of terms in the matrix element are significant.
Therefore the full reduced matrix element for Bhabha scattering, written in terms
of the Mandelstam variables s = ( pa + pb )2 and t = ( pa − k1 )2 , is:
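$$\mathcal{M} = \mathcal{M}_s + \mathcal{M}_t = i e^2 \left[ \frac{1}{s} (\bar v_b \gamma_\mu u_a)(\bar u_1 \gamma^\mu v_2) - \frac{1}{t} (\bar u_1 \gamma_\mu u_a)(\bar v_b \gamma^\mu v_2) \right]. \tag{5.155}$$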
The complex square of the reduced matrix element, |M|2 = M∗ M, contains a pure
s-channel piece proportional to 1/s 2 , a pure t-channel piece proportional to 1/t 2 , and
an interference piece proportional to 1/st. For organizational purposes, it is useful
to calculate these pieces separately.
The pure s-channel contribution calculation is exactly the same as what we
did before for e− e+ → μ− μ+ , except that now we can substitute m μ → m e → 0.
Therefore, plagiarizing the result of (5.65), we have:
$$\frac{1}{4} \sum_{\rm spins} |\mathcal{M}_s|^2 = \frac{8 e^4}{s^2} \left[ (p_a \cdot k_2)(p_b \cdot k_1) + (p_a \cdot k_1)(p_b \cdot k_2) \right]. \tag{5.159}$$
The pure t-channel contribution can be calculated in a very similar way. We have:
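$$|\mathcal{M}_t|^2 = \frac{e^4}{t^2}\, (\bar v_2 \gamma_\nu v_b)(\bar v_b \gamma_\mu v_2)(\bar u_1 \gamma^\mu u_a)(\bar u_a \gamma^\nu u_1). \tag{5.160}$$
Taking the average of initial state spins and the sum over final state spins allows us
to use the identities $\sum_{s_a} u_a \bar u_a = \slashed{p}_a$ and $\sum_{s_b} v_b \bar v_b = \slashed{p}_b$ (neglecting $m_e$). The result
is:
$$\frac{1}{4} \sum_{\rm spins} |\mathcal{M}_t|^2 = \frac{e^4}{4 t^2} \sum_{s_1, s_2} (\bar v_2 \gamma_\nu \slashed{p}_b \gamma_\mu v_2)(\bar u_1 \gamma^\mu \slashed{p}_a \gamma^\nu u_1) \tag{5.161}$$
$$= \frac{e^4}{4 t^2} \sum_{s_1, s_2} \mathrm{Tr}[\gamma_\nu \slashed{p}_b \gamma_\mu v_2 \bar v_2]\, \mathrm{Tr}[\gamma^\mu \slashed{p}_a \gamma^\nu u_1 \bar u_1], \tag{5.162}$$
in which we have turned each quantity into a trace by moving the barred spinor ($\bar v_2$ or $\bar u_1$) to the end. Now
performing the sums over $s_1$, $s_2$ gives:
$$\frac{1}{4} \sum_{\rm spins} |\mathcal{M}_t|^2 = \frac{e^4}{4 t^2}\, \mathrm{Tr}[\gamma_\nu \slashed{p}_b \gamma_\mu \slashed{k}_2]\, \mathrm{Tr}[\gamma^\mu \slashed{p}_a \gamma^\nu \slashed{k}_1] \tag{5.163}$$
$$= \frac{e^4}{4 t^2}\, \mathrm{Tr}[\gamma_\mu \slashed{k}_2 \gamma_\nu \slashed{p}_b]\, \mathrm{Tr}[\gamma^\mu \slashed{p}_a \gamma^\nu \slashed{k}_1]. \tag{5.164}$$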
In the second line, the first trace has been rearranged using the cyclic property of
traces. The point of doing so is that now these traces have exactly the same form that
$$\frac{1}{4} \sum_{\rm spins} |\mathcal{M}_t|^2 = \frac{8 e^4}{t^2} \left[ (p_a \cdot k_2)(p_b \cdot k_1) + (k_2 \cdot k_1)(p_b \cdot p_a) \right]. \tag{5.165}$$
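$$\frac{1}{4} \sum_{\rm spins} \mathcal{M}_t^* \mathcal{M}_s = -\frac{e^4}{4 s t} \sum_{\rm spins} (\bar v_b \gamma_\mu u_a)(\bar u_a \gamma^\nu u_1)(\bar u_1 \gamma^\mu v_2)(\bar v_2 \gamma_\nu v_b). \tag{5.166}$$
$$\frac{1}{4} \sum_{\rm spins} \mathcal{M}_t^* \mathcal{M}_s = -\frac{e^4}{4 s t} \sum_{s_b} (\bar v_b \gamma_\mu \slashed{p}_a \gamma^\nu \slashed{k}_1 \gamma^\mu \slashed{k}_2 \gamma_\nu v_b),$$
which can now be converted into a trace by the usual trick of moving the $\bar v_b$ to the
end:
$$\frac{1}{4} \sum_{\rm spins} \mathcal{M}_t^* \mathcal{M}_s = -\frac{e^4}{4 s t} \sum_{s_b} \mathrm{Tr}[\gamma_\mu \slashed{p}_a \gamma^\nu \slashed{k}_1 \gamma^\mu \slashed{k}_2 \gamma_\nu v_b \bar v_b] \tag{5.167}$$
$$= -\frac{e^4}{4 s t}\, \mathrm{Tr}[\gamma_\mu \slashed{p}_a \gamma^\nu \slashed{k}_1 \gamma^\mu \slashed{k}_2 \gamma_\nu \slashed{p}_b]. \tag{5.168}$$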
Now we are faced with the task of computing the trace of 8 gamma matrices. In
principle, the trace of any number of gamma matrices can be performed with the
algorithm of (A.2.11). The procedure is to replace the trace over 2n gamma matrices
by a sum over traces of 2n − 2 gamma matrices, and repeat until all traces are short
enough to evaluate using (A.2.8)–(A.2.11) and (A.2.15)–(A.2.19). However, in many
cases including the present one it is easier to simplify the contents of the trace first,
using (A.2.20)–(A.2.23). To evaluate the trace in (5.168), we first use (A.2.23) to
write:
so that:
in which the trace has finally been performed using (A.2.9). So, from (5.168)
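$$\frac{1}{4} \sum_{\rm spins} \mathcal{M}_t^* \mathcal{M}_s = \frac{8 e^4}{s t}\, (p_a \cdot k_2)(p_b \cdot k_1). \tag{5.174}$$
$$\frac{1}{4} \sum_{\rm spins} \mathcal{M}_s^* \mathcal{M}_t = \frac{8 e^4}{s t}\, (p_a \cdot k_2)(p_b \cdot k_1). \tag{5.175}$$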
It remains to identify the dot products of momenta appearing in the above formulas.
This can be done by carrying over the kinematic analysis for the case e− e+ → μ− μ+
as worked out in (5.66)–(5.76), with m μ , m e → 0. Letting θ be the angle between
the 3-momenta directions of the initial state electron and the final state electron, we
have:
pa · pb = k1 · k2 = s/2, (5.177)
pa · k1 = pb · k2 = −t/2, (5.178)
pa · k2 = pb · k1 = −u/2, (5.179)
with
$$t = -\frac{s}{2} (1 - \cos\theta), \tag{5.180}$$
$$u = -\frac{s}{2} (1 + \cos\theta). \tag{5.181}$$
It follows from (5.159), (5.165), (5.174), and (5.175) that:
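$$\frac{1}{4} \sum_{\rm spins} |\mathcal{M}_s|^2 = \frac{2 e^4}{s^2}\, (u^2 + t^2) \tag{5.182}$$
$$\frac{1}{4} \sum_{\rm spins} |\mathcal{M}_t|^2 = \frac{2 e^4}{t^2}\, (u^2 + s^2) \tag{5.183}$$
$$\frac{1}{4} \sum_{\rm spins} (\mathcal{M}_t^* \mathcal{M}_s + \mathcal{M}_s^* \mathcal{M}_t) = \frac{4 e^4}{s t}\, u^2. \tag{5.184}$$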
Putting this into (4.192), since |pa | = |k1 |, we obtain the spin-averaged differen-
tial cross-section for Bhabha scattering:
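$$\frac{d\sigma}{d(\cos\theta)} = \frac{e^4}{16\pi s} \left[ \frac{u^2 + t^2}{s^2} + \frac{u^2 + s^2}{t^2} + \frac{2 u^2}{s t} \right] \tag{5.185}$$
$$= \frac{\pi\alpha^2}{2s} \left( \frac{3 + \cos^2\theta}{1 - \cos\theta} \right)^2. \tag{5.186}$$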
This result actually diverges for cos θ → 1, because of the t’s in the denominator.
This is not an integrable singularity, because the differential cross-section blows up
quadratically near cos θ = 1, so
$$\sigma = \int_{-1}^{1} \frac{d\sigma}{d(\cos\theta)}\, d(\cos\theta) \longrightarrow \infty. \tag{5.187}$$
The infinite total cross-section corresponds to the infinite range of the Coulomb
potential between two charged particles. It arises entirely from the t-channel dia-
gram, in which the electron and positron scatter off of each other in the forward
direction (θ ≈ 0). It simply reflects that an infinite-range interaction will always
produce some deflection, although it may be extremely small. This result is the rela-
tivistic generalization of the non-relativistic, classical Rutherford scattering problem,
in which an electron or alpha particle (or some other light charged particle) scatters
off of the classical electric field of a heavy nucleus. As worked out in many textbooks
on classical physics (for example, H. Goldstein’s Classical Mechanics, J.D. Jack-
son’s Classical Electrodynamics), the differential cross-section for a non-relativistic
light particle with charge Q A and a heavy particle with charge Q B , with center-of-
momentum energy E CM to scatter through their Coulomb interaction is:
$$\frac{d\sigma_{\rm Rutherford}}{d(\cos\theta)} = \frac{\pi Q_A^2 Q_B^2 \alpha^2}{2 E_{\rm CM}^2 (1 - \cos\theta)^2}. \tag{5.188}$$
(Here one must be careful in comparing results, because the charge e used by Goldstein and Jackson differs from the one used here by a factor of $\sqrt{4\pi}$.) Comparing
the non-relativistic Rutherford result to the relativistic Bhabha result, we see that in
both cases the small-angle behavior scales like $1/\theta^4$, and does not depend on the
signs of the charges of the particles.
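The equality of the two forms (5.185) and (5.186), and the $1/\theta^4$ behavior at small angles, can be checked numerically. The following is a minimal sketch in natural units; the numerical value of α and all function names are assumptions made for illustration, and the small-angle expression $32\pi\alpha^2/(s\theta^4)$ is simply the leading term obtained by expanding (5.186) around θ = 0.

```python
import math

ALPHA = 1.0 / 137.036  # fine-structure constant (approximate value, assumed here)

def mandelstam_t_u(s, cos_theta):
    """Massless 2 -> 2 kinematics, as in (5.180)-(5.181)."""
    t = -0.5 * s * (1.0 - cos_theta)
    u = -0.5 * s * (1.0 + cos_theta)
    return t, u

def bhabha_from_mandelstam(s, cos_theta, alpha=ALPHA):
    """d(sigma)/d(cos theta) in the form (5.185), using e^2 = 4 pi alpha."""
    t, u = mandelstam_t_u(s, cos_theta)
    e4 = (4.0 * math.pi * alpha) ** 2
    bracket = (u * u + t * t) / s**2 + (u * u + s * s) / t**2 + 2.0 * u * u / (s * t)
    return e4 / (16.0 * math.pi * s) * bracket

def bhabha_closed_form(s, cos_theta, alpha=ALPHA):
    """d(sigma)/d(cos theta) in the form (5.186)."""
    ratio = (3.0 + cos_theta**2) / (1.0 - cos_theta)
    return math.pi * alpha**2 / (2.0 * s) * ratio**2

def small_angle_limit(s, theta, alpha=ALPHA):
    """Leading small-angle term of (5.186): 32 pi alpha^2 / (s theta^4)."""
    return 32.0 * math.pi * alpha**2 / (s * theta**4)

s = 1.0  # any positive value; the checks below are scale independent
for c in (-0.9, 0.0, 0.7):
    print(c, bhabha_from_mandelstam(s, c), bhabha_closed_form(s, c))
print(bhabha_closed_form(s, math.cos(0.01)) / small_angle_limit(s, 0.01))
```

The last printed ratio approaches 1 as the angle shrinks, which is the numerical counterpart of the $1/\theta^4$ scaling shared with the Rutherford formula.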
In a real experiment, there is always some minimum scattering angle that can
be resolved. In a colliding-beam experiment, this is usually dictated by the fact that
detectors cannot be placed within or too close to the beamline. In other experiments,
one is limited by the angular resolution of detectors. Therefore, the true observable
quantity is typically something more like:
$$\sigma_{\rm experiment} = \int_{-\cos\theta_{\rm cut}}^{\cos\theta_{\rm cut}} \frac{d\sigma}{d(\cos\theta)}\, d(\cos\theta). \tag{5.189}$$
Of course, in real experiments, the minimum resolvable angle is just one of many
practical factors that have to be included.
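To see how strongly the observable cross-section depends on the angular cut, one can integrate the Bhabha angular distribution over $-\cos\theta_{\rm cut} \le \cos\theta \le \cos\theta_{\rm cut}$. The sketch below does this with a simple Simpson rule and compares against an antiderivative worked out for this check in the variable $y = 1 - \cos\theta$. Results are in units of $\pi\alpha^2/(2s)$; the helper names are hypothetical, and no detector effects are modeled.

```python
import math

def bhabha_shape(x):
    """Angular factor ((3 + x^2) / (1 - x))^2 with x = cos(theta),
    i.e. d(sigma)/d(cos theta) in units of pi alpha^2 / (2 s)."""
    return ((3.0 + x * x) / (1.0 - x)) ** 2

def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def sigma_cut_numeric(theta_cut):
    """Integral of the angular factor between -cos(theta_cut) and +cos(theta_cut)."""
    c = math.cos(theta_cut)
    return simpson(bhabha_shape, -c, c)

def sigma_cut_exact(theta_cut):
    """Same integral from the antiderivative of
    16/y^2 - 16/y + 12 - 4y + y^2, where y = 1 - cos(theta)."""
    c = math.cos(theta_cut)

    def antiderivative(y):
        return -16.0 / y - 16.0 * math.log(y) + 12.0 * y - 2.0 * y * y + y**3 / 3.0

    return antiderivative(1.0 + c) - antiderivative(1.0 - c)

for theta_cut in (0.5, 0.3, 0.1):
    print(theta_cut, sigma_cut_numeric(theta_cut), sigma_cut_exact(theta_cut))
```

As the cut angle is lowered the integral grows roughly like $32/\theta_{\rm cut}^2$ in these units, so a quoted Bhabha cross-section is only meaningful together with its angular acceptance.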
In terms of the Feynman diagram interpretation, the divergence for small θ corre-
sponds to the photon propagator going on-shell; in other words, the situation where
the square of the t-channel virtual photon’s 4-momentum is nearly equal to 0, the
classical value for a real massless photon. For any scattering angle θ > 0, one has
$$(p_a - k_1)^2 = t = -\frac{s}{2} (1 - \cos\theta) < 0, \tag{5.190}$$
so that the virtual photon is said to be off-shell. In general, any time that a virtual
(internal line) particle can go on-shell, there will be a divergence in the cross-section
due to the denominator of the Feynman propagator blowing up. Sometimes this is
a real divergence with a physical interpretation, as in the case of Bhabha scattering.
In other cases, the divergence is removed by higher-order effects, such as the finite
life-time of the virtual particle, which will give an imaginary part to its squared mass,
removing the singularity in the Feynman propagator.
e− e+ → μ− μ+ (5.191)
e− μ+ → e− μ+ . (5.192)
The relevant Feynman diagrams for these two processes are very similar:
In fact, by stretching and twisting, one can turn the first process into the second by
the transformation:
Two processes related to each other by exchanging some initial state particles with
their antiparticles in the final state are said to be related by crossing. Not surprisingly,
5.3 Crossing Symmetry 133
This similarity is generalized and made more precise by the following theorem.
Crossing Symmetry Theorem: Suppose two Feynman diagrams with reduced matrix elements $\mathcal{M}$ and $\mathcal{M}'$ are related by the exchange (“crossing”) of some initial state particles and
antiparticles for the corresponding final state antiparticles and particles. If the crossed particles have 4-momenta $P_1^\mu, \ldots, P_n^\mu$ in $\mathcal{M}$, then $\sum_{\rm spins} |\mathcal{M}|^2$ can be obtained by substituting
$P_i'^\mu = -P_i^\mu$ into the mathematical expression for $\sum_{\rm spins} |\mathcal{M}'|^2$, as follows:
$$\sum_{\rm spins} |\mathcal{M}(P_1^\mu, \ldots, P_n^\mu, \ldots)|^2 = (-1)^F \sum_{\rm spins} |\mathcal{M}'(P_1'^\mu, \ldots, P_n'^\mu, \ldots)|^2 \,\Big|_{P_i'^\mu \to -P_i^\mu}, \tag{5.197}$$
with the other (uncrossed) particle 4-momenta unaffected. Here F is the number of fermion
lines that were crossed.
5.3.1 e− μ+ → e− μ+ and e− μ− → e− μ−
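Let us apply crossing symmetry to the example of (5.191) and (5.192) by assigning
primed momenta to the Feynman diagram for e− e+ → μ− μ+ :
$$\text{initial state} \begin{cases} e^- \leftrightarrow p_a' \\ e^+ \leftrightarrow p_b' \end{cases} \tag{5.198}$$
$$\text{final state} \begin{cases} \mu^- \leftrightarrow k_1' \\ \mu^+ \leftrightarrow k_2'. \end{cases} \tag{5.199}$$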
Then the Crossing Symmetry Theorem tells us that we can get the reduced matrix ele-
ment for the process e− μ+ → e− μ+ as a function of physical momenta pa , pb , k1 , k2
by substituting unphysical momenta
into the formula for the reduced matrix element for the process e− e+ → μ− μ+ .
This means that we can identify:
In other words, crossing symmetry tells us that the formulas for the reduced matrix
elements for these two processes are just related by the exchange of s and t, as illus-
trated in the high-energy limit in (5.195) and (5.196). Since we had already derived
the result for the first process in Sect. 5.2.1, the second result has been obtained for
free. Note that we could have obtained the particular result (5.196) even more easily
just by noting that the calculation for the reduced matrix element of e− μ+ → e− μ+
is exactly the same as for Bhabha scattering, except that only the t-channel diagram
exists in the former case. So one only keeps the contribution with t (not s or u) in
the denominator, since that corresponds to the t-channel diagram.
We can carry this further by considering another process also related by crossing
to the two just studied:
e− μ− → e− μ− , (5.206)
This time, the Crossing Symmetry Theorem tells us that we can identify the matrix
element by again starting with the reduced matrix element for e− e+ → μ− μ+ and
replacing:
so that
Here the primed Mandelstam variables are the unphysical ones for the e− e+ → μ− μ+
process, and the unprimed ones are for the desired process e− μ− → e− μ− . We can
therefore infer, from (5.195), that
$$\frac{1}{4} \sum_{\rm spins} |\mathcal{M}_{e^-\mu^- \to e^-\mu^-}|^2 = 2 e^4 \left( \frac{s^2 + u^2}{t^2} \right), \tag{5.213}$$
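Therefore, putting (5.213) into (4.192) with $|\mathbf{p}_a| = |\mathbf{k}_1|$ and using $e^2 = 4\pi\alpha$, we
obtain the differential cross-section for $e^-\mu^\pm \to e^-\mu^\pm$:
$$\frac{d\sigma}{d(\cos\theta)} = \frac{\pi\alpha^2}{s} \left( \frac{u^2 + s^2}{t^2} \right) \tag{5.215}$$
$$= \frac{\pi\alpha^2}{s}\, \frac{5 + 2\cos\theta + \cos^2\theta}{(1 - \cos\theta)^2}. \tag{5.216}$$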
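The s ↔ t exchange described above is easy to express in code. The sketch below starts from the massless spin-averaged result $2e^4(t^2+u^2)/s^2$ for $e^-e^+ \to \mu^-\mu^+$ (the same expression that appears as the pure s-channel piece in (5.182)), swaps the roles of s and t, and converts to a differential cross-section with the factor $1/(32\pi s)$. It then compares with the expression in terms of cos θ obtained by inserting (5.180)–(5.181) into (5.215). The value of α and the function names are assumptions for illustration.

```python
import math

ALPHA = 1.0 / 137.036  # fine-structure constant (approximate value, assumed here)

def msq_ee_to_mumu(s, t, u, alpha=ALPHA):
    """Spin-averaged |M|^2 for massless e- e+ -> mu- mu+: 2 e^4 (t^2 + u^2) / s^2."""
    e4 = (4.0 * math.pi * alpha) ** 2
    return 2.0 * e4 * (t * t + u * u) / (s * s)

def msq_emu_to_emu(s, t, u, alpha=ALPHA):
    """Crossed process e- mu -> e- mu: exchange the roles of s and t."""
    return msq_ee_to_mumu(t, s, u, alpha)

def dsigma_dcos_from_crossing(s, cos_theta, alpha=ALPHA):
    """d(sigma)/d(cos theta) = |M|^2 / (32 pi s) for equal in and out momenta."""
    t = -0.5 * s * (1.0 - cos_theta)
    u = -0.5 * s * (1.0 + cos_theta)
    return msq_emu_to_emu(s, t, u, alpha) / (32.0 * math.pi * s)

def dsigma_dcos_in_angle(s, cos_theta, alpha=ALPHA):
    """The same quantity written directly in terms of cos(theta)."""
    c = cos_theta
    return math.pi * alpha**2 / s * (5.0 + 2.0 * c + c * c) / (1.0 - c) ** 2

for c in (-0.8, 0.0, 0.6):
    print(c, dsigma_dcos_from_crossing(1.0, c), dsigma_dcos_in_angle(1.0, c))
```

Reusing one function with permuted arguments mirrors the statement that the second result comes "for free" once the first has been computed.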
e− e− → e− e− . (5.217)
Now, nobody can stop us from getting the result for this process by applying
the Feynman rules to get the reduced matrix element, taking the complex square,
summing and averaging over spins, and computing the Dirac traces. However, an
easier way is to note that this is a crossed version of Bhabha scattering, which we
studied earlier. Making a table of the momenta:
we see that crossing symmetry allows us to compute the Møller scattering by iden-
tifying the (initial state positron, final state positron) in Bhabha scattering with the
(final state electron, initial state electron) in Møller scattering, so that:
into the corresponding result for Bhabha scattering. Using the results of (5.182)–
(5.184), we get:
$$\frac{1}{4} \sum_{\rm spins} |\mathcal{M}_{e^-e^- \to e^-e^-}|^2 = 2 e^4 \left[ \frac{s^2 + t^2}{u^2} + \frac{s^2 + u^2}{t^2} + \frac{2 s^2}{u t} \right]. \tag{5.227}$$
5.4 Gauge Invariance in Feynman Diagrams 137
Again, if we keep only the t-channel part (that is, the part with t 2 in the denominator),
we recover the result for e− μ± → e− μ± in the previous section.
Applying (4.192), we find the differential cross-section:
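$$\frac{d\sigma_{e^-e^- \to e^-e^-}}{d(\cos\theta)} = \frac{1}{32\pi s} \left( \frac{1}{4} \sum_{\rm spins} |\mathcal{M}_{e^-e^- \to e^-e^-}|^2 \right) \tag{5.228}$$
$$= \frac{\pi\alpha^2}{s} \left[ \frac{s^2 + t^2}{u^2} + \frac{s^2 + u^2}{t^2} + \frac{2 s^2}{u t} \right] \tag{5.229}$$
$$= \frac{2\pi\alpha^2}{s} \left( \frac{3 + \cos^2\theta}{1 - \cos^2\theta} \right)^2. \tag{5.230}$$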
Just as in the cases in the previous sections, t = −s(1 − cos θ )/2 and u = −s(1 +
cos θ )/2 where θ is the angle between the initial-state and final-state electrons. How-
ever, in this case there is a special feature, because the two electrons in the final state
are indistinguishable particles. This means that the final state with an electron com-
ing out at angles (θ, φ) is actually the same quantum state as the one with an electron
coming out at angles (π − θ, −φ). (As a check, note that (5.230) is invariant under
cos θ → − cos θ . We have already integrated over the angle φ.) Therefore, to avoid
overcounting we must only integrate over half the range of θ , or equivalently divide
the total cross-section by 2. So, we have a tricky and crucial factor of 1/2 in the total
cross-section:
$$\sigma_{e^-e^- \to e^-e^-} = \frac{1}{2} \int_{-\cos\theta_{\rm cut}}^{\cos\theta_{\rm cut}} \frac{d\sigma_{e^-e^- \to e^-e^-}}{d(\cos\theta)}\, d(\cos\theta). \tag{5.231}$$
To obtain a finite value for the total cross-section, we had to also impose a cut on the
minimum scattering angle θcut that we require in order to say that a scattering event
should be counted.
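The Møller formulas above lend themselves to three quick numerical checks: that (5.229) and (5.230) agree, that the distribution is symmetric under cos θ → −cos θ, and that the factor of 1/2 in (5.231) is equivalent to integrating over only half of the angular range. The sketch below is one way to do this; the value of α, the Simpson integrator, and the function names are assumptions for illustration.

```python
import math

ALPHA = 1.0 / 137.036  # fine-structure constant (approximate value, assumed here)

def moller_from_mandelstam(s, c, alpha=ALPHA):
    """d(sigma)/d(cos theta) in the form (5.229); c = cos(theta)."""
    t = -0.5 * s * (1.0 - c)
    u = -0.5 * s * (1.0 + c)
    bracket = (s * s + t * t) / u**2 + (s * s + u * u) / t**2 + 2.0 * s * s / (u * t)
    return math.pi * alpha**2 / s * bracket

def moller_closed_form(s, c, alpha=ALPHA):
    """d(sigma)/d(cos theta) in the form (5.230)."""
    return 2.0 * math.pi * alpha**2 / s * ((3.0 + c * c) / (1.0 - c * c)) ** 2

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def sigma_moller(s, theta_cut):
    """Total cross-section with an angular cut, including the factor 1/2 of (5.231)."""
    c = math.cos(theta_cut)
    return 0.5 * simpson(lambda x: moller_closed_form(s, x), -c, c)

def sigma_moller_half_range(s, theta_cut):
    """Alternative: integrate over 0 <= cos(theta) <= cos(theta_cut) only."""
    c = math.cos(theta_cut)
    return simpson(lambda x: moller_closed_form(s, x), 0.0, c)

print(moller_from_mandelstam(1.0, 0.4), moller_closed_form(1.0, 0.4))
print(sigma_moller(1.0, 0.5), sigma_moller_half_range(1.0, 0.5))
```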
Let us now turn to the issue of gauge invariance as it is manifested in QED Feyn-
man diagrams. Recall that when we found the Feynman propagator for a photon, it
contained a term that depended on an arbitrary parameter ξ . We have been work-
ing with ξ = 1 (Feynman gauge). Consider what the matrix element for the process
e− e+ → μ− μ+ would be if instead we let ξ remain unfixed. Instead of (5.42), we
would have
$$\mathcal{M} = (\bar v_b \gamma^\mu u_a)(\bar u_1 \gamma^\nu v_2)\, \frac{i e^2}{(p_a + p_b)^2} \left[ -g_{\mu\nu} + (1 - \xi) \frac{(p_a + p_b)_\mu (p_a + p_b)_\nu}{(p_a + p_b)^2} \right]. \tag{5.232}$$
If the answer is to be independent of ξ , then it must be true that the new term
proportional to (1 − ξ ) gives no contribution. This can be easily proved by observing
that it contains the factor
$$(\bar v_b \gamma^\mu u_a)(p_a + p_b)_\mu = \bar v_b \slashed{p}_a u_a + \bar v_b \slashed{p}_b u_a = m \bar v_b u_a - m \bar v_b u_a = 0. \tag{5.233}$$
Here we have applied the Dirac equation, as embodied in (A.2.24) and (A.2.25),
to write $\slashed{p}_a u_a = m u_a$ and $\bar v_b \slashed{p}_b = -m \bar v_b$. For any photon propagator connected (at
either end) to an external fermion line, the proof is similar. And, in general, one
can show that the 1 − ξ term will always cancel when one includes all Feynman
diagrams contributing to a particular process. So we can choose the most convenient
value of ξ , which is usually ξ = 1.
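The cancellation in (5.233) can be verified with explicit matrices. The sketch below uses the Dirac representation of the gamma matrices with metric signature (+,−,−,−), which is an assumption about conventions, and it builds spinors as $u = (\slashed{p} + m)\xi$ and $v = (\slashed{p} - m)\xi$ for an arbitrary seed vector ξ. These automatically obey $\slashed{p} u = m u$ and $\slashed{p} v = -m v$ for on-shell p, but they are not normalized and do not correspond to definite spin states. All names are illustrative.

```python
import math

# Dirac-representation gamma matrices gamma^0 .. gamma^3, metric (+,-,-,-).
GAMMA = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]],
    [[0, 0, 0, 1], [0, 0, 1, 0], [0, -1, 0, 0], [-1, 0, 0, 0]],
    [[0, 0, 0, -1j], [0, 0, 1j, 0], [0, 1j, 0, 0], [-1j, 0, 0, 0]],
    [[0, 0, 1, 0], [0, 0, 0, -1], [-1, 0, 0, 0], [0, 1, 0, 0]],
]
METRIC = (1, -1, -1, -1)
IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

def on_shell(m, px, py, pz):
    """Four-momentum (E, px, py, pz) with E fixed by p^2 = m^2."""
    return (math.sqrt(m * m + px * px + py * py + pz * pz), px, py, pz)

def slash(p):
    """p-slash = gamma^mu p_mu for contravariant components p = (E, px, py, pz)."""
    return [[sum(METRIC[mu] * p[mu] * GAMMA[mu][i][j] for mu in range(4))
             for j in range(4)] for i in range(4)]

def add_scalar(mat, c):
    """Return mat + c * identity."""
    return [[mat[i][j] + (c if i == j else 0) for j in range(4)] for i in range(4)]

def mat_vec(mat, vec):
    return [sum(mat[i][j] * vec[j] for j in range(4)) for i in range(4)]

def bar_sandwich(left, mat, right):
    """Compute left-bar . mat . right, where left-bar = left^dagger gamma^0."""
    g0_m_right = mat_vec(GAMMA[0], mat_vec(mat, right))
    return sum(complex(left[i]).conjugate() * g0_m_right[i] for i in range(4))

m = 0.5
pa = on_shell(m, 0.3, -0.4, 2.0)
pb = on_shell(m, -0.1, 0.7, -1.5)
seed = [1.0, 0.5 - 0.2j, -0.3j, 0.8]  # arbitrary, only used to generate solutions

u_a = mat_vec(add_scalar(slash(pa), +m), seed)  # obeys pslash_a u_a = +m u_a
v_b = mat_vec(add_scalar(slash(pb), -m), seed)  # obeys pslash_b v_b = -m v_b

q = tuple(pa[mu] + pb[mu] for mu in range(4))
term_a = bar_sandwich(v_b, slash(pa), u_a)       # equals +m vbar_b u_a
term_b = bar_sandwich(v_b, slash(pb), u_a)       # equals -m vbar_b u_a
vbar_u = bar_sandwich(v_b, IDENTITY, u_a)
contraction = bar_sandwich(v_b, slash(q), u_a)   # left-hand side of (5.233)
print(abs(term_a - m * vbar_u), abs(term_b + m * vbar_u), abs(contraction))
```

All three printed numbers are zero up to floating-point rounding, so the $(1-\xi)$ term in the propagator drops out of this amplitude for any choice of the external spinors.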
Another aspect of gauge invariance involves a feature that we have not explored
in an example so far: external state photons. Recall that the Feynman rules associate
factors of $\epsilon_\mu(p, \lambda)$ and $\epsilon^*_\mu(p, \lambda)$ to initial or final state photons, respectively. Now,
making a gauge transformation on the photon field results in:
Aμ → Aμ + ∂ μ (5.234)
$$\epsilon^\mu(p, \lambda) \to \epsilon^\mu(p, \lambda) + a p^\mu \tag{5.235}$$
where a is any quantity. The polarization vector and momentum for a physical photon
satisfy $\epsilon^2 = -1$ and $\epsilon \cdot p = 0$ and $p^2 = 0$. As a consistency check, note that if these
relations are satisfied, then they will also be obeyed after the gauge transformation
(5.235).
Gauge invariance implies that the reduced matrix element should also be
unchanged after the substitution in (5.235). The reduced matrix element for a process
with an external state photon with momentum p μ and polarization label λ can always
be written in the form:
$$\mathcal{M} = \mathcal{M}_\mu\, \epsilon^\mu(p, \lambda), \tag{5.236}$$
$$\mathcal{M}_\mu\, p^\mu = 0. \tag{5.237}$$
This relation is known as the Ward identity for QED. It says that if we replace the
polarization vector for any photon by the momentum of that photon, then the reduced
matrix element should become 0. This is a nice consistency check on calculations.
5.5 External Photon Scattering 139
Another nice consequence of the Ward identity is that it provides for a simplified
way to sum or average over unmeasured photon polarization states. Consider a photon with momentum taken to be along the positive z axis, with $p^\mu = (P, 0, 0, P)$.
Summing over the two polarization vectors in (5.26)–(5.27), we have:
$$\sum_{\lambda=1}^{2} |\mathcal{M}|^2 = \sum_{\lambda=1}^{2} \mathcal{M}_\mu \mathcal{M}^*_\nu\, \epsilon^\mu(p, \lambda)\, \epsilon^{*\nu}(p, \lambda) = |\mathcal{M}_1|^2 + |\mathcal{M}_2|^2, \tag{5.238}$$
The last equation is written in a Lorentz invariant form, so it is true for any photon
momentum direction, not just momenta oriented along the z direction. Gauge invari-
ance, as expressed by the Ward identity, therefore implies that we can always sum
over a photon’s polarization states by the rule:
$$\sum_{\lambda=1}^{2} \epsilon^\mu(p, \lambda)\, \epsilon^{*\nu}(p, \lambda) = -g^{\mu\nu} + (\text{irrelevant})^{\mu\nu}, \tag{5.240}$$
as long as we are taking the sum of the complex square of a reduced matrix element.
Although the $(\text{irrelevant})^{\mu\nu}$ part is non-zero, it must vanish when contracted with
$\mathcal{M}_\mu \mathcal{M}^*_\nu$, according to (5.239).
$$\gamma e^- \to \gamma e^-. \tag{5.241}$$
initial $\gamma$: $\epsilon^\mu(p_a, \lambda_a)$
initial $e^-$: $u(p_b, s_b)$
final $\gamma$: $\epsilon^{\nu *}(k_1, \lambda_1)$
final $e^-$: $\bar u(k_2, s_2)$.
which are s-channel and u-channel, respectively. Applying the QED Feynman rules,
we obtain:
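$$\mathcal{M}_s = \bar u_2 (ie\gamma^\nu) \left[ \frac{i(\slashed{p}_a + \slashed{p}_b + m)}{(p_a + p_b)^2 - m^2} \right] (ie\gamma^\mu) u_b\, \epsilon^*_{1\nu} \epsilon_{a\mu} \tag{5.242}$$
$$= -i \frac{e^2}{s - m^2}\, \bar u_2 \gamma^\nu (\slashed{p}_a + \slashed{p}_b + m) \gamma^\mu u_b\, \epsilon^*_{1\nu} \epsilon_{a\mu} \tag{5.243}$$
and
$$\mathcal{M}_u = \bar u_2 (ie\gamma^\mu) \left[ \frac{i(\slashed{p}_b - \slashed{k}_1 + m)}{(p_b - k_1)^2 - m^2} \right] (ie\gamma^\nu) u_b\, \epsilon^*_{1\nu} \epsilon_{a\mu} \tag{5.244}$$
$$= -i \frac{e^2}{u - m^2}\, \bar u_2 \gamma^\mu (\slashed{p}_b - \slashed{k}_1 + m) \gamma^\nu u_b\, \epsilon^*_{1\nu} \epsilon_{a\mu}. \tag{5.245}$$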
Before squaring the total reduced matrix element, it is useful to simplify. So we note
that:
Now we multiply together (5.250) and (5.251), and average over the initial photon
polarization λa and sum over the final photon polarization λ1 , using
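$$\frac{1}{2} \sum_{\lambda_a = 1}^{2} \epsilon_{a\mu} \epsilon^*_{a\rho} = -\frac{1}{2} g_{\mu\rho} + \text{irrelevant}, \tag{5.252}$$
$$\sum_{\lambda_1 = 1}^{2} \epsilon^*_{1\nu} \epsilon_{1\sigma} = -g_{\nu\sigma} + \text{irrelevant}, \tag{5.253}$$
to obtain:
$$\frac{1}{2} \sum_{\lambda_a, \lambda_1} |\mathcal{M}|^2 = \frac{e^4}{2} \left\{ \bar u_2 \left[ \frac{1}{s - m^2} (\gamma^\nu \slashed{p}_a \gamma^\mu + 2 p_b^\mu \gamma^\nu) + \frac{1}{u - m^2} (-\gamma^\mu \slashed{k}_1 \gamma^\nu + 2 p_b^\nu \gamma^\mu) \right] u_b \right\}$$
$$\times \left\{ \bar u_b \left[ \frac{1}{s - m^2} (\gamma_\mu \slashed{p}_a \gamma_\nu + 2 p_{b\mu} \gamma_\nu) + \frac{1}{u - m^2} (-\gamma_\nu \slashed{k}_1 \gamma_\mu + 2 p_{b\nu} \gamma_\mu) \right] u_2 \right\}. \tag{5.254}$$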
Next we can average over sb , and sum over s2 , using the usual tricks:
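$$\frac{1}{2} \sum_{s_b} u_b \bar u_b = \frac{1}{2} (\slashed{p}_b + m), \tag{5.255}$$
$$\sum_{s_2} \bar u_2 \ldots u_2 = \mathrm{Tr}[\ldots (\slashed{k}_2 + m)]. \tag{5.256}$$
$$\frac{1}{4} \sum_{\rm spins} |\mathcal{M}|^2 = \frac{e^4}{4}\, \mathrm{Tr}\Big[ \Big\{ \frac{1}{s - m^2} (\gamma^\nu \slashed{p}_a \gamma^\mu + 2 p_b^\mu \gamma^\nu) + \frac{1}{u - m^2} (-\gamma^\mu \slashed{k}_1 \gamma^\nu + 2 p_b^\nu \gamma^\mu) \Big\} (\slashed{p}_b + m)$$
$$\times \Big\{ \frac{1}{s - m^2} (\gamma_\mu \slashed{p}_a \gamma_\nu + 2 p_{b\mu} \gamma_\nu) + \frac{1}{u - m^2} (-\gamma_\nu \slashed{k}_1 \gamma_\mu + 2 p_{b\nu} \gamma_\mu) \Big\} (\slashed{k}_2 + m) \Big]. \tag{5.257}$$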
Doing this trace requires a little patience and organization. The end result can be
written compactly in terms of
$$p_a \cdot p_b = \frac{s - m^2}{2}, \tag{5.258}$$
$$p_b \cdot k_1 = \frac{m^2 - u}{2}. \tag{5.259}$$
Equation (5.260) is a Lorentz scalar. We can now find the differential cross-section
after choosing a reference frame. We will do this first in the center-of-momentum
frame, and then redo it in the “lab” frame in which the initial electron is at rest.
In the center-of-momentum frame, the kinematics is just like in the case eμ → eμ.
Call the magnitude of the 3-momentum of the photon in the initial state P. Then
using four-momentum conservation and the on-shell conditions $p_a^2 = k_1^2 = 0$ and
$p_b^2 = k_2^2 = m^2$, and taking the initial state photon momentum to be in the +z direction
and the final state photon momentum to make an angle θ with the z-axis, we have:
The initial and final state photons have the same energy, as do the initial and final
state electrons, so define:
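$$E_\gamma = P, \tag{5.265}$$
$$E_e = \sqrt{P^2 + m^2}. \tag{5.266}$$
Then we have:
$$s = (E_e + E_\gamma)^2, \tag{5.267}$$
$$p_a \cdot p_b = E_\gamma (E_e + E_\gamma), \tag{5.268}$$
$$p_b \cdot k_1 = E_\gamma (E_e + E_\gamma \cos\theta), \tag{5.269}$$
$$\frac{|\mathbf{k}_1|}{|\mathbf{p}_a|} = 1. \tag{5.270}$$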
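$$\int_{-1}^{1} \frac{E_e + E_\gamma}{E_e + E_\gamma \cos\theta}\, d(\cos\theta) = \frac{E_e + E_\gamma}{E_\gamma} \ln\left( \frac{E_e + E_\gamma}{E_e - E_\gamma} \right). \tag{5.272}$$
Therefore, using $s \approx 4 E_\gamma^2$,
$$\int_{-1}^{1} \frac{E_e + E_\gamma}{E_e + E_\gamma \cos\theta}\, d(\cos\theta) = 2 \ln(s/m^2) + O(m^2), \tag{5.274}$$
with the dominant contribution coming from cos θ near −1, where the denominator
of the integrand becomes small. Integrating the second term in (5.271), one finds:
$$\int_{-1}^{1} \frac{E_e + E_\gamma \cos\theta}{E_e + E_\gamma}\, d(\cos\theta) = \frac{2 E_e}{E_e + E_\gamma} = 1 + O(m^2). \tag{5.275}$$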
$$\sigma = \int_{-1}^{1} \frac{d\sigma}{d(\cos\theta)}\, d(\cos\theta) = \frac{\pi\alpha^2}{s} \left[ 2 \ln(s/m^2) + 1 \right]. \tag{5.276}$$
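The angular integral (5.272), which produces the logarithm in (5.276), can be checked numerically. The sketch below compares the closed form with a Simpson-rule integration and with the high-energy approximation $2\ln(s/m^2)$; it relies on the identity $(E_e + E_\gamma)/(E_e - E_\gamma) = s/m^2$, which follows from (5.265)–(5.267). Function names and the sample values of P and m are assumptions for illustration.

```python
import math

def energies(P, m):
    """Center-of-momentum energies (E_e, E_gamma) from (5.265)-(5.266)."""
    return math.sqrt(P * P + m * m), P

def log_integral_closed(P, m):
    """Right-hand side of (5.272)."""
    e_el, e_ph = energies(P, m)
    return (e_el + e_ph) / e_ph * math.log((e_el + e_ph) / (e_el - e_ph))

def log_integral_numeric(P, m, n=20000):
    """Left-hand side of (5.272) by the composite Simpson rule (n even)."""
    e_el, e_ph = energies(P, m)

    def integrand(x):
        return (e_el + e_ph) / (e_el + e_ph * x)

    h = 2.0 / n
    total = integrand(-1.0) + integrand(1.0)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * integrand(-1.0 + i * h)
    return total * h / 3.0

def log_integral_high_energy(P, m):
    """Approximation 2 ln(s / m^2), valid when P is much larger than m."""
    e_el, e_ph = energies(P, m)
    s = (e_el + e_ph) ** 2
    return 2.0 * math.log(s / (m * m))

print(log_integral_closed(5.0, 1.0), log_integral_numeric(5.0, 1.0))
print(log_integral_closed(1000.0, 1.0), log_integral_high_energy(1000.0, 1.0))
```

The integrand is sharply peaked near cos θ = −1 when P is much larger than m, so a uniform grid needs many points there; this is the same region that dominates the cross-section.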
Then, in terms of the lab photon scattering angle θ , the 4-momenta are:
Here we have used four-momentum conservation and the on-shell conditions $p_a^2 = k_1^2 = 0$ and $p_b^2 = m^2$. Applying the last on-shell condition $k_2^2 = m^2$ now leads to:
$$p_a \cdot p_b = \omega m; \tag{5.284}$$
$$p_b \cdot k_1 = \omega' m, \tag{5.285}$$
where $d\Phi_2$ is the two-body Lorentz-invariant phase space, as defined in (4.176), for
the final state particles in the lab frame. Evaluating the prefactors for the case at
hand:
E a = ω, (5.290)
E b = m, (5.291)
|va − vb | = 1 − 0 = 1. (5.292)
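$$d\Phi_2 = (2\pi)^4 \delta^{(4)}(k_1 + k_2 - p_a - p_b)\, \frac{d^3\mathbf{k}_1}{(2\pi)^3 2E_1}\, \frac{d^3\mathbf{k}_2}{(2\pi)^3 2E_2} \tag{5.293}$$
$$= \delta^{(3)}(\mathbf{k}_1 + \mathbf{k}_2 - \omega\hat{z})\, \delta(\omega' + E_2 - \omega - m)\, \frac{d^3\mathbf{k}_1}{16\pi^2 \omega' E_2}\, d^3\mathbf{k}_2, \tag{5.294}$$
where $\omega'$ is now defined to be equal to $E_1 = |\mathbf{k}_1|$ and $E_2$ is defined to be $\sqrt{|\mathbf{k}_2|^2 + m^2}$.
Performing the $\mathbf{k}_2$ integral using the 3-momentum delta function just sets $\mathbf{k}_2 = \omega\hat{z} - \mathbf{k}_1$, resulting in:
$$d\Phi_2 = \delta(\omega' + E_2 - \omega - m)\, \frac{d^3\mathbf{k}_1}{16\pi^2 \omega' E_2}, \tag{5.295}$$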
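where now
$$E_2 = \sqrt{\omega^2 - 2\omega\omega' \cos\theta + \omega'^2 + m^2}. \tag{5.296}$$
Therefore,
$$d\Phi_2 = \delta(\omega' + E_2 - \omega - m)\, \frac{d(\cos\theta)\, \omega'\, d\omega'}{8\pi E_2}. \tag{5.298}$$
$$K = \omega' + E_2 - \omega - m, \tag{5.299}$$
we have
$$\frac{dK}{d\omega'} = 1 + \frac{\omega' - \omega\cos\theta}{E_2}. \tag{5.300}$$
Therefore,
$$\frac{\omega'\, d\omega'}{E_2} = \frac{\omega'\, dK}{E_2 + \omega' - \omega\cos\theta}. \tag{5.301}$$
$$d\Phi_2 = \frac{\omega'}{m + \omega(1 - \cos\theta)}\, \frac{d(\cos\theta)}{8\pi} = \frac{\omega'^2}{8\pi m \omega}\, d(\cos\theta), \tag{5.302}$$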
where (5.282) has been used to simplify the denominator. Finally using this in (5.289)
yields:
$$d\sigma = \left( \frac{1}{4} \sum_{\rm spins} |\mathcal{M}|^2 \right) \frac{\omega'^2}{32\pi m^2 \omega^2}\, d(\cos\theta), \tag{5.303}$$
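The lab-frame kinematics used in this derivation can be checked with a few lines of code. The sketch below takes the scattered photon energy to be $\omega' = m\omega/[m + \omega(1 - \cos\theta)]$, which is the relation implied by the equality of the two forms in (5.302), and verifies that it makes the energy-conservation combination K of (5.299) vanish when $E_2$ is computed from (5.296). The function names and sample numbers are assumptions for illustration.

```python
import math

def omega_prime(omega, m, cos_theta):
    """Scattered photon energy in the lab frame, as implied by (5.302)."""
    return m * omega / (m + omega * (1.0 - cos_theta))

def electron_energy(omega, omega_p, m, cos_theta):
    """Final electron energy E_2 from (5.296)."""
    return math.sqrt(omega**2 - 2.0 * omega * omega_p * cos_theta + omega_p**2 + m**2)

def energy_mismatch(omega, m, cos_theta):
    """K = omega' + E_2 - omega - m from (5.299); zero when energy is conserved."""
    omega_p = omega_prime(omega, m, cos_theta)
    return omega_p + electron_energy(omega, omega_p, m, cos_theta) - omega - m

def phase_space_per_dcos(omega, m, cos_theta):
    """Both forms of d(Phi_2)/d(cos theta) in (5.302), returned as a pair."""
    omega_p = omega_prime(omega, m, cos_theta)
    first = omega_p / (m + omega * (1.0 - cos_theta)) / (8.0 * math.pi)
    second = omega_p**2 / (8.0 * math.pi * m * omega)
    return first, second

m_e = 0.511  # electron mass in MeV (approximate, assumed here)
for cos_theta in (-1.0, -0.3, 0.5, 1.0):
    print(cos_theta, omega_prime(2.0, m_e, cos_theta), energy_mismatch(2.0, m_e, cos_theta))
```

In the forward direction the photon keeps all of its energy, while for backward scattering at high energy $\omega'$ approaches m/2 regardless of ω.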
$$\sigma = \frac{\pi\alpha^2}{s} \left[ 2 \ln(s/m^2) + 1 + \cdots \right]. \tag{5.307}$$
Equation (5.307) is the same result that we found in the center-of-momentum frame,
(5.276). This is an example of a general fact: the total cross-section does not depend
on the choice of reference frame, provided that one boosts along a direction parallel
to the collision axis. To see why, one need only look at the definition of the total
cross-section given in (4.145). The numbers of particles N S , Na , and Nb can be
simply counted, and so certainly do not depend on any choice of inertial reference
frame, while the area A is invariant under Lorentz boosts along the collision axis.
The low-energy Thomson scattering limit is also interesting. In the lab frame,
$\omega \ll m$ implies $\omega'/\omega = 1$, so that
$$\frac{d\sigma}{d(\cos\theta)} = \frac{\pi\alpha^2}{m^2}\, (1 + \cos^2\theta), \tag{5.308}$$
$$\sigma = \frac{8\pi\alpha^2}{3 m^2}. \tag{5.309}$$
Unlike the case of high-energy Compton scattering, Thomson scattering is symmet-
ric under θ → π − θ , with a factor of 2 enhancement in the forward (θ = 0) and
backward (θ = π ) directions compared to right-angle (θ = π/2) scattering.
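As a worked numerical example, the Thomson formulas (5.308)–(5.309) can be integrated and converted to conventional units. The sketch below assumes standard approximate values for α, ħc, and the electron mass (these constants are inputs chosen here, not taken from the text) and uses 1 barn = 100 fm². The function names are illustrative.

```python
import math

ALPHA = 1.0 / 137.035999      # fine-structure constant (assumed numerical value)
HBARC_MEV_FM = 197.3269804    # hbar * c in MeV fm (assumed numerical value)
M_E_MEV = 0.51099895          # electron mass in MeV (assumed numerical value)

def thomson_dsigma_dcos(cos_theta, m=M_E_MEV):
    """(5.308) in natural units (MeV^-2): pi alpha^2 (1 + cos^2 theta) / m^2."""
    return math.pi * ALPHA**2 / m**2 * (1.0 + cos_theta**2)

def thomson_total(m=M_E_MEV):
    """(5.309) in natural units (MeV^-2): 8 pi alpha^2 / (3 m^2)."""
    return 8.0 * math.pi * ALPHA**2 / (3.0 * m**2)

def thomson_total_numeric(m=M_E_MEV, n=100):
    """Integrate (5.308) over -1 <= cos(theta) <= 1 with Simpson's rule (n even)."""
    h = 2.0 / n
    total = thomson_dsigma_dcos(-1.0, m) + thomson_dsigma_dcos(1.0, m)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * thomson_dsigma_dcos(-1.0 + i * h, m)
    return total * h / 3.0

def natural_to_barn(sigma_mev_minus2):
    """Convert MeV^-2 to barns via (hbar c)^2, with 1 barn = 100 fm^2."""
    return sigma_mev_minus2 * HBARC_MEV_FM**2 / 100.0

print(natural_to_barn(thomson_total()))
print(thomson_dsigma_dcos(1.0) / thomson_dsigma_dcos(0.0))
```

The first number printed is close to 0.665 barn, and the second is the factor of 2 enhancement of forward over right-angle scattering.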
5.5.2 e+ e− → γ γ
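$$\begin{array}{lll}
\text{Compton} & e^+ e^- \to \gamma\gamma & (5.310) \\
\gamma \leftrightarrow p_a' & e^+ \leftrightarrow p_a & (5.311) \\
e^- \leftrightarrow p_b' & e^- \leftrightarrow p_b & (5.312) \\
\gamma \leftrightarrow k_1' & \gamma \leftrightarrow k_1 & (5.313) \\
e^- \leftrightarrow k_2' & \gamma \leftrightarrow k_2. & (5.314)
\end{array}$$
$$\sum_{\rm spins} |\mathcal{M}_{\gamma e^- \to \gamma e^-}|^2 = 8 e^4 \left[ \frac{p_a' \cdot p_b'}{p_b' \cdot k_1'} + \frac{p_b' \cdot k_1'}{p_a' \cdot p_b'} \right], \tag{5.316}$$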
obtained from (5.260). Because the crossing involves one fermion (a final state
electron changes into an initial state positron), there is also a factor of (−1)1 = −1,
according to (5.197). So, the result is:
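$$\sum_{\rm spins} |\mathcal{M}_{e^+ e^- \to \gamma\gamma}|^2 = (-1)\, 8 e^4 \left[ \frac{-k_2 \cdot p_b}{p_b \cdot k_1} + \frac{p_b \cdot k_1}{-k_2 \cdot p_b} \right] \tag{5.317}$$
$$= 8 e^4 \left[ \frac{k_2 \cdot p_b}{p_b \cdot k_1} + \frac{p_b \cdot k_1}{k_2 \cdot p_b} \right]. \tag{5.318}$$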
Therefore,
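$$\frac{1}{4} \sum_{\rm spins} |\mathcal{M}_{e^+ e^- \to \gamma\gamma}|^2 = 2 e^4 \left[ \frac{t}{u} + \frac{u}{t} \right] = 4 e^4\, \frac{1 + \cos^2\theta}{\sin^2\theta}. \tag{5.321}$$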
It is a useful check, and a vindication of the (−1) F factor in the Crossing Symmetry
Theorem, that this is positive! Now plugging this into the formula (4.192) for the
differential cross-section, we get:
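$$\frac{d\sigma}{d(\cos\theta)} = \frac{2\pi\alpha^2}{s}\, \frac{1 + \cos^2\theta}{\sin^2\theta}. \tag{5.322}$$
$$\sigma_{\rm cut} = \frac{1}{2} \int_{-\cos\theta_{\rm cut}}^{\cos\theta_{\rm cut}} \frac{d\sigma}{d(\cos\theta)}\, d(\cos\theta) \tag{5.323}$$
$$= \frac{2\pi\alpha^2}{s} \left[ \ln\left( \frac{1 + \cos\theta_{\rm cut}}{1 - \cos\theta_{\rm cut}} \right) - \cos\theta_{\rm cut} \right]. \tag{5.324}$$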
(You can easily check that this is a positive and increasing function of cos θcut .) On
the other hand, if you are interested in the total cross-section for electron-positron
annihilation with no cuts applied on the angle, then you must take into account the
non-zero electron mass. Redoing everything with $m \ll \sqrt{s}$ but non-zero, you can
show:
2π α 2 ) s *
σ = ln − 1 . (5.325)
s 2m 2
The logarithmic enhancement at large s in this formula comes entirely from the
sin θ ≈ 0 region. Note that this formula is just what you would have gotten by
plugging in
into (5.324), for small m. In this sense, the finite mass of the electron “cuts off” the
would-be logarithmic divergence of the cross-section for small sin θ .
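The closed form (5.324) for the cut cross-section, including the factor of 1/2 for identical photons, can be confirmed by direct numerical integration of (5.322). The sketch below works in units of $2\pi\alpha^2/s$ and also checks that the result is positive and increasing in $\cos\theta_{\rm cut}$. The Simpson integrator and the function names are assumptions for illustration.

```python
import math

def annihilation_shape(x):
    """(1 + x^2) / (1 - x^2) with x = cos(theta): the angular factor of (5.322),
    i.e. d(sigma)/d(cos theta) in units of 2 pi alpha^2 / s."""
    return (1.0 + x * x) / (1.0 - x * x)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def sigma_cut_numeric(cos_cut):
    """(5.323) in units of 2 pi alpha^2 / s, including the identical-particle 1/2."""
    return 0.5 * simpson(annihilation_shape, -cos_cut, cos_cut)

def sigma_cut_closed(cos_cut):
    """(5.324) in units of 2 pi alpha^2 / s."""
    return math.log((1.0 + cos_cut) / (1.0 - cos_cut)) - cos_cut

for cos_cut in (0.2, 0.5, 0.9):
    print(cos_cut, sigma_cut_numeric(cos_cut), sigma_cut_closed(cos_cut))
```

The closed form grows only logarithmically as $\cos\theta_{\rm cut} \to 1$, in contrast to the power-law growth found for Bhabha and Møller scattering.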
Problems
1. Instead of adding gauge fixing to the QED lagrangian, add a mass for the photon.
Compute the photon propagator.
2. Consider the process of antimuon scattering off of an electron:
μ+ e− → μ+ e− (5.327)
(c) Use the Feynman rules of QED to obtain the reduced matrix element M.
(d) Take the complex square of the reduced matrix element you found. Sum over
final state spins, and average over initial state spins, and simplify. Write the
result in terms of Mandelstam variables s, t, u, and then rewrite it in terms of
P and the scattering angle θ .
(e) Find the differential cross section. Simplify your answer as much as possible.
(f) Now take m μ → 0. What is the differential cross section? You should note
something interesting for a particular value of cos θ .
$$\mu^+_L e^-_L \to \mu^+_L e^-_L; \qquad \mu^+_L e^-_L \to \mu^+_R e^-_L; \tag{5.332}$$
$$\mu^+_L e^-_L \to \mu^+_L e^-_R; \qquad \mu^+_L e^-_L \to \mu^+_R e^-_R; \tag{5.333}$$
$$\mu^+_L e^-_R \to \mu^+_L e^-_L; \qquad \mu^+_L e^-_R \to \mu^+_R e^-_L; \tag{5.334}$$
$$\mu^+_L e^-_R \to \mu^+_L e^-_R; \qquad \mu^+_L e^-_R \to \mu^+_R e^-_R; \tag{5.335}$$
$$\mu^+_R e^-_L \to \mu^+_L e^-_L; \qquad \mu^+_R e^-_L \to \mu^+_R e^-_L; \tag{5.336}$$
$$\mu^+_R e^-_L \to \mu^+_L e^-_R; \qquad \mu^+_R e^-_L \to \mu^+_R e^-_R; \tag{5.337}$$
$$\mu^+_R e^-_R \to \mu^+_L e^-_L; \qquad \mu^+_R e^-_R \to \mu^+_R e^-_L; \tag{5.338}$$
$$\mu^+_R e^-_R \to \mu^+_L e^-_R; \qquad \mu^+_R e^-_R \to \mu^+_R e^-_R. \tag{5.339}$$
Do not compute them. Instead, figure out which ones vanish by helicity
conservation in the limit $m_e, m_\mu \ll \sqrt{s}$.
γ γ → e− e+ (5.340)
to be useful.
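$$\int_{-1}^{1} \frac{1}{1 - a^2 x^2}\, dx = \frac{1}{a} \ln\left( \frac{1 + a}{1 - a} \right), \tag{5.341}$$
$$\int_{-1}^{1} \frac{1}{(1 - a^2 x^2)^2}\, dx = \frac{1}{1 - a^2} + \frac{1}{2a} \ln\left( \frac{1 + a}{1 - a} \right). \tag{5.342}$$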
5. (a) Compute all the reduced matrix elements squared for the different helicity projections of e− μ− → e− μ− . These include $e^-_L \mu^-_L \to e^-_R \mu^-_L$, $e^-_L \mu^-_L \to e^-_L \mu^-_R$, etc.
There are eight of them. Hint: some can be seen to be zero without calculation,
and others can be understood to be the same as another one already computed.
Assume the fermion masses are negligible.
6. Draw the following Feynman diagrams in QED. In each case, write down con-
sistent expressions for the reduced matrix elements in terms of clearly defined
momenta and spins, but you do not need to simplify or evaluate them.
(a) All tree-level diagrams contributing to e− e+ → μ+ μ− γ
(b) All one-loop diagrams contributing to e− e+ → μ+ μ− . [Here, you do not
need to write down the reduced matrix elements for the subset of diagrams
that just involve corrections to external legs. They have to be handled by a
different method.]
(c) A representative diagram contributing to γ γ → γ γ
(d) All diagrams contributing to e− μ+ → μ− e+
6 Decay Processes
Suppose we observe the decays of a large sample of particles of this type, all at rest.
If the number of particles at time t is denoted N (t), then the number of particles
remaining a short time later is therefore:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_6
It follows that
$$\frac{dN}{dt} = \lim_{\Delta t \to 0} \frac{N(t + \Delta t) - N(t)}{\Delta t} = -\Gamma N, \tag{6.3}$$
so that
$$N(t) = e^{-\Gamma t} N(0). \tag{6.4}$$
When one has computed or measured Γ for some particle, it is traditional and sensible to quote the result as measured in the rest frame of the particle. If the particle is moving with velocity β, then because of relativistic time dilation, the survival probability for a particular particle as a function of the laboratory time t is:
$$(\text{Probability of particle survival}) = e^{-\Gamma t \sqrt{1-\beta^2}}. \tag{6.5}$$
The quantity
$$\tau = 1/\Gamma \tag{6.6}$$
is also known as the mean lifetime of the particle at rest. (Putting in the units recovers
(1.2); see (A.1.6).) There is often more than one final state available for a decaying
particle. One can then compute or measure the decay rates into particular final states.
The rate for a particular final state or class of final states is called a partial width.
The sum of all exclusive partial widths should add up to the total decay rate, of
course.
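As a small numerical illustration of (6.5) and (6.6), the following sketch evaluates the survival probability and the mean lifetime. The function names are hypothetical, and the decay rate is assumed to be given in the inverse of whatever time unit is used for the laboratory time.

```python
import math

def mean_lifetime(decay_rate):
    """Mean lifetime at rest, tau = 1/Gamma, as in (6.6)."""
    return 1.0 / decay_rate

def survival_probability(t_lab, decay_rate, beta=0.0):
    """Survival probability after laboratory time t_lab, as in (6.5).

    decay_rate is the rest-frame Gamma; beta is the particle velocity in
    units of c. The factor sqrt(1 - beta^2) accounts for time dilation.
    """
    if not 0.0 <= beta < 1.0:
        raise ValueError("beta must satisfy 0 <= beta < 1")
    return math.exp(-decay_rate * t_lab * math.sqrt(1.0 - beta**2))

def total_decay_rate(partial_widths):
    """Sum of exclusive partial widths (a dict: final state -> width)."""
    return sum(partial_widths.values())
```

For a particle at rest, the survival probability after one mean lifetime is 1/e; for the same laboratory time, a moving particle is more likely to survive.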
Consider a process in which a particle at rest with 4-momentum
$$p^\mu = (M, 0, 0, 0) \tag{6.7}$$
decays to several particles with 4-momenta $k_i$ and masses $m_i$. Given the reduced matrix element $\mathcal{M}$ for this process, one can show by arguments similar to those in (4.5) for cross-sections that the differential decay rate is:
$$d\Gamma = \frac{1}{2M} |\mathcal{M}|^2\, d\Phi_n, \tag{6.8}$$
where
$$d\Phi_n = (2\pi)^4 \delta^{(4)}\Big(p - \sum_i k_i\Big) \prod_{i=1}^{n} \frac{d^3\vec{k}_i}{(2\pi)^3 2E_i} \tag{6.9}$$
is the n-body Lorentz-invariant phase space. (Compare this to (4.176); you will
see that the only difference is that the pa + pb in the 4-momentum delta function
for a scattering process has been replaced by p for a decay process.) To find the
contribution to the decay rate for final-state particles with 3-momenta restricted to be in some ranges, we should integrate dΓ over those ranges. To find the total decay rate Γ, we should integrate the 3-momenta over all available $\vec{k}_i$. The energies in this formula are defined by
$$E_i = \sqrt{\vec{k}_i^{\,2} + m_i^2}. \tag{6.10}$$
6.2 Two-Body Decays
Most of the decay processes that one encounters in high-energy physics are two-particle or three-particle final states. As a general rule, if the number of particles in the
final state is larger, then the decay partial width for that final state tends to be smaller,
so a particle will typically decay into few-particle states if it can. If a two-particle
final state is available, it is usually a very good bet that three-particle final states will
lose. However, there are some exceptions to this, including in the important case of
the Higgs boson.
Let us simplify the formula (6.8) for the case of two-particle final states with
arbitrary masses. The evaluation of the two-particle final-state phase space is exactly
the same as in (4.179)–(4.188), with the simple replacement E CM → M. Therefore,
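$$d\Phi_2 = \frac{K}{16\pi^2 M}\, d\phi_1\, d(\cos\theta_1), \tag{6.11}$$
and
$$d\Gamma = \frac{K}{32\pi^2 M^2}\, |\mathcal{M}|^2\, d\phi_1\, d(\cos\theta_1). \tag{6.12}$$
It remains to solve for K. Energy conservation requires that
$$E_1 + E_2 = \sqrt{K^2 + m_1^2} + \sqrt{K^2 + m_2^2} = M. \tag{6.13}$$
Solving this for the two energies and for K gives
$$E_1 = \frac{M^2 + m_1^2 - m_2^2}{2M}, \tag{6.15}$$
$$E_2 = \frac{M^2 + m_2^2 - m_1^2}{2M}, \tag{6.16}$$
$$K = \frac{\sqrt{\lambda(M^2, m_1^2, m_2^2)}}{2M}, \tag{6.17}$$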
where
$$\lambda(x, y, z) \equiv x^2 + y^2 + z^2 - 2xy - 2xz - 2yz \tag{6.18}$$
is known as the triangle function¹ (or Källén function). It is useful to tabulate results for some common special cases:
• If the final-state masses are equal, $m_1 = m_2 = m$, then the final state particles share the energy equally in the rest frame:
$$E_1 = E_2 = M/2, \tag{6.19}$$
$$K = \frac{M}{2} \sqrt{1 - 4m^2/M^2}, \tag{6.20}$$
and
$$d\Gamma = \frac{|\mathcal{M}|^2}{64\pi^2 M} \sqrt{1 - 4m^2/M^2}\; d\phi_1\, d(\cos\theta_1). \tag{6.21}$$
• If one of the final-state particles is massless, $m_2 = 0$, then:
$$E_1 = \frac{M^2 + m_1^2}{2M}, \tag{6.22}$$
$$E_2 = K = \frac{M^2 - m_1^2}{2M}. \tag{6.23}$$
This illustrates the general feature that since the final state particles have equal 3-momentum magnitudes, the heavier particle gets more energy. In this case,
$$d\Gamma = \frac{|\mathcal{M}|^2}{64\pi^2 M} \left(1 - m_1^2/M^2\right) d\phi_1\, d(\cos\theta_1). \tag{6.24}$$
• If the decaying particle has spin 0, or if its spin is not measured, then there can
be no special direction in the decay, so the final state particles must be distributed
isotropically in the center-of-momentum frame. One then obtains the total decay
rate from
$$\Gamma = \frac{K}{8\pi M^2}\, |\mathcal{M}|^2, \tag{6.25}$$
provided that the two final state particles are distinguishable. There is an extra
factor of 1/2 if they are identical, to avoid counting each final state twice (see the
discussions at the end of Sects. 5.3.2 and 5.5.2).
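The two-body kinematics above can be checked numerically. The following is a minimal sketch with hypothetical function names. It assumes all masses are in the same units, and that the spin-summed $|\mathcal{M}|^2$ is independent of angle, so that integrating (6.12) over the full solid angle 4π gives the total rate.

```python
import math

def kallen(x, y, z):
    """Triangle (Kallen) function lambda(x, y, z)."""
    return x * x + y * y + z * z - 2 * x * y - 2 * x * z - 2 * y * z

def two_body_kinematics(M, m1, m2):
    """Return (E1, E2, K) in the rest frame of the decaying particle."""
    if M < m1 + m2:
        raise ValueError("decay is kinematically forbidden")
    E1 = (M**2 + m1**2 - m2**2) / (2 * M)
    E2 = (M**2 + m2**2 - m1**2) / (2 * M)
    # max() guards against tiny negative values from rounding at threshold
    K = math.sqrt(max(kallen(M**2, m1**2, m2**2), 0.0)) / (2 * M)
    return E1, E2, K

def isotropic_two_body_width(M, m1, m2, matrix_element_sq, identical=False):
    """Total rate from (6.12) integrated over 4*pi for angle-independent |M|^2.

    The factor 1/2 for identical final-state particles avoids double counting.
    """
    _, _, K = two_body_kinematics(M, m1, m2)
    width = K * matrix_element_sq / (8 * math.pi * M**2)
    return width / 2 if identical else width
```

For example, `two_body_kinematics(10, 6, 0)` reproduces the massless special case, with $E_2 = K = (M^2 - m_1^2)/2M$.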
1 It is so named because, if each of $\sqrt{x}$, $\sqrt{y}$, $\sqrt{z}$ is less than the sum of the other two, then $\lambda(x, y, z)$ is −16 times the square of the area of a triangle with sides $\sqrt{x}$, $\sqrt{y}$, $\sqrt{z}$. However, in the present context $M$, $m_1$, $m_2$ never form a triangle; if $M < m_1 + m_2$, then the decay is forbidden.
6.3 Scalar Decays to Fermion-Antifermion Pairs: Higgs Decay
Let us now consider a simple and very important decay process, namely a scalar
particle φ decaying to a fermion-antifermion pair. As a model, let us consider the
Lagrangian already mentioned in Sect. 4.7:
The Feynman rules for fermion external states don’t depend on the choice of inter-
action vertex, so they are the same as for QED. Therefore we can draw the Feynman
diagram:
and immediately write down the reduced matrix element for the decay:
To turn M into a physically observable decay rate, we need to compute the squared
reduced matrix element summed over final state spins. From (6.28),
$$\mathcal{M}^* = i y\, \bar{v}_2 u_1, \tag{6.29}$$
so
where we have used the fact that the trace of an odd number of gamma matrices
vanishes, and (A.2.8) and (A.2.9). The fermion and antifermion have the same mass
m, so
implies that
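$$k_1 \cdot k_2 - m^2 = \frac{M^2}{2} - 2m^2. \tag{6.37}$$
Therefore,
$$\sum_{\rm spins} |\mathcal{M}|^2 = 2 y^2 M^2 \left(1 - \frac{4m^2}{M^2}\right), \tag{6.38}$$
Doing the (trivial) angular integrals finally gives the total decay rate:
$$\Gamma = \frac{y^2 M}{8\pi} \left(1 - \frac{4m^2}{M^2}\right)^{3/2}. \tag{6.40}$$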
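The result (6.40) is simple enough to evaluate directly. Here is a minimal sketch, with a hypothetical function name, that also cross-checks the closed form against the ingredients it was built from: the spin-summed $|\mathcal{M}|^2$ of (6.38) and the equal-mass phase space factor of (6.21) integrated over the solid angle 4π.

```python
import math

def scalar_to_ffbar_width(y, M, m):
    """Total rate for a scalar of mass M decaying to a fermion-antifermion
    pair of mass m with Yukawa coupling y, following (6.40)."""
    if M <= 2 * m:
        return 0.0  # below threshold the decay is kinematically forbidden
    return y**2 * M / (8 * math.pi) * (1 - 4 * m**2 / M**2) ** 1.5

def width_from_ingredients(y, M, m):
    """Same quantity assembled from (6.38) and (6.21), as a cross-check."""
    if M <= 2 * m:
        return 0.0
    velocity_factor = math.sqrt(1 - 4 * m**2 / M**2)
    matrix_element_sq = 2 * y**2 * M**2 * (1 - 4 * m**2 / M**2)  # (6.38)
    solid_angle = 4 * math.pi
    return matrix_element_sq * velocity_factor * solid_angle / (64 * math.pi**2 * M)
```

Both functions return the rate in the same units as M.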
In the Standard Model, the Higgs boson h plays the role of φ, and couples to each fermion f with a Lagrangian that is exactly of the form given above, with $y \to y_f/\sqrt{2}$:
$$\mathcal{L}_{\rm int} = -\sum_f \frac{y_f}{\sqrt{2}}\, h\, \bar{f} f. \tag{6.41}$$
The Yukawa coupling for each fermion is approximately proportional to its mass:
$$y_f \approx \frac{m_f}{175\ {\rm GeV}}. \tag{6.42}$$
(The reason for this will be explained below in Sect. 10.3.) However, the m f appear-
ing in this formula is not quite equal to the mass, because of higher-order corrections.
For quarks, these corrections are quite large, and m f tends to come out considerably
smaller than the masses of the quarks quoted in Table 1.3.
At the LHC, one of the major goals is to study the Higgs boson through its decay modes. The Higgs boson mass has been measured to be about 125 GeV. Since the top quark has a mass of about 173 GeV, the decay $h \to t\bar{t}$ is kinematically forbidden. The next-lightest fermions in the Standard Model are the bottom quark, charm quark, and tau lepton, so we expect decays $h \to b\bar{b}$ and $h \to \tau^-\tau^+$ and $h \to c\bar{c}$. For quarks, the sum in (6.41) includes a summation over 3 colors, leading to an extra factor of $n_f = 3$ in the decay rate. Since the kinematic factor $(1 - 4m_f^2/M_h^2)^{3/2}$ is close to 1 for all allowed fermion-antifermion final states, the leading-order prediction for the decay rate to a particular fermion is approximately:
$$\Gamma(h \to f\bar{f}) = \frac{n_f\, y_f^2 M_h}{16\pi} \propto n_f\, m_f^2. \tag{6.43}$$
Estimates of the m f from present experimental data are:
for a Higgs with mass Mh = 125 GeV. (Notice that even though the charm quark is
heavier than the tau lepton, it turns out that $m_\tau > m_c$ because of the large higher-order corrections for the charm quark.) Therefore, the prediction is that $b\bar{b}$ final states
win, with, very roughly:
A more accurate accounting of the Higgs boson width and branching ratios must take into account many important corrections beyond our scope here. First, higher-order Feynman diagrams are important, and increase the partial widths into quarks substantially. Second, there are other final states that can appear in h decays, notably gluon-gluon (gg) and γγ, which both occur due to Feynman diagrams with loops, and $W^+W^-$ and $Z^0Z^0$. Naively, the last two are not kinematically allowed, since $2m_W$ and $2m_Z$ are both greater than $m_h$. However, they can still contribute if one or both of the weak vector bosons is off-shell (virtual). These decays are often written as $h \to WW^{(*)}$ and $h \to ZZ^{(*)}$, with the (∗) indicating an off-shell particle. Normally, such decays would be negligible compared to 2-body decays to on-shell particles, but they are competitive because the bottom quark squared Yukawa coupling $y_b^2 \approx 0.00024$ is so small.
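To get a feeling for the numbers, the leading-order formula (6.43) can be evaluated with the values quoted in the text: $y_b^2 \approx 0.00024$, $n_f = 3$ for quarks, and $M_h = 125$ GeV. This is only a sketch with hypothetical function names; it sets the kinematic factor to 1 and ignores the higher-order corrections that the text describes as substantial.

```python
import math

def yukawa_from_effective_mass(m_f_gev):
    """Approximate Yukawa coupling from (6.42); m_f is the effective
    (corrected) mass parameter in GeV, not the quoted physical mass."""
    return m_f_gev / 175.0

def higgs_to_ffbar_width_lo(y_f_squared, m_h_gev=125.0, n_f=1):
    """Leading-order partial width in GeV from (6.43), kinematic factor ~ 1."""
    return n_f * y_f_squared * m_h_gev / (16 * math.pi)

# Bottom quarks: three colors and the squared coupling quoted in the text.
width_bb_gev = higgs_to_ffbar_width_lo(0.00024, n_f=3)  # about 1.8e-3 GeV
```

The result, roughly 1.8 MeV, is tiny compared to the Higgs boson mass, which illustrates why decay modes that would normally be negligible can compete.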
Taking into account these effects,2 it turns out that the total width of the Higgs
boson is approximately 4.2 MeV, assuming m h = 125 GeV. This is an extremely
small decay width for such a heavy particle. One can define the branching ratio to
be the partial decay rate into a particular final state, divided by the total decay rate,
so for example
In the Standard Model with m h = 125 GeV, the predicted branching fractions into
b, τ , and c pairs, taking into account all known effects, are:
Some other branching ratios that turn out to be extremely important for the Higgs
boson at the Large Hadron Collider, but rely on more involved calculations, are:
We will return to the subject of the Higgs boson branching ratios in Sect. 10.5.
Finally, consider the helicities for the process $h \to f\bar{f}$. If we demanded that the final states have particular helicities, then we would have obtained for the matrix element, using $P_R$, $P_L$ projection matrices:
R-fermion, R-antifermion: $\mathcal{M} = -iy\, \bar{u}_2 P_L P_L v_1 = -iy\, \bar{u}_2 P_L v_1 \neq 0$, (6.57)
L-fermion, L-antifermion: $\mathcal{M} = -iy\, \bar{u}_2 P_R P_R v_1 = -iy\, \bar{u}_2 P_R v_1 \neq 0$, (6.58)
R-fermion, L-antifermion: $\mathcal{M} = -iy\, \bar{u}_2 P_L P_R v_1 = 0$, (6.59)
L-fermion, R-antifermion: $\mathcal{M} = -iy\, \bar{u}_2 P_R P_L v_1 = 0$. (6.60)
2 For the results quoted in this paragraph, see S. Heinemeyer et al., “Handbook of LHC Higgs Cross Sections: 3. Higgs Properties”, https://arxiv.org/abs/1307.1347.
of the outgoing particles must therefore have opposite directions. Since they have momentum in opposite directions, this means they must also have the same helicity. Drawing a short arrow to represent the spin, the allowed cases of RR helicities and LL helicities look like:
The helicities of tau leptons can be (statistically) measured from the angular distributions of their decay products, so this effect may eventually be measured with a sample of $h \to \tau^-\tau^+$ decay events.
6.4 Three-Body Decays
$$d\Gamma = \frac{1}{2M} |\mathcal{M}|^2\, d\Phi_3, \tag{6.61}$$
where
$$d\Phi_3 = (2\pi)^4 \delta^{(4)}(p - k_1 - k_2 - k_3)\, \frac{d^3\vec{k}_1}{(2\pi)^3 2E_1}\, \frac{d^3\vec{k}_2}{(2\pi)^3 2E_2}\, \frac{d^3\vec{k}_3}{(2\pi)^3 2E_3} \tag{6.62}$$
is the Lorentz-invariant phase space. Since there are 9 integrals to do, and 4 delta functions, the result for dΓ is a differential with respect to 5 remaining variables. The best choice of 5 variables depends on the problem at hand, so there are several
ways to present the result. Two of the 5 variables can be chosen to be the energies
E 1 and E 2 of two of the final-state particles; then the energy of the third particle
E 3 = M − E 1 − E 2 is also known from energy conservation. In the rest frame of
the decaying particle, the three final-state particle 3-momenta must lie in a plane,
because of momentum conservation. Specifying E 1 and E 2 also uniquely fixes the
angles between the three particle momenta within this decay plane. The remaining 3
variables just correspond to the orientation of the decay plane with respect to some
fixed coordinate axis. If we think of the three 3-momenta within the decay plane as
describing a rigid body, then the relative orientation can be described using three
Euler angles. These can be chosen to be the spherical coordinate angles φ1 and θ1 for
particle 1, and an angle α2 that measures the rotation of the 3-momentum direction
of particle 2 as measured about the axis of the momentum vector of particle 1. Then
one can show:
$$d\Phi_3 = \frac{1}{256\pi^5}\, dE_1\, dE_2\, d\phi_1\, d(\cos\theta_1)\, d\alpha_2. \tag{6.63}$$
The choice of which particles to label as 1 and 2 is arbitrary, and should be made to
maximize convenience.
If the initial state particle spin is averaged over, or if it is spinless, then there is
no special direction to measure the orientation of the final state decay plane with
respect to. In that case, for particular E 1 and E 2 , the reduced matrix element cannot
depend on the angles φ1 , θ1 , or α2 , and one can do the integrals
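$$\int_0^{2\pi} d\phi_1 \int_{-1}^{1} d(\cos\theta_1) \int_0^{2\pi} d\alpha_2 = (2\pi)(2)(2\pi) = 8\pi^2. \tag{6.64}$$
Then,
$$d\Phi_3 = \frac{1}{32\pi^3}\, dE_1\, dE_2, \tag{6.65}$$
and so, for spinless or spin-averaged initial states,
$$d\Gamma = \frac{1}{64\pi^3 M}\, |\mathcal{M}|^2\, dE_1\, dE_2. \tag{6.66}$$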
To do the remaining energy integrals, one must find the limits of integration. If one
decides to do the E 2 integral first, then by doing the kinematics one can show for
any particular E 1 that
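$$E_2^{\rm max,min} = \frac{1}{2m_{23}^2} \left[ (M - E_1)(m_{23}^2 + m_2^2 - m_3^2) \pm \sqrt{(E_1^2 - m_1^2)\, \lambda(m_{23}^2, m_2^2, m_3^2)} \right], \tag{6.67}$$
where
$$m_{23}^2 = (k_2 + k_3)^2 = (p - k_1)^2 = M^2 - 2ME_1 + m_1^2 \tag{6.68}$$
is the invariant (mass)² of the combination of particles 2 and 3. Then the limits of integration for the final $E_1$ integral are:
$$m_1 < E_1 < \frac{M^2 + m_1^2 - (m_2 + m_3)^2}{2M}. \tag{6.69}$$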
A good strategy is usually to choose the label “1” for the particle whose energy
we care the most about. Then after doing the d E 2 integral, we will be left with an
expression for $d\Gamma/dE_1$.
In the special case that all final state particles are massless (or small enough to
neglect) m 1 = m 2 = m 3 = 0, then these limits of integration simplify to:
$$\frac{M}{2} - E_1 < E_2 < \frac{M}{2}, \tag{6.70}$$
$$0 < E_1 < \frac{M}{2}. \tag{6.71}$$
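The integration limits for three-body decays can be sketched in code, which also makes it easy to confirm that the general limits (6.67) and (6.69) reduce to (6.70) and (6.71) when all final-state masses vanish. The function names are hypothetical, and the invariant mass of the (2,3) system is taken to be $m_{23}^2 = M^2 - 2ME_1 + m_1^2$, which follows from 4-momentum conservation.

```python
import math

def kallen(x, y, z):
    """Triangle (Kallen) function lambda(x, y, z)."""
    return x * x + y * y + z * z - 2 * x * y - 2 * x * z - 2 * y * z

def e1_range(M, m1, m2, m3):
    """Limits (E1_min, E1_max) from (6.69)."""
    return m1, (M**2 + m1**2 - (m2 + m3) ** 2) / (2 * M)

def e2_limits(M, E1, m1, m2, m3):
    """Limits (E2_min, E2_max) for fixed E1, from (6.67)."""
    m23_sq = M**2 - 2 * M * E1 + m1**2  # invariant mass squared of particles 2, 3
    if m23_sq <= 0:
        raise ValueError("E1 is at or beyond the edge of phase space")
    center = (M - E1) * (m23_sq + m2**2 - m3**2)
    # max() guards against tiny negative values from rounding at the boundary
    spread = math.sqrt(max((E1**2 - m1**2) * kallen(m23_sq, m2**2, m3**2), 0.0))
    return (center - spread) / (2 * m23_sq), (center + spread) / (2 * m23_sq)
```

For massless final states, `e2_limits(1.0, 0.3, 0, 0, 0)` returns limits equal to $M/2 - E_1 = 0.2$ and $M/2 = 0.5$ up to rounding.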
Problems
$$\mathcal{L}_{\rm int} = y_f\, A^0\, \bar{f} \gamma_5 f \tag{6.72}$$
where $y_f$ is a coupling constant. Compute the partial decay rate for $A^0$ into a fermion anti-fermion pair, as a function of $y_f$, the mass of the pseudo-scalar $M_{A^0}$, and the mass of the fermion $m_f$. You should find a result of the form:
$$\Gamma = N\, n_f\, y_f^2\, M_{A^0} \left(1 - \frac{4m_f^2}{M_{A^0}^2}\right)^p, \tag{6.73}$$
where n f = 3 for quarks and 1 for leptons, and N and p are numbers that you
will compute.
2. This problem is an extension of the previous problem. In the MSSM, it has been
calculated that the ratio of the couplings of A0 to bottom quarks and to top quarks
is:
$$\frac{y_b}{y_t} = \frac{m_b}{m_t} \tan^2\beta \tag{6.74}$$
$$2 \lesssim \tan\beta \lesssim 55. \tag{6.75}$$
Here, m t and m b differ somewhat from the actual masses, because of higher-order
corrections.
(a) Taking the values $m_t = 165$ GeV and $m_b = 3$ GeV for the mass parameters in the coupling ratio (6.74), the physical masses $m_t = 173$ GeV and $m_b = 5$ GeV for the kinematics, and $m_{A^0} = 400$ GeV, make a plot of
$$\frac{{\rm BR}(A^0 \to b\bar{b})}{{\rm BR}(A^0 \to t\bar{t})} \tag{6.76}$$
as a function of tan β, using at least representative points tan β = 2, 5, 10, 20, 30, 40, 50. Use a log scale for the vertical axis. For what values of tan β is the branching fraction of $A^0$ into $b\bar{b}$ greater than for $t\bar{t}$?
(b) Repeat part (a) for m A0 = 1000 GeV.
7 Fermi Theory of Weak Interactions
In nuclear physics, the weak interactions are responsible for decays of long-lived isotopes. A nucleus with Z protons and A − Z neutrons, so A nucleons in all, is denoted by $^A Z$. If kinematically allowed, one can observe decays:
$$^A Z \to {}^A(Z+1) + e^- \bar{\nu}_e. \tag{7.1}$$
$$n \to p + e^- \bar{\nu}_e \tag{7.2}$$
1 The lifetime of the neutron is an infamous example of an experimental measurement that has
shifted dramatically over time. As recently as the late 1960s, it was thought that τn = 1010 ± 30
s, and as recently as 2010, the official value was 885.7 ± 0.8 seconds. Even today the systematic
uncertainties are a source of concern.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 165
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_7
where the quotes indicate that the neutron and proton are really not separate entities, but part of the nuclear bound states. So for example, tritium decays according to
$$^3{\rm H} \to {}^3{\rm He} + e^- \bar{\nu}_e \qquad (\tau = 5.6 \times 10^8\ {\rm s} = 17.7\ {\rm years}), \tag{7.5}$$
Nuclear physicists usually quote the half life $t_{1/2}$ rather than the mean lifetime τ. They are related by
$$t_{1/2} = \tau \ln 2, \tag{7.7}$$
so that $t_{1/2} = 5730$ years for Carbon-14, making it ideal for dating dead organisms. In the upper atmosphere, cosmic rays produce energetic neutrons, which in turn constantly convert $^{14}$N nuclei into $^{14}$C. Carbon-dioxide-breathing organisms, or those that eat them, maintain an equilibrium with the carbon content of the atmosphere, at a level of roughly $^{14}{\rm C}/^{12}{\rm C} \approx 10^{-12}$. However, a complication is the fact that this ratio is not constant; it dropped in the early 20th century as more ordinary $^{12}$C entered the atmosphere because of the burning of fossil fuels containing the carbon of organisms that have been dead for a very long time. The relative abundance $^{14}{\rm C}/^{12}{\rm C} \approx 10^{-12}$ then doubled after 1954 because of nuclear weapons testing, reaching a peak in the mid 1960s from which it has since declined. In any case, dead organisms lose half of their $^{14}$C every 5730 ± 30 years, and certainly do not regain it by breathing or eating. So, by measuring the rate of $e^-$ beta rays consistent with $^{14}$C decay produced by a sample, and determining the historic atmospheric $^{14}{\rm C}/^{12}{\rm C}$ ratio as a function of time with control samples or by other means, one can date the death of a sample of organic matter.
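The dating procedure just described amounts to inverting the exponential decay law. The sketch below uses hypothetical function names and makes the simplifying assumption that the $^{14}{\rm C}/^{12}{\rm C}$ ratio at the time of death is known; as the text stresses, in practice that ratio varies over time and must be calibrated.

```python
import math

C14_HALF_LIFE_YEARS = 5730.0

def mean_lifetime_from_half_life(t_half):
    """Mean lifetime tau from the half life, using t_half = tau * ln 2."""
    return t_half / math.log(2)

def half_life_from_mean_lifetime(tau):
    """Half life from the mean lifetime tau."""
    return tau * math.log(2)

def age_from_c14_ratio(ratio_now, ratio_at_death, t_half=C14_HALF_LIFE_YEARS):
    """Time since death in years, from N(t) = N(0) exp(-t / tau).

    ratio_now and ratio_at_death are 14C/12C abundance ratios; the second
    one is an input assumption that requires calibration in practice.
    """
    if not 0 < ratio_now <= ratio_at_death:
        raise ValueError("need 0 < ratio_now <= ratio_at_death")
    tau = mean_lifetime_from_half_life(t_half)
    return tau * math.log(ratio_at_death / ratio_now)
```

A sample whose ratio has dropped to one quarter of its initial value is two half lives old, so `age_from_c14_ratio(0.25e-12, 1e-12)` gives about 11460 years. Applied to the tritium lifetime quoted in (7.5), `half_life_from_mean_lifetime(17.7)` gives a half life of about 12.3 years.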
One can also have decays that release a positron and neutrino:
$$^A Z \to {}^A(Z-1) + e^+ \nu_e. \tag{7.8}$$
$$\text{“}p^+\text{”} \to \text{“}n\text{”}\, e^+ \nu_e. \tag{7.9}$$
In free space, the proton cannot decay, simply because $m_p < m_n$, but under the right circumstances it is kinematically allowed when the proton and neutron are parts of nuclear bound states. An example is
$$^{14}{\rm O} \to {}^{14}{\rm N} + e^+ \nu_e \qquad (\tau = 71\ {\rm s}). \tag{7.10}$$
The long lifetimes of such decays are what originally gave rise to the name “weak”
interactions.
7.2 Muon Decay 167
Charged pions also decay through the weak interactions, with a mean lifetime of
$$\tau_{\pi^\pm} = 2.6 \times 10^{-8}\ {\rm s}. \tag{7.11}$$
This is still a very long lifetime by particle physics standards, and corresponds to a proper decay length of cτ = 7.8 meters. The probability that a charged pion with velocity β will travel a distance L in empty space before decaying is therefore
$$P = e^{-(L/7.8\ {\rm m}) \sqrt{1-\beta^2}/\beta}. \tag{7.12}$$
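Equation (7.12) can be evaluated directly. In this sketch (hypothetical function names), the exponent is written as $L/(\beta\gamma\, c\tau)$, which is the same as $(L/c\tau)\sqrt{1-\beta^2}/\beta$, and the proper decay length 7.8 m is the value quoted above.

```python
import math

PION_CTAU_M = 7.8  # proper decay length of the charged pion, in meters

def boost_factor(beta):
    """beta * gamma = beta / sqrt(1 - beta^2) for velocity beta (units of c)."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must satisfy 0 < beta < 1")
    return beta / math.sqrt(1.0 - beta**2)

def mean_decay_length(beta, ctau_m=PION_CTAU_M):
    """Average laboratory distance traveled before decay, in meters."""
    return ctau_m * boost_factor(beta)

def flight_probability(distance_m, beta, ctau_m=PION_CTAU_M):
    """Probability of traveling distance_m without decaying, as in (7.12)."""
    return math.exp(-distance_m / mean_decay_length(beta, ctau_m))
```

For β = 0.6 the mean decay length is 5.85 m, and it grows without bound as β approaches 1.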
This means that a relativistic charged pion will typically travel several meters before
decaying, unless it interacts (which it usually will in a collider detector). The main
decay mode is
$$\pi^- \to \mu^- \bar{\nu}_\mu \tag{7.13}$$
$$\pi^- \to e^- \bar{\nu}_e. \tag{7.14}$$
with a branching ratio $1.2 \times 10^{-4}$. This presents a puzzle: since the electron is lighter, there is more kinematic phase space available for the second decay, yet the first decay dominates by almost a factor of $10^4$. We will calculate the reason for this later, in Sect. 7.6.
implying a proper decay length of cτ = 659 meters. Muons do not undergo hadronic
interactions like pions do, so that relativistic muons will usually penetrate at least
the inner layers of particle detectors with a very high probability.
The Feynman diagram for muon decay can be drawn as:
where the symbols μ, νμ , e, νe mean the Dirac spinor fields for the muon, muon
neutrino, electron, and electron neutrino, and the ellipses mean matrices in Dirac
spinor space. To be more precise about the interaction Lagrangian, one needs clues
from experiment.
One clue is the fact that there are three quantum numbers, called lepton numbers,
that are additively conserved to a high accuracy in most experiments. (The only
confirmed exceptions are neutrino oscillation experiments.) They are assigned as:
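$$L_e = \begin{cases} +1 & \text{for } e^-, \nu_e \\ -1 & \text{for } e^+, \bar{\nu}_e \\ 0 & \text{for all other particles} \end{cases} \tag{7.18}$$
$$L_\mu = \begin{cases} +1 & \text{for } \mu^-, \nu_\mu \\ -1 & \text{for } \mu^+, \bar{\nu}_\mu \\ 0 & \text{for all other particles} \end{cases} \tag{7.19}$$
$$L_\tau = \begin{cases} +1 & \text{for } \tau^-, \nu_\tau \\ -1 & \text{for } \tau^+, \bar{\nu}_\tau \\ 0 & \text{for all other particles} \end{cases} \tag{7.20}$$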
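The assignments (7.18)–(7.20) lend themselves to simple bookkeeping. The following sketch checks whether a process conserves the three lepton numbers; the particle labels such as `"nu_e_bar"` are an arbitrary naming choice for this illustration, and any particle not in the table is assigned zero for all three lepton numbers.

```python
# (L_e, L_mu, L_tau) for each lepton and antilepton, following (7.18)-(7.20)
LEPTON_NUMBERS = {
    "e-": (1, 0, 0), "nu_e": (1, 0, 0),
    "e+": (-1, 0, 0), "nu_e_bar": (-1, 0, 0),
    "mu-": (0, 1, 0), "nu_mu": (0, 1, 0),
    "mu+": (0, -1, 0), "nu_mu_bar": (0, -1, 0),
    "tau-": (0, 0, 1), "nu_tau": (0, 0, 1),
    "tau+": (0, 0, -1), "nu_tau_bar": (0, 0, -1),
}

def lepton_numbers(particles):
    """Total (L_e, L_mu, L_tau) of a list of particle labels."""
    totals = [0, 0, 0]
    for name in particles:
        for i, value in enumerate(LEPTON_NUMBERS.get(name, (0, 0, 0))):
            totals[i] += value
    return tuple(totals)

def conserves_lepton_numbers(initial, final):
    """True if all three lepton numbers match between initial and final states."""
    return lepton_numbers(initial) == lepton_numbers(final)
```

With this table, `conserves_lepton_numbers(["n"], ["p", "e-", "nu_e_bar"])` is true, while a final state containing only an electron and a photon from a decaying muon fails the check.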
So, for example, in the nuclear decay examples above, one always has L e = 0 in
the initial state, and L e = 1 − 1 = 0 in the final state, with L μ = L τ = 0 trivially in
each case. The muon decay mode in (7.15) has (L e , L μ ) = (0, 1) in both the initial
and final states. If lepton numbers were not conserved, then one might expect that
decays like
$$\mu^- \to e^- \gamma \tag{7.21}$$
would be allowed. However, these decays have never been observed, and the most
recent limit from the MEG experiment at the Paul Scherrer Institute is
(Actually, the MEG experiment searches for the decay μ+ → e+ γ , but the branching
ratio should be the same with all particles replaced by their anti-particles.) This is a
2 We will later find out that this is not a true fundamental interaction of the theory, but rather an
“effective” interaction that is derived from the low-energy effects of the W − boson.
remarkably strong constraint, since this decay only has to compete with the already
weak mode in (7.15). It implies that
The BaBar and Belle experiments have put similar (but not as stringent) bounds on
tau lepton number non-conservation:
Since 1998, experimental results from neutrinos produced in the Sun, the atmo-
sphere, by accelerators, and in reactors have given strong evidence for oscillations
of neutrinos that are caused by them having small non-zero masses that violate the
individual lepton numbers. (It is still an open question whether they also violate the
total lepton number
$$L \equiv L_e + L_\mu + L_\tau; \tag{7.27}$$
for more on this, see Sect. 10.4.) However, these are very small effects for colliding
beam experiments, and can be ignored for almost all conceivable processes at the
Tevatron and the LHC.
The (near) conservation of lepton numbers suggests that the interaction Lagrangian
for the weak interactions can always be written in terms of fermion bilinears involv-
ing one barred and one unbarred Dirac spinor from each lepton family. So, we
will write the weak interactions for leptons in terms of building blocks with net
L e = L μ = L τ = 0, for example, like the first term in (7.17) but not the second.
More generally, we will want to use building blocks:
$$(\bar{\ell} \ldots \nu_\ell) \quad \text{or} \quad (\bar{\nu}_\ell \ldots \ell), \tag{7.28}$$
where ℓ is any of e, μ, τ. Now, since each Dirac spinor has 4 components, a basis for fermion bilinears involving any two fields $\Psi_1$ and $\Psi_2$ will have 4 × 4 = 16 elements. They can be classified by their transformation properties under the proper Lorentz group and the parity transformation $\vec{x} \to -\vec{x}$, as follows:
The entry under Parity indicates the multiplicative factor under which each of these terms transforms when $\vec{x} \to -\vec{x}$, with
$$(-1)^\mu = \begin{cases} +1 & \text{for } \mu = 0 \\ -1 & \text{for } \mu = 1, 2, 3. \end{cases} \tag{7.29}$$
The weak interaction Lagrangian for leptons could be formed out of any product of such terms with $\Psi_1, \Psi_2 = \ell, \nu_\ell$. Fermi originally proposed that the weak interaction fermion building blocks were of the type V, so that muon decays would be described by
$$\mathcal{L}_{\rm int}^{V} = -G\, (\bar{\nu}_\mu \gamma^\rho \mu)(\bar{e} \gamma_\rho \nu_e) + {\rm c.c.} \tag{7.30}$$
Here “c.c.” means complex conjugate; this is necessary since the Dirac spinor fields
are complex. Some other possibilities could have been that the building blocks were
of type A:
$$\mathcal{L}_{\rm int}^{A} = -G\, (\bar{\nu}_\mu \gamma^\rho \gamma_5 \mu)(\bar{e} \gamma_\rho \gamma_5 \nu_e) + {\rm c.c.} \tag{7.31}$$
The nucleus $^{60}$Ni has spin J = 4, so the net angular momentum carried away by the electron and antineutrino is 1. The observation was that the electron is emitted preferentially in the direction opposite to the original spin of the $^{60}$Co nucleus. This
can be explained consistently with angular momentum conservation if the electron
produced in the decay is always polarized left-handed and the antineutrino is always
right-handed. Using short arrows to designate spin directions, the most favored con-
figuration is:
The importance of this experiment and others was that right-handed electrons and
left-handed antineutrinos do not seem to participate in the weak interactions. This
means that when writing the interaction Lagrangian for weak interactions, we can
always put a $P_L$ to the left of the electron’s Dirac field, and a $P_R$ to the right of a $\bar{\nu}_e$ field. This helped establish that the correct form for the fermion bilinear in the Lagrangian is V − A:
$$\bar{e} P_R \gamma^\rho \nu_e = \bar{e} \gamma^\rho P_L \nu_e = \frac{1}{2}\, \bar{e} \gamma^\rho (1 - \gamma_5) \nu_e. \tag{7.33}$$
Since this is a complex quantity, and the Lagrangian density must be real, one must also have terms involving the complex conjugate of (7.33):
$$\bar{\nu}_e P_R \gamma^\rho e = \bar{\nu}_e \gamma^\rho P_L e. \tag{7.34}$$
The feature that was considered most surprising at the time was that right-handed Dirac fermion fields $P_R e$, $P_R \nu_e$, and left-handed Dirac barred fermion fields $\bar{e} P_L$ and $\bar{\nu}_e P_L$ never appear in any part of the weak interaction Lagrangian.
For muon decay, the relevant four-fermion interaction Lagrangian is:
$$\mathcal{L}_{\rm int} = -2\sqrt{2}\, G_F\, (\bar{\nu}_\mu \gamma^\rho P_L \mu)(\bar{e} \gamma_\rho P_L \nu_e) + {\rm c.c.} \tag{7.35}$$
These are related by reversing of all arrows, corresponding to the complex conjugate
in (7.35). The slightly separated dots in the Feynman rule picture are meant to
indicate the Dirac spinor structure. The Feynman rules for external state fermions
and antifermions are exactly the same as in QED, with neutrinos treated as fermions
and antineutrinos as antifermions. This weak interaction Lagrangian for muon decay
violates parity maximally, since it treats left-handed fermions differently from right-
handed fermions. However, helicity is conserved by this interaction Lagrangian, just
as in QED, because of the presence of one gamma matrix in each fermion bilinear.
We can now derive the reduced matrix element for muon decay, and use it to
compute the differential decay rate of the muon. Comparing this to the experimentally
measured result will allow us to find the numerical value of G F , and determine the
energy spectrum of the final state electron. At lowest order, the only Feynman diagram
for μ− → e− ν e νμ is:
using the first of the two Feynman rules above. Let us label the momenta and spins
of the particles as follows:
The reduced matrix element is obtained by starting at the end of each fermion line
with a barred spinor and following it back (moving opposite the arrow direction) to
the beginning. In this case, that means starting with the muon neutrino and electron
barred spinors. The result is:
$$\mathcal{M} = -i\, 2\sqrt{2}\, G_F\, (\bar{u}_3 \gamma^\rho P_L u_a)(\bar{u}_1 \gamma_\rho P_L v_2). \tag{7.37}$$
Therefore,
In the following, we can neglect the mass of the electron m e , since m e /m μ < 0.005.
Now we can average over the initial-state spin sa and sum over the final-state spins
s1 , s2 , s3 using the usual tricks:
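$$\frac{1}{2} \sum_{s_a} u_a \bar{u}_a = \frac{1}{2} (\slashed{p}_a + m_\mu), \tag{7.40}$$
$$\sum_{s_1} u_1 \bar{u}_1 = \slashed{k}_1, \tag{7.41}$$
$$\sum_{s_2} v_2 \bar{v}_2 = \slashed{k}_2, \tag{7.42}$$
$$\sum_{s_3} u_3 \bar{u}_3 = \slashed{k}_3, \tag{7.43}$$
$$\frac{1}{2} \sum_{\rm spins} |\mathcal{M}|^2 = 4 G_F^2\, {\rm Tr}[\gamma^\rho P_L (\slashed{p}_a + m_\mu) P_R \gamma^\sigma \slashed{k}_3]\, {\rm Tr}[\gamma_\rho P_L \slashed{k}_2 P_R \gamma_\sigma \slashed{k}_1] \tag{7.44}$$
Fortunately, we have already seen a product of traces just like this one, in (5.139), so that by substituting in the appropriate 4-momenta, we immediately get:
$$\frac{1}{2} \sum_{\rm spins} |\mathcal{M}|^2 = 64\, G_F^2\, (p_a \cdot k_2)(k_1 \cdot k_3). \tag{7.46}$$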
Our next task is to turn this reduced matrix element into a differential decay rate.
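We apply the results of Sect. 6.4 to the example of muon decay, with $M = m_\mu$ and $m_1 = m_2 = m_3 = 0$. According to our result of (7.46), we need to evaluate the dot products $p_a \cdot k_2$ and $k_1 \cdot k_3$. Since these are Lorentz scalars, we can evaluate them in a frame where $\vec{k}_2$ is along the z-axis. Then
$$p_a = (m_\mu, 0, 0, 0), \tag{7.47}$$
$$k_2 = (E_{\bar{\nu}_e}, 0, 0, E_{\bar{\nu}_e}). \tag{7.48}$$
Therefore,
$$p_a \cdot k_2 = m_\mu E_{\bar{\nu}_e}. \tag{7.49}$$
Also, since $k_1 + k_3 = p_a - k_2$ and the final-state particles are treated as massless, $2 k_1 \cdot k_3 = (p_a - k_2)^2 = m_\mu^2 - 2 m_\mu E_{\bar{\nu}_e}$, so
$$k_1 \cdot k_3 = \frac{1}{2} (m_\mu^2 - 2 m_\mu E_{\bar{\nu}_e}). \tag{7.50}$$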
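Putting (7.49) and (7.50) into (7.46) gives
$$\frac{1}{2} \sum_{\rm spins} |\mathcal{M}|^2 = 32\, G_F^2\, (m_\mu^3 E_{\bar{\nu}_e} - 2 m_\mu^2 E_{\bar{\nu}_e}^2). \tag{7.51}$$
Using this in (6.66) with $E_1 = E_e$ and $E_2 = E_{\bar{\nu}_e}$, we find
$$d\Gamma = dE_e\, dE_{\bar{\nu}_e}\, \frac{G_F^2}{2\pi^3}\, (m_\mu^2 E_{\bar{\nu}_e} - 2 m_\mu E_{\bar{\nu}_e}^2). \tag{7.52}$$
Doing the $dE_{\bar{\nu}_e}$ integral using the limits of integration of (6.70), we obtain:
$$d\Gamma = dE_e\, \frac{G_F^2}{2\pi^3} \int_{\frac{m_\mu}{2} - E_e}^{\frac{m_\mu}{2}} dE_{\bar{\nu}_e}\, (m_\mu^2 E_{\bar{\nu}_e} - 2 m_\mu E_{\bar{\nu}_e}^2) = dE_e\, \frac{G_F^2}{\pi^3} \left( \frac{m_\mu^2 E_e^2}{4} - \frac{m_\mu E_e^3}{3} \right). \tag{7.53}$$
We have obtained the differential decay rate for the energy of the final state electron:
$$\frac{d\Gamma}{dE_e} = \frac{G_F^2 m_\mu^2}{4\pi^3}\, E_e^2 \left( 1 - \frac{4E_e}{3m_\mu} \right). \tag{7.54}$$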
[Figure: the differential decay rate dΓ/dE plotted as a function of $E/m_\mu$, for $0 \le E/m_\mu \le 0.5$.]
We see that the electron energy is peaked near its maximum value of $m_\mu/2$. This corresponds to the situation where the electron is recoiling directly against both the neutrino and antineutrino, which are collinear; for example, $k_1^\mu = (m_\mu/2, 0, 0, -m_\mu/2)$, and $k_2^\mu = k_3^\mu = (m_\mu/4, 0, 0, m_\mu/4)$:
The helicity of the initial state is undefined, since the muon is at rest. However, we know that the final state $e^-$, $\nu_\mu$, and $\bar{\nu}_e$ have well-defined L, L, and R helicities respectively, as shown above by the short arrows pointing in the spin directions, since this is dictated by the weak interactions. In the case of maximum $E_e$, therefore, the spins of $\nu_\mu$ and $\bar{\nu}_e$ must be in opposite directions, so that they cancel. The helicity of the electron is L, so its spin must be opposite to its 3-momentum direction. By angular momentum conservation, the electron spin must then match the initial muon spin, which tells us that the electron must move in the opposite direction to the initial muon spin in the limit that $E_e$ is near the maximum.
The smallest possible electron energies are near 0, which occurs when the neutrino
and antineutrino move in nearly opposite directions, so that the 3-momentum of the
electron recoiling against them is very small.
We have done the most practically sensible thing by plotting the differential decay
rate in terms of the electron energy, since that is what is directly observable in an
experiment. Just for fun, however, let us pretend that we could directly measure the
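$\nu_\mu$ and $\bar{\nu}_e$ energies, and compute the distributions for them. To find $d\Gamma/dE_{\nu_\mu}$, we can take $E_2 = E_{\bar{\nu}_e}$ and $E_1 = E_{\nu_\mu}$ in (6.66) and (6.70)–(6.71), with the reduced matrix element from (7.51). Then
$$d\Gamma = dE_{\nu_\mu}\, dE_{\bar{\nu}_e}\, \frac{G_F^2}{2\pi^3}\, (m_\mu^2 E_{\bar{\nu}_e} - 2 m_\mu E_{\bar{\nu}_e}^2), \tag{7.55}$$
and the range of integration for $E_{\bar{\nu}_e}$ is now:
$$\frac{m_\mu}{2} - E_{\nu_\mu} < E_{\bar{\nu}_e} < \frac{m_\mu}{2}, \tag{7.56}$$
so that
$$d\Gamma = dE_{\nu_\mu}\, \frac{G_F^2}{2\pi^3} \int_{\frac{m_\mu}{2} - E_{\nu_\mu}}^{\frac{m_\mu}{2}} dE_{\bar{\nu}_e}\, (m_\mu^2 E_{\bar{\nu}_e} - 2 m_\mu E_{\bar{\nu}_e}^2) \tag{7.57}$$
$$= dE_{\nu_\mu}\, \frac{G_F^2}{\pi^3} \left( \frac{m_\mu^2 E_{\nu_\mu}^2}{4} - \frac{m_\mu E_{\nu_\mu}^3}{3} \right). \tag{7.58}$$
Therefore, the $E_{\nu_\mu}$ distribution of final states has the same shape as the $E_e$ distribution:
$$\frac{d\Gamma}{dE_{\nu_\mu}} = \frac{G_F^2 m_\mu^2}{4\pi^3}\, E_{\nu_\mu}^2 \left( 1 - \frac{4E_{\nu_\mu}}{3m_\mu} \right). \tag{7.59}$$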
$$d\Gamma = dE_{\bar{\nu}_e}\, \frac{G_F^2}{2\pi^3} \left( m_\mu^2 E_{\bar{\nu}_e}^2 - 2 m_\mu E_{\bar{\nu}_e}^3 \right), \tag{7.61}$$
so that
$$\frac{d\Gamma}{dE_{\bar{\nu}_e}} = \frac{G_F^2 m_\mu^2}{2\pi^3}\, E_{\bar{\nu}_e}^2 \left( 1 - \frac{2E_{\bar{\nu}_e}}{m_\mu} \right). \tag{7.62}$$
This distribution is plotted as the dashed line in the previous graph. Unlike the distributions for $E_e$ and $E_{\nu_\mu}$, we see that $d\Gamma/dE_{\bar{\nu}_e}$ vanishes when $E_{\bar{\nu}_e}$ approaches its maximum value of $m_\mu/2$. We can understand this by noting that when $E_{\bar{\nu}_e}$ is maximum, the $\bar{\nu}_e$ must be recoiling against both e and $\nu_\mu$ moving in the opposite direction, so the L, L, R helicities of e, $\nu_\mu$, and $\bar{\nu}_e$ tell us that the total spin of the final state is 3/2:
Since the initial-state muon only had spin 1/2, the quantum states have 0 overlap,
and the rate must vanish in that limit of maximal E ν e .
The total decay rate for the muon is found by integrating either (7.54) with respect to $E_e$, or (7.59) with respect to $E_{\nu_\mu}$, or (7.62) with respect to $E_{\bar{\nu}_e}$. In each case, we get:
$$\Gamma = \int_0^{m_\mu/2} \frac{d\Gamma}{dE_e}\, dE_e = \int_0^{m_\mu/2} \frac{d\Gamma}{dE_{\nu_\mu}}\, dE_{\nu_\mu} = \int_0^{m_\mu/2} \frac{d\Gamma}{dE_{\bar{\nu}_e}}\, dE_{\bar{\nu}_e} \tag{7.63}$$
$$= \frac{G_F^2 m_\mu^5}{192\pi^3}. \tag{7.64}$$
It is a good check that the final result does not depend on the choice of the final energy integration variable. It is also good to check units: $G_F^2$ has units of [mass]$^{-4}$ or [time]$^4$, while $m_\mu^5$ has units of [mass]$^5$ or [time]$^{-5}$, so Γ indeed has units of [mass] or [time]$^{-1}$.
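The statement that all three energy distributions integrate to the same total rate is easy to confirm exactly. The sketch below works in units where $G_F = m_\mu = 1$ and factors out $1/\pi^3$, so each integral should equal 1/192. It uses exact rational arithmetic and Simpson's rule, which is exact for the cubic polynomials that appear in (7.54), (7.59), and (7.62). The function names are hypothetical.

```python
from fractions import Fraction

def simpson(f, a, b):
    """Simpson's rule on [a, b] with a single midpoint; exact for cubics."""
    mid = (a + b) / 2
    return (b - a) / 6 * (f(a) + 4 * f(mid) + f(b))

def electron_spectrum(E):
    """pi^3 * dGamma/dE_e from (7.54), with G_F = m_mu = 1."""
    return Fraction(1, 4) * E**2 * (1 - Fraction(4, 3) * E)

def muon_neutrino_spectrum(E):
    """pi^3 * dGamma/dE_nu_mu from (7.59); same shape as the electron's."""
    return electron_spectrum(E)

def antineutrino_spectrum(E):
    """pi^3 * dGamma/dE_nubar_e from (7.62), with G_F = m_mu = 1."""
    return Fraction(1, 2) * E**2 * (1 - 2 * E)

def total_rate_coefficient(spectrum):
    """Integrate a spectrum over 0 < E < m_mu/2; returns pi^3 * Gamma."""
    return simpson(spectrum, Fraction(0), Fraction(1, 2))
```

Each call to `total_rate_coefficient` returns exactly `Fraction(1, 192)`, in agreement with (7.64). The antineutrino spectrum also vanishes at the endpoint E = 1/2, while the electron spectrum does not.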
Experiments tell us that
(This determination also includes some small and delicate corrections reviewed
below in Sect. 7.3.)
The 4-fermion weak interaction Lagrangian of (7.35) describes several other processes besides the decay $\mu^- \to e^- \bar{\nu}_e \nu_\mu$ that we studied in Sect. 7.2. As the simplest example, we can just replace each particle in the process by its anti-particle:
$$\mu^+ \to e^+ \nu_e \bar{\nu}_\mu, \tag{7.68}$$
for which the Feynman diagram is just obtained by changing all of the arrow direc-
tions:
The evaluations of the reduced matrix element and the differential and total decay rates for this decay are very similar to those for $\mu^- \to e^- \bar{\nu}_e \nu_\mu$. For future reference, let us label the 4-momenta for this process as follows:
The reduced matrix element, following from the “+c.c.” term in (7.35), is then
$$\mathcal{M} = -i\, 2\sqrt{2}\, G_F\, (\bar{v}_a \gamma^\rho P_L v_3)(\bar{u}_2 \gamma_\rho P_L v_1). \tag{7.70}$$
As one might expect, the result for the spin-summed squared matrix element,
$$\sum_{\rm spins} |\mathcal{M}|^2 = 128\, G_F^2\, (p'_a \cdot k'_2)(k'_1 \cdot k'_3), \tag{7.71}$$
is exactly the same as obtained in (7.46), with the obvious substitution of primed 4-momenta. The differential and total decay rates that follow from this are, of course, exactly the same as for $\mu^-$ decay.
178 7 Fermi Theory of Weak Interactions
It turns out to be impossible to write down any theory that fails to obey this rule, as
long as the Lagrangian is invariant under proper Lorentz transformations and contains
a finite number of spacetime derivatives and obeys some other technical assumptions.
Among other things, the CPT Theorem implies that the mass and the total decay rate
of a particle must each be equal to the same quantities for the corresponding anti-
particle. (It does not say that the differential decay rate to a particular final state
configuration necessarily has to be equal to the anti-particle differential decay rate
to the same configuration of final-state anti-particles; that stronger result holds only
if the theory is invariant under T. The four-fermion Fermi interaction for leptons
does respect invariance under T, but it is violated by a tiny amount in the weak
interactions of quarks.) We will study some other processes implied by the Fermi
weak interaction Lagrangians in Sects. 7.4, 7.5, and 7.6 below.
In the previous section, we derived the $\mu^-$ decay rate in terms of Fermi’s four-fermion
weak interaction coupling constant $G_F$. Since this decay process is actually the one
that is used to experimentally determine $G_F$ most accurately, it is worthwhile to note
the leading corrections to it.
First, there is the dependence on $m_e$, which we neglected, but could have included
at the cost of a more complicated phase space integration. Taking this into account
using correct kinematics for $m_e \neq 0$ and the limits of integration in (6.67)–(6.69),
one finds that the decay rate must be multiplied by a correction factor $F_{\rm kin}(m_e^2/m_\mu^2)$,
where
Evaluating these diagrams is beyond the scope of this book. However, it should
be clear that they give contributions to the reduced matrix element proportional to
$e^2 G_F$, since each contains two photon interaction vertices. These contributions to
the reduced matrix element actually involve divergent loop integrals, which must be
“regularized” by using a high-energy cutoff. There is then a logarithmic dependence
on the cutoff energy, which can then be absorbed into a redefinition of the mass and
coupling parameters of the model, by the systematic process of renormalization. The
interference of the loop diagrams with the original lowest-order diagram then gives a
contribution to the decay rate proportional to $\alpha G_F^2$. There are also QED contributions
from diagrams with additional photons in the final state:
The QED diagrams involving an additional photon contribute to a 4-body final state,
with a reduced matrix element proportional to $e G_F$. After squaring, summing over
final spins, and averaging over the initial spin, and integrating over the 4-body phase
space, the contribution to the decay rate is again proportional to $\alpha G_F^2$. Much of this
contribution actually comes from very soft (low-energy) photons, which are difficult
or impossible to resolve experimentally. Therefore, one usually just combines the
two types of QED contributions into a total inclusive decay rate with one or more
extra photons in the final state. After a heroic calculation, one finds that the QED
effect on the total decay rate is to multiply by a correction factor
$$F_{\rm QED}(\alpha) = 1 + \frac{\alpha}{\pi}\left[\frac{25}{8} - \frac{\pi^2}{2} - \frac{m_e^2}{m_\mu^2}\left(9 + 4\pi^2 + 24 \ln\frac{m_e}{m_\mu}\right)\right] + C_2 \left(\frac{\alpha}{\pi}\right)^2 + \cdots \tag{7.73}$$
where the $C_2$ contribution refers to even higher-order corrections from: the interference
between Feynman diagrams with two virtual photons and the original Feynman
diagram; the interference between Feynman diagrams with one virtual photon plus
one final state photon and the original Feynman diagram; the square of the reduced
matrix element for a Feynman diagram involving one virtual photon; and two photons
in the final state. A complicated calculation shows that $C_2 \approx 6.68$. Because of
the renormalization procedure, the QED coupling $\alpha$ actually is dependent on the
energy scale, and should be evaluated at the energy scale of interest for this problem,
which is naturally $m_\mu$. At that scale, $\alpha \approx 1/135.9$, so numerically
Finally, there are corrections involving the fact that the point-like four-fermion
interaction is actually due to the effect of a virtual $W^-$ boson. This gives a correction
factor
$$F_W = 1 + \frac{3 m_\mu^2}{5 M_W^2} \approx 1.000001, \tag{7.75}$$
using $M_W = 80.4$ GeV. The predicted decay rate defining $G_F$ experimentally including
all these higher-order effects is
$$\Gamma_{\mu^-} = \frac{G_F^2 m_\mu^5}{192\pi^3}\, F_{\rm kin}\, F_{\rm QED}\, F_W. \tag{7.76}$$
The dominant remaining uncertainty in G F quoted in (7.67) comes from the exper-
imental input of the muon lifetime.
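The sizes of the QED and $W$-propagator correction factors can be checked with a few lines of arithmetic. The sketch below evaluates (7.73), truncated after the $C_2$ term, and (7.75), using the values of $\alpha$, $C_2$ and $M_W$ quoted in the text. The lepton masses are standard reference values assumed for illustration, and the function names are ours.

```python
import math

ALPHA = 1 / 135.9    # QED coupling at the scale m_mu, as quoted in the text
C2 = 6.68            # second-order coefficient, as quoted in the text
M_W = 80.4           # W boson mass in GeV, as quoted in the text
M_E = 0.000510999    # electron mass in GeV (assumed reference value)
M_MU = 0.1056584     # muon mass in GeV (assumed reference value)


def f_qed(alpha: float, m_e: float, m_mu: float, c2: float) -> float:
    """QED correction factor of (7.73), dropping terms beyond C2."""
    a = alpha / math.pi
    r = m_e / m_mu
    first_order = (25 / 8 - math.pi**2 / 2
                   - r**2 * (9 + 4 * math.pi**2 + 24 * math.log(r)))
    return 1 + a * first_order + c2 * a**2


def f_w(m_mu: float, m_w: float) -> float:
    """W-propagator correction factor of (7.75)."""
    return 1 + 3 * m_mu**2 / (5 * m_w**2)


F_QED = f_qed(ALPHA, M_E, M_MU, C2)
F_W = f_w(M_MU, M_W)
print(f"F_QED = {F_QED:.6f}")
print(f"F_W   = {F_W:.7f}")
```

The QED factor lowers the rate by about four parts in a thousand, while the $W$-propagator effect is only about one part in a million, which is why the latter is usually negligible at this level of precision.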
$$e^- \nu_\mu \to \nu_e \mu^-. \tag{7.77}$$
in which we see that the following particles have been crossed from the previous
diagram for μ+ decay:
In fact, this scattering process is often known as inverse muon decay. To apply the
Crossing Symmetry Theorem stated in Sect. 5.3, we can assign momentum labels
$p_a$, $p_b$, $k_1$, $k_2$ to $e^-$, $\nu_\mu$, $\nu_e$, $\mu^-$ respectively, as shown in the figure, and then make
the following comparison table:
$$\begin{array}{ll}
\mu^+ \to e^+ \nu_e \bar\nu_\mu & \quad e^- \nu_\mu \to \nu_e \mu^- \\
\hline
\mu^+,\ p'_a & \quad \mu^-,\ k_2 \\
e^+,\ k'_1 & \quad e^-,\ p_a \\
\nu_e,\ k'_2 & \quad \nu_e,\ k_1 \\
\bar\nu_\mu,\ k'_3 & \quad \nu_\mu,\ p_b
\end{array} \tag{7.81}$$
in (7.71), and then multiplying by $(-1)^3$ for three crossed fermions, resulting in:
$$\sum_{\rm spins} |\mathcal{M}_{e^- \nu_\mu \to \nu_e \mu^-}|^2 = 128\, G_F^2\, (k_2 \cdot k_1)(p_a \cdot p_b). \tag{7.83}$$
Let us evaluate this result in the limit of high-energy scattering, so that $m_\mu$ can be
neglected, and in the center-of-momentum frame. In that case, all four particles being
treated as massless, we can take the kinematics results from (5.177)–(5.181), so that
$p_a \cdot p_b = k_1 \cdot k_2 = s/2$, and
$$\sum_{\rm spins} |\mathcal{M}_{e^- \nu_\mu \to \nu_e \mu^-}|^2 = 32\, G_F^2\, s^2. \tag{7.84}$$
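Including a factor of 1/2 for the average over the initial-state electron spin,³ and
using (4.192),
$$\frac{d\sigma}{d(\cos\theta)} = \frac{G_F^2 s}{2\pi}. \tag{7.85}$$
Since this is independent of $\theta$, integrating over $-1 \le \cos\theta \le 1$ gives the total cross-section:
$$\sigma_{e^- \nu_\mu \to \nu_e \mu^-} = \frac{G_F^2 s}{\pi}. \tag{7.86}$$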
Numerically, we can evaluate this using (7.67):
$$\sigma_{e^- \nu_\mu \to \nu_e \mu^-} = 16.9\ {\rm fb} \left(\frac{\sqrt{s}}{\rm GeV}\right)^2. \tag{7.87}$$
This is a very small cross-section for typical neutrino energies encountered in present
experiments, but it does grow with $E_{\nu_\mu}$.
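The coefficient in (7.87) is just $G_F^2/\pi$ converted from natural units to femtobarns. The sketch below does this conversion and, for orientation, evaluates the cross-section for a neutrino beam on electrons at rest. The value of $G_F$, the conversion constant, and the fixed-target relation $s = m_e^2 + 2 m_e E_\nu$ are standard inputs assumed here, not taken from this passage. The massless formula (7.86) is only meaningful well above the threshold $s > m_\mu^2$.

```python
import math

G_F = 1.1664e-5              # Fermi constant in GeV^-2 (assumed reference value)
M_E = 0.000510999            # electron mass in GeV (assumed reference value)
GEV2_TO_FB = 0.3893794e12    # (hbar c)^2: 1 GeV^-2 = 0.3893794 mb = 0.3893794e12 fb


def sigma_fb(s_gev2: float) -> float:
    """Total cross-section (7.86) in fb, for s given in GeV^2."""
    return G_F**2 * s_gev2 / math.pi * GEV2_TO_FB


def s_fixed_target(e_nu_gev: float) -> float:
    """Mandelstam s for a massless neutrino hitting an electron at rest."""
    return M_E**2 + 2 * M_E * e_nu_gev


coefficient = sigma_fb(1.0)                    # the number multiplying (sqrt(s)/GeV)^2
sigma_100 = sigma_fb(s_fixed_target(100.0))    # E_nu = 100 GeV on a fixed target

print(f"sigma at sqrt(s) = 1 GeV: {coefficient:.2f} fb")
print(f"sigma at E_nu = 100 GeV (fixed target): {sigma_100:.2f} fb")
```

Because $s$ grows only linearly with the beam energy for a fixed electron target, even a 100 GeV neutrino beam gives a cross-section of only a few femtobarns.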
The isotropy of $e^- \nu_\mu \to \nu_e \mu^-$ scattering in the center-of-momentum frame can
be understood from considering what the helicities dictated by the weak interactions
tell us about the angular momentum. Since this is a weak interaction process involving
only fermions and not anti-fermions, they are all L helicity.
3 In the Standard Model with neutrino masses neglected, all neutrinos are left-handed, and all
antineutrinos are right-handed. Since there is only one possible νμ helicity, namely L, it would
be incorrect to average over the νμ spin. This is a general feature; one should never average over
initial-state neutrino or antineutrino spins, as long as they are being treated as massless.
We therefore see that the initial and final states both have total spin 0, so that the
process is s-wave, and therefore necessarily isotropic.
7.5 $e^- \bar\nu_e \to \mu^- \bar\nu_\mu$
$$e^- \bar\nu_e \to \mu^- \bar\nu_\mu. \tag{7.90}$$
in (7.71), and multiplying again by $(-1)^3$ because of the three crossed fermions. The
result this time is:
$$\sum_{\rm spins} |\mathcal{M}|^2 = 128\, G_F^2\, (k_1 \cdot p_b)(p_a \cdot k_2) = 32\, G_F^2\, u^2 = 8\, G_F^2\, s^2 (1 + \cos\theta)^2, \tag{7.95}$$
where (5.179) and (5.181) for 2→2 massless kinematics have been used. Here $\theta$ is
the angle between the incoming $e^-$ and the outgoing $\mu^-$ 3-momenta.
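Substituting this result into (4.192), with a factor of 1/2 to account for averaging
over the initial $e^-$ spin, we obtain:
$$\frac{d\sigma_{e^- \bar\nu_e \to \mu^- \bar\nu_\mu}}{d(\cos\theta)} = \frac{G_F^2 s}{8\pi}\, (1 + \cos\theta)^2. \tag{7.96}$$
Integrating over $-1 \le \cos\theta \le 1$, where $\int_{-1}^{1} (1 + \cos\theta)^2\, d(\cos\theta) = 8/3$, gives the total cross-section:
$$\sigma_{e^- \bar\nu_e \to \mu^- \bar\nu_\mu} = \frac{G_F^2 s}{3\pi}. \tag{7.97}$$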
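The angular integrations that lead from (7.85) to (7.86) and from (7.96) to (7.97) can be checked numerically. The sketch below works in units of $G_F^2 s/\pi$, so the two total cross-sections should come out as 1 and 1/3. The Simpson-rule helper is our own illustrative choice, not part of the text.

```python
def simpson(f, a: float, b: float, n: int = 200) -> float:
    """Composite Simpson rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


# Differential cross-sections in units of G_F^2 s / pi, as functions of c = cos(theta).
def isotropic(c: float) -> float:
    return 1 / 2                 # shape of (7.85)


def forward_peaked(c: float) -> float:
    return (1 + c) ** 2 / 8      # shape of (7.96)


sigma_iso = simpson(isotropic, -1.0, 1.0)
sigma_fwd = simpson(forward_peaked, -1.0, 1.0)

print("isotropic total      :", sigma_iso)
print("forward-peaked total :", sigma_fwd)
print("ratio                :", sigma_fwd / sigma_iso)
```

The ratio of one third between the two totals reflects the $(1 + \cos\theta)^2$ weighting, which suppresses backward scattering entirely.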
This calculation shows that in the center-of-momentum frame, the μ− tends to keep
going in the same direction as the original e− . This can be understood from the
helicity-spin-momentum diagram:
7.6 Charged Currents and $\pi^\pm$ Decay
The interaction Lagrangian term responsible for muon decay and for the cross-sections
discussed above is just one term in the weak-interaction Lagrangian. More
generally, we can write the Lagrangian as a product of a weak-interaction charged
current $J_\rho^-$ and its complex conjugate $J_\rho^+$:
$$\mathcal{L}_{\rm int} = -2\sqrt{2}\, G_F\, J_\rho^+ J^{-\rho}. \tag{7.98}$$
The weak-interaction charged current is obtained by adding together terms for pairs
of fermions, with the constraint that the total charge of the current is −1, and all
Dirac fermion fields involved in the current are left-handed, and all barred fields are
right-handed:
Notice that we have included contributions for the quarks. The quark fields $d'$, $s'$,
and $b'$ appearing here are actually not quite mass eigenstates, because of mixing;
this is the reason for the primes. The complex conjugate of $J_\rho^-$ has charge +1, and
is given by:
can find some distribution for the u and d momenta, and try to average over that
distribution, but the strong interactions are very complicated so this is not very easy
to do from the theoretical side. However, by considering what we do know about
the current-current Lagrangian, we can write down the general form of the reduced
matrix element. First, we know that the external state spinors for the fermions are:
In this formula, the factor $(\bar u_1 \gamma^\rho P_L v_2)$ just reflects the fact that leptons are immune
from the complications of the strong interactions. The factor $f_\pi p_\rho$ takes into account
the part of the reduced matrix element involving the $\pi^-$; here $p^\rho$ is the 4-momentum
of the pion. The point is that whatever the pion factor in the reduced matrix element
is, we know that it is a four-vector in order to contract with the lepton part, and it
must be proportional to $p^\rho$, since there is no other vector quantity in the problem
that it can depend on. (Recall that pions are spinless, so there is no spin dependence.)
So we are simply parameterizing all of our ignorance of the bound-state properties
of the pion in terms of a single constant $f_\pi$, called the pion decay constant. It is a
quantity with dimensions of mass. In principle we could compute it if we had perfect
ability to calculate with the strong interactions. In practice, $f_\pi$ is an experimentally
measured quantity, with its value following most accurately from the $\pi^-$ lifetime
that we will compute below. The factor $\sqrt{2}\, G_F$ is another historical convention;
it could have been absorbed into the definition of $f_\pi$. But it is useful to have the
$G_F$ appear explicitly as a sign that this is a weak interaction; then $f_\pi$ is entirely a
strong-interaction parameter.
Let us now compute the decay rate for $\pi^- \to \mu^- \bar\nu_\mu$. Taking the complex square
of the reduced matrix element (7.102), we have:
Now summing over final state spins in the usual way gives:
$$\sum_{\rm spins} |\mathcal{M}|^2 = 2\, G_F^2 f_\pi^2\, p_\rho p_\sigma\, {\rm Tr}[\gamma^\rho P_L \not{k}_2 P_R \gamma^\sigma (\not{k}_1 + m_\mu)] \tag{7.104}$$
Note that we do not neglect the mass of the muon, since $m_\mu/m_{\pi^\pm} = 0.1056$
GeV$/0.1396$ GeV $= 0.756$ is not a small number. However, the term in (7.105)
that explicitly involves $m_\mu$ does not contribute, since the trace of 3 gamma matrices
(with or without a $P_R$) vanishes. Evaluating the trace, we have:
where $(\theta, \phi)$ are the angles for the $\mu^-$ three-momentum. Of course, since the pion
is spinless, the differential decay rate is isotropic, so the angular integration trivially
gives $\int d\phi\, d(\cos\theta) \to 4\pi$, and:
$$\Gamma(\pi^- \to \mu^- \bar\nu_\mu) = \frac{G_F^2 f_\pi^2\, m_{\pi^\pm} m_\mu^2}{8\pi} \left(1 - \frac{m_\mu^2}{m_{\pi^\pm}^2}\right)^2. \tag{7.113}$$
The charged pion can also decay according to $\pi^- \to e^- \bar\nu_e$. The calculation of this
decay rate is identical to the one just given, except that $m_e$ is substituted everywhere
for $m_\mu$. Therefore, we have:
$$\Gamma(\pi^- \to e^- \bar\nu_e) = \frac{G_F^2 f_\pi^2\, m_{\pi^\pm} m_e^2}{8\pi} \left(1 - \frac{m_e^2}{m_{\pi^\pm}^2}\right)^2, \tag{7.114}$$
The ratio (7.115) is a robust prediction of the theory, because the dependence on
$f_\pi$ has canceled out. Since there are no other kinematically-possible two-body decay
channels open to $\pi^-$, it should decay to $\mu^- \bar\nu_\mu$ almost always, with a rare decay to
$e^- \bar\nu_e$ occurring 0.012% of the time. This has been confirmed experimentally. We
can also use the measurement of the total lifetime of the $\pi^-$ to find $f_\pi$ numerically,
using (7.113). The result is:
It is not surprising that this value is of the same order-of-magnitude as the mass of
the pion.
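Both numerical statements above follow directly from (7.113) and (7.114). The sketch below computes the ratio of the electron to muon decay rates, in which $f_\pi$ cancels, and then solves for $f_\pi$ from the charged pion lifetime. The particle masses, $G_F$, $\hbar$, and the pion lifetime of about $2.6 \times 10^{-8}$ s are standard reference values assumed here; they are not quoted in this passage.

```python
import math

# Assumed reference inputs (not quoted in this passage).
G_F = 1.1664e-5       # GeV^-2
M_PI = 0.13957        # charged pion mass, GeV
M_MU = 0.1056584      # muon mass, GeV
M_E = 0.000510999     # electron mass, GeV
HBAR = 6.582120e-25   # GeV * s
TAU_PI = 2.6033e-8    # charged pion lifetime, s


def pion_width(f_pi: float, m_lepton: float) -> float:
    """Decay rate of (7.113)/(7.114) for pi- -> lepton + antineutrino, in GeV."""
    helicity_factor = m_lepton**2 * (1 - m_lepton**2 / M_PI**2) ** 2
    return G_F**2 * f_pi**2 * M_PI * helicity_factor / (8 * math.pi)


# The ratio is independent of f_pi, so any value can be used here.
ratio = pion_width(1.0, M_E) / pion_width(1.0, M_MU)

# Total width from the lifetime, shared between the two leptonic channels.
gamma_total = HBAR / TAU_PI
f_pi = math.sqrt(gamma_total / (pion_width(1.0, M_MU) + pion_width(1.0, M_E)))

print(f"Gamma(e) / Gamma(mu) = {ratio:.3e}")
print(f"f_pi = {f_pi:.4f} GeV")
```

The ratio comes out slightly above $10^{-4}$, in line with the rare electron mode described in the text, and $f_\pi$ comes out a little below the pion mass.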
The most striking feature of the $\pi^-$ decay rate is that it is proportional to $m_\mu^2$, with
$\mathcal{M}$ proportional to $m_\mu$. This is what leads to the strong suppression of decays to $e^- \bar\nu_e$
(already mentioned at the end of Sect. 7.1). We found this result just by calculating.
To understand it better, we can draw a momentum-helicity-spin diagram, using the
fact that the $\ell^-$ and $\bar\nu_\ell$ produced in the weak interactions are L and R respectively:
The $\pi^-$ has spin 0, but the final state predicted by the weak interaction helicities
unambiguously has spin 1. Therefore, if helicity were exactly conserved, the $\pi^-$
could not decay at all! However, helicity conservation only holds in the high-energy
limit in which we can treat all fermions as massless. This decay is said to be
helicity-suppressed, since it can occur only because $m_\mu$ and $m_e$ are non-zero.
In the limit $m_\ell \to 0$, we recover exact helicity conservation and the reduced matrix
element and the decay rate vanish. This explains why they should be proportional
to $m_\ell$ and $m_\ell^2$ respectively. The helicity suppression of this decay is therefore a
good prediction of the rule that the weak interactions affect only L fermions and
R antifermions. In the final state, the charged lepton $\mu^-$ or $e^-$ is said to undergo a
helicity flip, meaning that the L-helicity fermion produced by the weak interactions
has an amplitude to appear in the final state as an R-fermion. In general, a helicity flip
for a fermion entails a suppression in the reduced matrix element proportional to the
mass of the fermion divided by its energy.
Having computed the decay rate following from (7.102), let us find a Lagrangian
that would give rise to it involving a quantum field for the pion. Although the pion
is a composite, bound-state particle, we can still invent a quantum field for it, in an
approximate, “effective” description. The π − corresponds to a charged spin-0 field.
Previously, we studied spin-0 particles described by a real scalar field. However, the
particle and antiparticle created by a real scalar field turned out to be the same thing.
Here we want something different; since the π − is charged, its antiparticle π + is
clearly a different particle. This means that the π − particle should be described by
a complex scalar field.
Let us therefore define $\pi^-(x)$ to be a complex scalar field, with its complex
conjugate given by
$$\pi^+(x) \equiv (\pi^-(x))^*. \tag{7.117}$$
We can construct a real free Lagrangian density from these complex fields as follows:
$$\mathcal{L} = \partial_\mu \pi^+ \partial^\mu \pi^- - m_{\pi^\pm}^2\, \pi^+ \pi^-. \tag{7.118}$$
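(Compare to the Lagrangian density for a real scalar field, (4.18).) This Lagrangian
density describes free pion fields with mass $m_{\pi^\pm}$. At any fixed time $t = 0$, the $\pi^+$
and $\pi^-$ fields can be expanded in creation and annihilation operators as:
\begin{align}
\pi^-(\mathbf{x}) &= \int d\tilde p\, \left(e^{i\mathbf{p}\cdot\mathbf{x}}\, a_{\mathbf{p},-} + e^{-i\mathbf{p}\cdot\mathbf{x}}\, a^\dagger_{\mathbf{p},+}\right); \tag{7.119} \\
\pi^+(\mathbf{x}) &= \int d\tilde p\, \left(e^{i\mathbf{p}\cdot\mathbf{x}}\, a_{\mathbf{p},+} + e^{-i\mathbf{p}\cdot\mathbf{x}}\, a^\dagger_{\mathbf{p},-}\right). \tag{7.120}
\end{align}
Note that these fields are indeed complex conjugates of each other, and that they are
each complex since $a_{\mathbf{p},-}$ and $a_{\mathbf{p},+}$ are taken to be independent. The operators $a_{\mathbf{p},-}$
and $a^\dagger_{\mathbf{p},-}$ act on states by destroying and creating a $\pi^-$ particle with 3-momentum
$\mathbf{p}$. Likewise, the operators $a_{\mathbf{p},+}$ and $a^\dagger_{\mathbf{p},+}$ act on states by destroying and creating a
$\pi^+$ particle with 3-momentum $\mathbf{p}$. In particular, the single particle states are:
\begin{align}
a^\dagger_{\mathbf{p},-} |0\rangle &= |\pi^-; \mathbf{p}\rangle, \tag{7.121} \\
a^\dagger_{\mathbf{p},+} |0\rangle &= |\pi^+; \mathbf{p}\rangle. \tag{7.122}
\end{align}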
One can now carry through canonical quantization as usual. Given an interaction
Lagrangian, one can derive the corresponding Feynman rules for the propagator and
interaction vertices. Since a π − moving forward in time is a π + moving backwards
in time, and vice versa, there is only one propagator for π ± fields. It differs from the
propagator for an ordinary scalar in that it carries an arrow indicating the direction
of the flow of charge:
The external state pion lines also carry an arrow direction telling us whether it is a
π − or a π + particle. A pion line entering from the left with an arrow pointing to the
right means a π + particle in the initial state, while a line entering from the left with
an arrow pointing back to the left means a π − particle in the initial state. Similarly,
if a pion line leaves the diagram to the right, it represents a final state pion, with an
arrow to the right meaning a π + and an arrow to the left meaning a π − . We can
summarize this with the following mnemonic figures:
In each case the Feynman rule factor associated with the initial- or final-state pion
is just 1.
Returning to the reduced matrix element of (7.102), we can interpret this as
coming from a pion-lepton-antineutrino interaction vertex. When we computed the
decay matrix element, the pion was on-shell, but in general this need not be the case.
The pion decay constant $f_\pi$ must therefore be generalized to a function $f(p^2)$, with
$$f(p^2)\Big|_{p^2 = m_{\pi^\pm}^2} = f_\pi \tag{7.123}$$
$$p_\rho \leftrightarrow i\partial_\rho. \tag{7.124}$$
Then, reversing the usual procedure of inferring the Feynman rule from a term in
the interaction Lagrangian, we conclude that the effective interaction describing $\pi^-$
decay is:
$$\mathcal{L}_{{\rm int},\, \pi^- \bar\mu \nu_\mu} = -\sqrt{2}\, G_F\, (\bar\mu \gamma^\rho P_L \nu_\mu)\, f(-\partial^2)\, \partial_\rho \pi^-. \tag{7.125}$$
and
In each Feynman diagram, the arrow on the pion line describes the direction of flow
of charge, and the 4-momentum $p^\rho$ is taken to be flowing in to the vertex. When the
pion is on-shell, one can replace $f(p^2)$ by the pion decay constant $f_\pi$.
Other charged mesons made out of a quark and antiquark, like the $K^\pm$, $D^\pm$, and
$D_s^\pm$, have their own decay constants $f_K$, $f_D$, and $f_{D_s}$, and their decays can be treated
in a similar way.
Let us develop the dimensional analysis of fields and couplings further. We know
that the Lagrangian must have the same units as energy. In the standard system in
which $c = \hbar = 1$, this is equal to units of [mass]. Since $d^3x$ has units of [length]$^3$,
or [mass]$^{-3}$, and
$$L = \int d^3x\, \mathcal{L}, \tag{7.127}$$
it must be that $\mathcal{L}$ has units of [mass]$^4$. This fact allows us to evaluate the units of
all fields and couplings in a theory. For example, a spacetime derivative has units of
inverse length, or [mass]. Therefore, from the kinetic terms for scalars, fermions, and
vector fields found for example in (4.18), (4.26), and (4.34), we find that these types
of fields must have dimensions of [mass], [mass]$^{3/2}$, and [mass] respectively. This
allows us to evaluate the units of various possible interaction couplings that appear
in the Lagrangian density. For example, a coupling of $n$ scalar fields,
$$\mathcal{L}_{\rm int} = -\frac{\lambda_n}{n!}\, \phi^n \tag{7.128}$$
implies that $\lambda_n$ has units of [mass]$^{4-n}$. A vector-fermion-fermion coupling, like $e$ in
QED, is dimensionless. The effective coupling $f_\pi$ for on-shell pions has dimensions
of [mass], because of the presence of a spacetime derivative together with a scalar
field and two fermion fields in the Lagrangian. Summarizing this information for the
known types of fields and couplings that we have encountered so far:
Object                        Dimension                Role
$\mathcal{L}$                 [mass]$^4$               Lagrangian density
$\partial_\mu$                [mass]                   derivative
$\phi$                        [mass]                   scalar field
$\Psi$                        [mass]$^{3/2}$           fermion field
$A_\mu$                       [mass]                   vector field
$\lambda_3$                   [mass]                   scalar$^3$ coupling
$\lambda_4$                   [mass]$^0$               scalar$^4$ coupling
$y$                           [mass]$^0$               scalar-fermion-fermion (Yukawa) coupling
$e$                           [mass]$^0$               photon-fermion-fermion coupling
$G_F$                         [mass]$^{-2}$            fermion$^4$ coupling
$f_\pi$                       [mass]                   fermion$^2$-scalar-derivative coupling
$u, v, \bar u, \bar v$        [mass]$^{1/2}$           external-state spinors
$\mathcal{M}_{N_i \to N_f}$   [mass]$^{4-N_i-N_f}$     reduced matrix element for $N_i \to N_f$ particles
$\sigma$                      [mass]$^{-2}$            cross-section
$\Gamma$                      [mass]                   decay rate.
(7.129)
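The counting behind several rows of this table can be automated: since each term in $\mathcal{L}$ has dimension 4, the dimension of a coupling is 4 minus the total dimension of the fields, derivatives, and any other couplings multiplying it. The sketch below is a minimal illustration of that rule; the dictionary keys and the function name `coupling_dimension` are ours, introduced only for this example.

```python
from fractions import Fraction

# Mass dimensions of the building blocks, as derived from the kinetic terms.
FIELD_DIM = {
    "scalar": Fraction(1),
    "fermion": Fraction(3, 2),
    "vector": Fraction(1),
    "derivative": Fraction(1),
}


def coupling_dimension(content: dict, other_couplings: Fraction = Fraction(0)) -> Fraction:
    """Mass dimension of a coupling multiplying the given fields and derivatives.

    content maps a building block name to how many times it appears in the term;
    other_couplings is the summed dimension of any other couplings in the term.
    """
    used = sum(FIELD_DIM[name] * count for name, count in content.items())
    return 4 - used - other_couplings


dims = {
    "lambda_3": coupling_dimension({"scalar": 3}),
    "lambda_4": coupling_dimension({"scalar": 4}),
    "lambda_5": coupling_dimension({"scalar": 5}),
    "yukawa": coupling_dimension({"scalar": 1, "fermion": 2}),
    "e": coupling_dimension({"vector": 1, "fermion": 2}),
    "G_F": coupling_dimension({"fermion": 4}),
    # f_pi multiplies G_F (dimension -2), two fermions, a derivative and a scalar.
    "f_pi": coupling_dimension({"fermion": 2, "derivative": 1, "scalar": 1},
                               other_couplings=Fraction(-2)),
}

for name, dim in dims.items():
    print(f"{name:9s} [mass]^{dim}")
```

Couplings that come out with negative dimension here, such as $G_F$ and $\lambda_5$, are the ones that signal the non-renormalizability discussed next.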
It is a general fact that theories with couplings with negative mass dimension, like
$G_F$, or $\lambda_n$ for $n \geq 5$, always suffer from a problem known as non-renormalizability.⁴
4 The converse is not true; just because a theory has only couplings with positive or zero mass
dimension does not guarantee that it is renormalizable. It is a necessary, but not sufficient, condition.
7.7 Unitarity, Renormalizability, and the W Boson 193
In a renormalizable theory, the divergences that occur in loop diagrams due to inte-
grating over arbitrarily large 4-momenta for virtual particles can be regularized by
introducing a high momentum cutoff, and then the resulting dependence on the
unknown cutoff can be absorbed into a redefinition of the masses and coupling
constants of the theory. In contrast, in a non-renormalizable theory, one finds that
this process requires introducing an infinite number of different couplings, each
of which must be redefined in order to absorb the momentum-cutoff dependence.
This dependence on an infinite number of different coupling constants makes non-
renormalizable theories non-predictive, although only in principle. We can always
use non-renormalizable theories as effective theories at low energies, as we have
done in the case of the four-fermion theory of the weak interactions. However, when
probed at sufficiently high energies, a non-renormalizable theory will encounter
related problems associated with the apparent failure of unitarity (cross-sections
that grow uncontrollably with energy) and non-renormalizability (an uncontrollable
dependence on more and more unknown couplings that become more and more
important at higher energies). For this reason, we are happier to describe physical
phenomena using renormalizable theories if we can.
An example of a useful non-renormalizable theory is gravity. The effective
coupling constant for Feynman diagrams involving gravitons is $1/M_{\rm Planck}^2$, where
$M_{\rm Planck} = 2.4 \times 10^{18}$ GeV is the “reduced Planck mass”. Like $G_F$, this coupling has
Since the currents involved have electric charges −1 and +1, the vector boson must
carry charge −1 to the right. This is the $W^-$ vector boson. By analogy with the
charged pion, the $W^\pm$ are complex vector fields, with an arrow on the propagator
indicating the direction of flow of charge.
The Feynman rule for the propagator of a charged vector $W^\pm$ boson carrying
4-momentum $p$ turns out to be:
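In the limit of low energies and momenta, $|p_\rho| \ll m_W$, this propagator just becomes
a constant:
$$\frac{i}{p^2 - m_W^2 + i\epsilon} \left[-g_{\rho\sigma} + \frac{p_\rho p_\sigma}{m_W^2}\right] \longrightarrow i\, \frac{g_{\rho\sigma}}{m_W^2}. \tag{7.130}$$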
Here $g$ is a fundamental coupling of the weak interactions, and the $1/\sqrt{2}$ is a standard
convention. These are the Feynman rules involving $W^\pm$ interactions with leptons;
there are similar rules for interactions with the quarks in the charged currents $J_\rho^+$
and $J_\rho^-$ given earlier in (7.99) and (7.100). The interaction Lagrangian for $W$ bosons
with standard model fermions corresponding to these Feynman rules is:
$$\mathcal{L}_{\rm int} = -\frac{g}{\sqrt{2}} \left(W^{+\rho} J_\rho^- + W^{-\rho} J_\rho^+\right). \tag{7.131}$$
Comparing the four-fermion vertex to the reduced matrix element from $W$-boson
exchange, we find that we must have:
$$-i 2\sqrt{2}\, G_F = \left(\frac{-ig}{\sqrt{2}}\right)^2 \frac{i}{m_W^2}, \tag{7.132}$$
so that
$$G_F = \frac{g^2}{4\sqrt{2}\, m_W^2}. \tag{7.133}$$
The $W^\pm$ boson has been discovered, with a mass $m_W = 80.4$ GeV, so we conclude
that
$$g \approx 0.65. \tag{7.134}$$
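The value of $g$ follows from inverting (7.133). The short sketch below does the arithmetic, using the $W$ mass quoted in the text and an assumed reference value for $G_F$.

```python
import math

G_F = 1.1664e-5   # Fermi constant in GeV^-2 (assumed reference value)
M_W = 80.4        # W boson mass in GeV, as quoted in the text

# Invert G_F = g^2 / (4 sqrt(2) m_W^2) for the dimensionless coupling g.
g_squared = 4 * math.sqrt(2) * G_F * M_W**2
g = math.sqrt(g_squared)

print(f"g^2 = {g_squared:.4f}")
print(f"g   = {g:.3f}")
```

Note that $G_F$ has dimension [mass]$^{-2}$ and $m_W^2$ has dimension [mass]$^2$, so the product is a pure number, as a dimensionless coupling requires.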
Since this is a dimensionless coupling, there is at least a chance to make this into a
renormalizable theory that is unitary in perturbation theory. At very high energies,
the $W^\pm$ propagator will behave like $1/p^2$, rather than the $1/m_W^2$ that is encoded in
$G_F$ in the four-fermion approximation. This “softens” the weak interactions at high
energies, leading to cross-sections that fall, rather than rise, at very high $\sqrt{s}$.
When a massive vector boson appears in a final state, it has a Feynman rule
given by a polarization vector $\epsilon_\mu(p, \lambda)$, just like the photon did. The difference is
that a massive vector particle $V$ has three physical polarization states $\lambda = 1, 2, 3$,
satisfying
$$p^\rho \epsilon_\rho(p, \lambda) = 0 \qquad (\lambda = 1, 2, 3). \tag{7.135}$$
One can sum over these polarizations for an initial or final state in a squared reduced
matrix element, with the result:
$$\sum_{\lambda=1}^{3} \epsilon_\rho(p, \lambda)\, \epsilon^*_\sigma(p, \lambda) = -g_{\rho\sigma} + \frac{p_\rho p_\sigma}{m_V^2}. \tag{7.136}$$
Summarizing the propagator and external state Feynman rules for a generic massive
vector boson for future reference:
If the massive vector is charged, like the W ± bosons, then an arrow is added to each
line to show the direction of flow of charge.
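The polarization sum (7.136) and the transversality condition (7.135) can be verified numerically for an explicit choice of polarization vectors. The sketch below takes a vector boson moving along the $z$ axis with two transverse and one longitudinal polarization. The metric signature $(+,-,-,-)$ and the particular real basis of polarization vectors are assumptions made for this illustration; the helper names are ours.

```python
import math

METRIC = (1.0, -1.0, -1.0, -1.0)   # diagonal of g, signature (+,-,-,-): an assumption


def lower(v):
    """Lower the index of a contravariant 4-vector with the diagonal metric."""
    return [METRIC[i] * v[i] for i in range(4)]


def kinematics(mass, pz):
    """4-momentum and three polarization vectors (upper indices) for motion along z."""
    energy = math.hypot(mass, pz)
    p = [energy, 0.0, 0.0, pz]
    eps = [
        [0.0, 1.0, 0.0, 0.0],                  # transverse, along x
        [0.0, 0.0, 1.0, 0.0],                  # transverse, along y
        [pz / mass, 0.0, 0.0, energy / mass],  # longitudinal
    ]
    return p, eps


def polarization_sum(mass, pz):
    """Left side of (7.136): sum over lambda of eps_rho eps_sigma (real vectors)."""
    _, eps = kinematics(mass, pz)
    low = [lower(e) for e in eps]
    return [[sum(e[r] * e[s] for e in low) for s in range(4)] for r in range(4)]


def closed_form(mass, pz):
    """Right side of (7.136): -g_{rho sigma} + p_rho p_sigma / m^2."""
    p, _ = kinematics(mass, pz)
    p_low = lower(p)
    return [[-(METRIC[r] if r == s else 0.0) + p_low[r] * p_low[s] / mass**2
             for s in range(4)] for r in range(4)]


def transversality(mass, pz):
    """p^rho eps_rho for each polarization; should vanish by (7.135)."""
    p, eps = kinematics(mass, pz)
    return [sum(p[i] * lower(e)[i] for i in range(4)) for e in eps]


lhs = polarization_sum(80.4, 50.0)
rhs = closed_form(80.4, 50.0)
max_diff = max(abs(lhs[r][s] - rhs[r][s]) for r in range(4) for s in range(4))
max_dot = max(abs(x) for x in transversality(80.4, 50.0))
print("largest mismatch in (7.136):", max_diff)
print("largest |p . eps|          :", max_dot)
```

The longitudinal vector is the one whose components grow like $E/m_V$ at high energy, which is the origin of the $p_\rho p_\sigma/m_V^2$ term.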
The weak interactions and the strong interactions are invariant under non-Abelian
gauge transformations, which involve a generalization of the type of gauge invariance
we have already encountered in the case of QED. This means that the gauge
transformations not only multiply fields by phases, but can mix the fields. In the next
section we will begin to study the properties of field theories, known as Yang-Mills
theories, which have a non-Abelian gauge invariance. This will enable us to get a
complete theory of the weak interactions.
Problems
where G X is a constant.
For this S − P theory, calculate
$$\frac{1}{2} \sum_{\rm spins} |\mathcal{M}|^2 \tag{7.138}$$
$$\frac{d\Gamma}{dE_e}, \tag{7.139}$$
Notice that we are again making the approximation of ignoring the issue of mass
eigenstates being not quite the same as the fields that couple to the W − .
(a) The W − boson cannot decay into a final state with a bottom quark, within
the very good approximation just mentioned. Why? (This is a useful thing
sometimes; if your experiment tags a bottom quark jet, you can say it almost
certainly didn’t come from a W − decay unless it was mis-tagged.)
(b) Treating the electron as massless, compute the decay rate for $W^- \to e^- \bar\nu_e$,
in terms of $g$ and $M_W$. [Draw the Feynman diagram, write down the reduced
matrix element, take its complex square, average over the three possible initial
polarizations of the $W^-$ boson, sum over all possible final state spins.]
(c) From your answer to part (b), infer the results for: $\Gamma(W^- \to \mu^- \bar\nu_\mu)$ and
$\Gamma(W^- \to \tau^- \bar\nu_\tau)$ and $\Gamma(W^- \to d \bar u)$ and $\Gamma(W^- \to s \bar c)$, treating all of the
final-state fermions as massless. Remember that each quark in the final state
has 3 possible colors, which you must sum over. The antiquarks are constrained
to have the opposite color, so once you have summed over the quark colors,
you should not sum over the antiquark’s anticolors. Because the W − has a
large mass and the decay happens quickly, you can assume that the strong
interactions of the quarks and antiquarks are irrelevant until long after the
decay has occurred.
(d) From the above results, predict the total decay width of the W boson in GeV,
its lifetime in seconds, and its branching ratio into each of the possible final
states.
8 Quantum Chromo-Dynamics (QCD)
In this section, we will generalize the idea of gauge invariance found in electrodynamics.
This is primarily a mathematical exercise, but it serves the greater purpose
of this chapter, which is to describe quantum chromodynamics, the
theory that governs the interactions of the quarks, in the following sections.
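Recall that in QED the Lagrangian is defined in terms of a covariant derivative
$$D_\mu = \partial_\mu + i Q e A_\mu \tag{8.1}$$
and a field strength
$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \tag{8.2}$$
as
$$\mathcal{L} = -\frac{1}{4} F^{\mu\nu} F_{\mu\nu} + i \bar\Psi \not{D} \Psi - m \bar\Psi \Psi. \tag{8.3}$$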
This Lagrangian is invariant under the local gauge transformation
\begin{align}
A_\mu &\to A'_\mu = A_\mu - \frac{1}{e} \partial_\mu \theta, \tag{8.4} \\
\Psi &\to \Psi' = e^{i Q \theta}\, \Psi, \tag{8.5}
\end{align}
where $\theta(x)$ is any function of spacetime, called a gauge parameter. Now, the result
of doing one gauge transformation $\theta_1$ followed by another gauge transformation $\theta_2$
is always a third gauge transformation parameterized by the function $\theta_1 + \theta_2$:
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_8
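\begin{align}
A_\mu &\to \left(A_\mu - \frac{1}{e} \partial_\mu \theta_1\right) - \frac{1}{e} \partial_\mu \theta_2 = A_\mu - \frac{1}{e} \partial_\mu (\theta_1 + \theta_2), \tag{8.6} \\
\Psi &\to e^{i Q \theta_2} \left(e^{i Q \theta_1}\, \Psi\right) = e^{i Q (\theta_1 + \theta_2)}\, \Psi. \tag{8.7}
\end{align}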
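The composition rule (8.7) is simple enough to check directly at a single spacetime point, where the gauge parameter is just a number. The sketch below does this for the three electric charges that occur for quarks and charged leptons, and also confirms that the order of the two transformations does not matter. The function name `u1` and the sample angles are ours.

```python
import cmath


def u1(charge: float, theta: float) -> complex:
    """Phase factor e^{i Q theta} multiplying a field of charge Q."""
    return cmath.exp(1j * charge * theta)


def composition_residual(charge: float, theta1: float, theta2: float) -> float:
    """|U(theta2) U(theta1) - U(theta1 + theta2)|, which should vanish by (8.7)."""
    return abs(u1(charge, theta2) * u1(charge, theta1) - u1(charge, theta1 + theta2))


def ordering_residual(charge: float, theta1: float, theta2: float) -> float:
    """|U(theta2) U(theta1) - U(theta1) U(theta2)|, zero because the phases commute."""
    return abs(u1(charge, theta2) * u1(charge, theta1)
               - u1(charge, theta1) * u1(charge, theta2))


charges = [-1.0, 2.0 / 3.0, -1.0 / 3.0]
worst_composition = max(composition_residual(q, 0.37, 1.9) for q in charges)
worst_ordering = max(ordering_residual(q, 0.37, 1.9) for q in charges)
print("largest composition residual:", worst_composition)
print("largest ordering residual   :", worst_ordering)
```

The second check is special to phases: for transformations represented by matrices, the order generally does matter.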
(1) Closure: If $g_i$ and $g_j$ are elements of the group $G$, then the product $g_i g_j$ is also
an element of $G$.
(2) Associativity: $g_i (g_j g_k) = (g_i g_j) g_k$.
(3) Existence of an Identity: There is a unique element $I = g_1$ of the group, such
that for all $g_i$ in $G$, $I g_i = g_i I = g_i$.
(4) Inversion: For each $g_i$, there is a unique inverse element $(g_i)^{-1}$ satisfying
$(g_i)^{-1} g_i = g_i (g_i)^{-1} = I$.
It may or may not also be true that the group satisfies the commutativity property:
$$g_i g_j = g_j g_i. \tag{8.8}$$
$$U_Q(\theta) = e^{i Q \theta}. \tag{8.9}$$
Here θ labels the group elements, and the charge Q labels the representation of
the group. So we can say that the electron, muon, and tau Dirac fields each live in
a representation of the group U (1) with charge Q = −1; the Dirac fields for up,
charm, and top quarks each live in a representation with charge Q = 2/3; and the
8.1 Groups and Representations 201
Dirac fields for down, strange and bottom quarks each live in a representation with
Q = −1/3. We can read off the charge of any field if we know how it transforms
under the gauge group. A barred Dirac field transforms with the opposite phase from
the original Dirac field of charge Q, and therefore has charge −Q.
Objects that transform into themselves with no change are said to be in the singlet
representation. In general, the Lagrangian should be invariant under gauge transfor-
mations, and therefore must be in the singlet representation. For example, each term
of the QED Lagrangian carries no charge, and so is a singlet of U (1). The photon
field Aμ has charge 0, and is therefore usually said (by a slight abuse of language) to
transform as a singlet representation of U (1). [Technically, it does not really trans-
form under gauge transformations as any representation of the group U (1), because
of the derivative term in (8.4), unless θ is a constant function so that one is making
the same transformation everywhere in spacetime.]
Let us now generalize to non-Abelian groups, which always involve representations
containing more than one field or state. Let $\varphi_i$ be a set of objects that together
transform in some representation $R$ of the group $G$. The number of components of
$\varphi_i$ is called the dimension of the representation, $d_R$, so that $i = 1, \ldots, d_R$. Under a
group transformation,
$$\varphi_i \to \varphi'_i = U_i{}^j \varphi_j \tag{8.10}$$
$$U(\epsilon)_i{}^j = (1 + i \epsilon^a T^a)_i{}^j. \tag{8.11}$$
Here the $T^{aj}_i$ are a basis for all the possible infinitesimal group transformations.
The number of matrices $T^a$ is called the dimension of the group, $d_G$, and there
is an implicit sum over $a = 1, \ldots, d_G$. The $\epsilon^a$ are a set of $d_G$ infinitesimal gauge
parameters (analogous to $\theta$ in QED) that tell us how much of each $T^a$ is included in the
transformation represented by $U(\epsilon)$. Since $U(\epsilon)$ is unitary,
The closure property requires that this is a representation of the group element in
(8.13), which must also be close to the identity. It follows that
$$[T^a, T^b] = i f^{abc}\, T^c \tag{8.16}$$
for some set of numbers $f^{abc}$, called the structure constants of the group. In practice,
one often picks a particular representation of matrices $T^a$ as the defining or
fundamental representation. This determines the structure constants $f^{abc}$ once and
for all. The set of matrices $T^a$ for all other representations are then required to reproduce
(8.16), which fixes their overall normalization. Equation (8.16) defines the Lie
algebra corresponding to the Lie group, and the hermitian matrices $T^a$ are said to be
generators of the Lie algebra for the corresponding representation. Many physicists
have a bad habit of using the words “Lie group” and “Lie algebra” interchangeably,
because we often only care about the subset of gauge transformations that are close
to the identity.
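As a concrete instance of (8.16), the sketch below checks the commutation relations for the group $SU(2)$, taking the generators of the two-dimensional representation to be half the Pauli matrices, $T^a = \sigma^a/2$, with structure constants $f^{abc} = \epsilon^{abc}$. The choice of $SU(2)$ as the example and all helper names are ours, introduced only for illustration.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    n = len(a)
    return [[ab[i][j] - ba[i][j] for j in range(n)] for i in range(n)]


def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2


PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# Generators of the two-dimensional representation: T^a = sigma^a / 2.
T = [[[entry / 2 for entry in row] for row in sigma] for sigma in PAULI]


def algebra_residual(gens, f):
    """Largest entry of [T^a, T^b] - i f^{abc} T^c over all a, b."""
    n = len(gens[0])
    worst = 0.0
    for a in range(len(gens)):
        for b in range(len(gens)):
            lhs = commutator(gens[a], gens[b])
            for i in range(n):
                for j in range(n):
                    rhs = sum(1j * f(a, b, c) * gens[c][i][j]
                              for c in range(len(gens)))
                    worst = max(worst, abs(lhs[i][j] - rhs))
    return worst


residual = algebra_residual(T, levi_civita)
print("largest violation of [T^a, T^b] = i f^{abc} T^c:", residual)
```

Rescaling the $T^a$ by any factor other than 1 would make the residual non-zero, which illustrates how (8.16) fixes the normalization of the generators once the $f^{abc}$ are chosen.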
For any given representation R, one can always choose the generators so that:
The number $I(R)$ is called the index of the representation. A standard choice is that
the index of the fundamental representation of a non-Abelian Lie algebra is 1/2.
(This can always be achieved by rescaling the $T^a$, if necessary.) From (8.16) and
(8.17), one obtains for any representation $R$:
It follows, from the cyclic property of the trace, that $f^{abc}$ is totally antisymmetric
under interchange of any two of $a, b, c$. By using the Jacobi identity,
$$[T^a, [T^b, T^c]] + [T^b, [T^c, T^a]] + [T^c, [T^a, T^b]] = 0, \tag{8.19}$$
which holds for any three matrices, one also finds the useful result:
Two representations $R$ and $R'$ are said to be equivalent if there exists some fixed
matrix $X$ such that:
$$X\, T_R^a\, X^{-1} = T_{R'}^a \tag{8.21}$$
for all $a$. Obviously, this requires that $R$ and $R'$ have the same dimension. From a
physical point of view, equivalent representations are indistinguishable from each
other.
Here the $T_{r_i}^a$ are representation matrices for smaller representations $r_i$. One calls this
a direct sum, and writes it as
$$R = r_1 \oplus r_2 \oplus \cdots \oplus r_n. \tag{8.23}$$
$$(T_R^a T_R^a)_i{}^j = C(R)\,\delta_i^j \tag{8.24}$$
$$T_i^{a\,j} = 0. \tag{8.26}$$
If $T_{\overline{R}}^a$ is equivalent to $T_R^a$, so that there is some fixed matrix $X$ such that
$$(T^a_{R\otimes R'})_{i,x}{}^{j,y} \equiv (T_R^a)_i{}^j\,\delta_x^y + \delta_i^j\,(T_{R'}^a)_x{}^y. \tag{8.29}$$
$$R \otimes R' = R_1 \oplus \cdots \oplus R_n \tag{8.30}$$
with
$$d_{R\otimes R'} = d_R\, d_{R'} = d_{R_1} + \cdots + d_{R_n}. \tag{8.31}$$
This is a way to make larger representations ($R_1, \ldots, R_n$) out of smaller ones ($R$, $R'$).
One can check from the identity (8.20) that the matrices
$$(T^a)_b{}^c = -i f^{abc} \tag{8.32}$$
form a representation, called the adjoint representation, with the same dimension
as the group G. As a matter of terminology, the quadratic Casimir invariant of the
adjoint representation is also called the Casimir invariant of the group, and given the
symbol $C(G)$. Note that, from (8.25), the index of the adjoint representation is equal
to its quadratic Casimir invariant:
$$I(\text{Adjoint}) = C(G). \tag{8.33}$$
We now list, without proof, some further group theory facts regarding Lie algebra
representations:
1 Real representations can be divided into two sub-cases, “positive-real” and “pseudo-real”, depend-
ing on whether the matrix X can or cannot be chosen to be symmetric. In a pseudo-real representation,
the T a cannot all be made antisymmetric and imaginary; in a positive-real representation, they can.
• The tensor product of any representation with the singlet representation just gives
the original representation back:
1 ⊗ R = R ⊗ 1 = R. (8.36)
• The tensor product of two real representations R1 and R2 is always a direct sum
of representations that are either real or appear in complex conjugate pairs.
• The adjoint representation is always real.
• The tensor product of two irreducible representations contains the singlet repre-
sentation if and only if they are complex conjugates of each other:
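$$R_1 \otimes R_2 = 1 \oplus \cdots \quad\longleftrightarrow\quad R_2 = \overline{R}_1. \tag{8.37}$$
$$R \otimes \overline{R} = 1 \oplus \text{Adjoint} \oplus \cdots. \tag{8.38}$$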
• As a corollary of the preceding rules, the tensor product of the adjoint representation
with itself always contains both the singlet and the adjoint:
$$\text{Adjoint} \otimes \text{Adjoint} = 1_S \oplus \text{Adjoint}_A \oplus \cdots. \tag{8.39}$$
Here the $S$ and $A$ mean that the indices of the two adjoints on the left are combined
symmetrically and antisymmetrically respectively.
• If the tensor product of two representations contains a third, then the tensor
product of the first representation with the conjugate of the third representation
contains the conjugate of the second representation:
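$$R_1 \otimes R_2 = R_3 \oplus \cdots \quad\longleftrightarrow\quad R_1 \otimes \overline{R}_3 = \overline{R}_2 \oplus \cdots, \tag{8.40}$$
$$R_1 \otimes R_2 \otimes R_3 = 1 \oplus \cdots \quad\longleftrightarrow\quad R_1 \otimes R_2 = \overline{R}_3 \oplus \cdots. \tag{8.41}$$
$$I(R_1)\,d_{R_2} + I(R_2)\,d_{R_1} = \sum_{i=1}^{n} I(r_i). \tag{8.42}$$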
Let us recall how Lie algebra representations work in the example of SU (2), the
group of unitary 2 × 2 matrices (that’s the “U(2)” part of the name) with determinant
1 (that’s the “S”, for special, part of the name). This group is familiar from the
study of angular momentum in quantum mechanics, and the defining or fundamental
representation is the familiar spin-1/2 one, with components $\phi_i$ for $i = 1, 2$ (or up, down). The Lie
algebra generators in the fundamental representation are:
$$T^a = \frac{\sigma^a}{2} \qquad (a = 1, 2, 3), \tag{8.43}$$
where the $\sigma^a$ are the three Pauli matrices (see, for example, (3.23)). One finds that
the structure constants are
$$f^{abc} = \epsilon^{abc} = \begin{cases} +1 & \text{if } a, b, c = 1, 2, 3 \text{ or } 2, 3, 1 \text{ or } 3, 1, 2 \\ -1 & \text{if } a, b, c = 1, 3, 2 \text{ or } 3, 2, 1 \text{ or } 2, 1, 3 \\ \phantom{+}0 & \text{otherwise.} \end{cases} \tag{8.44}$$
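The commutation relations (8.16) with the structure constants (8.44) can be checked directly from the Pauli matrices. The following is a minimal Python sketch using plain nested lists; the helper names (`matmul`, `levi_civita`, `commutator_residual`) are illustrative choices, not notation from the text, and the indices run over 0, 1, 2 rather than 1, 2, 3.

```python
# Pauli matrices as nested lists of (complex) numbers.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
# Fundamental generators T^a = sigma^a / 2, as in (8.43).
T = [[[x / 2 for x in row] for row in s] for s in SIGMA]


def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(A):
    """Sum of the diagonal entries."""
    return sum(A[i][i] for i in range(len(A)))


def levi_civita(a, b, c):
    """Totally antisymmetric symbol for indices taken from 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) / 2


def commutator_residual(a, b):
    """Largest entry of [T^a, T^b] - i eps^{abc} T^c; zero if (8.16) holds."""
    ab, ba = matmul(T[a], T[b]), matmul(T[b], T[a])
    worst = 0.0
    for i in range(2):
        for j in range(2):
            rhs = sum(1j * levi_civita(a, b, c) * T[c][i][j] for c in range(3))
            worst = max(worst, abs(ab[i][j] - ba[i][j] - rhs))
    return worst


max_residual = max(commutator_residual(a, b)
                   for a in range(3) for b in range(3))
# Tr(T^1 T^1) gives the index of the fundamental representation.
index_fund = trace(matmul(T[0], T[0])).real
print(max_residual, index_fund)
```

The same two helpers work unchanged for any matrix representation, so the check can be repeated for larger spins once their matrices are written down.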
Irreducible representations exist for any “spin” $j = n/2$, where $n$ is a non-negative integer, and
have dimension $2j + 1$. The representation matrices $J^a$ in the spin-$j$ representation
satisfy the SU(2) Lie algebra:
$$[J^a, J^b] = i\epsilon^{abc} J^c. \tag{8.45}$$
$$J^3\,|j, m\rangle = m\,|j, m\rangle, \tag{8.46}$$
$$J^a J^a\,|j, m\rangle = j(j+1)\,|j, m\rangle, \tag{8.47}$$
We therefore recognize from (8.24) that the quadratic Casimir invariant of the spin-$j$
representation of SU(2) is $C(R_j) = j(j+1)$. It follows from (8.25) that the
index of the spin-$j$ representation is $I(R_j) = j(j+1)(2j+1)/3$. The $j = 1/2$
representation is real, because
$$X\left(-\frac{\sigma^{a*}}{2}\right)X^{-1} = \frac{\sigma^a}{2}, \tag{8.50}$$
where
$$X = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix}. \tag{8.51}$$
More generally, one can show that all representations of SU (2) are real. Making a
table of the representations of SU (2):
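The entries of such a table follow from the formulas already quoted. As a minimal sketch, the Python below tabulates the dimension, Casimir invariant, and index for the lowest few spins. It assumes the relation $d_R\, C(R) = d_G\, I(R)$, which follows from taking the trace of (8.24) and from setting $a = b$ and summing in (8.17); the function names are illustrative.

```python
from fractions import Fraction


def su2_index(two_j):
    """Index I(R_j) = Tr(J^3 J^3): the sum of m^2 over m = -j, ..., +j."""
    j = Fraction(two_j, 2)
    m_values = [j - k for k in range(two_j + 1)]
    return sum(m * m for m in m_values)


def su2_casimir(two_j):
    """Casimir from d_R C(R) = d_G I(R), with d_G = 3 and d_R = 2j + 1."""
    return 3 * su2_index(two_j) / (two_j + 1)


rows = []
for two_j in range(0, 5):
    j = Fraction(two_j, 2)
    rows.append((j, two_j + 1, su2_casimir(two_j), su2_index(two_j)))
    print(f"j={j}  dim={two_j + 1}  C={su2_casimir(two_j)}  I={su2_index(two_j)}")
```

Exact rational arithmetic is used so that the results can be compared with $j(j+1)$ and $j(j+1)(2j+1)/3$ without rounding.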
The tensor product of any two representations of SU (2) is reducible to a direct sum,
as:
$$j_1 \otimes j_2 = |j_1 - j_2| \oplus (|j_1 - j_2| + 1) \oplus \cdots \oplus (j_1 + j_2) \tag{8.53}$$
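The rule (8.53) is easy to exercise numerically. The sketch below (function names are illustrative) lists the spins in a tensor product and checks two consistency conditions: the dimensions add up as in (8.31), and the indices satisfy the sum rule (8.42), using $I(R_j) = j(j+1)(2j+1)/3$.

```python
from fractions import Fraction


def decompose(j1, j2):
    """Spins appearing in j1 (x) j2, following (8.53)."""
    j1, j2 = Fraction(j1), Fraction(j2)
    spins = []
    j = abs(j1 - j2)
    while j <= j1 + j2:
        spins.append(j)
        j += 1
    return spins


def dimension(j):
    """Dimension 2j + 1 of the spin-j representation."""
    return int(2 * j + 1)


def index(j):
    """Index I(R_j) = j(j + 1)(2j + 1)/3 of the spin-j representation."""
    return j * (j + 1) * (2 * j + 1) / 3


j1, j2 = Fraction(1, 2), Fraction(3, 2)
parts = decompose(j1, j2)

# Dimension count: d_{R1} d_{R2} equals the sum of the d_{r_i}.
dims_match = dimension(j1) * dimension(j2) == sum(dimension(j) for j in parts)
# Index sum rule (8.42).
index_match = (index(j1) * dimension(j2) + index(j2) * dimension(j1)
               == sum(index(j) for j in parts))
print(parts, dims_match, index_match)
```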
with a constant $\theta^a$ that does not depend on position in spacetime. The weak interactions
involve a still different SU(2), known as weak isospin or SU(2)$_L$. Weak
isospin is a gauge symmetry that acts only on left-handed fermion fields. The irreducible
$j = 1/2$ representations of SU(2)$_L$ are composed of the pairs of fermions
that couple to a $W^\pm$ boson, namely:
$$\begin{pmatrix} \nu_{eL} \\ e_L \end{pmatrix}; \quad \begin{pmatrix} \nu_{\mu L} \\ \mu_L \end{pmatrix}; \quad \begin{pmatrix} \nu_{\tau L} \\ \tau_L \end{pmatrix}; \tag{8.55}$$
$$\begin{pmatrix} u_L \\ d'_L \end{pmatrix}; \quad \begin{pmatrix} c_L \\ s'_L \end{pmatrix}; \quad \begin{pmatrix} t_L \\ b'_L \end{pmatrix}. \tag{8.56}$$
Here $e_L$ means $P_L e$, etc., and the primes mean that these are not quark mass eigenstates.
When one makes an SU(2)$_L$ gauge transformation, the transformation can
be different at each point in spacetime. However, one must make the same transfor-
mation simultaneously on each of these representations. We will come back to study
the SU (2) L symmetry in more detail later, and see more precisely how it ties into
the weak interactions and QED.
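One can generalize the SU(2) group to non-Abelian groups SU($N$) for any integer
$N \geq 2$. The Lie algebra generators of SU($N$) in the fundamental representation are
Hermitian, traceless $N \times N$ matrices. For SU(3), they can be written as
$$T^a = \frac{1}{2}\lambda^a \qquad (a = 1, \ldots, 8), \tag{8.57}$$
where the $\lambda^a$ are the eight Gell-Mann matrices.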
Note that each of these matrices is Hermitian and traceless, as required. They have
also been engineered to satisfy $\mathrm{Tr}(\lambda^a \lambda^b) = 2\delta^{ab}$, so that
$$\mathrm{Tr}(T^a T^b) = \frac{1}{2}\delta^{ab}, \tag{8.59}$$
and therefore the index of the fundamental representation is
$$I(3) = \frac{1}{2}. \tag{8.60}$$
By taking commutators of each pair of generators, one finds that the non-zero structure
constants of SU(3) are:
$$f^{123} = 1; \tag{8.63}$$
$$f^{147} = -f^{156} = f^{246} = f^{257} = f^{345} = -f^{367} = 1/2; \tag{8.64}$$
$$f^{458} = f^{678} = \sqrt{3}/2, \tag{8.65}$$
and those related to the above by permutations of indices, following from the condition
that the $f^{abc}$ are totally antisymmetric. From these one can find the adjoint
representation matrices using (8.32), with, for example:
$$T^1_{\text{adjoint}} = -i \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 0 \\
0 & 0 & 0 & 0 & 0 & -1/2 & 0 & 0 \\
0 & 0 & 0 & 0 & 1/2 & 0 & 0 & 0 \\
0 & 0 & 0 & -1/2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}, \tag{8.66}$$
etc. However, it is almost never necessary to actually use the explicit form of any
matrix representation larger than the fundamental. Instead, one relies on group-
theoretic identities. For example, calculations of Feynman diagrams often involve
the index or Casimir invariant of the fundamental representation, and the Casimir
invariant of the group. One can easily compute the latter by using (8.24) and (8.32):
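As a numerical cross-check of this kind of identity, the sketch below rebuilds the SU(3) structure constants from the fundamental generators and evaluates the contraction $f^{acd} f^{bcd}$, which by (8.24) and (8.32) equals $C(G)\,\delta^{ab}$. It assumes the standard convention for the Gell-Mann matrices (their explicit form is typed in here, not taken from the surrounding text), uses $f^{abc} = -2i\,\mathrm{Tr}([T^a, T^b]T^c)$, which follows from (8.16) and (8.59), and labels the generators 0 to 7 rather than 1 to 8.

```python
import math

S3 = 1 / math.sqrt(3)
# The eight Gell-Mann matrices, standard convention (an assumption of this sketch).
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]
# Fundamental generators T^a = lambda^a / 2, as in (8.57).
T = [[[x / 2 for x in row] for row in lam] for lam in GELL_MANN]


def matmul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def trace(A):
    """Sum of the diagonal entries."""
    return sum(A[i][i] for i in range(len(A)))


def structure_constant(a, b, c):
    """f^{abc} = -2i Tr([T^a, T^b] T^c), valid when Tr(T^a T^b) = delta^{ab}/2."""
    ab, ba = matmul(T[a], T[b]), matmul(T[b], T[a])
    comm = [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]
    return (-2j * trace(matmul(comm, T[c]))).real


f = [[[structure_constant(a, b, c) for c in range(8)]
      for b in range(8)] for a in range(8)]


def ff_contraction(a, b):
    """f^{acd} f^{bcd}, which should equal C(G) delta^{ab}."""
    return sum(f[a][c][d] * f[b][c][d] for c in range(8) for d in range(8))


casimir_group = ff_contraction(0, 0)
print(round(f[0][1][2], 12), round(f[3][4][7], 12), round(casimir_group, 12))
```

With these conventions the diagonal contraction comes out as 3 for every generator, consistent with the general SU($N$) value $C(G) = N$.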
Here, when we take the tensor product of two or more identical representations, the
irreducible representations on the right side are labeled as A, S, or M depending on
whether they involve an antisymmetric, symmetric, or mixed symmetry combination
of the indices of the original representations on the left side.
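In SU($N$), one can build any representation out of objects that carry only indices
transforming under the fundamental $N$ and antifundamental $\overline{N}$ representations. It
is useful to employ lowered indices for the fundamental, and raised indices for the
antifundamental. Then an object carrying $n$ lowered and $m$ raised indices:
$$\phi_{i_1 \ldots i_n}^{j_1 \ldots j_m} \tag{8.76}$$
transforms under the tensor product representation
$$\underbrace{N \otimes \cdots \otimes N}_{n \text{ times}} \otimes \underbrace{\overline{N} \otimes \cdots \otimes \overline{N}}_{m \text{ times}}. \tag{8.77}$$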
This is always reducible. To reduce it, one can decompose $\phi$ into parts that have
different symmetry and trace properties. So, for example, we can take an object that
transforms under SU($N$) as $N \times \overline{N}$, and write it as:
$$\phi_i^j = \left(\phi_i^j - \frac{1}{N}\delta_i^j \phi_k^k\right) + \left(\frac{1}{N}\delta_i^j \phi_k^k\right). \tag{8.78}$$
The first term in parentheses transforms as an adjoint representation, and the second
as a singlet, under SU($N$). For SU(3), this corresponds to the rule of (8.71).
Similarly, an object that transforms under SU($N$) as $N \times N$ can be decomposed
as
$$\phi_{ij} = \frac{1}{2}(\phi_{ij} + \phi_{ji}) + \frac{1}{2}(\phi_{ij} - \phi_{ji}). \tag{8.79}$$
The two terms on the right-hand side correspond to an $N(N+1)/2$-dimensional
symmetric tensor and an $N(N-1)/2$-dimensional antisymmetric tensor, both irreducible
representations. For $N = 3$, these are the $6$ and $\overline{3}$ representations, respectively,
and this decomposition corresponds to (8.72). By using this process of taking
symmetric and anti-symmetric parts and removing traces, one can find all necessary
tensor-product representation rules for any SU (N ) group.
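The two decompositions (8.78) and (8.79) can be made concrete with a small numerical sketch. The Python below splits an arbitrary $3 \times 3$ array of components into its traceless and trace parts, and into its symmetric and antisymmetric parts, and counts the independent components of each piece. The sample values and function names are illustrative assumptions.

```python
def split_trace(phi):
    """Split phi_i^j into traceless (adjoint) and trace (singlet) parts, as in (8.78)."""
    n = len(phi)
    tr = sum(phi[k][k] for k in range(n))
    singlet = [[tr / n if i == j else 0 for j in range(n)] for i in range(n)]
    adjoint = [[phi[i][j] - singlet[i][j] for j in range(n)] for i in range(n)]
    return adjoint, singlet


def split_symmetry(phi):
    """Split phi_{ij} into symmetric and antisymmetric parts, as in (8.79)."""
    n = len(phi)
    sym = [[(phi[i][j] + phi[j][i]) / 2 for j in range(n)] for i in range(n)]
    anti = [[(phi[i][j] - phi[j][i]) / 2 for j in range(n)] for i in range(n)]
    return sym, anti


def dimensions(n):
    """Number of independent components of each piece for SU(n)."""
    return {"adjoint": n * n - 1, "singlet": 1,
            "symmetric": n * (n + 1) // 2, "antisymmetric": n * (n - 1) // 2}


phi = [[1, 2, 3], [4, 5, 6], [7, 8, 10]]  # arbitrary sample components
adjoint, singlet = split_trace(phi)
sym, anti = split_symmetry(phi)
dims = dimensions(3)
print(dims)
```

For $N = 3$ the counts are 8 + 1 = 9 and 6 + 3 = 9, matching the nine components of a two-index object in either case.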
8.2 The Yang-Mills Lagrangian and Feynman Rules
In this section, we will construct the Lagrangian and Feynman rules for a theory of
Dirac fermions and gauge bosons transforming under a non-Abelian gauge group,
called a Yang-Mills theory.
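Let the Dirac fermion fields be given by $\Psi_i$, where $i$ is an index in some representation
of the gauge group with generators $T_i^{a\,j}$. Here $i = 1, \ldots, d_R$ and $a = 1, \ldots, d_G$.
Under a gauge transformation, we have:
$$\Psi_i \to U_i{}^j\, \Psi_j \tag{8.80}$$
where
$$U = \exp(i\theta^a T^a). \tag{8.81}$$
For an infinitesimal transformation, with small parameters $\theta^a = \epsilon^a$, this becomes
$$\Psi_i \to (1 + i\epsilon^a T^a)_i{}^j\, \Psi_j. \tag{8.82}$$
Taking the Hermitian conjugate gives
$$\Psi^{\dagger i} \to \Psi^{\dagger j}\,(1 - i\epsilon^a T^a)_j{}^i, \tag{8.83}$$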
where we have used the fact that the $T^a$ are Hermitian matrices. (Notice that taking the
Hermitian conjugate changes the heights of the representation indices, and in the case
of matrices, reverses their order. So the Dirac spinor carries a lowered representation
index, while the Hermitian conjugate spinor carries a raised index.) Now we can
multiply on the right by $\gamma^0$. The Dirac gamma matrices are completely separate
from the gauge group representation indices, so we get the transformation rule for
the barred Dirac spinors:
$$\overline{\Psi}^i \to \overline{\Psi}^j\,(1 - i\epsilon^a T^a)_j{}^i. \tag{8.84}$$
Because the $T^a$ are Hermitian, $(T^a)_j{}^i = (T^{a*})^i{}_j$,
so that
$$\overline{\Psi}^i \to (1 + i\epsilon^a[-T^{a*}])^i{}_j\, \overline{\Psi}^j. \tag{8.86}$$
Comparing with (8.27), this establishes that $\overline{\Psi}^i$ transforms in the complex conjugate
of the representation carried by $\Psi_i$.
Since $\overline{\Psi}^i$ and $\Psi_i$ transform as complex conjugate representations of each other,
their tensor product must be a direct sum of representations that includes a singlet.
The singlet is obtained by summing over the index $i$:
$$\overline{\Psi}^i \Psi_i. \tag{8.87}$$
$$\overline{\Psi}^i \Psi_i \to \overline{\Psi}^j\,(1 - i\epsilon^a T^a)_j{}^i\,(1 + i\epsilon^b T^b)_i{}^k\, \Psi_k \tag{8.88}$$
$$= \overline{\Psi}^i \Psi_i + O(\epsilon^2), \tag{8.89}$$
where the terms linear in $\epsilon^a$ have indeed canceled. Therefore, we can include a
fermion mass term in the Lagrangian:
$$\mathcal{L}_m = -m\, \overline{\Psi}^i \Psi_i. \tag{8.90}$$
This shows that each component of the field $\Psi_i$ must have the same mass.
Next we would like to include a derivative kinetic term for the fermions. Just as
in QED, the term
$$i\,\overline{\Psi}^i \gamma^\mu \partial_\mu \Psi_i \tag{8.91}$$
is not acceptable by itself, because $\partial_\mu \Psi_i$ does not gauge transform in the same way
that $\Psi_i$ does. The problem is that the derivative can act on the gauge-parameter
function $\epsilon^a$, giving an extra term:
$$\partial_\mu \Psi_i \to (1 + i\epsilon^a T^a)_i{}^j\, \partial_\mu \Psi_j + i(\partial_\mu \epsilon^a)\, T_i^{a\,j}\, \Psi_j. \tag{8.92}$$
By analogy with QED, we can fix this by writing a covariant derivative involving
vector fields, which will also transform in such a way as to cancel the last term in
(8.92):
$$D_\mu \Psi_i = \partial_\mu \Psi_i + i g A^a_\mu\, T_i^{a\,j}\, \Psi_j. \tag{8.93}$$
The vector boson fields $A^a_\mu$ are known as gauge fields. They carry an adjoint representation
index $a$ in addition to their spacetime vector index $\mu$. The number of
such fields is equal to the number of generator matrices $T^a$, which we recall is the
dimension of the gauge group $d_G$. The quantity $g$ is a coupling, known as a gauge
coupling. It is dimensionless, and is the direct analog of the coupling $e$ in QED. The
entries of the matrix $T_i^{a\,j}$ take the role played by the charges $q$ in QED. Notice that
the definition of the covariant derivative depends on the representation matrices for
the fermions, so there is really a different covariant derivative depending on which
fermion representation one is acting on.
$$\mathcal{L}_{\text{fermions}} = i\,\overline{\Psi}^i \gamma^\mu D_\mu \Psi_i - m\, \overline{\Psi}^i \Psi_i \tag{8.94}$$
is invariant under infinitesimal gauge transformations, provided that the gauge field
is taken to transform as:
$$A^a_\mu \to A^a_\mu - \frac{1}{g}\partial_\mu \epsilon^a - f^{abc} \epsilon^b A^c_\mu. \tag{8.95}$$
The term with a derivative acting on $\epsilon^a$ is the direct analog of a corresponding term
in QED, see (8.4). The last term vanishes for Abelian gauge groups like that of QED, but it is
necessary to ensure that the covariant derivative of a Dirac field transforms in the
same way as the field itself:
$$D_\mu \Psi_i \to (1 + i\epsilon^a T^a)_i{}^j\, D_\mu \Psi_j. \tag{8.96}$$
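The gauge fields also need a kinetic term, which takes the form
$$\mathcal{L}_{\text{gauge}} = -\frac{1}{4} F^{\mu\nu a} F^a_{\mu\nu}, \tag{8.97}$$
with an implied sum on $a$, where $F^a_{\mu\nu}$ is an antisymmetric field strength tensor for
each Lie algebra generator. However, we must also require that this Lagrangian is
invariant under gauge transformations. This is accomplished if we choose:
$$F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu - g f^{abc} A^b_\mu A^c_\nu. \tag{8.98}$$
$$-\frac{1}{4} F^{\mu\nu a} F^a_{\mu\nu} \to -\frac{1}{4} F^{\mu\nu a} F^a_{\mu\nu} - \frac{1}{2} F^{\mu\nu a} f^{abc} \epsilon^b F^c_{\mu\nu} + O(\epsilon^2). \tag{8.100}$$
The extra term linear in $\epsilon$ vanishes, because the part $F^{\mu\nu a} F^c_{\mu\nu}$ is symmetric under
interchange of $a \leftrightarrow c$, but its gauge indices are contracted with $f^{abc}$, which is
antisymmetric under the same interchange. Therefore, (8.97) is a gauge singlet.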
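Now we can find the Feynman rules for this theory in the usual way. First we
identify the kinetic terms that are quadratic in the fields. That part of $\mathcal{L}_{\text{Yang-Mills}}$
is:
$$\mathcal{L}_{\text{kinetic}} = -\frac{1}{4}(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu)(\partial^\mu A^{a\nu} - \partial^\nu A^{a\mu}) + i\,\overline{\Psi}^i\, {\partial\!\!\!/}\, \Psi_i - m\, \overline{\Psi}^i \Psi_i. \tag{8.102}$$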
These terms have exactly the same form as in the QED Lagrangian, but with a sum
over $d_G$ copies of the vector fields, labeled by $a$, and over $d_R$ copies of the fermion
field, labeled by $i$. Therefore we can obtain the Feynman rules for vector and fermion
propagators in the same way. For the gauge fields, we need to include a gauge-fixing
term, just as in QED (compare (5.32) and the surrounding discussion), in order to
have a well-defined propagator:
$$\mathcal{L}_{\text{gauge-fixing}} = -\frac{1}{2\xi}(\partial^\mu A^a_\mu)^2 \tag{8.103}$$
where $p^\mu$ is the 4-momentum along either direction in the wavy line, and one can
take $\xi = 1$ for Feynman gauge and $\xi = 0$ for Landau gauge. The $\delta^{ab}$ in the Feynman
rule just means that a gauge field does not change to a different type as it propagates.
Likewise, the Dirac fermion propagator is:
with 4-momentum $p^\mu$ along the arrow direction, and $m$ the mass of the Dirac fermion.
Again the factor of $\delta_i^j$ means the fermion does not change its identity as it propagates.
The interaction Feynman rules follow from the remaining terms in $\mathcal{L}_{\text{Yang-Mills}}$.
First, there is a fermion-fermion-vector interaction coming from the covariant derivative
in $\mathcal{L}_{\text{fermions}}$. Identifying the Feynman rule as $i$ times the term in the Lagrangian (recall
the discussion surrounding (4.250)), from
we get:
This rule says that the coupling of a gauge field to a fermion line is proportional to
the corresponding Lie algebra generator matrix. Since the matrices T a are not diag-
onal for non-Abelian groups, this interaction can change one fermion into another.
The Lagrangian density $\mathcal{L}_{\text{gauge}}$ contains three-gauge-field and four-gauge-field couplings,
proportional to $g$ and $g^2$ respectively. After combining some terms using the
antisymmetry of the $f^{abc}$ symbol, they can be written as
$$i\,\frac{\delta^3}{\delta A^a_\mu\, \delta A^b_\nu\, \delta A^c_\rho}\, \mathcal{L}_{AAA}, \tag{8.107}$$
resulting in:
where $p^\mu$, $q^\mu$, and $k^\mu$ are the gauge boson 4-momenta flowing into the vertex.
Likewise, the Feynman rule for the coupling of four gauge bosons with indices $\mu, a$
and $\nu, b$ and $\rho, c$ and $\sigma, d$ is:
$$i\,\frac{\delta^4}{\delta A^a_\mu\, \delta A^b_\nu\, \delta A^c_\rho\, \delta A^d_\sigma}\, \mathcal{L}_{AAAA}, \tag{8.108}$$
leading to:
There are more terms in these Feynman rules than in the corresponding Lagrangian,
since the functional derivatives have a choice of several fields on which to act. Notice
that these rules are invariant under the simultaneous interchange of all the indices and
momenta for any two vector bosons, for example $(\mu, a, p) \leftrightarrow (\nu, b, q)$. The above
Feynman rules are all that is needed to calculate tree-level Feynman diagrams in a
Yang-Mills theory with Dirac fermions. External state fermions and gauge bosons
are assigned exactly the same rules as for fermions and photons in QED. The external
state particles carry a representation or gauge index determined by the interaction
vertex to which that line is attached.
(However, this is not quite the end of the story if one needs to compute loop
diagrams. In that case, one must take into account that not all of the gauge fields
that can propagate in loops are actually physical. One way to fix this problem is by
introducing “ghost” fields that only appear in loops, never in initial or final states.
The ghost fields do not create and destroy real particles; they are really just book-
keeping devices that exist only to cancel the unphysical contributions of gauge fields
in loops. We will not do any loop calculations in this book, so we will not go into
more detail on that issue.)
The Yang-Mills theory we have constructed makes several interesting predictions.
One is that the gauge fields are necessarily massless. If one tries to get around this
by introducing a mass term for the vector gauge fields, like:
then one finds that this term is not invariant under the gauge transformation of (8.95).
Therefore, if we put in such a term, we necessarily violate the gauge invariance of
the Lagrangian, and the gauge symmetry will not be a symmetry of the theory. This
sounds like a serious problem, because there is only one known freely-propagating,
non-composite, massless vector field, the photon. In particular, the massive W ±
boson cannot be described by the Yang-Mills theory that we have so far. One way
to proceed would be to simply keep the term in (8.109), and accept that the theory
is not fully invariant under the gauge symmetry. The only problem with this is that
the theory would be non-renormalizable in that case; as a related problem, unitarity
would be violated in scattering at very high energies. Instead, we can explain the
non-zero mass of the W ± boson by enlarging the theory to include scalar fields,
leading to a spontaneous breakdown in the gauge symmetry.
Another nice feature of the Yang-Mills theory is that several different couplings
are predicted to be related to each other. Once we have picked a gauge group $G$, a
set of irreducible representations for the fermions, and the gauge coupling $g$, then
the interaction terms are all fixed. In particular, if we know the coupling of one type
of fermion to the gauge fields, then we know $g$. This in turn allows us to predict,
as a consequence of the gauge invariance, what the couplings of other fermions to
the gauge fields should be (as long as we know their representations), and what the
three-gauge-boson and four-gauge-boson vertices should be.
8.3 QCD Lagrangian and Feynman Rules
The strong interactions are based on a Yang-Mills theory with gauge group SU(3)$_c$,
with quarks transforming in the fundamental $3$ representation. The subscript $c$ is
to distinguish this as the group of invariances under transformations of the color
degrees of freedom. As far as we can tell, this is an exact symmetry of nature. (There
is also an approximate SU(3)$_{\text{flavor}}$ symmetry under which the quark flavors $u, d, s$
transform into each other; isospin is an SU(2) subgroup of this symmetry.) Each of
the quark Dirac fields $u, d, s, c, b, t$ transforms separately as a $3$ of SU(3)$_c$, and each
barred Dirac field $\overline{u}, \overline{d}, \overline{s}, \overline{c}, \overline{b}, \overline{t}$ therefore transforms as a $\overline{3}$, as we saw on general
grounds in Sect. 8.2.
For example, an up quark is created in an initial state by any one of the three color
component fields:
$$u = \begin{pmatrix} u_{\text{red}} \\ u_{\text{blue}} \\ u_{\text{green}} \end{pmatrix} = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}, \tag{8.110}$$
Since SU(3)$_c$ is an exact symmetry, no experiment can tell the difference between
a red quark and a blue quark, so the labels are intrinsically arbitrary. In fact, we can
do a different SU(3)$_c$ transformation at each point in spacetime, but simultaneously
on each quark flavor, so that:
$$u \to e^{i\theta^a(x) T^a} u, \qquad d \to e^{i\theta^a(x) T^a} d, \qquad s \to e^{i\theta^a(x) T^a} s, \qquad \text{etc.}, \tag{8.112}$$
where $T^a = \lambda^a/2$ with $a = 1, \ldots, 8$, and the $\theta^a(x)$ are any gauge parameter functions
Here $g_3$ is the coupling constant associated with the SU(3)$_c$ gauge interactions. The
strength of the strong interactions comes from the fact that $g_3 \gg e$.
Using the general results for a gauge theory in Sect. 8.2, we know that the propa-
gator for the gluon is that of a massless vector field just like the photon:
Note that it is traditional, in QCD, to use “springy” lines for gluons, to easily distin-
guish them from wavy photon lines. There are also quark-gluon interaction vertices
for each flavor of quark:
Here the quark line can be any of u, d, s, c, b, t. The gluon interaction changes the
color of the quarks when T a is non-diagonal, but never changes the flavor of the
quark line, so an up quark remains an up quark, a down quark remains a down quark,
etc.
The Lagrangian density also contains a “pure glue” part:
$$\mathcal{L}_{\text{glue}} = -\frac{1}{4} F^{\mu\nu a} F^a_{\mu\nu} \tag{8.116}$$
$$F^a_{\mu\nu} = \partial_\mu G^a_\nu - \partial_\nu G^a_\mu - g_3 f^{abc} G^b_\mu G^c_\nu, \tag{8.117}$$
where $f^{abc}$ are the structure constants for SU(3)$_c$ given in (8.63)–(8.65). In addition
to the propagator, this implies that there are three-gluon and four-gluon interactions:
The spacetime- and gauge-index structure are just as given in Sect. 8.2 in the general
case, with $g \to g_3$.
8.4 Scattering of Quarks and Gluons
To see how the Feynman rules for QCD work in practice, let us consider the example
of quark-quark scattering. This is not a directly observable process, because
the quarks in both the initial state and final state are parts of bound states. How-
ever, it does form the microscopic part of a calculation for the observable process
hadron+hadron→jet+jet. We will see how to use the microscopic cross-section result
to obtain the observable cross-section later, in Sect. 8.6. To be specific, let us consider
the process of an up-quark and down-quark scattering from each other:
ud → ud. (8.118)
The reduced matrix element can now be written down by the same procedure as in
QED. One obtains, using Feynman gauge ($\xi = 1$):
$$\mathcal{M} = \left[\overline{u}_3(-i g_3 \gamma_\mu T_l^{a\,i}) u_1\right] \left[\frac{-i g^{\mu\nu} \delta^{ab}}{(p - k)^2}\right] \left[\overline{u}_4(-i g_3 \gamma_\nu T_m^{b\,j}) u_2\right] \tag{8.120}$$
$$= i g_3^2\, T_l^{a\,i}\, T_m^{a\,j}\, [\overline{u}_3 \gamma_\mu u_1][\overline{u}_4 \gamma^\mu u_2]/t \tag{8.121}$$
where $t = (p - k)^2$. This matrix element is exactly what one finds in QED for
$e^- \mu^- \to e^- \mu^-$, but with the QED squared coupling replaced by a product of matrices
depending on the color combination:
$$e^2 \to g_3^2\, T_l^{a\,i}\, T_m^{a\,j}. \tag{8.122}$$
This illustrates that the “color charge matrix” $g_3 T_l^{a\,i}$ is analogous to the electric
charge $e Q_f$. There are $3^4 = 81$ color combinations for quark-quark scattering.
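In order to find the differential cross-section, we continue as usual by taking the
complex square of the reduced matrix element:
$$|\mathcal{M}|^2 = g_3^4\, (T_l^{a\,i} T_m^{a\,j})(T_l^{b\,i} T_m^{b\,j})^*\, |\widetilde{\mathcal{M}}|^2, \tag{8.123}$$
where $|\widetilde{\mathcal{M}}|^2$ denotes the squared magnitude of the color-independent factor $[\overline{u}_3 \gamma_\mu u_1][\overline{u}_4 \gamma^\mu u_2]/t$ in (8.121).
$$\frac{1}{3}\sum_i \frac{1}{3}\sum_j \sum_l \sum_m |\mathcal{M}|^2. \tag{8.125}$$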
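To do the color sum/average most easily, we note that, because the gauge group
generator matrices are Hermitian, $(T_l^{b\,i})^* = T_i^{b\,l}$, so that
$$\frac{1}{3}\sum_i \frac{1}{3}\sum_j \sum_l \sum_m (T_l^{a\,i} T_m^{a\,j})(T_l^{b\,i} T_m^{b\,j})^* = \frac{1}{9}\sum_{i,j,l,m} (T_l^{a\,i} T_i^{b\,l})(T_m^{a\,j} T_j^{b\,m}) \tag{8.127}$$
$$= \frac{1}{9}\, \mathrm{Tr}(T^a T^b)\, \mathrm{Tr}(T^a T^b) \tag{8.128}$$
$$= \frac{1}{9}\, I(3)\delta^{ab}\, I(3)\delta^{ab} \tag{8.129}$$
$$= \frac{1}{9}\left(\frac{1}{2}\right)^2 d_G \tag{8.130}$$
$$= \frac{2}{9}. \tag{8.131}$$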
In doing this, we have used the definition of the index of a representation (8.17);
the fact that the index of the fundamental representation is 1/2; and the fact that the
sum over $a, b$ of $\delta^{ab}\delta^{ab}$ just counts the number of generators of the Lie algebra, $d_G$,
which is 8 for SU(3)$_c$.
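The value 2/9 can also be confirmed by brute force. The sketch below evaluates the color average in two ways: as the explicit sum over the four quark colors of $|\sum_a T^a_{li} T^a_{mj}|^2$, and as the trace form of (8.128). It assumes the standard convention for the Gell-Mann matrices, which are typed in here, and the function names are illustrative.

```python
import math

S3 = 1 / math.sqrt(3)
# The eight Gell-Mann matrices, standard convention (an assumption of this sketch).
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]
T = [[[x / 2 for x in row] for row in lam] for lam in GELL_MANN]


def color_factor_brute_force():
    """(1/3)(1/3) times the sum over i, j, l, m of |sum_a T^a_{l i} T^a_{m j}|^2."""
    total = 0.0
    for i in range(3):
        for j in range(3):
            for l in range(3):
                for m in range(3):
                    amp = sum(T[a][l][i] * T[a][m][j] for a in range(8))
                    total += abs(amp) ** 2
    return total / 9


def color_factor_from_traces():
    """(1/9) times the sum over a, b of Tr(T^a T^b)^2, the form used in (8.128)."""
    total = 0.0
    for a in range(8):
        for b in range(8):
            tr = sum(T[a][i][k] * T[b][k][i] for i in range(3) for k in range(3))
            total += (tr * tr).real
    return total / 9


brute = color_factor_brute_force()
traced = color_factor_from_traces()
print(brute, traced)
```

Both routes agree with 2/9 to rounding error, which is a useful sanity check before applying the same trace technique to processes with more colored legs.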
Meanwhile, the rest of $|\mathcal{M}|^2$, including a sum over final state spins and an average
over initial state spins, can be taken directly from the corresponding result for
$e^- \mu^- \to e^- \mu^-$ in QED, which we found by crossing symmetry in (5.213). Stripping
off the factor $e^4$ associated with the QED charges, we find in the high energy
limit of negligible quark masses,
$$\frac{1}{2}\sum_{s_1} \frac{1}{2}\sum_{s_2} \sum_{s_3} \sum_{s_4} |\widetilde{\mathcal{M}}|^2 = 2\left(\frac{s^2 + u^2}{t^2}\right). \tag{8.132}$$
Putting this together with the factor of $g_3^4$ and the color factor above, we have
$$\overline{|\mathcal{M}|^2} \equiv \frac{1}{9}\sum_{\text{colors}} \frac{1}{4}\sum_{\text{spins}} |\mathcal{M}|^2 = \frac{4 g_3^4}{9}\left(\frac{s^2 + u^2}{t^2}\right). \tag{8.133}$$
The notation $\overline{|\mathcal{M}|^2}$ is a standard notation, which for a general process implies the
appropriate sum/average over spin and color. The differential cross-section for this
process is therefore:
$$\frac{d\sigma}{d(\cos\theta)} = \frac{1}{32\pi s}\, \overline{|\mathcal{M}|^2} = \frac{2\pi \alpha_s^2}{9 s}\left(\frac{s^2 + u^2}{t^2}\right), \tag{8.134}$$
where
$$\alpha_s = \frac{g_3^2}{4\pi} \tag{8.135}$$
is the strong-interaction analog of the fine structure constant. Since we are neglecting
quark masses, the kinematics for this process is the same as in any massless $2 \to 2$
process, for example as found in (5.177)–(5.181). Therefore, one can replace $\cos\theta$
in favor of the Mandelstam variable $t$, using
$$d(\cos\theta) = \frac{2\, dt}{s}, \tag{8.136}$$
so
$$\frac{d\sigma}{dt} = \frac{4\pi \alpha_s^2}{9 s^2}\left(\frac{s^2 + u^2}{t^2}\right). \tag{8.137}$$
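To see how (8.134) and (8.137) are used in practice, here is a minimal Python sketch that evaluates both forms and checks that they differ only by the Jacobian $dt/d(\cos\theta) = s/2$. It assumes the standard massless $2 \to 2$ kinematics $t = -s(1 - \cos\theta)/2$ and $u = -s(1 + \cos\theta)/2$; the numerical inputs ($\sqrt{s} = 100$ GeV, $\alpha_s = 0.12$) are illustrative values only, and the results are in natural units (GeV$^{-4}$ for $d\sigma/dt$).

```python
import math


def mandelstam_t_u(s, cos_theta):
    """Massless 2 -> 2 kinematics: t = -s(1 - cos)/2 and u = -s(1 + cos)/2."""
    t = -0.5 * s * (1.0 - cos_theta)
    u = -0.5 * s * (1.0 + cos_theta)
    return t, u


def dsigma_dt_ud(s, t, alpha_s):
    """Equation (8.137): d(sigma)/dt for ud -> ud with massless quarks."""
    u = -s - t  # from s + t + u = 0
    return 4.0 * math.pi * alpha_s**2 / (9.0 * s**2) * (s**2 + u**2) / t**2


def dsigma_dcos_ud(s, cos_theta, alpha_s):
    """Equation (8.134): the same cross-section per unit cos(theta)."""
    t, u = mandelstam_t_u(s, cos_theta)
    return 2.0 * math.pi * alpha_s**2 / (9.0 * s) * (s**2 + u**2) / t**2


s = 100.0**2        # illustrative: (100 GeV)^2
alpha_s = 0.12      # illustrative value of the strong coupling
cos_theta = 0.3
t, u = mandelstam_t_u(s, cos_theta)

per_t = dsigma_dt_ud(s, t, alpha_s)
per_cos = dsigma_dcos_ud(s, cos_theta, alpha_s)
print(per_t, per_cos)
```

The $1/t^2$ factor makes the cross-section grow rapidly at small scattering angle, which is why any numerical use of these formulas needs a cut on $|t|$ or on the angle.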
Let us now turn to QCD scattering of gluons. Because there are three-gluon and
four-gluon interaction vertices, one has the interesting process gg → gg even at
tree-level. (It is traditional to represent the gluon particle name, but not its quantum
field, by g.) The corresponding QED process of γ γ → γ γ does not happen at tree-
level, but does occur at one loop. In QCD, because of the three-gluon and four-gluon
vertices, there are four distinct Feynman diagrams that contribute at tree-level:
Let the internal gluon line carry (vector,gauge) indices (κ, e) on the left and (λ, f )
on the right. Labeling the Feynman diagram in detail:
The momenta flowing into the leftmost 3-gluon vertex are, starting from the upper-left
incoming gluon and going clockwise, $(p, -p - p', p')$. Also, the momenta flowing
into the rightmost 3-gluon vertex are, starting from the upper-right final-state gluon
and going clockwise, $(-k, -k', k + k')$. So we can use the Feynman rules of Sect. 8.2
to obtain:
$$\mathcal{M}^{s\text{-channel}}_{gg \to gg} = \epsilon_1^\mu \epsilon_2^\nu \epsilon_3^{\rho *} \epsilon_4^{\sigma *} \left[\frac{-i g^{\kappa\lambda} \delta^{ef}}{(p + p')^2}\right] \left\{-g f^{aeb} \left[g_{\mu\kappa}(2p + p')_\nu + g_{\kappa\nu}(-p - 2p')_\mu + g_{\nu\mu}(p' - p)_\kappa\right]\right\} \left\{-g f^{cdf} \left[g_{\rho\sigma}(k' - k)_\lambda + g_{\sigma\lambda}(-2k' - k)_\rho + g_{\lambda\rho}(2k + k')_\sigma\right]\right\}. \tag{8.139}$$
After writing down the reduced matrix elements for the other three diagrams, adding
them together, taking the complex square, summing over final state polarizations and
averaging over initial state polarizations, summing over final-state gluon colors and
averaging over initial-state gluon colors according to $\frac{1}{8}\sum_a \frac{1}{8}\sum_b \sum_c \sum_d$, one finds:
$$\frac{d\sigma_{gg \to gg}}{dt} = \frac{9\pi \alpha_s^2}{4 s^2}\left[\frac{u^2 + t^2}{s^2} + \frac{s^2 + u^2}{t^2} + \frac{s^2 + t^2}{u^2} + 3\right]. \tag{8.140}$$
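As a small consistency check on (8.140), the sketch below evaluates it numerically and compares it with an equivalent compact form obtained by using $s + t + u = 0$ to write, for example, $(u^2 + t^2)/s^2 = 1 - 2tu/s^2$. It also confirms the expected symmetry under $t \leftrightarrow u$ for identical final-state gluons. The kinematic point and $\alpha_s = 0.12$ are illustrative assumptions.

```python
import math


def dsigma_dt_gg(s, t, alpha_s):
    """Equation (8.140) for gg -> gg, with u fixed by s + t + u = 0."""
    u = -s - t
    bracket = ((u**2 + t**2) / s**2 + (s**2 + u**2) / t**2
               + (s**2 + t**2) / u**2 + 3.0)
    return 9.0 * math.pi * alpha_s**2 / (4.0 * s**2) * bracket


def dsigma_dt_gg_compact(s, t, alpha_s):
    """Equivalent form after eliminating the squares with s + t + u = 0."""
    u = -s - t
    bracket = 3.0 - t * u / s**2 - s * u / t**2 - s * t / u**2
    return 9.0 * math.pi * alpha_s**2 / (2.0 * s**2) * bracket


s, alpha_s = 1.0e4, 0.12   # illustrative: s in GeV^2, alpha_s assumed
t = -0.3 * s
u = -s - t

value = dsigma_dt_gg(s, t, alpha_s)
value_compact = dsigma_dt_gg_compact(s, t, alpha_s)
value_swapped = dsigma_dt_gg(s, u, alpha_s)  # t <-> u exchange
print(value, value_compact, value_swapped)
```

At 90 degrees in the center-of-momentum frame ($t = u = -s/2$) the bracket in (8.140) equals 13.5, which makes $gg \to gg$ considerably larger than the quark-quark result (8.137) at the same kinematic point.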
When we collide a proton with another proton or an antiproton, this process, and the
process ud → ud, are just two of many possible subprocesses that can occur. There
is no way to separate the proton into simpler parts, so one must deal with all of these
possible subprocesses. We will consider the subprocesses of proton-(anti)proton
scattering more systematically in the next section.
8.5 Renormalization
Since the strong interactions involve a coupling g3 that is not small, we should
worry about higher-order corrections to the treatment of quark-quark scattering in
the previous section. Let us discuss this issue in a more general framework than
just QCD. In a general gauge theory, the Feynman diagrams contributing to the
reduced matrix element at one-loop order in fermion+fermion → fermion+fermion
scattering are the following:
In each of these diagrams, there is a loop momentum $\ell^\mu$ that is unfixed by the external
4-momenta, and must be integrated over. Only the first two diagrams give a finite
answer when one naively integrates $d^4\ell$. This is not surprising; we do not really
know what physics is like at very high energy and momentum scales, so we have
no business in integrating over them. Therefore, one must introduce a very high
cutoff mass scale $M$, and replace the loop-momentum integral by one that kills the
contributions to the reduced matrix element from $|\ell^\mu| \geq M$. Physically, $M$ should
be the mass scale at which some as-yet-unknown new physics enters in to alter the
theory. It is generally thought that the highest this cutoff is likely to be is about
$M_{\text{Planck}} = 2.4 \times 10^{18}$ GeV (give or take an order of magnitude), but it could very
easily be much lower.
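As an example of what can happen, consider the next-to-last Feynman diagram
given above. Let us call $q^\mu = p^\mu - k^\mu$ the 4-momentum flowing through either of
the vector-boson propagators. Then the part of the reduced matrix element associated
with the fermion loop is:
$$\sum_f (-1) \int_{|\ell^\mu| \leq M} d^4\ell\; \mathrm{Tr}\left\{ \left[-i\hat{g}\,(T_f^a)_i{}^j\, \gamma^\mu\right] \left[\frac{i({\ell\!\!\!/} + {q\!\!\!/} + \hat{m}_f)}{(\ell + q)^2 - \hat{m}_f^2 + i\epsilon}\right] \left[-i\hat{g}\,(T_f^b)_j{}^i\, \gamma^\nu\right] \left[\frac{i({\ell\!\!\!/} + \hat{m}_f)}{\ell^2 - \hat{m}_f^2 + i\epsilon}\right] \right\}. \tag{8.141}$$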
This involves a sum over all fermions that can propagate in the loop, and a trace over
the spinor indices of the fermion loop. For reasons that will become clear shortly,
we are calling the gauge coupling of the theory $\hat{g}$ and the mass of each fermion
species $\hat{m}_f$. We are being purposefully vague about what $\int_{|\ell^\mu| \leq M} d^4\ell$ means, in part
because there are actually several different ways to cut off the integral at large $M$. (A
straightforward step-function cutoff will work, but is clumsy to carry out and even
clumsier to interpret.)
The $d^4\ell$ factor can be written as an angular part times a radial part $|\ell|^3\, d|\ell|$. Now
there are up to five powers of $|\ell|$ in the numerator (three from the $d^4\ell$, and two from
the propagators), and four powers of $|\ell|$ in the denominator from the propagators. So
naively, one might expect that the result of doing the integral will scale like $M^2$ for
a large cutoff $M$. However, there is a conspiratorial cancellation, so that the large-$M$
behavior is only logarithmic. The result is proportional to:
$$\hat{g}^2\, (q^2 g^{\mu\nu} - q^\mu q^\nu) \sum_f \mathrm{Tr}(T_f^a T_f^b)\, [\ln(M/m) + \cdots] \tag{8.142}$$
where the · · · represents a contribution that does not get large as M gets large. The
m is a characteristic mass scale of the problem; it is something with dimensions of
mass built out of q μ and the m̂ f . It must appear in the formula in the way it does in
order to make the argument of the logarithm dimensionless. The arbitrariness in the
precise definition of m can be absorbed into the “· · · ”.
When one uses (8.142) in the rest of the Feynman diagram, it is clear that the
entire contribution must be proportional to:
$$\mathcal{M}_{\text{fermion loop in gauge propagator}} \propto \hat{g}^4 \sum_f I(R_f)\, \ln(M/m) + \cdots. \tag{8.143}$$
What we are trying to keep track of here is just the number of powers of $\hat{g}$, the
group-theory factor, and the large-$M$ dependence on $\ln(M/m)$.
A similar sort of calculation applies to the last diagram involving a gauge vector
boson loop. Each of the three-vector couplings involves a factor of $f^{abc}$, with two of
the indices contracted because of the propagators. So it must be that the loop part of
the diagram makes a contribution proportional to $f^{acd} f^{bcd} = C(G)\delta^{ab}$. It is again
logarithmically divergent, so that
$$\mathcal{M}_{\text{gauge loop in gauge propagator}} \propto \hat{g}^4\, C(G)\, \ln(M/m) + \cdots. \tag{8.144}$$
Doing everything carefully, one finds that the contribution to the differential cross-section
is given by:
$$d\sigma = d\sigma_{\text{tree}}(\hat{g}) \left\{ 1 + \frac{\hat{g}^2}{4\pi^2} \left[\frac{11}{3} C(G) - \frac{4}{3}\sum_f I(R_f)\right] \ln(M/m) + \cdots \right\}, \tag{8.145}$$
where $d\sigma_{\text{tree}}(\hat{g})$ is the tree-level result (which we have already worked out in the
special case of QCD), considered to be a function of $\hat{g}$. To be specific, it is proportional
to $\hat{g}^4$. Let us ignore all the other diagrams for now; the justification for this will be
revealed soon.
The cutoff $M$ may be quite large. Furthermore, we typically do not know what
it is, or what the specific very-high-energy physics associated with it is. (If we did,
we could just redo the calculation with that physics included, and a higher cutoff.)
Therefore, it is convenient to absorb our ignorance of $M$ into a redefinition of the
coupling. Specifically, inspired by (8.145), one defines a renormalized or running
coupling $g(\mu)$ by writing:
$$\hat{g} = g(\mu) \left\{ 1 - \frac{(g(\mu))^2}{16\pi^2} \left[\frac{11}{3} C(G) - \frac{4}{3}\sum_f I(R_f)\right] \ln(M/\mu) \right\}. \tag{8.146}$$
Here μ is a new mass scale, called the renormalization scale, that we get to pick. (It
is not uncommon to see the renormalization scale denoted by Q instead of μ.) The
original coupling ĝ is called the bare coupling. One can invert this relation to write
the renormalized coupling in terms of the bare coupling:
$$
g(\mu) = \hat{g} \left\{ 1 + \frac{\hat{g}^2}{16\pi^2} \left[ \frac{11}{3} C(G) - \frac{4}{3} \sum_f I(R_f) \right] \ln(M/\mu) + \cdots \right\}, \tag{8.147}
$$
where we are treating $g(\mu)$ as an expansion in $\hat{g}$, dropping terms of order $\hat{g}^5$ everywhere.
The reason for this strategic definition is that, since we know that dσtree (ĝ) is
proportional to ĝ 4 , we can now write, using (8.145) and (8.146):
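$$
\begin{aligned}
d\sigma &= d\sigma_{\rm tree}(g)\, (\hat{g}/g)^4 \left\{ 1 + \frac{\hat{g}^2}{4\pi^2} \left[ \frac{11}{3} C(G) - \frac{4}{3} \sum_f I(R_f) \right] \ln(M/m) + \cdots \right\} \\
&= d\sigma_{\rm tree}(g) \left\{ 1 + \frac{g^2}{4\pi^2} \left[ \frac{11}{3} C(G) - \frac{4}{3} \sum_f I(R_f) \right] \ln(\mu/m) + \cdots \right\}.
\end{aligned} \tag{8.148}
$$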
Here we are again dropping terms that go like $g^4$; these are comparable to 2-loop
contributions that we are neglecting anyway. The factor $d\sigma_{\rm tree}(g)$ is the tree-level
differential cross-section, but with $g(\mu)$ in place of $\hat{g}$. This formula looks very much like
(8.145), but with the crucial difference that the unknown cutoff M has disappeared,
and is replaced by the scale μ that we know, because we get to pick it.
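The cancellation of $M$ between (8.145) and (8.146) can be checked in a few lines. The sketch below uses $B$ as a shorthand for the group-theory bracket; this symbol is introduced here only for brevity and is not part of the original notation.

```latex
\begin{align*}
B &\equiv \frac{11}{3} C(G) - \frac{4}{3} \sum_f I(R_f), \\
(\hat{g}/g)^4 &= \left[ 1 - \frac{g^2}{16\pi^2} B \ln(M/\mu) \right]^4
  = 1 - \frac{g^2}{4\pi^2} B \ln(M/\mu) + \mathcal{O}(g^4), \\
d\sigma &= d\sigma_{\rm tree}(g) \left\{ 1 + \frac{g^2}{4\pi^2} B \left[ \ln(M/m) - \ln(M/\mu) \right] + \cdots \right\}
  = d\sigma_{\rm tree}(g) \left\{ 1 + \frac{g^2}{4\pi^2} B \ln(\mu/m) + \cdots \right\}.
\end{align*}
```

In the one-loop term, $\hat{g}^2$ has been replaced by $g^2$, since the difference is of higher order in the coupling.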
What should we pick μ to be? In principle we could pick it to be the cutoff M,
except that we do not know what that is. Besides, the logarithm could then be very
large, and perturbation theory would converge very slowly or not at all. For example,
suppose that M = MPlanck , and the characteristic energy scale of the experiment
we are doing is, say, m = 0.511 MeV or m = 1000 GeV. These choices might be
$$
\mu \approx m. \tag{8.150}
$$
Then, to a first approximation, one can calculate in the tree-level approximation
with a renormalized coupling g(μ), knowing that the one-loop correction from these
diagrams is small. The choice of renormalization scale (8.150) allows us to write:
Of course, this is only good enough to get rid of the large logarithmic one-loop
corrections. If you really want all one-loop corrections, there is no way around
calculating all the one-loop diagrams, keeping all the pieces, not just the ones that
get large as M → ∞.
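To get a feeling for how large the logarithm would be if the renormalization scale were left at a Planck-scale cutoff, the following minimal sketch evaluates ln(M/m) for the two example values of m mentioned above. The numerical value used for the Planck mass is an assumption of this sketch (it is not quoted in the text), and the helper name `large_log` is hypothetical.

```python
import math

# Assumed value of the Planck mass in GeV (not given in the text).
M_PLANCK_GEV = 1.22e19

def large_log(cutoff_gev, scale_gev):
    """Return ln(M/m) for a cutoff M and a characteristic scale m, both in GeV."""
    return math.log(cutoff_gev / scale_gev)

for m_gev in (0.511e-3, 1000.0):
    at_cutoff = large_log(M_PLANCK_GEV, m_gev)  # logarithm if mu were set to M
    at_scale = large_log(m_gev, m_gev)          # logarithm if mu is set to m
    print(f"m = {m_gev:g} GeV: ln(M/m) = {at_cutoff:.1f}, ln(mu/m) at mu = m: {at_scale:.1f}")
```

With μ ≈ m the logarithm vanishes, which is the point of the choice (8.150).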
What about the remaining diagrams? If we isolate the M → ∞ behavior, they
fall into three classes. First, there are diagrams that are not divergent at all (the first
two diagrams). Second, there are diagrams (the third through sixth diagrams) that
are individually divergent like ln(M/m), but sum up to a total that is not divergent.
Finally, the seventh through tenth diagrams have a logarithmic divergence, but it can
be absorbed into a similar redefinition of the mass. A clue to this is that they all
involve sub-diagrams:
The one-loop renormalized or running mass $m_f(\mu)$ is defined in terms of the bare
mass $\hat{m}_f$ by
$$
\hat{m}_f = m_f(\mu) \left( 1 - \frac{g^2}{2\pi^2} C(R_f) \ln(M/\mu) \right), \tag{8.152}
$$
or
$$
m_f(\mu) = \hat{m}_f \left( 1 + \frac{g^2}{2\pi^2} C(R_f) \ln(M/\mu) + \cdots \right), \tag{8.153}
$$
where C(R f ) is the quadratic Casimir invariant of the representation carried by the
fermion f . It is an amazing fact that the two redefinitions (8.146) and (8.152) are
enough to remove the cutoff dependence of all cross-sections in the theory up to and
including one-loop order. In other words, one can calculate dσ for any process, and
express it in terms of the renormalized mass m(μ) and the renormalized coupling
g(μ), with no M-dependence. This is what it means for a theory to be renormalizable
at one loop order.
In Yang-Mills theories, one can show that by doing some redefinitions of the
form:
$$
\hat{g} = g(\mu) \left[ 1 + \sum_{n=1}^{L} b_n\, g^{2n}\, p_n(\ln(M/\mu)) \right], \tag{8.154}
$$
$$
\hat{m}_f = m_f(\mu) \left[ 1 + \sum_{n=1}^{L} c_n\, g^{2n}\, q_n(\ln(M/\mu)) \right], \tag{8.155}
$$
one can simultaneously eliminate all dependence on the cutoff in any process up to
L-loop order. Here pn (x) and qn (x) are polynomials of degree n, and bn , cn are some
constants that depend on group theory invariants like the Casimir invariants of the
group and the representations, and the index. At any finite loop order, what is left
in the expression for any cross-section after writing it in terms of the renormalized
mass m(μ) and renormalized coupling g(μ) is a polynomial in ln(μ/m); these are
to be made small by choosing μ ≈ m (see footnote 2). This is what it means for a theory to be
renormalizable at all loop orders. Typically, the specifics of these redefinitions are
only known at 2-, 3-, or occasionally 4-loop order, except in some special theories.
If a theory is non-renormalizable, it does not necessarily mean that the theory is
useless; we saw that the four-fermion theory of the weak interactions makes reliable
predictions, and we still have no more predictive theory for gravity than Einstein’s
relativity. It does mean that we expect the theory to have trouble making predictions
about processes at high energy scales.
We have seen that we can eliminate the dependence on the unknown cutoff of a
theory by defining a renormalized running coupling g(μ) and mass m f (μ). When
one does an experiment in high energy physics, the results are first expressed in
terms of observable quantities like cross-sections, decay rates, and physical masses of
particles. Using this data, one extracts the value of the running couplings and running
masses at some appropriately-chosen renormalization scale μ, using a theoretical
prediction like (8.148), but with the non-logarithmic corrections included too. (The
running mass is not quite the same thing as the physical mass. The physical mass can
be determined from the experiment by kinematics, the running mass is related to it by
various corrections.) The running parameters can then be used to make predictions
for other experiments. This tests both the theoretical framework, and the specific
values of the running parameters.
The bare coupling and the bare mass never enter into this process of comparing
theory to experiment. If we measure dσ in an experiment, we see from (8.145) that
in order to determine the bare coupling ĝ from the data, we would also need to know
the cutoff M. However, we do not know what M is. We could guess at it, but this
would usually be a wild guess, devoid of practical significance.
2 Of course, there might be more than one characteristic energy scale in a given problem, rather
than a single m. If so, and if they are very different from each other, then one may be stuck with
some large logarithms, no matter what μ is chosen. This has to be dealt with by fancier methods.
A situation that arises quite often is that one extracts running parameters from
an experiment with a characteristic energy scale μ0 , and one wants to compare
with data from some other experiment that has a completely different characteristic
energy scale μ. Here μ0 and μ each might be the mass of some particle that is
decaying, or the momentum exchanged between particles in a collision, or some
suitable average of particle masses and exchanged momenta. It would be unwise to
use the same renormalization scale when computing the theoretical expectations for
both experiments, because the loop corrections involved in at least one of the two
cases will be unnecessarily large. What we need is a way of taking a running coupling
as determined in the first experiment at a renormalization scale μ0 , and getting from
it the running coupling at any other scale μ. The change of the choice of scale μ is
known as the renormalization group (see footnote 3).
As an example, let us consider how g(μ) changes in a Yang-Mills gauge theory.
Since the differential cross-section dσ for fermion+fermion →fermion+fermion
is an observable, in principle it should not depend on the choice of μ, which is an
arbitrary one made by us. Therefore, we can require that (8.148) is independent of
μ. Remembering that dσtree ∝ g 4 , we find:
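$$
0 = \frac{d}{d\mu}(d\sigma) = (d\sigma_{\rm tree}) \left\{ \frac{4}{g} \frac{dg}{d\mu} + \frac{g^2}{4\pi^2} \left[ \frac{11}{3} C(G) - \frac{4}{3} \sum_f I(R_f) \right] \frac{1}{\mu} + \cdots \right\}, \tag{8.156}
$$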
where we are dropping all higher-loop-order terms that are proportional to (dσtree )g 4 .
The first term in (8.156) comes from the derivative acting on the g 4 inside dσtree .
The second term comes from the derivative acting on the lnμ one-loop correction
term. The contribution from the derivative acting on the g 2 in the one-loop correction
term can be self-consistently judged, from the equation we are about to write down,
as proportional to (dσtree )g 5 , so it is neglected as a higher-loop-order effect in the
expansion in g 2 . So, it must be true that:
$$
\mu \frac{dg}{d\mu} = \frac{g^3}{16\pi^2} \left[ -\frac{11}{3} C(G) + \frac{4}{3} \sum_f I(R_f) \right] + \cdots . \tag{8.157}
$$
3 The use of the word “group” is historical; this is not a group in the mathematical sense defined
earlier.
then test the whole framework. The right-hand side of the RG equation is known as
the beta function for the running coupling g(μ), and is written β(g), so that:
$$
\mu \frac{dg}{d\mu} = \beta(g). \tag{8.158}
$$
In a Yang-Mills gauge theory, including the effects of Feynman diagrams with more
loops,
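$$
\beta(g) = \frac{g^3}{16\pi^2} b_0 + \frac{g^5}{(16\pi^2)^2} b_1 + \frac{g^7}{(16\pi^2)^3} b_2 + \cdots \tag{8.159}
$$
where we already know that
$$
b_0 = -\frac{11}{3} C(G) + \frac{4}{3} \sum_f I(R_f), \tag{8.160}
$$
$$
b_1 = -\frac{34}{3} C(G)^2 + \frac{20}{3} C(G) \sum_f I(R_f) + 4 \sum_f C(R_f) I(R_f), \tag{8.161}
$$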
$$
\frac{d g^2}{d\ln\mu} = \frac{b_0}{8\pi^2}\, g^4, \tag{8.162}
$$
you can check that
$$
g^2(\mu) = \frac{g^2(\mu_0)}{1 - \dfrac{b_0\, g^2(\mu_0)}{8\pi^2} \ln(\mu/\mu_0)}. \tag{8.163}
$$
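The check suggested above can also be done numerically. The following is a minimal sketch, not a definitive implementation: it evaluates the closed form (8.163) and compares a finite-difference derivative with the right-hand side of (8.162). The function names and the input values (b0 = -7, g²(μ0) = 1.5) are illustrative assumptions.

```python
import math

def g2_one_loop(mu, mu0, g2_mu0, b0):
    """One-loop solution (8.163) for g^2(mu), given g^2(mu0) and b0."""
    return g2_mu0 / (1.0 - b0 * g2_mu0 / (8.0 * math.pi**2) * math.log(mu / mu0))

def rg_residual(mu, mu0, g2_mu0, b0, rel_step=1e-5):
    """Centered finite-difference d g^2 / d ln(mu) minus (b0 / 8 pi^2) g^4."""
    up = g2_one_loop(mu * math.exp(rel_step), mu0, g2_mu0, b0)
    down = g2_one_loop(mu * math.exp(-rel_step), mu0, g2_mu0, b0)
    lhs = (up - down) / (2.0 * rel_step)
    rhs = b0 / (8.0 * math.pi**2) * g2_one_loop(mu, mu0, g2_mu0, b0) ** 2
    return lhs - rhs

# Illustrative inputs only (arbitrary units for mu).
print(rg_residual(mu=1000.0, mu0=100.0, g2_mu0=1.5, b0=-7.0))
```

The printed residual should be zero up to finite-difference and rounding error.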
To see how this works in QCD, let us examine the one-loop beta function. In
SU (3), C(G) = 3, and each quark flavor is in a fundamental 3 representation with
I (3) = 1/2. Therefore,
$$
b_{0,\rm QCD} = -11 + \frac{2}{3} n_f \tag{8.164}
$$
where $n_f$ is the number of “active” quarks in the effective theory, usually those
with mass $\lesssim \mu$. The crucial fact is that since there are only 6 quark flavors known,
b0,QCD is definitely negative for all accessible scales μ, and so the beta function
is definitely negative. For an effective theory with n f = (3, 4, 5, 6) quark flavors,
b0 = (−9, −25/3, −23/3, −7). Writing the solution to the RG equation, (8.163), in
terms of the running αs , we have:
$$
\alpha_s(\mu) = \frac{\alpha_s(\mu_0)}{1 - \dfrac{b_0\, \alpha_s(\mu_0)}{2\pi} \ln(\mu/\mu_0)}. \tag{8.165}
$$
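As a minimal sketch of how (8.164) and (8.165) are used together, the code below tabulates b0 for n_f = 3 to 6 with exact fractions and runs α_s at one loop. The starting value α_s(m_t) = 0.1082 at m_t = 173 GeV is the one quoted later in this chapter; holding n_f fixed at 5 and ignoring flavor thresholds are simplifying assumptions of this sketch, and the function names are hypothetical.

```python
from fractions import Fraction
import math

def b0_qcd(n_f):
    """One-loop coefficient (8.164): b0 = -11 + (2/3) n_f."""
    return Fraction(-11) + Fraction(2, 3) * n_f

def alpha_s_one_loop(mu, mu0, alpha_mu0, n_f):
    """One-loop running (8.165) with a fixed number n_f of active flavors."""
    b0 = float(b0_qcd(n_f))
    return alpha_mu0 / (1.0 - b0 * alpha_mu0 / (2.0 * math.pi) * math.log(mu / mu0))

# b0 for n_f = 3, 4, 5, 6, to compare with the values listed in the text.
print([str(b0_qcd(n)) for n in (3, 4, 5, 6)])

# Run down from mu0 = 173 GeV to mu = 10 GeV (scales in GeV).
print(alpha_s_one_loop(mu=10.0, mu0=173.0, alpha_mu0=0.1082, n_f=5))
```

Because b0 is negative, the coupling grows as μ decreases.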
$$
\alpha_s(\mu) = \frac{2\pi}{b_0 \ln(\Lambda_{\rm QCD}/\mu)}. \tag{8.167}
$$
This shows that at the scale $\mu = \Lambda_{\rm QCD}$, the QCD gauge coupling is predicted to
blow up, according to the 1-loop RG equation. A qualitative graph of the running of
αs (μ) as a function of renormalization scale μ is shown below:
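[Figure: qualitative running of $\alpha_S(Q)$ as a function of the renormalization scale $Q$, with $\Lambda_{\rm QCD}$ marked on the $Q$ axis.]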
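Equation (8.167) can be inverted to estimate the scale at which the one-loop coupling blows up. The sketch below is only illustrative: it uses α_s(m_t) = 0.1082 at m_t = 173 GeV (the value quoted later in this chapter) together with the n_f = 5 value of b0, and it ignores flavor thresholds and higher-loop effects, so the number it prints should not be read as a realistic determination. The function names are hypothetical.

```python
import math

def lambda_qcd_one_loop(mu, alpha_s_mu, b0):
    """Invert (8.167), alpha_s(mu) = 2 pi / (b0 ln(Lambda/mu)), for Lambda."""
    return mu * math.exp(2.0 * math.pi / (b0 * alpha_s_mu))

def alpha_s_from_lambda(mu, lam, b0):
    """One-loop alpha_s(mu) from (8.167)."""
    return 2.0 * math.pi / (b0 * math.log(lam / mu))

B0_NF5 = -23.0 / 3.0  # b0 for five active flavors, from (8.164)
lam = lambda_qcd_one_loop(mu=173.0, alpha_s_mu=0.1082, b0=B0_NF5)
print(f"one-loop Lambda_QCD estimate: {lam:.3f} GeV")
print(alpha_s_from_lambda(173.0, lam, B0_NF5))  # recovers the input coupling
```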
Of course, once αs (μ) starts to get big, we should no longer trust the one-loop
approximation, since two-loop effects are definitely big. The whole analysis has been
extended to four-loop order, with significant numerical changes, but the qualitative
effect remains: at any finite loop order, there is some scale $\Lambda_{\rm QCD}$ at which the gauge
coupling is predicted to blow up in a theory with a negative beta function. This is not
a sign that QCD is wrong. Instead, it is a sign that perturbation theory is not going to
4 In this scheme, one cuts off loop momentum integrals by a process known as dimensional regular-
ization, which continuously varies the number of spacetime dimensions infinitesimally away from
4, rather than putting in a particular cutoff M. Although bizarre physically, this scheme is consistent
with gauge invariance and relatively easy to calculate in.
lattice QCD calculations of the mass splittings in the ϒ bottomonium system and
from the hadronic branching ratio in τ decays, but other inputs to the average come
from production of jets and $t\bar{t}$ pairs at hadron colliders, deep inelastic scattering at
the HERA proton-electron collider, and jet production data in $e^+e^-$ collisions. The
four-loop renormalization group running of $\alpha_S(\mu)$ with inputs from various widely
different μ is then used to determine the reference value $\alpha_S(m_Z)$.
We can contrast this situation with the case of QED. For a U (1) group, there is
no non-zero structure constant, so C(G) = 0. Also, since the generator of the group
in a representation of charge Q f is just the 1 × 1 matrix Q f , the index for a fermion
with charge $Q_f$ is $I(R_f) = Q_f^2$. Therefore,
$$
b_{0,\rm QED} = \frac{4}{3} \left[ 3 n_u (2/3)^2 + 3 n_d (-1/3)^2 + n_\ell (-1)^2 \right] = \frac{16}{9} n_u + \frac{4}{9} n_d + \frac{4}{3} n_\ell, \tag{8.168}
$$
where $n_u$ is the number of up-type quark flavors (u, c, t), and $n_d$ is the number of
down-type quark flavors (d, s, b), and $n_\ell$ is the number of charged leptons (e, μ, τ)
included in the chosen effective theory. If we do experiments with a characteristic
energy scale $m_e \lesssim \mu \lesssim m_\mu$, then only the electron itself contributes, and $b_{0,\rm EM} = 4/3$, so:
$$
\frac{de}{d\ln\mu} = \beta_e = \frac{e^3}{16\pi^2}\, \frac{4}{3} \qquad (m_e \lesssim \mu \lesssim m_\mu). \tag{8.169}
$$
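The bookkeeping in (8.168) is easy to get wrong by a color factor, so here is a minimal sketch that evaluates it with exact fractions. The function name `b0_qed` is hypothetical; the formula is transcribed from (8.168), with the factor of 3 for each quark flavor counting colors.

```python
from fractions import Fraction

def b0_qed(n_u, n_d, n_l):
    """(8.168): b0 = (4/3) [3 n_u (2/3)^2 + 3 n_d (-1/3)^2 + n_l (-1)^2]."""
    charges_sq = (
        3 * n_u * Fraction(2, 3) ** 2     # up-type quarks, 3 colors each
        + 3 * n_d * Fraction(-1, 3) ** 2  # down-type quarks, 3 colors each
        + n_l * Fraction(-1) ** 2         # charged leptons
    )
    return Fraction(4, 3) * charges_sq

print(b0_qed(0, 0, 1))  # electron only, as in (8.169)
print(b0_qed(3, 3, 3))  # all known charged fermions active
```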
This corresponds to a very slow running. (Notice that the smaller a gauge coupling
is, the slower it will run.) If we do experiments at characteristic energies that are
much less than the electron mass, then the relativistic electron is not included in the
effective theory (virtual electron-positron pairs are less and less important at low
energies), so b0,EM = 0, and the electron charge does not run at all:
$$
\frac{de}{d\ln\mu} = 0 \qquad (\mu \ll m_e). \tag{8.170}
$$
This means that QED is not quite “infrared free”, since the effective electromagnetic
coupling is perturbative, but does not get arbitrarily small, at very large distance
scales. At extremely high energies, the coupling e could in principle become very
large, because the QED beta function is always positive. Fortunately, this is predicted
to occur only at energy scales far beyond what we can probe, because e runs very
slowly. Furthermore, QED is embedded in a larger, more complete theory anyway at
energy scales in the hundreds of GeV range, so the apparent blowing up of $\alpha = e^2/4\pi$
much farther in the ultraviolet is just an illusion.
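To illustrate how far away the apparent blow-up is, the sketch below applies the one-loop solution (the same form as (8.165), with α = e²/4π) to QED with only the electron active, b0 = 4/3. The coupling diverges when (b0 α/2π) ln(μ/μ0) = 1. Two loud assumptions: the low-energy value α ≈ 1/137.036 is supplied here and is not quoted in the text, and extrapolating electron-only one-loop running over so many decades is not physically meaningful; the point is only the size of the number.

```python
import math

# Assumed low-energy fine-structure constant (not quoted in the text).
ALPHA_EM = 1.0 / 137.036

def log_of_blowup_ratio(alpha_mu0, b0):
    """ln(mu_pole / mu0) at which the one-loop coupling diverges, for b0 > 0."""
    return 2.0 * math.pi / (b0 * alpha_mu0)

ln_ratio = log_of_blowup_ratio(ALPHA_EM, 4.0 / 3.0)
decades = ln_ratio / math.log(10.0)
print(f"ln(mu_pole/mu0) = {ln_ratio:.1f}, i.e. mu_pole/mu0 is about 10^{decades:.0f}")
```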
8.6 Parton Distribution Functions and Hadron-Hadron Scattering
In general, a hadron is a QCD bound state of quarks, anti-quarks, and gluons. The
characteristic size of a hadron, like the proton or antiproton, is always roughly
$1/\Lambda_{\rm QCD} \approx 10^{-13}$ cm, since this is the scale at which the strongly-interacting
particles are confined. In general, the point-like quark, antiquark, and gluon parts of a
hadron are called partons, and the description of hadrons in terms of them is called
the parton model.
Suppose we scatter a hadron off of another particle (which might be another hadron
or a lepton or photon) with a total momentum exchange much larger than $\Lambda_{\rm QCD}$. The
scattering can be thought of as factored into a “hard scattering” of one of the
point-like partons, with the remaining partons as spectators, and “soft” QCD processes that
involve exchanges and radiation of low energy virtual gluons. The hard scattering
sub-process takes place on a time scale much shorter than $1/\Lambda_{\rm QCD}$ expressed in seconds.
Because of asymptotic freedom, at higher scattering energies it becomes a better and
better approximation to think of the partons as individual entities that move collectively
before and after the scattering, but are free particles at the moment of scattering. As a
first approximation, we can consider only the hard scattering processes, and later worry
about adding on the various soft processes as part of the higher-order corrections. This
way of thinking about things allows us to compute cross-sections for hadron scattering
by first calculating the partonic cross-sections leading to a desired final state, and then
combining them with information about the multiplicity and momentum distributions
of partons within the hadronic bound states.
For example, suppose we want to calculate the scattering of a proton and antiproton.
This involves the following 2→2 partonic subprocesses:
$$
qq \to qq, \quad qq' \to qq', \quad \bar{q}\bar{q} \to \bar{q}\bar{q}, \quad \bar{q}\bar{q}' \to \bar{q}\bar{q}', \tag{8.171}
$$
$$
q\bar{q} \to gg, \quad q\bar{q} \to q\bar{q}, \quad q\bar{q} \to q'\bar{q}', \quad q\bar{q}' \to q\bar{q}', \tag{8.172}
$$
$$
qg \to qg, \quad \bar{q}g \to \bar{q}g, \quad gg \to gg, \quad gg \to q\bar{q}, \tag{8.173}
$$
where $g$ is a gluon, $q$ is any fixed quark flavor, and $q'$ is any quark flavor that is
definitely different from $q$. Let the center-of-mass energy of the proton and antiproton
be called $\sqrt{s}$. Each parton carries only a fraction of the energy of
the proton it belongs to, so the partonic center-of-mass energy, call it $\sqrt{\hat{s}}$, will be
significantly less than $\sqrt{s}$. One often uses hatted Mandelstam variables $\hat{s}$, $\hat{t}$ and
$\hat{u}$ for the partonic scattering event. If $\sqrt{\hat{s}} \gg \Lambda_{\rm QCD}$, then the two partons in the
final state will be sufficiently energetic that they can usually escape from most or
all of the spectator quarks and gluons before hadronizing (forming bound states).
However, before traveling a distance $1/\Lambda_{\rm QCD}$, they must rearrange themselves into
color-singlet combinations, possibly by creating quark-antiquark or gluon-antigluon
pairs out of the vacuum. This hadronization process can be quite complicated, but
will usually result in a jet of hadronic particles moving with roughly the same
4-momentum as the parton that was produced. So, all of the partonic process
cross-sections in (8.171)-(8.173) contribute to the observable cross-section for the process:
$$
p\bar{p} \to jj + X, \tag{8.174}
$$
where j stands for a jet. The X stands for “anything”, and includes stray hadronic junk
left over from the original proton and antiproton. Similarly, partonic hard scatterings
like:
$$
p\bar{p} \to jjj + X. \tag{8.176}
$$
The 2 → 2 hard scattering processes can also contribute to this process if one of the
final state partons hadronizes by splitting into two jets, or if there is an additional jet
from the initial state.
In order to use the calculation of cross-sections for partonic processes like (8.173)
to obtain measurable cross-sections, we need to know how likely it is to have a given
parton inside the initial-state hadron with a given 4-momentum. Since we are mostly
interested in high-energy scattering problems, we can make things simple, and treat
the hadron and all of its constituents as nearly massless. (For the proton, this means
that we are assuming that $\sqrt{s} \gg m_p \approx 1$ GeV.) Suppose we therefore take the total
4-momentum of the hadron h in an appropriate Lorentz frame to be:
$$
p_h^\mu = (E, 0, 0, E). \tag{8.177}
$$
This is sometimes called the “infinite momentum frame”, even though E is finite,
since $E \gg m_p$. Consider a parton constituent A (a quark, antiquark, or gluon) that
carries a fraction x of the hadron’s momentum:
$$
p_A^\mu = x(E, 0, 0, E). \tag{8.178}
$$
The function $f_A^h(x)$ is called the parton distribution function or PDF for the parton
A in the hadron h. The parton can be either one of the two or three valence quarks or
antiquarks that are the nominal constituents of the hadron, or one of an indeterminate
number of virtual sea quarks and gluons. Either type of parton can participate in a
scattering event.
Hadronic collisions studied in laboratories usually involve protons or antiprotons,
so the PDFs of the proton and antiproton are especially interesting. The proton is
nominally a bound state of three valence quarks, namely two up quarks and one down
quark, so we are certainly interested in the up-quark and down-quark distribution
functions
$$
f_u^p(x) \quad \text{and} \quad f_d^p(x).
$$
The proton also contains virtual gluons, implying a gluon distribution function:
$$
f_g^p(x). \tag{8.180}
$$
Furthermore, there are always virtual quark-antiquark pairs within the proton. This
adds additional contributions to $f_u^p(x)$ and $f_d^p(x)$, and also means that there is a
non-zero probability of finding antiup, antidown, or strange or antistrange quarks:
$$
f_{\bar{u}}^p(x), \quad f_{\bar{d}}^p(x), \quad f_s^p(x), \quad f_{\bar{s}}^p(x). \tag{8.181}
$$
These parton distribution functions are implicitly summed over color and spin. So
$f_u^p(x)$ tells us the probability of finding an up quark with the given momentum
fraction x and any color and spin. Since the gluon is its own antiparticle (it lives in
the adjoint representation of the gauge group, which is always a real representation),
there is not a separate $f_{\bar{g}}^p(x)$.
Although the charm, bottom and top quarks are heavier than the proton, virtual
charm-anticharm, bottom-antibottom, and top-antitop pairs can exist as long as their
total energy does not exceed m p . This can happen because, as virtual particles, they
need not be on-shell. So, one can even talk about the parton distribution functions
$f_c^p(x)$, $f_{\bar{c}}^p(x)$, $f_b^p(x)$, $f_{\bar{b}}^p(x)$, $f_t^p(x)$, $f_{\bar{t}}^p(x)$. Fortunately, these are small so one can
often neglect them, although they can be important for processes involving charm or
bottom quarks in the final state.
Given the PDFs for the proton, the PDFs for the antiproton follow immediately
from the fact that it is the proton’s antiparticle. The probability of finding a given
parton in the proton with a given x is the same as the probability of finding the
corresponding antiparton in the antiproton with the same x. Therefore, if we know
the PDFs for the proton, there is no new information in the PDFs for the antiproton.
We can just describe everything having to do with proton and antiproton collisions
in terms of the proton PDFs. To simplify the notation, it is traditional to write the
proton and antiproton PDFs as:
\begin{align}
g(x) &= f_g^p(x) = f_g^{\bar{p}}(x), \tag{8.182} \\
u(x) &= f_u^p(x) = f_{\bar{u}}^{\bar{p}}(x), \tag{8.183} \\
d(x) &= f_d^p(x) = f_{\bar{d}}^{\bar{p}}(x), \tag{8.184} \\
\bar{u}(x) &= f_{\bar{u}}^p(x) = f_u^{\bar{p}}(x), \tag{8.185} \\
\bar{d}(x) &= f_{\bar{d}}^p(x) = f_d^{\bar{p}}(x), \tag{8.186} \\
s(x) &= f_s^p(x) = f_{\bar{s}}^{\bar{p}}(x), \tag{8.187} \\
\bar{s}(x) &= f_{\bar{s}}^p(x) = f_s^{\bar{p}}(x). \tag{8.188}
\end{align}
a calculation, one often varies the renormalization and factorization scales (either
together or independently) over a range (say, from Q = m/4 to Q = 2m in the
case just mentioned) to see how the cross-section or other observables that resulted
from the calculation vary. This is a test of the accuracy of the perturbation theory
calculation, since in principle if one could calculate exactly rather than to some low
order in perturbation theory, the results should not depend on either scale choice at
all.
In the proton, antiquarks are always virtual, and so must be accompanied by a
quark with the same flavor. This implies that if we add up all the up quarks found in
the proton, and subtract all the anti-ups, we must find a total of 2 quarks:
$$
\int_0^1 dx \left[ u(x, Q^2) - \bar{u}(x, Q^2) \right] = 2. \tag{8.189}
$$
Similarly, summing over all x the probability of finding a down quark with a given
x, and subtracting the same thing for anti-downs, one has:
$$
\int_0^1 dx \left[ d(x, Q^2) - \bar{d}(x, Q^2) \right] = 1. \tag{8.190}
$$
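The number sum rules (8.189) and (8.190) fix the overall normalization of any model for the quark-minus-antiquark distributions. As a minimal sketch, the code below takes a toy shape N x^a (1-x)^b for u - ū and d - d̄, normalizes it analytically with the Euler beta function, and confirms the sum rules by numerical integration. The functional form and the exponents are purely hypothetical choices for illustration and are not taken from any fitted PDF set.

```python
import math

def beta_fn(p, q):
    """Euler beta function B(p, q) = Gamma(p) Gamma(q) / Gamma(p + q)."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

def make_valence_pdf(n_valence, a, b):
    """Toy distribution N x^a (1-x)^b, normalized so its integral over x is n_valence."""
    norm = n_valence / beta_fn(a + 1.0, b + 1.0)
    return lambda x: norm * x**a * (1.0 - x) ** b

def integrate(func, n=20000):
    """Midpoint-rule integral of func over 0 < x < 1."""
    h = 1.0 / n
    return h * sum(func((i + 0.5) * h) for i in range(n))

# Hypothetical shape parameters, chosen only for illustration.
u_minus_ubar = make_valence_pdf(2, a=0.5, b=3.0)
d_minus_dbar = make_valence_pdf(1, a=0.5, b=4.0)
print(integrate(u_minus_ubar), integrate(d_minus_dbar))
```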
Most of the strange quarks in the proton come from the process of a virtual gluon
splitting into a strange and anti-strange pair. Since the virtual gluon treats quarks and
antiquarks on an equal footing, for every strange quark with a given x, there should
be an equal probability of finding an antistrange with the same x (see footnote 5):
The up-quark PDF can be thought of as divided into a contribution $u_v(x)$ from the
two valence quarks, and a contribution $u_s(x)$ from the sea (non-valence) quarks that
are accompanied by an anti-up. (Here and below the factorization scale dependence
is left implicit, for brevity.) So we have:
There is also a constraint that the total 4-momentum of all partons found in the proton
must be equal to the 4-momentum of the proton that they form. This rule takes the
form:
$$
\int_0^1 dx\, x \left[ g(x) + u(x) + \bar{u}(x) + d(x) + \bar{d}(x) + s(x) + \bar{s}(x) + \cdots \right] = 1, \tag{8.194}
$$
$$
\sum_A \int_0^1 dx\, x f_A^h(x) = 1. \tag{8.195}
$$
5 Although QCD interactions do not change quark flavors, there is a small strangeness violation in
the weak interactions, so the equal-probability rule for strange and antistrange quarks is not quite exact.
Each term $x f_A^h(x)$ represents the probability that a parton is found with a given
momentum fraction x, multiplied by that momentum fraction. One of the first
compelling pieces of evidence that the gluons are actual particles carrying real momentum
and energy, and not just abstract group-theoretic constructs, was that if one excludes
them from the sum rule (8.194), only about half of the proton’s 4-momentum is
accounted for:
$$
\int_0^1 dx\, x \left[ u(x) + \bar{u}(x) + d(x) + \bar{d}(x) + s(x) + \bar{s}(x) + \cdots \right] \approx 0.5. \tag{8.196}
$$
If we could solve the bound state problem for the proton in QCD, like one can
solve the hydrogen atom in quantum mechanics, then we could derive the PDFs
directly from the Hamiltonian. However, we saw in Sect. 8.5 why this is not
practical; perturbation theory in QCD is not accurate for studying low-energy problems
like bound-state problems, because the gauge coupling becomes very large at low
energies. Instead, the proton PDFs are measured by experiments including those
in which charged leptons and neutrinos probe the proton, like $\ell^- p \to \ell^- + X$ and
$\nu p \to \ell^- + X$. Several collaborations perform fits of available data to determine the
PDFs, and periodically publish updated results both in print and as computer code.
In each case, the PDFs are given in the form of computer codes obtained by fitting
to experimental data. Because of different techniques and weighting of the data, the
PDFs from different groups are always somewhat different.
As an example, let us consider the CTEQ collaboration’s CTEQ5L PDF set.
Here, the “5” says which update of the PDFs is being provided, and the “L” stands
for “lowest order”, which means it is the set appropriate when one only has the
lowest-order calculation of the partonic cross-section. This set is somewhat old; we
use it only because it is relatively easy to evaluate, since it is given in parameterized
function form rather than interpolation table form. There are more recent sets from
CTEQ and other collaborations. Each of these has a version appropriate for lowest-
order work, and other versions appropriate when one has the next-to-leading order
(NLO) or next-next-leading order (NNLO) formulas for the hard scattering process
of interest. At Q = 10 GeV, the CTEQ5L PDFs for $u(x)$, $\bar{u}(x)$, and the valence
contribution $u_v(x) \equiv u(x) - \bar{u}(x)$ are shown below, together with a similar graph
for the down quark and antiquark distributions:
[Figure: CTEQ5L PDFs at Q = 10 GeV, plotted as $x f(x)$ versus $x$. Left panel: $u(x)$, $\bar{u}(x) = u_s(x)$, and $u_v(x) = u(x) - \bar{u}(x)$. Right panel: $d(x)$, $\bar{d}(x) = d_s(x)$, and $d_v(x) = d(x) - \bar{d}(x)$.]
Here we follow tradition by graphing x times the PDF in each case, since they all
tend to get large near x = 0.
We see from the first graph that the valence up-quark distribution is peaked below
x = 0.2, with a long tail for larger x (where an up-quark is found to have a larger
fraction of the proton’s energy). There is even a significant chance of finding that
an up quark has more than half of the proton’s 4-momentum. In contrast, the sea
quark distribution $\bar{u}(x)$ is strongly peaked near x = 0. This is a general feature of
sea partons; the chance that a virtual particle can appear is greater when it carries a
smaller energy, and thus a smaller fraction x of the proton’s total momentum. The
solid curve shows the total up-quark PDF for this value of Q. The sea distribution
$\bar{d}(x)$ is not very different from that of the anti-up, but the distribution $d_v(x)$ is of
course only about half as big as $u_v(x)$, since there is only one valence down quark
to find in the proton.
Next, let us look at the strange and gluon PDFs:
[Figure: CTEQ5L PDFs at Q = 10 GeV, plotted as $x f(x)$ versus $x$, for the gluon $g(x)$ and the strange quark $s(x) = \bar{s}(x)$.]
The parton distribution function for gluons grows very quickly as one moves towards
x = 0. This is because there are 8 gluon color combinations available, and each virtual
gluon can give rise to more virtual gluons because of the 3-gluon and 4-gluon vertex.
This means that the chance of finding a gluon gets very large if one requires that it only
have a small fraction of the total 4-momentum of the proton. The PDFs $s(x) = \bar{s}(x)$
are suppressed by the non-zero strange quark mass, since this imposes a penalty on
making virtual strange and antistrange quarks. This explains why $s(x) < \bar{d}(x)$.
The value of the factorization scale Q = 10 GeV corresponds roughly to the
appropriate energy scale for many of the experiments that were actually used to
fit for the PDFs. However, at the Tevatron and LHC, one often studies events with
a much larger characteristic energy scale, like Q ∼ m t for top events and perhaps
Q ∼ 1000 GeV for supersymmetry events at the LHC. Larger Q is appropriate for
probing the proton at larger energy scales, or shorter distance scales. The next two
graphs show the CTEQ5L PDFs for Q = 100 and 1000 GeV:
[Figure: CTEQ5L PDFs plotted as $x f(x)$ versus $x$ at Q = 100 GeV (left panel) and Q = 1000 GeV (right panel), with curves for $g$, $u$, $d$, $\bar{d}$, and $\bar{u}$.]
As Q increases from Q = 100 to 1000 GeV, the PDFs become larger at very small x
(although this is hard to see from the graphs), but smaller for $x \gtrsim 0.015$ for gluons and
$x \gtrsim 0.04$ for quarks. More generally, the variation with Q can be made quantitative
using the DGLAP equations, which are built into the computer codes that provide
the parton distributions as a function of x and Q.
Now suppose we have available a set of PDFs, and let us see how to use them to
get a cross-section. Consider scattering two hadrons $h$ and $h'$, and let the partonic
differential cross-sections for the desired final state X be
$$
d\hat{\sigma}(ab \to X) \tag{8.197}
$$
for any two partons a (to be taken from $h$) and b (from $h'$). The hat is used as a
reminder that this is a partonic process. If X has two particles 1,2 in it, then one
defines partonic Mandelstam variables:
\begin{align}
\hat{s} &= (p_a + p_b)^2, \tag{8.198} \\
\hat{t} &= (p_a - k_1)^2, \tag{8.199} \\
\hat{u} &= (p_a - k_2)^2. \tag{8.200}
\end{align}
Then we can define a Feynman x for each of the initial-state partons, xa and xb , so:
Here $s$ is determined by the collider [$(1960\ {\rm GeV})^2$ at the Tevatron and $(13\ {\rm TeV})^2$ at
the LHC], while $\hat{s}$ is different for each event. Now, to find the total cross-section to
produce the final state X in $h, h'$ collisions, we should multiply the partonic
cross-section by the probabilities of finding in $h$ a parton a with momentum fraction in
the range $x_a$ to $x_a + dx_a$ and the same probability for parton b in $h'$; then integrate
over all possible xa and xb , and then sum over all the different parton species a and
b. The result is:
$$
d\sigma(hh' \to X) = \sum_{a,b} \int_0^1 dx_a \int_0^1 dx_b\; d\hat{\sigma}(ab \to X)\, f_a^h(x_a)\, f_b^{h'}(x_b). \tag{8.206}
$$
This integration is done by computer, using PDFs with Q chosen equal to some
energy characteristic of the event. The partonic differential cross-section $d\hat{\sigma}(ab \to X)$
depends on the momentum fractions $x_a$ and $x_b$ through $p_a^\mu = x_a p_h^\mu$ and
$p_b^\mu = x_b p_{h'}^\mu$, with $p_h^\mu$ and $p_{h'}^\mu$ controlled or known by the experimenter.
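The structure of the convolution (8.206) can be sketched for a single parton pair as follows. This is a toy illustration under stated assumptions, not a realistic calculation: the partonic cross-section is taken to depend only on ŝ = x_a x_b s (consistent with (8.213) below), the PDFs are flat, and the partonic cross-section is a unit step above a threshold, so that the answer can be compared with the exact area 1 - τ + τ ln τ of the region x_a x_b > τ. All function names are hypothetical.

```python
import math

def hadronic_cross_section(sigma_hat, pdf_a, pdf_b, s, n=400):
    """Midpoint-rule estimate of the double integral in (8.206) for one parton pair.

    sigma_hat(s_hat) is a partonic cross-section depending only on s_hat = x_a x_b s.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x_a = (i + 0.5) * h
        for j in range(n):
            x_b = (j + 0.5) * h
            total += sigma_hat(x_a * x_b * s) * pdf_a(x_a) * pdf_b(x_b)
    return total * h * h

# Toy inputs: flat PDFs and a unit step at s_hat = TAU * s (illustration only).
TAU = 0.25
flat_pdf = lambda x: 1.0
step_sigma = lambda s_hat: 1.0 if s_hat > TAU else 0.0

numeric = hadronic_cross_section(step_sigma, flat_pdf, flat_pdf, s=1.0)
exact = 1.0 - TAU + TAU * math.log(TAU)
print(numeric, exact)
```

In a real calculation the toy inputs would be replaced by fitted PDFs and the appropriate partonic cross-sections, summed over all parton pairs a, b.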
8.7 Top-Antitop Production in $p\bar{p}$ and $pp$ Collisions
These are listed in the order of their numerical importance in contributing to the
total cross-section for $t\bar{t}$ at the Tevatron. Notice that the most likely thing is to find
a quark in the proton and an antiquark in the anti-proton, but there is also a small
but non-zero probability of finding an anti-quark in the proton, and a quark in the
anti-proton. All of the processes involving quark and antiquark in the initial state
involve the same parton-level cross-section
$$
\frac{d\hat{\sigma}(q\bar{q} \to t\bar{t})}{d\hat{t}}. \tag{8.208}
$$
The gluon-gluon process has a partonic cross-section that is somewhat more difficult
to obtain:
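$$
\begin{aligned}
\frac{d\hat{\sigma}(gg \to t\bar{t})}{d\hat{t}} = \frac{\pi \alpha_s^2}{8 \hat{s}^2} \Biggl[ \, & \frac{6 (m^2 - \hat{t})(m^2 - \hat{u})}{\hat{s}^2} - \frac{m^2 (\hat{s} - 4 m^2)}{3 (m^2 - \hat{t})(m^2 - \hat{u})} \\
& + \frac{4 [(m^2 - \hat{t})(m^2 - \hat{u}) - 2 m^2 (m^2 + \hat{t})]}{3 (m^2 - \hat{t})^2} + \frac{4 [(m^2 - \hat{t})(m^2 - \hat{u}) - 2 m^2 (m^2 + \hat{u})]}{3 (m^2 - \hat{u})^2} \\
& - \frac{3 [(m^2 - \hat{t})(m^2 - \hat{u}) + m^2 (\hat{u} - \hat{t})]}{\hat{s} (m^2 - \hat{t})} - \frac{3 [(m^2 - \hat{t})(m^2 - \hat{u}) + m^2 (\hat{t} - \hat{u})]}{\hat{s} (m^2 - \hat{u})} \Biggr],
\end{aligned} \tag{8.209}
$$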
where m is the mass of the top quark. Even these leading-order partonic differen-
tial cross-sections depend implicitly on the renormalization scale μ, through the
renormalized coupling α S (μ).
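Since (8.209) is lengthy, it may help to see it transcribed into code. The following is a minimal sketch: the function name is hypothetical, and û is eliminated using the standard kinematic relation ŝ + t̂ + û = 2m² for massless initial-state gluons and two final-state particles of mass m. A quick consistency check is that the expression is symmetric under exchanging t̂ and û, as it must be for identical gluons in the initial state.

```python
import math

def dsigma_dt_gg_ttbar(s_hat, t_hat, m, alpha_s):
    """Leading-order d(sigma-hat)/d(t-hat) for g g -> t tbar, transcribed from (8.209).

    s_hat and t_hat are in (energy)^2 and m in energy; the result is in natural
    units of 1/(energy)^4.
    """
    u_hat = 2.0 * m**2 - s_hat - t_hat  # from s_hat + t_hat + u_hat = 2 m^2
    a = m**2 - t_hat
    b = m**2 - u_hat
    bracket = (
        6.0 * a * b / s_hat**2
        - m**2 * (s_hat - 4.0 * m**2) / (3.0 * a * b)
        + 4.0 * (a * b - 2.0 * m**2 * (m**2 + t_hat)) / (3.0 * a**2)
        + 4.0 * (a * b - 2.0 * m**2 * (m**2 + u_hat)) / (3.0 * b**2)
        - 3.0 * (a * b + m**2 * (u_hat - t_hat)) / (s_hat * a)
        - 3.0 * (a * b + m**2 * (t_hat - u_hat)) / (s_hat * b)
    )
    return math.pi * alpha_s**2 / (8.0 * s_hat**2) * bracket

# Illustrative kinematic point: sqrt(s_hat) = 500 GeV, m = 173 GeV (units of GeV).
S_HAT, M_TOP, T_HAT = 500.0**2, 173.0, -50000.0
U_HAT = 2.0 * M_TOP**2 - S_HAT - T_HAT
print(dsigma_dt_gg_ttbar(S_HAT, T_HAT, M_TOP, 0.1082))
print(dsigma_dt_gg_ttbar(S_HAT, U_HAT, M_TOP, 0.1082))  # t-hat and u-hat exchanged
```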
In order to find the total cross-section, one can first integrate the partonic cross-
sections with respect to tˆ; this is equivalent to integrating over the final-state top
quark angle θ̂ in the partonic COM frame, since they are related linearly by
$$
\hat{t} = m_t^2 + \frac{\hat{s}}{2} \left( -1 + \cos\hat{\theta}\, \sqrt{1 - 4 m_t^2/\hat{s}} \right). \tag{8.210}
$$
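$$
\hat{\sigma} = \int_{\hat{t}_{\min}}^{\hat{t}_{\max}} d\hat{t}\; \frac{d\hat{\sigma}}{d\hat{t}}, \tag{8.211}
$$
where
$$
\hat{t}_{\max,\min} = m_t^2 + \frac{\hat{s}}{2} \left( -1 \pm \sqrt{1 - 4 m_t^2/\hat{s}} \right). \tag{8.212}
$$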
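The kinematic limits in (8.212) and the integral (8.211) can be sketched as below. The integrator takes any callable for the differential cross-section, so it is shown here only with a constant integrand, for which the answer is the width of the allowed range, ŝ times the square root in (8.212). Function names and the sample kinematic point are illustrative assumptions.

```python
import math

def t_hat_from_angle(s_hat, m_t, cos_theta):
    """(8.210): t-hat as a linear function of cos(theta-hat) in the partonic COM frame."""
    beta = math.sqrt(1.0 - 4.0 * m_t**2 / s_hat)
    return m_t**2 + 0.5 * s_hat * (-1.0 + cos_theta * beta)

def t_hat_limits(s_hat, m_t):
    """(8.212): (t_min, t_max), reached at cos(theta-hat) = -1 and +1."""
    return t_hat_from_angle(s_hat, m_t, -1.0), t_hat_from_angle(s_hat, m_t, 1.0)

def integrate_over_t_hat(dsigma_dt, s_hat, m_t, n=2000):
    """(8.211): midpoint-rule integral of dsigma_dt(t_hat) between the limits."""
    t_min, t_max = t_hat_limits(s_hat, m_t)
    h = (t_max - t_min) / n
    return h * sum(dsigma_dt(t_min + (i + 0.5) * h) for i in range(n))

# Illustrative point: sqrt(s_hat) = 500 GeV, m_t = 173 GeV.
s_hat, m_t = 500.0**2, 173.0
print(t_hat_limits(s_hat, m_t))
print(integrate_over_t_hat(lambda t_hat: 1.0, s_hat, m_t))  # width of the t-hat range
```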
$$
x_b = \frac{\hat{s}}{x_a s}; \qquad dx_b = \frac{d\hat{s}}{x_a s}. \tag{8.213}
$$
So instead of integrating over xb , we can integrate over ŝ. The limits of integration
on $\hat{s}$ are from $\hat{s}_{\min} = 4 m_t^2$ (the minimum required to make a top-antitop pair) to
ŝmax = s (the maximum available from the proton and antiproton, corresponding to
xa = xb = 1). For a given ŝ, the range of xa is from ŝ/s to 1. Relabeling xa as just
x, we therefore have:
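\[
\sigma(p\bar p \to t\bar t) = \int_{4m_t^2}^{s} d\hat s \int_{\hat s/s}^{1} dx\, \frac{1}{xs}
\Big\{ \hat\sigma(q\bar q \to t\bar t)\big[ u(x)u(\hat s/xs) + d(x)d(\hat s/xs)
+ \bar u(x)\bar u(\hat s/xs) + \bar d(x)\bar d(\hat s/xs) + 2s(x)s(\hat s/xs) \big]
+ \hat\sigma(gg \to t\bar t)\, g(x)g(\hat s/xs) \Big\}. \tag{8.214}
\]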
Using the CTEQ5L PDFs and $m_t = 173$ GeV, with $\sqrt{s} = 1960$ GeV, and computing
$\alpha_s(\mu)$ using (8.165) starting from $\alpha_s(m_t) = 0.1082$, and working with the leading-
order partonic cross-sections, the results as a function of the common factorization
and renormalization scale $Q = \mu$ look like:
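[Figure: leading-order $\sigma(t\bar t)$ [pb] at the Tevatron versus $Q/m_t$ (0.2 to 2, logarithmic vertical axis from $10^{-2}$ to $10^{1}$), showing the total and the separate $u\bar u$, $d\bar d$, $gg$, $\bar d d$, $\bar u u$, and $s\bar s + \bar s s$ contributions.]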
Unfortunately, the accuracy of the above results, obtained with only leading order
partonic cross-sections and PDFs, is clearly not very high. Ideally, the lines should
be flat, but there is instead a strong dependence of the leading-order prediction on
Q = μ. The higher-order corrections to the quark-antiquark processes turn out to
be of order 10 to 20%, while the gluon-gluon process gets about a 70% correction
from its leading-order value at Q = μ = m t . Accurate comparisons with experiment
require a much more detailed and sophisticated treatment of the higher-order effects,
including at least a next-to-leading order calculation of the partonic cross-sections.
Still, some useful information can be gleaned. Experience has shown that evaluating
the leading-order result at Q ∼ m t /2 gives a decent estimate of the total cross-
section, although a principled justification for this scale choice is hard to make. Also,
the relative sizes of the parton-level contributions can be understood qualitatively
from the PDFs as follows. To produce a top-antitop pair, we must have $\hat s > 4m_t^2$, so
according to (8.205),
\[ x_a x_b > 4m_t^2/s \approx 0.0312. \]
So at least one of the momentum fractions must be larger than 0.1765 for $m_t = 173$
GeV. This means that the largest contributions come from the valence quarks. Since
there are roughly twice as many valence up quarks as down quarks in the proton
for a given x, and twice as many antiups as antidowns in the antiproton, the ratio of
top-antitop events produced from up-antiup should be about 4 times that from down-
antidown. The gluon-gluon contribution is suppressed in this case because most of
the gluons are at small x and do not have enough energy to make a top-antitop pair.
Finally, the contributions from sea partons ($\bar u$, $\bar d$, $s$, $\bar s$ in the proton, and $u$, $d$, $s$, $\bar s$ in
the antiproton) are highly suppressed for the same reason.
Let us now consider $t\bar t$ production for the Large Hadron Collider, a $pp$ collider,
by taking into account the larger $\sqrt{s}$ and the different parton distribution function
roles. Since both of the initial-state hadrons are protons, the formula for the total
cross-section is now:
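\[
\sigma(pp \to t\bar t) = \int_{4m_t^2}^{s} d\hat s \int_{\hat s/s}^{1} dx\, \frac{1}{xs}
\Big\{ \hat\sigma(q\bar q \to t\bar t)\big[ 2u(x)\bar u(\hat s/xs) + 2d(x)\bar d(\hat s/xs)
+ 2s(x)\bar s(\hat s/xs) \big] + \hat\sigma(gg \to t\bar t)\, g(x)g(\hat s/xs) \Big\}. \tag{8.216}
\]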
The factors of 2 are present because each proton can contribute either the quark, or
the antiquark; then the contribution of the other proton is fixed. The gluon-gluon
contribution has the same form as in $p\bar p$ collisions, because the gluon distribution is
identical in protons and in antiprotons.
Numerically integrating the above formula with a computer using the CTEQ5L
PDFs, one finds the leading order results shown below, as a function of the common
factorization and renormalization scale $Q = \mu$, for the case of $\sqrt{s} = 14$ TeV:
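[Figure: leading-order $\sigma(t\bar t)$ [pb] at the 14 TeV LHC versus $Q/m_t$ (0.2 to 2, logarithmic vertical axis from $10^{0}$ to $10^{3}$), showing the total and the separate $gg$, $u\bar u + \bar u u$, $d\bar d + \bar d d$, $s\bar s + \bar s s$, and $c\bar c + \bar c c$ contributions.]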
PDF, but that is not the main reason. The really important effect is that at very high
energies like at the LHC, the top quark can be considered light (!) and so one can
make them using partons with much lower x. For example, with $\sqrt{s} = 14$ TeV, the
kinematic constraint on the longitudinal momentum fractions becomes
\[ x_a x_b > 4m_t^2/s \approx 6.1 \times 10^{-4}, \]
so that now the larger one can be as low as 0.0247. At low x, we saw above that
the gluon distribution function is very large; one has plenty of gluons available with
less than 1/10 of the protons’ total energy, and they dominate over the quark and
antiquark PDFs. This is actually a common feature, and is why you sometimes hear
people somewhat whimsically call the LHC a “gluon collider”; with so much energy
available for the protons, many processes are dominated by the large gluon PDF at
low x. There are some processes that do not rely on gluons at all, however. We will
see one example in Sect. 8.9. Those processes are dominated by sea quarks at the
LHC. Also, many processes get a large contribution from gluon-quark scattering as
well, for example gluino-squark production in supersymmetry.
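The two threshold values quoted in this section, 0.1765 for the Tevatron and 0.0247 for the 14 TeV LHC, both follow from $\hat s = x_a x_b s > 4m_t^2$. As a quick arithmetic check, here is a short sketch; the helper name `threshold_fractions` is ours, introduced only for illustration.

```python
import math

def threshold_fractions(m_t, sqrt_s):
    """Return (tau_min, x_min) for t tbar production.

    tau_min = 4 m_t^2 / s is the lower bound on the product x_a * x_b.
    x_min = sqrt(tau_min) = 2 m_t / sqrt(s) is the lower bound on the
    larger of the two momentum fractions.
    """
    tau_min = 4.0 * m_t**2 / sqrt_s**2
    return tau_min, math.sqrt(tau_min)

tau_tev, x_tev = threshold_fractions(173.0, 1960.0)    # Tevatron, energies in GeV
tau_lhc, x_lhc = threshold_fractions(173.0, 14000.0)   # LHC at 14 TeV

print(f"Tevatron: x_a*x_b > {tau_tev:.4f}, larger x > {x_tev:.4f}")
print(f"LHC:      x_a*x_b > {tau_lhc:.6f}, larger x > {x_lhc:.4f}")
```

The factor of about 7 between the two bounds is what moves the LHC into the small-x region where the gluon PDF is large.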
One can also look at the distribution of $t\bar t$ production as a function of the total
invariant mass $M_{\rm inv} = \sqrt{\hat s}$ of the hard scattering process, by leaving the $\hat s$ integration
in (8.214) and (8.216) unperformed. The resulting shape of the distribution
normalized by the total cross-section,
\[ \frac{1}{\sigma(t\bar t)} \frac{d\sigma(t\bar t)}{dM_{\rm inv}} = \frac{2\sqrt{\hat s}}{\sigma(t\bar t)} \frac{d\sigma(t\bar t)}{d\hat s}, \tag{8.218} \]
is shown below for the Tevatron and the LHC with $\sqrt{s} = 14$ TeV:
[Figure: normalized invariant-mass distribution $(1/\sigma)\,d\sigma(t\bar t)/dM_{\rm inv}$ [GeV$^{-1}$] versus $t\bar t$ $M_{\rm inv}$ [GeV] from 300 to 1000, for the Tevatron and for the LHC at 14 TeV.]
The invariant mass distribution of the $t\bar t$ system is peaked not far above $2m_t$ in
both cases, indicating that the top and antitop usually have only semi-relativistic
velocities. This is because the PDFs fall rapidly with increasing x, so the most
important contributions to the production cross-section occur when both x’s are
not very far above their minimum allowed values. At the LHC, the top and antitop
are likelier to be produced with higher energy than at the Tevatron, with a more
substantial tail at high mass.
8.8 Kinematics in Hadron-Hadron Scattering
Let us now consider the general problem of kinematics associated with hadron-
hadron collisions with underlying 2 → 2 parton scattering. To make things simple,
we will suppose all of the particles are essentially massless, so what we are about
to do does not work for $t\bar t$ in the final state (but could be generalized to do so).
After doing a sum/average over spins, colors, and any other unobserved degrees of
freedom, we should be able to compute the differential cross-section for the partonic
event from its Feynman diagrams as:
\[ \frac{d\hat\sigma(ab \to 12)}{d\hat t}. \tag{8.219} \]
As we learned in Sect. 8.6, we can then write:
\[ d\sigma(hh' \to 12 + X) = \sum_{a,b} f_a^h(x_a)\, f_b^{h'}(x_b)\, \frac{d\hat\sigma(ab \to 12)}{d\hat t}\, d\hat t\, dx_a\, dx_b. \tag{8.220} \]
A cartoon picture of the scattering process in real space might look like:
There are several different ways to choose the kinematic variables describing the final
state. There are three significant degrees of freedom: two angles at which the final-
state particles emerge with respect to the collision axis, and one overall momentum
scale. (Once the magnitude of the momentum transverse to the beam for one particle
is specified, the other is determined.) The angular dependence about the collision
axis is trivial, so we can ignore it.
For example, we can use the following three variables: momentum of particle 1
transverse to the collision axis, $p_T$; the total center-of-momentum energy of the final
state partons, $\sqrt{\hat s}$; and the longitudinal rapidity of the two-parton system in the lab
frame, defined by
\[ Y = \frac{1}{2} \ln(x_a/x_b). \tag{8.221} \]
This may look like an obscure definition, but it is the rapidity (see Sect. 2) needed
to boost along the collision axis to get to the center-of-momentum frame for the
two-parton system. It is equal to 0 if the final-state particles are back-to-back, which
would occur in the special case that the initial-state partons have the same energy
in the lab frame. Instead of the variables (xa , xb , tˆ), we can use the more directly
observable variables (ŝ, pT2 , Y ), or perhaps some subset of these with the others
integrated over. Working in the center-of-momentum frame of the partons, we can
write:
with
\[ \hat s = 4\hat E^2 = x_a x_b s, \tag{8.226} \]
and so
where the last equality uses $\hat u = -\hat s - \hat t = -s x_a x_b - \hat t$ for massless particles. Making
the change of variables $(x_a, x_b, \hat t)$ to $(\hat s, p_T^2, Y)$ for a differential cross-section
requires
\[ d\hat t\, dx_a\, dx_b = J\, d\hat s\, d(p_T^2)\, dY, \tag{8.230} \]
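The Jacobian $J$ in (8.230) can be checked numerically. The sketch below assumes the standard massless relation $p_T^2 = \hat t \hat u/\hat s$ together with (8.221) and (8.226), builds the $3\times 3$ matrix of partial derivatives by finite differences, and compares $1/|\det|$ with the closed form $x_a x_b/|\hat s + 2\hat t|$ (the factor that multiplies the PDFs in the Drell-Yan result (8.236)). The function names and the sample kinematic point are illustrative choices of ours.

```python
import math

def observables(xa, xb, t_hat, s):
    """Map (x_a, x_b, t_hat) -> (s_hat, pT^2, Y) for massless 2 -> 2 scattering."""
    s_hat = xa * xb * s                 # (8.226)
    u_hat = -s_hat - t_hat              # massless partons
    pT2 = t_hat * u_hat / s_hat         # assumed relation pT^2 = t u / s
    Y = 0.5 * math.log(xa / xb)         # (8.221)
    return (s_hat, pT2, Y)

def jacobian_det(xa, xb, t_hat, s, eps=1e-6):
    """Determinant of d(s_hat, pT^2, Y)/d(x_a, x_b, t_hat) by central differences."""
    point = [xa, xb, t_hat]
    m = []
    for i in range(3):
        step = eps * max(1.0, abs(point[i]))
        up = list(point)
        dn = list(point)
        up[i] += step
        dn[i] -= step
        f_up = observables(*up, s)
        f_dn = observables(*dn, s)
        # row i holds the derivatives of all three observables w.r.t. variable i
        m.append([(a - b) / (2.0 * step) for a, b in zip(f_up, f_dn)])
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

s = 1960.0**2                      # GeV^2, illustrative collider energy
xa, xb = 0.3, 0.2                  # illustrative momentum fractions
t_hat = -0.3 * xa * xb * s         # an arbitrary point with -s_hat < t_hat < 0

J_numeric = 1.0 / abs(jacobian_det(xa, xb, t_hat, s))
J_formula = xa * xb / abs(xa * xb * s + 2.0 * t_hat)
print(J_numeric, J_formula)
```

The determinant of the transposed matrix is the same, so it does not matter that the rows here are labeled by the old variables rather than the new ones.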
8.9 Drell-Yan Scattering ($\ell^+\ell^-$ Production in Hadron Collisions)
\[ hh' \to \ell^+\ell^-. \tag{8.233} \]
This does not involve QCD as the hard partonic scattering, since the leptons $\ell = e$,
$\mu$, or $\tau$ are singlets under $SU(3)_c$ color. However, it still depends on QCD, because
to evaluate it we need to know the PDFs for the quarks inside the hadrons. Since
gluons have no electric charge and do not couple to photons, the underlying partonic
process is always:
\[ q\bar q \to \ell^+\ell^-, \tag{8.234} \]
with the $q$ coming from either $h$ or $h'$. The cross-section for this, for $\sqrt{\hat s} \ll m_Z$ and
not near a resonance, can be obtained by exactly the same methods as in (5.2.1) for
$e^+e^- \to \mu^+\mu^-$; we just need to remember to use the charge of the quark $Q_q$ instead
of the charge of the electron, and to average over initial-state colors. The latter effect
leads to a suppression of 1/3; there is no reaction if the colors do not match. One
finds that the differential partonic cross-section is:
\[ \frac{d\hat\sigma(q\bar q \to \ell^+\ell^-)}{d\hat t} = \frac{2\pi\alpha^2 Q_q^2 (\hat t^2 + \hat u^2)}{3\hat s^4}. \tag{8.235} \]
Therefore, using $\hat u = -\hat s - \hat t$ for massless scattering, and writing separate contributions
from finding the quark, and the antiquark, in $h$:
\[
\frac{d\sigma(hh' \to \ell^+\ell^-)}{d\hat s\, dp_T^2\, dY} = \frac{2\pi\alpha^2}{3}\,
\frac{\hat t^2 + (\hat s + \hat t)^2}{\hat s^4 (\hat s + 2\hat t)} \sum_q x_a x_b Q_q^2
\left[ f_q^h(x_a) f_{\bar q}^{h'}(x_b) + f_{\bar q}^h(x_a) f_q^{h'}(x_b) \right]. \tag{8.236}
\]
This can be used to make a prediction for the experimental distribution of events
with respect to each of ŝ, pT , and Y .
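\[ \hat\sigma(q\bar q \to \ell^+\ell^-) = \frac{4\pi\alpha^2 Q_q^2}{9\hat s}. \tag{8.237} \]
Therefore we get:
\[ d\sigma(hh' \to \ell^+\ell^-) = \frac{4\pi\alpha^2}{9\hat s}\, dx_a\, dx_b \sum_q Q_q^2 \left[ f_q^h(x_a) f_{\bar q}^{h'}(x_b) + f_{\bar q}^h(x_a) f_q^{h'}(x_b) \right]. \tag{8.238} \]
we therefore have:
\[ \frac{d\sigma(hh' \to \ell^+\ell^-)}{d\hat s\, dY} = \frac{4\pi\alpha^2}{9 s \hat s} \sum_q Q_q^2 \left[ f_q^h(x_a) f_{\bar q}^{h'}(x_b) + f_{\bar q}^h(x_a) f_q^{h'}(x_b) \right]. \tag{8.240} \]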
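The total partonic cross-section (8.237) is just the integral of (8.235) over $\hat t$. As a cross-check, the sketch below performs that integral numerically, assuming the massless range $-\hat s \le \hat t \le 0$, and compares with the closed form. The value of $\alpha$ and the sample $\hat s$ are illustrative; the function names are ours.

```python
import math

ALPHA = 1.0 / 137.036   # approximate low-energy fine-structure constant

def dsigma_dt(s_hat, t_hat, Qq):
    """Differential partonic cross-section (8.235), with u_hat = -s_hat - t_hat."""
    u_hat = -s_hat - t_hat
    return 2.0 * math.pi * ALPHA**2 * Qq**2 * (t_hat**2 + u_hat**2) / (3.0 * s_hat**4)

def sigma_numeric(s_hat, Qq, n=2000):
    """Midpoint-rule integral of (8.235) over -s_hat <= t_hat <= 0."""
    h = s_hat / n
    return h * sum(dsigma_dt(s_hat, -s_hat + (i + 0.5) * h, Qq) for i in range(n))

def sigma_closed_form(s_hat, Qq):
    """Total partonic cross-section (8.237)."""
    return 4.0 * math.pi * ALPHA**2 * Qq**2 / (9.0 * s_hat)

s_hat = 100.0**2          # GeV^2, illustrative
Q_up = 2.0 / 3.0          # electric charge of the up quark
print(sigma_numeric(s_hat, Q_up), sigma_closed_form(s_hat, Q_up))
```

The agreement reflects $\int_{-\hat s}^{0} [\hat t^2 + (\hat s + \hat t)^2]\, d\hat t = 2\hat s^3/3$.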
Still another way to present the result is to leave only ŝ unintegrated, by again first
integrating the partonic differential cross-section with respect to tˆ, and then trading
one of the Feynman-x variables for ŝ, and do the remaining x-integration. This is
how we wrote the top-antitop total cross-section. The Jacobian factor in the change
of variables from (xa , xb ) → (xa , ŝ) is now:
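\[ dx_a\, dx_b = \left| \frac{\partial \hat s}{\partial x_b} \right|^{-1} dx_a\, d\hat s = \frac{dx_a\, d\hat s}{x_a s}. \tag{8.241} \]
\[
\frac{d\sigma(hh' \to \ell^+\ell^-)}{d\hat s} = \frac{4\pi\alpha^2}{9\hat s^2} \sum_q Q_q^2
\int_{\hat s/s}^{1} dx\, \frac{\hat s}{xs} \left[ f_q^h(x) f_{\bar q}^{h'}(\hat s/xs) + f_{\bar q}^h(x) f_q^{h'}(\hat s/xs) \right]. \tag{8.242}
\]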
This version makes a nice prediction that is (almost) independent of the actual parton
distribution functions. The right-hand side could have depended on both ŝ and s in
an arbitrary way, but to the extent that the PDFs are independent of Q, we see that it
is predicted to scale like $1/\hat s^2$ times some function of the ratio $\hat s/s$. Since the PDFs
run slowly with Q, this is a reasonably good prediction. Drell-Yan scattering has
been studied in $hh' = p\bar p$, $pp$, $\pi^\pm p$, and $K^\pm p$ scattering experiments, and in each
case the results indeed satisfy the scaling law:
\[ \hat s^2\, \frac{d\sigma(hh' \to \ell^+\ell^-)}{d\hat s} = F_{hh'}(\hat s/s) \tag{8.243} \]
to a very good approximation, for low $\hat s$ not near a resonance. Furthermore, the
function $F_{hh'}$ gives information about the PDFs.
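The scaling law (8.243) can be illustrated with a toy calculation. The sketch below evaluates (8.242) for a single quark flavor using a made-up, $Q$-independent parton density (the function `toy_pdf` is purely hypothetical and is not a fit to any data), and shows that $\hat s^2\, d\sigma/d\hat s$ comes out the same at two different collider energies when $\hat s/s$ is held fixed.

```python
import math

def toy_pdf(x):
    """Hypothetical Q-independent parton density, for illustration only."""
    return (1.0 - x) ** 3 / math.sqrt(x)

def dsigma_dshat(s_hat, s, alpha=1.0 / 137.036, Qq=2.0 / 3.0, n=4000):
    """Eq. (8.242) for one flavor, taking f_q = f_qbar = toy_pdf in both hadrons."""
    tau = s_hat / s
    lo = math.log(tau)          # integrate in ln(x) from ln(tau) to 0
    h = -lo / n
    total = 0.0
    for i in range(n):
        x = math.exp(lo + (i + 0.5) * h)
        y = tau / x             # second momentum fraction, s_hat/(x s)
        # two equal terms in the bracket of (8.242); dx = x d(ln x)
        total += (tau / x) * 2.0 * toy_pdf(x) * toy_pdf(y) * x * h
    return 4.0 * math.pi * alpha**2 * Qq**2 / (9.0 * s_hat**2) * total

def scaling_function(s_hat, s):
    """Left-hand side of (8.243)."""
    return s_hat**2 * dsigma_dshat(s_hat, s)

tau = 0.01
s1, s2 = 1960.0**2, 7000.0**2      # two illustrative values of s in GeV^2
F1 = scaling_function(tau * s1, s1)
F2 = scaling_function(tau * s2, s2)
print(F1, F2)
```

With real PDFs the two numbers would differ slightly, because the densities run slowly with $Q$; that residual difference is the scaling violation alluded to in the text.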
Because of the relatively clean signals of muons in particle detectors, the Drell-
Yan process
\[ p\bar p \to \mu^-\mu^+ + X \quad \text{or} \quad pp \to \mu^-\mu^+ + X \tag{8.244} \]
is often one of the first things one studies at a hadron collider to make sure everything
is working correctly and understood. It also provides a test of the PDFs, especially
at small x.
For larger $\sqrt{\hat s}$, one must take into account the s-channel Feynman diagram with
a Z boson in place of the virtual photon. The resulting cross-section can be obtained
from (8.242) by replacing
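\[
\frac{4\pi\alpha^2}{9\hat s^2} \sum_q Q_q^2 \;\to\; \frac{4\pi}{9} \sum_q \left[
\frac{\alpha^2 Q_q^2}{\hat s^2}
+ \frac{(V_q^2 + A_q^2)(V_\ell^2 + A_\ell^2)}{(\hat s - m_Z^2)^2 + m_Z^2 \Gamma_Z^2}
- \frac{2 Q_q V_q V_\ell (1 - m_Z^2/\hat s)}{(\hat s - m_Z^2)^2 + m_Z^2 \Gamma_Z^2}
\right] \tag{8.245}
\]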
Fig. 8.2 The production of dimuon pairs as a function of μ+ μ− invariant mass (from CMS DP-
2018/055)
Problems
1. Each element g of the SU(2) group can be parametrized by three real parameters
$x_a$, where
\[
g = g(x_1, x_2, x_3) = e^{i x_b T^b} =
\begin{pmatrix} [1 - x^2]^{1/2} + i x_1 & x_2 + i x_3 \\ -x_2 + i x_3 & [1 - x^2]^{1/2} - i x_1 \end{pmatrix}
\]
where $T^k$ are the corresponding SU(2) Lie algebra generators and the $x_k$ take on
values
Note, the identity element is when $x^2 = 0$, and so small deviations away from the
identity are when $|x| \ll 1$.
(a) Determine what the generators $T^k$ are by
\[ iT^k = \left. \frac{dg}{dx_k} \right|_{x_1 = x_2 = x_3 = 0} \tag{8.247} \]
2. (a) Compute the QCD cross-section $\sigma(u\bar u \to f\bar f)$ where $f \neq u$ for collision
center of mass energy $\sqrt{s}$. Assume $m_f = 0$ and $m_u = 0$ for this problem, and u is the
up-quark of QCD. Represent your final answer not in terms of the SU(3) genera-
tors but in terms of “group and representation data”, such as I R ( f ), C(R), C(G),
dG , d R , etc. Now, compute precisely what you get for the cases of (b) f being in
the fundamental representation and (c) f being in the adjoint representation of
SU (3).
3. Consider the EHLQ pdf functions6
and q̄s (x) = qs (x) for every q = {u, d, s}. In addition, ū v (x) = d̄v (x) = 0. The
subscripts refer to valence (qv (x)) and sea (qs (x)) quarks. These PDFs depend
on Q very weakly (u(x) = u(x, Q)) but ignore that dependence here.
This problem below is to be done numerically. All results should be given to three
significant digits (e.g., 1.38).
(a) Compute the number of valence up quarks and number of valence down
quarks in the proton according to these PDFs.
(b) Compute the total fraction of momentum carried by all the partons.
(c) What is the total momentum carried by the gluons?
(d) Compute the total fraction of momentum carried by strange quarks (s and s̄).
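\[ qq \to qq, \quad qq' \to qq', \quad \bar q\bar q \to \bar q\bar q, \quad \bar q\bar q' \to \bar q\bar q', \tag{8.253} \]
\[ q\bar q \to gg, \quad q\bar q \to q\bar q, \quad q\bar q \to q'\bar q', \quad q\bar q' \to q\bar q', \tag{8.254} \]
\[ qg \to qg, \quad \bar q g \to \bar q g, \quad gg \to gg, \quad gg \to q\bar q, \tag{8.255} \]
where $q$ represents any fixed quark flavor, and $q'$ represents a quark flavor that
is definitely different from $q$. For the purposes of this problem, we consider only
massless partons (those very light compared to the partonic $\sqrt{\hat s}$), and so do not
consider processes involving top quarks or antiquarks.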
Several of the processes in this list actually have the same tree-level differential
cross-sections:
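\[ \frac{d\hat\sigma(qq' \to qq')}{d\hat t} = \frac{d\hat\sigma(\bar q\bar q' \to \bar q\bar q')}{d\hat t} = \frac{d\hat\sigma(q\bar q' \to q\bar q')}{d\hat t}, \tag{8.256} \]
\[ \frac{d\hat\sigma(qq \to qq)}{d\hat t} = \frac{d\hat\sigma(\bar q\bar q \to \bar q\bar q)}{d\hat t}, \tag{8.257} \]
\[ \frac{d\hat\sigma(qg \to qg)}{d\hat t} = \frac{d\hat\sigma(\bar q g \to \bar q g)}{d\hat t}. \tag{8.258} \]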
So, there are really only 8 independent parton-level cross-sections in terms of
which one can express the leading-order two-jet production cross-section for
hadron colliders including the Tevatron and the LHC. Assume they are all known.
(a) Find an integral expression for the total Tevatron cross section for dijet pro-
duction in p p collisions at leading order in QCD in terms of the 8 independent
parton-level total cross sections and the PDFs $g(x)$, $u(x)$, $d(x)$, $\bar u(x)$, $\bar d(x)$, $s(x)$.
(Neglect charm and bottom PDFs; they are small.) I’ll start by including the
contributions for three of the 8 independent parton-level cross-sections, and
you fill in the contributions for the other 5:
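\[
\sigma(p\bar p \to jj + X) = \int_0^s d\hat s \int_0^1 \frac{dx}{xs} \Big\{
g(x) g(\hat s/sx) \Big[ \hat\sigma(gg \to gg) + \sum_q \hat\sigma(gg \to q\bar q) \Big]
+ \big[ u(x)\bar u(\hat s/sx) + d(x)\bar d(\hat s/sx) + s(x)s(\hat s/sx) \big]\, 2\hat\sigma(qq \to qq) + \;? \Big\}. \tag{8.259}
\]
Here $\sqrt{s}$ is the proton-antiproton collision energy in their center-of-momentum
frame, and $\sqrt{\hat s}$ is the parton-parton collision energy in their center-of-
momentum frame.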
(b) Do the same for pp collisions relevant for the LHC. (The answer is different.)
6. Consider the parton-level process involving scattering of a quark with its antiquark:
$q\bar q \to q\bar q$. (Note that this is one of the partonic processes that appeared
where n is a certain positive rational number that you will find, and ŝ, tˆ, û are
the partonic Mandelstam variables.
[Hint: You will need to compute Tr[T a T b T a T b ], with the adjoint indices a
and b implicitly summed over. This can be done by using equations (8.1.16),
(8.1.59), (8.1.61), and (8.1.67).]
(c) Find the parton-level differential cross-section $\dfrac{d\hat\sigma(q\bar q \to q\bar q)}{d\hat t}$.
7. Use crossing symmetry to find some others of the 2 → 2 partonic differential
cross-sections mentioned in Problem 1:
(a) Using the results of part (b) of the previous problem, find $\dfrac{d\hat\sigma(qq \to qq)}{d\hat t}$.
(b) Using the result for $qq' \to qq'$ in equation (9.2.16), find both $\dfrac{d\hat\sigma(q\bar q' \to q\bar q')}{d\hat t}$
and $\dfrac{d\hat\sigma(q\bar q \to q'\bar q')}{d\hat t}$. [Hint: don’t worry, nothing weird happens with the
color factors in this problem.]
9 Spontaneous Symmetry Breaking
Not all of the symmetries of the laws of physics are evident in the state that describes
a physical system, or even in the vacuum state with no particles. For example, in
condensed matter physics, the ground state of a ferromagnetic system involves a
magnetization vector that points in some particular direction, even though Maxwell’s
equations in matter do not contain any special direction. This is because it is ener-
getically favorable for the magnetic moments in the material to line up, rather than
remaining randomized. The state with randomized magnetic moments is unstable
to small perturbations, like a stick balanced on one end, and will settle in the more
energetically-favored magnetized state.
The situation in which the laws of physics are invariant under some symmetry
transformations, but the vacuum state is not, is called spontaneous symmetry break-
ing. In this section we will study how this works in quantum field theory. There
are two types of continuous symmetry transformations; global, in which the trans-
formation does not depend on position in spacetime, and local (or gauge) in which
the transformation can be different at each point. We will work out how sponta-
neous symmetry breaking works in each of these cases, using the example of a U (1)
symmetry, and then guess the generalizations to non-Abelian symmetries.
9.1 Global Symmetry Breaking
Consider a complex scalar field $\phi(x)$ with a Lagrangian density:
\[ \mathcal{L} = \partial^\mu \phi^* \partial_\mu \phi - V(\phi, \phi^*), \tag{9.1} \]
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_9
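with the potential
\[ V(\phi, \phi^*) = m^2 \phi^*\phi + \lambda (\phi^*\phi)^2, \tag{9.2} \]
where $m^2$ and $\lambda$ are parameters of the theory. This Lagrangian is invariant under the
global U(1) transformations:
\[ \phi(x) \to \phi'(x) = e^{i\alpha} \phi(x), \tag{9.3} \]
where $\alpha$ is any constant. The classical equations of motion for $\phi$ and $\phi^*$ following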
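from $\mathcal{L}$ are (see (4.25)):
\[ \partial^\mu \partial_\mu \phi + \frac{\delta V}{\delta \phi^*} = 0, \tag{9.4} \]
\[ \partial^\mu \partial_\mu \phi^* + \frac{\delta V}{\delta \phi} = 0. \tag{9.5} \]
\[ \langle 0|\phi(x)|0\rangle = 0. \tag{9.6} \]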
The scalar particles created and destroyed by the field φ(x) correspond to quantized
oscillations of φ(x) about the minimum of the potential. They have squared mass
equal to m 2 , and interact with a four-scalar vertex proportional to λ.
Let us now consider what happens if the signs of the parameters m 2 and λ are
different. If λ < 0, then the potential V (φ, φ ∗ ) is unbounded from below for arbi-
trarily large |φ|. This cannot lead to an acceptable theory. Classically there would
be runaway solutions in which |φ(x)| → ∞, gaining an infinite amount of kinetic
energy. The quantum mechanical counterpart of this statement is that the expectation
value of φ(x) will grow without bound.
However, there is nothing wrong with the theory if m 2 < 0 and λ > 0. (One should
think of m 2 as simply a parameter that appears in the Lagrangian density, and not
as the square of some mythical real number m.) In that case, the potential V (φ, φ ∗ )
has a local maximum at φ = 0, and a degenerate set of minima with
\[ |\phi_{\min}|^2 = \frac{v^2}{2}, \tag{9.7} \]
where we have defined:
\[ v = \sqrt{-m^2/\lambda}. \tag{9.8} \]
The potential V does not depend on the phase of φ(x) at all, so it is impossible
to unambiguously determine the phase of φ(x) at the minimum. However, by an
arbitrary choice, we can make Im(φmin ) = 0. In quantum mechanics, the system
will have a ground state in which the expectation value of φ(x) is constant and equal
to the classical minimum:
\[ \langle 0|\phi(x)|0\rangle = \frac{v}{\sqrt{2}}. \tag{9.9} \]
The quantity v is a measurable property of the vacuum state, known as the vacuum
expectation value, or VEV, of φ(x). If we now ask what the VEV of φ is after
performing a U (1) transformation of the form (9.3), we find:
\[ \langle 0|\phi'(x)|0\rangle = e^{i\alpha} \langle 0|\phi(x)|0\rangle = e^{i\alpha} \frac{v}{\sqrt{2}} \neq \frac{v}{\sqrt{2}}. \tag{9.10} \]
The VEV is not invariant under the U (1) symmetry operation acting on the fields of
the theory; this reflects the fact that we had to make an arbitrary choice of phase. One
cannot restore the invariance by defining the symmetry operation to also multiply
$|0\rangle$ by a phase, since $\langle 0|$ will rotate by the opposite phase, canceling out of (9.10).
Therefore, the vacuum state must not be invariant under the global U (1) symmetry
rotation, and the symmetry is spontaneously broken. The sign of the parameter m 2 is
what determines whether or not spontaneous symmetry breaking takes place in the
theory.
In order to further understand the behavior of this theory, it is convenient to rewrite
the scalar field in terms of its deviation from its VEV. One way to do this is to write:
\[ \phi(x) = \frac{1}{\sqrt{2}} [v + R(x) + i I(x)], \tag{9.11} \]
\[ \phi^*(x) = \frac{1}{\sqrt{2}} [v + R(x) - i I(x)], \tag{9.12} \]
where R and I are each real scalar fields, representing the real and imaginary parts
of φ. The derivative part of the Lagrangian can now be rewritten in terms of R and
I , as:
\[ \mathcal{L} = \frac{1}{2} \partial^\mu R\, \partial_\mu R + \frac{1}{2} \partial^\mu I\, \partial_\mu I. \tag{9.13} \]
The potential appearing in the Lagrangian can be found in terms of R and I most
easily by noticing that it can be rewritten as
\[ V(\phi, \phi^*) = \lambda (\phi^*\phi - v^2/2)^2 - \lambda v^4/4. \tag{9.14} \]
Dropping the last term that does not depend on the fields, and plugging in (9.11) and
(9.12), this becomes:
\[ V(R, I) = \frac{\lambda}{4} [(v + R)^2 + I^2 - v^2]^2 \tag{9.15} \]
\[ \phantom{V(R, I)} = \lambda v^2 R^2 + \lambda v R (R^2 + I^2) + \frac{\lambda}{4} (R^2 + I^2)^2. \tag{9.16} \]
Comparing this expression with our previous discussion of real scalar fields in Chap.
4, we can interpret the terms proportional to λv as R R R and R I I interaction vertices,
and the last term proportional to λ as R R R R, R R I I , and I I I I interaction vertices.
The first term proportional to λv 2 is a mass term for R, but there is no term quadratic
in I , so it corresponds to a massless real scalar particle. Comparing to the Klein-
Gordon Lagrangian density of (4.18), we can identify the physical particle masses:
\[ m_R^2 = 2\lambda v^2 = -2m^2, \tag{9.17} \]
\[ m_I^2 = 0. \tag{9.18} \]
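The algebra leading from (9.15) to (9.16), and the masses (9.17) and (9.18), can be spot-checked numerically. In the sketch below the values of $v$ and $\lambda$ are arbitrary illustrative numbers, and the squared masses are read off as second derivatives of the potential at $R = I = 0$, using the Klein-Gordon normalization in which the mass term is $\frac{1}{2} m^2 \varphi^2$ for a real scalar field $\varphi$.

```python
import math

def V_compact(R, I, v, lam):
    """Potential in the form (9.15)."""
    return 0.25 * lam * ((v + R) ** 2 + I ** 2 - v ** 2) ** 2

def V_expanded(R, I, v, lam):
    """Potential in the expanded form (9.16)."""
    return (lam * v**2 * R**2
            + lam * v * R * (R**2 + I**2)
            + 0.25 * lam * (R**2 + I**2) ** 2)

def second_derivative(f, h=1e-3):
    """Central-difference estimate of f''(0)."""
    return (f(h) - 2.0 * f(0.0) + f(-h)) / h**2

def mass_squared_R(v, lam):
    return second_derivative(lambda R: V_compact(R, 0.0, v, lam))

def mass_squared_I(v, lam):
    return second_derivative(lambda I: V_compact(0.0, I, v, lam))

v, lam = 2.0, 0.5   # illustrative values, not physical inputs
print(V_compact(0.3, -0.7, v, lam), V_expanded(0.3, -0.7, v, lam))
print(mass_squared_R(v, lam), 2.0 * lam * v**2)   # close to 2 lambda v^2
print(mass_squared_I(v, lam))                      # close to zero
```

The vanishing curvature in the $I$ direction is the numerical signature of the flat direction along the circle of degenerate minima.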
\[ \phi(x) = \frac{1}{\sqrt{2}} [v + h(x)]\, e^{i G(x)/v}, \tag{9.19} \]
\[ \phi^*(x) = \frac{1}{\sqrt{2}} [v + h(x)]\, e^{-i G(x)/v}, \tag{9.20} \]
instead of (9.11), (9.12). Again h(x) and G(x) are two real scalar fields, related to
R(x) and I (x) by a non-linear functional transformation. In terms of these fields,
the potential is:
\[ V(h) = \lambda v^2 h^2 + \lambda v h^3 + \frac{\lambda}{4} h^4. \tag{9.21} \]
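A short numerical sketch makes the independence of the potential on $G$ concrete. We assume the potential in the form $\lambda(\phi^*\phi - v^2/2)^2$, which is the form implied by (9.15) with the field-independent constant dropped, and evaluate it on the polar parametrization (9.19) for several values of $G$. The numbers chosen are arbitrary.

```python
import cmath
import math

def phi_polar(h, G, v):
    """Complex field in the polar parametrization (9.19)."""
    return (v + h) * cmath.exp(1j * G / v) / math.sqrt(2.0)

def V_of_phi(phi, v, lam):
    """Potential with the constant term dropped (assumed form, consistent with (9.15))."""
    return lam * (abs(phi) ** 2 - 0.5 * v**2) ** 2

def V_h(h, v, lam):
    """Potential as a function of h alone, (9.21)."""
    return lam * v**2 * h**2 + lam * v * h**3 + 0.25 * lam * h**4

v, lam, h = 2.0, 0.5, 0.4    # illustrative values
for G in (0.0, 0.7, -3.1):
    print(G, V_of_phi(phi_polar(h, G, v), v, lam), V_h(h, v, lam))
```

Every row shows the same value, because $G$ only changes the phase of $\phi$ and the potential depends on $|\phi|$ alone.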
Notice that the field G does not appear in V at all. This is because G just corre-
sponds to the phase of φ, and the potential was chosen to be invariant under U (1)
phase transformations. However, G does have interactions coming from the part
of the Lagrangian density containing derivatives. To find the derivative part of the
Lagrangian, we compute:
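\[ \partial_\mu \phi = \frac{1}{\sqrt{2}}\, e^{i G/v} \left[ \partial_\mu h + \frac{i(v + h)}{v} \partial_\mu G \right], \tag{9.22} \]
\[ \partial_\mu \phi^* = \frac{1}{\sqrt{2}}\, e^{-i G/v} \left[ \partial_\mu h - \frac{i(v + h)}{v} \partial_\mu G \right], \tag{9.23} \]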
so that:
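\[ \mathcal{L}_{\rm derivatives} = \frac{1}{2} \partial^\mu h\, \partial_\mu h + \frac{1}{2} \left( 1 + \frac{h}{v} \right)^2 \partial^\mu G\, \partial_\mu G. \tag{9.24} \]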
The quadratic part of the Lagrangian, which determines the propagators for h and
G, is
\[ \mathcal{L}_{\rm quadratic} = \frac{1}{2} \partial^\mu h\, \partial_\mu h - \frac{1}{2} m_h^2 h^2 + \frac{1}{2} \partial^\mu G\, \partial_\mu G, \tag{9.25} \]
with
\[ m_h^2 = 2\lambda v^2, \tag{9.26} \]
\[ m_G^2 = 0. \tag{9.27} \]
This confirms the previous result that the spectrum of particles consists of a mas-
sive real scalar (h) and a massless one (G). The interaction part of the Lagrangian
following from (9.21) and (9.24) is:
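\[ \mathcal{L}_{\rm int} = \left( \frac{1}{v} h + \frac{1}{2v^2} h^2 \right) \partial^\mu G\, \partial_\mu G - \lambda v h^3 - \frac{\lambda}{4} h^4. \tag{9.28} \]
\[ G \to G' = G + \alpha v; \tag{9.29} \]
\[ h \to h' = h. \tag{9.30} \]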
This explains why G only appears in the Lagrangian with derivatives acting on it. In
general, a broken global symmetry is always signaled by the presence of a massless
Nambu-Goldstone boson with only derivative interactions. This is an example of
Goldstone’s theorem, which we will state in a more general framework in Sect. 9.3.
9.2 Local Symmetry Breaking and the Higgs Mechanism
Let us now consider how things change if the spontaneously broken symmetry is
local, or gauged. As a simple example, consider a U(1) gauge theory with a fermion
$\psi$ with charge Q and gauge coupling g and a vector field $A_\mu$ transforming according
to:
The Lagrangian density for the scalar and vector degrees of freedom of the theory
is:
\[ \mathcal{L} = D_\mu \phi^* D^\mu \phi - V(\phi, \phi^*) - \frac{1}{4} F^{\mu\nu} F_{\mu\nu}, \tag{9.37} \]
where V (φ, φ ∗ ) is as given before in (9.2). Because the covariant derivative of the
field transforms like
\[ D_\mu \phi \to e^{i\theta} D_\mu \phi, \tag{9.38} \]
this Lagrangian is easily checked to be gauge-invariant. If m 2 > 0 and λ > 0, then this
theory describes a massive scalar, with self-interactions with a coupling proportional
to λ, and interaction with the massless vector field Aμ .
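However, if $m^2 < 0$, then the minimum of the potential for the scalar field brings
about a non-zero VEV $\langle 0|\phi|0\rangle = v/\sqrt{2} = \sqrt{-m^2/2\lambda}$, just as in the global symmetry
case. Using the same decomposition of $\phi$ into real fields h and G as given in (9.19),
one finds:
\[ D_\mu \phi = \frac{1}{\sqrt{2}} \left[ \partial_\mu h + i g \left( A_\mu + \frac{1}{gv} \partial_\mu G \right) (v + h) \right] e^{i G/v}. \tag{9.39} \]
The derivative of G can be absorbed by defining a new vector field,
\[ V_\mu = A_\mu + \frac{1}{gv} \partial_\mu G, \tag{9.40} \]
so that
\[ D_\mu \phi = \frac{1}{\sqrt{2}} \left[ \partial_\mu h + i g V_\mu (v + h) \right] e^{i G/v}, \tag{9.41} \]
\[ D_\mu \phi^* = \frac{1}{\sqrt{2}} \left[ \partial_\mu h - i g V_\mu (v + h) \right] e^{-i G/v}. \tag{9.42} \]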
Note also that since
∂μ Aν − ∂ν Aμ = ∂μ Vν − ∂ν Vμ , (9.43)
the vector field strength part of the Lagrangian is the same written in terms of the
new vector Vμ as it was in terms of the old vector Aμ :
\[ -\frac{1}{4} F^{\mu\nu} F_{\mu\nu} = -\frac{1}{4} (\partial_\mu V_\nu - \partial_\nu V_\mu)(\partial^\mu V^\nu - \partial^\nu V^\mu). \tag{9.44} \]
The complete Lagrangian density of the vector and scalar degrees of freedom is now:
\[ \mathcal{L} = \frac{1}{2} \left[ \partial^\mu h\, \partial_\mu h + g^2 (v + h)^2 V^\mu V_\mu \right] - \frac{1}{4} F^{\mu\nu} F_{\mu\nu} - \lambda (vh + h^2/2)^2. \tag{9.45} \]
This Lagrangian has the very important property that the field G has completely
disappeared! Reading off the part quadratic in h, we see that it has the same squared
mass as in the global symmetry case, namely
\[ m_h^2 = 2\lambda v^2. \tag{9.46} \]
\[ \mathcal{L}_{VV} = -\frac{1}{4} F^{\mu\nu} F_{\mu\nu} + \frac{g^2 v^2}{2} V^\mu V_\mu. \tag{9.47} \]
This means that by spontaneously breaking the gauge symmetry, we have given a
mass to the corresponding vector field:
\[ m_V^2 = g^2 v^2. \tag{9.48} \]
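The disappearance of $G$ from the kinetic term can also be seen numerically. The sketch below treats a single Lorentz component and ignores metric signs (a simplification made only for illustration), builds $D_\mu\phi$ from (9.41), and checks that $D_\mu\phi^* D_\mu\phi$ equals $\frac{1}{2}[(\partial_\mu h)^2 + g^2 (v+h)^2 V_\mu^2]$ whatever value $G$ takes. All numerical inputs are arbitrary, and the function names are ours.

```python
import cmath
import math

def D_phi(dh, V_mu, h, G, v, g):
    """One component of the covariant derivative in the form (9.41)."""
    return (dh + 1j * g * V_mu * (v + h)) * cmath.exp(1j * G / v) / math.sqrt(2.0)

def kinetic_term(dh, V_mu, h, G, v, g):
    """|D phi|^2 for a single component (metric signs ignored for illustration)."""
    d = D_phi(dh, V_mu, h, G, v, g)
    return (d.conjugate() * d).real

def kinetic_term_expected(dh, V_mu, h, v, g):
    """The same quantity written without G, as in the first bracket of (9.45)."""
    return 0.5 * (dh**2 + g**2 * (v + h) ** 2 * V_mu**2)

v, g = 2.0, 0.6             # illustrative values
dh, V_mu, h = 0.3, 1.2, 0.5
for G in (0.0, 1.0, -2.5):
    print(G, kinetic_term(dh, V_mu, h, G, v, g), kinetic_term_expected(dh, V_mu, h, v, g))

# Setting h = 0 isolates the vector mass term (1/2) g^2 v^2 V^2, i.e. m_V^2 = g^2 v^2.
print(kinetic_term(0.0, 1.0, 0.0, 0.0, v, g), 0.5 * g**2 * v**2)
```

The phase $e^{iG/v}$ cancels between $D_\mu\phi$ and its conjugate, which is why only $h$ and $V_\mu$ survive.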
We can understand why the disappearance of the field G goes along with the
appearance of the vector boson mass as follows. A massless spin-1 vector boson
(like the photon) has only two possible polarization states, each transverse to its
direction of motion. In contrast, a massive spin-1 vector boson has three possible
polarization states; the two transverse, and one longitudinal (parallel) to its direction
of motion. The additional polarization state degree of freedom had to come from
somewhere, so one real scalar degree of freedom had to disappear. The words used
to describe this are that the vector boson becomes massive by “eating” the would-be
Nambu-Goldstone boson G, which becomes its longitudinal polarization degree of
freedom. This is called the Higgs mechanism. The original field φ(x) is called a
Higgs field, and the surviving real scalar degree of freedom h(x) is called by the
generic term Higgs boson. The Standard Model Higgs boson and the masses of the
W ± and Z bosons result from a slightly more complicated version of this same idea,
as we will see.
An alternative way to understand what has just happened to the would-be Nambu-
Goldstone boson field G(x) is that it has been “gauged away”. Recall that
\[ \phi = \frac{1}{\sqrt{2}} (v + h)\, e^{i G/v} \tag{9.49} \]
\[ \phi \to e^{i\theta} \phi. \tag{9.50} \]
This choice, known as “unitary gauge”, eliminates G(x) completely, just as we saw
in (9.45). In unitary gauge,
\[ \phi(x) = \frac{1}{\sqrt{2}} [v + h(x)]. \tag{9.52} \]
Notice also that the gauge transformation (9.51) gives exactly the term in (9.40), so
that Vμ is simply the unitary gauge version of Aμ . The advantage of unitary gauge
is that the true physical particle content of the theory (a massive vector and real
Higgs scalar) is more obvious than in the version of the Lagrangian written in terms
of the original fields φ and Aμ . However, it turns out to be easier to prove that the
theory is renormalizable if one works in a different gauge in which the would-be
Nambu-Goldstone bosons are retained. The physical predictions of the theory do not
depend on which gauge one chooses, but the ease with which one can compute those
results depends on picking the right gauge for the problem at hand.
Let us catalog the propagators and interactions of this theory, in unitary gauge.
The propagators of the Higgs scalar and the massive vector are:
Finally, a fermion with charge Q inherits the same interactions with Vμ that it had
with Aμ , coming from the covariant derivative:
This is a general way of making massive vector fields in gauge theories with inter-
acting scalars and fermions, without ruining renormalizability.
Let us now state, without proof, how all of the considerations above generalize to
arbitrary groups. First, suppose we have scalar fields $\phi_i$ in some representation of a
global symmetry group with generators $T^{aj}_i$. There is some potential
which we presume has a minimum where at least some of the φi are non-zero. This
of course depends on the parameters and couplings appearing in V . The fields φi
will then have VEVs that can be written:
\[ \langle 0|\phi_i|0\rangle = \frac{v_i}{\sqrt{2}}. \tag{9.54} \]
Any group generators that satisfy
\[ T^{aj}_i v_j = 0 \tag{9.55} \]
Goldstone’s Theorem states that for every spontaneously broken generator, labeled
by a, of a global symmetry group, there must be a corresponding Nambu-Goldstone
boson. (The group U (1) has just one generator, so there was just one Nambu-
Goldstone boson.)
In the case of a local or gauge symmetry, each of the would-be Nambu-Goldstone
bosons is eaten by the vector field with the corresponding index a. The vector fields
for the broken generators become massive, with squared masses that can be computed
in terms of the VEV(s) and the gauge coupling(s) of the theory. There are also Higgs
boson(s) for the uneaten components of the scalar fields that obtained VEVs.
One might also ask whether it is possible for fields other than scalars to obtain
vacuum expectation values. If one could succeed in concocting a theory in which a
fermion spinor field or a vector field has a VEV:
\[ \langle 0|\Psi_\alpha|0\rangle \neq 0 \quad (?), \tag{9.57} \]
\[ \langle 0|A_\mu|0\rangle \neq 0 \quad (?), \tag{9.58} \]
then Lorentz invariance will necessarily be broken, since the alleged VEV carries an
uncontracted spinor or vector index, and therefore transforms non-trivially under the
Lorentz group. This would imply that the broken generators would include Lorentz
boosts and rotations, in contradiction with experiment. However, there can be vacuum
expectation values for antifermion-fermion composite fields, since they can form a
Lorentz scalar:
\[ \langle 0|\overline{\Psi}\Psi|0\rangle \neq 0. \tag{9.59} \]
where $\mu$ is a quantity with dimensions of [mass] which is set by the scale $\Lambda_{\rm QCD}$
at which non-perturbative effects become important. This is known as chiral sym-
metry breaking. The chiral symmetry is a global, approximate symmetry by which
left-handed u, d, s quarks are rotated into each other and right-handed u, d, s quarks
are rotated into each other. (The objects $\bar q q$ are color singlets, so these antifermion-
fermion VEVs do not break SU (3)c symmetry.) Chiral symmetry breaking is actually
the mechanism that is responsible for most of the mass of the proton and the neutron,
and therefore most of the mass of everyday objects. When the chiral symmetry is
spontaneously broken, the Nambu-Goldstone bosons that arise include the pions π ±
and π 0 . They are not exactly massless because the chiral symmetry was really only
approximate to begin with, but the Goldstone theorem successfully explains why
they are much lighter than the proton; $m_\pi^2 \ll m_p^2$. They are often called pseudo-
Nambu-Goldstone bosons, or PNGBs, with the “pseudo” indicating that the asso-
ciated spontaneously broken global symmetry was only an approximate symmetry.
Extensions of the Standard Model that feature new approximate global symmetries
that are spontaneously broken generally predict the existence of heavy exotic PNGBs.
For example, these are a ubiquitous feature of technicolor models.
Problems
In this chapter we detail the elements of the Standard Model of elementary particle
physics. This is the culmination of many of the principles and facts that we have
developed up to this point in the book. We start in this section by studying the
implications of the Higgs mechanism of the Standard Model, which gives rise to the
masses of W ± and Z bosons, and the masses of the leptons and most of the masses
of the heavier quarks.
The electroweak interactions are mediated by three massive vector bosons W ± , Z
and the massless photon γ . The gauge group before spontaneous symmetry break-
ing must therefore have four generators. After spontaneous symmetry breaking,
the remaining unbroken gauge group is electromagnetic gauge invariance. A viable
theory must explain the qualitative experimental facts that the W ± bosons couple
only to L-fermions (and R-antifermions), that the Z boson couples differently to
L-fermions and R-fermions, but γ couples with the same strength to L-fermions and
R-fermions. Also, there are very stringent quantitative experimental tests involving
the relative strengths of fermion-antifermion-vector couplings and the ratio of the W
and Z masses. The Standard Model (SM) of electroweak interactions of Glashow,
Weinberg and Salam successfully incorporates all of these features and tests into a
spontaneously broken gauge theory. In the SM, the gauge symmetry breaking is:
\[ SU(2)_L \times U(1)_Y \to U(1)_{\rm EM}. \tag{10.1} \]
We will need to introduce a Higgs field to produce this pattern of symmetry breaking.
The SU (2) L subgroup is known as weak isospin. Left-handed SM fermions are
known to be doublets under SU (2) L :
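\[ \begin{pmatrix}\nu_e\\ e_L\end{pmatrix},\ \begin{pmatrix}\nu_\mu\\ \mu_L\end{pmatrix},\ \begin{pmatrix}\nu_\tau\\ \tau_L\end{pmatrix},\ \begin{pmatrix}u_L\\ d_L\end{pmatrix},\ \begin{pmatrix}c_L\\ s_L\end{pmatrix},\ \begin{pmatrix}t_L\\ b_L\end{pmatrix}. \tag{10.2} \]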
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_10
270 10 The Standard Electroweak Model
Notice that the electric charge of the upper member of each doublet is always 1
greater than that of the lower member. The SU (2) L representation matrix generators
acting on these fields are proportional to the Pauli matrices:
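\[ T^a = \sigma^a/2, \qquad (a = 1, 2, 3) \tag{10.3} \]
with corresponding gauge vector boson fields
\[ W^a_\mu, \qquad (a = 1, 2, 3). \tag{10.4} \]
The right-handed fermions are singlets under $SU(2)_L$:
\[ e_R,\ \mu_R,\ \tau_R,\ u_R,\ c_R,\ t_R,\ d_R,\ s_R,\ b_R. \tag{10.5} \]
With $B_\mu$ denoting the $U(1)_Y$ gauge field, the pure-gauge part of the Lagrangian is
\[ \mathcal{L}_{\rm gauge} = -\frac{1}{4} W^a_{\mu\nu} W^{a\mu\nu} - \frac{1}{4} B_{\mu\nu} B^{\mu\nu}, \tag{10.6} \]
where:
\[ W^a_{\mu\nu} = \partial_\mu W^a_\nu - \partial_\nu W^a_\mu - g\,\epsilon^{abc} W^b_\mu W^c_\nu, \tag{10.7} \]
\[ B_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu \tag{10.8} \]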
are the $SU(2)_L$ and $U(1)_Y$ field strengths. The totally antisymmetric $\epsilon^{abc}$ (with
$\epsilon^{123} = +1$) are the structure constants for $SU(2)_L$. This $\mathcal{L}_{\rm gauge}$ provides for kinetic
terms of the vector fields, and $W^a_\mu$ self-interactions.
The interactions of the electroweak gauge bosons with fermions are determined by
the covariant derivative. For example, the covariant derivatives acting on the lepton
fields are:
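\[ D_\mu \begin{pmatrix}\nu_e\\ e_L\end{pmatrix} = \left[\partial_\mu + i g' B_\mu Y_{\ell_L} + i g W^a_\mu T^a\right] \begin{pmatrix}\nu_e\\ e_L\end{pmatrix}, \tag{10.9} \]
\[ D_\mu e_R = \left[\partial_\mu + i g' B_\mu Y_{\ell_R}\right] e_R, \tag{10.10} \]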
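where $Y_{\ell_L}$ and $Y_{\ell_R}$ are the weak hypercharges of left-handed leptons and right-
handed leptons, and 2 × 2 unit matrices are understood to go with the $\partial_\mu$ and $B_\mu$
terms in (10.9). A multiplicative factor can always be absorbed into the definition of
the coupling $g'$, so the overall normalization of the weak hypercharge can be fixed by choosing $Y_{\ell_R} = -1$, as reflected in (10.14).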
10.1 SU (2) L × U (1)Y Representations and Lagrangian 271
Therefore, the covariant derivatives of the lepton fields can be summarized as:
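\[ D_\mu \nu_e = \partial_\mu \nu_e + i\left(g' Y_{\ell_L} B_\mu + \frac{g}{2} W^3_\mu\right)\nu_e + i\frac{g}{2}(W^1_\mu - i W^2_\mu)\, e_L, \tag{10.12} \]
\[ D_\mu e_L = \partial_\mu e_L + i\left(g' Y_{\ell_L} B_\mu - \frac{g}{2} W^3_\mu\right) e_L + i\frac{g}{2}(W^1_\mu + i W^2_\mu)\, \nu_e, \tag{10.13} \]
\[ D_\mu e_R = \partial_\mu e_R - i g' B_\mu e_R. \tag{10.14} \]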
The covariant derivative of a field must carry the same electric charge as the field
itself, in order for charge to be conserved. Evidently, then, $W^1_\mu - i W^2_\mu$ must carry
electric charge +1 and $W^1_\mu + i W^2_\mu$ must carry electric charge −1, so these must be
identified with the $W^\pm$ bosons of the weak interactions. Consider the interaction
Lagrangian following from
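\[ \mathcal{L} = i \begin{pmatrix}\overline{\nu}_e & \overline{e}_L\end{pmatrix} \gamma^\mu D_\mu \begin{pmatrix}\nu_e\\ e_L\end{pmatrix} \tag{10.15} \]
\[ = -\frac{g}{2}\, \overline{\nu}_e \gamma^\mu e_L\, (W^1_\mu - i W^2_\mu) - \frac{g}{2}\, \overline{e}_L \gamma^\mu \nu_e\, (W^1_\mu + i W^2_\mu) + \cdots. \tag{10.16} \]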
Comparing with (7.99), (7.100), and (7.131), we find that to reproduce the weak-
interaction Lagrangian of muon decay, we must have:
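\[ W^+_\mu \equiv \frac{1}{\sqrt{2}}(W^1_\mu - i W^2_\mu), \tag{10.17} \]
\[ W^-_\mu \equiv \frac{1}{\sqrt{2}}(W^1_\mu + i W^2_\mu). \tag{10.18} \]
The $1/\sqrt{2}$ normalization agrees with our previous convention; the real reason for it
is so that the kinetic terms for $W^\pm$ have a standard normalization:
$\mathcal{L} = -\frac{1}{2}(\partial_\mu W^+_\nu - \partial_\nu W^+_\mu)(\partial^\mu W^{-\nu} - \partial^\nu W^{-\mu})$.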
The vector bosons $B_\mu$ and $W^3_\mu$ are both electrically neutral. As a result of spontaneous
symmetry breaking, we will find that they mix. In other words, the fields
with well-defined masses (“mass eigenstates” or “mass eigenfields”) are not $B_\mu$ and
$W^3_\mu$, but are orthogonal linear combinations of these two gauge eigenstate fields.
1 Some references define the weak hypercharge normalization so that Y is a factor of 2 larger than
here, for each particle.
One of the mass eigenstates is the photon field Aμ , and the other is the massive Z
boson vector field, Z μ . One can write the relation between the gauge eigenstate and
mass eigenstate fields as a rotation in field space by an angle θW , known as the weak
mixing angle:
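\[ \begin{pmatrix} W^3_\mu \\ B_\mu \end{pmatrix} = \begin{pmatrix} \cos\theta_W & \sin\theta_W \\ -\sin\theta_W & \cos\theta_W \end{pmatrix} \begin{pmatrix} Z_\mu \\ A_\mu \end{pmatrix}. \tag{10.19} \]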
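We now require that the resulting theory has the correct photon coupling to
fermions, by requiring that the field $A_\mu$ appears in the covariant derivatives in the
way dictated by QED. Using (10.19), the covariant derivative of the right-handed electron field
(10.14) can be written:
\[ D_\mu e_R = \partial_\mu e_R - i g' \cos\theta_W A_\mu e_R + i g' \sin\theta_W Z_\mu e_R. \]
Since the electron has charge $-e$, agreement with QED requires
\[ g' \cos\theta_W = e. \tag{10.22} \]
Similarly, the neutrino is electrically neutral, so the coefficient of $A_\mu \nu_e$ in (10.12) must vanish:
\[ \frac{g}{2} \sin\theta_W + g' Y_{\ell_L} \cos\theta_W = 0. \tag{10.26} \]
Combining these with the requirement that $e_L$ couples to $A_\mu$ in the same way as $e_R$, one finds
\[ Y_{\ell_L} = -1/2, \tag{10.27} \]
\[ e = \frac{g g'}{\sqrt{g^2 + g'^2}}, \tag{10.28} \]
\[ \tan\theta_W = g'/g, \tag{10.29} \]
so that
\[ \sin\theta_W = \frac{g'}{\sqrt{g^2 + g'^2}}, \qquad \cos\theta_W = \frac{g}{\sqrt{g^2 + g'^2}}. \tag{10.30} \]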
These are requirements that will have to be satisfied by the spontaneous symmetry
breaking mechanism. The numerical values from experiment are approximately:
\[ g = 0.652, \tag{10.31} \]
\[ g' = 0.357, \tag{10.32} \]
\[ e = 0.313, \tag{10.33} \]
\[ \sin^2\theta_W = 0.231. \tag{10.34} \]
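As a quick numerical cross-check of (10.28) and (10.30), the following minimal sketch recomputes $e$ and $\sin^2\theta_W$ from the quoted values of $g$ and $g'$. The function names are illustrative choices made here, not notation from the text.

```python
import math

# Approximate gauge couplings quoted in (10.31)-(10.32).
g = 0.652        # SU(2)_L coupling
g_prime = 0.357  # U(1)_Y coupling


def electric_coupling(g, g_prime):
    """e = g g' / sqrt(g^2 + g'^2), as in (10.28)."""
    return g * g_prime / math.hypot(g, g_prime)


def sin2_theta_w(g, g_prime):
    """sin^2(theta_W) = g'^2 / (g^2 + g'^2), from (10.30)."""
    return g_prime**2 / (g**2 + g_prime**2)


e = electric_coupling(g, g_prime)
s2w = sin2_theta_w(g, g_prime)
print(f"e = {e:.3f}, sin^2(theta_W) = {s2w:.3f}")
```

To three decimal places these reproduce the values in (10.33) and (10.34).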
In general, the electric charge of any field $f$ is given in terms of the eigenvalue of
the 3 component of weak isospin matrix, $T^3$, and the weak hypercharge $Y$, as:
\[ Q_f = T^3_f + Y_f. \tag{10.36} \]
Here $T^3_f$ is +1/2 for the upper component of a doublet, −1/2 for the lower component
of a doublet, and 0 for an SU (2) L singlet. The couplings of the SM fermions to the
Z boson then follow as a prediction. One finds for each SM fermion f :
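\[ \mathcal{L}_{Z f \overline{f}} = -Z^\mu\, \overline{f} \gamma_\mu (g_L^f P_L + g_R^f P_R) f, \tag{10.37} \]
where
\[ g_L^f = g \cos\theta_W\, T^3_{f_L} - g' \sin\theta_W\, Y_{f_L} = \frac{g}{\cos\theta_W}\left(T^3_{f_L} - \sin^2\theta_W\, Q_f\right), \tag{10.38} \]
\[ g_R^f = -g' \sin\theta_W\, Y_{f_R} = -\frac{g}{\cos\theta_W} \sin^2\theta_W\, Q_f, \tag{10.39} \]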
with coefficients:

fermion           T^3_{f_L}   Y_{f_L}   Y_{f_R}   Q_f
ν_e, ν_μ, ν_τ      +1/2       −1/2       0         0
e, μ, τ            −1/2       −1/2      −1        −1
u, c, t            +1/2       +1/6      +2/3      +2/3
d, s, b            −1/2       +1/6      −1/3      −1/3

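Equation (10.37) can also be rewritten in terms of vector and axial-vector couplings
to the Z boson:
\[ \mathcal{L}_{Z f \overline{f}} = -Z^\mu\, \overline{f} \gamma_\mu (g_V^f - g_A^f \gamma_5) f, \tag{10.40} \]
with
\[ g_V^f = \frac{1}{2}\left(g_L^f + g_R^f\right) = \frac{g}{2\cos\theta_W}\left(T^3_{f_L} - 2\sin^2\theta_W\, Q_f\right), \tag{10.41} \]
\[ g_A^f = \frac{1}{2}\left(g_L^f - g_R^f\right) = \frac{g}{2\cos\theta_W}\, T^3_{f_L}. \tag{10.42} \]
The coupling parameters appearing in (8.245) are $V_f = g_V^f/e$ and $A_f = g_A^f/e$.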
The partial decay widths and branching ratios of the Z boson can be worked out
from these couplings, and agree with the results from experiment:
The “invisible” branching ratio matches up extremely well with the theoretical prediction
for the sum over the three $\nu\overline{\nu}$ final states, while “hadrons” is due to quark-
antiquark final states. It is an important fact that the Z branching ratio into charged
leptons is small. This is unfortunate, since backgrounds for leptons are smaller than
for hadrons or missing energy, and Z bosons can appear in many searches for new
phenomena.
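The pattern of Z branching ratios can be estimated directly from the couplings (10.41) and (10.42). The sketch below assumes the tree-level result for massless fermions, in which each partial width is proportional to $N_c\,[(g_V^f)^2 + (g_A^f)^2]$; that proportionality, the neglect of fermion masses and of QCD and electroweak corrections, and all variable names are assumptions made for illustration rather than statements from the text.

```python
import math

g, sin2w = 0.652, 0.231          # values quoted in (10.31) and (10.34)
cos_w = math.sqrt(1.0 - sin2w)

# name: (T3 of the left-handed field, electric charge Q, color factor,
#        number of species light enough to appear in Z decays).
# The top quark is too heavy, so only u and c count as up-type quarks.
FERMIONS = {
    "neutrinos":        (+0.5,  0.0,       1, 3),
    "charged leptons":  (-0.5, -1.0,       1, 3),
    "up-type quarks":   (+0.5, +2.0 / 3.0, 3, 2),
    "down-type quarks": (-0.5, -1.0 / 3.0, 3, 3),
}


def gv_ga(t3, q):
    """Vector and axial-vector Z couplings of (10.41) and (10.42)."""
    prefactor = g / (2.0 * cos_w)
    return prefactor * (t3 - 2.0 * sin2w * q), prefactor * t3


def relative_width(t3, q, n_color):
    """Partial width up to a common factor (assumed massless, tree level)."""
    gv, ga = gv_ga(t3, q)
    return n_color * (gv**2 + ga**2)


widths = {name: n * relative_width(t3, q, nc)
          for name, (t3, q, nc, n) in FERMIONS.items()}
total = sum(widths.values())
br = {name: w / total for name, w in widths.items()}

br_invisible = br["neutrinos"]
br_hadrons = br["up-type quarks"] + br["down-type quarks"]
br_one_lepton = br["charged leptons"] / 3.0

print(f"invisible: {br_invisible:.3f}, hadrons: {br_hadrons:.3f}, "
      f"each charged lepton: {br_one_lepton:.4f}")
```

Under these assumptions the invisible fraction comes out near 20%, hadrons near 70%, and each charged lepton species at only a few percent, which illustrates why the leptonic branching ratio is described as small.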
10.2 The Standard Model Higgs Mechanism 275
Let us now turn to the question of how to spontaneously break the electroweak gauge
symmetry in a way that satisfies the above conditions. There is actually more than
one way to do this, but the Standard Model chooses the simplest possibility, which
is to introduce a complex $SU(2)_L$-doublet scalar Higgs field with weak hypercharge
$Y_\Phi = +1/2$:
\[ \Phi = \begin{pmatrix}\phi^+\\ \phi^0\end{pmatrix} \longleftrightarrow \left(\mathbf{1}, \mathbf{2}, \tfrac{1}{2}\right). \tag{10.46} \]
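Each of the fields $\phi^+$ and $\phi^0$ is a complex scalar field; we know that they carry
electric charges +1 and 0 respectively from (10.36). Under gauge transformations,
$\Phi$ transforms as:
\[ SU(2)_L:\quad \Phi(x) \to \Phi'(x) = e^{i\theta^a(x)\sigma^a/2}\, \Phi(x), \tag{10.47} \]
\[ U(1)_Y:\quad \Phi(x) \to \Phi'(x) = e^{i\theta(x)/2}\, \Phi(x). \tag{10.48} \]
The Hermitian conjugate field transforms as:
\[ SU(2)_L:\quad \Phi^\dagger \to \Phi'^\dagger = \Phi^\dagger e^{-i\theta^a\sigma^a/2}, \tag{10.49} \]
\[ U(1)_Y:\quad \Phi^\dagger \to \Phi'^\dagger = \Phi^\dagger e^{-i\theta/2}. \tag{10.50} \]
It follows that the combinations
\[ \Phi^\dagger\Phi \quad\text{and}\quad D^\mu\Phi^\dagger D_\mu\Phi \tag{10.51} \]
are gauge singlets. A gauge-invariant potential is therefore
\[ V(\Phi, \Phi^\dagger) = m^2\, \Phi^\dagger\Phi + \lambda(\Phi^\dagger\Phi)^2, \tag{10.52} \]
and the Lagrangian density for the Higgs field is
\[ \mathcal{L} = D^\mu\Phi^\dagger D_\mu\Phi - V(\Phi, \Phi^\dagger). \tag{10.53} \]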
Now, provided that $m^2 < 0$, then $\Phi = \begin{pmatrix}0\\0\end{pmatrix}$ is a local maximum, rather than a
minimum, of the potential. This will ensure the spontaneous symmetry breaking that
we demand. There are degenerate minima of the potential with
\[ \Phi^\dagger\Phi = v^2/2, \qquad v = \sqrt{-m^2/\lambda}. \tag{10.54} \]
or, in components,
\[ \phi^+ \to e^{i\theta}\phi^+, \tag{10.57} \]
\[ \phi^0 \to \phi^0. \tag{10.58} \]
Comparing with the QED gauge transformation rule of (8.5), we see that indeed $\phi^+$
and $\phi^0$ have charges +1 and 0, respectively.
The Higgs field $\Phi$ has two complex, so four real, scalar field degrees of freedom.
Therefore, following the example of Sect. 9.2, we can write it as:
\[ \Phi(x) = e^{i G^a(x)\sigma^a/2v} \begin{pmatrix}0\\ \frac{v + h(x)}{\sqrt{2}}\end{pmatrix}, \tag{10.59} \]
where $G^a(x)$ ($a = 1, 2, 3$) and $h(x)$ are each real scalar fields. The $G^a$ are would-be
Nambu-Goldstone bosons, corresponding to the three broken generators in
$SU(2)_L \times U(1)_Y \to U(1)_{\rm EM}$. The would-be Nambu-Goldstone fields can be
removed by going to unitary gauge, which means performing an $SU(2)_L$ gauge
transformation of the form of (10.47), with $\theta^a = -G^a/v$. This completely eliminates
the $G^a$ from the Lagrangian, so that in the unitary gauge we have simply
\[ \Phi(x) = \begin{pmatrix}0\\ \frac{v + h(x)}{\sqrt{2}}\end{pmatrix}. \tag{10.60} \]
The field $h$ creates and destroys the physical Higgs particle, an electrically neutral
real scalar boson that was discovered experimentally in 2012. We can now plug
this into the Lagrangian density of (10.53), to find interactions and mass terms for
the remaining Higgs field $h$ and the vector bosons. The covariant derivative of $\Phi$ in
unitary gauge is:
\[ D_\mu\Phi = \frac{1}{\sqrt{2}} \begin{pmatrix}0\\ \partial_\mu h\end{pmatrix} + \frac{i}{\sqrt{2}} \left(\frac{g'}{2} B_\mu + \frac{g}{2} W^a_\mu \sigma^a\right) \begin{pmatrix}0\\ v + h\end{pmatrix}, \tag{10.61} \]
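Therefore,
\[ D^\mu\Phi^\dagger D_\mu\Phi = \frac{1}{2}\partial_\mu h\, \partial^\mu h + \frac{(v+h)^2}{8} \begin{pmatrix}0 & 1\end{pmatrix} \begin{pmatrix} g' B_\mu + g W^3_\mu & \sqrt{2}\, g W^+_\mu \\ \sqrt{2}\, g W^-_\mu & g' B_\mu - g W^3_\mu \end{pmatrix} \begin{pmatrix} g' B^\mu + g W^{3\mu} & \sqrt{2}\, g W^{+\mu} \\ \sqrt{2}\, g W^{-\mu} & g' B^\mu - g W^{3\mu} \end{pmatrix} \begin{pmatrix}0\\ 1\end{pmatrix}, \tag{10.63} \]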
where the first equality uses (10.30) and the second uses (10.20). So finally we have:
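\[ \mathcal{L}_{\Phi\,{\rm kinetic}} = D^\mu\Phi^\dagger D_\mu\Phi = \frac{1}{2}\partial_\mu h\, \partial^\mu h + \frac{(v+h)^2}{4}\left[g^2 W^+_\mu W^{-\mu} + \frac{1}{2}(g^2 + g'^2) Z_\mu Z^\mu\right]. \tag{10.66} \]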
The parts of this proportional to $v^2$ make up (mass)$^2$ terms for the $W^\pm$ and $Z$ vector
bosons, vindicating the earlier assumption of neutral vector boson mixing with the
form that we took for the sine and cosine of the weak mixing angle. Since there is
no such (mass)$^2$ term for the photon field $A_\mu$, we have successfully shown that the
photon remains massless, in agreement with the fact that $U(1)_{\rm EM}$ gauge invariance
remains unbroken. The specific prediction is:
\[ m_W^2 = \frac{g^2 v^2}{4}, \qquad m_Z^2 = \frac{(g^2 + g'^2) v^2}{4}, \tag{10.67} \]
which agrees with the experimental values provided that the VEV is approximately:
\[ v = \sqrt{2}\, \langle\phi^0\rangle = 246\ {\rm GeV} \tag{10.68} \]
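in the conventions used here.$^2$ Note that, comparing (7.133) and (10.67), the Fermi
constant is simply related to the VEV, by:
\[ G_F = \frac{1}{\sqrt{2}\, v^2}. \tag{10.69} \]
Dividing the two masses in (10.67) and using (10.30) gives a further prediction:
\[ m_W/m_Z = \cos\theta_W. \tag{10.70} \]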
All of the above predictions are subject to small, but measurable, loop corrections.
For example, the present experimental values $m_W = 80.379 \pm 0.012$ GeV and $m_Z =
91.1876 \pm 0.0021$ GeV give:
\[ \sin^2\theta_W^{\rm on\text{-}shell} \equiv 1 - m_W^2/m_Z^2 = 0.22301 \pm 0.00025. \tag{10.71} \]
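The tree-level relations (10.67), (10.69) and (10.71) are easy to evaluate numerically. The following sketch uses the approximate couplings of (10.31) and (10.32) together with $v = 246$ GeV; the variable names are chosen here for illustration, and no loop corrections are included, so agreement with the measured masses is only expected at the percent level.

```python
import math

g, g_prime = 0.652, 0.357   # approximate couplings from (10.31)-(10.32)
v = 246.0                   # Higgs VEV in GeV, from (10.68)

# Tree-level vector boson masses from (10.67), in GeV.
m_w = 0.5 * g * v
m_z = 0.5 * math.hypot(g, g_prime) * v

# Fermi constant from (10.69), in GeV^-2.
g_fermi = 1.0 / (math.sqrt(2.0) * v**2)

# On-shell weak mixing angle from the measured masses quoted in the text.
m_w_exp, m_z_exp = 80.379, 91.1876
sin2_on_shell = 1.0 - (m_w_exp / m_z_exp) ** 2

print(f"m_W = {m_w:.1f} GeV, m_Z = {m_z:.1f} GeV")
print(f"G_F = {g_fermi:.4e} GeV^-2")
print(f"on-shell sin^2(theta_W) = {sin2_on_shell:.5f}")
```

The ratio `m_w / m_z` computed this way equals $\cos\theta_W$ exactly, since both masses are built from the same tree-level formulas.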
with the arrow direction on $W^\pm$ lines indicating the direction of the flow of positive
charge. The field-strength Lagrangian terms of (10.6) provide the momentum part
of the $W$, $Z$ propagators above, and also yield 3-gauge-boson and 4-gauge-boson
interactions:
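where:
\[ X^{\mu\nu,\rho\sigma} = 2 g^{\mu\nu} g^{\rho\sigma} - g^{\mu\rho} g^{\nu\sigma} - g^{\mu\sigma} g^{\nu\rho}. \tag{10.72} \]
In unitary gauge, the scalar potential (10.52) becomes, up to an additive constant:
\[ V(h) = \lambda v^2 h^2 + \lambda v h^3 + \frac{\lambda}{4} h^4, \tag{10.73} \]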
just as in the toy model studied in (9.2). Therefore, the Higgs boson has self-
interactions with Feynman rules:
and a mass
\[ m_h = \sqrt{2\lambda}\, v. \tag{10.74} \]
It would be great if we could evaluate this numerically using present data. Unfortunately,
while we know what the Higgs VEV $v$ is, there is no present experiment that
gives any direct measurement of $\lambda$. Indirectly we know what it needs to be from the
Higgs boson mass, if the SM is the correct theory. Furthermore, there are indirect
effects of the Higgs mass in loops of precision electroweak observables, such as the
$Z$ mass, $W$ mass, $\sin^2\theta_W$, etc. The experiments that measure these observables suggested
well before the Higgs boson discovery that $m_h$ should be less than 200 GeV.
The self-consistency of these indirect constraints vs. physical mass was verified by
the discovery of the Higgs boson at $m_h = 125$ GeV.
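The indirect determination of $\lambda$ mentioned above amounts to inverting (10.74). The sketch below does this for $m_h = 125$ GeV and $v = 246$ GeV, and also checks numerically that the potential (10.52), evaluated on the unitary-gauge field with $m^2 = -\lambda v^2$ from (10.54), reproduces (10.73) up to an additive constant. Function and variable names are illustrative assumptions.

```python
v = 246.0     # Higgs VEV in GeV
m_h = 125.0   # measured Higgs boson mass in GeV

# Invert m_h = sqrt(2 lambda) v, eq. (10.74), assuming the SM relation holds.
lam = m_h**2 / (2.0 * v**2)

# From (10.54): v^2 = -m^2 / lambda.
m2 = -lam * v**2


def potential_full(phi_dagger_phi):
    """V = m^2 (Phi^dagger Phi) + lambda (Phi^dagger Phi)^2, eq. (10.52)."""
    return m2 * phi_dagger_phi + lam * phi_dagger_phi**2


def potential_h(h):
    """V(h) = lambda v^2 h^2 + lambda v h^3 + (lambda/4) h^4, eq. (10.73)."""
    return lam * v**2 * h**2 + lam * v * h**3 + 0.25 * lam * h**4


# In unitary gauge Phi^dagger Phi = (v + h)^2 / 2; compare the two forms.
v0 = potential_full(0.5 * v**2)
max_mismatch = max(
    abs(potential_full(0.5 * (v + h) ** 2) - v0 - potential_h(h))
    for h in (-50.0, -10.0, 0.0, 10.0, 50.0)
)

print(f"lambda = {lam:.4f}, largest mismatch = {max_mismatch:.2e}")
```

The resulting value $\lambda \approx 0.13$ is only as reliable as the assumption that the Standard Model potential is the correct one.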
The gauge group representations for fermions in the Standard Model are chiral.
This means that the left-handed fermions transform in a different representation than
the right-handed fermions. Chiral fermions have the property that they cannot have
masses without breaking the symmetry that makes them chiral.
For example, suppose we try to write down a mass term for the electron:
The Dirac spinor for the electron can be separated into left- and right-handed pieces,
\[ e = P_L e_L + P_R e_R, \tag{10.76} \]
where, to avoid any confusion between $\overline{(e_L)}$ and $(\overline{e})P_L$, we explicitly define
The point is that this is clearly not a gauge singlet. In the first place, the $e_L$ part of
each term transforms as a doublet under $SU(2)_L$, and the $e_R$ is a singlet, so each term
is an $SU(2)_L$ doublet. Furthermore, the first term has $Y = Y_{e_R} - Y_{e_L} = -1/2$, while
the second term has $Y = 1/2$. All terms in the Lagrangian must be gauge singlets in
order not to violate the gauge symmetry, so the electron mass is disqualified from
appearing in this form. More generally, for any Standard Model fermion $f$, the naive
mass term
\[ \mathcal{L}_{f\,{\rm mass}} = -m_f\left(\overline{f}_L f_R + \overline{f}_R f_L\right) \tag{10.80} \]
is not an $SU(2)_L$ singlet, and is not neutral under $U(1)_Y$, and so is not allowed.
Fortunately, fermion masses can still arise with the help of the Higgs field. For
the electron, there is a gauge-invariant term:
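\[ \mathcal{L}_{\rm electron\ Yukawa} = -y_e \begin{pmatrix}\overline{\nu}_e & \overline{e}_L\end{pmatrix} \begin{pmatrix}\phi^+\\ \phi^0\end{pmatrix} e_R + {\rm c.c.} \tag{10.81} \]
Here $y_e$ is a Yukawa coupling of the type we studied in (6.3). The field $\begin{pmatrix}\overline{\nu}_e & \overline{e}_L\end{pmatrix}$
carries weak hypercharge +1/2, as does the Higgs field, and $e_R$ carries weak hypercharge
−1, so the whole term is a $U(1)_Y$ singlet, as required. Moreover, the doublets
transform under $SU(2)_L$ as:
\[ \begin{pmatrix}\phi^+\\ \phi^0\end{pmatrix} \to e^{i\theta^a\sigma^a/2} \begin{pmatrix}\phi^+\\ \phi^0\end{pmatrix}, \tag{10.82} \]
\[ \begin{pmatrix}\overline{\nu}_e & \overline{e}_L\end{pmatrix} \to \begin{pmatrix}\overline{\nu}_e & \overline{e}_L\end{pmatrix} e^{-i\theta^a\sigma^a/2}, \tag{10.83} \]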
Since we know the electron mass and the Higgs VEV already, we can compute the
electron Yukawa coupling:
\[ y_e = \sqrt{2}\, \frac{0.511\ {\rm MeV}}{246\ {\rm GeV}} = 2.94 \times 10^{-6}. \tag{10.87} \]
Unfortunately, this is so small that we can forget about ever observing the interactions
of the Higgs particle h with an electron. Notice that although the neutrino participates
in the Yukawa interaction, it disappears in unitary gauge from that term.
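The arithmetic in (10.87) generalizes to any fermion that obtains its mass from the Higgs VEV through $m_f = y_f v/\sqrt{2}$. The minimal sketch below evaluates it for the electron and, using the value $m_t = 173$ GeV that appears later in this chapter, for the top quark; the helper name `yukawa` is a hypothetical choice made here.

```python
import math

V_GEV = 246.0  # Higgs VEV in GeV


def yukawa(mass_gev, v=V_GEV):
    """Yukawa coupling y_f = sqrt(2) m_f / v, the tree-level relation behind (10.87)."""
    return math.sqrt(2.0) * mass_gev / v


y_e = yukawa(0.511e-3)   # electron mass, 0.511 MeV expressed in GeV
y_t = yukawa(173.0)      # top quark mass in GeV

print(f"y_e = {y_e:.3e}, y_t = {y_t:.3f}")
```

The contrast between the two results, roughly six orders of magnitude, is a compact way of stating how differently the Higgs boson couples to the lightest and heaviest charged fermions.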
Masses for all of the other leptons, and the down-type quarks (d, s, b) in the
Standard Model arise in exactly the same way. For example, the bottom quark mass
comes from the gauge-invariant Yukawa coupling:
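\[ \mathcal{L} = -y_b \begin{pmatrix}\overline{t}_L & \overline{b}_L\end{pmatrix} \begin{pmatrix}\phi^+\\ \phi^0\end{pmatrix} b_R + {\rm c.c.}, \tag{10.88} \]
implying that, in unitary gauge, we have a b-quark mass and an $h\overline{b}b$ vertex:
\[ \mathcal{L} = -\frac{y_b}{\sqrt{2}} (v + h)\, \overline{b} b. \tag{10.89} \]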
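The situation is slightly different for up-type quarks ($u$, $c$, $t$), because the complex
conjugate of the field $\Phi$ must appear in order to preserve $U(1)_Y$ invariance. It is
convenient to define
\[ \tilde{\Phi} \equiv \begin{pmatrix}0 & 1\\ -1 & 0\end{pmatrix} \Phi^* = \begin{pmatrix}\phi^{0*}\\ -\phi^{+*}\end{pmatrix}, \tag{10.90} \]
which transforms under $SU(2)_L$ in the same way as $\Phi$:
\[ \Phi \to e^{i\theta^a\sigma^a/2}\, \Phi, \tag{10.91} \]
\[ \tilde{\Phi} \to e^{i\theta^a\sigma^a/2}\, \tilde{\Phi}. \tag{10.92} \]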
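(Note that $\tilde{\Phi}$ carries weak hypercharge $-1/2$, opposite to that of $\Phi$.) Therefore, one can write a gauge-invariant Yukawa coupling for the top
quark as:
\[ \mathcal{L} = -y_t \begin{pmatrix}\overline{t}_L & \overline{b}_L\end{pmatrix} \begin{pmatrix}\phi^{0*}\\ -\phi^{+*}\end{pmatrix} t_R + {\rm c.c.} \tag{10.93} \]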
10.3 Fermion Masses and Cabibbo-Kobayashi-Maskawa Mixing 283
Going to unitary gauge, one finds that the top quark has a mass:
\[ \mathcal{L} = -\frac{y_t}{\sqrt{2}} (v + h)\, \overline{t} t. \tag{10.94} \]
The mass and the h-fermion-antifermion coupling obtained by each Standard Model
fermion in this way are both proportional to y f . The Higgs mechanism not only
explains the masses of the W ± and Z bosons, but also explains the masses of fermions.
Notice that all of the particles in the Standard Model (except the photon and gluon,
which must remain massless because of SU (3)c × U (1)EM gauge invariance) get a
mass from spontaneous electroweak symmetry breaking of the form:
The Standard Model fermions consist of three families with identical gauge inter-
actions. Therefore, the most general form of the Yukawa interactions is actually:
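\[ \mathcal{L}_{e,\mu,\tau\ {\rm Yukawas}} = -\begin{pmatrix}\overline{\nu}'^{\,i} & \overline{\ell}'^{\,i}_L\end{pmatrix} \begin{pmatrix}\phi^+\\ \phi^0\end{pmatrix} y_{e\,i}{}^{j}\, \ell'_{Rj} + {\rm c.c.}, \tag{10.98} \]
\[ \mathcal{L}_{d,s,b\ {\rm Yukawas}} = -\begin{pmatrix}\overline{u}'^{\,i}_L & \overline{d}'^{\,i}_L\end{pmatrix} \begin{pmatrix}\phi^+\\ \phi^0\end{pmatrix} y_{d\,i}{}^{j}\, d'_{Rj} + {\rm c.c.}, \tag{10.99} \]
\[ \mathcal{L}_{u,c,t\ {\rm Yukawas}} = -\begin{pmatrix}\overline{u}'^{\,i}_L & \overline{d}'^{\,i}_L\end{pmatrix} \begin{pmatrix}\phi^{0*}\\ -\phi^{+*}\end{pmatrix} y_{u\,i}{}^{j}\, u'_{Rj} + {\rm c.c.} \tag{10.100} \]
Here $i, j$ are indices that run over the three families, so that:
\[ \ell'_{Rj} = \begin{pmatrix}e'_R\\ \mu'_R\\ \tau'_R\end{pmatrix}, \qquad \ell'_{Lj} = \begin{pmatrix}e'_L\\ \mu'_L\\ \tau'_L\end{pmatrix}, \qquad \nu'_i = \begin{pmatrix}\nu'_e\\ \nu'_\mu\\ \nu'_\tau\end{pmatrix}, \tag{10.101} \]
\[ d'_{Lj} = \begin{pmatrix}d'_L\\ s'_L\\ b'_L\end{pmatrix}, \quad d'_{Rj} = \begin{pmatrix}d'_R\\ s'_R\\ b'_R\end{pmatrix}, \quad u'_{Lj} = \begin{pmatrix}u'_L\\ c'_L\\ t'_L\end{pmatrix}, \quad u'_{Rj} = \begin{pmatrix}u'_R\\ c'_R\\ t'_R\end{pmatrix}. \tag{10.102} \]
The Yukawa couplings
\[ y_{e\,i}{}^{j}, \qquad y_{d\,i}{}^{j}, \qquad y_{u\,i}{}^{j} \tag{10.103} \]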
are complex 3 × 3 matrices in family space. In unitary gauge, the Yukawa interaction
Lagrangian can be written as:
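\[ \mathcal{L} = -\left(1 + \frac{h}{v}\right)\left[\overline{\ell}'^{\,i}_L\, m_{e\,i}{}^{j}\, \ell'_{Rj} + \overline{d}'^{\,i}_L\, m_{d\,i}{}^{j}\, d'_{Rj} + \overline{u}'^{\,i}_L\, m_{u\,i}{}^{j}\, u'_{Rj}\right] + {\rm c.c.}, \tag{10.104} \]
where
\[ m_{f\,i}{}^{j} = \frac{v}{\sqrt{2}}\, y_{f\,i}{}^{j}. \tag{10.105} \]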
(The fermion fields are now labeled with a prime, to distinguish them from the basis
we are about to introduce.) It therefore appears that the masses of Standard Model
fermions are actually 3 × 3 complex matrices.
It is most convenient to work in a basis in which the fermion masses are real and
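positive, so that the Feynman propagators are simple. This can always be accomplished,
thanks to the following:
Theorem: any complex square matrix $M$ can be diagonalized by a biunitary transformation,
\[ U_L^\dagger M U_R = M_D, \tag{10.106} \]
where $M_D$ is diagonal with positive real entries, and $U_L$ and $U_R$ are unitary matrices.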
To apply this in the present case, consider the following redefinition of the lepton
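fields:
\[ \ell'_{Li} = L_{L\,i}{}^{j}\, \ell_{Lj}, \qquad \ell'_{Ri} = L_{R\,i}{}^{j}\, \ell_{Rj}, \tag{10.107} \]
where $L_{L\,i}{}^{j}$ and $L_{R\,i}{}^{j}$ are unitary 3 × 3 matrices. The lepton mass term in the unitary
gauge Lagrangian then becomes:
\[ \mathcal{L} = -\left(1 + \frac{h}{v}\right) \overline{\ell}^{\,i}_L\, (L_L^\dagger m_e L_R)_i{}^{j}\, \ell_{Rj}. \tag{10.108} \]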
Now, the theorem just stated assures us that we can choose the matrices L L and L R
so that:
\[ L_L^\dagger m_e L_R = \begin{pmatrix} m_e & 0 & 0 \\ 0 & m_\mu & 0 \\ 0 & 0 & m_\tau \end{pmatrix}. \tag{10.109} \]
In the same way, one can do unitary-matrix redefinitions of the quark fields:
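\[ d'_{Li} = D_{L\,i}{}^{j}\, d_{Lj}, \qquad d'_{Ri} = D_{R\,i}{}^{j}\, d_{Rj}, \tag{10.111} \]
\[ u'_{Li} = U_{L\,i}{}^{j}\, u_{Lj}, \qquad u'_{Ri} = U_{R\,i}{}^{j}\, u_{Rj}. \tag{10.112} \]
These redefinitions leave the form of the derivative kinetic terms unchanged. For example, for the left-handed charged leptons:
\[ i\, \overline{\ell}'^{\,j}_L\, \slashed{\partial}\, \ell'_{Lj} = i\, (\overline{\ell}_L L_L^\dagger)^j\, \slashed{\partial}\, (L_L \ell_L)_j = i\, \overline{\ell}^{\,j}_L\, \slashed{\partial}\, \ell_{Lj}. \tag{10.115} \]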
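This relies on the fact that the (constant) field redefinition matrix $L_L$ is unitary,
$L_L^\dagger L_L = 1$. The same thing works for all of the other derivative kinetic terms, for
example, for right-handed up-type quarks:
\[ i\, \overline{u}'^{\,j}_R\, \slashed{\partial}\, u'_{Rj} = i\, (\overline{u}_R U_R^\dagger)^j\, \slashed{\partial}\, (U_R u_R)_j = i\, \overline{u}^{\,j}_R\, \slashed{\partial}\, u_{Rj}, \tag{10.116} \]
which relies on $U_R^\dagger U_R = 1$. So the redefinition has no effect at all here; the form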
of the derivative kinetic terms is exactly the same for unprimed fields as for primed
fields.
There are also interactions of fermions with gauge bosons. For example, for the
right-handed leptons, the QED Lagrangian contains a term
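\[ -e A_\mu\, \overline{\ell}'^{\,j}_R \gamma^\mu \ell'_{Rj} = -e A_\mu\, (\overline{\ell}_R L_R^\dagger)^j \gamma^\mu (L_R \ell_R)_j = -e A_\mu\, \overline{\ell}^{\,j}_R \gamma^\mu \ell_{Rj}. \tag{10.117} \]
Just as before, the unitary condition (this time $L_R^\dagger L_R = 1$) guarantees that the form of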
the Lagrangian term is exactly the same for unprimed fields as for primed fields. You
can show quite easily that the same thing applies to interactions of all fermions with
Z μ and the gluon fields. The unitary redefinition matrices for quarks just commute
with the SU (3)c generators, since they act on different indices.
But, there is one place in the Standard Model where the above argument does
not work, namely the interactions of W ± vector bosons. This is because the W ±
interactions involve two different types of fermions, with different unitary redefini-
tion matrices. Consider first the interactions of the W ± with leptons. In terms of the
original primed fields:
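\[ \mathcal{L} = -\frac{g}{\sqrt{2}}\, W^+_\mu\, \overline{\nu}'^{\,i} \gamma^\mu \ell'_{Li} + {\rm c.c.}, \tag{10.118} \]
so that
\[ \mathcal{L} = -\frac{g}{\sqrt{2}}\, W^+_\mu\, \overline{\nu}'^{\,i} \gamma^\mu (L_L \ell_L)_i + {\rm c.c.} \tag{10.119} \]
Since we did not include a Yukawa coupling or mass term for the neutrinos, we
did not have to make a unitary redefinition for them. But now we are free to do so;
defining $\nu'_i$ in the same way as the corresponding charged leptons, $\nu'_i = L_{L\,i}{}^{j}\, \nu_j$, we
get,
\[ \overline{\nu}'^{\,i} = (\overline{\nu} L_L^\dagger)^i, \tag{10.120} \]
resulting in
\[ \mathcal{L} = -\frac{g}{\sqrt{2}}\, W^+_\mu\, \overline{\nu}^{\,i} \gamma^\mu \ell_{Li} + {\rm c.c.} \tag{10.121} \]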
So once again the interactions of W ± bosons have exactly the same form for mass-
eigenstate leptons. However, consider the interactions of W ± bosons with quarks.
In terms of the original primed fields,
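\[ \mathcal{L} = -\frac{g}{\sqrt{2}}\, W^+_\mu\, \overline{u}'^{\,i}_L \gamma^\mu d'_{Li} + {\rm c.c.}, \tag{10.122} \]
which becomes:
\[ \mathcal{L} = -\frac{g}{\sqrt{2}}\, W^+_\mu\, \overline{u}^{\,i}_L \gamma^\mu (U_L^\dagger D_L d_L)_i + {\rm c.c.} \tag{10.123} \]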
There is no reason why $U_L^\dagger D_L$ should be equal to the unit matrix, and in fact it is
not. So we have finally encountered a consequence of going to the mass-eigenstate
basis. The charged-current weak interactions contain a non-trivial matrix operating
in quark family space,
\[ V = U_L^\dagger D_L, \tag{10.124} \]
called the Cabibbo-Kobayashi-Maskawa matrix (or CKM matrix). The CKM matrix
$V$ is itself unitary, since $V^\dagger = (U_L^\dagger D_L)^\dagger = D_L^\dagger U_L$, implying that $V^\dagger V = V V^\dagger = 1$.
But, it cannot be removed by going to some other basis using a further unitary matrix
without ruining the diagonal quark masses. So we are stuck with it.
One can think of $V$ as just a unitary rotation acting on the left-handed down
quarks. From (10.123), we can define
\[ d'_{Li} = V_i{}^{j}\, d_{Lj}, \tag{10.125} \]
where θc is called the Cabibbo angle. This implies that the interactions of W + with
the mass-eigenstate quarks are very nearly:
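\[ \mathcal{L} = -\frac{g}{\sqrt{2}}\, W^+_\mu \Big( \cos\theta_c \left[\overline{u}_L \gamma^\mu d_L + \overline{c}_L \gamma^\mu s_L\right] + \sin\theta_c \left[\overline{u}_L \gamma^\mu s_L - \overline{c}_L \gamma^\mu d_L\right] + \overline{t}_L \gamma^\mu b_L \Big). \tag{10.129} \]
Strange hadrons have long lifetimes because they decay through the weak interactions,
and with reduced matrix elements that are proportional to $\sin^2\theta_c \approx 0.05$.
More precisely, the CKM matrix is:
\[ V = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix} \approx \begin{pmatrix} 0.9743 & 0.2252 & 0.004 \\ 0.230 & 0.975 & 0.041 \\ 0.008 & 0.04 & 0.999 \end{pmatrix}, \tag{10.131} \]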
where the numerical values given are estimates of the magnitude only (not the sign
or phase). In fact, the CKM matrix contains one phase that cannot be removed by
redefining phases of the fermion fields. This phase is the only source of CP violation
in the Standard Model.
Weak decays of mesons involving the W bosons allow the entries of the CKM
matrix to be probed experimentally. For example, decays
\[ B \to D\, \ell^+ \nu_\ell, \tag{10.132} \]
where $B$ is a meson containing a bottom quark and $D$ contains a charm quark, can be
used to extract $|V_{cb}|$. The very long lifetimes of $B$ mesons are explained by the fact
that $|V_{ub}|$ and $|V_{cb}|$ are very small. One of the ways of testing the Standard Model is
to check that the CKM matrix is indeed unitary:
\[ V^\dagger V = 1, \tag{10.133} \]
which implies in particular that $V_{ud} V_{ub}^* + V_{cd} V_{cb}^* + V_{td} V_{tb}^* = 0$. This is an automatic
consequence of the Standard Model, but if there is further unknown physics out there,
then the weak interactions could appear to violate CKM unitarity.
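A crude version of the unitarity test can be run on the magnitudes quoted in (10.131): for a unitary matrix, the squared magnitudes in every row and every column must sum to 1. The sketch below checks this; since only rounded magnitudes are available (no phases), it cannot test the off-diagonal conditions such as the one written after (10.133), and the 1% tolerance is an arbitrary choice made here.

```python
# Magnitudes |V_ij| as quoted in (10.131); rows are (u, c, t), columns (d, s, b).
V_ABS = [
    [0.9743, 0.2252, 0.004],
    [0.230,  0.975,  0.041],
    [0.008,  0.04,   0.999],
]

# For a unitary matrix each row and column of |V_ij|^2 sums to exactly 1.
row_sums = [sum(x * x for x in row) for row in V_ABS]
col_sums = [sum(V_ABS[i][j] ** 2 for i in range(3)) for j in range(3)]

# The Cabibbo angle enters through |V_us| ~ sin(theta_c).
sin2_theta_c = V_ABS[0][1] ** 2

for label, sums in (("rows", row_sums), ("columns", col_sums)):
    print(label, [f"{s:.4f}" for s in sums])
print(f"sin^2(theta_c) ~ {sin2_theta_c:.3f}")
```

All six sums agree with 1 to better than a percent, which is the level of precision one can expect from entries rounded to two or three significant figures.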
10.4 Neutrino Masses and the Seesaw Mechanism 289
The partial decay widths and branching ratios of the W boson can be worked out
from (10.121) and (10.127), and agree well with the experimental results:
where “hadrons” refers mostly to Cabibbo-allowed final states $u\overline{d}$ and $c\overline{s}$, with a
much smaller contribution from the Cabibbo-suppressed final states $c\overline{d}$ and $u\overline{s}$. The
$t\overline{b}$ final state is of course not available due to kinematics; this implies the useful fact
that (up to a very small effect from CKM mixing) b quarks do not result from W
decays in the Standard Model.
Also, the very small magnitudes of $V_{td}$ and $V_{ts}$ imply that top quarks decay to
bottom quarks almost every time:
\[ {\rm BR}(t \to W^+ b) \approx 1. \tag{10.136} \]
Evidence from observation of neutrinos produced in the Sun, the atmosphere, accelerators,
and reactors has now established that neutrinos do have mass. In the renormalizable
version of the Standard Model given up to here, this cannot be explained.
The basic reason for this is the absence of right-handed neutrinos from the list in
(10.35). To remedy the situation, we can add three right-handed fermions that are
singlets under all three components of the gauge group SU (3)c × SU (2) L × U (1)Y :
Note the similarity of this with the up-type quark Yukawa couplings in (10.100).
Going to unitary gauge, one obtains a neutrino mass matrix
\[ m_{\nu\,i}{}^{j} = \frac{v}{\sqrt{2}}\, y_{\nu\,i}{}^{j}, \tag{10.139} \]
just as in (10.105) for the Standard Model charged fermions. This neutrino mass
matrix can be diagonalized to obtain the physical neutrino masses as the absolute
values of its eigenvalues. The neutrinos in this scenario are Dirac fermions, as the
mass term couples together left-handed and right-handed degrees of freedom that
are independent.
Although the magnitudes of the neutrino masses are not yet determined by experiment,
there are strong upper bounds, as seen in Table 1.2 in the Introduction. Also,
limits from the WMAP and Planck measurements of the cosmic background radiation,
interpreted within the standard cosmological model, imply that the sum of
the masses of the three Standard Model neutrinos should be at most 0.17 eV. Neutrino oscillation
data do not constrain the individual neutrino masses, but imply that the largest
differences between squared masses should be less than $3 \times 10^{-3}$ eV$^2$. So, several
independent pieces of evidence indicate that neutrino masses are much smaller than
any of the charged fermion masses. To accommodate this within the Dirac mass
framework of (10.139), the eigenvalues of the neutrino Yukawa matrix $y_{\nu\,i}{}^{j}$ would
have to be extremely small, no larger than about $10^{-9}$. Such small dimensionless
couplings appear slightly unnatural, in a purely subjective sense, and this suggests
that neutrino masses may have a different origin than quark and lepton masses.
The seesaw mechanism is a way of addressing this problem, such that very small
neutrino masses naturally occur, even if the corresponding Yukawa couplings are of
order 1. One includes, besides (10.138), a new term in the Lagrangian:
\[ \mathcal{L} = -\frac{1}{2} M_{ij}\, \overline{N}_i N_j, \tag{10.140} \]
where $N_i$ are the Majorana fermion fields (see Sect. 3.4) that include $N_{Ri}$, and $M_{ij}$
is a symmetric mass matrix. If the neutrino fields carry lepton number 1, then this
Majorana mass term necessarily violates the total lepton number $L = L_e + L_\mu + L_\tau$.
Now the total mass matrix for the left-handed neutrino fields $\nu_{Li}$ and the right-
handed neutrino fields $N_{Ri}$, including both (10.139) and (10.140), is:
\[ \mathcal{M} = \begin{pmatrix} 0 & \frac{v}{\sqrt{2}}\, y_\nu \\ \frac{v}{\sqrt{2}}\, y_\nu^T & M \end{pmatrix}. \tag{10.141} \]
The point of the seesaw mechanism is that if the eigenvalues of $M$ are much larger than
those of the Dirac mass matrix $\frac{v}{\sqrt{2}} y_\nu$, then the smaller set of mass eigenvalues of $\mathcal{M}$
will be pushed down. Since $M$ does not arise from electroweak symmetry breaking,
it can naturally be very large. For illustration, taking $M$ and $y_\nu$ to be 1 × 1 matrices,
the absolute values of the neutrino mass eigenvalues of $\mathcal{M}$ are approximately:
\[ \frac{v^2 y_\nu^2}{2M}, \quad\text{and}\quad M \tag{10.142} \]
in the limit $v y_\nu \ll M$. For example, to get a neutrino mass of order 0.1 eV, one could
have $y_\nu = 1.0$ and $M = 3 \times 10^{14}$ GeV, or $y_\nu = 0.1$ and $M = 3 \times 10^{12}$ GeV. The
light neutrino states (corresponding to the lighter eigenvectors of $\mathcal{M}$) are mostly the
Standard Model ν L and they are Majorana fermions. There are also three extremely
heavy Majorana neutrino mass eigenstates, which decouple from present weak inter-
action experiments. The fact that the magnitude of M necessary to make this work is
not larger than the Planck scale, and is very roughly commensurate with other scales
that occur in other theories such as supersymmetry, is encouraging. In any case,
the ease with which the seesaw mechanism accommodates very small but non-zero
neutrino masses has made it a favorite scenario of theorists.
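The one-family seesaw estimate in (10.142) can be checked against the exact eigenvalues of the 2 × 2 matrix (10.141). The sketch below is illustrative only: the function name is hypothetical, and the light eigenvalue is computed in the algebraically equivalent form $2 m_D^2/(M + \sqrt{M^2 + 4 m_D^2})$ because the direct difference $\sqrt{M^2 + 4 m_D^2} - M$ loses all precision in floating point when $M$ exceeds $m_D$ by twelve orders of magnitude.

```python
import math

V_GEV = 246.0  # Higgs VEV in GeV


def seesaw_masses(y_nu, big_m, v=V_GEV):
    """Absolute eigenvalues (light, heavy) of [[0, m_D], [m_D, M]], in GeV.

    m_D = v * y_nu / sqrt(2) is the Dirac mass entering (10.141).
    """
    m_d = v * y_nu / math.sqrt(2.0)
    root = math.hypot(big_m, 2.0 * m_d)       # sqrt(M^2 + 4 m_D^2)
    heavy = 0.5 * (big_m + root)
    light = 2.0 * m_d**2 / (big_m + root)     # equals (root - M) / 2, but stable
    return light, heavy


for y_nu, big_m in ((1.0, 3e14), (0.1, 3e12)):
    light, heavy = seesaw_masses(y_nu, big_m)
    approx = V_GEV**2 * y_nu**2 / (2.0 * big_m)   # light mass from (10.142)
    print(f"y = {y_nu}, M = {big_m:.0e} GeV: light = {light * 1e9:.3f} eV "
          f"(approximation {approx * 1e9:.3f} eV), heavy = {heavy:.3e} GeV")
```

Both parameter choices from the text give a light mass of about 0.1 eV, and the approximate formula is indistinguishable from the exact result at this hierarchy.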
In either of the two cases above, the left-handed parts of the neutrino mass eigen-
states ν1 , ν2 , ν3 (with masses m 1 < m 2 < m 3 ) can be related to the left-handed parts
of the neutrino weak-interaction eigenstates νe , νμ , ντ (which each couple to the
corresponding charged lepton only, and the W boson) by:
\[ \begin{pmatrix}\nu_{eL}\\ \nu_{\mu L}\\ \nu_{\tau L}\end{pmatrix} = U \begin{pmatrix}\nu_{1L}\\ \nu_{2L}\\ \nu_{3L}\end{pmatrix}, \tag{10.143} \]
(at the nucleon level $nn \to pp\, e^- e^-$), which can proceed via the quark-level Feynman
diagram shown below.
Since this process requires a violation of total lepton number in the neutrino propaga-
tor, it can only occur in the case of Majorana neutrinos. It is the subject of continuing
searches.
One of the most momentous discoveries of the last half century was that of the Higgs
boson. Before its discovery in 2012, its properties and even its existence were in
doubt, despite the many successes of the Standard Model. Research publications on
“Higgsless theories” had persisted to the very end. As discussed above, the simplest
path to achieve masses for the W and Z bosons and the fermions of the Standard
Model is to introduce a single scalar Higgs boson doublet that condenses, breaking
electroweak symmetry down to the U (1)EM . The fluctuation around this background
value is the Higgs boson. However, when the theory was first formulated, there
was little guidance as to what mass it should have. One only knew that it was controlled
by the vacuum expectation value, which was known to be $v \approx 246$ GeV, and
a dimensionless constant $\lambda$ that was completely unknown, leading to an unknown
mass $m_h = \sqrt{2\lambda}\, v$, as we saw in (10.74).
Experiments had been searching for the Higgs boson for decades without success.
Just prior to its discovery, evidence from the sum of data collected from the Z pole
experiments at CERN LEP and SLAC SLC, combined with the top quark and W
mass measurements at Tevatron and LEP2 at CERN, suggested that if the Standard
Model is the underlying theory of the weak scale, then the Higgs boson mass needed
to be in the range 114 GeV $< m_h \lesssim$ 180 GeV at 95% confidence level. It should be
emphasized that the lower bound of 114 GeV was derived directly from the absence of
Higgs boson production and decay events at LEP2, whereas the upper bound was derived
indirectly, and thus less reliably, by a global analysis of compatibility with all data that
are sensitive to the Higgs boson mass, via quantum loops. A problem with this kind
of indirect bound is that it is always possible that some other unsuspected particle(s)
also contribute in the loops, interfering with the Higgs boson contribution.
Below, we will review the physics of the Higgs boson discovery at the LHC,
starting with a discussion of the decay modes.
In Sect. 6.3, we have already calculated the leading-order decays of the Higgs boson
into fermion-antifermion final states. However, there are other final states that are
quite important besides $h \to f\overline{f}$. First, the Higgs boson can decay into two gluons,
h → gg, with the gluons eventually manifesting themselves in the detector as jets.
This decay cannot happen at tree level, but does occur through the one-loop diagram
below, where quarks go around the loop:
10.5 The Higgs Boson Discovery 293
(One must also include the diagram with the gluons exchanged, or equivalently the
diagram with the quarks running the other direction around the loop.) Even though
one-loop graph amplitudes are usually not competitive with tree-level amplitudes,
this is an exception because the gluons have strong couplings and because the top
quark with its large yt participates in the loop diagrams, while in the on-shell 2-body
decays to fermions, only lighter fermions (with much smaller Yukawa couplings)
can appear.
The resulting partial decay width for h → gg depends on the quark masses in
two places. First, at the hq q̄ vertex there is a Yukawa coupling, and second there
needs to be a chirality flip in one of the propagators to enable a non-zero result. That
is, the trace over the three fermion propagator numerators vanishes (due to an odd
number of γ matrices) unless one of the propagators is traced over the mass term.
These two facts explain why the top quark, being by far the most massive quark,
gives a dominant contribution to this amplitude.
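The spin-summed squared amplitude for the h → gg transition is
$$ |\mathcal{M}|^2 = \frac{m_h^4}{v^2}\left(\frac{\alpha_s}{\pi}\right)^2 \Big|\sum_q A_{1/2}(\tau_q)\Big|^2 , \tag{10.145} $$
$$ A_{1/2}(\tau) = \tau + \tau(1-\tau) f(\tau), \tag{10.146} $$
where
$$ f(\tau) = \begin{cases} \left[\sin^{-1}\!\left(1/\sqrt{\tau}\right)\right]^2, & (\text{for } \tau \ge 1),\\[4pt] -\dfrac{1}{4}\left[\ln\!\left(\dfrac{1+\sqrt{1-\tau}}{1-\sqrt{1-\tau}}\right) - i\pi\right]^2 & (\text{for } \tau \le 1). \end{cases} \tag{10.147} $$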
The partial decay width is then
$$ \Gamma(h \to gg) = \frac{|\mathcal{M}|^2}{32\pi m_h} = \frac{m_h^3}{32\pi v^2}\left(\frac{\alpha_s}{\pi}\right)^2 \Big|\sum_q A_{1/2}(\tau_q)\Big|^2 . \tag{10.148} $$
The first equality can be obtained from (6.24)–(6.25). Keep in mind that there is a factor of 1/2 from the indistinguishability of the gluons; otherwise each kinematic configuration of gluons would be double counted when the final state phase space is integrated over. Using $\alpha_s = 0.118$, $m_t = 173$ GeV, $m_h = 125$ GeV, and $v = 246$ GeV, one finds $\Gamma(h \to gg) = 0.214$ MeV. This is in contrast to a more complete and state-of-the-art computation, which instead gives $\Gamma(h \to gg + X) = 0.349$ MeV, where X represents anything (including nothing). The reason for the discrepancy is that higher loop contributions and the radiation of additional soft gluons enhance the decay partial width.
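As a numerical cross-check of the leading-order width quoted above, the following Python sketch evaluates (10.145)–(10.147) with only the top quark in the loop. The function names are ours, and the width is obtained from the relation $|\mathcal{M}|^2 = 32\pi m_h \Gamma$ that the text uses in the cross-section discussion; it is an illustration under these assumptions, not a precision calculation.

```python
import math

def f(tau):
    """Loop function f(tau) of (10.147), with tau = 4 m^2 / m_h^2."""
    if tau >= 1.0:
        return complex(math.asin(1.0 / math.sqrt(tau)) ** 2)
    root = math.sqrt(1.0 - tau)
    log_term = math.log((1.0 + root) / (1.0 - root))
    return -0.25 * (log_term - 1j * math.pi) ** 2

def a_half(tau):
    """Spin-1/2 loop function A_{1/2}(tau) of (10.146)."""
    return tau + tau * (1.0 - tau) * f(tau)

def width_h_to_gg(m_h, v, alpha_s, quark_masses):
    """Leading-order width in GeV, Gamma = |M|^2 / (32 pi m_h)."""
    amplitude = sum(a_half(4.0 * m_q ** 2 / m_h ** 2) for m_q in quark_masses)
    m_squared = m_h ** 4 / v ** 2 * (alpha_s / math.pi) ** 2 * abs(amplitude) ** 2
    return m_squared / (32.0 * math.pi * m_h)

# Inputs quoted in the text; only the top quark is kept in the loop.
gamma_gg_mev = 1.0e3 * width_h_to_gg(125.0, 246.0, 0.118, [173.0])
print(f"Gamma(h -> gg) = {gamma_gg_mev:.3f} MeV")
```

In the heavy-quark limit $\tau \to \infty$ the function $A_{1/2}$ approaches 2/3, which is why the result is only weakly sensitive to the precise value of $m_t$.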
The decay h → γ γ also is absent at tree level, but does occur due to one-loop
graphs where any charged particle goes around the loop. This again includes notably
the top quark, but now the largest contribution is due to the W boson. The two most
important Feynman diagrams are:
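The resulting partial decay width is
$$ \Gamma(h \to \gamma\gamma) = \frac{m_h^3}{64\pi v^2}\left(\frac{\alpha}{\pi}\right)^2 \Big| A_1(\tau_W) + \sum_f N_c^f Q_f^2 A_{1/2}(\tau_f) \Big|^2 , \tag{10.149} $$
where the sum is over $f = t, b, c, s, u, d$ and $\tau, \mu, e$, with $N_c^f = 3$ for quarks and $N_c^f = 1$ for leptons, with $Q_{t,c,u} = 2/3$ and $Q_{b,s,d} = -1/3$ and $Q_{\tau,\mu,e} = -1$, with $\tau_W = 4m_W^2/m_h^2$ and $\tau_f = 4m_f^2/m_h^2$, and
where, in the W-boson loop function $A_1(\tau)$, $f(\tau)$ is the same function as appears in $A_{1/2}(\tau)$, and was given already in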
(10.147). Although the resulting branching ratio to two photons is much smaller
(about 2.3 × 10−3 in the Standard Model with m h = 125 GeV, after taking into
account higher-order corrections), it is still important because the corresponding
backgrounds at colliders are also small.
The decay h → Z γ is also mediated by similar one-loop graphs, but it will not
be reviewed here because it turns out to be quite small and not as useful, once the
corresponding background rates are taken into account. It has still not been observed,
but even this non-observation can constrain some non-minimal models.
Also significant are decays through two massive vector bosons, $h \to W^+W^-$ and $h \to ZZ$, corresponding to the Feynman rules found in Sect. 10.2 (below (10.71)). In fact, these decays are important even if (as turns out to be true in the real world) $m_h < 2m_W$ and $m_h < 2m_Z$, so that the on-shell decays are forbidden and one of the vector bosons must be virtual. One usually indicates this by writing
$$ h \to W W^{(*)} \quad\text{and}\quad h \to Z Z^{(*)}, $$
where the “(∗)” means that the corresponding particle may be off-shell, depending on the kinematics. If one of the vector bosons is off-shell, then the decay can be thought of as really three-body. This means that the decay is really $h \to W^\pm f\bar f'$ or $h \to Z f\bar f$, where $f\bar f'$ is any final state that couples to the off-shell $W^\mp$, and $f\bar f$ is any fermion-antifermion final state that couples to the off-shell $Z$. For example, for leptonic final states of the off-shell vector boson:
[Figure: Standard Model Higgs boson branching ratios (to $b\bar b$, $\tau^+\tau^-$, $c\bar c$, $gg$, $W^+W^-$, $ZZ$, $\gamma\gamma$) and Higgs total width in GeV, each plotted on a logarithmic scale against the Higgs mass from 100 to 240 GeV.]
The largest decay modes were predicted to be bb and/or W W (∗) over the entire range
of m h , with the total width dramatically increasing when both W bosons can be on
shell. However, because it has very low backgrounds in colliders, the γ γ final state
was understood to be very important for the discovery of a light Higgs boson, despite
its tiny branching ratio. The Z Z (∗) final state is also important because it can lead to
low-background signals if both of the Z bosons decay to leptons. Note that the gg
final state is useless as a discovery mode because of huge QCD backgrounds to dijet
production, but it is important to keep track of for two reasons. First, its presence
reduces the branching ratios into the more useful final states. Second, it is related,
by crossing symmetry, to the largest production cross-section mode, as we discuss
next.
At hadron colliders such as the LHC, the largest parton-level production process for the Higgs boson is:
$$ gg \to h. \tag{10.152} $$
This process cannot occur at tree level, but it does occur due to the same one-loop
diagram mentioned above in Sect. 10.5.1 for the decay h → gg. The roles of the
initial state and the final state are simply exchanged, by crossing:
Although the amplitude is loop-suppressed, the large gluon PDFs at the LHC make it
by far the most important production mode for a 125 GeV Higgs boson at the LHC,
with a cross-section exceeding that of the next largest, the W -boson fusion process
discussed below, by more than an order of magnitude. It was the process that figured
most prominently in the initial discovery.
At leading order, the production cross-section for $gg \to h$ is directly proportional to the decay width for $h \to gg$, by crossing symmetry. To see this connection, we begin with the general cross-section formula for massless initial-state particles with four-momenta $p_a$ and $p_b$ scattering to a single final-state particle with four-momentum $k = (E, \vec k)$, with $E = \sqrt{|\vec k|^2 + m_h^2}$ in the present case. From (4.175) and (4.176) with $n = 1$, and $|v_a - v_b| = 2$ and $4E_aE_b = \hat s$ in the center of momentum frame, we obtain:
$$ d\hat\sigma = \frac{1}{2\hat s}\cdot\frac{d^3k}{(2\pi)^3\,2E}\cdot\frac{1}{4}\cdot\frac{1}{64}\,|\mathcal{M}|^2\,(2\pi)^4\,\delta^{(4)}(p_a + p_b - k). \tag{10.153} $$
By crossing symmetry, $|\mathcal{M}|^2$ is the same spin-summed and color-summed squared matrix element as in the decay calculation, but here it comes with the prefactor $\frac{1}{4}\cdot\frac{1}{64}$, since in this case we need to average (instead of sum) over initial state gluon spins (2 spins) and color factors (8 gluons $g^{a=1\ldots8}$). Since the cross-section is invariant under boosts along the beam direction, we choose to work in the center-of-momentum frame where $\vec p_a + \vec p_b = 0$, so that the delta function vanishes except for $\vec k = 0$. For on-shell production of the Higgs boson, $E = \sqrt{\hat s} = m_h$.
Integrating (10.153) over the three-momentum $\vec k$ and collecting terms we find that
$$ \hat\sigma = \frac{\pi}{256\,m_h^2}\,\delta(\hat s - m_h^2)\,|\mathcal{M}|^2 . \tag{10.154} $$
Now, from (10.148) we know that $|\mathcal{M}|^2 = 32\pi m_h \Gamma(h \to gg)$, and so the cross-section of $gg \to h$ can be obtained by knowing the partial decay width for $h \to gg$:
$$ \hat\sigma(gg \to h) = \frac{\pi^2}{8m_h}\,\Gamma(h \to gg)\,\delta(\hat s - m_h^2). \tag{10.155} $$
The δ function dependence of the cross-section corresponds to the fact that free, on-shell asymptotic states cannot scatter 2-to-1 unless the four-momenta $p^\mu_{a,b}$ of the first two particles are precisely arranged to construct the final momentum $k = p_a + p_b$ such that $k^2 = m_h^2$. Unlike 2-to-2 scattering, not just any sufficiently large incoming momenta will do. In the center of mass frame this requires that $E = m_h$ and $\vec k = 0$.
More generally, there is no way to allow $AB \to C$ particle scattering unless $m_A + m_B \le m_C$. However, if that is allowed, then $C \to AB$ decays are allowed with the same amplitude, giving $C$ a decay width $\Gamma_C$. The decay width means that there is a finite spread of $k^2$ around $m_C^2$ (with the finite spread being determined by $\Gamma_C$) such that $AB \to C$ is allowed. The finite spread is the Breit-Wigner width, which is characterized by replacing the δ-function with
$$ \delta(\hat s - m_h^2) \to \frac{1}{\pi}\,\frac{m_h\Gamma_h}{(\hat s - m_h^2)^2 + m_h^2\Gamma_h^2}. \tag{10.156} $$
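The cross-section for proton-proton collisions is obtained by folding the parton-level cross-section with the gluon parton distribution functions $g(x)$:
$$ \sigma(pp \to h) = \int_0^1 dx_a \int_0^1 dx_b\; g(x_a)\,g(x_b)\,\hat\sigma(gg \to h). \tag{10.157} $$
It is convenient to change variables to
$$ \tau = x_a x_b \quad\text{and}\quad x = x_a, \tag{10.158} $$
so that $\hat s = \tau s$ and
$$ \sigma(pp \to h) = \int_0^1 d\tau \int_\tau^1 \frac{dx}{x}\; g(x)\,g(\tau/x)\,\hat\sigma(\tau s). \tag{10.159} $$
Defining
$$ \frac{d\mathcal{L}(\tau)}{d\tau} = \int_\tau^1 \frac{dx}{x}\; g(x)\,g(\tau/x), \tag{10.160} $$
this becomes
$$ \sigma(pp \to h) = \int_0^1 d\tau\; \frac{d\mathcal{L}(\tau)}{d\tau}\,\hat\sigma(\tau s). \tag{10.161} $$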
For a given fixed value of $m_h$, the total leading-order cross-section for $pp \to h$ due to the $gg \to h$ parton-level process is therefore a simple function of $s$ and $m_h$, and can be obtained by using the δ-function in $\hat\sigma$ to integrate over τ, with the result:
$$ \sigma(pp \to h) = \sigma_0\,F_{gg}(\tau_h), \tag{10.162} $$
where
$$ \sigma_0 = \frac{\pi^2}{8m_h^3}\,\Gamma(h \to gg), \tag{10.163} $$
and
$$ F_{gg}(\tau_h) = \tau_h \int_{\tau_h}^1 \frac{dx}{x}\; g(x)\,g(\tau_h/x) \tag{10.164} $$
is a dimensionless function of
$$ \tau_h = m_h^2/s, \tag{10.165} $$
which in turn depends on the Higgs boson mass and the proton beam energy.
The numerical value of $\sigma_0$, using the leading-order width $\Gamma(h \to gg) = 0.214$ MeV obtained above, is:
$$ \sigma_0 = \frac{\pi^2}{8m_h^3}\,\Gamma(h \to gg) \approx 53\ \text{fb}. \tag{10.166} $$
Using the MSTW2008NLO parton distribution functions for gluons gives $F_{gg}(\tau_h) = 99$, 127, and 292 for $\sqrt{s} = 7$, 8, and 13 TeV, respectively, with $m_h = 125$ GeV (and the factorization scale in the gluon PDFs set equal to $m_h$). Thus, we get leading-order estimates of $\sigma(pp \to h) = 5.2$ pb, 6.7 pb, and 15.4 pb, for $\sqrt{s} = 7$, 8, and 13 TeV, respectively. These simple estimates are considerably smaller than the results of state-of-the-art computations of $\sigma(pp \to h + X)$ coming from the CERN Higgs cross-section working group, which gives approximately 17 pb, 21 pb, and 49 pb, respectively. This increase of more than a factor of 3 compared to our results above is because of the large effects of higher-order loop corrections, the emission of additional soft gluons, and a more sophisticated use of PDFs, all of which we have not included in our simple analysis. This demonstrates the great importance of the heroic efforts that have been made to calculate such higher order effects. In addition, more sophisticated calculations provide crucial information about the kinematics of Higgs boson events, including the distribution of the transverse momenta of the Higgs boson, and the numbers and momenta of the additional jets that may be produced in the event.
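The arithmetic behind these leading-order estimates can be sketched in a few lines of Python. The conversion factor $(\hbar c)^2 \approx 0.3894$ GeV² mb is a standard constant that is not quoted in the text, and the variable names are ours; the width and the values of $F_{gg}$ are the ones given above.

```python
import math

# (hbar c)^2: 1 GeV^-2 = 0.3894 mb = 0.3894e12 fb (standard value, not from the text)
GEV_MINUS2_TO_FB = 0.3894e12

def sigma0_fb(m_h_gev, gamma_gg_gev):
    """sigma_0 = pi^2 Gamma(h -> gg) / (8 m_h^3), converted to femtobarns."""
    return math.pi ** 2 / (8.0 * m_h_gev ** 3) * gamma_gg_gev * GEV_MINUS2_TO_FB

sigma0 = sigma0_fb(125.0, 0.214e-3)

# F_gg(tau_h) for sqrt(s) = 7, 8, 13 TeV as quoted in the text (MSTW2008NLO gluons).
f_gg = {7: 99.0, 8: 127.0, 13: 292.0}

# Leading-order sigma(pp -> h) = sigma_0 * F_gg, in picobarns.
sigma_pb = {energy: sigma0 * value / 1.0e3 for energy, value in f_gg.items()}

print(f"sigma_0 = {sigma0:.1f} fb")
for energy, sigma in sigma_pb.items():
    print(f"sqrt(s) = {energy} TeV: sigma = {sigma:.1f} pb")
```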
Other parton-level processes that produce the Higgs boson at the LHC have smaller cross-sections, but are important because they involve additional final state particles whose presence can be used to control backgrounds. Furthermore, these processes involve different couplings, allowing tests of the proposition that the new scalar particle is really behaving as expected for the Standard Model Higgs. First, there are the weak vector boson fusion modes, which refer to the parton-level processes $qq \to qqh$, $q\bar q \to q\bar q h$, and $\bar q\bar q \to \bar q\bar q h$ through Feynman diagrams like this:
Here, the quark jets in the final state are usually found at small angles with respect
to the beam. Tagging events with these forward jets is a way to reduce backgrounds.
Another type of channel features Higgs bosons that are radiated off of weak vector
bosons:
$$ q\bar q \to Zh, \tag{10.167} $$
$$ q\bar q' \to W^\pm h. \tag{10.168} $$
These channels provide useful modes for confirmation and study, because the presence of the extra weak boson reduces backgrounds. The process $q\bar q \to Zh$ occurs due to this Feynman diagram:
This is particularly useful as a direct test of the Higgs boson interaction with the top
quark.
The pp Large Hadron Collider experiments at CERN, ATLAS and CMS, were both designed to be able to cover the entire range of Higgs boson mass suggested by the indirect constraints on it. For much of the allowed mass region, the primary target for discovery was the decay to γγ, manifested as a narrow mass peak of two photons centered on $m_h = m_{\gamma\gamma}^{\rm signal}$. The main background, largely created by $q\bar q \to \gamma\gamma$, consists of a diffuse spectrum of $m_{\gamma\gamma}^{\rm bkgd}$. There are also important contributions to the background from $gg \to \gamma\gamma$ and from fake photons.
Indeed, the diphoton signal is half of how the Higgs boson discovery was estab-
lished – a peak of γ γ events that ultimately could be identified with m h . The top
Fig. 10.1 Top panel: The diphoton invariant mass spectrum from ATLAS data (ATLAS-CONF-2022-094). The upper red line is signal (from $h \to \gamma\gamma$) plus background (mostly $q\bar q \to \gamma\gamma$). The dashed blue line is a fit to the background-only hypothesis. The black curves on the bottom show the signal expectation (black solid curve) and the background subtracted data (data circles with 68% CL vertical uncertainty bars). Bottom panel: The four-lepton invariant mass spectrum from CMS is shown as the black data points (Sirunyan et al. (CMS), EPJ C81, 488 (2021) and reprinted in RPP 2022). The prediction from backgrounds other than the Standard Model Higgs boson is shown as the blue histogram. The red curve includes the backgrounds plus the prediction of a Higgs boson with invariant mass peak of $m_h = m_{4\ell} = 125$ GeV
$$ m_Z = 91.2\ \text{GeV}, \qquad \alpha = \frac{1}{129}, \qquad \sin^2\theta_W = 0.23. $$
where t, b are the Dirac spinor fields for the top quark and bottom quark. (Ignore
the fact that the bottom quark appearing here is not quite a mass eigenstate; this is a
very small effect.) The top quark decays very quickly, as you will discover below,
so it does not form complicated bound states like the lighter quarks. Therefore, one
can just use the simple charged-current weak-interaction Feynman rule implied
by the above interaction Lagrangian.
Let the 4-momentum of the top quark be $p^\mu$, and that of the $W^+$ boson be $k_1^\mu$, and that of the bottom quark be $k_2^\mu$. Find the kinematic quantities:
in terms of the symbols m t and m W . Treat the bottom quark as massless. (This is a
good approximation, since m t = 173.1 ± 1.3 GeV, m W = 80.399 ± 0.023 GeV,
and m b ≈ 5 GeV. Note that kinematic quantities generally involve the squares of
ratios of masses.)
3. Draw the Feynman diagram and write down the reduced matrix element for top quark decay. Take the complex square of the reduced matrix element, and sum over the final state polarizations of the $W^+$ boson. Then average over the initial $t$ spin, and sum over the final-state spin of the $b$. (The quarks have 3 colors, but the color of the final state bottom quark is constrained to be the same as that of the initial state top quark. Since one should average over the initial state quark color, the net color factor is just 1.)
4. Compute the decay rate of the top quark. You should find a result of the form:
$$ \Gamma(t \to bW^+) = \frac{g^2 m_t^3}{N_1 \pi M_W^2}\left(1 + N_2\frac{M_W^2}{m_t^2}\right)\left(1 - \frac{M_W^2}{m_t^2}\right)^{N_3} \tag{10.171} $$
$$ \mu\frac{d}{d\mu}g_i = \beta(g_i) = \frac{1}{16\pi^2}B_i g_i^3 . \tag{10.172} $$
Here, $g_2 = g$ and $g_1 = \sqrt{5/3}\,g'$, where $g$, $g'$ are the $SU(2)_L$ and $U(1)_Y$ couplings in the normalization of Sect. 11.1. The choice of normalization for $g_1$ is called the GUT (Grand Unified Theory) normalization. As boundary conditions, take $\alpha_3(M_Z) = 0.1185$, $g_2(M_Z) = 0.652$, and $g_1(M_Z) = 0.461$. Make a graph of $\alpha_i^{-1}(\mu)$ as a function of $\log_{10}(\mu/1\ \text{GeV})$, for $M_Z \le \mu \le 10^{19}$ GeV, using the one-loop running approximation. Make a note of the numerical values of $\alpha_i(\mu)$ at $\mu = 1000$ GeV and $\mu = 5000$ GeV.
(c) In the Minimal Supersymmetric Standard Model (MSSM), the same three
gauge couplings appear, but they have different one-loop running coefficients:
due to the fact that the MSSM contains more fields appearing in the loop
diagrams. Let us assume that the new particles in the MSSM all have the same
masses μSUSY . (This is probably not realistic, but captures the main point of
the following.) Then, assuming supersymmetry is correct, the renormalization
group running should use the MSSM coefficients for μ > μSUSY . Starting
with the values you found for $\alpha_i(\mu)$ at $\mu = \mu_{\rm SUSY} = 1000$ GeV and $\mu = \mu_{\rm SUSY} = 5000$ GeV in part (b), make graphs of $\alpha_i^{-1}(\mu)$ for $\mu_{\rm SUSY} \le \mu \le 10^{19}$ GeV in the MSSM, as a function of $\log_{10}(\mu/\text{GeV})$. You should observe that
the three running gauge couplings become approximately equal at a single
renormalization group scale μGUT . (To really do this right, one ought to include
at least 2-loop RG running, as well as small “threshold” corrections when
matching the MSSM onto the Standard Model. But those effects make a rather
small difference.) What do you estimate for μGUT ?
This famous unification of gauge couplings is held by many to be an indirect piece of evidence (but far from compelling) in favor of supersymmetry, since it suggests that at very high energies the gauge interactions themselves unify into a simpler Grand Unified Theory (GUT), a larger Yang-Mills theory with a single Lie algebra $SU(5)$ or $SO(10)$ or $E_6$. The unification of gauge couplings can also occur in some versions of superstring theory.
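As a starting point for the numerical part of this problem, here is a minimal Python sketch of one-loop running. Integrating (10.172) with $\alpha_i = g_i^2/4\pi$ gives $\alpha_i^{-1}(\mu) = \alpha_i^{-1}(\mu_0) - (B_i/2\pi)\ln(\mu/\mu_0)$. The coefficient lists `B_SM` and `B_MSSM` are the standard one-loop values in GUT normalization, supplied here as an assumption because they are not legible in the text above; the function names and the choice $M_Z = 91.2$ GeV are also ours.

```python
import math

M_Z = 91.2  # GeV (assumed reference scale)

# Assumed standard one-loop coefficients (B_1, B_2, B_3), GUT-normalized g_1.
B_SM = (41.0 / 10.0, -19.0 / 6.0, -7.0)
B_MSSM = (33.0 / 5.0, 1.0, -3.0)

def inverse_alphas_at_mz(g1=0.461, g2=0.652, alpha3=0.1185):
    """Boundary conditions from the problem, as alpha_i^{-1}(M_Z)."""
    return (4.0 * math.pi / g1 ** 2, 4.0 * math.pi / g2 ** 2, 1.0 / alpha3)

def run_one_loop(inv_alpha, coefficients, mu_start, mu_end):
    """alpha_i^{-1}(mu_end) from alpha_i^{-1}(mu_start) at one loop."""
    t = math.log(mu_end / mu_start)
    return tuple(a - b * t / (2.0 * math.pi) for a, b in zip(inv_alpha, coefficients))

def crossing_scale(inv_alpha, coefficients, mu_start, i, j):
    """Scale at which alpha_i = alpha_j under one-loop running."""
    t = 2.0 * math.pi * (inv_alpha[i] - inv_alpha[j]) / (coefficients[i] - coefficients[j])
    return mu_start * math.exp(t)

mu_susy = 1000.0  # GeV
inv_alpha_susy = run_one_loop(inverse_alphas_at_mz(), B_SM, M_Z, mu_susy)
mu_gut = crossing_scale(inv_alpha_susy, B_MSSM, mu_susy, 0, 1)
inv_alpha_gut = run_one_loop(inv_alpha_susy, B_MSSM, mu_susy, mu_gut)

print(f"alpha_1 = alpha_2 at mu = {mu_gut:.2e} GeV")
print("alpha_i^-1 there:", [round(a, 2) for a in inv_alpha_gut])
```

Evaluating `run_one_loop` on a grid of scales gives the points needed for the requested graphs; the crossing of $\alpha_1$ and $\alpha_2$ is only one of several reasonable ways to define $\mu_{\rm GUT}$.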
11 Neutral Meson Mixing
In this chapter we will investigate the experimental effects of neutral meson mixing. An excellent laboratory in which to do so is the neutral kaon sector. Kaons are particles that are bound states of a single anti-strange quark along with an up quark (charged kaons, $K^+$) or down quark (neutral kaons, $K^0$). Such composite particles satisfy the requirements of QCD, which demands that bound states are color singlets. There are two reasons to explore meson mixing among the neutral kaons in some detail. First, the subject is endowed with its own complexity that is beyond the discussion we have encountered so far. This complexity is the quantum mechanical mixing over time of two “equivalent states” from the point of view of the charges that they share. These oscillations show up in many guises in particle physics, but especially in neutral meson mixings and neutrino flavor mixing. The methods discussed here will transfer to understanding those other systems. In the case of neutral kaons, the $K^0 = d\bar s$ state can mix with its anti-particle $\bar K^0 = \bar d s$. The Hamiltonian eigenstates are superpositions of these states with distinct masses and lifetimes, and are called $K_L$ and $K_S$ for “K long” (lifetime about $5.1 \times 10^{-8}$ seconds) and “K short” (lifetime about $8.95 \times 10^{-11}$ seconds), respectively.
The second reason to discuss neutral meson mixing is that historically it was the
first place in which CP violation was observed in particle physics experiments. Here,
C is the charge conjugation operation that turns each particle into its antiparticle,
and P is the parity (or space inversion) operation, each with eigenvalues ±1. The
combination of these operations, CP, is a symmetry of the strong interactions but
not the weak interactions. This is because there is a complex phase in the weak
interaction Hamiltonian that cannot be removed by a field redefinition. As we will
see below, the discovery of CP violation came about by recognizing that the K L
eigenstate could decay into two different final states with different CP eigenvalues.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 307
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_11
This was unexpected and could only be explained if there is a CP violating interaction
allowed among the constituent quarks. We know from our earlier discussion of the
CKM matrix that governs W boson interaction among the quarks that with three (or
more) families not all phases can be absorbed in the fields and one non-zero phase
possibility is left over. Thus, a mechanism for introducing CP violation is possible,
which predicted the necessity of a third generation of quarks (t, b) even before they
were found.
Neutral kaon mixing formalism has much in common with the formalisms of neutral D-meson mixing and neutral B-meson mixing. The latter two have detailed descriptions in review essays within the RPP (“$D^0$–$\bar D^0$ Mixing” and “$B^0$–$\bar B^0$ Mixing”). The $D^0$ and $\bar D^0$ mesons are made from $(c\bar u)$ and $(\bar c u)$ quark bound states. The $B_d^0$ and $\bar B_d^0$ mesons are made from $(\bar b d)$ and $(b\bar d)$ quark bound states, and likewise $B_s^0$ and $\bar B_s^0$ are made from $(\bar b s)$ and $(b\bar s)$. All of these have oscillatory mixing behavior similar to neutral kaon mixing. The reader is encouraged to consult these review essays, whose formalism and language will be straightforward to understand after reading this chapter.
Our discussion in this chapter is centered mainly on the mixing phenomenon itself and not on diagrammatic calculations of $K^0$–$\bar K^0$ mixing. For the most part we assume the mixing and investigate the consequences for time evolution and experimental measurement. The reason for this emphasis is that the calculations are loop-order computations, which we have not emphasized within this introductory book. Only when discussing the $K_L$–$K_S$ mass difference do we invoke the necessary one-loop diagrams that directly account for the mass difference. In that case we aim to show that the extraordinarily tiny mass difference that is measured indeed arises from a sensible calculation of the theory.
Fig. 11.1 Leading order $K^0$–$\bar K^0$ mixing diagrams. The mixing violates strangeness ($\Delta S = 2$)
and let us imagine for a moment that the $K^0$ state has no mixing with any other state. In this case, quantum mechanics says that the wave function over time evolves as
$$ |\psi(t)\rangle = e^{-iMt}\,e^{-\Gamma t/2}\,|K^0\rangle, \tag{11.3} $$
where $M$ is the mass of the particle and $\Gamma$ is the total width of the particle, which gives rise to the exponential decay of the wave function in time due to this decay probability.
We can rewrite (11.3) in a more standard quantum mechanical framework by defining an effective Hamiltonian
$$ H = M - \frac{i}{2}\Gamma, \tag{11.4} $$
and applying this to the time-dependent Schrödinger equation
$$ H|\psi(t)\rangle = i\frac{\partial}{\partial t}|\psi(t)\rangle. \tag{11.5} $$
Upon applying the boundary condition that $|\psi(t=0)\rangle = |K^0\rangle$ one sees that (11.3) is the solution of (11.5).
However, in reality $K^0$ mixes with $\bar K^0$ and so (11.3) and (11.5) are too simplistic. The mixing of $K^0$ and $\bar K^0$ arises from box diagrams in the weak interactions which are shown in Fig. 11.1. Since $K^0$ and $\bar K^0$ are strangeness eigenstates, this mixing violates strangeness by 2 units, $\Delta S = \pm 2$. We shall come back to the details of this mixing contribution, but for now we need only recognize that the mixing does indeed occur and it is small due to the suppressions caused by the mixing being a loop effect and the presence of the W boson within the loop, which is much heavier than the kaon ($m_K^2/m_W^2 < 10^{-4}$).
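The Schrödinger equation now becomes, in the $|K^0\rangle$, $|\bar K^0\rangle$ basis,
$$ H|\psi(t)\rangle = i\frac{\partial}{\partial t}|\psi(t)\rangle, \quad\text{where}\quad H = \begin{pmatrix} M_{11} - \frac{i}{2}\Gamma_{11} & M_{12} - \frac{i}{2}\Gamma_{12}\\ M_{21} - \frac{i}{2}\Gamma_{21} & M_{22} - \frac{i}{2}\Gamma_{22} \end{pmatrix}. \tag{11.6} $$
Note, the matrices $M$ and $\Gamma$ are necessarily Hermitian since they correspond to observables. Thus, $M_{11}$, $M_{22}$, $\Gamma_{11}$, and $\Gamma_{22}$ are real, and $M_{21} = M_{12}^*$ and $\Gamma_{21} = \Gamma_{12}^*$.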
The effective Hamiltonian is further restricted due to a general property of local quantum field theories, invariance under CPT (the product of charge conjugation, parity, and time reversal operations). This can be shown to imply the exact relations $M_{11} = M_{22}$, whose common value we will set to $M$, and $\Gamma_{11} = \Gamma_{22}$, whose common value we will set to $\Gamma$. To emphasize the smallness of $M_{12}$ and $\Gamma_{12}$, we will write them as μ and γ respectively. The effective Hamiltonian is then
$$ H = \begin{pmatrix} M - \frac{i}{2}\Gamma & \mu - \frac{i}{2}\gamma\\ \mu^* - \frac{i}{2}\gamma^* & M - \frac{i}{2}\Gamma \end{pmatrix}. \tag{11.7} $$
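The eigenvalues of this effective Hamiltonian are
$$ \lambda_S = m_S - \frac{i}{2}\Gamma_S, \quad\text{and}\quad \lambda_L = m_L - \frac{i}{2}\Gamma_L, \tag{11.8} $$
where
$$ m_S = M - \frac{1}{2}\Delta m, \qquad \Gamma_S = \Gamma + \frac{1}{2}\Delta\Gamma, \tag{11.9} $$
$$ m_L = M + \frac{1}{2}\Delta m, \qquad \Gamma_L = \Gamma - \frac{1}{2}\Delta\Gamma, \tag{11.10} $$
where
$$ \Delta m + \frac{i}{2}\Delta\Gamma = 2\left[(\mu - i\gamma/2)\left(\mu^* - i\gamma^*/2\right)\right]^{1/2}. \tag{11.11} $$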
Thus $\Delta m$ and $\Delta\Gamma$ are nonzero by virtue of $\mu, \gamma \neq 0$, and it turns out that, as defined above, they are both positive. Thus, $m_L > m_S$ but $\Gamma_S > \Gamma_L$, which is to say that even though the $K_L$ mass is larger than the $K_S$ mass, the $K_L$ lifetime ($\tau_L = 1/\Gamma_L$) is longer than the $K_S$ lifetime ($\tau_S = 1/\Gamma_S$). As we will see later, the two states are very nearly degenerate in mass (i.e., $\Delta m = m_L - m_S \ll m_L, m_S$), but the difference in width is large ($\Gamma_S \gg \Gamma_L$). This is due to an interplay between kinematics and the CP symmetry’s gatekeeping of what final states $K_L$ and $K_S$ are easily allowed to decay into. The point is that, as we will see below, $K_L$ mainly decays into three-pion states, which have a 3-body phase-space suppression compared to the 2-body phase space available for $K_S$ decays. The $K_L$ decays are even further kinematically suppressed by the fact that $m_K - 3m_\pi$ happens to be rather small.
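The eigenvalue structure of (11.7)–(11.11) can be checked numerically with a short Python sketch. The input numbers below are hypothetical and in arbitrary units, chosen only so that $\Delta m$ and $\Delta\Gamma$ come out positive; they are not kaon data, and the function names are ours.

```python
import cmath

def hamiltonian_entries(M, Gamma, mu, gamma):
    """Diagonal entry A and off-diagonal entries B, C of the matrix in (11.7)."""
    mu, gamma = complex(mu), complex(gamma)
    A = M - 0.5j * Gamma
    B = mu - 0.5j * gamma
    C = mu.conjugate() - 0.5j * gamma.conjugate()
    return A, B, C

def eigenvalues(M, Gamma, mu, gamma):
    """Return (lambda_L, lambda_S): A +/- sqrt(B C), heavier state first."""
    A, B, C = hamiltonian_entries(M, Gamma, mu, gamma)
    root = cmath.sqrt(B * C)
    return tuple(sorted((A + root, A - root), key=lambda z: z.real, reverse=True))

# Hypothetical illustrative inputs (arbitrary units).
M, Gamma = 1.0, 0.1
mu, gamma = complex(-0.02, 0.0002), complex(0.09, 0.0003)

A, B, C = hamiltonian_entries(M, Gamma, mu, gamma)
lam_L, lam_S = eigenvalues(M, Gamma, mu, gamma)

# With lambda = m - i Gamma / 2, the splitting is Delta m + (i/2) Delta Gamma.
delta_m = (lam_L - lam_S).real
delta_gamma = 2.0 * (lam_L - lam_S).imag

# Ratio of K0bar to K0 components of the lambda_L eigenvector.
ratio_L = (lam_L - A) / B

print(f"Delta m = {delta_m:.4f}, Delta Gamma = {delta_gamma:.4f}")
```

The squared component ratio equals $C/B = (\mu^* - i\gamma^*/2)/(\mu - i\gamma/2)$, which is real-phase-independent of the square-root branch; this is the quantity that controls the CP-violating parameter introduced next.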
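The eigenstates $K_L$ and $K_S$ that correspond to the eigenvalues $\lambda_L$ and $\lambda_S$ are
$$ |K_L\rangle = \frac{1}{\sqrt{2 + 2|\epsilon|^2}}\left[(1+\epsilon)|K^0\rangle - (1-\epsilon)|\bar K^0\rangle\right] \quad\text{for } \lambda_L, \tag{11.12} $$
$$ |K_S\rangle = \frac{1}{\sqrt{2 + 2|\epsilon|^2}}\left[(1+\epsilon)|K^0\rangle + (1-\epsilon)|\bar K^0\rangle\right] \quad\text{for } \lambda_S, \tag{11.13} $$
where $|\epsilon| \ll 1$ in the neutral kaon system and $\epsilon$ is defined to be
$$ \epsilon = \frac{\sqrt{\mu - i\gamma/2} - \sqrt{\mu^* - i\gamma^*/2}}{\sqrt{\mu - i\gamma/2} + \sqrt{\mu^* - i\gamma^*/2}}. \tag{11.14} $$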
11.2 Neutral Kaon Mixing 311
Note that $\epsilon \neq 0$ only because of CP violation, for otherwise μ and γ would be real.
Current experimental best-fit values for the neutral kaon system are
where
$$ |\epsilon| = (2.228 \pm 0.011) \times 10^{-3}, \tag{11.20} $$
$$ \phi_\epsilon = (0.9671 \pm 0.0011)\,\frac{\pi}{4}\ \text{radians}. \tag{11.21} $$
Remarkably, the real and imaginary parts of $\epsilon$ are almost equal.
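The time evolution of these diagonalized Hamiltonian eigenstates is now straightforward, and is given by
$$ |K_L(t)\rangle = e^{-i\lambda_L t}|K_L\rangle, \qquad |K_S(t)\rangle = e^{-i\lambda_S t}|K_S\rangle, \tag{11.22} $$
where, for example, $|K_L(t)\rangle$ should be interpreted as the time evolved quantum state $|\psi(t)\rangle$ subject to the condition that $|\psi(0)\rangle = |K_L\rangle$.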
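It is convenient at times to write the expansion of these different eigenstates in a standard neutral meson mixing notation
$$ \begin{pmatrix} |K_L\rangle\\ |K_S\rangle \end{pmatrix} = \begin{pmatrix} p & -q\\ p & q \end{pmatrix} \begin{pmatrix} |K^0\rangle\\ |\bar K^0\rangle \end{pmatrix} \tag{11.23} $$
where
$$ p = \frac{1+\epsilon}{\sqrt{2 + 2|\epsilon|^2}}, \qquad q = \frac{1-\epsilon}{\sqrt{2 + 2|\epsilon|^2}}. \tag{11.24} $$
Note, $p$ and $q$ satisfy the condition $|p|^2 + |q|^2 = 1$, which is required for normalized eigenstates. The expansion of $K^0$ and $\bar K^0$ in terms of $K_L$ and $K_S$ is then
$$ \begin{pmatrix} |K^0\rangle\\ |\bar K^0\rangle \end{pmatrix} = \frac{1}{2pq}\begin{pmatrix} q & q\\ -p & p \end{pmatrix} \begin{pmatrix} |K_L\rangle\\ |K_S\rangle \end{pmatrix}. \tag{11.25} $$
This expression is valuable when the produced states are strangeness eigenstates (e.g., $K^0$) but the time evolution is governed by the mass eigenstates. We will do examples of these considerations in the subsequent discussion.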
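The change of basis in (11.23)–(11.25) is easy to verify numerically. The Python sketch below builds $p$ and $q$ from the measured magnitude and phase of $\epsilon$ quoted above, assuming the parametrization $\epsilon = |\epsilon|e^{i\phi_\epsilon}$, and checks that converting a state to the mass basis and back returns the original coefficients. The helper names are ours.

```python
import cmath
import math

# epsilon = |epsilon| exp(i phi_epsilon), with the central values quoted in the text.
eps = cmath.rect(2.228e-3, 0.9671 * math.pi / 4.0)

norm = math.sqrt(2.0 + 2.0 * abs(eps) ** 2)
p = (1.0 + eps) / norm
q = (1.0 - eps) / norm

def to_mass_basis(c_k0, c_k0bar):
    """Coefficients of (K_L, K_S) for the state c_k0 |K0> + c_k0bar |K0bar>, from (11.25)."""
    a_long = c_k0 / (2.0 * p) - c_k0bar / (2.0 * q)
    a_short = c_k0 / (2.0 * p) + c_k0bar / (2.0 * q)
    return a_long, a_short

def to_flavor_basis(a_long, a_short):
    """Coefficients of (K0, K0bar) for the state a_long |K_L> + a_short |K_S>, from (11.23)."""
    return p * (a_long + a_short), q * (a_short - a_long)

# A pure K0 state has equal K_L and K_S coefficients, 1/(2p).
a_long, a_short = to_mass_basis(1.0, 0.0)
round_trip = to_flavor_basis(a_long, a_short)

print("|p|^2 + |q|^2 =", abs(p) ** 2 + abs(q) ** 2)
print("Im(eps)/Re(eps) =", eps.imag / eps.real)
```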
11.3 CP Eigenstates
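In the previous subsection we identified the $K^0$ state and $\bar K^0$ state by their bound state constituent quarks. Each had definite strangeness according to the single strange quark or anti-strange quark it contains. $K^0$ was identified as a bound state of $d$ and $\bar s$ quarks, and $\bar K^0$ as a bound state of $\bar d$ and $s$ quarks. If we apply the CP operator on these quarks they turn into their anti-quark complements. Thus we recognize that¹
$$ CP|K^0\rangle = |\bar K^0\rangle, \quad\text{and}\quad CP|\bar K^0\rangle = |K^0\rangle. \tag{11.26} $$
The CP eigenstates are therefore the combinations
$$ |K^0_{(+)}\rangle = \frac{1}{\sqrt{2}}\left(|K^0\rangle + |\bar K^0\rangle\right), \tag{11.27} $$
$$ |K^0_{(-)}\rangle = \frac{1}{\sqrt{2}}\left(|K^0\rangle - |\bar K^0\rangle\right), \tag{11.28} $$
which satisfy
$$ CP|K^0_{(+)}\rangle = (+1)|K^0_{(+)}\rangle \quad\text{and}\quad CP|K^0_{(-)}\rangle = (-1)|K^0_{(-)}\rangle. \tag{11.29} $$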
Why would we want to identify CP eigenstates in this system? After all, the mass eigenstates $K_L$ and $K_S$ are not CP eigenstates, nor are the “production eigenstates” $K^0$ and $\bar K^0$ eigenstates of CP. Nevertheless, there are two key reasons why it is helpful to identify the CP eigenstates. First, since $|\epsilon| \ll 1$ in the kaon system, the mass eigenstates are nearly identical to the CP eigenstates, which is important to note. Turning this statement around, the mass eigenstates are not CP eigenstates only because there are small CP violating effects in the kaon system. This is all abstract if there are no experimental indications of this small CP violation. In fact, there is experimental indication, which we will come to shortly, which is best understood by showing that the kaon mass eigenstates each have at least a little bit of CP even and CP odd eigenstate overlap within them. That is the second reason why it is helpful to identify formally the CP eigenstates among the neutral kaons.
¹ More precisely, $CP|K^0\rangle = \eta|\bar K^0\rangle$, where η is an arbitrary and unobservable phase factor such that $|\eta|^2 = 1$. We choose η = 1 out of convenience.
11.4 Neutral Kaon Oscillations and Lifetimes 313
each $s$ or $\bar s$ quark combines with other quarks to create pure flavor kaons. Thus, the kaons are born as flavor eigenstates, either $K^0$ $(d\bar s)$ or $\bar K^0$ $(s\bar d)$. However, these states are not mass eigenstates, and the time evolution of the system is obtained by expanding in terms of its Hamiltonian eigenstates $K_L$ and $K_S$.
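Let us suppose that at time $t = 0$ we have created a pure flavor eigenstate $K^0$. From (11.25) we can expand $K^0$ in terms of the mass eigenstates
$$ |K^0\rangle = \frac{1}{2p}\left(|K_L\rangle + |K_S\rangle\right). \tag{11.30} $$
Let us define the quantum state $|\psi(t)\rangle$ to be the time-evolved wave function subject to the initial condition that $|\psi(0)\rangle = |K^0\rangle$ at $t = 0$. At subsequent times
$$ |\psi(t)\rangle = \frac{1}{2p}\left(|K_L\rangle e^{-i\lambda_L t} + |K_S\rangle e^{-i\lambda_S t}\right) \tag{11.31} $$
$$ \phantom{|\psi(t)\rangle} = \frac{1}{2p}\left(|K_L\rangle e^{-im_L t}e^{-\Gamma_L t/2} + |K_S\rangle e^{-im_S t}e^{-\Gamma_S t/2}\right). \tag{11.32} $$
Due to $\lambda_S \neq \lambda_L$, at a later time $t$ the state $|\psi(t)\rangle$ does not remain a pure $|K^0\rangle$ state and oscillates between $|K^0\rangle$ and $|\bar K^0\rangle$. By expanding the $|K_L\rangle$ and $|K_S\rangle$ states in terms of $|K^0\rangle$ and $|\bar K^0\rangle$ according to (11.23) one finds
$$ |\psi(t)\rangle = \frac{1}{2}\left(e^{-i\lambda_L t} + e^{-i\lambda_S t}\right)|K^0\rangle + \frac{q}{2p}\left(e^{-i\lambda_S t} - e^{-i\lambda_L t}\right)|\bar K^0\rangle. \tag{11.33} $$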
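Utilizing (11.33) one finds that at time $t$ the probability that the $|\psi(t)\rangle$ state is measured to be $|K^0\rangle$ given that it started at $t = 0$ as $|K^0\rangle$ is
$$ \left|\langle K^0|\psi(t)\rangle\right|^2 = \frac{1}{4}\left[e^{-\Gamma_S t} + e^{-\Gamma_L t} + 2\cos((m_L - m_S)t)\,e^{-(\Gamma_S + \Gamma_L)t/2}\right], $$
while the probability that it is instead measured to be $|\bar K^0\rangle$ is
$$ \left|\langle \bar K^0|\psi(t)\rangle\right|^2 = \frac{1}{4}\left|\frac{q}{p}\right|^2\left[e^{-\Gamma_S t} + e^{-\Gamma_L t} - 2\cos((m_L - m_S)t)\,e^{-(\Gamma_S + \Gamma_L)t/2}\right]. \tag{11.37} $$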
The lifetime of $K_L$ is $\tau_L = 5.12 \times 10^{-8}$ s, which is about 570 times longer than the $K_S$ lifetime of $\tau_S = 8.956 \times 10^{-11}$ s. At times well above $\tau_S$ the $|K_S\rangle$ component of $|\psi(t)\rangle$ in (11.32) has almost completely decayed away and therefore $|\psi(t \gg \tau_S)\rangle \propto |K_L\rangle$. Thus, all the branching ratios at those late times are reflective of $K_L$ decays. On the other hand, for times $t \lesssim \tau_S$ essentially the only decays that are taking place are those of the $K_S$ state, since it has such a short lifetime. These considerations allow us to exploit experimentally the regions of time and space where we expect a preponderance of $K_S$ decays and $K_L$ decays.
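The oscillation between $K^0$ and $\bar K^0$ can be made concrete with a short Python sketch that evaluates the amplitudes in (11.33) and compares their squared moduli with the closed-form probability (11.37) and its $K^0$ counterpart. The lifetimes and $\epsilon$ are the values quoted in this chapter; the mass difference `DELTA_M` is an assumed value of about $0.529 \times 10^{10}\,\hbar\,\text{s}^{-1}$ that is not quoted in the text above, and the overall phase $e^{-im_S t}$ is dropped since it cancels in probabilities.

```python
import cmath
import math

TAU_S = 8.956e-11      # s, from the text
TAU_L = 5.12e-8        # s, from the text
DELTA_M = 0.5293e10    # m_L - m_S in units of hbar / s (assumed, not from the text)
EPS = cmath.rect(2.228e-3, 0.9671 * math.pi / 4.0)
Q_OVER_P = (1.0 - EPS) / (1.0 + EPS)

def amplitudes(t):
    """(<K0|psi(t)>, <K0bar|psi(t)>) from (11.33), with m_S set to zero."""
    lam_s = -0.5j / TAU_S
    lam_l = DELTA_M - 0.5j / TAU_L
    e_s = cmath.exp(-1j * lam_s * t)
    e_l = cmath.exp(-1j * lam_l * t)
    return 0.5 * (e_l + e_s), 0.5 * Q_OVER_P * (e_s - e_l)

def probabilities(t):
    """Closed-form probabilities to find K0 and K0bar at time t."""
    g_s, g_l = 1.0 / TAU_S, 1.0 / TAU_L
    interference = 2.0 * math.cos(DELTA_M * t) * math.exp(-0.5 * (g_s + g_l) * t)
    same = 0.25 * (math.exp(-g_s * t) + math.exp(-g_l * t) + interference)
    flipped = 0.25 * abs(Q_OVER_P) ** 2 * (math.exp(-g_s * t) + math.exp(-g_l * t) - interference)
    return same, flipped

for t in (0.0, 1.0e-10, 5.0e-10, 1.0e-8):
    same, flipped = probabilities(t)
    print(f"t = {t:.1e} s: P(K0) = {same:.4f}, P(K0bar) = {flipped:.4f}")
```

At late times both probabilities approach $\tfrac14 e^{-\Gamma_L t}$ up to the factor $|q/p|^2$, reflecting the surviving $K_L$ component.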
We can rephrase the above considerations in terms of spatial resolution instead of time resolution of $K_L$ and $K_S$ decays. The decay lengths $c\tau$ for $K_S$ and $K_L$ are ∼ 2.7 cm and ∼ 15 m, respectively. The good separation in lifetime and the not-too-large macroscopic distances that the kaons travel give experiments the capability of creating circumstances that can separate $K_L$ decays from $K_S$ decays. For example, if one produces the kaons relativistically, or nearly relativistically, in the manner above (pure $K^0$ at $t = 0$), one can observe decays at distances long enough (well past $c\tau$ of $K_S$) where only $K_L$ states exist. This “$K_L$ region” must be adjusted for the specific kinematics in each experiment, especially if the produced kaons are not relativistic. In the $K_L$ region there are no $K_S$ and plenty of $K_L$ states present, and what is measured can be interpreted as pure $K_L$ decays.
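To see how cleanly the two components separate in space, the following Python sketch computes the decay lengths from the lifetimes quoted above and the surviving fractions of each component after a given flight distance. The boost factor $\beta\gamma = 10$ and the 5 m distance are hypothetical choices for illustration only.

```python
import math

C_LIGHT = 299_792_458.0  # m/s
TAU_S = 8.956e-11        # s, from the text
TAU_L = 5.12e-8          # s, from the text

def decay_length(tau, beta_gamma=1.0):
    """Mean decay length beta*gamma*c*tau in meters."""
    return beta_gamma * C_LIGHT * tau

def surviving_fraction(distance, tau, beta_gamma):
    """Fraction of a component that has not yet decayed after `distance` meters."""
    return math.exp(-distance / decay_length(tau, beta_gamma))

ctau_s = decay_length(TAU_S)
ctau_l = decay_length(TAU_L)

# Hypothetical beam: beta*gamma = 10, detector region starting 5 m downstream.
beta_gamma, distance = 10.0, 5.0
frac_s = surviving_fraction(distance, TAU_S, beta_gamma)
frac_l = surviving_fraction(distance, TAU_L, beta_gamma)

print(f"c tau_S = {100 * ctau_s:.2f} cm, c tau_L = {ctau_l:.2f} m")
print(f"after {distance} m: K_S fraction = {frac_s:.1e}, K_L fraction = {frac_l:.3f}")
```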
Let us consider possible hadronic decays of the kaons. The two candidates for decay that are kinematically viable are 2π and 3π final states. By 2π we mean $K^0 \to \pi^0\pi^0$ and $\pi^+\pi^-$, and by 3π we mean $K^0 \to \pi^0\pi^0\pi^0$ and $\pi^0\pi^+\pi^-$. These two classes of final states are pure CP eigenstate final states, where 2π is pure CP even (+1) and 3π is pure CP odd (−1). This can be understood by recognizing that both final states are invariant under charge conjugation C and that the pions are pseudoscalar particles ($P_\pi = -1$). The 2π final state is even under parity, $P_\pi P_\pi = (-1)(-1) = +1$, and the 3π final state is odd under parity, $P_\pi P_\pi P_\pi = (-1)^3 = -1$, therefore
$$ CP|2\pi\rangle = +|2\pi\rangle, \qquad CP|3\pi\rangle = -|3\pi\rangle. \tag{11.38} $$
Here, it is important that the kaon states are spin-0, so that in their rest frame the total angular momentum is 0. Since the pions are also spinless, the orbital angular momentum is 0, so the contribution to the parity of the pion states is always $(-1)^\ell = (-1)^0 = 1$. In order to produce a +1 CP eigenstate in the final state, the parent particle that gave rise to it must have at least some CP even component to it. Likewise, in order to produce a −1 CP eigenstate in the final state, the parent particle that gave rise to this must have at least some CP odd component to it. In other words, to decay into 2π or 3π the parent state must have a component of $K^0_{(+)}$ or $K^0_{(-)}$ respectively:
$$ K^0_{(+)} \to \pi^0\pi^0,\ \pi^+\pi^- \quad\text{and}\quad K^0_{(-)} \to \pi^0\pi^0\pi^0,\ \pi^0\pi^+\pi^- \quad\text{are allowed.} \tag{11.39} $$
If we compare (11.12) and (11.13) with (11.27) and (11.28), noting that $|\epsilon| \ll 1$, we see that the $K_L$ eigenstate is nearly a pure CP odd eigenstate ($K^0_{(-)}$) and the $K_S$ eigenstate is nearly a pure CP even eigenstate ($K^0_{(+)}$). The overlap of $K_L$ and $K_S$ in terms of these CP eigenstates is
11.6 Direct CP Violation in Kaon Decay 315
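$$ |K_L\rangle = \frac{1}{\sqrt{1 + |\epsilon|^2}}\left(|K^0_{(-)}\rangle + \epsilon\,|K^0_{(+)}\rangle\right), \tag{11.40} $$
$$ |K_S\rangle = \frac{1}{\sqrt{1 + |\epsilon|^2}}\left(|K^0_{(+)}\rangle + \epsilon\,|K^0_{(-)}\rangle\right), \tag{11.41} $$
where again $\epsilon$ is the CP violating parameter defined by (11.14).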
A clear signature for CP violation then is to witness decays of $K_L$ into ππ, which
can only take place through its $K^0_{(+)}$ component, which in turn can only be present if
there is CP violation in the theory. Indeed, that is what was found and CP violation
in the kaon sector was established.2
Our description above of how to determine that there is CP violation present in the
kaon system has some additional subtleties beyond just making a K L and watching
it decay. First, it must be noted that there is no direct production mechanism for
making pure $K_L$ states. What is really produced are flavor eigenstates (‘strangeness
eigenstates’) $K^0$ and $\bar K^0$ which have nearly equal parts of $K_S$ and $K_L$ in them. How
do we create a circumstance such that when we see a decay into π+π− we know it
came from a $K_L$ meson? As mentioned above, one key method of identification is
to wait long enough that only the $K_L$ component survives.
Let us review briefly the experimental evidence. As we just mentioned, $K_L$ decays
to pions take place through its overlap with the CP even 2π final states and the CP odd
3π final states. Recall that the $K_L$ would be a pure CP odd eigenstate if the CP violation
parameter $\epsilon = 0$ (see (11.40)). Therefore, a clear signal for CP violation in the kaon
system, and therefore $\epsilon \neq 0$, is a non-zero rate for $K_L \to 2\pi$. Experimentally one
finds (see RPP tables)
$$\frac{\Gamma(K_L \to \pi^+\pi^-)}{\Gamma(K_L \to \text{all})} = (1.967 \pm 0.010)\times 10^{-3}, \quad\text{and} \qquad (11.42)$$
$$\frac{\Gamma(K_L \to \pi^0\pi^0)}{\Gamma(K_L \to \text{all})} = (8.64 \pm 0.06)\times 10^{-4}. \qquad (11.43)$$
The decay rates of K L → π π are small but they are nevertheless nonzero, indicating
a small CP violating effect in kaon mixing and kaon decays.
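As a rough numerical illustration, the branching ratios in (11.42) and (11.43) can be turned into the magnitude of the ratio of $K_L$ to $K_S$ decay amplitudes into two pions. The sketch below assumes approximate $K_S$ branching ratios of 0.692 (π+π−) and 0.307 (π0π0), which are not quoted in this section, together with the lifetime ratio τ_L/τ_S ≈ 570 given in the caption of Fig. 11.3. The function and variable names are purely illustrative.

```python
import math

# Branching ratios quoted in (11.42) and (11.43)
br_kl_pipi_charged = 1.967e-3
br_kl_pipi_neutral = 8.64e-4

# Assumed inputs (not given in this section): approximate K_S branching ratios
br_ks_pipi_charged = 0.692
br_ks_pipi_neutral = 0.307
# Lifetime ratio tau_L / tau_S ~ 570, as quoted in the caption of Fig. 11.3
tau_ratio = 570.0


def amplitude_ratio(br_l, br_s, tau_l_over_tau_s):
    """|A(K_L -> f) / A(K_S -> f)| from branching ratios and the lifetime ratio."""
    # Gamma(K -> f) = BR / tau, so the rate ratio is (BR_L / BR_S) * (tau_S / tau_L)
    rate_ratio = (br_l / br_s) / tau_l_over_tau_s
    return math.sqrt(rate_ratio)


eta_charged = amplitude_ratio(br_kl_pipi_charged, br_ks_pipi_charged, tau_ratio)
eta_neutral = amplitude_ratio(br_kl_pipi_neutral, br_ks_pipi_neutral, tau_ratio)
print(f"|amplitude ratio|, pi+ pi-: {eta_charged:.2e}")
print(f"|amplitude ratio|, pi0 pi0: {eta_neutral:.2e}")
```

Both ratios come out at the level of a few times 10^-3 and nearly equal to each other, which is the sense in which the CP violating effect is small but clearly nonzero.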
In the above we have approximated all CP violating effects as coming through the
CP violating parameter $\epsilon$ that affects the misalignment of the pure CP eigenstates
$K^0_{(+)}$ and $K^0_{(-)}$ with respect to the mass eigenstates $K_L$ and $K_S$, as shown in (11.40)
and (11.41). However, there is another source of CP violation in the kaon system
that affects observables. This is the “direct CP violation” that occurs from the CP
eigenstate decaying into a final state with different CP charge.
2 J. H. Christenson, J. W. Cronin, V. L. Fitch, “Evidence for the 2π Decay of the $K_2^0$ Meson.” Phys.
The disentangling of the direct and indirect CP-violating effects starts with defin-
ing two parameters that can be extracted out of observable ratios of CP violating to
CP conserving decay amplitudes for K L and K S decays to π 0 π 0 and π + π − final
states. Define:
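$$\eta_{+-} = \frac{\langle \pi^+\pi^-|H_{\text{weak}}|K_L\rangle}{\langle \pi^+\pi^-|H_{\text{weak}}|K_S\rangle}, \qquad (11.44)$$
$$\eta_{00} = \frac{\langle \pi^0\pi^0|H_{\text{weak}}|K_L\rangle}{\langle \pi^0\pi^0|H_{\text{weak}}|K_S\rangle}. \qquad (11.45)$$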
Evaluating these requires some care in analyzing weak decay amplitudes of kaon
decays into two pions. Final state pions with different isospins will have differ-
ent amplitudes for the decay. Therefore, it is important to decompose the two-pion
systems into pure isospin eigenstates with their appropriate Clebsch-Gordan coeffi-
cients:
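$$|\pi^0\pi^0\rangle = \sqrt{\tfrac{1}{3}}\,|(\pi\pi)_{I=0}\rangle - \sqrt{\tfrac{2}{3}}\,|(\pi\pi)_{I=2}\rangle, \qquad (11.46)$$
$$|\pi^+\pi^-\rangle = \sqrt{\tfrac{2}{3}}\,|(\pi\pi)_{I=0}\rangle + \sqrt{\tfrac{1}{3}}\,|(\pi\pi)_{I=2}\rangle. \qquad (11.47)$$
$$\langle(\pi\pi)_I|H_{\text{weak}}|K^0\rangle = A_I e^{i\delta_I}, \quad\text{and}\quad \langle(\pi\pi)_I|H_{\text{weak}}|\bar K^0\rangle = A_I^* e^{i\delta_I}, \qquad (11.48)$$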
for I = 0, 2. The phases δ I are due entirely to final state pion interaction and thus
are the same for both amplitudes. Then A2 is complex, but without loss of generality
we can take A0 to be real by a choice of phase for the two-pion states.
To compute η+− and η00 one needs to expand the amplitudes of the numerator
and denominator of (11.44) and (11.45):
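$$\langle\pi^+\pi^-|H_{\text{weak}}|K_S\rangle = \left(\sqrt{\tfrac{2}{3}}\,\langle(\pi\pi)_{I=0}| + \sqrt{\tfrac{1}{3}}\,\langle(\pi\pi)_{I=2}|\right) H_{\text{weak}} \left(p\,|K^0\rangle + q\,|\bar K^0\rangle\right),$$
$$\langle\pi^0\pi^0|H_{\text{weak}}|K_L\rangle = \left(\sqrt{\tfrac{1}{3}}\,\langle(\pi\pi)_{I=0}| - \sqrt{\tfrac{2}{3}}\,\langle(\pi\pi)_{I=2}|\right) H_{\text{weak}} \left(p\,|K^0\rangle - q\,|\bar K^0\rangle\right),$$
$$\eta_{+-} = \epsilon + \epsilon', \qquad (11.49)$$
$$\eta_{00} = \epsilon - 2\epsilon', \qquad (11.50)$$
where
$$\epsilon' = \frac{i}{\sqrt{2}}\,\frac{\mathrm{Im}[A_2]}{A_0}\, e^{i(\delta_2 - \delta_0)}. \qquad (11.51)$$
$$\frac{\Gamma(K_L \to \pi^+\pi^-)/\Gamma(K_S \to \pi^+\pi^-)}{\Gamma(K_L \to \pi^0\pi^0)/\Gamma(K_S \to \pi^0\pi^0)} = \left|\frac{\eta_{+-}}{\eta_{00}}\right|^2 = 1 + 6\,\mathrm{Re}\!\left(\frac{\epsilon'}{\epsilon}\right). \qquad (11.52)$$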
This can be compared with CERN’s full NA48 experimental result4 of
From these experimental results we see that the numerical impact of direct CP vio-
lation in kaon decays is nonzero and measurable but subdominant to the overall
manifestation of CP violation in the neutral kaon system.
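The neutral kaons also have substantial branching fractions into semileptonic final
states. The RPP reports
$$\frac{\Gamma(K_L \to \pi^\pm e^\mp \nu_e)}{\Gamma(K_L \to \text{all})} = (40.55 \pm 0.11)\%, \qquad (11.56)$$
$$\frac{\Gamma(K_L \to \pi^\pm \mu^\mp \nu_\mu)}{\Gamma(K_L \to \text{all})} = (27.04 \pm 0.07)\%. \qquad (11.57)$$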
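These decays are made possible by the amplitudes of the $K^0$ and $\bar K^0$ strangeness eigenstates
into semi-leptonic states due to the W-mediated weak interactions
$$A_\ell = \langle \pi^- \ell^+ \nu_\ell | H_{\text{weak}} | K^0 \rangle, \quad\text{and}\quad A_\ell^* = \langle \pi^+ \ell^- \bar\nu_\ell | H_{\text{weak}} | \bar K^0 \rangle. \qquad (11.58)$$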
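The corresponding Feynman diagrams are shown in Fig. 11.2. Note, the following
decay amplitudes are to a good approximation zero in comparison to the above
amplitudes and play little role in semi-leptonic decays:
$$B_\ell = \langle \pi^- \ell^+ \nu_\ell | H_{\text{weak}} | \bar K^0 \rangle \approx 0, \quad\text{and}\quad B_\ell^* = \langle \pi^+ \ell^- \bar\nu_\ell | H_{\text{weak}} | K^0 \rangle \approx 0. \qquad (11.59)$$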
0
1 −im S t − S t/2
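and
$$\langle \pi^+ \ell^- \bar\nu_\ell | H_{\text{weak}} | \psi(t)\rangle = \frac{A_\ell^*}{2}\,\bigl(1 - 2\,\mathrm{Re}(\epsilon)\bigr)\left(e^{-i m_S t} e^{-\Gamma_S t/2} - e^{-i m_L t} e^{-\Gamma_L t/2}\right). \qquad (11.61)$$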
Fig. 11.2 Tree-level diagrams for the decays of neutral kaons to leptons $K^0 \to \pi^- \ell^+ \nu_\ell$ (left) and
$\bar K^0 \to \pi^+ \ell^- \bar\nu_\ell$ (right), with amplitudes given in (11.58)
11.7 Neutral Kaon Decays to Leptons 319
where $|\psi(t)\rangle$ is the time-dependent state subject to the boundary condition that
$|\psi(0)\rangle = |K^0\rangle$ (see (11.33)), and we are working to first order in $\epsilon$. From these
amplitudes we can compute the decay rates into $\pi^- \ell^+ \nu_\ell$ and $\pi^+ \ell^- \bar\nu_\ell$ to be
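$$A_{\ell^\pm}(t) = \frac{\Gamma_{\ell^+}(t) - \Gamma_{\ell^-}(t)}{\Gamma_{\ell^+}(t) + \Gamma_{\ell^-}(t)} \qquad (11.66)$$
$$= \frac{\beta(t) + 2[\alpha(t) - \beta(t)]\,\mathrm{Re}(\epsilon)}{\alpha(t) - 2[\alpha(t) - \beta(t)]\,\mathrm{Re}(\epsilon)}. \qquad (11.67)$$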
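The asymmetry has a particularly simple and useful form in the limit of large time
compared to the $K_S$ lifetime ($t\,\Gamma_S \gg 1$), where the $K_L$ decays are dominating:
$$A_{\ell^\pm}(t) \approx \frac{\beta}{\alpha} + 2\,\mathrm{Re}(\epsilon) = 2\cos(\Delta m\, t)\, e^{-(\Gamma_S - \Gamma_L)t/2} + 2\,\mathrm{Re}(\epsilon). \qquad (11.68)$$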
By careful measurements over time of this lepton charge asymmetry of a beam of
kaons that are produced as $K^0$ at time t = 0 one is able to extract both the mass
splitting $\Delta m$ and the CP violating parameter $\mathrm{Re}(\epsilon)$. Note, in the limit of long time
$t\,\Gamma_S \gg 1$ the asymmetry $A_{\ell^\pm}(t)$ is dominated by the second term and one can directly
measure $\mathrm{Re}(\epsilon)$ from the experimental result:
$$\mathrm{Re}(\epsilon) = \frac{1}{2}\, A_{\ell^\pm}(t), \quad\text{for}\quad t \gg \Gamma_S^{-1}. \qquad (11.69)$$
The experimental extraction of $\mathrm{Re}(\epsilon)$ from this technique was already good enough
in the 1970s to establish that its value lies in the range $0.0016 < \mathrm{Re}(\epsilon) < 0.0017$, which can
be compared with today’s experimentally determined result using all techniques
$\mathrm{Re}(\epsilon) = (1.66 \pm 0.02) \times 10^{-3}$ (RPP). However, the primary value of this analysis
is the determination of the mass splitting.
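The time dependence of the asymmetry can be explored with a short numerical sketch of (11.67). The forms used below for α(t) and β(t) are assumptions, chosen so that β/α reproduces the large-time limit (11.68): α(t) is taken as the sum of the $K_S$ and $K_L$ exponentials and β(t) as the interference term. Times are measured in units of τ_S, and the value of Δm τ_S assumes τ_S ≈ 8.95 × 10^-11 s, which is not quoted in the text.

```python
import math

# Assumed inputs, in units where the K_S lifetime tau_S = 1
GAMMA_S = 1.0            # K_S width
GAMMA_L = 1.0 / 570.0    # K_L width, from tau_L / tau_S ~ 570
DELTA_M = 0.472          # Delta m * tau_S, assuming tau_S ~ 8.95e-11 s and Delta m ~ 5.27e9 / s
RE_EPS = 1.66e-3         # Re(epsilon), the RPP value quoted in the text


def alpha(t):
    # Assumed form: sum of the K_S and K_L exponential decays
    return math.exp(-GAMMA_S * t) + math.exp(-GAMMA_L * t)


def beta(t):
    # Assumed form: K_S-K_L interference term oscillating with Delta m
    return 2.0 * math.cos(DELTA_M * t) * math.exp(-(GAMMA_S + GAMMA_L) * t / 2.0)


def asymmetry(t):
    """Lepton charge asymmetry in the form of (11.67)."""
    shift = 2.0 * (alpha(t) - beta(t)) * RE_EPS
    return (beta(t) + shift) / (alpha(t) - shift)


for t in (0.0, 2.0, 5.0, 10.0, 30.0):
    print(f"t/tau_S = {t:5.1f}   A = {asymmetry(t):+.5f}")
```

With these assumptions the asymmetry starts at 1 for a pure $K^0$ beam, oscillates while the $K_S$ component is still present, and settles near 2 Re(ε) at late times.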
It is traditional to quote the mass difference in units of inverse seconds by virtue
of the oscillatory $\cos(\Delta m\, t)$ factors that (11.62) and (11.63) depend on. For example,
the best fit value for $\Delta m$ according to the RPP is
Fig. 11.3 Lepton asymmetry of neutral kaon decays $A_{\ell^\pm}$ as defined in (11.67), as a function of $t/\tau_S$,
where $\tau_S = 1/\Gamma_S$ is the $K_S^0$ lifetime. The beam at t = 0 consists of pure $K^0$ and $\bar K^0$ tagged states
which then oscillate over time according to quantum mechanical evolution. The $K_L$ lifetime is much
longer and corresponds to $\tau_L/\tau_S \approx 570$. Thus, for $t/\tau_S > 10$ in the figure, all the decays are to a
good approximation $K_L^0$ decays, and the asymmetry asymptotes here to $2\,\mathrm{Re}(\epsilon) \approx 0.003$. This plot
is taken from Adler et al. Phys. Lett. B363, 237 (1995) who found $\Delta m = (5.274 \pm 0.029) \times 10^{9}$/s
from the best fit to their data at the CPLEAR detector at CERN
11.8 $K_L$ − $K_S$ Mass Difference
The mass difference of the $K_L$ and $K_S$ is small and is in principle computable within
the Standard Model. The mass splitting is due to level repulsion originating from
the $\Delta S = \pm 2$ off-diagonal terms in the interaction Hamiltonian in (11.6):
$\Delta m = m_L - m_S \approx 2\,\mathrm{Re}[M_{12}] = 2\,\mathrm{Re}[\mu]$. (See (11.11) in the approximation that μ and γ
are real.) The matrix element $M_{12}$ can be computed in quantum field theory at leading
order in perturbation theory as
$$M_{12} = \frac{1}{2 m_K}\, \langle \bar K^0 | H_{\text{eff}}^{\Delta S=2} | K^0 \rangle, \qquad (11.71)$$
where $H_{\text{eff}}^{\Delta S=2}$ is the part of the effective Hamiltonian density that changes strangeness
by 2 units, which arises from the Feynman diagrams shown in Fig. 11.1. (The normalization
of $|K^0\rangle$ and $\langle \bar K^0|$ on the right side of (11.71) is chosen for consistency
with the conventional hadronic matrix element in (11.77) below.)
A leading-order 1-loop computation of the diagrams of Fig. 11.1 gives a result for
the effective Hamiltonian density of the form
$$H_{\text{eff}}^{\Delta S=2} = \frac{g^4}{16\pi^2 m_W^4} \sum_{i,j=u,c,t} V_{id}^* V_{is} V_{jd}^* V_{js}\, F(m_i^2, m_j^2, M_W^2)\, O_{sd}, \qquad (11.72)$$
In (11.72) there is an insertion of g and a CKM matrix element for each W interaction
vertex, a factor of $1/m_W^2$ for each W-boson propagator, the factor $1/16\pi^2$ is a typical
1-loop integration suppression factor, and F is a kinematic function with dimensions
of [mass]$^2$. From the last fact, one might naively expect that $H_{\text{eff}}^{\Delta S=2}$ should have
contributions that scale like $g^4/m_W^2$ and $g^4 m_t^2/m_W^4$. However, the true result is much
smaller. To see why, note that if all of the up-type quarks i, j = u, c, t had the same
mass, then the result would actually vanish. This is because if all $m_i^2$ were the same,
then the kinematic function F would contribute the same to each term, and so the
result would be proportional to $\sum_i V_{id}^* V_{is}$, which vanishes due to unitarity of the
CKM matrix, $\sum_i V_{ik}^* V_{il} = \delta_{kl}$. The same applies to the summed index j = u, c, t.
This means that in particular, the contributions to $H_{\text{eff}}^{\Delta S=2}$ must vanish in the limit
The suppression due to cancellation from CKM unitarity, which also occurs in many
other contexts, is called the Glashow-Iliopoulos-Maiani (GIM) mechanism.
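The unitarity cancellation behind the GIM mechanism can be checked numerically. The sketch below builds a unitary 3×3 matrix in the standard three-angle, one-phase parameterization, with assumed illustrative values for the mixing parameters (roughly the size of the measured ones, but not taken from the text). It then compares the sum of $V_{id}^* V_{is}$ with equal weights against a sum weighted by $m_i^2$, which is used here only as a crude stand-in for the kinematic function F.

```python
import cmath
import math


def ckm_matrix(s12, s23, s13, delta):
    """Unitary 3x3 matrix in the standard parameterization (rows u, c, t; columns d, s, b)."""
    c12, c23, c13 = (math.sqrt(1.0 - s * s) for s in (s12, s23, s13))
    phase = cmath.exp(1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 / phase],
        [-s12 * c23 - c12 * s23 * s13 * phase, c12 * c23 - s12 * s23 * s13 * phase, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * phase, -c12 * s23 - s12 * c23 * s13 * phase, c23 * c13],
    ]


# Assumed illustrative mixing parameters (hypothetical inputs for this sketch)
V = ckm_matrix(s12=0.225, s23=0.042, s13=0.0037, delta=1.2)
D, S = 0, 1  # column indices for the d and s quarks

# lambda_i = V_id^* V_is for i = u, c, t
lam = [V[i][D].conjugate() * V[i][S] for i in range(3)]

# Equal up-type masses: F is a common factor and the sum vanishes by unitarity
equal_mass_sum = sum(lam)

# Unequal masses: weight each term by m_i^2 in GeV^2 (crude stand-in for F)
masses = [0.002, 1.5, 173.0]
weighted_sum = sum(l * m * m for l, m in zip(lam, masses))

print(f"|sum_i V_id^* V_is| = {abs(equal_mass_sum):.1e}")
print(f"|mass-weighted sum| = {abs(weighted_sum):.3f} GeV^2")
```

The equal-weight sum vanishes to machine precision, while the mass-weighted sum does not, so the surviving contributions are controlled by the quark mass differences.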
This still leaves open the possibility that the virtual top-quark contributions
proportional to $m_t^2$ could dominate. However, the top-quark contributions are suppressed
by very small CKM matrix elements which more than compensate for the large mass
enhancement. A naive estimate, which can be verified by a more involved loop
integral calculation, is that the ratio of magnitudes of contributions proportional to $m_t^2$
and $m_c^2$ is roughly
$$\frac{|V_{ts} V_{td}^*|^2\, m_t^2}{|V_{cs} V_{cd}^*|^2\, m_c^2} \approx 0.04, \qquad (11.74)$$
so the charm-quark contribution dominates and
$$H_{\text{eff}}^{\Delta S=2} = \eta\, \frac{g^4 m_c^2}{128\pi^2 m_W^4}\, (V_{cs} V_{cd}^*)^2\, O_{sd}, \qquad (11.75)$$
where η is a number of order unity that reflects the sizeable and complicated effects
of higher-order corrections. Now, using $|V_{cs}| \approx \cos\theta_C \approx 0.97$ and $|V_{cd}| \approx \sin\theta_C$, we find
$$\Delta m = \eta\, \frac{g^4 m_c^2}{128\pi^2 m_W^4}\, \sin^2\theta_C \cos^2\theta_C\, \frac{1}{m_K}\, \langle \bar K^0 | O_{sd} | K^0 \rangle. \qquad (11.76)$$
The remaining hadronic matrix element in (11.76) is rather difficult to obtain reliably,
as it is inherently non-perturbative. It can be parameterized as
$$\langle \bar K^0 | O_{sd} | K^0 \rangle = \frac{4}{3}\, f_K^2 m_K^2 B, \qquad (11.77)$$
where $f_K \approx 113$ MeV is the kaon decay constant, and B is another dimensionless
quantity of order unity. (The constant $f_K$ is a universal non-perturbative parameter
which appears in many other kaon decay matrix elements, but one must be careful
because it is often defined in a normalization that makes it larger by a factor $\sqrt{2}$.)
Thus we get as a rough leading order estimate, and putting in the numbers including
$m_c = 1.5$ GeV:
$$\Delta m = \eta B\, \frac{g^4 m_c^2 f_K^2 m_K \sin^2\theta_C \cos^2\theta_C}{96\pi^2 m_W^4} \approx \eta B\, (3 \times 10^{-15}\ \text{GeV}), \qquad (11.78)$$
which is of the same order as the experimental result of $3.5 \times 10^{-15}$ GeV quoted
above in (11.16). Historically, this is how Gaillard and Lee predicted the charm
quark mass in 1974 before its discovery, by identifying the value of $m_c$ that gave a
theory prediction for the kaon mass splitting $\Delta m$ equal to the experimental result.
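The numbers entering the estimates (11.74) and (11.78) can be checked with a few lines of arithmetic. In this sketch only $m_c$ = 1.5 GeV and $f_K$ ≈ 113 MeV are taken from the text; the weak coupling g, the W and kaon masses, the Cabibbo angle, the top mass and the CKM magnitudes are assumed typical values. The conversion of the measured Δm from inverse seconds to GeV uses ħ ≈ 6.582 × 10^-25 GeV s.

```python
import math

# Inputs in GeV. m_c and f_K are the values used in the text;
# g, m_W, m_K, m_t and sin(theta_C) are assumed typical values.
g = 0.65
m_c = 1.5
m_t = 173.0
m_W = 80.4
m_K = 0.4976
f_K = 0.113
sin_c = 0.225
cos_c = math.sqrt(1.0 - sin_c**2)


def delta_m_estimate(eta_B=1.0):
    """Leading-order estimate (11.78) for the K_L - K_S mass difference in GeV."""
    numerator = g**4 * m_c**2 * f_K**2 * m_K * sin_c**2 * cos_c**2
    return eta_B * numerator / (96.0 * math.pi**2 * m_W**4)


# Ratio (11.74) of top to charm contributions, with assumed CKM magnitudes
V_ts, V_td, V_cs, V_cd = 0.041, 0.0086, 0.97, 0.22
ratio_top_charm = (V_ts * V_td * m_t) ** 2 / (V_cs * V_cd * m_c) ** 2

# Convert a mass difference quoted in inverse seconds to GeV using hbar
HBAR_GEV_S = 6.582e-25
delta_m_measured = 5.274e9 * HBAR_GEV_S  # CPLEAR value from the caption of Fig. 11.3

print(f"top/charm ratio:      {ratio_top_charm:.3f}")
print(f"estimate (eta B = 1): {delta_m_estimate():.2e} GeV")
print(f"measured:             {delta_m_measured:.2e} GeV")
```

With ηB set to 1 the estimate and the measured value agree to within a few tens of percent, which is as much as a leading-order estimate of this kind can be expected to deliver.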
Since then, it has been understood in increasingly greater detail that higher order
corrections to $\Delta m$, not reviewed here, are numerically important. These include not
just the factors η and B mentioned above, but non-negligible long-distance effects
not captured at all by the effective Hamiltonian $H_{\text{eff}}^{\Delta S=2}$.
As a matter of terminology, the existence of $\Delta m$ is an example of a flavor-
changing neutral current, or FCNC. The name reflects the fact that the change in
flavor (strangeness, in the case of $\Delta m$ with $\Delta S = 2$) is not accompanied by a net
change in electric charge of the hadron. In contrast, the $\Delta S = \pm 1$ decays of neutral
kaons to charged pions, depicted in Fig. 11.2, are examples of charged current
flavor-changing processes.
Some other examples of FCNCs are the $\Delta S = 1$ processes $K^\pm \to \pi^\pm \nu\bar\nu$ and
$K^0 \to \mu^+\mu^-$, the $\Delta C = 2$ process of $D^0$–$\bar D^0$ mixing, and the $\Delta B = 2$ process of
$B^0$–$\bar B^0$ mixing. The GIM mechanism of partial or full cancellation of the would-be
leading contributions to FCNCs due to the unitarity of the CKM matrix applies much
more generally. This makes the precision measurement of FCNCs a powerful tool for
indirect constraints on physics beyond the Standard Model, because hypothetical new
physics effects governed by couplings not involving the CKM matrix, and therefore
not subject to the GIM mechanism, can in principle overwhelm the small sub-leading
order Standard Model contributions. This can occur even when the new particles are
heavier than the TeV scale and beyond direct reach at colliders.
Problems
1. Find approximate numerical values for the complex parameters μ and γ defined
in (11.7) that fit the experimental central values for the observables given in
(11.16)–(11.21).
2. Compute the coefficients a− (t) and a+ (t) of
subject to the initial condition that $|\psi(0)\rangle = |K^0\rangle$. Compute the probability that
a measurement would find the system in the state $|K^0_{(-)}\rangle$ at time t.
3. Consider the $\Delta S = 1$ decay process $K^0 \to \mu^+\mu^-$ in the Standard Model.
(a) At tree level you can try to draw diagrams involving γ and Z exchange in the
s channel. Explain why these diagrams vanish identically.
(b) At 1-loop order, you can draw two diagrams, each involving a pair of virtual
W bosons. Use the CKM matrix factors to discuss how the GIM mechanism
works to suppress the amplitude beyond naive expectation in this case. Make
a rough order-of-magnitude estimate of the resulting contribution to the decay
rate.
Neutrinos
12
Neutrinos were a somewhat neglected sector of the Standard Model for many years
since they were originally thought to be massless without much complexity to con-
sider. Massless neutrinos were accommodated within the old SM by disallowing any
gauge-singlet right-handed neutrinos, thereby forbidding any gauge-invariant renor-
malizable mass term for neutrinos. Experimental and theoretical progress over the
years began to point to neutrinos having mass, and now that fact is well established.
Earlier in Sect. 10.4 we described how neutrinos can obtain mass. In this chapter
we describe the unique experimental implications of massive neutrinos. We first
describe the sources of copious neutrino fluxes with which one can conduct
experiments to infer neutrino properties. One of the first strong pieces of evidence that neutrinos
might not be massless fermions was the solar neutrino deficit, where experimentalists
measured the neutrino flux coming from the sun and found too few neutrinos. The results were
not compatible with the massless hypothesis. The reason for this, and the effect that
pervades this entire chapter, is that massive neutrinos oscillate in their flavor content
over time as they propagate. In other words, mass eigenstates and flavor eigenstates
are not synonymous, and a neutrino produced as an electron flavor eigenstate will
oscillate to other flavors over time:
ψ(t) = α(t)νe + β(t)νμ + γ (t)ντ , where α(0) = 1 and β(0) = γ (0) = 0. (12.1)
Neutrino masses are required to enable β(t), γ(t) ≠ 0 for future times, and thus
enable the νe component of the produced state to reduce or “disappear” over time
(|α(t)| < 1). A similar situation develops for a muon neutrino or tau neutrino pro-
duced at t = 0—they will not maintain their flavor identity over time. Experiment
aims to determine the details of these flavor oscillations.
Because of these flavor oscillations the reader should pay close attention to the
type of neutrino (its flavor) that is produced from the sources described in Sect. 12.1.
© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 325
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_12
One should equally keep in mind that the pure flavor content at birth does not last long,
and the neutrino state will oscillate into a different identity with a linear superposition
of all three flavor components.
The qualitative description above will be upgraded to a precise quantitative
description of the propagation of neutrinos through vacuum and through matter
in the subsequent sections. From the detailed formalism developed the reader will
be able to see how the parameters of the neutrino sector are pinned down from
experimental data. The measurement process has its own rich history, and we illustrate
some of that history by giving an introduction to methods of neutrino detection. This
includes brief discussions of various neutrino experiments that have been crucial to
the development of our understanding of neutrino properties. We do not intend to be
exhaustive in this survey of experiment. For a comprehensive listing and survey of
the vast experimental landscape of neutrino experiments the reader is encouraged to
consult the Particle Data Group’s Review of Particle Properties (RPP). In the final
section we summarize current understanding of neutrino properties (their masses
and mixings) and discuss some of the future goals in neutrino physics.
We will discuss many different experiments within this chapter. To make it easier
for the reader to keep track of all of them we include Table 12.1, which briefly summarizes
each of them. The description of each experiment will become more understandable
as the reader progresses through the chapter.
In order to carry out experiments and observations that hope to measure properties
of neutrinos we must have copious sources because the interaction of neutrinos
with other particles is weak. We should not only identify which natural and human-
made sources can provide a large flux of neutrinos, but we should also have a good
understanding of the properties of these neutrinos at their birth, most especially their
energies, their expected flavor profiles, and their distances from our detectors.
The main sources of neutrinos that we describe below are from the Sun, from
supernovae, from cosmic rays (“atmospheric neutrinos”), from nuclear reactors, and
from accelerator sources. We discuss each of these in turn.
The sun is a copious source of neutrinos. The primary source of neutrinos from the
sun is from the basic fusion energy process of
p + p → d + e+ + νe . (12.2)
Although these neutrinos constitute the lion’s share of neutrinos produced by the
sun (∼ 86%) their energies are well below an MeV and are hard to detect by earth-
based detectors which typically have energy thresholds higher than several MeV.
12.1 Neutrino Sources 327
Table 12.1 Alphabetical list of experiments that are discussed in the chapter with comments on
the roles that are highlighted in the discussion
Experiment          Location       Date         Comments
Baksan/BUST         Russia         1977+        Detected neutrinos from SN1987A
Daya Bay            China          2011–2020    Reactor neutrino detector
DUNE                South Dakota   2030?        Comprehensive neutrino detector
Hyper-Kamiokande    Japan          2027?        Comprehensive neutrino detector
ICARUS              Gran Sasso     2010+        Solar and atmospheric neutrino detector
IMB                 Lake Erie      1982–1991    Atmospheric and SN1987A neutrinos
J-PARC              Japan          2009+        High intensity proton accelerator to make neutrinos
KATRIN              Germany        2018+        Search for neutrino mass
Kamiokande          Japan          1983–1985    Atmospheric neutrinos
Kamiokande II       Japan          1985–1990    Detected neutrinos from SN1987A
LBNF                Fermilab       2030?        Accelerator neutrino source and near detectors
MicroBooNE          Fermilab       2014+        Accelerator neutrino experiment
MiniBooNE           Fermilab       2002+        Search for sterile neutrinos
MINOS               Fermilab       2005–2016    Accelerator neutrino detector
NOνA                Minnesota      2011+        Accelerator neutrino detector
SNO                 Canada         1999–2006    Solar and atmospheric neutrino observatory
Soudan 2            Minnesota      1989–2001    Atmospheric neutrinos observatory
Super-Kamiokande    Japan          1996+        Comprehensive neutrino detector
T2K                 Japan          2010+        Accelerator neutrino detector
One therefore wishes to identify the source of the highest energy neutrinos coming
from the sun on which we can apply observational tools.
The production and subsequent decays of Boron-8 in the internal solar reaction
chain produce1 a spectrum of high energy electron neutrinos νe that is peaked at
6.5 MeV and reaches up to ∼ 16 MeV. The solar chain of reactions that yields these
neutrinos proceeds through
$p + p \to d + e^+ + \nu_e$ (as source of d)
$d + p \to {}^{3}\mathrm{He} + \gamma$ (as source of $^{3}$He)
${}^{3}\mathrm{He} + {}^{3}\mathrm{He} \to {}^{4}\mathrm{He} + 2p$ (as source of $^{4}$He)
${}^{4}\mathrm{He} + {}^{3}\mathrm{He} \to {}^{7}\mathrm{Be} + \gamma$ (as source of $^{7}$Be)
${}^{7}\mathrm{Be} + p \to {}^{8}\mathrm{B} + \gamma$ (as source of $^{8}$B)
${}^{8}\mathrm{B} \to {}^{8}\mathrm{Be}^{*} + e^+ + \nu_e$ (giving high energy νe)
The νe neutrinos that result from the last decay constitute about 1 out of every
5000 neutrinos emitted from the sun. Nevertheless, the flux of “Boron-8 neutrinos”
reaching the earth is approximately 14 million neutrinos per second per cm2 .
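The order of magnitude of these fluxes can be cross-checked from the solar luminosity. The sketch below assumes that essentially all of the Sun's power comes from fusing four protons into one helium-4 nucleus, releasing about 26.7 MeV and two electron neutrinos per helium nucleus. The luminosity, the energy release and the Earth-Sun distance are standard values that are not quoted in the text, and the small fraction of energy carried off by the neutrinos themselves is ignored.

```python
import math

# Assumed inputs (not from the text): solar luminosity, energy released per
# 4p -> 4He fusion (which emits two nu_e), and the Earth-Sun distance.
L_SUN_WATT = 3.828e26
ENERGY_PER_HE4_MEV = 26.7
AU_CM = 1.496e13
MEV_TO_JOULE = 1.602e-13


def total_solar_neutrino_flux():
    """Approximate nu_e flux at Earth in neutrinos per cm^2 per second."""
    fusions_per_second = L_SUN_WATT / (ENERGY_PER_HE4_MEV * MEV_TO_JOULE)
    neutrinos_per_second = 2.0 * fusions_per_second
    return neutrinos_per_second / (4.0 * math.pi * AU_CM**2)


total_flux = total_solar_neutrino_flux()
# Boron-8 share quoted in the text: about 1 in 5000
boron8_flux = total_flux / 5000.0
print(f"total flux   ~ {total_flux:.1e} per cm^2 per s")
print(f"Boron-8 flux ~ {boron8_flux:.1e} per cm^2 per s")
```

The Boron-8 flux obtained this way is of order 10^7 per cm^2 per second, in line with the figure quoted above.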
The solar neutrino flux is well understood; the energy profile of the highest energy
neutrinos is well characterized; the flavor content of the neutrinos upon birth is
well understood (pure νe ); and, the distance from source to earth is of course well
understood. Thus, experiments that can detect the high-energy neutrinos coming
from the sun with some discriminating power on flavor content have the opportunity
to test whether neutrinos oscillate, signifying that they have mass. Indeed, such tests
were done, and oscillations detected, as will be described later in Sect. 12.4.
Historically the process was more messy, with experiments led by Ray Davis first
suggesting a deficit of detected neutrinos on earth compared to what was expected. By
the 1970s this was called the “solar neutrino problem.” Confusion reigned for quite
some time, including questions of systematic errors on the early experiments and
questions on how well the solar model of neutrino production was known. Ultimately,
those confusions resolved, especially due to the transformative Super-Kamiokande
and SNO (Sudbury Neutrino Observatory) experiments in the 1990s, and we are
left today with a clear picture of neutrino production within the sun and how they
indeed oscillate in flavor on their way to earth, resulting in a measurable reduction
of detected electron neutrinos.
Another natural source of neutrinos comes from the core collapse of a supernova.
A star with mass $M \gtrsim 8 M_\odot$ collapses to a proto-neutron star when it runs out of
fuel for nuclear fusion. This supernova process is violent in many ways. Not only
1 See W. T. Winter et al. “The 8 B neutrino spectrum”. Phys. Rev. C73, 025503 (2006).
does it eject its stellar envelope and produce many photons in the explosion, it also
ejects a very significant number of neutrinos. As the core collapses the free nucleons
capture electrons, producing neutrinos, which causes further core collapse due to
the drop in electron degeneracy pressure, and so on. Within a few seconds 99% of
the gravitational binding energy of the original star is converted to neutrinos hurtling
outward into space.
It is a highly non-trivial calculation to determine the total rates of neutrinos, the
total rates of anti-neutrinos, and the relative rates of each flavor. The vast majority
of the neutrinos come within the first second, and there is a steady but falling rate
of neutrinos up to about 20 s. If a supernova occurs within the galaxy or
otherwise nearby, modern large-scale experiments will detect the burst of neutrinos
that comes from it. These neutrinos are correlated with the arrival time of the light
emitted from the exploding supernova, giving further confirming evidence of the
neutrino burst’s origin. Galactic supernovae are estimated to take place about twice per
century. Thus, there is a high premium on getting as much out of an
event as possible.
There has been one nearby supernova in the modern era picked up by neutrino
experiments. Supernova SN1987A was seen in 1987 as a burst of neutrino events
recorded by several experiments, including Kamiokande II (12 antineutrinos),
IMB (8 antineutrinos), and Baksan (5 antineutrinos). SN1987A occurred
approximately 50 kpc away in the Large Magellanic Cloud. Comparing the spread
in arrival times of these burst neutrinos with supernova models, including their
uncertainties, yielded the first upper-bound estimates of about 10 eV on the absolute
scale of neutrino masses. Given the many new and larger experiments available today,
careful measurement of neutrinos arising from the next nearby supernova may pro-
vide significantly more information, such as limits on non-standard interactions of
neutrinos with matter and with themselves, not to mention deeper insight into the
internal dynamics of a supernova. At an expected rate of about two per century, a new
supernova event useful for measuring neutrino properties may happen next week or
maybe not for another hundred years.
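The connection between arrival-time spread and neutrino mass can be made concrete with a short estimate. A relativistic neutrino of mass m and energy E travels at speed v/c ≈ 1 − m²/(2E²), so over a light-travel time T it lags a massless particle by about T m²/(2E²). The sketch below evaluates this for the SN1987A distance of 50 kpc. The neutrino energy of 10 MeV is an assumed typical value, not taken from the text.

```python
# Unit conversions
KPC_IN_M = 3.0857e19
C_M_PER_S = 2.998e8


def arrival_delay_seconds(mass_ev, energy_mev, distance_kpc):
    """Delay of a relativistic neutrino of given mass relative to a massless one."""
    light_travel_time = distance_kpc * KPC_IN_M / C_M_PER_S
    # v/c ~ 1 - m^2 / (2 E^2) for m << E, so the delay is T * m^2 / (2 E^2)
    ratio = mass_ev / (energy_mev * 1.0e6)
    return light_travel_time * ratio**2 / 2.0


# Assumed typical supernova neutrino energy of 10 MeV (hypothetical input)
delay = arrival_delay_seconds(mass_ev=10.0, energy_mev=10.0, distance_kpc=50.0)
print(f"delay for m = 10 eV, E = 10 MeV, D = 50 kpc: {delay:.1f} s")
```

A mass of about 10 eV gives a delay of a few seconds, comparable to the duration of the burst itself, which is why the observed arrival times constrain masses at roughly this scale.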
Another important natural source of neutrinos arises from cosmic rays interacting
with the atmosphere. High energy cosmic rays are composed mostly of protons, and
when they collide with molecules in the atmosphere, cosmic ray showers develop. A
shower is made of many short-lived particles, including copious numbers of pions.
As we saw in Sect. 7.6, these pions then decay nearly 100% of the time to muons
and neutrinos, and many muons subsequently decay to even more neutrinos before
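reaching the detector. The cascade relevant for the main neutrino production is
$$\pi^+ \to \mu^+ + \nu_\mu,\quad \mu^+ \to e^+ + \nu_e + \bar\nu_\mu, \qquad \pi^- \to \mu^- + \bar\nu_\mu,\quad \mu^- \to e^- + \bar\nu_e + \nu_\mu. \qquad (12.3)$$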
There are other decay chains, involving π → eν and K± decays, that are
subdominant. Thus, from tracking the neutrino flavor content in the cosmic ray shower
evolution, which is approximated by (12.3), one expects roughly double the
number of muon neutrinos compared to electron neutrinos. The simple origin of this
conclusion is that pions decay to muons over 99% of the time.
A conundrum developed when experiments detected muon neutrinos at a rate
significantly less than double the electron neutrino rate. In-depth Monte Carlo sim-
ulations that tracked shower evolution and subsequent neutrino production were
pursued to compute the exact expected ratio in order to compare with experiment. A
useful observable was developed to report the comparison between data and Monte
Carlo bench-mark simulation that assumes no flavor changes after production in the
decay chains:
$$R = \frac{(N_\mu/N_e)_{\text{data}}}{(N_\mu/N_e)_{\text{MC}}},$$
where $N_\mu$ is the total number of νμ and ν̄μ neutrinos and $N_e$ is the total number
of νe and ν̄e neutrinos. Neutrinos in the sub-GeV energy range consistently showed
R ≈ 0.65 across multiple experiments, including Soudan 2, IMB, Kamiokande, and
Super-Kamiokande.
Upon closer inspection, it appeared that data for electron neutrino flux were match-
ing Monte Carlo estimates, but the data for muon neutrino flux were coming in far
short of Monte Carlo estimates. “Monte Carlo estimates” assumed no neutrino oscil-
lations. If one postulates that neutrino oscillations into another flavor is the explana-
tion for the deficit of neutrinos, one is led to consider the possibility that many νμ
neutrinos are “disappearing” into ντ via the oscillation νμ → ντ .
Experiments did not have sensitivity to the ντ flux to test the hypothesis directly at
the time; however, the supposition did suggest that there should be a different drop-
off in the νμ flux going downward versus upward. The reason for this difference is
that the oscillation probability depends on the distance the neutrino has traveled, as
we will discuss in the next section. If neutrinos are coming from above they travel
on order of 10–100 km depending on the incoming zenith angle, whereas if they
are coming from below they had to have been created on the other side of the earth
and traveled ∼ $10^3$–$10^4$ km. Super-Kamiokande showed2 that muon neutrinos indeed
have this up-down asymmetry:
$$A^{\text{up-down}}_{\nu_\mu} = \frac{N^{\text{up}}_{\nu_\mu} - N^{\text{down}}_{\nu_\mu}}{N^{\text{up}}_{\nu_\mu} + N^{\text{down}}_{\nu_\mu}} = -0.31 \pm 0.04 \quad \text{(Super-Kamiokande)}. \qquad (12.5)$$
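The distances involved follow from simple geometry. For a neutrino produced at altitude h above a spherical earth of radius R and detected at the surface with zenith angle θ, the path length is L = sqrt((R + h)² − R² sin²θ) − R cos θ. The sketch below evaluates this for a few directions. The production altitude of 15 km is an assumed typical value, not taken from the text.

```python
import math

R_EARTH_KM = 6371.0
# Assumed typical production altitude of atmospheric neutrinos (hypothetical input)
H_PROD_KM = 15.0


def path_length_km(cos_zenith, h=H_PROD_KM, r=R_EARTH_KM):
    """Distance from a production point at altitude h to a detector at the surface."""
    sin2 = 1.0 - cos_zenith**2
    return math.sqrt((r + h) ** 2 - r**2 * sin2) - r * cos_zenith


for cz in (1.0, 0.5, 0.0, -0.5, -1.0):
    print(f"cos(zenith) = {cz:+.1f}   L = {path_length_km(cz):9.1f} km")
```

Downward-going neutrinos travel tens of kilometers, while upward-going ones cross a large fraction of the earth's diameter, which is what makes the up-down comparison sensitive to the distance dependence of the oscillation.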
Neutrino sources are not limited to natural processes. Human-made sources, such as
nuclear reactors and proton beam bombardments on targets, have become increas-
ingly important in neutrino physics. Let us discuss each of these in turn.
First, nuclear reactors operate on the process of fission. When a neutron is captured
by a target nucleus the neutron-enriched isotope is unstable and decays typically to
two fission product isotopes and additional free neutrons. For example, a typical
fission process involving Uranium 235 is
$$n + {}^{235}_{92}\mathrm{U} \to {}^{141}_{56}\mathrm{Ba} + {}^{92}_{36}\mathrm{Kr} + 3n. \qquad (12.6)$$
The product isotopes of Barium and Krypton have significantly more neutrons than
their stable isotopes. To be more precise, the stable isotopes of Barium are $^{X}_{56}$Ba,
where X = 134–138. The stable isotopes of Krypton are $^{X}_{36}$Kr where X = 80,
82–84. The neutron-richness of the fission products is expected due to the well-
known “belt of stability” in nuclear physics that shows the neutron-proton ratio
increasing as one proceeds up the periodic table. The high fraction of neutrons in
high-Z nuclei, such as $^{235}_{92}$U, which is moderately stable, is inherited by the fission products.
The lower-Z fission products are then far from the “belt of stability” and proceed
to reduce their too-high neutron fraction through β-decay: converting neutrons into
protons and emitting electrons and anti-neutrinos in the process (n → pe− ν̄e ).
Thus, the neutrinos that come out of reactors are almost entirely the ν̄e that arise
out of reducing the neutron-richness of fission products. Their energies are typically
several MeV. There are more than $10^{20}$ ν̄e emitted per second in a gigawatt power
reactor, which creates an excellent source of ν̄e . Furthermore, since the source is
located on earth, one can place detectors near and far to test the oscillation behavior
of ν̄e at multiple distances, pinning down parameters more accurately. The smaller
“baselines” (length from source to detector) for ν̄e from reactors compared to νe
from the sun enable additional handles on the PMNS matrix for neutrino mixing,
and additional sensitivities to the mass splittings among the neutrinos. The Daya Bay
experiment was able to utilize this flexibility to gain unique capabilities to measure
the θ13 oscillation angle (the “reactor oscillation angle”).
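The quoted emission rate can be reproduced with a back-of-the-envelope estimate. The sketch below assumes that each fission releases about 200 MeV and that the β-decay chains of the fission products emit about six anti-neutrinos per fission. Both numbers are typical values that are not given in the text, and the power is taken to be thermal power.

```python
# Assumed inputs (not from the text): energy released per fission and the
# average number of anti-neutrinos from the beta decays that follow each fission.
ENERGY_PER_FISSION_MEV = 200.0
ANTINEUTRINOS_PER_FISSION = 6.0
MEV_TO_JOULE = 1.602e-13


def antineutrino_rate(thermal_power_watt):
    """Approximate number of anti-neutrinos emitted per second by a reactor."""
    fissions_per_second = thermal_power_watt / (ENERGY_PER_FISSION_MEV * MEV_TO_JOULE)
    return ANTINEUTRINOS_PER_FISSION * fissions_per_second


rate = antineutrino_rate(1.0e9)  # one gigawatt of thermal power
print(f"anti-neutrino rate ~ {rate:.1e} per second")
```

Under these assumptions a gigawatt of thermal power corresponds to somewhat more than 10^20 anti-neutrinos per second.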
Another mechanism by which to produce neutrinos copiously in the lab is through
focusing an intense accelerator beam of protons onto a target. A typical target is
graphite, which, for example, J-PARC uses to produce neutrinos for the T2K neu-
trino experiment. Graphite is also a contender for the target of the proton booster
at Fermilab, which will provide the source of neutrinos for the DUNE experiment.
The mechanism is similar to the mechanism that produces atmospheric neutrinos,
where protons collide with their atmospheric target making pions which then decay
to neutrinos. The same principle applies here
There is a larger flux of muon neutrinos than electron neutrinos produced at the collision
point for the same reason as was the case for atmospheric decays: pions preferentially
decay to muons and muon neutrinos, and only produce electron neutrinos (and
another muon neutrino) in the decay of the daughter muons. Due to the high energy
of the incoming proton beam the muon neutrinos mostly follow the original proton
direction when they are produced.
There are several advantages of producing neutrinos through a protons-on-target
experiment compared to observations of neutrinos produced in the atmosphere. First,
one can construct a highly collimated and dense beam of neutrinos focussed on a
detector. Second, the baselines can be adjusted with near detectors and far detectors
that can compare detection readouts to better infer oscillation parameters. Third,
the energies of the neutrinos can be varied by changing the energy of the proton beam
bombarding the target. And finally, compositions of neutrinos and anti-neutrinos of
various flavors can be manipulated by strong magnetic fields on the parent pions.
Regarding the last point, T2K has “magnetic horns” that can focus and guide
charged pions of one charge along the proton beam line while the others are guided
away. Thus, if only π + are guided their decays into μ+ + νμ produce a rather pure
νμ beam.3 Likewise, focusing the π− generates a beam of ν̄μ. Placing detectors off
axis makes it possible to select a narrow band of neutrino energies. These manipulations
of energy range and particle/anti-particle content are very helpful in getting the most
out of the experiment in order to comprehensively establish the neutrino properties.
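Although there are three flavors of neutrinos, let us for simplicity start by assuming
that there are only two flavors, $|\nu_e\rangle$ and $|\nu_\mu\rangle$. Each of these flavors is a mixture of
mass eigenstates $|\nu_1\rangle$ and $|\nu_2\rangle$ according to
\[
\begin{pmatrix} \nu_e \\ \nu_\mu \end{pmatrix}
=
\begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}
\begin{pmatrix} \nu_1 \\ \nu_2 \end{pmatrix} .
\tag{12.8}
\]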
3 The T2K beam is more complex than this, and has residual contributions from the opposite charge
pion decays, secondary muon decays, etc. See Abe et al. Phys. Rev. D87, 012001 (2013).
12.2 Neutrino Propagation Through Vacuum 333
And let us further suppose that at t = 0 the neutrino that comes out of the source
(the Sun) is in the $|\nu_e\rangle$ flavor eigenstate:
However, a full and proper treatment of the propagating neutrino is a subtle and
extensive task, which involves careful treatment of neutrino wave-packet evolution.
Nevertheless, it can be shown4 that assuming states are described by (12.10)
yields correct results as long as we enforce that relative phases of highly relativistic
neutrinos satisfy
\[
\phi_i - \phi_j \simeq \frac{\Delta m^2_{ij} L}{2E}, \quad \text{where } \Delta m^2_{ij} \equiv m_i^2 - m_j^2 ,
\tag{12.11}
\]
and where L is the distance the neutrino state has propagated from the neutrino
source. We have exchanged time t with distance L since it is distance that is a known
quantity for experiment. Furthermore, only relative phases have observable impact,
consistent with known principles of quantum theory, and so (12.11) is all we need
to proceed. Note that a naive derivation of (12.11) that starts with
$\phi = Et - \mathbf{p}\cdot\mathbf{x} = (E - |\mathbf{p}|)L \simeq m^2 L/(2E)$ leads fortuitously to the correct answer.
If the neutrino state at birth (L = t = 0) is $|\psi(L=0)\rangle = |\nu_e\rangle$ then obviously the
initial probability for it to be $|\nu_\mu\rangle$ is zero. However, at later times (at distances $L \neq 0$)
this no longer remains true, and the probability must be calculated by first computing
$|\psi(L)\rangle$ and then computing the probability via
\[
P(\nu_\mu) = |\langle \nu_\mu | \psi(L) \rangle|^2 .
\tag{12.12}
\]
4 A full treatment can be found, e.g., in Beuthe, Phys. Rep. 375, 105 (2003).
334 12 Neutrinos
This evolution of the state with distance is now unambiguous and calculable given
that we know φ from (12.11).
Now let us re-expand (12.14) in terms of flavor eigenstates again so that we can
compute probability overlap:
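\[
|\psi(L)\rangle = \left(\sin^2\theta + \cos^2\theta\, e^{-i\phi}\right)|\nu_e\rangle
+ \sin\theta\cos\theta\left(e^{-i\phi} - 1\right)|\nu_\mu\rangle .
\tag{12.16}
\]
\[
\begin{aligned}
P(\nu_e \to \nu_\mu)(L) &= |\langle \nu_\mu | \psi(L) \rangle|^2 = \sin^2(2\theta)\,\sin^2(\phi/2) \\
&= \sin^2(2\theta)\,\sin^2\!\left(\frac{\Delta m^2 L}{4E}\right) \\
&= \sin^2(2\theta)\,\sin^2\!\left(1.27\,\frac{\Delta m^2}{\text{eV}^2}\,\frac{L}{\text{km}}\,\frac{\text{GeV}}{E}\right) .
\end{aligned}
\tag{12.17}
\]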
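As a quick numerical illustration of (12.17), the following Python sketch evaluates the two-flavor vacuum probability in the practical units of the last line. The function name and the input values are illustrative assumptions, not fitted parameters.

```python
import math

def p_two_flavor(sin2_2theta, dm2_ev2, length_km, energy_gev):
    """Two-flavor vacuum transition probability in the form of Eq. (12.17)."""
    # The factor 1.27 absorbs hbar and c for dm2 in eV^2, L in km, E in GeV.
    phase = 1.27 * dm2_ev2 * length_km / energy_gev
    return sin2_2theta * math.sin(phase) ** 2

# Illustrative inputs only: maximal mixing, dm2 = 2.5e-3 eV^2, E = 1 GeV.
for length_km in (0.0, 250.0, 500.0, 1000.0):
    p = p_two_flavor(1.0, 2.5e-3, length_km, 1.0)
    print(f"L = {length_km:6.0f} km  ->  P = {p:.3f}")
```

The probability vanishes at the source and is bounded above by sin²(2θ), as the formula requires.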
The two-state oscillation result is a decent approximation for solar neutrinos oscillating
to $|\nu_\mu\rangle$ on their way to earth, and for atmospheric neutrinos produced from π → μν
oscillating from $|\nu_\mu\rangle$ to $|\nu_\tau\rangle$ between the point of creation high in the atmosphere (from
pions in cosmic ray showers) and earth-based detectors.
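From (12.17) we see that for a given mass-squared splitting $\Delta m^2$ of neutrinos
there is a characteristic oscillation length that depends on energy:
\[
L^{\rm osc}(E) = \frac{4\pi E}{\Delta m^2}
= (2.5\ \text{km}) \left(\frac{E}{\text{GeV}}\right) \left(\frac{\text{eV}^2}{\Delta m^2}\right) .
\tag{12.18}
\]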
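The prefactor in (12.18) can be checked by restoring ħc explicitly. The sketch below does this in Python; the constant is the standard value of ħc, while the function name and the sample energies are assumptions chosen for illustration.

```python
import math

HBARC_EV_KM = 1.973269804e-10  # hbar*c in eV*km (197.327 MeV fm)

def osc_length_km(energy_gev, dm2_ev2):
    """L_osc = 4*pi*E/dm2 of Eq. (12.18), converted from natural units to km."""
    energy_ev = energy_gev * 1.0e9
    return 4.0 * math.pi * energy_ev / dm2_ev2 * HBARC_EV_KM

# Illustrative splittings, matching the orders of magnitude quoted in Fig. 12.1.
DM2_SOL = 7.5e-5  # eV^2
DM2_ATM = 2.5e-3  # eV^2
for energy_gev in (1.0e-3, 1.0):
    print(f"E = {energy_gev:g} GeV: "
          f"L_sol = {osc_length_km(energy_gev, DM2_SOL):.3g} km, "
          f"L_atm = {osc_length_km(energy_gev, DM2_ATM):.3g} km")
```

For E = 1 GeV and Δm² = 1 eV² the function returns about 2.48 km, which is the 2.5 km quoted in (12.18) after rounding.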
There are only two characteristic oscillation lengths in vacuum of relevance within
the SM. There is a length associated with
\[
\Delta m^2_{\rm sol} \equiv \Delta m^2_{21} ,
\]
which we label with the sol subscript since the dominant oscillation of solar neutrinos
propagating to earth is among the first two eigenstates. There is a second length
associated with
\[
\Delta m^2_{\rm atm} \equiv \Delta m^2_{32} ,
\]
which we label with the subscript atm since the dominant oscillation of atmospheric
neutrinos propagating to earth is among the second and third eigenstates. There is
a third oscillation length associated with $\Delta m^2_{31}$, but it is very close to that of $\Delta m^2_{32}$ since
$\Delta m^2_{21} \ll \Delta m^2_{32}$. The solar neutrino oscillations are predominantly νe → νμ flavor
oscillations, and the atmospheric neutrino oscillations are predominantly νμ → ντ.
Respect for these historical origins of neutrino observations leads us to retain the
language of $\Delta m^2_{\rm sol}$ and $\Delta m^2_{\rm atm}$.
Our definitions of $\Delta m^2_{\rm atm} \equiv \Delta m^2_{32}$ and $\Delta m^2_{\rm sol} \equiv \Delta m^2_{21}$ given above assume the
“normal hierarchy” (NH) of neutrino mass spectrum, where $m_3 \gg m_2 > m_1$.
Fig. 12.1 Standard convention for the hierarchy of neutrino mass eigenstates, where
$\Delta m^2_{\rm sol} \simeq 7.5 \times 10^{-5}$ eV$^2$ and $\Delta m^2_{\rm atm} \simeq 2.5 \times 10^{-3}$ eV$^2$
Because we only know mass-squared differences rather than the absolute masses
of the neutrinos from experiment, and because of the incompleteness of neutrino
oscillation measurements, there is a second solution, called the “inverted hierarchy”
(IH) that is also consistent with observations. For IH the hierarchy of neutrino masses
is by usual convention $m_2 > m_1 \gg m_3$, where then $\Delta m^2_{\rm sol} \equiv \Delta m^2_{21}$ (same as before)
and $\Delta m^2_{\rm atm} \equiv \Delta m^2_{23}$. One should note that
These are the characteristic distance scales of neutrinos oscillating their flavor con-
tent.
In general, any flavor eigenstate of the neutrino is a mixture of all mass eigenstates,
according to the PMNS matrix
\[
|\nu_\alpha\rangle = \sum_k U^*_{\alpha k}\,|\nu_k\rangle \quad \text{(relation among states)}
\tag{12.23}
\]
where Greek (Roman) letters are flavor (mass) eigenstate indices. Note the complex
conjugation on $U^*_{\alpha k}$ in (12.23) connecting states within the Hilbert space, whereas
among the fields
\[
\nu_\alpha = \sum_k U_{\alpha k}\,\nu_k \quad \text{(relation among fields)}
\tag{12.24}
\]
as defined earlier in Chap. 10. This is due to the creation operator $b^\dagger$ that creates the
$|\nu\rangle$ state through $b^\dagger|0\rangle = |\nu\rangle$ arising from the $\bar\nu$ quantum field and not the quantum
ν field.
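Applying the same techniques as we did in the case of the two-state oscillation
problem described above, one finds that the probability of a flavor eigenstate $|\nu_\alpha\rangle$
being measured a distance L later as a $|\nu_\beta\rangle$ state is computed to be
\[
\begin{aligned}
P_{\alpha\to\beta}(L,E) &= |\langle \nu_\beta | \nu_\alpha(L) \rangle|^2
= \left| \sum_i \langle \nu_i |\, U_{\beta i} \left( \sum_j e^{-i\frac{\Delta m^2_{jk} L}{2E}}\, U^*_{\alpha j}\, |\nu_j\rangle \right) \right|^2 \\
&= \left| \sum_{i=1}^{3} U_{\beta i}\, e^{-i\frac{\Delta m^2_{ik} L}{2E}}\, U^*_{\alpha i} \right|^2 ,
\end{aligned}
\tag{12.25}
\]
where a physically irrelevant overall phase factor $e^{i m_k^2 L/2E}$ (with k fixed) was inserted so that every
term has a known value according to (12.11) above. Upon expanding this equation
one finds the well-known result5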
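\[
\begin{aligned}
P_{\alpha\to\beta}(L,E) ={}& \delta_{\alpha\beta}
- 4 \sum_{j=1}^{3} \sum_{i>j}^{3} \mathrm{Re}\!\left(U^*_{\alpha i} U_{\beta i} U_{\alpha j} U^*_{\beta j}\right) \sin^2\!\frac{\Delta m^2_{ij} L}{4E} \\
&+ 2 \sum_{j=1}^{3} \sum_{i>j}^{3} \mathrm{Im}\!\left(U^*_{\alpha i} U_{\beta i} U_{\alpha j} U^*_{\beta j}\right) \sin\frac{\Delta m^2_{ij} L}{2E} .
\end{aligned}
\tag{12.26}
\]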
Note, there are three different characteristic neutrino oscillation length scales depending
on the energy of the neutrino and the mass differences between the different mass
eigenstates: $L^{\rm osc}_{ij}(E) = 4\pi E/\Delta m^2_{ij}$, where ij = 21, 32 and 31. An experiment that
5 Three-generation oscillation probabilities in this section follow the notation of Nunokawa, Parke
and Valle, Prog. Part. Nucl. Phys. 60, 338 (2008).
detects neutrinos from a known origin and with energy E has a good prospect for
observing clean oscillation signals if the $L^{\rm osc}_{ij}(E)$ are large macroscopic distances (e.g.,
hundreds to many thousands of kilometers), which they fortunately turn out to be for
energies of naturally produced neutrinos.
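If there is no CP violation one finds that the probability of ν̄α → ν̄β oscillation is
the same as that of να → νβ. However, with CP violation a difference arises:
\[
\begin{aligned}
P(\nu_\alpha \to \nu_\beta) - P(\bar\nu_\alpha \to \bar\nu_\beta)
&= 4 \sum_{j=1}^{3} \sum_{i>j}^{3} \mathrm{Im}\!\left(U^*_{\alpha i} U_{\beta i} U_{\alpha j} U^*_{\beta j}\right) \sin\frac{\Delta m^2_{ij} L}{2E} \\
&= 16\, J_{\alpha\beta}\, \sin\frac{\Delta m^2_{21} L}{4E}\, \sin\frac{\Delta m^2_{32} L}{4E}\, \sin\frac{\Delta m^2_{31} L}{4E} ,
\end{aligned}
\tag{12.27}
\]
\[
\text{where} \quad J_{\alpha\beta} = \mathrm{Im}\!\left(U_{\alpha 1} U^*_{\alpha 2} U^*_{\beta 1} U_{\beta 2}\right) = \pm J ,
\tag{12.28}
\]
with the sign being positive (negative) for a cyclic (anti-cyclic) permutation of e,
μ and τ (i.e., Jeμ = Jμτ = Jτe = +J, whereas Jeτ = Jμe = Jτμ = −J). J is the
lepton-sector analog to the Jarlskog invariant associated with CP violation in the
quark sector.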
In addition to the central importance of extracting $\Delta m^2_{ij}$ from neutrino oscillation
behavior, one also has dependence on the mixing matrix U. A common way to
parametrize this matrix is (RPP)
\[
U = \begin{pmatrix}
c_{12} c_{13} & s_{12} c_{13} & s_{13} e^{-i\delta} \\
-s_{12} c_{23} - c_{12} s_{23} s_{13} e^{i\delta} & c_{12} c_{23} - s_{12} s_{23} s_{13} e^{i\delta} & s_{23} c_{13} \\
s_{12} s_{23} - c_{12} c_{23} s_{13} e^{i\delta} & -c_{12} s_{23} - s_{12} c_{23} s_{13} e^{i\delta} & c_{23} c_{13}
\end{pmatrix}
\tag{12.29}
\]
where $c_{ij} \equiv \cos\theta_{ij}$, $s_{ij} \equiv \sin\theta_{ij}$, and δ is the CP violation phase angle. There are
three independent angles involved in neutrino mixing, which are θ12 (“solar oscil-
lation angle”), θ23 (“atmospheric oscillation angle”), and θ13 (“reactor oscillation
angle”). The parenthetic names indicate what observations were (at least initially)
most sensitive to these angles. All of these angles can be restricted to the first quad-
rant [0, π/2] (and thus si j and ci j are always positive) without loss of generality as
long as δ is allowed to vary over the full range [0, 2π ].
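The three-flavor formulas can be cross-checked numerically. The Python sketch below builds U in the parametrization of (12.29), evaluates the probability both from the squared amplitude (12.25) and from the expanded form (12.26), and computes the combination in (12.28). The function names, the angles, the value of δ, and the baseline are illustrative assumptions rather than fitted values.

```python
import cmath
import math

def pmns(t12, t23, t13, delta):
    """Mixing matrix in the parametrization of Eq. (12.29); angles in radians."""
    c12, s12 = math.cos(t12), math.sin(t12)
    c23, s23 = math.cos(t23), math.sin(t23)
    c13, s13 = math.cos(t13), math.sin(t13)
    ep, em = cmath.exp(1j * delta), cmath.exp(-1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 * em],
        [-s12 * c23 - c12 * s23 * s13 * ep, c12 * c23 - s12 * s23 * s13 * ep, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ep, -c12 * s23 - s12 * c23 * s13 * ep, c23 * c13],
    ]

def phase(dm2_ev2, length_km, energy_gev):
    """dm2 * L / (2E), using 1.267 for dm2 in eV^2, L in km, E in GeV."""
    return 2.0 * 1.267 * dm2_ev2 * length_km / energy_gev

def prob_from_amplitude(U, m2, a, b, length_km, energy_gev):
    """Squared amplitude of Eq. (12.25); flavors e, mu, tau are indices 0, 1, 2."""
    amp = sum(
        U[b][i] * cmath.exp(-1j * phase(m2[i] - m2[0], length_km, energy_gev))
        * U[a][i].conjugate()
        for i in range(3)
    )
    return abs(amp) ** 2

def prob_expanded(U, m2, a, b, length_km, energy_gev):
    """Expanded form of Eq. (12.26)."""
    p = 1.0 if a == b else 0.0
    for j in range(3):
        for i in range(j + 1, 3):
            w = U[a][i].conjugate() * U[b][i] * U[a][j] * U[b][j].conjugate()
            d = phase(m2[i] - m2[j], length_km, energy_gev)
            p += -4.0 * w.real * math.sin(d / 2.0) ** 2 + 2.0 * w.imag * math.sin(d)
    return p

def j_ab(U, a, b):
    """J_{alpha beta} of Eq. (12.28)."""
    return (U[a][0] * U[a][1].conjugate() * U[b][0].conjugate() * U[b][1]).imag

# Illustrative inputs (assumed): angles in degrees, delta in radians, m_1^2 set to 0.
U = pmns(math.radians(33.4), math.radians(49.0), math.radians(8.6), 1.2)
m2 = [0.0, 7.5e-5, 7.5e-5 + 2.5e-3]
L_KM, E_GEV = 810.0, 2.0
print("P(mu->e), amplitude form:", prob_from_amplitude(U, m2, 1, 0, L_KM, E_GEV))
print("P(mu->e), expanded form :", prob_expanded(U, m2, 1, 0, L_KM, E_GEV))
print("J_emu:", j_ab(U, 0, 1))
```

Replacing U by its complex conjugate gives the antineutrino probabilities, so the same functions can be used to evaluate the CP difference in (12.27).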
We are now in a position to make numerous observations of neutrino oscillations
from various known sources to extract the three mass-squared differences ($\Delta m^2_{ij}$), the three
mixing angles (θi j ), and the CP phase angle (δ). We know what sources neutrinos
come from as discussed in Sect. 12.1. However, before we continue, we must note
some complications with respect to neutrinos propagating through matter. We then
will discuss the various ways that neutrinos are detected, which gives us opportunity
to highlight a few of the important neutrino experiments of the past, present and
future. After all of that we will be in a better position to summarize in Sect. 12.6
what is known about the neutrino sector (i.e., the best fits to the parameters), and the
future goals of neutrino physics.
When neutrinos propagate through matter they have some probability of interacting
coherently with the medium. The effect of these coherent interactions is to introduce
additional phase shifts in the neutrino wave functions. For neutrino flavor β the phase
shift introduced is $e^{-iV_\beta t}$, where $V_\beta$ is the effective potential that νβ experiences due
to its coherent scattering within the medium. These matter effects are sometimes
called MSW effects after Mikheyev, Smirnov and Wolfenstein, who first introduced
them and recognized their importance.
The electron neutrino νe experiences an effective potential of
\[
V_e = \sqrt{2}\, G_F\, n_e(x)
\tag{12.30}
\]
where
\[
C_{\alpha\beta}(t) = \sum_k U^*_{\alpha k} U_{\beta k}\, e^{-i\phi_k(t)} \quad \text{and} \quad \phi_k(t) = \frac{m_k^2\, t}{2E} ,
\tag{12.32}
\]
with $m_k$ being the mass of the kth mass eigenstate and E the neutrino energy.
When passing through matter the propagating flavor νβ experiences the additional
phase shift of $e^{-iV_\beta t}$ as described above. This phase shift then alters (12.32) to be
\[
C^M_{\alpha\beta}(t) = \sum_k U^*_{\alpha k} U_{\beta k}\, e^{-iV_\beta t}\, e^{-i\phi_k(t)} .
\tag{12.33}
\]
One then computes the probability of measuring flavor νβ at time t using the standard
methods
\[
P(\nu_\alpha \to \nu_\beta)(t) = |\langle \nu_\beta | \psi_\alpha(t) \rangle|^2 = |C^M_{\alpha\beta}(t)|^2 .
\tag{12.34}
\]
There are many analytic recastings of the above equation, which are of limited value
since one must always resort to a final numerical computation. However, there is some
utility in analytically computing the two-state neutrino oscillation approximation in
the presence of matter, the results of which we will now describe.
If we consider νe oscillations into νμ, which approximates solar neutrino
oscillations well, we can compute how the states that were born as νe inside the
sun propagate through that dense matter. One finds a result very similar to (12.17)
except that $\theta \to \theta_M$ and $\Delta m^2 \to \Delta m^2_M$, where6
\[
\Delta m^2_M = \sqrt{\left[\Delta m^2 \cos 2\theta - 2\sqrt{2}\, G_F E n_e\right]^2 + \left[\Delta m^2 \sin 2\theta\right]^2} , \quad \text{and}
\tag{12.35}
\]
\[
\tan 2\theta_M = \frac{\Delta m^2 \sin 2\theta}{\Delta m^2 \cos 2\theta - 2\sqrt{2}\, G_F E n_e} .
\tag{12.36}
\]
This analytic expression lets us identify the condition under which a resonance
of neutrino oscillation may occur in the medium, namely
\[
2\sqrt{2}\, G_F E n_e = \Delta m^2 \cos 2\theta \quad \text{(resonance condition)} .
\tag{12.37}
\]
There are four variables at play here, E, $n_e$, $\Delta m^2$ and θ, which conspire in some
cases to give a large matter effect. One example where it is important to take
these effects into account is the case of solar neutrinos of E ∼ 1–10 MeV propagating
in the dense medium of the sun before exiting the sun on their way to detectors on
earth.
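To get a feel for the numbers in (12.35) to (12.37), here is a minimal Python sketch that converts an electron number density into the potential of (12.30) and returns the effective matter parameters. The physical constants are standard values; the function names and the sample energy, angle, and density are assumptions for illustration only.

```python
import math

G_F = 1.1663787e-23           # Fermi constant in eV^-2
HBARC_EV_CM = 1.973269804e-5  # hbar*c in eV*cm
N_A = 6.02214076e23           # Avogadro's number

def potential_ev(n_e_per_cm3):
    """V_e = sqrt(2) G_F n_e of Eq. (12.30), in eV (n_e converted to eV^3)."""
    return math.sqrt(2.0) * G_F * n_e_per_cm3 * HBARC_EV_CM ** 3

def matter_parameters(dm2_ev2, theta, energy_ev, n_e_per_cm3):
    """Return (dm2_M, theta_M) following Eqs. (12.35) and (12.36)."""
    a = 2.0 * energy_ev * potential_ev(n_e_per_cm3)  # 2*sqrt(2)*G_F*E*n_e
    x = dm2_ev2 * math.cos(2.0 * theta) - a
    y = dm2_ev2 * math.sin(2.0 * theta)
    # atan2 keeps theta_M continuous through the resonance, where x changes sign.
    return math.hypot(x, y), 0.5 * math.atan2(y, x)

def resonance_density(dm2_ev2, theta, energy_ev):
    """Electron density (per cm^3) that satisfies Eq. (12.37)."""
    return dm2_ev2 * math.cos(2.0 * theta) / (2.0 * energy_ev * potential_ev(1.0))

# Illustrative (assumed) solar-like inputs: E = 10 MeV, n_e = 100 N_A per cm^3.
dm2_m, theta_m = matter_parameters(7.5e-5, math.radians(33.4), 1.0e7, 100.0 * N_A)
print(f"dm2_M = {dm2_m:.3e} eV^2, theta_M = {math.degrees(theta_m):.1f} deg")
print(f"resonance n_e / N_A = {resonance_density(7.5e-5, math.radians(33.4), 1.0e7) / N_A:.1f} per cm^3")
```

At vanishing density the function returns the vacuum values, and at the resonance density the effective mixing becomes maximal, θ_M = π/4.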
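In a more careful treatment the spatial variation of the electron number density
$n_e(x)$ must be taken into account as the neutrinos propagate in the medium. In
that case one generally wishes to numerically integrate a differential
equation step by step, which we can express as
\[
i\,\frac{d|\psi_\alpha(t)\rangle}{dt} = H_{\alpha\beta}\,|\nu_\beta\rangle , \quad \text{where}
\tag{12.38}
\]
\[
H_{\alpha\beta} = i\,\frac{d\tilde{C}_{\alpha\beta}}{dt}
= \begin{pmatrix} V_e & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}
+ \sum_k U^*_{\alpha k} U_{\beta k}\, \frac{m_k^2}{2E} .
\tag{12.39}
\]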
In the case of solar neutrinos, when νe neutrinos are created in the core of the sun
and then propagate outward, one can replace t → r (neutrino velocity approximately
c = 1), compute n e (r ) within the core of the sun, and compute the neutrino oscillation
wave function as it propagates through the sun. The calculation then follows the
development of the neutrino wave function as it propagates in space using (12.38)
and (12.39). One finds that although the neutrinos do not reach or cross the resonance
condition of (12.37), the effect is large enough to be discerned in comparing the
modeling of neutrino production in the sun’s core with neutrino flavor measurements
on earth.
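The step-by-step integration described above can be sketched in a few lines. The Python example below is a simplified two-flavor version, not the full three-flavor system of (12.39): it integrates i dψ/dr = Hψ with a fourth-order Runge-Kutta step, using a flavor-basis Hamiltonian with the trace removed and the potential V_e added to the νe entry. The function names, the RK4 choice, and the user-supplied profile v_e_of_r are all assumptions made for illustration.

```python
import math

KM_TO_INV_EV = 1.0e5 / 1.973269804e-5  # 1 km expressed in eV^-1 (hbar = c = 1)

def hamiltonian(dm2_ev2, theta, energy_ev, v_e_ev):
    """Two-flavor flavor-basis H (trace removed) plus the matter potential on nu_e."""
    k = dm2_ev2 / (4.0 * energy_ev)
    c2, s2 = math.cos(2.0 * theta), math.sin(2.0 * theta)
    return [[-k * c2 + v_e_ev, k * s2], [k * s2, k * c2]]

def rhs(h, psi):
    """dpsi/dr = -i H psi, with t replaced by r for relativistic neutrinos."""
    return [-1j * (h[0][0] * psi[0] + h[0][1] * psi[1]),
            -1j * (h[1][0] * psi[0] + h[1][1] * psi[1])]

def evolve(dm2_ev2, theta, energy_ev, length_km, v_e_of_r, steps=2000):
    """RK4 integration of (amp_e, amp_mu), starting from a pure nu_e state."""
    psi = [1.0 + 0j, 0j]
    dr_km = length_km / steps
    dr = dr_km * KM_TO_INV_EV
    for n in range(steps):
        r_km = n * dr_km

        def f(frac, state):
            h = hamiltonian(dm2_ev2, theta, energy_ev, v_e_of_r(r_km + frac * dr_km))
            return rhs(h, state)

        k1 = f(0.0, psi)
        k2 = f(0.5, [p + 0.5 * dr * k for p, k in zip(psi, k1)])
        k3 = f(0.5, [p + 0.5 * dr * k for p, k in zip(psi, k2)])
        k4 = f(1.0, [p + dr * k for p, k in zip(psi, k3)])
        psi = [p + dr / 6.0 * (a + 2.0 * b + 2.0 * c + d)
               for p, a, b, c, d in zip(psi, k1, k2, k3, k4)]
    return psi

# Illustrative check in vacuum (assumed inputs): E = 1 GeV, L = 500 km.
psi = evolve(2.5e-3, 0.6, 1.0e9, 500.0, lambda r_km: 0.0)
print("P(nu_e -> nu_mu) =", abs(psi[1]) ** 2)
```

With a constant potential the numerical result can be compared against the analytic two-state formula with θ_M and Δm²_M; a realistic application would pass a density profile n_e(r) through v_e_of_r.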
The ideal neutrino detector would read out an incoming neutrino’s existence, direc-
tion, energy, and flavor. Unfortunately, no ideal neutrino detector exists. However,
there is a suite of different detection techniques that tell us at least some of this information.
Piecing together the information from many different detectors has enabled
us to obtain a rather comprehensive understanding of neutrino masses and mixings.
Let us review some of those techniques.
Neutrino detection is made possible only by the effects neutrinos have on other
particles. Thus, the neutrinos must first interact with normal matter and then the
effects of this interaction must be registered in some way. In the following paragraphs
we describe some of these primary detection techniques applied to final states from
neutrino-induced scattering.
Cherenkov light from ν + N → ℓ + N′, where N and N′ are nucleons. A highly
energetic incoming neutrino can be converted to a lepton by charged-current processes:
At MeV-scale energies, relevant for solar neutrinos and supernova neutrinos and low-
energy atmospheric neutrinos, the first interaction of inverse beta decay is signifi-
cantly more important for water Cherenkov reactions than the second. The reason is
that the hydrogen atom only contains the proton, and charged-current neutrino inter-
actions on neutrons within oxygen are very suppressed. Now, if the final state lepton
has velocity greater than the velocity of light in the medium it will emit Cherenkov
radiation as it traverses the detection volume. This technique is employed in exper-
iments with very large volumes of water or ice, which have plenty of protons and
neutrons with which neutrinos can interact. Photomultiplier tubes (or other types
of photon detectors) are utilized to record this signal of neutrino interaction. They
typically can infer some directional and energy information. Muons are particularly
interesting final states since one can measure their macroscopic decay lengths, which
are affected by the muon’s in-flight energy, by the length of their Cherenkov tails.
This in turn enables one to determine the parent neutrino’s energy up to standard kine-
matic inference uncertainties. Experiments that have utilized this technique include
IMB, Super-Kamiokande, and IceCube.
For high-energy scattering there can be difficulties resolving the charge of the final
state lepton. For example, the μ− in νμ n → μ− p has the same Cherenkov radiation
emitted as the μ+ from ν̄μ p → μ+ n. To resolve the difference one can try to detect
the resulting p through its own Cherenkov radiation. However, in many detectors the
proton’s radiation is too low to be discernible. The neutron in the μ+ n final state,
on the other hand, can be detected provided the detector is doped with Gd. Gd has a
neutron capture rate more than 160,000 times that of a free proton. After Gd captures
a neutron it emits a cascade of higher-energy γ-rays that produce a discernible signal in the
photon detectors. The flash of light from the Gd+n capture happens some microseconds after
the μ+ Cherenkov radiation, thereby tagging the event as a positively charged μ+.
The main material for a Cherenkov radiation detector does not have to be water or
ice. For example, the MiniBooNE collaboration used mineral oil. The two primary
advantages are higher index of refraction and the presence of scintillation light.
Higher index gives more Cherenkov radiation, and thus a stronger signal with lower
thresholds. The presence of scintillation light enables additional handles for neutrino
identification. The chief disadvantage is that much more complicated modeling is
necessary of light generation and transmission within the mineral oil medium.
Cherenkov light from elastic νe− → νe− . Any neutrino or antineutrino can scatter
off atomic electrons elastically through ν X + e− → ν X + e− . The final state electron
can be kicked to highly relativistic velocities which then Cherenkov radiates as it
traverses the detection volume. Although any neutrino species can take part in this
interaction, the most efficient scattering is νe e− → νe e− , which then imparts a very
forward kick to the final state electron in line with the original incoming neutrino.
This process has high correlation in direction with the source (e.g., the Sun).
Deuteron dissociation signals. The defining feature of the SNO detector, which was
key in resolving the solar neutrino problem, was its large tank of heavy water (D2O).
The presence of deuterons enables several detection techniques simultaneously. An
electron neutrino can dissociate a deuteron through the charged-current interaction
νe + d → e− + p + p. (12.40)
The νe threshold for this interaction to occur is 1.44 MeV. The electron in the final
state can emit Cherenkov radiation as described above. In addition, the deuteron
can dissociate through the neutral-current interaction
νX + d → νX + n + p (12.41)
where ν X is any species of neutrino or antineutrino. The ν X threshold for this inter-
action to occur is the deuteron binding energy of 2.2 MeV. In this case there is no
Cherenkov radiation from the initial products of the ν X scattering. However, the
free neutron produced may be captured by a deuteron nucleus to produce the ³H
isotope plus a 6.25 MeV gamma ray (γ ). This is then followed by Compton scat-
tering γ e− → γ e− that kicks an electron to sufficiently high velocity to produce
Cherenkov radiation, which can be detected. A later phase of the SNO experiment
added NaCl salt to the heavy water, which enticed the free neutrons to be captured by
the Chlorine with subsequent production of gamma rays with more energy available
(8.6 MeV) thereby increasing the efficiency of neutrino detection.
Calorimetry tracks from $\nu_\ell N \to \ell + X$, $\nu_\ell + X$. For neutrinos with energy above
the GeV range, interactions are inelastic hard scatterings that make showers of particles.
A tracking calorimeter can measure the hadronic jet that results. If the $\nu_\ell$
interaction is a charged-current interaction producing a lepton ℓ, there will be an
additional visible charged-lepton track in the calorimeter. The MINOS and NOνA
detectors utilize this technique, although their precise methodologies for producing
and detecting the tracks are different.
Scintillation light and drift electrons from νe + Ar → e− + K and $\nu_\ell$ + Ar →
ℓ + p + X. This is a key process for the ICARUS, MicroBooNE, and DUNE detectors.
Argon has several features that increase efficiency and quality of neutrino detection.
First, liquid Argon is a very dense substance making for an increased number
of neutrino interactions per unit volume compared to water, for example. When an
electron neutrino traverses liquid Argon it converts to an electron through charged-
current interactions which simultaneously convert a neutron within the Argon to a
proton, making Potassium. For higher energy neutrinos, the interaction can best be
thought of as $\nu_\ell$ + Ar → ℓ + p + X, where the ℓ-lepton and proton produce ionizing
tracks as they traverse the detector volume. The radiation creates a scintillation
signal of light within the detector. In addition to this signal, the detector is composed
of modules with applied electric field gradients and charge readout planes. The read-
out planes are the anode planes of the electric field gradient, and they register a
signal of the electrons arriving. Combining all the information from the immediate
scintillation light and the somewhat later charge readout signals of arriving ionized
electrons allows for 3D reconstruction of the event, which thereby enables good identification
of incoming neutrinos and more accurate reconstruction of their energy. These
detectors are generally called Liquid Argon Time Projection Chambers (LArTPCs),
whose high sensitivity to neutrinos is nicely complementary to the high sensitivity
to anti-neutrinos of water Cherenkov detectors.
12.5 Direct Limits on Neutrino Masses 343
\[
{}^{3}_{1}\mathrm{H} \to {}^{3}_{2}\mathrm{He}^{+} + e^- + \bar\nu_e .
\tag{12.42}
\]
The released energy of this decay is very small due to the small mass difference
between tritium and the Helium isotope. This decay is chosen by design, since the small
energy release makes it maximally sensitive to a possible mass of the ν̄e particle. For
massless neutrinos, the electron kinetic energy can reach as high as
$E_{e,\max}(m_\nu = 0) = 18.6$ keV. However, if the neutrino does have mass the maximum
energy is lowered, $E_{e,\max}(m_\nu \neq 0) < E_{e,\max}(m_\nu = 0)$, and the energy spectrum
of the electron is distorted as it reaches its end-point.
We can find $E_{e,\max}(m_\nu)$ from inspection of the differential decay width:
\[
\frac{d\Gamma}{dE} = C(E)\,(E_0 - E)\,\sqrt{(E_0 - E)^2 - m^2_{\bar\nu_e}}
\tag{12.43}
\]
where $E_0$ is the released energy of the decay, E is the electron’s kinetic energy, and
C(E) is an energy-dependent factor that does not depend on the neutrino mass.7
We see from (12.43) that $E_{e,\max} = E_0 - m_{\bar\nu_e}$.
Thus, a very careful measurement of the electron energy spectrum near its maximum may
show a signal for non-zero neutrino mass. The current 90% CL limit from the KATRIN
experiment (Nature Phys. 18, 160 (2022)) is $m_{\bar\nu_e} < 0.7$ eV. The projected sensitivity of
KATRIN is such that it should find evidence for neutrino mass unless $m_{\bar\nu_e} \lesssim 0.3$ eV. The
limit from earlier, similar Tritium decay experiments is $m_{\bar\nu_e} < 2$ eV.
The limit above is expressed as a limit on the mass of the flavor eigenstate, $m_{\bar\nu_e}$, which is not
a mass eigenstate. To be rigorous we need to decide what the limits really are on the
mass eigenstates from the Tritium decay electron end-point spectrum. In terms of mass
eigenstates the decay spectrum is
\[
\frac{d\Gamma}{dE} = \sum_i |U_{ei}|^2\, C(E)\,(E_0 - E)\,\sqrt{(E_0 - E)^2 - m^2_{\bar\nu_i}}\;\theta(E_0 - E - m_{\bar\nu_i}) ,
\tag{12.44}
\]
where the θ(x) function enforces the requirement that the decay is kinematically
allowed. Supposing that all neutrino masses are below the release energy of 18.6 keV
one can dispense with this function.
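There is a convenient simplification of (12.44) that is applicable when $m_{\bar\nu_i} \ll (E_0 - E)$:
\[
\begin{aligned}
\frac{d\Gamma}{dE} &= \sum_i |U_{ei}|^2\, C(E)\,(E_0 - E)^2 \left[ 1 - \frac{1}{2}\,\frac{m^2_{\bar\nu_i}}{(E_0 - E)^2} + \cdots \right] \\
&\simeq C(E)\,(E_0 - E)^2 \left[ 1 - \frac{1}{2}\,\frac{\sum_i |U_{ei}|^2 m^2_{\bar\nu_i}}{(E_0 - E)^2} + \cdots \right] \\
&\simeq C(E)\,(E_0 - E)\,\sqrt{(E_0 - E)^2 - \hat{m}^2_{\bar\nu_e}}
\end{aligned}
\tag{12.45}
\]
where
\[
\hat{m}^2_{\bar\nu_e} \equiv \sum_i |U_{ei}|^2\, m^2_{\bar\nu_i} .
\tag{12.46}
\]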
The expansion is useful even though the maximum distortion of the spectrum is when
$E_0 - E \to m_{\bar\nu_i}$. This is due to the extremely low probability of measuring events with
E so close to the E 0 limit. Measured events occur at energies where the approximation
holds, and it is with these events that sensitivity to neutrino mass is obtained.
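The quality of the single-effective-mass approximation can be checked numerically. The Python sketch below compares the mass-eigenstate sum of (12.44) with the effective form of (12.45) a few eV below the end-point. C(E) is set to 1, and the masses and |U_ei|² weights are hypothetical illustration values, not measurements.

```python
import math

def spectrum_shape(e_kin, e0, masses, weights):
    """Sum over mass eigenstates as in Eq. (12.44), with C(E) set to 1 (assumption)."""
    eps = e0 - e_kin
    total = 0.0
    for m, w in zip(masses, weights):
        if eps > m:  # theta function: this channel is kinematically open
            total += w * eps * math.sqrt(eps * eps - m * m)
    return total

def effective_mass(masses, weights):
    """m-hat of Eq. (12.46): sqrt(sum_i |U_ei|^2 m_i^2)."""
    return math.sqrt(sum(w * m * m for m, w in zip(masses, weights)))

E0 = 18.6e3                    # end-point energy in eV, as quoted in the text
MASSES = [0.10, 0.1004, 0.112] # hypothetical mass eigenvalues in eV
WEIGHTS = [0.68, 0.30, 0.02]   # hypothetical |U_ei|^2, summing to 1

m_hat = effective_mass(MASSES, WEIGHTS)
exact = spectrum_shape(E0 - 5.0, E0, MASSES, WEIGHTS)
approx = spectrum_shape(E0 - 5.0, E0, [m_hat], [1.0])
print(f"m_hat = {m_hat:.4f} eV, relative difference = {abs(exact - approx) / exact:.2e}")
```

For nearly degenerate masses the two forms agree to high accuracy away from the end-point, which is why limits are quoted on the single effective mass.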
Notice that (12.45) is of the same form as (12.43). Thus, the limits from end-point
analysis of electron energies in Tritium decays apply to $\hat{m}_{\bar\nu_e}$. There are similar
results that can be obtained for $\hat{m}_{\nu_\mu}$ and $\hat{m}_{\nu_\tau}$ from careful measurements of decay
kinematics in pion decays and τ-lepton decays, respectively. These results and the
one on νe given above are
The direct upper limits on $\hat{m}_{\nu_\mu}$ and $\hat{m}_{\nu_\tau}$ are not independently very constraining: much
stronger bounds on them follow from combining the constraint on $\hat{m}_{\nu_e}$ with the $\Delta m^2_{ij}$
determinations from oscillation experiments and observations. Future experiments hope
to improve on these results, or indeed find the absolute mass scale of neutrinos.
As we have discussed, in addition to the natural sources of neutrinos from the sun,
from cosmic ray collisions with the atmosphere, and from supernovae, there are numerous
human-made sources at a variety of nuclear reactors and proton-on-target facilities.
Likewise, there are numerous experiments that have been constructed to detect
12.6 Neutrino Properties and Future Goals 345
these neutrinos. The experiments each have different capacities and sensitivities to
the various final states, which include neutrinos and anti-neutrinos of all flavors.
There is no easy way to combine all of these source-detector experimental permu-
tations into an easy summary, except to do a global fit of all data to determine the
mass splittings of the mass eigenstates and the entries of the PMNS matrix.
At this writing the data can be interpreted to be consistent with the following two
distinct possibilities (see Fig. 12.1). The first is the normal hierarchy (NH):
In both cases the CP violation phase δ is not constrained well, and there is not yet a
definitive determination that it is nonzero.
Two of the most important goals of the future neutrino program are to determine
the level of CP violation, if any, in the neutrino interactions, and to determine whether
the NH or IH is the correct relative ordering of neutrino mass eigenstates.
One method to determine if there is CP violation among neutrinos is to take
careful measurements of the electron neutrino appearance rate from, say νμ → νe
oscillations, and compare that with the rate of electron antineutrino appearance
from ν̄μ → ν̄e . This is ideally done for neutrino sources that can switch back and
forth between π − and π + beams. As discussed above, this is possible by select-
ing the charge of the pions directed toward the detector, which in turn selects for
π + → νe ν̄μ νμ + X or π − → ν̄e ν̄μ νμ + X . The detector then should have some
sensitivity to both νe and ν̄e. The T2K experiment can do this. It uses the high-intensity
J-PARC proton beam on a target to produce copious pions.
The pions then decay to (anti)-neutrinos which travel to the Super-
Kamiokande detector 295 km away. Some of those neutrinos oscillate to electron
neutrinos along the journey and are detected by Super-Kamiokande as such. Current
results8 suggest a somewhat higher number of νe than would be expected when
δ = 0, and, consistently, a somewhat lower number of ν̄e . The δ = 0 point in param-
eter space from this analysis is ruled out at the 95% CL. Nevertheless, more data
will be required to establish this result at a higher confidence level and converge on
a value for δ.
Determining whether neutrino masses obey NH or IH is difficult, and
many ideas have been proposed over the years. Several rely on the ability
to do precision measurements on observables constructed to be sensitive to the sign
of $\Delta m^2_{31}$. For example, the difference between neutrino oscillation probabilities and
anti-neutrino oscillation probabilities is sensitive to this sign (see (12.27)). However,
a non-zero value requires CP violation and there is ambiguity in the extraction of the
δ angle and sgn($\Delta m^2_{31}$).
To go a step deeper, one can show9 that when the neutrinos pass through matter
the general expression for probability of transition from a muon (anti)-neutrino to
an electron (anti)-neutrino changes to
\[
\begin{aligned}
P(\nu_\mu \to \nu_e) ={}& \sin^2\theta_{23}\,\sin^2 2\theta_{13}\,\frac{\sin^2[(1-x)\Delta_{31}]}{(1-x)^2}
+ \left(\frac{\Delta m^2_{21}}{\Delta m^2_{31}}\right)^{2} \cos^2\theta_{23}\,\sin^2 2\theta_{12}\,\frac{\sin^2(x\Delta_{31})}{x^2} \\
&+ \frac{\Delta m^2_{21}}{\Delta m^2_{31}}\,\sin 2\theta_{13}\,\sin 2\theta_{12}\,\sin 2\theta_{23}\,\cos(\Delta_{31} + \delta)\,
\frac{\sin[(1-x)\Delta_{31}]}{1-x}\,\frac{\sin(x\Delta_{31})}{x}
\end{aligned}
\]
where $x \equiv 2\sqrt{2}\, E\, G_F n_e/\Delta m^2_{31}$ and $\Delta_{31} \equiv \Delta m^2_{31} L/(4E)$. The same expression
holds for P(ν̄μ → ν̄e) except that δ → −δ, and also x → −x due to the change
in sign of the potential for ν̄e passing through the medium compared to νe, as discussed
earlier in Sect. 12.3. Comparing the neutrino and anti-neutrino oscillation
rates carefully at different distances from the source, one is ultimately able to determine
the sign of $\Delta_{31}$ and thus determine NH or IH. To maximize the discriminating
capability of the matter effects, it is helpful to have data from very long baselines, such
as the NOνA experiment, whose detector in Ash River, Minnesota is 810 km away
from the neutrino source at Fermilab.
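The hierarchy sensitivity of this expression can be illustrated with a short Python sketch. It evaluates the three terms for neutrinos and anti-neutrinos under both signs of Δm²₃₁. The conversion of matter density to potential (about 7.63 × 10⁻¹⁴ eV per g/cm³ of ρY_e), the density value, the mixing angles, and the function names are assumptions for illustration.

```python
import math

V_PER_RHO_YE = 7.63e-14  # sqrt(2) G_F n_e in eV for rho * Y_e = 1 g/cm^3

def ratio_sin(scale, d31):
    """sin(scale * d31) / scale, with the scale -> 0 limit handled."""
    return d31 if abs(scale) < 1e-12 else math.sin(scale * d31) / scale

def p_mue_matter(t12, t23, t13, delta_cp, dm21, dm31, length_km, energy_gev,
                 rho_ye=0.0, antineutrino=False):
    """Approximate P(nu_mu -> nu_e) in constant-density matter, as written above."""
    x = 2.0 * energy_gev * 1.0e9 * V_PER_RHO_YE * rho_ye / dm31
    if antineutrino:
        x, delta_cp = -x, -delta_cp  # potential and CP phase both flip sign
    d31 = 1.267 * dm31 * length_km / energy_gev
    alpha = dm21 / dm31
    f1 = ratio_sin(1.0 - x, d31)  # sin[(1 - x) D31] / (1 - x)
    f2 = ratio_sin(x, d31)        # sin(x D31) / x
    term1 = math.sin(t23) ** 2 * math.sin(2.0 * t13) ** 2 * f1 ** 2
    term2 = alpha ** 2 * math.cos(t23) ** 2 * math.sin(2.0 * t12) ** 2 * f2 ** 2
    term3 = (alpha * math.sin(2.0 * t13) * math.sin(2.0 * t12) * math.sin(2.0 * t23)
             * math.cos(d31 + delta_cp) * f1 * f2)
    return term1 + term2 + term3

# Illustrative (assumed) inputs: NOvA-like baseline and energy, delta = 0.
ANGLES = (math.radians(33.4), math.radians(45.0), math.radians(8.6))
for label, dm31 in (("NH", 2.5e-3), ("IH", -2.5e-3)):
    p_nu = p_mue_matter(*ANGLES, 0.0, 7.5e-5, dm31, 810.0, 2.0, rho_ye=1.4)
    p_nubar = p_mue_matter(*ANGLES, 0.0, 7.5e-5, dm31, 810.0, 2.0, rho_ye=1.4,
                           antineutrino=True)
    print(f"{label}: P(nu) = {p_nu:.4f}, P(nubar) = {p_nubar:.4f}")
```

With these inputs and δ = 0, matter enhances the neutrino channel for one sign of Δm²₃₁ and the anti-neutrino channel for the other, which is the handle on the hierarchy described in the text.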
In time, nature’s chosen hierarchy for neutrino masses might be determined by a
combination of currently accruing data, at T2K and NOνA for example. Nevertheless,
the future experiments, such as LBNF/DUNE, will contribute significantly to the
global effort that should culminate in a decisive determination of the hierarchy.
Problems
9 R. N. Cahn et al. “White Paper: Measuring the Neutrino Mass Hierarchy.” arXiv:1307.5487.
4. Make a table of $E_\nu$ values down the vertical and $\Delta m^2$ values across the horizontal
and compute the oscillation distance $L^{\rm osc}$ for each combination. Choose the $\Delta m^2$
values to be $\Delta m^2_{\rm sol} = 7.5 \times 10^{-5}$ eV$^2$ and $\Delta m^2_{\rm atm} = 2.5 \times 10^{-3}$ eV$^2$ and choose
the $E_\nu$ values to be 1 MeV, 10 MeV, 100 MeV, 1 GeV, 10 GeV and 100 GeV.
Appendix
A
The value of c is exact, by definition. (Since October 1983, the official definition of
1 m is the distance traveled by light in a vacuum in exactly 1/299792458 of a second.)
In units with c = ħ = 1, some other conversion factors are:
Conversions of particle decay widths to mean lifetimes and vice versa are obtained
using:
and in reverse:
where
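\[
\sigma^0 = \bar\sigma^0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} ; \quad
\sigma^1 = -\bar\sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} ; \quad
\sigma^2 = -\bar\sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} ; \quad
\sigma^3 = -\bar\sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} .
\tag{A.2.2}
\]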
γ 0† = γ 0 ; (γ 0 )2 = 1 (A.2.3)
γ j† = −γ j ( j = 1, 2, 3) (A.2.4)
γ 0 γ μ† γ 0 = γ μ (A.2.5)
γμ γν + γν γμ = {γμ , γν } = 2gμν (A.2.6)
[γρ , [γμ , γν ]] = 4(gρμ γν − gρν γμ ) (A.2.7)
Tr(1) = 4 (A.2.8)
Tr(γμ γν ) = 4gμν (A.2.9)
Tr(γμ γν γρ γσ ) = 4(gμν gρσ − gμρ gνσ + gμσ gνρ ) (A.2.10)
Tr(γμ1 γμ2 . . . γμ2n ) = gμ1 μ2 Tr(γμ3 γμ4 . . . γμ2n ) − gμ1 μ3 Tr(γμ2 γμ4 . . . γμ2n )
. . . + (−1)k gμ1 μk Tr(γμ2 γμ3 . . . γμk−1 γμk+1 . . . γμ2n ) + . . .
+gμ1 μ2n Tr(γμ2 γμ3 . . . γμ2n−1 ) (A.2.11)
Tr(γ5 ) =0 (A.2.15)
Tr(γμ γ5 ) =0 (A.2.16)
Tr(γμ γν γ5 ) =0 (A.2.17)
Tr(γμ γν γρ γ5 ) =0 (A.2.18)
Tr(γμ γν γρ γσ γ5 ) = 4iεμνρσ (A.2.19)
γ^μ γμ = 4 (A.2.20)
γ^μ γν γμ = −2γν (A.2.21)
γ^μ γν γρ γμ = 4gνρ (A.2.22)
γ^μ γν γρ γσ γμ = −2γσ γρ γν (A.2.23)
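Several of these identities are easy to verify numerically. The Python sketch below assumes a chiral representation with σ^μ in the upper-right block and σ̄^μ in the lower-left block of γ^μ, and a metric of signature (+,−,−,−); both are assumptions consistent with (A.2.2) to (A.2.6) but not spelled out in this excerpt. The helper names are illustrative.

```python
SIGMA = [
    [[1, 0], [0, 1]],
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
METRIC = [1, -1, -1, -1]  # assumed diagonal metric, signature (+,-,-,-)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)] for i in range(4)]

def scaled(c, a):
    return [[c * x for x in row] for row in a]

def added(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def gamma(mu):
    """Chiral-basis gamma^mu: sigma^mu upper right, sigma-bar^mu lower left (assumed)."""
    sign = 1 if mu == 0 else -1  # sigma-bar^0 = sigma^0, sigma-bar^j = -sigma^j
    upper = [[0, 0] + list(row) for row in SIGMA[mu]]
    lower = [[sign * x for x in row] + [0, 0] for row in SIGMA[mu]]
    return upper + lower

def trace(a):
    return sum(a[i][i] for i in range(4))

def close(a, b, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(a, b) for x, y in zip(ra, rb))

IDENTITY = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
G = [gamma(mu) for mu in range(4)]

def contract(inner):
    """sum_mu gamma^mu * inner * gamma_mu, lowering the index with the metric."""
    total = [[0] * 4 for _ in range(4)]
    for mu in range(4):
        total = added(total, scaled(METRIC[mu], matmul(matmul(G[mu], inner), G[mu])))
    return total

print("gamma^mu gamma_mu = 4:", close(contract(IDENTITY), scaled(4, IDENTITY)))
print("gamma^mu gamma^nu gamma_mu = -2 gamma^nu:",
      all(close(contract(G[nu]), scaled(-2, G[nu])) for nu in range(4)))
```

The same helpers can be used to check the trace identities, for example that Tr(γ^μ γ^ν) equals four times the metric.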
\[
\sum_s u(p,s)\,\bar{u}(p,s) = \not{p} + m
\tag{A.2.29}
\]
\[
\sum_s v(p,s)\,\bar{v}(p,s) = \not{p} - m
\tag{A.2.30}
\]
Throughout the text we have often referenced “RPP,” which is the Review of Particle
Properties publication of the Particle Data Group, listed here:
Workman, R.L. et al. (Particle Data Group), “Review of Particle Properties”, to be
published in Prog. Theor. Exp. Phys. 2022, 083C01 (2022). http://pdg.lbl.gov/.
Quantum Field Theory
M. Peskin, D.V. Schroeder, Introduction to Quantum Field Theory (Perseus Books,
1995)
L.H. Ryder, Quantum Field Theory, 2nd edn. (Cambridge University Press, 1996)
M.D. Schwartz, Quantum Field Theory and the Standard Model (Cambridge Uni-
versity Press, 2014)
M. Srednicki, Quantum Field Theory (Cambridge University Press, 2007)
The Standard Model
C. Burgess, G. Moore, The Standard Model: A Primer (Cambridge University Press,
2007)
J.F. Donoghue, E. Golowich, B.R. Holstein, Dynamics of the Standard Model (Cam-
bridge University Press, 1992)
H. Georgi, Weak Interactions and Modern Particle Theory (Dover Publications,
2009)
M. Thomson, Modern Particle Physics (Cambridge University Press, 2013)
Collider Physics
V.D. Barger, R.J.N. Phillips, Collider Physics (Addison-Wesley, 1987)
J. Campbell, J. Huston, F. Krauss, The Black Book of Quantum Chromodynamics: A
Primer for the LHC Era (Oxford University Press, 2018)
M. Krämer, F.J.P. Soler (eds.), Large Hadron Collider Phenomenology (Institute of
Physics, Bristol, 2004)
T. Plehn, Lectures on LHC Physics. arXiv:0910.4182 [hep-ph]
Group Theory
J.F. Cornwell, Group Theory in Physics, vols. 1 and 2 (Academic, 1984)
H. Georgi, Lie Algebras in Particle Physics (Westview Press, 1999)
B.C. Hall, Lie Groups, Lie Algebras, and Representations (Springer, 2003)
P. Ramond, Group Theory: A Physicist’s Survey (Cambridge University Press, 2010)
B.G. Wybourne, Classical Groups for Physicists (Wiley, 1974)
Supersymmetry
S.P. Martin, A Supersymmetry Primer. arXiv:hep-ph/9709356
Index
A
Abelian (commutative) group, 200
Action, 23, 52–54
Active quarks, 231, 232
Adjoint representation, 204, 205, 208–210, 212, 213, 217, 218
Altarelli–Parisi (DGLAP) equations, 237, 241
Angular momentum conservation, 124–126, 160, 170
Angular momentum operator, 39
Angular resolution, 131
Annihilation operator, 59, 60, 64, 65
Anticommutation relations, 65–67, 106
Antineutrino, 165, 167
Antineutrino-electron scattering, 183
Antiparticle, 2–4, 7, 43, 47, 51, 52, 64, 96, 119, 120
Associativity property of group, 200
Asymptotic freedom, 232, 234
ATLAS detector at LHC, 251
B
BaBar experiment, 169
Bare coupling, 226, 228
Bare mass, 227, 228
Barn (unit of cross-section), 73, 74, 349
Barred Dirac spinor, 36
Baryons, 4–6
  (J = 1/2), 5
  (J = 3/2), 5, 6
Belle experiment, 169
Beta function, 230, 233, 234
  QCD, 230
  QED, 233
Bhabha scattering, 126
Biunitary transformation, 284
B mesons, 308
Boost, 14, 22, 34, 35
Bose-Einstein statistics, 61
Bottomonium, 7, 8, 251
Branching ratios, 10, 160
  charged pion, 167, 187
  Higgs boson, 160, 283
  lepton-number violating limits, 168, 169
  W boson, 289
  Z boson, 274
Breit-Wigner lineshape, 2, 251
C
Cabibbo angle, 287, 288
Cabibbo-Kobayashi-Maskawa (CKM) mixing, 287, 288
Canonical commutation relations, 39
Canonical quantization
  complex scalar field, 189
  Dirac fermion fields, 66
  real scalar fields, 59
Carbon-14 dating, 166
© Springer Nature Switzerland AG 2022
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7
neutrino, 289
Z
Z boson, 2, 251
  branching ratios, 274
  couplings to fermions, 274
  interactions with Z, W, h, 278, 279
  mass, measured, 2
  mass, prediction, 277
  resonance at LHC, 251
  width, 2