Graduate Texts in Physics

Stephen P. Martin
James D. Wells

Elementary
Particles
and Their
Interactions
Graduate Texts in Physics

Series Editors
Kurt H. Becker, NYU Polytechnic School of Engineering, Brooklyn, NY, USA
Jean-Marc Di Meglio, Matière et Systèmes Complexes, Bâtiment Condorcet, Université
Paris Diderot, Paris, France
Sadri Hassani, Department of Physics, Illinois State University, Normal, IL, USA
Morten Hjorth-Jensen, Department of Physics, Blindern, University of Oslo, Oslo,
Norway
Bill Munro, NTT Basic Research Laboratories, Atsugi, Japan
Richard Needs, Cavendish Laboratory, University of Cambridge, Cambridge, UK
William T. Rhodes, Department of Electrical Engineering and Computer Science,
Florida Atlantic University, Boca Raton, FL, USA
Susan Scott, Australian National University, Acton, Australia
H. Eugene Stanley, Center for Polymer Studies, Physics Department, Boston
University, Boston, MA, USA
Martin Stutzmann, Walter Schottky Institute, Technical University of Munich,
Garching, Germany
Andreas Wipf, Institute of Theoretical Physics, Friedrich-Schiller-University Jena,
Jena, Germany
Graduate Texts in Physics publishes core learning/teaching material for graduate-
and advanced-level undergraduate courses on topics of current and emerging fields
within physics, both pure and applied. These textbooks serve students at the MS-
or PhD-level and their instructors as comprehensive sources of principles,
definitions, derivations, experiments and applications (as relevant) for their mastery
and teaching, respectively. International in scope and relevance, the textbooks
correspond to course syllabi sufficiently to serve as required reading. Their didactic
style, comprehensiveness and coverage of fundamental material also make them
suitable as introductions or references for scientists entering, or requiring timely
knowledge of, a research field.
Stephen P. Martin
Physics Department
Northern Illinois University
DeKalb, IL, USA

James D. Wells
Physics Department
University of Michigan
Ann Arbor, MI, USA

ISSN 1868-4513 ISSN 1868-4521 (electronic)

Graduate Texts in Physics
ISBN 978-3-031-14367-0 ISBN 978-3-031-14368-7 (eBook)
https://doi.org/10.1007/978-3-031-14368-7

© Springer Nature Switzerland AG 2022

This work is subject to copyright. All rights are solely and exclusively licensed by the Publisher, whether
the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse
of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and
transmission or information storage and retrieval, electronic adaptation, computer software, or by similar
or dissimilar methodology now known or hereafter developed.
The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication
does not imply, even in the absence of a specific statement, that such names are exempt from the relevant
protective laws and regulations and therefore free for general use.
The publisher, the authors, and the editors are safe to assume that the advice and information in this book
are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or
the editors give a warranty, expressed or implied, with respect to the material contained herein or for any
errors or omissions that may have been made. The publisher remains neutral with regard to jurisdictional
claims in published maps and institutional affiliations.

This Springer imprint is published by the registered company Springer Nature Switzerland AG
The registered company address is: Gewerbestrasse 11, 6330 Cham, Switzerland
Preface

Our most fundamental understanding of the laws of nature is embodied in the
theories of General Relativity and the Standard Model of elementary particle physics.
There are many excellent books about the Standard Model for students to consult.
However, the assumed background for the students is different for every book, and
the emphasis is different. For example, some authors do not assume that the
students have a good understanding of quantum field theory, and so present particle
physics without it. Other authors, on the other hand, present the material with the
assumption that the student already has a good working knowledge of it.
In contrast, our book is intended to be a one-semester course for graduate
students or advanced undergraduates that develops particle physics and quantum
field theory with equal emphasis while pursuing two goals. First, we want
students to come away with a basic and solid understanding of quantum field theory
techniques aimed at computing observables that are commonly studied by
experimentalists, such as $pp$ and $e^+ e^-$ collisions and particle decays. Second, we want
students to gain a comprehensive survey of the full structure of the Standard Model
of elementary particle physics. In other words, students will learn what the
basic constituents of nature are (leptons, quarks, etc.), the symmetries that they obey,
and the resulting interactions among them.
Our hope is that if a student has only one formal, structured course to devote
to particle physics, for one reason or another, this book would be a good one
for that purpose. We have successfully taught the material of this book at
Northern Illinois University and at the University of Michigan. At Northern Illinois
University, this book constitutes the material of the most advanced formal course
that a particle physicist usually takes before beginning guided research. The book
was originally conceived with that goal in mind. At the University of Michigan,
this course is taught as the first particle physics course to graduate students in
their first year, where no prior knowledge of quantum field theory is assumed.
Furthermore, numerous graduate students outside of particle physics
(mathematics, engineering, etc.) have taken the course as part of their “graduate cognate
requirement” at the University, partly because they know that in one semester they
have the opportunity to learn both quantum field theory fundamentals and particle
physics.
Given the aim of this book, to be a one-semester course, decisions have been
made to focus the material for that purpose. We have left out topics that are more
suitable to future advanced courses or research. For example, we do not
extensively cover renormalization techniques, higher order corrections, and
renormalization group analysis. We provide a brief introduction to these ideas within the QCD
chapter and allude to their importance in various places, but a more comprehensive
treatment is not possible given the goal of this book to be a first course in particle
physics. In addition, we say very little about ideas that go beyond the Standard
Model of particle physics. The Standard Model has many deficiencies, such as no
dark matter candidate, no understanding of the matter dominance over anti-matter
in the current universe, and no explanations for the large hierarchies between the
elementary particle masses or the electroweak scale and the Planck scale. These are
fascinating issues close to our hearts, but a first-semester course must be restricted
to what we know and not to what we suspect or think might be, even though we
agree that those are worthy topics of discussion for later inquiry.
The reader will also note that the book is very centered on calculation and
empirical observations. The reason for this is two-fold. First, quantum field theory
is best learned on a first pass through its applications to concrete scattering and
decay problems. Second, a primary goal of the book is to be valuable to
experimentalists and theorists who are data-oriented. Particle physics is an empirical field
at its foundation, and this book reflects that viewpoint.
Finally, we wish to thank the many people who have helped us in the writing
of this book. Special thanks go to Prudhvi Bhattiprolu and Aaron Pierce for careful
readings and technical help. In addition, we thank the many students who have
provided invaluable feedback.

DeKalb, USA Stephen P. Martin

Ann Arbor, USA James D. Wells
Contents

1 Introduction
1.1 Fundamental Forces
1.2 Resonances, Widths, and Lifetimes
1.3 Leptons and Quarks
1.4 Hadrons
1.5 Decays and Branching Ratios

2 Special Relativity and Lorentz Transformations
2.1 Lorentz Transformations
2.2 Relativistic Kinematics
2.3 Tensors and Lorentz Invariant Quantities
2.4 Maxwell’s Equations and Electromagnetism
Problems

3 Relativistic Quantum Mechanics of Single Particles
3.1 Klein-Gordon and Dirac Equations
3.2 Solutions of the Dirac Equation
3.3 The Weyl Equation
3.4 Majorana Fermions
Problems

4 Field Theory and Lagrangians
4.1 The Field Concept and Lagrangian Dynamics
4.2 Quantization of Free Scalar Field Theory
4.3 Quantization of Free Dirac Fermion Field Theory
4.4 Scalar Field with $\phi^4$ Coupling
4.5 Scattering Processes and Cross-Sections
4.6 Scalar Field with $\phi^3$ Coupling
4.7 Feynman Rules
Problems

5 Quantum Electro-Dynamics (QED)
5.1 QED Lagrangian and Feynman Rules
5.2 Electron-Positron Scattering
5.2.1 $e^- e^+ \to \mu^- \mu^+$
5.2.2 $e^- e^+ \to f \bar{f}$
5.2.3 Helicities in $e^- e^+ \to \mu^- \mu^+$
5.2.4 Bhabha Scattering ($e^- e^+ \to e^- e^+$)
5.3 Crossing Symmetry
5.3.1 $e^- \mu^+ \to e^- \mu^+$ and $e^- \mu^- \to e^- \mu^-$
5.3.2 Møller Scattering ($e^- e^- \to e^- e^-$)
5.4 Gauge Invariance in Feynman Diagrams
5.5 External Photon Scattering
5.5.1 Compton Scattering ($\gamma e^- \to \gamma e^-$)
5.5.2 $e^+ e^- \to \gamma \gamma$
Problems

6 Decay Processes
6.1 Decay Rates and Partial Widths
6.2 Two-Body Decays
6.3 Scalar Decays to Fermion-Antifermion Pairs: Higgs Decay
6.4 Three-Body Decays
Problems

7 Fermi Theory of Weak Interactions
7.1 Weak Nuclear Decays
7.2 Muon Decay
7.3 Corrections to Muon Decay
7.4 Inverse Muon Decay ($e^- \nu_\mu \to \nu_e \mu^-$)
7.5 $e^- \bar{\nu}_e \to \mu^- \bar{\nu}_\mu$
7.6 Charged Currents and $\pi^\pm$ Decay
7.7 Unitarity, Renormalizability, and the $W$ Boson
Problems

8 Quantum Chromo-Dynamics (QCD)
8.1 Groups and Representations
8.2 The Yang-Mills Lagrangian and Feynman Rules
8.3 QCD Lagrangian and Feynman Rules
8.4 Scattering of Quarks and Gluons
8.4.1 Quark-Quark Scattering ($qq \to qq$)
8.4.2 Gluon-Gluon Scattering ($gg \to gg$)
8.5 Renormalization
8.6 Parton Distribution Functions and Hadron-Hadron Scattering
8.7 Top-Antitop Production in $p\bar{p}$ and $pp$ Collisions
8.8 Kinematics in Hadron-Hadron Scattering
8.9 Drell-Yan Scattering ($\ell^+ \ell^-$ Production in Hadron Collisions)
Problems

9 Spontaneous Symmetry Breaking
9.1 Global Symmetry Breaking
9.2 Local Symmetry Breaking and the Higgs Mechanism
9.3 Goldstone’s Theorem and the Higgs Mechanism in General
Problems

10 The Standard Electroweak Model
10.1 $SU(2)_L \times U(1)_Y$ Representations and Lagrangian
10.2 The Standard Model Higgs Mechanism
10.3 Fermion Masses and Cabibbo-Kobayashi-Maskawa Mixing
10.4 Neutrino Masses and the Seesaw Mechanism
10.5 The Higgs Boson Discovery
10.5.1 Higgs Boson Decays Revisited
10.5.2 Higgs Boson Production at the LHC
10.5.3 Discovery Through $\gamma\gamma$ and $4\ell$ Final States
Problems

11 Neutral Meson Mixing
11.1 Neutral Kaons, $D$ Mesons and $B$ Mesons
11.2 Neutral Kaon Mixing
11.3 CP Eigenstates
11.4 Neutral Kaon Oscillations and Lifetimes
11.5 Neutral Kaon Decay to Pions
11.6 Direct CP Violation in Kaon Decay
11.7 Neutral Kaon Decays to Leptons
11.8 $K_L - K_S$ Mass Difference
Problems

12 Neutrinos
12.1 Neutrino Sources
12.1.1 Solar Neutrinos
12.1.2 Supernova Neutrinos
12.1.3 Atmospheric Neutrinos
12.1.4 Reactor and Accelerator Neutrino Sources
12.2 Neutrino Propagation Through Vacuum
12.2.1 Two-Generation Neutrino Oscillations
12.2.2 Three-Generation Neutrino Propagation
12.3 Neutrino Propagation Through Matter
12.4 Detecting Neutrinos
12.5 Direct Limits on Neutrino Masses
12.6 Neutrino Properties and Future Goals
Problems

Appendix
Index
1 Introduction

In this book, we will explore some of the tools necessary for attacking the fundamental
questions of elementary particle physics. These questions include:

• What fundamental particles is everything made out of?
• How do the particles interact with each other?
• What principles underlie the answers to these questions?
• How can we use this information to predict and interpret the results of
experiments?

The Standard Model of particle physics proposes some answers to these questions.
Although the Standard Model is an incomplete fundamental description of nature,
it is the benchmark against which future theories will be compared. Furthermore,
if new physics is uncovered at the CERN Large Hadron Collider (LHC), or other
future experiments, it is likely that it can be described using the same set of tools.
This Introduction contains a brief outline of the known fundamental particle
content of the Standard Model, for purposes of orientation. These and many other
experimental results about elementary particles can be found in the Review of Particle
Properties, hereafter known as the RPP.¹ Unless otherwise indicated, all experimental
data quoted in this book were obtained from the RPP.

¹ R. L. Workman et al. (Particle Data Group), to be published in Prog. Theor. Exp. Phys. 2022,
083C01 (2022), with frequent updates.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_1

1.1 Fundamental Forces

The known interaction forces in nature are the universal attraction of gravity, the
electromagnetic force, the weak nuclear force, and the strong nuclear force. Among
these, gravity is special and is governed by Einstein’s theory of General Relativity.
The other forces are described by gauge theories. The definition of gauge theories and
their properties will be explored extensively throughout this book. Here let it suffice to
say that a gauge force is one that is mediated by a spin-1 (vector) boson. The
force-mediator gauge bosons that we know about in the Standard Model are listed in Table 1.1.
The photon is the mediator of the electromagnetic force, while the $W^\pm$ and $Z^0$
bosons mediate the weak nuclear force, which is seen primarily in decays and in
neutrino interactions. The $W^+$ and $W^-$ bosons are antiparticles of each other, so they
have exactly the same mass and lifetime. The gluon has an exact 8-fold degeneracy
due to a degree of freedom known as “color”. Color is the charge associated with the
strong nuclear force. Particles that carry net color charges are always confined by
the strong nuclear force, meaning that they can only exist in bound states. Therefore,
no value is listed for the gluon lifetime, and the entry “0” for its mass is meant to
indicate only that the classical wave equation for it has the same character as that of
the photon. Although the fundamental spin-1 bosons are often called force carriers,
that is not their only role, since they are particles in their own right.

1.2 Resonances, Widths, and Lifetimes

Table 1.1 also includes information about the width of the $W$ and $Z$ particle
resonances, measured in units of mass, GeV/$c^2$. In general, resonances can be described
by a relativistic Breit-Wigner lineshape, which gives the probability for the kinematic
mass reconstructed from the production and decay of the particle to have a particular
value $M$, in the idealized limit of perfect detector resolution and an isolated state.
For a particle of mass $m$, the probability is:

$$P(M) = \frac{f(M)}{(M^2 - m^2)^2 + m^2\, \Gamma(M)^2}, \tag{1.1}$$

where $f(M)$ and $\Gamma(M)$ are functions that usually vary slowly over the resonance
region $M \approx m$, and thus can be treated as constants. The resonance width
$\Gamma \equiv \Gamma(m)$ is equivalent to the mean lifetime, which appears in the next
column of Table 1.1; they are related by

$$\tau\ (\text{in seconds}) = \frac{6.58212 \times 10^{-25}}{\Gamma\ (\text{in GeV}/c^2)}. \tag{1.2}$$

Table 1.1 The fundamental vector bosons of the Standard Model

Boson             Charge    Mass (GeV/$c^2$)       Width (GeV/$c^2$)      Lifetime (s)              Force
Photon $\gamma$   0         0                      0                      $\infty$                  EM
$W^\pm$           $\pm 1$   $80.379 \pm 0.012$     $2.085 \pm 0.042$      $3.14 \times 10^{-25}$    Weak
$Z^0$             0         $91.1876 \pm 0.0021$   $2.4952 \pm 0.0023$    $2.64 \times 10^{-25}$    Weak
Gluon $g$         0         “0”                                                                     Strong
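As a quick numerical illustration of Eq. (1.2), the following minimal Python sketch converts a width into a mean lifetime. The function and variable names are illustrative choices, not notation from the text; the only physics input is the constant $6.58212 \times 10^{-25}$ GeV s appearing in Eq. (1.2), and the widths are the central values from Table 1.1 with uncertainties ignored.

```python
# Numerical constant of Eq. (1.2): hbar expressed in GeV * s.
HBAR_GEV_S = 6.58212e-25


def lifetime_from_width(width_gev: float) -> float:
    """Mean lifetime in seconds for a total width given in GeV (Eq. 1.2)."""
    if width_gev <= 0:
        raise ValueError("width must be positive (zero width means a stable particle)")
    return HBAR_GEV_S / width_gev


# Central values of the measured widths from Table 1.1 (uncertainties ignored).
TABLE_1_1_WIDTHS_GEV = {"W": 2.085, "Z": 2.4952}

for name, width in TABLE_1_1_WIDTHS_GEV.items():
    print(f"{name}: tau = {lifetime_from_width(width):.3e} s")
```

For the $Z$ width this gives about $2.64 \times 10^{-25}$ s, in agreement with the Table 1.1 entry; for the central measured $W$ width it gives about $3.16 \times 10^{-25}$ s, close to the tabulated value.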

The RPP lists the mean lifetime $\tau$ for some particles, and the width $\Gamma$ for others.
Actually, the Standard Model of particle physics predicts the width of the $W$ boson far
more accurately than the experimentally measured width indicated in Table 1.1. The
predicted width, with uncertainties from input parameters, is
$\Gamma_W = 2.091 \pm 0.002$ GeV/$c^2$.
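The role of $\Gamma$ as a width in Eq. (1.1) can be checked numerically. The sketch below assumes, as the text permits near the peak, that $f(M)$ and $\Gamma(M)$ are constants (with $f = 1$ as an arbitrary normalization), and uses the $Z$ mass and width from Table 1.1. With those assumptions the lineshape falls to half its peak value at $M^2 = m^2 \pm m\Gamma$, so the full width at half maximum in $M$ is close to $\Gamma$ when $\Gamma \ll m$. The function name is hypothetical.

```python
import math


def breit_wigner(M: float, m: float, gamma: float, f: float = 1.0) -> float:
    """Relativistic Breit-Wigner lineshape of Eq. (1.1), with f and Gamma held constant."""
    return f / ((M**2 - m**2) ** 2 + m**2 * gamma**2)


# Z boson mass and width (GeV), central values from Table 1.1.
m_Z, gamma_Z = 91.1876, 2.4952

peak = breit_wigner(m_Z, m_Z, gamma_Z)

# Half-maximum points: (M^2 - m^2)^2 = m^2 Gamma^2, i.e. M^2 = m^2 +/- m Gamma.
M_lo = math.sqrt(m_Z**2 - m_Z * gamma_Z)
M_hi = math.sqrt(m_Z**2 + m_Z * gamma_Z)
fwhm = M_hi - M_lo

print(f"P(M_lo)/P(m) = {breit_wigner(M_lo, m_Z, gamma_Z) / peak:.6f}")
print(f"P(M_hi)/P(m) = {breit_wigner(M_hi, m_Z, gamma_Z) / peak:.6f}")
print(f"FWHM = {fwhm:.4f} GeV versus Gamma = {gamma_Z} GeV")
```

The constant-$f$, constant-$\Gamma$ choice is only a convenience for the neighborhood of the peak; far from $M \approx m$ the energy dependence of both functions matters.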

1.3 Leptons and Quarks

The remaining known indivisible constituents of matter are spin-1/2 fermions,
known as leptons (those without strong nuclear interactions) and quarks (those with
strong nuclear interactions). All experimental tests are consistent with the proposition
that these particles have no substructure. The leptons are listed in Table 1.2.
They consist of negatively charged electrons, muons, and taus, and weakly
interacting neutrinos. There is now good evidence (from experiments that measure
oscillations of neutrinos produced by the sun and in cosmic rays) that the neutrinos have
non-zero masses, but their absolute values are not known except for upper bounds
as shown. The quarks come in 6 types, known as “flavors”, listed in Table 1.3.

Table 1.2 The leptons of the Standard Model

Lepton             Charge   Mass (GeV/$c^2$)                   Mean lifetime (s)
Electron $e^-$     $-1$     $5.109989461(31) \times 10^{-4}$   $\infty$
$\nu_e$            0        $< 2 \times 10^{-9}$
Muon $\mu^-$       $-1$     $0.1056583745(24)$                 $2.1969811(22) \times 10^{-6}$
$\nu_\mu$          0        $< 1.9 \times 10^{-4}$
Tau $\tau^-$       $-1$     $1.77686(12)$                      $2.903(5) \times 10^{-13}$
$\nu_\tau$         0        $< 0.018$

Table 1.3 The quarks of the Standard Model

Quark         Charge    Mass (GeV/$c^2$)
Down $d$      $-1/3$    $4.4 \times 10^{-3}$ to $5.2 \times 10^{-3}$
Up $u$        $2/3$     $1.8 \times 10^{-3}$ to $2.7 \times 10^{-3}$
Strange $s$   $-1/3$    0.092 to 0.104
Charm $c$     $2/3$     $1.275 \pm 0.025$
Bottom $b$    $-1/3$    $4.18 \pm 0.03$
Top $t$       $2/3$     $173.1 \pm 0.9$

The fermions listed in Tables 1.2 and 1.3 are often considered as divided into
families, or generations. The first family is $e^-, \nu_e, d, u$, the second is
$\mu^-, \nu_\mu, s, c$, and the third is $\tau^-, \nu_\tau, b, t$. The masses of the fermions
of a given charge increase with the family. The weak interactions mediated by $W^\pm$ bosons
can change quarks of one family into those of another, but it is an experimental fact that
these family-changing reactions are highly suppressed. All of the fermions listed above also
have corresponding antiparticles, with the opposite charge and color, but the same mass and
spin. The antileptons are positively charged $e^+, \mu^+, \tau^+$ and antineutrinos
$\bar{\nu}_e, \bar{\nu}_\mu, \bar{\nu}_\tau$.
For each quark, there is an antiquark ($\bar{d}, \bar{u}, \bar{s}, \bar{c}, \bar{b}, \bar{t}$)
with the same mass but the opposite charge. Antiquarks carry anticolor (anti-red, anti-blue,
or anti-green).
The masses of the five lightest quarks ($d, u, s, c, b$) are somewhat uncertain,
and even the definition of the mass of a quark is subject to technical difficulties and
ambiguities. This is related to the fact that quarks exist only in colorless bound states,
called hadrons, due to the confining nature of the strong force. A colorless bound state
can be formed either from three quarks (a baryon), or from three antiquarks (an
anti-baryon), or from a quark with a given color and an antiquark with the corresponding
anti-color (a meson). All baryons are fermions with half-integer spin, and all mesons
are bosons with integer spin. The quark mass values shown in Table 1.3 correspond to
particular technical definitions of quark mass used by the RPP,² but other definitions
give quite different values. The lifetimes of the $d, u, s, c, b$ quarks are also fuzzy,
and are best described in terms of the hadrons in which they live. In contrast, the
top quark mass is relatively well-known, with an uncertainty under a percent. This is
because the top-quark mean lifetime (about $4.6 \times 10^{-25}$ s) is so short that it decays
before it can form hadronic bound states (which take roughly $3 \times 10^{-24}$ s to form).
Therefore it behaves like a free particle during its short life, and so its mass and width
can be defined in a way that is not subject to large ambiguities. Each of these quarks
has an exact 3-fold degeneracy, associated with the color that is the source charge
for the strong force. The colors are often represented by the labels red, green, and
blue, but these are just arbitrary labels; there is no experiment that could tell a red
quark from a green quark, even in principle.
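The comparison of time scales for the top quark can be made explicit with a short worked estimate. It uses only the lifetime and hadronization time quoted above together with the conversion constant of Eq. (1.2); the symbols $\Gamma_t$, $\tau_t$, and $\tau_{\text{had}}$ are labels introduced here for illustration, and the results are rough because the inputs are themselves approximate.

```latex
\Gamma_t \;=\; \frac{6.58212 \times 10^{-25}\ \text{GeV s}}{\tau_t}
        \;\approx\; \frac{6.58212 \times 10^{-25}}{4.6 \times 10^{-25}}\ \text{GeV}
        \;\approx\; 1.4\ \text{GeV},
\qquad
\frac{\tau_{\text{had}}}{\tau_t}
        \;\approx\; \frac{3 \times 10^{-24}\ \text{s}}{4.6 \times 10^{-25}\ \text{s}}
        \;\approx\; 6.5 .
```

On these numbers the top quark typically decays several lifetimes before a hadronic bound state would have had time to form.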
There is also a Higgs boson, with spin 0 and charge 0. It was discovered in 2012,
and its mass has been measured to be 125.25 ± 0.17 GeV. Some extensions of the
Standard Model predict that this Higgs boson is not fundamental and is a composite
state of other particles. However, the data collected to date are consistent with the
Higgs boson being another elementary particle.

1.4 Hadrons

As remarked above, quarks and antiquarks are always found as part of colorless
bound states. The most common are the nucleons (the proton and the neutron), the
baryons that make up most of the directly visible mass in the universe. They and other
similar baryons with total angular momentum (including both constituent spins and
orbital angular momentum) $J = 1/2$ are listed in Table 1.4.

² Here, we have quoted “$\overline{\text{MS}}$ masses” for $u, d, s, c, b$, and the “pole mass” for $t$.

The quarks listed in parentheses are the valence quarks of the bound state, but there
are also virtual (or “sea”) quark-antiquark pairs and virtual gluons in each of these
and other hadrons. The proton may be absolutely stable; experiments to try to observe
its decays have not found any, resulting in only a very high lower bound on the mean
lifetime. The neutron lifetime is also relatively long, but it decays into a proton,
electron, and antineutrino ($n \to p e^- \bar{\nu}_e$). The other $J = 1/2$ baryons decay in times
of order $10^{-10}$ s by weak interactions, except for the $\Sigma^0$ baryon, which decays
extremely quickly by an electromagnetic interaction into the $\Lambda$, which has the same
valence quark content: $\Sigma^0 \to \Lambda \gamma$. In that sense, one can think of the
$\Sigma^0$ as being an excited state of the $\Lambda$. There are other excited states of these
baryons, not listed here. The mass of the baryons in Table 1.4 increases with the number of
valence strange quarks contained.
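The charges in Table 1.4 follow directly from the valence quark content and the quark charges of Table 1.3. The sketch below is a small consistency check, assuming only those two tables; the dictionary layout, the ASCII particle labels, and the function name are illustrative choices. Exact fractions are used so that the thirds add up without rounding.

```python
from fractions import Fraction

# Quark electric charges from Table 1.3, in units of the proton charge.
QUARK_CHARGE = {
    "d": Fraction(-1, 3), "u": Fraction(2, 3),
    "s": Fraction(-1, 3), "c": Fraction(2, 3),
    "b": Fraction(-1, 3), "t": Fraction(2, 3),
}


def baryon_charge(valence: str) -> Fraction:
    """Electric charge of a baryon from its three valence quarks, e.g. 'uud'."""
    return sum((QUARK_CHARGE[q] for q in valence), Fraction(0))


# (valence quarks, charge) for the J = 1/2 baryons, as listed in Table 1.4.
TABLE_1_4 = {
    "p": ("uud", 1), "n": ("udd", 0), "Lambda": ("uds", 0),
    "Sigma+": ("uus", 1), "Sigma0": ("uds", 0), "Sigma-": ("dds", -1),
    "Xi0": ("uss", 0), "Xi-": ("dss", -1),
}

for name, (valence, listed_charge) in TABLE_1_4.items():
    computed = baryon_charge(valence)
    n_strange = valence.count("s")
    print(f"{name:7s} ({valence}): charge {computed}, strange quarks {n_strange}")
    assert computed == listed_charge
```

The same function applies to antibaryons if every quark charge is negated, since each antiquark carries the opposite charge.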
Note that the masses of the proton and the neutron (and all other hadrons) are
much larger than the sums of the masses of the valence quarks that make them
up. These nucleon masses come about from the strong interactions by a mechanism
known as chiral symmetry breaking. Nucleons dominate the visible mass of particles
in the universe. Therefore, it is only partially correct to say that the Higgs boson is
needed to understand the “origin of mass”. Most of the masses of the $W^\pm$ and $Z$
bosons and the top, bottom, charm, and strange quarks and the leptons are indeed
believed to come from the Higgs mechanism, to be discussed below. However, the
Higgs mechanism is by no means necessary to understand the origin of all mass, and
in particular it is definitely not the explanation for most of the mass that is directly
observed in the universe.
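To put a rough number on this statement, the sketch below sums the valence quark mass ranges of Table 1.3 and compares them with the nucleon masses of Table 1.4. This is only an order-of-magnitude comparison: as noted earlier, the definition of a light quark mass is ambiguous, and the variable and function names here are illustrative.

```python
# Light quark mass ranges from Table 1.3, in GeV/c^2 (low, high).
QUARK_MASS_RANGE = {"u": (1.8e-3, 2.7e-3), "d": (4.4e-3, 5.2e-3)}

# Nucleon valence content and masses from Table 1.4, in GeV/c^2.
NUCLEONS = {"p": ("uud", 0.938272), "n": ("udd", 0.939565)}


def valence_mass_range(valence: str) -> tuple[float, float]:
    """Lower and upper sums of the valence quark masses, in GeV/c^2."""
    low = sum(QUARK_MASS_RANGE[q][0] for q in valence)
    high = sum(QUARK_MASS_RANGE[q][1] for q in valence)
    return low, high


for name, (valence, mass) in NUCLEONS.items():
    low, high = valence_mass_range(valence)
    print(
        f"{name} ({valence}): valence quark masses sum to "
        f"{1e3 * low:.1f}-{1e3 * high:.1f} MeV/c^2, "
        f"i.e. {100 * low / mass:.1f}%-{100 * high / mass:.1f}% of the nucleon mass"
    )
```

With these inputs the valence quark masses account for only about one percent of the proton or neutron mass, which is the sense in which the nucleon masses arise mostly from the strong interactions.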
There are also J = 3/2 baryons, with some of the more common ones listed in
Table 1.5. The RPP uses a slightly different notation for the $\Sigma^*$ and $\Xi^*$ J = 3/2
baryons. Instead of the $*$ notation to differentiate these states from the corresponding
J = 1/2 baryons with the same quantum numbers, the RPP chooses to denote them
by their approximate mass in MeV (as determined by older experiments, so a little off
from the present best values) in parentheses, so $\Sigma(1385)$ and $\Xi(1530)$. Very narrow

Table 1.4 Baryons with J = 1/2 made from light (u, d, s) quarks

J = 1/2 baryon      Charge   Mass (GeV/$c^2$)   Lifetime (s)
$p$ (uud)           +1       0.938272           $> 6.6 \times 10^{36}$
$n$ (udd)           0        0.939565           880.3
$\Lambda$ (uds)     0        1.11568            $2.63 \times 10^{-10}$
$\Sigma^+$ (uus)    +1       1.18937            $8.02 \times 10^{-11}$
$\Sigma^0$ (uds)    0        1.19264            $7.4 \times 10^{-20}$
$\Sigma^-$ (dds)    −1       1.19745            $1.48 \times 10^{-10}$
$\Xi^0$ (uss)       0        1.31486            $2.9 \times 10^{-10}$
$\Xi^-$ (dss)       −1       1.32171            $1.64 \times 10^{-10}$

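Table 1.5 J = 3/2 baryons

J = 3/2 baryon         Charge   Mass (GeV/$c^2$)   Width $\Gamma$ (GeV/$c^2$)   Lifetime (s)
$\Delta^{++}$ (uuu)    +2       1.232              0.117                        $5.6 \times 10^{-24}$
$\Delta^{+}$ (uud)     +1       ""                 ""                           ""
$\Delta^{0}$ (udd)     0        ""                 ""                           ""
$\Delta^{-}$ (ddd)     −1       ""                 ""                           ""
$\Sigma^{*+}$ (suu)    +1       1.383              0.036                        $1.8 \times 10^{-23}$
$\Sigma^{*0}$ (sud)    0        1.384              0.036                        $1.8 \times 10^{-23}$
$\Sigma^{*-}$ (sdd)    −1       1.387              0.039                        $1.7 \times 10^{-23}$
$\Xi^{*0}$ (ssu)       0        1.532              0.0091                       $7.2 \times 10^{-23}$
$\Xi^{*-}$ (ssd)       −1       1.535              0.0099                       $6.6 \times 10^{-23}$
$\Omega^{-}$ (sss)     −1       1.672              $8.0 \times 10^{-15}$        $8.21 \times 10^{-11}$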

resonances correspond to very long-lived states; the $\Omega^-$ is by far the narrowest and
most stable of the ten J = 3/2 baryon ground states listed.
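The width and lifetime columns of Table 1.5 are tied together by the standard relation $\tau = \hbar/\Gamma$. As a rough cross-check, the following Python sketch recomputes a few lifetimes from the tabulated widths. The numerical value of $\hbar$ in GeV·s and the helper name `lifetime_from_width` are not taken from the text; they are introduced here only for illustration.

```python
# Reduced Planck constant in GeV*s (rounded; an input assumed for this sketch).
HBAR_GEV_S = 6.582e-25


def lifetime_from_width(width_gev):
    """Mean lifetime in seconds for a total decay width given in GeV."""
    return HBAR_GEV_S / width_gev


# Widths in GeV, copied from Table 1.5.
widths = {
    "Delta": 0.117,
    "Sigma*+": 0.036,
    "Xi*0": 0.0091,
    "Omega-": 8.0e-15,
}

for name, width in widths.items():
    tau = lifetime_from_width(width)
    print(f"{name:8s} width = {width:.3g} GeV  ->  lifetime = {tau:.3g} s")
```

Because the tabulated widths are rounded, the recomputed lifetimes should agree with the table only to about the quoted number of significant figures.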
There are also baryons containing heavy (c or b) quarks. The only ones that have
been definitively observed so far have J = 1/2 and contain exactly one heavy quark.
The lowest lying states with a charm quark include the $\Lambda_c^+$, $\Sigma_c^{++}$, $\Sigma_c^+$, $\Sigma_c^0$, $\Xi_c^+$, $\Xi_c^0$,
and $\Omega_c^0$ resonances with masses ranging from 2.29 GeV/$c^2$ to 2.7 GeV/$c^2$, and those
with a bottom quark include the $\Lambda_b^0$, $\Xi_b^0$, $\Xi_b^-$, $\Omega_b^-$, $\Sigma_b^+$, and $\Sigma_b^-$, with masses ranging
from 5.62 GeV/$c^2$ to 5.82 GeV/$c^2$. More information about them can be found in
the RPP. Again, there are other baryons, generally with heavier masses, that can be
thought of as excited states of the more common ones listed above.
Bound states of a valence quark and antiquark are called mesons. They always
carry integer total angular momentum J . The most common J = 0 mesons are listed
in Table 1.6.
Here the bar over a quark name denotes the corresponding antiquark. The charged
pions $\pi^\pm$ are antiparticles of each other, as are the charged kaons $K^\pm$, so they are
exactly degenerate mass pairs with the same lifetime. However, the $K^0$ and $\bar{K}^0$
mesons are mixed and not quite exactly degenerate in mass, as will be discussed in
greater detail in Chap. 11. One of the interaction eigenstates ($K_L^0$) is actually much

Table 1.6 J = 0 mesons containing light (u, d, s) quarks and antiquarks

J = 0 meson                          Charge   Mass (GeV/$c^2$)
$\pi^0$ (uū, dd̄)                     0        0.134977
$\pi^\pm$ (ud̄); (dū)                 ±1       0.139570
$K^\pm$ (us̄); (sū)                   ±1       0.493677
$K^0$, $\bar{K}^0$ (ds̄); (sd̄)        0        0.497614
$\eta$ (uū, dd̄, ss̄)                  0        0.54786
$\eta'$ (uū, dd̄, ss̄)                 0        0.95778

longer-lived than the other ($K_S^0$); the mean lifetimes are respectively $5.12 \times 10^{-8}$
and $8.95 \times 10^{-11}$ s. The lifetimes (and the widths) of the other J = 0 mesons are
not listed here; you can find them yourself in the RPP.
Besides the J = 0 mesons listed above, there are counterparts containing a single
heavy (charm or bottom) quark or antiquark, with the other antiquark or quark light
(up, down or strange). The most common ones are listed in Table 1.7.
There are also J = 0 mesons containing only charm and bottom quarks and
antiquarks. The ones with the lowest masses are listed in Table 1.8.
Vector (J = 1) mesons are also very important. Table 1.9 lists the most common
ones that contain only light (u, d, s) valence quarks and antiquarks.
The most common J = 1 mesons containing one heavy (c or b) quark or antiquark
are likewise shown in Table 1.10. Note that these have the same charges and slightly
larger masses than the corresponding J = 0 mesons in Table 1.7. Mesons with J = 1
and with both quark and antiquark heavy are shown in Table 1.11.
In principle, there should also be J = 1 $B_c^{*\pm}$ mesons, but (unlike their J = 0
counterparts in Table 1.8) their existence has not been established experimentally.

Table 1.7 J = 0 mesons containing one heavy and one light quark and antiquark

J = 0 meson                            Charge   Mass (GeV/$c^2$)
$D^0$, $\bar{D}^0$ (cū); (uc̄)          0        1.8648
$D^\pm$ (cd̄); (dc̄)                     ±1       1.8696
$D_s^\pm$ (cs̄); (sc̄)                   ±1       1.9683
$B^\pm$ (ub̄); (bū)                     ±1       5.279
$B^0$, $\bar{B}^0$ (db̄); (bd̄)          0        5.280
$B_s^0$, $\bar{B}_s^0$ (sb̄); (bs̄)      0        5.367

Table 1.8 J = 0 mesons containing a heavy quark and a heavy antiquark

J = 0 meson                            Charge   Mass (GeV/$c^2$)
$\eta_c$ “charmonium” (cc̄)             0        2.984
$B_c^\pm$ (cb̄); (bc̄)                   ±1       6.275
$\eta_b$ “bottomonium” (bb̄)            0        9.398

Table 1.9 J = 1 mesons containing light quarks and antiquarks

J = 1 meson                            Charge   Mass (GeV/$c^2$)
$\rho^\pm$ (ud̄); (dū)                  ±1       0.7752
$\rho^0$ (uū, dd̄)                      0        ""
$\omega^0$ (uū, dd̄)                    0        0.7827
$K^{*\pm}$ (us̄); (sū)                  ±1       0.8917
$K^{*0}$, $\bar{K}^{*0}$ (ds̄); (sd̄)    0        0.8958
$\phi$ (uū, dd̄, ss̄)                    0        1.01946

Table 1.10 J = 1 mesons containing one heavy and one light quark and antiquark

J = 1 meson                                  Charge   Mass (GeV/$c^2$)
$D^{*0}$, $\bar{D}^{*0}$ (cū); (uc̄)          0        2.007
$D^{*\pm}$ (cd̄); (dc̄)                        ±1       2.010
$D_s^{*\pm}$ (cs̄); (sc̄)                      ±1       2.112
$B^{*0}$, $\bar{B}^{*0}$ (db̄); (bd̄)          0        5.325
$B^{*\pm}$ (ub̄); (bū)                        ±1       5.325
$B_s^{*0}$, $\bar{B}_s^{*0}$ (sb̄); (bs̄)      0        5.415

Table 1.11 J = 1 mesons containing a heavy quark and a heavy antiquark

J = 1 meson                                  Charge   Mass (GeV/$c^2$)
$J/\psi$ “charmonium” (cc̄)                   0        3.096916
$\Upsilon$ “bottomonium” (bb̄)                0        9.4603

The heavy quarkonium ($c\bar{c}$ and $b\bar{b}$) systems have other states besides the $\eta_c$, $J/\psi$
and $\eta_b$, $\Upsilon$ from Tables 1.8 and 1.11. For the $c\bar{c}$ system, there are mesons
$\chi_{c0}$, $\chi_{c1}$, $\chi_{c2}$ with J = 0, 1, 2 that have the quark and antiquark in P-wave orbital
angular momentum states. There are also states $\eta_c(2S)$, $\psi(2S)$, $\psi(3770)$, $\psi(3872)$ that are similar to
the $\eta_c$ and $J/\psi$, but with excited radial bound-state wavefunctions. Similarly, in the
$b\bar{b}$ system, there are excited bottomonium states $\Upsilon(2S)$, $\Upsilon(3S)$, $\Upsilon(4S)$, $\Upsilon(10860)$,
and $\Upsilon(11020)$ with J = 1, and P-wave orbital angular momentum states with total
J = 0, 1, 2, namely $\chi_{b0,1,2}(1P)$ and $\chi_{b0,1,2}(2P)$. The spectroscopy of these states provides a
striking confirmation of the quark model for hadrons and of the strong force.
Much more detailed information on all of these hadronic bound states (and many
others not listed above), including the decay widths and the decay products, can be
found in the RPP. In Fig. 1.1 we show a scatter plot of the mass and lifetime of many of the
elementary particles and bound-state particles that we have discussed in this chapter.
One sees that they fill out many orders of magnitude in both mass and lifetime. Some
reasons for the ordering of masses will become clear in future chapters, such as why
the pion masses are lower than the proton and neutron masses, but other orderings
of the particle masses remain a mystery, such as why the muon mass is much lower
than the top quark mass.
Theoretically, one also expects exotic mesons that are mostly “gluonium” or
glueballs, that is, bound states of gluons. However, these states are expected to mix
with excited quark-antiquark bound states, and they will be extremely difficult to
identify experimentally.
In collider experiments, hadrons are most often produced in groups called jets.
Roughly speaking, each jet can be thought of as originating, at the shortest dis-
tance scales, from individual gluons and quarks (partons) which then hadronize by
complicated processes into collections of final state particles that share the energy
and momentum of the original parton. The hadrons in a given jet have momenta in
approximately the same direction as their parent parton.

Fig. 1.1 Mass versus lifetime for many elementary particles and composite hadrons discussed in
this chapter

1.5 Decays and Branching Ratios

In some cases, hadrons can decay through the strong interactions, with widths of
order tens or hundreds of MeV. Some examples include:

$$ \Delta^{++} \to p\,\pi^+ \tag{1.3} $$
$$ \rho^- \to \pi^0 \pi^- \tag{1.4} $$
$$ \omega \to \pi^+ \pi^- \pi^0 \tag{1.5} $$
$$ \phi \to K^+ K^- . \tag{1.6} $$

There are also decays that are mediated by electromagnetic interactions, for example:

$$ \pi^0 \to \gamma\gamma \tag{1.7} $$
$$ \Sigma^+ \to p\,\gamma \tag{1.8} $$
$$ \Sigma^0 \to \Lambda\gamma \tag{1.9} $$
$$ \rho^0 \to \pi^+ \pi^- \gamma . \tag{1.10} $$

The smallest decay widths for hadrons are those mediated by the weak interactions,
for example:
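$$ n \to p\, e^- \bar{\nu}_e \tag{1.11} $$
$$ \pi^- \to \mu^- \bar{\nu}_\mu \tag{1.12} $$
$$ K^+ \to \pi^+ \pi^0 \tag{1.13} $$
$$ B^+ \to \bar{D}^0 \mu^+ \nu_\mu \tag{1.14} $$
$$ \Omega^- \to \Lambda K^- . \tag{1.15} $$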
The weak interactions are also entirely responsible for the decays of the charged
leptons:
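$$ \mu^- \to \nu_\mu\, e^- \bar{\nu}_e \tag{1.16} $$
$$ \tau^- \to \nu_\tau\, e^- \bar{\nu}_e \tag{1.17} $$
$$ \tau^- \to \nu_\tau\, \mu^- \bar{\nu}_\mu \tag{1.18} $$
$$ \tau^- \to \nu_\tau + \text{hadrons}. \tag{1.19} $$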
Experimentally, the hadronic τ decays are classified by the number of charged
hadrons present in the final state, as either “1-prong” (if exactly one charged hadron),
“3-prong” (if exactly three charged hadrons), etc.
In most cases, a variety of different decay modes contribute to each total decay
width. The fraction that each final state contributes to the total decay width is known
as the branching ratio (or branching fraction), usually abbreviated as BR or B. As a
randomly chosen example, in the case of the ω meson the strong interaction accounts
for most, but not all, of the decays:

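$$ \mathrm{BR}(\omega \to \pi^+ \pi^- \pi^0) = (89.2 \pm 0.7)\% \quad \text{(strong)} \tag{1.20} $$
$$ \mathrm{BR}(\omega \to \pi^0 \gamma) = (8.3 \pm 0.3)\% \quad \text{(EM)} \tag{1.21} $$
$$ \mathrm{BR}(\omega \to \pi^+ \pi^-) = (1.53 \pm 0.13)\% \quad \text{(EM)} \tag{1.22} $$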
with other final states totaling less than 1%.
It is also common to present this information in terms of the partial widths into
various final states. If the total decay width for a parent particle X is $\Gamma(X)$, then the
partial decay width of X into a particular final state Y is

$$ \Gamma(X \to Y) = \mathrm{BR}(X \to Y)\,\Gamma(X). \tag{1.23} $$

The sum of all of the branching ratios is equal to 1, and the sum of the partial widths
is equal to the total decay width.
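To make the bookkeeping concrete, here is a minimal Python sketch that applies (1.23) to the ω branching ratios quoted in (1.20)-(1.22). The total width `TOTAL_WIDTH_MEV` is an illustrative input that is not quoted in the text (the measured value should be taken from the RPP), and the dictionary keys are just labels chosen for this example.

```python
# Central values of the omega branching ratios from Eqs. (1.20)-(1.22).
branching_ratios = {
    "pi+ pi- pi0": 0.892,
    "pi0 gamma": 0.083,
    "pi+ pi-": 0.0153,
}

# ASSUMPTION: illustrative total width in MeV, not taken from the text.
TOTAL_WIDTH_MEV = 8.5


def partial_width(branching_ratio, total_width):
    """Partial width Gamma(X -> Y) = BR(X -> Y) * Gamma(X), Eq. (1.23)."""
    return branching_ratio * total_width


partial_widths = {
    mode: partial_width(br, TOTAL_WIDTH_MEV)
    for mode, br in branching_ratios.items()
}

# Whatever is not listed must add up to the remaining fraction of decays.
listed_fraction = sum(branching_ratios.values())
other_fraction = 1.0 - listed_fraction

for mode, width in partial_widths.items():
    print(f"Gamma(omega -> {mode}) = {width:.3f} MeV")
print(f"unlisted final states: {100 * other_fraction:.2f}% of decays")
```

The leftover fraction comes out below 1%, consistent with the statement that the other final states total less than 1%.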
There are two roads to enlightenment regarding the Standard Model and its future
replacement. The experimental road, which is highly successful as indicated by the
impressive volume and detail in the RPP, finds the answers to masses, decay rates,
branching ratios, production rates, and even more detailed information like kinematic
and angular distributions directly from data in high-energy collisions. The theoretical
road aims to match these results onto predictions of quantum field theories specified
in terms of a small number of parameters. In the case of electromagnetic interactions,
quantum field theory is extremely successful, providing amazingly accurate predictions
for observable quantities such as magnetic moments and interaction rates. In
other applications, quantum field theory is only partly successful. For some calcu-
lations, perturbation theory and other known methods are too difficult to carry out,
or do not converge even in principle. In some cases, lattice gauge theory provides
useful information; this approach is based on a discretized approximation to quantum
field theory and stochastic methods. In other cases, only rough or even qualitative
results are possible. However, quantum field theory is systematic and elegant, and
provides understanding that is often elusive in the raw data. In the following, we will
try to understand some of the basic calculation methods of quantum field theory as
a general framework, and eventually the description of the Standard Model in terms
of it.
2 Special Relativity and Lorentz Transformations

2.1 Lorentz Transformations

A successful description of elementary particles must be consistent with the two

pillars of modern physics: special relativity and quantum mechanics. Let us begin
by reviewing some important features of special relativity.
Spacetime has four dimensions. For any given event (for example, a firecracker
explodes, or a particle decays to two other particles) one can assign a four-vector
position:

$$ (ct, x, y, z) = (x^0, x^1, x^2, x^3) = x^\mu \tag{2.1} $$

The Greek indices μ, ν, ρ, . . . run over the values 0, 1, 2, 3, and c is the speed of
light in vacuum. As a matter of terminology, $x^\mu$ is an example of a contravariant
four-vector.
The laws of physics should not depend on what coordinate system we use, as long
as it is an “inertial reference frame”, which means that the coordinates describing the
position of a free classical particle do not accelerate. This invariance of the laws of
physics is a guiding principle in making a sensible theory. It is often useful to change
our coordinate system from one inertial reference frame to another, according to

$$ x^\mu \to x'^\mu = L^\mu{}_\nu\, x^\nu . \tag{2.2} $$

Here $L^\mu{}_\nu$ is a constant 4 × 4 real matrix that parameterizes the Lorentz transformation.
It is not arbitrary, however, as we will soon see. Such a change of coordinates
is called a Lorentz transformation.
As a simple example of a Lorentz transformation, suppose we rotate our coordinate
system about the z-axis by an angle α. Then in the new coordinate system:

$$ x'^\mu = (ct', x', y', z') \tag{2.3} $$

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 13

S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_2

where

$$
\begin{aligned}
ct' &= ct \\
x' &= x \cos\alpha + y \sin\alpha \\
y' &= -x \sin\alpha + y \cos\alpha \\
z' &= z.
\end{aligned}
\tag{2.4}
$$

Alternatively, we could go to a frame moving with respect to the original frame with
velocity v along the z direction, with the origins of the two frames coinciding at time
$t = t' = 0$. Then:

$$
\begin{aligned}
ct' &= \gamma (ct - \beta z) \\
x' &= x \\
y' &= y \\
z' &= \gamma (z - \beta ct),
\end{aligned}
\tag{2.5}
$$

where

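$$ \beta = v/c; \qquad \gamma = 1/\sqrt{1 - \beta^2}. \tag{2.6} $$

Another way of rewriting this is to define the rapidity ρ by $\beta = \tanh\rho$, so that
$\gamma = \cosh\rho$ and $\beta\gamma = \sinh\rho$. Then we can rewrite (2.5) as

$$
\begin{aligned}
x'^0 &= x^0 \cosh\rho - x^3 \sinh\rho \\
x'^1 &= x^1 \\
x'^2 &= x^2 \\
x'^3 &= -x^0 \sinh\rho + x^3 \cosh\rho.
\end{aligned}
\tag{2.7}
$$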

This change of coordinates is called a boost (with rapidity ρ and in the ẑ direction).
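One reason the rapidity is convenient is that it is additive for successive boosts along the same axis, whereas velocities are not. The short Python sketch below illustrates this numerically with the matrix form of (2.7). The helper names (`boost_z`, `matmul`, `rapidity_from_beta`) and the sample velocities are choices made for this example, not notation from the text.

```python
import math


def boost_z(rho):
    """4x4 matrix for a boost along z with rapidity rho, read off from Eq. (2.7)."""
    ch, sh = math.cosh(rho), math.sinh(rho)
    return [
        [ch, 0.0, 0.0, -sh],
        [0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0],
        [-sh, 0.0, 0.0, ch],
    ]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def rapidity_from_beta(beta):
    """Invert beta = tanh(rho)."""
    return math.atanh(beta)


# Two successive boosts along z (illustrative velocities in units of c).
rho1 = rapidity_from_beta(0.6)
rho2 = rapidity_from_beta(0.8)

combined = matmul(boost_z(rho2), boost_z(rho1))  # boost by rho1, then by rho2
single = boost_z(rho1 + rho2)                    # one boost with the summed rapidity

# The velocity of the combined boost follows from tanh of the summed rapidity.
beta_total = math.tanh(rho1 + rho2)
print("combined beta:", beta_total)
```

The combined velocity stays below 1, and it agrees with the familiar velocity-addition formula $(\beta_1 + \beta_2)/(1 + \beta_1 \beta_2)$.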
Another example of a contravariant four-vector is given by the 4-momentum
formed from the energy E and spatial momentum p of a particle:

$$ p^\mu = (E/c, \vec{p}\,). \tag{2.8} $$

In the rest frame of a particle of mass m, its 4-momentum is given by $p^\mu = (mc, 0, 0, 0)$.
All contravariant four-vectors transform the same way under a Lorentz
transformation:

$$ a'^\mu = L^\mu{}_\nu\, a^\nu . \tag{2.9} $$

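The 4-momentum of a particle is related to its mass by the Lorentz transformation
that relates the frame of reference in which it is measured and the rest frame. In the
example of (2.5), one has:

$$
L^\mu{}_\nu =
\begin{pmatrix}
\gamma & 0 & 0 & -\beta\gamma \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
-\beta\gamma & 0 & 0 & \gamma
\end{pmatrix},
\tag{2.10}
$$

and the inverse Lorentz transformation is

$$
a^\mu = (L^{-1})^\mu{}_\nu\, a'^\nu , \qquad
(L^{-1})^\mu{}_\nu =
\begin{pmatrix}
\gamma & 0 & 0 & \beta\gamma \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
\beta\gamma & 0 & 0 & \gamma
\end{pmatrix}.
\tag{2.11}
$$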

A key property of special relativity is that for any two events one can define a
proper interval, which is independent of the Lorentz frame, and which tells us how
far apart the two events are in a coordinate-independent sense. So, consider two
events occurring at $x^\mu$ and $x^\mu + d^\mu$, where $d^\mu$ is some four-vector displacement.
The proper interval between the events is

$$ (\Delta\tau)^2 = (d^0)^2 - (d^1)^2 - (d^2)^2 - (d^3)^2 = g_{\mu\nu}\, d^\mu d^\nu \tag{2.12} $$

where

$$
g_{\mu\nu} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}
\tag{2.13}
$$

is known as the metric tensor. Here, and from now on, we adopt the Einstein summa-
tion convention, in which repeated indices μ, ν, . . . are taken to be summed over. It
is an assumption of special relativity that gμν is the same in every inertial reference
frame.
The existence of the metric tensor allows us to define covariant four-vectors by
lowering an index:

$$ x_\mu = g_{\mu\nu}\, x^\nu = (ct, -x, -y, -z), \tag{2.14} $$
$$ p_\mu = g_{\mu\nu}\, p^\nu = (E/c, -p_x, -p_y, -p_z). \tag{2.15} $$

Furthermore, one can define an inverse metric $g^{\mu\nu}$ so that

$$ g^{\mu\nu} g_{\nu\rho} = \delta^\mu_\rho , \tag{2.16} $$

where $\delta^\mu_\nu = 1$ if μ = ν, and otherwise = 0. It follows that

$$
g^{\mu\nu} =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix}.
\tag{2.17}
$$

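Then one has, for any vector $a^\mu$,

$$ a_\mu = g_{\mu\nu}\, a^\nu ; \qquad a^\mu = g^{\mu\nu} a_\nu . \tag{2.18} $$

It follows that covariant four-vectors transform as

$$ a'_\mu = L_\mu{}^\nu\, a_\nu \tag{2.19} $$

where (note the positions of the indices!)

$$ L_\mu{}^\nu = g_{\mu\rho}\, g^{\nu\sigma} L^\rho{}_\sigma . \tag{2.20} $$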

Because one can always use the metric to go between contravariant and covariant
four-vectors, people often use a harmlessly sloppy terminology and neglect the dis-
tinction, simply referring to them as four-vectors.
If $a^\mu$ and $b^\mu$ are any four-vectors, then

$$ a^\mu b^\nu g_{\mu\nu} = a_\mu b_\nu g^{\mu\nu} = a_\mu b^\mu = a^\mu b_\mu \equiv a \cdot b \tag{2.21} $$

is a scalar quantity. For example, if $p^\mu$ and $q^\mu$ are the four-momenta of any two
particles, then $p \cdot q$ is a Lorentz-invariant; it does not depend on which inertial
reference frame it is measured in. In particular, a particle with mass m satisfies the
on-shell condition

$$ p^2 = p^\mu p_\mu = E^2/c^2 - \vec{p}^{\,2} = m^2 c^2 . \tag{2.22} $$

The Lorentz invariance of dot products of pairs of 4-momenta, plus the conservation
of total four-momentum, plus the on-shell condition (2.22), is enough to solve most
problems in relativistic kinematics.
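As a quick numerical illustration of these statements, the Python sketch below builds two on-shell four-momenta, boosts both along the z direction using (2.5), and checks that the dot product (2.21) and the on-shell condition (2.22) are unchanged. Units with c = 1 are assumed, and the masses, momentum components, and function names are arbitrary choices made for this example.

```python
import math

# Diagonal entries of the metric tensor g_{mu nu}, Eq. (2.13).
METRIC = (1.0, -1.0, -1.0, -1.0)


def dot(a, b):
    """Lorentz-invariant product a . b of Eq. (2.21)."""
    return sum(g * x * y for g, x, y in zip(METRIC, a, b))


def boost_z(p, beta):
    """Apply the boost of Eq. (2.5) along z to a four-vector p (units with c = 1)."""
    gamma = 1.0 / math.sqrt(1.0 - beta * beta)
    e, px, py, pz = p
    return (gamma * (e - beta * pz), px, py, gamma * (pz - beta * e))


def on_shell(mass, px, py, pz):
    """Four-momentum with the energy fixed by the on-shell condition (2.22)."""
    return (math.sqrt(mass**2 + px**2 + py**2 + pz**2), px, py, pz)


# Two illustrative particles (masses and momenta in GeV, chosen arbitrarily).
p = on_shell(0.938, 0.3, -0.2, 1.5)
q = on_shell(0.140, -0.1, 0.4, 0.7)

beta = 0.9
p_boosted = boost_z(p, beta)
q_boosted = boost_z(q, beta)

print("p.q before boost:", dot(p, q))
print("p.q after boost: ", dot(p_boosted, q_boosted))
print("p^2 after boost: ", dot(p_boosted, p_boosted))
```

The individual components of each four-vector change under the boost, but the printed invariants agree up to floating-point rounding.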

2.2 Relativistic Kinematics

Let us pause to illustrate this with an example. Consider the situation of two particles,
each of mass m, colliding. Suppose the result of the collision is two final-state
particles each of mass M. Let us find the threshold energy and momentum 4-vectors
for this process in the COM (center-of-momentum) frame and in the frame in which
one of the initial-state particles is at rest. Throughout most of the following, we will
take c = 1, by a choice of units.
Relativistic kinematics problems are often more easily analyzed in the COM
frame, so let us consider that case first. Without loss of generality, we can take the
colliding initial-state particles to be moving along the z-axis. Then their 4-momenta
are:
$$ p_1^\mu = (E, 0, 0, \sqrt{E^2 - m^2}), \tag{2.23} $$
$$ p_2^\mu = (E, 0, 0, -\sqrt{E^2 - m^2}). \tag{2.24} $$

The spatial momenta are required to be opposite by the definition of the COM
frame, which in turn requires the energies to be the same, using (2.22) and the fact
that the masses are assumed equal. The total 4-momentum of the initial state is
$p^\mu = (2E, 0, 0, 0)$, and so this must be equal to the total 4-momentum of the final
state in the COM frame as well. Furthermore,

$$ p^2 = 4E^2 \tag{2.25} $$

is a Lorentz invariant, the same in any inertial frame.

Similarly, in the COM frame, the final state 4-momenta can be written as:

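$$ k_1^\mu = (E_f, 0, \sin\theta \sqrt{E_f^2 - M^2}, \cos\theta \sqrt{E_f^2 - M^2}), \tag{2.26} $$
$$ k_2^\mu = (E_f, 0, -\sin\theta \sqrt{E_f^2 - M^2}, -\cos\theta \sqrt{E_f^2 - M^2}). \tag{2.27} $$

The angle θ parametrizes the arbitrary direction of the scattering. Without loss of
generality, we have taken the scattering to occur within the yz plane, as shown:

[Figure: scattering in the COM frame, with incoming momenta $p_1$ and $p_2$ along the z-axis]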

The fact that we are in the COM frame again requires the spatial momenta to be
opposite, and thus the energies to be equal to a common value E f because of the
assumed equal masses M. Now, requiring conservation of total 4-momentum gives
$k_1^\mu + k_2^\mu = p_1^\mu + p_2^\mu$, so $E_f = E$. In order for the spatial momentum components
to be real, we therefore find the energy threshold condition in the COM frame

$$ E > E_{\rm thresh} = M. \tag{2.28} $$

Now let us reconsider the problem in a frame where one of the initial-state particles
is at rest, corresponding to a fixed-target experiment. In the Lab frame,

$$ p_1'^\mu = (E', 0, 0, \sqrt{E'^2 - m^2}), \tag{2.29} $$
$$ p_2'^\mu = (m, 0, 0, 0) \tag{2.30} $$

are the 4-momenta of the two initial-state particles, and $E'$ is the Lab frame energy
of the moving particle. The total initial state 4-momentum is therefore
$p'^\mu = (E' + m, 0, 0, \sqrt{E'^2 - m^2})$, leading to a Lorentz invariant

$$ p'^2 = (E' + m)^2 - (E'^2 - m^2) = 2m(E' + m). \tag{2.31} $$

This must be the same as (2.25), so the Lab frame energy is related to the COM
energy of each particle by

$$ m(E' + m) = 2E^2 . \tag{2.32} $$

Because we already found E > M, the Lab frame threshold energy condition for the
scattering event to be possible is $m(E' + m) > 2M^2$, or

$$ E' > E'_{\rm thresh} = \frac{2M^2 - m^2}{m}. \tag{2.33} $$
Let us also relate the Lab frame 4-momenta to those in the COM frame. To find the
Lorentz transformation needed to go from the COM frame to the Lab frame, consider
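the 0, 3 components of the equation $p_2'^\mu = \Lambda^\mu{}_\nu\, p_2^\nu$:

$$
\begin{pmatrix} m \\ 0 \end{pmatrix}
=
\begin{pmatrix} \gamma & \beta\gamma \\ \beta\gamma & \gamma \end{pmatrix}
\begin{pmatrix} E \\ -\sqrt{E^2 - m^2} \end{pmatrix}.
\tag{2.34}
$$

It follows that

$$ \beta = \sqrt{1 - m^2/E^2} = \sqrt{\frac{E' - m}{E' + m}}, \tag{2.35} $$
$$ \gamma = \frac{1}{\sqrt{1 - \beta^2}} = E/m = \sqrt{\frac{E' + m}{2m}}, \tag{2.36} $$
$$ \beta\gamma = \sqrt{E^2/m^2 - 1} = \sqrt{\frac{E' - m}{2m}}. \tag{2.37} $$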

Now we can apply this Lorentz boost to the final-state momenta as found in the COM
frame to obtain the Lab frame momenta. For the first final-state particle:
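$$
k_1'^\mu =
\begin{pmatrix}
\gamma & 0 & 0 & \beta\gamma \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
\beta\gamma & 0 & 0 & \gamma
\end{pmatrix}
\begin{pmatrix}
E \\ 0 \\ \sin\theta \sqrt{E^2 - M^2} \\ \cos\theta \sqrt{E^2 - M^2}
\end{pmatrix}
\tag{2.38}
$$

$$
= (E^2/m)
\begin{pmatrix}
1 + \cos\theta \sqrt{1 - m^2/E^2}\, \sqrt{1 - M^2/E^2} \\
0 \\
\sin\theta\, (m/E) \sqrt{1 - M^2/E^2} \\
\sqrt{1 - m^2/E^2} + \cos\theta \sqrt{1 - M^2/E^2}
\end{pmatrix}.
\tag{2.39}
$$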

Note that for M > m, the z-component of the momentum is always positive (in the
same direction as the incoming particle in the Lab frame), regardless of the sign of
cos θ . (The other final-state momentum is obtained by just flipping the signs of cos θ
and sin θ .) The Lab-frame scattering angle with respect to the original collision axis
(the z-axis in both frames) is determined by

$$ \tan\theta' = \frac{(m/E) \sin\theta}{\sqrt{1 - m^2/E^2} \big/ \sqrt{1 - M^2/E^2} + \cos\theta}. \tag{2.40} $$

For fixed θ in the COM frame, $|\theta'|$ in the Lab frame decreases with increasing E/m,
as the produced particles go more in the forward direction.
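The boost from the COM frame to the Lab frame can be checked numerically. The Python sketch below applies the boost with γ = E/m to the COM-frame momentum $k_1$ and compares the resulting Lab-frame angle with the closed form (2.40). Units with c = 1 are assumed; the numerical inputs and the function names `lab_momentum` and `tan_lab_angle` are illustrative choices, not part of the text.

```python
import math


def lab_momentum(E, m, M, theta):
    """Boost the COM-frame k1 (with E_f = E) to the Lab frame using gamma = E/m."""
    gamma = E / m
    beta_gamma = math.sqrt(gamma**2 - 1.0)
    k = math.sqrt(E**2 - M**2)  # magnitude of the COM-frame three-momentum
    ky, kz = k * math.sin(theta), k * math.cos(theta)
    # The boost mixes only the time and z components; x and y are unchanged.
    return (gamma * E + beta_gamma * kz, 0.0, ky, beta_gamma * E + gamma * kz)


def tan_lab_angle(E, m, M, theta):
    """Closed-form expression for tan(theta') in the Lab frame, Eq. (2.40)."""
    ratio = math.sqrt(1.0 - m**2 / E**2) / math.sqrt(1.0 - M**2 / E**2)
    return (m / E) * math.sin(theta) / (ratio + math.cos(theta))


# Illustrative numbers: COM energy per particle E, masses m and M, COM angle theta.
E, m, M, theta = 5.0, 1.0, 2.0, 2.0
k_lab = lab_momentum(E, m, M, theta)

print("Lab-frame k1:", k_lab)
print("tan(theta') from components:", k_lab[2] / k_lab[3])
print("tan(theta') from Eq. (2.40):", tan_lab_angle(E, m, M, theta))
```

With M > m, as here, the z component of the boosted momentum stays positive even though the COM-frame angle points backward (cos θ < 0).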
Notice from (2.28) and (2.33) that while the production of a pair of heavy particles
of mass M requires beam energies in symmetric collisions that scale like M, in fixed-target
collisions the energy required scales like $2M^2/m \gg M$, where m is the beam
particle mass. This is why fixed-target collisions are no longer an option for frontier
physics discoveries of very heavy particles or high-energy phenomena.
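The difference in scaling is easy to see numerically. The following Python sketch evaluates the COM threshold (2.28) and the fixed-target threshold (2.33) for a beam and target of proton mass and a few hypothetical heavy-particle masses M. The function names and the chosen values of M are assumptions made for this illustration.

```python
def com_threshold(M):
    """COM-frame threshold energy per beam particle, Eq. (2.28)."""
    return M


def lab_threshold(m, M):
    """Fixed-target threshold energy of the moving particle, Eq. (2.33)."""
    return (2.0 * M**2 - m**2) / m


def lab_energy_from_com(E, m):
    """Lab-frame energy E' corresponding to COM energy E per particle, from Eq. (2.32)."""
    return 2.0 * E**2 / m - m


m = 0.938  # GeV; proton mass from Table 1.4, used for both beam and target

# Hypothetical heavy-particle masses in GeV, chosen only to show the scaling.
for M in (1.0, 10.0, 100.0):
    print(f"M = {M:6.1f} GeV: collider beam energy > {com_threshold(M):8.1f} GeV, "
          f"fixed-target beam energy > {lab_threshold(m, M):10.1f} GeV")
```

The collider requirement grows linearly with M, while the fixed-target requirement grows quadratically.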
In collider applications, it is common to see the direction of a final-state particle
with respect to the colliding beams described either by the pseudo-rapidity η or the
longitudinal rapidity y. Suppose that the two colliding beams are oriented so that
Beam 1 is going in the ẑ direction and Beam 2 is going in the −ẑ direction. A final
state particle (or group of particles) emerging at an angle θ with respect to Beam 1
in general has a four-vector momentum given by:

$$ p^\mu = (E, p_T \cos\phi, p_T \sin\phi, p_z), \tag{2.41} $$

where $p_T = |\vec{p}| \sin\theta$ is the transverse momentum, $p_z = |\vec{p}| \cos\theta$ is the longitudinal
momentum, and $E = \sqrt{|\vec{p}|^2 + m^2}$ is the energy, with m the mass and $\vec{p}$ the three-vector
momentum. (In hadron colliders, this four-vector is generally defined in the
lab frame, not in the center-of-momentum frame of the scattering event, which is
often unknown.) Then the pseudo-rapidity is defined by

$$ \eta = \frac{1}{2} \ln\left( \frac{|\vec{p}| + p_z}{|\vec{p}| - p_z} \right) = -\ln\left[ \tan(\theta/2) \right]. \tag{2.42} $$

Thus η = 0 corresponds to a particle coming out perpendicular to the beam line (θ = 90°),
while η = ±∞ correspond to the directions along the beams (θ = 0, 180°).
Particles at small |η| (less than 1 or 2 or so, depending on the situation) are said to be
central, while those at large |η| are said to be forward. Note that the pseudo-rapidity
depends only on the direction of the particle, not on its energy. The longitudinal
rapidity is defined somewhat similarly by

$$ y = \frac{1}{2} \ln\left( \frac{E + p_z}{E - p_z} \right). \tag{2.43} $$

In fact, η = y in the special case of a massless particle, and they are very nearly
equal for a particle whose energy is large compared to its mass. However, in general
y does depend on the energy. For the same particle, the ordinary rapidity is given
by:

$$ \rho = \frac{1}{2} \ln\left( \frac{E + |\vec{p}|}{E - |\vec{p}|} \right). \tag{2.44} $$

The quantity y is the rapidity of the boost needed to move to a frame where the
particle has no longitudinal momentum along the beam direction, while ρ is the
rapidity of the boost needed to move to the particle’s rest frame. Confusingly, it has
become a standard abuse of language among collider physicists to call y simply the
rapidity, and among non-collider physicists it is common to see the letter η used to
refer to the ordinary rapidity, called ρ here. Some care is needed to ensure that one
is using and interpreting these quantities consistently.
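Since the three quantities are easy to confuse, a small numerical comparison may help. The Python sketch below evaluates η, y, and ρ from (2.42), (2.43), and (2.44) for one momentum vector, first treated as massless and then with a nonzero mass. Units with c = 1 are assumed; the function names and the sample momentum and mass are illustrative choices.

```python
import math


def pseudorapidity(px, py, pz):
    """Pseudo-rapidity from the three-momentum, first form of Eq. (2.42)."""
    p = math.sqrt(px**2 + py**2 + pz**2)
    return 0.5 * math.log((p + pz) / (p - pz))


def pseudorapidity_from_angle(theta):
    """Pseudo-rapidity from the polar angle, second form of Eq. (2.42)."""
    return -math.log(math.tan(theta / 2.0))


def longitudinal_rapidity(E, pz):
    """Longitudinal rapidity y, Eq. (2.43)."""
    return 0.5 * math.log((E + pz) / (E - pz))


def ordinary_rapidity(E, px, py, pz):
    """Ordinary rapidity rho, Eq. (2.44)."""
    p = math.sqrt(px**2 + py**2 + pz**2)
    return 0.5 * math.log((E + p) / (E - p))


# Illustrative momentum components (GeV); |p| = 13 for these values.
px, py, pz = 3.0, 4.0, 12.0
p_mag = math.sqrt(px**2 + py**2 + pz**2)
theta = math.acos(pz / p_mag)

eta = pseudorapidity(px, py, pz)
eta_from_theta = pseudorapidity_from_angle(theta)

# Massless case: E = |p|, so y coincides with eta.
y_massless = longitudinal_rapidity(p_mag, pz)

# Massive case (illustrative mass of 5 GeV): y depends on the energy.
mass = 5.0
E = math.sqrt(p_mag**2 + mass**2)
y_massive = longitudinal_rapidity(E, pz)
rho = ordinary_rapidity(E, px, py, pz)

print("eta =", eta, " (from angle:", eta_from_theta, ")")
print("y (massless) =", y_massless, " y (massive) =", y_massive)
print("rho =", rho, " cosh(rho) =", math.cosh(rho), " E/m =", E / mass)
```

For the massive case, cosh ρ reproduces γ = E/m, as expected for the boost to the particle's rest frame, while y is smaller in magnitude than η.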

2.3 Tensors and Lorentz Invariant Quantities

Now let us return to the study of the properties of Lorentz transformations. The
Lorentz-invariance of equation (2.21) implies that, if $a^\mu$ and $b^\mu$ are constant four-vectors,
then

$$ g_{\mu\nu}\, a'^\mu b'^\nu = g_{\mu\nu}\, a^\mu b^\nu , \tag{2.45} $$

so that

$$ g_{\mu\nu}\, L^\mu{}_\rho L^\nu{}_\sigma\, a^\rho b^\sigma = g_{\rho\sigma}\, a^\rho b^\sigma . \tag{2.46} $$

Since $a^\mu$ and $b^\nu$ are arbitrary, it must be that:

$$ g_{\mu\nu}\, L^\mu{}_\rho L^\nu{}_\sigma = g_{\rho\sigma} . \tag{2.47} $$

This is the fundamental constraint that a Lorentz transformation matrix must satisfy.
In matrix form, it could be written as $L^T g L = g$. If we contract (2.47) with $g^{\rho\kappa}$, we
obtain

$$ L_\nu{}^\kappa L^\nu{}_\sigma = \delta^\kappa_\sigma . \tag{2.48} $$

Applying this to (2.2) and (2.19), we find that the inverse Lorentz transformation of
any four-vector is

$$ a^\nu = a'^\mu L_\mu{}^\nu , \tag{2.49} $$
$$ a_\nu = a'_\mu L^\mu{}_\nu . \tag{2.50} $$

Let us now consider some more particular Lorentz transformations. To begin, we
note that as a matrix, det(L) = ±1. An example of a “large” Lorentz transformation
with det(L) = −1 is:

$$
L^\mu{}_\nu =
\begin{pmatrix}
-1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{pmatrix}.
\tag{2.51}
$$

This just flips the sign of the time coordinate, and is therefore known as time reversal:

$$ x'^0 = -x^0, \quad x'^1 = x^1, \quad x'^2 = x^2, \quad x'^3 = x^3 . \tag{2.52} $$

Another “large” Lorentz transformation is parity, or space inversion:

$$
L^\mu{}_\nu =
\begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 \\
0 & 0 & -1 & 0 \\
0 & 0 & 0 & -1
\end{pmatrix},
\tag{2.53}
$$

so that:

$$ x'^0 = x^0, \quad x'^1 = -x^1, \quad x'^2 = -x^2, \quad x'^3 = -x^3 . \tag{2.54} $$

It was once thought that the laws of physics have to be invariant under these oper-
ations. However, it was shown experimentally in the 1950s that parity is violated
in the weak interactions, specifically in the weak decays of the $^{60}$Co nucleus and
the $K^\pm$ mesons. Likewise, experiments in the 1960s on the decays of $K^0$ mesons
showed that time-reversal invariance is violated (at least if very general properties
of quantum mechanics and special relativity are assumed).

However, all experiments up to now are consistent with invariance of the laws
of physics under the subset of Lorentz transformations that are continuously con-
nected to the identity; these are known as “proper” Lorentz transformations and have
det(L) = +1. They can be built up out of infinitesimal Lorentz transformations:

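$$ L^\mu{}_\nu = \delta^\mu_\nu + \omega^\mu{}_\nu + \mathcal{O}(\omega^2), \tag{2.55} $$

where we agree to drop everything with more than one $\omega^\mu{}_\nu$. Then, according to
(2.47),

$$ g_{\mu\nu} (\delta^\mu_\rho + \omega^\mu{}_\rho + \cdots)(\delta^\nu_\sigma + \omega^\nu{}_\sigma + \cdots) = g_{\rho\sigma} , \tag{2.56} $$
$$ g_{\rho\sigma} + \omega_{\sigma\rho} + \omega_{\rho\sigma} + \cdots = g_{\rho\sigma} . \tag{2.57} $$

Therefore

$$ \omega_{\sigma\rho} = -\omega_{\rho\sigma} \tag{2.58} $$

is an antisymmetric 4 × 4 matrix, with $4 \cdot 3/(2 \cdot 1) = 6$ independent entries. These
correspond to 3 rotations (ρ, σ = 1, 2 or 1, 3 or 2, 3) and 3 boosts (ρ, σ = 0, 1 or 0,
2 or 0, 3). It is a mathematical fact that any Lorentz transformation can be built up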
out of repeated infinitesimal boosts and rotations, combined with the operations of
time-reversal and space inversion.
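The constraint (2.47), in its matrix form $L^T g L = g$, and the sign of det(L) can be checked directly for the examples met so far. The Python sketch below does this for the rotation (2.4), the boost (2.10), time reversal (2.51), and parity (2.53). The helper functions are written out in plain Python for this illustration, and the rotation angle and boost velocity are arbitrary choices.

```python
import math

# Metric tensor g_{mu nu} of Eq. (2.13).
G = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def preserves_metric(L, tol=1e-12):
    """True if L^T g L = g, the matrix form of Eq. (2.47)."""
    lhs = matmul(matmul(transpose(L), G), L)
    return all(abs(lhs[i][j] - G[i][j]) < tol for i in range(4) for j in range(4))


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(len(m)))


def rotation_z(alpha):
    """Rotation about the z-axis by angle alpha, Eq. (2.4)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[1, 0, 0, 0], [0, c, s, 0], [0, -s, c, 0], [0, 0, 0, 1]]


def boost_z(beta):
    """Boost along z with velocity beta (in units of c), Eq. (2.10)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return [[gamma, 0, 0, -beta * gamma], [0, 1, 0, 0], [0, 0, 1, 0],
            [-beta * gamma, 0, 0, gamma]]


TIME_REVERSAL = [[-1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
PARITY = [[1, 0, 0, 0], [0, -1, 0, 0], [0, 0, -1, 0], [0, 0, 0, -1]]

examples = {
    "rotation": rotation_z(0.3),
    "boost": boost_z(0.6),
    "time reversal": TIME_REVERSAL,
    "parity": PARITY,
}

for name, L in examples.items():
    print(f"{name:14s} preserves metric: {preserves_metric(L)}, det = {det(L):+.3f}")
```

All four matrices satisfy the constraint; only the rotation and the boost have determinant +1, in line with the distinction between proper and "large" transformations.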
Lorentz transformations obey the mathematical properties of a group, known as
the Lorentz group. The subset of Lorentz transformations that can be built out of
repeated infinitesimal boosts and rotations form a smaller group, called the proper
Lorentz group. In the Standard Model of particle physics and generalizations of it,
all interesting objects, including operators, states, particles, and fields, transform
as well-defined representations of the Lorentz group. We will study these group
representations in more detail later.
So far we have considered constant four-vectors. However, one can also consider
four-vectors that depend on position in spacetime. For example, suppose that F(x)
is a scalar function of $x^\mu$. It is usual to leave the index μ off of $x^\mu$ when it is used as
the argument of a function, so F(x) really means $F(x^0, x^1, x^2, x^3)$. Under a Lorentz
transformation from coordinates $x^\mu \to x'^\mu$, at a given fixed point in spacetime the
value of the function $F'$ reported by an observer using the primed coordinate system
is taken to be equal to the value of the original function F in the original coordinates:

$$ F'(x') = F(x). \tag{2.59} $$

Then

$$ \partial_\mu F \equiv \frac{\partial F}{\partial x^\mu} = \left( \frac{1}{c} \frac{\partial F}{\partial t}, \vec{\nabla} F \right) \tag{2.60} $$

is a covariant four-vector. This is because:

$$ (\partial_\mu F)'(x') \equiv \frac{\partial}{\partial x'^\mu} F'(x') = \frac{\partial x^\nu}{\partial x'^\mu} \frac{\partial}{\partial x^\nu} F(x) = L_\mu{}^\nu (\partial_\nu F)(x), \tag{2.61} $$

showing that it transforms according to (2.19). (The second equality uses the chain
rule and (2.59); the last equality uses (2.49) with $a^\mu = x^\mu$.) By raising the index,
one obtains a contravariant four-vector function

$$ \partial^\mu F = g^{\mu\nu} \partial_\nu F = \left( \frac{1}{c} \frac{\partial F}{\partial t}, -\vec{\nabla} F \right). \tag{2.62} $$

One can obtain another scalar function by acting twice with the 4-dimensional
derivative operator on F, contracting the indices on the derivatives:

$$ \partial^\mu \partial_\mu F = \frac{1}{c^2} \frac{\partial^2 F}{\partial t^2} - \nabla^2 F . \tag{2.63} $$

The object $-\partial^\mu \partial_\mu F$ is a 4-dimensional generalization of the Laplacian.
A tensor is an object that can carry an arbitrary number of spacetime vector
indices, and transforms appropriately when one goes to a new reference frame. The
objects $g^{\mu\nu}$ and $g_{\mu\nu}$ and $\delta^\mu_\nu$ are constant tensors. Four-vectors and scalar functions
and 4-derivatives of them are also tensors. In general, the defining characteristic of
a tensor function $T^{\mu_1 \mu_2 \ldots}_{\nu_1 \nu_2 \ldots}(x)$ is that under a change of reference frame, it transforms
so that in the primed coordinate system, the corresponding tensor $T'$ is:

$$ T'^{\mu_1 \mu_2 \ldots}_{\nu_1 \nu_2 \ldots}(x') = L^{\mu_1}{}_{\rho_1} L^{\mu_2}{}_{\rho_2} \cdots L_{\nu_1}{}^{\sigma_1} L_{\nu_2}{}^{\sigma_2} \cdots T^{\rho_1 \rho_2 \ldots}_{\sigma_1 \sigma_2 \ldots}(x). \tag{2.64} $$

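A special and useful constant tensor is the totally antisymmetric Levi-Civita tensor:

$$
\epsilon_{\mu\nu\rho\sigma} =
\begin{cases}
+1 & \text{if } \mu\nu\rho\sigma \text{ is an even permutation of } 0123 \\
-1 & \text{if } \mu\nu\rho\sigma \text{ is an odd permutation of } 0123 \\
0 & \text{otherwise}
\end{cases}
\tag{2.65}
$$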

One use for the Levi-Civita tensor is in understanding the Lorentz invariance of
four-dimensional integration. Define 4 four-vectors so that in a particular frame they
are given by the infinitesimal differentials:

$$ A^\mu = (c\,dt, 0, 0, 0); \tag{2.66} $$
$$ B^\mu = (0, dx, 0, 0); \tag{2.67} $$
$$ C^\mu = (0, 0, dy, 0); \tag{2.68} $$
$$ D^\mu = (0, 0, 0, dz). \tag{2.69} $$

Then the 4-dimensional volume element

$$ d^4x \equiv dx^0\, dx^1\, dx^2\, dx^3 = A^\mu B^\nu C^\rho D^\sigma \epsilon_{\mu\nu\rho\sigma} \tag{2.70} $$

is Lorentz-invariant, since in the last expression it has no uncontracted four-vector
indices. It follows that if F(x) is a Lorentz scalar function of $x^\mu$, then the integral

$$ I[F] = \int d^4x\, F(x) \tag{2.71} $$

is invariant under Lorentz transformations. This is good because eventually we will
learn to define theories in terms of such an integral, known as the action.
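The invariance of the contraction in (2.70) can be illustrated numerically. The Python sketch below implements the Levi-Civita symbol of (2.65), contracts it with four vectors of the form (2.66)-(2.69), and repeats the contraction after boosting all four vectors along z. Finite numbers stand in for the infinitesimal differentials, and the function names and the boost velocity are assumptions made for this example.

```python
import math


def levi_civita(mu, nu, rho, sigma):
    """Totally antisymmetric symbol of Eq. (2.65), with epsilon_{0123} = +1."""
    indices = (mu, nu, rho, sigma)
    if len(set(indices)) < 4:
        return 0
    # The parity of the permutation equals the parity of its inversion count.
    inversions = sum(1 for i in range(4) for j in range(i + 1, 4)
                     if indices[i] > indices[j])
    return 1 if inversions % 2 == 0 else -1


def volume(A, B, C, D):
    """Contraction A^mu B^nu C^rho D^sigma epsilon_{mu nu rho sigma} of Eq. (2.70)."""
    idx = range(4)
    return sum(A[m] * B[n] * C[r] * D[s] * levi_civita(m, n, r, s)
               for m in idx for n in idx for r in idx for s in idx)


def boost_z(a, beta):
    """Boost a contravariant four-vector along z, as in Eq. (2.5)."""
    gamma = 1.0 / math.sqrt(1.0 - beta**2)
    return (gamma * (a[0] - beta * a[3]), a[1], a[2], gamma * (a[3] - beta * a[0]))


# Finite stand-ins for (c dt, dx, dy, dz), chosen arbitrarily.
A = (2.0, 0.0, 0.0, 0.0)
B = (0.0, 3.0, 0.0, 0.0)
C = (0.0, 0.0, 5.0, 0.0)
D = (0.0, 0.0, 0.0, 7.0)

original = volume(A, B, C, D)
boosted = volume(*(boost_z(v, 0.6) for v in (A, B, C, D)))

print("before boost:", original)
print("after boost: ", boosted)
```

The boosted vectors are no longer aligned with the coordinate axes, yet the contraction is unchanged, since it picks up only a factor of det(L) = +1 for a proper Lorentz transformation.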

2.4 Maxwell’s Equations and Electromagnetism

An example of a relativistic theory that we are familiar with is electricity and magnetism.
It is instructive to recast Maxwell’s equations into a manifestly relativistic
form. This will give us familiarity with the four-component gauge field formulation of
the relativistic wave equations governing electromagnetic fields. We will also see the
relativistic version of gauge invariance and the concept of gauge transformations for the
electromagnetic field, which will be explored more completely in later sections.
Recall that Maxwell’s equations can be written in the form:

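$$ \vec{\nabla} \cdot \vec{E} = e\rho , \tag{2.72} $$
$$ \vec{\nabla} \times \vec{B} - \frac{\partial \vec{E}}{\partial t} = e \vec{J} , \tag{2.73} $$
$$ \vec{\nabla} \cdot \vec{B} = 0 , \tag{2.74} $$
$$ \vec{\nabla} \times \vec{E} + \frac{\partial \vec{B}}{\partial t} = 0 , \tag{2.75} $$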
where ρ is the local charge density and J is the current density, with the magnitude of
the electron’s charge, e, factored out. These equations can be rewritten in a manifestly
relativistic form, using the following observations. First, suppose we add together the
equations obtained by taking ∂/∂t of (2.72) and ∇· of (2.73). Since the divergence
of a curl vanishes identically, this yields the Law of Local Conservation of Charge:

$$ \frac{\partial \rho}{\partial t} + \vec{\nabla} \cdot \vec{J} = 0 . \tag{2.76} $$

To put this into a Lorentz-invariant form, we can form a four-vector charge and
current density:

$$ J^\mu = (\rho, \vec{J}\,), \tag{2.77} $$

so that (2.76) becomes

$$ \partial_\mu J^\mu = 0 . \tag{2.78} $$

Furthermore, (2.74) and (2.75) imply that we can write the electric and magnetic
fields as derivatives of the electric and magnetic potentials V and A:

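$$ \vec{E} = -\vec{\nabla} V - \frac{\partial \vec{A}}{\partial t} , \tag{2.79} $$
$$ \vec{B} = \vec{\nabla} \times \vec{A} . \tag{2.80} $$

Now if we assemble the potentials into a four-vector:

$$ A^\mu = (V, \vec{A}\,), \tag{2.81} $$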

then (2.79) and (2.80) mean that we can write the electric and magnetic fields as
components of an antisymmetric tensor:

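$$
F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \tag{2.82}
$$

$$
=
\begin{pmatrix}
0 & E_x & E_y & E_z \\
-E_x & 0 & -B_z & B_y \\
-E_y & B_z & 0 & -B_x \\
-E_z & -B_y & B_x & 0
\end{pmatrix}.
\tag{2.83}
$$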

Now the Maxwell equations (2.72) and (2.73) correspond to the relativistic wave
equation

∂μ F μν = e J ν , (2.84)

or equivalently,

∂μ ∂ μ Aν − ∂ ν ∂μ Aμ = e J ν . (2.85)

The remaining Maxwell equations (2.74) and (2.75) are equivalent to the identity:

∂ρ Fμν + ∂μ Fνρ + ∂ν Fρμ = 0, (2.86)

for μ, ν, ρ = any of 0, 1, 2, 3. Note that this equation is automatically true because

of (2.82). Also, because F μν is antisymmetric and partial derivatives commute,
∂μ ∂ν F μν = 0, so that the Law of Local Conservation of Charge (2.78) follows from
(2.84).
The potential Aμ (x) may be thought of as fundamental, and the fields E and B as
derived from it. The theory of electromagnetism as described by Aμ (x) is subject to a
redundancy known as gauge invariance. To see this, we note that (2.85) is unchanged
if we do the transformation

$$A^\mu(x) \to A^\mu(x) + \partial^\mu \lambda(x), \tag{2.87}$$


where λ(x) is any function of position in spacetime. In components, this amounts to:

$$V \to V + \frac{\partial \lambda}{\partial t}, \tag{2.88}$$
$$\mathbf{A} \to \mathbf{A} - \nabla \lambda. \tag{2.89}$$

This transformation leaves F μν (or equivalently E and B) unchanged. Therefore,

the new Aμ is just as good as the old Aμ for the purposes of describing a particular
physical situation.
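The invariance of the field strength under a gauge transformation can be checked in one line. The following worked step is a sketch that assumes only the definition (2.82) with both indices raised, the transformation (2.87), and the fact that partial derivatives commute; the primed symbol for the transformed field strength is notation introduced here for illustration:

```latex
\begin{aligned}
F'^{\mu\nu}
  &= \partial^\mu \left( A^\nu + \partial^\nu \lambda \right)
   - \partial^\nu \left( A^\mu + \partial^\mu \lambda \right) \\
  &= F^{\mu\nu} + \left( \partial^\mu \partial^\nu - \partial^\nu \partial^\mu \right) \lambda \\
  &= F^{\mu\nu} .
\end{aligned}
```

Since the left-hand side of (2.84) depends on the potential only through the field strength, the field equation (2.85) is unchanged as well.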
Problems

1. Natural units are a system of units where the units of time, distance, energy,
mass and momentum are all represented in terms of GeV. This is accomplished
by rescaling constants of nature such that ℏ = 1 (Planck’s quantum mechanics
constant) and c = 1 (speed of light). Let us understand this better by giving c
the units of “speedy” and ℏ the units of “spinny.” In other words,

c = 1 speedy and ℏ = 1 spinny.

Use also the fact that 1 GeV = 1.602 × 10⁻¹⁰ J to find what meters, seconds and
kg are in units of speedy, spinny and GeV. Compare the numerical values you
get with the conversion factors of meters, seconds and kg to GeV in appendix
A.1.
Note: there is no problem keeping “speedy” and “spinny” units for the entire
book but since they are always multiplied by factors of c and ℏ in the equations,
which are numerically just 1 in these units, it is numerically safe to always drop
reference to “speedy” and “spinny” units and just keep GeV. That choice defines
“natural units”.
2. A baseball has a mass of 0.145 kg. The distance between the pitcher’s mound
and home plate is 18.44 meters. A hitter has about 150 milliseconds to make a
decision on whether to swing at a pitch. Express these three measurements in
natural units (i.e., in units of GeV).
3. Using data from the “Review of Particle Properties”, find numerical values for
the mean distance traveled by a muon, a neutral pion, a charged pion, a tau lepton,
a K⁺ meson, a D⁺ meson, and a B⁺ meson produced in high-energy physics
experiments, for the following values of the particle energy: 10 GeV, 100 GeV,
1000 GeV.
4. What is the threshold energy for the initial-state proton, for the reaction p + n →
p + p + π⁻, assuming the neutron is initially at rest?
5. Consider a particle at rest with mass M, which undergoes a 2-body decay to
particles a, b of masses m_a and m_b. Assuming that all motion of the decay
products is along the z direction, find the four-momenta of the particles a, b for:
(a) the special case m_b = m_a = m.
(b) the special case m_b = 0.
(c) the general case. Write your answer in the simplest form. Hint: write your
answer in terms of √λ, where
λ ≡ M⁴ + m_a⁴ + m_b⁴ − 2m_a²M² − 2m_b²M² − 2m_a²m_b².
Check that your answer agrees with the special cases in parts (a) and (b).
6. Assume two incoming particles of masses m₁ and m₂ travel along the z direction
and collide with total center-of-mass energy √s, producing two outgoing particles
of masses m₃ and m₄ traveling in the yz plane with an angle θ with respect
to the original z collision axis. Find the COM four-vectors for all four particles.
7. Consider the elastic scattering of a muon neutrino off of an electron: νμ e− →
νμ e− . Suppose that in the lab frame the electron is initially at rest and the neutrino
has energy E, and treat the neutrino as massless.
(a) Write expressions for the 4-vectors of the particles in the center-of-momentum
frame.
(b) In the lab frame, what is the maximum angle of the scattered electron with
respect to the incoming neutrino?
8. Use the definition of Fμν in terms of Aμ to verify the Bianchi identity

∂ρ Fμν + ∂μ Fνρ + ∂ν Fρμ = 0. (2.90)

9. An asymmetric B-factory will study CP violation in the decays of B mesons.
It consists of an electron beam of energy E₁ colliding with a positron beam of
energy E₂, with the energies chosen so that the b b̄ bound state ϒ(4S) is produced
resonantly, and in a frame moving with respect to the lab frame. For the purposes
of this problem, take M = √112 GeV exactly for the ϒ(4S) mass; this is close
to the actual value. Get other necessary quantities from the Review of Particle
Properties. You can also neglect the mass of the electron.
(a) If E₂ = 4 GeV, what should E₁ be in order to produce the ϒ(4S) resonantly?
(b) What is the mean distance travelled by the ϒ(4S) in the lab frame before it
decays?
(c) Suppose the ϒ(4S) decays to B⁰ + B̄⁰, with the B⁰ produced at an angle θ
with respect to the e⁻ beam direction in the center of momentum frame. What
are the momentum and velocity 4-vectors of the B⁰ and B̄⁰, as a function of
M, m = m_{B⁰}, θ, E₁, and E₂? Leave your answers in terms of these variables
(rather than numerical values).
(d) What is the mean distance travelled by the B⁰ before it decays, as a function
of the same variables and the decay width Γ_{B⁰}?
(e) Plug in all of the numbers for part (d). Which has the greater impact on the
mean distance travelled: the boost from the asymmetric beam energies, or
the variation in the angle θ?
10. Two photons, with energies E₁ and E₂, annihilate with each other in empty
space, with a collision angle θ. The final state is a muon + antimuon pair. In this
problem, denote the mass of the muon as M, and its mean lifetime in its rest
frame as τ.
(a) Find the necessary inequality for the process to occur, in terms of the given
quantities. Discuss the behavior of this requirement for the two limiting cases
θ = 0 and θ = π.
(b) Now suppose that the collision occurs head-on (θ = 0), and that E₁ > E₂.
What is the maximum energy that one of the resulting muons can have?
(c) Now suppose that θ = 0, E₁ = 9M, and E₂ = M. What will be the maximum
mean lifetime of the longer-lived muon?

11. Prove that any Lorentz transformation matrix L μ ν satisfies det(L) = ±1.
[Hint: Recall that det(AB) = det(A)det(B) for any matrices A, B.]
12. Consider a Lorentz transformation for which x^μ → x′^μ = L^μ_ν x^ν, where L^μ_ν
is constant.
(a) Suppose that T^μ_ν is a tensor. Use (2.3.20) to write down how it transforms
under a Lorentz transformation. Then, from this, prove that T^μ_μ transforms
as a scalar.
(b) Let Aμ be the 4-vector potential for electromagnetism, and ∂μ = ∂/∂ x μ and
∂ μ = ∂/∂ xμ and let J μ be the corresponding current density, as in Sect. 2.4.
Write down how each of these objects transforms under a Lorentz transfor-
mation.
(c) Use the results of part (b) to derive how the field-strength tensor F μν trans-
forms under the Lorentz transformation, but without further appealing to
(2.3.20). Write the result in the form

F μν → X μν ρσ F ρσ , (2.91)

and then check that it does indeed agree with (2.3.20).

(d) Suppose that Maxwell’s equations (2.4.13) and (2.4.15) are satisfied in the
original coordinate system. Prove, using the above, that they are still satisfied
in the primed coordinate system resulting from the Lorentz transformation.
3 Relativistic Quantum Mechanics of Single Particles

3.1 Klein-Gordon and Dirac Equations

Any realistic theory must be consistent with quantum mechanics. In this chapter, we
consider how to formulate a theory of quantum mechanics that is consistent with
special relativity.
Suppose that Φ(x) is the wavefunction of a free particle in 4-dimensional spacetime.
A fundamental principle of quantum mechanics is that the time dependence of
Φ is determined by a Hamiltonian operator, according to:

$$H\Phi = i\hbar \frac{\partial \Phi}{\partial t}. \tag{3.1}$$

Now, the three-momentum operator is given by

$$\mathbf{P} = -i\hbar \nabla. \tag{3.2}$$

Because H and P commute, one can take Φ to be one of the basis wavefunctions
for eigenstates with energy and momentum eigenvalues E and p, respectively:

$$H\Phi = E\Phi; \qquad \mathbf{P}\Phi = \mathbf{p}\Phi. \tag{3.3}$$

One can now turn this into a relativistic Schrödinger wave equation for free-particle
states, by using the fact that special relativity implies:

$$E = \sqrt{m^2 c^4 + \mathbf{p}^2 c^2}, \tag{3.4}$$

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_3

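where m is the mass of the particle. To make sense of this as an operator equation, we
could try expanding it in an infinite series, treating p² as small compared to m²c²:

$$i\hbar \frac{\partial \Phi}{\partial t} = mc^2 \left(1 + \frac{\mathbf{p}^2}{2 m^2 c^2} - \frac{\mathbf{p}^4}{8 m^4 c^4} + \ldots \right)\Phi \tag{3.5}$$
$$= \left[ mc^2 - \frac{\hbar^2}{2m} \nabla^2 - \frac{\hbar^4}{8 m^3 c^2} (\nabla^2)^2 + \ldots \right]\Phi. \tag{3.6}$$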

If we keep only the first two terms, then we recover the standard non-relativistic
quantum mechanics of a free particle; the first term mc2 is an unobservable constant
contribution to the Hamiltonian, proportional to the rest energy, and the second term is
the usual non-relativistic kinetic energy. However, the presence of an infinite number
of derivatives leads to horrible problems, including apparently non-local effects.
Instead, one can consider the operator H² acting on Φ, avoiding the square root.
It follows that

$$H^2 \Phi = E^2 \Phi = (c^2 \mathbf{p}^2 + m^2 c^4)\Phi = (c^2 \mathbf{P}^2 + m^2 c^4)\Phi, \tag{3.7}$$

so that:

$$-\frac{\partial^2 \Phi}{\partial t^2} = -\nabla^2 \Phi + m^2 \Phi. \tag{3.8}$$
Here and from now on we have set c = 1 and ℏ = 1 by a choice of units. This convention
means that mass, energy, and momentum all have the same units (GeV), while
time and distance have units of GeV⁻¹, and velocity is dimensionless. These
conventions greatly simplify the equations of particle physics. One can always recover
the usual metric system units using the following conversion table for energy, mass,
distance, and time, respectively:

1 GeV = 1.6022 × 10⁻³ erg = 1.6022 × 10⁻¹⁰ J, (3.9)
(1 GeV)/c² = 1.7827 × 10⁻²⁴ g = 1.7827 × 10⁻²⁷ kg, (3.10)
(1 GeV)⁻¹ (ℏc) = 1.9733 × 10⁻¹⁴ cm = 1.9733 × 10⁻¹⁶ m, (3.11)
(1 GeV)⁻¹ (ℏ) = 6.58212 × 10⁻²⁵ s. (3.12)
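The entries of the conversion table can be reproduced from three SI constants. The sketch below is illustrative only: the function names are hypothetical, and the numerical constants are the standard SI/CODATA values (an input not quoted in the text), with ℏ rounded to ten significant figures.

```python
# SI values of the constants (assumed inputs, not taken from the text).
HBAR_J_S = 1.054571817e-34     # reduced Planck constant, in J s (rounded)
C_M_PER_S = 299_792_458.0      # speed of light, in m/s
E_CHARGE_C = 1.602176634e-19   # elementary charge, in C

GEV_IN_J = 1.0e9 * E_CHARGE_C  # 1 GeV expressed in joules


def gev_to_kg(mass_gev):
    """Convert a mass in GeV (natural units) to kilograms: m = E / c^2."""
    return mass_gev * GEV_IN_J / C_M_PER_S**2


def inverse_gev_to_m(length_inv_gev):
    """Convert a length in GeV^-1 to meters by restoring one factor of hbar*c."""
    return length_inv_gev * HBAR_J_S * C_M_PER_S / GEV_IN_J


def inverse_gev_to_s(time_inv_gev):
    """Convert a time in GeV^-1 to seconds by restoring one factor of hbar."""
    return time_inv_gev * HBAR_J_S / GEV_IN_J


if __name__ == "__main__":
    print(f"1 GeV      = {GEV_IN_J:.4e} J")
    print(f"1 GeV/c^2  = {gev_to_kg(1.0):.4e} kg")
    print(f"1 GeV^-1   = {inverse_gev_to_m(1.0):.4e} m")
    print(f"1 GeV^-1   = {inverse_gev_to_s(1.0):.5e} s")
```

The same three helpers can be used in reverse (dividing instead of multiplying) to express an everyday mass, length, or time in powers of GeV.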

Using (2.63), the wave equation (3.8) can be rewritten in a manifestly Lorentz-invariant
way as

$$(\partial^\mu \partial_\mu + m^2)\Phi = 0. \tag{3.13}$$

This relativistic generalization of the Schrödinger equation is known as the
Klein-Gordon equation.
It is easy to guess the solutions of the Klein-Gordon equation. If we try:

$$\Phi(x) = \Phi_0 e^{-ik\cdot x}, \tag{3.14}$$

where Φ₀ is a constant and k^μ is a four-vector, then ∂_μ Φ = −ik_μ Φ and so

$$\partial^\mu \partial_\mu \Phi = -k^\mu k_\mu \Phi = -k^2 \Phi. \tag{3.15}$$

Therefore, we only need to impose k² = m² to have a solution. It is then easy to check
that this is an eigenstate of H and P with energy E = k⁰ and three-momentum p = k,
satisfying E² = p² + m².
However, there is a big problem with this. If k μ = (E, p ) gives a solution, then
so does k μ = (−E, p ). By increasing |p |, one can have |E| arbitrarily large. This is
a disaster, because the energy is not bounded from below. If the particle can interact,
it will make transitions from higher energy states to lower energy states. This would
seem to lead to the release of an infinite amount of energy as the particle acquires a
larger and larger three-momentum, without bound!
In 1927, Dirac suggested an alternative, based on the observation that the problem
with the Klein-Gordon equation seems to be that it is quadratic in H or equivalently
∂/∂t; this leads to the sign ambiguity for E. Dirac could also have been1 motivated
by the fact that particles like the electron have spin; since they have more than one
intrinsic degree of freedom, trying to explain them with a single wavefunction Φ(x)
is doomed to failure. Instead, Dirac proposed to write a relativistic Schrödinger
equation, for a multi-component wavefunction Ψ_a(x), where the spinor index a =
1, 2, . . . , n runs over the components. The wave equation should be linear in ∂/∂t;
since relativity places t on the same footing as x, y, z, it should also be linear in
derivatives of the spatial coordinates. Therefore, the equation ought to take the form

$$i\frac{\partial \Psi}{\partial t} = H\Psi = (\boldsymbol{\alpha} \cdot \mathbf{P} + \beta m)\Psi, \tag{3.16}$$
where αx , α y , αz , and β are n × n matrices acting in “spinor space”.
To determine what α and β have to be, consider H²Ψ. There are two ways to
evaluate the result. First, by exactly the same reasoning as for the Klein-Gordon
equation, one finds

$$-\frac{\partial^2 \Psi}{\partial t^2} = (-\nabla^2 + m^2)\Psi. \tag{3.17}$$
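On the other hand, expressing H in terms of the right-hand side of (3.16), we find:

$$-\frac{\partial^2 \Psi}{\partial t^2} = \left[ -\sum_{j,k=1}^{3} \alpha_j \alpha_k \frac{\partial}{\partial x^j} \frac{\partial}{\partial x^k} - i m \sum_{j=1}^{3} (\alpha_j \beta + \beta \alpha_j) \frac{\partial}{\partial x^j} + \beta^2 m^2 \right] \Psi. \tag{3.18}$$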

Since partial derivatives commute, one can write:

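$$\sum_{j,k=1}^{3} \alpha_j \alpha_k \frac{\partial}{\partial x^j} \frac{\partial}{\partial x^k} = \frac{1}{2} \sum_{j,k=1}^{3} (\alpha_j \alpha_k + \alpha_k \alpha_j) \frac{\partial}{\partial x^j} \frac{\partial}{\partial x^k}. \tag{3.19}$$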

1 Apparently, he realized this only in hindsight.


Then comparing (3.17) and (3.18), one finds that the two agree if, for j, k = 1, 2, 3:

$$\beta^2 = 1, \tag{3.20}$$
$$\alpha_j \beta + \beta \alpha_j = 0, \tag{3.21}$$
$$\alpha_j \alpha_k + \alpha_k \alpha_j = 2\delta_{jk}. \tag{3.22}$$

The simplest solution turns out to require n = 4 spinor indices. This may be
somewhat surprising, since naively one only needs n = 2 to describe a spin-1/2
particle like the electron. As we will see, the Dirac equation automatically describes
positrons as well as electrons, accounting for the doubling. It is easiest to write the
solution in terms of 2 × 2 Pauli matrices:

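$$\sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \text{and} \quad \sigma^0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{3.23}$$

Then one can check that the 4 × 4 matrices

$$\beta = \begin{pmatrix} 0 & \sigma^0 \\ \sigma^0 & 0 \end{pmatrix}, \qquad \alpha_j = \begin{pmatrix} -\sigma^j & 0 \\ 0 & \sigma^j \end{pmatrix}, \qquad (j = 1, 2, 3) \tag{3.24}$$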

obey the required conditions. The matrices β, α j are written in 2 × 2 block form,
so “0” actually denotes a 2 × 2 block of 0’s. Equation (3.16) is known as the Dirac
equation, and the 4-component object Ψ is known as a Dirac spinor. Note that the fact
that Dirac spinor space is 4-dimensional, just like ordinary spacetime, is really just
a coincidence.2 One must be careful not to confuse the two types of 4-dimensional
spaces!
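The statement that the matrices in (3.24) obey (3.20) through (3.22) can be checked numerically. The following is a minimal sketch using NumPy (an assumption; the text prescribes no software), and the helper names `anticommutator` and `dirac_conditions_hold` are illustrative only:

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Pauli matrices sigma^1, sigma^2, sigma^3 from (3.23).
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# beta and alpha_j in 2x2 block form, as in (3.24).
BETA = np.block([[Z2, I2], [I2, Z2]])
ALPHA = [np.block([[-s, Z2], [Z2, s]]) for s in SIGMA]


def anticommutator(a, b):
    """Return {a, b} = ab + ba."""
    return a @ b + b @ a


def dirac_conditions_hold(alpha, beta):
    """Check beta^2 = 1, {alpha_j, beta} = 0 and {alpha_j, alpha_k} = 2 delta_jk."""
    identity = np.eye(4)
    ok = np.allclose(beta @ beta, identity)
    for j in range(3):
        ok = ok and np.allclose(anticommutator(alpha[j], beta), 0)
        for k in range(3):
            expected = 2 * (j == k) * identity
            ok = ok and np.allclose(anticommutator(alpha[j], alpha[k]), expected)
    return bool(ok)


if __name__ == "__main__":
    print("Conditions (3.20)-(3.22) satisfied:", dirac_conditions_hold(ALPHA, BETA))
```

Swapping in any other candidate set of matrices for `ALPHA` and `BETA` gives a quick way to test whether it is an acceptable representation.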
It is convenient and traditional to rewrite the Dirac equation in a nicer way by
multiplying it on the left by the matrix β, and defining

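$$\gamma^0 = \beta, \qquad \gamma^j = \beta \alpha_j, \qquad (j = 1, 2, 3). \tag{3.25}$$

The result is

$$i\left(\gamma^0 \frac{\partial}{\partial x^0} + \gamma^1 \frac{\partial}{\partial x^1} + \gamma^2 \frac{\partial}{\partial x^2} + \gamma^3 \frac{\partial}{\partial x^3}\right)\Psi - m\Psi = 0, \tag{3.26}$$

or, even more nicely:

$$(i\gamma^\mu \partial_\mu - m)\Psi = 0. \tag{3.27}$$

The γ^μ matrices are explicitly given, in 2 × 2 blocks, by:

$$\gamma^0 = \begin{pmatrix} 0 & \sigma^0 \\ \sigma^0 & 0 \end{pmatrix}, \qquad \gamma^j = \begin{pmatrix} 0 & \sigma^j \\ -\sigma^j & 0 \end{pmatrix}, \qquad (j = 1, 2, 3). \tag{3.28}$$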

2 For example, if we lived in 10-dimensional spacetime, it turns out that Dirac spinors would have
32 components.

[The solution found above for the γ^μ is not unique. To see this, suppose U is any
constant unitary 4 × 4 matrix satisfying U†U = 1. Then the Dirac equation implies:

$$U(i\gamma^\mu \partial_\mu - m)U^\dagger U \Psi = 0, \tag{3.29}$$

from which it follows that, writing γ′^μ = Uγ^μU† and Ψ′ = UΨ,

$$(i\gamma'^\mu \partial_\mu - m)\Psi' = 0. \tag{3.30}$$

So, the new γ′^μ matrices together with the new spinor Ψ′ are just as good as the
old pair γ^μ, Ψ; there are an infinite number of different, equally valid choices. The
set we’ve given above is called the chiral or Weyl representation. Another popular
choice used by some textbooks (but not here) is the Pauli-Dirac representation.]
Many problems involving fermions in high-energy physics involve many gamma
matrices dotted into partial derivatives or momentum four-vectors. To keep the nota-
tion from getting too bloated, it is often useful to use the Feynman slash notation:

$$\gamma^\mu a_\mu = \slashed{a} \tag{3.31}$$

for any four-vector a^μ. Then the Dirac equation takes the even more compact form:

$$(i\slashed{\partial} - m)\Psi = 0. \tag{3.32}$$

Some important properties of the γ μ matrices are:

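$$\gamma^{0\dagger} = \gamma^0, \qquad \gamma^{j\dagger} = -\gamma^j, \qquad (j = 1, 2, 3), \tag{3.33}$$
$$\gamma^0 \gamma^{\mu\dagger} \gamma^0 = \gamma^\mu, \tag{3.34}$$
$$\mathrm{Tr}(\gamma_\mu \gamma_\nu) = 4 g_{\mu\nu}, \tag{3.35}$$
$$\gamma^\mu \gamma_\mu = 4, \tag{3.36}$$
$$\gamma_\mu \gamma_\nu + \gamma_\nu \gamma_\mu = \{\gamma_\mu, \gamma_\nu\} = 2 g_{\mu\nu}. \tag{3.37}$$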

Note that on the right-hand sides of each of (3.36) and (3.37), there is an implicit
4 × 4 unit matrix. It turns out that one almost never needs to know the explicit form
of the γ μ . Instead, the equations above can be used to derive identities needed in
practical work.
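Although the explicit matrices are rarely needed, they provide a quick numerical check of the identities (3.33) through (3.37). The sketch below assumes NumPy and the chiral representation (3.28); it works with upper-index gamma matrices, for which the trace and anticommutator identities take the same form with the diagonal metric (+, −, −, −). The dictionary keys are just labels chosen here.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# gamma^0, gamma^1, gamma^2, gamma^3 in the chiral representation (3.28).
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [
    np.block([[Z2, s], [-s, Z2]]) for s in SIGMA
]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])
I4 = np.eye(4)
INDICES = range(4)


def dagger(matrix):
    """Hermitian conjugate."""
    return matrix.conj().T


checks = {
    "hermiticity (3.33)": np.allclose(dagger(GAMMA[0]), GAMMA[0])
    and all(np.allclose(dagger(GAMMA[j]), -GAMMA[j]) for j in (1, 2, 3)),
    "conjugation (3.34)": all(
        np.allclose(GAMMA[0] @ dagger(GAMMA[mu]) @ GAMMA[0], GAMMA[mu])
        for mu in INDICES
    ),
    "trace (3.35)": all(
        np.isclose(np.trace(GAMMA[mu] @ GAMMA[nu]), 4 * METRIC[mu, nu])
        for mu in INDICES for nu in INDICES
    ),
    # gamma^mu gamma_mu = sum over mu of g_{mu mu} gamma^mu gamma^mu
    "contraction (3.36)": np.allclose(
        sum(METRIC[mu, mu] * GAMMA[mu] @ GAMMA[mu] for mu in INDICES), 4 * I4
    ),
    "anticommutator (3.37)": all(
        np.allclose(
            GAMMA[mu] @ GAMMA[nu] + GAMMA[nu] @ GAMMA[mu], 2 * METRIC[mu, nu] * I4
        )
        for mu in INDICES for nu in INDICES
    ),
}

if __name__ == "__main__":
    for name, passed in checks.items():
        print(f"{name}: {bool(passed)}")
```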
How does a Dirac spinor Ψ_a(x) transform under a Lorentz transformation? It
carries no vector index, so it is not a tensor. On the other hand, the fact that the
Hamiltonian “mixes up” the components of Ψ_a(x) is a clue that it doesn’t transform
like an ordinary scalar function either. Instead, we might expect that the spinor
reported by an observer in the primed frame is given by

$$\Psi'(x') = \Lambda \Psi(x), \tag{3.38}$$

where Λ is a 4 × 4 matrix that depends on the Lorentz transformation matrix L^μ_ν. In
fact, one can show that for an infinitesimal Lorentz transformation L^μ_ν = δ^μ_ν + ω^μ_ν,
one has:

$$\Psi'(x') = \left(1 - \frac{i}{2} \omega_{\mu\nu} S^{\mu\nu}\right)\Psi(x), \tag{3.39}$$

where

$$S^{\mu\nu} = \frac{i}{4} [\gamma^\mu, \gamma^\nu]. \tag{3.40}$$
To obtain the result for a non-infinitesimal proper Lorentz transformation, we can
apply the same infinitesimal transformation a large number of times N, with N → ∞.
Letting Ω_{μν} = N ω_{μν}, we obtain, after N iterations of the Lorentz transformation
parameterized by ω^μ_ν:

$$L^\mu{}_\nu = \left[\left(\delta + \Omega/N\right)^N\right]^\mu{}_\nu \to [\exp(\Omega)]^\mu{}_\nu \tag{3.41}$$

as N → ∞. Here we are using the identity:

$$\lim_{N \to \infty} (1 + x/N)^N = \exp(x), \tag{3.42}$$

with the exponential of a matrix to be interpreted in the power series sense:

exp(M) = 1 + M + M 2 /2 + M 3 /6 + . . . . (3.43)

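For the Dirac spinor, one has in the same way:

$$\Psi'(x') = \left(1 - \frac{i}{2} \Omega_{\mu\nu} S^{\mu\nu}/N\right)^N \Psi(x) \to \exp\left(-\frac{i}{2} \Omega_{\mu\nu} S^{\mu\nu}\right)\Psi(x). \tag{3.44}$$

So, we have found the Λ that appears in (3.38) corresponding to the L^μ_ν that appears
in (3.41):

$$\Lambda = \exp\left(-\frac{i}{2} \Omega_{\mu\nu} S^{\mu\nu}\right). \tag{3.45}$$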

As an example, consider a boost in the z direction:

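$$\Omega^\mu{}_\nu = \begin{pmatrix} 0 & 0 & 0 & -\rho \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -\rho & 0 & 0 & 0 \end{pmatrix}. \tag{3.46}$$

Then

$$\Omega^2 = \begin{pmatrix} \rho^2 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \rho^2 \end{pmatrix}, \qquad \Omega^3 = \begin{pmatrix} 0 & 0 & 0 & -\rho^3 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -\rho^3 & 0 & 0 & 0 \end{pmatrix}, \qquad \text{etc.} \tag{3.47}$$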

so that from (3.41) and (3.43),

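$$L^\mu{}_\nu = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} + \left(\frac{\rho^2}{2} + \frac{\rho^4}{24} + \cdots\right) \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} - \left(\rho + \frac{\rho^3}{6} + \ldots\right) \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{pmatrix}$$
$$= \begin{pmatrix} \cosh\rho & 0 & 0 & -\sinh\rho \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ -\sinh\rho & 0 & 0 & \cosh\rho \end{pmatrix}, \tag{3.48}$$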

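in agreement with (2.7). Meanwhile, Ω_{03} = −Ω_{30} = −ρ, so

$$-\frac{i}{2} \Omega_{\mu\nu} S^{\mu\nu} = -\frac{\rho}{4} [\gamma^0, \gamma^3] = \frac{\rho}{2} \begin{pmatrix} \sigma^3 & 0 \\ 0 & -\sigma^3 \end{pmatrix} \tag{3.49}$$

in 2 × 2 blocks. Since this matrix is diagonal, it is particularly easy to exponentiate,
and (3.45) gives:

$$\Lambda = \exp \begin{pmatrix} \rho/2 & 0 & 0 & 0 \\ 0 & -\rho/2 & 0 & 0 \\ 0 & 0 & -\rho/2 & 0 \\ 0 & 0 & 0 & \rho/2 \end{pmatrix} = \begin{pmatrix} e^{\rho/2} & 0 & 0 & 0 \\ 0 & e^{-\rho/2} & 0 & 0 \\ 0 & 0 & e^{-\rho/2} & 0 \\ 0 & 0 & 0 & e^{\rho/2} \end{pmatrix}. \tag{3.50}$$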

Therefore, this is the matrix that boosts a Dirac spinor in the z direction with rapidity
ρ, in (3.38).
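The two exponentiations in this example can be reproduced numerically. The sketch below assumes NumPy and sums the power series (3.43) directly; `expm_series` and `spin_generator` are hypothetical helper names. As an extra check that is not stated in the text, it also tests the standard compatibility condition Λ⁻¹γ^μΛ = L^μ_ν γ^ν, which is what makes the Dirac equation take the same form in both frames.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [
    np.block([[Z2, s], [-s, Z2]]) for s in SIGMA
]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def expm_series(matrix, terms=40):
    """Matrix exponential from the power series (3.43), keeping powers up to terms - 1."""
    result = np.eye(matrix.shape[0], dtype=complex)
    term = np.eye(matrix.shape[0], dtype=complex)
    for n in range(1, terms):
        term = term @ matrix / n
        result = result + term
    return result


def spin_generator(mu, nu):
    """S^{mu nu} = (i/4) [gamma^mu, gamma^nu], as in (3.40)."""
    return 0.25j * (GAMMA[mu] @ GAMMA[nu] - GAMMA[nu] @ GAMMA[mu])


rho = 0.7  # arbitrary rapidity chosen for the check

# Omega^mu_nu for a boost along z, as in (3.46), and Omega_{mu nu} with the index lowered.
omega_mixed = np.zeros((4, 4))
omega_mixed[0, 3] = omega_mixed[3, 0] = -rho
omega_lower = METRIC @ omega_mixed

L = expm_series(omega_mixed).real
exponent = sum(
    -0.5j * omega_lower[mu, nu] * spin_generator(mu, nu)
    for mu in range(4) for nu in range(4)
)
Lam = expm_series(exponent)

# Closed forms quoted in (3.48) and (3.50).
ch, sh = np.cosh(rho), np.sinh(rho)
L_closed = np.array([[ch, 0, 0, -sh], [0, 1, 0, 0], [0, 0, 1, 0], [-sh, 0, 0, ch]])
Lam_closed = np.diag(np.exp(np.array([rho, -rho, -rho, rho]) / 2))

vector_matches = np.allclose(L, L_closed)
spinor_matches = np.allclose(Lam, Lam_closed)
compatible = all(
    np.allclose(
        np.linalg.inv(Lam) @ GAMMA[mu] @ Lam,
        sum(L[mu, nu] * GAMMA[nu] for nu in range(4)),
    )
    for mu in range(4)
)

if __name__ == "__main__":
    print("exp(Omega) reproduces (3.48):", vector_matches)
    print("spinor exponential reproduces (3.50):", spinor_matches)
    print("Lambda^-1 gamma^mu Lambda = L^mu_nu gamma^nu:", compatible)
```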
Since Ψ is not a scalar, it is natural to ask whether one can use it to construct a
scalar quantity. A tempting guess is to get rid of all the pesky spinor indices by

$$\Psi^\dagger \Psi(x) \equiv \sum_{a=1}^{4} \Psi_a^\dagger \Psi_a. \tag{3.51}$$

However, under a Lorentz transformation, Ψ′(x′) = ΛΨ(x) and Ψ′†(x′) = Ψ†(x)Λ†,
so:

$$\Psi'^\dagger \Psi'(x') = \Psi^\dagger \Lambda^\dagger \Lambda \Psi(x). \tag{3.52}$$

This will therefore be a scalar function if Λ†Λ = 1, in other words if Λ is a unitary
matrix. However, this is not true, as the example of (3.50) clearly illustrates.
Instead, with amazing foresight, let us consider the object

$$\Psi^\dagger \gamma^0 \Psi. \tag{3.53}$$

Under a Lorentz transformation:

$$\Psi'^\dagger \gamma^0 \Psi'(x') = \Psi^\dagger \Lambda^\dagger \gamma^0 \Lambda \Psi(x). \tag{3.54}$$

Therefore, Ψ†γ⁰Ψ will transform as a scalar if:

$$\Lambda^\dagger \gamma^0 \Lambda = \gamma^0. \tag{3.55}$$

One can check that this is indeed true for the special case of (3.50). More importantly,
(3.55) is true for any

$$\Lambda = 1 - \frac{i}{2} \omega_{\mu\nu} S^{\mu\nu} \tag{3.56}$$

that is infinitesimally close to the identity, using (3.33) and (3.34). Therefore, it is
true for any proper Lorentz transformation built out of infinitesimal ones.
Motivated by this, one defines, for any Dirac spinor Ψ,

$$\overline{\Psi} \equiv \Psi^\dagger \gamma^0. \tag{3.57}$$

One should think of Ψ as a column vector in spinor space, and Ψ̄ as a row vector.
Then their inner product

$$\overline{\Psi} \Psi, \tag{3.58}$$

with all spinor indices contracted, transforms as a scalar function under proper
Lorentz transformations. Similarly, one can show that

$$\overline{\Psi} \gamma^\mu \Psi \tag{3.59}$$

transforms as a four-vector. One should think of (3.59) as a (row vector) × (matrix)
× (column vector) in spinor-index space, with a spacetime vector index μ hanging
around.
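These transformation properties can be illustrated for the z boost of (3.48) and (3.50). The sketch below assumes NumPy; the test spinor `psi` is an arbitrary choice made here, and `bar` is a hypothetical helper implementing (3.57).

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [
    np.block([[Z2, s], [-s, Z2]]) for s in SIGMA
]

rho = 0.9  # arbitrary rapidity
ch, sh = np.cosh(rho), np.sinh(rho)
L = np.array([[ch, 0, 0, -sh], [0, 1, 0, 0], [0, 0, 1, 0], [-sh, 0, 0, ch]])  # (3.48)
Lam = np.diag(np.exp(np.array([rho, -rho, -rho, rho]) / 2)).astype(complex)   # (3.50)


def bar(spinor):
    """Row spinor: Psi-bar = Psi^dagger gamma^0, as in (3.57)."""
    return spinor.conj() @ GAMMA[0]


psi = np.array([1.0, 2.0j, 0.5, -1.0 + 1.0j])  # arbitrary test spinor
psi_prime = Lam @ psi                          # transformed spinor, (3.38)

# Psi^dagger Psi is NOT invariant, because Lambda is not unitary.
naive_before = np.vdot(psi, psi).real
naive_after = np.vdot(psi_prime, psi_prime).real

# Psi-bar Psi is invariant, and Psi-bar gamma^mu Psi transforms with L.
scalar_before = bar(psi) @ psi
scalar_after = bar(psi_prime) @ psi_prime
current_before = np.array([bar(psi) @ GAMMA[mu] @ psi for mu in range(4)])
current_after = np.array([bar(psi_prime) @ GAMMA[mu] @ psi_prime for mu in range(4)])

if __name__ == "__main__":
    print("Psi^dagger Psi before/after:", naive_before, naive_after)
    print("Psi-bar Psi invariant:", np.isclose(scalar_before, scalar_after))
    print("current is a four-vector:", np.allclose(current_after, L @ current_before))
```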

3.2 Solutions of the Dirac Equation

Our next task is to construct solutions to the Dirac equation. Let us separate out the
x^μ-dependent part as a plane wave, by trying

$$\Psi(x) = u(p, s) e^{-ip\cdot x}. \tag{3.60}$$

Here p^μ is a four-vector momentum, with p⁰ = E > 0. A solution to the Dirac
equation must also satisfy the Klein-Gordon equation, so p² = E² − |p|² = m². The
object u(p, s) is a spinor, labeled by the 4-momentum p and s. For now s just
distinguishes between distinct solutions, but it will turn out to be related to the spin.
Plugging this into the Dirac equation (3.32), we obtain a 4 × 4 eigenvalue equation
to be solved for u(p, s):

$$(\slashed{p} - m) u(p, s) = 0. \tag{3.61}$$

To simplify things, first consider this equation in the rest frame of the particle,
where p^μ = (m, 0, 0, 0). In that frame,

$$m(\gamma^0 - 1) u(p, s) = 0. \tag{3.62}$$

Using the explicit form of γ⁰, we can therefore write in 2 × 2 blocks:

$$\begin{pmatrix} -1 & 1 \\ 1 & -1 \end{pmatrix} u(p, s) = 0, \tag{3.63}$$

where each “1” means a 2 × 2 unit matrix. The solutions are clearly

$$u(p, s) = \sqrt{m} \begin{pmatrix} \chi_s \\ \chi_s \end{pmatrix}, \tag{3.64}$$

where χ_s can be any 2-vector, and the √m normalization is a convention. In practice,
it is best to choose the χ_s orthonormal, satisfying χ_s†χ_r = δ_{rs} for r, s = 1, 2. A
particularly nice choice is:

$$\chi_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \qquad \chi_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{3.65}$$

As we will see, these just correspond to spin eigenstates S_z = 1/2 and −1/2.
Now, to construct the corresponding solution in any other frame, one can just
boost the spinor using (3.45). For example, consider the solution

$$\Psi'(x') = u(p', 1) e^{-ip'\cdot x'} = \sqrt{m} \begin{pmatrix} 1 \\ 0 \\ 1 \\ 0 \end{pmatrix} e^{-imt'} \tag{3.66}$$

in a frame where the particle is at rest; we have called it the primed frame for
convenience. We suppose the primed frame is moving with respect to the unprimed
frame with rapidity ρ in the z direction. Thus, the particle has, in the unprimed frame:

$$E = p^0 = m \cosh\rho; \qquad p_z = p^3 = m \sinh\rho. \tag{3.67}$$

Now, Ψ(x) = Λ⁻¹Ψ′(x′) from (3.38), so using the inverse of (3.50):

$$\Psi(x) = \sqrt{m} \begin{pmatrix} e^{-\rho/2} \\ 0 \\ e^{\rho/2} \\ 0 \end{pmatrix} e^{-ip\cdot x}. \tag{3.68}$$

We can rewrite this, noting that from (3.67),

$$\sqrt{m}\, e^{\rho/2} = \sqrt{E + p_z}, \qquad \sqrt{m}\, e^{-\rho/2} = \sqrt{E - p_z}. \tag{3.69}$$

Therefore, one solution of the Dirac equation for a particle moving in the z direction,
with energy E and three-momentum p_z = √(E² − m²), is:

$$\Psi(x) = \begin{pmatrix} \sqrt{E - p_z} \\ 0 \\ \sqrt{E + p_z} \\ 0 \end{pmatrix} e^{-ip\cdot x}, \tag{3.70}$$

so that

$$u(p, 1) = \begin{pmatrix} \sqrt{E - p_z} \\ 0 \\ \sqrt{E + p_z} \\ 0 \end{pmatrix} \tag{3.71}$$

in this frame.
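Similarly, if we use instead χ₂ = (0, 1)ᵀ in (3.64) in the rest frame, and apply the
same procedure, we find a solution:

$$\Psi(x) = \sqrt{m} \begin{pmatrix} 0 \\ e^{\rho/2} \\ 0 \\ e^{-\rho/2} \end{pmatrix} e^{-ip\cdot x} = \begin{pmatrix} 0 \\ \sqrt{E + p_z} \\ 0 \\ \sqrt{E - p_z} \end{pmatrix} e^{-ip\cdot x}, \tag{3.72}$$

so that

$$u(p, 2) = \begin{pmatrix} 0 \\ \sqrt{E + p_z} \\ 0 \\ \sqrt{E - p_z} \end{pmatrix} \tag{3.73}$$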

in this frame. Note that pz in (3.70) and (3.72) can have either sign, corresponding
to the wavefunction for a particle moving in either the +z or −z directions.
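As a check on the boosted spinors, one can verify numerically that (3.71) and (3.73) satisfy the eigenvalue equation (3.61) for either sign of p_z. This is a minimal sketch assuming NumPy; the mass and momentum values are arbitrary, and `slash`, `u_spinor`, and `dirac_residual` are hypothetical helper names.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [
    np.block([[Z2, s], [-s, Z2]]) for s in SIGMA
]


def slash(p):
    """gamma^mu p_mu for contravariant components p = (E, px, py, pz), metric (+,-,-,-)."""
    return p[0] * GAMMA[0] - p[1] * GAMMA[1] - p[2] * GAMMA[2] - p[3] * GAMMA[3]


def u_spinor(energy, pz, s):
    """u(p, s) for motion along z: (3.71) for s = 1 and (3.73) for s = 2."""
    plus, minus = np.sqrt(energy + pz), np.sqrt(energy - pz)
    if s == 1:
        return np.array([minus, 0, plus, 0], dtype=complex)
    return np.array([0, plus, 0, minus], dtype=complex)


def dirac_residual(mass, pz, s):
    """Return (p-slash - m) u(p, s), which should vanish for a solution of (3.61)."""
    energy = np.sqrt(pz**2 + mass**2)
    p = np.array([energy, 0.0, 0.0, pz])
    return (slash(p) - mass * np.eye(4)) @ u_spinor(energy, pz, s)


if __name__ == "__main__":
    mass = 0.1057  # GeV, arbitrary illustrative value
    for pz in (0.3, -0.3):
        for s in (1, 2):
            ok = np.allclose(dirac_residual(mass, pz, s), 0)
            print(f"pz = {pz:+.1f} GeV, s = {s}: solves (3.61) -> {ok}")
```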
In order to make a direct connection between spin and the various components of
a Dirac spinor, let us now consider how to construct the spin operator S. To do this,
recall that by definition, spin is the difference between the total angular momentum
operator J and the orbital angular momentum operator L:

J = L + S. (3.74)

Now,

L = x × P, (3.75)

where x and P are the three-dimensional position and momentum operators. The
total angular momentum must be conserved, or in other words it must commute with
the Hamiltonian:

[H , J] = 0. (3.76)

Using the Dirac Hamiltonian given in (3.16), we have

[H , L] = [α · P + βm, x × P] = −iα × P, (3.77)

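where we have used the canonical commutation relation (with ℏ = 1) [P_j, x_k] =
−iδ_{jk}. So, comparing (3.74), (3.76) and (3.77), it must be true that:

$$[H, \mathbf{S}] = i\boldsymbol{\alpha} \times \mathbf{P} = i \begin{pmatrix} -\boldsymbol{\sigma} \times \mathbf{P} & 0 \\ 0 & \boldsymbol{\sigma} \times \mathbf{P} \end{pmatrix}. \tag{3.78}$$

One can now observe that the matrix:

$$\mathbf{S} = \frac{1}{2} \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & \boldsymbol{\sigma} \end{pmatrix} \tag{3.79}$$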

obeys (3.78). So, it must be the spin operator acting on Dirac spinors.
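The observation that (3.79) obeys (3.78) can be tested numerically by letting the Hamiltonian act on a momentum eigenstate, so that the operator P is replaced by a numerical vector p. The sketch below assumes NumPy; the values of p and m are arbitrary, and the helper names are hypothetical.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
BETA = np.block([[Z2, I2], [I2, Z2]])                          # (3.24)
ALPHA = [np.block([[-s, Z2], [Z2, s]]) for s in SIGMA]         # (3.24)
SPIN = [0.5 * np.block([[s, Z2], [Z2, s]]) for s in SIGMA]     # (3.79)


def hamiltonian(p, mass):
    """Dirac Hamiltonian (3.16) acting on a state of definite three-momentum p."""
    return sum(p[j] * ALPHA[j] for j in range(3)) + mass * BETA


def alpha_cross_p(p):
    """The three components of alpha x p."""
    return [
        ALPHA[1] * p[2] - ALPHA[2] * p[1],
        ALPHA[2] * p[0] - ALPHA[0] * p[2],
        ALPHA[0] * p[1] - ALPHA[1] * p[0],
    ]


p = np.array([0.3, -0.4, 1.2])  # arbitrary momentum components
mass = 0.5                      # arbitrary mass

H = hamiltonian(p, mass)
cross = alpha_cross_p(p)
commutators = [H @ SPIN[k] - SPIN[k] @ H for k in range(3)]

# [H, S_k] should equal i (alpha x p)_k for each component k.
matches_378 = all(np.allclose(commutators[k], 1j * cross[k]) for k in range(3))
# Spin by itself is not conserved: the commutators do not vanish.
spin_alone_conserved = all(np.allclose(c, 0) for c in commutators)

if __name__ == "__main__":
    print("[H, S] = i alpha x p:", matches_378)
    print("[H, S] = 0:", spin_alone_conserved)
```

The nonzero commutator is exactly what is needed to cancel [H, L] in (3.77), so that only the sum J = L + S is conserved.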
In particular, the z-component of the spin operator for Dirac spinors is given by
the diagonal matrix:
$$S_z = \frac{1}{2} \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}. \tag{3.80}$$

Therefore, the solutions in (3.70) and (3.72) can be identified to have spin eigenvalues
Sz = +1/2 and Sz = −1/2, respectively. In general, a Dirac spinor eigenstate with
Sz = +1/2 will have only the first and third components non-zero, and one with

Sz = −1/2 will have only the second and fourth components non-zero, regardless
of the direction of the momentum. Note that, as promised, Sz = 1/2 (−1/2) exactly
corresponds to the use of χ1 (χ2 ) in (3.64).
The helicity operator gives the relative orientation of the spin of the particle and
its momentum. It is defined to be:
$$h = \frac{\mathbf{p} \cdot \mathbf{S}}{|\mathbf{p}|}. \tag{3.81}$$

Like Sz , helicity has possible eigenvalues ±1/2 for a spin-1/2 particle. For example,
if pz > 0, then (3.70) and (3.72) represent states with helicity +1/2 and −1/2
respectively. The helicity is not invariant under Lorentz transformations for massive
particles. This is because one can always boost to a different frame in which the
3-momentum is flipped but the spin remains the same. (Also, note that unlike Sz ,
helicity is not even well-defined for a particle exactly at rest, due to the |p | = 0 in
the denominator.) However, a massless particle moves at the speed of light in any
inertial frame, so one can never boost to a frame in which its 3-momentum direction is
flipped. This means that for massless (or very energetic, so that E ≫ m) particles, the
helicity is fixed and invariant under Lorentz transformations. In any frame, a particle
with p and S parallel has helicity h = 1/2, and a particle with p and S antiparallel
has helicity h = −1/2.
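Helicity is particularly useful in the high-energy limit. For example, we can consider
four solutions obtained from the E, |p_z| ≫ m limits of (3.70) and (3.72), so that
|p_z| = E:

$$\Psi_{p_z > 0,\, S_z = +1/2} = \begin{pmatrix} 0 \\ 0 \\ \sqrt{2E} \\ 0 \end{pmatrix} e^{-iE(t - z)} \qquad [\mathbf{p} \uparrow,\ \mathbf{S} \uparrow,\ h = +1/2] \tag{3.82}$$

$$\Psi_{p_z > 0,\, S_z = -1/2} = \begin{pmatrix} 0 \\ \sqrt{2E} \\ 0 \\ 0 \end{pmatrix} e^{-iE(t - z)} \qquad [\mathbf{p} \uparrow,\ \mathbf{S} \downarrow,\ h = -1/2] \tag{3.83}$$

$$\Psi_{p_z < 0,\, S_z = +1/2} = \begin{pmatrix} \sqrt{2E} \\ 0 \\ 0 \\ 0 \end{pmatrix} e^{-iE(t + z)} \qquad [\mathbf{p} \downarrow,\ \mathbf{S} \uparrow,\ h = -1/2] \tag{3.84}$$

$$\Psi_{p_z < 0,\, S_z = -1/2} = \begin{pmatrix} 0 \\ 0 \\ 0 \\ \sqrt{2E} \end{pmatrix} e^{-iE(t + z)} \qquad [\mathbf{p} \downarrow,\ \mathbf{S} \downarrow,\ h = +1/2]. \tag{3.85}$$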

In this high-energy limit, a Dirac spinor with h = +1/2 is called right-handed (R)
and one with h = −1/2 is called left-handed (L). Notice that a high-energy L state

is one that has the last two entries zero, while a high-energy R state always has the
first two entries zero.
It is useful to define matrices that project onto L and R states in the high-energy
or massless limit. In 2 × 2 blocks:

$$P_L = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}; \qquad P_R = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}, \tag{3.86}$$

where 1 and 0 mean the 2 × 2 unit and zero matrices, respectively. Then PL acting
on any Dirac spinor gives back a left-handed spinor, by just killing the last two
components. The projectors obey the rules:

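$$P_L^2 = P_L; \qquad P_R^2 = P_R; \qquad P_R P_L = P_L P_R = 0. \tag{3.87}$$

It is traditional to write P_L and P_R in terms of a “fifth” gamma matrix, which in our
conventions is given in 2 × 2 blocks by:

$$\gamma_5 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}. \tag{3.88}$$

Then

$$P_L = \frac{1 - \gamma_5}{2}; \qquad P_R = \frac{1 + \gamma_5}{2}. \tag{3.89}$$

The matrix γ₅ satisfies the equations:

$$\gamma_5^2 = 1; \qquad \gamma_5^\dagger = \gamma_5; \qquad \{\gamma_5, \gamma_\mu\} = 0. \tag{3.90}$$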
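The projector algebra and the properties of γ₅ can be confirmed with a few matrix products. The sketch below assumes NumPy and the chiral representation used here; the relation γ₅ = iγ⁰γ¹γ²γ³ tested at the end is a standard identity that is not quoted in the text above, and the energy value is arbitrary.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [
    np.block([[Z2, s], [-s, Z2]]) for s in SIGMA
]

GAMMA5 = np.block([[-I2, Z2], [Z2, I2]])   # (3.88)
P_L = (np.eye(4) - GAMMA5) / 2             # (3.89)
P_R = (np.eye(4) + GAMMA5) / 2

projector_algebra = (
    np.allclose(P_L @ P_L, P_L)
    and np.allclose(P_R @ P_R, P_R)
    and np.allclose(P_L @ P_R, 0)
    and np.allclose(P_R @ P_L, 0)
)
gamma5_properties = (
    np.allclose(GAMMA5 @ GAMMA5, np.eye(4))
    and np.allclose(GAMMA5.conj().T, GAMMA5)
    and all(np.allclose(GAMMA5 @ g + g @ GAMMA5, 0) for g in GAMMA)
)
# Standard identity, assumed here rather than taken from the text.
gamma5_from_product = 1j * GAMMA[0] @ GAMMA[1] @ GAMMA[2] @ GAMMA[3]

# High-energy spinors from (3.82) and (3.83), without the plane-wave factor.
energy = 50.0
u_right = np.array([0, 0, np.sqrt(2 * energy), 0], dtype=complex)  # h = +1/2
u_left = np.array([0, np.sqrt(2 * energy), 0, 0], dtype=complex)   # h = -1/2

if __name__ == "__main__":
    print("projector algebra (3.87):", projector_algebra)
    print("gamma_5 properties (3.90):", gamma5_properties)
    print("P_L keeps u_left, kills u_right:",
          np.allclose(P_L @ u_left, u_left), np.allclose(P_L @ u_right, 0))
```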

So far, we have been considering Dirac spinor wavefunction solutions of the form

$$\Psi(x) = u(p, s) e^{-ip\cdot x} \tag{3.91}$$

with p 0 = E > 0. We have successfully interpreted these solutions in terms of a

spin-1/2 particle, say, the electron. However, there is nothing mathematically wrong
with these solutions for p μ with p 0 < 0 and p 2 = m 2 . So, like the Klein-Gordon
equation, the Dirac equation has the embarrassment of negative energy solutions.
Dirac proposed to get around the problem of negative energy states by using the
fact that spin-1/2 particles are fermions. The Pauli exclusion principle dictates that
two fermions cannot occupy the same quantum state. Therefore, Dirac proposed that
all of the negative energy states are occupied. This prevents electrons with positive
energy from making disastrous transitions to the E < 0 states. The infinite number
of filled E < 0 states is called the Dirac sea.
If one of the states in the Dirac sea becomes unoccupied, it leaves behind a “hole”.
Since a hole is the absence of an E < 0 state, it effectively has energy −E > 0. An

electron has charge3 −e, so the hole corresponding to its absence effectively has
the opposite charge, +e. Since both electrons and holes obey p 2 = m 2 , they have
the same mass. Dirac’s proposal therefore predicts the existence of “anti-electrons”
or positrons, with positive energy and positive charge. The positron was indeed
discovered in 1932 in cosmic ray experiments.
Feynman and Stückelberg noted that one can reinterpret the positron as a nega-
tive energy electron moving backwards in time, so that p μ → − p μ and S → −S.
According to this interpretation, the wavefunction for a positron with 4-momentum
p μ with p 0 = E > 0 is

$$\Psi(x) = v(p, s) e^{ip\cdot x}. \tag{3.92}$$

Now, using the Dirac equation (3.32), v(p, s) must satisfy the eigenvalue equation:

$$(\slashed{p} + m) v(p, s) = 0. \tag{3.93}$$

We can now construct solutions to this equation just as before. First, in the rest
(primed) frame of the particle, we have in 2 × 2 blocks:

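$$\begin{pmatrix} m & m \\ m & m \end{pmatrix} v(p, s) = 0. \tag{3.94}$$

So, the solutions are

$$\Psi'(x') = \sqrt{m} \begin{pmatrix} \xi_s \\ -\xi_s \end{pmatrix} e^{imt'} \tag{3.95}$$

for any two-vector ξ_s.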

One must be careful in interpreting the quantum numbers of the positron solutions
to the Dirac equation. This is because the Hamiltonian, 3-momentum, and spin
operators of a positron described by the wavefunction (3.92) are all given by the
negative of the expressions one would use for an electron wavefunction. Thus, acting
on a positron wavefunction, one has

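$$H\Psi = -i\frac{\partial \Psi}{\partial t}, \tag{3.96}$$
$$\mathbf{P}\Psi = i\nabla\Psi, \tag{3.97}$$
$$\mathbf{S}\Psi = -\frac{1}{2} \begin{pmatrix} \boldsymbol{\sigma} & 0 \\ 0 & \boldsymbol{\sigma} \end{pmatrix} \Psi, \tag{3.98}$$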

3 Here, e is always defined to be positive, so that the electron has charge −e. (Some references
define e to be negative.)

where H, P, and S are the operators whose eigenvalues are to be interpreted as the
energy, 3-momentum, and spin of the positive-energy positron antiparticle. Therefore,
to describe a positron with spin S_z = +1/2 or −1/2, one should use, respectively,

$$\xi_1 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}; \quad \text{or} \quad \xi_2 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \tag{3.99}$$

in (3.95).
Now we can boost to the unprimed frame just as before, yielding the solutions:
$$v(p, 1) = \begin{pmatrix} 0 \\ \sqrt{E + p_z} \\ 0 \\ -\sqrt{E - p_z} \end{pmatrix}, \tag{3.100}$$

$$v(p, 2) = \begin{pmatrix} \sqrt{E - p_z} \\ 0 \\ -\sqrt{E + p_z} \\ 0 \end{pmatrix}. \tag{3.101}$$

Here v(p, 1) corresponds to a positron moving in the +z direction with 3-momentum
p_z and energy E = √(p_z² + m²) and S_z = +1/2, hence helicity h = +1/2 if p_z > 0.
Similarly, v(p, 2) corresponds to a positron with the same energy and 3-momentum,
but with S_z = −1/2, and therefore helicity h = −1/2 if p_z > 0.
Note that for positron wavefunctions, PL projects onto states that describe right-
handed positrons in the high-energy limit, and PR projects onto states that describe
left-handed positrons in the high-energy limit. If we insist that PL projects on to left-
handed spinors, and PR projects on to right-handed spinors, then we must simply
remember that a right-handed positron is described by a left-handed spinor (annihi-
lated by PR ), and vice versa!
Later we will also need to use the Dirac row spinors:

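$$\bar{u}(p, s) = u(p, s)^\dagger \gamma^0, \qquad \bar{v}(p, s) = v(p, s)^\dagger \gamma^0. \tag{3.102}$$

The quantities $\bar{u}(p, s)u(k, r)$ and $\bar{u}(p, s)v(k, r)$ and $\bar{v}(p, s)u(k, r)$ and $\bar{v}(p, s)v(k, r)$
are all Lorentz scalars. For example, in the rest frame of a particle with mass $m$ and
spin $S_z = +1/2$, one has

$$\bar{u}(p, 1) = u(p, 1)^\dagger \gamma^0 = \sqrt{m}\,\begin{pmatrix} 1 & 0 & 1 & 0 \end{pmatrix} \begin{pmatrix} 0 & \sigma^0 \\ \bar{\sigma}^0 & 0 \end{pmatrix} = \begin{pmatrix} \sqrt{m} & 0 & \sqrt{m} & 0 \end{pmatrix}, \tag{3.103}$$

so that

$$\bar{u}(p, 1)u(p, 1) = \begin{pmatrix} \sqrt{m} & 0 & \sqrt{m} & 0 \end{pmatrix} \begin{pmatrix} \sqrt{m} \\ 0 \\ \sqrt{m} \\ 0 \end{pmatrix} = 2m. \tag{3.104}$$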

Since this quantity is a scalar, it must be true that $\bar{u}(p, 1)u(p, 1) = 2m$ in any Lorentz
frame, in other words, for any $p^\mu$. More generally, if $s, r = 1, 2$ represent orthonormal
spin state labels, then the $u$ and $v$ spinors obey:

$$\bar{u}(p, s)u(p, r) = 2m\,\delta_{sr}, \tag{3.105}$$
$$\bar{v}(p, s)v(p, r) = -2m\,\delta_{sr}, \tag{3.106}$$
$$\bar{v}(p, s)u(p, r) = \bar{u}(p, s)v(p, r) = 0. \tag{3.107}$$

Similarly, one can show that $\bar{u}(p, s)\gamma^\mu u(p, r) = 2(m, \vec{0}\,)\delta_{sr}$ in the rest frame. Since
it is a four-vector, it must be that in any frame:

$$\bar{u}(p, s)\gamma^\mu u(p, r) = 2p^\mu \delta_{sr}. \tag{3.108}$$

But the most useful identities that we will use later on are the spin-sum equations:

$$\sum_{s=1}^{2} u(p, s)\bar{u}(p, s) = \slashed{p} + m; \tag{3.109}$$
$$\sum_{s=1}^{2} v(p, s)\bar{v}(p, s) = \slashed{p} - m. \tag{3.110}$$

Here the spin state label s is summed over. These equations are to be interpreted in
the sense of a column vector times a row vector giving a 4 × 4 matrix, like:
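$$\begin{pmatrix} a_1 \\ a_2 \\ a_3 \\ a_4 \end{pmatrix} \begin{pmatrix} b_1 & b_2 & b_3 & b_4 \end{pmatrix} = \begin{pmatrix} a_1 b_1 & a_1 b_2 & a_1 b_3 & a_1 b_4 \\ a_2 b_1 & a_2 b_2 & a_2 b_3 & a_2 b_4 \\ a_3 b_1 & a_3 b_2 & a_3 b_3 & a_3 b_4 \\ a_4 b_1 & a_4 b_2 & a_4 b_3 & a_4 b_4 \end{pmatrix}. \tag{3.111}$$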

We will use (3.109) and (3.110) often when calculating cross-sections and decay
rates involving fermions.
As a check, note that if we act on the left of (3.109) with $\slashed{p} - m$, the left hand
side vanishes because of (3.61), and the right hand side vanishes because of

$$(\slashed{p} - m)(\slashed{p} + m) = \slashed{p}\slashed{p} - m^2 = 0. \tag{3.112}$$

This relies on the identity

$$\slashed{p}\slashed{p} = p^2, \tag{3.113}$$

which follows from

$$\slashed{p}\slashed{p} = p^\mu p^\nu \gamma_\mu \gamma_\nu = p^\mu p^\nu (-\gamma_\nu \gamma_\mu + 2g_{\mu\nu}) = -\slashed{p}\slashed{p} + 2p^2, \tag{3.114}$$

where (3.37) was used. A similar consistency check works if we act on the left of
(3.110) with $\slashed{p} + m$.
The Dirac spinors given above only describe electrons and positrons with both
momentum and spin aligned along the ±z direction. More generally, we could con-
struct u( p, s) and v( p, s) for states describing electrons or positrons with any p and
spin. However, in general that is quite a mess, and it turns out to be not particularly
useful in most practical applications, as we will see.
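
The normalization conditions (3.105)–(3.107) and the spin sums (3.109)–(3.110) can be spot-checked numerically for momentum along the z axis. The following is only a sketch under stated assumptions: it uses the chiral-representation matrices of (3.122) with metric signature (+,−,−,−), takes the v spinors from (3.100)–(3.101), and assumes the standard forms u(p,1) = (√(E−p_z), 0, √(E+p_z), 0) and u(p,2) = (0, √(E+p_z), 0, √(E−p_z)) for the electron spinors, which are not written out in this part of the text. The helper names are illustrative.

```python
import math

def gammas():
    """Chiral-representation gamma^mu = [[0, sigma^mu], [sigmabar^mu, 0]]."""
    sig = [
        [[1, 0], [0, 1]],
        [[0, 1], [1, 0]],
        [[0, -1j], [1j, 0]],
        [[1, 0], [0, -1]],
    ]
    out = []
    for mu, s in enumerate(sig):
        sign = 1 if mu == 0 else -1  # sigmabar^0 = sigma^0, sigmabar^i = -sigma^i
        g = [[0j] * 4 for _ in range(4)]
        for i in range(2):
            for j in range(2):
                g[i][j + 2] = complex(s[i][j])
                g[i + 2][j] = complex(sign * s[i][j])
        out.append(g)
    return out

def slash(E, pz):
    """p-slash = gamma^0 E - gamma^3 p_z for momentum along z."""
    g = gammas()
    return [[E * g[0][a][b] - pz * g[3][a][b] for b in range(4)] for a in range(4)]

def bar(w):
    """Row spinor w-bar = w^dagger gamma^0."""
    g0 = gammas()[0]
    return [sum(w[a].conjugate() * g0[a][b] for a in range(4)) for b in range(4)]

def inner(row, col):
    return sum(r * c for r, c in zip(row, col))

def spin_sum(ws):
    """Sum over s of the outer product w(p,s) w-bar(p,s), a 4x4 matrix."""
    return [[sum(w[a] * bar(w)[b] for w in ws) for b in range(4)] for a in range(4)]

m, pz = 4.0, 3.0
E = math.sqrt(pz**2 + m**2)
hi, lo = math.sqrt(E + pz), math.sqrt(E - pz)

u = [[lo, 0, hi, 0], [0, hi, 0, lo]]      # assumed standard electron spinors
v = [[0, hi, 0, -lo], [lo, 0, -hi, 0]]    # Eqs. (3.100) and (3.101)

uu = [[inner(bar(u[s]), u[r]) for r in range(2)] for s in range(2)]
vv = [[inner(bar(v[s]), v[r]) for r in range(2)] for s in range(2)]
uv = [[inner(bar(u[s]), v[r]) for r in range(2)] for s in range(2)]
ps = slash(E, pz)
sum_u, sum_v = spin_sum(u), spin_sum(v)
```

With these inputs, `uu` and `vv` should reproduce 2m δ_sr and −2m δ_sr, `uv` should vanish, and `sum_u`, `sum_v` should agree entry by entry with p̸ + m and p̸ − m.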

3.3 The Weyl Equation

It turns out that the Dirac equation can be replaced by something simpler and more
fundamental in the special case m = 0. If we go back to Dirac’s guess for the Hamil-
tonian, we now have just
$$H = \vec{\alpha} \cdot \vec{P}, \tag{3.115}$$

and there is no need for the matrix β. Therefore, (3.20) and (3.21) are not applicable,
and we have only the one requirement:
$$\alpha_j \alpha_k + \alpha_k \alpha_j = 2\delta_{jk}. \tag{3.116}$$

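Now there are two distinct solutions involving 2 × 2 matrices, namely $\vec{\alpha} = \vec{\sigma}$ or
$\vec{\alpha} = -\vec{\sigma}$. So we have two possible quantum mechanical wave equations:

$$i\frac{\partial}{\partial t}\psi = \pm i\vec{\sigma} \cdot \vec{\nabla}\psi. \tag{3.117}$$

If we now define

$$\sigma^\mu = (\sigma^0, \sigma^1, \sigma^2, \sigma^3), \tag{3.118}$$
$$\bar{\sigma}^\mu = (\sigma^0, -\sigma^1, -\sigma^2, -\sigma^3), \tag{3.119}$$

then we can write the two possible equations in the form:

$$i\bar{\sigma}^\mu \partial_\mu \psi_L = 0, \tag{3.120}$$
$$i\sigma^\mu \partial_\mu \psi_R = 0. \tag{3.121}$$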

Here we have attached labels L and R because the solutions to these equations turn
out to have left and right helicity, respectively, as we will see in a moment. Each of
these equations is called a Weyl equation. They are similar to the Dirac equation, but
only apply to massless spin-1/2 particles, and are 2 × 2 matrix equations rather than
4 × 4. The two-component objects ψ L and ψ R are called Weyl spinors.
We can understand the relationship of the Dirac equation to the Weyl equations
if we notice that the γ μ matrices can be written as

$$\gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar{\sigma}^\mu & 0 \end{pmatrix}. \tag{3.122}$$

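(Compare (3.28).) If we now write a Dirac spinor in its L and R helicity components,

$$\Psi = \begin{pmatrix} \Psi_L \\ \Psi_R \end{pmatrix}, \tag{3.123}$$

then the Dirac equation (3.27) becomes:

$$i\begin{pmatrix} 0 & \sigma^\mu \\ \bar{\sigma}^\mu & 0 \end{pmatrix} \partial_\mu \begin{pmatrix} \Psi_L \\ \Psi_R \end{pmatrix} = m \begin{pmatrix} \Psi_L \\ \Psi_R \end{pmatrix}, \tag{3.124}$$

or

$$i\sigma^\mu \partial_\mu \Psi_R = m\Psi_L, \tag{3.125}$$
$$i\bar{\sigma}^\mu \partial_\mu \Psi_L = m\Psi_R. \tag{3.126}$$

Comparing with (3.120) and (3.121) when $m = 0$, we can indeed identify $\Psi_R$ as a
right-handed helicity Weyl fermion, and $\Psi_L$ as a left-handed helicity Weyl fermion.
Note that if $m = 0$, one can consistently set $\Psi_R = 0$ as an identity in the Dirac
spinor without violating (3.125) and (3.126). Then only the left-handed helicity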
fermion exists. This is how neutrinos originally appeared in the Standard Model;
a massless neutrino corresponds to a left-handed Weyl fermion. Recent evidence
shows that neutrinos do have small masses, so that this discussion has to be modified
slightly. However, for experiments in which neutrino masses can be neglected, it is
still proper to treat a neutrino as a Weyl fermion, or equivalently as the left-handed
part of a Dirac neutrino with no right-handed part.
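
As a quick consistency check on the block form (3.122), one can confirm numerically that these matrices satisfy the anticommutation relation {γ^μ, γ^ν} = 2g^{μν} and that the upper and lower two components are the left- and right-handed parts. This is a sketch only: it assumes the metric diag(+1, −1, −1, −1) and the standard convention γ_5 = iγ^0γ^1γ^2γ^3, neither of which is restated in this section, and the function names are illustrative.

```python
def gammas():
    """gamma^mu = [[0, sigma^mu], [sigmabar^mu, 0]], as in Eq. (3.122)."""
    sig = [
        [[1, 0], [0, 1]],
        [[0, 1], [1, 0]],
        [[0, -1j], [1j, 0]],
        [[1, 0], [0, -1]],
    ]
    out = []
    for mu, s in enumerate(sig):
        sign = 1 if mu == 0 else -1  # sigmabar^0 = sigma^0, sigmabar^i = -sigma^i
        g = [[0j] * 4 for _ in range(4)]
        for i in range(2):
            for j in range(2):
                g[i][j + 2] = complex(s[i][j])
                g[i + 2][j] = complex(sign * s[i][j])
        out.append(g)
    return out

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

g = gammas()
metric = [1, -1, -1, -1]  # diagonal of g^{mu nu} (assumed signature)

def clifford_defect(mu, nu):
    """Largest entry of {gamma^mu, gamma^nu} - 2 g^{mu nu} times the identity."""
    ab, ba = matmul(g[mu], g[nu]), matmul(g[nu], g[mu])
    target = 2 * metric[mu] if mu == nu else 0
    return max(
        abs(ab[i][j] + ba[i][j] - (target if i == j else 0))
        for i in range(4) for j in range(4)
    )

max_defect = max(clifford_defect(mu, nu) for mu in range(4) for nu in range(4))

# gamma_5 = i gamma^0 gamma^1 gamma^2 gamma^3 (standard convention, assumed here)
g5 = matmul(matmul(g[0], g[1]), matmul(g[2], g[3]))
g5 = [[1j * x for x in row] for row in g5]
gamma5_diagonal = [g5[i][i].real for i in range(4)]
```

In this representation `gamma5_diagonal` comes out as −1 on the two upper components and +1 on the two lower ones, so (1 − γ_5)/2 keeps exactly the upper block Ψ_L of (3.123).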
The two versions of the Weyl equation, (3.120) and (3.121), are actually not
distinct. To see this, suppose we take the Hermitian conjugate of (3.121). Since
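$(\vec{\nabla})^\dagger = -\vec{\nabla}$ and $(\sigma^\mu)^\dagger = \sigma^\mu$, we obtain:

$$i\left(\sigma^0 \frac{\partial}{\partial t} - \vec{\sigma} \cdot \vec{\nabla}\right)\psi_R^\dagger = 0 \tag{3.127}$$

or:

$$i\bar{\sigma}^\mu \partial_\mu \psi_R^\dagger = 0. \tag{3.128}$$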

In other words, a right-handed Weyl spinor is the Hermitian conjugate of a left-

handed Weyl spinor, and vice versa. The modern point of view is that the Weyl
equation is fundamental, and all 4-component Dirac fermions can be thought of as
consisting of two 2-component Weyl fermions coupled together by a mass. This is
because in the Standard Model, all fermions are massless until they are provided with
a mass term by the spontaneous breaking of the electroweak symmetry, as described
in Sect. 10.2. Since a right-handed Weyl fermion is the Hermitian conjugate of a
left-handed Weyl fermion, all one really needs is the single Weyl equation (3.120).
A Dirac fermion is then written as

$$\Psi = \begin{pmatrix} \chi \\ \xi^\dagger \end{pmatrix}, \tag{3.129}$$

where χ and ξ are independent left-handed two-component Weyl fermions. All

fermion degrees of freedom can thus be thought of in terms of left-handed Weyl
fermions.

3.4 Majorana Fermions

A Majorana fermion can be obtained from a massive Dirac fermion by reducing

the number of degrees of freedom. This is done by identifying the right-handed part
with the conjugate of the left-handed part, imposed as a constraint, expressed by:

$$\psi \equiv \Psi_L = \Psi_R^\dagger. \tag{3.130}$$

In four-component notation, a Majorana spinor has the form

$$\Psi_M = \begin{pmatrix} \psi \\ \psi^\dagger \end{pmatrix}, \tag{3.131}$$

and obeys the same wave equation as a Dirac fermion, $(i\gamma^\mu \partial_\mu - m)\Psi_M = 0$. However,
it has only half as many degrees of freedom; the Majorana condition (3.130)
ensures that a Majorana fermion is its own antiparticle. From (3.125), (3.126), one
sees that in the two-component form a classical Majorana fermion obeys the wave
equation:

$$i\bar{\sigma}^\mu \partial_\mu \psi - m\psi^\dagger = 0, \tag{3.132}$$

or, in complex-conjugated form,

$$i\sigma^\mu \partial_\mu \psi^\dagger - m\psi = 0. \tag{3.133}$$

As we will see in Sect. 10.4, the experimental fact that neutrinos have small masses
suggests that they are likely to be Majorana fermions (and thus their own antiparti-
cles), although this expectation is based partly on theoretical prejudice and it is also
quite possible that they may be Dirac. In the minimal supersymmetric extension of
the Standard Model, there are new fermions called neutralinos and the gluino, which
are predicted to be Majorana fermions.

Problems

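1. Prove that the following statements are true, where the $N_i$ are certain integers that
you will determine.
(a) $\gamma^\mu \gamma_\nu \gamma_\mu = N_1 \gamma_\nu$.
(b) $\gamma^\mu \gamma_\nu \gamma_\rho \gamma_\mu = N_2 g_{\nu\rho}$.
(c) $\gamma^\mu \gamma_\nu \gamma_\rho \gamma_\sigma \gamma_\mu = N_3 \gamma_\sigma \gamma_\rho \gamma_\nu$.
(d) $\mathrm{Tr}(\gamma_\mu \gamma_\nu \gamma_\rho \gamma_\sigma) = N_4 (g_{\mu\nu} g_{\rho\sigma} - g_{\mu\rho} g_{\nu\sigma} + g_{\mu\sigma} g_{\nu\rho})$.
(e) $[\gamma_\rho, [\gamma_\mu, \gamma_\nu]] = N_5 (g_{\rho\mu} \gamma_\nu - g_{\rho\nu} \gamma_\mu)$.

2. Prove each of the following statements, where the numbers $N_i$ are constants that
you are to determine:
(a) $\slashed{p}\slashed{k}\slashed{p} = N_1 p^2 \slashed{k} + N_2 (k \cdot p)\slashed{p}$
(b) $\mathrm{Tr}[\slashed{p}\slashed{k}\slashed{p}\slashed{k}] = N_3 p^2 k^2 + N_4 (p \cdot k)^2$
(c) $\mathrm{Tr}[\slashed{p}\slashed{k}\slashed{q}\slashed{p}] = N_5 (p \cdot k)(q \cdot p) + N_6 p^2 (q \cdot k)$

3. By taking the Hermitian conjugates of the Dirac equations $(\slashed{p} - m)u(p, s) = 0$
and $(\slashed{p} + m)v(p, s) = 0$, show that:
$\bar{u}(p, s)(\slashed{p} - m) = 0$, and
$\bar{v}(p, s)(\slashed{p} + m) = 0$.

4. (a) Show that $(\bar{u}_1 \slashed{p} u_2)^* = \bar{u}_2 \slashed{p} u_1$.
(b) Determine the analogous expression for $(\bar{u}_1 \slashed{p}(1 - \gamma_5)u_2)^*$.

5. Simplify the following expressions so that each result involves at most one
projection matrix:
(a) $P_L \slashed{k} P_L$
(b) $P_R \slashed{k} P_L$
(c) $P_L \slashed{p}\slashed{k} P_L$
(d) $P_R \slashed{p}\slashed{k} P_L$
(e) $P_L \slashed{a}\slashed{b}\slashed{c}\slashed{d}\slashed{e}\slashed{k}\slashed{p}\slashed{q}\slashed{r} P_L$
(f) $P_L \slashed{a}\slashed{b}\slashed{c}\slashed{d}\slashed{e}\slashed{k}\slashed{p}\slashed{q}\slashed{r}\slashed{s} P_L$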

6. In this problem, we will check the Lorentz invariance of the Dirac equation, and in
the process determine the Lorentz transformation rule for Dirac spinors. Suppose
that two coordinate systems are related by a Lorentz transformation

$$x'^\mu = L^\mu{}_\nu x^\nu. \tag{3.134}$$

The wavefunction $\Psi'(x')$ as reported by an observer in the primed frame should
be related to that in the unprimed frame by

$$\Psi'(x') = \Lambda \Psi(x) \tag{3.135}$$

where $\Lambda$ is a 4 × 4 matrix. Now, the Dirac equation in the unprimed frame is

$$\left(i\gamma^\mu \frac{\partial}{\partial x^\mu} - m\right)\Psi(x) = 0, \tag{3.136}$$

and in the primed frame it is

$$\left(i\gamma^\mu \frac{\partial}{\partial x'^\mu} - m\right)\Psi'(x') = 0. \tag{3.137}$$

(a) Show that these equations are consistent provided that

$$\Lambda^{-1} \gamma^\rho \Lambda\, L_\rho{}^\mu = \gamma^\mu. \tag{3.138}$$

(b) Now suppose that $L^\mu{}_\nu = \delta^\mu_\nu + \omega^\mu{}_\nu$ with $\omega^\mu{}_\nu$ infinitesimal. Prove that the
equation found in part (a) is satisfied if

$$\Lambda = 1 + \frac{1}{8}\omega^{\mu\nu}[\gamma_\mu, \gamma_\nu] \tag{3.139}$$

7. Consider a non-infinitesimal Lorentz transformation realized on contravariant
vectors and spinors according to $a'^\mu = L^\mu{}_\nu a^\nu$ and $\psi'(x') = \Lambda\psi(x)$. Find $L^\mu{}_\nu$
and $\Lambda$ for:
(a) a boost of rapidity $\rho$ in the $x$ direction.
(b) a rotation of angle $\theta$ around the $z$ axis.
(c) a parity transformation $P$. (Hint: parity exchanges left-handed and right-handed
spinors, and $P^2 = 1$.)
4 Field Theory and Lagrangians

4.1 The Field Concept and Lagrangian Dynamics

It is now time to make a conceptual break from our earlier treatment of relativistic
quantum-mechanical wave equations for scalar and Dirac particles. There are two
reasons for doing this. First, the existence of negative energy solutions has led us
to the concept of antiparticles. Now, a hole in the Dirac sea, representing a positron,
can be removed if the state is occupied by a positive energy electron, releasing an
energy of at least $2m$. This forces us to admit that the total number of particles is
not conserved. The Klein-Gordon wavefunction $\phi(x)$ and the Dirac wavefunction
$\Psi(x)$ were designed to describe single-particle probability amplitudes, but the correct
theory of nature evidently must describe a variable number of particles. Secondly,
we note that in the electromagnetic theory, $A^\mu(x)$ are not just quantum mechanical
wavefunctions; they exist classically too. If we follow this example with scalar and
spinor particles, we are led to abandon $\phi(x)$ and $\Psi(x)$ as quantum wavefunctions
representing single-particle states, and reinterpret them as fields that have meaning
even classically.
Specifically, a scalar particle is described by a field φ(x). Classically, this just
means that for every point x μ , the object φ(x) returns a number. Quantum mechan-
ically, φ(x) becomes an operator (rather than a state or wavefunction). There is a
distinct operator for each x μ . Therefore, we no longer have a position operator x μ ;
instead, it is just an ordinary number label that tells us which operator φ we are
talking about.
If φ(x) is now an operator, what states will it act on? To answer this, we can start
with a vacuum state

$$|0\rangle \tag{4.1}$$

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 51

that describes an empty universe with no particles in it. If we now act with our field
operator, we obtain a state:

$$\phi(x)|0\rangle, \tag{4.2}$$

which, at the time t = x 0 , contains one particle at x. (What this state describes at
other times is a much more complicated question!) If we act again with our field
operator at a different point y μ , we get a state

$$\phi(y)\phi(x)|0\rangle, \tag{4.3}$$

which in general can be a linear combination of states containing any number of

particles. [The operator φ(y) can either add another particle, or remove the particle
added to the vacuum state by φ(x). But, in addition, the particles can interact to
change their number.] In general, the field operator φ(x) acts on any state by adding
or subtracting particles. So this is the right framework to describe the quantum
mechanics of a system with a variable number of particles.
Similarly, to describe spin-1/2 particles and their antiparticles, like the electron
and the positron, or quarks and antiquarks, we will want to use a Dirac field $\Psi(x)$.
Classically, $\Psi(x)$ is a set of four functions (one for each of the Dirac spinor components).
Quantum mechanically, $\Psi(x)$ is a set of four operators. We can build up any
state we want, containing any number of electrons and positrons, by acting on the
vacuum state $|0\rangle$ enough times with the fields $\Psi(x)$ and $\bar{\Psi}(x) = \Psi^\dagger \gamma^0$.
The vector field Aμ (x) associated with electromagnetism is already very familiar
in the classical theory, as the electrical and vector potentials. In the quantum theory
of electromagnetism, Aμ (x) becomes an operator which can add or subtract photons
from the vacuum.
This way of dealing with theories of multi-particle physics is called field theory.
In order to describe how particles in a field theory evolve and interact, we need to
specify a single object called the action.
Let us first review how the action principle works in a simple, non-field-theory
setting. Let the variables qn (t) describe the configuration of a physical system. (For
example, a single q(t) could describe the displacement of a harmonic oscillator from
equilibrium, as a function of time.) Here n is just a label distinguishing the different
configuration variables. Classically, we could specify the equations of motion for the
qn , which we could get from knowing what forces were acting on it. A clever way
of summarizing this information is to specify the action

$$S = \int_{t_i}^{t_f} L(q_n, \dot{q}_n)\, dt. \tag{4.4}$$

Here ti and t f are fixed initial and final times, and L is the Lagrangian. It is given in
simple systems by

L = T − V, (4.5)

where T is the total kinetic energy and V is the total potential energy. Thus the action
S is a functional of qn ; if you specify a particular trajectory qn (t), then the action
returns a single number. The Lagrangian is a function of qn (t) and its first derivative.
The usefulness of the action is given by Hamilton’s principle, which states that if
qn (ti ) and qn (t f ) are held fixed as boundary conditions, then S is extremized (in simple cases, minimized) when
qn (t) satisfy the equations of motion. Since S is at an extremum, this means that any
small variation in qn (t) will lead to no change in S, so that qn (t) → qn (t) + δqn (t)
implies δS = 0, provided that qn (t) obeys the equations of motion. Here δqn (t) is
any small function of t that vanishes at both ti and t f , as shown in the figure below:

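[Figure: a path q(t) and a varied path q(t) + δq(t), plotted against t between t1 and t2; both pass through the same endpoints q(t1) and q(t2).]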

Let us therefore compute δS. First, note that by the chain rule, we have:

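$$\delta L = \sum_n \left[\frac{\partial L}{\partial q_n}\delta q_n + \frac{\partial L}{\partial \dot{q}_n}\delta \dot{q}_n\right]. \tag{4.6}$$

Now, since

$$\delta \dot{q}_n = \frac{d}{dt}(\delta q_n), \tag{4.7}$$

we obtain

$$\delta S = \sum_n \int_{t_i}^{t_f} \left[\frac{\partial L}{\partial q_n}\delta q_n + \frac{\partial L}{\partial \dot{q}_n}\frac{d}{dt}(\delta q_n)\right] dt. \tag{4.8}$$

Now integrating by parts yields:

$$\delta S = \sum_n \int_{t_i}^{t_f} \delta q_n \left[\frac{\partial L}{\partial q_n} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_n}\right)\right] dt + \sum_n \delta q_n \frac{\partial L}{\partial \dot{q}_n}\bigg|_{t=t_i}^{t=t_f}. \tag{4.9}$$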
i

The last term vanishes because of the boundary conditions δqn (ti ) = δqn (t f ) = 0.
Since the variation δS is supposed to vanish for any δqn (t), it must be that

$$\frac{\partial L}{\partial q_n} - \frac{d}{dt}\left(\frac{\partial L}{\partial \dot{q}_n}\right) = 0, \tag{4.10}$$

which is the equation of motion for qn (t), for each n.

As a simple example, suppose there is only one qn (t) = x(t), the position of some
particle moving in one dimension in a potential V (x). Then

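$$L = T - V = \frac{1}{2}m\dot{x}^2 - V(x), \tag{4.11}$$

from which there follows:

$$\frac{\partial L}{\partial x} = -\frac{\partial V}{\partial x} = F, \tag{4.12}$$

which we recognize as the Newtonian force, and

$$\frac{d}{dt}\left(\frac{\partial L}{\partial \dot{x}}\right) = \frac{d}{dt}(m\dot{x}) = m\ddot{x}. \tag{4.13}$$

So the equation of motion is Newton’s second law:

$$F = m\ddot{x}. \tag{4.14}$$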
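
Hamilton's principle can also be seen at work numerically. The sketch below is an illustration under chosen assumptions, not part of the original derivation: it takes a harmonic oscillator with V(x) = kx²/2 and m = k = 1, discretizes the action (4.4) with a simple midpoint rule, and compares the action of the classical path x(t) = sin t with that of paths varied by ±ε sin(πt/T), which vanish at both endpoints. The time interval T = 1 is short enough that the extremum is a minimum.

```python
import math

def action(path, dt, m=1.0, k=1.0):
    """Discretized S = sum of (T - V) dt for L = m xdot^2 / 2 - k x^2 / 2."""
    S = 0.0
    for x0, x1 in zip(path, path[1:]):
        velocity = (x1 - x0) / dt
        midpoint = 0.5 * (x0 + x1)
        S += (0.5 * m * velocity**2 - 0.5 * k * midpoint**2) * dt
    return S

N, T = 1000, 1.0
dt = T / N
times = [i * dt for i in range(N + 1)]

def varied_path(eps):
    # Classical solution sin(t) plus a variation that vanishes at t = 0 and t = T.
    return [math.sin(t) + eps * math.sin(math.pi * t / T) for t in times]

S0 = action(varied_path(0.0), dt)
S_plus = action(varied_path(0.1), dt)
S_minus = action(varied_path(-0.1), dt)

# First-order change cancels between +eps and -eps; what remains is second order.
first_order = 0.5 * (S_plus - S_minus)
second_order = 0.5 * (S_plus + S_minus) - S0
```

Here `first_order` is zero up to discretization error, which is the statement δS = 0, while `second_order` is positive and close to ε²(π² − 1)/4, the value obtained by integrating the quadratic term by hand.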

Everything we need to know about the dynamics of a physical system is encoded

in the action S, or equivalently the Lagrangian L. In quantum field theory, it will
tell us what the particle masses are, how they interact, how they decay, and what
symmetries provide selection rules on their behavior.
As a first example of an action for a relativistic field theory, consider a scalar field
φ(x) = φ(t, x). Then, in the previous discussion, we identify the label n with the spa-
tial position x, and qn (t) = φ(t, x). The action is obtained by summing contributions
from each x:

$$S = \int_{t_i}^{t_f} dt\, L(\phi, \dot{\phi}) = \int_{t_i}^{t_f} dt \int d^3\vec{x}\; \mathcal{L}(\phi, \dot{\phi}). \tag{4.15}$$

Now, since this expression depends on φ̇, it must also depend on ∇φ in order to be
Lorentz invariant. This just means that the form of the Lagrangian allows it to depend
on the differences between the field evaluated at infinitesimally nearby points. So, a
better way to write the action is:

$$S = \int d^4x\; \mathcal{L}(\phi, \partial_\mu \phi). \tag{4.16}$$

The object L is known as the Lagrangian density. Specifying a particular form for
L defines the theory.
To find the classical equations of motion for the field, we must find φ(x) so that
S is extremized; in other words, for any small variation φ(x) → φ(x) + δφ(x), we
must have δS = 0. By a similar argument as above, this implies the equations of
motion:

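$$\frac{\delta \mathcal{L}}{\delta \phi} - \partial_\mu \left(\frac{\delta \mathcal{L}}{\delta(\partial_\mu \phi)}\right) = 0. \tag{4.17}$$

Here, we use $\frac{\delta \mathcal{L}}{\delta \phi}$ to mean the partial derivative of $\mathcal{L}$ with respect to $\phi$; $\delta$ is used rather
than $\partial$ to avoid confusion between derivatives with respect to $\phi$ and spacetime partial
derivatives with respect to $x^\mu$. Likewise, $\frac{\delta \mathcal{L}}{\delta(\partial_\mu \phi)}$ means a partial derivative of $\mathcal{L}$ with
respect to the object $\partial_\mu \phi$, upon which it depends.
As an example, consider the choice:

$$\mathcal{L} = \frac{1}{2}\partial_\mu \phi\, \partial^\mu \phi - \frac{1}{2}m^2 \phi^2. \tag{4.18}$$

It follows that:

$$\frac{\delta \mathcal{L}}{\delta \phi} = -m^2 \phi \tag{4.19}$$

and

$$\frac{\delta \mathcal{L}}{\delta(\partial_\mu \phi)} = \frac{\delta}{\delta(\partial_\mu \phi)}\left[\frac{1}{2}g^{\alpha\beta}\partial_\alpha \phi\, \partial_\beta \phi\right] = \frac{1}{2}g^{\mu\beta}\partial_\beta \phi + \frac{1}{2}g^{\alpha\mu}\partial_\alpha \phi = \partial^\mu \phi. \tag{4.20}$$

Therefore,

$$\partial_\mu \left(\frac{\delta \mathcal{L}}{\delta(\partial_\mu \phi)}\right) = \partial_\mu \partial^\mu \phi, \tag{4.21}$$

and so the equation of motion following from (4.17) is:

$$\partial_\mu \partial^\mu \phi + m^2 \phi = 0. \tag{4.22}$$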

This we recognize as the Klein-Gordon wave equation for a scalar particle of mass
m; compare to (3.13). This equation was originally introduced with the interpretation
as the equation governing the quantum wavefunction of a single scalar particle. Now
it has reappeared with a totally different interpretation, as the classical equation of
motion for the scalar field.
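
A plane wave gives a quick numerical illustration of (4.22). The sketch below is a hedged check, with illustrative names and parameter values: it works in 1+1 dimensions with metric signature (+,−), so that ∂_μ∂^μ = ∂²/∂t² − ∂²/∂x², and estimates the derivatives by central finite differences. The field cos(Et − kx) should satisfy the equation only when E² = k² + m².

```python
import math

def kg_residual(phi, t, x, m, h=1e-3):
    """Finite-difference estimate of (d^2/dt^2 - d^2/dx^2 + m^2) phi at (t, x)."""
    d2t = (phi(t + h, x) - 2 * phi(t, x) + phi(t - h, x)) / h**2
    d2x = (phi(t, x + h) - 2 * phi(t, x) + phi(t, x - h)) / h**2
    return d2t - d2x + m**2 * phi(t, x)

m, k = 1.5, 2.0
E_on_shell = math.sqrt(k**2 + m**2)

def plane_wave(E):
    # Classical field configuration cos(E t - k x) with wave number k fixed above.
    return lambda t, x: math.cos(E * t - k * x)

on_shell = kg_residual(plane_wave(E_on_shell), 0.3, 0.7, m)
off_shell = kg_residual(plane_wave(E_on_shell + 0.5), 0.3, 0.7, m)
```

The residual `on_shell` is zero up to finite-difference error, while `off_shell` is of order one, which is the dispersion relation E = √(k² + m²) emerging from the classical field equation.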
The previous discussion for scalar fields can be extended to other types of fields as
well. Consider a general list of fields $\Phi_j(x)$, which could include scalar fields $\phi(x)$,
Dirac or Majorana fields $\Psi(x)$ with four components, Weyl fields $\psi(x)$ with two
components, or vector fields $A^\mu(x)$ with four components, or several copies of any

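of these. The Lagrangian density $\mathcal{L}(\Phi_j, \partial_\mu \Phi_j)$ determines the classical equations
of motion through the principle that $S = \int d^4x\, \mathcal{L}$ should be stationary to first order
when a deviation $\Phi_j(x) \to \Phi_j(x) + \delta\Phi_j(x)$ is made, with boundary conditions that
$\delta\Phi_j(x)$ vanishes on the boundary of the spacetime region on which $S$ is evaluated.
(Typically one evaluates $S$ between two times $t_i$ and $t_f$, and over all of space, so
this means that $\delta\Phi(x)$ vanishes very far away from some region of interest.) The
Lagrangian density $\mathcal{L}$ should be a real quantity that transforms under proper Lorentz
transformations as a scalar.
Similarly to (4.6), we have by the chain rule:

$$\delta S = \int d^4x \sum_j \left[\frac{\delta \mathcal{L}}{\delta \Phi_j}\delta\Phi_j + \frac{\delta \mathcal{L}}{\delta(\partial_\mu \Phi_j)}\delta(\partial_\mu \Phi_j)\right]. \tag{4.23}$$

Now using $\delta(\partial_\mu \Phi_j) = \partial_\mu(\delta\Phi_j)$, and integrating the second term by parts (this is
where the boundary conditions come in), one obtains

$$\delta S = \int d^4x \sum_j \delta\Phi_j \left[\frac{\delta \mathcal{L}}{\delta \Phi_j} - \partial_\mu \left(\frac{\delta \mathcal{L}}{\delta(\partial_\mu \Phi_j)}\right)\right]. \tag{4.24}$$

If we require this to vanish for each and every arbitrary variation $\delta\Phi_j$, we obtain the
Euler-Lagrange equations of motion:

$$\frac{\delta \mathcal{L}}{\delta \Phi_j} - \partial_\mu \left(\frac{\delta \mathcal{L}}{\delta(\partial_\mu \Phi_j)}\right) = 0, \tag{4.25}$$

for each $j$.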
For example, let us consider how to make a Lagrangian density for a Dirac field
$\Psi(x)$. Under Lorentz transformations, $\Psi(x)$ transforms exactly like the wavefunction
solution to the Dirac equation. But now it is interpreted instead as a field; classically
it is a function on spacetime, and quantum mechanically it is an operator for each
point $x^\mu$. Now, $\Psi(x)$ is a complex 4-component object, so $\Psi^\dagger(x)$ is also a field.
One should actually treat $\Psi(x)$ and $\Psi^\dagger(x)$ as independent fields, in the same way
that in complex analysis one often treats $z = x + iy$ and $z^* = x - iy$ as independent
variables. As we found in Sect. 3.1, if we want to build Lorentz scalar quantities, it
is useful to use $\bar{\Psi}(x) = \Psi^\dagger \gamma^0$ as a building block.
A good (and correct) guess for the Lagrangian for a Dirac field is:

$$\mathcal{L} = i\bar{\Psi}\gamma^\mu \partial_\mu \Psi - m\bar{\Psi}\Psi, \tag{4.26}$$

which can also be written as:

$$\mathcal{L} = i\Psi^\dagger \gamma^0 \gamma^\mu \partial_\mu \Psi - m\Psi^\dagger \gamma^0 \Psi. \tag{4.27}$$

This is a Lorentz scalar, so that when integrated d 4 x it will give a Lorentz-invariant
number. Let us now compute the equations of motion that follow from it. First, let

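us find the equations of motion obtained by varying with respect to $\Psi^\dagger$. For this, we
need:

$$\frac{\delta \mathcal{L}}{\delta \Psi^\dagger} = i\gamma^0 \gamma^\mu \partial_\mu \Psi - m\gamma^0 \Psi, \tag{4.28}$$
$$\frac{\delta \mathcal{L}}{\delta(\partial_\mu \Psi^\dagger)} = 0. \tag{4.29}$$

The second equation just reflects the fact that the Lagrangian only contains the
derivative of $\Psi$, not the derivative of $\Psi^\dagger$. So, by plugging in to the general result
(4.25), the equation of motion is simply:

$$i\gamma^\mu \partial_\mu \Psi - m\Psi = 0, \tag{4.30}$$

where we have multiplied $\frac{\delta \mathcal{L}}{\delta \Psi^\dagger} = 0$ on the left by $\gamma^0$ and used the fact that $(\gamma^0)^2 = 1$.
This is, of course, the Dirac equation.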

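We can also find the equations of motion obtained from the Lagrangian by varying
with respect to $\Psi$. For that, we need:

$$\frac{\delta \mathcal{L}}{\delta \Psi} = -m\bar{\Psi}, \tag{4.31}$$
$$\frac{\delta \mathcal{L}}{\delta(\partial_\mu \Psi)} = i\bar{\Psi}\gamma^\mu. \tag{4.32}$$

Plugging these into (4.25), we obtain:

$$-i\partial_\mu \bar{\Psi}\gamma^\mu - m\bar{\Psi} = 0. \tag{4.33}$$

However, this is nothing new; it is just the Hermitian conjugate of (4.30), multiplied
on the right by $\gamma^0$ and using (3.34).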
The Lagrangian density for an electromagnetic field is:
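$$\mathcal{L}_{\rm EM} = -\frac{1}{4}F_{\mu\nu}F^{\mu\nu} = -\frac{1}{4}(\partial_\mu A_\nu - \partial_\nu A_\mu)(\partial^\mu A^\nu - \partial^\nu A^\mu). \tag{4.34}$$

To find the equations of motion that follow from this Lagrangian, we compute:

$$\frac{\delta \mathcal{L}_{\rm EM}}{\delta A_\nu} = 0, \tag{4.35}$$

since $A_\mu$ doesn’t appear in the Lagrangian without a derivative acting on it, and

$$\begin{aligned}
\frac{\delta \mathcal{L}_{\rm EM}}{\delta(\partial_\mu A_\nu)} &= -\frac{1}{4}\frac{\delta}{\delta(\partial_\mu A_\nu)}\left[(\partial_\alpha A_\beta - \partial_\beta A_\alpha)(\partial_\rho A_\sigma - \partial_\sigma A_\rho)g^{\alpha\rho}g^{\beta\sigma}\right] \\
&= -\frac{1}{4}\Big[(\partial_\rho A_\sigma - \partial_\sigma A_\rho)g^{\mu\rho}g^{\nu\sigma} - (\partial_\rho A_\sigma - \partial_\sigma A_\rho)g^{\nu\rho}g^{\mu\sigma} \\
&\qquad\quad + (\partial_\alpha A_\beta - \partial_\beta A_\alpha)g^{\mu\alpha}g^{\nu\beta} - (\partial_\alpha A_\beta - \partial_\beta A_\alpha)g^{\nu\alpha}g^{\mu\beta}\Big] \\
&= -\partial^\mu A^\nu + \partial^\nu A^\mu \\
&= -F^{\mu\nu}.
\end{aligned} \tag{4.36}$$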

So, the equations of motion,

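$$\frac{\delta \mathcal{L}_{\rm EM}}{\delta A_\nu} - \partial_\mu \left(\frac{\delta \mathcal{L}_{\rm EM}}{\delta(\partial_\mu A_\nu)}\right) = 0 \tag{4.37}$$

reduce to

$$\partial_\mu F^{\mu\nu} = 0, \tag{4.38}$$

which we recognize as Maxwell’s equations in the case of vanishing $J^\mu$ (see (2.84)).

In order to include the effects of a 4-current $J^\mu$, we can simply add a term

$$\mathcal{L}_{\rm current} = -eJ^\mu A_\mu \tag{4.39}$$

to the Lagrangian density. Then, since

$$\frac{\delta \mathcal{L}_{\rm current}}{\delta A_\nu} = -eJ^\nu; \tag{4.40}$$
$$\frac{\delta \mathcal{L}_{\rm current}}{\delta(\partial_\mu A_\nu)} = 0, \tag{4.41}$$

the classical equations of motion for the vector field become

$$\partial_\mu F^{\mu\nu} - eJ^\nu = 0, \tag{4.42}$$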

in agreement with (2.84). The current density J μ may be regarded as an external

source of unspecified origin for the electromagnetic field; or, it can be built out of
the fields for charged particles, as we will see.

4.2 Quantization of Free Scalar Field Theory

Let us now turn to the question of quantizing a field theory. To begin, let us recall how
one quantizes a simple generic system based on variables qn (t), given the Lagrangian
L(qn , q̇n ). First, one defines the canonical momenta conjugate to each qn :

$$p_n \equiv \frac{\partial L}{\partial \dot{q}_n}. \tag{4.43}$$

The classical Hamiltonian is then defined by

$$H(q_n, p_n) \equiv \sum_n p_n \dot{q}_n - L(q_n, \dot{q}_n), \tag{4.44}$$

where the q̇n are to be eliminated using (4.43). To go to the corresponding quantum
theory, the qn and pn and H are reinterpreted as Hermitian operators acting on

a Hilbert space of states. The operators obey canonical equal-time commutation

relations:

$$[p_n, p_m] = 0; \tag{4.45}$$
$$[q_n, q_m] = 0; \tag{4.46}$$
$$[p_n, q_m] = -i\delta_{nm}, \tag{4.47}$$

and the time evolution of the system is determined by the Hamiltonian operator H .
Let us now apply this to the theory of a scalar field φ(x) with the Lagrangian
density given by (4.18). Then φ(x) = φ(t, x) plays the role of qn (t), with x playing
the role of the label n; there is a different field at each point in space. The momentum
conjugate to φ is:

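$$\pi(\vec{x}) \equiv \frac{\delta \mathcal{L}}{\delta \dot{\phi}} = \frac{\delta}{\delta \dot{\phi}}\left[\frac{1}{2}\dot{\phi}^2 + \cdots\right] = \dot{\phi}. \tag{4.48}$$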

It should be emphasized that π(x), the momentum conjugate to the field φ(x), is not
in any way the mechanical momentum of the particle. Notice that π(x) is a scalar
function, not a three-vector or a four-vector!
The Hamiltonian is obtained by summing over the fields at each point x:

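$$H = \int d^3\vec{x}\; \pi(\vec{x})\dot{\phi}(\vec{x}) - L \tag{4.49}$$
$$= \int d^3\vec{x}\; \pi(\vec{x})\dot{\phi}(\vec{x}) - \frac{1}{2}\int d^3\vec{x}\; [\dot{\phi}^2 - (\vec{\nabla}\phi)^2 - m^2\phi^2] \tag{4.50}$$
$$= \frac{1}{2}\int d^3\vec{x}\; [\pi^2 + (\vec{\nabla}\phi)^2 + m^2\phi^2]. \tag{4.51}$$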

Notice that a nice feature has emerged: this Hamiltonian is a sum of squares, and is
therefore always ≥ 0. There are no dangerous solutions with arbitrarily large negative
energy, unlike the case of the single-particle Klein-Gordon wave equation.
At any given fixed time t, the field operators φ(x) and their conjugate momenta
π(x) are Hermitian operators satisfying commutation relations exactly analogous to
(4.45)–(4.47):

$$[\phi(\vec{x}), \phi(\vec{y})] = 0; \tag{4.52}$$
$$[\pi(\vec{x}), \pi(\vec{y})] = 0; \tag{4.53}$$
$$[\pi(\vec{x}), \phi(\vec{y})] = -i\delta^{(3)}(\vec{x} - \vec{y}). \tag{4.54}$$

As we will see, it turns out to be profitable to analyze the system in a way similar to the
way one treats the harmonic oscillator in one-dimensional nonrelativistic quantum
mechanics. In that system, one defines “creation” and “annihilation” (or “raising”

and “lowering”) operators a † and a as complex linear combinations of the position

and momentum operators x and p. Similarly, here we will define:

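$$a_{\vec{p}} = \int d^3\vec{x}\; e^{-i\vec{p}\cdot\vec{x}}\left[E_{\vec{p}}\,\phi(\vec{x}) + i\pi(\vec{x})\right], \tag{4.55}$$

where

$$E_{\vec{p}} = \sqrt{\vec{p}^{\,2} + m^2}, \tag{4.56}$$

with the positive square root always taken. The overall coefficient in front of (4.55)
reflects an arbitrary choice (and in fact it is chosen differently by various books).
Equation (4.55) defines a distinct annihilation operator for each three-momentum $\vec{p}$.
Taking the Hermitian conjugate yields:

$$a_{\vec{p}}^\dagger = \int d^3\vec{x}\; e^{i\vec{p}\cdot\vec{x}}\left[E_{\vec{p}}\,\phi(\vec{x}) - i\pi(\vec{x})\right]. \tag{4.57}$$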

To see the usefulness of these definitions, compute the commutator:

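$$[a_{\vec{p}}, a_{\vec{k}}^\dagger] = \int d^3\vec{x}\, d^3\vec{y}\; e^{-i\vec{p}\cdot\vec{x}} e^{i\vec{k}\cdot\vec{y}} \left[E_{\vec{p}}\,\phi(\vec{x}) + i\pi(\vec{x}),\; E_{\vec{k}}\,\phi(\vec{y}) - i\pi(\vec{y})\right] \tag{4.58}$$
$$= \int d^3\vec{x}\, d^3\vec{y}\; e^{-i\vec{p}\cdot\vec{x}} e^{i\vec{k}\cdot\vec{y}} (E_{\vec{k}} + E_{\vec{p}})\, \delta^{(3)}(\vec{x} - \vec{y}), \tag{4.59}$$

where (4.52)–(4.54) have been used. Now performing the $\vec{y}$ integral using the
definition of the delta function, one obtains:

$$[a_{\vec{p}}, a_{\vec{k}}^\dagger] = (E_{\vec{k}} + E_{\vec{p}}) \int d^3\vec{x}\; e^{i(\vec{k} - \vec{p})\cdot\vec{x}}. \tag{4.60}$$

This can be further reduced by using the important identity:

$$\int d^3\vec{x}\; e^{i\vec{q}\cdot\vec{x}} = (2\pi)^3 \delta^{(3)}(\vec{q}\,), \tag{4.61}$$

valid for any 3-vector $\vec{q}$, to obtain the final result:

$$[a_{\vec{p}}, a_{\vec{k}}^\dagger] = (2\pi)^3\, 2E_{\vec{p}}\, \delta^{(3)}(\vec{k} - \vec{p}). \tag{4.62}$$

Here we have put $E_{\vec{k}} = E_{\vec{p}}$, using the fact that the delta function vanishes except
when $\vec{k} = \vec{p}$. In a similar way, one can check the commutators:

$$[a_{\vec{p}}, a_{\vec{k}}] = [a_{\vec{p}}^\dagger, a_{\vec{k}}^\dagger] = 0. \tag{4.63}$$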

Up to a constant factor on the right-hand side of (4.62), these results have the
same form as the harmonic oscillator algebra familiar from non-relativistic quantum

mechanics: [a, a † ] = 1; [a, a] = [a † , a † ] = 0. Therefore, the quantum mechanics

of a scalar field behaves like an infinite collection of harmonic oscillators, one asso-
ciated with each three-momentum p.
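
For readers who want to see the single-oscillator algebra concretely, here is a small sketch that represents a and a† as matrices in a truncated number basis |0⟩, ..., |n_max⟩. The truncation and the helper names are assumptions made for illustration; in the truncated space the relation [a, a†] = 1 holds on every basis state except the highest one, which is an artifact of cutting off the infinite-dimensional space.

```python
import math

def lowering(n_max):
    """Matrix of a in the basis |0>, ..., |n_max>, using a|n> = sqrt(n)|n-1>."""
    dim = n_max + 1
    a = [[0.0] * dim for _ in range(dim)]
    for n in range(1, dim):
        a[n - 1][n] = math.sqrt(n)
    return a

def transpose(A):
    return [list(row) for row in zip(*A)]

def matmul(A, B):
    dim = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(dim)) for j in range(dim)]
            for i in range(dim)]

n_max = 5
a = lowering(n_max)
a_dag = transpose(a)  # the matrix is real, so the dagger is just the transpose

aad, ada = matmul(a, a_dag), matmul(a_dag, a)
commutator = [[aad[i][j] - ada[i][j] for j in range(n_max + 1)]
              for i in range(n_max + 1)]
number_diagonal = [ada[i][i] for i in range(n_max + 1)]  # eigenvalues of a† a
```

The diagonal of `commutator` is 1 for the first n_max states, and `number_diagonal` lists the occupation numbers 0, 1, 2, ..., which is the counting that carries over to each momentum mode of the field.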
Now we can understand the Hilbert space of states for the quantum field theory.
We start with the vacuum state $|0\rangle$, which is taken to be annihilated by all of the
lowering operators:

$$a_{\vec{k}}|0\rangle = 0. \tag{4.64}$$

Acting on the vacuum state with any raising operator produces

$$a_{\vec{k}}^\dagger|0\rangle = |\vec{k}\rangle, \tag{4.65}$$

which describes a state with a single particle with three-momentum $\vec{k}$ (and no definite
position). Acting multiple times with raising operators produces a state with multiple
particles. So

$$a_{\vec{k}_1}^\dagger a_{\vec{k}_2}^\dagger \cdots a_{\vec{k}_n}^\dagger|0\rangle = |\vec{k}_1, \vec{k}_2, \ldots, \vec{k}_n\rangle \tag{4.66}$$

describes a state of the universe with $n$ $\phi$-particles with momenta $\vec{k}_1, \vec{k}_2, \ldots, \vec{k}_n$.
Note that these states are automatically symmetric under interchange of any two
of the momentum labels $\vec{k}_i$, because of the identity $[a_{\vec{k}_i}^\dagger, a_{\vec{k}_j}^\dagger] = 0$. This is another
way of saying that the multiparticle states obey Bose-Einstein statistics for identical
particles with integer (in this case, 0) spin.
In order to check our interpretation of the quantum theory, let us now evaluate
the Hamiltonian operator in terms of the raising and lowering operators, and then
determine how H acts on the space of states. First, let us invert the definitions (4.55)
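and (4.57) to find $\phi(\vec{x})$ and $\pi(\vec{y})$ in terms of the $a_{\vec{p}}^\dagger$ and $a_{\vec{p}}$. We begin by noting that:

$$a_{\vec{p}} + a_{-\vec{p}}^\dagger = \int d^3\vec{x}\; e^{-i\vec{p}\cdot\vec{x}}\, 2E_{\vec{p}}\, \phi(\vec{x}). \tag{4.67}$$

Now we act on both sides by:

$$\int d\tilde{p}\; e^{i\vec{p}\cdot\vec{y}}, \tag{4.68}$$

where we have introduced the very convenient shorthand notation:

$$d\tilde{p} \equiv \frac{d^3\vec{p}}{(2\pi)^3\, 2E_{\vec{p}}}, \tag{4.69}$$

used often from now on. The result is that (4.67) becomes

$$\int d\tilde{p}\; e^{i\vec{p}\cdot\vec{y}}\,(a_{\vec{p}} + a_{-\vec{p}}^\dagger) = \int d^3\vec{x}\; \phi(\vec{x}) \left\{\int \frac{d^3\vec{p}}{(2\pi)^3}\, e^{i\vec{p}\cdot(\vec{y} - \vec{x})}\right\}. \tag{4.70}$$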

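Since the $\vec{p}$ integral in braces is equal to $\delta^{(3)}(\vec{y} - \vec{x})$ (see (4.61)), one obtains after
performing the $\vec{x}$ integral:

$$\phi(\vec{y}) = \int d\tilde{p}\; e^{i\vec{p}\cdot\vec{y}}\,(a_{\vec{p}} + a_{-\vec{p}}^\dagger), \tag{4.71}$$

or, renaming $\vec{y} \to \vec{x}$, and $\vec{p} \to -\vec{p}$ in the second term on the right:

$$\phi(\vec{x}) = \int d\tilde{p}\; \left(e^{i\vec{p}\cdot\vec{x}} a_{\vec{p}} + e^{-i\vec{p}\cdot\vec{x}} a_{\vec{p}}^\dagger\right). \tag{4.72}$$

This expresses the original field in terms of raising and lowering operators. Similarly,
for the conjugate momentum field, one finds:

$$\pi(\vec{x}) = -i\int d\tilde{p}\; E_{\vec{p}} \left(e^{i\vec{p}\cdot\vec{x}} a_{\vec{p}} - e^{-i\vec{p}\cdot\vec{x}} a_{\vec{p}}^\dagger\right). \tag{4.73}$$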

Now we can plug the results of (4.72) and (4.73) into the expression (4.51) for
the Hamiltonian. The needed terms are:

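$$\frac{1}{2}\int d^3\vec{x}\; \pi(\vec{x})^2 = \frac{1}{2}\int d^3\vec{x} \int d\tilde{k} \int d\tilde{p}\; (-i)^2 E_{\vec{k}} E_{\vec{p}} \left(e^{i\vec{k}\cdot\vec{x}} a_{\vec{k}} - e^{-i\vec{k}\cdot\vec{x}} a_{\vec{k}}^\dagger\right)\left(e^{i\vec{p}\cdot\vec{x}} a_{\vec{p}} - e^{-i\vec{p}\cdot\vec{x}} a_{\vec{p}}^\dagger\right), \tag{4.74}$$

$$\frac{1}{2}\int d^3\vec{x}\; (\vec{\nabla}\phi)^2 = \frac{1}{2}\int d^3\vec{x} \int d\tilde{k} \int d\tilde{p}\; \left(i\vec{k}\, e^{i\vec{k}\cdot\vec{x}} a_{\vec{k}} - i\vec{k}\, e^{-i\vec{k}\cdot\vec{x}} a_{\vec{k}}^\dagger\right) \cdot \left(i\vec{p}\, e^{i\vec{p}\cdot\vec{x}} a_{\vec{p}} - i\vec{p}\, e^{-i\vec{p}\cdot\vec{x}} a_{\vec{p}}^\dagger\right), \tag{4.75}$$

$$\frac{m^2}{2}\int d^3\vec{x}\; \phi(\vec{x})^2 = \frac{m^2}{2}\int d^3\vec{x} \int d\tilde{k} \int d\tilde{p}\; \left(e^{i\vec{k}\cdot\vec{x}} a_{\vec{k}} + e^{-i\vec{k}\cdot\vec{x}} a_{\vec{k}}^\dagger\right)\left(e^{i\vec{p}\cdot\vec{x}} a_{\vec{p}} + e^{-i\vec{p}\cdot\vec{x}} a_{\vec{p}}^\dagger\right). \tag{4.76}$$

Adding up the pieces, one finds:

$$\begin{aligned}
H = \frac{1}{2}\int d\tilde{k} \int d\tilde{p} \int d^3\vec{x}\; \Big\{ &(m^2 - \vec{k}\cdot\vec{p} - E_{\vec{k}} E_{\vec{p}}) \left[a_{\vec{k}} a_{\vec{p}}\, e^{i(\vec{k} + \vec{p})\cdot\vec{x}} + a_{\vec{k}}^\dagger a_{\vec{p}}^\dagger\, e^{-i(\vec{k} + \vec{p})\cdot\vec{x}}\right] \\
+ &(m^2 + \vec{k}\cdot\vec{p} + E_{\vec{k}} E_{\vec{p}}) \left[a_{\vec{k}}^\dagger a_{\vec{p}}\, e^{i(\vec{p} - \vec{k})\cdot\vec{x}} + a_{\vec{k}} a_{\vec{p}}^\dagger\, e^{i(\vec{k} - \vec{p})\cdot\vec{x}}\right] \Big\}.
\end{aligned} \tag{4.77}$$

Now one can do the $\vec{x}$ integration using:

$$\int d^3\vec{x}\; e^{\pm i(\vec{k} + \vec{p})\cdot\vec{x}} = (2\pi)^3 \delta^{(3)}(\vec{k} + \vec{p}\,), \tag{4.78}$$
$$\int d^3\vec{x}\; e^{\pm i(\vec{k} - \vec{p})\cdot\vec{x}} = (2\pi)^3 \delta^{(3)}(\vec{k} - \vec{p}\,). \tag{4.79}$$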

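As a result, the coefficient $(m^2 - \vec{k}\cdot\vec{p} - E_{\vec{k}} E_{\vec{p}})$ of the $aa$ and $a^\dagger a^\dagger$ terms vanishes
after plugging in $\vec{k} = -\vec{p}$ as enforced by the delta function. Meanwhile, for the $aa^\dagger$
and $a^\dagger a$ terms one has $\vec{k} = \vec{p}$ from the delta function, so that

$$(m^2 + \vec{k}\cdot\vec{p} + E_{\vec{k}} E_{\vec{p}})\, \delta^{(3)}(\vec{k} - \vec{p}) = 2E_{\vec{p}}^2\, \delta^{(3)}(\vec{k} - \vec{p}). \tag{4.80}$$

Now performing the $\vec{k}$ integral in (4.77), one finds:

$$H = \frac{1}{2}\int d\tilde{p}\; E_{\vec{p}} \left(a_{\vec{p}}^\dagger a_{\vec{p}} + a_{\vec{p}} a_{\vec{p}}^\dagger\right). \tag{4.81}$$

Finally, we can rearrange the second term, using

$$a_{\vec{p}} a_{\vec{p}}^\dagger = a_{\vec{p}}^\dagger a_{\vec{p}} + (2\pi)^3\, 2E_{\vec{p}}\, \delta^{(3)}(\vec{p} - \vec{p}) \tag{4.82}$$

from (4.62). The last term is infinite, so

$$H = \int d\tilde{p}\; E_{\vec{p}}\, a_{\vec{p}}^\dagger a_{\vec{p}} + \infty, \tag{4.83}$$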

where “∞” means an infinite, but constant, contribution to the energy. Since a uniform
constant contribution to the energy of all states is unobservable and commutes with
all other operators, we are free to drop it, by a redefinition of the Hamiltonian. This
is a simple example of the process known as renormalization. (In a more careful
treatment, one could “regulate” the theory by quantizing the theory confined to a
box of finite volume, and neglecting all contributions coming from momenta greater
than some very large cutoff |p|max . Then the infinite constant would be rendered
finite. Since we are going to ignore the constant anyway, we won’t bother doing
this.) So, from now on,

$$H = \int d\tilde{p}\; E_{\vec{p}}\, a_{\vec{p}}^\dagger a_{\vec{p}} \tag{4.84}$$

is the Hamiltonian operator.

Acting on the vacuum state,

$$H|0\rangle = 0, \tag{4.85}$$

since all $a_{\vec{p}}$ annihilate the vacuum. This shows that the infinite constant we dropped
from $H$ is actually the infinite energy density associated with an infinite universe of
empty space, filled with the zero-point energies of an infinite number of oscillators,
one for each possible momentum 3-vector $\vec{p}$. But we’ve already agreed to ignore it,
so let it go. One can show that:

$$[H, a_{\vec{k}}^\dagger] = E_{\vec{k}}\, a_{\vec{k}}^\dagger. \tag{4.86}$$


Acting with H on a one-particle state, we therefore obtain

H |k = H ak† |0 = [H , ak† ]|0 = E k ak† |0 = E k |k. (4.87)

This proves that the one-particle state with 3-momentum $\vec{k}$ has energy eigenvalue
$E_{\vec{k}} = \sqrt{\vec{k}^2 + m^2}$, as expected from special relativity. More generally, a multi-particle
state

$$|\vec{k}_1, \vec{k}_2, \ldots, \vec{k}_n\rangle = a^\dagger_{\vec{k}_1} a^\dagger_{\vec{k}_2} \cdots a^\dagger_{\vec{k}_n}|0\rangle \tag{4.88}$$

is easily shown to be an eigenstate of $H$ with eigenvalue $E_{\vec{k}_1} + E_{\vec{k}_2} + \cdots + E_{\vec{k}_n}$.

Note that it is not possible to construct a state with a negative energy eigenvalue!
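
As a small numerical illustration of these eigenvalues, the following sketch evaluates $E_{\vec{k}} = \sqrt{\vec{k}^2 + m^2}$ and the energy of a multi-particle state as a sum of single-particle energies. The function names and the sample mass and momenta are assumptions chosen only for illustration.

```python
import math

def energy(k, m):
    """Relativistic energy E_k = sqrt(|k|^2 + m^2) in units with c = hbar = 1."""
    kx, ky, kz = k
    return math.sqrt(kx * kx + ky * ky + kz * kz + m * m)

def total_energy(momenta, m):
    """Eigenvalue of H on |k_1, ..., k_n>: the sum of single-particle energies."""
    return sum(energy(k, m) for k in momenta)

# Illustrative values (assumed, in GeV): one particle at rest, one moving along z.
m = 0.5
momenta = [(0.0, 0.0, 0.0), (0.0, 0.0, 1.2)]
print(total_energy(momenta, m))  # close to 0.5 + 1.3 = 1.8
```

Each single-particle energy is at least $m$, so the total is never negative, in line with the remark above.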

4.3 Quantization of Free Dirac Fermion Field Theory

Let us now apply the wisdom obtained by the quantization of a scalar field in the
previous subsection to the problem of quantizing a Dirac fermion field that describes
electrons and positrons. A sensible strategy is to expand the fields $\Psi$ and $\Psi^\dagger$ in terms
of operators that act on states by creating and destroying particles with a given
3-momentum. Now, since $\Psi$ is a spinor with four components, one must expand it in a
basis for the four-dimensional spinor space. A convenient such basis is the solutions
we found to the Dirac equation, $u(p, s)$ and $v(p, s)$. So we expand the Dirac field,
at a given fixed time $t$, as:

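$$\Psi(\vec{x}) = \sum_{s=1}^{2} \int d\tilde{p}\, \left[ u(p,s)\, e^{i\vec{p}\cdot\vec{x}}\, b_{\vec{p},s} + v(p,s)\, e^{-i\vec{p}\cdot\vec{x}}\, d^\dagger_{\vec{p},s} \right]. \tag{4.89}$$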

Here $s$ labels the two possible spin states in some appropriate basis (for example,
$S_z = \pm 1/2$). The operator $b_{\vec{p},s}$ will be interpreted as an annihilation operator, which
removes an electron, with 3-momentum $\vec{p}$ and spin state $s$, from whatever state it
acts on. The operator $d^\dagger_{\vec{p},s}$ is a creation operator, which adds a positron to whatever
state it acts on. We are using $b, b^\dagger$ and $d, d^\dagger$ rather than $a, a^\dagger$ in order to distinguish
the fermion and antifermion creation and annihilation operators from the scalar field
versions. Taking the Hermitian conjugate of (4.89), and multiplying by $\gamma^0$ on the
right, we get:

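$$\overline{\Psi}(\vec{x}) = \sum_{s=1}^{2} \int d\tilde{p}\, \left[ \bar{u}(p,s)\, e^{-i\vec{p}\cdot\vec{x}}\, b^\dagger_{\vec{p},s} + \bar{v}(p,s)\, e^{i\vec{p}\cdot\vec{x}}\, d_{\vec{p},s} \right]. \tag{4.90}$$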

The operator $b^\dagger_{\vec{p},s}$ creates an electron, and $d_{\vec{p},s}$ destroys a positron, with the
corresponding 3-momentum and spin.
More generally, if the Dirac field describes some fermions other than the electron-positron
system, then you can substitute “particle” for electron and “antiparticle” for
positron. So $b^\dagger, b$ act on states to create and destroy particles, while $d^\dagger, d$ create and
destroy antiparticles.
Just as in the case of a scalar field, we assume the existence of a vacuum state $|0\rangle$,
which describes a universe of empty space with no electrons or positrons present.
The annihilation operators yield 0 when acting on the vacuum state:

$$b_{\vec{p},s}|0\rangle = d_{\vec{p},s}|0\rangle = 0 \tag{4.91}$$

for all $\vec{p}$ and $s$. To make a state describing a single electron with 3-momentum $\vec{p}$ and
spin state $s$, just act on the vacuum with the corresponding creation operator:

$$b^\dagger_{\vec{p},s}|0\rangle = |e^-_{\vec{p},s}\rangle. \tag{4.92}$$

Similarly,

$$b^\dagger_{\vec{k},r}\, b^\dagger_{\vec{p},s}|0\rangle = |e^-_{\vec{k},r};\, e^-_{\vec{p},s}\rangle \tag{4.93}$$

is a state containing two electrons, etc.

Now, electrons are fermions; they must obey Fermi-Dirac statistics for identical
particles. This means that if we interchange the momentum and spin labels for the
two electrons in the state of (4.93), we must get a minus sign:
$$|e^-_{\vec{k},r};\, e^-_{\vec{p},s}\rangle = -|e^-_{\vec{p},s};\, e^-_{\vec{k},r}\rangle. \tag{4.94}$$

A corollary of this is the Pauli exclusion principle, which states that two electrons
cannot be in exactly the same state. In the present case, that means that we cannot add
to the vacuum two electrons with exactly the same 3-momentum and spin. Taking
k = p and r = s in (4.94):
$$|e^-_{\vec{p},s};\, e^-_{\vec{p},s}\rangle = -|e^-_{\vec{p},s};\, e^-_{\vec{p},s}\rangle, \tag{4.95}$$

which can only be true if $|e^-_{\vec{p},s};\, e^-_{\vec{p},s}\rangle = 0$; in other words, there is no such state.
Writing (4.94) in terms of creation operators, we have:

$$b^\dagger_{\vec{k},r}\, b^\dagger_{\vec{p},s}|0\rangle = -b^\dagger_{\vec{p},s}\, b^\dagger_{\vec{k},r}|0\rangle. \tag{4.96}$$

So, instead of the commutation relation

$$[b^\dagger_{\vec{k},r},\, b^\dagger_{\vec{p},s}] = 0 \qquad \text{(Wrong!)} \tag{4.97}$$

that one might expect from comparison with the scalar field, we must have an anti-
commutation relation:
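$$b^\dagger_{\vec{k},r}\, b^\dagger_{\vec{p},s} + b^\dagger_{\vec{p},s}\, b^\dagger_{\vec{k},r} = \{b^\dagger_{\vec{k},r},\, b^\dagger_{\vec{p},s}\} = 0. \tag{4.98}$$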

Taking the Hermitian conjugate, we must also have:

{bk,r , bp,s } = 0. (4.99)

Similarly, applying the same thought process to identical positron fermions, one must
have:
$$\{d^\dagger_{\vec{k},r},\, d^\dagger_{\vec{p},s}\} = \{d_{\vec{k},r},\, d_{\vec{p},s}\} = 0. \tag{4.100}$$
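
The sign bookkeeping behind these anticommutators can be made concrete with a toy model. The sketch below is an assumption-laden illustration, not the continuum theory: it uses just two discrete modes (standing in for the labels $(\vec{k}, r)$ and $(\vec{p}, s)$), represents a state as a coefficient times a tuple of occupation numbers, and fixes a mode ordering to assign signs. The helper name `create` is hypothetical.

```python
def create(mode, state):
    """Apply a fermionic creation operator for `mode` to (coefficient, occupations).

    Returns None when the result vanishes because the mode is already occupied.
    """
    coeff, occ = state
    if occ[mode] == 1:
        return None  # Pauli exclusion: acting twice with the same operator gives 0
    # Sign from anticommuting past the occupied modes that precede `mode`.
    sign = (-1) ** sum(occ[:mode])
    new_occ = occ[:mode] + (1,) + occ[mode + 1:]
    return (coeff * sign, new_occ)

vacuum = (1, (0, 0))  # two discrete modes, both empty

state_kp = create(0, create(1, vacuum))  # mode 0 created after mode 1
state_pk = create(1, create(0, vacuum))  # mode 1 created after mode 0
print(state_kp, state_pk)                # same occupations, opposite coefficients
print(create(0, create(0, vacuum)))      # None: no state with a doubly occupied mode
```

The two orderings give the same occupation numbers with opposite overall sign, which is the discrete analogue of (4.96), and the doubly occupied state is absent, as in (4.95).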

Note that in the classical limit, $\hbar \to 0$, these equations are unaffected, since $\hbar$
doesn’t appear anywhere. So it must be true that $b$, $b^\dagger$, $d$, and $d^\dagger$ anticommute even
classically. So, as classical fields, one must have

$$\{\Psi(x), \Psi(y)\} = 0, \tag{4.101}$$
$$\{\Psi^\dagger(x), \Psi^\dagger(y)\} = 0, \tag{4.102}$$
$$\{\Psi(x), \Psi^\dagger(y)\} = 0. \tag{4.103}$$

Evidently, the classical Dirac field is not a normal number, but rather an anticommut-
ing or Grassmann number. Interchanging the order of any two Grassmann numbers
results in an overall minus sign.
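In order to discover how the classical equations (4.101) and (4.103) are modified
when one goes to the quantum theory, let us construct the momentum conjugate to
$\Psi(\vec{x})$. It is:

$$P(\vec{x}) = \frac{\delta \mathcal{L}}{\delta(\partial_0 \Psi)} = i\overline{\Psi}\gamma^0 = i\Psi^\dagger \gamma^0 \gamma^0 = i\Psi^\dagger. \tag{4.104}$$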

So the momentum conjugate to the Dirac spinor field is just i times its Hermitian
conjugate. Now, naively following the path of canonical quantization, one might
expect the equal-time commutation relation:

$$[P(\vec{x}), \Psi(\vec{y})] = -i\delta^{(3)}(\vec{x}-\vec{y}). \qquad \text{(Wrong!)} \tag{4.105}$$

However, this clearly cannot be correct, since these are anticommuting fields; in
the classical limit $\hbar \to 0$, (4.105) disagrees with (4.103). So, instead we postulate
a canonical anticommutation relation for the Dirac field operator and its conjugate
momentum operator:

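$$\{P(\vec{x}), \Psi(\vec{y})\} = -i\delta^{(3)}(\vec{x}-\vec{y}). \tag{4.106}$$

Now just rewriting $P = i\Psi^\dagger$, this becomes:

$$\{\Psi^\dagger(\vec{x}), \Psi(\vec{y})\} = -\delta^{(3)}(\vec{x}-\vec{y}). \tag{4.107}$$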


From this, using a strategy similar to that used for scalar fields, one can obtain:

$$\{b_{\vec{p},s},\, b^\dagger_{\vec{k},r}\} = \{d_{\vec{p},s},\, d^\dagger_{\vec{k},r}\} = (2\pi)^3\, 2E_{\vec{p}}\, \delta^{(3)}(\vec{p}-\vec{k})\, \delta_{sr}, \tag{4.108}$$

and all other anticommutators of b, b† , d, d † operators vanish.

One also can check that the Hamiltonian is
$$H = \sum_{s=1}^{2} \int d\tilde{p}\; E_{\vec{p}} \left( b^\dagger_{\vec{p},s} b_{\vec{p},s} + d^\dagger_{\vec{p},s} d_{\vec{p},s} \right) \tag{4.109}$$

in a way very similar to the way we found the Hamiltonian for a scalar field in terms
of a, a † operators. In doing so, one must again drop an infinite constant contribution
(negative, this time) which is unobservable because it is the same for all states. Note
that H again has energy eigenvalues that are ≥ 0. One can show that:

$$[H, b^\dagger_{\vec{k},s}] = E_{\vec{k}}\, b^\dagger_{\vec{k},s}, \tag{4.110}$$
$$[H, d^\dagger_{\vec{k},s}] = E_{\vec{k}}\, d^\dagger_{\vec{k},s}. \tag{4.111}$$

(Note that these equations are commutators rather than anticommutators!) It follows
that the eigenstates of energy and 3-momentum are given in general by:

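$$b^\dagger_{\vec{p}_1,s_1} \cdots b^\dagger_{\vec{p}_n,s_n}\, d^\dagger_{\vec{k}_1,r_1} \cdots d^\dagger_{\vec{k}_m,r_m}|0\rangle, \tag{4.112}$$

which describes a state with $n$ electrons (particles) and $m$ positrons (antiparticles)
with the obvious 3-momenta and spins, and total energy
$E_{\vec{p}_1} + \cdots + E_{\vec{p}_n} + E_{\vec{k}_1} + \cdots + E_{\vec{k}_m}$.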

4.4 Scalar Field with φ 4 Coupling

So far, we have been dealing with free field theories. These are theories in which the
Lagrangian density is quadratic in the fields, so that the Euler-Lagrange equations
obtained by varying L are linear wave equations in the fields, with exact solutions
that are not too hard to find. At the quantum level, this nice feature shows up in the
simple time evolution of the states. In field theory, as in any quantum system, the
time evolution of a state $|X\rangle$ is given in the Schrödinger picture by

$$i\frac{d}{dt}|X\rangle = H|X\rangle. \tag{4.113}$$

So, in the case of a multiparticle state with an energy eigenvalue $E$ as described
above, the solution is just

$$|X(t)\rangle = e^{-i(t-t_0)H}|X(t_0)\rangle = e^{-i(t-t_0)E}|X(t_0)\rangle. \tag{4.114}$$


In other words, the state at some time t is just the same as the state at some previous
time t0 , up to a phase. So nothing ever happens to the particles in a free theory; their
number does not change, and their momenta and spins remain the same.
We are interested in describing a more interesting situation where particles can
scatter off each other, perhaps inelastically to create new particles, and in which some
particles can decay into other sets of particles. To describe this, we need a Lagrangian
density that contains terms with more than two fields. At the classical level, this will
lead to non-linear equations of motion that have to be solved approximately. At the
quantum level, finding exact energy eigenstates of the Hamiltonian is not possible,
so one usually treats the non-quadratic part of the Hamiltonian as a perturbation on
the quadratic part, giving an approximate answer.
As an example, consider the free Lagrangian for a scalar field φ, as given in (4.18),
and add to it an interaction term:

$$\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_{\rm int}; \tag{4.115}$$
$$\mathcal{L}_{\rm int} = -\frac{\lambda}{24}\phi^4. \tag{4.116}$$
Here λ is a dimensionless number, a parameter of the theory known as a coupling.
It governs the strength of interactions; if we set λ = 0, we would be back to the free
theory in which nothing interesting ever happens. The factor of 1/4! = 1/24 is a
convention, and the reason for it will be apparent later. Now canonical quantization
can proceed as before, except that now the Hamiltonian is

H = H0 + Hint , (4.117)

where

$$H_{\rm int} = -\int d^3\vec{x}\; \mathcal{L}_{\rm int} = \frac{\lambda}{24}\int d^3\vec{x}\; \phi(\vec{x})^4. \tag{4.118}$$
Let us write this in terms of creation and annihilation operators, using (4.72):

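$$\begin{aligned}
H_{\rm int} = \frac{\lambda}{24}\int d^3\vec{x} \int d\tilde{q}_1 \int d\tilde{q}_2 \int d\tilde{q}_3 \int d\tilde{q}_4\;
& \left(a_{\vec{q}_1} e^{i\vec{q}_1\cdot\vec{x}} + a^\dagger_{\vec{q}_1} e^{-i\vec{q}_1\cdot\vec{x}}\right)
\left(a_{\vec{q}_2} e^{i\vec{q}_2\cdot\vec{x}} + a^\dagger_{\vec{q}_2} e^{-i\vec{q}_2\cdot\vec{x}}\right) \\
& \times \left(a_{\vec{q}_3} e^{i\vec{q}_3\cdot\vec{x}} + a^\dagger_{\vec{q}_3} e^{-i\vec{q}_3\cdot\vec{x}}\right)
\left(a_{\vec{q}_4} e^{i\vec{q}_4\cdot\vec{x}} + a^\dagger_{\vec{q}_4} e^{-i\vec{q}_4\cdot\vec{x}}\right).
\end{aligned} \tag{4.119}$$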

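Now we can perform the $d^3\vec{x}$ integration, using (4.61). The result is:

$$\begin{aligned}
H_{\rm int} = (2\pi)^3 \frac{\lambda}{24} \int d\tilde{q}_1 \int d\tilde{q}_2 \int d\tilde{q}_3 \int d\tilde{q}_4\, \Big[\,
& a^\dagger_{\vec{q}_1} a^\dagger_{\vec{q}_2} a^\dagger_{\vec{q}_3} a^\dagger_{\vec{q}_4}\, \delta^{(3)}(\vec{q}_1 + \vec{q}_2 + \vec{q}_3 + \vec{q}_4) \\
& + 4\, a^\dagger_{\vec{q}_1} a^\dagger_{\vec{q}_2} a^\dagger_{\vec{q}_3} a_{\vec{q}_4}\, \delta^{(3)}(\vec{q}_1 + \vec{q}_2 + \vec{q}_3 - \vec{q}_4) \\
& + 6\, a^\dagger_{\vec{q}_1} a^\dagger_{\vec{q}_2} a_{\vec{q}_3} a_{\vec{q}_4}\, \delta^{(3)}(\vec{q}_1 + \vec{q}_2 - \vec{q}_3 - \vec{q}_4) \\
& + 4\, a^\dagger_{\vec{q}_1} a_{\vec{q}_2} a_{\vec{q}_3} a_{\vec{q}_4}\, \delta^{(3)}(\vec{q}_1 - \vec{q}_2 - \vec{q}_3 - \vec{q}_4) \\
& + a_{\vec{q}_1} a_{\vec{q}_2} a_{\vec{q}_3} a_{\vec{q}_4}\, \delta^{(3)}(\vec{q}_1 + \vec{q}_2 + \vec{q}_3 + \vec{q}_4) \Big].
\end{aligned} \tag{4.120}$$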

Here we have combined several like terms, by relabeling the momenta, giving rise
to the factors of 4, 6, and 4. This involves reordering the $a$’s and $a^\dagger$’s. In doing so,
we have ignored the fact that $a$’s do not commute with $a^\dagger$’s when the 3-momenta
are exactly equal. This should not cause any worry, because it just corresponds to
the situation where a particle is “scattered” without changing its momentum at all,
which is the same as no scattering, and therefore not of interest.
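
The factors 1, 4, 6, 4, 1 are just the number of ways of choosing which of the four factors of $\phi$ contributes a creation operator. A brief sketch that enumerates the 16 terms of the expanded product and groups them by the number of creation operators (the labels `"a"` and `"adag"` are illustrative names only):

```python
from collections import Counter
from itertools import product

# Each of the four factors of phi contributes either an annihilation operator
# ("a") or a creation operator ("adag"); expand the product term by term.
terms = product(("a", "adag"), repeat=4)
counts = Counter(term.count("adag") for term in terms)

for n_dag in sorted(counts, reverse=True):
    print(n_dag, "creation operators:", counts[n_dag], "terms")
```

The counts are the binomial coefficients $\binom{4}{n}$, matching the coefficients that appear in (4.120) once like terms are combined.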
To see how to use the interaction Hamiltonian, it is useful to tackle a specific
process. For example, consider a scattering problem in which we have two scalar
particles with 4-momenta $p_a$, $p_b$ that interact, producing two scalar particles with
4-momenta $k_1$, $k_2$:

$$p_a\, p_b \to k_1\, k_2. \tag{4.121}$$

We will work in the Schrödinger picture of quantum mechanics, in which operators

are time-independent and states evolve in time according to (4.113). Nevertheless,
in the far past, we assume that the incoming particles were far apart, so the system
is accurately described by the free Hamiltonian and its energy eigenstates. The same
applies to the outgoing particles in the far future. So, we can pretend that Hint is
“turned off” in both the far past and the far future. The states that are simple two-
particle states are

$$|p_a, p_b\rangle_{\rm IN} = a^\dagger_{\vec{p}_a} a^\dagger_{\vec{p}_b}|0\rangle, \tag{4.122}$$

in the far past, and

$$|k_1, k_2\rangle_{\rm OUT} = a^\dagger_{\vec{k}_1} a^\dagger_{\vec{k}_2}|0\rangle, \tag{4.123}$$

in the far future. These are built out of creation and annihilation operators just
as before, so they are eigenstates of the free Hamiltonian H0 , but not of the full
Hamiltonian. Now, we are interested in computing the probability amplitude that the
state $|p_a, p_b\rangle_{\rm IN}$ evolves to the state $|k_1, k_2\rangle_{\rm OUT}$. According to the rules of quantum
mechanics this is given by their overlap at a common time, say in the far future:

$${}_{\rm OUT}\langle k_1, k_2 | p_a, p_b\rangle_{\rm OUT}. \tag{4.124}$$

The state $|p_a, p_b\rangle_{\rm OUT}$ is the time evolution of $|p_a, p_b\rangle_{\rm IN}$ from the far past to the far
future:

$$|p_a, p_b\rangle_{\rm OUT} = e^{-iTH}|p_a, p_b\rangle_{\rm IN}, \tag{4.125}$$

where T is the long time between the far past time when the initial state was created
and the far future time at which the overlap is computed. So we have:

$${}_{\rm OUT}\langle k_1, k_2 | p_a, p_b\rangle_{\rm OUT} = {}_{\rm OUT}\langle k_1, k_2 | e^{-iTH} | p_a, p_b\rangle_{\rm IN}. \tag{4.126}$$


The states appearing on the right-hand side are simple; see (4.122) and (4.123). The
complications are hidden in the operator $e^{-iTH}$.
In general, $e^{-iTH}$ cannot be written exactly in a useful way in terms of creation
and annihilation operators. However, we can do it perturbatively, order by order in
the coupling $\lambda$. For example, let us consider the contribution linear in $\lambda$. We use the
definition of the exponential to write:

$$e^{-iTH} = [1 - iHT/N]^N = [1 - i(H_0 + H_{\rm int})T/N]^N, \tag{4.127}$$

for N → ∞. Now, the part of this that is linear in Hint can be expanded as:

$$e^{-iTH} = \sum_{n=0}^{N-1} [1 - iH_0T/N]^{N-n-1}\, (-iH_{\rm int}T/N)\, [1 - iH_0T/N]^{n}. \tag{4.128}$$

(Here we have dropped the 0th order part, $e^{-iTH_0}$, as uninteresting; it just corresponds
to the particles evolving as free particles.) We can now turn this discrete sum into an
integral, by letting t = nT /N and dt = T /N in the limit of large N :

$$e^{-iTH} = -i\int_0^T dt\; e^{-i(T-t)H_0}\, H_{\rm int}\, e^{-itH_0}. \tag{4.129}$$

Next we can use the fact that we know what H0 is when acting on the simple states
of (4.122) and (4.123):

$$e^{-itH_0}|p_a, p_b\rangle_{\rm IN} = e^{-itE_i}|p_a, p_b\rangle_{\rm IN}, \tag{4.130}$$
$${}_{\rm OUT}\langle k_1, k_2|e^{-i(T-t)H_0} = {}_{\rm OUT}\langle k_1, k_2|e^{-i(T-t)E_f}, \tag{4.131}$$

where

$$E_i = E_{\vec{p}_a} + E_{\vec{p}_b}, \qquad E_f = E_{\vec{k}_1} + E_{\vec{k}_2} \tag{4.132}$$

are the energies of the initial and final states, respectively. So we have:
$${}_{\rm OUT}\langle k_1, k_2|e^{-iTH}|p_a, p_b\rangle_{\rm IN} = -i\int_0^T dt\; e^{-i(T-t)E_f}\, e^{-itE_i}\; {}_{\rm OUT}\langle k_1, k_2|H_{\rm int}|p_a, p_b\rangle_{\rm IN}. \tag{4.133}$$

First let us do the t integral:

$$\int_0^T dt\; e^{-i(T-t)E_f}\, e^{-itE_i} = e^{-i(E_f+E_i)T/2} \int_{-T/2}^{T/2} dt'\; e^{it'(E_f - E_i)} \tag{4.134}$$

where we have redefined the integration variable by $t = t' + T/2$. As we take $T \to \infty$,
we can use the integral identity

$$\int_{-\infty}^{\infty} dx\; e^{ixA} = 2\pi\, \delta(A) \tag{4.135}$$

to obtain:

$$\int_0^T dt\; e^{-i(T-t)E_f}\, e^{-itE_i} = 2\pi\, \delta(E_f - E_i)\; e^{-i(E_f+E_i)T/2}. \tag{4.136}$$
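
To see how the delta function emerges at finite $T$, one can compare a direct numerical integration with the closed form $e^{-i(E_f+E_i)T/2}\, 2\sin(\Delta E\, T/2)/\Delta E$, where $\Delta E = E_f - E_i$. The following is only a sketch; the function names, the midpoint rule, and the sample values of $T$ and the energies are assumptions made for illustration.

```python
import cmath
import math

def time_integral_numeric(E_f, E_i, T, steps=20000):
    """Midpoint-rule estimate of int_0^T dt exp(-i(T-t)E_f) exp(-i t E_i)."""
    dt = T / steps
    total = 0j
    for n in range(steps):
        t = (n + 0.5) * dt
        total += cmath.exp(-1j * (T - t) * E_f) * cmath.exp(-1j * t * E_i)
    return total * dt

def time_integral_closed(E_f, E_i, T):
    """Overall phase times 2 sin(dE T/2)/dE, which tends to T as dE -> 0."""
    dE = E_f - E_i
    phase = cmath.exp(-1j * (E_f + E_i) * T / 2)
    if dE == 0:
        return phase * T
    return phase * 2 * math.sin(dE * T / 2) / dE

T, E_i = 50.0, 1.0  # assumed sample values in units with hbar = 1
for E_f in (1.0, 1.05, 1.5):
    print(E_f, abs(time_integral_numeric(E_f, E_i, T)), abs(time_integral_closed(E_f, E_i, T)))
```

The magnitude equals $T$ when $E_f = E_i$ and stays bounded by $2/|\Delta E|$ otherwise, so as $T$ grows the result becomes sharply peaked at energy conservation.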

This tells us that energy conservation will be enforced, and (dropping the phase
factor $e^{-i(E_f+E_i)T/2}$, which will just give 1 when we take the complex square of the
probability amplitude):

$${}_{\rm OUT}\langle k_1, k_2|p_a, p_b\rangle_{\rm OUT} = -i\, 2\pi\, \delta(E_f - E_i)\; {}_{\rm OUT}\langle k_1, k_2|H_{\rm int}|p_a, p_b\rangle_{\rm IN}. \tag{4.137}$$

Now we are ready to use our expression for $H_{\rm int}$ in (4.120). The action of $H_{\rm int}$ on
eigenstates of the free Hamiltonian can be read off from the different types of terms.
The $aaaa$-type term will remove four particles from the state. Clearly we don’t have
to worry about that, because there were only two particles in the state to begin with!
The same goes for the $a^\dagger aaa$-type term. The terms of type $a^\dagger a^\dagger a^\dagger a^\dagger$ and $a^\dagger a^\dagger a^\dagger a$
create more than the two particles we know to be in the final state, so we can ignore
them too. Therefore, the only term that will play a role in this example is the $a^\dagger a^\dagger aa$
contribution:

$$H_{\rm int} = (2\pi)^3 \frac{\lambda}{4} \int d\tilde{q}_1 \int d\tilde{q}_2 \int d\tilde{q}_3 \int d\tilde{q}_4\; \delta^{(3)}(\vec{q}_1 + \vec{q}_2 - \vec{q}_3 - \vec{q}_4)\; a^\dagger_{\vec{q}_1} a^\dagger_{\vec{q}_2} a_{\vec{q}_3} a_{\vec{q}_4}. \tag{4.138}$$
Therefore, we have:

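$${}_{\rm OUT}\langle k_1, k_2|p_a, p_b\rangle_{\rm OUT} = -i(2\pi)^4 \frac{\lambda}{4} \int d\tilde{q}_1 \int d\tilde{q}_2 \int d\tilde{q}_3 \int d\tilde{q}_4\; \delta^{(4)}(q_1 + q_2 - q_3 - q_4)\; {}_{\rm OUT}\langle k_1, k_2|a^\dagger_{\vec{q}_1} a^\dagger_{\vec{q}_2} a_{\vec{q}_3} a_{\vec{q}_4}|p_a, p_b\rangle_{\rm IN}. \tag{4.139}$$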

Here we have combined the three-momenta delta function from (4.138) with the
energy delta function from (4.137) to give a 4-momentum delta function.
It remains to evaluate:

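$${}_{\rm OUT}\langle k_1, k_2|a^\dagger_{\vec{q}_1} a^\dagger_{\vec{q}_2} a_{\vec{q}_3} a_{\vec{q}_4}|p_a, p_b\rangle_{\rm IN}, \tag{4.140}$$

which, according to (4.122) and (4.123), is equal to

$$\langle 0|a_{\vec{k}_1} a_{\vec{k}_2}\, a^\dagger_{\vec{q}_1} a^\dagger_{\vec{q}_2} a_{\vec{q}_3} a_{\vec{q}_4}\, a^\dagger_{\vec{p}_a} a^\dagger_{\vec{p}_b}|0\rangle. \tag{4.141}$$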

This can be done using the commutation relations of (4.62) and (4.63). The strategy
is to commute $a_{\vec{q}_3}$ and $a_{\vec{q}_4}$ to the right, so they can give 0 when acting on $|0\rangle$, and
commute $a^\dagger_{\vec{q}_1}$ and $a^\dagger_{\vec{q}_2}$ to the left so they can give 0 when acting on $\langle 0|$. Along the
way, one picks up delta functions whenever the 3-momenta of an $a$ and $a^\dagger$ match.
One contribution occurs when $\vec{q}_3 = \vec{p}_a$ and $\vec{q}_4 = \vec{p}_b$ and $\vec{q}_1 = \vec{k}_1$ and $\vec{q}_2 = \vec{k}_2$. It
yields:

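$$(2\pi)^3\, 2E_{\vec{q}_3}\, \delta^{(3)}(\vec{q}_3 - \vec{p}_a) \cdot (2\pi)^3\, 2E_{\vec{q}_4}\, \delta^{(3)}(\vec{q}_4 - \vec{p}_b) \cdot (2\pi)^3\, 2E_{\vec{q}_1}\, \delta^{(3)}(\vec{q}_1 - \vec{k}_1) \cdot (2\pi)^3\, 2E_{\vec{q}_2}\, \delta^{(3)}(\vec{q}_2 - \vec{k}_2). \tag{4.142}$$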

There are 3 more similar terms. You can check that each of them gives a contribution
equal to (4.142) when put into (4.139), after relabeling momenta; this cancels the
factor of 1/4 in (4.139). Now, the factors of $(2\pi)^3\, 2E_{\vec{q}_3}$ etc. all neatly cancel against
the corresponding factors in the denominator of $d\tilde{q}_3$, etc. (See (4.69).) The
three-momentum delta functions then make the remaining $d^3\vec{q}_1$, $d^3\vec{q}_2$, $d^3\vec{q}_3$, and $d^3\vec{q}_4$
integrations trivial; they just set the four-vectors $q_3 = p_a$, $q_4 = p_b$, $q_1 = k_1$, and
$q_2 = k_2$ in the remaining 4-momentum delta function that was already present in
(4.139).
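
The counting of the four equal contributions can be checked by brute force. In this sketch (the momentum labels are plain strings used only for bookkeeping), each annihilation operator in $H_{\rm int}$ is matched with one incoming particle and each creation operator with one outgoing particle, and the number of distinct assignments is multiplied by the coefficient $6/24$ of the relevant term.

```python
from fractions import Fraction
from itertools import permutations

incoming = ("p_a", "p_b")  # momenta created by the initial-state creation operators
outgoing = ("k_1", "k_2")  # momenta removed by the final-state annihilation operators

# a_{q3}, a_{q4} must pair with the two incoming particles, and
# adag_{q1}, adag_{q2} with the two outgoing ones.
assignments = [
    {"q3": i[0], "q4": i[1], "q1": o[0], "q2": o[1]}
    for i in permutations(incoming)
    for o in permutations(outgoing)
]

prefactor = Fraction(6, 24)  # coefficient of the adag adag a a term in H_int
net_factor = prefactor * len(assignments)
print(len(assignments), net_factor)
```

There are $2! \times 2! = 4$ assignments, so the net numerical factor multiplying $\lambda$ is 1.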
Putting it all together, we are left with the remarkably simple result:

$${}_{\rm OUT}\langle k_1, k_2|p_a, p_b\rangle_{\rm OUT} = -i\lambda\, (2\pi)^4\, \delta^{(4)}(k_1 + k_2 - p_a - p_b). \tag{4.143}$$

Rather than go through this whole messy procedure every time we invent a new
interaction term for the Lagrangian density, or every time we think of a new scattering
process, one can instead summarize the procedure with a simple set of diagrammatic
rules. These rules, called Feynman rules, are useful both as a precise summary of
a matrix element calculation, and as a heuristic guide to what physical process the
calculation represents. In the present case, the Feynman diagram for the process is:

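[Feynman diagram: two lines entering from the left (initial state) and two lines leaving to the right (final state), meeting at a single four-point vertex.]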

Here the two lines coming from the left represent the incoming state scalar particles,
which get “destroyed” by the annihilation operators in Hint . The vertex where the
four lines meet represents the interaction itself, and is associated with the factor −iλ.
The two lines outgoing to the right represent the two final state scalar particles, which
are resurrected by the two creation operators in Hint .
This is just the simplest of many Feynman diagrams one could write down for
the process of two particle scattering in this theory. But all other diagrams represent
contributions that are higher order in λ, so if λ is small we can ignore them.

4.5 Scattering Processes and Cross-Sections

In Sect. 4.4, we found that the matrix element corresponding to 2 particle to 2 particle
scattering in a scalar field theory with interaction Lagrangian $-\frac{\lambda}{24}\phi^4$ is:

$${}_{\rm OUT}\langle k_1 k_2|p_a p_b\rangle_{\rm OUT} = -i\lambda\, (2\pi)^4\, \delta^{(4)}(k_1 + k_2 - p_a - p_b). \tag{4.144}$$

Now we would like to learn how to translate this information into something physi-
cally meaningful that could in principle be measured in an experiment. The matrix
element itself is infinite whenever 4-momentum is conserved, and zero otherwise.
So clearly we must do some work to relate it to an appropriate physically measurable
quantity, namely the cross-section.
The cross-section is the observable that gives the expected number of scattering
events N S that will occur if two large sets of particles are allowed to collide. Suppose
that we have Na particles of type a and Nb of type b, formed into large packets of
uniform density that move completely through each other as shown:

The two packets are assumed to have the same area A (shaded gray) perpendicular
to their motion. The total number of scattering events occurring while the packets
move through each other should be proportional to each of the numbers Na and Nb ,
and inversely proportional to the area A. The equation
$$N_S = \frac{N_a N_b}{A}\, \sigma \tag{4.145}$$

defines the cross-section $\sigma$. The rate at which the effective $N_a N_b/A$ is increasing with
time in an experiment is called the luminosity $L$ (or instantaneous luminosity), and
the same quantity integrated over time is called the integrated luminosity. Therefore,

$$N_S = \sigma \int L\, dt. \tag{4.146}$$

The dimensions for cross-section are the same as area, and the official unit is 1
barn = $10^{-24}$ cm$^2$ = 2568 GeV$^{-2}$. However, from the point of view of modern
high-energy experiments, a barn is a very large cross-section,¹ so more commonly-used
units are obtained by using the prefixes nano-, pico-, and femto-:

¹ The joke is that achieving an event with such a cross-section is “as easy as hitting the broad side
of a barn”.

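$$1\ {\rm nb} = 10^{-33}\ {\rm cm}^2 = 2.568 \times 10^{-6}\ {\rm GeV}^{-2}, \tag{4.147}$$
$$1\ {\rm pb} = 10^{-36}\ {\rm cm}^2 = 2.568 \times 10^{-9}\ {\rm GeV}^{-2}, \tag{4.148}$$
$$1\ {\rm fb} = 10^{-39}\ {\rm cm}^2 = 2.568 \times 10^{-12}\ {\rm GeV}^{-2}. \tag{4.149}$$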
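
These conversion factors follow from $\hbar c \approx 0.1973$ GeV fm together with 1 barn = 100 fm$^2$. A short sketch of the arithmetic, in which the constant and function names are illustrative choices:

```python
HBARC_GEV_FM = 0.1973269804  # hbar*c in GeV*fm
FM2_PER_BARN = 100.0         # 1 barn = 1e-24 cm^2 = 100 fm^2

# In natural units 1 GeV^-1 corresponds to hbar*c = 0.1973 fm,
# so 1 GeV^-2 corresponds to (hbar*c)^2 fm^2.
GEV2_INV_IN_BARN = HBARC_GEV_FM ** 2 / FM2_PER_BARN

def barn_to_inverse_gev2(sigma_barn):
    """Convert a cross-section in barns to GeV^-2 (units with hbar = c = 1)."""
    return sigma_barn / GEV2_INV_IN_BARN

for name, scale in [("barn", 1.0), ("nb", 1e-9), ("pb", 1e-12), ("fb", 1e-15)]:
    print(f"1 {name} = {barn_to_inverse_gev2(scale):.4g} GeV^-2")
```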

As an example, the Tevatron collided protons ($p$) and antiprotons ($\bar{p}$) with a
center-of-momentum (CM) energy of $E_{\rm CM} = 1960$ GeV and each experiment received a
total integrated luminosity of about

$$\int L\, dt = 12\ {\rm fb}^{-1} = 12{,}000\ {\rm pb}^{-1}, \tag{4.150}$$

although not all of that data is useable in any given analysis. The Large Hadron
Collider (LHC) at CERN is a $pp$ machine that previously collected 23.3 fb$^{-1}$ of
integrated luminosity per experiment (ATLAS and CMS) at center-of-mass energy
$\sqrt{s} = 8$ TeV. The LHC ran at $\sqrt{s} = 13$ TeV from 2015–2018, collecting about
160 fb$^{-1}$ of integrated luminosity per experiment. The peak luminosity attained was
$2 \times 10^{34}$ cm$^{-2}$ sec$^{-1}$, exceeding the LHC design target by a factor of two.
To figure out how many scattering events one expects at a collider, one needs to
know the corresponding cross-section for that type of event, which depends on the
final state. The total cross-section for any type of scattering at hadron colliders is
quite large. By one estimate at the Tevatron it was approximately

$$\sigma(p\bar{p} \to \text{anything}) = 0.075\ \text{barns}. \tag{4.151}$$

However, this estimate is quite fuzzy, because it depends on detection variables such
as the minimum momentum transfer that one requires in order to say that a scattering
event has occurred. For arbitrarily small momentum transfer in elastic scattering of
charged particles, the cross-section actually becomes arbitrarily large due to the long-
range nature of the Coulomb force, as we will see in section 5.2.4. Also, the vast
majority of the scattering events reflected in (4.151) are extremely uninteresting,
featuring final states of well-known and well-understood hadrons.
An example of a more interesting final state would be anything involving a top
quark ($t$) and anti-top quark ($\bar{t}$) pair, for which the Tevatron cross-section was about

$$\sigma(p\bar{p} \to t\bar{t} + \text{anything}) = 7.5\ {\rm pb}. \tag{4.152}$$

This means that about 90,000 top pairs (7.5 pb × 12,000 pb$^{-1}$) were produced at the Tevatron. However,
only a small fraction of these were identified as such. At the LHC, the cross-section
for producing top-antitop pairs is about

$$\sigma(pp \to t\bar{t} + \text{anything}) = (175,\ 250,\ 830)\ {\rm pb}, \tag{4.153}$$

for center-of-mass energies $\sqrt{s} = (7, 8, 13)$ TeV, respectively. This means that tens
of millions of $t\bar{t}$ pairs have already been produced, enabling unprecedented precision
studies of top quark properties. Those studies will continue as the integrated
luminosity increases.
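
A back-of-the-envelope version of $N_S = \sigma \int L\, dt$ for top-pair production, using the cross-sections and integrated luminosities quoted in this section, might look as follows. The function name and the pairing of each cross-section with a run are illustrative assumptions, and the result counts produced pairs per experiment, not identified ones.

```python
PB_PER_FB = 1000.0  # 1 fb^-1 = 1000 pb^-1

def expected_events(sigma_pb, integrated_lumi_fb):
    """N_S = sigma * integrated luminosity, with sigma in pb and luminosity in fb^-1."""
    return sigma_pb * integrated_lumi_fb * PB_PER_FB

runs = [
    ("Tevatron, 1.96 TeV", 7.5, 12.0),
    ("LHC, 8 TeV", 250.0, 23.3),
    ("LHC, 13 TeV", 830.0, 160.0),
]
for label, sigma_pb, lumi_fb in runs:
    print(f"{label}: about {expected_events(sigma_pb, lumi_fb):.3g} top pairs")
```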

Fig. 4.1 Production cross-section for various particle final states at LHC energies of 7, 8 and 13
TeV. ATLAS experiment (from “Quantum Chromodynamics”, RPP)

A plot summarizing the theoretical predictions and experimental measurements
of cross-sections for a selection of some of the more important processes involving
Standard Model particles is shown in Fig. 4.1 for LHC energies of 7, 8, and 13 TeV.
Besides the very large total and the inelastic total, also shown are the cross-sections
for producing ≥ 1 or ≥ 2 hadronic jets, a photon, a $W$ boson, a $Z$ boson, $t\bar{t}$ pairs,
a single top quark, combinations of vector bosons $WW$, $WZ$, $ZZ$, and $\gamma\gamma$, the
Higgs boson, and other even rarer processes as labeled. In some cases, the
cross-sections are given independently for sub-cases in which at least a certain number of
hadronic jets $n_j$ is required in the final state. Note that the ranges for these
cross-sections span many orders of magnitude. The plot shows both experimental results
and theoretical predictions for these cross-sections. It is both gratifying and reassuring
that the theoretical predictions of the Standard Model are in good agreement with
the experimental results.
In general, our task is to figure out how to relate the matrix element for a given
collision process to the corresponding cross-section. Let us consider a general sit-
uation in which two particles a and b with masses m a and m b and 4-momenta pa
and pb collide, producing n final-state particles with masses m i and 4-momenta ki ,
where i = 1, 2, . . . , n:

[Diagram: incoming particles $a$ and $b$ (initial state) colliding to produce $n$ outgoing particles (final state).]

As an abbreviation, we can call the initial state $|i\rangle = |p_a, m_a; p_b, m_b\rangle_{\rm IN}$ and the
final state $|f\rangle = |k_1, m_1; \ldots; k_n, m_n\rangle_{\rm OUT}$. In general, all of the particles could be
different, so that a different species of creation and annihilation operators might be
used for each. They can be either fermions or bosons, provided that the process
conserves angular momentum, charge, and color, and is consistent with other symmetries
of the Standard Model. If the particles are not scalars, then $|i\rangle$ and $|f\rangle$ should also
carry labels that specify the spin of each particle. Now, because of four-momentum
conservation, we can always write:

$$\langle f|i\rangle = \mathcal{M}_{i\to f}\, (2\pi)^4\, \delta^{(4)}\Big(p_a + p_b - \sum_{i=1}^{n} k_i\Big). \tag{4.154}$$

Here $\mathcal{M}_{i\to f}$ is called the reduced matrix element for the process. In the example
in Sect. 4.4, the reduced matrix element we found to first order in the coupling $\lambda$
was simply a constant: $\mathcal{M}_{\phi\phi\to\phi\phi} = -i\lambda$. However, in general, $\mathcal{M}$ can be a
non-trivial Lorentz-scalar function of the various 4-momenta and spin eigenvalues of the
particles in the problem. In practice, it is computed order-by-order in perturbation
theory, so it is only known approximately.
According to the postulates of quantum mechanics, the probability of a transition
from the state $|i\rangle$ to the state $|f\rangle$ is:

$$P_{i\to f} = \frac{|\langle f|i\rangle|^2}{\langle f|f\rangle \langle i|i\rangle}. \tag{4.155}$$
The matrix element has been divided by the norms of the states, which are not unity;
they will be computed below. Now, the total number of scattering events expected to
occur is:

$$N_S = N_a N_b \sum_f P_{i\to f}. \tag{4.156}$$

Here $N_a N_b$ represents the total number of initial states, one for each possible pair
of incoming particles. The sum $\sum_f$ is over all possible final states $f$. To evaluate
this, recall that if one puts particles in a large box of volume $V$, then the density of
one-particle states with 3-momentum $\vec{k}$ is

$$\text{density of states} = V \frac{d^3\vec{k}}{(2\pi)^3}. \tag{4.157}$$

So, including a sum over the n final-state particles implies

$$\sum_f \;\to\; \prod_{i=1}^{n} \int V \frac{d^3\vec{k}_i}{(2\pi)^3}. \tag{4.158}$$

Putting this into (4.156) and comparing with the definition (4.145), we have for the
differential contribution to the total cross-section:
$$d\sigma = P_{i\to f}\, A \prod_{i=1}^{n} V \frac{d^3\vec{k}_i}{(2\pi)^3}. \tag{4.159}$$

Let us now suppose that each packet of particles consists of a cylinder with a large
volume V . Then the total time T over which the particles can collide is given by the
time it takes for the two packets to move through each other:

$$T = \frac{V}{A\, |\vec{v}_a - \vec{v}_b|}. \tag{4.160}$$

(Assume that the volume V and area A of each bunch are very large compared
to the cube and square of the particles’ Compton wavelengths.) It follows that the
differential contribution to the cross-section is:
$$d\sigma = P_{i\to f}\, \frac{V}{T\, |\vec{v}_a - \vec{v}_b|} \prod_{i=1}^{n} V \frac{d^3\vec{k}_i}{(2\pi)^3}. \tag{4.161}$$

The total cross-section is obtained by integrating over 3-momenta of the final state
particles. Note that the differential cross-section dσ depends only on the collision
process being studied. So in the following we expect that the arbitrary volume V and
packet collision time T should cancel out.
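To see how that happens in the example of $\phi\phi \to \phi\phi$ scattering with a $\phi^4$
interaction, let us first compute the normalizations of the states appearing in $P_{i\to f}$. For
the initial state $|i\rangle$ of (4.122), one has:

$$\langle i|i\rangle = \langle 0|a_{\vec{p}_b} a_{\vec{p}_a} a^\dagger_{\vec{p}_a} a^\dagger_{\vec{p}_b}|0\rangle. \tag{4.162}$$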

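To compute this, one can commute the $a^\dagger$ operators to the left:

$$\langle i|i\rangle = \langle 0|a_{\vec{p}_b}\, [a_{\vec{p}_a}, a^\dagger_{\vec{p}_a}]\, a^\dagger_{\vec{p}_b}|0\rangle + \langle 0|a_{\vec{p}_b} a^\dagger_{\vec{p}_a} a_{\vec{p}_a} a^\dagger_{\vec{p}_b}|0\rangle. \tag{4.163}$$

Now the incoming particle momenta $\vec{p}_a$ and $\vec{p}_b$ are always different, so in the last
term the $a^\dagger_{\vec{p}_a}$ just commutes with $a_{\vec{p}_b}$ according to (4.62), yielding 0 when acting on
$\langle 0|$. The first term can be simplified using the commutator (4.62):

$$\langle i|i\rangle = \langle 0|a_{\vec{p}_b} a^\dagger_{\vec{p}_b}|0\rangle\, (2\pi)^3\, 2E_a\, \delta^{(3)}(\vec{p}_a - \vec{p}_a). \tag{4.164}$$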


Commuting $a^\dagger_{\vec{p}_b}$ to the left in the same way yields the norm of the state $|i\rangle$:

$$\langle i|i\rangle = (2\pi)^6\, 4E_a E_b\, \delta^{(3)}(\vec{p}_a - \vec{p}_a)\, \delta^{(3)}(\vec{p}_b - \vec{p}_b). \tag{4.165}$$

This result is doubly infinite, since the arguments of each delta function vanish! In
order to successfully interpret it, let us recall the origin of the 3-momentum delta
functions. One can write

$$(2\pi)^3\, \delta^{(3)}(\vec{p} - \vec{p}) = \int d^3\vec{x}\; e^{i\vec{0}\cdot\vec{x}} = V, \tag{4.166}$$

showing that a 3-momentum delta-function with vanishing argument corresponds to
$V/(2\pi)^3$, where $V$ is the volume that the particles occupy. So we obtain

$$\langle i|i\rangle = 4E_a E_b V^2 \tag{4.167}$$

for the norm of the incoming state.

Similarly, the norm of the state $|f\rangle$ is:

$$\langle f|f\rangle = \prod_{i=1}^{n} (2\pi)^3\, 2E_i\, \delta^{(3)}(\vec{k}_i - \vec{k}_i) = \prod_{i=1}^{n} (2E_i V). \tag{4.168}$$

In doing this, there is one subtlety; unlike the colliding particles, it could be that two
identical outgoing particles have exactly the same momentum. This seemingly could
produce “extra” contributions when we commute a † operators to the left. However,
at least for massive particles, one can usually ignore this, since the probability that
two outgoing particles will have exactly the same momentum is vanishingly small
in the limit V → ∞.
Next we turn to the square of the matrix element:

$$|\langle f|i\rangle|^2 = |\mathcal{M}_{i\to f}|^2 \left[ (2\pi)^4\, \delta^{(4)}\Big(p_a + p_b - \sum k_i\Big) \right]^2. \tag{4.169}$$

This also is apparently the square of an infinite quantity. To interpret it, we again
recall the origin of the delta functions:

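$$2\pi\, \delta(E_f - E_i) = \int_{-T/2}^{T/2} dt\; e^{it(E_f - E_i)} = T \qquad (\text{for } E_f = E_i), \tag{4.170}$$

$$(2\pi)^3\, \delta^{(3)}\Big(\sum \vec{k} - \sum \vec{p}\Big) = \int d^3\vec{x}\; e^{i\vec{x}\cdot(\sum \vec{k} - \sum \vec{p})} = V \qquad \Big(\text{for } \sum \vec{k} = \sum \vec{p}\Big). \tag{4.171}$$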

So we can write:

$$(2\pi)^4\, \delta^{(4)}\Big(p_a + p_b - \sum_{i=1}^{n} k_i\Big) = TV. \tag{4.172}$$

Now if we use this to replace one of the two 4-momentum delta functions in (4.169),
we have:

$$|\langle f|i\rangle|^2 = |\mathcal{M}_{i\to f}|^2\, (2\pi)^4\, TV\, \delta^{(4)}\Big(p_a + p_b - \sum_{i=1}^{n} k_i\Big). \tag{4.173}$$

Plugging the results of (4.167), (4.168) and (4.173) into (4.155), we obtain an
expression for the transition probability:

$$P_{i\to f} = |\mathcal{M}_{i\to f}|^2\, (2\pi)^4\, \delta^{(4)}\Big(p_a + p_b - \sum_{i=1}^{n} k_i\Big)\, \frac{T}{4E_a E_b V} \prod_{i=1}^{n} \frac{1}{2E_i V}. \tag{4.174}$$

Putting this in turn into (4.161), we finally obtain:

$$d\sigma = \frac{|\mathcal{M}_{i\to f}|^2}{4E_a E_b\, |\vec{v}_a - \vec{v}_b|}\; d\Phi_n, \tag{4.175}$$

where $d\Phi_n$ is a short-hand notation for

$$d\Phi_n = (2\pi)^4\, \delta^{(4)}\Big(p_a + p_b - \sum_{i=1}^{n} k_i\Big) \prod_{i=1}^{n} \frac{d^3\vec{k}_i}{(2\pi)^3\, 2E_i}, \tag{4.176}$$

and is known as the $n$-body differential Lorentz-invariant phase space. As expected,
all factors of $T$ and $V$ have canceled out of the formula (4.175) for the differential
cross-section.
In the formula (4.175), one can write for the velocities:
$$\vec{v}_a = \frac{\vec{p}_a}{E_a}, \qquad \vec{v}_b = \frac{\vec{p}_b}{E_b}. \tag{4.177}$$

Now, assuming that the collision is head-on so that $\vec{v}_b$ is opposite to $\vec{v}_a$ (or 0), the
denominator in (4.175) is:

$$4E_a E_b\, |\vec{v}_a - \vec{v}_b| = 4(E_a |\vec{p}_b| + E_b |\vec{p}_a|). \tag{4.178}$$
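
For collinear beams this flux factor can also be written in the manifestly Lorentz-invariant form $4\sqrt{(p_a \cdot p_b)^2 - m_a^2 m_b^2}$, a standard identity that is not derived in the text above. The sketch below checks the equality numerically; the function names and sample momenta are illustrative assumptions, with signed momenta along the $z$ axis.

```python
import math

def flux_factor_frame(p_a, p_b, m_a, m_b):
    """4 E_a E_b |v_a - v_b| for collinear momenta along z (signed p_a, p_b)."""
    E_a = math.hypot(p_a, m_a)
    E_b = math.hypot(p_b, m_b)
    return 4 * E_a * E_b * abs(p_a / E_a - p_b / E_b)

def flux_factor_invariant(p_a, p_b, m_a, m_b):
    """4 sqrt((p_a . p_b)^2 - m_a^2 m_b^2), built from the 4-vector product."""
    E_a = math.hypot(p_a, m_a)
    E_b = math.hypot(p_b, m_b)
    dot = E_a * E_b - p_a * p_b  # metric (+,-,-,-), both momenta along z
    return 4 * math.sqrt(dot * dot - (m_a * m_b) ** 2)

# Head-on collision and fixed-target kinematics (assumed values in GeV).
for p_a, p_b in [(3.0, -1.5), (3.0, 0.0)]:
    print(flux_factor_frame(p_a, p_b, 0.5, 1.0), flux_factor_invariant(p_a, p_b, 0.5, 1.0))
```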

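The most common case one encounters is two-particle scattering to a final state
with two particles. In the center-of-momentum frame, $\vec{p}_b = -\vec{p}_a$, so that the 2-body
Lorentz-invariant phase space becomes:

$$d\Phi_2 = (2\pi)^4\, \delta^{(3)}(\vec{k}_1 + \vec{k}_2)\, \delta(E_a + E_b - E_1 - E_2)\, \frac{d^3\vec{k}_1}{(2\pi)^3\, 2E_1}\, \frac{d^3\vec{k}_2}{(2\pi)^3\, 2E_2} \tag{4.179}$$

$$\phantom{d\Phi_2} = \frac{\delta^{(3)}(\vec{k}_1 + \vec{k}_2)\; \delta\Big(\sqrt{\vec{k}_1^2 + m_1^2} + \sqrt{\vec{k}_2^2 + m_2^2} - E_{\rm CM}\Big)}{16\pi^2\, \sqrt{\vec{k}_1^2 + m_1^2}\, \sqrt{\vec{k}_2^2 + m_2^2}}\; d^3\vec{k}_1\, d^3\vec{k}_2, \tag{4.180}$$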

where

$$E_{\rm CM} \equiv E_a + E_b \tag{4.181}$$

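is the center-of-momentum energy of the process. Now one can do the $\vec{k}_2$ integral;
the 3-momentum delta function just sets $\vec{k}_2 = -\vec{k}_1$ (as it must be in the CM frame).
If we define

$$K \equiv |\vec{k}_1| \tag{4.182}$$

for convenience, then

$$d^3\vec{k}_1 = K^2\, dK\, d\Omega = K^2\, dK\, d\phi\, d(\cos\theta), \tag{4.183}$$

where $\Omega = (\theta, \phi)$ are the spherical coordinate angles for $\vec{k}_1$. So

$$\int \ldots\, d\Phi_2 = \int \ldots\, \frac{\delta\Big(\sqrt{K^2 + m_1^2} + \sqrt{K^2 + m_2^2} - E_{\rm CM}\Big)}{16\pi^2\, \sqrt{K^2 + m_1^2}\, \sqrt{K^2 + m_2^2}}\; K^2\, dK\, d\phi\, d(\cos\theta), \tag{4.184}$$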

where $\ldots$ represents any quantity. To do the remaining $dK$ integral, it is convenient
to change variables to the argument of the delta function. So, defining

$$W = \sqrt{K^2 + m_1^2} + \sqrt{K^2 + m_2^2} - E_{\rm CM}, \tag{4.185}$$

we find

$$dW = \frac{\sqrt{K^2 + m_1^2} + \sqrt{K^2 + m_2^2}}{\sqrt{K^2 + m_1^2}\, \sqrt{K^2 + m_2^2}}\; K\, dK. \tag{4.186}$$

Noticing that the delta function predestines $\sqrt{K^2 + m_1^2} + \sqrt{K^2 + m_2^2}$ to be replaced
by $E_{\rm CM}$, we can write:

$$\frac{K^2\, dK}{\sqrt{K^2 + m_1^2}\, \sqrt{K^2 + m_2^2}} = \frac{K\, dW}{E_{\rm CM}}. \tag{4.187}$$

Using this in (4.184), and integrating $dW$ using the delta function $\delta(W)$, we obtain:

$$d\Phi_2 = \frac{K}{16\pi^2 E_{\rm CM}}\; d\phi\, d(\cos\theta) \tag{4.188}$$

for the Lorentz-invariant phase space of a two-particle final state.
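
The delta function in (4.184) fixes K once the CM energy and the final-state masses are given. As a minimal sketch, the Python functions below solve for K in closed form and integrate (4.188) over all angles. The function names are illustrative and not from the text, and the closed-form root is a standard kinematic identity quoted here without derivation.

```python
import math

def two_body_momentum(e_cm, m1, m2):
    """Return K = |k1| in the CM frame, i.e. the root of
    sqrt(K^2 + m1^2) + sqrt(K^2 + m2^2) = e_cm."""
    if e_cm < m1 + m2:
        raise ValueError("below threshold: e_cm < m1 + m2")
    s = e_cm ** 2
    # Closed-form root (assumed standard identity, not derived in the text).
    root_arg = (s - (m1 + m2) ** 2) * (s - (m1 - m2) ** 2)
    return math.sqrt(root_arg) / (2.0 * e_cm)

def two_body_phase_space(e_cm, m1, m2):
    """Integrate (4.188) over phi and cos(theta): K / (16 pi^2 E_CM) * 4 pi."""
    k = two_body_momentum(e_cm, m1, m2)
    return k / (4.0 * math.pi * e_cm)

# Example: check that the root really satisfies the energy constraint.
K = two_body_momentum(10.0, 1.0, 2.0)
energy_sum = math.sqrt(K ** 2 + 1.0 ** 2) + math.sqrt(K ** 2 + 2.0 ** 2)
```

In the massless limit the sketch gives K = E_CM/2, so the integrated phase space reduces to 1/(8π).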


Meanwhile, in the CM frame, (4.178) simplifies according to:

$$4E_aE_b|\mathbf{v}_a - \mathbf{v}_b| = 4\big(E_a|\mathbf{p}_b| + E_b|\mathbf{p}_a|\big) = 4|\mathbf{p}_a|(E_a + E_b) = 4|\mathbf{p}_a|E_{\rm CM}. \tag{4.189}$$

Therefore, using the results of (4.188) and (4.189) in (4.175), we have:

$$d\sigma = |\mathcal{M}_{i\to f}|^2\,\frac{|\mathbf{k}_1|}{64\pi^2 E_{\rm CM}^2\,|\mathbf{p}_a|}\,d\Omega \tag{4.190}$$

for the differential cross-section for two-particle scattering to two particles.

The matrix element is almost always symmetric under rotations about the collision axis determined by the incoming particle momenta. If so, then everything is independent of φ, and one can do $\int d\phi = 2\pi$, leaving:

$$d\sigma = |\mathcal{M}_{i\to f}|^2\,\frac{|\mathbf{k}_1|}{32\pi E_{\rm CM}^2\,|\mathbf{p}_a|}\,d(\cos\theta). \tag{4.191}$$

If the particle masses satisfy m a = m 1 and m b = m 2 (or they are very small), then
one has the further simplification |k1 | = |pa |, so that

$$d\sigma = |\mathcal{M}_{i\to f}|^2\,\frac{1}{32\pi E_{\rm CM}^2}\,d(\cos\theta). \tag{4.192}$$

The formulas (4.190)–(4.192) will be used often in the following.

We can finally interpret the meaning of the result (4.143) that we obtained for scalar $\phi^4$ theory. Since we found $\mathcal{M}_{\phi\phi\to\phi\phi} = -i\lambda$, the differential cross-section in the CM frame is:

$$d\sigma_{\phi\phi\to\phi\phi} = \frac{\lambda^2}{32\pi E_{\rm CM}^2}\,d(\cos\theta). \tag{4.193}$$
Now we can integrate over θ using $\int_{-1}^{1} d(\cos\theta) = 2$. However, there is a double-counting problem that we must take into account. The angles (θ, φ) that we have integrated over represent the direction of the 3-momentum of one of the final state particles. The other particle must then have 3-momentum in the opposite direction $(\pi - \theta, \phi + \pi)$. The two possible final states with $\mathbf{k}_1$ along those two opposite directions are therefore actually the same state, because the two particles are identical. So, we have actually counted each state twice when integrating over all $d\Omega$. To take this into account, we have to divide by 2, arriving at the result for the total cross-section:

$$\sigma_{\phi\phi\to\phi\phi} = \frac{\lambda^2}{32\pi E_{\rm CM}^2}. \tag{4.194}$$

In the system of units with $c = \hbar = 1$, energy has the same units as 1/distance. Since
λ is dimensionless, it checks that σ indeed has units of area. This is a very useful
thing to check whenever one has found a cross-section!
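
The unit check can be made concrete with a short numerical sketch. The functions below evaluate (4.194) in natural units and convert the result to millibarns. The function names and the rounded conversion constant (ħc)² ≈ 0.3894 mb GeV² are assumptions introduced for this illustration only.

```python
import math

# (hbar c)^2 in mb * GeV^2, rounded; an assumed value for illustration.
GEV2_TO_MB = 0.3894

def sigma_phi4(lam, e_cm):
    """Total cross-section (4.194) in natural units, i.e. 1/energy^2."""
    return lam ** 2 / (32.0 * math.pi * e_cm ** 2)

def sigma_phi4_mb(lam, e_cm_gev):
    """Same cross-section converted to millibarns, for e_cm given in GeV."""
    return sigma_phi4(lam, e_cm_gev) * GEV2_TO_MB

# Doubling the CM energy reduces the cross-section by a factor of 4.
ratio = sigma_phi4(0.5, 1.0) / sigma_phi4(0.5, 2.0)
```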

4.6 Scalar Field with φ³ Coupling

For our next example, let us consider a theory with a single scalar field as before,
but with an interaction Lagrangian that is cubic in the field:
$$\mathcal{L}_{\rm int} = -\frac{\mu}{6}\,\phi^3, \tag{4.195}$$

instead of (4.116). Here μ is a coupling that has the same dimensions as mass. As before, let us compute the matrix element for 2 particle to 2 particle scattering, φφ → φφ.
The definition and quantization of the free Hamiltonian proceeds exactly as before,
with equal time commutators given by (4.62) and (4.63), and the free Hamiltonian by
(4.84). The interaction part of the Hamiltonian can be obtained in exactly the same
way as the discussion leading up to (4.120), yielding:

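$$H_{\rm int} = \frac{\mu}{6}(2\pi)^3\int d\tilde{q}_1\,d\tilde{q}_2\,d\tilde{q}_3\,\Big[\,a^\dagger_{\mathbf{q}_1}a^\dagger_{\mathbf{q}_2}a^\dagger_{\mathbf{q}_3}\,\delta^{(3)}(\mathbf{q}_1 + \mathbf{q}_2 + \mathbf{q}_3) + 3\,a^\dagger_{\mathbf{q}_1}a^\dagger_{\mathbf{q}_2}a_{\mathbf{q}_3}\,\delta^{(3)}(\mathbf{q}_1 + \mathbf{q}_2 - \mathbf{q}_3)$$

$$\qquad + 3\,a^\dagger_{\mathbf{q}_1}a_{\mathbf{q}_2}a_{\mathbf{q}_3}\,\delta^{(3)}(\mathbf{q}_1 - \mathbf{q}_2 - \mathbf{q}_3) + a_{\mathbf{q}_1}a_{\mathbf{q}_2}a_{\mathbf{q}_3}\,\delta^{(3)}(\mathbf{q}_1 + \mathbf{q}_2 + \mathbf{q}_3)\Big]. \tag{4.196}$$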

As before, we want to calculate:

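$${}_{\rm OUT}\langle\mathbf{k}_1, \mathbf{k}_2|\mathbf{p}_a, \mathbf{p}_b\rangle_{\rm OUT} = {}_{\rm OUT}\langle\mathbf{k}_1, \mathbf{k}_2|\,e^{-iTH}\,|\mathbf{p}_a, \mathbf{p}_b\rangle_{\rm IN}$$

$$= \langle 0|\,a_{\mathbf{k}_1}a_{\mathbf{k}_2}\,e^{-iTH}\,a^\dagger_{\mathbf{p}_a}a^\dagger_{\mathbf{p}_b}\,|0\rangle, \tag{4.197}$$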

where $H = H_0 + H_{\rm int}$ is the total Hamiltonian. However, this time if we expand $e^{-iTH}$ only to first order in $H_{\rm int}$, the contribution is clearly zero, because the net number of particles created or destroyed by $H_{\rm int}$ is always odd. Therefore we must work to second order in $H_{\rm int}$, or equivalently in the coupling μ.
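The operator $e^{-iTH}$ can be written as (compare to (4.127) and the surrounding discussion):

$$e^{-iTH} = \left[1 - i(H_0 + H_{\rm int})\frac{T}{N}\right]^N, \tag{4.198}$$

in the large N limit. Keeping only terms that are of order $H_{\rm int}^2$, this becomes:

$$e^{-iTH} = \sum_{n=0}^{N-2}\;\sum_{m=0}^{N-n-2}\left[1 - iH_0\frac{T}{N}\right]^{N-n-m-2}\left(-iH_{\rm int}\frac{T}{N}\right)\left[1 - iH_0\frac{T}{N}\right]^{m}\left(-iH_{\rm int}\frac{T}{N}\right)\left[1 - iH_0\frac{T}{N}\right]^{n}. \tag{4.199}$$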

Now, in the large N limit, we can convert the discrete sums into integrals over the variables $t = Tn/N$ and $t' = Tm/N + t$, with $\Delta t = \Delta t' = T/N$. Since most of the contribution comes from large n, m when N → ∞, the result becomes:

$$e^{-iTH} = \int_0^T dt\int_t^T dt'\;e^{-iH_0(T - t')}\,(-iH_{\rm int})\,e^{-iH_0(t' - t)}\,(-iH_{\rm int})\,e^{-iH_0 t}. \tag{4.200}$$

When we sandwich this between the states $\langle\mathbf{k}_1, \mathbf{k}_2|$ and $|\mathbf{p}_a, \mathbf{p}_b\rangle$, we can substitute

$$e^{-iH_0(T - t')} \to e^{-iE_f(T - t')}; \tag{4.201}$$

$$e^{-iH_0 t} \to e^{-iE_i t}, \tag{4.202}$$

where $E_i = E_{\mathbf{p}_a} + E_{\mathbf{p}_b}$ is the initial state energy eigenvalue and $E_f = E_{\mathbf{k}_1} + E_{\mathbf{k}_2}$ is the final state energy eigenvalue. Now, $(-iH_{\rm int})|\mathbf{p}_a, \mathbf{p}_b\rangle$ will consist of a linear combination of eigenstates $|X\rangle$ of $H_0$ with different energies $E_X$. So we can also substitute

$$e^{-iH_0(t' - t)} \to e^{-iE_X(t' - t)} \tag{4.203}$$

provided that in place of E X we will later put in the appropriate energy eigenvalue
of the state created by each particular term in −i Hint acting on the initial state. So,
we have:

$$e^{-iTH} = (-iH_{\rm int})\,I\,(-iH_{\rm int}) \tag{4.204}$$

where

$$I = \int_0^T dt\int_t^T dt'\;e^{-iE_f(T - t')}\,e^{-iE_X(t' - t)}\,e^{-iE_i t}. \tag{4.205}$$

To evaluate the integral I, we can first define shifted integration variables $\bar{t} = t - T/2$ and $\bar{t}' = t' - T/2$, so that with due care to the limits of integration,

$$I = e^{-iT(E_i + E_f)/2}\int_{-T/2}^{T/2} d\bar{t}\;e^{i\bar{t}(E_X - E_i)}\int_{\bar{t}}^{T/2} d\bar{t}'\;e^{i\bar{t}'(E_f - E_X)}. \tag{4.206}$$

Now $e^{-iT(E_i + E_f)/2}$ is just a constant phase that will go away when we take the complex square of the matrix element, so we drop it. Then, relabeling $\bar{t} \to t$ and $\bar{t}' \to t'$, and taking the limit of a very long time T → ∞:

$$I = \int_{-\infty}^{\infty} dt\;e^{it(E_X - E_i)}\int_t^{\infty} dt'\;e^{it'(E_f - E_X + i\epsilon)}. \tag{4.207}$$

The $t'$ integral does not have the form of a delta function because its lower limit of integration is t. Therefore we have inserted an infinitesimal factor $e^{-\epsilon t'}$ so that the integral converges for $t' \to \infty$; we will take $\epsilon \to 0$ at the end. Performing the $t'$ integration, we get:

$$I = \int_{-\infty}^{\infty} dt\;e^{it(E_f - E_i)}\,\frac{i}{E_f - E_X + i\epsilon} \tag{4.208}$$

$$= 2\pi\,\delta(E_f - E_i)\,\frac{i}{E_f - E_X + i\epsilon}. \tag{4.209}$$
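
The inner integration can be checked numerically. The sketch below compares a trapezoid estimate of the damped integral over t′ with the closed form used to pass from (4.207) to (4.208). Here `a` stands for $E_f - E_X$ and `eps` is kept finite so that the integral converges on a finite grid; these names, the cutoff `t_max`, and the grid size are choices made for this illustration.

```python
import numpy as np

def tail_integral(t, a, eps, t_max=80.0, n=400_001):
    """Trapezoid estimate of the integral of exp(i t' (a + i eps)) from t to t_max."""
    tp = np.linspace(t, t_max, n)
    f = np.exp(1j * tp * (a + 1j * eps))
    dt = tp[1] - tp[0]
    # Composite trapezoid rule written out explicitly.
    return dt * (f.sum() - 0.5 * (f[0] + f[-1]))

def closed_form(t, a, eps):
    """Analytic result for t_max -> infinity: i exp(i t (a + i eps)) / (a + i eps)."""
    return 1j * np.exp(1j * t * (a + 1j * eps)) / (a + 1j * eps)

numeric = tail_integral(t=-1.0, a=1.3, eps=0.5)
exact = closed_form(t=-1.0, a=1.3, eps=0.5)
```

Multiplying the closed form by $e^{it(E_X - E_i)}$ and letting `eps` go to zero in the exponent reproduces the integrand of (4.208).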

As usual, energy conservation between the initial and final states is thus automatic.
Putting together the results above, we have so far:

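$${}_{\rm OUT}\langle\mathbf{k}_1, \mathbf{k}_2|\mathbf{p}_a, \mathbf{p}_b\rangle_{\rm OUT} = 2\pi\,\delta(E_f - E_i)\,\langle 0|\,a_{\mathbf{k}_1}a_{\mathbf{k}_2}\,(-iH_{\rm int})\,\frac{i}{E_f - E_X + i\epsilon}\,(-iH_{\rm int})\,a^\dagger_{\mathbf{p}_a}a^\dagger_{\mathbf{p}_b}\,|0\rangle. \tag{4.210}$$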

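Let us now evaluate the matrix element in (4.210). To do this, we can divide the calculation up into pieces, depending on how many $a$ and $a^\dagger$ operators are contained in each factor of $H_{\rm int}$. First, let us consider the contribution when the right $H_{\rm int}$ contains $a^\dagger aa$ terms acting on the initial state, and the left $H_{\rm int}$ contains $a^\dagger a^\dagger a$ terms. Taking these pieces from (4.196), the contribution from (4.210) is:

$${}_{\rm OUT}\langle\mathbf{k}_1, \mathbf{k}_2|\mathbf{p}_a, \mathbf{p}_b\rangle_{\rm OUT}\Big|_{(a^\dagger a^\dagger a)(a^\dagger aa)\ {\rm part}} = \left[-i\frac{\mu}{2}(2\pi)^3\right]^2 2\pi\,\delta(E_f - E_i)\int d\tilde{r}_1\,d\tilde{r}_2\,d\tilde{r}_3\int d\tilde{q}_1\,d\tilde{q}_2\,d\tilde{q}_3$$

$$\qquad \delta^{(3)}(\mathbf{r}_1 + \mathbf{r}_2 - \mathbf{r}_3)\,\delta^{(3)}(\mathbf{q}_1 - \mathbf{q}_2 - \mathbf{q}_3)\,\langle 0|\,a_{\mathbf{k}_1}a_{\mathbf{k}_2}\,a^\dagger_{\mathbf{r}_1}a^\dagger_{\mathbf{r}_2}a_{\mathbf{r}_3}\,\frac{i}{E_f - E_X}\,a^\dagger_{\mathbf{q}_1}a_{\mathbf{q}_2}a_{\mathbf{q}_3}\,a^\dagger_{\mathbf{p}_a}a^\dagger_{\mathbf{p}_b}\,|0\rangle. \tag{4.211}$$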

The factor involving $E_X$ is left inserted within the matrix element to remind us that $E_X$ should be replaced by the eigenvalue of the free Hamiltonian $H_0$ acting on the state to its right.
The last line in (4.211) can be calculated using the following general strategy. We commute $a$'s to the right and $a^\dagger$'s to the left, using (4.62) and (4.63). In doing so, we will get a non-zero contribution with a delta function whenever the 3-momentum of an $a$ equals that of an $a^\dagger$ with which it is commuted, removing that $a$, $a^\dagger$ pair. In the end, every $a$ must “contract” with some $a^\dagger$ in this way (and vice versa), because an $a$ acting on $|0\rangle$ or an $a^\dagger$ acting on $\langle 0|$ vanishes.
This allows us to identify what $E_X$ is. The $a_{\mathbf{q}_2}$ and $a_{\mathbf{q}_3}$ operators must be contracted with $a^\dagger_{\mathbf{p}_a}$ and $a^\dagger_{\mathbf{p}_b}$ if a non-zero result is to be obtained. There are two ways to do this: either pair up $[a_{\mathbf{q}_2}, a^\dagger_{\mathbf{p}_a}]$ and $[a_{\mathbf{q}_3}, a^\dagger_{\mathbf{p}_b}]$, or pair up $[a_{\mathbf{q}_2}, a^\dagger_{\mathbf{p}_b}]$ and $[a_{\mathbf{q}_3}, a^\dagger_{\mathbf{p}_a}]$. In both cases, the result can be non-zero only if $\mathbf{q}_2 + \mathbf{q}_3 = \mathbf{p}_a + \mathbf{p}_b$. The delta-function $\delta^{(3)}(\mathbf{q}_1 - \mathbf{q}_2 - \mathbf{q}_3)$ then insures that there will be a non-zero contribution only when

$$\mathbf{q}_1 = \mathbf{p}_a + \mathbf{p}_b \equiv \mathbf{Q}. \tag{4.212}$$

So the energy eigenvalue of the state $a^\dagger_{\mathbf{q}_1}a_{\mathbf{q}_2}a_{\mathbf{q}_3}a^\dagger_{\mathbf{p}_a}a^\dagger_{\mathbf{p}_b}|0\rangle$ must be replaced by

$$E_X = E_{\mathbf{Q}} = \sqrt{|\mathbf{p}_a + \mathbf{p}_b|^2 + m^2} \tag{4.213}$$

whenever there is a non-zero contribution.

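Evaluating the quantity

$$\langle 0|\,a_{\mathbf{k}_1}a_{\mathbf{k}_2}\,a^\dagger_{\mathbf{r}_1}a^\dagger_{\mathbf{r}_2}a_{\mathbf{r}_3}\,a^\dagger_{\mathbf{q}_1}a_{\mathbf{q}_2}a_{\mathbf{q}_3}\,a^\dagger_{\mathbf{p}_a}a^\dagger_{\mathbf{p}_b}\,|0\rangle \tag{4.214}$$

now yields four distinct non-zero terms, corresponding to the following ways of contracting $a$'s and $a^\dagger$'s:

$$[a_{\mathbf{k}_1}, a^\dagger_{\mathbf{r}_1}]\,[a_{\mathbf{k}_2}, a^\dagger_{\mathbf{r}_2}]\,[a_{\mathbf{q}_2}, a^\dagger_{\mathbf{p}_a}]\,[a_{\mathbf{q}_3}, a^\dagger_{\mathbf{p}_b}]\,[a_{\mathbf{r}_3}, a^\dagger_{\mathbf{q}_1}], \text{ or} \tag{4.215}$$
$$[a_{\mathbf{k}_1}, a^\dagger_{\mathbf{r}_2}]\,[a_{\mathbf{k}_2}, a^\dagger_{\mathbf{r}_1}]\,[a_{\mathbf{q}_2}, a^\dagger_{\mathbf{p}_a}]\,[a_{\mathbf{q}_3}, a^\dagger_{\mathbf{p}_b}]\,[a_{\mathbf{r}_3}, a^\dagger_{\mathbf{q}_1}], \text{ or} \tag{4.216}$$
$$[a_{\mathbf{k}_1}, a^\dagger_{\mathbf{r}_1}]\,[a_{\mathbf{k}_2}, a^\dagger_{\mathbf{r}_2}]\,[a_{\mathbf{q}_2}, a^\dagger_{\mathbf{p}_b}]\,[a_{\mathbf{q}_3}, a^\dagger_{\mathbf{p}_a}]\,[a_{\mathbf{r}_3}, a^\dagger_{\mathbf{q}_1}], \text{ or} \tag{4.217}$$
$$[a_{\mathbf{k}_1}, a^\dagger_{\mathbf{r}_2}]\,[a_{\mathbf{k}_2}, a^\dagger_{\mathbf{r}_1}]\,[a_{\mathbf{q}_2}, a^\dagger_{\mathbf{p}_b}]\,[a_{\mathbf{q}_3}, a^\dagger_{\mathbf{p}_a}]\,[a_{\mathbf{r}_3}, a^\dagger_{\mathbf{q}_1}]. \tag{4.218}$$

For the first of these contributions to (4.214), we get:

$$(2\pi)^3 2E_{\mathbf{r}_1}\delta^{(3)}(\mathbf{r}_1 - \mathbf{k}_1)\;(2\pi)^3 2E_{\mathbf{r}_2}\delta^{(3)}(\mathbf{r}_2 - \mathbf{k}_2)\;(2\pi)^3 2E_{\mathbf{q}_2}\delta^{(3)}(\mathbf{q}_2 - \mathbf{p}_a)\;(2\pi)^3 2E_{\mathbf{q}_3}\delta^{(3)}(\mathbf{q}_3 - \mathbf{p}_b)\;(2\pi)^3 2E_{\mathbf{r}_3}\delta^{(3)}(\mathbf{r}_3 - \mathbf{q}_1). \tag{4.219}$$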

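Now, the various factors of $(2\pi)^3 2E$ just cancel the factors in the denominators of the definition of $d\tilde{q}_i$ and $d\tilde{r}_i$. One can do the $\mathbf{q}_1$, $\mathbf{q}_2$, $\mathbf{q}_3$, $\mathbf{r}_1$, $\mathbf{r}_2$, and $\mathbf{r}_3$ integrations trivially, using the 3-momentum delta functions, resulting in the following contribution to (4.211):

$$\left(-i\frac{\mu}{2}\right)^2\frac{i}{E_f - E_{\mathbf{Q}}}\,\frac{1}{2E_{\mathbf{Q}}}\,(2\pi)^4\,\delta(E_f - E_i)\,\delta^{(3)}(\mathbf{k}_1 + \mathbf{k}_2 - \mathbf{p}_a - \mathbf{p}_b). \tag{4.220}$$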

The two delta functions can be combined into $\delta^{(4)}(k_1 + k_2 - p_a - p_b)$. Now, the other three sets of contractions listed in (4.216)–(4.218) are exactly the same, after a relabeling of momenta. This gives a factor of 4, so (replacing $E_f \to E_i$ in the denominator, as allowed by the delta function) we have:

$${}_{\rm OUT}\langle\mathbf{k}_1, \mathbf{k}_2|\mathbf{p}_a, \mathbf{p}_b\rangle_{\rm OUT}\Big|_{(a^\dagger a^\dagger a)(a^\dagger aa)\ {\rm part}} = (-i\mu)^2\,\frac{i}{(E_i - E_{\mathbf{Q}})(2E_{\mathbf{Q}})}\,(2\pi)^4\,\delta^{(4)}(k_1 + k_2 - p_a - p_b). \tag{4.221}$$

One can draw a simple picture illustrating what has happened in the preceding
formulas:

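[Figure: two lines from the initial state (left) meet at a vertex, a single line connects it to a second vertex, and two lines leave toward the final state (right).]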

The initial state contains two particles, denoted by the lines on the left. Acting with
the first factor of Hint (on the right in the formula, and represented by the vertex on the
left in the figure) destroys the two particles and creates a virtual particle in their place.
The second factor of Hint destroys the virtual particle and creates the two final state
particles, represented by the lines on the right. The three-momentum carried by the
intermediate virtual particle is Q = pa + pb = k1 + k2 , so momentum is conserved
at the two vertices.
However, there are other contributions that must be included. Another one occurs
if the Hint on the right in (4.210) contains a † a † a † operators, and the other Hint
contains aaa operators. The corresponding picture is this:

[Figure: diagram for this second contribution, with the initial state on the left and the final state on the right.]

Here the Hint carrying a † a † a † (the rightmost one in the formula) is represented by
the upper left vertex in the figure, and the one carrying aaa is represented by the
lower right vertex. The explicit formula corresponding to this picture is:

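$${}_{\rm OUT}\langle\mathbf{k}_1, \mathbf{k}_2|\mathbf{p}_a, \mathbf{p}_b\rangle_{\rm OUT}\Big|_{(aaa)(a^\dagger a^\dagger a^\dagger)\ {\rm part}} = \left[-i\frac{\mu}{6}(2\pi)^3\right]^2 2\pi\,\delta(E_f - E_i)\int d\tilde{r}_1\,d\tilde{r}_2\,d\tilde{r}_3\int d\tilde{q}_1\,d\tilde{q}_2\,d\tilde{q}_3$$

$$\qquad \delta^{(3)}(\mathbf{r}_1 + \mathbf{r}_2 + \mathbf{r}_3)\,\delta^{(3)}(\mathbf{q}_1 + \mathbf{q}_2 + \mathbf{q}_3)\,\langle 0|\,a_{\mathbf{k}_1}a_{\mathbf{k}_2}\,a_{\mathbf{r}_1}a_{\mathbf{r}_2}a_{\mathbf{r}_3}\,\frac{i}{E_f - E_X}\,a^\dagger_{\mathbf{q}_1}a^\dagger_{\mathbf{q}_2}a^\dagger_{\mathbf{q}_3}\,a^\dagger_{\mathbf{p}_a}a^\dagger_{\mathbf{p}_b}\,|0\rangle. \tag{4.222}$$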

As before, we can calculate this by commuting $a$'s to the right and $a^\dagger$'s to the left. In the end, non-zero contributions arise only when each $a$ is contracted with some $a^\dagger$. In doing so, we should ignore any terms that arise whenever a final state $a_{\mathbf{k}}$ is contracted with an initial state $a^\dagger_{\mathbf{p}}$. That would correspond to a situation with no scattering, since the initial state particle and the final state particle would be exactly the same.

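The result contains 36 distinct contributions, corresponding to the 6 ways of contracting $a_{\mathbf{k}_1}$ and $a_{\mathbf{k}_2}$ with any two of $a^\dagger_{\mathbf{q}_1}$, $a^\dagger_{\mathbf{q}_2}$, and $a^\dagger_{\mathbf{q}_3}$, times the 6 ways of contracting $a^\dagger_{\mathbf{p}_a}$ and $a^\dagger_{\mathbf{p}_b}$ with any two of $a_{\mathbf{r}_1}$, $a_{\mathbf{r}_2}$, $a_{\mathbf{r}_3}$. However, all 36 of these contributions are identical under relabeling of momentum, so we can just calculate one of them and multiply the answer by 36. This will neatly convert the factor of $(-i\mu/6)^2$ to $(-i\mu)^2$. We also note that $E_X$ must be replaced by the free Hamiltonian energy eigenvalue of the state $a^\dagger_{\mathbf{q}_1}a^\dagger_{\mathbf{q}_2}a^\dagger_{\mathbf{q}_3}a^\dagger_{\mathbf{p}_a}a^\dagger_{\mathbf{p}_b}|0\rangle$, namely:

$$E_X = E_{\mathbf{q}_1} + E_{\mathbf{q}_2} + E_{\mathbf{q}_3} + E_{\mathbf{p}_a} + E_{\mathbf{p}_b}. \tag{4.223}$$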

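For example, consider the term obtained from the following contractions of $a$'s and $a^\dagger$'s:

$$[a_{\mathbf{k}_1}, a^\dagger_{\mathbf{q}_1}]\,[a_{\mathbf{k}_2}, a^\dagger_{\mathbf{q}_2}]\,[a_{\mathbf{r}_1}, a^\dagger_{\mathbf{p}_a}]\,[a_{\mathbf{r}_2}, a^\dagger_{\mathbf{p}_b}]\,[a_{\mathbf{r}_3}, a^\dagger_{\mathbf{q}_3}]. \tag{4.224}$$

This leads to factors of $(2\pi)^3 2E$ and momentum delta functions just as before. So we can do the 3-momentum $\mathbf{q}_{1,2}$ and $\mathbf{r}_{1,2,3}$ integrals using the delta functions, in the process setting $\mathbf{q}_1 = \mathbf{k}_1$ and $\mathbf{q}_2 = \mathbf{k}_2$ and $\mathbf{r}_1 = \mathbf{p}_a$ and $\mathbf{r}_2 = \mathbf{p}_b$ and $\mathbf{r}_3 = \mathbf{q}_3$. Finally, we can do the $\mathbf{q}_3$ integral using one of the delta functions already present in (4.222), resulting in $\mathbf{q}_3 = -\mathbf{p}_a - \mathbf{p}_b = -\mathbf{k}_1 - \mathbf{k}_2 = -\mathbf{Q}$, with $\mathbf{Q}$ the same as was defined in (4.212). This allows us to identify in this case:

$$E_X = E_{\mathbf{k}_1} + E_{\mathbf{k}_2} + E_{\mathbf{p}_a} + E_{\mathbf{p}_b} + E_{\mathbf{Q}} = 2E_i + E_{\mathbf{Q}}. \tag{4.225}$$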

The end result is:

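$${}_{\rm OUT}\langle\mathbf{k}_1, \mathbf{k}_2|\mathbf{p}_a, \mathbf{p}_b\rangle_{\rm OUT}\Big|_{(aaa)(a^\dagger a^\dagger a^\dagger)\ {\rm part}} = (-i\mu)^2\,\frac{-i}{(E_i + E_{\mathbf{Q}})(2E_{\mathbf{Q}})}\,(2\pi)^4\,\delta^{(4)}(k_1 + k_2 - p_a - p_b). \tag{4.226}$$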

It is now profitable to combine the two contributions we have found. One hint that
this is a good idea is the fact that the two cartoon figures we have drawn for them are
topologically the same; the second one just has a line that moves backwards. So if
we just ignore the distinction between internal lines that move backwards and those
that move forwards, we can draw a single Feynman diagram to represent both results
combined:
[Figure: Feynman diagram representing both contributions combined.]

The initial state is on the left, and the final state is on the right, and the flow of 4-
momentum is indicated by the arrows, with 4-momentum conserved at each vertex.
The result of combining these two contributions is called the s-channel contribu-
tion, to distinguish it from still more contributions that we will get to soon. Using a
common denominator for (E i − E Q ) and (E i + E Q ), we get:

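$${}_{\rm OUT}\langle\mathbf{k}_1, \mathbf{k}_2|\mathbf{p}_a, \mathbf{p}_b\rangle_{\rm OUT}\Big|_{s\text{-channel}} = (-i\mu)^2\,\frac{i}{E_i^2 - E_{\mathbf{Q}}^2}\,(2\pi)^4\,\delta^{(4)}(k_1 + k_2 - p_a - p_b). \tag{4.227}$$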

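If we now consider the four-vector

$$p_a^\mu + p_b^\mu = \Big(\sqrt{|\mathbf{p}_a|^2 + m^2} + \sqrt{|\mathbf{p}_b|^2 + m^2},\; \mathbf{p}_a + \mathbf{p}_b\Big) = (E_i, \mathbf{Q}), \tag{4.228}$$

then we recognize that

$$(p_a + p_b)^2 = E_i^2 - |\mathbf{Q}|^2 = E_i^2 - E_{\mathbf{Q}}^2 + m^2. \tag{4.229}$$

(Note that $p_a^\mu + p_b^\mu$ is not equal to $(E_{\mathbf{Q}}, \mathbf{Q})$.) So we can rewrite the term

$$\frac{i}{E_i^2 - E_{\mathbf{Q}}^2} = \frac{i}{(p_a + p_b)^2 - m^2}. \tag{4.230}$$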

The final result is that the s-channel contribution to the matrix element is:

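$${}_{\rm OUT}\langle\mathbf{k}_1, \mathbf{k}_2|\mathbf{p}_a, \mathbf{p}_b\rangle_{\rm OUT}\Big|_{s\text{-channel}} = (-i\mu)^2\,\frac{i}{(p_a + p_b)^2 - m^2}\,(2\pi)^4\,\delta^{(4)}(k_1 + k_2 - p_a - p_b). \tag{4.231}$$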

Note that one could just as well have put (k1 + k2 )2 in place of ( pa + pb )2 in this
expression, because of the delta function.
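
The way the two time-ordered pieces combine into a single covariant propagator can be verified numerically. The sketch below adds the energy denominators of (4.221) and (4.226) for arbitrary 3-momenta and compares the sum with $1/[(p_a + p_b)^2 - m^2]$; the overall factor of i and the couplings are omitted, and the function name and sample momenta are illustrative only.

```python
import math

def s_channel_pieces(pa, pb, m):
    """pa, pb: 3-momenta as (x, y, z) tuples; m: scalar mass.
    Returns (sum of the two old-fashioned pieces, covariant form)."""
    energy = lambda p: math.sqrt(sum(c * c for c in p) + m * m)
    e_i = energy(pa) + energy(pb)
    q = tuple(a + b for a, b in zip(pa, pb))
    e_q = energy(q)
    forward = 1.0 / ((e_i - e_q) * 2.0 * e_q)    # denominator of (4.221)
    backward = -1.0 / ((e_i + e_q) * 2.0 * e_q)  # denominator of (4.226)
    s = e_i ** 2 - sum(c * c for c in q)         # (p_a + p_b)^2
    return forward + backward, 1.0 / (s - m * m)

combined, covariant = s_channel_pieces((0.3, -1.2, 2.0), (1.1, 0.4, -0.7), 0.5)
```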
Now one can go through the same whole process with contributions that come
from the rightmost Hint (acting first on the initial state) consisting of a † a † a terms,
and the leftmost Hint containing a † aa terms. One can draw Feynman diagrams that
represent these terms, which look like:

[Figure: t-channel (left) and u-channel (right) Feynman diagrams.]

These are referred to as the t-channel and u-channel contributions respectively. Here
we have combined all topologically-identical diagrams. This is a standard procedure
that is always followed; the diagrams we have drawn with dashed lines for the scalar
field are the Feynman diagrams for the process. [The solid-line diagrams appearing
between (4.221) and (4.222) above are sometimes known as “old-fashioned Feynman
diagrams”, but it is very rare to see them in the modern literature.]
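After much juggling of factors of $(2\pi)^3$ and doing 3-momentum integrals using delta functions, but using no new concepts, the contributions of the t-channel and u-channel Feynman diagrams can be found to be simply:

$${}_{\rm OUT}\langle\mathbf{k}_1, \mathbf{k}_2|\mathbf{p}_a, \mathbf{p}_b\rangle_{\rm OUT}\Big|_{t\text{-channel}} = (-i\mu)^2\,\frac{i}{(p_a - k_1)^2 - m^2}\,(2\pi)^4\,\delta^{(4)}(k_1 + k_2 - p_a - p_b), \tag{4.232}$$

and

$${}_{\rm OUT}\langle\mathbf{k}_1, \mathbf{k}_2|\mathbf{p}_a, \mathbf{p}_b\rangle_{\rm OUT}\Big|_{u\text{-channel}} = (-i\mu)^2\,\frac{i}{(p_a - k_2)^2 - m^2}\,(2\pi)^4\,\delta^{(4)}(k_1 + k_2 - p_a - p_b). \tag{4.233}$$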

The reduced matrix element can now be obtained by just stripping off the factors of $(2\pi)^4\delta^{(4)}(k_1 + k_2 - p_a - p_b)$, as demanded by the definition (4.154). So the total reduced matrix element, suitable for plugging into the formula for the cross-section, is:

$$\mathcal{M}_{\phi\phi\to\phi\phi} = \mathcal{M}_s + \mathcal{M}_t + \mathcal{M}_u \tag{4.234}$$

where:

$$\mathcal{M}_s = (-i\mu)^2\,\frac{i}{(p_a + p_b)^2 - m^2}, \tag{4.235}$$
$$\mathcal{M}_t = (-i\mu)^2\,\frac{i}{(p_a - k_1)^2 - m^2}, \tag{4.236}$$
$$\mathcal{M}_u = (-i\mu)^2\,\frac{i}{(p_a - k_2)^2 - m^2}. \tag{4.237}$$

The terminology s, t, and u comes from the standard kinematic variables for 2→2 scattering known as Mandelstam variables:

$$s = (p_a + p_b)^2 = (k_1 + k_2)^2, \tag{4.238}$$
$$t = (p_a - k_1)^2 = (k_2 - p_b)^2, \tag{4.239}$$
$$u = (p_a - k_2)^2 = (k_1 - p_b)^2. \tag{4.240}$$

The s-, t-, and u-channel diagrams are simple functions of the corresponding Mandelstam variables:

$$\mathcal{M}_s = (-i\mu)^2\,\frac{i}{s - m^2}, \tag{4.241}$$
$$\mathcal{M}_t = (-i\mu)^2\,\frac{i}{t - m^2}, \tag{4.242}$$
$$\mathcal{M}_u = (-i\mu)^2\,\frac{i}{u - m^2}. \tag{4.243}$$

Typically, if one instead scatters fermions or vector particles or some combination of them, the s-, t-, and u-channel diagrams will have a similar form, with m always the mass of the particle on the internal line, but with more junk in the numerators coming from the appropriate reduced matrix elements.
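
As an illustration, the following Python sketch builds CM-frame 4-momenta for equal-mass scattering, evaluates the Mandelstam variables (4.238)–(4.240), and sums the three contributions (4.241)–(4.243). The helper names and the sample numbers are assumptions made for this example; the metric signature is taken to be (+,−,−,−).

```python
import math

def minkowski_sq(p):
    """Lorentz-invariant square of a 4-vector (E, px, py, pz), metric (+,-,-,-)."""
    e, x, y, z = p
    return e * e - x * x - y * y - z * z

def combine(p, q, sign):
    """Return p + sign * q, component by component."""
    return tuple(a + sign * b for a, b in zip(p, q))

def cm_momenta(e_cm, m, cos_theta):
    """Equal-mass 2 -> 2 kinematics in the CM frame (illustrative helper)."""
    e = e_cm / 2.0
    k = math.sqrt(e * e - m * m)
    sin_theta = math.sqrt(1.0 - cos_theta ** 2)
    pa = (e, 0.0, 0.0, k)
    pb = (e, 0.0, 0.0, -k)
    k1 = (e, k * sin_theta, 0.0, k * cos_theta)
    k2 = (e, -k * sin_theta, 0.0, -k * cos_theta)
    return pa, pb, k1, k2

def mandelstam(pa, pb, k1, k2):
    s = minkowski_sq(combine(pa, pb, +1))
    t = minkowski_sq(combine(pa, k1, -1))
    u = minkowski_sq(combine(pa, k2, -1))
    return s, t, u

def amplitude(mu, m, s, t, u):
    """Sum of the s-, t-, and u-channel terms (4.241)-(4.243)."""
    vertex_sq = (-1j * mu) ** 2
    return vertex_sq * sum(1j / (x - m * m) for x in (s, t, u))

pa, pb, k1, k2 = cm_momenta(e_cm=5.0, m=1.0, cos_theta=0.3)
s, t, u = mandelstam(pa, pb, k1, k2)
M = amplitude(mu=2.0, m=1.0, s=s, t=t, u=u)
```

For equal masses the sketch reproduces s + t + u = 4m², and the resulting amplitude is purely imaginary at tree level.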

4.7 Feynman Rules

It is now possible to abstract what we have found, to obtain the general Feynman
rules for calculating reduced matrix elements in a scalar field theory. Evidently,
the reduced matrix element M is the sum of contributions from each topologically
distinct Feynman diagram, with external lines corresponding to each initial state or
final state particle. For each term in the interaction Lagrangian
$$\mathcal{L}_{\rm int} = -\frac{y}{n!}\,\phi^n, \tag{4.244}$$

with coupling y, one can draw a vertex at which n lines meet. At each vertex, 4-momentum must be conserved. Then:

• For each vertex appearing in a diagram, we should put a factor of −i y. For the
examples we have done with y = λ and y = μ, the Feynman rules are just:

Note that the conventional factor of 1/n! in the Lagrangian (4.244) makes the
corresponding Feynman rule simple in each case.
• For each internal scalar field line carrying 4-momentum $p^\mu$, we should put a factor of $i/(p^2 - m^2 + i\epsilon)$:

[Figure: internal dashed line carrying 4-momentum p, with factor $i/(p^2 - m^2 + i\epsilon)$.]

This factor associated with internal scalar field lines is called the Feynman propagator. Here we have added an imaginary infinitesimal term $i\epsilon$, with the understanding that $\epsilon \to 0$ at the end of the calculation; this turns out to be necessary for cases in which $p^2$ becomes very close to $m^2$. This corresponds to the particle on the internal line being nearly “on-shell”, because $p^2 = m^2$ is the equation satisfied by a free particle in empty space.
• For each external line, we just have a factor of 1:

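[Figure: an external scalar line attached to the rest of the diagram (gray blob), for either an initial state or a final state particle, with factor 1.]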

This is a rather trivial rule, but it is useful to mention it because in the cases of
fermions and vector fields, external lines will turn out to carry non-trivial factors
not equal to 1. (Here the gray blobs represent the rest of the Feynman diagram.)
Those are all the rules one needs to calculate reduced matrix elements for Feynman
diagrams without closed loops, also known as tree diagrams. The result is said to
be a tree-level calculation. There are additional rules that apply to diagrams with
closed loops (loop diagrams) which have not arisen explicitly in the preceding
discussion, but could be inferred from more complicated calculations. For them,
the additional rules are:
• For each closed loop in a Feynman diagram, there is an undetermined 4-momentum $\ell^\mu$. These loop momenta should be integrated over according to:

$$\int\frac{d^4\ell}{(2\pi)^4}. \tag{4.245}$$

Loop diagrams quite often diverge because of the integration over all $\ell^\mu$, because of the contribution from very large $|\ell^2|$. This can be fixed by introducing a cutoff $|\ell^2|_{\rm max}$ in the integral, or by other more rigorous methods, which can make the integrals finite. The techniques of getting physically meaningful answers out of this are known as regularization (making the integrals finite) and renormalization (redefining coupling constants and masses so that the physical observables do not depend explicitly on the unknown cutoff).
• If a Feynman diagram with one or more closed loops can be transformed into
an exact copy of itself by interchanging any number of internal lines through a
smooth deformation, without moving the external lines, then there is an additional
factor of 1/N , where N is the number of distinct permutations of that type. (This
is known as the “symmetry factor” for the loop diagram.)

Some examples might be useful. In the φ 3 theory, there are quite a few Feynman
diagrams that will describe the scattering of 2 particles to 3 particles. One of them
is shown below:

[Figure: a tree-level Feynman diagram for 2 → 3 scattering in the φ³ theory.]

For this diagram, according to the rules, the contribution to the reduced matrix
element is just

$$\mathcal{M} = (-i\mu)^3\,\frac{i}{(p_a + p_b)^2 - m^2}\,\frac{i}{(k_1 + k_2)^2 - m^2}. \tag{4.246}$$

Imagine having to calculate this starting from scratch with creation and annihilation
operators, and tremble with fear! Feynman rules are good.
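
To show how mechanically the tree-level rules apply, here is a minimal Python sketch that assembles (4.246) from one vertex factor per vertex and one propagator per internal line. The function names and the invariants `s_ab` and `s_12` (standing for $(p_a + p_b)^2$ and $(k_1 + k_2)^2$) are hypothetical labels introduced for this example.

```python
def vertex(coupling):
    """Feynman rule for a scalar vertex: -i times the coupling."""
    return -1j * coupling

def propagator(p_squared, m, eps=0.0):
    """Feynman rule for an internal scalar line: i / (p^2 - m^2 + i eps)."""
    return 1j / (p_squared - m * m + 1j * eps)

def amplitude_2_to_3(mu, m, s_ab, s_12):
    """Contribution (4.246): three vertices and two internal lines."""
    return vertex(mu) ** 3 * propagator(s_ab, m) * propagator(s_12, m)

# Sample numbers: (-2i)^3 = 8i, and the propagators give (i/4)(i/2) = -1/8.
M = amplitude_2_to_3(mu=2.0, m=1.0, s_ab=5.0, s_12=3.0)
```

External scalar lines contribute factors of 1, so they do not appear in the product.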
An example of a Feynman diagram with a closed loop in the φ³ theory is:

[Figure: Feynman diagram with a closed loop inserted on the internal line.]

There is a symmetry factor of 1/2 for this diagram, because one can smoothly interchange the two lines carrying 4-momenta $\ell^\mu$ and $\ell^\mu - p_a^\mu - p_b^\mu$ to get back to the original diagram, without moving the external lines. So the reduced matrix element for this diagram is:

$$\mathcal{M} = \frac{1}{2}(-i\mu)^4\left[\frac{i}{(p_a + p_b)^2 - m^2}\right]^2\int\frac{d^4\ell}{(2\pi)^4}\,\frac{i}{(\ell - p_a - p_b)^2 - m^2 + i\epsilon}\,\frac{i}{\ell^2 - m^2 + i\epsilon}.$$

Again, deriving this result starting from the creation and annihilation operators is pos-
sible, but extraordinarily unpleasant! In the future, we will simply state the Feynman
rules for any theory from staring at the Lagrangian density. The general procedure
for doing this is rather simple (although the proof is not), and is outlined below.
A Feynman diagram is a precise representation of a contribution to the reduced
matrix element M for a given physical process. The diagrams are built out of three
types of building blocks:

vertices ←→ interactions (4.247)

internal lines ←→ free virtual particle propagation (4.248)
external lines ←→ initial, final states. (4.249)

The Feynman rules specify a mathematical expression for each of these objects. They
follow from the Lagrangian density, which defines a particular theory.2
To generalize what we have found for scalar fields, let us consider a set of generic fields $\Phi_i$, which can include both commuting bosons and anticommuting fermions.
They might include real or complex scalars, Dirac or Weyl fermions, and vector
fields of various types. The index i runs over a list of all the fields, and over their
spinor or vector indices. Now, it is always possible to obtain the Feynman rules by
writing an interaction Hamiltonian and computing matrix elements. Alternatively,
one can use powerful path integral techniques that are beyond the scope of this book
to derive the Feynman rules. However, in the end the rules can be summarized very
simply in a way that could be guessed from the examples of real scalar field theory
that we have already worked out. In the following, we will simply state the relevant
results; more rigorous derivations can be found in field theory textbooks.
For interactions, we have now found in two cases that the Feynman rule for n
scalar lines to meet at a vertex is equal to −i times the coupling of n scalar fields
in the Lagrangian with a factor of 1/n!. More generally, consider an interaction
Lagrangian term:

$$\mathcal{L}_{\rm int} = -\frac{X_{i_1 i_2 \ldots i_N}}{P}\,\Phi_{i_1}\Phi_{i_2}\cdots\Phi_{i_N}, \tag{4.250}$$

where P is the product of n! for each set of n identical fields in the list $\Phi_{i_1}, \Phi_{i_2}, \ldots, \Phi_{i_N}$, and $X_{i_1 i_2 \ldots i_N}$ is the coupling constant that determines the strength of the interaction. The corresponding Feynman rule attaches N lines together at a vertex. Then the mathematical expression assigned to this vertex is $-iX_{i_1 i_2 \ldots i_N}$. The lines for distinguishable fields among $i_1, i_2, \ldots, i_N$ should be labeled as such, or otherwise distinguished by drawing them differently from each other.
For example, consider a theory with two real scalar fields φ and ρ. If the interaction
Lagrangian includes terms, say,

$$\mathcal{L}_{\rm int} = -\frac{\lambda_1}{4}\,\phi^2\rho^2 - \frac{\lambda_2}{6}\,\phi^3\rho, \tag{4.251}$$
then there are Feynman rules:

Here the longer-dashed lines correspond to the field φ, and the shorter-dashed lines
to the field ρ.
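
The combinatorial factor P in (4.250) is easy to compute for any list of fields. The short Python sketch below counts identical fields and multiplies the factorials, reproducing the denominators 4 and 6 in (4.251). The function name and the string labels for the fields are illustrative choices, not notation from the text.

```python
from collections import Counter
from math import factorial, prod

def symmetry_denominator(fields):
    """Product of n! over each set of n identical fields in the interaction term."""
    counts = Counter(fields)
    return prod(factorial(n) for n in counts.values())

# The two terms of (4.251): phi^2 rho^2 and phi^3 rho.
p_first = symmetry_denominator(["phi", "phi", "rho", "rho"])   # 2! * 2!
p_second = symmetry_denominator(["phi", "phi", "phi", "rho"])  # 3! * 1!
```

With this normalization the vertex factor is simply −i times the coupling, whatever the multiplicities of the fields.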

2 It is tempting to suggest that the Feynman rules themselves should be taken as the definition of the

theory. However, this would only be sufficient to describe phenomena that occur in a perturbative
weak-coupling expansion.

As another example, consider a theory in which a real scalar field φ couples to a Dirac fermion Ψ according to:

$$\mathcal{L}_{\rm int} = -y\,\phi\,\overline{\Psi}\Psi. \tag{4.252}$$

In this case, we must distinguish between lines for all three fields, because $\overline{\Psi} = \Psi^\dagger\gamma^0$ is independent of Ψ. For Dirac fermions, one draws solid lines with an arrow coming in to a vertex representing Ψ in $H_{\rm int}$, and an arrow coming out representing $\overline{\Psi}$. So the Feynman rule for this interaction is:

Note that this Feynman rule is proportional to a 4 × 4 identity matrix in Dirac spinor space. This is because the interaction Lagrangian can be written $-y\,\delta_a{}^b\,\phi\,\overline{\Psi}^a\Psi_b$, where a is the Dirac spinor index for $\overline{\Psi}$ and b for Ψ. Often, one just suppresses the spinor indices, and writes simply −iy for the Feynman rule, with the identity matrix implicit.
The interaction Lagrangian (4.252) is called a Yukawa coupling. This theory has a real-world physical application: it is precisely the type of interaction that applies between the Standard Model Higgs boson φ = h and each Dirac fermion Ψ, with the coupling y proportional to the mass of that fermion. We will return to this interaction when we discuss the decays of the Higgs boson into fermion-antifermion pairs.
Let us turn next to the topic of internal lines in Feynman diagrams. These are
determined by the free (quadratic) part of the Lagrangian density. Recall that for a
scalar field, we can write the free Lagrangian after integrating by parts as:

$$\mathcal{L}_0 = \frac{1}{2}\,\phi\,(-\partial_\mu\partial^\mu - m^2)\,\phi. \tag{4.253}$$

This corresponded to a Feynman propagator rule for internal scalar lines $i/(p^2 - m^2 + i\epsilon)$. So, up to the $i\epsilon$ factor, the propagator is just proportional to i divided by the coefficient of the quadratic piece of the Lagrangian density, with the replacement

$$\partial_\mu \longrightarrow -ip_\mu. \tag{4.254}$$

The free Lagrangian density for generic fields $\Phi_i$ can always be put into either the form

$$\mathcal{L}_0 = \frac{1}{2}\sum_{i,j}\Phi_i\,P_{ij}\,\Phi_j, \tag{4.255}$$

for real fields, or the form

$$\mathcal{L}_0 = \sum_{i,j}(\Phi^\dagger)_i\,P_{ij}\,\Phi_j, \tag{4.256}$$

for complex fields (including, for example, Dirac spinors). To accomplish this, one may need to integrate the action by parts, throwing away a total derivative in $\mathcal{L}_0$ which will not contribute to $S = \int d^4x\,\mathcal{L}$. Here $P_{ij}$ is a matrix that involves spacetime derivatives and masses. Then it turns out that the Feynman propagator can be found by making the replacement (4.254) and taking i times the inverse of the matrix $P_{ij}$:

$$i(P^{-1})_{ij}. \tag{4.257}$$

This corresponds to an internal line in the Feynman diagram labeled by i at one end and j at the other.
As an example, consider the free Lagrangian for a Dirac spinor Ψ, as given by (4.26). According to the prescription of (4.256) and (4.257), the Feynman propagator connecting vertices with spinor indices a and b should be:

$$i\big[(\slashed{p} - m)^{-1}\big]_a{}^b. \tag{4.258}$$

In order to make sense of the inverse matrix, we can write it as a fraction, then multiply numerator and denominator by $(\slashed{p} + m)$, and use the fact that $\slashed{p}\slashed{p} = p^2$ from (3.113):

$$\frac{i}{\slashed{p} - m} = \frac{i(\slashed{p} + m)}{(\slashed{p} - m)(\slashed{p} + m)} = \frac{i(\slashed{p} + m)}{p^2 - m^2 + i\epsilon}. \tag{4.259}$$

In the last expression we have put in the $i\epsilon$ factor needed for loop diagrams as a prescription for handling the possible singularity at $p^2 = m^2$. So the Feynman rule for a Dirac fermion internal line is:

Here the arrow direction on the fermion line distinguishes the direction of particle
flow, with particles (anti-particles) moving with (against) the arrow. For electrons
and positrons, this means that the arrow on the propagator points in the direction
of the flow of negative charge. As indicated, the 4-momentum p μ appearing in the
propagator is also assigned to be in the direction of the arrow on the internal fermion
line.
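
The matrix inversion behind the fermion propagator can be checked numerically. The sketch below builds gamma matrices in the Dirac representation (an assumed choice made for this example; the identity $\slashed{p}\slashed{p} = p^2$ holds in any representation), forms $i(\slashed{p} + m)/(p^2 - m^2)$ for an off-shell momentum, and verifies that it is i times the inverse of $(\slashed{p} - m)$. The function names are illustrative, and the $i\epsilon$ term is dropped because the sample momentum is off-shell.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# gamma^0 followed by gamma^1, gamma^2, gamma^3 (Dirac representation, assumed).
GAMMA = [np.block([[I2, Z2], [Z2, -I2]])] + [
    np.block([[Z2, sig], [-sig, Z2]]) for sig in PAULI
]

def slash(p):
    """p-slash = gamma^mu p_mu for p = (E, px, py, pz), metric (+,-,-,-)."""
    return p[0] * GAMMA[0] - sum(p[i] * GAMMA[i] for i in (1, 2, 3))

def dirac_propagator(p, m):
    """i (p-slash + m) / (p^2 - m^2), valid away from p^2 = m^2."""
    p2 = p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2
    return 1j * (slash(p) + m * np.eye(4)) / (p2 - m * m)

p = (3.0, 0.5, -1.0, 2.0)  # off-shell sample momentum with p^2 = 3.75
m = 1.0
check = (slash(p) - m * np.eye(4)) @ dirac_propagator(p, m)
```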
Next we turn to the question of Feynman rules for external particle and anti-particle lines. At a fixed time t = 0, a generic field Φ is written as an expansion of the form:

$$\Phi(\mathbf{x}) = \sum_n\int d\tilde{p}\,\Big[\,i(\mathbf{p}, n)\,e^{i\mathbf{p}\cdot\mathbf{x}}\,a_{\mathbf{p},n} + f(\mathbf{p}, n)\,e^{-i\mathbf{p}\cdot\mathbf{x}}\,b^\dagger_{\mathbf{p},n}\Big], \tag{4.260}$$

where $a_{\mathbf{p},n}$ and $b^\dagger_{\mathbf{p},n}$ are annihilation and creation operators (which may or may not be Hermitian conjugates of each other); n is an index running over spins and perhaps other labels for different particle types; and $i(\mathbf{p}, n)$ and $f(\mathbf{p}, n)$ are expansion coefficients. In general, we build an interaction Hamiltonian out of the fields Φ. When acting on an initial state $a^\dagger_{\mathbf{k},m}|0\rangle$ on the right, $H_{\rm int}$ will therefore produce a factor of $i(\mathbf{k}, m)$ after commuting (or anticommuting, for fermions) the $a_{\mathbf{p},n}$ operator in Φ to the right, removing the $a^\dagger_{\mathbf{k},m}$. Likewise, when acting on a final state $\langle 0|b_{\mathbf{k},m}$ on the left, the interaction Hamiltonian will produce a factor of $f(\mathbf{k}, m)$. Therefore, initial and final state lines just correspond to the appropriate coefficient of annihilation and creation operators in the Fourier mode expansion for that field.
For example, comparing (4.260) to (4.72) in the scalar case, we find that i(p, n)
and f (p, n) are both just equal to 1.
For Dirac fermions, we see from (4.89) that the coefficient for an initial state particle (electron) carrying 4-momentum $p^\mu$ and spin state s is $u(p, s)_a$, where a is a spinor index. Similarly, the coefficient for a final state antiparticle (positron) is $v(p, s)_a$. So the Feynman rules for these types of external particle lines are:

Here the blobs represent the rest of the Feynman diagram in each case. Similarly, considering the expansion of the field $\overline{\Psi}$ in (4.90), we see that the coefficient for an initial state antiparticle (positron) is $\overline{v}(p, s)^a$ and that for a final state particle (electron) is $\overline{u}(p, s)^a$. So the Feynman rules for these external states are:

Note that in these rules, the $p^\mu$ label of an external state is always the physical 4-momentum of that particle or anti-particle; this means that with the standard convention of initial state on the left and final state on the right, the $p^\mu$ associated with each of $u(p, s)$, $v(p, s)$, $\overline{u}(p, s)$ and $\overline{v}(p, s)$ is always taken to be pointing to the right. For $v(p, s)$ and $\overline{v}(p, s)$, this is in the opposite direction to the arrow on the fermion line itself.

Problems

1. Prove that if f ( p) is Lorentz invariant then so must be

$$\int\frac{d^3\mathbf{p}}{(2\pi)^3\,2E_{\mathbf{p}}}\,f(p). \tag{4.261}$$

2. Prove that

[H , ak† ] = E k ak† . (4.262)

3. Writing all Lorentz invariant terms for the Lagrangian of a scalar field φ with no more than two powers of φ yields

$$\mathcal{L} = \frac{1}{2}(\partial_\mu\phi)(\partial^\mu\phi) - \frac{1}{2}m^2\phi^2. \tag{4.263}$$
However, why not add a term linear in φ such as φ, where is a constant?
Show that this extra linear term (“tadpole term” as it is often called) can be
eliminated by a suitable redefinition of the field φ.
4. In our computation of σ(φφ → φφ) within the scalar $\phi^4$ theory the matrix element was $\mathcal{M}_{i\to f} = -i\lambda$. Let us suppose instead that it is

$$\mathcal{M}_{i\to f} = -i\lambda + i\kappa\,p_a\cdot k_1 \tag{4.264}$$

where κ is another free variable, $p_a$ is the incoming φ momentum, and $k_1$ is one of the outgoing φ momenta. Compute the total cross-section given this matrix element.
5. Consider a general 2 → 2 particle scattering problem. The initial-state particles have masses $m_a$ and $m_b$ and the final state particles have masses $m_1$ and $m_2$. The momenta for this scattering process are the corresponding $p_a\,p_b \to k_1\,k_2$. Using the Mandelstam variable definitions $s = (p_a + p_b)^2$, $t = (p_a - k_1)^2$ and $u = (p_a - k_2)^2$, prove that $s + t + u = m_a^2 + m_b^2 + m_1^2 + m_2^2$. This means that one can always eliminate one of the Mandelstam variables in terms of the other two. Find expressions for the Lorentz-invariant quantities $p_a\cdot p_b$, $p_a\cdot k_1$ and $p_a\cdot k_2$ in terms of only s, t, $m_a$, $m_b$, $m_1$, and $m_2$.
6. Consider the lagrangian

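$$\mathcal{L} = \frac{1}{2}(\partial_\mu \phi_1)(\partial^\mu \phi_1) + \frac{1}{2}(\partial_\mu \phi_2)(\partial^\mu \phi_2) - \frac{1}{2} m_1^2 \phi_1^2 - \frac{1}{2} m_2^2 \phi_2^2 + m_3^2\, \phi_1 \phi_2 + \mu \phi_1^3.$$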
Redefine this theory by rotating φ1 and φ2 such that the kinetic and mass terms
(bilinear in φi ) are canonical (i.e., diagonal). Then write down all the Feynman
rules of the theory. In other words, give all the external leg factors, the propaga-
tors, and the interaction vertices.

Hint: Expressions will be simpler if you find the rotation angle $\alpha$ between the
$\{\phi_1, \phi_2\}$ states and the mass eigenstates $\{\phi_1', \phi_2'\}$ in terms of $m_1$, $m_2$ and $m_3$ and
then write all interaction Feynman rules in terms of $\mu$ and the angle $\alpha$. Also,
you'll need to find the mass-eigenstate masses $m_1'^2$ and $m_2'^2$ in terms of $m_1$, $m_2$
and $m_3$.
7. Draw all the Feynman diagrams corresponding to φφ → φφφφ scattering, to
order λ2 . You should find 10 distinct diagrams, which can be organized into two
distinct classes. Clearly label the particle 4-momenta for each external particle
and internal propagator line in each diagram. Use Feynman rules to write down
an expression for the matrix elements corresponding to these diagrams.
8. Consider the reduced matrix element obtained for the toy model example of
φφ → φφ scattering in φ 3 theory in Sect. 4.6.
(a) Find the differential cross section $d\sigma / d\cos\theta$ as a function of $\mu$, $s$, $m$, and $\cos\theta$.

(b) Find the total cross section in terms of μ, s, m, and simplify it as much as
possible.

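$$\sigma_{\rm threshold} = \frac{N^2 \mu^4}{1152\pi\, m^6} \tag{4.265}$$

$$\sigma_{s \gg m^2} = \frac{\mu^4}{16\pi\, m^2 s^2}, \tag{4.266}$$

where $N$ is a certain odd integer. [Hints: integrate directly in terms of the
variable $\cos\theta$. You are very likely to find the following definite integrals to
be useful:

$$\int_{-1}^{1} \frac{dx}{a + bx} = \frac{1}{b} \ln\left(\frac{a+b}{a-b}\right), \tag{4.267}$$

$$\int_{-1}^{1} \frac{dx}{(a + bx)^2} = \frac{2}{a^2 - b^2}, \tag{4.268}$$

$$\int_{-1}^{1} \frac{dx}{(a + bx)(a - bx)} = \frac{1}{ab} \ln\left(\frac{a+b}{a-b}\right). \tag{4.269}$$

Remember to be wary of tricky factors of 2 at the very end.]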
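The three definite integrals in the hint can be spot-checked numerically before they are used. The following is a minimal sketch, not part of the problem: the helper `simpson` and the sample values of `a` and `b` are illustrative assumptions, chosen so that $a > |b| > 0$ and the integrands stay finite on $[-1, 1]$.

```python
import math

def simpson(f, lo, hi, n=2000):
    """Composite Simpson's rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = f(lo) + f(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(lo + i * h)
    return total * h / 3

# Illustrative values only; any a > |b| > 0 keeps the integrands regular.
a, b = 3.0, 1.0
log_term = math.log((a + b) / (a - b))

# Each entry pairs a numerical integral with the quoted closed form.
checks = {
    "4.267": (simpson(lambda x: 1 / (a + b * x), -1, 1), log_term / b),
    "4.268": (simpson(lambda x: 1 / (a + b * x) ** 2, -1, 1), 2 / (a**2 - b**2)),
    "4.269": (simpson(lambda x: 1 / ((a + b * x) * (a - b * x)), -1, 1),
              log_term / (a * b)),
}
for label, (numeric, closed_form) in checks.items():
    print(label, numeric, closed_form)
```

The two columns printed for each label should agree to many digits; a mismatch would point to a misplaced factor in the closed form.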


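9. In the quantum theory of a free scalar field $\phi$, compute the commutators

$$[H, a^\dagger_{\mathbf k}] = E_{\mathbf k}\, a^\dagger_{\mathbf k} \quad \text{and} \tag{4.270}$$

$$[H, a_{\mathbf k}] = \; ? \tag{4.271}$$

In the theory of a free Dirac fermion field, use the anticommutation relations of
$b$, $b^\dagger$ to compute the commutators

$$[H, b^\dagger_{\mathbf k,s}] = \; ? \quad \text{and} \tag{4.272}$$

$$[H, b_{\mathbf k,s}] = \; ? \tag{4.273}$$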

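10. Consider the operator:

$$\mathbf{P} = -\int d^3x\; \pi(\mathbf x)\, \nabla\phi(\mathbf x), \tag{4.274}$$

in the case of a free real scalar field $\phi$. Rewrite this operator in terms of the $a_{\mathbf p}$
and $a^\dagger_{\mathbf p}$ operators, and show that the result is:

$$\mathbf{P} = \int d\tilde p\; \mathbf{p}\, a^\dagger_{\mathbf p} a_{\mathbf p} \tag{4.275}$$

(Hint: You will have to argue that certain terms vanish, including an apparently
infinite one, by carefully noting their behavior as $\mathbf p \to -\mathbf p$.) What is $\mathbf P$ acting on
the vacuum state $|0\rangle$? Compute the commutator:

$$[\mathbf{P}, a^\dagger_{\mathbf k}] = \; ? \tag{4.276}$$

What is the eigenvalue of $\mathbf P$ acting on the state $|\mathbf k\rangle = a^\dagger_{\mathbf k} |0\rangle$?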

11. Using the previous problem as inspiration, guess an expression for the total 4-
momentum operator P μ in the free quantum field theory for a Dirac fermion,
directly in terms of raising and lowering operators. Show that your guess is correct
by computing the commutators of this operator with the raising and lowering
operators, and using these to find how the operator P μ acts on 1-particle and
1-antiparticle states.
5 Quantum Electro-Dynamics (QED)

5.1 QED Lagrangian and Feynman Rules

Let us now see how all of the general rules we have developed so far apply in
the case of Quantum Electrodynamics. This is the quantum field theory governing
photons (quantized electromagnetic waves) and charged fermions and antifermions.
The fermions in the theory are represented by Dirac spinor fields carrying electric
charge Qe, where e is the magnitude of the charge of the electron. Thus Q = −1 for
electrons and positrons, +2/3 for up, charm and top quarks and their anti-quarks,
and −1/3 for down, strange and bottom quarks and their antiquarks. (Recall that a
single Dirac field, assigned a single value of Q, is used to describe both particles and
their anti-particles.) The free Lagrangian for the theory is:

$$\mathcal{L}_0 = -\frac{1}{4} F^{\mu\nu} F_{\mu\nu} + \bar\Psi (i\gamma^\mu \partial_\mu - m)\Psi. \tag{5.1}$$

Now, earlier we found that the electromagnetic field $A^\mu$ couples to the 4-current
density $J^\mu = (\rho, \mathbf{J})$ by a term in the Lagrangian $-e J^\mu A_\mu$ (see (4.39)). Since $J^\mu$
must be a four-vector built out of the charged fermion fields $\Psi$ and $\bar\Psi$, we can guess
that

$$J^\mu = Q\, \bar\Psi \gamma^\mu \Psi. \tag{5.2}$$

The interaction Lagrangian density for a fermion with charge $Qe$ and electromagnetic
fields is therefore

$$\mathcal{L}_{\rm int} = -eQ\, \bar\Psi \gamma^\mu \Psi A_\mu. \tag{5.3}$$

The value of e is determined by experiment. However, it is a running coupling

constant, which means that effectively its value has a logarithmic dependence on the

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 101
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_5

characteristic energy of the process. For very low energy experiments, the numerical
value is e ≈ 0.30282, corresponding to the experimental result for the fine structure
constant:

$$\alpha \equiv \frac{e^2}{4\pi} \approx 1/137.036. \tag{5.4}$$
For experiments done at energies near 100 GeV, the appropriate value is a little larger,
more like e ≈ 0.313.
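As a quick numerical cross-check of (5.4), the two quoted values of $e$ can be converted into values of $1/\alpha$. This is only a sketch: the function name `fine_structure` is an illustrative choice, and natural units ($\hbar = c = 1$) are assumed as in the text.

```python
import math

def fine_structure(e):
    """alpha = e^2 / (4 pi) in natural units (hbar = c = 1)."""
    return e**2 / (4 * math.pi)

alpha_low = fine_structure(0.30282)  # coupling quoted for very low energies
alpha_100 = fine_structure(0.313)    # coupling quoted for energies near 100 GeV

# 1/alpha is the conventional way to quote the strength of the coupling.
print(1 / alpha_low, 1 / alpha_100)
```

The first number should land very close to 137.036, while the second comes out near 128, which illustrates how slowly the coupling runs over many orders of magnitude in energy.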
Let us take a small detour to check that (5.2) really has the correct form and
normalization to be the electromagnetic current density. Consider the total charge
operator:

$$\hat Q = \int d^3x\; \rho(\mathbf x) = \int d^3x\; J^0(\mathbf x) = Q \int d^3x\; \bar\Psi \gamma^0 \Psi = Q \int d^3x\; \Psi^\dagger \Psi. \tag{5.5}$$

Plugging in (4.89) and (4.90), and doing the x integration, and one of the momentum
integrations using the resulting delta function, one finds

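$$\hat Q = Q \sum_{s=1}^{2} \sum_{r=1}^{2} \int d\tilde p\; \frac{1}{2E_{\mathbf p}} \left[ u^\dagger(p, s)\, b^\dagger_{\mathbf p,s} + v^\dagger(p, s)\, d_{\mathbf p,s} \right] \left[ u(p, r)\, b_{\mathbf p,r} + v(p, r)\, d^\dagger_{\mathbf p,r} \right]. \tag{5.6}$$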

This can be simplified using some spinor identities:

$$u^\dagger(p, s)\, u(p, r) = v^\dagger(p, s)\, v(p, r) = 2E_{\mathbf p}\, \delta_{sr}; \tag{5.7}$$

$$u^\dagger(p, s)\, v(p, r) = v^\dagger(p, s)\, u(p, r) = 0. \tag{5.8}$$

Taking into account that the operators d, d † satisfy the anticommutation relation
(4.108), the result is

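$$\hat Q = Q \sum_{s=1}^{2} \int d\tilde p\; \left[ b^\dagger_{\mathbf p,s} b_{\mathbf p,s} - d^\dagger_{\mathbf p,s} d_{\mathbf p,s} \right]. \tag{5.9}$$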

Here we have dropped an infinite contribution, much like we had to do in getting to

(4.84) and (4.109). This represents a uniform and constant (and therefore unobserv-
able) infinite charge density throughout all space. The result is that the total charge
eigenvalue of the vacuum vanishes:

$$\hat Q\, |0\rangle = 0. \tag{5.10}$$

One can now check that

$$[\hat Q, b^\dagger_{\mathbf k,r}] = Q\, b^\dagger_{\mathbf k,r}, \tag{5.11}$$

$$[\hat Q, d^\dagger_{\mathbf k,r}] = -Q\, d^\dagger_{\mathbf k,r}. \tag{5.12}$$

Therefore,

$$\hat Q\, b^\dagger_{\mathbf k,r} |0\rangle = Q\, b^\dagger_{\mathbf k,r} |0\rangle \tag{5.13}$$

for single-particle states, and

$$\hat Q\, d^\dagger_{\mathbf k,r} |0\rangle = -Q\, d^\dagger_{\mathbf k,r} |0\rangle \tag{5.14}$$

for single anti-particle states. More generally, the eigenvalue of $\hat Q$ acting on a state
with $N$ particles and $\bar N$ antiparticles is $(N - \bar N)Q$. From (5.5), this verifies that the
charge density $\rho$ is indeed the time-like component of the four-vector $Q \bar\Psi \gamma^\mu \Psi$,
which must therefore be equal to $J^\mu$.
The full QED Lagrangian is invariant under gauge transformations:

$$A_\mu(x) \to A_\mu(x) - \frac{1}{e}\, \partial_\mu \theta(x), \tag{5.15}$$

$$\Psi(x) \to e^{iQ\theta}\, \Psi(x), \tag{5.16}$$

$$\bar\Psi(x) \to e^{-iQ\theta}\, \bar\Psi(x), \tag{5.17}$$

where θ (x) is an arbitrary function of spacetime (equal to −eλ(x) in (2.87)). A nice

way to see this invariance is to define the covariant derivative:

$$D_\mu \Psi \equiv (\partial_\mu + iQe A_\mu)\Psi. \tag{5.18}$$

Here the term “covariant” refers to the gauge transformation symmetry, not the
Lorentz transformation symmetry as it did when we introduced covariant four-
vectors. Note that the covariant derivative actually depends on the charge of the
field it acts on. Now one can write the full Lagrangian density as

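$$\mathcal{L} = \mathcal{L}_0 + \mathcal{L}_{\rm int} = -\frac{1}{4} F^{\mu\nu} F_{\mu\nu} + i \bar\Psi \gamma^\mu D_\mu \Psi - m \bar\Psi \Psi. \tag{5.19}$$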
The ordinary derivative of the spinor transforms under the gauge transformation with
an “extra” term:

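$$\partial_\mu \Psi \to \partial_\mu (e^{iQ\theta} \Psi) = e^{iQ\theta} \left[ \partial_\mu \Psi + iQ (\partial_\mu \theta) \Psi \right]. \tag{5.20}$$

The point of the covariant derivative of $\Psi$ is that it transforms under the gauge
transformation the same way $\Psi$ does, by acquiring a phase:

$$D_\mu \Psi \to e^{iQ\theta} D_\mu \Psi. \tag{5.21}$$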

Here the contribution from the transformation of Aμ in Dμ cancels the extra term in
(5.20). Using (5.15)–(5.17) and (5.21), it is easy to see that L is invariant under the
gauge transformation, since the multiplicative phase factors just cancel.
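The cancellation behind (5.21) can be written out explicitly. The lines below are a sketch that uses only the transformations (5.15) and (5.16) and the definition (5.18), with $\Psi$ denoting the Dirac spinor field:

```latex
\begin{align*}
D_\mu \Psi \;\to\;& \left[ \partial_\mu + iQe \left( A_\mu - \tfrac{1}{e}\, \partial_\mu \theta \right) \right] e^{iQ\theta}\, \Psi \\
=\;& e^{iQ\theta} \left[ \partial_\mu \Psi + iQ (\partial_\mu \theta) \Psi
     + iQe A_\mu \Psi - iQ (\partial_\mu \theta) \Psi \right] \\
=\;& e^{iQ\theta}\, D_\mu \Psi .
\end{align*}
```

The two terms proportional to $\partial_\mu \theta$ come from differentiating the phase and from the shift of $A_\mu$, and they cancel because the same charge $Q$ appears in both the phase and the covariant derivative.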

Returning to the interaction term (5.3), we can now identify the Feynman rule for
QED interactions, by following the general prescription outlined with (4.250):

This one interaction vertex governs all physical processes in QED.

To find the Feynman rules for initial and final state photons, consider the Fourier
expansion of the vector field at a fixed time:

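$$A^\mu = \sum_{\lambda=0}^{3} \int d\tilde p\; \left[ \epsilon^\mu(p, \lambda)\, e^{i \mathbf p \cdot \mathbf x}\, a_{\mathbf p,\lambda} + \epsilon^{\mu *}(p, \lambda)\, e^{-i \mathbf p \cdot \mathbf x}\, a^\dagger_{\mathbf p,\lambda} \right]. \tag{5.22}$$

Here $\epsilon^\mu(p, \lambda)$ is a basis of four polarization four-vectors labeled by $\lambda = 0, 1, 2, 3$.
They satisfy the orthonormality condition:

$$\epsilon^\mu(p, \lambda)\, \epsilon^*_\mu(p, \lambda') = \begin{cases} -1 & \text{for } \lambda = \lambda' = 1, 2, \text{ or } 3 \\ +1 & \text{for } \lambda = \lambda' = 0 \\ \phantom{+}0 & \text{for } \lambda \neq \lambda' \end{cases} \tag{5.23}$$

The operators $a^\dagger_{\mathbf p,\lambda}$ and $a_{\mathbf p,\lambda}$ act on the vacuum state by creating and destroying
photons with momentum $\mathbf p$ and polarization vector $\epsilon^\mu(p, \lambda)$.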
However, not all of the four degrees of freedom labeled by $\lambda$ can be physical. From
classical electromagnetism, we know that electromagnetic waves are transversely
polarized. This means that the electric and magnetic fields are perpendicular to the
3-momentum direction of propagation. In terms of the potentials, it means that one can
always choose a gauge in which $A^0 = 0$ and the Lorenz gauge condition $\partial_\mu A^\mu = 0$
is satisfied. Therefore, physical electromagnetic wave quanta corresponding to the
classical solutions to Maxwell's equations $A^\mu = \epsilon^\mu e^{-i p \cdot x}$ with $p^2 = 0$ can be taken
to obey:

$$\epsilon^\mu = (0, \boldsymbol{\epsilon}), \tag{5.24}$$

$$p_\mu \epsilon^\mu = 0 \tag{5.25}$$

(or, equivalent to the last condition, $\boldsymbol{\epsilon} \cdot \mathbf p = 0$). After imposing these two conditions,
only two of the four $\lambda$'s will survive as valid initial or final states for any given $p^\mu$.
For example, suppose that a state contains a photon with 3-momentum $\mathbf p = p \hat z$,
so $p^\mu = (p, 0, 0, p)$. Then we can choose a basis of transverse linearly-polarized
vectors with $\lambda = 1, 2$:

$$\epsilon^\mu(p, 1) = (0, 1, 0, 0) \quad \text{x-polarized}, \tag{5.26}$$

$$\epsilon^\mu(p, 2) = (0, 0, 1, 0) \quad \text{y-polarized}. \tag{5.27}$$

However, in high-energy physics it is often more useful to instead use a basis of left-
and right-handed circular polarizations that carry definite helicities $\lambda = R, L$:

$$\epsilon^\mu(p, R) = \frac{1}{\sqrt 2} (0, 1, i, 0) \quad \text{right-handed}, \tag{5.28}$$

$$\epsilon^\mu(p, L) = \frac{1}{\sqrt 2} (0, 1, -i, 0) \quad \text{left-handed}. \tag{5.29}$$

In general, incoming photon lines have a Feynman rule $\epsilon_\mu(p, \lambda)$ and outgoing photon
lines have a Feynman rule $\epsilon^*_\mu(p, \lambda)$, where $\lambda = 1, 2$ in some convenient basis of
choice. Often, we will sum or average over the polarization labels $\lambda$, so the $\epsilon_\mu(p, \lambda)$
will not need to be listed explicitly for a given momentum.
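The transversality and normalization of the linear and circular polarization vectors quoted above can be verified directly. The sketch below assumes the metric signature $(+,-,-,-)$ used throughout, sets $|\mathbf p| = 1$ for convenience, and uses a hypothetical helper `dot` that conjugates its second argument as in (5.23).

```python
import math

G = (1, -1, -1, -1)  # diagonal metric, signature (+,-,-,-)

def dot(a, b):
    """Lorentz product a^mu b*_mu, conjugating the second vector."""
    return sum(g * x * complex(y).conjugate() for g, x, y in zip(G, a, b))

p = (1.0, 0.0, 0.0, 1.0)  # massless photon moving along +z, with |p| = 1
r = 1 / math.sqrt(2)
eps = {
    "x": (0, 1, 0, 0),
    "y": (0, 0, 1, 0),
    "R": (0, r, r * 1j, 0),
    "L": (0, r, -r * 1j, 0),
}

for name, e in eps.items():
    # Each vector should have norm -1 and be orthogonal to p.
    print(name, dot(e, e), dot(p, e))

# Within each basis the two vectors should be mutually orthogonal.
print(dot(eps["x"], eps["y"]), dot(eps["R"], eps["L"]))
```

Both bases span the same transverse plane; the circular vectors are just complex combinations of the x- and y-polarized ones.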
Let us next construct the Feynman propagator for photon lines. The free
Lagrangian density given in (4.34) can be rewritten as:

$$\mathcal{L}_0 = \frac{1}{2} A^\mu \left( g_{\mu\nu}\, \partial_\rho \partial^\rho - \partial_\mu \partial_\nu \right) A^\nu, \tag{5.30}$$

where we have dropped a total derivative. (The action, obtained by integrating $\mathcal{L}_0$,
does not depend on total derivative terms.) Therefore, following the prescription of
(4.257), it appears that we ought to find the propagator by finding the inverse of the
4 × 4 matrix

$$P_{\mu\nu} = -p^2 g_{\mu\nu} + p_\mu p_\nu. \tag{5.31}$$

Unfortunately, however, this matrix is not invertible. The reason for this can be traced
to the gauge invariance of the theory; not all of the states we are attempting
to propagate are really physical.
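One way to see the non-invertibility explicitly is to contract the matrix in (5.31) with $p^\nu$. The short computation below is a sketch using only that definition:

```latex
\begin{align*}
P_{\mu\nu}\, p^\nu = -p^2 g_{\mu\nu}\, p^\nu + p_\mu\, p_\nu p^\nu
                  = -p^2 p_\mu + p^2 p_\mu = 0 .
\end{align*}
```

So $p^\nu$ is an eigenvector with eigenvalue zero, and a matrix with a zero eigenvalue has no inverse. In momentum space the gauge shift of $A_\mu$ in (5.15) is proportional to $p_\mu$, which is exactly this direction.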
This problem can be avoided using a trick, due to Fermi, called “gauge fixing”.
As long as we agree to stick to the Lorenz gauge, ∂μ Aμ = 0, we can add a term to
the Lagrangian density proportional to (∂μ Aμ )2 :

$$\mathcal{L}_0^{(\xi)} = \mathcal{L}_0 - \frac{1}{2\xi} (\partial_\mu A^\mu)^2. \tag{5.32}$$

In Lorenz gauge, not only does the extra term vanish, but also its contribution to the
equations of motion vanishes. Here $\xi$ is an arbitrary new gauge-fixing parameter;
it can be picked at will. Intermediate steps in a calculation may depend on it, but
physical results should not depend on the choice of $\xi$. The new term in the modified
free Lagrangian $\mathcal{L}_0^{(\xi)}$ is called the gauge-fixing term. Now the matrix to be inverted
is:

$$P_{\mu\nu} = -p^2 g_{\mu\nu} + \left( 1 - \frac{1}{\xi} \right) p_\mu p_\nu. \tag{5.33}$$

To find the inverse, one notices that as a tensor, $(P^{-1})^{\nu\rho}$ can only be a linear
combination of terms proportional to $g^{\nu\rho}$ and to $p^\nu p^\rho$. So, writing the most general
possible form for the answer:

$$(P^{-1})^{\nu\rho} = C_1\, g^{\nu\rho} + C_2\, p^\nu p^\rho \tag{5.34}$$

and requiring that

$$P_{\mu\nu} (P^{-1})^{\nu\rho} = \delta_\mu^\rho, \tag{5.35}$$

one finds the solution

$$C_1 = -\frac{1}{p^2}; \qquad C_2 = \frac{1 - \xi}{(p^2)^2}. \tag{5.36}$$

It follows that the desired Feynman propagator for a photon with momentum p μ is:

$$\frac{i}{p^2 + i\epsilon} \left[ -g_{\mu\nu} + (1 - \xi) \frac{p_\mu p_\nu}{p^2} \right]. \tag{5.37}$$

(Here we have put in the $i\epsilon$ factor in the denominator as usual.) In a Feynman diagram,
this propagator corresponds to an internal wavy line, labeled by μ and ν at opposite
ends, and carrying 4-momentum p. The gauge-fixing parameter ξ can be chosen at
the convenience or whim of the person computing the Feynman diagram. The most
popular choice for simple calculations is ξ = 1, called Feynman gauge. Then the
Feynman propagator for photons is simply:
$$\frac{-i g_{\mu\nu}}{p^2 + i\epsilon} \qquad \text{(Feynman gauge)}. \tag{5.38}$$

Another common choice is ξ = 0, known as Landau gauge, for which the Feynman
propagator is:

$$\frac{-i}{p^2 + i\epsilon} \left[ g_{\mu\nu} - \frac{p_\mu p_\nu}{p^2} \right] \qquad \text{(Landau gauge)}. \tag{5.39}$$

(Comparing to (5.32), we see that this is really obtained as a formal limit ξ → 0.)
The Landau gauge photon propagator has the nice property that it vanishes when con-
tracted with either p μ or p ν , which can make some calculations simpler (especially
certain loop diagram calculations). Sometimes it is useful to just leave ξ unspecified.
Even though this means having to calculate more terms, the payoff is that in the end
one can see if the final answer for the reduced matrix element is independent of ξ ,
providing a consistency check.
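The solution (5.36) can be checked numerically by multiplying the matrix (5.33) by the ansatz (5.34) for a few values of $\xi$. The sketch below is illustrative only: the function name `check_inverse` and the sample momentum are assumptions, and the only requirement on the momentum is that it be off-shell ($p^2 \neq 0$).

```python
G = [1.0, -1.0, -1.0, -1.0]  # diagonal metric, signature (+,-,-,-)

def check_inverse(p_up, xi):
    """Largest deviation of P_{mu nu} (P^-1)^{nu rho} from delta_mu^rho."""
    p_dn = [G[m] * p_up[m] for m in range(4)]
    p2 = sum(p_up[m] * p_dn[m] for m in range(4))
    # Matrix (5.33), with both indices lowered.
    P = [[-p2 * (G[m] if m == n else 0.0) + (1 - 1 / xi) * p_dn[m] * p_dn[n]
          for n in range(4)] for m in range(4)]
    # Ansatz (5.34) with the coefficients (5.36), both indices raised.
    c1, c2 = -1 / p2, (1 - xi) / p2**2
    P_inv = [[c1 * (G[n] if n == r else 0.0) + c2 * p_up[n] * p_up[r]
              for r in range(4)] for n in range(4)]
    worst = 0.0
    for m in range(4):
        for r in range(4):
            entry = sum(P[m][n] * P_inv[n][r] for n in range(4))
            worst = max(worst, abs(entry - (1.0 if m == r else 0.0)))
    return worst

p = [2.0, 0.3, -0.5, 1.1]  # arbitrary off-shell momentum, p^2 = 2.45
for xi in (1.0, 0.5, 3.0):
    print(xi, check_inverse(p, xi))
```

Each printed deviation should be at the level of floating-point rounding, for Feynman gauge ($\xi = 1$) and for the other sample values alike.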
We have now encountered most of the Feynman rules for QED. There are some
additional rules having to do with minus signs because of Fermi-Dirac statistics;
these can be understood by carefully considering the effects of anticommutation
relations for fermionic operators. In practice, one usually does not write out the
spinor indices explicitly. All of the rules are summarized in the following two pages
in a cookbook form. Of course, the best way to understand how the rules work is to
do some examples. That will be the subject of the next few sections.

Feynman Rules for QED

To find the contributions to the reduced matrix element M for a physical process
involving charged Dirac fermions and photons:
1. Draw all topologically distinct Feynman diagrams, with wavy lines representing
photons, and solid lines with arrows representing fermions, using the rules below
for external lines, internal lines, and interaction vertices. The arrow direction is
preserved when following each fermion line. Enforce four-momentum conserva-
tion at each vertex.
2. For external lines, write (with 4-momentum p μ always to the right, and spin
polarization s or λ as appropriate):

3. For internal fermion lines, write Feynman propagators:

with 4-momentum p μ along the arrow direction, and m the mass of the fermion.
For internal photon lines, write:

with 4-momentum p μ along either direction in the wavy line. (Use ξ = 1 for
Feynman gauge and ξ = 0 for Landau gauge.)

4. For the interaction vertex of a fermion f of charge Q f , write:

The vector index $\mu$ is to be contracted with the corresponding index on the photon
line to which it is connected. This will be either an external photon line factor of
$\epsilon_\mu$ or $\epsilon^*_\mu$, or an internal photon line propagator index. Note that the fermion $f$
coming into the vertex must be the same flavor as the fermion coming out of the
vertex; for example, there is no photon-muon-positron vertex.
5. For each loop momentum $\ell^\mu$ that is undetermined by four-momentum conservation
with fixed external-state momenta, perform an integration

$$\int \frac{d^4 \ell}{(2\pi)^4}. \tag{5.40}$$

Getting a finite answer from these loop integrations often requires that they be
regularized by introducing a cutoff or some other trick.
6. Put a factor of (−1) for each closed fermion loop.
7. To take into account suppressed spinor indices on fermion lines, write terms
involving spinors as follows. For fermion lines that go all the way through the
diagram, start at the end of each fermion line (as defined by the arrow direction)
with a factor $\bar u$ or $\bar v$, and write down factors of $\gamma^\mu$ or $(\slashed{p} + m)$ consecutively,
following the line backwards until a $u$ or $v$ spinor is reached. For closed fermion
loops, start at an arbitrary vertex on the loop, and follow the fermion line backwards
until the original point is reached; take a trace over the gamma matrices in
the closed loop.
8. If a Feynman diagram with one or more closed loops can be transformed into
an exact copy of itself by interchanging any number of internal lines through a
smooth deformation without moving the external lines, then there is an additional
symmetry factor of 1/N , where N is the number of distinct permutations of that
type.
9. After writing down the contributions from each diagram to the reduced matrix
element $\mathcal{M}$ according to the preceding rules, assign an additional relative minus
sign between different diagram contributions whenever the written ordering of
external state spinor wavefunctions $u$, $v$, $\bar u$, $\bar v$ differs by an odd permutation.

5.2 Electron-Positron Scattering

5.2.1 e− e+ → μ− μ+

In the next few sections we will study some of the basic scattering processes in QED,
using the Feynman rules found in the previous section and the general discussion
of cross-sections given in Sect. 4.5. These calculations will involve several thematic
tricks that are common to many Feynman diagram evaluations.
We begin with electron-positron annihilation into a muon-antimuon pair:

e− e+ → μ− μ+ .

We will calculate the differential and total cross-sections for this process in the
center-of-momentum frame, to leading order in the coupling $e$. Since the mass of
the muon (and the anti-muon) is about $m_\mu = 105.66$ MeV, this process requires a
center-of-momentum energy of at least $\sqrt{s} = 211.3$ MeV. By contrast, the mass of
the electron is only about $m_e = 0.511$ MeV, which we can therefore safely neglect.
The error made in doing so is far less than the error made by not including higher-
order corrections.
A good first step is to label the momentum and spin data for the initial state
electron and positron and the final state muon and anti-muon:

Particle    Momentum    Spin     Spinor

$e^-$       $p_a$       $s_a$    $u(p_a, s_a)$
$e^+$       $p_b$       $s_b$    $\bar v(p_b, s_b)$        (5.41)
$\mu^-$     $k_1$       $s_1$    $\bar u(k_1, s_1)$
$\mu^+$     $k_2$       $s_2$    $v(k_2, s_2)$

At order e2 , there is only one Feynman diagram for this process. Here it is:

Applying the rules for QED to turn this picture into a formula for the reduced matrix
element, we find:

$$\mathcal{M} = \left[ \bar v(p_b, s_b)\, (ie\gamma^\mu)\, u(p_a, s_a) \right] \left[ \frac{-i g_{\mu\nu}}{(p_a + p_b)^2} \right] \left[ \bar u(k_1, s_1)\, (ie\gamma^\nu)\, v(k_2, s_2) \right]. \tag{5.42}$$

The $\bar v(p_b, s_b)\, (ie\gamma^\mu)\, u(p_a, s_a)$ part is obtained by starting at the end of the
electron-positron line with the positron external state spinor, and following it back to its
beginning. The interaction vertex is $-iQe\gamma^\mu = ie\gamma^\mu$, since the charge of the electron
and muon is $Q = -1$. Likewise, the $\bar u(k_1, s_1)\, (ie\gamma^\nu)\, v(k_2, s_2)$ part is obtained
by starting at the end of the muon-antimuon line with the muon external state spinor,
and following it backwards. The photon propagator is written in Feynman gauge, for
simplicity, and carries indices $\mu$ and $\nu$ that connect to the two fermion lines at their
respective interaction vertices.
We can write this result more compactly by using abbreviations $\bar v(p_b, s_b) \equiv \bar v_b$
and $u(p_a, s_a) \equiv u_a$, etc. Writing the denominator of the photon propagator as the
Mandelstam variable $s = (p_a + p_b)^2 = (k_1 + k_2)^2$, and using the metric in the photon
propagator to lower the index on one of the gamma matrices, we get:

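$$\mathcal{M} = i \frac{e^2}{s}\, (\bar v_b \gamma_\mu u_a)(\bar u_1 \gamma^\mu v_2). \tag{5.43}$$

The differential cross-section involves the complex square of $\mathcal{M}$:

$$|\mathcal{M}|^2 = \frac{e^4}{s^2}\, (\bar v_b \gamma_\mu u_a)(\bar u_1 \gamma^\mu v_2)(\bar v_b \gamma_\nu u_a)^* (\bar u_1 \gamma^\nu v_2)^*. \tag{5.44}$$

Evaluating the complex conjugated terms in parentheses can be done systematically
by taking the Hermitian conjugate of the Dirac spinors and matrices they are made
of, taking care to write them in the reverse order. So, for example,

$$(\bar v_b \gamma_\nu u_a)^* = (v_b^\dagger \gamma^0 \gamma_\nu u_a)^* = u_a^\dagger \gamma_\nu^\dagger \gamma^0 v_b = u_a^\dagger \gamma^0 \gamma_\nu v_b = \bar u_a \gamma_\nu v_b. \tag{5.45}$$

The third equality follows from the identity (A.2.5), which implies $\gamma_\nu^\dagger \gamma^0 = \gamma^0 \gamma_\nu$.
Similarly,

$$(\bar u_1 \gamma^\nu v_2)^* = \bar v_2 \gamma^\nu u_1. \tag{5.46}$$

[Following the same strategy, one can show that in general,

$$(\bar x\, \gamma_\mu \gamma_\nu \cdots \gamma_\rho\, y)^* = \bar y\, \gamma_\rho \cdots \gamma_\nu \gamma_\mu\, x \tag{5.47}$$

where $x$ and $y$ are any $u$ and $v$ spinors.] Therefore we have:

$$|\mathcal{M}|^2 = \frac{e^4}{s^2}\, (\bar v_b \gamma_\mu u_a)(\bar u_a \gamma_\nu v_b)\, (\bar u_1 \gamma^\mu v_2)(\bar v_2 \gamma^\nu u_1). \tag{5.48}$$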
At this point, we could work out explicit forms for the external state spinors and
plug (5.48) into (4.190) to find the differential cross-section for any particular set of
spins. However, this is not very convenient, and fortunately it is not necessary either.
In a real experiment, the final state spins of the muon and anti-muon are typically not
measured. Therefore, to find the total cross-section for all possible final states, we
should sum over s1 and s2 . Also, if the initial-state electron spin states are unknown,
we should average over $s_a$ and $s_b$. (One must average, not sum, over the initial-state
spins, because $s_a$ and $s_b$ cannot simultaneously take on both spin-up and spin-down
values; there is only one initial state, even if it is unknown.) These spin sums and
averages will allow us to exploit the identities (3.109) and (3.110) (also listed in
Appendix B as (A.2.29) and (A.2.30)), so that the explicit forms for the spinors are
never needed.
After doing the spin sum and average, the differential cross-section must be sym-
metric under rotations about the collision axis. This is because the only special direc-
tions in the problem are the momenta of the particles, so that the cross-section can
only depend on the angle θ between the collision axis determined by the two initial-
state particles and the scattering axis determined by the two final-state particles. So,
we can apply (4.191) to obtain:

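$$\frac{d\sigma}{d(\cos\theta)} = \left( \frac{1}{2} \sum_{s_a} \right) \left( \frac{1}{2} \sum_{s_b} \right) \sum_{s_1} \sum_{s_2} |\mathcal{M}|^2\, \frac{|\mathbf{k}_1|}{32\pi s\, |\mathbf{p}_a|} \tag{5.49}$$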

in the center-of-momentum frame, with the effects of initial state spin averaging and
final state spin summing now included.
We can now use (A.2.29) and (A.2.30), which in the present situation imply

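$$\sum_{s_a} u_a \bar u_a = \slashed{p}_a + m_e, \tag{5.50}$$

$$\sum_{s_2} v_2 \bar v_2 = \slashed{k}_2 - m_\mu. \tag{5.51}$$

Neglecting $m_e$ as promised earlier, we obtain:

$$\frac{1}{2} \sum_{s_a} \frac{1}{2} \sum_{s_b} \sum_{s_1} \sum_{s_2} |\mathcal{M}|^2 = \frac{e^4}{4 s^2} \sum_{s_b} \sum_{s_1} (\bar v_b \gamma_\mu \slashed{p}_a \gamma_\nu v_b)\, (\bar u_1 \gamma^\mu [\slashed{k}_2 - m_\mu] \gamma^\nu u_1). \tag{5.52}$$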

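Now we apply another trick. A dot product of two vectors is equal to the trace of the
vectors multiplied in the opposite order to form a matrix:

$$\begin{pmatrix} a_1 & a_2 & a_3 & a_4 \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix} = {\rm Tr} \left[ \begin{pmatrix} b_1 \\ b_2 \\ b_3 \\ b_4 \end{pmatrix} \begin{pmatrix} a_1 & a_2 & a_3 & a_4 \end{pmatrix} \right] \equiv {\rm Tr} \begin{pmatrix} b_1 a_1 & b_1 a_2 & b_1 a_3 & b_1 a_4 \\ b_2 a_1 & b_2 a_2 & b_2 a_3 & b_2 a_4 \\ b_3 a_1 & b_3 a_2 & b_3 a_3 & b_3 a_4 \\ b_4 a_1 & b_4 a_2 & b_4 a_3 & b_4 a_4 \end{pmatrix}. \tag{5.53}$$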
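The identity (5.53) is easy to confirm on a concrete pair of four-component vectors. The numbers below are arbitrary illustrative values, not taken from the text.

```python
# Two arbitrary four-component vectors (illustrative values only).
a = [1.0, -2.0, 0.5, 3.0]
b = [4.0, 0.25, -1.0, 2.0]

# Left side of (5.53): row vector a times column vector b.
dot_product = sum(x * y for x, y in zip(a, b))

# Right side of (5.53): column b times row a gives a 4x4 matrix; take its trace.
outer = [[b_i * a_j for a_j in a] for b_i in b]
trace = sum(outer[i][i] for i in range(4))

print(dot_product, trace)
```

Only the diagonal entries $b_i a_i$ of the outer product survive in the trace, which is why the two numbers coincide.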

Applying this to each expression in parentheses in (5.52), we move the barred spinor
(thought of as a row vector) to the end and take the trace over the resulting 4 × 4
Dirac spinor matrix. So:

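$$\bar v_b \gamma_\mu \slashed{p}_a \gamma_\nu v_b = {\rm Tr}[\gamma_\mu \slashed{p}_a \gamma_\nu v_b \bar v_b], \tag{5.54}$$

$$\bar u_1 \gamma^\mu [\slashed{k}_2 - m_\mu] \gamma^\nu u_1 = {\rm Tr}[\gamma^\mu (\slashed{k}_2 - m_\mu) \gamma^\nu u_1 \bar u_1], \tag{5.55}$$

so that

$$\frac{1}{4} \sum_{\rm spins} |\mathcal{M}|^2 = \frac{e^4}{4 s^2} \sum_{s_b} \sum_{s_1} {\rm Tr}[\gamma_\mu \slashed{p}_a \gamma_\nu v_b \bar v_b]\; {\rm Tr}[\gamma^\mu (\slashed{k}_2 - m_\mu) \gamma^\nu u_1 \bar u_1]. \tag{5.56}$$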

The reason this trick of rearranging into a trace is useful is that now we can once
again exploit the spin-sum identities (A.2.29) and (A.2.30), this time in the form:

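$$\sum_{s_b} v_b \bar v_b = \slashed{p}_b - m_e, \tag{5.57}$$

$$\sum_{s_1} u_1 \bar u_1 = \slashed{k}_1 + m_\mu. \tag{5.58}$$

The result, again neglecting $m_e$, is:

$$\frac{1}{4} \sum_{\rm spins} |\mathcal{M}|^2 = \frac{e^4}{4 s^2}\, {\rm Tr}[\gamma_\mu \slashed{p}_a \gamma_\nu \slashed{p}_b]\; {\rm Tr}[\gamma^\mu (\slashed{k}_2 - m_\mu) \gamma^\nu (\slashed{k}_1 + m_\mu)]. \tag{5.59}$$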

Next we must evaluate the traces. First,

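$${\rm Tr}[\gamma_\mu \slashed{p}_a \gamma_\nu \slashed{p}_b] = p_a^\alpha p_b^\beta\, {\rm Tr}[\gamma_\mu \gamma_\alpha \gamma_\nu \gamma_\beta] \tag{5.60}$$

$$= p_a^\alpha p_b^\beta\, (4 g_{\mu\alpha} g_{\nu\beta} - 4 g_{\mu\nu} g_{\alpha\beta} + 4 g_{\mu\beta} g_{\nu\alpha}) \tag{5.61}$$

$$= 4 p_{a\mu} p_{b\nu} - 4 g_{\mu\nu}\, p_a \cdot p_b + 4 p_{b\mu} p_{a\nu}, \tag{5.62}$$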

where we have used the general result for the trace of four gamma matrices listed in
(A.2.10). Similarly, making use of the fact that the trace of an odd number of gamma
matrices is zero:

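$${\rm Tr}[\gamma^\mu (\slashed{k}_2 - m_\mu) \gamma^\nu (\slashed{k}_1 + m_\mu)] = {\rm Tr}[\gamma^\mu \slashed{k}_2 \gamma^\nu \slashed{k}_1] - m_\mu^2\, {\rm Tr}[\gamma^\mu \gamma^\nu] \tag{5.63}$$

$$= 4 k_2^\mu k_1^\nu - 4 g^{\mu\nu}\, k_1 \cdot k_2 + 4 k_1^\mu k_2^\nu - 4 g^{\mu\nu} m_\mu^2, \tag{5.64}$$

where (A.2.8)–(A.2.10) have been used. Taking the product of the two traces, one
finds that the answer reduces to simply:

$$\frac{1}{4} \sum_{\rm spins} |\mathcal{M}|^2 = \frac{8 e^4}{s^2} \left[ (p_a \cdot k_2)(p_b \cdot k_1) + (p_a \cdot k_1)(p_b \cdot k_2) + (p_a \cdot p_b)\, m_\mu^2 \right]. \tag{5.65}$$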

Our next task is to work out the kinematic quantities appearing in (5.65). Let
us call P and K the magnitudes of the 3-momenta of the electron and the muon,
respectively. We assume that the electron is initially moving in the +z direction, and
the muon makes an angle θ with respect to the positive z axis, within the yz plane.
The on-shell conditions for the particles are:

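$$p_a^2 = p_b^2 = m_e^2 \approx 0, \tag{5.66}$$

$$k_1^2 = k_2^2 = m_\mu^2. \tag{5.67}$$

Then we have:

$$p_a = (P, 0, 0, P), \tag{5.68}$$

$$p_b = (P, 0, 0, -P), \tag{5.69}$$

$$k_1 = \left( \sqrt{K^2 + m_\mu^2},\, 0,\, K \sin\theta,\, K \cos\theta \right), \tag{5.70}$$

$$k_2 = \left( \sqrt{K^2 + m_\mu^2},\, 0,\, -K \sin\theta,\, -K \cos\theta \right). \tag{5.71}$$

Since $(p_a + p_b)^2 = (k_1 + k_2)^2 = s$, it follows that

$$P = \sqrt{\frac{s}{4}}, \tag{5.72}$$

$$K = \sqrt{\frac{s}{4} - m_\mu^2}, \tag{5.73}$$

and

$$p_a \cdot p_b = 2P^2 = \frac{s}{2}, \tag{5.74}$$

$$p_a \cdot k_1 = p_b \cdot k_2 = P \sqrt{K^2 + m_\mu^2} - P K \cos\theta = \frac{s}{4} \left[ 1 - \cos\theta \sqrt{1 - \frac{4 m_\mu^2}{s}} \right], \tag{5.75}$$

$$p_a \cdot k_2 = p_b \cdot k_1 = P \sqrt{K^2 + m_\mu^2} + P K \cos\theta = \frac{s}{4} \left[ 1 + \cos\theta \sqrt{1 - \frac{4 m_\mu^2}{s}} \right]. \tag{5.76}$$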

Putting these results into (5.65), we get:

$$\frac{1}{4} \sum_{\rm spins} |\mathcal{M}|^2 = e^4 \left[ 1 + \frac{4 m_\mu^2}{s} + \left( 1 - \frac{4 m_\mu^2}{s} \right) \cos^2\theta \right]. \tag{5.77}$$
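The chain from the trace expression (5.59) through the dot products (5.65) to the closed form (5.77) can be checked numerically with explicit gamma matrices. The sketch below makes several assumptions that are not in the text: it uses the Dirac representation, plain nested lists for the 4 × 4 matrices, and a hypothetical function `check_spin_averaged_amplitude` that returns the spin-averaged $|\mathcal{M}|^2$ divided by $e^4$ computed in the three ways.

```python
import math

G = [1.0, -1.0, -1.0, -1.0]  # diagonal metric, signature (+,-,-,-)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def lincomb(ca, a, cb, b):
    return [[ca * a[i][j] + cb * b[i][j] for j in range(4)] for i in range(4)]

def trace_of_product(*mats):
    out = mats[0]
    for m in mats[1:]:
        out = matmul(out, m)
    return sum(out[i][i] for i in range(4))

def gamma_spatial(sigma):
    """gamma^i = [[0, sigma_i], [-sigma_i, 0]] (Dirac representation, assumed)."""
    g = [[0j] * 4 for _ in range(4)]
    for i in range(2):
        for j in range(2):
            g[i][j + 2] = sigma[i][j]
            g[i + 2][j] = -sigma[i][j]
    return g

IDENT = [[1.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
GAMMA_UP = [
    [[(1.0, 1.0, -1.0, -1.0)[i] if i == j else 0.0 for j in range(4)]
     for i in range(4)],
    gamma_spatial([[0, 1], [1, 0]]),
    gamma_spatial([[0, -1j], [1j, 0]]),
    gamma_spatial([[1, 0], [0, -1]]),
]

def slash(p):
    """p-slash = gamma^mu p_mu for a contravariant four-vector p^mu."""
    out = [[0j] * 4 for _ in range(4)]
    for mu in range(4):
        out = lincomb(1, out, G[mu] * p[mu], GAMMA_UP[mu])
    return out

def dot(p, q):
    return sum(G[mu] * p[mu] * q[mu] for mu in range(4))

def check_spin_averaged_amplitude(s, m, theta):
    """Return (1/4) sum |M|^2 / e^4 from (5.59), (5.65) and (5.77)."""
    P = E = math.sqrt(s) / 2
    K = math.sqrt(s / 4 - m * m)
    pa, pb = [P, 0, 0, P], [P, 0, 0, -P]
    k1 = [E, 0, K * math.sin(theta), K * math.cos(theta)]
    k2 = [E, 0, -K * math.sin(theta), -K * math.cos(theta)]
    muon_a = lincomb(1, slash(k2), -m, IDENT)  # k2-slash - m
    muon_b = lincomb(1, slash(k1), +m, IDENT)  # k1-slash + m
    total = 0
    for mu in range(4):
        for nu in range(4):
            t1 = trace_of_product(GAMMA_UP[mu], slash(pa), GAMMA_UP[nu], slash(pb))
            t2 = trace_of_product(GAMMA_UP[mu], muon_a, GAMMA_UP[nu], muon_b)
            total += G[mu] * G[nu] * t1 * t2  # metric factors lower mu and nu
    from_traces = total / (4 * s * s)
    from_dots = 8 / s**2 * (dot(pa, k2) * dot(pb, k1)
                            + dot(pa, k1) * dot(pb, k2) + dot(pa, pb) * m * m)
    closed = 1 + 4 * m * m / s + (1 - 4 * m * m / s) * math.cos(theta) ** 2
    return from_traces, from_dots, closed

# Sample point: sqrt(s) = 1 GeV, muon mass in GeV, scattering angle 0.7 rad.
print(check_spin_averaged_amplitude(1.0, 0.10566, 0.7))
```

The three returned numbers should agree up to rounding (the first may carry a negligible imaginary part). The result does not depend on the choice of gamma-matrix representation, so the Dirac representation is used here only for convenience.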

Finally we can plug this into (5.49):

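$$\frac{d\sigma}{d(\cos\theta)} = \left( \frac{1}{4} \sum_{\rm spins} |\mathcal{M}|^2 \right) \frac{K}{32\pi s P} \tag{5.78}$$

$$= \frac{e^4}{32\pi s} \sqrt{1 - \frac{4 m_\mu^2}{s}} \left[ 1 + \frac{4 m_\mu^2}{s} + \left( 1 - \frac{4 m_\mu^2}{s} \right) \cos^2\theta \right], \tag{5.79}$$

or, rewriting in terms of the fine structure constant $\alpha = e^2/4\pi$,

$$\frac{d\sigma}{d(\cos\theta)} = \frac{\pi \alpha^2}{2s} \sqrt{1 - \frac{4 m_\mu^2}{s}} \left[ 1 + \frac{4 m_\mu^2}{s} + \left( 1 - \frac{4 m_\mu^2}{s} \right) \cos^2\theta \right]. \tag{5.80}$$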
Doing the integral over $\cos\theta$ using $\int_{-1}^{1} d(\cos\theta) = 2$ and $\int_{-1}^{1} \cos^2\theta\; d(\cos\theta) = 2/3$,
we find the total cross-section:

$$\sigma = \frac{4\pi \alpha^2}{3s} \sqrt{1 - \frac{4 m_\mu^2}{s}} \left( 1 + \frac{2 m_\mu^2}{s} \right). \tag{5.81}$$

It is a useful check that the cross-section has units of area; recall that when $c = \hbar = 1$,
then $s = E_{\rm CM}^2$ has units of mass$^2$ or length$^{-2}$.

Equations (5.80) and (5.81) have been tested in many experiments, and correctly
predict the rate of production of muon-antimuon pairs at electron-positron colliders.
Let us examine some special limiting cases. Near the energy threshold for μ+ μ−
production, one may expand in the quantity
$$\Delta E = \sqrt{s} - 2 m_\mu. \tag{5.82}$$

To leading order in small $\Delta E$, (5.81) becomes

$$\sigma \approx \frac{\pi \alpha^2}{2 m_\mu^2} \sqrt{\frac{\Delta E}{m_\mu}}. \tag{5.83}$$

The cross-section therefore rises like the square root of the energy excess over the
threshold. However, going to increasing energy, $\sigma$ quickly levels off because of
the $1/s$ factors in (5.81). Maximizing with respect to $s$, one finds that the largest
cross-section in (5.81) is reached for $\sqrt{s} = \sqrt{1 + \sqrt{21}}\; m_\mu \approx 2.36\, m_\mu$, and is about

$$\sigma_{\rm max} = 0.54\, \alpha^2 / m_\mu^2 = 1000 \text{ nb}. \tag{5.84}$$
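The location and height of the maximum, and the threshold behavior (5.83), can be reproduced numerically from (5.81). The sketch below makes a few assumptions beyond the text: the conversion factor $(\hbar c)^2 \approx 0.3894$ GeV$^2$ mb is supplied by hand, and the helper names (`sigma_natural`, `argmax_ternary`) are illustrative.

```python
import math

ALPHA = 1 / 137.036       # low-energy fine structure constant, as in (5.4)
M_MU = 0.10566            # muon mass in GeV, as quoted in the text
GEV2_TO_NB = 0.3893794e6  # (hbar c)^2 in GeV^2 nb; assumed conversion factor

def sigma_natural(sqrt_s, m=M_MU, alpha=ALPHA):
    """Total cross-section (5.81) in GeV^-2; zero at or below threshold."""
    s = sqrt_s ** 2
    if s <= 4 * m * m:
        return 0.0
    return (4 * math.pi * alpha**2 / (3 * s)
            * math.sqrt(1 - 4 * m * m / s) * (1 + 2 * m * m / s))

def argmax_ternary(f, lo, hi, iterations=200):
    """Ternary search for the maximum of a single-peaked function."""
    for _ in range(iterations):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if f(m1) < f(m2):
            lo = m1
        else:
            hi = m2
    return (lo + hi) / 2

peak = argmax_ternary(sigma_natural, 2 * M_MU, 10 * M_MU)
peak_ratio = peak / M_MU                                  # compare with sqrt(1 + sqrt(21))
peak_coeff = sigma_natural(peak) * M_MU**2 / ALPHA**2     # compare with 0.54
peak_nb = sigma_natural(peak) * GEV2_TO_NB                # compare with about 1000 nb
print(peak_ratio, peak_coeff, peak_nb)

# Threshold check of (5.83) at a small energy excess Delta E = 1e-4 m_mu.
delta_e = 1e-4 * M_MU
approx = math.pi * ALPHA**2 / (2 * M_MU**2) * math.sqrt(delta_e / M_MU)
threshold_ratio = sigma_natural(2 * M_MU + delta_e) / approx
print(threshold_ratio)
```

The search interval starts exactly at threshold, where the cross-section vanishes, and extends far enough above it that the single peak lies inside. The last printed ratio should be very close to 1, with the small deficit reflecting the higher-order terms in $\Delta E / m_\mu$ that (5.83) drops.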

In the high energy limit $\sqrt{s} \gg m_\mu$, the cross-section decreases proportional to $1/s$:

$$\sigma \approx \frac{4\pi \alpha^2}{3s}. \tag{5.85}$$

However, this formula is not good for arbitrarily high values of the center-of-momentum
energy, because there is another diagram in which the photon is replaced
by a $Z^0$ boson. This effect becomes important when $\sqrt{s}$ is not small compared to
$m_Z = 91.1876$ GeV.

5.2.2 e− e+ → f f̄

In the last section, we calculated the cross-section for producing a muon-antimuon
pair in $e^- e^+$ collisions. We can easily generalize this to the case of production of
any charged fermion $f$ and anti-fermion $\bar f$. The Feynman diagram for this process
is obtained by simply replacing the muon-antimuon line by an $f$-$\bar f$ line:

The reduced matrix element for this process has exactly the same form as for
$e^- e^+ \to \mu^- \mu^+$, except that the photon-$\mu^-$-$\mu^+$ vertex is replaced by a photon-$f$-$\bar f$ vertex, with:

$$ie\gamma^\nu \longrightarrow -i Q_f e \gamma^\nu, \tag{5.86}$$

where $Q_f$ is the charge of the fermion $f$. In the case of quarks, there are three
indistinguishable colors for each flavor (up, down, strange, charm, bottom, top). The
photon-quark-antiquark vertex is diagonal in color, so the three colors are simply
summed over in order to find the total cross-section for a given flavor. In general, if
we call $n_f$ the number of colors (or perhaps other non-spin degrees of freedom) of
the fermion $f$, then we have:

$$\sum_{\rm spins} |\mathcal{M}_{e^- e^+ \to f \bar f}|^2 = n_f Q_f^2 \sum_{\rm spins} |\mathcal{M}_{e^- e^+ \to \mu^- \mu^+}|^2, \tag{5.87}$$

where it is understood that $m_\mu$ should be replaced by $m_f$ everywhere in
$\sum_{\rm spins} |\mathcal{M}_{e^- e^+ \to f \bar f}|^2$. It follows that the differential and total cross-sections for
$e^- e^+ \to f \bar f$ are obtained from (5.80) and (5.81) by just multiplying by $n_f Q_f^2$ and replacing
$m_\mu \to m_f$:

$$\sigma_{e^- e^+ \to f \bar f} = n_f Q_f^2\, \frac{4\pi \alpha^2}{3s} \sqrt{1 - \frac{4 m_f^2}{s}} \left( 1 + \frac{2 m_f^2}{s} \right). \tag{5.88}$$

In the high-energy limit $\sqrt{s} \gg m_f$, we have:

$$\sigma_{e^- e^+ \to f \bar f} = n_f Q_f^2\, \frac{4\pi \alpha^2}{3s}. \tag{5.89}$$

Figure 5.1 compares the total cross-section for $e^+ e^- \to f \bar f$ (solid line) as given by
(5.88) to the asymptotic approximation (dashed line) given by (5.89).

Fig. 5.1 The cross-section for $e^- e^+ \to f \bar f$, as a function of the center of
momentum energy. The solid line is (5.88), and the dashed line is the asymptotic limit of
(5.89). [Plot of $\sigma$ against $E_{\rm CM}/m_f$ from 0 to 5; the marked level on the $\sigma$ axis is $0.54\, n_f Q_f^2 \alpha^2 / m_f^2$.]

We see that the true cross-section is always less than the asymptotic approximation,
but the two already agree fairly well when $\sqrt{s} \gtrsim 2.5\, m_f$. This means that when
several fermions contribute, the total cross-section well above threshold is just equal
to the sum of $n_f Q_f^2$ for the available states times a factor $4\pi \alpha^2 / 3s$. For example,
the up quark has charge $Q_u = +2/3$, and there are three colors, so the prefactor
indicated above is $3 (2/3)^2 = 4/3$. The prefactors for all of the fundamental charged
fermion types with masses less than $m_Z$ are:

up, charm quarks: $Q_f = 2/3$, $n_f = 3$ $\longrightarrow$ $n_f Q_f^2 = 4/3$
down, strange, bottom quarks: $Q_f = -1/3$, $n_f = 3$ $\longrightarrow$ $n_f Q_f^2 = 1/3$
muon, tau leptons: $Q_f = -1$, $n_f = 1$ $\longrightarrow$ $n_f Q_f^2 = 1$.

However, free quarks are not seen in nature because the QCD color force confines
them within color-singlet hadrons. This means that the cross-section for the
quark-antiquark production process $e^- e^+ \to Q \bar Q$ cannot easily be interpreted in terms of
specific particles in the final state. Instead, one should view the quark production as
a microscopic process, occurring at a distance scale much smaller than a typical
hadron. Before we “see” them in macroscopic-sized detectors, the produced quarks
then undergo further strong interactions that end up producing hadronic jets of par-
ticles with momenta close to those of the original quarks. This always involves at
least the further production of a quark-antiquark pair in order to make the final state
hadrons color singlets. A Feynman-diagram cartoon of the situation might look as
shown in Fig. 5.2. Because the hadronic interactions are most important at the
strong-interaction energy scale of a few hundred MeV, the calculation of the cross-section
can only be trusted for energies that are significantly higher than this. When $\sqrt{s} \gg 1$
GeV, one can make the approximation:

$$\sigma(e^- e^+ \to \text{hadrons}) \approx \sum_q \sigma(e^- e^+ \to q \bar q). \tag{5.90}$$

The final state can be quite complicated, so to test QED production of quarks, one can
just measure the total cross-section for producing hadrons. The traditional measure
of the total hadronic cross-section is the variable Rhadrons , defined as the ratio:

$$R_{\rm hadrons} = \frac{\sigma(e^- e^+ \to \text{hadrons})}{\sigma(e^- e^+ \to \mu^- \mu^+)}. \tag{5.91}$$

Fig. 5.2 Hadronic jet production in e− e+ collisions

When the approximation is valid, one can always produce up, down and strange
quarks, which all have masses < 1 GeV. The threshold to produce charm-anticharm
quarks occurs roughly when $\sqrt{s} > 2m_c \approx 3$ GeV, and that to produce
bottom-antibottom quarks is at roughly $\sqrt{s} > 2m_b \approx 10$ GeV. As each of these
thresholds is passed, one gets a contribution to $R_{\rm hadrons}$ that is approximately a
constant proportional to $n_f Q_f^2$. So, for $\sqrt{s} < 3$ GeV, one has

$$ R_{\rm hadrons} = \frac{4}{3} + \frac{1}{3} + \frac{1}{3} = 2 \qquad (u, d, s \text{ quarks}). \tag{5.92} $$

For 3 GeV $< \sqrt{s} <$ 10 GeV, the charm quark contributes, and the ratio is

$$ R_{\rm hadrons} = \frac{4}{3} + \frac{1}{3} + \frac{1}{3} + \frac{4}{3} = \frac{10}{3} \qquad (u, d, s, c \text{ quarks}). \tag{5.93} $$

Finally, for $\sqrt{s} > 10$ GeV, we get

$$ R_{\rm hadrons} = \frac{4}{3} + \frac{1}{3} + \frac{1}{3} + \frac{4}{3} + \frac{1}{3} = \frac{11}{3} \qquad (u, d, s, c, b \text{ quarks}). \tag{5.94} $$
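The step structure of (5.92)–(5.94) can be reproduced with a few lines of Python. This is only an illustrative sketch: the table name `QUARKS`, the function name `r_hadrons`, and the use of sharp thresholds at 3 GeV and 10 GeV (the rough values quoted above) are assumptions made here, and resonances and QCD corrections are ignored.

```python
from fractions import Fraction

# (name, electric charge Q_f, rough pair-production threshold in GeV).
# The thresholds are the approximate values 2*m_c ~ 3 GeV and 2*m_b ~ 10 GeV
# quoted in the text; light quarks are treated as always available.
QUARKS = [
    ("u", Fraction(2, 3), 0.0),
    ("d", Fraction(-1, 3), 0.0),
    ("s", Fraction(-1, 3), 0.0),
    ("c", Fraction(2, 3), 3.0),
    ("b", Fraction(-1, 3), 10.0),
]
N_COLORS = 3  # n_f = 3 for every quark flavor


def r_hadrons(sqrt_s_gev):
    """Continuum estimate: sum of n_f * Q_f^2 over quarks above threshold."""
    return sum(N_COLORS * q**2 for _, q, threshold in QUARKS
               if sqrt_s_gev > threshold)


for energy in (2.0, 5.0, 20.0):
    print(f"sqrt(s) = {energy} GeV: R_hadrons = {r_hadrons(energy)}")
```

Exact fractions are used so that the results can be compared directly with 2, 10/3 and 11/3.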
Besides these “continuum” contributions to $R_{\rm hadrons}$, there are resonant contributions
that come from $e^- e^+ \to$ hadronic bound states. These bound states tend to have
very large, but narrow, production cross-sections when $\sqrt{s}$ is in just the right energy
range to produce them. For example, when the bound state consists of a charm
and anticharm quark, one gets the $J/\psi$ particle resonance at $\sqrt{s} = 3.096916$ GeV,
with a width of 0.00093 GeV. These resonances contribute very sharp peaks to the
measured $R_{\rm hadrons}$. Experimentally, $R_{\rm hadrons}$ is quite hard to measure, being plagued
by systematic detector effects. Many of the older experiments at lower energy tended
to underestimate the systematic uncertainties. Figure 5.3 shows plots of the data from
RPP 2022. The approximate agreement with the predictions for $R_{\rm hadrons}$ in (5.92)–(5.94)
provides a crucial test of the quark model of hadrons, including the charges
of the quarks and the number of colors.

5.2.3 Helicities in e− e+ → μ− μ+

Up to now, we have computed the cross-section by averaging over the unknown

spins of the initial state electron and positron. However, some e− e+ colliders can
control the initial spin states, using polarized beams. This means that the beams
are arranged to have an excess of either L-handed or R-handed helicity electrons
and positrons. Practical realities make it impossible to achieve 100%-pure polarized
beams, of course. At a proposed future Linear Collider, it is a very important part of
the experimental program to be able to run with at least the electron beam polarized.
Present estimates are that one might be able to get 90% or 95% pure polarization
for the electron beam (either L or R), with perhaps 60% polarization for the positron
beam. This terminology means that when the beam is operating in R mode, then a
polarization of X % implies that

$$ P_R - P_L = X\% \tag{5.95} $$

where $P_R$ and $P_L$ are the probabilities of measuring the spin pointing along and
against the 3-momentum direction, respectively. This experimental capability shows
that one needs to be able to calculate cross-sections without assuming that the initial
spin state is random and averaged over.
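As a small numerical illustration of (5.95), the sketch below converts a quoted polarization into the two helicity probabilities. It assumes, in addition to (5.95), that $P_R + P_L = 1$ (every particle in the beam is found in one of the two helicity states); the function name `helicity_probabilities` is introduced here for illustration only.

```python
def helicity_probabilities(polarization):
    """Return (P_R, P_L) for a beam run in R mode with P_R - P_L = polarization.

    Assumes P_R + P_L = 1, which together with (5.95) fixes both numbers.
    The argument is a fraction between 0 and 1 (0.90 means 90%).
    """
    if not 0.0 <= polarization <= 1.0:
        raise ValueError("polarization must lie between 0 and 1")
    p_r = 0.5 * (1.0 + polarization)
    return p_r, 1.0 - p_r


for x in (0.60, 0.90, 0.95):
    p_r, p_l = helicity_probabilities(x)
    print(f"X = {x:.0%}: P_R = {p_r:.3f}, P_L = {p_l:.3f}")
```

So a "90% polarized" R beam in this terminology still contains a 5% admixture of L-handed particles.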
We could redo the calculation of the previous sections with particular spinors
$u(p, s)$ and $v(p, s)$ for the desired specific spin states $s$ of the initial state particles.
However, then we would lose our precious trick of evaluating $\sum_s u\bar{u}$ and $\sum_s v\bar{v}$. A
nicer way is to keep the sum over spins, but eliminate the “wrong” polarization from
the sum using a projection matrix from (3.89). So, for example, we can use

$P_L u(p, s)$ ←→ L initial-state particle, (5.96)

$P_R u(p, s)$ ←→ R initial-state particle, (5.97)

in place of the usual Feynman rules for an initial state particle. Summing over the
spin s will not change the fact that the projection matrix allows only L- or R-handed
electrons to contribute to the cross-section. Now our traces over gamma matrices
will involve γ5 , because of the explicit expressions for PL and PR (see (3.89)).

Fig. 5.3 $R_{\rm had} = \sigma(e^+ e^- \to \text{hadrons})/\sigma(e^+ e^- \to \mu^+ \mu^-)$ vs. $\sqrt{s}$ (from RPP 2022)

To get the equivalent rules for an initial state antiparticle, we must remember that
the spin operator acting on $v(p, s)$ spinors is the opposite of the spin operator acting
on $u(p, s)$ spinors. Therefore, $P_L$ acting on a $v(p, s)$ spinor projects onto a R-handed
antiparticle. So if we form the object $\bar{v}(p, s)P_R = v^\dagger(p, s)\gamma^0 P_R = v^\dagger(p, s)P_L\gamma^0$,
the result must describe a R-handed positron; in this case, the bar on the spinor for an
antiparticle “corrects” the handedness. So, for an initial state antiparticle with either
L or R polarization, we can use:

$\bar{v}(p, s)P_L$ ←→ L initial-state anti-particle, (5.98)

$\bar{v}(p, s)P_R$ ←→ R initial-state anti-particle. (5.99)

In some cases, one can also measure the polarizations of outgoing particles, for
example by observing their decays. Tau leptons and anti-taus sometimes decay by
the weak interaction processes:

$$ \tau^- \to \ell^- \nu_\tau \bar{\nu}_\ell, \tag{5.100} $$
$$ \tau^+ \to \ell^+ \bar{\nu}_\tau \nu_\ell, \tag{5.101} $$

where $\ell$ is either $e$ or $\mu$, with the angular distributions of the final state directions
depending on the spin of the τ , which may be one of the final state fermions in a
scattering or decay process of interest. If the polarization of a final-state fermion is
fixed by measurement, then we need to use:

$\bar{u}(p, s)P_R$ ←→ L final-state particle, (5.102)

$\bar{u}(p, s)P_L$ ←→ R final-state particle, (5.103)
$P_R v(p, s)$ ←→ L final-state anti-particle, (5.104)
$P_L v(p, s)$ ←→ R final-state anti-particle, (5.105)

in order to be able to calculate cross-sections to final states with specific L or R spin

polarization states. (As a result of the barred spinor notation, the general rule is that
the projection matrix in an initial state has the same handedness as the incoming
particle or antiparticle, while the projection matrix in a final state has the opposite
handedness of the particle being produced.)
As an example, let us consider the process:

$$ e^-_R e^+_R \to \mu^- \mu^+, \tag{5.106} $$

where the helicities of the initial state particles are now assumed to be known
perfectly. The reduced matrix element for this process, following from the same
Feynman diagram as before, is:

$$ \mathcal{M} = i\frac{e^2}{s} (\bar{v}_b P_R \gamma^\mu P_R u_a)(\bar{u}_1 \gamma_\mu v_2). \tag{5.107} $$

A projection matrix can be moved through a gamma matrix by changing L ↔ R:

$$ P_R \gamma^\mu = \gamma^\mu P_L, \tag{5.108} $$
$$ P_L \gamma^\mu = \gamma^\mu P_R. \tag{5.109} $$

This follows from the fact that $\gamma_5$ anticommutes with $\gamma^\mu$:

$$ P_R \gamma^\mu = \frac{1 + \gamma_5}{2}\gamma^\mu = \gamma^\mu \frac{1 - \gamma_5}{2} = \gamma^\mu P_L. \tag{5.110} $$

Therefore, (5.107) simply vanishes, because

$$ P_R \gamma^\mu P_R = \gamma^\mu P_L P_R = 0. \tag{5.111} $$

Therefore, the process $e^-_R e^+_R \to \mu^- \mu^+$ does not occur in QED. Similarly,
$e^-_L e^+_L \to \mu^- \mu^+$ does not occur in QED. By the same type of argument, μ− and μ+ in the
final state must have opposite L,R polarizations from each other in QED.
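The projector identities (5.108)–(5.111) hold in any representation of the gamma matrices, so they can be checked numerically with explicit 4×4 matrices. The following sketch uses NumPy and the Weyl (chiral) representation with $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$; that choice of representation, and all names in the code, are assumptions made for illustration.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# Gamma matrices gamma^0 ... gamma^3 in the Weyl representation (assumed choice).
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [
    np.block([[Z2, sigma], [-sigma, Z2]]) for sigma in PAULI
]
GAMMA5 = 1j * GAMMA[0] @ GAMMA[1] @ GAMMA[2] @ GAMMA[3]
P_L = (np.eye(4) - GAMMA5) / 2
P_R = (np.eye(4) + GAMMA5) / 2


def projector_identities_hold():
    """Check P_L P_R = 0, P_R^2 = P_R, and (5.108), (5.109), (5.111)."""
    ok = np.allclose(P_L @ P_R, 0) and np.allclose(P_R @ P_R, P_R)
    for g in GAMMA:
        ok = ok and np.allclose(P_R @ g, g @ P_L)      # (5.108)
        ok = ok and np.allclose(P_L @ g, g @ P_R)      # (5.109)
        ok = ok and np.allclose(P_R @ g @ P_R, 0)      # (5.111)
    return bool(ok)


print(projector_identities_hold())
```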
To study a non-vanishing reduced matrix element, let us therefore consider the
process:

$$ e^-_L e^+_R \to \mu^-_L \mu^+_R, \tag{5.112} $$

in which we have now assumed that all helicities are perfectly known. To simplify
matters, we will assume the high energy limit $\sqrt{s} \gg m_\mu$. The reduced matrix
element can be simply obtained from (5.43) by just putting in the appropriate L and R
projection matrices acting on each external state spinor:

$$ \mathcal{M} = i\frac{e^2}{s} (\bar{v}_b P_R \gamma^\mu P_L u_a)(\bar{u}_1 P_R \gamma_\mu P_L v_2). \tag{5.113} $$
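This can be simplified slightly by using the properties of the projection matrices:

$$ P_R \gamma^\mu P_L = \gamma^\mu P_L P_L = \gamma^\mu P_L, \tag{5.114} $$

so that

$$ \mathcal{M} = i\frac{e^2}{s} (\bar{v}_b \gamma^\mu P_L u_a)(\bar{u}_1 \gamma_\mu P_L v_2), \tag{5.115} $$

and so

$$ |\mathcal{M}|^2 = \frac{e^4}{s^2} (\bar{v}_b \gamma^\mu P_L u_a)(\bar{u}_1 \gamma_\mu P_L v_2)(\bar{v}_b \gamma^\nu P_L u_a)^* (\bar{u}_1 \gamma_\nu P_L v_2)^*. \tag{5.116} $$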
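To evaluate this, we compute:

$$ (\bar{v}_b \gamma^\nu P_L u_a)^* = (v_b^\dagger \gamma^0 \gamma^\nu P_L u_a)^* = u_a^\dagger (P_L)^\dagger (\gamma^\nu)^\dagger \gamma^0 v_b \tag{5.117} $$
$$ = u_a^\dagger P_L \gamma^0 \gamma^\nu v_b \tag{5.118} $$
$$ = u_a^\dagger \gamma^0 P_R \gamma^\nu v_b \tag{5.119} $$
$$ = \bar{u}_a P_R \gamma^\nu v_b. \tag{5.120} $$

Here we have used the facts that $(P_L)^\dagger = P_L$, and $(\gamma^\nu)^\dagger \gamma^0 = \gamma^0 \gamma^\nu$, and
$P_L \gamma^0 = \gamma^0 P_R$, and $u_a^\dagger \gamma^0 = \bar{u}_a$. In a similar way,

$$ (\bar{u}_1 \gamma_\nu P_L v_2)^* = \bar{v}_2 P_R \gamma_\nu u_1. \tag{5.121} $$

Therefore,

$$ |\mathcal{M}|^2 = \frac{e^4}{s^2} (\bar{v}_b \gamma^\mu P_L u_a)(\bar{u}_a P_R \gamma^\nu v_b)(\bar{u}_1 \gamma_\mu P_L v_2)(\bar{v}_2 P_R \gamma_\nu u_1). \tag{5.122} $$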
Because the spin projection matrices will only allow the specified set of spins to
contribute anyway, we are free to sum over the spin labels $s_a$, $s_b$, $s_1$, and $s_2$, without
changing anything. Let us do so, since it will allow us to apply the tricks

$$ \sum_{s_a} u_a \bar{u}_a = \not{p}_a + m_e, \tag{5.123} $$
$$ \sum_{s_2} v_2 \bar{v}_2 = \not{k}_2 - m_\mu. \tag{5.124} $$

Neglecting the masses because of the high-energy limit, we therefore have

$$ |\mathcal{M}|^2 = \sum_{s_a, s_b, s_1, s_2} |\mathcal{M}|^2 \tag{5.125} $$
$$ = \frac{e^4}{s^2} \sum_{s_b} (\bar{v}_b \gamma^\mu P_L \not{p}_a P_R \gamma^\nu v_b) \sum_{s_1} (\bar{u}_1 \gamma_\mu P_L \not{k}_2 P_R \gamma_\nu u_1). \tag{5.126} $$

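This can be simplified by eliminating excess projection matrices, using:

$$ P_L \not{p}_a P_R = \not{p}_a P_R P_R = \not{p}_a P_R, \tag{5.127} $$
$$ P_L \not{k}_2 P_R = \not{k}_2 P_R P_R = \not{k}_2 P_R, \tag{5.128} $$

to get

$$ |\mathcal{M}|^2 = \frac{e^4}{s^2} \sum_{s_b} (\bar{v}_b \gamma^\mu \not{p}_a P_R \gamma^\nu v_b) \sum_{s_1} (\bar{u}_1 \gamma_\mu \not{k}_2 P_R \gamma_\nu u_1). \tag{5.129} $$

Again using the trick of putting the barred spinor at the end and taking the trace (see
the discussion around (5.53)) for each quantity in parentheses, this becomes:

$$ |\mathcal{M}|^2 = \frac{e^4}{s^2} \sum_{s_b} {\rm Tr}[\gamma^\mu \not{p}_a P_R \gamma^\nu v_b \bar{v}_b] \sum_{s_1} {\rm Tr}[\gamma_\mu \not{k}_2 P_R \gamma_\nu u_1 \bar{u}_1]. \tag{5.130} $$

Doing the sums over $s_1$ and $s_b$ using the usual trick gives:

$$ |\mathcal{M}|^2 = \frac{e^4}{s^2} {\rm Tr}[\gamma^\mu \not{p}_a P_R \gamma^\nu \not{p}_b]\, {\rm Tr}[\gamma_\mu \not{k}_2 P_R \gamma_\nu \not{k}_1]. \tag{5.131} $$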

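Now it is time to evaluate the traces. We have

$$ {\rm Tr}[\gamma^\mu \not{p}_a P_R \gamma^\nu \not{p}_b] = {\rm Tr}\Big[\gamma^\mu \not{p}_a \frac{1 + \gamma_5}{2} \gamma^\nu \not{p}_b\Big] \tag{5.132} $$
$$ = \frac{1}{2} {\rm Tr}[\gamma^\mu \not{p}_a \gamma^\nu \not{p}_b] + \frac{1}{2} {\rm Tr}[\gamma^\mu \not{p}_a \gamma^\nu \not{p}_b \gamma_5], \tag{5.133} $$

where we have used the fact that $\gamma_5$ anticommutes with any gamma matrix to
rearrange the order in the last term. The first of these traces was evaluated in Sect. 5.2.1.
The trace involving $\gamma_5$ is, from (A.2.19):

$$ {\rm Tr}[\gamma^\mu \not{p}_a \gamma^\nu \not{p}_b \gamma_5] = p_{a\alpha} p_{b\beta} {\rm Tr}[\gamma^\mu \gamma^\alpha \gamma^\nu \gamma^\beta \gamma_5] \tag{5.134} $$
$$ = p_{a\alpha} p_{b\beta} (4i \epsilon^{\mu\alpha\nu\beta}), \tag{5.135} $$

where $\epsilon^{\mu\alpha\nu\beta}$ is the totally antisymmetric Levi-Civita tensor defined in (2.65). Putting
things together:

$$ {\rm Tr}[\gamma^\mu \not{p}_a P_R \gamma^\nu \not{p}_b] = 2\left[ p_a^\mu p_b^\nu - g^{\mu\nu} (p_a \cdot p_b) + p_b^\mu p_a^\nu + i p_{a\alpha} p_{b\beta} \epsilon^{\mu\alpha\nu\beta} \right]. \tag{5.136} $$

In exactly the same way, we get

$$ {\rm Tr}[\gamma_\mu \not{k}_2 P_R \gamma_\nu \not{k}_1] = 2\left[ k_{2\mu} k_{1\nu} - g_{\mu\nu} (k_2 \cdot k_1) + k_{1\mu} k_{2\nu} + i k_2^\rho k_1^\sigma \epsilon_{\mu\rho\nu\sigma} \right]. \tag{5.137} $$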

Finally, we have to multiply these two traces together, contracting the indices μ and
ν. Note that the cross-terms containing only one epsilon tensor vanish, because the epsilon
tensors are antisymmetric under μ ↔ ν, while the other terms are symmetric. The
term involving two epsilon tensors can be evaluated using the useful identity

$$ \epsilon^{\mu\alpha\nu\beta} \epsilon_{\mu\rho\nu\sigma} = -2\delta^\alpha_\rho \delta^\beta_\sigma + 2\delta^\alpha_\sigma \delta^\beta_\rho, \tag{5.138} $$

which you can verify by brute force substitution of indices. The result is simply:

$$ {\rm Tr}[\gamma^\mu \not{p}_a P_R \gamma^\nu \not{p}_b]\, {\rm Tr}[\gamma_\mu \not{k}_2 P_R \gamma_\nu \not{k}_1] = 16 (p_a \cdot k_2)(p_b \cdot k_1), \tag{5.139} $$

so that

$$ |\mathcal{M}|^2 = \frac{16 e^4}{s^2} (p_a \cdot k_2)(p_b \cdot k_1). \tag{5.140} $$
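The contracted trace product (5.139) is an algebraic identity in the four momenta, so it can be spot-checked numerically by building the traces from explicit matrices. The sketch below assumes the Weyl representation, the mostly-minus metric $g = {\rm diag}(+,-,-,-)$ implied by (5.123), and randomly chosen (not necessarily physical) 4-vectors; all function names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# Weyl-representation gamma matrices (assumed choice) and the metric.
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [
    np.block([[Z2, sigma], [-sigma, Z2]]) for sigma in PAULI
]
GAMMA5 = 1j * GAMMA[0] @ GAMMA[1] @ GAMMA[2] @ GAMMA[3]
P_R = (np.eye(4) + GAMMA5) / 2
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def slash(p):
    """p-slash = gamma^mu p_mu for contravariant components p^mu."""
    p_lower = METRIC @ p
    return sum(GAMMA[mu] * p_lower[mu] for mu in range(4))


def dot(p, q):
    """Minkowski product p.q."""
    return float(p @ METRIC @ q)


def trace_product(pa, pb, k1, k2):
    """Left-hand side of (5.139), contracting mu and nu with the metric."""
    total = 0.0
    for mu in range(4):
        for nu in range(4):
            t1 = np.trace(GAMMA[mu] @ slash(pa) @ P_R @ GAMMA[nu] @ slash(pb))
            t2 = np.trace(GAMMA[mu] @ slash(k2) @ P_R @ GAMMA[nu] @ slash(k1))
            total += METRIC[mu, mu] * METRIC[nu, nu] * t1 * t2
    return total


rng = np.random.default_rng(0)
pa, pb, k1, k2 = rng.normal(size=(4, 4))  # arbitrary test 4-vectors
lhs = trace_product(pa, pb, k1, k2)
rhs = 16 * dot(pa, k2) * dot(pb, k1)
print(np.isclose(lhs, rhs))
```

Because the check uses explicit matrices, it does not depend on the sign convention chosen for the Levi-Civita tensor.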
This result should be plugged in to the formula for the differential cross-section:

$$ \frac{d\sigma_{e^-_L e^+_R \to \mu^-_L \mu^+_R}}{d(\cos\theta)} = |\mathcal{M}|^2 \frac{|\vec{k}_1|}{32\pi s |\vec{p}_a|}. \tag{5.141} $$

Note that one does not average over initial-state spins in this case, because they have
already been fixed. The kinematics is of course not affected by the fact that we have
fixed the helicities, and so can be taken from the discussion in Sect. 5.2.1 with $m_\mu$
replaced by 0. It follows that:

$$ \frac{d\sigma_{e^-_L e^+_R \to \mu^-_L \mu^+_R}}{d(\cos\theta)} = \frac{e^4}{32\pi s} (1 + \cos\theta)^2 \tag{5.142} $$
$$ = \frac{\pi\alpha^2}{2s} (1 + \cos\theta)^2. \tag{5.143} $$
The angular dependence of this result can be understood from considering the con-
servation of angular momentum in the event. Drawing a short arrow to represent the
direction of the spin:

This shows that the total spin angular momentum of the initial state is Sẑ = −1 (taking
the electron to be moving in the +z direction). The total spin angular momentum
of the final state is Sn̂ = −1, where n̂ is the direction of the μ− . This explains
why the cross-section vanishes if cos θ = −1; that corresponds to a final state with
the total spin angular momentum in the opposite direction from the initial state.
The quantum mechanical overlap for two states with measured angular momenta in
exactly opposite directions must vanish. If we describe the initial and final states as
eigenstates of angular momentum with J = 1:

Initial state: $|J_{\hat{z}} = -1\rangle$; (5.144)

Final state: $|J_{\hat{n}} = -1\rangle$, (5.145)

then the reduced matrix element squared is proportional to:

$$ |\langle J_{\hat{n}} = -1 | J_{\hat{z}} = -1 \rangle|^2 = \frac{(1 + \cos\theta)^2}{4}. \tag{5.146} $$
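The overlap (5.146) can be checked by rotating a spin-1 state explicitly. The sketch below is an illustration under stated assumptions: it uses the standard spin-1 matrix for $J_y$ in the basis $|m = +1\rangle, |0\rangle, |-1\rangle$ (with $\hbar = 1$), takes $\hat{n}$ to be $\hat{z}$ rotated by θ about the y axis, and builds the rotation $e^{-i\theta J_y}$ from the eigendecomposition of $J_y$. The names are chosen here for illustration.

```python
import numpy as np

# Spin-1 J_y in the basis |m=+1>, |m=0>, |m=-1> (standard convention, hbar = 1).
J_Y = (1 / np.sqrt(2)) * np.array(
    [[0, -1j, 0],
     [1j, 0, -1j],
     [0, 1j, 0]], dtype=complex)


def overlap_squared(theta):
    """|<J_n = -1 | J_z = -1>|^2 with n obtained by rotating z by theta about y."""
    vals, vecs = np.linalg.eigh(J_Y)
    rotation = vecs @ np.diag(np.exp(-1j * theta * vals)) @ vecs.conj().T
    z_state = np.array([0, 0, 1], dtype=complex)   # |J_z = -1>
    n_state = rotation @ z_state                   # |J_n = -1>
    return abs(np.vdot(n_state, z_state)) ** 2


for theta in np.linspace(0.0, np.pi, 5):
    expected = (1 + np.cos(theta)) ** 2 / 4
    print(f"theta = {theta:.3f}: overlap^2 = {overlap_squared(theta):.6f}, "
          f"(1 + cos)^2 / 4 = {expected:.6f}")
```

The overlap vanishes at θ = π, which is the back-scattering configuration discussed above.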
Similarly, one can compute:

$$ \frac{d\sigma_{e^-_R e^+_L \to \mu^-_R \mu^+_L}}{d(\cos\theta)} = \frac{\pi\alpha^2}{2s} (1 + \cos\theta)^2, \tag{5.147} $$

corresponding to the picture:


with all helicities reversed compared to the previous case. If we compute the cross-
sections for the final state muon to have the opposite helicity from the initial state
electron, we get

$$ \frac{d\sigma_{e^-_L e^+_R \to \mu^-_R \mu^+_L}}{d(\cos\theta)} = \frac{d\sigma_{e^-_R e^+_L \to \mu^-_L \mu^+_R}}{d(\cos\theta)} = \frac{\pi\alpha^2}{2s} (1 - \cos\theta)^2, \tag{5.148} $$

corresponding to the pictures:

These are 4 of the $2^4 = 16$ possible helicity configurations for e− e+ → μ− μ+.
However, as we have already seen, the other 12 possible helicity combinations
all vanish, because they contain either e− and e+ with the same helicity, or μ− and
μ+ with the same helicity. If we take the average of the initial state helicities, and
the sum of the possible final state helicities, we get:

$$ \frac{1}{4} \left[ \frac{d\sigma_{e^-_L e^+_R \to \mu^-_L \mu^+_R}}{d(\cos\theta)} + \frac{d\sigma_{e^-_R e^+_L \to \mu^-_R \mu^+_L}}{d(\cos\theta)} + \frac{d\sigma_{e^-_L e^+_R \to \mu^-_R \mu^+_L}}{d(\cos\theta)} + \frac{d\sigma_{e^-_R e^+_L \to \mu^-_L \mu^+_R}}{d(\cos\theta)} + 12 \cdot 0 \right] $$
$$ = \frac{1}{4} \frac{\pi\alpha^2}{2s} \left[ 2(1 + \cos\theta)^2 + 2(1 - \cos\theta)^2 \right] \tag{5.149} $$
$$ = \frac{\pi\alpha^2}{2s} (1 + \cos^2\theta), \tag{5.150} $$

in agreement with the $\sqrt{s} \gg m_\mu$ limit of (5.80).
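The helicity bookkeeping leading to (5.150) can be mirrored in a short script. This is a sketch with arbitrary units ($s = \alpha = 1$ by default, an assumption made only to keep the numbers simple), and the function names are hypothetical.

```python
import math


def dsigma_helicity(cos_theta, same_helicity, s=1.0, alpha=1.0):
    """(5.143)/(5.147) if the muon helicity matches the electron's, else (5.148)."""
    sign = 1.0 if same_helicity else -1.0
    return math.pi * alpha**2 / (2 * s) * (1 + sign * cos_theta) ** 2


def dsigma_unpolarized(cos_theta, s=1.0, alpha=1.0):
    """Average over 4 initial helicity states, sum over final ones, as in (5.149)."""
    nonzero = (2 * dsigma_helicity(cos_theta, True, s, alpha)
               + 2 * dsigma_helicity(cos_theta, False, s, alpha))
    return (nonzero + 12 * 0.0) / 4


for c in (-1.0, -0.5, 0.0, 0.5, 1.0):
    closed_form = math.pi / 2 * (1 + c * c)   # (5.150) with s = alpha = 1
    print(c, dsigma_unpolarized(c), closed_form)
```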
The vanishing of the cross-sections for $e^-_L e^+_L$ and $e^-_R e^+_R$ in the above process can be
generalized beyond this example and even beyond QED. Consider any field theory in
which interactions are given by a fermion-antifermion-vector vertex with a Feynman
rule proportional to a gamma matrix γ μ . If an initial state fermion and antifermion
merge into a vector, or a vector splits into a final state fermion and antifermion:

then by exactly the same argument as before, the fermion and antifermion must have
opposite helicities, because of $\bar{v} P_L \gamma^\mu P_L u = \bar{v} P_R \gamma^\mu P_R u = 0$ and
$\bar{u} P_L \gamma^\mu P_L v = \bar{u} P_R \gamma^\mu P_R v = 0$ and the rules of (5.96)–(5.99) and (5.102)–(5.105).
Moreover, if an initial state fermion (or anti-fermion) interacts with a vector and
emerges as a final state fermion (or anti-fermion):

then the fermions (or anti-fermions) must have the same helicity, because of the
identities $\bar{u} P_L \gamma^\mu P_L u = \bar{u} P_R \gamma^\mu P_R u = 0$ and $\bar{v} P_L \gamma^\mu P_L v = \bar{v} P_R \gamma^\mu P_R v = 0$. This
is true even if the interaction with the vector changes the fermion from one type to
another.
These rules embody the concept of helicity conservation in high energy scatter-
ing. They are obviously useful when the helicities of the particles are controlled or
measured by the experimenter. They are also useful because, as we will see, the weak
interactions only affect fermions with L helicity and antifermions with R helicity.
The conservation of angular momentum together with helicity conservation often
allows one to know in which direction a particle is most likely to emerge in a scat-
tering or decay experiment, and in what cases one may expect the cross-section to
vanish or be enhanced.

5.2.4 Bhabha Scattering (e− e+ → e− e+ )

In this section we consider the process of Bhabha scattering:

e− e+ → e− e+ . (5.151)
For simplicity we will only consider the case of high-energy scattering, with
$\sqrt{s} = E_{\rm CM} \gg m_e$, and we will consider all spins to be unknown (averaged over in the
initial state, summed over in the final state).

Label the momentum and spin data as follows:

Particle   Momentum   Spin    Spinor

e−         p_a        s_a     u(p_a, s_a)
e+         p_b        s_b     v(p_b, s_b)        (5.152)
e−         k_1        s_1     u(k_1, s_1)
e+         k_2        s_2     v(k_2, s_2)

At order $e^2$, there are two Feynman diagrams for this process:

The first of these is called the s-channel diagram; it is exactly the same as the one we
drew for e− e+ → μ− μ+ . The second one is called the t-channel diagram. Using the
QED Feynman rules listed at the end of Sect. 5.1, the corresponding contributions to
the reduced matrix element for the process are:

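$$ \mathcal{M}_s = \left[ \bar{v}_b (ie\gamma^\mu) u_a \right] \left[ \frac{-i g_{\mu\nu}}{(p_a + p_b)^2} \right] \left[ \bar{u}_1 (ie\gamma^\nu) v_2 \right], \tag{5.153} $$

and

$$ \mathcal{M}_t = (-1) \left[ \bar{u}_1 (ie\gamma^\mu) u_a \right] \left[ \frac{-i g_{\mu\nu}}{(p_a - k_1)^2} \right] \left[ \bar{v}_b (ie\gamma^\nu) v_2 \right]. \tag{5.154} $$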

The additional (−1) factor in Mt is due to Rule 9 in the QED Feynman rules at the
end of Sect. 5.1. It arises because the order of spinors in the written expression for
Ms is b, a, 1, 2, but that in Mt is 1, a, b, 2, and these differ from each other by an
odd permutation. We could have just as well assigned the minus sign to Ms instead;
only the relative phases of terms in the matrix element are significant.
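Therefore the full reduced matrix element for Bhabha scattering, written in terms
of the Mandelstam variables $s = (p_a + p_b)^2$ and $t = (p_a - k_1)^2$, is:

$$ \mathcal{M} = \mathcal{M}_s + \mathcal{M}_t = i e^2 \left[ \frac{1}{s} (\bar{v}_b \gamma_\mu u_a)(\bar{u}_1 \gamma^\mu v_2) - \frac{1}{t} (\bar{u}_1 \gamma_\mu u_a)(\bar{v}_b \gamma^\mu v_2) \right]. \tag{5.155} $$

Taking the complex conjugate of this gives:

$$ \mathcal{M}^* = \mathcal{M}_s^* + \mathcal{M}_t^* \tag{5.156} $$
$$ = -i e^2 \left[ \frac{1}{s} (\bar{v}_b \gamma_\nu u_a)^* (\bar{u}_1 \gamma^\nu v_2)^* - \frac{1}{t} (\bar{u}_1 \gamma_\nu u_a)^* (\bar{v}_b \gamma^\nu v_2)^* \right] \tag{5.157} $$
$$ = -i e^2 \left[ \frac{1}{s} (\bar{u}_a \gamma_\nu v_b)(\bar{v}_2 \gamma^\nu u_1) - \frac{1}{t} (\bar{u}_a \gamma_\nu u_1)(\bar{v}_2 \gamma^\nu v_b) \right]. \tag{5.158} $$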

The complex square of the reduced matrix element, $|\mathcal{M}|^2 = \mathcal{M}^* \mathcal{M}$, contains a pure
s-channel piece proportional to $1/s^2$, a pure t-channel piece proportional to $1/t^2$, and
an interference piece proportional to $1/st$. For organizational purposes, it is useful
to calculate these pieces separately.
The pure s-channel contribution calculation is exactly the same as what we
did before for e− e+ → μ− μ+, except that now we can substitute $m_\mu \to m_e \to 0$.
Therefore, plagiarizing the result of (5.65), we have:

$$ \frac{1}{4} \sum_{\rm spins} |\mathcal{M}_s|^2 = \frac{8 e^4}{s^2} \left[ (p_a \cdot k_2)(p_b \cdot k_1) + (p_a \cdot k_1)(p_b \cdot k_2) \right]. \tag{5.159} $$

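The pure t-channel contribution can be calculated in a very similar way. We have:

$$ |\mathcal{M}_t|^2 = \frac{e^4}{t^2} (\bar{v}_2 \gamma_\nu v_b)(\bar{v}_b \gamma_\mu v_2)(\bar{u}_1 \gamma^\mu u_a)(\bar{u}_a \gamma^\nu u_1). \tag{5.160} $$

Taking the average of initial state spins and the sum over final state spins allows us
to use the identities $\sum_{s_a} u_a \bar{u}_a = \not{p}_a$ and $\sum_{s_b} v_b \bar{v}_b = \not{p}_b$ (neglecting $m_e$). The result
is:

$$ \frac{1}{4} \sum_{\rm spins} |\mathcal{M}_t|^2 = \frac{e^4}{4t^2} \sum_{s_1, s_2} (\bar{v}_2 \gamma_\nu \not{p}_b \gamma_\mu v_2)(\bar{u}_1 \gamma^\mu \not{p}_a \gamma^\nu u_1) \tag{5.161} $$
$$ = \frac{e^4}{4t^2} \sum_{s_1, s_2} {\rm Tr}[\gamma_\nu \not{p}_b \gamma_\mu v_2 \bar{v}_2]\, {\rm Tr}[\gamma^\mu \not{p}_a \gamma^\nu u_1 \bar{u}_1], \tag{5.162} $$

in which we have turned the quantity into a trace by moving the barred spinors $\bar{v}_2$ and $\bar{u}_1$ to the end. Now
performing the sums over $s_1$, $s_2$ gives:

$$ \frac{1}{4} \sum_{\rm spins} |\mathcal{M}_t|^2 = \frac{e^4}{4t^2} {\rm Tr}[\gamma_\nu \not{p}_b \gamma_\mu \not{k}_2]\, {\rm Tr}[\gamma^\mu \not{p}_a \gamma^\nu \not{k}_1] \tag{5.163} $$
$$ = \frac{e^4}{4t^2} {\rm Tr}[\gamma_\mu \not{k}_2 \gamma_\nu \not{p}_b]\, {\rm Tr}[\gamma^\mu \not{p}_a \gamma^\nu \not{k}_1]. \tag{5.164} $$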
In the second line, the first trace has been rearranged using the cyclic property of
traces. The point of doing so is that now these traces have exactly the same form that
we encountered in (5.59), but with $p_a \leftrightarrow k_2$ and $m_\mu \to 0$. Therefore we can obtain
$\frac{1}{4} \sum_{\rm spins} |\mathcal{M}_t|^2$ by simply making the same replacements $p_a \leftrightarrow k_2$ and $m_\mu \to 0$ in
(5.65):

$$ \frac{1}{4} \sum_{\rm spins} |\mathcal{M}_t|^2 = \frac{8 e^4}{t^2} \left[ (p_a \cdot k_2)(p_b \cdot k_1) + (k_2 \cdot k_1)(p_b \cdot p_a) \right]. \tag{5.165} $$

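Next, consider the interference term:

$$ \frac{1}{4} \sum_{\rm spins} \mathcal{M}_t^* \mathcal{M}_s = -\frac{e^4}{4st} \sum_{\rm spins} (\bar{v}_b \gamma_\mu u_a)(\bar{u}_a \gamma^\nu u_1)(\bar{u}_1 \gamma^\mu v_2)(\bar{v}_2 \gamma_\nu v_b). \tag{5.166} $$

We chose to write the factors in parentheses in that order, so that now we can
immediately use the tricks $\sum_{s_a} u_a \bar{u}_a = \not{p}_a$, and $\sum_{s_1} u_1 \bar{u}_1 = \not{k}_1$, and $\sum_{s_2} v_2 \bar{v}_2 = \not{k}_2$, to
obtain:

$$ \frac{1}{4} \sum_{\rm spins} \mathcal{M}_t^* \mathcal{M}_s = -\frac{e^4}{4st} \sum_{s_b} (\bar{v}_b \gamma_\mu \not{p}_a \gamma^\nu \not{k}_1 \gamma^\mu \not{k}_2 \gamma_\nu v_b), $$

which can now be converted into a trace by the usual trick of moving the $\bar{v}_b$ to the
end:

$$ \frac{1}{4} \sum_{\rm spins} \mathcal{M}_t^* \mathcal{M}_s = -\frac{e^4}{4st} \sum_{s_b} {\rm Tr}[\gamma_\mu \not{p}_a \gamma^\nu \not{k}_1 \gamma^\mu \not{k}_2 \gamma_\nu v_b \bar{v}_b] \tag{5.167} $$
$$ = -\frac{e^4}{4st} {\rm Tr}[\gamma_\mu \not{p}_a \gamma^\nu \not{k}_1 \gamma^\mu \not{k}_2 \gamma_\nu \not{p}_b]. \tag{5.168} $$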
Now we are faced with the task of computing the trace of 8 gamma matrices. In
principle, the trace of any number of gamma matrices can be performed with the
algorithm of (A.2.11). The procedure is to replace the trace over 2n gamma matrices
by a sum over traces of 2n − 2 gamma matrices, and repeat until all traces are short
enough to evaluate using (A.2.8)–(A.2.11) and (A.2.15)–(A.2.19). However, in many
cases including the present one it is easier to simplify the contents of the trace first,
using (A.2.20)–(A.2.23). To evaluate the trace in (5.168), we first use (A.2.23) to
write:

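$$ \gamma_\mu \not{p}_a \gamma^\nu \not{k}_1 \gamma^\mu = -2 \not{k}_1 \gamma^\nu \not{p}_a, \tag{5.169} $$

which implies that

$$ {\rm Tr}[\gamma_\mu \not{p}_a \gamma^\nu \not{k}_1 \gamma^\mu \not{k}_2 \gamma_\nu \not{p}_b] = -2 {\rm Tr}[\not{k}_1 \gamma^\nu \not{p}_a \not{k}_2 \gamma_\nu \not{p}_b]. \tag{5.170} $$

This can be further simplified by now using (A.2.22) to write:

$$ \gamma^\nu \not{p}_a \not{k}_2 \gamma_\nu = 4 p_a \cdot k_2, \tag{5.171} $$

so that:

$$ {\rm Tr}[\gamma_\mu \not{p}_a \gamma^\nu \not{k}_1 \gamma^\mu \not{k}_2 \gamma_\nu \not{p}_b] = -8 (p_a \cdot k_2) {\rm Tr}[\not{k}_1 \not{p}_b] \tag{5.172} $$
$$ = -32 (p_a \cdot k_2)(k_1 \cdot p_b), \tag{5.173} $$

in which the trace has finally been performed using (A.2.9). So, from (5.168)

$$ \frac{1}{4} \sum_{\rm spins} \mathcal{M}_t^* \mathcal{M}_s = \frac{8 e^4}{st} (p_a \cdot k_2)(p_b \cdot k_1). \tag{5.174} $$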
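The eight-gamma trace in (5.172)–(5.173) is a good candidate for a brute-force numerical cross-check, since sign slips are easy to make by hand. The sketch below again assumes the Weyl representation and the mostly-minus metric, and uses random test 4-vectors, for which the algebraic identity should still hold; the helper names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
PAULI = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
# Weyl-representation gamma matrices (assumed choice) and the metric.
GAMMA = [np.block([[Z2, I2], [I2, Z2]])] + [
    np.block([[Z2, sigma], [-sigma, Z2]]) for sigma in PAULI
]
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def slash(p):
    """p-slash = gamma^mu p_mu for contravariant components p^mu."""
    p_lower = METRIC @ p
    return sum(GAMMA[mu] * p_lower[mu] for mu in range(4))


def dot(p, q):
    return float(p @ METRIC @ q)


def eight_gamma_trace(pa, pb, k1, k2):
    """Tr[gamma_mu pa-slash gamma^nu k1-slash gamma^mu k2-slash gamma_nu pb-slash]."""
    total = 0.0
    for mu in range(4):
        for nu in range(4):
            chain = (GAMMA[mu] @ slash(pa) @ GAMMA[nu] @ slash(k1)
                     @ GAMMA[mu] @ slash(k2) @ GAMMA[nu] @ slash(pb))
            # The metric factors lower one index of each contracted pair.
            total += METRIC[mu, mu] * METRIC[nu, nu] * np.trace(chain)
    return total


rng = np.random.default_rng(1)
pa, pb, k1, k2 = rng.normal(size=(4, 4))  # arbitrary test 4-vectors
lhs = eight_gamma_trace(pa, pb, k1, k2)
rhs = -32 * dot(pa, k2) * dot(k1, pb)
print(np.isclose(lhs, rhs))
```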

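Taking the complex conjugate of both sides, we also have:

$$ \frac{1}{4} \sum_{\rm spins} \mathcal{M}_s^* \mathcal{M}_t = \frac{8 e^4}{st} (p_a \cdot k_2)(p_b \cdot k_1). \tag{5.175} $$

This completes the contributions to the total

$$ \frac{1}{4} \sum_{\rm spins} |\mathcal{M}|^2 = \frac{1}{4} \sum_{\rm spins} \left[ |\mathcal{M}_s|^2 + |\mathcal{M}_t|^2 + \mathcal{M}_t^* \mathcal{M}_s + \mathcal{M}_s^* \mathcal{M}_t \right]. \tag{5.176} $$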

It remains to identify the dot products of momenta appearing in the above formulas.
This can be done by carrying over the kinematic analysis for the case e− e+ → μ− μ+
as worked out in (5.66)–(5.76), with m μ , m e → 0. Letting θ be the angle between
the 3-momenta directions of the initial state electron and the final state electron, we
have:

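$$ p_a \cdot p_b = k_1 \cdot k_2 = s/2, \tag{5.177} $$
$$ p_a \cdot k_1 = p_b \cdot k_2 = -t/2, \tag{5.178} $$
$$ p_a \cdot k_2 = p_b \cdot k_1 = -u/2, \tag{5.179} $$

with

$$ t = -\frac{s}{2} (1 - \cos\theta), \tag{5.180} $$
$$ u = -\frac{s}{2} (1 + \cos\theta). \tag{5.181} $$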
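The relations (5.177)–(5.181) can be confirmed by writing down explicit massless 4-momenta in the center-of-momentum frame. In the sketch below the incoming electron moves along +z and the scattering plane is taken to be the x-z plane; those choices, the mostly-minus metric, and the sample values of $s$ and $\cos\theta$ are assumptions made for illustration.

```python
import numpy as np

METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def dot(p, q):
    """Minkowski product p.q with the mostly-minus metric."""
    return float(p @ METRIC @ q)


def massless_momenta(s, cos_theta):
    """CM-frame momenta (E, px, py, pz) for massless 2 -> 2 scattering."""
    energy = np.sqrt(s) / 2
    sin_theta = np.sqrt(1 - cos_theta**2)
    pa = energy * np.array([1.0, 0.0, 0.0, 1.0])
    pb = energy * np.array([1.0, 0.0, 0.0, -1.0])
    k1 = energy * np.array([1.0, sin_theta, 0.0, cos_theta])
    k2 = energy * np.array([1.0, -sin_theta, 0.0, -cos_theta])
    return pa, pb, k1, k2


s_in, c = 100.0, 0.4                       # hypothetical sample point
pa, pb, k1, k2 = massless_momenta(s_in, c)
s = dot(pa + pb, pa + pb)
t = dot(pa - k1, pa - k1)
u = dot(pa - k2, pa - k2)

print("s + t + u =", s + t + u)            # vanishes for massless particles
print("t =", t, " expected", -s_in * (1 - c) / 2)
print("u =", u, " expected", -s_in * (1 + c) / 2)
print("pa.k1 =", dot(pa, k1), " expected", -t / 2)
```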
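It follows from (5.159), (5.165), (5.174), and (5.175) that:

$$ \frac{1}{4} \sum_{\rm spins} |\mathcal{M}_s|^2 = \frac{2 e^4}{s^2} (u^2 + t^2), \tag{5.182} $$
$$ \frac{1}{4} \sum_{\rm spins} |\mathcal{M}_t|^2 = \frac{2 e^4}{t^2} (u^2 + s^2), \tag{5.183} $$
$$ \frac{1}{4} \sum_{\rm spins} (\mathcal{M}_t^* \mathcal{M}_s + \mathcal{M}_s^* \mathcal{M}_t) = \frac{4 e^4}{st} u^2. \tag{5.184} $$

Putting this into (4.192), since $|\vec{p}_a| = |\vec{k}_1|$, we obtain the spin-averaged
differential cross-section for Bhabha scattering:

$$ \frac{d\sigma}{d(\cos\theta)} = \frac{e^4}{16\pi s} \left[ \frac{u^2 + t^2}{s^2} + \frac{u^2 + s^2}{t^2} + \frac{2u^2}{st} \right] \tag{5.185} $$
$$ = \frac{\pi\alpha^2}{2s} \left( \frac{3 + \cos^2\theta}{1 - \cos\theta} \right)^2. \tag{5.186} $$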
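The step from (5.185) to (5.186) involves some algebra that is easy to verify numerically. The following sketch evaluates both forms at several angles, using (5.180), (5.181) and $e^2 = 4\pi\alpha$; the default values $s = \alpha = 1$ are arbitrary units chosen for illustration, and the function names are hypothetical.

```python
import math


def bhabha_mandelstam(cos_theta, s=1.0, alpha=1.0):
    """Right-hand side of (5.185), written in terms of s, t, u."""
    t = -s * (1 - cos_theta) / 2
    u = -s * (1 + cos_theta) / 2
    e4 = (4 * math.pi * alpha) ** 2
    bracket = (u * u + t * t) / s**2 + (u * u + s * s) / t**2 + 2 * u * u / (s * t)
    return e4 / (16 * math.pi * s) * bracket


def bhabha_closed_form(cos_theta, s=1.0, alpha=1.0):
    """Right-hand side of (5.186)."""
    return math.pi * alpha**2 / (2 * s) * ((3 + cos_theta**2) / (1 - cos_theta)) ** 2


for c in (-1.0, -0.5, 0.0, 0.5, 0.9):
    print(c, bhabha_mandelstam(c), bhabha_closed_form(c))
```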
This result actually diverges for cos θ → 1, because of the t’s in the denominator.
This is not an integrable singularity, because the differential cross-section blows up
quadratically near cos θ = 1, so

$$ \sigma = \int_{-1}^{1} \frac{d\sigma}{d(\cos\theta)}\, d(\cos\theta) \longrightarrow \infty. \tag{5.187} $$

The infinite total cross-section corresponds to the infinite range of the Coulomb
potential between two charged particles. It arises entirely from the t-channel dia-
gram, in which the electron and positron scatter off of each other in the forward
direction (θ ≈ 0). It simply reflects that an infinite-range interaction will always
produce some deflection, although it may be extremely small. This result is the rela-
tivistic generalization of the non-relativistic, classical Rutherford scattering problem,
in which an electron or alpha particle (or some other light charged particle) scatters
off of the classical electric field of a heavy nucleus. As worked out in many textbooks
on classical physics (for example, H. Goldstein’s Classical Mechanics, J.D. Jackson’s
Classical Electrodynamics), the differential cross-section for a non-relativistic
light particle with charge $Q_A$ and a heavy particle with charge $Q_B$, with
center-of-momentum energy $E_{\rm CM}$ to scatter through their Coulomb interaction is:

$$ \frac{d\sigma_{\rm Rutherford}}{d(\cos\theta)} = \frac{\pi Q_A^2 Q_B^2 \alpha^2}{2 E_{\rm CM}^2 (1 - \cos\theta)^2}. \tag{5.188} $$

(Here one must be careful in comparing results, because the charge $e$ used by
Goldstein and Jackson differs from the one used here by a factor of $\sqrt{4\pi}$.) Comparing
the non-relativistic Rutherford result to the relativistic Bhabha result, we see that in
both cases the small-angle behavior scales like $1/\theta^4$, and does not depend on the
signs of the charges of the particles.
In a real experiment, there is always some minimum scattering angle that can
be resolved. In a colliding-beam experiment, this is usually dictated by the fact that
detectors cannot be placed within or too close to the beamline. In other experiments,
one is limited by the angular resolution of detectors. Therefore, the true observable
quantity is typically something more like:

$$ \sigma_{\rm experiment} = \int_{-\cos\theta_{\rm cut}}^{\cos\theta_{\rm cut}} \frac{d\sigma}{d(\cos\theta)}\, d(\cos\theta). \tag{5.189} $$
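To see how strongly the measured Bhabha cross-section depends on the angular cut, (5.189) can be integrated numerically with (5.186) as the integrand. The sketch below works in units of $\pi\alpha^2/(2s)$ and uses a plain composite Simpson rule; the function names, the number of steps, and the sample cut angles are assumptions made here, and no detector effects are modeled.

```python
import math


def bhabha_shape(x):
    """Angular factor of (5.186): dsigma/dcos(theta) in units of pi*alpha^2/(2s)."""
    return ((3 + x * x) / (1 - x)) ** 2


def sigma_with_cut(theta_cut_deg, n=20000):
    """Composite Simpson estimate of (5.189) for |cos(theta)| < cos(theta_cut).

    n must be even. The result is in units of pi*alpha^2/(2s).
    """
    c = math.cos(math.radians(theta_cut_deg))
    h = 2 * c / n
    total = bhabha_shape(-c) + bhabha_shape(c)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * bhabha_shape(-c + i * h)
    return total * h / 3


for cut in (30.0, 20.0, 10.0):
    print(f"theta_cut = {cut:4.1f} deg: sigma = {sigma_with_cut(cut):.1f} "
          "(units of pi alpha^2 / 2s)")
```

The value grows rapidly as the cut angle is reduced, reflecting the forward singularity discussed above.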

Of course, in real experiments, the minimum resolvable angle is just one of many
practical factors that have to be included.
In terms of the Feynman diagram interpretation, the divergence for small θ corre-
sponds to the photon propagator going on-shell; in other words, the situation where
the square of the t-channel virtual photon’s 4-momentum is nearly equal to 0, the
classical value for a real massless photon. For any scattering angle θ > 0, one has
$$ (p_a - k_1)^2 = t = -\frac{s}{2} (1 - \cos\theta) < 0, \tag{5.190} $$
so that the virtual photon is said to be off-shell. In general, any time that a virtual
(internal line) particle can go on-shell, there will be a divergence in the cross-section
due to the denominator of the Feynman propagator blowing up. Sometimes this is
a real divergence with a physical interpretation, as in the case of Bhabha scattering.
In other cases, the divergence is removed by higher-order effects, such as the finite
life-time of the virtual particle, which will give an imaginary part to its squared mass,
removing the singularity in the Feynman propagator.

5.3 Crossing Symmetry

Consider the two completely different processes:

e− e+ → μ− μ+ (5.191)
e− μ+ → e− μ+ . (5.192)

The relevant Feynman diagrams for these two processes are very similar:

In fact, by stretching and twisting, one can turn the first process into the second by
the transformation:

initial e+ → final e− (5.193)

final μ− → initial μ+ , (5.194)

Two processes related to each other by exchanging some initial state particles with
their antiparticles in the final state are said to be related by crossing. Not surprisingly,
the reduced matrix elements for these processes are also quite similar. For example,
in the high-energy limit $\sqrt{s} \gg m_\mu$,

$$ \frac{1}{4} \sum_{\rm spins} |\mathcal{M}_{e^- e^+ \to \mu^- \mu^+}|^2 = 2e^4\, \frac{u^2 + t^2}{s^2}, \tag{5.195} $$
$$ \frac{1}{4} \sum_{\rm spins} |\mathcal{M}_{e^- \mu^+ \to e^- \mu^+}|^2 = 2e^4\, \frac{u^2 + s^2}{t^2}. \tag{5.196} $$

This similarity is generalized and made more precise by the following theorem.

Crossing Symmetry Theorem: Suppose two Feynman diagrams with reduced matrix
elements $\mathcal{M}$ and $\mathcal{M}'$ are related by the exchange (“crossing”) of some initial state particles and
antiparticles for the corresponding final state antiparticles and particles. If the crossed
particles have 4-momenta $P_1^\mu, \ldots, P_n^\mu$ in $\mathcal{M}$, then $\sum_{\rm spins} |\mathcal{M}|^2$ can be obtained by substituting
$P_i'^\mu = -P_i^\mu$ into the mathematical expression for $\sum_{\rm spins} |\mathcal{M}'|^2$, as follows:

$$ \sum_{\rm spins} |\mathcal{M}(P_1^\mu, \ldots, P_n^\mu, \ldots)|^2 = (-1)^F \sum_{\rm spins} |\mathcal{M}'(P_1'^\mu, \ldots, P_n'^\mu, \ldots)|^2 \Big|_{P_i'^\mu \to -P_i^\mu}, \tag{5.197} $$

with the other (uncrossed) particle 4-momenta unaffected. Here F is the number of fermion
lines that were crossed.

If $p^\mu = (E, \vec{p})$ is a valid physical 4-momentum, then $p'^\mu = (-E, -\vec{p})$ is never a

physical 4-momentum, since it has negative energy. So the relation between the two
diagrams is a formal one rather than a relation between physical reduced matrix
elements that would actually be measured; when one of the matrix elements involves
the 4-momenta appropriate for a physical process, the other one does not. However,
it is still perfectly valid as a mathematical relation, and therefore extremely useful.
In general, if one has calculated the cross-section or reduced matrix element for
one process, one can obtain the results for several “crossed” processes by merely
substituting momenta, with no additional labor. Equation (5.197) might look dubious
at first, since it may look like the right-hand side is negative for odd F. However, the
point is that when the expression for $|\mathcal{M}'|^2$ is analytically continued to unphysical
momenta, it always flips sign for odd F. We will see this in some examples later.

5.3.1 e− μ+ → e− μ+ and e− μ− → e− μ−

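Let us apply crossing symmetry to the example of (5.191) and (5.192) by assigning
primed momenta to the Feynman diagram for e− e+ → μ− μ+:

$$ \text{initial state} \begin{cases} e^- \leftrightarrow p'_a \\ e^+ \leftrightarrow p'_b \end{cases} \tag{5.198} $$
$$ \text{final state} \begin{cases} \mu^- \leftrightarrow k'_1 \\ \mu^+ \leftrightarrow k'_2. \end{cases} \tag{5.199} $$

Label the momenta in the crossed Feynman diagram (for e− μ+ → e− μ+) without
primes:

$$ \text{initial state} \begin{cases} e^- \leftrightarrow p_a \\ \mu^+ \leftrightarrow p_b \end{cases} \tag{5.200} $$
$$ \text{final state} \begin{cases} e^- \leftrightarrow k_1 \\ \mu^+ \leftrightarrow k_2. \end{cases} \tag{5.201} $$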

Then the Crossing Symmetry Theorem tells us that we can get the reduced matrix
element for the process e− μ+ → e− μ+ as a function of physical momenta $p_a, p_b, k_1, k_2$
by substituting unphysical momenta

$$ p'_a = p_a; \qquad p'_b = -k_1; \qquad k'_1 = -p_b; \qquad k'_2 = k_2, \tag{5.202} $$

into the formula for the reduced matrix element for the process e− e+ → μ− μ+.
This means that we can identify:

$$ s' = (p'_a + p'_b)^2 = (p_a - k_1)^2 = t, \tag{5.203} $$
$$ t' = (p'_a - k'_1)^2 = (p_a + p_b)^2 = s, \tag{5.204} $$
$$ u' = (p'_a - k'_2)^2 = (p_a - k_2)^2 = u. \tag{5.205} $$

In other words, crossing symmetry tells us that the formulas for the reduced matrix
elements for these two processes are just related by the exchange of s and t, as illus-
trated in the high-energy limit in (5.195) and (5.196). Since we had already derived
the result for the first process in Sect. 5.2.1, the second result has been obtained for
free. Note that we could have obtained the particular result (5.196) even more easily
just by noting that the calculation for the reduced matrix element of e− μ+ → e− μ+
is exactly the same as for Bhabha scattering, except that only the t-channel diagram
exists in the former case. So one only keeps the contribution with t (not s or u) in
the denominator, since that corresponds to the t-channel diagram.
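The substitution (5.202) and the resulting exchange of $s$ and $t$ can be checked with explicit momenta. The sketch below builds physical massless momenta for e− μ+ → e− μ+ in the CM frame, forms the unphysical primed momenta, and compares (5.195) evaluated at the primed Mandelstam variables with (5.196). The sample energy and angle, the mostly-minus metric, and the function names are assumptions made for illustration; two fermion lines are crossed here, so the factor $(-1)^F$ is +1.

```python
import numpy as np

METRIC = np.diag([1.0, -1.0, -1.0, -1.0])


def square(p):
    """Minkowski square p.p with the mostly-minus metric."""
    return float(p @ METRIC @ p)


def mandelstam(pa, pb, k1, k2):
    """Return (s, t, u) for the process pa + pb -> k1 + k2."""
    return square(pa + pb), square(pa - k1), square(pa - k2)


def msq_ee_to_mumu(s, t, u, e=1.0):
    """Spin-averaged squared matrix element of (5.195)."""
    return 2 * e**4 * (u**2 + t**2) / s**2


def msq_emu_to_emu(s, t, u, e=1.0):
    """Spin-averaged squared matrix element of (5.196)."""
    return 2 * e**4 * (u**2 + s**2) / t**2


# Physical momenta for e- mu+ -> e- mu+ (hypothetical sample point).
energy, cos_theta = 5.0, 0.3
sin_theta = np.sqrt(1 - cos_theta**2)
pa = energy * np.array([1.0, 0.0, 0.0, 1.0])
pb = energy * np.array([1.0, 0.0, 0.0, -1.0])
k1 = energy * np.array([1.0, sin_theta, 0.0, cos_theta])
k2 = energy * np.array([1.0, -sin_theta, 0.0, -cos_theta])

s, t, u = mandelstam(pa, pb, k1, k2)
# Primed (unphysical) momenta from (5.202): pa' = pa, pb' = -k1, k1' = -pb, k2' = k2.
s_p, t_p, u_p = mandelstam(pa, -k1, -pb, k2)

print("s' = t:", np.isclose(s_p, t), " t' = s:", np.isclose(t_p, s),
      " u' = u:", np.isclose(u_p, u))
print(msq_ee_to_mumu(s_p, t_p, u_p), msq_emu_to_emu(s, t, u))
```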
We can carry this further by considering another process also related by crossing
to the two just studied:

e− μ− → e− μ− , (5.206)

with physical momenta:

$$ \text{initial state} \begin{cases} e^- \leftrightarrow p_a \\ \mu^- \leftrightarrow p_b \end{cases} \tag{5.207} $$
$$ \text{final state} \begin{cases} e^- \leftrightarrow k_1 \\ \mu^- \leftrightarrow k_2. \end{cases} \tag{5.208} $$

This time, the Crossing Symmetry Theorem tells us that we can identify the matrix
element by again starting with the reduced matrix element for e− e+ → μ− μ+ and
replacing:

$$ p'_a = p_a; \qquad p'_b = -k_1; \qquad k'_1 = k_2; \qquad k'_2 = -p_b, \tag{5.209} $$

so that

$$ s' = (p'_a + p'_b)^2 = (p_a - k_1)^2 = t, \tag{5.210} $$
$$ t' = (p'_a - k'_1)^2 = (p_a - k_2)^2 = u, \tag{5.211} $$
$$ u' = (p'_a - k'_2)^2 = (p_a + p_b)^2 = s. \tag{5.212} $$

Here the primed Mandelstam variables are the unphysical ones for the e− e+ → μ− μ+
process, and the unprimed ones are for the desired process e− μ− → e− μ−. We can
therefore infer, from (5.195), that

$$ \frac{1}{4} \sum_{\rm spins} |\mathcal{M}_{e^- \mu^- \to e^- \mu^-}|^2 = 2e^4\, \frac{s^2 + u^2}{t^2}, \tag{5.213} $$

by doing the substitutions indicated by (5.210)–(5.212).

By comparing (5.196) to (5.213), one sees that the spin averaged and summed
squared matrix elements for the two processes are actually the same; the charge of the
muon does not matter at leading order. Because we are neglecting the muon mass,
the kinematics relating t and u to the scattering angle θ between the initial-state
electron and the final state electron in these two cases is just the same as in Bhabha
scattering:
s s
t = − (1 − cos θ ); u = − (1 + cos θ ). (5.214)
2 2

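Therefore, putting (5.213) into (4.192) with $|\vec{p}_a| = |\vec{k}_1|$ and using $e^2 = 4\pi\alpha$, we
obtain the differential cross-section for e− μ± → e− μ±:

$$ \frac{d\sigma}{d(\cos\theta)} = \frac{\pi\alpha^2}{s}\, \frac{u^2 + s^2}{t^2} \tag{5.215} $$
$$ = \frac{\pi\alpha^2}{s}\, \frac{5 + 2\cos\theta + \cos^2\theta}{(1 - \cos\theta)^2}. \tag{5.216} $$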

Note that this again diverges for forward scattering cos θ → 1.

5.3.2 Møller Scattering (e− e− → e− e− )

As another example, let us consider the case of Møller scattering:

e− e− → e− e− . (5.217)

There are two Feynman diagrams for this process:


Now, nobody can stop us from getting the result for this process by applying
the Feynman rules to get the reduced matrix element, taking the complex square,
summing and averaging over spins, and computing the Dirac traces. However, an
easier way is to note that this is a crossed version of Bhabha scattering, which we
studied earlier. Making a table of the momenta:

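Bhabha                Møller                (5.218)

$e^- \leftrightarrow p'_a$        $e^- \leftrightarrow p_a$        (5.219)
$e^+ \leftrightarrow p'_b$        $e^- \leftrightarrow p_b$        (5.220)
$e^- \leftrightarrow k'_1$        $e^- \leftrightarrow k_1$        (5.221)
$e^+ \leftrightarrow k'_2$        $e^- \leftrightarrow k_2$,       (5.222)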

we see that crossing symmetry allows us to compute the Møller scattering by identifying
the (initial state positron, final state positron) in Bhabha scattering with the
(final state electron, initial state electron) in Møller scattering, so that:

$$p_a' = p_a, \qquad p_b' = -k_2, \qquad k_1' = k_1, \qquad k_2' = -p_b. \tag{5.223}$$

Therefore, the Møller scattering reduced matrix element is obtained by putting

\begin{align}
s' &= (p_a' + p_b')^2 = (p_a - k_2)^2 = u \tag{5.224}\\
t' &= (p_a' - k_1')^2 = (p_a - k_1)^2 = t \tag{5.225}\\
u' &= (p_a' - k_2')^2 = (p_a + p_b)^2 = s \tag{5.226}
\end{align}

into the corresponding result for Bhabha scattering. Using the results of (5.182)–
(5.184), we get:
$$\frac{1}{4}\sum_{\text{spins}} |\mathcal{M}_{e^-e^-\to e^-e^-}|^2 = 2e^4\left[\frac{s^2+t^2}{u^2} + \frac{s^2+u^2}{t^2} + \frac{2s^2}{ut}\right]. \tag{5.227}$$

Again, if we keep only the t-channel part (that is, the part with $t^2$ in the denominator),
we recover the result for e− μ± → e− μ± in the previous section.
Applying (4.192), we find the differential cross-section:
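\begin{align}
\frac{d\sigma_{e^-e^-\to e^-e^-}}{d(\cos\theta)} &= \frac{1}{32\pi s}\left(\frac{1}{4}\sum_{\text{spins}} |\mathcal{M}_{e^-e^-\to e^-e^-}|^2\right) \tag{5.228}\\
&= \frac{\pi\alpha^2}{s}\left[\frac{s^2+t^2}{u^2} + \frac{s^2+u^2}{t^2} + \frac{2s^2}{ut}\right] \tag{5.229}\\
&= \frac{2\pi\alpha^2}{s}\left(\frac{3+\cos^2\theta}{1-\cos^2\theta}\right)^2. \tag{5.230}
\end{align}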

Just as in the cases in the previous sections, t = −s(1 − cos θ )/2 and u = −s(1 +
cos θ )/2 where θ is the angle between the initial-state and final-state electrons. How-
ever, in this case there is a special feature, because the two electrons in the final state
are indistinguishable particles. This means that the final state with an electron com-
ing out at angles (θ, φ) is actually the same quantum state as the one with an electron
coming out at angles (π − θ, −φ). (As a check, note that (5.230) is invariant under
cos θ → − cos θ . We have already integrated over the angle φ.) Therefore, to avoid
overcounting we must only integrate over half the range of θ , or equivalently divide
the total cross-section by 2. So, we have a tricky and crucial factor of 1/2 in the total
cross-section:
$$\sigma_{e^-e^-\to e^-e^-} = \frac{1}{2}\int_{-\cos\theta_{\rm cut}}^{\cos\theta_{\rm cut}} \frac{d\sigma_{e^-e^-\to e^-e^-}}{d(\cos\theta)}\, d(\cos\theta). \tag{5.231}$$

To obtain a finite value for the total cross-section, we had to also impose a cut on the
minimum scattering angle θcut that we require in order to say that a scattering event
should be counted.
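The algebra leading from (5.229) to (5.230), and the role of the factor 1/2 in (5.231), can be checked numerically. The sketch below is only an illustration: the helper names, the Simpson integrator, and the default value of α are choices made here, not taken from the text.

```python
import math

def moller_bracket(cos_theta, s=1.0):
    """Bracket of eq. (5.229), with t and u expressed through cos(theta)."""
    t = -0.5 * s * (1.0 - cos_theta)
    u = -0.5 * s * (1.0 + cos_theta)
    return (s * s + t * t) / (u * u) + (s * s + u * u) / (t * t) + 2.0 * s * s / (u * t)

def moller_closed_form(cos_theta):
    """Angular factor of eq. (5.230): 2 (3 + c^2)^2 / (1 - c^2)^2."""
    c2 = cos_theta * cos_theta
    return 2.0 * ((3.0 + c2) / (1.0 - c2)) ** 2

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def sigma_moller(cos_cut, s=1.0, alpha=1.0 / 137.036):
    """Total cross-section with an angular cut, eq. (5.231), including the 1/2."""
    integrand = lambda c: math.pi * alpha ** 2 / s * moller_bracket(c, s)
    return 0.5 * simpson(integrand, -cos_cut, cos_cut)

print(moller_bracket(0.4, s=3.0), moller_closed_form(0.4))
print(sigma_moller(0.5), sigma_moller(0.9))
```

Loosening the cut (larger cos θcut) increases the result without bound, which is the divergence that the cut is there to regulate.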

5.4 Gauge Invariance in Feynman Diagrams

Let us now turn to the issue of gauge invariance as it is manifested in QED Feyn-
man diagrams. Recall that when we found the Feynman propagator for a photon, it
contained a term that depended on an arbitrary parameter ξ . We have been work-
ing with ξ = 1 (Feynman gauge). Consider what the matrix element for the process
e− e+ → μ− μ+ would be if instead we let ξ remain unfixed. Instead of (5.42), we
would have

$$\mathcal{M} = (\bar v_b \gamma^\mu u_a)(\bar u_1 \gamma^\nu v_2)\,\frac{ie^2}{(p_a+p_b)^2}\left[-g_{\mu\nu} + (1-\xi)\frac{(p_a+p_b)_\mu (p_a+p_b)_\nu}{(p_a+p_b)^2}\right]. \tag{5.232}$$

If the answer is to be independent of ξ , then it must be true that the new term
proportional to (1 − ξ ) gives no contribution. This can be easily proved by observing
that it contains the factor

$$(\bar v_b \gamma^\mu u_a)(p_a+p_b)_\mu = \bar v_b\, \slashed{p}_a u_a + \bar v_b\, \slashed{p}_b u_a = m \bar v_b u_a - m \bar v_b u_a = 0. \tag{5.233}$$

Here we have applied the Dirac equation, as embodied in (A.2.24) and (A.2.25),
to write $\slashed{p}_a u_a = m u_a$ and $\bar v_b\, \slashed{p}_b = -m \bar v_b$. For any photon propagator connected (at
either end) to an external fermion line, the proof is similar. And, in general, one
can show that the 1 − ξ term will always cancel when one includes all Feynman
diagrams contributing to a particular process. So we can choose the most convenient
value of ξ , which is usually ξ = 1.
Another aspect of gauge invariance involves a feature that we have not explored
in an example so far: external state photons. Recall that the Feynman rules associate
factors of $\epsilon_\mu(p,\lambda)$ and $\epsilon^*_\mu(p,\lambda)$ to initial or final state photons, respectively. Now,
making a gauge transformation on the photon field results in:

$$A_\mu \to A_\mu + \partial_\mu \Lambda \tag{5.234}$$

where $\Lambda$ is any function. In momentum space, the derivative $\partial_\mu$ is proportional to
$p_\mu$. So, in terms of the polarization vector for the electromagnetic field, a gauge
transformation is

$$\epsilon^\mu(p,\lambda) \to \epsilon^\mu(p,\lambda) + a p^\mu \tag{5.235}$$

where a is any quantity. The polarization vector and momentum for a physical photon
satisfy $\epsilon^2 = -1$ and $\epsilon\cdot p = 0$ and $p^2 = 0$. As a consistency check, note that if these
relations are satisfied, then they will also be obeyed after the gauge transformation
(5.235).
Gauge invariance implies that the reduced matrix element should also be
unchanged after the substitution in (5.235). The reduced matrix element for a process
with an external state photon with momentum $p^\mu$ and polarization label λ can always
be written in the form:

$$\mathcal{M} = \mathcal{M}_\mu\, \epsilon^\mu(p,\lambda), \tag{5.236}$$

which defines $\mathcal{M}_\mu$. Since $\mathcal{M}$ must be invariant under a gauge transformation, it
follows from (5.235) that

$$\mathcal{M}_\mu p^\mu = 0. \tag{5.237}$$

This relation is known as the Ward identity for QED. It says that if we replace the
polarization vector for any photon by the momentum of that photon, then the reduced
matrix element should become 0. This is a nice consistency check on calculations.

Another nice consequence of the Ward identity is that it provides for a simplified
way to sum or average over unmeasured photon polarization states. Consider a pho-
ton with momentum taken to be along the positive z axis, with p μ = (P, 0, 0, P).
Summing over the two polarization vectors in (5.26)–(5.27), we have:

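$$\sum_{\lambda=1}^{2} |\mathcal{M}|^2 = \sum_{\lambda=1}^{2} \mathcal{M}_\mu \mathcal{M}^*_\nu\, \epsilon^\mu(p,\lambda)\, \epsilon^{*\nu}(p,\lambda) = |\mathcal{M}_1|^2 + |\mathcal{M}_2|^2, \tag{5.238}$$

where $\mathcal{M}_\mu = (\mathcal{M}_0, \mathcal{M}_1, \mathcal{M}_2, \mathcal{M}_3)$. The Ward identity implies that $p^\mu \mathcal{M}_\mu = P\mathcal{M}_0 + P\mathcal{M}_3 = 0$, so $|\mathcal{M}_0|^2 = |\mathcal{M}_3|^2$. Therefore, we can write $\sum_{\lambda=1}^{2} |\mathcal{M}|^2 = -|\mathcal{M}_0|^2 + |\mathcal{M}_1|^2 + |\mathcal{M}_2|^2 + |\mathcal{M}_3|^2$, or

$$\sum_{\lambda=1}^{2} |\mathcal{M}|^2 = -g^{\mu\nu} \mathcal{M}_\mu \mathcal{M}^*_\nu. \tag{5.239}$$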

The last equation is written in a Lorentz invariant form, so it is true for any photon
momentum direction, not just momenta oriented along the z direction. Gauge invari-
ance, as expressed by the Ward identity, therefore implies that we can always sum
over a photon’s polarization states by the rule:

$$\sum_{\lambda=1}^{2} \epsilon^\mu(p,\lambda)\, \epsilon^{*\nu}(p,\lambda) = -g^{\mu\nu} + (\text{irrelevant})^{\mu\nu}, \tag{5.240}$$

as long as we are taking the sum of the complex square of a reduced matrix element.
Although the $(\text{irrelevant})^{\mu\nu}$ part is non-zero, it must vanish when contracted with
$\mathcal{M}_\mu \mathcal{M}^*_\nu$, according to (5.239).
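The polarization-sum rule can be illustrated with explicit numbers. The sketch below assumes linear transverse polarization vectors (0, 1, 0, 0) and (0, 0, 1, 0) for a photon moving along +z; these are a stand-in for (5.26)–(5.27), which are not reproduced here. The amplitude components are arbitrary complex numbers chosen for illustration, constrained only by the Ward identity.

```python
def contract(m_lower, v_upper):
    """Contraction M_mu v^mu, with M given by its lower-index components."""
    return sum(m * v for m, v in zip(m_lower, v_upper))

P = 3.0
p = (P, 0.0, 0.0, P)                      # photon momentum p^mu along +z
eps = [(0.0, 1.0, 0.0, 0.0),              # assumed transverse polarization vectors
       (0.0, 0.0, 1.0, 0.0)]

# Arbitrary illustrative amplitude; choosing M_3 = -M_0 enforces M_mu p^mu = 0.
M0 = 0.7 - 0.2j
M = (M0, 1.5 + 0.5j, -0.3 + 2.0j, -M0)

ward = contract(M, p)                                  # should vanish
pol_sum = sum(abs(contract(M, e)) ** 2 for e in eps)   # left side of (5.238)

g_diag = (1.0, -1.0, -1.0, -1.0)
metric_sum = -sum(g * abs(m) ** 2 for g, m in zip(g_diag, M))  # right side of (5.239)

# A gauge shift eps -> eps + a p leaves the amplitude unchanged, as in (5.235).
a = 0.4 + 1.1j
shifted = tuple(e + a * q for e, q in zip(eps[0], p))

print(abs(ward), pol_sum, metric_sum)
print(contract(M, eps[0]), contract(M, shifted))
```

If the Ward identity is violated (for example by changing `M[3]`), the two sums no longer agree, which is why the replacement (5.240) is only valid inside a gauge-invariant squared amplitude.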

5.5 External Photon Scattering

5.5.1 Compton Scattering (γ e− → γ e− )

As our first example of a process with external-state photons, consider Compton

scattering:

γ e− → γ e− . (5.241)

First let us assign the following labels to the external states:

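\begin{align*}
\text{initial } \gamma &\qquad \epsilon^\mu(p_a, \lambda_a)\\
\text{initial } e^- &\qquad u(p_b, s_b)\\
\text{final } \gamma &\qquad \epsilon^{\nu *}(k_1, \lambda_1)\\
\text{final } e^- &\qquad \bar u(k_2, s_2).
\end{align*}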

There are two Feynman diagrams for this process:


which are s-channel and u-channel, respectively. Applying the QED Feynman rules,
we obtain:

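\begin{align}
\mathcal{M}_s &= \bar u_2\, (ie\gamma^\nu)\, \frac{i(\slashed{p}_a + \slashed{p}_b + m)}{(p_a+p_b)^2 - m^2}\, (ie\gamma^\mu)\, u_b\, \epsilon^*_{1\nu}\, \epsilon_{a\mu} \tag{5.242}\\
&= -i\frac{e^2}{s-m^2}\, \bar u_2 \gamma^\nu (\slashed{p}_a + \slashed{p}_b + m) \gamma^\mu u_b\, \epsilon^*_{1\nu}\, \epsilon_{a\mu} \tag{5.243}
\end{align}

and

\begin{align}
\mathcal{M}_u &= \bar u_2\, (ie\gamma^\mu)\, \frac{i(\slashed{p}_b - \slashed{k}_1 + m)}{(p_b-k_1)^2 - m^2}\, (ie\gamma^\nu)\, u_b\, \epsilon^*_{1\nu}\, \epsilon_{a\mu} \tag{5.244}\\
&= -i\frac{e^2}{u-m^2}\, \bar u_2 \gamma^\mu (\slashed{p}_b - \slashed{k}_1 + m) \gamma^\nu u_b\, \epsilon^*_{1\nu}\, \epsilon_{a\mu}. \tag{5.245}
\end{align}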

Before squaring the total reduced matrix element, it is useful to simplify. So we note
that:

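\begin{align}
(\slashed{p}_b + m)\gamma^\mu u_b &= \{\slashed{p}_b, \gamma^\mu\} u_b - \gamma^\mu (\slashed{p}_b - m) u_b \tag{5.246}\\
&= p_{b\sigma} \{\gamma^\sigma, \gamma^\mu\} u_b + 0 \tag{5.247}\\
&= 2 p_{b\sigma}\, g^{\sigma\mu} u_b \tag{5.248}\\
&= 2 p_b^\mu u_b, \tag{5.249}
\end{align}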

where we have used (A.2.24). So the reduced matrix element is:

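$$\mathcal{M} = -ie^2\, \epsilon^*_{1\nu}\, \epsilon_{a\mu}\, \bar u_2 \left[\frac{1}{s-m^2}\left(\gamma^\nu \slashed{p}_a \gamma^\mu + 2 p_b^\mu \gamma^\nu\right) + \frac{1}{u-m^2}\left(-\gamma^\mu \slashed{k}_1 \gamma^\nu + 2 p_b^\nu \gamma^\mu\right)\right] u_b. \tag{5.250}$$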

Taking the complex conjugate of (5.250) gives:

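$$\mathcal{M}^* = ie^2\, \epsilon_{1\sigma}\, \epsilon^*_{a\rho}\, \bar u_b \left[\frac{1}{s-m^2}\left(\gamma^\rho \slashed{p}_a \gamma^\sigma + 2 p_b^\rho \gamma^\sigma\right) + \frac{1}{u-m^2}\left(-\gamma^\sigma \slashed{k}_1 \gamma^\rho + 2 p_b^\sigma \gamma^\rho\right)\right] u_2. \tag{5.251}$$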

Now we multiply together (5.250) and (5.251), and average over the initial photon
polarization λa and sum over the final photon polarization λ1 , using

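\begin{align}
\frac{1}{2}\sum_{\lambda_a=1}^{2} \epsilon_{a\mu}\, \epsilon^*_{a\rho} &= -\frac{1}{2} g_{\mu\rho} + \text{irrelevant}, \tag{5.252}\\
\sum_{\lambda_1=1}^{2} \epsilon^*_{1\nu}\, \epsilon_{1\sigma} &= -g_{\nu\sigma} + \text{irrelevant}, \tag{5.253}
\end{align}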

to obtain:
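$$\begin{aligned}
\frac{1}{2}\sum_{\lambda_a,\lambda_1} |\mathcal{M}|^2 = \frac{e^4}{2}\, &\bar u_2 \left[\frac{1}{s-m^2}\left(\gamma^\nu \slashed{p}_a \gamma^\mu + 2 p_b^\mu \gamma^\nu\right) + \frac{1}{u-m^2}\left(-\gamma^\mu \slashed{k}_1 \gamma^\nu + 2 p_b^\nu \gamma^\mu\right)\right] u_b\\
&\bar u_b \left[\frac{1}{s-m^2}\left(\gamma_\mu \slashed{p}_a \gamma_\nu + 2 p_{b\mu} \gamma_\nu\right) + \frac{1}{u-m^2}\left(-\gamma_\nu \slashed{k}_1 \gamma_\mu + 2 p_{b\nu} \gamma_\mu\right)\right] u_2.
\end{aligned} \tag{5.254}$$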

Next we can average over sb , and sum over s2 , using the usual tricks:

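\begin{align}
\frac{1}{2}\sum_{s_b} u_b \bar u_b &= \frac{1}{2}(\slashed{p}_b + m), \tag{5.255}\\
\sum_{s_2} \bar u_2 \ldots u_2 &= \mathrm{Tr}[\ldots (\slashed{k}_2 + m)]. \tag{5.256}
\end{align}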

The result is a single spinor trace:

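$$\begin{aligned}
\frac{1}{4}\sum_{\text{spins}} |\mathcal{M}|^2 = \frac{e^4}{4}\, \mathrm{Tr}\Bigg[ &\left\{\frac{1}{s-m^2}\left(\gamma^\nu \slashed{p}_a \gamma^\mu + 2 p_b^\mu \gamma^\nu\right) + \frac{1}{u-m^2}\left(-\gamma^\mu \slashed{k}_1 \gamma^\nu + 2 p_b^\nu \gamma^\mu\right)\right\} (\slashed{p}_b + m)\\
&\left\{\frac{1}{s-m^2}\left(\gamma_\mu \slashed{p}_a \gamma_\nu + 2 p_{b\mu} \gamma_\nu\right) + \frac{1}{u-m^2}\left(-\gamma_\nu \slashed{k}_1 \gamma_\mu + 2 p_{b\nu} \gamma_\mu\right)\right\} (\slashed{k}_2 + m) \Bigg].
\end{aligned} \tag{5.257}$$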

Doing this trace requires a little patience and organization. The end result can be
written compactly in terms of

\begin{align}
p_a \cdot p_b &= \frac{s - m^2}{2}, \tag{5.258}\\
p_b \cdot k_1 &= \frac{m^2 - u}{2}. \tag{5.259}
\end{align}

After some calculation, one finds:

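$$\frac{1}{4}\sum_{\text{spins}} |\mathcal{M}|^2 = 2e^4 \left[\frac{p_a\cdot p_b}{p_b\cdot k_1} + \frac{p_b\cdot k_1}{p_a\cdot p_b} + 2m^2\left(\frac{1}{p_a\cdot p_b} - \frac{1}{p_b\cdot k_1}\right) + m^4\left(\frac{1}{p_a\cdot p_b} - \frac{1}{p_b\cdot k_1}\right)^2\right]. \tag{5.260}$$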

Equation (5.260) is a Lorentz scalar. We can now find the differential cross-section
after choosing a reference frame. We will do this first in the center-of-momentum
frame, and then redo it in the “lab” frame in which the initial electron is at rest.
In the center-of-momentum frame, the kinematics is just like in the case eμ → eμ.
Call the magnitude of the 3-momentum of the photon in the initial state P. Then
using four-momentum conservation and the on-shell conditions $p_a^2 = k_1^2 = 0$ and
$p_b^2 = k_2^2 = m^2$, and taking the initial state photon momentum to be in the +z direction
and the final state photon momentum to make an angle θ with the z-axis, we have:

\begin{align}
p_a^\mu &= (P, 0, 0, P), \tag{5.261}\\
p_b^\mu &= (\sqrt{P^2+m^2}, 0, 0, -P), \tag{5.262}\\
k_1^\mu &= (P, 0, P\sin\theta, P\cos\theta), \tag{5.263}\\
k_2^\mu &= (\sqrt{P^2+m^2}, 0, -P\sin\theta, -P\cos\theta). \tag{5.264}
\end{align}

The initial and final state photons have the same energy, as do the initial and final
state electrons, so define:

\begin{align}
E_\gamma &= P, \tag{5.265}\\
E_e &= \sqrt{P^2+m^2}. \tag{5.266}
\end{align}

Then we have:

\begin{align}
s &= (E_e + E_\gamma)^2, \tag{5.267}\\
p_a \cdot p_b &= E_\gamma (E_e + E_\gamma), \tag{5.268}\\
p_b \cdot k_1 &= E_\gamma (E_e + E_\gamma \cos\theta), \tag{5.269}\\
\frac{|\vec k_1|}{|\vec p_a|} &= 1. \tag{5.270}
\end{align}

So, applying (4.191) to (5.260), we obtain:

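$$\frac{d\sigma}{d(\cos\theta)} = \frac{\pi\alpha^2}{s}\left[\frac{E_e+E_\gamma}{E_e+E_\gamma\cos\theta} + \frac{E_e+E_\gamma\cos\theta}{E_e+E_\gamma} + \frac{2m^2(\cos\theta-1)}{(E_e+E_\gamma)(E_e+E_\gamma\cos\theta)} + m^4\left(\frac{\cos\theta-1}{(E_e+E_\gamma)(E_e+E_\gamma\cos\theta)}\right)^2\right]. \tag{5.271}$$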

In a typical Compton scattering situation, the energies in the center-of-momentum
frame are much larger than the electron’s mass. So, consider the high-energy limit
$E_\gamma \gg m$. Naively, we can set $E_e = E_\gamma$ and m = 0 in that limit. However, this
requires some care for cos θ ≈ −1, since then the denominator factor $E_e + E_\gamma\cos\theta$
can become small. This is most important for the first term in (5.271). In fact, this
term gives the dominant contribution to the total cross-section, coming from the
cos θ ≈ −1 (back-scattering) region. Integrating with respect to cos θ, we find:

$$\int_{-1}^{1} \frac{E_e+E_\gamma}{E_e+E_\gamma\cos\theta}\, d(\cos\theta) = \frac{E_e+E_\gamma}{E_\gamma}\, \ln\left(\frac{E_e+E_\gamma}{E_e-E_\gamma}\right). \tag{5.272}$$

Now, expanding in the small mass of the electron,

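\begin{align}
E_e - E_\gamma &= E_\gamma\sqrt{1 + m^2/E_\gamma^2} - E_\gamma = E_\gamma\left[1 + \frac{m^2}{2E_\gamma^2} + \cdots - 1\right] = \frac{m^2}{2E_\gamma} + O(m^4), \notag\\
E_e + E_\gamma &= 2E_\gamma + O(m^2). \tag{5.273}
\end{align}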

Therefore, using $s \approx 4E_\gamma^2$,

$$\int_{-1}^{1} \frac{E_e+E_\gamma}{E_e+E_\gamma\cos\theta}\, d(\cos\theta) = 2\ln(s/m^2) + O(m^2), \tag{5.274}$$

with the dominant contribution coming from cos θ near −1, where the denominator
of the integrand becomes small. Integrating the second term in (5.271), one finds:

$$\int_{-1}^{1} \frac{E_e+E_\gamma\cos\theta}{E_e+E_\gamma}\, d(\cos\theta) = \frac{2E_e}{E_e+E_\gamma} = 1 + O(m^2). \tag{5.275}$$

The remaining two terms vanish as $m^2/s \to 0$. So, for $s \gg m^2$ we have:

$$\sigma = \int_{-1}^{1} \frac{d\sigma}{d(\cos\theta)}\, d(\cos\theta) = \frac{\pi\alpha^2}{s}\left[2\ln(s/m^2) + 1\right] \tag{5.276}$$

plus terms that vanish like $(m^2/s^2)\ln(s/m^2)$ as $m^2/s \to 0$. The cross-section falls
at high energy like 1/s, but with a logarithmic enhancement coming from back-scattered
photons with angles θ ≈ π. The origin of this enhancement can be
traced to the u-channel propagator, which becomes large when $m^2 - u = 2p_b\cdot k_1 =
2E_\gamma(E_e + E_\gamma\cos\theta)$ becomes small. This corresponds to the virtual electron in the
u-channel Feynman diagram going nearly on-shell. (Notice that $s - m^2$ can never
become small when $s \gg m^2$.)
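To see how quickly the leading-logarithm formula (5.276) takes over, one can integrate the bracket of (5.271) numerically at a finite energy and compare with 2 ln(s/m²) + 1. The following is a minimal sketch in units where m = 1; the function names and the Simpson integrator are illustrative choices, not part of the text.

```python
import math

def simpson(f, a, b, n=20000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def compton_cm_bracket(cos_theta, e_gamma, m=1.0):
    """Bracket of eq. (5.271) in the center-of-momentum frame."""
    e_e = math.sqrt(e_gamma ** 2 + m ** 2)
    total_e = e_e + e_gamma                  # E_e + E_gamma = sqrt(s)
    denom = e_e + e_gamma * cos_theta        # E_e + E_gamma cos(theta)
    x = (cos_theta - 1.0) / (total_e * denom)
    return total_e / denom + denom / total_e + 2.0 * m ** 2 * x + m ** 4 * x ** 2

def integrated_bracket(e_gamma, m=1.0):
    """Integral of the bracket over cos(theta) from -1 to 1."""
    return simpson(lambda c: compton_cm_bracket(c, e_gamma, m), -1.0, 1.0)

def leading_log(e_gamma, m=1.0):
    """High-energy approximation 2 ln(s/m^2) + 1 from eq. (5.276)."""
    s = (math.sqrt(e_gamma ** 2 + m ** 2) + e_gamma) ** 2
    return 2.0 * math.log(s / m ** 2) + 1.0

for e_gamma in (2.0, 5.0, 10.0):
    exact = integrated_bracket(e_gamma)
    approx = leading_log(e_gamma)
    print(f"E_gamma/m = {e_gamma:5.1f}: exact {exact:8.4f}, leading log {approx:8.4f}")
```

The integrand is sharply peaked near cos θ = −1, so a fine grid is used; at much higher energies a change of variables would be a better choice than brute-force Simpson integration.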
Just for fun, let us redo the analysis of Compton scattering, starting from (5.260),
but now working in the lab frame in which the initial state electron is at rest. Let
us call the energy of the initial state photon $\omega$ and that of the final state photon $\omega'$.
(Recall that $\hbar = 1$ in our units, so the energy of a photon is equal to its angular
frequency.)

Then, in terms of the lab photon scattering angle θ , the 4-momenta are:

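\begin{align}
p_a &= (\omega, 0, 0, \omega), \tag{5.277}\\
p_b &= (m, 0, 0, 0), \tag{5.278}\\
k_1 &= (\omega', 0, \omega'\sin\theta, \omega'\cos\theta), \tag{5.279}\\
k_2 &= (\omega + m - \omega', 0, -\omega'\sin\theta, \omega - \omega'\cos\theta). \tag{5.280}
\end{align}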

Here we have used four-momentum conservation and the on-shell conditions $p_a^2 =
k_1^2 = 0$ and $p_b^2 = m^2$. Applying the last on-shell condition $k_2^2 = m^2$ now leads to:

$$2m(\omega - \omega') + 2\omega\omega'(\cos\theta - 1) = 0. \tag{5.281}$$

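This can be used to solve for $\omega'$ in terms of cos θ, or vice versa:

\begin{align}
\omega' &= \frac{\omega}{1 + \frac{\omega}{m}(1-\cos\theta)}; \tag{5.282}\\
\cos\theta &= 1 - \frac{m(\omega - \omega')}{\omega\omega'}. \tag{5.283}
\end{align}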
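The lab-frame kinematics can be checked directly: given ω and cos θ, the final photon energy from (5.282) should put the recoiling electron of (5.280) on its mass shell. The sketch below uses illustrative function names and works in units where energies and the mass m share the same unit.

```python
import math

def omega_prime(omega, cos_theta, m=1.0):
    """Final-state photon energy in the lab frame, eq. (5.282)."""
    return omega / (1.0 + (omega / m) * (1.0 - cos_theta))

def k2_squared(omega, cos_theta, m=1.0):
    """Invariant mass squared of the final electron 4-momentum, eq. (5.280)."""
    wp = omega_prime(omega, cos_theta, m)
    sin_theta = math.sqrt(1.0 - cos_theta ** 2)
    energy = omega + m - wp
    ky = -wp * sin_theta
    kz = omega - wp * cos_theta
    return energy ** 2 - ky ** 2 - kz ** 2

# The electron stays on shell for any angle; forward scattering costs no energy.
print(k2_squared(2.5, 0.3), k2_squared(0.1, -1.0))
print(omega_prime(2.5, 1.0), omega_prime(1.0e6, -1.0))
```

For ω much larger than m, the back-scattered photon energy approaches m/2 regardless of ω, as the last printed value illustrates.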
In terms of lab-frame variables, one has:

\begin{align}
p_a \cdot p_b &= \omega m; \tag{5.284}\\
p_b \cdot k_1 &= \omega' m, \tag{5.285}
\end{align}

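so that, from (5.260):

$$\frac{1}{4}\sum_{\text{spins}} |\mathcal{M}|^2 = 2e^4\left[\frac{\omega}{\omega'} + \frac{\omega'}{\omega} + 2m\left(\frac{1}{\omega} - \frac{1}{\omega'}\right) + m^2\left(\frac{1}{\omega} - \frac{1}{\omega'}\right)^2\right]. \tag{5.286}$$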

This can be simplified slightly by using

$$m\left(\frac{1}{\omega} - \frac{1}{\omega'}\right) = \cos\theta - 1, \tag{5.287}$$

which follows from (5.281), so that

$$\frac{1}{4}\sum_{\text{spins}} |\mathcal{M}|^2 = 2e^4\left[\frac{\omega}{\omega'} + \frac{\omega'}{\omega} - \sin^2\theta\right]. \tag{5.288}$$

Now to find the differential cross-section, we must apply (4.175):

$$d\sigma = \frac{1}{4E_a E_b |v_a - v_b|}\left(\frac{1}{4}\sum_{\text{spins}} |\mathcal{M}|^2\right) d\Phi_2, \tag{5.289}$$

where $d\Phi_2$ is the two-body Lorentz-invariant phase space, as defined in (4.176), for
the final state particles in the lab frame. Evaluating the prefactors for the case at
hand:

\begin{align}
E_a &= \omega, \tag{5.290}\\
E_b &= m, \tag{5.291}\\
|v_a - v_b| &= 1 - 0 = 1. \tag{5.292}
\end{align}

(Recall that the speed of the photon is c = 1.)

The two-body phase space in the lab frame is:

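\begin{align}
d\Phi_2 &= (2\pi)^4 \delta^{(4)}(k_1 + k_2 - p_a - p_b)\, \frac{d^3\vec k_1}{(2\pi)^3 2E_1}\, \frac{d^3\vec k_2}{(2\pi)^3 2E_2} \tag{5.293}\\
&= \delta^{(3)}(\vec k_1 + \vec k_2 - \omega\hat z)\, \delta(\omega' + E_2 - \omega - m)\, \frac{d^3\vec k_1}{16\pi^2 \omega' E_2}\, d^3\vec k_2, \tag{5.294}
\end{align}

where $\omega'$ is now defined to be equal to $E_1 = |\vec k_1|$ and $E_2$ is defined to be $\sqrt{|\vec k_2|^2 + m^2}$.
Performing the $\vec k_2$ integral using the 3-momentum delta function just sets $\vec k_2 =
\omega\hat z - \vec k_1$, resulting in:

$$d\Phi_2 = \delta(\omega' + E_2 - \omega - m)\, \frac{d^3\vec k_1}{16\pi^2 \omega' E_2}, \tag{5.295}$$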

where now

$$E_2 = \sqrt{\omega'^2 - 2\omega\omega'\cos\theta + \omega^2 + m^2}. \tag{5.296}$$

The phase space for the final state photon can be simplified by writing it in terms of
angular coordinates and doing the integral $\int d\phi = 2\pi$:

$$d^3\vec k_1 = d\phi\, d(\cos\theta)\, \omega'^2 d\omega' = 2\pi\, d(\cos\theta)\, \omega'^2 d\omega'. \tag{5.297}$$

Therefore,

$$d\Phi_2 = \delta(\omega' + E_2 - \omega - m)\, \frac{d(\cos\theta)\, \omega' d\omega'}{8\pi E_2}. \tag{5.298}$$

In order to do the $\omega'$ integral, it is simplest, as usual, to make a change of integration
variables to the argument of the delta function. So, defining

$$K = \omega' + E_2 - \omega - m, \tag{5.299}$$

we have

$$\frac{dK}{d\omega'} = 1 + \frac{\omega' - \omega\cos\theta}{E_2}. \tag{5.300}$$

Therefore,

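$$\frac{\omega' d\omega'}{E_2} = \frac{\omega' dK}{E_2 + \omega' - \omega\cos\theta}. \tag{5.301}$$

Performing the dK integration sets K = 0, so $E_2 = \omega + m - \omega'$. Using (5.301) in
(5.298) gives

$$d\Phi_2 = \frac{\omega'}{m + \omega(1-\cos\theta)}\, \frac{d(\cos\theta)}{8\pi} = \frac{\omega'^2}{8\pi m\omega}\, d(\cos\theta), \tag{5.302}$$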

where (5.282) has been used to simplify the denominator. Finally using this in (5.289)
yields:
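$$d\sigma = \left(\frac{1}{4}\sum_{\text{spins}} |\mathcal{M}|^2\right) \frac{\omega'^2}{32\pi m^2\omega^2}\, d(\cos\theta), \tag{5.303}$$

so that, putting in (5.288),

$$\frac{d\sigma}{d(\cos\theta)} = \frac{\pi\alpha^2}{m^2}\left(\frac{\omega'}{\omega}\right)^2\left[\frac{\omega}{\omega'} + \frac{\omega'}{\omega} - \sin^2\theta\right]. \tag{5.304}$$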

This result is the Klein-Nishina formula.

As in the center-of-momentum frame, the largest differential cross-section is
found for back-scattered photons with cos θ = −1, which corresponds to the smallest
possible final-state photon energy $\omega'$. Notice that, according to (5.282), when
$\omega \gg m$, one gets very low energies $\omega'$ when the photon is back-scattered. One can
now integrate the lab-frame differential cross-section, $\int_{-1}^{1} d(\cos\theta)$, to get the total
cross-section, using the dependence of $\omega'$ on cos θ as given in (5.282). The result is:

$$\sigma = \pi\alpha^2\left[\left(\frac{1}{\omega m} - \frac{2}{\omega^2} - \frac{2m}{\omega^3}\right)\ln\left(1 + \frac{2\omega}{m}\right) + \frac{4}{\omega^2} + \frac{2(m+\omega)}{m(m+2\omega)^2}\right]. \tag{5.305}$$

In the high-energy (small m) limit, this becomes:

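$$\sigma = \pi\alpha^2\left[\frac{1}{\omega m}\ln\left(\frac{2\omega}{m}\right) + \frac{1}{2\omega m} + \cdots\right]. \tag{5.306}$$

We can re-express this in terms of the Mandelstam variable $s = (\omega + m)^2 - \omega^2 =
2\omega m + m^2 \approx 2\omega m$,

$$\sigma = \frac{\pi\alpha^2}{s}\left[2\ln(s/m^2) + 1 + \cdots\right]. \tag{5.307}$$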
Equation (5.307) is the same result that we found in the center-of-momentum frame,
(5.276). This is an example of a general fact: the total cross-section does not depend
on the choice of reference frame, provided that one boosts along a direction parallel
to the collision axis. To see why, one need only look at the definition of the total
cross-section given in (4.145). The numbers of particles N S , Na , and Nb can be
simply counted, and so certainly do not depend on any choice of inertial reference
frame, while the area A is invariant under Lorentz boosts along the collision axis.
The low-energy Thomson scattering limit is also interesting. In the lab frame,
$\omega \ll m$ implies $\omega'/\omega \approx 1$, so that

$$\frac{d\sigma}{d(\cos\theta)} = \frac{\pi\alpha^2}{m^2}(1 + \cos^2\theta), \tag{5.308}$$

which is independent of energy. The total cross-section in this limit is then

$$\sigma = \frac{8\pi\alpha^2}{3m^2}. \tag{5.309}$$
Unlike the case of high-energy Compton scattering, Thomson scattering is symmet-
ric under θ → π − θ , with a factor of 2 enhancement in the forward (θ = 0) and
backward (θ = π ) directions compared to right-angle (θ = π/2) scattering.
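The lab-frame results can be tied together numerically: integrating the Klein-Nishina formula (5.304) over cos θ should reproduce the closed form (5.305), and for ω much smaller than m the total should approach the Thomson value (5.309). The sketch below is illustrative only; the function names, the Simpson integrator, and the default α are choices made here.

```python
import math

ALPHA = 1.0 / 137.036  # fine-structure constant (approximate value)

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def kn_differential(cos_theta, omega, m=1.0, alpha=ALPHA):
    """d(sigma)/d(cos theta) from eq. (5.304), with omega'/omega from (5.282)."""
    r = 1.0 / (1.0 + (omega / m) * (1.0 - cos_theta))   # omega' / omega
    sin2 = 1.0 - cos_theta ** 2
    return math.pi * alpha ** 2 / m ** 2 * r * r * (1.0 / r + r - sin2)

def kn_total(omega, m=1.0, alpha=ALPHA):
    """Closed-form total cross-section, eq. (5.305)."""
    log_term = math.log1p(2.0 * omega / m)
    return math.pi * alpha ** 2 * (
        (1.0 / (omega * m) - 2.0 / omega ** 2 - 2.0 * m / omega ** 3) * log_term
        + 4.0 / omega ** 2
        + 2.0 * (m + omega) / (m * (m + 2.0 * omega) ** 2)
    )

def thomson_total(m=1.0, alpha=ALPHA):
    """Low-energy limit, eq. (5.309)."""
    return 8.0 * math.pi * alpha ** 2 / (3.0 * m ** 2)

numeric = simpson(lambda c: kn_differential(c, 1.0), -1.0, 1.0)
print(numeric, kn_total(1.0))
print(kn_total(1.0e-3) / thomson_total())
```

At ω = m the total is already well below the Thomson value, showing how quickly the cross-section falls once the photon energy becomes comparable to the electron mass.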

5.5.2 e+ e− → γ γ

As our final example of a QED process let us consider e+ e− → γ γ in the high-

energy limit. We could always compute the reduced matrix element starting from
the Feynman rules. However, since we have already done Compton scattering, it is
easier to use crossing. Labeling the physical momenta for Compton scattering now
with primes, we can make the following comparison table:

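\begin{align}
&\text{Compton} & &e^+e^- \to \gamma\gamma \tag{5.310}\\
&\gamma \leftrightarrow p_a' & &e^+ \leftrightarrow p_a \tag{5.311}\\
&e^- \leftrightarrow p_b' & &e^- \leftrightarrow p_b \tag{5.312}\\
&\gamma \leftrightarrow k_1' & &\gamma \leftrightarrow k_1 \tag{5.313}\\
&e^- \leftrightarrow k_2' & &\gamma \leftrightarrow k_2. \tag{5.314}
\end{align}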

For convenience, we have chosen e+ to be labeled by “a”, so that the initial-state e−
can have the same label “b” in both processes. Then according to the Crossing Symmetry
Theorem, we can obtain $\sum_{\text{spins}} |\mathcal{M}_{e^+e^-\to\gamma\gamma}|^2$ by making the replacements

$$p_a' = -k_2; \qquad p_b' = p_b; \qquad k_1' = k_1; \qquad k_2' = -p_a \tag{5.315}$$

in the Compton scattering result (for small m):

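$$\sum_{\text{spins}} |\mathcal{M}_{\gamma e^-\to\gamma e^-}|^2 = 8e^4\left[\frac{p_a'\cdot p_b'}{p_b'\cdot k_1'} + \frac{p_b'\cdot k_1'}{p_a'\cdot p_b'}\right], \tag{5.316}$$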

obtained from (5.260). Because the crossing involves one fermion (a final state
electron changes into an initial state positron), there is also a factor of (−1)1 = −1,
according to (5.197). So, the result is:

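\begin{align}
\sum_{\text{spins}} |\mathcal{M}_{e^+e^-\to\gamma\gamma}|^2 &= (-1)\, 8e^4\left[\frac{-k_2\cdot p_b}{p_b\cdot k_1} + \frac{p_b\cdot k_1}{-k_2\cdot p_b}\right] \tag{5.317}\\
&= 8e^4\left[\frac{k_2\cdot p_b}{p_b\cdot k_1} + \frac{p_b\cdot k_1}{k_2\cdot p_b}\right]. \tag{5.318}
\end{align}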

The situation here is 2 → 2 massless particle scattering, so we can steal the

kinematics information directly from (5.177)–(5.181). The relevant facts for the
present case are:
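\begin{align}
p_b\cdot k_1 &= -\frac{u}{2} = \frac{s}{4}(1+\cos\theta), \tag{5.319}\\
p_b\cdot k_2 &= -\frac{t}{2} = \frac{s}{4}(1-\cos\theta). \tag{5.320}
\end{align}

Therefore,

$$\frac{1}{4}\sum_{\text{spins}} |\mathcal{M}_{e^+e^-\to\gamma\gamma}|^2 = 2e^4\left[\frac{t}{u} + \frac{u}{t}\right] = 4e^4\, \frac{1+\cos^2\theta}{\sin^2\theta}. \tag{5.321}$$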

It is a useful check, and a vindication of the $(-1)^F$ factor in the Crossing Symmetry
Theorem, that this is positive! Now plugging this into the formula (4.192) for the
differential cross-section, we get:

$$\frac{d\sigma}{d(\cos\theta)} = \frac{2\pi\alpha^2}{s}\, \frac{1+\cos^2\theta}{\sin^2\theta}. \tag{5.322}$$

This cross-section is symmetric as θ → π − θ . That was inevitable, since the two

final state photons are identical; the final state in which a photon is observed coming
out at an angle (θ, φ) is actually the same as a state in which a photon is observed at an
angle (π − θ, −φ). When we find the total cross-section, we should therefore divide
by 2 (or just integrate over half of the range for cos θ ) to avoid overcounting. This
is the same overcounting issue for identical final state particles that arose for Møller
scattering at the end of Sect. 5.3.2.
The integrand in (5.322) diverges for sin θ = 0. This is because we have set m = 0
in the kinematics, which is not valid for scattering very close to the collision axis.
If you put back non-zero m, you will find that instead of diverging, the total cross-
section features a logarithmic enhancement ln(s/m 2 ) coming from small sin θ . In
a real e− e+ → γ γ experiment, however, photons very close to the electron and
positron beams will not be seen by any detector. The observable cross-section in one
of these colliding beam experiments is something more like

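\begin{align}
\sigma_{\rm cut} &= \frac{1}{2}\int_{-\cos\theta_{\rm cut}}^{\cos\theta_{\rm cut}} \frac{d\sigma}{d(\cos\theta)}\, d(\cos\theta) \tag{5.323}\\
&= \frac{2\pi\alpha^2}{s}\left[\ln\left(\frac{1+\cos\theta_{\rm cut}}{1-\cos\theta_{\rm cut}}\right) - \cos\theta_{\rm cut}\right]. \tag{5.324}
\end{align}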

(You can easily check that this is a positive and increasing function of cos θcut .) On
the other hand, if you are interested in the total cross-section for electron-positron
annihilation with no cuts applied on the angle, then you must take into account the
non-zero electron mass. Redoing everything with $m \ll \sqrt{s}$ but non-zero, you can
show:

$$\sigma = \frac{2\pi\alpha^2}{s}\left[\ln\left(\frac{s}{2m^2}\right) - 1\right]. \tag{5.325}$$

The logarithmic enhancement at large s in this formula comes entirely from the
sin θ ≈ 0 region. Note that this formula is just what you would have gotten by
plugging in

$$\cos\theta_{\rm cut} = 1 - 4m^2/s \tag{5.326}$$

into (5.324), for small m. In this sense, the finite mass of the electron “cuts off” the
would-be logarithmic divergence of the cross-section for small sin θ .
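The closed form (5.324) for the cut cross-section follows from integrating (5.322) over the allowed range and dividing by 2 for identical photons. A minimal numerical sketch of that step is given below; the function names, the integrator, and the default α are illustrative assumptions rather than anything prescribed by the text.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0

def sigma_cut_numeric(cos_cut, s=1.0, alpha=1.0 / 137.036):
    """Eq. (5.323): half the integral of (5.322) between -cos_cut and +cos_cut."""
    def dsigma(c):
        return 2.0 * math.pi * alpha ** 2 / s * (1.0 + c * c) / (1.0 - c * c)
    return 0.5 * simpson(dsigma, -cos_cut, cos_cut)

def sigma_cut_closed(cos_cut, s=1.0, alpha=1.0 / 137.036):
    """Eq. (5.324)."""
    return 2.0 * math.pi * alpha ** 2 / s * (
        math.log((1.0 + cos_cut) / (1.0 - cos_cut)) - cos_cut
    )

for cos_cut in (0.2, 0.5, 0.8):
    print(cos_cut, sigma_cut_numeric(cos_cut), sigma_cut_closed(cos_cut))
```

The printed values also show that the cut cross-section is positive and grows as the cut is loosened toward the beam axis.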

Problems

1. Instead of adding gauge fixing to the QED lagrangian, add a mass for the photon.
Compute the photon propagator.
2. Consider the process of antimuon scattering off of an electron:

μ+ e− → μ+ e− (5.327)

You may assume m e is negligible, but keep m μ . Work in the center-of-momentum

frame. Assign the incoming electron and antimuon 4-momenta pa and pb , and call
the magnitude of their 3-momenta P. Assign the final-state electron and antimuon
4-momenta k1 and k2 , and note that the magnitude of their 3-momenta is also P.
(In working this problem, do not use crossing symmetry. For the purposes of this
homework problem, that is considered cheating. If you don’t know what crossing
symmetry is yet, that’s fine.)
(a) Draw the Feynman diagram(s) which contribute to the reduced matrix element
for this process at order e2 .
(b) Define θ to be angle between the initial-state e− and the final state e− direc-
tions in the center-of-momentum frame. Work out all of the following quan-
tities in terms of m μ , P, and cos θ :

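\begin{align}
& p_a^2, \quad p_b^2, \quad k_1^2, \quad k_2^2, \tag{5.328}\\
& (p_a\cdot p_b), \quad (p_a\cdot k_1), \quad (p_a\cdot k_2), \tag{5.329}\\
& (p_b\cdot k_1), \quad (p_b\cdot k_2), \quad (k_1\cdot k_2), \tag{5.330}\\
& s = (p_a+p_b)^2, \quad t = (p_a-k_1)^2, \quad u = (p_a-k_2)^2. \tag{5.331}
\end{align}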

(c) Use the Feynman rules of QED to obtain the reduced matrix element M.
(d) Take the complex square of the reduced matrix element you found. Sum over
final state spins, and average over initial state spins, and simplify. Write the
result in terms of Mandelstam variables s, t, u, and then rewrite it in terms of
P and the scattering angle θ .
(e) Find the differential cross section. Simplify your answer as much as possible.
(f) Now take m μ → 0. What is the differential cross section? You should note
something interesting for a particular value of cos θ .

(g) Consider the 16 possible helicity processes:

\begin{align}
\mu^+_L e^-_L \to \mu^+_L e^-_L; &\qquad \mu^+_L e^-_L \to \mu^+_R e^-_L; \tag{5.332}\\
\mu^+_L e^-_L \to \mu^+_L e^-_R; &\qquad \mu^+_L e^-_L \to \mu^+_R e^-_R; \tag{5.333}\\
\mu^+_L e^-_R \to \mu^+_L e^-_L; &\qquad \mu^+_L e^-_R \to \mu^+_R e^-_L; \tag{5.334}\\
\mu^+_L e^-_R \to \mu^+_L e^-_R; &\qquad \mu^+_L e^-_R \to \mu^+_R e^-_R; \tag{5.335}\\
\mu^+_R e^-_L \to \mu^+_L e^-_L; &\qquad \mu^+_R e^-_L \to \mu^+_R e^-_L; \tag{5.336}\\
\mu^+_R e^-_L \to \mu^+_L e^-_R; &\qquad \mu^+_R e^-_L \to \mu^+_R e^-_R; \tag{5.337}\\
\mu^+_R e^-_R \to \mu^+_L e^-_L; &\qquad \mu^+_R e^-_R \to \mu^+_R e^-_L; \tag{5.338}\\
\mu^+_R e^-_R \to \mu^+_L e^-_R; &\qquad \mu^+_R e^-_R \to \mu^+_R e^-_R. \tag{5.339}
\end{align}

Do not compute them. Instead, figure out which ones vanish by helicity
conservation in the limit $m_e, m_\mu \ll \sqrt{s}$.

3. Consider the process of photon-photon scattering to produce an electron-positron

pair:

γ γ → e− e+ (5.340)

Work in the center-of-momentum frame, assign the initial-state photons 4-momenta
$p_a$ and $p_b$, and call their energies $E = \sqrt{s/4}$. Assign the final-state
electron and positron 4-momenta $k_1$ and $k_2$, and let θ be the angle between one
of the initial-state photons and the e−. Call the mass of the electron m.
(a) Draw the Feynman diagram(s) which contribute to the reduced matrix element
for this process at order e2 . Write down a complete expression for the reduced
matrix element for the process, but you don’t have to simplify it.
(b) Compute the spin summed and averaged reduced matrix element squared,
$\frac{1}{4}\sum_{\text{spins}} |\mathcal{M}|^2$.
4. (a) From results in problem 3, compute the differential cross-section $\frac{d\sigma}{d\cos\theta}$ for
$\gamma\gamma \to e^- e^+$.
(b) Find the total cross-section for the limit $\frac{4m^2}{s} \to 0$, and for the limit of small but
non-vanishing $\frac{4m^2}{s} - 1$. Hint: you will likely find the following definite integrals
to be useful.

\begin{align}
\int_{-1}^{1} \frac{1}{1 - a^2x^2}\, dx &= \frac{1}{a}\ln\left(\frac{1+a}{1-a}\right), \tag{5.341}\\
\int_{-1}^{1} \frac{1}{(1 - a^2x^2)^2}\, dx &= \frac{1}{1-a^2} + \frac{1}{2a}\ln\left(\frac{1+a}{1-a}\right). \tag{5.342}
\end{align}

5. (a) Compute all the reduced matrix elements squared for the different helicity
projections of e− μ− → e− μ−. These include $e^-_L \mu^-_L \to e^-_R \mu^-_L$, $e^-_L \mu^-_L \to e^-_L \mu^-_R$, etc.
There are eight of them. Hint: some can be seen to be zero without calculation,
and others can be understood to be the same as another one already computed.
Assume the fermion masses are negligible.

6. Draw the following Feynman diagrams in QED. In each case, write down con-
sistent expressions for the reduced matrix elements in terms of clearly defined
momenta and spins, but you do not need to simplify or evaluate them.
(a) All tree-level diagrams contributing to e− e+ → μ+ μ− γ
(b) All one-loop diagrams contributing to e− e+ → μ+ μ− . [Here, you do not
need to write down the reduced matrix elements for the subset of diagrams
that just involve corrections to external legs. They have to be handled by a
different method.]
(c) A representative diagram contributing to γ γ → γ γ
(d) All diagrams contributing to e− μ+ → μ− e+
6 Decay Processes

6.1 Decay Rates and Partial Widths

So far, we have studied only 2 → 2 scattering processes. This is because in QED as

a stand-alone theory, the fundamental particles are stable. Photons cannot decay into
anything else (in any theory), because they are massless. The electron is the lightest
particle that carries charge, so it cannot decay because of charge conservation. The
photon interaction vertex with a fermion line does not change the type of fermion.
So if there is only a muon in the initial state, you can easily convince yourself that
there must be at least one muon in the final state. Going to the rest frame of the initial
muon, the total energy of the process is simply m μ , but the energy of the final state
would have to be larger than this, so if only QED interactions are allowed, the muon
must also be stable. The weak interactions get around this by allowing interaction
vertices that change the fermion type. Moreover, bound states can decay even in
QED.
For any type of unstable particle, the probability that a decay will occur in a very
short interval of time $\Delta t$ should be proportional to $\Delta t$. So we can define the decay
rate $\Gamma$ (also known as the decay width; it is equal to the resonance width in (1.1)) as
the constant of proportionality:

$$(\text{Probability of decay in time } \Delta t) = \Gamma\, \Delta t. \tag{6.1}$$

Suppose we observe the decays of a large sample of particles of this type, all at rest.
If the number of particles at time t is denoted N(t), then the number of particles
remaining a short time later is therefore:

$$N(t + \Delta t) = (1 - \Gamma\, \Delta t)\, N(t). \tag{6.2}$$

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 153
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_6

It follows that

$$\frac{dN}{dt} = \lim_{\Delta t\to 0} \frac{N(t+\Delta t) - N(t)}{\Delta t} = -\Gamma N, \tag{6.3}$$

so that

$$N(t) = e^{-\Gamma t} N(0). \tag{6.4}$$

When one has computed or measured $\Gamma$ for some particle, it is traditional and sensible
to quote the result as measured in the rest frame of the particle. If the particle is moving
with velocity β, then because of relativistic time dilation, the survival probability for
a particular particle as a function of the laboratory time t is:

$$(\text{Probability of particle survival}) = e^{-\Gamma t \sqrt{1-\beta^2}}. \tag{6.5}$$

The quantity

$$\tau = 1/\Gamma \tag{6.6}$$

is also known as the mean lifetime of the particle at rest. (Putting in the units recovers
(1.2); see (A.1.6).) There is often more than one final state available for a decaying
particle. One can then compute or measure the decay rates into particular final states.
The rate for a particular final state or class of final states is called a partial width.
The sum of all exclusive partial widths should add up to the total decay rate, of
course.
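The decay law can be illustrated in a few lines. The sketch below evaluates the survival probability (6.5) and also iterates the discrete rule (6.2) with a small time step to show that it converges to the exponential (6.4). Function names and the sample numbers are illustrative only.

```python
import math

def survival_probability(t_lab, width, beta=0.0):
    """Eq. (6.5): survival probability after lab time t_lab for speed beta (c = 1)."""
    return math.exp(-width * t_lab * math.sqrt(1.0 - beta ** 2))

def stepwise_survival(t, width, steps):
    """Iterate N(t + dt) = (1 - width * dt) N(t), eq. (6.2), starting from N(0) = 1."""
    dt = t / steps
    fraction = 1.0
    for _ in range(steps):
        fraction *= 1.0 - width * dt
    return fraction

width = 2.0                      # decay rate in arbitrary inverse-time units
tau = 1.0 / width                # mean lifetime at rest, eq. (6.6)

print(survival_probability(tau, width))              # 1/e at rest after one lifetime
print(survival_probability(tau, width, beta=0.99))   # larger, due to time dilation
print(stepwise_survival(1.0, width, 100000), math.exp(-width * 1.0))
```

The stepwise product approaches the exponential as the number of steps grows, which is the content of the limit in (6.3).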
Consider a process in which a particle at rest with 4-momentum

p μ = (M, 0, 0, 0) (6.7)

decays to several particles with 4-momenta ki and masses m i . Given the reduced
matrix element M for this process, one can show by arguments similar to those in
(4.5) for cross-sections that the differential decay rate is:

$$d\Gamma = \frac{1}{2M} |\mathcal{M}|^2\, d\Phi_n, \tag{6.8}$$

where

$$d\Phi_n = (2\pi)^4 \delta^{(4)}\Big(p - \sum_i k_i\Big) \prod_{i=1}^{n} \frac{d^3\vec k_i}{(2\pi)^3 2E_i} \tag{6.9}$$

is the n-body Lorentz-invariant phase space. (Compare this to (4.176); you will
see that the only difference is that the $p_a + p_b$ in the 4-momentum delta function
for a scattering process has been replaced by p for a decay process.) To find the
contribution to the decay rate for final-state particles with 3-momenta restricted to
be in some ranges, we should integrate $d\Gamma$ over those ranges. To find the total decay
rate $\Gamma$, we should integrate the 3-momenta over all available $\vec k_i$. The energies in this
formula are defined by

$$E_i = \sqrt{\vec k_i^{\,2} + m_i^2}. \tag{6.10}$$

6.2 Two-Body Decays

Most of the decay processes that one encounters in high-energy physics are two-
particle or three-particle final states. As a general rule, if the number of particles in the
final state is larger, then the decay partial width for that final state tends to be smaller,
so a particle will typically decay into few-particle states if it can. If a two-particle
final state is available, it is usually a very good bet that three-particle final states will
lose. However, there are some exceptions to this, including in the important case of
the Higgs boson.
Let us simplify the formula (6.8) for the case of two-particle final states with
arbitrary masses. The evaluation of the two-particle final-state phase space is exactly
the same as in (4.179)–(4.188), with the simple replacement E CM → M. Therefore,

$$d\Phi_2 = \frac{K}{16\pi^2 M}\, d\phi_1\, d(\cos\theta_1), \tag{6.11}$$

and

$$d\Gamma = \frac{K}{32\pi^2 M^2}\, |\mathcal{M}|^2\, d\phi_1\, d(\cos\theta_1). \tag{6.12}$$

It remains to solve for K. Energy conservation requires that

$$E_1 + E_2 = \sqrt{K^2 + m_1^2} + \sqrt{K^2 + m_2^2} = M. \tag{6.13}$$

One can solve this by writing it as

$$E_1 - M = -\sqrt{E_1^2 - m_1^2 + m_2^2} \tag{6.14}$$

and squaring both sides. The solution is:

\begin{align}
E_1 &= \frac{M^2 + m_1^2 - m_2^2}{2M}, \tag{6.15}\\
E_2 &= \frac{M^2 + m_2^2 - m_1^2}{2M}, \tag{6.16}\\
K &= \frac{\sqrt{\lambda(M^2, m_1^2, m_2^2)}}{2M}, \tag{6.17}
\end{align}

where

$$\lambda(x, y, z) \equiv x^2 + y^2 + z^2 - 2xy - 2xz - 2yz \tag{6.18}$$

is known as the triangle function¹ (or Källén function). It is useful to tabulate results
for some common special cases:

• If the final-state masses are equal, $m_1 = m_2 = m$, then the final state particles
share the energy equally in the rest frame:

$$E_1 = E_2 = M/2, \tag{6.19}$$

$$K = \frac{M}{2}\sqrt{1 - 4m^2/M^2}, \tag{6.20}$$

and

$$d\Gamma = \frac{|\mathcal{M}|^2}{64\pi^2 M}\sqrt{1 - 4m^2/M^2}\; d\phi_1\, d(\cos\theta_1). \tag{6.21}$$

• If one of the final-state particles is massless, $m_2 = 0$, then:

$$E_1 = \frac{M^2 + m_1^2}{2M}, \tag{6.22}$$

$$E_2 = K = \frac{M^2 - m_1^2}{2M}. \tag{6.23}$$

This illustrates the general feature that since the final state particles have equal
3-momentum magnitudes, the heavier particle gets more energy. In this case,

$$d\Gamma = \frac{|\mathcal{M}|^2}{64\pi^2 M}\left(1 - m_1^2/M^2\right) d\phi_1\, d(\cos\theta_1). \tag{6.24}$$
• If the decaying particle has spin 0, or if its spin is not measured, then there can
be no special direction in the decay, so the final state particles must be distributed
isotropically in the center-of-momentum frame. One then obtains the total decay
rate from

$$ d\phi_1\, d(\cos\theta_1) \to 4\pi, \tag{6.25} $$

provided that the two final state particles are distinguishable. There is an extra
factor of 1/2 if they are identical, to avoid counting each final state twice (see the
discussions at the end of Sects. 5.3.2 and 5.5.2).
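
The two-body kinematics of (6.15)–(6.18) are easy to evaluate numerically. The following is a minimal sketch; the function names and the sample masses are illustrative choices, not part of the text.

```python
import math

def kallen(x, y, z):
    """Triangle (Kallen) function lambda(x, y, z) as defined in (6.18)."""
    return x*x + y*y + z*z - 2*x*y - 2*x*z - 2*y*z

def two_body_kinematics(M, m1, m2):
    """Return (E1, E2, K) in the parent rest frame, following (6.15)-(6.17)."""
    if M < m1 + m2:
        raise ValueError("decay is kinematically forbidden: M < m1 + m2")
    E1 = (M**2 + m1**2 - m2**2) / (2*M)
    E2 = (M**2 + m2**2 - m1**2) / (2*M)
    # max(..., 0) guards against tiny negative round-off exactly at threshold
    K = math.sqrt(max(kallen(M**2, m1**2, m2**2), 0.0)) / (2*M)
    return E1, E2, K

# Hypothetical masses in GeV, chosen only to exercise the formulas
E1, E2, K = two_body_kinematics(10.0, 3.0, 1.0)
print(E1, E2, K)
```

In this example the heavier daughter carries more energy, and $E_1^2 - K^2$ reproduces $m_1^2$, as the on-shell condition requires.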

1 It is so-named because, if each of $\sqrt{x}$, $\sqrt{y}$, $\sqrt{z}$ is less than the sum of the other two, then $\lambda(x, y, z)$
is $-16$ times the square of the area of a triangle with sides $\sqrt{x}$, $\sqrt{y}$, $\sqrt{z}$. However, in the present
context $M$, $m_1$, $m_2$ never form a triangle; if $M < m_1 + m_2$, then the decay is forbidden.

6.3 Scalar Decays to Fermion-Antifermion Pairs: Higgs Decay

Let us now consider a simple and very important decay process, namely a scalar
particle φ decaying to a fermion-antifermion pair. As a model, let us consider the
Lagrangian already mentioned in Sect. 4.7:

$$ \mathcal{L}_{\rm int} = -y\,\phi\,\overline{\Psi}\Psi. \tag{6.26} $$

This type of interaction is called a Yukawa interaction, and y is a Yukawa coupling.

The corresponding Feynman rule was argued to be equal to −i y times an identity
matrix in Dirac spinor space, as shown in the picture immediately after (4.252).
Consider an initial state containing a single $\phi$ particle of mass $M$ and 4-momentum
$p$, and a final state containing a fermion and antifermion each with mass $m$ and with
4-momenta and spins $k_1, s_1$ and $k_2, s_2$ respectively. Then the matrix element will be
of the form

$$ {}_{\rm OUT}\langle k_1, s_1; k_2, s_2 | p \rangle_{\rm OUT} = -i\mathcal{M}\,(2\pi)^4 \delta^{(4)}(k_1 + k_2 - p). \tag{6.27} $$

The Feynman rules for fermion external states don’t depend on the choice of interaction
vertex, so they are the same as for QED. Therefore we can draw the Feynman
diagram:

[Figure: Feynman diagram for $\phi$ decaying to a fermion-antifermion pair.]

and immediately write down the reduced matrix element for the decay:

$$ \mathcal{M} = -iy\, \bar u(k_1, s_1)\, v(k_2, s_2). \tag{6.28} $$

To turn M into a physically observable decay rate, we need to compute the squared
reduced matrix element summed over final state spins. From (6.28),

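$$ \mathcal{M}^* = iy\, \bar v_2 u_1, \tag{6.29} $$

$$ |\mathcal{M}|^2 = y^2 (\bar u_1 v_2)(\bar v_2 u_1). \tag{6.30} $$

Summing over $s_2$, we have:

$$ \sum_{s_2} |\mathcal{M}|^2 = y^2\, \bar u_1 (\slashed{k}_2 - m) u_1 \tag{6.31} $$

$$ = y^2\, {\rm Tr}[(\slashed{k}_2 - m) u_1 \bar u_1]. \tag{6.32} $$
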

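Now summing over $s_1$ gives:

$$ \sum_{s_1, s_2} |\mathcal{M}|^2 = y^2\, {\rm Tr}[(\slashed{k}_2 - m)(\slashed{k}_1 + m)] \tag{6.33} $$

$$ = y^2 \left( {\rm Tr}[\slashed{k}_2 \slashed{k}_1] - m^2\, {\rm Tr}[1] \right) \tag{6.34} $$

$$ = 4y^2 (k_1 \cdot k_2 - m^2), \tag{6.35} $$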

where we have used the fact that the trace of an odd number of gamma matrices
vanishes, and (A.2.8) and (A.2.9). The fermion and antifermion have the same mass
m, so

$$ M^2 = (k_1 + k_2)^2 = k_1^2 + k_2^2 + 2 k_1 \cdot k_2 = m^2 + m^2 + 2 k_1 \cdot k_2 \tag{6.36} $$

implies that

$$ k_1 \cdot k_2 - m^2 = \frac{M^2}{2} - 2m^2. \tag{6.37} $$

Therefore,

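$$ \sum_{\rm spins} |\mathcal{M}|^2 = 2y^2 M^2 \left(1 - \frac{4m^2}{M^2}\right), \tag{6.38} $$

and, using (6.21),

$$ d\Gamma = \frac{y^2 M}{32\pi^2} \left(1 - \frac{4m^2}{M^2}\right)^{3/2} d\phi_1\, d(\cos\theta_1). \tag{6.39} $$

Doing the (trivial) angular integrals finally gives the total decay rate:

$$ \Gamma = \frac{y^2 M}{8\pi} \left(1 - \frac{4m^2}{M^2}\right)^{3/2}. \tag{6.40} $$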
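
As a quick numerical illustration of (6.40), the following sketch evaluates the leading-order width for a scalar decaying to a fermion-antifermion pair. The function name and the sample inputs are hypothetical, and the function returns zero below threshold as a convenience rather than as something derived in the text.

```python
import math

def scalar_to_ffbar_width(y, M, m):
    """Leading-order width for phi -> f fbar from (6.40), in the units of M."""
    if M <= 2*m:
        return 0.0  # below threshold the two-body decay is forbidden
    kinematic = (1.0 - 4.0*m**2 / M**2) ** 1.5
    return y**2 * M / (8.0*math.pi) * kinematic

# Hypothetical inputs: coupling 0.1, scalar mass 100 GeV, fermion mass 10 GeV
print(scalar_to_ffbar_width(0.1, 100.0, 10.0))
```

The $(1 - 4m^2/M^2)^{3/2}$ factor suppresses the rate smoothly as the fermion mass approaches $M/2$.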

In the Standard Model, the Higgs boson h plays the role of φ, and couples to each
fermion f with a Lagrangian that is exactly of the form given above:

$$ \mathcal{L}_{\rm int} = -\sum_f \frac{y_f}{\sqrt{2}}\, h\, \bar f f. \tag{6.41} $$

The Yukawa coupling for each fermion is approximately proportional to its mass:

$$ y_f \approx \frac{m_f}{175\ {\rm GeV}}. \tag{6.42} $$


(The reason for this will be explained below in Sect. 10.3.) However, the m f appear-
ing in this formula is not quite equal to the mass, because of higher-order corrections.
For quarks, these corrections are quite large, and m f tends to come out considerably
smaller than the masses of the quarks quoted in Table 1.3.
At the LHC, one of the major goals is to study the Higgs boson through its decay
modes. The Higgs boson mass has been measured to be about 125 GeV. Since the top
quark has a mass of about 173 GeV, the decay $h \to t\bar t$ is kinematically forbidden. The
next-lightest fermions in the Standard Model are the bottom quark, charm quark, and
tau lepton, so we expect decays $h \to b\bar b$ and $h \to \tau^-\tau^+$ and $h \to c\bar c$. For quarks,
the sum in (6.41) includes a summation over 3 colors, leading to an extra factor of
$n_f = 3$ in the decay rate. Since the kinematic factor $(1 - 4m_f^2/M_h^2)^{3/2}$ is close to 1
for all allowed fermion-antifermion final states, the leading-order prediction for the
decay rate to a particular fermion is approximately:

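$$ \Gamma(h \to f\bar f) = \frac{n_f\, y_f^2\, M_h}{16\pi} \propto n_f\, m_f^2. \tag{6.43} $$

Estimates of the $m_f$ from present experimental data are:

$$ m_b \approx 2.7\ {\rm GeV} \;\to\; y_b \approx 0.0154, \tag{6.44} $$

$$ m_\tau \approx 1.77\ {\rm GeV} \;\to\; y_\tau \approx 0.0101, \tag{6.45} $$

$$ m_c \approx 0.58\ {\rm GeV} \;\to\; y_c \approx 0.0033, \tag{6.46} $$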

for a Higgs with mass $M_h = 125$ GeV. (Notice that even though the charm quark is
heavier than the tau lepton, it turns out that $m_\tau > m_c$ because of the large higher-order
corrections for the charm quark.) Therefore, the prediction is that $b\bar b$ final states
win, with, very roughly:

$$ \Gamma(h \to b\bar b) : \Gamma(h \to \tau^-\tau^+) : \Gamma(h \to c\bar c) \;::\; 3(m_b)^2 : (m_\tau)^2 : 3(m_c)^2 \tag{6.47} $$

$$ ::\; 1 : 0.143 : 0.046. \tag{6.48} $$
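
The ratios in (6.48) follow directly from (6.42)–(6.46). The sketch below reproduces them; the dictionary layout and function names are illustrative choices, and only the leading-order formula (6.43) is used.

```python
import math

SCALE_GEV = 175.0   # denominator appearing in (6.42)
M_H_GEV = 125.0     # Higgs mass used in the text

# (effective mass in GeV from (6.44)-(6.46), color factor n_f)
FERMIONS = {"b": (2.7, 3), "tau": (1.77, 1), "c": (0.58, 3)}

def yukawa(m_f):
    """Approximate Yukawa coupling from (6.42)."""
    return m_f / SCALE_GEV

def width_h_to_ff(m_f, n_f, M_h=M_H_GEV):
    """Leading-order partial width in GeV from (6.43)."""
    return n_f * yukawa(m_f)**2 * M_h / (16.0 * math.pi)

widths = {name: width_h_to_ff(m, n) for name, (m, n) in FERMIONS.items()}
ratios = {name: w / widths["b"] for name, w in widths.items()}
for name in FERMIONS:
    print(name, round(yukawa(FERMIONS[name][0]), 4), round(ratios[name], 3))
```

These are only the rough leading-order ratios; the higher-order corrections discussed in the text are not included.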

A more accurate accounting of the Higgs boson width and branching ratios must
take into account many important corrections beyond our scope here. First,
higher-order Feynman diagrams are important, and increase the partial widths into
quarks substantially. Second, there are other final states that can appear in $h$ decays,
notably gluon-gluon ($gg$) and $\gamma\gamma$, which both occur due to Feynman diagrams with
loops, and $W^+W^-$ and $Z^0Z^0$. Naively, the last two are not kinematically allowed,
since $2m_W$ and $2m_Z$ are both greater than $m_h$. However, they can still contribute
if one or both of the weak vector bosons is off-shell (virtual). These decays are
often written as $h \to WW^{(*)}$ and $h \to ZZ^{(*)}$, with the $(*)$ indicating an off-shell
particle. Normally, such decays would be negligible compared to 2-body decays to
on-shell particles, but they are competitive because the bottom quark squared Yukawa
coupling $y_b^2 \approx 0.00024$ is so small.

Taking into account these effects,2 it turns out that the total width of the Higgs
boson is approximately 4.2 MeV, assuming m h = 125 GeV. This is an extremely
small decay width for such a heavy particle. One can define the branching ratio to
be the partial decay rate into a particular final state, divided by the total decay rate,
so for example

$$ {\rm BR}(h \to b\bar b) = \Gamma(h \to b\bar b)/\Gamma_{\rm total}. \tag{6.49} $$

In the Standard Model with $m_h = 125$ GeV, the predicted branching fractions into
$b$, $\tau$, and $c$ pairs, taking into account all known effects, are:

$$ {\rm BR}(h \to b\bar b) \approx 0.577, \tag{6.50} $$

$$ {\rm BR}(h \to \tau^-\tau^+) \approx 0.063, \tag{6.51} $$

$$ {\rm BR}(h \to c\bar c) \approx 0.029. \tag{6.52} $$

Some other branching ratios that turn out to be extremely important for the Higgs
boson at the Large Hadron Collider, but rely on more involved calculations, are:

$$ {\rm BR}(h \to WW^{(*)}) \approx 0.215, \tag{6.53} $$

$$ {\rm BR}(h \to ZZ^{(*)}) \approx 0.0264, \tag{6.54} $$

$$ {\rm BR}(h \to gg) \approx 0.0857, \tag{6.55} $$

$$ {\rm BR}(h \to \gamma\gamma) \approx 0.00228. \tag{6.56} $$
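
Using the definition (6.49), the quoted branching ratios can be turned back into approximate partial widths. The sketch below does only this arithmetic, taking the total width of about 4.2 MeV and the branching ratios (6.50)–(6.56) as inputs; the mode labels are illustrative strings.

```python
TOTAL_WIDTH_MEV = 4.2  # approximate total Higgs width quoted in the text

BRANCHING = {
    "b bbar": 0.577,
    "tau- tau+": 0.063,
    "c cbar": 0.029,
    "W W(*)": 0.215,
    "Z Z(*)": 0.0264,
    "g g": 0.0857,
    "gamma gamma": 0.00228,
}

# Partial width = branching ratio times total width, inverting (6.49)
partial_widths_mev = {mode: br * TOTAL_WIDTH_MEV for mode, br in BRANCHING.items()}
listed_sum = sum(BRANCHING.values())

for mode, width in partial_widths_mev.items():
    print(f"{mode:12s} {width:.4f} MeV")
print("sum of listed branching ratios:", round(listed_sum, 5))
```

The listed modes sum to slightly less than 1, which leaves room for the rarer final states not quoted here.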

We will return to the subject of the Higgs boson branching ratios in Sect. 10.5.
Finally, consider the helicities for the process $h \to f\bar f$. If we demanded that the
final states have particular helicities, then we would have obtained for the matrix
element, using $P_R$, $P_L$ projection matrices:

$$ \text{R-fermion, R-antifermion:}\quad \mathcal{M} = -iy\, \bar u_1 P_L P_L v_2 = -iy\, \bar u_1 P_L v_2 \neq 0, \tag{6.57} $$

$$ \text{L-fermion, L-antifermion:}\quad \mathcal{M} = -iy\, \bar u_1 P_R P_R v_2 = -iy\, \bar u_1 P_R v_2 \neq 0, \tag{6.58} $$

$$ \text{R-fermion, L-antifermion:}\quad \mathcal{M} = -iy\, \bar u_1 P_L P_R v_2 = 0, \tag{6.59} $$

$$ \text{L-fermion, R-antifermion:}\quad \mathcal{M} = -iy\, \bar u_1 P_R P_L v_2 = 0. \tag{6.60} $$

(Recall, from (5.102)–(5.105), that to get an L fermion or antifermion in the final
state, one puts in a $P_R$ next to the spinor, while to get an R fermion or antifermion in
the final state, one puts in a $P_L$.) So, the rule here is that helicity is always violated
by the scalar-fermion-antifermion vertex, since the scalar must decay to a state with
equal helicities.
One may understand this result from angular momentum conservation. The initial
state had no spin and no orbital angular momentum. In the final state, the spins
of the outgoing particles must therefore have opposite directions. Since they have
momentum in opposite directions, this means they must also have the same helicity.
Drawing a short arrow to represent the spin, the allowed cases of RR helicities and
LL helicities look like:

[Figure: back-to-back fermion and antifermion with short arrows indicating the spin directions, for the RR and LL cases.]

The helicities of tau leptons can be (statistically) measured from the angular distributions
of their decay products, so this effect may eventually be measured with a
sample of $h \to \tau^-\tau^+$ decay events.

2 For the results quoted in this paragraph, see S. Heinemeyer et al.,
“Handbook of LHC Higgs Cross Sections: 3. Higgs Properties”, https://arxiv.org/abs/1307.1347.

6.4 Three-Body Decays

Let us consider a generic three-body decay process in which a particle of mass $M$
decays to three lighter particles with masses $m_1$, $m_2$, and $m_3$. We will work in the
rest frame of the decaying particle, so its four-vector is $p^\mu = (M, 0, 0, 0)$, and the
four-vectors of the remaining particles are $k_1$, $k_2$, and $k_3$ respectively:

[Figure: a particle with four-momentum $p$ decaying to three particles with four-momenta $k_1$, $k_2$, $k_3$.]

In general, the formula for a three-body differential decay rate is

$$ d\Gamma = \frac{1}{2M}\, |\mathcal{M}|^2\, d\Phi_3, \tag{6.61} $$

where

$$ d\Phi_3 = (2\pi)^4 \delta^{(4)}(p - k_1 - k_2 - k_3)\, \frac{d^3\vec{k}_1}{(2\pi)^3 2E_1}\, \frac{d^3\vec{k}_2}{(2\pi)^3 2E_2}\, \frac{d^3\vec{k}_3}{(2\pi)^3 2E_3} \tag{6.62} $$

is the Lorentz-invariant phase space. Since there are 9 integrals to do, and 4 delta
functions, the result for $d\Gamma$ is a differential with respect to 5 remaining variables.
The best choice of 5 variables depends on the problem at hand, so there are several
ways to present the result. Two of the 5 variables can be chosen to be the energies
$E_1$ and $E_2$ of two of the final-state particles; then the energy of the third particle
$E_3 = M - E_1 - E_2$ is also known from energy conservation. In the rest frame of
the decaying particle, the three final-state particle 3-momenta must lie in a plane,
because of momentum conservation. Specifying E 1 and E 2 also uniquely fixes the
angles between the three particle momenta within this decay plane. The remaining 3
variables just correspond to the orientation of the decay plane with respect to some
fixed coordinate axis. If we think of the three 3-momenta within the decay plane as
describing a rigid body, then the relative orientation can be described using three
Euler angles. These can be chosen to be the spherical coordinate angles φ1 and θ1 for
particle 1, and an angle α2 that measures the rotation of the 3-momentum direction
of particle 2 as measured about the axis of the momentum vector of particle 1. Then
one can show:

$$ d\Phi_3 = \frac{1}{256\pi^5}\, dE_1\, dE_2\, d\phi_1\, d(\cos\theta_1)\, d\alpha_2. \tag{6.63} $$
The choice of which particles to label as 1 and 2 is arbitrary, and should be made to
maximize convenience.
If the initial state particle spin is averaged over, or if it is spinless, then there is
no special direction to measure the orientation of the final state decay plane with
respect to. In that case, for particular E 1 and E 2 , the reduced matrix element cannot
depend on the angles φ1 , θ1 , or α2 , and one can do the integrals

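$$ \int_0^{2\pi} d\phi_1 \int_{-1}^{1} d(\cos\theta_1) \int_0^{2\pi} d\alpha_2 = (2\pi)(2)(2\pi) = 8\pi^2. \tag{6.64} $$

Then,

$$ d\Phi_3 = \frac{1}{32\pi^3}\, dE_1\, dE_2, \tag{6.65} $$

and so, for spinless or spin-averaged initial states,

$$ d\Gamma = \frac{1}{64\pi^3 M}\, |\mathcal{M}|^2\, dE_1\, dE_2. \tag{6.66} $$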
To do the remaining energy integrals, one must find the limits of integration. If one
decides to do the $E_2$ integral first, then by doing the kinematics one can show for
any particular $E_1$ that

$$ E_2^{\rm max,min} = \frac{1}{2m_{23}^2} \left[ (M - E_1)(m_{23}^2 + m_2^2 - m_3^2) \pm \sqrt{(E_1^2 - m_1^2)\, \lambda(m_{23}^2, m_2^2, m_3^2)} \right], \tag{6.67} $$

where the triangle function $\lambda(x, y, z)$ was defined by (6.18), and

$$ m_{23}^2 = (k_2 + k_3)^2 = (p - k_1)^2 = M^2 - 2E_1 M + m_1^2 \tag{6.68} $$


is the invariant (mass)$^2$ of the combination of particles 2 and 3. Then the limits of
integration for the final $E_1$ integral are:

$$ m_1 < E_1 < \frac{M^2 + m_1^2 - (m_2 + m_3)^2}{2M}. \tag{6.69} $$

A good strategy is usually to choose the label “1” for the particle whose energy
we care the most about. Then after doing the $dE_2$ integral, we will be left with an
expression for $d\Gamma/dE_1$.
In the special case that all final state particles are massless (or small enough to
neglect), $m_1 = m_2 = m_3 = 0$, these limits of integration simplify to:

$$ \frac{M}{2} - E_1 < E_2 < \frac{M}{2}, \tag{6.70} $$

$$ 0 < E_1 < \frac{M}{2}. \tag{6.71} $$
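
The integration limits (6.67)–(6.69) can be coded directly, and the massless limits (6.70)–(6.71) then serve as a check. This is a minimal sketch with illustrative function names; it assumes $E_1$ lies strictly inside its allowed range so that $m_{23}^2 > 0$.

```python
import math

def kallen(x, y, z):
    """Triangle function lambda(x, y, z) of (6.18)."""
    return x*x + y*y + z*z - 2*x*y - 2*x*z - 2*y*z

def m23_squared(M, m1, E1):
    """Invariant (mass)^2 of particles 2 and 3, from (6.68)."""
    return M**2 - 2*E1*M + m1**2

def e1_limits(M, m1, m2, m3):
    """Lower and upper limits of E1, from (6.69)."""
    return m1, (M**2 + m1**2 - (m2 + m3)**2) / (2*M)

def e2_limits(M, m1, m2, m3, E1):
    """(E2_min, E2_max) for fixed E1, from (6.67); requires m23^2 > 0."""
    s23 = m23_squared(M, m1, E1)
    first = (M - E1) * (s23 + m2**2 - m3**2)
    root = math.sqrt(max((E1**2 - m1**2) * kallen(s23, m2**2, m3**2), 0.0))
    return (first - root) / (2*s23), (first + root) / (2*s23)

# Massless check with M = 1 (arbitrary units) and E1 = 0.3
print(e2_limits(1.0, 0.0, 0.0, 0.0, 0.3))
print(e1_limits(1.0, 0.0, 0.0, 0.0))
```

For massless final states the two printed limits on $E_2$ should agree with $M/2 - E_1$ and $M/2$.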

Problems

1. In certain extensions of the Standard Model, including the Minimal Supersymmetric
Standard Model (MSSM), it is predicted that there is an electrically neutral
spin-0 particle $A^0$ called the “pseudo-scalar Higgs boson”. (This is in addition
to the usual “Standard-Model-like” Higgs boson called $h^0$, and another neutral
scalar Higgs boson $H^0$, and a charged Higgs boson pair $H^\pm$.) The interaction
Lagrangian of this particle with each Standard Model Dirac fermion field $f$ has
the form:

$$ \mathcal{L}_{\rm int} = y_f\, A^0\, \bar f \gamma_5 f \tag{6.72} $$

where $y_f$ is a coupling constant. Compute the partial decay rate for $A^0$ into a
fermion anti-fermion pair, as a function of $y_f$, the mass of the pseudo-scalar
$M_{A^0}$, and the mass of the fermion $m_f$. You should find a result of the form:

$$ \Gamma = N\, n_f\, y_f^2\, M_{A^0} \left(1 - \frac{4m_f^2}{M_{A^0}^2}\right)^p, \tag{6.73} $$

where n f = 3 for quarks and 1 for leptons, and N and p are numbers that you
will compute.
2. This problem is an extension of the previous problem. In the MSSM, it has been
calculated that the ratio of the couplings of $A^0$ to bottom quarks and to top quarks
is:

$$ \frac{y_b}{y_t} = \frac{m_b}{m_t} \tan^2\beta \tag{6.74} $$

where $\tan\beta$ is a parameter of the model which is usually believed to be in the
range:

$$ 2 \lesssim \tan\beta \lesssim 55. \tag{6.75} $$

Here, $m_t$ and $m_b$ differ somewhat from the actual masses, because of higher-order
corrections.
(a) Taking the higher-order-corrected values $m_t = 165$ GeV and $m_b = 3$ GeV in (6.74),
the actual masses 173 GeV for the top quark and 5 GeV for the bottom quark,
and $m_{A^0} = 400$ GeV, make a plot of

$$ \frac{{\rm BR}(A^0 \to b\bar b)}{{\rm BR}(A^0 \to t\bar t)} \tag{6.76} $$

as a function of $\tan\beta$, using at least representative points $\tan\beta = 2, 5, 10, 20, 30,
40, 50$. Use a log scale for the vertical axis. For what values of $\tan\beta$ is the
branching fraction of $A^0$ into $b\bar b$ greater than for $t\bar t$?
(b) Repeat part (a) for m A0 = 1000 GeV.
7 Fermi Theory of Weak Interactions

7.1 Weak Nuclear Decays

In nuclear physics, the weak interactions are responsible for decays of long-lived
isotopes. A nucleus with $Z$ protons and $A - Z$ neutrons, so $A$ nucleons in all, is
denoted by ${}^A Z$. If kinematically allowed, one can observe decays:

$$ {}^A Z \to {}^A (Z+1) + e^- \bar\nu_e. \tag{7.1} $$

The existence of the antineutrino $\bar\nu_e$ was inferred by Pauli as a “desperate remedy” to
save the principle of energy conservation. The simplest example of this is the decay
of the neutron into a proton, electron, and antineutrino:

$$ n \to p^+ e^- \bar\nu_e \tag{7.2} $$

with a mean lifetime of¹

$$ \tau = 881.5 \pm 1.5\ {\rm s}. \tag{7.3} $$

1 The lifetime of the neutron is an infamous example of an experimental measurement that has
shifted dramatically over time. As recently as the late 1960s, it was thought that $\tau_n = 1010 \pm 30$
s, and as recently as 2010, the official value was $885.7 \pm 0.8$ seconds. Even today the systematic
uncertainties are a source of concern.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_7

Decays of heavier nuclei can be thought of as involving the subprocess:

$$ \text{“}n\text{”} \to \text{“}p^+\text{”}\, e^- \bar\nu_e, \tag{7.4} $$

where the quotes indicate that the neutron and proton are really not separate entities,
but part of the nuclear bound states. So for example, tritium decays according to

$$ {}^3{\rm H} \to {}^3{\rm He} + e^- \bar\nu_e \qquad (\tau = 5.6 \times 10^{8}\ {\rm s} = 17.7\ \text{years}), \tag{7.5} $$

and carbon-14 decays according to

$$ {}^{14}{\rm C} \to {}^{14}{\rm N} + e^- \bar\nu_e \qquad (\tau = 2.6 \times 10^{11}\ {\rm s} = 8280\ \text{years}). \tag{7.6} $$

Nuclear physicists usually quote the half-life $t_{1/2}$ rather than the mean lifetime $\tau$.
They are related by

$$ t_{1/2} = \tau \ln(2), \tag{7.7} $$

so that $t_{1/2} = 5730$ years for carbon-14, making it ideal for dating dead organisms.
In the upper atmosphere, cosmic rays produce energetic neutrons, which in turn
constantly convert ${}^{14}$N nuclei into ${}^{14}$C. Carbon-dioxide-breathing organisms, or those
that eat them, maintain an equilibrium with the carbon content of the atmosphere, at a
level of roughly ${}^{14}{\rm C}/{}^{12}{\rm C} \approx 10^{-12}$. However, a complication is the fact that this ratio
is not constant; it dropped in the early 20th century as more ordinary ${}^{12}$C entered the
atmosphere because of the burning of fossil fuels containing the carbon of organisms
that have been dead for a very long time. The relative abundance ${}^{14}{\rm C}/{}^{12}{\rm C} \approx 10^{-12}$
then doubled after 1954 because of nuclear weapons testing, reaching a peak in the
mid 1960s from which it has since declined. In any case, dead organisms lose half
of their ${}^{14}$C every $5730 \pm 30$ years, and certainly do not regain it by breathing or
eating. So, by measuring the rate of $e^-$ beta rays consistent with ${}^{14}$C decay produced
by a sample, and determining the historic atmospheric ${}^{14}{\rm C}/{}^{12}{\rm C}$ ratio as a function
of time with control samples or by other means, one can date the death of a sample
of organic matter.
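
The dating procedure just described amounts to inverting the exponential decay law. The sketch below uses (7.7) and the 5730-year half-life from the text; the function names are illustrative, and the atmospheric ratio at the time of death is treated as a known input, which in practice is the hard part.

```python
import math

T_HALF_C14_YEARS = 5730.0  # half-life of carbon-14 quoted in the text

def mean_lifetime(t_half):
    """Mean lifetime tau from the half-life, inverting (7.7)."""
    return t_half / math.log(2.0)

def age_from_ratio(ratio_now, ratio_at_death, t_half=T_HALF_C14_YEARS):
    """Years since death, given the 14C/12C ratio measured now and at death.

    Uses N(t) = N(0) exp(-t / tau), so t = tau * ln(N(0) / N(t)).
    """
    if ratio_now <= 0 or ratio_at_death <= 0:
        raise ValueError("ratios must be positive")
    return mean_lifetime(t_half) * math.log(ratio_at_death / ratio_now)

# A sample retaining one quarter of its original 14C is two half-lives old
print(age_from_ratio(0.25e-12, 1.0e-12))
print(mean_lifetime(T_HALF_C14_YEARS))
```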
One can also have decays that release a positron and neutrino:

$$ {}^A Z \to {}^A (Z-1) + e^+ \nu_e. \tag{7.8} $$

These can be thought of as coming from the subprocess

$$ \text{“}p^+\text{”} \to \text{“}n\text{”}\, e^+ \nu_e. \tag{7.9} $$

In free space, the proton cannot decay, simply because $m_p < m_n$, but under the right
circumstances it is kinematically allowed when the proton and neutron are parts of
nuclear bound states. An example is

$$ {}^{14}{\rm O} \to {}^{14}{\rm N} + e^+ \nu_e \qquad (\tau = 71\ {\rm s}). \tag{7.10} $$

The long lifetimes of such decays are what originally gave rise to the name “weak”
interactions.

Charged pions also decay through the weak interactions, with a mean lifetime of

$$ \tau_{\pi^\pm} = 2.6 \times 10^{-8}\ {\rm s}. \tag{7.11} $$

This is still a very long lifetime by particle physics standards, and corresponds to
a proper decay length of $c\tau = 7.8$ meters. The probability that a charged pion with
velocity $\beta$ will travel a distance $L$ in empty space before decaying is therefore

$$ P = e^{-(L/7.8\ {\rm m})\sqrt{1-\beta^2}/\beta}. \tag{7.12} $$

This means that a relativistic charged pion will typically travel several meters before
decaying, unless it interacts (which it usually will in a collider detector). The main
decay mode is
$$ \pi^- \to \mu^- \bar\nu_\mu \tag{7.13} $$

with a branching ratio of 0.99988. (This includes submodes in which an additional
photon is radiated away.) The only other significant decay mode is

$$ \pi^- \to e^- \bar\nu_e \tag{7.14} $$

with a branching ratio $1.2 \times 10^{-4}$. This presents a puzzle: since the electron is lighter,
there is more kinematic phase space available for the second decay, yet the first decay
dominates by almost a factor of $10^4$. We will calculate the reason for this later, in
Sect. 7.6.
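
To get a feel for the survival probability (7.12), the following sketch evaluates it for a few velocities. The function name is illustrative; the only physics input is the proper decay length of 7.8 m quoted above.

```python
import math

C_TAU_PION_M = 7.8  # proper decay length of the charged pion, in meters

def survival_probability(distance_m, beta, c_tau_m=C_TAU_PION_M):
    """Probability (7.12) to travel distance_m before decaying, for 0 < beta < 1."""
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    # sqrt(1 - beta^2) / beta is 1 / (gamma * beta), the inverse boost factor
    return math.exp(-(distance_m / c_tau_m) * math.sqrt(1.0 - beta**2) / beta)

for beta in (0.5, 0.9, 0.99, 0.999):
    print(beta, survival_probability(5.0, beta))
```

The probability approaches 1 as $\beta \to 1$, reflecting time dilation of the pion lifetime in the lab frame.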

7.2 Muon Decay

The muon decays according to

$$ \mu^- \to e^- \bar\nu_e \nu_\mu \qquad (\tau = 2.2 \times 10^{-6}\ {\rm s}). \tag{7.15} $$

This corresponds to a decay width of

$$ \Gamma = 3.0 \times 10^{-19}\ {\rm GeV}, \tag{7.16} $$

implying a proper decay length of cτ = 659 meters. Muons do not undergo hadronic
interactions like pions do, so that relativistic muons will usually penetrate at least
the inner layers of particle detectors with a very high probability.
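
The conversion between lifetime, width, and proper decay length uses only $\Gamma = \hbar/\tau$ and $c\tau$. In the sketch below the numerical values of $\hbar$ and $c$ are standard constants supplied here as inputs (they are not quoted in the text), and the function names are illustrative.

```python
HBAR_GEV_S = 6.582119569e-25   # reduced Planck constant in GeV s (standard value, assumed input)
C_M_PER_S = 299792458.0        # speed of light in m/s

def width_from_lifetime(tau_s):
    """Decay width in GeV from the mean lifetime in seconds: Gamma = hbar / tau."""
    return HBAR_GEV_S / tau_s

def proper_decay_length(tau_s):
    """Proper decay length c*tau in meters."""
    return C_M_PER_S * tau_s

tau_muon = 2.2e-6  # seconds, from (7.15)
print(width_from_lifetime(tau_muon))   # compare with (7.16)
print(proper_decay_length(tau_muon))   # compare with the quoted 659 meters
```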
The Feynman diagram for muon decay can be drawn as:

[Figure: Feynman diagram for $\mu^- \to e^- \bar\nu_e \nu_\mu$ with a single four-fermion vertex.]

This involves a 4-fermion interaction vertex.2 Because of the correspondence
between interactions and terms in the Lagrangian, we therefore expect that the
Lagrangian should contain terms schematically of the form

$$ \mathcal{L}_{\rm int} = (\bar\nu_\mu \ldots \mu)\,(\bar e \ldots \nu_e) \quad \text{or} \quad (\bar\nu_\mu \ldots \nu_e)\,(\bar e \ldots \mu), \tag{7.17} $$

where the symbols $\mu$, $\nu_\mu$, $e$, $\nu_e$ mean the Dirac spinor fields for the muon, muon
neutrino, electron, and electron neutrino, and the ellipses mean matrices in Dirac
spinor space. To be more precise about the interaction Lagrangian, one needs clues
from experiment.
One clue is the fact that there are three quantum numbers, called lepton numbers,
that are additively conserved to a high accuracy in most experiments. (The only
confirmed exceptions are neutrino oscillation experiments.) They are assigned as:
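$$ L_e = \begin{cases} +1 & \text{for } e^-, \nu_e \\ -1 & \text{for } e^+, \bar\nu_e \\ \phantom{+}0 & \text{for all other particles} \end{cases} \tag{7.18} $$

$$ L_\mu = \begin{cases} +1 & \text{for } \mu^-, \nu_\mu \\ -1 & \text{for } \mu^+, \bar\nu_\mu \\ \phantom{+}0 & \text{for all other particles} \end{cases} \tag{7.19} $$

$$ L_\tau = \begin{cases} +1 & \text{for } \tau^-, \nu_\tau \\ -1 & \text{for } \tau^+, \bar\nu_\tau \\ \phantom{+}0 & \text{for all other particles} \end{cases} \tag{7.20} $$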

So, for example, in the nuclear decay examples above, one always has L e = 0 in
the initial state, and L e = 1 − 1 = 0 in the final state, with L μ = L τ = 0 trivially in
each case. The muon decay mode in (7.15) has (L e , L μ ) = (0, 1) in both the initial
and final states. If lepton numbers were not conserved, then one might expect that
decays like

$$ \mu^- \to e^- \gamma \tag{7.21} $$

would be allowed. However, these decays have never been observed, and the most
recent limit from the MEG experiment at the Paul Scherrer Institute is

$$ {\rm BR}(\mu^- \to e^- \gamma) < 4.2 \times 10^{-13}. \tag{7.22} $$

(Actually, the MEG experiment searches for the decay $\mu^+ \to e^+ \gamma$, but the branching
ratio should be the same with all particles replaced by their anti-particles.) This is a
remarkably strong constraint, since this decay only has to compete with the already
weak mode in (7.15). It implies that

$$ \Gamma(\mu^- \to e^- \gamma) < 1.3 \times 10^{-31}\ {\rm GeV}. \tag{7.23} $$

2 We will later find out that this is not a true fundamental interaction of the theory, but rather an
“effective” interaction that is derived from the low-energy effects of the $W^-$ boson.

The BaBar and Belle experiments have put similar (but not as stringent) bounds on
tau lepton number non-conservation:

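$$ {\rm BR}(\tau^- \to e^- \gamma) < 3.3 \times 10^{-8} \quad \text{BaBar}, \tag{7.24} $$

$$ {\rm BR}(\tau^- \to \mu^- \gamma) < 4.4 \times 10^{-8} \quad \text{BaBar}, \tag{7.25} $$

$$ {\rm BR}(\tau^- \to e^- \pi^0) < 8.0 \times 10^{-8} \quad \text{Belle}. \tag{7.26} $$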

Since 1998, experimental results from neutrinos produced in the Sun, the atmo-
sphere, by accelerators, and in reactors have given strong evidence for oscillations
of neutrinos that are caused by them having small non-zero masses that violate the
individual lepton numbers. (It is still an open question whether they also violate the
total lepton number

$$ L \equiv L_e + L_\mu + L_\tau; \tag{7.27} $$

for more on this, see Sect. 10.4.) However, these are very small effects for colliding
beam experiments, and can be ignored for almost all conceivable processes at the
Tevatron and the LHC.
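
The lepton-number bookkeeping of (7.18)–(7.20) is simple enough to automate. The following sketch encodes the assignments in a lookup table and checks a few of the processes mentioned above; the particle labels are ad hoc strings chosen for this illustration, and any label missing from the table is treated as carrying zero lepton number.

```python
# (L_e, L_mu, L_tau) assignments from (7.18)-(7.20)
LEPTON_NUMBERS = {
    "e-": (1, 0, 0), "nu_e": (1, 0, 0), "e+": (-1, 0, 0), "nubar_e": (-1, 0, 0),
    "mu-": (0, 1, 0), "nu_mu": (0, 1, 0), "mu+": (0, -1, 0), "nubar_mu": (0, -1, 0),
    "tau-": (0, 0, 1), "nu_tau": (0, 0, 1), "tau+": (0, 0, -1), "nubar_tau": (0, 0, -1),
}

def lepton_numbers(particles):
    """Total (L_e, L_mu, L_tau) of a list of particle labels."""
    totals = [0, 0, 0]
    for name in particles:
        for i, value in enumerate(LEPTON_NUMBERS.get(name, (0, 0, 0))):
            totals[i] += value
    return tuple(totals)

def conserves_lepton_numbers(initial, final):
    """True if each of L_e, L_mu, L_tau is separately conserved."""
    return lepton_numbers(initial) == lepton_numbers(final)

def conserves_total_lepton_number(initial, final):
    """True if only the total L = L_e + L_mu + L_tau of (7.27) is conserved."""
    return sum(lepton_numbers(initial)) == sum(lepton_numbers(final))

print(conserves_lepton_numbers(["mu-"], ["e-", "nubar_e", "nu_mu"]))
print(conserves_lepton_numbers(["mu-"], ["e-", "gamma"]))
print(conserves_total_lepton_number(["mu-"], ["e-", "gamma"]))
```

The last two lines illustrate the distinction drawn in the text: $\mu^- \to e^-\gamma$ would violate the individual lepton numbers while still conserving the total.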
The (near) conservation of lepton numbers suggests that the interaction Lagrangian
for the weak interactions can always be written in terms of fermion bilinears involving
one barred and one unbarred Dirac spinor from each lepton family. So, we
will write the weak interactions for leptons in terms of building blocks with net
$L_e = L_\mu = L_\tau = 0$, for example, like the first term in (7.17) but not the second.
More generally, we will want to use building blocks:

$$ (\bar\ell \ldots \nu_\ell) \quad \text{or} \quad (\bar\nu_\ell \ldots \ell), \tag{7.28} $$

where $\ell$ is any of $e$, $\mu$, $\tau$. Now, since each Dirac spinor has 4 components, a basis
for fermion bilinears involving any two fields $\Psi_1$ and $\Psi_2$ will have $4 \times 4 = 16$
elements. They can be classified by their transformation properties under the proper
Lorentz group and the parity transformation $\vec{x} \to -\vec{x}$, as follows:

Term                                                        Number   Parity ($\vec{x} \to -\vec{x}$)   Type

$\overline{\Psi}_1 \Psi_2$                                  1        $+1$                     Scalar = S
$\overline{\Psi}_1 \gamma_5 \Psi_2$                         1        $-1$                     Pseudo-scalar = P
$\overline{\Psi}_1 \gamma^\mu \Psi_2$                       4        $(-1)^\mu$               Vector = V
$\overline{\Psi}_1 \gamma^\mu \gamma_5 \Psi_2$              4        $-(-1)^\mu$              Axial-vector = A
$\frac{i}{2} \overline{\Psi}_1 [\gamma^\mu, \gamma^\nu] \Psi_2$   6        $(-1)^\mu (-1)^\nu$      Tensor = T

The entry under Parity indicates the multiplicative factor under which each of these
terms transforms when $\vec{x} \to -\vec{x}$, with

$$ (-1)^\mu = \begin{cases} +1 & \text{for } \mu = 0 \\ -1 & \text{for } \mu = 1, 2, 3. \end{cases} \tag{7.29} $$

The weak interaction Lagrangian for leptons could be formed out of any product of
such terms with $\Psi_1, \Psi_2 = \ell, \nu_\ell$. Fermi originally proposed that the weak interaction
fermion building blocks were of the type $V$, so that muon decays would be described
by

$$ \mathcal{L}_{\rm int}^{V} = -G\,(\bar\nu_\mu \gamma^\rho \mu)(\bar e \gamma_\rho \nu_e) + {\rm c.c.} \tag{7.30} $$

Here “c.c.” means complex conjugate; this is necessary since the Dirac spinor fields
are complex. Some other possibilities could have been that the building blocks were
of type $A$:

$$ \mathcal{L}_{\rm int}^{A} = -G\,(\bar\nu_\mu \gamma^\rho \gamma_5 \mu)(\bar e \gamma_\rho \gamma_5 \nu_e) + {\rm c.c.} \tag{7.31} $$

or some combination of $V$ and $A$, or some combination of $S$ and $P$, or perhaps even $T$.
Fermi’s original proposal of $V$ for the weak interactions turned out to be wrong.
The most important clue for determining the correct answer for the proper Lorentz
and parity structure of the weak interaction building blocks came from an experiment
on polarized ${}^{60}$Co decay by Wu in 1957. The ${}^{60}$Co nucleus has spin $J = 5$, so that
when cooled and placed in a magnetic field, the nuclear spins align with $\vec{B}$. Wu then
measured the angular dependence of the electron spin from the decay

$$ {}^{60}{\rm Co} \to {}^{60}{\rm Ni} + e^- \bar\nu_e. \tag{7.32} $$

The nucleus ${}^{60}$Ni has spin $J = 4$, so the net angular momentum carried away by
the electron and antineutrino is 1. The observation was that the electron is emitted
preferentially in the direction opposite to the original spin of the ${}^{60}$Co nucleus. This
can be explained consistently with angular momentum conservation if the electron
produced in the decay is always polarized left-handed and the antineutrino is always
right-handed. Using short arrows to designate spin directions, the most favored
configuration is:

[Figure: the polarized ${}^{60}$Co nucleus with the emitted $e^-$ and $\bar\nu_e$, with short arrows showing the spin directions.]

The importance of this experiment and others was that right-handed electrons and
left-handed antineutrinos do not seem to participate in the weak interactions. This
means that when writing the interaction Lagrangian for weak interactions, we can
always put a $P_L$ to the left of the electron’s Dirac field, and a $P_R$ to the right of a
$\bar\nu_e$ field. This helped establish that the correct form for the fermion bilinear in the
Lagrangian is $V - A$:

$$ \bar e P_R \gamma^\rho \nu_e = \bar e \gamma^\rho P_L \nu_e = \frac{1}{2}\, \bar e \gamma^\rho (1 - \gamma_5) \nu_e. \tag{7.33} $$

Since this is a complex quantity, and the Lagrangian density must be real, one must
also have terms involving the complex conjugate of (7.33):

$$ \bar\nu_e P_R \gamma^\rho e = \bar\nu_e \gamma^\rho P_L e. \tag{7.34} $$

The feature that was considered most surprising at the time was that right-handed
Dirac fermion fields $P_R e$, $P_R \nu_e$, and left-handed Dirac barred fermion fields $\bar e P_L$,
and $\bar\nu_e P_L$ never appear in any part of the weak interaction Lagrangian.
For muon decay, the relevant four-fermion interaction Lagrangian is:

$$ \mathcal{L}_{\rm int} = -2\sqrt{2}\, G_F\, (\bar\nu_\mu \gamma^\rho P_L \mu)(\bar e \gamma_\rho P_L \nu_e) + {\rm c.c.} \tag{7.35} $$

Here $G_F$ is a coupling constant with dimensions of [mass]$^{-2}$, known as the Fermi
constant. Its numerical value is most precisely determined from muon decay. The
factor of $2\sqrt{2}$ is a historical convention. Using the correspondence between terms in
the Lagrangian and particle interactions, we therefore have two Feynman rules:

[Figure: the two four-fermion Feynman rules.]

These are related by reversing of all arrows, corresponding to the complex conjugate
in (7.35). The slightly separated dots in the Feynman rule picture are meant to
indicate the Dirac spinor structure. The Feynman rules for external state fermions
and antifermions are exactly the same as in QED, with neutrinos treated as fermions
and antineutrinos as antifermions. This weak interaction Lagrangian for muon decay
violates parity maximally, since it treats left-handed fermions differently from right-handed
fermions. However, helicity is conserved by this interaction Lagrangian, just
as in QED, because of the presence of one gamma matrix in each fermion bilinear.
We can now derive the reduced matrix element for muon decay, and use it to
compute the differential decay rate of the muon. Comparing this to the experimentally
measured result will allow us to find the numerical value of G F , and determine the
energy spectrum of the final state electron. At lowest order, the only Feynman diagram
for $\mu^- \to e^- \bar\nu_e \nu_\mu$ is:

[Figure: the lowest-order Feynman diagram for $\mu^- \to e^- \bar\nu_e \nu_\mu$.]

using the first of the two Feynman rules above. Let us label the momenta and spins
of the particles as follows:

Particle        Momentum   Spin    Spinor

$\mu^-$         $p_a$      $s_a$   $u(p_a, s_a) = u_a$
$e^-$           $k_1$      $s_1$   $\bar u(k_1, s_1) = \bar u_1$        (7.36)
$\bar\nu_e$     $k_2$      $s_2$   $v(k_2, s_2) = v_2$
$\nu_\mu$       $k_3$      $s_3$   $\bar u(k_3, s_3) = \bar u_3$

The reduced matrix element is obtained by starting at the end of each fermion line
with a barred spinor and following it back (moving opposite the arrow direction) to
the beginning. In this case, that means starting with the muon neutrino and electron
barred spinors. The result is:

$$ \mathcal{M} = -i 2\sqrt{2}\, G_F\, (\bar u_3 \gamma^\rho P_L u_a)(\bar u_1 \gamma_\rho P_L v_2). \tag{7.37} $$

This illustrates a general feature; in the weak interactions, there should be a $P_L$
next to each unbarred spinor in a matrix element, or equivalently a $P_R$ next to each
barred spinor. (The presence of the gamma matrix ensures the equivalence of these
two statements, since $P_L \leftrightarrow P_R$ when moved through a gamma matrix.) Taking the
complex conjugate, we have:

$$ \mathcal{M}^* = i 2\sqrt{2}\, G_F\, (\bar u_a P_R \gamma^\sigma u_3)(\bar v_2 P_R \gamma_\sigma u_1). \tag{7.38} $$

Therefore,

$$ |\mathcal{M}|^2 = 8 G_F^2\, (\bar u_3 \gamma^\rho P_L u_a)(\bar u_a P_R \gamma^\sigma u_3)\, (\bar u_1 \gamma_\rho P_L v_2)(\bar v_2 P_R \gamma_\sigma u_1). \tag{7.39} $$


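In the following, we can neglect the mass of the electron $m_e$, since $m_e/m_\mu < 0.005$.
Now we can average over the initial-state spin $s_a$ and sum over the final-state spins
$s_1, s_2, s_3$ using the usual tricks:

$$ \frac{1}{2} \sum_{s_a} u_a \bar u_a = \frac{1}{2} (\slashed{p}_a + m_\mu), \tag{7.40} $$

$$ \sum_{s_1} u_1 \bar u_1 = \slashed{k}_1, \tag{7.41} $$

$$ \sum_{s_2} v_2 \bar v_2 = \slashed{k}_2, \tag{7.42} $$

$$ \sum_{s_3} u_3 \bar u_3 = \slashed{k}_3, \tag{7.43} $$

to turn the result into a product of traces:

$$ \frac{1}{2} \sum_{\rm spins} |\mathcal{M}|^2 = 4 G_F^2\, {\rm Tr}[\gamma^\rho P_L (\slashed{p}_a + m_\mu) P_R \gamma^\sigma \slashed{k}_3]\, {\rm Tr}[\gamma_\rho P_L \slashed{k}_2 P_R \gamma_\sigma \slashed{k}_1] \tag{7.44} $$

$$ = 4 G_F^2\, {\rm Tr}[\gamma^\rho \slashed{p}_a P_R \gamma^\sigma \slashed{k}_3]\, {\rm Tr}[\gamma_\rho \slashed{k}_2 P_R \gamma_\sigma \slashed{k}_1]. \tag{7.45} $$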

Fortunately, we have already seen a product of traces just like this one, in (5.139),
so that by substituting in the appropriate 4-momenta, we immediately get:

$$ \frac{1}{2} \sum_{\rm spins} |\mathcal{M}|^2 = 64 G_F^2\, (p_a \cdot k_2)(k_1 \cdot k_3). \tag{7.46} $$
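
The step from (7.45) to (7.46) relies on the product of traces being equal to $16\,(p_a \cdot k_2)(k_1 \cdot k_3)$. This can be checked numerically with explicit matrices. The sketch below assumes the Dirac representation of the gamma matrices, the metric signature $(+,-,-,-)$, and $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$; these conventions and all function names are choices made for this illustration, and the result does not depend on the overall sign convention for $\gamma_5$ because it enters quadratically.

```python
import numpy as np

# Pauli matrices and Dirac-representation gamma matrices (assumed convention)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
SIGMA = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]
GAMMA = [np.block([[I2, Z2], [Z2, -I2]])] + [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]
GAMMA5 = 1j * GAMMA[0] @ GAMMA[1] @ GAMMA[2] @ GAMMA[3]
P_R = 0.5 * (np.eye(4) + GAMMA5)
METRIC = np.diag([1.0, -1.0, -1.0, -1.0])

def slash(p):
    """p-slash = gamma^mu p_mu for a contravariant 4-vector p."""
    return sum(METRIC[m, m] * p[m] * GAMMA[m] for m in range(4))

def minkowski_dot(p, q):
    """Lorentz-invariant dot product p.q."""
    return p @ METRIC @ q

def trace_product(pa, k1, k2, k3):
    """Contracted product of the two traces appearing in (7.45)."""
    total = 0.0
    for rho in range(4):
        for sigma in range(4):
            t1 = np.trace(GAMMA[rho] @ slash(pa) @ P_R @ GAMMA[sigma] @ slash(k3))
            t2 = np.trace(GAMMA[rho] @ slash(k2) @ P_R @ GAMMA[sigma] @ slash(k1))
            # lowering both indices on the second trace gives the metric factors
            total += METRIC[rho, rho] * METRIC[sigma, sigma] * t1 * t2
    return total

rng = np.random.default_rng(0)
pa, k1, k2, k3 = rng.normal(size=(4, 4))
print(trace_product(pa, k1, k2, k3))
print(16 * minkowski_dot(pa, k2) * minkowski_dot(k1, k3))
```

The identity holds for arbitrary 4-vectors, so random test vectors suffice; multiplying by the prefactor $4G_F^2$ then gives the coefficient 64 in (7.46).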

Our next task is to turn this reduced matrix element into a differential decay rate.
We apply the results of Sect. 6.4 to the example of muon decay, with $M = m_\mu$
and $m_1 = m_2 = m_3 = 0$. According to our result of (7.46), we need to evaluate the
dot products $p_a \cdot k_2$ and $k_1 \cdot k_3$. Since these are Lorentz scalars, we can evaluate
them in a frame where $\vec{k}_2$ is along the $z$-axis. Then

$$ p_a = (m_\mu, 0, 0, 0), \tag{7.47} $$

$$ k_2 = (E_{\bar\nu_e}, 0, 0, E_{\bar\nu_e}). \tag{7.48} $$

Therefore,

$$ p_a \cdot k_2 = m_\mu E_{\bar\nu_e}. \tag{7.49} $$

Also, $k_1 \cdot k_3 = \frac{1}{2}[(k_1 + k_3)^2 - k_1^2 - k_3^2] = \frac{1}{2}[(p_a - k_2)^2 - 0 - 0] = \frac{1}{2}[p_a^2 + k_2^2 - 2 p_a \cdot k_2]$, so

$$ k_1 \cdot k_3 = \frac{1}{2} (m_\mu^2 - 2 m_\mu E_{\bar\nu_e}). \tag{7.50} $$

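Therefore, from (7.46),

$$ \frac{1}{2} \sum_{\rm spins} |\mathcal{M}|^2 = 32 G_F^2\, (m_\mu^3 E_{\bar\nu_e} - 2 m_\mu^2 E_{\bar\nu_e}^2). \tag{7.51} $$

Plugging this into (6.66) with $M = m_\mu$, and choosing $E_1 = E_e$ and $E_2 = E_{\bar\nu_e}$, we
get:

$$ d\Gamma = dE_e\, dE_{\bar\nu_e}\, \frac{G_F^2}{2\pi^3}\, (m_\mu^2 E_{\bar\nu_e} - 2 m_\mu E_{\bar\nu_e}^2). \tag{7.52} $$

Doing the $dE_{\bar\nu_e}$ integral using the limits of integration of (6.70), we obtain:

$$ d\Gamma = dE_e\, \frac{G_F^2}{2\pi^3} \int_{\frac{m_\mu}{2} - E_e}^{\frac{m_\mu}{2}} dE_{\bar\nu_e}\, (m_\mu^2 E_{\bar\nu_e} - 2 m_\mu E_{\bar\nu_e}^2) = dE_e\, \frac{G_F^2}{\pi^3} \left( \frac{m_\mu^2 E_e^2}{4} - \frac{m_\mu E_e^3}{3} \right). \tag{7.53} $$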

We have obtained the differential decay rate for the energy of the final state electron:

$$ \frac{d\Gamma}{dE_e} = \frac{G_F^2 m_\mu^2}{4\pi^3}\, E_e^2 \left( 1 - \frac{4E_e}{3m_\mu} \right). \tag{7.54} $$

The shape of this distribution is shown below as the solid line:

[Figure: $d\Gamma/dE$ versus $E/m_\mu$ over the range 0 to 0.5; solid line for $e$ and $\nu_\mu$, dashed line for $\bar\nu_e$.]

We see that the electron energy is peaked near its maximum value of $m_\mu/2$. This corresponds
to the situation where the electron is recoiling directly against both the neutrino
and antineutrino, which are collinear; for example, $k_1^\mu = (m_\mu/2, 0, 0, -m_\mu/2)$,
and $k_2^\mu = k_3^\mu = (m_\mu/4, 0, 0, m_\mu/4)$:

[Figure: the electron recoiling against the collinear $\nu_\mu$ and $\bar\nu_e$, with short arrows showing the spin directions.]

The helicity of the initial state is undefined, since the muon is at rest. However,
we know that the final state $e^-$, $\nu_\mu$, and $\bar\nu_e$ have well-defined L, L, and R helicities
respectively, as shown above by the short arrows pointing in the spin directions, since
this is dictated by the weak interactions. In the case of maximum $E_e$, therefore, the
spins of $\nu_\mu$ and $\bar\nu_e$ must be in opposite directions. The helicity of the electron is L, so
its spin must be opposite to its 3-momentum direction. By momentum conservation,
this tells us that the electron must move in the opposite direction to the initial muon
spin in the limit that $E_e$ is near the maximum.
The smallest possible electron energies are near 0, which occurs when the neutrino
and antineutrino move in nearly opposite directions, so that the 3-momentum of the
electron recoiling against them is very small.
We have done the most practically sensible thing by plotting the differential decay
rate in terms of the electron energy, since that is what is directly observable in an
experiment. Just for fun, however, let us pretend that we could directly measure the
$\nu_\mu$ and $\bar\nu_e$ energies, and compute the distributions for them. To find $d\Gamma/dE_{\nu_\mu}$, we can
take $E_2 = E_{\bar\nu_e}$ and $E_1 = E_{\nu_\mu}$ in (6.66) and (6.70)–(6.71), with the reduced matrix
element from (7.51). Then

$$
d\Gamma = dE_{\nu_\mu} \, dE_{\bar\nu_e} \, \frac{G_F^2}{2\pi^3} \left( m_\mu^2 E_{\bar\nu_e} - 2 m_\mu E_{\bar\nu_e}^2 \right), \tag{7.55}
$$

and the range of integration for $E_{\bar\nu_e}$ is now:

$$
\frac{m_\mu}{2} - E_{\nu_\mu} < E_{\bar\nu_e} < \frac{m_\mu}{2}, \tag{7.56}
$$
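so that

$$
d\Gamma = dE_{\nu_\mu} \, \frac{G_F^2}{2\pi^3} \int_{\frac{m_\mu}{2} - E_{\nu_\mu}}^{\frac{m_\mu}{2}} dE_{\bar\nu_e} \left( m_\mu^2 E_{\bar\nu_e} - 2 m_\mu E_{\bar\nu_e}^2 \right) \tag{7.57}
$$

$$
= dE_{\nu_\mu} \, \frac{G_F^2}{\pi^3} \left( \frac{m_\mu^2 E_{\nu_\mu}^2}{4} - \frac{m_\mu E_{\nu_\mu}^3}{3} \right). \tag{7.58}
$$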

Therefore, the $E_{\nu_\mu}$ distribution of final states has the same shape as the $E_e$ distribution:

$$
\frac{d\Gamma}{dE_{\nu_\mu}} = \frac{G_F^2 m_\mu^2}{4\pi^3} \, E_{\nu_\mu}^2 \left( 1 - \frac{4E_{\nu_\mu}}{3m_\mu} \right). \tag{7.59}
$$

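Finally, we can find $d\Gamma/dE_{\bar\nu_e}$, by choosing $E_2 = E_e$ and $E_1 = E_{\bar\nu_e}$ in (6.66) and
(6.70)–(6.71) with (7.51). Then:

$$
d\Gamma = dE_{\bar\nu_e} \, \frac{G_F^2}{2\pi^3} \int_{\frac{m_\mu}{2} - E_{\bar\nu_e}}^{\frac{m_\mu}{2}} dE_e \left( m_\mu^2 E_{\bar\nu_e} - 2 m_\mu E_{\bar\nu_e}^2 \right) \tag{7.60}
$$

$$
= dE_{\bar\nu_e} \, \frac{G_F^2}{2\pi^3} \left( m_\mu^2 E_{\bar\nu_e}^2 - 2 m_\mu E_{\bar\nu_e}^3 \right), \tag{7.61}
$$

so that

$$
\frac{d\Gamma}{dE_{\bar\nu_e}} = \frac{G_F^2 m_\mu^2}{2\pi^3} \, E_{\bar\nu_e}^2 \left( 1 - \frac{2E_{\bar\nu_e}}{m_\mu} \right). \tag{7.62}
$$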

This distribution is plotted as the dashed line in the previous graph. Unlike the
distributions for $E_e$ and $E_{\nu_\mu}$, we see that $d\Gamma/dE_{\bar\nu_e}$ vanishes when $E_{\bar\nu_e}$ approaches its
maximum value of $m_\mu/2$. We can understand this by noting that when $E_{\bar\nu_e}$ is maximum,
the $\bar\nu_e$ must be recoiling against both $e^-$ and $\nu_\mu$ moving in the opposite direction, so
the L, L, R helicities of $e^-$, $\nu_\mu$, and $\bar\nu_e$ tell us that the total spin of the final state is 3/2:

Since the initial-state muon only had spin 1/2, the quantum states have 0 overlap,
and the rate must vanish in that limit of maximal $E_{\bar\nu_e}$.
The total decay rate for the muon is found by integrating either (7.54) with respect
to $E_e$, or (7.59) with respect to $E_{\nu_\mu}$, or (7.62) with respect to $E_{\bar\nu_e}$. In each case, we
get:

$$
\Gamma = \int_0^{m_\mu/2} \frac{d\Gamma}{dE_e} \, dE_e
= \int_0^{m_\mu/2} \frac{d\Gamma}{dE_{\nu_\mu}} \, dE_{\nu_\mu}
= \int_0^{m_\mu/2} \frac{d\Gamma}{dE_{\bar\nu_e}} \, dE_{\bar\nu_e} \tag{7.63}
$$

$$
= \frac{G_F^2 m_\mu^5}{192\pi^3}. \tag{7.64}
$$

It is a good check that the final result does not depend on the choice of the final
energy integration variable. It is also good to check units: $G_F^2$ has units of [mass]$^{-4}$
or [time]$^4$, while $m_\mu^5$ has units of [mass]$^5$ or [time]$^{-5}$, so $\Gamma$ indeed has units of [mass]
or [time]$^{-1}$.
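The independence of the total rate from the choice of integration variable can be confirmed numerically. The sketch below assumes units with $G_F = m_\mu = 1$ and uses a simple Simpson rule (exact for these cubic integrands up to rounding); the dictionary keys and helper name are illustrative only.

```python
import math

def simpson(f, a, b, n=200):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

# Assumption: units with G_F = m_mu = 1.
pi3 = math.pi**3
spectra = {
    "E_e":       lambda E: E**2 * (1 - 4 * E / 3) / (4 * pi3),  # Eq. (7.54)
    "E_nu_mu":   lambda E: E**2 * (1 - 4 * E / 3) / (4 * pi3),  # Eq. (7.59)
    "E_nubar_e": lambda E: E**2 * (1 - 2 * E) / (2 * pi3),      # Eq. (7.62)
}
expected = 1 / (192 * pi3)                                      # Eq. (7.64)

totals = {name: simpson(f, 0.0, 0.5) for name, f in spectra.items()}
for name, value in totals.items():
    print(f"{name}: {value:.6e} (expected {expected:.6e})")
```

All three integrals should reproduce the coefficient $1/(192\pi^3)$ of (7.64).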
Experiments tell us that

$$
\Gamma(\mu^- \to e^- \nu_\mu \bar\nu_e) = 2.99591(3) \times 10^{-19}\ \text{GeV}, \tag{7.65}
$$

$$
m_\mu = 0.1056584\ \text{GeV}, \tag{7.66}
$$

so we obtain the numerical value of Fermi’s constant from (7.64):

$$
G_F = 1.166364(5) \times 10^{-5}\ \text{GeV}^{-2}. \tag{7.67}
$$

(This determination also includes some small and delicate corrections reviewed
below in Sect. 7.3.)
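Since the text notes that $\Gamma$ has units of [mass] or [time]$^{-1}$, the width in (7.65) can be converted to a lifetime by $\tau = \hbar/\Gamma$. A minimal sketch follows; the value of $\hbar$ in GeV s is a standard constant that is not quoted in the text and is supplied here as an assumption.

```python
# Assumption: hbar in GeV*s (standard value, not quoted in the text).
HBAR_GEV_S = 6.582119569e-25

gamma_mu = 2.99591e-19            # GeV, Eq. (7.65)
tau_mu = HBAR_GEV_S / gamma_mu    # lifetime in seconds

print(f"tau_mu = {tau_mu:.4e} s")
```

The result should be close to the familiar muon lifetime of about 2.2 microseconds.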
The 4-fermion weak interaction Lagrangian of (7.35) describes several other processes
besides the decay $\mu^- \to e^- \bar\nu_e \nu_\mu$ that we studied in Sect. 7.2. As the simplest
example, we can just replace each particle in the process by its anti-particle:

$$
\mu^+ \to e^+ \nu_e \bar\nu_\mu, \tag{7.68}
$$

for which the Feynman diagram is just obtained by changing all of the arrow directions:

The evaluations of the reduced matrix element and the differential and total decay
rates for this decay are very similar to those for $\mu^- \to e^- \bar\nu_e \nu_\mu$. For future
reference, let us label the 4-momenta for this process as follows:

    Particle         Momentum    Spinor
    $\mu^+$          $p_a'$      $\bar v_a$
    $e^+$            $k_1'$      $v_1$                (7.69)
    $\nu_e$          $k_2'$      $\bar u_2$
    $\bar\nu_\mu$    $k_3'$      $v_3$

The reduced matrix element, following from the “+c.c.” term in (7.35), is then

$$
\mathcal{M} = -i 2\sqrt{2} \, G_F \, (\bar v_a \gamma^\rho P_L v_3)(\bar u_2 \gamma_\rho P_L v_1). \tag{7.70}
$$

As one might expect, the result for the spin-summed squared matrix element,

$$
\sum_{\text{spins}} |\mathcal{M}|^2 = 128 \, G_F^2 \, (p_a' \cdot k_2')(k_1' \cdot k_3'), \tag{7.71}
$$

is exactly the same as obtained in (7.46), with the obvious substitution of primed
4-momenta. The differential and total decay rates that follow from this are, of course,
exactly the same as for $\mu^-$ decay.

This is actually a special case of a general symmetry relation between particles
and anti-particles, which holds true in any local quantum field theory, and is known
as the CPT Theorem. The statement of the theorem is that the laws of physics, as
specified by the Lagrangian, are left unchanged after one performs the combined
operations of:

• charge conjugation (C): replacing each particle by its antiparticle,
• parity (P): replacing $\vec{x} \to -\vec{x}$,
• time reversal (T): replacing $t \to -t$.

It turns out to be impossible to write down any theory that fails to obey this rule, as
long as the Lagrangian is invariant under proper Lorentz transformations and contains
a finite number of spacetime derivatives and obeys some other technical assumptions.
Among other things, the CPT Theorem implies that the mass and the total decay rate
of a particle must each be equal to the same quantities for the corresponding anti-
particle. (It does not say that the differential decay rate to a particular final state
configuration necessarily has to be equal to the anti-particle differential decay rate
to the same configuration of final-state anti-particles; that stronger result holds only
if the theory is invariant under T. The four-fermion Fermi interaction for leptons
does respect invariance under T, but it is violated by a tiny amount in the weak
interactions of quarks.) We will study some other processes implied by the Fermi
weak interaction Lagrangians in Sects. 7.4, 7.5, and 7.6 below.

7.3 Corrections to Muon Decay

In the previous section, we derived the $\mu^-$ decay rate in terms of Fermi’s four-fermion
weak interaction coupling constant $G_F$. Since this decay process is actually the one
that is used to experimentally determine $G_F$ most accurately, it is worthwhile to note
the leading corrections to it.
First, there is the dependence on $m_e$, which we neglected, but could have included
at the cost of a more complicated phase space integration. Taking this into account
using correct kinematics for $m_e \neq 0$ and the limits of integration in (6.67)–(6.69),
one finds that the decay rate must be multiplied by a correction factor $F_{\rm kin}(m_e^2/m_\mu^2)$,
where

$$
F_{\rm kin}(x) = 1 - 8x + 8x^3 - x^4 - 12 x^2 \ln x. \tag{7.72}
$$

Numerically, $F_{\rm kin}(m_e^2/m_\mu^2) = 0.999813$.
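The quoted value of the kinematic factor can be reproduced directly from (7.72). This is a small sketch; the electron mass is a standard input that this section does not quote and is therefore an assumption, as are the names used.

```python
import math

def f_kin(x):
    """Phase-space correction factor of Eq. (7.72), with x = m_e^2 / m_mu^2."""
    return 1 - 8 * x + 8 * x**3 - x**4 - 12 * x**2 * math.log(x)

M_E = 0.000510999    # GeV, electron mass (assumed input, not quoted in this section)
M_MU = 0.1056584     # GeV, Eq. (7.66)

F_KIN = f_kin((M_E / M_MU) ** 2)
print(f"F_kin = {F_KIN:.6f}")
```

Almost all of the deviation from 1 comes from the $-8x$ term; the logarithmic term only enters at the level of $10^{-7}$.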

There are also corrections coming from two types of QED effects. First, there are
loop diagrams involving virtual photons:

Evaluating these diagrams is beyond the scope of this book. However, it should
be clear that they give contributions to the reduced matrix element proportional to
$e^2 G_F$, since each contains two photon interaction vertices. These contributions to
the reduced matrix element actually involve divergent loop integrals, which must be
“regularized” by using a high-energy cutoff. There is then a logarithmic dependence
on the cutoff energy, which can be absorbed into a redefinition of the mass and
coupling parameters of the model, by the systematic process of renormalization. The
interference of the loop diagrams with the original lowest-order diagram then gives a
contribution to the decay rate proportional to $\alpha G_F^2$. There are also QED contributions
from diagrams with additional photons in the final state:

The QED diagrams involving an additional photon contribute to a 4-body final state,
with a reduced matrix element proportional to $e G_F$. After squaring, summing over
final spins, averaging over the initial spin, and integrating over the 4-body phase
space, the contribution to the decay rate is again proportional to $\alpha G_F^2$. Much of this
contribution actually comes from very soft (low-energy) photons, which are difficult
or impossible to resolve experimentally. Therefore, one usually just combines the
two types of QED contributions into a total inclusive decay rate with one or more
extra photons in the final state. After a heroic calculation, one finds that the QED
effect on the total decay rate is to multiply by a correction factor

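$$
F_{\rm QED}(\alpha) = 1 + \frac{\alpha}{\pi} \left[ \frac{25}{8} - \frac{\pi^2}{2} - \frac{m_e^2}{m_\mu^2} \left( 9 + 4\pi^2 + 24 \ln\frac{m_e}{m_\mu} \right) \right] + C_2 \left( \frac{\alpha}{\pi} \right)^2 + \cdots \tag{7.73}
$$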

where the $C_2$ contribution refers to even higher-order corrections from: the interference
between Feynman diagrams with two virtual photons and the original Feynman
diagram; the interference between Feynman diagrams with one virtual photon plus
one final-state photon and the original Feynman diagram; the square of the reduced
matrix element for a Feynman diagram involving one virtual photon; and two photons
in the final state. A complicated calculation shows that $C_2 \approx 6.68$. Because of
the renormalization procedure, the QED coupling $\alpha$ actually is dependent on the
energy scale, and should be evaluated at the energy scale of interest for this problem,
which is naturally $m_\mu$. At that scale, $\alpha \approx 1/135.9$, so numerically

$$
F_{\rm QED}(\alpha) \approx 0.995802. \tag{7.74}
$$
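The numerical value in (7.74) follows from (7.73) with the inputs quoted above. A minimal sketch is given below; the electron mass is an assumed standard input, and terms beyond $C_2 (\alpha/\pi)^2$ are simply dropped.

```python
import math

def f_qed(alpha, m_e, m_mu, c2=6.68):
    """QED correction factor of Eq. (7.73), truncated after the C_2 term."""
    a = alpha / math.pi
    r2 = (m_e / m_mu) ** 2
    first_order = (25 / 8 - math.pi**2 / 2
                   - r2 * (9 + 4 * math.pi**2 + 24 * math.log(m_e / m_mu)))
    return 1 + a * first_order + c2 * a**2

M_E = 0.000510999    # GeV, electron mass (assumed input, not quoted in this section)
M_MU = 0.1056584     # GeV, Eq. (7.66)

F_QED = f_qed(1 / 135.9, M_E, M_MU)
print(f"F_QED = {F_QED:.6f}")
```

The first-order term dominates; the mass-dependent piece and the $C_2$ term shift the result only in the fifth and sixth decimal places.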

Finally, there are corrections involving the fact that the point-like four-fermion
interaction is actually due to the effect of a virtual $W^-$ boson. This gives a correction
factor

$$
F_W = 1 + \frac{3 m_\mu^2}{5 M_W^2} \approx 1.000001, \tag{7.75}
$$

using $M_W = 80.4$ GeV. The predicted decay rate defining $G_F$ experimentally, including
all these higher-order effects, is

$$
\Gamma_{\mu^-} = \frac{G_F^2 m_\mu^5}{192\pi^3} \, F_{\rm kin} F_{\rm QED} F_W. \tag{7.76}
$$

The dominant remaining uncertainty in $G_F$ quoted in (7.67) comes from the experimental
input of the muon lifetime.
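Inverting (7.76) for $G_F$ shows how the correction factors feed into (7.67). The sketch below takes the numerical inputs quoted in the text; the helper name `fermi_constant` is hypothetical.

```python
import math

GAMMA_MU = 2.99591e-19   # GeV, Eq. (7.65)
M_MU = 0.1056584         # GeV, Eq. (7.66)
M_W = 80.4               # GeV, as quoted with Eq. (7.75)
F_KIN = 0.999813         # value quoted after Eq. (7.72)
F_QED = 0.995802         # Eq. (7.74)
F_W = 1 + 3 * M_MU**2 / (5 * M_W**2)   # Eq. (7.75)

def fermi_constant(gamma, m_mu, correction=1.0):
    """Solve Gamma = G_F^2 m_mu^5 / (192 pi^3) * correction for G_F."""
    return math.sqrt(192 * math.pi**3 * gamma / (m_mu**5 * correction))

G_F_tree = fermi_constant(GAMMA_MU, M_MU)                       # from Eq. (7.64) alone
G_F = fermi_constant(GAMMA_MU, M_MU, F_KIN * F_QED * F_W)       # from Eq. (7.76)

print(f"G_F (no corrections)   = {G_F_tree:.6e} GeV^-2")
print(f"G_F (with corrections) = {G_F:.6e} GeV^-2")
```

Because the combined correction factor is slightly less than 1, the corrected $G_F$ comes out slightly larger than the lowest-order estimate, by roughly 0.2%.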

7.4 Inverse Muon Decay (e− νμ → νe μ− )

Consider the process of muon-neutrino scattering on an electron:

e− νμ → νe μ− . (7.77)

The Feynman diagram for this process is:


in which we see that the following particles have been crossed from the previous
diagram for $\mu^+$ decay:

    initial $\mu^+$ → final $\mu^-$                  (7.78)
    final $e^+$ → initial $e^-$                      (7.79)
    final $\bar\nu_\mu$ → initial $\nu_\mu$.         (7.80)

In fact, this scattering process is often known as inverse muon decay. To apply the
Crossing Symmetry Theorem stated in Sect. 5.3, we can assign momentum labels
$p_a$, $p_b$, $k_1$, $k_2$ to $e^-$, $\nu_\mu$, $\nu_e$, $\mu^-$ respectively, as shown in the figure, and then make
the following comparison table (primes denote the $\mu^+$ decay momenta):

    $\mu^+ \to e^+ \nu_e \bar\nu_\mu$      $e^- \nu_\mu \to \nu_e \mu^-$
    $\mu^+$, $p_a'$                        $\mu^-$, $k_2$
    $e^+$, $k_1'$                          $e^-$, $p_a$                    (7.81)
    $\nu_e$, $k_2'$                        $\nu_e$, $k_1$
    $\bar\nu_\mu$, $k_3'$                  $\nu_\mu$, $p_b$

Therefore, we obtain the spin-summed, squared matrix element for $e^- \nu_\mu \to \nu_e \mu^-$
by making the replacements

$$
p_a' = -k_2; \qquad k_1' = -p_a; \qquad k_2' = k_1; \qquad k_3' = -p_b \tag{7.82}
$$

in (7.71), and then multiplying by $(-1)^3$ for three crossed fermions, resulting in:

$$
\sum_{\text{spins}} |\mathcal{M}_{e^- \nu_\mu \to \nu_e \mu^-}|^2 = 128 \, G_F^2 \, (k_2 \cdot k_1)(p_a \cdot p_b). \tag{7.83}
$$

Let us evaluate this result in the limit of high-energy scattering, so that $m_\mu$ can be
neglected, and in the center-of-momentum frame. In that case, all four particles being
treated as massless, we can take the kinematics results from (5.177)–(5.181), so that
$p_a \cdot p_b = k_1 \cdot k_2 = s/2$, and

$$
\sum_{\text{spins}} |\mathcal{M}_{e^- \nu_\mu \to \nu_e \mu^-}|^2 = 32 \, G_F^2 \, s^2. \tag{7.84}
$$

Including a factor of 1/2 for the average over the initial-state electron spin,$^3$ and
using (4.192),

$$
\frac{d\sigma}{d(\cos\theta)} = \frac{G_F^2 s}{2\pi}. \tag{7.85}
$$

This differential cross-section is isotropic (independent of $\theta$), so it is trivial to
integrate $\int_{-1}^{1} d(\cos\theta) = 2$ to get the total cross-section:

$$
\sigma_{e^- \nu_\mu \to \nu_e \mu^-} = \frac{G_F^2 s}{\pi}. \tag{7.86}
$$

Numerically, we can evaluate this using (7.67):

$$
\sigma_{e^- \nu_\mu \to \nu_e \mu^-} = 16.9\ \text{fb} \left( \frac{\sqrt{s}}{\text{GeV}} \right)^2. \tag{7.87}
$$

In a typical experimental setup, the electrons will be contained in a target of
ordinary material at rest in the lab frame. The muon neutrinos might be produced
from a beam of decaying $\mu^-$, which are in turn produced by decaying pions, as
discussed later. If we call the $\nu_\mu$ energy in the lab frame $E_{\nu_\mu}$, then the
center-of-momentum energy is given by

$$
\sqrt{s} = E_{\rm CM} = \sqrt{2 E_{\nu_\mu} m_e + m_e^2} \approx \sqrt{2 E_{\nu_\mu} m_e}. \tag{7.88}
$$

Substituting this into (7.87) gives:

$$
\sigma_{e^- \nu_\mu \to \nu_e \mu^-} = 1.7 \times 10^{-2}\ \text{fb} \left( \frac{E_{\nu_\mu}}{\text{GeV}} \right). \tag{7.89}
$$

This is a very small cross-section for typical neutrino energies encountered in present
experiments, but it does grow with $E_{\nu_\mu}$.
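The numerical coefficients in (7.87) and (7.89) can be reproduced from (7.86) and (7.88). The sketch below assumes the standard conversion $1\ \text{GeV}^{-2} \approx 3.894 \times 10^{11}$ fb and the electron mass, neither of which is quoted in the text; the function names are hypothetical. Note that the massless approximation behind (7.86) is only meaningful when $s$ is well above $m_\mu^2$, so the example evaluates the lab-frame formula at 100 GeV.

```python
import math

G_F = 1.166364e-5          # GeV^-2, Eq. (7.67)
M_E = 0.000510999          # GeV, electron mass (assumed input)
GEV2_TO_FB = 3.893794e11   # fb per GeV^-2 (assumed standard conversion)

def sigma_fb(s_gev2):
    """Total cross-section of Eq. (7.86) in femtobarns, for s in GeV^2."""
    return G_F**2 * s_gev2 / math.pi * GEV2_TO_FB

def sigma_lab_fb(e_nu_gev):
    """Same cross-section for a neutrino of lab energy e_nu_gev on an electron at rest."""
    s = 2 * e_nu_gev * M_E + M_E**2   # Eq. (7.88)
    return sigma_fb(s)

print(f"coefficient in (7.87): {sigma_fb(1.0):.2f} fb per GeV^2")
print(f"sigma at E_nu = 100 GeV: {sigma_lab_fb(100.0):.3f} fb")
```

Dividing the second result by the neutrino energy in GeV gives the coefficient appearing in (7.89).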
The isotropy of e− νμ → νe μ− scattering in the center-of-momentum frame can
be understood from considering what the helicities dictated by the weak interactions
tell us about the angular momentum. Since this is a weak interaction process involving
only fermions and not anti-fermions, they are all L helicity.

3 In the Standard Model with neutrino masses neglected, all neutrinos are left-handed, and all
antineutrinos are right-handed. Since there is only one possible $\nu_\mu$ helicity, namely L, it would
be incorrect to average over the $\nu_\mu$ spin. This is a general feature; one should never average over
initial-state neutrino or antineutrino spins, as long as they are being treated as massless.

We therefore see that the initial and final states both have total spin 0, so that the
process is s-wave, and therefore necessarily isotropic.

7.5 $e^- \bar\nu_e \to \mu^- \bar\nu_\mu$

As another example, consider the process of antineutrino-electron scattering:

$$
e^- \bar\nu_e \to \mu^- \bar\nu_\mu. \tag{7.90}
$$

This process can again be obtained by crossing $\mu^+ \to e^+ \nu_e \bar\nu_\mu$ according to

    initial $\mu^+$ → final $\mu^-$                  (7.91)
    final $e^+$ → initial $e^-$                      (7.92)
    final $\nu_e$ → initial $\bar\nu_e$,             (7.93)

as can be seen from the Feynman diagram:

Therefore, we obtain the spin-summed squared matrix element for $e^- \bar\nu_e \to \mu^- \bar\nu_\mu$
by making the substitutions (primes denote the $\mu^+$ decay momenta)

$$
p_a' = -k_1; \qquad k_1' = -p_a; \qquad k_2' = -p_b; \qquad k_3' = k_2 \tag{7.94}
$$

in (7.71), and multiplying again by $(-1)^3$ because of the three crossed fermions. The
result this time is:

$$
\sum_{\text{spins}} |\mathcal{M}|^2 = 128 \, G_F^2 \, (k_1 \cdot p_b)(p_a \cdot k_2) = 32 \, G_F^2 \, u^2 = 8 \, G_F^2 \, s^2 (1 + \cos\theta)^2, \tag{7.95}
$$

where (5.179) and (5.181) for 2→2 massless kinematics have been used. Here θ is
the angle between the incoming e− and the outgoing μ− 3-momenta.
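The last equality in (7.95) can be checked with explicit massless 4-vectors in the center-of-momentum frame. This sketch assumes the metric signature $(+,-,-,-)$ and places the beam along the $z$ axis; the function names are illustrative, and $G_F^2$ is factored out.

```python
import math

def dot(a, b):
    """Minkowski product with signature (+,-,-,-)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

def check_795(s, cos_theta):
    """Return both sides of the last equality in Eq. (7.95), in units of G_F^2."""
    E = math.sqrt(s) / 2
    sin_theta = math.sqrt(1 - cos_theta**2)
    p_a = (E, 0.0, 0.0, E)                                # incoming e-
    p_b = (E, 0.0, 0.0, -E)                               # incoming nubar_e
    k_1 = (E, E * sin_theta, 0.0, E * cos_theta)          # outgoing mu-
    k_2 = (E, -E * sin_theta, 0.0, -E * cos_theta)        # outgoing nubar_mu
    lhs = 128 * dot(k_1, p_b) * dot(p_a, k_2)
    rhs = 8 * s**2 * (1 + cos_theta) ** 2
    return lhs, rhs

for s, c in [(1.0, 0.3), (4.0, -0.7), (2.5, 1.0)]:
    print(s, c, check_795(s, c))
```

Both sides vanish at $\cos\theta = -1$, where the $\mu^-$ would emerge opposite to the incoming $e^-$.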
Substituting this result into (4.192), with a factor of 1/2 to account for averaging
over the initial $e^-$ spin, we obtain:

$$
\frac{d\sigma_{e^- \bar\nu_e \to \mu^- \bar\nu_\mu}}{d(\cos\theta)} = \frac{G_F^2 s}{8\pi} (1 + \cos\theta)^2. \tag{7.96}
$$

Performing the $d(\cos\theta)$ integration gives a total cross-section of:

$$
\sigma_{e^- \bar\nu_e \to \mu^- \bar\nu_\mu} = \frac{G_F^2 s}{3\pi}. \tag{7.97}
$$
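The angular integration leading to (7.97), and its comparison with (7.86), can be done exactly with rational arithmetic. In this sketch all cross-sections are expressed as coefficients of $G_F^2 s/\pi$; the variable names are illustrative.

```python
from fractions import Fraction

def integral_one_plus_cos_squared():
    """Exact integral of (1 + c)^2 over c in [-1, 1], using the antiderivative (1 + c)^3 / 3."""
    antiderivative = lambda c: Fraction((1 + c) ** 3, 3)
    return antiderivative(1) - antiderivative(-1)

# Coefficients of G_F^2 s / pi.
sigma_nubar = Fraction(1, 8) * integral_one_plus_cos_squared()  # Eq. (7.96) integrated
sigma_nu = Fraction(1, 2) * 2                                   # Eq. (7.85) integrated
ratio = sigma_nubar / sigma_nu

print(sigma_nubar, sigma_nu, ratio)
```

The antineutrino cross-section coefficient is 1/3, so it is one third of the isotropic result (7.86).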

This calculation shows that in the center-of-momentum frame, the $\mu^-$ tends to keep
going in the same direction as the original $e^-$. This can be understood from the
helicity-spin-momentum diagram:

Since the helicities of $e^-$, $\bar\nu_e$, $\mu^-$, $\bar\nu_\mu$ are respectively L, R, L, R, the total spin of
the initial state must be pointing in the direction opposite to the $e^-$ 3-momentum,
and the total spin of the final state must be pointing opposite to the $\mu^-$ direction.
The overlap between these two states is therefore maximized when the $e^-$ and $\mu^-$
momenta are parallel, and vanishes when the $\mu^-$ tries to come out in the opposite
direction to the $e^-$. Of course, this reaction usually occurs in a laboratory frame in
which the initial $e^-$ was at rest, so one must correct for this when interpreting the
distribution in the lab frame.
The total cross-section for this reaction is 1/3 of that for the reaction
$e^- \nu_\mu \to \nu_e \mu^-$ of Sect. 7.4. This is because the latter reaction is an isotropic s-wave
(angular momentum 0), while the former is a p-wave (angular momentum 1), which can
only use one of the three possible $J = 1$ final states, namely, the one with $\vec{J}$ pointing
along the $\bar\nu_\mu$ direction.

7.6 Charged Currents and π ± Decay

The interaction Lagrangian term responsible for muon decay and for the cross-sections
discussed above is just one term in the weak-interaction Lagrangian. More
generally, we can write the Lagrangian as a product of a weak-interaction charged
current $J_\rho^-$ and its complex conjugate $J_\rho^+$:

$$
\mathcal{L}_{\rm int} = -2\sqrt{2} \, G_F \, J_\rho^+ J^{-\rho}. \tag{7.98}
$$

The weak-interaction charged current is obtained by adding together terms for pairs
of fermions, with the constraint that the total charge of the current is −1, and all
Dirac fermion fields involved in the current are left-handed, and all barred fields are
right-handed:

$$
J_\rho^- = \bar\nu_e \gamma_\rho P_L e + \bar\nu_\mu \gamma_\rho P_L \mu + \bar\nu_\tau \gamma_\rho P_L \tau + \bar u \gamma_\rho P_L d' + \bar c \gamma_\rho P_L s' + \bar t \gamma_\rho P_L b'. \tag{7.99}
$$

Notice that we have included contributions for the quarks. The quark fields $d'$, $s'$,
and $b'$ appearing here are actually not quite mass eigenstates, because of mixing;
this is the reason for the primes. The complex conjugate of $J_\rho^-$ has charge +1, and
is given by:

$$
J_\rho^+ = (J_\rho^-)^* = \bar e \gamma_\rho P_L \nu_e + \bar\mu \gamma_\rho P_L \nu_\mu + \bar\tau \gamma_\rho P_L \nu_\tau + \bar d' \gamma_\rho P_L u + \bar s' \gamma_\rho P_L c + \bar b' \gamma_\rho P_L t. \tag{7.100}
$$

Both $J^{-\rho}$ and $J^{+\rho}$ transform under proper Lorentz transformations as four-vectors,
and are $V - A$ fermion bilinears.
Unfortunately, there is an obstacle to a detailed, direct testing of the (V − A)(V −
A) form of the weak interaction Lagrangian for quarks. This is because the quarks are
bound into hadrons by strong interactions, so that cross-sections and decays involving
the weak interactions are subject to very large and very complicated corrections.
However, one can still do a quantitative analysis of some aspects of weak decays
involving hadrons, by a method of parameterizing our ignorance.
To see how this works, consider the process of charged pion decay. A $\pi^-$ consists
of a bound state made from a valence $d$ quark and $u$ antiquark, together with many
virtual gluons and quark-antiquark pairs. A Feynman diagram for $\pi^- \to \mu^- \bar\nu_\mu$
decay following from the current-current Lagrangian might therefore look like:

Unfortunately, the left part of this Feynman diagram, involving the complications
of the $\pi^-$ bound state, is just a cartoon for the strong interactions, which are not
amenable to perturbative calculation. The $u$ antiquark and $d$ quark do not even have
fixed momenta in this diagram, since they exchange energy and momentum with
each other and with the virtual gluons and quark-antiquark pairs. In principle, one
can find some distribution for the $u$ and $d$ momenta, and try to average over that
distribution, but the strong interactions are very complicated so this is not very easy
to do from the theoretical side. However, by considering what we do know about
the current-current Lagrangian, we can write down the general form of the reduced
matrix element. First, we know that the external state spinors for the fermions are:

    Particle         Momentum    Spinor
    $\mu^-$          $k_1$       $\bar u_1$           (7.101)
    $\bar\nu_\mu$    $k_2$       $v_2$

In terms of these spinors, we can write:

$$
\mathcal{M} = -i \sqrt{2} \, G_F \, f_\pi \, p_\rho \, (\bar u_1 \gamma^\rho P_L v_2). \tag{7.102}
$$

In this formula, the factor $(\bar u_1 \gamma^\rho P_L v_2)$ just reflects the fact that leptons are immune
from the complications of the strong interactions. The factor $f_\pi p_\rho$ takes into account
the part of the reduced matrix element involving the $\pi^-$; here $p^\rho$ is the 4-momentum
of the pion. The point is that whatever the pion factor in the reduced matrix element
is, we know that it is a four-vector in order to contract with the lepton part, and it
must be proportional to $p^\rho$, since there is no other vector quantity in the problem
that it can depend on. (Recall that pions are spinless, so there is no spin dependence.)
So we are simply parameterizing all of our ignorance of the bound-state properties
of the pion in terms of a single constant $f_\pi$, called the pion decay constant. It is a
quantity with dimensions of mass. In principle we could compute it if we had perfect
ability to calculate with the strong interactions. In practice, $f_\pi$ is an experimentally
measured quantity, with its value following most accurately from the $\pi^-$ lifetime
that we will compute below. The factor $\sqrt{2} G_F$ is another historical convention;
it could have been absorbed into the definition of $f_\pi$. But it is useful to have the
$G_F$ appear explicitly as a sign that this is a weak interaction; then $f_\pi$ is entirely a
strong-interaction parameter.
Let us now compute the decay rate for $\pi^- \to \mu^- \bar\nu_\mu$. Taking the complex square
of the reduced matrix element (7.102), we have:

$$
|\mathcal{M}|^2 = 2 G_F^2 f_\pi^2 \, p_\rho p_\sigma \, (\bar u_1 \gamma^\rho P_L v_2)(\bar v_2 P_R \gamma^\sigma u_1). \tag{7.103}
$$

Now summing over final state spins in the usual way gives:

$$
\sum_{\text{spins}} |\mathcal{M}|^2 = 2 G_F^2 f_\pi^2 \, p_\rho p_\sigma \, \text{Tr}[\gamma^\rho P_L \slashed{k}_2 P_R \gamma^\sigma (\slashed{k}_1 + m_\mu)] \tag{7.104}
$$

$$
= 2 G_F^2 f_\pi^2 \, \text{Tr}[\slashed{p} \slashed{k}_2 P_R \slashed{p} \slashed{k}_1]. \tag{7.105}
$$

Note that we do not neglect the mass of the muon, since $m_\mu/m_{\pi^\pm} = 0.1056$
GeV/0.1396 GeV = 0.756 is not a small number. However, the term in (7.104)
that explicitly involves $m_\mu$ does not contribute, since the trace of 3 gamma matrices
(with or without a $P_R$) vanishes. Evaluating the trace, we have:

$$
\text{Tr}[\slashed{p} \slashed{k}_2 P_R \slashed{p} \slashed{k}_1] = 4 (p \cdot k_1)(p \cdot k_2) - 2 p^2 (k_1 \cdot k_2). \tag{7.106}
$$

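The decay kinematics tells us that:

$$
p^2 = m_{\pi^\pm}^2; \qquad k_1^2 = m_\mu^2; \qquad k_2^2 = m_{\bar\nu_\mu}^2 = 0; \tag{7.107}
$$

$$
p \cdot k_1 = \frac{1}{2} \left[ -(p - k_1)^2 + p^2 + k_1^2 \right] = \frac{1}{2} (m_{\pi^\pm}^2 + m_\mu^2); \tag{7.108}
$$

$$
p \cdot k_2 = \frac{1}{2} \left[ -(p - k_2)^2 + p^2 + k_2^2 \right] = \frac{1}{2} (-m_\mu^2 + m_{\pi^\pm}^2); \tag{7.109}
$$

$$
k_1 \cdot k_2 = \frac{1}{2} \left[ (k_1 + k_2)^2 - k_1^2 - k_2^2 \right] = \frac{1}{2} (m_{\pi^\pm}^2 - m_\mu^2). \tag{7.110}
$$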

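Therefore, $\text{Tr}[\slashed{p} \slashed{k}_2 P_R \slashed{p} \slashed{k}_1] = m_\mu^2 (m_{\pi^\pm}^2 - m_\mu^2)$, and

$$
\sum_{\text{spins}} |\mathcal{M}|^2 = 2 G_F^2 f_\pi^2 \, m_{\pi^\pm}^2 m_\mu^2 \left( 1 - \frac{m_\mu^2}{m_{\pi^\pm}^2} \right), \tag{7.111}
$$

so that using (6.24), we get:

$$
d\Gamma = \frac{G_F^2 f_\pi^2 \, m_{\pi^\pm} m_\mu^2}{32\pi^2} \left( 1 - \frac{m_\mu^2}{m_{\pi^\pm}^2} \right)^2 d\phi \, d(\cos\theta), \tag{7.112}
$$

where $(\theta, \phi)$ are the angles for the $\mu^-$ three-momentum. Of course, since the pion
is spinless, the differential decay rate is isotropic, so the angular integration trivially
gives $\int d\phi \, d(\cos\theta) \to 4\pi$, and:

$$
\Gamma(\pi^- \to \mu^- \bar\nu_\mu) = \frac{G_F^2 f_\pi^2 \, m_{\pi^\pm} m_\mu^2}{8\pi} \left( 1 - \frac{m_\mu^2}{m_{\pi^\pm}^2} \right)^2. \tag{7.113}
$$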
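The kinematic relations (7.107)–(7.110) and the resulting value of the trace can be verified with explicit 4-vectors in the pion rest frame. This is a sketch assuming signature $(+,-,-,-)$, the rounded masses quoted in the text, and a decay axis along $z$; the names are illustrative.

```python
import math

M_PI = 0.1396   # GeV, charged pion mass as quoted in the text
M_MU = 0.1056   # GeV, muon mass as quoted in the text

def dot(a, b):
    """Minkowski product with signature (+,-,-,-)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]

# Momentum magnitude of either decay product in the pion rest frame.
k_abs = (M_PI**2 - M_MU**2) / (2 * M_PI)

p = (M_PI, 0.0, 0.0, 0.0)                                  # pion at rest
k_1 = (math.sqrt(k_abs**2 + M_MU**2), 0.0, 0.0, k_abs)     # muon
k_2 = (k_abs, 0.0, 0.0, -k_abs)                            # massless antineutrino

trace = 4 * dot(p, k_1) * dot(p, k_2) - 2 * dot(p, p) * dot(k_1, k_2)  # Eq. (7.106)
expected = M_MU**2 * (M_PI**2 - M_MU**2)

print(f"p.k1    = {dot(p, k_1):.6e}  vs  {(M_PI**2 + M_MU**2) / 2:.6e}")
print(f"p.k2    = {dot(p, k_2):.6e}  vs  {(M_PI**2 - M_MU**2) / 2:.6e}")
print(f"trace   = {trace:.6e}  vs  {expected:.6e}")
```

The trace is proportional to $m_\mu^2$, which is the origin of the helicity suppression discussed later in this section.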

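The charged pion can also decay according to $\pi^- \to e^- \bar\nu_e$. The calculation of this
decay rate is identical to the one just given, except that $m_e$ is substituted everywhere
for $m_\mu$. Therefore, we have:

$$
\Gamma(\pi^- \to e^- \bar\nu_e) = \frac{G_F^2 f_\pi^2 \, m_{\pi^\pm} m_e^2}{8\pi} \left( 1 - \frac{m_e^2}{m_{\pi^\pm}^2} \right)^2, \tag{7.114}
$$

and the ratio of branching ratios is predicted to be:

$$
\frac{\text{BR}(\pi^- \to e^- \bar\nu_e)}{\text{BR}(\pi^- \to \mu^- \bar\nu_\mu)} = \frac{\Gamma(\pi^- \to e^- \bar\nu_e)}{\Gamma(\pi^- \to \mu^- \bar\nu_\mu)} = \frac{m_e^2}{m_\mu^2} \left( \frac{m_{\pi^\pm}^2 - m_e^2}{m_{\pi^\pm}^2 - m_\mu^2} \right)^2 = 1.2 \times 10^{-4}. \tag{7.115}
$$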

The ratio (7.115) is a robust prediction of the theory, because the dependence on
$f_\pi$ has canceled out. Since there are no other kinematically-possible two-body decay
channels open to $\pi^-$, it should decay to $\mu^- \bar\nu_\mu$ almost always, with a rare decay to
$e^- \bar\nu_e$ occurring 0.012% of the time. This has been confirmed experimentally. We
can also use the measurement of the total lifetime of the $\pi^-$ to find $f_\pi$ numerically,
using (7.113). The result is:

$$
f_\pi = 0.128\ \text{GeV}. \tag{7.116}
$$

It is not surprising that this value is of the same order of magnitude as the mass of
the pion.
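Both the ratio (7.115) and the extraction of $f_\pi$ from (7.113) are short numerical exercises. The sketch below assumes standard values for the electron mass, the charged pion mass, the charged pion lifetime, and $\hbar$, none of which are quoted in this section; the function and variable names are hypothetical.

```python
import math

G_F = 1.166364e-5            # GeV^-2, Eq. (7.67)
M_MU = 0.1056584             # GeV, Eq. (7.66)
M_E = 0.000510999            # GeV (assumed input)
M_PI = 0.13957               # GeV, charged pion mass (assumed input)
TAU_PI = 2.6033e-8           # s, charged pion lifetime (assumed input)
HBAR_GEV_S = 6.582119569e-25 # GeV*s (assumed input)

def gamma_over_fpi2(m_lepton):
    """Gamma(pi- -> l- nubar_l) divided by f_pi^2, from Eqs. (7.113) and (7.114)."""
    return (G_F**2 * M_PI * m_lepton**2
            * (1 - m_lepton**2 / M_PI**2) ** 2 / (8 * math.pi))

# Ratio of electron to muon channels, Eq. (7.115); f_pi cancels.
ratio = gamma_over_fpi2(M_E) / gamma_over_fpi2(M_MU)

# Total width from the lifetime, then solve for f_pi using both channels.
gamma_total = HBAR_GEV_S / TAU_PI
f_pi = math.sqrt(gamma_total / (gamma_over_fpi2(M_MU) + gamma_over_fpi2(M_E)))

print(f"ratio = {ratio:.3e}")
print(f"f_pi  = {f_pi:.4f} GeV")
```

With these inputs the ratio lands near the quoted $1.2 \times 10^{-4}$ and $f_\pi$ near the quoted 0.128 GeV; the electron channel has a negligible effect on the extracted $f_\pi$.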
The most striking feature of the $\pi^-$ decay rate is that it is proportional to $m_\mu^2$, with
$\mathcal{M}$ proportional to $m_\mu$. This is what leads to the strong suppression of decays to $e^- \bar\nu_e$
(already mentioned at the end of Sect. 7.1). We found this result just by calculating.
To understand it better, we can draw a momentum-helicity-spin diagram, using the
fact that the $\ell^-$ and $\bar\nu_\ell$ produced in the weak interactions are L and R respectively:

The $\pi^-$ has spin 0, but the final state predicted by the weak interaction helicities
unambiguously has spin 1. Therefore, if helicity were exactly conserved, the $\pi^-$
could not decay at all! However, helicity conservation only holds in the high-energy
limit in which we can treat all fermions as massless. This decay is said to be
helicity-suppressed, since the only reason it can occur is because $m_\mu$ and $m_e$ are non-zero.
In the limit $m_\ell \to 0$, we recover exact helicity conservation and the reduced matrix
element and the decay rate vanish. This explains why they should be proportional
to $m_\ell$ and $m_\ell^2$ respectively. The helicity suppression of this decay is therefore a
good prediction of the rule that the weak interactions affect only L fermions and
R antifermions. In the final state, the charged lepton $\mu^-$ or $e^-$ is said to undergo a
helicity flip, meaning that the L-helicity fermion produced by the weak interactions
has an amplitude to appear in the final state as an R-fermion. In general, a helicity flip
for a fermion entails a suppression in the reduced matrix element proportional to the
mass of the fermion divided by its energy.
Having computed the decay rate following from (7.102), let us find a Lagrangian
that would give rise to it involving a quantum field for the pion. Although the pion
is a composite, bound-state particle, we can still invent a quantum field for it, in an
approximate, “effective” description. The π − corresponds to a charged spin-0 field.
Previously, we studied spin-0 particles described by a real scalar field. However, the
particle and antiparticle created by a real scalar field turned out to be the same thing.
Here we want something different; since the π − is charged, its antiparticle π + is
clearly a different particle. This means that the π − particle should be described by
a complex scalar field.

Let us therefore define $\pi^-(x)$ to be a complex scalar field, with its complex
conjugate given by

$$
\pi^+(x) \equiv (\pi^-(x))^*. \tag{7.117}
$$

We can construct a real free Lagrangian density from these complex fields as follows:

$$
\mathcal{L} = \partial_\mu \pi^+ \partial^\mu \pi^- - m_{\pi^\pm}^2 \pi^+ \pi^-. \tag{7.118}
$$

(Compare to the Lagrangian density for a real scalar field, (4.18).) This Lagrangian
density describes free pion fields with mass $m_{\pi^\pm}$. At any fixed time $t = 0$, the $\pi^+$
and $\pi^-$ fields can be expanded in creation and annihilation operators as:

$$
\pi^-(\vec{x}) = \int d\tilde{p} \, \left( e^{i\vec{p}\cdot\vec{x}} a_{\vec{p},-} + e^{-i\vec{p}\cdot\vec{x}} a^\dagger_{\vec{p},+} \right); \tag{7.119}
$$

$$
\pi^+(\vec{x}) = \int d\tilde{p} \, \left( e^{i\vec{p}\cdot\vec{x}} a_{\vec{p},+} + e^{-i\vec{p}\cdot\vec{x}} a^\dagger_{\vec{p},-} \right). \tag{7.120}
$$

Note that these fields are indeed complex conjugates of each other, and that they are
each complex since $a_{\vec{p},-}$ and $a_{\vec{p},+}$ are taken to be independent. The operators $a_{\vec{p},-}$
and $a^\dagger_{\vec{p},-}$ act on states by destroying and creating a $\pi^-$ particle with 3-momentum
$\vec{p}$. Likewise, the operators $a_{\vec{p},+}$ and $a^\dagger_{\vec{p},+}$ act on states by destroying and creating a
$\pi^+$ particle with 3-momentum $\vec{p}$. In particular, the single particle states are:

$$
a^\dagger_{\vec{p},-} |0\rangle = |\pi^-; \vec{p}\rangle, \tag{7.121}
$$

$$
a^\dagger_{\vec{p},+} |0\rangle = |\pi^+; \vec{p}\rangle. \tag{7.122}
$$

One can now carry through canonical quantization as usual. Given an interaction
Lagrangian, one can derive the corresponding Feynman rules for the propagator and
interaction vertices. Since a π − moving forward in time is a π + moving backwards
in time, and vice versa, there is only one propagator for π ± fields. It differs from the
propagator for an ordinary scalar in that it carries an arrow indicating the direction
of the flow of charge:

The external state pion lines also carry an arrow direction telling us whether it is a
π − or a π + particle. A pion line entering from the left with an arrow pointing to the
right means a π + particle in the initial state, while a line entering from the left with
an arrow pointing back to the left means a π − particle in the initial state. Similarly,
if a pion line leaves the diagram to the right, it represents a final state pion, with an
arrow to the right meaning a π + and an arrow to the left meaning a π − . We can
summarize this with the following mnemonic figures:

In each case the Feynman rule factor associated with the initial- or final-state pion
is just 1.
Returning to the reduced matrix element of (7.102), we can interpret this as
coming from a pion-lepton-antineutrino interaction vertex. When we computed the
decay matrix element, the pion was on-shell, but in general this need not be the case.
The pion decay constant $f_\pi$ must therefore be generalized to a function $f(p^2)$, with

$$
f(p^2) \Big|_{p^2 = m_{\pi^\pm}^2} = f_\pi \tag{7.123}
$$

when the pion is on-shell. The momentum-space factor $f(p^2) p_\rho$ can be interpreted
by identifying the 4-momentum as a differential operator acting on the pion field,
using:

$$
p_\rho \leftrightarrow i \partial_\rho. \tag{7.124}
$$

Then, reversing the usual procedure of inferring the Feynman rule from a term in
the interaction Lagrangian, we conclude that the effective interaction describing $\pi^-$
decay is:

$$
\mathcal{L}_{\text{int},\, \pi^- \bar\mu \nu_\mu} = -\sqrt{2} \, G_F \, (\bar\mu \gamma^\rho P_L \nu_\mu) \, f(-\partial^2) \, \partial_\rho \pi^-. \tag{7.125}
$$

Here $f(-\partial^2)$ can in principle be defined in terms of its power-series expansion in
the differential operator $-\partial^2 = \vec\nabla^2 - \partial_t^2$ acting on the pion field. In practice, one
usually just works in momentum space where it is $f(p^2)$. Since the Lagrangian must
be real, we must also include the complex conjugate of this term:

$$
\mathcal{L}_{\text{int},\, \pi^+ \mu \bar\nu_\mu} = -\sqrt{2} \, G_F \, (\bar\nu_\mu \gamma^\rho P_L \mu) \, f(-\partial^2) \, \partial_\rho \pi^+. \tag{7.126}
$$

The Feynman rules for these effective interactions are:


and

In each Feynman diagram, the arrow on the pion line describes the direction of flow
of charge, and the 4-momentum $p^\rho$ is taken to be flowing in to the vertex. When the
pion is on-shell, one can replace $f(p^2)$ by the pion decay constant $f_\pi$.
Other charged mesons made out of a quark and antiquark, like the $K^\pm$, $D^\pm$, and
$D_s^\pm$, have their own decay constants $f_K$, $f_D$, and $f_{D_s}$, and their decays can be treated
in a similar way.

7.7 Unitarity, Renormalizability, and the W Boson

An important feature of weak-interaction $2 \to 2$ cross-sections following from
Fermi’s four-fermion interaction is that they grow proportional to $s$ for very large $s$;
see (7.86) and (7.97). This had to be true on general grounds just from dimensional
analysis. Any reduced matrix element that contains one four-fermion interaction will
be proportional to $G_F$, so the cross-section will have to be proportional to $G_F^2$. Since
this has units of [mass]$^{-4}$, and cross-sections must have dimensions of [mass]$^{-2}$, it
must be that the cross-section scales like the square of the characteristic energy of
the process, $s$, in the high-energy limit in which all other kinematic mass scales are
comparatively unimportant. This behavior of $\sigma \propto s$ is not acceptable for arbitrarily
large $s$, since the cross-section is bounded by the fact that the probability for any two
particles to scatter cannot exceed 1. In quantum mechanical language, the constraint
is on the unitarity of the time-evolution operator $e^{-iHt}$. If the cross-section grows
too large, then our perturbative approximation $e^{-iHt} = 1 - iHt$ represented by the
lowest-order Feynman diagram must break down. The reduced matrix element found
from just including this Feynman diagram will have to be compensated somehow by
higher-order diagrams, or by changing the physics of the weak interactions at some
higher energy scale.

Let us develop the dimensional analysis of fields and couplings further. We know
that the Lagrangian must have the same units as energy. In the standard system in
which $c = \hbar = 1$, this is equal to units of [mass]. Since $d^3\vec{x}$ has units of [length]$^3$,
or [mass]$^{-3}$, and

$$
L = \int d^3\vec{x} \, \mathcal{L}, \tag{7.127}
$$

it must be that $\mathcal{L}$ has units of [mass]$^4$. This fact allows us to evaluate the units of
all fields and couplings in a theory. For example, a spacetime derivative has units of
inverse length, or [mass]. Therefore, from the kinetic terms for scalars, fermions, and
vector fields found for example in (4.18), (4.26), and (4.34), we find that these types
of fields must have dimensions of [mass], [mass]$^{3/2}$, and [mass] respectively. This
allows us to evaluate the units of various possible interaction couplings that appear
in the Lagrangian density. For example, a coupling of $n$ scalar fields,

$$
\mathcal{L}_{\rm int} = -\frac{\lambda_n}{n!} \phi^n \tag{7.128}
$$

implies that $\lambda_n$ has units of [mass]$^{4-n}$. A vector-fermion-fermion coupling, like $e$ in
QED, is dimensionless. The effective coupling $f_\pi$ for on-shell pions has dimensions
of [mass], because of the presence of a spacetime derivative together with a scalar
field and two fermion fields in the Lagrangian. Summarizing this information for the
known types of fields and couplings that we have encountered so far:
    Object                           Dimension               Role
    $\mathcal{L}$                    [mass]$^4$              Lagrangian density
    $\partial_\mu$                   [mass]                  derivative
    $\phi$                           [mass]                  scalar field
    $\Psi$                           [mass]$^{3/2}$          fermion field
    $A^\mu$                          [mass]                  vector field
    $\lambda_3$                      [mass]                  scalar$^3$ coupling
    $\lambda_4$                      [mass]$^0$              scalar$^4$ coupling
    $y$                              [mass]$^0$              scalar-fermion-fermion (Yukawa) coupling       (7.129)
    $e$                              [mass]$^0$              photon-fermion-fermion coupling
    $G_F$                            [mass]$^{-2}$           fermion$^4$ coupling
    $f_\pi$                          [mass]                  fermion$^2$-scalar-derivative coupling
    $u, v, \bar u, \bar v$           [mass]$^{1/2}$          external-state spinors
    $\mathcal{M}_{N_i \to N_f}$      [mass]$^{4-N_i-N_f}$    reduced matrix element for $N_i \to N_f$ particles
    $\sigma$                         [mass]$^{-2}$           cross-section
    $\Gamma$                         [mass]                  decay rate.
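The coupling entries of the table follow from one counting rule: the mass dimension of a coupling is 4 minus the dimensions of the fields and derivatives it multiplies. The sketch below encodes that rule; the function and dictionary names are illustrative, and the pion vertex is treated as carrying the product $G_F f_\pi$ as in (7.125).

```python
from fractions import Fraction

# Mass dimensions of the fields, as derived from their kinetic terms.
FIELD_DIM = {
    "scalar": Fraction(1),
    "fermion": Fraction(3, 2),
    "vector": Fraction(1),
}

def coupling_dimension(fields, derivatives=0):
    """Mass dimension of the coupling multiplying a term in the Lagrangian density."""
    return 4 - sum(FIELD_DIM[f] for f in fields) - derivatives

dims = {
    "lambda_3": coupling_dimension(["scalar"] * 3),
    "lambda_4": coupling_dimension(["scalar"] * 4),
    "lambda_5": coupling_dimension(["scalar"] * 5),
    "y": coupling_dimension(["scalar", "fermion", "fermion"]),
    "e": coupling_dimension(["vector", "fermion", "fermion"]),
    "G_F": coupling_dimension(["fermion"] * 4),
    "G_F * f_pi": coupling_dimension(["fermion", "fermion", "scalar"], derivatives=1),
}
# The pion vertex carries the product G_F * f_pi, so f_pi alone has this dimension:
f_pi_dim = dims["G_F * f_pi"] - dims["G_F"]

for name, d in dims.items():
    print(f"{name}: [mass]^{d}")
print(f"f_pi: [mass]^{f_pi_dim}")
```

Couplings that come out with negative dimension, such as $G_F$ and $\lambda_5$, are the ones associated with non-renormalizability in the discussion that follows.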
It is a general fact that theories with couplings with negative mass dimension, like
$G_F$, or $\lambda_n$ for $n \ge 5$, always suffer from a problem known as non-renormalizability.$^4$

4 The converse is not true; just because a theory has only couplings with positive or zero mass
dimension does not guarantee that it is renormalizable. It is a necessary, but not sufficient, condition.

In a renormalizable theory, the divergences that occur in loop diagrams due to inte-
grating over arbitrarily large 4-momenta for virtual particles can be regularized by
introducing a high momentum cutoff, and then the resulting dependence on the
unknown cutoff can be absorbed into a redefinition of the masses and coupling
constants of the theory. In contrast, in a non-renormalizable theory, one finds that
this process requires introducing an infinite number of different couplings, each
of which must be redefined in order to absorb the momentum-cutoff dependence.
This dependence on an infinite number of different coupling constants makes non-
renormalizable theories non-predictive, although only in principle. We can always
use non-renormalizable theories as effective theories at low energies, as we have
done in the case of the four-fermion theory of the weak interactions. However, when
probed at sufficiently high energies, a non-renormalizable theory will encounter
related problems associated with the apparent failure of unitarity (cross-sections
that grow uncontrollably with energy) and non-renormalizability (an uncontrollable
dependence on more and more unknown couplings that become more and more
important at higher energies). For this reason, we are happier to describe physical
phenomena using renormalizable theories if we can.
An example of a useful non-renormalizable theory is gravity. The effective
coupling constant for Feynman diagrams involving gravitons is $1/M_{\rm Planck}^2$, where
$M_{\rm Planck} = 2.4 \times 10^{18}$ GeV is the “reduced Planck mass”. Like $G_F$, this coupling has
negative mass dimension, so tree-level cross-sections for 2 → 2 scattering involving
gravitons grow with energy like s. This is not a problem as long as we stick
to scattering energies $\ll M_{\rm Planck}$, which corresponds to all known experiments and
directly measured phenomena. However, we do not know how unitarity is restored
in gravitational interactions at energies comparable to MPlanck or higher. Unlike the
case of the weak interactions, it is hard to conceive of an experiment with present
technologies that could test competing ideas.
To find a renormalizable “fix” for the weak interactions, we note that the (V −
A)(V − A) current-current structure of the four-fermion coupling could come about
from the exchange of a heavy vector particle. To do this, we imagine “pulling apart”
the two currents, and replacing the short line segment by the propagator for a virtual
vector particle. For example, one of the current-current terms is:

Since the currents involved have electric charges −1 and +1, the vector boson must
carry charge −1 to the right. This is the $W^-$ vector boson. By analogy with the
charged pion, the $W^\pm$ are complex vector fields, with an arrow on each propagator
indicating the direction of flow of charge.

The Feynman rule for the propagator of a charged vector W ± boson carrying
4-momentum p turns out to be:

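In the limit of low energies and momenta, $|p_\rho| \ll m_W$, this propagator just becomes
a constant:

\[ \frac{i}{p^2 - m_W^2 + i\epsilon}\left[-g_{\rho\sigma} + \frac{p_\rho p_\sigma}{m_W^2}\right] \;\longrightarrow\; i\,\frac{g_{\rho\sigma}}{m_W^2} . \tag{7.130} \]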

In order to complete the correspondence between the effective four-fermion interaction
and the more fundamental version involving the vector boson, we need
W-fermion-antifermion vertex Feynman rules of the form:

Here g is a fundamental coupling of the weak interactions, and the $1/\sqrt{2}$ is a standard
convention. These are the Feynman rules involving $W^\pm$ interactions with leptons;
there are similar rules for interactions with the quarks in the charged currents $J_\rho^+$
and $J_\rho^-$ given earlier in (7.99) and (7.100). The interaction Lagrangian for W bosons
with standard model fermions corresponding to these Feynman rules is:
\[ \mathcal{L}_{\rm int} = -\frac{g}{\sqrt{2}}\left(W^{+\rho} J^-_\rho + W^{-\rho} J^+_\rho\right). \tag{7.131} \]

Comparing the four-fermion vertex to the reduced matrix element from W -boson
exchange, we find that we must have:
\[ -i\,2\sqrt{2}\,G_F = \left(\frac{-ig}{\sqrt{2}}\right)^2 \frac{i}{m_W^2}, \tag{7.132} \]

so that

\[ G_F = \frac{g^2}{4\sqrt{2}\,m_W^2} . \tag{7.133} \]

The $W^\pm$ boson has been discovered, with a mass $m_W = 80.4$ GeV, so we conclude
that

g ≈ 0.65. (7.134)
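The arithmetic behind (7.134) can be reproduced by inverting (7.133). The sketch below assumes the measured Fermi constant $G_F \approx 1.166 \times 10^{-5}$ GeV^{−2}, a value that is not quoted in this passage and is supplied here as an input; the W mass is the one given in the text.

```python
import math

G_F = 1.166e-5   # Fermi constant in GeV^-2 (assumed input, not quoted in this passage)
m_W = 80.4       # W boson mass in GeV, as quoted in the text

# Invert G_F = g^2 / (4 sqrt(2) m_W^2) for the dimensionless coupling g.
g = math.sqrt(4.0 * math.sqrt(2.0) * G_F * m_W**2)

print(f"g = {g:.2f}")
```

With these inputs the result lands close to the 0.65 quoted above.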

Since this is a dimensionless coupling, there is at least a chance to make this into a
renormalizable theory that is unitary in perturbation theory. At very high energies,
the $W^\pm$ propagator will behave like $1/p^2$, rather than the $1/m_W^2$ that is encoded in
$G_F$ in the four-fermion approximation. This “softens” the weak interactions at high
energies, leading to cross-sections that fall, rather than rise, at very high $\sqrt{s}$.
When a massive vector boson appears in a final state, it has a Feynman rule
given by a polarization vector $\epsilon_\mu(p, \lambda)$, just like the photon did. The difference is
that a massive vector particle V has three physical polarization states λ = 1, 2, 3,
satisfying

\[ p^\rho \epsilon_\rho(p, \lambda) = 0 \qquad (\lambda = 1, 2, 3). \tag{7.135} \]

One can sum over these polarizations for an initial or final state in a squared reduced
matrix element, with the result:

\[ \sum_{\lambda=1}^{3} \epsilon_\rho(p, \lambda)\,\epsilon^*_\sigma(p, \lambda) = -g_{\rho\sigma} + \frac{p_\rho p_\sigma}{m_V^2} . \tag{7.136} \]
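As a numerical check of (7.135) and (7.136), one can pick an explicit basis. The sketch below assumes a boson moving along the z axis, two transverse and one longitudinal real polarization vectors, and a metric with signature (+, −, −, −), which is the convention implied by the propagator denominator $p^2 - m^2$. The specific mass and momentum values are illustrative only.

```python
import math

METRIC = [1.0, -1.0, -1.0, -1.0]  # diagonal of g_{mu nu}, signature (+,-,-,-)

def lower(v):
    """Lower the index of a contravariant four-vector."""
    return [METRIC[mu] * v[mu] for mu in range(4)]

m_V = 80.4   # boson mass in GeV (illustrative value)
p_z = 35.0   # momentum along z in GeV (illustrative value)
E = math.sqrt(p_z**2 + m_V**2)
p_up = [E, 0.0, 0.0, p_z]

# Two transverse and one longitudinal polarization vector (upper indices).
# They are real here, so complex conjugation is trivial.
eps_up = [
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [p_z / m_V, 0.0, 0.0, E / m_V],
]

p_low = lower(p_up)
eps_low = [lower(e) for e in eps_up]

# Transversality (7.135): p^rho eps_rho = 0 for each polarization.
transversality = [sum(p_up[r] * e[r] for r in range(4)) for e in eps_low]

# Both sides of the polarization sum (7.136), with lowered indices.
lhs = [[sum(e[r] * e[s] for e in eps_low) for s in range(4)] for r in range(4)]
rhs = [[-(METRIC[r] if r == s else 0.0) + p_low[r] * p_low[s] / m_V**2
        for s in range(4)] for r in range(4)]
max_diff = max(abs(lhs[r][s] - rhs[r][s]) for r in range(4) for s in range(4))

print(transversality, max_diff)
```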

Summarizing the propagator and external state Feynman rules for a generic massive
vector boson for future reference:

If the massive vector is charged, like the W ± bosons, then an arrow is added to each
line to show the direction of flow of charge.
The weak interactions and the strong interactions are invariant under non-Abelian
gauge transformations, which involve a generalization of the type of gauge invari-
ance we have already encountered in the case of QED. This means that the gauge

transformations not only multiply fields by phases, but can mix the fields. In the next
section we will begin to study the properties of field theories, known as Yang-Mills
theories, which have a non-Abelian gauge invariance. This will enable us to get a
complete theory of the weak interactions.

Problems

1. Consider an alternative theory for muon decay based on scalar-parity violating
interaction currents:

\[ \mathcal{L}^{S-P}_{\rm int} = -4\sqrt{2}\,G_X\,(\bar{\nu}_\mu P_L \mu)(\bar{e} P_L \nu_e) + \text{c.c.} \tag{7.137} \]

where $G_X$ is a constant.
For this S − P theory, calculate

\[ \frac{1}{2}\sum_{\rm spins} |\mathcal{M}|^2 \tag{7.138} \]

for $\mu^- \to e^- \nu_\mu \bar{\nu}_e$ decay. Neglect the electron and neutrino masses.

2. Continue the previous problem by finding the resulting distribution for the electron
energy produced in muon decay according to the S − P theory,

\[ \frac{d\Gamma}{dE_e}, \tag{7.139} \]

and integrate to find $\Gamma$. How do these predictions compare to the standard V − A
theory that we have established to be correct?
Historical note: the correctness of the V − A theory of the weak interactions was
established mostly by analyzing a variety of analogous nuclear decay experiments.
Muon decay played a minor role historically.
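3. Consider the process of $W^-$ boson decay, through the interaction Lagrangian:

\[ \mathcal{L} = -\frac{g}{\sqrt{2}}\, W^-_\mu J^{+\mu} + \text{c.c.} \tag{7.140} \]
\[ J^{+\mu} = \bar{b}\gamma^\mu P_L t + \bar{s}\gamma^\mu P_L c + \bar{d}\gamma^\mu P_L u + \bar{\tau}\gamma^\mu P_L \nu_\tau + \bar{\mu}\gamma^\mu P_L \nu_\mu + \bar{e}\gamma^\mu P_L \nu_e . \tag{7.141} \]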

Notice that we are again making the approximation of ignoring the issue of mass
eigenstates being not quite the same as the fields that couple to the W − .

(a) The W − boson cannot decay into a final state with a bottom quark, within
the very good approximation just mentioned. Why? (This is a useful thing
sometimes; if your experiment tags a bottom quark jet, you can say it almost
certainly didn’t come from a W − decay unless it was mis-tagged.)

(b) Treating the electron as massless, compute the decay rate for $W^- \to e^- \bar{\nu}_e$,
in terms of g and $M_W$. [Draw the Feynman diagram, write down the reduced
matrix element, take its complex square, average over the three possible initial
polarizations of the $W^-$ boson, sum over all possible final state spins.]
(c) From your answer to part (b), infer the results for: $\Gamma(W^- \to \mu^- \bar{\nu}_\mu)$ and
$\Gamma(W^- \to \tau^- \bar{\nu}_\tau)$ and $\Gamma(W^- \to d\bar{u})$ and $\Gamma(W^- \to s\bar{c})$, treating all of the
final-state fermions as massless. Remember that each quark in the final state
has 3 possible colors, which you must sum over. The antiquarks are constrained
to have the opposite color, so once you have summed over the quark colors,
you should not sum over the antiquark’s anticolors. Because the W − has a
large mass and the decay happens quickly, you can assume that the strong
interactions of the quarks and antiquarks are irrelevant until long after the
decay has occurred.
(d) From the above results, predict the total decay width of the W boson in GeV,
its lifetime in seconds, and its branching ratio into each of the possible final
states.
8 Quantum Chromo-Dynamics (QCD)

8.1 Groups and Representations

In this section, we will generalize the idea of gauge invariance found in electrodynam-
ics. This is primarily a mathematical exercise which will serve the greater purpose
of this chapter to describe quantum chromodynamics in the following sections—the
theory that governs the interactions of the quarks.
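Recall that in QED the Lagrangian is defined in terms of a covariant derivative

\[ D_\mu = \partial_\mu + iQeA_\mu \tag{8.1} \]

and a field strength

\[ F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu \tag{8.2} \]

as

\[ \mathcal{L} = -\frac{1}{4} F^{\mu\nu} F_{\mu\nu} + i\,\overline{\Psi} \slashed{D} \Psi - m\,\overline{\Psi}\Psi . \tag{8.3} \]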
This Lagrangian is invariant under the local gauge transformation

\[ A_\mu \to A'_\mu = A_\mu - \frac{1}{e}\partial_\mu\theta, \tag{8.4} \]
\[ \Psi \to \Psi' = e^{iQ\theta}\,\Psi, \tag{8.5} \]

where θ (x) is any function of spacetime, called a gauge parameter. Now, the result
of doing one gauge transformation θ1 followed by another gauge transformation θ2
is always a third gauge transformation parameterized by the function θ1 + θ2 :

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 199
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_8

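\[ A_\mu \to \left(A_\mu - \frac{1}{e}\partial_\mu\theta_1\right) - \frac{1}{e}\partial_\mu\theta_2 = A_\mu - \frac{1}{e}\partial_\mu(\theta_1+\theta_2), \tag{8.6} \]
\[ \Psi \to e^{iQ\theta_2}\left(e^{iQ\theta_1}\Psi\right) = e^{iQ(\theta_1+\theta_2)}\,\Psi . \tag{8.7} \]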

Mathematically, these gauge transformations are an example of a group.

A group is a set of elements $G = \{g_i\}$ and a rule for multiplying them, with the
properties:

(1) Closure: If $g_i$ and $g_j$ are elements of the group G, then the product $g_i g_j$ is also
an element of G.
(2) Associativity: $g_i (g_j g_k) = (g_i g_j) g_k$.
(3) Existence of an Identity: There is a unique element $I = g_1$ of the group, such
that for all $g_i$ in G, $I g_i = g_i I = g_i$.
(4) Inversion: For each $g_i$, there is a unique inverse element $(g_i)^{-1}$ satisfying
$(g_i)^{-1} g_i = g_i (g_i)^{-1} = I$.

It may or may not be true that the group also satisfies the commutativity property:

\[ g_i g_j = g_j g_i . \tag{8.8} \]

If this is satisfied, then the group is called commutative or Abelian. Otherwise it is
non-commutative or non-Abelian.
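A short numerical illustration of the distinction, offered as a sketch: U(1) phases always commute, while two 2 × 2 unitary matrices of the form $\exp(i\theta\sigma^a/2)$ (the spin-1/2 rotations familiar from quantum mechanics, borrowed here purely as an example) generally do not. The helper names `matmul` and `close` are hypothetical.

```python
import cmath
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def close(a, b, tol=1e-12):
    """True if two square matrices agree entry by entry within tol."""
    n = len(a)
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(n) for j in range(n))

# U(1): group elements are pure phases, so the order of multiplication is irrelevant.
u1, u2 = cmath.exp(0.3j), cmath.exp(1.1j)
u1_commutes = abs(u1 * u2 - u2 * u1) < 1e-12

# Two unitary 2x2 matrices: rotations by theta = pi/2 about the x and z axes.
c, s = math.cos(math.pi / 4), math.sin(math.pi / 4)
rot_x = [[c, 1j * s], [1j * s, c]]            # exp(i (pi/2) sigma^1 / 2)
rot_z = [[c + 1j * s, 0], [0, c - 1j * s]]    # exp(i (pi/2) sigma^3 / 2)
su2_commutes = close(matmul(rot_x, rot_z), matmul(rot_z, rot_x))

print(u1_commutes, su2_commutes)
```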
In generalizing the QED Lagrangian, we will be interested in continuous Lie
groups. A continuous group has an uncountably infinite number of elements labeled
by one or more continuously varying parameters, which turn out to be nothing other
than the generalizations of the gauge parameter θ in QED. A Lie group is a continuous
group that also has the desirable property of being differentiable with respect to the
gauge parameters.
The action of group elements on physics states or fields can be represented by
a set of complex n × n matrices acting on n-dimensional complex vectors. This
association of group elements with n × n matrices and the states or fields that they
act on is said to form a representation of the group. The matrices obey the same rules
as the group elements themselves.
For example, the group of QED gauge transformations is the Abelian Lie group
U (1). According to (8.5), the group is represented on Dirac fermion fields by complex
1 × 1 matrices:

\[ U_Q(\theta) = e^{iQ\theta} . \tag{8.9} \]

Here θ labels the group elements, and the charge Q labels the representation of
the group. So we can say that the electron, muon, and tau Dirac fields each live in
a representation of the group U (1) with charge Q = −1; the Dirac fields for up,
charm, and top quarks each live in a representation with charge Q = 2/3; and the

Dirac fields for down, strange and bottom quarks each live in a representation with
Q = −1/3. We can read off the charge of any field if we know how it transforms
under the gauge group. A barred Dirac field transforms with the opposite phase from
the original Dirac field of charge Q, and therefore has charge −Q.
Objects that transform into themselves with no change are said to be in the singlet
representation. In general, the Lagrangian should be invariant under gauge transfor-
mations, and therefore must be in the singlet representation. For example, each term
of the QED Lagrangian carries no charge, and so is a singlet of U (1). The photon
field Aμ has charge 0, and is therefore usually said (by a slight abuse of language) to
transform as a singlet representation of U (1). [Technically, it does not really trans-
form under gauge transformations as any representation of the group U (1), because
of the derivative term in (8.4), unless θ is a constant function so that one is making
the same transformation everywhere in spacetime.]
Let us now generalize to non-Abelian groups, which always involve representa-
tions containing more than one field or state. Let ϕi be a set of objects that together
transform in some representation R of the group G. The number of components of
ϕi is called the dimension of the representation, d R , so that i = 1, . . . , d R . Under a
group transformation,

\[ \varphi_i \to \varphi'_i = U_i{}^j \varphi_j \tag{8.10} \]

where $U_i{}^j$ is a representation matrix. We are especially interested in transformations
that are represented by unitary matrices, so that their action can be realized on the
quantum Hilbert space by a unitary operator. Consider the subset of group elements
that are infinitesimally close to the identity element. We can write these in the form:

\[ U(\epsilon)_i{}^j = (1 + i\epsilon^a T^a)_i{}^j . \tag{8.11} \]

Here the $T^{aj}_i$ are a basis for all the possible infinitesimal group transformations.
The number of matrices $T^a$ is called the dimension of the group, $d_G$, and there
is an implicit sum over $a = 1, \ldots, d_G$. The $\epsilon^a$ are a set of $d_G$ infinitesimal gauge
parameters (analogous to θ in QED) that tell us how much of each is included in the
transformation represented by $U(\epsilon)$. Since $U(\epsilon)$ is unitary,

\[ U(\epsilon)^\dagger = U(\epsilon)^{-1} = 1 - i\epsilon^a T^a, \tag{8.12} \]

from which it follows that the matrices $T^a$ must be Hermitian.

Consider two group transformations $g_\epsilon$ and $g_\delta$ parameterized by $\epsilon^a$ and $\delta^a$. By
the closure property we can then form a new group element

\[ g_\epsilon\, g_\delta\, g_\epsilon^{-1} g_\delta^{-1} . \tag{8.13} \]

Working with some particular representation, this corresponds to

\[ U(\epsilon)U(\delta)U(\epsilon)^{-1}U(\delta)^{-1} = (1 + i\epsilon^a T^a)(1 + i\delta^b T^b)(1 - i\epsilon^c T^c)(1 - i\delta^d T^d) \tag{8.14} \]
\[ = 1 - \epsilon^a\delta^b\,[T^a, T^b] + \cdots \tag{8.15} \]

The closure property requires that this is a representation of the group element in
(8.13), which must also be close to the identity. It follows that

\[ [T^a, T^b] = i f^{abc} T^c \tag{8.16} \]

for some set of numbers $f^{abc}$, called the structure constants of the group. In practice,
one often picks a particular representation of matrices $T^a$ as the defining or
fundamental representation. This determines the structure constants $f^{abc}$ once and
for all. The set of matrices $T^a$ for every other representation is then required to reproduce
(8.16), which fixes their overall normalization. Equation (8.16) defines the Lie
algebra corresponding to the Lie group, and the Hermitian matrices $T^a$ are said to be
generators of the Lie algebra for the corresponding representation. Many physicists
have a bad habit of using the words “Lie group” and “Lie algebra” interchangeably,
because we often only care about the subset of gauge transformations that are close
to the identity.
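The expansion leading from (8.14) to (8.16) can be checked numerically. The sketch below assumes the spin-1/2 generators $T^a = \sigma^a/2$ of SU(2), for which $[T^1, T^2] = iT^3$, and uses the closed-form exponentials of $T^1$ and $T^2$; the function names are hypothetical. It compares the exact group commutator with the leading-order prediction $1 - \epsilon\delta\,[T^1, T^2]$.

```python
import math

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    """Hermitian conjugate of a 2x2 matrix."""
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def exp_T1(t):
    """exp(i t sigma^1 / 2), evaluated in closed form."""
    c, s = math.cos(t / 2), math.sin(t / 2)
    return [[complex(c), 1j * s], [1j * s, complex(c)]]

def exp_T2(t):
    """exp(i t sigma^2 / 2), evaluated in closed form."""
    c, s = math.cos(t / 2), math.sin(t / 2)
    return [[complex(c), complex(s)], [complex(-s), complex(c)]]

eps, delta = 1e-3, 2e-3
U_eps, U_del = exp_T1(eps), exp_T2(delta)

# For unitary matrices the inverse is the Hermitian conjugate.
group_comm = matmul(matmul(U_eps, U_del), matmul(dagger(U_eps), dagger(U_del)))

# Leading-order prediction: 1 - eps*delta*[T^1, T^2] = 1 - i*eps*delta*T^3,
# with T^3 = sigma^3 / 2 = diag(1/2, -1/2).
predicted = [[1 - 0.5j * eps * delta, 0], [0, 1 + 0.5j * eps * delta]]

residual = max(abs(group_comm[i][j] - predicted[i][j])
               for i in range(2) for j in range(2))
shift_from_identity = abs(group_comm[0][0] - 1)

print(residual, shift_from_identity)
```

The residual is of third order in the small parameters, well below the second-order shift away from the identity that the commutator term captures.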
For any given representation R, one can always choose the generators so that:

\[ {\rm Tr}(T_R^a T_R^b) = I(R)\,\delta^{ab} . \tag{8.17} \]

The number I(R) is called the index of the representation. A standard choice is that
the index of the fundamental representation of a non-Abelian Lie algebra is 1/2.
(This can always be achieved by rescaling the $T^a$, if necessary.) From (8.16) and
(8.17), one obtains for any representation R:

\[ i\,I(R)\,f^{abc} = {\rm Tr}([T_R^a, T_R^b]\,T_R^c) . \tag{8.18} \]

It follows, from the cyclic property of the trace, that $f^{abc}$ is totally antisymmetric
under interchange of any two of a, b, c. By using the Jacobi identity,

\[ [T^a, [T^b, T^c]] + [T^b, [T^c, T^a]] + [T^c, [T^a, T^b]] = 0, \tag{8.19} \]

which holds for any three matrices, one also finds the useful result:

\[ f^{ade} f^{bce} + f^{cde} f^{abe} + f^{bde} f^{cae} = 0 . \tag{8.20} \]

Two representations R and R′ are said to be equivalent if there exists some fixed
matrix X such that:

\[ X T_R^a X^{-1} = T_{R'}^a, \tag{8.21} \]

for all a. Obviously, this requires that R and R′ have the same dimension. From a
physical point of view, equivalent representations are indistinguishable from each
other.

A representation R of a Lie algebra is said to be reducible if it is equivalent to a
representation in block-diagonal form; in other words, if there is some matrix X that
can be used to put all of the $T_R^a$ simultaneously in a block-diagonal form:

\[ X T_R^a X^{-1} = \begin{pmatrix} T^a_{r_1} & 0 & \cdots & 0 \\ 0 & T^a_{r_2} & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & T^a_{r_n} \end{pmatrix} \quad \text{for all } a. \tag{8.22} \]

Here the $T^a_{r_i}$ are representation matrices for smaller representations $r_i$. One calls this
a direct sum, and writes it as

\[ R = r_1 \oplus r_2 \oplus \cdots \oplus r_n . \tag{8.23} \]

A representation that is not equivalent to a direct sum of smaller representations in this
way is said to be irreducible. Heuristically, reducible representations are those that
can be chopped up into smaller pieces that can be consistently treated individually.
With the above conventions on the Lie algebra generators, one can show that for
each irreducible representation R:

\[ (T_R^a T_R^a)_i{}^j = C(R)\,\delta_i^j \tag{8.24} \]

(with an implicit sum over $a = 1, \ldots, d_G$), where C(R) is another characteristic
number of the representation R, called the quadratic Casimir invariant. If we take
the sum over a of (8.17), it is equal to the trace of (8.24). It follows that for each
irreducible representation R, the dimension, the index, and the Casimir invariant are
related to the dimension of the group by:

\[ d_G\, I(R) = d_R\, C(R) . \tag{8.25} \]
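A minimal check of (8.17), (8.24) and (8.25), sketched for one concrete case: the two-dimensional representation of SU(2) generated by $T^a = \sigma^a/2$, with the Pauli matrices written out explicitly. The helper names are hypothetical.

```python
# Pauli matrices; the generators are T^a = sigma^a / 2.
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
T = [[[x / 2 for x in row] for row in s] for s in SIGMA]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

d_G = len(T)       # number of generators: dimension of the group
d_R = len(T[0])    # size of the matrices: dimension of the representation

# Index from Tr(T^a T^b) = I(R) delta^{ab}.
gram = [[complex(trace(matmul(T[a], T[b]))) for b in range(d_G)] for a in range(d_G)]
index = gram[0][0].real
gram_is_diagonal = all(
    abs(gram[a][b] - (index if a == b else 0)) < 1e-12
    for a in range(d_G) for b in range(d_G)
)

# Quadratic Casimir from sum_a T^a T^a = C(R) times the identity.
squares = [matmul(T[a], T[a]) for a in range(d_G)]
casimir_matrix = [[complex(sum(sq[i][j] for sq in squares)) for j in range(d_R)]
                  for i in range(d_R)]
casimir = casimir_matrix[0][0].real

print(index, casimir, d_G * index, d_R * casimir)
```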

The simplest irreducible representation of any Lie algebra is just:

\[ T^{aj}_i = 0 . \tag{8.26} \]

This is called the singlet representation.

Suppose that we have some representation with matrices $T^{aj}_i$. Then one can show
that the matrices $-(T^{aj}_i)^*$ also form a representation of the algebra (8.16). This is
called the complex conjugate of the representation R, and is often denoted $\overline{R}$:

\[ T^a_{\overline{R}} = -T_R^{a*} . \tag{8.27} \]

If $T^a_{\overline{R}}$ is equivalent to $T^a_R$, so that there is some fixed matrix X such that

\[ X T_R^a X^{-1} = T^a_{\overline{R}}, \tag{8.28} \]


then the representation R is said to be a real representation,1 and otherwise R is said
to be complex.
One can also form the tensor product of any two representations R, R′ of the Lie
algebra to get another representation:

\[ (T^a_{R\otimes R'})_{i,x}{}^{j,y} \equiv (T_R^a)_i{}^j\,\delta_x^y + \delta_i^j\,(T^a_{R'})_x{}^y . \tag{8.29} \]

The representation R ⊗ R′ has dimension $d_R d_{R'}$, and is typically reducible:

\[ R \otimes R' = R_1 \oplus \cdots \oplus R_n \tag{8.30} \]

with

\[ d_{R\otimes R'} = d_R\, d_{R'} = d_{R_1} + \cdots + d_{R_n} . \tag{8.31} \]

This is a way to make larger representations ($R_1, \ldots, R_n$) out of smaller ones (R, R′).
One can check from the identity (8.20) that the matrices

\[ (T^a)_b{}^c = -i f^{abc} \tag{8.32} \]

form a representation, called the adjoint representation, with the same dimension
as the group G. As a matter of terminology, the quadratic Casimir invariant of the
adjoint representation is also called the Casimir invariant of the group, and given the
symbol C(G). Note that, from (8.25), the index of the adjoint representation is equal
to its quadratic Casimir invariant:

C(G) ≡ C(adjoint) = I (adjoint). (8.33)

We now list, without proof, some further group theory facts regarding Lie algebra
representations:

• The number of inequivalent irreducible representations of a Lie group is always
infinite.
• Unlike group element multiplication, the tensor product multiplication of repre-
sentations is both associative and commutative:

(R1 ⊗ R2 ) ⊗ R3 = R1 ⊗ (R2 ⊗ R3 ), (8.34)

R1 ⊗ R2 = R2 ⊗ R1 . (8.35)

1 Real representations can be divided into two sub-cases, “positive-real” and “pseudo-real”,
depending on whether the matrix X can or cannot be chosen to be symmetric. In a pseudo-real representation,
the $T^a$ cannot all be made antisymmetric and imaginary; in a positive-real representation, they can.

• The tensor product of any representation with the singlet representation just gives
the original representation back:

1 ⊗ R = R ⊗ 1 = R. (8.36)

• The tensor product of two real representations R1 and R2 is always a direct sum
of representations that are either real or appear in complex conjugate pairs.
• The adjoint representation is always real.
• The tensor product of two irreducible representations contains the singlet repre-
sentation if and only if they are complex conjugates of each other:

\[ R_1 \otimes R_2 = 1 \oplus \cdots \quad\longleftrightarrow\quad R_2 = \overline{R}_1 . \tag{8.37} \]

It follows that if R is real, then R ⊗ R contains a singlet.

• The tensor product of a representation and its complex conjugate always contains
both the singlet and adjoint representations:

\[ R \otimes \overline{R} = 1 \oplus \text{Adjoint} \oplus \cdots . \tag{8.38} \]

• As a corollary of the preceding rules, the tensor product of the adjoint represen-
tation with itself always contains both the singlet and the adjoint:

\[ \text{Adjoint} \otimes \text{Adjoint} = 1_S \oplus \text{Adjoint}_A \oplus \cdots . \tag{8.39} \]

Here the S and A mean that the indices of the two adjoints on the left are combined
symmetrically and antisymmetrically respectively.
• If the tensor product of two representations contains a third, then the tensor
product of the first representation with the conjugate of the third representation
contains the conjugate of the second representation:

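\[ R_1 \otimes R_2 = R_3 \oplus \cdots \quad\longleftrightarrow\quad R_1 \otimes \overline{R}_3 = \overline{R}_2 \oplus \cdots, \tag{8.40} \]
\[ R_1 \otimes R_2 \otimes R_3 = 1 \oplus \cdots \quad\longleftrightarrow\quad R_1 \otimes R_2 = \overline{R}_3 \oplus \cdots . \tag{8.41} \]

• If $R_1 \otimes R_2 = r_1 \oplus \cdots \oplus r_n$, then the indices satisfy the following rule:

\[ I(R_1)\,d_{R_2} + I(R_2)\,d_{R_1} = \sum_{i=1}^{n} I(r_i) . \tag{8.42} \]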
Let us recall how Lie algebra representations work in the example of SU (2), the
group of unitary 2 × 2 matrices (that’s the “U(2)” part of the name) with determinant
1 (that’s the “S”, for special, part of the name). This group is familiar from the
study of angular momentum in quantum mechanics, and the defining or fundamental

representation is the familiar spin-1/2 one with $\varphi_i$ with i = 1, 2 or up, down. The Lie
algebra generators in the fundamental representation are:

\[ T^a = \frac{\sigma^a}{2} \qquad (a = 1, 2, 3), \tag{8.43} \]

where the $\sigma^a$ are the three Pauli matrices (see, for example, (3.23)). One finds that
the structure constants are

\[ f^{abc} = \epsilon^{abc} = \begin{cases} +1 & \text{if } a,b,c = 1,2,3 \text{ or } 2,3,1 \text{ or } 3,1,2 \\ -1 & \text{if } a,b,c = 1,3,2 \text{ or } 3,2,1 \text{ or } 2,1,3 \\ 0 & \text{otherwise.} \end{cases} \tag{8.44} \]

Irreducible representations exist for any “spin” j = n/2, where n is a non-negative integer, and
have dimension 2j + 1. The representation matrices $J^a$ in the spin-j representation
satisfy the SU(2) Lie algebra:

\[ [J^a, J^b] = i\epsilon^{abc} J^c . \tag{8.45} \]

These representation matrices can be chosen to act on states $\varphi_m = |j, m\rangle$, according
to:

\[ J^3\, |j, m\rangle = m\, |j, m\rangle, \tag{8.46} \]
\[ J^a J^a\, |j, m\rangle = j(j+1)\, |j, m\rangle, \tag{8.47} \]

or, in matrix-vector notation:

\[ (J^3)_m{}^{m'} \varphi_{m'} = m\,\varphi_m, \tag{8.48} \]
\[ (J^a J^a)_m{}^{m'} \varphi_{m'} = j(j+1)\,\varphi_m . \tag{8.49} \]
We therefore recognize from (8.24) that the quadratic Casimir invariant of the spin-j
representation of SU(2) is $C(R_j) = j(j+1)$. It follows from (8.25) that the
index of the spin-j representation is $I(R_j) = j(j+1)(2j+1)/3$. The j = 1/2
representation is real, because

\[ X\left(-\frac{\sigma^{a*}}{2}\right)X^{-1} = \frac{\sigma^a}{2}, \tag{8.50} \]

where

\[ X = \begin{pmatrix} 0 & i \\ -i & 0 \end{pmatrix} . \tag{8.51} \]
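The equivalence (8.50) is easy to confirm by direct multiplication. The sketch below writes out the Pauli matrices and the matrix X of (8.51), uses the fact that this X squares to the identity (verified in the code), and also records that this particular X is antisymmetric. Helper names are hypothetical.

```python
SIGMA = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
X = [[0, 1j], [-1j, 0]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(a, b, tol=1e-12):
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))

# X squares to the identity, so X is its own inverse.
x_is_own_inverse = close(matmul(X, X), [[1, 0], [0, 1]])
X_inv = X

checks = []
for s in SIGMA:
    # Generator of the complex conjugate representation: -(sigma^a)^* / 2.
    conj_generator = [[-complex(x).conjugate() / 2 for x in row] for row in s]
    transformed = matmul(matmul(X, conj_generator), X_inv)
    target = [[x / 2 for x in row] for row in s]
    checks.append(close(transformed, target))

# This X is antisymmetric: X^T = -X.
x_is_antisymmetric = all(abs(X[i][j] + X[j][i]) < 1e-12
                         for i in range(2) for j in range(2))

print(x_is_own_inverse, checks, x_is_antisymmetric)
```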

More generally, one can show that all representations of SU (2) are real. Making a
table of the representations of SU (2):

               spin   dimension   I(R)                  C(R)       real?

singlet        0      1           0                     0          yes
fundamental    1/2    2           1/2                   3/4        yes
adjoint        1      3           2                     2          yes        (8.52)
               3/2    4           5                     15/4       yes
               ...    ...         ...                   ...        ...
               j      2j + 1      j(j + 1)(2j + 1)/3    j(j + 1)   yes

The tensor product of any two representations of SU (2) is reducible to a direct sum,
as:

\[ j_1 \otimes j_2 = |j_1 - j_2| \oplus (|j_1 - j_2| + 1) \oplus \cdots \oplus (j_1 + j_2) \tag{8.53} \]

in which j denotes the representation of spin j.
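The decomposition (8.53) can be cross-checked against the dimension rule (8.31) and the index rule (8.42), using the SU(2) formulas $d = 2j+1$ and $I = j(j+1)(2j+1)/3$ from the table above. The sketch below does this with exact rational arithmetic for all pairs of spins up to 3; the function names are hypothetical.

```python
from fractions import Fraction

def dim(j):
    """Dimension of the spin-j representation."""
    return 2 * j + 1

def index(j):
    """Index of the spin-j representation, j(j+1)(2j+1)/3."""
    return j * (j + 1) * (2 * j + 1) / 3

def decompose(j1, j2):
    """Spins appearing in j1 (x) j2: from |j1 - j2| up to j1 + j2 in unit steps."""
    spins = []
    j = abs(j1 - j2)
    while j <= j1 + j2:
        spins.append(j)
        j += 1
    return spins

half = Fraction(1, 2)
spins_to_test = [n * half for n in range(7)]   # j = 0, 1/2, ..., 3

all_ok = True
for j1 in spins_to_test:
    for j2 in spins_to_test:
        parts = decompose(j1, j2)
        dims_ok = dim(j1) * dim(j2) == sum(dim(j) for j in parts)
        index_ok = (index(j1) * dim(j2) + index(j2) * dim(j1)
                    == sum(index(j) for j in parts))
        all_ok = all_ok and dims_ok and index_ok

example = decompose(half, half)   # two doublets: spin 0 plus spin 1
print(all_ok, example)
```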

The group SU (2) has many applications in physics. First, it serves as the Lie alge-
bra of angular momentum operators. The strong force (but not the electromagnetic or
weak forces, or mass terms) is invariant under a different SU (2) isospin symmetry,
under which the up and down quarks transform as a j = 1/2 doublet. Isospin is a
global symmetry, meaning that the same symmetry transformation must be made
simultaneously everywhere:

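\[ \begin{pmatrix} u \\ d \end{pmatrix} \to \exp(i\theta^a \sigma^a/2) \begin{pmatrix} u \\ d \end{pmatrix}, \tag{8.54} \]

with a constant $\theta^a$ that does not depend on position in spacetime. The weak interactions
involve a still different SU(2), known as weak isospin or $SU(2)_L$. Weak
isospin is a gauge symmetry that acts only on left-handed fermion fields. The irreducible
j = 1/2 representations of $SU(2)_L$ are composed of the pairs of fermions
that couple to a $W^\pm$ boson, namely:

\[ \begin{pmatrix} \nu_{eL} \\ e_L \end{pmatrix}; \quad \begin{pmatrix} \nu_{\mu L} \\ \mu_L \end{pmatrix}; \quad \begin{pmatrix} \nu_{\tau L} \\ \tau_L \end{pmatrix}; \tag{8.55} \]
\[ \begin{pmatrix} u_L \\ d'_L \end{pmatrix}; \quad \begin{pmatrix} c_L \\ s'_L \end{pmatrix}; \quad \begin{pmatrix} t_L \\ b'_L \end{pmatrix}. \tag{8.56} \]

Here $e_L$ means $P_L e$, etc., and the primes mean that these are not quark mass eigenstates.
When one makes an $SU(2)_L$ gauge transformation, the transformation can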
be different at each point in spacetime. However, one must make the same transfor-
mation simultaneously on each of these representations. We will come back to study
the SU (2) L symmetry in more detail later, and see more precisely how it ties into
the weak interactions and QED.
One can generalize the SU (2) group to non-Abelian groups SU (N ) for any integer
N ≥ 2. The Lie algebra generators of SU (N ) in the fundamental representation are

Hermitian traceless N × N matrices. In general, a basis for the complex N × N
matrices is $2N^2$-dimensional, since each matrix has $N^2$ entries with a real and
imaginary part. The condition that the matrices are Hermitian removes half of these,
since each entry in the matrix is required to be the complex conjugate of another entry.
Finally, the single condition of tracelessness removes one from the basis. That leaves
$d_{SU(N)} = N^2 - 1$ as the dimension of the group and of the adjoint representation.
For all N ≥ 3, the fundamental representation is complex.
For example, Quantum Chromo-Dynamics (QCD), the theory of the strong inter-
actions, is a gauge theory based on the group SU (3), which has dimension dG = 8.
In the fundamental representation, the generators of the Lie algebra are given by:

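\[ T^a = \frac{1}{2}\lambda^a \qquad (a = 1, \ldots, 8), \tag{8.57} \]

where the $\lambda^a$ are known as the Gell-Mann matrices:

\[ \lambda^1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}; \quad
\lambda^2 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}; \quad
\lambda^3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix}; \]
\[ \lambda^4 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}; \quad
\lambda^5 = \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}; \quad
\lambda^6 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix}; \]
\[ \lambda^7 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}; \quad
\lambda^8 = \frac{1}{\sqrt{3}} \begin{pmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -2 \end{pmatrix}. \tag{8.58} \]

Note that each of these matrices is Hermitian and traceless, as required. They have
also been engineered to satisfy ${\rm Tr}(\lambda^a \lambda^b) = 2\delta^{ab}$, so that

\[ {\rm Tr}(T^a T^b) = \frac{1}{2}\delta^{ab}, \tag{8.59} \]

and therefore the index of the fundamental representation is

\[ I(F) = 1/2 . \tag{8.60} \]

One can also check that

\[ (T^a T^a)_i{}^j = \frac{4}{3}\delta_i^j, \tag{8.61} \]

so that the quadratic Casimir invariant of the fundamental representation is

\[ C(F) = 4/3 . \tag{8.62} \]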


By taking commutators of each pair of generators, one finds that the non-zero struc-
ture constants of SU (3) are:

\[ f^{123} = 1; \tag{8.63} \]
\[ f^{147} = -f^{156} = f^{246} = f^{257} = f^{345} = -f^{367} = 1/2; \tag{8.64} \]
\[ f^{458} = f^{678} = \sqrt{3}/2 . \tag{8.65} \]

and those related to the above by permutations of indices, following from the condition
that the $f^{abc}$ are totally antisymmetric. From these one can find the adjoint
representation matrices using (8.32), with, for example:

\[ T^1_{\rm adjoint} = -i \begin{pmatrix}
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1/2 & 0 \\
0 & 0 & 0 & 0 & 0 & -1/2 & 0 & 0 \\
0 & 0 & 0 & 0 & 1/2 & 0 & 0 & 0 \\
0 & 0 & 0 & -1/2 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}, \tag{8.66} \]

etc. However, it is almost never necessary to actually use the explicit form of any
matrix representation larger than the fundamental. Instead, one relies on group-
theoretic identities. For example, calculations of Feynman diagrams often involve
the index or Casimir invariant of the fundamental representation, and the Casimir
invariant of the group. One can easily compute the latter by using (8.24) and (8.32):

\[ f^{abc} f^{abd} = C(G)\,\delta^{cd}, \tag{8.67} \]

with the result C(G) = 3.
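The SU(3) numbers quoted in this passage can be reproduced directly from the Gell-Mann matrices. The sketch below extracts the structure constants from (8.18) with I(F) = 1/2, that is $f^{abc} = -2i\,{\rm Tr}([T^a, T^b]T^c)$, and then evaluates the index, the fundamental Casimir, and C(G). Note that the code labels the generators 0 to 7 rather than 1 to 8; the helper names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(a):
    return sum(a[i][i] for i in range(len(a)))

r3 = 1 / math.sqrt(3)
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[r3, 0, 0], [0, r3, 0], [0, 0, -2 * r3]],
]
T = [[[x / 2 for x in row] for row in lam] for lam in GELL_MANN]

def structure_constant(a, b, c):
    """f^{abc} = -2i Tr([T^a, T^b] T^c); indices run over 0..7 here."""
    ab, ba = matmul(T[a], T[b]), matmul(T[b], T[a])
    comm = [[ab[i][j] - ba[i][j] for j in range(3)] for i in range(3)]
    return (-2j * trace(matmul(comm, T[c]))).real

f = [[[structure_constant(a, b, c) for c in range(8)] for b in range(8)]
     for a in range(8)]

# Index and quadratic Casimir of the fundamental representation.
index_F = complex(trace(matmul(T[0], T[0]))).real
squares = [matmul(t, t) for t in T]
casimir_F = complex(sum(sq[0][0] for sq in squares)).real

# Casimir of the group from f^{abc} f^{abd} = C(G) delta^{cd}, taking c = d.
casimir_G = sum(f[a][b][0] ** 2 for a in range(8) for b in range(8))

print(f[0][1][2], f[0][3][6], f[3][4][7], index_F, casimir_F, casimir_G)
```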

Following is a table of the smallest few irreducible representations of SU (3):

                    dimension      I(R)    C(R)    real?

singlet             1              0       0       yes
fundamental         3              1/2     4/3     no
anti-fundamental    $\bar{3}$      1/2     4/3     no
                    6              5/2     10/3    no         (8.68)
                    $\bar{6}$      5/2     10/3    no
adjoint             8              3       3       yes
                    10             15/2    6       no
                    $\overline{10}$  15/2  6       no
                    ...                                       (8.69)

It is usual to refer to each representation by its dimension in boldface. In general,

the representations can be classified by two non-negative integers α and β. The
dimension of the representation labeled by α, β is

\[ d_{\alpha,\beta} = (\alpha + 1)(\beta + 1)(\alpha + \beta + 2)/2 . \tag{8.70} \]
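Evaluating (8.70) for small labels reproduces the dimensions in the table above. In the sketch below, which of (1, 0) and (0, 1) is called the fundamental and which the anti-fundamental is a labeling convention not fixed by this passage, so the code only reports dimensions.

```python
def su3_dimension(alpha, beta):
    """Dimension of the SU(3) representation labeled by (alpha, beta), from (8.70).

    The product is always even, so integer division is exact.
    """
    return (alpha + 1) * (beta + 1) * (alpha + beta + 2) // 2

labels = [(0, 0), (1, 0), (0, 1), (2, 0), (0, 2), (1, 1), (3, 0), (0, 3), (2, 2)]
dims = {label: su3_dimension(*label) for label in labels}

for label, d in dims.items():
    print(label, d)
```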


Some tensor products involving these representations are:

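\[ 3 \otimes \bar{3} = 1 \oplus 8 \tag{8.71} \]
\[ 3 \otimes 3 = \bar{3}_A \oplus 6_S \tag{8.72} \]
\[ \bar{3} \otimes \bar{3} = 3_A \oplus \bar{6}_S \tag{8.73} \]
\[ 3 \otimes 3 \otimes 3 = 1_A \oplus 8_M \oplus 8_M \oplus 10_S \tag{8.74} \]
\[ 8 \otimes 8 = 1_S \oplus 8_A \oplus 8_S \oplus 10_A \oplus \overline{10}_A \oplus 27_S . \tag{8.75} \]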

Here, when we take the tensor product of two or more identical representations, the
irreducible representations on the right side are labeled as A, S, or M depending on
whether they involve an antisymmetric, symmetric, or mixed symmetry combination
of the indices of the original representations on the left side.
In SU(N), one can build any representation out of objects that carry only indices
transforming under the fundamental $N$ and anti-fundamental $\overline{N}$ representations. It
is useful to employ lowered indices for the fundamental, and raised indices for the
antifundamental. Then an object carrying n lowered and m raised indices:

\[ \varphi_{i_1 \ldots i_n}^{j_1 \ldots j_m} \tag{8.76} \]

transforms under the tensor product representation

\[ \underbrace{N \otimes \cdots \otimes N}_{n\ \text{times}} \otimes \underbrace{\overline{N} \otimes \cdots \otimes \overline{N}}_{m\ \text{times}} . \tag{8.77} \]

This is always reducible. To reduce it, one can decompose $\varphi$ into parts that have
different symmetry and trace properties. So, for example, we can take an object that
transforms under SU(N) as $N \times \overline{N}$, and write it as:

\[ \varphi_i^j = \left(\varphi_i^j - \frac{1}{N}\delta_i^j \varphi_k^k\right) + \frac{1}{N}\delta_i^j \varphi_k^k . \tag{8.78} \]

The first term in parentheses transforms as an adjoint representation, and the second
as a singlet, under SU(N). For SU(3), this corresponds to the rule of (8.71).
Similarly, an object that transforms under SU(N) as $N \times N$ can be decomposed
as

\[ \varphi_{ij} = \frac{1}{2}(\varphi_{ij} + \varphi_{ji}) + \frac{1}{2}(\varphi_{ij} - \varphi_{ji}) . \tag{8.79} \]

The two terms on the right-hand side correspond to an N(N + 1)/2-dimensional
symmetric tensor, and an N(N − 1)/2-dimensional antisymmetric tensor, irreducible
representations. For N = 3, these are the 6 and $\bar{3}$ representations, respectively,
and this decomposition corresponds to (8.72). By using this process of taking
symmetric and anti-symmetric parts and removing traces, one can find all necessary
tensor-product representation rules for any SU(N) group.
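The two decompositions (8.78) and (8.79) can be sketched for N = 3 with an arbitrary array of sample components. The same sample numbers are reused for both index structures purely for convenience, and the variable names are hypothetical; the point is the bookkeeping of traces, symmetry, and component counts.

```python
N = 3

# Arbitrary sample components; row = first index, column = second index.
phi = [[complex(i + 1, j - 2 * i) for j in range(N)] for i in range(N)]

# (8.78): split phi_i^j into a traceless part and a trace part.
tr = sum(phi[k][k] for k in range(N))
singlet = [[tr / N if i == j else 0 for j in range(N)] for i in range(N)]
adjoint = [[phi[i][j] - singlet[i][j] for j in range(N)] for i in range(N)]
adjoint_trace = sum(adjoint[k][k] for k in range(N))

# (8.79): split phi_{ij} into symmetric and antisymmetric parts.
sym = [[(phi[i][j] + phi[j][i]) / 2 for j in range(N)] for i in range(N)]
antisym = [[(phi[i][j] - phi[j][i]) / 2 for j in range(N)] for i in range(N)]

# Both splits must add back up to the original object.
rebuilt_trace_split = all(
    abs(adjoint[i][j] + singlet[i][j] - phi[i][j]) < 1e-12
    for i in range(N) for j in range(N)
)
rebuilt_symmetry_split = all(
    abs(sym[i][j] + antisym[i][j] - phi[i][j]) < 1e-12
    for i in range(N) for j in range(N)
)

# Independent component counts for each piece.
n_adjoint = N * N - 1
n_sym = N * (N + 1) // 2
n_antisym = N * (N - 1) // 2

print(abs(adjoint_trace), n_adjoint, n_sym, n_antisym)
```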

8.2 The Yang-Mills Lagrangian and Feynman Rules

In this section, we will construct the Lagrangian and Feynman rules for a theory of
Dirac fermions and gauge bosons transforming under a non-Abelian gauge group,
called a Yang-Mills theory.
Let the Dirac fermion fields be given by $\Psi_i$, where i is an index in some representation
of the gauge group with generators $T^{aj}_i$. Here $i = 1, \ldots, d_R$ and $a = 1, \ldots, d_G$.
Under a gauge transformation, we have:

\[ \Psi_i \to U_i{}^j \Psi_j \tag{8.80} \]

where

\[ U = \exp(i\theta^a T^a) . \tag{8.81} \]

Specializing to the case of an infinitesimal gauge transformation $\theta^a = \epsilon^a$, we have

\[ \Psi_i \to (1 + i\epsilon^a T^a)_i{}^j \Psi_j . \tag{8.82} \]

Our goal is to build a Lagrangian that is invariant under this transformation.

First let us consider how barred spinors transform. Taking the Hermitian conjugate
of (8.82), we find

†i → † j (1 − i a T a ) j i , (8.83)

where we have used the fact that T a are Hermitian matrices. (Notice that taking the
Hermitian conjugate changes the heights of the representation indices, and in the case
of matrices, reverses their order. So the Dirac spinor carries a lowered representation
index, while the Hermitian conjugate spinor carries a raised index.) Now we can
multiply on the right by γ 0 . The Dirac gamma matrices are completely separate
from the gauge group representation indices, so we get the transformation rule for
the barred Dirac spinors:

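\[ \overline{\Psi}^i \to \overline{\Psi}^j (1 - i\epsilon^a T^a)_j{}^i. \tag{8.84} \]

We can rewrite this in a slightly different way by noting that

\[ T^{a\,i}_{j} = (T^{a\,i}_{j})^{\dagger} = (T^{a\,j}_{i})^{*}, \tag{8.85} \]

so that

\[ \overline{\Psi}^i \to (1 + i\epsilon^a[-T^{a*}])^i{}_j\, \overline{\Psi}^j. \tag{8.86} \]

Comparing with (8.27), this establishes that $\overline{\Psi}^i$ transforms in the complex
conjugate of the representation carried by $\Psi_i$.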

Since $\overline{\Psi}^i$ and $\Psi_i$ transform as complex conjugate representations of each other,
their tensor product must be a direct sum of representations that includes a singlet.
The singlet is obtained by summing over the index i:

\[ \overline{\Psi}^i \Psi_i. \tag{8.87} \]

As a check of this, under an infinitesimal gauge transformation, this term becomes:

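\begin{align*}
\overline{\Psi}^i \Psi_i &\to \overline{\Psi}^j (1 - i\epsilon^a T^a)_j{}^i (1 + i\epsilon^b T^b)_i{}^k \Psi_k \tag{8.88} \\
&= \overline{\Psi}^i \Psi_i + O(\epsilon^2), \tag{8.89}
\end{align*}

where the terms linear in $\epsilon^a$ have indeed canceled. Therefore, we can include a
fermion mass term in the Lagrangian:

\[ \mathcal{L}_m = -m \overline{\Psi}^i \Psi_i. \tag{8.90} \]

This shows that each component of the field $\Psi_i$ must have the same mass.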
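The cancellation of the terms linear in the gauge parameter can be seen numerically. The sketch below is an illustration under stated assumptions rather than anything from the original text: it uses SU(2) with generators equal to the Pauli matrices divided by 2 as a small stand-in gauge group, ignores the Dirac spinor structure (which is untouched by the gauge transformation), and picks arbitrary sample values for the field components and the parameter direction.

```python
# SU(2) generators in the fundamental representation: Pauli matrices / 2.
# (Hypothetical stand-in group chosen only to keep the example small.)
T = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

def transform(psi, eps):
    """Apply psi_i -> (1 + i eps^a T^a)_i^j psi_j, as in (8.82)."""
    n = len(psi)
    out = []
    for i in range(n):
        value = psi[i]
        for a, eps_a in enumerate(eps):
            value += 1j * eps_a * sum(T[a][i][j] * psi[j] for j in range(n))
        out.append(value)
    return out

def singlet(psi):
    """Sum over i of conj(psi_i) * psi_i: the index structure of the mass term."""
    return sum((z.conjugate() * z).real for z in psi)

psi = [1.0 + 2.0j, -0.5 + 0.3j]   # arbitrary sample field components
direction = [0.3, -0.7, 0.5]      # arbitrary direction for eps^a

def change(scale):
    """Change of the singlet when eps^a = scale * direction^a."""
    eps = [scale * d for d in direction]
    return singlet(transform(psi, eps)) - singlet(psi)

# If the change is purely second order, a 10x larger eps gives a 100x larger change.
ratio = change(1e-2) / change(1e-3)
```

The ratio comes out very close to 100, which is what one expects if the change in the singlet is quadratic in the gauge parameter with no linear piece.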
Next we would like to include a derivative kinetic term for the fermions. Just as
in QED, the term

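\[ i\,\overline{\Psi}^i \gamma^\mu \partial_\mu \Psi_i \tag{8.91} \]

is not acceptable by itself, because $\partial_\mu \Psi_i$ does not gauge transform in the same way
that $\Psi_i$ does. The problem is that the derivative can act on the gauge-parameter
function $\epsilon^a$, giving an extra term:

\[ \partial_\mu \Psi_i \to (1 + i\epsilon^a T^a)_i{}^j\, \partial_\mu \Psi_j + i(\partial_\mu \epsilon^a)\, T^{a\,j}_{i} \Psi_j. \tag{8.92} \]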

By analogy with QED, we can fix this by writing a covariant derivative involving
vector fields, which will also transform in such a way as to cancel the last term in
(8.92):

\[ D_\mu \Psi_i = \partial_\mu \Psi_i + ig A^a_\mu T^{a\,j}_{i} \Psi_j. \tag{8.93} \]

The vector boson fields $A^a_\mu$ are known as gauge fields. They carry an adjoint
representation index a in addition to their spacetime vector index μ. The number of
such fields is equal to the number of generator matrices $T^a$, which we recall is the
dimension of the gauge group $d_G$. The quantity g is a coupling, known as a gauge
coupling. It is dimensionless, and is the direct analog of the coupling e in QED. The
entries of the matrix $T^{a\,j}_{i}$ take the role played by the charges q in QED. Notice that
the definition of the covariant derivative depends on the representation matrices for
the fermions, so there is really a different covariant derivative depending on which
fermion representation one is acting on.

Now one can check that the Lagrangian

\[ \mathcal{L}_{\text{fermions}} = i\,\overline{\Psi}^i \gamma^\mu D_\mu \Psi_i - m \overline{\Psi}^i \Psi_i \tag{8.94} \]

is invariant under infinitesimal gauge transformations, provided that the gauge field
is taken to transform as:
\[ A^a_\mu \to A^a_\mu - \frac{1}{g}\partial_\mu \epsilon^a - f^{abc}\epsilon^b A^c_\mu. \tag{8.95} \]

The term with a derivative acting on $\epsilon^a$ is the direct analog of a corresponding term
in QED, see (8.4). The last term vanishes for Abelian gauge groups like that of QED, but it is
necessary to ensure that the covariant derivative of a Dirac field transforms in the
same way as the field itself:

\[ D_\mu \Psi_i \to (1 + i\epsilon^a T^a)_i{}^j D_\mu \Psi_j. \tag{8.96} \]

If we limit ourselves to constant gauge parameters, $\partial_\mu \epsilon^a = 0$, then the transformation
in (8.95) is of the correct form for a field in the adjoint representation.
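The transformation rule (8.96) can be checked directly at first order in $\epsilon^a$. The following sketch uses matrix notation with representation indices suppressed, and assumes the Lie algebra convention $[T^a, T^b] = i f^{abc} T^c$ with totally antisymmetric $f^{abc}$ (an assumption here, intended to match the convention used earlier in the chapter). Substituting (8.82) and (8.95) into (8.93) gives

\begin{align*}
D_\mu \Psi \to{}& (1 + i\epsilon^a T^a)\,\partial_\mu \Psi + i(\partial_\mu \epsilon^a) T^a \Psi
 + ig\Big( A^a_\mu - \tfrac{1}{g}\partial_\mu \epsilon^a - f^{abc}\epsilon^b A^c_\mu \Big) T^a (1 + i\epsilon^d T^d)\Psi \\
={}& (1 + i\epsilon^d T^d)\big( \partial_\mu \Psi + ig A^a_\mu T^a \Psi \big)
 - g\,\epsilon^d A^a_\mu [T^a, T^d]\Psi - ig f^{abc}\epsilon^b A^c_\mu T^a \Psi + O(\epsilon^2).
\end{align*}

The two terms containing $\partial_\mu \epsilon^a$ have canceled against each other. Using $[T^a, T^d] = i f^{ade} T^e$ and relabeling the summed indices, the commutator term becomes $+ig f^{abc}\epsilon^b A^c_\mu T^a \Psi$, which cancels the last term and leaves $D_\mu \Psi \to (1 + i\epsilon^a T^a) D_\mu \Psi$.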
Having introduced a gauge field for each Lie algebra generator, we must now
include kinetic terms for them. By analogy with QED, the gauge field Lagrangian
has the form:
\[ \mathcal{L}_{\text{gauge}} = -\frac{1}{4} F^{\mu\nu a} F^a_{\mu\nu}, \tag{8.97} \]

with an implied sum on a, where $F^a_{\mu\nu}$ is an antisymmetric field strength tensor for
each Lie algebra generator. However, we must also require that this Lagrangian is
invariant under gauge transformations. This is accomplished if we choose:

\[ F^a_{\mu\nu} = \partial_\mu A^a_\nu - \partial_\nu A^a_\mu - g f^{abc} A^b_\mu A^c_\nu. \tag{8.98} \]

Then one can check, using (8.95), that

\[ F^a_{\mu\nu} \to F^a_{\mu\nu} - f^{abc}\epsilon^b F^c_{\mu\nu}. \tag{8.99} \]

From this it follows that Lgauge transforms as:

\[ -\frac{1}{4} F^{\mu\nu a} F^a_{\mu\nu} \to -\frac{1}{4} F^{\mu\nu a} F^a_{\mu\nu} + \frac{1}{2} F^{\mu\nu a} f^{abc} \epsilon^b F^c_{\mu\nu} + O(\epsilon^2). \tag{8.100} \]

The extra term linear in $\epsilon$ vanishes, because the part $F^{\mu\nu a} F^c_{\mu\nu}$ is symmetric under
interchange of a ↔ c, but its gauge indices are contracted with $f^{abc}$, which is
antisymmetric under the same interchange. Therefore, (8.97) is a gauge singlet.

Putting the above results together, we have found a gauge-invariant Lagrangian
for Dirac fermions coupled to a non-Abelian gauge field:

\[ \mathcal{L}_{\text{Yang-Mills}} = \mathcal{L}_{\text{gauge}} + \mathcal{L}_{\text{fermions}}. \tag{8.101} \]

Now we can find the Feynman rules for this theory in the usual way. First we
identify the kinetic terms that are quadratic in the fields. That part of LYang-Mills
is:
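\[ \mathcal{L}_{\text{kinetic}} = -\frac{1}{4}(\partial_\mu A^a_\nu - \partial_\nu A^a_\mu)(\partial^\mu A^{a\nu} - \partial^\nu A^{a\mu}) + i\,\overline{\Psi}^i\, {\not\partial}\, \Psi_i - m \overline{\Psi}^i \Psi_i. \tag{8.102} \]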

These terms have exactly the same form as in the QED Lagrangian, but with a sum
over dG copies of the vector fields, labeled by a, and over d R copies of the fermion
field, labeled by i. Therefore we can obtain the Feynman rules for vector and fermion
propagators in the same way. For the gauge fields, we need to include a gauge-fixing
term, just as in QED (compare (5.32) and the surrounding discussion), in order to
have a well-defined propagator:

\[ \mathcal{L}_{\text{gauge-fixing}} = -\frac{1}{2\xi}(\partial^\mu A^a_\mu)^2 \tag{8.103} \]

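with the index a summed over. This leads to the gauge boson propagator, drawn as a
wavy line carrying indices μ, a at one end and ν, b at the other:

\[ \frac{i}{p^2 + i\epsilon}\left[ -g_{\mu\nu} + (1 - \xi)\frac{p_\mu p_\nu}{p^2} \right]\delta^{ab} \]

where $p^\mu$ is the 4-momentum along either direction in the wavy line, and one can
take ξ = 1 for Feynman gauge and ξ = 0 for Landau gauge. The $\delta^{ab}$ in the Feynman
rule just means that a gauge field does not change to a different type as it propagates.
Likewise, the Dirac fermion propagator is:

\[ \frac{i({\not p} + m)}{p^2 - m^2 + i\epsilon}\,\delta_i^{\,j} \]

with 4-momentum $p^\mu$ along the arrow direction, and m the mass of the Dirac fermion.
Again the factor of $\delta_i^{\,j}$ means the fermion does not change its identity as it propagates.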
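The interaction Feynman rules follow from the remaining terms in $\mathcal{L}_{\text{Yang-Mills}}$.
First, there is a fermion-fermion-vector interaction coming from the covariant
derivative in $\mathcal{L}_{\text{fermions}}$. Identifying the Feynman rule as i times the term in the
Lagrangian (recall the discussion surrounding (4.250)), from

\[ \mathcal{L}_{A\overline{\Psi}\Psi} = -g A^a_\mu \overline{\Psi} \gamma^\mu T^a \Psi \tag{8.104} \]

we get:

\[ -ig\, T^{a\,j}_{i}\, \gamma^\mu \]

for a gauge boson with indices μ, a attached to a fermion line whose incoming end
carries index j and whose outgoing end carries index i.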

This rule says that the coupling of a gauge field to a fermion line is proportional to
the corresponding Lie algebra generator matrix. Since the matrices T a are not diag-
onal for non-Abelian groups, this interaction can change one fermion into another.
The Lagrangian density Lgauge contains three-gauge-field and four-gauge-field cou-
plings, proportional to g and g 2 respectively. After combining some terms using the
antisymmetry of the f abc symbol, they can be written as

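\[ \mathcal{L}_{AAA} = g f^{abc} (\partial_\mu A^a_\nu) A^{\mu b} A^{\nu c}, \tag{8.105} \]

\[ \mathcal{L}_{AAAA} = -\frac{g^2}{4} f^{abe} f^{cde} A^a_\mu A^b_\nu A^{\mu c} A^{\nu d}. \tag{8.106} \]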
The three-gauge-field couplings involve a spacetime derivative, which we can treat
according to the rule of (7.124), as i times the momentum of the field it acts on (in
the direction going into the vertex). Then the Feynman rule for the interaction of
three gauge bosons with (spacetime vector, gauge) indices μ, a and ν, b and ρ, c is
obtained by taking functional derivatives of the Lagrangian density with respect to
the corresponding fields:

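\[ i\,\frac{\delta^3}{\delta A^a_\mu\, \delta A^b_\nu\, \delta A^c_\rho}\, \mathcal{L}_{AAA}, \tag{8.107} \]

resulting in:

\[ -g f^{abc}\left[ g_{\mu\nu}(p - q)_\rho + g_{\nu\rho}(q - k)_\mu + g_{\rho\mu}(k - p)_\nu \right] \]

where $p^\mu$, $q^\mu$, and $k^\mu$ are the gauge boson 4-momenta flowing into the vertex.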
Likewise, the Feynman rule for the coupling of four gauge bosons with indices μ, a
and ν, b and ρ, c and σ, d is:

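\[ i\,\frac{\delta^4}{\delta A^a_\mu\, \delta A^b_\nu\, \delta A^c_\rho\, \delta A^d_\sigma}\, \mathcal{L}_{AAAA}, \tag{8.108} \]

leading to:

\[ -ig^2\left[ f^{abe}f^{cde}(g_{\mu\rho}g_{\nu\sigma} - g_{\mu\sigma}g_{\nu\rho}) + f^{ace}f^{bde}(g_{\mu\nu}g_{\rho\sigma} - g_{\mu\sigma}g_{\nu\rho}) + f^{ade}f^{bce}(g_{\mu\nu}g_{\rho\sigma} - g_{\mu\rho}g_{\nu\sigma}) \right] \]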

There are more terms in these Feynman rules than in the corresponding Lagrangian,
since the functional derivatives have a choice of several fields on which to act. Notice
that these Feynman rules are invariant under the simultaneous interchange of all the indices and
momenta for any two vector bosons, for example (μ, a, p) ↔ (ν, b, q). The above
Feynman rules are all that is needed to calculate tree-level Feynman diagrams in a
Yang-Mills theory with Dirac fermions. External state fermions and gauge bosons
are assigned exactly the same rules as for fermions and photons in QED. The external
state particles carry a representation or gauge index determined by the interaction
vertex to which that line is attached.
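The interchange symmetry of the three-vector-boson rule can be checked by brute force. The sketch below is illustrative only and rests on several assumptions: it takes the vertex factor in the standard form $-g f^{abc}[g_{\mu\nu}(p-q)_\rho + g_{\nu\rho}(q-k)_\mu + g_{\rho\mu}(k-p)_\nu]$ with all momenta flowing in, uses SU(2) (structure constants equal to the antisymmetric symbol) as a small stand-in group, and uses arbitrary sample momenta. All function names are hypothetical.

```python
import itertools

# Metric g_{mu nu} = diag(+1, -1, -1, -1).
METRIC = [[(1 if m == 0 else -1) if m == n else 0 for n in range(4)] for m in range(4)]

def f_su2(a, b, c):
    """SU(2) structure constants: totally antisymmetric, with f(0, 1, 2) = 1."""
    return (a - b) * (b - c) * (c - a) // 2

def lower(p):
    """Lower the index of a 4-vector: p_mu = g_{mu nu} p^nu."""
    return [METRIC[m][m] * p[m] for m in range(4)]

def three_vector_vertex(g, legs):
    """Vertex factor for legs [(mu, a, p), (nu, b, q), (rho, c, k)], momenta incoming."""
    (mu, a, p), (nu, b, q), (rho, c, k) = legs
    p, q, k = lower(p), lower(q), lower(k)
    bracket = (METRIC[mu][nu] * (p[rho] - q[rho])
               + METRIC[nu][rho] * (q[mu] - k[mu])
               + METRIC[rho][mu] * (k[nu] - p[nu]))
    return -g * f_su2(a, b, c) * bracket

# Arbitrary sample momenta (upper index) satisfying p + q + k = 0.
p = [1.0, 0.2, -0.3, 0.5]
q = [2.0, -1.0, 0.4, 0.1]
k = [-(x + y) for x, y in zip(p, q)]

def is_bose_symmetric(g=1.0):
    """True if relabeling the three legs in any order leaves the vertex unchanged."""
    for mu, nu, rho in itertools.product(range(4), repeat=3):
        for a, b, c in itertools.permutations(range(3)):
            legs = [(mu, a, p), (nu, b, q), (rho, c, k)]
            reference = three_vector_vertex(g, legs)
            for perm in itertools.permutations(legs):
                if abs(three_vector_vertex(g, list(perm)) - reference) > 1e-12:
                    return False
    return True
```

The sign picked up by the antisymmetric structure constant under a swap of two legs is compensated by the sign change of the momentum-dependent bracket, which is why the check passes for every choice of indices.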
(However, this is not quite the end of the story if one needs to compute loop
diagrams. In that case, one must take into account that not all of the gauge fields
that can propagate in loops are actually physical. One way to fix this problem is by
introducing “ghost” fields that only appear in loops, never in initial or final states.
The ghost fields do not create and destroy real particles; they are really just book-
keeping devices that exist only to cancel the unphysical contributions of gauge fields
in loops. We will not do any loop calculations in this book, so we will not go into
more detail on that issue.)
The Yang-Mills theory we have constructed makes several interesting predictions.
One is that the gauge fields are necessarily massless. If one tries to get around this
by introducing a mass term for the vector gauge fields, like:

\[ \mathcal{L}_{A\text{-mass}} = m_V^2 A^a_\mu A^{a\mu}, \tag{8.109} \]

then one finds that this term is not invariant under the gauge transformation of (8.95).
Therefore, if we put in such a term, we necessarily violate the gauge invariance of
the Lagrangian, and the gauge symmetry will not be a symmetry of the theory. This
sounds like a serious problem, because there is only one known freely-propagating,
non-composite, massless vector field, the photon. In particular, the massive W ±
boson cannot be described by the Yang-Mills theory that we have so far. One way
to proceed would be to simply keep the term in (8.109), and accept that the theory
is not fully invariant under the gauge symmetry. The only problem with this is that
the theory would be non-renormalizable in that case; as a related problem, unitarity
would be violated in scattering at very high energies. Instead, we can explain the
non-zero mass of the W ± boson by enlarging the theory to include scalar fields,
leading to a spontaneous breakdown in the gauge symmetry.
Another nice feature of the Yang-Mills theory is that several different couplings
are predicted to be related to each other. Once we have picked a gauge group G, a
set of irreducible representations for the fermions, and the gauge coupling g, then
the interaction terms are all fixed. In particular, if we know the coupling of one type
of fermion to the gauge fields, then we know g. This in turn allows us to predict,
as a consequence of the gauge invariance, what the couplings of other fermions to
the gauge fields should be (as long as we know their representations), and what the
three-gauge-boson and four-gauge-boson vertices should be.

8.3 QCD Lagrangian and Feynman Rules

The strong interactions are based on a Yang-Mills theory with gauge group SU (3)c ,
with quarks transforming in the fundamental 3 representation. The subscript c is
to distinguish this as the group of invariances under transformations of the color
degrees of freedom. As far as we can tell, this is an exact symmetry of nature. (There
is also an approximate SU (3)flavor symmetry under which the quark flavors u, d, s
transform into each other; isospin is an SU (2) subgroup of this symmetry.) Each of
the quark Dirac fields u, d, s, c, b, t transforms separately as a $\mathbf{3}$ of SU(3)c, and each
barred Dirac field $\overline{u}, \overline{d}, \overline{s}, \overline{c}, \overline{b}, \overline{t}$ therefore transforms as a $\overline{\mathbf{3}}$, as we saw on general
grounds in Sect. 8.2.
For example, an up quark is created in an initial state by any one of the three color
component fields:
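\[ u = \begin{pmatrix} u_{\text{red}} \\ u_{\text{blue}} \\ u_{\text{green}} \end{pmatrix} = \begin{pmatrix} u_1 \\ u_2 \\ u_3 \end{pmatrix}, \tag{8.110} \]

while an anti-up quark is created in an initial state by any of the fields:

\[ \overline{u} = \begin{pmatrix} \overline{u}_{\text{red}} & \overline{u}_{\text{blue}} & \overline{u}_{\text{green}} \end{pmatrix} = \begin{pmatrix} \overline{u}^1 & \overline{u}^2 & \overline{u}^3 \end{pmatrix}. \tag{8.111} \]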

Since SU (3)c is an exact symmetry, no experiment can tell the difference between
a red quark and a blue quark, so the labels are intrinsically arbitrary. In fact, we can
do a different SU (3)c transformation at each point in spacetime, but simultaneously
on each quark flavor, so that:
\[ u \to e^{i\theta^a(x)T^a} u, \quad d \to e^{i\theta^a(x)T^a} d, \quad s \to e^{i\theta^a(x)T^a} s, \quad \text{etc.}, \tag{8.112} \]

where $T^a = \lambda^a/2$ with a = 1, ..., 8, and the $\theta^a(x)$ are any gauge parameter functions
of our choosing. This symmetry is in addition to the U(1)EM gauge transformations:

\[ u \to e^{i\theta(x)Q_u} u, \quad d \to e^{i\theta(x)Q_d} d, \quad s \to e^{i\theta(x)Q_s} s, \quad \text{etc.}, \tag{8.113} \]

where $Q_u = Q_c = Q_t = 2/3$ and $Q_d = Q_s = Q_b = -1/3$. For each of the 8
generator matrices $T^a$ of SU(3)c, there is a corresponding gauge vector boson called
a gluon, represented by a field $G^a_\mu$ carrying both a spacetime vector index and an
SU(3)c adjoint representation index.
One says that the unbroken gauge group of the Standard Model is SU(3)c ×
U(1)EM, with the fermions and gauge bosons transforming as:

                  SU(3)c × U(1)EM    spin
    u, c, t       (3, +2/3)          1/2
    d, s, b       (3, −1/3)          1/2
    e, μ, τ       (1, −1)            1/2
    νe, νμ, ντ    (1, 0)             1/2
    γ             (1, 0)             1
    gluon         (8, 0)             1

The gluon appears together with the photon field Aμ in the full covariant derivative
for quark fields. Using an index i = 1, 2, 3 to run over the color degrees of freedom,
the covariant derivatives are:

\[ D_\mu u_i = \partial_\mu u_i + ig_3 G^a_\mu T^{a\,j}_{i} u_j + ieA_\mu \left(\frac{2}{3}\right) u_i, \tag{8.114} \]

\[ D_\mu d_i = \partial_\mu d_i + ig_3 G^a_\mu T^{a\,j}_{i} d_j + ieA_\mu \left(-\frac{1}{3}\right) d_i. \tag{8.115} \]

Here $g_3$ is the coupling constant associated with the SU(3)c gauge interactions. The
strength of the strong interactions comes from the fact that $g_3 \gg e$.
Using the general results for a gauge theory in Sect. 8.2, we know that the propa-
gator for the gluon is that of a massless vector field just like the photon:

Note that it is traditional, in QCD, to use “springy” lines for gluons, to easily distin-
guish them from wavy photon lines. There are also quark-gluon interaction vertices
for each flavor of quark:

Here the quark line can be any of u, d, s, c, b, t. The gluon interaction changes the
color of the quarks when T a is non-diagonal, but never changes the flavor of the
quark line, so an up quark remains an up quark, a down quark remains a down quark,
etc.
The Lagrangian density also contains a “pure glue” part:

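\[ \mathcal{L}_{\text{glue}} = -\frac{1}{4} F^{\mu\nu a} F^a_{\mu\nu}, \tag{8.116} \]

\[ F^a_{\mu\nu} = \partial_\mu G^a_\nu - \partial_\nu G^a_\mu - g_3 f^{abc} G^b_\mu G^c_\nu, \tag{8.117} \]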

where f abc are the structure constants for SU (3)c given in (8.63)–(8.65). In addition
to the propagator, this implies that there are three-gluon and four-gluon interactions:

The spacetime- and gauge-index structure are just as given in Sect. 8.2 in the general
case, with g → g3 .

8.4 Scattering of Quarks and Gluons

8.4.1 Quark-Quark Scattering (qq → qq)

To see how the Feynman rules for QCD work in practice, let us consider the exam-
ple of quark-quark scattering. This is not a directly observable process, because
the quarks in both the initial state and final state are parts of bound states. How-
ever, it does form the microscopic part of a calculation for the observable process
hadron+hadron→jet+jet. We will see how to use the microscopic cross-section result
to obtain the observable cross-section later, in Sect. 8.6. To be specific, let us consider
the process of an up-quark and down-quark scattering from each other:

ud → ud. (8.118)

Let us assign momenta, spin, and color to the quarks as follows:

    Particle     Momentum   Spin   Spinor              Color
    initial u    p          s1     u(p, s1) = u1       i
    initial d    p′         s2     u(p′, s2) = u2      j          (8.119)
    final u      k          s3     ū(k, s3) = ū3       l
    final d      k′         s4     ū(k′, s4) = ū4      m
At leading order in an expansion in g3 , there is only one Feynman diagram:

The reduced matrix element can now be written down by the same procedure as in
QED. One obtains, using Feynman gauge (ξ = 1):
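\begin{align*}
\mathcal{M} &= \overline{u}_3 (-ig_3 \gamma_\mu T^{a\,i}_{l}) u_1 \left[ \frac{-ig^{\mu\nu}\delta^{ab}}{(p - k)^2} \right] \overline{u}_4 (-ig_3 \gamma_\nu T^{b\,j}_{m}) u_2 \tag{8.120} \\
&= ig_3^2\, T^{a\,i}_{l} T^{a\,j}_{m}\, [\overline{u}_3 \gamma_\mu u_1][\overline{u}_4 \gamma^\mu u_2] / t \tag{8.121}
\end{align*}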

where t = ( p − k)2 . This matrix element is exactly what one finds in QED for
e− μ− → e− μ− , but with the QED squared coupling replaced by a product of matri-
ces depending on the color combination:

\[ e^2 \to g_3^2\, T^{a\,i}_{l} T^{a\,j}_{m}. \tag{8.122} \]

This illustrates that the “color charge matrix” $g_3 T^{a\,i}_{l}$ is analogous to the electric
charge $eQ_f$. There are $3^4 = 81$ color combinations for quark-quark scattering.
In order to find the differential cross-section, we continue as usual by taking the
complex square of the reduced matrix element:

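\[ |\mathcal{M}|^2 = g_3^4\, (T^{a\,i}_{l} T^{a\,j}_{m})(T^{b\,i}_{l} T^{b\,j}_{m})^* \, |\widetilde{\mathcal{M}}|^2, \tag{8.123} \]

for each i, j, l, m (with no implied sum yet), and

\[ \widetilde{\mathcal{M}} = [\overline{u}_3 \gamma_\mu u_1][\overline{u}_4 \gamma^\mu u_2] / t. \tag{8.124} \]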

It is not possible, even in principle, to distinguish between colors. However, one
can always imagine fixing, by an arbitrary choice, that the incoming u-quark has
color red = 1; then the colors of the other quarks can be distinguished up to SU(3)c
rotations that leave the red component fixed. In practice, one does not measure the
colors of quarks in an experiment, even with respect to some arbitrary choice, so we
will sum over the colors of the final state quarks and average over the colors of the
initial state quarks:

\[ \frac{1}{3}\sum_i \frac{1}{3}\sum_j \sum_l \sum_m |\mathcal{M}|^2. \tag{8.125} \]

To do the color sum/average most easily, we note that, because the gauge group
generator matrices are Hermitian,

\[ (T^{b\,i}_{l} T^{b\,j}_{m})^* = (T^{b\,l}_{i} T^{b\,m}_{j}). \tag{8.126} \]

Therefore, the color factor is

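\begin{align*}
\frac{1}{3}\sum_i \frac{1}{3}\sum_j \sum_l \sum_m (T^{a\,i}_{l} T^{a\,j}_{m})(T^{b\,i}_{l} T^{b\,j}_{m})^*
&= \frac{1}{9}\sum_{i,j,l,m} (T^{a\,i}_{l} T^{b\,l}_{i})(T^{a\,j}_{m} T^{b\,m}_{j}) \tag{8.127} \\
&= \frac{1}{9}\,\mathrm{Tr}(T^a T^b)\,\mathrm{Tr}(T^a T^b) \tag{8.128} \\
&= \frac{1}{9}\, I(\mathbf{3})\delta^{ab}\, I(\mathbf{3})\delta^{ab} \tag{8.129} \\
&= \frac{1}{9}\left(\frac{1}{2}\right)^2 d_G \tag{8.130} \\
&= \frac{2}{9}. \tag{8.131}
\end{align*}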
In doing this, we have used the definition of the index of a representation (8.17);
the fact that the index of the fundamental representation is 1/2; and the fact that the
sum over a, b of δ ab δ ab just counts the number of generators of the Lie algebra dG ,
which is 8 for SU (3)c .
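The color factor of 2/9 can be cross-checked by summing over explicit color indices instead of using traces. The sketch below is an illustration, not part of the original derivation: it assumes the standard Gell-Mann matrices for $\lambda^a$ with $T^a = \lambda^a/2$, stores $T^{a\,i}_{l}$ as `T[a][l][i]` (lower index as row, upper index as column), and uses hypothetical function names.

```python
import math

S3 = 1 / math.sqrt(3)

# The eight Gell-Mann matrices (standard choice, assumed here).
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S3, 0, 0], [0, S3, 0], [0, 0, -2 * S3]],
]

# Generators T^a = lambda^a / 2.
T = [[[x / 2 for x in row] for row in lam] for lam in GELL_MANN]

def trace_of_product(a, b):
    """Tr(T^a T^b), which should equal delta^{ab} / 2 for the fundamental representation."""
    return sum(T[a][i][j] * T[b][j][i] for i in range(3) for j in range(3))

def colour_amplitude(i, j, l, m):
    """Sum over a of T^a[l][i] * T^a[m][j] for fixed initial (i, j) and final (l, m) colors."""
    return sum(T[a][l][i] * T[a][m][j] for a in range(8))

def averaged_colour_factor():
    """Average over initial colors i, j and sum over final colors l, m, as in (8.125)."""
    total = 0.0
    for i in range(3):
        for j in range(3):
            for l in range(3):
                for m in range(3):
                    total += abs(colour_amplitude(i, j, l, m)) ** 2
    return total / 9

factor = averaged_colour_factor()
```

Summing the 81 color combinations one at a time reproduces the value 2/9 obtained from the trace identities, up to floating-point rounding.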
Meanwhile, the rest of $|\mathcal{M}|^2$, including a sum over final state spins and an
average over initial state spins, can be taken directly from the corresponding result for
$e^-\mu^- \to e^-\mu^-$ in QED, which we found by crossing symmetry in (5.213).
Stripping off the factor $e^4$ associated with the QED charges, we find in the high energy
limit of negligible quark masses,

\[ \frac{1}{2}\sum_{s_1}\frac{1}{2}\sum_{s_2}\sum_{s_3}\sum_{s_4} |\widetilde{\mathcal{M}}|^2 = 2\,\frac{s^2 + u^2}{t^2}. \tag{8.132} \]

Putting this together with the factor of g34 and the color factor above, we have

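\[ \overline{|\mathcal{M}|^2} \equiv \frac{1}{9}\sum_{\text{colors}} \frac{1}{4}\sum_{\text{spins}} |\mathcal{M}|^2 = \frac{4g_3^4}{9}\,\frac{s^2 + u^2}{t^2}. \tag{8.133} \]

The notation $\overline{|\mathcal{M}|^2}$ is a standard notation, which for a general process implies the
appropriate sum/average over spin and color. The differential cross-section for this
process is therefore:

\[ \frac{d\sigma}{d(\cos\theta)} = \frac{1}{32\pi s}\,\overline{|\mathcal{M}|^2} = \frac{2\pi\alpha_s^2}{9s}\,\frac{s^2 + u^2}{t^2}, \tag{8.134} \]

where

\[ \alpha_s = \frac{g_3^2}{4\pi} \tag{8.135} \]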

is the strong-interaction analog of the fine structure constant. Since we are neglecting
quark masses, the kinematics for this process is the same as in any massless 2→2
process, for example as found in (5.177)–(5.181). Therefore, one can replace cos θ
in favor of the Mandelstam variable t, using

\[ d(\cos\theta) = \frac{2\,dt}{s}, \tag{8.136} \]

so

\[ \frac{d\sigma}{dt} = \frac{4\pi\alpha_s^2}{9s^2}\,\frac{s^2 + u^2}{t^2}. \tag{8.137} \]
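As a rough illustration of how (8.134) and (8.137) fit together, the sketch below evaluates both forms for massless quarks. It assumes the massless 2→2 relations $s + t + u = 0$ and $t = -s(1 - \cos\theta)/2$, which are consistent with (8.136); the function names and the sample inputs are hypothetical, and the results are in natural units (GeV⁻⁴ for dσ/dt if s and t are in GeV²).

```python
import math

def dsigma_dt_ud(s, t, alpha_s):
    """d(sigma)/dt for ud -> ud from (8.137), with u fixed by s + t + u = 0."""
    u = -s - t
    return 4 * math.pi * alpha_s**2 / (9 * s**2) * (s**2 + u**2) / t**2

def dsigma_dcos_ud(s, cos_theta, alpha_s):
    """d(sigma)/d(cos theta) for ud -> ud from (8.134), assuming t = -s (1 - cos theta) / 2."""
    t = -s * (1 - cos_theta) / 2
    u = -s - t
    return 2 * math.pi * alpha_s**2 / (9 * s) * (s**2 + u**2) / t**2

# Hypothetical sample point: s = 100 GeV^2, scattering at 90 degrees, alpha_s = 0.12.
s, cos_theta, alpha_s = 100.0, 0.0, 0.12
t = -s * (1 - cos_theta) / 2
per_t = dsigma_dt_ud(s, t, alpha_s)
per_cos = dsigma_dcos_ud(s, cos_theta, alpha_s)
```

The two results differ by the Jacobian factor s/2 from (8.136). Both grow like 1/t² as the scattering angle goes to zero, just as in the QED process on which the calculation is modeled.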

8.4.2 Gluon-Gluon Scattering (gg → gg)

Let us now turn to QCD scattering of gluons. Because there are three-gluon and
four-gluon interaction vertices, one has the interesting process gg → gg even at
tree-level. (It is traditional to represent the gluon particle name, but not its quantum
field, by g.) The corresponding QED process of γ γ → γ γ does not happen at tree-
level, but does occur at one loop. In QCD, because of the three-gluon and four-gluon
vertices, there are four distinct Feynman diagrams that contribute at tree-level:

The calculation of the differential cross-section from these diagrams is an important,
but quite tedious, one. Just to get an idea of how this proceeds, let us write down the
reduced matrix element for the first (“s-channel”) diagram, and then skip directly to
the final answer.
Choosing polarization vectors and color indices for the gluons, we have:

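    Particle         Polarization vector                                      Color
    initial gluon    $\epsilon_1^{\mu} = \epsilon^{\mu}(p, \lambda_1)$        a
    initial gluon    $\epsilon_2^{\nu} = \epsilon^{\nu}(p', \lambda_2)$       b          (8.138)
    final gluon      $\epsilon_3^{\rho *} = \epsilon^{\rho *}(k, \lambda_3)$  c
    final gluon      $\epsilon_4^{\sigma *} = \epsilon^{\sigma *}(k', \lambda_4)$  d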

Let the internal gluon line carry (vector,gauge) indices (κ, e) on the left and (λ, f )
on the right. Labeling the Feynman diagram in detail:

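The momenta flowing into the leftmost 3-gluon vertex are, starting from the upper-left
incoming gluon and going clockwise, $(p, -p - p', p')$. Also, the momenta flowing
into the rightmost 3-gluon vertex are, starting from the upper-right final-state gluon
and going clockwise, $(-k, -k', k + k')$. So we can use the Feynman rules of Sect. 8.2
to obtain:

\begin{align*}
\mathcal{M}^{s\text{-channel}}_{gg \to gg} = {}& \epsilon_1^{\mu} \epsilon_2^{\nu} \epsilon_3^{\rho *} \epsilon_4^{\sigma *} \left[ \frac{-ig^{\kappa\lambda}\delta^{ef}}{(p + p')^2} \right] \\
& \times \left( -g f^{aeb} \right)\left[ g_{\mu\kappa}(2p + p')_\nu + g_{\kappa\nu}(-p - 2p')_\mu + g_{\nu\mu}(p' - p)_\kappa \right] \\
& \times \left( -g f^{cdf} \right)\left[ g_{\rho\sigma}(k' - k)_\lambda + g_{\sigma\lambda}(-2k' - k)_\rho + g_{\lambda\rho}(2k + k')_\sigma \right]. \tag{8.139}
\end{align*}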

After writing down the reduced matrix elements for the other three diagrams, adding
them together, taking the complex square, summing over final state polarizations and
averaging over initial state polarizations, summing over final-state gluon colors and
averaging over initial-state gluon colors according to $\frac{1}{8}\sum_a \frac{1}{8}\sum_b \sum_c \sum_d$, one finds:

\[ \frac{d\sigma_{gg \to gg}}{dt} = \frac{9\pi\alpha_s^2}{4s^2}\left[ \frac{u^2 + t^2}{s^2} + \frac{s^2 + u^2}{t^2} + \frac{s^2 + t^2}{u^2} + 3 \right]. \tag{8.140} \]

When we collide a proton with another proton or an antiproton, this process, and the
process ud → ud, are just two of many possible subprocesses that can occur. There
is no way to separate the proton into simpler parts, so one must deal with all of these
possible subprocesses. We will consider the subprocesses of proton-(anti)proton
scattering more systematically in the next section.
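The gluon-gluon result (8.140) can be evaluated in the same way as the quark-quark one. The sketch below is illustrative: it assumes massless kinematics with $s + t + u = 0$, and it also compares against an algebraically equivalent compact form, $\frac{9\pi\alpha_s^2}{2s^2}\left[3 - \frac{tu}{s^2} - \frac{su}{t^2} - \frac{st}{u^2}\right]$, which follows from (8.140) by using $u^2 + t^2 = s^2 - 2ut$ and its permutations. Function names and sample inputs are hypothetical.

```python
import math

def dsigma_dt_gg(s, t, alpha_s):
    """d(sigma)/dt for gg -> gg as written in (8.140), with u = -s - t."""
    u = -s - t
    bracket = ((u * u + t * t) / s**2
               + (s * s + u * u) / t**2
               + (s * s + t * t) / u**2
               + 3)
    return 9 * math.pi * alpha_s**2 / (4 * s**2) * bracket

def dsigma_dt_gg_compact(s, t, alpha_s):
    """Equivalent compact form, valid when s + t + u = 0."""
    u = -s - t
    bracket = 3 - t * u / s**2 - s * u / t**2 - s * t / u**2
    return 9 * math.pi * alpha_s**2 / (2 * s**2) * bracket

# Hypothetical sample point: s = 100 GeV^2, t = -30 GeV^2, alpha_s = 0.12.
value = dsigma_dt_gg(100.0, -30.0, 0.12)
compact = dsigma_dt_gg_compact(100.0, -30.0, 0.12)
```

Because the two final-state gluons are identical particles, the expression is symmetric under t ↔ u, which the first function makes manifest.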

8.5 Renormalization

Since the strong interactions involve a coupling g3 that is not small, we should
worry about higher-order corrections to the treatment of quark-quark scattering in
the previous section. Let us discuss this issue in a more general framework than
just QCD. In a general gauge theory, the Feynman diagrams contributing to the
reduced matrix element at one-loop order in fermion+fermion → fermion+fermion
scattering are the following:

In each of these diagrams, there is a loop momentum $\ell^\mu$ that is unfixed by the external
4-momenta, and must be integrated over. Only the first two diagrams give a finite
answer when one naively integrates $d^4\ell$. This is not surprising; we do not really
know what physics is like at very high energy and momentum scales, so we have
no business in integrating over them. Therefore, one must introduce a very high
cutoff mass scale M, and replace the loop-momentum integral by one that kills the
contributions to the reduced matrix element from $|\ell^\mu| \ge M$. Physically, M should
be the mass scale at which some as-yet-unknown new physics enters in to alter the
theory. It is generally thought that the highest this cutoff is likely to be is about
$M_{\text{Planck}} = 2.4 \times 10^{18}$ GeV (give or take an order of magnitude), but it could very
easily be much lower.
As an example of what can happen, consider the next-to-last Feynman diagram
given above. Let us call q μ = p μ − k μ the 4-momentum flowing through either of
the vector-boson propagators. Then the part of the reduced matrix element associated
with the fermion loop is:
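\[ \sum_f (-1) \int_{|\ell^\mu| \le M} d^4\ell \;\mathrm{Tr}\left\{ \left[ -i\hat{g}\,(T_f^a)_i{}^j \gamma^\mu \right] \left[ \frac{i({\not\ell} + {\not q} + \hat{m}_f)}{(\ell + q)^2 - \hat{m}_f^2 + i\epsilon} \right] \left[ -i\hat{g}\,(T_f^b)_j{}^i \gamma^\nu \right] \left[ \frac{i({\not\ell} + \hat{m}_f)}{\ell^2 - \hat{m}_f^2 + i\epsilon} \right] \right\}. \tag{8.141} \]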

This involves a sum over all fermions that can propagate in the loop, and a trace over
the spinor indices of the fermion loop. For reasons that will become clear shortly,
we are calling the gauge coupling of the theory $\hat{g}$ and the mass of each fermion
species $\hat{m}_f$. We are being purposefully vague about what $\int_{|\ell^\mu| \le M} d^4\ell$ means, in part
because there are actually several different ways to cut off the integral at large M. (A
straightforward step-function cutoff will work, but is clumsy to carry out and even
clumsier to interpret.)
The $d^4\ell$ factor can be written as an angular part times a radial part $|\ell|^3\, d|\ell|$. Now
there are up to five powers of $|\ell|$ in the numerator (three from the $d^4\ell$, and two from
the propagators), and four powers of $|\ell|$ in the denominator from the propagators. So
naively, one might expect that the result of doing the integral will scale like $M^2$ for
a large cutoff M. However, there is a conspiratorial cancellation, so that the large-M
behavior is only logarithmic. The result is proportional to:

\[ \hat{g}^2 (q^2 g^{\mu\nu} - q^\mu q^\nu) \sum_f \mathrm{Tr}(T_f^a T_f^b)\, [\ln(M/m) + \cdots] \tag{8.142} \]

where the · · · represents a contribution that does not get large as M gets large. The
m is a characteristic mass scale of the problem; it is something with dimensions of
mass built out of q μ and the m̂ f . It must appear in the formula in the way it does in
order to make the argument of the logarithm dimensionless. The arbitrariness in the
precise definition of m can be absorbed into the “· · · ”.
When one uses (8.142) in the rest of the Feynman diagram, it is clear that the
entire contribution must be proportional to:

\[ \mathcal{M}_{\text{fermion loop in gauge propagator}} \propto \hat{g}^4 \sum_f I(R_f) \ln(M/m) + \cdots. \tag{8.143} \]

What we are trying to keep track of here is just the number of powers of ĝ, the
group-theory factor, and the large-M dependence on ln(M/m).
A similar sort of calculation applies to the last diagram involving a gauge vector
boson loop. Each of the three-vector couplings involves a factor of $f^{abc}$, with two of
the indices contracted because of the propagators. So it must be that the loop part of
the diagram makes a contribution proportional to $f^{acd} f^{bcd} = C(G)\delta^{ab}$. It is again
logarithmically divergent, so that

\[ \mathcal{M}_{\text{gauge loop in gauge propagator}} \propto \hat{g}^4\, C(G) \ln(M/m) + \cdots. \tag{8.144} \]

Doing everything carefully, one finds that the contribution to the differential
cross-section is given by:

\[ d\sigma = d\sigma_{\text{tree}}(\hat{g}) \left\{ 1 + \frac{\hat{g}^2}{4\pi^2}\left[ \frac{11}{3} C(G) - \frac{4}{3}\sum_f I(R_f) \right] \ln(M/m) + \cdots \right\}, \tag{8.145} \]

where dσtree (ĝ) is the tree-level result (which we have already worked out in the
special case of QCD), considered to be a function of ĝ. To be specific, it is proportional
to ĝ 4 . Let us ignore all the other diagrams for now; the justification for this will be
revealed soon.
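To get a feeling for the size and sign of the group-theory bracket multiplying the logarithm in (8.145), it can be evaluated for QCD. The sketch below is illustrative and assumes two inputs: $C(G) = 3$ for SU(3) (the standard value with the normalization $f^{acd}f^{bcd} = C(G)\delta^{ab}$), and index 1/2 for each quark flavor in the fundamental representation, as quoted earlier in the chapter. Function names are hypothetical.

```python
from fractions import Fraction

def one_loop_bracket(casimir_adjoint, fermion_indices):
    """The bracket (11/3) C(G) - (4/3) * sum_f I(R_f) appearing in (8.145)."""
    return Fraction(11, 3) * casimir_adjoint - Fraction(4, 3) * sum(fermion_indices)

def qcd_bracket(n_flavors):
    """Bracket for SU(3)_c with n_flavors Dirac quarks, each with index 1/2 (assumed)."""
    return one_loop_bracket(3, [Fraction(1, 2)] * n_flavors)

six_flavors = qcd_bracket(6)
five_flavors = qcd_bracket(5)
```

With these inputs the bracket equals 11 − (2/3)n for n quark flavors, so the gauge-boson loop dominates and the bracket stays positive for any number of flavors up to 16.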
The cutoff M may be quite large. Furthermore, we typically do not know what
it is, or what the specific very-high-energy physics associated with it is. (If we did,
we could just redo the calculation with that physics included, and a higher cutoff.)
Therefore, it is convenient to absorb our ignorance of M into a redefinition of the
coupling. Specifically, inspired by (8.145), one defines a renormalized or running
coupling g(μ) by writing:
\[ \hat{g} = g(\mu) \left\{ 1 - \frac{(g(\mu))^2}{16\pi^2}\left[ \frac{11}{3} C(G) - \frac{4}{3}\sum_f I(R_f) \right] \ln(M/\mu) \right\}. \tag{8.146} \]

Here μ is a new mass scale, called the renormalization scale, that we get to pick. (It
is not uncommon to see the renormalization scale denoted by Q instead of μ.) The
original coupling ĝ is called the bare coupling. One can invert this relation to write
the renormalized coupling in terms of the bare coupling:
\[ g(\mu) = \hat{g} \left\{ 1 + \frac{\hat{g}^2}{16\pi^2}\left[ \frac{11}{3} C(G) - \frac{4}{3}\sum_f I(R_f) \right] \ln(M/\mu) + \cdots \right\}, \tag{8.147} \]

where we are treating g(μ) as an expansion in $\hat{g}$, dropping terms of order $\hat{g}^5$
everywhere.
The reason for this strategic definition is that, since we know that dσtree (ĝ) is
proportional to ĝ 4 , we can now write, using (8.145) and (8.146):
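\begin{align*}
d\sigma &= d\sigma_{\text{tree}}(g)\, (\hat{g}/g)^4 \left\{ 1 + \frac{\hat{g}^2}{4\pi^2}\left[ \frac{11}{3} C(G) - \frac{4}{3}\sum_f I(R_f) \right] \ln(M/m) + \cdots \right\} \\
&= d\sigma_{\text{tree}}(g) \left\{ 1 + \frac{g^2}{4\pi^2}\left[ \frac{11}{3} C(G) - \frac{4}{3}\sum_f I(R_f) \right] \ln(\mu/m) + \cdots \right\}. \tag{8.148}
\end{align*}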

Here we are again dropping terms that go like g 4 ; these are comparable to 2-loop
contributions that we are neglecting anyway. The factor dσtree (g) is the tree-level dif-
ferential cross-section, but with g(μ) in place of ĝ. This formula looks very much like
(8.145), but with the crucial difference that the unknown cutoff M has disappeared,
and is replaced by the scale μ that we know, because we get to pick it.
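The disappearance of the cutoff can be seen numerically by evaluating both lines of (8.148) for very different values of M. The sketch below is only an illustration: the "···" terms are dropped, the function names are hypothetical, and the inputs (a deliberately small coupling g = 0.05, bracket value 7, μ = 100, m = 50 in arbitrary mass units) are arbitrary choices made so that the neglected higher-order terms are tiny.

```python
import math

def bare_coupling(g, bracket, cutoff, mu):
    """Bare coupling g-hat in terms of g(mu), following (8.146)."""
    return g * (1 - g**2 / (16 * math.pi**2) * bracket * math.log(cutoff / mu))

def correction_in_bare_terms(g, bracket, cutoff, mu, m):
    """First line of (8.148): (g-hat/g)^4 * {1 + g-hat^2/(4 pi^2) * bracket * ln(M/m)}."""
    g_hat = bare_coupling(g, bracket, cutoff, mu)
    loop = g_hat**2 / (4 * math.pi**2) * bracket * math.log(cutoff / m)
    return (g_hat / g) ** 4 * (1 + loop)

def correction_in_renormalized_terms(g, bracket, mu, m):
    """Second line of (8.148): 1 + g^2/(4 pi^2) * bracket * ln(mu/m), with no cutoff."""
    return 1 + g**2 / (4 * math.pi**2) * bracket * math.log(mu / m)

g, bracket, mu, m = 0.05, 7, 100.0, 50.0
renormalized = correction_in_renormalized_terms(g, bracket, mu, m)
low_cutoff = correction_in_bare_terms(g, bracket, 1e6, mu, m)
high_cutoff = correction_in_bare_terms(g, bracket, 1e12, mu, m)
```

Changing the cutoff by six orders of magnitude moves the first-line expression only by an amount of the order of the neglected terms, and both values agree with the cutoff-free second line to the same accuracy.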
What should we pick μ to be? In principle we could pick it to be the cutoff M,
except that we do not know what that is. Besides, the logarithm could then be very
large, and perturbation theory would converge very slowly or not at all. For example,
suppose that M = MPlanck , and the characteristic energy scale of the experiment
we are doing is, say, m = 0.511 MeV or m = 1000 GeV. These choices might be
appropriate for experiments involving a non-relativistic electron and a TeV-scale
collider, respectively. Then

\[ \ln(M/m) \approx 50 \text{ or } 35. \tag{8.149} \]

This logarithm typically gets multiplied by $1/16\pi^2$ times $g^2$ times a group-theory
quantity, but is still large. This suggests that a really good choice for μ is to make
the logarithm ln(μ/m) as small as possible, so that the correction term in (8.148) is
small. Therefore, one should choose

μ ≈ m. (8.150)
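The sizes of the logarithms quoted in (8.149) are easy to reproduce. The short sketch below simply evaluates ln(M/m) for the two example scales, taking the cutoff to be the Planck-scale value quoted in the text; the names used are hypothetical.

```python
import math

M_PLANCK_GEV = 2.4e18  # cutoff value quoted in the text, in GeV

def large_log(m_gev, cutoff_gev=M_PLANCK_GEV):
    """ln(M/m) for a characteristic scale m and cutoff M, both in GeV."""
    return math.log(cutoff_gev / m_gev)

log_electron = large_log(0.511e-3)  # m = 0.511 MeV
log_tev = large_log(1000.0)         # m = 1000 GeV
```

Both logarithms are of order tens, which is why leaving the cutoff inside the logarithm would badly degrade the perturbative expansion, and why choosing μ close to m is preferable.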

Then, to a first approximation, one can calculate with the tree-level approximation,
using a renormalized coupling g(μ), knowing that the one-loop correction from these
diagrams is small. The choice of renormalization scale (8.150) allows us to write:

dσ ≈ dσtree (g(μ)). (8.151)

Of course, this is only good enough to get rid of the large logarithmic one-loop
corrections. If you really want all one-loop corrections, there is no way around
calculating all the one-loop diagrams, keeping all the pieces, not just the ones that
get large as M → ∞.
What about the remaining diagrams? If we isolate the M → ∞ behavior, they
fall into three classes. First, there are diagrams that are not divergent at all (the first
two diagrams). Second, there are diagrams (the third through sixth diagrams) that
are individually divergent like ln(M/m), but sum up to a total that is not divergent.
Finally, the seventh through tenth diagrams have a logarithmic divergence, but it can
be absorbed into a similar redefinition of the mass. A clue to this is that they all
involve sub-diagrams:

The one-loop renormalized or running mass m f (μ) is defined in terms of the bare
mass m̂ f by

\[ \hat{m}_f = m_f(\mu) \left\{ 1 - \frac{g^2}{2\pi^2}\, C(R_f) \ln(M/\mu) \right\}, \tag{8.152} \]

or

\[ m_f(\mu) = \hat{m}_f \left\{ 1 + \frac{g^2}{2\pi^2}\, C(R_f) \ln(M/\mu) + \cdots \right\}, \tag{8.153} \]

where C(R f ) is the quadratic Casimir invariant of the representation carried by the
fermion f . It is an amazing fact that the two redefinitions (8.146) and (8.152) are
enough to remove the cutoff dependence of all cross-sections in the theory up to and
including one-loop order. In other words, one can calculate dσ for any process, and
express it in terms of the renormalized mass m(μ) and the renormalized coupling
g(μ), with no M-dependence. This is what it means for a theory to be renormalizable
at one loop order.
In Yang-Mills theories, one can show that by doing some redefinitions of the
form:
\[ \hat{g} = g(\mu) \left[ 1 + \sum_{n=1}^{L} b_n\, g^{2n}\, p_n(\ln(M/\mu)) \right], \tag{8.154} \]

\[ \hat{m}_f = m_f(\mu) \left[ 1 + \sum_{n=1}^{L} c_n\, g^{2n}\, q_n(\ln(M/\mu)) \right], \tag{8.155} \]

one can simultaneously eliminate all dependence on the cutoff in any process up to
L-loop order. Here pn (x) and qn (x) are polynomials of degree n, and bn , cn are some
constants that depend on group theory invariants like the Casimir invariants of the
group and the representations, and the index. At any finite loop order, what is left
in the expression for any cross-section after writing it in terms of the renormalized
mass m(μ) and renormalized coupling g(μ) is a polynomial in ln(μ/m); these are
to be made small by choosing² μ ≈ m. This is what it means for a theory to be
renormalizable at all loop orders. Typically, the specifics of these redefinitions are
only known at 2-, 3-, or occasionally 4-loop order, except in some special theories.
If a theory is non-renormalizable, it does not necessarily mean that the theory is
useless; we saw that the four-fermion theory of the weak interactions makes reliable
predictions, and we still have no more predictive theory for gravity than Einstein’s
relativity. It does mean that we expect the theory to have trouble making predictions
about processes at high energy scales.
We have seen that we can eliminate the dependence on the unknown cutoff of a
theory by defining a renormalized running coupling g(μ) and mass m f (μ). When
one does an experiment in high energy physics, the results are first expressed in
terms of observable quantities like cross-sections, decay rates, and physical masses of
particles. Using this data, one extracts the value of the running couplings and running
masses at some appropriately-chosen renormalization scale μ, using a theoretical
prediction like (8.148), but with the non-logarithmic corrections included too. (The
running mass is not quite the same thing as the physical mass. The physical mass can
be determined from the experiment by kinematics, while the running mass is related to it by
various corrections.) The running parameters can then be used to make predictions
for other experiments. This tests both the theoretical framework, and the specific
values of the running parameters.
The bare coupling and the bare mass never enter into this process of comparing
theory to experiment. If we measure dσ in an experiment, we see from (8.145) that
in order to determine the bare coupling ĝ from the data, we would also need to know
the cutoff M. However, we do not know what M is. We could guess at it, but this
would usually be a wild guess, devoid of practical significance.

[Footnote 2: Of course, there might be more than one characteristic energy scale in a given problem, rather
than a single m. If so, and if they are very different from each other, then one may be stuck with
some large logarithms, no matter what μ is chosen. This has to be dealt with by fancier methods.]

A situation that arises quite often is that one extracts running parameters from
an experiment with a characteristic energy scale μ0 , and one wants to compare
with data from some other experiment that has a completely different characteristic
energy scale μ. Here μ0 and μ each might be the mass of some particle that is
decaying, or the momentum exchanged between particles in a collision, or some
suitable average of particle masses and exchanged momenta. It would be unwise to
use the same renormalization scale when computing the theoretical expectations for
both experiments, because the loop corrections involved in at least one of the two
cases will be unnecessarily large. What we need is a way of taking a running coupling
as determined in the first experiment at a renormalization scale μ0 , and getting from
it the running coupling at any other scale μ. The change of the choice of scale μ is
known as the renormalization group (see footnote 3).
As an example, let us consider how g(μ) changes in a Yang-Mills gauge theory.
Since the differential cross-section dσ for fermion+fermion →fermion+fermion
is an observable, in principle it should not depend on the choice of μ, which is an
arbitrary one made by us. Therefore, we can require that (8.148) is independent of
μ. Remembering that dσtree ∝ g 4 , we find:
\[ 0 = \frac{d}{d\mu}(d\sigma) = (d\sigma_{\rm tree})\left\{ \frac{4}{g}\frac{dg}{d\mu} + \frac{g^2}{4\pi^2}\left[\frac{11}{3}C(G) - \frac{4}{3}\sum_f I(R_f)\right]\frac{1}{\mu} + \cdots \right\}, \tag{8.156} \]

where we are dropping all higher-loop-order terms that are proportional to (dσtree )g 4 .
The first term in (8.156) comes from the derivative acting on the g 4 inside dσtree .
The second term comes from the derivative acting on the lnμ one-loop correction
term. The contribution from the derivative acting on the g 2 in the one-loop correction
term can be self-consistently judged, from the equation we are about to write down,
as proportional to (dσtree )g 5 , so it is neglected as a higher-loop-order effect in the
expansion in g 2 . So, it must be true that:
\[ \mu\frac{dg}{d\mu} = \frac{g^3}{16\pi^2}\left[-\frac{11}{3}C(G) + \frac{4}{3}\sum_f I(R_f)\right] + \cdots . \tag{8.157} \]

This differential equation, called the renormalization group equation or RG equation,
tells us how to change the coupling g(μ) when we change the renormalization scale.
An experimental result will provide a boundary condition at some scale μ0, and then
we can solve the RG equation to find g(μ) at some other scale. Other experiments
then test the whole framework.

[Footnote 3: The use of the word “group” is historical; this is not a group in the mathematical sense defined
earlier.]

The right-hand side of the RG equation is known as
the beta function for the running coupling g(μ), and is written β(g), so that:

\[ \mu\frac{dg}{d\mu} = \beta(g). \tag{8.158} \]

In a Yang-Mills gauge theory, including the effects of Feynman diagrams with more
loops,

\[ \beta(g) = \frac{g^3}{16\pi^2}\, b_0 + \frac{g^5}{(16\pi^2)^2}\, b_1 + \frac{g^7}{(16\pi^2)^3}\, b_2 + \cdots \tag{8.159} \]

where we already know that

\[ b_0 = -\frac{11}{3}C(G) + \frac{4}{3}\sum_f I(R_f), \tag{8.160} \]

and, just to give you an idea of how it goes,

\[ b_1 = -\frac{34}{3}C(G)^2 + \frac{20}{3}C(G)\sum_f I(R_f) + 4\sum_f C(R_f)\, I(R_f), \tag{8.161} \]

etc. In QCD, the coefficients up to $b_3$ (four-loop order) have been calculated. In
practical applications, it is usually best to work within an effective theory in which
fermions much heavier than the scales μ of interest are ignored. The sum over
fermions then includes only those satisfying $m_f \lesssim \mu$. The difference between this
effective theory and the more complete theory with all known fermions included can
be absorbed into a redefinition of running parameters. The advantage of doing this is
that perturbation theory will converge more quickly and reliably if heavy fermions
(that are, after all, irrelevant to the process under study) are not included.
In the one-loop order approximation, one can solve the RG equation explicitly.
Writing

\[ \frac{dg^2}{d\ln\mu} = \frac{b_0}{8\pi^2}\, g^4, \tag{8.162} \]

you can check that

\[ g^2(\mu) = \frac{g^2(\mu_0)}{1 - \dfrac{b_0\, g^2(\mu_0)}{8\pi^2}\ln(\mu/\mu_0)}. \tag{8.163} \]
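
One way to carry out this check, offered here as a short worked sketch rather than as part of the original derivation, is to rewrite (8.162) as an equation for $1/g^2$, which then has a constant right-hand side and can be integrated from $\mu_0$ to $\mu$:

```latex
\[
\frac{d}{d\ln\mu}\left(\frac{1}{g^2}\right)
  = -\frac{1}{g^4}\,\frac{dg^2}{d\ln\mu}
  = -\frac{b_0}{8\pi^2}
\quad\Longrightarrow\quad
\frac{1}{g^2(\mu)} = \frac{1}{g^2(\mu_0)} - \frac{b_0}{8\pi^2}\ln(\mu/\mu_0).
\]
```

Taking the reciprocal of both sides and multiplying numerator and denominator by $g^2(\mu_0)$ reproduces (8.163).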

To see how this works in QCD, let us examine the one-loop beta function. In
SU (3), C(G) = 3, and each quark flavor is in a fundamental 3 representation with
I (3) = 1/2. Therefore,
\[ b_{0,\rm QCD} = -11 + \frac{2}{3}\, n_f \tag{8.164} \]
where $n_f$ is the number of “active” quarks in the effective theory, usually those
with mass $\lesssim \mu$. The crucial fact is that since there are only 6 quark flavors known,
b0,QCD is definitely negative for all accessible scales μ, and so the beta function
is definitely negative. For an effective theory with n f = (3, 4, 5, 6) quark flavors,
b0 = (−9, −25/3, −23/3, −7). Writing the solution to the RG equation, (8.163), in
terms of the running αs , we have:

\[ \alpha_s(\mu) = \frac{\alpha_s(\mu_0)}{1 - \dfrac{b_0\, \alpha_s(\mu_0)}{2\pi}\ln(\mu/\mu_0)}. \tag{8.165} \]

Since $b_0$ is negative, we can make $\alpha_s$ blow up by choosing μ small enough. To make
this more explicit, we can define a quantity

\[ \Lambda_{\rm QCD} = \mu_0\, e^{2\pi/(b_0 \alpha_s(\mu_0))}, \tag{8.166} \]

with dimensions of [mass], implying that

\[ \alpha_s(\mu) = \frac{2\pi}{b_0 \ln(\Lambda_{\rm QCD}/\mu)}. \tag{8.167} \]

This shows that at the scale $\mu = \Lambda_{\rm QCD}$, the QCD gauge coupling is predicted to
blow up, according to the 1-loop RG equation. A qualitative graph of the running of
$\alpha_s(\mu)$ as a function of renormalization scale μ is shown below:

[Figure: qualitative graph of $\alpha_S(Q)$ versus renormalization scale Q, with $\Lambda_{\rm QCD}$ marked on the horizontal axis.]
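
To get a feeling for the numbers, the one-loop formulas (8.164) through (8.167) can be evaluated directly. The following is a minimal Python sketch, not taken from any standard package: the function names are our own, the value used for the Z boson mass is an added input, and holding $n_f = 5$ fixed at every scale is a simplifying assumption that ignores the flavor thresholds discussed above.

```python
import math
from fractions import Fraction


def b0_qcd(n_f):
    """One-loop coefficient of Eq. (8.164): b0 = -11 + (2/3) n_f."""
    return Fraction(-11) + Fraction(2, 3) * n_f


def alpha_s_one_loop(mu, mu0, alpha0, n_f):
    """Eq. (8.165): one-loop running from the reference point (mu0, alpha0)."""
    b0 = float(b0_qcd(n_f))
    return alpha0 / (1.0 - b0 * alpha0 / (2.0 * math.pi) * math.log(mu / mu0))


def lambda_qcd_one_loop(mu0, alpha0, n_f):
    """Eq. (8.166): the scale at which the one-loop coupling diverges."""
    b0 = float(b0_qcd(n_f))
    return mu0 * math.exp(2.0 * math.pi / (b0 * alpha0))


def alpha_s_from_lambda(mu, lam, n_f):
    """Eq. (8.167): the same running, written in terms of Lambda_QCD."""
    b0 = float(b0_qcd(n_f))
    return 2.0 * math.pi / (b0 * math.log(lam / mu))


M_Z = 91.1876        # GeV; Z boson mass, used as the reference scale (added input)
ALPHA_S_MZ = 0.1181  # reference value quoted in the text
N_F = 5              # assumption: five active flavors at all scales considered

if __name__ == "__main__":
    lam = lambda_qcd_one_loop(M_Z, ALPHA_S_MZ, N_F)
    print("one-loop Lambda_QCD [GeV]:", lam)
    for mu in (10.0, M_Z, 1000.0):
        # The two columns should agree, since (8.165) and (8.167) are equivalent.
        print(mu,
              alpha_s_one_loop(mu, M_Z, ALPHA_S_MZ, N_F),
              alpha_s_from_lambda(mu, lam, N_F))
```

The value of $\Lambda_{\rm QCD}$ obtained this way should not be expected to agree with values extracted using higher-loop running, since it relies on the one-loop formula alone; the sketch is meant only to show how a dimensionless coupling at one scale is traded for a mass scale.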

Of course, once αs (μ) starts to get big, we should no longer trust the one-loop
approximation, since two-loop effects are definitely big. The whole analysis has been
extended to four-loop order, with significant numerical changes, but the qualitative
effect remains: at any finite loop order, there is some scale $\Lambda_{\rm QCD}$ at which the gauge
coupling is predicted to blow up in a theory with a negative beta function. This is not
a sign that QCD is wrong. Instead, it is a sign that perturbation theory is not going to
be able to make good predictions when we do experiments near $\mu = \Lambda_{\rm QCD}$ or lower
energy scales. One can draw Feynman diagrams and make rough qualitative guesses,
but the numbers cannot be trusted. On the other hand, we see that for experiments
conducted at characteristic energies much larger than $\Lambda_{\rm QCD}$, the gauge coupling
is not large, and is getting smaller as μ gets larger. This means that perturbation
theory becomes more and more trustworthy at higher and higher energies. This nice
property of theories with negative β functions is known as asymptotic freedom. The
name refers to the fact that quarks in QCD are becoming free (since the coupling is
becoming small) as we probe them at larger energy scales.
Conversely, the fact that the QCD gauge coupling becomes non-perturbative in
the infrared means that we cannot expect to describe free quarks at low energies
using perturbation theory. This theoretical prediction goes by the name of infrared
slavery. It agrees well with the fact that one does not observe free quarks outside of
bound states. While it has not been proved mathematically that the infrared slavery
of QCD necessarily requires the absence of free quarks, the two ideas are certainly
compatible, and more complicated calculations show that they are plausibly linked.
Heuristically, the growth of the QCD coupling means that at very small energies or
large distances, the force between two free color charges is large and constant as the
distance increases. In the early universe, after the temperature dropped below $\Lambda_{\rm QCD}$,
all quarks and antiquarks and gluons arranged themselves into color-singlet bound
states, and have remained that way ever since.
An important feature of the renormalization of QCD is that we can actually trade
the gauge coupling as a parameter of the theory for the scale $\Lambda_{\rm QCD}$. This is remarkable,
since $g_3$ (or equivalently $\alpha_s$) is a dimensionless coupling, while $\Lambda_{\rm QCD}$ is a
mass scale. If we want, we can specify how strong the QCD interactions are either
by quoting what $\alpha_s(\mu_0)$ is at some specified $\mu_0$, or by quoting what $\Lambda_{\rm QCD}$ is. This
trade of a dimensionless parameter for a mass scale in a gauge theory is known as
dimensional transmutation. Working in a theory with five “active” quarks u, d, s, c, b
(the top quark is treated as part of the unknown theory above the cutoff), one finds
$\Lambda_{\rm QCD}$ is about 210 MeV. One can also work in an effective theory with only four
active quarks u, d, s, c, in which case $\Lambda_{\rm QCD}$ is about 290 MeV, or in an effective
theory with only three active quarks u, d, s, in which case $\Lambda_{\rm QCD}$ is about 330 MeV.
Alternatively, $\alpha_s(m_Z) = 0.1181 \pm 0.0013$. Recently, it has become standard to use
this second way of specifying the QCD coupling strength. These results hold when
the details of the cutoff and the renormalization are treated in the most popular way,
called the $\overline{\rm MS}$ scheme (see footnote 4). A summary of the experimental data on the QCD coupling
is shown in Fig. 8.1.
The data determine $\alpha_S(\mu)$ at a variety of renormalization scales μ from 1.78
GeV up to more than 1000 GeV. (In the figure, Q was used as the name of the
renormalization scale instead of μ.) The most accurate determinations come from
lattice QCD calculations of the mass splittings in the ϒ bottomonium system and
from the hadronic branching ratio in τ decays, but other inputs to the average come
from production of jets and $t\bar t$ pairs at hadron colliders, deep inelastic scattering at
the HERA proton-electron collider, and jet production data in $e^+e^-$ collisions. The
four-loop renormalization group running of $\alpha_S(\mu)$, with inputs from various widely
different μ, is then used to determine the reference value $\alpha_S(m_Z)$.

[Footnote 4: In this scheme, one cuts off loop momentum integrals by a process known as dimensional
regularization, which continuously varies the number of spacetime dimensions infinitesimally away from
4, rather than putting in a particular cutoff M. Although bizarre physically, this scheme is consistent
with gauge invariance and relatively easy to calculate in.]

Fig. 8.1 The renormalized QCD coupling constant $\alpha_S$ as a function of energy Q (from RPP 2022)

We can contrast this situation with the case of QED. For a U (1) group, there is
no non-zero structure constant, so C(G) = 0. Also, since the generator of the group
in a representation of charge Q f is just the 1 × 1 matrix Q f , the index for a fermion
with charge Q f is I (R f ) = Q 2f . Therefore,

\[ b_{0,\rm QED} = \frac{4}{3}\left[3 n_u (2/3)^2 + 3 n_d (-1/3)^2 + n_\ell (-1)^2\right] = \frac{16}{9}\, n_u + \frac{4}{9}\, n_d + \frac{4}{3}\, n_\ell, \tag{8.168} \]

where $n_u$ is the number of up-type quark flavors (u, c, t), $n_d$ is the number of
down-type quark flavors (d, s, b), and $n_\ell$ is the number of charged leptons (e, μ, τ)
included in the chosen effective theory. If we do experiments with a characteristic
energy scale $m_e \lesssim \mu \lesssim m_\mu$, then only the electron itself contributes, and $b_{0,\rm EM} = 4/3$, so:

\[ \frac{de}{d\ln\mu} = \beta_e = \frac{e^3}{16\pi^2}\,\frac{4}{3} \qquad (m_e \lesssim \mu \lesssim m_\mu). \tag{8.169} \]

This corresponds to a very slow running. (Notice that the smaller a gauge coupling
is, the slower it will run.) If we do experiments at characteristic energies that are
much less than the electron mass, then the relativistic electron is not included in the
effective theory (virtual electron-positron pairs are less and less important at low
energies), so $b_{0,\rm EM} = 0$, and the electron charge does not run at all:

\[ \frac{de}{d\ln\mu} = 0 \qquad (\mu \ll m_e). \tag{8.170} \]

This means that QED is not quite “infrared free”, since the effective electromagnetic
coupling is perturbative, but does not get arbitrarily small, at very large distance
scales. At extremely high energies, the coupling e could in principle become very
large, because the QED beta function is always positive. Fortunately, this is predicted
to occur only at energy scales far beyond what we can probe, because e runs very
slowly. Furthermore, QED is embedded in a larger, more complete theory anyway at
energy scales in the hundreds of GeV range, so the apparent blowing up of $\alpha = e^2/4\pi$
much farther in the ultraviolet is just an illusion.
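
The statement that the coupling blows up only at inaccessibly high scales can be made semi-quantitative. The sketch below, in Python, evaluates (8.168) for a chosen set of active fermions and then estimates how many e-folds of scale separate a reference point from the one-loop divergence, using the same one-loop solution as (8.165) with $\alpha = e^2/4\pi$. The function names are our own, the low-energy value $\alpha \approx 1/137.036$ is an added input, and keeping only the electron active at all scales is a deliberately crude assumption made for illustration.

```python
import math
from fractions import Fraction


def b0_qed(n_u, n_d, n_l):
    """Eq. (8.168): one-loop QED coefficient for the given active fermions."""
    up = 3 * n_u * Fraction(2, 3) ** 2      # 3 colors, charge +2/3
    down = 3 * n_d * Fraction(-1, 3) ** 2   # 3 colors, charge -1/3
    lepton = n_l * Fraction(-1) ** 2        # charge -1, no color factor
    return Fraction(4, 3) * (up + down + lepton)


def log_scale_ratio_at_pole(alpha0, b0):
    """Return ln(mu_pole / mu0) where the one-loop coupling would diverge.

    Follows from 1/alpha(mu) = 1/alpha0 - (b0 / 2 pi) ln(mu / mu0),
    which vanishes when ln(mu / mu0) = 2 pi / (b0 alpha0).
    """
    return 2.0 * math.pi / (float(b0) * alpha0)


ALPHA_0 = 1.0 / 137.036  # low-energy fine-structure constant (added input)

if __name__ == "__main__":
    b0_electron_only = b0_qed(0, 0, 1)  # assumption: electron only
    b0_all = b0_qed(3, 3, 3)            # all known charged fermions active
    print("b0 (electron only):", b0_electron_only)
    print("b0 (all fermions): ", b0_all)
    print("e-folds to one-loop pole, electron only:",
          log_scale_ratio_at_pole(ALPHA_0, b0_electron_only))
```

With only the electron active, the number of e-folds comes out in the several hundreds, which is the sense in which the divergence lies far beyond any scale we can probe. Including more active fermions increases $b_0$ and reduces this number, but the one-loop estimate remains a rough guide at best.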

8.6 Parton Distribution Functions and Hadron-Hadron Scattering

In general, a hadron is a QCD bound state of quarks, anti-quarks, and gluons. The
characteristic size of a hadron, like the proton or antiproton, is always roughly
$1/\Lambda_{\rm QCD} \approx 10^{-13}$ cm, since this is the scale at which the strongly-interacting
particles are confined. The point-like quark, antiquark, and gluon parts of a
hadron are called partons, and the description of hadrons in terms of them is called
the parton model.
Suppose we scatter a hadron off of another particle (which might be another hadron
or a lepton or photon) with a total momentum exchange much larger than $\Lambda_{\rm QCD}$. The
scattering can be thought of as factored into a “hard scattering” of one of the point-like
partons, with the remaining partons as spectators, and “soft” QCD processes that
involve exchanges and radiation of low energy virtual gluons. The hard scattering
subprocess takes place on a time scale much shorter than $1/\Lambda_{\rm QCD}$ expressed in seconds.
Because of asymptotic freedom, at higher scattering energies it becomes a better and
better approximation to think of the partons as individual entities that move collectively
before and after the scattering, but are free particles at the moment of scattering. As a
first approximation, we can consider only the hard scattering processes, and later worry
about adding on the various soft processes as part of the higher-order corrections. This
way of thinking about things allows us to compute cross-sections for hadron scattering
by first calculating the partonic cross-sections leading to a desired final state, and then
combining them with information about the multiplicity and momentum distributions
of partons within the hadronic bound states.
For example, suppose we want to calculate the scattering of a proton and antipro-
ton. This involves the following 2→2 partonic subprocesses:

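\[ qq \to qq, \quad qq' \to qq', \quad \bar q \bar q \to \bar q \bar q, \quad \bar q \bar q' \to \bar q \bar q', \tag{8.171} \]
\[ q\bar q \to gg, \quad q\bar q \to q\bar q, \quad q\bar q \to q'\bar q', \quad q\bar q' \to q\bar q', \tag{8.172} \]
\[ qg \to qg, \quad \bar q g \to \bar q g, \quad gg \to gg, \quad gg \to q\bar q, \tag{8.173} \]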
where g is a gluon, q is any fixed quark flavor, and $q'$ is any quark flavor that is
definitely different from q. Let the center-of-mass energy of the proton and antiproton
be called $\sqrt{s}$. Each parton carries only a fraction of the energy of
the hadron it belongs to, so the partonic center-of-mass energy, call it $\sqrt{\hat s}$, will be
significantly less than $\sqrt{s}$. One often uses hatted Mandelstam variables $\hat s$, $\hat t$ and
$\hat u$ for the partonic scattering event. If $\sqrt{\hat s} \gg \Lambda_{\rm QCD}$, then the two partons in the
final state will be sufficiently energetic that they can usually escape from most or
all of the spectator quarks and gluons before hadronizing (forming bound states).
However, before traveling a distance $1/\Lambda_{\rm QCD}$, they must rearrange themselves into
color-singlet combinations, possibly by creating quark-antiquark or gluon-antigluon
pairs out of the vacuum. This hadronization process can be quite complicated, but
will usually result in a jet of hadronic particles moving with roughly the same
4-momentum as the parton that was produced. So, all of the partonic process
cross-sections in (8.171) through (8.173) contribute to the observable cross-section for the process:

\[ p\bar p \to jj + X, \tag{8.174} \]

where j stands for a jet. The X stands for “anything”, and includes stray hadronic junk
left over from the original proton and antiproton. Similarly, partonic hard scatterings
like:

\[ q\bar q \to q\bar q g, \quad qg \to qgg, \quad \bar q g \to \bar q gg, \quad gg \to ggg, \tag{8.175} \]

and so on, will contribute to an observable cross-section for

\[ p\bar p \to jjj + X. \tag{8.176} \]

The 2 → 2 hard scattering processes can also contribute to this process if one of the
final state partons hadronizes by splitting into two jets, or if there is an additional jet
from the initial state.
In order to use the calculation of cross-sections for partonic processes like (8.171) through (8.173)
to obtain measurable cross-sections, we need to know how likely it is to have a given
parton inside the initial-state hadron with a given 4-momentum. Since we are mostly
interested in high-energy scattering problems, we can make things simple, and treat
the hadron and all of its constituents as nearly massless. (For the proton, this means
that we are assuming that $\sqrt{s} \gg m_p \approx 1$ GeV.) Suppose we therefore take the total
4-momentum of the hadron h in an appropriate Lorentz frame to be:

\[ p_h^\mu = (E, 0, 0, E). \tag{8.177} \]

This is sometimes called the “infinite momentum frame”, even though E is finite,
since $E \gg m_p$. Consider a parton constituent A (a quark, antiquark, or gluon) that
carries a fraction x of the hadron’s momentum:

\[ p_A^\mu = x(E, 0, 0, E). \tag{8.178} \]

The variable x is a standard notation, and is called the (longitudinal) momentum
fraction for the parton, or Feynman’s x. It is older than QCD, dating back to a time
when the proton was suspected to contain point-like partons with properties that were
then obscure. In order to describe the partonic content of a hadron, one defines:

\[ \left(\begin{array}{c} \text{Probability of finding a parton of type } A \text{ with} \\ \text{4-momentum between } x p^\mu \text{ and } (x + dx) p^\mu \\ \text{inside a hadron } h \text{ with 4-momentum } p^\mu \end{array}\right) = f_A^h(x)\, dx. \tag{8.179} \]

The function $f_A^h(x)$ is called the parton distribution function or PDF for the parton
A in the hadron h. The parton can be either one of the two or three valence quarks or
antiquarks that are the nominal constituents of the hadron, or one of an indeterminate
number of virtual sea quarks and gluons. Either type of parton can participate in a
scattering event.
Hadronic collisions studied in laboratories usually involve protons or antiprotons,
so the PDFs of the proton and antiproton are especially interesting. The proton is
nominally a bound state of three valence quarks, namely two up quarks and one down
quark, so we are certainly interested in the up-quark and down-quark distribution
functions $f_u^p(x)$ and $f_d^p(x)$.

The proton also contains virtual gluons, implying a gluon distribution function:

\[ f_g^p(x). \tag{8.180} \]

Furthermore, there are always virtual quark-antiquark pairs within the proton. This
adds additional contributions to $f_u^p(x)$ and $f_d^p(x)$, and also means that there is a
non-zero probability of finding antiup, antidown, or strange or antistrange quarks:

\[ f_{\bar u}^p(x), \quad f_{\bar d}^p(x), \quad f_s^p(x), \quad f_{\bar s}^p(x). \tag{8.181} \]

These parton distribution functions are implicitly summed over color and spin. So
$f_u^p(x)$ tells us the probability of finding an up quark with the given momentum
fraction x and any color and spin. Since the gluon is its own antiparticle (it lives in
the adjoint representation of the gauge group, which is always a real representation),
there is not a separate $f_{\bar g}^p(x)$.
Although the charm, bottom and top quarks are heavier than the proton, virtual
charm-anticharm, bottom-antibottom, and top-antitop pairs can exist as long as their
total energy does not exceed m p . This can happen because, as virtual particles, they
need not be on-shell. So, one can even talk about the parton distribution functions
$f_c^p(x)$, $f_{\bar c}^p(x)$, $f_b^p(x)$, $f_{\bar b}^p(x)$, $f_t^p(x)$, $f_{\bar t}^p(x)$. Fortunately, these are small so one can
often neglect them, although they can be important for processes involving charm or
bottom quarks in the final state.
Given the PDFs for the proton, the PDFs for the antiproton follow immediately
from the fact that it is the proton’s antiparticle. The probability of finding a given
parton in the proton with a given x is the same as the probability of finding the
corresponding antiparton in the antiproton with the same x. Therefore, if we know
the PDFs for the proton, there is no new information in the PDFs for the antiproton.
We can just describe everything having to do with proton and antiproton collisions
in terms of the proton PDFs. To simplify the notation, it is traditional to write the
proton and antiproton PDFs as:

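\[ g(x) = f_g^p(x) = f_g^{\bar p}(x), \tag{8.182} \]
\[ u(x) = f_u^p(x) = f_{\bar u}^{\bar p}(x), \tag{8.183} \]
\[ d(x) = f_d^p(x) = f_{\bar d}^{\bar p}(x), \tag{8.184} \]
\[ \bar u(x) = f_{\bar u}^p(x) = f_u^{\bar p}(x), \tag{8.185} \]
\[ \bar d(x) = f_{\bar d}^p(x) = f_d^{\bar p}(x), \tag{8.186} \]
\[ s(x) = f_s^p(x) = f_{\bar s}^{\bar p}(x), \tag{8.187} \]
\[ \bar s(x) = f_{\bar s}^p(x) = f_s^{\bar p}(x). \tag{8.188} \]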

The PDFs are also functions of another parameter Q (sometimes denoted μ or
$\mu_F$), known as the factorization scale. The factorization scale can be thought of as
the energy scale that serves as the boundary between what is treated as the short-
distance hard partonic process and what is taken to be part of the long-distance
physics associated with hadronization. The choice of the factorization scale is an
arbitrary one, and in principle the final result should not depend on it, just like the
choice of renormalization scale discussed in Sect. 8.5 is arbitrary. It is very common
to choose the factorization scale equal to the renormalization scale, although this is
not mandatory. (When they are distinguished in the literature, some authors use Q for
the renormalization scale and μ for the factorization scale, and some use the reverse
as we have here. Some use the same letter for both, especially when choosing them
to be the same numerically.) The PDFs have a mild logarithmic dependence on the
choice of factorization scale Q, which can be computed in perturbation theory by a set
of equations known as the DGLAP (Dokshitzer–Gribov–Lipatov–Altarelli–Parisi)
equations. So they are really functions $g(x, Q^2)$ etc., although the Q-dependence is
often left implicit for brevity, as we will mostly do below.
It is usual to choose the factorization scale Q to be comparable to some energy
scale relevant to the physical process of interest. For example, in deeply inelastic
scattering of leptons off of protons, with momentum transfer to the scattered quark
$q^\mu$, the factorization scale is typically chosen as $Q^2 = -q^2$. This $Q^2$ is positive, since
$q^\mu$ is a spacelike vector. When producing a pair of heavy particles with some mass
m, it is common to choose Q = m or some fraction thereof. After having completed
a calculation, one often varies the renormalization and factorization scales (either
together or independently) over a range (say, from Q = m/4 to Q = 2m in the
case just mentioned) to see how the cross-section or other observables that resulted
from the calculation vary. This is a test of the accuracy of the perturbation theory
calculation, since in principle if one could calculate exactly rather than to some low
order in perturbation theory, the results should not depend on either scale choice at
all.
In the proton, antiquarks are always virtual, and so must be accompanied by a
quark with the same flavor. This implies that if we add up all the up quarks found in
the proton, and subtract all the anti-ups, we must find a total of 2 quarks:

\[ \int_0^1 dx \left[u(x, Q^2) - \bar u(x, Q^2)\right] = 2. \tag{8.189} \]

Similarly, summing over all x the probability of finding a down quark with a given
x, and subtracting the same thing for anti-downs, one has:

\[ \int_0^1 dx \left[d(x, Q^2) - \bar d(x, Q^2)\right] = 1. \tag{8.190} \]

Most of the strange quarks in the proton come from the process of a virtual gluon
splitting into a strange and anti-strange pair. Since the virtual gluon treats quarks and
antiquarks on an equal footing, for every strange quark with a given x, there should
be (see footnote 5) an equal probability of finding an antistrange with the same x:

\[ s(x, Q^2) = \bar s(x, Q^2). \tag{8.191} \]

The up-quark PDF can be thought of as divided into a contribution $u_v(x)$ from the
two valence quarks, and a contribution $u_s(x)$ from the sea (non-valence) quarks that
are accompanied by an anti-up. (Here and below the factorization scale dependence
is left implicit, for brevity.) So we have:

\[ u(x) = u_v(x) + u_s(x), \qquad u_s(x) = \bar u(x); \tag{8.192} \]
\[ d(x) = d_v(x) + d_s(x), \qquad d_s(x) = \bar d(x). \tag{8.193} \]

[Footnote 5: Although QCD interactions do not change quark flavors, there is a small strangeness violation in
the weak interactions, so the rule (8.191) is not quite exact.]

There is also a constraint that the total 4-momentum of all partons found in the proton
must be equal to the 4-momentum of the proton that they form. This rule takes the
form:

\[ \int_0^1 dx\, x\left[g(x) + u(x) + \bar u(x) + d(x) + \bar d(x) + s(x) + \bar s(x) + \cdots\right] = 1, \tag{8.194} \]

or more generally, for any hadron h made out of parton species A,

\[ \sum_A \int_0^1 dx\, x\, f_A^h(x) = 1. \tag{8.195} \]

Each term $x f_A^h(x)$ represents the probability that a parton is found with a given
momentum fraction x, multiplied by that momentum fraction. One of the first compelling
pieces of evidence that the gluons are actual particles carrying real momentum
and energy, and not just abstract group-theoretic constructs, was that if one excludes
them from the sum rule (8.194), only about half of the proton’s 4-momentum is
accounted for:

\[ \int_0^1 dx\, x\left[u(x) + \bar u(x) + d(x) + \bar d(x) + s(x) + \bar s(x) + \cdots\right] \approx 0.5. \tag{8.196} \]
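
The bookkeeping behind the sum rules (8.189), (8.190) and (8.194) can be illustrated with a toy model. The Python sketch below uses hypothetical valence shapes of the form $N x^{a-1}(1-x)^b$, with exponents chosen purely for illustration; these are not fitted PDFs and have nothing to do with any published set. The normalizations are fixed so that the proton contains two valence up quarks and one valence down quark, and the momentum fraction carried by the valence quarks alone is then computed numerically.

```python
import math


def beta_fn(a, b):
    """Euler beta function B(a, b), built from gamma functions."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


def make_toy_valence(n_quarks, a, b):
    """Toy valence PDF N x^(a-1) (1-x)^b, normalized to integrate to n_quarks."""
    norm = n_quarks / beta_fn(a, b + 1.0)
    return lambda x: norm * x ** (a - 1.0) * (1.0 - x) ** b


def integrate_01(func, n=20000):
    """Midpoint rule on [0, 1] after the substitution x = t^2.

    The substitution removes the integrable x^(-1/2) behaviour at x = 0,
    so that a plain midpoint rule converges quickly.
    """
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += func(t * t) * 2.0 * t
    return total * h


# Hypothetical shapes: exponents are illustrative assumptions only.
u_v = make_toy_valence(2, 0.5, 3.0)  # two valence up quarks
d_v = make_toy_valence(1, 0.5, 4.0)  # one valence down quark

number_u = integrate_01(u_v)  # analogue of (8.189), since u - ubar = u_v
number_d = integrate_01(d_v)  # analogue of (8.190), since d - dbar = d_v
valence_momentum = integrate_01(lambda x: x * (u_v(x) + d_v(x)))

if __name__ == "__main__":
    print("valence up number:   ", number_u)
    print("valence down number: ", number_d)
    print("valence momentum fraction:", valence_momentum)
```

In this toy model the valence quarks carry well under half of the momentum, so the remainder in (8.194) would have to come from sea quarks and gluons. The specific fraction depends entirely on the assumed exponents and should not be read as a physical result.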

If we could solve the bound state problem for the proton in QCD, as one can
solve the hydrogen atom in quantum mechanics, then we could derive the PDFs
directly from the Hamiltonian. However, we saw in Sect. 8.5 why this is not practical;
perturbation theory in QCD is not accurate for studying low-energy problems
like bound-state problems, because the gauge coupling becomes very large at low
energies. Instead, the proton PDFs are measured by experiments including those
in which charged leptons and neutrinos probe the proton, like $\ell^- p \to \ell^- + X$ and
$\nu p \to \ell^- + X$. Several collaborations perform fits of available data to determine the
PDFs, and periodically publish updated results both in print and as computer code.
Because of different techniques and weighting of the data, the
PDFs from different groups are always somewhat different.
As an example, let us consider the CTEQ collaboration’s CTEQ5L PDF set.
Here, the “5” says which update of the PDFs is being provided, and the “L” stands
for “lowest order”, which means it is the set appropriate when one only has the
lowest-order calculation of the partonic cross-section. This set is somewhat old; we
use it only because it is relatively easy to evaluate, since it is given in parameterized
function form rather than interpolation table form. There are more recent sets from
CTEQ and other collaborations. Each of these has a version appropriate for lowest-order
work, and other versions appropriate when one has the next-to-leading order
(NLO) or next-to-next-to-leading order (NNLO) formulas for the hard scattering process
of interest. At Q = 10 GeV, the CTEQ5L PDFs for $u(x)$, $\bar u(x)$, and the valence
contribution $u_v(x) \equiv u(x) - \bar u(x)$ are shown below, together with a similar graph
for the down quark and antiquark distributions:

[Figure: two panels of CTEQ5L PDFs at Q = 10 GeV, plotting x f(x) versus x from 0 to 1. Left panel: $u(x)$, $\bar u(x) = u_s(x)$, and $u_v(x) = u(x) - \bar u(x)$. Right panel: $d(x)$, $\bar d(x) = d_s(x)$, and $d_v(x) = d(x) - \bar d(x)$.]

Here we follow tradition by graphing x times the PDF in each case, since they all
tend to get large near x = 0.
We see from the first graph that the valence up-quark distribution is peaked below
x = 0.2, with a long tail for larger x (where an up-quark is found to have a larger
fraction of the proton’s energy). There is even a significant chance of finding that
an up quark has more than half of the proton’s 4-momentum. In contrast, the sea
quark distribution $\bar u(x)$ is strongly peaked near x = 0. This is a general feature of
sea partons; the chance that a virtual particle can appear is greater when it carries a
smaller energy, and thus a smaller fraction x of the proton’s total momentum. The
solid curve shows the total up-quark PDF for this value of Q. The sea distribution
$\bar d(x)$ is not very different from that of the anti-up, but the distribution $d_v(x)$ is of
course only about half as big as $u_v(x)$, since there is only one valence down quark
to find in the proton.
Next, let us look at the strange and gluon PDFs:

[Figure: CTEQ5L PDFs at Q = 10 GeV, plotting x f(x) versus x from 0 to 1 for $g(x)$ and $s(x) = \bar s(x)$.]

The parton distribution function for gluons grows very quickly as one moves towards
x = 0. This is because there are 8 gluon color combinations available, and each virtual
gluon can give rise to more virtual gluons because of the 3-gluon and 4-gluon vertices.
This means that the chance of finding a gluon gets very large if one requires that it only
have a small fraction of the total 4-momentum of the proton. The PDFs $s(x) = \bar s(x)$
are suppressed by the non-zero strange quark mass, since this imposes a penalty on
making virtual strange and antistrange quarks. This explains why $s(x) < \bar d(x)$.
The value of the factorization scale Q = 10 GeV corresponds roughly to the
appropriate energy scale for many of the experiments that were actually used to
fit for the PDFs. However, at the Tevatron and LHC, one often studies events with
a much larger characteristic energy scale, like Q ∼ m t for top events and perhaps
Q ∼ 1000 GeV for supersymmetry events at the LHC. Larger Q is appropriate for
probing the proton at larger energy scales, or shorter distance scales. The next two
graphs show the CTEQ5L PDFs for Q = 100 and 1000 GeV:

[Figure: two panels of CTEQ5L PDFs, plotting x f(x) versus x from 0 to 1, at Q = 100 GeV (left) and Q = 1000 GeV (right), with curves for $g$, $u$, $d$, $\bar d$, and $\bar u$.]

As Q increases from Q = 100 to 1000 GeV, the PDFs become larger at very small x
(although this is hard to see from the graphs), but smaller for $x \gtrsim 0.015$ for gluons and
$x \gtrsim 0.04$ for quarks. More generally, the variation with Q can be made quantitative
using the DGLAP equations, which are built into the computer codes that provide
the parton distributions as a function of x and Q.
Now suppose we have available a set of PDFs, and let us see how to use them to
get a cross-section. Consider scattering two hadrons $h$ and $h'$, and let the partonic
differential cross-sections for the desired final state X be

\[ d\hat\sigma(ab \to X) \tag{8.197} \]

for any two partons a (to be taken from $h$) and b (from $h'$). The hat is used as a
reminder that this is a partonic process. If X has two particles 1,2 in it, then one
defines partonic Mandelstam variables:

\[ \hat s = (p_a + p_b)^2, \tag{8.198} \]
\[ \hat t = (p_a - k_1)^2, \tag{8.199} \]
\[ \hat u = (p_a - k_2)^2. \tag{8.200} \]

Let us work in the center-of-momentum frame, with approximately massless hadrons
and partons, so that

\[ p_h^\mu = (E, 0, 0, E), \tag{8.201} \]
\[ p_{h'}^\mu = (E, 0, 0, -E). \tag{8.202} \]

Then we can define a Feynman x for each of the initial-state partons, x_a and x_b, so that:

\[ p_a^\mu = x_a (E, 0, 0, E), \tag{8.203} \]
\[ p_b^\mu = x_b (E, 0, 0, -E). \tag{8.204} \]

It follows that, with s = (p_h + p_{h′})² for the whole hadronic event,

\[ \hat{s} = (x_a + x_b)^2 E^2 - (x_a - x_b)^2 E^2 = 4 x_a x_b E^2 = s\, x_a x_b. \tag{8.205} \]

Here s is determined by the collider [(1960 GeV)² at the Tevatron and (13 TeV)² at
the LHC], while ŝ is different for each event. Now, to find the total cross-section to
produce the final state X in h, h′ collisions, we should multiply the partonic
cross-section by the probability of finding in h a parton a with momentum fraction in
the range x_a to x_a + dx_a, and by the corresponding probability for parton b in h′; then
integrate over all possible x_a and x_b, and then sum over all the different parton
species a and b. The result is:

\[ d\sigma(hh' \to X) = \sum_{a,b} \int_0^1 dx_a \int_0^1 dx_b \; d\hat{\sigma}(ab \to X)\, f_a^h(x_a)\, f_b^{h'}(x_b). \tag{8.206} \]

This integration is done by computer, using PDFs with Q chosen equal to some
energy characteristic of the event. The partonic differential cross-section
dσ̂(ab → X) depends on the momentum fractions x_a and x_b through p_a^μ = x_a p_h^μ and
p_b^μ = x_b p_{h′}^μ, with p_h^μ and p_{h′}^μ controlled or known by the experimenter.

8.7 Top-Antitop Production in p p̄ and pp Collisions

As an example, let us consider top-antitop production, first at the Tevatron. The
parton-level processes that can contribute to this are, with the first parton taken to be
from the proton and the second from the antiproton:

\[ u\bar{u} \to t\bar{t}, \quad d\bar{d} \to t\bar{t}, \quad gg \to t\bar{t}, \quad \bar{d}d \to t\bar{t}, \quad \bar{u}u \to t\bar{t}, \quad s\bar{s} \to t\bar{t}, \quad \bar{s}s \to t\bar{t}. \tag{8.207} \]

These are listed in the order of their numerical importance in contributing to the
total cross-section for t t̄ production at the Tevatron. Notice that the most likely thing
is to find a quark in the proton and an antiquark in the antiproton, but there is also a
small but non-zero probability of finding an antiquark in the proton and a quark in the
antiproton. All of the processes involving a quark and an antiquark in the initial state
involve the same parton-level cross-section

\[ \frac{d\hat{\sigma}(q\bar{q} \to t\bar{t})}{d\hat{t}}. \tag{8.208} \]

The gluon-gluon process has a partonic cross-section that is somewhat more difficult
to obtain:

\[
\frac{d\hat{\sigma}(gg \to t\bar{t})}{d\hat{t}} = \frac{\pi\alpha_s^2}{8\hat{s}^2} \Biggl[
\frac{6(m^2 - \hat{t})(m^2 - \hat{u})}{\hat{s}^2}
- \frac{m^2(\hat{s} - 4m^2)}{3(m^2 - \hat{t})(m^2 - \hat{u})}
+ \frac{4[(m^2 - \hat{t})(m^2 - \hat{u}) - 2m^2(m^2 + \hat{t})]}{3(m^2 - \hat{t})^2}
+ \frac{4[(m^2 - \hat{t})(m^2 - \hat{u}) - 2m^2(m^2 + \hat{u})]}{3(m^2 - \hat{u})^2}
- \frac{3[(m^2 - \hat{t})(m^2 - \hat{u}) + m^2(\hat{u} - \hat{t})]}{\hat{s}(m^2 - \hat{t})}
- \frac{3[(m^2 - \hat{t})(m^2 - \hat{u}) + m^2(\hat{t} - \hat{u})]}{\hat{s}(m^2 - \hat{u})}
\Biggr], \tag{8.209}
\]

where m is the mass of the top quark. Even these leading-order partonic differential
cross-sections depend implicitly on the renormalization scale μ, through the
renormalized coupling α_s(μ).
In order to find the total cross-section, one can first integrate the partonic
cross-sections with respect to t̂; this is equivalent to integrating over the final-state
top quark angle θ̂ in the partonic COM frame, since they are related linearly by

\[ \hat{t} = m_t^2 + \frac{\hat{s}}{2}\left(-1 + \cos\hat{\theta}\,\sqrt{1 - 4m_t^2/\hat{s}}\right). \tag{8.210} \]

Therefore, for each partonic process one has

\[ \hat{\sigma} = \int_{\hat{t}_{\rm min}}^{\hat{t}_{\rm max}} \frac{d\hat{\sigma}}{d\hat{t}}\, d\hat{t}, \tag{8.211} \]

where

\[ \hat{t}_{\rm max,min} = m_t^2 + \frac{\hat{s}}{2}\left(-1 \pm \sqrt{1 - 4m_t^2/\hat{s}}\right). \tag{8.212} \]
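
As a minimal numerical sketch of (8.209)-(8.212), the following Python code evaluates the leading-order gluon-gluon differential cross-section and integrates it over t̂. The function names, the midpoint-rule integrator, and the number of steps are illustrative assumptions, not part of the text; û is obtained from the standard relation ŝ + t̂ + û = 2m² for massless gluons and a top-antitop final state.

```python
import math

def dsigma_dt_gg_ttbar(s_hat, t_hat, m, alpha_s):
    """Leading-order d(sigma-hat)/d(t-hat) for g g -> t tbar, following (8.209)."""
    u_hat = 2.0 * m**2 - s_hat - t_hat      # assumed relation: s + t + u = 2 m^2
    a = m**2 - t_hat
    b = m**2 - u_hat
    bracket = (
        6.0 * a * b / s_hat**2
        - m**2 * (s_hat - 4.0 * m**2) / (3.0 * a * b)
        + 4.0 * (a * b - 2.0 * m**2 * (m**2 + t_hat)) / (3.0 * a**2)
        + 4.0 * (a * b - 2.0 * m**2 * (m**2 + u_hat)) / (3.0 * b**2)
        - 3.0 * (a * b + m**2 * (u_hat - t_hat)) / (s_hat * a)
        - 3.0 * (a * b + m**2 * (t_hat - u_hat)) / (s_hat * b)
    )
    return math.pi * alpha_s**2 / (8.0 * s_hat**2) * bracket

def t_limits(s_hat, m):
    """Integration limits (t_min, t_max) from (8.212)."""
    beta = math.sqrt(1.0 - 4.0 * m**2 / s_hat)
    return (m**2 + 0.5 * s_hat * (-1.0 - beta),
            m**2 + 0.5 * s_hat * (-1.0 + beta))

def sigma_hat_gg_ttbar(s_hat, m, alpha_s, n=2000):
    """Midpoint-rule estimate of the t-hat integral in (8.211)."""
    t_min, t_max = t_limits(s_hat, m)
    h = (t_max - t_min) / n
    return h * sum(
        dsigma_dt_gg_ttbar(s_hat, t_min + (i + 0.5) * h, m, alpha_s)
        for i in range(n)
    )
```

The result is in units of inverse mass squared if m and √ŝ are given in the same mass units. A useful sanity check is that the expression is symmetric under exchanging t̂ and û, as it must be for two identical gluons in the initial state.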

It is also useful to note that (8.205) implies

\[ x_b = \frac{\hat{s}}{x_a s}; \qquad dx_b = \frac{d\hat{s}}{x_a s}. \tag{8.213} \]

So instead of integrating over x_b, we can integrate over ŝ. The limits of integration
on ŝ are from ŝ_min = 4m_t² (the minimum required to make a top-antitop pair) to
ŝ_max = s (the maximum available from the proton and antiproton, corresponding to
x_a = x_b = 1). For a given ŝ, the range of x_a is from ŝ/s to 1. Relabeling x_a as just
x, we therefore have:

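\[
\sigma(p\bar{p} \to t\bar{t}) = \int_{4m_t^2}^{s} d\hat{s} \int_{\hat{s}/s}^{1} dx\, \frac{1}{xs} \Bigl\{
\hat{\sigma}(q\bar{q} \to t\bar{t}) \bigl[ u(x)u(\hat{s}/xs) + d(x)d(\hat{s}/xs)
+ \bar{u}(x)\bar{u}(\hat{s}/xs) + \bar{d}(x)\bar{d}(\hat{s}/xs) + 2 s(x)s(\hat{s}/xs) \bigr]
+ \hat{\sigma}(gg \to t\bar{t})\, g(x)g(\hat{s}/xs) \Bigr\}. \tag{8.214}
\]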
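
The change of variables from (x_a, x_b) to (ŝ, x) used in (8.214) can be checked numerically. The sketch below integrates the same weight function in both sets of variables with a crude midpoint rule; the function names, the grid size, and the test weights are illustrative assumptions, and `weight(xa, xb)` stands in for the product of PDFs and partonic cross-section.

```python
import math

def integrate_xa_xb(weight, tau_min, n=300):
    """Midpoint sum of weight(xa, xb) over 0 < xa, xb < 1 with xa*xb > tau_min."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        xa = (i + 0.5) * h
        for j in range(n):
            xb = (j + 0.5) * h
            if xa * xb > tau_min:
                total += weight(xa, xb) * h * h
    return total

def integrate_shat_x(weight, s, s_hat_min, n=300):
    """Same integral in the variables of (8.214), with measure d(s_hat) dx / (x s)."""
    total = 0.0
    hs = (s - s_hat_min) / n
    for i in range(n):
        s_hat = s_hat_min + (i + 0.5) * hs
        x_lo = s_hat / s                      # lower limit of the x integration
        hx = (1.0 - x_lo) / n
        for j in range(n):
            x = x_lo + (j + 0.5) * hx
            total += weight(x, s_hat / (x * s)) * hs * hx / (x * s)
    return total

if __name__ == "__main__":
    tau = 0.0312                              # 4 m_t^2 / s at the Tevatron
    flat = lambda xa, xb: 1.0
    print(integrate_xa_xb(flat, tau), integrate_shat_x(flat, 1.0, tau))
```

For a constant weight the integral is known in closed form, 1 − τ + τ ln τ with τ = ŝ_min/s, which gives a simple check on both versions.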

Using the CTEQ5L PDFs and m_t = 173 GeV, with √s = 1960 GeV, and computing
α_s(μ) using (8.165) starting from α_s(m_t) = 0.1082, and working with the
leading-order partonic cross-sections, the results as a function of the common
factorization and renormalization scale Q = μ look like:

[Figure: leading-order σ(t t̄) in pb at the Tevatron versus Q/m_t (from 0.2 to 2, logarithmic axes), showing the total and the separate u ū, d d̄, gg, d̄ d, ū u, and s s̄ + s̄ s contributions.]

Unfortunately, the accuracy of the above results, obtained with only leading order
partonic cross-sections and PDFs, is clearly not very high. Ideally, the lines should
be flat, but there is instead a strong dependence of the leading-order prediction on
Q = μ. The higher-order corrections to the quark-antiquark processes turn out to
be of order 10 to 20%, while the gluon-gluon process gets about a 70% correction
from its leading-order value at Q = μ = m_t. Accurate comparisons with experiment
require a much more detailed and sophisticated treatment of the higher-order effects,
including at least a next-to-leading order calculation of the partonic cross-sections.
Still, some useful information can be gleaned. Experience has shown that evaluating
the leading-order result at Q ∼ m_t/2 gives a decent estimate of the total
cross-section, although a principled justification for this scale choice is hard to make.
Also, the relative sizes of the parton-level contributions can be understood qualitatively
from the PDFs as follows. To produce a top-antitop pair, we must have ŝ > 4m_t², so
according to (8.205),

\[ x_a x_b > 4m_t^2/s = 4(173)^2/(1960)^2 = 0.0312. \tag{8.215} \]

So at least one of the momentum fractions must be larger than √0.0312 = 0.1765 for
m_t = 173 GeV. This means that the largest contributions come from the valence quarks.
Since there are roughly twice as many valence up quarks as down quarks in the proton
for a given x, and twice as many antiups as antidowns in the antiproton, the ratio of
top-antitop events produced from up-antiup should be about 4 times that from
down-antidown. The gluon-gluon contribution is suppressed in this case because most of
the gluons are at small x and do not have enough energy to make a top-antitop pair.
Finally, the contributions from sea partons (ū, d̄, s, s̄ in the proton, and u, d, s, s̄ in
the antiproton) are highly suppressed for the same reason.
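
The arithmetic behind (8.215) is simple enough to script. The following sketch (the function name `pair_threshold` is our own) returns the minimum value of x_a x_b needed to produce a pair of particles of mass m, together with its square root, which is the smallest value that the larger of the two momentum fractions can take.

```python
import math

def pair_threshold(m, sqrt_s):
    """Return (tau_min, x_min) for pair production of mass-m particles.

    tau_min = 4 m^2 / s is the lower bound on x_a * x_b from (8.205);
    x_min = sqrt(tau_min) is the least possible value of max(x_a, x_b).
    Both m and sqrt_s must be in the same units (GeV here).
    """
    tau_min = 4.0 * m**2 / sqrt_s**2
    return tau_min, math.sqrt(tau_min)

if __name__ == "__main__":
    tau, x_min = pair_threshold(173.0, 1960.0)   # top quarks at the Tevatron
    print(f"tau_min = {tau:.4f}, x_min = {x_min:.4f}")
```

The same function can be evaluated at any other collider energy by changing `sqrt_s`.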
Let us now consider t t̄ production for the Large Hadron Collider, a pp collider,
by taking into account the larger √s and the different parton distribution function
roles. Since both of the initial-state hadrons are protons, the formula for the total
cross-section is now:

\[
\sigma(pp \to t\bar{t}) = \int_{4m_t^2}^{s} d\hat{s} \int_{\hat{s}/s}^{1} dx\, \frac{1}{xs} \Bigl\{
\hat{\sigma}(q\bar{q} \to t\bar{t}) \bigl[ 2u(x)\bar{u}(\hat{s}/xs) + 2d(x)\bar{d}(\hat{s}/xs)
+ 2 s(x)\bar{s}(\hat{s}/xs) \bigr] + \hat{\sigma}(gg \to t\bar{t})\, g(x)g(\hat{s}/xs) \Bigr\}. \tag{8.216}
\]

The factors of 2 are present because each proton can contribute either the quark, or
the antiquark; then the contribution of the other proton is fixed. The gluon-gluon
contribution has the same form as in p p̄ collisions, because the gluon distribution is
identical in protons and in antiprotons.
Numerically integrating the above formula with a computer using the CTEQ5L
PDFs, one finds the leading-order results shown below, as a function of the common
factorization and renormalization scale Q = μ, for the case of √s = 14 TeV:
[Figure: leading-order σ(t t̄) in pb at the 14 TeV LHC versus Q/m_t (from 0.2 to 2, logarithmic axes), showing the total and the u ū + ū u, d d̄ + d̄ d, s s̄ + s̄ s, and c c̄ + c̄ c contributions.]

Again these leading-order results are highly dependent on Q = μ, and subject to
large corrections at next-to-leading order and beyond. The most striking feature of
the LHC result is that in contrast to the Tevatron situation, the gluon-gluon partonic
contribution is dominant over the quark-antiquark contributions for the LHC. This is
partly because all of the quark-antiquark contributions now require a sea antiquark
PDF, but that is not the main reason. The really important effect is that at very high
energies like at the LHC, the top quark can be considered light (!), and so one can
make top quarks using partons with much lower x. For example, with √s = 14 TeV, the
kinematic constraint on the longitudinal momentum fractions becomes

\[ x_a x_b > 4(173)^2/(14000)^2 = 0.000611, \tag{8.217} \]

so that now the larger of the two momentum fractions can be as low as
√0.000611 = 0.0247. At low x, we saw above that
the gluon distribution function is very large; one has plenty of gluons available with
less than 1/10 of the protons’ total energy, and they dominate over the quark and
antiquark PDFs. This is actually a common feature, and is why you sometimes hear
people somewhat whimsically call the LHC a “gluon collider”; with so much energy
available for the protons, many processes are dominated by the large gluon PDF at
low x. There are some processes that do not rely on gluons at all, however. We will
see one example in Sect. 8.9. Those processes are dominated by sea quarks at the
LHC. Also, many processes get a large contribution from gluon-quark scattering as
well, for example gluino-squark production in supersymmetry.
One can also look at the distribution of t t̄ production as a function of the total
invariant mass M_inv = √ŝ of the hard scattering process, by leaving the ŝ
integration in (8.214) and (8.216) unperformed. The resulting shape of the distribution
normalized by the total cross-section,

\[ \frac{1}{\sigma(t\bar{t})} \frac{d\sigma(t\bar{t})}{dM_{\rm inv}} = \frac{2\sqrt{\hat{s}}}{\sigma(t\bar{t})} \frac{d\sigma(t\bar{t})}{d\hat{s}}, \tag{8.218} \]

is shown below for the Tevatron and the LHC with √s = 14 TeV:

[Figure: normalized invariant-mass distributions (1/σ) dσ(t t̄)/dM_inv in GeV⁻¹ versus the t t̄ invariant mass M_inv from 300 to 1000 GeV, for the Tevatron and for the LHC at 14 TeV.]

The invariant mass distribution of the t t̄ system is peaked not far above 2m_t in
both cases, indicating that the top and antitop usually have only semi-relativistic
velocities. This is because the PDFs fall rapidly with increasing x, so the most
important contributions to the production cross-section occur when both x’s are
not very far above their minimum allowed values. At the LHC, the top and antitop
are likelier to be produced with higher energy than at the Tevatron, with a more
substantial tail at high mass.
8.8 Kinematics in Hadron-Hadron Scattering

Let us now consider the general problem of kinematics associated with hadron-
hadron collisions with underlying 2 → 2 parton scattering. To make things simple,
we will suppose all of the particles are essentially massless, so what we are about
to do does not work for t t̄ in the final state (but could be generalized to do so).
After doing a sum/average over spins, colors, and any other unobserved degrees of
freedom, we should be able to compute the differential cross-section for the partonic
event from its Feynman diagrams as:

\[ \frac{d\hat{\sigma}(ab \to 12)}{d\hat{t}}. \tag{8.219} \]

As we learned in Sect. 8.6, we can then write:

\[ d\sigma(hh' \to 12 + X) = \sum_{a,b} f_a^h(x_a)\, f_b^{h'}(x_b)\, \frac{d\hat{\sigma}(ab \to 12)}{d\hat{t}}\, d\hat{t}\, dx_a\, dx_b. \tag{8.220} \]

A cartoon picture of the scattering process in real space might look like:

There are several different ways to choose the kinematic variables describing the final
state. There are three significant degrees of freedom: two angles at which the final-
state particles emerge with respect to the collision axis, and one overall momentum
scale. (Once the magnitude of the momentum transverse to the beam for one particle
is specified, the other is determined.) The angular dependence about the collision
axis is trivial, so we can ignore it.
For example, we can use the following three variables: the momentum of particle 1
transverse to the collision axis, p_T; the total center-of-momentum energy of the
final-state partons, √ŝ; and the longitudinal rapidity of the two-parton system in the
lab frame, defined by

\[ Y = \frac{1}{2} \ln(x_a/x_b). \tag{8.221} \]


This may look like an obscure definition, but it is the rapidity (see Sect. 2) needed
to boost along the collision axis to get to the center-of-momentum frame for the
two-parton system. It is equal to 0 if the final-state particles are back-to-back, which
would occur in the special case that the initial-state partons have the same energy
in the lab frame. Instead of the variables (x_a, x_b, t̂), we can use the more directly
observable variables (ŝ, p_T², Y), or perhaps some subset of these with the others
integrated over. Working in the center-of-momentum frame of the partons, we can
write:

\[ p_a^\mu = (\hat{E}, 0, 0, \hat{E}), \tag{8.222} \]
\[ p_b^\mu = (\hat{E}, 0, 0, -\hat{E}), \tag{8.223} \]
\[ k_1^\mu = \left(\hat{E}, 0, p_T, \sqrt{\hat{E}^2 - p_T^2}\right), \tag{8.224} \]
\[ k_2^\mu = \left(\hat{E}, 0, -p_T, -\sqrt{\hat{E}^2 - p_T^2}\right), \tag{8.225} \]

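with

\[ \hat{s} = 4\hat{E}^2 = x_a x_b s, \tag{8.226} \]

from which it follows that

\[ \hat{t} = (p_a - k_1)^2 = \frac{\hat{s}}{2}\left(-1 + \sqrt{1 - 4p_T^2/\hat{s}}\right), \tag{8.227} \]
\[ \hat{u} = (p_a - k_2)^2 = \frac{\hat{s}}{2}\left(-1 - \sqrt{1 - 4p_T^2/\hat{s}}\right), \tag{8.228} \]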

and so

\[ p_T^2 = \frac{\hat{t}\hat{u}}{\hat{s}} = -\frac{\hat{t}(s x_a x_b + \hat{t})}{s x_a x_b}, \tag{8.229} \]

where the last equality uses û = −ŝ − t̂ = −s x_a x_b − t̂ for massless particles.
Making the change of variables (x_a, x_b, t̂) to (ŝ, p_T², Y) for a differential
cross-section requires

\[ d\hat{t}\, dx_a\, dx_b = J\, d\hat{s}\, d(p_T^2)\, dY, \tag{8.230} \]

where J is the determinant of the Jacobian matrix of the transformation. Evaluating
it, by taking the inverse of the determinant of its inverse, one finds:

\[ J = \left| \frac{\partial(\hat{s}, p_T^2, Y)}{\partial(\hat{t}, x_a, x_b)} \right|^{-1} = \frac{x_a x_b}{\hat{s} + 2\hat{t}}. \tag{8.231} \]

Therefore, (8.220) becomes:

\[ \frac{d\sigma(hh' \to 12 + X)}{d\hat{s}\, d(p_T^2)\, dY} = \frac{x_a x_b}{\hat{s} + 2\hat{t}} \sum_{a,b} f_a^h(x_a)\, f_b^{h'}(x_b)\, \frac{d\hat{\sigma}(ab \to 12)}{d\hat{t}}, \tag{8.232} \]

where the variables x_a, x_b, t̂ on the right-hand side are understood to be determined
in terms of ŝ, p_T², and Y by (8.226), (8.229), and (8.221).
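
To make the last statement concrete, here is a small sketch that maps (ŝ, p_T², Y) back to (x_a, x_b, t̂) for massless partons and evaluates the Jacobian factor of (8.231). The function names are our own, and we assume the root of (8.229) with t̂ = (ŝ/2)(−1 + √(1 − 4p_T²/ŝ)); the other root simply exchanges t̂ and û.

```python
import math

def parton_variables(s, s_hat, pT2, Y):
    """Return (xa, xb, t_hat, u_hat) from (s_hat, pT^2, Y) for massless 2 -> 2 scattering."""
    root_tau = math.sqrt(s_hat / s)           # sqrt(xa * xb), from s_hat = xa xb s
    xa = root_tau * math.exp(Y)               # inverts Y = (1/2) ln(xa / xb)
    xb = root_tau * math.exp(-Y)
    root = math.sqrt(1.0 - 4.0 * pT2 / s_hat)
    t_hat = 0.5 * s_hat * (-1.0 + root)       # assumed branch of the quadratic (8.229)
    u_hat = -s_hat - t_hat                    # massless kinematics
    return xa, xb, t_hat, u_hat

def jacobian(xa, xb, s_hat, t_hat):
    """Magnitude of the Jacobian factor J = xa xb / (s_hat + 2 t_hat) of (8.231)."""
    return xa * xb / abs(s_hat + 2.0 * t_hat)

if __name__ == "__main__":
    s = 13000.0**2                            # GeV^2, an example collider energy
    xa, xb, t_hat, u_hat = parton_variables(s, 1000.0**2, 200.0**2, 0.5)
    print(xa, xb, t_hat, u_hat, jacobian(xa, xb, 1000.0**2, t_hat))
```

Physical configurations require p_T² ≤ ŝ/4 and both x_a and x_b no larger than 1; the sketch does not enforce these conditions.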

8.9 Drell-Yan Scattering (ℓ⁺ℓ⁻ Production in Hadron Collisions)

As an example, let us consider the process of Drell-Yan scattering, which is the
production of lepton pairs in hadron-hadron collisions through a virtual photon:

\[ hh' \to \ell^+\ell^-. \tag{8.233} \]

This does not involve QCD as the hard partonic scattering, since the leptons ℓ = e,
μ, or τ are singlets under SU(3)_c color. However, it still depends on QCD, because
to evaluate it we need to know the PDFs for the quarks inside the hadrons. Since
gluons have no electric charge and do not couple to photons, the underlying partonic
process is always:

\[ q\bar{q} \to \ell^+\ell^-, \tag{8.234} \]

with the q coming from either h or h′. The cross-section for this, for √s ≪ m_Z and
not near a resonance, can be obtained by exactly the same methods as in (5.2.1) for
e⁺e⁻ → μ⁺μ⁻; we just need to remember to use the charge of the quark Q_q instead
of the charge of the electron, and to average over initial-state colors. The latter effect
leads to a suppression of 1/3; there is no reaction if the colors do not match. One
finds that the differential partonic cross-section is:

\[ \frac{d\hat{\sigma}(q\bar{q} \to \ell^+\ell^-)}{d\hat{t}} = \frac{2\pi\alpha^2 Q_q^2 (\hat{t}^2 + \hat{u}^2)}{3\hat{s}^4}. \tag{8.235} \]

Therefore, using û = −ŝ − t̂ for massless scattering, and writing separate
contributions from finding the quark, and the antiquark, in h:

\[
\frac{d\sigma(hh' \to \ell^+\ell^-)}{d\hat{s}\, dp_T^2\, dY} = \frac{2\pi\alpha^2}{3}\, \frac{\hat{t}^2 + (\hat{s} + \hat{t})^2}{\hat{s}^4(\hat{s} + 2\hat{t})}\, x_a x_b \sum_q Q_q^2 \left[ f_q^h(x_a) f_{\bar{q}}^{h'}(x_b) + f_{\bar{q}}^h(x_a) f_q^{h'}(x_b) \right]. \tag{8.236}
\]

This can be used to make a prediction for the experimental distribution of events
with respect to each of ŝ, p_T, and Y.

Alternatively, we can choose to integrate the partonic differential cross-section
with respect to t̂ first. Since t̂ determines the scattering angle with respect to the
collision axis, this will eliminate the integration over p_T². The total partonic
cross-section obtained by integrating (8.235) is:

\[ \hat{\sigma}(q\bar{q} \to \ell^+\ell^-) = \frac{4\pi\alpha^2 Q_q^2}{9\hat{s}}. \tag{8.237} \]
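
The step from (8.235) to (8.237) can be checked numerically. The sketch below integrates (8.235) over −ŝ ≤ t̂ ≤ 0, the range appropriate for massless scattering, using Simpson's rule, and compares the result with the closed form. The function names and the numerical value chosen for α are our own assumptions.

```python
import math

ALPHA = 1.0 / 137.036   # assumed low-energy value of the fine-structure constant

def dsigma_dt(s_hat, t_hat, Qq, alpha=ALPHA):
    """Partonic differential cross-section (8.235), with u_hat = -s_hat - t_hat."""
    u_hat = -s_hat - t_hat
    return 2.0 * math.pi * alpha**2 * Qq**2 * (t_hat**2 + u_hat**2) / (3.0 * s_hat**4)

def sigma_total_simpson(s_hat, Qq, alpha=ALPHA, n=100):
    """Simpson's-rule integral of (8.235) over t_hat in [-s_hat, 0]; n must be even."""
    a, b = -s_hat, 0.0
    h = (b - a) / n
    total = dsigma_dt(s_hat, a, Qq, alpha) + dsigma_dt(s_hat, b, Qq, alpha)
    for i in range(1, n):
        weight = 4.0 if i % 2 else 2.0
        total += weight * dsigma_dt(s_hat, a + i * h, Qq, alpha)
    return total * h / 3.0

def sigma_total_closed(s_hat, Qq, alpha=ALPHA):
    """Closed-form total partonic cross-section (8.237)."""
    return 4.0 * math.pi * alpha**2 * Qq**2 / (9.0 * s_hat)

if __name__ == "__main__":
    s_hat = 50.0**2      # GeV^2
    print(sigma_total_simpson(s_hat, 2.0 / 3.0), sigma_total_closed(s_hat, 2.0 / 3.0))
```

Because the integrand is quadratic in t̂, Simpson's rule is exact up to rounding, so the two numbers should agree to machine precision.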
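Therefore we get:

\[ d\sigma(hh' \to \ell^+\ell^-) = \frac{4\pi\alpha^2}{9\hat{s}}\, dx_a\, dx_b \sum_q Q_q^2 \left[ f_q^h(x_a) f_{\bar{q}}^{h'}(x_b) + f_{\bar{q}}^h(x_a) f_q^{h'}(x_b) \right]. \tag{8.238} \]

Now making the change of variables from (x_a, x_b) to (ŝ, Y) requires:

\[ dx_a\, dx_b = \left| \frac{\partial(\hat{s}, Y)}{\partial(x_a, x_b)} \right|^{-1} d\hat{s}\, dY = \frac{d\hat{s}\, dY}{s}, \tag{8.239} \]

so we have:

\[ \frac{d\sigma(hh' \to \ell^+\ell^-)}{d\hat{s}\, dY} = \frac{4\pi\alpha^2}{9 s \hat{s}} \sum_q Q_q^2 \left[ f_q^h(x_a) f_{\bar{q}}^{h'}(x_b) + f_{\bar{q}}^h(x_a) f_q^{h'}(x_b) \right]. \tag{8.240} \]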

Still another way to present the result is to leave only ŝ unintegrated, by again first
integrating the partonic differential cross-section with respect to t̂, and then trading
one of the Feynman-x variables for ŝ, and doing the remaining x-integration. This is
how we wrote the top-antitop total cross-section. The Jacobian factor in the change
of variables from (x_a, x_b) to (x_a, ŝ) is now:

\[ dx_a\, dx_b = \left| \frac{\partial\hat{s}}{\partial x_b} \right|^{-1} dx_a\, d\hat{s} = \frac{dx_a\, d\hat{s}}{x_a s}. \tag{8.241} \]

Using this in (8.238) now gives, after replacing x_a by x and integrating:

\[
\frac{d\sigma(hh' \to \ell^+\ell^-)}{d\hat{s}} = \frac{4\pi\alpha^2}{9\hat{s}^2} \sum_q Q_q^2 \int_{\hat{s}/s}^{1} dx\, \frac{\hat{s}}{xs} \left[ f_q^h(x) f_{\bar{q}}^{h'}(\hat{s}/xs) + f_{\bar{q}}^h(x) f_q^{h'}(\hat{s}/xs) \right]. \tag{8.242}
\]

This version makes a nice prediction that is (almost) independent of the actual parton
distribution functions. The right-hand side could have depended on both ŝ and s in
an arbitrary way, but to the extent that the PDFs are independent of Q, we see that it
is predicted to scale like 1/ŝ² times some function of the ratio ŝ/s. Since the PDFs
run slowly with Q, this is a reasonably good prediction. Drell-Yan scattering has
been studied in hh′ = p p̄, pp, π±p, and K±p scattering experiments, and in each
case the results indeed satisfy the scaling law:

\[ \hat{s}^2\, \frac{d\sigma(hh' \to \ell^+\ell^-)}{d\hat{s}} = F_{hh'}(\hat{s}/s) \tag{8.243} \]

to a very good approximation, for low ŝ not near a resonance. Furthermore, the
function F_{hh′} gives information about the PDFs.
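
The scaling law (8.243) can be illustrated with a toy calculation. The sketch below evaluates a single-flavor version of (8.242) for two identical hadrons, using made-up, Q-independent PDF shapes (`toy_quark` and `toy_antiquark` are hypothetical and are not fits to data), and shows that ŝ² dσ/dŝ depends on ŝ and s only through the ratio ŝ/s.

```python
import math

ALPHA = 1.0 / 137.036   # assumed low-energy value of the fine-structure constant

def toy_quark(x):
    """Hypothetical quark PDF shape, for illustration only."""
    return 2.0 * (1.0 - x) ** 3 / math.sqrt(x)

def toy_antiquark(x):
    """Hypothetical antiquark PDF shape, for illustration only."""
    return 0.2 * (1.0 - x) ** 7 / x

def dsigma_dshat(s_hat, s, Qq=2.0 / 3.0, n=2000):
    """Single-flavor version of (8.242) with h = h', midpoint rule in x."""
    tau = s_hat / s
    h = (1.0 - tau) / n
    total = 0.0
    for i in range(n):
        x = tau + (i + 0.5) * h
        y = tau / x                           # momentum fraction of the other parton
        total += (tau / x) * (toy_quark(x) * toy_antiquark(y)
                              + toy_antiquark(x) * toy_quark(y)) * h
    return 4.0 * math.pi * ALPHA**2 * Qq**2 / (9.0 * s_hat**2) * total

def scaling_function(s_hat, s):
    """Left-hand side of (8.243): s_hat^2 times d(sigma)/d(s_hat)."""
    return s_hat**2 * dsigma_dshat(s_hat, s)

if __name__ == "__main__":
    # Same ratio s_hat/s = 0.01 at two different collider energies.
    print(scaling_function(100.0, 1.0e4), scaling_function(400.0, 4.0e4))
```

With real PDFs the scaling is only approximate, because of their slow dependence on Q.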
Because of the relatively clean signals of muons in particle detectors, the Drell-Yan
process

\[ p\bar{p} \to \mu^-\mu^+ + X \quad \text{or} \quad pp \to \mu^-\mu^+ + X \tag{8.244} \]

is often one of the first things one studies at a hadron collider to make sure everything
is working correctly and understood. It also provides a test of the PDFs, especially
at small x.
For larger √s, one must take into account the s-channel Feynman diagram with
a Z boson in place of the virtual photon. The resulting cross-section can be obtained
from (8.242) by replacing

\[
\frac{4\pi\alpha^2}{9\hat{s}^2} \sum_q Q_q^2 \;\to\; \frac{4\pi\alpha^2}{9} \sum_q \left[ \frac{Q_q^2}{\hat{s}^2}
+ \frac{(V_q^2 + A_q^2)(V_\ell^2 + A_\ell^2)}{(\hat{s} - m_Z^2)^2 + m_Z^2\Gamma_Z^2}
- \frac{2 Q_q V_q V_\ell (1 - m_Z^2/\hat{s})}{(\hat{s} - m_Z^2)^2 + m_Z^2\Gamma_Z^2} \right], \tag{8.245}
\]

where V_f and A_f are coupling coefficients associated with the Z–fermion–antifermion
interaction vertex, and will be specified explicitly at the end of Sect.
10.1. The first term is from the virtual photon contribution to the matrix element,
while the second comes from the virtual Z boson contribution, and features a
Breit-Wigner resonance denominator from the Z boson propagator, which depends on the
mass and width m_Z = 91.188 GeV and Γ_Z = 2.495 GeV. The last term is due to the
interference between these two amplitudes.
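
As a small illustration of the Breit-Wigner denominator in (8.245), the following sketch evaluates the resonance factor 1/[(ŝ − m_Z²)² + m_Z²Γ_Z²] and locates the points where it falls to half of its peak value. The function names are our own; only the mass and width quoted above are taken from the text.

```python
import math

M_Z = 91.188      # GeV, Z boson mass quoted in the text
GAMMA_Z = 2.495   # GeV, Z boson width quoted in the text

def breit_wigner(s_hat, m=M_Z, gamma=GAMMA_Z):
    """Resonance factor 1 / [(s_hat - m^2)^2 + m^2 gamma^2] appearing in (8.245)."""
    return 1.0 / ((s_hat - m**2) ** 2 + m**2 * gamma**2)

def half_maximum_masses(m=M_Z, gamma=GAMMA_Z):
    """Invariant masses sqrt(s_hat) at which the factor is half of its peak value."""
    # The denominator doubles when (s_hat - m^2)^2 = m^2 gamma^2.
    return math.sqrt(m**2 - m * gamma), math.sqrt(m**2 + m * gamma)

if __name__ == "__main__":
    low, high = half_maximum_masses()
    print(f"half-maximum points: {low:.3f} GeV and {high:.3f} GeV")
    print(f"full width at half maximum: {high - low:.3f} GeV")
```

The full width at half maximum in √ŝ comes out very close to Γ_Z, which is why Γ_Z is called the width of the resonance.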
At the end of 2010, the CMS and ATLAS LHC detector collaborations released
their measurements of the μ− μ+ and e− e+ invariant mass distributions. The dimuon
plot from CMS is shown in Fig. 8.2.
Clearly visible are the effects of the η, ρ, ω, φ, J/ψ, ψ′, and ϒ 1S, 2S, and 3S
meson resonances, the Z boson resonance, and the general decrease of the
cross-section with √ŝ (the invariant mass of the final state). The distribution includes the
effects of the intrinsic widths of the resonances as well as detector resolution effects
and trigger and detector efficiencies.

Fig. 8.2 The production of dimuon pairs as a function of μ⁺μ⁻ invariant mass (from CMS DP-2018/055)

Problems

1. Each element g of the SU(2) group can be parametrized by three real parameters
x_a, where

\[
g = g(x_1, x_2, x_3) = e^{i x_b T^b} =
\begin{pmatrix} [1 - x^2]^{1/2} + i x_1 & x_2 + i x_3 \\ -x_2 + i x_3 & [1 - x^2]^{1/2} - i x_1 \end{pmatrix},
\]

where T^k are the corresponding SU(2) Lie algebra generators and the x_k take on
values

\[ x^2 = x_1^2 + x_2^2 + x_3^2 \le 1. \tag{8.246} \]

Note that the identity element corresponds to x² = 0, and so small deviations away
from the identity have |x| ≪ 1.
(a) Determine what the generators T^k are by

\[ i T^k = \left. \frac{dg}{dx_k} \right|_{x_1 = x_2 = x_3 = 0} \tag{8.247} \]

(b) Determine the commutation relations of T k .

(c) How are the T k ’s related to the Pauli matrices?
(d) Are the T k generators in a particular representation of the Lie algebra, and if
so, what representation?

2. (a) Compute the QCD cross-section σ(u ū → f f̄), where f ≠ u, for collision
center-of-mass energy √s. Assume m_f = 0 and m_u = 0 for this problem, and u is the
up-quark of QCD. Represent your final answer not in terms of the SU(3) generators
but in terms of “group and representation data”, such as I_R(f), C(R), C(G),
d_G, d_R, etc. Now, compute precisely what you get for the cases of (b) f being in
the fundamental representation and (c) f being in the adjoint representation of
SU(3).
3. Consider the EHLQ PDFs⁶

\[ x u_v(x) = 1.78\, x^{0.5} (1 - x^{1.51})^{3.5} \tag{8.248} \]
\[ x d_v(x) = 0.67\, x^{0.4} (1 - x^{1.51})^{4.5} \tag{8.249} \]
\[ x u_s(x) = x d_s(x) = 0.182\, (1 - x)^{8.54} \tag{8.250} \]
\[ x s_s(x) = 0.081\, (1 - x)^{8.54} \tag{8.251} \]
\[ x g(x) = (2.62 + 9.17\, x)(1 - x)^{5.90} \tag{8.252} \]

and q̄_s(x) = q_s(x) for every q = {u, d, s}. In addition, ū_v(x) = d̄_v(x) = 0. The
subscripts refer to valence (q_v(x)) and sea (q_s(x)) quarks. These PDFs depend
on Q very weakly (u(x) = u(x, Q)), but ignore that dependence here.
This problem below is to be done numerically. All results should be given to three
significant digits (e.g., 1.38).
(a) Compute the number of valence up quarks and number of valence down
quarks in the proton according to these PDFs.
(b) Compute the total fraction of momentum carried by all the partons.
(c) What is the total momentum carried by the gluons?
(d) Compute the total fraction of momentum carried by strange quarks (s and s̄).

4. Consider Drell-Yan production pp → μ+ μ− at LHC, which is pp collisions at

13 TeV center of mass energy.

(a) Compute the relevant parton-level differential cross-sections d σ̂ /d tˆ for u ū →

μ+ μ− and d d̄ → μ+ μ− . Ignore the Z boson and just pretend that only the
photon is exchanged for these processes.
(b) Write down the full integral of these partonic cross-sections over the parton
distribution functions for the total cross-section of μ⁺μ⁻ with invariant mass
m_{μ⁺μ⁻} > M_cut.
(c) Integrate the result of part (b) using EHLQ PDFs (given in previous problem)
and give σ values for Mcut = 10 GeV, 100 GeV and 1000 GeV in units of fb.

6 Eichten et al., “Supercollider Physics.” Rev. Mod. Phys. 1984.

5. Consider all of the 2 → 2 partonic processes in QCD:

\[ qq \to qq, \quad qq' \to qq', \quad \bar{q}\bar{q} \to \bar{q}\bar{q}, \quad \bar{q}\bar{q}' \to \bar{q}\bar{q}', \tag{8.253} \]
\[ q\bar{q} \to gg, \quad q\bar{q} \to q\bar{q}, \quad q\bar{q} \to q'\bar{q}', \quad q\bar{q}' \to q\bar{q}', \tag{8.254} \]
\[ qg \to qg, \quad \bar{q}g \to \bar{q}g, \quad gg \to gg, \quad gg \to q\bar{q}, \tag{8.255} \]

where q represents any fixed quark flavor, and q′ represents a quark flavor that
is definitely different from q. For the purposes of this problem, we consider only
massless partons (those very light compared to the partonic √ŝ), and so do not
consider processes involving top quarks or antiquarks.
Several of the processes in this list actually have the same tree-level differential
cross-sections:

\[ \frac{d\hat{\sigma}(qq' \to qq')}{d\hat{t}} = \frac{d\hat{\sigma}(\bar{q}\bar{q}' \to \bar{q}\bar{q}')}{d\hat{t}} = \frac{d\hat{\sigma}(q\bar{q}' \to q\bar{q}')}{d\hat{t}}, \tag{8.256} \]
\[ \frac{d\hat{\sigma}(qq \to qq)}{d\hat{t}} = \frac{d\hat{\sigma}(\bar{q}\bar{q} \to \bar{q}\bar{q})}{d\hat{t}}, \tag{8.257} \]
\[ \frac{d\hat{\sigma}(qg \to qg)}{d\hat{t}} = \frac{d\hat{\sigma}(\bar{q}g \to \bar{q}g)}{d\hat{t}}. \tag{8.258} \]

So, there are really only 8 independent parton-level cross-sections in terms of
which one can express the leading-order two-jet production cross-section for
hadron colliders including the Tevatron and the LHC. Assume they are all known.

(a) Find an integral expression for the total Tevatron cross-section for dijet
production in p p̄ collisions at leading order in QCD in terms of the 8 independent
parton-level total cross-sections and the PDFs g(x), u(x), d(x), ū(x), d̄(x), s(x).
(Neglect charm and bottom PDFs; they are small.) I'll start by including the
contributions for three of the 8 independent parton-level cross-sections, and
you fill in the contributions for the other 5:

\[
\sigma(p\bar{p} \to jj + X) = \int_0^s d\hat{s} \int_0^1 \frac{dx}{xs} \Bigl\{ g(x)g(\hat{s}/sx) \Bigl[ \hat{\sigma}(gg \to gg) + \sum_q \hat{\sigma}(gg \to q\bar{q}) \Bigr]
+ \bigl[ u(x)\bar{u}(\hat{s}/sx) + d(x)\bar{d}(\hat{s}/sx) + s(x)s(\hat{s}/sx) \bigr]\, 2\hat{\sigma}(qq \to qq) + \;? \Bigr\}. \tag{8.259}
\]

Here √s is the proton-antiproton collision energy in their center-of-momentum
frame, and √ŝ is the parton-parton collision energy in their center-of-momentum
frame.
(b) Do the same for pp collisions relevant for the LHC. (The answer is different.)

6. Consider the parton-level process involving scattering of a quark with its
antiquark: q q̄ → q q̄. (Note that this is one of the partonic processes that appeared
in the previous problem.) We'll be calculating this to leading order in QCD,
which means that you should only include diagrams with gluon exchange; photon
exchange is relatively negligible. You will probably find Sects. 5.2.4 and 9.2.1
to be useful references. In fact, the “non-color part” of the calculation in this
problem is essentially identical to the calculation of Bhabha scattering in QED,
so large chunks of the calculation can be simply adapted to the present problem.
Therefore, the hard part is getting the color part correct; watch out for the fact
that the color factor for the interference term is different from the color factor for
the non-interference terms in part (b). Treat the quark q as massless.
(a) Draw the two Feynman diagrams, label them appropriately, and make a cor-
responding list of the initial and final state partons with their momenta, spin,
spinor, and color index. Write down the reduced matrix element, using the
QCD Feynman rules on page 195 and working in Feynman gauge (ξ = 1).
(b) Compute the reduced matrix element squared, summed (averaged) over final
(initial) spins and colors. You should find:

\[
\left(\frac{1}{3}\right)^2 \left(\frac{1}{2}\right)^2 \sum_{\rm colors} \sum_{\rm spins} |\mathcal{M}|^2 = g_3^4 \left[ \frac{4}{9}\, \frac{\hat{s}^2 + \hat{u}^2}{\hat{t}^2} + \frac{4}{9}\, \frac{\hat{t}^2 + \hat{u}^2}{\hat{s}^2} - n\, \frac{\hat{u}^2}{\hat{s}\hat{t}} \right] \tag{8.260}
\]

where n is a certain positive rational number that you will find, and ŝ, t̂, û are
the partonic Mandelstam variables.
[Hint: You will need to compute Tr[T^a T^b T^a T^b], with the adjoint indices a
and b implicitly summed over. This can be done by using equations (8.1.16),
(8.1.59), (8.1.61), and (8.1.67).]

(c) Find the parton-level differential cross-section dσ̂(q q̄ → q q̄)/dt̂.
7. Use crossing symmetry to find some others of the 2 → 2 partonic differential
cross-sections mentioned in Problem 5:
(a) Using the results of part (b) of the previous problem, find dσ̂(qq → qq)/dt̂.
(b) Using the result for qq′ → qq′ in equation (9.2.16), find both
dσ̂(q q̄′ → q q̄′)/dt̂ and dσ̂(q q̄ → q′ q̄′)/dt̂. [Hint: don't worry, nothing weird
happens with the color factors in this problem.]
9 Spontaneous Symmetry Breaking

9.1 Global Symmetry Breaking

Not all of the symmetries of the laws of physics are evident in the state that describes
a physical system, or even in the vacuum state with no particles. For example, in
condensed matter physics, the ground state of a ferromagnetic system involves a
magnetization vector that points in some particular direction, even though Maxwell’s
equations in matter do not contain any special direction. This is because it is ener-
getically favorable for the magnetic moments in the material to line up, rather than
remaining randomized. The state with randomized magnetic moments is unstable
to small perturbations, like a stick balanced on one end, and will settle in the more
energetically-favored magnetized state.
The situation in which the laws of physics are invariant under some symmetry
transformations, but the vacuum state is not, is called spontaneous symmetry break-
ing. In this section we will study how this works in quantum field theory. There
are two types of continuous symmetry transformations; global, in which the trans-
formation does not depend on position in spacetime, and local (or gauge) in which
the transformation can be different at each point. We will work out how sponta-
neous symmetry breaking works in each of these cases, using the example of a U (1)
symmetry, and then guess the generalizations to non-Abelian symmetries.
Consider a complex scalar field φ(x) with a Lagrangian density:

\[ \mathcal{L} = \partial^\mu \phi^* \partial_\mu \phi - V(\phi, \phi^*), \tag{9.1} \]

with potential energy

\[ V(\phi, \phi^*) = m^2 \phi^*\phi + \lambda(\phi^*\phi)^2, \tag{9.2} \]

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_9

where m² and λ are parameters of the theory. This Lagrangian is invariant under the
global U(1) transformations:

\[ \phi(x) \to \phi'(x) = e^{i\alpha} \phi(x) \tag{9.3} \]

where α is any constant. The classical equations of motion for φ and φ* following
from ℒ are (see (4.25)):

\[ \partial^\mu \partial_\mu \phi + \frac{\delta V}{\delta \phi^*} = 0, \tag{9.4} \]
\[ \partial^\mu \partial_\mu \phi^* + \frac{\delta V}{\delta \phi} = 0. \tag{9.5} \]

Clearly, there is a solution with ∂_μφ = ∂_μφ* = 0, where φ(x) is just equal to any
constant that minimizes the potential.
If m² > 0 and λ > 0, then the minimum of the potential is at φ = 0. The quantum
mechanical counterpart of this statement is that the ground state of the system will
be one in which the expectation value of φ(x) vanishes:

\[ \langle 0|\phi(x)|0\rangle = 0. \tag{9.6} \]

The scalar particles created and destroyed by the field φ(x) correspond to quantized
oscillations of φ(x) about the minimum of the potential. They have squared mass
equal to m 2 , and interact with a four-scalar vertex proportional to λ.
Let us now consider what happens if the signs of the parameters m 2 and λ are
different. If λ < 0, then the potential V (φ, φ ∗ ) is unbounded from below for arbi-
trarily large |φ|. This cannot lead to an acceptable theory. Classically there would
be runaway solutions in which |φ(x)| → ∞, gaining an infinite amount of kinetic
energy. The quantum mechanical counterpart of this statement is that the expectation
value of φ(x) will grow without bound.
However, there is nothing wrong with the theory if m² < 0 and λ > 0. (One should
think of m² as simply a parameter that appears in the Lagrangian density, and not
as the square of some mythical real number m.) In that case, the potential V(φ, φ*)
has a local maximum at φ = 0, and a degenerate set of minima with

\[ |\phi_{\rm min}|^2 = \frac{v^2}{2}, \tag{9.7} \]

where we have defined:

\[ v = \sqrt{-m^2/\lambda}. \tag{9.8} \]
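
The location of the minimum can be confirmed numerically. The sketch below evaluates the potential (9.2) as a function of |φ| and scans for its minimum; the function names, the scan range, and the sample parameter values are our own illustrative choices.

```python
import math

def potential(phi_abs, m2, lam):
    """V of (9.2) as a function of |phi|: m^2 |phi|^2 + lambda |phi|^4."""
    return m2 * phi_abs**2 + lam * phi_abs**4

def vev(m2, lam):
    """v = sqrt(-m^2 / lambda) from (9.8); requires m^2 < 0 and lambda > 0."""
    if m2 >= 0 or lam <= 0:
        raise ValueError("spontaneous symmetry breaking needs m2 < 0 and lam > 0")
    return math.sqrt(-m2 / lam)

def scan_minimum(m2, lam, phi_max=3.0, n=3000):
    """Brute-force scan for the |phi| that minimizes V on a uniform grid."""
    best = min(range(n + 1), key=lambda i: potential(i * phi_max / n, m2, lam))
    return best * phi_max / n

if __name__ == "__main__":
    m2, lam = -1.0, 0.5                       # sample parameters in arbitrary units
    v = vev(m2, lam)
    print("v / sqrt(2)  =", v / math.sqrt(2.0))
    print("scan minimum =", scan_minimum(m2, lam))
```

At the minimum the potential takes the value −λv⁴/4, which is below its value V = 0 at the local maximum φ = 0.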

The potential V does not depend on the phase of φ(x) at all, so it is impossible
to unambiguously determine the phase of φ(x) at the minimum. However, by an
arbitrary choice, we can make Im(φ_min) = 0. In quantum mechanics, the system
will have a ground state in which the expectation value of φ(x) is constant and equal
to the classical minimum:

\[ \langle 0|\phi(x)|0\rangle = \frac{v}{\sqrt{2}}. \tag{9.9} \]

The quantity v is a measurable property of the vacuum state, known as the vacuum
expectation value, or VEV, of φ(x). If we now ask what the VEV of φ is after
performing a U(1) transformation of the form (9.3), we find:

\[ \langle 0|\phi'(x)|0\rangle = e^{i\alpha} \langle 0|\phi(x)|0\rangle = e^{i\alpha} \frac{v}{\sqrt{2}} \neq \frac{v}{\sqrt{2}}. \tag{9.10} \]

The VEV is not invariant under the U (1) symmetry operation acting on the fields of
the theory; this reflects the fact that we had to make an arbitrary choice of phase. One
cannot restore the invariance by defining the symmetry operation to also multiply
|0⟩ by a phase, since ⟨0| will rotate by the opposite phase, canceling out of (9.10).
Therefore, the vacuum state must not be invariant under the global U (1) symmetry
rotation, and the symmetry is spontaneously broken. The sign of the parameter m 2 is
what determines whether or not spontaneous symmetry breaking takes place in the
theory.
In order to further understand the behavior of this theory, it is convenient to rewrite
the scalar field in terms of its deviation from its VEV. One way to do this is to write:

\[ \phi(x) = \frac{1}{\sqrt{2}} [v + R(x) + i I(x)], \tag{9.11} \]
\[ \phi^*(x) = \frac{1}{\sqrt{2}} [v + R(x) - i I(x)], \tag{9.12} \]

where R and I are each real scalar fields, representing the real and imaginary parts
of φ. The derivative part of the Lagrangian can now be rewritten in terms of R and
I, as:

\[ \mathcal{L} = \frac{1}{2} \partial^\mu R\, \partial_\mu R + \frac{1}{2} \partial^\mu I\, \partial_\mu I. \tag{9.13} \]
The potential appearing in the Lagrangian can be found in terms of R and I most
easily by noticing that it can be rewritten as

V (φ, φ ∗ ) = λ(φ ∗ φ − v 2 /2)2 − λv 4 /4. (9.14)

Dropping the last term that does not depend on the fields, and plugging in (9.11) and
(9.12), this becomes:

$$V(R, I) = \frac{\lambda}{4}\left[(v + R)^2 + I^2 - v^2\right]^2 \tag{9.15}$$
$$= \lambda v^2 R^2 + \lambda v R(R^2 + I^2) + \frac{\lambda}{4}(R^2 + I^2)^2. \tag{9.16}$$

Comparing this expression with our previous discussion of real scalar fields in Chap.
4, we can interpret the terms proportional to λv as R R R and R I I interaction vertices,
and the last term proportional to λ as R R R R, R R I I , and I I I I interaction vertices.
The first term proportional to λv² is a mass term for R, but there is no term quadratic
in I, so I corresponds to a massless real scalar particle. Comparing to the
Klein-Gordon Lagrangian density of (4.18), we can identify the physical particle masses:

$$m_R^2 = 2\lambda v^2 = -2m^2, \tag{9.17}$$
$$m_I^2 = 0. \tag{9.18}$$
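
As a quick numerical cross-check of (9.17) and (9.18), the squared masses can be read off as the curvatures of V(R, I) at the minimum R = I = 0. The sketch below is only illustrative: the values λ = 0.5 and v = 2 are assumptions chosen for the example (they do not come from the text), and `second_derivative` is a hypothetical helper based on central finite differences.

```python
def potential(R, I, lam, v):
    """V(R, I) from (9.15): (lam / 4) * [(v + R)^2 + I^2 - v^2]^2."""
    return 0.25 * lam * ((v + R) ** 2 + I ** 2 - v ** 2) ** 2


def second_derivative(f, x0, step=1e-3):
    """Central finite-difference estimate of f''(x0) (hypothetical helper)."""
    return (f(x0 + step) - 2.0 * f(x0) + f(x0 - step)) / step ** 2


def scalar_masses_squared(lam, v):
    """Return (m_R^2, m_I^2) as the curvatures of V at R = I = 0.

    A real scalar mass term is (1/2) m^2 field^2, so m^2 = d^2 V / d field^2.
    """
    m_R_sq = second_derivative(lambda R: potential(R, 0.0, lam, v), 0.0)
    m_I_sq = second_derivative(lambda I: potential(0.0, I, lam, v), 0.0)
    return m_R_sq, m_I_sq


lam, v = 0.5, 2.0  # illustrative values only, not from the text
m_R_sq, m_I_sq = scalar_masses_squared(lam, v)
print("m_R^2 (numerical):", m_R_sq, " expected 2*lam*v^2 =", 2 * lam * v ** 2)
print("m_I^2 (numerical):", m_I_sq, " expected 0")
```

Up to the small finite-difference error, the first number reproduces 2λv² and the second is consistent with zero, as expected for the massless mode.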

It is useful to redo this analysis in a slightly different way, by writing

$$\phi(x) = \frac{1}{\sqrt{2}}\left[v + h(x)\right]e^{iG(x)/v}, \tag{9.19}$$
$$\phi^*(x) = \frac{1}{\sqrt{2}}\left[v + h(x)\right]e^{-iG(x)/v}, \tag{9.20}$$

instead of (9.11), (9.12). Again h(x) and G(x) are two real scalar fields, related to
R(x) and I (x) by a non-linear functional transformation. In terms of these fields,
the potential is:

$$V(h) = \lambda v^2 h^2 + \lambda v h^3 + \frac{\lambda}{4}h^4. \tag{9.21}$$
Notice that the field G does not appear in V at all. This is because G just corre-
sponds to the phase of φ, and the potential was chosen to be invariant under U (1)
phase transformations. However, G does have interactions coming from the part
of the Lagrangian density containing derivatives. To find the derivative part of the
Lagrangian, we compute:

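$$\partial_\mu\phi = \frac{1}{\sqrt{2}}e^{iG/v}\left[\partial_\mu h + \frac{i(v + h)}{v}\partial_\mu G\right], \tag{9.22}$$
$$\partial_\mu\phi^* = \frac{1}{\sqrt{2}}e^{-iG/v}\left[\partial_\mu h - \frac{i(v + h)}{v}\partial_\mu G\right], \tag{9.23}$$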

so that:

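$$\mathcal{L}_{\rm derivatives} = \frac{1}{2}\partial^\mu h\,\partial_\mu h + \frac{1}{2}\left(1 + \frac{h}{v}\right)^2\partial^\mu G\,\partial_\mu G. \tag{9.24}$$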
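
The step from (9.22) and (9.23) to (9.24) can be written out explicitly. The following is a short worked sketch: the phases e^{±iG/v} cancel in the product, and the two cross terms between the h and G derivatives are equal and opposite.

```latex
\begin{align}
\partial^\mu\phi^*\,\partial_\mu\phi
  &= \frac{1}{2}\left[\partial^\mu h - \frac{i(v+h)}{v}\,\partial^\mu G\right]
     \left[\partial_\mu h + \frac{i(v+h)}{v}\,\partial_\mu G\right] \\
  &= \frac{1}{2}\,\partial^\mu h\,\partial_\mu h
     + \frac{1}{2}\,\frac{(v+h)^2}{v^2}\,\partial^\mu G\,\partial_\mu G .
\end{align}
```

Since (v + h)²/v² = (1 + h/v)², this is the form quoted in (9.24).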

The quadratic part of the Lagrangian, which determines the propagators for h and
G, is

$$\mathcal{L}_{\rm quadratic} = \frac{1}{2}\partial^\mu h\,\partial_\mu h - \frac{1}{2}m_h^2 h^2 + \frac{1}{2}\partial^\mu G\,\partial_\mu G, \tag{9.25}$$

with

$$m_h^2 = 2\lambda v^2, \tag{9.26}$$
$$m_G^2 = 0. \tag{9.27}$$

This confirms the previous result that the spectrum of particles consists of a mas-
sive real scalar (h) and a massless one (G). The interaction part of the Lagrangian
following from (9.21) and (9.24) is:

$$\mathcal{L}_{\rm int} = \left(\frac{1}{v}h + \frac{1}{2v^2}h^2\right)\partial^\mu G\,\partial_\mu G - \lambda v h^3 - \frac{\lambda}{4}h^4. \tag{9.28}$$

The field and particle represented by G is known as a Nambu-Goldstone boson
(or sometimes just a Goldstone boson). The original U(1) symmetry acts on G by
shifting it by a constant that depends on the VEV:

$$G \to G' = G + \alpha v; \tag{9.29}$$
$$h \to h' = h. \tag{9.30}$$

This explains why G only appears in the Lagrangian with derivatives acting on it. In
general, a broken global symmetry is always signaled by the presence of a massless
Nambu-Goldstone boson with only derivative interactions. This is an example of
Goldstone’s theorem, which we will state in a more general framework in Sect. 9.3.

9.2 Local Symmetry Breaking and the Higgs Mechanism

Let us now consider how things change if the spontaneously broken symmetry is
local, or gauged. As a simple example, consider a U (1) gauge theory with a fermion
ψ with charge Q and gauge coupling g and a vector field Aμ transforming according
to:

$$\psi(x) \to e^{iQ\theta(x)}\psi(x), \tag{9.31}$$
$$\overline{\psi}(x) \to e^{-iQ\theta(x)}\overline{\psi}(x), \tag{9.32}$$
$$A_\mu(x) \to A_\mu(x) - \frac{1}{g}\partial_\mu\theta(x). \tag{9.33}$$

In order to make a gauge-invariant Lagrangian density, the ordinary derivative is
replaced by the covariant derivative:

$$D_\mu\psi = (\partial_\mu + iQgA_\mu)\psi. \tag{9.34}$$

Now, in order to spontaneously break the gauge symmetry, we introduce a complex
scalar field φ with charge +1. It transforms under a gauge transformation like

$$\phi(x) \to e^{i\theta(x)}\phi(x); \qquad \phi^*(x) \to e^{-i\theta(x)}\phi^*(x). \tag{9.35}$$

To make a gauge-invariant Lagrangian, we again replace the ordinary derivative
acting on φ, φ* by covariant derivatives:

$$D_\mu\phi = (\partial_\mu + igA_\mu)\phi; \qquad D_\mu\phi^* = (\partial_\mu - igA_\mu)\phi^*. \tag{9.36}$$

The Lagrangian density for the scalar and vector degrees of freedom of the theory
is:
$$\mathcal{L} = D_\mu\phi^* D^\mu\phi - V(\phi, \phi^*) - \frac{1}{4}F^{\mu\nu}F_{\mu\nu}, \tag{9.37}$$
where V (φ, φ ∗ ) is as given before in (9.2). Because the covariant derivative of the
field transforms like

Dμ φ → eiθ Dμ φ, (9.38)

this Lagrangian is easily checked to be gauge-invariant. If m 2 > 0 and λ > 0, then this
theory describes a massive scalar, with self-interactions with a coupling proportional
to λ, and interaction with the massless vector field Aμ .
However, if m² < 0, then the minimum of the potential for the scalar field brings
about a non-zero VEV ⟨0|φ|0⟩ = v/√2 = √(−m²/2λ), just as in the global symmetry
case. Using the same decomposition of φ into real fields h and G as given in (9.19),
one finds:

$$D_\mu\phi = \frac{1}{\sqrt{2}}\left[\partial_\mu h + ig\left(A_\mu + \frac{1}{gv}\partial_\mu G\right)(v + h)\right]e^{iG/v}. \tag{9.39}$$

It is convenient to define a new vector field:

$$V_\mu = A_\mu + \frac{1}{gv}\partial_\mu G, \tag{9.40}$$

since this is the combination that appears in (9.39). Then

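$$D_\mu\phi = \frac{1}{\sqrt{2}}\left[\partial_\mu h + igV_\mu(v + h)\right]e^{iG/v}, \tag{9.41}$$
$$D_\mu\phi^* = \frac{1}{\sqrt{2}}\left[\partial_\mu h - igV_\mu(v + h)\right]e^{-iG/v}. \tag{9.42}$$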
Note also that since

∂μ Aν − ∂ν Aμ = ∂μ Vν − ∂ν Vμ , (9.43)

the vector field strength part of the Lagrangian is the same written in terms of the
new vector Vμ as it was in terms of the old vector Aμ :

$$-\frac{1}{4}F^{\mu\nu}F_{\mu\nu} = -\frac{1}{4}(\partial_\mu V_\nu - \partial_\nu V_\mu)(\partial^\mu V^\nu - \partial^\nu V^\mu). \tag{9.44}$$

The complete Lagrangian density of the vector and scalar degrees of freedom is now:

$$\mathcal{L} = \frac{1}{2}\left[\partial^\mu h\,\partial_\mu h + g^2(v + h)^2 V^\mu V_\mu\right] - \frac{1}{4}F^{\mu\nu}F_{\mu\nu} - \lambda(vh + h^2/2)^2. \tag{9.45}$$
This Lagrangian has the very important property that the field G has completely
disappeared! Reading off the part quadratic in h, we see that it has the same squared
mass as in the global symmetry case, namely

m 2h = 2λv 2 . (9.46)

There is also a term quadratic in the vector field:

$$\mathcal{L}_{VV} = -\frac{1}{4}F^{\mu\nu}F_{\mu\nu} + \frac{g^2v^2}{2}V^\mu V_\mu. \tag{9.47}$$
This means that by spontaneously breaking the gauge symmetry, we have given a
mass to the corresponding vector field:

m 2V = g 2 v 2 . (9.48)
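
For readers who want the intermediate step, the scalar kinetic term that leads from (9.41) and (9.42) to (9.45) through (9.48) can be sketched as follows. The only assumption here is the usual convention, consistent with (9.47), that a real vector field of mass m_V has the mass term +½ m_V² V^μ V_μ.

```latex
\begin{align}
D_\mu\phi^*\,D^\mu\phi
  &= \frac{1}{2}\left[\partial_\mu h - igV_\mu(v+h)\right]
     \left[\partial^\mu h + igV^\mu(v+h)\right] \\
  &= \frac{1}{2}\,\partial_\mu h\,\partial^\mu h
     + \frac{1}{2}\,g^2 (v+h)^2\, V_\mu V^\mu , \\
\left.\frac{1}{2}\,g^2 (v+h)^2\, V_\mu V^\mu\right|_{h=0}
  &= \frac{1}{2}\,g^2 v^2\, V_\mu V^\mu
   \equiv \frac{1}{2}\,m_V^2\, V_\mu V^\mu
   \quad\Longrightarrow\quad m_V^2 = g^2 v^2 .
\end{align}
```

The phases e^{±iG/v} cancel in the product, which is why G no longer appears anywhere in the Lagrangian.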

We can understand why the disappearance of the field G goes along with the
appearance of the vector boson mass as follows. A massless spin-1 vector boson
(like the photon) has only two possible polarization states, each transverse to its
direction of motion. In contrast, a massive spin-1 vector boson has three possible
polarization states; the two transverse, and one longitudinal (parallel) to its direction
of motion. The additional polarization state degree of freedom had to come from
somewhere, so one real scalar degree of freedom had to disappear. The words used
to describe this are that the vector boson becomes massive by “eating” the would-be
Nambu-Goldstone boson G, which becomes its longitudinal polarization degree of
freedom. This is called the Higgs mechanism. The original field φ(x) is called a
Higgs field, and the surviving real scalar degree of freedom h(x) is called by the
generic term Higgs boson. The Standard Model Higgs boson and the masses of the
W ± and Z bosons result from a slightly more complicated version of this same idea,
as we will see.
An alternative way to understand what has just happened to the would-be Nambu-
Goldstone boson field G(x) is that it has been “gauged away”. Recall that

$$\phi = \frac{1}{\sqrt{2}}(v + h)e^{iG/v} \tag{9.49}$$

behaves under a gauge transformation as:

$$\phi \to e^{i\theta}\phi. \tag{9.50}$$

Normally we think of θ in this equation as just some ordinary function of spacetime,
but since this is true for any θ, we can choose it to be proportional to the
Nambu-Goldstone field itself:

θ (x) = −G(x)/v. (9.51)

This choice, known as “unitary gauge”, eliminates G(x) completely, just as we saw
in (9.45). In unitary gauge,

$$\phi(x) = \frac{1}{\sqrt{2}}\left[v + h(x)\right]. \tag{9.52}$$

Notice also that the gauge transformation (9.51) gives exactly the term in (9.40), so
that Vμ is simply the unitary gauge version of Aμ . The advantage of unitary gauge
is that the true physical particle content of the theory (a massive vector and real
Higgs scalar) is more obvious than in the version of the Lagrangian written in terms
of the original fields φ and Aμ . However, it turns out to be easier to prove that the
theory is renormalizable if one works in a different gauge in which the would-be
Nambu-Goldstone bosons are retained. The physical predictions of the theory do not
depend on which gauge one chooses, but the ease with which one can compute those
results depends on picking the right gauge for the problem at hand.
Let us catalog the propagators and interactions of this theory, in unitary gauge.
The propagators of the Higgs scalar and the massive vector are:

From (9.45), there are also hV V and hhV V interaction vertices:

and hhh and hhhh self-interactions:

Finally, a fermion with charge Q inherits the same interactions with Vμ that it had
with Aμ , coming from the covariant derivative:

This is a general way of making massive vector fields in gauge theories with inter-
acting scalars and fermions, without ruining renormalizability.

9.3 Goldstone’s Theorem and the Higgs Mechanism in General

Let us now state, without proof, how all of the considerations above generalize to
arbitrary groups. First, suppose we have scalar fields $\phi_i$ in some representation of a
global symmetry group with generators $T_i^{aj}$. There is some potential

V (φi , φi∗ ), (9.53)

which we presume has a minimum where at least some of the φi are non-zero. This
of course depends on the parameters and couplings appearing in V . The fields φi
will then have VEVs that can be written:
$$\langle 0|\phi_i|0\rangle = \frac{v_i}{\sqrt{2}}. \tag{9.54}$$
Any group generators that satisfy
$$T_i^{aj}v_j = 0 \tag{9.55}$$

correspond to unbroken symmetry transformations. In general, the unbroken global
symmetry group is the one formed from the unbroken symmetry generators, and
the vacuum state is invariant under this unbroken subgroup. The broken generators
satisfy

$$T_i^{aj}v_j \neq 0. \tag{9.56}$$

Goldstone’s Theorem states that for every spontaneously broken generator, labeled
by a, of a global symmetry group, there must be a corresponding Nambu-Goldstone
boson. (The group U (1) has just one generator, so there was just one Nambu-
Goldstone boson.)
In the case of a local or gauge symmetry, each of the would-be Nambu-Goldstone
bosons is eaten by the vector field with the corresponding index a. The vector fields
for the broken generators become massive, with squared masses that can be computed
in terms of the VEV(s) and the gauge coupling(s) of the theory. There are also Higgs
boson(s) for the uneaten components of the scalar fields that obtained VEVs.
One might also ask whether it is possible for fields other than scalars to obtain
vacuum expectation values. If one could succeed in concocting a theory in which a
fermion spinor field or a vector field has a VEV:

$$\langle 0|\Psi_\alpha|0\rangle \neq 0 \ (?), \tag{9.57}$$
$$\langle 0|A_\mu|0\rangle \neq 0 \ (?), \tag{9.58}$$

then Lorentz invariance will necessarily be broken, since the alleged VEV carries an
uncontracted spinor or vector index, and therefore transforms non-trivially under the
Lorentz group. This would imply that the broken generators would include Lorentz
boosts and rotations, in contradiction with experiment. However, there can be vacuum
expectation values for antifermion-fermion composite fields, since they can form a
Lorentz scalar:

$$\langle 0|\overline{\Psi}\Psi|0\rangle \neq 0. \tag{9.59}$$

This is called a fermion-antifermion condensate. In fact, in QCD the quark-antiquark
composite fields do have vacuum expectation values:

$$\langle 0|\overline{u}u|0\rangle \approx \langle 0|\overline{d}d|0\rangle \approx \langle 0|\overline{s}s|0\rangle \approx \mu^3 \neq 0, \tag{9.60}$$

where μ is a quantity with dimensions of [mass] which is set by the scale $\Lambda_{\rm QCD}$
at which non-perturbative effects become important. This is known as chiral symmetry
breaking. The chiral symmetry is a global, approximate symmetry by which
left-handed u, d, s quarks are rotated into each other and right-handed u, d, s quarks
are rotated into each other. (The objects $\overline{q}q$ are color singlets, so these
antifermion-fermion VEVs do not break SU(3)c symmetry.) Chiral symmetry breaking is actually
the mechanism that is responsible for most of the mass of the proton and the neutron,
and therefore most of the mass of everyday objects. When the chiral symmetry is
spontaneously broken, the Nambu-Goldstone bosons that arise include the pions π ±
and π 0 . They are not exactly massless because the chiral symmetry was really only
approximate to begin with, but the Goldstone theorem successfully explains why
they are much lighter than the proton: $m_\pi^2 \ll m_p^2$. They are often called
pseudo-Nambu-Goldstone bosons, or PNGBs, with the “pseudo” indicating that the
associated spontaneously broken global symmetry was only an approximate symmetry.
Extensions of the Standard Model that feature new approximate global symmetries
that are spontaneously broken generally predict the existence of heavy exotic PNGBs.
For example, these are a ubiquitous feature of technicolor models.

Problems

1. In Sect. 4.6 we performed a “linearly realized spontaneous symmetry breaking”
minimization of a scalar field theory with U(1) continuous global symmetry that
spontaneously breaks with the Higgs vev along the real direction ⟨φ⟩ = v/√2.
See (9.11). We then found the spectrum of the theory (i.e., the two scalar bosons
and their masses) and the interaction Feynman rules for those scalar bosons. Redo
this calculation but choose the Higgs vev to be along the imaginary axis ⟨φ⟩ =
iv/√2, where v is a positive real number. Find the masses and the interaction
Feynman rules for the particles.
2. Consider a theory of two complex scalar fields $\varphi_{-1/5}$ and $\varphi_{3/5}$, where $\varphi_{-1/5}$ has
charge Q = −1/5 and $\varphi_{3/5}$ has charge Q = 3/5 under a $U(1)_Q$ gauge theory. The
potential $V(\varphi_{-1/5}, \varphi_{3/5})$ is such that the $U(1)_Q$ theory is spontaneously broken
with $\langle\varphi_{3/5}\rangle = v$ and $\langle\varphi_{-1/5}\rangle = 0$. In that case the ground state has a residual
$Z_N$ discrete symmetry left over. (a) Determine N. (b) Assuming $\varphi_{3/5} \propto (h + v)$,
write down all the allowed interactions of h with $\varphi_{-1/5}$ up to $O(h^3)$.
10 The Standard Electroweak Model

10.1 SU(2) L × U(1)Y Representations and Lagrangian

In this chapter we detail the elements of the Standard Model of elementary particle
physics. This is the culmination of many of the principles and facts that we have
developed up to this point in the book. We start in this section by studying the
implications of the Higgs mechanism of the Standard Model, which gives rise to the
masses of W ± and Z bosons, and the masses of the leptons and most of the masses
of the heavier quarks.
The electroweak interactions are mediated by three massive vector bosons W ± , Z
and the massless photon γ . The gauge group before spontaneous symmetry break-
ing must therefore have four generators. After spontaneous symmetry breaking,
the remaining unbroken gauge group is electromagnetic gauge invariance. A viable
theory must explain the qualitative experimental facts that the W ± bosons couple
only to L-fermions (and R-antifermions), that the Z boson couples differently to
L-fermions and R-fermions, but γ couples with the same strength to L-fermions and
R-fermions. Also, there are very stringent quantitative experimental tests involving
the relative strengths of fermion-antifermion-vector couplings and the ratio of the W
and Z masses. The Standard Model (SM) of electroweak interactions of Glashow,
Weinberg and Salam successfully incorporates all of these features and tests into a
spontaneously broken gauge theory. In the SM, the gauge symmetry breaking is:

SU (2) L × U (1)Y → U (1)EM . (10.1)

We will need to introduce a Higgs field to produce this pattern of symmetry breaking.
The SU (2) L subgroup is known as weak isospin. Left-handed SM fermions are
known to be doublets under SU (2) L :

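$$\begin{pmatrix}\nu_e\\ e_L\end{pmatrix},\ \begin{pmatrix}\nu_\mu\\ \mu_L\end{pmatrix},\ \begin{pmatrix}\nu_\tau\\ \tau_L\end{pmatrix},\ \begin{pmatrix}u_L\\ d_L\end{pmatrix},\ \begin{pmatrix}c_L\\ s_L\end{pmatrix},\ \begin{pmatrix}t_L\\ b_L\end{pmatrix}. \tag{10.2}$$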

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022 269
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_10

Notice that the electric charge of the upper member of each doublet is always 1
greater than that of the lower member. The SU (2) L representation matrix generators
acting on these fields are proportional to the Pauli matrices:

T a = σ a /2, (a = 1, 2, 3) (10.3)

with corresponding vector gauge boson fields:

Wμa , (a = 1, 2, 3) (10.4)

and a coupling constant g. The right-handed fermions

eR , μR , τR , u R , cR , tR , dR , sR , bR , (10.5)

are all singlets under SU (2) L .

Meanwhile, the U(1)_Y subgroup is known as weak hypercharge and has a coupling
constant g′ and a vector boson Bμ, sometimes known as the hyperphoton. The weak
hypercharge Y is a conserved charge just like electric charge q, but it is different for
left-handed and right-handed fermions. Both members of an SU (2) L doublet must
have the same weak hypercharge in order to satisfy SU (2) L gauge invariance.
Following the general discussion of Yang-Mills gauge theories in Sect. 8.2 (see
(8.97) and (8.98)), the pure-gauge part of the electroweak Lagrangian density is:

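$$\mathcal{L}_{\rm gauge} = -\frac{1}{4}W^a_{\mu\nu}W^{a\mu\nu} - \frac{1}{4}B_{\mu\nu}B^{\mu\nu}, \tag{10.6}$$
where:
$$W^a_{\mu\nu} = \partial_\mu W^a_\nu - \partial_\nu W^a_\mu - g\epsilon^{abc}W^b_\mu W^c_\nu, \tag{10.7}$$
$$B_{\mu\nu} = \partial_\mu B_\nu - \partial_\nu B_\mu \tag{10.8}$$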

are the SU(2)_L and U(1)_Y field strengths. The totally antisymmetric $\epsilon^{abc}$ (with
$\epsilon^{123} = +1$) are the structure constants for SU(2)_L. This $\mathcal{L}_{\rm gauge}$ provides for kinetic
terms of the vector fields, and $W^a_\mu$ self-interactions.
The interactions of the electroweak gauge bosons with fermions are determined by
the covariant derivative. For example, the covariant derivatives acting on the lepton
fields are:

$$D_\mu\begin{pmatrix}\nu_e\\ e_L\end{pmatrix} = \left[\partial_\mu + ig'B_\mu Y_{\ell_L} + igW^a_\mu T^a\right]\begin{pmatrix}\nu_e\\ e_L\end{pmatrix}, \tag{10.9}$$
$$D_\mu e_R = \left[\partial_\mu + ig'B_\mu Y_{\ell_R}\right]e_R, \tag{10.10}$$

where $Y_{\ell_L}$ and $Y_{\ell_R}$ are the weak hypercharges of left-handed leptons and
right-handed leptons, and 2 × 2 unit matrices are understood to go with the ∂μ and Bμ
terms in (10.9). A multiplicative factor can always be absorbed into the definition of
the coupling g′, so without loss of generality, it is traditional¹ to set $Y_{\ell_R} = Q_\ell = -1$.

The weak hypercharges of all other fermions are then fixed. Using the explicit form
of the SU (2) L generators in terms of Pauli matrices in (10.3), the covariant derivative
of left-handed leptons is:

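$$D_\mu\begin{pmatrix}\nu_e\\ e_L\end{pmatrix} = \partial_\mu\begin{pmatrix}\nu_e\\ e_L\end{pmatrix} + i\left[g'Y_{\ell_L}\begin{pmatrix}B_\mu & 0\\ 0 & B_\mu\end{pmatrix} + \frac{g}{2}\begin{pmatrix}W^3_\mu & W^1_\mu - iW^2_\mu\\ W^1_\mu + iW^2_\mu & -W^3_\mu\end{pmatrix}\right]\begin{pmatrix}\nu_e\\ e_L\end{pmatrix}. \tag{10.11}$$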

Therefore, the covariant derivatives of the lepton fields can be summarized as:
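$$D_\mu\nu_e = \partial_\mu\nu_e + i\left(g'Y_{\ell_L}B_\mu + \frac{g}{2}W^3_\mu\right)\nu_e + i\frac{g}{2}(W^1_\mu - iW^2_\mu)e_L, \tag{10.12}$$
$$D_\mu e_L = \partial_\mu e_L + i\left(g'Y_{\ell_L}B_\mu - \frac{g}{2}W^3_\mu\right)e_L + i\frac{g}{2}(W^1_\mu + iW^2_\mu)\nu_e, \tag{10.13}$$
$$D_\mu e_R = \partial_\mu e_R - ig'B_\mu e_R. \tag{10.14}$$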

The covariant derivative of a field must carry the same electric charge as the field
itself, in order for charge to be conserved. Evidently, then, Wμ1 − i Wμ2 must carry
electric charge +1 and Wμ1 + i Wμ2 must carry electric charge −1, so these must be
identified with the W ± bosons of the weak interactions. Consider the interaction
Lagrangian following from

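$$\mathcal{L} = i\begin{pmatrix}\overline{\nu}_e & \overline{e}_L\end{pmatrix}\gamma^\mu D_\mu\begin{pmatrix}\nu_e\\ e_L\end{pmatrix} \tag{10.15}$$
$$= -\frac{g}{2}\overline{\nu}_e\gamma^\mu e_L(W^1_\mu - iW^2_\mu) - \frac{g}{2}\overline{e}_L\gamma^\mu\nu_e(W^1_\mu + iW^2_\mu) + \cdots. \tag{10.16}$$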
Comparing with (7.99), (7.100), and (7.131), we find that to reproduce the weak-
interaction Lagrangian of muon decay, we must have:

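$$W^+_\mu \equiv \frac{1}{\sqrt{2}}(W^1_\mu - iW^2_\mu), \tag{10.17}$$
$$W^-_\mu \equiv \frac{1}{\sqrt{2}}(W^1_\mu + iW^2_\mu). \tag{10.18}$$
The 1/√2 normalization agrees with our previous convention; the real reason for it
is so that the kinetic terms for W± have a standard normalization:
$\mathcal{L} = -\frac{1}{2}(\partial_\mu W^+_\nu - \partial_\nu W^+_\mu)(\partial^\mu W^{-\nu} - \partial^\nu W^{-\mu})$.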
The vector bosons Bμ and Wμ3 are both electrically neutral. As a result of spon-
taneous symmetry breaking, we will find that they mix. In other words, the fields
with well-defined masses (“mass eigenstates” or “mass eigenfields”) are not Bμ and
Wμ3 , but are orthogonal linear combinations of these two gauge eigenstate fields.

1 Some references define the weak hypercharge normalization so that Y is a factor of 2 larger than
here, for each particle.

One of the mass eigenstates is the photon field Aμ , and the other is the massive Z
boson vector field, Z μ . One can write the relation between the gauge eigenstate and
mass eigenstate fields as a rotation in field space by an angle θW , known as the weak
mixing angle:

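$$\begin{pmatrix}W^3_\mu\\ B_\mu\end{pmatrix} = \begin{pmatrix}\cos\theta_W & \sin\theta_W\\ -\sin\theta_W & \cos\theta_W\end{pmatrix}\begin{pmatrix}Z_\mu\\ A_\mu\end{pmatrix}, \tag{10.19}$$

with the inverse relation:

$$\begin{pmatrix}Z_\mu\\ A_\mu\end{pmatrix} = \begin{pmatrix}\cos\theta_W & -\sin\theta_W\\ \sin\theta_W & \cos\theta_W\end{pmatrix}\begin{pmatrix}W^3_\mu\\ B_\mu\end{pmatrix}. \tag{10.20}$$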

We now require that the resulting theory has the correct photon coupling to
fermions, by requiring that the field Aμ appears in the covariant derivatives in the
way dictated by QED. The covariant derivative of the right-handed electron field
(10.14) can be written:

$$D_\mu e_R = \partial_\mu e_R - ig'\cos\theta_W A_\mu e_R + ig'\sin\theta_W Z_\mu e_R. \tag{10.21}$$

Comparing to $D_\mu e_R = \partial_\mu e_R - ieA_\mu e_R$ from QED, we conclude that:

$$g'\cos\theta_W = e. \tag{10.22}$$

Similarly, one finds using (10.19) that:

$$D_\mu e_L = \partial_\mu e_L + i\left(g'Y_{\ell_L}\cos\theta_W - \frac{g}{2}\sin\theta_W\right)A_\mu e_L + \cdots \tag{10.23}$$
Again comparing to the prediction of QED that $D_\mu e_L = \partial_\mu e_L - ieA_\mu e_L$, it must
be that:
$$\frac{g}{2}\sin\theta_W - g'Y_{\ell_L}\cos\theta_W = e \tag{10.24}$$
is the electromagnetic coupling. In the same way:
$$D_\mu\nu_e = \partial_\mu\nu_e + i\left(g'Y_{\ell_L}\cos\theta_W + \frac{g}{2}\sin\theta_W\right)A_\mu\nu_e + \cdots \tag{10.25}$$
where the ⋯ represent W and Z terms. However, we know that the neutrino has no
electric charge, and therefore its covariant derivative cannot involve the photon. So,
the coefficient of $A_\mu$ in (10.25) must vanish:

$$\frac{g}{2}\sin\theta_W + g'Y_{\ell_L}\cos\theta_W = 0. \tag{10.26}$$

Now combining (10.22), (10.24), and (10.26), we learn that:

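$$Y_{\ell_L} = -1/2, \tag{10.27}$$
$$e = \frac{gg'}{\sqrt{g^2 + g'^2}}, \tag{10.28}$$
$$\tan\theta_W = g'/g, \tag{10.29}$$

so that

$$\sin\theta_W = \frac{g'}{\sqrt{g^2 + g'^2}}, \qquad \cos\theta_W = \frac{g}{\sqrt{g^2 + g'^2}}. \tag{10.30}$$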

These are requirements that will have to be satisfied by the spontaneous symmetry
breaking mechanism. The numerical values from experiment are approximately:

$$g = 0.652, \tag{10.31}$$
$$g' = 0.357, \tag{10.32}$$
$$e = 0.313, \tag{10.33}$$
$$\sin^2\theta_W = 0.231. \tag{10.34}$$

These are all renormalized, running parameters, evaluated at a renormalization scale
μ = m_Z = 91.1876 GeV in the $\overline{\rm MS}$ scheme.
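
The numbers in (10.31) through (10.34) can be checked against the relations (10.28) and (10.30). The following is a minimal sketch; the function name `electroweak_tree_relations` is a hypothetical helper introduced only for this illustration.

```python
import math


def electroweak_tree_relations(g, g_prime):
    """Return (e, sin(theta_W), cos(theta_W)) from (10.28) and (10.30)."""
    norm = math.hypot(g, g_prime)  # sqrt(g^2 + g'^2)
    e = g * g_prime / norm
    sin_theta_w = g_prime / norm
    cos_theta_w = g / norm
    return e, sin_theta_w, cos_theta_w


g, g_prime = 0.652, 0.357  # values quoted in (10.31) and (10.32)
e, sin_theta_w, cos_theta_w = electroweak_tree_relations(g, g_prime)

print("e            =", round(e, 3))
print("sin^2 theta_W =", round(sin_theta_w ** 2, 3))
# Consistency with (10.22): g' cos(theta_W) should equal e.
print("g' cos theta_W - e =", g_prime * cos_theta_w - e)
```

With the quoted inputs, the computed e and sin²θ_W round to the values 0.313 and 0.231 given in (10.33) and (10.34).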
In a similar way, one can work out what the weak hypercharges of all of the other
SM quarks and leptons have to be, in order to reproduce the correct electric charges
appearing in the coupling to the photon field Aμ from the covariant derivative. The
results can be summarized in terms of the SU (3)c × SU (2) L × U (1)Y representa-
tions:

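$$\begin{array}{rcl}
\begin{pmatrix}\nu_e\\ e_L\end{pmatrix},\ \begin{pmatrix}\nu_\mu\\ \mu_L\end{pmatrix},\ \begin{pmatrix}\nu_\tau\\ \tau_L\end{pmatrix} & \longleftrightarrow & (\mathbf{1}, \mathbf{2}, -\tfrac{1}{2}),\\[4pt]
e_R,\ \mu_R,\ \tau_R & \longleftrightarrow & (\mathbf{1}, \mathbf{1}, -1),\\[4pt]
\begin{pmatrix}u_L\\ d_L\end{pmatrix},\ \begin{pmatrix}c_L\\ s_L\end{pmatrix},\ \begin{pmatrix}t_L\\ b_L\end{pmatrix} & \longleftrightarrow & (\mathbf{3}, \mathbf{2}, \tfrac{1}{6}),\\[4pt]
u_R,\ c_R,\ t_R & \longleftrightarrow & (\mathbf{3}, \mathbf{1}, \tfrac{2}{3}),\\[4pt]
d_R,\ s_R,\ b_R & \longleftrightarrow & (\mathbf{3}, \mathbf{1}, -\tfrac{1}{3}).
\end{array} \tag{10.35}$$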

In general, the electric charge of any field f is given in terms of the eigenvalue of
the 3 component of weak isospin matrix, T 3 , and the weak hypercharge Y , as:

$$Q_f = T^3_f + Y_f. \tag{10.36}$$

Here $T^3_f$ is +1/2 for the upper component of a doublet, −1/2 for the lower component
of a doublet, and 0 for an SU(2)_L singlet. The couplings of the SM fermions to the
Z boson then follow as a prediction. One finds for each SM fermion f:

$$\mathcal{L}_{Zf\overline{f}} = -Z^\mu\overline{f}\gamma_\mu\left(g^f_L P_L + g^f_R P_R\right)f, \tag{10.37}$$

where
$$g^f_L = g\cos\theta_W T^3_{f_L} - g'\sin\theta_W Y_{f_L} = \frac{g}{\cos\theta_W}\left(T^3_{f_L} - \sin^2\theta_W Q_f\right), \tag{10.38}$$
$$g^f_R = -g'\sin\theta_W Y_{f_R} = -\frac{g}{\cos\theta_W}\sin^2\theta_W Q_f, \tag{10.39}$$
with coefficients:

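$$\begin{array}{c|cccc}
\text{fermion} & T^3_{f_L} & Y_{f_L} & Y_{f_R} & Q_f\\ \hline
\nu_e,\ \nu_\mu,\ \nu_\tau & \tfrac{1}{2} & -\tfrac{1}{2} & 0 & 0\\
e,\ \mu,\ \tau & -\tfrac{1}{2} & -\tfrac{1}{2} & -1 & -1\\
u,\ c,\ t & \tfrac{1}{2} & \tfrac{1}{6} & \tfrac{2}{3} & \tfrac{2}{3}\\
d,\ s,\ b & -\tfrac{1}{2} & \tfrac{1}{6} & -\tfrac{1}{3} & -\tfrac{1}{3}
\end{array}$$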

Equation (10.37) can also be rewritten in terms of vector and axial-vector cou-
plings to the Z boson:

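$$\mathcal{L}_{Zf\overline{f}} = -Z^\mu\overline{f}\gamma_\mu\left(g^f_V - g^f_A\gamma_5\right)f, \tag{10.40}$$

with

$$g^f_V = \frac{1}{2}\left(g^f_L + g^f_R\right) = \frac{g}{2\cos\theta_W}\left(T^3_{f_L} - 2\sin^2\theta_W Q_f\right), \tag{10.41}$$
$$g^f_A = \frac{1}{2}\left(g^f_L - g^f_R\right) = \frac{g}{2\cos\theta_W}T^3_{f_L}. \tag{10.42}$$

The coupling parameters appearing in (8.245) are $V_f = g^f_V/e$ and $A_f = g^f_A/e$.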
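
The bookkeeping in (10.36) through (10.42) is easy to mechanize. The sketch below tabulates the electric charges from Q = T³ + Y and the vector and axial-vector Z couplings in units of g/cos θ_W. The dictionary `FERMIONS` and the helper names are hypothetical, introduced here for illustration, and sin²θ_W = 0.231 is taken from (10.34).

```python
from fractions import Fraction as F

# name: (T3 of the left-handed field, Y of the left-handed field, Y of the right-handed field)
# The right-handed neutrino entry (Y = 0) follows the table above.
FERMIONS = {
    "nu": (F(1, 2), F(-1, 2), F(0)),
    "e": (F(-1, 2), F(-1, 2), F(-1)),
    "u": (F(1, 2), F(1, 6), F(2, 3)),
    "d": (F(-1, 2), F(1, 6), F(-1, 3)),
}


def electric_charges(t3_left, y_left, y_right):
    """Apply Q = T3 + Y to both chiralities (T3 = 0 for an SU(2)_L singlet)."""
    return t3_left + y_left, 0 + y_right


def z_couplings(t3_left, charge, sin2_theta_w):
    """Return (g_V, g_A) of (10.41)-(10.42) in units of g / cos(theta_W)."""
    g_left = t3_left - sin2_theta_w * charge
    g_right = -sin2_theta_w * charge
    return (g_left + g_right) / 2, (g_left - g_right) / 2


for name, (t3, y_left, y_right) in FERMIONS.items():
    q_left, q_right = electric_charges(t3, y_left, y_right)
    # Both chiralities of a given fermion must carry the same electric charge.
    assert q_left == q_right
    g_v, g_a = z_couplings(t3, q_left, 0.231)
    print(f"{name}: Q = {q_left}, g_V = {float(g_v):+.4f}, g_A = {float(g_a):+.4f}")
```

Exact fractions are used for the charges so that the equality of the left-handed and right-handed charges is checked without rounding error.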
The partial decay widths and branching ratios of the Z boson can be worked out
from these couplings, and agree with the results from experiment:

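$$\mathrm{BR}(Z \to \ell^+\ell^-) = 0.033658 \pm 0.000023 \quad (\text{for } \ell = e, \mu, \tau, \text{ each}) \tag{10.43}$$
$$\mathrm{BR}(Z \to \text{invisible}) = 0.2000 \pm 0.0006 \tag{10.44}$$
$$\mathrm{BR}(Z \to \text{hadrons}) = 0.6991 \pm 0.0006. \tag{10.45}$$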

The “invisible” branching ratio matches up extremely well with the theoretical
prediction for the sum over the three $\nu\overline{\nu}$ final states, while “hadrons” is due to
quark-antiquark final states. It is an important fact that the Z branching ratio into charged
leptons is small. This is unfortunate, since backgrounds for leptons are smaller than
for hadrons or missing energy, and Z bosons can appear in many searches for new
phenomena.
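
A rough estimate of the branching ratios (10.43) through (10.45) follows from the couplings (10.41) and (10.42). The sketch below rests on assumptions that are not spelled out in the text: it works at tree level, treats all final-state fermions as massless, ignores QCD and electroweak corrections, and uses the standard result that each partial width is proportional to N_c (g_V² + g_A²), with N_c = 3 for quarks and 1 for leptons. The top quark is excluded because it is heavier than the Z boson. All names in the code are hypothetical.

```python
SIN2_THETA_W = 0.231  # MS-bar value quoted in (10.34)

# channel: (T3 of the left-handed field, electric charge, colour factor, light species)
Z_DECAY_CHANNELS = {
    "neutrinos": (0.5, 0.0, 1, 3),
    "charged leptons": (-0.5, -1.0, 1, 3),
    "up-type quarks": (0.5, 2.0 / 3.0, 3, 2),  # u and c only; top is too heavy
    "down-type quarks": (-0.5, -1.0 / 3.0, 3, 3),
}


def relative_width(t3, charge, colors):
    """Tree-level width per species, up to a common factor, from (10.41)-(10.42)."""
    g_v = 0.5 * (t3 - 2.0 * SIN2_THETA_W * charge)
    g_a = 0.5 * t3
    return colors * (g_v ** 2 + g_a ** 2)


def branching_ratios():
    """Return the estimated branching ratios for the three classes of final state."""
    widths = {
        name: species * relative_width(t3, charge, colors)
        for name, (t3, charge, colors, species) in Z_DECAY_CHANNELS.items()
    }
    total = sum(widths.values())
    return {
        "one charged-lepton flavor": widths["charged leptons"] / 3.0 / total,
        "invisible": widths["neutrinos"] / total,
        "hadrons": (widths["up-type quarks"] + widths["down-type quarks"]) / total,
    }


for channel, ratio in branching_ratios().items():
    print(f"{channel}: {ratio:.3f}")
```

Under these simplifications the estimate gives roughly 0.034, 0.205, and 0.691, within a few percent of the measured values. The remaining differences are of the size expected from the neglected corrections.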

10.2 The Standard Model Higgs Mechanism

Let us now turn to the question of how to spontaneously break the electroweak gauge
symmetry in a way that satisfies the above conditions. There is actually more than
one way to do this, but the Standard Model chooses the simplest possibility, which
is to introduce a complex SU(2)_L-doublet scalar Higgs field with weak hypercharge
$Y_\Phi = +1/2$:

$$\Phi = \begin{pmatrix}\phi^+\\ \phi^0\end{pmatrix} \longleftrightarrow \left(\mathbf{1}, \mathbf{2}, \tfrac{1}{2}\right). \tag{10.46}$$

Each of the fields φ⁺ and φ⁰ is a complex scalar field; we know that they carry
electric charges +1 and 0 respectively from (10.36). Under gauge transformations,
Φ transforms as:

$$SU(2)_L:\quad \Phi(x) \to \Phi'(x) = e^{i\theta^a(x)\sigma^a/2}\Phi(x), \tag{10.47}$$
$$U(1)_Y:\quad \Phi(x) \to \Phi'(x) = e^{i\theta(x)/2}\Phi(x). \tag{10.48}$$

The Hermitian conjugate field transforms as:

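$$SU(2)_L:\quad \Phi^\dagger \to \Phi'^\dagger = \Phi^\dagger e^{-i\theta^a\sigma^a/2}, \tag{10.49}$$
$$U(1)_Y:\quad \Phi^\dagger \to \Phi'^\dagger = \Phi^\dagger e^{-i\theta/2}. \tag{10.50}$$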

It follows that the combinations

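$$\Phi^\dagger\Phi \quad\text{and}\quad D^\mu\Phi^\dagger D_\mu\Phi \tag{10.51}$$

are gauge singlets. We can therefore build a gauge-invariant potential:

$$V(\Phi, \Phi^\dagger) = m^2\Phi^\dagger\Phi + \lambda(\Phi^\dagger\Phi)^2, \tag{10.52}$$

and the Lagrangian density for Φ is:

$$\mathcal{L} = D^\mu\Phi^\dagger D_\mu\Phi - V(\Phi, \Phi^\dagger). \tag{10.53}$$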

Now, provided that m² < 0, then $\Phi = \begin{pmatrix}0\\ 0\end{pmatrix}$ is a local maximum, rather than a
minimum, of the potential. This will ensure the spontaneous symmetry breaking that
we demand. There are degenerate minima of the potential with

$$\Phi^\dagger\Phi = v^2/2, \qquad v = \sqrt{-m^2/\lambda}. \tag{10.54}$$

Without loss of generality, we can choose the VEV of Φ to be real, and entirely in
the second (electrically neutral) component of the Higgs field:

$$\langle 0|\Phi|0\rangle = \begin{pmatrix}0\\ v/\sqrt{2}\end{pmatrix}. \tag{10.55}$$

This is a convention, which can always be achieved by doing an SU(2)_L gauge
transformation on the field Φ to make it so. By definition, the surviving U(1) gauge
symmetry is U(1)_EM, so the component of Φ obtaining a VEV must be the one
assigned 0 electric charge. The U(1)_EM gauge transformations acting on Φ are a
combination of SU(2)_L and U(1)_Y transformations:

$$\Phi \to \Phi' = \exp\left[i\theta\begin{pmatrix}1 & 0\\ 0 & 0\end{pmatrix}\right]\Phi, \tag{10.56}$$

or, in components,

φ + → eiθ φ + , (10.57)
φ0 → φ0. (10.58)

Comparing with the QED gauge transformation rule of (8.5), we see that indeed φ +
and φ 0 have charges +1 and 0, respectively.
The Higgs field Φ has two complex, so four real, scalar field degrees of freedom.
Therefore, following the example of Sect. 9.2, we can write it as:

$$\Phi(x) = e^{iG^a(x)\sigma^a/2v}\begin{pmatrix}0\\ \frac{v + h(x)}{\sqrt{2}}\end{pmatrix}, \tag{10.59}$$

where G a (x) (a = 1, 2, 3) and h(x) are each real scalar fields. The G a are would-
be Nambu-Goldstone bosons, corresponding to the three broken generators in
SU (2) L × U (1)Y → U (1)EM . The would-be Nambu-Goldstone fields can be
removed by going to unitary gauge, which means performing an SU (2) L gauge
transformation of the form of (10.47), with θ a = −G a /v. This completely elimi-
nates the G a from the Lagrangian, so that in the unitary gauge we have simply

$$\Phi(x) = \begin{pmatrix}0\\ \frac{v + h(x)}{\sqrt{2}}\end{pmatrix}. \tag{10.60}$$

The field h creates and destroys the physical Higgs particle, an electrically neutral
real scalar boson that was discovered experimentally in 2012. We can now plug
this into the Lagrangian density of (10.53), to find interactions and mass terms for
the remaining Higgs field h and the vector bosons. The covariant derivative of Φ in
unitary gauge is:

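$$D_\mu\Phi = \frac{1}{\sqrt{2}}\begin{pmatrix}0\\ \partial_\mu h\end{pmatrix} + \frac{i}{\sqrt{2}}\left(\frac{g'}{2}B_\mu + \frac{g}{2}W^a_\mu\sigma^a\right)\begin{pmatrix}0\\ v + h\end{pmatrix}, \tag{10.61}$$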

and its Hermitian conjugate is:

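$$D_\mu\Phi^\dagger = \frac{1}{\sqrt{2}}\begin{pmatrix}0 & \partial_\mu h\end{pmatrix} - \frac{i}{\sqrt{2}}\begin{pmatrix}0 & v + h\end{pmatrix}\left(\frac{g'}{2}B_\mu + \frac{g}{2}W^a_\mu\sigma^a\right). \tag{10.62}$$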

Therefore,
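$$D^\mu\Phi^\dagger D_\mu\Phi = \frac{1}{2}\partial_\mu h\,\partial^\mu h + \frac{(v + h)^2}{8}\begin{pmatrix}0 & 1\end{pmatrix}\begin{pmatrix}g'B_\mu + gW^3_\mu & \sqrt{2}gW^+_\mu\\ \sqrt{2}gW^-_\mu & g'B_\mu - gW^3_\mu\end{pmatrix}\begin{pmatrix}g'B^\mu + gW^{3\mu} & \sqrt{2}gW^{+\mu}\\ \sqrt{2}gW^{-\mu} & g'B^\mu - gW^{3\mu}\end{pmatrix}\begin{pmatrix}0\\ 1\end{pmatrix}, \tag{10.63}$$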

which becomes, after simplifying,

$$\frac{1}{2}\partial_\mu h\,\partial^\mu h + \frac{(v + h)^2}{4}\left[g^2W^+_\mu W^{-\mu} + \frac{1}{2}(gW^3_\mu - g'B_\mu)(gW^{3\mu} - g'B^\mu)\right]. \tag{10.64}$$
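
The matrix algebra behind (10.63) and (10.64) can be spot-checked numerically. The sketch below makes a simplifying assumption: it treats a single Lorentz component of each gauge field as an ordinary real number, so the index contraction is not modeled. It builds the 2 × 2 matrix g′B + gWᵃσᵃ from the Pauli-matrix form used in (10.11), picks out the lower-right entry of its square, and compares it with the bracket of (10.64) using the definitions (10.17) and (10.18). Function names and field values are hypothetical.

```python
def lower_right_of_square(g, g_prime, w1, w2, w3, b):
    """(0 1) M M (0 1)^T for M = g' B * identity + g * W^a sigma^a."""
    upper_right = g * complex(w1, -w2)   # g (W1 - i W2)
    lower_left = g * complex(w1, w2)     # g (W1 + i W2)
    lower_right = g_prime * b - g * w3   # g' B - g W3
    # Only the second row of the left matrix and the second column of the
    # right matrix survive the projection onto the lower component.
    return lower_left * upper_right + lower_right ** 2


def simplified_bracket(g, g_prime, w1, w2, w3, b):
    """2 g^2 W+ W- + (g W3 - g' B)^2, i.e. 8 / (v + h)^2 times the gauge part of (10.64)."""
    w_plus = complex(w1, -w2) / 2 ** 0.5
    w_minus = complex(w1, w2) / 2 ** 0.5
    return 2 * g ** 2 * w_plus * w_minus + (g * w3 - g_prime * b) ** 2


# Arbitrary illustrative field values; only the couplings come from (10.31)-(10.32).
args = (0.652, 0.357, 0.3, -1.1, 0.7, -0.4)
difference = lower_right_of_square(*args) - simplified_bracket(*args)
print("difference between the two forms:", abs(difference))
```

The printed difference is zero up to floating-point rounding, which confirms that the W¹ and W² components enter only through the product W⁺W⁻.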

This can be further simplified using:

$$gW^3_\mu - g'B_\mu = \sqrt{g^2 + g'^2}\left(\cos\theta_W W^3_\mu - \sin\theta_W B_\mu\right) = \sqrt{g^2 + g'^2}\,Z_\mu, \tag{10.65}$$

where the first equality uses (10.30) and the second uses (10.20). So finally we have:

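$$\mathcal{L}_{\Phi\ {\rm kinetic}} = D^\mu\Phi^\dagger D_\mu\Phi = \frac{1}{2}\partial_\mu h\,\partial^\mu h + \frac{(v + h)^2}{4}\left[g^2W^+_\mu W^{-\mu} + \frac{1}{2}(g^2 + g'^2)Z_\mu Z^\mu\right]. \tag{10.66}$$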

The parts of this proportional to v 2 make up (mass)2 terms for the W ± and Z vector
bosons, vindicating the earlier assumption of neutral vector boson mixing with the
form that we took for the sine and cosine of the weak mixing angle. Since there is
no such (mass)2 term for the photon field Aμ , we have successfully shown that the
photon remains massless, in agreement with the fact that U (1)EM gauge invariance
remains unbroken. The specific prediction is:

$$m_W^2 = \frac{g^2v^2}{4}, \qquad m_Z^2 = \frac{(g^2 + g'^2)v^2}{4}, \tag{10.67}$$
which agrees with the experimental values provided that the VEV is approximately:
$$v = \sqrt{2}\langle\phi^0\rangle = 246\ {\rm GeV} \tag{10.68}$$

in the conventions used here.2 Note that, comparing (7.133) and (10.67), the Fermi
constant is simply related to the VEV, by:

$$G_F = \frac{1}{\sqrt{2}v^2}. \tag{10.69}$$

2 It should be noted that there is another extremely common convention in which v is defined to be
a factor 1/√2 smaller than here, so that v = ⟨φ⁰⟩ = 174 GeV in that convention.

A non-trivial prediction of the theory is that

m W /m Z = cos θW . (10.70)

All of the above predictions are subject to small, but measurable, loop corrections.
For example, the present experimental values m W = 80.379 ± 0.012 GeV and m Z =
91.1876 ± 0.0021 GeV give:
$$\sin^2\theta_W^{\rm on-shell} \equiv 1 - m_W^2/m_Z^2 = 0.22301 \pm 0.00025, \tag{10.71}$$

which is significantly lower than the $\overline{\rm MS}$-scheme running value in (10.34).
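
The numbers in (10.67) through (10.71) can be reproduced with a few lines of arithmetic. In the sketch below, the Fermi constant G_F = 1.1663787 × 10⁻⁵ GeV⁻² is the measured value and is an input not quoted in the text above. The couplings are the rounded running values from (10.31) and (10.32), so the tree-level masses are only expected to agree with experiment at roughly the percent level. The function names are hypothetical.

```python
import math

G_FERMI = 1.1663787e-5               # GeV^-2, measured value (assumed input, not from the text)
G_SU2, G_U1 = 0.652, 0.357           # g and g' from (10.31) and (10.32)
M_W_EXP, M_Z_EXP = 80.379, 91.1876   # GeV, experimental masses quoted above


def higgs_vev(fermi_constant):
    """Invert (10.69): v = (sqrt(2) G_F)^(-1/2)."""
    return 1.0 / math.sqrt(math.sqrt(2.0) * fermi_constant)


def tree_level_masses(g, g_prime, v):
    """Return (m_W, m_Z) from (10.67)."""
    return 0.5 * g * v, 0.5 * math.hypot(g, g_prime) * v


def on_shell_sin2(m_w, m_z):
    """On-shell definition (10.71): 1 - m_W^2 / m_Z^2."""
    return 1.0 - (m_w / m_z) ** 2


v = higgs_vev(G_FERMI)
m_w, m_z = tree_level_masses(G_SU2, G_U1, v)
print(f"v = {v:.2f} GeV")
print(f"tree-level m_W = {m_w:.2f} GeV, m_Z = {m_z:.2f} GeV")
print(f"on-shell sin^2(theta_W) = {on_shell_sin2(M_W_EXP, M_Z_EXP):.5f}")
```

The tree-level ratio m_W/m_Z computed this way equals cos θ_W by construction, as in (10.70), while the on-shell value built from the measured masses differs from the running value because of loop corrections.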

The remaining terms in (10.66) are Higgs-vector-vector and Higgs-Higgs-vector-
vector couplings. This part of the Lagrangian density implies the following unitary
gauge Feynman rules:

with the arrow direction on W± lines indicating the direction of the flow of positive
charge. The field-strength Lagrangian terms of (10.6) provide the momentum part
of the W, Z propagators above, and also yield 3-gauge-boson and 4-gauge-boson
interactions:

where:

$$X^{\mu\nu,\rho\sigma} = 2g^{\mu\nu}g^{\rho\sigma} - g^{\mu\rho}g^{\nu\sigma} - g^{\mu\sigma}g^{\nu\rho}. \tag{10.72}$$

Finally, the Higgs potential V(Φ, Φ†) gives rise to a mass and self-interactions for
h. In unitary gauge:

$$V(h) = \lambda v^2h^2 + \lambda vh^3 + \frac{\lambda}{4}h^4, \tag{10.73}$$
just as in the toy model studied in (9.2). Therefore, the Higgs boson has
self-interactions with Feynman rules:

and a mass

$$m_h = \sqrt{2\lambda}\,v. \tag{10.74}$$

It would be great if we could evaluate this numerically using present data. Unfortu-
nately, while we know what the Higgs VEV v is, there is no present experiment that
gives any direct measurement of λ. Indirectly we know what it needs to be from the
Higgs boson mass, if the SM is the correct theory. Furthermore, there are indirect
effects of the Higgs mass in loops of precision electroweak observables, such as the
Z mass, W mass, sin2 θW , etc. The experiments that measure these observables sug-
gested well before the Higgs boson discovery that m h should be less than 200 GeV.
The self-consistency of these indirect constraints vs. physical mass was verified by
the discovery of the Higgs boson at m h = 125 GeV.
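
As a small illustration of the indirect determination mentioned above, (10.74) can be inverted to estimate λ from the measured Higgs mass. This is only a tree-level sketch: it assumes the SM relation holds exactly and uses the rounded inputs m_h = 125 GeV and v = 246 GeV from the text. The function name is hypothetical.

```python
def higgs_self_coupling(m_h, v):
    """Invert (10.74), m_h = sqrt(2 * lambda) * v, to get lambda = m_h^2 / (2 v^2)."""
    return m_h ** 2 / (2.0 * v ** 2)


lam = higgs_self_coupling(125.0, 246.0)  # masses in GeV
print("tree-level lambda =", round(lam, 3))
```

With these inputs λ comes out near 0.13, comfortably in the perturbative regime.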

10.3 Fermion Masses and Cabibbo-Kobayashi-Maskawa Mixing

The gauge group representations for fermions in the Standard Model are chiral.
This means that the left-handed fermions transform in a different representation than
the right-handed fermions. Chiral fermions have the property that they cannot have
masses without breaking the symmetry that makes them chiral.
For example, suppose we try to write down a mass term for the electron:

$$ \mathcal{L}_{\text{electron mass}} = -m_e \bar{e} e. \tag{10.75} $$

The Dirac spinor for the electron can be separated into left- and right-handed pieces,

$$ e = P_L e_L + P_R e_R, \tag{10.76} $$

and the corresponding barred spinor as:

$$ \bar{e} = (e_L^\dagger P_L + e_R^\dagger P_R)\gamma^0 = e_L^\dagger \gamma^0 P_R + e_R^\dagger \gamma^0 P_L = \bar{e}_L P_R + \bar{e}_R P_L, \tag{10.77} $$

where, to avoid any confusion between $\overline{(e_L)}$ and $(\bar{e})P_L$, we explicitly define

$$ \bar{e}_L \equiv e_L^\dagger \gamma^0, \qquad \bar{e}_R \equiv e_R^\dagger \gamma^0. \tag{10.78} $$


Equation (10.75) can therefore be written:

$$ \mathcal{L}_{\text{electron mass}} = -m_e (\bar{e}_L e_R + \bar{e}_R e_L). \tag{10.79} $$

The point is that this is clearly not a gauge singlet. In the first place, the $e_L$ part of
each term transforms as a doublet under $SU(2)_L$, and the $e_R$ is a singlet, so each term
is an $SU(2)_L$ doublet. Furthermore, the first term has $Y = Y_{e_R} - Y_{e_L} = -1/2$, while
the second term has $Y = 1/2$. All terms in the Lagrangian must be gauge singlets in
order not to violate the gauge symmetry, so the electron mass is disqualified from
appearing in this form. More generally, for any Standard Model fermion $f$, the naive
mass term

$$ \mathcal{L}_{f\ \text{mass}} = -m_f (\bar{f}_L f_R + \bar{f}_R f_L) \tag{10.80} $$

is not an $SU(2)_L$ singlet, and is not neutral under $U(1)_Y$, and so is not allowed.
Fortunately, fermion masses can still arise with the help of the Higgs field. For
the electron, there is a gauge-invariant term:

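$$ \mathcal{L}_{\text{electron Yukawa}} = -y_e \begin{pmatrix} \bar{\nu}_e & \bar{e}_L \end{pmatrix} \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix} e_R + \text{c.c.} \tag{10.81} $$

Here $y_e$ is a Yukawa coupling of the type we studied in (6.3). The field $\begin{pmatrix} \bar{\nu}_e & \bar{e}_L \end{pmatrix}$
carries weak hypercharge +1/2, as does the Higgs field, and $e_R$ carries weak
hypercharge −1, so the whole term is a $U(1)_Y$ singlet, as required. Moreover, the doublets
transform under $SU(2)_L$ as:

$$ \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix} \to e^{-i\theta^a \sigma^a/2} \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix}, \tag{10.82} $$

$$ \begin{pmatrix} \bar{\nu}_e & \bar{e}_L \end{pmatrix} \to \begin{pmatrix} \bar{\nu}_e & \bar{e}_L \end{pmatrix} e^{+i\theta^a \sigma^a/2}, \tag{10.83} $$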

so (10.81) is also an $SU(2)_L$ singlet. Going to the unitary gauge of (10.60), it
becomes:

$$ \mathcal{L}_{\text{Yukawa}} = -\frac{y_e}{\sqrt{2}} (v + h)(\bar{e}_L e_R + \bar{e}_R e_L), \tag{10.84} $$

or, reassembling the Dirac spinors without projection matrices:

$$ \mathcal{L}_{\text{Yukawa}} = -\frac{y_e}{\sqrt{2}} (v + h)\, \bar{e} e. \tag{10.85} $$

This can now be interpreted as an electron mass, equal to

$$ m_e = \frac{y_e v}{\sqrt{2}}, \tag{10.86} $$

and, as a bonus, an electron-positron-Higgs interaction vertex, with Feynman rule:


Since we know the electron mass and the Higgs VEV already, we can compute the
electron Yukawa coupling:

$$ y_e = \sqrt{2} \left( \frac{0.511\ \text{MeV}}{246\ \text{GeV}} \right) = 2.94 \times 10^{-6}. \tag{10.87} $$

Unfortunately, this is so small that we can forget about ever observing the interactions
of the Higgs particle h with an electron. Notice that although the neutrino participates
in the Yukawa interaction, it disappears in unitary gauge from that term.
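The arithmetic in (10.87) extends to any fermion through the tree-level relation (10.86). The sketch below repeats it for a few cases. Only the electron mass and the 173 GeV top mass appear in this chapter; the muon and tau masses are assumed approximate values supplied for illustration, and QCD corrections to the top mass are ignored.

```python
import math

V_GEV = 246.0  # Higgs VEV used in the text

# Masses in GeV. Electron and top values are quoted in the chapter;
# the muon and tau values are ASSUMED approximate inputs for illustration.
FERMION_MASSES_GEV = {
    "e": 0.511e-3,
    "mu": 0.10566,   # assumption
    "tau": 1.7769,   # assumption
    "t": 173.0,
}


def yukawa_coupling(mass_gev, v_gev=V_GEV):
    """Tree-level Yukawa coupling from m_f = y_f * v / sqrt(2)."""
    return math.sqrt(2.0) * mass_gev / v_gev


yukawas = {name: yukawa_coupling(m) for name, m in FERMION_MASSES_GEV.items()}
for name, y in yukawas.items():
    print(f"y_{name} = {y:.3e}")
```

The spread of roughly six orders of magnitude between the electron and top couplings is put in by hand through the measured masses; the Standard Model itself does not predict it.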
Masses for all of the other leptons, and the down-type quarks (d, s, b) in the
Standard Model arise in exactly the same way. For example, the bottom quark mass
comes from the gauge-invariant Yukawa coupling:

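$$ \mathcal{L} = -y_b \begin{pmatrix} \bar{t}_L & \bar{b}_L \end{pmatrix} \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix} b_R + \text{c.c.}, \tag{10.88} $$

implying that, in unitary gauge, we have a b-quark mass and an $hb\bar{b}$ vertex:

$$ \mathcal{L} = -\frac{y_b}{\sqrt{2}} (v + h)\, \bar{b} b. \tag{10.89} $$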

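The situation is slightly different for up-type quarks (u, c, t), because the complex
conjugate of the field $\Phi$ must appear in order to preserve $U(1)_Y$ invariance. It is
convenient to define

$$ \tilde{\Phi} \equiv \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \Phi^* = \begin{pmatrix} \phi^{0*} \\ -\phi^{+*} \end{pmatrix}, \tag{10.90} $$

which transforms as an $SU(2)_L$ doublet in exactly the same way that $\Phi$ does:

$$ \Phi \to e^{i\theta^a \sigma^a/2}\, \Phi, \tag{10.91} $$

$$ \tilde{\Phi} \to e^{i\theta^a \sigma^a/2}\, \tilde{\Phi}. \tag{10.92} $$

Also, $\tilde{\Phi}$ has weak hypercharge $Y = -1/2$. (The field $\phi^{+*}$ has a negative electric
charge.) Therefore, one can write a gauge-invariant Yukawa coupling for the top
quark as:

$$ \mathcal{L} = -y_t \begin{pmatrix} \bar{t}_L & \bar{b}_L \end{pmatrix} \begin{pmatrix} \phi^{0*} \\ -\phi^{+*} \end{pmatrix} t_R + \text{c.c.} \tag{10.93} $$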

Going to unitary gauge, one finds that the top quark has a mass:

$$ \mathcal{L} = -\frac{y_t}{\sqrt{2}} (v + h)\, \bar{t} t. \tag{10.94} $$

In all cases, the unitary-gauge version of the gauge-invariant Yukawa interaction is

$$ \mathcal{L} = -\frac{y_f}{\sqrt{2}} (v + h)\, \bar{f} f. \tag{10.95} $$

The mass and the h-fermion-antifermion coupling obtained by each Standard Model
fermion in this way are both proportional to y f . The Higgs mechanism not only
explains the masses of the W ± and Z bosons, but also explains the masses of fermions.
Notice that all of the particles in the Standard Model (except the photon and gluon,
which must remain massless because of SU (3)c × U (1)EM gauge invariance) get a
mass from spontaneous electroweak symmetry breaking of the form:

m particle = kv, (10.96)

where $k$ is some combination of dimensionless couplings. For the fermions, it is
proportional to a Yukawa coupling; for the $W^\pm$ and $Z$ bosons it depends on gauge
couplings, and for the Higgs particle itself, it depends on the Higgs self-coupling $\lambda$.
There are two notable qualifications for quarks. First, gluon loops make a large
modification to the tree-level prediction. For each quark, the physical mass measured
from kinematics is

$$ m_q = \frac{y_q v}{\sqrt{2}} \left[ 1 + \frac{4\alpha_s}{3\pi} + \cdots \right], \tag{10.97} $$

where $y_q$ is the running (renormalized) Yukawa coupling evaluated at a
renormalization scale $\mu = m_q$. The QCD gluon-loop correction increases $m_t$ by roughly 6%,
and has an even larger effect on $m_b$ because $\alpha_s$ is larger at the renormalization scale
$\mu = m_b$ than at $\mu = m_t$. The two-loop and higher corrections (indicated by $\cdots$) are
smaller but still significant, and must be taken into account in precision work, for
example when predicting the branching ratios of the Higgs boson decay. This was
the reason for the notation m f (more nearly $y_f v/\sqrt{2}$ than $m_f$) that was used in (6.3).
A second qualification is that there is actually another source of quark masses,
coming from non-perturbative QCD effects, as has already been mentioned at the
end of Sect. 9.3. If the Higgs field did not break SU (2) L × U (1)Y , then these chi-
ral symmetry breaking effects would do it anyway, using an antifermion-fermion
VEV rather than a scalar VEV. This gives contributions to all quark masses that are
roughly of order $\Lambda_{\text{QCD}}$. For the top, bottom, and even charm quarks, this is
relatively insignificant. However, for the up and down quarks, it is actually the dominant
effect. They get only a few MeV of their mass from the Higgs field. Therefore, the
most important source of mass in ordinary nuclear matter is really chiral symmetry
breaking in QCD, not the Standard Model Higgs field.

The Standard Model fermions consist of three families with identical gauge inter-
actions. Therefore, the most general form of the Yukawa interactions is actually:
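$$ \mathcal{L}_{e,\mu,\tau\ \text{Yukawas}} = -\begin{pmatrix} \bar{\nu}'^{i} & \bar{\ell}'^{i}_L \end{pmatrix} \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix} y_{e\,i}{}^{j}\, \ell'_{Rj} + \text{c.c.}, \tag{10.98} $$

$$ \mathcal{L}_{d,s,b\ \text{Yukawas}} = -\begin{pmatrix} \bar{u}'^{i}_L & \bar{d}'^{i}_L \end{pmatrix} \begin{pmatrix} \phi^+ \\ \phi^0 \end{pmatrix} y_{d\,i}{}^{j}\, d'_{Rj} + \text{c.c.}, \tag{10.99} $$

$$ \mathcal{L}_{u,c,t\ \text{Yukawas}} = -\begin{pmatrix} \bar{u}'^{i}_L & \bar{d}'^{i}_L \end{pmatrix} \begin{pmatrix} \phi^{0*} \\ -\phi^{+*} \end{pmatrix} y_{u\,i}{}^{j}\, u'_{Rj} + \text{c.c.} \tag{10.100} $$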

Here i, j are indices that run over the three families, so that:
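$$ \ell'_{Rj} = \begin{pmatrix} e'_R \\ \mu'_R \\ \tau'_R \end{pmatrix}, \qquad \ell'_{Lj} = \begin{pmatrix} e'_L \\ \mu'_L \\ \tau'_L \end{pmatrix}, \qquad \nu'_i = \begin{pmatrix} \nu'_e \\ \nu'_\mu \\ \nu'_\tau \end{pmatrix}, \tag{10.101} $$

$$ d'_{Lj} = \begin{pmatrix} d'_L \\ s'_L \\ b'_L \end{pmatrix}, \qquad d'_{Rj} = \begin{pmatrix} d'_R \\ s'_R \\ b'_R \end{pmatrix}, \qquad u'_{Lj} = \begin{pmatrix} u'_L \\ c'_L \\ t'_L \end{pmatrix}, \qquad u'_{Rj} = \begin{pmatrix} u'_R \\ c'_R \\ t'_R \end{pmatrix}. \tag{10.102} $$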

In a general basis, the Yukawa couplings

$$ y_{e\,i}{}^{j}, \qquad y_{d\,i}{}^{j}, \qquad y_{u\,i}{}^{j} \tag{10.103} $$

are complex 3 × 3 matrices in family space. In unitary gauge, the Yukawa interaction
Lagrangian can be written as:

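$$ \mathcal{L} = -\left( 1 + \frac{h}{v} \right) \left[ \bar{\ell}'^{i}_L\, m_{e\,i}{}^{j}\, \ell'_{Rj} + \bar{d}'^{i}_L\, m_{d\,i}{}^{j}\, d'_{Rj} + \bar{u}'^{i}_L\, m_{u\,i}{}^{j}\, u'_{Rj} \right] + \text{c.c.}, \tag{10.104} $$

where

$$ m_{f\,i}{}^{j} = \frac{v}{\sqrt{2}}\, y_{f\,i}{}^{j}. \tag{10.105} $$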
2

(The fermion fields are now labeled with a prime, to distinguish them from the basis
we are about to introduce.) It therefore appears that the masses of Standard Model
fermions are actually 3 × 3 complex matrices.
It is most convenient to work in a basis in which the fermion masses are real and
positive, so that the Feynman propagators are simple. This can always be accom-
plished, thanks to the following:

Mass Diagonalization Theorem. Any complex matrix $M$ can be diagonalized by a biunitary
transformation:

$$ U_L^\dagger M U_R = M_D \tag{10.106} $$

where $M_D$ is diagonal with positive real entries, and $U_L$ and $U_R$ are unitary matrices.
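In numerical work, the theorem above is realized by the singular value decomposition. The sketch below uses NumPy's `numpy.linalg.svd` on an arbitrary random complex 3 × 3 matrix (an illustrative stand-in, not a physical mass matrix) and checks (10.106). The helper name `biunitary_diagonalize` is hypothetical.

```python
import numpy as np


def biunitary_diagonalize(mass_matrix):
    """Return (U_L, U_R, masses) such that U_L^dagger @ M @ U_R = diag(masses)."""
    # numpy's SVD factorizes M = U @ diag(s) @ Vh with U, Vh unitary and s >= 0.
    u, s, vh = np.linalg.svd(mass_matrix)
    # Reorder columns so that the singular values ascend, like (m_e, m_mu, m_tau).
    order = np.argsort(s)
    u_left = u[:, order]
    u_right = vh.conj().T[:, order]
    return u_left, u_right, s[order]


# ASSUMPTION: an arbitrary complex matrix stands in for a Yukawa-induced mass matrix.
rng = np.random.default_rng(seed=0)
M = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

U_L, U_R, masses = biunitary_diagonalize(M)
M_D = U_L.conj().T @ M @ U_R
print(np.round(np.abs(M_D), 6))
```

The singular values are real and non-negative by construction, which is why no separate phase rotation is needed to make the diagonal entries real.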

To apply this in the present case, consider the following redefinition of the lepton
fields:

$$ \ell'_{Li} = L_{L\,i}{}^{j}\, \ell_{Lj}, \qquad \ell'_{Ri} = L_{R\,i}{}^{j}\, \ell_{Rj}, \tag{10.107} $$

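where $L_{L\,i}{}^{j}$ and $L_{R\,i}{}^{j}$ are unitary 3 × 3 matrices. The lepton mass term in the unitary
gauge Lagrangian then becomes:

$$ \mathcal{L} = -\left( 1 + \frac{h}{v} \right) \bar{\ell}^{i}_L\, (L_L^\dagger \mathbf{m}_e L_R)_i{}^{j}\, \ell_{Rj}. \tag{10.108} $$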

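Now, the theorem just stated assures us that we can choose the matrices $L_L$ and $L_R$
so that:

$$ L_L^\dagger \mathbf{m}_e L_R = \begin{pmatrix} m_e & 0 & 0 \\ 0 & m_\mu & 0 \\ 0 & 0 & m_\tau \end{pmatrix}. \tag{10.109} $$

So we can write in terms of the unprimed (mass eigenstate) fields:

$$ \mathcal{L} = -\left( 1 + \frac{h}{v} \right) (m_e \bar{e} e + m_\mu \bar{\mu} \mu + m_\tau \bar{\tau} \tau). \tag{10.110} $$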

In the same way, one can do unitary-matrix redefinitions of the quark fields:

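$$ d'_{Li} = D_{L\,i}{}^{j}\, d_{Lj}, \qquad d'_{Ri} = D_{R\,i}{}^{j}\, d_{Rj}, \tag{10.111} $$

$$ u'_{Li} = U_{L\,i}{}^{j}\, u_{Lj}, \qquad u'_{Ri} = U_{R\,i}{}^{j}\, u_{Rj}, \tag{10.112} $$

chosen in such a way that

$$ D_L^\dagger \mathbf{m}_d D_R = \begin{pmatrix} m_d & 0 & 0 \\ 0 & m_s & 0 \\ 0 & 0 & m_b \end{pmatrix}, \tag{10.113} $$

$$ U_L^\dagger \mathbf{m}_u U_R = \begin{pmatrix} m_u & 0 & 0 \\ 0 & m_c & 0 \\ 0 & 0 & m_t \end{pmatrix}, \tag{10.114} $$

with real and positive diagonal entries.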

It might now seem that worrying about the possibility of non-diagonal 3 × 3
Yukawa matrices was just a waste of time, but one must now consider how these
field redefinitions from primed (gauge eigenstate) to unprimed (mass eigenstate)
fields affect the other terms in the Lagrangian. First, consider the derivative kinetic
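terms. For the leptons, $\mathcal{L}$ contains

$$ i \bar{\ell}'^{j}_L \slashed{\partial} \ell'_{Lj} = i (\bar{\ell}_L L_L^\dagger)^{j} \slashed{\partial} (L_L \ell_L)_j = i \bar{\ell}^{j}_L \slashed{\partial} \ell_{Lj}. \tag{10.115} $$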

This relies on the fact that the (constant) field redefinition matrix $L_L$ is unitary,
$L_L^\dagger L_L = 1$. The same thing works for all of the other derivative kinetic terms, for
example, for right-handed up-type quarks:

$$ i \bar{u}'^{j}_R \slashed{\partial} u'_{Rj} = i (\bar{u}_R U_R^\dagger)^{j} \slashed{\partial} (U_R u_R)_j = i \bar{u}^{j}_R \slashed{\partial} u_{Rj}, \tag{10.116} $$

which relies on $U_R^\dagger U_R = 1$. So the redefinition has no effect at all here; the form
of the derivative kinetic terms is exactly the same for unprimed fields as for primed
fields.
There are also interactions of fermions with gauge bosons. For example, for the
right-handed leptons, the QED Lagrangian contains a term

$$ -e A_\mu \bar{\ell}'^{j}_R \gamma^\mu \ell'_{Rj} = -e A_\mu (\bar{\ell}_R L_R^\dagger)^{j} \gamma^\mu (L_R \ell_R)_j = -e A_\mu \bar{\ell}^{j}_R \gamma^\mu \ell_{Rj}. \tag{10.117} $$

Just as before, the unitary condition (this time $L_R^\dagger L_R = 1$) guarantees that the form of
the Lagrangian term is exactly the same for unprimed fields as for primed fields. You
can show quite easily that the same thing applies to interactions of all fermions with
$Z_\mu$ and the gluon fields. The unitary redefinition matrices for quarks just commute
with the $SU(3)_c$ generators, since they act on different indices.
But, there is one place in the Standard Model where the above argument does
not work, namely the interactions of W ± vector bosons. This is because the W ±
interactions involve two different types of fermions, with different unitary redefini-
tion matrices. Consider first the interactions of the W ± with leptons. In terms of the
original primed fields:
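$$ \mathcal{L} = -\frac{g}{\sqrt{2}} W^+_\mu\, \bar{\nu}'^{i} \gamma^\mu \ell'_{Li} + \text{c.c.}, \tag{10.118} $$

so that

$$ \mathcal{L} = -\frac{g}{\sqrt{2}} W^+_\mu\, \bar{\nu}'^{i} \gamma^\mu (L_L \ell_L)_i + \text{c.c.} \tag{10.119} $$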

Since we did not include a Yukawa coupling or mass term for the neutrinos, we
did not have to make a unitary redefinition for them. But now we are free to do so;
defining $\nu_i$ in the same way as the corresponding charged leptons, $\nu'_i = L_{L\,i}{}^{j}\, \nu_j$, we
get,

$$ \bar{\nu}'^{i} = (\bar{\nu} L_L^\dagger)^{i}, \tag{10.120} $$

resulting in

$$ \mathcal{L} = -\frac{g}{\sqrt{2}} W^+_\mu\, \bar{\nu}^{i} \gamma^\mu \ell_{Li} + \text{c.c.} \tag{10.121} $$

So once again the interactions of W ± bosons have exactly the same form for mass-
eigenstate leptons. However, consider the interactions of W ± bosons with quarks.
In terms of the original primed fields,
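$$ \mathcal{L} = -\frac{g}{\sqrt{2}} W^+_\mu\, \bar{u}'^{i}_L \gamma^\mu d'_{Li} + \text{c.c.}, \tag{10.122} $$

which becomes:

$$ \mathcal{L} = -\frac{g}{\sqrt{2}} W^+_\mu\, \bar{u}^{i}_L \gamma^\mu (U_L^\dagger D_L d_L)_i + \text{c.c.} \tag{10.123} $$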

There is no reason why $U_L^\dagger D_L$ should be equal to the unit matrix, and in fact it is
not. So we have finally encountered a consequence of going to the mass-eigenstate
basis. The charged-current weak interactions contain a non-trivial matrix operating
in quark family space,

$$ V = U_L^\dagger D_L, \tag{10.124} $$

called the Cabibbo-Kobayashi-Maskawa matrix (or CKM matrix). The CKM matrix
$V$ is itself unitary, since $V^\dagger = (U_L^\dagger D_L)^\dagger = D_L^\dagger U_L$, implying that $V^\dagger V = V V^\dagger = 1$.
But, it cannot be removed by going to some other basis using a further unitary matrix
without ruining the diagonal quark masses. So we are stuck with it.
One can think of $V$ as just a unitary rotation acting on the left-handed down
quarks. From (10.123), we can define

$$ d''_{Li} = V_i{}^{j}\, d_{Lj}, \tag{10.125} $$

where the $d_{Lj}$ are mass eigenstate quark fields, and the $d''_{Li}$ are the quarks that interact
in a simple way with W bosons:

$$ \mathcal{L} = -\frac{g}{\sqrt{2}} W^+_\mu\, \bar{u}^{i}_L \gamma^\mu d''_{Li} + \text{c.c.}, \tag{10.126} $$

and so in terms of mass eigenstate quark fields:

$$ \mathcal{L} = -\frac{g}{\sqrt{2}} V_i{}^{j}\, W^+_\mu\, \bar{u}^{i}_L \gamma^\mu d_{Lj} + \text{c.c.} \tag{10.127} $$

This is the entire effect of the non-diagonal Yukawa matrices.
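The construction of $V$ in (10.124) can be mimicked numerically. The sketch below draws two arbitrary complex 3 × 3 matrices as stand-ins for the up-type and down-type mass matrices (an assumption for illustration only, with no physical input), extracts the left-handed rotations from a singular value decomposition, and confirms that their mismatch is unitary but not the identity.

```python
import numpy as np


def left_rotation(mass_matrix):
    """Left unitary matrix U_L of M = U_L @ diag(masses) @ U_R^dagger, masses ascending."""
    u, s, _ = np.linalg.svd(mass_matrix)
    return u[:, np.argsort(s)]


# ASSUMPTION: random complex matrices stand in for the true (unknown) Yukawa structure.
rng = np.random.default_rng(seed=1)
m_up = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
m_down = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))

U_L = left_rotation(m_up)
D_L = left_rotation(m_down)

# The mismatch between the two left-handed rotations plays the role of the CKM matrix.
V = U_L.conj().T @ D_L
print(np.round(np.abs(V), 3))
```

For generic inputs like these the mixing comes out large. The near-diagonal pattern observed in nature is an empirical feature of the actual Yukawa matrices, not a consequence of the construction.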

The numerical entries of the CKM matrix are a subject of continuing experimental
investigation. To a first approximation, it turns out that the CKM mixing is just a
rotation of down and strange quarks:
$$ V = \begin{pmatrix} \cos\theta_c & \sin\theta_c & 0 \\ -\sin\theta_c & \cos\theta_c & 0 \\ 0 & 0 & 1 \end{pmatrix} \tag{10.128} $$

where θc is called the Cabibbo angle. This implies that the interactions of W + with
the mass-eigenstate quarks are very nearly:
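$$ \mathcal{L} = -\frac{g}{\sqrt{2}} W^+_\mu \Big( \cos\theta_c \left[ \bar{u}_L \gamma^\mu d_L + \bar{c}_L \gamma^\mu s_L \right] + \sin\theta_c \left[ \bar{u}_L \gamma^\mu s_L - \bar{c}_L \gamma^\mu d_L \right] + \bar{t}_L \gamma^\mu b_L \Big). \tag{10.129} $$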

The terms proportional to sin θc are responsible for strangeness-changing decays.

Numerically,

$$ \cos\theta_c \approx 0.974, \qquad \sin\theta_c \approx 0.23. \tag{10.130} $$

Strange hadrons have long lifetimes because they decay through the weak
interactions, and with reduced matrix elements that are proportional to $\sin^2\theta_c \approx 0.05$.
More precisely, the CKM matrix is:
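$$ V = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix} \approx \begin{pmatrix} 0.9743 & 0.2252 & 0.004 \\ 0.230 & 0.975 & 0.041 \\ 0.008 & 0.04 & 0.999 \end{pmatrix}, \tag{10.131} $$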

where the numerical values given are estimates of the magnitude only (not the sign
or phase). In fact, the CKM matrix contains one phase that cannot be removed by
redefining phases of the fermion fields. This phase is the only source of CP violation
in the Standard Model.
Weak decays of mesons involving the W bosons allow the entries of the CKM
matrix to be probed experimentally. For example, decays

$$ B \to D\, \ell^+ \nu_\ell, \tag{10.132} $$

where B is a meson containing a bottom quark and D contains a charm quark, can be
used to extract |Vcb |. The very long lifetimes of B mesons are explained by the fact
that |Vub | and |Vcb | are very small. One of the ways of testing the Standard Model is
to check that the CKM matrix is indeed unitary:

$$ V^\dagger V = 1, \tag{10.133} $$

which implies in particular that $V_{ud} V_{ub}^* + V_{cd} V_{cb}^* + V_{td} V_{tb}^* = 0$. This is an automatic
consequence of the Standard Model, but if there is further unknown physics out there,
then the weak interactions could appear to violate CKM unitarity.
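Part of the unitarity test can be illustrated with the magnitudes quoted in (10.131) alone. Because only magnitudes are given, the sketch below can check just the diagonal conditions (each row and each column of $|V_{ij}|^2$ should sum to 1); the off-diagonal relation quoted above also needs the relative phases, which are not available here.

```python
import numpy as np

# Magnitudes |V_ij| quoted in (10.131); rows are (u, c, t), columns are (d, s, b).
CKM_MAGNITUDES = np.array([
    [0.9743, 0.2252, 0.004],
    [0.230,  0.975,  0.041],
    [0.008,  0.04,   0.999],
])

# Unitarity requires every row and every column of |V|^2 to sum to 1.
row_sums = (CKM_MAGNITUDES ** 2).sum(axis=1)
col_sums = (CKM_MAGNITUDES ** 2).sum(axis=0)

print("row sums:   ", np.round(row_sums, 4))
print("column sums:", np.round(col_sums, 4))
```

With the rounded entries of (10.131), every sum lies within about half a percent of 1. A real test requires the measured uncertainties on each entry, which this sketch does not model.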

The partial decay widths and branching ratios of the W boson can be worked out
from (10.121) and (10.127), and agree well with the experimental results:

$$ \text{BR}(W^+ \to \ell^+ \nu_\ell) = 0.1086 \pm 0.0009 \quad (\text{for } \ell = e, \mu, \tau, \text{ each}) \tag{10.134} $$

$$ \text{BR}(W^+ \to \text{hadrons}) = 0.6741 \pm 0.0027, \tag{10.135} $$

where “hadrons” refers mostly to the Cabibbo-allowed final states $u\bar{d}$ and $c\bar{s}$, with a
much smaller contribution from the Cabibbo-suppressed final states $c\bar{d}$ and $u\bar{s}$. The
$t\bar{b}$ final state is of course not available due to kinematics; this implies the useful fact
that (up to a very small effect from CKM mixing) b quarks do not result from W
decays in the Standard Model.
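The measured pattern can be understood from simple channel counting. The sketch below is a rough estimate, not the full calculation: it treats all final-state fermions as massless, uses CKM unitarity so that each kinematically allowed up-type quark contributes one full unit per color, and applies the standard leading QCD factor $1 + \alpha_s/\pi$ to the hadronic channels. The value of $\alpha_s$ near the W mass is an assumed input.

```python
import math

ALPHA_S_AT_MW = 0.12  # ASSUMPTION: approximate strong coupling near the W mass


def w_branching_ratios(alpha_s):
    """Estimate BR(W -> l nu) per lepton flavor and BR(W -> hadrons) by counting."""
    n_lepton_channels = 3    # e nu, mu nu, tau nu, each with unit weight
    n_colors = 3
    n_up_type_quarks = 2     # u and c; the top quark is too heavy to appear
    qcd_factor = 1.0 + alpha_s / math.pi
    hadronic = n_up_type_quarks * n_colors * qcd_factor
    total = n_lepton_channels + hadronic
    return 1.0 / total, hadronic / total


br_lepton_lo, br_hadrons_lo = w_branching_ratios(0.0)
br_lepton, br_hadrons = w_branching_ratios(ALPHA_S_AT_MW)
print(f"no QCD correction:   BR(l nu) = {br_lepton_lo:.4f}, BR(hadrons) = {br_hadrons_lo:.4f}")
print(f"with 1 + alpha_s/pi: BR(l nu) = {br_lepton:.4f}, BR(hadrons) = {br_hadrons:.4f}")
```

Pure counting gives 1/9 and 2/3. Including the QCD factor moves both estimates closer to the measured values in (10.134) and (10.135).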
Also, the very small magnitudes of Vtd and Vts imply that top quarks decay to
bottom quarks almost every time:

BR(t → W + b) ≈ 1. (10.136)

This greatly simplifies the experimental identification of top quarks.

10.4 Neutrino Masses and the Seesaw Mechanism

Evidence from observation of neutrinos produced in the Sun, the atmosphere,
accelerators, and reactors has now established that neutrinos do have mass. In the
renormalizable version of the Standard Model given up to here, this cannot be explained.
The basic reason for this is the absence of right-handed neutrinos from the list in
(10.35). To remedy the situation, we can add three right-handed fermions that are
singlets under all three components of the gauge group SU (3)c × SU (2) L × U (1)Y :

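$$ N_{R1},\ N_{R2},\ N_{R3} \longleftrightarrow (\mathbf{1}, \mathbf{1}, 0). \tag{10.137} $$

With these additional gauge-singlet right-handed neutrino degrees of freedom, it is
now possible to write down a gauge-invariant Lagrangian interaction of the neutrinos
with the Higgs field:

$$ \mathcal{L}_{\text{neutrino Yukawas}} = -\begin{pmatrix} \bar{\nu}'^{i} & \bar{\ell}'^{i}_L \end{pmatrix} \begin{pmatrix} \phi^{0*} \\ -\phi^{+*} \end{pmatrix} y_{\nu\,i}{}^{j}\, N'_{Rj} + \text{c.c.} \tag{10.138} $$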

Note the similarity of this with the up-type quark Yukawa couplings in (10.100).
Going to unitary gauge, one obtains a neutrino mass matrix

$$ m_{\nu\,i}{}^{j} = \frac{v}{\sqrt{2}}\, y_{\nu\,i}{}^{j}, \tag{10.139} $$

just as in (10.105) for the Standard Model charged fermions. This neutrino mass
matrix can be diagonalized to obtain the physical neutrino masses as the absolute
values of its eigenvalues. The neutrinos in this scenario are Dirac fermions, as the
mass term couples together left-handed and right-handed degrees of freedom that
are independent.
Although the magnitudes of the neutrino masses are not yet determined by
experiment, there are strong upper bounds, as seen in Table 1.2 in the Introduction. Also,
limits from the WMAP and Planck measurements of the cosmic background
radiation, interpreted within the standard cosmological model, imply that the sum of
the masses of the three Standard Model neutrinos should be at most 0.17 eV. Neutrino
oscillation data do not constrain the individual neutrino masses, but imply that the largest
differences between squared masses should be less than $3 \times 10^{-3}\ \text{eV}^2$. So, several
independent pieces of evidence indicate that neutrino masses are much smaller than
any of the charged fermion masses. To accommodate this within the Dirac mass
framework of (10.139), the eigenvalues of the neutrino Yukawa matrix $y_{\nu\,i}{}^{j}$ would
have to be extremely small, no larger than about $10^{-9}$. Such small dimensionless
couplings appear slightly unnatural, in a purely subjective sense, and this suggests
that neutrino masses may have a different origin than quark and charged-lepton masses.
The seesaw mechanism is a way of addressing this problem, such that very small
neutrino masses naturally occur, even if the corresponding Yukawa couplings are of
order 1. One includes, besides (10.138), a new term in the Lagrangian:

$$ \mathcal{L} = -\frac{1}{2} M^{ij}\, \overline{N}_i N_j, \tag{10.140} $$

where $N_i$ are the Majorana fermion fields (see Sect. 3.4) that include $N_{Ri}$, and $M^{ij}$
is a symmetric mass matrix. If the neutrino fields carry lepton number 1, then this
Majorana mass term necessarily violates the total lepton number $L = L_e + L_\mu + L_\tau$.
Now the total mass matrix for the left-handed neutrino fields $\nu_{Li}$ and the
right-handed neutrino fields $N_{Ri}$, including both (10.139) and (10.140), is:

$$ \mathcal{M} = \begin{pmatrix} 0 & \frac{v}{\sqrt{2}}\, y_\nu \\ \frac{v}{\sqrt{2}}\, y_\nu^T & M \end{pmatrix}. \tag{10.141} $$

The point of the seesaw mechanism is that if the eigenvalues of $M$ are much larger than
those of the Dirac mass matrix $\frac{v}{\sqrt{2}}\, y_\nu$, then the smaller set of mass eigenvalues of $\mathcal{M}$
will be pushed down. Since $M$ does not arise from electroweak symmetry breaking,
it can naturally be very large. For illustration, taking $M$ and $y_\nu$ to be 1 × 1 matrices,
the absolute values of the neutrino mass eigenvalues of $\mathcal{M}$ are approximately:

$$ \frac{v^2 y_\nu^2}{2M}, \quad \text{and} \quad M \tag{10.142} $$

in the limit $v y_\nu \ll M$. For example, to get a neutrino mass of order 0.1 eV, one could
have $y_\nu = 1.0$ and $M = 3 \times 10^{14}$ GeV, or $y_\nu = 0.1$ and $M = 3 \times 10^{12}$ GeV. The
light neutrino states (corresponding to the lighter eigenvectors of the total mass matrix $\mathcal{M}$) are mostly the
Standard Model ν L and they are Majorana fermions. There are also three extremely
heavy Majorana neutrino mass eigenstates, which decouple from present weak inter-
action experiments. The fact that the magnitude of M necessary to make this work is
not larger than the Planck scale, and is very roughly commensurate with other scales
that occur in other theories such as supersymmetry, is encouraging. In any case,
the ease with which the seesaw mechanism accommodates very small but non-zero
neutrino masses has made it a favorite scenario of theorists.
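The two numerical examples quoted above can be reproduced from the exact eigenvalues of the one-family version of (10.141). The sketch below avoids a generic numerical eigenvalue solver on purpose: the two eigenvalues differ by roughly 25 orders of magnitude, so the light one would be lost to double-precision rounding. It uses the fact that the product of the two eigenvalue magnitudes equals the Dirac mass squared. Function and variable names are illustrative.

```python
import math

V_GEV = 246.0      # Higgs VEV used in the text
GEV_TO_EV = 1.0e9


def seesaw_masses(y_nu, majorana_mass_gev, v_gev=V_GEV):
    """Absolute eigenvalues (light, heavy) in GeV of [[0, m_D], [m_D, M]]."""
    m_dirac = v_gev * y_nu / math.sqrt(2.0)
    root = math.sqrt(majorana_mass_gev ** 2 + 4.0 * m_dirac ** 2)
    heavy = 0.5 * (majorana_mass_gev + root)
    # |light| * heavy = m_D^2 exactly; this form avoids subtractive cancellation.
    light = m_dirac ** 2 / heavy
    return light, heavy


light_a, heavy_a = seesaw_masses(y_nu=1.0, majorana_mass_gev=3.0e14)
light_b, heavy_b = seesaw_masses(y_nu=0.1, majorana_mass_gev=3.0e12)
print(f"y = 1.0, M = 3e14 GeV: light mass = {light_a * GEV_TO_EV:.3f} eV")
print(f"y = 0.1, M = 3e12 GeV: light mass = {light_b * GEV_TO_EV:.3f} eV")
```

Both parameter choices give a light eigenvalue of about 0.1 eV, in agreement with the approximate formula (10.142).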
In either of the two cases above, the left-handed parts of the neutrino mass eigen-
states ν1 , ν2 , ν3 (with masses m 1 < m 2 < m 3 ) can be related to the left-handed parts
of the neutrino weak-interaction eigenstates νe , νμ , ντ (which each couple to the
corresponding charged lepton only, and the W boson) by:
$$ \begin{pmatrix} \nu_{eL} \\ \nu_{\mu L} \\ \nu_{\tau L} \end{pmatrix} = U \begin{pmatrix} \nu_{1L} \\ \nu_{2L} \\ \nu_{3L} \end{pmatrix}, \tag{10.143} $$

where $U$ is a unitary matrix known as the Pontecorvo-Maki-Nakagawa-Sakata
(PMNS) matrix. Neutrino oscillations are due to the fact that $U$ is not equal to
the identity matrix.
To recap, by introducing right-handed gauge-singlet neutrino degrees of freedom,
there are two distinct scenarios for neutrino masses. In the Dirac scenario, neutrinos
and antineutrinos are distinct, and total lepton number is conserved although the
individual lepton numbers are violated. In the Majorana scenario, neutrinos are their
own antiparticles, and total lepton number is violated along with the individual lepton
numbers. At this writing, both possibilities are consistent with the experimental data.
To tell the difference between these two scenarios, one can look for neutrinoless
double beta decay, i.e. a nuclear transition
$$ {}^{A}Z \to {}^{A}(Z + 2) + e^- e^-, \tag{10.144} $$

(at the nucleon level $nn \to pp\, e^- e^-$), which can proceed via the quark-level
Feynman diagram shown below.

Since this process requires a violation of total lepton number in the neutrino propaga-
tor, it can only occur in the case of Majorana neutrinos. It is the subject of continuing
searches.

10.5 The Higgs Boson Discovery

One of the most momentous discoveries of the last half century was that of the Higgs
boson. Before its discovery in 2012, its properties and even its existence were in
doubt, despite the many successes of the Standard Model. Research publications on
“Higgsless theories” had persisted to the very end. As discussed above, the simplest
path to achieve masses for the W and Z bosons and the fermions of the Standard
Model is to introduce a single scalar Higgs boson doublet that condenses, breaking
electroweak symmetry down to the U (1)EM . The fluctuation around this background
value is the Higgs boson. However, when the theory was first formulated, there
was little guidance as to what mass it should have. One only knew that it was
controlled by the vacuum expectation value, which was known to be $v \approx 246$ GeV, and
a dimensionless constant $\lambda$ that was completely unknown, leading to an unknown
mass $m_h = \sqrt{2\lambda}\, v$, as we saw in (10.74).
Experiments had been searching for the Higgs boson for decades without success.
Just prior to its discovery, evidence from the sum of data collected from the Z pole
experiments at CERN LEP and SLAC SLC, combined with the top quark and W
mass measurements at Tevatron and LEP2 at CERN, suggested that if the Standard
Model is the underlying theory of the weak scale, then the Higgs boson mass needed
to be in the range $114\ \text{GeV} < m_h \lesssim 180\ \text{GeV}$ at 95% confidence level. It should be
emphasized that the lower bound of 114 GeV was derived directly by not seeing the
Higgs boson produced and decay at LEP2, whereas the upper bound was derived
indirectly, and thus less reliably, by a global analysis of compatibility to all data that
is sensitive to the Higgs boson mass, via quantum loops. A problem with this kind
of indirect bound is that it is always possible that some other unsuspected particle(s)
also contribute in the loops, interfering with the Higgs boson contribution.
Below, we will review the physics of the Higgs boson discovery at the LHC,
starting with a discussion of the decay modes.

10.5.1 Higgs Boson Decays Revisited

In Sect. 6.3, we have already calculated the leading-order decays of the Higgs boson
into fermion-antifermion final states. However, there are other final states that are
quite important besides h → f f . First, the Higgs boson can decay into two gluons,
h → gg, with the gluons eventually manifesting themselves in the detector as jets.
This decay cannot happen at tree level, but does occur through the one-loop diagram
below, where quarks go around the loop:

(One must also include the diagram with the gluons exchanged, or equivalently the
diagram with the quarks running the other direction around the loop.) Even though
one-loop graph amplitudes are usually not competitive with tree-level amplitudes,
this is an exception because the gluons have strong couplings and because the top
quark with its large yt participates in the loop diagrams, while in the on-shell 2-body
decays to fermions, only lighter fermions (with much smaller Yukawa couplings)
can appear.
The resulting partial decay width for h → gg depends on the quark masses in
two places. First, at the hq q̄ vertex there is a Yukawa coupling, and second there
needs to be a chirality flip in one of the propagators to enable a non-zero result. That
is, the trace over the three fermion propagator numerators vanishes (due to an odd
number of γ matrices) unless one of the propagators is traced over the mass term.
These two facts explain why the top quark, being by far the most massive quark,
gives a dominant contribution to this amplitude.
The spin-summed squared amplitude for the h → gg transition is

$$ |\mathcal{M}|^2 = \frac{m_h^4}{v^2} \left( \frac{\alpha_s}{\pi} \right)^2 \left| \sum_q A_{1/2}(\tau_q) \right|^2, \tag{10.145} $$

where the sum is over quark flavors $q = t, b, c, \ldots$ with $\tau_q \equiv 4 m_q^2/m_h^2$ and the
kinematic function from doing the 1-loop integration is:

$$ A_{1/2}(\tau) = \tau + \tau (1 - \tau) f(\tau) \tag{10.146} $$

where

$$ f(\tau) = \begin{cases} \left[ \sin^{-1}\!\left( 1/\sqrt{\tau} \right) \right]^2, & (\text{for } \tau \geq 1), \\ -\dfrac{1}{4} \left[ \ln\!\left( \dfrac{1 + \sqrt{1 - \tau}}{1 - \sqrt{1 - \tau}} \right) - i\pi \right]^2, & (\text{for } \tau \leq 1). \end{cases} \tag{10.147} $$

In the limit of small $\tau$, the function $A_{1/2}(\tau)$ is proportional to $\tau$ and therefore to
$m_q^2/m_h^2$, so that to a very good approximation all quark contributions except for
the top quark can indeed be neglected, consistent with the expectation explained in
the previous paragraph. For large $\tau$, appropriate for the top-quark contribution, the
function $A_{1/2}(\tau)$ approaches a constant, with $A_{1/2}(\tau) = \frac{2}{3} + \frac{7}{45\tau} + \cdots$.

The partial width into gg can therefore be approximated as

$$ \Gamma(h \to gg) = \frac{|\mathcal{M}|^2}{32\pi m_h} = \frac{m_h^3}{32\pi v^2} \left( \frac{\alpha_s}{\pi} \right)^2 \left| A_{1/2}(\tau_t) \right|^2 \tag{10.148} $$

The first equality can be obtained from (6.24)–(6.25). Keep in mind that there is a
factor of 1/2 from the indistinguishability of the gluons; otherwise each kinematic
configuration of gluons would be double counted when the final state phase space is
integrated over. Using $\alpha_s = 0.118$, $m_t = 173$ GeV, $m_h = 125$ GeV, and $v = 246$ GeV,
one finds $\Gamma(h \to gg) = 0.214$ MeV. This is in contrast to a more complete and
state-of-the-art computation, which instead gives $\Gamma(h \to gg + X) = 0.349$ MeV, where
$X$ represents anything (including nothing). The reason for the discrepancy is that
higher loop contributions and the radiation of additional soft gluons enhance the
decay partial width.
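The leading-order number quoted above follows directly from (10.146)–(10.148) with the stated inputs. The sketch below keeps only the top-quark loop, so it needs just the $\tau \geq 1$ branch of $f(\tau)$. The function names are illustrative and not taken from any existing package.

```python
import math


def f_tau(tau):
    """Loop function f(tau) of (10.147), restricted here to the branch tau >= 1."""
    if tau < 1.0:
        raise ValueError("this sketch only implements the tau >= 1 branch")
    return math.asin(1.0 / math.sqrt(tau)) ** 2


def a_half(tau):
    """Spin-1/2 loop function A_{1/2}(tau) of (10.146)."""
    return tau + tau * (1.0 - tau) * f_tau(tau)


def gamma_h_to_gg(m_h, m_t, v, alpha_s):
    """Leading-order partial width (10.148) in GeV, top-quark loop only."""
    tau_t = 4.0 * m_t ** 2 / m_h ** 2
    prefactor = m_h ** 3 / (32.0 * math.pi * v ** 2)
    return prefactor * (alpha_s / math.pi) ** 2 * a_half(tau_t) ** 2


gamma_gg_gev = gamma_h_to_gg(m_h=125.0, m_t=173.0, v=246.0, alpha_s=0.118)
print(f"Gamma(h -> gg) = {gamma_gg_gev * 1e3:.3f} MeV")
print(f"A_1/2 for a very heavy quark = {a_half(1.0e4):.4f}  (limit 2/3)")
```

The finite top mass raises the width by about 6% relative to the infinite-mass limit $A_{1/2} \to 2/3$, consistent with the expansion quoted after (10.147).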
The decay h → γ γ also is absent at tree level, but does occur due to one-loop
graphs where any charged particle goes around the loop. This again includes notably
the top quark, but now the largest contribution is due to the W boson. The two most
important Feynman diagrams are:

The resulting partial decay width is:

$$ \Gamma(h \to \gamma\gamma) = \frac{m_h^3}{64\pi v^2} \left( \frac{\alpha}{\pi} \right)^2 \left| A_1(\tau_W) + \sum_f N_c^f Q_f^2 A_{1/2}(\tau_f) \right|^2 \tag{10.149} $$

where the sum is over $f = t, b, c, s, u, d$ and $\tau, \mu, e$, with $N_c^f = 3$ for quarks and
$N_c^f = 1$ for leptons, with $Q_{t,c,u} = 2/3$ and $Q_{b,s,d} = -1/3$ and $Q_{\tau,\mu,e} = -1$, with
$\tau_W = 4 m_W^2/m_h^2$ and $\tau_f = 4 m_f^2/m_h^2$, and

$$ A_1(\tau) = 3\tau (\tau/2 - 1) f(\tau) - 3\tau/2 - 1, \tag{10.150} $$

where f (τ ) is the same function as appears in A1/2 (τ ), and was given already in
(10.147). Although the resulting branching ratio to two photons is much smaller
(about 2.3 × 10−3 in the Standard Model with m h = 125 GeV, after taking into
account higher-order corrections), it is still important because the corresponding
backgrounds at colliders are also small.
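To see how the W and top loops combine in (10.149), the width can be evaluated at leading order keeping only those two contributions. In this sketch the W mass (80.379 GeV) and the low-energy value $\alpha = 1/137.036$ are assumed inputs, the light fermions are dropped, and higher-order corrections are ignored, so the result is only indicative.

```python
import math


def f_tau(tau):
    """Loop function f(tau) of (10.147); complex-valued below threshold (tau < 1)."""
    if tau >= 1.0:
        return complex(math.asin(1.0 / math.sqrt(tau)) ** 2)
    root = math.sqrt(1.0 - tau)
    return -0.25 * (math.log((1.0 + root) / (1.0 - root)) - 1j * math.pi) ** 2


def a_half(tau):
    """Spin-1/2 loop function (10.146)."""
    return tau + tau * (1.0 - tau) * f_tau(tau)


def a_one(tau):
    """Spin-1 (W boson) loop function (10.150)."""
    return 3.0 * tau * (tau / 2.0 - 1.0) * f_tau(tau) - 3.0 * tau / 2.0 - 1.0


M_H, M_T, V = 125.0, 173.0, 246.0   # GeV, values used in the text
M_W = 80.379                        # GeV, ASSUMPTION (not restated in this section)
ALPHA = 1.0 / 137.036               # ASSUMPTION: low-energy fine-structure constant

w_term = a_one(4.0 * M_W ** 2 / M_H ** 2)
top_term = 3.0 * (2.0 / 3.0) ** 2 * a_half(4.0 * M_T ** 2 / M_H ** 2)
amplitude = w_term + top_term

gamma_aa_gev = M_H ** 3 / (64.0 * math.pi * V ** 2) * (ALPHA / math.pi) ** 2 * abs(amplitude) ** 2
print(f"W loop = {w_term.real:.3f}, top loop = {top_term.real:.3f}")
print(f"Gamma(h -> gamma gamma) = {gamma_aa_gev * 1e6:.2f} keV")
```

The two loops enter with opposite signs, so the top quark partially cancels the dominant W contribution. Under these assumptions the width comes out at roughly 9 keV.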

The decay h → Z γ is also mediated by similar one-loop graphs, but it will not
be reviewed here because it turns out to be quite small and not as useful, once the
corresponding background rates are taken into account. It has still not been observed,
but even this non-observation can constrain some non-minimal models.
Also significant are decays through two massive vector bosons, $h \to W^+ W^-$ and
$h \to ZZ$, corresponding to the Feynman rules found in Sect. 10.2 (below (10.71)).
In fact, these decays are important even if (as turns out to be true in the
real world) m h < 2m W and m h < 2m Z , despite the on-shell decays being forbidden
so that one of the vector bosons must be virtual. One usually indicates this by writing

h → W W (∗) , h → Z Z (∗) , (10.151)

where the “(∗)” means that the corresponding particle may be off-shell, depending
on the kinematics. If one of the vector bosons is off-shell, then the decay can be
thought of as really three-body. This means that the decay is really $h \to W^\pm f \bar{f}'$ or
$h \to Z f \bar{f}$, where $f \bar{f}'$ is any final state that couples to the off-shell $W^\pm$, and $f \bar{f}$ is
any fermion-antifermion final state that couples to the off-shell $Z$. For example, for
leptonic final states of the off-shell vector boson:

Since the competing tree-level two-body decays $h \to f \bar{f}$ are suppressed by small
Yukawa couplings, these three-body decays are competitive, despite their kinematic
suppression. If we take $m_h$ to be a free variable, as it was before the discovery, then
the closer the vector bosons are to being on-shell, the larger the amplitude will be.
The decays $h \to WW^{(*)}$ become increasingly important for larger $m_h$, and would
actually dominate if $m_h \gtrsim 135$ GeV.
The results of a careful computation (using the program HDECAY by A. Djouadi,
J. Kalinowski, and M. Spira, Comput. Phys. Commun. 108, 56, (1998), hep-
ph/9704448) including all these effects are shown in the two graphs below, which
depict the state-of-the-art computations of the branching ratios and the total width
$\Gamma_{\text{tot}}$ for the Higgs boson, as a function of $m_h$ for the range allowed before the
discovery in 2012.

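[Figure, two panels, each plotted against the Higgs mass in GeV from 100 to 240. Upper panel: Higgs branching ratio on a logarithmic scale from $10^{-3}$ to $10^{0}$, with curves for the $b\bar{b}$, $\tau^+\tau^-$, $c\bar{c}$, $gg$, $W^+W^-$, $ZZ$, and $\gamma\gamma$ final states. Lower panel: Higgs total width in GeV on a logarithmic scale from $10^{-3}$ to $10^{0}$.]
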
The largest decay modes were predicted to be $b\bar{b}$ and/or $WW^{(*)}$ over the entire range
of m h , with the total width dramatically increasing when both W bosons can be on
shell. However, because it has very low backgrounds in colliders, the γ γ final state
was understood to be very important for the discovery of a light Higgs boson, despite
its tiny branching ratio. The Z Z (∗) final state is also important because it can lead to
low-background signals if both of the Z bosons decay to leptons. Note that the gg
final state is useless as a discovery mode because of huge QCD backgrounds to dijet
production, but it is important to keep track of for two reasons. First, its presence
reduces the branching ratios into the more useful final states. Second, it is related,
by crossing symmetry, to the largest production cross-section mode, as we discuss
next.

10.5.2 Higgs Boson Production at the LHC

At hadron colliders such as the LHC, the largest parton-level production process
for the Higgs boson is:

$$ gg \to h. \tag{10.152} $$

This process cannot occur at tree level, but it does occur due to the same one-loop
diagram mentioned above in Sect. 10.5.1 for the decay h → gg. The roles of the
initial state and the final state are simply exchanged, by crossing:

Although the amplitude is loop-suppressed, the large gluon PDFs make it
by far the most important production mode for a 125 GeV Higgs boson at the LHC,
with a cross-section exceeding that of the next largest, the $W$-boson fusion process
discussed below, by more than an order of magnitude. It was the process that figured
most prominently in the initial discovery.
At leading order, the production cross-section for $gg \to h$ is directly proportional
to the decay width for $h \to gg$, by crossing symmetry. To see this connection, we begin
with the general cross-section formula for two massless initial-state particles with
four-momenta $p_a$ and $p_b$ scattering to a single final state particle with four-momentum
$k = (E, \vec k)$, with $E = \sqrt{|\vec k|^2 + m_h^2}$ in the present case. From (4.175) and (4.176)
with $n = 1$, and $|v_a - v_b| = 2$ and $4E_aE_b = \hat s$ in the center of momentum frame,
we obtain:

$$d\hat\sigma = \frac{1}{2\hat s}\cdot\frac{d^3\vec k}{(2\pi)^3\,2E}\cdot\frac{1}{4}\cdot\frac{1}{64}\,|\mathcal{M}|^2\,(2\pi)^4\,\delta^{(4)}(p_a + p_b - k). \tag{10.153}$$

By crossing symmetry, $|\mathcal{M}|^2$ is the same spin-summed and color-summed squared
matrix element as in the decay calculation, but here it comes with the prefactor
$\frac{1}{4}\cdot\frac{1}{64}$, since in this case we need to average (instead of sum) over initial
state gluon spins (2 spins for each gluon) and color factors (8 gluons $g^{a=1\ldots 8}$). Since the
cross-section is invariant under boosts along the beam direction, we choose to work in the
center-of-momentum frame where $\vec p_a + \vec p_b = 0$, so that the delta function vanishes
except for $\vec k = 0$. For on-shell production of the Higgs boson, $E = \sqrt{\hat s} = m_h$.
Integrating (10.153) over the three-momentum $\vec k$ and collecting terms we find that

$$\hat\sigma = \frac{\pi}{256\, m_h^2}\,\delta(\hat s - m_h^2)\,|\mathcal{M}|^2. \tag{10.154}$$
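The step from (10.153) to (10.154) can be sketched as follows. This is a reconstruction under the center-of-momentum assumptions stated above; the intermediate lines are not taken from the text. The only nontrivial ingredient is the change of variable in the energy delta function, $\delta(\sqrt{\hat s} - m_h) = 2m_h\,\delta(\hat s - m_h^2)$.

```latex
\begin{align*}
\hat\sigma
  &= \frac{1}{2\hat s}\cdot\frac{1}{256}\,|\mathcal{M}|^2
     \int \frac{d^3\vec k}{(2\pi)^3\, 2E}\,(2\pi)^4\,
     \delta(\sqrt{\hat s} - E)\,\delta^{(3)}(\vec k) \\
  &= \frac{1}{2\hat s}\cdot\frac{1}{256}\,|\mathcal{M}|^2\,
     \frac{2\pi}{2 m_h}\,\delta(\sqrt{\hat s} - m_h) \\
  &= \frac{1}{2 m_h^2}\cdot\frac{1}{256}\,|\mathcal{M}|^2\,
     \frac{2\pi}{2 m_h}\cdot 2 m_h\,\delta(\hat s - m_h^2)
   = \frac{\pi}{256\, m_h^2}\,\delta(\hat s - m_h^2)\,|\mathcal{M}|^2 .
\end{align*}
```

In the last line $\hat s$ has been set to $m_h^2$ in the flux factor, which is allowed because of the delta function.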

Now, from (10.148) we know that $|\mathcal{M}|^2 = 32\pi m_h \Gamma(h \to gg)$, and so the cross-section of
$gg \to h$ can be obtained by knowing the partial decay width for $h \to gg$:

$$\hat\sigma(gg \to h) = \frac{\pi^2}{8 m_h}\,\Gamma(h \to gg)\,\delta(\hat s - m_h^2). \tag{10.155}$$

The $\delta$ function dependence of the cross-section corresponds to the fact that free,
on-shell asymptotic states cannot scatter 2-to-1 unless the four-momenta $p^\mu_{a,b}$ of the
first two particles are precisely arranged to construct the final momentum $k = p_a + p_b$
such that $k^2 = m_h^2$. Unlike 2-to-2 scattering, not just any sufficiently large incoming
momenta will do. In the center of mass frame this requires that $E = m_h$ and $\vec k = 0$.
More generally, there is no way to allow $AB \to C$ particle scattering unless
$m_A + m_B \le m_C$. However, if that is allowed, then $C \to AB$ decays are allowed
with the same amplitude, giving $C$ a decay width. The decay width means that there
is a finite spread of $k^2$ around $m_C^2$ (with the finite spread being determined by $\Gamma_C$)
such that $AB \to C$ is allowed. The finite spread is the Breit-Wigner width, which is
characterized by replacing the $\delta$-function with

$$\delta(\hat s - m_h^2) \to \frac{1}{\pi}\,\frac{m_h\Gamma_h}{(\hat s - m_h^2)^2 + m_h^2\Gamma_h^2}. \tag{10.156}$$

This is the correct physical and non-singular interpretation of the $\delta$-function in
(10.154), as one can show by treating the Higgs boson as a virtual intermediate
state with a Feynman propagator. In practice, if the width is much less than the mass,
$\Gamma_h \ll m_h$, which is certainly the case for the Standard Model Higgs boson at 125
GeV (for which the total width is $\Gamma_h = 4.2$ MeV, so $\Gamma_h/m_h \approx 3.4 \times 10^{-5}$), then
the $\delta$ function is a good approximation. In general, replacing the Breit-Wigner peak
with a $\delta$-function is called the “narrow width approximation.”
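A quick numerical check of the replacement (10.156) is sketched here in Python. It uses the exact arctangent antiderivative of the Breit-Wigner form, so no numerical quadrature is involved. The function names and the choice of a window of 10 widths are illustrative assumptions, not part of the text.

```python
import math

def breit_wigner(s_hat, mass, width):
    """Right-hand side of (10.156) as a function of s_hat (GeV^2)."""
    return (mass * width / math.pi) / ((s_hat - mass**2) ** 2 + (mass * width) ** 2)

def breit_wigner_integral(s_lo, s_hi, mass, width):
    """Exact integral of breit_wigner over [s_lo, s_hi] via its arctangent antiderivative."""
    def antiderivative(s_hat):
        return math.atan((s_hat - mass**2) / (mass * width)) / math.pi
    return antiderivative(s_hi) - antiderivative(s_lo)

m_h = 125.0       # GeV, Higgs mass used in the text
gamma_h = 4.2e-3  # GeV, total width quoted in the text

# Integral over the physical range s_hat >= 0 (the upper limit stands in for infinity).
total = breit_wigner_integral(0.0, 1.0e12, m_h, gamma_h)

# Fraction of the peak lying within 10 widths of m_h in sqrt(s_hat).
window = breit_wigner_integral(
    (m_h - 10 * gamma_h) ** 2, (m_h + 10 * gamma_h) ** 2, m_h, gamma_h
)

print(f"total = {total:.6f}, within 10 widths = {window:.4f}")
```

The total differs from 1 only at the level of $\Gamma_h/m_h$, which is the sense in which the Breit-Wigner form acts like a normalized $\delta$-function. Nearly all of the weight sits within a few tens of MeV of $m_h$, a range over which parton luminosities change very little.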
In our case the narrow width approximation is useful and appropriate because
we need to integrate over gluon-gluon luminosity in the very close neighborhood
of $\sqrt{\hat s_{gg}} = m_h$, and the integrand hardly varies over the narrow width of the Higgs
Breit-Wigner function. Writing $g(x)$ for the PDF of gluons at momentum fraction $x$,
the total cross section in $pp$ collisions is

$$\sigma(pp \to h) = \int_0^1 dx_a \int_0^1 dx_b\; g(x_a)\, g(x_b)\, \hat\sigma(gg \to h). \tag{10.157}$$

Recall that the squared partonic center of mass energy is $\hat s = x_a x_b s$. It is convenient
to define a different set of variables,

$$\tau = x_a x_b \quad\text{and}\quad x = x_a. \tag{10.158}$$

The Jacobian of this variable change is $J = 1/x$, so that

$$\sigma(pp \to h) = \int_0^1 d\tau \int_\tau^1 \frac{dx}{x}\; g(x)\, g(\tau/x)\, \hat\sigma(\tau s). \tag{10.159}$$

With these integration variables, the parton-level cross-section $\hat\sigma$ is a function of
$\hat s = \tau s$ only, and not $x$, so we can perform the $x$ integral over the gluon PDFs
separately. It is convenient to define a function of $\tau$ only, called the gluon-gluon
luminosity function:

$$\frac{dL(\tau)}{d\tau} = \int_\tau^1 \frac{dx}{x}\; g(x)\, g(\tau/x). \tag{10.160}$$

This leaves us with

$$\sigma(pp \to h) = \int_0^1 d\tau\; \frac{dL(\tau)}{d\tau}\, \hat\sigma(\tau s). \tag{10.161}$$

For a given fixed value of $m_h$, the total leading-order cross-section for $pp \to h$ due
to the $gg \to h$ parton-level process is therefore a simple function of $s$ and $m_h$, and
can be obtained by using the $\delta$-function in $\hat\sigma$ to integrate over $\tau$, with the result:

$$\sigma(pp \to h) = F_{gg}(\tau_h)\,\sigma_0 \tag{10.162}$$

where

$$\sigma_0 = \frac{\pi^2}{8 m_h^3}\,\Gamma(h \to gg), \tag{10.163}$$

and

$$F_{gg}(\tau_h) = \tau_h \int_{\tau_h}^1 \frac{dx}{x}\; g(x)\, g(\tau_h/x) \tag{10.164}$$

is simply a dimensionless number dependent only on the ratio

$$\tau_h = m_h^2/s, \tag{10.165}$$

which in turn depends on the Higgs boson mass and the proton beam energy.
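The integral in (10.164) is easy to evaluate numerically once a gluon density is supplied. The Python sketch below integrates in $\ln x$ with the midpoint rule. The function `toy_gluon_pdf` is a hypothetical, purely illustrative shape and is not a fitted PDF set, so the numbers it produces should not be compared with results from real parton distributions.

```python
import math

def gluon_luminosity_factor(gluon_pdf, tau, n_steps=20000):
    """F_gg(tau) = tau * integral_tau^1 (dx/x) g(x) g(tau/x), as in (10.164).

    The substitution y = ln(x) turns dx/x into dy; the midpoint rule is used in y.
    """
    y_lo = math.log(tau)
    step = -y_lo / n_steps
    total = 0.0
    for i in range(n_steps):
        x = math.exp(y_lo + (i + 0.5) * step)
        total += gluon_pdf(x) * gluon_pdf(tau / x)
    return tau * total * step

def toy_gluon_pdf(x):
    """Hypothetical gluon density used only for illustration (NOT a real PDF fit)."""
    return 3.0 * (1.0 - x) ** 5 / x

def tau_h(m_h, sqrt_s):
    """The ratio (10.165), with both arguments in the same units."""
    return (m_h / sqrt_s) ** 2

for sqrt_s_gev in (7000.0, 8000.0, 13000.0):
    tau = tau_h(125.0, sqrt_s_gev)
    print(sqrt_s_gev, tau, gluon_luminosity_factor(toy_gluon_pdf, tau))
```

A convenient sanity check is the case $g(x) = 1/x$, for which the integrand is constant and (10.164) gives $F_{gg}(\tau) = \ln(1/\tau)$ exactly.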

The numerical value of $\sigma_0$, using the leading-order width $\Gamma(h \to gg) = 0.214$
MeV obtained above, is:

$$\sigma_0 = \frac{\pi^2}{8 m_h^3}\,\Gamma(h \to gg) \approx 53\ \text{fb}. \tag{10.166}$$

Using the MSTW2008NLO parton distribution functions for gluons gives $F_{gg}(\tau_h) =$
99, 127, and 292 for $\sqrt{s} =$ 7, 8, and 13 TeV, respectively, with $m_h = 125$ GeV (and
the factorization scale in the gluon PDFs set equal to $m_h$). Thus, we get leading-order
estimates of $\sigma(pp \to h) =$ 5.2 pb, 6.7 pb, and 15.4 pb, for $\sqrt{s} =$ 7, 8, and 13
TeV, respectively. These simple estimates are considerably smaller than the results
of state-of-the-art computations of $\sigma(pp \to h + X)$ coming from the CERN Higgs
cross-section working group, which gives approximately 17 pb, 21 pb, and 49 pb,
respectively. This increase of more than a factor of 3 compared to our results above
is because of the large effects of higher-order loop corrections, the emission of
additional soft gluons, and a more sophisticated use of PDFs, all of which we have not
included in our simple analysis. This demonstrates the great importance of the heroic
efforts that have been made to calculate such higher order effects. In addition, more
sophisticated calculations provide crucial information about the kinematics
of Higgs boson events, including the distribution of the transverse momenta of
the Higgs boson, and the numbers and momenta of the additional jets that may be
produced in the event.
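The arithmetic behind these estimates can be reproduced in a few lines of Python. The sketch below evaluates (10.163) with the inputs quoted in the text and converts from natural units using $(\hbar c)^2 \approx 0.3894$ GeV$^2$ mb. The variable names, and the use of the quoted working-group values as a reference, are choices made here for illustration.

```python
import math

# (hbar c)^2 = 0.3893794 GeV^2 mb, and 1 mb = 1e12 fb
GEV_MINUS2_TO_FB = 0.3893794e12

def sigma0_fb(m_h_gev, width_gg_gev):
    """sigma_0 of (10.163), converted from GeV^-2 to femtobarns."""
    return math.pi**2 * width_gg_gev / (8.0 * m_h_gev**3) * GEV_MINUS2_TO_FB

m_h = 125.0          # GeV
width_gg = 0.214e-3  # GeV, leading-order width quoted in the text

# F_gg values quoted in the text, keyed by sqrt(s) in TeV
f_gg = {7: 99.0, 8: 127.0, 13: 292.0}
# Approximate state-of-the-art cross-sections quoted in the text, in pb
reference_pb = {7: 17.0, 8: 21.0, 13: 49.0}

sigma0 = sigma0_fb(m_h, width_gg)
leading_order_pb = {e: f * sigma0 / 1000.0 for e, f in f_gg.items()}
ratio = {e: reference_pb[e] / leading_order_pb[e] for e in f_gg}

print(f"sigma_0 = {sigma0:.1f} fb")
for e in sorted(f_gg):
    print(f"sqrt(s) = {e} TeV: LO = {leading_order_pb[e]:.1f} pb, ratio = {ratio[e]:.2f}")
```

The ratios come out slightly above 3 at all three energies, consistent with the statement that higher-order effects increase the leading-order estimate by more than a factor of 3.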
Other parton-level processes that produce the Higgs boson at the LHC have smaller
cross-sections, but are important because they involve additional final state particles
whose presence can be used to control backgrounds. Furthermore, these processes
involve different couplings, allowing tests of the proposition that the new scalar
particle is really behaving as expected for the Standard Model Higgs. First, there
are the weak vector boson fusion modes, which refer to the parton-level processes
$qq \to qqh$, $q\bar q \to q\bar q h$, and $\bar q\bar q \to \bar q\bar q h$ through Feynman diagrams like this:

Here, the quark jets in the final state are usually found at small angles with respect
to the beam. Tagging events with these forward jets is a way to reduce backgrounds.
Another type of channel features Higgs bosons that are radiated off of weak vector
bosons:

$$q\bar q \to Zh, \tag{10.167}$$
$$q\bar q' \to W^\pm h. \tag{10.168}$$

These channels provide useful modes for confirmation and study, because the presence
of the extra weak boson reduces backgrounds. The process $q\bar q \to Zh$ occurs
due to this Feynman diagram:

while the process $q\bar q' \to W^\pm h$ is due to parton-level Feynman diagrams like the
ones below:

as well as others related by $d \to s$ and/or $u \to c$.

Another important process, long anticipated but only very recently observed, is
$pp \to t\bar t h$, which is due to Feynman diagrams including the one below:

This is particularly useful as a direct test of the Higgs boson interaction with the top
quark.

10.5.3 Discovery Through γγ and 4ℓ Final States

The $pp$ Large Hadron Collider experiments at CERN, ATLAS and CMS, were both
designed to be able to cover the entire range of Higgs boson mass suggested by
the indirect constraints on it. For much of the allowed mass region, the primary
target for discovery was the decay to $\gamma\gamma$, manifested as a narrow mass peak of two
photons centered on $m_h = m_{\gamma\gamma}^{\rm signal}$. The main background, largely created by
$q\bar q \to \gamma\gamma$, consists of a diffuse spectrum of $m_{\gamma\gamma}^{\rm bkgd}$. There are also important
contributions to the background from $gg \to \gamma\gamma$ and from fake photons.

Fig. 10.1 Top panel: The diphoton invariant mass spectrum from ATLAS data (ATLAS-CONF-2022-094).
The upper red line is signal (from $h \to \gamma\gamma$) plus background (mostly $q\bar q \to \gamma\gamma$). The
dashed blue line is a fit to the background-only hypothesis. The black curves on the bottom show the
signal expectation (black solid curve) and the background subtracted data (data circles with 68%
CL vertical uncertainty bars). Bottom panel: The four-lepton invariant mass spectrum from CMS
is shown as the black data points (Sirunyan et al. (CMS), EPJ C81, 488 (2021) and reprinted in RPP
2022). The prediction from backgrounds other than the Standard Model Higgs boson is shown as
the blue histogram. The red curve includes the backgrounds plus the prediction of a Higgs boson
with invariant mass peak of $m_h = m_{4\ell} = 125$ GeV

Indeed, the diphoton signal is half of how the Higgs boson discovery was established:
a peak of $\gamma\gamma$ events that ultimately could be identified with $m_h$. The top
panel of Fig. 10.1 shows an ATLAS collaboration plot of this $\gamma\gamma$ peak after data
collection at $\sqrt{s} = 7$ TeV and 8 TeV. A similar result was obtained by CMS.
The other half of the Higgs discovery came about via $h \to ZZ^{(*)} \to 4\ell$ decays
of the Higgs boson, where the $(*)$ indicates an off-shell $Z$ boson, and $4\ell$ indicates
any one of the following permutations: $e^+e^-e^+e^-$, $\mu^+\mu^-\mu^+\mu^-$ or $e^+e^-\mu^+\mu^-$.
This branching fraction is very small. From (6.54) and (10.43) above, we have:

$$\mathrm{BR}(h \to ZZ^{(*)} \to 4\ell) = \mathrm{BR}(h \to ZZ^{(*)})\,[\mathrm{BR}(Z \to e^+e^-) + \mathrm{BR}(Z \to \mu^+\mu^-)]^2 \approx 0.0264\,(0.067316)^2 \approx 0.000120.$$
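To get a feeling for how rare this channel is, the short Python sketch below repeats the branching-ratio arithmetic and converts it into a raw event count. The integrated luminosity of 20 fb$^{-1}$ is a hypothetical round number chosen only for illustration, the cross-section is the approximate 8 TeV value quoted earlier, and detector acceptance and efficiency are ignored.

```python
BR_H_TO_ZZ = 0.0264               # BR(h -> ZZ(*)) used in the text
BR_Z_TO_EE_PLUS_MUMU = 0.067316   # BR(Z -> e+e-) + BR(Z -> mu+mu-) used in the text

def br_h_to_4l():
    """Branching ratio for h -> ZZ(*) -> 4 leptons (electrons and muons only)."""
    return BR_H_TO_ZZ * BR_Z_TO_EE_PLUS_MUMU**2

def produced_4l_events(sigma_pb, luminosity_inv_fb):
    """Number of h -> 4l decays produced, before any acceptance or efficiency."""
    sigma_fb = sigma_pb * 1000.0
    return sigma_fb * luminosity_inv_fb * br_h_to_4l()

sigma_8tev_pb = 21.0          # approximate total cross-section quoted in the text
hypothetical_lumi = 20.0      # fb^-1, illustrative assumption

print(f"BR(h -> 4l) = {br_h_to_4l():.3e}")
print(f"produced 4l decays = {produced_4l_events(sigma_8tev_pb, hypothetical_lumi):.0f}")
```

Under these assumptions only a few tens of such decays are produced out of several hundred thousand Higgs bosons, which is why the smallness of the four-lepton background matters so much.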

Fortunately, the backgrounds for four leptons near an invariant mass of $m_{4\ell} = m_h$
are also very small. This enables a signal to be discerned above the background. This
is illustrated in the bottom panel of Fig. 10.1, which shows a plot of the invariant
mass distribution of four lepton final states recorded by the CMS collaboration in the
$\sqrt{s} = 7$ TeV and 8 TeV runs. Again, a similar result was obtained by ATLAS. The
consistency of the diphoton and $ZZ^{(*)} \to 4\ell$ invariant mass peaks in the ATLAS
and CMS experiments then allowed a definitive discovery.
The announced discovery of the Higgs boson in July 2012 has given rise to a new
era in particle physics phenomenology. The present and future research efforts of the
LHC will be centered around careful measurements of the Higgs boson, as well as
the search for any new particles that might be associated with it.
Problems

1. Using the Standard Model lagrangian

(a) Write down the Feynman rule for $Z$ boson interactions with fermion $f$.
(b) Assuming $f$ massless, compute the partial widths for the decays $Z \to f\bar f$
in terms of $g_V^f$, $g_A^f$, and $m_Z$.
(c) Plug in the numbers for every possible fermion that participates in $Z \to f\bar f$
(i.e., those with $m_f < m_Z/2$, but in the calculation continue to assume $m_f = 0$).
Make a table of those partial widths and the branching ratios for each
$Z \to f\bar f$, and compare your branching fraction numbers with those of the
RPP that are quoted as $Z$ decays to $\ell^+\ell^-$, invisible (neutrinos), and hadrons
(quarks). Hint: To do this problem assume the following numerical values for
$m_Z$, $\alpha$ and $\sin^2\theta_W$:

$$m_Z = 91.2\ \text{GeV}, \qquad \alpha = \frac{1}{129}, \qquad \sin^2\theta_W = 0.23.$$

2. For problems 2–5 consider the process of top-quark decay $t \to bW^+$. This is a
weak interaction process, which occurs due to the following term in the interaction
Lagrangian:

$$\mathcal{L} = -\frac{g}{\sqrt{2}}\, W^-_\mu\, (\bar b \gamma^\mu P_L t) + \text{c.c.} \tag{10.169}$$

where $t$, $b$ are the Dirac spinor fields for the top quark and bottom quark. (Ignore
the fact that the bottom quark appearing here is not quite a mass eigenstate; this is a
very small effect.) The top quark decays very quickly, as you will discover below,
so it does not form complicated bound states like the lighter quarks. Therefore, one
can just use the simple charged-current weak-interaction Feynman rule implied
by the above interaction Lagrangian.
Let the 4-momentum of the top quark be $p^\mu$, and that of the $W^+$ boson be $k_1^\mu$,
and that of the bottom quark be $k_2^\mu$. Find the kinematic quantities:

$$p^2;\quad k_1^2;\quad k_2^2;\quad p\cdot k_1;\quad p\cdot k_2;\quad k_1\cdot k_2 \tag{10.170}$$

in terms of the symbols $m_t$ and $m_W$. Treat the bottom quark as massless. (This is a
good approximation, since $m_t = 173.1 \pm 1.3$ GeV, $m_W = 80.399 \pm 0.023$ GeV,
and $m_b \approx 5$ GeV. Note that kinematic quantities generally involve the squares of
ratios of masses.)
3. Draw the Feynman diagram and write down the reduced matrix element for top
quark decay. Take the complex square of the reduced matrix element, and sum
over the final state polarizations of the $W^+$ boson. Then average over the initial
$t$ spin, and sum over the final spin of the $b$. (The quarks have 3 colors, but the
color of the final state bottom quark is constrained to be the same as that of the
initial state top quark. Since one should average over the initial state quark color,
the net color factor is just 1.)
4. Compute the decay rate of the top quark. You should find a result of the form:

$$\Gamma(t \to bW^+) = \frac{g^2 m_t^3}{N_1 \pi M_W^2}\left(1 + N_2\frac{M_W^2}{m_t^2}\right)\left(1 - \frac{M_W^2}{m_t^2}\right)^{N_3} \tag{10.171}$$

where $N_1$, $N_2$, and $N_3$ are positive integers that you will find.
5. Find the numerical value of the decay width in GeV, and the corresponding top
quark lifetime in seconds. Use $g = 0.65$, $m_t = 173.1$ GeV, $m_W = 80.4$ GeV.
What is the dimensionless ratio $\Gamma(t \to bW^+)/m_t$?
6. In a general Yang-Mills theory (including fermions and scalars), the gauge-coupling
beta functions at one-loop order can always be written as

$$\mu\frac{d}{d\mu}\, g_i = \beta(g_i) = \frac{1}{16\pi^2}\, B_i\, g_i^3 \tag{10.172}$$

for each Lie algebra component of the gauge symmetry.

(a) Define $\alpha_i = g_i^2/4\pi$, and show that the quantities $\alpha_i^{-1}$ run linearly with
$\ln(\mu/\mu_0)$, in the one-loop running approximation.
(b) In the full Standard Model, there are three gauge couplings $g_i$ with $i = 1, 2, 3$
for the three components of the unbroken gauge symmetry,
$SU(3)_c \times SU(2)_L \times U(1)_Y$, with beta function coefficients:

$$B_1 = 41/10, \quad B_2 = -19/6, \quad B_3 = -7. \quad \text{(Standard Model)} \tag{10.173}$$

Here, $g_2 = g$ and $g_1 = \sqrt{5/3}\, g'$, where $g$, $g'$ are the $SU(2)_L$ and $U(1)_Y$
couplings in the normalization of Sect. 11.1. The choice of normalization for $g_1$
is called the GUT (Grand Unified Theory) normalization. As boundary conditions,
take $\alpha_3(M_Z) = 0.1185$, $g_2(M_Z) = 0.652$, and $g_1(M_Z) = 0.461$. Make
a graph of $\alpha_i^{-1}(\mu)$ as a function of $\log_{10}(\mu/1\ \text{GeV})$, for $M_Z \le \mu \le 10^{19}$ GeV,
using the one-loop running approximation. Make a note of the numerical values
of $\alpha_i(\mu)$ at $\mu = 1000$ GeV and $\mu = 5000$ GeV.
(c) In the Minimal Supersymmetric Standard Model (MSSM), the same three
gauge couplings appear, but they have different one-loop running coefficients:

$$B_1 = 33/5, \quad B_2 = 1, \quad B_3 = -3, \quad \text{(MSSM)} \tag{10.174}$$

due to the fact that the MSSM contains more fields appearing in the loop
diagrams. Let us assume that the new particles in the MSSM all have the same
mass $\mu_{\rm SUSY}$. (This is probably not realistic, but captures the main point of
the following.) Then, assuming supersymmetry is correct, the renormalization
group running should use the MSSM coefficients for $\mu > \mu_{\rm SUSY}$. Starting
with the values you found for $\alpha_i(\mu)$ at $\mu = \mu_{\rm SUSY} = 1000$ GeV and
$\mu = \mu_{\rm SUSY} = 5000$ GeV in part (b), make graphs of $\alpha_i^{-1}(\mu)$ for
$\mu_{\rm SUSY} \le \mu \le 10^{19}$ GeV in the MSSM, as a function of $\log_{10}(\mu/\text{GeV})$. You should
observe that the three running gauge couplings become approximately equal at a single
renormalization group scale $\mu_{\rm GUT}$. (To really do this right, one ought to include
at least 2-loop RG running, as well as small “threshold” corrections when
matching the MSSM onto the Standard Model. But those effects make a rather
small difference.) What do you estimate for $\mu_{\rm GUT}$?
This famous unification of gauge couplings is held by many to be an indirect
piece of evidence (but far from compelling) in favor of supersymmetry, since
it suggests that at very high energies the gauge interactions themselves unify
into a simpler Grand Unified Theory (GUT), a larger Yang-Mills theory with a
single Lie algebra $SU(5)$ or $SO(10)$ or $E_6$. The unification of gauge couplings
can also occur in some versions of superstring theory.
11 Neutral Meson Mixing

11.1 Neutral Kaons, D mesons and B mesons

In this chapter we will investigate the experimental effects of neutral meson mixing.
An excellent laboratory in which to do so is the neutral kaon sector. Kaons are
particles that are bound states of a single anti-strange quark along with an up quark
(charged kaons, $K^+$) or down quark (neutral kaons, $K^0$). Such composite particles
satisfy the requirements of QCD, which demands that bound states are color singlets.
There are two reasons to explore meson mixing among the neutral kaons in
some detail. First, the subject is endowed with its own complexity that is beyond the
discussion we have encountered so far. This complexity is the quantum mechanical
mixing over time of two “equivalent states” from the point of view of the charges
that they share. These oscillations show up in many guises in particle physics, but
especially in neutral meson mixings and neutrino flavor mixing. The methods discussed
here will have transferability to understanding those other systems. In the
case of neutral kaons, the $K^0 = d\bar s$ state can mix with its anti-particle $\bar K^0 = \bar d s$.
The Hamiltonian eigenstates are superpositions of these states with distinct masses
and lifetimes, and are called $K_L$ and $K_S$ for “K long” (lifetime about $5.1 \times 10^{-8}$
seconds) and “K short” (lifetime about $8.95 \times 10^{-11}$ seconds), respectively.
The second reason to discuss neutral meson mixing is that historically it was the
first place in which CP violation was observed in particle physics experiments. Here,
C is the charge conjugation operation that turns each particle into its antiparticle,
and P is the parity (or space inversion) operation, each with eigenvalues $\pm 1$. The
combination of these operations, CP, is a symmetry of the strong interactions but
not the weak interactions. This is because there is a complex phase in the weak
interaction Hamiltonian that cannot be removed by a field redefinition. As we will
see below, the discovery of CP violation came about by recognizing that the $K_L$
eigenstate could decay into two different final states with different CP eigenvalues.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_11

This was unexpected and could only be explained if there is a CP violating interaction
allowed among the constituent quarks. We know from our earlier discussion of the
CKM matrix that governs W boson interaction among the quarks that with three (or
more) families not all phases can be absorbed in the fields and one non-zero phase
possibility is left over. Thus, a mechanism for introducing CP violation is possible,
which predicted the necessity of a third generation of quarks (t, b) even before they
were found.
Neutral kaon mixing formalism has much in common with the formalisms of
neutral D-meson mixing and neutral B-meson mixing. The latter two have detailed
descriptions in review essays within the RPP (“$D^0 - \bar D^0$ Mixing” and “$B^0 - \bar B^0$
Mixing”). The $D^0$ and $\bar D^0$ mesons are made from $(c\bar u)$ and $(\bar c u)$ quark bound states.
The $B_d^0$ and $\bar B_d^0$ mesons are made from $(\bar b d)$ and $(b\bar d)$ quark bound states, and likewise
$B_s^0$ and $\bar B_s^0$ are made from $(\bar b s)$ and $(b\bar s)$. All of these have oscillatory mixing behavior
similar to neutral kaon mixing. The reader is encouraged to consult these review
essays, whose formalism and language will be straightforward to understand after
reading this chapter.
Our discussion in this chapter is centered mainly on the mixing phenomenon itself
and not on diagrammatical calculations of $K^0 - \bar K^0$ mixing. For the most part we
assume the mixing and investigate the consequences for time evolution and experimental
measurement. The reason for this emphasis is that the calculations are loop-order
computations which we have not emphasized within this introductory book.
Only when discussing the $K_L - K_S$ mass difference do we invoke the necessary
one-loop diagrams that directly account for the mass difference. In that case we aim
to show that the extraordinarily tiny mass difference that is measured indeed arises
from a sensible calculation of the theory.

11.2 Neutral Kaon Mixing

The neutral kaon states

$$K^0 = (d\bar s), \quad\text{and}\quad \bar K^0 = (s\bar d) \tag{11.1}$$

are eigenstates of “strangeness”, meaning they have a well-defined net number of
strange quarks within them. $K^0$ has strangeness $s = +1$ and $\bar K^0$ has strangeness
$s = -1$. The importance of identifying the eigenstates of strangeness is that these
are the kaons created by the strong, weak, or electromagnetic interactions of the
Standard Model, which one should think of as producing either strange quarks or
anti-strange quarks, which then hadronize into the mesons.

Fig. 11.1 Leading order $K^0 - \bar K^0$ mixing diagrams. The mixing violates strangeness ($\Delta S = 2$)

Suppose that a $K^0$ state is produced at time $t = 0$,

$$|\psi(t = 0)\rangle = |K^0\rangle, \tag{11.2}$$

and let us imagine for a moment that the $K^0$ state has no mixing with any other state.
In this case, quantum mechanics says that the wave function evolves over time as

$$|\psi(t)\rangle = |K^0\rangle\, e^{-iMt}\, e^{-\Gamma t/2}, \tag{11.3}$$

where $M$ is the mass of the particle and $\Gamma$ is the total width of the particle, which
gives rise to the exponential decay of the wave function in time due to this decay
probability.
We can rewrite (11.3) in a more standard quantum mechanical framework by defining
an effective Hamiltonian

$$H = M - \frac{i}{2}\Gamma, \tag{11.4}$$

and applying this to the time-dependent Schrödinger equation

$$H|\psi(t)\rangle = i\frac{\partial}{\partial t}|\psi(t)\rangle. \tag{11.5}$$

Upon applying the boundary condition that $|\psi(t = 0)\rangle = |K^0\rangle$ at $t = 0$ one sees that
(11.3) is the solution of (11.5).
However, in reality $K^0$ mixes with $\bar K^0$ and so (11.3) and (11.5) are too simplistic.
The mixing of $K^0$ and $\bar K^0$ arises from box diagrams in the weak interactions which
are shown in Fig. 11.1. Since $K^0$ and $\bar K^0$ are strangeness eigenstates, this mixing
violates strangeness by 2 units, $\Delta S = \pm 2$. We shall come back to the details of this
mixing contribution, but for now we need only recognize that the mixing does indeed
occur and it is small due to the suppressions caused by the mixing being a loop effect
and the presence of the $W$ boson within the loop, which is much heavier than the
kaon ($m_K^2/m_W^2 < 10^{-4}$).
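The Schrödinger equation now becomes, in the $|K^0\rangle$, $|\bar K^0\rangle$ basis,

$$H|\psi(t)\rangle = i\frac{\partial}{\partial t}|\psi(t)\rangle, \quad\text{where}\quad H = \begin{pmatrix} M_{11} - \frac{i}{2}\Gamma_{11} & M_{12} - \frac{i}{2}\Gamma_{12} \\ M_{21} - \frac{i}{2}\Gamma_{21} & M_{22} - \frac{i}{2}\Gamma_{22} \end{pmatrix}. \tag{11.6}$$

Note, the matrices $M$ and $\Gamma$ are necessarily Hermitian since they correspond to
observables. Thus, $M_{11}$, $M_{22}$, $\Gamma_{11}$, and $\Gamma_{22}$ are real, and $M_{21} = M_{12}^*$ and $\Gamma_{21} = \Gamma_{12}^*$.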
The effective Hamiltonian is further restricted due to a general property of local
quantum field theories, invariance under CPT (the product of charge conjugation,
parity, and time reversal operations). This can be shown to imply the exact relations
$M_{11} = M_{22}$, whose common value we will set to $M$, and $\Gamma_{11} = \Gamma_{22}$, whose common
value we will set to $\Gamma$. To emphasize the smallness of $M_{12}$ and $\Gamma_{12}$, we will write
them as $\mu$ and $\gamma$ respectively. The effective Hamiltonian is then

$$H = \begin{pmatrix} M - \frac{i}{2}\Gamma & \mu - \frac{i}{2}\gamma \\ \mu^* - \frac{i}{2}\gamma^* & M - \frac{i}{2}\Gamma \end{pmatrix}. \tag{11.7}$$

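Diagonalizing this effective Hamiltonian gives eigenvalues

$$\lambda_S = m_S - \frac{i}{2}\Gamma_S, \quad\text{and}\quad \lambda_L = m_L - \frac{i}{2}\Gamma_L, \tag{11.8}$$

where

$$m_S = M - \frac{1}{2}\Delta m, \qquad \Gamma_S = \Gamma + \frac{1}{2}\Delta\Gamma, \tag{11.9}$$
$$m_L = M + \frac{1}{2}\Delta m, \qquad \Gamma_L = \Gamma - \frac{1}{2}\Delta\Gamma, \tag{11.10}$$

where

$$\Delta m + \frac{i}{2}\Delta\Gamma = 2\left[(\mu - i\gamma/2)(\mu^* - i\gamma^*/2)\right]^{1/2}. \tag{11.11}$$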
2
Thus $\Delta m$ and $\Delta\Gamma$ are nonzero by virtue of $\mu, \gamma \neq 0$, and it turns out that, as defined
above, they are both positive. Thus, $m_L > m_S$ but $\Gamma_S > \Gamma_L$, which is to say that
even though the $K_L$ mass is larger than the $K_S$ mass, the $K_L$ lifetime ($\tau_L = 1/\Gamma_L$)
is longer than the $K_S$ lifetime ($\tau_S = 1/\Gamma_S$). As we will see later, the two states are
very nearly degenerate in mass (i.e., $\Delta m = m_L - m_S \ll m_L, m_S$),
but the difference in width is large ($\Gamma_S \gg \Gamma_L$). This is due to an interplay between
kinematics and the CP symmetry’s gatekeeping of what final states $K_L$ and $K_S$ are
easily allowed to decay into. The point is that, as we will see below, $K_L$ mainly decays
into three-pion states, which have a 3-body phase-space suppression compared to
the 2-body phase space available for $K_S$ decays. The $K_L$ decays are even further
kinematically suppressed by the fact that $m_K - 3m_\pi$ happens to be rather small.
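The eigenstates $K_L$ and $K_S$ that correspond to the eigenvalues $\lambda_L$ and $\lambda_S$ are

$$|K_L\rangle = \frac{1}{\sqrt{2 + 2|\epsilon|^2}}\left[(1 + \epsilon)|K^0\rangle - (1 - \epsilon)|\bar K^0\rangle\right] \quad\text{for } \lambda_L, \tag{11.12}$$
$$|K_S\rangle = \frac{1}{\sqrt{2 + 2|\epsilon|^2}}\left[(1 + \epsilon)|K^0\rangle + (1 - \epsilon)|\bar K^0\rangle\right] \quad\text{for } \lambda_S, \tag{11.13}$$

where $|\epsilon| \ll 1$ in the neutral kaon system and $\epsilon$ is defined to be

$$\epsilon = \frac{\sqrt{\mu - i\gamma/2} - \sqrt{\mu^* - i\gamma^*/2}}{\sqrt{\mu - i\gamma/2} + \sqrt{\mu^* - i\gamma^*/2}}. \tag{11.14}$$

Note that $\epsilon \neq 0$ only because of CP violation, for otherwise $\mu$ and $\gamma$ would be real.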
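The eigenvector structure in (11.12) to (11.14) can be checked numerically. The Python sketch below builds the matrix (11.7) from made-up values of $M$, $\Gamma$, $\mu$ and $\gamma$ (hypothetical numbers chosen only for illustration, not kaon data), computes $\epsilon$ from (11.14), and verifies that the vectors $(1+\epsilon,\ \pm(1-\epsilon))$ are eigenvectors. Which eigenvalue ends up labelled $L$ or $S$ depends on the branch chosen for the square roots, so the sketch does not attempt that labelling.

```python
import cmath

def mixing_parameters(mu, gamma):
    """Return (sqrt(H12), sqrt(H21), epsilon) for the off-diagonal entries of (11.7)."""
    sqrt_h12 = cmath.sqrt(mu - 0.5j * gamma)
    sqrt_h21 = cmath.sqrt(mu.conjugate() - 0.5j * gamma.conjugate())
    epsilon = (sqrt_h12 - sqrt_h21) / (sqrt_h12 + sqrt_h21)  # (11.14)
    return sqrt_h12, sqrt_h21, epsilon

def apply_matrix(h, v):
    """Multiply a 2x2 matrix (tuple of rows) by a 2-component vector."""
    return (h[0][0] * v[0] + h[0][1] * v[1], h[1][0] * v[0] + h[1][1] * v[1])

# Hypothetical illustrative inputs (arbitrary units), with small imaginary parts
# in mu and gamma standing in for CP violation.
M, Gamma = 0.5, 4.0e-3
mu = complex(1.0e-3, 2.0e-6)
gamma = complex(3.9e-3, -1.0e-6)

diag = M - 0.5j * Gamma
H = ((diag, mu - 0.5j * gamma),
     (mu.conjugate() - 0.5j * gamma.conjugate(), diag))

sqrt_h12, sqrt_h21, eps = mixing_parameters(mu, gamma)

residuals = []
for sign in (+1, -1):
    vector = (1 + eps, sign * (1 - eps))         # form of (11.13) for +, (11.12) for -
    eigenvalue = diag + sign * sqrt_h12 * sqrt_h21
    hv = apply_matrix(H, vector)
    residuals.append(max(abs(hv[0] - eigenvalue * vector[0]),
                         abs(hv[1] - eigenvalue * vector[1])))

print("epsilon =", eps)
print("eigenvector residuals =", residuals)
```

With real $\mu$ and $\gamma$ the two square roots coincide and the function returns $\epsilon = 0$, in line with the remark that $\epsilon$ is nonzero only because of CP violation.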
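Current experimental best-fit values for the neutral kaon system are

$$M = 0.497611 \pm 0.000013\ \text{GeV}, \tag{11.15}$$
$$\Delta m = (3.484 \pm 0.006) \times 10^{-15}\ \text{GeV}, \tag{11.16}$$
$$\Gamma_S = (7.3510 \pm 0.0033) \times 10^{-15}\ \text{GeV}, \tag{11.17}$$
$$\Gamma_L = (1.2866 \pm 0.0053) \times 10^{-17}\ \text{GeV}, \tag{11.18}$$
$$\epsilon = |\epsilon|\, e^{i\phi_\epsilon}, \tag{11.19}$$

where

$$|\epsilon| = (2.228 \pm 0.011) \times 10^{-3}, \tag{11.20}$$
$$\phi_\epsilon = \frac{\pi}{4}\,(0.9671 \pm 0.0011)\ \text{radians}. \tag{11.21}$$

Remarkably, the real and imaginary parts of $\epsilon$ are almost equal.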
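These widths translate directly into the lifetimes and decay lengths quoted elsewhere in the chapter through $\tau = \hbar/\Gamma$. The Python sketch below does the conversion using the central values above; the constants $\hbar$ and $c$ are standard values supplied here, and the variable names are illustrative.

```python
import cmath
import math

HBAR_GEV_S = 6.582119569e-25   # hbar in GeV s
C_M_PER_S = 299792458.0        # speed of light in m/s

def lifetime_s(width_gev):
    """Lifetime in seconds for a total width given in GeV."""
    return HBAR_GEV_S / width_gev

GAMMA_S = 7.3510e-15   # GeV, central value of (11.17)
GAMMA_L = 1.2866e-17   # GeV, central value of (11.18)

tau_s = lifetime_s(GAMMA_S)
tau_l = lifetime_s(GAMMA_L)
ctau_s_cm = C_M_PER_S * tau_s * 100.0
ctau_l_m = C_M_PER_S * tau_l

# epsilon from (11.19)-(11.21), central values
epsilon = cmath.rect(2.228e-3, 0.9671 * math.pi / 4.0)

print(f"tau_S = {tau_s:.4e} s, c*tau_S = {ctau_s_cm:.2f} cm")
print(f"tau_L = {tau_l:.4e} s, c*tau_L = {ctau_l_m:.2f} m")
print(f"tau_L / tau_S = {tau_l / tau_s:.0f}")
print(f"Re(epsilon) = {epsilon.real:.3e}, Im(epsilon) = {epsilon.imag:.3e}")
```

Because $\phi_\epsilon$ is within a few percent of $\pi/4$, the real and imaginary parts of $\epsilon$ printed by the last line differ by only about five percent.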
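The time evolution of these diagonalized Hamiltonian eigenstates is now straightforward,
and is given by

$$|K_L(t)\rangle = |K_L\rangle\, e^{-i\lambda_L t}, \quad\text{and}\quad |K_S(t)\rangle = |K_S\rangle\, e^{-i\lambda_S t}. \tag{11.22}$$

In terms of the strangeness eigenstates, (11.12) and (11.13) read

$$|K_L\rangle = p|K^0\rangle - q|\bar K^0\rangle, \qquad |K_S\rangle = p|K^0\rangle + q|\bar K^0\rangle, \tag{11.23}$$

where

$$p = \frac{1 + \epsilon}{\sqrt{2 + 2|\epsilon|^2}}, \qquad q = \frac{1 - \epsilon}{\sqrt{2 + 2|\epsilon|^2}}. \tag{11.24}$$

Note, $p$ and $q$ satisfy the condition $|p|^2 + |q|^2 = 1$, which is required for normalized
eigenstates. The expansion of $K^0$ and $\bar K^0$ in terms of $K_L$ and $K_S$ is then

$$\begin{pmatrix} |K^0\rangle \\ |\bar K^0\rangle \end{pmatrix} = \frac{1}{2pq}\begin{pmatrix} q & q \\ -p & p \end{pmatrix}\begin{pmatrix} |K_L\rangle \\ |K_S\rangle \end{pmatrix}. \tag{11.25}$$

This expression is valuable when the produced states are strangeness eigenstates (e.g.,
$K^0$) but the time evolution is governed by the mass eigenstates. We will do examples
of these considerations in the subsequent discussion.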

11.3 CP Eigenstates
In the previous subsection we identified the $K^0$ state and $\bar K^0$ state by their bound
state constituent quarks. Each had definite strangeness according to the single strange
quark or anti-strange quark it contains. $K^0$ was identified as a bound state of $d$ and
$\bar s$ quarks, and $\bar K^0$ as a bound state of $\bar d$ and $s$ quarks. If we apply the CP operator on
these quarks they turn into their anti-quark complements. Thus we recognize that¹

$$CP|K^0\rangle = |\bar K^0\rangle, \quad\text{and}\quad CP|\bar K^0\rangle = |K^0\rangle. \tag{11.26}$$

Given this rather simple transformation process, we can form CP eigenstates:

$$|K^0_{(-)}\rangle = \frac{1}{\sqrt{2}}\left(|K^0\rangle - |\bar K^0\rangle\right), \tag{11.27}$$
$$|K^0_{(+)}\rangle = \frac{1}{\sqrt{2}}\left(|K^0\rangle + |\bar K^0\rangle\right), \tag{11.28}$$

where $K^0_{(-)}$ and $K^0_{(+)}$ have CP eigenvalues of $-1$ and $+1$ respectively:

$$CP|K^0_{(+)}\rangle = (+1)|K^0_{(+)}\rangle \quad\text{and}\quad CP|K^0_{(-)}\rangle = (-1)|K^0_{(-)}\rangle. \tag{11.29}$$

Why would we want to identify CP eigenstates in this system? After all, the mass
eigenstates $K_L$ and $K_S$ are not CP eigenstates, nor are the “production eigenstates”
$K^0$ and $\bar K^0$ eigenstates of CP. Nevertheless, there are two key reasons why it is
helpful to identify the CP eigenstates. First, since $|\epsilon| \ll 1$ in the kaon system, the
mass eigenstates are nearly identical to the CP eigenstates, which is important to
note. Turning this statement around, the mass eigenstates are not CP eigenstates only
because there are small CP violating effects in the kaon system. This is all abstract
if there are no experimental indications of this small CP violation. In fact, there is
experimental indication, which we will come to shortly, which is best understood by
showing that the kaon mass eigenstates each have at least a little bit of CP even and
CP odd eigenstate overlap within them. That is the second reason why it is helpful
to identify formally the CP eigenstates among the neutral kaons.

11.4 Neutral Kaon Oscillations and Lifetimes

A neutral kaon is made experimentally by production of strange quarks in one manner
or another. This can be accomplished by $W^{+*} \to \bar s u$ or $W^{-*} \to s\bar u$ or
$g^*, \gamma^*, Z^* \to s\bar s$, where the $*$ denotes a virtual (off-shell) boson. During the
hadronization process each $s$ or $\bar s$ quark combines with other quarks to create pure
flavor kaons. Thus, the kaons are given birth as a flavor eigenstate, either $K^0$ $(d\bar s)$ or
$\bar K^0$ $(s\bar d)$. However, these states are not mass eigenstates, and the time evolution of
the system is obtained by expanding in terms of its Hamiltonian eigenstates $K_L$ and $K_S$.

¹ More precisely, $CP|K^0\rangle = \eta|\bar K^0\rangle$, where $\eta$ is an arbitrary and unobservable phase factor such
that $|\eta|^2 = 1$. We choose $\eta = 1$ out of convenience.

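Let us suppose that at time $t = 0$ we have created a pure flavor eigenstate $K^0$.
From (11.25) we can expand $K^0$ in terms of the mass eigenstates

$$|K^0\rangle = \frac{1}{2p}\left(|K_L\rangle + |K_S\rangle\right). \tag{11.30}$$

Let us define the quantum state $|\psi(t)\rangle$ to be the time-evolved wave function subject
to the initial condition that $|\psi(0)\rangle = |K^0\rangle$ at $t = 0$. At subsequent times

$$|\psi(t)\rangle = \frac{1}{2p}\left(|K_L(t)\rangle + |K_S(t)\rangle\right) \tag{11.31}$$
$$= \frac{1}{2p}\left(e^{-i\lambda_L t}|K_L\rangle + e^{-i\lambda_S t}|K_S\rangle\right) \tag{11.32}$$
$$= \frac{1}{2}\left(e^{-i\lambda_L t} + e^{-i\lambda_S t}\right)|K^0\rangle + \frac{q}{2p}\left(e^{-i\lambda_S t} - e^{-i\lambda_L t}\right)|\bar K^0\rangle. \tag{11.33}$$

Utilizing (11.33) one finds that at time $t$ the probability that the $|\psi(t)\rangle$ state is
measured to be $|K^0\rangle$ given that it started at $t = 0$ as $|K^0\rangle$ is

$$P_{K^0}(t) = |\langle K^0|\psi(t)\rangle|^2 \tag{11.34}$$
$$= \frac{1}{4}\left[e^{-\Gamma_S t} + e^{-\Gamma_L t} + 2\cos((m_L - m_S)t)\, e^{-(\Gamma_S + \Gamma_L)t/2}\right]. \tag{11.35}$$

Similarly, the probability that the $|\psi(t)\rangle$ state is measured to be $|\bar K^0\rangle$ given that it
started as $|K^0\rangle$ at $t = 0$ is

$$P_{\bar K^0}(t) = |\langle \bar K^0|\psi(t)\rangle|^2 \tag{11.36}$$
$$= \frac{1}{4}\left|\frac{q}{p}\right|^2\left[e^{-\Gamma_S t} + e^{-\Gamma_L t} - 2\cos((m_L - m_S)t)\, e^{-(\Gamma_S + \Gamma_L)t/2}\right]. \tag{11.37}$$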
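The probabilities (11.35) and (11.37) can be evaluated directly from the measured parameters. The Python sketch below does so and cross-checks the closed forms against the amplitudes in (11.33). The common mass $M$ is dropped from the exponents because it only contributes an overall phase (and keeping it would cost floating-point precision); the function names are illustrative assumptions.

```python
import cmath
import math

HBAR_GEV_S = 6.582119569e-25   # hbar in GeV s, to convert seconds to GeV^-1
DELTA_M = 3.484e-15            # GeV, m_L - m_S from (11.16)
GAMMA_S = 7.3510e-15           # GeV, from (11.17)
GAMMA_L = 1.2866e-17           # GeV, from (11.18)
EPSILON = cmath.rect(2.228e-3, 0.9671 * math.pi / 4.0)
Q_OVER_P = (1 - EPSILON) / (1 + EPSILON)   # from (11.24)

def probabilities(t_seconds):
    """(P_K0, P_K0bar) at proper time t for a state born as K0: (11.35), (11.37)."""
    t = t_seconds / HBAR_GEV_S
    decay = math.exp(-GAMMA_S * t) + math.exp(-GAMMA_L * t)
    interference = 2.0 * math.cos(DELTA_M * t) * math.exp(-0.5 * (GAMMA_S + GAMMA_L) * t)
    return (0.25 * (decay + interference),
            0.25 * abs(Q_OVER_P) ** 2 * (decay - interference))

def probabilities_from_amplitudes(t_seconds):
    """Same quantities from the amplitudes in (11.33), with the common phase removed."""
    t = t_seconds / HBAR_GEV_S
    lam_l = 0.5 * DELTA_M - 0.5j * GAMMA_L
    lam_s = -0.5 * DELTA_M - 0.5j * GAMMA_S
    e_l = cmath.exp(-1j * lam_l * t)
    e_s = cmath.exp(-1j * lam_s * t)
    return (abs(0.5 * (e_l + e_s)) ** 2,
            abs(0.5 * Q_OVER_P * (e_s - e_l)) ** 2)

for t in (0.0, 1.0e-10, 5.0e-10, 1.0e-9):
    p_k0, p_k0bar = probabilities(t)
    print(f"t = {t:.1e} s: P(K0) = {p_k0:.4f}, P(K0bar) = {p_k0bar:.4f}")
```

At $t = 0$ the state is pure $K^0$; a $\bar K^0$ component then builds up on the time scale set by $1/\Delta m$, which is comparable to $\tau_S$, so the oscillation is strongly damped by the $K_S$ decay.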

The lifetime of $K_L$ is $\tau_L = 5.12 \times 10^{-8}$ s, which is about 570 times longer than the
$K_S$ lifetime of $\tau_S = 8.956 \times 10^{-11}$ s. At times well above $\tau_S$ the $|K_S\rangle$ component of
$|\psi(t)\rangle$ in (11.32) has almost completely decayed away and therefore
$|\psi(t \gg \tau_S)\rangle \propto |K_L\rangle$. Thus, all the branching ratios at those late times are reflective
of $K_L$ decays. On the other hand, for times $t \lesssim \tau_S$ essentially the only decays that are
taking place are those of the $K_S$ state, since it has a much shorter lifetime. These
considerations allow us to exploit experimentally the regions of time and space where we
expect a preponderance of $K_S$ decays and $K_L$ decays.
We can rephrase the above considerations in terms of spatial resolution instead of
time resolution of $K_L$ and $K_S$ decays. The decay lengths $c\tau$ for $K_S$ and $K_L$
are $\sim 2.7$ cm and $\sim 15$ m, respectively. The good separation in lifetime and the
not-too-large macroscopic distances that the kaons travel give experiments the capability
of creating circumstances that can separate $K_L$ decays from $K_S$ decays. For example, if one
produces the kaons relativistically, or nearly relativistically, such as in the manner above
(pure $K^0$ at $t = 0$), one can observe decays at distances long enough (well past $c\tau$ of
$K_S$) where only $K_L$ states exist. This “$K_L$ region” must be adjusted for the specific
kinematics in each experiment, especially if the produced kaons are not relativistic. In
the $K_L$ region there are no $K_S$ and plenty of $K_L$ states present, and what is measured
can be interpreted as pure $K_L$ decays.
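How far downstream the “$K_L$ region” begins depends on the kaon momentum through the boost factor $\beta\gamma = p/m$. The Python sketch below estimates mean decay lengths and surviving fractions in the laboratory frame. The momentum of 10 GeV and the distance of 20 m are hypothetical values chosen for illustration, and each component is treated as decaying independently, ignoring $K_S$ and $K_L$ interference.

```python
import math

M_K_GEV = 0.497611     # neutral kaon mass from (11.15)
C_TAU_S_M = 0.027      # c*tau for K_S, about 2.7 cm as quoted in the text
C_TAU_L_M = 15.0       # c*tau for K_L, about 15 m as quoted in the text

def mean_decay_length_m(momentum_gev, c_tau_m, mass_gev=M_K_GEV):
    """Mean lab-frame decay length: beta*gamma*c*tau, with beta*gamma = p/m."""
    return momentum_gev / mass_gev * c_tau_m

def surviving_fraction(distance_m, momentum_gev, c_tau_m):
    """Fraction of a component that has not yet decayed after distance_m."""
    return math.exp(-distance_m / mean_decay_length_m(momentum_gev, c_tau_m))

momentum = 10.0   # GeV, hypothetical kaon momentum
distance = 20.0   # m, hypothetical distance from the production target

k_short_left = surviving_fraction(distance, momentum, C_TAU_S_M)
k_long_left = surviving_fraction(distance, momentum, C_TAU_L_M)

print(f"K_S mean decay length: {mean_decay_length_m(momentum, C_TAU_S_M):.2f} m")
print(f"K_L mean decay length: {mean_decay_length_m(momentum, C_TAU_L_M):.0f} m")
print(f"surviving at {distance} m: K_S {k_short_left:.1e}, K_L {k_long_left:.3f}")
```

For these illustrative inputs the $K_S$ component is gone for all practical purposes after a few tens of metres, while most of the $K_L$ component is still present.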

11.5 Neutral Kaon Decay to Pions

Let us consider possible hadronic decays of the kaons. The two candidates for decay
that are kinematically viable are $2\pi$ and $3\pi$ final states. By $2\pi$ we mean $K^0 \to \pi^0\pi^0$
and $\pi^+\pi^-$, and by $3\pi$ we mean $K^0 \to \pi^0\pi^0\pi^0$ and $\pi^0\pi^+\pi^-$. These two classes of
final states are pure CP eigenstate final states, where $2\pi$ is pure CP even ($+1$) and $3\pi$
is pure CP odd ($-1$). This can be understood by recognizing that both final states are
invariant under charge conjugation C and that the pions are pseudoscalar particles
($P_\pi = -1$). The $2\pi$ final state is even under parity, $P_\pi P_\pi = (-1)(-1) = +1$, and
the $3\pi$ final state is odd under parity, $P_\pi P_\pi P_\pi = (-1)^3 = -1$, therefore

$$CP|2\pi\rangle = (+1)|2\pi\rangle \quad\text{and}\quad CP|3\pi\rangle = (-1)|3\pi\rangle. \tag{11.38}$$

Here, it is important that the kaon states are spin-0, so that in their rest frame the
total angular momentum is 0. Since the pions are also spinless, the orbital angular
momentum is 0, so the orbital contribution to the parity of the pion states is always
$(-1)^{\ell} = (-1)^0 = 1$. In order to produce a $+1$ CP eigenstate in the final state, the parent particle
that gave rise to it must have at least some CP even component to it. Likewise, in
order to produce a $-1$ CP eigenstate in the final state, the parent particle that gave
rise to it must have at least some CP odd component to it. In other words, to decay
into $2\pi$ or $3\pi$ the parent state must have a component of $K^0_{(+)}$ or $K^0_{(-)}$ respectively:

\[
  K^0_{(+)} \to \pi^0\pi^0,\ \pi^+\pi^- \quad\text{and}\quad K^0_{(-)} \to \pi^0\pi^0\pi^0,\ \pi^0\pi^+\pi^- \quad\text{are allowed.} \tag{11.39}
\]

If we compare (11.12) and (11.13) with (11.27) and (11.28), noting that $|\epsilon| \ll 1$,
we see that the $K_L$ eigenstate is nearly a pure CP odd eigenstate ($K^0_{(-)}$) and the $K_S$
eigenstate is nearly a pure CP even eigenstate ($K^0_{(+)}$). The expansion of $K_L$ and $K_S$ in
terms of these CP eigenstates is

\[
  |K_L\rangle = \frac{1}{\sqrt{1+|\epsilon|^2}} \left( |K^0_{(-)}\rangle + \epsilon\,|K^0_{(+)}\rangle \right), \tag{11.40}
\]
\[
  |K_S\rangle = \frac{1}{\sqrt{1+|\epsilon|^2}} \left( |K^0_{(+)}\rangle + \epsilon\,|K^0_{(-)}\rangle \right), \tag{11.41}
\]

where again $\epsilon$ is the CP violating parameter defined by (11.14).
A clear signature for CP violation then is to witness decays of $K_L$ into $\pi\pi$, which
can only take place through its $K^0_{(+)}$ component, which in turn can only be present if
there is CP violation in the theory. Indeed, that is what was found and CP violation
in the kaon sector was established.²
Our description above of how to determine that there is CP violation present in the
kaon system has some additional subtleties beyond just making a $K_L$ and watching
it decay. First, it must be noted that there is no direct production mechanism for
making pure $K_L$ states. What is really produced are flavor eigenstates (‘strangeness
eigenstates’) $K^0$ and $\bar{K}^0$, which have nearly equal parts of $K_S$ and $K_L$ in them. How
do we create a circumstance such that when we see a decay into $\pi^+\pi^-$ we know it
came from a $K_L$ meson? As mentioned above, one key method of identification is
to wait long enough that only the $K_L$ component survives.
Let us review briefly the experimental evidence. As we just mentioned, $K_L$ decays
to pions take place through its overlap with CP even $2\pi$ final states and CP odd
$3\pi$ final states. Recall that the $K_L$ would be a pure CP odd eigenstate if the CP violation
parameter $\epsilon = 0$ (see (11.40)). Therefore, a clear signal for CP violation in the kaon
system, and therefore $\epsilon \neq 0$, is a non-zero rate for $K_L \to 2\pi$. Experimentally one
finds (see RPP tables)

\[
  \frac{\Gamma(K_L \to \pi^+\pi^-)}{\Gamma(K_L \to \text{all})} = (1.967 \pm 0.010) \times 10^{-3}, \quad\text{and} \tag{11.42}
\]
\[
  \frac{\Gamma(K_L \to \pi^0\pi^0)}{\Gamma(K_L \to \text{all})} = (8.64 \pm 0.06) \times 10^{-4}. \tag{11.43}
\]

The decay rates of K L → π π are small but they are nevertheless nonzero, indicating
a small CP violating effect in kaon mixing and kaon decays.

11.6 Direct CP Violation in Kaon Decay

In the above we have approximated all CP violating effects as coming through the
CP violating parameter $\epsilon$ that affects the misalignment of the pure CP eigenstates
$K^0_{(+)}$ and $K^0_{(-)}$ with respect to the mass eigenstates $K_L$ and $K_S$, as shown in (11.40)
and (11.41). However, there is another source of CP violation in the kaon system
that affects observables. This is the “direct CP violation” that occurs from the CP
eigenstate decaying into a final state with a different CP eigenvalue.

2 J. H. Christenson, J. W. Cronin, V. L. Fitch, “Evidence for the 2π Decay of the $K_2^0$ Meson.” Phys.
Rev. Lett. 13 (1964) 138.

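The disentangling of the direct and indirect CP-violating effects starts with defining
two parameters that can be extracted out of observable ratios of CP violating to
CP conserving decay amplitudes for $K_L$ and $K_S$ decays to $\pi^0\pi^0$ and $\pi^+\pi^-$ final
states. Define:

\[
  \eta_{+-} = \frac{\langle \pi^+\pi^-|H_{\rm weak}|K_L\rangle}{\langle \pi^+\pi^-|H_{\rm weak}|K_S\rangle}, \tag{11.44}
\]
\[
  \eta_{00} = \frac{\langle \pi^0\pi^0|H_{\rm weak}|K_L\rangle}{\langle \pi^0\pi^0|H_{\rm weak}|K_S\rangle}. \tag{11.45}
\]
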
Evaluating these requires some care in analyzing weak decay amplitudes of kaon
decays into two pions. Final state pions with different isospins will have different
amplitudes for the decay. Therefore, it is important to decompose the two-pion
systems into pure isospin eigenstates with their appropriate Clebsch-Gordan coefficients:

\[
  |\pi^0\pi^0\rangle = \sqrt{\tfrac{1}{3}}\,|(\pi\pi)_{I=0}\rangle - \sqrt{\tfrac{2}{3}}\,|(\pi\pi)_{I=2}\rangle, \tag{11.46}
\]
\[
  |\pi^+\pi^-\rangle = \sqrt{\tfrac{2}{3}}\,|(\pi\pi)_{I=0}\rangle + \sqrt{\tfrac{1}{3}}\,|(\pi\pi)_{I=2}\rangle. \tag{11.47}
\]

With these definitions we can define isospin-dependent amplitudes

\[
  \langle(\pi\pi)_I|H_{\rm weak}|K^0\rangle = A_I\, e^{i\delta_I}, \quad\text{and}\quad
  \langle(\pi\pi)_I|H_{\rm weak}|\bar{K}^0\rangle = A_I^*\, e^{i\delta_I}, \tag{11.48}
\]

for $I = 0, 2$. The phases $\delta_I$ are due entirely to final state pion interactions and thus
are the same for both amplitudes. Then $A_2$ is complex, but without loss of generality
we can take $A_0$ to be real by a choice of phase for the two-pion states.
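To compute $\eta_{+-}$ and $\eta_{00}$ one needs to expand the amplitudes of the numerator
and denominator of (11.44) and (11.45):

\begin{align*}
  \langle \pi^+\pi^-|H_{\rm weak}|K_L\rangle &= \left( \sqrt{\tfrac{2}{3}}\,\langle(\pi\pi)_{I=0}| + \sqrt{\tfrac{1}{3}}\,\langle(\pi\pi)_{I=2}| \right) H_{\rm weak} \left( p\,|K^0\rangle - q\,|\bar{K}^0\rangle \right), \\
  \langle \pi^+\pi^-|H_{\rm weak}|K_S\rangle &= \left( \sqrt{\tfrac{2}{3}}\,\langle(\pi\pi)_{I=0}| + \sqrt{\tfrac{1}{3}}\,\langle(\pi\pi)_{I=2}| \right) H_{\rm weak} \left( p\,|K^0\rangle + q\,|\bar{K}^0\rangle \right), \\
  \langle \pi^0\pi^0|H_{\rm weak}|K_L\rangle &= \left( \sqrt{\tfrac{1}{3}}\,\langle(\pi\pi)_{I=0}| - \sqrt{\tfrac{2}{3}}\,\langle(\pi\pi)_{I=2}| \right) H_{\rm weak} \left( p\,|K^0\rangle - q\,|\bar{K}^0\rangle \right), \\
  \langle \pi^0\pi^0|H_{\rm weak}|K_S\rangle &= \left( \sqrt{\tfrac{1}{3}}\,\langle(\pi\pi)_{I=0}| - \sqrt{\tfrac{2}{3}}\,\langle(\pi\pi)_{I=2}| \right) H_{\rm weak} \left( p\,|K^0\rangle + q\,|\bar{K}^0\rangle \right).
\end{align*}

Expanding in small $|A_2|/A_0 \simeq 1/20$ and $\epsilon$, one finds that

\[
  \eta_{+-} = \epsilon + \epsilon', \tag{11.49}
\]
\[
  \eta_{00} = \epsilon - 2\epsilon', \tag{11.50}
\]

where

\[
  \epsilon' = \frac{i\,{\rm Im}[A_2]}{\sqrt{2}\,A_0}\, e^{i(\delta_2-\delta_0)} . \tag{11.51}
\]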

The physical origin of $\epsilon'$ is that the weak interaction has a CP violating complex
phase in the CKM matrix, and thus CP violation can occur directly in the decays
of the kaons. This effect is accounted for separately from the indirect effect from
the weak interaction role in forming the states $K_L$ and $K_S$, which are not pure CP
eigenstates, through $K^0$–$\bar{K}^0$ mixing.
Experiment gains direct access to ${\rm Re}(\epsilon'/\epsilon)$ by measuring a ratio of various
observable decay rates. To first order in $\epsilon'/\epsilon$,

\[
  \frac{\Gamma(K_L \to \pi^+\pi^-)/\Gamma(K_S \to \pi^+\pi^-)}{\Gamma(K_L \to \pi^0\pi^0)/\Gamma(K_S \to \pi^0\pi^0)}
  = \left|\frac{\eta_{+-}}{\eta_{00}}\right|^2 \simeq 1 + 6\,{\rm Re}\!\left(\frac{\epsilon'}{\epsilon}\right). \tag{11.52}
\]

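The KTeV experiment³ at Fermilab reported the measurement of ${\rm Re}(\epsilon'/\epsilon)$ to be

\[
  {\rm Re}\!\left(\frac{\epsilon'}{\epsilon}\right) = (1.92 \pm 0.21) \times 10^{-3}. \tag{11.53}
\]

This can be compared with CERN’s full NA48 experimental result⁴ of

\[
  {\rm Re}\!\left(\frac{\epsilon'}{\epsilon}\right) = (1.47 \pm 0.22) \times 10^{-3}. \tag{11.54}
\]

Combining all data, the RPP global fit is

\[
  {\rm Re}\!\left(\frac{\epsilon'}{\epsilon}\right) = (1.68 \pm 0.20) \times 10^{-3}. \tag{11.55}
\]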

From these experimental results we see that the numerical impact of direct CP vio-
lation in kaon decays is nonzero and measurable but subdominant to the overall
manifestation of CP violation in the neutral kaon system.
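
As a rough numerical cross-check, the magnitudes $|\eta_{+-}|$ and $|\eta_{00}|$ can be estimated from the $K_L$ branching ratios in (11.42) and (11.43) and the lifetimes quoted earlier, and the double ratio in (11.52) can then be inverted. The $K_S \to \pi\pi$ branching fractions used below are not given in this chapter; they are approximate values assumed for illustration. Because the input uncertainties are comparable to the size of the effect, the result should be read as an order-of-magnitude consistency check rather than a measurement.

```python
import math

# Inputs quoted in the text
TAU_S, TAU_L = 8.956e-11, 5.12e-8        # lifetimes in s
BR_L_PM, BR_L_00 = 1.967e-3, 8.64e-4     # K_L -> pi+pi-, pi0pi0: (11.42), (11.43)

# Assumed inputs (not from the text): approximate K_S -> pi pi branching fractions
BR_S_PM, BR_S_00 = 0.6920, 0.3069


def eta_magnitude(br_long, br_short):
    """|eta| = sqrt(Gamma(K_L -> f) / Gamma(K_S -> f)), with Gamma = BR / tau."""
    return math.sqrt((br_long / TAU_L) / (br_short / TAU_S))


eta_pm = eta_magnitude(BR_L_PM, BR_S_PM)
eta_00 = eta_magnitude(BR_L_00, BR_S_00)

# Invert (11.52): |eta_+- / eta_00|^2 ~ 1 + 6 Re(eps'/eps)
double_ratio = (eta_pm / eta_00) ** 2
re_eps_ratio = (double_ratio - 1.0) / 6.0

print(f"|eta_+-|      ~ {eta_pm:.3e}")
print(f"|eta_00|      ~ {eta_00:.3e}")
print(f"double ratio  ~ {double_ratio:.4f}")
print(f"Re(eps'/eps)  ~ {re_eps_ratio:.1e}")
```

Note that the lifetimes cancel in the double ratio, so only the four branching fractions matter for the last two numbers.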

3 Abouzaid et al. (KTeV), Phys. Rev. D83, 092001 (2011).

4 J. R. Batley et al. (NA48), Phys. Lett. B544, 97 (2002).

11.7 Neutral Kaon Decays to Leptons

The neutral kaons also have substantial branching fractions into semileptonic final
states. The RPP reports

\[
  \frac{\Gamma(K_L \to \pi^\pm e^\mp \nu_e)}{\Gamma(K_L \to \text{all})} = (40.55 \pm 0.11)\%, \tag{11.56}
\]
\[
  \frac{\Gamma(K_L \to \pi^\pm \mu^\mp \nu_\mu)}{\Gamma(K_L \to \text{all})} = (27.04 \pm 0.07)\%. \tag{11.57}
\]

These decays are made possible by the amplitudes of $K^0$ and $\bar{K}^0$ strangeness eigenstates
into semi-leptonic states due to the $W$-mediated weak interactions

\[
  A_\ell = \langle \pi^-\ell^+\nu_\ell|H_{\rm weak}|K^0\rangle, \quad\text{and}\quad
  A_\ell^* = \langle \pi^+\ell^-\bar{\nu}_\ell|H_{\rm weak}|\bar{K}^0\rangle. \tag{11.58}
\]

The corresponding Feynman diagrams are shown in Fig. 11.2. Note, the following
decay amplitudes are to a good approximation zero in comparison to the above
amplitudes and play little role in semi-leptonic decays:

\[
  B_\ell = \langle \pi^-\ell^+\nu_\ell|H_{\rm weak}|\bar{K}^0\rangle \simeq 0, \quad\text{and}\quad
  B_\ell^* = \langle \pi^+\ell^-\bar{\nu}_\ell|H_{\rm weak}|K^0\rangle \simeq 0. \tag{11.59}
\]

In other words, in contrast to $A_\ell$ and $A_\ell^*$ there are no tree-level diagram contributions
from the weak interactions that can give rise to $B_\ell$ and $B_\ell^*$.
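
Fig. 11.2 Tree-level diagrams for the decays of neutral kaons to leptons $K^0 \to \pi^-\ell^+\nu_\ell$ (left) and
$\bar{K}^0 \to \pi^+\ell^-\bar{\nu}_\ell$ (right), with amplitudes given in (11.58)

Let us suppose that at time $t = 0$ a pure $K^0$ state is produced. Due to the quantum
mechanical time evolution of the state, as described above in (11.33), one finds that
the amplitudes for semi-leptonic decays over time become

\[
  \mathcal{M}_{\ell^+}(t) = \langle \pi^-\ell^+\nu_\ell|H_{\rm weak}|\psi(t)\rangle
  = \frac{A_\ell}{2} \left( e^{-im_S t} e^{-\Gamma_S t/2} + e^{-im_L t} e^{-\Gamma_L t/2} \right), \tag{11.60}
\]

and

\[
  \mathcal{M}_{\ell^-}(t) = \langle \pi^+\ell^-\bar{\nu}_\ell|H_{\rm weak}|\psi(t)\rangle
  = \frac{A_\ell^*}{2}\,\bigl(1 - 2\,{\rm Re}(\epsilon)\bigr) \left( e^{-im_S t} e^{-\Gamma_S t/2} - e^{-im_L t} e^{-\Gamma_L t/2} \right), \tag{11.61}
\]

where $|\psi(t)\rangle$ is the time-dependent state subject to the boundary condition that
$|\psi(0)\rangle = |K^0\rangle$ (see (11.33)), and we are working to first order in $\epsilon$. From these
amplitudes we can compute the decay rates into $\pi^-\ell^+\nu_\ell$ and $\pi^+\ell^-\bar{\nu}_\ell$ to be

\[
  \Gamma_{\ell^+}(t) = \Gamma_0\, [\alpha(t) + \beta(t)], \tag{11.62}
\]
\[
  \Gamma_{\ell^-}(t) = \Gamma_0\, [1 - 4\,{\rm Re}(\epsilon)]\, [\alpha(t) - \beta(t)], \tag{11.63}
\]

where $\Gamma_0 \propto |A_\ell|^2$ is a common factor to both decay partial width computations, and

\[
  \alpha(t) = e^{-\Gamma_L t} + e^{-\Gamma_S t}, \tag{11.64}
\]
\[
  \beta(t) = 2\cos(\Delta m\, t)\, e^{-\Gamma_{\rm avg} t}, \tag{11.65}
\]

with $\Delta m = m_L - m_S$ and $\Gamma_{\rm avg} = (\Gamma_L + \Gamma_S)/2$.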

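A lepton charge asymmetry can be formed that cancels the common factor $\Gamma_0$:

\[
  A_{\ell^\pm}(t) = \frac{\Gamma_{\ell^+}(t) - \Gamma_{\ell^-}(t)}{\Gamma_{\ell^+}(t) + \Gamma_{\ell^-}(t)} \tag{11.66}
\]
\[
  \phantom{A_{\ell^\pm}(t)} = \frac{\beta(t) + 2[\alpha(t) - \beta(t)]\,{\rm Re}(\epsilon)}{\alpha(t) - 2[\alpha(t) - \beta(t)]\,{\rm Re}(\epsilon)}. \tag{11.67}
\]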

The asymmetry has a particularly simple and useful form in the limit of large time
compared to the $K_S$ lifetime ($t\,\Gamma_S \gg 1$), where the $K_L$ decays are dominating:

\[
  A_{\ell^\pm}(t) \simeq \frac{\beta}{\alpha} + 2\,{\rm Re}(\epsilon)
  = 2\cos(\Delta m\, t)\, e^{-(\Gamma_S - \Gamma_L)t/2} + 2\,{\rm Re}(\epsilon). \tag{11.68}
\]

By careful measurements over time of this lepton charge asymmetry of a beam of
kaons that are produced as $K^0$ at time $t = 0$ one is able to extract both the mass
splitting $\Delta m$ and the CP violating parameter ${\rm Re}(\epsilon)$. Note, in the limit of long time
$t\,\Gamma_S \gg 1$ the asymmetry $A_{\ell^\pm}(t)$ is dominated by the second term and one can directly
measure ${\rm Re}(\epsilon)$ from the experimental result:

\[
  {\rm Re}(\epsilon) = \frac{1}{2}\, A_{\ell^\pm}(t), \quad\text{for } t \gg \Gamma_S^{-1}. \tag{11.69}
\]

The experimental extraction of ${\rm Re}(\epsilon)$ from this technique was already good enough
in the 1970s to establish its value between $0.0016 < {\rm Re}(\epsilon) < 0.0017$, which can
be compared with today’s experimentally determined result using all techniques,
${\rm Re}(\epsilon) = (1.66 \pm 0.02) \times 10^{-3}$ (RPP). However, the primary value of this analysis
is the determination of the mass splitting.
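
The behavior of the asymmetry can be sketched numerically. The following evaluates (11.67) and its late-time limit (11.68) using the lifetimes, the mass splitting of (11.70), and the RPP value of ${\rm Re}(\epsilon)$ quoted in this chapter; the function names and the chosen sample times are illustrative assumptions.

```python
import math

TAU_S = 8.956e-11     # K_S lifetime in s (from the text)
TAU_L = 5.12e-8       # K_L lifetime in s (from the text)
GAMMA_S, GAMMA_L = 1.0 / TAU_S, 1.0 / TAU_L
DELTA_M = 5.293e9     # mass splitting in s^-1, central value of (11.70)
RE_EPS = 1.66e-3      # Re(epsilon), RPP value quoted in the text


def alpha(t):
    """alpha(t) of (11.64)."""
    return math.exp(-GAMMA_L * t) + math.exp(-GAMMA_S * t)


def beta(t):
    """beta(t) of (11.65)."""
    return 2.0 * math.cos(DELTA_M * t) * math.exp(-0.5 * (GAMMA_L + GAMMA_S) * t)


def asymmetry(t):
    """Lepton charge asymmetry of (11.67), valid to first order in epsilon."""
    a, b = alpha(t), beta(t)
    return (b + 2.0 * (a - b) * RE_EPS) / (a - 2.0 * (a - b) * RE_EPS)


def asymmetry_late(t):
    """Late-time approximation (11.68), for t much larger than tau_S."""
    damping = math.exp(-0.5 * (GAMMA_S - GAMMA_L) * t)
    return 2.0 * math.cos(DELTA_M * t) * damping + 2.0 * RE_EPS


if __name__ == "__main__":
    for n in (0, 2, 5, 10, 20, 30):
        t = n * TAU_S
        print(f"t = {n:2d} tau_S: A = {asymmetry(t):+.5f}, late-time form = {asymmetry_late(t):+.5f}")
```

At $t = 0$ the asymmetry equals 1, since a pure $K^0$ can only produce $\ell^+$ at tree level, and at late times it settles near $2\,{\rm Re}(\epsilon)$.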
It is traditional to quote the mass difference in units of inverse seconds by virtue
of the oscillatory $\cos(\Delta m\, t)$ factors that (11.62) and (11.63) depend on. For example,
the best fit value for $\Delta m$ according to the RPP is

\[
  \Delta m = (5.293 \pm 0.009) \times 10^{9}\ {\rm s}^{-1}. \tag{11.70}
\]

This central value is equivalent to $\Delta m = 3.484 \times 10^{-12}$ MeV, which is extraordinarily
small compared to $m_K = 498$ MeV. Figure 11.3 shows how the asymmetry
develops over time during a time period greater than the $K_S$ lifetime but much shorter
than the $K_L$ lifetime. This experimentally measured value will be compared to the
Standard Model theoretical direct calculation of the mass difference in Sect. 11.8.

Fig. 11.3 Lepton asymmetry of neutral kaon decays $A_{\ell^\pm}$ as defined in (11.67), as a function of $t/\tau_S$,
where $\tau_S = 1/\Gamma_S$ is the $K_S^0$ lifetime. The beam at $t = 0$ consists of pure $K^0$ and $\bar{K}^0$ tagged states
which then oscillate over time according to quantum mechanical evolution. The $K_L$ lifetime is much
longer and corresponds to $\tau_L/\tau_S \simeq 570$. Thus, for $t/\tau_S > 10$ in the figure, all the decays are to a
good approximation $K_L^0$ decays, and the asymmetry asymptotes here to $2\,{\rm Re}(\epsilon) \simeq 0.003$. This plot
is taken from Adler et al. Phys. Lett. B363, 237 (1995), who found $\Delta m = (5.274 \pm 0.029) \times 10^{9}$/s
from the best fit to their data at the CPLEAR detector at CERN
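
The conversion of $\Delta m$ from inverse seconds to energy units is a multiplication by $\hbar$. A minimal sketch, where the value of $\hbar$ in MeV s is the standard constant and is not quoted in the text:

```python
HBAR_MEV_S = 6.582119569e-22   # reduced Planck constant in MeV*s


def inverse_seconds_to_mev(rate_per_second):
    """Convert an angular frequency in s^-1 to an energy in MeV via E = hbar * omega."""
    return HBAR_MEV_S * rate_per_second


delta_m_mev = inverse_seconds_to_mev(5.293e9)   # central value of (11.70)
m_kaon_mev = 498.0                              # kaon mass used in the text

print(f"Delta m       = {delta_m_mev:.3e} MeV")
print(f"Delta m / m_K = {delta_m_mev / m_kaon_mev:.1e}")
```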

11.8 K L − K S Mass Difference

The mass difference of the $K_L$ and $K_S$ is small and is in principle computable within
the Standard Model. The mass splitting is due to level repulsion originating from
the $\Delta S = \pm 2$ off-diagonal terms in the interaction Hamiltonian in (11.6):
$\Delta m = m_L - m_S \simeq 2\,{\rm Re}[M_{12}] = 2\,{\rm Re}[\mu]$. (See (11.11) in the approximation that $\mu$ and $\gamma$
are real.) The matrix element $M_{12}$ can be computed in quantum field theory at leading
order in perturbation theory as

\[
  M_{12} = \frac{1}{2m_K}\, \langle \bar{K}^0|H_{\rm eff}^{\Delta S=2}|K^0\rangle, \tag{11.71}
\]

where $H_{\rm eff}^{\Delta S=2}$ is the part of the effective Hamiltonian density that changes strangeness
by 2 units, which arises from the Feynman diagrams shown in Fig. 11.1. (The normalization
of $|K^0\rangle$ and $\langle\bar{K}^0|$ on the right side of (11.71) is chosen for consistency
with the conventional hadronic matrix element in (11.77) below.)
A leading-order 1-loop computation of the diagrams of Fig. 11.1 gives a result for
the effective Hamiltonian density of the form

\[
  H_{\rm eff}^{\Delta S=2} = \frac{g^4}{16\pi^2 m_W^4} \sum_{i,j=u,c,t} V_{id}^* V_{is} V_{jd}^* V_{js}\, F(m_i^2, m_j^2, m_W^2)\, O_{sd}, \tag{11.72}
\]

which includes the 4-quark operator

\[
  O_{sd} = [\bar{d}\gamma_\mu P_L s][\bar{d}\gamma^\mu P_L s]. \tag{11.73}
\]

In (11.72) there is an insertion of $g$ and a CKM matrix element for each $W$ interaction
vertex, a factor of $1/m_W^2$ for each $W$-boson propagator, the factor $1/16\pi^2$ is a typical
1-loop integration suppression factor, and $F$ is a kinematic function with dimensions
of [mass]$^2$. From the last fact, one might naively expect that $H_{\rm eff}^{\Delta S=2}$ should have
contributions that scale like $g^4/m_W^2$ and $g^4 m_t^2/m_W^4$. However, the true result is much
smaller. To see why, note that if all of the up-type quarks $i, j = u, c, t$ had the same
mass, then the result would actually vanish. This is because if all $m_i^2$ were the same,
then the kinematic function $F$ would contribute the same to each term, and so the
result would be proportional to $\sum_i V_{id}^* V_{is}$, which vanishes due to unitarity of the
CKM matrix, $\sum_i V_{ik}^* V_{il} = \delta_{kl}$. The same applies to the summed index $j = u, c, t$.
This means that in particular, the contributions to $H_{\rm eff}^{\Delta S=2}$ must vanish in the limit
$m_i^2/m_W^2 \to 0$, and therefore for $i = u, c$ must be suppressed by factors $m_i^2/m_W^2$.
The suppression due to cancellation from CKM unitarity, which also occurs in many
other contexts, is called the Glashow-Iliopoulos-Maiani (GIM) mechanism.
This still leaves open the possibility that the virtual top-quark contributions
proportional to $m_t^2$ could dominate. However, the top-quark contributions are suppressed
by very small CKM matrix elements which more than compensate for the large mass
enhancement. A naive estimate, which can be verified by a more involved loop
integral calculation, is that the ratio of magnitudes of contributions proportional to $m_t^2$
and $m_c^2$ is roughly

\[
  \frac{|V_{ts} V_{td}^*|^2\, m_t^2}{|V_{cs} V_{cd}^*|^2\, m_c^2} \simeq 0.04, \tag{11.74}
\]

and so we can conclude that the dominant contribution is due to $m_c^2$.
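
The size of the ratio in (11.74) can be checked with a few lines of arithmetic. The CKM magnitudes $|V_{ts}|$ and $|V_{td}|$ and the top-quark mass below are approximate values assumed for illustration; they are not quoted in this section. The charm inputs follow the values used later in the text.

```python
# Assumed approximate inputs (not quoted in this section)
V_TS, V_TD = 0.040, 0.0086    # CKM magnitudes for the top-quark couplings
M_T = 173.0                   # top-quark mass in GeV

# Inputs used elsewhere in the text
V_CS, V_CD = 0.97, 0.22       # approximately cos(theta_C), sin(theta_C)
M_C = 1.5                     # charm-quark mass in GeV


def naive_weight(v_is, v_id, mass):
    """Naive size |V_is V_id*|^2 m^2 of a single-flavor contribution."""
    return (v_is * v_id) ** 2 * mass ** 2


ratio = naive_weight(V_TS, V_TD, M_T) / naive_weight(V_CS, V_CD, M_C)
print(f"top/charm ratio ~ {ratio:.3f}")
```

The result depends noticeably on the quark-mass definition adopted for $m_c$, so only the order of magnitude is meaningful.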

A 1-loop calculation of the charm quark contribution gives

\[
  H_{\rm eff}^{\Delta S=2} = \eta\, \frac{g^4 m_c^2}{128\pi^2 m_W^4}\, (V_{cs} V_{cd}^*)^2\, O_{sd}, \tag{11.75}
\]

where $\eta$ is a number of order unity that reflects the sizeable and complicated effects
of higher-order corrections. Now, using $|V_{cs}| \simeq \cos\theta_C \simeq 0.97$ and $|V_{cd}| \simeq \sin\theta_C \simeq 0.22$,
where $\theta_C$ is the Cabibbo angle, we can estimate, using (11.71),

\[
  \Delta m = \eta\, \frac{g^4 m_c^2}{128\pi^2 m_W^4}\, \sin^2\theta_C \cos^2\theta_C\, \frac{1}{m_K}\, \langle \bar{K}^0|O_{sd}|K^0\rangle. \tag{11.76}
\]

The remaining hadronic matrix element in (11.76) is rather difficult to obtain reliably,
as it is inherently non-perturbative. It can be parameterized as

\[
  \langle \bar{K}^0|O_{sd}|K^0\rangle = \frac{4}{3}\, f_K^2\, m_K^2\, B, \tag{11.77}
\]

where $f_K \simeq 113$ MeV is the kaon decay constant, and $B$ is another dimensionless
quantity of order unity. (The constant $f_K$ is a universal non-perturbative parameter
which appears in many other kaon decay matrix elements, but one must be careful
because it is often defined in a normalization that makes it larger by a factor $\sqrt{2}$.)
Thus we get as a rough leading order estimate, and putting in the numbers including
$m_c = 1.5$ GeV:

\[
  \Delta m = \eta B\, \frac{g^4 m_c^2 f_K^2 m_K \sin^2\theta_C \cos^2\theta_C}{96\pi^2 m_W^4} \simeq \eta B\, (3 \times 10^{-15}\ {\rm GeV}), \tag{11.78}
\]

which is of the same order as the experimental result of $3.5 \times 10^{-15}$ GeV quoted
above in (11.16). Historically, this is how Gaillard and Lee predicted the charm
quark mass in 1974 before its discovery, by identifying the value of $m_c$ that gave a
theory prediction for the kaon mass splitting $\Delta m$ equal to the experimental result.
Since then, it has been understood in increasingly greater detail that higher order
corrections to $\Delta m$, not reviewed here, are numerically important. These include not
just the factors $\eta$ and $B$ mentioned above, but non-negligible long-distance effects
not captured at all by the effective Hamiltonian $H_{\rm eff}^{\Delta S=2}$.
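
The leading-order estimate (11.78) is easy to reproduce numerically. In the sketch below, the values of the weak coupling $g$ and the $W$ mass are assumed approximate inputs that are not quoted in this section, and the product $\eta B$ is set to 1 by default purely for illustration.

```python
import math

# Assumed approximate inputs (not quoted in this section)
G_WEAK = 0.65            # SU(2) gauge coupling g
M_W = 80.4               # W-boson mass in GeV

# Inputs quoted in the text
M_C = 1.5                # charm-quark mass in GeV
F_K = 0.113              # kaon decay constant in GeV
M_K = 0.498              # kaon mass in GeV
SIN_C, COS_C = 0.22, 0.97


def delta_m_estimate(eta_b=1.0):
    """Leading-order charm contribution to Delta m in GeV, following (11.78)."""
    numerator = G_WEAK**4 * M_C**2 * F_K**2 * M_K * SIN_C**2 * COS_C**2
    denominator = 96.0 * math.pi**2 * M_W**4
    return eta_b * numerator / denominator


estimate = delta_m_estimate()
experiment = 3.484e-15   # GeV, from the measured Delta m quoted in the text
print(f"estimate   = {estimate:.2e} GeV (with eta*B = 1)")
print(f"experiment = {experiment:.2e} GeV")
print(f"ratio      = {estimate / experiment:.2f}")
```

Since $\Delta m$ scales as $m_c^2$ in this approximation, the same function can be inverted to see which charm mass would reproduce the measured splitting for a given choice of $\eta B$.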
As a matter of terminology, the existence of $\Delta m$ is an example of a flavor-changing
neutral current, or FCNC. The name reflects the fact that the change in
flavor (strangeness, in the case of $\Delta m$ with $\Delta S = 2$) is not accompanied by a net
change in electric charge of the hadron. In contrast, the $\Delta S = \pm 1$ decays of neutral
kaons to charged pions, depicted in Fig. 11.2, are examples of charged current
flavor-changing processes.
Some other examples of FCNCs are the $\Delta S = 1$ processes $K^\pm \to \pi^\pm \nu\bar{\nu}$ and
$K^0 \to \mu^+\mu^-$, the $\Delta C = 2$ process of $D^0$–$\bar{D}^0$ mixing, and the $\Delta B = 2$ process of
$B^0$–$\bar{B}^0$ mixing. The GIM mechanism of partial or full cancellation of the would-be
leading contributions to FCNCs due to the unitarity of the CKM matrix applies much
more generally. This makes the precision measurement of FCNCs a powerful tool for
indirect constraints on physics beyond the Standard Model, because hypothetical new
physics effects governed by couplings not involving the CKM matrix, and therefore
not subject to the GIM mechanism, can in principle overwhelm the small sub-leading
order Standard Model contributions. This can occur even when the new particles are
heavier than the TeV scale and beyond direct reach at colliders.

Problems

1. Find approximate numerical values for the complex parameters μ and γ defined
in (11.7) that fit the experimental central values for the observables given in
(11.16)–(11.21).
2. Compute the coefficients $a_-(t)$ and $a_+(t)$ of

\[
  |\psi(t)\rangle = a_-(t)\,|K^0_{(-)}\rangle + a_+(t)\,|K^0_{(+)}\rangle \tag{11.79}
\]

subject to the initial condition that $|\psi(0)\rangle = |K^0\rangle$. Compute the probability that
a measurement would find the system in the state $|K^0_{(-)}\rangle$ at time $t$.
3. Consider the $\Delta S = 1$ decay process $K^0 \to \mu^+\mu^-$ in the Standard Model.

(a) At tree level you can try to draw diagrams involving γ and Z exchange in the
s channel. Explain why these diagrams vanish identically.
(b) At 1-loop order, you can draw two diagrams, each involving a pair of virtual
W bosons. Use the CKM matrix factors to discuss how the GIM mechanism
works to suppress the amplitude beyond naive expectation in this case. Make
a rough order-of-magnitude estimate of the resulting contribution to the decay
rate.
12 Neutrinos

Neutrinos were a somewhat neglected sector of the Standard Model for many years
since they were originally thought to be massless without much complexity to con-
sider. Massless neutrinos were accommodated within the old SM by disallowing any
gauge-singlet right-handed neutrinos, thereby forbidding any gauge-invariant renor-
malizable mass term for neutrinos. Experimental and theoretical progress over the
years began to point to neutrinos having mass, and now that fact is well established.
Earlier in Sect. 10.4 we described how neutrinos can obtain mass. In this chapter
we describe the unique experimental implications of massive neutrinos. We first
describe the sources of copious neutrino fluxes with which one can conduct experiments
to infer neutrino properties. One of the first strong pieces of evidence that neutrinos
might not be massless fermions was the solar neutrino deficit, where experimentalists
measured the neutrino flux coming from the sun and found too few. The results were
not compatible with the massless hypothesis. The reason for this, and the effect that
pervades this entire chapter, is that massive neutrinos oscillate in their flavor content
over time as they propagate. In other words, mass eigenstates and flavor eigenstates
are not synonymous, and a neutrino produced as an electron flavor eigenstate will
oscillate to other flavors over time:

\[
  \psi(t) = \alpha(t)\,\nu_e + \beta(t)\,\nu_\mu + \gamma(t)\,\nu_\tau, \quad\text{where } \alpha(0) = 1 \text{ and } \beta(0) = \gamma(0) = 0. \tag{12.1}
\]

Neutrino masses are required to enable $\beta(t), \gamma(t) \neq 0$ for future times, and thus
enable the $\nu_e$ component of the produced state to reduce or “disappear” over time
($|\alpha(t)| < 1$). A similar situation develops for a muon neutrino or tau neutrino
produced at $t = 0$: they will not maintain their flavor identity over time. Experiment
aims to determine the details of these flavor oscillations.
Because of these flavor oscillations the reader should pay close attention to the
type of neutrino (its flavor) that is produced from the sources described in Sect. 12.1.
One should equally keep in mind that the pure flavor content at birth does not last long,
and the neutrino state will oscillate into a different identity with a linear superposition
of all three flavor components.

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2022
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_12

The qualitative description above will be upgraded to a precise quantitative
description of the propagation of neutrinos through vacuum and through matter
in the subsequent sections. From the detailed formalism developed the reader will
be able to see how the parameters of the neutrino sector are pinned down from
experimental data. The measurement process has its own rich history, and we illustrate
some of that history by giving an introduction to methods of neutrino detection. This
includes brief discussions of various neutrino experiments that have been crucial to
the development of our understanding of neutrino properties. We do not intend to be
exhaustive in this survey of experiment. For a comprehensive listing and survey of
the vast experimental landscape of neutrino experiments the reader is encouraged to
consult the Particle Data Group’s Review of Particle Properties (RPP). In the final
section we summarize current understanding of neutrino properties (their masses
and mixings) and discuss some of the future goals in neutrino physics.
We will discuss many different experiments within this chapter. To make it easier
for the reader to keep track of all them we include Table 12.1 that briefly summarizes
each of them. The description of each experiment will become more understandable
as the reader progresses through the chapter.

12.1 Neutrino Sources

In order to carry out experiments and observations that hope to measure properties
of neutrinos we must have copious sources because the interaction of neutrinos
with other particles is weak. We should not only identify which natural and human-
made sources can provide a large flux of neutrinos, but we should also have a good
understanding of the properties of these neutrinos at their birth, most especially their
energies, their expected flavor profiles, and their distances from our detectors.
The main sources of neutrinos that we describe below are from the Sun, from
supernovae, from cosmic rays (“atmospheric neutrinos”), from nuclear reactors, and
from accelerator sources. We discuss each of these in turn.

12.1.1 Solar Neutrinos

The sun is a copious source of neutrinos. The primary source of neutrinos from the
sun is from the basic fusion energy process of

p + p → d + e+ + νe . (12.2)

Although these neutrinos constitute the lion’s share of neutrinos produced by the
sun ($\sim 86\%$), their energies are well below an MeV and are hard to detect by earth-based
detectors, which typically have energy thresholds higher than several MeV.
One therefore wishes to identify the source of the highest energy neutrinos coming
from the sun on which we can apply observational tools.

Table 12.1 Alphabetical list of experiments that are discussed in the chapter with comments on
the roles that are highlighted in the discussion

Experiment         Location       Date        Comments
Baksan/BUST        Russia         1977+       Detected neutrinos from SN1987A
Daya Bay           China          2011–2020   Reactor neutrino detector
DUNE               South Dakota   2030?       Comprehensive neutrino detector
Hyper-Kamiokande   Japan          2027?       Comprehensive neutrino detector
ICARUS             Gran Sasso     2010+       Solar and atmospheric neutrino detector
IMB                Lake Erie      1982–1991   Atmospheric and SN1987A neutrinos
J-PARC             Japan          2009+       High intensity proton accelerator to make neutrinos
KATRIN             Germany        2018+       Search for neutrino mass
Kamiokande         Japan          1983–1985   Atmospheric neutrinos
Kamiokande II      Japan          1985–1990   Detected neutrinos from SN1987A
LBNF               Fermilab       2030?       Accelerator neutrino source and near detectors
MicroBooNE         Fermilab       2014+       Accelerator neutrino experiment
MiniBooNE          Fermilab       2002+       Search for sterile neutrinos
MINOS              Fermilab       2005–2016   Accelerator neutrino detector
NOνA               Minnesota      2011+       Accelerator neutrino detector
SNO                Canada         1999–2006   Solar and atmospheric neutrino observatory
Soudan 2           Minnesota      1989–2001   Atmospheric neutrinos observatory
Super-Kamiokande   Japan          1996+       Comprehensive neutrino detector
T2K                Japan          2010+       Accelerator neutrino detector

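The production and subsequent decays of Boron-8 in the internal solar reaction
chain produce¹ a spectrum of high energy electron neutrinos $\nu_e$ that is peaked at
6.5 MeV and reaches up to $\sim 16$ MeV. The solar chain of reactions that yields these
neutrinos proceeds through

\begin{align*}
  p + p &\to d + e^+ + \nu_e && \text{(as source of $d$)} \\
  d + p &\to {}^3{\rm He} + \gamma && \text{(as source of ${}^3$He)} \\
  {}^3{\rm He} + {}^3{\rm He} &\to {}^4{\rm He} + 2p && \text{(as source of ${}^4$He)} \\
  {}^4{\rm He} + {}^3{\rm He} &\to {}^7{\rm Be} + \gamma && \text{(as source of ${}^7$Be)} \\
  {}^7{\rm Be} + p &\to {}^8{\rm B} + \gamma && \text{(as source of ${}^8$B)} \\
  {}^8{\rm B} &\to {}^8{\rm Be}^* + e^+ + \nu_e && \text{(giving high energy $\nu_e$)}
\end{align*}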
The νe neutrinos that result from the last decay constitute about 1 out of every
5000 neutrinos emitted from the sun. Nevertheless, the flux of “Boron-8 neutrinos”
reaching the earth is approximately 14 million neutrinos per second per cm2 .
The solar neutrino flux is well understood; the energy profile of the highest energy
neutrinos is well characterized; the flavor content of the neutrinos upon birth is
well understood (pure $\nu_e$); and the distance from source to earth is of course well
understood. Thus, experiments that can detect the high-energy neutrinos coming
from the sun with some discriminating power on flavor content have the opportunity
to test whether neutrinos oscillate, signifying that they have mass. Indeed, such tests
were done, and oscillations detected, as will be described later in Sect. 12.4.
Historically the process was messier, with experiments led by Ray Davis first
suggesting a deficit of detected neutrinos on earth compared to what was expected. By
the 1970s this was called the “solar neutrino problem.” Confusion reigned for quite
some time, including questions of systematic errors on the early experiments and
questions on how well the solar model of neutrino production was known. Ultimately,
those confusions were resolved, especially due to the transformative Super-Kamiokande
and SNO (Sudbury Neutrino Observatory) experiments that began in the 1990s, and we are
left today with a clear picture of neutrino production within the sun and how they
indeed oscillate in flavor on their way to earth, resulting in a measurable reduction
of detected electron neutrinos.

12.1.2 Supernova Neutrinos

Another natural source of neutrinos comes from the core collapse of a supernova.
A star with mass $M \gtrsim 8 M_\odot$ collapses to a proto-neutron star when it runs out of
fuel for nuclear fusion. This supernova process is violent in many ways. Not only
does it eject its stellar envelope and produce many photons in the explosion, it also
ejects a very significant number of neutrinos. As the core collapses the free nucleons
capture electrons, producing neutrinos, which causes further core collapse due to
the drop in electron degeneracy pressure, and so on. Within a few seconds 99% of
the gravitational binding energy of the original star is converted to neutrinos hurtling
outward into space.

1 See W. T. Winter et al. “The $^8$B neutrino spectrum”. Phys. Rev. C73, 025503 (2006).

It is a highly non-trivial calculation to determine the total rates of neutrinos, the
total rates of anti-neutrinos, and the relative rates of each flavor. The vast majority
of the neutrinos come within the first second, and there is a steady but falling rate
of neutrinos up to about 20 s. If a supernova occurs within the galaxy or
otherwise nearby, modern large-scale experiments will detect the burst of neutrinos
that comes from it. These neutrinos are correlated with the arrival time of the light
emitted from the exploding supernova, giving further confirming evidence of the
neutrino burst’s origin. Galactic supernovae are estimated to take place about twice
per century. Thus, there is a high premium on getting as much out of an
event as possible.
There has been one nearby supernova in the modern era picked up by neutrino
experiments. Supernova SN1987A was seen in 1987 through a burst of neutrino events that
were recorded by several experiments, including Kamiokande II (12 antineutrinos),
IMB (8 antineutrinos), and Baksan (5 antineutrinos). SN1987A occurred
approximately 50 kpc away in the Large Magellanic Cloud. Comparing the spread
of arrival times of these burst neutrinos with supernova models, including their
uncertainties, yielded the first upper-bound estimates of about 10 eV on the absolute
scale of neutrino masses. Given the many new and larger experiments available today,
careful measurement of neutrinos arising from the next nearby supernova may provide
significantly more information, such as limits on non-standard interactions of
neutrinos with matter and with themselves, not to mention deeper insight into the
internal dynamics of a supernova. At an expected rate of about two per century, a new
supernova event useful for measuring neutrino properties may happen next week or
maybe not for another hundred years.
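
The waiting-time statement can be quantified with a simple sketch, under the assumption (ours, not the text's) that galactic supernovae occur as a Poisson process with the quoted average rate of two per century.

```python
import math

RATE_PER_YEAR = 2.0 / 100.0   # about two galactic supernovae per century (from the text)


def prob_at_least_one(years, rate=RATE_PER_YEAR):
    """Probability of at least one event in `years`, assuming Poisson statistics."""
    return 1.0 - math.exp(-rate * years)


for span in (1, 10, 50, 100):
    print(f"P(at least one supernova in {span:3d} yr) = {prob_at_least_one(span):.3f}")
```

Under this assumption there remains a chance of roughly one in seven that no galactic supernova occurs in a given century.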

12.1.3 Atmospheric Neutrinos

Another important natural source of neutrinos arises from cosmic rays interacting
with the atmosphere. High energy cosmic rays are composed mostly of protons, and
when they collide with molecules in the atmosphere, cosmic ray showers develop. A
shower is made of many short-lived particles, including copious numbers of pions.
As we saw in Sect. 7.6, these pions then decay nearly 100% of the time to muons
and neutrinos, and many muons subsequently decay to even more neutrinos before
reaching the detector. The cascade relevant for main neutrino production is

\begin{align*}
  \text{pion cosmic ray showers:}\quad
  \pi^+ &\to \mu^+ \nu_\mu \to \bar{\nu}_\mu\, \nu_e\, \nu_\mu + e^+, \\
  \pi^- &\to \mu^- \bar{\nu}_\mu \to \nu_\mu\, \bar{\nu}_e\, \bar{\nu}_\mu + e^-. \tag{12.3}
\end{align*}

There are other decay chains, involving $\pi \to e\nu$ and $K^\pm$ decays, that are
subdominant. Thus, from tracking the neutrino flavor content in the cosmic ray shower
evolution, which is approximated by (12.3), one expects roughly double the number
of muon neutrinos compared to electron neutrinos. The simple origin of this
conclusion is that pions decay to muons over 99% of the time.
A conundrum developed when experiments detected muon neutrinos at a rate
significantly less than double the electron neutrino rate. In-depth Monte Carlo
simulations that tracked shower evolution and subsequent neutrino production were
pursued to compute the exact expected ratio in order to compare with experiment. A
useful observable was developed to report the comparison between data and a Monte
Carlo benchmark simulation that assumes no flavor changes after production in the
decay chains:

\[
  R = \frac{(N_\mu/N_e)_{\rm data}}{(N_\mu/N_e)_{\rm MC}}, \tag{12.4}
\]

where $N_\mu$ is the total number of $\nu_\mu$ and $\bar{\nu}_\mu$ neutrinos and $N_e$ is the total number
of $\nu_e$ and $\bar{\nu}_e$ neutrinos. Neutrinos in the sub-GeV energy range consistently showed
$R \simeq 0.65$ across multiple experiments, including Soudan 2, IMB, Kamiokande, and
Super-Kamiokande.
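
The factor of two expected from the decay chains in (12.3) can be tallied explicitly. The sketch below counts neutrinos plus antineutrinos of each flavor in the two chains and then shows what $R \simeq 0.65$ would imply if the Monte Carlo expectation were exactly this naive ratio; that last step is a simplifying assumption, since real simulations include energy spectra, muons that reach the ground before decaying, and detector acceptance.

```python
from collections import Counter

# Final-state neutrinos of the two chains in (12.3)
PI_PLUS_CHAIN = ["nu_mu", "nu_e", "anti_nu_mu"]
PI_MINUS_CHAIN = ["anti_nu_mu", "anti_nu_e", "nu_mu"]


def flavor_counts(chains):
    """Count neutrinos plus antineutrinos of each flavor over all chains."""
    counts = Counter()
    for chain in chains:
        for neutrino in chain:
            flavor = "mu" if neutrino.endswith("mu") else "e"
            counts[flavor] += 1
    return counts


counts = flavor_counts([PI_PLUS_CHAIN, PI_MINUS_CHAIN])
naive_ratio = counts["mu"] / counts["e"]

# Illustrative only: treat the naive ratio as the Monte Carlo expectation in (12.4)
R_MEASURED = 0.65
implied_data_ratio = R_MEASURED * naive_ratio

print(f"N_mu / N_e expected from (12.3): {naive_ratio:.1f}")
print(f"N_mu / N_e implied by R = {R_MEASURED}: {implied_data_ratio:.2f}")
```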
Upon closer inspection, it appeared that data for electron neutrino flux were matching
Monte Carlo estimates, but the data for muon neutrino flux were coming in far
short of Monte Carlo estimates. These Monte Carlo estimates assumed no neutrino
oscillations. If one postulates that neutrino oscillations into another flavor are the
explanation for the deficit of neutrinos, one is led to consider the possibility that many $\nu_\mu$
neutrinos are “disappearing” into $\nu_\tau$ via the oscillation $\nu_\mu \to \nu_\tau$.
Experiments did not have sensitivity to the ντ flux to test the hypothesis directly at the time; however, the supposition did suggest that there should be a different drop-off in the νμ flux going downward versus upward. The reason for this difference is that the oscillation probability depends on the distance the neutrino has traveled, as we will discuss in the next section. If neutrinos are coming from above they travel on the order of 10–100 km, depending on the incoming zenith angle, whereas if they are coming from below they had to have been created on the other side of the earth and traveled ∼10³–10⁴ km. Super-Kamiokande showed² that muon neutrinos indeed have this up-down asymmetry:
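\[ A^{\rm up-down}_{\nu_\mu} = \frac{N^{\rm up}_{\nu_\mu} - N^{\rm down}_{\nu_\mu}}{N^{\rm up}_{\nu_\mu} + N^{\rm down}_{\nu_\mu}} = -0.31 \pm 0.04 \quad \text{(Super-Kamiokande)}. \tag{12.5} \]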

This is one of many self-consistency checks of the Super-Kamiokande data that definitively established that neutrinos have mass and oscillate.

2 K. Scholberg (for the Super-Kamiokande collaboration), “Atmospheric Neutrinos at Super-Kamiokande,” hep-ex/9905016.
12.1 Neutrino Sources 331

12.1.4 Reactor and Accelerator Neutrino Sources

Neutrino sources are not limited to natural processes. Human-made sources, such as nuclear reactors and proton beams bombarding targets, have become increasingly important in neutrino physics. Let us discuss each of these in turn.
First, nuclear reactors operate on the process of fission. When a neutron is captured by a target nucleus, the resulting neutron-enriched isotope is unstable and typically decays to two fission product isotopes and additional free neutrons. For example, a typical fission process involving Uranium 235 is

\[ n + {}^{235}_{92}{\rm U} \to {}^{141}_{56}{\rm Ba} + {}^{92}_{36}{\rm Kr} + 3n. \tag{12.6} \]

The product isotopes of Barium and Krypton have significantly more neutrons than their stable isotopes. To be more precise, the stable isotopes of Barium are $^{X}_{56}{\rm Ba}$, where X = 134–138. The stable isotopes of Krypton are $^{X}_{36}{\rm Kr}$, where X = 80, 82–84. The neutron-richness of the fission products is expected due to the well-known “belt of stability” in nuclear physics that shows the neutron-proton ratio increasing as one proceeds up the periodic table. The high fraction of neutrons in high-Z nuclei, such as $^{235}_{92}{\rm U}$, which is moderately stable, is inherited by the fission products. The lower-Z fission products are then far from the “belt of stability” and proceed to reduce their too-high neutron fraction through β-decay: converting neutrons into protons and emitting electrons and anti-neutrinos in the process (n → p e− ν̄e).
Thus, the neutrinos that come out of reactors are almost entirely the ν̄e that arise from reducing the neutron-richness of fission products. Their energies are typically several MeV. More than 10²⁰ ν̄e are emitted per second in a gigawatt power reactor, which makes it an excellent source of ν̄e. Furthermore, since the source is located on earth, one can place detectors near and far to test the oscillation behavior of ν̄e at multiple distances, pinning down parameters more accurately. The smaller “baselines” (length from source to detector) for ν̄e from reactors compared to νe from the sun provide additional handles on the PMNS matrix for neutrino mixing, and additional sensitivity to the mass splittings among the neutrinos. The Daya Bay experiment was able to utilize this flexibility to gain unique capabilities to measure the θ13 oscillation angle (the “reactor oscillation angle”).
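The quoted emission rate can be checked with a back-of-the-envelope estimate. The sketch below assumes roughly 200 MeV released per fission and roughly six β-decays (hence six ν̄e) per fission; both numbers are typical textbook values that are not given in this section, and the power is taken to be thermal power.

```python
# Order-of-magnitude estimate of the reactor antineutrino emission rate.
# Assumed inputs (not from the text): ~200 MeV released per fission and
# ~6 beta decays, hence ~6 antineutrinos, per fission.
MEV_TO_JOULE = 1.602176634e-13


def antineutrino_rate(thermal_power_watts,
                      energy_per_fission_mev=200.0,
                      antineutrinos_per_fission=6.0):
    """Return the approximate number of anti-nu_e emitted per second."""
    fissions_per_second = thermal_power_watts / (energy_per_fission_mev * MEV_TO_JOULE)
    return antineutrinos_per_fission * fissions_per_second


rate_1gw = antineutrino_rate(1.0e9)
print(f"{rate_1gw:.2e} antineutrinos per second for 1 GW of thermal power")
```

Under these assumptions the estimate lands above 10²⁰ per second for a gigawatt of thermal power, consistent with the statement above.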
Another mechanism by which to produce neutrinos copiously in the lab is to focus an intense accelerator beam of protons onto a target. A typical target is graphite, which, for example, J-PARC uses to produce neutrinos for the T2K neutrino experiment. Graphite is also a contender for the target of the proton booster at Fermilab, which will provide the source of neutrinos for the DUNE experiment. The mechanism is similar to the one that produces atmospheric neutrinos, where protons collide with their atmospheric target, making pions which then decay to neutrinos. The same principle applies here:

p + target −→ π − , π + , . . . −→ νμ , ν̄μ , νe , ν̄e , . . . (12.7)

There is a larger flux of muon neutrinos than of electron neutrinos produced at the collision point for the same reason as in the atmospheric case: pions preferentially decay to muons and muon neutrinos, and only produce electron neutrinos (and another muon neutrino) in the decay of the daughter muons. Due to the high energy of the incoming proton beam, the muon neutrinos mostly follow the original proton direction when they are produced.
There are several advantages of producing neutrinos through a protons-on-target experiment compared to observations of neutrinos produced in the atmosphere. First, one can construct a highly collimated and dense beam of neutrinos focussed on a detector. Second, the baselines can be adjusted with near detectors and far detectors that can compare detection readouts to better infer oscillation parameters. Third, the energies of the neutrinos can be varied by changing the energy of the proton beam bombarding the target. And finally, the composition of neutrinos and anti-neutrinos of various flavors can be manipulated by applying strong magnetic fields to the parent pions.
Regarding the last point, T2K has “magnetic horns” that focus and guide charged pions of one charge along the proton beam line while those of the other charge are guided away. Thus, if only π+ are guided, their decays into μ+ + νμ produce a rather pure νμ beam.³ Likewise, focusing π− generates a beam of ν̄μ. Placing detectors off axis enables one to select a narrow band of neutrino energies. These manipulations of energy range and particle/anti-particle content are very helpful in getting the most out of the experiment in order to comprehensively establish the neutrino properties.
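Why an off-axis detector sees a narrow band of energies can be illustrated with two-body decay kinematics. The sketch below uses the standard relation for π → μν with a massless neutrino, E_ν = (m_π² − m_μ²) / [2(E_π − p_π cos θ)], which is not derived in this section; the 2.5° angle and the pion energies are illustrative assumptions.

```python
import math

M_PI = 0.13957  # charged pion mass in GeV
M_MU = 0.10566  # muon mass in GeV


def neutrino_energy(pion_energy, angle):
    """Neutrino energy (GeV) from pi -> mu + nu, emitted at `angle` (radians)
    to the pion direction, for a pion of total energy `pion_energy` (GeV)."""
    pion_momentum = math.sqrt(pion_energy**2 - M_PI**2)
    return (M_PI**2 - M_MU**2) / (2.0 * (pion_energy - pion_momentum * math.cos(angle)))


def max_off_axis_energy(angle):
    """Largest neutrino energy reachable at a fixed nonzero angle, for any pion energy."""
    return (M_PI**2 - M_MU**2) / (2.0 * M_PI * math.sin(angle))


off_axis = math.radians(2.5)  # illustrative off-axis angle (assumption of this sketch)
for pion_energy in (1.0, 2.0, 4.0, 8.0):
    on = neutrino_energy(pion_energy, 0.0)
    off = neutrino_energy(pion_energy, off_axis)
    print(f"E_pi = {pion_energy:4.1f} GeV: on-axis {on:.2f} GeV, off-axis {off:.2f} GeV")
```

On axis the neutrino energy grows in proportion to the pion energy, whereas at a fixed nonzero angle it is bounded from above, so a broad pion spectrum is compressed into a narrow neutrino energy band.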

12.2 Neutrino Propagation Through Vacuum

As we contemplate the various sources of neutrino production, we recognize that all neutrinos are born as flavor eigenstates. When the pion decays, it decays mainly to a muon plus a muon neutrino. When neutrinos are produced through the chain of reactions internal to the sun, they are born as electron neutrinos. However, flavor eigenstates are not mass eigenstates, and that is what creates both the subtlety and the opportunity for discovering their masses and mixing angles through experiments sensitive to oscillations.

12.2.1 Two-Generation Neutrino Oscillations

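Although there are three flavors of neutrinos, let us for simplicity start by assuming that there are only two flavors, |νe⟩ and |νμ⟩. Each of these flavors is a mixture of mass eigenstates |ν1⟩ and |ν2⟩ according to

\[ \begin{pmatrix} \nu_e \\ \nu_\mu \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \nu_1 \\ \nu_2 \end{pmatrix}. \tag{12.8} \]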

3 The T2K beam is more complex than this, and has residual contributions from opposite-charge pion decays, secondary muon decays, etc. See Abe et al., Phys. Rev. D 87, 012001 (2013).

And let us further suppose that at t = 0 the neutrino that comes out of the source (the Sun) is in the |νe⟩ flavor eigenstate:

\[ |\psi(t=0)\rangle = |\nu_e\rangle = \cos\theta\,|\nu_1\rangle + \sin\theta\,|\nu_2\rangle. \tag{12.9} \]

The evolution of |ψ(t)⟩ in time requires us to compute the evolution of |νk⟩ in time. Our earlier solutions to the Dirac equation for freely propagating fermions suggest that

\[ |\nu_i(t)\rangle = e^{-i\phi_i}\,|\nu_i(t=0)\rangle, \quad \text{where } \phi_i = p_i\cdot x = E_i t - \mathbf{p}_i\cdot\mathbf{x}. \tag{12.10} \]

However, a full and proper treatment of the propagating neutrino is a subtle and extensive task, which involves careful treatment of neutrino wave-packet evolution. Nevertheless, it can be shown⁴ that assuming states are described by (12.10) yields correct results as long as we enforce that the relative phases of highly relativistic neutrinos satisfy

\[ \phi_i - \phi_j \simeq \frac{\Delta m^2_{ij} L}{2E}, \quad \text{where } \Delta m^2_{ij} \equiv m_i^2 - m_j^2, \tag{12.11} \]

and where L is the distance the neutrino state has propagated from the neutrino source. We have exchanged time t for distance L since it is distance that is a known quantity for experiment. Furthermore, only relative phases have observable impact, consistent with known principles of quantum theory, and so (12.11) is all we need to proceed. Note that a naive derivation of (12.11) that starts with $\phi = Et - \mathbf{p}\cdot\mathbf{x} = (E - |\mathbf{p}|)L \simeq m^2 L/2E$ leads fortuitously to the correct answer.
If the neutrino state at birth (L = t = 0) is |ψ(L = 0)⟩ = |νe⟩ then obviously the initial probability for it to be |νμ⟩ is zero. However, at later times (at distances L ≠ 0) this no longer remains true, and the probability must be calculated by first computing |ψ(L)⟩ and then evaluating

\[ P(\nu_\mu) = |\langle\nu_\mu|\psi(L)\rangle|^2. \tag{12.12} \]

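The distance evolution of |ψ(L)⟩ is

\[ |\psi(L)\rangle = \cos\theta\, e^{-i\phi_1}\,|\nu_1\rangle + \sin\theta\, e^{-i\phi_2}\,|\nu_2\rangle. \tag{12.13} \]

Multiplying this equation by a physically inconsequential phase factor of $e^{i\phi_1}$ one finds

\[ |\psi(L)\rangle = \cos\theta\,|\nu_1\rangle + \sin\theta\, e^{-i\phi}\,|\nu_2\rangle, \quad \text{where} \tag{12.14} \]

\[ \phi \equiv \phi_2 - \phi_1 = \frac{(m_2^2 - m_1^2)L}{2E} = \frac{\Delta m^2 L}{2E}. \tag{12.15} \]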

4 A full treatment can be found, e.g., in Beuthe, Phys. Rep. 375, 105 (2003).

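This evolution of the state with distance is now unambiguous and calculable given that we know φ from (12.11).
Now let us re-expand (12.14) in terms of flavor eigenstates, using the inverse of (12.8), |ν1⟩ = cos θ |νe⟩ − sin θ |νμ⟩ and |ν2⟩ = sin θ |νe⟩ + cos θ |νμ⟩, so that we can compute the probability overlap:

\[ |\psi(L)\rangle = \left(\cos^2\theta + \sin^2\theta\, e^{-i\phi}\right)|\nu_e\rangle + \sin\theta\cos\theta\left(e^{-i\phi} - 1\right)|\nu_\mu\rangle. \tag{12.16} \]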

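The probability of νe → νμ transition is then found to be

\[ \begin{aligned} P(\nu_e\to\nu_\mu)(L) &= |\langle\nu_\mu|\psi(L)\rangle|^2 = \sin^2(2\theta)\,\sin^2(\phi/2) \\ &= \sin^2(2\theta)\,\sin^2\!\left(\frac{\Delta m^2 L}{4E}\right) \\ &= \sin^2(2\theta)\,\sin^2\!\left(1.27\,\frac{\Delta m^2}{{\rm eV}^2}\,\frac{L}{\rm km}\,\frac{\rm GeV}{E}\right). \end{aligned} \tag{12.17} \]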

The two-state oscillation result is a decent approximation for solar neutrinos oscillating to |νμ⟩ on their way to earth, and for atmospheric neutrinos produced from π → μν (from pions in cosmic ray showers) oscillating from |νμ⟩ to |ντ⟩ between their point of creation high in the atmosphere and earth-based detectors.
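The numerical factor 1.27 in (12.17) is just the unit conversion of Δm²L/4E. A minimal sketch follows that recomputes the factor from ħc and cross-checks the closed form against the νμ amplitude of the evolved state (12.14); the function names and the example inputs are illustrative choices, not taken from the text.

```python
import cmath
import math

HBAR_C_EV_M = 1.973269804e-7  # hbar*c in eV*m


def phase_coefficient():
    """Factor multiplying dm2[eV^2] * L[km] / E[GeV] inside the last sine of (12.17)."""
    meters_per_km = 1.0e3
    ev_per_gev = 1.0e9
    return meters_per_km / (4.0 * HBAR_C_EV_M * ev_per_gev)


def transition_probability(theta, dm2_ev2, length_km, energy_gev):
    """Closed-form two-flavor transition probability, as in (12.17)."""
    phase = phase_coefficient() * dm2_ev2 * length_km / energy_gev
    return math.sin(2.0 * theta) ** 2 * math.sin(phase) ** 2


def transition_probability_from_state(theta, dm2_ev2, length_km, energy_gev):
    """Same probability from the nu_mu amplitude of the evolved state (12.14)."""
    phi = 2.0 * phase_coefficient() * dm2_ev2 * length_km / energy_gev
    # From (12.8): <nu_mu|nu_1> = -sin(theta) and <nu_mu|nu_2> = cos(theta)
    amplitude = (-math.sin(theta) * math.cos(theta)
                 + math.cos(theta) * math.sin(theta) * cmath.exp(-1j * phi))
    return abs(amplitude) ** 2


# Illustrative inputs (assumptions): maximal mixing, dm2 = 2.5e-3 eV^2, 500 km, 1 GeV
print(phase_coefficient())
print(transition_probability(math.pi / 4, 2.5e-3, 500.0, 1.0))
```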
From (12.17) we see that for a given mass-squared splitting $\Delta m^2$ of neutrinos there is a characteristic oscillation length that depends on energy:

\[ L^{\rm osc}(E) = \frac{4\pi E}{\Delta m^2} = (2.5\ {\rm km})\,\frac{E}{\rm GeV}\,\frac{{\rm eV}^2}{\Delta m^2}. \tag{12.18} \]

There are only two characteristic oscillation lengths in vacuum of relevance within the SM. There is a length associated with

\[ \Delta m^2_{\rm sol} \equiv \Delta m^2_{21} = m_2^2 - m_1^2 \simeq 7.5\times 10^{-5}\ {\rm eV}^2 \tag{12.19} \]

which we label with the sol subscript since the dominant oscillation of solar neutrinos propagating to earth is among the first two eigenstates. There is a second length associated with

\[ \Delta m^2_{\rm atm} \equiv \Delta m^2_{32} = m_3^2 - m_2^2 \simeq 2.5\times 10^{-3}\ {\rm eV}^2 \tag{12.20} \]

which we label with the subscript atm since the dominant oscillation of atmospheric neutrinos propagating to earth is among the second and third eigenstates. There is a third oscillation length associated with $\Delta m^2_{31}$, but it is very close to the one associated with $\Delta m^2_{32}$ since $\Delta m^2_{21} \ll \Delta m^2_{32}$. The solar neutrino oscillations are predominantly νe → νμ flavor oscillations, and the atmospheric neutrino oscillations are predominantly νμ → ντ. Respect for these historical origins of neutrino observations leads us to retain the language of $\Delta m^2_{\rm sol}$ and $\Delta m^2_{\rm atm}$.
Our definitions of $\Delta m^2_{\rm atm} \equiv \Delta m^2_{32}$ and $\Delta m^2_{\rm sol} \equiv \Delta m^2_{21}$ given above assume the “normal hierarchy” (NH) of the neutrino mass spectrum, where $m_3 \gg m_2 > m_1$.

Fig. 12.1 Standard convention for the hierarchy of neutrino mass eigenstates, where $\Delta m^2_{\rm sol} \simeq 7.5\times 10^{-5}\ {\rm eV}^2$ and $\Delta m^2_{\rm atm} \simeq 2.5\times 10^{-3}\ {\rm eV}^2$

Because we only know mass-squared differences rather than the absolute masses of the neutrinos from experiment, and because of the incompleteness of neutrino oscillation measurements, there is a second solution, called the “inverted hierarchy” (IH), that is also consistent with observations. For IH the hierarchy of neutrino masses is by usual convention $m_2 > m_1 \gg m_3$, where then $\Delta m^2_{\rm sol} \equiv \Delta m^2_{21}$ (same as before) and $\Delta m^2_{\rm atm} \equiv \Delta m^2_{23}$. One should note that

\[ m^2_{\rm highest} - m^2_{\rm lowest} = m_3^2 - m_1^2 = \Delta m^2_{31} = \Delta m^2_{32} + \Delta m^2_{21} \quad \text{in NH, whereas} \]

\[ m^2_{\rm highest} - m^2_{\rm lowest} = m_2^2 - m_3^2 = \Delta m^2_{23} \quad \text{in IH,} \]

as illustrated in Fig. 12.1.

As we will discuss below, neutrino mixings are very large, and the flavor eigenstates are far from aligned with the mass eigenstates. Therefore, in the modern setting of neutrino physics it is never appropriate to rely on two-state oscillations alone as a technical analysis for extracting parameters. The two-state case has been presented here to illustrate the physics behind neutrino oscillation in a simpler context that gives approximately correct numerical results in some cases. One should keep in mind, however, that the definitions of $\Delta m^2_{\rm sol}$ and $\Delta m^2_{\rm atm}$ given in (12.19) and (12.20) retain their unambiguous and precise usefulness within the three-generation oscillation framework of the SM. One just has to note that the convention for assigning $\Delta m^2_{\rm atm}$ and $\Delta m^2_{\rm sol}$ to values of $\Delta m^2_{ij}$ changes depending on whether one assumes NH or IH, as discussed above.
We can now compute the characteristic solar and atmospheric oscillation lengths from the two mass-squared gaps within the SM:

\[ L^{\rm osc}_{\rm sol}(E) = \frac{4\pi E}{\Delta m^2_{\rm sol}} \simeq (33{,}000\ {\rm km})\,\frac{E}{\rm GeV}, \tag{12.21} \]

\[ L^{\rm osc}_{\rm atm}(E) = \frac{4\pi E}{\Delta m^2_{\rm atm}} \simeq (990\ {\rm km})\,\frac{E}{\rm GeV}. \tag{12.22} \]

These are the characteristic distance scales over which neutrinos oscillate in their flavor content.
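The numbers in (12.18), (12.21) and (12.22) follow from restoring ħc in 4πE/Δm². A short sketch of that arithmetic, with the mass-squared splittings taken from (12.19) and (12.20) and an illustrative function name:

```python
import math

HBAR_C_EV_KM = 1.973269804e-10  # hbar*c in eV*km


def oscillation_length_km(energy_gev, dm2_ev2):
    """Oscillation length 4*pi*E/dm2 of (12.18), converted to kilometers."""
    energy_ev = energy_gev * 1.0e9
    return 4.0 * math.pi * energy_ev * HBAR_C_EV_KM / dm2_ev2


DM2_SOL = 7.5e-5  # eV^2, from (12.19)
DM2_ATM = 2.5e-3  # eV^2, from (12.20)

print(f"prefactor: {oscillation_length_km(1.0, 1.0):.2f} km")
print(f"solar:     {oscillation_length_km(1.0, DM2_SOL):.0f} km at 1 GeV")
print(f"atm:       {oscillation_length_km(1.0, DM2_ATM):.0f} km at 1 GeV")
```

Since the length scales linearly with E, a solar neutrino of a few MeV has a solar oscillation length of order 100 km rather than tens of thousands of kilometers.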

12.2.2 Three-Generation Neutrino Propagation

In general, any flavor eigenstate of the neutrino is a mixture of all mass eigenstates, according to the PMNS matrix

\[ |\nu_\alpha\rangle = U^*_{\alpha k}\,|\nu_k\rangle \quad \text{(relation among states)} \tag{12.23} \]

where Greek (Roman) letters are flavor (mass) eigenstate indices and the repeated index k is summed over. Note the complex conjugation on $U^*_{\alpha k}$ in (12.23) connecting states within the Hilbert space, whereas among the fields

\[ \nu_\alpha = U_{\alpha k}\,\nu_k \quad \text{(relation among fields)} \tag{12.24} \]

as defined earlier in Chap. 10. This is due to the creation operator b† that creates the |ν⟩ state through b†|0⟩ = |ν⟩ arising from the ν̄ quantum field and not the ν quantum field.
Applying the same techniques as we did in the case of the two-state oscillation problem described above, one finds that the probability of a flavor eigenstate |να⟩ being measured a distance L later as a |νβ⟩ state is

\[ P_{\alpha\to\beta}(L,E) = |\langle\nu_\beta|\nu_\alpha(L)\rangle|^2 = \left| \Big(\sum_i \langle\nu_i|\,U_{\beta i}\Big)\Big(\sum_j e^{-i\Delta m^2_{jk}L/2E}\,U^*_{\alpha j}\,|\nu_j\rangle\Big) \right|^2 = \left| \sum_{i=1}^{3} U_{\beta i}\, e^{-i\Delta m^2_{ik}L/2E}\, U^*_{\alpha i} \right|^2, \tag{12.25} \]

where a physically irrelevant overall phase factor $e^{i m_k^2 L/2E}$ (for a fixed reference index k) was inserted so that every term has a known value according to (12.11) above. Upon expanding this equation one finds the well-known result⁵

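\[ \begin{aligned} P_{\alpha\to\beta}(L,E) = \delta_{\alpha\beta} &- 4\sum_{j=1}^{3}\sum_{i>j}^{3} {\rm Re}\!\left(U^*_{\alpha i}U_{\beta i}U_{\alpha j}U^*_{\beta j}\right) \sin^2\!\left(\frac{\Delta m^2_{ij}L}{4E}\right) \\ &+ 2\sum_{j=1}^{3}\sum_{i>j}^{3} {\rm Im}\!\left(U^*_{\alpha i}U_{\beta i}U_{\alpha j}U^*_{\beta j}\right) \sin\!\left(\frac{\Delta m^2_{ij}L}{2E}\right). \end{aligned} \tag{12.26} \]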

Note that there are three different characteristic neutrino oscillation length scales, depending on the energy of the neutrino and the mass differences between the different mass eigenstates: $L^{\rm osc}_{ij}(E) = 4\pi E/\Delta m^2_{ij}$, where ij = 21, 32 and 31. An experiment that detects neutrinos from a known origin and with energy E has a good prospect for observing clean oscillation signals if the $L^{\rm osc}_{ij}(E)$ are large macroscopic distances (e.g., hundreds to many thousands of kilometers), which they fortunately turn out to be for energies of naturally produced neutrinos.

5 Three-generation oscillation probabilities in this section follow the notation of Nunokawa, Parke and Valle, Prog. Part. Nucl. Phys. 60, 338 (2008).
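If there is no CP violation one finds that the probability of ν̄α → ν̄β oscillation is the same as that of να → νβ. However, with CP violation a difference arises:

\[ P(\nu_\alpha\to\nu_\beta) - P(\bar\nu_\alpha\to\bar\nu_\beta) = 4\sum_{j=1}^{3}\sum_{i>j}^{3} {\rm Im}\!\left(U^*_{\alpha i}U_{\beta i}U_{\alpha j}U^*_{\beta j}\right) \sin\!\left(\frac{\Delta m^2_{ij}L}{2E}\right) \]

which simplifies to the universal expression

\[ P(\nu_\alpha\to\nu_\beta) - P(\bar\nu_\alpha\to\bar\nu_\beta) = -16\,J_{\alpha\beta}\, \sin\!\left(\frac{\Delta m^2_{21}L}{4E}\right) \sin\!\left(\frac{\Delta m^2_{32}L}{4E}\right) \sin\!\left(\frac{\Delta m^2_{31}L}{4E}\right), \tag{12.27} \]

\[ \text{where } J_{\alpha\beta} = {\rm Im}\!\left(U^*_{\alpha 1}U_{\alpha 2}U_{\beta 1}U^*_{\beta 2}\right) = \pm J, \tag{12.28} \]

with the sign being positive (negative) for a cyclic (anti-cyclic) permutation of e, μ and τ (i.e., Jeμ = Jμτ = Jτe = +J, whereas Jeτ = Jμe = Jτμ = −J). J is the lepton-sector analog to the Jarlskog invariant associated with CP violation in the quark sector.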
In addition to the central importance of extracting $\Delta m^2_{ij}$ from neutrino oscillation behavior, one also has dependence on the mixing matrix U. A common way to parametrize this matrix is (RPP)

\[ U = \begin{pmatrix} c_{12}c_{13} & s_{12}c_{13} & s_{13}e^{-i\delta} \\ -s_{12}c_{23} - c_{12}s_{23}s_{13}e^{i\delta} & c_{12}c_{23} - s_{12}s_{23}s_{13}e^{i\delta} & s_{23}c_{13} \\ s_{12}s_{23} - c_{12}c_{23}s_{13}e^{i\delta} & -c_{12}s_{23} - s_{12}c_{23}s_{13}e^{i\delta} & c_{23}c_{13} \end{pmatrix} \tag{12.29} \]

where $c_{ij} \equiv \cos\theta_{ij}$, $s_{ij} \equiv \sin\theta_{ij}$, and δ is the CP violation phase angle. There are three independent angles involved in neutrino mixing, which are θ12 (“solar oscillation angle”), θ23 (“atmospheric oscillation angle”), and θ13 (“reactor oscillation angle”). The parenthetic names indicate what observations were (at least initially) most sensitive to these angles. All of these angles can be restricted to the first quadrant [0, π/2] (and thus $s_{ij}$ and $c_{ij}$ are always positive) without loss of generality as long as δ is allowed to vary over the full range [0, 2π].
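The three-generation formulas can be exercised numerically. The following sketch builds U in the parametrization (12.29), evaluates the amplitude sum (12.25), and compares it with the expanded form (12.26). The mixing angles, the value of δ, the baseline and the energy are illustrative assumptions (not fit results quoted in this section), and the lightest mass is set to zero since only mass-squared differences enter.

```python
import numpy as np

PHASE_FACTOR = 1.267  # dm2[eV^2] * L[km] / E[GeV] -> dm2 * L / (4E), as in (12.17)


def pmns(theta12, theta23, theta13, delta):
    """Mixing matrix in the parametrization of (12.29); all angles in radians."""
    s12, c12 = np.sin(theta12), np.cos(theta12)
    s23, c23 = np.sin(theta23), np.cos(theta23)
    s13, c13 = np.sin(theta13), np.cos(theta13)
    ep, em = np.exp(1j * delta), np.exp(-1j * delta)
    return np.array([
        [c12 * c13, s12 * c13, s13 * em],
        [-s12 * c23 - c12 * s23 * s13 * ep, c12 * c23 - s12 * s23 * s13 * ep, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ep, -c12 * s23 - s12 * c23 * s13 * ep, c23 * c13],
    ])


def prob_amplitude(U, m2, a, b, length_km, energy_gev):
    """P(nu_a -> nu_b) from the amplitude sum in (12.25); m2 holds m_i^2 in eV^2."""
    phases = np.exp(-2j * PHASE_FACTOR * (m2 - m2[0]) * length_km / energy_gev)
    return abs(np.sum(U[b] * phases * np.conj(U[a]))) ** 2


def prob_expanded(U, m2, a, b, length_km, energy_gev):
    """The same probability from the expanded form (12.26)."""
    p = 1.0 if a == b else 0.0
    for j in range(3):
        for i in range(j + 1, 3):
            w = np.conj(U[a, i]) * U[b, i] * U[a, j] * np.conj(U[b, j])
            half = PHASE_FACTOR * (m2[i] - m2[j]) * length_km / energy_gev
            p += -4.0 * w.real * np.sin(half) ** 2 + 2.0 * w.imag * np.sin(2.0 * half)
    return p


# Illustrative inputs (assumptions of this sketch); flavor indices: 0 = e, 1 = mu, 2 = tau
U = pmns(np.radians(33.4), np.radians(49.0), np.radians(8.6), 1.2)
m2 = np.array([0.0, 7.5e-5, 7.5e-5 + 2.5e-3])  # normal hierarchy

p_nu = prob_amplitude(U, m2, 1, 0, 295.0, 0.6)              # nu_mu -> nu_e
p_nubar = prob_amplitude(np.conj(U), m2, 1, 0, 295.0, 0.6)  # antineutrinos: U -> U*
print(p_nu, p_nubar, prob_expanded(U, m2, 1, 0, 295.0, 0.6))
```

Replacing U by its complex conjugate gives the antineutrino probability, so the difference between the first two printed numbers is the CP-violating effect, which vanishes when δ is 0 or π.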
We are now in a position to make numerous observations of neutrino oscillations from various known sources to extract the three mass-squared differences ($\Delta m^2_{ij}$), the three mixing angles (θij), and the CP phase angle (δ). We know what sources neutrinos come from, as discussed in Sect. 12.1. However, before we continue, we must note some complications with respect to neutrinos propagating through matter. We will then discuss the various ways that neutrinos are detected, which gives us the opportunity to highlight a few of the important neutrino experiments of the past, present and future. After all of that we will be in a better position to summarize in Sect. 12.6 what is known about the neutrino sector (i.e., the best fits to the parameters), and the future goals of neutrino physics.

12.3 Neutrino Propagation Through Matter

When neutrinos propagate through matter they have some probability of interacting coherently with the medium. The effect of these coherent interactions is to introduce additional phase shifts in the neutrino wave functions. For neutrino flavor β the phase shift introduced is $e^{-iV_\beta t}$, where $V_\beta$ is the effective potential that νβ experiences due to its coherent scattering within the medium. These matter effects are sometimes called MSW effects after Mikheyev, Smirnov and Wolfenstein, who first introduced them and recognized their importance.
The electron neutrino νe experiences an effective potential of

\[ V_e = \sqrt{2}\,G_F\,n_e(x) \tag{12.30} \]

due to charged-current interactions mediated by W boson exchange, where $n_e(x)$ is the electron number density within the medium. The precise interaction is between the νe propagating through the medium and electrons that are present in the medium. Thus, the relevance of this effective potential to inducing a sizable phase shift in the νe wave function depends on a sufficiently high number density of electrons through which the neutrino passes.
In addition to the phase shift due to charged-current interactions, all the neutrino flavors have identical effective potential contributions from neutral-current interactions of the neutrinos with other particles in the medium (electrons, protons and neutrons). These neutral-current interactions are mediated by the Z boson. However, they contribute universal phase shifts for all neutrino flavors, and a mere shift in the overall phase of the propagating neutrino state has no effect on the observables. On the other hand, if nature has sterile neutrinos that do not interact via the Z boson, yet mix with the three neutrinos of the Standard Model, the resulting species-dependent phase shift from neutral-current interactions would not drop out and would have to be included in the analysis. We do not consider that beyond-the-Standard-Model possibility further here.
Before inserting this phase shift into the neutrino propagation analysis, let us reconsider the time-dependent neutrino state |ψα(t)⟩, which is defined to be the pure flavor eigenstate |να⟩ at t = 0. The time evolution of that state’s flavor composition, using (12.25), is

\[ |\psi_\alpha(t)\rangle = \sum_\beta C_{\alpha\beta}(t)\,|\nu_\beta\rangle \tag{12.31} \]

where

\[ C_{\alpha\beta}(t) = \sum_k U^*_{\alpha k}U_{\beta k}\,e^{-i\phi_k(t)} \quad\text{and}\quad \phi_k(t) = \frac{m_k^2\, t}{2E}, \tag{12.32} \]

with $m_k$ being the mass of the kth mass eigenstate and E the neutrino energy.
When passing through matter the propagating flavor νβ experiences the additional phase shift of $e^{-iV_\beta t}$ as described above. This phase shift then alters (12.32) to be

\[ C^{M}_{\alpha\beta}(t) = \sum_k U^*_{\alpha k}U_{\beta k}\,e^{-iV_\beta t}\,e^{-i\phi_k(t)}. \tag{12.33} \]

One then computes the probability of measuring flavor νβ at time t using the standard methods

\[ P(\nu_\alpha\to\nu_\beta)(t) = |\langle\nu_\beta|\psi_\alpha(t)\rangle|^2 = |C^{M}_{\alpha\beta}(t)|^2. \tag{12.34} \]

There are many analytic recastings of the above equation, which are of limited value since one must always resort to a final numerical computation. However, there is some utility in analytically computing the two-state neutrino oscillation approximation in the presence of matter, the results of which we will now describe.
If we assume that νe oscillates only into νμ, which approximates solar neutrino oscillations well, we can compute how the states that were born as νe inside the sun propagate through that dense matter. One finds a result very similar to (12.17) except that θ → θM and $\Delta m^2 \to \Delta m^2_M$, where⁶

\[ \Delta m^2_M = \sqrt{\left[\Delta m^2\cos 2\theta - 2\sqrt{2}\,G_F E\, n_e\right]^2 + \left[\Delta m^2 \sin 2\theta\right]^2}, \quad\text{and} \tag{12.35} \]

\[ \tan 2\theta_M = \frac{\Delta m^2\sin 2\theta}{\Delta m^2\cos 2\theta - 2\sqrt{2}\,G_F E\, n_e}. \tag{12.36} \]

These analytic expressions allow us to see the condition under which a resonance of neutrino oscillation occurs in the medium, namely

\[ 2\sqrt{2}\,G_F E\, n_e = \Delta m^2 \cos 2\theta \quad \text{(resonance condition)}. \tag{12.37} \]

There are four variables at play here, E, $n_e$, $\Delta m^2$ and θ, which conspire in some cases to give a large matter effect. One example where it is important to take these effects into account is the case of solar neutrinos of E ∼ 1–10 MeV propagating in the dense medium of the sun before exiting the sun on their way to detectors on earth.
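To get a feeling for the sizes involved in (12.35) to (12.37), the matter term can be evaluated numerically. The sketch below converts an electron density given in cm⁻³ to natural units with ħc; the value of θ, the neutrino energy and the density in the example are illustrative assumptions rather than numbers from the text.

```python
import math

G_F_EV = 1.1663788e-23          # Fermi constant in eV^-2
HBAR_C_EV_CM = 1.973269804e-5   # hbar*c in eV*cm
AVOGADRO = 6.02214076e23


def matter_term(energy_ev, n_e_per_cm3):
    """A = 2*sqrt(2)*G_F*E*n_e in eV^2, for n_e given in electrons per cm^3."""
    n_e_ev3 = n_e_per_cm3 * HBAR_C_EV_CM**3
    return 2.0 * math.sqrt(2.0) * G_F_EV * energy_ev * n_e_ev3


def matter_parameters(dm2, theta, energy_ev, n_e_per_cm3):
    """Return (dm2_M, theta_M) from (12.35) and (12.36)."""
    x = dm2 * math.cos(2.0 * theta) - matter_term(energy_ev, n_e_per_cm3)
    y = dm2 * math.sin(2.0 * theta)
    # atan2 keeps 2*theta_M in [0, pi], so the resonance sits at theta_M = pi/4
    return math.hypot(x, y), 0.5 * math.atan2(y, x)


def resonance_density(dm2, theta, energy_ev):
    """Electron density (per cm^3) that satisfies the resonance condition (12.37)."""
    return dm2 * math.cos(2.0 * theta) / matter_term(energy_ev, 1.0)


# Illustrative inputs (assumptions): theta = 33.4 degrees, E = 5 MeV, n_e = 50 N_A / cm^3
theta = math.radians(33.4)
dm2_m, theta_m = matter_parameters(7.5e-5, theta, 5.0e6, 50.0 * AVOGADRO)
print(dm2_m, math.degrees(theta_m))
print(resonance_density(7.5e-5, theta, 5.0e6) / AVOGADRO, "N_A per cm^3")
```

At zero density the function returns the vacuum values, and at the resonance density the mixing becomes maximal (θM = π/4) while the effective splitting shrinks to Δm² sin 2θ.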

6 See, for example, “Neutrino Masses, Mixing, and Oscillations” in RPP.


In a more careful treatment the spatial variation of the electron number density $n_e(x)$ must be taken into account as the neutrinos propagate in the medium. In that case one generally wishes to numerically integrate, step by step, a differential equation, which we can express as

\[ i\,\frac{d|\psi_\alpha(t)\rangle}{dt} = H_{\alpha\beta}\,|\nu_\beta\rangle, \quad\text{where} \tag{12.38} \]

\[ H_{\alpha\beta} = i\,\frac{d\tilde{C}_{\alpha\beta}}{dt} = \begin{pmatrix} V_e & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} + \sum_k U^*_{\alpha k}U_{\beta k}\,\frac{m_k^2}{2E}. \tag{12.39} \]

In the case of solar neutrinos, when νe neutrinos are created in the core of the sun
and then propagate outward, one can replace t → r (neutrino velocity approximately
c = 1), compute n e (r ) within the core of the sun, and compute the neutrino oscillation
wave function as it propagates through the sun. The calculation then follows the
development of the neutrino wave function as it propagates in space using (12.38)
and (12.39). One finds that although the neutrinos do not reach or cross the resonance
condition of (12.37), the effect is large enough to be discerned in comparing the
modeling of neutrino production in the sun’s core with neutrino flavor measurements
on earth.
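A step-by-step integration of this kind can be sketched in a reduced setting. The code below is a two-flavor simplification of (12.38) and (12.39), not the full three-flavor solar calculation: it evolves the flavor amplitudes with the traceless vacuum Hamiltonian plus V_e on the νe entry, using a matrix exponential on each small step. The potential profile, energy and baseline are illustrative assumptions.

```python
import numpy as np

KM_IN_INVERSE_EV = 1.0e5 / 1.973269804e-5  # 1 km expressed in eV^-1 (natural units)


def hamiltonian(dm2, theta, energy, v_e):
    """Two-flavor flavor-basis Hamiltonian in eV: traceless vacuum part plus V_e on nu_e."""
    k = dm2 / (4.0 * energy)
    vacuum = k * np.array([[-np.cos(2.0 * theta), np.sin(2.0 * theta)],
                           [np.sin(2.0 * theta), np.cos(2.0 * theta)]])
    return vacuum + np.diag([v_e, 0.0])


def step(psi, h, dx):
    """Advance psi by exp(-i H dx) using the eigen-decomposition of the Hermitian H."""
    eigenvalues, vectors = np.linalg.eigh(h)
    return vectors @ (np.exp(-1j * eigenvalues * dx) * (vectors.conj().T @ psi))


def evolve(dm2, theta, energy, v_profile, length, n_steps=2000):
    """Propagate a pure nu_e over `length` (eV^-1); v_profile(x) returns V_e in eV."""
    psi = np.array([1.0, 0.0], dtype=complex)
    dx = length / n_steps
    for n in range(n_steps):
        x_mid = (n + 0.5) * dx  # evaluate the potential at the middle of each step
        psi = step(psi, hamiltonian(dm2, theta, energy, v_profile(x_mid)), dx)
    return psi


# Illustrative inputs (assumptions): 5 MeV neutrino, 1000 km, potential falling linearly to zero
length = 1000.0 * KM_IN_INVERSE_EV
psi_out = evolve(7.5e-5, 0.583, 5.0e6, lambda x: 5.0e-12 * (1.0 - x / length), length)
print("P(nu_e -> nu_mu) =", abs(psi_out[1]) ** 2)
```

For a constant potential the result reduces to (12.17) with θ → θM and Δm² → Δm²_M, which provides a convenient check of the integrator before a realistic density profile is used.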

12.4 Detecting Neutrinos

The ideal neutrino detector would read out an incoming neutrino’s existence, direction, energy, and flavor. Unfortunately, no ideal neutrino detector exists. However, there is a suite of different detection techniques that tell us at least some of this information. Piecing together the information from many different detectors has enabled us to obtain a rather comprehensive understanding of neutrino masses and mixings. Let us review some of those techniques.
Neutrino detection is made possible only by the effects neutrinos have on other particles. Thus, the neutrinos must first interact with normal matter and then the effects of this interaction must be registered in some way. In the following paragraphs we describe some of these primary detection techniques applied to final states from neutrino-induced scattering.
Cherenkov light from ν + N → ℓ + N′ where N and N′ are nucleons. A highly energetic incoming neutrino can be converted to a lepton by charged-current processes:

\[ \bar\nu_\ell + p \to \ell^+ + n \quad \text{(inverse beta decay)}, \]

\[ \nu_\ell + n \to \ell^- + p. \]

At MeV-scale energies, relevant for solar neutrinos, supernova neutrinos and low-energy atmospheric neutrinos, the first interaction, inverse beta decay, is significantly more important for water Cherenkov detectors than the second. The reason is that the hydrogen nucleus contains only a proton, and charged-current neutrino interactions on neutrons within oxygen are very suppressed. Now, if the final state lepton has velocity greater than the velocity of light in the medium it will emit Cherenkov radiation as it traverses the detection volume. This technique is employed in experiments with very large volumes of water or ice, which have plenty of protons and neutrons with which neutrinos can interact. Photomultiplier tubes (or other types of photon detectors) are utilized to record this signal of neutrino interaction. They typically can infer some directional and energy information. Muons are particularly interesting final states since one can measure their macroscopic decay lengths, which are affected by the muon’s in-flight energy, by the length of their Cherenkov tails. This in turn enables one to determine the parent neutrino’s energy up to standard kinematic inference uncertainties. Experiments that have utilized this technique include IMB, Super-Kamiokande, and IceCube.
For high-energy scattering there can be difficulties resolving the charge of the final state lepton. For example, the μ− in νμ n → μ− p emits the same Cherenkov radiation as the μ+ from ν̄μ p → μ+ n. To resolve the difference one can try to detect the resulting p through its own Cherenkov radiation. However, in many detectors the proton’s radiation is too low to be discernible. The neutron in the μ+ n final state, on the other hand, can be detected provided the detector is doped with Gd. Gd has a neutron capture rate more than 160,000 times that of a free proton. After Gd captures a neutron it emits a cascade of higher energy γ-rays that provide a discernible signal to photon detectors. The flash of light from the Gd+n capture happens some microseconds after the μ+ Cherenkov radiation, thereby tagging the event as containing a positively charged μ+.
The main material for a Cherenkov radiation detector does not have to be water or ice. For example, the MiniBooNE collaboration used mineral oil. The two primary advantages are a higher index of refraction and the presence of scintillation light. A higher index gives more Cherenkov radiation, and thus a stronger signal with lower thresholds. The presence of scintillation light provides additional handles for neutrino identification. The chief disadvantage is that much more complicated modeling of light generation and transmission within the mineral oil medium is necessary.
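The Cherenkov condition, velocity above c/n, translates into a threshold kinetic energy T = m(γ − 1) with γ = 1/√(1 − 1/n²). A small sketch evaluates it for a few particles; the refractive indices (about 1.33 for water and about 1.47 for mineral oil) are approximate values assumed here, not numbers given in the text.

```python
import math


def cherenkov_threshold_kinetic(mass_mev, refractive_index):
    """Kinetic energy (MeV) above which a particle of given mass moves faster than c/n."""
    gamma_threshold = 1.0 / math.sqrt(1.0 - 1.0 / refractive_index**2)
    return mass_mev * (gamma_threshold - 1.0)


MASSES_MEV = {"electron": 0.511, "muon": 105.66, "proton": 938.27}
MEDIA = {"water": 1.33, "mineral oil": 1.47}  # approximate indices (assumptions)

for medium, index in MEDIA.items():
    for particle, mass in MASSES_MEV.items():
        threshold = cherenkov_threshold_kinetic(mass, index)
        print(f"{medium:12s} {particle:9s} T_threshold = {threshold:8.2f} MeV")
```

The output illustrates two statements made above: a higher index lowers every threshold, and protons need several hundred MeV of kinetic energy to radiate, which is why the recoil proton is often invisible.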
Cherenkov light from elastic νe− → νe−. Any neutrino or antineutrino can scatter off atomic electrons elastically through νX + e− → νX + e−. The final state electron can be kicked to highly relativistic velocities and then emits Cherenkov radiation as it traverses the detection volume. Although any neutrino species can take part in this interaction, the most efficient scattering is νe e− → νe e−, which imparts a very forward kick to the final state electron, in line with the original incoming neutrino. This process therefore has a high correlation in direction with the source (e.g., the Sun).
Deuteron dissociation signals. The central feature of the SNO detector, which was key in resolving the solar neutrino problem, was its large tank of heavy water (D2O). The presence of deuterons enables several detection techniques simultaneously. An electron neutrino can dissociate a deuteron through the charged-current interaction

\[ \nu_e + d \to e^- + p + p. \tag{12.40} \]

The νe threshold for this interaction to occur is 1.44 MeV. The electron in the final state can emit Cherenkov radiation as described above. In addition, the deuteron can dissociate through the neutral-current interaction

νX + d → νX + n + p (12.41)

where νX is any species of neutrino or antineutrino. The νX threshold for this interaction to occur is the deuteron binding energy of 2.2 MeV. In this case there is no Cherenkov radiation from the initial products of the νX scattering. However, the free neutron produced may be captured by a deuteron to produce the ³H isotope plus a 6.25 MeV gamma ray (γ). This is then followed by Compton scattering γe− → γe− that kicks an electron to sufficiently high velocity to produce Cherenkov radiation, which can be detected. A later phase of the SNO experiment added NaCl salt to the heavy water, which enticed the free neutrons to be captured by chlorine, with subsequent production of gamma rays with more energy available (8.6 MeV), thereby increasing the efficiency of neutrino detection.
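The quoted thresholds follow from energy-momentum conservation for a (nearly) massless neutrino hitting a target at rest: E_th = [(Σ m_final)² − m_target²] / (2 m_target). A short sketch with rounded particle masses in MeV (the mass values are standard but are not listed in the text):

```python
# Rounded particle masses in MeV (standard values, not given in the text)
M_PROTON = 938.27209
M_NEUTRON = 939.56542
M_ELECTRON = 0.51099895
M_DEUTERON = 1875.61294


def neutrino_threshold(target_mass, massive_final_state):
    """Threshold energy (MeV) of a massless neutrino on a target at rest."""
    total = sum(massive_final_state)
    return (total**2 - target_mass**2) / (2.0 * target_mass)


cc_deuteron = neutrino_threshold(M_DEUTERON, [M_ELECTRON, M_PROTON, M_PROTON])  # (12.40)
nc_deuteron = neutrino_threshold(M_DEUTERON, [M_NEUTRON, M_PROTON])             # (12.41)
inverse_beta = neutrino_threshold(M_PROTON, [M_ELECTRON, M_NEUTRON])            # anti-nu_e + p

print(f"charged-current nu_e + d:  {cc_deuteron:.2f} MeV")
print(f"neutral-current nu_X + d:  {nc_deuteron:.2f} MeV")
print(f"inverse beta decay on p:   {inverse_beta:.2f} MeV")
```

For the neutral-current case the outgoing neutrino is massless, so only the nucleons enter the final-state sum, and the threshold is essentially the deuteron binding energy.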
Calorimetry tracks from νN → ℓ + X, ν + X. For neutrinos with energy above the GeV range, interactions are inelastic hard scatterings that make showers of particles. A tracking calorimeter can measure the hadronic jet that results. If the ν interaction is a charged-current interaction producing a lepton ℓ there will be an additional visible charged-lepton track in the calorimeter. The MINOS and NOνA detectors utilize this technique, although their precise methodologies for producing and detecting the tracks are different.
Scintillation light and drift electrons from νe + Ar → e− + K and ν + Ar → ℓ + p + X. This is a key process for the ICARUS, MicroBooNE, and DUNE detectors. Argon has several features that increase the efficiency and quality of neutrino detection. First, liquid Argon is a very dense substance, making for an increased number of neutrino interactions per unit volume compared to water, for example. When an electron neutrino traverses liquid Argon it can convert to an electron through a charged-current interaction, which simultaneously converts a neutron within the Argon nucleus to a proton, making Potassium. For higher energy neutrinos, the interaction can best be thought of as ν + Ar → ℓ + p + X, where the lepton ℓ and the proton produce ionizing tracks as they traverse the detector volume. The radiation creates a scintillation signal of light within the detector. In addition to this signal, the detector is composed of modules with applied electric field gradients and charge readout planes. The readout planes are the anode planes of the electric field gradient, and they register a signal of the electrons arriving. Combining all the information from the immediate scintillation light and the somewhat later charge readout signals of arriving ionization electrons allows for 3D reconstruction of the event, which thereby enables good identification of incoming neutrinos and more accurate reconstruction of their energies. Such a detector is generally called a Liquid Argon Time Projection Chamber (LArTPC); its high sensitivity to neutrinos is nicely complementary to the high sensitivity to anti-neutrinos of water Cherenkov detectors.

12.5 Direct Limits on Neutrino Masses

As we have detailed above, manifestations of neutrino oscillations depend only on the mass-squared differences between neutrinos to a very good approximation, not on the overall mass scale of the neutrinos. Nevertheless, there are limits to the absolute mass scale from a variety of experimental results. Earlier we described how the arrival time of neutrinos from supernova 1987A puts an absolute limit on the neutrino mass of about $m_i < 10$ eV. In this section we discuss how kinematic distributions of β-decays put direct limits on neutrino masses.
Regarding β-decay limits, let us discuss as an illustration the Karlsruhe Tritium Neutrino experiment (KATRIN). This experiment attempts to measure very carefully the kinematic distributions of tritium ($^{3}_{1}{\rm H}$) decay:

\[ {}^{3}_{1}{\rm H} \to {}^{3}_{2}{\rm He}^{+} + e^- + \bar\nu_e. \tag{12.42} \]

The released energy of this decay is very small due to the small mass difference between tritium and the Helium isotope. The decay is chosen by design for this small energy release, which makes it maximally sensitive to a possible mass of the ν̄e particle. For massless neutrinos, the electron kinetic energy can reach as high as $E_{e,\max}(m_\nu = 0) = 18.6$ keV. However, if the neutrino does have mass the maximum energy is lowered, $E_{e,\max}(m_\nu \neq 0) < E_{e,\max}(m_\nu = 0)$, and the energy spectrum of the electron is distorted as it reaches its end-point.
We can find $E_{e,\max}(m_\nu)$ from inspection of the differential decay width:

$\frac{d\Gamma}{dE} = C(E)\,(E_0 - E)\sqrt{(E_0 - E)^2 - m_{\bar\nu_e}^2}$ (12.43)

where $E_0$ is the released energy of the decay, $E$ is the electron’s kinetic energy, and
$C(E)$ is an energy-dependent factor that does not depend on the neutrino mass.⁷
The square root in (12.43) is real only for $E_0 - E \geq m_{\bar\nu_e}$, so $E_{e,\max} = E_0 - m_{\bar\nu_e}$.
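The end-point behavior of (12.43) can be illustrated numerically. The following is a minimal sketch, assuming $C(E) = 1$ (only the neutrino-mass-dependent shape is kept) and a purely hypothetical neutrino mass of 1 eV chosen to make the distortion visible; the function names are illustrative and not from the text.

```python
import math

E0_EV = 18.6e3  # released energy quoted in the text (18.6 keV), in eV


def endpoint_shape(E_eV, m_nu_eV, E0_eV=E0_EV):
    """Shape factor (E0 - E) * sqrt((E0 - E)^2 - m^2) of (12.43), with C(E) set to 1."""
    eps = E0_eV - E_eV
    if eps <= m_nu_eV:
        return 0.0  # beyond the kinematic end-point E0 - m
    return eps * math.sqrt(eps ** 2 - m_nu_eV ** 2)


def max_electron_energy(m_nu_eV, E0_eV=E0_EV):
    """Largest electron kinetic energy for which the shape factor is non-zero."""
    return E0_eV - m_nu_eV


M_NU_HYPOTHETICAL_EV = 1.0  # illustrative value only, not a measurement

for eps in (1.5, 2.0, 5.0, 10.0, 50.0):
    E = E0_EV - eps
    ratio = endpoint_shape(E, M_NU_HYPOTHETICAL_EV) / endpoint_shape(E, 0.0)
    print(f"E0 - E = {eps:5.1f} eV   massive/massless = {ratio:.4f}")
```

The ratio equals $\sqrt{1 - m^2/(E_0 - E)^2}$, so the distortion is confined to the last few eV below the end-point, where very few events are recorded.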
Thus, a very careful measurement of the electron energy spectrum near its end-point may
show a signal for non-zero neutrino mass. The current 90% CL limit from the KATRIN
experiment (Nature Phys. 18, 160 (2022)) is $m_{\bar\nu_e} < 0.8$ eV. The projected sensitivity of
KATRIN is such that it should find evidence for neutrino masses unless $m_{\bar\nu_e} \lesssim 0.3$ eV. The
limit from earlier tritium decay experiments is $m_{\bar\nu_e} < 2$ eV.
The limit above is expressed as a limit on $m_{\bar\nu_e}$, the mass of a flavor eigenstate, which is not
a mass eigenstate. To be rigorous we need to decide what the limits really are on the
mass eigenstates from the tritium decay electron end-point spectrum. In terms of mass
eigenstates the differential decay width is

$\frac{d\Gamma}{dE} = \sum_i |U_{ei}|^2\, C(E)\,(E_0 - E)\sqrt{(E_0 - E)^2 - m_{\bar\nu_i}^2}\;\theta(E_0 - E - m_{\bar\nu_i})$, (12.44)

7 See, e.g., Bilenky et al., Phys. Rep. 379, 69 (2003).

where the $\theta(x)$ step function enforces the requirement that the decay is kinematically
allowed. Supposing that all neutrino masses are below the release energy of 18.6 keV
one can dispense with this function.
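There is a convenient simplification of (12.44) that is applicable when $m_{\bar\nu_i} \ll (E_0 - E)$:

$\frac{d\Gamma}{dE} = \sum_i |U_{ei}|^2\, C(E)\,(E_0 - E)^2 \left[1 - \frac{1}{2}\frac{m_{\bar\nu_i}^2}{(E_0 - E)^2} + \cdots\right]$
$\quad = C(E)\,(E_0 - E)^2 \left[1 - \frac{1}{2}\frac{\sum_i |U_{ei}|^2 m_{\bar\nu_i}^2}{(E_0 - E)^2} + \cdots\right]$
$\quad \simeq C(E)\,(E_0 - E)\sqrt{(E_0 - E)^2 - \hat m_{\bar\nu_e}^2}$ (12.45)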

where

$\hat m_{\bar\nu_e}^2 \equiv \sum_i |U_{ei}|^2\, m_{\bar\nu_i}^2$. (12.46)

The expansion is useful even though the maximum distortion of the spectrum is when
$E_0 - E \to m_{\bar\nu_i}$. This is due to the extremely low probability of measuring events with
$E$ so close to the $E_0$ limit. Measured events occur at energies where the approximation
holds, and it is with these events that sensitivity to neutrino mass is obtained.
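The quality of the effective-mass approximation can be checked numerically. The sketch below assumes the standard PMNS parametrization for the electron row ($|U_{e1}|^2 = \cos^2\theta_{12}\cos^2\theta_{13}$, $|U_{e2}|^2 = \sin^2\theta_{12}\cos^2\theta_{13}$, $|U_{e3}|^2 = \sin^2\theta_{13}$), which this section does not spell out, takes the normal-hierarchy global-fit values quoted in Sect. 12.6, sets $C(E) = 1$, and treats the lightest mass `m1` as a free, hypothetical input.

```python
import math

# Global-fit values quoted in Sect. 12.6 (normal hierarchy).
SIN2_TH12 = 0.30
SIN2_TH13 = 0.021
DM2_21 = 7.5e-5  # eV^2
DM2_31 = 2.5e-3  # eV^2


def ue_squared():
    """|U_ei|^2 for i = 1, 2, 3 (standard parametrization, assumed here)."""
    cos2_th13 = 1.0 - SIN2_TH13
    return [(1.0 - SIN2_TH12) * cos2_th13, SIN2_TH12 * cos2_th13, SIN2_TH13]


def masses_normal(m1):
    """Mass eigenvalues in eV for a given (hypothetical) lightest mass m1."""
    return [m1, math.sqrt(m1 ** 2 + DM2_21), math.sqrt(m1 ** 2 + DM2_31)]


def effective_mass(m1):
    """Effective mass of (12.46): sqrt(sum_i |U_ei|^2 m_i^2)."""
    return math.sqrt(sum(w * m * m for w, m in zip(ue_squared(), masses_normal(m1))))


def exact_shape(eps, m1):
    """Sum over mass eigenstates as in (12.44), with C(E) = 1 and eps = E0 - E."""
    total = 0.0
    for w, m in zip(ue_squared(), masses_normal(m1)):
        if eps > m:  # theta function
            total += w * eps * math.sqrt(eps * eps - m * m)
    return total


def effective_shape(eps, m1):
    """Single-mass form of (12.45), with C(E) = 1."""
    mhat = effective_mass(m1)
    return eps * math.sqrt(eps * eps - mhat * mhat) if eps > mhat else 0.0


for m1 in (0.0, 0.1, 0.5):  # eV, illustrative choices
    rel = exact_shape(2.0, m1) / effective_shape(2.0, m1) - 1.0
    print(f"m1 = {m1:.1f} eV  m_hat = {effective_mass(m1):.4f} eV  "
          f"relative difference at E0 - E = 2 eV: {rel:.2e}")
```

For these inputs the two forms agree to far better than any realistic experimental resolution once $E_0 - E$ is a few times larger than the masses, which is the regime the text describes.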
Notice that (12.45) is of the same form as (12.43). Thus, the limits from end-point
analysis of electron energies in tritium decays apply to $\hat m_{\bar\nu_e}$. Similar
results can be obtained for $\hat m_{\nu_\mu}$ and $\hat m_{\nu_\tau}$ from careful measurements of decay
kinematics in pion decays and τ-lepton decays, respectively. These results and the
one on $\nu_e$ given above are

$\hat m_{\nu_e} < 2$ eV, (12.47)
$\hat m_{\nu_\mu} < 0.17$ MeV, (12.48)
$\hat m_{\nu_\tau} < 18.2$ MeV. (12.49)

The direct upper limits on $\hat m_{\nu_\mu}$ and $\hat m_{\nu_\tau}$ are not independently very constraining:
far stronger bounds on $\hat m_{\nu_\mu}$ and $\hat m_{\nu_\tau}$ follow once the limit on $\hat m_{\nu_e}$ is combined
with the $\Delta m^2_{ij}$ values determined from oscillation experiments
and observations. Future experiments hope to improve on these results, or
indeed find the absolute mass scale of neutrinos.

12.6 Neutrino Properties and Future Goals

As we have discussed, in addition to the natural sources of neutrinos from the sun,
from cosmic ray collisions with the atmosphere, and from supernova, there are numer-
ous human-made sources at a variety of nuclear reactors and proton-on-target facili-
ties. Likewise, there are numerous experiments that have been constructed to detect
these neutrinos. The experiments each have different capacities and sensitivities to
the various final states, which include neutrinos and anti-neutrinos of all flavors.
There is no simple way to summarize all of these source-detector experimental
permutations, except to do a global fit of all data to determine the
mass splittings of the mass eigenstates and the entries of the PMNS matrix.
At this writing the data can be interpreted to be consistent with the following two
distinct possibilities (see Fig. 12.1). The first is the normal hierarchy (NH):

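Normal Hierarchy (NH): $m_3 \gg m_2 > m_1$

$\Delta m^2_{31} \simeq 2.5 \times 10^{-3}$ eV$^2$ and $\Delta m^2_{21} \simeq 7.5 \times 10^{-5}$ eV$^2$
$\sin^2\theta_{12} \simeq 0.30$, $\sin^2\theta_{23} \simeq 0.48$, $\sin^2\theta_{13} \simeq 0.021$.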

The second solution is the inverted hierarchy (IH):

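Inverted Hierarchy (IH): $m_2 > m_1 \gg m_3$

$\Delta m^2_{23} \simeq 2.5 \times 10^{-3}$ eV$^2$ and $\Delta m^2_{21} \simeq 7.5 \times 10^{-5}$ eV$^2$
$\sin^2\theta_{12} \simeq 0.30$, $\sin^2\theta_{23} \simeq 0.60$, $\sin^2\theta_{13} \simeq 0.022$.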

In both cases the CP violation phase δ is not constrained well, and there is not yet a
definitive determination that it is nonzero.
Two of the most important goals of the future neutrino program are to determine
the level of CP violation, if any, in the neutrino interactions, and to determine whether
the NH or IH is the correct relative ordering of neutrino mass eigenstates.
One method to determine if there is CP violation among neutrinos is to take
careful measurements of the electron neutrino appearance rate from, say, $\nu_\mu \to \nu_e$
oscillations, and compare that with the rate of electron antineutrino appearance
from $\bar\nu_\mu \to \bar\nu_e$. This is ideally done for neutrino sources that can switch back and
forth between $\pi^-$ and $\pi^+$ beams. As discussed above, this is possible by selecting
the charge of the pions directed toward the detector, which in turn selects for
$\pi^+ \to \nu_e \bar\nu_\mu \nu_\mu + X$ or $\pi^- \to \bar\nu_e \bar\nu_\mu \nu_\mu + X$. The detector then should have some
sensitivity to both $\nu_e$ and $\bar\nu_e$. The T2K experiment can do this. It uses the
high-intensity J-PARC proton beam on a target to produce copious pions.
The pions then decay to (anti)-neutrinos which travel to the Super-Kamiokande
detector 295 km away. Some of those neutrinos oscillate to electron
neutrinos along the journey and are detected by Super-Kamiokande as such. Current
results⁸ suggest a somewhat higher number of $\nu_e$ than would be expected when
$\delta = 0$, and, consistently, a somewhat lower number of $\bar\nu_e$. The $\delta = 0$ point in parameter
space from this analysis is ruled out at the 95% CL. Nevertheless, more data
will be required to establish this result at a higher confidence level and converge on
a value for $\delta$.
Determining whether neutrino masses obey NH or IH is difficult, and
many ideas have been proposed over the years. Several ideas rely on the ability
to do precision measurements on observables constructed to be sensitive to the sign

8 K. Abe et al. (T2K Collaboration), Phys. Rev. Lett. 121, 171802 (2018).

of $\Delta m^2_{31}$. For example, the difference between neutrino oscillation probabilities and
anti-neutrino oscillation probabilities is sensitive to this sign (see (12.27)). However,
a non-zero value requires CP violation and there is ambiguity in the extraction of the
phase $\delta$ and $\mathrm{sgn}(\Delta m^2_{31})$.
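To go a step deeper, one can show⁹ that when the neutrinos pass through matter
the general expression for probability of transition from a muon (anti)-neutrino to
an electron (anti)-neutrino changes to

$P(\nu_\mu \to \nu_e) = \sin^2\theta_{23}\,\sin^2 2\theta_{13}\,\frac{\sin^2[(1-x)\Delta_{31}]}{(1-x)^2}$
$\quad + \left(\frac{\Delta m^2_{21}}{\Delta m^2_{31}}\right)^2 \cos^2\theta_{23}\,\sin^2 2\theta_{12}\,\frac{\sin^2(x\Delta_{31})}{x^2}$
$\quad + \frac{\Delta m^2_{21}}{\Delta m^2_{31}}\,\sin 2\theta_{13}\,\sin 2\theta_{12}\,\sin 2\theta_{23}\,\cos(\Delta_{31} + \delta)\,\frac{\sin[(1-x)\Delta_{31}]}{1-x}\,\frac{\sin(x\Delta_{31})}{x}$

where $x \equiv 2\sqrt{2}\,G_F n_e E/\Delta m^2_{31}$ and $\Delta_{31} \equiv \Delta m^2_{31} L/(4E)$. The same expression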
holds for $P(\bar\nu_\mu \to \bar\nu_e)$ except that $\delta \to -\delta$, and also $x \to -x$ due to the change
in sign of the potential for $\bar\nu_e$ passing through the medium compared to $\nu_e$, as discussed
earlier in Sect. 12.3. Comparing the neutrino and anti-neutrino oscillation
rates carefully at different distances from the source, one is ultimately able to determine
the sign of $\Delta_{31}$ and thus determine NH or IH. To maximize the discriminating
capability of the matter effects, it is helpful to have data from very long baselines, such
as that of the NOνA experiment, whose detector in Ash River, Minnesota is 810 km away
from the neutrino source at Fermilab.
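The sensitivity of the matter-modified probability to the sign of $\Delta m^2_{31}$ can be sketched numerically. The code below is an illustration under several assumptions that are not taken from the text: the practical-unit conversions $\Delta_{31} \simeq 1.267\,\Delta m^2_{31}[\mathrm{eV}^2]\,L[\mathrm{km}]/E[\mathrm{GeV}]$ and $2\sqrt{2}\,G_F n_e E \simeq 1.52 \times 10^{-4}\ \mathrm{eV}^2 \times Y_e\,\rho[\mathrm{g/cm^3}] \times E[\mathrm{GeV}]$, a crust value $Y_e\rho \approx 1.4\ \mathrm{g/cm^3}$, a beam energy of 2 GeV, and the simplification that the inverted case is obtained by flipping only the sign of $\Delta m^2_{31}$ while keeping the normal-hierarchy mixing angles.

```python
import math

# Values quoted in the text for the normal hierarchy.
DM2_21 = 7.5e-5      # eV^2
DM2_31_ABS = 2.5e-3  # eV^2
SIN2_TH12, SIN2_TH23, SIN2_TH13 = 0.30, 0.48, 0.021


def sin_2theta(sin2_theta):
    """sin(2 theta) from sin^2(theta), for theta in the first quadrant."""
    return 2.0 * math.sqrt(sin2_theta * (1.0 - sin2_theta))


def prob_mue(E_GeV, L_km, delta, sign_dm31=+1, antineutrino=False, ye_rho=1.4):
    """Approximate P(nu_mu -> nu_e) in constant-density matter.

    ye_rho is Y_e * rho in g/cm^3 (assumed value). Requires x != 0 and x != 1.
    """
    dm31 = sign_dm31 * DM2_31_ABS
    x = 1.52e-4 * ye_rho * E_GeV / dm31   # 2 sqrt(2) G_F n_e E / dm31
    if antineutrino:
        x, delta = -x, -delta             # substitutions stated in the text
    d31 = 1.267 * dm31 * L_km / E_GeV     # Delta_31 in radians
    alpha = DM2_21 / dm31
    f = math.sin((1.0 - x) * d31) / (1.0 - x)
    g = math.sin(x * d31) / x
    term1 = SIN2_TH23 * sin_2theta(SIN2_TH13) ** 2 * f * f
    term2 = alpha ** 2 * (1.0 - SIN2_TH23) * sin_2theta(SIN2_TH12) ** 2 * g * g
    term3 = (alpha * sin_2theta(SIN2_TH13) * sin_2theta(SIN2_TH12)
             * sin_2theta(SIN2_TH23) * math.cos(d31 + delta) * f * g)
    return term1 + term2 + term3


E_GEV, L_KM = 2.0, 810.0  # illustrative beam energy; NOvA baseline from the text
for label, sign in (("NH", +1), ("IH", -1)):
    p_nu = prob_mue(E_GEV, L_KM, 0.0, sign)
    p_nubar = prob_mue(E_GEV, L_KM, 0.0, sign, antineutrino=True)
    print(f"{label}: P(nu) = {p_nu:.4f}   P(nubar) = {p_nubar:.4f}")
```

With these inputs and $\delta = 0$, the neutrino appearance probability comes out larger than the anti-neutrino one for the normal sign and smaller for the inverted sign, which is the qualitative handle on the hierarchy described above.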
In time, nature’s chosen hierarchy for neutrino masses might be determined by a
combination of currently accruing data, at T2K and NOνA for example. Nevertheless,
future experiments, such as LBNF/DUNE, will contribute significantly to the
global effort that should culminate in a decisive determination of the hierarchy.

Problems

1. Calculate the required threshold energy $E_{\bar\nu} > E_{\bar\nu}^{\rm thresh}$ for $\bar\nu_\ell + p \to \ell^+ + n$ to
be kinematically allowed. Likewise, compute the threshold energy for $\nu_\ell + e^- \to
\nu_e + \ell^-$ to be kinematically allowed for $\ell = \mu, \tau$.
2. Supernova SN1987A is about 170,000 light years away. In its explosion visible
light and neutrinos were released over the course of a few seconds. Let us suppose
the spread in release time is $\Delta T = 2$ s, and that both photons and neutrinos
are released on average at the same time. Assuming that a burst of neutrinos is detected
within 12 s of the light signal, estimate the upper bound on the mass of the neutrino.
3. Imagine a beam of pions with energy $E_\pi$ traveling in the z direction which
subsequently decay via $\pi \to \mu + \nu_\mu$. Determine the energy and angular spread with
respect to the z axis of the $\nu_\mu$ decay products. For a given $E_{\nu_\mu}$ compute the angle
that $\nu_\mu$ makes with respect to the z axis.

9 R. N. Cahn et al. “White Paper: Measuring the Neutrino Mass Hierarchy.” arXiv:1307.5487.

4. Make a table of $E_\nu$ values down the vertical and $\Delta m^2$ values across the horizontal
and compute the oscillation distance $L_{\rm osc}$ for each combination. Choose the $\Delta m^2$
values to be $\Delta m^2_{\rm sol} = 7.5 \times 10^{-5}$ eV$^2$ and $\Delta m^2_{\rm atm} = 2.5 \times 10^{-3}$ eV$^2$ and choose
the $E_\nu$ values to be 1 MeV, 10 MeV, 100 MeV, 1 GeV, 10 GeV and 100 GeV.
Appendix A

A.1 Natural Units and Conversions

The speed of light and $\hbar$ are:

$c = 2.99792458 \times 10^{10}$ cm/s $= 2.99792458 \times 10^{8}$ m/s, (A.1.1)
$\hbar = 1.05457148 \times 10^{-34}$ J s $= 6.58211814 \times 10^{-25}$ GeV s. (A.1.2)

The value of $c$ is exact, by definition. (Since October 1983, the official definition of
1 m is the distance traveled by light in a vacuum in exactly 1/299792458 of a second.)
In units with $c = \hbar = 1$, some other conversion factors are:

1 GeV = $1.60217646 \times 10^{-3}$ erg = $1.60217646 \times 10^{-10}$ J, (A.1.3)
1 GeV = $1.78266173 \times 10^{-24}$ g = $1.78266173 \times 10^{-27}$ kg, (A.1.4)
1 GeV$^{-1}$ = $1.97326937 \times 10^{-14}$ cm = $1.97326937 \times 10^{-16}$ m. (A.1.5)

Conversions of particle decay widths to mean lifetimes and vice versa are obtained
using:

1 GeV$^{-1}$ = $6.58211814 \times 10^{-25}$ s, (A.1.6)
1 s = $1.51926778 \times 10^{24}$ GeV$^{-1}$. (A.1.7)

Conversions of cross-sections in GeV$^{-2}$ to barn units involve:

1 GeV$^{-2}$ = $3.89379201 \times 10^{-4}$ barns (A.1.8)
= $3.89379201 \times 10^{5}$ nb (A.1.9)
= $3.89379201 \times 10^{8}$ pb (A.1.10)
= $3.89379201 \times 10^{11}$ fb, (A.1.11)

S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7_A

and in reverse:

1 nb = $10^{-33}$ cm$^2$ = $2.56819059 \times 10^{-6}$ GeV$^{-2}$, (A.1.12)
1 pb = $10^{-36}$ cm$^2$ = $2.56819059 \times 10^{-9}$ GeV$^{-2}$, (A.1.13)
1 fb = $10^{-39}$ cm$^2$ = $2.56819059 \times 10^{-12}$ GeV$^{-2}$. (A.1.14)
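The conversion factors above all follow from $\hbar$ and $c$. As a minimal sketch, the snippet below rebuilds (A.1.5) and (A.1.8)-(A.1.11) from the values in (A.1.1) and (A.1.2); the helper names and the 2.5 GeV width in the usage line are illustrative assumptions, not quantities taken from the text.

```python
HBAR_GEV_S = 6.58211814e-25  # hbar in GeV s, from (A.1.2)
C_CM_PER_S = 2.99792458e10   # speed of light in cm/s, from (A.1.1)
BARN_CM2 = 1.0e-24           # 1 barn in cm^2

# hbar * c gives the length equivalent of 1 GeV^-1, as in (A.1.5).
HBARC_GEV_CM = HBAR_GEV_S * C_CM_PER_S


def width_to_lifetime_s(width_GeV):
    """Mean lifetime in seconds for a total decay width in GeV (tau = hbar / Gamma)."""
    return HBAR_GEV_S / width_GeV


def xsec_gev2_to_pb(sigma_GeV2):
    """Cross-section in pb for a value given in GeV^-2."""
    sigma_barn = sigma_GeV2 * HBARC_GEV_CM ** 2 / BARN_CM2
    return sigma_barn * 1.0e12  # 1 barn = 10^12 pb


def xsec_pb_to_gev2(sigma_pb):
    """Cross-section in GeV^-2 for a value given in pb."""
    return sigma_pb / xsec_gev2_to_pb(1.0)


print(f"1 GeV^-1 = {HBARC_GEV_CM:.8e} cm")
print(f"1 GeV^-2 = {xsec_gev2_to_pb(1.0):.8e} pb")
print(f"1 pb     = {xsec_pb_to_gev2(1.0):.8e} GeV^-2")
# Hypothetical width of 2.5 GeV, chosen only to show the order of magnitude.
print(f"lifetime for a 2.5 GeV width: {width_to_lifetime_s(2.5):.3e} s")
```

Keeping a single pair of input constants and deriving the rest avoids the small inconsistencies that creep in when each factor is typed by hand.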

A.2 Dirac Spinor Formulas

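In the Weyl (or chiral) representation:

$\gamma^\mu = \begin{pmatrix} 0 & \sigma^\mu \\ \bar\sigma^\mu & 0 \end{pmatrix}$ (A.2.1)

where

$\sigma^0 = \bar\sigma^0 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$; $\sigma^1 = -\bar\sigma^1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$;
$\sigma^2 = -\bar\sigma^2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}$; $\sigma^3 = -\bar\sigma^3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}$. (A.2.2)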

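$\gamma^{0\dagger} = \gamma^0$; $(\gamma^0)^2 = 1$ (A.2.3)
$\gamma^{j\dagger} = -\gamma^j$ $(j = 1, 2, 3)$ (A.2.4)
$\gamma^0 \gamma^{\mu\dagger} \gamma^0 = \gamma^\mu$ (A.2.5)
$\gamma_\mu \gamma_\nu + \gamma_\nu \gamma_\mu = \{\gamma_\mu, \gamma_\nu\} = 2 g_{\mu\nu}$ (A.2.6)
$[\gamma_\rho, [\gamma_\mu, \gamma_\nu]] = 4(g_{\rho\mu}\gamma_\nu - g_{\rho\nu}\gamma_\mu)$ (A.2.7)

The trace of an odd number of $\gamma^\mu$ matrices is 0.

$\mathrm{Tr}(1) = 4$ (A.2.8)
$\mathrm{Tr}(\gamma_\mu \gamma_\nu) = 4 g_{\mu\nu}$ (A.2.9)
$\mathrm{Tr}(\gamma_\mu \gamma_\nu \gamma_\rho \gamma_\sigma) = 4(g_{\mu\nu} g_{\rho\sigma} - g_{\mu\rho} g_{\nu\sigma} + g_{\mu\sigma} g_{\nu\rho})$ (A.2.10)
$\mathrm{Tr}(\gamma_{\mu_1} \gamma_{\mu_2} \cdots \gamma_{\mu_{2n}}) = g_{\mu_1\mu_2} \mathrm{Tr}(\gamma_{\mu_3} \gamma_{\mu_4} \cdots \gamma_{\mu_{2n}}) - g_{\mu_1\mu_3} \mathrm{Tr}(\gamma_{\mu_2} \gamma_{\mu_4} \cdots \gamma_{\mu_{2n}})$
$\quad \cdots + (-1)^k g_{\mu_1\mu_k} \mathrm{Tr}(\gamma_{\mu_2} \gamma_{\mu_3} \cdots \gamma_{\mu_{k-1}} \gamma_{\mu_{k+1}} \cdots \gamma_{\mu_{2n}}) + \cdots$
$\quad + g_{\mu_1\mu_{2n}} \mathrm{Tr}(\gamma_{\mu_2} \gamma_{\mu_3} \cdots \gamma_{\mu_{2n-1}})$ (A.2.11)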

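In the chiral (or Weyl) representation, in 2 × 2 block form:

$\gamma_5 = \begin{pmatrix} -1 & 0 \\ 0 & 1 \end{pmatrix}$ (A.2.12)

$P_L = \frac{1 - \gamma_5}{2} = \begin{pmatrix} 1 & 0 \\ 0 & 0 \end{pmatrix}$; $P_R = \frac{1 + \gamma_5}{2} = \begin{pmatrix} 0 & 0 \\ 0 & 1 \end{pmatrix}$ (A.2.13)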

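The matrix $\gamma_5$ satisfies:

$\gamma_5^\dagger = \gamma_5$; $\gamma_5^2 = 1$; $\{\gamma_5, \gamma^\mu\} = 0$ (A.2.14)

$\mathrm{Tr}(\gamma_5) = 0$ (A.2.15)
$\mathrm{Tr}(\gamma_\mu \gamma_5) = 0$ (A.2.16)
$\mathrm{Tr}(\gamma_\mu \gamma_\nu \gamma_5) = 0$ (A.2.17)
$\mathrm{Tr}(\gamma_\mu \gamma_\nu \gamma_\rho \gamma_5) = 0$ (A.2.18)
$\mathrm{Tr}(\gamma_\mu \gamma_\nu \gamma_\rho \gamma_\sigma \gamma_5) = 4i\epsilon_{\mu\nu\rho\sigma}$ (A.2.19)

$\gamma^\mu \gamma_\mu = 4$ (A.2.20)
$\gamma^\mu \gamma_\nu \gamma_\mu = -2\gamma_\nu$ (A.2.21)
$\gamma^\mu \gamma_\nu \gamma_\rho \gamma_\mu = 4 g_{\nu\rho}$ (A.2.22)
$\gamma^\mu \gamma_\nu \gamma_\rho \gamma_\sigma \gamma_\mu = -2\gamma_\sigma \gamma_\rho \gamma_\nu$ (A.2.23)

$(\slashed{p} - m)\,u(p, s) = 0$; $(\slashed{p} + m)\,v(p, s) = 0$ (A.2.24)
$\bar u(p, s)\,(\slashed{p} - m) = 0$; $\bar v(p, s)\,(\slashed{p} + m) = 0$ (A.2.25)

$\bar u(p, s)\,u(p, r) = 2m\,\delta_{sr}$ (A.2.26)
$\bar v(p, s)\,v(p, r) = -2m\,\delta_{sr}$ (A.2.27)
$\bar v(p, s)\,u(p, r) = \bar u(p, s)\,v(p, r) = 0$ (A.2.28)

$\sum_s u(p, s)\,\bar u(p, s) = \slashed{p} + m$ (A.2.29)
$\sum_s v(p, s)\,\bar v(p, s) = \slashed{p} - m$ (A.2.30)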
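Several of the identities in this section can be checked by explicit matrix multiplication. The sketch below builds the matrices of (A.2.1) and (A.2.2) as nested Python lists (to avoid any dependency) and tests (A.2.6), (A.2.9), (A.2.12), (A.2.15) and (A.2.21) with upper indices. It assumes the metric signature $(+,-,-,-)$, which is consistent with (A.2.3) and (A.2.4), and the standard definition $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$, which is not written out in this appendix; all helper names are illustrative.

```python
def matmul(a, b):
    """Product of two matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def scale(c, a):
    return [[c * x for x in row] for row in a]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def trace(a):
    return sum(a[i][i] for i in range(len(a)))


def block(tl, tr, bl, br):
    """Assemble a 4x4 matrix from four 2x2 blocks."""
    return [tl[0] + tr[0], tl[1] + tr[1], bl[0] + br[0], bl[1] + br[1]]


ZERO2 = [[0, 0], [0, 0]]
ONE2 = [[1, 0], [0, 1]]
SIGMA = [ONE2, [[0, 1], [1, 0]], [[0, -1j], [1j, 0]], [[1, 0], [0, -1]]]
SIGMA_BAR = [ONE2] + [scale(-1, s) for s in SIGMA[1:]]
GAMMA = [block(ZERO2, SIGMA[mu], SIGMA_BAR[mu], ZERO2) for mu in range(4)]
METRIC = [1, -1, -1, -1]  # diagonal of g, signature (+,-,-,-) assumed
IDENTITY4 = [[1 if i == j else 0 for j in range(4)] for i in range(4)]
# gamma_5 = i gamma^0 gamma^1 gamma^2 gamma^3 (standard definition, assumed)
GAMMA5 = scale(1j, matmul(matmul(GAMMA[0], GAMMA[1]), matmul(GAMMA[2], GAMMA[3])))


def anticommutator_ok():
    """{gamma^mu, gamma^nu} = 2 g^{mu nu}, the upper-index form of (A.2.6)."""
    for mu in range(4):
        for nu in range(4):
            lhs = add(matmul(GAMMA[mu], GAMMA[nu]), matmul(GAMMA[nu], GAMMA[mu]))
            g = METRIC[mu] if mu == nu else 0
            if lhs != scale(2 * g, IDENTITY4):
                return False
    return True


def traces_ok():
    """Tr(gamma^mu gamma^nu) = 4 g^{mu nu} and Tr(gamma_5) = 0."""
    for mu in range(4):
        for nu in range(4):
            expected = 4 * METRIC[mu] if mu == nu else 0
            if trace(matmul(GAMMA[mu], GAMMA[nu])) != expected:
                return False
    return trace(GAMMA5) == 0


def contraction_ok():
    """gamma^mu gamma^nu gamma_mu = -2 gamma^nu, as in (A.2.21)."""
    for nu in range(4):
        total = [[0] * 4 for _ in range(4)]
        for mu in range(4):
            term = matmul(matmul(GAMMA[mu], GAMMA[nu]), GAMMA[mu])
            total = add(total, scale(METRIC[mu], term))  # lowers the second mu index
        if total != scale(-2, GAMMA[nu]):
            return False
    return True


print("anticommutator:", anticommutator_ok())
print("traces:", traces_ok())
print("contraction:", contraction_ok())
print("gamma5 diagonal:", [GAMMA5[i][i] for i in range(4)])
```

Because every entry is 0, ±1 or ±i, the comparisons are exact and no numerical tolerance is needed.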

A.3 Further Reading

Throughout the text we have often referenced “RPP,” which is the Review of Particle
Properties publication of the Particle Data Group, listed here:
Workman, R.L. et al. (Particle Data Group), “Review of Particle Properties”, to be
published in Prog. Theor. Exp. Phys. 2022, 083C01 (2022). http://pdg.lbl.gov/.
Quantum Field Theory
M. Peskin, D.V. Schroeder, An Introduction to Quantum Field Theory (Perseus Books, 1995)
L.H. Ryder, Quantum Field Theory, 2nd edn. (Cambridge University Press, 1996)
M.D. Schwartz, Quantum Field Theory and the Standard Model (Cambridge University Press, 2014)
M. Srednicki, Quantum Field Theory (Cambridge University Press, 2007)
The Standard Model
C. Burgess, G. Moore, The Standard Model: A Primer (Cambridge University Press, 2007)
J.F. Donoghue, E. Golowich, B.R. Holstein, Dynamics of the Standard Model (Cambridge University Press, 1992)
H. Georgi, Weak Interactions and Modern Particle Theory (Dover Publications, 2009)
M. Thomson, Modern Particle Physics (Cambridge University Press, 2013)
Collider Physics
V.D. Barger, R.J.N. Phillips, Collider Physics (Addison-Wesley, 1987)
J. Campbell, J. Huston, F. Krauss, The Black Book of Quantum Chromodynamics: A Primer for the LHC Era (Oxford University Press, 2018)
M. Krämer, F.J.P. Soler (eds.), Large Hadron Collider Phenomenology (Institute of Physics, Bristol, 2004)
T. Plehn, Lectures on LHC Physics. arXiv:0910.4182 [hep-ph]
Group Theory
J.F. Cornwell, Group Theory in Physics, vols. 1 and 2 (Academic, 1984)
H. Georgi, Lie Algebras in Particle Physics (Westview Press, 1999)
B.C. Hall, Lie Groups, Lie Algebras, and Representations (Springer, 2003)
P. Ramond, Group Theory: A Physicist’s Survey (Cambridge, 2010)
B.G. Wybourne, Classical Groups for Physicists (Wiley, 1974)
Supersymmetry
S.P. Martin, A Supersymmetry Primer. arXiv:hep-ph/9709356
© Springer Nature Switzerland AG 2022
S. P. Martin and J. D. Wells, Elementary Particles and Their Interactions,
Graduate Texts in Physics, https://doi.org/10.1007/978-3-031-14368-7



