SciPost Physics Codebases Submission

LU-TP 22-16
MCNET-22-04

A comprehensive guide to the physics and usage of PYTHIA 8.3

Christian Bierlich¹, Smita Chakraborty¹, Nishita Desai², Leif Gellersen¹,
Ilkka Helenius³,⁴, Philip Ilten⁹, Leif Lönnblad¹, Stephen Mrenna⁵, Stefan Prestel¹,
Christian T. Preuss⁶,⁷, Torbjörn Sjöstrand¹, Peter Skands⁶, Marius Utheim¹,³, and
Rob Verheyen⁸

arXiv:2203.11601v1 [hep-ph] 22 Mar 2022

¹ Dept. of Astronomy and Theoretical Physics, Lund University, Sölvegatan 14A, S-223 62 Lund, Sweden
² Tata Institute of Fundamental Research, Homi Bhabha Road, Mumbai 400005, India
³ University of Jyvaskyla, Department of Physics, P.O. Box 35, FI-40014 University of Jyvaskyla, Finland
⁴ Helsinki Institute of Physics, P.O. Box 64, FI-00014 University of Helsinki, Finland
⁵ Fermilab, Batavia, Illinois, USA
⁶ School of Physics and Astronomy, Monash University, Wellington Rd, Clayton VIC-3800, Australia
⁷ Institute for Theoretical Physics, ETH, CH-8093 Zürich, Switzerland
⁸ Dept. of Physics and Astronomy, UCL, Gower St, Bloomsbury, London WC1E 6BT, United Kingdom
⁹ Dept. of Physics, University of Cincinnati, Cincinnati, OH 45221, USA

March 23, 2022

Abstract
This manual describes the PYTHIA 8.3 event generator, the most recent version of an evolving
physics tool used to answer fundamental questions in particle physics. The program is most
often used to generate high-energy-physics collision “events”, i.e. sets of particles produced
in association with the collision of two incoming high-energy particles, but has several uses
beyond that. The guiding philosophy is to produce and reproduce properties of experimentally
obtained collisions as accurately as possible. The program includes a wide range
of reactions within and beyond the Standard Model, and extending to heavy ion physics.
Emphasis is put on phenomena where strong interactions play a major role.
The manual contains both pedagogical and practical components. All included physics
models are described in enough detail to allow the user to obtain a cursory overview of used
assumptions and approximations, enabling an informed evaluation of the program output.
A number of the most central algorithms are described in enough detail that the main results
of the program can be reproduced independently, allowing further development of existing
models or the addition of new ones.
Finally, a chapter dedicated fully to the user is included towards the end, providing pedagogical
examples of standard use cases, and a detailed description of a number of external
interfaces. The program code, the online manual, and the latest version of this print manual
can be found on the PYTHIA web page:

https://www.pythia.org/


Contents

I Introduction

1 Preliminaries
1.1 What is an “event generator”?
1.2 The structure of a simulated event
1.3 To what types of problems can PYTHIA be applied?
1.4 Historical evolution of the PYTHIA program

2 Program structure and basic algorithms
2.1 Program structure and overview
2.2 Monte-Carlo techniques
2.2.1 Random-number generation
2.2.2 Some standard techniques
2.2.3 The veto algorithm
2.2.4 Phase space (M-generator and RAMBO)
2.3 Process-generation basics
2.3.1 2 → 2 processes
2.3.2 2 → 3 processes
2.3.3 Processes involving resonances

II Physics content

3 Internal process types
3.1 Hard QCD
3.1.1 Light quarks and gluons
3.1.2 Heavy flavours
3.1.3 Three-parton processes
3.2 Electroweak
3.2.1 Prompt photon production
3.2.2 Weak bosons
3.2.3 Photon collisions
3.2.4 Photon-parton scattering
3.3 Onia
3.4 Top production
3.5 Higgs
3.6 Supersymmetry
3.7 Hidden valley
3.8 Dark matter
3.9 Other exotica
3.10 Couplings and scales for internal processes
3.11 Handling of resonances and their decays
3.12 Parton distribution functions
3.13 Phase-space cuts for hard processes
3.14 Second hard process

4 Parton showers
4.1 The simple shower
4.1.1 Basic shower branchings
4.1.2 The dipole evolution
4.1.3 Matrix-element and other corrections
4.1.4 QED, electroweak and other showers
4.1.5 Algorithms for automated shower variations and enhanced splittings
4.2 The VINCIA antenna shower
4.2.1 Common features
4.2.2 QCD showers
4.2.3 QED showers
4.2.4 EW showers
4.3 The DIRE shower
4.3.1 Phase-space coverage and ordering
4.3.2 Transition rates
4.3.3 Weight handling aspects

5 Matching and merging
5.1 PYTHIA methods for leading-order multi-jet merging
5.2 PYTHIA methods for matching
5.3 PYTHIA methods for NLO multi-jet merging
5.4 Matching and Merging in VINCIA
5.4.1 Leading-order merging
5.4.2 NLO matching
5.5 Matching and Merging in DIRE

6 Soft and beam-specific processes
6.1 Total and semi-inclusive cross sections
6.1.1 Proton total cross sections
6.1.2 Proton elastic cross sections
6.1.3 Proton diffractive cross sections
6.1.4 Other cross sections
6.1.5 Low-energy processes
6.2 Multiparton interactions basics
6.2.1 The perturbative cross section
6.2.2 The impact-parameter model
6.2.3 The generation sequence
6.2.4 Momentum and flavour conservation
6.2.5 Interleaved and intertwined evolution
6.2.6 Spatial parton vertices
6.2.7 Other MPI aspects
6.3 Beam remnants
6.3.1 Flavour structure
6.3.2 Colour structure
6.3.3 Primordial k⊥
6.3.4 Longitudinal momentum
6.4 Hadron-hadron collisions
6.4.1 Minimum-bias and related inclusive processes
6.4.2 Diffractive processes
6.4.3 Hard diffraction
6.5 Lepton-lepton collisions
6.5.1 Bremsstrahlung and lepton PDFs
6.5.2 Beamstrahlung
6.5.3 Processes
6.6 Lepton-hadron collisions
6.6.1 Parton distribution functions and structure functions
6.6.2 Deep inelastic scattering
6.7 Photon-hadron and photon-photon collisions
6.7.1 Parton distribution functions of resolved photons
6.7.2 Photoproduction
6.7.3 Photon-photon collisions
6.7.4 Ultra-peripheral collisions
6.8 Heavy ion collisions
6.8.1 Wounded nucleons
6.8.2 The ANGANTYR model

7 Hadronization
7.1 The Lund String model
7.1.1 Selection of flavour and transverse momentum
7.1.2 Joining two jets in qq̄ events
7.1.3 Fragmentation of systems with gluons
7.1.4 Hadron vertices
7.1.5 Junction topologies
7.1.6 Small-mass systems
7.2 Colour reconnections
7.2.1 The MPI-based model
7.2.2 QCD-based colour reconnections
7.2.3 The gluon-move scheme
7.2.4 The SK models
7.2.5 Other CR models
7.3 String interactions and collective effects
7.3.1 String shoving
7.3.2 Rope hadronization
7.3.3 The thermal model
7.4 Hadronic rescattering
7.5 Bose–Einstein effects
7.6 Deuteron production

8 Particles and decays
8.1 Particle properties
8.1.1 Masses
8.1.2 Widths
8.1.3 Lifetimes
8.2 Decays
8.2.1 Hadron decays with parton showers
8.2.2 Inclusive hadron decays
8.2.3 Variable-width hadrons
8.2.4 Strong decays
8.2.5 Electromagnetic decays
8.2.6 Weak decays
8.2.7 Helicity decays
8.2.8 Tau decays

III Using PYTHIA 8.3

9 Using PYTHIA stand-alone
9.1 Installation
9.2 Program setup
9.3 Settings
9.3.1 Beams and PDFs
9.3.2 Process selection
9.3.3 Soft processes
9.3.4 Parton- and hadron-level settings
9.3.5 Particle data
9.4 Analysis of generated event
9.4.1 The Vec4 class
9.4.2 The Particle class
9.4.3 The Event class
9.5 Program output
9.5.1 Messages, warnings, and errors
9.6 Advanced settings examples
9.6.1 Matching and merging settings
9.6.2 Variable energies and beam particles
9.7 Advanced usage
9.7.1 User-defined settings
9.7.2 User hooks
9.7.3 Semi-internal processes and resonances
9.7.4 Multithreading
9.8 Event weight handling
9.8.1 Overview of process specific weights
9.8.2 Automatic weight variations
9.9 Tuning PYTHIA
9.9.1 Comments on the tuning procedure
9.9.2 The default PYTHIA 8.3 tuning: MONASH 2013
9.9.3 The ATLAS A14 tune
9.9.4 Automatic tuning approaches

10 Interfacing to external programs
10.1 Generation tools
10.1.1 Les Houches Accord and Les Houches Event File functionality
10.1.2 SLHA
10.1.3 LHAHDF5
10.1.4 LHAPDF
10.1.5 POWHEG
10.1.6 MADGRAPH5_AMC@NLO
10.1.7 HELACONIA
10.1.8 EVTGEN
10.1.9 External random-number generators
10.2 Output formats
10.2.1 HEPMC versions 2 and 3
10.2.2 Histograms with the YODA package
10.2.3 Interfacing with ROOT
10.3 Analysis tools
10.3.1 RIVET versions 2 and 3
10.3.2 FASTJET
10.4 Computing environments
10.4.1 PYTHON interface

IV Summary and Outlook

Appendices

A Full list of internal processes
A.1 Standard model processes
A.2 Beyond-the-Standard-Model processes

References

Index

List of acronyms

Part I
Introduction
This manual is organized into three major parts. This first part contains introductory material
about event generators in general and the basic technical details of event generation. The second
part presents a more detailed description of the physics implemented inside PYTHIA. The
physics is divided according to how it appears in the program flow itself, though the lines drawn
can be fuzzy: the hard process (including external calculations); parton showering; multiparton
interactions; beam remnants; and hadronization. There are also dedicated sections on the DIRE
and VINCIA parton showers, as well as the treatment of heavy-ion collisions. Some of the details
have not been thoroughly documented before, while others have appeared in prior publications.
The third part is about how the user interacts with PYTHIA. In many applications, PYTHIA is part
of a code stack or workflow, with other programs calling into PYTHIA or vice versa. This part
describes basic standalone usage and documents typical interfaces in detail.

1 Preliminaries
PYTHIA 8.3 [1] is a scientific code library that is widely used for the generation of events in
high-energy collisions between particles, where effects of the strong nuclear force, governed by
Quantum Chromodynamics (QCD), are of high importance. It is written mainly in C++ and interweaves
a comprehensive set of detailed physics models for the evolution from a few-body hard-scattering
process to a complex multi-particle final state. Parts of the physics have been rigorously
derived from theory, while other parts are based on phenomenological models, with parameters
to be determined from data. Currently, the largest user community comes from the Large Hadron
Collider (LHC) experimental collaborations, but the program is also used for a multitude of other
phenomenological or experimental studies in astro-, nuclear, and particle physics. Main tasks
performed by the program include investigations of experimental consequences of theoretical hypotheses,
interpretation of experimental data (including estimation of systematic uncertainties
and unfolding), development of search strategies, and detector design and performance studies.
It also plays an important role as a versatile vessel for exploring new theoretical ideas and new
algorithmic approaches, ranging from minor user modifications to full-fledged developments of
novel physics models.

1.1 What is an “event generator”?


In particle physics, the outcome of a collision between two incoming particles, or of the isolated
decay of a particle, is called an “event”. At the most basic level, an event therefore consists of
a number of outgoing particles such as might be recorded in a snapshot taken by an idealized
detector, with conservation laws implying that the total summed energies and momenta of the
final-state particles should match those of the initial state, as should any discrete quantum numbers
that are conserved by the physics process(es) in question.
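
The conservation statement above can be checked numerically on any list of particles. The following is a minimal sketch, not PYTHIA's own interface: the (E, px, py, pz) tuple layout, the function names, the tolerance, and the toy momenta are all assumptions made for illustration.

```python
from math import isclose

def total_four_momentum(particles):
    """Sum (E, px, py, pz) tuples component by component."""
    return tuple(sum(p[i] for p in particles) for i in range(4))

def conserves_four_momentum(initial, final, tol=1e-9):
    """Return True if summed initial and final four-momenta agree within tol."""
    p_in = total_four_momentum(initial)
    p_out = total_four_momentum(final)
    return all(isclose(a, b, abs_tol=tol) for a, b in zip(p_in, p_out))

# Hypothetical toy event (GeV): two massless beams along z,
# two massless outgoing particles, back to back.
incoming = [(50.0, 0.0, 0.0, 50.0), (50.0, 0.0, 0.0, -50.0)]
outgoing = [(50.0, 30.0, 0.0, 40.0), (50.0, -30.0, 0.0, -40.0)]

event_is_balanced = conserves_four_momentum(incoming, outgoing)
```

A check of this kind only covers energy and momentum; conserved discrete quantum numbers (for example electric charge) would need an analogous sum over the corresponding particle property.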
Due to the randomness of quantum processes, the number of outgoing particles and their properties
vary from event to event. The probability distributions for these properties can be inferred
by studying an ensemble of events in data. Conversely, given a set of theoretically calculated
(or modelled) probability distributions, it is possible to produce ensembles of simulated events to
compare to data.
A numerical algorithm that can produce (or “generate”) random sequences of such simulated
events, one after the other, is called an “event generator”. The simulations can be based on known
or hypothetical laws of nature. This allows for the exploration and comparison of competing
paradigms, and studies of the sensitivity of proposed physical observables to the differences. Only
rarely do the algorithms represent exact solutions however, so a common issue is to consider
whether ansätze and approximations made, and the level of detail offered by a given modelling,
are adequate for the problem at hand. The detailed physics descriptions contained in the main
parts of this report are intended to assist with this task.
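
To make the notion of an event generator concrete, the sketch below produces a random sequence of toy events, one after the other. It is a deliberately simplified illustration and not the algorithm used in PYTHIA: each event is the isotropic decay at rest of a particle into two massless particles, and the function names, the default mass value, and the seed are assumptions chosen for the example.

```python
import math
import random

def generate_event(rng, mass=91.2):
    """Return one toy event: isotropic two-body decay at rest into massless particles."""
    cos_theta = rng.uniform(-1.0, 1.0)           # flat in cos(theta) gives isotropy
    phi = rng.uniform(0.0, 2.0 * math.pi)        # flat azimuthal angle
    sin_theta = math.sqrt(max(0.0, 1.0 - cos_theta**2))
    e = 0.5 * mass                               # each daughter carries half the mass
    px = e * sin_theta * math.cos(phi)
    py = e * sin_theta * math.sin(phi)
    pz = e * cos_theta
    # Two back-to-back particles stored as (E, px, py, pz).
    return [(e, px, py, pz), (e, -px, -py, -pz)]

def generate_sample(n_events, seed=12345):
    """Generate n_events toy events from a seeded pseudo-random number generator."""
    rng = random.Random(seed)
    return [generate_event(rng) for _ in range(n_events)]

events = generate_sample(1000)
```

Fixing the seed makes the sample reproducible, which is convenient when comparing the effect of a modified model on otherwise identical random sequences.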
Returning to the structure of a high-energy physics event, in its crudest form, it is a list of the
sub-atomic particles produced in a collision along with a measure of the probability for that event
to occur. In PYTHIA, the list is referred to as the “event record”, and it includes the four-momentum,
production point, and many other properties of each particle, cf. section 9.4 for details. It typically
also includes quite a bit of history information showing intermediate stages of the event modelling.
The measure of the relative probability of a given event within a sample is given by the weight of
that event relative to the sum of weights for the sample. For the typical case of unweighted events,
this is just the inverse of the total number of events in the sample; cases that give rise to weighted
events are summarized in section 9.8. The total cross section for the sample is also computed,
allowing for the conversion of relative probabilities into cross sections.
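
The relation between event weights, relative probabilities, and cross sections described above can be written out in a few lines. This is a minimal sketch under stated assumptions: the function names, the example weights, and the total cross-section value are hypothetical, and the way PYTHIA itself stores weights is covered in section 9.8.

```python
def relative_probabilities(weights):
    """Relative probability of each event: its weight divided by the sum of weights."""
    total = sum(weights)
    return [w / total for w in weights]

def cross_section_per_event(weights, sigma_total):
    """Share of the total sample cross section carried by each event."""
    return [sigma_total * p for p in relative_probabilities(weights)]

# Unweighted sample: every event has probability 1/N.
unweighted = [1.0] * 4
print(relative_probabilities(unweighted))        # [0.25, 0.25, 0.25, 0.25]

# Weighted sample with a hypothetical total cross section of 8.0 (arbitrary units).
weighted = [0.5, 1.5, 2.0]
print(cross_section_per_event(weighted, 8.0))    # [1.0, 3.0, 4.0]
```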
Note that, although the starting point is often a relatively simple cross section computed in
fixed-order perturbation theory, the total probability distribution for simulated events, fully differentially
in all relevant phase-space variables and quantum numbers of the produced set of
final-state particles, can typically not be expressed analytically. Instead, it is evaluated directly,
using numerical methods, with Markov Chain Monte Carlo (MCMC) algorithms based on pseudo-random
number generators as the main ingredient. The mathematical basis of the main ones used
in PYTHIA is covered in section 2.2.
The aim of the event generator is ambitious: to predict all of the observable properties of a
high-energy collision or decay process. The full properties of an event, however, cannot currently
be calculated from first principles alone. Many different, complex phenomena, which are likely
related, are described by a proliferation of models that each focus on a limited dynamical range.
As a result, the predictions of an event generator like PYTHIA 8.3 depend upon O(100) parameters.
The values of these parameters are inferred from comparisons to data. A collection of such
parameter values is referred to as a tune.
Event-generator predictions are useful because they serve as a proxy for what an event would
look like before interacting with any measurement devices. As such, they can be used to investigate
the consequences of new and old phenomena, and to study the loss, mismeasurement, and misidentification
of particles in experiments. Thus, event generators are an important tool for interpreting collider data.
Event generators are realized as computer codes. In modern times, most of the larger projects are
developed in the C++ programming language.

1.2 The structure of a simulated event


The main goal of PYTHIA is to simulate particle production in high-energy collisions over the full
range of energy scales accessible to experiments, in as much detail as possible. However, hadron
collisions and hadroproduction in particular are exceedingly complex, and no comprehensive theory
exists currently that can predict event properties over this full range. For practical purposes,
the wide range of phenomena is factored into a number of components. A natural division for
these components is a time-ordering or, equivalently, an energy or transverse momentum ordering,
where the best understood physics is calculated at the shortest time scales and largest energies,
and the least understood physics is modelled at the longest time scales and lowest energies. This
division is well motivated and often underpinned by factorization theorems, but it is not entirely
unambiguous and sometimes is open to corrections.

[Figure 1, image not reproduced. Legend labels: Hard Interaction (dσ̂₀); Resonance Decays; MECs,
Matching & Merging; FSR; ISR*; QED; Weak Showers; Hard Onium; Multiparton Interactions (MPI);
Beam Remnants*; Strings; Ministrings / Clusters; Colour Reconnections; String Interactions;
Bose-Einstein & Fermi-Dirac; Primary Hadrons; Secondary Hadrons; Hadronic Reinteractions.
Particle types: Meson, Baryon, Antibaryon, Heavy Flavour. (*: incoming lines are crossed)]

Figure 1: Schematic of the structure of a pp → tt̄ event, as modelled by PYTHIA. To
keep the layout relatively clean, a few minor simplifications have been made: 1) shower
branchings and final-state hadrons are slightly less numerous than in real PYTHIA events,
2) recoil effects are not depicted accurately, 3) weak decays of light-flavour hadrons are
not included (thus, e.g. a K_S^0 meson would be depicted as stable in this figure), and 4)
incoming momenta are depicted as crossed (p → −p). The latter means that the beam
remnants and the pre- and post-branching incoming lines for ISR branchings should be
interpreted with “reversed” momentum, directed outwards towards the periphery of the
figure; this avoids beam remnants and outgoing ISR emissions having to criss-cross the
central part of the diagram.

The ordering in time is not completely intuitive, at least not in a directional sense from past
to future. We should rather speak of time windows centred on a hard collision that then expand
forwards and backwards in time, introducing successive phenomena, until we are left with a pair
of incoming protons from accelerator beams, for example, and a number of outgoing particles. In
momentum space, we normally speak of the “hardness” scale that characterizes each (sub)process,
and often use a measure of transverse momentum p⊥ to quantify this.
For simplicity, we will here concentrate on the sufficiently complex case of hadron-hadron
collisions, with an explicit schematic of a fully simulated pp → tt̄ event given in fig. 1. The
radial coordinate illustrates hardness scales, starting with the hardest subprocess near the centre
(labelled dσ̂₀), and ending with stable final-state particles and the incoming beam particles at the
periphery.
In our hardness- or time-ordered picture, the components of a high-energy collision are:
1. A hard scattering of two partons, one from each incoming hadron, into a few outgoing particles.
The initial partons are selected using parton distribution functions for the incoming
hadrons, and the kinematics of the outgoing particles are based on matrix elements calculated
in perturbation theory. Such calculations introduce a factorization scale and a renormalization
scale. Partons with momenta below these scales are not included in the hard
scattering, but will be introduced by other stages of the event generation. In the current usage
of PYTHIA, it is common to import the results of parton-level calculations from external
packages, though a number of simple processes are calculated internally. Hard-scattering
predictions depend on a few, universal input parameters that are determined from data, such
as the value of the strong coupling at the Z boson mass and parton distribution functions.
2. The hard process may produce a set of short-lived resonances, such as Z or W± gauge bosons
or top quarks, whose decay to normal particles has to be considered in close association with
the hard process itself.
3. Fixed-order radiative corrections may be incorporated via (combinations of) matrix-element
corrections, matching, and/or merging strategies, cf. section 5. In fig. 1, the violet shaded
region surrounding the hard process represents the range of scales covered by a (generic)
matrix-element merging strategy active above some given p⊥min scale.
4. Initial–State Radiation (ISR) of additional particles (partons, photons, and others) starting
from the scattering initiators using numerical resummation of soft and collinear gluon emission.
This (together with its final-state equivalent below) is commonly referred to as the
parton shower.
5. Final–State Radiation (FSR) of additional particles from the hard scattering itself and also
from any resonance decays.
6. In competition with ISR and FSR, further scattering processes between additional partons
from the incoming beams may take place, in a phenomenon known as Multiple Parton Interactions
(MPI). This is not to be confused with “pileup”, which generally refers to several
distinct hadron-hadron collisions recorded in the same detector snapshot.


7. At some stage after the MPIs and perhaps before resonance decays, strings begin to form, as
the non-perturbative limit of colour dipoles. These dipoles, however, are typically defined
by colour connections that are assigned in the Nc → ∞ limit, and are not unique for Nc = 3.
As discussed further in section 7.2, the associated colour-space ambiguities can be modelled
via Colour Reconnection (CR). It is also possible that long-range dynamical interactions
could physically alter the colour flow and/or change the configuration of the expanding
strings before they fragment. Depending on the characteristic timescales involved (often
not specified explicitly in simple CR models), such effects may also be referred to as colour
reconnections, but could also come under the rubric of string interactions.

8. The strong interaction now results in the confinement of QCD partons into colour-singlet
subsystems known as strings or, in small-mass limiting cases, clusters. What is currently
left of the incoming hadron constituents is combined into beam remnants. In fig. 1, the
transition between the partonic and hadronic stages of the event generation is highlighted
by the concentric annuli shaded blue.

9. The strings fragment into hadrons based on the Lund string model. Optionally, effects of
overlapping strings may be taken into account, e.g. by collecting them into so-called “ropes”
and/or allowing interactions between them.

10. Identical particles that are close in phase space may exhibit Bose-Einstein enhancements
(for integer-spin particles) or Fermi-Dirac suppressions (for half-integer-spin particles).

11. Unstable hadrons produced in the fragmentation process decay into other particles until
only stable particles remain (with some user flexibility to define what is stable).

12. In densely populated regions of phase space, the produced particles may rescatter, reannihilate,
and/or recombine with one another.

The introduction of heavy-ion beams introduces an additional layer of complexity wrapped around
this picture. Lepton-lepton collisions are much simpler, since they do not involve many of the
complications arising from hadron beams.

1.3 To what types of problems can PYTHIA be applied?


PYTHIA can be applied to a large set of phenomenological problems in particle physics, and to
related problems in astro-particle, nuclear, and neutrino physics. Historically, the core of PYTHIA is
the Lund string model of hadronization. This model is most appropriate when the invariant masses
of the hadronizing systems are above 10 GeV or so. For lower-mass systems, the model is less
reliable. Low-mass systems may still occur in PYTHIA, typically then as subsystems within a larger
event, e.g., produced by heavy-flavour decays, colour reconnections, and/or hadronic rescattering.
For the very lowest-mass systems, which produce just one or two hadrons, a simple cluster-style
model, called ministrings, is implemented; otherwise the normal string fragmentation is applied.
In addition to string hadronization, PYTHIA also incorporates state-of-the-art models for
a wide range of other particle-physics phenomena. Here, we provide a non-exhaustive list of various
applications of the PYTHIA machinery.
We emphasize that the majority of these models are based on dedicated original work done by
authors, students, and sometimes external contributors, representing a significant and sustained
intellectual effort. When quoting results obtained with PYTHIA, we therefore ask that users make
an effort to cite, alongside this manual, such original works as would be deemed directly relevant
to the study at hand, i.e. without whose implementation in PYTHIA the study could not have been
done. Appropriate references can be found throughout the manual.

• Lepton-lepton, lepton-hadron, and hadron-hadron collisions with configurable beam properties,
such as beam energies and crossing angles, to simulate one or many Standard-Model
processes encoded in PYTHIA. This is the standard application of PYTHIA, but not the only
one.

• The same as above, except using parton-level configurations for the hard process input from
an external source.

• Ordinary particle decays, where the particles are produced by another physics program.
This includes the limiting case of a particle gun (i.e. a single particle with user-defined
momentum).

• Beyond Standard Model (BSM) particle decays, including decay chains.

• Resonance decays including the effects of final-state parton showering and hadronization.

• Hadronization of (colour-singlet) partonic configurations, as may arise from ordinary or
exotic particle decays.

• Generation of Les Houches Event (LHE) formatted files from the internal hard processes for
other physics studies.

• Ion-ion collisions for ion geometries well described with a Woods-Saxon potential (non-deformed,
A > 16) for √s_NN > 10 GeV.

• Astro-particle phenomena like dark-matter annihilation into Standard-Model particles.

• User-inspired modifications of standard PYTHIA modules as allowed by the UserHooks
methods and those for semi-internal processes and/or semi-internal resonances.

As always, caveat emptor.

1.4 Historical evolution of the PYTHIA program


To bring some of the main development lines into context, we here provide a brief summary of the
historical evolution of the PYTHIA program and its ancestor, JETSET. Detailed descriptions of the
various physics components will be found in subsequent sections, including relevant references;
a more elaborate review of the historical evolution of PYTHIA can be found in ref. [2].
In the late seventies the Lund group began to study strong interactions, and notably the
hadronization subsequent to a collision process. A linear confinement potential was assumed
to be realized by a string stretched out between a pulled-apart colour–anticolour pair, as a simple
one-dimensional representation of a three-dimensional flux tube or vortex line. In order to allow
detailed studies, two PhD students were entrusted to code up this model, and also include effects
such as particle decays. This program was given the name JETSET. The model and code were
gradually extended to encompass more physics, in particular with reference to e⁺e⁻ physics. The
key addition was a model for e⁺e⁻ → qq̄g, wherein the colour field was assumed to stretch as one
string piece from the q end to the g and then as a second piece on from the g to the q̄ end, with
no direct connection between the q and q̄. This model received experimental support at PETRA in
1980 [3], thereby starting the success story of the Lund event generators. The idea of subdividing
the full colour topology into a set of colour–anticolour dipoles rapidly prompted extensions also
to other collision processes, notably to pp ones, with the PYTHIA generator built on top of JETSET.
Later, it also came to develop into the dipole picture of parton showers, and to foreshadow related
techniques for higher-order matrix-element calculations.
In part, the continued evolution was driven by interactions with the experimental communities
and their priorities. An early involvement in SSC studies led to an extension of the scope of PYTHIA
from QCD physics to encompass a wide selection of Standard Model (SM) processes, notably those
related to Higgs-boson signatures. At the same time, QCD processes needed to be modelled better,
which led to the development of new concepts, such as backwards evolution to handle initial-state
radiation, and multiparton interactions and colour reconnection to describe underlying events
and minimum-bias physics. When LHC physics studies began in 1990, these capabilities helped
PYTHIA play a prominent role in benchmarking the evolving design of the LHC detectors, and
additionally many Beyond-the-Standard-Model scenarios were included to cater to the demands
of the community.
The Large Electron–Positron Collider (LEP) became the first operating collider where JETSET
had been used from the early days of detector design, and the program came to play a key role
in most physics analyses carried out there. QCD phenomena were a primary focus of experimental
studies, and this led to an emphasis on issues such as parton-shower algorithms and matrix-element
corrections to them. The ARIADNE dipole shower [4] offered a successful alternative to
the more traditional internal JETSET one. With LEP 2, the emphasis shifted from QCD towards
electroweak processes such as W⁺W⁻ pair production, which had already been incorporated into
PYTHIA. This led, naturally, to the integration of the JETSET capabilities into PYTHIA, with PYTHIA
maintaining the project name and legacy.
Also at HERA, the Lund-based programs came to play a prominent role from the onset, with
codes such as LEPTO [5], ARIADNE and LDC [6] built on top of JETSET. Photon physics was intro-
duced into PYTHIA to handle γp at HERA and γγ at LEP 2.
A further area of study is heavy-ion collisions, where early on the FRITIOF [7] program came
to be widely used. Some of these ideas have been revived, updated, and implemented in the
PYTHIA 8.3/ANGANTYR model. It is worth noting, also, that many heavy-ion collision models,
used notably at the Relativistic Heavy Ion Collider (RHIC), have been based on PYTHIA.
The separation above, by collider, gives one way of describing the evolution of the code(s).
Underlying it is a belief in universality, that many aspects of particle collisions are the same, in-
dependent of the beam type. Therefore, physics developments made in one context can also be
applied to others. This is why one single code has found such widespread use.
The early codes were all written in FORTRAN 77. With the CERN decision to replace that
language by C++ for LHC applications, PYTHIA underwent a similar transformation in 2004 – 2008.
A new organizational structure was put in place for the new PYTHIA 8, in an attempt to clean
up blemishes incurred during the years of rapid expansion, but deep down most of the physics
algorithms survived in a new shape.
One area where the evolution has overtaken PYTHIA is that of matrix elements. Before it was
possible for most users to perform matrix-element calculations on computers, such expressions
were published in articles and hard-coded from these. Now, with the physics demand for higher
final-state multiplicities and higher-order perturbative accuracy, that is no longer feasible. For
all but the simplest processes, we therefore rely on separate, external matrix-element codes to
provide the hard interactions themselves, e.g. via the Les Houches interfaces, to which we then
can add parton showers, underlying events, and hadronization. Also parton distribution functions
are obtained externally, even if a few of the more commonly used ones are distributed with the
code.
The program has continued to expand also after the transition to C++. Some developments
are done from the onset within the PYTHIA code, such as the machinery for matching and merging
between matrix elements and parton showers, or the PYTHIA 8.3/ANGANTYR framework for heavy-
ion collisions, or the space–time picture of hadronization and hadronic rescattering. Others have
come by the integration of externally developed packages, such as the VINCIA and DIRE alternatives
to the existing simpler parton showers already in place.
In total, the JETSET/PYTHIA manuals have more than 35 000 citations by now, attesting to their
widespread use. That use also includes possible future projects such as ILC, FCC, CLIC and EIC. The
counting of code citations does not include the numerous articles describing the development and
application of the physics content in the programs. This is harder to count, with many borderline
cases, but the order of magnitude is comparable with the one for the code itself.

2 Program structure and basic algorithms


The PYTHIA 8.3 general-purpose Monte-Carlo event generator’s structure reflects the different
physics descriptions and models needed to generate fully exclusive final states as they can be
detected at collider experiments. The first part of this section gives a brief overview of the pro-
gram structure, while the latter parts describe basics of Monte Carlo (MC) techniques and process
generation employed by PYTHIA 8.3.

2.1 Program structure and overview


Internally, PYTHIA 8.3 is structurally divided into three main parts: process level, parton level, and
hadron level. This reflects the components of an event as introduced in section 1.1.
The process level represents the hard-scattering process, including the production of short-
lived resonances. The hard process is typically described perturbatively, with a limited number of
particles, typically at high-energy scales.
The parton level includes initial- and final-state radiation, where various shower models are
available. Multiparton interactions are also included at this stage, along with the treatment of
beam remnants and the possibility of the colour-reconnection phenomenon. At the end of the
parton-level evolution, the event represents a realistic partonic structure, including jets and the
description of the underlying event.
The hadron level then takes care of QCD confinement of partons into colour-singlet systems. In
PYTHIA 8.3, the hadronization is described by QCD strings fragmenting into hadrons. Furthermore,
other aspects like the decay of unstable hadrons and hadron rescattering are dealt with at the
hadron level. The physics models of hadronization are typically non-perturbative, and thus require
modelling and the tuning of parameters. The output of the hadron level is then a realistic event
as it can be observed in a detector.

[Figure 2 (block diagram): the user's main program passes input to, and collects output from, the
Pythia 8.3 event generator, which holds the "Event process" and "Event event" records. The three
main blocks are ProcessLevel (ProcessContainer, PhaseSpace, SLHAinterface, ResonanceDecay),
PartonLevel (TimeShower, SpaceShower, Dire, Vincia, MultipleInteractions, BeamRemnants), and
HadronLevel (StringFragmentation, StringInteractions, ParticleDecays, BoseEinstein, LowEnergySigma).
They are supported by Merging, BeamParticle, SigmaProcess, SigmaTotal, and by utility classes such
as Vec4, Rndm, ParticleData, PhysicsBase, UserHooks, and HeavyIons. Labels on the input and output
side include Info, Settings, Pythia8Rivet, LHA..., HepMC, LHEF, and Hist.]

Figure 2: Simplified picture of the PYTHIA 8.3 structure, showing some of the impor-
tant classes in bold. The main program itself creates one or more Pythia objects,
and provides input in terms of Settings and potentially perturbative event input.
The main physics components are grouped into ProcessLevel, PartonLevel, and
HadronLevel, with additional structure to complement and interconnect them.

On top of this general structure, a significant number of objects are shared, and information is
passed, between these levels: PDFs are relevant in both the process level and ISR, the matching and
merging machinery works on the interface between parton showers and process level, and the
Info object is used throughout all levels to store and access central information. Under certain
circumstances, like the analysis of heavy-ion collisions using the PYTHIA 8.3/ANGANTYR model,
multiple parton-level objects can be used for separate subcollisions, which are then combined for
hadronization.
From the user’s perspective, PYTHIA 8.3 is a C++ library. The actual executable is implemented
by the user, based on the requirements regarding input, output, features, and analysis, and many
examples come with the PYTHIA 8.3 package. For detailed information on how to install and use
PYTHIA 8.3, both standalone and with external interfaces, see part III. Figure 2 gives a rough
overview of the PYTHIA 8.3 program structure.

2.2 Monte-Carlo techniques


Real events observed in particle colliders are stochastic. To emulate this, event generators sample
from probability distributions using pseudo-random numbers. Naively, a pseudo-random number
(between 0 and 1) is compared to a cumulative distribution function to determine an effect, e.g.
the angle of a particle in a decay, the type of particle produced in hadronization, etc. Since real
cases are rarely this simple, we use this section to describe some of the technical details of how
pseudo-random numbers are used within the program.

2.2.1 Random-number generation


At the core of all Monte-Carlo methods lies the access to a random number generator. Truly
random numbers require special equipment and are difficult to obtain at the required pace, so
in practice pseudo-random numbers are used, where deterministic computer algorithms are used
to emulate a random behaviour. This also allows a user of the code to reproduce a given event
sample, simply by setting the same random-number seed. Nevertheless, the numbers must appear
to be random, e.g. evenly distributed between 0 and 1, have no detectable correlations, and have
a long period before they start to repeat. Many pseudo-random number generators once thought
to exhibit no internal correlations have later been revealed to have flaws, so care is needed.
A review of several current generators is found in ref. [8]. Common for them is that they
can be viewed as having an N-dimensional state vector x, living in an N-dimensional hypercube
with periodic boundary conditions such that each number is in the range between 0 and 1. A
new state is obtained by a matrix multiplication $x_{i+1} = A \times x_i$, where A is an N × N matrix of
integers. There is some sophisticated theory involved in the choice of A, involving concepts such as
Kolmogorov–Anosov mixing and the Lyapunov exponent. Some of the key results are that A should
have determinant unity, with complex eigenvalues away from the unit circle, and additionally that
multiplication with it should require a minimal amount of operations so as to keep speed up.
The RANMAR default in PYTHIA is based on the Marsaglia–Zaman algorithm [9], but implemented
in double precision with N = 97. There remain some tiny correlations 97 numbers apart,
which could be fixed by multiplication by A several times between each set of 97 random numbers
actually used [10], but this is not a necessity for event generators, where typically one is in a
completely different part of the code 97 random numbers later. The RANMAR algorithm can be
initialized to run one of more than 900 000 000 different sequences, each with a period of more
than $10^{43}$. By default, the same sequence is always run, which is useful for checks and debugging
purposes.
The MIXMAX alternative [11] is also provided as an option, and additionally there is an inter-
face allowing the user to link in an external algorithm of choice.

2.2.2 Some standard techniques


There are two main kinds of random-number usage in PYTHIA, one without a memory of a previous
evolution in “time” and one with. The latter is part of the veto algorithm described in the next
subsection. Here we introduce the former.
The simplest situation is that we know a function f (x) that is non-negative in the allowed
x range x min ≤ x ≤ x max . We want to select an x at random so that the probability in a small
interval dx around a given x is proportional to f (x) dx.
If it is possible to find a primitive function $F(x)$ with a known inverse $F^{-1}(x)$, an x can be
found as follows:

$$\begin{aligned}
\int_{x_{\min}}^{x} f(x)\,dx &= R \int_{x_{\min}}^{x_{\max}} f(x)\,dx \\
\Longrightarrow\quad x &= F^{-1}\big(F(x_{\min}) + R\,(F(x_{\max}) - F(x_{\min}))\big) ,
\end{aligned} \tag{1}$$

where R is a random number evenly distributed between 0 and 1. The statement of the first line
is that a fraction R of the total area under f(x) should be to the left of x. However, seldom are
functions of interest so nice that the method above works. It is therefore necessary to use more
complicated schemes.
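As an illustration of eq. (1), the following minimal sketch samples from f(x) = 1/x, for which F(x) = ln x can be inverted by hand. The function name, the range [1, 100], and the seed are illustrative choices and not part of PYTHIA.

```python
import random

def sample_one_over_x(x_min, x_max, rng):
    """Draw x in [x_min, x_max] with probability density proportional to 1/x."""
    r = rng.random()                      # R, uniform in [0, 1)
    # F(x) = ln x and F^{-1}(y) = exp(y), so eq. (1) reduces to this power law.
    return x_min * (x_max / x_min) ** r

rng = random.Random(12345)                # fixed seed: the sample is reproducible
xs = [sample_one_over_x(1.0, 100.0, rng) for _ in range(100_000)]
# Half of the area under 1/x on [1, 100] lies below the geometric mean, x = 10.
frac_below_10 = sum(x < 10.0 for x in xs) / len(xs)
```

Every random number yields one accepted x, so the method has unit efficiency whenever the inverse of the primitive function is available.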
If the maximum of f(x) is known, $f(x) \le f_{\max}$ in the x range considered, a hit-or-miss method
will yield the correct answer. In this method x and y are chosen according to

$$x = x_{\min} + R_1\,(x_{\max} - x_{\min}) , \qquad y = R_2\, f_{\max} . \tag{2}$$

This is repeated until a y < f(x) is selected. The accepted (x, y) point is then distributed uniformly
in the area below f(x). Equivalently, the selected x can be accepted with probability $f(x)/f_{\max}$,
without the explicit construction of a y. The efficiency of this method, i.e. the average probability
that an x will be retained, is $\left(\int f(x)\,dx\right) / \left(f_{\max}\,(x_{\max} - x_{\min})\right)$. The method is acceptable if this
number is not too low, i.e. if f(x) does not fluctuate too wildly and is not too sharply peaked.
The algorithm is independent of the absolute normalization of f(x); in a sense, the procedure
automatically rescales the function to have unit integral. As a by-product of the x selection it is
also possible to do a Monte-Carlo integration of f(x):

$$\int_{x_{\min}}^{x_{\max}} f(x)\,dx \approx (x_{\max} - x_{\min})\, \frac{1}{n_{\mathrm{try}}} \sum_{i=1}^{n_{\mathrm{try}}} f(x_i) , \tag{3}$$

where $x_i$ runs over all x values tried, whether accepted or not. The error decreases like $1/\sqrt{n_{\mathrm{try}}}$,
also if x represents more than one dimension. More conventional integration methods converge
faster than this in one dimension, but slower in higher dimensions.
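The hit-or-miss selection of eq. (2) and the by-product integral of eq. (3) can be sketched together as follows. The toy function f(x) = x² on [0, 1], the helper name, and the sample size are assumptions made for illustration only.

```python
import random

def hit_or_miss(f, x_min, x_max, f_max, rng):
    """Return (x, tried): an accepted x, plus f(x_i) for every trial made."""
    tried = []
    while True:
        x = x_min + rng.random() * (x_max - x_min)
        fx = f(x)
        tried.append(fx)
        if rng.random() * f_max < fx:     # y = R2 * f_max falls below f(x): accept
            return x, tried

def f(x):
    return x * x                          # toy function with f_max = 1 on [0, 1]

rng = random.Random(2022)
accepted, all_f = [], []
for _ in range(20_000):
    x, tried = hit_or_miss(f, 0.0, 1.0, 1.0, rng)
    accepted.append(x)
    all_f.extend(tried)

integral = (1.0 - 0.0) * sum(all_f) / len(all_f)   # eq. (3); analytic value 1/3
efficiency = len(accepted) / len(all_f)            # analytic value 1/3 for this f
mean_x = sum(accepted) / len(accepted)             # analytic value 3/4
```

For this f the efficiency and the integral coincide numerically because both $f_{\max}$ and the interval length equal one; in general they differ by the factor $f_{\max}(x_{\max} - x_{\min})$.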
Often f(x) does have narrow spikes, and it may not even be possible to define an $f_{\max}$. Then
one may use a variable transformation to flatten out the function. A related method is importance
sampling, which works if one can find a function g(x), with f(x) ≤ g(x) over the x range of
interest. Here g(x) is picked to be a “simple” function, such that the primitive function G(x) and
its inverse $G^{-1}(x)$ are known. Then the methods above can be combined:

$$x = G^{-1}\big(G(x_{\min}) + R_1\,(G(x_{\max}) - G(x_{\min}))\big) , \qquad y = R_2\, g(x) . \tag{4}$$

This is repeated until a y < f(x) is selected. Note that the first step selects (x, y) uniformly in
the area below g(x), whereas the second step accepts those points that also are below f(x). Using
an acceptance probability f(x)/g(x) again removes the need to introduce an intermediate y.
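A minimal sketch of eq. (4), assuming the illustrative target f(x) = 1/(x(1 + x)) and the overestimate g(x) = 1/x, whose primitive G(x) = ln x is easy to invert. None of these choices are taken from PYTHIA.

```python
import math
import random

X_MIN, X_MAX = 0.1, 10.0

def f(x):
    return 1.0 / (x * (1.0 + x))          # target, not normalized

def g(x):
    return 1.0 / x                        # overestimate, g(x) >= f(x) for x > 0

def sample_f(rng):
    while True:
        # G(x) = ln x, so G^{-1}(y) = exp(y): first part of eq. (4).
        y_g = math.log(X_MIN) + rng.random() * (math.log(X_MAX) - math.log(X_MIN))
        x = math.exp(y_g)
        if rng.random() * g(x) < f(x):    # y = R2 * g(x) < f(x): accept
            return x

rng = random.Random(7)
xs = [sample_f(rng) for _ in range(50_000)]
frac_below_1 = sum(x < 1.0 for x in xs) / len(xs)
```

The acceptance rate is the ratio of the areas under f and g, so a tighter overestimate directly translates into fewer wasted trials.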
If f(x) has several spikes, it may not be possible to find a g(x) that both covers all of them and
has an invertible primitive function. However, assume that we can find a function $g(x) = \sum_i g_i(x)$,
such that f(x) ≤ g(x) over the x range considered, and such that the functions $g_i(x)$ are
non-negative and have invertible primitive functions. Then multichannel sampling [12] extends
the importance-sampling prescription, by using the relative size of the integrals $I_i = \int g_i(x)\,dx$ to
each time pick a new $g_i$ for the x selection in eq. (4). The y selection and the accept/reject step work
as before, since it is easy to see that the weighted usage of the different $g_i(x)$ adds up to g(x).
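The multichannel prescription can be sketched with two Lorentzian channels, whose primitives are arctangents. The channel parameters, the modulation factor in f(x), and all names below are hypothetical and chosen only to give a two-peaked toy function.

```python
import math
import random

# Channels g_i(x) = a*w / ((x - c)^2 + w^2), given as (a, c, w); values are illustrative.
CHANNELS = [(1.0, -2.0, 0.1), (2.0, 3.0, 0.3)]
X_MIN, X_MAX = -5.0, 5.0

def g_i(x, a, c, w):
    return a * w / ((x - c) ** 2 + w ** 2)

def G_i(x, a, c, w):
    return a * math.atan((x - c) / w)     # primitive function of g_i

def g(x):
    return sum(g_i(x, *ch) for ch in CHANNELS)

def f(x):
    return g(x) * 0.5 * (1.0 + math.cos(x) ** 2)   # toy target with f(x) <= g(x)

# Relative channel weights: the integrals I_i over the allowed range.
INTEGRALS = [G_i(X_MAX, *ch) - G_i(X_MIN, *ch) for ch in CHANNELS]

def sample_multichannel(rng):
    while True:
        a, c, w = rng.choices(CHANNELS, weights=INTEGRALS)[0]   # pick g_i by I_i
        y = G_i(X_MIN, a, c, w) + rng.random() * (G_i(X_MAX, a, c, w) - G_i(X_MIN, a, c, w))
        x = c + w * math.tan(y / a)       # inverse of G_i
        if rng.random() * g(x) < f(x):    # accept/reject against the full sum g(x)
            return x

rng = random.Random(11)
xs = [sample_multichannel(rng) for _ in range(20_000)]
frac_left = sum(x < 0.0 for x in xs) / len(xs)
```

Note that the acceptance test uses the full sum g(x), not the selected $g_i(x)$; this is what makes the mixture of channels reproduce f(x).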
In addition to the generic methods, it is also sometimes possible to find special tricks. For
instance, a single Gaussian $\exp(-x^2)$ has no closed-form primitive function, but the product of two
does, after transforming to plane-polar coordinates:

$$e^{-(x^2 + y^2)}\,dx\,dy = e^{-r^2}\, r\,dr\,d\phi \propto e^{-r^2}\,dr^2\,d\phi , \tag{5}$$

which gives

$$x = \sqrt{-\ln R_1}\,\cos(2\pi R_2) , \qquad y = \sqrt{-\ln R_1}\,\sin(2\pi R_2) , \tag{6}$$

i.e. two Gaussian-distributed numbers are obtained from two random ones. Another trick is that
a judicious choice of convolutions can be used to show that $f(x) = x^{n-1} e^{-x}/(n-1)!$ can be
obtained by $x = -\sum_{i=1}^{n} \ln R_i = -\ln\left(\prod_{i=1}^{n} R_i\right)$.
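Both tricks fit in a few lines. The sketch below uses illustrative function names, and it replaces R by 1 − R (equally uniform) purely to avoid taking the logarithm of zero.

```python
import math
import random

def gaussian_pair(rng):
    """Two numbers, each distributed as exp(-x^2), following eq. (6)."""
    r = math.sqrt(-math.log(1.0 - rng.random()))   # 1 - R lies in (0, 1]
    phi = 2.0 * math.pi * rng.random()
    return r * math.cos(phi), r * math.sin(phi)

def gamma_integer(n, rng):
    """x distributed as x^(n-1) exp(-x) / (n-1)!, built from n uniform numbers."""
    return -sum(math.log(1.0 - rng.random()) for _ in range(n))

rng = random.Random(5)
gauss = [v for _ in range(50_000) for v in gaussian_pair(rng)]
gammas = [gamma_integer(3, rng) for _ in range(50_000)]

# exp(-x^2) corresponds to a Gaussian of variance 1/2; the n = 3 case has mean 3.
gauss_var = sum(v * v for v in gauss) / len(gauss)
gamma_mean = sum(gammas) / len(gammas)
```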

2.2.3 The veto algorithm


A broad class of stochastic evolution algorithms, including ones describing radioactive decays,
parton showers, and also PYTHIA’s modelling of MPI, involve the generation of ordered sequences
of state changes (transitions), where the ordering parameter is typically a measure of time and/or
resolution scale.
For probability distributions (and/or domains) that are complicated to handle analytically, the
veto algorithm offers a convenient and mathematically exact approach by which simple overes-
timates can be used instead of the original functions that are then reimprinted via a veto step.
This circumvents the need for costly and delicate numerical integrations and root finding, and
the overestimating functions and domains can be tailored to the problem at hand for maximum
efficiency.
Before describing the algorithm itself, however, let us first clear up a point of semantics. In
the context of parton showers, the veto algorithm is the main way by which Sudakov form factors
(see below) and related quantities are calculated. One therefore occasionally sees the phrase
“Sudakov veto algorithm”, but this risks giving the mistaken impression that Sudakov invented
the veto algorithm. To avoid this, the terms “veto algorithm” and “Sudakov (form) factor” are
kept separate in this work, with the former referring to the broad numerical sampling method
described in this section and the latter being an (important) example of a physical quantity that
can be calculated with it.
Consider a stochastic process that is ordered in some measure of evolution scale. E.g. for
nuclear decay, the ordering measure could be time (in the rest frame of the decaying nucleus),
while for PYTHIA’s evolution algorithms, which are formulated in momentum space, the ordering
is normally done in a measure of transverse momentum, from high to low. This ensures that
infrared and collinear divergences of the corresponding transition amplitudes are associated with
vanishing resolution scales, or equivalently with asymptotically late times in the algorithmic sense.
Starting from a given initial value, u, for the evolution scale, the probability for the next tran-
sition (e.g. a nuclear decay, or a shower branching) to happen at a lower scale t < u, is given by

p(t|u) = f (t) Π(u, t) , (7)


where f (t) is the probability (sometimes called the “naive” probability) for a transition to occur
at the scale t under the implicit condition that the state still exists at t. The latter is made explicit
by the survival probability, Π(u, t) ∈ [0, 1], which represents the probability that the state remains
unchanged over the interval [u, t]. Analogously to nuclear decay, Π(u, t) is given by a simple
exponential of the integrated naive transition rate,

$$\Pi(u, t) = \exp\left(-\int_t^u d\tau\, f(\tau)\right) , \tag{8}$$

such that

$$p(t|u) = \left.\frac{\partial\, \Pi(u, \tau)}{\partial \tau}\right|_{\tau = t} . \tag{9}$$
The survival probability Π(u, t) is often referred to as the Sudakov form factor. We note that
the two are only strictly identical for final-state showers, while for initial-state showers they are
related via ratios of parton distribution functions, and in the context of MPI one can really only
talk about a Sudakov-like factor. In this section, only the survival probability itself, which we
denote by Π(u, t), will be of interest.
It is worth pointing out that the ordered probability density p(t|u) remains well-behaved and
bounded by unity even if the integrated naive transition rate exceeds unity. In fact, due to the
aforementioned collinear and soft singularities, f(t) typically diverges for t → 0, in which case
the total probability for at least one state change becomes

$$\int_0^u d\tau\, p(\tau|u) = 1 - \exp\big(F(0) - F(u)\big) \to 1 , \tag{10}$$

where F(t) denotes a primitive function of f(t). That is, since F(0) → −∞ for a divergent kernel,
the probability for at least one state change
simply saturates at unity. This reflects the unitarity of the shower algorithm, which is also manifest
in eq. (9). If the naive probability does not diverge, or if the evolution is stopped at a finite cutoff
$t_{\mathrm{cut}} > 0$, then there is a non-zero probability, given by $\Pi(u, t_{\mathrm{cut}})$, to have no state change at all.
Starting from eq. (7), probabilities for two or more ordered transitions can easily be con-
structed as well, e.g. for two successive branchings with t < u < v:

P(t|u|v) = f2 (t) f1 (u) Π2 (v, u) Π1 (u, t) , (11)

where f1 (u) is the naive transition rate at the scale of the “first” transition, and that of the “second”
transition is f2 (t). Note that we do not assume f1 = f2 since the state undergoes a change at the
intermediate scale τ = u (and the phase space is generally also different). This is also emphasized
by the presence of two separate survival probabilities with different subscripts instead of a single
combined Π(v, t).
We now turn to how to actually sample from eq. (7). The branching kernel f (t) is typically
not simple enough to allow for the use of inversion sampling as described in the previous section.
Fortunately, the veto algorithm [13–17] (and its antecedents, see the “thinning algorithm” [18,
19]) enables sampling from eq. (7) in a quite efficient and flexible manner. This algorithm relies
on the existence of an overestimating “trial” function g(t) ≥ f (t) that is simple enough for samples
to be drawn from eq. (7) directly, with f replaced by g. A flowchart representation of the veto
algorithm in its simplest form is shown in fig. 3.

[Figure 3 (flowchart): sample t from $g(t)\,\Pi_g(u, t)$, then accept the trial with probability
$f(t)/g(t)$. On acceptance the algorithm is done; on rejection set u = t and sample again.]

Figure 3: Flowchart representation of the veto algorithm. The red and green arrows
refer to rejection and acceptance of the trial scale t respectively.

To confirm that this algorithm produces eq. (7), we follow along and write out its probability
distribution q(t|u) to find

$$q(t|u) = \int_0^u dt'\, g(t')\,\Pi_g(u, t') \left[ \frac{f(t')}{g(t')}\,\delta(t - t') + \left(1 - \frac{f(t')}{g(t')}\right) q(t|t') \right] , \tag{12}$$

where the first term describes the probability to accept the proposed trial scale t′, and the second
term gives the probability to reject the trial scale and continue the evolution from t′. Note that
eq. (12) explicitly displays the Markovian nature of the veto algorithm, with every recursive step
only depending on the previous one.
Equation (12) may be solved by considering the differential equation

$$\frac{\partial}{\partial u}\, q(t|u) = f(u)\,\delta(t - u) - f(u)\, q(t|u) , \tag{13}$$

which is found by application of Leibniz’s rule for differentiation to eq. (12). We can find a solution
by using an ansatz $q(t|u) = \hat{q}(t|u)\, e^{-F(u)}$, which after integration leads to

$$q(t|u) = f(t)\,\Pi_f(u, t)\,\Theta(t - \sigma) + q_0(t, \sigma) . \tag{14}$$

The scale σ in the step function Θ(t − σ) and the function q0 appear because information is lost
in converting eq. (12) to eq. (13), but they are easily understood by reconsidering the structure of
the algorithm. Mathematically, no other scale σ was introduced at any point, so eq. (14) cannot
depend on it. As a result, σ must equal zero and the function q0 must vanish, recovering eq. (7).
In practice, however, the infrared cutoff on the shower evolution does introduce a scale σ. In that
case, the algorithm shown in fig. 3 is stopped whenever t drops below σ. The function q0 then
represents the superfluous probability of sampling a scale below the cutoff, which is not associated
with any change of state.
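The basic veto algorithm of fig. 3, including the cutoff σ, can be sketched as follows. The kernel f(t) = C(1 − t)/t, the overestimate g(t) = C/t, and the numerical values of C and the cutoff are toy assumptions chosen so that the survival probability is known analytically; they do not correspond to any PYTHIA shower kernel.

```python
import math
import random

C, T_CUT = 0.5, 0.01                      # illustrative kernel strength and cutoff

def f(t):
    return C * (1.0 - t) / t              # "true" kernel, valid for 0 < t <= 1

def g(t):
    return C / t                          # overestimate with Pi_g(u, t) = (t/u)^C

def sample_trial(u, rng):
    # Solve Pi_g(u, t) = R for t; 1 - random() lies in (0, 1], which avoids t = 0.
    return u * (1.0 - rng.random()) ** (1.0 / C)

def veto_algorithm(u, rng):
    """Return the next scale t < u, or None if the evolution reaches T_CUT first."""
    while True:
        t = sample_trial(u, rng)
        if t < T_CUT:
            return None                   # no state change above the cutoff
        if rng.random() * g(t) < f(t):
            return t                      # accept with probability f(t)/g(t)
        u = t                             # veto: continue downwards from the trial scale

rng = random.Random(1)
results = [veto_algorithm(1.0, rng) for _ in range(50_000)]
p_none = sum(r is None for r in results) / len(results)
# Analytic survival probability Pi_f(1, T_CUT) for this particular f.
p_none_exact = T_CUT ** C * math.exp(C * (1.0 - T_CUT))
```

The fraction of evolutions ending without a transition estimates $\Pi_f(u, t_{\mathrm{cut}})$, which gives a simple closure test of the algorithm against the exact exponential of the integrated kernel.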
Many extensions of the veto algorithm are possible and are often used, of which we only
discuss a few. Further details may be found in refs. [13, 15–17, 20].
One can replace the acceptance probability f(t)/g(t) by some other r(t) ∈ [0, 1] and compensate
by modifying the event weight by a multiplicative factor f(t)/(g(t) r(t)) in case the scale
is accepted and (1 − f(t)/g(t))/(1 − r(t)) in case it is rejected. Writing out the probability
distribution again, we find

$$q(t|u) = \int_0^u dt'\, g(t')\,\Pi_g(u, t') \left[ r(t')\,\frac{f(t')}{g(t')\,r(t')}\,\delta(t - t') + \big(1 - r(t')\big)\,\frac{1 - f(t')/g(t')}{1 - r(t')}\, q(t|t') \right] , \tag{15}$$
where the weights appear as multiplicative factors. It is then straightforward to see that eq. (15)
reduces to eq. (12). This modification enables sampling from eq. (7) in cases where it is difficult
to find a g(t) ≥ f (t), or even in cases where f (t) may be negative. However, in both cases events
with negative weights will appear.
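A weighted variant can be sketched with the same toy kernel as before, f(t) = C(1 − t)/t with g(t) = C/t, but accepting with a constant probability r instead of f/g. The constant R_ACC and all names are assumptions for illustration; since g ≥ f here, all weights stay positive.

```python
import math
import random

C, T_CUT, R_ACC = 0.5, 0.01, 0.5          # illustrative values; R_ACC plays the role of r(t)

def weighted_veto(u, rng):
    """Return (t, weight); t is None if no transition occurs above T_CUT."""
    weight = 1.0
    while True:
        u = u * (1.0 - rng.random()) ** (1.0 / C)   # trial scale from g(t) = C/t
        if u < T_CUT:
            return None, weight
        ratio = 1.0 - u                   # f(t)/g(t) for f(t) = C (1 - t)/t
        if rng.random() < R_ACC:          # accept with probability r, not f/g
            return u, weight * ratio / R_ACC
        weight *= (1.0 - ratio) / (1.0 - R_ACC)     # compensate for the rejection

rng = random.Random(4)
events = [weighted_veto(1.0, rng) for _ in range(100_000)]
mean_weight = sum(w for _, w in events) / len(events)
p_none_weighted = sum(w for t, w in events if t is None) / len(events)
p_none_exact = T_CUT ** C * math.exp(C * (1.0 - T_CUT))
```

The weighted no-transition fraction should agree with the unweighted survival probability, while the mean weight should be close to one; the price is a spread of event weights.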
Applying eq. (15), shower uncertainties can be efficiently incorporated as event weights. In
that case, r(t) represents the baseline acceptance probability, while f (t) is a modified branching
kernel that parameterizes the uncertainties through variations of the renormalization scale, its
non-singular components, or choice of parton distribution function for initial state showers. If
g(t) overestimates both the baseline and the modified branching kernels, the event weights stay
positive. This is also the basis for generating biased emissions of rare splittings. More details can
be found in section 4.1.5.

[Figure 4 (flowchart): for each channel i = 1, …, n, set $u_i = u$, sample $t_i$ from
$g_i(t_i)\,\Pi_{g_i}(u_i, t_i)$ and accept with probability $f_i(t_i)/g_i(t_i)$, setting $u_i = t_i$ and resampling
on rejection. Once every channel has an accepted scale, select the highest scale $t_i$.]

Figure 4: Flowchart representation of the first competition veto algorithm.

We complete this section by discussing some variations of the veto algorithm in the context of
competition between channels. In most cases, multiple branching kernels f i (t) contribute to the
total parton-shower probability distribution, which may then be written as

$$\tilde{p}(t|u) = \tilde{f}(t)\,\Pi_{\tilde{f}}(u, t) \quad \text{where} \quad \tilde{f}(t) = \sum_{i=1}^{n} f_i(t) . \tag{16}$$

One way to handle competition is to apply the veto algorithm to all channels individually, and
then select the channel with the highest scale t i . A flowchart representation of this procedure is
shown in fig. 4.
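The "highest scale wins" procedure can be sketched with three toy channels $f_i(t) = c_i (1 - t)/t$ and overestimates $g_i(t) = c_i/t$. The channel strengths, the cutoff, and the function names are hypothetical.

```python
import math
import random

T_CUT = 0.01
COEFFS = [0.2, 0.3, 0.5]                  # hypothetical channel strengths c_i

def next_scale(c, u, rng):
    """Veto algorithm for one channel; returns 0.0 if nothing happens above T_CUT."""
    t = u
    while True:
        t = t * (1.0 - rng.random()) ** (1.0 / c)   # trial from g_i(t) = c/t
        if t < T_CUT:
            return 0.0
        if rng.random() < 1.0 - t:        # accept with f_i/g_i = 1 - t
            return t

def compete(u, rng):
    """Run all channels from the same start scale u and keep the highest scale."""
    scales = [next_scale(c, u, rng) for c in COEFFS]
    winner = max(range(len(COEFFS)), key=lambda i: scales[i])
    if scales[winner] == 0.0:
        return None, None                 # no channel fired above the cutoff
    return winner, scales[winner]

rng = random.Random(9)
events = [compete(1.0, rng) for _ in range(50_000)]
emitted = [ch for ch, _ in events if ch is not None]
p_none = 1.0 - len(emitted) / len(events)
frac_last_channel = sum(ch == 2 for ch in emitted) / len(emitted)
c_tot = sum(COEFFS)
p_none_exact = T_CUT ** c_tot * math.exp(c_tot * (1.0 - T_CUT))
```

Because all three kernels here share the same shape, channel i should win a fraction $c_i / \sum_j c_j$ of the transitions, and the no-transition probability should match the survival probability of the summed kernel.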
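It may be shown to yield eq. (16) as follows:

$$\begin{aligned}
&\left[\prod_{i=1}^{n} \int_0^u dt_i\, f_i(t_i)\,\Pi_{f_i}(u, t_i)\right] \sum_{j=1}^{n} \left[\prod_{k \neq j} \Theta(t_j - t_k)\right] \delta(t - t_j) \\
&\quad = \sum_{i=1}^{n} \left[\prod_{j \neq i} \int_0^{t_i} dt_j\, f_j(t_j)\,\Pi_{f_j}(u, t_j)\right] \int_0^u dt_i\, f_i(t_i)\,\Pi_{f_i}(u, t_i)\,\delta(t - t_i) \\
&\quad = \sum_{i=1}^{n} f_i(t)\,\Pi_{f_i}(u, t) \prod_{j \neq i} \Pi_{f_j}(u, t) = \tilde{p}(t|u) .
\end{aligned} \tag{17}$$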

Equivalently, the result of eq. (17) may be used with overestimates $g_i(t_i)$ in place of $f_i(t_i)$, then
selecting the highest scale before proceeding to the acceptance step. A flowchart representation
of this procedure is shown in fig. 5, and produces the same result. This algorithm is used to
interleave initial-state and final-state radiation with multiple parton interactions. Furthermore,
it is more efficient when branching-kernel evaluation is expensive, such as for matrix-element
corrections. A third option is available, where instead a single scale is drawn according to the sum
of overestimates $\tilde{g}(t)$ and a channel is selected with probability $g_i(t)/\tilde{g}(t)$ for the acceptance step.
A flowchart representation of this procedure is shown in fig. 6 and again produces eq. (16). This
procedure is used in a few specific places, in particular when the overestimates of several channels
are very similar. Examples include quark-flavour selection in g → qq̄ splittings and the efficient
sampling of the large number of branchings in VINCIA’s electroweak (EW) shower. It is important
to note that these algorithms may also be combined, for instance by grouping several channels
for use with the algorithm depicted in fig. 6, and further combining them with the algorithms
depicted in fig. 4 or fig. 5. In fact, the different shower models available in PYTHIA often use
different procedures to optimize code structure and performance.

[Figure 5 (flowchart): for every channel i = 1, …, n, sample $t_i$ from $g_i(t_i)\,\Pi_{g_i}(u_i, t_i)$; select the
highest scale $t_i$; accept with probability $f_i(t)/g_i(t)$. On rejection set $u = t_i$ and repeat.]

Figure 5: Flowchart representation of the second-competition veto algorithm. This algorithm
is used to interleave initial-state and final-state radiation with multiple parton
interactions, and is more efficient when branching-kernel evaluation is expensive, such
as for matrix-element corrections.

[Figure 6 (flowchart): sample t from $\tilde{g}(t)\,\Pi_{\tilde{g}}(u, t)$; select a channel with probability
$g_i(t)/\tilde{g}(t)$; accept with probability $f_i(t)/g_i(t)$. On rejection set u = t and repeat.]

Figure 6: Flowchart representation of the third-competition veto algorithm. Useful in
situations where multiple channels have similar overestimates, like for quark-flavour
selection in g → qq̄ splittings and the efficient sampling of the large number of branchings
in VINCIA’s electroweak shower.

2.2.4 Phase space (M-generator and RAMBO)


One standard task is to distribute the momenta of final-state particles uniformly according to
Lorentz Invariant Phase Space (LIPS), on top of which then later dynamical aspects can be added.
(Non-uniform sampling methods, used e.g. when resonances are present, are discussed separately,
in section 2.3.) The relevant phase-space density is

$$d\Phi_n(P; p_1, p_2, \ldots, p_n) = (2\pi)^4\, \delta^{(4)}\!\left(P - \sum_{i=1}^{n} p_i\right) \prod_{i=1}^{n} \frac{d^3 p_i}{(2\pi)^3\, 2 p_i^0} , \tag{18}$$

where P is the total four-momentum, and pi , i ≥ 1 are the n different outgoing four-momenta.
Usually the process is initially considered in the rest frame of the system, P = (M ; 0), and later
boosted to the relevant frame of the whole event.
A common special case is two-body final states, where the Centre of Mass (CM)-frame expression
reduces to

$$d\Phi_2 = \frac{|\mathbf{p}|}{16\pi^2 M}\, d\Omega = \frac{|\mathbf{p}|}{16\pi^2 M}\, d\cos\theta\, d\phi . \tag{19}$$

That is, the direction of one of the outgoing particles has to be picked uniformly on the unit sphere,
with the other moving out in the opposite direction. The three-momentum length is

$$|\mathbf{p}| = |\mathbf{p}_1| = |\mathbf{p}_2| = \frac{\sqrt{\lambda(M^2, m_1^2, m_2^2)}}{2M} , \tag{20}$$

where the Källén λ function can be written in a number of equivalent ways

$$\begin{aligned}
\lambda(a^2, b^2, c^2) &= a^4 + b^4 + c^4 - 2a^2 b^2 - 2a^2 c^2 - 2b^2 c^2 \\
&= (a^2 - b^2 - c^2)^2 - 4b^2 c^2 \\
&= (a^2 - (b + c)^2)(a^2 - (b - c)^2) \\
&= (a + b + c)(a - b - c)(a - b + c)(a + b - c) .
\end{aligned} \tag{21}$$

The energies are given by

$$p_1^0 = \sqrt{m_1^2 + \mathbf{p}^2} = \frac{M^2 + m_1^2 - m_2^2}{2M} , \qquad
p_2^0 = \sqrt{m_2^2 + \mathbf{p}^2} = \frac{M^2 + m_2^2 - m_1^2}{2M} . \tag{22}$$
For three or more final particles, PYTHIA implements two different generic methods, the older
M-generator [21] and the newer RAMBO [22] one, but approaches tailor made for the specific
situation are also common, e.g. in parton showers. RAMBO is the best choice for the case of
massless products, whereas the situation is less obvious once the masses constitute a significant
fraction of the full energy.


The basic idea of the M-generator strategy is to view the full event as arising from a sequence
of fictitious two-body decays. Thus a four-body decay 0 → 1 + 2 + 3 + 4, as an example, is viewed
as a sequence 0 → 123 + 4 → 12 + 3 + 4 → 1 + 2 + 3 + 4, where 123 and 12 represent intermediate
states. By a suitable insertion of a unit factor

$$1 = \delta^{(4)}(p_{12} - p_1 - p_2)\, d^4 p_{12}\, \delta(m_{12}^2 - p_{12}^2)\, dm_{12}^2 = \delta^{(4)}(p_{12} - p_1 - p_2)\, \frac{d^3 p_{12}}{2 p_{12}^0}\, dm_{12}^2 , \tag{23}$$

and a similar one for $p_{123}$, the four-body phase space can be reformulated as

$$d\Phi_4(P; p_1, p_2, p_3, p_4) \propto d\Phi_2(P; p_{123}, p_4)\, dm_{123}^2\, d\Phi_2(p_{123}; p_{12}, p_3)\, dm_{12}^2\, d\Phi_2(p_{12}; p_1, p_2) . \tag{24}$$

The mass-dependent parts can be collected and simplified to

$$\frac{\sqrt{\lambda(m_0^2, m_{123}^2, m_4^2)}}{m_0}\, \frac{\sqrt{\lambda(m_{123}^2, m_{12}^2, m_3^2)}}{m_{123}}\, \frac{\sqrt{\lambda(m_{12}^2, m_1^2, m_2^2)}}{m_{12}}\, dm_{123}\, dm_{12} . \tag{25}$$
The (m123 , m12 ) phase space can easily be sampled within allowed borders, but the rest of the
expression then becomes a weight that has to be taken into account by hit-and-miss Monte Carlo.
This is where the algorithm can be slow. Once the intermediate masses have been selected, two-
particle kinematics are constructed in a sequence of rest frames for 1 + 2, 12 + 3 and 123 + 4,
interleaved with Lorentz boosts between them.
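For the simplest non-trivial case, a three-body decay 0 → 12 + 3 → 1 + 2 + 3, the M-generator strategy can be sketched as below. The overestimate `w_max` (product of the separate maxima of the two momentum factors), the helper names, and the test masses are assumptions made for this illustration; PYTHIA's own implementation may organize the steps differently.

```python
import math
import random

def pstar(M, m1, m2):
    """Momentum of either daughter in M -> m1 + m2, i.e. sqrt(lambda) / (2 M)."""
    lam = (M * M - (m1 + m2) ** 2) * (M * M - (m1 - m2) ** 2)
    return math.sqrt(max(lam, 0.0)) / (2.0 * M)

def boost(q, parent, M):
    """Boost q from the rest frame of `parent` (mass M) to the frame of `parent`."""
    E, px, py, pz = parent
    qe, qx, qy, qz = q
    e_new = (E * qe + px * qx + py * qy + pz * qz) / M
    fac = (qe + e_new) / (E + M)
    return (e_new, qx + fac * px, qy + fac * py, qz + fac * pz)

def decay(parent, M, m1, m2, rng):
    """Isotropic two-body decay of `parent`, returned in the frame of `parent`."""
    p = pstar(M, m1, m2)
    cos_t = 2.0 * rng.random() - 1.0
    sin_t = math.sqrt(1.0 - cos_t * cos_t)
    phi = 2.0 * math.pi * rng.random()
    d = (p * sin_t * math.cos(phi), p * sin_t * math.sin(phi), p * cos_t)
    q1 = (math.sqrt(m1 * m1 + p * p), d[0], d[1], d[2])
    q2 = (math.sqrt(m2 * m2 + p * p), -d[0], -d[1], -d[2])
    return boost(q1, parent, M), boost(q2, parent, M)

def three_body(M, m1, m2, m3, rng):
    lo, hi = m1 + m2, M - m3              # allowed range of the intermediate mass m12
    # Assumed overestimate: the first factor is largest at lo, the second at hi.
    w_max = pstar(M, lo, m3) * pstar(hi, m1, m2)
    while True:                           # hit-or-miss on the mass-dependent weight
        m12 = lo + (1.0 - rng.random()) * (hi - lo)
        if rng.random() * w_max < pstar(M, m12, m3) * pstar(m12, m1, m2):
            break
    p12, p3 = decay((M, 0.0, 0.0, 0.0), M, m12, m3, rng)
    p1, p2 = decay(p12, m12, m1, m2, rng)
    return p1, p2, p3

rng = random.Random(8)
events = [three_body(10.0, 1.0, 2.0, 3.0, rng) for _ in range(200)]
```

The hit-or-miss loop over the intermediate mass is the step referred to above as potentially slow: its efficiency drops as more intermediate masses have to be chosen simultaneously.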
The RAMBO algorithm provides an alternative sampling of n-body phase space, which by
construction has constant (uniform) weight for arbitrary n in the massless limit. The starting
point is the following identity for massless four-vectors,

$$\int d^4 q\, \delta(q^2)\, \exp(-q^0) = \int_0^\infty \frac{q^0}{2}\, \exp(-q^0)\, dq^0 \int d\Omega = 2\pi . \tag{26}$$

A four-momentum q distributed according to the integrand of the left-hand side of this identity
can be generated via the steps

$$q^0 = -\log(R_1 R_2) , \qquad \cos\theta = 2R_3 - 1 , \qquad \phi = 2\pi R_4 , \tag{27}$$

$$\Longrightarrow\quad q = (q^0,\; q^0 \sin\theta \sin\phi,\; q^0 \sin\theta \cos\phi,\; q^0 \cos\theta) . \tag{28}$$

RAMBO repeats this process n times to produce a set of momenta $q_i^\mu$ that initially have $\sum_i q_i^\mu \equiv Q^\mu$.
The final momenta $p_i^\mu$ are then constructed by applying a boost $\Lambda^\mu{}_\nu$ to the CM frame of Q and scaling
by an overall factor $x = M/\sqrt{Q^2}$, so that $P^\mu \equiv \sum_i p_i^\mu = x\,(\Lambda^\mu{}_\nu Q^\nu) = (M, \mathbf{0})$.

To illustrate that this leads to momenta distributed according to eq. (18), we may start from
n copies of eq. (26) and unitarily transform the momenta as

$$\begin{aligned}
(2\pi)^n = {} & \prod_{i=1}^{n} \int d^4 q_i\, \delta(q_i^2)\, \exp(-q_i^0) \\
& \times d^4 Q\, \delta^4\!\left(Q - \sum_{i=1}^{n} q_i\right) dx\, \delta\!\left(x - \frac{M}{\sqrt{Q^2}}\right) \\
& \times \prod_{i=1}^{n} d^4 p_i\, \delta^4\big(p_i - x\,(\Lambda q_i)\big) .
\end{aligned} \tag{29}$$

Identifying the phase-space measure eq. (18) in eq. (29) and integrating over the other variables
then leads to

$$\int d\Phi_n(P; p_1, p_2, \ldots, p_n) = \left(\frac{\pi}{2}\right)^{n-1} \frac{M^{2n-4}}{(n-1)!\,(n-2)!} , \tag{30}$$

which is indeed the n-body massless phase-space volume [22]. RAMBO thus samples the massless
phase space isotropically, with constant weight given by eq. (30).
For massive particles, no equivalent general expression for the phase-space volume exists.
However, the massless RAMBO algorithm may be adapted to the massive case at the cost of intro-
ducing variable event weights, which translates to a reduced efficiency at the unweighted level.
Starting from the massless momenta pi , massive momenta ki are obtained through
$$
\mathbf{k}_i = y\, \mathbf{p}_i , \qquad k_i^0 = \sqrt{|\mathbf{k}_i|^2 + m_i^2} . \tag{31}
$$

The momenta ki are on-shell and preserve momentum conservation as long as the rescaling pa-
rameter y is given by the solution of the equation
$$
\sum_{i=1}^{n} \sqrt{y^2 |\mathbf{p}_i|^2 + m_i^2} = M . \tag{32}
$$

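Since the left-hand side of eq. (32) is a monotonic function of $y$, with a solution in the range
$0 \le y \le 1$, the value of $y$ may be determined easily using the Newton–Raphson method. Through a
similar procedure as the one followed in eq. (29), the event weight may be determined to be

$$
w = \left(\frac{\pi}{2}\right)^{n-1} \frac{1}{(n-1)!\,(n-2)!}\,
\prod_{i=1}^{n} \frac{|\mathbf{k}_i|}{k_i^0}
\left( \sum_{j=1}^{n} \frac{|\mathbf{k}_j|^2}{k_j^0} \right)^{-1}
\left( \sum_{j=1}^{n} |\mathbf{k}_j| \right)^{2n-3} , \tag{33}
$$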

which is bounded from above by the massless weight, eq. (30), so that the distribution can be
unweighted by accepting the generated massive phase-space point with the probability
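$$
P_\mathrm{accept} =
\overbrace{ \prod_{i=1}^{n} \frac{|\mathbf{k}_i|}{k_i^0}
\left( \frac{\sum_j |\mathbf{k}_j|}{\sum_j \frac{|\mathbf{k}_j|^2}{k_j^0}} \right) }^{<1}\;
\overbrace{ \left( \frac{\sum_j |\mathbf{k}_j|}{M} \right)^{2n-4} }^{<1} , \tag{34}
$$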

which (by construction) tends to unity in the massless limit.
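
The steps of eqs. (26) to (34) can be sketched in a few lines of Python. This is only an illustrative sketch and not the PYTHIA implementation: the function names `rambo_massless` and `rambo_massive` are hypothetical, the $R_i$ are assumed to be uniform random numbers in the unit interval, and at least two particles with $\sum_i m_i < M$ are assumed.

```python
import math
import random


def rambo_massless(n, M, rng):
    """Generate n >= 2 massless momenta (E, px, py, pz) summing to (M, 0, 0, 0)."""
    qs = []
    for _ in range(n):
        # Eq. (27); 1 - random() lies in (0, 1], so the logarithm stays finite.
        q0 = -math.log((1.0 - rng.random()) * (1.0 - rng.random()))
        cos_t = 2.0 * rng.random() - 1.0
        sin_t = math.sqrt(1.0 - cos_t * cos_t)
        phi = 2.0 * math.pi * rng.random()
        # Eq. (28).
        qs.append((q0, q0 * sin_t * math.sin(phi),
                   q0 * sin_t * math.cos(phi), q0 * cos_t))
    # Total momentum Q and its invariant mass sqrt(Q^2).
    Q = [sum(q[mu] for q in qs) for mu in range(4)]
    m_Q = math.sqrt(Q[0] ** 2 - Q[1] ** 2 - Q[2] ** 2 - Q[3] ** 2)
    # Boost to the rest frame of Q, then rescale by x = M / sqrt(Q^2).
    b = [-Q[k] / m_Q for k in (1, 2, 3)]
    gamma = Q[0] / m_Q
    a = 1.0 / (1.0 + gamma)
    x = M / m_Q
    ps = []
    for q in qs:
        bq = b[0] * q[1] + b[1] * q[2] + b[2] * q[3]
        energy = x * (gamma * q[0] + bq)
        vec = [x * (q[k + 1] + b[k] * q[0] + a * bq * b[k]) for k in range(3)]
        ps.append((energy, vec[0], vec[1], vec[2]))
    return ps


def rambo_massive(ps, masses, M, tol=1e-12, max_iter=100):
    """Rescale massless momenta to massive ones; return (momenta, P_accept)."""
    p_abs = [math.sqrt(p[1] ** 2 + p[2] ** 2 + p[3] ** 2) for p in ps]
    y = 1.0  # the left-hand side of eq. (32) is >= M here, so the root lies below
    for _ in range(max_iter):
        energies = [math.sqrt((y * pa) ** 2 + m * m) for pa, m in zip(p_abs, masses)]
        f = sum(energies) - M
        if abs(f) < tol * M:
            break
        df = sum(y * pa * pa / e for pa, e in zip(p_abs, energies))
        y -= f / df  # Newton-Raphson step for eq. (32)
    k_abs = [y * pa for pa in p_abs]
    k0 = [math.sqrt(ka * ka + m * m) for ka, m in zip(k_abs, masses)]
    ks = [(e, y * p[1], y * p[2], y * p[3]) for e, p in zip(k0, ps)]
    # Acceptance probability, eq. (34).
    prod = 1.0
    for ka, e in zip(k_abs, k0):
        prod *= ka / e
    s1 = sum(k_abs)
    s2 = sum(ka * ka / e for ka, e in zip(k_abs, k0))
    p_accept = prod * (s1 / s2) * (s1 / M) ** (2 * len(ps) - 4)
    return ks, p_accept
```

An unweighted massive phase-space point would then be obtained by repeating both steps until a uniform random number falls below the returned `p_accept`.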

2.3 Process-generation basics


Particle-physics cross sections can crudely be divided into two categories: perturbative and non-
perturbative. Both kinds of processes play crucial roles in PYTHIA. The former can be computed
order by order in perturbation theory, e.g. based on Feynman-diagram rules. For the electroweak
sector, the couplings are sufficiently small that higher-order corrections should offer a rapidly con-
verging series. The exception is the enhanced emission of soft or collinear photons, but this is a
well-understood issue. For the strong sector, on the other hand, the large αs coupling leads to a
slower convergence. It can still work well for QCD processes involving large momentum transfers.
In the opposite limit, at low momentum transfers, the perturbative coupling diverges and pertur-
bation theory breaks down. Therefore the total cross section in hadron-hadron collisions, which
is dominated by such low scales, can only be described in terms of effective, phenomenological
models. The same applies for its main components — elastic, diffractive and nondiffractive cross
sections — which therefore are classified as non-perturbative processes.
In the current section, the focus will be on perturbative processes, introducing how these
are defined and generated inside PYTHIA. Non-perturbative processes are discussed separately
in section 6. There are also components that partly bridge the gap between the two, such as
multiparton interactions (MPIs), hard diffraction, and photoproduction processes. These are also
discussed in section 6, along with further aspects specific to simulating cross sections in heavy-ion
collisions.
To begin the discussion of perturbative process generation, consider a process $a + b \to f_n$,
where $a$ and $b$ are two incoming particles that together create a final state $f$ consisting of $n$
particles. The differential cross section can then be written as

$$
\frac{d\hat{\sigma}}{d\Phi_n} = \frac{|\mathcal{M}|^2}{2\sqrt{\lambda(\hat{s}, m_a^2, m_b^2)}} \approx \frac{|\mathcal{M}|^2}{2\hat{s}} , \tag{35}
$$

where $\hat{s} = (p_a + p_b)^2$ is the squared invariant mass of the collision system. Usually $m_a$ and
$m_b$ are negligible in comparison with $\sqrt{\hat{s}}$, and then the last expression is obtained. The
process-specific physics is encapsulated in the matrix element $\mathcal{M}$, which we shall assume can be
calculated perturbatively. The $|\mathcal{M}|^2$ expression also has to be averaged over incoming spin and
colour configurations, and summed over outgoing spin and colour configurations, where relevant.
In some rare cases $a$ and $b$ are the actual incoming beam particles. Normally, however, $a$
and $b$ are constituents of the true beam particles, $A$ and $B$. Then one needs to introduce parton
distribution functions (PDFs), $f_a^A(x, Q^2)$ (and $f_b^B(x, Q^2)$), that to leading order describe the
probability to find a parton $a$ inside the particle $A$, with a fraction $x$ of the particle four-momentum, if
the hard-collision process probes the particle at a (factorization) scale $Q^2$. The cross section then
reads

$$
\sigma = \int dx_1\, f_a^A(x_1, Q^2) \int dx_2\, f_b^B(x_2, Q^2) \int d\Phi_n\, \frac{d\hat{\sigma}(\hat{s}, Q^2)}{d\Phi_n} , \tag{36}
$$

where

$$
\hat{s} = x_1 x_2 s \quad \text{with} \quad s = (p_A + p_B)^2 . \tag{37}
$$
The nature of the PDFs varies depending on what kind of particle is concerned: hadrons, nuclei,
leptons, photons, or pomerons. They will therefore be discussed further in the respective beam
context. The most commonly used and best studied are the proton PDFs, cf. section 3.12, and for
these we will omit the A and B superscripts.

2.3.1 2 → 2 processes
Massless Kinematics: In a massless 2 → 2 subprocess $a(p_1) + b(p_2) \to c(p_3) + d(p_4)$ it is
conventional to write the cross section in terms of the Mandelstam variables

$$
\hat{s} = (p_1 + p_2)^2 = (p_3 + p_4)^2 , \tag{38}
$$

$$
\hat{t} = (p_1 - p_3)^2 = (p_2 - p_4)^2 = -\frac{\hat{s}}{2} (1 - \cos\hat{\theta}) , \tag{39}
$$

$$
\hat{u} = (p_1 - p_4)^2 = (p_2 - p_3)^2 = -\frac{\hat{s}}{2} (1 + \cos\hat{\theta}) , \tag{40}
$$

26
SciPost Physics Codebases Submission

where $\hat{\theta}$ is the scattering angle, defined as the polar angle of particle 3, in the rest frame of
the collision. Since $d\Phi_2 \propto d\cos\hat{\theta} \propto d\hat{t}$ (assuming a trivial flat $\varphi$ dependence, as is the case
unless the incoming beams are transversely polarized), it is common to recast eq. (36) accordingly.
Furthermore, $dx_1\, dx_2 = d\tau\, dy$, where $\tau = x_1 x_2 = \hat{s}/s$ and $y = (1/2) \ln(x_1/x_2)$. It is also standard
to use $x f(x)$ rather than $f(x)$. In total this gives

$$
\sigma = \iiint \frac{d\tau}{\tau}\, dy\, d\hat{t}\; x_1 f_a^A(x_1, Q^2)\; x_2 f_b^B(x_2, Q^2)\; \frac{d\hat{\sigma}(\hat{s}, \hat{t}, Q^2)}{d\hat{t}} . \tag{41}
$$

The $\hat{u}$ variable is redundant since $\hat{s} + \hat{t} + \hat{u} = 0$, but often symmetry properties of matrix elements
are apparent if it is used judiciously. In a frame where $a$ and $b$ come in back-to-back, moving in
the $\pm z$ directions, $\hat{p}_\perp^2 = \hat{t}\hat{u}/\hat{s}$ is the squared transverse momentum of the outgoing $c$ and $d$. A
frequent choice is to put $Q^2 = \hat{p}_\perp^2$ as the factorization scale.
The sampling ranges for each of the $(\tau, y, \hat{t})$ variables depend on whether phase-space cuts
are imposed at the process-generation level, cf. section 3.13. Generically, they are:

$$
\frac{\hat{s}_\mathrm{min}}{s} < \tau < \frac{\hat{s}_\mathrm{max}}{s} , \tag{42}
$$

$$
-\frac{1}{2} |\ln\tau| < y(\tau) < \frac{1}{2} |\ln\tau| , \tag{43}
$$

$$
\sqrt{1 - \frac{4\hat{p}_{\perp\mathrm{max}}^2}{\tau s}} < |z(\tau)| < \sqrt{1 - \frac{4\hat{p}_{\perp\mathrm{min}}^2}{\tau s}} , \tag{44}
$$

where $\hat{t}$ has been replaced by $z = \cos\hat{\theta}$ via eq. (39), and we emphasize that there are solutions
for both positive and negative $z$. The phase-space boundaries are set via the (user-specifiable)
parameters $\hat{m}_\mathrm{min,max}$, $\hat{p}_{\perp\mathrm{min,max}}$, and/or $\hat{Q}^2_\mathrm{min}$, cf. section 3.13. To give some examples:

• For processes containing an s-channel resonance, it may be desirable to only generate phase-
space points within a specific range of m̂ values. Processes involving resonance production
and decay are discussed in more detail in section 2.3.3.

• A restriction like $\hat{p}_\perp > \hat{p}_{\perp\mathrm{min}}$, implying $\hat{s}_\mathrm{min} = \max(\hat{m}^2_\mathrm{min}, 4\hat{p}^2_{\perp\mathrm{min}})$, is mandatory for matrix
elements that diverge in the $\hat{p}_\perp \to 0$ limit; this includes in particular massless t-channel
QCD processes. It may also be convenient for studies focusing on the high-$p_\perp$ tail of “hard”
2 → 2 processes.

• The option to specify a $Q^2_\mathrm{min}$ value is intended for t-channel DIS-type processes with distin-
guishable final-state particles, cf. section 3.13, in which case $\hat{s}_\mathrm{min} \ge Q^2_\mathrm{min}$ and
$z(\tau) \le 1 - 2Q^2_\mathrm{min}/(\tau s)$.
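
The massless relations of eqs. (39), (40), (43) and (44) can be made concrete with a small numerical sketch. The helper names below are hypothetical and are not PYTHIA functions; the sketch only evaluates the formulas for given inputs and does not perform any sampling.

```python
import math


def mandelstam_massless(s_hat, cos_theta):
    """Return (t_hat, u_hat, pT2_hat) for massless 2 -> 2 scattering."""
    t_hat = -0.5 * s_hat * (1.0 - cos_theta)  # eq. (39)
    u_hat = -0.5 * s_hat * (1.0 + cos_theta)  # eq. (40)
    pt2_hat = t_hat * u_hat / s_hat           # squared transverse momentum
    return t_hat, u_hat, pt2_hat


def momentum_fractions(tau, y):
    """Invert tau = x1 * x2 and y = (1/2) ln(x1 / x2)."""
    root_tau = math.sqrt(tau)
    return root_tau * math.exp(y), root_tau * math.exp(-y)


def sampling_limits(tau, s, pt_min, pt_max=None):
    """Return (y_max, z_lo, z_hi) at fixed tau, following eqs. (43) and (44).

    The allowed region is |y| < y_max and z_lo < |z| < z_hi.
    """
    y_max = 0.5 * abs(math.log(tau))
    s_hat = tau * s
    z_hi = math.sqrt(max(0.0, 1.0 - 4.0 * pt_min ** 2 / s_hat))
    if pt_max is None:
        z_lo = 0.0  # no upper pT cut: central scattering is allowed
    else:
        z_lo = math.sqrt(max(0.0, 1.0 - 4.0 * pt_max ** 2 / s_hat))
    return y_max, z_lo, z_hi
```

For instance, `sampling_limits(0.01, 1.0e4, 3.0)` describes a subsystem with $\hat{s} = 100$ and a cut $\hat{p}_{\perp\mathrm{min}} = 3$ in the same (arbitrary) energy units.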

The selection of phase-space points (τ, y, z) is described in detail in ref. [14], and remains
unchanged. The basic strategy is to use multichannel sampling in each of the three variables sep-
arately. Thus, for instance, the τ dependence is modelled as a mix of sampling according to either
1/τ or 1/τ2 . The normalization factors of the two possibilities are determined at initialization,
and would depend on the process, the choice of PDFs and αs , and the p⊥min cut. That way an
upper envelope is found for the real cross-section expression. The probability that a trial phase-
space point is retained is given by the ratio of the full differential cross section to the multichannel
overestimate, and the accepted events are assigned a standard weight of unity. There is always
the risk that the intended upper estimate of the cross section is exceeded by the full expression in
some corner of phase space, even if it is not common. Such points are associated with a weight
correspondingly above unity.
The cross section for a process is obtained in parallel with the generation of events, using the
multidimensional generalization of eq. (3). Thus the error decreases with the number of events
generated.
When several processes are to be generated simultaneously, an upper envelope is found for
each differential cross section separately. The size of integrated envelopes, i.e. the upper estimate
of the respective cross sections, is used as a relative weight when the next process is selected. If the
trial phase-space point is rejected then a new process choice is made. That is, a larger overestimate
will make a given process more likely to be picked, but then afterwards also more likely to be
rejected. In the end, all processes are generated in proportion to their correct integrated cross
sections.
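
The mixing of several processes can be illustrated with a toy sketch. As a simplifying assumption, the phase-space dependence is collapsed here so that each process has a constant acceptance probability equal to the ratio of its true cross section to its overestimate; in the real generator this ratio is evaluated point by point. The function name `generate_mixed` is hypothetical.

```python
import random


def generate_mixed(sigma_true, sigma_over, n_events, rng):
    """Pick a process in proportion to its overestimate, then accept or reject.

    Returns the accepted-event counts per process and the estimated total
    cross section, obtained from the overall acceptance rate.
    """
    counts = [0] * len(sigma_true)
    indices = range(len(sigma_over))
    n_trials = 0
    accepted = 0
    while accepted < n_events:
        n_trials += 1
        # A larger overestimate makes the process more likely to be picked ...
        i = rng.choices(indices, weights=sigma_over)[0]
        # ... but also more likely to be rejected afterwards.
        if rng.random() < sigma_true[i] / sigma_over[i]:
            counts[i] += 1
            accepted += 1
    sigma_estimate = sum(sigma_over) * n_events / n_trials
    return counts, sigma_estimate
```

With true cross sections of 1 and 3 and overestimates of 4 and 3.5 (arbitrary units), the first process is picked more often but ends up with about a quarter of the accepted events, as its true cross section dictates.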
Generation in $(\tau, y, z)$ is only one possible choice. It has the advantage that additional $\tau$ terms
can be used for the sampling of resonances in the cross section, cf. section 2.3.3. For the generation
of MPIs, however, it is essential to use $\hat{p}_\perp^2$ rather than $\hat{t}$, cf. section 6.2. Then one may instead note
that

$$
\frac{dx_1}{x_1} \frac{dx_2}{x_2}\, d\hat{t} = \frac{d\tau}{\tau}\, dy\, d\hat{t} = dy_3\, dy_4\, d\hat{p}_\perp^2 , \tag{45}
$$

where $y_3$ and $y_4$ are the rapidities of the two outgoing particles.

Massive Kinematics: So far we have considered massless kinematics. It is quite common to have
cases where one or both of the outgoing particles are massive, while the incoming ones still are
assumed massless. In some cases, such as elastic scattering, both incoming and outgoing masses
need to be taken into account. The fully general $\hat{t}$ expression is

$$
\hat{t} = -\frac{\hat{s}^2 - \hat{s}(m_1^2 + m_2^2 + m_3^2 + m_4^2) + (m_1^2 - m_2^2)(m_3^2 - m_4^2) - \sqrt{\lambda(\hat{s}, m_1^2, m_2^2)\, \lambda(\hat{s}, m_3^2, m_4^2)}\, \cos\hat{\theta}}{2\hat{s}} , \tag{46}
$$

with $\hat{u}$ obtained by $m_3 \leftrightarrow m_4$ and $\cos\hat{\theta} \to -\cos\hat{\theta}$, resulting in

$$
\hat{s} + \hat{t} + \hat{u} = m_1^2 + m_2^2 + m_3^2 + m_4^2 . \tag{47}
$$

The limits $\hat{t}_\mathrm{min} < \hat{t} < \hat{t}_\mathrm{max}$ (all negative or, for $\hat{t}_\mathrm{max}$, zero) are obtained for $\cos\hat{\theta} = \mp 1$. Often $\hat{t}_\mathrm{max}$
is close to zero and a numerically safer recipe for it is obtained by noting that

$$
\hat{t}_\mathrm{min}\, \hat{t}_\mathrm{max} = (m_3^2 - m_1^2)(m_4^2 - m_2^2) + \frac{(m_1^2 + m_4^2 - m_2^2 - m_3^2)(m_1^2 m_4^2 - m_2^2 m_3^2)}{\hat{s}} . \tag{48}
$$

If $m_1^2 = m_2^2 = 0$ then $\hat{t}_\mathrm{min}\, \hat{t}_\mathrm{max} = m_3^2 m_4^2$ and $\hat{p}_\perp^2 = (\hat{t}\hat{u} - m_3^2 m_4^2)/\hat{s}$.
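
As a cross-check of the massive expressions, eqs. (46) to (48) can be evaluated numerically. The sketch below assumes that $\lambda$ is the Källén function, $\lambda(a, b, c) = a^2 + b^2 + c^2 - 2ab - 2ac - 2bc$; the function names are hypothetical and all masses enter squared.

```python
import math


def kallen(a, b, c):
    """Kallen function lambda(a, b, c) = a^2 + b^2 + c^2 - 2ab - 2ac - 2bc."""
    return a * a + b * b + c * c - 2.0 * (a * b + a * c + b * c)


def t_hat_massive(s_hat, m1sq, m2sq, m3sq, m4sq, cos_theta):
    """General t_hat for 1 + 2 -> 3 + 4 with arbitrary masses, eq. (46)."""
    root = math.sqrt(kallen(s_hat, m1sq, m2sq) * kallen(s_hat, m3sq, m4sq))
    numerator = (s_hat ** 2 - s_hat * (m1sq + m2sq + m3sq + m4sq)
                 + (m1sq - m2sq) * (m3sq - m4sq) - root * cos_theta)
    return -numerator / (2.0 * s_hat)


def u_hat_massive(s_hat, m1sq, m2sq, m3sq, m4sq, cos_theta):
    """u_hat from t_hat by m3 <-> m4 and cos(theta) -> -cos(theta)."""
    return t_hat_massive(s_hat, m1sq, m2sq, m4sq, m3sq, -cos_theta)


def t_limits(s_hat, m1sq, m2sq, m3sq, m4sq):
    """Return (t_min, t_max), with t_max taken from the product in eq. (48)."""
    t_min = t_hat_massive(s_hat, m1sq, m2sq, m3sq, m4sq, -1.0)
    product = ((m3sq - m1sq) * (m4sq - m2sq)
               + (m1sq + m4sq - m2sq - m3sq)
               * (m1sq * m4sq - m2sq * m3sq) / s_hat)
    return t_min, product / t_min
```

Dividing the product of eq. (48) by $\hat{t}_\mathrm{min}$ avoids the cancellation between large terms that a direct evaluation of eq. (46) at $\cos\hat{\theta} = 1$ would involve when $\hat{t}_\mathrm{max}$ is close to zero.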

2.3.2 2 → 3 processes
In pure s-channel 2 → 3 processes, say (unpolarized) e+ e− → γ∗ /Z → qq̄g, cross sections factorize
into production and decay steps, and the decay phase space is easy to generate in terms of two
energy variables and three angles. Such decays are not coded as explicit hard processes, however,
but instead are handled during the parton-level shower evolution. Three-body final states such as
e+ e− → γ∗ /Z → qq̄g are then reached via showering from e+ e− → γ∗ /Z → qq̄ (cf. section 2.3.3
on resonances and section 4 on parton showers), with matrix-element corrections applied to the
extent available and switched on, cf. section 5.
For 2 → 3 hard processes that do not factorize into resonance production and decay plus
shower, it becomes much more messy to set up phase space, since there are more possibilities for
peaks in different places. PYTHIA does not have a general-purpose machinery to handle generic
cross sections. Instead, the main assumption is that such processes are provided via the Les
Houches accord, cf. section 10.1.1, from external programs that have their own phase-space gen-
erators.
There are a few internal 2 → 3 processes, however, for very specific tasks. These are generated
according to one of three different prescriptions, tailored to the squared amplitudes for massless
QCD 2 → 3 processes, Vector Boson Fusion (VBF), and central diffractive processes, respectively.
These were developed separately and employ somewhat different notation in the code, here rela-
belled for clarity. Note that all three assume a cylindrical symmetry with respect to the collision
axis.
Massless QCD $a(p_1) + b(p_2) \to c(p_3) + d(p_4) + e(p_5)$ cross sections contain divergences when
any of the final-state particles become collinear to the beam, collinear to each other, and/or soft. It
is therefore important to choose a set of phase-space variables that allows for the isolation of these
singularities. The parameterization used in PYTHIA is $(y_3, y_4, y_5, p_{\perp 3}^2, p_{\perp 4}^2, \varphi_3, \varphi_4)$. The rapidity
sampling here is simple and consistent, while the $p_\perp$ selection is not, unfortunately. The $p_{\perp 5}$
is fixed opposite to the vector sum of the other two, and in the first instance gets a different $p_\perp$
spectrum than them. Notably, a requirement for all $p_\perp > p_{\perp\mathrm{min}}$ can be imposed with full efficiency
for two, but is inefficient for the third. It is also important to avoid the collinear singularity by an
additional cut on $R = \sqrt{(\Delta y)^2 + (\Delta\varphi)^2}$ for all outgoing pairs.
A process of special interest is vector-boson fusion to a Higgs boson, W+ W− → H and ZZ → H
(and/or W+ W+ → H++ in some BSM scenarios). Since the bosons are emitted from fermion
lines this results in 2 → 3 processes of the character $f_1(p_1) + f_2(p_2) \to f_3(p_3) + f_4(p_4) + H(p_5)$.
The variables chosen in this case are $(\tau, y, y_5, p_{\perp 3}^2, p_{\perp 4}^2, \varphi_3, \varphi_4)$. Here, special care is taken in
the modelling of $p_{\perp 3}^2$ and $p_{\perp 4}^2$ which, unlike the QCD cross sections, have no $p_\perp \to 0$ divergence
but instead are fairly flat out to the gauge-boson mass. Note that the physics of the process here
naturally singles out the Higgs $p_\perp$ as having a different shape than the other two, again different
from the QCD case. The same machinery is also used for heavy-quark fusion to Higgs, qq̄ → QQ̄H
and gg → QQ̄H, where top masses are selected with a Breit–Wigner shape.
Another special case is central diffraction, e.g. $p(p_1) + p(p_2) \to p(p_3) + p(p_4) + X(p_5)$, where
$X$ is the central diffractive system. Here sampling in $t_1 = (p_1 - p_3)^2$ and $t_2 = (p_2 - p_4)^2$ is crucial
to impose an exponential fall-off in these variables. The energy fractions $x_1$ and $x_2$ taken from
the incoming protons define $m_X^2 = p_5^2 = x_1 x_2 s$ in the collinear limit $t_1, t_2 \to 0$. Away from this
limit the $\varphi_3$ and $\varphi_4$ angles also play a role, and one requires a more elaborate definition of $x_1$ and $x_2$.

2.3.3 Processes involving resonances


The term “resonance” has a specific meaning in PYTHIA and refers to particles whose decays are
considered to be part of the hard process. This enables PYTHIA to modify the total calculated cross
section depending on which decay channels are open or closed (including effects of sequential
decays, such as t → bW+ followed by W+ → e+ νe ), and also provides a natural framework for
incorporating process-specific aspects such as spin correlations and/or finite-width effects. Here,
we focus on aspects of phase-space generation common to all processes involving resonances. Details
on cross-section considerations, process-specific features, and some further sophistications are
explained in section 3.11, while user implementations of “semi-internal” resonances are described
in section 9.7.3, and the handling of SUSY Les Houches Accord (SLHA) decay tables is covered in
section 10.1.2.
We focus first on the simplest treatment available in PYTHIA, with partial widths and branching
fractions fixed to their on-shell values. Technically, in the code this corresponds to decay channels
that are assigned meMode = 100. Values of 101, 102, and 103 additionally include some simple
kinematic threshold effects, also discussed below. Lastly, we emphasize that the default treatment
often goes further than this, with most decay modes of SM resonances (and some BSM ones)
assigned meMode = 0 implying the use of dedicated matrix-element expressions for branching
fractions that can vary over a reasonably broad resonance peak; this is covered separately in
section 3.11; see further section 9.7.3 for user implementations of such expressions.
Starting from a cross section computed in the zero-width approximation, i.e. for stable final-state
resonances, the simplest shape modelling available in PYTHIA is a relativistic Breit–Wigner
substitution of the type

$$
1 = \int \delta(m^2 - m_0^2)\, dm^2 \;\to\; \int_{m^2_\mathrm{min}}^{m^2_\mathrm{max}} \frac{1}{\pi} \frac{m_0 \Gamma_0}{(m^2 - m_0^2)^2 + m_0^2 \Gamma_0^2}\, dm^2 , \tag{49}
$$

for each final-state resonance, where $m_0$ and $\Gamma_0$ are the nominal (on-shell) mass and width of the
resonance, respectively, and $m$ is allowed to vary in a range $m \in [m_\mathrm{min}, m_\mathrm{max}]$ that can be specified
individually for each resonance in PYTHIA’s particle data table. Note that choosing a small range
will reduce the total cross sections accordingly.
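
The shape in eq. (49) can be sampled by inverse transform, since its integral over $m^2$ is an arctangent. The following sketch illustrates this, together with the fraction of the full Breit–Wigner integral that survives a given mass range. The function names are hypothetical, and this is not necessarily the sampling strategy used inside PYTHIA.

```python
import math
import random


def bw_angles(m0, gamma0, m_min, m_max):
    """Map the mass limits onto the arctangent variable that flattens eq. (49)."""
    a_min = math.atan((m_min ** 2 - m0 ** 2) / (m0 * gamma0))
    a_max = math.atan((m_max ** 2 - m0 ** 2) / (m0 * gamma0))
    return a_min, a_max


def bw_fraction(m0, gamma0, m_min, m_max):
    """Integral of eq. (49) over [m_min^2, m_max^2]; below unity for a finite range."""
    a_min, a_max = bw_angles(m0, gamma0, m_min, m_max)
    return (a_max - a_min) / math.pi


def sample_bw_mass(m0, gamma0, m_min, m_max, rng):
    """Draw a mass m in [m_min, m_max] distributed according to eq. (49)."""
    a_min, a_max = bw_angles(m0, gamma0, m_min, m_max)
    a = a_min + (a_max - a_min) * rng.random()
    m_sq = m0 ** 2 + m0 * gamma0 * math.tan(a)
    # Guard against rounding that would push m^2 marginally outside the range.
    m_sq = min(max(m_sq, m_min ** 2), m_max ** 2)
    return math.sqrt(m_sq)
```

For example, restricting $m^2$ to the window $m_0^2 \pm m_0 \Gamma_0$ retains exactly half of the integral, which makes the reduction of the cross section for a narrow mass range explicit.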
The phase-space integral in eq. (41) is then extended to include integrations over m2 for each
resonance, and the sampled values1 for these masses are used instead of the on-shell ones in the
evaluation of dσ̂/d t̂ and also in the relations between kinematic variables such as between t̂ and
cos θ̂ . This offers a crude level of approximation to the expected mass dependence of the full
cross section, at least in the vicinity of the resonance(s) where the resonant amplitudes can still
be assumed to dominate over any (non-resonant) background processes.
A complication arises for processes that involve pair production of the same kind of particle,
such as tt, W+ W− , or Z0 Z0 production. For such processes, on-shell matrix elements are phrased
in terms of a single pole-mass value, while the procedure above produces two different values, m3
and m4 . For the specific case of double-vector-boson production, PYTHIA uses 4-fermion matrix
elements that include the full mass dependence (as well as the full γ∗ /Z interference). However,
for more general processes involving two of the same kind of resonance (such as tt production),
the choice made in PYTHIA is to use an average squared mass,

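$$
\bar{m}^2 = \frac{m_3^2 + m_4^2}{2} - \frac{(m_3^2 - m_4^2)^2}{4\hat{s}} , \tag{50}
$$

which is defined so that $(\hat{s}, m_3^2, m_4^2)$ and $(\hat{s}, \bar{m}^2, \bar{m}^2)$ correspond to the same CM-frame
three-momenta,

$$
|\mathbf{p}^*(\hat{s}, \bar{m}^2, \bar{m}^2)|^2 = |\mathbf{p}^*(\hat{s}, m_3^2, m_4^2)|^2 = \frac{1}{4\hat{s}} \left[ \hat{s} - (m_3 + m_4)^2 \right] \left[ \hat{s} - (m_3 - m_4)^2 \right] . \tag{51}
$$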

Analogous modified values for the $\hat{t}$ and $\hat{u}$ variables are defined to correspond to the same
CM-frame scattering angle,

$$
\bar{\hat{t}} = \hat{t} - \frac{(m_3^2 - m_4^2)^2}{4\hat{s}} = -\frac{1}{2} \left( \hat{s} - 2\bar{m}^2 - 2|\mathbf{p}^*| \sqrt{\hat{s}}\, \cos\hat{\theta} \right) , \tag{52}
$$

$$
\bar{\hat{u}} = \hat{u} - \frac{(m_3^2 - m_4^2)^2}{4\hat{s}} = -\frac{1}{2} \left( \hat{s} - 2\bar{m}^2 + 2|\mathbf{p}^*| \sqrt{\hat{s}}\, \cos\hat{\theta} \right) , \tag{53}
$$

and these variables ($\bar{m}$, $\bar{\hat{t}}$, and $\bar{\hat{u}}$) are then used in the evaluation of the on-shell cross-section
formula. If in doubt whether full matrix elements or the mass-symmetrized approximation repre-
sented by eqs. (50) to (53) is used for a given process, the corresponding sigmaKin() method
can be inspected in the code (with $\bar{m}^2$ then typically denoted s34Avg). We note that, when gauge
bosons are involved, the procedure is not guaranteed to be gauge invariant, nor positive definite,
and breakdowns should be expected if any resonance masses are far from their on-shell values.
The alternative would be to change to use full 4- or 6-body matrix elements instead (as already
done for double-vector-boson production), e.g. by interfacing external hard-process generators,
cf. section 10.1.

Footnote 1: See ref. [14, sec. 7.4.2] for details on the sampling procedure.
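
The mass symmetrization of eqs. (50) to (53) is straightforward to check numerically. The sketch below uses a hypothetical function name and assumes massless incoming partons; it is not the code behind sigmaKin().

```python
import math


def symmetrized_kinematics(s_hat, m3, m4, cos_theta):
    """Return (m_bar_sq, t_bar, u_bar, p_star) following eqs. (50) to (53)."""
    m3sq, m4sq = m3 * m3, m4 * m4
    # Average squared mass, eq. (50).
    m_bar_sq = 0.5 * (m3sq + m4sq) - (m3sq - m4sq) ** 2 / (4.0 * s_hat)
    # CM-frame three-momentum for the actual masses, eq. (51).
    p_star_sq = ((s_hat - (m3 + m4) ** 2) * (s_hat - (m3 - m4) ** 2)
                 / (4.0 * s_hat))
    p_star = math.sqrt(p_star_sq)
    root_s = math.sqrt(s_hat)
    # Modified Mandelstam variables at the same scattering angle, eqs. (52), (53).
    t_bar = -0.5 * (s_hat - 2.0 * m_bar_sq - 2.0 * p_star * root_s * cos_theta)
    u_bar = -0.5 * (s_hat - 2.0 * m_bar_sq + 2.0 * p_star * root_s * cos_theta)
    return m_bar_sq, t_bar, u_bar, p_star
```

By construction the returned values satisfy $\hat{s} + \bar{\hat{t}} + \bar{\hat{u}} = 2\bar{m}^2$ and $|\mathbf{p}^*|^2 = \hat{s}/4 - \bar{m}^2$, i.e. they describe an equal-mass final state with the same momentum and angle.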
Be aware that, if a decay mode has been assigned meMode = 100 and $m_\mathrm{min}$ is such that the
decaying resonance can fluctuate down in mass to below the nominal threshold for the given decay
mode (i.e., $m_\mathrm{min} < \sum_j m_j$ with $m_j$ the on-shell daughter masses for the decay mode in question),
it is assumed that at least one of the daughters could also fluctuate down to keep the channel
open. Otherwise the program will hit an impasse.
Alternatively, simple step functions $\Theta(m - \sum_j m_j)$ can be applied to impose kinematic
thresholds; this is done for decay channels that are assigned meMode = 101. A slightly more
sophisticated alternative is to use a smooth threshold factor,

$$
\beta = \sqrt{ \left( 1 - \frac{m_1^2 + m_2^2}{m^2} \right)^2 - 4\, \frac{m_1^2 m_2^2}{m^4} } \tag{54}
$$

for two-body decay modes, and

$$
\sqrt{ 1 - \frac{\sum_j m_j}{m} } \tag{55}
$$

for multi-body ones, again with m j equal to the on-shell masses of the decay products for the
given mode. The former correctly encodes the shrinking size of the phase space near threshold
(but would still miss any non-trivial matrix-element factors) while the latter is only a crude simpli-
fication. Two separate options exist for this, depending on whether the stored on-shell branching
fraction should be considered to already include this factor (meMode = 103) or whether it should
be modified by it (meMode = 102). In the former case (meMode = 103), the actual factor ap-
plied is the ratio of the above to the corresponding value for m = m0 , with a safety limit imposed
in case that denominator turns out to be very small, to avoid unintentionally large rescalings at
large m.
Among the options discussed thus far, only the no-threshold one (meMode = 100) allows for
purely off-shell decay modes, i.e. ones for which the on-shell daughter mass values exceed $m_0$; as
noted above one or more of the daughters must then be able to fluctuate down in mass, or there
will be trouble. The remaining options (meMode = 101 – 103) are all restricted to phase-space
points satisfying $\sum_j m_j < m$, with $m_j$ the on-shell daughter-mass values.
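
The four simple threshold options can be summarized in a short sketch. The function names are hypothetical, and the lower bound `min_denominator` used for meMode = 103 is an assumed stand-in for the safety limit mentioned above, with an illustrative value rather than the one used in PYTHIA.

```python
import math


def threshold_factor(m, daughters):
    """Smooth threshold factor: eq. (54) for two daughters, eq. (55) otherwise."""
    if sum(daughters) >= m:
        return 0.0  # channel closed at this resonance mass
    if len(daughters) == 2:
        m1sq, m2sq = daughters[0] ** 2, daughters[1] ** 2
        arg = (1.0 - (m1sq + m2sq) / m ** 2) ** 2 - 4.0 * m1sq * m2sq / m ** 4
        return math.sqrt(max(0.0, arg))
    return math.sqrt(1.0 - sum(daughters) / m)


def mass_dependence(me_mode, m, m0, daughters, min_denominator=0.1):
    """Factor multiplying the stored on-shell branching fraction (toy version)."""
    if me_mode == 100:
        return 1.0  # no threshold treatment at all
    if me_mode == 101:
        return 1.0 if m > sum(daughters) else 0.0  # step function
    factor = threshold_factor(m, daughters)
    if me_mode == 102:
        return factor  # stored value is modified by the factor
    if me_mode == 103:
        # Stored value already includes the factor at m = m0, so apply the ratio.
        # min_denominator is a hypothetical safety limit, not PYTHIA's value.
        on_shell = threshold_factor(m0, daughters)
        return factor / max(min_denominator, on_shell)
    raise ValueError("meMode not covered by this sketch")
```

For a resonance sampled exactly at its nominal mass, meMode = 103 then returns unity for any channel that is comfortably open on shell, while meMode = 102 returns the threshold factor itself.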
Currently, the only higher level of sophistication available in PYTHIA is to go all-in and imple-
ment dedicated decay-rate calculations specific to each given resonance and decay mode; this is
obtained for meMode = 0. As mentioned above, this is the default for most SM resonance de-
cays in PYTHIA as well as for some BSM ones, meaning that such code exists in the program (in
the form of process-specific SigmaProcess::weightDecay() methods and resonance-specific
ResonanceWidths::calcWidth() methods) and is used by default. See further section 3.11.
Finally, note that both PYTHIA’s simple shower and the VINCIA antenna shower allow
for the insertion of resonance decays as 1 → n branchings in the overall perturbative evolution,
at decay-specific perturbative scales. This is called interleaved resonance decays [23] and is also
further discussed in section 3.11.

Part II
Physics content
The PYTHIA event generator is the product of a physics development program in close touch with
experimental reality. The two have often gone hand in hand, by making it possible to check which
ideas work and which do not. Many of the concepts that today form the accepted picture of
high-energy collisions can be traced directly to this undertaking, including string fragmentation,
dipole showers, multiparton interactions, colour reconnection, and more. This part of the manual
provides details on these and the other physics models encoded in PYTHIA. Many of these models
still evolve to handle new experimental input, or to accommodate the progress in our theoretical
understanding of the standard model of particle physics.
The first section describes the physics processes — sometimes denoted “hard processes” —
internal to PYTHIA. These processes are those that can be calculated in leading-order perturbation
theory in the standard model or simple extensions. While some of the calculations are currently
outclassed, and are more of interest as a cross check or for quick studies, others, particularly
the treatment of jet production and BSM physics models, are still actively used for comparisons
with data. Further sections describe the core of the PYTHIA engine: parton showers, multiparton
interactions, hadronization, and particle decays. An important complement to these sections is the
one on matching and merging, which documents the methods for interfacing external calculations
(that are more precise in perturbation theory) with the PYTHIA engine.
Notable additions to the PYTHIA 8.3 presentation here are the descriptions of two parton-
shower plugins — DIRE and VINCIA— and the heavy-ion collision machinery.

3 Internal process types


PYTHIA 8.3 includes a good selection of native hard processes at Leading Order (LO). The hard
processes are generated by sampling the allowed phase space, using the matrix elements convoluted
with the PDFs as a weight. Usually the multichannel sampling strategy results in accepted
events with a common unit weight, but there are exceptions to this rule, so it is wise to be pre-
pared for non-unit weights in event-analysis programs. In addition to the internal processes, there
are several ways to feed in externally generated hard processes for showering and hadronization,
including several options for matching and merging of higher-order processes. These options are
discussed in detail in section 5 and section 10. This section classifies and lists internally defined
hard processes and discusses special features and appropriate settings for given processes.
All internally defined processes are listed in appendix A, where also references to the cross-section
formulae are given. If a process is included for both quark and lepton initial or final particles, the
process is written with an “f” (denoting fermion) whereas a “q” is used when only quarks are
expected. Charge-conjugate processes are always implicitly included.

3.1 Hard QCD


The internal QCD processes can be classified in three categories: (1) 2 → 2 scattering of light
quarks and gluons, (2) production of heavy flavours (charm and bottom) in 2 → 2 processes,
and (3) 2 → 3 processes involving light quarks and gluons. For hard processes, suitable
process-dependent phase-space cuts need to be applied to avoid soft and collinear singularities of
perturbative QCD. In addition to these, soft QCD processes are included. These are discussed in
section 6.1 and include also a unitarized version of the hard 2 → 2 cross sections, where the diver-
gences in the p⊥ → 0 limit have been regulated with the screening parameter p⊥0 , see section 6.2
for more details. Normally the hard and soft QCD processes would not be used simultaneously,
since typically they target different kinds of physics studies. If they are combined nevertheless, rel-
evant phase-space cuts should be introduced to separate the regions handled by each machinery,
to prevent double counting.

3.1.1 Light quarks and gluons


This subclass of processes contains all possible 2 → 2 scatterings of massless quarks and gluons. In
total there are six possibilities:

• gg → gg

• gg → qq̄

• qg → qg

• qq′ → qq′ where incoming and outgoing flavours are the same

• qq̄ → gg

• qq̄ → q′q̄′

By default the light flavours include u, d and s quarks but it is also possible to produce c and b
quarks with the massless matrix elements.

3.1.2 Heavy flavours


These processes provide heavy-quark pair production where heavy flavours stand for c and b. At
LO, there are two possible processes each:

• gg → cc̄

• qq̄ → cc̄

• gg → bb̄

• qq̄ → bb̄

Unlike the case of massless partons, the finite mass also makes the matrix element expressions
finite in the p⊥ → 0 limit, so there is no need to introduce phase-space cuts to avoid divergences.
However, it is also possible to generate these processes within the regularized soft QCD frame-
work, though the mass effects are then not accounted for in the matrix elements. When consid-
ering heavy-quark production, one should keep in mind that especially c quarks are abundantly
produced in the parton shower at LHC energies [24]. Therefore, to obtain e.g. inclusive D-meson
spectra, these processes should be combined with the light-parton processes above. This combina-
tion will also provide the total QCD jet cross section in LO. Notice also that the qg → qg process is
available only in the massless approximation above. (The massive matrix element for this process
incorrectly sets the incoming quark on mass shell, so it is not a better alternative.)

3.1.3 Three-parton processes


In addition to 2 → 2 QCD processes, LO expressions for processes with three final state partons are
also included in PYTHIA. These contain only light partons, but if needed the massive quarks can be
dealt with using the massless matrix elements. One should also keep in mind that, since three-jet
events can be formed from two-parton final states in the parton showers, mixing these with the
2 → 2 QCD processes would lead to double counting. So far this section is partly incomplete, e.g.
colour flows are rather simple, so the purpose is rather to provide a way to check cross sections in
specific kinematics where e.g. all three jets need to be above a certain p⊥ . Included processes are
listed below:

• gg → ggg

• qq̄ → ggg

• qg → qgg

• qq′ → qq′g where q and q′ are different flavours

• qq → qqg where incoming and outgoing flavours are the same

• qq̄ → q′q̄′g where q and q′ are different flavours

• qq̄ → qq̄g where incoming and outgoing flavours are the same

• gg → qq̄g

• qg → qq′q̄′ where q and q′ are different flavours

• qg → qqq̄ where incoming and outgoing flavours are the same

3.2 Electroweak
The internally defined electroweak (EW) processes contain prompt photon production, processes
with photons in the initial state, and processes including electroweak bosons as an intermediate
state or in the final state.

3.2.1 Prompt photon production


These processes include parton-initiated production with one or two photons in the final
state. The partonic cross sections are at LO in QCD for massless partons and contain only 2 → 2
processes. Thus, similarly to light-flavour QCD, the expressions diverge in the p⊥ → 0 limit
and require a minimum p⊥ cut to avoid QCD singularities. These processes are, however, also
included in the eikonalized description of the soft QCD process class, where the divergences are
regulated with the p⊥0 parameter. Therefore this event class should be preferred when low-p⊥
photons are considered. The available processes are

• qg → qγ

• qq̄ → gγ

• gg → gγ

• qq̄ → γγ

• gg → γγ

Notice that the processes with two gluons in the initial state are box graphs. By default, it is
assumed that the five massless quarks may form the loop, such that the expressions should be
valid in a region where p⊥ is between the b and t quark masses. It is, however, possible to change
the number of active flavours inside the loop if a different region is considered. In addition to the
photons produced in the hard scattering (prompt photons), photons may also be formed in parton
showers and hadron decays. Therefore QCD processes might also be needed to obtain realistic rates
for photon production.

3.2.2 Weak bosons


This section includes processes with standard model EW gauge bosons, γ∗ /Z and W± . The pro-
cesses are classified into single-boson production, production of a boson in association with a
parton, pair production, and t-channel exchange of a boson between two fermions.
As a highly-virtual photon γ∗ cannot be distinguished from a Z boson with equal quantum
numbers, typically both contributions should be accounted for to include interference effects. It
is, however, possible to consider these two components separately, without the interference. This
applies for all of the following processes including neutral EW bosons.

Boson exchange, DIS The EW boson t-channel exchange is mainly relevant in Deep Inelastic
Scattering (DIS) processes in lepton-hadron collisions but may also be applied for other types
of collisions. There are two different contributions, one with neutral and one with charged EW
bosons, namely

• f f′ → f f′ where a neutral γ∗ /Z boson is exchanged so that the initial- and final-state
fermion pair remains the same.

• f1 f2 → f3 f4 where a charged W± boson is exchanged so that the initial- and final-state
fermions are different. This includes charged current DIS with a charged-lepton beam and
DIS with neutrino beams.

In pp collisions the factorization and renormalization scales are usually related to the transverse
momentum, p⊥ , of the final-state particles. However, in DIS a more appropriate hard scale is
usually the virtuality of the intermediate particle, Q2 . Therefore, when studying DIS with these
processes, it is advised to set the renormalization and factorization scales appropriately, see sec-
tion 3.10 for details. Similarly, rather than having a phase-space cut on p⊥ to avoid divergences,
here it is more appropriate to set a minimum Q2 value to make sure that the relevant phase space
is covered. Furthermore, since the default parton shower distributes the emission recoils globally,
it is not well suited for DIS studies where the scattered lepton is expected to stay intact. Instead
it is recommended to use either the dipole-recoil option or the DIRE shower, see section 4.1 and
section 4.3 for further details.
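
As a small illustration of the hard scale discussed above, the following sketch computes the virtuality Q2 = −(k − k′)2 of the exchanged boson from the four-momenta of the incoming and scattered lepton. It is a standalone Python example and not part of PYTHIA; the function names and the numerical kinematics are assumptions chosen only for illustration.

```python
def minkowski_square(p):
    """Return p^2 for a four-vector p = (E, px, py, pz) with metric (+,-,-,-)."""
    e, px, py, pz = p
    return e * e - px * px - py * py - pz * pz


def dis_q2(lepton_in, lepton_out):
    """Virtuality Q^2 = -(k - k')^2 of the boson exchanged in the t channel."""
    q = tuple(a - b for a, b in zip(lepton_in, lepton_out))
    return -minkowski_square(q)


# Hypothetical massless-lepton kinematics in GeV: a 10 GeV lepton along +z
# scattered by 90 degrees into the +x direction with 5 GeV of energy.
k_in = (10.0, 0.0, 0.0, 10.0)
k_out = (5.0, 5.0, 0.0, 0.0)
q2 = dis_q2(k_in, k_out)  # equals 2 * E * E' * (1 - cos(theta)) = 100 GeV^2
```

A minimum-Q2 requirement of the kind described above would then amount to keeping only events for which this quantity exceeds the chosen threshold.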

Single boson production Two different options are included for the single EW-boson production
(s-channel) processes. In the first case the process is described as 2 → 1 scattering where the final
state is either γ∗ /Z or W± :

• ff̄ → γ∗/Z

• ff̄′ → W±

The decay products of the short-lived (or virtual) particles and their kinematics are then derived
as described in section 2.3.
The other possibility is to consider single EW-boson production as a 2 → 2 process where the
decay products can be predetermined. This is useful if only certain final states are studied and
enables one to set phase-space cuts for the final state, e.g. for the p⊥ of the produced lepton.
These overlap with the first class of single-boson processes, so the two classes should not be
mixed, to avoid double counting. The possible processes are:

• ff̄ → γ∗ → f′f̄′

• ff̄ → γ∗/Z → f′f̄′

• f1 f̄2 → W± → f3 f̄4

The difference between the first two is that in the first, the final state can be any of three possible
lepton generations or five possible quark flavours, whereas in the second, the decay channels are
set by the Z-decay modes. In the former, only γ∗ exchange is included and the process is part
of the MPI framework. In the latter, it is also possible to select between pure γ∗ , Z, and the full
interference modes. For the last, W± production, the decay channels are always the same for W+
and W− . These are set for W+ and charge-conjugated channels are applied for W− . No quark-mass
effects are included for the angular distribution of the decay products of the W± .

Boson pair production These processes describe the production of pairs of EW bosons,
also including LO correlations for 4-lepton final states [25].

• ff̄′ → γ∗/Z γ∗/Z

• ff̄′ → Z W±

• ff̄ → W+ W−

Notice that for the second process, no contribution from a virtual photon is included. In addition,
it is possible to produce EW bosons in the parton shower as described in section 4.1.4 and
section 4.2.4. Therefore, a full description of EW-boson pair production might require a combination
of different processes, with some additional care to avoid possible double counting.

Boson and parton production These processes produce events where an EW boson is produced
in association with a parton, where the latter in this case refers either to a quark, gluon, photon,
or lepton. The possible channels are:

• qq̄ → γ∗/Z g

• qg → γ∗/Z q

• ff̄ → γ∗/Z γ

• fγ → γ∗/Z f

• qq̄ → W± g

• qg → W± q

• ff̄ → W± γ

• fγ → W± f

Again, there will be overlap with the single-boson production channels and the appropriate process
depends on the final state and considered kinematics. For fully inclusive EW-boson production,
the single-boson class is the relevant one, but for the high-p⊥ tail these processes would provide
more accurate kinematics. These processes should also be favoured when EW-boson production
is studied with an associated high-p⊥ jet or lepton.

3.2.3 Photon collisions


Many modern PDF sets include perturbatively-generated photons as a constituent of protons. In
addition, all electrically-charged particles accelerated to high energies may emit photons that act
as initiators for hard processes. The difference between these two cases is that in the former case,
the beam hadron will be resolved, whereas in the latter case, the beam particle will stay intact.
The following two-photon initiated processes are included in PYTHIA 8.3 and can be applied for
resolved and unresolved beams:

• γγ → qq̄

• γγ → cc̄

• γγ → bb̄

• γγ → e+ e−

• γγ → µ+ µ−

• γγ → τ+ τ−

3.2.4 Photon-parton scattering


A few processes with a photon and a parton as initiators have been included. These are mainly
relevant for photoproduction in ep collisions but can also be applied to other collision types with
beams that may provide photons and partons. As with pure-QCD processes with light partons,
these processes also contain collinear and soft divergences, so a suitable phase-space cut, e.g. on
partonic p⊥ , must be applied to obtain finite cross sections. Unlike for the pure-QCD processes, no
regularized description applicable at any p⊥ is present for any of the photon-initiated processes.
The included processes for photon-parton collisions are:

• gγ → qq̄

• gγ → cc̄

• gγ → bb̄

• qγ → qg

• qγ → qγ

Here, the heavy-quark pair production processes contain the full mass dependence in the matrix
elements. Similar to the case of pure-QCD processes, at high-enough collision energies heavy
quarks, at least charm, may be produced via parton-shower emissions so several processes might
need to be considered to obtain realistic heavy-quark production rates.

3.3 Onia
Hard processes involving charmonium and bottomonium are provided using Non-Relativistic
Quantum Chromodynamics (NRQCD) [26], which includes both colour-singlet and colour-octet
contributions. The spectroscopic notation $^{2s+1}L_J$ specifies the quantum numbers needed to define a
state: spin s, orbital angular momentum L, and total angular momentum J. Processes are available
for the $^3S_1$, $^3P_J$, and $^3D_J$ states containing cc̄ or bb̄, given an arbitrary radial excitation n, e.g. any
Υ(nS) for the $^3S_1$ bb̄ onia states. Double onium production is also available for double-$^3S_1$ cc̄ and
bb̄ processes, but only with colour-singlet contributions provided. Because of the long-standing
discrepancy between polarization in data and NRQCD predictions, only unpolarized processes are
provided, with isotropic decays, which can then be reweighted accordingly by the user for a given
polarization model.
Within NRQCD, the inclusive cross-section for a heavy onium state, H, can be written as,

$$ d\sigma(pp \to H + X) = \sum_{s,L,J} d\hat{\sigma}(pp \to Q\bar{Q}[^{2s+1}L_J] + X)\, \langle \mathcal{O}^H[^{2s+1}L_J] \rangle \tag{56} $$

where the cross section is factorized into a sum of products between short-distance matrix
elements, dσ̂, and long-distance matrix elements $\langle \mathcal{O}^H[^{2s+1}L_J] \rangle$. The short-distance matrix elements
are calculated with perturbative QCD [27–32], while the long-distance matrix elements are
determined from phenomenological fits to parameters [30,33,34]. Default settings for these parameters
are provided for the J/ψ, ψ(2S), χc0, χc1, χc2, ψ(3770), Υ(1S), Υ(2S), Υ(3S), χb0, χb1, and χb2
states.
The sum in eq. (56) for a given physical onium state $H[^{2s+1}L_J]$ is over the expansion of its
Fock states,

$$ |H[^{2s+1}L_J]\rangle = O(1)\, |Q\bar{Q}[^{2s+1}L_J^{(1)}]\rangle + O(v)\, |Q\bar{Q}[^{2s+1}(L \pm 1)_{J'}^{(8)}]\, g\rangle + O(v^2)\, |Q\bar{Q}[^{2s+1}(L \pm 1)_{J'}^{(8)}]\, gg\rangle + \ldots \tag{57} $$

where the superscript (1) indicates a colour-singlet state, (8) a colour-octet state, and the ex-
pansion is in the velocity v of the heavy-quark system. Consequently, a long-distance and short-
distance matrix element must be provided for each state in the expansion.
For the physical $^3S_1$ states the following processes are available.

• $gg \to c\bar{c}(^3S_1)[^3S_1^{(1)}]\, g$

• $gg \to c\bar{c}(^3S_1)[^3S_1^{(1)}]\, \gamma$

• $gg \to c\bar{c}(^3S_1)[^3S_1^{(8)}]\, g$

• $qg \to c\bar{c}(^3S_1)[^3S_1^{(8)}]\, q$

• $q\bar{q} \to c\bar{c}(^3S_1)[^3S_1^{(8)}]\, g$

• $gg \to c\bar{c}(^3S_1)[^1S_0^{(8)}]\, g$

• $qg \to c\bar{c}(^3S_1)[^1S_0^{(8)}]\, q$

• $q\bar{q} \to c\bar{c}(^3S_1)[^1S_0^{(8)}]\, g$

• $gg \to c\bar{c}(^3S_1)[^3P_J^{(8)}]\, g$

• $qg \to c\bar{c}(^3S_1)[^3P_J^{(8)}]\, q$

• $q\bar{q} \to c\bar{c}(^3S_1)[^3P_J^{(8)}]\, g$

The $^3P_J^{(8)}$ Fock states are a summation of contributions for J = 0, 1, 2. The $^3P_1^{(8)}$ and $^3P_2^{(8)}$
long-distance matrix elements are calculated from the $^3P_0^{(1)}$ long-distance matrix element.
The following processes are available for the physical $^3P_J$ states, again with J = 0, 1, 2.

• $gg \to Q\bar{Q}(^3P_J)[^3P_J^{(1)}]\, g$

• $qg \to Q\bar{Q}(^3P_J)[^3P_J^{(1)}]\, q$

• $q\bar{q} \to Q\bar{Q}(^3P_J)[^3P_J^{(1)}]\, g$

• $gg \to Q\bar{Q}(^3P_J)[^3S_1^{(8)}]\, g$

• $qg \to Q\bar{Q}(^3P_J)[^3S_1^{(8)}]\, q$

• $q\bar{q} \to Q\bar{Q}(^3P_J)[^3S_1^{(8)}]\, g$

Similar to the $^3P_J^{(8)}$ states, the colour-singlet $^3P_1^{(1)}$ and $^3P_2^{(1)}$ long-distance matrix elements are
calculated from the $^3P_0^{(1)}$ long-distance matrix element.
For physical $^3D_J$ production, the following processes are provided:

• $gg \to Q\bar{Q}(^3D_J)[^3D_J^{(1)}]\, g$

• $gg \to Q\bar{Q}(^3D_J)[^3P_J^{(8)}]\, g$

• $qg \to Q\bar{Q}(^3D_J)[^3P_J^{(8)}]\, q$

• $q\bar{q} \to Q\bar{Q}(^3D_J)[^3P_J^{(8)}]\, g$

The colour-octet $^3P_J^{(8)}$ contributions are treated in the same fashion as for the physical $^3S_1$ state.
Finally, double onium production is available for any arbitrary same-flavour $^3S_1$ configuration.

• $gg \to Q\bar{Q}(^3S_1)[^3S_1^{(1)}]\; Q\bar{Q}(^3S_1)[^3S_1^{(1)}]$

• $q\bar{q} \to Q\bar{Q}(^3S_1)[^3S_1^{(1)}]\; Q\bar{Q}(^3S_1)[^3S_1^{(1)}]$

The default configuration for double-onium production is to provide all possible combinations of
the same-flavour physical 3 S1 states.
Many of the short-distance matrix elements diverge as p⊥ → 0, and consequently must be
regulated either with a hard cutoff or a smooth damping factor. Onium can be produced in a
hard process, but also in multiparton interactions, except for double onium. In a hard process, a
hard cutoff in p⊥0 is used, although it is also possible to implement smooth damping through a
user defined scheme, see section 9.7.2. In multiparton interactions, instead, a smooth damping
is performed with a cutoff scale p⊥0 for a given energy E0 with an evolution parameter. See
section 6.2 for more details.
The colour-octet states are defined in the event record using a non-standard numbering scheme,
99$n_q n_s n_r n_L n_J$, where $n_q$ is the quark flavour of the state and $n_s$ is the colour-octet state type.
Here, 0 is $^3S_1$, 1 is $^1S_0$, and 2 is $^3P_J$. All remaining numbers follow the standard Particle Data
Group (PDG) numbering scheme [35, sec. 45]. As an example, 9941003 is the $^1S_0^{(8)}$ cc̄ colour-octet
state for the colour-singlet J/ψ. After the parton shower and hadronization, all colour-octet
states are forced to isotropically decay into their corresponding physical colour-singlet state and
a soft gluon. A user-configurable mass splitting is used to set the mass of the colour-octet states
for a given colour-singlet. This determines the softness of the gluon emitted in the octet to singlet
transition.
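
The numbering scheme above can be unpacked digit by digit. The sketch below is a standalone Python helper, not a PYTHIA function; its name and return format are hypothetical. The mapping back to the colour-singlet code assumes the usual PDG meson layout, in which the digits n_r and n_L precede the two quark digits and n_J, which reproduces 443 for the J/ψ in the example from the text.

```python
# Colour-octet state types as listed in the text (the n_s digit).
OCTET_TYPES = {0: "3S1(8)", 1: "1S0(8)", 2: "3PJ(8)"}


def decode_octet_id(code):
    """Split a colour-octet onium code 99 nq ns nr nL nJ into its digits."""
    digits = str(code)
    if len(digits) != 7 or not digits.startswith("99"):
        raise ValueError(f"{code} is not of the form 99 nq ns nr nL nJ")
    nq, ns, nr, nl, nj = (int(d) for d in digits[2:])
    if ns not in OCTET_TYPES:
        raise ValueError(f"unknown colour-octet state type {ns}")
    # Assumed PDG meson layout for the physical singlet: nr nL nq nq nJ.
    singlet_id = nr * 100000 + nl * 10000 + nq * 110 + nj
    return {
        "quark_flavour": nq,
        "octet_type": OCTET_TYPES[ns],
        "singlet_id": singlet_id,
    }


# The example from the text: the 1S0(8) c-cbar octet state for the J/psi.
info = decode_octet_id(9941003)
```

Here `info["singlet_id"]` evaluates to 443, the PDG code of the J/ψ, and `info["octet_type"]` to the 1S0 octet label.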
Colour-octet states are allowed to evolve under the timelike QCD parton shower, see section 4
for more details on parton showers. This is meant to account for the competing effects of unbound
QQ̄ states that emit additional gluons to become a semi-bound state, and semi-bound QQ̄ states
that are broken apart by additional gluon radiation. The combination is approximated by allowing
the colour-octet states to radiate in the parton shower with twice the q → qg splitting probability.
Both the probability of a colour-octet state being considered in the parton shower and the pre-
factor for the splitting kernel can be configured.
This treatment of colour-octet production in the parton shower is a simplification. The colour-
octet state can be treated as a gluon, and so a factor of 9/4 rather than 2 may be used. Using a
q → qg splitting kernel, rather than g → gg, is roughly equivalent to always following the path of
the harder gluon, resulting in harder final-state onia. Additionally, soft gluons producing heavy-
quark pairs will not have sufficient phase space to produce hard semi-bound states. However,
after the g → QQ̄ splitting, each heavy quark carries only approximately half the onium energy,
reducing the energy of the gluon emissions. In principle, these two effects between softer and
harder gluon emissions should approximately balance. However, comparisons to measurements
of prompt J/ψ production in jets from pp collisions indicate that this treatment underestimates
the local radiation surrounding onia [36, 37].

3.4 Top production


Standard model top production has now been part of standard measurements for over two decades
and state-of-the-art experimental observations now make use of higher-order calculations. How-
ever, we still maintain a minimum set of top-production processes that can be used either with a
K-factor for quick testing or for designing searches for non-standard decay modes by modifying
the top-decay table by hand.
Production processes available include:

• gg → tt̄

• qq̄ → tt̄

• ff̄ → tt̄ (via t-channel W or s-channel Z/γ separately)

• γγ → tt̄

• gγ → tt̄

• qq̄′ → q̄″t (single top via s-channel W)

It may be possible, for example, to test for the production of charged Higgses in top decays by
adding the decay mode t → bH+ to the decay table and using the BSM Higgs sector (see the next
section for details of setting BSM Higgs parameters).

3.5 Higgs
PYTHIA includes the capability of simulating production of standard model or BSM Higgses via the
Two–Higgs Doublet Model (2HDM). The production processes for the SM Higgs include:

• ff̄ → H

• gg → H (via 1-loop)

• qg → Hq (via 1-loop)

• γγ → H (via 1-loop)


• ff̄ → Z H (via s-channel Z)

• ff̄ → W± H (via s-channel W)

• ff̄ → Hff̄ (vector-boson fusion, ZZ and W± W± can be selected separately)

• ff̄ → HQQ̄ where Q = b, t

BSM Higgses can be produced using Higgs:useBSM = on. To allow for CP-violating cases, the
neutral Higgses are named H1, H2, A3, which in the CP-conserving case refer to two scalar and
one pseudoscalar Higgs, respectively. The neutral Higgses are ordered by mass. All processes
mentioned above for the SM Higgs are also available for BSM ones by replacing H with the required
BSM Higgs name. Further processes available for BSM Higgses are the pair-production processes.

• ff̄ → H1,2 A3

• ff̄ → H+ H1,2

• ff̄ → H+ H−

The couplings of each Higgs boson to SM fermions can be set independently to account for all
possible 2HDM structures. Further selection of the parity of each Higgs is also possible. We refer
the user to the online manual for a description of each parameter.
The decay of the Higgses is also calculated automatically based on input parameters. The
decay table can be overwritten by the user, either using the PYTHIA settings structure or using
the SLHA interface (see section 10.1.2). Since the LHC cross-section working group recommends
the usage of Next–to–Leading Order (NLO) decay widths, we use a multiplicative factor for all
internally-calculated widths. The factor is calculated for mH = 125 GeV, but should be sensible for
a range of masses. Furthermore, the Breit–Wigner shape of the Higgs resonance is complicated
by the mass dependence of the width. For resonance searches, it may be useful to “clip the wings” of
the Breit–Wigner shape using the Higgs:clipWings and Higgs:wingsFac (the number of widths
away from the nominal mass beyond which to clip) parameters.
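
The effect of clipping the wings can be pictured as a simple mass window. The sketch below is a standalone Python illustration, not PYTHIA code; it assumes a symmetric window of wings_fac widths around the nominal mass, and the function name and the numbers are hypothetical.

```python
def within_wings(mass, nominal_mass, width, wings_fac):
    """True if a generated mass lies within wings_fac widths of the nominal mass."""
    return abs(mass - nominal_mass) <= wings_fac * width


# Hypothetical numbers in GeV: a narrow resonance clipped at 50 widths.
nominal, width, fac = 125.0, 0.004, 50.0
kept = within_wings(125.1, nominal, width, fac)     # 0.1 GeV is inside the 0.2 GeV window
dropped = within_wings(126.0, nominal, width, fac)  # 1 GeV is outside the window
```

Masses for which the function returns False correspond to the clipped part of the Breit–Wigner shape.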

3.6 Supersymmetry
The implementation of the Minimal Supersymmetric Standard Model (MSSM) allows fully
general, complex 6 × 6 mixing in the squark sector, and up to five neutral gauginos (corresponding to
next-to-minimal MSSM). We also allow all four kinds of R-parity violating couplings (one bi-linear
and three tri-linear). Users are expected to input parameters via an SLHA file (see section 10.1.2).
Typically, the Higgs sector of Supersymmetry (SUSY) is identical to a type-2 2HDM model and can
be generated via the Higgs processes described above. PYTHIA is also capable of calculating decay
widths in the standard channels for all SUSY particles. However, if a decay table is provided in
the SLHA file, the internal calculation is turned off. For very low-width particles, the lifetime is
set as the inverse of the total decay width. All particles with a decay width set to zero are set as
stable.
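
The relation between lifetime and total width used above is τ = ħ/Γ. The following standalone Python sketch, which is not PYTHIA code, converts a total width in GeV into a proper decay length cτ in mm; the function name is hypothetical, and returning None for a vanishing width is simply our way of marking a stable particle.

```python
# hbar * c in GeV * mm (197.3269804 MeV fm expressed in these units).
HBARC_GEV_MM = 1.973269804e-13


def ctau_mm(total_width_gev):
    """Proper decay length c*tau in mm for a given total width in GeV.

    Returns None for a vanishing width, i.e. a particle treated as stable.
    """
    if total_width_gev <= 0.0:
        return None
    return HBARC_GEV_MM / total_width_gev


# A width of about 2e-13 GeV corresponds to a decay length of about 1 mm.
length = ctau_mm(1.973269804e-13)
```

This makes the scale explicit: only widths far below typical resonance widths lead to macroscopic decay lengths.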
Pair production of squarks (q̃i), gluinos (g̃), and gauginos (χ̃⁰j, χ̃±), including pairs like squark-
gluino, squark-gaugino, and gluino-gaugino, are implemented with EW contributions. We also
implement resonant production of squarks via R-Parity Violating (RPV) λ″ couplings, with
corresponding modification to showering and hadronization to include the new colour structure. Here
follows a full list of the processes available.

• squark-pair production (including anti-squarks and EW interference)

– ff̄ → q̃i q̃j^(∗)
– gg → q̃i q̃j^(∗)

• gluino pair and gluino-squark production

– qq̄ → g̃ g̃
– gg → g̃ g̃
– qi g → q̃i g̃ (and charge conjugate)

• electroweak-gaugino pair production

– ff̄ → χ̃ ± χ̃ ±
– ff̄ → χ̃i0 χ̃ 0j
– ff̄ → χ̃ ± χ̃ 0j

• gaugino-gluino and gaugino-squark production

– ff̄ → g̃ χ̃∓
– ff̄ → g̃ χ̃⁰i
– ff̄ → q̃j^(∗) χ̃∓
– ff̄ → q̃j^(∗) χ̃⁰i

• slepton or sneutrino-pair production

– ff̄ → ℓ̃i ℓ̃j^(∗)

• resonant production of a squark via an R-parity violating process

– qi q̄ j → q̃k (RPV)

Further selection of what processes to turn on is also possible by specifying individual PDG
IDs of particles. All supersymmetric particles are given PDG codes greater than 1000000, with the
superpartners generally carrying the corresponding code to their SM partner, e.g. an up quark is 2
and the two up squarks are named 1000002 and 2000002. The full list of PDG codes is available
in the published review [35, sec. 45].
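
The pattern in the up-squark example can be written as a one-line rule. The sketch below is a standalone Python illustration, not PYTHIA code; the function names are hypothetical, and the rule is deliberately restricted to the six quark flavours, since other superpartners do not all follow the same two-code pattern.

```python
def squark_ids(quark_id):
    """Return the two squark codes associated with a quark code 1-6."""
    if quark_id not in range(1, 7):
        raise ValueError("expected a quark code between 1 and 6")
    return (1000000 + quark_id, 2000000 + quark_id)


def is_susy_code(pdg_id):
    """True if the code lies above 1000000, as stated for SUSY particles."""
    return abs(pdg_id) > 1000000


# The example from the text: the up quark (2) and its two squarks.
up_squarks = squark_ids(2)
```

For the up quark this gives the pair (1000002, 2000002) quoted above.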

3.7 Hidden valley


Hidden Valley (HV) refers to a range of scenarios characterized by a gauge-symmetric dark sector
with various possibilities of portals into the “valley”. PYTHIA currently is the only general-purpose
Monte Carlo code that implements a HV scenario, including running of gauge couplings, show-
ering, and hadronization in the dark sector [38, 39]. There are multiple particle spectra and
production modes available which together can cover a wide range of phenomenology.
First, based on the rank N of the dark SU(N), radiation to either dark photons (i.e. U(1)) or
dark gluons (i.e. SU(N)) is implemented. The matter content is modelled in two separate ways:
first via partners of the SM fermions (named dark-u, dark-d, dark-e and so on) that carry the
SM charges of their partner and also transform in the fundamental representation of the dark SU(N),
and second via a “dark quark” that carries only the dark charge but does not carry any SM charges.
In the first case, dark sector particles can be produced via normal SM gauge bosons and radiate to
both SM and dark bosons based on their mass and relative strengths of the dark and SM couplings.
In the second case, we implement an extra Z′ portal to produce said quarks via a kinetic mixing with
the SM photon. The spin of the dark-sector particles (aside from the gauge bosons) can be set by the
user to be either scalars, fermions, or vectors.
Two kinds of models are available in PYTHIA, depending on the charge of the new fermions.
The first case is where the new fermions also carry some standard-model charge and can therefore
be produced via one of the standard-model gauge bosons. The radiation of the final-state fermions
then includes both dark-sector radiation as well as SM radiation. The processes that fall in this
category include:

• gg → Fv F̄v via intermediate gluon where Fv is the hidden sector quark, either one of the
quarks Uv, Dv, Sv, Cv, Bv, Tv, or generic quark Qv

• qq̄ → Fv F̄v via intermediate gluon where Fv is the hidden sector fermion, either one of
the quarks Uv, Dv, Sv, Cv, Bv, Tv, or generic quark Qv

• ff̄ → Fv F̄v via intermediate Z or γ∗. Fv includes all the quarks above, plus the “leptons” Ev,
νEv, and similarly for µ and τ flavours

It is possible to simulate a hidden sector where the new fermions do not carry any SM charges,
but in this case, we need a new portal, which PYTHIA assumes is a new vector Z′. This Z′ is then
expected to be able to decay to both SM fermions as well as dark-sector fermions. Pair production
of dark-sector fermions via this portal can be done using:

• qq̄ → Zv followed by Zv → Fv F̄v

An important phenomenological effect is the running of the hidden-sector strong coupling,
which can make significant changes to the radiation pattern in the dark sector. This is by default
taken into account by using the one-loop beta function of SU(N) once the number of colours and
flavours of new fermions is set. The running can also be turned off by the user to use a fixed-
coupling value instead. There is also an inherent ambiguity in the composition of the hadrons
in the dark sector. PYTHIA allows the user to manually set the ratio of scalar to vector mesons
as well as the parameters of the Lund symmetric fragmentation function for the dark sector (see
section 7.1 for details of the fragmentation functions). The decay table of the hidden mesons back
into the standard model (should this be desirable) can be set by the user at run time using the
standard particle data scheme that PYTHIA uses for all particles.

3.8 Dark matter


Multiple models for Dark Matter (DM) are currently implemented in PYTHIA. They may be
separated into two different categories: production via an s-channel mediator and production via pair
production of mediators (typically seen in co-annihilation or co-scattering scenarios of DM). In all
cases, the DM is assumed to be fermionic. We provide the possibility to produce dark matter with
one associated jet for the s-channel models (vector or axial-vector Z′ and scalar or pseudoscalar A).
For the mediator pair-production processes, all mediators are produced via Drell–Yan production.
The PDG provides some standard codes for common DM particles and mediators, cf. [35,
sec. 45]. Of these, the fermionic DM code 52, the s-channel scalar mediator (54) and vector
mediator (55) are used in this implementation. The new mediators are either a charged scalar
(with PDG code 56), a charged vector-like fermion (PDG code 57), or a doubly charged fermion
(PDG code 59). The neutral partner that accompanies the charged mediators is given the PDG
code 58.
The singlet model contains a scalar singlet with quantum numbers identical to a right-handed
lepton. Therefore, it couples via a Yukawa-like coupling to a SM right-handed lepton and the DM
is a Dirac fermion. Both the scalar and the DM are odd under a Z2 symmetry to ensure the stability
of the DM.

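$$ \mathcal{L} = \partial_\mu \phi^* \partial^\mu \phi + \bar{\chi}(i\gamma^\nu \partial_\nu \chi) - m_\phi^2 |\phi|^2 - m\bar{\chi}\chi - (y_\ell \bar{\ell}\phi\chi + \mathrm{h.c.}). \tag{58} $$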

The fermionic mediators are based on models that have mixing between a singlet and an n-plet
vector fermion, both charged under a Z2 symmetry for which all SM particles are even.
The mixing between the singlet and n-plet is then calculated based on the value of n. The lightest
neutral state is denoted as dark matter.

$$ \mathcal{L} = \bar{\chi}(i\gamma^\nu \partial_\nu \chi) + \bar{\psi}(i\gamma^\nu D_\nu \psi) - m_1 \bar{\chi}\chi - m_2 \bar{\psi}\psi. \tag{59} $$

The mixing term depends on the representation of ψ. For example, for a triplet case, we have

$$ \mathcal{L}_{\mathrm{mix}} = \frac{c}{\Lambda^2}\, \bar{\chi}\, (\Phi_h^\dagger \tau^a \Phi_h)\, \psi^a + \mathrm{h.c.}, \tag{60} $$

where Φh is the SM-Higgs doublet, ψ is the triplet fermion, τ^a are the Pauli matrices and χ is the
singlet fermion.
Production of DM can be studied in two ways: either by directly producing the s-channel
mediator, which then decays to DM, or by producing the charged partner of DM via Drell–Yan
followed by the decay of the partner. The production processes therefore are

• qq̄ → Z′ → χ̄χ

• gg → S → χ̄χ, note that 1-loop gg → S via top-loop is included in this production.

• qq̄ → Z′ g (mono-jet)

• gg → S g (also mono-jet, via 1-loop in production)

• qq̄ → Z′ H, i.e. mono-Higgs production (coupling of the SM to the new Z′ has to be set by
the user)

• ff̄ → ψψ̄ where ψ = ℓ̃± (scalar with leptonic quantum numbers), χ± (singly charged
fermion), or χ±± (doubly charged fermion), followed by decay of ψ into DM (Drell–Yan
for charged partners)

Couplings of quarks and leptons to the mediators are assumed to be generation universal; however,
the vector and axial-vector components (or equivalently scalar and pseudoscalar components for the
scalar mediator) can be set individually for up type, down type, charged lepton, neutrino, and
dark-matter fermions. In the case of the Z′, it is also possible to choose a kinetic-mixing parameter ε
which then automatically sets the rest.


3.9 Other exotica


Finally, we mention other models of new physics that are implemented in PYTHIA, though they
are perhaps not as popular as they once were. We refer the reader to the online manual for the
detailed descriptions of the model parameters and only provide a list here.

• Fourth generation includes production of fourth-generation quarks or leptons via the usual
SM-mediated processes.

• New gauge boson Z′, W′ and horizontal gauge boson production can be performed
through ff̄ → V production followed by decay. For Z′, full interference with SM γ, Z in the
s-channel production is taken into account. It is possible to have both universal and
non-universal models where couplings to each generation should be set by hand by the user.

• The left-right symmetry model includes a right-handed SU(2) sector. Along with the gauge
bosons W′ and Z′, it also includes heavy right-handed neutrinos that can be used to study
signatures of heavy neutral leptons.

• Leptoquark production includes resonant single production or pair production of scalar
leptoquarks via gluon-mediated diagrams. The flavour of the leptoquark should be set by
the user by explicitly setting the decay table of the leptoquark.

• Compositeness models include simple models of excited fermions and contact interactions
that modify standard QCD dijet or Drell–Yan production of leptons.

• Extra dimensions includes production of the graviton or the extra Kaluza–Klein gauge
boson (e.g. KK-gluon) of the Randall–Sundrum model. Further processes include modification
of SM dijet/dilepton production due to extra KK-bosons in the TeV-scale or large extra
dimension models. Finally, unparticle emission in association with a jet or photon is modelled.

3.10 Couplings and scales for internal processes


The perturbatively calculated cross sections for QCD and Quantum Electrodynamics (QED) pro-
cesses depend directly on the value of the relevant coupling evaluated at the scale at which the
hard scattering occurs. The scale dependence of the couplings arises due to the renormalization
procedure required to obtain finite cross sections and can be calculated by solving the renormal-
ization group equations of the applied theory.
In PYTHIA 8.3 the running of the QCD coupling, αs(Q²), is implemented up to second order and
applied at first order by default to match the precision of the internally-calculated cross sections.
A fixed value can also be used, but the potential usage is limited to special cases and generally a
running coupling should be applied for realistic cross-section estimates. The coefficients related
to the value of the coupling are fixed by setting the αs(Q²) value at the mass of the Z boson.
Similarly, running of the QED coupling αem(Q²) has been implemented in PYTHIA 8.3. This,
however, runs much more slowly than the QCD one and only first-order running is implemented. An
option to use a fixed value for αem(Q²) is included, either by setting the value directly at the mass
of the Z boson or by matching to its value at vanishing momentum transfer. In addition, it is
possible to globally scale the cross sections with a K-factor if such behaviour is desired.
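
To make the role of the reference value at the Z mass concrete, the sketch below evaluates the textbook first-order running of the strong coupling. It is a standalone Python illustration and not the PYTHIA implementation: it assumes a fixed number of five flavours with no threshold matching, and the default reference value 0.118 and the Z mass are illustrative inputs rather than PYTHIA defaults.

```python
import math

MZ_GEV = 91.1876  # Z-boson mass used as the reference scale


def alpha_s_one_loop(q2, alpha_s_mz=0.118, n_f=5):
    """First-order running coupling at scale q2 (GeV^2), fixed n_f flavours."""
    b0 = (33.0 - 2.0 * n_f) / (12.0 * math.pi)
    log_ratio = math.log(q2 / MZ_GEV**2)
    return alpha_s_mz / (1.0 + b0 * alpha_s_mz * log_ratio)


# The coupling reproduces the input at the Z mass and decreases above it.
at_mz = alpha_s_one_loop(MZ_GEV**2)
at_1tev = alpha_s_one_loop(1000.0**2)
```

Changing the reference value at the Z mass rescales the coupling at all other scales, which is why a single number is enough to fix the running.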
There are two relevant scales that need to be set. The renormalization scale, $Q^2_{\mathrm{ren}}$, arises
from the renormalization procedure and defines at which scale the couplings are evaluated. The
factorization scale, $Q^2_{\mathrm{fact}}$, arises from factorizing the short-distance phenomena (hard scattering)
from the large-distance (soft) structure of hadrons. This scale determines at which Q² the PDFs
of resolved beams are probed.
As the scale dependencies arise from an approximated description of QCD, there is some
amount of freedom in the scale choices. The only solid guideline is that the scales should be
related to the hardness of the scattering process and therefore the optimal choice depends on the
type of the studied process. Multiple options for the scale choices have been implemented into
PYTHIA 8.3, and all options are available for both $Q^2_{\mathrm{ren}}$ and $Q^2_{\mathrm{fact}}$.
For 2 → 1 processes two options exist:

• the squared invariant mass, ŝ, i.e. the squared mass of the produced particle;

• and a fixed scale.

For 2 → 2 a few more options are included:

• the smaller of the squared transverse masses of the outgoing particles, $\min(m^2_{\perp,3}, m^2_{\perp,4})$;

• the geometric mean of the squared transverse masses of the outgoing particles, $m_{\perp,3} \cdot m_{\perp,4}$;

• the arithmetic mean of the squared transverse masses of the outgoing particles,
$(m^2_{\perp,3} + m^2_{\perp,4})/2$;

• the squared invariant mass of the system, ŝ, relevant for s-channel processes;

• the squared invariant momentum transfer −t̂, relevant especially for DIS events as this
coincides with the virtuality of the intermediate photon Q²;

• and a fixed scale.

For 2 → 3 processes the possible choices are:

• the smallest of the squared transverse masses of the outgoing particles, $\min(m^2_{\perp,3}, m^2_{\perp,4}, m^2_{\perp,5})$;

• the geometric mean of the two smallest squared transverse masses of the outgoing particles,
$\sqrt{m^2_{\perp,3} \cdot m^2_{\perp,4} \cdot m^2_{\perp,5} / \max(m^2_{\perp,3}, m^2_{\perp,4}, m^2_{\perp,5})}$;

• the geometric mean of the squared transverse masses of the outgoing particles,
$(m^2_{\perp,3} \cdot m^2_{\perp,4} \cdot m^2_{\perp,5})^{1/3}$;

• the arithmetic mean of the squared transverse masses of the outgoing particles,
$(m^2_{\perp,3} + m^2_{\perp,4} + m^2_{\perp,5})/3$;

• the squared invariant mass of the system, ŝ, relevant for s-channel processes;

• and a fixed scale.
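
The scale options listed above for 2 → 2 and 2 → 3 processes can be written out explicitly. The sketch below is a standalone Python illustration and not PYTHIA code; the function names and dictionary keys are hypothetical, and the squared transverse mass is assumed to be m² + p⊥² for each outgoing particle.

```python
import math


def mt2(mass, pt):
    """Squared transverse mass, assumed here to be m^2 + pT^2."""
    return mass**2 + pt**2


def scale_options_2to2(mt2_3, mt2_4, s_hat, t_hat, fixed):
    """Candidate squared scales for a 2 -> 2 process."""
    return {
        "min": min(mt2_3, mt2_4),
        "geometric": math.sqrt(mt2_3 * mt2_4),
        "arithmetic": 0.5 * (mt2_3 + mt2_4),
        "s_hat": s_hat,
        "minus_t_hat": -t_hat,
        "fixed": fixed,
    }


def scale_options_2to3(mt2_values, s_hat, fixed):
    """Candidate squared scales for a 2 -> 3 process."""
    a, b, c = mt2_values
    product = a * b * c
    return {
        "min": min(a, b, c),
        # Dividing by the largest value leaves the product of the two smallest.
        "geometric_two_smallest": math.sqrt(product / max(a, b, c)),
        "geometric": product ** (1.0 / 3.0),
        "arithmetic": (a + b + c) / 3.0,
        "s_hat": s_hat,
        "fixed": fixed,
    }


# Hypothetical squared transverse masses in GeV^2.
opts = scale_options_2to3((4.0, 9.0, 16.0), s_hat=100.0, fixed=25.0)
```

With these inputs the geometric mean of the two smallest values is the square root of 4 × 9, i.e. 6 GeV², which shows how the division by the maximum in the formula removes the largest transverse mass.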

For vector-boson-fusion (VBF) processes, such as f1 f2 → f3 H f4, the virtualities of the
intermediate bosons would not be accounted for with the above options, which would therefore likely
underestimate the relevant scales. Modified scale choices are therefore available in which, instead
of the transverse mass of the final-state particle, a virtuality estimate $m^2_{\perp,Vi} = m_V^2 + p^2_{\perp,i}$
is used in the options above when relevant.


Traditionally, the theoretical uncertainties related to the truncated pQCD expansion are esti-
mated by varying the QCD scales by a factor of two or so. To enable such variations, options to
multiply the scales determined by the options above by constant factors have been implemented.
In a basic form, these variations will, however, require generating a completely new set of events,
so mapping out all possible uncertainties might become computationally demanding. Therefore,
options to calculate weights for each event based on different scale variations have been imple-
mented in PYTHIA 8.3 for more efficient uncertainty estimation, see section 9.8 for details. Notice
also that the couplings and scales can be set separately for MPIs and initial- and final-state showers.

3.11 Handling of resonances and their decays


By default, the SM electroweak gauge bosons, top quarks, the Higgs boson, and generally all
BSM particles are classified as resonances. Note that all of these have on-shell masses above 20
GeV (with the exception of some hypothetical weakly interacting and stable particles such as the
gravitino, which are also considered resonances).
Importantly, neither hadrons nor any particles that can be produced in hadron decays, such
as τ leptons, are included in this category. The decays of such particles are performed after
hadronization, and changing their decay channels will not automatically affect the reported cross
section. For example, allowing only the decay Z → µ+ µ− will reduce the total cross section re-
ported by PYTHIA for hard processes like pp → Z by the appropriate branching fraction, while
allowing only the decay J/ψ → µ+ µ− will not change the cross section for gg → J/ψg. The
reason for this is that hadron and τ decays involve multistep chains that cannot be predicted
beforehand: a hard process like gg → gg can develop a shower with a g → bb̄ branching, where the
b hadronizes to a B̄0 that oscillates to a B0 that decays to a J/ψ. Any bias at the hard-process level
would not affect these other production mechanisms and could thus be misleading. Instead, the
user must consider all relevant production sources and perform their own careful bookkeeping.
Both types, “resonances” and “unstable particles”, can have Breit–Wigner distributed mass
spectra (at least when generated by internal PYTHIA processes); more on this below. For the
remainder of this subsection we focus on the production and decay of those particles that are
classified as resonances, referring to section 8 for the treatment of hadron and τ decays.
Note that the cross-section reduction factors to account for decay modes that have been switched
off are always evaluated at initialization, for nominal masses. For instance, in the example above,
the Z → µ+ µ− reduction factor is evaluated at the nominal Z mass, even when that factor is used,
later on, say in the description of the decay of a 125 GeV Higgs boson, where at least one Z would
be produced below this mass. We know of no case where this approximation has any serious
consequences, however.
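The bookkeeping described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not of PYTHIA's internals: the branching ratios are rounded illustrative numbers, the cross section is invented, and the helper name is hypothetical.

```python
def open_fraction(branching_ratios, open_channels):
    """Summed branching ratio of the decay channels that are left open."""
    return sum(branching_ratios[channel] for channel in open_channels)

# Rounded, illustrative Z branching ratios; not PYTHIA's internal values.
Z_BRANCHING = {
    "e+ e-": 0.034,
    "mu+ mu-": 0.034,
    "tau+ tau-": 0.034,
    "neutrinos": 0.200,
    "hadrons": 0.698,
}

sigma_all_channels = 1000.0  # pb, an invented number for pp -> Z

# Restricting the Z to muon pairs scales the reported cross section
# by the corresponding branching fraction.
factor = open_fraction(Z_BRANCHING, ["mu+ mu-"])
sigma_reported = sigma_all_channels * factor
```

No such factor would be applied for a restriction on a hadron or τ decay channel, since these are not classified as resonances.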
Note also that, for the specific case of electroweak showers (cf. section 4.1.4 and section 4.2.4),
the decays of any resonances that are produced by the shower (i.e. not by the hard process) are
treated inclusively, ignoring any user restrictions on which channels should be open or closed. It
is then up to the user to select the final states of interest and reject the rest.
Finally, a word of caution: the above logic implies that switching off all of the decay channels
of a resonance will result in cross sections evaluating to zero, precluding PYTHIA from being able
to generate any events. Instead, to force a resonance to be treated as stable for a given run, set
NN:mayDecay = false, with NN being its particle ID code.

Total and partial widths: For resonances, the partial widths to different decay channels are
typically perturbatively calculable, given the parameters of the respective model. By default, during
initialization PYTHIA therefore computes the hadronic widths of W, Z, t, and SM Higgs bosons at
NLO in QCD, with

$$
\Gamma^{\mathrm{NLO}}_{V \to q\bar{q}} = \left(1 + \frac{\alpha_s(m_V^2)}{\pi}\right) \Gamma^{\mathrm{LO}}_{V \to q\bar{q}} , \qquad
\Gamma^{\mathrm{NLO}}_{t \to bW} = \left(1 - \frac{5\alpha_s(m_Z^2)}{2\pi}\right) \Gamma^{\mathrm{LO}}_{t \to bW} , \tag{61}
$$

where V is a generic vector boson. For H0, the default is a set of channel-specific numerical NLO
rescaling factors recommended by the LHCXSWG [40], with current values given in table 1, valid
for a reasonable range around the nominal Higgs mass of mH = 125 GeV. Note also that PYTHIA 8
computes the LO partial widths for H0 → γγ and H0 → gg using running quark-mass values in
the loop integrals (evaluated at mH); this gives a non-negligible shift relative to PYTHIA 6, which
used pole-mass values in the same expressions. For comparisons, the LHCXSWG rescaling factors
can optionally be replaced by a simple (1 + αs/π) correction for the decays to quarks, and for the
loop-induced decays the running mass values can be replaced by pole ones.

SM H0 decay mode:      gg    γγ    γZ    ZZ    WW    bb̄    cc̄    µ+µ−   τ+τ−
NLO rescaling factor:  1.47  0.88  0.95  1.10  1.09  1.11  0.98  0.974  0.992

Table 1: Numerical correction factors applied to the LO SM-Higgs decay partial width,
based on LHCXSWG recommendations [40]. Note that the strong coupling is fixed to
αs = 0.12833 in this context.
For BSM resonances, PYTHIA applies the (1 + αs/π) factor to all integer-spin BSM particle
decays to quark-antiquark pairs and to semi-leptonic decays of right-handed neutrinos, while the
(1 − 5αs/(2π)) one is applied to t′ → qW decays.
At the technical level, these decay-rate calculations are performed by dedicated calcWidth()
methods in the derived ResonanceWidths class for the given resonance. Note that this means
that the tabulated widths for these particles stored in the program’s particle data table are purely
dummy values, overridden at initialization. To force a resonance with ID code NN to have a certain
user-defined width, Γ , set NN:doForceWidth = on and NN:mWidth = Γ . Input of resonance
widths via the SLHA interface is discussed separately below.
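The NLO rescalings of eq. (61) and table 1 amount to simple multiplicative factors, as the following Python sketch shows. The function names are hypothetical, the value of αs is an illustrative assumption (PYTHIA evaluates the coupling at the scale quoted in eq. (61)), and the LO widths would have to be supplied separately.

```python
import math

def nlo_width_vector_to_quarks(gamma_lo, alpha_s):
    """Eq. (61), first line: V -> q qbar width with the (1 + alpha_s/pi) factor."""
    return (1.0 + alpha_s / math.pi) * gamma_lo

def nlo_width_top_to_bw(gamma_lo, alpha_s):
    """Eq. (61), second line: t -> b W width with the (1 - 5 alpha_s/(2 pi)) factor."""
    return (1.0 - 5.0 * alpha_s / (2.0 * math.pi)) * gamma_lo

# Channel-specific SM-Higgs factors, copied from table 1.
HIGGS_NLO_FACTORS = {
    "gg": 1.47, "gamma gamma": 0.88, "gamma Z": 0.95, "ZZ": 1.10, "WW": 1.09,
    "b bbar": 1.11, "c cbar": 0.98, "mu+ mu-": 0.974, "tau+ tau-": 0.992,
}

def nlo_width_higgs(channel, gamma_lo):
    """Rescale an LO SM-Higgs partial width by its table 1 factor."""
    return HIGGS_NLO_FACTORS[channel] * gamma_lo

ALPHA_S = 0.118  # illustrative value only, an assumption of this sketch
vector_factor = nlo_width_vector_to_quarks(1.0, ALPHA_S)
top_factor = nlo_width_top_to_bw(1.0, ALPHA_S)
```

With this illustrative coupling the vector-boson hadronic width increases by roughly 4%, while the top width decreases by roughly 9%.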

Breit–Wigner modelling: We now turn to PYTHIA’s modelling of resonance shapes. Note that
this applies to resonances that are produced by PYTHIA (i.e. in PYTHIA’s internal hard processes
and/or in decays performed by PYTHIA). For externally generated ones, cf. section 10.1, it is the
responsibility of the external generator to model the shape of the produced resonances, though
PYTHIA’s modelling may still apply to any resonances produced by subsequent decays of particles
that are kept stable in the external process.
An important note in the specific context that an external generator is responsible not only for
resonance production, but also for one or more of their decays is that the total invariant mass of
the resonance-decay products (and hence the resonance shape) is only guaranteed to be preserved
during parton showering if an explicit resonance mother (with Les Houches status code 2) is
present in the externally provided event record. This is particularly relevant for any coloured
resonances (such as top quarks), for which the reconstructible resonance-mass distribution will
otherwise be impacted by unphysically large QCD recoil effects to parton(s) outside the resonance-
decay system. In principle, the same issue exists for QED recoil effects in decays of electrically
charged resonances.
The basics of phase-space generation and Breit–Wigner sampling in the context of processes
involving resonances were covered in section 2.3.3. As already mentioned there, decay-rate calcu-
lations specific to each given resonance and decay mode are the default for most SM-resonance de-
cays in PYTHIA as well as for some BSM ones, via process-specific SigmaProcess::weightDecay()
methods and resonance-specific ResonanceWidths::calcWidth() methods, enabled for decay
channels assigned meMode = 0. For resonances that include such channels, (49) of section 2.3.3
is generalized to
$$
\frac{1}{\pi} \, \frac{m \sum_j \Gamma_j(m)}{(m^2 - m_0^2)^2 + m^2 \Gamma_{\mathrm{tot}}^2(m)} , \tag{62}
$$
where both the partial widths Γ j and the total width Γtot are in principle allowed to depend on m.
There are two main sources of m dependence:

• Running couplings in the relevant matrix elements. This also applies e.g. to the NLO normal-
izations given by eq. (61), in which αs (m20 ) is replaced by αs (m2 ). The SM-Higgs resonance
is sufficiently narrow that no appreciable running effects are expected, hence the partial
widths given in table 1 are left unchanged.

• Threshold effects. For bosonic resonances (Z, W, H, and particles that are trivially related to
them such as Z′, W′, H+, and A bosons), decays to same-flavour fermion pairs are associated
with the following threshold factors:

$$
\Gamma(m) = \Theta(\hat{s} - 4m_f^2) \, \frac{m \Gamma_0}{m_0}
\begin{cases}
\beta^3 & \text{: scalar} \\
\beta & \text{: pseudoscalar} \\
\beta(3 - \beta^2)/2 & \text{: vector} \\
\beta^3 & \text{: axial-vector}
\end{cases} , \tag{63}
$$

where Γ0 is the on-shell partial width and $\beta = \sqrt{1 - 4m_f^2/m^2}$ is the fermion velocity in the
rest frame of the decay. Resonances that have both vector and axial-vector (or both scalar
and pseudoscalar) couplings use appropriate mixtures of these factors, and analogous but
more complicated expressions are used for decays into unequal masses e.g. of the W+ . For
other decays, the m dependence is typically more complicated.

We refer to the corresponding implementations in the weightDecay() and calcWidth() methods
in the code, which can be inspected for more details about the treatment of a given process
and/or decay mode, respectively.
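To make the structure of eqs. (62) and (63) concrete, the following Python sketch evaluates the threshold factors and the resulting mass-dependent Breit–Wigner shape. It is a simplified stand-in for the calcWidth() logic rather than a transcription of it: the function names are hypothetical, all listed channels are assumed to be open (so that the same sum enters the numerator and the total width), and ŝ in the step function is identified with m².

```python
import math

def fermion_velocity(m, m_f):
    """beta = sqrt(1 - 4 m_f^2 / m^2), set to zero below threshold."""
    arg = 1.0 - 4.0 * m_f ** 2 / m ** 2
    return math.sqrt(arg) if arg > 0.0 else 0.0

# Threshold factors of eq. (63) for decays to same-flavour fermion pairs.
THRESHOLD_FACTORS = {
    "scalar": lambda b: b ** 3,
    "pseudoscalar": lambda b: b,
    "vector": lambda b: b * (3.0 - b ** 2) / 2.0,
    "axial-vector": lambda b: b ** 3,
}

def partial_width(m, m0, gamma0, m_f, coupling):
    """Mass-dependent partial width of eq. (63); vanishes below threshold."""
    beta = fermion_velocity(m, m_f)
    return m * gamma0 / m0 * THRESHOLD_FACTORS[coupling](beta)

def breit_wigner(m, m0, partial_widths):
    """Eq. (62), assuming every channel in partial_widths is open."""
    gamma_tot = sum(partial_widths)
    return m * gamma_tot / (math.pi * ((m * m - m0 * m0) ** 2 + m * m * gamma_tot ** 2))

# Invented example: a vector resonance decaying to a light and a heavy fermion pair.
M0 = 91.2
def shape(m):
    widths = [partial_width(m, M0, 2.0, 0.0, "vector"),
              partial_width(m, M0, 0.5, 4.8, "vector")]
    return breit_wigner(m, M0, widths)
```

For a massless fermion pair the vector factor equals unity, so the partial width reduces to Γ0 at m = m0, while any channel is switched off entirely once m drops below twice the fermion mass.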

Decay angular distributions: In many cases, non-trivial angular distributions are encoded in
PYTHIA via process-specific LO matrix elements that include the relevant decays. For example, for
the hard process ff̄ → W+ W− (with f denoting a generic fermion), PYTHIA generates the angular
distributions for the two W decays at the same time, using the full ff̄ → W+ W− → 4-fermion
matrix elements.
This allows for an accounting of the effects of spin correlations between the production and
decay stages. Note, however, that only diagrams with the same resonant structure as the pro-
duction process are included; interference with background processes is not accounted for by this
method.

Using V to denote a generic weak boson (W± or Z0 , with the latter typically including γ∗ /Z
interference where relevant) and H to denote a generic neutral Higgs boson, processes for which
such matrix-element-corrected resonance-decay distributions are generated by PYTHIA 8.3 in-
clude:

• Decays of (unpolarized) top quarks: t → bW+ → b 2f.

• Electroweak decays of neutral Higgs bosons: H → VV → 4f and H → γZ → γ 2f, in both
cases allowing for generic (BSM) mixed-CP states.

• Electroweak resonant s-channel processes 2f → V → 2f. Note: this extends to BSM vector
bosons such as V′ and VR, and also includes the full γ∗/Z/Z′ interference for Z′ ones.

• Electroweak resonant 2 → 4 processes 2f → VV → 4f and 2f → HV → 4f. Also 2f → V′ → VV → 4f.

• W decays in ff̄ → g/γ W → g/γ 2f.

• BSM excited-graviton decays in 2f → G∗ and gg → G∗ processes, cf. [41].

• BSM compositeness excited-fermion decays in 2 → f∗ → g/γ f and 2 → f∗ → V f, with V
decaying isotropically for the latter.

A prominent example of a process that is absent from this list is top-quark pair production, implying
that internally generated tt̄ events in PYTHIA do not exhibit non-trivial correlations between the
two top decays. Note also that, for externally provided events (cf. section 10.1), only the top- and
Higgs-decay correlations in the two first points above are applied. When interfacing external hard
processes it is therefore important to consider whether, and how, resonance decays are treated by
the external generator.
At the technical level, these process-specific angular distributions are implemented via dedi-
cated weightDecay() methods in the derived SigmaProcess class for the given hard process.

Effects of PDFs on resonance shapes: Often, the observable resonance shape results from a
convolution with non-trivial parton distribution functions. For hadrons, these tend to be strongly
peaked towards small x, with a typical asymptotic behaviour roughly like f (x) ∝ 1/x. When
convoluted with the Breit–Wigner shape, this tilts the overall resonance shape; the parton-parton
luminosity is higher in the low-mass tail than it is in the high-mass tail.
If the low-mass enhancement is strong enough, the wide tails of the Breit–Wigner can even
lead to a secondary peaking of the cross section towards very low masses. This is obviously un-
physical, as the resonant approximation is invalid that far from the resonance, and non-resonant
background processes would anyway normally dominate in that region. The desire to cut away
such behaviour is one reason for the default choices made in PYTHIA for the mmin limits in eq. (49).
For non-standard PDFs, or when making user-defined modifications to the nominal mass and/or
width values (e.g. for BSM particles), it is up to the user to check that sensible mmin limits are
imposed.
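The tilt described above can be illustrated with a deliberately crude toy model in Python. Everything here is an assumption made for illustration: a fixed-width Breit–Wigner is used as a stand-in for eq. (49), each PDF is taken to be exactly f(x) = 1/x so that the parton-parton luminosity becomes ln(1/τ)/τ with τ = m²/s, and the resonance parameters are invented.

```python
import math

def breit_wigner(m, m0, gamma):
    """Fixed-width Breit-Wigner in m^2 (an assumed stand-in for eq. (49))."""
    return m0 * gamma / (math.pi * ((m * m - m0 * m0) ** 2 + m0 * m0 * gamma * gamma))

def toy_luminosity(tau):
    """Parton-parton luminosity for f(x) = 1/x: integral of dx/x f(x) f(tau/x)."""
    return math.log(1.0 / tau) / tau

def tilted_shape(m, m0, gamma, sqrt_s):
    """Toy resonance shape after convolution with the parton luminosity."""
    tau = (m / sqrt_s) ** 2
    return breit_wigner(m, m0, gamma) * toy_luminosity(tau)

# Invented wide resonance at a 13 TeV collider (all values in GeV).
M0, GAMMA, SQRT_S = 1000.0, 300.0, 13000.0

low_over_high_bare = breit_wigner(700.0, M0, GAMMA) / breit_wigner(1300.0, M0, GAMMA)
low_over_high_tilted = (tilted_shape(700.0, M0, GAMMA, SQRT_S)
                        / tilted_shape(1300.0, M0, GAMMA, SQRT_S))
far_tail_over_peak = tilted_shape(50.0, M0, GAMMA, SQRT_S) / tilted_shape(M0, M0, GAMMA, SQRT_S)
```

In this toy setup the low-mass side is enhanced relative to the high-mass side, and the far low-mass tail even exceeds the peak, which is the kind of unphysical secondary peaking that a sensible lower mass limit is meant to remove.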

Interleaved resonance decays: Rounding off the discussion of resonance production and de-
cays, PYTHIA also allows for interleaving resonance decays with the final-state shower evolution,
as described in ref. [23]. Currently, this is only done by default for the VINCIA shower model,
while it exists as a non-default option for PYTHIA’s simple showers.

When interleaved resonance decays are enabled, resonance decays are inserted into the final-
state shower evolution as 1 → n branchings, at a scale which by default is given by the following
measure of the off-shellness of the resonance propagator,

$$
Q^2_{\mathrm{RES}} = \frac{(m^2 - m_0^2)^2}{m_0^2} , \tag{64}
$$

with median value 〈Q RES〉 = Γ . (A few alternative choices are also offered, including an option
to use a fixed scale Q RES ≡ Γ .) As part of the resonance-decay branching process, a “resonance
shower” is also performed, in the region m0 > Q > Q RES. This shower stage only involves the
decaying resonance and its decay products, with no recoils to any other partons. Note that any
nested resonance decays associated with intermediate scales (e.g. the W boson produced in a
t → bW decay) are also performed during this stage, along with their corresponding resonance
showers, while any decays associated with scales below Q RES occur afterwards, sequentially.
The main consequence is that resonances are prevented from participating as emitters or re-
coilers for radiation at scales below Q RES; only their decay products can do that. We refer to
ref. [23] for further details.
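The statement that the median of the scale in eq. (64) equals the width can be checked numerically with a short Python sketch. The assumptions are ours: masses are drawn from a fixed-width Breit–Wigner in m², mass limits are ignored apart from requiring m² > 0, and the top-like values of m0 and Γ are purely illustrative.

```python
import math
import random
import statistics

def q_res(m, m0):
    """Off-shellness scale of eq. (64): Q_RES = |m^2 - m0^2| / m0."""
    return abs(m * m - m0 * m0) / m0

def sample_mass_squared(m0, gamma, rng):
    """Draw m^2 from a fixed-width Breit-Wigner via the tangent mapping."""
    theta = (rng.random() - 0.5) * math.pi
    return m0 * m0 + m0 * gamma * math.tan(theta)

rng = random.Random(12345)
M0, GAMMA = 173.0, 1.4  # GeV, illustrative top-like values

scales = []
for _ in range(100000):
    s = sample_mass_squared(M0, GAMMA, rng)
    if s > 0.0:  # discard the tiny unphysical tail with negative m^2
        scales.append(q_res(math.sqrt(s), M0))

median_scale = statistics.median(scales)
```

With this mapping Q_RES = Γ |tan θ| for θ uniform in (−π/2, π/2), so half of the sampled scales lie below Γ, up to statistical fluctuations and the negligible effect of the discarded tail.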

3.12 Parton distribution functions


Parton distribution functions provide number distributions of a parton flavour i at a given mo-
mentum fraction x when a hadron is probed at scale Q2 , and are a necessary input for any hard
process generation with hadron beams [42]. Here, we focus on PDFs for hadrons and nuclei
— PDFs for other types of beams (including leptons, photons, and pomerons) are discussed sepa-
rately in section 6. The scale evolution of the PDFs is provided by the Dokshitzer–Gribov–Lipatov–
Altarelli–Parisi (DGLAP) equations [43–45] and usually these are derived in a global QCD analysis
where the non-perturbative input at an initial scale is fitted to a wide range of experimental data.
Further constraints are provided by the momentum- and baryon-number sum rules. Nowadays, it
is common that in addition to the best fit, the PDF sets also provide error sets that can be used to
quantify how the uncertainties in the applied data propagate into other observables.
In the case of protons, the high-precision DIS data from the HERA collider form the backbone of
the PDF analyses. On top of this, modern PDF sets incorporate a wealth of different LHC data
to increase the kinematic reach of the analysis and to obtain further constraints on the flavour
dependence. As a result, in kinematic regions relevant for LHC studies, the proton structure is
known to percent-level accuracy, except in a few regions such as that of very small x.
PYTHIA comes with some 20 different proton PDF sets. There are a few (pre-HERA) sets that
are out of date, e.g. GRV94L and CTEQ5L, but are kept for historical reasons as some earlier
tunes were based on these. In addition, there are a few sets that include HERA data but no
LHC data (e.g. CTEQ6L); these differ from the older ones mainly through a different
small-x gluon behaviour. The more modern ones include several data sets from LHC experiments,
which provide further constraints on the gluon PDF and on the flavour separation between different quarks.
Another recent development in the PDFs is the inclusion of QED evolution that enables inclusion
of photons as a part of the hadron structure. The current default set is NNPDF2.3 QCD+QED LO,
which does contain some datasets from LHC, but not the most recent ones. It is important to note,
however, that the default Monash tune is based on this PDF set, so updating to a more recent
PDF set would not lead to an improved description unless a complete retuning of pp parameters is
performed. Many further sets are accessible via the LHAPDF interface, cf. section 10.1.4. This runs
slightly slower than the built-in sets, but also offers further facilities such as error bands around
the central PDF member. Notice also that there might be small differences between the internally
defined PDF sets and the corresponding LHAPDF grids, due to different interpolation routines and
different extrapolations beyond the provided interpolation grid.
The neutron PDF is obtained from the proton one by isospin conjugation. This is not quite
correct for some recent sets where the QCD evolution is combined with a QED one, i.e. where the
quarks can radiate off photons, but in practice it is good enough except for photon physics.
For pions, the main set is based on GRS 99 [46]. This work makes the ansatz that valence,
gluon, and sea PDFs are of the form $N x^a (1 - x)^b (1 + A\sqrt{x} + Bx)$ at an initial scale $Q_0^2 = 0.26$ GeV$^2$,
with the parameters fitted to data. By choosing a small $Q_0$, the distributions can be assigned a
valence-quark-like shape at that scale, and strange and heavier quarks can be taken to vanish. An
older set based on GRV 92 [47] is available, but is deprecated in favour of GRS 99. A similar PDF
is also available for the kaon [48].
For other hadrons, rough estimates for PDFs have been made based on the form above, with
A = B = 0. No data is available, so the parameters a and b have been chosen heuristically, based on
the guiding principle that all valence quarks should have roughly the same velocity for the hadron
to stay together over time, and thus heavier quarks must take a larger average momentum fraction.
The N are fixed by the flavour and momentum sum relations. For details on this procedure, see
ref. [49]. These PDFs are referred to as the SU21 sets, and are stored in the LHAPDF format and
distributed with PYTHIA 8.3. Specifically, the PDFs included this way are available for the following
hadrons: p, π+, K+, φ0, η, D0, Ds+, J/ψ, B+, Bs0, Bc+, Υ, Σ+, Ξ+, Ω−, $\Sigma_c^{++}$, $\Xi_c^{+}$, $\Omega_c^{0}$, $\Sigma_b^{+}$, $\Xi_b^{0}$, and
$\Omega_b^{-}$. The SU21 p and π PDFs are less accurate than other available sets, so they should not be
used in real studies, but are included for completeness. Hadrons with the same quark contents
as the ones above are assumed to have the same PDFs. Furthermore, other cases can be defined
using isospin conjugation, since no QED effects are included in the SU21 sets. Mixed cases such
as π0 and Σ0 are assumed to have equal u and d contents, which are given by the averages for
the corresponding implemented PDF (i.e. π+ and Σ+ , respectively). Using such rules, all normal
hadrons can be simulated, except for baryons with more than one charm or bottom quark. One
final technical point is that in the SU21 LHAPDF files for flavour-diagonal mesons, the antiquark
content represents the sea, in order to make it possible to separate valence and sea (e.g. for J/ψ,
the c column represents the charm content, while the c̄ column represents the charm sea).
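The simplified ansatz with A = B = 0 can be explored with a few lines of Python. This is a sketch under explicit assumptions, not the SU21 procedure itself: the form N x^a (1 − x)^b is taken to describe the number density f(x) of a single valence quark, N is fixed by the number sum rule alone, and the parameter values are invented.

```python
import math

def euler_beta(p, q):
    """Euler beta function B(p, q) via gamma functions."""
    return math.gamma(p) * math.gamma(q) / math.gamma(p + q)

def valence_norm(a, b, n_valence=1.0):
    """N such that the integral of N x^a (1-x)^b over 0 < x < 1 equals n_valence."""
    return n_valence / euler_beta(a + 1.0, b + 1.0)

def valence_pdf(x, a, b, n_valence=1.0):
    """Assumed number density f(x) = N x^a (1-x)^b for one valence flavour."""
    return valence_norm(a, b, n_valence) * x ** a * (1.0 - x) ** b

def mean_momentum_fraction(a, b):
    """Average x per valence quark: B(a+2, b+1) / B(a+1, b+1) = (a+1)/(a+b+2)."""
    return (a + 1.0) / (a + b + 2.0)

# Invented parameters: a harder shape for a heavy quark than for a light one.
light_fraction = mean_momentum_fraction(0.5, 3.0)
heavy_fraction = mean_momentum_fraction(2.0, 1.0)
```

Raising a and lowering b shifts the distribution towards larger x, which is how a heavier valence quark can be assigned a larger average momentum fraction within this functional form.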
Also, a few nuclear PDF sets have been included internally. These can be used to estimate
the leading nuclear effect for inclusive high-p⊥ observables, such as jet production, but for more
involved studies it is recommended to use the full heavy-ion machinery, see section 6.8. More
nPDFs are available as LHAPDF grids, but the advantage of the internally defined sets is that any
proton baseline PDF can be applied and, if needed, the number of protons and neutrons can be
redefined event-by-event.
A fair fraction of the internal PDFs are LO ones. This ensures a sensible behaviour also for
processes at low x and/or Q2 (discussed further below), but also some NLO and NNLO proton
sets are available, for the modelling of hard processes. In this context, it can be mentioned that,
at large x and Q2 , NLO corrections to the PDF shape are often more important than those for the
matrix elements, such that NLO PDFs and LO MEs can be a viable combination. Further, in the
large-(x, Q2 ) region where the behaviour is nowadays rather well understood, PDFs do not risk
turning negative.
For showers and MPIs, the case is less clear; they both connect to low-p⊥ scales around or
below 1 GeV, and especially MPIs can probe extremely small x values, down to around $10^{-8}$ at
LHC energies, cf. section 6.2. In this region, all PDF components are poorly known, especially
the dominant gluonic one. In an LO description, the PDFs are required to be non-negative, and
HERA data in combination with Regge theory provide some reasonable constraints on the low-x
behaviour. PDFs need not be positive definite at higher orders, NLO or NNLO, since it is only the
convolution of NLO (NNLO) hard-process matrix elements with NLO (NNLO) PDFs that should
be non-negative, up to NNLO (N$^3$LO) terms. Actually, at scales p⊥ ∼ 1 GeV the whole perturbative
expansion is poorly convergent, since αs is large. Some recent PDFs attempt a resummed
description of the small-x behaviour to restore a guaranteed PDF positivity [50]. Nevertheless, in
general, the criteria for what constitutes an optimal or at least sensible PDF choice for the hard
process are not necessarily the same as for showers and MPIs; for this reason, PYTHIA 8.3 allows
for the use of one PDF set for the hard process and a different set for showers and MPIs. This can
also be useful to preserve shower- and underlying-event tuning properties while changing PDFs
for the hard process.
It is also possible to pick different PDF sets for the two incoming beam particles, which may be
convenient as a technical trick but has no physics motivation when the colliding beams are the same.

3.13 Phase-space cuts for hard processes


Several different phase-space cuts have been implemented for the internal hard processes in
PYTHIA 8.3. These serve two purposes: to properly set values that ensure the approximations
in the theory description are valid, and to allow for more efficient event generation when only a
certain part of the available phase space is considered. The principal example is the lower limit
of the partonic p⊥ of 2 → 2 processes, that needs to be set to a high enough value such that the
divergent behaviour of the massless matrix elements in the p⊥ → 0 limit is avoided. Similarly, a
suitable lower limit for p⊥ should be applied when considering e.g. jet production at higher values
of p⊥ , to avoid the inefficiency otherwise associated with a rapidly dropping p⊥ spectrum. (But
also see comment at the end of this subsection.)
The number of implemented phase-space cuts for the hard scattering depends on the number
of final-state particles of the process. For 2 → 1 only two options are included:
• the minimum invariant mass mmin

• the maximum invariant mass mmax


If the value of the latter is lower than the value of the former, the invariant mass will be limited
from above by the collision energy. The same cuts also apply to 2 → 2 and 2 → 3 processes.
For 2 → 2 processes some more options appear. The first three are related to invariant trans-
verse momentum of the process:
• the minimum transverse momentum p⊥min

• the maximum transverse momentum p⊥max

• an additional lower transverse-momentum cut p⊥diverge


The latter is to prevent divergences in the p⊥ → 0 limit for processes where a particle has a mass
smaller than the set p⊥diverge . In these cases, however, the larger of the p⊥min and p⊥diverge is always
applied for the p⊥ selection. The next set of cuts is related to limits of Breit–Wigner (BW) mass
distributions. By default, the mass selection based on BW shapes is always applied for particles
with a width above a certain threshold. There are two different thresholds that can be set:
• the minimum width of a resonance for which the Breit–Wigner shape can be deformed by
the variation of the cross section across the peak;

• and the minimum width of a resonance that is below the former threshold, for which a sim-
plified treatment is applied instead, where a symmetric Breit–Wigner selection is decoupled
from the hard-process cross section.
Notice that the allowed mass range of a given particle can be set by modifying the particle prop-
erties. In case of DIS, instead of p⊥ , the most relevant phase-space cut is the lower limit for the
allowed virtuality of the intermediate photon:
• minimum Q2 for t-channel processes with non-identical particles
Notice that the cuts for p⊥ will also be applied when a non-zero cut for Q2 is applied.
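The interplay of the mass and transverse-momentum limits described above can be summarized in a small Python sketch. The function and argument names are hypothetical and the logic only mirrors the rules stated in the text; it is not the PYTHIA implementation.

```python
def mass_window(m_min, m_max, e_cm):
    """Allowed invariant-mass range of the hard process.

    If m_max is below m_min, no explicit upper limit is imposed and the
    mass is limited from above by the collision energy instead.
    """
    upper = m_max if m_max >= m_min else e_cm
    return m_min, min(upper, e_cm)

def effective_pt_min(pt_min, pt_diverge, masses):
    """Lower pT limit actually applied to a 2 -> 2 process.

    The extra cut pt_diverge only matters when some outgoing particle has
    a mass below it; in that case the larger of the two limits is used.
    """
    if any(mass < pt_diverge for mass in masses):
        return max(pt_min, pt_diverge)
    return pt_min

# Invented examples, all values in GeV.
window = mass_window(4.0, -1.0, 13000.0)           # no explicit upper limit
light_cut = effective_pt_min(0.0, 1.0, [0.0, 0.0])  # massless partons
heavy_cut = effective_pt_min(0.0, 1.0, [173.0, 173.0])
```

For massless final states the divergence cut protects against the p⊥ → 0 limit even when no minimum p⊥ has been requested, whereas for heavy particles the requested minimum is used unchanged.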
For 2 → 3 processes that do not contain soft or collinear singularities, such as Higgs production
in EW-boson fusion, the same cuts as in the 2 → 2 case can be applied. For QCD processes, where
such singularities need to be accounted for, alternative cuts are defined. Also, since the outgoing
partons are no longer back-to-back, cuts for individual partons can be used for a more detailed
phase-space mapping:
• the minimum transverse momentum for the highest-p⊥ parton

• the maximum transverse momentum for the highest-p⊥ parton

• the minimum transverse momentum for the lowest-p⊥ parton

• the maximum transverse momentum for the lowest-p⊥ parton

• the minimum separation $R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2}$ between any two outgoing partons

The last one needs to have a high-enough value to avoid collinear divergences associated with the
outgoing partons.
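The separation cut in the last item of the list above can be sketched as follows in Python. The function names and the representation of partons as (η, φ) pairs are choices made for this illustration only.

```python
import itertools
import math

def delta_phi(phi1, phi2):
    """Azimuthal difference wrapped into the interval (-pi, pi]."""
    diff = (phi1 - phi2) % (2.0 * math.pi)
    if diff > math.pi:
        diff -= 2.0 * math.pi
    return diff

def delta_r(eta1, phi1, eta2, phi2):
    """Separation R = sqrt((delta eta)^2 + (delta phi)^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))

def passes_separation(partons, r_min):
    """True if every pair of (eta, phi) partons is separated by at least r_min."""
    return all(delta_r(e1, p1, e2, p2) >= r_min
               for (e1, p1), (e2, p2) in itertools.combinations(partons, 2))

# Invented three-parton configuration.
partons = [(0.0, 0.0), (0.3, 0.0), (2.0, 3.0)]
accepted_loose = passes_separation(partons, 0.2)
accepted_tight = passes_separation(partons, 0.4)
```

The wrapping of the azimuthal difference matters for partons on either side of φ = 0, which would otherwise appear to be almost 2π apart.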
As described above, the phase-space cuts can be used to improve the sampling efficiency by
focusing on a particular phase-space volume, e.g. defined by cuts on partonic p⊥ . In some cases
this might, however, require several runs that need to be combined later on. Similar improvement
in efficiency can also be achieved by reweighting the cross section of the hard process with a
suitable kinematic variable. In PYTHIA 8.3 the events can easily be reweighted by p⊥ −α , where α
is a power that could e.g. approximate the p⊥ dependence of the hard cross section. This allows
for a more uniform filling of the phase space, even when the cross section itself drops rapidly. The
downside is that when reweighting is applied, each event comes with a weight that needs to be
accounted for e.g. when filling histograms. In addition to this built-in reweighting of internally
defined 2 → 2 hard processes, there are also more involved options for reweighting with different
variables that can be enabled with the user hooks described in section 9.7.2.
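The idea behind such a reweighting can be demonstrated with a toy Monte Carlo in Python. This is not the PYTHIA implementation: we assume a hard spectrum that falls as a pure power of p⊥, enhance the sampling by a factor proportional to p⊥^α, and attach to each event a compensating weight proportional to p⊥^(−α). All names and numbers are invented for the example.

```python
import random

def sample_biased_pt(pt_min, pt_max, falloff, alpha, rng):
    """Draw pT from a density proportional to pT**(alpha - falloff)."""
    k = falloff - alpha  # effective falloff after the bias is applied
    u = rng.random()
    if abs(k - 1.0) < 1e-12:
        return pt_min * (pt_max / pt_min) ** u  # log-uniform special case
    e = 1.0 - k
    return (pt_min ** e + u * (pt_max ** e - pt_min ** e)) ** (1.0 / e)

def weighted_fraction_above(threshold, pt_min, pt_max, falloff, alpha, n_events, seed=1):
    """Estimate the fraction of the unbiased spectrum above a pT threshold."""
    rng = random.Random(seed)
    total = 0.0
    above = 0.0
    for _ in range(n_events):
        pt = sample_biased_pt(pt_min, pt_max, falloff, alpha, rng)
        weight = pt ** (-alpha)  # compensates the pT**alpha sampling bias
        total += weight
        if pt > threshold:
            above += weight
    return above / total

# Toy spectrum ~ pT^-5 on [10, 100] GeV, sampled with a pT^4 bias.
estimate = weighted_fraction_above(20.0, 10.0, 100.0, falloff=5.0, alpha=4.0,
                                   n_events=100000)
exact = (20.0 ** -4 - 100.0 ** -4) / (10.0 ** -4 - 100.0 ** -4)
```

The biased sample populates the whole p⊥ range roughly evenly on a logarithmic scale, yet the weighted estimate reproduces the fraction expected from the steeply falling unbiased spectrum, provided that the weights are used consistently when filling histograms.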
An important aspect is that the described phase-space cuts are applied only for the hard scattering,
i.e. before any showering or hadronization. As shower emissions modify the four-momenta
of the outgoing partons, a jet formed from the final particles will have a somewhat different
transverse momentum than the parton that originated the jet. Final-state radiation and
hadronization can reduce the energy of the jet, whereas initial-state radiation and multiparton in-
teractions may enhance it. Therefore a “fiducial phase-space volume” is needed, i.e. hard processes
must be generated in a larger volume than the volume of interest for final-state observables, at
the unfortunate cost of generating many events that will be thrown away. The necessary amount
of oversampling depends highly on the kinematics and beam configuration considered, so it needs
to be checked case-by-case. For jet studies, this is usually done by plotting the hard-process p⊥
associated with accepted jets or events. If a non-negligible fraction of events near the p⊥min scale
are accepted then p⊥min is too high.

3.14 Second hard process


The MPI framework in PYTHIA will generate a variable number of partonic 2 → 2 interactions in
addition to the selected hard process itself. These, mainly QCD processes, will form the underlying
event, typically consisting of rather soft particles. Occasionally, they may also contain a hard
scattering but, due to power-law falloff of the relevant cross sections, such events are rare. There
are, however, cases when the studied observable is such that more control over the kinematics of
the second scattering can significantly improve the sampling efficiency (e.g. of four-jet final states),
or the second process is not included as a part of current MPI generation (e.g. the production of
an EW boson together with a jet). The machinery for a second hard process can be used in these
situations. It can be viewed as an approach to generate so-called Double Parton Scattering (DPS)
events, but with two key distinctions. First, the DPS framework as used for theoretical studies
typically assumes that there are exactly two hard interactions in an event, while the second-hard
setup allows there to be further MPIs just like when starting out from one hard interaction. Second,
the MPI machinery uniquely fixes how two hard cross sections should be combined into a total,
while this usually involves a free parameter in the DPS expressions.
The basic approach in the PYTHIA implementation for the generation of two hard processes in
a single event is that, first, the two processes are selected completely independently and, after-
wards, momentum conservation and the possible correlations in the PDFs are accounted for by
the rejection of a fraction of the topologies. This makes the process sampling symmetric and thus
the distinction between “first” and “second” is used only for bookkeeping. Furthermore, as long
as there is some overlap in phase space of the two processes, any of the two can be the hardest
one. In principle, this construction would allow the generation of any two internally (or exter-
nally) defined processes, but in practice there is no need for a very fine-grained control of both
processes, and furthermore the combination of two rare processes would give a negligible cross
section. Therefore a somewhat more limited set of second processes has been implemented.
Still, the first process can be selected from the complete list of processes (appendix A), or even
provided externally. The processes that can be enabled as a second hard one include:

• standard QCD 2 → 2 processes, i.e. two-jet production

• a prompt photon and a jet

• two prompt photons

• charmonium production, colour singlet and octet

• bottomonium production, colour singlet and octet

• γ∗ /Z production with full interference

• single W± production

• production of a γ∗ /Z and a parton

• production of a W± and a parton

• top-pair production

• single-top production

• bottom-pair production

Technically these can be combined freely, but some combinations would double count and therefore
must be avoided. This includes a γ∗/Z/W± together with a jet or on its own, and bb̄ production
as part of the QCD 2 → 2 processes or on its own. Also, since the last one will include only bb̄
production explicitly in the hard scattering, the pairs produced by gluon splittings in the parton
showers will not be present in that sample. Thus, depending on the kinematics, this might or
might not be enough to give realistic cross-section estimates.
By default, the phase-space cuts, couplings, and scales for the second hard process are the
same as for the primary scattering. It is, however, possible to set different cuts for the second one,
and, due to fully symmetric treatment of the two processes, the cuts for the second process can
be set higher or lower than for the primary one. The cuts that can be separately specified are the
minimum and maximum values for the invariant mass and transverse momentum of the process.
It is instructive to consider some Poissonian statistics before showing how the cross sections
of two processes should be combined. If the average number of subcollisions, 〈n〉, is known, the
probability for n of them to occur is given by

$$
P_n = \langle n \rangle^n \, \frac{e^{-\langle n \rangle}}{n!} . \tag{65}
$$

In the case where $\langle n \rangle$ is small, as it is for hard processes, we can approximate $e^{-\langle n \rangle} \approx 1$. The
probability for one event to happen is then $P_1 = \langle n \rangle$, and correspondingly for two such events
we find $P_2 = \langle n \rangle^2 / 2 = P_1^2 / 2$. Now consider two independent event types a and b, such that
$\langle n \rangle = \langle n_a \rangle + \langle n_b \rangle = P_{1a} + P_{1b}$. The probability for any combination of two events a and b is then
given by
$$
P_2 = \frac{(P_{1a} + P_{1b})^2}{2} = \frac{P_{1a}^2 + 2 P_{1a} P_{1b} + P_{1b}^2}{2} . \tag{66}
$$
From this it can be read off that a probability for having two different-type events comes with a
factor 2 relative to the same-type cases. If modelled in terms of increasing time, or decreasing hard
scale (say p⊥ ), the mixed combination can occur in two ways, either where the event a happens
before b, or the other way around, which explains the factor of 2.
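
As a quick numerical cross-check of eqs. (65) and (66), the following minimal Python sketch compares the exact Poisson probability for two subcollisions with the small-⟨n⟩ decomposition into same-type and mixed-type pairs. The values chosen for P1a and P1b are purely illustrative assumptions, and the function names are hypothetical helpers, not part of PYTHIA.

```python
import math


def poisson(n, mean):
    """Exact Poisson probability for n subcollisions, eq. (65)."""
    return mean**n * math.exp(-mean) / math.factorial(n)


def two_event_probabilities(p1a, p1b):
    """Small-<n> decomposition of P2 as in eq. (66).

    The mixed 'ab' term carries a factor 2 relative to the
    same-type terms 'aa' and 'bb'.
    """
    return {"aa": p1a**2 / 2, "ab": p1a * p1b, "bb": p1b**2 / 2}


# Illustrative (assumed) single-event probabilities.
p1a, p1b = 1e-3, 2e-3
parts = two_event_probabilities(p1a, p1b)
approx_total = sum(parts.values())
exact_total = poisson(2, p1a + p1b)
print(parts, approx_total, exact_total)
```

For such small probabilities the exact and approximate totals differ only by the neglected factor e^{-⟨n⟩}, which is within a fraction of a percent of unity here.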
The proper way to evaluate the resulting cross section thus depends on whether the two pro-
cesses are the same, and on whether the phase-space regions overlap. The simplest case is when
the two processes do not overlap, i.e. either the phase-space regions are completely separated or
the two processes are different. An example of the latter would be a combination of processes where
the first produces two jets and the second two photons. When the a and b cross sections are small
fractions of the total non-diffractive cross section $\sigma_{\mathrm{ND}}$, naively the probabilities $P_{a,b} = \sigma_{a,b}/\sigma_{\mathrm{ND}}$
enter multiplicatively. Thus their combined cross section is

\[ \sigma_{2ab}^{\mathrm{naive}} = P_a P_b \, \sigma_{\mathrm{ND}} = \frac{\sigma_{1a} \sigma_{1b}}{\sigma_{\mathrm{ND}}} . \tag{67} \]

This simplification neglects the dependence on collision geometry, however. The probability
for a hard process is enhanced in central collisions, i.e. for small impact parameter, while it is de-
pleted in peripheral ones. This leads to a so-called “trigger bias” effect, where events containing a
first hard process predominantly occur in central collisions, which thereby enhances the likelihood
of a second hard process. In the context of traditional MPIs this is known as the “pedestal effect”,
where a selected high-p⊥ process has more underlying-event activity than an average event, see
more details in section 6.2.2. When the colliding matter profiles have been specified, along with
the parameters that set the 〈nMPI 〉, a correction factor fimpact can be derived event-by-event within
the MPI framework. Its average value gives a corrected combined cross section
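\[ \sigma_{2ab} = \langle f_{\mathrm{impact}} \rangle \, \frac{\sigma_{1a} \sigma_{1b}}{\sigma_{\mathrm{ND}}} = \frac{\sigma_{1a} \sigma_{1b}}{\sigma_{\mathrm{eff}}} . \tag{68} \]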
In the last step we introduce σeff , which is the conventional parameter that many experimental
results are expressed in terms of, but here it is a prediction of the model.
The cross section σ2aa of two identical processes follows the same pattern, except for the extra
factor of 1/2 that has already been explained. Often a would itself be the sum of several subpro-
cesses, e.g. the six main classes of 2 → 2 QCD processes that contribute to two-jet production. If so,
then a compensating factor of 2 will automatically occur for the mixed-subprocess configurations,
in the same spirit as eq. (66).
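
The bookkeeping of eqs. (67) and (68), including the factor 1/2 for identical processes, can be summarized in a few lines of Python. This is only a sketch of the arithmetic: the function names are hypothetical, all cross sections are assumed to be given in the same unit, and in PYTHIA itself the factor ⟨f_impact⟩ comes out of the MPI framework rather than being a user input.

```python
def sigma_eff_from_impact(sigma_nd, f_impact_mean):
    """sigma_eff implied by eq. (68): sigma_eff = sigma_ND / <f_impact>."""
    return sigma_nd / f_impact_mean


def double_scattering_xsec(sigma_a, sigma_b, sigma_eff, same_process=False):
    """Combined cross section for two hard processes, eq. (68).

    For two identical, fully overlapping processes an extra
    symmetry factor 1/2 applies, cf. eq. (66).
    """
    symmetry = 0.5 if same_process else 1.0
    return symmetry * sigma_a * sigma_b / sigma_eff


# Illustrative (assumed) numbers, all in the same unit, e.g. mb.
sigma_eff = sigma_eff_from_impact(sigma_nd=50.0, f_impact_mean=2.0)
print(sigma_eff)
print(double_scattering_xsec(1e-3, 2e-4, sigma_eff))
print(double_scattering_xsec(1e-3, 1e-3, sigma_eff, same_process=True))
```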
The cross section calculation becomes somewhat more complicated in cases when there is par-
tial overlap between the two processes. An example would be identical processes with different,
but partly overlapping, cuts on p⊥ . In such cases it is useful to split the problem into two com-
pletely independent processes a and b and a common process c. The first (second) process can be
selected according to σa + σc (σ b + σc ). Half of the events should be discarded if both processes
are chosen as c, and the combined cross section should be reduced accordingly.
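
The overlap prescription just described can be written out explicitly. The sketch below, with hypothetical function names and assumed inputs, selects the first process according to σa + σc and the second according to σb + σc, and discards half of the events in which both are of the common type c.

```python
def overlap_acceptance(sigma_a, sigma_b, sigma_c):
    """Fraction of events kept when half of the (c, c) pairs are discarded."""
    p_first_c = sigma_c / (sigma_a + sigma_c)
    p_second_c = sigma_c / (sigma_b + sigma_c)
    return 1.0 - 0.5 * p_first_c * p_second_c


def overlap_xsec(sigma_a, sigma_b, sigma_c, sigma_eff):
    """Combined cross section for partly overlapping processes.

    Equivalent to (ab + ac + cb + cc/2) / sigma_eff: distinct pairs
    enter with weight 1, the identical (c, c) pair with weight 1/2.
    """
    naive = (sigma_a + sigma_c) * (sigma_b + sigma_c) / sigma_eff
    return naive * overlap_acceptance(sigma_a, sigma_b, sigma_c)


# Illustrative (assumed) numbers in arbitrary but common units.
print(overlap_xsec(1.0, 2.0, 3.0, 10.0))
```

In the two limiting cases the expression reduces to the results above: for σc = 0 it gives σa σb/σeff, and for σa = σb = 0 it gives σc²/(2σeff).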
So far it has been assumed that the generation of the two processes can be done independently,
apart from the geometrical correction factor for the final cross sections. This obviously misses
all possible correlations between the PDFs and, perhaps more importantly, may violate energy-
momentum conservation. Part of the selected events will be discarded to account for these effects,
even though each process would be acceptable on its own. The correlations in multiparton PDFs
implemented in PYTHIA are described further in section 6.2.4. The PDF reduction factor is obtained
as the average of the two possible orderings, where either the second or first PDF is corrected for
the parton taken out either by the first or second process.
In the end, the cross sections provided by PYTHIA after the event generation do account for all
these effects, including the correction factor 〈 fimpact 〉 and the PDF rescaling. The error estimates
provided by PYTHIA are statistical ones and do not cover the potentially large model uncertainties,
as usual. When the first process is provided externally, PYTHIA does not have the information
whether there is an overlap between the first and the second process, and so will assume that this
is not the case. The proper correction for an overlap then rests with the user.

4 Parton showers
The most violent pp collisions at the LHC may have five to ten easily separated jets. Zooming
in on these, they display a substructure of jets-inside-jets-inside-jets, associated with the pertur-
bative production of increasingly nearby partons. Such a fractal nature is expected to continue
down to the hadronization scale, a bit below 1 GeV. At that scale, the event may contain up to a
hundred partons, even if the full partonic structure is masked by the subsequent non-perturbative
hadronization process. There is no way to perform matrix-element calculations to describe such
complicated event topologies. Instead, the standard approach is to start out from a matrix-element
calculation with only a few well-separated partons, and then apply a parton shower to that.
Parton showers attempt to describe how a basic hard process is dressed up by emissions at suc-
cessively “softer” (longer-wavelength) and/or more “collinear” (smaller-angle) resolution scales,
to give an approximate but realistic picture of the (sub)structure of the partonic state across the
full range of (perturbative) resolution scales. Such a shower is constructed in a recursive manner,
from the large scale of the hard process down to a lower cutoff at around the hadronization scale.
In each step, the number of partons is increased by one, or in very special cases two, and the
random nature of the steps leads to a large variability of final states. It is worth emphasizing that,
although often thought of in the context of QCD, parton (or more generally particle) showers are
in fact common to any quantum field theory with several (quasi-)massless particles. Thus, show-
ers are present in QCD, QED, and the EW theory above the symmetry breaking scale and as such,
dedicated modules describing all of these are part of PYTHIA 8.3.
One starting point is to study the ratio of two differential matrix elements, dσn+1 /dσn , where
the numerator corresponds to the emission of one more gluon in the final state. It then turns
out that this ratio is given by universal expressions, i.e. independent of which specific process is
considered, if this gluon is either soft, or collinear with one of the already existing partons. This
means that one can formulate a generic scheme that can be applied to any process of interest.
Such schemes started to be developed in the late 1970s. A key ingredient has been the DGLAP
evolution equations [43–45], which describe near-collinear emissions. Modern showers, like the
three available with PYTHIA 8.3, are based on many subsequent developments, intended to make
them cover the full phase space as well as possible. These aspects are described later, but initially
we introduce the simpler, classical (collinear “leading-log”) framework that helps in understanding
the overall picture.
Historically, showers are split into two kinds, ISR and FSR, which occur respectively before or
after the hard process. Alternatively, they may be referred to as spacelike and timelike showers,
respectively, since their representation in terms of Feynman diagrams contains off-shell intermediate
particles that are either spacelike or timelike. The more virtual such a particle is, the shorter the
time it may exist. Therefore, the highest virtualities occur in and closest to the hard interaction, and then
showers with decreasing virtualities stretch backwards (for ISR) or forwards (for FSR) in time.
LHC processes usually contain both ISR and FSR, and outside the strictly collinear limits the
distinction can be blurred, just as interfering Feynman graphs of a different nature may contribute
to a given final state. A decay γ∗/Z → qq̄ is pure FSR, however, while its production qq̄ → γ∗/Z
can be discussed in terms of ISR only, so these are often used as textbook examples. (Conversely,
ISR-FSR interference can be exemplified by t-channel colour-singlet exchange, such as in deep
inelastic scattering or vector boson fusion.)

FSR Starting from γ∗/Z → qq̄, either the q or the q̄ may emit a g, e.g. q → qg. This produces a qq̄g
state, after which any of the three partons may branch, and so on. The differential probability
for a parton to branch can be written as

\[ \mathrm{d}\mathcal{P}_a(z, Q^2) = \frac{\mathrm{d}Q^2}{Q^2} \, \frac{\alpha_s(Q^2)}{2\pi} \sum_{b,c} P_{a \to bc}(z) \, \mathrm{d}z . \tag{69} \]

Here a is the mother that splits into partons b and c, where the momentum-energy of the mother is
split such that b takes fraction z and c takes 1 − z. The Q scale, used to order emissions in a falling
sequence, is a key distinguishing feature of different shower implementations, and may be chosen
e.g. as mass, transverse momentum or energy-weighted emission angle. That is, z parameterizes
the longitudinal and Q the transverse evolution of the shower. There is also an azimuthal angle
ϕ that determines the orientation of the decay plane; typically, and for the purpose of this brief
introduction, this is assumed to be distributed isotropically, though we note that PYTHIA does allow
for non-uniform distributions as well, e.g. to reflect gluon polarization effects.

A key issue that distinguishes parton showers from so-called analytic resummation approaches
is that the latter only maintain exact energy and momentum conservation in the strict soft and
collinear limits, while showers do so over all of phase space. This difference leads to the crucial
aspect of recoil effects in parton showers, which will play an important role when we introduce
dipole showers later on.
There are three different DGLAP splitting kernels,

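\[ P_{q \to qg}(z) = \frac{4}{3} \, \frac{1 + z^2}{1 - z} , \tag{70} \]
\[ P_{g \to gg}(z) = 3 \, \frac{\left( 1 - z(1 - z) \right)^2}{z(1 - z)} , \tag{71} \]
\[ P_{g \to q\bar{q}}(z) = \frac{1}{2} \left( z^2 + (1 - z)^2 \right) . \tag{72} \]

These obey the trivial symmetry relations $P_{a \to cb}(z) = P_{a \to bc}(1 - z)$. The $P_{g \to q\bar{q}}$ kernel is normalized
for one quark flavour only, and has to be summed over all kinematically allowed channels.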
The same approach can also be used for other branchings, notably QED ones, where αs in
eq. (69) is replaced by αem and the splitting kernels are

\[ P_{f \to f\gamma}(z) = e_f^2 \, \frac{1 + z^2}{1 - z} , \tag{73} \]
\[ P_{\gamma \to f\bar{f}}(z) = N_c \, e_f^2 \left( z^2 + (1 - z)^2 \right) , \tag{74} \]

where Nc = 3 if f is a quark and Nc = 1 if a charged lepton.


The DGLAP kernels are often written with additional terms that modify the behaviour at z = 1
and 0, in order to conserve momentum-energy and flavour in analytic calculations. This is not
necessary in event generators, partly because the 0 and 1 limits are never reached, and partly
because conservation issues are handled explicitly: parton a is removed at the same time as b and
c are inserted in the list of currently existing partons.
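
For orientation, the unregularized QCD kernels of eqs. (70) to (72) can be coded directly. The following Python sketch uses hypothetical function names and is only meant to illustrate the z dependence; it assumes 0 < z < 1 strictly, in line with the remark that the endpoints are never reached in an event generator.

```python
def p_q_to_qg(z):
    """P_{q->qg}(z), eq. (70); divergent for z -> 1 (soft gluon)."""
    return (4.0 / 3.0) * (1.0 + z * z) / (1.0 - z)


def p_g_to_gg(z):
    """P_{g->gg}(z), eq. (71); divergent for z -> 0 and z -> 1."""
    return 3.0 * (1.0 - z * (1.0 - z)) ** 2 / (z * (1.0 - z))


def p_g_to_qqbar(z):
    """P_{g->q qbar}(z), eq. (72), for a single quark flavour."""
    return 0.5 * (z * z + (1.0 - z) ** 2)


for z in (0.1, 0.5, 0.9):
    print(z, p_q_to_qg(z), p_g_to_gg(z), p_g_to_qqbar(z))
```

The two gluon kernels are symmetric under z → 1 − z, as expected from the symmetry relation quoted after eq. (72) when b and c are the same species or a quark-antiquark pair.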
The branching probability in eq. (69) can be integrated over the kinematically allowed z range
\[ \mathrm{d}\mathcal{P}_a(Q^2) = \frac{\mathrm{d}Q^2}{Q^2} \, \frac{\alpha_s(Q^2)}{2\pi} \sum_{b,c} \int_{z_{\min}(Q^2)}^{z_{\max}(Q^2)} P_{a \to bc}(z) \, \mathrm{d}z , \tag{75} \]

to express the infinitesimal probability that a branches in a $\mathrm{d}Q^2$ infinitesimal step. (Strictly
speaking $|\mathrm{d}Q^2|$, since $Q^2$ is decreasing in the evolution.) The probability for a not to branch in the same
step thus is $1 - \mathrm{d}\mathcal{P}_a(Q^2)$. By multiplication of the no-emission probabilities (exponentiation),
the probability for a not to branch between an initial scale $Q_1^2$ and a final lower $Q_2^2$ becomes the
Sudakov factor [51]
\[ \Pi_a(Q_1^2, Q_2^2) = \exp\left( - \int_{Q_2^2}^{Q_1^2} \mathrm{d}\mathcal{P}_a(Q^2) \right) . \tag{76} \]

The differential probability for a to evolve from a $Q^2_{\max}$ to a $Q^2$ and then branch at the latter scale
thus is $\Pi_a(Q^2_{\max}, Q^2) \, \mathrm{d}\mathcal{P}_a(Q^2)$. Note that the introduction of a Sudakov factor ensures that the
total probability for a to branch cannot exceed unity, something that is not guaranteed for $\mathrm{d}\mathcal{P}_a$
alone.
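
To make the role of the Sudakov factor concrete, consider the deliberately simplified case of a fixed coupling and a Q²-independent z integral, so that eq. (75) reduces to dP_a = c dQ²/Q² with a constant c. Equation (76) then gives Π_a(Q1², Q2²) = (Q2²/Q1²)^c, which can be inverted analytically. The Python sketch below rests on exactly these toy assumptions; the function names and the value of c are hypothetical.

```python
import random


def sudakov(q2_high, q2_low, c):
    """Toy Sudakov factor for dP = c dQ^2/Q^2: (q2_low / q2_high)**c."""
    return (q2_low / q2_high) ** c


def sample_branching_scale(q2_max, q2_min, c, rng):
    """Pick the next branching scale below q2_max, or None if no branching.

    Solves sudakov(q2_max, q2, c) = R for a uniform random number R.
    A result below the cutoff q2_min means the parton does not branch.
    """
    r = 1.0 - rng.random()  # uniform in (0, 1], avoids r = 0
    q2 = q2_max * r ** (1.0 / c)
    return q2 if q2 > q2_min else None


rng = random.Random(1)
scales = [sample_branching_scale(1.0e4, 1.0, 0.1, rng) for _ in range(5)]
print(scales)
```

By construction the fraction of trials returning None approaches the no-branching probability Π_a(Q²_max, Q²_min), so the total branching probability never exceeds unity.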
We observe that the Sudakov factor plays a crucial role in the selection of a branching scale.
The veto-algorithm technology in section 2.2.3 is eminently suited to handle cases where the Q2
and z integrations cannot be done analytically. The Sudakov factor also is closely related to virtual
corrections of matrix elements, i.e. loop corrections. This will play a key role for the matching and
merging methods presented in the next section.
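
When the integrals cannot be inverted analytically, a veto algorithm can be used instead. The following Python sketch shows the textbook form of the algorithm for a toy branching rate c(t)/t with c(t) = A/ln(t/Λ²), mimicking a running coupling; it is not taken from the PYTHIA source, and A, Λ² and all function names are assumptions chosen for illustration. The rate is overestimated by the constant c_over = c(t_min), for which the trial scale can be generated analytically.

```python
import math
import random

LAMBDA2 = 0.04  # toy Lambda^2 (assumption)
A = 0.5         # toy overall strength (assumption)


def rate_coeff(t):
    """Toy coefficient c(t) of the branching rate c(t)/t; decreases with t."""
    return A / math.log(t / LAMBDA2)


def veto_sample(t_max, t_min, coeff, c_over, rng):
    """Veto algorithm: return the next branching scale, or None.

    Trial scales follow the overestimate c_over/t; each trial is
    accepted with probability coeff(t)/c_over, otherwise the
    evolution continues downwards from the rejected scale.
    """
    t = t_max
    while True:
        t *= (1.0 - rng.random()) ** (1.0 / c_over)
        if t <= t_min:
            return None
        if rng.random() < coeff(t) / c_over:
            return t


def exact_no_emission(t_max, t_min):
    """Closed-form no-emission probability for the toy rate above."""
    return (math.log(t_min / LAMBDA2) / math.log(t_max / LAMBDA2)) ** A


rng = random.Random(7)
c_over = rate_coeff(1.0)  # valid overestimate for all t >= t_min = 1
print(veto_sample(1.0e4, 1.0, rate_coeff, c_over, rng))
print(exact_no_emission(1.0e4, 1.0))
```

The toy rate was chosen because its Sudakov factor is known in closed form, which allows the sampled no-emission fraction to be compared against the exact result.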

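ISR The ISR description starts out from the evolution equation for parton distribution functions
(PDFs),

\[
\begin{aligned}
\mathrm{d}f_b(x, Q^2) &= \frac{\mathrm{d}Q^2}{Q^2} \, \frac{\alpha_s(Q^2)}{2\pi} \sum_a \int\!\!\int f_a(x', Q^2) \, \mathrm{d}x' \, P_{b/a}(z) \, \mathrm{d}z \, \delta(x - x' z) \\
&= \frac{\mathrm{d}Q^2}{Q^2} \, \frac{\alpha_s(Q^2)}{2\pi} \sum_a \int \frac{\mathrm{d}z}{z} \, f_a\!\left( x' = \frac{x}{z}, Q^2 \right) P_{b/a}(z) ,
\end{aligned}
\tag{77}
\]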

where $f_i(x, Q^2)$ is the probability to find a parton i inside a hadron, with i carrying a fraction x of
the full hadron momentum if the hadron is probed at a scale $Q^2$.
As for FSR, the evolution is driven by branchings a → bc but, where FSR is formulated in
terms of the decay rate of a, ISR is given in terms of the production rate of b. The simple splitting
kernels are easily related, $P_{b/a}(z) = P_{a \to bc}(z)$, except that $P_{g/g}(z) = 2 P_{g \to gg}(z)$, since two gluons
are produced for each gluon that decays.
The evolution of PDFs starts at some low scale $Q_0^2$ and then proceeds towards the $Q^2$ scale of
the hard process, where they enter into the cross-section expression. While eq. (77) describes the
evolution of an inclusive distribution, an exclusive shower formulation similar to the FSR one is
possible, although more complicated. A key problem is that the two incoming cascades, one from
each side of the event, may not end up as the colliding partons one is interested in. For example,
in gg → H the two incoming gluons must have an invariant mass that matches the Higgs mass.
The solution to this problem is backwards evolution [52]. In this method, the evolved PDFs are
first used to select the hard process of interest, say qq̄ → γ∗/Z. Only afterwards are the incoming
showers then constructed backwards in time, from the high $Q^2$ scale down to the low $Q_0^2$. To this
end, we introduce
\[ \mathrm{d}\mathcal{P}_b(x, Q^2) = \frac{\mathrm{d}f_b(x, Q^2)}{f_b(x, Q^2)} = \frac{\mathrm{d}Q^2}{Q^2} \, \frac{\alpha_s(Q^2)}{2\pi} \sum_a \int_{z_{\min}(Q^2)}^{z_{\max}(Q^2)} \mathrm{d}z \, P_{b/a}(z) \, \frac{x' f_a(x', Q^2)}{x f_b(x, Q^2)} , \tag{78} \]

where we have used that $z = x/x'$. Here $\mathrm{d}\mathcal{P}_b$ is the probability that parton b becomes associated
with a branching a → bc during the interval $\mathrm{d}Q^2$. A no-branching probability $\Pi_b(x, Q_1^2, Q_2^2)$ can be
defined in analogy with the Sudakov factor eq. (76). The corrected probability for a parton b that
branches or interacts at $Q^2_{\max}$ to be assigned a mother a at $Q^2$ then is $\Pi_b(x, Q^2_{\max}, Q^2) \, \mathrm{d}\mathcal{P}_b(x, Q^2)$.
This a in its turn must be evolved to yet lower $Q^2$ to find its mother at an even higher x value.

Recoils and dipoles An isolated parton cannot branch if energy and momentum are to be
conserved. Take as an example γ∗/Z → qq̄ → q∗q̄ → qq̄g, where q∗ is the off-shell quark that branches
as q∗ → qg. Initially, the q and q̄ can split the energy equally, but the off-shell q∗ acquires a larger
mass than q̄, and so it must have a larger energy while the q̄ receives a smaller one. In this case
we would call q the radiator (or emitter) and q̄ the recoiler, but note that at the end, both may
yield energy to create the g. Also, considering the existence of q̄ → q̄g branchings, it may be
simpler to say that it is the qq̄ pair that jointly radiates the g. Note that q and q̄ have opposite and
compensating colours and thus form a colour dipole, hence the concept of dipole radiation.
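
The energy shift between radiator and recoiler follows from two-body kinematics alone. As a minimal sketch, assuming a massless recoiler and working in the rest frame of the γ∗/Z with invariant mass √s, the energies of the off-shell q∗ (mass m∗) and of the recoiling q̄ can be computed as below. The function name and the numerical inputs are illustrative assumptions.

```python
def two_body_energies(sqrt_s, m_star):
    """Energies of an off-shell parton of mass m_star and a massless recoiler.

    Evaluated in the rest frame of the decaying system with invariant
    mass sqrt_s; returns (E_star, E_recoiler).
    """
    s = sqrt_s * sqrt_s
    e_star = (s + m_star * m_star) / (2.0 * sqrt_s)
    e_recoiler = (s - m_star * m_star) / (2.0 * sqrt_s)
    return e_star, e_recoiler


# Before the branching both partons are massless and share the energy equally.
print(two_body_energies(91.2, 0.0))
# Once q* acquires a virtuality, it gains energy at the expense of the recoiler.
print(two_body_energies(91.2, 20.0))
```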

This picture generalizes to the subsequent emission of further gluons [53, 54]. In the limit of
infinitely many colours, Nc → ∞ [55], the qq̄g system exactly splits into one qg dipole and one
gq̄ dipole. These can radiate independently, and the recoil is distributed within each dipole. It is
still possible, but not necessary, to split the radiation inside each dipole as being associated with
either dipole end.
To allow dipole showers to operate, unique colour indices (in the Nc → ∞ limit) are assigned
to all coloured partons, both ones produced in the hard process and ones in the subsequent shower
evolution. For the extension to ISR, and to decays like t → bW+ , one should note that the hole
left behind by a scattered or decayed colour parton can act like its anticolour.

Formal basis of parton showers In the previous discussion, we have developed the basic idea
of parton showers, similarly to the historical development. We now want to turn to a more
in-depth treatment of the formal basis of modern shower algorithms, such as the three implemented
in PYTHIA 8.3.
We have seen above that parton showers build upon the factorization of (squared) amplitudes
in soft and collinear limits. Technically, this means that whenever either two (or more) particles
become collinear or one (or more) particle becomes soft, the full (squared) amplitude can be well
approximated by the (squared) matrix element without the unresolved particle times a universal
radiation function. It is the latter that takes the effect of the soft or collinear radiation into
account. This factorization is reminiscent of the perturbative physics of the hard process and
occurs because an intermediate, almost on-shell, propagator can be replaced by a polarization
sum, such that the amplitude may be split into two independent pieces. Vital for the construction
of showers is that this factorization is universal in the sense that it is process and multiplicity
independent. This means that the same radiation functions can be used for different squared
matrix elements and at any multiplicity, as long as only single-unresolved radiation is concerned.
The latter comment serves to emphasize that at higher multiplicities also multiple-unresolved
limits occur, in which, for instance, two particles become simultaneously soft or three particles
become simultaneously collinear. For such configurations, it should be obvious that higher-order
radiation functions are needed and the ones describing single-soft or (double-)collinear radiation
are not sufficient. At the same time, it is always possible to factorize phase-space integration
measures into on-shell steps by introducing delta functions to factorize the decay system, and
introducing recoiling systems to guarantee four-momentum conservation. Taking matrix-element
and phase-space factorization together, it follows that cross sections can be factorized. This allows
for iteration of the approximation, as long as the measure of “softness” or “collinearity” remains
appropriate. In this context, the requirement of an appropriate measure leads to the notion of
strong ordering, which means that radiation of soft particles is yet softer and radiation of collinear
particles is yet more collinear. Although different possibilities to factorize matrix elements exist, all
share the requirement that the approximation should recover the singularities of fixed-order results. On the one
hand, DGLAP evolution is driven by collinear radiation; on the other hand, factorization-breaking
(so-called non-global) logarithms are driven by soft radiation. These are the limits any parton
shower resumming the leading, i.e. largest, logarithms should recover.
Based on the above, we can start thinking about the construction of a shower model. It should
be emphasized that the construction of showers is by no means unique. As the bare minimum, a
shower algorithm must define the following.
1. Radiation functions, i.e. the matrix-element factorization.

2. A phase-space factorization and recoil procedure.

3. An ordering variable, i.e. a measure of “softness” and/or “collinearity”.

For each of these points, different choices are possible and used, motivated by different desires to
obtain certain objectives: simplicity, extendability, or simply to describe specific processes better
at the cost of describing others worse.
Using somewhat general language for now, we can denote the radiation functions by $K_{j/\tilde{\imath}\tilde{k}}$,
describing the radiation of particle j from the two parent particles $\tilde{\imath}$ and $\tilde{k}$, i.e. the branching
$\tilde{\imath}\tilde{k} \mapsto ijk$. Depending on the specifics of the shower algorithm, one of the two parents $\tilde{\imath}$ and $\tilde{k}$ may
be distinguished as the “emitter” while the other, the “recoiler”, only ensures four-momentum
conservation, or both parents act as emitters and recoilers in an agnostic way. The former is how
both PYTHIA’s simple shower and DIRE are structured, whereas the latter describes the antenna
picture employed in the VINCIA shower. No matter which specific choice of radiation functions is
made, the sum of terms must reproduce all single-unresolved limits of the full real-emission cross
section,
\[ \mathrm{d}\sigma_{n+1} \xrightarrow{\text{single-unresolved}} \sum_j K_{j/\tilde{\imath}\tilde{k}} \, \mathrm{d}\Phi_{+1} \, \mathrm{d}\sigma_n =: K_{n \mapsto n+1} \, \mathrm{d}\Phi_{+1} \, \mathrm{d}\sigma_n , \tag{79} \]

with the cross sections σ defined as in eq. (36). This factorization consists of two parts: the
factorization of the squared matrix element and the factorization of the phase space.
Specifically, in the case of two particles i and j becoming collinear, the (n + 1)-particle matrix
element factorizes into a product of the n-particle matrix element and the DGLAP splitting kernels
eqs. (70) to (74),

\[ |M_{n+1}|^2 \xrightarrow{i \parallel j} \frac{8\pi\alpha}{2 p_i \cdot p_j} \, P_{\tilde{\imath} \to ij}(z) \, |M_n|^2 + \text{angular terms} . \tag{80} \]

Generally, the collinear limit involves spin correlations between the factorized matrix element and
the (spin-dependent) DGLAP kernels, here indicated by the additional “angular terms”. These
terms vanish upon azimuthal integration and are therefore not necessarily implemented in a
parton-shower algorithm. It is, however, vital to account for these terms in so-called NLO sub-
traction schemes to ensure point-wise cancellation of singularities. In the limit of a single gauge
boson becoming soft, however, the emission of the soft boson can be described by a universal factor
known as the soft eikonal. Unlike the collinear limit, soft radiation is an intrinsically coherent
phenomenon, meaning that the boson is emitted by the whole particle ensemble, introducing a
sum over radiators:
\[ |M_{n+1}|^2 \xrightarrow{E_j \to 0} 8\pi\alpha \sum_{i<k} C_{ik} \, \frac{2 p_i \cdot p_k}{(2 p_i \cdot p_j)(2 p_j \cdot p_k)} \, |M_n|^2 , \tag{81} \]

with charge factors Cik depending on the charges of the radiators i and k. Especially in the case
of QCD, these charge factors introduce intricate colour correlations for soft gluon emissions. It is
because of these complications that most parton showers only consider the leading-colour limit, i.e.
neglect any contributions in the above sum that correspond to emissions from non-neighbouring
partons.
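
The kinematic part of a single term in eq. (81) is easily evaluated for explicit four-momenta. The Python sketch below, with hypothetical helper names and momenta given as (E, px, py, pz) tuples, computes the eikonal factor for one radiator pair; the coupling and the charge factor C_ik are deliberately left out. The back-to-back example momenta are assumptions chosen for simplicity.

```python
def mdot(p, q):
    """Minkowski product p.q for four-vectors (E, px, py, pz)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def eikonal(p_i, p_j, p_k):
    """Kinematic eikonal factor 2 p_i.p_k / ((2 p_i.p_j)(2 p_j.p_k))."""
    return 2.0 * mdot(p_i, p_k) / ((2.0 * mdot(p_i, p_j)) * (2.0 * mdot(p_j, p_k)))


# Two back-to-back massless radiators and a soft massless emission at 90 degrees.
p_i = (1.0, 0.0, 0.0, 1.0)
p_k = (1.0, 0.0, 0.0, -1.0)
p_j = (0.1, 0.1, 0.0, 0.0)
print(eikonal(p_i, p_j, p_k))

# Halving the soft momentum enhances the factor fourfold (1/E_j^2 scaling).
p_j_softer = tuple(0.5 * c for c in p_j)
print(eikonal(p_i, p_j_softer, p_k) / eikonal(p_i, p_j, p_k))
```

For this configuration the factor equals the inverse squared transverse momentum of the emission with respect to the radiator axis, which displays the soft singularity explicitly.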
Besides the factorization of matrix elements, in eq. (79) we used that the n + 1-particle phase
space exactly factorizes into a product of an n-particle phase space and the branching phase
space dΦ+1 , obtained through a formal insertion of an intermediate off-shell particle with mass

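$m_{ij}^2 = (p_i + p_j)^2$,

\[
\begin{aligned}
\mathrm{d}\Phi_{n+1}(q; p_1, \ldots, p_i, p_j, p_k, \ldots, p_{n+1}) &= \mathrm{d}\Phi_n(q; p_1, \ldots, p_{\tilde{\imath}}, p_{\tilde{k}}, \ldots, p_n) \\
&\quad \times J(p_{\tilde{\imath}}, p_{\tilde{k}}; p_{ij}, p_k) \, \mathrm{d}m_{ij}^2 \, \mathrm{d}\Phi_2(p_{ij}; p_i, p_j) \\
&\equiv \mathrm{d}\Phi_n(q; p_1, \ldots, p_{\tilde{\imath}}, p_{\tilde{k}}, \ldots, p_n) \, \mathrm{d}\Phi_{+1}(p_i, p_j, p_k) .
\end{aligned}
\tag{82}
\]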

It must be emphasized that the n-particle phase-space measure is here written with on-shell
momenta $p_{\tilde{\imath}}$ and $p_{\tilde{k}}$ instead of an off-shell intermediate momentum $p_{ij}$. This means we here assume
an on-shell phase-space factorization, i.e. that after each emission, all momenta are separately
physical and momentum is conserved at each step in the shower,

\[ p_{\tilde{\imath}} + p_{\tilde{k}} = p_i + p_j + p_k . \tag{83} \]

The change from the off-shell momenta $\{p_{ij}, p_k\}$ to the on-shell momenta $\{p_{\tilde{\imath}}, p_{\tilde{k}}\}$ is represented
by the Jacobian $J(p_{\tilde{\imath}}, p_{\tilde{k}}; p_{ij}, p_k)$. Specific forms of kinematic mappings $\{p_{\tilde{\imath}}, p_{\tilde{k}}\} \mapsto \{p_i, p_j, p_k\}$ (or
“recoil schemes”) are again shower specific. Presently, however, all showers in PYTHIA 8.3 employ
an on-shell factorization as described here. While this might not generally be required, this is a
key requirement for the matching and merging techniques utilized in PYTHIA 8.3, cf. section 5.
The branching phase space dΦ+1 accounts for the degrees of freedom entering through the
emission of one particle from the n-particle configuration and can generally be expressed in terms
of three “shower variables” t, z, and φ,

\[ \mathrm{d}\Phi_{+1}(p_i, p_j, p_k) = |J(t, z, \phi)| \, \mathrm{d}\Phi_{+1}(t, z, \phi) = \frac{1}{16\pi^2} \, |J(t, z, \phi)| \, \mathrm{d}t \, \mathrm{d}z \, \mathrm{d}\phi . \tag{84} \]
Usually, t is interpreted as the ordering variable of the shower, z as some kind of energy-sharing
variable, and φ as the angle about the branching plane in the i-j-k rest frame. However, different
showers make different choices which may be more or less connected with this analogy.
Addressing point 3 of the list above, it is instructive to start by noting that various choices
of ordering variables are formally equivalent at the Leading Logarithmic (LL) level, as can be
seen by comparing the differentials as they enter through the matrix-element and phase-space
factorizations described above,
\[ \frac{\mathrm{d}t}{t} = \frac{\mathrm{d}p_{\perp,j}^2}{p_{\perp,j}^2} = \frac{\mathrm{d}m_{ij}^2}{m_{ij}^2} = \frac{\mathrm{d}\theta_{ij}^2}{\theta_{ij}^2} , \tag{85} \]

and by noting that in the collinear limit $p_{\perp,j}^2 \sim z(1 - z) \, m_{ij}^2 \sim z^2 (1 - z)^2 E_j^2 \theta_{ij}^2$. It is straightforward
to see that all these choices represent a certain measure of softness or collinearity, as required
above. The requirement that this measure remains appropriate during the shower evolution then
translates into strong ordering of emissions, i.e. subsequent emissions evolve down in the ordering
scale: $t_0 > t_1 > t_2 > \ldots > t_n$.
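
The equivalence expressed by eq. (85) follows in one line from the collinear relations quoted above. As a short worked step, assuming that z and the energy $E_j$ are held fixed while the ordering variable is varied, taking logarithms gives:

```latex
\begin{aligned}
\ln p_{\perp,j}^2 &\simeq \ln\left[ z(1-z) \right] + \ln m_{ij}^2
  &&\Rightarrow\quad
  \left. \frac{\mathrm{d}p_{\perp,j}^2}{p_{\perp,j}^2} \right|_{z}
  = \left. \frac{\mathrm{d}m_{ij}^2}{m_{ij}^2} \right|_{z} , \\
\ln m_{ij}^2 &\simeq \ln\left[ z(1-z) E_j^2 \right] + \ln \theta_{ij}^2
  &&\Rightarrow\quad
  \left. \frac{\mathrm{d}m_{ij}^2}{m_{ij}^2} \right|_{z, E_j}
  = \left. \frac{\mathrm{d}\theta_{ij}^2}{\theta_{ij}^2} \right|_{z, E_j} .
\end{aligned}
```

The candidate ordering variables thus differ only by z-dependent factors, which leave the logarithmic dt/t structure unchanged.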
Putting the above together, a no-branching probability, often also called Sudakov factor, can
be defined:
\[ \Pi_n(t_n, t_{n+1}; \Phi_n) = \exp\left\{ - \int_{t_{n+1}}^{t_n} K_{n \mapsto n+1}\big( \Phi_n, \Phi_{+1}(t', z', \phi') \big) \, \mathrm{d}\Phi_{+1}(t', z', \phi') \right\} . \tag{86} \]

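It describes the evolution from an n-particle state at scale $t_n$ to an (n + 1)-particle state at scale
$t_{n+1} < t_n$. By rewriting $K_{n \mapsto n+1}$ as the sum of radiation functions $K_{j/\tilde{\imath}\tilde{k}}$, $\Pi_n$ can be written as the
product of $\tilde{\imath}\tilde{k} \mapsto ijk$ no-branching probabilities:
\[
\begin{aligned}
\Pi_n(t_n, t_{n+1}; \Phi_n) &= \exp\left\{ - \sum_j \int_{t_{n+1}}^{t_n} \int_{z_{\min}}^{z_{\max}} \int_0^{2\pi} \frac{1}{16\pi^2} \, K_{j/\tilde{\imath}\tilde{k}}(t', z', \phi') \, J(t', z', \phi') \, \frac{\mathrm{d}\phi'}{2\pi} \, \mathrm{d}z' \, \mathrm{d}t' \right\} \\
&= \prod_j \exp\left\{ - \int_{t_{n+1}}^{t_n} \int_{z_{\min}}^{z_{\max}} \int_0^{2\pi} \frac{1}{16\pi^2} \, K_{j/\tilde{\imath}\tilde{k}}(t', z', \phi') \, J(t', z', \phi') \, \frac{\mathrm{d}\phi'}{2\pi} \, \mathrm{d}z' \, \mathrm{d}t' \right\} \\
&= \prod_j \Pi_{j/\tilde{\imath}\tilde{k}}(t_n, t_{n+1}; \Phi_n) .
\end{aligned}
\tag{87}
\]

Written this way, it is also emphasized that each branching $\tilde{\imath}\tilde{k} \mapsto ijk$ comes with its own branching
phase space and kinematic mapping. This is how the full no-branching probability $\Pi_{n \mapsto n+1}$ is
implemented in shower algorithms in practice.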
For the calculation of the expected value of an observable O, the no-branching probabilities
enter to describe the shower evolution as a Markov chain,
\[ \langle O \rangle_n^{\mathrm{PS}} = \int S_n(t, O) \, \frac{\mathrm{d}\sigma_n}{\mathrm{d}\Phi_n} \, \mathrm{d}\Phi_n , \tag{88} \]

which is generated by a “shower operator” $S_n(t, O)$, defined recursively as

\[ S_n(t, O) := \Pi_n(t, t_c; \Phi_n) \, O(\Phi_n) + \int_{t_c}^{t} K_{n \mapsto n+1} \, \Pi_n(t, t'; \Phi_n) \, S_{n+1}(t', O) \, \mathrm{d}\Phi_{+1}(t', z', \phi') . \tag{89} \]

This shower operator makes the unitarity of the shower explicit. The first term implicitly accounts
for all unresolved radiation and virtual corrections between the shower starting scale t and the
shower cutoff t c . The second term, on the other hand, describes the emission of a single particle,
approximated by the sum of radiation functions $K_{n \mapsto n+1}$, and includes all unresolved and virtual
corrections between the shower starting and cutoff scale.
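
Unitarity can be verified numerically in a toy setting. The sketch below assumes a one-dimensional evolution with radiation function K(t') = c/t', so that the no-branching probability is Π(t, t') = (t'/t)^c, and sets O = 1 and S_{n+1} = 1; the two terms of eq. (89) must then add up to unity. The kernel, the constant c, and the function names are assumptions made for illustration only.

```python
import math


def no_branching(t_high, t_low, c):
    """Toy no-branching probability for K(t) = c/t: (t_low / t_high)**c."""
    return (t_low / t_high) ** c


def shower_unitarity_sum(t_start, t_cut, c, steps=20000):
    """Evaluate Pi(t, t_c) + int_{t_c}^{t} K(t') Pi(t, t') dt' numerically.

    The integral is done with the midpoint rule in log(t'), using
    K(t') dt' = c dlog(t').
    """
    log_lo, log_hi = math.log(t_cut), math.log(t_start)
    d_log = (log_hi - log_lo) / steps
    emission = 0.0
    for i in range(steps):
        t_prime = math.exp(log_lo + (i + 0.5) * d_log)
        emission += c * no_branching(t_start, t_prime, c) * d_log
    return no_branching(t_start, t_cut, c) + emission


print(shower_unitarity_sum(1.0e4, 1.0, 0.3))
```

The result equals one up to the discretization error of the integration, independently of the starting scale, the cutoff, or the value of c.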
It is instructive to make the form of the no-branching probability eq. (87) more explicit for
QCD showers. Implicitly, the radiation functions $K_{j/\tilde{\imath}\tilde{k}}$ above contain the strong-coupling constant,
a colour factor, and, for ISR, a ratio of PDFs,

\[ K_{j/\tilde{\imath}\tilde{k}}(t, z, \phi) = g_s^2(t) \, R_{\mathrm{PDF}}(t, z) \, C_{j/\tilde{\imath}\tilde{k}} \, \bar{K}_{j/\tilde{\imath}\tilde{k}}(t, z, \phi) = 4\pi\alpha_s(t) \, R_{\mathrm{PDF}}(t, z) \, C_{j/\tilde{\imath}\tilde{k}} \, \bar{K}_{j/\tilde{\imath}\tilde{k}}(t, z, \phi) , \tag{90} \]

where we have introduced the coupling-, PDF-, and colour-factor-stripped radiation function $\bar{K}_{j/\tilde{\imath}\tilde{k}}$,
which depends solely on the branching kinematics. For FSR, the PDF ratio is equal to unity,
$R_{\mathrm{PDF}} = 1$, as the initial-state momenta do not change due to the branching. Differentially in
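the evolution variable t, the integral in the exponent of $\Pi_{j/\tilde{\imath}\tilde{k}}$ can thus be written as

\[ \frac{\mathrm{d}\mathcal{P}^{\mathrm{FSR}}_{j/\tilde{\imath}\tilde{k}}(t)}{\mathrm{d}t} = \frac{\alpha_s(t)}{2\pi} \, \frac{C_{j/\tilde{\imath}\tilde{k}}}{2} \int_{z_{\min}}^{z_{\max}} \int_0^{2\pi} \bar{K}_{j/\tilde{\imath}\tilde{k}}(t, z', \phi') \, J(t, z', \phi') \, \frac{\mathrm{d}\phi'}{2\pi} \, \mathrm{d}z' , \tag{91} \]
\[ \frac{\mathrm{d}\mathcal{P}^{\mathrm{ISR}}_{j/\tilde{\imath}\tilde{k}}(t)}{\mathrm{d}t} = \frac{\alpha_s(t)}{2\pi} \, \frac{C_{j/\tilde{\imath}\tilde{k}}}{2} \int_{z_{\min}}^{z_{\max}} \int_0^{2\pi} R_{\mathrm{PDF}}(t, z) \, \bar{K}_{j/\tilde{\imath}\tilde{k}}(t, z', \phi') \, J(t, z', \phi') \, \frac{\mathrm{d}\phi'}{2\pi} \, \mathrm{d}z' , \tag{92} \]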

for FSR and ISR, respectively. Written this way, the connection to eqs. (75) and (78),
respectively, is immediately evident. It is worthwhile to point out here that different shower
algorithms typically differ as to whether colour factors are included in or excluded from the radiation
functions. Moreover, depending on whether a shower aims at describing the evolution of a single
initial-state leg at a time or both at the same time, the PDF ratios $R_{\mathrm{PDF}}$ have to include one PDF
ratio,
\[ R_{\mathrm{PDF}}(t, z) = \frac{x_i f_i(x_i, t)}{x_{\tilde{\imath}} f_{\tilde{\imath}}(x_{\tilde{\imath}}, t)} , \tag{93} \]
or two PDF ratios if both initial-state particles are evolved at the same time,

\[ R_{\mathrm{PDF}}(t, z) = \frac{x_i f_i(x_i, t)}{x_{\tilde{\imath}} f_{\tilde{\imath}}(x_{\tilde{\imath}}, t)} \, \frac{x_k f_k(x_k, t)}{x_{\tilde{k}} f_{\tilde{k}}(x_{\tilde{k}}, t)} . \tag{94} \]

The x-fractions, defined through $p_i = x_i P$ with P the incoming hadron momentum, depend on the shower variables
t and z.
A similar analysis can be done in the cases of QED or EW showers, where the QCD coupling
has to be replaced by the electromagnetic/electroweak coupling and QCD colour factors by the
appropriate QED/EW charges.

Formal accuracy Despite their success in describing wide classes of observables with often im-
pressive agreement with experimental data, parton showers commonly work with a number of
approximations. It is not an easy task to formally assess the accuracy of a given shower model,
i.e. to determine which exact terms of a perturbative series a shower includes. For a start, there
are three expansions to be considered:

1. the perturbative expansion in the coupling constant $\alpha^n(t)$, determining the accuracy of the
hard process, e.g. leading-order (LO), next-to-leading order (NLO), etc.;

2. the perturbative expansion in large logarithms $\alpha^n(t) \log^m(t_{\mathrm{hard}}/t)$, determining the accuracy
of the resummation, e.g. Leading Logarithmic (LL), Next–to–Leading Logarithmic (NLL),
etc.;

3. and for QCD showers, the expansion in the number of colours (Nc ), determining the accuracy
of the colour factors in the resummation, e.g. Leading Colour (LC), Next–to–Leading Colour
(NLC), etc.

A baseline shower would for example start from a LO matrix element and (typically) generate the
LL corrections arising from additional radiation under the LC assumption of planar colour flows.
Such a shower could be assigned a LO+LL+LC accuracy. This can be expected from virtually
all common shower models, although observables may exist for which a given shower does not
correctly include the LL terms. It is more interesting, however, to determine if and for which
observables showers reach sub-leading, i.e. higher, accuracy than the LO+LL+LC minimum.
Increasing the accuracy on the fixed-order side can be addressed by matching and merging
methods, which are described in detail in section 5. Matching and merging at LO and NLO have
de facto become state of the art for all showers and processes.
Assessing and increasing the logarithmic accuracy of showers has become a highly active field,
where no general solution has yet been developed. Different approaches to assess the logarithmic
accuracy of showers have been developed in the recent past, such as ones based on comparison of
analytic and numerical resummation [56, 57], analytic examination of the logarithmic structure
of showers [58, 59], or numerical checks of logarithmic terms [60, 61]. Moreover, for simple
processes such as e+ e− annihilation to jets, first shower models have been developed that can
be shown to give NLL accuracy for a wider range of observables [62, 63]. Most common shower
models currently only obtain a formal LL accuracy, with varying, observable-dependent subleading
accuracy.
Lastly, the inclusion of sub-leading colour corrections in parton showers is an active field as
well, with approaches based on matrix-element corrections [59, 64–68], sampling of colours [69,
70], quantum-probability density-matrix arguments [71, 72], or amplitude-level evolution [73,
74]. Sub-leading colour corrections are not in general universally applied in parton showers.

Showers in PYTHIA 8.3 There are three different shower modules available in PYTHIA 8.3: the
original/default simple shower, the VINCIA antenna shower, and DIRE. These will be discussed in
detail below in section 4.1, section 4.2, and section 4.3, respectively.

4.1 The simple shower


The “simple shower” is the oldest parton-shower algorithm in PYTHIA 8 and is also the default
shower model in PYTHIA 8.3. It has its origin in the mass-ordered showers in JETSET/PYTHIA [52,
75–77], with the transition to p⊥ ordering [78] partly influenced by the Lund dipole picture [54]
and partly by the desire to combine the ISR and FSR shower evolution with MPI in a single inter-
leaved sequence [78].
Over the years, significant revisions and extensions have been introduced, many of them only
available in recent PYTHIA versions. This includes:

• Full interleaving of ISR, FSR, and MPI [79].

• Options for a dipole-style treatment of initial-final colour flows [80].

• $f \to f\gamma$ and $\gamma \to f\bar{f}$ splittings (where $f$ represents charged fermions).

• Matrix element corrections for resonance decays and a few other processes [76, 77, 81].

• Extensive facilities for matching and merging (cf. section 5).

• Reweighted shower branchings and uncertainty bands [20].

• A flexible treatment of showers in baryon-number-violating processes [82].

• Weak showers [83].

• Hidden-sector showers [38, 39].

The name “simple” shower here refers to the limited aim of a consistent leading-logarithmic
(and beyond) shower evolution, with several known shortcomings [58,59], as opposed to the more
sophisticated goals of the alternative VINCIA (cf. section 4.2) and DIRE (cf. section 4.3) shower
options, also available in PYTHIA 8.3. It should be emphasized that, by virtue of its longer history,
many features are only available in the simple shower, and that as such the naming might be
slightly misleading. As an example, the simple shower offers a much larger selection of matching
and merging methods than VINCIA or DIRE do.
The shower machinery consists of one algorithm for FSR and one for ISR. These two are
evolved together into one combined sequence of decreasing p⊥ scales. As an example, consider
a partonic process a + b → c + d, where a and b are extracted from the beams A and B. It is
then possible for c and d to undergo FSR branchings, and for a and b backwards-evolution ISR
ones. Starting from some maximal scale p⊥max , downwards evolution gives a possible branching
p⊥ scale for each of the four partons. The one with largest p⊥ is the winner that undergoes a
branching, leading to a new state of five partons. The selected p⊥ value is taken as the new start-
ing point for all five partons to evolve further down in p⊥ , giving a new branching. This is applied
iteratively until some lower cutoff is reached and the evolution is stopped. Also, MPIs will form
part of this evolution, see section 6.2.5.
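
The competition described above can be illustrated with a minimal toy sketch. It assumes a fixed-coupling kernel $\mathrm{d}P = c\,\mathrm{d}p_\perp^2/p_\perp^2$ that is identical for every parton, so that the no-branching probability can be inverted analytically; the function names, the rate `c`, and the absence of any kinematics, PDFs, or MPIs are simplifications for illustration and not the PYTHIA implementation.

```python
import random


def next_pt2(pt2_start, rate, rng):
    """Sample a trial scale below pt2_start for the toy kernel
    dP = rate * dpT2 / pT2, by solving (pT2 / pT2_start) ** rate = R."""
    r = 1.0 - rng.random()  # uniform in (0, 1]
    return pt2_start * r ** (1.0 / rate)


def evolve(n_partons, pt2_max, pt2_min, rate=0.3, seed=1):
    """Return the ordered list of (pT2, index of the branching parton)."""
    rng = random.Random(seed)
    history = []
    pt2_now = pt2_max
    while True:
        # Each existing parton proposes its own trial branching scale.
        trials = [next_pt2(pt2_now, rate, rng) for _ in range(n_partons)]
        winner = max(range(n_partons), key=lambda i: trials[i])
        pt2_now = trials[winner]
        if pt2_now < pt2_min:
            break  # the lower cutoff is reached, evolution stops
        history.append((pt2_now, winner))
        n_partons += 1  # the branching adds one parton to the state
    return history


if __name__ == "__main__":
    for pt2, who in evolve(4, 1.0e4, 1.0)[:5]:
        print(f"parton {who} branches at pT2 = {pt2:.1f}")
```

Restarting all partons from the winning scale, rather than from their own losing trial scales, is what makes the resulting sequence a single ordered chain in $p_\perp$.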

4.1.1 Basic shower branchings


The description of showers in the introduction of this section is valid for the simple shower framework.
Notably the branching probabilities $\mathrm{d}P_a$ of eq. (75) and $\mathrm{d}P_b$ of eq. (78) play a central part,
but with two key additions.
One is that evolution is performed in terms of transverse momenta, i.e. the generic $Q^2$ scale
in eq. (75) and eq. (78) is replaced by a $p_{\perp\mathrm{evol}}^2$. The use of transverse momentum as an evolution
variable has been shown to catch key coherence features and is therefore a preferred choice [53,
54].
The other is that a dipole picture is being used, although with some exceptions. In it each
coloured parton has a unique anticolour partner, and together the two form a dipole. Radiation
is split into one contribution from each dipole end. When one end radiates, the other end has to
take a recoil such that total energy and momentum are conserved.

Shower evolution To understand basic kinematics in a branching $a \to bc$, expressions become
especially simple using light-cone (LC) momenta $p^\pm = E \pm p_z$, for which $p^+ p^- = m_\perp^2 = m^2 + p_\perp^2$.
When $a$ moves along the $+z$ axis, with $p_b^+ = z_{\mathrm{LC}}\, p_a^+$ and $p_c^+ = (1 - z_{\mathrm{LC}})\, p_a^+$, $p^-$ conservation then gives

\[ m_a^2 = \frac{m_b^2 + p_\perp^2}{z_{\mathrm{LC}}} + \frac{m_c^2 + p_\perp^2}{1 - z_{\mathrm{LC}}} , \tag{95} \]

or equivalently

\[ p_\perp^2 = z_{\mathrm{LC}} (1 - z_{\mathrm{LC}})\, m_a^2 - (1 - z_{\mathrm{LC}})\, m_b^2 - z_{\mathrm{LC}}\, m_c^2 = p_{\perp\mathrm{LC}}^2 . \tag{96} \]

For a timelike branching $Q^2 = m_a^2$ and $m_b = m_c = 0$, assuming massless partons, so then
$p_{\perp\mathrm{LC}}^2 = z_{\mathrm{LC}} (1 - z_{\mathrm{LC}}) Q^2$. For a spacelike branching $Q^2 = -m_b^2$ and $m_a = m_c = 0$, where $b$ is the
parton that will enter the hard interaction, so instead $p_{\perp\mathrm{LC}}^2 = (1 - z_{\mathrm{LC}}) Q^2$. We are inspired by
these relations to define abstract evolution variables

\[ p_{\perp\mathrm{evol}}^2 = z(1 - z) Q^2 \quad \text{for FSR} , \tag{97} \]
\[ p_{\perp\mathrm{evol}}^2 = (1 - z) Q^2 \quad \text{for ISR} , \tag{98} \]

in which to order the sequence of shower emissions. The $z_{\mathrm{LC}}$ definitions will be replaced by
invariant-mass-based $z$ for the final kinematics definitions, for better Lorentz invariance properties,
and as a consequence $p_{\perp\mathrm{evol}} \neq p_{\perp\mathrm{LC}}$. Further details on this are given later.
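
As a small numerical cross-check of eqs. (95) and (96), the following sketch evaluates the light-cone $p_\perp^2$ and inverts it again. The function names are chosen here for illustration only.

```python
import math


def pt2_lc(z_lc, ma2, mb2=0.0, mc2=0.0):
    """Light-cone transverse momentum squared, eq. (96)."""
    return z_lc * (1.0 - z_lc) * ma2 - (1.0 - z_lc) * mb2 - z_lc * mc2


def ma2_from_pt2(z_lc, pt2, mb2=0.0, mc2=0.0):
    """Mother virtuality from p^- conservation, eq. (95)."""
    return (mb2 + pt2) / z_lc + (mc2 + pt2) / (1.0 - z_lc)


if __name__ == "__main__":
    q2, z = 100.0, 0.3
    # Timelike: Q2 = m_a^2, massless daughters.
    print(pt2_lc(z, ma2=q2))
    # Spacelike: Q2 = -m_b^2, massless a and c.
    print(pt2_lc(z, ma2=0.0, mb2=-q2))
    # Round trip with massive daughters.
    pt2 = pt2_lc(z, 100.0, 4.0, 1.0)
    print(math.isclose(ma2_from_pt2(z, pt2, 4.0, 1.0), 100.0))
```

The two printed values reduce to $z(1-z)Q^2$ and $(1-z)Q^2$, which are the forms that motivate eqs. (97) and (98).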
The evolution is now carried out, downwards in $p_{\perp\mathrm{evol}}^2$ from some starting scale $p_{\perp\mathrm{evol,max}}^2$, for
FSR by parton $a$ branching to $b + c$, for ISR by parton $b$ being reconstructed as coming from the
branching of an earlier $a$. The branching probabilities of eq. (75) and eq. (78), with the addition
of no-branching probabilities $\Pi$, eq. (76), give

\[ \mathrm{d}P_{\mathrm{FSR}} = \Pi_a(p_{\perp\mathrm{evol,max}}^2, p_{\perp\mathrm{evol}}^2) \, \mathrm{d}P_a(p_{\perp\mathrm{evol}}^2) , \tag{99} \]
\[ \mathrm{d}P_{\mathrm{ISR}} = \Pi_b(x, p_{\perp\mathrm{evol,max}}^2, p_{\perp\mathrm{evol}}^2) \, \mathrm{d}P_b(x, p_{\perp\mathrm{evol}}^2) . \tag{100} \]

A $p_{\perp\mathrm{evol}}^2$ scale is selected for each existing dipole end, and the end with the largest value is chosen
to branch.
The selection of a branching means that $p_{\perp\mathrm{evol}}^2$ and $z$ are fixed. From these, one can derive the
virtuality of the evolving parton

\[ m_a^2 = Q^2 = \frac{p_{\perp\mathrm{evol}}^2}{z(1 - z)} \quad \text{for FSR} , \tag{101} \]
\[ -m_b^2 = Q^2 = \frac{p_{\perp\mathrm{evol}}^2}{1 - z} \quad \text{for ISR} . \tag{102} \]

What now remains is to construct the kinematics of the branching. This works rather differently
for FSR and for ISR, so the two cases are presented separately.

FSR branching kinematics Study the radiation inside a dipole, consisting of a radiator $a$ and a
recoiler $r$, in the dipole rest frame, with $a$ moving in the $+z$ direction, and with $m_{ar}^2 = (p_a + p_r)^2$.
For massless partons, the introduction of an off-shell $Q^2 = m_a^2$ increases $E_a$ from $m_{ar}/2$ to
$(m_{ar}^2 + Q^2)/(2 m_{ar})$, with $E_r$ reduced by the same amount, or in terms of four-momenta

\[ p_{a'} = p_a + \frac{Q^2}{m_{ar}^2} \, p_r , \qquad p_{r'} = \left( 1 - \frac{Q^2}{m_{ar}^2} \right) p_r . \tag{103} \]

The two daughters share the energy according to $E_b = z E_a$ and $E_c = (1 - z) E_a$. With the modified
$a$ still along the $+z$ axis, the transverse momentum of the two daughters then becomes

\[ p_{\perp b,c}^2 = \frac{z(1 - z)(m_{ar}^2 + Q^2)^2 - m_{ar}^2 Q^2}{(m_{ar}^2 - Q^2)^2} \, Q^2 \leq z(1 - z) Q^2 = p_{\perp\mathrm{evol}}^2 . \tag{104} \]

The kinematics can now be completed, including a random ϕ orientation of the p⊥ . Also, if
the original dipole had to be boosted and rotated to its rest frame, the new system should be
transformed back to the original frame.
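
The construction can be checked numerically with a short sketch that builds the daughter four-momenta in the dipole rest frame from the energy sharing $E_b = z E_a$, and compares the resulting transverse momentum with eq. (104). Massless partons are assumed throughout, and the function names and the $(E, p_x, p_y, p_z)$ tuple layout are choices made here for illustration.

```python
import math


def fsr_branching(m_ar, q2, z, phi=0.0):
    """Four-momenta (E, px, py, pz) of b, c and the recoiler r'
    in the dipole rest frame, for massless b, c and r."""
    e_a = (m_ar**2 + q2) / (2.0 * m_ar)  # energy of the off-shell a
    p_a = (m_ar**2 - q2) / (2.0 * m_ar)  # |momentum| of a, and energy of r'
    e_b, e_c = z * e_a, (1.0 - z) * e_a
    # pz of b follows from requiring both daughters to be massless.
    pz_b = (p_a**2 + e_b**2 - e_c**2) / (2.0 * p_a)
    pt2 = e_b**2 - pz_b**2
    if pt2 < 0.0:
        raise ValueError("z outside the kinematically allowed range")
    pt = math.sqrt(pt2)
    p_b = (e_b, pt * math.cos(phi), pt * math.sin(phi), pz_b)
    p_c = (e_c, -pt * math.cos(phi), -pt * math.sin(phi), p_a - pz_b)
    p_r = (p_a, 0.0, 0.0, -p_a)
    return p_b, p_c, p_r


def pt2_eq104(m_ar, q2, z):
    """Closed-form daughter pT2 of eq. (104)."""
    num = z * (1.0 - z) * (m_ar**2 + q2) ** 2 - m_ar**2 * q2
    return num / (m_ar**2 - q2) ** 2 * q2


if __name__ == "__main__":
    p_b, p_c, p_r = fsr_branching(100.0, 900.0, 0.4, phi=0.7)
    print(p_b[1] ** 2 + p_b[2] ** 2, pt2_eq104(100.0, 900.0, 0.4))
```

The explicit $\varphi$ argument stands in for the random azimuthal orientation mentioned in the text.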

Colours are also assigned in the branching, such that the new colour-dipole picture is set up.
This is well defined in the $N_c \to \infty$ limit, except for $g \to gg$ branchings. Here a rewriting [54],

\[ P_{g \to gg}(z) = 3 \, \frac{(1 - z(1 - z))^2}{z(1 - z)} = \frac{3}{2} \, \frac{1 + z^3}{1 - z} + \frac{3}{2} \, \frac{1 + (1 - z)^3}{z} \simeq 3 \, \frac{1 + z^3}{1 - z} , \tag{105} \]

allows the gluon that takes the (usually smaller) $1 - z$ fraction to be the “radiated” gluon that
connects the “radiator” gluon to the recoiler.
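
The exact part of the rewriting in eq. (105) can be verified numerically. The sketch below compares the full kernel with the sum of the two pieces, each of which is singular only for one soft gluon; the function names are illustrative.

```python
import math


def p_gg_full(z):
    """Full g -> gg splitting kernel, first form in eq. (105)."""
    return 3.0 * (1.0 - z * (1.0 - z)) ** 2 / (z * (1.0 - z))


def p_gg_half(z):
    """One piece of the split kernel, singular only for z -> 1."""
    return 1.5 * (1.0 + z**3) / (1.0 - z)


if __name__ == "__main__":
    for z in (0.1, 0.35, 0.5, 0.9):
        # The second piece is the first one with z <-> 1 - z.
        print(z, p_gg_full(z), p_gg_half(z) + p_gg_half(1.0 - z))
```

The final form in eq. (105) then follows from the $z \leftrightarrow 1 - z$ symmetry of the two identical final-state gluons, which allows the two pieces to be folded into one.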
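Of note is that the light-cone sharing of momenta between daughters, suggested initially, here
is replaced by an energy sharing. It has the advantage that $p_{\perp\mathrm{evol}}^2$ and this $z$ together exactly match
on to the singularity structure of matrix elements, such as the textbook $\gamma^*/Z \to q(1) + \bar{q}(2) + g(3)$
one, when $q \to qg$ and $\bar{q} \to \bar{q}g$ radiation from the two dipole ends is combined

\[ \frac{\mathrm{d}p_{\perp\mathrm{evol},q}^2}{p_{\perp\mathrm{evol},q}^2} \, \frac{\mathrm{d}z_q}{1 - z_q} + \frac{\mathrm{d}p_{\perp\mathrm{evol},\bar{q}}^2}{p_{\perp\mathrm{evol},\bar{q}}^2} \, \frac{\mathrm{d}z_{\bar{q}}}{1 - z_{\bar{q}}} = \frac{\mathrm{d}x_1 \, \mathrm{d}x_2}{(1 - x_2) x_3} + \frac{\mathrm{d}x_1 \, \mathrm{d}x_2}{(1 - x_1) x_3} = \frac{\mathrm{d}x_1 \, \mathrm{d}x_2}{(1 - x_1)(1 - x_2)} , \tag{106} \]

with $x_i = 2 E_i / E_{\mathrm{tot}}$. Corrections to fully reproduce several important matrix elements therefore
are easily implemented.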
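
The last equality in eq. (106) relies on $x_1 + x_2 + x_3 = 2$. A brief sketch, with illustrative function names, confirms this and also shows the fraction of the full dipole radiation attributed to each end.

```python
import math


def dipole_end_weights(x1, x2):
    """The two terms of eq. (106), without the dx1 dx2 measure."""
    x3 = 2.0 - x1 - x2  # energy conservation for massless partons
    w_q = 1.0 / ((1.0 - x2) * x3)     # singular for g collinear with q
    w_qbar = 1.0 / ((1.0 - x1) * x3)  # singular for g collinear with qbar
    return w_q, w_qbar


if __name__ == "__main__":
    x1, x2 = 0.9, 0.7
    w_q, w_qbar = dipole_end_weights(x1, x2)
    print(w_q + w_qbar, 1.0 / ((1.0 - x1) * (1.0 - x2)))
    print("fraction from the q end:", w_q / (w_q + w_qbar))
```

The fraction assigned to the $q$ end works out to $(1 - x_1)/x_3$, which varies smoothly between 0 and 1 across the dipole.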
Incidentally, note that $1 - x_2 \propto 1 - \cos\theta_{qg}$ and $1 - x_1 \propto 1 - \cos\theta_{\bar{q}g}$, so eq. (106) provides a
prescription for how radiation from the full dipole can smoothly be split into radiation from the two ends
as a function of the gluon emission angle. This split also decides which of the two original partons
is the recoiler, the one that keeps its direction of motion.
The kinematics need to be modified when quark masses are included, with full expressions
in ref. [77]. There are two key points, however. First, if the branching parton $a$ has an on-shell
mass $m_a$ and off-shell mass $m_{a'}$, then eq. (97) needs to be modified to

\[ p_{\perp\mathrm{evol}}^2 = z(1 - z) Q^2 = z(1 - z)(m_{a'}^2 - m_a^2) , \tag{107} \]

to reproduce the singularities in matrix elements. Second, if the daughters are initially assigned
four-momenta $p_b^{(0)}$ and $p_c^{(0)}$ as if they were massless, then massive four-vectors can be constructed
as

\[ p_b = (1 - k_b) \, p_b^{(0)} + k_c \, p_c^{(0)} , \tag{108} \]
\[ p_c = (1 - k_c) \, p_c^{(0)} + k_b \, p_b^{(0)} , \tag{109} \]
\[ k_{b,c} = \frac{m_a^2 - \sqrt{(m_a^2 - m_b^2 - m_c^2)^2 - 4 m_b^2 m_c^2} \pm (m_c^2 - m_b^2)}{2 m_a^2} . \tag{110} \]

The $p_{\perp b,c}$ is also reduced in the process, by a factor $1 - k_b - k_c$.
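
The mass-shell properties of eqs. (108)–(110) can be checked with a small sketch. It assumes, for simplicity, massless daughters that are back-to-back along the $z$ axis in the rest frame of a system of mass $m_a$, and it takes the upper sign in eq. (110) to belong to $k_b$; both are choices made here for illustration.

```python
import math


def mass2(p):
    """Invariant mass squared of a four-vector (E, px, py, pz)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2


def massive_daughters(m_a, m_b, m_c):
    """Turn massless back-to-back daughters into massive ones."""
    half = 0.5 * m_a
    p_b0 = (half, 0.0, 0.0, half)
    p_c0 = (half, 0.0, 0.0, -half)
    lam = math.sqrt((m_a**2 - m_b**2 - m_c**2) ** 2 - 4.0 * m_b**2 * m_c**2)
    # Upper sign of eq. (110) for k_b, lower sign for k_c (assumed).
    k_b = (m_a**2 - lam + (m_c**2 - m_b**2)) / (2.0 * m_a**2)
    k_c = (m_a**2 - lam - (m_c**2 - m_b**2)) / (2.0 * m_a**2)
    p_b = tuple((1.0 - k_b) * b + k_c * c for b, c in zip(p_b0, p_c0))
    p_c = tuple((1.0 - k_c) * c + k_b * b for b, c in zip(p_b0, p_c0))
    return p_b, p_c


if __name__ == "__main__":
    p_b, p_c = massive_daughters(10.0, 4.8, 1.5)
    print(math.sqrt(mass2(p_b)), math.sqrt(mass2(p_c)))
```

Since $p_b + p_c = p_b^{(0)} + p_c^{(0)}$ by construction, the total four-momentum of the pair is unchanged by the reshuffling.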

ISR branching kinematics The handling of ISR branching kinematics is somewhat more complicated.
At any resolution scale $p_{\perp\mathrm{evol}}^2$ the ISR algorithm will identify two initial partons, one
from each incoming hadron, that are the mothers of the respective incoming cascade to the hard
interaction. These partons should be set massless and collinear with the beams. When the resolution
scale is reduced, using backwards evolution, either of these two partons may turn out to be
the daughter $b$ of a previous branching $a \to bc$. The parton $r$ on the other side of the event takes
on the role of recoiler, needed for consistent reconstruction of the kinematics when the parton $b$
previously considered massless now is assigned a spacelike virtuality $m_b^2 = -Q^2$. This redefinition
should be performed in such a way that the invariant mass of the $b + r$ system is unchanged,
since this mass corresponds to the set of produced particles, which in a case like $gg \to H$ must not
be modified. The system will have to be rotated and boosted as a whole, however, to take into
account that $b$ not only acquires a virtuality but also a transverse momentum; if previously $b$ was
assumed to move along the event axis, now it is $a$ that should do so.
At any step of the cascade, the massless mothers should have four-momenta given
by $p_i = x_i (\sqrt{s}/2)(1; 0, 0, \pm 1)$ in the rest frame of the two incoming beam particles, so that
$\hat{s} = x_1 x_2 s$. If this relation is to be preserved in the $a \to bc$ branching, then $z = x_b/x_a$ should
fulfil $z = m_{br}^2/m_{ar}^2 = (p_b + p_r)^2/(p_a + p_r)^2$. This gives an explicit construction of the kinematics
in the $a + r$ rest frame, assuming $a$ is moving along the $+z$ axis and $c$ is massless:

\[ p_{a,r} = \frac{m_{ar}}{2} \, (1; 0, 0, \pm 1) , \tag{111} \]
\[ p_b = \left( \frac{m_{ar}}{2} \, z ; \; p_{\perp b,c} \cos\varphi, \; p_{\perp b,c} \sin\varphi, \; \frac{m_{ar}}{2} \left( z + \frac{2 Q^2}{m_{ar}^2} \right) \right) , \tag{112} \]
\[ p_c = \left( \frac{m_{ar}}{2} \, (1 - z) ; \; -p_{\perp b,c} \cos\varphi, \; -p_{\perp b,c} \sin\varphi, \; \frac{m_{ar}}{2} \left( 1 - z - \frac{2 Q^2}{m_{ar}^2} \right) \right) , \tag{113} \]
\[ p_{\perp b,c}^2 = (1 - z) Q^2 - \frac{Q^4}{m_{ar}^2} < (1 - z) Q^2 = p_{\perp\mathrm{evol}}^2 . \tag{114} \]

For small $Q^2$ values the $p_{\perp b,c}^2$ and $p_{\perp\mathrm{evol}}^2$ measures agree well, but with increasing $Q^2$ the $p_{\perp b,c}^2$
will eventually turn over and decrease again (for fixed $z$ and $m_{ar}$). Simple inspection shows that
the maximum $p_{\perp b,c}^2$ occurs for $p_{zc} = 0$ and that the decreasing $p_{\perp b,c}^2$ corresponds to increasingly
negative $p_{zc}$. The drop of $p_{\perp b,c}^2$ thus is deceptive. Like for the FSR algorithm, $p_{\perp\mathrm{evol}}^2$ therefore
makes more sense than $p_{\perp b,c}^2$ as an evolution variable, despite it not always having as simple a
kinematic interpretation. One should note, however, that emissions with negative $p_{zc}$ are more
likely to come from radiation off the other incoming parton, where it is collinearly enhanced, so
in practice the region of decreasing $p_{\perp b,c}^2$ is not so important.
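
A short sketch of the massless-$c$ construction in eqs. (111)–(114) makes the constraints explicit: $b$ acquires the spacelike virtuality $-Q^2$, $c$ stays massless, and the $b + r$ invariant mass squared equals $z \, m_{ar}^2$. Function names and the tuple layout $(E, p_x, p_y, p_z)$ are illustrative choices.

```python
import math


def mass2(p):
    """Invariant mass squared of a four-vector (E, px, py, pz)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2


def isr_branching(m_ar, q2, z, phi=0.0):
    """Four-momenta of a, r, b, c in the a + r rest frame, c massless."""
    pt2 = (1.0 - z) * q2 - q2**2 / m_ar**2  # eq. (114)
    if pt2 < 0.0:
        raise ValueError("Q2 too large for this z and m_ar")
    pt = math.sqrt(pt2)
    half = 0.5 * m_ar
    shift = 2.0 * q2 / m_ar**2
    p_a = (half, 0.0, 0.0, half)
    p_r = (half, 0.0, 0.0, -half)
    p_b = (half * z, pt * math.cos(phi), pt * math.sin(phi),
           half * (z + shift))
    p_c = (half * (1.0 - z), -pt * math.cos(phi), -pt * math.sin(phi),
           half * (1.0 - z - shift))
    return p_a, p_r, p_b, p_c


if __name__ == "__main__":
    p_a, p_r, p_b, p_c = isr_branching(100.0, 400.0, 0.5, phi=1.1)
    p_br = tuple(b + r for b, r in zip(p_b, p_r))
    print(mass2(p_b), mass2(p_c), mass2(p_br))
```

For fixed $z$ and $m_{ar}$, scanning `q2` upwards in this sketch reproduces the turnover of $p_{\perp b,c}^2$ at the point where the $z$ component of `p_c` changes sign.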
Quark-mass effects are less crucial for ISR: nothing heavier than charm and bottom need be
considered as beam constituents, unlike the multitude of new massive particles one could imagine
for FSR. Kinematics have to be modified slightly if the outgoing parton $c$ is not massless, e.g. in a
$g \to q\bar{q}$ branching. The main effect is a modified evolution $p_\perp$, with eq. (98) replaced by

\[ p_{\perp\mathrm{evol}}^2 = (1 - z)(Q^2 + m_c^2) , \tag{115} \]

and a reduced $p_\perp$ in the branching, replacing eq. (114) by

\[ p_{\perp b,c}^2 = (1 - z) Q^2 - \frac{Q^4}{m_{ar}^2} - m_c^2 \left( z + \frac{Q^2}{m_{ar}^2} \right) = Q^2 - z \, \frac{(Q^2 + m_c^2)(m_{br}^2 + Q^2)}{m_{br}^2} . \tag{116} \]

Charm and bottom quarks raise another issue, namely what to do in the threshold region, i.e.
around the $Q_{\mathrm{thr}}^2$ scale where $g \to c\bar{c}$ or $g \to b\bar{b}$ branchings are turned on in the PDF evolution.
Normally, it is assumed that these quark PDFs vanish below $Q_{\mathrm{thr}}^2$ and then evolve above it as a
massless quark would. Initially, thus $f_q(x, Q^2) \propto \ln(Q^2/Q_{\mathrm{thr}}^2)$. In backwards evolution of a $c/b$
quark, this leads to a diverging $\mathrm{d}P_b$ in eq. (78) for $Q^2 \to Q_{\mathrm{thr}}^2$, and a vanishing no-branching
probability. While such a behaviour is possible to handle by evolving with gradually smaller $Q^2$
steps as the threshold is approached, the chosen solution is instead to rely on the known forwards-evolution
PDF shape. Therefore, once $p_{\perp\mathrm{evol}}^2 < f m_q^2$, with $f$ a parameter of the order of 2, a
$p_{\perp\mathrm{evol}}^2$ is chosen logarithmically evenly between $m_q^2$ and $f m_q^2$, and a $z$ flat in the allowed range.
Acceptance is based on the product of three factors, representing the running of $\alpha_s$, the splitting
kernel (including the mass term) and the gluon density weight. At failure, a new $p_{\perp\mathrm{evol}}^2$ is chosen
in the same range, i.e. is not required to be lower since no no-branching probability is involved.
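
The threshold-region selection amounts to plain accept/reject sampling, which can be sketched as follows. The acceptance weight is passed in as a user-supplied function; the `toy_weight` used here is a hypothetical placeholder for the product of the three factors named in the text, and all names are chosen for illustration rather than taken from the PYTHIA code.

```python
import random


def sample_threshold_branching(mq2, f, z_min, z_max, weight, rng,
                               max_tries=100000):
    """Pick (pT2evol, z) in the heavy-quark threshold region.

    weight(pt2, z) must return an acceptance probability in [0, 1].
    """
    for _ in range(max_tries):
        # Logarithmically even in [mq2, f * mq2).
        pt2 = mq2 * f ** rng.random()
        # Flat in the allowed z range.
        z = z_min + (z_max - z_min) * rng.random()
        if rng.random() < weight(pt2, z):
            return pt2, z
        # On failure simply try again in the same range: there is
        # no requirement that the new pT2 be lower than the old one.
    raise RuntimeError("no branching accepted")


def toy_weight(pt2, z):
    """Hypothetical stand-in with the g -> q qbar kernel shape only."""
    return z * z + (1.0 - z) ** 2


if __name__ == "__main__":
    rng = random.Random(7)
    mq2 = 1.5**2
    print(sample_threshold_branching(mq2, 2.0, 0.05, 0.95, toy_weight, rng))
```

In contrast to ordinary veto-algorithm evolution, a rejected trial here does not lower the starting scale for the next one.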
As for FSR, the choices of $p_{\perp\mathrm{evol}}^2$ and $z$ offer a possibility to match onto the singularity structure
of common matrix elements, and thereby easily correct to matrix-element expressions. Consider
e.g. $q\bar{q}' \to gW^\pm$ [81]. The $q \to qg$ branching gives a denominator $\hat{t}(\hat{t} + \hat{u})$ and $\bar{q}' \to \bar{q}'g$ a denominator
$\hat{u}(\hat{t} + \hat{u})$, which combine to $\hat{t}\hat{u}$, in agreement with the matrix element. This also illustrates
how the full ISR radiation pattern can be subdivided into contributions from the two sides.
One special option in the ISR implementation, on by default, is the possibility to order the
emissions in rapidity, or equivalently in angle, i.e. to veto any trial emission that leads to unordered
emitted partons [79]. The backwards evolution is one towards smaller p⊥ and larger x values,
so angular ordering is already implicit to first approximation, but the unordered emissions have
a non-negligible impact that appears to be detrimental for some distributions. There are good
arguments for a rapidity ordering to be a legitimate choice [84], to provide a consistent separation
between ISR and FSR. But that was for a somewhat different algorithm, so this option should more
be seen as one possible variation beyond the basic LL accuracy of the shower.

Strong coupling By default a first-order running $\alpha_s(p_{\perp\mathrm{evol}}^2)$ is used, but alternatives are a fixed
value or second-order running. Tuned $\alpha_s(m_Z^2)$ values typically tend to come out somewhat above
the PDG $\overline{\mathrm{MS}}$ one [85]. This can be understood as absent higher-order effects, in splitting kernels
and shower kinematics, being absorbed into effective values. Since these higher-order corrections
differ between ISR and FSR, the $\alpha_s(m_Z^2)$ are also set separately for the two.
Furthermore, in the soft-gluon limit, it can be shown that the dominant $O(\alpha_s^2)$ splitting-function
term, which generates contributions starting from $O(\alpha_s^2 \ln^2)$ at the integrated level, can
be absorbed into the LO splitting functions by translating to the so-called Catani–Marchesini–Webber
(CMW) (also known as MC) scheme [86]. This means that an $\overline{\mathrm{MS}}$ $\alpha_s(m_Z^2) = 0.1185$ would
translate into an MC $\alpha_s(m_Z^2) = 0.126$. This goes some of the way towards explaining the PYTHIA
default $\alpha_s(m_Z^2) = 0.1365$. It is possible to switch on the usage of the CMW rescaling procedure to
allow a lower input $\alpha_s(m_Z^2)$, but physics is only mildly modified by this.
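
The quoted numbers can be reproduced with the standard one-loop CMW relation, $\alpha_s^{\mathrm{CMW}} = \alpha_s^{\overline{\mathrm{MS}}} \left( 1 + K \alpha_s^{\overline{\mathrm{MS}}} / 2\pi \right)$ with $K = C_A (67/18 - \pi^2/6) - (5/9) n_f$. The sketch below assumes $n_f = 5$ active flavours at the $Z$ mass; the relation is not spelled out in this section, and the function name is illustrative.

```python
import math


def alpha_s_cmw(alpha_s_msbar, n_f=5):
    """Translate an MS-bar alpha_s into the CMW (MC) scheme."""
    c_a = 3.0  # SU(3) adjoint colour factor
    k = c_a * (67.0 / 18.0 - math.pi**2 / 6.0) - 5.0 / 9.0 * n_f
    return alpha_s_msbar * (1.0 + k * alpha_s_msbar / (2.0 * math.pi))


if __name__ == "__main__":
    print(f"{alpha_s_cmw(0.1185):.4f}")
```

With these inputs the rescaling raises the coupling by roughly 6.5%, which accounts for part but not all of the gap to the tuned default.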
Another consequence of staying at leading order is that usage of LO parton distributions is
vastly to be preferred. If not, the description of ISR branchings at low scales becomes quite unreli-
able, for physical and technical reasons. The former are covered elsewhere, the latter are reflected
in the need to have positive PDFs in eq. (78), which is not guaranteed at NLO.

Shower cutoff A lower cutoff scale $p_{\perp\mathrm{min}}$ is needed both for ISR and FSR, but the two need not
be the same. The FSR one is related to the transition from partons to hadrons, and LEP experience
gives us some understanding that too high a value does affect event shapes detrimentally. The
ISR case is less clear cut. Experimental signals, such as the $p_\perp$ spectrum of Z bosons in $pp$/$p\bar{p}$
collisions, are affected by the non-trivial interplay with primordial $k_\perp$, cf. section 6.3.3. A lower
$p_{\perp\mathrm{min}}$ means more $p_\perp$ kicks to the Z, but a shower initiator with a larger $x$, which means more
dilution of its $k_\perp$ in the cascade. One reasonable strategy therefore is to assume the ISR is damped
in the same way as MPIs are, i.e. the $\mathrm{d}p_\perp^2/p_\perp^2$ divergence is replaced by a $\mathrm{d}p_\perp^2/(p_{\perp 0}^2 + p_\perp^2)$ one.
Alternatively, it is also possible to use a sharp cutoff.

Interleaving Multiparton interactions and ISR are in direct competition for the beam-remnant
momentum. Therefore, a combined downwards evolution in p⊥ of the two gives precedence to
the harder parts of the event activities. There is no corresponding competition requirement for
FSR to be interleaved, and FSR can also be viewed as occurring after the other two components
in time. Interleaving is allowed, however, since it can be argued that a high-p⊥ FSR occurs on
shorter time scales than a low-p⊥ MPI, say. Backwards evolution of ISR is also an example that
physical time is not the only possible ordering principle. Rather, one can work with conditional
probabilities: given the partonic picture at a specific p⊥ resolution scale, what possibilities are
open for a modified picture at a slightly lower p⊥ scale, either by MPI, ISR, or FSR? This is the
default approach taken.
It is possible to switch off the interleaving, and consider FSR after MPI and ISR. In that case
it is also possible to allow FSR dipoles to be formed between matching colour-anticolour pairs in
two different MPIs, whereas normally dipoles are local to each MPI separately.
Another ordering issue is when resonance decays and their showers are considered. By default,
this is done after the ISR/FSR/MPI evolution of the hard process, and also after the handling of
beam remnants and colour reconnections (CR). An option for “early resonance decays” allows for
the resonance-decay to be handled before remnants and CR; this does not alter the perturbative
evolution, but partons from resonance decays can then participate in CR on an equal footing with
partons from the production process. The option for “interleaved resonance decays” [23] moves
the resonance-decay handling even earlier, interleaving it with the ISR/FSR/MPI evolution of the
hard process, with a few different options for which value of the perturbative evolution scale
to associate to resonance decays, the default being of order the width of the resonance. This
effectively represents an alternative treatment of finite-width effects; it is not a big effect for the
standard-model particles, none of which has widths much larger than the shower cutoff, but could
be relevant for precision studies and/or in BSM scenarios.

4.1.2 The dipole evolution


The previous subsection described the kinematics of a single branching. The full evolution in an
event requires some further consideration, in particular related to the overall colour flow and the
resulting set of radiating dipoles. In hadronic collisions the dipole pattern can be quite compli-
cated. Consider the example of gg → gg scattering, as shown in fig. 7a, which is one of the six
possible colour topologies for this process in the Nc → ∞ limit. Each radiation now is charac-
terized by whether the radiator is in the initial (I) or final (F) state, combined with the same
classification for the recoiler, so in general four different emission types need to be considered.

Final–Final (FF) radiation To begin with, consider the simple $e^+e^- \to \gamma^*/Z \to q\bar{q}$ event. The first
emission of a gluon, to give $q\bar{q}g$, follows the pattern already outlined. Now the $N_c \to \infty$ limit is
applied to split the event into two dipoles $qg$ and $g\bar{q}$. Each can be considered in its respective rest
frame, with the $p_{\perp\mathrm{evol}}$ scale of the branching setting the upper limit for the continued evolution.
In this evolution, the full emission rate of $g \to gg$ has to be split between the two dipoles. Using
eq. (105), the effective splitting kernel becomes $P_{g \to gg}(z) = (3/2)(1 + z^3)/(1 - z)$. Here, the emitter
gluon takes the fraction $z$ and the emitted $1 - z$, where the latter is the one straddling the two
new dipoles. The radiation functions from the $q$ (or $\bar{q}$) and $g$ ends of the dipole have almost the
same shape, the main difference being between the colour factors 4/3 vs. 3/2, which are smoothly
mixed around the middle of the dipole, as already discussed for the angular dependence of $q \to qg$
vs. $\bar{q} \to \bar{q}g$. There are known shortcomings with this colour factor treatment [59, 87], but these
are of order $1/N_c^2$ and are neglected here. On the kinematics side, note that an emission in one
dipole also affects the kinematics of adjacent ones, by virtue of sharing one gluon with changed
momentum.

Figure 7: (a) Colour flow for the process $g(r\bar{g}) + g(g\bar{b}) \to g(r\bar{p}) + g(p\bar{b})$. Here, the
$N_c \to \infty$ limit is used so that $p$ stands for the new colour purple. The dashed lines
represent the colour lines stretching between the dipole ends. The type of dipole is
indicated. (b) $q\bar{q} \to Zg$, again with colour lines and dipole types. (c) Deeply inelastic
scattering, again with colour lines and dipole types.

Initial–Initial (II) radiation The ISR and FSR descriptions can be separated so long as colour
does not flow between the initial and the final state, as for the first emission in $q\bar{q} \to Z$, which is
pure II. But once a gluon has been emitted, cf. fig. 7b, the two dipoles now bypass the Z, and the
Z does not receive any further $p_\perp$ recoil during the subsequent evolution. This runs counter to
standard perturbation- and resummation-theory results, which is the reason why traditionally ISR
has only been handled as II dipoles. That is, as shown in fig. 7b, the emission of a second gluon
is handled as occurring from the (new) $q\bar{q}$ dipole, with Z + g together taking the recoil. Similarly,
as shown in fig. 7a, the two IF dipole ends are replaced by doubling the strength of the II dipole.

Final–Initial (FI) radiation It should usually also be possible to replace the FI ends by FF ones,
with an arbitrary matching of the dipole ends, and such an option exists for exploratory purposes,
but is not the default. Instead, the incoming colour-connected parton is designated as recoiler
$r$. In a branching, considered in the dipole rest frame, a fraction $Q^2/m_{ar}^2$ of the recoiler energy
should be given from the recoiler to the emitter, exactly as in eq. (103). But the recoiler is not a
final-state particle, so the increase of the momentum of $a$ is not compensated anywhere in the final state.
Instead, the incoming parton that the recoiler represents must have its momentum increased, not
decreased, by the same amount as the emitter. That is, its momentum fraction $x$ needs to be scaled
up as

\[ x_{r'} = \left( 1 + \frac{Q^2}{m_{ar}^2} \right) x_r . \tag{117} \]

Note that the direction along the incoming beam axis is not affected by this rescaling, and that the
kinematics construction therefore inevitably comes to resemble that of Catani–Seymour dipoles [88].
The dipole mass $m_{ar}$ and the squared subcollision mass $\hat{s}$ are increased in the process, the latter
by the same factor as $x_r$. As with ISR, the increased $x$ value leads to an extra PDF weight

\[ \frac{x_{r'} f_r(x_{r'}, p_\perp^2)}{x_r f_r(x_r, p_\perp^2)} , \tag{118} \]

in the emission and no-emission probabilities. This ensures a proper damping of radiation in the
$x_{r'} \to 1$ limit. The splitting of the full dipole radiation pattern is not as well understood in this
case as for an FF dipole, however, but some order-of-magnitude estimates of how the full dipole-emission
rapidity range should be shared can be made [79]. This suggests an extra damping factor
like $Q_{\mathrm{hard}}^2/(Q^2 + Q_{\mathrm{hard}}^2)$, where $Q_{\mathrm{hard}}^2$ is the relevant hard scale of the process, like $4 p_\perp^2$ for QCD
$2 \to 2$ processes, which is applied by default.
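
The three ingredients of the FI treatment, i.e. the rescaling of eq. (117), the PDF ratio of eq. (118), and the extra damping factor, can be sketched as follows. The parton density `toy_xf` is a hypothetical shape used only to make the example self-contained; it is not a real PDF set, and the function names are illustrative.

```python
import math


def rescaled_x(x_r, q2, m_ar2):
    """New momentum fraction of the incoming recoiler, eq. (117)."""
    return (1.0 + q2 / m_ar2) * x_r


def pdf_weight(x_r, x_r_new, pt2, xf):
    """PDF ratio of eq. (118); xf(x, pt2) returns x * f(x, pt2)."""
    if x_r_new >= 1.0:
        return 0.0  # no phase space left in the beam
    return xf(x_r_new, pt2) / xf(x_r, pt2)


def damping(q2, q2_hard):
    """Extra damping factor Q2hard / (Q2 + Q2hard)."""
    return q2_hard / (q2 + q2_hard)


def toy_xf(x, pt2):
    """Hypothetical scale-independent toy density, x f(x)."""
    return x**-0.2 * (1.0 - x) ** 5


if __name__ == "__main__":
    x_new = rescaled_x(0.1, 100.0, 400.0)
    print(x_new, pdf_weight(0.1, x_new, 25.0, toy_xf), damping(100.0, 400.0))
```

Because the toy density falls steeply towards $x \to 1$, the weight drops well below unity once the rescaled $x_{r'}$ becomes large, mirroring the damping described in the text.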

Initial–Final (IF) radiation Finally, a non-default option exists, where IF dipole ends are treated
in their own right [80]. It then suffers from the above-mentioned problems with $p_{\perp Z}$ resummation,
but it enables handling e.g. of deeply inelastic scattering (DIS), cf. fig. 7c, where II radiation is not
an option (using the $e^-$ as recoiler would upset DIS kinematics), and presumably offers a more
realistic description e.g. of weak-gauge-boson fusion to a Higgs. The kinematics step from $b + r$,
where $r$ is the colour-connected recoiler in the final state, to $a + c + r'$, as a consequence of the
$a \to bc$ step, is easiest constructed in the $b + r$ rest frame. There

\[ p_a = \frac{1}{z} \, p_b , \tag{119} \]
\[ p_c = \left( \frac{1}{z} - 1 \right) p_b + p_{\mathrm{shift}} , \tag{120} \]
\[ p_{r'} = p_r - p_{\mathrm{shift}} , \tag{121} \]
\[ p_{\mathrm{shift}} = \left( \frac{(2z - 1) Q^2}{2 m_{br}} + z \, \frac{m_c^2}{m_{br}} ; \; p_\perp \cos\varphi, \; p_\perp \sin\varphi, \; -\frac{Q^2}{2 m_{br}} - z \, \frac{m_r^2}{m_{br}} \, \frac{Q^2 + m_c^2}{m_{br}^2 - m_r^2} \right) , \tag{122} \]
\[ p_\perp^2 = \left[ (1 - z)(Q^2 + m_c^2) - m_c^2 \right] \left[ 1 - z \, \frac{Q^2 + m_c^2}{m_{br}^2 - m_r^2} \right] - m_r^2 \left[ z \, \frac{Q^2 + m_c^2}{m_{br}^2 - m_r^2} \right]^2 . \tag{123} \]

The same set of rotations and boosts as used to recover the $b + r$ rest frame can then be inverted
to bring $c$ and $r'$ back to the event rest frame.
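
The consistency of eqs. (119)–(123) can be tested numerically: $c$ and $r'$ should come out on their mass shells, the new $b = a - c$ should have virtuality $-Q^2$, and the momentum transferred to the rest of the event, $p_r - p_b$, should be unchanged. The sketch assumes that $b$ is massless and moves along $+z$ in the $b + r$ rest frame before the branching; names and tuple layout are illustrative.

```python
import math


def mass2(p):
    """Invariant mass squared of a four-vector (E, px, py, pz)."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2


def if_branching(m_br, q2, z, m_c=0.0, m_r=0.0, phi=0.0):
    """Return p_b, p_r (before) and p_a, p_c, p_r' (after)."""
    d = m_br**2 - m_r**2
    e_b = d / (2.0 * m_br)
    p_b = (e_b, 0.0, 0.0, e_b)
    p_r = ((m_br**2 + m_r**2) / (2.0 * m_br), 0.0, 0.0, -e_b)
    ratio = z * (q2 + m_c**2) / d
    pt2 = (((1.0 - z) * (q2 + m_c**2) - m_c**2) * (1.0 - ratio)
           - m_r**2 * ratio**2)  # eq. (123)
    if pt2 < 0.0:
        raise ValueError("outside the allowed phase space")
    pt = math.sqrt(pt2)
    shift = ((2.0 * z - 1.0) * q2 / (2.0 * m_br) + z * m_c**2 / m_br,
             pt * math.cos(phi), pt * math.sin(phi),
             -q2 / (2.0 * m_br) - ratio * m_r**2 / m_br)  # eq. (122)
    p_a = tuple(x / z for x in p_b)
    p_c = tuple((1.0 / z - 1.0) * x + s for x, s in zip(p_b, shift))
    p_r_new = tuple(x - s for x, s in zip(p_r, shift))
    return p_b, p_r, p_a, p_c, p_r_new


if __name__ == "__main__":
    p_b, p_r, p_a, p_c, p_rn = if_branching(100.0, 400.0, 0.5, 1.5, 5.0, 0.3)
    p_b_new = tuple(a - c for a, c in zip(p_a, p_c))
    print(mass2(p_c), mass2(p_rn), mass2(p_b_new))
```

Since $p_a$ is a pure rescaling of $p_b$, the incoming parton stays along the beam axis, so only the final-state partons $c$ and $r'$ absorb the transverse recoil.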

Special cases What remains is to combine IF and FI emissions consistently. In the specific case
of the first gluon emission from a DIS process, it turns out that the IF-type branching $q \to qg$
exactly reproduces the soft- and collinear-singularity structure of the $\gamma^* q \to qg$ matrix element
on its own, with only a mild mismatch in the numerator (which vanishes in the soft-gluon limit).
Therefore, it would be possible to leave aside FI emissions in this case, and the same holds for
$g \to gg$ splittings, but not for $g \to q\bar{q}$ ones. So in general, both IF and FI contributions have
to be used. One simplifying factor is that the incoming parton must always be along the beam
axis, so there will only be one common phase-space mapping, unlike the case of FF or II dipoles.
Nevertheless, the details become technical and we refer to ref. [80] for further discussion. One
small comment, however: when an emission from a $qg$ dipole is considered, the two ends radiate
with different colour charges, 4/3 and 3/2, respectively. The colour factors of the two ends are
then mixed in proportion to the $1/m^2$ values of the emitted parton to the two dipole ends.
Another set of problems occurs in the decays of coloured resonances, say $t \to bW$. In this case
the colour dipole is stretched between the $b$ and the hole left behind by the decayed $t$. In order to
conserve momentum-energy, the $b$ uses the W as a recoiler, and this choice is unique. Once a gluon
has been radiated, however, it is possible to either still have the unmatched colour (inherited by
the gluon) recoiling against the W, or to let it recoil against the $b$ for this dipole as well. The
former could give unphysical radiation patterns, so the latter is chosen by default, although it is
not perfect either. A more detailed discussion of this issue can be found in [89]. The same issue
exists for a second emission of QED radiation, e.g. in $W^+ \to e^+ \nu_e$, but is obviously less significant
there.

4.1.3 Matrix-element and other corrections


In this subsection we give a survey of some methods used to make the shower reproduce, or at
least better approximate, known matrix-element behaviours. The methods to match and merge
external matrix-element input to the showers are covered separately in section 5, so here we
mainly describe program elements internal to the simple shower. Included are also some other
“correction” aspects, that should offer improvements to the shower, or at least provide increased
understanding by controlled variations.

Matrix-element corrections One key capability is the first-order correction to resonance decays
$a \to bc$, where a gluon is emitted to give an $a \to bcg$ final state. The foremost example of this
is $e^+e^- \to \gamma^*/Z \to q\bar{q} \to q\bar{q}g$ [75]. This works because eq. (106) provides a way that the parton
shower can exactly reproduce the singularity structure of the matrix element, i.e. of the generic
ratio

\[ \frac{1}{\sigma_{a \to bc}} \, \frac{\mathrm{d}\sigma_{a \to bcg}}{\mathrm{d}x_1 \, \mathrm{d}x_2} . \tag{124} \]

The $1 + z^2$ numerator of the splitting kernels also combines to an expression that overestimates
the numerator of the matrix elements, e.g. $x_1^2 + x_2^2$ for $e^+e^-$ annihilation. In the veto-algorithmic
downwards evolution of the shower, it is therefore trivial to use the ratio of the correct numerator
to the shower-kernel numerator, as a probability that a trial emission will be retained. In fact, for
the evolution down to the first branching, it is as simple as putting the numerator equal to 2 and
correcting down from that.
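
For the first emission in $e^+e^- \to q\bar{q}g$, the correction therefore reduces to a single accept/reject step on top of the trial emission. The sketch below implements only that step, with the overestimated numerator set to 2; the function names are illustrative and the generation of the trial $(x_1, x_2)$ itself is not shown.

```python
import random


def me_correction_weight(x1, x2):
    """Ratio of the matrix-element numerator x1^2 + x2^2 to the
    overestimate 2 used for the trial emission."""
    return (x1 * x1 + x2 * x2) / 2.0


def accept_trial(x1, x2, rng):
    """Keep the trial emission with probability given by the weight."""
    return rng.random() < me_correction_weight(x1, x2)


if __name__ == "__main__":
    rng = random.Random(3)
    print(me_correction_weight(0.9, 0.7), accept_trial(0.9, 0.7, rng))
```

Because $0 \le x_{1,2} \le 1$ for massless partons, the weight never exceeds unity, so the correction can be applied as a veto without any reweighting of events.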
This approach has then been extended to all combinations of colours and spins for $a$, $b$ and
$c$ that can occur within the SM and MSSM [77], and can be reused for other models where the
same colour and spin combinations occur. The inclusion of $b$ and $c$ masses as in eq. (107) also
reproduces the proper propagator poles $1/(m_{b'}^2 - m_b^2)$ and $1/(m_{c'}^2 - m_c^2)$ that are found in the
matrix elements, such that all correction factors are well behaved over the whole phase space.
Although the matrix elements are calculated for a first emission only, they are reused in a suitably
modified form to include mass effects also in subsequent steps.
Similarly, there are a few processes where the first branching of an ISR shower is corrected
to the respective matrix element [81], based on a common singularity structure. These include
$q\bar{q} \to Vg$, $qg \to Vq$, $f\bar{f} \to V\gamma$, and $f\gamma \to Vf$, where $V = \gamma^*/Z/W^\pm/Z'/\ldots$ is a colour-singlet vector
boson. In the point-like-coupling approximation, also Higgs production $gg \to H$ and $\gamma\gamma \to H$ is
handled.
It should be feasible to include a matrix element correction to DIS in the same fashion as
already outlined, but this has not been done yet. A generic and more detailed discussion of matrix-
element corrections is given in section 5.


Power and wimpy showers In the cases above, the ISR/FSR showers are allowed to cover the
full phase space, so-called power showers [90]. We have seen that they can reach the furthest
corners no worse than being a factor two off, which then could be fixed by modest reweighting.
One guess is that this would hold true also in other processes, where no matrix-element correction
factors have been implemented. But there are counterexamples. Consider QCD jet production,
say, starting out from 2 → 2 partonic processes. Then a low-p⊥ 2 → 2 process could not be
allowed to shower further partons at high p⊥ , or else such high-p⊥ production would be double
counted and the whole perturbative framework would be undermined. So the logical p⊥evol,max
shower starting scale is the p⊥ scale of the 2 → 2 process, i.e. the factorization scale, giving wimpy
showers. Comparisons with 2 → 3 matrix elements confirm that such a scale choice is close to
optimal [79].
In general, it is possible for the user to choose between power and wimpy showers, even
separately for ISR and FSR. The default option involves a choice between the two based on the
likelihood of double counting:

• If the final state of the hard process (not counting subsequent resonance decays) contains at
least one quark (u, d, s, c, b), gluon, or photon then p⊥evol,max is chosen to be the factorization
scale for internal processes and the scale value for Les Houches input, i.e. wimpy showers.

• Else, emissions are allowed to go all the way up to the kinematic limit, i.e. power showers.

The reasoning is that in the former set of processes, the ISR emission of yet another quark, gluon,
or photon could lead to double counting, while no such danger exists in the latter case.
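The default rule above can be summarized in a few lines. The sketch below only mirrors the logic as stated in the text; the function name, the returned labels, and the use of PDG particle codes as input are our own assumptions for illustration.

```python
# PDG codes: d=1, u=2, s=3, c=4, b=5 (antiquarks negative), gluon=21, photon=22.
LIGHT_PARTON_IDS = {1, 2, 3, 4, 5, 21, 22}


def default_shower_start(final_state_ids, factorization_scale, kinematic_limit):
    """Return (shower type, maximum evolution scale) for a hard process.

    final_state_ids lists the hard-process final state, not counting
    subsequent resonance decays.
    """
    risk_of_double_counting = any(
        abs(pid) in LIGHT_PARTON_IDS for pid in final_state_ids
    )
    if risk_of_double_counting:
        return "wimpy", factorization_scale
    return "power", kinematic_limit
```

For example, a gg → gg process would start at the factorization scale, whereas a lone Z or a tt̄ pair would be allowed to radiate up to the kinematic limit.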
In cases where more is known about the context of a particular event sample, e.g. when doing
matching and merging, it is important to make use of this knowledge to override the default
behaviour. One example is to start out with power showers but then implement a user hook to
reject those emissions that would double count the particular cuts of the event sample.

Damped showers While there are processes where power or wimpy showers are appropriate,
there are also ones where the actual behaviour lies in between. It is relevant to recall that the
characteristic cross-section shape of a shower emission is $dp_\perp^2/p_\perp^2$, while that of a QCD 2 → 2 process
is $dp_\perp^2/p_\perp^4$. That is, the p⊥ spectrum of a parton ought to begin to drop faster around the scale
where it goes from being a soft add-on to being a part of the core hard process. For top-pair
production gg → tt̄g, e.g. the gluon emission can be approximated by a shape

\[ \frac{dP}{dp_{\perp g}^2} \propto \frac{1}{p_{\perp g}^2} \, \frac{k^2 M^2}{k^2 M^2 + p_{\perp g}^2} , \tag{125} \]

where $M^2$ is a reasonable scale to associate with the hard process and $k^2$ is a fudge factor of order
unity. This generalizes into the possibility to use a power shower with an additional damping
factor $k^2 M^2/(k^2 M^2 + p_{\perp\mathrm{evol}}^2)$. Studies [91] show that this is a reasonable approach for coloured
final states, e.g. for pairs of supersymmetric coloured particles, whereas a simple power shower is
more appropriate for pair production of uncoloured particles. This can be understood as reduced
emission by a destructive interference between ISR and FSR when colours flow from the initial to
the final state [92], but only if there is such flow.
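The interpolation between the two regimes can be made explicit with a small numerical sketch of the damping factor and of the shape in eq. (125). The function names and the default k = 1 are assumptions made for illustration only.

```python
def damping_factor(pt2, m2, k=1.0):
    """Damping factor k^2 M^2 / (k^2 M^2 + pT^2) multiplying a power shower."""
    k2m2 = k * k * m2
    return k2m2 / (k2m2 + pt2)


def damped_emission_shape(pt2, m2, k=1.0):
    """Unnormalized dP/dpT^2 of eq. (125): 1/pT^2 times the damping factor."""
    return damping_factor(pt2, m2, k) / pt2
```

For pT² much smaller than k²M² the factor tends to unity and the shower-like 1/pT² shape is recovered, while for pT² much larger than k²M² the shape falls off as k²M²/pT⁴, as for a hard 2 → 2 process.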

Gluon splittings The pure s-channel nature of g → qq̄ splittings motivates the introduction of an
option with $\alpha_s(m_{q\bar{q}}^2)$ rather than $\alpha_s(p_{\perp\mathrm{evol}}^2)$, where $m_{q\bar{q}}$ is the invariant mass of the qq̄ pair. More
importantly, the cuts on the allowed z range during the FSR evolution imply that the branching
rate is reduced relative to expectations from matrix elements. Therefore, for this branching only,
the default option is to weigh up the splitting kernel inside the allowed z range to give the correct
integrated matrix-element weight. Furthermore, this range is afterwards remapped to cover the
full range of decay angles, disregarding the normal p⊥ ordering. This treatment is especially
important for charm and bottom quarks, where the mass is not negligible and mass corrections
should be reproduced both in rate and in angular distributions. As a final twist, the matrix element
for H → gg → gqq̄ does reproduce the expected behaviour e.g. from e+ e− → γ∗ → qq̄, but times a
factor $(1 - m_{q\bar{q}}^2/m_H^2)^3$. The default option uses this factor, with the radiating dipole mass replacing
the Higgs one, to suppress high-mass branchings.

Dead cones For topologies where a gluon recoils against a massive quark (or another massive
coloured particle) there are no suitable ME corrections implemented into PYTHIA. When the dipole
radiation pattern is split into two ends, with a smooth transition between the two, this means that
the gluon end can radiate into the quark hemisphere as if the quark were massless. The “dead
cone” effect, that radiation collinear with a massive quark is strongly suppressed, thereby is not
fully respected. (Unlike radiation from the quark end itself, where mass effects are included.)
By default, a further suppression is therefore introduced for g → gg branchings, derived as the
massive/massless ratio of the eikonal expression for dipole radiation, which eliminates radiation
collinear with the quark. The g → qq̄ branchings currently are not affected; the absence of a soft
singularity implies that there is hardly any radiation into the recoiler hemisphere anyway.

Global recoil The default ISR and FSR showers differ, in that the former uses a global recoil
while the latter uses a dipole one. That is, the recoil from an emission is carried by all final-state
particles in ISR, but only by a single one in FSR. Earlier we introduced an option where dipole
recoil can be used for ISR. Conversely, there is also an option to obtain a global recoil in FSR.
In such a scenario, the radiation pattern is unrelated to colour correlations, which could be seen
as a disadvantage. It is convenient for some matching algorithms, however, where a full analytic
knowledge of the shower radiation pattern is needed to avoid double counting, so it is by such
user requests that the option is made available.
Technically, the radiation pattern is most conveniently represented in the rest frame of the
final state of the hard subprocess. Then, for each parton at a time, the rest of the final state
can be viewed as a single effective parton. This “parton” has a fixed invariant mass during the
emission process, and takes the recoil without any changed direction of motion. The momenta
of the individual new recoilers are then obtained by a simple common boost of the original ones.
With the whole subcollision mass as “dipole” mass, the phase space for subsequent emissions
is larger than for the normal dipole algorithm, which leads to a too steep multiplication of soft
gluons. Therefore, the main application is for the first one or few emissions of the shower, where a
potential overestimate of the emission rate is to be corrected by a matching to the relevant matrix
elements. Thereafter, subsequent emissions should be handled as before, i.e. with dipoles spanned
between nearby partons. Several process-dependent settings are needed to use this option.

Azimuthal asymmetries Parton-shower branchings are assumed to occur isotropically in az-
imuthal angle ϕ, in the rest frame of the respective dipole. The boost to the overall CM frame
then gives rise to the familiar “string effect” [93, 94] coherence phenomenon, where particle pro-
duction is enhanced in the region between two colour-connected partons. But there are also
azimuthal correlations arising from parton polarization [95]. Notably, gluons tend to be plane po-
larized, with the decay plane of g → gg branchings favourably aligned with the production plane,
while g → qq̄ ones tend to be aligned orthogonal to it. The former branching type is common but
with small asymmetries, while the opposite holds for the latter branching type, so that net effects
are small. They are included nevertheless, since they may have some effect in charm and bottom
production.

User hooks There are also other user hooks that can be used to modify the shower evolution.
The ones that allow an ISR or FSR emission to be vetoed play a key role in matching and merging
schemes and therefore are described in section 5.

4.1.4 QED, electroweak and other showers


The simple shower includes several extensions beyond the QCD core discussed so far. Character-
istic is that these form part of the same evolution in a common p⊥evol scale, although with some
distinguishing features.

QED shower The most obvious extension is to QED. The required branching kernels have been
presented in eqs. (73) and (74). In the evolution equations $\alpha_s(p_{\perp\mathrm{evol}}^2)$ is replaced by $\alpha_\mathrm{em}(p_{\perp\mathrm{evol}}^2)$,
but otherwise most of what has been written about q → qg and g → qq̄ carries over. A dipole language
is used also for QED emissions, but the dipoles may be different from the QCD ones. An example
is e+ e− → γ∗ /Z → qq̄ → qq̄g, where the last stage contains two colour dipoles qg and gq̄, but
only one charge dipole qq̄, since the gluon carries no electrical charge. The complete multipole
radiation pattern may be poorly represented by a set of simple dipoles in cases with multiple
charges, since there is no confinement mechanism in QED to further a unique dipole setup. In
reality, few events contain multiple QED charges to consider, and if so, often the event history
suggests a reasonable division, e.g. when a new dipole arises from a γ → ff̄ branching.
The lower cutoff on QED radiation in a hadron beam is not bound to be the same as the QCD
one, since there is no issue of αem diverging at low scales. Nevertheless it is plausible to assume
that the QCD cutoff is related to the transition from quarks to hadrons, and thus should be applied
to all radiation. For radiation off a lepton, there is no such restriction, and PYTHIA then by default
sets $p_{\perp\mathrm{evol,min}} = 10^{-6}$ GeV for FSR and $5 \cdot 10^{-4}$ GeV for ISR. These values are fully sufficient to cover the
emission of any photons observable at a collider. They are also adjusted to be in a region where
kinematic reconstruction still works well in double precision. It has been pointed out that they
are not sufficiently low to generate the full observable-photon spectrum when PYTHIA is applied
to whatever processes could give the highest-energy cosmic rays.
The branching of a photon, γ → ff̄, does not fit well into the dipole picture. The choice of a
recoiler is based on the history to the largest extent possible, i.e. based on what the photon was
produced in association with. The photon branchings in part compete with the hard processes
involving γ∗ /Z production. In order to avoid overlap it makes sense to correlate the maximum γ
mass allowed in showers with the minimum γ∗ /Z mass allowed in hard processes, by default at
10 GeV. In addition, the shower contribution only contains the pure γ∗ contribution, i.e. not the Z
part, so the mass spectrum above around 50 GeV would not be well described.

Electroweak shower The emission of W± and Z gauge bosons off fermions is an integrated part
of the ISR and FSR frameworks, and is fully interleaved with QCD and QED emissions [83]. It
is off by default, however, since it takes some time to generate trial emissions, whereof very few
result in real emissions unless the fermion transverse momenta are much larger than the W/Z
masses. These masses also have a considerable impact on the phase space of emissions, which
the shower is not set up to handle with a particularly good accuracy. Therefore, the weak-shower
emissions are always matched to the matrix element for emissions off an ff̄ weak dipole, or some
other 2 → 3 matrix element that resembles the topology at hand. Even if the match may not be
perfect, at least the main features should be caught that way. Notably, the correction procedure
is used throughout the shower evolution, not only for the emission closest to the hard 2 → 2
process. Also, the angular distribution in the subsequent V = W± /Z decay is matched to the
matrix-element expression for $f\bar{f} \to f\bar{f}V \to f\bar{f}f'\bar{f}'$ (FSR) and $f\bar{f} \to g^*V \to g^*f'\bar{f}'$ (ISR). Afterwards, the
$f'\bar{f}'$ system undergoes showers and hadronization just like any W± /Z decay products would.
Special for the weak showers is that couplings are different for left- and right-handed fermions.
With incoming unpolarized beams this should average out, at least so long as only one weak emis-
sion occurs. In the case of several weak emissions off the same fermion, the correlation between
them will carry a memory of the fermion helicity. Such a memory is retained for the affected dipole
end. The flavour-changing character of W± emissions also affects the tight relation between the
real-emission evolution and Sudakov factors, so-called Bloch–Nordsieck violations. These effects
are not expected to be large, but they are not properly included. Another restriction is that there
is no simulation of the full γ∗ /Z interference: at low masses, the QED shower involves a pure γ∗
component, whereas the weak shower generates a pure Z. Finally, it should be remembered that
this is not a full (electro)weak shower, which would also have required interactions among gauge
bosons, and even involved the Higgs boson. These interactions are included, e.g. in the VINCIA
EW shower, cf. section 4.2.4.

Onia Hard production of charmonium and bottomonium can proceed either through colour-
singlet or colour-octet mechanisms. In the former case, the state does not radiate and the onium
is therefore produced in isolation, while it is sensible to assume that a shower can evolve in the
latter case, giving an onium state embedded in some amount of jet activity. Currently, both cases
are initiated by 2 → 2 interactions directly producing an onium state; the alternative mechanism
of producing onia during the shower evolution itself [96] is not (yet) implemented. Emissions
off an octet-onium state could easily break up a semi-bound quark pair, but might also create a
new semi-bound state, and to some approximation these two effects should balance in the onium
production rate. The showering implemented here therefore should not be viewed as an accurate
description of the emission history step by step, but rather as an effective approach to build up
the onium environment. The simulation of branchings is based on the assumption that the full
radiation is provided by an incoherent sum of radiation off the quark and off the antiquark of the
onium state. Thus, the splitting kernel is taken to be the normal q → qg one, multiplied by a
factor of two. A number of corrections to this picture could be imagined; since they would come
with opposite signs the assumption is that they cancel out. Further discussion is also included in
section 3.3.

Baryon-number-violating decays A complicated case for showering is baryon-number-violating
decays, e.g. a neutralino decaying to three quarks. It is then not possible to assign an ordinary
dipole configuration. Instead half-strength dipoles are constructed between each pair of quarks.
That way the total emission rate from each quark is at normal strength, and the recoil can be taken
by either of the other two quarks. Similar reduced-showering-rate dipoles can be selected also in
a few other cases.
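The bookkeeping behind the half-strength dipoles is easily verified. The following sketch, with hypothetical function names and quark labels, pairs up the three quarks and checks that each one ends up with unit total emission strength.

```python
from itertools import combinations


def half_strength_dipoles(quarks, strength=0.5):
    """One dipole of the given strength between each pair of quarks."""
    return [(a, b, strength) for a, b in combinations(quarks, 2)]


def total_strength(quark, dipoles):
    """Sum of the strengths of all dipoles in which the quark takes part."""
    return sum(s for a, b, s in dipoles if quark in (a, b))
```

With three quarks there are three dipoles, and each quark belongs to exactly two of them, hence 2 × 1/2 = 1.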


Hidden Valley processes The Hidden Valley (HV) scenario, introduced in section 3.7, has been
developed specifically to allow the study of visible consequences of radiation in a hidden sector,
either by recoil effects or by leakage back into standard-model particles. A key aspect therefore is
that the normal timelike showering machinery has been expanded with a third kind of radiation,
in addition to the QCD and QED(+EW) ones [38, 39]. These three kinds are fully interleaved, i.e.
evolution occurs in a common p⊥ -ordered sequence. This radiation may be described either within
a (possibly broken) U(1) or an unbroken SU(N) gauge group, but not both simultaneously. Thus,
one has either HV-photons or HV-gluons as interaction carriers, where the latter are non-Abelian
and may branch into more HV-gluons. A set of 12 new particles mirrors the standard-model flavour
structure, and is charged under both the SM and the HV symmetry groups, so that they can radiate
both into the visible and invisible sector. There is also a new massive particle with only HV charge,
sitting in the fundamental representation of the HV gauge group, denoted an HV-quark.
HV particles are only produced in or after the hard process, so only FSR needs to be considered.
The HV radiation defines its own set of dipoles, usually between opposite charges. Decays of
massive particles can give rise to the same kind of issues as for top decays, i.e. that a dipole
properly involves the hole of the decaying particle. Matrix-element corrections are implemented
for a number of decay processes, with colour, spin, and mass effects included, as for SM processes.
These were calculated within the context of the particle content of the MSSM, however, which
does not include spin-1 particles with unit colour charge. In such cases spin 0 is assumed instead.
By experience, the main effects come from mass and colour flow anyway, so this is not a bad
approximation. In the case of a broken U(1) symmetry, the HV-photon is massive, which requires
some kinematics corrections relative to ordinary QED radiation. If decays back to the SM occur,
e.g. the HV-photon by mixing with the ordinary γ, then also ordinary showers are allowed. By
default the coupling strength is fixed, but running is allowed, given the gauge group and the
contributing matter content.

4.1.5 Algorithms for automated shower variations and enhanced splittings


Several variations of the simple shower are available in an automated fashion, to help construct
uncertainty bands for predictions [20]. That is, weights are constructed and associated with the
shower evolution under different alternative conditions, at the same time as the normal showers
(with unit weight) are evolved. The properties of an event only need to be analysed once, but can
then be filled in one histogram for each distinct variation, with its associated event weight, and
at the end these histograms can be combined to provide the uncertainty band. Variations can be
set for the renormalization scale for ISR and FSR QCD emissions (separately), for the inclusion
of non-singular terms in the ISR and FSR splitting kernels (separately), and for different PDF
members.
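On the analysis side, the procedure amounts to filling one histogram per variation from a single pass over the events. The sketch below is a plain-Python illustration of that bookkeeping; the variation labels, the function names, and the simple envelope definition are assumptions, not a description of any particular analysis tool.

```python
def make_histograms(variation_names, n_bins):
    """One histogram (list of bin contents) per variation."""
    return {name: [0.0] * n_bins for name in variation_names}


def fill(histograms, value, weights, lo, hi):
    """Add one event to every histogram, each with its own event weight."""
    if not lo <= value < hi:
        return
    for name, hist in histograms.items():
        index = min(int((value - lo) / (hi - lo) * len(hist)), len(hist) - 1)
        hist[index] += weights[name]


def envelope(histograms):
    """Bin-by-bin minimum and maximum over all variations."""
    columns = list(zip(*histograms.values()))
    return [min(c) for c in columns], [max(c) for c in columns]
```

The observable is computed once per event, and only the weight differs between the histograms, which is what makes the method cheap compared with separate runs.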
The veto algorithm is used to generate parton-shower histories for the physics parameters
chosen at initialization as normal. Using eq. (15), we can compute sets of weights (which we
call variations) for each event reflecting the changed probability for that event under different
possible choices of physics parameters. The number of variations calculated is limited only by
finite computing and memory resources.
While the proof of unitarity is more easily realized using eq. (15), the algorithm is employed
discretely. Thus the factors

\[ \frac{f(t')}{g(t')\,r(t')} \ \text{(acc)} \quad \text{and} \quad \frac{1 - f(t')/g(t')}{1 - r(t')} \ \text{(disc)} , \tag{126} \]


can be calculated at each discrete step and book-kept during the shower to calculate an event
weight. The factors (acc) account for the effect in accepted splittings, while the factors (disc)
preserve unitarity from the discarded trial splittings.

Parton-Shower Variations Consider a parton shower based on the veto algorithm with a physi-
cal trial-accept probability, Pacc , given by the ratio of a splitting kernel P(t, z) and an oversampling
kernel P̂(t, z), and an alternative shower algorithm, defined by a different physical trial-accept
probability, $P'_\mathrm{acc}$, given by the ratio of an alternative radiation kernel $P'(t, z)$ to the same over-
sampling kernel. The difference between the radiation kernels could be different αs scale choices,
different non-singular terms in the splitting kernels, and/or different effective higher-order con-
tributions to the splitting kernels. In the following, we assume that the t and z definitions remain
the same. Translations between different t choices are discussed in ref. [97] (and the resulting
equations used in early versions of VINCIA to provide an uncertainty variation corresponding to
the difference between virtuality-ordered and p⊥ -ordered showers) while exploring different z
definitions (and more generally, different recoil strategies) would require a future generalization.
The algorithm to compute the probability of an event generated by $P'$ based on an event generated
using P is, following refs. [20, 64] and suppressing the z dependence for clarity:

1. Start the event evolution by setting all weights (nominal and uncertainty-variation ones)
equal to the input weight of the event, $w' = w$.

2. If the trial branching is accepted, multiply the alternative weight $w'$ by the relative ratio of
accept probabilities,
\[ R'_\mathrm{acc}(t) = \frac{P'_\mathrm{acc}(t)}{P_\mathrm{acc}(t)} = \frac{P'(t)}{P(t)} . \tag{127} \]

3. If the trial branching is rejected, multiply the alternative weight $w'$ by the relative ratio of
discard probabilities,
\[ R'_\mathrm{disc}(t) = \frac{P'_\mathrm{disc}(t)}{P_\mathrm{disc}(t)} = \frac{1 - P'_\mathrm{acc}(t)}{1 - P_\mathrm{acc}(t)} = \frac{\hat{P}(t) - P'(t)}{\hat{P}(t) - P(t)} . \tag{128} \]

4. If desired, the detailed balance between the accept and discard probabilities could option-
ally be allowed to be broken by up to a non-singular term, $P'_\mathrm{acc} \neq 1 - P'_\mathrm{disc}$, to represent
uncertainties due to genuine (non-cancelling) higher-order corrections, which would mod-
ify the total cross sections. For the current implementation in PYTHIA, however, we do not
consider this possibility further.
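Steps 1 to 3 can be illustrated with a toy shower. The sketch below assumes, purely for illustration, z-independent kernels P = a/t, P′ = a′/t and an overestimate P̂ = â/t with constant coefficients, and an input weight of unity; the function name and all parameters are hypothetical.

```python
import math
import random


def shower_with_variation(a, a_alt, a_hat, t_max, t_min, rng):
    """Evolve one toy shower with kernel a/t; return (n_emissions, w_alt)."""
    if not (0.0 < a < a_hat and 0.0 < a_alt < a_hat):
        raise ValueError("need 0 < a < a_hat and 0 < a_alt < a_hat")
    weight = 1.0  # step 1: alternative weight starts at the input weight
    n_emissions = 0
    t = t_max
    while True:
        # Next trial scale from the overestimate a_hat/t, whose no-emission
        # probability between t and t_new is (t_new/t)**a_hat.
        t *= (1.0 - rng.random()) ** (1.0 / a_hat)
        if t < t_min:
            return n_emissions, weight
        if rng.random() < a / a_hat:  # accept with probability P/P_hat
            n_emissions += 1
            weight *= a_alt / a  # step 2, eq. (127)
        else:
            weight *= (a_hat - a_alt) / (a_hat - a)  # step 3, eq. (128)
```

In this toy setup the average alternative weight should come out close to unity, while the weighted mean number of emissions should approach a′ ln(t_max/t_min) rather than a ln(t_max/t_min), i.e. the weighted sample behaves as if it had been generated with P′.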

Step 2 is responsible for adjusting the naive splitting probabilities, while step 3 is responsible for
adjusting the no-branching probabilities. The result is that the set of weights $w'$ represents a sep-
arately unitary event sample, with $\langle w' \rangle = \langle w \rangle$; i.e. the samples integrate to the same total cross
section. The relative discard-ratio, eq. (128), contains the difference P̂ − P in the denominator.
If the trial overestimate, P̂, is “too efficient” (meaning it is very close to P), the denominator
can become close to singular, resulting in large and possibly numerically unstable weights. Algo-
rithmically, what happens is that there are very few failed trials, hence the modifications to the
no-branching probability can have large fluctuations. Technically, we address this by applying a
“headroom factor” to the trial functions when automated uncertainty variations are requested,

82
SciPost Physics Codebases Submission

ensuring that there is always a non-negligible probability for trials to be discarded at the cost of
computational speed.
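The effect of the headroom factor on eq. (128) can be seen from a short numerical sketch. The kernel values used in the usage note are arbitrary illustrative numbers, and the function names are hypothetical.

```python
def discard_ratio(p_hat, p, p_alt):
    """Relative discard ratio (P_hat - P') / (P_hat - P) of eq. (128)."""
    if p_hat <= p or p_hat < p_alt:
        raise ValueError("the overestimate must exceed P and cover P'")
    return (p_hat - p_alt) / (p_hat - p)


def with_headroom(p_hat, headroom):
    """Inflate the trial overestimate by a constant headroom factor."""
    return headroom * p_hat
```

With P = 1.0, P′ = 0.9 and a tight overestimate P̂ = 1.01, each of the rare discarded trials carries a factor of about 11. Doubling the overestimate brings this down to about 1.1, at the price of roughly twice as many trials.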
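The final event weight, $w'$, after the full shower evolution, is the product of many such factors,
one $R'_\mathrm{acc}$ for each accepted trial and one $R'_\mathrm{disc}$ for each discarded one,

\[ w' = \prod_{i \in \mathrm{accepted}} \frac{P'_{i,\mathrm{acc}}}{P_{i,\mathrm{acc}}} \prod_{j \in \mathrm{discarded}} \frac{P'_{j,\mathrm{disc}}}{P_{j,\mathrm{disc}}} . \tag{129} \]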

Given enough phase space for evolution, this factor can become arbitrarily different from unity,
representing that, e.g. a very active shower history is exponentially more likely to occur in a shower
with a large value of αs than in one with a small value. In principle, this is both physically and
mathematically correct. In practice, however, it is not desirable that branchings at low evolution
scales in the shower should significantly alter the modified event weights. Technically, we treat
this by imposing a few limiting factors on the variations.

Renormalization-Scale Variations The first major class of variations we include are variations
of the shower renormalization scales. This can be done for both QED and QCD, with the latter
normally dominating the overall uncertainty. It is worth noting, however, that for a coherent
shower algorithm, a scale choice of p⊥ accompanied by the so-called CMW scale factor absorbs
the leading second-order corrections to the splitting functions for soft-gluon emission. A brute-
force scale variation would destroy this agreement. We therefore provide an option to allow an
explicit $\mathcal{O}(\alpha_s^2)$ compensating term to accompany each scale variation, driving the effective scale
choice back towards p⊥ at the NLO level, while leaving the higher-order components of the scale
variation untouched.
Specifically, if the baseline gluon-emission density is

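\[ P(t, z) = \frac{\alpha_s(p_\perp)}{2\pi} \, \frac{P(z)}{t} , \tag{130} \]
with P(z) the DGLAP radiation kernel, then we may define a renormalization-scale variation,
$\mu = p_\perp \to \mu' = k p_\perp$, with an NLO-compensating term (see, e.g. ref. [97])

\[ P'(t, z) = \frac{\alpha_s(k p_\perp)}{2\pi} \left( 1 + \frac{\alpha_s}{2\pi} \beta_0 \ln k \right) \frac{P(z)}{t} , \tag{131} \]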
with $\beta_0 = (11N_c - 2n_F)/3$, $N_c = 3$, and $n_F$ the number of active flavours at the scale µ = p⊥ . Note
that, if there are any quark-mass thresholds in between p⊥ and kp⊥ , then αs (p⊥ ) and αs (kp⊥ )
will not be evaluated with the same $n_F$. Matching conditions are applied in PYTHIA to make the
running continuous across thresholds, so this effect should be small for reasonable values of k.
Nonetheless, one could in principle add an additional term $\alpha_s/(2\pi) \, \ln(m_q/(k p_\perp))/3$ to compen-
sate for the different β0 coefficients used in the region between the threshold and kp⊥ . However,
since the variation is numerically larger without that term, and since the ambiguities associated
with thresholds are anyway among the uncertainties one could wish to explore, for the time being
we consider it more conservative to not include any such terms.
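As a brief cross-check of eq. (131), the following sketch assumes one-loop running with a fixed $n_F$ (i.e. no thresholds between p⊥ and kp⊥) and shows how the compensating term removes the first-order dependence on k:

\begin{align*}
\alpha_s(k p_\perp) &= \frac{\alpha_s(p_\perp)}{1 + \frac{\alpha_s(p_\perp)}{2\pi} \beta_0 \ln k}
 = \alpha_s(p_\perp) \left( 1 - \frac{\alpha_s(p_\perp)}{2\pi} \beta_0 \ln k + \mathcal{O}(\alpha_s^2) \right) , \\
\frac{P'(t, z)}{P(t, z)} &= \frac{\alpha_s(k p_\perp)}{\alpha_s(p_\perp)} \left( 1 + \frac{\alpha_s}{2\pi} \beta_0 \ln k \right)
 = 1 + \mathcal{O}(\alpha_s^2) .
\end{align*}

Under these assumptions the variation first shows up at relative order $\alpha_s^2$, which is the sense in which the higher-order components of the scale variation are left untouched.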
Note also that the scale and scheme of the αs factor in the compensation term, inside the
parenthesis in eq. (131), is not specified, as this amounts to an effect of a yet higher order, beyond
NLO. To make the compensation as conservative as possible (and to avoid the risk of overcompen-
sating), we choose the scale of the compensation term to be the largest local scale in the problem,
namely the invariant mass of the emitting colour dipole mdip , thus making the correction term
as numerically small (and hence as conservative) as possible, specifically µmax = max(mdip , kp⊥ ).
Furthermore, since this argument only pertains to the soft limit, our estimate of the compensation
would be too optimistic if applied undiminished over all of phase space. To be more conservative,
we therefore multiply the compensation term by an explicit factor (1 − ζ), defined so as to vanish
linearly outside the soft limit,

\[ \zeta = \begin{cases} z & \text{for splittings with a } 1/z \text{ singularity} \\ 1 - z & \text{for splittings with a } 1/(1-z) \text{ singularity} \\ \min(z, 1-z) & \text{for splittings with a } 1/(z(1-z)) \text{ singularity} \end{cases} . \tag{132} \]

Combined, these arguments lead us to the following modified accept probability for a robust
shower renormalization-scale variation compatible with the known second-order leading-singular
structure:
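\[ P'(t, z) = \frac{\alpha_s(k p_\perp)}{2\pi} \left( 1 + (1 - \zeta) \frac{\alpha_s(\mu_\mathrm{max})}{2\pi} \beta_0 \ln k \right) \frac{P(z)}{t} , \tag{133} \]
hence
\[ R'_\mathrm{acc}(t, z) = \frac{P'_\mathrm{acc}(t, z)}{P_\mathrm{acc}(t, z)} = \frac{\alpha_s(k p_\perp)}{\alpha_s(p_\perp)} \left( 1 + (1 - \zeta) \frac{\alpha_s(\mu_\mathrm{max})}{2\pi} \beta_0 \ln k \right) . \tag{134} \]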
The compensation term in the expressions above is only included for gluon emissions, not for
g → qq̄ splittings. The latter are subjected to the full (uncompensated) variation, αs (kp⊥ )/αs (p⊥ ).
Finally, we impose an absolute limit on the allowed amount of αs variation, by default

|∆αs | ≤ 0.2 . (135)

This does not significantly restrict the range of variation for perturbative branchings (even when
αs ∼ 0.5, a full 40% amount of variation is still allowed), but it does prevent branchings very
near the cutoff from generating large changes to the event weights. Removing this bound would
not significantly affect the perturbative physics uncertainties, but would cause much larger weight
fluctuations (between events with and without some very soft branching near the end of the evo-
lution), mandating much longer run times for the same statistical precision.
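The accept ratio of eq. (134), with ζ from eq. (132), can be evaluated numerically as sketched below. The running coupling used here is a toy one-loop expression with a fixed number of flavours and an assumed reference value αs(M_Z) = 0.118, not the coupling used inside PYTHIA; all function names are hypothetical, and the limit of eq. (135) is not modelled.

```python
import math

M_Z = 91.1876       # GeV, reference scale (assumed)
ALPHA_S_MZ = 0.118  # assumed reference value of the strong coupling
N_C = 3


def beta0(n_f=5):
    return (11.0 * N_C - 2.0 * n_f) / 3.0


def alpha_s(mu, n_f=5):
    """Toy one-loop running coupling with fixed n_f."""
    log_term = math.log(mu * mu / (M_Z * M_Z))
    return ALPHA_S_MZ / (1.0 + ALPHA_S_MZ * beta0(n_f) / (4.0 * math.pi) * log_term)


def zeta(z, singularity):
    """Eq. (132): vanishes in the soft limit of the given splitting."""
    if singularity == "1/z":
        return z
    if singularity == "1/(1-z)":
        return 1.0 - z
    if singularity == "1/(z(1-z))":
        return min(z, 1.0 - z)
    raise ValueError("unknown singularity type")


def scale_variation_ratio(pt, k, z, m_dip, singularity="1/(1-z)", compensate=True):
    """Eq. (134); compensate=False gives the bare ratio used for g -> q qbar."""
    ratio = alpha_s(k * pt) / alpha_s(pt)
    if not compensate:
        return ratio
    mu_max = max(m_dip, k * pt)
    term = (1.0 - zeta(z, singularity)) * alpha_s(mu_max) / (2.0 * math.pi)
    return ratio * (1.0 + term * beta0() * math.log(k))
```

For a soft gluon emission at p⊥ = 10 GeV with k = 2 and a dipole mass of 100 GeV, the compensated ratio in this toy setup lies noticeably closer to unity than the bare ratio αs(kp⊥)/αs(p⊥), while far from the soft limit the two coincide.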
At the technical level, the user decides whether to perform scale variations of ISR and FSR
independently, or whether to vary the respective αs factors in a correlated manner. It is even
possible to include both types of variations (independent and correlated), and compare the results
obtained at the end of the run. From a practical point of view, the FSR αs choice mainly influences
the amount of broadening of the jets, while the ISR αs choice influences resummed aspects such as
the combined recoil given to a hard system (e.g. a Z, W , or H boson, or a t t̄, dijet, or γ+jet system)
by ISR radiation and also how many extra jets are created from ISR. The latter of course also
depends on whether and how corrections from higher-order matrix elements are being accounted
for. A few illustrations for the simple shower model can be found in ref. [20].

Finite-Term Variations All shower formalisms are based upon the universal nature of the sin-
gular infrared (soft and/or collinear) limits of QCD. In these limits, the exact form of the splitting
functions is known (to a given order), regardless of whether we express them as DGLAP kernels,
dipole/antenna functions, or by any other means. Away from these limits, however, in the phys-
ical phase space on which the kernels will be applied as approximations, there are in principle
infinitely many different radiation functions to choose from, sharing the same singular terms but
having different non-singular ones. Attempting to evade this problem by setting the non-singular
terms to zero would not only be arbitrary, it would also not be stable against reparameterizations
of the radiation functions themselves. Thus, zero finite terms in a DGLAP parameterization does
not translate to zero in a dipole one, nor does zero in one dipole parameterization correspond to
zero in another, see e.g. refs. [64, 98].
We also emphasize that finite terms are qualitatively different from renormalization-scale vari-
ations and produce quite different-looking uncertainty envelopes [64]. The reason is that re-
normalization-scale variations are by construction proportional to the shower radiation functions.
In regions far from the singular limits, the pole terms are highly suppressed and the shower radi-
ation functions may not bear much resemblance to the matrix elements for the process at hand.
In such regions, modest finite terms can therefore easily produce much larger variations than
renormalization-scale changes.
We therefore believe that an exhaustive exploration of parton-shower uncertainties should at
least grant the capability to perform finite-term variations, while the final decision whether and
how to use them can still be left up to the user. An observation of large finite-term uncertainties
in the context of a physics study would be a direct indication of a need to incorporate further
corrections from matrix elements, e.g. via one of the many matching/merging strategies available
in PYTHIA. This is because the matrix elements contain the correct finite terms for the process at
hand, thus nullifying the finite-term uncertainties at least in any phase-space regions populated
by the matrix elements.
To implement such variations in the context of a DGLAP approach, we do the following,
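\[ \frac{P(z)}{Q^2} \, dQ^2 \to \left( \frac{P(z)}{Q^2} + \frac{c}{m_\mathrm{dip}^2} \right) dQ^2 = \left( P(z) + \frac{c \, Q^2}{m_\mathrm{dip}^2} \right) \frac{dt}{t} , \tag{136} \]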

where $m_\mathrm{dip}$ is the invariant mass of the dipole in which the splitting occurs, c is a dimensionless
finite term of order unity, and in the last equality we used the identity $dQ^2/Q^2 = dt/t$ which holds
for any $t = f(z) Q^2$, including in particular all the PYTHIA evolution variables. Note that, for gluon
emission off timelike massive quarks, $Q^2$ should be the virtuality, or off-shellness of the massive
quark, defined as $Q^2 = (p_b + p_g)^2 - m_b^2 = 2 p_b \cdot p_g$ [77], with $p_b$ the four-momentum of the massive
quark and $p_g$ that of the emitted gluon. Thus,

P(z) + c Q2 /m2dip
‚ Œ
0 αs
P (t, z) = C , (137)
2π t

where C is the colour factor. The variation can therefore be obtained by introducing a spurious
term proportional to Q2 /m2dip in the splitting kernel used to compute the accept probability, hence

0
Pacc c Q2 /m2dip
R0acc = = 1+ , (138)
Pacc P(z)
from which we also immediately confirm that the relative variation explicitly vanishes when
Q2 → 0 or P(z) → ∞.
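As an illustration of eq. (138), the following minimal Python sketch evaluates the relative accept-probability variation for a given finite term. It is not PYTHIA code: the function names are hypothetical, and the colour-stripped $q \to qg$ kernel $(1+z^2)/(1-z)$ is used purely as an example of $P(z)$.

```python
def quark_gluon_kernel(z):
    """Illustrative colour-stripped DGLAP kernel P(z) = (1 + z^2) / (1 - z)."""
    return (1.0 + z * z) / (1.0 - z)


def finite_term_ratio(p_z, q2, m2_dip, c):
    """Relative variation R'_acc = 1 + c (Q^2 / m_dip^2) / P(z), cf. eq. (138).

    p_z    : value of the splitting kernel P(z) (must be positive)
    q2     : evolution scale Q^2
    m2_dip : squared invariant mass of the dipole
    c      : dimensionless finite term, expected to be of order unity
    """
    if p_z <= 0.0 or m2_dip <= 0.0:
        raise ValueError("P(z) and m_dip^2 must be positive")
    return 1.0 + c * (q2 / m2_dip) / p_z


if __name__ == "__main__":
    p_z = quark_gluon_kernel(0.5)
    for c in (-1.0, 0.0, 1.0):
        print(c, finite_term_ratio(p_z, q2=25.0, m2_dip=100.0, c=c))
```

The ratio tends to unity both for small $Q^2$ and for large $P(z)$, as stated above.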
To motivate a reasonable range of variations, we take the finite terms that different physical
matrix elements exhibit as a first indicator, and supplement that by considering the terms that are
induced by PYTHIA’s Matrix Element Corrections (MEC) for Z-boson decays [75]. In particular, the
study in ref. [98] found order-unity differences (in dimensionless units) between different physical
processes and three different antenna-shower formalisms. Therefore, here we also take variations
of order unity as the baseline for our recommendations. A few illustrations for the simple shower
model can be found in ref. [20].

Veto Algorithm with Biased Kernels A second important use case for modifying the veto al-
gorithm is to evaluate the fragmentation contributions to processes like photon and B-hadron
production, via splittings like $q \to q\gamma$ and $g \to b\bar{b}$, respectively. Since these processes are relatively
rare ($\alpha_{\mathrm{em}} \ll \alpha_s$ and $P_{g\to b\bar{b}} \ll P_{g\to gg}$), the generation of adequate event samples featuring
these processes can suffer from substantial inefficiencies. The method implemented in PYTHIA 8.3
is described in refs. [15, 20]. It is formally identical to the one presented for $q \to q\gamma$ branchings
in ref. [99].
Consider that we wish to enhance the rate of $g \to b\bar{b}$ splittings by a factor $E \gg 1$ until we have
obtained at least one such splitting, after which we would normally want to let the probability to
have a second g → b b̄ splitting in the same event drop back down to the normal level. We can
achieve this by first increasing the rate of trials for the corresponding splitting function by a factor
of E by using a larger (biased) trial function (suppressing the dependence on both t and z),

P̂biased = E P̂ . (139)

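We then keep the accept probability the same as normal, but reweight each accepted biased trial
branching by the inverse of the biasing factor,
\begin{equation}
  P_{\mathrm{acc}} = \frac{P}{\hat{P}} \,; \qquad R_{\mathrm{acc}} = \frac{\hat{P}}{\hat{P}_{\mathrm{biased}}} = \frac{1}{E} \,, \tag{140}
\end{equation}
so that the product $R_{\mathrm{acc}}\, P_{\mathrm{acc}}\, \hat{P}_{\mathrm{biased}} = P$ is the desired physical distribution. For each discarded
biased trial branching, we use the same technology as above to reweight the event,
\begin{equation}
  R_{\mathrm{disc}} = \frac{1 - P_{\mathrm{acc}} R_{\mathrm{acc}}}{1 - P_{\mathrm{acc}}}
  = \frac{\hat{P}}{\hat{P} - P}\left(1 - \frac{P}{\hat{P}_{\mathrm{biased}}}\right)
  \;\xrightarrow{\,P \,\ll\, \hat{P}_{\mathrm{biased}}\,}\; \frac{\hat{P}}{\hat{P} - P} \,, \tag{141}
\end{equation}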

where the last asymptote shows that the reweighting factor becomes independent of the bias in
the limit that the bias factor is very large. Nonetheless, the difference is important since it allows
us to recover the physical no-branching probability. Currently enhancements of both ISR and FSR
branchings have been included, uniformly in phase space.
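The bookkeeping of eqs. (139) to (141) can be sketched in a few lines of Python. This is a toy illustration under the assumption of a single trial with fixed values of $P$ and $\hat{P}$; the function name and interface are hypothetical and do not correspond to PYTHIA's internal implementation.

```python
def biased_trial_weights(p, p_hat, enhance):
    """Accept probability and reweighting factors for a biased trial.

    p       : physical branching density P (requires 0 <= p < p_hat)
    p_hat   : unbiased trial density P-hat
    enhance : enhancement factor E (here assumed >= 1)
    Returns (P_acc, R_acc, R_disc), cf. eqs. (140) and (141).
    """
    if not 0.0 <= p < p_hat:
        raise ValueError("need 0 <= P < P_hat")
    if enhance < 1.0:
        raise ValueError("this sketch assumes E >= 1")
    p_acc = p / p_hat                  # unchanged accept probability
    r_acc = 1.0 / enhance              # weight for an accepted biased trial
    r_disc = (1.0 - p_acc * r_acc) / (1.0 - p_acc)  # weight for a discarded one
    return p_acc, r_acc, r_disc


if __name__ == "__main__":
    p, p_hat, enhance = 0.2, 1.0, 10.0
    p_acc, r_acc, r_disc = biased_trial_weights(p, p_hat, enhance)
    p_hat_biased = enhance * p_hat
    # Weighted accept rate reproduces P; weighted discard rate reproduces
    # the physical no-branching density P_hat_biased - P.
    print(p_hat_biased * p_acc * r_acc, p_hat_biased * (1.0 - p_acc) * r_disc)
```

The two printed numbers are the weighted accept and discard rates, which should equal $P$ and $\hat{P}_{\mathrm{biased}} - P$, respectively.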

4.2 The VINCIA antenna shower


The VINCIA shower implements an interleaved p⊥ -ordered evolution based on the so-called an-
tenna formalism. In event-generator contexts, this type of shower was first pioneered by the
ARIADNE model [4, 54], which was widely used e.g. at LEP. For completeness, we note that the
objects we call “antennae” here were actually called dipoles in that context, but today the term
dipole has taken on a different meaning, see, e.g. section 4.3 on DIRE.
Especially for FSR QCD radiation, VINCIA shares many features with ARIADNE, including its
evolution-variable definition and its antenna-style $2 \mapsto 3$ approach to parton branchings in which
both parents can acquire transverse recoil and the soft eikonal remains unpartitioned. These latter
two properties are specific to antenna showers.
For ISR, VINCIA’s treatment is quite different from that of ARIADNE, with VINCIA extending the
concept of (interleaved) backwards evolution [52, 78] to the antenna picture [100] via coherent
II and IF antennae [101], as well as so-called Resonance–Final (RF) ones [89]. The latter are
relevant to the decays of coloured resonances, such as top quarks. They come with their own,
dedicated kinematic mapping which is constructed to preserve the invariant mass of the decaying
resonance. Since all of its building blocks are explicitly coherent (at least at leading colour) and
interleaved in a single common sequence of decreasing p⊥ values, VINCIA should exhibit a quite
reliable description of soft coherence effects across essentially all physical contexts.
This extends to QED, for which VINCIA’s default antenna functions [102, 103] include fully
coherent (multipole) soft interference effects in addition to the collinear DGLAP structures. We are
not aware of any other multipole QED treatment that can be interleaved with the QCD evolution.
(For example, the YFS formalism [104] is constructed purely as an “afterburner”, i.e. not interleaved
with the QCD shower, and collinear logarithms can only be included order by order.)
A further difference with respect to ARIADNE is that VINCIA’s QCD and multipole QED showers
are constructed as so-called “sector” antenna showers, in which the phase space is divided into dis-
tinct (non-overlapping) colour and kinematics sectors, each of which only receives contributions
from one specific antenna-branching kernel. This has a number of mainly technical consequences
which will be elaborated on further below, to do with making the incorporation of higher-order
corrections as straightforward and efficient as possible. For the time being, ARIADNE-style “global”
showers also remain available as a non-default option.
Effects of particle masses are systematically included, both via mass corrections to the antenna
functions such that all relevant (quasi-)collinear limits are reproduced [101, 105], and by the use
of exact massive phase-space factorizations. The current default behaviour is to treat bottom and
heavier quarks, and muons and tau leptons, as massive in VINCIA, though this can be changed
if desired. (Weak bosons are always treated as massive.) A subtlety arises in the treatment of
incoming heavy-flavour quarks (and, potentially, muons). Kinematically, such partons are
bookkept as massless, similarly to the choice made in PYTHIA’s simple shower. The consequence is that
the treatment of mass effects for initial-state partons in VINCIA is less rigorous than for final-state
ones. One should also be aware that there can be a non-trivial interplay with the flavour scheme
employed by the chosen PDF set.
As a complementary option to the multipole QED shower, VINCIA also includes a module for
full-fledged electroweak showers [23, 106]. This option includes the full set of EW-branching
kernels, including both Higgs couplings and gauge-boson self-interactions, tallying to more than
1000 EW-antenna functions in total. The main limitation is that only the relevant (quasi-)collinear
limits are implemented, not the full soft interference structure; thus, also the QED treatment is
effectively reduced to a DGLAP-style treatment when using this option. Note also that the EW
module is based explicitly on VINCIA’s underlying formalism for helicity-dependent showers [107,
108] (e.g. to tell left- and right-handed weak bosons apart). This module therefore requires Born
partons with assigned helicities, which is not the default in PYTHIA, and must be provided either via
external LHEF events with helicity information, or via VINCIA’s dedicated option for hard-process
helicity selection. (The latter is based on PYTHIA’s run-time interface to external matrix-element
libraries; see the program’s online manual and example programs for configuration and linking
instructions.)
A further feature that was originally introduced with VINCIA’s electroweak-shower module, but
is now applied independently of it, is a novel treatment of finite-width effects, called interleaved
resonance decays [23], which are the default in VINCIA. This means that decays of short-lived
resonances, such as top quarks, W/Z bosons, or BSM particles, are inserted in the shower evolution
at an evolution scale of order the off-shellness of the resonance, instead of being treated
sequentially, after the shower of the hard process. This reflects the physical picture that short-lived
particles should not be able to radiate at frequencies lower than the inverse of their lifetime; only
their decay products can do that. This can produce subtle changes in reconstructed invariant-mass
distributions, relative to conventional (non-interleaved) resonance decays.
All of VINCIA’s shower modules are fully interleaved with PYTHIA’s treatment of multiparton
interactions (MPI), in the same manner as for the simple-shower model.

4.2.1 Common features


Some aspects are common to all of VINCIA’s shower modules. This includes the definition of
the evolution variable as well as recoil schemes and phase-space factorizations. These common
features are discussed in this subsection before going into further detail on each of the specific
components of VINCIA’s shower implementations.

Evolution variables All showers in VINCIA, including the QED and EW ones, are evolved in a
Lorentz-invariant scaled notion of off-shellness, based on a generalized version of the ARIADNE
definition of transverse momentum. For a generic branching $IK \to ijk$,
\begin{equation}
  p_{\perp j}^2 = \frac{\bar{q}_{ij}^2\,\bar{q}_{jk}^2}{s_{\max}} \,, \tag{142}
\end{equation}
where the off-shellness for final-state partons is defined as
\begin{equation}
  \bar{q}_{ij}^2 = (p_i + p_j)^2 - m_I^2 = m_{ij}^2 - m_I^2 \qquad i \text{ is final} \,, \tag{143}
\end{equation}
and that for initial-state partons is obtained via crossing (and an overall sign change to make it
positive),
\begin{equation}
  \bar{q}_{ij}^2 = -(p_i - p_j)^2 + m_I^2 \qquad i \text{ is initial} \,. \tag{144}
\end{equation}
These both involve the positive invariant $2\,p_i \cdot p_j$ but differ in the signs of pre- vs. post-branching
parton masses. This reflects the underlying crossing and sign change, combined with the propagator
structure of backwards evolution. For convenience, we define the dimensionful invariant
\begin{equation}
  s_{ij} \equiv 2\,p_i \cdot p_j \,, \tag{145}
\end{equation}
regardless of whether particles $i$ and $j$ are massless or not. The maximal antenna invariant, $s_{\max}$,
is then defined by
\begin{equation}
  s_{\max} =
  \begin{cases}
    s_{IK} & \text{FF} \\
    s_{aj} + s_{jk} & \text{IF \& RF} \\
    s_{ab} & \text{II}
  \end{cases} \,, \tag{146}
\end{equation}
where initial-state partons are labelled with letters from the beginning of the alphabet ($a$ and
$b$) and final-state ones are labelled by $i, j, k, \ldots$. Below, that labelling convention will be used
systematically to distinguish initial- and final-state partons.
We also define dimensionless (scaled) invariants and masses,
\begin{equation}
  y_{ij} = \frac{s_{ij}}{s_{\max}} \,; \qquad \mu_i^2 = \frac{m_i^2}{s_{\max}} \,. \tag{147}
\end{equation}
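To make the definitions in eqs. (142) to (147) concrete, the following Python sketch evaluates the evolution variable and the scaled invariants for a final-final branching. It assumes massless partons, for which $\bar{q}_{ij}^2 = s_{ij}$ and $s_{\max} = s_{IK} = s_{ij} + s_{jk} + s_{ik}$; the function names are illustrative only.

```python
def minkowski_dot(p, q):
    """Four-vector product with metric (+, -, -, -); vectors are (E, px, py, pz)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def ff_evolution_variable(p_i, p_j, p_k):
    """Return (pT^2_j, y_ij, y_jk) for a massless final-final branching IK -> ijk."""
    s_ij = 2.0 * minkowski_dot(p_i, p_j)
    s_jk = 2.0 * minkowski_dot(p_j, p_k)
    s_ik = 2.0 * minkowski_dot(p_i, p_k)
    s_max = s_ij + s_jk + s_ik         # equals s_IK for massless partons
    pt2 = s_ij * s_jk / s_max          # eq. (142) with q-bar^2 = s
    return pt2, s_ij / s_max, s_jk / s_max


if __name__ == "__main__":
    # Symmetric three-parton configuration: unit energies, 120 degrees apart.
    h = 3.0 ** 0.5 / 2.0
    p_i = (1.0, 0.0, 0.0, 1.0)
    p_j = (1.0, h, 0.0, -0.5)
    p_k = (1.0, -h, 0.0, -0.5)
    print(ff_evolution_variable(p_i, p_j, p_k))
```

For the symmetric configuration in the example, all scaled invariants equal $1/3$.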


For massless kinematics, the scaled invariants have very simple relations to the $z$ variables of
DGLAP-style approaches. Thus, for final-final (FF) antennae, the CM energy fractions are
\begin{equation}
  x_k = \frac{2E_k}{\sqrt{s_{IK}}} = 1 - y_{ij} \,, \tag{148}
\end{equation}
and similarly for the two other permutations of $(i, j, k)$. For initial-initial (II) antennae, the
incoming legs are always massless and the $y_{AB}$ invariant can be identified with the $z$ variable, since
\begin{equation}
  y_{AB} = \frac{s_{AB}}{s_{ab}} = \frac{x_A x_B}{x_a x_b} = z_a z_b \,, \tag{149}
\end{equation}
where we have emphasized that, for antenna-II branchings, in general both $x$ values change, with
\begin{equation}
  z_a = \frac{x_A}{x_a} = \sqrt{y_{AB}\,\frac{1 - y_{jb}}{1 - y_{aj}}}
  \qquad \text{and} \qquad
  z_b = \frac{x_B}{x_b} = \sqrt{y_{AB}\,\frac{1 - y_{aj}}{1 - y_{jb}}} \,. \tag{150}
\end{equation}
There is still the constraint that, in the $a$-collinear limit, $z_b \to 1$, and vice versa (for massless $j$).
For initial-final (IF) ones, the exact relations are more involved but again the collinear limits
can be examined via
\begin{equation}
  z_a = 1 - y_{jk} \,; \qquad z_k = 1 - y_{aj} \,. \tag{151}
\end{equation}
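The initial-initial relations in eqs. (149) and (150) can be checked numerically with a short Python sketch. It assumes a massless emitted parton $j$, for which momentum conservation gives $y_{AB} = 1 - y_{aj} - y_{jb}$; this relation and the function name are assumptions of the illustration rather than statements about VINCIA's code.

```python
import math


def ii_momentum_fractions(y_aj, y_jb):
    """Return (z_a, z_b, y_AB) for an II branching with massless j, cf. eq. (150)."""
    y_ab_pre = 1.0 - y_aj - y_jb       # y_AB = s_AB / s_ab for massless j
    if not (0.0 <= y_aj < 1.0 and 0.0 <= y_jb < 1.0 and y_ab_pre > 0.0):
        raise ValueError("invariants outside the physical region")
    z_a = math.sqrt(y_ab_pre * (1.0 - y_jb) / (1.0 - y_aj))
    z_b = math.sqrt(y_ab_pre * (1.0 - y_aj) / (1.0 - y_jb))
    return z_a, z_b, y_ab_pre


if __name__ == "__main__":
    z_a, z_b, y_ab_pre = ii_momentum_fractions(0.1, 0.2)
    print(z_a, z_b, z_a * z_b, y_ab_pre)   # the product z_a * z_b reproduces y_AB
    print(ii_momentum_fractions(0.0, 0.2))  # a-collinear limit: z_b = 1
```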

Recoil schemes In the antenna formalism, the branching recoil is shared between both antenna
parents for FSR emissions in an on-shell kinematics map along the lines of refs. [105, 109, 110]
including full mass dependence. This is illustrated in the top row in fig. 8. In the collinear limits,
any transverse momentum is fully absorbed within the collinear pair, and the anti-collinear parent
recoils purely longitudinally; therefore, in these limits the map agrees with the conventional dipole
ones, cf. refs. [88, 111, 112]. This means that the post-branching momenta are constructed as
\begin{align}
  p_i^\mu &= \left(E_i,\, 0,\, 0,\, |\vec{p}_i|\right) , \tag{152}\\
  p_j^\mu &= \left(E_j,\, -|\vec{p}_j| \sin\theta_{ij},\, 0,\, |\vec{p}_j| \cos\theta_{ij}\right) , \tag{153}\\
  p_k^\mu &= \left(E_k,\, |\vec{p}_k| \sin\theta_{ik},\, 0,\, |\vec{p}_k| \cos\theta_{ik}\right) , \tag{154}
\end{align}
in the rest frame of the parent $I$-$K$ antenna. Here, the energies are given by
\begin{equation}
  E_i = \frac{s_{ij} + s_{ik} + 2m_i^2}{2m_{IK}} \,, \qquad
  E_j = \frac{s_{ij} + s_{jk} + 2m_j^2}{2m_{IK}} \,, \qquad
  E_k = \frac{s_{ik} + s_{jk} + 2m_k^2}{2m_{IK}} \,, \tag{155}
\end{equation}
and the angles by
\begin{equation}
  \cos\theta_{ij} = \frac{2E_iE_j - s_{ij}}{2|\vec{p}_i||\vec{p}_j|} \,, \qquad
  \cos\theta_{ik} = \frac{2E_iE_k - s_{ik}}{2|\vec{p}_i||\vec{p}_k|} \,. \tag{156}
\end{equation}
Subsequently, the branching plane is rotated by an angle $\phi$, uniformly sampled in $[0, 2\pi]$, in
the $x$-$y$ plane, and by an angle $\psi$ between the mother parton $I$ and the daughter parton $i$, which
establishes the relative orientation of the post-branching partons with respect to the pre-branching
ones. As the choice of $\psi$ is not unique away from the collinear limits, VINCIA implements a few
different options for $\psi$, cf. refs. [105, 110]. In any of the choices, $\psi \to 0$ ensures that parton $i$
recoils purely longitudinally in the $K$-collinear limit and $\psi \to \pi - \theta_{ik}$ ensures that $k$ recoils purely
longitudinally in the $I$-collinear limit.
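The construction in eqs. (152) to (156) can be sketched as follows, before the rotations by $\phi$ and $\psi$ are applied. This Python example is an illustration only: the function name is hypothetical, and the antenna mass is obtained from $m_{IK}^2 = s_{ij} + s_{jk} + s_{ik} + m_i^2 + m_j^2 + m_k^2$, which follows from momentum conservation.

```python
import math


def ff_post_branching(s_ij, s_jk, s_ik, m_i=0.0, m_j=0.0, m_k=0.0):
    """Post-branching momenta (E, px, py, pz) in the I-K rest frame, unrotated."""
    m_ik = math.sqrt(s_ij + s_jk + s_ik + m_i**2 + m_j**2 + m_k**2)
    # Energies, eq. (155).
    e_i = (s_ij + s_ik + 2.0 * m_i**2) / (2.0 * m_ik)
    e_j = (s_ij + s_jk + 2.0 * m_j**2) / (2.0 * m_ik)
    e_k = (s_ik + s_jk + 2.0 * m_k**2) / (2.0 * m_ik)
    # Three-momentum magnitudes from the mass-shell conditions.
    abs_i = math.sqrt(e_i**2 - m_i**2)
    abs_j = math.sqrt(e_j**2 - m_j**2)
    abs_k = math.sqrt(e_k**2 - m_k**2)
    # Opening angles relative to parton i, eq. (156).
    cos_ij = (2.0 * e_i * e_j - s_ij) / (2.0 * abs_i * abs_j)
    cos_ik = (2.0 * e_i * e_k - s_ik) / (2.0 * abs_i * abs_k)
    sin_ij = math.sqrt(max(0.0, 1.0 - cos_ij**2))
    sin_ik = math.sqrt(max(0.0, 1.0 - cos_ik**2))
    p_i = (e_i, 0.0, 0.0, abs_i)                          # eq. (152)
    p_j = (e_j, -abs_j * sin_ij, 0.0, abs_j * cos_ij)     # eq. (153)
    p_k = (e_k, abs_k * sin_ik, 0.0, abs_k * cos_ik)      # eq. (154)
    return p_i, p_j, p_k


if __name__ == "__main__":
    momenta = ff_post_branching(2000.0, 3000.0, 5000.0)
    total = [sum(p[n] for p in momenta) for n in range(4)]
    print(total)   # total momentum of the antenna in its rest frame
```

For a physical set of invariants, the summed momentum should be $(m_{IK}, 0, 0, 0)$ up to rounding.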

[Figure 8 graphic: one row of pre- and post-branching momentum diagrams for each of the FF, IF, II, and RF maps.]

Figure 8: Illustration of VINCIA’s kinematic maps, for final-final (FF), initial-final (IF),
initial-initial (II), and resonance-final (RF) branchings. Dashed lines represent initial-
state momenta, non-participating legs are shaded grey, and the set of final-state specta-
tors (R) is shown in cyan. For II and RF branchings, the frame reinterpretation done in
the last step imparts a collective recoil to the final-state spectators, pR → p r .

For initial-state radiation, one (or both) parents must remain collinear to the beam axis, and
instead the hard system can now acquire transverse recoil. This makes it more complicated to
define a truly antenna-like recoil scheme, and VINCIA’s choices [101, 113, 114] are more similar
to dipole treatments such as the ones in refs. [115–118]. In the case of IF antennae, this amounts
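to constructing the post-branching momenta as
\begin{align}
  x_a &= x_A / y_{AK} \qquad \left( \implies p_a^\mu = \frac{1}{y_{AK}}\, p_A^\mu \right) , \tag{157}\\
  p_j^\mu &= \frac{(y_{ak} + \mu_j^2 - \mu_k^2) + (y_{ak} - y_{aj})\,\mu_K^2 - y_{AK}\, y_{ak}}{y_{AK}}\, p_A^\mu
            + y_{aj}\, p_K^\mu + \sqrt{\Gamma_{ajk}}\; q_{\perp\max}^\mu \,, \tag{158}\\
  p_k^\mu &= \frac{(y_{aj} - \mu_j^2 + \mu_k^2) + (y_{aj} - y_{ak})\,\mu_K^2 - y_{AK}\, y_{aj}}{y_{AK}}\, p_A^\mu
            + y_{ak}\, p_K^\mu - \sqrt{\Gamma_{ajk}}\; q_{\perp\max}^\mu \,, \tag{159}
\end{align}
in the $A$-$K$ rest frame, as illustrated in the second row in fig. 8. In this context, $\Gamma_{ajk} = y_{aj}\, y_{jk}\, y_{ak}$ and
$q_{\perp\max}^\mu$ denotes the transverse component in terms of a spacelike four-vector that is perpendicular
to $p_A$ and $p_K$ and obeys $q_{\perp\max}^2 = -(s_{aj} + s_{ak})$.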
For II antennae, both initial-state particles are evolved at the same time and therefore both
momentum fractions change simultaneously [100,119], cf. the third row in fig. 8 for an illustration.
(Note that this is different to “dipole” kinematics, in which only one of the incoming x fractions
can change in each branching.) Consequently, the post-branching momenta are constructed as
µ
pB , (160)
\begin{align}
  x_a &= x_A / z_a \qquad \left( \implies p_a^\mu = p_A^\mu / z_a \right) , \tag{161}\\
  x_b &= x_B / z_b \qquad \left( \implies p_b^\mu = p_B^\mu / z_b \right) , \tag{162}\\
  p_j^\mu &= y_{jb}\, p_a^\mu + y_{aj}\, p_b^\mu + \sqrt{y_{aj}\, y_{jb} - \mu_j^2}\; q_{\perp\max}^\mu \,, \tag{163}\\
  p_r^\mu &= p_a^\mu + p_b^\mu - p_j^\mu \,, \tag{164}
\end{align}
where the $z_{a,b}$ fractions are defined in eq. (150), $q_{\perp\max}^\mu$ is again a spacelike four-vector perpendicular
to $p_A$ and $p_B$ with $q_{\perp\max}^2 = -s_{ab}$, and $r$ denotes the recoiling spectator system whose combined
invariant mass and rapidity are both unchanged by the branching: $p_r^2 = p_R^2$ and $y_r = y_R$.
In both IF and II antennae, all momenta are rotated about the branching plane by a uniformly
distributed angle φ ∈ [0, 2π].
For RF antennae [89], the invariant mass of the resonance must be kept fixed, $p_a^2 = p_A^2 = m_A^2$.
The post-branching kinematics are therefore constructed in the resonance rest frame with the
$z$-axis defined along $p_K$, so that
\begin{align}
  p_A^\mu &= p_a^\mu = \left(m_A,\, 0,\, 0,\, 0\right) , \tag{165}\\
  p_k^\mu &= \left(E_k,\, 0,\, 0,\, \sqrt{E_k^2 - m_k^2}\right) , \tag{166}\\
  p_j^\mu &= \left(E_j,\, \sqrt{E_j^2 - m_j^2}\, \sin\theta_{jk},\, 0,\, \sqrt{E_j^2 - m_j^2}\, \cos\theta_{jk}\right) , \tag{167}\\
  p_r^\mu &= \left(m_A - E_k - E_j,\, -\sqrt{E_j^2 - m_j^2}\, \sin\theta_{jk},\, 0,\,
             -\sqrt{E_k^2 - m_k^2} - \sqrt{E_j^2 - m_j^2}\, \cos\theta_{jk}\right) , \tag{168}
\end{align}
where $r$ denotes the remainder of the resonance decay system and
\begin{equation}
  E_j = \frac{s_{aj}}{2m_A} \,, \qquad E_k = \frac{s_{ak}}{2m_A} \,, \qquad
  \cos\theta_{jk} = \frac{2E_jE_k - s_{jk}}{2\sqrt{(E_k^2 - m_k^2)(E_j^2 - m_j^2)}} \,. \tag{169}
\end{equation}
These momenta are rotated about the $y$ axis such that the set of recoilers is along $-z$, so that only
$j$ and $k$ receive transverse recoil. Again, the momenta are subsequently rotated by a uniformly
sampled angle $\phi \in [0, 2\pi]$ about the $z$ axis. The original orientation of $p_K$ with respect to $z$ is
then recovered in a final step. This map is illustrated in the bottom row of fig. 8.

Helicity Dependence All of VINCIA’s QCD and EW (but not QED) antenna functions are imple-
mented with full helicity dependence, i.e. decomposed into distinct terms for each set of contribut-
ing helicities. This facilitates helicity-dependent showering and matching, given a polarized Born
state [108]. For brevity, the QCD antenna functions shown below are averaged over pre-branching
helicities and summed over post-branching ones; see ref. [101] for details on their individual he-
licity components.

Biased branchings and uncertainty weights Just as for the simple shower, VINCIA contains
several options for artificially increasing (or suppressing) the probabilities for different branching
types to occur, accompanied by non-unity event weights to compensate for how over- or under-
represented each generated event becomes in the resulting sample. This can be especially useful
to enhance the rate of rare splittings, such as $g \to b\bar{b}$. The general procedure is described in
section 4.1.5 and follows the formalism presented in ref. [20].
As a relatively minor extension, VINCIA also allows for “enhancement” factors smaller than
unity, which then act to suppress the corresponding branchings. The intended use case is to
focus on Sudakov-suppressed regions of phase space. In the algorithms described in refs. [15,
20], enhancement factors smaller than unity are not guaranteed to produce positive weights. In
VINCIA’s implementation, this issue is sidestepped by letting trial branchings be enhanced by a
factor $\max(1, E)$,
\begin{equation}
  \hat{P}_{\mathrm{biased}} = \max(1, E)\, \hat{P} \,, \tag{170}
\end{equation}
where $\hat{P}$ is the unenhanced trial-generation probability density and $E$ is the enhancement (or
suppression) factor. Thus, for $E < 1$, the trial probability is not modified. Conversely, each trial
branching is accepted with a probability
\begin{equation}
  P_{\mathrm{acc}} = \min(1, E)\, \frac{P}{\hat{P}} \,, \tag{171}
\end{equation}
where $P/\hat{P}$ is the unbiased accept probability. The reweighting factor for an accepted trial
branching remains $R_{\mathrm{acc}} = 1/E$ (cf. section 4.1.5), while the reweighting factor for a discarded one
generalizes to
\begin{equation}
  R_{\mathrm{disc}} = \frac{\hat{P}_{\mathrm{biased}} - P}{\hat{P}_{\mathrm{biased}} - E\,P} \,. \tag{172}
\end{equation}
Thus, in VINCIA’s version of the enhancement algorithm, both $R_{\mathrm{acc}}$ and $R_{\mathrm{disc}}$ are positive definite
for any $E > 0$ and $P < \hat{P}$.
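The generalized weights of eqs. (170) to (172) are summarized in the Python sketch below, which checks that both enhancement ($E > 1$) and suppression ($E < 1$) reproduce the physical branching and no-branching densities. The function name and the fixed numerical values are hypothetical; this is a single-trial illustration, not VINCIA's implementation.

```python
def enhanced_trial(p, p_hat, enhance):
    """Return (P_hat_biased, P_acc, R_acc, R_disc) for enhancement factor E > 0."""
    if enhance <= 0.0 or not 0.0 <= p < p_hat:
        raise ValueError("need E > 0 and 0 <= P < P_hat")
    p_hat_biased = max(1.0, enhance) * p_hat                     # eq. (170)
    p_acc = min(1.0, enhance) * p / p_hat                        # eq. (171)
    r_acc = 1.0 / enhance                                        # accepted trial
    r_disc = (p_hat_biased - p) / (p_hat_biased - enhance * p)   # eq. (172)
    return p_hat_biased, p_acc, r_acc, r_disc


if __name__ == "__main__":
    for enhance in (0.5, 10.0):
        p_hat_b, p_acc, r_acc, r_disc = enhanced_trial(0.2, 1.0, enhance)
        accepted = p_hat_b * p_acc * r_acc             # should equal P
        discarded = p_hat_b * (1.0 - p_acc) * r_disc   # should equal P_hat_biased - P
        print(enhance, accepted, discarded, r_acc > 0.0 and r_disc > 0.0)
```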
Although automated shower-variation weights were a signature feature of early versions of
VINCIA [64], such variations have not yet been incorporated into the current VINCIA implementa-
tion in PYTHIA 8.3 but remain planned for a future revision. See the program’s online manual
for updates.

4.2.2 QCD showers


In their present incarnation, VINCIA’s QCD showers are fully developed within the so-called sector
framework [98, 101, 107, 109, 120–122], in which only a single branching contributes per phase-
space point. This is enforced by dividing the phase space into sectors according to a decomposition
of unity as given by the following sum of Heaviside step functions,
\begin{equation}
  1 = \sum_j \Theta^{\mathrm{sct}}_j(p_{\perp j}^2, \zeta_j, \phi_j)
    = \sum_j \theta\!\left( \min_k \left\{ Q^2_{\mathrm{res},k} \right\} - Q^2_{\mathrm{res},j} \right) . \tag{173}
\end{equation}
To discriminate between the different sectors, a “sector resolution” variable is used, which we
define to be [98]
\begin{equation}
  Q^2_{\mathrm{res},j} =
  \begin{cases}
    p_{\perp j}^2 & \text{if } j \text{ a gluon} \\[4pt]
    \bar{q}_{ij}^2 \sqrt{\dfrac{\bar{q}_{jk}^2}{s_{\max}}} & \text{if } (i, j) \text{ a quark-antiquark pair}
  \end{cases} \,, \tag{174}
\end{equation}
with $p_{\perp j}^2$ and $\bar{q}_{ij}$ as defined in eq. (142). The asymmetric choice for quark-antiquark pairs accounts
for the fact that in gluon splittings with an arbitrary colour-connected recoiler, $X_I\, g_K \mapsto X_i\, q_j\, \bar{q}_k$,
there is no singularity associated to the $i$-$j$-collinear limit [98].
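The sector decomposition of eqs. (173) and (174) amounts to selecting, among all candidate clusterings of a configuration, the one with the smallest resolution variable. A minimal Python sketch is given below; it assumes massless partons (so that $\bar{q}^2_{ij} = s_{ij}$), and the candidate tuples and function names are hypothetical.

```python
import math


def sector_resolution(s_ij, s_jk, s_max, j_is_gluon):
    """Sector resolution Q^2_res of eq. (174) for massless partons."""
    if j_is_gluon:
        return s_ij * s_jk / s_max              # the evolution variable pT^2_j
    return s_ij * math.sqrt(s_jk / s_max)       # quark-antiquark pair (i, j)


def select_sector(candidates):
    """Index of the candidate with minimal Q^2_res, i.e. the sector with Theta_sct = 1.

    candidates: list of (s_ij, s_jk, s_max, j_is_gluon) tuples, one per
    possible clustering of the parton configuration.
    """
    resolutions = [sector_resolution(*c) for c in candidates]
    return min(range(len(resolutions)), key=resolutions.__getitem__)


if __name__ == "__main__":
    candidates = [
        (2000.0, 3000.0, 10000.0, True),    # gluon emission, Q^2_res = 600
        (500.0, 4000.0, 10000.0, True),     # gluon emission, Q^2_res = 200
        (400.0, 2500.0, 10000.0, False),    # quark pair,     Q^2_res = 200
    ]
    print([sector_resolution(*c) for c in candidates], select_sector(candidates))
```

Ties, as between the last two candidates in the example, are broken here by list order; how exact ties are treated in practice is not specified in the text above.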


The shower evolution is given by the exponentiation of leading-order antenna functions [119–127],
specifically sector-antenna ones defined by the ratio of colour-ordered squared amplitudes,
\begin{equation}
  A_{j/IK} = g_s^2\, C_{j/IK}\, \bar{A}_{j/IK}
           = \frac{\left| M_{ijk}(q; p_i, p_j, p_k) \right|^2}{\left| M_{IK}(q; p_I, p_K) \right|^2} \,, \tag{175}
\end{equation}
with $\bar{A}_{j/IK}$ the coupling- and colour-factor-stripped antenna function. Antenna functions for
quark-antiquark, quark-gluon, and gluon-gluon parents can be derived from off-shell photon decays
$\gamma^* \to q\bar{q}$ [124], neutralino decays $\tilde{\chi}^0 \to \tilde{g} g$ [125], and Higgs decays $H \to gg$ [126], respectively.
An antenna function derived in this way will include the full single-unresolved singularity structure
of all colour dipoles in the given colour-ordered amplitude.
When multiple colour dipoles are present in the three-particle state used to derive the function,
these can be divided into sub-antenna functions,
\begin{align}
  A_{g/qg}(p_i, p_j, p_k) &= A^{\mathrm{gl}}_{g/qg}(p_i, p_j, p_k) + A^{\mathrm{gl}}_{g/qg}(p_i, p_k, p_j) \,, \nonumber\\
  A_{g/gg}(p_i, p_j, p_k) &= A^{\mathrm{gl}}_{g/gg}(p_i, p_j, p_k) + A^{\mathrm{gl}}_{g/gg}(p_i, p_k, p_j) + A^{\mathrm{gl}}_{g/gg}(p_j, p_i, p_k) \,. \tag{176}
\end{align}

Such functions build the basis for so-called global antenna showers, in which every antenna
radiates over all of its branching phase space, and only the sum of all antennae recovers the full
single-unresolved singularity structure. Denoting these by $A^{\mathrm{gl}}_{j/IK}$, the specific choices for global
final-state antenna functions in VINCIA are
\begin{align}
  A^{\mathrm{gl}}_{g/q\bar{q}}(s_{IK}; y_{ij}, y_{jk}, \mu_i^2, \mu_k^2) &= \frac{g_s^2\, C_{g/q\bar{q}}}{s_{IK}}
    \left[ \frac{(1 - y_{ij})^2 + (1 - y_{jk})^2}{y_{ij}\, y_{jk}} + 1 - \frac{2\mu_i^2}{y_{ij}^2} - \frac{2\mu_k^2}{y_{jk}^2} \right] , \tag{177}\\
  A^{\mathrm{gl}}_{g/qg}(s_{IK}; y_{ij}, y_{jk}, \mu_i^2) &= \frac{g_s^2\, C_{g/qg}}{s_{IK}}
    \left[ \frac{(1 - y_{ij})^3 + (1 - y_{jk})^2}{y_{ij}\, y_{jk}} + 2 - y_{ij} - \frac{y_{jk}}{2} - \frac{2\mu_i^2}{y_{ij}^2} \right] , \tag{178}\\
  A^{\mathrm{gl}}_{g/gg}(s_{IK}; y_{ij}, y_{jk}) &= \frac{g_s^2\, C_{g/gg}}{s_{IK}}
    \left[ \frac{(1 - y_{ij})^3 + (1 - y_{jk})^3}{y_{ij}\, y_{jk}} + 3 - \frac{3}{2}\, y_{ij} - \frac{3}{2}\, y_{jk} \right] , \tag{179}\\
  A^{\mathrm{gl}}_{q/gX}(s_{IK}; y_{ij}, y_{jk}, y_{ik}, \mu_Q^2) &= \frac{g_s^2\, C_{q/Xg}}{2 s_{IK}}\, \frac{1}{y_{ij} + 2\mu_Q^2}
    \left[ y_{ik}^2 + y_{jk}^2 + \frac{2\mu_Q^2}{y_{ij} + 2\mu_Q^2} \right] . \tag{180}
\end{align}
They only differ from the ones given in refs. [119–127] by non-singular terms. Below, we show
how VINCIA’s sector-antenna functions, $A^{\mathrm{sct}}_{j/IK}$, are constructed from these building blocks.

Single-unresolved limits In the sector shower formalism, there is only a single branching kernel
that contributes per phase-space point. In order to capture the correct leading-logarithmic struc-
ture of QCD matrix elements, it is therefore vital that sector-antenna functions fully incorporate
all single-unresolved limits of a given antenna/dipole. This means that a single sector-antenna
function has to reproduce the full eikonal in the soft-gluon limit,
\begin{equation}
  A^{\mathrm{sct}}_{j/IK}(s_{IK}; y_{ij}, y_{jk}, \mu_i^2, \mu_k^2) \;\xrightarrow{\,g_j\ \mathrm{soft}\,}\;
  g_s^2\, C_{j/IK} \left[ \frac{2 s_{ik}}{s_{ij}\, s_{jk}} - \frac{2 m_i^2}{s_{ij}^2} - \frac{2 m_k^2}{s_{jk}^2} \right] , \tag{181}
\end{equation}
while reproducing the full massive DGLAP splitting kernel $P_{I \to ij}(z, \mu_i^2)$ (or $P_{I \to ij}(z, \mu_i^2)/z$ for
initial-state partons) in any (quasi-)collinear limit,
\begin{equation}
  A^{\mathrm{sct}}_{j/IK}(s_{IK}; y_{ij}, y_{jk}, \mu_i^2, \mu_k^2) \;\xrightarrow{\,i \,\parallel\, j\,}\;
  g_s^2\, C_{j/IK}\, \frac{P_{I \to ij}(z, \mu_i^2)}{s_{ij}} \,. \tag{182}
\end{equation}


This differs from conventional (non-sector) parton-shower algorithms, in which the soft and/or
collinear singularity structures are partial fractioned onto different branching kernels. E.g. in
DGLAP-based and dipole approaches, the soft eikonal is partial fractioned onto two separate ker-
nels (which are associated with two different recoil maps), and the same is true of gluon-collinear
singularities in both dipole and global-antenna showers. In the sector-antenna formalism, each
antenna function reproduces both the full eikonal and the full DGLAP kernel in the respective
limits, and double counting is avoided by allowing only one such antenna function to contribute
per phase-space point.

FSR antenna functions In VINCIA, final-final (FF) sector-antenna functions for gluon emissions
are constructed from their global counterparts, eqs. (177) to (180), by symmetrizing over colour-
connected gluons in the following way,
\begin{align}
  A^{\mathrm{sct}}_{g/IK}(s_{IK}; y_{ij}, y_{jk}, \mu_i^2, \mu_k^2) ={}& A^{\mathrm{gl}}_{g/IK}(s_{IK}; y_{ij}, y_{jk}, \mu_i^2, \mu_k^2) \nonumber\\
  &+ \delta_{Ig}\, A^{\mathrm{gl}}_{g/IK}(s_{IK}; y_{ij}, 1 - y_{jk}, \mu_i^2, \mu_k^2) \tag{183}\\
  &+ \delta_{Kg}\, A^{\mathrm{gl}}_{g/IK}(s_{IK}; 1 - y_{ij}, y_{jk}, \mu_i^2, \mu_k^2) \,, \nonumber
\end{align}
where $\delta_{Ig} = 1$ if $I$ is a gluon and zero otherwise, and similarly for $K$. Note that the symmetrization
is done on the CM energy fraction of the relevant gluon(s), as in $(y_{jk} = 1 - x_i) \to (1 - y_{jk} = x_i)$
in the symmetrization for $I \to ij$, instead of via explicit permutations of the $i$ and $j$ momenta
as in eq. (176), which would correspond to $y_{jk} \to (y_{ik} = 1 - y_{jk} - y_{ij})$. This slight difference
(which vanishes in the relevant collinear limit $y_{ij} \to 0$) is to ensure finiteness in phase-space
regions close to the “hard” boundary $y_{ik} \to 0$. Although the $y_{ik} = 0$ region will never belong to the
$j$-emission sector, this damping of the singularity is important as it allows for the sampling of the
sector-antenna function over all of phase space with a post-hoc imposed sector veto. Additionally,
it ensures numerical stability whenever sector boundaries become close to the $y_{ik}$-singular region.
For gluon-splitting sector-antenna functions, an equivalent procedure yields
\begin{equation}
  A^{\mathrm{sct}}_{q/gX}(s_{IK}; y_{ij}, y_{jk}, y_{ik}, \mu_Q^2) = 2\, A^{\mathrm{gl}}_{q/gX}(s_{IK}; y_{ij}, y_{jk}, y_{ik}, \mu_Q^2) \,. \tag{184}
\end{equation}

Antenna functions for final-state partons that are colour-connected to incoming ones, as in initial-
final (IF) or resonance-final (RF) colour flows, are discussed below.
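The relation between the global functions of eqs. (177) to (179) and the sector functions of eq. (183) can be illustrated with the Python sketch below. It works with massless partons and with the overall factor $g_s^2\, C_{j/IK}/s_{IK}$ stripped off; the function names are hypothetical. The example checks numerically that the sector $qg$ function recovers the full (colour-stripped) $g \to gg$ DGLAP kernel in the $j \parallel k$ limit, while the global function alone only carries part of it.

```python
def a_gl_gqq(y_ij, y_jk):
    """Bracket of eq. (177) for massless partons."""
    return ((1 - y_ij) ** 2 + (1 - y_jk) ** 2) / (y_ij * y_jk) + 1.0


def a_gl_gqg(y_ij, y_jk):
    """Bracket of eq. (178) for massless partons."""
    return ((1 - y_ij) ** 3 + (1 - y_jk) ** 2) / (y_ij * y_jk) + 2.0 - y_ij - y_jk / 2.0


def a_gl_ggg(y_ij, y_jk):
    """Bracket of eq. (179)."""
    return ((1 - y_ij) ** 3 + (1 - y_jk) ** 3) / (y_ij * y_jk) + 3.0 - 1.5 * y_ij - 1.5 * y_jk


def a_sct(a_gl, y_ij, y_jk, i_is_gluon, k_is_gluon):
    """Sector function built from a global one by the symmetrization of eq. (183)."""
    value = a_gl(y_ij, y_jk)
    if i_is_gluon:
        value += a_gl(y_ij, 1.0 - y_jk)
    if k_is_gluon:
        value += a_gl(1.0 - y_ij, y_jk)
    return value


def p_gg(z):
    """Colour-stripped g -> gg kernel in antenna normalization."""
    return 2.0 * (z / (1.0 - z) + (1.0 - z) / z + z * (1.0 - z))


if __name__ == "__main__":
    y_ij, y_jk = 0.3, 1e-8                        # j-k collinear limit of a qg antenna
    sector = y_jk * a_sct(a_gl_gqg, y_ij, y_jk, False, True)
    print(sector, p_gg(y_ij))                     # agree up to O(y_jk) corrections
```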

ISR antenna functions As for final-state radiation, sector-antenna functions involving initial-
state partons can be obtained by symmetrizing corresponding global ones over final-state gluons.
The reason initial-state legs do not need to be symmetrized is that there is no sector for “emission
into the initial state”. (Analogously, while jet algorithms may decide to cluster final-state partons
either with each other or with the beam, the beam itself is hard by definition and cannot be
clustered away.)
This means that, even in the global-antenna approach, beam-collinear singularities do not need
to be partial-fractioned. Hence, for II antennae, there is no difference between global and sector-
antenna functions; while for initial-final gluon emissions, antenna functions with two final-state
gluons are symmetrized as follows,
\begin{align}
  A^{\mathrm{sct,IF}}_{g/AK}(s_{AK}; y_{aj}, y_{jk}, \mu_a^2, \mu_k^2) ={}& A^{\mathrm{gl,IF}}_{g/AK}(s_{AK}; y_{aj}, y_{jk}, \mu_a^2, \mu_k^2) \nonumber\\
  &+ \delta_{gK}\, A^{\mathrm{gl,IF}}_{g/AK}(s_{AK}; 1 - y_{aj} + y_{jk}, y_{jk}, \mu_a^2, \mu_k^2) \,. \tag{185}
\end{align}


Finiteness close to the spurious yak → 0 singularity is here again ensured by adding y jk to the
symmetrized argument. Initial-final antenna functions describing final-state gluon splittings are
obtained in exactly the same way as in eq. (184).
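Global initial-final and initial-initial antenna functions are obtained from eqs. (177) to (180) by
crossing partons from the final state into the initial state. For initial-initial antennae, the crossing
$(I, K, i, k) \to (-A, -B, -a, -b)$ implies:
\begin{align}
  y_{ij} = \frac{s_{ij}}{s_{IK}} &\to \frac{-s_{aj}}{s_{AB}} = -\frac{y_{aj}}{y_{AB}} \,, \nonumber\\
  y_{jk} = \frac{s_{jk}}{s_{IK}} &\to \frac{-s_{jb}}{s_{AB}} = -\frac{y_{jb}}{y_{AB}} \,, \tag{186}\\
  y_{ik} = \frac{s_{ik}}{s_{IK}} &\to \frac{s_{ab}}{s_{AB}} = \frac{1}{y_{AB}} \,, \nonumber
\end{align}
while for initial-final antennae, the crossing $(I, i) \to (-A, -a)$ yields:
\begin{align}
  y_{ij} = \frac{s_{ij}}{s_{IK}} &\to \frac{-s_{aj}}{-s_{AK}} = \frac{y_{aj}}{y_{AK}} \,, \nonumber\\
  y_{jk} = \frac{s_{jk}}{s_{IK}} &\to \frac{s_{jk}}{-s_{AK}} = -\frac{y_{jk}}{y_{AK}} \,, \tag{187}\\
  y_{ik} = \frac{s_{ik}}{s_{IK}} &\to \frac{-s_{ak}}{-s_{AK}} = \frac{y_{ak}}{y_{AK}} \,. \nonumber
\end{align}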

The RF antenna functions are identical to the IF ones. The full set of VINCIA antenna functions,
including their helicity contributions, can be found in ref. [101].

The strong coupling VINCIA offers the same basic options for the strong coupling as the simple
shower does, with up to 2-loop running matched across flavour thresholds and an option to use
the CMW scheme. However, whereas the main tuneable parameters in the simple shower are the
effective values of $\alpha_s^{\mathrm{ISR}}(M_Z^2)$ and $\alpha_s^{\mathrm{FSR}}(M_Z^2)$ (which may then be interpreted as being given in a
renormalization scheme not necessarily identical to $\overline{\mathrm{MS}}$), in VINCIA one instead specifies a single
common value for $\alpha_s^{\overline{\mathrm{MS}}}(M_Z^2)$, normally just set to agree with a reasonable global average value
such as that given by the PDG [85], with different effective values for different branching types
obtained via user-specifiable renormalization-scale prefactors,
\begin{equation}
  \alpha_s^{\overline{\mathrm{MS}}}(M_Z^2) \;\to\;
  \begin{cases}
    \alpha_s\!\left(k_E^F\, p_{\perp j}^2 + \mu_0^2\right) & \text{for FF and RF gluon emissions} \,, \\
    \alpha_s\!\left(k_S^F\, p_{\perp j}^2 + \mu_0^2\right) & \text{for final-state gluon splittings} \,, \\
    \alpha_s\!\left(k_E^I\, p_{\perp j}^2 + \mu_0^2\right) & \text{for II and IF gluon emissions} \,, \\
    \alpha_s\!\left(k_S^I\, p_{\perp j}^2 + \mu_0^2\right) & \text{for initial-state gluon splittings} \,, \\
    \alpha_s\!\left(k_C^I\, p_{\perp j}^2 + \mu_0^2\right) & \text{for initial-state gluon conversions} \,,
  \end{cases} \tag{188}
\end{equation}
where the scheme can be either $\overline{\mathrm{MS}}$ or CMW and $\mu_0 \sim \mathcal{O}(\Lambda_{\mathrm{QCD}})$ is a fixed scale that forces the
effective coupling to asymptote to $\alpha_s(\mu_0^2)$ for $p_{\perp j} \to 0$. A maximum value can also be specified
beyond which $\alpha_s$ is not allowed to grow, effectively freezing the coupling at that value in the
infrared.
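The structure of eq. (188), including the infrared regulator $\mu_0$ and the freezing of the coupling, can be sketched as follows. For brevity, this Python example uses a one-loop running with a fixed number of flavours, which is a simplifying assumption (the text above refers to up to 2-loop running matched across flavour thresholds); all function names and default parameter values are hypothetical.

```python
import math


def one_loop_alpha_s(q2, alpha_mz=0.118, m_z=91.1876, n_f=5):
    """One-loop running coupling with fixed n_f (illustrative simplification)."""
    if q2 <= 0.0:
        return math.inf
    b0 = (33.0 - 2.0 * n_f) / (12.0 * math.pi)
    denominator = 1.0 + alpha_mz * b0 * math.log(q2 / m_z**2)
    # At or below the Landau pole the one-loop expression is meaningless.
    return alpha_mz / denominator if denominator > 0.0 else math.inf


def effective_alpha_s(pt2, k_mu=1.0, mu0=0.75, alpha_max=1.0):
    """Coupling evaluated at k_mu * pT^2 + mu0^2 and frozen at alpha_max, cf. eq. (188).

    pt2       : evolution scale pT^2 in GeV^2
    k_mu      : renormalization-scale prefactor for the branching type
    mu0       : fixed infrared regulator scale in GeV
    alpha_max : maximum allowed value of the coupling
    """
    return min(one_loop_alpha_s(k_mu * pt2 + mu0**2), alpha_max)


if __name__ == "__main__":
    for pt2 in (0.0, 1.0, 100.0, 91.1876**2):
        print(pt2, effective_alpha_s(pt2), effective_alpha_s(pt2, k_mu=0.5))
```

A prefactor below unity lowers the renormalization scale and therefore increases the effective coupling for the corresponding branching type.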


Evolution equations The differential branching probability as implemented by the sector shower
is given as the sum of individual $IK \mapsto ijk$ antenna branching probabilities,
\begin{equation}
  \frac{\mathrm{d}P}{\mathrm{d}p_{\perp j}^2} = \sum_j \frac{\mathrm{d}P_{j/IK}}{\mathrm{d}p_{\perp j}^2} \,, \tag{189}
\end{equation}
which can be written in terms of the shower evolution variable $p_{\perp j}^2$ and an arbitrary complementary
phase-space variable $\zeta$ as
\begin{equation}
  \frac{\mathrm{d}P_{j/IK}}{\mathrm{d}p_{\perp j}^2} = \frac{\alpha_s(p_{\perp j}^2)}{4\pi}\, C_{j/IK}
  \int_{\zeta_{\min}(p_{\perp j}^2)}^{\zeta_{\max}(p_{\perp j}^2)} \int_0^{2\pi}
  \bar{A}^{\mathrm{sct}}_{j/IK}(p_{\perp j}^2, \zeta)\, R_{\mathrm{PDF}}\, F_\Phi\, J(p_{\perp j}^2, \zeta)\,
  \Theta^{\mathrm{sct}}(p_{\perp j}^2, \zeta, \phi)\, \frac{\mathrm{d}\phi}{2\pi}\, \mathrm{d}\zeta \,. \tag{190}
\end{equation}
Here, the Jacobian $J(p_{\perp j}^2, \zeta)$ accounts for the change to the shower variables $(y_{ij}, y_{jk}) \mapsto (p_{\perp j}^2, \zeta)$,
for which different choices are implemented in VINCIA, depending on the branching type, cf. ref. [101,
section 2.5]. Note that, since the starting point is an exact phase-space factorization and the Jacobian
factor $J(p_{\perp j}^2, \zeta)$ accounts for the mapping to shower variables, there is no physical dependence
on the choice of $\zeta$ in VINCIA; it only affects how simple or complicated the trial integrals become,
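and the efficiency with which trial branchings can be generated. The phase-space factor
\begin{equation}
  F_\Phi =
  \begin{cases}
    s_{IK}\, \dfrac{s_{IK}}{\sqrt{\lambda(s_{IK}, m_i^2, m_k^2)}} & \text{FF} \\[10pt]
    \dfrac{s_{AK} + m_j^2 + m_k^2 - m_K^2}{(1 - y_{jk})^3}\, \dfrac{s_{AK} + m_j^2 + m_k^2 - m_K^2}{\sqrt{\lambda(m_A^2, m_{AK}^2, m_K^2)}} & \text{RF} \\[10pt]
    \dfrac{s_{AK}}{1 - y_{jk}} & \text{IF} \\[10pt]
    \dfrac{s_{AB}}{1 - y_{aj} - y_{jb}} & \text{II}
  \end{cases} \tag{191}
\end{equation}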

accounts for the relative size of the post-branching phase space to the pre-branching phase space.
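For ISR, a PDF ratio is included for every initial-state parton,
\begin{equation}
  R_{\mathrm{PDF}} =
  \begin{cases}
    1 & \text{FF \& RF} \\[6pt]
    \dfrac{f_a(x_a, p_{\perp j}^2)}{f_A(x_A, p_{\perp j}^2)} & \text{IF} \\[10pt]
    \dfrac{f_a(x_a, p_{\perp j}^2)\, f_b(x_b, p_{\perp j}^2)}{f_A(x_A, p_{\perp j}^2)\, f_B(x_B, p_{\perp j}^2)} & \text{II}
  \end{cases} \,. \tag{192}
\end{equation}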

Two things should be noted in eq. (190). First, the colour factor $C_{j/IK}$ is normalized such that
the integral prefactor is always $1/4\pi$ (as opposed to $1/2\pi$), with the specific VINCIA choices being
\begin{align}
  C_{g/q\bar{q}} &= 2C_F = \frac{8}{3} \,, \tag{193}\\
  C_{g/qg} &= \frac{1}{2}\left(2C_F + C_A\right) = \frac{17}{6} \,, \tag{194}\\
  C_{g/gg} &= C_A = 3 \,, \tag{195}\\
  C_{q/Xg} &= 2T_R = 1 \,, \tag{196}
\end{align}


where an interpolation between $2C_F$ and $C_A$,
\begin{equation}
  C_{g/qg} = \frac{(1 - y_{ij})\, 2C_F + (1 - y_{jk})\, C_A}{2 - y_{ij} - y_{jk}} \,, \tag{197}
\end{equation}

is available for qg antennae. Second, the azimuthal integration is made explicit although the
antenna functions have no azimuthal dependence. This is to emphasize a potentially non-trivial
azimuthal dependence of the sector veto Θsct .
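The colour-factor choices of eqs. (193) to (197) are simple enough to verify directly. The short Python sketch below, with hypothetical names, evaluates the fixed values and the interpolated $qg$ colour factor and shows that the interpolation reduces to $C_A$ and $2C_F$ at its two endpoints and to the average of eq. (194) for $y_{ij} = y_{jk}$.

```python
C_F = 4.0 / 3.0   # fundamental Casimir of SU(3)
C_A = 3.0         # adjoint Casimir of SU(3)
T_R = 0.5         # trace normalization

# Fixed colour factors, eqs. (193) to (196).
COLOUR_FACTORS = {
    "g/qq": 2.0 * C_F,
    "g/qg": 0.5 * (2.0 * C_F + C_A),
    "g/gg": C_A,
    "q/Xg": 2.0 * T_R,
}


def colour_factor_qg_interpolated(y_ij, y_jk):
    """Interpolation between 2 C_F and C_A for qg antennae, eq. (197)."""
    return ((1.0 - y_ij) * 2.0 * C_F + (1.0 - y_jk) * C_A) / (2.0 - y_ij - y_jk)


if __name__ == "__main__":
    print(COLOUR_FACTORS)
    print(colour_factor_qg_interpolated(0.2, 0.2))   # average value 17/6
    print(colour_factor_qg_interpolated(1.0, 0.5))   # C_A
    print(colour_factor_qg_interpolated(0.5, 1.0))   # 2 C_F
```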

Matching, merging, and matrix-element corrections A unique property of VINCIA’s sector-based
approach to parton showers is that there is only a very small number of “shower histories”
leading to each distinct parton configuration. For gluon emissions, VINCIA’s sector shower is en-
tirely bijective, i.e. there is only a single unique shower history leading from the Born to any
given Born+n-gluon parton configuration. For g → qq̄ splittings, one has to sum over all pos-
sible same-flavour quark-antiquark pairings, but the number of contributing histories for a given
phase-space point is still drastically reduced relative to conventional, non-sectorized, showers. We
say that the sector shower is “maximally bijective”, and this provides an optimal framework for
high-multiplicity matching and merging, discussed further in section 5.4.

Infrared cutoffs For p⊥ scales below 1 GeV or so, perturbative approximations become increasingly
inaccurate as αs(p⊥) grows towards its divergence at ΛQCD ∼ 200–300 MeV. As in the
simple-shower model, VINCIA’s perturbative shower evolution is therefore halted some distance
above ΛQCD, at which point the parton system is handed to PYTHIA’s string-fragmentation
model for hadronization. In VINCIA, the precise scale at which the shower is stopped can be set
independently for FF, IF, and II antennae.
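
To illustrate why the evolution has to stop above ΛQCD, the sketch below evaluates the textbook one-loop running coupling. This is an illustration only: the one-loop formula, the value ΛQCD = 0.25 GeV, and n_f = 3 are assumptions made here, not the running actually used in VINCIA.

```python
import math


def alpha_s_one_loop(p_perp, lambda_qcd=0.25, n_f=3):
    """One-loop strong coupling at scale p_perp (GeV).

    lambda_qcd and n_f are illustrative defaults, not PYTHIA settings.
    """
    if p_perp <= lambda_qcd:
        raise ValueError("coupling is not defined at or below Lambda_QCD")
    b0 = (33 - 2 * n_f) / (12 * math.pi)
    return 1.0 / (b0 * math.log(p_perp**2 / lambda_qcd**2))


# The coupling grows rapidly as the scale approaches Lambda_QCD.
for scale in (10.0, 1.0, 0.5, 0.3):
    print(f"p_perp = {scale:5.2f} GeV  alpha_s = {alpha_s_one_loop(scale):.3f}")
```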
The shower cutoff for FF antennae in VINCIA is analogous to the FSR cutoff in the simple-
shower model. It can be regarded as the effective factorization scale between the perturbative and
non-perturbative parts of the overall fragmentation description. It therefore has the interpretation
as the scale at which the parameters of the non-perturbative hadronization modelling are defined.
Ideally, the hadronization parameters should “run” with the shower cutoff, but since the relevant
running equations are not known, in practice the hadronization parameters simply have to be
retuned for each new value of the cutoff. In other words, the FF-cutoff value can be considered part
of the fragmentation tuning. In general, one would seek not to leave too much of a gap between the
lowest p⊥ scales generated by shower branchings (down to the cutoff) and the highest p⊥ scales
generated by string breaks (with typical size set by the fragmentation p⊥ width, cf. section 7.1.1).
For II antennae, the cutoff can be regarded as an effective colour-screening resolution scale,
or a lowest scale for which partons inside hadrons can be said to be well represented by plane
waves. This could possibly be tied to the physics of parton saturation, though no explicit such
connection is made here. The practical considerations are similar as for the ISR shower cutoff
in the simple-shower model, striking a balance between p⊥ kicks generated by the shower and
contributions from so-called “primordial k⊥ ”, cf. section 6.3.3.
For IF antennae, the fact that VINCIA’s default recoil strategy is fully local, and does not impart
p⊥ recoil to any partons outside of the 2 → 3 branching itself, leads to some pathologies. In
particular, each IF branching dilutes the primordial k⊥ , and does not add any perturbative p⊥ of
its own, to the hard system. The non-smooth interplay between the II and IF recoil strategies can
make it challenging to describe the soft peak of experimental signals such as the Drell–Yan p⊥
spectrum, and can produce seemingly counter-intuitive scaling with the value of the IF cutoff.


4.2.3 QED showers


The VINCIA shower offers a number of options for the inclusion of electromagnetic and weak
corrections. They all share common features, like the phase-space factorizations and ordering
scale, with the QCD shower, cf. section 4.2.1. In this section, we describe the first (and default)
option, which is a pure QED shower that incorporates a fully coherent multipole treatment of
the simulation of photon radiation off systems of charged fermions, vectors, and scalar particles,
as well as photon splittings to pairs of charged fermions [102, 103]. We also include a simpler
and somewhat faster alternative, in which the full multipole sum is replaced by individual dipole
terms according to a principle of maximal screening, analogous to how QED is handled in the
simple-shower model.
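The basic building block for VINCIA’s treatment of photon radiation is the photon-emission
antenna function for a single pair of final-state charged particles i and k,

\[
\begin{aligned}
A^{\mathrm{sct}}_{\gamma/IK}(s_{IK}; y_{ij}, y_{jk}, \mu_i^2, \mu_k^2) = 2g^2 \frac{Q_I Q_K}{s_{IK}} \Bigg[ \,
& 2\frac{y_{ik}}{y_{ij}\, y_{jk}} - 2\frac{\mu_i^2}{y_{ij}^2} - 2\frac{\mu_k^2}{y_{jk}^2}
+ \delta_{If}\, \frac{y_{ij}}{y_{jk}} + \delta_{Kf}\, \frac{y_{ij}}{y_{jk}} \\
& + \delta_{IW}\, \frac{4}{3}\, y_{ij} \left( \frac{y_{jk}}{y_{IK} - y_{jk}} + \frac{y_{jk}\,(y_{IK} - y_{jk})}{y_{IK}^2} \right) \\
& + \delta_{KW}\, \frac{4}{3}\, y_{jk} \left( \frac{y_{ij}}{y_{IK} - y_{ij}} + \frac{y_{ij}\,(y_{IK} - y_{ij})}{y_{IK}^2} \right) \Bigg] ,
\end{aligned}
\tag{198}
\]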

where the Kronecker deltas ensure the correct collinear terms are incorporated in the cases where
I and K are fermions or W bosons. The factors Q I and Q K represent the relative electromagnetic
charges of I and K, respectively. The II and IF antennae may be found by crossing symmetry
following eqs. (186) and (187). For notational convenience we define

\[ A^{\mathrm{sct}}_{\gamma/IK}(s_{IK}; y_{ij}, y_{jk}, \mu_i^2, \mu_k^2) = g^2\, Q_I Q_K\, \bar{A}^{\mathrm{sct}}_{\gamma/IK}(s_{IK}; y_{ij}, y_{jk}, \mu_i^2, \mu_k^2) , \tag{199} \]

similar to the QCD equivalent eq. (175).


While possibly counterintuitive, the definition of a coherent QED shower using eq. (199)
is not as straightforward as for its (leading-colour) QCD shower counterpart. The reason is the
absence of an equivalent of the leading-colour approximation, which in QCD allows one to discard
the majority of the soft eikonal contributions that are subleading in colour. Conversely, in QED no
eikonal is subleading to any other, and full coherence can only be accomplished by the inclusion of
all of them simultaneously. VINCIA’s most sophisticated photon emission algorithm accomplishes
this by the definition of a single branching kernel

\[ \bar{A}^{\mathrm{sct}}_{\gamma/\mathrm{coh}} = \sum_{\{IK\}} \sigma_I Q_I\, \sigma_K Q_K\, \bar{A}^{\mathrm{sct}}_{\gamma/IK}(s_{IK}; y_{ij}, y_{jk}, \mu_i^2, \mu_k^2) , \tag{200} \]

where {I K} runs over all pairs of charged particles, and σ I and σK are sign factors that have σ I = 1
for final-state particles and σ I = −1 for initial-state particles. This branching kernel includes all
soft multipole terms, as well as the correct collinear limits [102], but its singular structure is highly


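complex. The coherent algorithm is able to sample it by sectorizing the phase space according to

\[
\begin{aligned}
\frac{dP_{j,\mathrm{coh}}}{dp_{\perp j}^2} = \sum_{\{ik\}} \int_{\zeta_{\min}(p_{\perp j}^2)}^{\zeta_{\max}(p_{\perp j}^2)} \int_0^{2\pi}
& \frac{\alpha_{\mathrm{em}}(p_{\perp j}^2)}{4\pi}\, \bar{A}^{\mathrm{sct}}_{\gamma/\mathrm{coh}}(p_{\perp j}^2, \zeta) \\
& \times R_{\mathrm{PDF}}\, F_\Phi\, J(p_{\perp j}^2, \zeta)\, \Theta^{\mathrm{sct}}_{ik}(p_{\perp j}^2, \zeta, \phi)\, d\zeta\, \frac{d\phi}{2\pi} ,
\end{aligned}
\tag{201}
\]

where Θsct_ik(p²⊥j, ζ, φ) is given by eq. (173), but with a sum over charged-particle pairs, and the
sector resolution is the same as that of a gluon emission as given by eq. (174). This procedure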
ensures the soft and collinear singularities are correctly regularized by the transverse momenta of
the photon with respect to all pairs of charged particles. The coherent emission algorithm is the
default choice, but in some specific high-multiplicity cases it may be slow due to the large number
of sectors that need to be sampled.
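As a backup, a faster, unsectorized alternative is implemented that rephrases the photon emission
probability as

\[
\frac{dP_{\mathrm{pair}}}{dp_{\perp j}^2} = \sum_{[IK]} \int_{\zeta_{\min}(p_{\perp j}^2)}^{\zeta_{\max}(p_{\perp j}^2)} \int_0^{2\pi}
\frac{\alpha_{\mathrm{em}}(p_{\perp j}^2)}{4\pi}\, Q^2_{[IK]}\, \bar{A}^{\mathrm{sct}}_{\gamma/IK}(p_{\perp j}^2, \zeta)\, R_{\mathrm{PDF}}\, F_\Phi\, J(p_{\perp j}^2, \zeta)\, d\zeta\, \frac{d\phi}{2\pi} ,
\tag{202}
\]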
where [IK] now runs over all pairings of charged particles with equal and opposite charge
Q[IK]. The factors σI and σK have been absorbed into the definition of Q[IK], meaning that a
final-state charged particle may be paired with a same-sign initial-state particle. That is, every
charged particle now only appears once, and pairings are constructed to minimize the sum of
dipole-antenna invariant masses as per the principle of maximum screening [103]. The task of
pairing particles under such a constraint in O(n³) time complexity is accomplished using the
Hungarian algorithm [128–130]. While this algorithm is generally faster, it only approximates the
complete multipole structure. Furthermore, it may not always be possible to pair up all charges.
For instance, in a W⁺ → ud̄ decay, no pairings are possible at all. In such cases, as many charges
as possible are paired up, and the fully coherent algorithm is used for the remainder.
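
The pairing step can be illustrated with a small Python sketch. Note that this is a brute-force O(n!) stand-in, not the Hungarian algorithm used in VINCIA, and is only practical for a handful of charges. The function names are hypothetical, and the inputs are assumed to be already sorted into two equal-length lists of four-momenta (E, px, py, pz) carrying effective charges +Q and −Q.

```python
import itertools
import math


def inv_mass(p, q):
    """Invariant mass of the sum of two four-momenta (E, px, py, pz)."""
    e, x, y, z = (a + b for a, b in zip(p, q))
    return math.sqrt(max(e * e - x * x - y * y - z * z, 0.0))


def pair_charges(plus, minus):
    """Pair every positive charge with one negative charge.

    Returns the list of index pairs (i_plus, i_minus) that minimizes
    the summed pair invariant mass, together with that minimal sum.
    """
    if len(plus) != len(minus):
        raise ValueError("need equal numbers of opposite charges")
    best_cost, best_pairs = math.inf, None
    for perm in itertools.permutations(range(len(minus))):
        cost = sum(inv_mass(plus[i], minus[j]) for i, j in enumerate(perm))
        if cost < best_cost:
            best_cost, best_pairs = cost, list(enumerate(perm))
    return best_pairs, best_cost


# Two nearly collinear pairs in opposite hemispheres.
plus = [(1.0, 0.0, 0.0, 1.0), (1.0, 0.0, 0.0, -1.0)]
minus = [(1.0, 0.6, 0.0, 0.8), (1.0, -0.6, 0.0, -0.8)]
pairs, cost = pair_charges(plus, minus)
print(pairs, round(cost, 3))
```

Each charge ends up paired with its nearest opposite-sign neighbour in phase space, which is the "maximal screening" configuration for this toy input.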
The QED shower also includes photon splittings to charged fermions, which use antennae
that are kinematically identical to their gluon-splitting counterparts. Furthermore, while photon
radiation off quarks is cut off at a scale of order the hadronization scale, leptonic photon radiation
continues to much lower scales and has its own cutoff scale. Since the system of leptons is not
necessarily charge conserving by itself, which is a requirement for the above algorithms, the pool
of charges is supplemented with the colour-neutral strings that enter the hadronization stage.
When acting as the recoiler of a lepton, the antenna function is replaced by a dipole function that
only contains the singular limits relevant to the lepton.
QED radiation off charged hadrons and/or in hadron decays is not present in the current
implementation but may be included in future work; see the program’s online manual for updates.

4.2.4 EW showers
As an alternative to the coherent QED shower described above, VINCIA also offers the option to
interleave the QCD shower with a full-fledged EW shower, in which all possible branchings from


the EW sector are incorporated, albeit only in a collinear approximation without any attempt at
incorporating soft-interference effects [23, 106]. For each given application, one must therefore
choose whether weak-shower corrections are more important than QED coherence effects for the
study at hand, the default choice being the coherent-QED one.
When enabled, VINCIA’s EW option includes not only the branchings that are also available
in the simple shower (heavy vector-boson emissions off initial- and final-state fermions) but also
final-state triple vector-boson branchings, Higgs emissions, and decay-like splittings. Like the QED
module, the EW one also shares the common features of the QCD shower, allowing for a sensible
interleaving of the two. However, it is important to be aware that the EW shower relies on the
helicity-dependent evolution described in section 4.2.1, which must therefore also be enabled.
The resulting intermediate states of definite helicity are vital in the EW sector due to its chiral
nature. Helicity-dependent antenna functions are present for all EW branchings, capturing their
associated quasi-collinear limits. Due to the rich physics landscape of the EW sector of the SM
and the many different helicity combinations, there are hundreds of distinct polarized collinear-
splitting kernels. The antenna functions are therefore not included here, but they may be found
in ref. [23]. Note that, since the EW shower does not incorporate soft-interference effects, the
antenna functions are more like dipole functions, only including the single quasi-collinear limit of
the branching particle, while the other just functions as a recoiler.
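
As a usage sketch, the settings below show how the EW shower might be switched on together with the helicity-dependent evolution it requires. The setting names `PartonShowers:model`, `Vincia:ewMode`, and `Vincia:helicityShower`, and the meaning of the value 3, are quoted from memory of the online manual and should be treated as assumptions to be verified there. The helper `apply_settings` is hypothetical.

```python
# Settings in PYTHIA's "key = value" string form.
# Names and values are assumptions; check them against the online manual.
vincia_ew_settings = [
    "PartonShowers:model = 2",     # select the VINCIA shower
    "Vincia:ewMode = 3",           # assumed: 3 selects the full EW shower
    "Vincia:helicityShower = on",  # helicity-dependent evolution
]


def apply_settings(read_string, settings):
    """Pass each setting to a callable and return those it rejected.

    read_string is any callable taking a string and returning a bool,
    for example the readString method of a Pythia object.
    """
    return [s for s in settings if not read_string(s)]
```

With the PYTHIA Python bindings installed, one would call `apply_settings(pythia.readString, vincia_ew_settings)` on a `pythia8.Pythia()` instance before `pythia.init()`; an empty return list means all strings were accepted.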
A number of features unique to the EW sector are incorporated. For example, in a shower
sequence like e− → e− Z/γ → e− W+ W− , the interference between the Z and the γ can be of
O(1) [106,131]. A full treatment of this effect may for instance be accomplished by the evolution
of density matrices, which can quickly become prohibitively expensive. VINCIA instead implements
a simplified approach, in which the emission probability is corrected at first order by an event
weight. This weight is computed from quasi-collinear branching amplitudes evaluated in the
spinor-helicity formalism.
These same amplitudes are also used to determine recoilers for the quasi-collinear branchings
of the EW shower. Unlike in the QCD sector, where recoilers are typically chosen to be a colour-
connected parton, no such mechanism is available in the EW sector. Furthermore, because the
EW shower only models quasi-collinear branchings without soft-interference effects, the choice of
recoiler is formally arbitrary. One can however select the recoiler probabilistically such that the
kinematic effect of recoil on previous branchings is minimized [106].
Another peculiar feature of the EW sector is the fact that branchings like t → bW⁺ and
Z → qq̄ appear both as shower branchings and as resonance decays. For off-shellness scales
Q² = m² − m0² ∼ Γ², the physics is best described by a Breit–Wigner distribution, while for scales
above the electroweak scale Q²EW, the EW shower is most accurate. In the intermediate region a
matching procedure is required. When the EW shower produces a heavy resonance like one of the
EW gauge bosons, a top quark, or a Higgs boson, its mass is sampled from a helicity-dependent
Breit–Wigner distribution (see ref. [23] for details)

\[ \mathrm{BW}(Q^2) \propto \frac{m_0\, \Gamma(m)}{Q^4 + m_0^2\, \Gamma(m)^2} . \tag{203} \]

This procedure mirrors the treatment of resonances that are part of the hard process as described
in section 2.3.3, which can also be branched by the EW shower. The shower is matched to the
Breit–Wigner distribution by applying a suppression factor Q⁴/(Q² + Q²EW)², and the resonance is
decayed when the evolution scale reaches the sampled resonance off-shellness without generating
an EW branching. In that case, if the EW shower produced the resonance, the decay is distributed


according to the appropriate helicity-dependent 1 → 2 matrix element. If instead the resonance


was part of the hard process, the decay has already been generated and is inserted.
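
The interplay of eq. (203) and the shower suppression factor can be sketched as follows. This is a simplified illustration and not the VINCIA implementation: the width is taken as a constant Γ (the manual uses a mass- and helicity-dependent Γ(m)), so that eq. (203) becomes a Lorentzian in Q² that can be sampled by inversion. Function names are hypothetical.

```python
import math
import random


def sample_offshellness(m0, gamma, rng):
    """Sample Q^2 = m^2 - m0^2 from BW ~ m0*gamma / (Q^4 + m0^2 gamma^2).

    Assumes a constant width, so the inverse CDF is a tangent.
    """
    return m0 * gamma * math.tan(math.pi * (rng.random() - 0.5))


def shower_suppression(q2, q2_ew):
    """Factor Q^4 / (Q^2 + Q_EW^2)^2 applied to the EW shower (Q^2 > 0)."""
    return q2**2 / (q2 + q2_ew) ** 2


rng = random.Random(2022)
m0, gamma = 91.19, 2.50  # illustrative Z-like values in GeV
samples = [sample_offshellness(m0, gamma, rng) for _ in range(20000)]
inside = sum(abs(q2) < m0 * gamma for q2 in samples) / len(samples)
print(f"fraction with |Q^2| < m0*Gamma: {inside:.3f}")
```

For a Lorentzian, half of the probability lies within one half-width of the peak, and the suppression factor switches the shower off for Q² far below Q²EW while leaving it untouched far above.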
Finally, double counting issues appear with the inclusion of EW branchings in the shower.
For instance, the state pp → VVj may be reached by starting from pp → VV and performing an
initial-state QCD emission, or from pp → Vj and performing an EW emission. To avoid double
counting such phase-space points, VINCIA implements an overlap veto procedure that can be used
when overlapping matrix elements are enabled. It is based on the kT jet algorithm [132] distance
measures, generalized to account for the massive states that appear in the EW sector, given by

\[
\begin{aligned}
d_{iB} &= k_{T,i}^2 , \\
d_{ij} &= \min\!\left(k_{T,i}^2, k_{T,j}^2\right) \frac{\Delta_{ij}}{R} + \left| m_i^2 + m_j^2 - m_I^2 \right| .
\end{aligned}
\tag{204}
\]
The distance between the beam and final-state particle i is measured by diB , while di j measures the
distance between two final-state particles i and j. If, for example, a gluon is emitted by the QCD
shower, the distances with respect to its colour-connected partons are computed. Furthermore,
the distances of all possible 2 → 1 EW clusterings of the state after the gluon emission are also
evaluated. If one of these distances is smaller than the QCD ones, then the current phase-space
point should be populated by the EW shower rather than the QCD shower, and the gluon emis-
sion is vetoed. This procedure ensures no double counting occurs and the QCD and EW showers
populate the regions of phase space they are most accurate in.
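
The veto logic described above can be sketched in a few lines of Python. The distance functions follow eq. (204) literally, with the angular separation Δij and the parameter R supplied by the caller, since their precise definitions are not given here. The function names and the interpretation of the veto condition (some EW clustering distance lies below every QCD distance) are assumptions of this sketch.

```python
def beam_distance(kt2_i):
    """d_iB of eq. (204)."""
    return kt2_i


def pair_distance(kt2_i, kt2_j, delta_ij, radius, m2_i, m2_j, m2_parent):
    """d_ij of eq. (204), including the mass term for massive states."""
    return min(kt2_i, kt2_j) * delta_ij / radius + abs(m2_i + m2_j - m2_parent)


def veto_qcd_emission(qcd_distances, ew_distances):
    """Return True if the gluon emission should be vetoed.

    qcd_distances: distances of the gluon to its colour-connected partons.
    ew_distances:  distances of all possible EW 2 -> 1 clusterings.
    """
    if not ew_distances:
        return False
    return min(ew_distances) < min(qcd_distances)


# Toy numbers: a hard gluon emitted next to a soft, near-on-shell EW pair.
qcd = [pair_distance(400.0, 900.0, 0.8, 1.0, 0.0, 0.0, 0.0)]
ew = [pair_distance(25.0, 900.0, 0.5, 1.0, 0.0, 0.0, 0.0)]
print(veto_qcd_emission(qcd, ew))
```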

4.3 The DIRE shower


The DIRE parton shower, introduced in ref. [118], offers another alternative showering model.
It aims to combine aspects traditionally associated with 2 → 3 dipole (antenna) showers with
features of “conventional” 1 → 2 parton showers. The goal of this hybrid is to inherit the modelling
of soft-emission effects from dipole showers, while keeping an explicit association of splittings with
specific collinear directions. This should, in principle, allow for an uncomplicated comparison to
the ingredients in QCD factorization theorems. The physics aspects of DIRE have been developed
in a series of articles [118, 133–138], and we refer the reader to these publications for details.
Below, we will summarize the most important choices, virtues and current limitations.

4.3.1 Phase-space coverage and ordering


DIRE employs an exact factorization of the single- and double-emission phase spaces. The single-
emission phase space is adapted from refs. [112, 139], and allows for any combination of masses
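in 2 → 3 branchings. The construction of post-emission momenta through the mapping M(1) can
be sketched by

\[
M^{(1)} \left( \begin{array}{c} \text{Radiator } \widetilde{i1} \\ \text{Recoiler } \tilde{k} \end{array} \;\oplus\; t^{(1)}, z^{(1)}, \phi^{(1)} \right)
= \begin{array}{c} \text{Radiator } i \\ \text{Emission } 1 \\ \text{Recoiler } k \end{array} ,
\tag{205}
\]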

where t (1) is the evolution variable, z (1) a momentum-sharing variable, and φ (1) an azimuthal
angle. Note that under the mapping M(1) , the direction of the recoiler is not affected by the
branching. Only its longitudinal momentum components change. This deliberate choice ensures


that the collinear direction defined by the recoiler, and consequently its mapping onto factorization
theorems, remains intact. A caveat to this approach — related to initial-state emissions — is
discussed below.
The double-emission phase space — relevant for NLO parton evolution [133] — may similarly
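be illustrated by a map M(2)

\[
M^{(2)} \left( \begin{array}{c} \text{Radiator } \widetilde{i12} \\ \text{Recoiler } \tilde{k} \end{array} \;\oplus\; t^{(12)}, z^{(12)}, \phi^{(12)}, s_{12}, x, \phi' \right)
= \begin{array}{c} \text{Radiator } i \\ \text{Emission } 1 \\ \text{Emission } 2 \\ \text{Recoiler } k \end{array} .
\tag{206}
\]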

Here, t(12) is the evolution variable assigned to the emission of the system (12), z(12) (φ(12)) a
momentum-sharing (azimuthal angle) variable, and s12 the virtuality of the system (12), while
x and φ′ are related to the momentum sharing (azimuthal angle) between emissions 1 and 2.
Again, the direction of the “recoiler” is preserved.
The momentum mappings M(1) and M(2) are (re)arranged to ensure that the phase-space cov-
erage is fully symmetric between radiation from the “radiator” or the “recoiler”, i.e. given a fixed
post-branching phase-space point and fixed branching variables, an identical pre-branching phase-
space point is produced, independent of assigning the emissions to the radiator or recoiler. How-
ever, it should be noted that the momentum-sharing variables z are not symmetric under exchange
of the “emission” for one of the other involved particles, since the limit z → 1 is associated with a
soft emission. DIRE separates the generation of post-branching momenta into four distinct cases:

FF, i.e. emission from a final-state particle, using a final-state recoiler:


This case has the fewest kinematic constraints, but the richest set of combinations of possible
masses. Thus, the mapping is constructed to ensure that regions of phase space in which
mass-corrected transition rates would lead to negative contributions are outside the physical
phase-space boundaries.

FI, i.e. emission from a final-state particle, using an initial-state recoiler:


This case again has few kinematic constraints, after the choice of keeping the recoiler di-
rection intact. DIRE will, if not instructed otherwise, treat incoming particles as massless
for the purpose of phase-space generation. This means that in this configuration, negative
transition rates due to mass corrections may occur. This is handled by a weighted shower
algorithm.

IF, i.e. emission from an initial-state particle, using a final-state recoiler:


This case has several kinematic constraints that need to be considered. In fact, the system
is over-constrained if both the initial-state particle and the final-state recoiler should retain
their directions. In this case, the transverse momentum generated in the branching can be
balanced by extending the set of particles that may change their momentum. DIRE offers
the possibility to employ a “global”-recoil strategy, in which the transverse momentum of the
splitting is balanced by all final-state particles within the decaying system2 . It is also possible
to instruct DIRE to relax the condition that the final-state recoiler retains its direction. In
this “local” strategy, the system of particles that change their momentum does not need to
be extended.
2
Here, “decaying system” refers to the particle content of a single 2 → n scattering, in case several such scatterings
exist due to the inclusion of multiparton interactions.


II, i.e. emission from an initial-state particle, using an initial-state recoiler:


This case also has several kinematic constraints and is over-constrained, since both initial-
state particles should retain their directions. Here, no attempt is made to construct a “local”-
recoil strategy. Instead, the transverse momentum of the branching is collectively balanced
by all final-state particles within the decaying system.

In general, the construction of post-branching momenta is subject to many choices. The choices
above have foremost been guided by providing a simple procedure and Jacobian factors, such that
analytic integrations of the emission patterns are as straight-forward as possible. This helps when
improving the evolution with next-to-leading order corrections.
The evolution variables in DIRE are chosen to lead to a symmetric phase-space sampling and
simple phase-space boundaries. Soft transverse momenta fulfil these criteria, if defined by

\[ t^{(a)} \propto \frac{(p_i p_a)(p_k p_a)}{Q^2} \propto p_a^+ p_a^- , \tag{207} \]

where Q2 is a maximal scale, pa may be a sum of one or two emission momenta, and pi and pk are
the radiator and recoiler post-branching momenta. Thus, in all of the four cases above, and for
both single-emission and double-emission contributions, DIRE employs soft transverse momentum
as ordering variable, see refs. [118, 133] for details.
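
The proportionality in eq. (207) can be made concrete with a short numerical sketch. Since eq. (207) fixes the evolution variable only up to a constant, the prefactor is set to one here, and the choice Q² = 2 pi·pk in the example is an assumption made for illustration. Function names are hypothetical.

```python
def mdot(p, q):
    """Minkowski product of four-momenta (E, px, py, pz)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def soft_pt2(p_i, p_k, p_a, q2):
    """(p_i.p_a)(p_k.p_a) / Q^2, i.e. eq. (207) with unit prefactor."""
    return mdot(p_i, p_a) * mdot(p_k, p_a) / q2


# Back-to-back massless radiator and recoiler along the z axis.
p_i = (10.0, 0.0, 0.0, 10.0)
p_k = (10.0, 0.0, 0.0, -10.0)
# Massless emission with transverse momentum 3 and p_z = 4.
p_a = (5.0, 3.0, 0.0, 4.0)

t = soft_pt2(p_i, p_k, p_a, 2.0 * mdot(p_i, p_k))
print(t)  # equals p_a^+ p_a^- / 4 = pT^2 / 4 for this configuration
```

In this frame (pi·pa)(pk·pa) is proportional to the product of light-cone components of the emission, which for a massless emission is its squared transverse momentum.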

4.3.2 Transition rates


The DIRE parton shower aims to model configurations containing soft particles or collinear con-
figurations with high fidelity. As in a traditional parton shower, separate transition rates are used
in each collinear direction. It might be helpful to explain this choice with an example. Imagine a
dipole stretched between a quark and a gluon. The primary contributions to radiation collinear
to the quark should be proportional to the colour factor CF , while the radiation pattern collinear
to the gluon should, up to small corrections, be proportional to CA . Similarly, higher-order cor-
rections to the radiation pattern in either region differ.
However, simultaneously radiating from both dipole “ends” with the full rate expected in the
collinear limit (given by the DGLAP splitting functions) will naively lead to an incorrect pattern
in the soft limit. This problem is circumvented by replacing collinear-soft parts of DGLAP kernels
by an improved description. The latter may be obtained by distributing the correct soft radiation
pattern among all coherently radiating particles,

\[
\frac{(p_i p_k)}{(p_i p_a)(p_k p_a)} =
\underbrace{\frac{1}{(p_i p_a)}}_{\substack{\text{combine with Jacobian} \\ \to\; 1/t}}
\;\underbrace{\frac{(p_i p_k)}{(p_i p_a) + (p_k p_a)}}_{\substack{\text{rewrite in } t \text{ and } z \\ \to\; \frac{2(1-z)}{(1-z)^2 + t/Q^2}}}
+ \,(i \leftrightarrow k) ,
\]

where the second factor reduces to 2/(1 − z) in the collinear limit t → 0.
The resulting soft-collinear pattern is supplemented with hard-collinear terms [88]. As an extension
of ref. [88], the 1/z-terms present in DGLAP kernels are also shifted, 1/z → z/(z² + t/Q²), to ensure

that sum rules for the splitting kernels are maintained. Finally, mass-dependent corrections based
on ref. [112] are added. The exact splitting kernels used in DIRE are listed in ref. [118]. The above
chain of reasoning is used for all branching types in DIRE.
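
The partial-fractioning step is an exact algebraic identity, which the following Python sketch verifies numerically. It also evaluates the regularized soft term and its collinear limit. The function names are hypothetical; the dot products are passed in as plain positive numbers.

```python
import math


def eikonal(pi_pa, pk_pa, pi_pk):
    """Full soft (eikonal) factor (p_i p_k) / ((p_i p_a)(p_k p_a))."""
    return pi_pk / (pi_pa * pk_pa)


def eikonal_share(p_rad_pa, p_rec_pa, pi_pk):
    """Part of the eikonal assigned to the radiator with product p_rad_pa."""
    return (1.0 / p_rad_pa) * pi_pk / (p_rad_pa + p_rec_pa)


def soft_term(z, t, q2):
    """Regularized soft term 2(1-z) / ((1-z)^2 + t/Q^2)."""
    return 2.0 * (1.0 - z) / ((1.0 - z) ** 2 + t / q2)


pi_pa, pk_pa, pi_pk = 0.37, 2.4, 11.0
total = eikonal_share(pi_pa, pk_pa, pi_pk) + eikonal_share(pk_pa, pi_pa, pi_pk)
print(math.isclose(total, eikonal(pi_pa, pk_pa, pi_pk)))
```

The share assigned to radiator i is singular only when the emission is collinear to i, which is what allows each term to be associated with a single collinear direction.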


QCD The above reasoning directly applies to QCD branchings at leading order, i.e. when in-
creasing the multiplicity by one particle, and while not including explicit virtual corrections to
single-parton emission. At leading order:

• Dipoles are formed from radiator-recoiler pairs connected by a colour flow in the Nc → ∞
limit. At the point of compiling this manual, the fixed-colour corrections discussed in ref. [138]
have not been included in the core PYTHIA code.

• Colour factors due to colour-charge correlators in the Nc → ∞ limit are given by:

1. gluon radiation off (anti)quarks ∝ CF,
2. gluon radiation off gluons ∝ CA/#(possible recoilers) = CA/2,
3. and gluon branching to quark pairs ∝ TR.

• Coupling-factors αs for all QCD splittings are evaluated with dynamic arguments, with the
preferred scheme being αs (t). However, it should be noted that the emergence of the run-
ning coupling is driven by soft-gluon emissions, and thus, it is a priori not obvious if the
evaluation αs (t) extends also to hard-collinear configurations. Thus, the user may instruct
PYTHIA to use different arguments to evaluate αs : the running coupling may be evaluated
using the “collinear transverse momenta” k²⊥ defined as evolution variables in ref. [115], i.e.
αs(k²⊥), or it may be evaluated using the strict definition of the (inverse) eikonal term, i.e.
αs((pi pa)(pk pa)/(pi pk)).

The usage of a running coupling effectively includes “universal” virtual corrections to the emission
rates. For inclusive soft-gluon emission, it is possible to include further next-to-leading order
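corrections rescaling the soft-gluon emission rate:

\[
\frac{\alpha_s}{2\pi} \left[ \frac{2(1 - z)}{(1 - z)^2 + t/Q^2} + (\text{hard-collinear terms}) \right]
\to \frac{\alpha_s}{2\pi} \left( 1 + \frac{\alpha_s}{2\pi} K \right) \frac{2(1 - z)}{(1 - z)^2 + t/Q^2} + \frac{\alpha_s}{2\pi} \cdot (\text{hard-collinear terms}) .
\tag{208}
\]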

This may be considered as a conservative implementation of the conventional CMW (or MC)
scheme [86]. Note that different strategies for the evaluation of running couplings will induce
different higher-order corrections. Without better modelling, none of these ad hoc choices are
completely satisfactory.
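
As a numerical illustration of eq. (208), the sketch below evaluates the rescaling factor for the soft term. The constant K is not defined in this section; the expression used here, K = CA(67/18 − π²/6) − (10/9) TR nf, is the standard CMW constant and is an assumption about what eq. (208) refers to. Function names are hypothetical.

```python
import math

C_A = 3.0
T_R = 0.5


def cmw_k(n_f):
    """Standard CMW constant (assumed to be the K of eq. (208))."""
    return C_A * (67.0 / 18.0 - math.pi**2 / 6.0) - (10.0 / 9.0) * T_R * n_f


def rescaled_soft_coupling(alpha_s, n_f):
    """Effective soft-gluon coupling alpha_s * (1 + K alpha_s / 2 pi)."""
    return alpha_s * (1.0 + cmw_k(n_f) * alpha_s / (2.0 * math.pi))


print(f"K(n_f=5) = {cmw_k(5):.3f}")
print(f"alpha_s = 0.118 -> {rescaled_soft_coupling(0.118, 5):.4f}")
```

Only the soft term is rescaled in eq. (208); the hard-collinear terms keep the unmodified coupling.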
Higher-order corrections to QCD evolution have been known for a long time. DIRE implements
several aspects of QCD evolution at next-to-leading order:

• Inclusive branching rates can be augmented with hard-collinear corrections by employing
NLO DGLAP splitting functions. These improvements are available for both initial-state
and final-state branchings. The benefit of such corrections is mainly in a more consistent
treatment of PDF evolution in backwards initial-state evolution, since the latter relies on
the parton shower distributing emissions according to the rates used to evolve externally
pre-tabulated PDFs from low to high scales.

• Correlated triple-collinear emissions, i.e. branchings of the form 1 ⊕ 1 → 3 ⊕ 1, have been
included to yield NLO DGLAP (initial- or final-state) evolution in the collinear limit from
fully differential double-emission matrix elements.


• Correlated double-soft emissions and explicit real-virtual corrections, i.e. branchings of the
form 2 → 4 and 2 → 3 at one loop, can be employed for final-state branchings. The main effect
of including such NLO corrections is a reduction of the renormalization-scale uncertainty of
the parton shower. At the time of writing, the consistent combination of triple-
collinear and double-soft NLO corrections outlined in ref. [140] has not been included in a
public PYTHIA release.

QED The description of QED in DIRE [136, 138] follows a very similar structure to that of QCD
branchings, and is inspired by ref. [141].

• Dipoles are formed from all electrically charged radiator-recoiler pairs, much like the fixed-
colour QCD dipole assignments discussed in refs. [69, 138].

• Charge factors due to electric-charge correlators are determined from

\[
Q^2 = -\frac{\eta_{\widetilde{i1}}\, \eta_{\tilde{k}}\, Q_{\widetilde{i1}}\, Q_{\tilde{k}}}{Q^2_{\widetilde{i1}}} \quad (\text{photon emission}) , \qquad
Q^2 = \frac{1}{\#\text{recoilers}} \quad (\text{photon splitting}) ,
\tag{209}
\]

where Q_ĩ1 and Q_k̃ are the charges of the radiator and recoiler, respectively, and ηi = +1 (−1)
if i is a final-state (initial-state) particle. These correlators multiply the splitting functions in
place of the QCD colour factors, and may readily lead to negative contributions to the
transition rates. Thus, a weighted shower algorithm is crucial for the QED modelling in DIRE.

• Coupling factors αem for all QED splittings are evaluated in the Thomson limit, i.e. no
running QED coupling is employed in the shower.
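
The sign structure of the charge correlators in eq. (209) is easy to explore numerically. The following Python sketch uses exact fractions for the charges; the function names and the boolean flags marking final-state particles are hypothetical conveniences.

```python
from fractions import Fraction


def photon_emission_factor(q_rad, q_rec, rad_is_final, rec_is_final):
    """Charge correlator of eq. (209) for photon emission."""
    eta_rad = 1 if rad_is_final else -1
    eta_rec = 1 if rec_is_final else -1
    return -Fraction(eta_rad * eta_rec) * q_rad * q_rec / (q_rad * q_rad)


def photon_splitting_factor(n_recoilers):
    """Charge correlator of eq. (209) for photon splitting."""
    return Fraction(1, n_recoilers)


electron, positron = Fraction(-1), Fraction(1)
up, anti_down = Fraction(2, 3), Fraction(1, 3)

# Opposite-sign final-state pair: positive contribution.
print(photon_emission_factor(electron, positron, True, True))
# Same-sign final-state pair, e.g. from W+ -> u dbar: negative contribution.
print(photon_emission_factor(up, anti_down, True, True))
```

The second case shows how a same-sign dipole produces a negative transition rate, which is why a weighted shower algorithm is needed.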

Kinetically mixed dark photons DIRE further implements kinetically mixed dark photon in-
teractions, featuring dark photon emission from and decay into standard-model particles. These
transitions are handled analogously to QED interactions, except that the dark photon may be mas-
sive. The decay width of the dark photon is currently ignored for both dark photon emission and
decay.

Electroweak effects Finally, DIRE allows for electroweak-boson radiation and fermionic weak-
boson decay, using a simplistic model similar to the ideas of refs. [83,142]. Electroweak effects are
mainly included because of the necessity for consistent matrix-element merging at LHC energies:
to avoid an overly QCD-evolution biased scale setting for vector-boson plus jets configurations
that exhibit giant K-factors [143], the inclusion of parton-shower histories containing electroweak
clusterings is mandatory for some showers [144, 145]. Thus, DIRE implements electroweak
evolution as follows:

• Transition rates are determined from partial-fractioned massive dipole kernels [112].

• Dipoles are formed from all pairs of particles that may emit the same electroweak vector
bosons. Electroweak-boson decays employ the same recoiler selection as vector-boson radi-
ation, much like the QED case [136, 138].


• Coupling factors are calculated under the assumption of chirality-summed evolution, cf. ref. [142].
The coupling value is kept fixed, i.e. no running coupling is employed.

This overly simplified model may appropriately handle electroweak history effects in the context of
multi-jet merging, especially for W± -boson plus multi-jet configurations at the LHC [137]. Beyond
this, studies of weak-boson radiation with DIRE are discouraged.

4.3.3 Weight handling aspects


The transition rates outlined in section 4.3.2 may be relatively complex. This results in some
technical requirements that need to be met to produce a sound simulation since:

• it may not be possible to find efficient overestimates of complex transition kernels, such as
for correlated double emission;

• and it may not be possible to guarantee positivity, e.g. due to mass effects, electric charge
correlators, or higher-order corrections.

Both of these points (as well as the automated renormalization scale variations available in DIRE)
may be addressed with the help of a weighted veto algorithm, which was discussed in section 2.2.3.
DIRE employs this method more heavily than the rest of PYTHIA. Thus, the relevant features and
extensions beyond the literature will be discussed here.
The core realization of weighted shower algorithms is that acceptance rates in the veto algo-
rithm may be factored into a contribution that is “unweighted” via the veto algorithm, and an
event-by-event “weight” factor that encapsulates the effect of sign changes or underestimations
of the transition rate. To preserve inclusive cross sections, this is naturally complemented by
event-by-event weights that augment the rejection rate of the parton shower. Once acceptance
and rejection weights have been introduced, it also becomes possible to only partially unweight
the use of overestimates through the veto algorithm, and correct for the partial unweighting by
amending the event weight. This allows for an enhancement of certain transitions beyond their
natural rate, leading to an improved statistical error, at the expense of a larger weight variance.
Finally, the veto algorithm may be implemented in a series of distinct accept-reject steps. Each
such step can be upgraded to incorporate complex rates3 . At present, DIRE employs a weighted
shower when choosing a branching according to splitting kernels, and another, stacked weighted
shower to incorporate matrix-element corrections. A weighted approach to the latter is necessary
as matrix-element corrections may induce sign changes, or because matrix elements are underes-
timated by the (sum of all possible) splitting kernels.
The weighted shower algorithm in DIRE rests on the realization that the rate of producing one
transition in the shower after n rejections (through the veto algorithm) is given by

\[
P(t) = g(t) \exp\!\left( -\int_t^{t_n} d\bar{t}\, g(\bar{t}) \right) \frac{f(t)}{g(t)}
\prod_{i=1}^{n} \left[ \frac{g(t_i) - f(t_i)}{g(t_i)}\, g(t_i) \exp\!\left( -\int_{t_i}^{t_{i-1}} d\bar{t}\, g(\bar{t}) \right) \right] .
\tag{210}
\]

This equation is the formal requirement for the validity of the veto algorithm, and does not strictly
constrain the transition rates f (t) by the “overestimate” g(t) through 0 < f (t) < g(t). The nu-
merical implementation of the equation does, however, require sensible acceptance probabilities.

³ For example, the algorithm of ref. [138] relies on a three-step (un)weighting.

This may be achieved by introducing an auxiliary function h(t) that guarantees acceptance prob-
abilities 0 < f (t)/h(t) < 1, see e.g. also the description at the end of section 2.2.3. Rewriting in
terms of this auxiliary function leads to
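$$
\begin{aligned}
\mathcal{P}(t) = {} & \frac{f(t)}{h(t)}\, g(t) \exp\left( -\int_{t}^{t_n} \mathrm{d}\bar{t}\, g(\bar{t}) \right) \prod_{i=1}^{n} \left[ \left( 1 - \frac{f(t_i)}{h(t_i)} \right) g(t_i) \exp\left( -\int_{t_i}^{t_0} \mathrm{d}\bar{t}\, g(\bar{t}) \right) \right] \\
& \otimes \frac{h(t)}{g(t)} \prod_{i=1}^{n} \frac{h(t_i)}{g(t_i)}\, \frac{g(t_i) - f(t_i)}{h(t_i) - f(t_i)} .
\end{aligned} \tag{211}
$$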

The second line may be interpreted as a corrective factor due to disconnecting the sampling and
rejection distributions. It does not have to be bounded, and is implemented as an event weight.
It is important to note that the acceptance rates f (t)/h(t) only need to be bounded point-wise in
t, i.e. they may be adjusted depending on the value of f (t). In particular, DIRE uses
$$
h(t) = \begin{cases} \mathrm{sign}[f(t)]\, g(t) & \text{if } g(t) > |f(t)| \\ k\, f(t) & \text{if } g(t) < |f(t)| \quad (\text{with } k \sim 1.1) \end{cases} . \tag{212}
$$
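
The weighted accept-reject step can be illustrated with a small numerical toy. The following is a minimal sketch, not DIRE's implementation: the constant sampling rate `g`, the toy rate `f`, the scale range and all function names are assumptions chosen for illustration. It evolves downwards in t, picks h(t) point-wise as in eq. (212), and accumulates the accept and reject factors from the second line of eq. (211) into an event weight.

```python
import math
import random


def weighted_veto_step(f, g, t_start, t_cut, rng, k=1.1):
    """Evolve downwards from t_start with a constant trial rate g (toy assumption).

    Returns (t_emission or None, event weight).
    """
    t, weight = t_start, 1.0
    while True:
        # Next trial scale: the no-trial probability is exp(-g * (t_old - t_new)).
        t += math.log(1.0 - rng.random()) / g
        if t <= t_cut:
            return None, weight
        ft = f(t)
        # Auxiliary function h(t), chosen point-wise as in eq. (212).
        h = math.copysign(g, ft) if g > abs(ft) else k * ft
        if rng.random() < ft / h:
            return t, weight * h / g              # accept factor h/g
        weight *= (h / g) * (g - ft) / (h - ft)   # reject factor


def weighted_no_emission_probability(n_events=100_000, seed=1):
    """Weighted fraction of events without emission for a toy rate f(t)."""
    rng = random.Random(seed)

    def f(t):
        # Toy transition rate; it exceeds g = 1 for t > 0.625.
        return 0.5 + 0.8 * t

    total = 0.0
    for _ in range(n_events):
        t_emit, w = weighted_veto_step(f, 1.0, 1.0, 0.0, rng)
        if t_emit is None:
            total += w
    return total / n_events
```

With these toy choices the integral of f over the evolution range is 0.9, so the weighted no-emission fraction should approach exp(-0.9) within statistical fluctuations, even though f exceeds the sampling rate g in part of the range.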

An artificially enhanced sampling may be achieved by shifting $g \to g' = C g$, while keeping all
rejection steps (i.e. the definition of $h(t)$) fixed to their original values. The compensating event
weight will then be shifted to
$$
\frac{1}{C}\, \frac{h(t)}{g(t)} \prod_{i=1}^{n} \frac{h(t_i)}{g(t_i)}\, \frac{g(t_i) - f(t_i)/C}{h(t_i) - f(t_i)} . \tag{213}
$$
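
As a brief consistency check of eq. (213), sketched here with the arguments of all functions suppressed: trial emissions are proposed at the enhanced rate $Cg$, accepted with probability $f/h$ and rejected with probability $1 - f/h$. Multiplying each branch by the corresponding weight factor of eq. (213) gives

$$
C g \cdot \frac{f}{h} \cdot \frac{1}{C}\, \frac{h}{g} = f ,
\qquad
C g \cdot \left( 1 - \frac{f}{h} \right) \cdot \frac{h}{g}\, \frac{g - f/C}{h - f} = C g - f ,
$$

which are the accept and reject rates of an ordinary veto algorithm with overestimate $Cg$ and target rate $f$. The enhancement therefore changes only how often a transition is sampled, not the weighted distribution.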

Once the weighted shower is in place, parton-shower variations may be included by keeping track
of multiple weights of the form
$$
w^{[k]} = \frac{1}{C}\, \frac{h(t)}{g(t)} \prod_{i=1}^{n} \frac{h(t_i)}{g(t_i)}\, \frac{g(t_i) - f^{[k]}(t_i)/C}{h(t_i) - f(t_i)} , \tag{214}
$$

where $f^{[k]}(t_i)$ is the value of the varied transition kernel. DIRE allows for renormalization-scale
variations in the argument of running-coupling evaluations, as well as variations of parton distribution
functions.
Finally, DIRE stacks weighted-shower steps, especially to allow the incorporation of matrix-
element corrections. The possibility for stacking relies on two realizations: weighted-shower in-
duced event weights are multiplicative, and after applying the event weight of previous steps,
the shower rate will be correctly determined from the full splitting rate. Thus, for subsequent
weighted shower steps, the full splitting kernel that would be obtained after applying the weight
is the new sampling rate — or “overestimate” — for the next, stacked, weighted-shower step.
DIRE currently stacks two weighted-shower steps. Ignoring, for the sake of a simple presentation,
splitting enhancements and variations, the first weighted shower (used to exponentiate
complicated splitting kernels) yields a weight
$$
w_1 = \frac{h_1(t)}{g_1(t)} \prod_{i=1}^{n} \frac{h_1(t_i)}{g_1(t_i)}\, \frac{g_1(t_i) - f_1(t_i)}{h_1(t_i) - f_1(t_i)} , \tag{215}
$$


while the stacked weighted shower (used within the context of matrix element corrections) further
induces the weight
$$
w_2 = \frac{h_2(t')}{f_1(t')} \prod_{i=1}^{m} \frac{h_2(t'_i)}{f_1(t'_i)}\, \frac{f_1(t'_i) - f_2(t'_i)}{h_2(t'_i) - f_2(t'_i)} , \tag{216}
$$

with

$$
f_2(t'_i) = f_1(t'_i) \otimes \text{ME-correction} \quad \text{and} \quad h_2(t) = \begin{cases} \mathrm{sign}[f_2(t)]\, f_1(t) & \text{if } |f_1(t)| > |f_2(t)| \\ k_2\, f_2(t) & \text{if } |f_1(t)| < |f_2(t)| \end{cases} , \tag{217}
$$

where $k_2 \sim 1.5$. Matrix-element corrections are applied only once a viable splitting has been
selected and the corresponding phase-space point generated. Thus, the set of all $t'$ is different
from, and smaller than, the set of all $t$. Currently, DIRE does not implement enhancements or variations
in the stacked weighted-shower step, since there does not seem to be a strong need for such
complications. Variations may, in the future, be used to embed uncertainties due to the underlying
Lagrangian entering the matrix elements.
Note that only the product of all event weights is required. Thus, the stacked algorithm is
identical to the original weighted-shower algorithm from an outside perspective.

5 Matching and merging


Matching and merging methods aim to augment the event generator with (multiple) calculations
performed within fixed-order perturbation theory. This is rather straightforward for individual
(and simple) hard-scattering calculations, which may be treated as the “hard process” from which
further event generation steps start. When the fixed-order calculation includes virtual and/or real
corrections, a consistent treatment quickly becomes more complex, such that dedicated schemes
of combining external calculations with the event generator need to be developed.
Naive parton showers aim to reproduce the effect of many collinear or soft emissions, and thus
require improvements when describing observables that depend on well-separated hard particles.
Fixed-order perturbative calculations furnish, on the other hand, an appropriate description of
events with a handful of well-separated particles, but may fail in the collinear and soft limits. At
high-energy colliders, observables typically exhibit effects of both approximations. On top of that,
both the bulk cross sections (of low jet multiplicity) and tails (depending on the correct rate of high
jet multiplicities) are often equally important. Methods to perform a matching or a merging of the
fixed-order calculations with parton showers aim to combine the strengths of both approaches.
Before going into the details of matching and merging methods, it is useful to discuss some
aspects of fixed-order calculations. Higher-order calculations require the calculation of virtual and
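real corrections. The latter introduce additional final-state particles, so that the (next-to-)$^k$-leading-order
prediction for an observable $O$ is
$$
\begin{aligned}
\langle O \rangle = {} & \sum_{i=0}^{k} \int \mathrm{d}\Phi_n \frac{\mathrm{d}\sigma_n^{(i)}}{\mathrm{d}\Phi_n}\, O(\Phi_n) + \sum_{i=0}^{k-1} \int \mathrm{d}\Phi_{n+1} \frac{\mathrm{d}\sigma_{n+1}^{(i)}}{\mathrm{d}\Phi_{n+1}}\, O(\Phi_{n+1}) + \sum_{i=0}^{k-2} \int \mathrm{d}\Phi_{n+2} \frac{\mathrm{d}\sigma_{n+2}^{(i)}}{\mathrm{d}\Phi_{n+2}}\, O(\Phi_{n+2}) \\
& + \cdots + \int \mathrm{d}\Phi_{n+k} \frac{\mathrm{d}\sigma_{n+k}^{(0)}}{\mathrm{d}\Phi_{n+k}}\, O(\Phi_{n+k}) ,
\end{aligned} \tag{218}
$$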


where the superscript (i) determines the loop order. The symbols dΦn , dΦn+1 , . . . , dΦn+k refer to
the n-, (n + 1)-, . . . , (n + k)-particle phase-space measures defined in eq. (18). We will refer to a
“matching method” as a method to combine complete higher-order corrections (i.e. all terms for
the order k) to a single inclusive process. A “merging method” combines several calculations for a
lowest-multiplicity base process and related processes with additional well-separated jets (i.e. up
to a certain multiplicity n+m, but possibly omitting some higher-(i) terms) with parton showering.
The goals of these two approaches are often overlapping. Next-to-leading order matching methods
aim to include the NLO prediction
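$$
\langle O \rangle_{\mathrm{NLO,in}} = \int \mathrm{d}\Phi_n \left( \frac{\mathrm{d}\sigma_n^{(0)}}{\mathrm{d}\Phi_n} + \frac{\mathrm{d}\sigma_n^{(1)}}{\mathrm{d}\Phi_n} \right) O(\Phi_n) + \int \mathrm{d}\Phi_{n+1} \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\Phi_{n+1}}\, O(\Phi_{n+1}) , \tag{219}
$$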

while a leading-order merging combines the calculation


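$$
\begin{aligned}
\langle O \rangle_{\mathrm{LO,in}} = {} & \int \mathrm{d}\Phi_n \frac{\mathrm{d}\sigma_n^{(0)}}{\mathrm{d}\Phi_n}\, O(\Phi_n)\, \Theta(Q(\Phi_n) - Q_{\mathrm{MS}}) + \int \mathrm{d}\Phi_{n+1} \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\Phi_{n+1}}\, O(\Phi_{n+1})\, \Theta(Q(\Phi_{n+1}) - Q_{\mathrm{MS}}) \\
& + \cdots + \int \mathrm{d}\Phi_{n+m} \frac{\mathrm{d}\sigma_{n+m}^{(0)}}{\mathrm{d}\Phi_{n+m}}\, O(\Phi_{n+m})\, \Theta(Q(\Phi_{n+m}) - Q_{\mathrm{MS}})
\end{aligned} \tag{220}
$$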

with the parton shower. Here, Q MS denotes the so-called merging scale, which separates the hard
(fixed-order) region Q > Q MS from the soft/collinear (resummation) region Q ≤ Q MS. This scale
is in principle arbitrary, and merging algorithms should not develop a strong dependence on the
exact choice, as long as it amounts to a reasonably small scale.
Before any combination is attempted, it is important to remember that virtual and real correc-
tions are separately infrared divergent, and only their sum is free of singularities. This means that
the individual contributions need to be regularized carefully, making the (unweighted) generation
of events challenging. Matching and merging can help with this, as explained below. Furthermore,
fixed-order calculations are inclusive, meaning that a calculation for the process a b → c + X in-
cludes real-emission corrections implicitly, as part of X . For example, a leading-order calculation
for pp → e+ e− + X implicitly includes the process pp → e+ e− g. Fixed-order calculations for
different processes can thus not simply be added — they first have to be reorganized as exclu-
sive cross sections. At fixed order, this is achieved by including all relevant virtual corrections.
The parton shower employs Sudakov factors or no-emission probabilities to produce exclusive all-
order cross sections — a reminder that Sudakov factors resum virtual corrections to all orders.
High-multiplicity fixed-order calculations and showered low-multiplicity predictions may overlap
as well.
Thus, various sources of overlap between calculations should be handled when combining
fixed-order calculations with parton showers. Matching and merging methods typically employ a
mix of subtraction, phase-space division and (emission or event) vetoes for this task. The subtrac-
tions that are required in matched or merged calculations can occur at fixed perturbative order or
at all perturbative orders.
The aim of additional fixed-order subtractions is to remove the parton-shower approximations
of real and/or virtual corrections from the fixed-order calculation, such that the resulting remnant
can be showered without introducing overlaps. At next-to-leading order, this leads to (NLO)
matching formulas that schematically have the form


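$$
\begin{aligned}
\langle O \rangle_{\mathrm{NLO+PS}} = {} & \int \mathrm{d}\Phi_n \left( \frac{\mathrm{d}\sigma_n^{(0)}}{\mathrm{d}\Phi_n} + \frac{\mathrm{d}\sigma_n^{(1)}}{\mathrm{d}\Phi_n} - \frac{\mathrm{d}\sigma_n^{\mathrm{PS}(1)}}{\mathrm{d}\Phi_n} \right) \mathcal{S}(O, \Phi_n) \\
& + \int \mathrm{d}\Phi_{n+1} \left( \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\Phi_{n+1}} - \frac{\mathrm{d}\sigma_{n+1}^{\mathrm{PS}(0)}}{\mathrm{d}\Phi_{n+1}} \right) \mathcal{S}(O, \Phi_{n+1}) ,
\end{aligned} \tag{221}
$$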

where S is the shower operator defined in eq. (89). This also shows that these “parton-shower
subtractions” are typically mandatory to allow for fixed-order event generation, since this would
allow for the generation of the bracketed terms in eq. (221) as individual event samples. Addi-
tionally, this highlights that matrix-element-corrected parton showering can lead to simple NLO
matching methods. Matrix-element corrections guarantee that the first emission in the parton
shower is distributed according to the full tree-level rate by improving the splitting kernel (79),
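$$
K_{j/\tilde{\imath}\tilde{k}} \to P_{\mathrm{MEC}}\, K_{j/\tilde{\imath}\tilde{k}} \quad \text{with} \quad P_{\mathrm{MEC}} = \frac{\left| M_{n+1}^{(0)} \right|^2}{\sum_j K_{j/\tilde{\imath}\tilde{k}} \left| M_n^{(0)} \right|^2} . \tag{222}
$$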

It is straightforward to see that in the sum over all branchings this reproduces the full n+1-particle
matrix element,
$$
\sum_j P_{\mathrm{MEC}}\, K_{j/\tilde{\imath}\tilde{k}} \left| M_n^{(0)} \right|^2 = \left| M_{n+1}^{(0)} \right|^2 . \tag{223}
$$
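
The identity in eq. (223) can be checked numerically at a single phase-space point. The sketch below is only an illustration under simplifying assumptions: the kernel values and squared matrix elements are made-up numbers, the function names are hypothetical, and a common $|M_n^{(0)}|^2$ is used for all branchings, as in the schematic form of eq. (222).

```python
def mec_factor(me2_np1, kernels, me2_n):
    """P_MEC of eq. (222) at one (n+1)-parton phase-space point.

    me2_np1: |M_{n+1}|^2, kernels: list of K_j values, me2_n: |M_n|^2 (toy inputs).
    """
    return me2_np1 / (sum(kernels) * me2_n)


def corrected_rate(me2_np1, kernels, me2_n):
    """Sum over branchings j of P_MEC * K_j * |M_n|^2, cf. eq. (223)."""
    p_mec = mec_factor(me2_np1, kernels, me2_n)
    return sum(p_mec * k_j * me2_n for k_j in kernels)
```

For example, `corrected_rate(3.7, [0.4, 1.1, 0.25], 2.0)` returns the input value 3.7 up to floating-point rounding, irrespective of how the individual kernels share the rate.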

In practice, the correction factor PMEC is implemented via an additional multiplicative factor in the
accept probability of the shower veto algorithm, cf. section 2.2.3. It is worth noting that matrix-
element-correction methods historically appeared well before generic NLO matching methods [75,
146–148]. MECs identified
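$$
\frac{\mathrm{d}\sigma_{n+1}^{\mathrm{PS}(0)}}{\mathrm{d}\Phi_{n+1}} = \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\Phi_{n+1}} \tag{224}
$$
$$
\frac{\mathrm{d}\sigma_n^{\mathrm{PS}(1)}}{\mathrm{d}\Phi_n} = -\int \mathrm{d}\Phi_1 \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\Phi_{n+1}} , \tag{225}
$$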

leading to the POWHEG matching prescription


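$$
\langle O \rangle_{\mathrm{NLO+PS}} = \int \mathrm{d}\Phi_n \left( \frac{\mathrm{d}\sigma_n^{(0)}}{\mathrm{d}\Phi_n} + \frac{\mathrm{d}\sigma_n^{(1)}}{\mathrm{d}\Phi_n} + \int \mathrm{d}\Phi_1 \left. \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\Phi_{n+1}} \right|_{\Phi_n} \right) \mathcal{S}_{\mathrm{MEC}}(O, \Phi_n) . \tag{226}
$$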

Since the matrix-element-corrected parton shower SMEC would now produce real-emission events,
it is not possible to combine this calculation naively with further pre-calculated multiparton fixed-
order predictions. It would, however, be possible to add new tree-level samples if the contributions
are also subtracted in the overall result:


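$$
\begin{aligned}
\langle O \rangle_{\mathrm{STACKED}} = {} & \int \mathrm{d}\Phi_n \left( \frac{\mathrm{d}\sigma_n^{(0)}}{\mathrm{d}\Phi_n} + \frac{\mathrm{d}\sigma_n^{(1)}}{\mathrm{d}\Phi_n} + \int \mathrm{d}\Phi_1 \left. \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\Phi_{n+1}} \right|_{\Phi_n} \right) O(\Phi_n) \\
& - \int \mathrm{d}\Phi_n\, \mathrm{d}\Phi_1 \left. \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\Phi_{n+1}} \right|_{\Phi_n} \Theta(Q(\Phi_{n+1}) - Q_{\mathrm{MS}})\, O(\Phi_n) \\
& + \int \mathrm{d}\Phi_{n+1} \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\Phi_{n+1}}\, \Theta(Q(\Phi_{n+1}) - Q_{\mathrm{MS}})\, O(\Phi_{n+1}) .
\end{aligned} \tag{227}
$$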

This somewhat academic exercise of “stacking” fixed-order calculations can be cast into a more
familiar form related to the parton shower. For that, we introduce the “parton-shower weight” of
an n-parton state $w_n^{\mathrm{PS}}$ as the exact parton-shower rate of the n-parton state, excluding the product
of naive splitting probabilities. Thus,
$$
w_n^{\mathrm{PS}} = f_0(x(\Phi_0), \mu_{f,\mathrm{PS}}^2) \prod_{i=0}^{n-1} \frac{\alpha_s(t_{i+1})}{2\pi}\, \frac{f_{i+1}(x(\Phi_{i+1}), t_{i+1})}{f_i(x(\Phi_i), t_{i+1})}\, \Pi_i(t_i, t_{i+1}; \Phi_i) . \tag{228}
$$

Similarly, we may collect all coupling and PDF factors used at fixed order into the fixed-order
weight
$$
w_n^{\mathrm{FO}} = \left( \alpha_s(\mu_r^2) \right)^n f_n(x(\Phi_n), \mu_f^2) . \tag{229}
$$

Applying the ratio of the former to the latter weight to an n-parton fixed-order calculation introduces
appropriate parton-shower higher orders. With this, we may instead add and subtract
all-order contributions, leading to
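$$
\begin{aligned}
\langle O \rangle_{\mathrm{MERGED}} = {} & \int \mathrm{d}\Phi_n \left( \frac{\mathrm{d}\sigma_n^{(0)}}{\mathrm{d}\Phi_n} + \frac{\mathrm{d}\sigma_n^{(1)}}{\mathrm{d}\Phi_n} + \int \mathrm{d}\Phi_1 \left. \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\Phi_{n+1}} \right|_{\Phi_n} \right) \frac{w_n^{\mathrm{PS}}}{w_n^{\mathrm{FO}}}\, O(\Phi_n) \\
& - \int \mathrm{d}\Phi_n\, \mathrm{d}\Phi_1\, \frac{w_{n+1}^{\mathrm{PS}}}{w_{n+1}^{\mathrm{FO}}} \left. \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\Phi_{n+1}} \right|_{\Phi_n} \Theta(Q(\Phi_{n+1}) - Q_{\mathrm{MS}})\, O(\Phi_n) \\
& + \int \mathrm{d}\Phi_{n+1}\, \frac{w_{n+1}^{\mathrm{PS}}}{w_{n+1}^{\mathrm{FO}}}\, \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\Phi_{n+1}}\, \Theta(Q(\Phi_{n+1}) - Q_{\mathrm{MS}})\, O(\Phi_{n+1}) \\
\approx {} & \int \mathrm{d}\Phi_n \left( \frac{\mathrm{d}\sigma_n^{(0)}}{\mathrm{d}\Phi_n} + \frac{\mathrm{d}\sigma_n^{(1)}}{\mathrm{d}\Phi_n} + \int \mathrm{d}\Phi_1 \left. \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\Phi_{n+1}} \right|_{\Phi_n} \right) \frac{w_n^{\mathrm{PS}}}{w_n^{\mathrm{FO}}}\, \Pi_n(t_n, t_{\mathrm{cut}}; \Phi_n; > Q_{\mathrm{MS}})\, O(\Phi_n) \\
& + \int \mathrm{d}\Phi_{n+1}\, \frac{w_{n+1}^{\mathrm{PS}}}{w_{n+1}^{\mathrm{FO}}}\, \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\Phi_{n+1}}\, \Theta(Q(\Phi_{n+1}) - Q_{\mathrm{MS}})\, O(\Phi_{n+1}) .
\end{aligned} \tag{230}
$$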

Here, the additional argument “$> Q_{\mathrm{MS}}$” in the no-emission probability indicates that only emissions
leading to states with $Q(\Phi_{n+1}) > Q_{\mathrm{MS}}$ should be considered, leading to what is sometimes
called the “vetoed shower” no-emission probability. The lines after the approximate equality would
be fully equivalent to the lines before if the shower correctly reproduced the rate $\mathrm{d}\sigma_{n+1}^{(0)}/\mathrm{d}\Phi_{n+1}$.
This equation leads to an interesting interpretation: the inclusion of a no-emission probability
$\Pi_n$ on top of the fixed-order n-parton cross section is producing a subtraction that allows it to be
combined with the (n + 1)-parton event samples. Event samples that are made exclusive with the help
of no-emission probabilities can be added without further complication. This realization is the
basis of merging methods, which extend the argument to the combination of several multiparton
calculations. If the no-emission probabilities are approximated by jets after the complete evolution
sequence, then the merging procedure can become independent of the shower details. This is
the case for MLM jet matching. In the Catani–Krauss–Kuhn–Webber–Lönnblad (CKKW-L) method,
the second equation and the exact (partonic) no-emission probabilities of the parton shower are
used to calculate the rescalings $w_{n+1}^{\mathrm{PS}}/w_{n+1}^{\mathrm{FO}}$. Incidentally, such ratios are often called the “CKKW-L
weight” or “merging weight”. Unitarized merging methods retain the explicit add-subtract structure
to guarantee the correct inclusive cross sections even if the parton shower does not accurately
reproduce the (higher-order) emission pattern.
As of today, a broad spectrum of matching and merging techniques has been developed. Historically,
the first method was that of matrix-element corrections (MECs) [75], where the shower kernel
itself is corrected to the full matrix element after the first emission. This method has later been
extended to include higher orders as well [64, 98, 110, 113, 149]. For NLO matching, two general
schemes exist, namely MC@NLO [150] and POWHEG [151–153], with the former being auto-
mated in the MADGRAPH5_AMC@NLO [154] and SHERPA [155] event-generation frameworks and
the latter available through the POWHEG BOX program [156]. Well-established tree-level merging
methods are MLM [157,158] and CKKW [159,160], which utilize a simple jet-matching algorithm
and analytic Sudakov factors, respectively. The CKKW-L method [161–163] and METS [164] ex-
tend the CKKW merging scheme to use numerical no-branching probabilities, generated by trial
showers.
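
The idea of obtaining no-branching probabilities numerically from trial showers can be sketched with a one-dimensional toy. The example below is an illustration only and makes several assumptions that are not taken from PYTHIA: a toy rate f(t) = (alpha/t)(1 - t/t0), the overestimate g(t) = alpha/t, and hypothetical function names. The fraction of trial evolutions that reach the lower scale without an accepted emission estimates the no-emission probability, which for this toy rate is also known in closed form.

```python
import math
import random


def no_emission_by_trial_showers(t0, t1, alpha, n_trials, rng):
    """Fraction of trial evolutions from t0 down to t1 without an emission."""
    n_none = 0
    for _ in range(n_trials):
        t = t0
        while True:
            # Trial scale from the overestimate g(t) = alpha / t:
            # solve (t_new / t)^alpha = R for a uniform random number R.
            t *= (1.0 - rng.random()) ** (1.0 / alpha)
            if t <= t1:
                n_none += 1      # reached the lower scale without emitting
                break
            if rng.random() < 1.0 - t / t0:
                break            # trial accepted with probability f/g
    return n_none / n_trials


def no_emission_analytic(t0, t1, alpha):
    """exp(-integral of f) for the toy rate f(t) = (alpha / t) * (1 - t / t0)."""
    return math.exp(-alpha * (math.log(t0 / t1) - (1.0 - t1 / t0)))
```

Comparing `no_emission_by_trial_showers(100.0, 1.0, 0.5, 100000, random.Random(7))` with `no_emission_analytic(100.0, 1.0, 0.5)` shows agreement within the statistical precision of the trial-shower estimate.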
The Unitarised Matrix Element + Parton Shower (UMEPS) scheme [165, 166] improves the
unitarity of tree-level merging and thereby addresses the issue that tree-level merging schemes
change the inclusive cross section of the event samples. At NLO, multiple refinements of the aforementioned LO merging
schemes exist. The NL3 technique [167] extends CKKW-L to NLO, just as UNLOPS [168,169] does
the same with UMEPS. At the same time, UNLOPS may be viewed as the unitarity-improved version
of NL3. The MENLOPS scheme [170,171] combines an NLO calculation for the lowest multiplicity
with LO calculations for higher multiplicities in the METS scheme, while the full extension to NLO
is treated in the MEPS@NLO scheme [172, 173]. The FxFx method [174] combines MC@NLO
matching with MLM merging. The MiNLO scheme [175, 176] may be regarded as a scale-setting-
improved NLO extension of the CKKW algorithm, with analytically calculated NLL Sudakov factors
between the clustered states.
Before describing the matching and merging methods available in PYTHIA 8.3, it should be
emphasized that (NLO) matching and merging methods introduce the stated fixed-order accuracy
only up to the matched (merged) multiplicities. That is, an NLO-matched n-jet event sample has
NLO accuracy only for n-jet observables, while (n + 1)-jet observables will have LO accuracy, and
(n+2)-jet as well as higher-multiplicity observables have parton-shower accuracy only. Similarly, a
merged event sample with up to n jets at (N)LO accuracy, has (N)LO accuracy for m-jet observables
with m ≤ n. In the case of NLO merging, (n + 1)-jet observables will have LO accuracy, while they
will have parton-shower accuracy for LO merging. In either case, m-jet observables with m > n+1
have only parton-shower accuracy. Another question is the accuracy of the inclusive cross section.
In NLO matching schemes, the inclusive cross section is accurate to NLO by design. In merging
schemes, the inclusive n-jet cross sections are individually retained only if the
merging scheme is constructed to be unitary, such as UMEPS or UNLOPS. In non-unitary merging
schemes, inclusive cross sections are changed by the inclusion of higher-multiplicity event samples.
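
The accuracy rules for merged samples stated above can be summarized in a small lookup function. This is merely a restatement of the paragraph as code; the function name and the string labels are illustrative choices, not part of PYTHIA.

```python
def merged_accuracy(m_jets, n_max, order):
    """Formal accuracy of an m-jet observable in a sample merged up to n_max jets.

    order is "LO" or "NLO" (the accuracy of the merged input samples);
    the return value is "NLO", "LO", or "PS" (parton-shower accuracy only).
    """
    if order not in ("LO", "NLO"):
        raise ValueError("order must be 'LO' or 'NLO'")
    if m_jets <= n_max:
        return order
    if m_jets == n_max + 1 and order == "NLO":
        return "LO"
    return "PS"
```

For instance, `merged_accuracy(3, 2, "NLO")` gives `"LO"`, while `merged_accuracy(3, 2, "LO")` gives `"PS"`.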


5.1 PYTHIA methods for leading-order multi-jet merging


As discussed above, PYTHIA offers a variety of leading-order merging schemes. This allows for
a test of the robustness of merged predictions, beyond assessing the uncertainty due to scheme-
specific parameters. The main task for a leading-order merging scheme is to produce an inclusive
event sample that provides a simultaneous tree-level prediction for observables depending on any
number of jets ≤ n. This entails removing any overlap between the tree-level calculations for ≤ n
partons. The second main task is to provide a smooth transition between the “well-separated re-
gion” (Q(Φn ) > Q MS) described by (reweighted) tree-level results, and the “soft/collinear region”
(Q(Φn ) < Q MS) described by the parton-shower radiation pattern. Internal merging schemes in
PYTHIA also ensure that the self-consistency of the PYTHIA event-generation chain is intact.
CKKW-L multi-jet merging is the oldest tree-level merging scheme implemented in PYTHIA. It
allows both standard-model and BSM core processes,⁴ to which several colour-charged
partons or W bosons are added. Lepton- and hadron-collider processes are accepted. The resulting
tree-level samples are combined with each other and the default parton shower by employing the
merging formula
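$$
\begin{aligned}
\langle O \rangle_{\mathrm{CKKW\text{-}L}} = {} & \sum_{n=0}^{N-1} \int \mathrm{d}\Phi_n \frac{\mathrm{d}\sigma_n^{(0)}}{\mathrm{d}\Phi_n}\, \frac{w_n^{\mathrm{PS}}}{w_n^{\mathrm{FO}}}\, \Pi_n(t_n, t_{\mathrm{cut}}; \Phi_n; > Q_{\mathrm{MS}})\, \Theta(Q(\Phi_n) - Q_{\mathrm{MS}})\, \mathcal{S}(O, \Phi_n; < Q_{\mathrm{MS}}) \\
& + \int \mathrm{d}\Phi_N\, \frac{w_N^{\mathrm{PS}}}{w_N^{\mathrm{FO}}}\, \frac{\mathrm{d}\sigma_N^{(0)}}{\mathrm{d}\Phi_N}\, \Theta(Q(\Phi_N) - Q_{\mathrm{MS}})\, \mathcal{S}(O, \Phi_N) ,
\end{aligned} \tag{231}
$$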
where the showers $\mathcal{S}(O, \Phi_n; < Q_{\mathrm{MS}})$ of all but the highest-multiplicity event sample fill in emissions
below the merging scale $Q_{\mathrm{MS}}$. The main challenge of CKKW-L merging lies in the correct
calculation of the weights $w_n^{\mathrm{PS}}$. Their calculation in PYTHIA ensures that the value of the weights
is identical to the all-order weight the shower would have attached to the state $\Phi_n$, had it produced
the state internally. This requires the construction of all possible parton-shower histories, and an
admixture of the (history-dependent) weight factors identical to that of the shower [162].
A theoretical drawback of CKKW-L is that inclusive jet cross sections change upon inclusion of
higher-multiplicity tree-level samples. The size of the change is determined by the value of the
merging scale Q MS, leading to unacceptable changes for Q MS of O(1GeV). For low merging scales,
the exact “subtract what you add” strategy highlighted in eq. (230) has to be employed. For this
purpose, the UMEPS method has been introduced in PYTHIA. The UMEPS implementation can
handle the same process classes as the CKKW-L scheme. The UMEPS merging formula reads
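$$
\begin{aligned}
\langle O \rangle_{\mathrm{UMEPS}} = {} & \sum_{n=0}^{N-1} \int \mathrm{d}\Phi_n \Bigg[ \frac{\mathrm{d}\sigma_n^{(0)}}{\mathrm{d}\Phi_n}\, \frac{w_n^{\mathrm{PS}}}{w_n^{\mathrm{FO}}}\, \Theta(Q(\Phi_n) - Q_{\mathrm{MS}}) \\
& \qquad - \int \mathrm{d}\Phi_1\, \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\Phi_{n+1}}\, \frac{w_{n+1}^{\mathrm{PS}}}{w_{n+1}^{\mathrm{FO}}}\, \Theta(Q(\Phi_{n+1}) - Q_{\mathrm{MS}}) \Bigg] \mathcal{S}(O, \Phi_n; < Q_{\mathrm{MS}}) \\
& + \int \mathrm{d}\Phi_N\, \frac{w_N^{\mathrm{PS}}}{w_N^{\mathrm{FO}}}\, \frac{\mathrm{d}\sigma_N^{(0)}}{\mathrm{d}\Phi_N}\, \Theta(Q(\Phi_N) - Q_{\mathrm{MS}})\, \mathcal{S}(O, \Phi_N) .
\end{aligned} \tag{232}
$$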
The subtractive samples in this formula are produced with the help of the parton-shower history
employed to calculate the weights $w_n^{\mathrm{PS}}$ [165].
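
The way in which the add-subtract structure of eq. (232) preserves the inclusive cross section can be made explicit in a short worked check. This sketch assumes $O = 1$, a unitary shower operator with $\mathcal{S}(1, \Phi) = 1$, and the phase-space factorization $\mathrm{d}\Phi_{n+1} = \mathrm{d}\Phi_n\, \mathrm{d}\Phi_1$; the shorthand $I_n$ is introduced here only for illustration. With

$$
I_n \equiv \int \mathrm{d}\Phi_n\, \frac{\mathrm{d}\sigma_n^{(0)}}{\mathrm{d}\Phi_n}\, \frac{w_n^{\mathrm{PS}}}{w_n^{\mathrm{FO}}}\, \Theta(Q(\Phi_n) - Q_{\mathrm{MS}}) ,
$$

each subtraction in the sum equals $I_{n+1}$, so that the sum telescopes:

$$
\langle 1 \rangle_{\mathrm{UMEPS}} = \sum_{n=0}^{N-1} \left( I_n - I_{n+1} \right) + I_N = I_0 .
$$

Under these assumptions, the higher-multiplicity samples redistribute events among jet multiplicities without changing the total rate of the lowest-multiplicity sample.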
⁴ Note, though, that no attempt is made at diagram removal or subtraction for colour-charged BSM resonances that
decay into SM quarks.


As a final remark on leading-order merging, it should be noted that PYTHIA offers detailed
semi-internal UserHooks utilities for MLM jet matching [177]. This early approach to combining
fixed-order matrix-element calculations with parton showers approximates the parton-shower no-
emission probabilities necessary to remove overlap between samples with a pragmatic event-veto
procedure: the parton ensemble at fixed order is stored, and compared to jets after showering the
ensemble. If each jet direction overlaps with one parton direction, the event is retained. The rate
of rejected events mimics the application of no-emission probabilities. This convenient approach
has the benefit of simplicity and computational efficiency, at the expense of sacrificing a formal
understanding of the result.
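
The event-veto logic of MLM jet matching can be sketched in a few lines. The following is a simplified illustration, not the implementation in PYTHIA's UserHooks utilities: partons and jets are represented as hypothetical (eta, phi) tuples, the matching radius `r_match` is an assumed parameter, and only a strictly one-to-one ("exclusive") matching criterion is shown.

```python
import math


def delta_r(a, b):
    """Distance in the (eta, phi) plane, with the azimuthal difference wrapped."""
    d_eta = a[0] - b[0]
    d_phi = math.atan2(math.sin(a[1] - b[1]), math.cos(a[1] - b[1]))
    return math.hypot(d_eta, d_phi)


def mlm_accept(partons, jets, r_match=0.4):
    """Keep the event only if jets and partons match one-to-one within r_match."""
    if len(jets) != len(partons):
        return False
    unmatched = list(partons)
    for jet in jets:
        # Closest parton that has not yet been assigned to another jet.
        best = min(unmatched, key=lambda p: delta_r(jet, p))
        if delta_r(jet, best) > r_match:
            return False
        unmatched.remove(best)
    return True
```

An event with partons at `[(0.0, 0.0), (1.0, 2.0)]` and jets at `[(0.1, 0.05), (1.1, 2.1)]` is retained, whereas an event with an additional hard jet, or with a jet far from every parton, is rejected.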

5.2 PYTHIA methods for matching


The POWHEG NLO matching formalism as given in eq. (226) provides an elegant and universal
method for the combination of NLO calculations and parton showers. It is universal, as it does
not depend on the exact implementation of the parton shower to be matched. This is because,
in addition to the NLO-corrected Born-level event, the first emission is generated according to a
matrix-element corrected no-branching probability
$$
\bar{\Pi}(Q_0^2, Q_1^2) = \exp\left\{ -\int_{Q_1^2}^{Q_0^2} \mathrm{d}\Phi_{+1}\, \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\sigma_n^{(0)}} \right\} , \tag{233}
$$

which is independent of the branching kernels used by the shower. In principle, an NLO-matched
prediction could therefore be obtained with any given shower by starting the shower evolution
at the scale of the first POWHEG emission. In practice, the ordering variable of the shower t
will disagree with the ordering variable Q2 used in the POWHEG formalism. To avoid over- or
under-counting emissions, it is hence preferable to start the shower at the phase-space maxi-
mum (i.e. using a “power shower”, cf. section 4.1.3) and vetoing emissions that are harder than
the POWHEG emission according to the POWHEG ordering variable. This method was outlined in
ref. [91] and since then PYTHIA 8.3 provides the relevant POWHEG hook to supply consistent showering
of POWHEG BOX events with the simple showers. It is worth noting that this procedure leads
to the somewhat awkward situation that the first, i.e. hardest, jet is produced from the kinematics
and colour configuration of the Born+1-jet state. To circumvent this, a more complete treatment
would involve clustering the first emission and running a vetoed truncated shower off the actual
Born state for the first emission. This is currently not available in PYTHIA 8.3.
It might potentially be regarded as an inelegance that the POWHEG no-branching probability
eq. (233) exponentiates the full matrix element, including process-specific non-singular terms,
and that the value of $Q_0^2$ is typically given by the (hadronic) phase-space limit, and not a scale that
more adequately defines the transition between “hard” and “soft” radiation. These concerns are
avoided in the (historically first developed) MC@NLO method, in which the real-radiation matrix
element is separated into an infrared-singular (“soft”) and infrared-regular (“hard”) part,
$$
\mathrm{d}\sigma_{n+1}^{(0)} = \mathrm{d}\sigma_{n+1}^{\mathrm{S}(0)} + \mathrm{d}\sigma_{n+1}^{\mathrm{H}(0)} . \tag{234}
$$
Therefore, only the singular part in the no-branching probability is retained,
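$$
\bar{\Pi}^{\mathrm{S}}(Q_0^2, Q_1^2) = \exp\left\{ -\int_{Q_1^2}^{Q_0^2} \mathrm{d}\Phi_{+1}\, \frac{\mathrm{d}\sigma_{n+1}^{\mathrm{S}(0)}}{\mathrm{d}\sigma_n^{(0)}} \right\} , \tag{235}
$$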


so that the MC@NLO matched expectation value of an observable O reads

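$$
\begin{aligned}
\langle O \rangle_{\mathrm{NLO+PS}}^{(\mathrm{MC@NLO})} = {} & \int \mathrm{d}\Phi_n \Bigg[ \frac{\mathrm{d}\sigma_n^{(0)}}{\mathrm{d}\Phi_n} + \frac{\mathrm{d}\sigma_n^{(1)}}{\mathrm{d}\Phi_n} + \int \mathrm{d}\Phi_{+1}\, \frac{\mathrm{d}\sigma_{n+1}^{\mathrm{PS}(0)}}{\mathrm{d}\Phi_{n+1}} \\
& \qquad + \int \mathrm{d}\Phi_{+1} \left( \frac{\mathrm{d}\sigma_{n+1}^{\mathrm{S}(0)}}{\mathrm{d}\Phi_{n+1}} - \frac{\mathrm{d}\sigma_{n+1}^{\mathrm{PS}(0)}}{\mathrm{d}\Phi_{n+1}} \right) \Bigg] \mathcal{S}(O, \Phi_n) \\
& + \int \mathrm{d}\Phi_{n+1}\, \frac{\mathrm{d}\sigma_{n+1}^{\mathrm{H}(0)}}{\mathrm{d}\Phi_{n+1}} .
\end{aligned} \tag{236}
$$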

As evident from eq. (236), the MC@NLO method requires stringent consistency between the NLO
calculation and the shower. In contrast to the POWHEG method, it therefore does not provide a
universal scheme but needs to be implemented explicitly for each shower. To facilitate a simple
implementation of the MC@NLO technique for PYTHIA’s simple shower, PYTHIA 8.3 provides a
global-recoil scheme, cf. section 4.1.3. A publicly available parton-level event generator support-
ing the generation of MC@NLO events for PYTHIA’s simple shower is MADGRAPH5_AMC@NLO.
Caution is, however, advised, as the global-recoil scheme might not be a suitable choice for each
and every process.
Unlike the POWHEG formalism, the MC@NLO scheme explicitly employs negative-weighted
events (in fact POWHEG was designed to remove negative weights from MC@NLO). While not
posing a problem technically, negative weights reduce the efficiency of any simulation, simply because
they have to be compensated for in histograms with positive-weighted events, of which more
are needed to obtain the same statistics as for simulations with a strict probability interpretation.
The fraction of negative-weighted events can be reduced by the MC@NLO-∆ formalism [178],
which regulates the divergent structure of real-emission matrix elements via shower-generated
no-branching probabilities. In addition, the MC@NLO-∆ prescription introduces an independent
shower starting scale for each colour line in the hard process. Such multi-scale treatment is also
required in the POWHEG formalism, if resonance-aware matching is pursued, e.g. when using
the POWHEG BOX RES generator [179]. In both cases, PYTHIA 8.3 offers the necessary machinery
to deal with multiple scale definitions [180] through UserHooks (see section 9.7.2). While
PYTHIA 8.3 offers in-house implementations for MC@NLO and POWHEG matching as alluded to
above, other matching schemes can conveniently be implemented via user hooks, cf. section 9.7.2.
A prominent example is the NNLO+PS matching framework GENEVA [181–184].
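
The statistical cost of negative weights mentioned above can be quantified with a simple estimate. The sketch below assumes unit-magnitude weights of either sign and the usual error estimate based on the sum of squared weights; under these assumptions, a sample with a fraction f of negative-weighted events needs 1/(1 - 2f)^2 times as many events as a purely positive-weighted sample to reach the same relative precision. The function name is an illustrative choice.

```python
def sample_size_factor(negative_fraction):
    """Relative number of events needed for a given fraction of negative weights.

    Assumes weights of +1 or -1 and a statistical error ~ sqrt(sum of w^2).
    """
    if not 0.0 <= negative_fraction < 0.5:
        raise ValueError("negative fraction must lie in [0, 0.5)")
    return 1.0 / (1.0 - 2.0 * negative_fraction) ** 2
```

With a quarter of all events carrying negative weight, `sample_size_factor(0.25)` evaluates to 4, i.e. four times as many events are required.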

5.3 PYTHIA methods for NLO multi-jet merging


A number of schemes to combine several matched NLO (QCD) calculations with each other and
parton showering are available in PYTHIA. As was the case for tree-level merging (cf. section 5.1),
this allows for an exploration — through comparison within the same code base — of the benefits
and limitations of various approaches, as well as NLO merged predictions more generally.
Historically, the first two NLO merging schemes embedded in PYTHIA were NL3 (an extension
of the CKKW-L approach to NLO) and UNLOPS, the extension of UMEPS to NLO accuracy. Beyond
the theoretical and computational challenges already present at leading order, NLO merging needs
to ensure that the application of all-order weights $w_n^{\mathrm{PS}}/w_n^{\mathrm{FO}}$ does not invalidate the NLO precision
of the input samples due to overlapping virtual corrections. This can be achieved by subtracting
the first-order expansion of the shower weights attached to tree-level events. Thus, the main
complication of the calculation lies in generating the terms of the shower expansion [185]. PYTHIA uses
trial showering to generate the expansion of no-emission probabilities, and analytic results to
calculate the expansion of running-coupling and PDF factors. Once these technicalities are under
control, the NLO extension of CKKW-L implements the matching formula
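$$
\begin{aligned}
\langle O \rangle_{\mathrm{NL3}} = {} & \sum_{n=0}^{N} \int \mathrm{d}\Phi_n \frac{\mathrm{d}\sigma_n^{(0)}}{\mathrm{d}\Phi_n}\, \frac{w_n^{\mathrm{PS}}}{w_n^{\mathrm{FO}}}\, \Pi_n(t_n, t_{\mathrm{cut}}; \Phi_n; > Q_{\mathrm{MS}})\, \Theta(Q(\Phi_n) - Q_{\mathrm{MS}})\, \mathcal{S}(O, \Phi_n; < Q_{\mathrm{MS}}) \\
& + \sum_{n=0}^{N} \int \mathrm{d}\Phi_n \Bigg\{ \frac{\mathrm{d}\sigma_n^{(1)}}{\mathrm{d}\Phi_n} + \int \mathrm{d}\Phi_1 \left. \frac{\mathrm{d}\sigma_{n+1}^{(0)}}{\mathrm{d}\Phi_{n+1}} \right|_{\Phi_n} \\
& \qquad - \frac{\mathrm{d}\sigma_n^{(0)}}{\mathrm{d}\Phi_n} \left[ \frac{w_n^{\mathrm{PS}}}{w_n^{\mathrm{FO}}}\, \Pi_n(t_n, t_{\mathrm{cut}}; \Phi_n; > Q_{\mathrm{MS}}) \right]_{\mathcal{O}(\alpha_s)} \Bigg\}\, \mathcal{S}(O, \Phi_n; < Q_{\mathrm{MS}}) \\
& + \int \mathrm{d}\Phi_{N+1}\, \frac{\mathrm{d}\sigma_{N+1}^{(0)}}{\mathrm{d}\Phi_{N+1}}\, \frac{w_{N+1}^{\mathrm{PS}}}{w_{N+1}^{\mathrm{FO}}}\, \Theta(Q(\Phi_{N+1}) - Q_{\mathrm{MS}})\, \mathcal{S}(O, \Phi_{N+1}) ,
\end{aligned} \tag{237}
$$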

where the square brackets with subscript $\mathcal{O}(\alpha_s)$ indicate that the $\mathcal{O}(\alpha_s)$ term of the expansion of the
bracketed term is required. The first and last line of eq. (237) are identical to the CKKW-L result.
The second line incorporates the inclusive NLO correction, and the subtraction of double count-
ing of virtual corrections. As was the case for CKKW-L, the calculation of all necessary terms in
eq. (237) employs parton shower histories.
However, the NL3 formula eq. (237) has the same theoretical drawback as the CKKW-L results:
inclusive n-parton rates are changed when including corrections to m > n parton rates. The size of
the effect is determined by Q MS, and can easily be of a similar numerical size as NLO corrections
for moderately small Q MS. Thus, PYTHIA also extends the UMEPS method, which corrects this
behaviour, to NLO accuracy. The resulting UNLOPS merging formula can be found in ref. [185].
UNLOPS is the preferred NLO merging scheme in PYTHIA.
Before moving on, it is interesting to observe that any reweighting of the second line in
eq. (237) with higher-order terms will neither affect the NLO fixed-order nor the shower accu-
racy of the combined calculation. Thus, variants of NLO merged methods can be obtained from
reweighting these contributions. This freedom, and the resulting uncertainty, is exposed in PYTHIA
by offering three different variants of UNLOPS [186]. Sensible uncertainties from NLO merged
calculations should include the envelope of these variants as “scheme uncertainty”.
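
Forming such an envelope is straightforward once the variants have been histogrammed with identical binning. The sketch below is generic and does not use any PYTHIA interface: each variant is assumed to be a plain list of bin contents, and the function name is an illustrative choice.

```python
def scheme_envelope(variant_histograms):
    """Bin-wise lower and upper envelope over equally binned variant histograms."""
    if not variant_histograms:
        raise ValueError("need at least one variant")
    n_bins = len(variant_histograms[0])
    if any(len(h) != n_bins for h in variant_histograms):
        raise ValueError("all variants must share the same binning")
    bins = list(zip(*variant_histograms))
    lows = [min(b) for b in bins]
    highs = [max(b) for b in bins]
    return lows, highs
```

The spread between the lower and upper envelope in each bin can then be quoted as the scheme uncertainty alongside the usual scale and merging-scale variations.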
As part of its semi-internal implementation of MLM jet matching, PYTHIA also offers semi-
internal utilities to combine input samples produced for FxFx merging [174]. This scheme extends
the MLM method to NLO QCD accuracy, and handles the overlap between events of different
multiplicity before showering in a hybrid scheme: fixed-order events are reweighted with analytic
Sudakov form factors to produce results that are additive before showering. The overlap after
showering is addressed in a pragmatic way, following along the lines of MLM jet matching. The
resulting scheme is computationally efficient, especially since Sudakov form factors can directly
be integrated into the fixed-order calculation code. However, this efficiency comes at the price of
an unclear all-order structure of the prediction. Nevertheless, the scheme has found a large user
base.

5.4 Matching and Merging in VINCIA


The unique “maximally bijective” property of VINCIA’s sector antenna showers, cf. section 4.2,
makes them well suited for incorporating corrections from fixed-order matrix elements, especially
at high multiplicities. The focus so far has been on QCD corrections, although the sector nature of
VINCIA’s coherent QED shower should make adaptations to include QED corrections as well fairly
straightforward.
At leading order, a dedicated CKKW-L merging scheme has been implemented which exploits
the properties of VINCIA’s sector showers. This is discussed below in section 5.4.1. Details on the
general PYTHIA CKKW-L implementation can be found in section 5.1.
Next-to-leading order matching in the antenna framework is so far not generally available,
but VINCIA’s QCD showers, including the resonance-final one, can be combined with NLO QCD
calculations by shower-independent matching methods, such as POWHEG. This is described in
section 5.4.2.
Although VINCIA currently has no built-in NLO matching schemes, a generalization of the
scheme developed in refs. [97, 187] may become available in the future. In a similar vein, VINCIA
does not offer the merging of multiple NLO-matched samples in the current version. Schemes
extending the ones described in section 5.3 to sector showers may, however, be implemented in
the future.
VINCIA’s NNLO matching approach presented in ref. [188] is not yet part of the public PYTHIA 8.3
releases. We expect it to become available in a future release, once it has been applied and vali-
dated for a larger class of processes.
A signature feature in early developments of VINCIA, iterated matrix-element corrections [64,
107, 149, 189] have not yet been made available in PYTHIA 8.3. Plans are underway to do so,
building on the new matrix-element interfaces described in section 10.1.6. Note also that VINCIA’s
weak shower, described in section 4.2.4, is currently not sectorized and hence full-fledged EW
merging would presumably require some work. Finally, note that dedicated matching and merging
strategies for VINCIA’s interleaved treatment of resonance decays have not yet been worked out.
See the program’s online manual for updates.

5.4.1 Leading-order merging


Tree-level merging with VINCIA is done in the CKKW-L scheme [159,161,162] according to eq. (231),
properly extended to sector showers [163]. The phase-space sectorization particularly facili-
tates the merging at very high multiplicities and offers increased control over highly-complex
final states. Most of the merging method is identical to the PYTHIA implementation described in
section 5.1, with the difference only in the construction of shower histories needed for the Su-
dakov reweighting. The settings relevant to VINCIA’s CKKW-L implementation can be found in
section 9.6.1.
In the default CKKW-L scheme, all possible shower histories are constructed and the one maxi-
mizing the branching probability is chosen, cf. section 5.1. In the sector-shower CKKW-L implemen-
tation, however, the construction of all possible histories is replaced by a deterministic inversion of
the shower evolution. This is possible because VINCIA’s sector showers generate branchings only if
these correspond to the minimal sector-resolution variable, cf. section 4.2.2. The sector-resolution
variable can then be used to exactly invert any branching. The only subtlety in this algorithm stems
from the treatment of multiple quark pairs, for which all possible quark-antiquark clusterings must
be taken into account. To this end, the same procedure as in the default CKKW-L method is utilized
and a shower history is constructed for all viable permutations of colour strings between quark
pairs, and the one maximizing the branching probability is picked. This algorithm replaces the
shower history tree by maximally a few linear history branches, which not only positively affects
the CPU time needed for the computation, but more importantly reduces the prohibitive scaling
in memory allocation intrinsic to the default CKKW-L algorithm.

5.4.2 NLO matching


If an NLO-matched calculation with VINCIA is desired, the POWHEG method [151, 152] with
externally matched NLO event samples, as e.g. produced by the POWHEG BOX program [156, 179],
can be used [89]. To this end, the difference between the POWHEG BOX and VINCIA evolution variables
is properly accounted for by increasing the shower starting scale and vetoing emissions above
the POWHEG scale [91]. The usage of POWHEG BOX with VINCIA is described in detail in ref. [190,
appendix A].

5.5 Matching and Merging in DIRE


At the time of compiling this manual, the matching and merging machinery with DIRE has not
been validated within PYTHIA 8.3. Previous versions of DIRE + PYTHIA 8.2 included CKKW-L tree-
level merging [136], UNLOPS NLO merging [137], iterated matrix-element corrections within the
MOPS approach [136, 138], and TOMTE N3LO+PS matching [191]. We expect these previous
developments to become accessible and validated in PYTHIA 8.3 in the future.

6 Soft and beam-specific processes


Hadrons and nuclei are composite objects, mainly made out of quarks and gluons. This requires the
introduction of parton distribution functions (PDFs) $f_a^A(x, Q^2)$, expressing the probability density
that parton a exists inside particle A with a momentum fraction x if the particle is probed at a
resolution scale $Q^2$. Given such PDFs, hard collisions between the constituent partons can be
described by perturbation theory, see section 2.3. But in the limit p⊥ → 0 the cross section for
perturbative QCD scattering diverges, and traditional perturbation theory breaks down.
The alternative offered already since before the advent of QCD is so-called Regge–Gribov the-
ory [192–196], wherein an effective field theory is formulated in terms of the exchange of reggeon
(R) and pomeron (P) objects between the colliding hadrons, with propagators and vertex-coupling
strengths, the latter both to hadrons and among themselves. A reggeon (pomeron)
contribution represents the resummed effect from the exchange of (an infinite set of) mesons
(glueballs) with a common set of flavour quantum numbers, but ordered in a linear relationship
(a “trajectory”) between increasing orbital angular momentum L and increasing $m^2$. This language
can be used to motivate expressions for total, elastic, and diffractive cross sections, even if
today this is done in a pragmatic spirit, where not fully consistent adjustments of parameters can
be made to better fit data.
Leptons are fundamental particles, unlike hadrons, and it would seem like traditional pertur-
bation theory can always be applied. But a charged lepton is surrounded by a cloud of virtual
photons, and these carry some of the total momentum. It therefore becomes necessary to in-
troduce PDFs also to describe the distribution of a lepton and photons inside the whole charged
lepton, as a function of $Q^2$. Either of these two components can then collide with constituents of
the other beam. The photon, in its turn, can fluctuate further into a lepton or quark pair, and the
latter again can have a non-perturbative behaviour. This requires a similar approach for photon
interactions as for hadron ones, in fact with even more complexity. Since hadrons and nuclei also
can contain or be surrounded by photons, by coupling to the charge of individual quarks or to the
hadron as a whole, similar issues can arise in hadronic collisions.
Also fluxes of W± and Z bosons can be defined, and have been used in the past, both for
leptons and for protons. The large weak-gauge-boson masses suppress the rate in the p⊥ → 0
limit, however, and so their contributions are better handled as propagators in Feynman graphs,
like the top quark and the Higgs boson. This also implies that neutrinos can be considered point-
like for our purposes.
Heavy-ion collisions introduce further physics aspects, relative to hadronic collisions. Some
of these are reasonably well understood, such as the role of the initial geometry, where models
for the distribution of nucleons inside a nucleus can be used to find the “wounded” nucleons, i.e.
those that interact. But most of the subsequent physics is still open to interpretation, and different
approaches exist. One such is the PYTHIA 8.3/ANGANTYR model, presented here.

6.1 Total and semi-inclusive cross sections


Here we introduce the components of the total hadron-hadron cross section, and how these vary
as a function of the collision energy. The intention is not to go into the modelling of the collision
processes as such, which is the main topic for subsequent subsections, but some such information
is necessary when the differential cross sections are the basic building blocks, and the integrated
ones only a consequence thereof. See also the online manual, under the heading “Total Cross
Sections”.
Throughout this section, we will discuss collisions between two high-energy hadrons A and B
at a squared CM energy $s = E_{CM}^2$. By high energy, we mean roughly $E_{CM} > 10$ GeV, where the
perturbative model is valid. Low-energy non-perturbative processes are discussed in section 6.1.5.
The (strong-interaction) total cross section for the collision of two high-energy hadrons is conve-
niently subdivided into several components, typically
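\begin{align}
\sigma_{tot}^{AB}(s) &= \sigma_{el}^{AB}(s) + \sigma_{inel}^{AB}(s) \nonumber \\
&= \sigma_{el}^{AB}(s) + \sigma_{sd(XB)}^{AB}(s) + \sigma_{sd(AX)}^{AB}(s) + \sigma_{dd}^{AB}(s) + \sigma_{cd}^{AB}(s) + \sigma_{nd}^{AB}(s) . \tag{238}
\end{align}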

The components are:

• Elastic scattering (el) AB → AB where the hadrons are scattered through an angle but are
otherwise unharmed. Everything else, where the final state is not AB, is collectively called
inelastic.

• Single diffraction (sd) where either of the incoming hadrons becomes an excited system,
while the other remains intact, AB → X B or AB → AX . Here, X represents the excited
system that eventually will produce two or more hadrons.

• Double diffraction (dd) where both hadrons are excited, $AB \to X_1 X_2$, but remain as separate
objects.

• Central diffraction (cd), where both hadrons survive but lose energy to a new central system,
AB → AX B.

• Non-diffractive interactions (nd), or more formally inelastic non-diffractive ones, where
both hadrons break up and form a common system, AB → X , that is not (easily) subdivided
into separate subsystems, unlike diffraction.
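
The bookkeeping of eq. (238) can be sketched in a few lines. This is only an illustration of how the components add up; the function name and the numerical values are placeholders introduced here, not PYTHIA output or PYTHIA API.

```python
def combine_cross_sections(el, sd_xb, sd_ax, dd, cd, nd):
    """Return (inelastic, total) cross sections from the components of eq. (238).

    All inputs and outputs are in the same unit, e.g. mb.
    """
    # Everything that is not elastic is inelastic.
    inel = sd_xb + sd_ax + dd + cd + nd
    return inel, el + inel


# Placeholder component values in mb, chosen only to exercise the arithmetic.
sigma_inel, sigma_tot = combine_cross_sections(
    el=25.0, sd_xb=6.0, sd_ax=6.0, dd=8.0, cd=1.0, nd=54.0
)
```

In a model where the total and the other components are parameterized independently, the same relation is typically read the other way around, with the non-diffractive part obtained as the remainder.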

[Figure 9, six panels: elastic; single diffractive (XB); single diffractive (AX); double diffractive; central diffractive; nondiffractive.]

Figure 9: Schematic Feynman-diagram-style illustration of the six event classes in
eq. (238). A pair of parallel vertical gluons represent a net colour-singlet exchange,
a pomeron, while a single vertical gluon gives a colour-octet exchange, a cut pomeron.
The vertical axis can be viewed as representing the rapidity range spanned between A and
B, where horizontal gluons are regions with possible partonic final-state presence. The
red bars represent the final regions where strings will be drawn and produce hadrons,
whereas the regions without them are rapidity gaps.

In principle, one could imagine higher diffractive topologies, say $AB \to X_1 X_2 X_3$, but these are
expected to be small and are neglected here. For applications at low energies we will also introduce
annihilation and resonance contributions.
The dividing line between these different components is unclear, notably between diffractive
and non-diffractive events. Single- or double-diffractive systems X predominantly have low
masses, and thus only produce a few particles at either end of the full rapidity range. In between,
there is a large rapidity gap, i.e. a region in rapidity space without any particle production. That
is unlike the non-diffractive events, where particle production is assumed to span the whole avail-
able rapidity range. But, since particles are discrete objects randomly produced, there will be a
falling distribution of increasing gap sizes also in non-diffractive events. In contrast, the falling
tail of large-mass diffractive systems can leave no obvious gap in a diffractive event. We there-
fore need to distinguish the theoretical modelling of cross-section classes and event properties
presented here from the experimental-detector and analysis-dependent classification of observed
events.
In modern nomenclature, where a pomeron is viewed as shorthand for a two-gluon system
in a colour-singlet state, the different event classes can be illustrated in terms of the colour and
momentum-energy transfers between the two incoming hadrons, see fig. 9. An elastic scattering
requires a pomeron (or reggeon) to be exchanged, so that the scattered hadrons remain colour
singlets, but with (modestly) changed outgoing momenta. If only one gluon is exchanged, a so-
called cut pomeron, then the colour transfer turns the A and B hadrons into colour-octet objects,
which means they will be connected by colour strings that can fragment into hadrons over the
whole rapidity range, i.e. this gives a non-diffractive event. Single diffraction, e.g. AB → AX , can
be viewed as a two-step process: first the emission of a P from A, carrying a fraction ξ of the A
momentum, and second the collision between the P and B, giving rise to a system with $M_X^2 = \xi s$.
For the first step a pomeron flux $f_P^A(\xi, t)$ can be introduced in analogy with conventional PDFs,
while the second step can be viewed as a non-diffractive-type PB subcollision, at least for large
$M_X$. Double and central diffraction can be described in a similar manner.
The hadronic cross sections that will be discussed later are for reasonably high hadron-hadron
CM energies, say $E_{CM} > 10$ GeV, corresponding to a fixed-target proton-proton beam energy of
$E_{beam} \gtrsim 50$ GeV. Separate from this, low-energy cross sections will be discussed in the context of
hadronic rescattering, section 6.1.5. To a large extent the same language can be used, but at low
energies the contribution from exclusive resonances, AB → R → AB or AB → R → C D, can give
rise to rapid fluctuations in the cross section as a function of s.

6.1.1 Proton total cross sections


As already mentioned, pomeron and reggeon contributions play a large role in the modelling of
cross sections. Both are expected to give an $s^\delta$ energy dependence, where δ ≥ 0 for pomerons
and δ < 0 for reggeons, such that high-energy cross sections are dominated by the pomeron term.
The pomeron contribution is even, i.e. the same for $AB$ and $A\bar{B}$, while reggeons can be even or
odd, the latter giving opposite-sign contributions for the two processes. A hypothetical odderon
contribution would be odd, as the name indicates, and have δ ≥ 0 like the pomeron, so that
$\sigma^{AB}(s) - \sigma^{A\bar{B}}(s)$ would not vanish in the s → ∞ limit. Its existence is supported by recent TOTEM
data [197], but is still not included in most models.
The simple $s^\delta$ behaviour is obtained for the exchange of a single object, whereas multiple
exchanges can come in with opposite signs and damp the rise of cross sections. The Froissart
bound [198] shows that cross sections cannot rise faster than $\ln^2 s$ asymptotically, but that limit is
far off. Other bounds are more relevant, for instance that diffractive cross sections cannot become
larger than the total one [199].
The most important hadronic cross sections are the pp and $p\bar{p}$ ones. Here, four different
σtot (s) parameterizations are available for high-energy collisions in PYTHIA, plus one placeholder,
see further the overview in ref. [200]. They are roughly ordered in increasing number of free
parameters tuned to data, with numbers corresponding to the options of the SigmaTotal:mode
switch.

0. A zero option allows the user to set any value at the currently studied energy, i.e. it does not
model any energy dependence.

1. The Donnachie–Landshoff (DL) form [201], with one pomeron and one reggeon term,
\[
\sigma_{tot}^{AB}(s) = X^{AB} s^{0.0808} + Y^{AB} s^{-0.4525} , \tag{239}
\]

with s in units of GeV$^2$ and σ in mb. The coefficients $X^{pp} = X^{p\bar{p}}$, as discussed above. There
is no such symmetry for the $Y^{AB}$, which can be viewed as having one even and one odd
reggeon, but with the same power.


2. The Minimum Bias Rockefeller (MBR) parameterization [202], which uses two different
expressions. Below 1.8 TeV the form is given by one pomeron and two reggeon terms,
whereof one odd and one even, with different δ. Above it a common Froissart-inspired
form like $a \ln^2 s + b \ln s + c$ is used.

3. The ABMST model [203] includes a soft and a hard pomeron, i.e. lower or higher δ > 0, an
even and an odd reggeon, plus terms for two-pomeron and triple-gluon exchange.

4. The COMPAS/RPP parameterization [204] contains a total of six even and six odd terms,
including pomeron, odderon, reggeon, and double-exchange ones.

The relevant cross section parameterizations are hard coded in options 1 – 4, and cannot easily
be changed.
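
As an illustration of eq. (239), the DL form can be evaluated directly. The coefficient values below (X = 21.70 mb, Y = 56.08 mb for pp and 98.39 mb for p̄p) are quoted here from memory of the DL fit and are not given in the text above, so they should be treated as assumptions and checked against ref. [201]; the function and variable names are ours.

```python
# Coefficients in mb. Assumed values recalled from the DL fit, see lead-in.
X_PP = 21.70       # pomeron term, taken equal for pp and ppbar
Y_PP = 56.08       # reggeon term for pp
Y_PPBAR = 98.39    # reggeon term for ppbar


def sigma_tot_dl(s, x, y):
    """Total cross section of eq. (239) in mb, with s in GeV^2."""
    return x * s**0.0808 + y * s**(-0.4525)


# Example: squared CM energy for a 13 TeV collider.
s_lhc = 13000.0**2
sigma_pp = sigma_tot_dl(s_lhc, X_PP, Y_PP)
sigma_ppbar = sigma_tot_dl(s_lhc, X_PP, Y_PPBAR)

# The pp/ppbar difference comes only from the reggeon term, which dies out
# at high energy as s^(-0.4525).
difference = sigma_ppbar - sigma_pp
```

With these inputs the pomeron term dominates completely at LHC energies, whereas at a CM energy of 20 GeV the reggeon term still contributes several mb.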

6.1.2 Proton elastic cross sections


Elastic cross sections are related to total ones via the optical theorem:

\[
\frac{d\sigma_{el}}{dt}(s, t = 0) = \frac{1 + \rho^2}{16\pi}\, \sigma_{tot}^2(s) , \tag{240}
\]
where t represents the squared momentum transfer between the initial and final proton on the
same side of the event. For detailed modelling, a suitable starting point is the elastic scattering
amplitude T (s, t), from which one derives $(d\sigma_{el}/dt)(s, t) \propto |T(s, t)|^2$, $\sigma_{tot} \propto \mathrm{Im}\, T(s, 0)$, and
ρ = Re T (s, 0)/Im T (s, 0). Typically ρ is close to 0 and can be neglected to first approximation.
The total elastic cross section is obtained by integration over t.
For the simple Regge-theory-motivated ansatz, that (dσel /dt)(s, t) ∝ exp(Bel t), one obtains

\[
\sigma_{el}(s) = \frac{1 + \rho^2}{16\pi}\, \frac{\sigma_{tot}^2}{B_{el}} , \tag{241}
\]

where, to a very good approximation at high energies, the t integration range has been extended
to [−∞, 0]. The ansatz also assumes that
\[
B_{el}^{AB}(s) = 2 b_A + 2 b_B + 2 \alpha' \ln(s/s_0) , \tag{242}
\]

where $b_{A,B}$ come from the respective hadronic form factors, with $b = 2.3$ GeV$^{-2}$ for unflavoured
baryons and 1.4 GeV$^{-2}$ for mesons, $\alpha' = dL/dm^2 = 0.25$ GeV$^{-2}$ is the slope of the pomeron
trajectory, and $s_0 = 1/\alpha' = 4$ GeV$^2$ is a typical hadronic scale [199, 205].
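
Equations (241) and (242) can be combined into a short numerical sketch. The only ingredient not stated above is the unit conversion (ħc)² ≈ 0.3894 mb GeV², needed because the slope is in GeV⁻² while cross sections are quoted in mb; the function names and the 100 mb input are placeholders of ours.

```python
import math

GEV2_TO_MB = 0.3894  # (hbar c)^2 in mb GeV^2, converts GeV^-2 to mb


def b_el(s, b_a=2.3, b_b=2.3, alpha_prime=0.25):
    """Elastic slope of eq. (242) in GeV^-2, with s in GeV^2.

    The defaults are the baryon form-factor values and pomeron slope
    quoted in the text.
    """
    s0 = 1.0 / alpha_prime
    return 2.0 * b_a + 2.0 * b_b + 2.0 * alpha_prime * math.log(s / s0)


def sigma_el(sigma_tot_mb, slope, rho=0.0):
    """Integrated elastic cross section of eq. (241) in mb."""
    sigma_tot = sigma_tot_mb / GEV2_TO_MB  # convert mb to GeV^-2
    sigma = (1.0 + rho**2) * sigma_tot**2 / (16.0 * math.pi * slope)
    return sigma * GEV2_TO_MB  # convert back to mb


# Example: pp at a CM energy of 13 TeV with a placeholder total cross section.
slope_lhc = b_el(13000.0**2)
sigma_el_lhc = sigma_el(100.0, slope_lhc)
```

Since σel grows like σtot²/Bel, a purely logarithmic Bel would eventually let σel overtake σtot for a power-like σtot, which is the motivation for the modified slope used in the SaS/DL option.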
In detail, the total-cross-section models above, as selected by SigmaTotal:mode, also handle
elastic scattering as follows.

0. It is possible for the user to set their own σel , Bel , and ρ at the current energy.

1. The original DL model was extended to Schuler and Sjöstrand (SaS)/DL [199] by the simple
Regge-theory ansatz above, but with the difference that the ln s dependence in eq. (242) is
replaced by an $s^{0.0808}$ term to ensure that σel does not grow faster than σtot asymptotically.
There is no modelling of ρ, but a value can be set by hand.

2. In MBR the ratio σel (s)/σtot (s) is parameterized, separately below and above 1.8 TeV, and
separately for pp and $p\bar{p}$ below it. Then eq. (241) is used to derive a Bel (s) slope, with ρ = 0.


3. In ABMST the fundamental building block is a complex scattering amplitude T (s, t), con-
taining the six terms of the total cross section, each with a separate t dependence, usually,
but not always, of an exponential character. From this, both total and elastic cross sections
are derived, including the ρ parameter.

4. Also the COMPAS/RPP parameterization starts out from a complex T (s, t), with the same
comments as for ABMST, except that the number of terms now is larger.
There are no further free parameters in the code, beyond the ones mentioned above.
So far, only strong interactions have been considered. But, since protons are charged particles,
there are also electromagnetic (EM) interactions. These are given by the traditional Coulomb
scattering cross section, $d\sigma_{el}/dt \propto \alpha_{em}^2/t^2$, which blows up in the t → 0 limit, i.e. for vanishing
scattering angle. Therefore, it is always necessary to specify a minimal angle or equivalently a
$t_{max} < 0$. There are two aspects that make it possible to disregard the EM contributions at the
LHC, except for special runs. Firstly, the EM contribution exceeds the strong one only below a |t|
of order $10^{-3}$ GeV$^2$, which corresponds to extremely small scattering angles. Secondly, owing to
this, inelastic EM collisions are completely negligible. By default, Coulomb corrections therefore
are not taken into account, but can be switched on.
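
The quoted |t| scale can be checked with a rough estimate. The sketch below assumes the point-like Coulomb normalization dσ/dt = 4πα²em/t² (the text above only gives the proportionality), takes the strong cross section at its forward value from eq. (240) with ρ = 0, and ignores form factors, the exponential t dependence, and interference. Setting the two equal gives |t| = 8παem/σtot. The names and the 100 mb input are placeholders.

```python
import math

ALPHA_EM = 1.0 / 137.036  # fine-structure constant at low scales
GEV2_TO_MB = 0.3894       # (hbar c)^2 in mb GeV^2


def coulomb_nuclear_crossover(sigma_tot_mb):
    """Estimate |t| in GeV^2 where Coulomb and strong elastic rates are equal.

    Solves 4 pi alpha^2 / t^2 = sigma_tot^2 / (16 pi) for |t|.
    """
    sigma_tot = sigma_tot_mb / GEV2_TO_MB  # convert mb to GeV^-2
    return 8.0 * math.pi * ALPHA_EM / sigma_tot


# Placeholder total cross section of 100 mb, typical for LHC energies.
t_cross = coulomb_nuclear_crossover(100.0)
```

Under these assumptions the crossover lies somewhat below 10⁻³ GeV², in line with the order of magnitude stated above.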
What complicates the issue is that the elastic scattering amplitude

\[
T(s, t) = T_n(s, t) + e^{i \alpha_{em} \phi(t)}\, T_c(s, t) , \tag{243}
\]

contains a phase factor in front of the Coulomb Tc amplitude, relative to the definition of the real
part of the nuclear/QCD Tn amplitude. Three different expressions for this phase are used: one
shared by SaS/DL, MBR, and SigmaTotal:mode = 0, and one each for ABMST and COMPAS/RPP. Although
written in slightly different ways, they give almost identical results.

6.1.3 Proton diffractive cross sections


Diffractive cross sections are differential in several variables: for single diffraction in t and $M_X$; for
double diffraction in t, $M_{X_1}$, and $M_{X_2}$; and for central diffraction in $t_1$, $t_2$, and $M_X$. Here, $M_X$ represents
the mass of the respective diffractive system. Alternatively the scaled variable $\xi = M_X^2/s$
is often used, but it is less intuitive when modelling contributions from low-mass resonances. The
fundamental objects are the differential expressions, and the integrated rates in general do not
have simple closed forms. Within Regge theory it is possible to relate the differential expressions
to those for total and elastic cross sections, with minor extensions. Specifically, single diffraction is modelled
with triple-Regge exchange graphs that involve the same pomeron (or reggeon) propagators
as before, but requires the introduction of triple-Regge couplings. If only pomerons are considered,
as could be reasonable at high energies, then mass spectra will behave roughly like $dM_X^2/M_X^2$
and t spectra roughly like exp(B t), where $B = B(s, M_X^2)$ depends on the process considered.
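
The pure-pomeron shapes just described are simple enough to sample by inverse transform, as sketched below. This is a toy illustration and not the PYTHIA algorithm: the mass limits, the slope function, and all names are assumptions of ours, and the kinematic limits on t are ignored.

```python
import math
import random


def sample_single_diffraction(s, m2_min, m2_max, slope_fn, rng):
    """Draw (M_X^2, t) from dM_X^2/M_X^2 * exp(B t), with t in (-inf, 0].

    slope_fn(s, m2) returns the slope B > 0 in GeV^-2.
    """
    # dM^2/M^2 is flat in ln(M^2), so interpolate logarithmically.
    m2 = m2_min * (m2_max / m2_min) ** rng.random()
    slope = slope_fn(s, m2)
    # For exp(B t) on t <= 0 the inverse CDF is t = ln(u) / B, u in (0, 1].
    t = math.log(1.0 - rng.random()) / slope
    return m2, t


def toy_sd_slope(s, m2, b_intact=2.3, alpha_prime=0.25):
    """Hypothetical slope in the spirit of eq. (242): form factor of the
    surviving hadron only, and a logarithm tied to the rapidity gap size."""
    return 2.0 * b_intact + 2.0 * alpha_prime * math.log(s / m2)


rng = random.Random(12345)
s = 13000.0**2
events = [
    sample_single_diffraction(s, 1.5, 0.05 * s, toy_sd_slope, rng)
    for _ in range(1000)
]
```

A realistic generator would in addition impose the resonance structure at low masses and the phase-space limits at high masses discussed next.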
In reality it is more complicated, for a number of reasons. At low masses the experimental
MX spectrum is not smooth, but reflects the presence of well defined N and ∆ resonance states.
At high masses phase-space restrictions kick in, e.g. in the allowed t range, as well as a wish
to minimize the overlap between diffractive and non-diffractive event topologies. In addition
to the pomeron also reggeons should be considered, in various combinations, contributing to
different mass distribution shapes and CM energy dependencies. Some terms increase faster with
CM energy than the total cross section itself, implying that the description has to break down at
some point. The solution to this is likely to involve the possibility of multiple exchanges of both a
diffractive and non-diffractive nature, leading to a competition between the two [206].


Three different diffractive models are implemented [200], matching the first three descrip-
tions of total and elastic cross sections, plus again an additional placeholder, enumerated in the
SigmaDiffractive:mode switch in the same way as in SigmaTotal:mode. It is possible to
combine the choice of total plus elastic and diffractive models freely.

0. One can set user-defined single-, double- and central-diffractive cross sections for the current
energy. In this option there is a choice between seven possible $M_X^2$ spectra, with related t
shapes.

1. The SaS model is based on pomeron contributions only, i.e. is of the form $(dM_X^2/M_X^2) \exp(B t)$
to first approximation. At low masses a smooth enhancement is added, to provide a simple
smeared-out further contribution from resonances. At large masses the rate is suppressed
to reduce the rate of diffractive events with small rapidity gaps. The rise of the diffractive
cross section with energy is given by integration. It turns out, however, that the initial ansatz
gives a steeper rise than data, so energy-dependent damping factors have been introduced.
Central diffraction is a rather recent addition, not included in many commonly used tunes,
and therefore not on by default. The B slope is similar in spirit to eq. (242), but without
any form factor contribution for protons that break up, and the logarithmic term is related
to the rapidity gap size, e.g. $\ln(s/M_X^2)$ for single diffraction.

2. In the MBR model the single-, double- and central-diffractive cross sections are given as
ratios of two integrals, one being the Regge cross section and the other a renormalized flux.
These are matched so that the increase of diffractive cross sections is kept at an acceptably
low rate. The differential distributions in $M_X^2$ and t are given by somewhat lengthier
expressions than in SaS, but qualitatively similar.

3. The ABMST model is the most sophisticated one, in terms of number of components con-
sidered. The single-diffractive description is split into two parts, for high- and low-mass
diffraction. The former includes PPP, PPR, RRP and RRR graphs, plus pion exchange,
each with a characteristic mass, t, and energy dependence. Four resonances are modelled
in the low-mass regime, along with a background from the high-mass regime and a contact
term matching the two regimes smoothly. The resonances are excited states of the proton,
each a unit of angular momentum higher than the previous one. Taken together, the ABMST
model gives a very good description of data at lower energies. Unfortunately the energy de-
pendence of some terms is too steep, such that single diffraction at the LHC is overestimated
by about a factor of two, and results at 100 TeV would be completely unphysical. A few dif-
ferent options have therefore been included in the PYTHIA implementation to damp this
rise [200]. Another problem is that ABMST does not model double diffraction. One expects
an approximate relationship [196]
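\[
\frac{d^3\sigma_{dd}}{dM_{X_1}^2\, dM_{X_2}^2\, dt} \approx \frac{d^2\sigma_{sd}}{dM_{X_1}^2\, dt}\, \frac{d^2\sigma_{sd}}{dM_{X_2}^2\, dt} \left( \frac{d\sigma_{el}}{dt} \right)^{-1} , \tag{244}
\]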

however, and this has been used to extend the modelling. Also central diffraction can be
introduced by a similar ansatz.

4. The COMPAS/RPP parameterization does not extend to diffraction, so there is no such op-
tion.
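
To see what the factorization of eq. (244) implies, it can be worked through for the simplest pure-pomeron shapes. The following is a hedged sketch: the normalizations $C_{sd}$ and $C_{el}$ are placeholders, and the single-diffractive slopes are assumed to follow the description given for the SaS option (form factor of the surviving hadron only, logarithm given by the rapidity gap size), which is simpler than the actual ABMST expressions.

```latex
\begin{align*}
\frac{d\sigma_{el}}{dt} &\simeq C_{el}\, e^{B_{el} t} ,
  & B_{el} &= 2 b_A + 2 b_B + 2 \alpha' \ln\frac{s}{s_0} , \\
\frac{d^2\sigma_{sd(XB)}}{dM_{X_1}^2\, dt} &\simeq \frac{C_{sd}}{M_{X_1}^2}\, e^{B_{XB} t} ,
  & B_{XB} &= 2 b_B + 2 \alpha' \ln\frac{s}{M_{X_1}^2} , \\
\frac{d^2\sigma_{sd(AX)}}{dM_{X_2}^2\, dt} &\simeq \frac{C_{sd}}{M_{X_2}^2}\, e^{B_{AX} t} ,
  & B_{AX} &= 2 b_A + 2 \alpha' \ln\frac{s}{M_{X_2}^2} , \\
\frac{d^3\sigma_{dd}}{dM_{X_1}^2\, dM_{X_2}^2\, dt} &\approx
  \frac{C_{sd}^2}{C_{el}}\, \frac{1}{M_{X_1}^2 M_{X_2}^2}\, e^{B_{dd} t} ,
  & B_{dd} &= B_{XB} + B_{AX} - B_{el}
            = 2 \alpha' \ln\frac{s\, s_0}{M_{X_1}^2 M_{X_2}^2} .
\end{align*}
```

In this approximation the form-factor terms cancel, as expected when neither hadron survives, and the double-diffractive slope is governed by the size of the rapidity gap between the two excited systems.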


Each of the models above contains a wide selection of modifiable parameters, specific to that diffraction
model. Both the integrated and the differential cross sections can be modified, notably affecting
the dependence on CM energy and the shape of the $M_X$ spectra.
In summary, the modelling of diffraction is highly nontrivial, and at a more primitive stage
than that of total and elastic cross sections. There also exist alternative starting points to the
Regge formalism we have worked with here, notably the Good–Walker one [207]. In it, it is
assumed that the interaction eigenstates do not agree with the mass ones. That is, an incoming
proton can be viewed as a coherent sum of interaction eigenstates. During the collision process,
parts of these eigenstates are absorbed to give rise to non-diffractive events. The remaining parts
of the incoming wave function can be projected back on to a spectrum of possible masses for
the outgoing object, including one component corresponding to elastic scattering. Actually the
“diffraction” name comes from the close analogue with optics, where an opaque disk put in a beam
of light absorbs part of the light but also generates a quantum mechanical diffraction pattern in the
remaining light. Such a picture implies that diffractive and elastic collisions are more peripheral
than non-diffractive ones. The same also holds in the Regge-language-related MPI framework to
be discussed in the next subsection, so even if the models seemingly are unrelated, there are many
common traits.

6.1.4 Other cross sections


Except for the absence of Coulomb elastic scattering, collisions involving (anti)neutrons are as-
sumed to have the same cross sections as (anti)protons in PYTHIA, and this similarity appears
supported by data [201]. Therefore all of the models above can be used for pn and nn collisions, and for the corresponding baryon-antibaryon ones.
For other hadron combinations, the only alternative beyond the user-defined option is an ex-
tension of the SaS/DL setup. It encompasses the following collision types.

• Combinations where σtot (s) was fitted by DL [201]: π+ p, π− p, K+ p, K− p, and γp.

• SaS extensions [208]: ρ 0 p, φ 0 p, J/ψp, ρ 0 ρ 0 , ρ 0 φ 0 , ρ 0 J/ψ, φ 0 φ 0 , φ 0 J/ψ, and J/ψJ/ψ.
Particles with identical flavour content are assumed to give identical cross sections. The
prime example is π0 , ρ 0 , and ω. The emphasis on the interactions of vector mesons is
related to the SaS modelling of γp and γγ physics, where an important aspect is that a real
photon can fluctuate into a vector meson like ρ 0 , ω, φ 0 , or J/ψ, and interact as such most
of the time (Vector Meson Dominance (VMD), see section 6.6 and section 6.7). Total γγ
cross sections are also provided.

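• Later extensions in the SaS/DL spirit [49]: $K^0 p$, $\eta p$, $\eta' p$, $D^{0,+} p$, $D_s^+ p$, $B^{0,+} p$, $B_s^0 p$, $B_c^+ p$, $\Upsilon p$,
$\Lambda p$, $\Xi p$, $\Omega p$, $\Lambda_c p$, $\Xi_c p$, $\Omega_c p$, $\Lambda_b p$, $\Xi_b p$, and $\Omega_b p$. Isospin symmetry is used to equate the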
cross sections of closely related particles, e.g. n with p and Σ+,0,−,∗+,∗0,∗− with Λ. For the
baryon-baryon processes, the corresponding baryon-antibaryon ones are also implemented.
The purpose of these extensions is to allow the tracing of the evolution of cascades in matter,
also in collisions with nuclei, meaning that essentially all hadronic collisions with protons
or neutrons have to be included.

• As a final case, a $\sigma_{tot}^{Pp}(s)$ is defined for pomerons, but more for model studies of diffractive
systems at a given mass than for comparisons with data.

In summary, by suitably mapping a particle onto one of equivalent flavour content, the possibil-
ities above cover a fair fraction of all possible hadron collisions. The main exceptions are those
involving baryons with more than one charm or bottom quark, and (most) collisions between two
short-lived particles. Should the need arise, further extensions along the same lines would be
possible for these cases.
It should be clear from the outset that the accuracy expected for these cross sections cannot
compare with the pp and $p\bar{p}$ ones. As a rule of thumb, the rarer the particle, the more uncertain
the assumptions that went into deriving related cross sections. For many applications, notably
the evolution of a cascade in matter, it is the average collision rates that count rather than the
individual ones, however, so one may assume that it should still work out reasonably well.
The starting point in all these total cross sections is the pomeron plus reggeon ansatz of
eq. (239). The $X^{AB}$ pomeron term strength appears to obey the Additive Quark Model (AQM)
rule [209, 210], i.e. be proportional to the number $n_i$ of valence quarks in each hadron, but
with a reduced contribution for strange and heavier quarks. Thus we have made the ansatz that
$X^{AB} \propto n_{eff}^A\, n_{eff}^B$, with
\[
n_{eff} = n_d + n_u + 0.6\, n_s + 0.2\, n_c + 0.07\, n_b . \tag{245}
\]
The prefactors for heavier quarks have been assumed roughly inversely proportional to their re-
spective constituent quark masses, which could be viewed as a reflection of a reduced size of their
spatial wave functions.
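
The AQM counting of eq. (245) is easy to tabulate. In the sketch below the hadron content is passed as a string of valence flavours, with antiquarks counted like quarks; this encoding, the function names, and the normalization to the proton are conventions chosen here for illustration only.

```python
# Flavour weights of eq. (245).
AQM_WEIGHTS = {"d": 1.0, "u": 1.0, "s": 0.6, "c": 0.2, "b": 0.07}


def n_eff(valence):
    """Effective number of valence quarks for a flavour string such as 'uud'."""
    return sum(AQM_WEIGHTS[flavour] for flavour in valence)


def x_ab(valence_a, valence_b, x_pp):
    """Pomeron coefficient X^AB, scaled from the pp reference value x_pp."""
    n_proton = n_eff("uud")
    return x_pp * n_eff(valence_a) * n_eff(valence_b) / n_proton**2


# Ratios relative to pp, obtained by setting the reference value to one.
ratio_pion_p = x_ab("ud", "uud", x_pp=1.0)     # pi+ p
ratio_kaon_p = x_ab("us", "uud", x_pp=1.0)     # K+ p
ratio_lambda_p = x_ab("uds", "uud", x_pp=1.0)  # Lambda p
```

For example, a pion has two thirds of the proton's effective quark number, so the pomeron term in πp is two thirds of that in pp under this ansatz.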
The modelling of the $Y^{AB}$ reggeon factors is considerably less systematic, since typically several
reggeon trajectories may contribute. The mix of charge-even and charge-odd contributions
gives $Y^{AB} \neq Y^{A\bar{B}}$, while $X^{AB} = X^{A\bar{B}}$. For baryon collisions $Y^{A\bar{B}} > Y^{AB}$, which can be viewed as a
reflection of possible contributions from $q\bar{q}$ annihilation graphs. This is supported by the observation
that $Y^{\phi p} \approx 0$, consistent with the OZI rule [211–213], and we assume that this suppression
of couplings between light u/d quarks and s quarks extends to c and b. Thus, for baryons, the
reggeon $Y^{AB}$ and $Y^{A\bar{B}}$ values are assumed proportional to the number of u/d quarks only, scaled
separately from the $Y^{pp}$ and $Y^{p\bar{p}}$ reference values. Thereby baryons with the same flavour content,
or only differing by the relative composition of u and d quarks, are taken to be equivalent, i.e.
$\sigma^{\Lambda p}(s) = \sigma^{\Sigma^+ p}(s) = \sigma^{\Sigma^0 p}(s) = \sigma^{\Sigma^- p}(s)$. Another simplification is that $\bar{D}/\bar{B}$ mesons are assigned
the same cross sections as the respective D/B, taken to be some average.
The $B_{el}^{AB}$ slope for hadronic collisions is defined as in eq. (242), with a universal $\alpha'$ but $b_{A,B}$
taken to be 1.4 GeV$^{-2}$ for mesons and 2.3 GeV$^{-2}$ for baryons, except that mesons made only out of c and b
quarks are assumed to be more tightly bound and thus have lower values, in the spirit of the
AQM factors. Given this, and assuming ρ ≈ 0, the integrated elastic cross sections are given by
eq. (241). For photon interactions, only the VMD part is assumed to undergo “elastic” scatterings,
so the fractions of fluctuations to ρ 0 , ω, φ 0 , or J/ψ are combined with the expressions for these
respective states to scatter elastically.
Also diffractive cross sections are calculated using the SaS ansatz. The differential formulae are
integrated numerically for each relevant collision process and the result suitably parameterized,
including a special threshold-region ansatz [214]. If the hadronic form factor from pomeron-
driven interactions is written as βAP (t) = βAP (0) p exp(bA t) then, with suitable normalization,
X AB = βAP (0) βBP (0). Thus we can define βpP (0) = X pp and other βAP (0) = X Ap /βpP (0). These
numbers enter in the prefactor of single diffractive cross sections, e.g. σAB→AX ∝ βAP2
(0) βBP (0) = X AB βAP (0).
This relation comes about since the A side scatters (semi)elastically, while the B side description
is an inclusive one, cf. the optical theorem. In double diffraction $AB \to X_1 X_2$ neither side is elastic
and the rate is directly proportional to $X^{AB}$. For photons again only the VMD parts undergo
diffractive scatterings.
The descriptions mentioned so far are intended for cross sections at high energies. Specifically,
the original DL ansatz was tuned to data down to 6 GeV. At low energies, different descriptions are
used, as outlined in the next subsection, most of which are not intended to be used much above
10 GeV. In cases where the full energy range from threshold upwards needs to be used, a smooth
interpolation is therefore applied between the low- and high-energy descriptions. More precisely,
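the transition is linear in the range between
$$E_{\mathrm{CM}}^{\mathrm{begin}} = E_{\min} + \max(0, m_A - m_p) + \max(0, m_B - m_p) \quad \text{and} \tag{246}$$
$$E_{\mathrm{CM}}^{\mathrm{end}} = E_{\mathrm{CM}}^{\mathrm{begin}} + \Delta E , \tag{247}$$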

where Emin is 6 GeV and ∆E is 8 GeV by default.
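
As an illustration of eqs. (246) and (247), the following Python sketch blends a low-energy and a high-energy cross-section description linearly across the transition window. The function names and the callables `sigma_low` and `sigma_high` are hypothetical placeholders and not part of the PYTHIA interface; only the default values of 6 GeV and 8 GeV are taken from the text, and the proton mass is inserted by hand.

```python
M_PROTON = 0.93827  # proton mass in GeV (inserted by hand for this sketch)


def transition_range(m_a, m_b, e_min=6.0, delta_e=8.0):
    """Return (E_begin, E_end) in GeV, following eqs. (246) and (247)."""
    e_begin = e_min + max(0.0, m_a - M_PROTON) + max(0.0, m_b - M_PROTON)
    return e_begin, e_begin + delta_e


def interpolated_sigma(e_cm, m_a, m_b, sigma_low, sigma_high):
    """Blend two cross-section descriptions linearly in E_CM.

    sigma_low and sigma_high are hypothetical callables returning the
    low-energy and high-energy cross sections at a given E_CM.
    """
    e_begin, e_end = transition_range(m_a, m_b)
    if e_cm <= e_begin:
        return sigma_low(e_cm)
    if e_cm >= e_end:
        return sigma_high(e_cm)
    weight = (e_cm - e_begin) / (e_end - e_begin)  # 0 at E_begin, 1 at E_end
    return (1.0 - weight) * sigma_low(e_cm) + weight * sigma_high(e_cm)
```

For two protons the window is 6 to 14 GeV, so at 10 GeV the two descriptions would enter with equal weight in this sketch.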

6.1.5 Low-energy processes


At low energies (below ∼ 10 GeV), the perturbative framework described in this section breaks
down. In modern high-energy physics, experimental beam energies lie far above this threshold,
but processes at these energies still have applications for example in hadronic rescattering (see
section 7.4). PYTHIA provides a framework for simulating such low-energy collisions. This frame-
work can be used explicitly by enabling LowEnergyQCD:* processes, and is used implicitly inside
PYTHIA when rescattering is turned on. The following gives a summary of the available low-energy
processes:

Elastic scattering $AB \to AB$ is implemented similarly to elastic scattering at high energies, except
the cross section is calculated differently, as described below.

Diffractive scattering (both single and double) is also similar to how it is implemented at high
energies. Central diffractive (AXB) has a very small cross section at low energies, and is thus
not implemented. In addition, at low energies the diffractive system can sometimes be viewed
as a resonance excitation, for example $pp \to p\Delta^+$. In PYTHIA 8.3, these excitation processes are
implemented only for nucleon-nucleon interactions.

Non-diffractive scattering works similarly in principle to high-energy non-diffractive interactions,
but with extra steps to ensure the process does not reduce to an elastic scattering at energies
very close to the threshold.

Annihilation processes: baryon-antibaryon interactions where one or two quarks annihilate.

Resonance formation: a meson interacting with a baryon or another meson can form a resonance
particle, e.g. $p\pi^+ \to \Delta^{++}$ or $\pi^+\pi^- \to \rho^0$.
While several of these processes correspond to similar high-energy processes, their cross sections
are in most cases calculated differently, as perturbative calculations cannot be used at these ener-
gies. Only a short overview of how the cross sections are calculated is given here, and the reader
is referred to ref. [214] for further details.
When PDG data is available (https://pdg.lbl.gov/2018/hadronic-xsections/hadron.html [215]), total cross
sections are calculated using parameterizations or by fitting to data. The HPR$_1$R$_2$ parameterization
is used when available, as is the case for e.g. nucleon-nucleon interactions [215]. For
baryon-antibaryon interactions, a parameterization due to UrQMD
is used [216]. ππ and πK interactions use parameterizations by Pelaez et al. [217–219]. For other
processes involving mesons, if the pair can form resonances, the total cross section is calculated
by summing the contributions from each resonance, possibly also adding an elastic contribution.
While these cases describe the most common processes, there is also a large set that is not covered.
For these remaining processes, the total cross section is calculated using the additive quark model
(AQM) [209, 210] with small modifications introduced to also include charm and bottom quarks
[214]. Specifically, the total AQM cross section is given by

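$$\sigma_{\mathrm{AQM}}^{AB} = (40\ \mathrm{mb})\, \frac{n_{\mathrm{eff}}^{A}}{3}\, \frac{n_{\mathrm{eff}}^{B}}{3} , \tag{248}$$
where $n_{\mathrm{eff}}$ is the “effective” number of quarks in each hadron, given by eq. (245). With this,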
low-energy processes are available for all possible hadron-hadron types.
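
Eq. (248) is simple enough to be written out directly. The sketch below takes the effective quark numbers as inputs rather than computing them, since eq. (245) is defined elsewhere; the function name is chosen here for illustration only.

```python
def sigma_aqm_total(n_eff_a, n_eff_b, sigma_ref_mb=40.0):
    """Total AQM cross section in mb, following eq. (248).

    n_eff_a and n_eff_b are the effective quark numbers of the two
    hadrons, to be supplied by the caller according to eq. (245).
    """
    return sigma_ref_mb * (n_eff_a / 3.0) * (n_eff_b / 3.0)
```

Two hadrons with $n_{\mathrm{eff}} = 3$ each give the reference value of 40 mb, while lowering one of them to $n_{\mathrm{eff}} = 2$ scales the result by 2/3.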
In our description, we define elastic interactions as processes where the hadrons exchange
momenta without ever changing their types, e.g. through a pomeron exchange. We do not include
for example “pseudo-elastic” scattering through a resonance, AB → R → AB. Note that this dis-
tinction usually cannot be made experimentally, so one often considers a process elastic as long
as the outgoing hadrons are of the same type as the incoming ones. For nucleon-nucleon and
nucleon-pion interactions, the elastic cross section is found by fitting to data below 5 GeV [215],
and by using the CERN/HERA parameterization above 5 GeV [220]. Elastic cross sections for
baryon-antibaryon interactions are calculated using another parameterization by UrQMD [216],
and for π π and π K, we use parameterizations by Pelaez et al. [217–219]. Other cross sections are
given by an elastic AQM-style parameterization. The angular distribution of the outgoing hadrons
is the same as for the high-energy case (section 6.1.2).
Diffractive cross sections are calculated using the SaS model [199, 208], with two modifications.
First, the basic model is designed to deal with processes involving only p, $\bar{p}$, π, ρ, ω, and φ
hadrons. In the low-energy framework, the generic case is calculated by replacing each incoming
hadron by the most similar among these particles (e.g. treating each baryon as a proton), then
rescaling the calculated cross section by the appropriate AQM factor. The second modification is
due to the fact that the basic SaS model is intended for collision energies above 10 GeV. This is
compensated for by multiplying by an ad hoc factor below 10 GeV. At low energies, diffractive
processes can lead to the formation of explicit resonances, e.g. $pp \to p\Delta^+$. This is implemented in
PYTHIA 8.3 only for nucleon-nucleon interactions, using the description by UrQMD [216].
Non-diffractive cross sections are calculated by subtracting all other partial cross sections from
the total cross section. One important difference between non-diffractive interactions at low and
high energies is that at low energies, the hadronization process might sometimes produce a hadron
pair that is the same as the incoming pair, essentially reducing the interaction to an elastic process
($AB \to X_1 X_2 \to AB$). This is a problem in cases where the calculated elastic cross section
has already been adjusted to fit data. Several steps are taken to ensure that this does not give
unexpected contributions to the elastic cross section, and are outlined in section 7.1.6.
Annihilation processes in our framework refer to baryon-antibaryon interactions where one
or two quark-antiquark pairs are annihilated. Strings are drawn between the remaining
quark-antiquark pairs, and hadronize to form outgoing hadrons. The cross section for annihilation in $p\bar{p}$
is given by a parameterization by Koch and Dover [221],
$$\sigma_{\mathrm{ann}} = 120\, \frac{s_0}{s} \left( \frac{A^2 s_0}{(s - s_0)^2 + A^2 s_0} + 0.6 \right) , \tag{249}$$
where $s_0 = 4m_p^2$ and A = 0.05 GeV. The cross section for other $B\bar{B}$ interactions is found by rescaling
this value by an AQM factor. The only exception is when the quark contents make annihilation
impossible, e.g. in a $\Delta^{++} + \bar{\Sigma}$ interaction, in which case the annihilation cross section is set
to zero.
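
A direct transcription of eq. (249) into Python may help to see its behaviour near threshold. This is a sketch only: the function name is ours, the proton mass is inserted by hand, and the result is assumed to be in mb, which the text does not state explicitly.

```python
M_PROTON = 0.93827  # proton mass in GeV (inserted by hand for this sketch)


def sigma_annihilation_ppbar(e_cm, a=0.05, m_p=M_PROTON):
    """Annihilation cross section of eq. (249), assumed to be in mb.

    e_cm is the CM energy in GeV and a is the parameter A in GeV.
    """
    s = e_cm ** 2
    s0 = 4.0 * m_p ** 2
    peak_term = a ** 2 * s0 / ((s - s0) ** 2 + a ** 2 * s0)
    return 120.0 * (s0 / s) * (peak_term + 0.6)
```

At threshold, $s = s_0$, the bracket equals 1.6, while far above threshold the first term dies out and the cross section falls like $0.6 \cdot 120\, s_0/s$.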
Finally, resonance production refers to processes where the two hadrons combine to form one
resonance particle. The cross section for the process AB → R is given by a non-relativistic Breit–
Wigner [215],
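$$\sigma_{AB\to R} = \frac{\pi}{p_{\mathrm{CM}}^2}\, \frac{(2S_R + 1)}{(2S_A + 1)(2S_B + 1)}\, \frac{\Gamma_{R\to AB}\, \Gamma_R}{(m_R - \sqrt{s})^2 + \frac{1}{4} \Gamma_R^2} , \tag{250}$$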
where S is the spin of each particle, pCM is the CM momentum of the incoming particles, ΓR→AB
is the mass-dependent partial width of the decay R → AB, and ΓR is the mass-dependent total
width of R. The full list of implemented resonances is given in ref. [214]. For π π and π K where
the total cross section is calculated using the parameterization by Pelaez et al., the partial cross
sections are rescaled to ensure their sum equals the total cross section.
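
The Breit–Wigner form of eq. (250) can be sketched as follows. The widths are passed in as plain numbers, i.e. the caller is assumed to have evaluated the mass-dependent widths at the current energy already; the function names and the conversion constant $(\hbar c)^2 \approx 0.389379$ GeV$^2$ mb are additions made for this illustration.

```python
import math

GEV2_TO_MB = 0.389379  # (hbar c)^2 in GeV^2 mb, converts GeV^-2 to mb


def cm_momentum(e_cm, m_a, m_b):
    """CM momentum (GeV) of two particles with masses m_a, m_b at energy e_cm."""
    s = e_cm ** 2
    kallen = (s - (m_a + m_b) ** 2) * (s - (m_a - m_b) ** 2)
    return math.sqrt(max(kallen, 0.0)) / (2.0 * e_cm)


def sigma_resonance(e_cm, p_cm, m_r, gamma_total, gamma_partial, s_r, s_a, s_b):
    """Cross section (mb) for AB -> R, following eq. (250).

    gamma_total and gamma_partial are the total and R -> AB widths in GeV,
    already evaluated at the current energy; s_r, s_a, s_b are the spins.
    """
    spin_factor = (2 * s_r + 1) / ((2 * s_a + 1) * (2 * s_b + 1))
    breit_wigner = gamma_partial * gamma_total / (
        (m_r - e_cm) ** 2 + 0.25 * gamma_total ** 2
    )
    return GEV2_TO_MB * math.pi / p_cm ** 2 * spin_factor * breit_wigner
```

As a rough cross-check, taking $p\pi^+ \to \Delta^{++}$ at $\sqrt{s} = m_R = 1.232$ GeV with a branching fraction of one gives a peak value somewhat below 200 mb in this sketch.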

6.2 Multiparton interactions basics


Hadrons are composite objects. A proton consists of three valence quarks, plus countless gluons
and sea quarks. When two hadrons collide there is a possibility for several parton pairs to
collide: multiparton interactions (MPIs). Processes with exactly two parton pairs, double parton
scattering (DPS), were proposed in the early days of QCD, but then viewed as a rare perturbative
process [222, 223]. Regge–Gribov theory, on the other hand, allowed for events with multiple
cut pomerons, i.e. several “strings” crossing from one rapidity end of the event to the other, each
generating its sequence of low-p⊥ hadrons [224]. The PYTHIA philosophy for the first time in-
troduced a merger and extension of these two approaches [225]. In it, semiperturbative MPIs
both generate multiple minijets, that contribute to the p⊥ flow, and multiple colour connections
between the beam remnants, that lead to events with higher multiplicity. This picture is now
generally accepted in its essentials. An overview of MPI theory and phenomenology can be found
in ref. [226], with the PYTHIA perspective described in ref. [227], with many further references.
See also the online manual under the “Multiparton Interactions” heading.

6.2.1 The perturbative cross section


The p⊥ -differential perturbative QCD 2 → 2 cross section can, to leading order, be written as

$$\frac{d\sigma}{dp_\perp^2} = \sum_{i,j,k} \iiint f_i(x_1, Q^2)\, f_j(x_2, Q^2)\, \frac{d\hat\sigma_{ij}^{k}}{d\hat t}\, \delta\!\left( p_\perp^2 - \frac{\hat t\, \hat u}{\hat s} \right) dx_1\, dx_2\, d\hat t , \tag{251}$$

with $Q^2 = p_\perp^2$ as factorization and renormalization scale, partons assumed massless, and k running
over processes with the same initial state but different final states (cf. eq. (36) and eq. (41)). The
partonic cross section $d\hat\sigma/d\hat t$ is dominated by t-channel gluon exchange, i.e. $qq' \to qq'$, $qg \to qg$
and $gg \to gg$ (including those u-channel graphs that can easily be relabelled into t-channel ones).
This cross section has an approximate behaviour

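$$\frac{d\hat\sigma}{d\hat t} \propto \frac{\alpha_s^2(Q^2)}{\hat t^{\,2}} \quad \Rightarrow \quad \frac{d\hat\sigma}{dp_\perp^2} \propto \frac{\alpha_s^2(p_\perp^2)}{p_\perp^4} . \tag{252}$$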

[Figure 10, two panels. (a) “QCD integrated cross sections”: $\sigma_{\mathrm{int}}(p_{\perp\min})$ or $\sigma_{\mathrm{tot}}$ (mb) versus $p_{\perp\min}$ (GeV), with curves for 100 TeV, 13 TeV, 2 TeV and 200 GeV. (b) “QCD differential cross sections”: $d\sigma/dp_\perp$ (mb/GeV) versus $p_\perp$ (GeV), with curves labelled “standard” and “damped”.]

Figure 10: (a) Integrated standard 2 → 2 QCD cross section as a function of the lower
cutoff p⊥min for pp collisions at 200 GeV, 2 TeV, 13 TeV and 100 TeV, respectively. Hori-
zontal dashed lines give the total cross section at their respective energy. (b) Differential
2 → 2 QCD cross section at 13 TeV, as obtained in standard perturbation theory, and af-
ter multiplication by the damping factor eq. (253). Minor breaks in slopes come from
transitions, notably the freeze of PDFs below 1 GeV. Results have been obtained for the
default PYTHIA 8.3 setup, and details depend e.g. on the choice of PDF set.

Evidently this cross section is divergent in the limit p⊥ → 0, as shown in fig. 10. The integrated
2 → 2 cross section above some p⊥min scale, σint (p⊥min ), is increasing with the pp collision energy.
But, taking p⊥min = 1 GeV as a scale where perturbation theory would be expected to hold, already
at a collision energy of 200 GeV, the σint value exceeds the total pp cross section σtot at this energy.
A further aspect is that σtot is subdivided into different components, as already discussed, and
the 2 → 2 partonic interactions primarily occur within the non-diffractive one, which is what we
will assume next. They are absent in elastic scattering and low-mass diffraction, while they can
occur in high-mass diffraction. This is a small fraction of the total cross section, however, so to
first approximation we may neglect it. Later on we will correct the picture.
Putting it together, one finds that σint (p⊥min ) is around 60 mb for p⊥min ≈ 5 GeV at LHC
energies, which is also the order of the non-diffractive pp cross section σnd . Going to lower p⊥min
scales the cross section rapidly explodes, σint (2 GeV) ≈ 1000 mb ≈ 15 σnd . In the context of MPIs,
this is not as bad as it may sound, since we may interpret the ratio σint (p⊥min )/σnd as the average
number of MPIs above the p⊥min scale that occur in a non-diffractive collision. Nevertheless, an
infinity of MPIs in the p⊥min → 0 limit is not attractive.
A damping of the cross section at low p⊥ can be viewed as a consequence of colour screening:
in the p⊥ → 0 limit a hypothetical exchanged gluon would not resolve individual partons but only
(attempt to) couple to the vanishing net colour charge of the hadron. By contrast, traditional
perturbation theory is based on the assumption of asymptotically free incoming and outgoing
partons. To be specific, a multiplicative damping factor
$$\left( \frac{\alpha_s(p_{\perp 0}^2 + p_\perp^2)}{\alpha_s(p_\perp^2)}\, \frac{p_\perp^2}{p_{\perp 0}^2 + p_\perp^2} \right)^{2} \tag{253}$$

is introduced, with p⊥0 a free parameter. This means a modification to eq. (252)

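$$\frac{d\hat\sigma}{dp_\perp^2} \sim \frac{\alpha_s^2(p_\perp^2)}{p_\perp^4} \longrightarrow \frac{\alpha_s^2(p_{\perp 0}^2 + p_\perp^2)}{(p_{\perp 0}^2 + p_\perp^2)^2} , \tag{254}$$

which is finite in the limit p⊥ → 0, cf. fig. 10b.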
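
The effect of the damping factor in eq. (253) can be checked numerically with a few lines of Python. This is a sketch, not the PYTHIA implementation: the optional `alpha_s` argument is a hypothetical callable taking a scale $Q^2$, and if it is omitted a fixed coupling is assumed, so that the $\alpha_s$ ratio drops out.

```python
def damping_factor(pt, pt0, alpha_s=None):
    """Multiplicative damping factor of eq. (253).

    alpha_s is an optional, hypothetical callable alpha_s(Q2); if None,
    a fixed coupling is assumed and only the kinematic part remains.
    Note that a running alpha_s(pt^2) is itself ill-defined for pt -> 0.
    """
    pt2, pt02 = pt * pt, pt0 * pt0
    kinematic = (pt2 / (pt02 + pt2)) ** 2
    if alpha_s is None:
        return kinematic
    return (alpha_s(pt02 + pt2) / alpha_s(pt2)) ** 2 * kinematic


def regularized_shape(pt, pt0):
    """Right-hand side of eq. (254) for fixed alpha_s, up to a constant."""
    return 1.0 / (pt0 * pt0 + pt * pt) ** 2
```

For a fixed coupling the factor equals 1/4 at $p_\perp = p_{\perp 0}$, approaches unity for $p_\perp \gg p_{\perp 0}$, and multiplying it by $1/p_\perp^4$ reproduces the regularized shape.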


The p⊥0 value is not provided from first principles, although suggestions have been made to
equate it with the saturation scale $Q_s$ in colour glass condensate models [228, 229]. Fits to $pp/p\bar{p}$
data give a result that increases with energy, by default like
$$p_{\perp 0}(E_{\mathrm{CM}}) = (2.28\ \mathrm{GeV}) \left( \frac{E_{\mathrm{CM}}}{7\ \mathrm{TeV}} \right)^{0.215} , \tag{255}$$
but alternatively a logarithmic rise could be assumed. It should be noted that results are sensitive
to the choice of PDF set, and especially to the low-x behaviour of the gluon distribution at small
Q2 . The numbers are for the default NNPDF2.3 QCD+QED LO αs (MZ ) = 0.130 set [230]. The
choice of an LO PDF is deliberate, since the description of partonic collisions is also an LO one, but
in particular since NLO PDFs tend to become unphysical at small x and Q2 . This is why PYTHIA
offers the possibility to use two different sets of PDFs, one for the hard processes, where these
kinematic regions are not accessed, and one for MPIs and showers, where often they are.
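
The default power-law energy dependence of eq. (255) is easy to evaluate by hand. The sketch below uses our own function and argument names; to our understanding the three numbers correspond to the PYTHIA settings `MultipartonInteractions:pT0Ref`, `MultipartonInteractions:ecmRef` and `MultipartonInteractions:ecmPow`, but the online manual should be consulted for the authoritative names and defaults.

```python
def pt0_of_ecm(e_cm, pt0_ref=2.28, e_cm_ref=7000.0, power=0.215):
    """p_T0 in GeV as a function of the CM energy in GeV, following eq. (255)."""
    return pt0_ref * (e_cm / e_cm_ref) ** power


if __name__ == "__main__":
    for e_cm in (200.0, 2000.0, 7000.0, 13000.0, 100000.0):
        print(f"E_CM = {e_cm:8.0f} GeV  ->  pT0 = {pt0_of_ecm(e_cm):.3f} GeV")
```

As stressed in the text, such values are only meaningful together with the PDF set for which the reference value was tuned.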
The range of x values that can be accessed by MPI in PYTHIA is illustrated by the thick black
lines in fig. 11, for hadronic CM energies ranging from 10 GeV (at the left-hand edge of the plot)
to 100 TeV (at the right-hand edge). The shaded area emphasizes the region of low $x \le 10^{-4}$
in which current PDFs are uncertain by a factor two or more. The red dashed line indicates the
solution to $x^2 s = 4p_{\perp 0}^2$, for the default form of $p_{\perp 0}(E_{\mathrm{CM}})$ given by eq. (255). Any partonic collision
with p̂⊥ ∼ p⊥0 will involve at least one x value below this line. Thus, especially at LHC energies
and beyond, it is important to keep in mind that the effective MPI cross section (and hence any
observables derived from it) around p̂⊥ ∼ p⊥0 really depends on the combination of p⊥0 and
the shape of the low-x PDF parameterization. Since the latter can change drastically between
different PDF sets, any “tuned” values of p⊥0 should be considered valid only for the PDF set they
were obtained with.

6.2.2 The impact-parameter model


A hadron is characterized not only by its longitudinal structure, as encoded in the PDFs, but also
by its transverse one. That is, the “impact parameter” plane overlap of partons in the two hadrons
influences the possible collisions. The hadrons are Lorentz contracted to pancake shapes in
high-energy collisions, such as those at the LHC, so the partons can be considered as frozen during the short
collision time.
[Figure 11 plot: $x_{\mathrm{MPI}}$ (from $10^{-8}$ to 0.01) versus $E_{\mathrm{CM}}$ [GeV] (from 100 to $10^5$); labels in the plot: x(pT0, y=0) and s > (1 GeV)$^2$.]

Figure 11: Range of x values accessible to MPI in PYTHIA, for 10 GeV < ECM < 100 TeV.
Scatterings at p̂⊥ ∼ p⊥0 will involve at least one x fraction below the red dashed line.
Grey shading highlights the low-x extrapolation region $x < 10^{-4}$ in which current PDFs
are uncertain by a factor two or more.

As a first approximation we will assume a common spatial distribution $\rho(\mathbf{x})\, d^3x = \rho(r)\, d^3x$
for all parton types and momenta in a hadron. In the collision between two hadrons, passing by
at an impact parameter b, the overlap between the two distributions is then given by
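$$\begin{aligned}
\tilde{\mathcal{O}}(b) &= \iint d^3x\, dt\; \rho_{\mathrm{boosted}}\!\left( x - \frac{b}{2}, y, z - vt \right) \rho_{\mathrm{boosted}}\!\left( x + \frac{b}{2}, y, z + vt \right) \\
&\propto \iint d^3x\, dt\; \rho(x, y, z)\, \rho\!\left( x, y, z - \sqrt{b^2 + t^2} \right) ,
\end{aligned} \tag{256}$$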

where the second line is obtained by suitable scale changes.


A few different ρ distributions have been studied and made available as options. Using Gaussian
distributions is especially convenient, since the convolution then becomes trivial. However, a
single Gaussian does not give a good enough description of the data, and a better description is
obtained with a sum of two Gaussians, with a small core region embedded in a larger hadron.
This can be viewed as a manifestation of the “hot spot” concept [231, 232], wherein partons may
tend to cluster in a few small regions, typically associated with the three valence quarks, as a
consequence of partons cascading from them. Another alternative, that is currently the default, is
a one-parameter shape
$$\tilde{\mathcal{O}}(b) \propto \exp\left( -b^{d} \right) , \tag{257}$$
where d < 2 gives more fluctuations than a Gaussian and d > 2 less. The default value is d = 1.85,
i.e. slightly more peaked than a Gaussian. Note that the expression is for the overlap, not for the
individual hadrons, for which no related simple analytic form is available.
It is now assumed that the interaction rate, to first approximation, is proportional to the overlap
$$\langle \tilde{n}_{\mathrm{MPI}}(b) \rangle = k\, \tilde{\mathcal{O}}(b) . \tag{258}$$
Interactions are assumed to occur independently of each other for a given b, to first approximation,
which leads to a Poissonian number distribution. Zero interactions means that the hadrons pass
each other without interacting. The $\tilde{n}_{\mathrm{MPI}}(b) \ge 1$ interaction probability therefore is
$$P_{\mathrm{int}}(b) = 1 - \exp\left( -\langle \tilde{n}_{\mathrm{MPI}}(b) \rangle \right) = 1 - \exp\left( -k\, \tilde{\mathcal{O}}(b) \right) . \tag{259}$$


We notice that $k \tilde{\mathcal{O}}(b)$ is essentially the same as the eikonal $\Omega(s, b) = 2\, \mathrm{Im}\chi(s, b)$ of optical
models [233–236], but split into one piece $\tilde{\mathcal{O}}(b)$ that is purely geometrical and one $k = k(s)$ that
carries the information on the parton-parton interaction cross section.
Simple algebra shows that the average number of interactions in events, i.e. hadronic passes
with $n_{\mathrm{MPI}} \ge 1$, is given by
$$\langle n \rangle = \frac{\int k\, \tilde{\mathcal{O}}(b)\, d^2b}{\int P_{\mathrm{int}}(b)\, d^2b} = k \langle \tilde{\mathcal{O}} \rangle = \frac{1}{\sigma_{\mathrm{nd}}} \int_0^{s/4} \frac{d\sigma}{dp_\perp^2}\, dp_\perp^2 , \tag{260}$$

which fixes the absolute value of k (numerically). We have also taken the occasion to introduce
$\langle \tilde{\mathcal{O}} \rangle$ as the average overlap. Hence $\tilde{\mathcal{O}}(b)/\langle \tilde{\mathcal{O}} \rangle$ represents the enhancement at small b and depletion
at large b.
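
The numerical determination of k from eq. (260) can be illustrated with a small Python sketch. It assumes the one-parameter overlap shape $\exp(-b^d)$ with b in arbitrary units and takes the desired average number of interactions as an input, whereas in PYTHIA this number follows from the integrated cross section divided by $\sigma_{\mathrm{nd}}$. All function names, the integration grid and the bisection bounds are choices made here for illustration.

```python
import math


def overlap(b, d=1.85):
    """Unnormalized overlap exp(-b^d), with b in arbitrary units."""
    return math.exp(-(b ** d))


def _integrate_d2b(f, b_max=8.0, n=2000):
    """Integral of f(b) d^2b = 2 pi int f(b) b db, using the midpoint rule."""
    h = b_max / n
    return 2.0 * math.pi * h * sum(f((i + 0.5) * h) * (i + 0.5) * h for i in range(n))


def mean_n_mpi(k, d=1.85):
    """Average number of MPIs in events with at least one, as in eq. (260)."""
    numerator = _integrate_d2b(lambda b: k * overlap(b, d))
    # P_int(b) = 1 - exp(-k O(b)); expm1 keeps precision for small k O(b).
    denominator = _integrate_d2b(lambda b: -math.expm1(-k * overlap(b, d)))
    return numerator / denominator


def solve_k(target_mean, d=1.85, k_lo=1e-6, k_hi=1e4, iterations=60):
    """Find k such that mean_n_mpi(k) equals target_mean, by bisection in log k.

    The mean grows monotonically with k and tends to 1 for k -> 0, so
    target_mean must be larger than 1 and below mean_n_mpi(k_hi).
    """
    for _ in range(iterations):
        k_mid = math.sqrt(k_lo * k_hi)
        if mean_n_mpi(k_mid, d) < target_mean:
            k_lo = k_mid
        else:
            k_hi = k_mid
    return math.sqrt(k_lo * k_hi)
```

With k fixed in this way, the ratio of the overlap at a given b to its average gives the enhancement or depletion factor discussed above.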
So far, we have assumed the transverse b-space profile to be decoupled from the longitudinal
x one. This is not the expected behaviour, because low-x partons in a hadron should diffuse out
towards larger r during the evolution down from higher-x ones [237]. Additionally, if r = 0 is
defined as the centre of energy of a hadron, then by definition a parton with x → 1 also implies
r → 0. In this spirit, there is a non-default PYTHIA option with correlated x and r [238]. It does
not explicitly trace the evolution of cascades in x, but assumes that the r distribution of partons
at any x can be described by a simple Gaussian, but with an x-dependent width:

$$\rho(r, x) \propto \frac{1}{a^3(x)} \exp\left( -\frac{r^2}{a^2(x)} \right) \quad \text{with} \quad a(x) = a_0 \left( 1 + a_1 \ln\frac{1}{x} \right) , \tag{261}$$

where a0 and a1 are free parameters to be determined. The overlap is then given by

$$\tilde{\mathcal{O}}(b, x_1, x_2) = \frac{1}{\pi}\, \frac{1}{a^2(x_1) + a^2(x_2)} \exp\left( -\frac{b^2}{a^2(x_1) + a^2(x_2)} \right) . \tag{262}$$

In principle one could argue that also a third length scale should be included, related to the trans-
verse distance the exchanged propagator particle, normally a gluon, could travel. This distance
should be made dependent on the p⊥ scale of the interaction. For simplicity, this further complication
is not considered, but a finite effective radius is allowed also for x → 1. The generation
of events is more complicated with an x-dependent overlap, but largely involves the same basic
principles. Until now, there is no evidence that this option provides a better description of data
than the default, unfortunately.
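
For orientation, eqs. (261) and (262) can be evaluated as in the following sketch. The function names are ours, and the values of $a_0$ and $a_1$ in the usage note are arbitrary illustrative numbers, not tuned PYTHIA parameters.

```python
import math


def width(x, a0, a1):
    """x-dependent Gaussian width a(x) of eq. (261)."""
    return a0 * (1.0 + a1 * math.log(1.0 / x))


def overlap_xdep(b, x1, x2, a0, a1):
    """Overlap of eq. (262) for partons with momentum fractions x1 and x2."""
    w2 = width(x1, a0, a1) ** 2 + width(x2, a0, a1) ** 2
    return math.exp(-b * b / w2) / (math.pi * w2)
```

With, say, $a_0 = 0.5$ and $a_1 = 0.15$, the width grows as x decreases, so that low-x partons give a broader but lower overlap profile, while the integral over $d^2b$ stays equal to unity.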

6.2.3 The generation sequence


To introduce the MPI generation algorithm, leave aside the impact-parameter issue for a mo-
ment. The probability to have an MPI at a given p⊥ in a non-diffractive event is then given by
(1/σnd )dσ/dp⊥ . If interactions occur independently of each other, the number of MPIs would be
distributed according to a Poissonian, with the zero suppressed. There are a few ways to generate
such a Poissonian.
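The PYTHIA approach is inspired by the parton-shower paradigm. The generation of consecutive
MPIs is formulated as an evolution downwards in p⊥, resulting in a sequence of n interactions
with $\sqrt{s}/2 > p_{\perp 1} > p_{\perp 2} > \cdots > p_{\perp n} > 0$. The probability distribution for $p_{\perp 1}$ becomes
$$\frac{dP}{dp_{\perp 1}} = \frac{1}{\sigma_{\mathrm{nd}}} \frac{d\sigma}{dp_{\perp 1}} \exp\left( -\int_{p_{\perp 1}}^{\sqrt{s}/2} \frac{1}{\sigma_{\mathrm{nd}}} \frac{d\sigma}{dp_\perp'}\, dp_\perp' \right) . \tag{263}$$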


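Here the naive probability is corrected by an exponential factor expressing that there must not
be any interaction in the range between $\sqrt{s}/2$ and $p_{\perp 1}$ for $p_{\perp 1}$ to be the hardest interaction. The
procedure can be iterated, to give
$$\frac{dP}{dp_{\perp i}} = \frac{1}{\sigma_{\mathrm{nd}}} \frac{d\sigma}{dp_{\perp i}} \exp\left( -\int_{p_{\perp i}}^{p_{\perp i-1}} \frac{1}{\sigma_{\mathrm{nd}}} \frac{d\sigma}{dp_\perp'}\, dp_\perp' \right) . \tag{264}$$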

The exponential factors resemble Sudakov form factors of parton showers [51], or virtual corrections
of “uncut pomerons” in the Regge–Gribov framework, and fill the same function of ensuring
that probabilities are bounded by unity. We will use the Sudakov terminology to stress this sim-
ilarity. Summing up the probability for a scattering at a given p⊥ scale to happen at any step of
the generation chain gives back (1/σnd ) dσ/dp⊥ , and the number of interactions above any p⊥
is a Poissonian with an average of σint (p⊥ )/σnd , as it should. The downwards evolution in p⊥ is
handled by using the veto algorithm, like for showers. If no MPIs are generated in the evolution,
a sequence is rejected and a new try made.
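
The downwards evolution can be illustrated with a toy model in Python. As an assumption made purely for this sketch, the rate is taken to be $(1/\sigma_{\mathrm{nd}})\, d\sigma/dp_\perp^2 = c/(p_{\perp 0}^2 + p_\perp^2)^2$ with a constant c, for which the Sudakov exponent can be inverted analytically; PYTHIA instead uses the full cross section together with the veto algorithm. The function names are ours.

```python
import math
import random


def generate_mpi_sequence(c, pt0, pt_max, rng):
    """Generate an ordered list pT1 > pT2 > ... for the toy rate c/(pT0^2+pT^2)^2.

    The integrated rate between pT and pT_prev is
    c * [1/(pT0^2 + pT^2) - 1/(pT0^2 + pT_prev^2)], so the next scale follows
    from solving exp(-integral) = R for a uniform random number R.
    """
    pts = []
    inv = 1.0 / (pt0 ** 2 + pt_max ** 2)   # 1/(pT0^2 + pT^2) at the starting scale
    inv_floor = 1.0 / pt0 ** 2             # its value at pT = 0
    while True:
        inv -= math.log(1.0 - rng.random()) / c  # log <= 0, so inv increases
        if inv >= inv_floor:               # evolution reached pT = 0: stop
            return pts
        pts.append(math.sqrt(1.0 / inv - pt0 ** 2))


def generate_event(c, pt0, pt_max, rng):
    """Reject sequences without any MPI and try again (zero suppression)."""
    while True:
        pts = generate_mpi_sequence(c, pt0, pt_max, rng)
        if pts:
            return pts
```

Before the rejection of empty sequences, the number of interactions per sequence is Poissonian with mean $c\,[1/p_{\perp 0}^2 - 1/(p_{\perp 0}^2 + p_{\perp\max}^2)]$, which offers a simple check of the sketch.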
When the impact-parameter variability is to be included as well, eq. (263) generalizes to
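$$\frac{dP}{d^2b\, dp_{\perp 1}} = \frac{\tilde{\mathcal{O}}(b)}{\langle \tilde{\mathcal{O}} \rangle}\, \frac{1}{\sigma_{\mathrm{nd}}} \frac{d\sigma}{dp_{\perp 1}} \exp\left( -\frac{\tilde{\mathcal{O}}(b)}{\langle \tilde{\mathcal{O}} \rangle} \int_{p_{\perp 1}}^{\sqrt{s}/2} \frac{1}{\sigma_{\mathrm{nd}}} \frac{d\sigma}{dp_\perp'}\, dp_\perp' \right) . \tag{265}$$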

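This expression can be integrated over $p_{\perp 1}$ to give eq. (259). Once b has been chosen, the selection
is similar to that in eq. (263), except that there is now a factor $\tilde{\mathcal{O}}(b)/\langle \tilde{\mathcal{O}} \rangle$ multiplying the rate.
The same factor enters in the extension of eq. (264), for the continued evolution, to
$$\frac{dP}{dp_{\perp i}} = \frac{\tilde{\mathcal{O}}(b)}{\langle \tilde{\mathcal{O}} \rangle}\, \frac{1}{\sigma_{\mathrm{nd}}} \frac{d\sigma}{dp_{\perp i}} \exp\left( -\frac{\tilde{\mathcal{O}}(b)}{\langle \tilde{\mathcal{O}} \rangle} \int_{p_{\perp i}}^{p_{\perp i-1}} \frac{1}{\sigma_{\mathrm{nd}}} \frac{d\sigma}{dp_\perp'}\, dp_\perp' \right) . \tag{266}$$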

The usefulness of the doubly differential expression in eq. (265) is not so apparent in the
generation of an inclusive non-diffractive event sample, where p⊥1 can be integrated out before
selecting b. But it gives important insights, especially since the MPI machinery is also intended
to be used to generate the underlying event associated with other processes. Assume e.g. that we
want to produce a hard jet sample, i.e. p⊥1 > p⊥min . For a large p⊥min the steep fall of dσ/dp⊥
ensures that the argument of the exponent is tiny, and so the exponent itself is close to unity
and can be neglected. The b and p⊥1 expressions then factorize. The former variable is selected
proportional to $\tilde{\mathcal{O}}(b)$, while the latter is selected according to the conventional differential cross
section. Since $\tilde{\mathcal{O}}(b)$ is more peaked at small b than $P_{\mathrm{int}}(b)$, it means that hard processes are selected
at more central b values than inclusive non-diffractive events. The physics is quite clear:
the probability to obtain a hard collision is proportional to the full parton-parton collision rate,
$\langle \tilde{n}_{\mathrm{MPI}}(b) \rangle \propto \tilde{\mathcal{O}}(b)$, and so it is strongly peaked at small b, while already a single MPI is enough
to obtain a non-diffractive event, and so that probability saturates at unity in $P_{\mathrm{int}}(b)$. The consequence
of picking a smaller b in hard processes is that the selection rate for subsequent MPIs,
eq. (266), also is larger, thus giving a higher level of underlying activity than that of the full
non-diffractive event sample, the “pedestal effect”.
While the expression in eq. (265) provides for interpolation between hard and soft events,
it is important to note that only the non-diffractive processes, i.e. the ones where the hardest
interaction is selected by the MPI machinery, involve the full correlation. If one studies a hard
process, be it hard QCD jets or something else, then in PYTHIA the selection of process kinematics
is done with no reference to MPIs. It is only afterwards, if and when the MPI machinery is invoked, that
the p⊥ scale of the hard process is used to select a b value that takes into account the Sudakov
factor.
Therefore, in the study of hard QCD jets, one should not pick such a low p⊥min that the Sudakov
factor deviates appreciably from unity. In practice, this means that one should have p⊥min at least
above 20 GeV at LHC energies. If one wants to study jets below that scale, one can as well start out
from the full non-diffractive sample. When a hard process is fed into the MPI machinery, however,
b is chosen according to eq. (265) in full, i.e. including the Sudakov. That is, if by mistake one
were to generate LHC jets at or below 10 GeV, the Sudakov would not be used in the p⊥ selection,
and thus the cross section would be overestimated, but it would be used in the b selection, and
thereby provide the correct underlying-event activity.
So far, we have only considered 2 → 2 QCD processes in the MPI framework, but the list can
be extended also to other ones. By default PYTHIA allows other 2 → 2 processes to be included in
the Sudakov factor, and thereby also in the MPI generation: jet pairs via s-channel γ∗ or t-channel
γ∗ /Z/W± exchange, events with one or two photons, or charmonium or bottomonium recoiling
against a jet. Needless to say, these cross sections are much lower than the standard QCD ones, and
therefore do not make much of a difference, but nevertheless help provide a richer non-diffractive
or underlying-event structure.
Another issue is what upper limit to set for the selection of p⊥2 . If studying QCD jets, the
ordering p⊥1 > p⊥2 is obvious; anything else would not reproduce the inclusive scattering cross
section. But, if the hard process is single Z production, say, then this is not part of the MPI
machinery, and so there is no double counting involved by allowing the underlying events to
contain jets up to the kinematic limit. (The exception is if weak showers are switched on; then a
hard QCD jet can emit a softer Z, and so such topologies could be double counted.) A few options
are available, but the default strategy in PYTHIA is to split events into two types. If the final state
of the hard process contains only (d, u, s, c or b) quarks, gluons, and photons then p⊥max is chosen
to be the factorization scale for internal processes, and the scale value for external Les Houches
input. If not, interactions are allowed to go all the way up to the kinematic limit.
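
The default choice of the upper p⊥ limit described above can be summarized in a short Python sketch. It uses the standard PDG particle codes (1 to 5 for d, u, s, c, b, 21 for gluons, 22 for photons); the function name and arguments are hypothetical and do not mirror the PYTHIA code. To our understanding the behaviour is steered by the setting `MultipartonInteractions:pTmaxMatch`, for which the online manual should be consulted.

```python
# PDG codes of the partons that count as part of the MPI-like final states.
QCD_LIKE_IDS = {1, 2, 3, 4, 5, 21, 22}


def mpi_pt_max(final_state_ids, process_scale, kinematic_limit):
    """Upper pT limit for MPIs following the default strategy in the text.

    final_state_ids: PDG codes of the hard-process final state.
    process_scale:   factorization scale (internal processes) or the scale
                     value of external Les Houches input, in GeV.
    kinematic_limit: largest kinematically allowed pT, in GeV.
    """
    if all(abs(pid) in QCD_LIKE_IDS for pid in final_state_ids):
        return process_scale
    return kinematic_limit
```

A $gg \to gg$ hard process would thus restrict further MPIs to lie below the process scale, while single Z production (code 23) would leave the full phase space open.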

6.2.4 Momentum and flavour conservation


As formulated so far, the same PDFs are used for all MPIs. This would allow more momentum to
be taken out of a beam than there is, and also favour the repeated collisions of valence quarks that
have already reacted. It is here that the ordering of the emissions becomes important. Standard
PDFs can indeed be used for the first emission, which is the hardest one and therefore the one most
visible and the one that standard PDFs have been tuned to describe. For subsequent emissions, the
PDFs can gradually be modified to take into account the effects of the previous ones. An obvious
modification is to rescale the x scale such that PDFs do not extend to higher values than left by
the previous ones, i.e.
$$x_i < x_{i,\max} \equiv X_i = 1 - \sum_{j=1}^{i-1} x_j , \tag{267}$$

but we will also want to consider flavour aspects. The beauty is that these successive modifica-
tions, that gradually let the PDFs diverge from the conventional ones, occur at falling p⊥ scales,
where individual MPIs become less easily studied, so imperfections do not give large effects. The
consecutive reduction of remaining momentum also means that the nMPI distribution, for a fixed b,
will fall off faster than the assumed Poissonian. What does not change, fortunately, is the fraction
of nMPI = 0 events that have to be thrown away, because that is entirely determined by whether a
first MPI can be generated with standard PDFs or not.
To extend the PDF framework, to include not only a simple x rescaling but also flavour count-
ing, it is assumed that quark distributions can be split into a valence and a sea part. In cases where
this is not explicit in the PDF parameterizations, it is assumed that the sea is flavour-antiflavour
symmetric, so that one can write e.g.

$$u(x, Q^2) = u_{\mathrm{val}}(x, Q^2) + u_{\mathrm{sea}}(x, Q^2) = u_{\mathrm{val}}(x, Q^2) + \bar{u}(x, Q^2) . \tag{268}$$

The parameterized $u(x, Q^2)$ and $\bar{u}(x, Q^2)$ distributions can then be used to find the relative probability
for a kicked-out u quark to be either valence or sea.
For valence quarks two effects should be considered. One is the reduction in content by previ-
ous MPIs: if a u valence quark has been kicked out of a proton then only one remains, and if two
then none remain. In addition, the constraint from momentum conservation should be included.
Together this gives
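$$u_{i,\mathrm{val}}(x, Q^2) = \frac{N_{u,\mathrm{val,remain}}}{N_{u,\mathrm{val,original}}}\, \frac{1}{X_i}\, u_{\mathrm{val}}\!\left( \frac{x}{X_i}, Q^2 \right) , \tag{269}$$
for the u quark in the i’th MPI, and similarly for the d. The $1/X_i$ prefactor ensures that the $u_i$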
integrates to the remaining number of valence quarks. The momentum sum is also preserved,
except for the downwards rescaling for each kicked-out valence quark. The latter is compensated
by a uniform scaling up of the gluon and sea PDFs.
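
The combined effect of eqs. (267) and (269) can be made concrete with a toy example in Python. The valence shape $12\,x(1-x)$ used here is an unrealistic stand-in, chosen only because it integrates to two valence quarks; a real application would call an actual PDF set. The function names are ours.

```python
def u_valence_toy(x):
    """Toy valence distribution, normalized so that its x integral equals 2."""
    return 12.0 * x * (1.0 - x) if 0.0 < x < 1.0 else 0.0


def rescaled_valence(x, x_taken, n_remain, n_original=2, pdf=u_valence_toy):
    """Valence distribution for the next MPI, following eqs. (267) and (269).

    x_taken:  momentum fractions removed by the previous MPIs.
    n_remain: number of valence quarks of this flavour still in the beam.
    """
    big_x = 1.0 - sum(x_taken)  # remaining momentum fraction X_i, eq. (267)
    if n_remain == 0 or not 0.0 < x < big_x:
        return 0.0
    return (n_remain / n_original) * pdf(x / big_x) / big_x
```

Integrating the rescaled distribution over $0 < x < X_i$ returns the number of remaining valence quarks, as stated in the text.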
When a sea quark (or antiquark) qsea is kicked out of a hadron, it must leave behind a corre-
sponding antisea parton in the beam remnant, by flavour conservation, which can then participate
in another interaction. We can call this a companion antiquark, $\bar{q}_{\mathrm{cmp}}$. In the perturbative approximation
the pair comes from a gluon branching $g \to q_{\mathrm{sea}} + \bar{q}_{\mathrm{cmp}}$. This branching often would not
be in the perturbative regime, but we choose to make a perturbative ansatz, and also to neglect
subsequent perturbative evolution of the $q_{\mathrm{cmp}}$ distribution. Even if approximate, this procedure
should catch the key feature that a sea quark and its companion should not be expected too far
apart in x. Given a selected $x_{\mathrm{sea}}$, the distribution in $x = x_{\mathrm{cmp}} = y - x_{\mathrm{sea}}$ then is
$$\begin{aligned}
q_{\mathrm{cmp}}(x; x_{\mathrm{sea}}) &= C \int_0^1 g(y)\, P_{g\to q_{\mathrm{sea}} \bar{q}_{\mathrm{cmp}}}(z)\, \delta(x_{\mathrm{sea}} - z y)\, dz \\
&= C\, \frac{g(x_{\mathrm{sea}} + x)}{x_{\mathrm{sea}} + x}\, P_{g\to q_{\mathrm{sea}} \bar{q}_{\mathrm{cmp}}}\!\left( \frac{x_{\mathrm{sea}}}{x_{\mathrm{sea}} + x} \right) .
\end{aligned} \tag{270}$$
Here $P_{g\to q\bar{q}}(z)$ is the standard DGLAP branching kernel, $g(y)$ an approximate gluon PDF, and C
gives an overall normalization of the companion distribution to unity. Furthermore, an $X_i$ rescaling
is necessary as for valence quarks. The addition of a companion quark does break the momentum
sum rule, this time upwards, and so is compensated by a scaling down of the gluon and sea PDFs.
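
A numerical sketch of the companion distribution in eq. (270) is given below. The gluon shape $(1-y)^4/y$ is an assumption made here for illustration, since the text only refers to “an approximate gluon PDF”; the branching kernel is the leading-order $g \to q\bar{q}$ form $\frac{1}{2}[z^2 + (1-z)^2]$, the normalization C is obtained by numerical integration, and the $X_i$ rescaling is left out.

```python
def splitting_g_to_qqbar(z):
    """Leading-order DGLAP kernel for g -> q qbar (with T_R = 1/2)."""
    return 0.5 * (z * z + (1.0 - z) ** 2)


def gluon_toy(y, power=4.0):
    """Toy gluon distribution g(y) ~ (1 - y)^power / y (an assumption)."""
    return (1.0 - y) ** power / y


def companion_unnormalized(x, x_sea, gluon=gluon_toy):
    """Second line of eq. (270) without the normalization constant C."""
    y = x_sea + x
    if x <= 0.0 or y >= 1.0:
        return 0.0
    return gluon(y) / y * splitting_g_to_qqbar(x_sea / y)


def companion_pdf(x_sea, n=20000, gluon=gluon_toy):
    """Return the companion distribution in x, normalized to unit integral."""
    h = (1.0 - x_sea) / n
    norm = h * sum(companion_unnormalized((i + 0.5) * h, x_sea, gluon) for i in range(n))
    return lambda x: companion_unnormalized(x, x_sea, gluon) / norm
```

For a sea quark at $x_{\mathrm{sea}} = 0.05$ the resulting distribution is strongly peaked at small x, in line with the expectation that the companion is not far from the sea quark in x.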
In summary, in the downwards evolution, the kinematic limit is respected by a rescaling of
x. In addition, the number of remaining valence quarks and new companion quarks is properly
normalized. Finally, the momentum sum is preserved by a scaling of gluon and (non-companion)
sea quarks. It is interesting to note that the joint PDFs for the first two MPIs behave rather similarly
to the Gaunt–Stirling DPS PDFs [239], whereas the PYTHIA approach currently is the only one that
explicitly offers triple parton distributions and beyond.


6.2.5 Interleaved and intertwined evolution


So far we have only considered an MPI as a 2 → 2 process, but it should be associated with
ISR and FSR showers. In particular, ISR needs to take momentum from the beams, and can also
change the “original” flavour taken out of the beam during the backwards evolution. This implies
a more intricate competition between the MPI systems than already outlined. If all MPIs are first
considered, then their number will be maximized, whereas there may be little room left for ISR. If
instead ISR is added to each MPI before proceeding to the next, then there will be less room left
for MPIs.
Time ordering does not give any clear guidance as to what the correct procedure is. Incoming
high-energy hadrons can be viewed as flat pancakes, such that all MPIs happen simultaneously at the
collision moment, while ISR stretches backwards in time from it, and FSR forwards. But we have
no clean way of separating the hard interactions themselves from the virtual ISR cascades that
“already” exist in the colliding hadrons.
Instead we choose the same guiding principle as we did when we originally decided to consider
MPIs ordered in p⊥ : it is most important to get the hardest part of the story “right”, and then one
has to live with an increasing level of approximation for the softer steps. Since also showers are
ordered in (some kind of) p⊥ , it is meaningful to choose p⊥ as common evolution scale. Thus the
scheme is characterized by one master formula

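\[
\frac{\mathrm{d}P}{\mathrm{d}p_\perp} = \left( \frac{\mathrm{d}P_\mathrm{MPI}}{\mathrm{d}p_\perp} + \sum \frac{\mathrm{d}P_\mathrm{ISR}}{\mathrm{d}p_\perp} + \sum \frac{\mathrm{d}P_\mathrm{FSR}}{\mathrm{d}p_\perp} \right)
\times \exp\left( - \int_{p_\perp}^{p_{\perp\mathrm{max}}} \left( \frac{\mathrm{d}P_\mathrm{MPI}}{\mathrm{d}p_\perp'} + \sum \frac{\mathrm{d}P_\mathrm{ISR}}{\mathrm{d}p_\perp'} + \sum \frac{\mathrm{d}P_\mathrm{FSR}}{\mathrm{d}p_\perp'} \right) \mathrm{d}p_\perp' \right) \tag{271}
\]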

that probabilistically determines what the next step will be. Here the ISR sum runs over all in-
coming partons, two per already produced MPI, the FSR sum runs over all outgoing partons (or
dipoles), and p⊥max is the p⊥ of the previous step. Starting from the hardest interaction, eq. (271)
can be used repeatedly to construct a complete parton-level event. The flavour and momentum
used by previous MPIs or shower branchings are tracked in accordance with the principles
outlined previously, with a few straightforward extensions. For ISR, for example, the x and flavour
of the parton's own MPI do not count as used up.
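The competition encoded in eq. (271) can be illustrated with a toy model. The sketch below assumes that each channel has a rate dP/dp⊥ = c/p⊥ with a fixed strength c, so that its no-emission probability between p⊥max and p⊥ is (p⊥/p⊥max)^c and the next scale can be sampled analytically. Picking the channel with the largest trial p⊥ is equivalent to sampling from the summed rate. The rates, strengths and function names are illustrative assumptions; in the real evolution the rates change after every step.

```python
import random


def next_scale(pt_max, strength, rng):
    """Next pT for one channel with the toy rate dP/dpT = strength / pT."""
    return pt_max * rng.random() ** (1.0 / strength)


def evolve(pt_start, pt_min, strengths, rng):
    """Return the pT-ordered list of (pT, channel) steps down to pt_min."""
    history = []
    pt = pt_start
    while True:
        # Each channel proposes a trial scale below the current one.
        trials = {name: next_scale(pt, c, rng) for name, c in strengths.items()}
        # The highest trial wins; all other trials are discarded.
        winner = max(trials, key=trials.get)
        pt = trials[winner]
        if pt < pt_min:
            return history
        history.append((pt, winner))
```

A call such as `evolve(100.0, 1.0, {"MPI": 1.0, "ISR": 2.0, "FSR": 1.5}, random.Random(1))` returns one interleaved sequence of MPI, ISR and FSR steps in falling p⊥.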
MPIs are not only related to each other by overall momentum and flavour conservation issues,
but may be directly interacting with each other. Two such examples are joined interactions and
partonic rescattering.
In the former, two partons participating in two separate MPIs may turn out to have a common
ancestor when the backwards ISR evolution traces their prehistory. Such joinings are
well known in the context of the forwards evolution of multiparton densities [240, 241], and the
forwards-evolution expression can approximately be turned into a backwards-evolution probability for a branching a → bc

\[
\mathrm{d}P_{bc}(x_b, x_c, Q^2) \simeq \frac{\mathrm{d}Q^2}{Q^2} \, \frac{\alpha_\mathrm{s}}{2\pi} \, \frac{x_a f_a(x_a, Q^2)}{x_b f_b(x_b, Q^2) \, x_c f_c(x_c, Q^2)} \, z(1 - z) P_{a \to bc}(z) , \tag{272}
\]

with x_a = x_b + x_c and z = x_b/(x_b + x_c). The main approximation is that the two-parton differential
distribution has been factorized as f_bc^(2)(x_b, x_c, Q^2) ≃ f_b(x_b, Q^2) f_c(x_c, Q^2), to put the equation
in terms of more familiar quantities.
Just like for the other processes considered, a form factor is given by integration over the
relevant Q^2 range and exponentiation. Associating Q ≃ p⊥ , joined interactions can be included
as a fourth term in eq. (271). But technical complications arise when the kinematics of joined
branchings are reconstructed, notably in transverse momentum, and the code to overcome these
was never written. One reason is that already the evolution itself showed that joined-interaction
effects are small and tend to occur at low p⊥ values [78].
The second intertwining possibility is rescattering, i.e. that a parton from one incoming hadron
consecutively scatters against two or more partons from the other hadron. The simplest case,
3 → 3, i.e. one rescattering, has been well studied [242–244]. The conclusion is that it should
be less important than two separated 2 → 2 processes: 3 → 3 and 2 × (2 → 2) contain the same
number of vertices and propagators, but the latter wins by involving one parton density more.
The exception could be large p⊥ and x values, but there 2 → n, n ≥ 3 QCD radiation anyway is
expected to be the dominant source of multi-jet events.
For rescattering, a detailed implementation is available as an option in PYTHIA [245], as
follows. To allow a rescattering, a scattered parton has to be put back into the PDF, but
now as a δ function. A hadron can therefore be characterized by a new PDF
\[
f(x, Q^2) \to f_\mathrm{rescaled}(x, Q^2) + \sum_i \delta(x - x_i) = f_\mathrm{un}(x, Q^2) + f_\delta(x, Q^2) , \tag{273}
\]

where fun represents the unscattered part of the hadron and fδ the scattered one. The scattered
partons have the same x values as originally picked, in the approximation that small-angle t-
channel gluon exchange dominates, but more generally there will be shifts. The sum over delta
functions runs over all partons that are available to rescatter, including outgoing states from hard
or MPI processes and partons from ISR or FSR branchings. All the partons of this disturbed hadron
can scatter, and so there is the possibility for an already extracted parton to scatter again. With
the PDF written in this way, the MPI scattering rate can be seen as a sum of four terms, depending
on whether the fun or the fδ is involved on either incoming side. Unfortunately, like for the joined
interactions above, the kinematics become quite messy, specifically the propagation of recoils be-
tween systems that are partly intertwined but also partly separate.
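
Written out, with the labels A and B for the two incoming hadrons introduced here purely for illustration, the product of PDFs that enters the MPI scattering rate splits into the four terms mentioned above:

```latex
f^A f^B \;\to\; \left( f^A_\mathrm{un} + f^A_\delta \right) \left( f^B_\mathrm{un} + f^B_\delta \right)
  = f^A_\mathrm{un} f^B_\mathrm{un}
  + f^A_\mathrm{un} f^B_\delta
  + f^A_\delta f^B_\mathrm{un}
  + f^A_\delta f^B_\delta
```

The first term is an ordinary MPI between two previously unscattered partons, the two mixed terms describe a single rescattering, and the last term describes a collision in which both incoming partons have already scattered.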
A third and more dramatic intertwining possibility is that the perturbative cascades grip into
each other. An example is the “swing” mechanism, whereby two dipoles in the initial state can
reconnect colours, which is a key aspect of the DIPSY generator [246, 247]. An implementation
exists in a branch of PYTHIA 8.3 [248], but not yet in the public version.

6.2.6 Spatial parton vertices


While setting spatial production vertices of unstable hadrons and leptons is a standard task (see
section 8.1.3), the corresponding task for parton vertices in MPIs (as well as for beam remnants
and parton shower) is not. The main issue is that, as the MPI and shower models are
formulated in momentum space only, no obviously correct correlation with an impact-parameter
picture exists. The plan is to develop such an integrated framework further, based on matching
with dipole calculations on proton Fock states in impact-parameter space [248]. Since such
information is needed for string interactions (section 7.3) and hadronic rescattering (section 7.4),
however, a basic framework is already in place.
The basic framework includes four choices for the pp overlap region, from which vertices are
sampled randomly. For all model choices, vertices of ISR and FSR partons are smeared relative
to their mother by a Gaussian distribution, with a width of σv /k⊥ , where k⊥ is the transverse
momentum of the produced parton, and σv is a parameter to be set by the user.
The four possible choices for the overlap region are:


• The proton profile is a Lorentz-contracted ball of uniform density. This gives an
almond-shaped overlap region, similar to heavy-ion collisions, favouring MPIs being displaced
perpendicular to the collision plane. This option is somewhat at odds with the impact-parameter
selection in the MPI model, as it does not allow any interactions at impact parameters
larger than twice the hadron radius.

• The proton profile is a Lorentz-contracted three-dimensional Gaussian (motivated by the
proton mass distribution), easily reduced to a two-dimensional one, as the z can be
integrated out. The overlap region is taken as the product of the two displaced Gaussians,
which is in itself a Gaussian.
• A variation of the above Gaussian scheme, but elongated by a factor √((1 + ε)/(1 − ε)), where
ε is a parameter determining whether production should be favoured in the collision frame
or out of the collision frame.

• Another variation of the Gaussian scheme, but with a modulation factor 1 + ε cos(2φ),
with φ defined with respect to the collision plane.
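
The plain Gaussian option and the smearing of shower partons can be sketched as follows. This is a toy illustration under stated assumptions, not the PYTHIA code: the width parameters are treated as widths per transverse component, all lengths are in one arbitrary unit, and the function names are hypothetical. For two Gaussian profiles of equal width σ displaced by ±b/2, the product is a Gaussian centred midway between them with width σ/√2, so the impact parameter b drops out of the vertex shape.

```python
import math
import random


def mpi_vertex(radius, rng):
    """Sample a transverse MPI vertex from the product of two displaced
    Gaussian profiles of width `radius`, relative to the midpoint between
    the two hadron centres."""
    width = radius / math.sqrt(2.0)
    return rng.gauss(0.0, width), rng.gauss(0.0, width)


def emission_width(sigma_v, kt):
    """Smearing width sigma_v / kT for a shower parton of transverse momentum kT."""
    return sigma_v / kt


def smear_daughter(mother_xy, kt, sigma_v, rng):
    """Displace an ISR or FSR parton relative to its mother vertex."""
    width = emission_width(sigma_v, kt)
    return (mother_xy[0] + rng.gauss(0.0, width),
            mother_xy[1] + rng.gauss(0.0, width))
```

Hard emissions thus stay close to their mother vertex, while soft ones can be displaced further.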

It should be noted that the models for spatial parton vertices are at a very early stage of
development, and subject to change in the future.

6.2.7 Other MPI aspects


Several topics that concern MPIs will be described separately. One such is the
issue of colour flow. The colours within each MPI, and its associated ISR and FSR, are initially
assigned in the Nc → ∞ limit. This implies that each parton taken out of a hadron, to go into
a MPI, leaves its corresponding unique anticolours behind in the beam remnant. With many
MPIs involved this gives an unrealistically complicated remnant, and so there is a machinery that
attempts to associate an initial-parton colour from one MPI with an initial-parton anticolour from
another MPI. Remaining colour lines attach to the remnant partons, see further the beam remnants
description. This still allows colour lines to be drawn criss-cross in the event. Colour reconnection
(CR) is a mechanism whereby these colour lines may be reconnected, typically in such a way that
the total string length is reduced, further described in section 7.2.
In this section we have reasoned around MPIs in the non-diffractive component in hadron-
hadron collisions, which is the prime, but not the only, application of the MPI framework. One ex-
tension is that photons have a resolved component, where they behave more-or-less like hadrons,
and undergo MPIs in a similar manner. Another is that diffraction may be viewed as involving the
collision of pomerons with hadrons or with each other, and that also pomerons can be associated
with a hadronic structure that allows MPIs to occur. These aspects will be discussed further in
their respective context.
A standard task for PYTHIA is to generate one predetermined hard process and then add
underlying-event activity to that, which means that most of the time the additional MPI
activity will be too soft to give explicitly visible jets. The generation efficiency will therefore be low
if one is interested in studies of double parton scattering. It is possible, however, to request two
hard scatterings in an event, each of a given type and within given kinematic ranges. While one
of the two processes can be selected from the full range of possibilities, the other must be chosen
from a list of a dozen process groups. This is not a fundamental limitation, but covers all that we
could see a possible application for, and if need be the list could be extended. Furthermore, as a
non-standard extension to the Les Houches Accord, it is also possible to feed in external events
with two hard processes for further handling in PYTHIA. See section 3.14 for further details.
Since MPIs play such a key role for hadronic event properties, it is important to tune them
as well as possible to describe minimum bias, i.e. predominantly non-diffractive, and underlying
events alike. A number of settings and parameters are available to that end. Of special interest is
the p⊥0 parameter, that directly influences important properties such as the multiplicity distribu-
tion. Finally, it is worth mentioning that the MPI component normally is the most time-consuming
task of the PYTHIA initialization step. In order to prepare the Monte Carlo sampling of the differen-
tial cross section, it is necessary to find an upper envelope of it in the (x 1 , x 2 , t̂) phase space. This
envelope is based on multichannel sampling, where the relative importance of the channels should
be optimized to allow a reasonably high sampling efficiency. The MPI cross section itself also needs
to be integrated, as part of the p⊥ -evolution formalism. The initialization of non-diffractive events
therefore may take around a second, i.e. almost two orders of magnitude more than it takes to generate an LHC
event afterwards. If one had to repeat the MPI initialization for each new event, this step would
form a bottleneck. That would be the case in diffractive events, where the mass of the diffractive
system varies from one event to the next. To avoid this, diffraction is initialized for a number of
logarithmically evenly spaced mass values, and then parameters for intermediate masses are ob-
tained by interpolation. If the incoming beams have varying energies, also non-diffractive events
can be set up for a range of collision energies. Thus initialization may take tens of seconds for the
full set of inelastic processes, while the subsequent interpolation time is negligible compared to
the event-generation one. If furthermore PYTHIA is initialized for multiple hadron types, the time
needed becomes proportionately longer. An option therefore exists to save the MPI initialization
data to a file, for reuse in subsequent runs, see section 9.6.2.
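
The tabulate-then-interpolate strategy can be sketched generically. The code below is an illustration only: which quantities PYTHIA tabulates and how it interpolates them is not specified in the text, so the linear interpolation in ln(mass), the grid size and the function names are all assumptions.

```python
import math


def log_grid(m_low, m_high, n_points):
    """Mass values evenly spaced in ln(m) between m_low and m_high."""
    step = math.log(m_high / m_low) / (n_points - 1)
    return [m_low * math.exp(i * step) for i in range(n_points)]


def tabulate(expensive, grid):
    """Run the slow initialization once per grid point."""
    return [expensive(m) for m in grid]


def interpolate(mass, grid, table):
    """Linear interpolation in ln(mass) between neighbouring grid points."""
    step = math.log(grid[-1] / grid[0]) / (len(grid) - 1)
    pos = math.log(mass / grid[0]) / step
    i = min(max(int(pos), 0), len(grid) - 2)
    frac = pos - i
    return (1.0 - frac) * table[i] + frac * table[i + 1]
```

The expensive step is then paid once per grid point at initialization, while each event only costs one cheap lookup.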

6.3 Beam remnants


What is left of a beam particle, after the partons initiating hard interactions and MPI have been re-
moved from it (and showered), is called the beam remnant. By definition, the remnant itself does
not participate in any momentum exchanges at scales larger than O(1 GeV). Hence, in PYTHIA, it
is regarded as a purely non-perturbative object, which does not undergo a parton shower.
The general strategy is to add the minimal number of partons required to conserve the beam
particle’s quantum numbers (flavour, colour and baryon number), taking into account which va-
lence and sea flavours have been scattered out of it. The remaining beam-particle momentum
is then shared amongst those partons, as described below. Note that what is relevant to deter-
mining the remnant structure is not which partons initiated Born-level processes (or MPI) at the
respective hard-process factorization scale(s), but instead the ones after initial-state radiation, at
Q ∼ Q cutoff ∼ O(1 GeV). For brevity, we henceforth refer to these low-scale partons as “initiator
partons”.
By default, also some “intrinsic transverse momentum” is added for the initiators and the rem-
nant partons. Final momentum conservation is then ensured by rescaling the sampled momenta
of the remnant partons appropriately. The procedure is discussed in more detail in ref. [249] and
outlined below for hadron beams. The more specialized cases of lepton and photon beams will
be discussed in sections 6.5 to 6.7.


6.3.1 Flavour structure


The first step in beam remnant generation is to determine the number and flavours of the remnant
partons. This begins by including the remaining valence quarks. For baryons, if two or more
valence quarks are present, a randomly selected pair of these is turned into a diquark state. In
this case, relative probabilities for different diquark spins are derived within the context of the
non-relativistic SU(6) model, i.e. flavour SU(3)uds times spin SU(2). For instance, a ud diquark
in a proton remnant is 3/4 spin-0 and 1/4 spin-1, while a uu diquark always has spin-1. If the
initiator was a gluon, then the remnant is a colour-octet object, which is split into a triplet and an
antitriplet, again using SU(6) to determine relative weights. For a proton remnant, P(u + ud_0) = 1/2,
P(u + ud_1) = 1/6, and P(d + uu_1) = 1/3.
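
The quoted weights for a gluon-initiated proton remnant can be turned into a small sampler. This is a sketch with hypothetical names, using only the three probabilities given above; it also checks that they are consistent with the 3/4 to 1/4 spin ratio quoted for a ud diquark.

```python
import random
from fractions import Fraction

# (single quark, diquark with spin label) and the probabilities quoted in the text.
SPLITS = [
    (("u", "ud_0"), Fraction(1, 2)),
    (("u", "ud_1"), Fraction(1, 6)),
    (("d", "uu_1"), Fraction(1, 3)),
]


def pick_split(rng):
    """Choose the quark + diquark content of the colour-octet proton remnant."""
    r = rng.random()
    cumulative = 0.0
    for split, prob in SPLITS:
        cumulative += float(prob)
        if r < cumulative:
            return split
    return SPLITS[-1][0]  # guard against rounding when r is very close to 1


def spin0_fraction_of_ud():
    """Fraction of ud diquarks that have spin 0, from the weights above."""
    p_spin0, p_spin1 = SPLITS[0][1], SPLITS[1][1]
    return p_spin0 / (p_spin0 + p_spin1)
```

Here `spin0_fraction_of_ud()` evaluates to 3/4, and the uu diquark appears in one third of the cases, as expected when one of the three possible valence pairs of uud is picked at random.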
Otherwise, the valence flavours are unambiguous, assuming that the valence content has been fixed
beforehand. As sea quarks are created in pairs, a companion quark with the opposite flavour and
colour is added for each sea quark that has been taken out of the beam particle, unless such a
companion has already been found during the partonic evolution. If no other remnants are needed, a gluon (photon)
is added to carry the momentum of the hadron (lepton) beam; otherwise no gluons are added
as remnant partons unless required to balance the colour structure. For DIS events, it is also
possible to collapse two remnant partons directly into a colour-singlet hadron.

6.3.2 Colour structure


Since the incoming hadrons (or, more generally, the incoming beam particles) are colourless,
the combined set of initiator and beam-remnant partons must be colourless, too. In the very
simplest cases, such as when the remnant consists of a single triplet and/or antitriplet colour,
there is no ambiguity. But when there are several such charges, the assignment of colour flow
in the remnant (roughly, which remnant-parton colours to associate with which initiator-parton
colours) is inherently ambiguous, and there is no first-principles solution. PYTHIA contains two
distinct models that address this ambiguity, called “old” and “new”, based on the time they were
developed and implemented. Currently, the “old” one is the default.
The old model [249] is motivated by the way colour flow is treated in parton showers, and
extends this to the beam remnant, as follows. Starting from the simplest representation of the
colour structure of the valence quarks in the incoming beam particle (a quark-antiquark pair for
a meson and a three-quark junction structure for a baryon, simplified to a quark-diquark struc-
ture when possible), initiator gluons are attached in random order to one of the valence quarks
(selected at random if there are several), and quark-antiquark pairs are added as if they came
from gluon splittings. Thus this model captures the qualitative behaviour that is expected from
leading-colour QCD.
The new model [250] is motivated by SU(3) colour algebra, and essentially extends the
QCD-based colour-reconnection model to the beam remnant, as follows. First, the set of initiator
partons is considered. An SU(3) product determines the possible overall multiplets that can be
formed from those partons. If one assumes they are uncorrelated, the naive probability for the set
to be in any of those multiplets would be given simply by state counting. A free parameter allows
for the application of an (exponential) weighting factor favouring small multiplets over larger
ones. This is intended as a way to mimic correlations due to possible saturation effects which are
not otherwise explicitly represented in PYTHIA. Having selected a multiplet for the set of initiator
partons, the beam-remnant colour configuration has to be the inverse of that, to conserve the
colour-singlet nature of the beam particle. The minimum number of gluons is added to the
beam remnant in order to obtain this colour configuration.

6.3.3 Primordial k⊥
As the hard processes and parton showers in PYTHIA are based on collinear factorization, only the
longitudinal momenta are generated during the perturbative treatment. However, some trans-
verse momentum of non-perturbative origin due to Fermi motion of partons inside a hadron is
expected. Furthermore, studies on Z-boson transverse-momentum distributions have indicated
that a significant amount of partonic p⊥ is required to reproduce these distributions in hadron-
hadron collisions. In PYTHIA such partonic transverse momentum is modelled with primordial k⊥
that acts as a proxy for non-perturbative and possibly perturbative initial p⊥ .
In PYTHIA the primordial k⊥ is generated from a two-dimensional Gaussian distribution. For
hard-process initiators the width of the Gaussian is parameterized as

\[
\sigma(Q) = \frac{\sigma_\mathrm{soft} \, Q_{1/2} + \sigma_\mathrm{hard} \, Q}{Q_{1/2} + Q} \; \frac{m}{m + m_{1/2} \, y_\mathrm{damp}} , \tag{274}
\]

where Q is the renormalization scale for the hardest process and the p⊥ for subsequent MPIs, and m
is the mass (√ŝ) of the system. The Q-dependent factor provides an interpolation between a soft
scale, set by the parameter σ_soft, and a hard scale, set by σ_hard, and Q_1/2 controls the midpoint between
these two. The m-dependent factor on the right-hand side in turn provides damping for
small-mass and/or large-rapidity systems. Such damping is introduced for purely technical reasons,
so the controlling parameters m_1/2 and y_damp = (E/m)^(r_red), where r_red controls the amount of rapidity
damping, should not have much influence on related observables. For the remnant partons not
directly connected to any hard process, the width of the k⊥ distribution is fixed by another
parameter, σ_remn, and does not depend on any scale related to hard scattering or MPIs. After
sampling the k⊥ for each parton in the beam, it is inevitable that the total transverse momentum
of the beam becomes non-zero. To retrieve the original beam p⊥ , the k⊥ of all partons will be
rescaled with a common factor in such a way that the net four-momentum of the beam particles
will be preserved.
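
A direct transcription of eq. (274) shows its limiting behaviour. The sketch below is illustrative only: the default parameter values are placeholders chosen for the example rather than values quoted in the text, E is taken to be the energy of the system of mass m, and the width is assumed to apply per transverse component.

```python
import random


def kt_width(q, m, energy, sigma_soft=0.9, sigma_hard=1.8,
             q_half=1.5, m_half=1.0, r_red=0.5):
    """Gaussian width of eq. (274); all dimensionful inputs in GeV."""
    y_damp = (energy / m) ** r_red
    scale_part = (sigma_soft * q_half + sigma_hard * q) / (q_half + q)
    mass_part = m / (m + m_half * y_damp)
    return scale_part * mass_part


def sample_kt(width, rng):
    """Draw (kx, ky) from a two-dimensional Gaussian of the given width."""
    return rng.gauss(0.0, width), rng.gauss(0.0, width)
```

For a heavy system at rest the damping factor approaches unity, and the width then runs from σ_soft at Q → 0, through the average of σ_soft and σ_hard at Q = Q_1/2, to σ_hard at large Q.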

6.3.4 Longitudinal momentum


In addition to the transverse momentum, the remnant partons should also carry the remaining
longitudinal momentum of the beam particle, X . As a first step, a momentum fraction x < X is
sampled for each remnant parton. In case of valence quarks, the value is sampled according to
(1 − x)^a / √x, where the power a can be adjusted for each parton flavour. Such a distribution
approximates the valence quark PDFs around the initial scale O(1 GeV) at which the remnants are
constructed. For the remaining companion quarks, the momentum fraction is sampled from the
distribution defined in eq. (270) which takes into account that the sea quarks are always created
in pairs, by definition, from gluon splittings. Gluons (and photons) are only added as remnants if
no valence or companion quarks are remaining in the beam. As only one of these will be added
as a remnant, it will carry all the remaining beam particle’s momentum X .
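
Sampling from the valence shape (1 − x)^a / √x can be done with a simple change of variables. The sketch below is one possible method, not necessarily the one used in PYTHIA: with x = u² the 1/√x singularity cancels against the Jacobian, leaving a density proportional to (1 − u²)^a in u, which is sampled by accept-reject. The sketch covers 0 < x < 1 and ignores both the x < X limit and the subsequent rescaling.

```python
import random


def sample_valence_x(a, rng):
    """Sample x on (0, 1) with density proportional to (1 - x)^a / sqrt(x)."""
    while True:
        u = rng.random()
        # After x = u^2 the target density in u is proportional to (1 - u^2)^a <= 1.
        if rng.random() < (1.0 - u * u) ** a:
            return u * u
```

The resulting x follows a Beta(1/2, a + 1) distribution, with mean 1/(2a + 3), so a larger power a gives softer valence remnants.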
After the initial momentum fractions have been sampled for each remnant parton, these have
to be rescaled to make sure that the total four-momentum is conserved in each event. As
both the initiator and the remnant partons now also carry transverse momentum, the
longitudinal-momentum fractions of the remnants cannot simply be rescaled with X; some momentum has
to be shared between the two beams to balance the event. For details see ref. [249, section 4.4].
In some special cases, such as DIS processes, only one remnant is required and no such balancing
can be done. To account for momentum conservation, the final-state parton momenta are then
boosted and rotated in such a way that the total four-momentum is conserved for the sampled
remnant configuration.

6.4 Hadron-hadron collisions


In section 6.1 we introduced the main event types in hadron-hadron collisions, and how their
total and differential cross sections are parameterized in PYTHIA 8.3. Elastic-scattering events are
trivial to model, given the dσ/dt cross-section expression; there are just two hadrons coming in
and the same two coming out, with a momentum transfer t and a randomly-selected ϕ angle.
See section 6.1.2 for the various options available for proton elastic-scattering cross sections, and
section 6.1.4 for the less sophisticated expressions used for other hadrons. The subsequent text on
MPIs and beam remnants is mainly concerned with the non-diffractive component. It has the largest
cross section, and especially it is the one where the bulk of hard processes occur, which makes
it the most studied one experimentally. In this section we provide some further comments on
this event class in section 6.4.1, but in particular describe additional aspects in the description of
diffraction in sections 6.4.2 and 6.4.3.

6.4.1 Minimum-bias and related inclusive processes


The inelastic non-diffractive event type is often also called Minimum Bias (MB). Strictly speaking,
however, MB refers to the smallest possible trigger bias that allows for the identification of non-
empty events in a given experimental context. Depending on the detector acceptance, MB will
typically also include contributions from processes that PYTHIA labels as diffractive. Thus, if the
aim is to simulate an inclusive sample of “minimum bias” events, usually both diffractive and non-
diffractive events must be included, and then subjected to the appropriate experimental trigger
requirements.
Other, related, experimental terms are zero bias (e.g. based on a bunch-crossing timing trigger,
including some a priori unknown fraction of genuinely empty events), pileup (essentially also zero
bias except in cases where pileup contamination may affect trigger variables such as calorimeter
energies), inelastic ≥ N events (inelastic events with at least N particles in some given fiducial
region), and non-single-diffractive events (typically a “double-sided” MB trigger).
Related to this, note that the distinction between diffractive and non-diffractive processes is
not without ambiguity. In experimental contexts, diffraction may be defined in terms of observable
“rapidity gaps” with no particle production detected in specific region(s) of the detector, while in
theoretical contexts processes that are classified as diffractive typically produce a whole spectrum
of gaps with small ones suppressed but not excluded, see section 6.4.2. Conversely, events that
are modelled as non-diffractive in origin may produce large rapidity gaps, due to fluctuations
in the fragmentation process and/or if colour reconnections — see section 7.2 — are allowed to
produce such gaps, and in the transition region there could even be quantum interference between
the two categories (not modelled by PYTHIA). Thus, for any given application it is important to
phrase experimental measurements in terms of clearly defined physical observables, and consider
which MC processes are going to be able to contribute to those.
Usually hard processes, such as jet or gauge-boson production, are assumed to occur within the
non-diffractive event class. This is not quite true, since it is possible also for diffractive topologies
to contribute to hard cross sections, see section 6.4.3. That contribution typically is of the order of
a per cent when modelled or measured experimentally, however, and is neglected by default. This
means that the full parton distribution functions (PDFs) are associated with the non-diffractive
component. They are used not only for the hard process itself but also for the associated MPI, ISR
and FSR activity. See further section 3.12.

6.4.2 Diffractive processes


Diffractive event topologies are illustrated in fig. 9 on p. 120, and the differential cross sections
are described in section 6.1.3. The choice of diffractive mass(es) and t values sets the overall
kinematics of the events, but does not describe the hadronization of the diffractive system. To this
end, the Ingelman–Schlein approach is used [251], with details as described further in ref. [200].
In this approach, a pomeron is viewed as a physical particle, akin to a glueball state, with an
internal structure and notably with PDFs. Similarly, a reggeon is viewed as a mesonic state, but
for the practical handling the two are not distinguished. Single diffraction therefore contains
a pomeron-proton subcollision, double diffraction two such, and central diffraction a pomeron-
pomeron subcollision. Each such subcollision is assumed to produce particles as in a normal
inelastic non-diffractive hadron-hadron collision.
At high energies the modelling on the perturbative level is then given by the MPI machinery,
augmented by ISR and FSR. There are a few issues that need to be clarified, however. Notably
the MPI collision rate involves a combination of the pomeron-inside-proton flux with the parton-
inside-pomeron PDF. What is measured, e.g. at HERA, is the convolution of the two, where the
absolute normalization of each individually is not known. Historically, the flux normalization was
specified, such that then the pomeron PDF does not have to obey the momentum sum rule. This
may seem odd, but is in line with some theoretical arguments that the pomeron is not a real particle
and therefore is not bound by such constraints. There are a dozen different pomeron PDFs that
come with PYTHIA (plus three special-purpose ones), and most of these have a momentum sum
of the order of 0.5. It is possible to scale them by a factor, to restore unit normalization. Whether
that is done or not, the rescaling of remaining momentum for subsequent MPIs is done as for a
normal hadron, however. That is, the normalization matters for the rate of MPI production, but
not for the handling of those MPIs that do occur.
Further, the ordinary non-diffractive MPI rate is related not only to PDFs but also to the nor-
malization with the non-diffractive total cross section σnd , cf. eq. (260) and other MPI expressions.
This is an unknown number from first principles, and with the same pomeron-flux-normalization
uncertainty as the PDFs, so effectively it can be used to compensate for a non-unit momentum
sum. The default value is 10 mb at a collision CM energy of 100 GeV, where it has been tuned
(with default PDFs etc.) to produce about the same average charged multiplicity as ordinary pp
non-diffractive collisions at the same energy. This value could be energy-dependent, cf. the pomeron
term in eq. (239), but currently the default is a constant value.
Diffraction tends to be peripheral, i.e. occur at high-to-intermediate impact parameter for the
two protons. That aspect is implicit in the modelling of diffractive cross sections. For the simula-
tion of the pomeron-proton subcollision itself, however, it is rather the impact-parameter distribu-
tion of that particular subsystem that should be modelled. That is, it also involves the transverse
coordinate-space shape of a pomeron wave function. The outcome of the convolution with a pro-
ton wave function could be a different shape than for non-diffractive events, and therefore it can
be set separately. The default is a simple Gaussian, for lack of any relevant data. The p⊥0 scale
is assumed the same as in non-diffractive events at the same collision energy, but also that is an
assumption that could be questioned.


The diffractive mass spectrum extends down to the ∆+ mass for pp collisions, and obviously
a perturbative MPI description would not make sense at such low energies. Instead a separate
low-mass description has been implemented. Up to 1 GeV above the hadron mass, the diffractive
system is allowed to decay isotropically into a two-hadron state. Above that, a diffractively-excited
hadron is modelled as if either a valence quark or a gluon is kicked out from it, along the collision
axis with some “primordial k⊥ ” smearing, cf. section 6.3.3.
In the former case this produces a simple string to the leftover remnant, in the latter it gives
a hairpin arrangement where a string is stretched from one quark in the remnant, via the gluon,
back to the rest of the remnant. The latter topology ought to dominate at higher mass MX of the
diffractive system. Therefore an approximate behaviour like

\[
\frac{P_q}{P_g} = \frac{N}{M_X^p} \tag{275}
\]

is assumed, with N (= 5 by default) and p (= 1) as free parameters, and MX in GeV.


There is a smooth transition between the low-mass non-perturbative and the high-mass per-
turbative descriptions. The probability for applying the latter is given by [252]

\[
P_\mathrm{pert} = 1 - \exp\left( - \frac{\max(0, M_X - m_\mathrm{min})}{m_\mathrm{width}} \right) , \tag{276}
\]

with m_min and m_width free parameters, both by default 10 GeV. Note how P_pert vanishes for
M_X below m_min.
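
Equations (275) and (276) are simple enough to evaluate directly. The sketch below uses the default values quoted in the text; the function names are hypothetical, and it assumes that the quark and gluon kick-out probabilities sum to unity, so that P_q = r/(1 + r) with r = N / M_X^p.

```python
import math


def prob_quark_kick(m_x, norm=5.0, power=1.0):
    """Probability that a valence quark rather than a gluon is kicked out,
    from the ratio in eq. (275) with M_X in GeV."""
    ratio = norm / m_x ** power  # P_q / P_g
    return ratio / (1.0 + ratio)


def prob_perturbative(m_x, m_min=10.0, m_width=10.0):
    """Probability of eq. (276) to use the perturbative high-mass description."""
    return 1.0 - math.exp(-max(0.0, m_x - m_min) / m_width)
```

With these defaults the quark and gluon options are equally likely at M_X = 5 GeV, and the perturbative description is chosen in about 63% of the cases at M_X = 20 GeV.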

6.4.3 Hard diffraction


The model for hard diffraction is somewhat different from the soft (low- and high-mass) diffraction
and it can be applied to any hard process, including e.g. high-p⊥ jets and EW bosons. The starting
point is again the Ingelman–Schlein picture where these interactions are mediated by a pomeron
whose internal structure is given by the diffractive PDFs. It has been observed, however, that this
factorization-based approach is broken as the predictions based on the diffractive PDFs determined
in diffractive DIS overshoot the hard diffractive data in hadron-hadron collisions roughly by an
order of magnitude [253,254]. In the PYTHIA framework this can be naturally explained by having
several non-diffractive partonic interactions, MPIs, in the same hadron-hadron collisions on top of
the diffractive process. These may then produce particles that fill up the rapidity gap used to select
the diffractive events leading to seemingly a suppressed diffractive cross section. The details of
this dynamical rapidity gap survival model are presented in ref. [255], together with several data
comparisons, and are briefly outlined below.
After a hard process and its kinematics are sampled, the events of diffractive origin are first
selected based on relative magnitude of the diffractive, $f_i^{p,\mathrm{D}}$, and non-diffractive,
$f_i^{p,\mathrm{ND}}$, PDFs which together form the inclusive (the usual) hadronic PDFs
$$ f_i^{p}(x, Q^2) = f_i^{p,\mathrm{ND}}(x, Q^2) + f_i^{p,\mathrm{D}}(x, Q^2) . \tag{277} $$
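The diffractive part, in turn, can be defined as a convolution between the pomeron flux
$f_{\mathbb{P}}^{p}$ and pomeron PDF $f_i^{\mathbb{P}}$:
$$ f_i^{p,\mathrm{D}}(x, Q^2) = \int_x^1 \frac{\mathrm{d}x_{\mathbb{P}}}{x_{\mathbb{P}}} \, f_{\mathbb{P}}^{p}(x_{\mathbb{P}}) \, f_i^{\mathbb{P}}(x/x_{\mathbb{P}}, Q^2) , \tag{278} $$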


which can be considered as parton-in-pomeron-in-proton PDFs, typically determined using diffractive
DIS data from HERA. After this tentative selection of diffractive events corresponding to the
Ingelman–Schlein approach, the pomeron kinematics are sampled and the event is processed further.
The essence of the PYTHIA model is then to perform a full parton-level evolution for the
original hadron-hadron system and to check whether any MPIs that would render the event
non-diffractive have occurred. This allows for the generation of a sample where only events
without such additional interactions remain and the rapidity gap has survived. It is also possible
not to perform such a check and obtain the purely factorization-based result that serves as a base-
line for the expected cross-section suppression. Notice, however, that MPIs in the pomeron-hadron
system are still allowed as these would not fill up the rapidity gap between the excited hadron
and the pomeron remnants. Remarkably, this model relies solely on the MPI model in PYTHIA
and does not require any further parameters tuned to data. Yet, it can qualitatively explain the
order-of-magnitude difference between the purely factorization-based predictions and Tevatron
and LHC data, and reproduces the latest CMS data for diffractive dijets [256] with good
precision. Only single diffraction is currently implemented, and if both beams have been found to
emit pomerons, the diffractive side is selected randomly with equal probabilities. It is possible to
consider pomeron emissions from one side only which can be useful for non-symmetric collisions.
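The convolution in eq. (278) and the tentative selection based on eq. (277) can be illustrated numerically. The sketch below is not the PYTHIA implementation: the flux and pomeron-PDF shapes are toy functions chosen only for illustration, the Q2 dependence is suppressed (a fixed scale is assumed), and all names are hypothetical.

```python
import math


def convolve(flux, pdf, x, n_steps=2000):
    """Midpoint estimate of int_x^1 dx_P/x_P flux(x_P) pdf(x/x_P).

    The substitution u = ln(x_P) turns dx_P/x_P into du, which keeps
    the integrand well behaved for small x.
    """
    lo = math.log(x)
    du = -lo / n_steps
    total = 0.0
    for k in range(n_steps):
        x_p = math.exp(lo + (k + 0.5) * du)
        total += flux(x_p) * pdf(x / x_p)
    return total * du


def diffractive_fraction(f_diff, f_nondiff):
    """Probability to tentatively classify the hard process as diffractive."""
    return f_diff / (f_diff + f_nondiff)


# Toy shapes, purely illustrative (not fitted to any data).
toy_flux = lambda x_p: 0.02 / x_p
toy_pomeron_pdf = lambda z: z * (1.0 - z)

f_d = convolve(toy_flux, toy_pomeron_pdf, 0.01)
print(f_d, diffractive_fraction(f_d, f_nondiff=5.0))
```

Events passing this first selection would then still be subject to the MPI check described above before being kept as diffractive.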

6.5 Lepton-lepton collisions


Lepton colliders have a reputation for providing the cleanest collisions possible, with
$e^+e^- \to Z \to f\bar{f}$ at LEP/SLC providing a prime example, where Z properties could be studied in minute
detail. At lower energies, charm and beauty factories have advanced our understanding of the
standard model, e.g. the weak unitarity triangle(s). The key argument for future lepton colliders
often is precision Higgs physics. Nevertheless, lepton colliders also have their challenges, as will
be discussed in this section.

6.5.1 Bremsstrahlung and lepton PDFs


A lepton is surrounded by a cloud of virtual photons. In a collision, such as e+ e− annihilation,
some of those photons survive in the final state as so-called bremsstrahlung, mainly travelling
near the incoming lepton directions, and the annihilation energy is reduced correspondingly.
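Similarly to the traditional PDF evolution in $Q^2$ of a hadron, one can here start from a low-$Q^2$
$f_e^e(x, Q_0^2) = \delta(x - 1)$ and evolve it with a splitting kernel

$$ \mathrm{d}P_{e \to e\gamma} = \frac{\mathrm{d}Q^2}{Q^2} \, \frac{\alpha_{\mathrm{em}}}{2\pi} \, \frac{1 + z^2}{1 - z} \, \mathrm{d}z , \tag{279} $$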

in close analogy with q → qg. The resummed effects of multiple photon emissions are described
in PYTHIA by an NLO expression [257] of the approximate shape

$$ f_e^e(x, Q^2) \approx \frac{\beta}{2} (1 - x)^{\beta/2 - 1} ; \qquad \beta = \frac{2\alpha_{\mathrm{em}}}{\pi} \left( \ln\frac{Q^2}{m_e^2} - 1 \right) . \tag{280} $$

The form is divergent but integrable for x → 1, i.e. the electron tends to keep most of the energy.
To handle the numerical precision problems for x very close to unity, where 64-bit double precision
would not be sufficient, the (electron) parton distribution is set to zero for $x > 1 - 10^{-10}$, and is


rescaled upwards in the range $1 - 10^{-7} < x < 1 - 10^{-10}$, in such a way that the total area under
the parton distribution is preserved:

$$ f_e^e(x, Q^2)_{\mathrm{mod}} = \begin{cases}
f_e^e(x, Q^2) & 0 \le x \le 1 - 10^{-7} \\[4pt]
\dfrac{1000^{\beta/2}}{1000^{\beta/2} - 1} \, f_e^e(x, Q^2) & 1 - 10^{-7} < x < 1 - 10^{-10} \\[4pt]
0 & x > 1 - 10^{-10} .
\end{cases} \tag{281} $$
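The claim that the rescaling in eq. (281) preserves the area can be checked with the approximate shape of eq. (280), for which the integral over $1 - \epsilon < x < 1$ is simply $\epsilon^{\beta/2}$. The sketch below does this for a scale near the Z mass; the fixed value of $\alpha_{\mathrm{em}}$ and the function names are assumptions made for illustration only.

```python
import math

ALPHA_EM = 1.0 / 137.036  # assumed fixed (non-running) coupling
M_E = 0.000511            # electron mass in GeV


def beta(q2, m_lepton=M_E, alpha_em=ALPHA_EM):
    """Exponent parameter beta of eq. (280); q2 in GeV^2."""
    return (2.0 * alpha_em / math.pi) * (math.log(q2 / m_lepton ** 2) - 1.0)


def tail_area(eps, b):
    """Analytic integral of (b/2) * (1 - x)**(b/2 - 1) over 1 - eps < x < 1."""
    return eps ** (b / 2.0)


b = beta(91.2 ** 2)
scale = 1000.0 ** (b / 2.0) / (1000.0 ** (b / 2.0) - 1.0)

# Area of the unmodified distribution above x = 1 - 1e-7 ...
original = tail_area(1e-7, b)
# ... versus the rescaled distribution, which is cut off at x = 1 - 1e-10.
rescaled = scale * (tail_area(1e-7, b) - tail_area(1e-10, b))

print(b, scale, original, rescaled)
```

The two areas agree to floating-point precision, which is the content of the prefactor $1000^{\beta/2}/(1000^{\beta/2} - 1)$.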

Turning to the photon flux, the evolution equation eq. (279) is deceptive in that it appears
to treat the electron and photon on equal footing. But, there is no resummation of the photon
spectrum, as there is for the one-and-only electron, only an increasing number of photons as the
evolution continues. The typical kinematics is also different. When we consider $f_e^e(x, Q^2)$, it is
for an annihilating $e^\pm$, where $m_e^2 \ll Q^2 \sim s$, and the radiated energy manifests itself in terms
of massless photons. For the $f_\gamma^e(x, Q^2)$, it is instead the electron that has to be on mass shell, a
requirement that leads to a non-trivial $Q^2_{\mathrm{min}}$, and the photon that is virtual. This gives a PDF like

$$ f_\gamma^e(x, Q^2) = \frac{\alpha_{\mathrm{em}}}{2\pi} \, \frac{1 + (1 - x)^2}{x} \, \ln\left( \frac{Q^2}{Q^2_{\mathrm{min}}} \right) , \qquad Q^2_{\mathrm{min}} \approx \frac{m_e^2 x^2}{1 - x} , \tag{282} $$

which obviously should vanish if $Q^2 \le Q^2_{\mathrm{min}}$. In typical physics applications, it is conventional to
set $Q^2 = Q^2_{\mathrm{max}} \sim 1$ GeV$^2$ to define a beam of quasi-real photons, that then can lead to γp and
γγ collisions. A photon more virtual than that would rather be considered as the propagator of a
deep inelastic scattering event, and one would not use PDF language to describe it. See further
section 6.6 and section 6.7.
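A minimal sketch of the photon-in-lepton PDF of eq. (282), including the requirement that it vanishes below the kinematic limit, could look as follows. It uses the approximate form of $Q^2_{\mathrm{min}}$ and a fixed $\alpha_{\mathrm{em}}$; the function name and default values are illustrative assumptions.

```python
import math


def photon_in_lepton_pdf(x, q2_max, m_lepton=0.000511, alpha_em=1.0 / 137.036):
    """Photon flux f_gamma^e(x, Q2) of eq. (282) for 0 < x < 1.

    q2_max is the upper virtuality limit in GeV^2 and m_lepton the
    lepton mass in GeV (electron by default).
    """
    q2_min = m_lepton ** 2 * x ** 2 / (1.0 - x)
    if q2_max <= q2_min:
        return 0.0
    prefactor = alpha_em / (2.0 * math.pi) * (1.0 + (1.0 - x) ** 2) / x
    return prefactor * math.log(q2_max / q2_min)


# Quasi-real photons with Q2_max = 1 GeV^2 from an electron and a muon beam.
for x in (0.01, 0.1, 0.5, 0.9):
    print(x, photon_in_lepton_pdf(x, 1.0), photon_in_lepton_pdf(x, 1.0, m_lepton=0.10566))
```

Passing the muon mass instead of the electron mass gives the corresponding flux for a muon beam, which is smaller because $Q^2_{\mathrm{min}}$ is larger.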
The above equations for an electron beam can easily be extended to a muon one, simply by
replacing me by mµ , and similarly for τ. Neutrinos do not couple to photons and so there is no
need to introduce a substructure for them.
Returning to the issue of e+ e− annihilation, the effects of bremsstrahlung are more easily
illustrated if only one photon emission is considered, but from either side, in which case

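$$ \frac{\mathrm{d}\sigma}{\mathrm{d}x_\gamma} = \frac{\alpha_{\mathrm{em}}}{\pi} \left( \ln\frac{s}{m_e^2} - 1 \right) \frac{1 + (1 - x_\gamma)^2}{x_\gamma} \, \sigma_0(\hat{s}) , \tag{283} $$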

where x γ is the photon energy fraction of the beam energy, ŝ = (1 − x γ )s is the squared reduced
hadronic CM energy, and σ0 is the ordinary annihilation cross section at the reduced energy. For
e+ e− → γ∗ → ff, where σ0 (ŝ) ∝ 1/ŝ ∝ 1/(1 − x γ ), the bremsstrahlung spectrum thus is singular
both for x γ → 0 and x γ → 1. The former is a true singularity, corresponding to infinitely soft
photons, that fortunately also carry away infinitely little energy from the electron. The latter is
cut off by the mass threshold for ff production.
If instead the e+ e− collider is running on a peak in the cross section, like the Z one at LEP 1,
and neglecting interference with γ∗ for simplicity, then σ0 (ŝ) < σ0 (s). While the soft-photon
singularity remains, any non-negligible photon energy will push the Z propagator further off-shell,
which leads to a suppression of such photon emissions and of the total Z cross section.
The situation is even more extreme for charm and beauty factories when they run on a nar-
row ψ or Υ state, where the net effect is a loss of cross section. PYTHIA does not simulate such
emissions, however, or indeed the production of onium states by e+ e− colliders.
Finally, note that leptons can be polarized both transversely and longitudinally, the former
by plane polarization in circular rings and the latter by spin rotation thereof. This can lead to


non-trivial effects on cross sections, since the standard model distinguishes between left- and
right-handed fermions, and beam polarization is therefore expected to be a main staple at future linear colliders.
While PYTHIA 6.4 encoded spin-dependent cross sections for a few common processes, none of
these have been ported to PYTHIA 8.3. If Les-Houches event input is used, such effects can be
taken into account already at that level, and will not affect the continued handling of the event
by PYTHIA.

6.5.2 Beamstrahlung
At potential future linear e+ e− colliders, the beams will be so tightly collimated that the electri-
cal field of one beam will significantly deflect the individual e± of the other. This acceleration
of charges leads to the emission of photons — beamstrahlung. Like bremsstrahlung, it gives a
reduced collision energy, a disadvantage that has to be balanced against the gains of a higher lu-
minosity. Beamstrahlung emits real photons and keeps the electrons real as well, so there is no Q2
dependence but only an x one. The fee (x) spectrum is highly dependent on the beam parameters,
and varies e.g. between the front and the tail of a bunch. It is therefore in the realm of machine
physicists to provide relevant spectra, e.g. with the GUINEA-PIG program [258]. Simplified param-
eterizations are found in the CIRCE program [259].
For e+ e− annihilation, the beamstrahlung and bremsstrahlung effects must be convoluted. Rel-
evant code for handling such a convolution does not (yet) exist in PYTHIA 8.3. In case of need, a
temporary solution is to split the energy remaining after beamstrahlung, but before bremsstrahlung,
into small bins that are generated separately and combined in proportion to their respective cross
section. This requires an initialization for each bin, but this is not such a big overhead since the
MPI bottleneck is absent in e+ e− annihilation.
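The bin-wise workaround described above can be sketched as follows. This is only an illustration under stated assumptions: each energy bin is taken to contribute in proportion to the product of its beamstrahlung probability and the cross section obtained from a separate run at that energy, and all numbers and names below are hypothetical placeholders rather than output from any program.

```python
def bin_fractions(spectrum_weights, cross_sections):
    """Fraction of the combined sample to take from each energy bin.

    spectrum_weights: probability for the post-beamstrahlung energy to
        fall into each bin (e.g. from an external beam spectrum).
    cross_sections: cross section obtained from the separate run
        initialized at each bin energy.
    """
    rates = [w * s for w, s in zip(spectrum_weights, cross_sections)]
    total = sum(rates)
    return [r / total for r in rates]


# Hypothetical three-bin example: most beams keep nearly full energy,
# but the cross section is larger at reduced energy.
weights = [0.7, 0.2, 0.1]
sigmas = [1.0, 2.0, 4.0]
fractions = bin_fractions(weights, sigmas)
print(fractions)
```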

6.5.3 Processes
PYTHIA contains many processes initiated by a fermion-antifermion pair, and these can almost
all be used both for hadron and lepton colliders. The list includes electroweak processes, top
production, Higgs physics, new gauge bosons, supersymmetry, and so on.
Most prominent is $e^+e^- \to \gamma^*/Z \to f\bar{f}$. It has been the main staple of all lepton colliders so far,
possibly with the exception of LEP 2. In addition to precision electroweak physics, it has allowed
the study of FSR and hadronization under the cleanest conditions that we can hope for. The
simplest $\gamma^*/Z \to q\bar{q}$ process produces a single string between the $q$ and $\bar{q}$ endpoints. One order
up, $\gamma^*/Z \to q\bar{q}g$ offers access both to $\alpha_s$ and to tests of string topologies, specifically to confirm that
a string is drawn from the $q$ via the $g$ to the $\bar{q}$. With four-jet events, mainly $\gamma^*/Z \to q\bar{q}gg$, the
non-Abelian nature of QCD could be established. Taken together, the measured particle composition
can be used to tune flavour parameters, measured jet rates and correlations to tune showers, and
measured particle spectra to tune longitudinal and transverse fragmentation properties.
For LEP 2, instead $W^+W^-$ pair production was the most prominent process, although γγ
physics contributed at an even higher rate. Apart from electroweak physics, of note is that
$e^+e^- \to W^+W^- \to q_1\bar{q}_2 q_3\bar{q}_4$
offers a test bed for colour reconnections, further described in section 7.2.

6.6 Lepton-hadron collisions


In lepton-hadron collisions the events are often classified in terms of virtuality of the intermediate
photon, Q2 . Events where the virtuality is large, or the mass of exchanged EW boson is large, and
the target hadron breaks up are referred to as deep inelastic scattering (DIS). At low virtualities


($Q^2 \lesssim 1$ GeV$^2$), the events are in the photoproduction region where the photons can either interact
directly as unresolved particles or fluctuate into a hadronic state with equal quantum numbers. In
PYTHIA 8.3 these two event classes are handled in separate frameworks and the special features
of the former class are discussed below. The photoproduction framework is, in turn, introduced
in section 6.7.

6.6.1 Parton distribution functions and structure functions


In DIS the intermediate boson scatters off a parton in the target hadron in a relatively clean
scattering process where the kinematics characterizing the scattering can be related to the four-
momentum of the outgoing lepton. Therefore, such collisions can be used to study the structure
of the hadron and the initial-state QCD dynamics. Let $P$ denote the four-momentum of the incoming
hadron, $k$ the incoming lepton and $k'$ the scattered lepton. Then it is possible to define the
following Lorentz-invariant quantities

$$ \begin{aligned}
Q^2 &= -q^2 = -(k - k')^2 \\
W^2 &= (P + q)^2 \\
x &= \frac{Q^2}{2 P \cdot q} \\
y &= \frac{P \cdot q}{P \cdot k} ,
\end{aligned} \tag{284} $$
purely based on measured energy and scattering angle of the scattered lepton. In fully inclusive
events, where the hadronic final state is integrated out, it is possible then to write down the
cross section of such a scattering process in terms of these quantities without making further
assumptions on the proton structure

$$ \frac{\mathrm{d}^2\sigma}{\mathrm{d}x \, \mathrm{d}y} = N^l \left[ y^2 x F_1^l(x, Q^2) + (1 - y) F_2^l(x, Q^2) \mp \left( y - \frac{y^2}{2} \right) x F_3^l(x, Q^2) \right] . \tag{285} $$

The coupling factor N l is different for neutral- and charged-current DIS and the sign of the last term
depends on whether the incoming lepton l is charged or neutral (neutrino) and if it is a particle or
an antiparticle. The structure functions Fil (x, Q2 ) represent the partonic structure of the hadron.
In the leading-order parton model [260, 261] the structure functions are simply proportional to
the sum of the parton distributions f (x, Q2 ) but do depend also on beam lepton type. The x can be
interpreted as the momentum-energy fraction of the parton with respect to the hadron momentum
P and the Q2 dependence arises from the QCD corrections at higher orders. The goal of PYTHIA 8.3
is, however, to provide fully exclusive events for which the relevant treatment is described next.
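The invariants of eq. (284) follow directly from the three four-momenta. The sketch below evaluates them for a made-up, HERA-like configuration with massless beams; the helper names and the numerical four-vectors are illustrative assumptions, not taken from any measurement.

```python
def mdot(a, b):
    """Minkowski product with metric (+,-,-,-); four-vectors are (E, px, py, pz)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def dis_invariants(p_hadron, k_in, k_out):
    """Return (Q2, W2, x, y) of eq. (284) from hadron and lepton four-momenta."""
    q = tuple(a - b for a, b in zip(k_in, k_out))
    p_plus_q = tuple(a + b for a, b in zip(p_hadron, q))
    q2 = -mdot(q, q)
    w2 = mdot(p_plus_q, p_plus_q)
    x = q2 / (2.0 * mdot(p_hadron, q))
    y = mdot(p_hadron, q) / mdot(p_hadron, k_in)
    return q2, w2, x, y


# Hypothetical configuration in GeV: proton along +z, electron along -z.
P = (920.0, 0.0, 0.0, 920.0)
k = (27.5, 0.0, 0.0, -27.5)
k_prime = (20.0, 12.0, 0.0, -16.0)  # massless scattered electron

Q2, W2, x, y = dis_invariants(P, k, k_prime)
print(Q2, W2, x, y)
```

Only the scattered-lepton momentum enters beyond the beam setup, which is why these quantities are experimentally accessible in fully inclusive measurements.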

6.6.2 Deep inelastic scattering


The DIS framework describes processes where the scattered lepton emits a highly-virtual (point-
like) photon or a massive gauge boson that interacts with the constituents of the target hadron
breaking it up. As there currently are no models for intermediate photon virtualities (Q2 ∼ 1 GeV2 )
where high-virtuality, point-like, and low-virtuality hadronic processes contribute to cross sections,
the DIS framework provides a reliable description only at sufficiently large Q2 where the scattering
is purely mediated by a point-like particle. As the resolved-photon contribution fades continuously
(roughly as ∼ 1/Q2 ), it is impossible to set a hard cut for such a region. In most applications,


however, a limit of Q2 > 5 GeV2 has turned out to be sufficient to ensure negligible contribu-
tions from the hadronic fluctuations. As the model for intermediate virtualities implemented in
PYTHIA 6 was based on several parameterizations mimicking the physical picture and turned out
to be somewhat fragile with restricted predictability, we have decided to develop a completely
new model for such processes that will be implemented in a future PYTHIA 8.3 release.

Hard processes In the LO DIS implemented in PYTHIA, the incoming lepton scatters off a quark
from the target hadron by exchanging an EW boson. As described in section 3.2, this includes both
neutral- and charged-current processes with charged leptons and neutrinos, and the interference
between a virtual photon and the Z boson can be accounted for. The DIS-optimized scale-setting
options are listed in section 3.10 and the relevant phase-space cuts in section 3.13. It is also
possible to provide the hard process as an input from an ME generator in the LHE format. However,
no matching of higher-order processes and the default parton shower has been implemented. As
the hard processes are set up in the collinear approximation, no off-shellness is allowed for the
initial lepton line. Thus no radiation should be allowed for the initial-state lepton and no PDFs
for the lepton used. The phase-space sampling for DIS is inherited from generic massless 2 → 2
scattering where the initiators are assumed massless but the final-state particles can have finite
masses. This is not ideal for DIS, however, since the invariants typically studied in DIS, Bjorken
x and Q2 , are often derived from the four-momentum of the scattered lepton. Due to a mismatch
in masses, these variables might then not match the internally sampled values which can lead to
unphysical configurations such as x > 1 when invariants are derived from the scattered lepton.
To fix the issue, a new phase-space sampling optimized for t-channel exchange of bosons with
(potentially varying) masses will be implemented. Heavy-quark pairs can be produced in two
different ways: if the lepton scatters off a heavy quark, a companion will be added by ISR, or
the heavy-quark pair can be formed from a gluon splitting by FSR. Similarly, DIS events with
more than one jet can be formed via PS emissions but no explicit hard dijet processes have been
included. The showers do, however, include matrix-element corrections for the first emissions. As
the DIS process is a scattering of a single point-like particle, no MPIs are allowed.

Parton showers Both initial- and final-state radiation from deep-inelastic-scattering processes
require a careful treatment of the branching kinematics. If emissions from the hadronic system
disturb (via recoil) the lepton line, or vice versa, then both the x and the Q2 distribution are affected
by showering. In this case, an intricate recalculation of the hard-scattering cross section after each
parton-shower emission is required, making the strategy suboptimal$^6$. The natural resolution is
to ensure that any recoil due to the branching process is contained either in the hadronic or the
leptonic system. For the hadronic system, this is most easily achieved by employing a “local dipole
recoil” strategy, in which the kinematic recoil is absorbed by a colour-connected partner. Such a
strategy is employed by the DIRE (section 4.3) and VINCIA (section 4.2) showers, and is an option for
the simple-shower methods [80]. To model the QED evolution of the leptonic line, this approach
is insufficient, however, and more complex strategies are necessary, or the conservation of x and
Q2 may need to be relaxed, e.g. for charged-current DIS events, where electric charge flows from
the lepton to the hadron system. The latter is the case in the DIRE shower.
Another important aspect of modelling radiation in DIS events is the phase-space boundary for
emissions that involve incoming partons. The factorization scale (i.e. typically $Q^2$) gives a natural
phase-space boundary when using backward initial-state evolution [52]. However, the kinematic
boundary is more accurately given by the invariant mass of the radiating dipole or the invariant
mass of the hadronic system ($W^2$), which are often, and especially for low-x values, significantly
larger than $Q^2$. A natural resolution of this issue is to keep the tight $Q^2$ constraint for the shower,
and use (tree-level) merging to supplement the missing phase-space regions. Another approach
is to abandon the DGLAP-based initial-state evolution [6]. In the absence of the latter, hard initial-state
emission in the parton-shower models of PYTHIA should be considered with caution.

$^6$ Similar concerns apply to any scattering via t-channel colour-singlet exchange, e.g. to Higgs production in
vector-boson fusion.

As the DIS events are rather clean, they offer a very good environment to study parton-shower
dynamics. For example, since the parton shower produces p⊥ kicks for the initiator via emis-
sions, it can be thought to resemble perturbative evolution of transverse-momentum dependent
(TMD) PDFs [262, 263]. Thus one should obtain reasonable p⊥ distributions in Semi–Inclusive
DIS (SIDIS) from the parton-shower enabled PYTHIA simulations.

Hadronization The hadronization of DIS events is analogous to that of hadron-hadron scattering
systems. The scattered lepton does not partake in hadronization and since no multiparton
interactions are included in DIS events, no colour reconnection model is employed. At present, no
DIS data has been used in the tuning of the hadronization model. The study of spin polarizations
and the higher-dimensional structure of the hadron are typically important aspects of DIS analysis.
In this context, it should be noted that the PYTHIA 8.3 hadronization model does not by default
consider polarization, though external tools to model such effects have been proposed [264].

6.7 Photon-hadron and photon-photon collisions


The possibility to turn a charged lepton into a photon using laser back scattering has been studied,
but has not been realized in the current or foreseen colliders. Thus photon-induced collisions are
usually studied in colliders with charged beam particles that may emit photons when accelerated
to high energies. The shape of the photon flux and the virtuality spectra are, however, different for
different beam types but, given an appropriate flux, the photon-induced processes can be treated in
a single framework regardless of the original beam configuration. Here we focus on low-virtuality
(quasi-real) photons and introduce the current simulation framework in PYTHIA 8.3 for processes
involving such effective beams.

6.7.1 Parton distribution functions of resolved photons


In total there are three separate contributions for processes with low-virtuality photons: a photon
can interact as an unresolved particle, it can split perturbatively into a quark-antiquark pair, or
it can fluctuate into a hadronic state non-perturbatively. The two latter contributions, where the
partonic constituents act as initiators for hard scattering, can be described with DGLAP-evolved PDFs.
As in the case of hadrons, the evolution equation for resolved photons does include a hadron-like
component where a non-perturbative ansatz is evolved according to the usual QCD DGLAP kernels.
In addition to this, however, the evolution equation also contains a point-like component which
feeds in more quark-antiquark pairs with increasing evolution scale that may evolve further by
QCD splittings. The full evolution equation for resolved photons is
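$$ \frac{\partial f_i^\gamma(x, Q^2)}{\partial \log(Q^2)} = \frac{\alpha_{\mathrm{em}}(Q^2)}{2\pi} \, e_i^2 \, P_{i\gamma}(x) + \frac{\alpha_s(Q^2)}{2\pi} \sum_j \int_x^1 \frac{\mathrm{d}z}{z} \, P_{ij}(z) \, f_j^\gamma\!\left( \frac{x}{z}, Q^2 \right) , \tag{286} $$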


where the $\gamma \to q\bar{q}$ splitting kernel in LO is $P_{i\gamma}(x) = 3(x^2 + (1 - x)^2)$ and $Q^2$ is the factorization
scale at which the partonic structure is probed. As in the case of proton PDFs, the parameters
related to the non-perturbative ansatz at the initial scale are determined in a global QCD analysis
comparing to experimental data. In PYTHIA 8.3, the default set for the resolved photon PDFs is
from the CJKL analysis [265] which conveniently provides the hadron-like and point-like parts
separately, which can be used for finer classification of the events with resolved photons. No
dependence on the photon virtuality is included in these PDFs, but all photons are taken as real
with zero virtuality, which is the case also for the LO photon-initiated cross sections currently
implemented in PYTHIA 8.3.

6.7.2 Photoproduction
Photoproduction typically refers to processes where a beam lepton emits a low-virtuality (quasi-
real) photon that then collides with a hadron from the other beam. The following describes some
special features of such collisions. These are not unique to ep colliders – similar processes can
take place also in e+ e− , pp, pA, and AA collisions as will be discussed in the following.

Photon flux and kinematic limits When the emitted photons are quasi-real and almost collinear
with the beam leptons, the cross section calculations can be simplified by factorizing the photon
flux from the hard perturbatively calculated part. In case of lepton beams, the flux of quasi-
real photons can be obtained from the well-known Weizsäcker–Williams [266, 267] or Equivalent
Photon Approximation (EPA). The flux differential in photon virtuality Q2 is

$$ f_\gamma^l(x_\gamma, Q^2) = \frac{\alpha_{\mathrm{em}}}{2\pi} \, \frac{\mathrm{d}Q^2}{Q^2} \, \frac{1 + (1 - x_\gamma)^2}{x_\gamma} , \tag{287} $$

where x γ is the momentum fraction carried by the (almost) collinear photon with respect to the
parent lepton. Integration from the minimum allowed virtuality yields the photon-in-lepton PDF
in eq. (282). In photoproduction, the upper limit Q2max is typically of the order 1 GeV2 , depending
on the considered experimental setup and detector acceptance. The lower limit is restricted by
the requirement of physical kinematics (on-shell leptons) for the 1 → 2 splitting and depends on
x γ , the mass of the lepton, ml , and the energy of the beam in the CM frame, E

$$ Q^2_{\mathrm{min}}(x_\gamma) = \frac{2 m_l^2 x_\gamma^2}{1 - x_\gamma - m_l^2/E^2 + \sqrt{1 - m_l^2/E^2} \, \sqrt{(1 - x_\gamma)^2 - m_l^2/E^2}} \approx \frac{m_l^2 x_\gamma^2}{1 - x_\gamma} . \tag{288} $$

From a similar consideration, one can find the kinematically allowed upper limit for x γ

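$$ x_\gamma^{\mathrm{max}} = \frac{2 \left( 1 - \dfrac{Q^2_{\mathrm{max}}}{4E^2} - \dfrac{m_l^2}{E^2} \right)}{1 + \sqrt{\left( 1 + \dfrac{4 m_l^2}{Q^2_{\mathrm{max}}} \right) \left( 1 - \dfrac{m_l^2}{E^2} \right)}} , \tag{289} $$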

which typically is very close to unity. The lower limit of x γ can be derived from the minimum
considered W of the photon-hadron system. Similarly, as for hadron-hadron collisions, this should
be large enough to justify the perturbative treatment that PYTHIA is largely based on. After the
values for x γ and Q2 have been sampled from the allowed phase space, the full kinematics for the


intermediate photon can be derived. The transverse and longitudinal momentum, q⊥ and qz as
shown in fig. 12, can be calculated from
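$$ q_\perp = \sqrt{ \frac{ \left( 1 - x_\gamma - \dfrac{Q^2}{4E^2} \right) Q^2 - \left( x_\gamma^2 + \dfrac{Q^2}{E^2} \right) m_l^2 }{ 1 - \dfrac{m_l^2}{E^2} } } \tag{290} $$
$$ q_z = \frac{ E \left( x_\gamma + \dfrac{Q^2}{2E^2} \right) }{ \sqrt{ 1 - \dfrac{m_l^2}{E^2} } } . \tag{291} $$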

The azimuthal angle is sampled from a flat distribution and the scattered lepton four-momentum
can be obtained simply from $k' = k - q$. It is also possible to provide the photon flux externally in
PYTHIA 8.3, but the sampling has been optimized for the form in eq. (287). The kinematics and
the allowed phase-space region are independent from the applied flux.
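The kinematic relations in eqs. (288)–(291) can be cross-checked numerically. The sketch below assumes that $x_\gamma$ is the photon energy fraction, so that the photon energy is $x_\gamma E$; under that assumption the photon four-vector built from $q_\perp$ and $q_z$ should have invariant mass squared $-Q^2$, $q_\perp$ should vanish at $Q^2_{\mathrm{min}}$, and $Q^2_{\mathrm{min}}(x_\gamma^{\mathrm{max}})$ should equal $Q^2_{\mathrm{max}}$. Function names are illustrative, and a muon beam at a low energy is used only to keep the mass terms numerically visible.

```python
import math


def q2_min(x, m, e):
    """Lower virtuality limit of eq. (288); masses and energies in GeV."""
    r = m * m / (e * e)
    denom = 1.0 - x - r + math.sqrt(1.0 - r) * math.sqrt((1.0 - x) ** 2 - r)
    return 2.0 * m * m * x * x / denom


def x_gamma_max(q2_max, m, e):
    """Upper limit on x_gamma of eq. (289)."""
    r = m * m / (e * e)
    root = math.sqrt((1.0 + 4.0 * m * m / q2_max) * (1.0 - r))
    return 2.0 * (1.0 - q2_max / (4.0 * e * e) - r) / (1.0 + root)


def q_perp_squared(x, q2, m, e):
    """Square of the photon transverse momentum, eq. (290)."""
    r = m * m / (e * e)
    num = (1.0 - x - q2 / (4.0 * e * e)) * q2 - (x * x + q2 / (e * e)) * m * m
    return num / (1.0 - r)


def q_z(x, q2, m, e):
    """Photon longitudinal momentum, eq. (291)."""
    r = m * m / (e * e)
    return e * (x + q2 / (2.0 * e * e)) / math.sqrt(1.0 - r)


M_MU, E_BEAM = 0.10566, 10.0   # illustrative muon beam
x, q2 = 0.3, 0.5

# Photon four-vector (energy, q_perp, q_z) should satisfy q^2 = -Q^2.
virtuality = (x * E_BEAM) ** 2 - q_perp_squared(x, q2, M_MU, E_BEAM) - q_z(x, q2, M_MU, E_BEAM) ** 2
print(virtuality, -q2)

# q_perp vanishes at the lower limit, and x_max is where Q2_min reaches Q2_max.
print(q_perp_squared(x, q2_min(x, M_MU, E_BEAM), M_MU, E_BEAM))
print(q2_min(x_gamma_max(1.0, M_MU, E_BEAM), M_MU, E_BEAM))
```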

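Figure 12: Kinematics of a photon emission (incoming lepton $k$, scattered lepton $k'$, and photon $q$
with transverse and longitudinal components $q_\perp$ and $q_z$).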

Direct and resolved photons If the (quasi-)real photon is the initiator of the hard scattering,
i.e. an unresolved (or direct) photon, the photon flux acts essentially as a PDF and can be directly
applied for sampling of the process kinematics. If the photon has fluctuated into a hadronic state,
for which the partonic structure is given by the resolved photon PDFs described above, these PDFs
have to be convoluted with the flux to define so-called parton-in-photon-in-lepton PDFs
$$ f_i^{l}(x, Q^2) = \int_x^1 \frac{\mathrm{d}x_\gamma}{x_\gamma} \, f_\gamma^{l}(x_\gamma) \, f_i^\gamma(x/x_\gamma, Q^2) , \tag{292} $$

where the photon virtuality has been integrated out and Q2 refers to the factorization scale at
which the resolved photon is probed. Here, it is also assumed that the PDFs are independent of
the photon virtuality, though alternatives containing such information exist, see e.g. ref. [268].
The flux is also used to sample the intermediate photon kinematics required to reconstruct the full
event including the remnants of the resolved photon and the kinematics of the scattered lepton.
In PYTHIA 8.3 both of these contributions, direct and resolved, are included and can be generated
simultaneously to obtain the correct mixture of the possible contributions for a given process at
considered kinematics.

ISR with photon beams For direct photons, no ISR splittings have been implemented as in
these cases the effect from additional QED emissions is typically small. For the resolved photons,
however, some additional care needs to be taken when generating ISR due to the extra term in
the PDF evolution, see eq. (286), compared to purely hadronic beam particles. As this term feeds
in quark-antiquark pairs when evolving forwards with DGLAP, in backwards evolution, relevant


for the ISR, this will collapse partons back into the original unresolved photon as illustrated in
fig. 13. If such splittings are found during the PS evolution, one can think of these processes
being of point-like origin and if not, the partons have originated from the hadron-like part of
the PDFs. This dynamical selection of these two contributions then has further implications for
beam remnants and MPIs as discussed below. This is also one of the key differences with respect to the
old PYTHIA 6 implementation, where such a selection was done already when sampling the hard
scattering, and no MPIs were allowed for the point-like contribution at any scale.

Figure 13: Backwards evolution of a point-like photon that collapses into an unresolved
photon at the scale $Q_s$. The hard-process initiator whose splittings are traced back in
ISR is highlighted in red.

MPIs with resolved photons Similarly as with resolved hadron beams, the resolved photons
may also experience several partonic interactions in each collision. These MPIs are modelled in
the same way as for hadrons as described in section 6.2, but some aspects require further attention.
The first one follows from the ISR generation discussed above. If the photon has collapsed back
to an unresolved state, it cannot have further MPIs below the scale at which this splitting has
occurred; in fig. 13 this scale is denoted by $Q_s$. Such an ordering is possible thanks to the
interleaved evolution of PS and MPIs, see eq. (271). Another potential difference is related to the
screening parameter in the semi-hard cross sections from which the MPI probabilities are
calculated. Since the partonic and spatial structure of resolved photons are quite different compared
to protons, it would be expected that the value of this parameter should be separately tuned
for collisions involving resolved photons. Indeed, first comparisons to HERA data [269] indicate
that a somewhat larger screening parameter yielding a lower MPI probability is preferred but the
constraints are still rather sparse and would benefit from further measurements of low-p⊥ hadron
production. Also, the impact-parameter profile could be modified but this would require more
experimental data sensitive to MPIs.

Remnants Since the PDFs for resolved photons contain both a hadron- and a point-like part,
the remnant construction also needs to be adjusted to handle both cases. The main difference
to a purely hadronic state is that since the point-like contribution is of a perturbative nature,
the collapse back to a pure photon state should also be handled perturbatively, namely with the
parton showers. Unlike in PYTHIA 6 the distinction into a point-like and hadron-like part is not
done when the (semi-)hard scattering is selected, but the term corresponding to the γ → qq̄ splitting
in the ISR algorithm will select the cases where the initiator has originated from a perturbative photon
splitting. In cases where there are no MPIs in the event, ending up in such a configuration means
that there is no need to add any non-perturbative remnants, as the necessary partons have been
added perturbatively by the parton shower as illustrated in fig. 13. If the ISR generation does not
end in a γ → qq̄ splitting, the resolved photon is taken to be hadron-like, and the remnants will
be constructed similarly as for any hadrons. In this case, the valence flavour is sampled based on
relative weights derived from the PDFs. The remnant construction becomes more complicated if
the initiator is found to be of a point-like origin but the beam photon has encountered additional
MPIs before (at scale Q2 > Q2s ) the resolved state is collapsed into an unresolved one. Then there
are several initiators kicked out from the beam, so a single companion cannot make the beam
configuration flavour and colour neutral. In this case the remnant is again constructed as for
any hadron, but the primordial k⊥ for the initiator of the hardest process and its companion are
derived from the scale Q2s at which the γ → qq̄ branching collapsing the photon to an unresolved
state has occurred.

Hard processes and diffraction The hard diffraction with dynamical rapidity-gap survival model
introduced in section 6.4.3 has been implemented also for photoproduction. For direct photons,
the no-MPI requirement has zero effect since there are no MPIs with unresolved photons. However,
as MPIs can still occur with resolved photons, some suppression is expected also for hard-parton
initiated processes. Indeed, there are indications that diffractive dijet photoproduction cross sec-
tions are suppressed compared to pQCD predictions based on diffractive PDFs for the target pro-
ton. The observed suppression factor depends on the applied kinematic cuts and varies between
0.5–0.9 in different analyses [270, 271]. The milder suppression compared to hadron-hadron col-
lisions at the Tevatron and LHC is explained by the presence of the direct component and the
smaller invariant mass of the photon-proton system at HERA kinematics which both reduce MPI
probability compared to hadronic collisions at higher energies. As demonstrated in ref. [272], the
MPI-based model in PYTHIA 8.3 provides a reasonable description for the various HERA data.

Soft QCD processes Apart from the non-diffractive low-p⊥ 2 → 2 scatterings that are generated
with the regulated cross section from the MPI framework using photon PDFs, the soft processes
with real photons are modelled according to the vector meson dominance (VMD) model. In this
model the photon is described as a linear combination of different vector-meson states with pref-
actors derived from experimental data. In PYTHIA 8.3, the values are taken from the analysis
presented in ref. [273] that have also been used in an SaS fit [208] for total and elastic cross
sections applied here. The included vector-meson states are ρ⁰, ω, φ⁰, and J/ψ, while the Υ is currently
neglected. In the VMD model for elastic and diffractive processes, the incoming photon
will first transform into a vector-meson state sampled according to relative weights. Then, the
interaction is handled similarly as for any other hadron-hadron case described in section 6.1.5.
The elastic scattering process in photoproduction is often referred to as exclusive vector-meson
production, for which there is nowadays a good amount of data from the HERA experiments, see
e.g. refs. [274–280]. The SaS parameterization tends to provide a good description of low-mass
vector-meson production, e.g. in the case of the ρ⁰, but underestimates higher-mass states such as the
J/ψ by a large margin. This indicates the need for further, possibly pQCD-based, modelling of
high-scale elastic processes.

6.7.3 Photon-photon collisions


Similarly to photoproduction in ep collisions, the charged-lepton beams in e+e− collisions may
emit photons that can interact with each other, leading to effective photon-photon collisions. If
both of the photons have a low virtuality, there are a number of possible combinations that must
be accounted for. In the most complex case, where both photons are resolved, the collisions are
generated in a similar manner as in hadron-hadron collisions, including parton showers for the
initial and final state, beam remnants, and, in particular, MPIs with the same special features
as in photoproduction, as discussed earlier. If one photon is unresolved and the other resolved,
the interactions are somewhat simpler, since the unresolved photon scatters off a parton from the
resolved photon. In this case, no MPIs can take place, and ISR and beam remnants are generated
only for the resolved-photon side. Both photons can also interact as unresolved particles, in which
case all final-state particles are produced from the outgoing particles through FSR and hadronization,
which are also relevant for the other possible contributions.

Kinematics The initial phase-space sampling assumes that the incoming photons are collinear
with respect to the beam particles. However, as kinematically allowed photons emitted from mas-
sive (on-shell) particles will always have a finite virtuality, they will also possess some transverse
momentum given by eq. (290). The direction of this q⊥ is not a priori known and is sampled only
after the hard process kinematics are determined. Thus the final invariant mass of the photon-
photon system, Wγγ , will depend on the virtualities of the photons and their relative azimuthal
angle, ∆φ = φ1 − φ2 . The resulting W can again be derived from the kinematics, giving
$$ W_{\gamma\gamma}^2 = 2 E_1 E_2 x_{\gamma 1} x_{\gamma 2} - Q_1^2 - Q_2^2 + 2 q_{z1} q_{z2} - 2 q_{\perp 1} q_{\perp 2} \cos(\Delta\phi) , \tag{293} $$

where x i are the momentum fractions of the photons with respect to the beam leptons whose
CM energies are Ei . To account for the possibly modified W 2 (= ŝ for the direct-direct case),
the cross section and relevant kinematic variables are recalculated after the virtualities and the
direction of the photons are sampled. Typically the changes in the cross section and kinematics
are negligible, but are needed in order to preserve the four-momentum of the event. An exception
is, however, 2 → 1 processes where it is important to keep the mass of the intermediate particle
intact, a prime example being Higgs-boson production, where the photon momentum fractions
are modified instead.
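
The relation in eq. (293) can be checked numerically against explicit four-vectors. The following is a minimal sketch, not PYTHIA code: it assumes that the two photons travel along opposite beam directions, that q_z1 and q_z2 in eq. (293) denote magnitudes along the respective beam directions, and that each photon carries energy x_γ E with q² = −Q². All function names and the numerical inputs are illustrative.

```python
import math

def photon_four_vector(energy, q2, q_perp, phi, direction):
    """Space-like photon four-vector (E, px, py, pz) with q^2 = -Q^2.

    `direction` is +1 or -1 and selects the beam direction along z.
    """
    # From E^2 - q_perp^2 - qz^2 = -Q^2.
    qz = math.sqrt(energy**2 + q2 - q_perp**2)
    return (energy, q_perp * math.cos(phi), q_perp * math.sin(phi), direction * qz)

def w2_from_four_vectors(p1, p2):
    """Invariant mass squared of the sum of two four-vectors."""
    e, px, py, pz = (a + b for a, b in zip(p1, p2))
    return e**2 - px**2 - py**2 - pz**2

def w2_eq293(e1, e2, x1, x2, q2_1, q2_2, qz1, qz2, qt1, qt2, dphi):
    """Eq. (293); qz1 and qz2 are magnitudes along the respective beams."""
    return (2.0 * e1 * e2 * x1 * x2 - q2_1 - q2_2
            + 2.0 * qz1 * qz2 - 2.0 * qt1 * qt2 * math.cos(dphi))

# Illustrative inputs (GeV units), not taken from any PYTHIA run.
E1 = E2 = 100.0
X1, X2 = 0.1, 0.2
Q2_1, Q2_2 = 0.5, 1.0
QT1, QT2 = 0.3, 0.6
PHI1, PHI2 = 0.4, 1.9

gamma1 = photon_four_vector(X1 * E1, Q2_1, QT1, PHI1, +1)
gamma2 = photon_four_vector(X2 * E2, Q2_2, QT2, PHI2, -1)
w2_vectors = w2_from_four_vectors(gamma1, gamma2)
w2_formula = w2_eq293(E1, E2, X1, X2, Q2_1, Q2_2,
                      abs(gamma1[3]), abs(gamma2[3]), QT1, QT2, PHI1 - PHI2)
print(w2_vectors, w2_formula)
```

The two numbers agree to floating-point precision, which makes explicit that the last term of eq. (293) is the only place where the relative azimuthal angle enters.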

Possible final states There are many topics that can be studied in photon-photon collisions and
the relative importance of direct and resolved contributions varies by the process and considered
kinematics. For example, Higgs production in γγ collisions is dominated by the direct-direct con-
tribution but for QCD processes, such as jets or heavy quarks that contribute to the background of
Higgs studies, the resolved photons may also have a significant contribution. Another interesting
phenomenon is the MPIs in a photon-photon system which can be studied with low-p⊥ hadrons
that arise almost completely from resolved-resolved interactions. Also QED processes, such as
dilepton production, can be considered to calibrate the photon fluxes as they are not sensitive to
QCD effects.

6.7.4 Ultra-peripheral collisions


As briefly mentioned earlier, other charged beam particles, including protons and heavy nuclei,
may also emit photons that interact with the other beam or with photons emitted by the other beam.
When the beam particles do not interact hadronically but stay intact and emit photons that give rise
to a hard interaction, the events are referred to as Ultra-Peripheral Collisions (UPCs). Since
beam particles of finite size are required not to break up, the emitted photons always have
a small virtuality and can therefore be handled with the photoproduction framework introduced
above. Photon-induced processes where the beam hadron breaks up can be simulated by
using a PDF set that includes perturbatively generated photons from DGLAP evolution with the
usual PYTHIA model for hadron-hadron collisions.
The key difference between photon fluxes from hadrons and charged leptons is that the finite
size of the emitting particle needs to be accounted for. For protons, a good approximation is
obtained with the electric dipole form factor, giving a Q2 -differential flux of the form

$$ f_\gamma^p(x_\gamma, Q^2) = \frac{\alpha_{\mathrm{em}}}{2\pi} \frac{1 + (1 - x_\gamma)^2}{x_\gamma} \frac{dQ^2}{Q^2} \frac{1}{(1 + Q^2/Q_0^2)^4} , \tag{294} $$

where Q20 = 0.71 GeV2 . Integrating over the possible virtualities will provide the flux derived
in ref. [281]. Another flux has been implemented for protons that is based on work by Budnev
et al. (see ref. [282]). The downside in the latter is that since only a virtuality-integrated form
is provided, there is not enough information to sample the full kinematics of the intermediate
photon and the virtuality sampling needs to be turned off. Therefore, this flux is not suited to
study observables sensitive to the transverse momentum of the intermediate photon as the q⊥ is
set to zero.
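
As an illustration of eq. (294), the sketch below evaluates the flux per unit Q² and isolates the dipole form-factor suppression. It is not the PYTHIA implementation: the function names are hypothetical, a fixed (non-running) value of α_em is assumed, and the flux is interpreted as a density in Q², i.e. the factor dQ² is divided out.

```python
import math

ALPHA_EM = 1.0 / 137.036   # assumed fixed fine-structure constant
Q0_SQUARED = 0.71          # GeV^2, dipole scale quoted in the text

def dipole_suppression(q2):
    """Form-factor factor 1 / (1 + Q^2/Q0^2)^4 from eq. (294)."""
    return 1.0 / (1.0 + q2 / Q0_SQUARED) ** 4

def proton_photon_flux(x, q2):
    """Photon flux of eq. (294) per unit Q^2, in GeV^-2."""
    if not 0.0 < x < 1.0 or q2 <= 0.0:
        raise ValueError("need 0 < x < 1 and Q^2 > 0")
    point_like = ALPHA_EM / (2.0 * math.pi) * (1.0 + (1.0 - x) ** 2) / x / q2
    return point_like * dipole_suppression(q2)

# At Q^2 = Q0^2 the flux is reduced by 2^-4 relative to a point-like emitter.
print(dipole_suppression(Q0_SQUARED))
print(proton_photon_flux(0.01, 0.1))
```

The steep fall-off with Q² is what keeps the photons from an intact proton at low virtuality, in line with the argument given above for UPCs.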
For heavy nuclei, it is possible to use form factors and derive the photon flux in a similar man-
ner as for protons. Usually it is more convenient to work in the impact-parameter space since the
heavy nuclei have a well-defined size and therefore it is possible to remove events where hadronic
interactions dominate the particle production by rejecting events with small impact parameter. As
shown in ref. [283], it is possible to derive an analytic form for the flux differential in the impact
parameter by assuming a point-like charge distribution. In fact, this provides a good approxima-
tion for the flux with a more realistic density profile when considering the region outside of the
nucleus relevant for UPCs. Integrating this from the minimum allowed impact-parameter value
bmin gives
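$$ f_\gamma^A(x_\gamma) = \frac{\alpha_{\mathrm{em}} Z^2}{\pi x_\gamma} \left[ 2\xi K_1(\xi) K_0(\xi) - \xi^2 \left( K_1^2(\xi) - K_0^2(\xi) \right) \right] , \tag{295} $$
where Z is the electric charge (number of protons) of the nucleus A and ξ = bmin xγ mN. As the nuclear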
beams are typically defined in terms of per-nucleon energy, mN here also refers to average nucleon
mass. A suitable value for bmin is given by the sum of the radii of the colliding nuclei. Such a flux
is included in PYTHIA 8.3 but can only be enabled by providing this as a pointer to the Pythia
object with a dedicated method. The shape and magnitude of this flux is very different from
the flux for charged leptons, and therefore the phase-space sampling must be re-optimized for
efficient event generation. A suitable over-estimate is included, but the parameters may have to
be re-adjusted for different beam configurations. When using this flux, the virtuality sampling has
to be disabled since the allowed virtualities have been essentially integrated over when converting
to impact-parameter space by Fourier transform from the momentum space.
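
To make eq. (295) concrete, the following sketch evaluates the flux with a small stand-alone implementation of the modified Bessel functions, using the standard integral representation K_ν(x) = ∫₀^∞ exp(−x cosh t) cosh(νt) dt. This is an illustration only and does not use or reproduce the PYTHIA interface: the function names, the fixed α_em, the nucleon mass value, and the example b_min are assumptions, and b_min is converted from fm to GeV⁻¹ with ħc.

```python
import math

ALPHA_EM = 1.0 / 137.036      # assumed fixed fine-structure constant
HBARC = 0.1973269804          # GeV fm

def bessel_k(nu, x, t_max=20.0, steps=4000):
    """K_nu(x) by trapezoidal integration; adequate for x above about 1e-5."""
    h = t_max / steps
    total = 0.5 * math.exp(-x)  # integrand at t = 0 with end-point weight
    for i in range(1, steps + 1):
        t = i * h
        weight = 0.5 if i == steps else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h

def nuclear_photon_flux(x, z, b_min_fm, m_nucleon=0.9315):
    """Eq. (295) for a point-like charge Z, integrated over b > b_min."""
    xi = b_min_fm / HBARC * x * m_nucleon
    k0, k1 = bessel_k(0, xi), bessel_k(1, xi)
    bracket = 2.0 * xi * k1 * k0 - xi**2 * (k1**2 - k0**2)
    return ALPHA_EM * z**2 / (math.pi * x) * bracket

# Example: lead (Z = 82) with an assumed b_min of 13.3 fm.
for x_gamma in (1e-3, 1e-2):
    print(x_gamma, nuclear_photon_flux(x_gamma, 82, 13.3))
```

The bracket in eq. (295) drops quickly once ξ approaches unity, so the flux is cut off at photon momentum fractions of order 1/(b_min m_N).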
The current framework can already be applied to many processes studied in UPCs, but it has
a few limitations as well. In proton-proton collisions it is possible to study both photon-photon
and photon-proton collisions with fully reconstructed kinematics when a Q²-dependent flux is
used. This includes all hard processes initiated by photons or partons and also soft QCD processes,
apart from central- and double-diffractive events. This allows for the study of minimum-bias
photon-proton collisions, inclusive and diffractive jet production, and photon-initiated dilepton
production with all different contributions, to name a few. In the case of heavy ions, the palette is
somewhat more limited due to the Q²-independent photon flux and the lack of a model for photon-nucleus
collisions, which will be addressed in future releases. In pA collisions, where the flux from the
heavy nucleus is amplified by the Z² factor so that the γp component dominates the cross sections,
almost all the same final states can be studied as in proton-proton collisions apart from observ-
ables highly sensitive to transverse momentum of the intermediate photon. For QCD observables,
the effect from neglected Q2 dependence will be washed out by the QCD radiation. In AA col-
lisions subsequent photon-nucleon interactions are not modelled, but high-p⊥ observables and
direct-photon dominated processes can be generated with reasonable accuracy. Photon-photon
interactions can also be considered, with the only limitation being the neglected Q2 dependence
in the kinematics that again has an effect for the q⊥ -dependent observables, e.g. the acoplanarity
of dilepton pairs produced by two direct photons.

6.8 Heavy ion collisions


The Heavy Ion (HI) collider physics community has traditionally not had very close ties to the rest
of the High–Energy Physics (HEP) community. This has also been reflected in the event generator
community, where the authors of HI event generators, although they sometimes make use of e.g.
the string fragmentation in PYTHIA, did not interact much with the authors of the main general
purpose event generators for pp, ep, and e+ e− collisions. However, with the arrival of the LHC, the
situation has changed. Not only are HI and particle physicists now part of the same collaborations,
the physics questions being asked are also starting to converge, and typical observables studied
in HI collisions are being applied to pp, and vice versa. It should therefore not come as a surprise
that PYTHIA 8.3 now also has some HI functionality implemented.
There are several ways to study HI collisions in PYTHIA 8.3. In section 6.7.4 we described how
to study ultra-peripheral HI collisions, and there is also the possibility to use nuclear PDFs to study
some observables. Here, we will concentrate on the modelling of complete exclusive hadronic
final states using the so-called ANGANTYR model [284], which is the default way of handling HI
collisions in PYTHIA 8.3.

6.8.1 Wounded nucleons


The ANGANTYR model in PYTHIA 8.3 can be said to be the successor of the old FRITIOF pro-
gram [285] which used string fragmentation to generate final states in HI collisions, and was
based on the so-called wounded nucleon model [286]. The basic assumption in the wounded nucleon
model is that each nucleon that participates in an HI collision contributes to the multiplicity
of the full final state according to a multiplicity function W(y), which has a triangular form in
rapidity,
$$ W(y) \propto \frac{1}{2} \left( 1 + \frac{y}{y_{\max}} \right) , \tag{296} $$
where ymax is the rapidity of the nucleon in the collision rest frame. This would yield the following
simple form of the rapidity distribution in an AA collision, for a given number of wounded nucleons (or
participants), Npart,p and Npart,t, in the projectile and target nuclei respectively,

N ( y) = Npart,p W ( y) + Npart,t W (− y) . (297)

FRITIOF, in its simplest form, used the fact that the distribution of particles of a hadroniz-
ing string is flat in rapidity. For each wounded nucleon, a string was stretched out to an end-
point randomly positioned uniformly in rapidity, which then on average reproduces the form in
eq. (296). Despite the simplistic nature of the model, FRITIOF was able to provide a fairly good
description of collider data at the energies available in the 1980s. In fact, even pp collisions (with
Npart,p = Npart,t = 1) were reasonably described.
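
Equations (296) and (297) are simple enough to evaluate directly. The sketch below is an illustration with hypothetical function names; the overall normalization is left open, as in eq. (296), and the weight is set to zero outside |y| ≤ y_max as an assumption.

```python
def wounded_weight(y, y_max):
    """Triangular multiplicity function of eq. (296), up to normalization."""
    if abs(y) > y_max:
        return 0.0  # assumed: no contribution outside the allowed range
    return 0.5 * (1.0 + y / y_max)

def rapidity_density(y, n_part_proj, n_part_targ, y_max):
    """Eq. (297): sum of projectile and target contributions."""
    return (n_part_proj * wounded_weight(y, y_max)
            + n_part_targ * wounded_weight(-y, y_max))

# A symmetric system gives a flat distribution; an asymmetric one is tilted
# towards the side of the nucleus with more wounded nucleons.
for y in (-4.0, 0.0, 4.0):
    print(y, rapidity_density(y, 1, 1, 5.0), rapidity_density(y, 1, 10, 5.0))
```

For N_part,p = N_part,t the two triangles add up to a plateau, which is the flat-in-rapidity behaviour of a single hadronizing string mentioned above.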

With the energies achievable at RHIC and LHC, the basically non-perturbative FRITIOF model
falls short of reproducing data, and the ANGANTYR model was developed to address these short-
comings.

6.8.2 The ANGANTYR model


In comparison to FRITIOF, the ANGANTYR model introduces two major new ingredients. First,
rather than wounded nucleons only resulting in a string stretched out and being hadronized, a
full diffractive excitation is generated using the full multiparton interaction machinery of PYTHIA
where these are described in terms of a pomeron-proton collision. In addition, a more advanced
version of the Glauber simulation is used where special attention is given to the fluctuations in the
nucleon wave functions, making it possible to differentiate between different types of Nucleon–
Nucleon (N N ) subcollisions.
Starting with the new Glauber modelling, we rely on the Good–Walker formalism [207] to
connect the different types of N N semi-inclusive cross sections with fluctuations in the wave func-
tions [287].
For a projectile particle with an internal substructure, it is possible that the mass eigenstates
differ from the elastic scattering eigenstates. We denote the mass eigenstates Ψi, with the projectile
in the ground state (e.g. a nucleon) denoted Ψ0, while Φl are the eigenstates of the scattering
amplitude T, with $T\Phi_l = t_l\Phi_l$. The mass eigenstates are linear combinations of the scattering
eigenstates, $\Psi_i = \sum_l c_{il}\Phi_l$. The scattering can be treated as a measurement, where the projectile
selects one of the eigenvalues $t_l$ with probability $|c_{0l}|^2$.
The elastic amplitude for the ground-state projectile is then given by
$\langle\Psi_0|T|\Psi_0\rangle = \sum_l |c_{0l}|^2 t_l \equiv \langle T\rangle$,
where 〈T〉 is the expectation value of the amplitude T for the projectile. The elastic cross section
is then given by
$$ d\sigma_{\mathrm{el}}/d^2b = \langle T(b)\rangle^2 . \tag{298} $$
Working in impact-parameter space, the amplitude depends on b, and the total diffractive-scattering
cross section, σdiff , is the sum of transitions to all states Φl :
$$ d\sigma_{\mathrm{diff}}/d^2b = \sum_l \langle\Psi_0|T|\Phi_l\rangle \langle\Phi_l|T|\Psi_0\rangle = \langle\Psi_0|T^2|\Psi_0\rangle , \tag{299} $$

where we have used the fact that the Φl form a complete set of states. Subtracting the elastic
cross section, we then obtain the cross section for diffractive excitation, which thus is given by the
fluctuations in the scattering amplitude:

$$ d\sigma_{\mathrm{diff-tot}}/d^2b = \langle T^2\rangle - \langle T\rangle^2 . \tag{300} $$
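
A toy numerical example may help to see how eqs. (298) to (300) fit together. The sketch below takes a hypothetical set of scattering eigenvalues t_l and probabilities |c_0l|² at a fixed impact parameter (the numbers are invented for illustration) and returns the elastic, total diffractive, and diffractive-excitation contributions.

```python
def good_walker(probabilities, eigenvalues):
    """Return (elastic, total diffractive, diffractive excitation) at fixed b.

    probabilities: the weights |c_0l|^2, assumed to sum to one.
    eigenvalues:   the corresponding scattering eigenvalues t_l.
    """
    mean_t = sum(p * t for p, t in zip(probabilities, eigenvalues))
    mean_t2 = sum(p * t * t for p, t in zip(probabilities, eigenvalues))
    elastic = mean_t**2                 # eq. (298)
    total_diffractive = mean_t2         # eq. (299)
    excitation = mean_t2 - mean_t**2    # eq. (300): variance of T
    return elastic, total_diffractive, excitation

# Two equally likely eigenstates with different opacities.
print(good_walker([0.5, 0.5], [0.2, 0.8]))
# A single eigenstate: no fluctuations, hence no diffractive excitation.
print(good_walker([1.0], [0.6]))
```

The second call shows the key point of the formalism: without fluctuations in the amplitude, the excitation term vanishes and only elastic scattering remains.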

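In an NN collision, both the projectile and the target fluctuate, leading to single-diffractive
excitation of the projectile or the target, as well as to double diffraction. The different NN cross
sections are then given by

$$
\begin{aligned}
d\sigma_{\mathrm{tot}}/d^2b &= \langle 2T(b)\rangle_{p,t} \\
d\sigma_{\mathrm{abs}}/d^2b &= \langle 2T(b) - T^2(b)\rangle_{p,t} \\
d\sigma_{\mathrm{el}}/d^2b &= \langle T(b)\rangle^2_{p,t} \\
d\sigma_{Dt}/d^2b &= \langle\langle T(b)\rangle^2_p\rangle_t - \langle T(b)\rangle^2_{p,t} \\
d\sigma_{Dp}/d^2b &= \langle\langle T(b)\rangle^2_t\rangle_p - \langle T(b)\rangle^2_{p,t} \\
d\sigma_{DD}/d^2b &= \langle T^2(b)\rangle_{p,t} - \langle\langle T(b)\rangle^2_p\rangle_t - \langle\langle T(b)\rangle^2_t\rangle_p + \langle T(b)\rangle^2_{p,t} .
\end{aligned}
\tag{301}
$$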

Here 〈· · ·〉 p and 〈· · ·〉 t are averages over projectile and target states respectively, and subscripts Dt,
Dp, and DD stand for single-diffractive excitation of the target, the projectile, and double diffrac-
tion, respectively. The absorptive or non-diffractive inelastic cross section is given by σabs . We note
that the diffractive excitation is directly related to fluctuations in the nucleon wave function.
In ANGANTYR, we use these cross sections in the Glauber modelling to determine not only
which nucleons have been wounded, but also to differentiate whether they were non-diffractively scattered
or only diffractively excited. The fluctuations are by default modelled using a varying radius
of the nucleons, distributed according to a Gamma distribution,

$$ P(r) = \frac{r^{k-1} e^{-r/r_0}}{\Gamma(k)\, r_0^k} , \tag{302} $$

and in addition, introducing a varying opacity of the elastic amplitude, which depends on the radii
of the projectile and target nucleons, r p and r t ,
$$ T(b, r_p, r_t) = T_0(r_p + r_t)\, \Theta\!\left( \sqrt{\frac{(r_p + r_t)^2}{2 T_0(r_p + r_t)}} - b \right) , \tag{303} $$

where
$$ T_0(r_p + r_t) = \left( 1 - \exp\left( -\pi (r_p + r_t)^2/\sigma_t \right) \right)^{\alpha} . \tag{304} $$
We then obtain the differential semi-inclusive cross sections in eq. (301) using
$\langle\cdots\rangle_i = \int dr_i\, P(r_i) (\cdots)$, which gives e.g.
$$ \langle\langle T(b)\rangle^2_p\rangle_t = \int P(r_t) \left[ \int P(r_p)\, T(b, r_p, r_t)\, dr_p \right]^2 dr_t . \tag{305} $$

Three parameters (k, r0 and σ t ) depend on the N N collision energy, and need to be deter-
mined. By default this is done in the ANGANTYR initialization by fitting the integrated total and
semi-inclusive N N cross sections to the parameterization in PYTHIA 8.3 (see section 6.1), using
a simple genetic algorithm. If needed, the parameters can be specified by the user to avoid the
somewhat time-consuming fitting procedure.
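
The ingredients of eqs. (302) to (304) can be combined in a few lines. The sketch below is a stand-alone illustration, not the ANGANTYR code: the function names are hypothetical, the exponent α is taken to act on the whole bracket in eq. (304), and the numerical values of k, r0, σt and α are arbitrary placeholders rather than fitted values. Python's `random.gammavariate(k, r0)` samples exactly the density of eq. (302).

```python
import math
import random

def opacity_t0(r_sum, sigma_t, alpha):
    """Eq. (304): T0 = (1 - exp(-pi r_sum^2 / sigma_t))^alpha."""
    return (1.0 - math.exp(-math.pi * r_sum**2 / sigma_t)) ** alpha

def amplitude(b, r_p, r_t, sigma_t, alpha):
    """Eq. (303): a disc of height T0 and radius (r_p + r_t) / sqrt(2 T0)."""
    r_sum = r_p + r_t
    t0 = opacity_t0(r_sum, sigma_t, alpha)
    if t0 <= 0.0:
        return 0.0
    return t0 if b < r_sum / math.sqrt(2.0 * t0) else 0.0

def sample_radius(k, r0, rng):
    """Draw a nucleon radius from the Gamma distribution of eq. (302)."""
    return rng.gammavariate(k, r0)

def mean_absorptive_probability(b, k, r0, sigma_t, alpha, n_samples, rng):
    """Monte Carlo estimate of <2T - T^2>_{p,t} at impact parameter b."""
    total = 0.0
    for _ in range(n_samples):
        t = amplitude(b, sample_radius(k, r0, rng), sample_radius(k, r0, rng),
                      sigma_t, alpha)
        total += 2.0 * t - t * t
    return total / n_samples

# Placeholder parameters (lengths in fm, sigma_t in fm^2), not fitted values.
rng = random.Random(12345)
for b in (0.0, 1.0, 2.0):
    print(b, mean_absorptive_probability(b, 2.0, 0.4, 4.0, 0.5, 20000, rng))
```

In the actual initialization the corresponding averages are integrated over b and compared with the parameterized cross sections, which is what the fit of k, r0 and σt adjusts.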
The Glauber calculation works as follows. First the 3D positions of the nucleons in the nuclei
are modelled using a Woods–Saxon parameterization (by default the parameterization with a
hard core from refs. [288, 289] is used). Then, an impact parameter between the nuclei is generated
according to a user-specified importance sampling (by default a 2D Gaussian). For each nucleon
we then sample the wave function according to eq. (302). This gives us the probability that a
projectile nucleon, i, scatters non-diffractively with target nucleon, j, as

$$ 2T(b_{ij}, r_i, r_j) - T^2(b_{ij}, r_i, r_j) , \tag{306} $$

where bij is the impact parameter between the nucleons. However, we also want to obtain the probability
of diffractive excitation, which involves the fluctuations. We do this by generating an additional
radius, r′, for each nucleon, thus sampling the fluctuations. In this way we obtain four statistically
equivalent NN collisions and we can ensure that on average we obtain the correct integrated
non-diffractive and diffractive excitation cross sections, by shuffling the probabilities between the
four combinations so that for each the probability never exceeds unity, as explained in ref. [284]. It
should be noted that this trick does not allow us to determine the correct amount of elastic scattering,
but these scatterings are of less importance in a Glauber calculation.

At the end of the Glauber modelling, we have a long list of all potential NN subcollisions,
each with an assigned type of interaction. This list tells us how many NN events, and of which kind,
we will generate using the normal pp minimum-bias framework in PYTHIA 8.3, to be
merged together into a full HI collision event. This is done as follows.

• Order all non-diffractive subcollisions in the N N impact parameter, bi j , and iterate with
increasing bi j .

• If none of the nucleons has been involved in a non-diffractive subcollision with smaller bi j ,
generate a (primary) non-diffractive subevent.

• If one of the nucleons has been involved in a previous subevent, generate a single-diffraction
N N event corresponding to the diffractive excitation of the other nucleon (using a special
modification as explained in ref. [284]) and merge this with the corresponding previous
subevent.

• If both of the nucleons are already in a generated subevent, do nothing.

When we merge a single diffraction subevent, we only add the diffractively excited subsystem,
removing the elastically scattered nucleon. We also take some longitudinal momentum from the
remnants of the primary event to ensure momentum-energy conservation.
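
The bookkeeping in the list above can be sketched as follows. This is only a schematic illustration of the ordering logic for the non-diffractive subcollisions; the data layout, function name, and labels are hypothetical, and the actual event generation and merging performed by ANGANTYR are not modelled.

```python
def classify_nd_subcollisions(subcollisions):
    """Decide which subevents to generate for non-diffractive subcollisions.

    subcollisions: list of (b_ij, projectile_index, target_index) tuples.
    Returns a list of (b_ij, i, j, label) for the subevents to generate.
    """
    used_proj, used_targ = set(), set()
    generated = []
    for b, i, j in sorted(subcollisions):   # iterate with increasing b_ij
        proj_used, targ_used = i in used_proj, j in used_targ
        if proj_used and targ_used:
            continue                        # both already in a subevent
        if not proj_used and not targ_used:
            label = "primary non-diffractive"
        elif proj_used:
            label = "secondary: excite target nucleon"
        else:
            label = "secondary: excite projectile nucleon"
        used_proj.add(i)
        used_targ.add(j)
        generated.append((b, i, j, label))
    return generated

# Two projectile and two target nucleons, all pairs flagged non-diffractive.
example = [(0.5, 0, 0), (0.2, 0, 1), (0.9, 1, 1), (0.7, 1, 0)]
for entry in classify_nd_subcollisions(example):
    print(entry)
```

In this example the pair with the smallest b_ij becomes the primary subevent, the next two add one diffractively excited nucleon each, and the last pair is skipped because both nucleons are already accounted for.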
In a similar way, we go through all double- and single-diffractive subcollisions, and add these to
the full HI event. In the end, we take all non-interacting nucleons and collect them into projectile
and target nucleus remnants, which each end up as a single entry in the event record with PDG-ID
codes of the form 100ZZZAAA9, depending on the number of neutrons and protons, which in the
PDG standard corresponds to a highly-excited nucleus.
It should be noted that all subevents above are generated on the parton level, which allows
us to hadronize them together. This gives us the option to perform string shoving and rope
formation (see sections 7.3.1 and 7.3.2) on the full HI partonic state.
The main use of the ANGANTYR model is to generate minimum-bias events. It is, however, also
possible to generate specific hard processes in HI events. If a hard process is specified by the user,
the Glauber modelling will proceed as before, but (at least) one of the non-diffractive primary
NN events will be replaced by a specific hard-interaction event, and at the same time the event will be
reweighted by a factor given by
NND σhard /σND , (307)
where NND is the number of non-diffractive subcollisions. Note that for the specified hard pro-
cesses, ANGANTYR treats pp, pn, np, and nn subcollisions separately, which is not the case for the
minimum bias, where isospin symmetry is assumed.
By default, PYTHIA 8.3 will automatically initialize the ANGANTYR machinery as soon as one
of the beams is specified to be a nucleus (using the PDG ID of the form 100ZZZAAAI, where I
indicates the excitation level). It is possible to use the ANGANTYR machinery also for minimum-
bias pp collisions, by setting HeavyIon:mode = 2.
Finally, it should be noted that only the most commonly used nuclei are defined by default
in PYTHIA 8.3, but a user can easily define further nuclei. Note also that the beam energy of a
nucleus is specified by giving the energy per nucleon, following the convention of the field.
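
The nuclear PDG codes mentioned above follow a fixed digit pattern, which the small helper below illustrates. The function names are hypothetical and not part of PYTHIA; only the 100ZZZAAAI layout used in the text is encoded (no hypernuclei).

```python
def nucleus_pdg_id(z, a, isomer=0):
    """Return the PDG code 100ZZZAAAI for a nucleus with Z protons, A nucleons."""
    if not (0 < z <= a < 1000 and 0 <= isomer <= 9):
        raise ValueError("need 0 < Z <= A < 1000 and 0 <= I <= 9")
    return 1_000_000_000 + z * 10_000 + a * 10 + isomer

def decode_nucleus_pdg_id(code):
    """Split a 100ZZZAAAI code into (Z, A, I)."""
    if code // 10_000_000 != 100:
        raise ValueError("not a code of the form 100ZZZAAAI")
    return (code // 10_000) % 1000, (code // 10) % 1000, code % 10

print(nucleus_pdg_id(82, 208))              # lead-208 beam
print(nucleus_pdg_id(82, 208, isomer=9))    # remnant-style code ending in 9
print(decode_nucleus_pdg_id(1000791970))    # gold-197
```

A code built this way is what would be supplied as the beam identity, with the beam energy then given per nucleon as noted above.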

7 Hadronization
Hadronization (often also referred to as fragmentation) is the process of turning the final out-
going, coloured partons into colourless hadrons. This transition is non-perturbative, and must
be handled by models. In PYTHIA it is based on the Lund string model [290, 291], which is also
historically the core of the JETSET/PYTHIA programs. Even though the core methods for string
hadronization are identical to previous versions of PYTHIA, the past years have seen significant
activity in the area of fragmentation dynamics, guided by the discovery of heavy-ion-like effects in
hadronic collisions. In PYTHIA, these efforts have culminated in a multitude of models modifying
the original Lund strings in the presence of other strings in an event.

7.1 The Lund String model


Results from lattice QCD support viewing the confining force field between a colour and an
anti-colour charge, such as a qq̄ pair, as a flux tube with potential energy increasing linearly with the
distance between the charge and the anti-charge. As the partons move apart, energy is transferred
from the partons at the ends of the string to the string itself, at a rate given by the string tension
κ ≈ 1 GeV/fm. This directly
gives rise to the so-called “yoyo” modes of single qq̄ dipoles in 1+1 dimensions (see footnote 7), as illustrated in
fig. 14 (a). In the figure, an evolution starts at time t = 0, where all the energy is stored in the
ends, and none in the string,

$$ (E, p_x)_{q\bar{q}} = \frac{1}{2} \left( \sqrt{s}, \pm\sqrt{s} \right) , \qquad E_{\mathrm{string}} = 0 . \tag{308} $$

The string reaches its maximal extension at time t = √s/2κ. Here, all energy has been transferred
from the end-points to the string:

$$ (E, p_x)_{q\bar{q}} = (0, 0) , \qquad E_{\mathrm{string}} = \sqrt{s} . \tag{309} $$

At time t = √s/κ, the string ends are back at their starting point, but with their momenta swapped
compared to eq. (308), and finally at t = 2√s/κ, the string has been through a full period.
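
The energy bookkeeping of the yoyo mode can be written down explicitly. The sketch below assumes massless endpoints in 1+1 dimensions that each lose (or regain) energy at the rate κ, with time measured in fm (c = 1); the function name is illustrative.

```python
import math

def yoyo_state(t, sqrt_s, kappa=1.0):
    """Return (energy of each endpoint, energy stored in the string) at time t.

    Units: energies in GeV, time in fm, kappa in GeV/fm.
    """
    half_period = sqrt_s / kappa               # ends are back at the origin
    phase = math.fmod(t, half_period)
    e_end = abs(0.5 * sqrt_s - kappa * phase)  # each endpoint loses kappa per fm
    e_string = sqrt_s - 2.0 * e_end            # remainder sits in the string
    return e_end, e_string

SQRT_S = 10.0  # GeV, illustrative value
for time in (0.0, 2.5, 5.0, 10.0, 20.0):
    print(time, yoyo_state(time, SQRT_S))
```

The printed values reproduce eq. (308) at t = 0, eq. (309) at t = √s/2κ, and the return to the starting energies at t = √s/κ and t = 2√s/κ; the swap of the momentum directions after half a period is not tracked in this sketch.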
In the string picture, yoyo modes like this are identified as mesons, with flavour determined by
their quark content (see section 7.1.1). Longer strings will break into hadrons, with new qq̄ pairs
breaking up the original string. Aligning the string axis of the original string with the x axis, this
process is depicted in fig. 14 (b). The qq̄ pairs are produced around a hyperbola, and join together
to form the hadrons, depicted as arrows. A hadron produced on the string is then characterized
by two adjacent vertices (i and i − 1), with space-time coordinates (x and t) correlated through
the hadron mass (m):
$$ m_i^2/\kappa^2 = (x_i - x_{i-1})^2 - (t_i - t_{i-1})^2 . \tag{310} $$
In general, a string will break into a state with n hadrons, which in the model is given by the
probability [292]:
$$ dP \propto \left[ \prod_{i=1}^{n} N\, d^2p_i\, \delta(p_i^2 - m^2) \right] \delta^{(2)}\!\left( \sum_i p_i - P_{\mathrm{tot}} \right) \exp(-bA) , \tag{311} $$
Footnote 7: The following convention for spatial coordinates is used. When discussing the 1+1 dimensional string, x is taken
as the spatial coordinate. When we move on to discuss 3+1 dimensional strings, the coordinate z is chosen to be the
coordinate along the string axis, as this will often coincide with the coordinate along the beam axis, which is normally
denoted z.


[Figure 14, panels (a) and (b). Panel (a) marks the times t = 0, t = √s/2κ, t = √s/κ and t = 2√s/κ;
panel (b) has axes x and t and indicates the area A/κ².]

Figure 14: (a) The yoyo picture of a meson, at several steps in time as explained in the
text. (b) A quark-antiquark string breaking into hadrons. The original pair is moving
outwards along light-like trajectories. New qq̄ pairs are produced around a hyperbola
in (x, t), and combine into hadrons.

where A is the area covered by the string before breakup in units of κ, as shown in fig. 14 (b),
and b is a parameter. If the string breaking is imagined as an iterative process, then imposing the
consistency constraint that the same result should be obtained (on average) by fragmenting from the
left or from the right gives the distribution of the fraction z of the remaining light-cone momentum
taken by each hadron as
$$ f(z) \propto \frac{(1 - z)^a}{z} \exp\left( -\frac{b m^2}{z} \right) , \tag{312} $$
where a is a new parameter related to N and b in eq. (311). Once transverse momenta are
introduced, the substitution m² → m⊥² is performed, with the “transverse mass” defined by

$$ m_\perp^2 = m^2 + p_\perp^2 . \tag{313} $$

The resulting form of eq. (312) is known as the Lund symmetric fragmentation function. This simple
picture of a qq̄ system can be extended to topologies including gluons, without introducing new
parameters, by viewing the gluon as a kink on the string in the Nc → ∞ limit, with separate colour
and anti-colour indices. A string can as such stretch from e.g. the quark end through a number of
gluons, and end in the antiquark end [291].
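
To get a feeling for the shape of eq. (312), the sketch below evaluates the unnormalized function and draws z values by plain rejection sampling against a flat envelope. This is a deliberately simple illustration and makes no claim about how PYTHIA samples z internally; the function names are hypothetical, and the values a = 0.68 and b = 0.98 GeV⁻² are assumed here as typical default-like inputs.

```python
import math
import random

def lund_ff(z, a, b, m_t2):
    """Unnormalized Lund symmetric fragmentation function, eq. (312)."""
    if not 0.0 < z < 1.0:
        return 0.0
    return (1.0 - z) ** a / z * math.exp(-b * m_t2 / z)

def sample_z(a, b, m_t2, rng, grid=2000):
    """Rejection sampling with a flat envelope (simple, not efficient)."""
    # Envelope height from a grid scan, padded by a small safety margin.
    f_max = 1.05 * max(lund_ff((i + 0.5) / grid, a, b, m_t2)
                       for i in range(grid))
    while True:
        z = rng.random()
        if rng.random() * f_max < lund_ff(z, a, b, m_t2):
            return z

rng = random.Random(2022)
A_LUND, B_LUND = 0.68, 0.98      # assumed parameter values (b in GeV^-2)
M_T2 = 0.25                      # GeV^2, illustrative transverse mass squared
draws = [sample_z(A_LUND, B_LUND, M_T2, rng) for _ in range(1000)]
print(sum(draws) / len(draws))
```

Increasing m⊥² shifts the distribution towards larger z, since the exponential factor suppresses small z more strongly for heavier hadrons.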
While the default behaviour of PYTHIA is to always use eq. (312) with given values for the
parameters a and b, the a parameter can in principle be different for each flavour. This possibility
is implemented for s quarks and diquarks. Going from an old flavour i to a new flavour j, the
fragmentation function would thus be modified as:

$$ f(z) \propto \frac{z^{a_i}}{z} \left( \frac{1 - z}{z} \right)^{a_j} \exp\left( -\frac{b m_\perp^2}{z} \right) . \tag{314} $$

Finally, the Bowler modification [293], formulated within the Artru–Mennessier model [294], allows for massive
endpoint quarks with mass mQ. This modifies the symmetric fragmentation function, as the
area swept out by massive endpoint quarks is reduced compared to that of massless ones. Though using
this modification is a break with the Lund-string philosophy, it is available as an option, where an
effective a term for a discrete mass spectrum [295] is used:

$$ f(z) \propto \frac{1}{z^{1 + r_Q b m_Q^2}}\, z^{a_\alpha} \left( \frac{1 - z}{z} \right)^{a_\beta} \exp\left( -\frac{b m_\perp^2}{z} \right) . \tag{315} $$

A common use case is to enable the Bowler modification for fragmentation for heavy quarks, as it
can describe the somewhat harder spectrum better.
The derivation of eq. (312) also gives the probability distribution in proper time (τ) of qq̄
breakup vertices, i.e. a quantity that can be interpreted as (input to) a hadron production time. In
terms of Γ = (κτ)² it is:
$$ P(\Gamma)\, d\Gamma \propto \Gamma^a \exp(-b\Gamma)\, d\Gamma . \tag{316} $$
From this distribution it is possible to calculate the average squared breakup time of a qq̄ string:

$$ \langle\tau^2\rangle = \frac{1 + a}{b\kappa^2} . \tag{317} $$

Default PYTHIA values for a and b give 〈τ²〉 ≈ 2 fm². The Γi values can be defined recursively,

$$ \Gamma_i = (1 - z) \left( \Gamma_{i-1} + \frac{m_\perp^2}{z} \right) , \tag{318} $$

with Γ0 = 0.
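
Equations (317) and (318) are straightforward to evaluate. The sketch below uses hypothetical function names and assumes a = 0.68, b = 0.98 GeV⁻² and κ = 1 GeV/fm as inputs; with these units, 〈τ²〉 comes out in fm² and Γ in GeV².

```python
def mean_tau_squared(a, b, kappa=1.0):
    """Eq. (317): <tau^2> in fm^2 for b in GeV^-2 and kappa in GeV/fm."""
    return (1.0 + a) / (b * kappa**2)

def gamma_sequence(z_values, m_t2_values):
    """Eq. (318): successive Gamma_i in GeV^2, starting from Gamma_0 = 0."""
    gammas = [0.0]
    for z, m_t2 in zip(z_values, m_t2_values):
        gammas.append((1.0 - z) * (gammas[-1] + m_t2 / z))
    return gammas

print(mean_tau_squared(0.68, 0.98))               # roughly 1.7 fm^2
print(gamma_sequence([0.5, 0.5], [1.0, 1.0]))     # Gamma_0, Gamma_1, Gamma_2
```

The recursion makes clear that each breakup vertex inherits the history of the previous one, so the Γ values along a string are correlated even though each z is drawn independently.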

7.1.1 Selection of flavour and transverse momentum


In the previous section, the qq̄ pairs in the string breaking were treated as massless and without
transverse momenta. If the quark and antiquark have a transverse mass, they can no longer be produced
in a single vertex, but must tunnel through a forbidden region of size m⊥/κ. The tunnelling
probability can be calculated in the WKB approximation, giving [290]:

$$ \frac{1}{\kappa} \frac{dP}{d^2p_\perp} \propto \exp(-\pi m_\perp^2/\kappa) = \exp(-\pi m^2/\kappa) \exp(-\pi p_\perp^2/\kappa) . \tag{319} $$

Here m⊥ is the transverse mass of the quark, and the factorization of the result allows separation
of the generation of m and p⊥ .
The relative production of light quarks of different mass, and thus of different flavour, could in
principle be obtained directly by inserting u, d and s quark masses in eq. (319). It is, however, not
obvious what quark masses to use. Current quark masses lead to too little strangeness suppression,
and constituent quark masses lead to too much. Instead, the suppression is viewed as a free
parameter, and tuned to LEP data. The current default s suppression relative to u or d types is
0.217, which does not imply unreasonable effective quark masses in eq. (319). Heavier quark
flavours are suppressed too heavily to be produced in string breakings, for any reasonable value
of their masses.
The generation of p⊥ by eq. (319) can be implemented by giving the quark and antiquark
Gaussian p⊥ kicks with σ² = κ/π ≈ (0.25 GeV)². Fits to data prefer a higher value, around
σ = 0.35 GeV, implying that a large fraction of the p⊥ kick comes from another source, such as
soft gluon radiation below the parton shower cutoff.
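
The numbers quoted in this subsection follow from eq. (319) with a short calculation. The sketch below converts κ = 1 GeV/fm to GeV² with ħc, evaluates the width σ² = κ/π, and inverts the tunnelling factor to find the mass-squared difference implied by a given strangeness suppression. The function names are illustrative, and reading the suppression as exp(−π(m_s² − m_u²)/κ) is the simple interpretation suggested by eq. (319), not a statement about how the parameter is tuned.

```python
import math

HBARC = 0.1973269804    # GeV fm
KAPPA = 1.0 * HBARC     # string tension of 1 GeV/fm expressed in GeV^2

def tunnelling_sigma(kappa=KAPPA):
    """Gaussian width in GeV from sigma^2 = kappa / pi."""
    return math.sqrt(kappa / math.pi)

def mass_squared_difference(suppression, kappa=KAPPA):
    """m_s^2 - m_u^2 in GeV^2 from exp(-pi (m_s^2 - m_u^2) / kappa)."""
    return -kappa * math.log(suppression) / math.pi

print(tunnelling_sigma())                            # about 0.25 GeV
delta_m2 = mass_squared_difference(0.217)
print(delta_m2, math.sqrt(delta_m2))                 # about 0.096 GeV^2, 0.31 GeV
```

For a negligible light-quark mass, the default suppression of 0.217 thus corresponds to an effective strange-quark mass of roughly 0.3 GeV, between typical current and constituent values.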

[Figure 15, frames a) to d), showing the colour labels along the string: a) r r̄; b) r g ḡ r̄;
c) r g b b̄ ḡ r̄; d) r g b b̄ b b̄ ḡ r̄.]

Figure 15: Illustration of step-wise popcorn production of a baryon-antibaryon pair, with
a meson in between. In frame a), a string is spanned between a red-antired (rr̄) qq̄ pair,
with colour flow indicated by the arrow. In frame b), a green-antigreen (gḡ) qq̄ pair has
appeared as a vacuum fluctuation between them, reversing the colour flow in the central
part of the string. In frame c), an additional pair is produced, breaking the string, and
in frame d) another breakup produces a meson between the baryon and anti-baryon.
Figure from ref. [247].

Besides production of the normal light-quark species, other hadron types can be produced
through the same mechanism with a few modifications. Excited mesons are allowed by letting
quarks and antiquarks combine to a total spin of either 0 or 1. Considering only pseudoscalar
and vector multiplets, the expectation of the relative rate is 1 : 3, while data – at least in the
case of π : ρ – prefers a ratio about 1. This difference between prediction and data can be
explained as a result of differences in the hadronic wave function [296, 297], but this comes at
the expense of many free parameters, which have to be tuned to data. Baryons can be produced
using eq. (319) as well, by allowing diquark-antidiquark string breakings [298]. Compared to the
production of s quarks, this process will be suppressed by a larger (effective) diquark mass. In such
an approach, the produced baryon-antibaryon pair will be neighbours along the string, and share
two flavours. This simple picture is modified by considering an underlying step-wise mechanism
for baryon production, first suggested by Casher, Neuberger and Nussinov [299], and realized
in the “popcorn” model [300] in PYTHIA. In the popcorn model, diquarks are generated by first
producing a qq̄ pair as a vacuum fluctuation on the string, without breaking it. By producing more
new qq̄ pairs in between, meson production between the baryon-antibaryon pair is allowed. The
whole process is illustrated in fig. 15. In principle, several mesons can be produced in between a
baryon-antibaryon pair through the popcorn mechanism, but currently only the simplest case of a
single meson is implemented in PYTHIA.
While the explanation above suffices as an introduction to the physics behind the model,
there are many important implementation details to be faced when going from a “physics level”
description of the Lund string to the actual implementation in PYTHIA, which must be able to handle
arbitrarily complicated configurations of partons. In the next subsections we outline several of
the more specialized features of PYTHIA string fragmentation, and the thinking behind the
implementation. While some are completely new models on top of the old hadronization framework,
others remain the same as in even the oldest versions of the JETSET and PYTHIA 6.3 programs. Those
specific parts of the discussion are therefore largely carried over from the PYTHIA 6.4 manual [14].


7.1.2 Joining two jets in qq̄ events


Keeping with the simple picture of a single qq̄ pair, the iterative procedure obtained by successive
application of eq. (312) is only valid when the remaining mass of the system, after fragmenting
off a hadron, is large. If the algorithm implementing eq. (312) were to start from one end, and
create hadrons successively until the other end is reached, the mass of the last hadron would be
fully constrained by four-momentum conservation, and would therefore in general be off-shell.
The practical route taken in PYTHIA is to randomly fragment off hadrons from either the q
or the q̄ end in each step, with z taken to be the fraction of either the positive or the negative
light-cone momentum, respectively. To wit, if the step is on the q side, z is the fraction of the
remaining E + pz, and if the step is on the q̄ side, z is the fraction of the remaining E − pz. Once
the mass of the remaining system has dropped below a certain value, with some smearing to avoid
an unphysical sharp cutoff, the remaining system is fragmented into two “final” hadrons, and the chain ends.
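
The two-sided iteration can be sketched in a toy 1+1-dimensional setting as follows. This is only an illustration under strong assumptions: a single hadron mass, no transverse momentum, a uniform z in place of the fragmentation function of eq. (312), a sharp stopping mass, and no final two-hadron step. All names and default values are hypothetical.

```python
import random

def toy_two_sided_fragmentation(e_cm, m_had=0.5, m_stop=2.0, seed=1):
    """Return (hadrons, remnant), each given as light-cone momenta (p+, p-)."""
    rng = random.Random(seed)
    # Remaining E+pz and E-pz of the system; their product is its mass^2.
    w_plus, w_minus = e_cm, e_cm
    hadrons = []
    while w_plus * w_minus > m_stop ** 2:
        z = rng.uniform(0.05, 0.95)     # placeholder for f(z), an assumption
        if rng.random() < 0.5:
            # Step from the q end: z is the fraction of the remaining E+pz.
            h_plus = z * w_plus
            h_minus = m_had ** 2 / h_plus
        else:
            # Step from the qbar end: z is the fraction of the remaining E-pz.
            h_minus = z * w_minus
            h_plus = m_had ** 2 / h_minus
        if h_plus >= w_plus or h_minus >= w_minus:
            continue                    # step would overshoot; draw again
        hadrons.append((h_plus, h_minus))
        w_plus -= h_plus
        w_minus -= h_minus
    return hadrons, (w_plus, w_minus)

hadrons, remnant = toy_two_sided_fragmentation(20.0)
```

Every hadron is on-shell by construction (p⁺p⁻ = m²), and the loop stops with a low-mass remnant that a full implementation would split into the two final hadrons.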

7.1.3 Fragmentation of systems with gluons


Most of the preceding discussion has involved the simple system of a single string spanned between
a qq̄ pair. While sufficient to explain the basic features of the model and implementation, it is
far from covering the complexity in hadronization of multiparton systems. A Lorentz covariant
algorithm exists, however, and in this section the machinery employed for this task is outlined,
noting that the complete machinery is complicated, and covered in detail in refs. [291, 301].
The basis of the algorithm is to divide multiparton systems to be fragmented into smaller string
pieces, spanned between individual partons. Consider a long string spanned between a qq̄ pair
(labelled 1 and n in the following), with a number of gluons in between (labelled 2, ..., n − 1).
Such a string will contain n − 1 separate pieces. The kinematics of those pieces are, as for simple
qq̄ strings, determined by the four-momenta of the endpoint partons. In the case of gluons, the
four-momentum is shared between the two neighbouring string pieces, each taking half. It must
furthermore be assumed that endpoint (anti-)quarks are massless, for the fragmentation algorithm
to work. In practice this is done by attaching a fictitious string piece with a massless (anti-)quark
to the string end, replacing the massive quark. This string piece in a later step becomes part of
the massive hadron produced from the massive quark.
In summary, we therefore have n − 1 string pieces defined by adjacent⁸ four-momentum pairs
(j, k), with the parton going towards the q end further indexed with a + and the parton going
towards the q̄ end with a −. In general, a hadron is now created by taking a step from a region
(j1, k1) to (j2, k2). A step may be taken within just a single region, or between two different
regions. The resulting hadron four-momentum can be written as

$$ p = \sum_{j=j_1}^{j_2} x_+^{(j)} p_+^{(j)} + \sum_{k=k_1}^{k_2} x_-^{(k)} p_-^{(k)} + p_{x1} \hat{e}_x^{(j_1 k_1)} + p_{y1} \hat{e}_y^{(j_1 k_1)} + p_{x2} \hat{e}_x^{(j_2 k_2)} + p_{y2} \hat{e}_y^{(j_2 k_2)} , \tag{320} $$

where the four-momentum fraction of $p_\pm^{(i)}$ taken by the hadron is denoted $x_\pm^{(i)}$, and (p_x, p_y) are the
transverse momenta produced at the string breaks according to eq. (319), with (ê_x, ê_y) spacelike
unit four-vectors normal to the string direction in the respective region.
The only remaining degree of freedom is z, to be determined by eq. (312). The interpretation
of z is, however, only well-defined for a step in the initial string regions. But via eq. (318) a z
value can be translated into a new $\Gamma = (\kappa\tau)^2$ value, and Γ is well defined across region boundaries.
Together with the $p^2 = m^2$ constraint on eq. (320) this is sufficient to find the relevant $x_+^{(j_2)}$ and
$x_-^{(k_2)}$ values of the next breakup vertex.

⁸ It is possible to have string regions spanned by non-adjacent pairs as well, created when a gluon loses all its energy
to the string. These regions form an integral part of the formalism, and help ensure that string fragmentation is rather
insensitive to soft and collinear gluon emissions in the parton-shower stage.

7.1.4 Hadron vertices


While the production vertices of hadrons are impossible to detect experimentally, calculating them
still has applications in other parts of the simulation, most notably hadronic rescattering. In this
section we describe the space-time picture for qq̄ pairs, based on methods developed in ref. [302].
From the linear potential V(r) = κr, the equations of motion are

$$ \left|\frac{dp_{z,q/\bar{q}}}{dt}\right| = \left|\frac{dp_{z,q/\bar{q}}}{dz}\right| = \left|\frac{dE_{q/\bar{q}}}{dt}\right| = \left|\frac{dE_{q/\bar{q}}}{dz}\right| = \kappa . \tag{321} $$

The sign of each derivative is negative if the distance between the quarks is increasing, and positive
if the distance is decreasing. After sampling $E_{h_i}$ and $p_{h_i}$ for each hadron, these equations lead to
simple relations between the space-time and momentum-energy pictures, $z_{i-1} - z_i = E_{h_i}/\kappa$ and
$t_{i-1} - t_i = p_{h_i}/\kappa$, where $z_i$ and $t_i$ denote the space-time coordinates of the ith breakup point (note
that $z_{i-1} > z_i$ since points are enumerated from right to left). In the massless approximation,
the endpoints are given by $z_{0,n} = t_{0,n} = \pm\sqrt{s}/2\kappa$. This specifies the breakup points, but there
is still some ambiguity as to where the hadron itself is produced. The default in PYTHIA 8.3 is
the midpoint between the two breakup points, but it is also possible to specify an early or late
production vertex at the point where the light-cones from the two quark-antiquark pairs intersect.
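
The relations above can be turned into a short sketch that walks from the right-hand endpoint to the left and places each hadron at the midpoint of its two bounding points. It assumes the massless approximation, a string tension of about 1 GeV/fm, and hypothetical function names; it is not the PYTHIA implementation.

```python
KAPPA = 1.0  # string tension in GeV/fm (approximate value, an assumption)

def breakup_vertices(hadrons, sqrt_s, kappa=KAPPA):
    """hadrons: (E, pz) pairs ordered right to left; returns [(z_i, t_i)] in fm."""
    z = t = sqrt_s / (2.0 * kappa)   # right-hand endpoint, i = 0
    vertices = [(z, t)]
    for energy, pz in hadrons:
        z -= energy / kappa          # z_{i-1} - z_i = E_h / kappa
        t -= pz / kappa              # t_{i-1} - t_i = p_h / kappa
        vertices.append((z, t))
    return vertices

def midpoint_vertices(vertices):
    """Hadron production points taken as midpoints of neighbouring points."""
    return [((z0 + z1) / 2.0, (t0 + t1) / 2.0)
            for (z0, t0), (z1, t1) in zip(vertices, vertices[1:])]

# Two back-to-back hadrons sharing sqrt(s) = 10 GeV (illustrative numbers).
hadrons = [(5.0, 4.0), (5.0, -4.0)]
vertices = breakup_vertices(hadrons, 10.0)
production_points = midpoint_vertices(vertices)
```

Since the hadron energies sum to √s and their momenta to zero, the last point lands at z = −√s/2κ and t = +√s/2κ, reproducing the left-hand endpoint.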
A complete knowledge of both the space-time and momentum-energy pictures violates the
Heisenberg uncertainty principle. This is compensated for in part by introducing smearing factors
for the production vertices, but outgoing hadrons are still treated as having a precise location and
momentum. Despite not being a perfectly realistic model, there is no clear systematic bias in this
procedure, and any inaccuracies associated with this violation are expected to average out.
There are several further complications to this process. One is more complicated topologies
such as those involving gluons or junctions. Another is the fact that the massless approximation
is poor for heavy qq̄ pairs. For massive quarks, rather than moving along their light-cones, the
quarks move along hyperbolas $E^2 - p_z^2 = m^2 + p_\perp^2 = m_\perp^2$. Both these issues are addressed in more
detail in ref. [302].

7.1.5 Junction topologies


Junction topologies in their simplest form arise when three massless quarks in a colour-singlet
state move out from a common production vertex, a textbook example of which is given by a
baryon-number-violating supersymmetric decay χ⁰ → qqq. In that case it is assumed that each
of them pulls out a string piece, a “leg”, to give a Y-shaped topology, where the three legs meet in
a common vertex, the junction. This junction is the carrier of the baryon number of the system:
the fragmentation of the three legs from the quark ends inwards will each result in a remaining
quark near to the junction, and these three will form a baryon around it.
The junction will be at rest in a frame where the pulls of the three legs balance each other,
which is when the angle between each quark pair is 120°. It is therefore convenient to handle the
hadronization in such a frame. There is no first-principles description of junction-string fragmen-
tation. Instead the process is split into a few steps, to make use of the existing string machinery
in a credible manner [303], illustrated in fig. 16. First, the two lowest-energy legs are considered


[Figure 16 panels: (left) First Stage: Legs A and B; (right) Second Stage: Leg C]

Figure 16: Illustration of the two main stages of junction fragmentation. (left) First, the
junction rest frame (JRF) is identified, in which the pull directions of the legs are at 120◦
to each other. (If no solution is found, the CM of the parton system is used instead.) The
two lowest-energy legs (A and B) in this frame are then fragmented from their respective
endpoints inwards, towards a fictitious other end which is assigned equal energy and
opposite direction, here illustrated by grey dashed lines. This fragmentation stops when
any further hadrons would be likely to have negative rapidities along the respective
string axes. (right) The two leftover quark endpoints from the previous stage (qA2 and
qB3 ) are combined into a diquark (qqAB ) that is then used as endpoint for a conventional
fragmentation along the last leg, alternating randomly between fragmentation from the
qC end and the qqAB end as usual.

separately, each as if it were a qq̄ string, with a fictitious q̄ in the opposite direction to the q.
All fragmentation is from the q end of the respective system, however, and keeps on going until
almost all the original q energy is used up, resulting in the situation illustrated in the left-hand
pane of fig. 16. At that stage the remaining unmatched two quarks (qA2 and qB3 in the figure) are
combined into a diquark, carrying the unspent energy and momentum. This diquark now forms
one end of the remaining string out to the third quark, which can be fragmented as a normal string
system, illustrated in the right-hand pane of fig. 16. One indication that the procedure works, e.g.
that the fragmentation of the first two legs is stopped at about the right remaining energy, is that
the junction baryon is formed with a low momentum and with minimal directional bias in the
junction rest frame. Additional checks are also made to ensure that the final string mass is above
the threshold for string fragmentation. Otherwise, repeated attempts are made, starting over with
the first two strings.
Unfortunately, real-life applications introduce a number of complications. One is that the
pull is more complicated when the endpoints are not massless. Then, in a fraction of the events,
there is no analytic solution. Typically this happens when a massive quark is almost at rest in the
configurations that come closest to balance, and an approximate balance along these lines may be
obtained. An even more complicated case is when a leg is stretched via a number of intermediate
gluons between the junction and the endpoint quark, as would be a natural consequence of parton-shower
evolution in the χ⁰ → qqq decay. Then the initial motion of the junction is set by the gluon
nearest to it. But often this gluon has low energy and, once that is lost to the drawn-out string, it is
the direction of the next-nearest gluon that sets a new net pull. Thus, there is no frame where the

junction remains at rest throughout the whole fragmentation process. An effective average pull
is then defined for each of the three legs, as a weighted sum of the respective parton momenta,
where the weight drops exponentially as the energy sum of partons closer to the junction increases,
cf. ref. [303].
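
The effective pull described above can be sketched as an exponentially weighted sum. The normalization energy `e_norm`, its value, and the function name are hypothetical choices made for illustration; the actual weighting used in PYTHIA is specified in ref. [303] and may differ in detail.

```python
import math

def effective_pull(leg_partons, e_norm=1.5):
    """Weighted four-momentum sum for one junction leg.

    leg_partons: (E, px, py, pz) tuples ordered from the junction outwards.
    e_norm: energy scale of the exponential damping (hypothetical parameter).
    """
    pull = [0.0, 0.0, 0.0, 0.0]
    weights = []
    e_closer = 0.0                     # energy sum of partons nearer the junction
    for parton in leg_partons:
        weight = math.exp(-e_closer / e_norm)
        weights.append(weight)
        for mu in range(4):
            pull[mu] += weight * parton[mu]
        e_closer += parton[0]
    return pull, weights

# A soft gluon next to the junction, followed by a hard endpoint quark.
leg = [(0.5, 0.5, 0.0, 0.0), (20.0, 0.0, 0.0, 20.0)]
pull, weights = effective_pull(leg)
```

In this example the soft gluon only mildly damps the weight of the hard quark, so the average pull still points mostly along the quark direction.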
The absence of an exact solution for the junction rest frame leads to an approximate iterative
procedure being used. One of the more common sources of PYTHIA warnings is that this procedure
does not converge. If no fix can be found any other way, then ultimately the centre-of-mass frame
of the system is taken as the junction rest frame.
Junction fragmentation is not only a topic for exotic physics, but very much part of ordinary
QCD hadronic physics. It appears if two valence quarks are kicked out of a baryon beam by the
MPI machinery. Since these interactions typically involve colour exchange, two of the ends will
stretch to partons from the other incoming beam, unless colour reconnection gives another result.
The fragmentation follows the already outlined procedure, which can lead to the beam baryon
number being transported into the central region of the event, cf. ref. [249].
Also antijunctions may exist, where the colour lines from three antiquarks meet, and such
antijunctions carry a negative baryon number. A string system may contain both a junction and
an antijunction, or even several of each. The simplest such topology is when one leg connects a
junction to an antijunction, leaving two other junction legs to quarks and two antijunction legs to
antiquarks. It is here assumed that the total string length (see section 7.2) is smaller for such a
topology than for having two simple qq̄ strings, or else the junction pair would annihilate to give
the simple string topology, cf. [249, 303]. Conversely, when the string length can be reduced, more-or-less
parallel qq̄ strings may colour reconnect into junction-antijunction systems, see further
section 7.2.2.
To reduce the complexity of multijunction fragmentation, each system is split up into smaller
ones that only contain (at most) one junction or antijunction each. Consider e.g. a junction-
antijunction topology. If the leg connecting the two contains at least one gluon, it can be split up
by a replacement g → qq̄. If not, a small amount of energy can be shuffled from the regular q and
q̄ legs into some energy (and momentum) for this connecting leg, so that it can be split.
Another subtlety concerns what spin state to choose for the diquark that is formed at the end
of the fragmentation of the two first legs, the one labelled qqAB in fig. 16, which we will call the
junction diquark. For conventional (non-junction) fragmentation, empirically one finds that S = 1
diquark states are heavily suppressed, interpreted as due to significantly higher masses and smaller
binding energies. However, unlike in conventional string breaks, where diquark-antidiquark pairs
are formed together in a single coherent tunnelling process (modulo fluctuations such as in the
popcorn scenarios), the junction diquark is formed by combining the leftovers from two separate
string breaks; PYTHIA 8 therefore allows for the S = 1 suppression factor for junction diquarks to
be set independently of that for conventional diquarks. Moreover, analogously to the meson
sector, it can be set independently for b-, c-, s-, and light-flavoured junction diquarks, where the
label always refers to the heavier of the two constituents.
It is also worth emphasizing that, within the context of the current PYTHIA modelling, junctions
represent the sole mechanism for producing baryons containing multiple heavy flavours, such as
Ξcc , Ωcc , Ωccc , and their b-flavoured relatives. Note, however, that this will still be quite rare; since
heavy flavours cannot be produced by string breaks, they can only appear as endpoints, say qA0
and qB0 in fig. 16. The only possibility to form a double-heavy-flavoured baryon involving these is
if there is too little energy in both legs A and B for any other string breaks to occur, so that qA0 and
qB0 are combined directly into the junction diquark, which is then doubly-heavy flavoured. We
note that, so far, no dedicated emphasis has been placed on developing the heavy-quark aspects of

junction fragmentation, though that may change with experimental interest. Predictions should
therefore be regarded as tentative.
In summary, the full machinery for junction hadronization is convoluted and not without weak-
nesses, but overall it serves its purpose, and finds use in several physics contexts.

7.1.6 Small-mass systems


If the invariant mass of the qq̄ system is small, a few complications to the fragmentation process can
arise. For example, for an ss̄ system at 0.9 GeV, the string cannot fragment as there is not enough
energy to form an outgoing K K̄ pair, nor can the quarks enter a “yoyo motion” as there is no hadron
with compatible mass and flavour content. Furthermore, even if the string can fragment, at low
energies the available phase space might be so small that the fragmentation algorithm becomes
very inefficient. These situations can occur for instance towards the end of a parton shower by
g → qq̄ branchings or during hadronic rescattering, and are handled using approaches inspired by
cluster fragmentation [304].
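
The ss̄ example can be checked with simple arithmetic. The sketch below uses approximate PDG masses in GeV; the candidate list and helper names are illustrative assumptions rather than the selection logic used in PYTHIA.

```python
# Approximate PDG masses in GeV.
M_KAON_CHARGED = 0.49368
M_KAON_NEUTRAL = 0.49761
SINGLE_HADRON_CANDIDATES = {"eta": 0.54786, "eta'": 0.95778, "phi": 1.01946}

def above_two_body_threshold(m_system, m1, m2):
    """True if the system has enough mass to produce the two hadrons."""
    return m_system >= m1 + m2

def closest_single_hadron(m_system, candidates=SINGLE_HADRON_CANDIDATES):
    """Return (name, |mass gap|) of the nearest candidate state."""
    name = min(candidates, key=lambda key: abs(candidates[key] - m_system))
    return name, abs(candidates[name] - m_system)

m_string = 0.9
can_make_charged_kaons = above_two_body_threshold(
    m_string, M_KAON_CHARGED, M_KAON_CHARGED)
can_make_neutral_kaons = above_two_body_threshold(
    m_string, M_KAON_NEUTRAL, M_KAON_NEUTRAL)
nearest_name, nearest_gap = closest_single_hadron(m_string)
```

Both kaon-pair thresholds lie above 0.9 GeV, and the nearest listed ss̄-containing state is several tens of MeV away, which is why momentum has to be exchanged with a neighbouring system in such cases.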
To improve the efficiency of the algorithm, the first step is to assume that the string will break
at only a single point, and a few attempts are made to find possible outgoing two-hadron states.
If these attempts fail, next the algorithm tries to form a single hadron from the endpoints, then
put that hadron on-shell by transferring momentum to or from a neighbouring string system. If
no momentum rearrangement is possible, further attempts are made to find possible two-hadron
states, but now only the lightest possible hadrons for the given flavour content are considered. If
this still does not work, the string may collapse to the lightest possible hadron given the endpoints,
and produce one additional π0. Finally, if this is not possible either, the last resort is to collapse
the string to the lightest possible hadron, and exchange momentum with a neighbouring parton or
hadron.
String systems are handled in order of increasing mass relative to the two-body threshold, so
normally other systems are still unfragmented when addressing this kind of issue. Especially in
(low-energy) hadronic rescattering there may be two low-energy strings. Then, when the first string
is handled, its collapse may reduce the mass of the other string. In this case, that system may also
collapse to a single hadron, which is put on-shell by exchanging momentum with a hadron from
the previously fragmented string.

7.2 Colour reconnections


In PYTHIA (and other event generators), a simplified set of rules for colour flow is used to determine
between which partons confining potentials should arise. In the context of the string model, this
determines a unique string topology which sets the stage for the subsequent hadronization.
Specifically, all perturbative processes (including MPIs, ISR and FSR) are handled in a leading
colour (LC) limit in which the probability for any two random colours to both be the same vanishes.
Formally, this is done by taking the limit $N_c \to \infty$ with $\alpha_s N_c$ kept fixed [55] so that QCD
amplitudes retain their $N_c = 3$ normalizations. This accomplishes two things: 1) it eliminates
colour-interference effects which are suppressed by powers of $1/N_c^2 \to 0$, and 2) it allows for a
particularly simple representation of gluons in colour space, as direct products of a colour and an
anticolour, since the weight of the singlet in $N_c \otimes \bar{N}_c = (N_c^2 - 1) \oplus 1$ vanishes as $N_c \to \infty$.
In the LC limit, Feynman-diagram amplitudes in colour space are represented by products of
independent “colour lines”. Each of these expresses conservation of a distinct colour charge, and
is represented by a δi j connection between partons carrying colours i and j (suitably crossed).


Figure 17: Illustration of LC colour flow in a simple e+ e− → qq̄ ⊗ shower event. The
shaded regions represent the resulting unique LC string topology.

We call this an LC dipole connection. Due to the orthogonality of the basis states and the lack of
interference in this limit, each such line translates directly to a coherent colour-singlet structure at
the colour-summed amplitude-squared level, which is confining at large distances. Thus, each LC
dipole emerging from the perturbative stages of the event evolution can be uniquely mapped to a
string piece (discussed further, e.g. in [53]). We use the term “colour reconnection” (CR) to refer
to any scenario that results in changes relative to this map in defining the starting configuration
of hadronizing strings in an event.
A simple illustration of the map between LC dipoles and string pieces, for an e+e− → γ∗/Z →
qq̄q′q̄′gg event, is shown in fig. 17. Matching colour (and anticolour) charges are represented by
Les Houches colour (and anticolour) tags [305, 306] numbered from 101 – 104 in this example
and indicated by coloured lines in the diagram. In keeping with the Nc → ∞ nature of the LC
limit, the number of different tags is not limited to three, and each new tag is distinct from all
others. This produces a unique set of colour connections which can be traced to form the LC string
topology (shaded regions).
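
Tracing the tags into strings can be sketched as below. The tag assignment in the example is a hypothetical one chosen to use tags 101 to 104; it is not read off fig. 17, and the function is an illustration rather than PYTHIA code. Closed gluon loops and junctions are not handled.

```python
def trace_strings(partons):
    """Follow colour tags from each quark end to the matching antiquark end.

    partons: (name, colour_tag, anticolour_tag) tuples; tag 0 means none.
    """
    by_anticolour = {acol: (name, col) for name, col, acol in partons if acol != 0}
    strings = []
    for name, col, acol in partons:
        if col != 0 and acol == 0:           # a quark: start of an open string
            chain = [name]
            while col != 0:
                # The next parton is the one carrying the matching anticolour.
                name, col = by_anticolour[col]
                chain.append(name)
            strings.append(chain)
    return strings

# Hypothetical LC colour flow for a q qbar q' qbar' g g final state.
event = [
    ("q", 101, 0), ("g1", 102, 101), ("qbar'", 0, 102),
    ("q'", 103, 0), ("g2", 104, 103), ("qbar", 0, 104),
]
strings = trace_strings(event)
```

For this assignment the event contains two strings, q–g1–q̄′ and q′–g2–q̄, each consisting of two LC dipoles.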
In hadronic collisions, the structure of the beam remnants is also to be modelled, after MPIs
have extracted multiple coloured objects from them. Here it is useful to define rules on how to equate
some of these colours and anticolours with each other, so as to keep the total colour charge of a
remnant within reasonable bounds. Note that this would still classify as “colour connection”,
insofar as it is the initial assignment of remnant colours, although the consequences propagate in
from the remnants to the central perturbative interactions. This is discussed further in the section
on beam remnants, section 6.3. As used in this section, the term CR applies to models that go
beyond this, i.e. that allow for departures from the simple colour rules discussed above and/or
address ambiguities that are left unresolved by them. CR may be classified as one example of a
broader palette of string interactions, with other examples presented in section 7.3.
Note that, occasionally, “junction” structures (see section 7.1.5) may also be present. Unlike
dipole-type δi j connections, junctions (and antijunctions) represent εi jk structures in colour space;
these are explicit Nc = 3 structures which have no analogy in the Nc → ∞ limit. In PYTHIA, they
can appear in the initial state in proton beams [249], in hard BSM processes (or decays) with
baryon number violation [82, 303], and/or as a product of colour reconnections in the final state
(in pairs of junctions and antijunctions to conserve overall baryon number) [250]. Due to the
added technical complexity of dealing with junction structures, the latter possibility is, however,

so far only invoked by the QCD-based CR model, cf. section 7.2.2.


Several different scenarios are included in PYTHIA, as described in the following subsections,
each with its own motivations and underpinnings. The unifying feature is that these models act
only by reassigning colours, with no explicit momentum exchanges between the involved partons.
The decisions whether and how to reassign still can depend both on momentum-energy and on
space-time relations between partons. Also, the changes at the level of produced hadrons still can
be dramatic, due to the changed lengths and orientations of the resulting hadronizing strings.
Historically, CR was first discussed in the context of charmonium production [307–309], notably
in weak B decay to J/ψ, e.g. $\bar{B}^0 = b\bar{d} \to W^- c\bar{d} \to s\bar{c}c\bar{d} \to J/\psi\,\bar{K}^0$. In such decays the c and c̄
belong to two separate colour singlets, but ones that overlap in space-time, with the possibility of
soft gluon exchange to create the new singlets.
The first large-scale application of CR was in the PYTHIA MPI model of hadronic collisions [13],
notably to explain the increasing mean transverse momentum 〈p⊥〉 with increasing charged multiplicity
nch observed at CERN’s Spp̄S collider [310]. If all MPIs draw out strings and fragment
in the same manner, 〈p⊥ 〉(nch ) would be essentially flat. CR was therefore introduced in such a
way that the total string length is reduced. Each further MPI then on the average increases nch
less than the previous one, while giving the same p⊥ from (mini)jet production, resulting in an
increasing 〈p⊥ 〉(nch ).
LEP 2 offered a good opportunity to search for CR effects. Specifically, in a process
$e^+e^- \to W^+W^- \to q_1\bar{q}_2 q_3\bar{q}_4$, CR could lead to the formation of alternative “flipped” singlets $q_1\bar{q}_4$ and
$q_3\bar{q}_2$, and correspondingly for more complicated string topologies, formed when parton showers
are included. Such CR would be suppressed at the perturbative level, since it would force some
W± propagators off the mass shell [311]. This suppression would not apply in the soft region.
Based on a combination of results from all four LEP collaborations, the no-CR null hypothesis is
excluded at a 99.5% CL [312]. Within the SK I scenario, described below, the best description
is obtained for ∼50% of the 189 GeV W+ W− events being reconnected, in qualitative agreement
with predictions.
More recently, Tevatron [313] and LHC [314, 315] measurements of the top-quark mass in
hadronic top-quark decays brought CR effects on precision observables to the fore again, with
several new models geared towards the increased complexity of hadron collisions produced first
in PYTHIA 6 [316–318] and later in PYTHIA 8 [250,319]. Hadronic reconstruction of the top-quark
mass remains an important impetus for further explorations of CR model space and for the de-
velopment of systematic and exhaustive ways to constrain modelling ambiguities and parameters
experimentally.
The importance of colour algebra versus dynamics differs widely between models. Taking the
simple W+W− case above, there is a 1/9 probability that $q_1\bar{q}_4$ and $q_3\bar{q}_2$ are singlets purely by colour
algebra. But such accidental singlets do not stop $q_1\bar{q}_2$ and $q_3\bar{q}_4$ from still being singlets as well; so
nevertheless, a dynamics principle would be needed to decide which singlet set takes precedence
when it is time to hadronize. Furthermore, once parton showers are included, the number of
colour charges in an event increases, and the possibilities for CR with it. In the extreme limit, a
string may be viewed as a chain of (non-perturbative) gluons infinitesimally closely spaced, such
that the string constantly flips colour, so there would be no suppression of CR for lack of nearby
matching colours.
In several of the models below the concept of a “string-length” λ plays a prominent dynamics
role. It is a measure of how many hadrons of some reference hadronic mass m0 there are room
for (in phase space), if the hadrons are evenly spaced in rapidity along the string. For a simple qq̄
string of mass $m_{q\bar{q}}$ one possible definition is $\lambda = \ln(m_{q\bar{q}}^2/m_0^2)$. In principle, λ is well defined also
for more complicated string topologies [292], but in practice its construction is too complicated.
Instead, approximate expressions are used, like

$$ \lambda \approx \sum_{i=0}^{n} \ln\left(1 + \frac{m_{i,i+1}^2}{m_0^2}\right) , \quad m_{i,i+1}^2 = (\epsilon_i p_i + \epsilon_{i+1} p_{i+1})^2 , \quad \epsilon_q = 1 , \quad \epsilon_g = \frac{1}{2} , \tag{322} $$

for a string $q_0 g_1 g_2 \cdots g_n \bar{q}_{n+1}$, where $\epsilon_g = 1/2$ because gluon momenta are shared between two
string pieces. The addition of 1 is to ensure that a low-mass section does not give a negative
contribution, and is not always used. More generally, if low masses are common, it probably
signals that there is a larger underlying issue, e.g. having too low a cut-off for shower evolution.
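
A direct transcription of eq. (322) might look as follows. The reference mass m0 = 1 GeV, the parton labelling and the function names are assumptions made for illustration only.

```python
import math

def minkowski_sq(p):
    """Invariant mass squared of a four-vector (E, px, py, pz)."""
    energy, px, py, pz = p
    return energy * energy - px * px - py * py - pz * pz

def string_length(partons, m0=1.0, add_one=True):
    """Approximate lambda of eq. (322).

    partons: (kind, four-momentum) pairs along the string, with kind "q" for
    the (anti)quark endpoints and "g" for gluons. m0 in GeV is an assumed value.
    """
    eps = {"q": 1.0, "g": 0.5}   # gluon momenta are shared by two pieces
    total = 0.0
    for (kind1, p1), (kind2, p2) in zip(partons, partons[1:]):
        piece = [eps[kind1] * a + eps[kind2] * b for a, b in zip(p1, p2)]
        ratio = minkowski_sq(piece) / m0 ** 2
        total += math.log(1.0 + ratio) if add_one else math.log(ratio)
    return total

q = ("q", (5.0, 0.0, 0.0, 5.0))
qbar = ("q", (5.0, 0.0, 0.0, -5.0))
gluon = ("g", (4.0, 4.0, 0.0, 0.0))

lambda_simple = string_length([q, qbar], add_one=False)   # ln(m^2 / m0^2)
lambda_with_gluon = string_length([q, gluon, qbar])
```

For the back-to-back pair this reduces to the simple definition ln(m²/m0²), while adding the transverse gluon splits the string into two pieces and increases λ.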
Loosely speaking, λ can be viewed as the “free energy” of a string system, available for particle
production. It provides a useful momentum-space measure of the worldsheet area that a given
string system will span, on average, before string breaks occur. Since the classical (Nambu–Goto)
string action is proportional to (the negative of) that area, it is generally assumed that, other
things being the same, nature prefers a low string length.
It should be noted that such a principle does not apply to the perturbative stage of an event,
where the hard interaction and MPIs signal the transition from a state of small λ (partons con-
fined in the incoming protons) to a state of significantly higher λ. The principle of string-length
minimization rather refers to longer time scales, when strings begin to be pulled out between the
partons moving from the central collision.
Similarly, some general considerations of the space-time picture are necessary. One is that
the spatial evolution of showers need not be traced. That is, parton showers occur at time scales
sufficiently shorter than hadronization ones so that, to first approximation, all the final partons can
be viewed as emerging from a common vertex. Furthermore, while the branching of a low-mass
high-energy parton can be significantly displaced, the daughters will tend to be sufficiently close,
by any distance measure, such that CR is unlikely to break them apart. Another issue is how the
lifetime of intermediate resonances compares with the CR time. The W, Z, and t have intermediate
decay time scales, about an order of magnitude shorter than typical hadronization times. (The H,
by contrast, is much longer lived.) But the two would become more comparable if time is added for
the decay products to expand and begin interacting with the environment given by the rest of the
pp collision. Ideally, the situation should therefore be simulated dynamically, where different time
orderings are possible outcomes, but that would be fraught with uncertainties and is typically not
done. Instead, a more common option is to allow only early or only late resonance decays, i.e.
before or after hadronization. In early decays, all partons can reconnect, while in late decays the
resonance decay products cannot.

7.2.1 The MPI-based model


The first CR model implemented in PYTHIA 8, and currently still the default, attempts to reduce λ
by a complete merge of the partons of separate MPI systems. The probability for two MPIs to be
reconnected this way is a function of the lower p⊥ scale of the two, of the form

$$ P_{\mathrm{rec}}(p_\perp) = \frac{(R_{\mathrm{rec}}\, p_{\perp 0})^2}{(R_{\mathrm{rec}}\, p_{\perp 0})^2 + p_\perp^2} , \tag{323} $$

where p⊥0 is the parameter introduced in eq. (253) to damp the p⊥ → 0 infinity of the QCD
2 → 2 cross section, and Rrec is a phenomenological parameter. An Rrec of order unity would

seem reasonable; empirically somewhat larger values are found. The reconnection probability is
chosen to be higher for soft systems, reflecting that the latter are described by more extended
wave functions, thus having a higher probability to overlap and interact with other systems.
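As a minimal numerical sketch of eq. (323), the reconnection probability can be evaluated as follows. The function name and the parameter values (p⊥0 = 2 GeV, Rrec = 1.8) are illustrative assumptions, not PYTHIA defaults or part of the PYTHIA API.

```python
def reconnection_probability(pt, pt0, r_rec):
    """Eq. (323): P_rec = (R_rec*pT0)^2 / ((R_rec*pT0)^2 + pT^2)."""
    scale2 = (r_rec * pt0) ** 2
    return scale2 / (scale2 + pt ** 2)

# Illustrative (assumed) values, pT in GeV: soft systems reconnect more often.
for pt in (0.5, 2.0, 10.0):
    print(pt, reconnection_probability(pt, pt0=2.0, r_rec=1.8))
```

The probability tends to unity for p⊥ → 0 and equals 1/2 at p⊥ = Rrec p⊥0.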
Now consider an event containing n MPIs, which have been generated in order of falling p⊥ ,
p⊥1 > p⊥2 > . . . > p⊥n . The reconnections are then done in a two-step procedure, as follows.
First, the MPI systems are tested for reconnection in sequence of increasing p⊥ , i.e. starting
with system n. For an arbitrary m, 2 ≤ m ≤ n, the reconnection probability Pm = Prec (p⊥m ) is
used to decide whether system m should be merged with m − 1 or not. If not, the same relative
probability holds for a merger with m − 2, and so on to the top. That is, there is no explicit
dependence on the higher p⊥ scale, but implicitly there is via the survival probability of not already
having been merged with a lower-p⊥ system. In total, the probability for m not to merge therefore
is (1 − Pm)^{m−1}. Note that mergings may cascade: if m is merged with l, 1 < l < m, then l in its turn
may be merged with an even-higher-p⊥ system k, 1 ≤ k < l, and then also m counts as merged
with k.
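The first step of the procedure can be sketched as below. This is a toy illustration under stated assumptions, not the PYTHIA implementation: the function names, the 0-based indexing, and the default values of p⊥0 and Rrec are all hypothetical.

```python
import random

def p_rec(pt, pt0=2.0, r_rec=1.8):
    """Eq. (323) with assumed, illustrative parameter values."""
    scale2 = (r_rec * pt0) ** 2
    return scale2 / (scale2 + pt ** 2)

def decide_mergers(pts, prob=p_rec, rng=random):
    """pts: MPI pT values in falling order (hardest system first).
    Returns merged_into[m] = index of the harder system that m is merged
    with directly, or None if m stays unmerged (0-based indices)."""
    merged_into = [None] * len(pts)
    # Test systems in order of increasing pT, i.e. softest first.
    for m in range(len(pts) - 1, 0, -1):
        p_m = prob(pts[m])
        # Try m-1, m-2, ..., 0 in turn, each with the same probability.
        for target in range(m - 1, -1, -1):
            if rng.random() < p_m:
                merged_into[m] = target
                break
    return merged_into

def final_system(m, merged_into):
    """Follow cascaded mergers up to the hardest system reached."""
    while merged_into[m] is not None:
        m = merged_into[m]
    return m
```

With 0-based indices, system m has m harder systems to try, so it remains unmerged with probability (1 − Pm)^m, which is the (1 − Pm)^{m−1} of the text in 1-based numbering.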
Second, once it has been decided which systems should be reconnected, the actual merging
is carried out in the opposite direction. That is, first the hardest system is studied, and all colour
dipoles (i, k) in it are found, as usual in the Nc → ∞ limit. This includes those to the beam
remnants, as defined by the holes of the incoming partons. Then consecutively, each softer system
to be merged with it is considered in order of decreasing p⊥ . For each such system, the gluons j
are inserted, in order of decreasing gluon p⊥, into the dipole (i, k) that minimizes the increase in
the λ measure for the harder system

$$ \Delta\lambda = \lambda_{j;ik} \equiv \lambda_{ij} + \lambda_{jk} - \lambda_{ik} = \ln \frac{(p_i \cdot p_j)(p_j \cdot p_k)}{(p_i \cdot p_k)\, m_0^2} . \qquad (324) $$

Note that the first term of eq. (322) is not required here, since an Ek → 0 (for fixed relative
angles) would affect all λ j;ik the same way and thus not alter the choice of the winning (i, k)
dipole. Although gluons dominate, MPIs may also contain quarks. Those qq̄ pairs that originate
from the splitting of a gluon can be inserted into the higher-p⊥ system by the same criterion
as would have been used for such a gluon. The (few) other quarks are not affected by the CR
procedure, but remain for the beam-remnant handling to address.
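The choice of the winning dipole according to eq. (324) can be illustrated with a short sketch. Four-vectors are plain (E, px, py, pz) tuples, and the function names and the value m0 = 1 GeV are assumptions made for this example only.

```python
import math

def dot(p, q):
    """Minkowski product of four-vectors (E, px, py, pz)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def delta_lambda(p_i, p_j, p_k, m0=1.0):
    """Eq. (324): increase in lambda when gluon j is put into dipole (i, k)."""
    return math.log(dot(p_i, p_j) * dot(p_j, p_k) / (dot(p_i, p_k) * m0 ** 2))

def best_dipole(p_j, dipoles, m0=1.0):
    """dipoles: list of (p_i, p_k) pairs; returns the index minimizing eq. (324)."""
    return min(range(len(dipoles)),
               key=lambda n: delta_lambda(dipoles[n][0], p_j, dipoles[n][1], m0))

# Two dipoles, one along the z axis and one along the x axis, and a soft
# gluon pointing mostly along +z: the z-axis dipole wins.
dip_z = ((10.0, 0.0, 0.0, 10.0), (10.0, 0.0, 0.0, -10.0))
dip_x = ((10.0, 10.0, 0.0, 0.0), (10.0, -10.0, 0.0, 0.0))
gluon = (2.0, 0.0, 1.2, 1.6)
print(best_dipole(gluon, [dip_z, dip_x]))
```

In a full treatment the gluons of the softer system would be processed in order of decreasing p⊥, updating the dipole list after each insertion; that bookkeeping is omitted here.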
The CR procedure is carried out before resonance decays are considered by default, i.e. the
late decay option introduced above. It is possible to switch to early decays, however.

7.2.2 QCD-based colour reconnections


As discussed in the introduction to section 7.2, during the perturbative stages of the event evo-
lution, LC colour flow is used to keep track of which partons are colour connected to each other.
In the LC limit, each colour tag is matched by only a single unique anticolour tag in the event
(or a combination of two colour tags, if junctions are present). At the perturbative level, these
connections represent LC dipoles/antennae, and they are one-to-one mapped to string pieces at
the non-perturbative stage, enforcing colour confinement.
Beyond the LC limit however, there should be a finite probability also for LC-unconnected
partons to “accidentally” find themselves in a colour-singlet state, or in some other coherent state
with a lower total colour charge than the scalar sum of their individual charges. This follows from
the SU(3) colour-algebra rules:


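$$ 3 \otimes \bar{3} = 8 \oplus 1 \qquad (325) $$
$$ 3 \otimes 3 = 6 \oplus \bar{3} \qquad (326) $$
$$ 3 \otimes 8 = 15 \oplus \bar{6} \oplus 3 \qquad (327) $$
$$ 8 \otimes 8 = 27 \oplus 10 \oplus \overline{10} \oplus 8 \oplus 8 \oplus 1 , \qquad (328) $$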
where the representations that correspond to a coherent addition of charges (with lower total
charge) are those following the first term on each right-hand side. In the LC limit, colour-unconnected quark-antiquark pairs are
never allowed to form a singlet; they are always in an overall octet state, while quark-quark, quark-
gluon, and gluon-gluon ones are in sextet, quindecuplet, and vigintiseptet states, respectively.
The starting point for the QCD-based CR scheme [250] is that slightly simplified versions of
eqs. (325) to (328) can be used to compute probabilities for LC-unconnected partons to stochasti-
cally enter into coherent states with one another. This does not invalidate the LC colour topology,
but it does allow for (potentially many) other viable mappings of the same parton system to differ-
ent string configurations. Optionally, configurations that involve (re)connections between systems
with large relative boosts can be excluded if deemed to be in conflict with causality, as discussed
further below. The model then chooses between the remaining allowed configurations by select-
ing the one that minimizes the λ measure, eq. (322). In principle, one could allow fluctuations
around this, but that is not currently done in the model.
A characteristic feature of this model is that it provides a qualitatively new mechanism for the
creation of baryon-antibaryon pairs, in addition to the conventional mechanism of string breaks to
diquark-antidiquark pairs. The 3̄ in eq. (326), the 6̄ in eq. (327), and the two decuplets in eq. (328)
represent colour states that involve colour-epsilon structures. In the context of the string model,
these map to string junctions (and antijunctions), around which baryons will form, cf. section 7.1.5
and ref. [303]. As a consequence, the effective baryon-to-meson ratio increases with the amount
of CR in this model, and hence more active events (e.g. with many MPI) will generally exhibit
higher baryon fractions. Note that colour conservation implies that the model always creates
equal numbers of baryons and antibaryons; these pairs can, however, be well separated in phase
space, contrary to the more localized nature of the conventional diquark string breaks. Moreover,
the model also allows for the formation of doubly-heavy-flavour baryons such as Ξbc, a possibility
that does not occur within the conventional diquark-type string breaks. In the current formulation
of the model, however, no special attention has been devoted to questions specific to heavy quarks,
hence this aspect should be considered to be associated with substantial uncertainty.
At the technical level, the model approximates the QCD probabilities expressed by eqs. (325)
to (328) by randomly assigning an index between 0–8 to each Les-Houches colour tag, subject to
the requirement that gluons must have different colour and anticolour indices. Any parton pairs
with matching colour and anticolour indices are then considered to be in relative singlets and are
candidates for dipole-type string pieces. (This mimics the representation first proposed in [320].)
Stochastically, this reproduces the 1/9 probability of eq. (325) exactly.
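The singlet probability produced by the index assignment can be checked by exact enumeration. The sketch below assumes only that a colour tag and an unrelated anticolour tag receive independent, uniformly distributed indices; the variable names are illustrative.

```python
from fractions import Fraction
from itertools import product

N_INDEX = 9  # indices 0..8

# Count the index pairs (colour, anticolour) that match.
matches = sum(1 for c, a in product(range(N_INDEX), repeat=2) if c == a)
p_singlet = Fraction(matches, N_INDEX ** 2)
print(p_singlet)  # 1/9
```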
The algorithm starts from the LC topology and considers each index group in turn, working
its way down from high to low dipole invariant masses, at each step considering all allowed pos-
sibilities and executing a swap if that lowers the total λ measure. Note that qq̄ pairs originating
directly from g → qq̄ branchings are also excluded from having the same index. Consequently,
quarkonium formation from such pairs is not expected in this model in its current formulation.
If junction-type reconnections are enabled, the algorithm then works its way through three
separate groups of indices: [0,3,6], [1,4,7], and [2,5,8] (chosen so that they are trivial to
separate using the modulo 3 operation). Within each of these groups, any partons carrying two
different colour indices (say, 0 and 3) are allowed to add coherently to the overall anticolour of
the third (say, -6) and enter into corresponding junction-type reconnections if that reduces the
λ measure. This enables a decent (2/9) approximation to the probability for junction-type
reconnections, but does underestimate the true QCD group weights somewhat, see [250]. This
procedure (dipole-style reconnections followed by junction reconnections) is iterated until no more
favourable reconnections are identified.
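The index grouping used for junction-type reconnections can be illustrated as follows. The sketch assumes two unrelated colour tags with independent, uniform indices 0 to 8 and enumerates how often they fall into the same modulo-3 group with different indices; the helper names are hypothetical.

```python
from fractions import Fraction
from itertools import product

def same_group(a, b):
    """True if both indices belong to [0,3,6], [1,4,7] or [2,5,8]."""
    return a % 3 == b % 3

def third_index(a, b):
    """Remaining member of the group that contains the distinct indices a and b."""
    group_sum = 3 * (a % 3) + 9  # 0+3+6, 1+4+7 or 2+5+8
    return group_sum - a - b

pairs = list(product(range(9), repeat=2))
n_ok = sum(1 for a, b in pairs if a != b and same_group(a, b))
p_junction = Fraction(n_ok, len(pairs))
print(p_junction)  # 2/9
```

For the example quoted in the text, indices 0 and 3 combine with the anticolour of the third group member, 6.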
In addition to the colour rules, the dipoles also need to be causally connected in order to
perform a reconnection. The definition of causally connected dipoles is not exact, and several
different options are available. All the time-dilation modes introduce a tunable parameter, which
provides a handle on the overall amount of CR.
When the two strings are allowed to reconnect, they will reconnect if it lowers the total string
length, as defined by an approximation to the λ measure. Several options for different approxima-
tions are available. The λ measure is not well understood, especially for junction structures, and
a tunable parameter allows for the enhancement or suppression of junction-type connections to
dipole ones. This affects how many baryons are generated by the model. See also the description
of junction fragmentation in section 7.1.5.
Although the main objective of the model is to treat reconnections involving large invariant
masses, there is of course a tail towards small masses as well. For very low masses < O(1 GeV),
string fragmentation becomes technically complicated (as each hadron needs to straddle several
gluon “kinks”), especially when junctions are involved, and also the approximations made in the
λ measure are not particularly reliable. Therefore, reconnections involving string pieces with
masses below m0 , cf. eq. (322), are excluded from participating in the CR framework. (Technically,
partons making up such low-mass systems are treated collectively as a single pseudo-particle for
the purpose of reconnections.)

7.2.3 The gluon-move scheme


In the effort to determine the top mass as accurately as possible, CR is one of the major sources of
systematic error. To better understand the situation, a range of new models were developed and
implemented in ref. [319]. Many of these are crude straw-man models, or applicable only to top
decay. They are therefore not integrated as standard options, but may be obtained by using the
ColourReconnectionHooks.h plugin; see main29.cc for an example.
In the late resonance decays approach it is possible to allow separate CR models for the under-
lying event and for the top decay products. Then two collections of gluons are constructed, one
containing the gluons radiated from the top decay products and the other containing the gluons
from the rest of the event. Iterating over the former in random order, one forces a random fraction
of the gluons from the top to exchange colours with a gluon from the rest of the event. The
latter gluon can be picked according to one of five different criteria, (i) at random, (ii) giving the
smallest invariant mass, (iii) giving the largest invariant mass, (iv) giving the smallest (with sign)
∆λ value, or (v) as (iv) but only if ∆λ < 0.
For early resonance decays, three possible operations were implemented, swap, move, and
flip. The latter two are implemented in the main body of PYTHIA.
The swap model is similar to option (iv) above. A random fraction of all final-state gluons are
chosen for possible reconnection. For each such gluon pair j and m, on dipoles (i, k) and (l, n)
respectively, one calculates the difference ∆λ resulting from a swap of the two gluon colours

$$ \Delta\lambda(j, m) = \lambda_{m;ik} + \lambda_{j;ln} - \left( \lambda_{j;ik} + \lambda_{m;ln} \right) = \lambda_{im} + \lambda_{mk} + \lambda_{lj} + \lambda_{jn} - \left( \lambda_{ij} + \lambda_{jk} + \lambda_{lm} + \lambda_{mn} \right) . \qquad (329) $$

A reconnection is performed if min_{j,m} ∆λ(j, m) ≤ ∆λcut, where ∆λcut ≤ 0 is a tunable parameter
that expresses a CR strength. The procedure is repeated until no allowed swaps remain.
The closely related move model works as follows. Again a random fraction of all final-state
gluons are singled out. Starting from each such gluon j on a final-state dipole (i, k), the change
in the string length ∆λ that would result from moving the gluon to any other final-state dipole
(l, n) is calculated using

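$$ \Delta\lambda(j, ik \to ln) = \lambda_{j;ln} - \lambda_{j;ik} = \lambda_{lj} + \lambda_{jn} + \lambda_{ik} - \left( \lambda_{ij} + \lambda_{jk} + \lambda_{ln} \right) . \qquad (330) $$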

Now the minimum ∆λmin = min_{j,l,n} ∆λ(j, ik → ln) is found, and the move is carried out if ∆λmin ≤ ∆λcut.
This is then repeated as long as the latter criterion is fulfilled.
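The swap and move measures of eqs. (329) and (330) can be sketched in a few lines. Since eq. (322) is not reproduced here, the example assumes the simplified dipole measure λij = ln(pi · pj / m0²), chosen so that eq. (324) is recovered; the function names and m0 = 1 GeV are likewise assumptions.

```python
import math

M0 = 1.0  # assumed reference scale in GeV

def dot(p, q):
    """Minkowski product of four-vectors (E, px, py, pz)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def lam(p, q):
    """Assumed simplified dipole measure, lambda_pq = ln(p.q / m0^2)."""
    return math.log(dot(p, q) / M0 ** 2)

def lam_insert(j, i, k):
    """lambda_{j;ik} = lambda_ij + lambda_jk - lambda_ik, cf. eq. (324)."""
    return lam(i, j) + lam(j, k) - lam(i, k)

def delta_lambda_swap(j, m, i, k, l, n):
    """Eq. (329): gluon j on dipole (i, k) and gluon m on (l, n) trade places."""
    return (lam_insert(m, i, k) + lam_insert(j, l, n)
            - lam_insert(j, i, k) - lam_insert(m, l, n))

def delta_lambda_move(j, i, k, l, n):
    """Eq. (330): gluon j is moved from dipole (i, k) to dipole (l, n)."""
    return lam_insert(j, l, n) - lam_insert(j, i, k)
```

A swap or move would then be accepted only if the smallest value found over all candidates does not exceed ∆λcut.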
There is some fine print. If a colour-singlet subsystem consists of two gluons only, then it is
not allowed to move any of them, since that would result in a colour-singlet gluon. Also, at most
as many moves are made as there are gluons, which normally should be enough. A specific gluon
may be moved more than once, however. Finally, a gluon directly connected to a junction cannot
be moved, and also no gluon can be inserted between it and the junction. This is entirely for
practical reasons, but should not be a problem, since junctions are rare in this model.
Neither the swap nor move methods reconnect quarks. That is, if a qq̄ pair starts out at the
opposite ends of a string then so they will remain. The gluons found along this string can change,
and in the move model even the number of such gluons, but the endpoints do not. To lift this
restriction, a flip step can be added subsequent to the swap or move one. The basic idea here is to
flip two string pieces, (i, k) and (l, n), and instead connect them as (i, n) and (l, k). For any two
separate colour-singlet subsystems one finds

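$$ \Delta\lambda_{\mathrm{min}} = \min_{i,k,l,n} \left[ \lambda_{in} + \lambda_{lk} - \left( \lambda_{ik} + \lambda_{ln} \right) \right] . \qquad (331) $$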

The system pair with smallest ∆λmin is selected for a flip, as long as ∆λmin ≤ ∆λcut . Singlet
systems that have undergone one flip are not allowed any further ones. As a minor variation,
junction topologies are either excluded or included among the allowed flip possibilities. It is also
possible to switch on/off move and flip separately.
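A flip search between two colour-singlet systems, following eq. (331), might look as sketched below. As before, the dipole measure λij = ln(pi · pj / m0²) is an assumed simplification, and the data layout (lists of string pieces given as pairs of four-vectors) is purely illustrative.

```python
import math
from itertools import product

M0 = 1.0  # assumed reference scale in GeV

def dot(p, q):
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]

def lam(p, q):
    """Assumed simplified dipole measure, lambda_pq = ln(p.q / m0^2)."""
    return math.log(dot(p, q) / M0 ** 2)

def best_flip(system_a, system_b):
    """Each system is a list of string pieces (p_i, p_k).
    Returns (delta_lambda_min, index_a, index_b) as in eq. (331)."""
    best = None
    for (a, (i, k)), (b, (l, n)) in product(enumerate(system_a),
                                            enumerate(system_b)):
        d = lam(i, n) + lam(l, k) - (lam(i, k) + lam(l, n))
        if best is None or d < best[0]:
            best = (d, a, b)
    return best

def flip(piece_a, piece_b):
    """Reconnect (i, k) + (l, n) -> (i, n) + (l, k)."""
    (i, k), (l, n) = piece_a, piece_b
    return (i, n), (l, k)
```

The flip would be carried out only if the returned minimum satisfies ∆λmin ≤ ∆λcut, after which both systems are excluded from further flips.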

7.2.4 The SK models


The SK I and SK II models [311,321] were specifically developed for e+e− → W+W− → q1 q̄2 q3 q̄4 at
LEP 2, and work (almost) equally well for a γ∗/Z γ∗/Z intermediate state. They are not intended
to handle hadronic collisions, however, except in special contexts. The prime example is Higgs
decays of the same character as above, H → W+ W− /ZZ, since the Higgs is so long lived that its
decay is decoupled from the rest of the event [322].
The labels I and II refer to the colour-confinement strings being modelled either by analogy
with type I or type II superconductors. In the former model the strings are viewed as transversely
extended “bags” [323]. The likelihood of reconnection is then related to the integrated space-time
overlap of string pieces from the W+ with those from the W− . In the latter model, instead, strings
are assumed to be analogous with vortex lines, where all the topological information is stored in
a thin-core region. Reconnection, therefore, only can occur when these cores pass through each
other.
The imagined time sequence is the following. The W+ and W− fly apart from their common
production vertex and decay at some distance. Around each of these decay vertices, a perturbative
parton shower evolves from an original qq̄ pair. The typical distance that a virtual parton (of mass
m ∼ 10 GeV, say, so that it can produce a separate jet in the hadronic final state) travels before
branching is comparable with the average W+ W− separation, but shorter than the fragmentation
time. Each W can therefore effectively be viewed as instantaneously decaying into a string spanned
between the partons, from a quark end via a number of intermediate gluons to the antiquark end.
The strings expand, both transversely and longitudinally, at a speed limited by that of light. They
eventually fragment into hadrons and disappear. Before that time, however, the string(s) from
the W+ and the one(s) from the W− may overlap. If so, there is some probability for a colour
reconnection to occur in the overlap region.
In scenario I, the reconnection probability is proportional to the space-time volume over which
the W+ and W− strings overlap, with saturation at unit probability. This probability is calculated
as follows. In the rest frame of a string piece expanding along the ±z direction, the colour field
strength is assumed to be given by
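$$ \Omega(\mathbf{x}, t) = \exp\left\{ -\frac{x^2 + y^2}{2 r_{\mathrm{had}}^2} \right\} \theta(t - |\mathbf{x}|) \exp\left\{ -\frac{t^2 - z^2}{\tau_{\mathrm{frag}}^2} \right\} , \qquad (332) $$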

where x = (x, y, z). The first factor gives a Gaussian falloff in the transverse directions, with a
string width rhad ≈ 0.5 fm of typical hadronic dimensions. The time retardation factor θ (t − |x|)
ensures that information on the decay of the W spreads outwards with the speed of light. The last
factor gives the probability that the string has not yet fragmented at a given proper time along
the string axis, with τfrag ≈ 1.5 fm. For a string piece e.g. from the W+ decay, this field strength
has to be appropriately rotated, boosted, and displaced to the W decay vertex. In addition, since
the W+ string can be made up of many pieces, the string field strength Ω+max(x, t) is defined as the
maximum of all the contributing Ω+'s in the relevant point. The probability for a reconnection to
occur is now given by
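$$ P_{\mathrm{recon}} = 1 - \exp\left( -k_{\mathrm{I}} \int d^3\mathbf{x}\, dt\; \Omega^+_{\mathrm{max}}(\mathbf{x}, t)\, \Omega^-_{\mathrm{max}}(\mathbf{x}, t) \right) , \qquad (333) $$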

where kI is a free parameter. The integration cannot be done analytically, but is approximated
by Monte-Carlo methods. Exponentiation has been applied to saturate the probability at unity. If
a reconnection occurs, however, the space-time point for this reconnection is selected according
to the differential probability Ω+max(x, t) Ω−max(x, t) without any saturation. This defines the string
pieces involved, and the new colour singlets are obtained by a flip as described above (dipoles
(i, k) + (l, n) → (i, n) + (l, k)).
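The ingredients of scenario I can be written down compactly. The sketch below evaluates eq. (332) in the rest frame of a single string piece (units of fm, with c = 1) and applies the saturation of eq. (333) to a given overlap integral; it does not perform the Monte-Carlo integration itself, and the function names are assumptions.

```python
import math

R_HAD = 0.5     # fm, string width quoted in the text
TAU_FRAG = 1.5  # fm, fragmentation time quoted in the text

def omega(x, y, z, t):
    """Eq. (332) in the rest frame of a string piece along the z axis."""
    r = math.sqrt(x * x + y * y + z * z)
    if t < r:  # theta(t - |x|): the point is still outside the light cone
        return 0.0
    transverse = math.exp(-(x * x + y * y) / (2.0 * R_HAD ** 2))
    survival = math.exp(-(t * t - z * z) / TAU_FRAG ** 2)
    return transverse * survival

def reconnection_probability(overlap, k_i):
    """Eq. (333); `overlap` is the space-time integral of Omega+max * Omega-max."""
    return 1.0 - math.exp(-k_i * overlap)
```

For small overlaps the probability grows linearly as kI times the overlap, while it saturates at unity for large ones.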
In scenario II it is assumed that reconnections can only take place when the core regions of two
string pieces cross each other. This means that the transverse extent of strings can be neglected,
which leads to considerable simplifications compared with the previous scenario. The position of
a string piece at time t is described by a one-parameter set x(t, α), where 0 ≤ α ≤ 1 is used to
denote the position along the string. To find whether two string pieces (i, k) and (l, n) from the
W+ and W− decays cross, it is sufficient to solve the equation system x+_(i,k)(t, α+) = x−_(l,n)(t, α−)
and to check that this (unique) solution is in the physically allowed domain. As an example, if
there is no shower activity, so that the event only consists of the two q1 q̄2 and q3 q̄4 strings, it is
easy to see that these are moving apart from each other already from their creation and will never
meet. A solution will nevertheless be found, but with negative t and possibly either or both of
the α± outside their allowed range. Further, it is required that neither string piece has had time
to fragment, which gives two extra suppression factors of the form exp{−τ^2/τfrag^2}, with τ the
proper lifetime of each string piece at the point of crossing, i.e. as in scenario I. If there are several
string crossings, only the one that occurs first is retained. Reconnection is done with a flip, as in
scenario I.
In models I and II the string length is not tested, so it may increase. The geometry of the
process still tends to favour a reduced λ. For the model variants I′ and II′, a reduced λ is imposed
as an additional requirement on allowed reconnections.

7.2.5 Other CR models


It is relevant to remember that many more CR models have been proposed, and several imple-
mented in past PYTHIA versions. Some of these could be resuscitated using the existing colour-
reconnection user hook, or an expanded version thereof, should the need arise.
In PYTHIA 6.4, several colour-annealing scenarios were available [316, 324], again primarily
intended to be useful for top-mass uncertainty studies in hadronic collisions. They start from the
assumption that, at hadronization time, no information from the perturbative colour history of the
event is relevant, so all existing colour tags are erased. Instead, what determines how hadronizing
strings form between the partons is a minimization of the total potential energy stored in these
strings, as represented by the λ measure. The minimization is achieved by an iterative procedure,
which unfortunately can be quite time consuming. The scenarios differ by details such as whether
closed gluon loops are suppressed or not, or whether only free colour triplets are allowed to initiate
string pieces.
Also in PYTHIA 6.4, the GH model [325] offered a simpler option for W+ W− events, based on
colour factors and string length reduction, without any space-time picture.
In the ARIADNE program for e+ e− and e± p, CR was introduced based on λ minimization [320],
but CR could occur after each new parton-shower emission, and thereby affect the continued
shower evolution. A similar idea is the dipole-swing mechanism for the initial-state evolution of
incoming hadrons [246, 248].
When rapidity gaps were found in HERA DIS events, one early alternative to the Ingelman–
Schlein pomeron picture [251] was that the gaps were a consequence of CR [326–329]. The
Uppsala group has subsequently expanded this soft colour interactions approach to encompass
also hadronic events, for topics such as diffraction and other rapidity gaps [330], and charmonium
production [331]. One important difference relative to many of the models above is the frequent
use of an “area law” [332] rather than the λ measure. The area that a string motion sweeps
out is related to its m^2. For a string consisting of several pieces, the total area is defined as
A = Σ_{i=0}^{n} m^2_{i,i+1}, with masses calculated as in eq. (322). The probability of a reconnection is then
P = R0 [1 − exp(−b ∆A)] = R0 [1 − exp(−b(Aold − Anew))]. The R0 ≈ 1/Nc^2 is an assumed colour-
factor suppression, and b is the same as in eq. (312). Note that, had A been defined as the product
of masses rather than a sum, then ln A would have been closely related to λ, and in particular a
∆λ and a ∆A scan would find the same optimal reconnection region, but that is not the case now.
The related code is available in some earlier PYTHIA versions.
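The area-law probability can be sketched as below. The function names are hypothetical, R0 is set to 1/Nc² with Nc = 3, the value of b must be supplied by the user, and returning zero when the area would grow is an assumption of this sketch rather than a statement about the original models.

```python
import math

def total_area(piece_masses):
    """A = sum over string pieces of m^2 (masses in GeV)."""
    return sum(m * m for m in piece_masses)

def area_law_probability(area_old, area_new, b, r0=1.0 / 9.0):
    """P = R0 * [1 - exp(-b * (A_old - A_new))]."""
    delta_a = area_old - area_new
    if delta_a <= 0.0:
        return 0.0  # assumed here: no reconnection if the area does not shrink
    return r0 * (1.0 - math.exp(-b * delta_a))
```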
CR has also been studied in the context of other generators, such as HERWIG [333–335] and
SHERPA [336]. It is not possible to address CR in equivalent terms for cluster as for string frag-
mentation, so there is no straight correspondence, but some basic ideas nevertheless are shared.

7.3 String interactions and collective effects


Heavy-ion collision experiments have for decades studied the possible creation of a Quark–Gluon
Plasma (QGP) in high-energy collisions of heavy nuclei. Monte-Carlo simulations of physics
processes involving QGP creation are mostly carried out in dedicated generators or generator
frameworks such as JEWEL [337] or JETSCAPE [338] (both of which in fact use PYTHIA as a hard-
process generator). Another approach is to segment individual events into “core” and “corona”
parts [339], where the former are treated as QGP, and the latter in vacuum. This is the case for
EPOS-LHC [340], which is an independent framework, as well as for other approaches built on
top of PYTHIA [341].
PYTHIA has thus, historically, played on a different field than generators focused on the spe-
cial observables obtained in heavy-ion collisions. Instead, PYTHIA is often used as a generator
supplying an initial state, with a focus on the hard process, parton shower, and hadronization as
in lepton collisions, with no QGP produced or assumed. This clear division of tasks was ques-
tioned by data from LHC. First in 2010, with the discovery of long-range azimuthal correlations
of final-state hadrons in high multiplicity pp collisions, referred to as “the near-side ridge” [342],
and later by observations of enhanced production of strange and multi-strange final-state hadrons,
incompatible with model fits to LEP data [343–346]. The latter culminated in the observation
that not only is the observed strangeness production incompatible with model fits to LEP data,
strange/non-strange ratios also increase with multiplicity, and the increase smoothly connects pp
with pA and AA collision systems [347]. This clearly meant that PYTHIA could no longer assume
that effects traditionally ascribed to QGP formation are only present in heavy-ion collisions. While
CR models can account qualitatively for some of the observed effects [348, 349], they are wholly
unsuitable in others [350]. Instead of introducing QGP formation into PYTHIA, as the approaches
cited above in some sense have already done, the route taken is to expand the Lund string model
to its furthest consequence, by allowing interactions between strings in densely populated regions
of space. Whether interactions between strings are indeed responsible for all collective effects observed
in pp and heavy-ion collisions is still unknown. The models introduced here should thus
clearly be understood as one possibility among several, unified by the underlying
assumption that QGP is not formed. Furthermore, they are all work in progress at the time of
writing, and subject to change. There is no clear demarcation between what constitutes a model
of colour reconnection, as introduced above, and models of string interactions. In this manual
we have drawn the line between models operating in momentum space (the colour reconnection
models) and models operating in real space.

7.3.1 String shoving


In the original formulation of the Lund string model, strings are treated as massless relativistic
strings, which presupposes that strings have no transverse extensions. In collisions with many
strings occupying the available spatial volume, this approximation breaks down, and strings are
allowed to interact with mainly repulsive forces. The realization of this picture is denoted the
“string shoving model”. While similar ideas were explored analytically already in 1988 [351],
the modern version of the string shoving model is formulated to take into account input from
lattice QCD, and is based more firmly on the correspondence with a superconductor. This model
is rather new at the time of writing [284] and is still being extended with further consequences
being explored [352, 353]. The model contains three basic physics components: 1) the string
shape, 2) the string transverse width, and 3) the interaction force between two strings.
The transverse shape of the colour-electric field of the flux tube (the string shape) is determined
with input from lattice QCD [354], and can be well described by a Gaussian:

$$ E(\rho) = N \exp\left( -\frac{\rho^2}{2R^2} \right) , \qquad (334) $$
where N is a normalization factor, ρ is the radial coordinate, and R is the string equilibrium
radius. The normalization constant is determined by assuming that the field energy per unit
length ∫ d^2ρ E^2/2 is a constant fraction (g) of the string tension. This gives N^2 = 2gκ/(πR^2). The
strings expand from their time of formation with infinitesimal width, until they either attain the
maximum width R, or until the string’s fragmentation proper time, τhad , has been reached. While
the equilibrium width of a string can be argued either by lattice considerations or from models,
the number is associated with such large uncertainty, that it is in practice kept as a free parameter
of the model, with reasonable values between around 0.5 and 1.5 fm. The same holds for the
parameter g. The string repulsion force can then be calculated from the energy of the colour-
electric field of two overlapping, parallel strings, ∫ d^2ρ (E⃗1 + E⃗2)^2/2. If strings are separated from
each other by the transverse distance d⊥, the interaction energy becomes 2gκ exp(−d⊥^2/(4R^2)),
which gives the interaction force per unit length:

$$ f(d_\perp) = \frac{g \kappa\, d_\perp}{R^2} \exp\left( -\frac{d_\perp^2}{4R^2} \right) . \qquad (335) $$
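As a quick consistency check, the force of eq. (335) can be compared numerically with minus the derivative of the interaction energy quoted in the text. The parameter values used below (R = 1 fm, g = 0.5, κ = 1 GeV/fm) are illustrative assumptions, as are the function names.

```python
import math

def interaction_energy(d, R, g, kappa):
    """Interaction energy per unit length, 2 g kappa exp(-d^2 / (4 R^2))."""
    return 2.0 * g * kappa * math.exp(-d * d / (4.0 * R * R))

def shoving_force(d, R, g, kappa):
    """Eq. (335): f = (g kappa d / R^2) exp(-d^2 / (4 R^2))."""
    return g * kappa * d / (R * R) * math.exp(-d * d / (4.0 * R * R))

def numerical_force(d, R, g, kappa, h=1e-6):
    """Central finite difference of -dE/dd, used only as a cross-check."""
    return -(interaction_energy(d + h, R, g, kappa)
             - interaction_energy(d - h, R, g, kappa)) / (2.0 * h)

print(shoving_force(0.7, 1.0, 0.5, 1.0), numerical_force(0.7, 1.0, 0.5, 1.0))
```

The force vanishes for fully overlapping strings (d⊥ = 0) and is largest at d⊥ = √2 R.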
The above treatment leading to eq. (335), is made in terms of Abelian fields. As such, anti-
parallel strings would attract each other rather than repel. In a non-Abelian theory like QCD, the
picture is more complex, leading to repulsion being the dominant mechanism. As an example,
consider the case of oppositely oriented triplet fields. One obtains an octet field with probability
8/9, which still leads to a repulsion, and a singlet field with probability 1/9, leading to attraction.
Since singlets correspond to the total attenuation of fields, it can further be assumed that singlets
are already handled by colour-reconnection mechanisms [353].
The technical implementation is concerned with two further questions, namely calculating d⊥
for two given string pieces, and distributing the resulting pushes in the event. For the former,
a suitable Lorentz frame is defined, where a string pair always lies in parallel planes, called the
parallel frame [353]. One can then boost a pair of strings from the lab frame to the parallel
frame, where the string topology is specified with an opening angle between the two partons in
the string ends and a skewness angle between the two strings — both of which are constrained
by momentum-energy conservation. The angles can be expressed in terms of pseudorapidity and
invariant masses si j for the string formed by partons i and j:
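$$ \cosh(\eta) = \frac{s_{14}}{4 p_{\perp 1} p_{\perp 4}} + \frac{s_{13}}{4 p_{\perp 1} p_{\perp 3}} , \quad \mathrm{and} \quad \cos(\phi) = \frac{s_{14}}{4 p_{\perp 1} p_{\perp 4}} - \frac{s_{13}}{4 p_{\perp 1} p_{\perp 3}} . \qquad (336) $$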
Furthermore, the strings now evolve and interact in the proper time in the parallel frame. Calculating
this interaction for every possible string pair is, among other aspects, a computational challenge.
To keep the computing cost manageable, we for now neglect end-string effects which, for example,
have been studied in ref. [291].
The shoving force is distributed to the outgoing hadrons formed after string fragmentation,
taking into account that the total applied push is a result of a time evolution. The integrated push
∆p⊥ is:

$$ \Delta p_\perp = \int dt \int dz\, f(d_\perp(t)) , \qquad (337) $$

where the integration limits in z are time dependent. Since the time ordering of pushes is important,
∆p⊥ is split up into several (fixed) small pushes δp⊥, according to a probability distribution
P(t). The total push is then:
$$ \Delta p_\perp = \int dt\, P(t)\, \delta p_\perp , \quad \mathrm{with} \quad P(t) = \frac{1}{\delta p_\perp} \int dz\, f(d_\perp) , \qquad (338) $$
when δp⊥ is small. The pushes can then be ordered in time (in the parallel frame) using the veto
algorithm. The resulting procedure corresponds to a time evolution with dynamical time stepping,
where steps are large when pushes are small and vice versa.
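The time ordering of pushes can be illustrated with a generic veto algorithm. The sketch below is not the PYTHIA implementation: it assumes a user-supplied push rate P(t), a constant overestimate `rate_max` of that rate, and a finite time window, all of which are hypothetical inputs.

```python
import random

def push_times(rate, rate_max, t_max, rng=random):
    """Generate time-ordered push times for a rate P(t) <= rate_max on
    [0, t_max]: trial times are drawn from the constant overestimate and
    each is kept with probability P(t) / rate_max (veto algorithm)."""
    times = []
    t = 0.0
    while True:
        # Next trial time according to the overestimated, constant rate.
        t += rng.expovariate(rate_max)
        if t > t_max:
            return times
        # Veto step: accept the trial with probability P(t) / rate_max.
        if rng.random() < rate(t) / rate_max:
            times.append(t)
```

Each accepted time corresponds to one fixed push δp⊥, so the expected number of pushes is the integral of P(t) over the window, and accepted times are sparse where the rate is low.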
In t, z space, this would look like hadrons flying out along the direction of their original pseu-
dorapidity, even after the pushes are applied, spreading out in a light-cone that extends in such a
way that it encloses all the hadrons which receive a share of this generated ∆p⊥ . This distribution
of pushes is performed as shown in fig. 18.


Figure 18: Space-time diagram of a Lund string showing the trajectory of hadrons when
they receive their share of ∆p⊥ resulting from string shoving interactions. The blue lines
show the initial pseudorapidity lines for the hadrons formed, the red line implies a δp⊥
generated from shoving, and the red dashed line shows the τhad .

7.3.2 Rope hadronization


A simple string drawn between a quark and an antiquark is an SU(3) triplet (or anti-triplet depend-
ing on direction of colour flow). When several strings overlap with each other at hadronization
time, the rope-hadronization model posits that end-point colour charges will act together coher-
ently to form a stronger field — a rope. This possibility was noted in the classic paper by Biro,
Nielsen and Knoll from 1984 [355].
The new, stronger field is an SU(3) multiplet. According to lattice calculations [356], the
energy density (and thus the string tension) scales as the second Casimir operator (C2 ) of the
rope multiplet. When a rope is formed by ordinary triplet and anti-triplet strings, the net colour
charge is obtained from the addition of random coloured triplets and anti-triplets [247,355,357].
A resulting multiplet is uniquely characterized by two quantum numbers p and q, with a specific
state corresponding to p coherent triplets and q coherent anti-triplets (a normal triplet string is
thus characterized as {p, q} = {1, 0}). The multiplicity of a multiplet is given by:

2N = (p + 1)(q + 1)(p + q + 2) . (339)

This allows for an iterative addition of multiplets. Starting from a given multiplet {p, q}, adding
a triplet gives the three possible multiplets [247]:

{p + 1, q}, {p − 1, q + 1}, and {p, q − 1} , (340)

with weights given by eq. (339). The anti-triplet case is given directly from symmetry. Once it is
established which triplets and anti-triplets are overlapping in an event, a random walk procedure
can be carried out to find p and q for the rope multiplet. Since the energy density of the rope is
proportional to C2 , the relative tension of the multiplet to the triplet can be calculated directly as:

$$ \frac{C_2(\{p, q\})}{C_2(\{1, 0\})} = \frac{1}{4} \left( p^2 + pq + q^2 + 3p + 3q \right) . \tag{341} $$
When the rope breaks up, it does so in a step-wise manner, one string at a time. By considering the
change in available field energy in the transition {p, q} → {p − 1, q}, neglecting the contribution
from the vacuum pressure to the total energy, the energy available in a single string breaking
becomes the effective string tension κ̃:

$$ \tilde{\kappa} = \frac{2p + q + 2}{4} \, \kappa . \tag{342} $$
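The random walk and the resulting tension can be illustrated with a minimal sketch. It assumes the standard SU(3) decomposition 3 ⊗ {p, q} = {p + 1, q} ⊕ {p − 1, q + 1} ⊕ {p, q − 1}, weights each allowed outcome by its multiplicity from eq. (339), and obtains the effective tension ratio as the difference of eq. (341) between {p, q} and {p − 1, q}. All function names are hypothetical and are not part of the PYTHIA API.

```python
import random

def multiplicity(p, q):
    """Number of states N of the multiplet {p, q}: 2N = (p+1)(q+1)(p+q+2)."""
    return (p + 1) * (q + 1) * (p + q + 2) // 2

def add_triplet(p, q, rng):
    """Add one triplet to {p, q}, choosing the outcome by multiplicity weight."""
    candidates = [(p + 1, q), (p - 1, q + 1), (p, q - 1)]
    allowed = [(a, b) for a, b in candidates if a >= 0 and b >= 0]
    weights = [multiplicity(a, b) for a, b in allowed]
    return rng.choices(allowed, weights=weights, k=1)[0]

def add_antitriplet(p, q, rng):
    """Anti-triplet case by symmetry: swap p and q, add a triplet, swap back."""
    new_q, new_p = add_triplet(q, p, rng)
    return new_p, new_q

def relative_casimir(p, q):
    """C2({p, q}) / C2({1, 0})."""
    return (p * p + p * q + q * q + 3 * p + 3 * q) / 4.0

def effective_tension_ratio(p, q):
    """kappa_tilde / kappa for the step {p, q} -> {p - 1, q}."""
    return relative_casimir(p, q) - relative_casimir(p - 1, q)

def random_walk(n_triplets, n_antitriplets, seed=1):
    """Combine overlapping triplets and anti-triplets in random order."""
    rng = random.Random(seed)
    steps = ["t"] * n_triplets + ["a"] * n_antitriplets
    rng.shuffle(steps)
    p, q = 0, 0
    for step in steps:
        if step == "t":
            p, q = add_triplet(p, q, rng)
        else:
            p, q = add_antitriplet(p, q, rng)
    return p, q
```

For a single triplet string, `effective_tension_ratio(1, 0)` returns 1, so the ordinary string tension is recovered.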

While the string tension κ does not enter explicitly9 into the PYTHIA implementation of string
hadronization, it does enter implicitly through the parameters of eqs. (312) and (319). From
the implicit dependence on κ, transformation rules for all parameters can be defined, given the
assumption that the PYTHIA default values of all parameters correspond to κ̃ = κ, as they are
tuned to LEP data [358] where there are no overlapping strings, and thus p = 1 and q = 0. The
most important affected parameters [247] are those governing the suppression of strangeness (ρ),
diquark production (ξ), diquarks with strange-quark content relative to diquarks without strange
quarks (x), the suppression of spin-1 diquarks relative to spin-0 diquarks (y), and the width of the
transverse momentum distribution in string breakings (σ). Letting h = κ̃/κ, the transformation
rules for ρ, x and y are similar:

$$ \rho \mapsto \tilde{\rho} = \rho^{1/h} , \quad x \mapsto \tilde{x} = x^{1/h} , \quad \text{and} \quad y \mapsto \tilde{y} = y^{1/h} , \tag{343} $$

while σ ↦ σ̃ = σ√h. The ξ parameter is more complicated, and transforms like:

$$ \xi \mapsto \tilde{\xi} = \tilde{\alpha} \beta \left( \frac{\xi}{\alpha \beta} \right)^{1/h} , \tag{344} $$
with α depending on all the above parameters, and β a free parameter.
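As a small illustration of the simple rescaling rules, the sketch below applies them for a given h = κ̃/κ. It assumes that the suppression factors scale with the power 1/h and that the p⊥ width scales as √h (as expected when σ² is proportional to the string tension). The ξ rule is left out because α is not specified here. The function name and the numerical inputs in the usage note are illustrative only.

```python
import math

def rescale_parameters(h, rho, x, y, sigma):
    """Rescale fragmentation parameters for h = kappa_tilde / kappa.

    rho, x, y are suppression factors (power 1/h);
    sigma is the pT width (factor sqrt(h), an assumption of this sketch).
    """
    if h <= 0.0:
        raise ValueError("h must be positive")
    return {
        "rho": rho ** (1.0 / h),
        "x": x ** (1.0 / h),
        "y": y ** (1.0 / h),
        "sigma": sigma * math.sqrt(h),
    }
```

For example, `rescale_parameters(1.5, 0.217, 0.915, 0.0275, 0.335)` returns larger values for all four parameters than the inputs, so strangeness and diquark suppression are both weakened in a rope with h > 1.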
9
The string tension does enter explicitly into the vertex positions in section 7.1.4, but the effect of rope formation
has so far not been taken into account for hadron vertices.


7.3.3 The thermal model


The thermal model [359], available as a non-default option, can partly be viewed as an alternative
to the rope model, sharing similar objectives. Not all details have been fully developed, so its
main purpose is for exploration. One motivation for it is that hadronic p⊥ spectra in low-energy
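collisions are reasonably well described by an exponential fit

$$ \frac{\mathrm{d}\sigma}{\mathrm{d}^2 p_\perp} = N \exp(-m_{\perp\mathrm{had}}/T) \quad \text{with} \quad m_{\perp\mathrm{had}} = \sqrt{m_\mathrm{had}^2 + p_\perp^2} , \tag{345} $$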

where N and T are (approximately) common for all hadron types. Another motivation is that local
quantum-mechanical fluctuations in the string transverse profile translate into a fluctuating string
tension κ, which can broaden the Gaussian p⊥ into an exponential-like form [360]. (Compare
with fluctuations in the proton size, which are commonly advocated and used e.g. in the ANGANTYR
modelling of cross sections [284].) While traditionally T is associated with a temperature, in such
a scenario it would rather be derived from κ.
The thermal model is implemented as follows. In each string break the q and q̄ receive opposite
and compensating p⊥ values, chosen such that the p⊥ sum of two adjacent string breaks precisely
gives an exp(−p⊥ /T ) spectrum. Starting from a known flavour in one string break, the next
flavour and the resulting intermediate hadron is chosen among all possibilities according to a
relative weight exp(−m⊥had /T ). Assuming the production of two hadrons with different masses
m1 and m2 , this approach then implies the same production rate for p⊥ ≫ m1 , m2 , but more
suppression of the heavier hadron at low p⊥ . Thus, there is less production of heavier states, but
they come with a larger 〈p⊥ 〉, which is as intended.
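The mass dependence of the relative weight can be made concrete with a short sketch. It only evaluates exp(−m⊥had/T) for two hadron masses and is not the PYTHIA implementation; spin counting, mixing and the other factors mentioned below are ignored. The temperature and masses used in the usage note are illustrative values.

```python
import math

def m_perp(m, pt):
    """Transverse mass sqrt(m^2 + pT^2), in GeV."""
    return math.sqrt(m * m + pt * pt)

def thermal_weight(m, pt, temperature):
    """Relative weight exp(-m_perp / T) of a hadron with mass m at given pT."""
    return math.exp(-m_perp(m, pt) / temperature)

def heavy_to_light_ratio(m_light, m_heavy, pt, temperature):
    """Weight of the heavier hadron relative to the lighter one at equal pT."""
    return (thermal_weight(m_heavy, pt, temperature)
            / thermal_weight(m_light, pt, temperature))
```

With T = 0.2 GeV, a pion mass of 0.140 GeV and a kaon mass of 0.494 GeV, the ratio is strongly suppressed at p⊥ = 0 and approaches unity once p⊥ is much larger than both masses.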
There is some fine print, like that each particle should be weighted by the number of spin
states, that flavour-diagonal mesons can mix, that baryons need SU(6) symmetry factors, that
baryons receive a free overall normalization factor with respect to mesons, and so on. The num-
ber of flavour-related free parameters still is significantly reduced relative to the ordinary string
fragmentation.
Overall the particle composition comes out reasonably well, with some excess of the heavier
baryons. This is in contrast to the normal string fragmentation, where it is difficult to produce
enough of these particles. The larger 〈p⊥ 〉 for heavier particles also improves agreement with pp
data, but resonance decays act to dilute the effects, so 〈p⊥ 〉(mhad ) still does not rise quite fast
enough.
Another issue is what happens when several strings are close packed. In the rope model, this
leads to a higher κ, in quantised steps. An alternative is to assume a continuously increasing κ as
each string is squeezed into a decreasing effective area. Such an option is implemented as part
of the thermal model, but can also be applied to the default Gaussian one. In this approach, the
T or κ parameter is rescaled by a power of the effective number of strings in the neighbourhood
of a new hadron. Therefore a trial average step along the string is made before a new hadron
is produced, giving a likely hadron rapidity and p⊥ . Then, one may count the number of strings
crossing that rapidity, as a simple measure of string density. A smooth damping is applied for
particles produced at larger p⊥ , which are likely to be produced in minijets sticking out from the
denser-populated central region. The close-packing enhancement can be used e.g. to increase
strangeness production in high-multiplicity pp events, similar to the rope model, but it has not
been as extensively compared with data.


7.4 Hadronic rescattering


After hadrons have been produced, outgoing hadrons can interact in secondary collisions. This
rescattering can be relevant when studying collective effects, but can lead to a significant slow-
down of PYTHIA, and is therefore not on by default. It is enabled by setting HadronLevel:Rescatter
= on. Here, we will outline the rescattering algorithm, then summarize some notable effects of
rescattering of which the average user should be aware. A more detailed discussion of the rescat-
tering framework is given in ref. [214] in the context of pp collisions, while ref. [361] discusses
physics results for pA and AA collisions.
There are two aspects to the rescattering algorithm: first, describing how two hadrons interact
with each other in their rest frame; and second, describing the evolution of the event as a whole.
Consider two hadrons in their rest frame, with CM energy √s and impact parameter b. We
assume that the probability of an interaction occurring is a function of b and the total cross section
σtot . The cross section generally depends on √s and the specific hadron species, as described in
section 6.1.5. There is no solid theory for how P depends on b, so two different models are
implemented in PYTHIA 8.3. The default is a Gaussian dependency,
$$ P(b) = P_0 \, e^{-b^2/b_0^2} , \tag{346} $$

where P0 is referred to as the opacity, a free parameter that is 0.9 by default, and the characteristic
length scale is
$$ b_0 = \sqrt{\frac{\sigma_\mathrm{tot}}{P_0 \pi}} . \tag{347} $$
The alternative model is a disk model,

P(b) = P0 Θ(b0 − b) , (348)

where Θ is the Heaviside step function. For P0 = 1, this corresponds to the black-disk model used
by most existing hadronic rescattering frameworks. The two models are normalized such that
if b is chosen uniformly on a disk with radius much larger than b0 , then both models will give
the same interaction probability. In practice, rescattering is more likely in dense regions where b
tends to be biased towards lower values, so the narrower distributions like the black disk will lead
to more rescattering activity. If it is determined that the hadrons should interact, the interaction
time is defined as the time of closest approach in their rest frame.
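The common normalization of the two models can be checked numerically. The sketch below takes the disk to be opaque for b < b0, uses the same b0 for both models, and integrates P(b) over the transverse plane. Function names are hypothetical, and the cross section is given in fm² (1 mb = 0.1 fm²).

```python
import math

def b0(sigma_tot, opacity):
    """Characteristic length scale sqrt(sigma_tot / (P0 pi))."""
    return math.sqrt(sigma_tot / (opacity * math.pi))

def p_gauss(b, sigma_tot, opacity=0.9):
    """Gaussian interaction probability P0 exp(-b^2 / b0^2)."""
    r = b0(sigma_tot, opacity)
    return opacity * math.exp(-b * b / (r * r))

def p_disk(b, sigma_tot, opacity=0.9):
    """Disk interaction probability: P0 inside b0, zero outside."""
    return opacity if b < b0(sigma_tot, opacity) else 0.0

def integrated(prob, sigma_tot, opacity, b_max, n=20000):
    """Midpoint-rule integral of P(b) 2 pi b db from 0 to b_max."""
    db = b_max / n
    total = 0.0
    for i in range(n):
        b = (i + 0.5) * db
        total += prob(b, sigma_tot, opacity) * 2.0 * math.pi * b * db
    return total
```

Both `integrated(p_gauss, 4.0, 0.9, 12.0)` and `integrated(p_disk, 4.0, 0.9, 12.0)` reproduce the input cross section of 4 fm² up to the discretization error, which is the sense in which the two models give the same interaction probability for uniformly distributed b.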
The algorithm for performing rescattering for the whole event proceeds as follows:

1. Start with an event right after hadronization.

2. For each hadron pair, test whether they could interact, using the probability P defined above.

3. If a pair could potentially interact, record the interaction time for that pair in a time-ordered
list.

4. Choose the earliest interaction in the list where participants have not already interacted, and
simulate the collision. Which process to simulate is chosen with probabilities proportional
to the partial cross sections for each process.

5. Check whether the newly produced hadrons can interact with existing ones, and if so, add
the interaction times for those pairs to that list.


6. Continue picking interactions from the list until there are no more potential rescatterings.

Short-lived hadrons can also decay during the rescattering phase. To model this, the decay times
of those hadrons are recorded in the list together with rescattering interaction times, and the decay
occurs when it is chosen in step 4 above, if the hadron has not already rescattered.
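The event-level steps above can be sketched as a loop over a time-ordered queue. This is a toy illustration, not the PYTHIA code: `find_time` and `collide` are hypothetical user-supplied callables standing in for the pair test (including the probability P and the time of closest approach) and for the simulation of one collision (including the choice of process). Decays are omitted, and the sketch assumes that `find_time` only returns times later than the production of both hadrons.

```python
import heapq

def run_rescattering(hadrons, find_time, collide):
    """Toy rescattering loop.

    hadrons:   list of hadron objects right after hadronization
    find_time: find_time(a, b) -> interaction time, or None if no interaction
    collide:   collide(a, b) -> list of outgoing hadrons
    Returns the final hadrons and the list of performed interactions.
    """
    alive = dict(enumerate(hadrons))  # id -> hadron still in the event
    next_id = len(hadrons)
    queue = []                        # heap of (time, id_a, id_b)

    def schedule(i, j):
        t = find_time(alive[i], alive[j])
        if t is not None:
            heapq.heappush(queue, (t, i, j))

    # Steps 2-3: test all initial pairs and record potential interactions.
    ids = list(alive)
    for n, i in enumerate(ids):
        for j in ids[n + 1:]:
            schedule(i, j)

    history = []
    # Steps 4-6: process interactions in time order.
    while queue:
        t, i, j = heapq.heappop(queue)
        if i not in alive or j not in alive:
            continue  # a participant has already interacted
        a, b = alive.pop(i), alive.pop(j)
        history.append((t, a, b))
        new_ids = []
        for h in collide(a, b):
            alive[next_id] = h
            new_ids.append(next_id)
            next_id += 1
        # Step 5: pair each new hadron with the pre-existing ones.
        for k in new_ids:
            for other in list(alive):
                if other not in new_ids:
                    schedule(min(k, other), max(k, other))
    return list(alive.values()), history
```

Using integer identifiers in the heap entries keeps the ordering well defined even when two interactions have the same time.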
Enabling rescattering has a few consequences for the shape of events. First, rescattering increases
charged multiplicity, since only processes with two incoming hadrons are allowed, but inelastic
processes can produce more than two outgoing ones. For PYTHIA 8.307 with default parameters
and pp at 13 TeV, this can be compensated by setting MultipartonInteractions:pT0Ref
= 2.345. For beams such as pPb and PbPb, other values might restore the multiplicity, but a
more thorough retune is necessary in order to simultaneously obtain the correct multiplicity in all
three cases. In such a retune, it would also be relevant to include other effects such as ropes (sec-
tion 7.3.2) and shoving (section 7.3.1). For now, the user is advised to assume that rescattering
will change the charged multiplicity.
Similarly, hadron composition will change. The number of baryons in particular is reduced in
rescattering through annihilation processes. For example, the process pp̄ → π+ π− π0 is possible, but not
the reverse. Another way the composition changes is through resonance production, e.g. πK → K∗ ,
but be aware that this resonance production is not easily detectable in experiment; for a resonance
production followed by a decay, πK → K∗ → πK, the invariant mass of the outgoing system is the
same as for the incoming one. In other words, this process produces a K∗ that is visible in the
event record, without necessarily changing the observable πK mass spectrum.
A particular consequence of the increased multiplicity is that each hadron will on average have
lower p⊥ , which could affect e.g. spectra that are sensitive to p⊥ cuts. At the same time, the mean
p⊥ for particular hadron species may increase. This is the case for example with protons, which
will move slower than pions with similar p⊥ , and will therefore receive a push from behind. This
phenomenon is referred to as the “pion wind”.
Rescattering has been shown to give rise to some collective effects, in particular azimuthal flow
in PbPb collisions. PYTHIA 8.3/ANGANTYR with rescattering provides a good description of elliptic
flow coefficients at large multiplicities, and a more modest contribution at low multiplicities. It
can also lead to some jet modifications, but with the aforementioned p⊥ shift, it is not clear how
to interpret these modifications. See ref. [361] for further discussion.

7.5 Bose–Einstein effects


Ideally, coloured partons could be formed into colourless final-state hadrons using amplitude-
based quantum mechanics, but because these transitions are non-perturbative, phenomenological
models of hadronization are employed. Due to the probabilistic nature of these hadronization
models, coherence in final-state particles cannot be directly described. A classic example of such
final state coherence is Bose–Einstein effects, where correlations arise between identical bosons in
an event from the symmetrization of the production amplitude. While these correlations are expected
to have a negligible impact for most measurements in pp collisions, Bose–Einstein effects¹⁰
have been observed in minimum bias pp and pp̄ data [363–366], as well as e+ e− data [367–370].
Additionally, some precision measurements such as W -mass determination using hadronic final
states may be sensitive to Bose–Einstein effects [371].
10
Within the heavy-ion and astrophysics communities these effects are oftentimes discussed in the context of Hanbury-
Brown–Twiss interferometry [362].


Assuming a geometric picture with a Gaussian distribution of production vertices in space-time,
two-particle correlations of identical bosons are enhanced by a unitless factor of,
$$ f_2(Q) = 1 + \lambda \, e^{-Q^2/Q_0^2} , \tag{349} $$

with respect to a final state with no coherence effects [372]. Here, Q² is (pi − pj )² where pi and pj
are the four-momenta of the two particles, λ is the incoherence parameter, and Q0 is a reference
Q related to the radius of the particle source as r ≡ ħ/Q0 . The incoherence parameter is limited
between 0, where there is no effect, and 1, with a maximal effect.
For a high multiplicity event with multiple two-particle correlations, the event weight can be
naively approximated as the product of f2 (Q) for each particle pair. Note, this is a slight overestimate
of the event weight for most event topologies. If these event weights are applied directly, the
overall normalization changes, which would increase the cross section for the final state. If the weights are
normalized to unity, the total cross section is not modified, but the multiplicity distribution will
be shifted to higher multiplicities. Neither of these behaviours is desirable, as both cross sections
and multiplicity distributions are already well described without Bose–Einstein effects.
Instead, in PYTHIA Bose–Einstein effects are introduced by shifting the momenta inside particle
pairs. Assuming the distribution of Q for particle pairs is given by flat phase space, then solving,
$$ \int_0^{Q} \frac{q^2 \, \mathrm{d}q}{\sqrt{q^2 + 4m^2}} = \int_0^{Q'} f_2(q) \, \frac{q^2 \, \mathrm{d}q}{\sqrt{q^2 + 4m^2}} , \tag{350} $$

for Q′ determines the new Q value needed to produce an enhancement of f2 (Q) for that particle
pair with individual particle mass m. The three-momenta of the two particles can then be
shifted by,
$$ \Delta \vec{p}_{i,j} = c \, (\vec{p}_i - \vec{p}_j) , \tag{351} $$
where $\vec{p}\,'_i = \vec{p}_i + \Delta \vec{p}_{i,j}$ and $\vec{p}\,'_j = \vec{p}_j - \Delta \vec{p}_{i,j}$, and the constant coefficient c is determined from setting
Q′² = (p′i − p′j )² . Because events may have more than one particle pair, the total shift for a given
particle is then,
$$ \vec{p}\,'_i = \vec{p}_i + \sum_{j \neq i} \Delta \vec{p}_{i,j} , \tag{352} $$

where the sign of $\Delta \vec{p}_{i,j}$ is such that the three-momentum of the event is conserved.
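The determination of the shifted Q′ can be sketched numerically. The example below evaluates both sides of the phase-space condition with a midpoint rule and finds Q′ by bisection, using the Gaussian form of f2(Q). It is an illustration under these assumptions rather than the PYTHIA implementation; the function names are hypothetical, and the values in the usage note (λ = 1, Q0 = 0.2 GeV, charged-pion mass) are illustrative inputs.

```python
import math

def f2(q, lam, q0):
    """Gaussian enhancement factor 1 + lambda exp(-Q^2 / Q0^2)."""
    return 1.0 + lam * math.exp(-(q / q0) ** 2)

def phase_space_integral(q_max, m, weight=lambda q: 1.0, n=2000):
    """Midpoint-rule integral of weight(q) q^2 / sqrt(q^2 + 4 m^2) on [0, q_max]."""
    dq = q_max / n
    total = 0.0
    for i in range(n):
        q = (i + 0.5) * dq
        total += weight(q) * q * q / math.sqrt(q * q + 4.0 * m * m) * dq
    return total

def shifted_q(q, m, lam, q0, tol=1e-10):
    """Solve the phase-space condition for the shifted value Q' by bisection."""
    target = phase_space_integral(q, m)
    lo, hi = 0.0, q  # since f2 >= 1, the solution satisfies Q' <= Q
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if phase_space_integral(mid, m, lambda x: f2(x, lam, q0)) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For instance, `shifted_q(0.1, 0.1396, 1.0, 0.2)` returns a value below 0.1 GeV, so the pair is pulled together, while λ = 0 leaves Q unchanged.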
Effectively, this shifting of momentum corresponds to pulling particle pairs closer together,
and while three-momentum is conserved throughout this process, energy conservation is violated
and the total energy of the event is reduced. The form of f2 (Q) from eq. (349) arises from integrating
the pair symmetrization term 1 + cos(∆x · ∆p) over a Gaussian distribution of production
vertices in space-time. Consequently, any source distribution other than a Gaussian will result in
an oscillatory behaviour of f2 (Q), with alternating regions of f2 (Q) > 1, where Q′ < Q results in
the pair being pulled together, and f2 (Q) < 1, where Q′ > Q results in the pair being pushed apart. With
the appropriate damping of this oscillatory behaviour, for a given particle configuration, a form
of f2 (Q) can be found where conservation of both three-momentum and energy is achieved. Some
pairs at low Q are pulled together, reducing the net energy, while other pairs at intermediate Q are
pushed apart, increasing the net energy.
To achieve this behaviour, a form of f2 (Q) is selected to have one oscillation before damping.
The ansatz of the BE32 algorithm [373],
$$ f_2(Q) = \left[ 1 + \lambda \, e^{-Q^2/Q_0^2} \right] \left[ 1 + \alpha \lambda \, e^{-Q^2/(9 Q_0^2)} \left( 1 - e^{-Q^2/(4 Q_0^2)} \right) \right] \tag{353} $$


is chosen for α < 0, where the new second factor effectively models the initial minimum of the
oscillation as a smeared Gaussian [14]. This form does not have any deep physical meaning, but
provides the necessary first oscillation while maintaining the initial Gaussian distribution form,
and has the limiting behaviour of f2 (0) = 1 + λ. The factor α is iteratively determined per event
after calculating all relevant p′i , such that energy is still conserved even after three-momentum
shifting is performed for each relevant particle. Consequently, at least two particle pairs must be
present for Bose–Einstein effects to be introduced.
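The shape of the BE32 ansatz can be inspected with a few lines of code. The sketch below evaluates the two-factor form for a fixed, hypothetical value of α; in the algorithm described above α is instead determined iteratively for each event. The function name is an assumption of this example.

```python
import math

def f2_be32(q, lam, q0, alpha):
    """Two-factor BE32 form: Gaussian enhancement times a damped first oscillation."""
    x = (q / q0) ** 2
    first = 1.0 + lam * math.exp(-x)
    second = 1.0 + alpha * lam * math.exp(-x / 9.0) * (1.0 - math.exp(-x / 4.0))
    return first * second
```

With λ = 1, Q0 = 0.2 GeV and α = −0.3, the function equals 1 + λ at Q = 0, dips below unity around Q of a few Q0, and returns to unity at large Q, which is the single damped oscillation needed to balance the energy.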
Bose–Einstein correlations are performed after hadronization but prior to particle decays, and
are not included by default. Effects may be switched on or off for different particle groupings:
pions with π0 , π+ , and π− pairs; kaons with K⁰_S , K⁰_L , K+ , and K− pairs; and eta mesons with η
and η′ pairs. Many of these particle species are produced not only from primary hadronization,
but also from the decay of short-lived particles. Consequently, a configurable minimum decay
width can be set so that any particles with a larger width are decayed prior to the application of
Bose–Einstein effects. The default minimum decay width is set at 0.02 GeV so that both ρ and K∗
mesons are decayed before correlations are introduced. Both the shifted and unshifted versions of
particles are kept in the event record for bookkeeping purposes. All shifted particles are assigned
a status of 99 and are set as the children of their unshifted entry.

7.6 Deuteron production


The deuteron, D, is a bound proton and neutron state, which, similar to Bose–Einstein effects
(see section 7.5), must be formed after hadronization. Understanding deuteron production in
the context of collider-based experiments can help efforts in modelling nuclei formation and re-
duce prediction uncertainties when searching for possible dark-matter induced excesses in cosmic
ray flux ratios [374]. In heavy-ion physics, formation of loosely bound systems is often used
to determine the chemical freeze-out temperature in statistical hadronization models [375]. In
PYTHIA 8.3, two deuteron formation models are available, the coalescence model [376, 377] and
the more sophisticated Dal–Raklev model [378]. Both models are implemented through the same
configurable framework, with the Dal–Raklev model set as the default configuration. All deuteron
production is switched off unless explicitly requested by the user. Note that while the discussion
here is for the deuteron, anti-deuteron production is also performed following the exact same
method, but with all particles swapped to antiparticles.
In the coalescence model, all possible p and n pair combinations are determined. For each
pair the magnitude of their three-momentum difference,
$$ k = \sqrt{(\vec{p}_i - \vec{p}_j)^2} \tag{354} $$

is calculated in the rest frame of the pair, where $\vec{p}_i$ and $\vec{p}_j$ are the three-momenta of the two nucleons. If
k is less than some cutoff value c0 , the pair is bound into a deuteron, otherwise the nucleons remain
unmodified. Spatial separation, in addition to momentum separation, could also be considered,
although this has not been implemented in any of the models described here. The ordering of
testing pairs for binding is randomized, and after each successful binding, any remaining pairs
containing one of the bound nucleons are no longer considered for binding. For the coalescence
model, this implies that the binding cross section is flat as a function of k. If there are two unique
pairs each with k < c0 , both pairs have an equal probability of being bound, even if one k is smaller
than the other.
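The pair selection of the coalescence model can be sketched as follows. The example computes k in the pair rest frame from Lorentz invariants (k equals twice the rest-frame momentum of either nucleon), tests all pn pairs in random order, and removes bound nucleons from further consideration. It is a toy illustration with hypothetical function names; the cutoff value used in the usage note is not a PYTHIA default.

```python
import math
import random

def four_vector(m, px, py, pz):
    """Return (E, px, py, pz) for a particle of mass m, in GeV."""
    return (math.sqrt(m * m + px * px + py * py + pz * pz), px, py, pz)

def mass_squared(p):
    """Invariant mass squared of a four-vector."""
    return p[0] ** 2 - p[1] ** 2 - p[2] ** 2 - p[3] ** 2

def k_rest_frame(p1, p2):
    """Magnitude of the three-momentum difference in the pair rest frame."""
    total = tuple(a + b for a, b in zip(p1, p2))
    s = mass_squared(total)
    m1sq, m2sq = mass_squared(p1), mass_squared(p2)
    kallen = (s - m1sq - m2sq) ** 2 - 4.0 * m1sq * m2sq
    return math.sqrt(max(kallen, 0.0) / s)

def coalesce(protons, neutrons, c0, seed=1):
    """Return the list of (proton index, neutron index) pairs bound into deuterons."""
    rng = random.Random(seed)
    pairs = [(i, j) for i in range(len(protons)) for j in range(len(neutrons))]
    rng.shuffle(pairs)  # randomized testing order
    used_p, used_n, bound = set(), set(), []
    for i, j in pairs:
        if i in used_p or j in used_n:
            continue  # one of the nucleons is already bound
        if k_rest_frame(protons[i], neutrons[j]) < c0:
            bound.append((i, j))
            used_p.add(i)
            used_n.add(j)
    return bound
```

Because the order is shuffled and the acceptance is a pure cut on k, two candidate pairs below the cutoff are equally likely to be bound, in line with the flat binding cross section described above.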
After a nucleon pair is selected for binding, a deuteron is formed. In principle, the three-momentum
of this deuteron could be calculated as $\vec{p}_i + \vec{p}_j$, and while three-momentum for the


event would be conserved, energy would not. Instead, an isotropic decay into a deuteron and
photon is performed in the rest frame of the pair. Because the primary process for deuteron for-
mation at the low momentum differences of the coalescence model is radiative capture, pn → γD,
this provides a reasonable approximation of the process and conserves both energy and momen-
tum. While spin correlations could be considered, these typically are negligible after boosting the
deuteron and photon into the laboratory frame.
The Dal–Raklev model expands upon the coalescence model by considering the following formation
channels, rather than only pn → γD.

• pn → γD
• pn → π0 D
• pn → π− π+ D
• pn → π0 π0 D
• pp → π+ D
• nn → π− D
• pp → π+ π0 D
• nn → π− π0 D

Channels can be removed, modified, or added. Each channel must have a two-body initial state
and an n-body final state where n > 1 and at least one of the outgoing particles is a deuteron. For
each of these channels the kinematics of the final state are determined by an isotropic decay in the
rest frame of the initial state pair. Additionally, the binding cross section is no longer considered as
a uniform distribution up to a cutoff parameter, but is instead determined from fits of differential
nucleon-scattering data from a number of experiments [378].
Four cross-section parameterizations are available. For each channel, one of the following
parameterizations must be selected, and the necessary coefficients ci provided.

1. The coalescence model parameterization as previously described is a step function with two
parameters, the cutoff parameter c0 and a normalization parameter c1 . The normalization
allows channels using this parameterization to be used in combination with other channels.

$$ \frac{\mathrm{d}\sigma(k)}{\mathrm{d}k} = c_1 \, \Theta(c_0 - k) \tag{355} $$
2. The pn → γD cross-section differential in k can be parameterized by a polynomial below a
cutoff of c0 , and with an exponential above. Due to Runge’s phenomenon, the polynomial
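is fixed to its value at k = 0.1 GeV for k < 0.1 GeV.

$$ \frac{\mathrm{d}\sigma(k)}{\mathrm{d}k} = \begin{cases} \mathrm{d}\sigma(0.1~\mathrm{GeV})/\mathrm{d}k & \text{for } k < 0.1~\mathrm{GeV} \\ \sum_{i=1}^{12} c_i k^{i-2} & \text{for } 0.1~\mathrm{GeV} \leq k < c_0 \\ e^{-c_{13} k - c_{14} k^2} & \text{otherwise} \end{cases} \tag{356} $$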

3. The two-body final states with a pion and deuteron are parameterized using a cross sec-
tion differential in q, the momentum magnitude of the pion in the nucleon-pair rest frame,
divided by the mass of the pion. Because the final state is two-body, the pion momentum
magnitude is already known a priori.

$$ \frac{\mathrm{d}\sigma(q)}{\mathrm{d}q} = \frac{c_0 \, q^{c_1}}{(c_2 - e^{c_3 q})^2 + c_4} \tag{357} $$

In the default Dal–Raklev model, the pn → π0 D, pp → π+ D, and nn → π− D channels use
this parameterization.


4. The cross sections for the three-body final states with pions are differential in k and are
parameterized with,
$$ \frac{\mathrm{d}\sigma(k)}{\mathrm{d}k} = \sum_{i=0} \frac{c_{5i} \, k^{c_{5i+1}}}{(c_{5i+2} - e^{c_{5i+3} k})^2 + c_{5i+4}} , \tag{358} $$
where the number of coefficients is variable but must be a multiple of 5. The default
pn → π− π+ D, pn → π0 π0 D, pp → π+ π0 D, and nn → π− π0 D channels use this parameterization.
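The four parameterizations above can be written as short functions, which may help when checking a set of user-supplied coefficients. This is a sketch with hypothetical function names, taking the coefficients as a list `c` indexed as in the formulas; no fitted values from the model are reproduced here, and the second form assumes a cutoff c0 above 0.1 GeV.

```python
import math

def xs_step(k, c):
    """Form 1: step function with cutoff c[0] and normalization c[1]."""
    return c[1] if k < c[0] else 0.0

def xs_poly_exp(k, c):
    """Form 2: polynomial below the cutoff c[0], exponential above.

    c[1]..c[12] are polynomial coefficients, c[13] and c[14] the exponential ones.
    Below 0.1 GeV the value at 0.1 GeV is used.
    """
    k = max(k, 0.1)
    if k < c[0]:
        return sum(c[i] * k ** (i - 2) for i in range(1, 13))
    return math.exp(-c[13] * k - c[14] * k * k)

def xs_pion(q, c):
    """Form 3: two-body pion-deuteron final states, five coefficients."""
    return c[0] * q ** c[1] / ((c[2] - math.exp(c[3] * q)) ** 2 + c[4])

def xs_two_pion(k, c):
    """Form 4: sum of terms of the form-3 shape, five coefficients per term."""
    if len(c) == 0 or len(c) % 5 != 0:
        raise ValueError("number of coefficients must be a positive multiple of 5")
    return sum(
        c[i] * k ** c[i + 1] / ((c[i + 2] - math.exp(c[i + 3] * k)) ** 2 + c[i + 4])
        for i in range(0, len(c), 5)
    )
```

Since the relative normalization between channels matters, the same units should be used for all coefficient sets that are combined.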

Not only does the shape of the differential cross sections matter, but also the normalization, as
this determines the relative rates between the channels. In the Dal–Raklev model the γD channel
dominates at low k. For k > 1 GeV the πD channels dominate except between roughly 1 and 2 GeV
where the ππD channels dominate. An additional unitless normalization scale can be configured
to increase or decrease the total deuteron production cross section. A number of fits for this
normalization constant have been made using various data sets from the LHC, with the default
normalization set from differential 7 TeV ALICE data [378].

8 Particles and decays


There are several ways to classify unstable particles, and in PYTHIA at least three classifications
are useful:

• by lifetime, specifically for coloured particles whether above or below the hadronization
time;

• by whether the partial and total widths of a particle are perturbatively calculable, such as for the
µ, γ∗ /Z, W± , top, Higgs bosons, and most BSM particles, or not, such as for hadrons;

• by whether a particle is part of the hard process, and cannot be produced anywhere else, such as
in parton showers or hadronization, e.g. Z, W± , top, and Higgs bosons.

These classifications are necessary to understand how particles are technically treated.
In PYTHIA a distinction is made between the following technical representations of particle
states: resonances with an average lifetime shorter than the hadronization scale; particles with
an average lifetime comparable to or longer than the hadronization scale; and partons that carry
colour charge and must be hadronized. By default, any state with a nominal mass above 20 GeV
is considered as a resonance, e.g. γ∗ /Z, W± , top, Higgs bosons, and most BSM states such as
sfermions and gauginos. However, some light hypothetical weakly interacting or stable states,
such as the gravitino, are also by default considered as resonances to ensure a full treatment of
angular correlations in their decays. All remaining states without colour charge, primarily leptons
and hadrons, are treated as particles, while quarks and gluons are considered as partons. There are
some exceptions like colour-octet onia, which are treated as both partons carrying colour charge,
and particles that decay after hadronization.
Resonance states are sequentially decayed during the hard process, see section 3.11 for details.
For example, in the hard process gg → H → Z[→ µ+ µ− ]Z[→ µ+ µ− ] the decay of the Higgs into Z
bosons is first performed, followed by the decays of the γ∗ /Z resonances into muon pairs. The cross
sections calculated for hard processes with resonances depend upon the available decay channels


of the resonances; closing decay channels will reduce the cross section for the process while open-
ing decay channels will increase the cross section. Consequently, when using the cross section
calculated for a resonance produced in a hard process, the available branching fractions of the res-
onance are already included in the cross section. In most cases, angular correlations are included
in the decay of the resonance. In some cases, mixed decays of the resonances are needed, e.g.
gg → H → Z[→ µ+ µ− ]Z[→ e+ e− ]. In this example, both the muon and electron channels could
be left open. However, in some cases this might result in inefficient generation of the required
final state. Consequently, a special class is available in PYTHIA, ResonanceDecayFilterHook,
which can be used to select specific final states from the resonance decays.
Lighter states such as the J/ψ, which can be produced by the hard processes of section 3.3,
are technically treated by PYTHIA as particles and not resonances. This is because the J/ψ can
be produced in both hadronization and particle decays, where the cross section of these J/ψ pro-
duction mechanisms is not known a priori. The reduced cross section of the J/ψ due to closed
decay channels can only be determined after the generation of full events, including J/ψ pro-
duction from the hard process, hadronization, and particle decays. Similarly, states that are only
produced in the hadronization and particle decays, e.g. the ρ, are also considered particles and
not resonances. An important exception to the treatment of resonances is the production of weak
bosons in the parton shower, see section 4.1.4. Here, while closing the decay channels of the weak
bosons will modify the hard-process cross section, the decays of the weak bosons in the parton
shower will still remain inclusive. The decay channels of the weak bosons in the parton shower
can be selected using the special IDs 93 and 94 for the Z and W, respectively. However, changing
these decay channels will not affect the hard-process cross section, and the resulting bookkeeping must be
handled carefully by the user.

8.1 Particle properties


For all states, a number of properties must be defined. Each state is uniquely identified by its PDG
ID, or when a PDG convention is not available, a PYTHIA specific numbering convention, i.e. for
BSM and colour-octet onium states. For each state a human readable name is stored, as well as
an antiparticle name when relevant. The quantum numbers for each state must be defined: the
spin, electric charge, and colour charge. Note that the spin information is duplicated for hadrons,
where the spin can also be determined from the PDG ID. The experimentally observable properties
of the state are also specified including a nominal mass, a nominal width, allowed limits of this
width, and a nominal proper lifetime. Additionally, a number of decay related options can be
specified including whether the state may decay, if the width is perturbatively calculable, and
if the width should be forced to be rescaled. Each state may also have a list of decay channels
which determine how the state is decayed. Each channel is configured with a flag specifying if
the channel is available for the particle/antiparticle state, a branching ratio, a mode specifying a
possible matrix element for the decay, and a list of the decay products.

8.1.1 Masses
The default masses for most particles in PYTHIA are taken from experimental observation as sum-
marized by the PDG [379]. There are three exceptions: quarks and diquarks, unobserved or
poorly studied hadrons, and hypothetical BSM particles. For hypothetical particles, e.g. BSM Higgs
bosons, hidden valley hadrons, or fourth generation fermions, reasonable defaults are provided,
see section 10.1.2 for details.


Due to ill-defined quark masses, two types of quark masses are used in PYTHIA, kinematic and
running. The kinematic masses are those defined in the PYTHIA particle database, and are used
when generating phase space. For example, in the process gg → cc̄, the kinematic mass of the c
quark is used. Similarly, the g → qq̄ splittings of the parton shower use these kinematic masses.
While these quark masses can be changed, their default values have been carefully chosen fol-
lowing a number of considerations [304]. Modifying these default values can lead to unintended
consequences across all aspects of PYTHIA including hard process generation, the parton shower,
hadronization, and even particle decays. Consequently, care should be taken when changing these
quark masses from their default values.
Running quark masses are used when calculating mass-dependent couplings, which include
couplings of the quarks to SM and BSM Higgs bosons. The running masses for the quarks are
calculated at one loop in the $\overline{\mathrm{MS}}$ renormalization scheme using,
$$ m(Q) = m_0 \left( \frac{\ln(Q_0/\Lambda)}{\ln(Q/\Lambda)} \right)^{12/(33 - 2 n_f)} , \tag{359} $$

where m0 is the input mass at the reference scale Q 0 , and nf is the number of active flavours in
calculating αs . For the light quarks, Q 0 is set at 2 GeV, while for the c, b, and t quarks Q 0 is set at
m0 . The input masses can be configured with the parameters ParticleData:mXRun, where X is
the quark name, i.e. either d, u, s, c, b, or t, to be put equal to the $\overline{\mathrm{MS}}$ mass of the quark. The
reference value of αs used in calculating Λ is defined at the scale mZ and set with the parameter
ParticleData:alphaSvalueMRun.
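The one-loop running can be evaluated directly. The sketch below uses the standard one-loop exponent 12/(33 − 2nf) with a fixed number of flavours and takes Λ as a direct input, whereas PYTHIA derives it from the αs reference value; flavour thresholds are ignored. The function name and the numerical values in the usage note are illustrative assumptions.

```python
import math

def running_mass(q, m0, q0, lam, nf):
    """One-loop running quark mass m(Q), with m(Q0) = m0.

    q, q0, lam are scales in GeV with q, q0 > lam; nf is the number of active flavours.
    """
    if q <= lam or q0 <= lam:
        raise ValueError("scales must lie above Lambda")
    exponent = 12.0 / (33.0 - 2.0 * nf)
    return m0 * (math.log(q0 / lam) / math.log(q / lam)) ** exponent
```

For a b quark with m0 = 4.2 GeV at Q0 = m0, Λ = 0.2 GeV and nf = 5, `running_mass(125.0, 4.2, 4.2, 0.2, 5)` gives a mass noticeably below the input value, which directly reduces a mass-proportional coupling evaluated at the Higgs mass scale.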
The default masses of unobserved hadrons and diquarks have been set using the constituent
mass model from PYTHIA 6 [14, 380], which considers the spin-spin couplings of the quark com-
binations. The semi-empirical formula for a hadron mass is given by,
$$ m = m_0 + \sum_i m_i + k \, m_d^2 \sum_{i<j} \frac{S_{ij}}{m_i m_j} , \tag{360} $$

where the terms m0 and k are determined from known hadron masses and depend upon the mul-
tiplet of the hadron, mi are the constituent quark masses, and Si j are the spin-spin interactions for
each quark-pair combination. The constituent quark masses are taken from PYTHIA 6 as 0.325 GeV
for the u and d, 0.5 GeV for the s, 1.6 GeV for the c, and 5 GeV for the b. Since the t does not form
narrow bound states, the t constituent mass is not needed.
For mesons and diquarks, there is only one quark pair, given by q1 and q2 . For diquarks and
meson multiplets with orbital angular momentum L = 0, the spin-spin term for S = 0 states is
S12 = −3, while for the S = 1 states this term is S12 = 1. For both pseudoscalar and vector
mesons, m0 is set to 0 GeV and k is fitted to be 0.16 GeV. For the excited multiplets with L = 1,
the spin-spin terms vanish with k set to 0 GeV and m0 fitted to be 0.45 GeV, 0.5 GeV, 0.55 GeV, and
0.6 GeV for scalars, S = 0 axial-vectors, S = 1 axial-vectors, and tensors, respectively. The masses
of diquarks are calculated using the same k value as for baryons, 0.048 GeV, and m0 = 0.077 GeV
which is two-thirds the baryon m0 value.
There are three possible quark-pair combinations for baryons, and the spin-spin terms depend not
only upon the spin of the baryon, but also the quark composition. For S = 1/2 baryons the spin-spin
term is given by,

$$ \sum_{i<j} \frac{S_{ij}}{m_i m_j} = \frac{1}{m_1 m_2} - \frac{2}{m_1 m_3} - \frac{2}{m_2 m_3} , \tag{361} $$

if there are either two identical flavours, q1 and q2, or all the quark flavours are different and
the two lighter quarks are in a symmetric spin state. For this symmetric case q3 is the
heaviest quark, while q1 and q2 are the two lighter quarks. When all the quarks are different
flavours and the light quark pair is anti-symmetric, the spin-spin term is given by

$$ \sum_{i<j} \frac{S_{ij}}{m_i m_j} = -\frac{3}{m_2 m_3} , \tag{362} $$

where q2 and q3 are the two lighter quarks when relevant. For the S = 3/2 baryons, the spin-spin
term is given by,

$$ \sum_{i<j} \frac{S_{ij}}{m_i m_j} = \frac{1}{m_1 m_2} + \frac{1}{m_1 m_3} + \frac{1}{m_2 m_3} , \tag{363} $$

where the ordering of the quarks does not matter. For all baryons, the fitted parameters are set
as m0 = 0.11 GeV and k = 0.048 GeV. The default masses for a number of baryons in PYTHIA are
calculated using these factors and eq. (360). These baryons include the double and triple-heavy
Ξ and Ω baryons.
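
As a rough numerical illustration of eq. (360), the sketch below evaluates the constituent-mass formula with the parameter values quoted above. It is not PYTHIA code: the function name and the data layout are hypothetical, and m_d is assumed to be the constituent d-quark mass of 0.325 GeV. The spin terms are passed per quark pair in the order (12, 13, 23), matching eqs. (361) and (363).

```python
from itertools import combinations

# Constituent quark masses in GeV, as quoted in the text.
CONSTITUENT_MASS = {"d": 0.325, "u": 0.325, "s": 0.5, "c": 1.6, "b": 5.0}
M_D = CONSTITUENT_MASS["d"]  # assumption: m_d in eq. (360) is the d constituent mass


def hadron_mass(quarks, spin_terms, m0, k):
    """Evaluate m = m0 + sum_i m_i + k * m_d^2 * sum_{i<j} S_ij / (m_i m_j).

    quarks     -- flavour labels, e.g. ["u", "u", "d"]
    spin_terms -- S_ij for each pair i < j, in itertools.combinations order
    """
    masses = [CONSTITUENT_MASS[q] for q in quarks]
    pairs = list(combinations(range(len(masses)), 2))
    if len(pairs) != len(spin_terms):
        raise ValueError("need exactly one spin term per quark pair")
    spin_sum = sum(s / (masses[i] * masses[j])
                   for (i, j), s in zip(pairs, spin_terms))
    return m0 + sum(masses) + k * M_D**2 * spin_sum


# L = 0 mesons: S12 = -3 for S = 0 and S12 = +1 for S = 1.
pion = hadron_mass(["u", "d"], [-3], m0=0.0, k=0.16)
rho = hadron_mass(["u", "d"], [1], m0=0.0, k=0.16)
# Baryons: eq. (361) for S = 1/2 and eq. (363) for S = 3/2.
proton = hadron_mass(["u", "u", "d"], [1, -2, -2], m0=0.11, k=0.048)
delta = hadron_mass(["u", "u", "d"], [1, 1, 1], m0=0.11, k=0.048)
```

The light hadrons are used here only because their inputs are easy to check by hand; in PYTHIA the formula matters for states without a measured mass.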

8.1.2 Widths
Widths are relevant for sampling the masses of both resonances and particles. For particles, widths
are fixed when sampling a particle mass except for the case of hadronic rescattering, see sec-
tions 7.4 and 8.2.3 for further details. The parameter ParticleData:modeBreitWigner deter-
mines what type of distribution is used to select particle masses. Note that this parameter is set for
all particle species; it is not possible to choose different mass shapes on a species-by-species basis.
For a value of 0 the fixed on-shell particle mass is used, while for 1 a non-relativistic Breit–Wigner
is used,

$$ P(m)\, dm \propto \frac{1}{(m - m_0)^2 + \Gamma^2/4}\, dm . \tag{364} $$

By setting a value of 2 a mass-dependent width can be included,

$$ \Gamma(m) = \Gamma_0 \sqrt{\frac{m^2 - m_{\mathrm{thr}}^2}{m_0^2 - m_{\mathrm{thr}}^2}} , \tag{365} $$

where m is the selected mass, m0 is the on-shell mass, and mthr is the average threshold mass.
The threshold mass is the sum of the on-shell masses for the decay products, and is consequently
channel dependent. However, to decouple mass selection and decay, the mass threshold is taken
as the average mass threshold for all decay channels, weighted by branching fraction.
A relativistic Breit–Wigner can also be selected,

$$ P(m^2)\, dm^2 \propto \frac{1}{(m^2 - m_0^2)^2 + m_0^2 \Gamma^2}\, dm^2 , \tag{366} $$

with the option 3, where a fixed Γ is used just as for option 1. The relativistic Breit–Wigner can
also be used with the mass-dependent width of eq. (365) with option 4. For all mass selection
options, relativistic or otherwise, the mass distribution is truncated via the NN:mMin and NN:mMax
parameters set for each particle species; here, NN is the given particle species ID. The default mass
shape in PYTHIA is option 4, a mass-dependent relativistic Breit–Wigner.
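
The following sketch shows one way the truncated non-relativistic shape of eq. (364) could be sampled, together with the mass-dependent width of eq. (365). It is an illustration under stated assumptions rather than the PYTHIA implementation: the function names are hypothetical, the width is kept fixed while sampling (as for option 1), and the sampling inverts the cumulative distribution with the substitution m = m0 + (Γ/2) tan a, which makes a uniformly distributed.

```python
import math
import random


def width_mass_dependent(m, m0, gamma0, m_thr):
    """Eq. (365): Gamma(m) = Gamma0 * sqrt((m^2 - m_thr^2) / (m0^2 - m_thr^2))."""
    if m <= m_thr:
        return 0.0  # below the average threshold the width is taken to vanish
    return gamma0 * math.sqrt((m**2 - m_thr**2) / (m0**2 - m_thr**2))


def sample_bw_nonrel(m0, gamma, m_min, m_max, rng=random):
    """Draw m from eq. (364), truncated to [m_min, m_max], by CDF inversion."""
    a_min = math.atan(2.0 * (m_min - m0) / gamma)
    a_max = math.atan(2.0 * (m_max - m0) / gamma)
    a = a_min + (a_max - a_min) * rng.random()
    return m0 + 0.5 * gamma * math.tan(a)
```

For example, `sample_bw_nonrel(0.775, 0.149, 0.3, 1.5)` draws a mass in GeV for a state with illustrative ρ-like parameters; the truncation limits play the role of NN:mMin and NN:mMax.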

For particles with broad mass distributions that are not treated as resonances, the mass se-
lection models above can distort the particle branching ratios. Regardless of the selected mass
of a particle, all decay channels, even those with an on-shell mass threshold above the selected
mass, are considered. Only after the masses for the decay products are sampled, are channels
eliminated if not kinematically available. In this way, decay channels can remain open if there are
downward fluctuations in the selected masses of the decay products. However, if the mass distri-
bution for a particle is truncated at a lower mass, decay channels with lower mass thresholds may
be enhanced. A good example is the ρ 0 which, as a broad low-mass resonance, has any number
of non-perturbative and threshold effects. The mass distribution for the ρ 0 is limited by the rare
e+ e− decay channel. However, truncating the mass distribution at this mass threshold can result
in oversampling the lighter decay channels. Consequently, the mass threshold for the ππ channel
is used instead for the default low-mass truncation of the ρ 0 mass distribution.
For resonances, widths are sampled using the relativistic Breit–Wigner of eq. (366), but with
a number of options available for determining the partial widths of the resonance at a given mass.
This calculation method can be set by the user with the NN:meMode parameter when defining
the decay channels for a particle. The default value is 0, where the partial width is calculated
perturbatively for the resonance if already available in PYTHIA. If a given width is not available
via a perturbative calculation, then this width is set to zero. However, a number of alternative
partial width calculations are available.

• NN:meMode = 100: The partial width is defined as the branching fraction for that decay
channel, multiplied by the total width. This method results in mass-independent widths and
does not account for mass-threshold effects, which may result in issues when a resonance
is produced significantly off-shell with a mass below the on-shell mass. When this occurs,
it is possible that no decay channels remain kinematically open, and the resonance can
no longer be decayed. However, it is also possible that downward mass fluctuations may
occur in the masses of the decay products, allowing some channels. Consequently, all decay
channels are considered whenever a resonance is decayed, even if the on-shell masses of
the products kinematically exclude the channel.

• NN:meMode = 101: The partial widths are calculated in the same fashion as for NN:meMode
= 100, but are now set to zero if the sum of the on-shell masses for the decay products
is not kinematically allowed at the mass for which the partial width is being calculated.
Consequently, the total width becomes mass dependent through the introduction of step
functions at the kinematic limits for each decay channel.

• NN:meMode = 102: This method builds upon the method of NN:meMode = 101 but uses a
smooth threshold factor, rather than a step function. For two-body decays the partial width
is multiplied by the factor,
$$ \beta = \sqrt{(1 - m_1^2/m^2 - m_2^2/m^2)^2 - 4 m_1^2 m_2^2/m^4} , \tag{367} $$

where mi are the masses of the decay products and m is the selected mass of the decaying
resonance. While this correctly includes the phase-space suppression for an isotropic two-
body decay, any channel specific modifications due to the matrix element for the decay are
not included. For higher multiplicity decays, a less sophisticated factor of,
$$ \beta = \sqrt{1 - \sum_i m_i/m} , \tag{368} $$

is used which roughly approximates the phase-space suppression. For this method, the
branching ratio for each decay channel should be provided without a phase-space sup-
pression factor, otherwise phase-space suppression for that channel will be double counted.
When using this method, the branching fractions for the resonance as calculated by PYTHIA
will not match those provided by the user.

• NN:meMode = 103: The phase-space suppression of NN:meMode = 102 is used, but the
branching fraction for the channel is multiplied by $\beta_0^{-1}$, the inverse of the β factor calculated
at the on-shell mass of the resonance. Consequently, the branching fractions no longer need to be
adjusted for phase-space suppression, and the branching fractions calculated by PYTHIA will match
those provided by the user. However, in some cases $\beta_0^{-1}$ can be very large if a channel is very near
threshold for the nominal mass of the resonance. The parameter ResonanceWidths:minThreshold
defines the minimum allowed β0 and so limits the size of this correction for resonance
masses well above the on-shell mass.

Note that it is possible to mix and match partial width calculation methods for a given resonance,
i.e. some decay channels may have their partial width calculated perturbatively, while the methods
outlined above are used for others.
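
The four options above can be summarized in a short sketch. The code below is a hypothetical illustration, not the PYTHIA implementation: the function names and argument layout are invented here, and `min_threshold` merely stands in for the value of ResonanceWidths:minThreshold, which the caller must supply.

```python
import math


def beta_threshold(m, product_masses):
    """Threshold factor: eq. (367) for two products, eq. (368) otherwise."""
    if m <= sum(product_masses):
        return 0.0
    if len(product_masses) == 2:
        r1 = (product_masses[0] / m) ** 2
        r2 = (product_masses[1] / m) ** 2
        return math.sqrt(max(0.0, (1.0 - r1 - r2) ** 2 - 4.0 * r1 * r2))
    return math.sqrt(1.0 - sum(product_masses) / m)


def partial_width(me_mode, m, m0, br, gamma_tot, product_masses, min_threshold):
    """Partial width at resonance mass m for meMode 100-103.

    br is the user-provided branching fraction and gamma_tot the total width.
    """
    base = br * gamma_tot
    if me_mode == 100:  # mass-independent width
        return base
    if me_mode == 101:  # step function at the on-shell threshold
        return base if m > sum(product_masses) else 0.0
    beta = beta_threshold(m, product_masses)
    if me_mode == 102:  # smooth threshold factor
        return base * beta
    if me_mode == 103:  # rescaled so the on-shell branching fractions are kept
        beta0 = max(beta_threshold(m0, product_masses), min_threshold)
        return base * beta / beta0
    raise ValueError("this sketch only covers meMode 100-103")
```

With meMode = 103 evaluated at m = m0 the function returns br times the total width, provided β0 lies above the minimum threshold, which is the statement that the user-provided branching fractions are reproduced on shell.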

8.1.3 Lifetimes
While the lifetime of a particle is inversely related to its width, decoupling the lifetime and width
of a particle is oftentimes useful for practical purposes. Consequently, both the lifetime and the
width of a particle species can be specified independently in PYTHIA. The lifetime is given as the
nominal proper lifetime multiplied by the speed of light, cτ0 , and has units of millimetres. For
particles with a non-zero lifetime, a lifetime is sampled according to an exponential decay,

P (τ) dτ ∝ exp(−τ/τ0 ) dτ , (369)

where the τ0 used here is not calculated from the width, but rather specified independently. When
the hadronic-rescattering framework is enabled and the independently provided τ0 is zero, the
nominal proper lifetime is automatically calculated using the width, if the particle species has at
least a single available decay channel. See section 7.4 for details. Similarly, missing lifetimes are
calculated when vertex positions and rapid hadron decays are enabled in the hadronization. For
resonances, τ0 is automatically determined from the calculated width of the resonance. However,
in some cases very long lifetimes are necessary, which could result in such narrow widths that the
calculation of the cross section becomes numerically unstable. Here, the width and lifetime for a
resonance can be made independent by setting the flag NN:tauCalc = false for that resonance.
This can be particularly useful when scanning lifetime space for BSM resonances.
After the lifetime for a particle or resonance is selected, the decay vertex position is calculated
as,
$$ x_{\mathrm{dec}} = x_{\mathrm{pro}} + \tau \frac{p}{m} , \tag{370} $$
where m is the mass of the particle, p the momentum, and x pro the production-vertex posi-
tion that may be either the primary interaction point or from some previous decay. This treat-
ment of the decay vertex assumes all particles travel without interaction, including no magnetic
fields or interactions with detector materials. Consequently, decay chains can be stopped to al-
low the subsequent decays of the particles to be handed to a detector simulation. A number
of criteria for stopping decays are available. Particles with a specified minimum nominal lifetime
can be stopped from decaying with the flag ParticleDecays:limitTau0. Similarly, particles
with a selected lifetime greater than a configurable minimum lifetime can be set stable with
the ParticleDecays:limitTau flag. Particles can also be limited from decaying geometri-
cally, either within a sphere with ParticleDecays:limitRadius, or within a cylinder with
ParticleDecays:limitCylinder.
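
Equations (369) and (370) can be combined into a minimal sketch of lifetime sampling and vertex displacement. The function names are hypothetical, and the vertex and momentum are assumed to be four-vectors ordered as (ct, x, y, z) in mm and (E, px, py, pz) in GeV, so that eq. (370) is applied component by component.

```python
import math
import random


def sample_proper_lifetime(ctau0, rng=random):
    """Draw c*tau in mm from eq. (369), P(tau) ~ exp(-tau / tau0)."""
    # 1 - random() lies in (0, 1], so the logarithm is always finite.
    return -ctau0 * math.log(1.0 - rng.random())


def decay_vertex(x_pro, p, m, ctau):
    """Eq. (370) per component: x_dec = x_pro + tau * p / m.

    x_pro -- production vertex (ct, x, y, z) in mm
    p     -- four-momentum (E, px, py, pz) in GeV
    m     -- particle mass in GeV (must be non-zero)
    ctau  -- selected proper lifetime times c, in mm
    """
    return tuple(x + ctau * pc / m for x, pc in zip(x_pro, p))
```

For a particle at rest only the time component changes, while for a boosted particle the spatial displacement is cτ βγ along the momentum, as expected for free propagation without fields or material.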

8.2 Decays
Particle decays might at first appear to be one of the simpler components of PYTHIA, given the
clear factorization between the production and decay of particles. The masses, widths, and decay
channels for most particles can be set directly to experimentally observed values, and typically do
not require sophisticated calculations. Once this information is provided, a particle can be decayed
by randomly selecting a decay channel with a weight proportional to its branching fraction, and
then distributing the products of the selected channel according to phase space. However, there
are a number of complications which require modifications to this initial approach.
The technical generation of phase space for decays with more than three products can be non-
trivial to perform efficiently, and requires the use of specialized algorithms such as the M-generator
or RAMBO, which are introduced in section 2.2.4. After phase-space generation, a matrix-element
weight can be applied to ensure the correct kinematic distribution, given the nature of the decay.
For particles with non-zero spin, spin effects can change the kinematic distribution not only for
a single decay, but also between correlated decays. Finally, additional photons need to be proba-
bilistically included in radiative decays.
All these complications assume the decay channel is exclusive, i.e. the number and type of
decay products is fixed. For many decays, such as those of charm and bottom hadrons, this is not
the case. A full list of the available decays is provided in table 2. About 40% of decay channels
in PYTHIA have dedicated matrix elements, corresponding to 50% of decays when weighted by
branching fraction. The remainder of this section describes these decays.

8.2.1 Hadron decays with parton showers


The decays of many particles are not known in an exclusive hadronic form; instead, the relative
rates between exclusive partonic channels are known. Consequently, it is necessary to evolve these
exclusive partonic decays into final-state hadrons. In PYTHIA there are two mechanisms for this
evolution. In the first method, the partons are passed to the timelike parton shower of section 4,
followed by the hadronization of section 7.1. This method is used for bb̄ states, and typically the
parton shower does not significantly modify the decay. By default, the partons produced in the
decay are distributed uniformly in phase space, with the notable exception of NN:meMode = 92
detailed in section 8.2.4 and NN:meMode = 94 detailed in section 8.2.6.
A number of parton and colour configurations are available for this type of inclusive decay via
the parton shower as follows. Here, ci is used to indicate a colour index and c̄i anti-colour index.

• qq̄: The quark carries c1 and the antiquark c̄1. This type of decay is set with NN:meMode = 91.
An example of a decay using this matrix-element mode is Υ → qq̄. Hidden valley hadrons
also heavily utilize this decay.

• gg: The first gluon carries c1 and c̄2 while the second gluon carries c2 and c̄1 . This decay is
also specified by NN:meMode = 91 and is primarily used for quarkonia, e.g. ηb → gg.

Table 2: Available matrix element modes for particle decays. Here, V is a vector meson,
P a pseudoscalar, H a generic hadron, X any non-partonic initial state, and A and B any
non-partonic final states.

force  | process                                   | eq.          | meMode
-------|-------------------------------------------|--------------|--------
any    | X → qq̄ or X → gg                          | none         | 91
any    | X → qq̄A, where A is a colour singlet      | none         | 93
any    | X → qq̄ . . .                              | none         | 42 - 80
any    | H → AB                                    | (372)        | 3 - 7
strong | V → π+ π− π0, where V is an isoscalar     | (376)        | 1
strong | P → P V [→ P P]                           | (377)        | 2
strong | P → γV [→ P P]                            | (378)        | 2
strong | V → ggg or V → γgg                        | (379)        | 92
EM     | H → Aγ∗ [→ ℓ+ ℓ−]                         | (380), (381) | 11
EM     | H → qq̄γ∗ [→ ℓ+ ℓ−] → Aℓ+ ℓ−               | (380), (381) | 11
EM     | H → AB . . . γ∗ [ℓ+ ℓ−]                   | (380), (381) | 12
EM     | H → qq̄γ∗ [→ ℓ+ ℓ−] → ABℓ+ ℓ−              | (380), (381) | 12
EM     | H → γ∗ [→ ℓ+ ℓ−] γ∗ [→ ℓ+ ℓ−]             | (380), (381) | 13
weak   | H → ν̄ℓ ℓ− A                               | (382)        | 22/23
weak   | H → ν̄ℓ ℓ− qq̄                              | (382)        | 22/23
weak   | X → qq̄A, where A is a colour singlet      | (382)        | 94
weak   | H → ν̄ℓ ℓ− AB . . .                        | (383)        | 22/23
weak   | H → qq̄qq̄                                  | (384)        | 22, 23
weak   | ℓ− → νℓ A . . .                           | (384)        | 21
weak   | ℓ− → ν̄ℓ ℓ− ℓ+ νℓ                          | (382)        | 22/23
weak   | H → γqq̄                                   | (385)        | 31

• ggg: The first gluon carries c1 and c̄2, the second c2 and c̄3, and the third c3 and c̄1. This
configuration is intended for the decays of quarkonia, e.g. Υ → ggg, and set with
NN:meMode = 92.

• ggγ: The first gluon carries c1 and c̄2 and the second c2 and c̄1. This decay is also intended
for quarkonium decays, e.g. Υ → γgg, and is set with NN:meMode = 92.

• qq̄X: This is the same as the colour-singlet qq̄ decay mode, except with an additional colour
singlet X, and is selected with NN:meMode = 93 for flat phase space, and NN:meMode =
94 for a weak decay.
For all of these decays, the ordering of the partons as passed to PYTHIA does not matter.

8.2.2 Inclusive hadron decays


The second method for inclusive hadronic decays is to first determine hadrons from the partons
and then distribute these hadrons in phase space. This method is used primarily for multibody
decays of hadrons such as the D and B mesons, where only a few channels are known experimen-
tally. The flavours for a channel can then be dynamically built from the initial partonic content of
a weak decay. For this type of decay, either one or two parton pairs can be specified in the decay,
in addition to any non-parton particles. Here, a parton is either a quark or diquark. The number
of final particles is determined from a Poisson distribution with a mean of,
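$$ \lambda = \frac{n_{\mathrm{known}} + n_{\mathrm{spec}}}{2} + \frac{n_{\mathrm{partons}}}{4} + \rho_{\mathrm{mult}} \ln(m_{\mathrm{diff}}/m_{\mathrm{mult}}) \tag{371} $$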
where nknown is the number of non-partonic particles in the specified decay, nspec is the number of
spectator partons, and npartons is the number of partons. Here, the spectator partons are those par-
tons that do not participate in the partonic weak decay. The mass mdiff is the difference between
the decaying particle mass and the sum of the nominal decay-product masses. A reference mass
mmult is set by the parameter ParticleDecays:multRefMass and can be used to tune the aver-
age decay multiplicity. An additional factor, ρmult , also determines the average decay multiplicity
and is set via the parameter ParticleDecays:multIncrease for all relevant matrix-element
modes except NN:meMode = 23 where ParticleDecays:multIncreaseWeak is used instead.
See section 8.2.6 for further details.
The method for selecting the final hadrons is as follows.

1. The multiplicity is selected according to eq. (371) and is required to be less than 10. A
minimum multiplicity can be required by setting the NN:meMode between 42 and 50, where
the minimum multiplicity is given by meMode - 40. Alternatively, the multiplicity can be
fixed by setting the NN:meMode between 62 and 70. Here the multiplicity is calculated as
meMode - 60.
2. The number of hadrons to form is the difference between the selected multiplicity from the
previous step, and the number of non-parton particles in the decay.

3. One of the partons is selected at random and a new parton and hadron is formed, following
the flavour selection of section 7.1.1.

4. The previous step is repeated until the number of remaining hadrons to select is the same
as the number of parton pairs.

5. The remaining parton pairs are formed into hadrons.

6. If there are two pairs, they may be reshuffled, with a probability set by
ParticleDecays:colRearrange, i.e. for a value of 0 the pairs will never be reshuffled but for a
value of 1 they will always be reshuffled.

7. If the summed mass of the final decay products is less than the mass of the decaying particle,
the hadron selection is kept, otherwise the process begins again with step 1.

This model is very similar to the hadronization model, but the momenta of the hadrons are now just
determined with phase space. For most decays this approximation is valid as the decay-product
momenta should be very low and on average reproduce the correct kinematic behaviour. While
the flavour selection is the same as for hadronization, the mass constraint of step 7 will typically
bias decays to the lighter pseudoscalar mesons, particularly for high multiplicity decays.
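
The bookkeeping of steps 1 and 2 can be sketched as follows. This is only an illustration of eq. (371) and of the meMode ranges described above, with hypothetical function names; the values passed for ρmult and mmult are placeholders for ParticleDecays:multIncrease and ParticleDecays:multRefMass and are not claimed to be the PYTHIA defaults.

```python
import math


def poisson_mean(n_known, n_spec, n_partons, m_diff, m_mult, rho_mult):
    """Eq. (371): mean of the Poisson used to pick the decay multiplicity."""
    return (0.5 * (n_known + n_spec) + 0.25 * n_partons
            + rho_mult * math.log(m_diff / m_mult))


def multiplicity_rule(me_mode):
    """Map an inclusive meMode onto ("min" or "fixed", multiplicity)."""
    if 42 <= me_mode <= 50:
        return ("min", me_mode - 40)
    if 62 <= me_mode <= 70:
        return ("fixed", me_mode - 60)
    raise ValueError("this sketch only covers meMode 42-50 and 62-70")


def hadrons_to_form(multiplicity, n_known):
    """Step 2: hadrons still to be built from the partons."""
    return multiplicity - n_known
```

For instance, `multiplicity_rule(43)` requests at least three final particles, while `multiplicity_rule(65)` fixes the multiplicity to five.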
For these types of inclusive decays, the special particle ID 82 can be used to randomly select
a light flavour pair, i.e. uū, dd̄, or ss̄. The suppression of selecting an ss̄ pair with respect to uū
and dd̄ is configured by the parameter StringFlav:probStoUD which is also used in the flavour
selection of the hadronization algorithm of section 7.1.1. When specifying decays with this ID, the
channel should be given as an 82 -82 pair, where the ordering does not matter. A similar ID is
83 which is the same as 82, but intended for decays that proceed through a gluon loop. Since this
loop will increase the average multiplicity of the decay, eq. (371) is modified by adding an additional
constant specified by the parameter ParticleDecays:multGoffset. The primary decay of the
J/ψ into three gluons, as well as many of the other onium states, use this special ID.
For some particles, exclusive decays must be specified in addition to inclusive decays. Matrix-
element modes are provided in PYTHIA to prevent double counting the exclusive decays in the
inclusive decays. An NN:meMode between 52 and 60 reproduces the same behaviour as an NN:meMode
between 42 and 50, but will exclude any generated final state that matches a non-partonic decay
channel. An example of such a decay is ηc → qq̄. Similarly, if NN:meMode is between 72 and 80,
the fixed-multiplicity behaviour for meModes between 62 and 70 is reproduced, but again excluding
any generated final state that matches a non-partonic decay channel.

8.2.3 Variable-width hadrons


For standard particle decays, the probability used to select a decay channel is calculated using
a fixed branching ratio, independent of the decaying particle mass. The hadronic rescattering
framework (cf. section 7.4), however, includes mass-dependent partial widths for two-body decays
of hadrons. For hadrons included in the rescattering framework, decay channels are picked using
these partial widths. The partial width for the decay of a hadron resonance H into particles A and
B, H → AB, is given by,

$$ \Gamma_{H \to AB}(m) = \Gamma_0 \, \frac{m_0}{m} \, \frac{\Phi(2l + 1, m)}{\Phi(2l + 1, m_0)} \, \frac{1.2}{1.0 + 0.2\, \frac{\Phi(2l, m)}{\Phi(2l, m_0)}} . \tag{372} $$

Here, Γ0 is the nominal partial width of the decaying hadron at its nominal mass m0 , set from
experiment. The angular momentum of the two-body decay is given by l. In PYTHIA, this angular
momentum is specified by the user as l = meMode −3. At high masses the final multiplicative factor
regulates the partial width. Similar to resonance production, see section 8.1.2, these partial widths
define not only the branching fractions of the hadron but also production.

The phase space is given by

$$ \Phi(l, m) = \int dm_A \int dm_B \; q^l(m, m_A, m_B)\, \mathrm{BW}(m_A)\, \mathrm{BW}(m_B) , \tag{373} $$

where q(m, mA, mB) is the magnitude of the A and B momentum in the centre-of-mass frame,

$$ q(m, m_A, m_B) = \frac{\sqrt{(m^2 - (m_A + m_B)^2)(m^2 - (m_A - m_B)^2)}}{2m} . \tag{374} $$

Finally, the mass distribution for each of the two decay products is given by a Breit–Wigner,

$$ \mathrm{BW}(m) = \frac{1}{2\pi} \, \frac{\Gamma(m)}{(m - m_0)^2 + \frac{1}{4} \Gamma^2(m)} . \tag{375} $$

While this mass distribution does include a mass-dependent width, phase-space considerations
ensure these mass-dependent widths can be evaluated recursively from the lowest mass particle
to the highest. Note that performing decays with variable partial widths only affects the branching
ratios of the decay channels, and not the angular distribution of the decay products. By default, a
number of hadrons are decayed using variable partial widths in PYTHIA. This includes many of the
excited mesons as well as a number of the baryons. For technical reasons, variable partial width
decays are never performed for the ρ or f2 mesons.
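
To make the behaviour of eq. (372) concrete, the sketch below evaluates it in the simplified limit where both decay products have negligible width. In that limit the Breit–Wigner factors in eq. (373) act as delta functions and Φ(l, m) reduces to q^l(m, mA, mB); this reduction is an assumption of the sketch and not how PYTHIA treats unstable products. The function names and the example numbers are hypothetical.

```python
import math


def q_cm(m, m_a, m_b):
    """Eq. (374): momentum of A and B in the rest frame of a parent of mass m."""
    if m <= m_a + m_b:
        return 0.0  # channel closed below threshold
    return math.sqrt((m**2 - (m_a + m_b) ** 2)
                     * (m**2 - (m_a - m_b) ** 2)) / (2.0 * m)


def partial_width_stable(m, m0, gamma0, l, m_a, m_b):
    """Eq. (372) with Phi(l, m) -> q^l, valid only for zero-width A and B.

    l is the orbital angular momentum, i.e. l = meMode - 3.
    """
    ratio = q_cm(m, m_a, m_b) / q_cm(m0, m_a, m_b)
    regulator = 1.2 / (1.0 + 0.2 * ratio ** (2 * l))
    return gamma0 * (m0 / m) * ratio ** (2 * l + 1) * regulator
```

At m = m0 the expression returns the nominal partial width Γ0, and for l = 1 it rises above Γ0 when the parent mass is moved above m0, with the final factor damping the growth at high masses.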

8.2.4 Strong decays


Most decays proceeding via the strong force in PYTHIA are modelled with pure phase space. How-
ever, there are four special cases that are generated according to matrix elements: isoscalar vector
mesons decaying into pseudoscalar mesons, pseudoscalar mesons decaying into a pseudoscalar
and a vector meson, pseudoscalar mesons decaying into a photon and a vector meson, and vector
mesons decaying into a three-gluon final state.
The ω meson decays predominantly into a three-pion final state of π+ π− π0 . This decay can
be modelled using the isobar model [381], where the decay proceeds via the intermediate ρ 0 π0
or ρ ± π∓ state. The matrix element for this decay is given by

$$ |M|^2 \propto \left[ (m_1 m_2 m_3)^2 - (m_1\, p_2 p_3)^2 - (m_2\, p_1 p_3)^2 - (m_3\, p_1 p_2)^2 + 2 (p_1 p_2)(p_1 p_3)(p_2 p_3) \right] |F|^2 , \tag{376} $$

where mi and pi are the mass and momentum of decay product i. Here, π+ corresponds to i = 1,
π− to i = 2, and π0 to i = 3. The function F includes possible final-state interactions of the
pions, and depends upon the full kinematics of the decay. When no final-state interactions are
present, F = 1, which corresponds to P-wave distributed phase space. In PYTHIA, this assumption
of no final-state interactions is made. However, there is experimental evidence that final-state
interactions could play an important role in this decay [382].
The φ meson is also an isoscalar like the ω meson and has a non-negligible branching to
the ρ 0 π0 and ρ ± π∓ channels, where the larger φ mass provides sufficient phase space for a ρ
resonance. However, a contact π+ π− π0 decay, without the ρ-resonance structure, is also possi-
ble [383], and is described by the same matrix element as for the ω meson. The ρ 0 itself can also
decay into a π+ π− π0 final state described by this matrix element, although this decay channel is
heavily suppressed due to phase space. For both the φ and ρ 0 mesons, no final-state interactions
are considered in these decay channels. The matrix element of eq. (376) can be selected by setting
NN:meMode = 1.
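
The matrix element of eq. (376) with F = 1 is straightforward to evaluate from the three pion four-momenta, as the following sketch shows. The function names are hypothetical and the metric convention (+, −, −, −) is an assumption of the example. The bracket is the Gram determinant of the three momenta, so the weight vanishes whenever the pions are collinear in the rest frame of the decaying meson.

```python
def mdot(p, q):
    """Minkowski product with metric (+, -, -, -); p = (E, px, py, pz)."""
    return p[0] * q[0] - p[1] * q[1] - p[2] * q[2] - p[3] * q[3]


def weight_three_pion(p1, p2, p3):
    """Eq. (376) with F = 1, i.e. no final-state interactions."""
    m1sq, m2sq, m3sq = mdot(p1, p1), mdot(p2, p2), mdot(p3, p3)
    p12, p13, p23 = mdot(p1, p2), mdot(p1, p3), mdot(p2, p3)
    return (m1sq * m2sq * m3sq
            - m1sq * p23**2 - m2sq * p13**2 - m3sq * p12**2
            + 2.0 * p12 * p13 * p23)
```

In an accept-reject step, such a weight would be compared against its maximum over phase space; the maximum itself is not computed here.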

In the decay chain P0 → P1 V2 [→ P3 P4], where P is a pseudoscalar meson and V a vector
meson, the decay products P3 and P4 are distributed in the rest frame of V2 according to cos²θ,
where θ is the angle between P0 and P3. The corresponding matrix element is given by

$$ |M|^2 \propto (p_0 p_2)(p_2 p_3) - m_2^2 (p_0 p_3) , \tag{377} $$

where again i specifies the particle in the decay chain, mi is the mass of that particle, and pi is
the momentum. Similarly, for the decay chain P0 → γV2 [→ P3 P4], the distribution of P3 and P4 is
now given by sin²θ in the rest frame of V2. The matrix element for this decay is,

$$ |M|^2 \propto m_2^2 \left[ 2 (p_2 p_3)(p_0 p_2)(p_0 p_3) - m_0^2 (p_2 p_3)^2 - m_2^2 (p_0 p_3)^2 - m_3^2 (p_0 p_2)^2 + (m_0 m_2 m_3)^2 \right] . \tag{378} $$

While these two matrix elements are relevant for all appropriately produced vector-meson de-
cays into a pseudoscalar-meson pair, in practice the relevant vector-meson decay channels are:
ρ → ππ, ω → π+ π− , K∗ → Kπ, φ → KK, φ → π+ π− , and D∗ → Dπ. Note that when the vector
meson is not produced in the decay chain P0 → P1 /γV2 , these matrix elements are not used. As
an example, in the decay chain D → πK∗ [→ Kπ], the decay products of the K∗ are distributed
according to eq. (377). To use these matrix elements, NN:meMode = 2 must be set.
For the decays of vector-like onium states into a partonic final state of gluons, V0 → g0 g1 g2 , or
gluons and a photon, V0 → γ0 g1 g2 , the matrix element,
$$ |M|^2 \propto \left( \frac{1 - x_1}{x_2 x_3} \right)^2 + \left( \frac{1 - x_2}{x_1 x_3} \right)^2 + \left( \frac{1 - x_3}{x_1 x_2} \right)^2 , \tag{379} $$

is used. Here, x i is twice the energy of particle i divided by the mass of the decayer in the rest
frame of the decayer, 2Ei /m. For the two gluon and photon decay, the two-gluon system is required
to have a minimum mass configured by the parameter StringFragmentation:stopMass to
ensure that the system can properly hadronize. This matrix element is set using meMode = 92 as
is done for the partonic decays Υ → ggg and Υ → γgg. Because eq. (379) is symmetric, ordering
of the decay products when configuring PYTHIA does not matter.
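
A minimal sketch of the weight in eq. (379) is given below; the function name is hypothetical. Energy conservation in the rest frame of the decayer implies x1 + x2 + x3 = 2, which the caller is assumed to respect.

```python
def weight_onium_three_body(x1, x2, x3):
    """Eq. (379), with x_i = 2 E_i / m in the rest frame of the decayer."""
    return (((1.0 - x1) / (x2 * x3)) ** 2
            + ((1.0 - x2) / (x1 * x3)) ** 2
            + ((1.0 - x3) / (x1 * x2)) ** 2)
```

The expression is unchanged under any permutation of its arguments, which is why the ordering of the decay products does not matter.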

8.2.5 Electromagnetic decays


The electromagnetic decay π0 → γ∗ [→ e+ e− ]γ can be generated with a factorized approach. To
begin, the γ∗ mass is selected, using the decay matrix element integrated over the solid angle, but
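still dependent upon the γ∗ mass, m1,

$$ |M|^2 \propto \frac{1}{m_1^2} \left( 1 + \frac{2 m_2^2}{m_1^2} \right) \sqrt{1 - \frac{4 m_2^2}{m_1^2}} \left( 1 - \frac{m_1^2}{(m_0 - m_{\max})^2} \right)^3 \frac{1}{(m_{\rho^0}^2 - m_1^2)^2 + m_{\rho^0}^2 \Gamma_{\rho^0}^2} . \tag{380} $$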

The subscript i is 0 for the π0 , 1 for the virtual γ∗ , 2 for the e+ , 3 for the e− , and 4 for the real γ;
the mass for each particle is given by mi and mmax is the maximum mass of the off-shell photon,
i.e. mmax = m4 = 0 for this decay channel of the π0 . The final factor of this expression is the VMD
propagator for the ρ 0 , where mρ0 is the mass of the ρ 0 , and Γρ0 the width. This propagator is
negligible for any decaying particle with a mass far from the ρ 0 mass, which includes the case of
the π0 . Next, after the γ∗ mass is selected, the two-body decay of π0 → γ∗ γ is performed. Finally,
the angular distribution of the e+ e− pair is generated according to,

$$ |M|^2 \propto (m_1^2 - 2 m_2^2) \left[ (q p_2)^2 + (q p_3)^2 \right] + 4 m_2^2 \left[ (q p_2)(q p_3) + (q p_2)^2 + (q p_3)^2 \right] , \tag{381} $$

where pi is the momentum of the corresponding particle with index i, and q = p0 − p1. For
efficiency and simplicity, this angular distribution is generated in the rest frame of the decaying
particle, which, if highly boosted, can result in minor numerically induced violations of
energy-momentum conservation. Consequently, the momentum of the final lepton is calculated as p3 = p1 − p2
in the laboratory frame.
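
The first step of the factorized approach, the selection of the γ∗ mass, can be illustrated with a weight function. The sketch below reads eq. (380) as the product of a 1/m1² factor, a lepton threshold factor, a cubed phase-space factor and the ρ0 propagator; this reading, the function name and the numerical inputs in the usage note are assumptions of the example rather than the PYTHIA implementation.

```python
import math


def weight_dalitz_mass(m1, m0, m_lep, m_max, m_rho, gamma_rho):
    """Eq. (380) as a function of the virtual-photon mass m1 (all in GeV).

    m0    -- mass of the decaying particle
    m_lep -- lepton mass (m2 in the text)
    m_max -- summed mass of the decay products other than the lepton pair
    """
    if not 2.0 * m_lep < m1 < m0 - m_max:
        return 0.0  # outside the kinematically allowed range
    r = (m_lep / m1) ** 2
    lepton = (1.0 + 2.0 * r) * math.sqrt(1.0 - 4.0 * r)
    phase = (1.0 - m1**2 / (m0 - m_max) ** 2) ** 3
    vmd = 1.0 / ((m_rho**2 - m1**2) ** 2 + m_rho**2 * gamma_rho**2)
    return lepton * phase * vmd / m1**2
```

For the π0 one could call `weight_dalitz_mass(m1, 0.135, 0.000511, 0.0, 0.775, 0.149)`; the weight is strongly peaked towards small m1, with the propagator factor varying very little over the allowed range.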
The matrix element for this decay channel is also valid for similar processes where a lepton pair,
ℓ+ℓ−, is produced via an off-shell photon. Such decay channels include η → ℓ+ℓ−γ, ω → ℓ+ℓ−π0,
φ → ℓ+ℓ−η, B → ℓ+ℓ−K/K∗, Bs0 → ℓ+ℓ−φ, and Σ0 → ℓ+ℓ−Λ0. This matrix element can also be
used for the final state ℓ+ℓ−qq̄. In this particular case, the qq̄ is converted into a single hadron,
following the inclusive decay selection of section 8.2.2 but with the multiplicity of the decay fixed
to three. In all the cases described above, the matrix element for these decay channels is set with
NN:meMode = 11.
The forms of eqs. (380) and (381) are also approximately valid for decay channels with the final
state γ∗[ℓ+ℓ−]AB . . ., where there are two or more decay products in addition to the lepton pair.
Such decays include η → ℓ+ℓ−π+π−, K0S → ℓ+ℓ−π+π−, B0 → ℓ+ℓ−π0π0, and B+ → ℓ+ℓ−us̄. For
this type of decay channel, eq. (380) is still used to select the mass, but with mmax = mA + mB + . . .,
and eq. (381) is used without modification. The phase-space generation, after selecting m1, is now
performed as a decay with multiplicity n − 1 > 2, where n is the final multiplicity of the decay.
Setting NN:meMode = 12 selects this matrix element. If A and B are replaced with a qq̄ final state,
the system is collapsed down into two hadrons, with the flavour selection again performed using
the inclusive decay algorithm but with the multiplicity fixed at four.
Finally, these matrix elements are also used to approximate γ∗[ℓ+ℓ−]γ∗[ℓ+ℓ−] decay channels.
Following the same numbering convention, the mass of the first off-shell photon, m1, is selected
using eq. (380) where mmax = m5 + m6, i.e. twice the mass of the second lepton flavour. Then,
the mass of the second off-shell photon, m3, is selected again with eq. (380) but using indexing
i − 2 and setting mmax = m2 + m3. After performing the two-body decay of the γ∗γ∗ system, the
angular distributions for the two lepton pairs are generated independently using eq. (381). This
type of decay channel is specified with NN:meMode = 13 and can be used for decays such as
π0 → e+e−e+e−. The technical implementation for all decays using eqs. (380) and (381) requires
that the lepton pair always be set as the final two decay products when defining these
decay channels.

8.2.6 Weak decays


The helicity-averaged matrix element for the t-channel weak scattering of fermions, f0 f1 → f2 f3,
is

$$|\mathcal{M}|^2 \propto (p_0 p_1)(p_2 p_{\mathrm{rem}}) , \qquad (382)$$

where $p_i$ is the momentum of the particle with index $i$ and $p_{\mathrm{rem}} = \sum_{i=3} p_i$ is the sum of the
remaining momenta, which here is just $p_3$. By crossing symmetry, this matrix element can also be used for
weak decays. An example is the fully leptonic decay of the τ lepton, τ− → ν̄ℓ ℓ− ντ. The particle
ordering determines the corresponding i for each particle in eq. (382), e.g. i = 1 for the
anti-neutrino and i = 2 for the charged lepton. This matrix element can also be used to approximate
semi-leptonic decays of D and B mesons, e.g. D0 → ℓ+ νℓ π− or B0 → νℓ ℓ+ π−, where the final
fermion pair is collapsed into a single hadron. In this example, the ordering of the neutrino and
charged anti-lepton is swapped between the two decays. This is because for D-meson decays, the
partonic f0 is a c quark, while for the B-meson this is a b antiquark. Similarly, this matrix element
can be used for the semi-leptonic decays of baryons, e.g. n → ν̄e e− p, or the leptonic decays of
charged leptons, e.g. µ− → ν̄e e− νµ. When not using the sophisticated τ decays of section 8.2.8,
this helicity averaged matrix element can also be used for the leptonic decays of the τ.
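As a minimal sketch of how the weight of eq. (382) could be evaluated for a given phase-space point, the following stand-alone C++ fragment computes (p0 p1)(p2 prem) from four-momenta. The `FourVec` type and the function names are illustrative assumptions and are not part of the PYTHIA interface; the metric convention (+,−,−,−) is also assumed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal four-vector (E, px, py, pz); an illustrative type, not a PYTHIA class.
struct FourVec {
  double e, px, py, pz;
};

// Minkowski product with metric (+,-,-,-).
double dot(const FourVec& a, const FourVec& b) {
  return a.e * b.e - a.px * b.px - a.py * b.py - a.pz * b.pz;
}

// Weight of eq. (382): (p0 p1)(p2 p_rem), with p_rem the sum of p_i, i >= 3.
// p[0] is the decaying particle, p[1], p[2], ... the products in channel order.
double weakWeight(const std::vector<FourVec>& p) {
  assert(p.size() >= 4);
  FourVec rem{0., 0., 0., 0.};
  for (std::size_t i = 3; i < p.size(); ++i) {
    rem.e += p[i].e;
    rem.px += p[i].px;
    rem.py += p[i].py;
    rem.pz += p[i].pz;
  }
  return dot(p[0], p[1]) * dot(p[2], rem);
}
```

Because only the relative size of the weight matters when it is used to accept or reject phase-space points, the overall normalization is left out, in line with the proportionality in eq. (382).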
Semi-leptonic decays can also be specified with their partonic content, e.g. D0 → ℓ+ νℓ d ū or
B0 → νℓ ℓ+ d ū, where the ordering of the quarks does not matter. Similarly, baryon decays of this
nature like Ξ0c → e+ νe s(sd)0 , can be decayed using this matrix element where one of the partons is
a diquark, i.e. (sd)0 . When partonic content is specified, the parton system is collapsed to a single
hadron following the flavour-selection rules of section 7.1.1. The matrix element of eq. (382)
is used for all the decays described above by setting either NN:meMode = 22 or NN:meMode =
23. For these types of decays there is no difference between these two matrix-element modes.
The only technical requirement for these decays is that the first two particles of the decay are the
neutrino/charged-lepton pair, followed by either a hadron or a parton pair, where ordering of the
partons does not matter. In some cases it is convenient to use the special particle ID 81 to act as
a place holder for the spectator quark or diquark, which is then automatically replaced with the
correct spectator flavour. For baryons, an ambiguity can arise in this selection where the spin of
the diquark cannot always be determined uniquely. For the example decay of the Ξ0c given here,
the spectator flavour can either be (sd)0 or (sd)1 , while the automatic flavour will always select
the (sd)0 diquark.
In some cases, semi-leptonic decays with more than one final-state hadron are needed, e.g.
D0 → e+ νe K0 π− . The additional hadrons can be physically interpreted as being produced from the
fragmentation of the spectator parton, resulting in hadrons with a significantly softer momentum
than the hadron containing the spectator quarks. This softer momentum is modelled by taking
the product of eq. (382) and an exponential damping factor,
$$|\mathcal{M}_{\mathrm{damp}}|^2 \propto |\mathcal{M}|^2 \prod_{i=4} e^{-|p_i|^2/\sigma_{\mathrm{soft}}^2} , \qquad (383)$$

which is calculated in the rest frame of the decay, where the product is taken over all hadrons
following the spectator hadron with momentum magnitude |pi |. Here, |M|2 is calculated with
eq. (382) and σsoft is the damping term which can be configured by the user with the param-
eter ParticleDecays:sigmaSoft. A single damping parameter is used for all decays and is
expected to fall within the range 0.2 – 2, where a smaller value increases the damping. For semi-
leptonic decays with two or more final state hadrons, this matrix element can be used by setting
NN:meMode = 22 or NN:meMode = 23. Again, there is no difference between these two matrix-
element modes for decays of this type. As before, the ordering of the decay as passed to PYTHIA
matters. The neutrino/charged-lepton pair must be specified first, in the correct order as dis-
cussed above, followed by the hadron containing the spectator quark, followed by any remaining
hadrons, which will then have their momentum damped.
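The damping factor of eq. (383) can be illustrated with a short stand-alone function. This is a sketch only: the function name is hypothetical, and the momenta and the damping parameter are simply assumed to be given in the same units.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Product of exp(-|p_i|^2 / sigmaSoft^2) over the additional (soft) hadrons,
// i.e. the factor that multiplies |M|^2 in eq. (383).
// pAbs holds the momentum magnitudes in the rest frame of the decay.
double softDamping(const std::vector<double>& pAbs, double sigmaSoft) {
  assert(sigmaSoft > 0.);
  double factor = 1.;
  for (double p : pAbs)
    factor *= std::exp(-(p * p) / (sigmaSoft * sigmaSoft));
  return factor;
}
```

With no additional hadrons the factor is exactly one, so eq. (382) is recovered, and a smaller damping parameter suppresses a given momentum more strongly, as stated above.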
The matrix element for weak decays into purely hadronic final states, where the decay is
defined only by partonic content, is approximated by PYTHIA. An example of this class of decay is
B0 → u d̄ c̄ d, which will result in a final state with a D meson. The partonic content should be set as
q1 q2 q3 q4 where q1 and q2 are colour connected, and either q3 or q4 is the spectator quark/diquark.
The special particle code 81 can be used here to automatically determine the spectator flavour. Just
like for the partonic semi-leptonic decays, the final two partons are collapsed into a single hadron
following the flavour-selection rules of section 7.1.1. The first two partons are then fragmented
into multiple hadrons, following the method of section 8.2.2.
When NN:meMode = 22 is used, the mean number of final particles in the decay is calculated
with eq. (371) using the ρmult parameter ParticleDecays:multIncrease. When NN:meMode
= 23 is used instead, the mean number of final particles is calculated using
ParticleDecays:multIncreaseWeak. The latter parameter is intended, although not required, to be
smaller than the former, since in weak decays only the mass of the off-shell W boson is available to
the fragmenting partonic system, and not the entire parent mass. Additionally, for NN:meMode = 23
a minimum of three final particles is required in the decay after flavour selection. After the final
particles are determined for each decay, the matrix element,

$$|\mathcal{M}|^2 \propto \frac{2E_1}{m}\left(3 - \frac{4E_1}{m}\right) , \qquad (384)$$

is used where m is the mass of the decaying hadron and E1 is the energy of the hadron containing
the spectator quark, in the rest frame of the decay. This matrix element can also be used for
hadronic τ decays when the sophisticated treatment is not needed by specifying NN:meMode =
21. Here, the first decay product should always be the ντ , which increases the energy of the
neutrino with respect to flat phase space.
Partonic radiative decays via the weak force are roughly approximated with the matrix ele-
ment,
$$|\mathcal{M}|^2 \propto \left(\frac{2E_1}{m}\right)^3 \qquad (385)$$
m
where m is the mass of the decaying hadron and E1 is the energy of the photon in the rest frame
of the decay. Effectively, this increases the photon energy with respect to flat phase space. The
partonic content for these decays should be set as a photon, the spectator quark, and the flavour-
changing quark, e.g. B0 → d s̄ γ where d is the spectator quark. Unlike the previous weak decays,
where the spectator system is collapsed to a single hadron, the spectator system is fragmented
into multiple hadrons following the inclusive selection of section 8.2.2. However, the multiplicity
for the decay is selected with a geometric distribution,

$$P(n) = \left(1 - \frac{1}{2}\right)^{n-1} \frac{1}{2} , \qquad (386)$$

rather than a Poisson distribution, where a minimum multiplicity of 2 and a maximum multiplicity
of 10 is required. This type of decay is specified by setting NN:meMode = 31, and the decay
products can be assigned in an arbitrary order.
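The multiplicity selection of eq. (386) can be sketched with the C++ standard library. The function names are illustrative, and redrawing until the value falls inside the allowed range is an assumption of this sketch; the text only states that the range 2 to 10 is required, not how it is imposed.

```cpp
#include <cassert>
#include <cmath>
#include <random>

// Probability of eq. (386): P(n) = (1 - 1/2)^(n-1) * 1/2, for n >= 1.
double geometricProb(int n) {
  assert(n >= 1);
  return std::pow(0.5, n - 1) * 0.5;
}

// Draw a multiplicity, retrying until it lies in [nMin, nMax].
// The retry strategy is an assumption made for this sketch.
int drawMultiplicity(std::mt19937& rng, int nMin = 2, int nMax = 10) {
  assert(1 <= nMin && nMin <= nMax);
  // std::geometric_distribution counts failures before the first success
  // (values 0, 1, 2, ...), so add one to obtain n = 1, 2, 3, ...
  std::geometric_distribution<int> dist(0.5);
  int n;
  do {
    n = dist(rng) + 1;
  } while (n < nMin || n > nMax);
  return n;
}
```

Compared to a Poisson distribution, the geometric form falls off monotonically with n, so the lowest allowed multiplicity is always the most likely one.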
In all the decays above, the matrix element is applied to the final particles of the decay, not
the partonic content. In some cases it is useful to apply the matrix element to the partonic content
of the decay, and then perform a full parton shower followed by hadronization, using the parton-
shower method of section 8.2.1. Specifying NN:meMode = 94 does this, where the matrix element
of eq. (382) is used to distribute the phase space of the partons from the decay.
In addition to the weak decays described above, B systems may mix prior to decay. This mixing
is controlled by the flag ParticleDecays:mixB and has a probability of,
 ‹
2 xτ
P = sin , (387)
2τ0

where τ is the selected lifetime of the particle, and τ0 the nominal proper lifetime. The mixing
parameter x is set with ParticleDecays:xBdMix and ParticleDecays:xBsMix for the Bd
and Bs systems, respectively.
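Eq. (387) is simple enough to evaluate directly. The following helper is a minimal sketch with a hypothetical function name; the only requirement is that the selected and nominal lifetimes are given in the same units.

```cpp
#include <cassert>
#include <cmath>

// Mixing probability of eq. (387): P = sin^2( x tau / (2 tau0) ).
// tau is the selected lifetime and tau0 the nominal proper lifetime,
// both in the same units (e.g. mm).
double mixProbability(double x, double tau, double tau0) {
  assert(tau0 > 0.);
  const double s = std::sin(0.5 * x * tau / tau0);
  return s * s;
}
```

A particle that decays immediately (τ = 0) never mixes, while the probability oscillates between 0 and 1 for longer selected lifetimes, with a frequency set by x.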


8.2.7 Helicity decays


A generic helicity-density formalism is available in PYTHIA which can be used for τ decays as well as
muon decays in lepton-flavour violating production. External tools have also used this framework
for heavy-neutral-lepton decays. The weight for an n-body decay of an arbitrary particle is given
by,
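$$\mathcal{W} = \rho_{\lambda_0 \lambda_0'} \, \mathcal{M}_{\lambda_0; \lambda_1 \ldots \lambda_n} \, \mathcal{M}^*_{\lambda_0'; \lambda_1' \ldots \lambda_n'} \prod_{i=1,n} \mathcal{D}^{(i)}_{\lambda_i \lambda_i'} . \qquad (388)$$

The decaying particle is given index 0 and the decay products are assigned indices 1 through n.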
The helicity for each particle is given by λi and summations are performed over each repeated
helicity index. The helicity density matrix for the decaying particle is given by ρ, while the decay
matrix for each decay product is given by D. The helicity matrix element for the decay is M and
depends upon the helicity of the decaying particle as well as the decay products.
For a particle produced from a 2 → n hard process, the helicity-density matrix for an outgoing
particle with index i is given by,
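$$\rho^{(i)}_{\lambda_i \lambda_i'} = \rho^{(1)}_{\kappa_1 \kappa_1'} \, \rho^{(2)}_{\kappa_2 \kappa_2'} \, \mathcal{M}_{\kappa_1 \kappa_2; \lambda_1 \ldots \lambda_n} \, \mathcal{M}^*_{\kappa_1' \kappa_2'; \lambda_1' \ldots \lambda_n'} \prod_{j \neq i} \mathcal{D}^{(j)}_{\lambda_j \lambda_j'} , \qquad (389)$$

where $\rho^{(1,2)}$ are the helicity-density matrices for the incoming particles, M is the helicity matrix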
element for the process, and κ1,2 are the helicities of the incoming particles. For incoming two-
helicity-state beam particles with a known longitudinal polarization Pz the helicity-density matrix
is diagonal with elements (1 ± Pz )/2.
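To make the index structure of eq. (388) concrete, the following toy sketch evaluates the weight for the special case where all decay matrices D are still the identity, so that the primed and unprimed helicities of the decay products coincide. It also builds the diagonal density matrix with elements (1 ± Pz)/2 described above. The type aliases, the function names, the flattening of all product helicities into a single index k, and the choice of which diagonal element carries the plus sign are assumptions made for illustration.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<double>;
using Matrix = std::vector<std::vector<Complex>>;

// Diagonal helicity-density matrix for a two-helicity-state particle with
// longitudinal polarization pz in [-1, 1]: elements (1 + pz)/2 and (1 - pz)/2.
Matrix polarizedDensity(double pz) {
  assert(pz >= -1. && pz <= 1.);
  return {{Complex(0.5 * (1. + pz)), Complex(0.)},
          {Complex(0.), Complex(0.5 * (1. - pz))}};
}

// Weight of eq. (388) when every decay matrix D is the identity, so that the
// product helicities are summed diagonally. me[a][k] is the matrix element
// for parent helicity a and product helicity configuration k.
double decayWeight(const Matrix& rho, const Matrix& me) {
  assert(rho.size() == me.size());
  Complex w(0.);
  for (std::size_t a = 0; a < rho.size(); ++a)
    for (std::size_t b = 0; b < rho.size(); ++b)
      for (std::size_t k = 0; k < me[a].size(); ++k)
        w += rho[a][b] * me[a][k] * std::conj(me[b][k]);
  return w.real();
}
```

For a Hermitian density matrix the sum is real, which is why only the real part is returned.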
Before any particles are decayed in a given sequence, all decay matrices in eq. (388) and
eq. (389), D, are initialized to the identity matrix. In a 2 → n process, a first outgoing particle is
randomly selected and decayed using a helicity-density matrix determined with (389). The decay
matrix for this first decay is calculated as
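$$\mathcal{D}^{(0)}_{\lambda_0 \lambda_0'} = \mathcal{M}_{\lambda_0; \lambda_1 \ldots \lambda_n} \, \mathcal{M}^*_{\lambda_0'; \lambda_1' \ldots \lambda_n'} \prod_{i=1,n} \mathcal{D}^{(i)}_{\lambda_i \lambda_i'} . \qquad (390)$$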

After the full decay tree for this first particle is determined, the remaining particles for the 2 → n
process are then randomly selected and decayed using the helicity-density matrix of (389) with
the updated decay matrices for the already decayed outgoing particles.
When a particle from the hard process is selected for decay, the full decay tree of that particle
is performed. A single branch of the decay tree is followed until a final stable particle is reached.
The helicity-density matrices for particles produced from decays are calculated with,
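$$\rho^{(i)}_{\lambda_i \lambda_i'} = \rho^{(0)}_{\lambda_0 \lambda_0'} \, \mathcal{M}_{\lambda_0; \lambda_1 \ldots \lambda_n} \, \mathcal{M}^*_{\lambda_0'; \lambda_1' \ldots \lambda_n'} \prod_{j \neq i} \mathcal{D}^{(j)}_{\lambda_j \lambda_j'} , \qquad (391)$$

where $\rho^{(0)}$ is the helicity-density matrix of the parent particle. The algorithm then calculates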
the decay matrix for the last particle decayed with eq. (390) and the next undecayed branch of
the decay tree is traversed until all branches of the decay tree have been decayed. In this way the
decays of the outgoing particles from the hard process are correlated. As implemented in PYTHIA,
this full recursion is not necessary since the implemented τ decays are typically provided with
stable final-state particles.
The hard-process generation of PYTHIA uses unpolarized matrix elements to generate the phase
space of the hard process, and so dedicated 2 → n helicity matrix elements are needed to determine
the helicity-density matrix after phase-space generation. For τ decays a number of helicity
matrix elements are available. Correlated decays from γ, Z, Z′, γ/Z/Z′, neutral Higgs bosons,
and t-channel γγ → ℓℓ production are provided. Single τ decays from W, W′, charged Higgs
bosons, and B/D decays are also provided. For all these production mechanisms the relevant
parameters that can be configured for the unpolarized production mechanisms are also used in the
helicity matrix elements. This includes the axial and vector couplings for the new gauge bosons,
as well as the parity of the Higgs bosons. When a particle used in the helicity decay framework is
provided from outside of PYTHIA, the SPINUP digit is interpreted as the helicity of the particle in
the laboratory frame. A number of options can be configured to fine tune the helicity treatment
of τ decays in PYTHIA.

Table 3: Summary of available τ− decay models in PYTHIA. The ντ is omitted from the
decay products for brevity and charge conjugation is implied for τ+ decays.

mult.  ref.    meMode  decays
2              1521    π−, K−
3              1531    e− ν̄e, µ− ν̄µ
       [386]   1532    π0 π−, K0 K−, η K−
       [387]   1533    π− K̄0, π0 K−
4      [388]   1541    π0 π0 π−, π− π− π+
       [389]   1542    K− π− K+, K0 π− K̄0, K0S π− K0S, K0L π− K0L, K0S π− K0L,
                       K− π0 K0, π0 π0 K−, K− π− π+, π− K̄0 π0
       [390]   1543    π0 π0 π−, π− π− π+, K− π− K+, K0 π− K̄0, K− π0 K0,
                       π0 π0 K−, K− π− π+, π− K̄0 π0, π− π0 η
       [384]   1544    γ π0 π−
5      [391]   1551    π0 π− π− π+, π0 π0 π0 π−
6      [392]   1561    π0 π0 π− π− π+, π0 π0 π0 π0 π−, π− π− π− π+ π+

8.2.8 Tau decays


While unpolarized simplified models of τ decays are available in PYTHIA, see section 8.2.6, dedicated
models which use the helicity-density framework are also available. These models are based on
those provided in TAUOLA [384], and are available for all decay channels with branching fractions
greater than 0.04%, including up to six-body τ decays. The general helicity matrix element for
these decays used in eq. (388) is given by

$$\mathcal{M} \propto \bar{u}_{\nu_\tau} \gamma_\mu (1 - \gamma^5) u_\tau J^\mu , \qquad (392)$$

where only the current J µ needs to be specified. Here, u and ū are Dirac spinors, γµ are the Dirac
matrices, and the Weyl basis as adopted in HELAS [385] is used throughout.
Here, a brief description of the available τ decays is provided; more details can be found in
ref. [393] with a summary given in table 3. Note that the ordering of the particles matters, and
whenever numerical indices are used, 0 is the decaying τ− while ντ has index 1. For two-body
decays into a neutrino and pseudoscalar meson, τ− → ντ P, the hadronic current is given by

$$J^\mu \propto p_2^\mu . \qquad (393)$$

The current for the fully leptonic three-body decay τ− → ντ ℓ− ν̄ℓ is

$$J^\mu = \bar{u}_2 \gamma^\mu (1 - \gamma^5) v_3 . \qquad (394)$$

Three-body decays with hadronic states can proceed via vector and scalar currents,
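$$\begin{aligned}
J^\mu \propto{} & \frac{c_v}{\sum_i w_{v_i}} \Bigg[ (p_3 - p_2)^\mu \sum_i w_{v_i} \mathrm{BW}_p(m_2, m_3, s_2, m_{v_i}, \Gamma_{v_i}) \\
& \qquad - s_1 (p_2 + p_3)^\mu \sum_i \frac{w_{v_i} \mathrm{BW}_p(m_2, m_3, s_2, m_{v_i}, \Gamma_{v_i})}{m_{v_i}^2} \Bigg] \\
& + \frac{c_s}{\sum_j w_{s_j}} (p_2 + p_3)^\mu \sum_j w_{s_j} \mathrm{BW}_s(m_2, m_3, s_2, m_{s_j}, \Gamma_{s_j}) ,
\end{aligned}$$

where $w_{v_i}$ and $w_{s_j}$ are complex weights for each vector and scalar current, $c_{s,v}$ are the scalar and vector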
couplings, and BWp is a P-wave Breit–Wigner. The final state determines the relevant couplings
and weights to use. The general form of the hadronic current for four-body decays is given by,
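$$\begin{aligned}
J^\mu \propto{} & \left(g^{\mu\nu} - \frac{q^\mu q^\nu}{s_1}\right) \big[ (F_3 - F_2) p_2 + (F_1 - F_3) p_3 + (F_2 - F_1) p_4 \big]_\nu \\
& + F_4 q^\mu + i F_5 \varepsilon^\mu(p_2, p_3, p_4) ,
\end{aligned}$$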

where each Fi is a model specific form factor and ε is the permutation operator.
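The hadronic current for the decay τ− → ντ γ π0 π− is given by [384]

$$\begin{aligned}
J^\mu \propto{} & F(s_1, \vec{m}_\rho, \vec{\Gamma}_\rho, \vec{w}_\rho) \, F(0, \vec{m}_\rho, \vec{\Gamma}_\rho, \vec{w}_\rho) \, F(s_4, \vec{m}_\omega, \vec{\Gamma}_\omega, \vec{w}_\omega) \\
& \times \Big[ \varepsilon_2^\mu \big( m_{\pi^-}^2 \, p_{4\nu} p_2^\nu - p_{3\nu} p_2^\nu (p_{4\nu} p_3^\nu - p_{4\nu} p_2^\nu) \big) \\
& \quad - p_3^\mu \big( (p_{3\nu} \varepsilon_2^\nu)(p_{4\nu} p_2^\nu) - (p_{4\nu} \varepsilon_2^\nu)(p_{3\nu} p_2^\nu) \big) \\
& \quad - p_2^\mu \big( (p_{3\nu} \varepsilon_2^\nu)(p_{4\nu} p_3^\nu) - (p_{4\nu} \varepsilon_2^\nu)(m_{\pi^-}^2 + p_{3\nu} p_2^\nu) \big) \Big] ,
\end{aligned}$$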

where F is a sum over the possible vector currents including ρ and ω resonances. The five-body
decays depend on sub-currents for each allowed resonance [391, 394],
$$\begin{aligned}
J^\mu_{\pi^0\pi^0\pi^0\pi^-} &\propto J^\mu_{0, a_1 \to \rho\pi} + J^\mu_{0, a_1 \to \sigma\pi} \\
J^\mu_{\pi^0\pi^-\pi^-\pi^+} &\propto J^\mu_{-, a_1 \to \rho\pi} + J^\mu_{-, a_1 \to \sigma\pi} + J^\mu_{-, \omega \to \rho\pi} ,
\end{aligned}$$

and are based on the Novosibirsk model. The six-body decay model [392] can be written as a
summation of a- and b-type currents,

$$J^\mu \propto \sum J_a^\mu + \sum J_b^\mu , \qquad (395)$$

where each term is one of the possible final state permutations. The a-type currents proceed
through an a1 → ωρ resonance structure, while the b-type proceed via an a1 → σa1[→ ρπ] structure.


Part III
Using PYTHIA 8.3
PYTHIA 8.3 provides comprehensive choices for modelling all kinds of physics effects in collision ex-
periments, as can be seen from the bulk of this manual. It is often not necessary to know the details
of all components to start using the program to calculate useful quantities, however. The descrip-
tions provided in this section should allow a new user to set up and use PYTHIA for most standard
model and new physics processes, using default settings for showers, MPIs, and hadronization that
have been tested to work at the LEP and LHC experiments. By extension, it should also be useful
in many other contexts. All settings corresponding to particular physics models or to changing
the “tunes” (i.e. parameter fitting for showers, MPIs, and hadronization) are documented in the
HTML online manual, which is also distributed in the share/Pythia8/htmldoc/ directory of
the released source code. One can begin browsing from the Welcome.html home page of that
directory.
PYTHIA is under constant and active development. Therefore, any specific detail of this article
can become obsolete soon after it is released. We therefore urge users seeking specific information
to:

• Make sure to read the most recent version of this manuscript, in conjunction with the most
recent code version. Some information, which may have been correct when the manuscript
was obtained, may be outdated when being read.

• Consult the HTML manual, which always contains specific settings and reasonable defaults
for all physics processes, as well as suggestions for analyses. It also contains a detailed
change-log documenting updates between code versions.

• Use the examples distributed with the working version of PYTHIA for inspiration. Examples
are kept up to date, and should always correspond to the program version downloaded.

Past and present code versions, documentation, some relevant presentations, and more can be
found at the PYTHIA website:

https://www.pythia.org/
It is continuously kept up to date.
In section 9 we will describe the logic behind using PYTHIA as a library to write a stand-alone
analysis, and in section 10 we describe interfacing to external programs.

9 Using PYTHIA stand-alone


The default way of using PYTHIA is as a C++ library, writing “main” programs that perform
the desired simulation tasks. This can be done completely stand-alone, as PYTHIA in principle
contains everything needed for a complete physics analysis. Several such example main programs are
shipped with PYTHIA in the examples/ sub-directory. In the following we will describe and ex-
emplify how such user code can be written, and then go on to give more advanced use cases,
covering deeper interactions with the simulation than allowed from an example main.


9.1 Installation
The latest version of PYTHIA 8.3 (as well as older versions) can be downloaded from
https://www.pythia.org/ as a gzipped tarball pythia83XX.tgz. On Unix, Linux, or macOS systems this can
be unpacked with

tar -xvzf pythia8307.tgz

(On Windows systems, we recommend to install a virtual machine running Linux, cf. e.g., this
tutorial.) The simplest installation can be made using the standard commands

./configure
make

Configuration options (especially for linking against external libraries) can be found by typing

./configure --help

Details can also be found in the README file distributed with PYTHIA 8.3. If an install location is
specified with --prefix, then make install will copy libraries, headers, and shared documentation
to that location in the standard Unix/Linux hierarchy. Details of the configuration can be
accessed either via the generated Makefile.inc file or the pythia8-config script in the bin
directory.
Most users would then change to the examples/ sub-directory, find a suitable example to use
as a template for their analysis, modify the desired parts, and compile and run the examples (say,
main01) by:

make main01
./main01

It is, however, also possible to compile and run PYTHIA programs outside the examples/ directory.
Three environment variables could be potentially useful, providing the paths to the compiled
libraries, and to the settings and particle properties databases,

PYTHIA8PATH = <set to head Pythia directory>


PYTHIA8DATA = $PYTHIA8PATH/share/Pythia8/xmldoc
LD_LIBRARY_PATH = $PYTHIA8PATH/lib:$LD_LIBRARY_PATH

9.2 Program setup


The simplest PYTHIA 8.3 user code comprises three main sections — initialization, the event loop,
and final statistics. A skeleton of a simple program is as below. Note that the skeleton program
should compile but not produce any reasonable output, as no reasonable settings are read in.


#include "Pythia8/Pythia.h" // Access to Pythia objects.

using namespace Pythia8; // Allow simplified notation.

int main() {

  // --- Initialization ---

  Pythia pythia; // Define Pythia object.
  Event& event = pythia.event; // Quick access to current event.

  // Read in settings
  pythia.readString("..."); // line by line...
  pythia.readFile("cardfile.cmnd"); // or via file.

  // Define histograms, external links,
  // local variables etc. here. E.g.
  int maxEvents = 1000; // The number of events to run.

  pythia.init(); // Initialize

  // --- The event loop ---

  for (int iEvent = 0; iEvent < maxEvents; iEvent++) {

    // Produce the next event; next() returns true on success.
    if (!pythia.next()) {
      // Any error handling goes here; skip the analysis of a failed event.
      continue;
    }

    // Analyse event; fill histograms etc.

  } // End event loop.

  // --- Calculate final statistics ---

  pythia.stat();

  // Print histograms etc.

  return 0;
}

9.3 Settings
The internal PYTHIA 8.3 event generation is divided into three steps:


• Process level, dealing with the hard process.

• Parton level, dealing with showers, MPIs, colour reconnection, and beam remnants.

• Hadron level, dealing with hadronization and further decays of the particles produced.

Naturally, there are specific settings to control each of these levels. Aside from this, there are
several classes of settings to address output during initialization and generation of each event.
In the following, we give an overview of how these may be used. However, the reader should
consult the online manual (also accessible from share/Pythia8/htmldoc/Welcome.html
distributed with the release) for a full listing of all available settings and options. Note that all
possible setting keys are indexed, and can be searched via the Search Docs box in the upper-right
corner of the page.
It is possible to run PYTHIA 8.3 entirely with the default settings. The only minimal user input
required is the choice of production process. As a default, the incoming beams are both protons
with a centre-of-mass energy of 14 TeV with the parton distribution function set to the NNPDF2.3
QCD+QED LO αs(MZ) = 0.130 one [230]. Furthermore, initial- and final-state radiation is turned
on, using the internal PYTHIA simple shower. MPIs and hadronization are both on by default, and
all unstable hadrons with cτ0 < 1000 mm are decayed to stable ones. The default tune is the
Monash 2013 one [358], see section 9.9.2.
PYTHIA 8.3 collects settings performing related functions into groups (e.g. overarching parton-
level settings are named PartonLevel:*). Input strings for changing settings have the form

settingGroup:nameOfSetting = value

For example, decays of all resonances can be turned off by setting

ProcessLevel:resonanceDecays = off

PYTHIA 8.3 supports four different types of settings:

• flag is a boolean true or false. Acceptable input alternatives include on/off, yes/no,
and 1/0.

• mode is an integer switch enumerating either available options or a wider range of values.
Acceptable values are integers.

• parm is a real number parameter.

• word is a character string. It cannot contain single or double quotation marks, or curly
braces, i.e. { }.

It is further possible to have a vector of each of these types. If necessary, users can define their
own settings that can then be used in their code.
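The structure of an input string, and the accepted spellings of a flag value, can be illustrated with a small stand-alone parser. This is a simplified sketch and not the parser used inside PYTHIA; the struct and function names are hypothetical, and only the `group:name = value` form shown above is handled.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Parts of an input string "settingGroup:nameOfSetting = value".
struct SettingLine {
  std::string group, name, value;
};

// Remove leading and trailing blanks and tabs.
std::string trim(const std::string& s) {
  const std::size_t first = s.find_first_not_of(" \t");
  if (first == std::string::npos) return "";
  const std::size_t last = s.find_last_not_of(" \t");
  return s.substr(first, last - first + 1);
}

// Split a line at the first ':' and the first '='; returns false if either
// separator is missing or if they appear in the wrong order.
bool parseSetting(const std::string& line, SettingLine& out) {
  const std::size_t colon = line.find(':');
  const std::size_t equal = line.find('=');
  if (colon == std::string::npos || equal == std::string::npos
      || equal < colon) return false;
  out.group = trim(line.substr(0, colon));
  out.name = trim(line.substr(colon + 1, equal - colon - 1));
  out.value = trim(line.substr(equal + 1));
  return true;
}

// Interpret a flag value using the alternatives listed above.
bool parseFlag(const std::string& value) {
  assert(value == "on" || value == "off" || value == "yes" || value == "no"
      || value == "1" || value == "0" || value == "true" || value == "false");
  return value == "on" || value == "yes" || value == "1" || value == "true";
}
```

Splitting at the first colon means that values which themselves contain a colon, such as the LHAPDF set names used for PDF selection, end up intact in the value part.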
The user can read in settings in one of two ways: either line-by-line with pythia.readString()
calls inside the user C++ code, or by providing a plain-text file that is read at run time. The latter
has the advantage of not requiring a recompilation every time a change is made. It is triggered by

pythia.readFile("cardfile.cmnd");

inside the code.
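A minimal settings file of this kind could look as follows. It is a sketch only: the file name matches the skeleton program, all setting keys appear elsewhere in this section, and the numerical values (collision energy and transverse-momentum cutoff) are illustrative choices rather than recommendations.

```text
! cardfile.cmnd: text after "!" is treated as a comment.
Beams:idA = 2212              ! first beam: proton
Beams:idB = 2212              ! second beam: proton
Beams:eCM = 13000.            ! centre-of-mass energy in GeV (illustrative)
HardQCD:all = on              ! all QCD 2 -> 2 processes
PhaseSpace:pTHatMin = 20.     ! lower pT cutoff in GeV (illustrative)
```

Changing any of these lines only requires rerunning the program, not recompiling it.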


All settings have reasonable default values enabled, and can furthermore be defined with
maximal and/or minimal values beyond which they cannot be changed. These can be studied
in the online manual under the respective parameter. A parameter can be forced outside the
allowed bound by using the keyword force, for example:

PhaseSpace:pTHatMinDiverge force= 0.1

will force the parameter PhaseSpace:pTHatMinDiverge, which usually has a minimal value of
0.5 GeV, to 0.1 GeV. The force keyword should be used with extreme caution! The boundaries are
there for a reason, and breaking them can make the program unstable or invalidate the physics
model.
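The interplay between allowed ranges and the force keyword can be pictured with a toy model of a bounded parameter. This is not PYTHIA code: the struct is hypothetical, and the assumption that an out-of-range request is moved to the nearest bound, rather than rejected, is made only for the purpose of this sketch.

```cpp
#include <cassert>

// Toy model of a bounded real-valued setting. That an out-of-range request
// is clamped to the nearest bound is an assumption of this sketch.
struct BoundedParm {
  double value, min, max;

  // Store the requested value; bounds are ignored only when force is true.
  void set(double request, bool force = false) {
    if (force) {
      value = request;
      return;
    }
    if (request < min) request = min;
    if (request > max) request = max;
    value = request;
  }
};
```

For example, with a lower bound of 0.5, a plain request for 0.1 leaves the value at 0.5 in this toy model, while a forced request stores 0.1.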
If nothing else is mentioned explicitly, dimensional parameters have units of GeV for energy,
momentum, and mass, and mm for length and time, with the speed of light c = 1 implicit. In-
ternal cross sections are book kept in mb, but communication with other programs may require
conversion from/to other units.
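Typical unit conversions when passing results to other programs can be sketched as follows. The function names are illustrative; the conversion factors are the standard ones (1 mb = 10^9 pb = 10^12 fb, and c = 2.99792458 × 10^11 mm/s).

```cpp
#include <cassert>

// Cross-section conversions from millibarn.
double mbToPb(double sigmaMb) { return sigmaMb * 1e9; }   // 1 mb = 10^9 pb
double mbToFb(double sigmaMb) { return sigmaMb * 1e12; }  // 1 mb = 10^12 fb

// Speed of light in mm/s, used to convert a time given in mm (c = 1).
const double cLightMmPerS = 2.99792458e11;

// Convert a time expressed in mm (with c = 1) to seconds.
double mmToSeconds(double tMm) { return tMm / cLightMmPerS; }
```

A proper lifetime of 1 mm in these units thus corresponds to roughly 3.3 × 10^-12 s.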

9.3.1 Beams and PDFs


The incoming beams are set by providing the PDG code of the incoming particles to Beams:idA
and Beams:idB (the default for both is the proton, i.e. 2212). For example, a pp̄ collision can be set
by changing the value of Beams:idB to

Beams:idB = -2212

An e+ e− collision can be set by idA = 11 and idB = -11. Currently available beams include
protons (2212), neutrons (2112), pions (±211, 111), most other light hadrons (but not necessar-
ily all combinations of them), electrons (11), muons (13), photons (22), and several heavy-ion
species. The collision energy can then be set by

Beams:eCM = 2000.

Units of GeV are implicit, as already mentioned. For heavy-ion collisions, this is the energy per
nucleon-nucleon collision, as per the usual heavy-ion conventions.
By default, collisions are assumed to be in the CM frame. Other options can be set with
Beams:frameType. Using option Beams:frameType = 2 the beam energies can be set sepa-
rately and e.g. a HERA-like beam configuration can be obtained with

Beams:frameType = 2
Beams:idA = 2212
Beams:eA = 920.
Beams:idB = -11
Beams:eB = 27.5
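For orientation, the centre-of-mass energy that corresponds to two separately specified beam energies can be computed with a few lines of stand-alone C++. This is an illustrative sketch with a hypothetical function name, assuming back-to-back beams along the z axis; the masses are supplied by the caller.

```cpp
#include <cassert>
#include <cmath>

// Invariant mass of two beams colliding head-on along the z axis:
// s = mA^2 + mB^2 + 2 (eA eB + pA pB), with p = sqrt(e^2 - m^2). Units: GeV.
double eCMfromBeams(double eA, double mA, double eB, double mB) {
  assert(eA >= mA && eB >= mB);
  const double pA = std::sqrt(eA * eA - mA * mA);
  const double pB = std::sqrt(eB * eB - mB * mB);
  return std::sqrt(mA * mA + mB * mB + 2. * (eA * eB + pA * pB));
}
```

For the HERA-like configuration above (920 GeV protons on 27.5 GeV leptons) this gives a centre-of-mass energy of approximately 318 GeV.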

Furthermore, the beams do not need to be back-to-back but option Beams:frameType = 3 al-
lows for setting also some transverse momentum for the beams. A particularly useful setting to
automatically set beam information when using external LHE files (see section 10.1.1 for details)
is

Beams:frameType = 4

It is also possible to specify a simple Gaussian spread of incoming beam momentum and of the
interaction vertex position. These can be set by Beams:allowMomentumSpread and
Beams:allowVertexSpread and their accompanying parameters in the x, y, and z directions for each
beam.
The applied proton PDF set can be selected with setting PDF:pSet which is also applied for
antiprotons and neutrons via isospin symmetry. By default, this sets PDFs to be the same for beam
A and B but it is also possible to set the PDFs for beam B separately using option PDF:pSetB. The
internal PDF sets can be selected by setting an integer value for the above options, e.g. the current
default is set with

PDF:pSet = 13

To use LHAPDF grids instead, PYTHIA needs either to be linked to the LHAPDF library or one can use
the internal implementation for the LHAPDF grid interpolation, see section 10.1.4. In the first
case, the set is defined with a string LHAPDF6:set/member, e.g.

PDF:pSet = LHAPDF6:NNPDF23_lo_as_0130_qed/0

which would correspond to the current default above. Also LHAPDF version 5 is supported and en-
abled with keyword LHAPDF5:set/member. The internal interpolation for the LHAPDF 6 format
is enabled with LHAGrid1:filename and with this, the default PDF can be obtained with

PDF:pSet = LHAGrid1:NNPDF23_lo_as_0130_qed_0000.dat

The grid file should be located in the folder share/Pythia8/xmldoc or an absolute path should
be provided. These settings change the PDF used throughout the program, including hard-process
generation, MPIs, and ISR. To keep the underlying event description intact, one can also change
the PDFs only for the hard processes by setting PDF:useHard = on and selecting the hard PDFs
with PDF:pHardSet. All the above options can be used to select the PDFs for hard processes
and one can also include nuclear modifications for these with PDF:useHardNPDFA = on or
PDF:useHardNPDFB = on. Similarly, one can select PDFs for other beam types including pions,
pomerons, photons, and leptons, see the online manual and section 3.12 for further details.

9.3.2 Process selection


The minimal initialization information required by PYTHIA 8.3 to generate events is which pro-
cess(es) are to be run. This is done by turning on the relevant flags. For example, to generate a
gg → qq̄ hard process, set

HardQCD:gg2qqbar = on

A full list of internally defined processes is available in appendix A.


It is possible to turn on more than one process at a time. PYTHIA 8.3 will then generate events
for each process in proportion to their cross sections. Some extra switches are also available for
processes that are often grouped together, e.g.

HardQCD:all = on

will turn on all QCD 2 → 2 quark/gluon production processes. Since these processes are di-
vergent in the p⊥ → 0 limit, it is necessary to introduce a lower transverse-momentum cutoff
PhaseSpace:pTHatMin. Note that such a parton-level cut does not directly translate into a cut
on jet properties, since intermediate parton showers, MPIs, hadronization effects, and jet finders
will distort the original simple process. Further details are available in sections 3.1 and 3.13.
Several choices of renormalization and factorization scale are available. For 2 → 2 processes,
these can be set via SigmaProcess:renormScale2 and SigmaProcess:factorScale2, respectively.
The default for the renormalization scale is the geometric mean of the squared transverse
masses of the two outgoing particles. The default factorization scale is the smaller
of the two squared transverse masses. The possible options are listed in section 3.10.
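The two default scale choices for 2 → 2 processes described above can be written out explicitly. The function names are illustrative, and the squared transverse mass is taken to be m² + p⊥², which is the standard definition and is assumed here.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Squared transverse mass, mT^2 = m^2 + pT^2 (GeV^2).
double mT2(double m, double pT) { return m * m + pT * pT; }

// Default renormalization scale squared: geometric mean of the two mT^2.
double renormScale2(double mT2a, double mT2b) {
  assert(mT2a >= 0. && mT2b >= 0.);
  return std::sqrt(mT2a * mT2b);
}

// Default factorization scale squared: the smaller of the two mT^2.
double factorScale2(double mT2a, double mT2b) {
  return std::min(mT2a, mT2b);
}
```

For two massless outgoing partons both choices reduce to p⊥², so the distinction only matters when at least one outgoing particle is massive.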

9.3.3 Soft processes


The bulk of the total cross section in high-energy hadronic collisions is not associated with a
visible hard process. A reasonably complete and consistent description of these relevant processes
is instead obtained with

SoftQCD:all = on

This includes elastic, single- and double-diffractive, and non-diffractive processes, which alter-
natively could be switched on individually. The inelastic processes, i.e. the diffractive and non-
diffractive ones, include a modelling of MPIs, which does include a tail of high-p⊥ processes. Thus
HardQCD:all becomes a subset of the SoftQCD:all total cross section, and one should not mix
SoftQCD and HardQCD processes. Colour screening ensures that the hard processes here are
damped appropriately at low p⊥ values, as described in section 6.2.
At very low collision energies the perturbative processes are gradually phased out and only
truly soft processes remain. This occurs e.g. in hadronic rescattering, or in the final stages of the
evolution of a cosmic-ray cascade in the atmosphere. To simulate low-energy collisions directly,
use

LowEnergyQCD:all = on

or related LowEnergyQCD:* flags to turn on only a subset of the available processes. These are as-
sumed to be accurate below 10 GeV. It is possible to simultaneously turn on both LowEnergyQCD:*
and SoftQCD:* processes, in which case a mix of the two is used at intermediate energies.
A number of other processes are available, including numerous non-QCD processes which may
not be applicable for proton beams. See appendix A.1 for a complete list of included standard-
model processes, and appendix A.2 for a list of BSM processes.

9.3.4 Parton- and hadron-level settings


The primary switches for parton showers are

PartonLevel:ISR = on|off
PartonLevel:FSR = on|off

PYTHIA 8.3 has two other showers available, aside from the “simple showers”. The choice of
shower model can be performed with

PartonShowers:model = 1|2|3

where the default (1) corresponds to the “old” simple shower, (2) corresponds to VINCIA, and (3)
to the DIRE shower.
Finally, the primary switch for hadronization is

HadronLevel:all = on|off

9.3.5 Particle data


All known information regarding particles (mass, charge, decay width, branching fractions, etc.)
is stored within the ParticleData class. Each particle has the following basic properties:

• id holds the PDG identity number of the particle.

• name is a string containing the name of the particle. Particle and antiparticle names are
stored separately, with void returned when no antiparticle exists.

• spinType in the form of an integer equal to (2s + 1).

• chargeType is three times the electric charge.

• colType is the colour representation (0: uncoloured; 1 (-1): (anti)triplet; 2: octet;
3 (-3): (anti)sextet).

• m0 is the nominal mass in GeV.

• mWidth is the Breit–Wigner width in GeV.

• mMin, mMax are the limits for mass generated by the Breit–Wigner.

• tau0 is the proper lifetime in mm.

• mayDecay sets whether the particle is allowed to decay.

• isVisible sets whether the particle is to be considered visible by the detector.

Other than these, there are a few special properties related to external decays which can be found
in the online manual. Any property of a particle can be changed by setting:

NN:Property = value

where NN is the PDG ID of the particle.
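The integer encodings of spinType and chargeType listed above can be inverted with two one-line helpers. This is a minimal sketch; the function names are hypothetical and the helpers do not handle special values such as an undefined spin.

```cpp
#include <cassert>
#include <cmath>

// Spin s recovered from spinType = 2s + 1.
double spinFromType(int spinType) { return 0.5 * (spinType - 1); }

// Electric charge in units of e, recovered from chargeType = 3q.
double chargeFromType(int chargeType) { return chargeType / 3.0; }

// True for any coloured representation (colType different from zero).
bool isColoured(int colType) { return colType != 0; }
```

For a d quark (PDG code 1), with spinType 2, chargeType -1 and colType 1, these helpers give spin 1/2, charge -1/3 and a coloured state.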


The next critical piece of information for a particle is its decay table. The decay table consists
of decay modes (or decay channels), each of which has the following properties:

• onMode sets whether this decay channel is open: 0 is off, 1 is on, 2 is on for the particle
but not for the antiparticle, and 3 is on for the antiparticle but not for the particle.

• bRatio sets the branching ratio for the channel.

• meMode sets how this decay is handled, in particular whether internal matrix element reweight-
ing is available to account for mass or angular correlations. The default is 0 and corresponds
to flat phase space. See table 2 for available matrix-element modes for particles and sec-
tion 8.1.2 for available matrix-element modes for resonances.

• multiplicity sets the number of daughters; the maximum allowed is eight.

• product(i) is an array that holds the PDG IDs of the daughter particles; empty slots are
set to zero.
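The four onMode values in the list above amount to a small decision table. The following sketch spells it out; the function name is hypothetical, and unrecognized values are treated as "off" by assumption.

```cpp
#include <cassert>

// Returns true if a channel with the given onMode is open.
// isAntiparticle selects which of the two charge-conjugate states decays.
bool channelIsOpen(int onMode, bool isAntiparticle) {
  switch (onMode) {
    case 1: return true;              // on for both
    case 2: return !isAntiparticle;   // on for the particle only
    case 3: return isAntiparticle;    // on for the antiparticle only
    default: return false;            // 0 (off) and any unrecognized value
  }
}
```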

Several shortcuts exist to quickly set up the decay table of a particle. For example, the existing
decay table can be deleted and replaced by a single channel with oneChannel, after which further
channels can be appended with addChannel:

NN:oneChannel = onMode bRatio meMode product1 product2 ...
NN:addChannel = onMode bRatio meMode product1 product2 ...

Branching fractions are automatically rescaled such that the sum is one. Certain modes can be
turned on or off based on the identity of the products by using the following shortcuts

NN:offIfAny = product1 product2 ...


NN:onIfAny = product1 product2 ...
NN:onPosIfAny = product1 product2 ...
NN:onNegIfAny = product1 product2 ...

These shortcuts turn a mode off (offIfAny) or on (onIfAny) if any of the products in the list
matches one in the product(i) array. Note that onPos... (onNeg...) above means that the setting
only applies to the decays of the (anti)particle. Further shortcuts (to select based on matching
all products etc.) can be found in the online manual.
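The automatic rescaling of branching fractions mentioned above is a plain normalization. A minimal sketch follows, assuming that all channels enter the sum; the function name is hypothetical and the snippet is not PYTHIA's own implementation.

```cpp
#include <cassert>
#include <cmath>
#include <numeric>
#include <vector>

// Rescale branching ratios in place so that they sum to one.
// Returns false, leaving the input untouched, if the sum is not positive.
bool normalizeBranchingRatios(std::vector<double>& bRatios) {
  double sum = std::accumulate(bRatios.begin(), bRatios.end(), 0.0);
  if (!(sum > 0.0)) return false;
  for (double& b : bRatios) b /= sum;
  return true;
}
```

With this convention, channels entered with relative weights 2, 1 and 1 end up with branching fractions 0.5, 0.25 and 0.25.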
Adding new particles can be done either by directly calling ParticleData::addParticle
from the program or by using the SLHA interface with a QNUMBERS block (cf. section 10.1.2).

9.4 Analysis of generated event


A generated “event” is essentially a list of particles — initial, final, or intermediate — that are
generated sequentially based on probabilistic calculations. A user will mostly be interested in
studying kinematic variables constructed from the momenta of initial or final-state particles. The
following three classes will be useful in constructing such variables. The full list of available classes
and methods for analysing an event is available in the online manual.

9.4.1 The Vec4 class


The Vec4 class is designed to hold the four-momentum (or indeed any other four-vector quantity
that may be needed) of the particles in the collision event. Some useful methods are

• px(), py(), pz(), e() to access the individual components.


• mCalc() for the calculated mass $\sqrt{E^2 - p_x^2 - p_y^2 - p_z^2}$.

• pT() and pAbs() for the transverse momentum and the absolute value of the three-momentum,
respectively.

• theta(), phi(), rap(), eta() for the polar and azimuthal angles, rapidity and pseudorapidity,
respectively.

• rot(double theta, double phi) to rotate the three-momentum.

• bst(const Vec4& p) and bstback(const Vec4& p) to boost the current vector by
$\vec{\beta} = \pm \vec{p}/E$.
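The kinematic quantities and the boost listed above can be sketched with a small stand-alone struct. This is an illustration only: FourVec is a hypothetical type, not the PYTHIA Vec4 class, and the boost assumes a timelike reference vector so that |β| < 1.

```cpp
#include <cassert>
#include <cmath>

// Illustrative four-vector; not the PYTHIA Vec4 class.
struct FourVec {
  double px, py, pz, e;

  double pT() const { return std::sqrt(px * px + py * py); }
  double pAbs() const { return std::sqrt(px * px + py * py + pz * pz); }

  // Invariant mass; a negative m^2 is returned as -sqrt(-m^2).
  double mCalc() const {
    double m2 = e * e - px * px - py * py - pz * pz;
    return (m2 >= 0.0) ? std::sqrt(m2) : -std::sqrt(-m2);
  }

  // Boost by beta = sign * p / E of the reference vector.
  void boost(const FourVec& ref, double sign) {
    double bx = sign * ref.px / ref.e;
    double by = sign * ref.py / ref.e;
    double bz = sign * ref.pz / ref.e;
    double gamma = 1.0 / std::sqrt(1.0 - bx * bx - by * by - bz * bz);
    double bp = bx * px + by * py + bz * pz;
    double fac = gamma * (gamma * bp / (1.0 + gamma) + e);
    px += fac * bx;
    py += fac * by;
    pz += fac * bz;
    e = gamma * (e + bp);
  }
  void bst(const FourVec& ref) { boost(ref, +1.0); }
  void bstback(const FourVec& ref) { boost(ref, -1.0); }
};
```

Boosting a particle at rest with mass 2 GeV by the reference vector (0, 0, 3; 5) GeV gives β = 0.6 and γ = 1.25, hence pz = 1.5 GeV and E = 2.5 GeV, while the invariant mass stays at 2 GeV; bstback undoes the transformation.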

9.4.2 The Particle class


The Particle class forms the fundamental particle unit, multiples of which are assembled in the
form of an “event”. Each Particle has the following properties:

• id() for the PDG code.

• status() for the status of the current particle (initial, final, stable, or intermediate etc.; see
the online manual for the full status codes). For most users, the only relevant check is
whether the number is greater than zero, which denotes a stable, final-state particle. This
can also be determined directly by asking isFinal().

• p() returns a four-vector whereas px(), py(), pz(), e() can be used directly to access
components.

• mother1(), mother2() refer to the indices of the first and last mother, with several spe-
cial rules. motherList() returns a vector of all the mother indices, circumventing the need
to know these rules.

• daughter1(), daughter2() refer to the indices of the first and the last daughter, with
several special rules (all contiguous indices in between are daughters of said particle).
daughterList() returns a vector of all the daughter indices, circumventing the need to
know these rules.

• vProd() for the production four-vertex.

• tau() is the proper lifetime in mm/c.
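The status check described above is typically used to restrict an analysis to final-state particles. The sketch below uses a hypothetical stand-in record, MiniParticle, rather than the PYTHIA Particle class, and sums the transverse momenta of all entries with positive status.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Minimal stand-in for a particle record; not the PYTHIA Particle class.
struct MiniParticle {
  int id;         // PDG code
  int status;     // greater than zero for final-state particles
  double px, py;  // transverse momentum components in GeV
  bool isFinal() const { return status > 0; }
};

// Scalar sum of transverse momenta over final-state particles.
double sumFinalPT(const std::vector<MiniParticle>& event) {
  double sum = 0.0;
  for (const MiniParticle& p : event)
    if (p.isFinal()) sum += std::hypot(p.px, p.py);
  return sum;
}
```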

9.4.3 The Event class


Finally, we come to the main result of the program which is held in a class called Event, rep-
resenting a collision event. It contains a dynamic array (vector) of particles along with helper
methods that are useful to extract information from the array. A single Pythia instance contains
two Events, called process and event. The first of these, process, contains only the hard pro-
cess whereas the second event contains the full history of the collision event. The user usually
does not need to manually add or remove particles from either of these arrays. The individual
particles can be accessed simply by using their index in the event (e.g. pythia.event[i]). All
methods of the particle can then be accessed; e.g. pythia.event[i].phi() returns the azimuthal
angle ϕ. Some useful methods beyond those given for individual particles
are:

• detaAbs(int i1, int i2) and dphiAbs(int i1, int i2) to obtain ∆η and ∆ϕ
between two particles in the event.

• REtaPhi(int i1, int i2) for the R distance between two particles.

with more given in the online manual.
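The separation measures listed above reduce to a few lines of arithmetic once the azimuthal difference is folded into [0, π]. The sketch below takes the angles directly instead of event indices; the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// |Delta phi| folded into the range [0, pi].
double deltaPhiAbs(double phi1, double phi2) {
  const double pi = std::acos(-1.0);
  double d = std::fmod(std::fabs(phi1 - phi2), 2.0 * pi);
  return (d > pi) ? 2.0 * pi - d : d;
}

// R = sqrt(Delta eta^2 + Delta phi^2).
double deltaR(double eta1, double phi1, double eta2, double phi2) {
  return std::hypot(eta1 - eta2, deltaPhiAbs(phi1, phi2));
}
```

The folding matters for particles on either side of the ϕ = ±π boundary: for ϕ1 = 3.0 and ϕ2 = -3.0 the separation is 2π - 6, not 6.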


Several useful functions that take an Event as input are available to the user to construct
important quantities, e.g. SlowJet is a sequential clustering algorithm that can be used to form
jets from final-state particles whereas the Sphericity and Thrust classes calculate these inclusive
variables.

9.5 Program output


The most basic level of output that can be requested is a listing of the full event (when inside the
event loop), which is done simply by

pythia.event.list()

A printout of the statistics, i.e. number of tried and accepted events, as well as the number of
events produced for each process and the resulting cross section can be obtained by using

pythia.stat()

For hard processes with e.g. p⊥ cuts, the cross section must be calculated by Monte-Carlo integra-
tion. This is done automatically as events are generated. After generation, the total cross section
and its statistical error can be accessed by calling respectively:

pythia.info.sigmaGen()
pythia.info.sigmaErr()
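The idea behind a Monte-Carlo cross-section estimate and its statistical error can be sketched generically: the estimate is the mean of the sampled weights, and the error is the standard deviation of that mean. This is a textbook illustration with hypothetical names, not a description of PYTHIA's internal bookkeeping.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <vector>

struct McEstimate {
  double mean;   // estimate of the integral
  double error;  // statistical error on the mean
};

// Mean of the weights and its standard error, sqrt(variance / N).
McEstimate mcEstimate(const std::vector<double>& weights) {
  if (weights.empty()) return {0.0, 0.0};
  const double n = static_cast<double>(weights.size());
  double sum = 0.0, sum2 = 0.0;
  for (double w : weights) {
    sum += w;
    sum2 += w * w;
  }
  double mean = sum / n;
  double variance = std::max(sum2 / n - mean * mean, 0.0);
  return {mean, std::sqrt(variance / n)};
}
```

The error falls as one over the square root of the number of sampled points, which is why the quoted uncertainty shrinks as more events are generated.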

Pythia also provides rudimentary built-in histogramming via the Hist class. The main methods
of interest are

• Hist(string title, int numberOfBins, double xMin, double xMax, bool logX)
the constructor, defining a histogram.

• fill(double value, double weight = 1.0) to fill the histogram with an optional
weight.

• table(string fileName, bool printOverUnder = false, bool xMidBin = true) to
output the histogram as a table.

Piping the histogram object directly to the standard output (std::cout << myhist;) will
also give a rudimentary ASCII output of the histogram. There are also methods that generate
PYTHON PYPLOT code for cleaner graphical representations.
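What a fixed-width histogram with weighted filling does internally can be sketched in a few lines. MiniHist below is a hypothetical stand-in written for illustration; it is not the PYTHIA Hist class and supports only linear binning.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Minimal fixed-width histogram; not the PYTHIA Hist class.
class MiniHist {
public:
  MiniHist(int nBins, double xMin, double xMax)
    : xMin_(xMin), xMax_(xMax), width_((xMax - xMin) / nBins),
      bins_(nBins, 0.0) {}

  // Add weight to the bin containing x, or to underflow/overflow.
  void fill(double x, double weight = 1.0) {
    if (x < xMin_) { under_ += weight; return; }
    if (x >= xMax_) { over_ += weight; return; }
    std::size_t i = static_cast<std::size_t>((x - xMin_) / width_);
    if (i >= bins_.size()) i = bins_.size() - 1;  // guard against rounding
    bins_[i] += weight;
  }

  double bin(std::size_t i) const { return bins_.at(i); }
  double underflow() const { return under_; }
  double overflow() const { return over_; }

private:
  double xMin_, xMax_, width_;
  std::vector<double> bins_;
  double under_ = 0.0, over_ = 0.0;
};
```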

9.5.1 Messages, warnings, and errors


PYTHIA 8.3 provides four basic levels of diagnostic output that are available in the Info class. All
such generated output is provided in a summary at the end of each run and can be useful as a
sanity check or as debug information. The main categories are:

Abort means that something went seriously wrong, either in initialization or generation. In the
former case, event generation cannot begin. In the latter case, the event is flawed, and
should be skipped. In either case the respective method Pythia::init() or Pythia::next()
will return false, to allow the user to react. There are occasions where an abort may be de-
liberate, such as when a file of Les-Houches events is read and the end of the file is reached.

Error typically means that something went wrong during event generation, but the program will
back up and try again. In cases where this is not possible, a separate Abort will be issued. A
typical run can issue several errors without it being a problem, unless the program aborts.
If unusually many errors are encountered, it can be a good idea to check whether any run
parameters are set to unreasonable values, making a calculation unable to converge. The user
can set the maximum number of errors to allow before the entire run is aborted via the
Main:timesAllowErrors parameter.

Warning is less severe. Typically the program will try again with a good chance of success. Usually
no action needs to be taken by the user.

Message represents informative outputs that confirm e.g. reading of an external file. Verbosity of
messages can be set separately for each module that provides this function (e.g. SLHA:verbose
can be set to zero for a silent read.)
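Since a failed generation step is signalled by a false return value, an event loop usually counts failures and stops once a user-chosen limit is reached. The sketch below shows this pattern with a generic callable standing in for the generation call; the function name and the exact stopping rule are assumptions made for illustration.

```cpp
#include <cassert>
#include <functional>

// Runs up to nEvents calls of generateNext, a stand-in for an
// event-generation call that returns false on failure. The loop stops
// early once maxFailures failures have been seen. Returns the number of
// successfully generated events.
int runEventLoop(int nEvents, int maxFailures,
                 const std::function<bool()>& generateNext) {
  int nGood = 0, nFail = 0;
  for (int i = 0; i < nEvents; ++i) {
    if (!generateNext()) {
      if (++nFail >= maxFailures) break;  // too many failures: give up
      continue;                           // otherwise skip this event
    }
    ++nGood;
  }
  return nGood;
}
```

In a real program the callable would wrap the generator's next-event call, and the limit would be read from a run setting.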

9.6 Advanced settings examples


The example use cases given above are enough for performing simple tasks with PYTHIA 8.3.
In most cases, however, when a user wants to apply specialized built-in physics capabilities, the
application is more complicated, generally scaling with the complexity of the required tasks. For
this purpose, PYTHIA 8.3 ships with a large number of examples (in the examples/ sub-directory),
intended to showcase various applications.
In this section we provide a thorough explanation of two such advanced use cases, to high-
light the versatility of the distributed code. Settings for matching and merging are presented in
section 9.6.1, while section 9.6.2 discusses options for changing the beam configuration on an
event-by-event basis.

9.6.1 Matching and merging settings


PYTHIA 8.3 offers implementations of a large variety of matching and merging schemes. This
allows not only flexibility but, crucially, also cross-checks of the results of combining fixed-order
perturbative calculations with the event-generator machinery.

POWHEG matching allows for the combination of specialized next-to-leading order calculations
with PYTHIA 8.3. To facilitate the matching, PYTHIA 8.3 offers vetoed parton showers via so-called
PowhegHooks. This tool is available for the default showers as well as VINCIA (see footnote 11)
and prevents the overcounting of emissions. It can be enabled with the setting:

POWHEG:veto = 0|1

Since it is not strictly guaranteed that the first shower emission can be considered the hardest
emission according to the POWHEG criteria, the number of emissions to be subjected to vetoed
showering can be adjusted by:

POWHEG:vetoCount = value

Furthermore, vetoed showering only needs to be applied to Born-type configurations, which can
be tagged by the minimal number of partons in the process:

POWHEG:nFinal = value

Vetoed showering relies on comparing the hardness of an emission to an allowed maximal hard-
ness. The definition of “hardness” is determined by:

POWHEG:pTdef = 0|1|2

Values other than 1 are discouraged. The definition of the “maximal hardness” can be adjusted
with:

POWHEG:pThard = 0|1|2

where values other than 0 only serve testing purposes. Finally, the setting:

POWHEG:pTemt = 0|1|2

determines for which sets of particles the hardness comparison should be applied, with a value of 0
strongly recommended. A few further, more advanced, settings are listed in the online manual.
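The logic of a vetoed shower can be summarized schematically: an emission is rejected if it is harder than the allowed maximum, and the check is only applied to a limited number of emissions. The sketch below is a deliberately simplified picture with hypothetical names; it does not reproduce the PowhegHooks implementation or the different hardness definitions selectable through the settings above.

```cpp
#include <cassert>

// Schematic veto decision for a single shower emission.
//   pTemission     : hardness of the trial emission
//   pTmax          : maximal hardness allowed by the matched calculation
//   nAcceptedSoFar : emissions already accepted without a veto
//   vetoCount      : number of emissions to which the check is applied
bool vetoEmission(double pTemission, double pTmax,
                  int nAcceptedSoFar, int vetoCount) {
  if (nAcceptedSoFar >= vetoCount) return false;  // checking has stopped
  return pTemission > pTmax;
}
```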

MC@NLO matching employs shower-specific fixed-order calculations, which handle the over-
lap between shower and fixed-order calculation by explicit subtraction. When interfacing these
calculations, it is paramount to guarantee consistency of settings between the fixed-order calcula-
tion and the parton shower, for any aspects that might have an impact at the NLO level. No new
settings need be introduced in PYTHIA. The relevant settings to produce consistent results depend
on the shower and the MC@NLO provider. When using MC@NLO inputs with PYTHIA’s simple
showers, a minimal set of consistent settings is:

SpaceShower:pTmaxMatch = 1
SpaceShower:pTmaxFudge = 1
TimeShower:pTmaxMatch = 1
TimeShower:pTmaxFudge = 1
SpaceShower:MEcorrections = off
TimeShower:MEcorrections = off
TimeShower:globalRecoil = on
TimeShower:weightGluonToQuark = 1

Please refer to the online manual for further details.

Footnote 11: For VINCIA, PowhegHooks should be swapped for PowhegHooksVincia. All settings
listed here retain their importance with VINCIA. More details can be found in ref. [190, appendix A].

CKKW-L merging allows for the combination of several multi-jet tree-level fixed-order calcu-
lations with each other and the wider PYTHIA 8.3 environment. For example, calculations of
Drell–Yan lepton-pair production at hadron colliders in association with zero, one, two, or more
additional jets can be combined. In this context, “additional jets” refers to further QCD partons,
as well as W and Z bosons, in the case of simple showers and DIRE.
The inputs for multi-jet merging need to be regularized to avoid soft/collinear configurations.
The regularization cut also acts as the criterion to distinguish between fixed-order and parton-
shower phase space regions — the so-called merging scale. If the input events are regulated by a
k⊥ cut, the following flag can be used to interpret the merging scale in terms of the k⊥ definition:

Merging:doKTMerging = on|off

For the simple showers, the merging scale definition may also be set in terms of the shower evo-
lution variable p⊥ by setting:

Merging:doPTLundMerging = on|off

It must be emphasized that this option is naturally not available within VINCIA’s merging. The
simple shower also offers further built-in merging scale definitions, and the option to supply a
pointer to a user-defined MergingHooks class to implement new merging scale definitions. The
value of the merging scale separating fixed-order and parton-shower regions must be specified via
the parameter:

Merging:TMS = value

The merging further requires the definition of the “process” through the string:

Merging:Process = value

where value should identify the particles of the lowest-multiplicity process partaking in the merg-
ing. The process is used under the assumption that each event will contain exactly the specified
particles, and potentially further particles that are considered as additional radiation. Looser pro-
cess definitions are possible through the use of “particle containers”, and the “guess” option, see
the online manual. Finally, the number of additional jets must be set via:

Merging:nJetMax = nJets

Other settings are documented in the online manual.

Sector merging The VINCIA antenna shower in PYTHIA 8.3 comes with its own implementation
of the CKKW-L merging algorithm, which differs from the one implemented for the simple showers.
The main difference is that VINCIA’s sector showers are maximally bijective, i.e. possess a minimal
number of possible histories that lead to any given multi-parton configuration. As such, they are
specifically designed for merging with high-multiplicity matrix elements for which the complexity
grows factorially with the number of possible shower histories, cf. section 5.4.
Sector merging may be enabled by using VINCIA with its sector shower option turned on (see
footnote 12) and switching on merging:

PartonShowers:model = 2
Vincia:sectorShower = on
Merging:doMerging = on

By default, it is then assumed that the merging scale is defined in terms of VINCIA’s evolution
variable, cf. section 4.2. Other definitions (such as a k⊥ regularization) may be used via the
appropriate settings listed for the simple showers above.
While the merging-scale value and the number of additional jets must be set in exactly the
same way as listed for the simple showers above, an important difference pertains to the syntax
of the process definition. In contrast to the Merging:Process setting in the default merging
implementation, the whole string must be encased in curly braces when using VINCIA:

Merging:Process = { value }

In addition, particles must be specified one at a time and be separated by a white-space character.
The initial and final state should be separated by > and exactly two initial-state particles must be
specified. It must be emphasized that a process string in the “default” syntax cannot be processed
by VINCIA and will lead to an abort.
More advanced settings can be found in the online manual.
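The syntax rules stated above (curly braces, white-space-separated particles, a single > separator, exactly two incoming particles) can be checked with a short parser. The sketch below is an assumption-laden illustration with hypothetical names; VINCIA's own parser is not reproduced here and may accept further constructs.

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

struct ParsedProcess {
  std::vector<std::string> incoming;
  std::vector<std::string> outgoing;
};

// Parses "{ a b > c d ... }" into incoming and outgoing particle names.
// Returns false if the braces are missing, ">" does not appear exactly
// once, or the number of incoming particles is not two.
bool parseProcess(const std::string& text, ParsedProcess& out) {
  std::size_t open = text.find('{');
  std::size_t close = text.rfind('}');
  if (open == std::string::npos || close == std::string::npos || close < open)
    return false;
  std::istringstream tokens(text.substr(open + 1, close - open - 1));
  ParsedProcess result;
  bool seenArrow = false;
  std::string token;
  while (tokens >> token) {
    if (token == ">") {
      if (seenArrow) return false;
      seenArrow = true;
    } else if (seenArrow) {
      result.outgoing.push_back(token);
    } else {
      result.incoming.push_back(token);
    }
  }
  if (!seenArrow || result.incoming.size() != 2 || result.outgoing.empty())
    return false;
  out = result;
  return true;
}
```

A string such as { p p > e+ e- } passes these checks, whereas the same process written without braces or without separating white space is rejected.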

UMEPS merging extends CKKW-L tree-level merging, by ensuring that inclusive cross sections
for n additional jets are not changed by the inclusion of calculations for m > n extra jets. This
is achieved by introducing subtractions that act to remove the effect of higher-multiplicity events
from lower-multiplicity inclusive cross sections. As an extension to CKKW-L, UMEPS shares the
settings of the former. Beyond these settings, the different stages of UMEPS merging can be invoked
by:

Merging:doUMEPSTree = on|off

which yields CKKW-L-reweighted tree-level results (up to small differences in Sudakov reweighting),
and by:

Merging:doUMEPSSubt = on|off

which produces the necessary subtractions. Depending on the example main program, these two
stages may directly be mixed internally, so that only the first setting may be necessary.

Footnote 12: The sector shower flag is listed only for completeness; sector showers are switched
on in VINCIA by default.

NLO merging extends the leading-order merging machinery of PYTHIA 8.3 with (externally gen-
erated) next-to-leading order QCD event samples. As an extension of LO machinery, NLO merging
inherits many of the settings of LO merging. The result of NLO merging is an inclusive calculation
that recovers NLO QCD accuracy for inclusive cross sections with $n \le n_{\mathrm{NLO}}$ additional
partons, and LO (QCD) accuracy for inclusive cross sections with $n_{\mathrm{NLO}} < m \le n_{\mathrm{LO}}$
jets. The maximal number of jets for which NLO samples are available ($n \le n_{\mathrm{NLO}}$) has to
be set by using:

Merging:nJetMaxNLO = value
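The accuracy statement above can be read as a simple classification by jet multiplicity. In the sketch below the enum and function names are hypothetical, and labelling multiplicities beyond the LO range as "shower only" is an assumption made for illustration.

```cpp
#include <cassert>

enum class Accuracy { NLO, LO, ShowerOnly };

// Formal accuracy of the inclusive cross section with nJets additional
// jets, given the maximal multiplicities of the NLO and LO input samples.
Accuracy accuracyFor(int nJets, int nJetMaxNLO, int nJetMax) {
  if (nJets <= nJetMaxNLO) return Accuracy::NLO;
  if (nJets <= nJetMax) return Accuracy::LO;
  return Accuracy::ShowerOnly;
}
```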

PYTHIA 8.3 offers two NLO merging schemes as part of its core code: NL3 and UNLOPS. Other
NLO merging schemes (such as the FxFx scheme) can be embedded with the help of UserHooks.
NL3 merging is a straightforward extension of CKKW-L, and mixes augmented CKKW-L-reweighted
tree-level events with events from NLO samples. The reweighted LO stage is enabled by using the
flag:

Merging:doNL3Tree = on|off

while the processing of NLO samples requires setting the switch:

Merging:doNL3Loop = on|off

Typically, NLO input samples contain not only NLO corrections, but tree-level contributions as
well. If this is the case, then explicit removal of tree-level contributions from the NLO sample is
necessary to avoid double counting. This subtraction is enabled by using the flag:

Merging:doNL3Subt = on|off

Note that this subtraction is not related to any of the UMEPS subtractions, but rather a necessity due
to the structure of available inputs. NL3 only supports the use of the Merging:doPTLundMerging
merging scale definition.
UNLOPS merging is an extension of UMEPS that — like UMEPS at leading order — ensures that
NLO inclusive cross sections are exactly retained, with the help of unitarity subtractions. Due to
this, UNLOPS merging proceeds in four phases. The reweighting of tree-level inputs is enabled by:

Merging:doUNLOPSTree = on|off

while the processing of NLO samples is enabled by:

Merging:doUNLOPSLoop = on|off

Both of the former stages should then be accompanied by subtractions to ensure the correctness
of the inclusive cross section. Subtractive leading-order samples are produced when using

Merging:doUNLOPSSubt = on|off

while subtractive NLO events are enabled by:

Merging:doUNLOPSSubtNLO = on|off

Depending on the example main program, these two tree-level-dependent stages, as well as the
two NLO-dependent stages, may directly be mixed internally, so that only the first two settings may
be necessary in practice. UNLOPS supports the use of the Merging:doPTLundMerging merging-
scale definition natively. Other merging-scale definitions (embedded by custom MergingHooks
classes) can be enabled by setting

Merging:unlopsTMSdefinition = value

to a non-zero value.

9.6.2 Variable energies and beam particles


By default, the beam configuration is initialized at one specified energy. In some cases, however,
one may need to generate events across a range of energies. In PYTHIA 8.3, this feature is enabled
by setting

Beams:allowVariableEnergy = on

When this is enabled, the MPI machinery for SoftQCD will be initialized at a grid of energies rang-
ing from 10 GeV up to the maximum energy specified by Beams:eCM. This way, interpolation can
be used to efficiently find the relevant coefficients at each particular energy. (The LowEnergyQCD
code is intended for energies below 10 GeV where MPIs are irrelevant, and no specific initializa-
tion is needed.) Events can then be generated using one of the variant Pythia::next methods
below, corresponding to the kinematics setup specified by Beams:frameType. In other cases, it
is also necessary to change the beam particle types on an event-by-event basis. One example of
a relevant use case is hadronic cascades in a medium like the Earth’s atmosphere or a particle
detector. A number of settings must be explicitly switched on to enable this feature:

SoftQCD:all = on
LowEnergyQCD:all = on
Beams:allowVariableEnergy = on
Beams:allowIDAswitch = on

This will initialize the MPI machinery for a set of some 20 different common hadrons. To switch
beam configurations, use one or more of the following variants of the Pythia::set methods

pythia.setBeamIDs( idA, idB = 0)


pythia.setKinematics( eCM)
pythia.setKinematics( eA, eB)
pythia.setKinematics( pxA, pyA, pzA, pxB, pyB, pzB)
pythia.setKinematics( pAin, pBin)

that match the Beams:frameType set. After calling these methods, all subsequent events called
with next will use the updated configuration, unless the set call was unsuccessful. The first
method preserves the kinematics of the previous event, modulo the change of masses. In this
framework, currently only p/n/p̄/n̄ is supported for idB. An optional parameter procType can
be passed to next, and is used to generate an event of a particular type, such as non-diffractive
or single-diffractive on a specified side.
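The energy-grid initialization described at the start of this subsection relies on interpolation between tabulated energies. The following sketch shows one way this can be pictured: a quantity is tabulated at points evenly spaced in log(E) and interpolated linearly in log(E). The grid spacing, the interpolation rule and all names are assumptions, not PYTHIA's actual implementation.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Tabulates a quantity on nPoints (at least two) energies spaced evenly
// in log(E) between eMin and eMax, and interpolates linearly in log(E).
class LogGrid {
public:
  template <typename F>
  LogGrid(double eMin, double eMax, int nPoints, F f)
    : logMin_(std::log(eMin)),
      step_((std::log(eMax) - std::log(eMin)) / (nPoints - 1)) {
    for (int i = 0; i < nPoints; ++i)
      values_.push_back(f(std::exp(logMin_ + i * step_)));
  }

  // Interpolated value at energy e; extrapolates linearly outside the grid.
  double at(double e) const {
    double x = (std::log(e) - logMin_) / step_;
    int last = static_cast<int>(values_.size()) - 2;
    int i = static_cast<int>(x);
    if (i < 0) i = 0;
    if (i > last) i = last;
    double t = x - i;
    return (1.0 - t) * values_[i] + t * values_[i + 1];
  }

private:
  double logMin_, step_;
  std::vector<double> values_;
};
```

The expensive function is evaluated only at the grid points during initialization; each later lookup costs one logarithm and a linear blend.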
For applications such as cascades in a medium, the decision whether a variable-type interaction
should occur or not must be based on the relevant cross section. To this end, the parameterizations
outlined in section 6.1.4 and section 6.1.5 can be accessed by using the

pythia.getSigmaTotal( idA, idB, eCMAB, mixLoHi = 0)


pythia.getSigmaPartial( idA, idB, eCMAB, procType, mixLoHi = 0)

methods. Here, the default mixLoHi = 0 gives a smooth interpolation between the low- and
high-energy descriptions.
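In a cascade simulation, a total cross section is typically turned into a mean free path and then into a sampled distance to the next interaction. The sketch below shows this generic step; it assumes a cross section given in mb and a uniform target of known number density, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Mean free path in cm for a cross section in mb and a target number
// density in cm^-3, using 1 mb = 1e-27 cm^2.
double meanFreePathCm(double sigmaMb, double densityPerCm3) {
  return 1.0 / (sigmaMb * 1.0e-27 * densityPerCm3);
}

// Distance to the next interaction for a uniform random number u in (0, 1],
// sampled from an exponential distribution with mean lambdaCm.
double interactionDistanceCm(double lambdaCm, double u) {
  return -lambdaCm * std::log(u);
}
```

For a cross section of 40 mb and a density of 10²² targets per cm³, the mean free path is 2500 cm.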
Typically, the MPI initialization is the slowest step also in a normal LHC run setup, and with
variable particles and energies it will take several minutes. It is possible to speed up the initializa-
tion process by saving the MPI parameterizations to disk. This is done using the MultipartonInteractions:reuse
option, which can take the following values:

• 0 (default): MPIs are reinitialized every time.

• 1: MPIs are reinitialized and the parameterization is saved to disk.

• 2: The MPI parameterization is loaded from disk. If the data file does not exist, initialization
fails.

• 3: The MPI parameterization is loaded if the file exists, otherwise it is reinitialized and saved
to disk.

When using non-zero values, the file name MultipartonInteractions:initFile to save/load
from must be specified.
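The four reuse modes listed above can be condensed into a decision function of the mode and of whether the file exists. The enum and function names below are hypothetical and serve only to restate the list.

```cpp
#include <cassert>

enum class MpiInit { Reinitialize, ReinitializeAndSave, Load, Fail };

// Action taken for a given reuse mode, depending on whether the
// parameterization file already exists on disk.
MpiInit mpiInitAction(int reuseMode, bool fileExists) {
  switch (reuseMode) {
    case 1: return MpiInit::ReinitializeAndSave;
    case 2: return fileExists ? MpiInit::Load : MpiInit::Fail;
    case 3: return fileExists ? MpiInit::Load : MpiInit::ReinitializeAndSave;
    default: return MpiInit::Reinitialize;  // mode 0
  }
}
```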

9.7 Advanced usage


Often, the user might want to use PYTHIA to simulate physics effects that are not already imple-
mented in the standard release. We therefore provide several ways of extending PYTHIA capabili-
ties. The event generation process can be interrupted at various points (e.g. after hard scattering,
after first branching in the parton shower, etc.) by using “user hooks”. These can be used to
reweight (or veto) events and change distributions accordingly. Additionally, any extra produc-
tion process or decay of a new particle can be implemented by inheriting from PYTHIA classes that
provide cross section calculation or decay width calculation machinery. We refer to any processes
implemented this way as “semi-internal”. Finally, when extending capabilities, one may wish to
have run-time user-input information in the same way as PYTHIA settings. We therefore provide
some placeholder settings as well as methods to add custom settings keys that can be used to
accompany any new functionality.

9.7.1 User-defined settings


Should the user require additional settings be provided via a card file, some spares have been
made available following the same schema as the normal PYTHIA settings: three each (N = 1,
2, 3) of boolean flags via Main:spareFlagN; integer modes via Main:spareModeN; floating
point parameters via Main:spareParmN; and strings via Main:spareWordN. These can all be
set in the card file and interpreted by the user to suit their needs. To add completely new settings
keywords, the user can use corresponding methods in the Settings class, e.g.

addFlag(string key, bool default)

to add a boolean, i.e. a Flag, and

addParm(string key, double default, bool hasMin, bool hasMax, double min, double max)

to add a double-precision parameter. For further fine-grained control or using the comma-separated
vector type settings, we advise the user to refer to the methods documented in the Settings
Scheme section of the online manual.
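The role of the hasMin, hasMax, min and max arguments can be illustrated with a toy settings store. MiniSettings below is a hypothetical stand-in, not the PYTHIA Settings class; in particular, clamping out-of-range values to the nearest limit is an assumption of this sketch, and the key names are invented.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>

// Minimal stand-in for a settings database; not the PYTHIA Settings class.
class MiniSettings {
public:
  void addFlag(const std::string& key, bool defaultValue) {
    flags_[key] = defaultValue;
  }
  void addParm(const std::string& key, double defaultValue,
               bool hasMin, bool hasMax, double min, double max) {
    parms_[key] = Parm{defaultValue, hasMin, hasMax, min, max};
  }
  // Returns false for unknown keys; values outside limits are clamped.
  bool setParm(const std::string& key, double value) {
    auto it = parms_.find(key);
    if (it == parms_.end()) return false;
    Parm& p = it->second;
    if (p.hasMin) value = std::max(value, p.min);
    if (p.hasMax) value = std::min(value, p.max);
    p.value = value;
    return true;
  }
  bool flag(const std::string& key) const { return flags_.at(key); }
  double parm(const std::string& key) const { return parms_.at(key).value; }

private:
  struct Parm {
    double value;
    bool hasMin, hasMax;
    double min, max;
  };
  std::map<std::string, bool> flags_;
  std::map<std::string, Parm> parms_;
};
```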

9.7.2 User hooks


User hooks are placeholders where the user can interrupt normal PYTHIA program flow to cus-
tomize behaviour. The behaviour of the hook (i.e. the position in program flow where it is designed
to interrupt) is set by functions of the type canVetoX where X indicates one of the pre-defined
locations. An accompanying doVetoX is then executed during every instance of the X. A user
defines a hook by creating a class inheriting from UserHooks, overriding one or more of the hook
methods, and passing an object of that class to a Pythia instance:

pythia.setUserHooksPtr(make_shared<MyUserHooksClass>());

It is also possible to add more than one UserHooks object as follows:

pythia.addUserHooksPtr(make_shared<AnotherUserHooksClass>());

Note, however, that this may give rise to ambiguities if several objects have overridden the same
hook function. For the standard doVetoX functions, X will be vetoed if any of the objects veto,
while for some other hook methods it is not possible to deduce a reasonable combination. In
the latter case, PYTHIA will issue a warning during initialization.
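The rule that a step is vetoed as soon as any registered hook vetoes it is a logical OR over the hook objects. The sketch below uses a hypothetical interface, MiniHook, with a single pair of canVeto/doVeto methods; it is not the PYTHIA UserHooks class.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <vector>

// Stand-in hook interface; not the PYTHIA UserHooks class.
struct MiniHook {
  virtual ~MiniHook() = default;
  virtual bool canVetoStep() { return false; }  // opt in to being called
  virtual bool doVetoStep() { return false; }   // true requests a veto
};

// Example hook that opts in and always requests a veto.
struct AlwaysVeto : MiniHook {
  bool canVetoStep() override { return true; }
  bool doVetoStep() override { return true; }
};

// The step is vetoed if any hook that opted in asks for a veto.
bool anyHookVetoes(const std::vector<std::shared_ptr<MiniHook>>& hooks) {
  return std::any_of(hooks.begin(), hooks.end(),
    [](const std::shared_ptr<MiniHook>& h) {
      return h->canVetoStep() && h->doVetoStep();
    });
}
```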
PYTHIA provides user hooks for ten cases: interruption while switching between main-generation
levels (e.g. process to parton); during parton-level evolutions based on p⊥ or after a step; vetoes
for ISR or FSR emissions; to modify cross section or phase space sampling; after resonance de-
cays; to modify shower scales; to allow colour reconnections; to enhance certain rare splittings
(e.g. g → b b̄); and finally, to modify hadronization. The details of each of these hooks can be
found in the online manual. Using these hooks to modify parton level emissions (e.g. to match
matrix-element contributions from different orders) is discussed in section 9.8. Here we discuss a
simple case of modifying a resonance decay (e.g. to select certain kinematics or decay modes). The
Pythia::process event contains the hard scattering process and decay of resonances produced
in the hard scattering. Defining:

bool MyHook::canVetoResonanceDecays() {
  // By default returns false.
  // Set to true to run following method
  // after each resonance decay
  return true;
}

bool MyHook::doVetoResonanceDecays(Event& process) {
  // Look through the process to check
  // for desired characteristics.
  // Return false to accept the event.
  // Return true to veto the event.
  return false;
}

this method can be used e.g. with an LHE file (an LHE event is always stored in the process
before migrating it to the full event) that already has decayed resonances that are decayed again
by PYTHIA.

9.7.3 Semi-internal processes and resonances


While PYTHIA provides a large number of models, built-in production processes and resonances,
it is oftentimes necessary to either modify existing processes or add new ones. The class structure
provided by PYTHIA can be easily inherited from to include new processes. Any new particles
produced can be implemented as new resonances.
For a resonance, there are three levels of methods used when calculating decay widths in var-
ious channels. The first, initConstants() is run once per resonance and can be used to set
couplings or any other properties that do not depend on kinematics. The second calcPreFac()
has access to the kinematic configuration (masses of particles and phase-space variables), whereas
the third, calcWidth() has access to all information and usually contains a case-wise calcu-
lation of the decay width in all channels. When there is no flavour-dependent factor in the
calculation, calcPreFac() can be used to set the internal variable widNow (inherited from
ResonanceWidths) which serves as the calculated width for a given channel. The example pro-
gram main22 provides a working example of a new resonance.

class NewResonance : public ResonanceWidths {

public:

  // Constructor.
  NewResonance(int idResIn) {initBasic(idResIn);}

private:

  // Locally stored properties and couplings.
  double coupling1, coupling2;

  // Initialize constants.
  virtual void initConstants();

  // Calculate various common kinematic factors
  // for the current mass.
  virtual void calcPreFac(bool = false);

  // Calculate width for each channel.
  virtual void calcWidth(bool = false);

};

Once the resonance is set up, it can be added to the PYTHIA particle data table before initializing
using

ResonanceWidths* newResonance = new NewResonance(pid);
// Where pid is the integer PDG id.
pythia.setResonancePtr(newResonance);

This will automatically call the relevant width calculation functions on initialization and calculate
the total width of the particle based on all open channels. The user may have to set up the decay
table (i.e. a list of open channels) using the commands in section 9.3.5 if the particle is not part
of the PDG standard [85].
Once all new particles are set up, production modes can be set up by inheriting from PYTHIA’s
SigmaProcess class and its derivatives. For 2 → 1, use Sigma1Process, for 2 → 2 use Sigma2Process,
and for 2 → 3 use Sigma3Process. All relevant kinematic variables are already set up and will
be filled event by event based on PYTHIA’s phase-space generator. The production process should
be set up before calling init() in the main Pythia class. The kind of incoming particles needed
for the production is set by the return value of inFlux(); options are qqbar, qqbarSame (same
flavour $q\bar{q}$), qg ($qg$ and $\bar{q}g$), ffbar (includes quarks $q\bar{q}$ and leptons
$\ell\bar{\ell}$), gg (gluons), and a few more.

class Sigma1qqbar2NewResonance : public Sigma1Process {

public:

  // Constructor.
  Sigma1qqbar2NewResonance() {}

  // Initialize process.
  virtual void initProc();

  // Calculate flavour-independent parts of cross section.
  virtual void sigmaKin();

  // Evaluate sigmaHat(sHat).
  // Assumed flavour-independent so simple.
  virtual double sigmaHat() {return sigma;}

  // Select flavour, colour and anticolour.
  virtual void setIdColAcol();

  // Info on the subprocess.
  virtual string name() const {return "q qbar -> NewResonance";}
  virtual int code() const {return 10000;}
  virtual string inFlux() const {return "qqbarSame";}
  virtual int resonanceA() const {return 1000025;}

  // Set internally shared variables (like couplings)
  // as protected or private
  ...
};

Similar to the new resonance-width calculation, there are three progressive methods that can
be used to optimize running time. First, initProc() is called once per run and can be used
to set constants or couplings based on input parameters. Second, sigmaKin() can be used to
set up kinematic factors for unresolved processes that do not rely on flavour information of the
incoming and outgoing states. Third, sigmaHat() can be used to calculate the full contribution of
the phase-space point, which is returned as a double-precision floating-point number. The return
value should be the value of $d\sigma/d\hat{t}$ for 2 → 2 and $|\mathcal{M}|^2$ for 2 → 3 processes,
respectively. If the user wishes to input the matrix element squared instead of $d\sigma/d\hat{t}$,
then they should also override bool convertM2() to return true (see
include/Pythia8/SigmaProcess.h for the full class definition
and explanatory comments). Finally, an important step before the process is usable is to set the
incoming and outgoing colours (i.e. colour topology) and flavours where necessary.
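To illustrate the difference between the two return conventions, the following standalone sketch applies the standard textbook relation for a 2 → 2 process with massless incoming partons, $d\sigma/d\hat{t} = |\mathcal{M}|^2 / (16\pi\hat{s}^2)$, where $|\mathcal{M}|^2$ is assumed to be averaged over initial-state spins and colours. It does not use any PYTHIA classes; the function names and the unit-conversion constant are illustrative assumptions, and the exact conversion applied internally when convertM2() returns true should be checked in SigmaProcess.h.

```cpp
#include <cmath>

// (hbar*c)^2 in units of mb * GeV^2 (approximate value), used to
// convert a cross section from GeV^-2 to mb.
const double GEVM2_TO_MB = 0.3893793721;

// Textbook relation for 2 -> 2 scattering of massless incoming partons:
//   d(sigma)/d(tHat) = |M|^2 / (16 * pi * sHat^2),   in GeV^-4.
// m2 is the spin- and colour-averaged squared matrix element (an assumption
// of this sketch), sHat the squared partonic invariant mass in GeV^2.
double dSigmaDtHat(double m2, double sHat) {
  const double pi = std::acos(-1.0);
  return m2 / (16.0 * pi * sHat * sHat);
}

// The same quantity expressed in mb / GeV^2.
double dSigmaDtHatInMb(double m2, double sHat) {
  return GEVM2_TO_MB * dSigmaDtHat(m2, sHat);
}
```

A process class that prefers to code $|\mathcal{M}|^2$ directly would return m2 from sigmaHat() and leave the division to the framework, whereas one that returns $d\sigma/d\hat{t}$ would perform a step like dSigmaDtHat itself.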

void Sigma1qqbar2NewResonance::setIdColAcol() {

  // Flavours simply to be copied from incoming
  // quark ids i.e. id1, id2
  setId( id1, id2, idNew);

  // Colour flow topologies. Swap when antiquarks.
  // Say NewResonance is an octet
  // col1, acol1, col2, acol2, colRes, acolRes
  setColAcol( 1, 0, 0, 2, 1, 2);
  if (id1 < 0) swapColAcol();

}

The new process can now be added to the PYTHIA process array by declaring:

SigmaProcess* sigma1Res = new Sigma1qqbar2NewResonance();
pythia.setSigmaPtr(sigma1Res);

9.7.4 Multithreading
In most cases, events are generated independently of each other. This means that, in principle,
event generation can easily be split across multiple threads in order to speed up generation.
In practice, each Pythia object contains an internal state, and is therefore not thread-safe. A
straightforward workaround is to create multiple Pythia objects, each initialized with its own
random seed using Random:seed together with Random:setSeed = on.
Starting from PYTHIA 8.307, the PythiaParallel class provides a framework for doing this.
This class is intended as a lightweight way to enable parallelism for simple
studies. Objects of this class are constructed and initialized similarly to normal Pythia objects,
but rather than having a next() method that generates a single event, they provide the run method,
which generates a number of events in parallel. Internally, the PythiaParallel
object creates and keeps track of a number of Pythia sub-objects, which generate events
in parallel. Whenever an event is generated, the Pythia object that generated it is passed to the
user so that the resulting event can be analysed. This process continues until the required
number of events has been generated, as specified by the Main:numberOfEvents setting. The
following snippet gives an example of how to generate events using this class.

#include "Pythia8/Pythia.h"
// PythiaParallel.h must be included explicitly.
#include "Pythia8/PythiaParallel.h"
using namespace Pythia8;

int main() {
  // The PythiaParallel object is created
  // and initialized as normal.
  PythiaParallel pythia;
  pythia.readString("SoftQCD:nonDiffractive = on");
  pythia.readString("Main:numberOfEvents = 10000");
  pythia.init();

  // Example: plot charged multiplicity
  Hist nCh("Charged multiplicity", 100, -0.5, 399.5);

  // This defines the callback that will analyse events.
  function<void(Pythia& pythiaNow)> callback =
    [&nCh](Pythia& pythiaNow) {
      int nChNow = 0;
      for (int i = 0; i < pythiaNow.event.size(); ++i)
        if (pythiaNow.event[i].isFinal() &&
            pythiaNow.event[i].isCharged()) nChNow += 1;
      nCh.fill(nChNow);
    };

  // Generate events in parallel, using
  // the specified callback for analysis.
  pythia.run(callback);

  // Print histogram.
  cout << nCh;
  return 0;
}

In this example, the callback is defined via an anonymous function (also known as a lambda
function). It would also be possible to define it as a named function, e.g. with signature

void callback(Pythia& pythiaNow) ...

The advantage of using an anonymous function is that it can directly access local variables such as
the nCh histogram, which is captured by reference according to the [&nCh] specifier. It is not
necessary to actually save this anonymous function in the callback local variable. It can instead
be passed directly to run, which would make the structure of the code more similar to running
with Pythia::next. Further examples of using the PythiaParallel class are included with
the PYTHIA 8.307 distribution.

Figure 19: Illustration of how 10 events are generated and processed in parallel by
four threads. The red lines indicate that a thread is generating an event. The blue lines
indicate that a thread is analysing the event. Note that two threads are not allowed to
analyse at the same time; the dashed black lines indicate that a thread is done generating
the event, and is waiting for another thread to finish its analysis.

By default, the framework tries to identify the number of available hardware threads and
use the maximum degree of parallelism. Alternatively, the number of threads can be fixed using
the Parallelism:numThreads setting. This can be useful in order to limit the computational
resources spent on generation, and is mandatory on systems where the number of threads cannot
be detected.
Although event generation is done in parallel, the analysis is synchronized by default so that
only one event is processed at a time. In the example above, this ensures that it is not
possible for two threads to simultaneously write to the nCh histogram. An illustration of this is
shown in fig. 19. Usually, the analysis is much faster than the actual event generation, and this
does not have a significant impact on the run time. However, if the analysis is slow or if the
number of threads is very large, the different threads may spend a non-negligible amount of time
waiting for other threads to finish the processing. In this case, the run time can be improved by
setting Parallelism:processAsync = on, which will cause the generated events to be also
processed in parallel. It is then up to the user to ensure mutually exclusive access to thread-unsafe
resources such as histograms.
It is also possible to use external libraries to perform naive parallelization. Several examples
using OPENMP are available in the PYTHIA distribution.
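The synchronization pattern described above can be sketched with the C++ standard library alone. The following toy example is not the PythiaParallel implementation: ToyEvent, runParallel, the static splitting of events between threads, and the per-thread seeding are all illustrative assumptions. It only shows the idea that generation proceeds concurrently while a mutex ensures that a single thread at a time runs the analysis callback.

```cpp
#include <functional>
#include <mutex>
#include <random>
#include <thread>
#include <vector>

// Toy stand-in for a generated event: just a charged multiplicity.
struct ToyEvent { int nCharged; };

// Generate nEvents toy events on nThreads threads. Generation runs
// concurrently; the analysis callback is serialized with a mutex, mimicking
// the default (synchronized) processing mode described in the text.
void runParallel(int nEvents, int nThreads,
                 const std::function<void(const ToyEvent&)>& analyse) {
  if (nThreads < 1) nThreads = 1;
  std::mutex analysisMutex;
  std::vector<std::thread> workers;
  for (int t = 0; t < nThreads; ++t) {
    // Split the events as evenly as possible between the threads.
    int nLocal = nEvents / nThreads + (t < nEvents % nThreads ? 1 : 0);
    workers.emplace_back([&analysisMutex, &analyse, t, nLocal]() {
      // Each thread owns its generator state, with its own seed.
      std::mt19937 rng(1000 + t);
      std::poisson_distribution<int> multiplicity(20.0);
      for (int i = 0; i < nLocal; ++i) {
        ToyEvent event{multiplicity(rng)};  // "generation": no lock held
        std::lock_guard<std::mutex> lock(analysisMutex);
        analyse(event);                     // "analysis": one thread at a time
      }
    });
  }
  for (std::thread& worker : workers) worker.join();
}
```

Because the callback runs under the lock, it may safely update shared objects such as a histogram or a counter captured by reference. Dropping the lock corresponds to the asynchronous mode, in which the callback itself must protect any shared resource.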

9.8 Event weight handling


By default, PYTHIA produces unweighted events. This means that every event produced by the
generator represents an equal share of the total cross section of the chosen processes. However,
some settings and functionalities require the use of weighted events. Weighted events no longer
represent equal shares of the total cross section, but are augmented with a corrective event weight
that needs to be taken into account when filling histograms of physics observables.
Event weights are useful in different scenarios, which are listed below. One of the key advan-
tages is the use of parameter or setting variations. Rather than regenerating events for different
settings and choices of parameters, a vector of corrective event weights can be included in the
event generation, reducing the total computation time significantly compared to the generation
of separate event samples.
A detailed list of available event weights, related settings and how to access them is available
in the online manual. In the following, we provide an overview of process-specific weights and
how they can be accessed. Furthermore, we describe the automated-variation weights and the
weight container, which collects all weights in a common structure.

9.8.1 Overview of process-specific weights


PYTHIA collects the available event weights in a single nominal event weight, which is accessible
through the Info::weight() function. In a usual setting, this weight is set to 1 and thus un-
interesting. Several functionalities and settings lead to a modification of this weight though, in
which case the weight must be included when filling histograms.

• Biased phase-space point selection allows for the reduction of statistical fluctuations for
specific kinematic configurations. The corrective weight needs to be included to ensure that
the overall distributions are not changed.

• If Les Houches events are used as input, some strategies allow for negative weights, which
will be included in the event weight and need to be taken into account. For the strategies 4
and −4, the event weight has units pb, and is converted to mb upon output.

• For heavy-ion collisions, PYTHIA allows a Gaussian sampling of the impact parameter space,
leading to weighted events.

• In rare cases, the initial over-estimate of the differential cross section might for specific
phase-space points lie below the correct differential cross section. In these cases, a weight
above 1 is provided to compensate for this violation.

• Enhanced parton-shower emissions (cf. section 4.1.5) need to be corrected for with a weight
to ensure that the distributions remain unchanged when improving the statistical relevance
of rare emissions.

• Multi-jet merging requires event weights to account for Sudakov factors and the running of
coupling parameters. For the leading-order merging schemes CKKW-L and UMEPS, these are
by default included in Info::weight(). For the next-to-leading-order multi-jet merging
schemes NL3 and UNLOPS, the merging weight needs to be included and is available from
Info::mergingWeightNLO().
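Whatever the origin of the weight in the list above, the practical consequence is the same: each histogram entry is filled with the event weight, and the statistical uncertainty of a bin follows from the sum of squared weights. The class below is a minimal standalone sketch of this bookkeeping; WeightedHist and its methods are hypothetical names introduced for illustration and are not the PYTHIA Hist class.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Minimal weighted histogram: per bin, store the sum of weights and the
// sum of squared weights (needed for the statistical uncertainty).
class WeightedHist {
public:
  WeightedHist(int nBins, double xMin, double xMax)
    : xMin_(xMin), xMax_(xMax),
      sumW_(nBins > 0 ? nBins : 1, 0.), sumW2_(nBins > 0 ? nBins : 1, 0.) {}

  // Fill with the event weight; unweighted events simply use w = 1.
  void fill(double x, double w = 1.) {
    if (x < xMin_ || x >= xMax_) return;  // under/overflow ignored here
    std::size_t i = static_cast<std::size_t>(
      (x - xMin_) / (xMax_ - xMin_) * sumW_.size());
    if (i >= sumW_.size()) i = sumW_.size() - 1;  // guard against rounding
    sumW_[i]  += w;
    sumW2_[i] += w * w;
  }

  // Bin content and its statistical uncertainty sqrt(sum of w^2).
  double content(std::size_t i) const { return sumW_[i]; }
  double error(std::size_t i) const { return std::sqrt(sumW2_[i]); }

private:
  double xMin_, xMax_;
  std::vector<double> sumW_, sumW2_;
};
```

In an event loop, the weight passed to fill would be the value returned by Info::weight(); negative weights are handled naturally, since they reduce the bin content while still increasing its uncertainty.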

9.8.2 Automatic weight variations


In addition to the nominal weight, additional weights can be provided to take into account vari-
ations of settings and parameters. Filling histograms with these respective weights allows for
an estimation of the corresponding distributions without rerunning PYTHIA. Additional variation
weights are available from the parton shower, multi-jet merging and external LHEF input.
The parton shower currently allows for renormalization-scale variations and non-singular term
variations in both initial- and final-state radiation, discussed further in section 4.1.5. In addition,
it allows for the variation of PDF members of LHAPDF 6 families. Details on the usage of these
variations can be found in the online manual. The physics background is described in ref. [20].
The multi-jet merging schemes CKKW-L, UMEPS, NL3, and UNLOPS also allow for
renormalization-scale variations. Furthermore, variations of the UNLOPS merging scheme itself are
available. For details, see ref. [186] and the online manual. The renormalization-scale variations in
the merging are automatically combined with corresponding variations from LHEF input and the
parton shower.
With the availability of variation weights from different sources, PYTHIA 8.3 introduces a
common structure, the weight container, to make all these weights available to the user. This
structure is also used for writing available event weights to HEPMC output. The naming conven-
tions are based on ref. [395, p. 162]. If multi-jet merging is activated, combined weights for
renormalization-scale variations in LHEF input, parton shower, and merging are included. Cus-
tom weights from LHEF input or the parton shower are presented with a prefix to emphasize that
further processing or combining might be necessary. While HEPMC output automatically contains
these variation weights, the user can access them directly through the following methods:

int Info::numberOfWeights()
string Info::weightNameByIndex(int i)
double Info::weightValueByIndex(int i)
vector<string> Info::weightNameVector()
vector<double> Info::weightValueVector()

The first entry of this weight vector, or correspondingly the weight with index 0, is the nominal
weight, including the weights from all the above-mentioned sources.
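As a sketch of how such name and value vectors might be used, the following standalone helper functions accumulate per-variation weight sums over events and form the ratio of each variation to the nominal one. The function names and the example weight names used with them are hypothetical; in a real analysis the two vectors would be the ones returned by the accessor methods listed above, and one would typically keep a full histogram per variation rather than a single sum.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Add one event's weights to per-variation running sums, keyed by name.
void accumulateWeights(const std::vector<std::string>& names,
                       const std::vector<double>& values,
                       std::map<std::string, double>& sums) {
  for (std::size_t i = 0; i < names.size() && i < values.size(); ++i)
    sums[names[i]] += values[i];
}

// Ratio of each variation's weight sum to that of the nominal weight.
// Returns an empty map if the nominal entry is missing or sums to zero.
std::map<std::string, double> ratioToNominal(
    const std::map<std::string, double>& sums, const std::string& nominal) {
  std::map<std::string, double> ratios;
  std::map<std::string, double>::const_iterator it = sums.find(nominal);
  if (it == sums.end() || it->second == 0.) return ratios;
  for (const auto& entry : sums)
    ratios[entry.first] = entry.second / it->second;
  return ratios;
}
```

Since the nominal weight has index 0, the name passed as nominal would be the first entry of the name vector.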

9.9 Tuning PYTHIA


By default, PYTHIA 8.3 operates using a particular set of run-time parameters that determine the
behaviour of the physics models. A set of parameters that is chosen based on a comparison of
PYTHIA 8.3 predictions to data is generically named a “tune”. As the name suggests, the procedure
to obtain a tune is similar to adjusting the pegs on a stringed instrument to achieve a certain sound.
However, there is no universal agreement on what constitutes a good tune in contrast to a good
sound. The goal of tuning is to find an “optimal” set of physics parameters, $p^*$, that minimizes
the difference between the experimental data and the simulated data from the event generator.
In practice, this difference is defined as follows:

$$\chi^2_{\mathrm{MC}}(p, \vec{w}) = \sum_{O \in S_O} \sum_{b \in O} w_{O_b}\,
\frac{\left(\mathrm{MC}_b(p) - O_b\right)^2}{\Delta_{\mathrm{MC}_b}(p)^2 + \Delta_{O_b}^2} ,
\tag{396}$$

where $S_O$ is the set of observables used in the tune, $b \in O$ denotes the bins in a certain observable
$O$, and $\vec{w}$ is a vector of weights $w_{O_b}$ for each bin of each observable. The $\Delta$s are the uncertainties
on the simulated data and the observable. The weights $w_{O_b} \geq 0$ reflect how much an observable
contributes to the tune, i.e. if $w_{O_b} = 0$ for some $O_b$, then this observable bin will not influence
the tuning of $p$, whereas if all $w_{O_b} = 1$ then all data is treated equally. The choice of $S_O$ and $w_{O_b}$
determines a unique tune, and these choices are driven by both theoretical and experimental
considerations. The variable in (396) is called a “chi-squared”, but due to the presence of weights
there is no guarantee that it will have the properties of a proper χ² distribution.

9.9.1 Comments on the tuning procedure


There are several aspects of this problem that make it non-trivial. One is that the model, which
is a mixture of theoretically and phenomenologically grounded sub-models, does not describe all
data equally well. Related to this is that there is no systematic method for predicting a priori where
the model will fail: the sub-models are often deeply entwined and are not factorizable, despite
our original intention that they should be. If one had a numerical estimate of the uncertainty
coming from a certain model (not just its sensitivity to parameter variations, but an estimate of
where it fails), then that could be included in $\Delta_{\mathrm{MC}_b}$, which usually includes only the uncertainty
arising from a finite number of simulation runs. Another issue is how the tune will be applied:
will it be used for generic simulations with lowest-order matrix elements or with matched and
merged predictions with higher-order matrix elements?
There is more than one way to attack these problems. One, as suggested above, is to set $\vec{w}_O = 0$
for some set of observables or analyses. Such a decision is made at the beginning of tuning when
one decides which data is most relevant, e.g. Tevatron data or LHC data from a lower-energy run,
minimum bias or inclusive jet data, etc. Data can also be removed from the tune when it becomes
obvious that they do not fall within the envelope of the model predictions. In practice, it is sometimes
found that even problematic data should be included, with reduced significance, to improve the
overall quality of the tune. This can be accomplished by adjusting some of the $\vec{w}_O$ values to
emphasize or de-emphasize certain datasets. Obviously, such a posteriori manipulation of data is
subject to bias and abuse. However, one should remember that tuning is not hypothesis testing;
we do not allow for the possibility that the model is ruled out. To illustrate two different, but not
exhaustive, approaches, we will describe the MONASH 2013 and the ATLAS A14 tuning exercises.

9.9.2 The default PYTHIA 8.3 tuning: MONASH 2013


The MONASH 2013 tune is currently the default one. It was performed using data from HEPDATA
and the PDG. It was aimed at non-diffractive, high momentum-transfer collisions using the
leading-order matrix elements coded in PYTHIA. It started from the hypothesis that hadronization
was independent of the environment, and the related physics parameters could be best constrained
using e+ e− data, particularly from LEP (for most observables) and SLD (for b-hadron specific
observables). Any modifications to hadronization predictions from the breakup of the proton, for
example, would be handled by explicit models that modified the initial conditions, but not the
mechanism of hadronization. Once the observables were selected, all with either $\vec{w}_O = 0$ or 1,
several inconsistent values of particle yields were adjusted based on common sense. Physics
parameters related to final-state radiation, hadronization, and particle decays were selected using
eq. (396) as a guide, but without an explicit global minimization. An additional ad hoc “theory
uncertainty” of 5% was added per bin of each histogram used in the tune to prevent overfitting.
These parameters were then frozen as a particular eetune. The tuning of the remaining
parameters, specific to hadronic collisions, began with a choice of PDF, which is an integral part of any
such tune. The central tune of the NNPDF2.3 PDF set was selected, as it was being used in many
other theory calculations at the time. In particular, the choice was leading order with a value of αs
closer to that found in the eetune. The tuning of initial-state parameters, such as those related
to initial-state radiation, beam remnants, and multiparton interactions, proceeded in a similar
fashion using LHC data at the highest energy available. Scaling of the multiparton-interaction
parameter was obtained by including Tevatron data. Again, at no point was a global optimization
of parameters made based on minimization of a χ².

9.9.3 The ATLAS A14 tune


The ATLAS A14 tune took a different approach. First, it took the basic MONASH 2013 parameters
as a starting point, with the goal of optimizing parameters for LHC physics studies. It relied heavily
on the PROFESSOR [396] framework. The observables were selected and weighted to emphasize
high-p⊥ radiation and some top-quark observables. It was designed to be used for BSM physics
searches, where precision was not the main goal. To that end, it minimized eq. (396) for ten
parameters, but in an iterative process to select weights that produced a “reasonable” fit. Using
PROFESSOR, it also produced eigentunes that could be used as alternative tunes to study sensitivity
to the PYTHIA parameters. However, because of the inclusion of weights, and since the fit residuals
do not appear to be χ²-distributed, an ad hoc criterion was used to determine these variations. In
the process of selecting data, many observables were included that are obviously highly correlated.
However, those correlations were not reported consistently by the experiments. As a result, some
of the observables have a hidden weight.

9.9.4 Automatic tuning approaches


Automatic tuning approaches can help to circumvent some of the challenges that manual
tuning entails, such as subjectivity based on expert knowledge of models, parameters, constraints, and
data, and the difficulty of taking a large number of data sets and parameters into account.
Automatic tuning aims at simplifying the tuning procedure and making it more systematic, which
is especially helpful when many parameters are to be tuned.
A brute-force grid-based tuning approach is usually prohibitive due to the high computational
cost of generator runs, especially if many parameters are to be tuned. To circumvent this problem,
one can use iterative optimization approaches, which can take time due to the serial running with
different parameters, but focus well on relevant regions in parameter space. Alternatively, one can
attempt to parameterize the generator response, and then optimize based on an interpolation.
After an initial generator run, which can be trivially parallelized, the actual optimization based on
the interpolation is much more straightforward.
As outlined in ref. [397], an iterated Bayesian optimization approach can be employed for
event generator tuning. A χ² value for different parameter values is obtained, and all information
is used to find the next set of parameters iteratively. This approach thus goes beyond local
gradient-based optimization, balancing exploration and exploitation.
The PROFESSOR toolkit employs a parameterization approach. After an initial parallelized MC
event-generator run, the generator response is parameterized using a polynomial function. A
χ² optimization is then performed based on this interpolation. This approach allows for several
parameters, but becomes prohibitive if the parameter space becomes too large. It is then beneficial
to tune in successive steps based on model and data knowledge.
There are multiple efforts to improve the PROFESSOR tuning approach. The AUTOTUNES
method [398] employs PROFESSOR, and goes beyond it by automatically identifying subsets of
correlated parameters that can be optimized successively. The weights are chosen correspondingly to
constrain sub-tunes by the most relevant experimental data. The APPRENTICE method [399] goes
beyond PROFESSOR by allowing for more general interpolations, a larger variety of optimization
methods, and automated setting of weights.
Automatic tuning methods can be very useful, and are helpful when many parameters are to
be optimized based on a large amount of experimental data. In combination with expert
knowledge about the tuned models and the experimental data, pitfalls such as overly strong
constraints from single well-measured distributions or unphysical tuning results can be avoided.

10 Interfacing to external programs


In most realistic use cases, PYTHIA 8.3 is not used stand-alone, but rather as part of a large
software stack capable of providing everything from the calculation of Feynman rules from a Lagrangian
to detector simulation and analysis, including interfaces between all those steps. Technically,
PYTHIA 8.3 is a C++ library, and only the users’ technical proficiency limits the ways the program
can be interfaced to other code; a manual section describing external interfaces will thus by
definition be incomplete. For practical purposes, however, PYTHIA 8.3 comes with a number of interfaces
pre-written, and several more with an official or unofficial “blessing” by the developers. These are
interfaces which should in general work, and where the PYTHIA 8.3 developers will at least take some
responsibility for helping users when setting up. Those interfaces are described here, along with
an explanation of how PYTHIA 8.3 is expected to interact with them. The section is subdivided
into four parts. In section 10.1 we describe file-based or run-time based interfaces to external providers
of input to PYTHIA 8.3, be it external matrix elements, PDFs, or random numbers. In section 10.2
we describe the most often used output formats, such as HEPMC events or ROOT “n-tuples”. In
section 10.3 we describe run-time interfacing with the analysis tools RIVET and FASTJET, and
finally in section 10.4 the use of PYTHIA 8.3 through the PYTHON interface and on multicore HPC
architectures is discussed.

10.1 Generation tools


Several file-based or run-time interfaces exist. For file-based interfaces, generation steps must be
run in a strict sequence. For run-time interfaces, we take a PYTHIA-centric view, i.e. that PYTHIA
controls the overall event generation (unless stated otherwise).

10.1.1 Les Houches Accord and Les Houches Event File functionality
The Les Houches Accord (LHA) format [305] allows a factorized event generation chain and is one
of the most long-lived and successful interface agreements in particle physics. Using the LHA for-
mat, complex perturbative calculations can be factored out from the rest of the event generation
chain, and performed by specialized tools. The basic idea of LHA is a run-time interface between
two generator codes: the “fixed-order generator” stores the collision setup and cross-section infor-
mation in memory for the “event generator” to read upon initialization (see table 4). At generation
time, the individual phase-space points used in the fixed-order generator are stored in memory
for the event generator to read and process further, cf. table 5 for the format definition. Originally,
the in-memory structures were FORTRAN common blocks (called HEPRUP for initialization
and HEPEUP for event information). This original format is still used in modern applications, e.g.
the interfaces to MADGRAPH or POWHEG BOX discussed below. An example of another in-memory
structure is discussed in section 10.1.3.
Although desirable from a computing perspective, run-time interfaces require programming
language-specific in-memory representations. The Les Houches event file (LHEF) format [306] is
a text-file-based update and extension of LHA, such that no run-time interface is necessary, making
the results somewhat more portable. Les Houches Event files provide pre-tabulation and storage
of phase-space points, thus enabling the reuse of computationally expensive results.
The LHEF format defines XML-like “tag” structures to store information. As such, all relevant
information in a LHEF file is enclosed in:

<LesHouchesEvents version=" v "> ... </LesHouchesEvents>.

237
SciPost Physics Codebases Submission

The version can be v = 1.0 [306] or v = 3.0 [395].


The HEPRUP initialization information of the LHA is mirrored by a text block bracketed with
<init> ... </init>, while the HEPEUP event information is captured in a text block enclosed
in an <event> ... </event> tag. Auxiliary information pertaining to all events can also be
stored in a block bracketed with a <header> ... </header> tag. The content of each tag may
contain further tags, see tables 6 and 7 for a list of all recognized tags.
A basic example <header> block is

<header>
Some auxiliary information that
...
is not parsed.
</header>

Such a header would be compliant with all versions of the format. Additional tags may appear
in later versions (v. 3) of the format, as shown in the example below:

<header>
Some auxiliary information that
...
is not parsed.
<initrwgt>
<weightgroup type="alphasVariation">
<weight id="A"> nominal alphas </weight>
<weight id="B"> decreased alphas </weight>
<weight id="C"> increased alphas </weight>
</weightgroup>
</initrwgt>
</header>

In this particular example, PYTHIA 8.3 will be instructed to expect each event to contain a
<rwgt> block that contains three <wgt> entries.
In a slight extension of the accord, PYTHIA 8.3 will also parse the parts of the <header>
block that are enclosed in <slha> ... </slha> as if the block contained an SLHA file. See
section 10.1.2 for a description of SLHA files.
The <init> block is a mandatory part of any LHE file. A basic example will contain the two
beam-particle identifiers, their two energies in GeV, two PDF-author-group identifiers, two PDF-set
identifiers, the weighting strategy, and the number of processes, followed, on a separate line
for each process, by cross section, statistical error, and unit weight information, followed by an
integer process label:

<init>
2212 2212 0.4E+04 0.4E+04 -1 -1 21100 21100 -4 1
0.50109086E+02 0.89185414E-01 0.50109093E+02 1234
</init>

Nowadays, the most common weighting-strategy information (given by −4 in the example)
allows for both positive and negative event weights, where the average weight gives the cross
section of the generated events. In later versions of the format, the optional generator tag may
also be included:

<init>
2212 2212 0.4E+04 0.4E+04 -1 -1 21100 21100 -4 1
0.50109086E+02 0.89185414E-01 0.50109093E+02 1234
<generator name="SomeGen1" version="1.2.3"> some additional comments </generator>
<generator name="SomeGen2" version="a.x.3"> some other comments </generator>
</init>

This tag mainly serves to convey information, and does not affect the file processing through
PYTHIA 8.3.
The initialization information is then complemented with a large list of <event> blocks con-
taining the phase-space points. It should be noted that PYTHIA 8.3 supports an arbitrary list of
attributes of the <event> tag, and further allows “custom” additions enclosed in <event> tags:

• The identifier #pdf at the start of a line means the line contains information on PDFs. For
example, the line
#pdf 1 -1 0.11 0.3 100 0.5 0.3
will lead to reading/setting the values: ID(particle extracted from beam “A”) = 1, ID(particle
extracted from beam “B”) = −1; momentum fraction of particle extracted from beam A
$x_A = 0.11$, momentum fraction of particle extracted from beam B $x_B = 0.3$; factorization
scale $\mu_F = 100$ GeV; value of the parton distribution for beam A $f_A(x_A, \mu_F) = 0.5$; and value
of the parton distribution for beam B $f_B(x_B, \mu_F) = 0.3$.

• <event> tags are allowed to enclose two hard-scattering events, as is e.g. needed when
interfacing to external double-parton scattering codes.

• In the latter case, the identifier #scaleShowers at the start of a line leads to the two
subsequent floating-point values being interpreted as parton-shower starting scales for the
first and second hard scattering enclosed by <event> ... </event>, respectively.

• Omitting the incoming particles in the content of the <event> tag can be permissible when
interfacing with PYTHIA 8.3 to perform only hadronization of resonance-decay systems.

• The event attributes npLO and npNLO are parsed, and employed when interfacing to
MADGRAPH5_AMC@NLO.

A simple event compliant with both versions of the standard will contain information about the
number N of particles in the event, the process label, the event weight, the “scale”, and QED and
QCD coupling strengths, followed by N lines containing particle information:

<event>
4 1234 5.0 300.0 7.861651E-03 1.084400E-01
2 -1 0 0 101 0 0.000E+00 0.000E+00 3.016E+02 3.016E+02 0.000E+00 0. 9.
-2 -1 0 0 0 102 0.000E+00 0.000E+00 -2.964E+02 2.964E+02 0.000E+00 0. 9.
6 1 1 2 101 0 -1.358E+02 -1.671E+02 1.128E+02 3.000E+02 1.756E+02 0. 9.
-6 1 1 2 0 102 1.358E+02 1.671E+02 -1.076E+02 2.980E+02 1.756E+02 0. 9.
</event>

For each particle, its identity, status, pair of mothers, pair of colours, momentum, mass, invariant
lifetime, and spin are required information (cf. table 5). In version 3.0 of the standard, further
information may be added to an <event>. A more involved example is:

<event type="undecayed_born_level_ttbar">
4 1234 5.0 300.0 7.861651E-03 1.084400E-01
2 -1 0 0 101 0 0.000E+00 0.000E+00 3.016E+02 3.016E+02 0.000E+00 0. 9.
-2 -1 0 0 0 102 0.000E+00 0.000E+00 -2.964E+02 2.964E+02 0.000E+00 0. 9.
6 1 1 2 101 0 -1.358E+02 -1.671E+02 1.128E+02 3.000E+02 1.756E+02 0. 9.
-6 1 1 2 0 102 1.358E+02 1.671E+02 -1.076E+02 2.980E+02 1.756E+02 0. 9.
<rwgt>
<wgt id="A"> 5.0 </wgt>
<wgt id="B"> 4.5 </wgt>
<wgt id="C"> 5.5 </wgt>
</rwgt>
<weights> 1.0 0.7 1.3 </weights>
<scales muf="175.0" mur="175.0" mups="300.0" scale_3="1.0" scale_4="1.0">
content is not parsed
</scales>
</event>

This event contains three auxiliary event weights in the “detailed format”, as well as three
additional event weights in the “compressed format”. These different ways to transmit event
weights typically do not appear together. The “detailed format” has become much more widely
used. The example above further contains auxiliary scale information through the scales tag.
This feature can be used to e.g. transfer multiple shower starting scales to PYTHIA 8.3. Starting
scales for individual particles in the event can be set by including a scales attribute ending with
_iPos, where iPos is the position of the particle (in the <event>) in question. This functionality
is used for MLM jet matching with MADGRAPH, and for MC@NLO-∆ matching using
MADGRAPH5_AMC@NLO. At present, PYTHIA 8.3 does not support the use of sets of events enclosed
in <eventgroup>. Such event sets were originally proposed in ref. [400] to collect events that
require correlated post-processing. Since the latter is not possible in PYTHIA 8.3, event files
containing <eventgroup> tags will be treated as if the <eventgroup> tag was not present.
Finally, note that PYTHIA 8.3 will perform momentum-conservation checks on each input
<event>. If inconsistencies (e.g. due to rounding errors) are found, then actions will be taken to
repair the event. This entails enforcing the correct value of particle rest masses, and ensuring that
the incoming momentum matches the outgoing momentum.

Table 4: The information defining the LHA initialization interface (in the HEPRUP common
block). The suffix UP can be read as “user process”. At most 100 user processes are
allowed. See ref. [305] for details.

block name       description

IDBMUP(2)        pair of two integer values defining the PDG IDs of the colliding beams
EBMUP(2)         pair of two floating-point values listing the energies of the two colliding beams in GeV
PDFGUP(2)        pair of two integer values defining the author group of the PDF fit used as the PDF for the colliding beams
PDFSUP(2)        pair of two integer values defining the PDF set used to extract particles from the colliding hadron beams
IDWTUP           signed integer value determining how the event weights should be interpreted
NPRUP            integer value defining the number of different user processes
XSECUP(NPRUP)    list of NPRUP double values giving the cross sections (in units of pb) of the individual user processes
XERRUP(NPRUP)    list of NPRUP double values giving the statistical errors associated with the individual user processes
XMAXUP(NPRUP)    list of NPRUP double values giving the maximum weight encountered in generating the cross section of the user process
LPRUP(NPRUP)     list of NPRUP integer identifiers for the user processes; the identifiers will also feature in the in-memory representation of the phase-space point

Table 5: The information defining the LHA event information (in the HEPEUP common
block). At most 500 particles are allowed. See ref. [305] for details.

block name       description

NUP              number of particle entries in the event
IDPRUP           identifier of the user process for this event
XWGTUP           event weight
SCALUP           scale of the event in GeV
AQEDUP           value of the QED coupling for this event
AQCDUP           value of the QCD coupling for this event
IDUP(NUP)        list of NUP integer values defining the PDG IDs of the individual particles
ISTUP(NUP)       list of NUP integer values defining the status (initial state, final state, or resonance) of the individual particles
MOTHUP(2,NUP)    pair of two lists of NUP integer values defining the mothers of the particles
ICOLUP(2,NUP)    pair of two lists of NUP integer values defining the Nc → ∞ colour (anticolour) flow indices of the particles
PUP(5,NUP)       five lists of NUP double values giving the lab-frame momentum of the particle (P_x, P_y, P_z, E, M) in GeV
VTIMUP(NUP)      list of NUP double values giving the invariant lifetime cτ (distance from production to decay) in mm
SPINUP(NUP)      cosine of the angle between the spin vector of the particle and the three-momentum of the decaying particle, specified in the lab frame

Table 6: Allowed tags in the <header> and <init> blocks of a Les-Houches event file.

tag name              description

<header>              the tag starting the header block; a completely empty header block is allowed
<initrwgt>            optional tag detailing the auxiliary event weights in the “detailed LHEF v3.0 format”; the following two tags have to be enclosed in this tag
<weightgroup>         optional tag defining a group of event weights in the “detailed LHEF v3.0 format”; this group will contain several instances of the following tag
<weight id="name">    optional tag defining a particular auxiliary event weight; PYTHIA 8.3 expects each event to contain a <wgt> (see table 7) with id=name for a <weight> with id=name
<init>                the tag starting the cross section information and initialization block
<generator>           optional tag to transfer information about the generator and generator version used to produce the event sample

Table 7: Allowed tags in the <event> block of a Les-Houches event file.

tag name           description

<event>            the tag starting the event block; an arbitrary number of attributes is allowed
<rwgt>             optional tag enclosing a set of event weights in the “detailed LHEF v3.0 format”, see next tag
<wgt id="name">    optional tag transmitting the floating-point value of a unique auxiliary event weight as content; the id=name should mirror one of the <weight> tags of the <initrwgt> block (see table 6)
<weights>          optional tag containing an array of floating-point values for a set of auxiliary event weights in the “compressed LHEF v3.0 format”
<scales>           optional tag allowing additional scale information to be stored as attributes of the tag

10.1.2 SLHA
The SUSY Les Houches Accord (SLHA) format [401, 402] was designed as a plain-text interface between
supersymmetric spectrum generators, decay packages, and event generators. However, it has since
been generalized to contain information for any new physics model, cf. e.g. ref. [403].
The current SUSY implementation in PYTHIA is fully general with support for flavour- and
R-parity violation. The physical mass basis for each class of new particles (squarks, sleptons,
charginos, and neutralinos, as well as Higgses) is ordered by mass alone. We refer the reader to
the original SLHA2 documentation [402] for the full list of supersymmetric parameters supported
by SLHA2. Here we give a summary of how new parameters can be passed to PYTHIA, and the
modifications made to extend SLHA2 support to be able to read up to 3-dimensional matrix input.
An SLHA file contains a number of pre-formatted “blocks”. The three main blocks most often
used for passing information about new particles are QNUMBERS, MASS, and DECAY. As an example,
we show here how a new spin-1 particle in a colour-octet representation (“heavy gluon”) and a
new fermion (“heavy quark”) can be defined in SLHA [403]. All characters following a # symbol
are ignored as a comment, with one exception: the first two words after the particle ID code are
taken to be the name of the particle and, optionally, of its antiparticle.

BLOCK QNUMBERS 9000021 # HeavyGluon
1 0 # 3 times electric charge
2 3 # number of spin states (2S+1)
3 8 # colour rep (1:singlet, 3:triplet, 8:octet, 6:sextet)
4 0 # Particle/Antiparticle distinction (0=own anti)

BLOCK QNUMBERS 9000006 # HeavyQuark HeavyQuarkbar
1 0 # 3 times electric charge
2 2 # number of spin states (2S+1)
3 3 # colour rep (1:singlet, 3:triplet, 8:octet, 6:sextet)
4 1 # Particle/Antiparticle distinction (0=own anti)

Note that many of the particle ID codes below 3 million, and several above it, are already in
use in PYTHIA (e.g. for hadrons, SM particles, and the MSSM particle spectrum). To avoid conflicts,
it is strongly advised to only use codes above 3 million for new BSM particles, and to check in the
particle data table that the codes are not already in use. See also the PDG list of standard particle
ID codes [35, sec. 45]. Finally, note that PYTHIA is only able to handle colour singlets, triplets,
octets, and sextets.
The mass block [401] contains the mass of the physical particles and is simply a list containing
the particle ID code and its mass.

BLOCK MASS
9000021 1000. # HeavyGluon
9000006 450. # HeavyQuark

Note that some matrix-element generators export their complete list of particle masses in this
block, including also those of SM particles, which may not agree with PYTHIA’s internal values.
This can wreak havoc in unintended places, e.g. by overwriting PYTHIA’s constituent-quark masses
by far smaller current-quark masses. Therefore, for particles with ID codes less than one million,
PYTHIA normally ignores SLHA input for any particle whose default mass in PYTHIA is smaller than
SLHA:minMassSM = 100 GeV. This allows SLHA input to modify top and Higgs-boson properties,
but not those of Z, W , and lighter particles.
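
The protection rule described above can be summarized in a few lines of C++. The sketch below is a paraphrase of the text, not the PYTHIA 8.3 source: the function name acceptSlhaMass is hypothetical, and applying the one-million threshold to the absolute value of the ID code is our assumption.

```cpp
#include <cstdlib>

// Decide whether a mass read from an SLHA MASS block is applied.
// defaultMass is the value already stored for this particle, in GeV;
// minMassSM plays the role of the setting SLHA:minMassSM.
bool acceptSlhaMass(int id, double defaultMass, double minMassSM = 100.0) {
  // ID codes of one million and above: SLHA input is always used.
  if (std::abs(id) >= 1000000) return true;
  // Lower ID codes: ignore input for particles lighter than minMassSM.
  return defaultMass >= minMassSM;
}
```

With the default threshold, the top quark and the Higgs boson pass the check, while the Z, the W, and all lighter particles keep their internal values.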
Separate DECAY blocks [401] can be used to specify decay tables for both new and existing
particles. (See further sections 2.3.3 and 3.11 for more on PYTHIA’s modelling of resonance
production and decays.) The sum of all branching fractions is normalized to one when read in. If a
certain decay channel is needed for determining the total width, but should not be generated
in a given run, it can be switched off by setting its branching fraction negative. Each
line containing a branching ratio should also contain the number of daughter particles, followed
by the ID codes of the daughters. Note that only a single decay table should be provided for each
particle type; PYTHIA does not accept separate decay tables for antiparticles. However, if different
open decay modes are required for a particle and its antiparticle, this can be accomplished by
using the PYTHIA ParticleData settings NN:onIfPos and NN:onIfNeg which are allowed to
override the initial SLHA settings if SLHA:allowUserOverride = true.

# PID Width
DECAY 9000021 0.01
# BR NDA ID1 ID2
0.67 2 9000006 -9000006
0.33 2 6 -6
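
To illustrate the bookkeeping implied by the rules above, the following C++ sketch normalizes a set of branching fractions and interprets a negative sign as “switched off”. The names Channel and normalizeChannels are hypothetical, and normalizing with the absolute values of the branching fractions is our assumption about how a negative entry still contributes to the total width; the actual treatment inside PYTHIA 8.3 may differ in detail.

```cpp
#include <cmath>
#include <vector>

// Hypothetical record for one line of a DECAY table.
struct Channel {
  double br;  // branching fraction as read; negative means switched off
  bool   on;  // whether the channel may be generated
};

// Normalize so that the absolute branching fractions sum to one, and
// translate the sign convention into an on/off flag.
void normalizeChannels(std::vector<Channel>& channels) {
  double sum = 0.;
  for (const Channel& c : channels) sum += std::abs(c.br);
  if (sum <= 0.) return;  // nothing to normalize
  for (Channel& c : channels) {
    c.on = (c.br >= 0.);
    c.br = std::abs(c.br) / sum;
  }
}
```

In this picture a switched-off channel keeps its share of the total width, so the remaining open channels retain their physical branching fractions.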

When the SLHA interface is used to modify particle data, the $m_{\min}$ and $m_{\max}$ limits used in
PYTHIA’s Breit–Wigner sampling (see section 2.3.3) default to $m_0 \pm \min(5\Gamma_0, m_0/2)$. The $m_{\min}$ value
is further required to also be above the sum of on-shell masses for the lightest decay channel. The
default values can be modified by the user, if so desired.
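
As a worked illustration of these defaults, the C++ sketch below evaluates the mass window for a given nominal mass and width. MassRange and defaultMassRange are hypothetical names, and the argument lightestChannelMass, the sum of on-shell daughter masses of the lightest decay channel, is assumed to be supplied by the caller.

```cpp
#include <algorithm>

// Hypothetical pair of Breit-Wigner sampling limits, in GeV.
struct MassRange { double mMin, mMax; };

// Default window m0 +- min(5 * Gamma0, m0 / 2), with the lower limit
// raised to the threshold of the lightest decay channel if needed.
MassRange defaultMassRange(double m0, double gamma0,
                           double lightestChannelMass) {
  double delta = std::min(5. * gamma0, 0.5 * m0);
  MassRange range{m0 - delta, m0 + delta};
  range.mMin = std::max(range.mMin, lightestChannelMass);
  return range;
}
```

For a narrow resonance the window is set by the width, whereas for a very broad one the lower limit is bounded by half the nominal mass or by the decay threshold, whichever is larger.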
The default Breit–Wigner treatment for decay tables imported via the SLHA interface is the
simple NN:meMode = 100 one with constant branching fractions, but this can also be changed if
desired. The phase-space sampling is isotropic, since the SLHA tables do not convey any differen-
tial information. It is up to the user to ensure that the final behaviour is consistent with what is
desired and/or to apply suitable post-facto reweightings. Plotting the generator-level resonance
and decay-product mass distributions and e.g. mass differences, effective branching fractions, etc.,
may be of assistance to validate the program’s behaviour for a given application.
Note, finally, that the default in PYTHIA is to ignore SLHA input for all SM particles except
top quarks and Higgs bosons; this protects PYTHIA’s more sophisticated modelling of e.g. Z and
W decays (as well as its definitions of quarks, hadrons, and leptons), cf. section 3.11, from being
unintentionally overridden by the simpler SLHA treatment. Similar to the above, this choice can
be changed by the user if desired, though care must be taken not to corrupt PYTHIA’s hadron or
light-quark particle data.
Finally, we describe how user-defined blocks may be accessed via the SLHA class [82]. All
unknown, i.e. user-defined, blocks that can be stored in arrays of up to 3 dimensions are read in
from the SLHA file and saved under the name following the BLOCK keyword. The method used to
access the relevant information depends on the dimensionality of the block. This
functionality can be used with e.g. the semi-internal processes described in section 9.7.3 to
read complex parameter information from SLHA files. Using the slhaPtr object available to all
production processes inheriting from the SigmaProcess class, a block with name blockName can be
accessed using one of the following.

// Single value
bool slhaPtr->getEntry(string blockName, double& value);

// 1D array
bool slhaPtr->getEntry(string blockName, int index, double& value);

// 2D array
bool slhaPtr->getEntry(string blockName, int index1,
                       int index2, double& value);

// 3D array
bool slhaPtr->getEntry(string blockName, int index1, int index2,
                       int index3, double& value);

10.1.3 LHAHDF5
In addition to plain-text based ASCII LHEF, PYTHIA 8.3 now also supports Les-Houches event
input via the HDF5 data format, which some matrix-element generation frameworks, such as
SHERPA [404], support as an alternative to LHEF event output.
The HDF5 format is an open-source binary data format, organized like a database within a
single file. It allows for heterogeneous data storage, which is more compressed than ASCII files.
Being indexed in an efficient way, it enables the possibility of data slicing, i.e. the reading of data
subsets instead of the entire data at once. The HDF5 format is thus well suited for storing large
numbers of LHA phase-space points in a more efficient way than text-based file formats, allowing
for massively parallelized simultaneous access to a single event file [405].
The LHAHDF5 reader uses the HighFive header library to interface with HDF5. Moreover, the HDF5
library tools must be installed and an MPI compiler, such as that shipped with MPICH, is needed.
To use the LHAHDF5 reader with PYTHIA 8.3, an example configuration command is therefore
given by:

./configure --with-mpich[=path] --with-hdf5[=path] --with-highfive[=path]

As a relatively new event file format, the LHAHDF5 standard is still undergoing active de-
velopment. PYTHIA 8.3 internally uses a three-digit numbering scheme to distinguish different
LHAHDF5 versions, characterized as follows:

0.1.0 The event file contains an index group, in which the indices of the particles in a single event
are stored. The indices refer to the particle group. Weight variations are not supported and
event weights are stored as a single floating-point number in the event group.

0.2.0 The event file does not contain an index group. Weight variations are not supported, and
event weights are stored as a single floating-point number in the event group.

1.0.0 The event file does not contain an index group. Weight variations are supported, and
event weights are stored in a (possibly one-dimensional) array in the event group.

Not all event files currently have their version number stored. Therefore, the version can be
specified in the PYTHIA input file using e.g. LHAHDF5:version = 0.2.0. If a version number is
present in the event file, the user input is ignored and the version stored in the event file is
used instead.

10.1.4 LHAPDF
The LHAPDF package is the community standard for providing external parton distribution func-
tions to event generators. Two versions of LHAPDF are supported by PYTHIA 8.3, version 5 [406],
a legacy FORTRAN version, and version 6 [407], with a more performant modern C++ implementation.
The use of LHAPDF 5 is discouraged, and support for it will be fully removed in the future; it is
currently kept to provide PDFs for resolved photons that are not available in LHAPDF 6. Both
versions act as interpolators and extrapolators for the x and $Q^2$ PDF grids provided by fitting groups.
The LHAPDF libraries do not perform DGLAP evolution, and are restricted in x and $Q^2$ to the grids
provided by each PDF set.
Support for LHAPDF can be enabled during PYTHIA 8.3 configuration by,

./configure --with-lhapdf5[=path] --with-lhapdf6[=path]

where the path can optionally be provided. If the executable lhapdf-config is available,
the LHAPDF path will be automatically extracted. Plugin libraries are generated along with
the Pythia library which are then loaded at run time when LHAPDF sets are requested by the
user. With this interface, it is technically possible to simultaneously use both an LHAPDF 5 and
LHAPDF 6 PDF, but this is strongly discouraged. For all PDFs, proton or otherwise, LHAPDF sets
can be selected via setting the relevant configuration key to the value LHAPDF5:set/member or
LHAPDF6:set/member, where set is the name of the PDF set to use and member is the numerical
member of that set. If member is not supplied, the nominal member is assumed. The example
main52 demonstrates this syntax, while the example main51 shows how PDF classes can be used
independently of a main Pythia instance.
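
The structure of such a configuration value can be illustrated with a small C++ sketch that splits it into its parts. PdfSpec and parsePdfSpec are hypothetical names and do not correspond to PYTHIA 8.3 code; the sketch assumes that an omitted member maps to member 0, the nominal member in the LHAPDF numbering.

```cpp
#include <string>

// Hypothetical decomposition of a value such as "LHAPDF6:set/member".
struct PdfSpec {
  std::string scheme;  // "LHAPDF5" or "LHAPDF6"
  std::string set;     // name of the PDF set
  int         member;  // member index; 0 if not supplied
  bool        valid;   // false if the value does not follow the syntax
};

PdfSpec parsePdfSpec(const std::string& value) {
  PdfSpec spec{"", "", 0, false};
  std::size_t colon = value.find(':');
  if (colon == std::string::npos) return spec;
  spec.scheme = value.substr(0, colon);
  if (spec.scheme != "LHAPDF5" && spec.scheme != "LHAPDF6") return spec;
  std::string rest = value.substr(colon + 1);
  std::size_t slash = rest.rfind('/');
  if (slash == std::string::npos) {
    spec.set = rest;  // no member given: keep the nominal member 0
  } else {
    spec.set = rest.substr(0, slash);
    std::string member = rest.substr(slash + 1);
    // std::stoi throws if the member is not a number.
    if (!member.empty()) spec.member = std::stoi(member);
  }
  spec.valid = !spec.set.empty();
  return spec;
}
```

For instance, a value of the form LHAPDF6:NNPDF31_nnlo_as_0118/3 would be split into the scheme LHAPDF6, the set name NNPDF31_nnlo_as_0118, and member 3.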
Every LHAPDF set has a range of validity, given by the minimum and maximum x and $Q^2$
values of the grids provided. By default, PYTHIA 8.3 freezes these PDF sets at all boundaries for
the set, i.e. for $x < x_{\min}$ the PDF value is fixed at $x_{\min}$ and for $Q < Q_{\min}$ the PDF value is fixed
at $Q_{\min}$. It is possible to enable extrapolation below $x_{\min}$ by setting the PDF:extrapolate flag.
This flag applies universally to all PDF sets, both internal and external. Extrapolation should be
enabled with care, as the extrapolation is PDF set and LHAPDF version dependent, and in many
cases may return nonsensical results. Note that extrapolation for the remaining boundaries, $x_{\max}$,
$Q_{\min}$, and $Q_{\max}$, is never performed. These values are always frozen at the limits of validity.
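
The freezing prescription amounts to clamping the evaluation point to the grid before the PDF is looked up. A minimal C++ sketch, under the assumption that freezing is implemented purely as such a clamp, is given below; GridRange, EvalPoint, and freezeToGrid are hypothetical names, and the extrapolate argument mimics the effect of the PDF:extrapolate flag.

```cpp
#include <algorithm>

// Hypothetical description of the validity range of a PDF grid.
struct GridRange { double xMin, xMax, qMin, qMax; };

// Point at which the grid is actually evaluated.
struct EvalPoint { double x, q; };

// Clamp (x, Q) to the grid. Only the lower x boundary can be released,
// which then leaves the small-x behaviour to the extrapolation of the set.
EvalPoint freezeToGrid(double x, double q, const GridRange& grid,
                       bool extrapolate = false) {
  EvalPoint point;
  point.x = std::min(x, grid.xMax);
  if (!extrapolate) point.x = std::max(point.x, grid.xMin);
  point.q = std::min(std::max(q, grid.qMin), grid.qMax);
  return point;
}
```

The three boundaries other than the lower x limit are clamped irrespective of the flag, mirroring the statement that extrapolation is never performed there.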
The standardized LHAGrid1 format used by LHAPDF 6 allows for PYTHIA 8.3 to use grids from
LHAPDF 6 sets without requiring the LHAPDF 6 library. Simple cubic interpolation is performed
in $\ln(x)$ and $\ln(Q^2)$, where all $Q^2$ sub-grids must have the same x-value structure. When fewer than
four $Q^2$ sub-grids are available, linear interpolation is used instead. All relevant PDF sets can use
this interpolation by setting the relevant PDF configuration key to the value LHAGrid1:file,
where file is the full name of the PDF set file. If file begins with /, then an absolute file path
is used, otherwise the file is assumed to be in the share/Pythia8/pdfdata directory.

10.1.5 POWHEG
A large number of processes utilizing the POWHEG method (positive weight hardest emission
generator) [151, 152, 156] are available via the POWHEG BOX package [179]. The physics behind the
matching and merging of the hard processes generated by this package with the PYTHIA parton
shower is detailed in section 5. Here, technical details are given on how results from POWHEG BOX
matrix elements may be interfaced with PYTHIA.
The POWHEG BOX package uses a common FORTRAN code structure, which is then duplicated
with process-specific modifications in individual matrix elements, e.g. dijets which produces
NLO dijet events. These individual matrix elements are then compiled to create executables which,
when run, take input cards from the user and produce LHEF output, see section 10.1.1 for details
on this format. This output file can then be directly read into PYTHIA via the Beams:LHEF setting.
Direct POWHEG BOX input, without correctly setting up matching, will result in double counting
of emissions. A special UserHooks class, PowhegHooks in Pythia8Plugins, provides a common
interface for appropriately matching POWHEG BOX output with the PYTHIA parton shower.
In main31 a full example is given, demonstrating how dijet events produced from the dijets
POWHEG BOX matrix element can be correctly passed through PYTHIA to produce full events.
In some cases, particularly within large experimental frameworks, users may wish to directly
access the FORTRAN common blocks of a POWHEG BOX executable, passing the event by memory
to PYTHIA, rather than through LHEF output. By default, POWHEG BOX builds only executables.
However, it is possible to modify the Makefile via the command,

sed -i "s/F77= gfortran/F77= gfortran -rdynamic -fPIE -fPIC -pie/g" Makefile

so that the executables can also be used as shared libraries. When modified accordingly, these
executables can be linked against PYTHIA interface code to produce libraries that can be loaded
directly by PYTHIA at run time. Run-time loading, rather than dynamic linking, is used so that
multiple POWHEG BOX processes can be accessed by a single Pythia instance, without creating symbol
collisions between executables that have common names for global functions and variables.
After appropriately modifying the relevant POWHEG BOX Makefiles and compiling executables
that can also be used as shared libraries, the PYTHIA interface libraries must be created. This
can be configured with PYTHIA via,

./configure --with-powheg-bin=path

where path is the directory containing the POWHEG BOX executables. When building PYTHIA, a
plugin library for each POWHEG BOX executable in the provided directory will automatically be
created. These plugin libraries can then be used via the PowhegProcs class provided in
Pythia8Plugins as demonstrated in the example main33. The program flow is as follows,

Pythia pythia;                     // Create a Pythia instance.
PowhegProcs hvq(&pythia, "hvq");   // Load the "hvq" plugin library.
hvq.readString("configure here");  // Configure the "hvq" plugin.
hvq.init();                        // Initialize the plugin.
pythia.init();                     // Initialize Pythia.

where the heavy-quark process hvq has been loaded and configured. It is also possible to include
another process,

PowhegProcs dijet(&pythia, "dijet", "dijetrun");

where the additional argument is needed to ensure that the integration grids from the first
process are not overwritten by the second process.
When using the PowhegProcs method for interfacing with POWHEG BOX, a PowhegHooks
instance is automatically created and passed to the main Pythia instance. The settings for this
matching hook must be set by the user through either the readString or readFile methods of
the Pythia instance. In many cases, sensible default values are set, but some settings are process
dependent and must be correctly configured by the user, e.g. POWHEG:nFinal.

10.1.6 MADGRAPH5_AMC@NLO
MADGRAPH5_AMC@NLO [154] is a hard process generator, similar to POWHEG BOX, but rather
than relying upon individually implemented processes, it can automatically generate arbitrary
processes up to NLO. There are a number of ways through which MADGRAPH5_AMC@NLO can be
interfaced with PYTHIA.
1. MADGRAPH and AMC@NLO themselves can interface with PYTHIA and pass generated hard
processes through PYTHIA to produce full events, all within the MADGRAPH5_AMC@NLO
machinery.

2. LHEF output from MADGRAPH5_AMC@NLO can be passed to PYTHIA 8.3, see section 10.1.1
for details on reading LHEF input.

3. Source code for matrix-element libraries, inheriting from the internal SigmaProcess class
in PYTHIA, can be generated by MADGRAPH.

4. The MADGRAPH5_AMC@NLO executable can be called from within PYTHIA via the
LHAupMadgraph class.

5. Matrix-element plugins for the DIRE and VINCIA parton showers can be generated by
MADGRAPH, compiled, and then loaded at run time.
The latter three methods are covered in more detail below. In all cases, it is important that appro-
priate matching and merging, see section 5, is configured to ensure there is no double counting
between the generated hard process and the remainder of the event produced by PYTHIA 8.3.
Semi-internal processes can be passed to PYTHIA 8.3 via inheriting from the SigmaProcess
class. The primary method of this class is sigmaHat where the exact definition depends upon the
final-state multiplicity of the process. Phase-space generation can be handled by PYTHIA 8.3 for
2 → 1, 2 → 2, and 2 → 3 processes, although the 2 → 3 phase-space sampler is not particularly
sophisticated. When necessary, users can provide custom external phase-space samplers. Conse-
quently, while 2 → n processes can be externally supplied, phase-space generation must also be
implemented by the user for n > 3. A full example is given in the example main22 but the general
syntax is,

shared_ptr<SigmaProcess> userSigma = make_shared<UserSigma>();
pythia.setSigmaPtr(userSigma);

where UserSigma is a user-defined process inheriting from SigmaProcess.


Semi-internal process source code can be generated from within the MADGRAPH PYTHON in-
terface as follows.

import model model_name
generate mg5_process_syntax
add process mg5_process_syntax
output pythia8 [path_to_pythia]

A directory containing the output for the process is placed in the PYTHIA 8.3 source directory
specified by path_to_pythia and an example is placed in the examples directory.
It is also possible to call MADGRAPH from within PYTHIA 8.3 via the LHAupMadgraph class
provided in Pythia8Plugins.

shared_ptr<LHAupMadgraph> madgraph =
make_shared<LHAupMadgraph>(&pythia, true, "madgraphrun", exe);
madgraph->readString("generate mg5_process_syntax");
pythia.setLHAupPtr(madgraph);

This interface generates the relevant MADGRAPH configuration cards, and then runs the
MADGRAPH executable, specified by exe, to produce LHEF output that is then read in by PYTHIA 8.3.
An attempt is made to automatically set up matching and merging, but this process should always
be validated by the user. Random-number sequences are automatically handled, based on the
PYTHIA 8.3 random-number generator. Whenever the LHEF input is exhausted, a new call is made
to the MADGRAPH executable and a new LHEF output is generated.
Finally, it is possible to use MADGRAPH to generate matrix-element plugins for use in the
DIRE and VINCIA parton showers. A number of these plugins are already provided with the
PYTHIA 8.3 distribution in the plugins/mg5mes directory. To enable this plugin support, con-
figure PYTHIA 8.3 with

./configure --with-mg5mes[=path]

where the path to the matrix-element plugin source-code directories can optionally be specified.
A plugin library for each directory in the path will be built, which can then be loaded at run time.
Just as for POWHEG BOX, run-time loading of the matrix elements allows for multiple plugins to
be used with the same instance of Pythia. For DIRE and VINCIA, the plugin library to be used can
be specified with the settings Dire:MEplugin and Vincia:MEplugin respectively.
New matrix-element plugin libraries can be generated by using the generate command in
the plugins/mg5mes directory. In its simplest form, the user just needs to specify the process,

./generate --process="mg5_process_syntax"

but may also specify a model to use, as well as the output directory. Advanced usage is also possible
where a custom MADGRAPH card is passed by the user, or the interactive mode of MADGRAPH is
enabled. Note that this feature requires the use of DOCKER to download and run a container with
a custom version of MADGRAPH.

The most common interface to MADGRAPH5_AMC@NLO is through text files in LHEF format,
cf. section 10.1.1. For easy interfacing between MADGRAPH5_AMC@NLO and PYTHIA 8.3, some
custom additions to the file format are employed:

• The event attributes npLO and npNLO are used to set the number of particles at lowest order
for events with leading-order and next-to-leading order cross sections, respectively. For the
former, npLO amounts to a simple final-state particle count. For the latter, npNLO gives the
number of final-state particles necessary to define the scattering at Born level. It is assumed
that npLO ≥ 0 ⇒ npNLO < 0 and npNLO ≥ 0 ⇒ npLO < 0, meaning that these attributes also
act to signal if an event is a leading-order or next-to-leading-order contribution.

• Several mechanisms to set the parton shower starting scales for individual particles exist.
These rely on attributes of the <scales> tag defined in the LHEF 3.0 format.

• For the case of MLM matching, the parton-shower starting scale information is also used
to signal whether a particle should not be considered for the MLM jet matching procedure.
Particles that have been assigned a starting scale $\mu > 2E_{\mathrm{CM}}$ will be considered exempt from
the MLM jet matching criterion.
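
The two signalling conventions in the list above can be written down compactly. The C++ sketch below is only a restatement of the rules in the text: EventOrder, classifyEvent, and exemptFromMLM are hypothetical names, and treating any other combination of npLO and npNLO as “unknown” is our assumption.

```cpp
// Hypothetical classification of an event from its npLO/npNLO attributes.
enum class EventOrder { LO, NLO, Unknown };

// Exactly one of the two attributes is expected to be non-negative.
EventOrder classifyEvent(int npLO, int npNLO) {
  if (npLO >= 0 && npNLO < 0) return EventOrder::LO;
  if (npNLO >= 0 && npLO < 0) return EventOrder::NLO;
  return EventOrder::Unknown;
}

// A particle with a shower starting scale above twice the centre-of-mass
// energy (both in GeV) is exempt from the MLM jet matching criterion.
bool exemptFromMLM(double startScale, double eCM) {
  return startScale > 2. * eCM;
}
```

An event with npLO = 2 and npNLO = −1 would thus be treated as a leading-order contribution with two final-state particles.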

MADGRAPH5_AMC@NLO further incorporates provisions for automatic NLO+PS matched calculations
within the MC@NLO approach. The interface between AMC@NLO and PYTHIA 8.3 typically
relies on phase-space points transmitted via LHEF. However, for special matching tasks, it is
possible to invoke PYTHIA 8.3 from within AMC@NLO. This is the case for the MC@NLO-∆ matching
prescription. The relevant FORTRAN code, wrapping PYTHIA 8.3 functionality, is shipped within
MADGRAPH5_AMC@NLO. PYTHIA 8.3 can be set up for use within MADGRAPH5_AMC@NLO by
setting the configuration flag Merging:runtimeAMCATNLOInterface. This then allows
MADGRAPH5_AMC@NLO direct access to select parts of PYTHIA 8.3’s internal merging machinery, to
e.g. enable the extraction of Sudakov form factors. A more detailed introduction may only be
relevant to experts in MADGRAPH5_AMC@NLO, and may be found in the online manual.

10.1.7 HELACONIA
While PYTHIA 8.3 has a complete collection of expandable quarkonia processes, see section 3.3,
it is sometimes necessary to generate quarkonia states at higher orders or with additional final-
state partons. Previous versions of MADGRAPH were able to produce arbitrary tree-level quarkonia
processes via MADONIA [408], but the current version of MADGRAPH no longer has this ability to
generate bound heavy-quark resonances. However, the standalone HELACONIA [409] package is
able to provide the same functionality as the MADONIA package, and beyond.
The program flow of HELACONIA is very similar to that of MADGRAPH. A PYTHON interface
is used to generate source code which is then compiled and run to produce LHEF output. This
output can then be provided to PYTHIA 8.3 to produce full events with parton showers, underlying
event, and particle decays. The HELACONIA syntax is modelled after the MADGRAPH syntax, and
consequently, the interface is similar. Unlike MADGRAPH, HELACONIA is not able to produce semi-
internal matrix elements inheriting from the SigmaProcess class. Instead, HELACONIA can be
interfaced either by directly providing LHEF output to PYTHIA 8.3, or using the LHAupHelaconia
class provided in Pythia8Plugins.
The LHAupHelaconia interface is very similar to that of LHAupMadgraph,

shared_ptr<LHAupHelaconia> helaconia =
make_shared<LHAupHelaconia>(&pythia, true, "helaconiarun", exe);
helaconia->readString("generate ho_process_syntax");
pythia.setLHAupPtr(helaconia);

where ho_process_syntax is the HELACONIA equivalent for the MADGRAPH process syntax.
The HELACONIA executable must be available via the string exe. Every time a PYTHIA 8.3 event is
generated, the plugin checks if an event is available from an LHEF file generated by HELACONIA.
If not, it will automatically run another batch of events. Random-number seeds and sampling are
consistently handled in the same way as for LHAupMadgraph.
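The refill-on-demand behaviour described above can be illustrated with a small standalone sketch. This is not the plugin's code: the class BatchedEventSource, the callable run_batch, and the choice of one new seed per batch are all hypothetical names and assumptions introduced only for illustration.

```python
class BatchedEventSource:
    """Serve events one at a time, running a new batch when the buffer is empty."""

    def __init__(self, run_batch, batch_size):
        # run_batch is a hypothetical callable: (n, seed) -> list of events.
        self.run_batch = run_batch
        self.batch_size = batch_size
        self.buffer = []
        self.n_batches = 0

    def next_event(self):
        if not self.buffer:
            # Assumed scheme: a fresh seed per batch, so batches are not identical.
            self.n_batches += 1
            self.buffer = list(self.run_batch(self.batch_size, seed=self.n_batches))
        return self.buffer.pop(0)


def fake_generator(n, seed):
    """Stand-in for an external generator run; returns labelled dummy events."""
    return [(seed, i) for i in range(n)]


source = BatchedEventSource(fake_generator, batch_size=3)
events = [source.next_event() for _ in range(7)]
```

Requesting seven events with a batch size of three triggers three batches, mirroring how another external run would only be started once the previous output has been used up.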

10.1.8 EVTGEN
For many experimental collaborations, particularly those specializing in B-physics, more detailed
hadron-decay models are needed than those provided by default in PYTHIA 8.3. The EVTGEN [410]
package specializes in B-hadron decays, including sophisticated models, spin correlations, and the
ability to implement new models. To include spin correlations EVTGEN does not just decay a single
particle at a time, but instead performs the entire decay tree for each given initial particle. Conse-
quently, decays from EVTGEN cannot be included in PYTHIA 8.3 via the provided DecayHandler
class, called during the decay stage of the hadron level, but must rather be performed after full
event generation. Such an interface for EVTGEN is supplied by the class EvtGenDecays provided
in Pythia8Plugins.
In B-physics, particularly at hadron colliders, one oftentimes wishes to produce a large sample
of events where each event contains one or more rare signal decays, e.g. Bs0 → µ+ µ− . The first
step, of course, is to generate an event with at least one signal particle candidate, while the second
step is to force the signal decay for one of these candidates. The weight for an event containing
one candidate with a forced signal decay is simply the branching fraction for the signal decay.
However, when multiple candidates are present, the event weight becomes slightly more complex,
requiring non-trivial bookkeeping. Consequently, the EvtGenDecays class in PYTHIA 8.3 provides
a generalized mechanism by which to force signal decays for given particle species, while still
providing an appropriate event weight.
Signal particle candidates, ci , do not all need to be the same particle species. Here, a particle
species differentiates not only between particle types, e.g. Bs0 and τ+ , but also between particles
and antiparticles, e.g. τ+ and τ− . Additionally, the signal decay for a candidate, with branching
fraction Bsig (ci ), can include multiple channels. Consequently, arbitrarily complex signal decays
can be forced. As an example, events can be required to contain one or more of the following
decays: τ+ → ν̄τ π+ , τ+ → ν̄τ π0 π+ , Bs0 → µ+ µ− , and τ− → ντ π− π− π− π+ π+ . Here, assuming
equal production of the three particle species (which is almost certainly not the case), the decay
τ+ → ν̄τ π0 π+ of the four signal decays will be the most commonly forced decay. Following this
notation, the event weighting is performed as follows.
1. An event is generated and all n signal particle candidates, ci , are found. If there are no
candidates, n = 0, then an event weight, Wevent , of 0 is returned.

2. If n > 0 then a candidate ci is randomly chosen with probability

$$P(c_i) = \frac{B_{\mathrm{sig}}(c_i)}{\sum_{j=1}^{m} \left(1 - B_{\mathrm{sig}}(c_j)\right)}, \qquad (397)$$

where Bsig (ci ) is the signal branching fraction for each candidate ci .

3. A channel is selected for the chosen candidate ci from one of the signal channels contributing
to Bsig (ci ).

4. Channels for all remaining candidates are selected, using all allowed decay channels, not
just the signal channels.

5. The number of candidates with a selected signal channel, m, is determined. The channel
selection for the candidates is then kept with probability 1/m. If the channel selection is
rejected, the algorithm returns to step 2 and a new set of channels is selected.

6. All candidates are decayed via their selected channel and

$$W_{\mathrm{event}} = 1 - \prod_{i=1}^{n} \left(1 - B_{\mathrm{sig}}(c_i)\right), \qquad (398)$$

is calculated as the event weight.

An unweighted sample of events can be obtained by randomly selecting events, each with probabil-
ity Wevent /Wmax . The maximum possible event weight, Wmax , can be determined by the maximum
weight from a sufficiently large sample of events.
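The steps above, together with the unweighting, can be sketched in a few lines of standalone PYTHON. This is only an illustration, not the EvtGenDecays implementation: the function names, the (name, branching fraction, is_signal) channel tuples, and the branching fractions are made up, and the sketch assumes that the candidate in step 2 is chosen with probability proportional to its signal branching fraction, normalized over all candidates.

```python
import math
import random


def signal_fraction(channels):
    """Sum of branching fractions over the signal channels of one candidate."""
    return sum(bf for _, bf, is_signal in channels if is_signal)


def pick_channel(channels, rng, signal_only=False):
    """Pick one (name, bf, is_signal) channel with probability proportional to bf."""
    pool = [ch for ch in channels if ch[2]] if signal_only else list(channels)
    return rng.choices(pool, weights=[ch[1] for ch in pool], k=1)[0]


def force_signal(candidates, rng):
    """Steps 1-6: return (selected channels, event weight) for one event."""
    b_sig = [signal_fraction(c) for c in candidates]
    if not candidates or sum(b_sig) == 0.0:
        return [], 0.0                                  # step 1: no candidates
    while True:
        # Step 2 (assumed normalization: probability proportional to B_sig).
        forced = rng.choices(range(len(candidates)), weights=b_sig, k=1)[0]
        # Step 3 for the forced candidate, step 4 for all the others.
        selected = [pick_channel(c, rng, signal_only=(i == forced))
                    for i, c in enumerate(candidates)]
        # Step 5: keep the selection with probability 1/m, else try again.
        m = sum(1 for ch in selected if ch[2])
        if rng.random() < 1.0 / m:
            break
    # Step 6: event weight, 1 minus the probability of no signal decay at all.
    weight = 1.0 - math.prod(1.0 - b for b in b_sig)
    return selected, weight


def unweight(events, w_max, rng):
    """Keep each (event, weight) pair with probability weight / w_max."""
    return [ev for ev, w in events if rng.random() < w / w_max]


# Illustrative numbers only, not physical branching fractions.
rng = random.Random(1)
tau = [("nu pi", 0.11, True), ("other", 0.89, False)]
bs0 = [("mu mu", 0.25, True), ("other", 0.75, False)]
channels, weight = force_signal([tau, bs0], rng)
```

For these two candidates the weight is 1 − 0.89 × 0.75 = 0.3325, independent of which channels were selected, since the weight depends only on the candidates found in the event.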
To use EVTGEN in PYTHIA 8.3, configure PYTHIA 8.3 with

./configure --with-evtgen[=path]

where path optionally provides the path to the EVTGEN installation. Note that EVTGEN itself
also links against PYTHIA 8.3, so in some cases it might be necessary to reconfigure PYTHIA 8.3
after installation of EVTGEN. A full example using EVTGEN is provided in main48. The general
syntax is,

EvtGenDecays evtgen(&pythia, dec, pdl);

pythia.next();
evtgen.decay();

where dec and pdl provide the paths to the EVTGEN decay and particle data files.

10.1.9 External random-number generators


When including PYTHIA in a larger software framework, using a single random-number generator
across all components is oftentimes required to ensure reproducible results. Consequently, an
external random-number-generator pointer may be passed for use by a given PYTHIA instance.

pythia.setRndmEnginePtr(rng);

Here, rng is a pointer to an instance of a user-defined random number generator derived from
the RndmEngine class. The only method that must be implemented by the user is flat, which
should return a random number uniformly distributed between 0 and 1. The example below
implements a linear congruential generator with a configurable seed, modulus, multiplier, and
increment.


class RandomLCG : public RndmEngine {
public:
  long int seed{1}, m{2147483648}, a{1103515245}, c{12345};

  // The only method that needs to be implemented.
  double flat() {
    seed = (a * seed + c) % m;
    return double(seed)/m;
  }
};
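To see what this recurrence produces, the same generator can be written as a standalone PYTHON class and checked numerically. This is a minimal sketch that only reproduces the arithmetic of the class above; it does not derive from RndmEngine and is not meant to be passed to PYTHIA.

```python
class RandomLCG:
    """Linear congruential generator: seed -> (a * seed + c) mod m."""

    def __init__(self, seed=1, m=2147483648, a=1103515245, c=12345):
        self.seed, self.m, self.a, self.c = seed, m, a, c

    def flat(self):
        """Advance the state and return a number in the range [0, 1)."""
        self.seed = (self.a * self.seed + self.c) % self.m
        return self.seed / self.m


rng = RandomLCG()
first = rng.flat()

# Crude sanity check: all values lie in [0, 1) and the mean is close to 1/2.
sample = [rng.flat() for _ in range(10000)]
mean = sum(sample) / len(sample)
```

Since the state is always smaller than the modulus m, the returned value can never reach 1. A check of the mean is of course no substitute for a proper test of randomness.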

Typically, the RndmEngine class can be used to wrap some other random number generator.
An example of this is the MixMaxRndm class, which is a wrapper for an implementation of the
MIXMAX algorithm [11].

#include "Pythia8Plugins/MixMax.h"
MixMaxRndm rng(0, 0, 0, 123);
pythia.setRndmEnginePtr(&rng);

The arguments to the generator constructor are four seed values. While the ability to provide
an external random-number generator is useful, it should be treated with care. Some
pseudo-random-number generators implemented in standard packages are not sufficient for
large-scale generation, e.g. the CLHEP implementation of the RANLUX algorithm. Consequently, when
possible, the default random number generator in PYTHIA, based on the RANMAR implementation
of the Marsaglia-Zaman algorithm [9], is recommended and sufficient for most physics purposes,
see section 2.2.1.

10.2 Output formats


PYTHIA comes with a set of example main programs, and in most of these the analysis of the
produced event is performed directly in the code there. It is also possible to output the events to be
analyzed by interfacing to external programs and code. For this purpose PYTHIA can communicate
its events with different output formats as described in this subsection.

10.2.1 HEPMC versions 2 and 3


The standard format for communicating fully generated events is called HEPMC [411, 412] and
defines a set of C++ classes to describe an event and all particles therein. Internally, the particles
are connected by vertex objects using pointers.
The latest version of the HEPMC code is not yet adopted by all LHC collaborations, and PYTHIA
therefore supports both version 2 (2.06 and later) and version 3. The interface as such is
provided at the header-file level using Pythia8Plugins/HepMC2.h or Pythia8Plugins/HepMC3.h,
and the PYTHIA code itself does not have any dependencies on these. This means that the PYTHIA
(shared) library can be built independently of which version of HEPMC should be used. However,
if one wishes to use the example main programs that show how to use HEPMC (these can be found
in the online manual under Getting Started → Examples by Keyword, search for Hepmc), PYTHIA
must be configured according to

./configure --with-hepmc3=/path/to/hepmc/installation

or

./configure --with-hepmc2=/path/to/hepmc/installation

Besides the particles, other information will also be transferred to the HEPMC format, such as
cross sections, parton density information, and different weights (see section 9.8). Note, however,
that not all information in the Pythia8::Event is preserved in the HEPMC output. Notably, the
status codes for particles in HEPMC are only set to 1 (final state particle), 2 (decayed standard
model hadron or τ or µ), 4 (incoming beam), or a number in the range 11–200 (generator depen-
dent status of an intermediate particle, given by the absolute value of the corresponding PYTHIA
status code).
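The status-code reduction described above can be summarized as a small lookup function. This is a sketch of the stated rules only, not the code of the interface: the function name and the two boolean flags are hypothetical stand-ins for checks made on the particle itself, the sketch assumes the PYTHIA convention that positive status codes denote final-state particles, and the fallback value of 0 for any other code is an assumption.

```python
def hepmc_status(pythia_status, decayed_hadron_tau_or_mu=False, incoming_beam=False):
    """Map a PYTHIA status code to a HEPMC status code, following the text above."""
    if pythia_status > 0:
        return 1                       # final-state particle
    if incoming_beam:
        return 4                       # incoming beam
    if decayed_hadron_tau_or_mu:
        return 2                       # decayed hadron, tau, or muon
    if 11 <= abs(pythia_status) <= 200:
        return abs(pythia_status)      # generator-dependent intermediate status
    return 0                           # not covered by the text (assumed fallback)


# A few example mappings, using status codes chosen purely for illustration.
examples = {
    "final": hepmc_status(83),
    "beam": hepmc_status(-12, incoming_beam=True),
    "decayed": hepmc_status(-83, decayed_hadron_tau_or_mu=True),
    "intermediate": hepmc_status(-23),
}
```

The point of the sketch is that the detailed history encoded in the sign and value of the PYTHIA status code is only partly recoverable from the HEPMC record.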

10.2.2 Histograms with the YODA package


Even though the built-in histogram package might suffice for the most basic use cases, such as
one-dimensional histograms, most users require more advanced capabilities. Since PYTHIA 8.3
is not a statistics or plotting package, we refer the user to external programs. For slightly more
advanced use cases, we recommend interfacing to the YODA14 histogram package. If installed,
PYTHIA 8.3 can be configured with --with-yoda=/path/to/yoda, which allows the user to
create Makefile recipes with access to YODA histograms easily. The YODA package will then be
accessible in PYTHIA 8.3 as any other C++ library can be accessed. Questions regarding the YODA
histogram package should be addressed to the YODA authors.

10.2.3 Interfacing with ROOT


For more advanced analyses, many users prefer the ROOT [413] package. PYTHIA 8.3 provides
several possibilities to interface with ROOT, version 6 or higher. Use cases can roughly be grouped
into three categories:

1. Using ROOT as a histogram package inside PYTHIA 8.3.

2. Using PYTHIA 8.3 to generate ROOT events or “n-tuples”, which can be post-processed by
ROOT.

3. Steering PYTHIA 8.3 from inside a ROOT-based framework.

We will here briefly cover the first two use cases, but refer the user to the ROOT documentation
for using the PYTHIA 8.3 interface in ROOT, where it is extensively documented.
The simplest use case is of the first category which, from a technical point of view, is not
too different from using any other C++ library along with PYTHIA 8.3. In the example main91,
it is shown how to declare a ROOT TApplication environment and ROOT TH1F histograms, to
be filled by PYTHIA 8.3, and displayed on screen. The crucial part is the Makefile recipe. If
PYTHIA 8.3 is configured --with-root, convenient variables pointing to the ROOT libraries and
the root-config script can be used as shown, to compile a main program with the necessary
linking to ROOT libraries. The generated histograms can then be saved to a .root file for later
access.

14 See https://yoda.hepforge.org/.

Most users already familiar with ROOT would rather store event files generated with PYTHIA 8.3
(so-called “n-tuples”) on disk, which can then be post-processed with a ROOT-centric analysis
framework, often with auxiliary packages provided by a large experiment. In such cases, examples
main92 and main93 can serve as inspiration. The main92 example shows how to store full
events into a ROOT TTree. For most realistic use cases, this is not very practical, as such files
will quickly grow large, containing a significant amount of information which is of little rele-
vance to the user. The main93 program provides a more streamlined interface. In the header
file main93.h, two classes RootTrack and RootEvent are defined. Those classes define what
information about each track (including track-level cuts, e.g. desired acceptance) as well as each
event, should be stored in an output ROOT-file. If PYTHIA 8.3 has been configured with ROOT, the
main93 example can be run with an input .cmnd file with the flag Main:writeRoot = on, and
the desired information will be stored. If changes are made to the header file, main93 must be
recompiled. For both main92 and main93, the compilation recipe in the Makefile is the most
difficult part to set up, as both require generation of compiled and linked ROOT dictionary libraries
with CINT. Users wishing to go beyond simple extensions of the given examples are encouraged
to study the existing Makefile recipes, as well as the ROOT documentation on Linkdef.h. It is
kindly requested that queries about ROOT dictionary library generation are directed to the ROOT
authors.

10.3 Analysis tools


The tools included in PYTHIA are normally enough for doing simple analyses of the generated
events, but for more complicated analyses, or if direct comparison with data is wanted, the user
needs to interface to external tools. Here we describe some of these interfaces.

10.3.1 RIVET versions 2 and 3


The RIVET package [414,415] is probably the most convenient way of comparing event-generator
models to experimental data. The program includes a large collection of experimental analyses
encoded (usually by the experiments themselves) in C++ classes that read HEPMC input and pro-
duce YODA files that can be plotted together with the experimental data points (also provided by
the experiments through HEPDATA [416, 417]).
Since RIVET only needs HEPMC input, the only thing needed for PYTHIA is to write the events to
a HEPMC file (see section 10.2.1) or a pipe (which is recommended to avoid creating unnecessarily
large files), and have RIVET take this as input. Assuming a main PYTHIA program, mymain-hepmc,
that simply writes HEPMC to the standard output, the commands to do this are

mkfifo hepmc-pipe
./mymain-hepmc > hepmc-pipe &
rivet -a SomeAnalysis hepmc-pipe
rivet-mkhtml Rivet.yoda

where the last command will produce formatted web pages in the rivet-plots subdirectory,
with the comparisons to data.
In PYTHIA there is also a more direct way of calling RIVET from within a main program. This
uses the header file Pythia8Plugins/Pythia8Rivet.h and provides simple shortcuts, as shown in
some of the provided example main programs.15 To enable this, PYTHIA must be configured using

./configure --with-rivet=/path/to/rivet/installation

together with the corresponding --with-hepmc2 or --with-hepmc3 option, matching the version of
HEPMC that RIVET was configured with.
PYTHIA currently supports direct linking with both versions 2 and 3 of RIVET. The additional
features in the later version include the possibility of using different weights (see section 9.8),
several heavy-ion specific features [418], and the ability to pass options to the analyses. Support
for version 2 of RIVET will likely be dropped in the future.

10.3.2 FASTJET
The fjcore code is distributed together with the PYTHIA 8.3 code by permission from the authors.
There is also an interface that inputs PYTHIA 8.3 events into the full FASTJET library, for access to
a wider set of methods, but then FASTJET must be linked by using

./configure --with-fastjet=/path/to/fastjet/installation \
    --with-fastjetlib=/path/to/fastjet/library

Among PYTHIA 8.3 example main programs, main71.cc shows in the case of W plus jet pro-
duction, how the FASTJET package can be used for analysis of the final state, and main80.cc
performs CKKW-L merging with a merging scale defined in k⊥ , with main80.cmnd and LHE files
as input. Also, main72.cc compares QCD jet finding in SlowJet and FASTJET, using the header
file FastJet3.h present in the directory Pythia8Plugins contributed by Gavin Salam [419].

10.4 Computing environments


PYTHIA has been developed as a C++ library to write and compile programs to execute standalone
on a generic ∗nix operating system on a generic computer. However, we address here the rise
in popularity of PYTHON as a development language and a powerful tool in machine-learning
applications.

10.4.1 PYTHON interface


To meet the growing requirements of a large user base, PYTHIA includes a flexible PYTHON interface
to most frequently used classes, and thus allows a user to write a PYTHIA main program entirely
in PYTHON. This provides the user direct access to the wealth of analysis and visualization tools,
available through PYTHON libraries, all at run time. A number of PYTHON examples are provided,
each a direct translation of their corresponding C++ counterpart. The interface is generated with
BINDER using the PYBIND11 template library. The specific version of BINDER and PYBIND11 needed
to generate the interface is provided through a small DOCKER container.
The default interface is a simplified one, with only the core PYTHIA functionality available.
This interface is a trade-off between usability and remaining lightweight. The top-level Pythia
class is available, as well as all relevant Event, ParticleData, and analysis-tool related classes.
An important feature of the interface is that it is bi-directional: derived classes in PYTHON can
be passed back to PYTHIA. This is useful, for example, to create a UserHooks derived class (see
section 9.7.2). All user interface classes, typically passed to the main Pythia object via pointers
in the standard C++ code, are available through the simplified interface.

15 The example main programs can be found in the online manual under Getting Started → Examples
by Keyword, search for Rivet.

A full PYTHON interface can also be generated by the user. Only DOCKER is required to enable
the generation of a new PYTHON interface to PYTHIA. The following generates the full interface.

cd plugins/python
./generate --full

It is also possible to generate a user-defined interface which is tailored to a specific use case
via the flag --user=FILE instead. Here, FILE is a BINDER configuration file specified by the user.
Note that whenever changes are made to the PYTHIA C++ headers, the PYTHON interface must be
generated again, whether simplified, full, or user defined.
Installation of the PYTHON interface requires the Python.h header to be available. The
python-config script can be used to find the relevant paths when configuring PYTHIA. An ex-
ample configuration for PYTHIA with PYTHON 3.6 could then be:

./configure --with-python-config=python3.6-config

This would configure PYTHIA to be built with the default interface, using PYTHON 3.6. After
configuring, the compiled PYTHIA module is available in the lib/ directory under the top level
PYTHIA directory. The PYTHON installation must have that directory made available, e.g. by setting:

export PYTHONPATH=$(pwd)/lib/:$PYTHONPATH

from the top level PYTHIA directory. After compiling with make, the PYTHON interface should
be available. The following example loads the PYTHIA PYTHON module and prints the internal
documentation which includes the available classes, as well as some of the not-so-obvious features.

>>> import pythia8


>>> help(pythia8)

One of the main reasons for the PYTHON interface is the fast development of a standalone main
program in PYTHON rather than C++, allowing for an environment of external tools, which the user
might be more familiar with. As an example of such a program, consider the short PYTHON script
below, which will run PYTHIA to produce a numpy histogram containing the distribution of charged
hadron multiplicity at mid-pseudorapidity in proton collisions at LHC energies.


# Wrapper around numpy histogram to allow fill functionality.
import numpy as np

class HistoFiller(object):
    def __init__(self, bins):
        self.bins = bins
        self.hist, edges = np.histogram([], bins=bins, weights=[])
        # Bin widths and bin centres, used for normalization and plotting.
        self.widths = [edges[i+1] - edges[i] for i in range(len(edges)-1)]
        self.xvals = [(edges[i+1] + edges[i])/2. for i in range(len(edges)-1)]

    def fill(self, val, w=1.0):
        hist, edges = np.histogram(val, bins=self.bins, weights=w)
        self.hist += hist

    def get(self):
        # Normalize to unit area; errors assume Poisson statistics per bin.
        scale = 1./sum(self.hist)
        return ([h/w*scale for h, w in zip(self.hist, self.widths)],
                [np.sqrt(h)/w*scale for h, w in zip(self.hist, self.widths)])

# Set up Pythia and declare histogram.
import pythia8
pythia = pythia8.Pythia()
pythia.readString("SoftQCD:all = on")
pythia.init()
mult = HistoFiller([3.*x for x in range(20)])

# Event loop. Find particles and fill histogram.
for iEvent in range(1000000):
    if not pythia.next(): continue
    nCharged = 0
    for p in pythia.event:
        if p.isFinal() and p.isHadron() and p.isCharged():
            nCharged += 1
    mult.fill(nCharged)

# Plot the histogram using the matplotlib library.
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(111)
y, ye = mult.get()
ax.errorbar(mult.xvals, y, xerr=[w/2. for w in mult.widths],
            yerr=ye, drawstyle='steps-mid', fmt='-', color='black')
ax.set_xlabel(r'$dN_{ch}/d\eta$')
ax.set_ylabel(r'$P(dN_{ch}/d\eta)$')
plt.show()


Part IV
Summary and Outlook
Our goal in writing this manual was to provide reference material for users and developers of
PYTHIA 8.3. We provided some basic content that we considered mandatory, such as defining
what an event generator is, how our code structure reflects the physics, and what sorts of
numerical methods we use in the program. This is covered in part I. The core of the manual,
provided in part II, describes in detail the phenomena that are simulated and our assumptions
and approximations. Parton showers and hadronic or nuclear physics are covered in more detail
because these have been the arenas of more recent development. Other topics are covered more
liberally in the HEP literature, and we hope to have provided enough outside references. What is
somewhat new compared to other PYTHIA manuals is part III, dedicated to the user of PYTHIA 8.3.
Our aim was not to give the user an easy way to skip the description of physics, but to facilitate
the use of the program in real analyses and investigations. This part of the manual is the most
pragmatic, but also the one most susceptible to acronyms, initialisms, and jargon. It is also the
most technical in describing our and others’ computer code.
This manual is a snapshot of an evolving entity. Within a short period of our concluding state-
ments, new developments will arise that are not covered in this manual. We hope this continues,
even as we pass the torch to the next generation of PYTHIA authors and contributors.
James D. Bjorken (“BJ” to his generation) wrote of the “tyranny” of Monte Carlo in a short
paragraph of a larger editorial on the future of particle physics in 1992 [420]. He lamented the
fact that Monte-Carlo predictions were taken as the truth, even though most of the prediction was
a black box. Had he read this manual in 2022, we hope he would understand that the authors
have provided a code that is more democratic, and allows users to liberally test ideas, but within
well-defined boundaries. As such, there is no single PYTHIA prediction to compare to data.


Acknowledgements
A large number of people should be thanked for their contributions to the PYTHIA 8.3 event gen-
erator.
First of all, Bo Andersson and Gösta Gustafson are the originators of the Lund model, and
have strongly influenced the development of both early code versions and also recent model addi-
tions. Hans-Uno Bengtsson should furthermore be acknowledged as the originator of the PYTHIA
program.
Some made contributions dating way back in the program’s history, others more recently.
While praise for the contributions should go to the contributors, blame for mistakes made in mod-
ifications to the original code, or failure to keep it up-to-date, should rest with the core authors.

Former authors
Former PYTHIA 8 authors, who are no longer active in the field, are: Stefan Ask, Jesper Roy
Christiansen, Richard Corke, Nadine Fischer, and Christine O. Rasmussen. The merging of VINCIA
into PYTHIA 8.3 brought with it further significant author contributions from Helen Brooks in
particular.

Further contributions
The program has received many smaller and larger contributions and bug reports over time, from
users too numerous to mention here. They are mentioned in the online update notes as the bug
fixes go in, and are all gratefully acknowledged.
In particular, contributions from the following should be mentioned: Baptiste Cabouat for de-
veloping and implementing the initial-final dipole approach, Silvia Ferreres-Solé for implementing
the space-time hadronic production points in string fragmentation, and Tomas Kasemets for im-
plementation of new proton PDFs.
Code contributions from the following collaborators and users are also gratefully acknowl-
edged: O. Alvestad, S. Baker, B. Bellenot, R. Brun, A. Buckley, M. Cacciari, L. Carloni, S. Carrazza,
R. Ciesielski, V. Hirschi, N. Hod, H. Hoeth, J. Huston, M. Kirsanov, A. Larkoski, B. Lloyd, J. Lopez-
Villarejo, O. Mattelaer, M. Montull, A. Morsch, A. Naumann, S. Navin, P. Newman, M. Ritzmann,
J. Rojo, G. Salam, K. Savvidy, G. Savvidy, A. Singh, G. Soyez, M. Sutton, R. Thorne, and G. Watt.
We also thank J. Altmann and T. Garnett for correction of typos in this manuscript.
Finally, vigilant code tests of PYTHIA releases by Mikhail Kirsanov, Dimitri Konstantinov, and
Vittorio Zecca are gratefully acknowledged.

Financial support
The Lund and Monash groups have received financial support from the EU H2020 Marie Skłodowska-
Curie Innovative Training Network MCnetITN3, grant agreement 722104.

The Lund group has also received funding from the European Research Council (ERC) under
the European Union’s Horizon 2020 research and innovation programme, grant agreement No
668679 (MorePheno), and from the Swedish Research Council, contract number 2016-05996.

The Jyväskylä group (IH and MU) has been funded as a part of the CoE in Quark Matter of the
Academy of Finland.


CB and LL acknowledge support from the Knut and Alice Wallenberg foundation, contract number
2017.0036.

SC and LL acknowledge support from the Swedish Research Council, contract number 2020-
04869.

ND acknowledges support from the Science and Engineering Research Board, Government of In-
dia under Ramanujan Fellowship SB/S2/RJN-070.

SM is supported by the Fermi Research Alliance, LLC under Contract No. DE-AC02-07CH11359
with the U.S. Department of Energy, Office of Science, Office of High Energy Physics.

PS acknowledges support from the Australian Research Council via Discovery Project DP170100708
— “Emergent Phenomena in Quantum Chromodynamics”.

RV acknowledges support from the European Research Council (ERC) under the European Union’s
Horizon 2020 research and innovation programme (grant agreement No. 788223, PanScales), and
from the Science and Technology Facilities Council (STFC) under the grant ST/P000274/1.

CTP acknowledges support from the Swiss National Science Foundation (SNF) under contract
200021-197130, the Monash Graduate Scholarship, the Monash International Postgraduate Re-
search Scholarship, and the J. L. William Scholarship.

IH acknowledges support from the Academy of Finland, project numbers 308301 and 331545,
and from the Carl Zeiss Foundation.

MU acknowledges support from the Academy of Finland, project number 336419.

PI acknowledges support from the United States National Science Foundation (NSF) via grant
NSF OAC-2103889.


Appendices
A Full list of internal processes

A.1 Standard model processes

Table 8: List of internal soft QCD processes, see section 6.1 for details and references.

process          internal name                  code
                 SoftQCD:all
A B → X          SoftQCD:nonDiffractive         101
A B → A B        SoftQCD:elastic                102
A B → X B        SoftQCD:singleDiffractiveXB    103
A B → A X        SoftQCD:singleDiffractiveAX    104
A B → X1 X2      SoftQCD:doubleDiffractive      105
A B → A X B      SoftQCD:centralDiffractive     106
                 SoftQCD:singleDiffractive      104, 103
                 SoftQCD:inelastic              101, 103, 104, 105, 106


Table 9: List of internal hard QCD processes, see section 3.1 for details.

process            internal name               code   refs.
                   HardQCD:all
g g → g g          HardQCD:gg2gg               111    [421–423]
g g → q q̄          HardQCD:gg2qqbar            112    [421–423]
q g → q g          HardQCD:qg2qg               113    [421–423]
q q′ → q q′        HardQCD:qq2qq               114    [421–424]
q q̄ → g g          HardQCD:qqbar2gg            115    [421–423]
q q̄ → q′ q̄′        HardQCD:qqbar2qqbarNew      116    [421–424]
g g → c c̄          HardQCD:gg2ccbar            121    [425]
q q̄ → c c̄          HardQCD:qqbar2ccbar         122    [425]
g g → b b̄          HardQCD:gg2bbbar            123    [425]
q q̄ → b b̄          HardQCD:qqbar2bbbar         124    [425]
g g → g g g        HardQCD:gg2ggg              131    [426]
q q̄ → g g g        HardQCD:qqbar2ggg           132    [426]
q g → q g g        HardQCD:qg2qgg              133    [426]
q q′ → q q′ g      HardQCD:qq2qqgDiff          134    [426]
q q → q q g        HardQCD:qq2qqgSame          135    [426]
q q̄ → q′ q̄′ g      HardQCD:qqbar2qqbargDiff    136    [426]
q q̄ → q q̄ g        HardQCD:qqbar2qqbargSame    137    [426]
g g → q q̄ g        HardQCD:gg2qqbarg           138    [426]
q g → q q′ q̄′      HardQCD:qg2qqqbarDiff       139    [426]
q g → q q q̄        HardQCD:qg2qqqbarSame       140    [426]

Table 10: List of internal low-energy QCD processes, see section 6.1.5 for details.

process internal name code refs.


LowEnergyQCD:all
AB → X LowEnergyQCD:nonDiffractive 151 [214]
AB → AB LowEnergyQCD:elastic 152 [214]
AB → X B LowEnergyQCD:singleDiffractiveXB 153 [214]
AB → AX LowEnergyQCD:singleDiffractiveAX 154 [214]
A B → X1 X2 LowEnergyQCD:doubleDiffractive 155 [214]
N N → N∗ N LowEnergyQCD:excitation 157 [214]
B B̄ → X LowEnergyQCD:annihilation 158 [214]
AB → R LowEnergyQCD:resonant 159 [214]


Table 11: List of internal weak-boson processes, see section 3.2 for details.

process                  internal name                          code   refs.
                         PromptPhoton:all
q g → q γ                PromptPhoton:qg2qgamma                 201    [423, 427]
q q̄ → g γ                PromptPhoton:qqbar2ggamma              202    [423, 427]
g g → g γ                PromptPhoton:gg2ggamma                 203    [428–430]
q q̄ → γ γ                PromptPhoton:ffbar2gammagamma          204    [429]
g g → γ γ                PromptPhoton:gg2gammagamma             205    [428–430]
                         WeakBosonExchange:all                         [431]
f f′ → f f′              WeakBosonExchange:ff2ff(t:gmZ)         211
f1 f2 → f3 f4            WeakBosonExchange:ff2ff(t:W)           212
                         WeakSingleBoson:all                           [424]
f f̄ → γ∗/Z               WeakSingleBoson:ffbar2gmZ              221
f f̄′ → W±                WeakSingleBoson:ffbar2W                222
f f̄ → γ∗ → f′ f̄′         WeakSingleBoson:ffbar2ffbar(s:gm)      223    [423, 424]
f f̄ → γ∗/Z → f′ f̄′       WeakSingleBoson:ffbar2ffbar(s:gmZ)     224    [423, 424]
f1 f̄2 → W± → f3 f̄4       WeakSingleBoson:ffbar2ffbar(s:W)       225    [423, 424]
                         WeakDoubleBoson:all
f f̄ → γ∗/Z γ∗/Z          WeakDoubleBoson:ffbar2gmZgmZ           231    [25, 424]
f f̄′ → Z W±              WeakDoubleBoson:ffbar2ZW               232    [25, 424]
f f̄ → W+ W−              WeakDoubleBoson:ffbar2WW               233    [424, 432]
                         WeakBosonAndParton:all
q q̄ → γ∗/Z g             WeakBosonAndParton:qqbar2gmZg          241    [424]
q g → γ∗/Z q             WeakBosonAndParton:qg2gmZq             242    [424]
f f̄ → γ∗/Z γ             WeakBosonAndParton:ffbar2gmZgm         243    [424]
f γ → γ∗/Z f             WeakBosonAndParton:fgm2gmZf            244    [433]
q q̄ → W± g               WeakBosonAndParton:qqbar2Wg            251    [424]
q g → W± q               WeakBosonAndParton:qg2Wq               252    [424]
f f̄ → W± γ               WeakBosonAndParton:ffbar2Wgm           253    [424, 434]
f γ → W± f               WeakBosonAndParton:fgm2Wf              254    [433]


Table 12: List of internal photon-collision processes; the second code in parentheses is
used to separate photons from beam A and beam B when both are possible, see section 3.2
for details.

process          internal name                  code         refs.
                 PhotonCollision:all                         [435]
γ γ → q q̄        PhotonCollision:gmgm2qqbar     261
γ γ → c c̄        PhotonCollision:gmgm2ccbar     262
γ γ → b b̄        PhotonCollision:gmgm2bbbar     263
γ γ → e+ e−      PhotonCollision:gmgm2ee        264
γ γ → µ+ µ−      PhotonCollision:gmgm2mumu      265
γ γ → τ+ τ−      PhotonCollision:gmgm2tautau    266
                 PhotonParton:all
g γ → q q̄        PhotonParton:ggm2qqbar         271 (281)    [436]
g γ → c c̄        PhotonParton:ggm2ccbar         272 (282)    [437]
g γ → b b̄        PhotonParton:ggm2bbbar         273 (283)    [437]
q γ → q g        PhotonParton:qgm2qg            274 (284)    [436]
q γ → q γ        PhotonParton:qgm2qgm           275 (285)    [436]


Table 13: List of internal top-production processes and production of fourth-generation
fermions. Expressions are from ref. [14].

process                            internal name                                 code
                                   Top:all
g g → t t̄                          Top:gg2ttbar                                  601
q q̄ → t t̄                          Top:qqbar2ttbar                               602
q q → t q                          Top:qq2tq(t:W)                                603
f f̄ → γ/Z → t t̄                    Top:ffbar2ttbar(s:gmZ)                        604
f f̄ → W± → t q̄                     Top:ffbar2tqbar(s:W)                          605
γ γ → t t̄                          Top:gmgm2ttbar                                606
g γ → t t̄                          Top:ggm2ttbar                                 607
                                   FourthBottom:all
g g → b′ b̄′                        FourthBottom:gg2bPrimebPrimebar
q q̄ → b′ b̄′                        FourthBottom:qqbar2bPrimebPrimebar            801
f f → b′ q (t-channel W)           FourthBottom:qq2bPrimeq(t:W)                  803
f f̄ → b′ b̄′ (s-channel γ/Z)        FourthBottom:ffbar2bPrimebPrimebar(s:gmZ)     804
f f̄′ → b′ q̄ (s-channel W)          FourthBottom:ffbar2bPrimeqbar(s:W)            805
f f̄′ → b′ t̄ (s-channel W)          FourthBottom:ffbar2bPrimetbar(s:W)            806
                                   FourthTop:all
g g → t′ t̄′                        FourthTop:gg2tPrimetPrimebar                  821
q q̄ → t′ t̄′                        FourthTop:qqbar2tPrimetPrimebar               822
f f → t′ q (t-channel W)           FourthTop:qq2tPrimeq(t:W)                     823
f f̄ → t′ t̄′ (s-channel γ/Z)        FourthTop:ffbar2tPrimetPrimebar(s:gmZ)        824
f f̄′ → t′ q̄ (s-channel W)          FourthTop:ffbar2tPrimeqbar(s:W)               825
f f̄′ → t′ b̄′ (s-channel W)         FourthPair:ffbar2tPrimebPrimebar(s:W)         841
f f̄′ → τ′ ν̄′ (s-channel W)         FourthPair:ffbar2tauPrimenuPrimebar(s:W)      842


Table 14: List of internal SM-Higgs production processes. See section 3.5 for details.

process internal name code


HiggsSM:all
ff → HSM HiggsSM:ffbar2H 901
gg → HSM HiggsSM:gg2H 902
γγ → HSM HiggsSM:gmgm2H 903
ff → HSM Z HiggsSM:ffbar2HZ 904
ff → HSM W HiggsSM:ffbar2HW 905
ff → HSM ff (ZBF) HiggsSM:ff2Hff(t:ZZ) 906
ff → HSM ff (WBF) HiggsSM:ff2Hff(t:WW) 907
gg → HSM tt HiggsSM:gg2Httbar 908
qq → HSM tt HiggsSM:qqbar2Httbar 909

qg → HSM q HiggsSM:qg2Hq 911


gg → HSM bb HiggsSM:gg2Hbbbar 912
qq → HSM bb HiggsSM:qqbar2Hbbbar 913
gg → HSM g HiggsSM:gg2Hg(l:t) 914
qg → HSM q HiggsSM:qg2Hq(l:t) 915
qq → HSM g HiggsSM:qqbar2Hg(l:t) 916


A.2 Beyond-the-Standard-Model processes

Table 15: List of internal SUSY particle production processes. Expressions are from refs. [82,
438, 439]. Particular flavour states can be selected using idA and idB; see section 3.6
or the online manual for details.

process internal name


SUSY:all
gg → g̃ g̃ SUSY:gg2gluinogluino
qq → g̃ g̃ SUSY:qqbar2gluinogluino
qg → q̃ g̃ SUSY:qg2squarkgluino
gg → q̃i q̃∗j SUSY:gg2squarkantisquark
qq → q̃i q̃∗j SUSY:qqbar2squarkantisquark
qq → q̃i q̃∗j (No EW) SUSY:qqbar2squarkantisquark:onlyQCD
qq → q̃i q̃j SUSY:qq2squarksquark
qq → q̃i q̃j (No EW) SUSY:qq2squarksquark:onlyQCD
qq → χ̃i0 χ̃j0 SUSY:qqbar2chi0chi0
qq → χ̃i± χ̃j0 SUSY:qqbar2chi+-chi0
qq → χ̃i± χ̃j∓ SUSY:qqbar2chi+chi-

qg → q̃χ̃i0 SUSY:qg2chi0squark
qg → q̃χ̃i± SUSY:qg2chi+-squark
qq → χ̃i0 g̃ SUSY:qqbar2chi0gluino
qq → χ̃i± g̃ SUSY:qqbar2chi+-gluino
ff → ℓ̃i ℓ̃∗j SUSY:qqbar2sleptonantislepton
qi qj → q̃k∗ SUSY:qq2antisquark


Table 16: List of internal BSM-Higgs production processes. See section 3.5 for details.
Expressions from refs. [440, 441].

process internal name code


HiggsBSM:all
(replace H1 with H2 or A3)
HiggsBSM:allH1
ff → H1 (H2 , A3 ) HiggsBSM:ffbar2H1 1001, 1021, 1041
gg → H1 (H2 , A3 ) HiggsBSM:gg2H1 1002, 1022, 1042
γγ → H1 (H2 , A3 ) HiggsBSM:gmgm2H1 1003, 1023, 1043
ff → H1 (H2 , A3 )Z HiggsBSM:ffbar2H1Z 1004, 1024, 1044
ff → H1 (H2 , A3 )W HiggsBSM:ffbar2H1W 1005, 1025, 1045
ff → H1 (H2 , A3 )ff (ZBF) HiggsBSM:ff2H1ff(t:ZZ) 1006, 1026, 1046
ff → H1 (H2 , A3 )ff (WBF) HiggsBSM:ff2H1ff(t:WW) 1007, 1027, 1047
gg → H1 (H2 , A3 )tt HiggsBSM:gg2H1ttbar 1008, 1028, 1048
qq → H1 (H2 , A3 )tt HiggsBSM:qqbar2H1ttbar 1009, 1029, 1049

HiggsBSM:allH+-
ff → H± HiggsBSM:ffbar2H+- 1061
bg → H± t HiggsBSM:bg2H+-t 1062

HiggsBSM:allHpair
ff → A3 H1 HiggsBSM:ffbar2A3H1 1081
ff → A3 H2 HiggsBSM:ffbar2A3H2 1082
ff → H± H1 HiggsBSM:ffbar2H+-H1 1083
ff → H± H2 HiggsBSM:ffbar2H+-H2 1084
ff → H+ H− HiggsBSM:ffbar2H+H- 1085

qg → H1 (H2 , A3 )q HiggsBSM:qg2H1q 1011, 1031, 1051


gg → H1 (H2 , A3 )bb HiggsBSM:gg2H1bbbar 1012, 1032, 1052
qq → H1 (H2 , A3 )bb HiggsBSM:qqbar2H1bbbar 1013, 1033, 1053
gg → H1 (H2 , A3 )g HiggsBSM:gg2H1g(l:t) 1014, 1034, 1054
qg → H1 (H2 , A3 )q HiggsBSM:qg2H1q(l:t) 1015, 1035, 1055
qq → H1 (H2 , A3 )g HiggsBSM:qqbar2H1g(l:t) 1016, 1036, 1056
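The rows of Table 16 that list three codes follow a regular pattern: the H2 and A3 entries are obtained from the H1 entry by replacing H1 in the setting name, and the listed codes differ by 20 and 40, respectively. A minimal sketch of that bookkeeping is given below. The helper bsm_higgs_variant is hypothetical and not part of PYTHIA; it only reproduces the naming and numbering pattern visible in the table, and is not meant for the HiggsBSM:allHpair rows.

```python
from typing import Tuple

# Code offsets relative to the H1 entry, as read off Table 16 (assumption:
# the pattern holds for every row that lists three codes).
OFFSETS = {"H1": 0, "H2": 20, "A3": 40}


def bsm_higgs_variant(h1_name: str, h1_code: int, state: str) -> Tuple[str, int]:
    """Return (setting name, process code) for the requested Higgs state.

    Hypothetical helper, not a PYTHIA function: it substitutes the state
    label into the H1 setting name and shifts the H1 process code.
    """
    if state not in OFFSETS:
        raise ValueError(f"unknown Higgs state: {state}")
    if "H1" not in h1_name:
        raise ValueError("expected a setting name containing 'H1'")
    return h1_name.replace("H1", state), h1_code + OFFSETS[state]


if __name__ == "__main__":
    # List the three gluon-fusion entries of the table.
    for state in ("H1", "H2", "A3"):
        name, code = bsm_higgs_variant("HiggsBSM:gg2H1", 1002, state)
        print(name, code)
```

The resulting names are the strings one would switch on in a settings file; the codes are only labels used to identify the process in the event record and statistics.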


Table 17: List of internal processes for dark matter. See section 3.8 and ref. [442] for
details.

process internal name code


gg → χ χ̄ DM:gg2S2XX 6011
gg → χ χ̄ j DM:gg2S2XXj 6012
ff → χ χ̄ DM:ffbar2Zp2XX 6001
ff → χ χ̄ j DM:ffbar2Zp2XXj 6002
qg → χ χ̄ j DM:qg2Zp2XXj 6003
ff → Z′ H DM:ffbar2ZpH 6004
qq → Ψ Ψ̄ DM:qqbar2DY 6020

Table 18: List of internal processes mediated by new gauge bosons or leptoquarks. See
section 3.9.

process internal name code refs.


ff → γ/Z/Z′ NewGaugeBoson:ffbar2gmZZprime 3001 [443]
ff → W′ NewGaugeBoson:ffbar2Wprime 3021 [443]
ff → R0 NewGaugeBoson:ffbar2R0 3041 [443]

LeftRightSymmmetry:all [444]
ff → ZR LeftRightSymmmetry:ffbar2ZR 3101
ff → W′ LeftRightSymmmetry:ffbar2WR 3102
ℓℓ̄ → HL LeftRightSymmmetry:ll2HL 3121
ℓγ → HL e LeftRightSymmmetry:lgm2HLe 3122
ℓγ → HL µ LeftRightSymmmetry:lgm2HLmu 3123
ℓγ → HL τ LeftRightSymmmetry:lgm2HLtau 3124
ff → ffHL LeftRightSymmmetry:ff2HLff 3125
ff → HL HL LeftRightSymmmetry:ffbar2HLHL 3126
ℓℓ̄ → HR LeftRightSymmmetry:ll2HR 3141
ℓγ → HR e LeftRightSymmmetry:lgm2HRe 3142
ℓγ → HR µ LeftRightSymmmetry:lgm2HRmu 3143
ℓγ → HR τ LeftRightSymmmetry:lgm2HRtau 3144
ff → ffHR LeftRightSymmmetry:ff2HRff 3145
ff → HR HR LeftRightSymmmetry:ffbar2HRHR 3146

LeptoQuark:all
qℓ → S LeptoQuark:ql2LQ 3201 [445]
qg → ℓS LeptoQuark:qg2LQl 3202 [445]
gg → SS∗ LeptoQuark:gg2LQLQbar 3203 [445]
ff → SS∗ LeptoQuark:qqbar2LQLQbar 3204 [445]


Table 19: List of internal processes for excited fermions. See section 3.9.

process internal name code refs.


ExcitedFermion:all [446, 447]

dg → d∗ ExcitedFermion:dg2dStar 4001 [446, 447]
ug → u∗ ExcitedFermion:ug2uStar 4002 [446, 447]
sg → s∗ ExcitedFermion:sg2sStar 4003 [446, 447]
cg → c∗ ExcitedFermion:cg2cStar 4004 [446, 447]
bg → b∗ ExcitedFermion:bg2bStar 4005 [446, 447]
eγ → e∗ ExcitedFermion:egm2eStar 4011 [446, 447]
µγ → µ∗ ExcitedFermion:mugm2muStar 4013 [446, 447]
τγ → τ∗ ExcitedFermion:taugm2tauStar 4015 [446, 447]

qq → d∗ q ExcitedFermion:qq2dStarq 4021 [446, 447]
qq → u∗ q ExcitedFermion:qq2uStarq 4022 [446, 447]
qq → s∗ q ExcitedFermion:qq2sStarq 4023 [446, 447]
qq → c∗ q ExcitedFermion:qq2cStarq 4024 [446, 447]
qq → b∗ q ExcitedFermion:qq2bStarq 4025 [446, 447]
qq → e∗ e ExcitedFermion:qqbar2eStare 4031 [446, 447]
qq → ν∗e νe ExcitedFermion:qqbar2nueStarnue 4032 [446, 447]
qq → µ∗ µ ExcitedFermion:qqbar2muStarmu 4033 [446, 447]
qq → ν∗µ νµ ExcitedFermion:qqbar2numuStarnumu 4034 [446, 447]
qq → τ∗ τ ExcitedFermion:qqbar2tauStartau 4035 [446, 447]
qq → ν∗τ ντ ExcitedFermion:qqbar2nutauStarnutau 4036 [446, 447]
qq → e∗ e∗ ExcitedFermion:qqbar2eStareStar 4051 [446, 447]
qq → ν∗e ν∗e ExcitedFermion:qqbar2nueStarnueStar 4052 [446, 447]
qq → µ∗ µ∗ ExcitedFermion:qqbar2muStarmuStar 4053 [446, 447]
qq → ν∗µ ν∗µ ExcitedFermion:qqbar2numuStarnumuStar 4054 [446, 447]
qq → τ∗ τ∗ ExcitedFermion:qqbar2tauStartauStar 4055 [446, 447]
qq → ν∗τ ν∗τ ExcitedFermion:qqbar2nutauStarnutauStar 4056 [446, 447]

Table 20: List of internal processes for Randall–Sundrum resonances. See section 3.9
and refs. [448, 449] for details.

process internal name code


ExtraDimensionsG*:all

gg → G∗ ExtraDimensionsG*:gg2G* 5001
ff → G∗ ExtraDimensionsG*:ffbar2G* 5002
gg → G∗ g ExtraDimensionsG*:gg2G*g 5003
gq → G∗ q ExtraDimensionsG*:qg2G*q 5004
qq → G∗ g ExtraDimensionsG*:qqbar2G*g 5005
qq → GKK g ExtraDimensionsG*:qqbar2KKgluon* 5006


Table 21: List of internal processes for TeV⁻¹-sized extra dimensions. See section 3.9 for
details; expressions are from ref. [450].

process internal name code


ff → dd ExtraDimensionsTEV:ffbar2ddbar 5061
ff → uu ExtraDimensionsTEV:ffbar2uubar 5062
ff → ss ExtraDimensionsTEV:ffbar2ssbar 5063
ff → cc ExtraDimensionsTEV:ffbar2ccbar 5064
ff → bb ExtraDimensionsTEV:ffbar2bbbar 5065
ff → tt ExtraDimensionsTEV:ffbar2ttbar 5066
ff → e+ e− ExtraDimensionsTEV:ffbar2e+e- 5071
ff → νe ν̄e ExtraDimensionsTEV:ffbar2nuenuebar 5072
ff → µ+ µ− ExtraDimensionsTEV:ffbar2mu+mu- 5073
ff → νµ ν̄µ ExtraDimensionsTEV:ffbar2numunumubar 5074
ff → τ+ τ− ExtraDimensionsTEV:ffbar2tau+tau- 5075
ff → ντ ν̄τ ExtraDimensionsTEV:ffbar2nutaunutaubar 5076

Table 22: List of internal processes for large extra dimensions. See section 3.9 for details.

process internal name code refs.


ExtraDimensionsLED:monojet [451, 452]
gg → G g ExtraDimensionsLED:gg2Gg 5021
gq → Gq ExtraDimensionsLED:qg2Gq 5022
qq → G g ExtraDimensionsLED:qqbar2Gg 5023

ff → GZ ExtraDimensionsLED:ffbar2GZ 5024 [451]


ff → Gγ ExtraDimensionsLED:ffbar2Ggamma 5025 [451]
ff → γγ ExtraDimensionsLED:ffbar2gammagamma 5026 [451]
gg → γγ ExtraDimensionsLED:gg2gammagamma 5027 [451]
ff → ℓℓ̄ ExtraDimensionsLED:ffbar2llbar 5028 [451]
gg → ℓℓ̄ ExtraDimensionsLED:gg2llbar 5029 [451]

ExtraDimensionsLED:dijets [451]
gg → gg ExtraDimensionsLED:gg2DJgg 5030
gg → qq ExtraDimensionsLED:gg2DJqqbar 5031
qg → qg ExtraDimensionsLED:qg2DJqg 5032
qq → qq ExtraDimensionsLED:qq2DJqq 5033
qq → gg ExtraDimensionsLED:qqbar2DJgg 5034
qq → q′ q′ ExtraDimensionsLED:qqbar2DJqqbarNew 5035


Table 23: List of internal processes for unparticles. Expressions are from refs. [453, 454];
see section 3.9.

process internal name code


ExtraDimensionsUnpart:monojet
gg → U g ExtraDimensionsUnpart:gg2Ug 5045
gq → Uq ExtraDimensionsUnpart:qg2Uq 5046
qq → U g ExtraDimensionsUnpart:qqbar2Ug 5047

ff → UZ ExtraDimensionsUnpart:ffbar2UZ 5041
ff → Uγ ExtraDimensionsUnpart:ffbar2Ugamma 5042
ff → U → γγ ExtraDimensionsUnpart:ffbar2gammagamma 5043
gg → U → γγ ExtraDimensionsUnpart:gg2gammagamma 5044
ff → U → ℓℓ̄ ExtraDimensionsUnpart:ffbar2llbar 5048
gg → U → ℓℓ̄ ExtraDimensionsUnpart:gg2llbar 5049


Table 24: List of internal hidden valley processes, see section 3.7 and refs. [38, 39] for
details.

process internal name code


HiddenValley:all
gg → d v d v HiddenValley:gg2DvDvbar 4901
gg → u v u v HiddenValley:gg2UvUvbar 4902
gg → s v s v HiddenValley:gg2SvSvbar 4903
gg → c v c v HiddenValley:gg2CvCvbar 4904
gg → b v b v HiddenValley:gg2BvBvbar 4905
gg → t v t v HiddenValley:gg2TvTvbar 4906
qq → d v d v HiddenValley:qqbar2DvDvbar 4911
qq → u v u v HiddenValley:qqbar2UvUvbar 4912
qq → s v s v HiddenValley:qqbar2SvSvbar 4913
qq → c v c v HiddenValley:qqbar2CvCvbar 4914
qq → b v b v HiddenValley:qqbar2BvBvbar 4915
qq → t v t v HiddenValley:qqbar2TvTvbar 4916
ff → d v d v HiddenValley:ffbar2DvDvbar 4921
ff → u v u v HiddenValley:ffbar2UvUvbar 4922
ff → s v s v HiddenValley:ffbar2SvSvbar 4923
ff → c v c v HiddenValley:ffbar2CvCvbar 4924
ff → b v b v HiddenValley:ffbar2BvBvbar 4925
ff → t v t v HiddenValley:ffbar2TvTvbar 4926
ff → e v ē v HiddenValley:ffbar2EvEvbar 4931
ff → µ v µ̄ v HiddenValley:ffbar2MUvMUvbar 4932
ff → τ v τ̄ v HiddenValley:ffbar2TAUvTAUvbar 4933
ff → νev ν̄ev HiddenValley:ffbar2nuEvnuEvbar 4934
ff → νµv ν̄µv HiddenValley:ffbar2nuMUvnuMUvbar 4935
ff → ντv ν̄τv HiddenValley:ffbar2nuTAUvnuTAUvbar 4936
ff → Z v HiddenValley:ffbar2Zv 4941


References
[1] T. Sjöstrand, S. Ask, J. R. Christiansen, R. Corke, N. Desai, P. Ilten, S. Mrenna, S. Prestel,
C. O. Rasmussen and P. Z. Skands, An introduction to PYTHIA 8.2, Comput. Phys. Commun.
191, 159 (2015), doi:10.1016/j.cpc.2015.01.024, 1410.3012.

[2] T. Sjöstrand, The PYTHIA Event Generator: Past, Present and Future, Comput. Phys. Com-
mun. 246, 106910 (2020), doi:10.1016/j.cpc.2019.106910, 1907.09874.

[3] W. Bartel et al., Experimental Study of Jets in electron - Positron Annihilation, Phys. Lett. B
101, 129 (1981), doi:10.1016/0370-2693(81)90505-0.

[4] L. Lönnblad, ARIADNE version 4: A Program for simulation of QCD cascades implement-
ing the color dipole model, Comput. Phys. Commun. 71, 15 (1992), doi:10.1016/0010-
4655(92)90068-A.

[5] G. Ingelman, A. Edin and J. Rathsman, LEPTO 6.5: A Monte Carlo generator for deep inelastic
lepton-nucleon scattering, Comput. Phys. Commun. 101, 108 (1997), doi:10.1016/S0010-
4655(96)00157-9, hep-ph/9605286.

[6] H. Kharraziha and L. Lönnblad, The Linked dipole chain Monte Carlo, JHEP 03, 006 (1998),
doi:10.1088/1126-6708/1998/03/006, hep-ph/9709424.

[7] B. Nilsson-Almqvist and E. Stenlund, Interactions Between Hadrons and Nuclei: The
Lund Monte Carlo, Fritiof Version 1.6, Comput. Phys. Commun. 43, 387 (1987),
doi:10.1016/0010-4655(87)90056-7.

[8] F. James and L. Moneta, Review of High-Quality Random Number Generators, Comput.
Softw. Big Sci. 4(1), 2 (2020), doi:10.1007/s41781-019-0034-3, 1903.01247.

[9] G. Marsaglia, B. Narasimhan and A. Zaman, A random number generator for PC’s, Comput.
Phys. Commun. 60, 345 (1990), doi:10.1016/0010-4655(90)90033-W.

[10] M. Luscher, A Portable high quality random number generator for lattice field theory simu-
lations, Comput. Phys. Commun. 79, 100 (1994), doi:10.1016/0010-4655(94)90232-1,
hep-lat/9309020.
[11] K. G. Savvidy, The MIXMAX random number generator, Comput. Phys. Commun. 196, 161
(2015), doi:10.1016/j.cpc.2015.06.003, 1403.5355.

[12] R. Kleiss and R. Pittau, Weight optimization in multichannel Monte Carlo, Comput. Phys.
Commun. 83, 141 (1994), doi:10.1016/0010-4655(94)90043-4, hep-ph/9405257.

[13] T. Sjöstrand and M. van Zijl, A Multiple Interaction Model for the Event Structure in Hadron
Collisions, Phys. Rev. D 36, 2019 (1987), doi:10.1103/PhysRevD.36.2019.

[14] T. Sjöstrand, S. Mrenna and P. Z. Skands, PYTHIA 6.4 Physics and Manual, JHEP 05, 026
(2006), doi:10.1088/1126-6708/2006/05/026, hep-ph/0603175.

[15] L. Lönnblad, Fooling Around with the Sudakov Veto Algorithm, Eur. Phys. J. C 73(3), 2350
(2013), doi:10.1140/epjc/s10052-013-2350-9, 1211.7204.


[16] S. Plätzer and M. Sjödahl, The Sudakov Veto Algorithm Reloaded, Eur. Phys. J. Plus 127, 26
(2012), doi:10.1140/epjp/i2012-12026-x, 1108.6180.

[17] R. Kleiss and R. Verheyen, Competing Sudakov Veto Algorithms, Eur. Phys. J. C 76(7), 359
(2016), doi:10.1140/epjc/s10052-016-4231-5, 1605.09246.

[18] P. A. W. Lewis and G. S. Shedler, Simulation of nonhomogeneous poisson processes by thinning,
Naval Research Logistics Quarterly 26(3), 403 (1979), doi:10.1002/nav.3800260304.

[19] L. Devroye, Non-Uniform Random Variate Generation, Springer New York,
ISBN 9781461386438 (2013).

[20] S. Mrenna and P. Skands, Automated Parton-Shower Variations in Pythia 8, Phys. Rev. D
94(7), 074005 (2016), doi:10.1103/PhysRevD.94.074005, 1605.08352.

[21] F. James, Monte-Carlo phase space (1968).

[22] R. Kleiss, W. J. Stirling and S. D. Ellis, A New Monte Carlo Treatment of Multiparticle Phase
Space at High-energies, Comput. Phys. Commun. 40, 359 (1986), doi:10.1016/0010-
4655(86)90119-0.

[23] H. Brooks, P. Skands and R. Verheyen, Interleaved Resonance Decays and Electroweak Radi-
ation in Vincia (2021), 2108.10786.

[24] E. Norrbin and T. Sjöstrand, Production and hadronization of heavy quarks, Eur. Phys. J. C
17, 137 (2000), doi:10.1007/s100520000460, hep-ph/0005110.

[25] J. F. Gunion and Z. Kunszt, Lepton Correlations in Gauge Boson Pair Production and Decay,
Phys. Rev. D 33, 665 (1986), doi:10.1103/PhysRevD.33.665.

[26] G. T. Bodwin, E. Braaten and G. P. Lepage, Rigorous QCD analysis of inclusive an-
nihilation and production of heavy quarkonium, Phys. Rev. D 51, 1125 (1995),
doi:10.1103/PhysRevD.55.5853, [Erratum: Phys.Rev.D 55, 5853 (1997)], hep-ph/
9407339.
[27] R. Baier and R. Rückl, Hadronic Collisions: A Quarkonium Factory, Z. Phys. C 19, 251
(1983), doi:10.1007/BF01572254.

[28] R. Gastmans, W. Troost and T. T. Wu, Cross-Sections for Gluon + Gluon → Heavy Quarkonium
+ Gluon, Phys. Lett. B 184, 257 (1987), doi:10.1016/0370-2693(87)90578-8.

[29] P. L. Cho and A. K. Leibovich, Color octet quarkonia production. 2., Phys. Rev. D 53, 6203
(1996), doi:10.1103/PhysRevD.53.6203, hep-ph/9511315.

[30] F. Yuan, C.-F. Qiao and K.-T. Chao, D wave heavy quarkonium production in fixed target
experiments, Phys. Rev. D 59, 014009 (1999), doi:10.1103/PhysRevD.59.014009, hep-ph/
9807329.
[31] B. Humpert and P. Mery, ψψ PRODUCTION AT COLLIDER ENERGIES, Z. Phys. C 20, 83
(1983), doi:10.1007/BF01577721.

[32] C.-F. Qiao, J/ψ pair production at the Tevatron, Phys. Rev. D 66, 057504 (2002),
doi:10.1103/PhysRevD.66.057504, hep-ph/0206093.


[33] P. Nason et al., Bottom production, In Workshop on Standard Model Physics (and more) at
the LHC (First Plenary Meeting), pp. 231–304 (1999), hep-ph/0003142.

[34] M. Bargiotti and V. Vagnoni, Heavy quarkonia sector in PYTHIA 6.324: Tuning, validation
and perspectives at LHC(b) (2007).

[35] P. A. Zyla et al., Review of Particle Physics, PTEP 2020(8), 083C01 (2020),
doi:10.1093/ptep/ptaa104.

[36] R. Aaij et al., Study of J/ψ Production in Jets, Phys. Rev. Lett. 118(19), 192001 (2017),
doi:10.1103/PhysRevLett.118.192001, 1701.05116.
[37] A. M. Sirunyan et al., Study of J/ψ meson production inside jets in pp collisions at √s = 8
TeV, Phys. Lett. B 804, 135409 (2020), doi:10.1016/j.physletb.2020.135409, 1910.01686.

[38] L. Carloni and T. Sjöstrand, Visible Effects of Invisible Hidden Valley Radiation, JHEP 09, 105
(2010), doi:10.1007/JHEP09(2010)105, 1006.2911.

[39] L. Carloni, J. Rathsman and T. Sjöstrand, Discerning Secluded Sector gauge structures, JHEP
04, 091 (2011), doi:10.1007/JHEP04(2011)091, 1102.3795.

[40] S. Dittmaier et al., Handbook of LHC Higgs Cross Sections: 1. Inclusive Observables (2011),
doi:10.5170/CERN-2011-002, 1101.0593.

[41] S. C. Park, H. S. Song and J.-H. Song, Z boson pair production at CERN
LHC in a stabilized Randall-Sundrum scenario, Phys. Rev. D 65, 075008 (2002),
doi:10.1103/PhysRevD.65.075008, hep-ph/0103308.

[42] K. Kovařík, P. M. Nadolsky and D. E. Soper, Hadronic structure in high-energy collisions, Rev.
Mod. Phys. 92(4), 045003 (2020), doi:10.1103/RevModPhys.92.045003, 1905.06957.

[43] V. N. Gribov and L. N. Lipatov, Deep inelastic ep scattering in perturbation theory, Sov. J.
Nucl. Phys. 15, 438 (1972).

[44] Y. L. Dokshitzer, Calculation of the Structure Functions for Deep Inelastic Scattering and e+ e−
Annihilation by Perturbation Theory in Quantum Chromodynamics., Sov. Phys. JETP 46, 641
(1977).

[45] G. Altarelli and G. Parisi, Asymptotic Freedom in Parton Language, Nucl. Phys. B 126, 298
(1977), doi:10.1016/0550-3213(77)90384-4.

[46] M. Glück, E. Reya and I. Schienbein, Pionic parton distributions revisited, Eur. Phys. J. C
10, 313 (1999), doi:10.1007/s100529900124, hep-ph/9903288.

[47] M. Glück, E. Reya and A. Vogt, Pionic parton distributions, Z. Phys. C 53, 651 (1992),
doi:10.1007/BF01559743.

[48] M. Glück, E. Reya and M. Stratmann, Mesonic parton densities derived from constituent
quark model constraints, Eur. Phys. J. C 2, 159 (1998), doi:10.1007/s100520050130,
hep-ph/9711369.


[49] T. Sjöstrand and M. Utheim, Hadron Interactions for Arbitrary Energies and Species, with
Applications to Cosmic rays, Eur. Phys. J. C 82(1), 21 (2022), doi:10.1140/epjc/s10052-
021-09953-5, 2108.03481.

[50] V. Bertone, R. Gauld and J. Rojo, Neutrino Telescopes as QCD Microscopes, JHEP 01, 217
(2019), doi:10.1007/JHEP01(2019)217, 1808.02034.

[51] V. V. Sudakov, Vertex parts at very high-energies in quantum electrodynamics, Sov. Phys. JETP
3, 65 (1956).

[52] T. Sjöstrand, A Model for Initial State Parton Showers, Phys. Lett. B 157, 321 (1985),
doi:10.1016/0370-2693(85)90674-4.

[53] G. Gustafson, Dual Description of a Confined Color Field, Phys. Lett. B 175, 453 (1986),
doi:10.1016/0370-2693(86)90622-2.

[54] G. Gustafson and U. Pettersson, Dipole Formulation of QCD Cascades, Nucl. Phys. B 306,
746 (1988), doi:10.1016/0550-3213(88)90441-5.

[55] G. ’t Hooft, A Planar Diagram Theory for Strong Interactions, Nucl. Phys. B 72, 461 (1974),
doi:10.1016/0550-3213(74)90154-0.

[56] S. Höche, D. Reichelt and F. Siegert, Momentum conservation and unitarity in parton showers
and NLL resummation, JHEP 01, 118 (2018), doi:10.1007/JHEP01(2018)118, 1711.
03497.
[57] N. Baberuschki, C. T. Preuss, D. Reichelt and S. Schumann, Resummed predictions for
jet-resolution scales in multijet production in e+ e− annihilation, JHEP 04, 112 (2020),
doi:10.1007/JHEP04(2020)112, 1912.09396.

[58] M. Dasgupta, F. A. Dreyer, K. Hamilton, P. F. Monni and G. P. Salam, Logarithmic accuracy of
parton showers: a fixed-order study, JHEP 09, 033 (2018), doi:10.1007/JHEP09(2018)033,
[Erratum: JHEP 03, 083 (2020)], 1805.09327.

[59] K. Hamilton, R. Medves, G. P. Salam, L. Scyboz and G. Soyez, Colour and logarithmic accu-
racy in final-state parton showers (2020), doi:10.1007/JHEP03(2021)041, 2011.10054.

[60] Z. Nagy and D. E. Soper, Summations of large logarithms by parton showers, Phys. Rev. D
104(5), 054049 (2021), doi:10.1103/PhysRevD.104.054049, 2011.04773.

[61] Z. Nagy and D. E. Soper, Summations by parton showers of large logarithms in electron-
positron annihilation (2020), 2011.04777.

[62] M. Dasgupta, F. A. Dreyer, K. Hamilton, P. F. Monni, G. P. Salam and G. Soyez, Parton
showers beyond leading logarithmic accuracy, Phys. Rev. Lett. 125(5), 052002 (2020),
doi:10.1103/PhysRevLett.125.052002, 2002.11114.

[63] J. R. Forshaw, J. Holguin and S. Plätzer, Building a consistent parton shower, JHEP 09, 014
(2020), doi:10.1007/JHEP09(2020)014, 2003.06400.

[64] W. T. Giele, D. A. Kosower and P. Z. Skands, Higher-Order Corrections to Timelike Jets, Phys.
Rev. D 84, 054003 (2011), doi:10.1103/PhysRevD.84.054003, 1102.2126.


[65] S. Plätzer and M. Sjödahl, Subleading Nc improved Parton Showers, JHEP 07, 042 (2012),
doi:10.1007/JHEP07(2012)042, 1201.0260.

[66] S. Plätzer, M. Sjödahl and J. Thorén, Color matrix element corrections for parton showers,
JHEP 11, 009 (2018), doi:10.1007/JHEP11(2018)009, 1808.00332.

[67] J. Bellm, Colour Rearrangement for Dipole Showers, Eur. Phys. J. C 78(7), 601 (2018),
doi:10.1140/epjc/s10052-018-6070-z, 1801.06113.

[68] J. Holguin, J. R. Forshaw and S. Plätzer, Improvements on dipole shower colour, Eur. Phys.
J. C 81(4), 364 (2021), doi:10.1140/epjc/s10052-021-09145-1, 2011.15087.

[69] J. Isaacson and S. Prestel, Stochastically sampling color configurations, Phys. Rev. D 99(1),
014021 (2019), doi:10.1103/PhysRevD.99.014021, 1806.10102.

[70] S. Höche and D. Reichelt, Numerical resummation at subleading color in the strongly ordered
soft gluon limit, Phys. Rev. D 104(3), 034006 (2021), doi:10.1103/PhysRevD.104.034006,
2001.11492.
[71] Z. Nagy and D. E. Soper, Parton shower evolution with subleading color, JHEP 06, 044
(2012), doi:10.1007/JHEP06(2012)044, 1202.4496.

[72] Z. Nagy and D. E. Soper, Effects of subleading color in a parton shower, JHEP 07, 119 (2015),
doi:10.1007/JHEP07(2015)119, 1501.00778.

[73] J. R. Forshaw, J. Holguin and S. Plätzer, Parton branching at amplitude level, JHEP 08, 145
(2019), doi:10.1007/JHEP08(2019)145, 1905.08686.

[74] M. De Angelis, J. R. Forshaw and S. Plätzer, Resummation and Simulation of
Soft Gluon Effects beyond Leading Color, Phys. Rev. Lett. 126(11), 112001 (2021),
doi:10.1103/PhysRevLett.126.112001, 2007.09648.

[75] M. Bengtsson and T. Sjöstrand, Coherent Parton Showers Versus Matrix Elements: Im-
plications of PETRA - PEP Data, Phys. Lett. B 185, 435 (1987), doi:10.1016/0370-
2693(87)91031-8.

[76] M. Bengtsson and T. Sjöstrand, A Comparative Study of Coherent and Noncoherent Parton
Shower Evolution, Nucl. Phys. B 289, 810 (1987), doi:10.1016/0550-3213(87)90407-X.

[77] E. Norrbin and T. Sjöstrand, QCD radiation off heavy particles, Nucl. Phys. B 603, 297
(2001), doi:10.1016/S0550-3213(01)00099-2, hep-ph/0010012.

[78] T. Sjöstrand and P. Z. Skands, Transverse-momentum-ordered showers and interleaved
multiple interactions, Eur. Phys. J. C 39, 129 (2005), doi:10.1140/epjc/s2004-02084-y,
hep-ph/0408302.
[79] R. Corke and T. Sjöstrand, Interleaved Parton Showers and Tuning Prospects, JHEP 03, 032
(2011), doi:10.1007/JHEP03(2011)032, 1011.1759.

[80] B. Cabouat and T. Sjöstrand, Some Dipole Shower Studies, Eur. Phys. J. C 78(3), 226 (2018),
doi:10.1140/epjc/s10052-018-5645-z, 1710.00391.


[81] G. Miu and T. Sjöstrand, W production in an improved parton shower approach, Phys. Lett.
B 449, 313 (1999), doi:10.1016/S0370-2693(99)00068-4, hep-ph/9812455.

[82] N. Desai and P. Z. Skands, Supersymmetry and Generic BSM Models in PYTHIA 8, Eur. Phys.
J. C 72, 2238 (2012), doi:10.1140/epjc/s10052-012-2238-0, 1109.5852.

[83] J. R. Christiansen and T. Sjöstrand, Weak Gauge Boson Radiation in Parton Showers, JHEP
04, 115 (2014), doi:10.1007/JHEP04(2014)115, 1401.5238.

[84] B. Andersson, G. Gustafson and J. Samuelsson, The Linked dipole chain model for DIS, Nucl.
Phys. B 467, 443 (1996), doi:10.1016/0550-3213(96)00114-9.

[85] P. A. Zyla et al., Review of Particle Physics, PTEP 2020(8), 083C01 (2020),
doi:10.1093/ptep/ptaa104.

[86] S. Catani, B. R. Webber and G. Marchesini, QCD coherent branching and semiinclusive pro-
cesses at large x, Nucl. Phys. B 349, 635 (1991), doi:10.1016/0550-3213(91)90390-J.

[87] G. Gustafson, Multiplicity distributions in QCD cascades, Nucl. Phys. B 392, 251 (1993),
doi:10.1016/0550-3213(93)90203-2.

[88] S. Catani and M. H. Seymour, A General algorithm for calculating jet cross-sections in NLO
QCD, Nucl. Phys. B 485, 291 (1997), doi:10.1016/S0550-3213(96)00589-5, [Erratum:
Nucl.Phys.B 510, 503–504 (1998)], hep-ph/9605323.

[89] H. Brooks and P. Skands, Coherent showers in decays of colored resonances, Phys. Rev. D
100(7), 076006 (2019), doi:10.1103/PhysRevD.100.076006, 1907.08980.

[90] T. Plehn, D. Rainwater and P. Z. Skands, Squark and gluino production with jets, Phys. Lett.
B 645, 217 (2007), doi:10.1016/j.physletb.2006.12.009, hep-ph/0510144.

[91] R. Corke and T. Sjöstrand, Improved Parton Showers at Large Transverse Momenta, Eur.
Phys. J. C 69, 1 (2010), doi:10.1140/epjc/s10052-010-1409-0, 1003.2384.

[92] R. K. Ellis, G. Marchesini and B. R. Webber, Soft Radiation in Parton Parton Scattering, Nucl.
Phys. B 286, 643 (1987), doi:10.1016/0550-3213(87)90456-1, [Erratum: Nucl.Phys.B
294, 1180 (1987)].

[93] B. Andersson, G. Gustafson and T. Sjöstrand, How to Find the Gluon Jets in e+ e− Annihila-
tion, Phys. Lett. B 94, 211 (1980), doi:10.1016/0370-2693(80)90861-8.

[94] Y. I. Azimov, Y. L. Dokshitzer, V. A. Khoze and S. I. Troian, The String Effect and QCD
Coherence, Phys. Lett. B 165, 147 (1985), doi:10.1016/0370-2693(85)90709-9.

[95] B. R. Webber, Monte Carlo Simulation of Hard Hadronic Processes, Ann. Rev. Nucl. Part. Sci.
36, 253 (1986), doi:10.1146/annurev.ns.36.120186.001345.

[96] P. Ernstrom and L. Lönnblad, Generating heavy quarkonia in a perturbative QCD cascade, Z.
Phys. C 75, 51 (1997), doi:10.1007/s002880050446, hep-ph/9606472.

[97] L. Hartgring, E. Laenen and P. Skands, Antenna Showers with One-Loop Matrix Elements,
JHEP 10, 127 (2013), doi:10.1007/JHEP10(2013)127, 1303.4974.


[98] J. J. Lopez-Villarejo and P. Z. Skands, Efficient Matrix-Element Matching with Sector Showers,
JHEP 11, 150 (2011), doi:10.1007/JHEP11(2011)150, 1109.3608.

[99] S. Höche, S. Schumann and F. Siegert, Hard photon production and matrix-element parton-
shower merging, Phys. Rev. D 81, 034026 (2010), doi:10.1103/PhysRevD.81.034026,
0912.3501.
[100] M. Ritzmann, D. A. Kosower and P. Skands, Antenna Showers with Hadronic Initial States,
Phys. Lett. B 718, 1345 (2013), doi:10.1016/j.physletb.2012.12.003, 1210.6345.

[101] H. Brooks, C. T. Preuss and P. Skands, Sector Showers for Hadron Collisions, JHEP 07, 032
(2020), doi:10.1007/JHEP07(2020)032, 2003.00702.

[102] R. Kleiss and R. Verheyen, Final-state QED Multipole Radiation in Antenna Parton Showers,
JHEP 11, 182 (2017), doi:10.1007/JHEP11(2017)182, 1709.04485.

[103] P. Skands and R. Verheyen, Multipole photon radiation in the Vincia parton shower, Phys.
Lett. B 811, 135878 (2020), doi:10.1016/j.physletb.2020.135878, 2002.04939.

[104] D. R. Yennie, S. C. Frautschi and H. Suura, The infrared divergence phenomena and high-
energy processes, Annals Phys. 13, 379 (1961), doi:10.1016/0003-4916(61)90151-8.

[105] A. Gehrmann-De Ridder, M. Ritzmann and P. Z. Skands, Timelike Dipole-Antenna
Showers with Massive Fermions, Phys. Rev. D 85, 014013 (2012),
doi:10.1103/PhysRevD.85.014013, 1108.6172.

[106] R. Kleiss and R. Verheyen, Collinear electroweak radiation in antenna parton showers, Eur.
Phys. J. C 80(10), 980 (2020), doi:10.1140/epjc/s10052-020-08510-w, 2002.09248.

[107] A. J. Larkoski, J. J. Lopez-Villarejo and P. Skands, Helicity-Dependent Showers and Matching
with VINCIA, Phys. Rev. D 87(5), 054033 (2013), doi:10.1103/PhysRevD.87.054033,
1301.0933.
[108] N. Fischer, A. Lifson and P. Skands, Helicity Antenna Showers for Hadron Colliders, Eur.
Phys. J. C 77(10), 719 (2017), doi:10.1140/epjc/s10052-017-5306-7, 1708.01736.

[109] D. A. Kosower, Antenna factorization of gauge theory amplitudes, Phys. Rev. D 57, 5410
(1998), doi:10.1103/PhysRevD.57.5410, hep-ph/9710213.

[110] W. T. Giele, D. A. Kosower and P. Z. Skands, A simple shower and matching algorithm, Phys.
Rev. D 78, 014026 (2008), doi:10.1103/PhysRevD.78.014026, 0707.3652.

[111] S. Catani and M. H. Seymour, The Dipole formalism for the calculation of QCD jet cross-
sections at next-to-leading order, Phys. Lett. B 378, 287 (1996), doi:10.1016/0370-
2693(96)00425-X, hep-ph/9602277.

[112] S. Catani, S. Dittmaier, M. H. Seymour and Z. Trocsanyi, The Dipole formalism for next-
to-leading order QCD calculations with massive partons, Nucl. Phys. B 627, 189 (2002),
doi:10.1016/S0550-3213(02)00098-6, hep-ph/0201036.

[113] N. Fischer, S. Prestel, M. Ritzmann and P. Skands, Vincia for Hadron Colliders, Eur. Phys. J.
C 76(11), 589 (2016), doi:10.1140/epjc/s10052-016-4429-6, 1605.06142.


[114] R. J. Verheyen, Electroweak Effects in Antenna Parton Showers, Ph.D. thesis, Radboud
University Nijmegen (2020).

[115] S. Schumann and F. Krauss, A Parton shower algorithm based on Catani-Seymour dipole
factorisation, JHEP 03, 038 (2008), doi:10.1088/1126-6708/2008/03/038, 0709.1027.

[116] M. Dinsdale, M. Ternick and S. Weinzierl, Parton showers from the dipole formalism, Phys.
Rev. D 76, 094003 (2007), doi:10.1103/PhysRevD.76.094003, 0709.1026.

[117] S. Plätzer and S. Gieseke, Coherent Parton Showers with Local Recoils, JHEP 01, 024 (2011),
doi:10.1007/JHEP01(2011)024, 0909.5593.

[118] S. Höche and S. Prestel, The midpoint between dipole and parton showers, Eur. Phys. J. C
75(9), 461 (2015), doi:10.1140/epjc/s10052-015-3684-2, 1506.05057.

[119] A. Daleo, T. Gehrmann and D. Maître, Antenna subtraction with hadronic initial states, JHEP
04, 016 (2007), doi:10.1088/1126-6708/2007/04/016, hep-ph/0612257.

[120] D. A. Kosower, Antenna factorization in strongly ordered limits, Phys. Rev. D 71, 045016
(2005), doi:10.1103/PhysRevD.71.045016, hep-ph/0311272.

[121] A. J. Larkoski and M. E. Peskin, Spin-Dependent Antenna Splitting Functions, Phys. Rev. D
81, 054010 (2010), doi:10.1103/PhysRevD.81.054010, 0908.2450.

[122] A. J. Larkoski and M. E. Peskin, Antenna Splitting Functions for Massive Particles, Phys. Rev.
D 84, 034034 (2011), doi:10.1103/PhysRevD.84.034034, 1106.2182.

[123] J. M. Campbell, M. A. Cullen and E. W. N. Glover, Four jet event shapes in electron-
positron annihilation, Eur. Phys. J. C 9, 245 (1999), doi:10.1007/s100529900034,
hep-ph/9809429.
[124] A. Gehrmann-De Ridder, T. Gehrmann and E. W. N. Glover, Infrared structure of e+ e− →
2 jets at NNLO, Nucl. Phys. B 691, 195 (2004), doi:10.1016/j.nuclphysb.2004.05.017,
hep-ph/0403057.
[125] A. Gehrmann-De Ridder, T. Gehrmann and E. W. N. Glover, Quark-gluon antenna functions
from neutralino decay, Phys. Lett. B 612, 36 (2005), doi:10.1016/j.physletb.2005.02.039,
hep-ph/0501291.
[126] A. Gehrmann-De Ridder, T. Gehrmann and E. W. N. Glover, Gluon-gluon antenna functions
from Higgs boson decay, Phys. Lett. B 612, 49 (2005), doi:10.1016/j.physletb.2005.03.003,
hep-ph/0502110.
[127] A. Gehrmann-De Ridder, T. Gehrmann and E. W. N. Glover, Antenna subtraction at NNLO,
JHEP 09, 056 (2005), doi:10.1088/1126-6708/2005/09/056, hep-ph/0505111.

[128] H. W. Kuhn, The Hungarian method for the assignment problem, Naval Res.
Logist. Quart. 2, 83 (1955).

[129] J. Munkres, Algorithms for the assignment and transportation problems, Journal of the
Society for Industrial and Applied Mathematics 5(1), 32 (1957).

[130] R. Jonker and A. Volgenant, A shortest augmenting path algorithm for dense and sparse linear
assignment problems, Computing 38(4), 325 (1987), doi:10.1007/BF02278710.

[131] J. Chen, T. Han and B. Tweedie, Electroweak Splitting Functions and High Energy Showering,
JHEP 11, 093 (2017), doi:10.1007/JHEP11(2017)093, 1611.00788.

[132] S. Catani, Y. L. Dokshitzer, M. H. Seymour and B. R. Webber, Longitudinally invariant
K_t clustering algorithms for hadron hadron collisions, Nucl. Phys. B 406, 187 (1993),
doi:10.1016/0550-3213(93)90166-M.

[133] S. Höche and S. Prestel, Triple collinear emissions in parton showers, Phys. Rev. D 96(7),
074017 (2017), doi:10.1103/PhysRevD.96.074017, 1705.00742.

[134] S. Höche, F. Krauss and S. Prestel, Implementing NLO DGLAP evolution in Parton Showers,
JHEP 10, 093 (2017), doi:10.1007/JHEP10(2017)093, 1705.00982.

[135] F. Dulat, S. Höche and S. Prestel, Leading-Color Fully Differential Two-Loop
Soft Corrections to QCD Dipole Showers, Phys. Rev. D 98(7), 074013 (2018),
doi:10.1103/PhysRevD.98.074013, 1805.03757.

[136] S. Prestel and M. Spannowsky, HYTREES: Combining Matrix Elements and Parton Shower
for Hypothesis Testing, Eur. Phys. J. C 79(7), 546 (2019),
doi:10.1140/epjc/s10052-019-7030-y, 1901.11035.

[137] J. R. Andersen, C. Gütschow, A. Maier and S. Prestel, A Positive Resampler for
Monte Carlo events with negative weights, Eur. Phys. J. C 80(11), 1007 (2020),
doi:10.1140/epjc/s10052-020-08548-w, 2005.09375.

[138] L. Gellersen, S. Prestel and M. Spannowsky, Coloring mixed QCD/QED evolution (2021),
2109.09706.

[139] S. Dittmaier, A General approach to photon radiation off fermions, Nucl. Phys. B 565, 69
(2000), doi:10.1016/S0550-3213(99)00563-5, hep-ph/9904440.

[140] L. Gellersen, S. Höche and S. Prestel, Disentangling soft and collinear effects in QCD parton
showers (2021), 2110.05964.

[141] M. Schönherr, An automated subtraction of NLO EW infrared divergences, Eur. Phys. J. C
78(2), 119 (2018), doi:10.1140/epjc/s10052-018-5600-z, 1712.07975.

[142] F. Krauss, P. Petrov, M. Schönherr and M. Spannowsky, Measuring collinear W emissions
inside jets, Phys. Rev. D 89(11), 114006 (2014), doi:10.1103/PhysRevD.89.114006,
1403.4788.

[143] M. Rubin, G. P. Salam and S. Sapeta, Giant QCD K-factors beyond NLO, JHEP 09, 084
(2010), doi:10.1007/JHEP09(2010)084, 1006.2144.

[144] A. Schälicke and F. Krauss, Implementing the ME+PS merging algorithm, JHEP 07, 018
(2005), doi:10.1088/1126-6708/2005/07/018, hep-ph/0503281.

[145] J. R. Christiansen and S. Prestel, Merging weak and QCD showers with matrix elements, Eur.
Phys. J. C 76(1), 39 (2016), doi:10.1140/epjc/s10052-015-3871-1, 1510.01517.

[146] M. H. Seymour, Matrix element corrections to parton shower algorithms, Comput. Phys.
Commun. 90, 95 (1995), doi:10.1016/0010-4655(95)00064-M, hep-ph/9410414.

[147] M. H. Seymour, A Simple prescription for first order corrections to quark scattering and
annihilation processes, Nucl. Phys. B 436, 443 (1995),
doi:10.1016/0550-3213(94)00554-R, hep-ph/9410244.

[148] J. Andre and T. Sjöstrand, A Matching of matrix elements and parton showers, Phys. Rev. D
57, 5767 (1998), doi:10.1103/PhysRevD.57.5767, hep-ph/9708390.

[149] N. Fischer and S. Prestel, Combining states without scale hierarchies with ordered parton
showers, Eur. Phys. J. C 77(9), 601 (2017), doi:10.1140/epjc/s10052-017-5160-7,
1706.06218.

[150] S. Frixione and B. R. Webber, Matching NLO QCD computations and parton shower
simulations, JHEP 06, 029 (2002), doi:10.1088/1126-6708/2002/06/029, hep-ph/0204244.

[151] P. Nason, A New method for combining NLO QCD with shower Monte Carlo algorithms, JHEP
11, 040 (2004), doi:10.1088/1126-6708/2004/11/040, hep-ph/0409146.

[152] S. Frixione, P. Nason and C. Oleari, Matching NLO QCD computations with Parton
Shower simulations: the POWHEG method, JHEP 11, 070 (2007),
doi:10.1088/1126-6708/2007/11/070, 0709.2092.

[153] S. Höche, F. Krauss, M. Schönherr and F. Siegert, Automating the POWHEG method in
Sherpa, JHEP 04, 024 (2011), doi:10.1007/JHEP04(2011)024, 1008.5399.

[154] J. Alwall, R. Frederix, S. Frixione, V. Hirschi, F. Maltoni, O. Mattelaer, H. S. Shao, T. Stelzer,
P. Torrielli and M. Zaro, The automated computation of tree-level and next-to-leading order
differential cross sections, and their matching to parton shower simulations, JHEP 07, 079
(2014), doi:10.1007/JHEP07(2014)079, 1405.0301.

[155] S. Höche, F. Krauss, M. Schönherr and F. Siegert, A critical appraisal of NLO+PS matching
methods, JHEP 09, 049 (2012), doi:10.1007/JHEP09(2012)049, 1111.1220.

[156] S. Alioli, P. Nason, C. Oleari and E. Re, A general framework for implementing NLO
calculations in shower Monte Carlo programs: the POWHEG BOX, JHEP 06, 043 (2010),
doi:10.1007/JHEP06(2010)043, 1002.2581.

[157] M. L. Mangano, M. Moretti and R. Pittau, Multijet matrix elements and shower evolution
in hadronic collisions: W b b̄ + n jets as a case study, Nucl. Phys. B 632, 343 (2002),
doi:10.1016/S0550-3213(02)00249-3, hep-ph/0108069.

[158] M. L. Mangano, M. Moretti, F. Piccinini and M. Treccani, Matching matrix elements and
shower evolution for top-quark production in hadronic collisions, JHEP 01, 013 (2007),
doi:10.1088/1126-6708/2007/01/013, hep-ph/0611129.

[159] S. Catani, F. Krauss, R. Kuhn and B. R. Webber, QCD matrix elements + parton showers,
JHEP 11, 063 (2001), doi:10.1088/1126-6708/2001/11/063, hep-ph/0109231.

[160] K. Hamilton, P. Richardson and J. Tully, A Modified CKKW matrix element merging
approach to angular-ordered parton showers, JHEP 11, 038 (2009),
doi:10.1088/1126-6708/2009/11/038, 0905.3072.

[161] L. Lönnblad, Correcting the color dipole cascade model with fixed order matrix elements, JHEP
05, 046 (2002), doi:10.1088/1126-6708/2002/05/046, hep-ph/0112284.

[162] L. Lönnblad and S. Prestel, Matching Tree-Level Matrix Elements with Interleaved Showers,
JHEP 03, 019 (2012), doi:10.1007/JHEP03(2012)019, 1109.4829.

[163] H. Brooks and C. T. Preuss, Efficient multi-jet merging with the Vincia sector shower, Comput.
Phys. Commun. 264, 107985 (2021), doi:10.1016/j.cpc.2021.107985, 2008.09468.

[164] S. Höche, F. Krauss, S. Schumann and F. Siegert, QCD matrix elements and truncated showers,
JHEP 05, 053 (2009), doi:10.1088/1126-6708/2009/05/053, 0903.1219.

[165] L. Lönnblad and S. Prestel, Unitarising Matrix Element + Parton Shower merging, JHEP 02,
094 (2013), doi:10.1007/JHEP02(2013)094, 1211.4827.

[166] S. Plätzer, Controlling inclusive cross sections in parton shower + matrix element merging,
JHEP 08, 114 (2013), doi:10.1007/JHEP08(2013)114, 1211.5467.

[167] N. Lavesson and L. Lönnblad, Extending CKKW-merging to One-Loop Matrix Elements, JHEP
12, 070 (2008), doi:10.1088/1126-6708/2008/12/070, 0811.2912.

[168] L. Lönnblad and S. Prestel, Merging Multi-leg NLO Matrix Elements with Parton Showers,
JHEP 03, 166 (2013), doi:10.1007/JHEP03(2013)166, 1211.7278.

[169] J. Bellm, S. Gieseke and S. Plätzer, Merging NLO Multi-jet Calculations with Improved
Unitarization, Eur. Phys. J. C 78(3), 244 (2018), doi:10.1140/epjc/s10052-018-5723-2,
1705.06700.

[170] K. Hamilton and P. Nason, Improving NLO-parton shower matched simulations with higher
order matrix elements, JHEP 06, 039 (2010), doi:10.1007/JHEP06(2010)039, 1004.1764.

[171] S. Höche, F. Krauss, M. Schönherr and F. Siegert, NLO matrix elements and truncated
showers, JHEP 08, 123 (2011), doi:10.1007/JHEP08(2011)123, 1009.1127.

[172] T. Gehrmann, S. Höche, F. Krauss, M. Schönherr and F. Siegert, NLO QCD matrix elements +
parton showers in e+ e− → hadrons, JHEP 01, 144 (2013), doi:10.1007/JHEP01(2013)144,
1207.5031.

[173] S. Höche, F. Krauss, M. Schönherr and F. Siegert, QCD matrix elements + parton showers:
The NLO case, JHEP 04, 027 (2013), doi:10.1007/JHEP04(2013)027, 1207.5030.

[174] R. Frederix and S. Frixione, Merging meets matching in MC@NLO, JHEP 12, 061 (2012),
doi:10.1007/JHEP12(2012)061, 1209.6215.

[175] K. Hamilton, P. Nason and G. Zanderighi, MINLO: Multi-Scale Improved NLO, JHEP 10, 155
(2012), doi:10.1007/JHEP10(2012)155, 1206.3572.

[176] R. Frederix and K. Hamilton, Extending the MINLO method, JHEP 05, 042 (2016),
doi:10.1007/JHEP05(2016)042, 1512.02663.

[177] M. Mangano, Exploring theoretical systematics in the ME-to-shower MC merging for multijet
process, E-proceedings of Matrix Element/Monte Carlo Tuning Working Group, Fermilab,
November 2002 (2002).

[178] R. Frederix, S. Frixione, S. Prestel and P. Torrielli, On the reduction of
negative weights in MC@NLO-type matching procedures, JHEP 07, 238 (2020),
doi:10.1007/JHEP07(2020)238, 2002.12716.

[179] T. Ježo and P. Nason, On the Treatment of Resonances in Next-to-Leading Order Calculations
Matched to a Parton Shower, JHEP 12, 065 (2015), doi:10.1007/JHEP12(2015)065,
1509.09071.

[180] S. Ferrario Ravasio, T. Ježo, P. Nason and C. Oleari, A theoretical study of top-mass
measurements at the LHC using NLO+PS generators of increasing accuracy, Eur. Phys. J. C 78(6),
458 (2018), doi:10.1140/epjc/s10052-019-7336-9, [Addendum: Eur.Phys.J.C 79, 859
(2019)], 1906.09166.

[181] S. Alioli, C. W. Bauer, C. J. Berggren, A. Hornig, F. J. Tackmann, C. K. Vermilion, J. R.
Walsh and S. Zuberi, Combining Higher-Order Resummation with Multiple NLO Calculations
and Parton Showers in GENEVA, JHEP 09, 120 (2013), doi:10.1007/JHEP09(2013)120,
1211.7049.

[182] S. Alioli, C. W. Bauer, C. Berggren, F. J. Tackmann, J. R. Walsh and S. Zuberi,
Matching Fully Differential NNLO Calculations and Parton Showers, JHEP 06, 089 (2014),
doi:10.1007/JHEP06(2014)089, 1311.0286.

[183] S. Alioli, C. W. Bauer, C. Berggren, F. J. Tackmann and J. R. Walsh, Drell-Yan
production at NNLL’+NNLO matched to parton showers, Phys. Rev. D 92(9), 094020 (2015),
doi:10.1103/PhysRevD.92.094020, 1508.01475.

[184] S. Alioli, C. W. Bauer, S. Guns and F. J. Tackmann, Underlying event sensitive
observables in Drell-Yan production using GENEVA, Eur. Phys. J. C 76(11), 614 (2016),
doi:10.1140/epjc/s10052-016-4458-1, 1605.07192.

[185] S. Höche, S. Kuttimalai and Y. Li, Hadronic Final States in DIS at NNLO QCD with Parton
Showers, Phys. Rev. D 98(11), 114013 (2018), doi:10.1103/PhysRevD.98.114013,
1809.04192.

[186] L. Gellersen and S. Prestel, Scale and Scheme Variations in Unitarized NLO Merging, Phys.
Rev. D 101(11), 114007 (2020), doi:10.1103/PhysRevD.101.114007, 2001.10746.

[187] H. T. Li and P. Skands, A framework for second-order parton showers, Phys. Lett. B 771, 59
(2017), doi:10.1016/j.physletb.2017.05.011, 1611.00013.

[188] J. M. Campbell, S. Höche, H. T. Li, C. T. Preuss and P. Skands, Towards NNLO+PS Matching
with Sector Showers (2021), 2108.07133.

[189] J. J. Lopez-Villarejo and P. Z. Skands, Efficient Matrix-Element Matching with Sector Showers,
JHEP 11, 150 (2011), doi:10.1007/JHEP11(2011)150, 1109.3608.

[190] S. Höche, S. Mrenna, S. Payne, C. T. Preuss and P. Skands, A Study of QCD Radiation in VBF
Higgs Production with Vincia and Pythia (2021), 2106.10987.

[191] S. Prestel, Matching N3LO QCD calculations to parton showers, JHEP 11, 041 (2021),
doi:10.1007/JHEP11(2021)041, 2106.03206.

[192] V. N. Gribov, A Reggeon Diagram Technique, Zh. Eksp. Teor. Fiz. 53, 654 (1967).

[193] P. D. B. Collins, An Introduction to Regge Theory and High-Energy Physics, Cambridge
Monographs on Mathematical Physics. Cambridge Univ. Press, Cambridge, UK,
ISBN 978-0-521-11035-8, doi:10.1017/CBO9780511897603 (2009).

[194] J. R. Forshaw and D. A. Ross, Quantum chromodynamics and the pomeron, vol. 9, Cambridge
University Press, ISBN 978-0-511-89326-1, 978-0-521-56880-7 (2011).

[195] S. Donnachie, H. G. Dosch, O. Nachtmann and P. Landshoff, Pomeron physics and QCD,
vol. 19, Cambridge University Press, ISBN 978-0-511-06050-2, 978-0-521-78039-1, 978-
0-521-67570-3 (2004).

[196] V. Barone and E. Predazzi, High-Energy Particle Diffraction, vol. v.565 of Texts and
Monographs in Physics, Springer-Verlag, Berlin Heidelberg, ISBN 978-3-540-42107-8 (2002).

[197] G. Antchev et al., First determination of the ρ parameter at √s = 13 TeV: probing the
existence of a colourless C-odd three-gluon compound state, Eur. Phys. J. C 79(9), 785 (2019),
doi:10.1140/epjc/s10052-019-7223-4, 1812.04732.

[198] M. Froissart, Asymptotic behavior and subtractions in the Mandelstam representation, Phys.
Rev. 123, 1053 (1961), doi:10.1103/PhysRev.123.1053.

[199] G. A. Schuler and T. Sjöstrand, Hadronic diffractive cross-sections and the rise of the total
cross-section, Phys. Rev. D 49, 2257 (1994), doi:10.1103/PhysRevD.49.2257.

[200] C. O. Rasmussen and T. Sjöstrand, Models for total, elastic and diffractive cross sections, Eur.
Phys. J. C 78(6), 461 (2018), doi:10.1140/epjc/s10052-018-5940-8, 1804.10373.

[201] A. Donnachie and P. V. Landshoff, Total cross-sections, Phys. Lett. B 296, 227 (1992),
doi:10.1016/0370-2693(92)90832-O, hep-ph/9209205.

[202] R. Ciesielski and K. Goulianos, MBR Monte Carlo Simulation in PYTHIA8, PoS ICHEP2012,
301 (2013), doi:10.22323/1.174.0301, 1205.1446.

[203] R. B. Appleby, R. J. Barlow, J. G. Molson, M. Serluca and A. Toader, The
Practical Pomeron for High Energy Proton Collimation, Eur. Phys. J. C 76(10), 520 (2016),
doi:10.1140/epjc/s10052-016-4363-7, 1604.07327.

[204] C. Patrignani et al., Review of Particle Physics, Chin. Phys. C 40(10), 100001 (2016),
doi:10.1088/1674-1137/40/10/100001.

[205] A. Donnachie and P. V. Landshoff, Dynamics of Elastic Scattering, Nucl. Phys. B 267, 690
(1986), doi:10.1016/0550-3213(86)90137-9.

[206] P. Aurenche, F. W. Bopp, A. Capella, J. Kwiecinski, M. Maire, J. Ranft and J. Tran Thanh Van,
Multiparticle production in a two component dual parton model, Phys. Rev. D 45, 92 (1992),
doi:10.1103/PhysRevD.45.92.

[207] M. L. Good and W. D. Walker, Diffraction dissociation of beam particles, Phys. Rev. 120,
1857 (1960), doi:10.1103/PhysRev.120.1857.

[208] G. A. Schuler and T. Sjöstrand, A Scenario for high-energy gamma gamma interactions, Z.
Phys. C 73, 677 (1997), doi:10.1007/s002880050359, hep-ph/9605240.

[209] E. M. Levin and L. L. Frankfurt, The Quark hypothesis and relations between cross-sections
at high-energies, JETP Lett. 2, 65 (1965).

[210] H. J. Lipkin, Quarks for pedestrians, Phys. Rept. 8, 173 (1973),
doi:10.1016/0370-1573(73)90002-1.

[211] S. Okubo, Phi meson and unitary symmetry model, Phys. Lett. 5, 165 (1963),
doi:10.1016/S0375-9601(63)92548-9.

[212] G. Zweig, An SU(3) model for strong interaction symmetry and its breaking. Version 1 (1964).

[213] J. Iizuka, Systematics and phenomenology of meson family, Prog. Theor. Phys. Suppl. 37, 21
(1966), doi:10.1143/PTPS.37.21.

[214] T. Sjöstrand and M. Utheim, A Framework for Hadronic Rescattering in pp Collisions, Eur.
Phys. J. C 80(10), 907 (2020), doi:10.1140/epjc/s10052-020-8399-3, 2005.05658.

[215] M. Tanabashi et al., Review of Particle Physics, Phys. Rev. D 98(3), 030001 (2018),
doi:10.1103/PhysRevD.98.030001.

[216] S. A. Bass et al., Microscopic models for ultrarelativistic heavy ion collisions, Prog. Part. Nucl.
Phys. 41, 255 (1998), doi:10.1016/S0146-6410(98)00058-1, nucl-th/9803035.

[217] R. García-Martín, R. Kamiński, J. R. Peláez, J. Ruiz de Elvira and F. J. Ynduráin, The
Pion-pion scattering amplitude. IV: Improved analysis with once subtracted Roy-like equations up
to 1100 MeV, Phys. Rev. D 83, 074004 (2011), doi:10.1103/PhysRevD.83.074004,
1102.2183.

[218] J. R. Peláez, A. Rodas and J. Ruiz De Elvira, Global parameterization of ππ scattering up
to 2 GeV, Eur. Phys. J. C 79(12), 1008 (2019), doi:10.1140/epjc/s10052-019-7509-6,
1907.13162.

[219] J. R. Peláez and A. Rodas, Pion-kaon scattering amplitude constrained with
forward dispersion relations up to 1.6 GeV, Phys. Rev. D 93(7), 074025 (2016),
doi:10.1103/PhysRevD.93.074025, 1602.08404.

[220] L. Montanet et al., Review of particle properties. Particle Data Group, Phys. Rev. D 50, 1173
(1994), doi:10.1103/PhysRevD.50.1173.

[221] P. Koch and C. B. Dover, K± , p̄ and Ω− Production in Relativistic Heavy Ion Collisions, Phys.
Rev. C 40, 145 (1989), doi:10.1103/PhysRevC.40.145.

[222] P. V. Landshoff and J. C. Polkinghorne, Calorimeter Triggers for Hard Collisions, Phys. Rev.
D 18, 3344 (1978), doi:10.1103/PhysRevD.18.3344.

[223] C. Goebel, F. Halzen and D. M. Scott, Double Drell-Yan Annihilations in Hadron
Collisions: Novel Tests of the Constituent Picture, Phys. Rev. D 22, 2789 (1980),
doi:10.1103/PhysRevD.22.2789.

[224] V. A. Abramovsky, V. N. Gribov and O. V. Kancheli, Character of Inclusive Spectra and
Fluctuations Produced in Inelastic Processes by Multi-Pomeron Exchange, Yad. Fiz. 18, 595 (1973).

[225] T. Sjöstrand, Multiple Parton-Parton Interactions in Hadronic Events, In 23rd International
Conference on High-Energy Physics (1985).

[226] P. Bartalini and J. R. Gaunt, eds., Multiple Parton Interactions at the LHC, vol. 29, WSP,
ISBN 978-981-322-775-0, 978-981-322-777-4, doi:10.1142/10646 (2019).

[227] T. Sjöstrand, The Development of MPI Modeling in Pythia, Adv. Ser. Direct. High Energy
Phys. 29, 191 (2018), doi:10.1142/9789813227767_0010, 1706.02166.

[228] L. D. McLerran and R. Venugopalan, Computing quark and gluon distribution functions for
very large nuclei, Phys. Rev. D 49, 2233 (1994), doi:10.1103/PhysRevD.49.2233,
hep-ph/9309289.

[229] F. Gelis, E. Iancu, J. Jalilian-Marian and R. Venugopalan, The Color Glass Condensate, Ann.
Rev. Nucl. Part. Sci. 60, 463 (2010), doi:10.1146/annurev.nucl.010909.083629,
1002.0333.

[230] R. D. Ball, V. Bertone, S. Carrazza, L. Del Debbio, S. Forte, A. Guffanti, N. P. Hartland
and J. Rojo, Parton distributions with QED corrections, Nucl. Phys. B 877, 290 (2013),
doi:10.1016/j.nuclphysb.2013.10.010, 1308.0598.

[231] L. V. Gribov, E. M. Levin and M. G. Ryskin, Semihard Processes in QCD, Phys. Rept. 100, 1
(1983), doi:10.1016/0370-1573(83)90022-4.

[232] A. H. Mueller and J.-w. Qiu, Gluon Recombination and Shadowing at Small Values of x, Nucl.
Phys. B 268, 427 (1986), doi:10.1016/0550-3213(86)90164-1.

[233] R. J. Glauber, High-Energy Collision Theory, In W. E. Brittin and L. G. Dunham, eds., Lectures
in Theoretical Physics, vol. I, pp. 315 – 414. Interscience, New York (1959).

[234] T. T. Chou and C.-N. Yang, Model of Elastic High-Energy Scattering, Phys. Rev. 170, 1591
(1968), doi:10.1103/PhysRev.170.1591.

[235] C. Bourrely, J. Soffer and T. T. Wu, Impact Picture Expectations for Very High-Energy Elastic
pp and pp̄ Scattering, Nucl. Phys. B 247, 15 (1984), doi:10.1016/0550-3213(84)90369-9.

[236] P. L’Heureux, B. Margolis and P. Valin, Quark-Gluon Model for Diffraction at
High-Energies, Phys. Rev. D 32, 1681 (1985), doi:10.1103/PhysRevD.32.1681.

[237] L. Frankfurt, M. Strikman and C. Weiss, Small-x physics: From HERA to LHC and beyond,
Ann. Rev. Nucl. Part. Sci. 55, 403 (2005), doi:10.1146/annurev.nucl.53.041002.110615,
hep-ph/0507286.

[238] R. Corke and T. Sjöstrand, Multiparton Interactions with an x-dependent Proton Size, JHEP
05, 009 (2011), doi:10.1007/JHEP05(2011)009, 1101.5953.

[239] J. R. Gaunt and W. J. Stirling, Double Parton Distributions Incorporating Perturbative
QCD Evolution and Momentum and Quark Number Sum Rules, JHEP 03, 005 (2010),
doi:10.1007/JHEP03(2010)005, 0910.4347.

[240] K. Konishi, A. Ukawa and G. Veneziano, Jet Calculus: A Simple Algorithm for Resolving QCD
Jets, Nucl. Phys. B 157, 45 (1979), doi:10.1016/0550-3213(79)90053-1.

[241] R. Kirschner, Generalized Lipatov-Altarelli-Parisi Equations and Jet Calculus Rules, Phys.
Lett. B 84, 266 (1979), doi:10.1016/0370-2693(79)90300-9.

[242] N. Paver and D. Treleani, Multi-Quark Scattering and Large p_T Jet Production in Hadronic
Collisions, Nuovo Cim. A 70, 215 (1982), doi:10.1007/BF02814035.

[243] N. Paver and D. Treleani, Multiple Parton Interactions and Multi-Jet Events at Collider and
Tevatron Energies, Phys. Lett. B 146, 252 (1984), doi:10.1016/0370-2693(84)91029-3.

[244] N. Paver and D. Treleani, Multiple Parton Processes in the TeV Region, Z. Phys. C
28, 187 (1985), doi:10.1007/BF01575722.

[245] R. Corke and T. Sjöstrand, Multiparton Interactions and Rescattering, JHEP 01, 035 (2010),
doi:10.1007/JHEP01(2010)035, 0911.1909.

[246] E. Avsar, G. Gustafson and L. Lönnblad, Small-x dipole evolution beyond the large-N(c) limit,
JHEP 01, 012 (2007), doi:10.1088/1126-6708/2007/01/012, hep-ph/0610157.

[247] C. Bierlich, G. Gustafson, L. Lönnblad and A. Tarasov, Effects of Overlapping Strings in pp
Collisions, JHEP 03, 148 (2015), doi:10.1007/JHEP03(2015)148, 1412.6259.

[248] C. Bierlich and C. O. Rasmussen, Dipole evolution: perspectives for collectivity and γ*A
collisions, JHEP 10, 026 (2019), doi:10.1007/JHEP10(2019)026, 1907.12871.

[249] T. Sjöstrand and P. Z. Skands, Multiple interactions and the structure of beam remnants, JHEP
03, 053 (2004), doi:10.1088/1126-6708/2004/03/053, hep-ph/0402078.

[250] J. R. Christiansen and P. Z. Skands, String Formation Beyond Leading Colour, JHEP 08, 003
(2015), doi:10.1007/JHEP08(2015)003, 1505.01681.

[251] G. Ingelman and P. E. Schlein, Jet Structure in High Mass Diffractive Scattering, Phys. Lett.
B 152, 256 (1985), doi:10.1016/0370-2693(85)91181-5.

[252] S. Navin, Diffraction in Pythia (2010), 1005.3894.

[253] T. Affolder et al., Diffractive dijets with a leading antiproton in p̄p collisions at √s = 1800
GeV, Phys. Rev. Lett. 84, 5043 (2000), doi:10.1103/PhysRevLett.84.5043.

[254] G. Aad et al., Dijet production in √s = 7 TeV pp collisions with large rapidity gaps at the
ATLAS experiment, Phys. Lett. B 754, 214 (2016), doi:10.1016/j.physletb.2016.01.028,
1511.00502.

[255] C. O. Rasmussen and T. Sjöstrand, Hard Diffraction with Dynamic Gap Survival, JHEP 02,
142 (2016), doi:10.1007/JHEP02(2016)142, 1512.05525.

[256] Measurement of dijet production with a leading proton in proton-proton collisions at √s = 8
TeV (2018).

[257] R. Kleiss et al., Monte Carlos for Electroweak Physics, In LEP Physics Workshop
(1989).

[258] D. Schulte, Beam-beam simulations with GUINEA-PIG (1999).

[259] T. Ohl, CIRCE version 1.0: Beam spectra for simulating linear collider physics, Comput. Phys.
Commun. 101, 269 (1997), doi:10.1016/S0010-4655(96)00167-1, hep-ph/9607454.

[260] R. P. Feynman, Very high-energy collisions of hadrons, Phys. Rev. Lett. 23, 1415 (1969),
doi:10.1103/PhysRevLett.23.1415.

[261] J. D. Bjorken and E. A. Paschos, Inelastic Electron Proton and gamma Proton Scattering, and
the Structure of the Nucleon, Phys. Rev. 185, 1975 (1969), doi:10.1103/PhysRev.185.1975.

[262] J. C. Collins, D. E. Soper and G. F. Sterman, Transverse Momentum Distribution in Drell-Yan
Pair and W and Z Boson Production, Nucl. Phys. B 250, 199 (1985),
doi:10.1016/0550-3213(85)90479-1.

[263] R. Angeles-Martinez et al., Transverse Momentum Dependent (TMD) parton
distribution functions: status and prospects, Acta Phys. Polon. B 46(12), 2501 (2015),
doi:10.5506/APhysPolB.46.2501, 1507.05267.

[264] A. Kerbizi and L. Lönnblad, StringSpinner – adding spin to the PYTHIA string fragmentation
(2021), 2105.09730.

[265] F. Cornet, P. Jankowski, M. Krawczyk and A. Lorca, A New five flavor LO analysis and
parametrization of parton distributions in the real photon, Phys. Rev. D 68, 014010 (2003),
doi:10.1103/PhysRevD.68.014010, hep-ph/0212160.

[266] C. F. von Weizsäcker, Radiation emitted in collisions of very fast electrons, Z. Phys. 88, 612
(1934), doi:10.1007/BF01333110.

[267] E. J. Williams, Nature of the high-energy particles of penetrating radiation and status of
ionization and radiation formulae, Phys. Rev. 45, 729 (1934), doi:10.1103/PhysRev.45.729.

[268] G. A. Schuler and T. Sjöstrand, Parton distributions of the virtual photon, Phys. Lett. B 376,
193 (1996), doi:10.1016/0370-2693(96)00265-1, hep-ph/9601282.

[269] I. Helenius, Simulations of photo-nuclear dijets with Pythia 8 and their sensitivity to nuclear
PDFs, PoS DIS2018, 113 (2018), doi:10.22323/1.316.0113, 1806.07326.

[270] S. Chekanov et al., Diffractive photoproduction of dijets in ep collisions at HERA, Eur. Phys.
J. C 55, 177 (2008), doi:10.1140/epjc/s10052-008-0598-2, 0710.1498.

[271] V. Andreev et al., Diffractive Dijet Production with a Leading Proton in ep Collisions at HERA,
JHEP 05, 056 (2015), doi:10.1007/JHEP05(2015)056, 1502.01683.

[272] I. Helenius and C. O. Rasmussen, Hard diffraction in photoproduction with Pythia 8, Eur.
Phys. J. C 79(5), 413 (2019), doi:10.1140/epjc/s10052-019-6914-1, 1901.05261.

[273] T. H. Bauer, R. D. Spital, D. R. Yennie and F. M. Pipkin, The Hadronic
Properties of the Photon in High-Energy Interactions, Rev. Mod. Phys. 50, 261 (1978),
doi:10.1103/RevModPhys.50.261, [Erratum: Rev.Mod.Phys. 51, 407 (1979)].

[274] M. Derrick et al., Measurement of elastic ρ0 photoproduction at HERA, Z. Phys. C 69, 39
(1995), doi:10.1007/s002880050004, hep-ex/9507011.

[275] M. Derrick et al., Measurement of elastic ω photoproduction at HERA, Z. Phys. C 73, 73
(1996), doi:10.1007/s002880050297, hep-ex/9608010.

[276] M. Derrick et al., Measurement of elastic φ photoproduction at HERA, Phys. Lett. B 377, 259
(1996), doi:10.1016/0370-2693(96)00172-4, hep-ex/9601009.

[277] J. Breitweg et al., Measurement of elastic J/ψ photoproduction at HERA, Z. Phys. C 75, 215
(1997), doi:10.1007/s002880050464, hep-ex/9704013.

[278] S. Chekanov et al., Exclusive photoproduction of J/ψ mesons at HERA, Eur. Phys. J. C 24,
345 (2002), doi:10.1007/s10052-002-0953-7, hep-ex/0201043.

[279] A. Aktas et al., Elastic J/ψ production at HERA, Eur. Phys. J. C 46, 585 (2006),
doi:10.1140/epjc/s2006-02519-5, hep-ex/0510016.

[280] C. Alexa et al., Elastic and Proton-Dissociative Photoproduction of J/ψ Mesons at HERA, Eur.
Phys. J. C 73(6), 2466 (2013), doi:10.1140/epjc/s10052-013-2466-y, 1304.5162.

[281] M. Drees and D. Zeppenfeld, Production of Supersymmetric Particles in Elastic ep Collisions,
Phys. Rev. D 39, 2536 (1989), doi:10.1103/PhysRevD.39.2536.

[282] V. M. Budnev, I. F. Ginzburg, G. V. Meledin and V. G. Serbo, The Two photon particle
production mechanism. Physical problems. Applications. Equivalent photon approximation, Phys.
Rept. 15, 181 (1975), doi:10.1016/0370-1573(75)90009-5.

[283] J. D. Jackson, Classical Electrodynamics, Wiley, ISBN 978-0-471-30932-1 (1998).

[284] C. Bierlich, G. Gustafson, L. Lönnblad and H. Shah, The Angantyr model for Heavy-Ion
Collisions in PYTHIA8, JHEP 10, 134 (2018), doi:10.1007/JHEP10(2018)134, 1806.10820.

[285] B. Andersson, G. Gustafson and B. Nilsson-Almqvist, A Model for Low p(t) Hadronic
Reactions, with Generalizations to Hadron-Nucleus and Nucleus-Nucleus Collisions, Nucl. Phys.
B 281, 289 (1987), doi:10.1016/0550-3213(87)90257-4.

[286] A. Białas, M. Bleszyński and W. Czyż, Multiplicity Distributions in Nucleus-Nucleus Collisions
at High-Energies, Nucl. Phys. B 111, 461 (1976), doi:10.1016/0550-3213(76)90329-1.

[287] C. Bierlich, G. Gustafson and L. Lönnblad, Diffractive and non-diffractive wounded nucleons
and final states in pA collisions, JHEP 10, 139 (2016), doi:10.1007/JHEP10(2016)139,
1607.04434.

[288] W. Broniowski, M. Rybczyński and P. Bożek, GLISSANDO: Glauber initial-state simulation
and more.., Comput. Phys. Commun. 180, 69 (2009), doi:10.1016/j.cpc.2008.07.016,
0710.5731.

[289] M. Rybczyński, G. Stefanek, W. Broniowski and P. Bożek, GLISSANDO 2: GLauber
Initial-State Simulation AND mOre..., ver. 2, Comput. Phys. Commun. 185, 1759 (2014),
doi:10.1016/j.cpc.2014.02.016, 1310.5475.

[290] B. Andersson, G. Gustafson and B. Söderberg, A General Model for Jet Fragmentation, Z.
Phys. C 20, 317 (1983), doi:10.1007/BF01407824.

[291] T. Sjöstrand, Jet Fragmentation of Nearby Partons, Nucl. Phys. B 248, 469 (1984),
doi:10.1016/0550-3213(84)90607-2.

[292] B. Andersson, G. Gustafson and B. Söderberg, A Probability Measure on Parton and String
States pp. 145–150 (1985), doi:10.1016/0550-3213(86)90471-2.

[293] M. G. Bowler, e+ e− Production of Heavy Quarks in the String Model, Z. Phys. C 11, 169
(1981), doi:10.1007/BF01574001.

[294] X. Artru and G. Mennessier, String model and multiproduction, Nucl. Phys. B 70, 93 (1974),
doi:10.1016/0550-3213(74)90360-5.

[295] D. A. Morris, Heavy Quark Fragmentation Functions in a Simple String Model, Nucl. Phys.
B 313, 634 (1989), doi:10.1016/0550-3213(89)90399-4.

[296] B. Andersson, G. Gustafson, G. Ingelman and T. Sjöstrand, Parton Fragmentation and String
Dynamics, Phys. Rept. 97, 31 (1983), doi:10.1016/0370-1573(83)90080-7.

[297] C. Bierlich, S. Chakraborty, G. Gustafson and L. Lönnblad, Hyperfine splitting effects in string
hadronization (2022), 2201.06316.

[298] B. Andersson, G. Gustafson and T. Sjöstrand, A Model for Baryon Production in Quark and
Gluon Jets, Nucl. Phys. B 197, 45 (1982), doi:10.1016/0550-3213(82)90153-5.

[299] A. Casher, H. Neuberger and S. Nussinov, Chromoelectric Flux Tube Model of Particle
Production, Phys. Rev. D 20, 179 (1979), doi:10.1103/PhysRevD.20.179.

[300] B. Andersson, G. Gustafson and T. Sjöstrand, Baryon Production in Jet Fragmentation and
Υ Decay, Phys. Scripta 32, 574 (1985), doi:10.1088/0031-8949/32/6/003.

[301] T. Sjöstrand, The Merging of Jets, Phys. Lett. B 142, 420 (1984),
doi:10.1016/0370-2693(84)91354-6.

[302] S. Ferreres-Solé and T. Sjöstrand, The space–time structure of hadronization in the Lund
model, Eur. Phys. J. C 78(11), 983 (2018), doi:10.1140/epjc/s10052-018-6459-8,
1808.04619.

[303] T. Sjöstrand and P. Z. Skands, Baryon number violation and string topologies, Nucl. Phys. B
659, 243 (2003), doi:10.1016/S0550-3213(03)00193-7, hep-ph/0212264.

[304] E. Norrbin and T. Sjöstrand, Production mechanisms of charm hadrons in the string model,
Phys. Lett. B 442, 407 (1998), doi:10.1016/S0370-2693(98)01244-1, hep-ph/9809266.

[305] E. Boos et al., Generic User Process Interface for Event Generators, In 2nd Les Houches
Workshop on Physics at TeV Colliders (2001), hep-ph/0109068.

[306] J. Alwall et al., A Standard format for Les Houches event files, Comput. Phys. Commun. 176,
300 (2007), doi:10.1016/j.cpc.2006.11.010, hep-ph/0609017.

[307] H. Fritzsch, Producing Heavy Quark Flavors in Hadronic Collisions: A Test of Quantum
Chromodynamics, Phys. Lett. B 67, 217 (1977), doi:10.1016/0370-2693(77)90108-3.

[308] A. Ali, J. G. Korner, G. Kramer and J. Willrodt, Nonleptonic Weak Decays of Bottom Mesons,
Z. Phys. C 1, 269 (1979), doi:10.1007/BF01440227.

[309] H. Fritzsch, How to Discover the B Mesons, Phys. Lett. B 86, 343 (1979),
doi:10.1016/0370-2693(79)90853-0.

[310] C. Albajar et al., A Study of the General Characteristics of pp̄ Collisions at √s = 0.2-TeV to
0.9-TeV, Nucl. Phys. B 335, 261 (1990), doi:10.1016/0550-3213(90)90493-W.

[311] T. Sjöstrand and V. A. Khoze, On Color rearrangement in hadronic W+ W− events, Z. Phys. C
62, 281 (1994), doi:10.1007/BF01560244, hep-ph/9310242.

[312] S. Schael et al., Electroweak Measurements in Electron-Positron Collisions at W-Boson-Pair
Energies at LEP, Phys. Rept. 532, 119 (2013), doi:10.1016/j.physrep.2013.07.004,
1302.3415.

[313] Combination of CDF and D0 results on the mass of the top quark using up 9.7 fb−1 at the
Tevatron (2016), 1608.01881.

[314] M. Aaboud et al., Measurement of the top quark mass in the t t̄ → lepton+jets channel from
√s = 8 TeV ATLAS data and combination with previous results, Eur. Phys. J. C 79(4), 290
(2019), doi:10.1140/epjc/s10052-019-6757-9, 1810.01772.

[315] A. M. Sirunyan et al., Measurement of the top quark mass in the all-jets final state at √s =
13 TeV and combination with the lepton+jets channel, Eur. Phys. J. C 79(4), 313 (2019),
doi:10.1140/epjc/s10052-019-6788-2, 1812.10534.

[316] M. Sandhoff and P. Z. Skands, Colour annealing - a toy model of colour reconnections, In 4th
Les Houches Workshop on Physics at TeV Colliders (2005).

[317] P. Z. Skands and D. Wicke, Non-perturbative QCD effects and the top mass at the Tevatron,
Eur. Phys. J. C 52, 133 (2007), doi:10.1140/epjc/s10052-007-0352-1, hep-ph/0703081.

[318] P. Z. Skands, Tuning Monte Carlo Generators: The Perugia Tunes, Phys. Rev. D 82, 074018
(2010), doi:10.1103/PhysRevD.82.074018, 1005.3457.

[319] S. Argyropoulos and T. Sjöstrand, Effects of color reconnection on t t̄ final states at the LHC,
JHEP 11, 043 (2014), doi:10.1007/JHEP11(2014)043, 1407.6653.

[320] L. Lönnblad, Reconnecting colored dipoles, Z. Phys. C 70, 107 (1996),
doi:10.1007/s002880050087.

[321] T. Sjöstrand and V. A. Khoze, Does the W mass reconstruction survive QCD effects?, Phys.
Rev. Lett. 72, 28 (1994), doi:10.1103/PhysRevLett.72.28, hep-ph/9310276.

[322] J. R. Christiansen and T. Sjöstrand, Color reconnection at future e+ e− colliders, Eur. Phys.
J. C 75(9), 441 (2015), doi:10.1140/epjc/s10052-015-3674-4, 1506.09085.

[323] A. Chodos, R. L. Jaffe, K. Johnson, C. B. Thorn and V. F. Weisskopf, A New Extended Model
of Hadrons, Phys. Rev. D 9, 3471 (1974), doi:10.1103/PhysRevD.9.3471.

[324] D. Wicke and P. Z. Skands, Non-perturbative QCD Effects and the Top Mass at the Tevatron,
Nuovo Cim. B 123, S1 (2008), doi:10.1393/ncb/i2009-10749-y, 0807.3248.


[325] G. Gustafson and J. Hakkinen, Color interference and confinement effects in W pair production,
Z. Phys. C 64, 659 (1994), doi:10.1007/BF01957774.

[326] W. Buchmüller and A. Hebecker, A Parton model for diffractive processes in deep inelastic
scattering, Phys. Lett. B 355, 573 (1995), doi:10.1016/0370-2693(95)00721-V,
hep-ph/9504374.

[327] A. Edin, G. Ingelman and J. Rathsman, Soft color interactions as the origin of rapidity gaps
in DIS, Phys. Lett. B 366, 371 (1996), doi:10.1016/0370-2693(95)01391-1,
hep-ph/9508386.

[328] A. Edin, G. Ingelman and J. Rathsman, Unified description of rapidity gaps and energy flows in
DIS final states, Z. Phys. C 75, 57 (1997), doi:10.1007/s002880050447, hep-ph/9605281.

[329] R. Pasechnik, R. Enberg and G. Ingelman, Diffractive deep inelastic scattering
from multiple soft gluon exchange in QCD, Phys. Lett. B 695, 189 (2011),
doi:10.1016/j.physletb.2010.11.010, 1004.2912.

[330] R. Enberg, G. Ingelman and N. Timneanu, Soft color interactions and diffractive hard scattering
at the Tevatron, Phys. Rev. D 64, 114015 (2001), doi:10.1103/PhysRevD.64.114015,
hep-ph/0106246.

[331] A. Edin, G. Ingelman and J. Rathsman, Quarkonium production at the Tevatron through
soft color interactions, Phys. Rev. D 56, 7317 (1997), doi:10.1103/PhysRevD.56.7317,
hep-ph/9705311.

[332] J. Rathsman, A Generalized area law for hadronic string re-interactions, Phys. Lett. B 452,
364 (1999), doi:10.1016/S0370-2693(99)00291-9, hep-ph/9812423.

[333] S. Gieseke, C. Rohr and A. Siódmok, Colour reconnections in Herwig++, Eur. Phys. J. C 72,
2225 (2012), doi:10.1140/epjc/s10052-012-2225-5, 1206.0041.

[334] S. Gieseke, P. Kirchgaeßer, S. Plätzer and A. Siódmok, Colour Reconnection from Soft Gluon
Evolution, JHEP 11, 149 (2018), doi:10.1007/JHEP11(2018)149, 1808.06770.

[335] J. Bellm, C. B. Duncan, S. Gieseke, M. Myska and A. Siódmok, Spacetime colour reconnection
in Herwig 7, Eur. Phys. J. C 79(12), 1003 (2019), doi:10.1140/epjc/s10052-019-7533-6,
1909.08850.

[336] V. A. Khoze, F. Krauss, A. D. Martin, M. G. Ryskin and K. C. Zapp, Diffraction and
correlations at the LHC: Definitions and observables, Eur. Phys. J. C 69, 85 (2010),
doi:10.1140/epjc/s10052-010-1392-5, 1005.4839.

[337] K. C. Zapp, JEWEL 2.0.0: directions for use, Eur. Phys. J. C 74(2), 2762 (2014),
doi:10.1140/epjc/s10052-014-2762-1, 1311.0048.

[338] S. Cao et al., Multistage Monte-Carlo simulation of jet modification in a static medium, Phys.
Rev. C 96(2), 024909 (2017), doi:10.1103/PhysRevC.96.024909, 1705.00050.

[339] K. Werner, Core-corona separation in ultra-relativistic heavy ion collisions, Phys. Rev. Lett.
98, 152301 (2007), doi:10.1103/PhysRevLett.98.152301, 0704.1270.


[340] T. Pierog, I. Karpenko, J. M. Katzy, E. Yatsenko and K. Werner, EPOS LHC: Test of collective
hadronization with data measured at the CERN Large Hadron Collider, Phys. Rev. C 92(3),
034906 (2015), doi:10.1103/PhysRevC.92.034906, 1306.0121.

[341] Y. Kanakubo, Y. Tachibana and T. Hirano, Unified description of hadron yield ratios
from dynamical core-corona initialization, Phys. Rev. C 101(2), 024912 (2020),
doi:10.1103/PhysRevC.101.024912, 1910.10556.

[342] V. Khachatryan et al., Observation of Long-Range Near-Side Angular Correlations in Proton-Proton
Collisions at the LHC, JHEP 09, 091 (2010), doi:10.1007/JHEP09(2010)091, 1009.4122.

[343] V. Khachatryan et al., Strange Particle Production in pp Collisions at √s = 0.9 and 7 TeV,
JHEP 05, 064 (2011), doi:10.1007/JHEP05(2011)064, 1102.4282.

[344] R. Aaij et al., Measurement of prompt hadron production ratios in pp collisions at √s =
0.9 and 7 TeV, Eur. Phys. J. C 72, 2168 (2012), doi:10.1140/epjc/s10052-012-2168-x,
1206.5160.

[345] K. Aamodt et al., Strange particle production in proton-proton collisions at √s = 0.9 TeV with
ALICE at the LHC, Eur. Phys. J. C 71, 1594 (2011), doi:10.1140/epjc/s10052-011-1594-5,
1012.3257.

[346] B. Abelev et al., Multi-strange baryon production in pp collisions at √s = 7 TeV with ALICE,
Phys. Lett. B 712, 309 (2012), doi:10.1016/j.physletb.2012.05.011, 1204.0282.

[347] J. Adam et al., Enhanced production of multi-strange hadrons in high-multiplicity proton-proton
collisions, Nature Phys. 13, 535 (2017), doi:10.1038/nphys4111, 1606.07424.

[348] A. Ortiz Velasquez, P. Christiansen, E. Cuautle Flores, I. Maldonado Cervantes and G. Paić,
Color Reconnection and Flowlike Patterns in pp Collisions, Phys. Rev. Lett. 111(4), 042001
(2013), doi:10.1103/PhysRevLett.111.042001, 1303.6326.

[349] C. Bierlich and J. R. Christiansen, Effects of color reconnection on hadron flavor observables,
Phys. Rev. D 92(9), 094010 (2015), doi:10.1103/PhysRevD.92.094010, 1507.02091.

[350] C. Bierlich, Microscopic collectivity: The ridge and strangeness enhancement
from string–string interactions, Nucl. Phys. A 982, 499 (2019),
doi:10.1016/j.nuclphysa.2018.07.015, 1807.05271.

[351] V. A. Abramovsky, E. V. Gedalin, E. G. Gurvich and O. V. Kancheli, Long Range Azimuthal
Correlations in Multiple Production Processes at High-energies, JETP Lett. 47, 337 (1988).

[352] C. Bierlich, Soft modifications to jet fragmentation in high energy proton–proton collisions,
Phys. Lett. B 795, 194 (2019), doi:10.1016/j.physletb.2019.06.018, 1901.07447.

[353] C. Bierlich, S. Chakraborty, G. Gustafson and L. Lönnblad, Setting the string shoving picture
in a new frame, JHEP 03, 270 (2021), doi:10.1007/JHEP03(2021)270, 2010.07595.

[354] P. Cea, L. Cosmai, F. Cuteri and A. Papa, Flux tubes in the SU(3) vacuum: London
penetration depth and coherence length, Phys. Rev. D 89(9), 094505 (2014),
doi:10.1103/PhysRevD.89.094505, 1404.1172.


[355] T. S. Biro, H. B. Nielsen and J. Knoll, Color Rope Model for Extreme Relativistic Heavy Ion
Collisions, Nucl. Phys. B 245, 449 (1984), doi:10.1016/0550-3213(84)90441-3.

[356] G. S. Bali, Casimir scaling of SU(3) static potentials, Phys. Rev. D 62, 114503 (2000),
doi:10.1103/PhysRevD.62.114503, hep-lat/0006022.

[357] S. Jeon and R. Venugopalan, Random walks of partons in SU(N(c)) and classical representations
of color charges in QCD at small x, Phys. Rev. D 70, 105012 (2004),
doi:10.1103/PhysRevD.70.105012, hep-ph/0406169.

[358] P. Skands, S. Carrazza and J. Rojo, Tuning PYTHIA 8.1: the Monash 2013 Tune, Eur. Phys.
J. C 74(8), 3024 (2014), doi:10.1140/epjc/s10052-014-3024-y, 1404.5630.

[359] N. Fischer and T. Sjöstrand, Thermodynamical String Fragmentation, JHEP 01, 140 (2017),
doi:10.1007/JHEP01(2017)140, 1610.09818.

[360] A. Białas, Fluctuations of string tension and transverse mass distribution, Phys. Lett. B 466,
301 (1999), doi:10.1016/S0370-2693(99)01159-4, hep-ph/9909417.

[361] C. Bierlich, T. Sjöstrand and M. Utheim, Hadronic rescattering in pA and AA collisions, Eur.
Phys. J. A 57(7), 227 (2021), doi:10.1140/epja/s10050-021-00543-3, 2103.09665.

[362] R. Hanbury Brown and R. Q. Twiss, A Test of a new type of stellar interferometer on Sirius,
Nature 178, 1046 (1956), doi:10.1038/1781046a0.

[363] N. Neumeister et al., Higher order Bose-Einstein correlations in pp̄ collisions at
√s = 630-GeV and 900-GeV, Phys. Lett. B 275, 186 (1992),
doi:10.1016/0370-2693(92)90874-4.

[364] G. Aad et al., Two-particle Bose–Einstein correlations in pp collisions at √s = 0.9
and 7 TeV measured with the ATLAS detector, Eur. Phys. J. C 75(10), 466 (2015),
doi:10.1140/epjc/s10052-015-3644-x, 1502.07947.

[365] V. Khachatryan et al., Measurement of Bose-Einstein Correlations in pp Collisions at √s = 0.9
and 7 TeV, JHEP 05, 029 (2011), doi:10.1007/JHEP05(2011)029, 1101.3518.

[366] R. Aaij et al., Bose-Einstein correlations of same-sign charged pions in the forward region
in pp collisions at √s = 7 TeV, JHEP 12, 025 (2017), doi:10.1007/JHEP12(2017)025,
1709.01769.

[367] P. D. Acton et al., A Study of Bose-Einstein correlations in e+ e− annihilations at LEP, Phys.
Lett. B 267, 143 (1991), doi:10.1016/0370-2693(91)90540-7.

[368] P. D. Acton et al., A Study of K0s K0s Bose-Einstein correlations in hadronic Z0 decays, Phys.
Lett. B 298, 456 (1993), doi:10.1016/0370-2693(93)91851-D.

[369] D. Decamp et al., A Study of Bose-Einstein correlations in e+ e− annihilation at 91-GeV, Z.
Phys. C 54, 75 (1992), doi:10.1007/BF01881709.

[370] P. Abreu et al., Interference of neutral kaons in the hadronic decays of the Z0 , Phys. Lett. B
323, 242 (1994), doi:10.1016/0370-2693(94)90298-4.


[371] L. Lönnblad and T. Sjöstrand, Bose-Einstein effects and W mass determinations, Phys. Lett.
B 351, 293 (1995), doi:10.1016/0370-2693(95)00393-Y.

[372] M. Gyulassy, S. K. Kauffmann and L. W. Wilson, Pion Interferometry of Nuclear Collisions. 1.
Theory, Phys. Rev. C 20, 2267 (1979), doi:10.1103/PhysRevC.20.2267.

[373] L. Lönnblad and T. Sjöstrand, Modeling Bose-Einstein correlations at LEP-2, Eur. Phys. J. C
2, 165 (1998), doi:10.1007/s100520050131, hep-ph/9711460.

[374] F. Donato, N. Fornengo and P. Salati, Anti-deuterons as a signature of supersymmetric dark
matter, Phys. Rev. D 62, 043003 (2000), doi:10.1103/PhysRevD.62.043003,
hep-ph/9904481.

[375] A. Andronic, P. Braun-Munzinger, J. Stachel and H. Stocker, Production of light nuclei,
hypernuclei and their antiparticles in relativistic nuclear collisions, Phys. Lett. B 697, 203
(2011), doi:10.1016/j.physletb.2011.01.053, 1010.2995.

[376] A. Schwarzschild and C. Zupancic, Production of Tritons, Deuterons, Nucleons, and
Mesons by 30-GeV Protons on A-1, Be, and Fe Targets, Phys. Rev. 129, 854 (1963),
doi:10.1103/PhysRev.129.854.

[377] J. I. Kapusta, Mechanisms for deuteron production in relativistic nuclear collisions, Phys. Rev.
C 21, 1301 (1980), doi:10.1103/PhysRevC.21.1301.

[378] L. A. Dal and A. R. Raklev, Alternative formation model for antideuterons from dark matter,
Phys. Rev. D 91(12), 123536 (2015), doi:10.1103/PhysRevD.91.123536, [Erratum:
Phys.Rev.D 92, 069903 (2015), Erratum: Phys.Rev.D 92, 089901 (2015)], 1504.07242.

[379] J. Beringer et al., Review of Particle Physics (RPP), Phys. Rev. D 86, 010001 (2012),
doi:10.1103/PhysRevD.86.010001.

[380] A. De Rujula, H. Georgi and S. L. Glashow, Hadron Masses in a Gauge Theory, Phys. Rev. D
12, 147 (1975), doi:10.1103/PhysRevD.12.147.

[381] D. Herndon, P. Soding and R. J. Cashmore, A GENERALIZED ISOBAR MODEL FORMALISM,
Phys. Rev. D 11, 3165 (1975), doi:10.1103/PhysRevD.11.3165.

[382] M. Ablikim et al., Dalitz Plot Analysis of the Decay ω → π+ π− π0 , Phys. Rev. D 98(11),
112007 (2018), doi:10.1103/PhysRevD.98.112007, 1811.03817.

[383] S. Rudaz, ANOMALIES, VECTOR MESONS AND THE ω → 3π CONTACT TERM, Phys. Lett.
B 145, 281 (1984), doi:10.1016/0370-2693(84)90355-1.

[384] S. Jadach, Z. Wa̧s, R. Decker and J. H. Kühn, The τ decay library TAUOLA: Version 2.4,
Comput. Phys. Commun. 76, 361 (1993), doi:10.1016/0010-4655(93)90061-G.

[385] H. Murayama, I. Watanabe and K. Hagiwara, HELAS: HELicity amplitude subroutines for
Feynman diagram evaluations (1992).

[386] J. H. Kühn and A. Santamaria, τ decays to pions, Z. Phys. C 48, 445 (1990),
doi:10.1007/BF01572024.


[387] M. Finkemeier and E. Mirkes, The Scalar contribution to τ → Kπντ , Z. Phys. C 72, 619
(1996), doi:10.1007/s002880050284, hep-ph/9601275.

[388] D. M. Asner et al., Hadronic structure in the decay τ− → ντ π− π0 π0 and the sign of the
tau-neutrino helicity, Phys. Rev. D 61, 012002 (2000), doi:10.1103/PhysRevD.61.012002,
hep-ex/9902022.

[389] M. Finkemeier and E. Mirkes, Tau decays into kaons, Z. Phys. C 69, 243 (1996),
doi:10.1007/s002880050024, hep-ph/9503474.

[390] R. Decker, E. Mirkes, R. Sauer and Z. Wa̧s, Tau decays into three pseudoscalar mesons, Z.
Phys. C 58, 445 (1993), doi:10.1007/BF01557702.

[391] A. E. Bondar, S. I. Eidelman, A. I. Milstein, T. Pierzchala, N. I. Root, Z. Wa̧s and M. Worek,
Novosibirsk hadronic currents for τ → 4π channels of τ decay library TAUOLA, Comput. Phys.
Commun. 146, 139 (2002), doi:10.1016/S0010-4655(02)00262-X, hep-ph/0201149.

[392] J. H. Kühn and Z. Wa̧s, τ decays to five mesons in TAUOLA, Acta Phys. Polon. B 39, 147
(2008), hep-ph/0602162.

[393] P. Ilten, Electroweak and Higgs Measurements Using Tau Final States with the LHCb Detector,
Ph.D. thesis, University Coll., Dublin (2013), 1401.4902.

[394] P. Golonka, B. Kersevan, T. Pierzchała, E. Richter-Wa̧s, Z. Wa̧s and M. Worek, The
tauola-photos-F environment for the TAUOLA and PHOTOS packages: Release. 2., Comput. Phys.
Commun. 174, 818 (2006), doi:10.1016/j.cpc.2005.12.018, hep-ph/0312240.

[395] J. R. Andersen et al., Les Houches 2013: Physics at TeV Colliders: Standard Model Working
Group Report (2014), 1405.1067.

[396] A. Buckley, H. Hoeth, H. Lacker, H. Schulz and J. E. von Seggern, Systematic event generator
tuning for the LHC, Eur. Phys. J. C 65, 331 (2010), doi:10.1140/epjc/s10052-009-1196-7,
0907.2973.

[397] P. Ilten, M. Williams and Y. Yang, Event generator tuning using Bayesian optimization, JINST
12(04), P04028 (2017), doi:10.1088/1748-0221/12/04/P04028, 1610.08328.

[398] J. Bellm and L. Gellersen, High dimensional parameter tuning for event generators, Eur.
Phys. J. C 80(1), 54 (2020), doi:10.1140/epjc/s10052-019-7579-5, 1908.10811.

[399] M. Krishnamoorthy, H. Schulz, X. Ju, W. Wang, S. Leyffer, Z. Marshall, S. Mrenna, J. Müller
and J. B. Kowalkowski, Apprentice for Event Generator Tuning, EPJ Web Conf. 251, 03060
(2021), doi:10.1051/epjconf/202125103060, 2103.05748.

[400] J. M. Butterworth et al., THE TOOLS AND MONTE CARLO WORKING GROUP Summary
Report from the Les Houches 2009 Workshop on TeV Colliders, In 6th Les Houches Workshop
on Physics at TeV Colliders (2010), 1003.1643.

[401] P. Z. Skands et al., SUSY Les Houches accord: Interfacing SUSY spectrum calculators,
decay packages, and event generators, JHEP 07, 036 (2004),
doi:10.1088/1126-6708/2004/07/036, hep-ph/0311123.


[402] B. C. Allanach et al., SUSY Les Houches Accord 2, Comput. Phys. Commun. 180, 8 (2009),
doi:10.1016/j.cpc.2008.08.004, 0801.0045.

[403] J. Alwall, E. Boos, L. Dudko, M. Gigg, M. Herquet, A. Pukhov, P. Richardson, A. Sherstnev
and P. Z. Skands, A Les Houches Interface for BSM Generators, doi:10.2172/921331 (2007),
0712.3311.

[404] E. Bothmann et al., Event Generation with Sherpa 2.2, SciPost Phys. 7(3), 034 (2019),
doi:10.21468/SciPostPhys.7.3.034, 1905.09127.

[405] S. Höche, S. Prestel and H. Schulz, Simulation of Vector Boson Plus Many Jet
Final States at the High Luminosity LHC, Phys. Rev. D 100(1), 014024 (2019),
doi:10.1103/PhysRevD.100.014024, 1905.05120.

[406] M. R. Whalley, D. Bourilkov and R. C. Group, The Les Houches accord PDFs (LHAPDF) and
LHAGLUE, In HERA and the LHC: A Workshop on the Implications of HERA and LHC Physics
(Startup Meeting, CERN, 26-27 March 2004; Midterm Meeting, CERN, 11-13 October 2004),
pp. 575–581 (2005), hep-ph/0508110.

[407] A. Buckley, J. Ferrando, S. Lloyd, K. Nordström, B. Page, M. Rüfenacht, M. Schönherr and
G. Watt, LHAPDF6: parton density access in the LHC precision era, Eur. Phys. J. C 75, 132
(2015), doi:10.1140/epjc/s10052-015-3318-8, 1412.7420.

[408] P. Artoisenet, F. Maltoni and T. Stelzer, Automatic generation of quarkonium amplitudes in
NRQCD, JHEP 02, 102 (2008), doi:10.1088/1126-6708/2008/02/102, 0712.2770.

[409] H.-S. Shao, HELAC-Onia 2.0: an upgraded matrix-element and event generator
for heavy quarkonium physics, Comput. Phys. Commun. 198, 238 (2016),
doi:10.1016/j.cpc.2015.09.011, 1507.03435.

[410] D. J. Lange, The EvtGen particle decay simulation package, Nucl. Instrum. Meth. A 462, 152
(2001), doi:10.1016/S0168-9002(01)00089-4.

[411] M. Dobbs and J. B. Hansen, The HepMC C++ Monte Carlo event record for High Energy
Physics, Comput. Phys. Commun. 134, 41 (2001), doi:10.1016/S0010-4655(00)00189-2.

[412] A. Buckley, P. Ilten, D. Konstantinov, L. Lönnblad, J. Monk, W. Pokorski, T. Przedzinski and
A. Verbytskyi, The HepMC3 event record library for Monte Carlo event generators, Comput.
Phys. Commun. 260, 107310 (2021), doi:10.1016/j.cpc.2020.107310, 1912.08005.

[413] R. Brun and F. Rademakers, ROOT: An object oriented data analysis framework, Nucl.
Instrum. Meth. A 389, 81 (1997), doi:10.1016/S0168-9002(97)00048-X.

[414] A. Buckley, J. Butterworth, D. Grellscheid, H. Hoeth, L. Lönnblad, J. Monk, H. Schulz
and F. Siegert, Rivet user manual, Comput. Phys. Commun. 184, 2803 (2013),
doi:10.1016/j.cpc.2013.05.021, 1003.0694.

[415] C. Bierlich et al., Robust Independent Validation of Experiment and Theory: Rivet version 3,
SciPost Phys. 8, 026 (2020), doi:10.21468/SciPostPhys.8.2.026, 1912.05451.

[416] M. R. Whalley and R. G. Roberts, A USER GUIDE TO HEPDATA: THE DURHAM / RAL HEP
DATABASES ON THE RAL CMS SYSTEM (1988).


[417] E. Maguire, L. Heinrich and G. Watt, HEPData: a repository for high energy physics data,
J. Phys. Conf. Ser. 898(10), 102006 (2017), doi:10.1088/1742-6596/898/10/102006,
1704.05473.

[418] C. Bierlich et al., Confronting experimental data with heavy-ion models: RIVET for heavy ions,
Eur. Phys. J. C 80(5), 485 (2020), doi:10.1140/epjc/s10052-020-8033-4, 2001.10737.

[419] M. Cacciari, G. P. Salam and G. Soyez, FastJet User Manual, Eur. Phys. J. C 72, 1896 (2012),
doi:10.1140/epjc/s10052-012-1896-2, 1111.6097.

[420] J. D. Bjorken, Particle physics: Where do we go from here?, SLAC Beam Line 22(4), 8 (1992).

[421] B. L. Combridge, J. Kripfganz and J. Ranft, Hadron Production at Large Transverse Momentum
and QCD, Phys. Lett. B 70, 234 (1977), doi:10.1016/0370-2693(77)90528-7.

[422] R. Cutler and D. W. Sivers, Quantum Chromodynamic Gluon Contributions to Large p(T)
Reactions, Phys. Rev. D 17, 196 (1978), doi:10.1103/PhysRevD.17.196.

[423] H. U. Bengtsson, The Lund Monte Carlo for High p T Physics, Comput. Phys. Commun. 31,
323 (1984), doi:10.1016/0010-4655(84)90018-3.

[424] E. Eichten, I. Hinchliffe, K. D. Lane and C. Quigg, Super Collider Physics, Rev. Mod. Phys.
56, 579 (1984), doi:10.1103/RevModPhys.56.579, [Addendum: Rev.Mod.Phys. 58,
1065–1073 (1986)].

[425] B. L. Combridge, Associated Production of Heavy Flavor States in pp and p̄p Interactions:
Some QCD Estimates, Nucl. Phys. B 151, 429 (1979),
doi:10.1016/0550-3213(79)90449-8.

[426] F. A. Berends, R. Kleiss, P. De Causmaecker, R. Gastmans and T. T. Wu, Single Bremsstrahlung
Processes in Gauge Theories, Phys. Lett. B 103, 124 (1981),
doi:10.1016/0370-2693(81)90685-7.

[427] F. Halzen and D. M. Scott, Hadroproduction of Photons and Leptons, Phys. Rev. D 18, 3378
(1978), doi:10.1103/PhysRevD.18.3378.

[428] V. Costantini, B. De Tollis and G. Pistoni, Nonlinear effects in quantum electrodynamics,
Nuovo Cim. A 2(3), 733 (1971), doi:10.1007/BF02736745.

[429] E. L. Berger, E. Braaten and R. D. Field, Large p(T) Production of Single and Double Photons in
Proton Proton and Pion-Proton Collisions, Nucl. Phys. B 239, 52 (1984),
doi:10.1016/0550-3213(84)90084-1.

[430] D. A. Dicus and S. S. D. Willenbrock, Photon Pair Production and the Intermediate Mass
Higgs Boson, Phys. Rev. D 37, 1801 (1988), doi:10.1103/PhysRevD.37.1801.

[431] G. Ingelman et al., Deep inelastic physics and simulation, In DESY Workshop 1987: Physics
at HERA (1987).

[432] D. Y. Bardin, M. S. Bilenky, D. Lehner, A. Olchevski and T. Riemann, Semi-analytical approach
to four-fermion production in e+ e− annihilation, Nucl. Phys. B Proc. Suppl. 37(2),
148 (1994), doi:10.1016/0920-5632(94)90670-X, hep-ph/9406340.


[433] E. Gabrielli, The Production of Weak Intermediate Bosons in ep Reactions, Mod. Phys. Lett.
A 1, 465 (1986), doi:10.1142/S0217732386000592, [Erratum: Mod.Phys.Lett.A 2, 69
(1987)].

[434] M. A. Samuel, G. Li, N. Sinha, R. Sinha and M. K. Sundaresan, Bounds on the magnetic
moment of the W boson, Phys. Rev. Lett. 67, 9 (1991), doi:10.1103/PhysRevLett.67.9,
[Erratum: Phys.Rev.Lett. 67, 2920 (1991)].

[435] T. Barklow, Particle physics research at a 500 GeV e+ e− linear collider, Conf. Proc. C 9006252,
440 (1990).

[436] D. W. Duke and J. F. Owens, Quantum Chromodynamics Corrections to Deep Inelastic Compton
Scattering, Phys. Rev. D 26, 1600 (1982), doi:10.1103/PhysRevD.26.1600, [Erratum:
Phys.Rev.D 28, 1227 (1983)].

[437] M. Fontannaz, B. Pire and D. Schiff, Inclusive Photoproduction Cross-sections of Charmed
Mesons and Baryons, Z. Phys. C 11, 211 (1981), doi:10.1007/BF01545678.

[438] G. Bozzi, B. Fuks, B. Herrmann and M. Klasen, Squark and gaugino hadroproduction
and decays in non-minimal flavour violating supersymmetry, Nucl. Phys. B 787, 1 (2007),
doi:10.1016/j.nuclphysb.2007.05.031, 0704.1826.

[439] B. Fuks, B. Herrmann and M. Klasen, Phenomenology of anomaly-mediated supersymmetry
breaking scenarios with non-minimal flavour violation, Phys. Rev. D 86, 015002 (2012),
doi:10.1103/PhysRevD.86.015002, 1112.4838.

[440] K. Huitu, J. Maalampi, A. Pietila and M. Raidal, Doubly charged Higgs at LHC, Nucl. Phys.
B 487, 27 (1997), doi:10.1016/S0550-3213(97)87466-4, hep-ph/9606311.

[441] G. Barenboim, K. Huitu, J. Maalampi and M. Raidal, Constraints on doubly charged
Higgs interactions at linear collider, Phys. Lett. B 394, 132 (1997),
doi:10.1016/S0370-2693(96)01670-X, hep-ph/9611362.

[442] N. Desai, Collider signatures for dark matter and long-lived particles with Pythia 8 (2018),
1807.04240.

[443] C. Ciobanu, T. Junk, G. Veramendi, J. Lee, G. De Lentdecker, K. S. McFarland and
K. Maeshima, Z’ generation with PYTHIA (2005), doi:10.2172/15020136.

[444] G. Altarelli, B. Mele and M. Ruiz-Altaba, Searching for New Heavy Vector Bosons in pp̄
Colliders, Z. Phys. C 45, 109 (1989), doi:10.1007/BF01556677, [Erratum: Z.Phys.C 47,
676 (1990)].

[445] J. L. Hewett and S. Pakvasa, Leptoquark Production in Hadron Colliders, Phys. Rev. D 37,
3165 (1988), doi:10.1103/PhysRevD.37.3165.

[446] E. Eichten, K. D. Lane and M. E. Peskin, New Tests for Quark and Lepton Substructure, Phys.
Rev. Lett. 50, 811 (1983), doi:10.1103/PhysRevLett.50.811.

[447] U. Baur, M. Spira and P. M. Zerwas, Excited Quark and Lepton Production at Hadron Colliders,
Phys. Rev. D 42, 815 (1990), doi:10.1103/PhysRevD.42.815.


[448] J. Bijnens, P. Eerola, M. Maul, A. Mansson and T. Sjöstrand, QCD signatures of narrow
graviton resonances in hadron colliders, Phys. Lett. B 503, 341 (2001),
doi:10.1016/S0370-2693(01)00238-6, hep-ph/0101316.

[449] G. Bella, E. Etzion, N. Hod, Y. Oz, Y. Silver and M. Sutton, A Search for
heavy Kaluza-Klein electroweak gauge bosons at the LHC, JHEP 09, 025 (2010),
doi:10.1007/JHEP09(2010)025, 1004.2432.

[450] S. Ask, J. H. Collins, J. R. Forshaw, K. Joshi and A. D. Pilkington, Identifying the colour of
TeV-scale resonances, JHEP 01, 018 (2012), doi:10.1007/JHEP01(2012)018, 1108.2396.

[451] R. Franceschini, P. P. Giardino, G. F. Giudice, P. Lodone and A. Strumia, LHC bounds on large
extra dimensions, JHEP 05, 092 (2011), doi:10.1007/JHEP05(2011)092, 1101.4919.

[452] G. Bella, E. Etzion, N. Hod and M. Sutton, Introduction to the MCnet Moses project and
Heavy gauge bosons search at the LHC (2010), 1004.1649.

[453] S. Ask, Simulation of Z plus Graviton/Unparticle Production at the LHC, Eur. Phys. J. C 60,
509 (2009), doi:10.1140/epjc/s10052-009-0949-7, 0809.4750.

[454] S. Ask, I. V. Akin, L. Benucci, A. De Roeck, M. Goebel and J. Haller, Real Emission and
Virtual Exchange of Gravitons and Unparticles in Pythia8, Comput. Phys. Commun. 181,
1593 (2010), doi:10.1016/j.cpc.2010.05.013, 0912.4233.


Index

4-vectors, 207

αs
    in DIRE, 99
    in hard processes, 44
    in MPI, 123
    in Simple showers, 69
    in VINCIA, 91
Altarelli-Parisi, see DGLAP
ANGANTYR, 151
Antenna functions, 87–91
Antenna showers, see VINCIA
ARIADNE, 82, 171
Azimuthal asymmetries, 75

Backwards evolution, 59, 63, 83
    Heavy-quark thresholds, 68
Baryon number violation, 77, 160
Baryons, 157, 160
    Baryon-to-meson ratio, 167
    Junction baryons, 160
    Multiply-heavy-flavour, 167
    Rescattering, 177
    in Thermal model, 175
Beam remnants, 134
    for resolved photons, 147
Beamstrahlung, 141
Bose–Einstein effects, see Hadronization
Bottomonium, see Quarkonium
Breit–Wigner distribution, 47, 184
    Effect of PDFs, 49
Breit–Wigner for Higgses, 40
Breit-Wigner distribution, 29

Central diffraction, 28, 114, 118, 252
Charmonium, see Quarkonium
CKKW(-L) merging, 106, 112, 211, 212, 246
Close-packing of strings, 176
Cluster fragmentation, 162
Coalescence, 179
Coherence, 61, 93, 98
Collective effects, 171, 176
Collinear factorization, see PDFs
Colour annealing, 171
Colour reconnections, 162
    Colour annealing, 171
    Dipole swing, 171
    Gluon-move model, 168
    Interplay with resonance decays, 70
    MPI-based model, 165
    QCD-based model, 166
    SK models, 169
Colour ropes, see Ropes
Colour-octet onium production, 37
Constituent quark masses, see Quark masses
Cosmic rays, 76
Cut pomeron, 114, 115
Cutoff scales, 63
    in Simple showers, 69, 76
    in VINCIA, 93

Dal–Raklev model, 180
Dark Matter, 42
Dark matter, 42, 258
Dark photons, 42, 77, 100
Dead cones, see Quark masses
Decays
    of Hadrons, 181, 187, 189
    Helicity density formalism, 195
    of Resonances, 28, 46, 47, 49, 70, 217, 234
    of τ leptons, 196
    Variable widths, 190
Default tune, see Monash tune
Deuteron production, see Hadronization
DGLAP, 50, 57, 61, 81, 89, 98
    for Resolved photons, 144
Diffraction, 114, 118, 121, 137, 252
    Central, see Central diffraction
    Double, see Double diffraction
    for resolved photons, 148
    Gap survival, 138, 148, 171
    Hard diffraction, 138
    Ingelman–Schlein, 137, 138, 171
    Schuler-Sjöstrand, see Schuler-Sjöstrand model
    Single, see Single diffraction
Dipole swing, 171
DIRE, 96
DIS, 35, 134, 142, 144
Double diffraction, 114, 118, 252


dΦn, 22

Elastic scattering, 114, 116, 121, 252
Electromagnetic decays (of hadrons), 192
Electroweak processes, 34
Electroweak showers, see Weak Showers
Enhanced splittings
    in DIRE, 101
    in Simple showers, 82
    in VINCIA, 87
Equivalent photon approximation, 145
    heavy nuclei, 150
    protons, 149
Error messages, 209
Event generator, 7
Event record, 208
Event structure, 8
Event weights, see Weights
Evolution variable, 60
    in DIRE, 98
    in Simple showers, 66
    in VINCIA, 84
EVTGEN, 241
Excited fermions, 48
Exotica, 44, 261, 262
    Excited fermions, 260
    Extra dimensions, see Extra dimensions
    Leptoquarks, 259
    New gauge bosons, 259
Extra dimensions
    Large extra dimensions, 261
    Randall–Sundrum resonances, 260
    TeV−1-sized, 261
    Unparticles, 262

Factorization theorem, see PDFs
FASTJET, 246
Final-state radiation, see FSR
Final-state radiation (FSR)
    Kinematics, 86
Fourth-generation fermions, 255
Fragmentation, see hadronization, 154
FSR, 57
    Kinematics, 65, 87

GENEVA, 110
Glauber modelling, 152
g → qq splittings, 74, 89
Good–Walker, 151

Hadron decays, 187
Hadron production vertices, 159
Hadronic Rescattering, 176
Hadronization, 154
    Bose–Einstein effects, 178
    Close-packing effects, 172, 176
    Coalescence, 179
    Deuteron production, 179
    Hadronic rescattering, 176
    Ropes, 174
    String interactions, 171
    Thermal string breaks, 175
Hanbury-Brown–Twiss, see Bose–Einstein effects
Hard diffraction, see Diffraction
Hard Process
    Dark Matter, 43
    EW bosons, 35
    Exotica, 44
    Higgs production, 40
    Photon collisions, 37
    Prompt photon, 34
    QCD, 32
    Supersymmetry, 41
    Top production, 39
    setting, 204
Hard QCD processes, 32, 252
    Phase-space sampling for 3-body processes, 28
HBT, see Bose–Einstein effects
HDF5, 235
HEPMC, 243
Hidden valleys, 42, 77, 262
Higgs bosons, 40, 46–48, 256, 258
Hit-or-miss method, 17

Impact parameter, 125, 132, 138, 150
Importance sampling, 17
Ingelman–Schlein, 137, 138, 171
Initial-state radiation, see ISR
Installing PYTHIA, 200
Interleaved evolution, 69, 84, 130
Interleaved resonance decays, 49, 83
ISR, 58
    Heavy-quark thresholds, 68
    Kinematics, 65, 86


    Rapidity ordering for Simple showers, 69
    Resolved photons, 146

JETSET, 12, 157
Junction rest frame, 160
Junctions, 160, 164, 167

Kinematics
    for 2 → 2 processes, 26
    for 2 → 3 processes, 27
    for DIS, 142
    for photon-photon collisions, 148
    for photoproduction, 145
    in shower branchings, 65, 86
    Phase-space cuts, 52

λ measure of string length, 165
Leading colour, 162
Leptoquarks, 259
Les Houches Accords
    LHA, 227
    LHAHDF5, 235
    LHAPDF, 236
    LHEF, 227
    SLHA, 233
LHA, see Les Houches Accords
LHAHDF5, see Les Houches Accords
LHAPDF, see Les Houches Accords
LHEF, see Les Houches Accords
Light-flavour hadrons, 157
Low energy processes, 253
Low-energy processes, 121
Low-energy QCD processes, 253
Lund model, 154
Lund symmetric fragmentation function, 156

MADGRAPH5_AMC@NLO, 107, 238
Marsaglia-Zaman algorithm, 16
Matching and merging, 103, 210
    Leading order methods, 107, 111
    Next-to-leading order methods, 108, 110, 112
matrix element, 25
Matrix-element corrections, see MECs
MC@NLO, 107, 211
MECs, 64, 73, 105, 106
meMode, 188, 197
meMode, 29, 47, 185, 187, 189–193, 234
Minimum bias, 136
MIXMAX, 16, 243
MLM jet matching, 106, 230
Monash tune, 225
Monte Carlo integration, 17
MPI, 53, 123
    Resolved photons, 147
MSSM, see SUSY
m⊥, see Transverse mass
Multiparton interactions, see MPI
Multithreading, 220

NL3 merging, 107, 110, 213
NLO matching, 107, 211, 240
No-branching probability, see Sudakov factor
NRQCD, 37
Nuclear PDFs, see PDFs for Nuclei

Odderon, 116
Onia, see Quarkonium
Ordering variable, see Evolution variable

Particle data, 205
Particle decays, 187, 206
Particle properties, 182
Particle, 207
Parton distribution functions, see PDFs
Parton showers, 56, 64
    Dark photons, 77, 100
    DIRE, 96
    DIS, 143
    in Hadron decays, 187
    Hidden valleys, 77
    Simple showers, 64
    VINCIA, 82
PDFs, 50, 113, 125, 142, 203
    Effect on resonance shapes, 49
    Factorization theorem, 25
    for Leptons, 139
    for MPI, 129
    for Nuclei, 51
    for Resolved photons, 144
    Structure functions, 142
Phase-space generation, 22, 25
    2-body processes, 26
    3-body processes, 27
    Cuts, 52, 204
    Lorentz Invariant Phase Space dΦn, 22


M-generator, 23 RndmEngine, 243


Massive particles, 24, 27 Sampling, 16, 17
RAMBO, 23 Seed, 16, 242
Resonances, 28 RANLUX, 243
Photon-parton collisions, 37 RANMAR, 16
Photon-photon collisions, 37, 254 Rapidity gap survival, see Diffraction
Photoproduction, 145 Recoil schemes, 75, 86
Pileup, 10, 136 Regge–Gribov theory, 113
Pomeron, 113–116, 118, 137, 138, 171 Reggeon, 113, 116
Cut, 114, 115 Remnants, see Beam remnants
Popcorn, 157 Resolved photons, 144
Power showers, 73 Resonance decays, 28, 46
Damped showers, 74 calcWidth(), 47
POWHEG, 105, 107, 108, 112, 210, 237 Colour reconnections, 70
Primordial k⊥, 135, 138, 147 Interleaving, 49, 70, 83
Interplay with showers, 69, 93 Semi-internal, 217
Prompt photons, 34 SLHA DECAY tables, 234
PYTHIA, 7, 11, 12 weightDecay(), 47
PYTHON, 246 RIVET, 245
RndmEngine, 243
QCD coupling constant, see αs ROOT, 244
QCD processes, 32, 252, 253 Ropes, 174
QCD showers, see Parton showers Running quark masses, see Quark masses
QED showers, 58
in DIRE, 100 Schuler-Sjöstrand model, 117–119, 121, 122, 148
in Simple showers, 75
in VINCIA, 93 Sector merging, 111, 212
QGP, 171 Sector resolution variable, 88
QNUMBERS, 233 Sector showers, see VINCIA
Quark masses Settings (explicitly mentioned)
Constituent quark masses, 120, 157, 183, 234 Beams:allowVertexSpread, 203
Beams:allowIDAswitch, 214
in DIRE, 97, 99 Beams:allowMomentumSpread, 203
in Simple showers, 66, 67, 73 Beams:allowVariableEnergy, 214
in VINCIA, 83 Beams:eA, 203
Running quark masses, 183 Beams:eB, 203
Top quark mass, 164 Beams:eCM, 203, 214
Quark-gluon plasma, see QGP Beams:frameType, 203, 214
Quarkonium, 37, 53, 77, 167, 240 Beams:idA, 203
Beams:idB, 203
RAMBO, 23 Beams:LHEF, 237
Random numbers, 16 Dire:MEplugin, 239
External random-number generators, 242 HadronLevel:all, 205
Hit-or-miss method, 17 HadronLevel:Rescatter, 176
MIXMAX, 16, 243 HardQCD, 205
RANLUX, 243 HardQCD:all, 204, 205
RANMAR, 16 HardQCD:gg2qqbar, 204


HeavyIon:mode, 154 NN:mMax, 185


Higgs:clipWings, 40 NN:mWidth, 47
Higgs:useBSM, 40 NN:onIfNeg, 234
Higgs:wingsFac, 40 NN:onIfPos, 234
idA, 203 NN:tauCalc, 186
idB, 203 Parallelism:numThreads, 221
LHAGrid1:file, 237 Parallelism:processAsync, 222
LHAGrid1:filename, 204 ParticleData:alphaSvalueMRun, 183
LHAHDF5:version, 236 ParticleData:modeBreitWigner, 184
LHAPDF5:set/member, 204, 236 ParticleData:mXRun, 183
LHAPDF6:set/member, 204, 236 ParticleDecays:colRearrange, 189
LowEnergyQCD:*, 121, 205 ParticleDecays:multIncreaseWeak, 189, 194
LowEnergyQCD:all, 214
Main:numberOfEvents, 220 ParticleDecays:limitCylinder, 187
Main:spareFlagN, 215 ParticleDecays:limitRadius, 187
Main:spareModeN, 215 ParticleDecays:limitTau, 187
Main:spareParmN, 215 ParticleDecays:limitTau0, 186
Main:spareWordN, 215 ParticleDecays:mixB, 195
Main:timesAllowErrors, 209 ParticleDecays:multGoffset, 190
meMode, 48 ParticleDecays:multIncrease, 189, 194
Merging:doPTLundMerging, 213 ParticleDecays:multRefMass, 189
Merging:runtimeAMCATNLOInterface, 240 ParticleDecays:sigmaSoft, 194
ParticleDecays:xBdMix, 195
Merging:doKTMerging, 211 ParticleDecays:xBsMix, 195
Merging:doMerging, 212 PartonLevel:*, 202
Merging:doNL3Loop, 213 PartonLevel:FSR, 205
Merging:doNL3Subt, 213 PartonLevel:ISR, 205
Merging:doNL3Tree, 213 PartonShowers:model, 205, 212
Merging:doPTLundMerging, 211, 214 PDF:extrapolate, 236
Merging:doUMEPSSubt, 213 PDF:pHardSet, 204
Merging:doUMEPSTree, 212 PDF:pSet, 203, 204
Merging:doUNLOPSLoop, 213 PDF:pSetB, 203
Merging:doUNLOPSSubt, 214 PDF:useHard, 204
Merging:doUNLOPSSubtNLO, 214 PDF:useHardNPDFA, 204
Merging:doUNLOPSTree, 213 PDF:useHardNPDFB, 204
Merging:nJetMax, 212 PhaseSpace:pTHatMin, 204
Merging:nJetMaxNLO, 213 PhaseSpace:pTHatMinDiverge, 203
Merging:Process, 211, 212 POWHEG:nFinal, 210, 238
Merging:TMS, 211 POWHEG:pTdef, 210
Merging:unlopsTMSdefinition, 214 POWHEG:pTemt, 210
MultipartonInteractions:pT0Ref, 177 POWHEG:pThard, 210
MultipartonInteractions:reuseInit, 215 POWHEG:veto, 210
POWHEG:vetoCount, 210
NN:mMin, 185 ProcessLevel:resonanceDecays, 202
NN:doForceWidth, 47 Random:seed, 220
NN:mayDecay, 46 Random:setSeed, 220
NN:meMode, 185–187, 189–195, 234 ResonanceWidths:minThreshold, 186


SigmaDiffractive:mode, 118 τ decays, 196


SigmaProcess:factorScale2, 204 TAUOLA, 196
SigmaProcess:renormScale2, 204 Thermal string breaks, 175
SigmaTotal:mode, 116–118 Top quark, 39, 46, 48, 83, 255
SLHA:allowUserOverride, 234 Mass, 164
SLHA:minMassSM, 234 Single top, 255
SLHA:verbose, 209 Total cross sections, 114
SoftQCD, 205 Transverse mass, 156
SoftQCD:*, 205 Tuning, 224
SoftQCD:all, 205, 214 ATLAS A14 tune, 225
SpaceShower:MEcorrections, 211 Automated approaches, 226
SpaceShower:pTmaxFudge, 211 Default tune, see Monash tune
SpaceShower:pTmaxMatch, 211 Monash tune, 225
StringFlav:probStoUD, 190
StringFragmentation:stopMass, 192 Ultra-peripheral collisions, 149
TimeShower:globalRecoil, 211 UMEPS merging, 107, 212
TimeShower:MEcorrections, 211 Uncertainties, 120, 223, 225
TimeShower:pTmaxFudge, 211 in Hadronization, 134, 159, 164, 167
TimeShower:pTmaxMatch, 211 in Hard processes, 45, 56
TimeShower:weightGluonToQuark, 211 in MPI, 125, 138
Vincia:MEplugin, 239 in Parton showers, 20, 78, 80, 102
Vincia:sectorShower, 212 from PDFs, 50, 125
Shower cutoff scale, see Cutoff scales in Tuning, 224
Shower evolution variable, see Evolution variable UNLOPS merging, 107, 110, 213
User hooks, 75, 216
Shower ordering variable, see Evolution variable VBF, 28, 45, 256, 258
Simple showers, 64 Vec4, 207
Single diffraction, 114 Vector-boson fusion, see VBF
Single diffraction, 118, 252 Vector-meson dominance, see VMD
Single top, 39, 255 Veto algorithm, 18
SLHA, see Les Houches Accords VINCIA, 82
Soft QCD processes, 252 VMD, 121, 148, 192
SPINUP, 196, 232
Strangeness enhancement, 176 Weak bosons, 35, 46, 48, 83, 253
Strangeness suppression, 157 Weak decays, 193
String breaks, 155–157, 175 Weak showers
String interactions, 171 in DIRE, 101
String junctions, see Junctions in Simple showers, 76
String-length measure λ, 165 in VINCIA, 95
String model, 154 Weak-boson fusion, see VBF
String shoving, 172 weightDecay(), 47
Strong coupling, see αs Weights, 20, 222, 223
Strong decays, 191 in DIRE, 98, 100, 101
Strong ordering, 60 in Simple showers, 78, 79, 82
Sudakov factor, 58, 62 in VINCIA, 87, 95
Supersymmetry, see SUSY Weizsäcker-Williams, 145
SUSY, 41, 233, 257 Wimpy showers, 73


wounded nucleons, 151

YODA, 244

Zero bias, 136


List of acronyms
Below follows a list of standard acronyms used throughout the text, with a reference to the page
where the acronym is first introduced.

2HDM Two–Higgs Doublet Model, 41
AQM Additive Quark Model, 126
BSM Beyond Standard Model of Particle Physics, 12
CKKW-L Catani–Krauss–Kuhn–Webber–Lönnblad, 112
CM Centre of Mass, 23
CMW Catani–Marchesini–Webber, 72
CR Colour Reconnection, 11
DGLAP Dokshitzer–Gribov–Lipatov–Altarelli–Parisi, 52
DIS Deep Inelastic Scattering (in the context of ep collisions), 36
DL Donnachie–Landshoff, 121
DM Dark Matter, 44
DPS Double Parton Scattering, 56
EPA Equivalent Photon Approximation, 152
EW Electroweak, 21
FI Final–Initial, 74
FSR Final–State Radiation, 10


HEP High–Energy Physics, 158
HI Heavy Ion, 158
HV Hidden Valley, 43
IF Initial–Final, 75
II Initial–Initial, 74
ISR Initial–State Radiation, 10
LC Leading Colour, 66
LEP Large Electron–Positron Collider, 13
LHC Large Hadron Collider, 7
LHA Les Houches Accord, 237
LHE Les Houches Event (also LHEF: Les Houches Event File), 12
LO Leading Order, 33
LIPS Lorentz Invariant Phase Space, 23
LL Leading Logarithmic, 64
MB Minimum Bias, 143
MBR Minimum Bias Rockefeller, 122
MC Monte Carlo, 14
MCMC Markov Chain Monte Carlo, 8


MEC Matrix Element Corrections, 85
MPI Multiple Parton Interactions (or Multi–Parton Interactions), 10
MSSM Minimal Supersymmetric Standard Model, 42
NLC Next–to–Leading Colour (also NNLC etc.), 66
NLL Next–to–Leading Logarithmic (also NNLL etc.), 66
NLO Next–to–Leading Order (also NNLO, N3LO etc.), 42
NN Nucleon–Nucleon, 159
NRQCD Non–Relativistic Quantum Chromodynamics, 39
PDF Parton Distribution Function, 61
PDG Particle Data Group, 40
RF Resonance–Final, 87
RHIC Relativistic Heavy Ion Collider, 13
RPV R–Parity Violating, 42
SaS Schuler and Sjöstrand (or DL/SaS), 122
SIDIS Semi–Inclusive DIS, 151
SLHA SUSY Les Houches Accord, 30
SM Standard Model (of Particle Physics), 13
UMEPS Unitarised Matrix Element + Parton Shower, 112


UPCs Ultra–Peripheral Collisions, 156
QCD Quantum Chromodynamics, 7
QED Quantum Electrodynamics, 46
QGP Quark–Gluon Plasma, 179
SUSY Supersymmetry (or Supersymmetric), 42
VBF Vector Boson Fusion, 29
VMD Vector Meson Dominance, 125
