Weirhsshpreport 1
Dominic Weir
Participant: ________________________________________
Signature
Abstract
computer program and library, to conduct analysis on high-energy particle events. ROOT
employs a hierarchical tree structure for the organization of detector data, thereby enabling
efficient computational capabilities that meet the demanding requirements of nuclear physicists
at major laboratories worldwide. In addition to its inherent graphical user interface, ROOT
visualization, and statistical treatment. However, ROOT can be difficult for beginners without
C++ background, and has various flaws in its design and implementation, in addition to limited
applications outside of particle physics. Uproot provides an alternative data analysis framework
independent of C++ ROOT, intended to stream data into machine learning libraries in Python.
The input/output framework relies on NumPy and Matplotlib, two popular Python libraries most
users are well acquainted with, to cast blocks of data from the ROOT file and perform the same
graphical analysis as possible in ROOT. This paper explores the implementation of Uproot for
the purpose of analyzing Run 13747 from Hall A’s Super-BigBite detector and includes the
Acknowledgments
I would like to express my heartfelt gratitude to everyone who made my Jefferson Lab
First and foremost, I extend my sincere thanks to my mentor, Dr. Alexandre Camsonne,
for his invaluable guidance, unwavering support, and prompt assistance throughout the
internship. His willingness to answer all my questions has been instrumental in my learning and
growth. I am also grateful to my alternative mentor, Sanghwa Park, for providing me with
essential fundamentals at the beginning of the internship, which laid a strong foundation for my
work. Furthermore, I extend my appreciation to the JLab Science Education team, especially
Steve Gagnon and Carol McKisson, for their continuous support, engaging physics
demonstrations, and facility tours. Their efforts have enhanced my overall experience at
Jefferson Lab. Additionally, I am thankful to all the summer series lecture speakers for
topics related to high-energy particle events. Special thanks go to the Director of this wonderful
Lab, Dr. Stuart Henderson, for not only managing the facility but also taking the time to interact
with us interns and share his personal experiences and wisdom. His leadership has been truly
inspiring.
I owe a special debt of gratitude to my partner, Connor Carpenter, for our fruitful
complemented my physics expertise, and together, we achieved remarkable results. I also extend
my appreciation to him for authoring the PyROOT tutorial. In addition, I would like to
acknowledge Angelina Nair and Esha Sing for their contributions to the Main notebook and the
ROOT notebook.
DATA ANALYSIS WITH PYTHON FOR SUPER-BIGBITE iv
I also want to express my gratitude to all the teachers at Ocean Lakes High School,
particularly the magnet physics teacher, William Isel, for instilling critical thinking skills that
proved invaluable during my nuclear physics internship. I would also like to extend thanks to
Allison Graves, my Senior Project Advisor, for her encouragement and support in pursuing this
internship opportunity. Likewise, I am grateful to Ruoming Shen for informing me about the
Next, my heartfelt thanks go out to my mom and dad for their unwavering support and
encouragement throughout this journey. Their belief in me and my dreams has been a constant
source of motivation. Lastly, I want to express my sincere gratitude to my grandpa, Dr. Daniel
Larusso, for listening to all my stories from Jefferson Lab during my long car rides home. His
continuous support and enlightened wisdom have been a source of inspiration for me, and I am
To all the individuals mentioned and to those behind the scenes who contributed to my
internship experience, thank you for making it a truly memorable and transformative journey.
Glossary
Particles:
except for the sign of certain quantum numbers. Specifically, each quark flavor has a
corresponding antiquark flavor with the same mass and spin, but opposite electric and
color charge.
3. Electrons: Elementary particles with a negative charge, orbiting the nucleus of atoms, and
4. Gluons: Elementary particles mediating the strong force that binds quarks together within
hadrons.
5. Hadrons: Composite particles consisting of quarks, including baryons (e.g., protons and
6. Kaons: Mesons composed of one strange quark (or antiquark) paired with an up or down
antiquark (or quark), playing crucial roles in particle interactions.
8. Mesons: Hadrons composed of one quark and one antiquark (e.g., pions and kaons),
9. Neutrons: Neutral subatomic particles found in atomic nuclei, made up of quarks with
10. Photons: Elementary particles of light and electromagnetic radiation, carrying no electric
11. Pions: Pions are mesons, which are composite particles composed of an up quark and a
down antiquark, or a down quark and an up antiquark. They play essential roles in the
strong nuclear force, mediating interactions between nucleons (protons and neutrons) in
atomic nuclei.
12. Proton: A positively charged subatomic particle found in atomic nuclei, composed of
quarks (two up quarks and one down quark) and exhibiting a spin of 1/2.
13. Quarks: Elementary particles that combine to form protons and neutrons (baryons), as
well as other hadrons like mesons, characterized by fractional electric charges and half-
integer spins. Quark flavors refer to the distinct types of quarks: up, down, strange,
charm, top, and bottom. Each flavor possesses unique properties, such as electric charge
and mass.
14. Virtual photon: A particle-like entity in quantum field theory that cannot be directly
mediates the electromagnetic force, facilitating the exchange of energy and momentum
1. Calorimeter: A particle detector that measures the energy of particles by stopping them
particles interacting with the detector material by increasing the number of electrons.
High voltages applied to the dynodes cause the multiplication of electrons through the
process of secondary emission, where each incident electron striking a dynode releases
probabilities. The Feynman diagram also provides information about the positions of
4. Form factors: Mathematical functions describing how the internal structure of a particle
5. Kinematic phase space: The region of possible values for particle momenta and energies
in a particular interaction.
6. Quark confinement: The phenomenon wherein quarks are bound together within hadrons
8. Scintillator: A material that emits light when struck by particles, assisting in particle
distances, offering crucial insights into nuclear structure, often visualized by the nucleons
overlapping.
10. Spectrometers: Precision instruments equipped with magnetic or electric fields that
enable the precise measurement of the energy and momentum of charged particles,
characteristics.
Table of Contents
Abstract
Acknowledgments
Glossary
Physics
Detectors
Data
Uproot
Jupyter
Uproot Notebook
References
devoted to investigating the structure of nuclei through two high resolution spectrometers at
precise angles (Alcorn et al., 2004). At the start of each experiment, an electron beam with energies
as high as 11 GeV is aimed at a target;¹ after the reaction, the scattered electron is measured by the
BigBite spectrometer, while the recoil particle (the scattered particle ejected from the target) is
measured by the Super-BigBite detector. By examining the scattered particles under different
initial conditions, scientists are able to infer the properties of the nucleons and their constituent
quarks.
Physics
hadron scattering can be used to measure the size of the hadron (Smirnova & Hedberg, 2005).
Run 13747 was an experiment at Jefferson Lab designed to investigate the size of a proton
(Benmokhtar et al., 2008). This experiment involved aiming the beam of electrons at a liquid
hydrogen-1 target with the intention of an elastic collision ejecting a proton. The primary
observable in such experiments is the scattering cross section. The differential cross-section
$\frac{d^2\sigma}{d\Omega\,d\nu}$ gives the probability of protons scattering into a particular solid angle
$d\Omega = \sin\theta\,d\theta\,d\phi$ with a change $d\nu$ in the energy transferred to the proton in the proton-rest frame (Zheng,
2021). Figure 1 shows the spectrum of scattering cross-section for 𝑒𝑝 collisions at a fixed four-
¹ All equations and quantities in this report are in natural units: 𝑐 = ℏ = 1
Figure 1
As shown in Figure 1, elastic collisions between the incident electron and the proton have
a relatively small energy transfer as this process involves no internal excitation or change in the
quantum state of the proton. On the other hand, in semi-inelastic delta collisions, the scattering
process involves exciting the proton to a higher energy state; one possible state is the delta
resonance (Δ). The delta resonance occurs when one of the quarks in the proton is excited to a
higher energy state while remaining bound within the baryon. The Δ baryons have a mass of
about 1232 MeV, as opposed to 939 MeV of an ordinary nucleon; however, they quickly decay
via the strong interaction into a nucleon and a pion of appropriate charge (Nave, 1998). Other
semi-inelastic collisions are those that result in the N* excitations; these resonances are higher in
energy usually corresponding to one of the quarks having a flipped spin state, or with different
orbital angular momentum when the particle decays. As opposed to Δ baryons that only decay
through pion production, N* resonances decay through various channels. That being said, the
most common decay mode is still the emission of pions that are responsible for carrying away
excess energy and angular momentum. However, for sufficiently high energy resonance states,
heavier mesons including eta mesons and kaons may be emitted. Note that the compositions of both
eta mesons and kaons include a strange quark, while protons consist only of up and down
quarks. This evolution is possible through the strong interaction mediated by gluons. For
example, the down quark of a proton may annihilate an anti-down quark of a pion emitting a
gluon. This gluon then materializes into a strange and anti-strange pair, the latter pairing with the
up quark to form a positively charged kaon. This reaction is summarized by the Feynman
Figure 2
Nevertheless, both delta and N* resonances decay back to stable nucleons by emitting
particles, preserving the overall baryon number and charge of the nucleon system. On the other
hand, Deep Inelastic Scattering (DIS) involves high-energy electrons (or exchange photons)
scattering off individual quarks within the nucleon. During this process, the virtual photon
interacts with a quark, probing the nucleon's internal structure at short distances and high
momentum transfers. This substructure is described by parton distribution
functions (PDFs). These PDFs provide essential information about the momentum distributions
of quarks and gluons within the nucleon, revealing their contributions to the nucleon's total
momentum and spin. In fact, deep inelastic electron-proton scattering experiments led to the
discovery of quarks in 1968 (O’Luanaigh, 2019). However, due to the phenomenon of quark
confinement, isolated quarks cannot be observed directly. Instead, the scattered quarks fragment
into collimated sprays of hadrons in the final state, a process known as hadronization. Thus,
expression 𝒌 + 𝒑 = 𝒌′ + 𝒑′, where 𝒌 and 𝒑 are the 4-momentum vectors of the initial electron
beam and proton, respectively, and the scattered states are represented by the prime symbol (′);
Figure 3
After the collision, the electron beam with relativistic energy of 𝐸𝑒 traveling along the z-axis
scatters at some angle 𝜃𝑘 and the proton scatters at an angle 𝜃𝑝 in the xz-plane. At a speed near
that of light, the mass of the electron is negligibly small, and by the Einstein energy-momentum
relation (equation 1), the momentum is approximately equal to its relativistic
energy.
$$E^2 = p^2 + m^2 \quad\Rightarrow\quad p \approx E \quad (m \ll E) \tag{1}$$
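As a quick numerical check of this approximation, the sketch below evaluates the energy-momentum relation in natural units for a beam electron. The 11 GeV value is the upper beam energy quoted earlier in this report; the electron mass of 0.511 MeV is the standard value, and the function name is chosen here purely for illustration.

```python
import math

ELECTRON_MASS_GEV = 0.000511  # electron rest mass in GeV (natural units, c = 1)


def momentum_from_energy(energy_gev, mass_gev):
    """Return |p| from E^2 = p^2 + m^2 in natural units."""
    return math.sqrt(energy_gev**2 - mass_gev**2)


beam_energy = 11.0  # GeV, upper beam energy quoted in the text
p = momentum_from_energy(beam_energy, ELECTRON_MASS_GEV)
relative_difference = (beam_energy - p) / beam_energy

print(f"E = {beam_energy} GeV, p = {p:.9f} GeV")
print(f"relative difference (E - p) / E = {relative_difference:.2e}")
```

The relative difference scales as m²/(2E²), which is why treating the electron as massless is a safe simplification at GeV beam energies.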
The target proton, on the other hand, is at rest in the lab frame, so its relativistic energy equals its invariant mass ($m_p$ = 938
MeV). Consequently, the two initial vectors are defined as shown below:
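$$\boldsymbol{k} = \begin{pmatrix} E \\ p_x \\ p_z \end{pmatrix} = \begin{pmatrix} E_e \\ 0 \\ E_e \end{pmatrix} \tag{2}$$

$$\boldsymbol{p} = \begin{pmatrix} E \\ p_x \\ p_z \end{pmatrix} = \begin{pmatrix} m_p \\ 0 \\ 0 \end{pmatrix} \tag{3}$$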
After the collision, 4-momentum is conserved between the two particles. Assuming only
scattering in the xz-plane, and once again that the mass of the electron is negligibly small, the
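$$\begin{pmatrix} E_e \\ 0 \\ E_e \end{pmatrix} + \begin{pmatrix} m_p \\ 0 \\ 0 \end{pmatrix} = \begin{pmatrix} E_e' \\ E_e' \sin\theta_k \\ E_e' \cos\theta_k \end{pmatrix} + \begin{pmatrix} E_p' \\ \sqrt{(E_p')^2 - (m_p)^2}\,\sin\theta_p \\ \sqrt{(E_p')^2 - (m_p)^2}\,\cos\theta_p \end{pmatrix} \tag{4}$$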
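To make the conservation equation concrete, the following sketch solves it for an elastic collision, assuming a massless electron and a proton initially at rest. The 4.0 GeV beam energy and 30 degree electron angle are hypothetical inputs chosen for illustration, not the settings of Run 13747, and the function name is likewise an assumption. The scattered electron energy follows from requiring the recoil proton to keep its invariant mass.

```python
import math

PROTON_MASS_GEV = 0.938  # m_p as quoted in the text (natural units)


def elastic_ep(beam_energy, theta_k):
    """Elastic e-p kinematics in the xz-plane (energies in GeV, angles in radians)."""
    m = PROTON_MASS_GEV
    # Scattered electron energy from 4-momentum conservation with m_e = 0
    e_scattered = beam_energy / (1.0 + (beam_energy / m) * (1.0 - math.cos(theta_k)))
    # Energy row of the conservation equation
    e_proton = beam_energy + m - e_scattered
    # x and z rows: the proton carries whatever momentum the electron gave up
    px = -e_scattered * math.sin(theta_k)
    pz = beam_energy - e_scattered * math.cos(theta_k)
    # Proton momentum magnitude from its energy and invariant mass
    p_proton = math.sqrt(e_proton**2 - m**2)
    # Signed angle: negative when the electron scatters toward +x
    theta_p = math.atan2(px, pz)
    return {
        "e_scattered": e_scattered,
        "e_proton": e_proton,
        "p_proton": p_proton,
        "px": px,
        "pz": pz,
        "theta_p": theta_p,
    }


result = elastic_ep(beam_energy=4.0, theta_k=math.radians(30.0))
for name, value in result.items():
    print(f"{name} = {value:.4f}")
```

The momentum components obtained from the x and z rows agree with the magnitude obtained from the proton's energy, which is a useful consistency check on the reconstructed equation.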
The final energy states, as well as the positions and angles of the scattered particles, are
measured by the detectors. Because the physical quantities associated with the scattering of
electrons on protons depend on the electric and magnetic form factors of the proton,
measurements of the cross section can be used to determine the form factors and hence the charge
Detectors
Hall A at Jefferson Lab is currently equipped with two state-of-the-art spectrometers, the
BigBite Spectrometer and the Super-BigBite Spectrometer, both of which play crucial roles in
high-energy particle event analysis. BigBite, named for its large momentum and angular
acceptance, is the first of the two high resolution spectrometer detectors and was installed in
2007 (Liyanage & Wojtsekhowski, 2007). It is designed for detecting, tracking, and identifying
scattered electrons at high luminosity to map out the kinematic phase space. In 2021, the
Super-BigBite detector (SBS) was commissioned on the right side of the beam to detect high-
energy protons and neutrons (Puckett, 2021). The two spectrometers complement each other
The SBS has several components intended to measure the momentum and energy of the
Figure 4
Upon first entering the spectrometer, the charged particles are deflected by the 48D48 warm coil
magnet. This deflection is vital to distinguishing protons from neutrons once they make it to the
Hadronic Calorimeter.
First though, the particles travel through the Gas Electron Multiplier (GEM) detectors.
These detectors take advantage of the electrons stripped from the gas's molecular orbitals by
ionizing radiation. A suitable voltage is applied across the polyimide foil, guiding the electrons to
microholes spread across the GEM. In the presence of strong electric fields, microholes serve as
sites of electron acceleration, leading to collisions with surrounding atoms that release additional
electrons. As these affected regions accumulate sufficient electrons, they transform into an
electrically conductive medium, generating a significant current that can be detected and read by
electronic devices (Sauli, 2016). This information reveals the position of the particles
passing through, and ultimately the momentum by analyzing the particle’s deflection in the
After traveling through the three low-interference GEMs, the particles enter the Hadronic
Figure 5
The HCal consists of 288 detector modules aligned in a 12 wide by 24 high array. Figure 6
Figure 6
The module consists of iron plates interleaved with scintillator planes. Particles hit the iron plates
cascade of secondary particles excites the electrons in the scintillator tiles producing photons
proportional to the energy absorbed. Running the length of each module is a single wavelength
shifter that directs the photons to a 2-inch-diameter photomultiplier tube (PMT) mounted on
the back. A photocathode is located at the opening of the PMT. As dictated by the photoelectric
effect, the photocathode ejects an electron that is directed to the electron multiplier by the
focusing electrode. The electron multiplier consists of several dynodes arranged in increasing
potential. The electrons are accelerated towards each dynode, striking its surface and producing
several more electrons by secondary emission. Each voltage difference increases the total
energy of the electrons, while each collision increases the number of electrons. By the time the
group of electrons reaches the anode, a large enough current is produced to be read by
electronics. This data is crucial in determining the energy of the original hadron and also gives
information regarding the locations of the clusters. A diagram of a PMT is shown in Figure 7.
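The multiplication described above can be estimated with a simple idealized model in which every dynode releases the same average number of secondary electrons per incident electron. The yield of 4 and the count of 10 dynodes below are hypothetical illustrative values, not specifications of the HCal PMTs, and the function names are assumptions.

```python
ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs


def pmt_gain(secondary_yield, n_dynodes):
    """Ideal gain: each dynode multiplies the electron count by the same factor."""
    return secondary_yield ** n_dynodes


def anode_charge(n_photoelectrons, secondary_yield, n_dynodes):
    """Charge collected at the anode for a given number of photoelectrons."""
    return n_photoelectrons * pmt_gain(secondary_yield, n_dynodes) * ELEMENTARY_CHARGE


# Hypothetical tube: 10 dynodes, 4 secondary electrons per incident electron
gain = pmt_gain(secondary_yield=4, n_dynodes=10)
charge = anode_charge(n_photoelectrons=100, secondary_yield=4, n_dynodes=10)

print(f"gain = {gain}")
print(f"anode charge for 100 photoelectrons = {charge:.3e} C")
```

Because the gain grows as a power of the number of dynodes, a handful of photoelectrons can become a pulse large enough for the readout electronics.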
Figure 7
Data
There are several steps involved in converting the signals produced by the detectors into
data that the physicists can work with. The first step is the trigger system, which rapidly
decides which events in a particle detector to keep based on a trigger menu, because only a
small fraction of the total can be recorded. At the same time, the Data Acquisition System
(DAQ) is responsible for temporarily storing the data pending the trigger decision, and then
recording data from the selected events in a suitable format. A complex interaction between the
initial steps of the DAQ, the triggering system, and time-to-digital converters is incorporated to
minimize dead time (time periods when interesting interactions cannot be selected) and correctly
One of the most crucial aspects of preparing the data is analog-to-digital conversion
(ADC). The ADC process transforms the continuous analog signals into discrete digital values.
This conversion is performed by sampling the continuous signal at regular intervals and
assigning numerical values to these samples based on their amplitude. The greater the resolution,
as determined by the converter's bit length, the more closely the generated piecewise signal represents the original
analog signal (Gudino, 2018). The digital data is then temporarily stored in Data Buffers, high-speed
memory elements that can handle the large data rates produced by the detectors. They
store the raw data from multiple events before the data is forwarded to the Event Builders for
further processing. The Event Builders are responsible for assembling the data fragments from
the Data Buffers to form complete events, each of which is the set of data related to a specific particle
interaction. Event Builders efficiently combine the data from multiple detectors and channels to
create coherent event data. Next, the Readout Controllers manage the flow of data from the
Event Builders to the next stage of data processing. They organize the data into packets and
ensure that the data is properly transmitted and recorded. After passing through the Event
Builders, the data from different detectors and channels are combined to form complete events.
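The effect of bit length on the analog-to-digital conversion step described above can be illustrated with a small sketch. It samples one period of a unit sine wave at regular intervals, digitizes each sample with an idealized uniform quantizer, and reports the worst-case error. The input range, sample count, and function names are assumptions made for illustration and do not describe the actual ADC hardware used in Hall A.

```python
import math


def quantize(voltage, v_min, v_max, bits):
    """Map an analog voltage to an integer ADC code with the given bit length."""
    levels = 2 ** bits
    step = (v_max - v_min) / levels
    code = int((voltage - v_min) / step)
    return min(max(code, 0), levels - 1)  # clip to the valid code range


def reconstruct(code, v_min, v_max, bits):
    """Return the voltage at the centre of the quantization step for a code."""
    step = (v_max - v_min) / 2 ** bits
    return v_min + (code + 0.5) * step


def max_quantization_error(bits, n_samples=100):
    """Largest difference between a sampled unit sine wave and its digitized copy."""
    worst = 0.0
    for i in range(n_samples):
        analog = math.sin(2 * math.pi * i / n_samples)
        digital = reconstruct(quantize(analog, -1.0, 1.0, bits), -1.0, 1.0, bits)
        worst = max(worst, abs(analog - digital))
    return worst


for bits in (4, 8, 12):
    print(f"{bits:2d}-bit ADC: max error = {max_quantization_error(bits):.5f}")
```

In this model the error is bounded by half a quantization step, so each additional bit halves the worst-case deviation from the original signal.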
However, the data is still in a raw and unorganized format. In the back end of the DAQ,
the data is formatted into a standardized structure to ensure consistency and ease of analysis. To
manage the storage and transfer of the large amount of data efficiently, data compression
techniques may be employed. Additionally, the back end of the DAQ organizes the data into
logical units, such as data files or data streams. The data is then divided into smaller chunks to
facilitate parallel processing and easy access during analysis. This organized and compressed
data is stored in ROOT files, the standardized format of ROOT, a framework developed by CERN
to store a hierarchical tree of columnar data. Along with the raw event data, additional
information known as metadata is also stored. Metadata includes details about the experiment
setup, detector calibration, event timestamps, and other relevant information that is crucial for
At the core of ROOT is the TTree class, representing the tree structure that organizes data
hierarchically. The TTree efficiently stores events as entries, each associated with specific data
variables or properties, known as branches. These branches act as columns, holding arrays or
simple data types that represent measurements or attributes of the events. One of the essential
features of the TTree is its ability to store metadata in the form of TNamed objects. Additionally,
the TTree supports data compression to reduce storage requirements and batch processing for
efficient parallelization. Furthermore, ROOT is not only limited to data storage and
manipulation; it also provides a comprehensive suite of data visualization tools. Researchers can
create a wide range of plots, histograms, and graphs to gain insights into their data and explore
patterns or trends visually (ROOT, n.d.). Overall, ROOT's Tree structure and accompanying
features make it an indispensable tool for analyzing complex scientific datasets, providing
researchers with the means to explore fundamental physics principles and make significant
contributions to particle physics. However, while ROOT is a powerful framework widely used in particle and
nuclear physics, it has faced criticism regarding its limited documentation and
steep learning curve. Novices often find it challenging to grasp the intricacies of the C++
programming language used in the complex framework of ROOT. Moreover, its design and
implementation have been subject to scrutiny due to issues such as code bloat, heavy reliance on
global variables, and an overly complex class hierarchy. These aspects have occasionally led to
frustration among developers and have prompted discussions about improving the framework's
This research sought to ascertain the viability of Uproot, a Python input/output library
specialized in reading ROOT files, as an alternative to the more challenging C++ ROOT
framework for physicists' data analysis requirements. The exploration involved a systematic
serving as a suitable platform for comprehensive data processing and manipulation tasks. To
foster broader understanding and proficiency among the scientific community, a Jupyter
Notebook tutorial was formulated, providing a structured guide for researchers to harness Uproot
Uproot
Building upon the utilization of Uproot, this section delves into its technical capabilities,
comparing it to the C++ ROOT framework, and showcasing its seamless integration with
NumPy and Matplotlib, offering physicists a user-friendly solution for comprehensive data
analysis. As a Python-based library, this solution leverages the simplicity, flexibility, and
extensive ecosystem of Python. For example, while C++ ROOT requires researchers to manage
multiple objects, such as TFile, TBranch, and TTree, to access and extract data properly, Uproot
streamlines the process with a more intuitive and straightforward Python syntax. Namely,
researchers can access data from a ROOT file using Uproot's simple syntax, such as
data from branches. Leveraging Python's versatility and eliminating the need for low-level
However, while C++ ROOT has its own numerical computation and graphics features,
Uproot gains these capabilities through integration with the NumPy and
Matplotlib libraries, respectively. By employing NumPy arrays and vectorized operations, Uproot
efficiently processes and manipulates large datasets, offering competitive (albeit slightly inferior)
Matplotlib, a widely used Python library for data visualization. This tight integration with
Matplotlib streamlines the data visualization process, allowing for immediate exploration and
Jupyter
The tutorial was created in JupyterHub, a web-based platform that enables the
JupyterHub, the research project gained valuable access to Jefferson Lab’s computational
environments and resources. This access proved to be beneficial, as it provided a scalable and
powerful computing infrastructure that significantly enhanced the project's data analysis
capabilities, while providing direct access to the ROOT files. Jupyter Notebook, an integral
cells, allowing segments of code to be run individually and specific portions to be modified iteratively to
analyze graphs and results interactively. This cell-based structure facilitated flexible data
analysis, making it possible to experiment with code, visualize immediate outcomes, and refine
combine Markdown (a lightweight markup language) and Python code in separate cells.
Markdown integration allowed for the seamless inclusion of HTML/CSS and LaTeX, enhancing
the tutorial's explanatory power and technical formatting. This integration provided
Lab’s infrastructure. Users could access the tutorial by downloading the ROOT file locally and
working within JupyterLab. This adaptability made the tutorial widely accessible to researchers
Uproot Notebook
The Uproot Jupyter Notebook tutorial begins by opening the ROOT file and
then the TTree that contains the data from Run 13747. Next, the tutorial describes the event data
and how to access it. For the remainder of the notebook, the tutorial works with the energy
would ideally consist only of the energy deposited by the proton; however, due to the inevitable
background noise and some inelastic collisions, the array includes other miscellaneous energies
as well.
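A minimal sketch of the opening-and-reading step described above might look like the following. It uses Uproot's `uproot.open` and the `array(library="np")` method of a branch; the file path, tree name `"T"`, and branch name `"sbs.hcal.e"` in the commented example call are hypothetical placeholders, not names confirmed by the tutorial, and the helper function itself is an illustration rather than the notebook's code.

```python
def read_branches(path, tree_name, branch_names):
    """Open a ROOT file with Uproot and return the requested branches as NumPy arrays.

    Returns a dict mapping each branch name to its array.
    """
    import uproot  # third-party library; imported here so the helper loads without it

    with uproot.open(path) as root_file:
        tree = root_file[tree_name]  # look up the TTree by name
        # Read each branch fully into memory as a NumPy array
        return {name: tree[name].array(library="np") for name in branch_names}


# Example call (all names are placeholders for illustration):
# data = read_branches("run_13747.root", "T", ["sbs.hcal.e"])
# energies = data["sbs.hcal.e"]
```

Reading branches into plain NumPy arrays keeps the rest of the analysis independent of ROOT itself, since the arrays can be passed directly to NumPy and Matplotlib functions.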
Next, the tutorial demonstrates plotting the energies using Matplotlib. The first example is a
1D histogram of the cluster energies. This example from the plotting section is shown below in
Figure 8.
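The histogram step can be sketched as follows. Because the Run 13747 data is not reproduced here, the array below is synthetic and only loosely shaped like a background tail plus a peak; the binning, range, and function name are assumptions. The counts are computed with `numpy.histogram`, and the plotting helper passes the same binning to Matplotlib's `Axes.hist`.

```python
import numpy as np

rng = np.random.default_rng(seed=0)
# Synthetic stand-in for cluster energies in GeV; NOT data from Run 13747
energies = np.concatenate([
    rng.exponential(scale=0.1, size=5000),       # low-energy background-like tail
    rng.normal(loc=0.6, scale=0.08, size=2000),  # peak-like component
])

# Bin the energies into 50 equal-width bins between 0 and 1 GeV
counts, bin_edges = np.histogram(energies, bins=50, range=(0.0, 1.0))


def plot_energy_histogram(values, bins=50, value_range=(0.0, 1.0)):
    """Draw a 1D histogram of cluster energies and return the figure and axes."""
    import matplotlib.pyplot as plt  # imported here so the binning above runs without it

    fig, ax = plt.subplots()
    ax.hist(values, bins=bins, range=value_range, histtype="step")
    ax.set_xlabel("Cluster energy (GeV)")
    ax.set_ylabel("Counts")
    return fig, ax


print("most populated bin starts at", bin_edges[np.argmax(counts)], "GeV")
```

Calling `plot_energy_histogram(energies)` in a notebook cell would display the figure inline.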
Figure 8
After the 1D histogram, the tutorial introduces the x and y positions from the center of
the HCal for the energy clusters to graph a 2D positional plot. Following the 2D graph is a 3D
scatter diagram that plots the energy of the clusters against their 2-dimensional position, first
without color and then with a color axis corresponding to the energies, the latter of which is
shown in Figure 9. To conclude the section on plotting, the 3D scatter plot is reduced to 2 spatial
Figure 9
The final section of the notebook is dedicated to applying selection criteria, or cuts, to
the data set. In this segment, two different types of cuts are demonstrated. The first is energy
cuts, limiting the graph to events where the clusters meet a certain energy threshold. Figure 10
shows the example energy cut provided in the tutorial where the clusters are restricted to those
above 0.5 GeV, effectively reducing the amount of the lower energy background noise. The
second type of selection criteria is positional cuts, restricting the data set to events where the
position vectors of the particles satisfy certain criteria. The first necessary position cut removes
the outer border of the position space, because only some of the energy of a proton landing there is detected by
the HCal, and the rest is lost to the external environment. Including these events would bias the
average energy of the protons. Another effective strategy for data selection is limiting the y-
values to the region specific to the charge of the proton. This approach is facilitated by the
varying deflection of particles in the 48D48 magnetic field, enabling distinct trajectories based
on their charge.
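With NumPy arrays, cuts of this kind are naturally expressed as boolean masks. The sketch below applies the 0.5 GeV energy threshold mentioned above together with two positional cuts. The arrays are synthetic placeholders, and the border and y-region limits are hypothetical values chosen only to show the technique, not the limits used in the tutorial.

```python
import numpy as np

rng = np.random.default_rng(seed=1)
n_clusters = 10_000

# Synthetic placeholders, NOT data from Run 13747
energy = rng.exponential(scale=0.3, size=n_clusters)  # cluster energy in GeV
x = rng.uniform(-2.0, 2.0, size=n_clusters)           # position from HCal centre
y = rng.uniform(-1.0, 1.0, size=n_clusters)           # position from HCal centre

# Energy cut: keep clusters above the 0.5 GeV threshold
energy_cut = energy > 0.5

# Border cut: drop clusters near the edge of the position space (hypothetical limits)
border_cut = (np.abs(x) < 1.8) & (np.abs(y) < 0.8)

# Charge-region cut: keep only a band in y (hypothetical limits)
y_region_cut = (y > -0.4) & (y < 0.4)

# Combine the masks; a cluster must pass every cut to be selected
selected = energy_cut & border_cut & y_region_cut

print(f"{selected.sum()} of {n_clusters} clusters pass all cuts")
print(f"mean energy after cuts: {energy[selected].mean():.3f} GeV")
```

Keeping each cut as its own named mask makes it easy to switch individual criteria on and off and to compare their effect on the plots.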
Figure 10
In conclusion, the Uproot Jupyter Notebook provides users with essential skills for
efficiently analyzing event data stored in ROOT files through the aforementioned Python
libraries. Subsequently, physicists can leverage their expertise in the field to further integrate
data from different detectors, apply customized selection criteria, and conduct in-depth analyses
Dr. Provakar Datta, wherein he crafted a proton spot plot (Figure 11) for Run 13747, adeptly
applying advanced cuts to enhance the visibility of protons in the HCal detector.
Figure 11
ROOT Trio
In pursuit of advancing data analysis capabilities and fostering ease of access to ROOT,
this project is part of a trio of tutorials geared towards simplifying data analysis using different
interfaces of the ROOT framework. Alongside the Uproot tutorial described earlier, two
additional tutorials have been crafted by other interns, each catering to distinct aspects of ROOT
data analysis.
The first tutorial concentrates on the fundamentals of ROOT and provides comprehensive
guidance on utilizing the native C++ interface. Delving into ROOT's powerful features, this
tutorial equips researchers with a foundation in C++ ROOT, enabling them to harness its
The second tutorial centers around PyROOT, a Python interface that serves as a bridge
between ROOT's capabilities and the simplicity of Python programming. With PyROOT,
researchers can seamlessly work with ROOT data, harnessing Python's versatility and powerful
data analysis libraries to complement ROOT's functionalities. Note that while both Uproot and
PyROOT offer Python interfaces for ROOT data analysis, they have distinct differences in their
approach and usage. Uproot specializes in reading ROOT files directly into Python data
structures like NumPy arrays, providing a more streamlined and user-friendly experience. On the
other hand, PyROOT allows for a closer integration with ROOT's C++ interface, enabling
researchers to access and manipulate ROOT objects directly, making it suitable for those who are
already familiar with C++ ROOT or need to interact more closely with ROOT's internal
functionalities.
As all tutorials focus on data from Run 13747, conducted in Hall A, a Main file offers a
comprehensive description of the detectors employed in the Hall A experimental setup, as well
as a description of the kinematics of the reaction. This collective knowledge base equips
researchers with essential contextual understanding to effectively analyze data from this specific
experiment.
powerful toolkit for physicists and researchers, offering multiple entry points to the ROOT
framework based on individual preferences and expertise (Jefferson Lab, 2023). By streamlining
and simplifying ROOT data analysis through these tutorials, the trio aims to enhance researchers'
productivity and efficiency, enabling them to make meaningful scientific discoveries from the
Throughout the course of this internship, I embarked on a journey that initially seemed
daunting, with a vast realm of new knowledge to acquire and challenges to overcome. However,
I found great enjoyment in delving into this exciting world of nuclear physics, as well as the
opportunity to meet remarkable individuals at Jefferson Lab who shared their expertise and
passion for the subject. This experience has been immensely valuable, as it not only facilitated
significant growth in my technical skills but also provided a profound opportunity for self-
discovery. Immersed in the captivating field of nuclear physics, I have gained a deeper
appreciation for the complexities of scientific research and its potential to unravel the mysteries
of the universe.
Throughout this project, the key to my success lay in integrating numerous resources that
facilitated my learning journey in accelerator and nuclear physics. First and foremost, my
mentors played a pivotal role in providing crucial guidance and support. Their expertise and
the equipment and understanding the kinematics involved. Their mentorship not only enabled me
to grasp the fundamentals of the field but also provided me with a solid foundation to embark on
realm of mathematics and programming required for this project. I dedicated time to refining my
understanding of rudimentary special relativity and linear algebra, enabling me to delve deeper
into the physics concepts involved. Additionally, I immersed myself in scientific journal articles,
textbooks could offer. In addition to mathematics, I also learned many technical skills that I can
continue to apply outside of Jefferson Lab. The first of these was learning to program in Python.
While a rather moderate transition from my experience in Java, Python still posed certain
challenges, such as adapting to Python's dynamic typing and different approach to object-
Ultimately, despite these initial difficulties, mastering Python's ease of use and readily accessible
data analysis libraries will prove highly beneficial in my future endeavors, equipping me with a
crucial and versatile skill set for diverse scientific pursuits. Another vital technical skill was
learning LaTeX, a typesetting system widely used in academia, which will be instrumental in
preparing physics reports if I choose to pursue a PhD. In fact, its usefulness is evident in the
present report, as I have employed LaTeX to write the mathematical equations that appear
throughout the document, which demonstrates its significance in delivering professional and well-
undergraduate interns. Their willingness to share their knowledge and insights garnered by a
scientific inquiries and offered me diverse perspectives on nuclear physics research. I also had
the opportunity to converse with several graduate students who were able to give me a deeper
Lab during lectures, and engage with the scientific community. Being part of this collaborative
and supportive environment fostered personal and professional growth, instilling in me a deeper
However, one of the most enlightening experiences was participating in the summer
lecture series. These lectures were intended for an undergraduate physics audience, and hence
were the most digestible source of information and helped tie together my knowledge. The lecturers
were highly enthusiastic about their subjects and deepened my understanding of various research
areas within high-energy physics. The lectures could be divided into three categories.
The first category comprised lectures on Jefferson Lab’s accelerator and detector equipment,
from superconducting radio frequency technology to calorimetry. From this category, the
presentation that stood out the most was Dr. Joe Grames’s lecture on the polarization of the
electron beam. Besides being a charismatic presenter, he effectively explained how the
electrons are produced via photoemission from the gallium arsenide phosphide alloy, with a bias
towards a particular spin dependent on the polarization of the light source. From there, he
physics, revealing the interdisciplinarity of nuclear physics and the pivotal role of chemistry in
advancing our understanding of higher energy phenomena. I particularly enjoyed how this
lecture underscored the rich connections between various scientific disciplines, further fueling
The second group of lectures introduced us to recent and near-future experiments at
Jefferson Lab and their applications. These ranged from Molecular Breast Imaging to the positron
beam upgrade. My favorite lecture was on proton therapy by Dr. Cynthia Keppel; I enjoyed
learning about how nuclear physics, and particularly the Bragg peak of ions, is applied to treat
cancer. As she eloquently characterized the partnership between nuclear physicists and radiation
oncologists, I found myself deeply intrigued by both fields and their collaborative potential in
knowledge and skills for research and working as a mentee in research and development. In
this set of lectures, the topics ranged from solving Fermi problems, presented by an MIT physicist,
to research ethics, presented by the dean of the ODU College of Science. Nevertheless,
my most cherished lecture in this series was delivered by Dr. Douglas Higinbotham, focusing on
"The Lifecycle of Nuclear Physics Experiments at Jefferson Lab and the Future Electron Ion
Collider." It was the culminating presentation of the summer lecture series, and it was fascinating to
see the larger picture. By showcasing the remarkable discoveries of the quasi-elastic electron
scattering experiment that verified the existence of short-range correlations, Dr. Higinbotham
took us on a captivating journey. He masterfully elucidated how this venture commenced with a
discrepancy between the quasi-elastic electron-proton knockout rate and the mean-field theory
prediction, leading to the approval of the experiment by the advisory board in 2001. The
ambitious undertaking involved the construction of a new detector in Hall A to detect neutrons,
and it wasn't until 2008 that the experiment ran, yielding successful results published thereafter
(and several PhDs). Furthermore, Dr. Higinbotham emphasized the iterative nature of this
scientific process, revealing that at the most recent Program Advisory Committee the board approved
Overall, the integration of these diverse resources has been instrumental in shaping my
growth and learning experience during this internship. The combination of mentorship, self-
learning, interactions with peers, exposure to lectures, and the engaging workplace environment
has not only equipped me with valuable technical skills but also fueled my lifelong passion for
science. I am deeply grateful for the rich learning experience this project has provided and look
forward to applying this newfound knowledge and enthusiasm in my future scientific endeavors.
References
Alcorn, J., Anderson, B. D., Aniol, K. A., Annand, J. R. M., Auerbach, L., Arrington, J., Averett,
T., Baker, F. T., Baylac, M., Beise, E. J., Berthot, J., Bertin, P. Y., Bertozzi, W., Bimbot,
L., Black, T., Boeglin, W. U., Boykin, D. V., Brash, E. J., Breton, V., & Breuer, H.
(2004). Basic instrumentation for Hall A at Jefferson Lab. Nuclear Instruments and
Benmokhtar, F., Franklin, G., Quinn, B., Schumacher, R., Camsonne, A., Chen, J., Chudakov,
E., Dejager, C., Degtyarenko, P., Gomez, J., Hansen, O., Higinbotham, D., Jones, M.,
Lerose, J., Michaels, R., Nanda, S., Saha, A., Sulkosky, V., Wojtsekhowski, B., & Fassi,
09-019.pdf
Ellis, N. (n.d.). Trigger and data acquisition. CERN. Retrieved August 2, 2023, from
https://cds.cern.ch/record/1017829/files/p241.pdf
Sauli, F. (2016). The gas electron multiplier (GEM): Operating principles and applications.
https://doi.org/10.1016/j.nima.2015.07.060
Govind, K. (2020, January 6). Pion+ and proton make kaon+ and another strange particle, X.
https://physics.stackexchange.com/q/523428
events/articles/engineering-resource-basics-of-analog-to-digital-converters
https://gspda.jlab.org/wiki/index.php/Main_Page#tab=Summer_Lecture_Series
Jefferson Lab. (2023, July 6). Data Analysis of Hall A Using ROOT Applications in Jupyter.
GitHub. https://github.com/JeffersonLab/JupyterAnalysis
Liyanage, N., & Wojtsekhowski, B. (2007). BigBite: A new large acceptance spectrometer for
https://ui.adsabs.harvard.edu/abs/2007APS..APRE16008L/abstract
astr.gsu.edu/hbase/Particles/delta.html
https://home.cern/news/news/physics/fifty-years-quarks
Puckett, A. (2021, August 2). SBS Installation in Hall A at Jefferson Lab, July 2021 | Professor
https://puckett.physics.uconn.edu/2021/08/02/sbs-installation-in-hall-a-at-jefferson-lab-
july-2021/
ROOT. (n.d.). ROOT Manual. ROOT; CERN. Retrieved August 2, 2023, from
https://root.cern/manual
https://hedberg.web.cern.ch/hedberg/lectures/ch7_2005_lec2.pdf
https://inpp.ohio.edu/~rochej/group_page/tips/crash_course_for_summer_undergrad_rese
arch.pdf