Deep Learning in Particle Physics
"Machine learning", "neural network"
Even bananas emit neutrinos! And the neutrino is a shape shifter (it oscillates between flavors)!
Super-Kamiokande (1997): 50,000 tons of ultra-pure water, "photographed" by 11,000 PMTs
smashing more than 10 million high-energy protons, producing a few trillion neutrinos every 1.3 seconds
νe creates an electron; νμ creates a muon.
An electron "showers" (cone-shaped hits) in the detector; a muon creates a long "track" (line) in the detector.
HOW TO IDENTIFY A NEUTRINO (IF IT WERE A CAT)
Non-ML approach: a very hard and tedious task
Development Workflow (Machine Learning)
image from: https://towardsdatascience.com/intuitive-deep-learning-part-2-cnns-for-computer-vision-472bbb2c8060
Apply the exact same neuron for all four quadrants. After applying that neuron to the other quadrants, we have four different numbers.
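Applying the "exact same neuron" to all four quadrants is the weight sharing of a convolution. A minimal NumPy sketch; the image and filter values are made up for illustration:

```python
import numpy as np

# A hypothetical 4x4 "image" split into four 2x2 quadrants.
image = np.array([[1., 0., 2., 1.],
                  [0., 1., 1., 3.],
                  [2., 2., 0., 0.],
                  [1., 0., 1., 1.]])

# One "neuron": a single 2x2 filter whose weights are shared
# across all four quadrants (this is what a convolution does).
filt = np.array([[1., -1.],
                 [-1., 1.]])

def apply_neuron(quadrant, weights):
    """Weighted sum of a quadrant with the shared weights."""
    return float(np.sum(quadrant * weights))

# Apply the exact same neuron to each quadrant -> four numbers.
quadrants = [image[:2, :2], image[:2, 2:], image[2:, :2], image[2:, 2:]]
outputs = [apply_neuron(q, filt) for q in quadrants]
print(outputs)  # [2.0, 3.0, -1.0, 0.0]
```

Because the same weights are reused everywhere, the network learns one pattern detector that scans the whole image instead of one set of weights per location.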
Take the maximum of the four numbers to get a single number (max pooling).
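Taking the maximum is max pooling, which reduces the four quadrant responses to a single number while keeping the strongest activation. A one-line NumPy sketch (values illustrative):

```python
import numpy as np

# Four numbers from applying the shared neuron to the four
# quadrants (illustrative values).
quadrant_outputs = np.array([2.0, 3.0, -1.0, 0.0])

# Max pooling: keep only the strongest response.
pooled = float(np.max(quadrant_outputs))
print(pooled)  # 3.0
```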
Plastic scintillator interspersed between passive targets.
Each triangular hit = a scintillator plane with optic fiber.
"vertex": start of the neutrino interaction.
The detector "lights up" when a charged particle passes through it; colors represent the energy deposited.
EM-like particles and many protons/pions produce a "shower".
u view: rotated 60° from y; v view: rotated −60° from y
In experimental particle physics, we use simulation (synthetic data), where we know the true information (labeled data). This is useful for tuning pattern recognition.
Input layer: hit time in the x, u, v views; hidden layers; output layer.
Supervised learning: use simulation where we know the "truth".
https://arxiv.org/abs/1808.08332
Journal of Instrumentation, Volume 13, Number 11, 2018
Softmax layer:
True z-segment: labeled data, the true information about which segment the vertex is generated in.
Reconstructed z-segment: a vector of softmax probabilities over the DCNN-predicted segments.
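The reconstructed z-segment is a vector of softmax probabilities. A minimal sketch of how raw network scores (logits) would become such a vector; the number of segments and the scores here are made up:

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

# Hypothetical DCNN scores for five z-segments.
logits = np.array([0.5, 2.0, -1.0, 0.1, 0.3])
probs = softmax(logits)

# The reconstructed z-segment is the highest-probability entry,
# to be compared against the true z-segment known from simulation.
predicted_segment = int(np.argmax(probs))
print(predicted_segment)  # 1
```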
Training and prediction happen in different domains:
Train with labeled data: in our case, simulation/synthetic data (let's call it the source domain).
Test with unlabeled data: in our case, real data (let's call it the target domain).
(figure: "cat" vs. "not cat" examples across the two domains)
A performance gap can arise from training our models in one domain and applying them in another.
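The cost of this domain gap can be seen in a toy example: a simple threshold classifier fit on source-domain data loses accuracy when the target-domain distribution is shifted. All data and numbers here are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Source domain (simulation): class 0 near -1, class 1 near +1.
src_x = np.concatenate([rng.normal(-1, 0.3, 500), rng.normal(1, 0.3, 500)])
src_y = np.concatenate([np.zeros(500), np.ones(500)])

# Target domain (real data): same classes, but everything shifted by +1.
tgt_x = src_x + 1.0
tgt_y = src_y

# "Train" the simplest classifier: a threshold at the source midpoint.
threshold = src_x.mean()

def accuracy(x, y, thr):
    """Fraction of points where (x > thr) matches the class label."""
    return float(np.mean((x > thr) == (y == 1)))

print(accuracy(src_x, src_y, threshold))  # high on the source domain
print(accuracy(tgt_x, tgt_y, threshold))  # noticeably lower on the shifted target
```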
With DANN:
Label predictor: produces the output; minimize the loss of the label classifier so that the network can learn the classification task.
Domain classifier: works internally; it is trained adversarially, so the features it sees cannot tell the domains apart.
The network develops an insensitivity to features that are present in one domain but not the other, and trains only on features that are common to both domains.
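The mechanism behind this insensitivity in DANN is the gradient reversal layer: an identity in the forward pass whose backward pass flips the gradient sign, so the feature extractor is pushed to *confuse* the domain classifier. The paper's network used Caffe; this NumPy version is only an illustrative sketch:

```python
import numpy as np

class GradientReversal:
    """Identity on the forward pass; multiplies the incoming
    gradient by -lambda on the backward pass."""
    def __init__(self, lam=1.0):
        self.lam = lam

    def forward(self, features):
        return features  # pass features through unchanged

    def backward(self, grad_from_domain_classifier):
        # Flip the sign: the feature extractor now *ascends* the
        # domain-classification loss, making features domain-invariant.
        return -self.lam * grad_from_domain_classifier

grl = GradientReversal(lam=0.5)
feats = np.array([1.0, -2.0, 3.0])
grad = np.array([0.2, -0.4, 0.6])

print(grl.forward(feats))   # unchanged features
print(grl.backward(grad))   # [-0.1  0.2 -0.3]
```

The single hyperparameter lambda trades off how strongly domain confusion is enforced against the label-classification objective.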
https://arxiv.org/abs/1808.08332
Journal of Instrumentation, Volume 13, Number 11, 2018
The deep learning network was implemented using Caffe.
Blue vs. black: the model trained in the same domain (FSI active, blue curve) outperforms the model trained with an out-of-domain physics model (FSI inactive, black curve).
Analysis framework developed for processing LArTPC image data. Supports C++ data structures, an IO interface, and data-processing machinery.
Directly manipulates the image data, with or without OpenCV.
Interfaces with open-source deep learning software, including Caffe (Berkeley Lab) and TensorFlow (Google).
ROOT format: easy to handle (for particle physics); can also do things like cropping and resizing images.
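Cropping and resizing a detector image need nothing more than array indexing; a NumPy-only sketch (the function names here are my own illustration, not the framework's API):

```python
import numpy as np

def crop(img, row0, col0, height, width):
    """Return a rectangular sub-image (a view, no copy)."""
    return img[row0:row0 + height, col0:col0 + width]

def resize_nearest(img, new_h, new_w):
    """Nearest-neighbor resize without external libraries."""
    rows = np.arange(new_h) * img.shape[0] // new_h
    cols = np.arange(new_w) * img.shape[1] // new_w
    return img[np.ix_(rows, cols)]

img = np.arange(16, dtype=float).reshape(4, 4)  # toy 4x4 "image"
patch = crop(img, 1, 1, 2, 2)        # 2x2 patch around the center
big = resize_nearest(patch, 4, 4)    # upscale back to 4x4
print(patch.shape, big.shape)  # (2, 2) (4, 4)
```

In practice OpenCV's interpolating resizes give smoother results; the point is only that image-format detector data can be manipulated like any array.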
https://doi.org/10.48550/arXiv.2103.06992
Traditional Reconstruction vs. Semantic Segmentation
https://arxiv.org/abs/2008.01242
https://a3d3.ai/