
How to do real-time Deep Learning on FPGAs

Introduction

Universität Zürich - Physik Institut, 6 February 2019


Motivation:
use cases at the LHC and beyond

06.02.2019 fpa4hep: real-time deep learning on FPGAs 2


Future challenges @ LHC
Extreme bunch crossing frequency of 40 MHz → extreme data rates O(100 TB/s)

LHC TODAY:
‣ ~40 collisions/event
‣ ~10 s/event processing time

HL-LHC:
‣ ~200 collisions/event
‣ more granular detector
‣ ~minutes/event processing time
‣ flat budget for computing resources
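
As a rough order-of-magnitude check of the headline rate (my own arithmetic, assuming raw event sizes of order a few MB per bunch crossing, which is not stated on this slide):

$$40\times 10^{6}\ \text{events/s} \times \mathcal{O}(\text{few MB/event}) \approx \mathcal{O}(100\ \text{TB/s})$$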

06.02.2019 fpa4hep: real-time deep learning on FPGAs 3


Future challenges @ HL-LHC
Modern machine learning methods might be the way out!

Current event reconstruction algorithms will not be sustainable

Recast the problem instead as a machine learning problem:

‣ Excellent physics performance

‣ Intrinsically parallelizable → high speed

‣ Follow industry trends in developing new devices optimized for ML and speed up the inference

[Figure: # collisions/event ~ event complexity]

06.02.2019 fpa4hep: real-time deep learning on FPGAs 4


The LHC big data problem

[Figure: CMS trigger data flow — pp collisions at 40 MHz → Level-1 Trigger → 100 kHz, 1 MB/evt → High-Level Trigger → 1 kHz → offline computing]

• Level-1 Trigger (hardware)
  - 40 MHz in / 100 kHz out
  - Absorbs 100s of TB/s
  - 99.75% of events rejected
  - Trigger decision to be made in ~4 μs
  - Coarse local reconstruction
  - FPGAs / hardware implemented

• High-Level Trigger (software)
  - 99% rejected
  - decision in ~100s of ms
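
For orientation, the quoted rejection fractions follow directly from the rate reductions in the diagram (my arithmetic, not on the slide):

$$1-\frac{100\ \text{kHz}}{40\ \text{MHz}} = 99.75\%, \qquad 1-\frac{1\ \text{kHz}}{100\ \text{kHz}} = 99\%$$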

06.02.2019 fpa4hep: real-time deep learning on FPGAs 5


The LHC big data problem

[Figure: CMS trigger data flow — pp collisions at 40 MHz → Level-1 Trigger → 100 kHz, 1 MB/evt → High-Level Trigger → 1 kHz → offline computing]

• Level-1 Trigger (hardware)
  - 99.75% rejected
  - decision in ~4 μs

• High-Level Trigger (software)
  - 100 kHz in / 1 kHz out
  - Output: ~500 kB/event
  - Processing time ~300 ms
  - 99% rejected
  - Simplified global reconstruction
  - Software implemented on CPUs

06.02.2019 fpa4hep: real-time deep learning on FPGAs 6


The LHC big data problem

[Figure: CMS trigger data flow — pp collisions at 40 MHz → Level-1 Trigger → 100 kHz, 1 MB/evt → High-Level Trigger → 1 kHz → offline computing]

• Level-1 Trigger (hardware): 99.75% rejected, decision in ~4 μs
• High-Level Trigger (software): 99% rejected, decision in ~100s of ms

• Offline computing
  - Output: max. 1 MB/event
  - Accurate global reconstruction
  - Processing time ~20 s
  - Software implemented on CPUs

06.02.2019 fpa4hep: real-time deep learning on FPGAs 7


The LHC big data problem

[Figure: CMS trigger data flow, now with the latency scale: 1 ns → 1 μs → 100 ms → 1 s]

• Level-1 Trigger (hardware): 99.75% rejected, decision in ~4 μs
• High-Level Trigger (software): 99% rejected, decision in ~100s of ms

Deploy ML algorithms very early in the game
Challenge: strict latency constraints!

06.02.2019 fpa4hep: real-time deep learning on FPGAs 8


[Figure repeated: CMS trigger data flow with the latency scale from 1 ns to 1 s]

• Level-1 Trigger (hardware): 99.75% rejected, decision in ~4 μs
• High-Level Trigger (software): 99% rejected, decision in ~100s of ms

06.02.2019 fpa4hep: real-time deep learning on FPGAs 9


Beyond LHC
Ex: self-driving cars

A single self-driving test vehicle can produce ~30 TB/day

There are over 250 million cars on the road in the US alone

If < 1% replaced by autonomous vehicles by 2020
→ HUGE amount of data generated, not manageable by central servers!

06.02.2019 fpa4hep: real-time deep learning on FPGAs 10


Beyond LHC
Ex: self-driving cars

A single self-driving test vehicle can produce ~30 TB/day

There are over 250 million cars on the road in the US alone

If < 1% replaced by autonomous vehicles by 2020
→ HUGE amount of data generated, not manageable by central servers!

Need edge computing architectures, low power and small in size, to run powerful data analytics programs on board

NB: latency matters! Even a few milliseconds of delay can result in an accident!
The stakes are too high to wait for the answer from a distant cloud server.
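
Putting the slide's numbers together (a rough estimate of my own, not from the original): 1% of 250 million cars is 2.5 million vehicles, each producing ~30 TB/day, i.e.

$$2.5\times 10^{6}\ \text{vehicles} \times 30\ \text{TB/day} \approx 75\ \text{EB/day}$$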

06.02.2019 fpa4hep: real-time deep learning on FPGAs 11


People might have different opinions… but today
we learn about FPGAs & Machine Learning!
FPGA
What are FPGAs?
“programmable hardware”

Field Programmable Gate Arrays are reprogrammable integrated circuits

Contain an array of logic cells used to configure low-level operations (bit masking, shifting, addition)

[FPGA diagram, with a zoom on one logic cell: look-up table (logic) + flip-flop (registers)]
- DSPs (multiply-accumulate, etc.)
- Flip Flops (registers / distributed memory)
- LUTs (logic)
- Block RAMs (memories)

06.02.2019 fpa4hep: real-time deep learning on FPGAs 13


FPGA
What are FPGAs?
“programmable hardware”

Field Programmable Gate Arrays are reprogrammable integrated circuits

Contain an array of logic cells used to configure low-level operations (bit masking, shifting, addition)

Also contain embedded components:
- Digital Signal Processors (DSPs): logic units used for multiplications
- Random-access memories (RAMs): embedded memory elements

[FPGA diagram labels: DSPs (multiply-accumulate, etc.), Flip Flops (registers / distributed memory), LUTs (logic), Block RAMs (memories)]

06.02.2019 fpa4hep: real-time deep learning on FPGAs 14


FPGA
What are FPGAs?
“programmable hardware”

Field Programmable Gate Arrays are reprogrammable integrated circuits

Contain an array of logic cells embedded with DSPs, BRAMs, etc.

High-speed input/output to handle the large bandwidth

Support highly parallel algorithm implementations

Low power (relative to CPU/GPU)

Embedded components:
- Digital Signal Processors (DSPs): logic units used for multiplications
- Random-access memories (RAMs): embedded memory elements
- Flip-flops (FFs) and look-up tables (LUTs) for additions

[FPGA diagram labels: DSPs (multiply-accumulate, etc.), Flip Flops (registers / distributed memory), LUTs (logic), Block RAMs (memories)]

06.02.2019 fpa4hep: real-time deep learning on FPGAs 15


How are FPGAs programmed?

Hardware Description Languages

HDLs are programming languages which describe electronic circuits

High Level Synthesis (HLS)

- generates HDL from more common C/C++ code
- pre-processor directives and constraints are used to optimize the timing
- drastic decrease in firmware development time!

Today we use Xilinx Vivado HLS [*]

[*] https://www.xilinx.com/support/documentation/sw_manuals/xilinx2014_1/ug902-vivado-high-level-synthesis.pdf
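
To give a flavour of what the slide means by directives, here is a minimal sketch of an HLS-style C++ kernel (my own toy example, not taken from the tutorial material; the function and array names are invented). The pragmas tell the tool how to map the loop onto hardware, trading latency against resource usage:

```cpp
// Toy Vivado HLS kernel: y = sum_i w[i] * x[i] over 16 elements.
// Aside from unknown-pragma warnings, this also compiles as plain C++.
int mac16(const int w[16], const int x[16]) {
    #pragma HLS PIPELINE II=1   // ask HLS to accept a new input set every clock cycle
    int acc = 0;
    for (int i = 0; i < 16; i++) {
        #pragma HLS UNROLL      // fully unroll the loop: 16 multipliers working in parallel
        acc += w[i] * x[i];
    }
    return acc;
}
```

Removing the UNROLL directive (or giving it a partial factor) makes the tool reuse a single multiplier over several clock cycles: fewer resources, more latency. This is the same trade-off you will steer later through hls4ml's configuration rather than by hand.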
06.02.2019 fpa4hep: real-time deep learning on FPGAs 16
Neural network inference

$$x_n = g_n\left(W_{n,n-1}\, x_{n-1} + b_n\right)$$

- activation function $g_n$: precomputed and stored in BRAMs
- multiplication ($W_{n,n-1}\,x_{n-1}$): DSPs
- addition ($+\,b_n$): logic cells

$$N_{\text{multiplications}} = \sum_{n=2}^{N} L_{n-1} \times L_n$$

[Example architecture, $N$ layers with $L_n$ nodes in layer $n$ and $M$ hidden layers: input layer with 16 inputs → 64 nodes (activation: ReLU) → 32 nodes (ReLU) → 32 nodes (ReLU) → output layer with 5 outputs (activation: SoftMax)]
06.02.2019 fpa4hep: real-time deep learning on FPGAs 17


Neural network inference

$$x_n = g_n\left(W_{n,n-1}\, x_{n-1} + b_n\right), \qquad N_{\text{multiplications}} = \sum_{n=2}^{N} L_{n-1} \times L_n$$

[Same example architecture: 16 inputs → 64 (ReLU) → 32 (ReLU) → 32 (ReLU) → 5 outputs (SoftMax)]

How many resources? DSPs, LUTs, FFs?
Does the model fit in the latency requirements?
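
As a rough rule of thumb (my own estimate, not a statement from the slides): in a fully parallel implementation each product is assigned its own multiplier, so

$$N_{\text{DSP}} \approx N_{\text{multiplications}} \approx 4256$$

for this example network if all multiplications happen in the same clock cycle — a sizeable fraction of (or more than) the DSP budget of typical FPGAs. That is why the resource and latency trade-offs introduced on the next slides matter.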

06.02.2019 fpa4hep: real-time deep learning on FPGAs 18


Today you are going to implement a NN on an FPGA with this package:

hls4ml — high level synthesis for machine learning
https://hls-fpga-machine-learning.github.io/hls4ml/
Reference: https://arxiv.org/abs/1804.06913

From the paper, Sec. 2.1 "hls4ml concept": "[…] network in terms of not only performance but also resource usage and latency. Our basic task is to translate a trained neural network by taking a model architecture, weights, and biases and implementing them in HLS in an automated fashion. This automated procedure is the task of the software/firmware package, hls4ml. A schematic of a typical workflow is illustrated in Fig. 1."

Figure 1: A typical workflow to translate a model into a firmware implementation using hls4ml.
06.02.2019 fpa4hep: real-time deep learning on FPGAs 19
Efficient NN design for FPGAs
FPGAs provide huge flexibility
Performance depends on how well you take advantage of this

Constraints:
- Input bandwidth
- FPGA resources
- Latency

Today you will learn how to optimize your project (from NN training to designing an FPGA project) through:

- compression: reduce the number of synapses or neurons

- quantization: reduce the precision of the calculations (inputs, weights, biases)

- parallelization: tune how much to parallelize to make the inference faster/slower versus FPGA resources (see the sketch below)
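
To make the three knobs concrete, here is a minimal hand-written sketch of one dense layer in Vivado HLS style C++ (my own illustration, not the code hls4ml generates; the names, sizes and fixed-point precision are assumptions). Quantization shows up as the ap_fixed type, parallelization as the unroll/pipeline directives:

```cpp
#include "ap_fixed.h"

// Quantization: fixed-point types instead of 32-bit floats.
// <16,6> means 16 bits total, 6 of them integer bits (an assumed, tunable choice).
typedef ap_fixed<16, 6> data_t;

#define N_IN  16
#define N_OUT 64

// One dense layer with ReLU: out = max(0, W*in + b). Illustration only.
void dense_relu(const data_t in[N_IN], data_t out[N_OUT],
                const data_t W[N_OUT][N_IN], const data_t b[N_OUT]) {
    #pragma HLS ARRAY_PARTITION variable=in complete  // make all inputs readable in parallel
    for (int i = 0; i < N_OUT; i++) {
        #pragma HLS PIPELINE II=1   // start one output neuron per clock cycle
        data_t acc = b[i];
        for (int j = 0; j < N_IN; j++) {
            #pragma HLS UNROLL      // parallelization: all 16 products at once (16 DSPs)
            acc += W[i][j] * in[j];
        }
        out[i] = (acc > 0) ? acc : (data_t)0;  // ReLU
    }
}
```

Compression would appear here as weights pruned to zero, whose multiplications the synthesis tool can optimize away; in the hands-on you will control all three knobs through hls4ml's configuration instead of editing HLS code by hand.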

06.02.2019 fpa4hep: real-time deep learning on FPGAs 20


Today’s hls4ml hands on
• First part:

- get familiar with the package, its functionalities, and design synthesis by running it with one of the provided trained NNs

- learn how to read out an estimate of FPGA resources and latency for a NN after synthesis

- learn how to optimize the design with quantization and parallelization

• Second part:

- learn how to export the HLS design to firmware with SDAccel

• Third part:

- learn how to do model compression and study its effect on FPGA resources/latency

• Fourth part:

- learn how to accelerate NN inference firmware on a real FPGA (provided on the Amazon cloud) with SDAccel

- timing and resource studies after running on the real FPGA

06.02.2019 fpa4hep: real-time deep learning on FPGAs 21
