1.
Open-source FPGA-ML codesign for the MLPerf Tiny Benchmark
/ Borras, Hendrik (Heidelberg U.) ; Di Guglielmo, Giuseppe (Columbia U.) ; Duarte, Javier (UC, San Diego) ; Ghielmetti, Nicolò (CERN) ; Hawks, Ben (Fermilab) ; Hauck, Scott (Washington U., Seattle) ; Hsu, Shih-Chieh (Washington U., Seattle) ; Kastner, Ryan (UC, San Diego) ; Liang, Jason (UC, San Diego) ; Meza, Andres (UC, San Diego) et al.
We present our development experience and recent results for the MLPerf Tiny Inference Benchmark on field-programmable gate array (FPGA) platforms. [...]
arXiv:2206.11791; FERMILAB-CONF-22-479-SCD.
Fermilab Library Server - Fulltext
2.
AIgean: An Open Framework for Deploying Machine Learning on Heterogeneous Clusters
/ Tarafdar, Naif (Toronto U.) ; Di Guglielmo, Giuseppe (Columbia U.) ; Harris, Philip C (MIT) ; Krupa, Jeffrey D (MIT) ; Loncar, Vladimir (CERN) ; Rankin, Dylan S (MIT) ; Tran, Nhan (Fermilab) ; Wu, Zhenbin (Illinois U., Chicago) ; Shen, Qianfeng Clark (Toronto U.) ; Chow, Paul (Toronto U.)
AIgean, pronounced like the sea, is an open framework to build and deploy machine learning (ML) algorithms on a heterogeneous cluster of devices (CPUs and FPGAs). We leverage two open source projects: Galapagos, for multi-FPGA deployment, and hls4ml, for generating ML kernels synthesizable using Vivado HLS. [...]
FERMILAB-PUB-22-529-SCD.
2022 - 32 p.
Published in: ACM Trans. Reconf. Tech. Syst. 15 (2022) 1-32
3.
Applications and Techniques for Fast Machine Learning in Science
/ Deiana, Allison McCarn (Southern Methodist U.) ; Tran, Nhan (Fermilab ; Northwestern U. (main)) ; Agar, Joshua (Lehigh U. (main)) ; Blott, Michaela (Xilinx, Dublin) ; Di Guglielmo, Giuseppe (Columbia U. (main)) ; Duarte, Javier (UC, San Diego) ; Harris, Philip (MIT) ; Hauck, Scott (Washington U., Seattle) ; Liu, Mia (Purdue U.) ; Neubauer, Mark S. (Illinois U., Urbana) et al.
In this community review report, we discuss applications and techniques for fast machine learning (ML) in science -- the concept of integrating powerful ML methods into the real-time experimental data processing loop to accelerate scientific discovery. The material for the report builds on two workshops held by the Fast ML for Science community and covers three main areas: applications for fast ML across a number of scientific domains; techniques for training and implementing performant and resource-efficient ML algorithms; and computing architectures, platforms, and technologies for deploying these algorithms. [...]
arXiv:2110.13041; FERMILAB-PUB-21-502-AD-E-SCD.
2022-04-12 - 56 p.
Published in: Front. Big Data 5 (2022) 787421
Fulltext: 2110.13041 - PDF; fermilab-pub-21-502-ad-e-scd - PDF; Fulltext from Publisher: PDF; External link: Fermilab Library Server
4.
A reconfigurable neural network ASIC for detector front-end data compression at the HL-LHC
/ Di Guglielmo, Giuseppe (Columbia U.) ; Fahim, Farah (Fermilab ; Northwestern U.) ; Herwig, Christian (Fermilab) ; Valentin, Manuel Blanco (Northwestern U.) ; Duarte, Javier (UC, San Diego) ; Gingu, Cristian (Fermilab) ; Harris, Philip (MIT) ; Hirschauer, James (Fermilab) ; Kwok, Martin (Brown U.) ; Loncar, Vladimir (CERN ; Belgrade, Inst. Phys.) et al.
Despite advances in the programmable logic capabilities of modern trigger systems, a significant bottleneck remains in the amount of data to be transported from the detector to off-detector logic where trigger decisions are made. We demonstrate that a neural network autoencoder model can be implemented in a radiation tolerant ASIC to perform lossy data compression alleviating the data transmission problem while preserving critical information of the detector energy profile. [...]
arXiv:2105.01683; FERMILAB-PUB-21-217-CMS-E-SCD.
2021-06-07 - 9 p.
Published in: IEEE Trans. Nucl. Sci. 68 (2021) 2179
Fulltext: 5a82c8a6b63c02fc015568642085785c - PDF; 2105.01683 - PDF; fermilab-pub-21-217-cms-e-scd - PDF; External link: Fermilab Accepted Manuscript
5.
hls4ml: An Open-Source Codesign Workflow to Empower Scientific Low-Power Machine Learning Devices
/ Fahim, Farah (Northwestern U. ; Fermilab) ; Hawks, Benjamin (Fermilab) ; Herwig, Christian (Fermilab) ; Hirschauer, James (Fermilab) ; Jindariani, Sergo (Fermilab) ; Tran, Nhan (Fermilab) ; Carloni, Luca P. (Columbia U.) ; Di Guglielmo, Giuseppe (Columbia U.) ; Harris, Philip (MIT) ; Krupa, Jeffrey (MIT) et al.
Accessible machine learning algorithms, software, and diagnostic tools for energy-efficient devices and systems are extremely valuable across a broad range of application domains. [...]
arXiv:2103.05579; FERMILAB-CONF-21-080-SCD.
10 p.
Fermilab Library Server - Fulltext
6.
Fast convolutional neural networks on FPGAs with hls4ml
/ Aarrestad, Thea (CERN) ; Loncar, Vladimir (CERN ; Belgrade, Inst. Phys.) ; Ghielmetti, Nicolò (CERN ; Belgrade, Inst. Phys.) ; Pierini, Maurizio (CERN) ; Summers, Sioni (CERN) ; Ngadiuba, Jennifer (Caltech, Pasadena (main)) ; Petersson, Christoffer (Unlisted, SE ; Chalmers U. Tech.) ; Linander, Hampus (Unlisted, SE) ; Iiyama, Yutaro (Tokyo U., ICEPP) ; Di Guglielmo, Giuseppe (Columbia U. (main)) et al.
We introduce an automated tool for deploying ultra low-latency, low-power deep neural networks with convolutional layers on FPGAs. By extending the hls4ml library, we demonstrate an inference latency of $5\,\mu$s using convolutional architectures, targeting microsecond latency applications like those at the CERN Large Hadron Collider. [...]
arXiv:2101.05108; FERMILAB-PUB-21-130-SCD.
2021-07-16 - 25 p.
Published in: Mach. Learn. Sci. Technol. 2 (2021) 045015
Fulltext: 2101.05108 - PDF; fermilab-pub-21-130-scd - PDF; document - HTM; Fulltext from Publisher: PDF; External link: Fermilab Library Server
7.
Distance-Weighted Graph Neural Networks on FPGAs for Real-Time Particle Reconstruction in High Energy Physics
/ Iiyama, Yutaro (Tokyo U., ICEPP) ; Cerminara, Gianluca (CERN) ; Gupta, Abhijay (CERN) ; Kieseler, Jan (CERN) ; Loncar, Vladimir (CERN) ; Pierini, Maurizio (CERN) ; Qasim, Shah Rukh (CERN) ; Rieger, Marcel (CERN) ; Summers, Sioni (CERN) ; Van Onsem, Gerrit (CERN) et al.
Graph neural networks have been shown to achieve excellent performance for several crucial tasks in particle physics, such as charged particle tracking, jet tagging, and clustering. An important domain for the application of these networks is the FPGA-based first layer of real-time data filtering at the CERN Large Hadron Collider, which has strict latency and resource constraints. [...]
arXiv:2008.03601; FERMILAB-PUB-20-405-E-SCD.
2021-01-12 - 15 p.
Published in: Front. Big Data 3 (2020) 598927
Fulltext: 2008.03601 - PDF; fermilab-pub-20-405-e-scd - PDF; Fulltext from Publisher: PDF; External link: Fermilab Library Server (fulltext available)
8.
Compressing deep neural networks on FPGAs to binary and ternary precision with HLS4ML
/ Loncar, Vladimir (Belgrade, Inst. Phys. ; CERN) ; Hoang, Duc (Rhodes Coll.) ; Di Guglielmo, Giuseppe (Columbia U.) ; Duarte, Javier (UC, San Diego) ; Harris, Philip (MIT, Cambridge, CTP) ; Jindariani, Sergo (Fermilab) ; Kreinar, Edward (Unlisted, US, VA) ; Liu, Mia (Fermilab) ; Ngadiuba, Jennifer (CERN) ; Pedro, Kevin (Fermilab) et al.
We present the implementation of binary and ternary neural networks in the hls4ml library, designed to automatically convert deep neural network models to digital circuits with FPGA firmware. Starting from benchmark models trained with floating point precision, we investigate different strategies to reduce the network's resource consumption by reducing the numerical precision of the network parameters to binary or ternary. [...]
arXiv:2003.06308; FERMILAB-PUB-20-167-PPD-SCD.
2020-12-01 - 12 p.
Published in: Mach. Learn. Sci. Tech. 2 (2021) 015001
Fulltext: fermilab-pub-20-167-ppd-scd - PDF; 2003.06308 - PDF; Fulltext from publisher: PDF; External link: Fermilab Library Server (fulltext available)