
Talk
Title Laurelin: Java-native ROOT I/O for Apache Spark
Author(s) Melo, Andrew Malone (speaker) (Vanderbilt University (US))
Corporate author(s) CERN. Geneva
Imprint 2021-05-19. - 663.
Series Conferences (25th International Conference on Computing in High Energy & Nuclear Physics)
Lecture note on 2021-05-19T17:53:00
Subject category Conferences
Abstract Apache Spark is one of the predominant frameworks in the big data space, providing a fully functional query processing engine, vendor support for hardware accelerators, and performant integrations with scientific computing libraries. One difficulty in adapting conventional big data frameworks to HEP workflows is their lack of support for the ROOT file format. Laurelin implements ROOT I/O as a pure Java library, with no bindings to the C++ ROOT implementation, and is readily installable via standard Java packaging tools. It provides a performant interface that lets Spark read (and soon write) ROOT TTrees, so users can process these data without a pre-processing phase that converts them to an intermediate format.
Related document Conference Paper EPJ Web Conf. 251 (2021) 02072
Copyright/License © 2021-2024 CERN
 Record created 2021-05-21, last modified 2024-06-26


External links:
Talk details
Event details