002920212 001__ 2920212
002920212 003__ SzGeCERN
002920212 005__ 20241218205410.0
002920212 0247_ $$2DOI$$9EDP Sciences$$a10.1051/epjconf/202429506006
002920212 0248_ $$aoai:cds.cern.ch:2920212$$pcerncds:FULLTEXT$$pcerncds:CERN:FULLTEXT$$pcerncds:CERN
002920212 035__ $$9OSTI$$a2468774
002920212 035__ $$9https://inspirehep.net/api/oai2d$$aoai:inspirehep.net:2786416$$d2024-12-17T09:34:03Z$$h2024-12-18T05:03:51Z$$mmarcxml
002920212 035__ $$9Inspire$$a2786416
002920212 037__ $$aFERMILAB-CONF-24-0684-CSAID
002920212 041__ $$aeng
002920212 100__ $$aGuiraud, Enrico$$uCERN
002920212 245__ $$9EDP Sciences$$aBoosting RDataFrame performance with transparent bulk event processing
002920212 260__ $$c2024
002920212 300__ $$a8 p
002920212 520__ $$9EDP Sciences$$aRDataFrame is ROOT’s high-level interface for Python and C++ data analysis. Since it first became available, RDataFrame adoption has grown steadily and it is now poised to be a major component of analysis software pipelines for LHC Run 3 and beyond. Thanks to its design inspired by declarative programming principles, RDataFrame enables the development of high-performance, highly parallel analyses without requiring expert knowledge of multi-threading and I/O: user logic is expressed in terms of self-contained, small computation kernels tied together by a high-level API. This design completely decouples analysis logic from its actual execution, and opens several interesting avenues for workflow optimization. In particular, in this work we explore the benefits of moving internal data processing from an event-by-event to a bulk-by-bulk loop. This refactoring dramatically reduces the framework’s runtime overheads; in collaboration with the I/O layer it improves data access patterns; it exposes information that optimizing compilers might use to auto-vectorize the invocation of user-defined computations; finally, while existing user-facing interfaces remain unaffected, it becomes possible to additionally offer interfaces that explicitly expose bulks of events, useful e.g. for the injection of GPU kernels into the analysis workflow. In order to inform similar future R&D, design challenges will be presented, as well as an investigation of the relevant time-memory trade-off backed by novel performance benchmarks.
002920212 540__ $$aCC-BY-4.0$$bEDP Sciences$$uhttps://creativecommons.org/licenses/by/4.0/
002920212 542__ $$3publication$$dThe authors$$g2024
002920212 690C_ $$aARTICLE
002920212 690C_ $$aCERN
002920212 700__ $$aBlomer, Jakob$$uCERN
002920212 700__ $$aCanal, Philippe$$uFermilab
002920212 700__ $$aNaumann, Axel$$uCERN
002920212 773__ $$c06006$$pEPJ Web Conf.$$v295$$wC23-05-08$$y2024
002920212 8564_ $$uhttps://lss.fnal.gov/archive/2024/conf/fermilab-conf-24-0684-csaid.pdf$$yFermilab Library Server
002920212 8564_ $$82701634$$s1102482$$uhttp://cds.cern.ch/record/2920212/files/3afc9b57f03c0d303680c14cbbcb8c23.pdf$$yFulltext
002920212 8564_ $$82701635$$s1873779$$uhttp://cds.cern.ch/record/2920212/files/document.pdf$$yFulltext
002920212 960__ $$a13
002920212 962__ $$b2853081$$k06006$$nnorfolk20230508
002920212 980__ $$aARTICLE
002920212 980__ $$aConferencePaper