ATLAS Note
Report number ATL-SOFT-PROC-2018-047
Title Building and Using Containers on HPCs for the ATLAS Experiment
Author(s) Yang, Wei (SLAC National Accelerator Laboratory) ; Benjamin, Douglas (Argonne National Laboratory) ; Childers, John Taylor (Argonne National Laboratory) ; Lesny, David (University of Illinois at Urbana-Champaign) ; Oleynik, Danila (Joint Institute for Nuclear Research) ; Panitkin, Sergey (Brookhaven National Laboratory (BNL)) ; Tsulaia, Vakhtang (Lawrence Berkeley National Laboratory and University of California, Berkeley) ; Zhao, Xin (Brookhaven National Laboratory (BNL))
Corporate Author(s) The ATLAS collaboration
Collaboration ATLAS Collaboration
Publication 2018
Imprint 30 Nov 2018
Number of pages mult.
In: 23rd International Conference on Computing in High Energy and Nuclear Physics, CHEP 2018, Sofia, Bulgaria, 9 - 13 Jul 2018
Subject category Particle Physics - Experiment
Accelerator/Facility, Experiment CERN LHC ; ATLAS
Free keywords HPC ; Container ; IO ; Offline Computing
Abstract The HPC environment presents several challenges to the ATLAS experiment in running its automated computational workflows smoothly and efficiently, in particular with software distribution and I/O load. CVMFS, a vital component of the LHC Computing Grid, is not always available in HPC environments. ATLAS computing first experimented with all-inclusive containers, and later developed an environment to produce such containers for both Shifter and Singularity. These all-inclusive containers include most of the recent ATLAS software releases, database releases, and other tools extracted from CVMFS. This has allowed ATLAS to distribute software automatically to HPCs in an environment identical to that provided by CVMFS, and it significantly reduced the metadata I/O load on the HPCs' shared file systems. Production operation at NERSC has shown that with this type of container we can transparently fit into the previously developed ATLAS operational methods while scaling up to run many more jobs.
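To illustrate the approach the abstract describes, an all-inclusive image could be built with a Singularity definition file that copies CVMFS-extracted trees into the image at their usual `/cvmfs` mount points. This is a minimal sketch, not taken from the note itself: the base image, source paths, release numbers, and package list are all hypothetical.

```
Bootstrap: docker
From: centos:7

%files
    # Software and database release trees previously extracted on a host
    # that has CVMFS mounted; source paths and the release version below
    # are illustrative placeholders, not the actual ATLAS layout.
    extracted/atlas-sw-release /cvmfs/atlas.cern.ch/repo/sw/software/21.0
    extracted/atlas-db-release /cvmfs/atlas-condb.cern.ch/repo

%post
    # Runtime system libraries the bundled software may need (illustrative).
    yum -y install glibc libX11 libXpm && yum clean all

%environment
    export ATLAS_LOCAL_ROOT_BASE=/cvmfs/atlas.cern.ch/repo/ATLASLocalRootBase
```

Because the `/cvmfs` content lives inside the image, a job could then be launched on an HPC node with a plain `singularity exec` (or the Shifter equivalent), with no CVMFS client or network mount required on the host.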



 Record created 2018-11-30, last modified 2018-12-01