Author(s)
| Calafiura, Paolo (LBL, Berkeley) ; De, Kaushik (Texas U., Arlington) ; Guan, Wen (Wisconsin U., Madison) ; Maeno, Tadashi (Brookhaven) ; Nilsson, Paul (Brookhaven) ; Oleynik, Danila (Texas U., Arlington) ; Panitkin, Sergey (Brookhaven) ; Tsulaia, Vakhtang (LBL, Berkeley) ; van Gemmeren, Peter (Argonne, HEP) ; Wenaus, Torre (Brookhaven) |
Abstract
| The ATLAS Event Service (ES) implements a new fine-grained approach to HEP event processing, designed to be agile and efficient in exploiting transient, short-lived resources such as HPC hole-filling, spot-market commercial clouds, and volunteer computing. Input and output control and data flows, bookkeeping, monitoring, and data storage are all managed at the event level in an implementation capable of supporting ATLAS-scale distributed processing throughputs (about 4M CPU-hours/day). Input data flows use remote data repositories with no data locality or prestaging requirements, minimizing the use of costly storage in favor of strongly leveraging powerful networks. Object stores provide a highly scalable means of remotely storing the quasi-continuous, fine-grained outputs that give ES-based applications a very light data footprint on a processing resource and ensure negligible losses should the resource suddenly vanish. We describe the motivations for the ES system, its unique features and capabilities, its architecture, the highly scalable tools and technologies employed in its implementation, and its applications in ATLAS processing on HPCs, commercial cloud resources, volunteer computing, and grid resources. |