ATLAS Slides
Report number ATL-SOFT-SLIDE-2016-657
Title Consolidation of Cloud Computing in ATLAS
Author(s) Taylor, Ryan P. (University of Victoria) ; Cordeiro, Cristovao (CERN) ; Di Girolamo, Alessandro (CERN) ; Hover, John (Brookhaven National Laboratory (BNL)) ; Kouba, Tomas (Academy of Sciences of the Czech Republic, Institute of Physics and Institute for Computer Science) ; Love, Peter (Lancaster University, Department of Physics) ; Mcnab, Andrew (School of Physics and Astronomy, University of Manchester) ; Schovancova, Jaroslava (The University of Texas at Arlington) ; Sobie, Randall (University of Victoria)
Corporate author(s) The ATLAS collaboration
Collaboration ATLAS Collaboration
Submitted to 22nd International Conference on Computing in High Energy and Nuclear Physics, CHEP 2016, San Francisco, USA, 10 - 14 Oct 2016
Submitted by [email protected] on 20 Sep 2016
Subject category Particle Physics - Experiment
Accelerator/Facility, Experiment CERN LHC ; ATLAS
Free keywords cloud computing ; contextualization ; monitoring ; benchmarking ; commercial cloud ; analytics
Abstract Throughout the first year of LHC Run 2, ATLAS Cloud Computing has undergone a period of consolidation, characterized by building upon previously established systems, with the aim of reducing operational effort, improving robustness, and reaching higher scale. This paper describes the current state of ATLAS Cloud Computing. Cloud activities are converging on a common contextualization approach for virtual machines, and cloud resources are sharing monitoring and service discovery components. We describe the integration of Vac resources, streamlined usage of the High Level Trigger cloud for simulation and reconstruction, extreme scaling on Amazon EC2, and procurement of commercial cloud capacity in Europe. Building on the previously established monitoring infrastructure, we have deployed a real-time monitoring and alerting platform which coalesces data from multiple sources, provides flexible visualization via customizable dashboards, issues alerts, and carries out corrective actions in response to problems. Finally, a versatile analytics platform for data mining of log files is being used to analyze benchmark data and to diagnose and gain insight into job errors.



 Record created 2016-09-20, last modified 2017-12-13