CERN Accelerating science

ATLAS Slides
Report number ATL-SOFT-SLIDE-2012-429
Title ATLAS Distributed Computing Automation
Author(s) Schovancova, J (Academy of Sciences of the Czech Republic) ; Barreiro Megino, F H (CERN) ; Borrego, C (Physics Department, Universidad Autonoma de Madrid) ; Campana, S (CERN) ; Di Girolamo, A (CERN) ; Elmsheuser, J (Fakultaet fuer Physik, Ludwig-Maximilians-Universitaet Muenchen) ; Hejbal, J (Academy of Sciences of the Czech Republic) ; Kouba, T (Academy of Sciences of the Czech Republic) ; Legger, F (Fakultaet fuer Physik, Ludwig-Maximilians-Universitaet Muenchen) ; Magradze, E (Georg-August-Universitat Goettingen, II. Physikalisches Institut) ; Medrano Llamas, R (CERN) ; Negri, G (CERN) ; Rinaldi, L (INFN Bologna and Universita' di Bologna, Dipartimento di Fisica) ; Sciacca, G (University of Bern, Albert Einstein Center for Fundamental Physics, Laboratory for High Energy Physics) ; Serfon, C (Fakultaet fuer Physik, Ludwig-Maximilians-Universitaet Muenchen) ; Van Der Ster, D C (CERN)
Corporate author(s) The ATLAS collaboration
Submitted to 5th International Conference "Distributed Computing and Grid-technologies in Science and Education", Dubna, Russian Federation, 16 - 20 Jul 2012
Submitted by [email protected] on 11 Jul 2012
Subject category Detectors and Experimental Techniques
Accelerator/Facility, Experiment CERN LHC ; ATLAS
Free keywords ATLAS Distributed Computing ; ADC ; automation ; functional test ; exclusion ; recovery
Abstract The ATLAS Experiment benefits from computing resources distributed worldwide at more than 100 WLCG sites. The ATLAS Grid sites provide over 100k CPU job slots and over 100 PB of storage space on disk or tape. Monitoring the status of such a complex infrastructure is essential. The ATLAS Grid infrastructure is monitored 24/7 by two teams of shifters distributed worldwide, by the ATLAS Distributed Computing experts, and by site administrators. In this paper we summarize the automation efforts undertaken within the ATLAS Distributed Computing team to reduce manpower costs and improve the reliability of the system. Different aspects of the automation process are described: from the ATLAS Grid site topology provided by the ATLAS Grid Information System, via automatic site testing by HammerCloud, to automatic exclusion of sites from production or analysis activities.



Record created 2012-07-11, last modified 2017-05-23