ATLAS Slides
Report number ATL-DAQ-SLIDE-2018-803
Title Integrated automation for configuration management and operations in the ATLAS online computing farm
Author(s) Amirkhanov, Artem (Budker Institute of Nuclear Physics, Siberian Branch of Russian Academy of Sciences) ; Ballestrero, Sergio (University of Johannesburg, Department of Mechanical Engineering Science) ; Brasolin, Franco (Universita e INFN, Bologna) ; Lee, Christopher Jon (University of Cape Town) ; Du Plessis, Haydn Dean (University of Johannesburg, Department of Mechanical Engineering Science) ; Mitrogeorgos, Konstantinos (Aristotle University of Thessaloniki) ; Pernigotti, Marco (European Laboratory for Particle Physics, CERN) ; Sanchez Pineda, Arturo Rodolfo (INFN Gruppo Collegato di Udine and Universita' di Udine, Dipartimento di Chimica, Fisica e Ambiente) ; Scannicchio, Diana (University of California, Irvine) ; Twomey, Matthew Shaun (Department of Physics, University of Washington, Seattle)
Corporate author(s) The ATLAS collaboration
Collaboration ATLAS Collaboration
Submitted to 23rd International Conference on Computing in High Energy and Nuclear Physics, CHEP 2018, Sofia, Bulgaria, 9 - 13 Jul 2018
Submitted by [email protected] on 01 Oct 2018
Subject category Particle Physics - Experiment
Accelerator/Facility, Experiment CERN LHC ; ATLAS
Free keywords Automation ; Monitoring ; Farm Management
Abstract The online farm of the ATLAS experiment at the LHC, consisting of nearly 4000 PCs with various characteristics, provides configuration and control of the detector and performs the collection, processing, selection, and conveyance of event data from the front-end electronics to mass storage. Different aspects of farm management are already accessible via several tools. The status and health of each host are monitored by a system based on Icinga2 and Ganglia. PuppetDB centrally gathers all status information from Puppet, the configuration management tool used to ensure the configuration consistency of every host. The in-house Configuration Database controls DHCP and PXE and also integrates external information sources. In this paper we present our roadmap for integrating these and other data sources and systems, and for building a higher level of abstraction on top of this foundation. An automation and orchestration tool will be able to use these systems to replace lengthy manual procedures, some of which also require interaction with other systems and teams, e.g. for the repair of a faulty host. Finally, an inventory and tracking system will complement the available data sources, keep track of host history, and improve the evaluation of long-term lifecycle management and purchase strategies.
Related document Conference Paper ATL-DAQ-PROC-2018-038



Record created 2018-10-01, last modified 2019-11-05


Full text:
Download full text: PDF
External link:
Download full text: Original Communication (restricted to ATLAS)