ATLAS Note
Report number ATL-DAQ-PROC-2013-003
Title ATLAS TDAQ System Administration: an overview and evolution
Author(s) LEE, CJ (Johannesburg U. ; CERN) ; BALLESTRERO, S (Johannesburg U.) ; BOGDANCHIKOV, A (Novosibirsk, IYF) ; BRASOLIN, F (INFN, Bologna) ; CONTESCU, AC (Bucharest U. ; CERN) ; DARLEA, G-L (Bucharest U.) ; KOROL, A (Novosibirsk, IYF) ; SCANNICCHIO, DA (UC, Irvine) ; TWOMEY, M (Washington U. ; CERN) ; VALSAN, ML (Bucharest U. ; CERN)
Corporate Author(s) The ATLAS collaboration
Publication 2013
Imprint 12 Apr 2013
Number of pages mult.
In: PoS ISGC2013 (2013) pp.006
In: International Symposium on Grids and Clouds 2013, Taipei, Taiwan, 17 - 22 Mar 2013, pp.006
Subject category Detectors and Experimental Techniques
Accelerator/Facility, Experiment CERN LHC ; ATLAS
Free keywords TDAQ ; SysAdmin ; ATLAS
Abstract The ATLAS Trigger and Data Acquisition (TDAQ) system is responsible for the online processing of live data streaming from the ATLAS experiment at the Large Hadron Collider (LHC) at CERN. The system processes the direct data readout from ~100 million channels on the detector through multiple trigger levels, selecting interesting events for analysis and reducing the data rate by a factor of $10^{7}$ with a latency of less than a few seconds. Most of the functionality is implemented on ~3000 servers composing the online farm. Due to the critical functionality of the system, a sophisticated computing environment is maintained, covering the online farm and ATLAS control rooms, as well as a number of development and testing labs. The specificity of the system required the development of dedicated applications (e.g. ConfDB, BWM) for system configuration and maintenance; in parallel, Open Source tools (Puppet and Quattor) are used to centrally configure the operating systems. The health monitoring of the TDAQ system hardware and OS performs ~60 thousand checks every 5 minutes; it is currently implemented over Nagios, and it is being complemented and replaced by Ganglia and Icinga. The online system adopted a sophisticated user management scheme, based on the Active Directory infrastructure and integrated with Access Manager, a dedicated Role Based Access Control (RBAC) tool. The RBAC and its underlying LDAP database control user rights from external access to the farm down to specific user actions. A web-based user interface allows delegated administrators to manage specific role assignments. The current activities of the SysAdmin group include the daily monitoring, troubleshooting and maintenance of the online system, storage and farm upgrades, and readying systems for an upgrade to Scientific Linux 6 with the related global integration, configuration, optimisation and necessary hardware updates.
In addition, during the 2013 shutdown the team will provide support for the use of a large fraction of the online farm for GEANT4 simulations of ATLAS.
Copyright/License Preprint: (License: CC-BY-4.0)

Corresponding record in: INSPIRE


Record created 2013-04-12, last modified 2018-05-29