Published Articles
Title Towards the integrated ALICE Online-Offline (O$^2$) monitoring subsystem
Author(s) Chibante Barroso, Vasco (CERN) ; Elia, Domenico (INFN, Bari) ; Grigoras, Costin (CERN) ; Gomez Ramirez, Andres (Goethe U., Frankfurt (main)) ; Vino, Gioacchino (INFN, Bari) ; Wegrzynek, Adam (CERN)
Publication 2019
Number of pages 8
In: EPJ Web Conf. 214 (2019) 03043
In: 23rd International Conference on Computing in High Energy and Nuclear Physics, CHEP 2018, Sofia, Bulgaria, 9 - 13 Jul 2018, pp.03043
DOI 10.1051/epjconf/201921403043
Subject category Computing and Computers ; Detectors and Experimental Techniques
Accelerator/Facility, Experiment CERN LHC ; ALICE
Abstract ALICE (A Large Ion Collider Experiment) is preparing for a major upgrade of the detector, readout and computing systems for LHC Run 3. A new facility called O$^2$ (Online-Offline) will play a major role in data compression and event processing. To efficiently operate the experiment, we are designing a monitoring subsystem, which will provide a complete overview of the overall health of O$^2$ and detect performance degradation and component failures. The monitoring subsystem will receive and collect up to 600 kHz of performance metrics. It consists of a custom monitoring library and server-side distributed software covering five main functional tasks: parameter collection, processing, storage, visualisation and alarms. To select the most appropriate tools for these tasks, we evaluated three options: “Modular Stack”, Zabbix and the currently used ALICE Grid monitoring tool called MonALISA. The first of these is a toolkit comprising collectd, Apache Flume, Apache Spark, InfluxDB, Grafana and Riemann. This paper describes the functional architecture of the monitoring subsystem. It goes through a complete evaluation of the three options considered, the selection process, the risk assessment and the justification for the final decision. The in-depth comparison covers functional features and throughput measurements to ensure the required processing and storage performance.
Copyright/License publication: © 2019-2025 The Authors (License: CC-BY-4.0)

Corresponding record in: Inspire


Record created 2019-11-14, last modified 2022-08-10

