
ATLAS Slides
Report number ATL-DAQ-SLIDE-2018-557
Title EVALUATING KUBERNETES AS AN ORCHESTRATOR OF THE HIGH LEVEL TRIGGER COMPUTING FARM OF THE TRIGGER AND DATA ACQUISITION SYSTEM OF THE ATLAS EXPERIMENT AT THE LARGE HADRON COLLIDER
Author(s) Avolio, Giuseppe (European Laboratory for Particle Physics, CERN) ; Cadeddu, Mattia (European Laboratory for Particle Physics, CERN) ; Hauser, Reiner (Michigan State University, Department of Physics and Astronomy)
Corporate author(s) The ATLAS collaboration
Collaboration ATLAS Collaboration
Submitted by [email protected] on 30 Jul 2018
Subject category Particle Physics - Experiment
Accelerator/Facility, Experiment CERN LHC ; ATLAS
Free keywords Online ; Kubernetes ; Cluster ; HLT ; Container ; Docker ; Orchestration
Abstract The ATLAS experiment at the LHC relies on a complex and distributed Trigger and Data Acquisition (TDAQ) system to gather and select particle collision data. The High Level Trigger (HLT) component of the TDAQ system is responsible for executing advanced selection algorithms, reducing the data rate to a level suitable for recording to permanent storage. The HLT functionality is provided by a computing farm made up of thousands of commodity servers, each executing one or more processes. Moving the HLT farm management towards a containerized solution is one of the main themes of the ATLAS TDAQ Phase-II upgrades in the area of the online software; it would open new possibilities for fault tolerance, reliability and scalability. This paper presents the results of an evaluation of Kubernetes as a possible orchestrator of the ATLAS TDAQ HLT computing farm. Kubernetes is a system for advanced management of containerized applications in large clusters. We will first highlight some of the technical solutions adopted to run the offline version of today’s HLT software in a Docker container. Then we will focus on some scaling performance measurements executed with a cluster of 1000 CPU cores. In particular, we will: - Show how Kubernetes scales in deploying containers as a function of the cluster size; - Show how proper tuning of the Queries Per Second (QPS) Kubernetes parameter set can improve the scaling of applications. Finally, we will conclude with an assessment of the feasibility of using Kubernetes as an orchestrator of the HLT computing farm in LHC’s Run IV.



 Record created 2018-07-30, last modified 2018-07-30

