Abstract
The ATLAS experiment at the LHC at CERN continuously evolves its Trigger and Data Acquisition (TDAQ) system to meet the challenges of new physics goals and technological advancements. As ATLAS prepares for the Phase-II Run 4 of the LHC, significant enhancements to the TDAQ Controls and Configuration (TDAQ-CC) tools have been designed to ensure efficient data collection, processing, and management. This abstract presents the evolution of the ATLAS TDAQ-CC system leading up to Phase-II Run 4. As part of this evolution, Kubernetes has been chosen to orchestrate the Event Filter (EF) farm. By leveraging Kubernetes, ATLAS can dynamically allocate computing resources, scale processing capacity in response to changing data-taking conditions, and ensure high availability of data processing services. The integration of Kubernetes with the TDAQ Run Control framework enables tight synchronisation between the experiment's data acquisition components and the computing infrastructure. We will discuss the architectural considerations and implementation challenges involved in integrating Kubernetes with the ATLAS TDAQ-CC system. We will highlight the benefits of using Kubernetes as an EF farm orchestrator, including improved resource utilization, enhanced fault tolerance, and simplified deployment and management of data processing workflows. In addition, we will report on extensive tests of Kubernetes conducted on a farm of 2500 servers within the experiment's data-taking environment, which demonstrate its scalability and robustness in handling the demands of the ATLAS TDAQ system for Phase-II. The adoption of Kubernetes represents a significant step forward in the evolution of the ATLAS TDAQ-CC system, aligning it with industry best practices in container orchestration.