Author(s)
| Ballestrero, S (University of Johannesburg, South Africa) ; Batraneanu, S M (University of California, Irvine, USA) ; Brasolin, F (Istituto Nazionale di Fisica Nucleare Sezione di Bologna, Italy) ; Contescu, C (CERN and Polytechnic University of Bucharest, Romania) ; Di Girolamo, A (CERN IT Experiment Support Group (CERN IT ES)) ; Lee, C J (CERN and University of Johannesburg, South Africa) ; Pozo Astigarraga, M E (CERN) ; Scannicchio, D A (University of California, Irvine, USA) ; Twomey, M S (University of Washington Department of Physics, USA) ; Zaytsev, A (Brookhaven National Laboratory (BNL), USA) |
Abstract
| With the LHC at CERN currently in Long Shutdown 1 (LS1), there is a remarkable opportunity to use the computing resources of the experiments' large trigger farms for other data processing activities. In the case of the ATLAS experiment, the TDAQ farm, consisting of more than 1500 compute nodes, is particularly suitable for running Monte Carlo production jobs, which are mostly CPU bound rather than I/O bound. This contribution gives a thorough review of all the stages of the Sim@P1 project, which is dedicated to the design and deployment of a virtualized platform running on the ATLAS TDAQ computing resources and to using it to run large groups of CernVM-based virtual machines operating as a single CERN-P1 WLCG site. This platform has been designed to avoid interference with TDAQ usage of the farm and to guarantee the security and the usability of the ATLAS private network; OpenStack has been chosen to provide a cloud management layer. The approaches to organizing support for the sustained operation of the system on both the infrastructural (hardware, virtualization platform) and logical (site support and job execution) levels are also discussed. The project is the result of a combined effort of the ATLAS TDAQ SysAdmins and NetAdmins teams, the CERN IT-SDC-OL Department, and the RHIC and ATLAS Computing Facility at BNL. The experience obtained while operating the Sim@P1 infrastructure over the last 3.5 months shows that the virtualized infrastructure deployed on top of the ATLAS HLT farm is capable of contributing to the ATLAS MC production at a level of computing power comparable to that of a large Tier-1 WLCG site, despite the opportunistic nature of the underlying computing resources. |