002754088 001__ 2754088
002754088 003__ SzGeCERN
002754088 005__ 20210316223451.0
002754088 0247_ $$2DOI$$9EDP Sciences$$a10.1051/epjconf/202024501028
002754088 0248_ $$aoai:inspirehep.net:1832012$$pcerncds:CERN$$pcerncds:CERN:FULLTEXT$$pcerncds:FULLTEXT$$qINSPIRE:HEP$$qForCDS
002754088 035__ $$9http://old.inspirehep.net/oai2d$$aoai:inspirehep.net:1832012$$d2021-03-09T10:11:07Z$$h2021-03-10T05:00:02Z$$mmarcxml
002754088 035__ $$9Inspire$$a1832012
002754088 041__ $$aeng
002754088 100__ $$aBadaro, Gilbert$$uAmerican U. of Beirut
002754088 245__ $$9EDP Sciences$$aDAQExpert the service to increase CMS data-taking efficiency
002754088 260__ $$c2020
002754088 300__ $$a7 p
002754088 520__ $$9EDP Sciences$$aThe Data Acquisition (DAQ) system of the Compact Muon Solenoid (CMS) experiment at the LHC is a complex system responsible for the data readout, event building and recording of accepted events. Its proper functioning plays a critical role in the data-taking efficiency of the CMS experiment. In order to ensure high availability and recover promptly in the event of hardware or software failure of the subsystems, an expert system, the DAQ Expert, has been developed. It aims at improving the data-taking efficiency, reducing human error in operations and minimising the demand on on-call experts. Introduced at the beginning of 2017, it assists the shift crew and the system experts in recovering from operational faults, streamlining the post-mortem analysis and, at the end of Run 2, triggering fully automatic recovery without human intervention. The DAQ Expert analyses the real-time monitoring data originating from the DAQ components and the high-level trigger, updated every few seconds. It pinpoints data-flow problems and recovers from them automatically or after operator approval. We analyse the CMS downtime in the 2018 run, focusing on what was improved by the introduction of automated recovery, and present the challenges and design of encoding the expert knowledge into automated recovery jobs. Furthermore, we demonstrate the web-based ReactJS interfaces that ensure effective cooperation between the human operators in the control room and the automated recovery system. We report on the operational experience with automated recovery.
002754088 540__ $$aCC-BY-4.0$$bEDP Sciences$$uhttps://creativecommons.org/licenses/by/4.0/
002754088 542__ $$dThe Authors$$g2020
002754088 65017 $$2SzGeCERN$$aComputing and Computers
002754088 65017 $$2SzGeCERN$$aDetectors and Experimental Techniques
002754088 693__ $$aCERN LHC$$eCMS
002754088 690C_ $$aCERN
002754088 690C_ $$aARTICLE
002754088 700__ $$aBehrens, Ulf$$uRice U.
002754088 700__ $$aBranson, James$$uUC, San Diego
002754088 700__ $$aBrummer, Philipp$$uCERN$$uKIT, Karlsruhe$$vKarlsruhe Institute of Technology, Karlsruhe, Germany
002754088 700__ $$aCittolin, Sergio$$uUC, San Diego
002754088 700__ $$aDa Silva-Gomes, Diego$$uFermilab$$uCERN$$vCERN, Geneva, Switzerland
002754088 700__ $$aDarlea, Georgiana-Lavinia$$uMIT
002754088 700__ $$aDeldicque, Christian$$uCERN
002754088 700__ $$aDobson, Marc$$uCERN
002754088 700__ $$aDoualot, Nicolas$$uFermilab$$uCERN$$vCERN, Geneva, Switzerland
002754088 700__ $$aFulcher, Jonathan Richard$$uCERN
002754088 700__ $$aGigi, Dominique$$uCERN
002754088 700__ $$aGladki, [email protected]$$uCERN
002754088 700__ $$aGlege, Frank$$uCERN
002754088 700__ $$aGolubovic, Dejan$$uCERN
002754088 700__ $$aGomez-Ceballos, Guillelmo$$uMIT
002754088 700__ $$aHegeman, Jeroen$$uCERN
002754088 700__ $$aJames, Thomas Owen$$uCERN
002754088 700__ $$aLi, Wei$$uRice U.
002754088 700__ $$aMecionis, Audrius$$uFermilab$$uVilnius U.$$vVilnius University, Vilnius, Lithuania
002754088 700__ $$aMeijers, Frans$$uCERN
002754088 700__ $$aMeschi, Emilio$$uCERN
002754088 700__ $$aMommsen, Remigius K$$uFermilab
002754088 700__ $$aMor, Keyshav$$uCERN
002754088 700__ $$aMorovic, Srecko$$uUC, San Diego
002754088 700__ $$aOrsini, Luciano$$uCERN
002754088 700__ $$aPapakrivopoulos, Ioannis$$uNatl. Tech. U., Athens$$uCERN$$vCERN, Geneva, Switzerland
002754088 700__ $$aPaus, Christoph$$uMIT
002754088 700__ $$aPetrucci, Andrea$$uUC, San Diego
002754088 700__ $$aPieri, Marco$$uUC, San Diego
002754088 700__ $$aRabady, Dinyar$$uCERN
002754088 700__ $$aRaychino, Kolyo$$uCERN
002754088 700__ $$aRacz, Attila$$uCERN
002754088 700__ $$aRodriguez-Garcia, Alvaro$$uCERN
002754088 700__ $$aSakulin, Hannes$$uCERN
002754088 700__ $$aSchwick, Christoph$$uCERN
002754088 700__ $$aSimelevicius, Dainius$$uVilnius U.$$uCERN$$vCERN, Geneva, Switzerland
002754088 700__ $$aSoursos, Panagiotis$$uCERN
002754088 700__ $$aStahl, Andre$$uRice U.
002754088 700__ $$aStankevicius, Mantas$$uFermilab$$uVilnius U.$$vVilnius University, Vilnius, Lithuania
002754088 700__ $$aSuthakar, Uthayanath$$uCERN
002754088 700__ $$aVazquez-Velez, Cristina$$uCERN
002754088 700__ $$aZahid, Awais$$uCERN
002754088 700__ $$aZejdl, Petr$$uFermilab$$uCERN$$vCERN, Geneva, Switzerland
002754088 773__ $$01830716$$c01028$$pEPJ Web Conf.$$v245$$wC19-11-04$$y2020
002754088 8564_ $$82281646$$s636996$$uhttp://cds.cern.ch/record/2754088/files/10.1051_epjconf_202024501028.pdf$$yFulltext from publisher
002754088 8564_ $$82281646$$s12325$$uhttp://cds.cern.ch/record/2754088/files/10.1051_epjconf_202024501028.gif?subformat=icon$$xicon$$yFulltext from publisher
002754088 8564_ $$82281646$$s126705$$uhttp://cds.cern.ch/record/2754088/files/10.1051_epjconf_202024501028.jpg?subformat=icon-700$$xicon-700$$yFulltext from publisher
002754088 8564_ $$82281646$$s17600$$uhttp://cds.cern.ch/record/2754088/files/10.1051_epjconf_202024501028.jpg?subformat=icon-180$$xicon-180$$yFulltext from publisher
002754088 960__ $$a13
002754088 962__ $$b2684706$$k01028$$nadelaide20191104
002754088 980__ $$aARTICLE
002754088 980__ $$aConferencePaper