

08 May 2017

# Electronics for CMS Endcap Muon Level-1 Trigger System Phase-1 and HL LHC Upgrades Summary

Alexander Madorsky for the CMS Collaboration

#### Abstract

To accommodate high-luminosity LHC operation at 13 TeV collision energy, the CMS Endcap Muon Level-1 Trigger system had to be significantly modified. To provide the best track reconstruction, the trigger system must now import all available trigger primitives generated by Cathode Strip Chambers and by certain other subsystems, such as Resistive Plate Chambers (RPC). In addition to massive input bandwidth, this also required a significant increase in logic and memory resources. To satisfy these requirements, a new Sector Processor unit has been designed. It consists of three modules. The Core Logic module houses the large FPGA that contains the track-finding logic and multi-gigabit serial links for data exchange. The Optical module contains optical receivers and transmitters; it communicates with the Core Logic module via a custom backplane section. The Pt Lookup Table (PTLUT) module contains 1 GB of low-latency memory that is used to assign the final Pt to reconstructed muon tracks. The µTCA architecture (adopted by CMS) was used for this design. The talk presents the details of the hardware and firmware design of the production system based on the Xilinx Virtex-7 FPGA family. The next round of LHC and CMS upgrades starts in 2019, followed by a major High-Luminosity (HL) LHC upgrade starting in 2024. In the course of these upgrades, the new Gas Electron Multiplier (GEM) detector and more RPC chambers will be added to the Endcap Muon system. In order to keep up with all these changes, a new Advanced Processor unit is being designed. This device will be based on Xilinx UltraScale+ FPGAs. It will be able to accommodate up to 100 serial links with bit rates of up to 25 Gb/s, and provide up to 2.5 times more logic resources than the device used currently. The amount of PTLUT memory will be significantly increased to provide more flexibility for the Pt assignment algorithm. The talk presents preliminary details of the hardware design program.

Presented at *Instr17 Instrumentation for Colliding Beam Physics*

# **Electronics for CMS Endcap Muon Level-1 Trigger System Phase-1 and HL LHC Upgrades**

#### **Alexander Madorsky on behalf of the CMS Collaboration**

*E-mail*: [madorsky@phys.ufl.edu](mailto:madorsky@phys.ufl.edu)

ABSTRACT: To accommodate high-luminosity LHC operation at a 13 TeV collision energy, the CMS Endcap Muon Level-1 Trigger system had to be significantly modified. To provide robust track reconstruction, the trigger system must now import all available trigger primitives generated by the Cathode Strip Chambers and by certain other subsystems, such as Resistive Plate Chambers (RPC). In addition to massive input bandwidth, this also required a significant increase in logic and memory resources. To satisfy these requirements, a new Sector Processor unit has been designed. It consists of three modules. The Core Logic module houses the large FPGA that contains the track-finding logic and multi-gigabit serial links for data exchange. The Optical module contains optical receivers and transmitters; it communicates with the Core Logic module via a custom backplane section. The Pt Lookup Table (PTLUT) module contains 1 GB of low-latency memory that is used to assign the final Pt to reconstructed muon tracks. The µTCA architecture (adopted by CMS) was used for this design. The talk presents the details of the hardware and firmware design of the production system based on the Xilinx Virtex-7 FPGA family. The next round of LHC and CMS upgrades starts in 2019, followed by a major High-Luminosity (HL) LHC upgrade starting in 2024. In the course of these upgrades, new Gas Electron Multiplier (GEM) detectors and more RPC chambers will be added to the Endcap Muon system. In order to keep up with all these changes, a new Advanced Processor unit is being designed. This device will be based on Xilinx UltraScale+ FPGAs. It will be able to accommodate up to 100 serial links with bit rates of up to 25 Gb/s, and provide up to 2.5 times more logic resources than the device used currently. The amount of PTLUT memory will be significantly increased to provide more flexibility for the Pt assignment algorithm. The talk presents preliminary details of the hardware design program.

KEYWORDS: CMS, Endcap, Muon, Trigger, FPGA, Track Finder, micro-TCA




# **1. Upgrade goals for Long Shutdown 1 (LS1), 2013-2014**

 The following sections cover the most important improvements introduced in order for the Muon Endcap trigger system to function properly after the LHC and CMS emerged from Long Shutdown 1 (LS1).

#### **1.1 Muon Level 1 Trigger system segmentation**

Before LS1, the Muon Level 1 trigger was separated into two subsystems: the Drift Tube Track Finder (DTTF) processed data from the barrel part of the Muon detector, and the Cathode Strip Chamber Track Finder (CSCTF) processed data from the endcap part. This led to a triplication of effort in the overlap area between the barrel and endcap parts, because the data from that area were processed by the DTTF, the CSCTF, and also the RPC Pattern Comparator (PAC).

After the LS1 upgrade, the Muon Level 1 trigger system consists of three parts: the Barrel, Overlap, and Endcap Muon Track Finders (BMTF, OMTF, and EMTF). The segmentation before and after LS1 is shown in [Table 1](#page--1-0). The overlap area has been separated from the barrel and endcap, and is now processed by its own track finder.

#### **1.2 Trigger primitive import**

The legacy CSCTF system filtered the trigger primitives generated by each 60° azimuthal Muon Endcap sector. Only up to 15 of the better-reconstructed primitives, out of a possible 90, were sent to the Sector Processor boards. This reduced the efficiency for events with multiple muons in a small geometrical region and would lead to inefficiency in high pile-up conditions [\[2\]](#page--1-1). To solve that problem, EMTF imports all trigger primitives. Additionally, data from another regional subsystem, the Resistive Plate Chambers (RPC), are also imported into EMTF.

**Table 1. Muon Level 1 Trigger system segmentation before and after LS1**

#### **1.3 Sector Overlap processing**

The Muon Endcap Level-1 trigger system is separated into $60^\circ$ azimuthal sectors. In the legacy CSCTF system, trigger primitives from each sector were delivered only to the corresponding Sector Processor. This approach led to inefficient track detection at the edges of each sector in cases where some of the track segments fell into chambers in the neighboring sector. The upgraded system removes this inefficiency by sharing trigger primitives between two neighboring sectors in the sector overlap area.
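The neighbor-sharing idea can be illustrated with a minimal Python sketch. It is not the EMTF implementation: the overlap width `OVERLAP_DEG`, the choice of 0° as the start of sector 0, and the function name `sectors_for_phi` are all assumptions made for illustration. Only the $60^\circ$ sector width comes from the text.

```python
SECTOR_WIDTH_DEG = 60.0   # from the text: 60-degree azimuthal sectors
N_SECTORS = 6             # 360 / 60 sectors per endcap
OVERLAP_DEG = 10.0        # hypothetical overlap width, for illustration only


def sectors_for_phi(phi_deg: float) -> list[int]:
    """Return the sector(s) that should receive a primitive at azimuth phi_deg.

    The first entry is the home sector; a second entry is added when the
    primitive lies within OVERLAP_DEG of a sector boundary, so that the
    neighboring Sector Processor also sees it.
    """
    phi = phi_deg % 360.0
    home = int(phi // SECTOR_WIDTH_DEG) % N_SECTORS
    offset = phi % SECTOR_WIDTH_DEG          # position inside the home sector
    targets = [home]
    if offset < OVERLAP_DEG:
        targets.append((home - 1) % N_SECTORS)   # near the lower edge
    elif offset >= SECTOR_WIDTH_DEG - OVERLAP_DEG:
        targets.append((home + 1) % N_SECTORS)   # near the upper edge
    return targets
```

For example, `sectors_for_phi(30.0)` stays in its home sector, while a primitive close to a boundary is delivered to two sectors, so a track crossing the boundary can be reconstructed from a complete set of segments.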

#### **1.4 Transverse Momentum assignment**

A flexible and powerful way to assign transverse momentum ($p_T$) to the muons that have been identified by the ME trigger system is to use a Look-up Table (LUT). This approach allows for complete algorithmic flexibility as well as fixed latency, which is independent of the algorithm used. A proper $p_T$ assignment in the upgraded system is more complex. It requires implementing an LUT with a significantly bigger address space, so that additional variables can be used to improve the $p_T$ assignment resolution.

# **2. Modular Track Finder 7 (MTF7) production hardware**

CMS has adopted the µTCA hardware platform [\[3\]](#page--1-2) as a standard for new equipment development. MTF7 has been constructed using that standard. MTF7 hardware was also used to build the upgraded OMTF system [\[4\]](#page--1-3). The sections below describe the modules of the MTF7 design.

#### **2.1 Core Logic module**

The Core Logic module contains the large FPGA for trigger data processing. The production hardware is based on the Virtex-7 family of Xilinx FPGAs [\[5\]](#page--1-4). In addition, the Core Logic module contains a smaller control FPGA, a Module Management Controller (MMC), configuration memory for both FPGAs, power supplies, and clock management circuitry. The module is able to receive trigger data via 80 GTH<sup>1</sup> links (up to 10 Gb/s each) and 4 GTX<sup>1</sup> links (up to 6.6 Gb/s each, connected to the control FPGA). It can output trigger decisions and other data using 24 GTH (10 Gb/s) links. PCI Express (PCIe) was selected as the main control interface for the upgraded Sector Processor design. This choice is dictated by the bandwidth requirements, specifically those of downloading the PTLUT memory contents. Each module is provided with direct access to the host computer memory.
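As a rough indication of scale, the link counts quoted above give the following peak raw line rates. This is a back-of-the-envelope estimate that assumes every link runs at its maximum rate; the usable payload is lower because of line-encoding and protocol overhead, which is not quantified here.

$$
B_{\text{in}} = 80 \times 10\ \text{Gb/s} + 4 \times 6.6\ \text{Gb/s} = 826.4\ \text{Gb/s}, \qquad
B_{\text{out}} = 24 \times 10\ \text{Gb/s} = 240\ \text{Gb/s}
$$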

#### **2.2 Optical module**

The Optical module contains seven 12-channel optical receivers, two 12-channel transmitters, MMC and control circuitry, and power supplies. All optical components are 10 Gb/s parts manufactured by Avago [\[6\]](#page--1-5). The optical receivers and transmitters are connected to FPGAs on the Core Logic module via a short custom backplane section.

<sup>1</sup> GTH and GTX are two different types of multi-gigabit serial links available in Xilinx FPGAs [\[5\]](#page--1-4).

#### **2.3 PTLUT module**

The PTLUT module is implemented as a mezzanine card that sits on top of the Core Logic module. It contains 1 GB of Reduced Latency DRAM (RLDRAM) memory, manufactured by Micron [\[7\]](#page--1-6). This type of memory, while retaining all the advantages of DRAM (large capacity, low power consumption, low price), has been specifically designed to reduce latency for random address accesses. The address bit count usable for the $p_T$ assignment is 30.
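A 30-bit address gives $2^{30} = 1\,073\,741\,824$ table entries. The following Python sketch shows how several track variables could be packed into such an address, so that the $p_T$ assignment reduces to a single memory read with a latency that does not depend on the algorithm used to fill the table. The field names and widths in `FIELDS` are hypothetical and chosen only so that they sum to 30 bits; the actual EMTF address format is not described in this text.

```python
ADDRESS_BITS = 30  # from the text: 30 address bits usable for pT assignment

# Hypothetical field layout (name, width in bits), most significant first.
# These names and widths are illustrative assumptions, not the EMTF format.
FIELDS = [
    ("dphi_12", 9),
    ("dphi_23", 9),
    ("dtheta", 6),
    ("mode", 4),
    ("sign", 2),
]
assert sum(width for _, width in FIELDS) == ADDRESS_BITS


def pack_address(values: dict[str, int]) -> int:
    """Concatenate the field values into a single LUT address."""
    address = 0
    for name, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        address = (address << width) | value
    return address


def assign_pt(lut, values: dict[str, int]) -> int:
    """Look up the pT code: one indexed read, whatever algorithm filled the LUT."""
    return lut[pack_address(values)]
```

Changing the $p_T$ assignment algorithm then only means regenerating and downloading the table contents; the lookup logic itself stays the same.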

#### **2.4 Optical plant**

 The trigger primitives are transmitted from Cathode Strip Chambers (CSCs) using 3.2 Gb/s optical links. Most of these primitives must be shared between EMTF and/or OMTF Sector Processors (2-way sharing), and some of them are shared between two EMTF and two OMTF processors, to account for sector overlap (4-way sharing).

 The data sharing is implemented using passive optical splitters. The splitter and fiber network for each of the 12 Endcap Muon sectors is implemented in a 1U rack mount patch panel; there are 12 such panels in the entire system.
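Passive splitting costs optical power. As a general estimate that is not taken from the measurements of this system, an ideal splitter that divides the light equally among $N$ outputs attenuates each output by

$$
L_{\text{split}} = 10 \log_{10} N\ \text{dB},
$$

which is about 3 dB for 2-way sharing and about 6 dB for 4-way sharing. Real splitters add some excess loss on top of this, so the optical power budget of each link has to accommodate at least these values.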

#### **2.5 Firmware**

The EMTF Sector Processor firmware was significantly reworked relative to the legacy CSCTF system. It is able to import and process up to 108 trigger primitives from CSCs per Bunch Crossing (BX), and up to 84 RPC primitives per BX. By contrast, the legacy CSCTF system processed only up to 15 CSC primitives per BX and did not use the RPC primitives. The EMTF input bandwidth is therefore nearly 13 times that of CSCTF.
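The factor of nearly 13 is consistent with the primitive counts quoted above, assuming the comparison is made in primitives per BX:

$$
\frac{108 + 84}{15} = \frac{192}{15} = 12.8
$$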

The processing starts with a search for the straightest paths in the φ direction using pre-defined patterns, with CSC primitives only. The 12 best φ patterns are selected on each BX. Next, input primitives are matched to patterns. The RPC data are used to complement missing CSC primitives. The path in the $\theta$ direction is verified to be straight, and out-of-line primitives are removed. In the last step, $\Delta\varphi$ and $\Delta\theta$ between primitives are calculated, and finally the best three tracks are selected and fed to the $p_T$-assignment LUT.
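The last two steps can be sketched in Python as follows. This is a simplified software illustration, not the firmware: the `Primitive` and `Track` types, the integer coordinate units, and the use of a single `quality` score for ranking are assumptions, since the text states only that differences between primitives are calculated and that the best three tracks are selected.

```python
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Primitive:
    station: int  # hypothetical station index
    phi: int      # integer phi coordinate, arbitrary units
    theta: int    # integer theta coordinate, arbitrary units


@dataclass(frozen=True)
class Track:
    primitives: tuple[Primitive, ...]
    quality: int  # hypothetical ranking score (higher is better)


def deltas(track: Track) -> dict[tuple[int, int], tuple[int, int]]:
    """Return (delta_phi, delta_theta) for every pair of stations in the track."""
    prims = sorted(track.primitives, key=lambda p: p.station)
    return {
        (a.station, b.station): (b.phi - a.phi, b.theta - a.theta)
        for a, b in combinations(prims, 2)
    }


def best_tracks(tracks: list[Track], n: int = 3) -> list[Track]:
    """Keep the n highest-quality tracks; these would go on to the pT LUT."""
    return sorted(tracks, key=lambda t: t.quality, reverse=True)[:n]
```

In this picture, the differences returned by `deltas` are the kind of variables that would be combined into the LUT address for each of the tracks returned by `best_tracks`.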

# **3. Performance**

The performance plots ([Figure 2](#page--1-7)) [\[8\]](#page--1-8) show the L1 muon trigger efficiency measured using a "tag-and-probe" method on a dimuon sample. The efficiency is shown as a function of the reconstructed probe $p_T$ for a threshold of 22 GeV, and as a function of pseudorapidity ($\eta$) for the same threshold when the reconstructed probe muon has $p_T > 40$ GeV. The endcap region covers $|\eta| > 1.2$.
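In a tag-and-probe measurement, the efficiency in each bin is the fraction of probe muons that also fired the trigger. The Python sketch below shows that ratio with a simple normal-approximation binomial uncertainty. The function name and the uncertainty formula are illustrative assumptions; the analysis behind the plots may use a different interval, and no counts from the measurement are reproduced here.

```python
import math


def efficiency(n_pass: int, n_total: int) -> tuple[float, float]:
    """Return (efficiency, uncertainty) for n_pass matched probes out of n_total.

    The uncertainty uses the normal approximation to the binomial,
    sqrt(eff * (1 - eff) / n_total), which is only a rough guide when
    the efficiency is close to 0 or 1 or when n_total is small.
    """
    if n_total <= 0:
        raise ValueError("n_total must be positive")
    if not 0 <= n_pass <= n_total:
        raise ValueError("n_pass must be between 0 and n_total")
    eff = n_pass / n_total
    err = math.sqrt(eff * (1.0 - eff) / n_total)
    return eff, err
```

Applying this per bin of probe $p_T$ or $\eta$ gives an efficiency curve of the kind shown in the performance plots.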

# **4. Advanced Processor (AP) for HL LHC (Phase 2) upgrades**

We are working on the design of a prototype processing board for HL LHC trigger upgrades in collaboration with the University of Wisconsin-Madison. The goal of this project is to design a general-purpose FPGA processing board with expansion capabilities that could be suitable for use by multiple systems, not only the trigger.

CMS is evaluating the ATCA [\[9\]](#page--1-9) telecom standard as a replacement for µTCA. We plan to build the prototype in that standard. The following sections describe the main features of the AP design.

#### **4.1 FPGA selection**

The AP design targets the Xilinx UltraScale (US) and UltraScale+ (US+) FPGA families. Both US and US+ use the same lineup of packages, with multiple devices available in each package. A board designed for a particular package will give the end user the option of selecting the FPGA that represents an optimal balance of price and resources. Currently two packages are being evaluated: B2104 and C2104.

#### **4.2 Multi-gigabit optical links**

The goal is to provide up to 100 optical I/O connections, capable of multiple data rates. The exact connection count is determined by the FPGA that the end user selects. The supported bit rates are 1–14 Gb/s in the base range and 25–28 Gb/s in the extended range. We are currently considering the FireFly family [\[10\]](#page--1-10) of optical links as a viable optical component candidate. The advantages of FireFly are small size, multiple available data rates (1–14 and 25–28 Gb/s), and the availability of connector-compatible high-speed copper cables. Initial tests with the 14 Gb/s version have yielded very good results.

#### **4.3 Mezzanine cards (expansion capabilities)**

The AP will provide connectors and space for one or possibly two mezzanine cards for user-designed expansion devices. The connectors will be attached directly to regular I/O banks of the main FPGA, using nearly all available I/O pins. The mezzanine cards may contain any devices that the end user requires, such as external memory banks, supplemental processors, exotic I/O, associative memory, etc.

#### **4.4 Rear Transition Module (RTM) support**

The AP will provide RTM support according to the ATCA specification. A certain number of serial links will be routed to the RTM, to provide the end user with more flexibility, such as legacy optical connection support. Molex's Impel connectors [\[11\]](#page--1-11) have been evaluated for the transition from the main board to the RTM, and have demonstrated very good performance at a 14 Gb/s bit rate.

#### **4.5 Embedded Linux control platform**

The AP control interface will be based on the Zynq-7000 family of devices, running embedded Linux. The control module is being designed as a small mezzanine card in an expanded COMExpress Mini [\[12\]](#page--1-12) form factor.

#### **Acknowledgments**

 We would like to thank J. Mendez (CERN) for providing firmware, software, and support for the MMC design. We also wish to thank the Boston University Electronic Design Facility for providing AMC13 boards, and support for them. We thank the RPC team (University of Warsaw) for their helpful suggestions.



**Figure 1: MTF7 production hardware**




<sup>2</sup> The efficiency drop seen at high $p_T$, in the plot on the right, was traced back to a misconfiguration of the system, which was corrected in August 2016.