# Design and test performance of the ATLAS Feature Extractor trigger boards for the Phase-1 Upgrade

Weiming Qian<sup>\*</sup>

- STFC Rutherford Appleton Laboratory,
- Chilton, Didcot, Oxon, OX11 0QX, United Kingdom
- *E-mail*: Weiming.Qian@stfc.ac.uk

ABSTRACT: In Run 3, the ATLAS Level-1 Calorimeter Trigger will be augmented by an Electron Feature Extractor (eFEX), to identify isolated  $e/\gamma$  and  $\tau$  particles, and a Jet Feature Extractor (jFEX), to identify energetic jets and calculate various local energy sums. Each module accommodates more than 450 differential signals that can operate at up to 12.8 Gb/s, some of which are routed over 30 cm between FPGAs. Presented here are the module designs, the processes that have been adopted to meet the challenges associated with multi-Gb/s PCB design, and the results of tests that characterize the performance of these modules.

KEYWORDS: Trigger; Feature Extractor; ATCA; PCB simulation.

<sup>\*</sup>On behalf of ATLAS Collaboration

| Contents                                                               |
|------------------------------------------------------------------------|
| 1. Introduction                                                        |
| 1.1 ATLAS Level-1 Calorimeter Trigger architecture for Phase-I Upgrade |
| 1.2 Trigger algorithms and performance                                 |
| 2. Prototype design                                                    |
| 2.1 eFEX                                                               |
| 2.1.1 Processing Area                                                  |
| 2.1.2 eFEX prototype                                                   |
| 2.2 jFEX                                                               |
| 2.2.1 Processing area                                                  |
| 2.2.2 jFEX prototype                                                   |
| 2.3 PCB design method                                                  |
| 3. Prototype test                                                      |
| 3.1 Link speed test results                                            |
| 3.2 Link speed decision                                                |
| 4. Conclusion                                                          |

# **1. Introduction**

In Run 3 (starting in 2021), the LHC [1] luminosity will double (to $\sim 2.5 \times 10^{34}\ \mathrm{cm^{-2}\,s^{-1}}$), which will greatly increase the pileup rate. However, the ATLAS [2] front-end detector electronics will remain largely unchanged. Hence the total ATLAS Level-1 Trigger [3] rate will still be limited by the readout bandwidth of the front-end electronics to 100 kHz or less. Moreover, the Level-1 Trigger must retain sensitivity to electroweak physics processes and stay within the current ATLAS Level-1 latency envelope of $2.5\ \mu s$. To meet these challenges, the Phase-I Upgrade [4] to the ATLAS Level-1 Trigger system is needed.

## **1.1 ATLAS Level-1 Calorimeter Trigger architecture for Phase-I Upgrade**

Figure 1 shows the architecture of the Phase-I Upgrade of the ATLAS Level-1 Calorimeter Trigger (L1Calo) [5]. The current L1Calo system is augmented by three additional feature-identification subsystems:

- the electromagnetic Feature Extractor (eFEX), comprising eFEX modules and Hub modules with Readout Driver (ROD) daughter cards, which identifies isolated e/γ and τ candidates, using data of finer granularity than is currently available to L1Calo;
- the jet Feature Extractor (jFEX), comprising jFEX modules and Hub modules with ROD daughter cards, which identifies energetic jets and computes various local energy sums, using data of finer granularity than that available to the current L1Calo JEP subsystem;
- the global Feature Extractor (gFEX [6]), comprising one gFEX module, which identifies calorimeter trigger features requiring the complete calorimeter data.




Figure 1. ATLAS Level-1 Calorimeter Trigger Phase-I Upgrade [7]

In addition to these, the Phase-I upgrade of L1Calo includes the Tile Rear Extension (TREX) to the Pre-Processor (PPr) subsystem, which digitizes Tile data and transmits them to the FEXs optically, the Fibre Optical Exchange (FOX), and the FEX Test Module (FTM), which facilitates the testing of FEX modules before system-level commissioning.

Apart from the small number of PPr modules that digitize Tile data for the FEXs, the current L1Calo system, comprising the Pre-Processor, Jet Energy Processor (JEP) and Cluster Processor (CP), will be decommissioned after the Phase-I Upgrade is fully commissioned.

## **1.2 Trigger algorithms and performance**

In the current L1Calo system, the CP processes data from the calorimeters and identifies energy deposits characteristic of isolated  $e/\gamma$  and  $\tau$  particles, using Trigger Towers of typical granularity of  $0.1 \times 0.1$  ( $\eta \times \phi$ ). The eFEX performs this same function using higher granularity data from the Liquid Argon (LAr) electromagnetic calorimeter. For each LAr Trigger Tower, the eFEX receives data from 10 'supercells' in four layers, as shown in Figure 2.





Figure 2. eFEX trigger algorithms [8]

This makes it possible to increase the discriminatory power of L1Calo by running a collection of new trigger algorithms, including  $R_{\eta}, f_3$  and  $R_{Had}$ , that analyse shower shapes. These algorithms run in a window of  $0.3 \times 0.3$  ( $\eta \times \phi$ ) that slides by 0.1 in both  $\eta$  and  $\phi$  (such that neighbouring instances of the window overlap).
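The sliding-window geometry described above can be illustrated with a minimal Python sketch. It is not the eFEX firmware algorithm: the shower-shape quantities themselves are not reproduced, and the function name `sliding_window_sums` and the toy grid are hypothetical. The sketch only shows a 3×3-tower window (0.3×0.3 in $\eta \times \phi$) stepping by one tower (0.1), wrapping around in $\phi$ and stopping at the $\eta$ edges.

```python
def sliding_window_sums(towers, size=3):
    """Sum tower energies in every size x size window, stepping one tower at a time.

    towers[i_eta][i_phi] holds one energy per 0.1 x 0.1 trigger tower.
    phi is periodic, so windows wrap around in phi; eta is not, so windows
    that would extend past the eta edges are skipped.
    """
    n_eta = len(towers)
    n_phi = len(towers[0])
    sums = {}
    for i_eta in range(n_eta - size + 1):
        for i_phi in range(n_phi):
            total = 0
            for d_eta in range(size):
                for d_phi in range(size):
                    total += towers[i_eta + d_eta][(i_phi + d_phi) % n_phi]
            sums[(i_eta, i_phi)] = total
    return sums


# Toy 4 (eta) x 4 (phi) grid with a single 10-unit deposit (hypothetical data).
grid = [[0] * 4 for _ in range(4)]
grid[1][0] = 10
window_sums = sliding_window_sums(grid)
```

Because neighbouring windows overlap, the single deposit contributes to several window sums. This overlap is also why data must be shared between the processing units that host adjacent windows.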

Figure 3 shows the results of a simulation comparing the performance of the current (Run 2) algorithms with the eFEX (Phase-I) algorithms. It shows that the eFEX can reduce the EM trigger rate by a factor of ~3, or allow the trigger threshold to be lowered by ~7 GeV at the 20 kHz reference point. Further optimization of the eFEX algorithms is under study.



The jFEX identifies jets, and calculates  $\sum E_T$  and  $E_T^{miss}$ . In the current system, these functions are implemented by the JEP. The jFEX improves on the performance of the JEP by a number of means. It receives higher-granularity calorimeter data (0.1×0.1 ( $\eta \times \phi$ ) rather than 0.2×0.2) and implements a Gaussian-weighted filter, giving it greater discriminatory power; it can implement a larger algorithm window; and each jFEX module processes data from a complete ring of the calorimeter in  $\phi$ , enabling in-time pileup suppression and improving the calculation of  $\sum E_T$  and  $E_T^{miss}$ .

Figure 4 shows the results of simulations comparing the performance of the algorithms that can be implemented on the jFEX with those currently run on the JEP. The turn-on curves of the jFEX are sharper, a fact that can be used to raise the trigger thresholds without losing efficiency, leading to rate reductions similar to those of the eFEX.

Figure 4. jFEX trigger performance [10]

# **2. Prototype design**

## **2.1 eFEX**

### **2.1.1 Processing Area**

The eFEX subsystem processes data from the calorimeters within the region of $-2.5 \le \eta \le 2.5$ and $0 \le \phi \le 2\pi$, a total volume of ~14 Tb/s. Given the limits of current technology, it is impossible to receive this into a single module. Due to the overlapping nature of the eFEX algorithm windows, partitioning the subsystem into multiple modules means a substantial volume of calorimeter data must be duplicated and/or shared between modules (and between FPGAs on the modules). This partitioning needs to balance the total number of modules, the number of FPGAs per module, the fibre count per module, the complexity of fibre mapping between the calorimeters and the eFEX, and the difficulty of sharing data between adjacent modules and between FPGAs. Figure 5 shows the partitioning of the eFEX prototype design. The middle eFEX module processes a core calorimeter area of $1.6 \times 0.8$ ($\eta \times \phi$), whereas the eFEX modules on the two sides each process a core calorimeter area of $1.7 \times 0.8$ ($\eta \times \phi$). Thus, three eFEX modules process a complete strip in the $\eta$ range, and 24 modules are required in total.






Figure 5. eFEX partitioning [11]
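The module count quoted above can be cross-checked with a few lines of arithmetic. The sketch below works in units of trigger towers to avoid floating-point issues. It assumes that the nominal tower width of 0.1 in $\phi$ corresponds to $2\pi/64$ (64 towers around the ring), which is not stated in the text. All variable names are illustrative.

```python
# Coverage expressed in trigger towers (nominal 0.1 x 0.1 in eta x phi).
ETA_TOWERS_TOTAL = 50            # -2.5 <= eta <= 2.5
PHI_TOWERS_TOTAL = 64            # assumption: nominal 0.1 in phi means 2*pi/64
CORE_ETA_TOWERS = [17, 16, 17]   # side, middle, side module core widths in eta
CORE_PHI_TOWERS = 8              # core width of 0.8 in phi for every module

# Three modules side by side must tile the full eta range exactly.
eta_towers_covered = sum(CORE_ETA_TOWERS)
modules_per_eta_strip = len(CORE_ETA_TOWERS)

# Number of such strips needed to go once around in phi.
phi_strips = PHI_TOWERS_TOTAL // CORE_PHI_TOWERS

total_modules = modules_per_eta_strip * phi_strips
print(eta_towers_covered, phi_strips, total_modules)
```

Under these assumptions the three core widths add up to the full $\eta$ range, eight strips close the ring in $\phi$, and the product reproduces the 24 modules given in the text.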

### **2.1.2 eFEX prototype**

The eFEX prototype, shown in Figure 6, is an ATCA [12] module with a non-standard physical form: the front board is extended through Zone 3 into the rear shelf space to optimize the routing of the input fibres, which are connected to the module via a custom Rear Transition Module (RTM).

The eFEX PCB is a 22-layer board with six micro-via layers. It houses:

- 4 Xilinx Virtex-7 [13] FPGAs (XC7VX550T) for algorithm processing;
- 1 Xilinx Virtex-7 FPGA (XC7VX330T) for control and readout functions;
- 17 Avago MiniPODs [14] for optical input (144 signals) and output (36 signals);
- 94 high-speed fan-out buffers (NB7VQ14M [15]) for data duplication between FPGAs.

Figure 6. eFEX prototype

The high-speed fan-out buffer NB7VQ14M was tested on a previous module, the High-Speed Demonstrator (HSD) [16]. It exhibited very good signal quality at 10 Gb/s with negligible propagation delay, and hence it was chosen for data duplication on the eFEX module. In total, there are about 450 high-speed multi-Gb/s differential tracks routed on a single eFEX. Blind and buried vias are used to achieve this density of signal tracks. The PCB is made from a low-loss material (both Isola Itera and Megtron6 have been used on different prototype modules), and the PCB is rotated by 22° to minimize the effect of the PCB fibre-glass weave on differential skew.

## **2.2 jFEX**

### **2.2.1 Processing area**

The jFEX receives data from the calorimeters within the region of $-4.9 \le \eta \le 4.9$ and $0 \le \phi \le 2\pi$, a total volume of ~3 Tb/s. The jFEX subsystem is partitioned into 7 processing modules, each covering a $\phi$ ring, as shown in Figure 7.

This $\phi$-ring coverage of each jFEX module enables it to calculate pile-up (i.e. energy density) for the $\eta$ range processed, and apply this as a correction to the jet and $E_{\rm T}^{\rm miss}$ algorithms.



Figure 7. jFEX partitioning

Figure 8. jFEX prototype PCB design

### **2.2.2 jFEX prototype**

The jFEX prototype is currently being manufactured. The PCB layout is shown in Figure 8. It uses the same physical form factor as the eFEX, so that the modules can share the same RTM design. The jFEX PCB is implemented as a 24-layer board with 8 micro-via layers. It houses:

- 4 Xilinx UltraScale [13] FPGAs (XCVU190) for algorithm processing and readout;
- 1 mezzanine card for control;
- 24 Avago MiniPODs for optical input (216 signals) and output (32 signals).

Due to its larger algorithm window, the jFEX needs to share even more data between FPGAs than the eFEX. In total, there are about 540 high-speed multi-Gb/s differential tracks routed on a single jFEX PCB. A loopback feature of the Xilinx Multi-Gb/s Transceiver (MGT), Far-End PMA loopback, is used for data sharing between FPGAs on the jFEX module. This makes use of the otherwise unused transmitters of the MGTs, at the cost of a small increase in latency. Figure 9 shows the results of loopback tests done on a Xilinx Evaluation Board VCU110 (UltraScale XCVU190), which show a very good eye opening at 25 Gb/s.




Figure 9. Xilinx MGT Far-End PMA loopback test (path 3 in diagram), IBERT 2-D Eye Scan 25Gb/s @ 10<sup>-11</sup> on Xilinx Evaluation Board VCU110 (Ultrascale XCVU190)

## **2.3 PCB design method**

The eFEX and jFEX prototypes share many PCB design challenges. Firstly, both are very high-density, high-speed PCBs. The baseline speed of the input links specified in the ATLAS Phase-I TDR is 6.4 Gb/s. However, there is always a strong desire to run the links faster in order to further improve trigger performance and flexibility. Secondly, both the eFEX and jFEX require complex channel mapping and data sharing. As a consequence, some high-speed links need to run over very long signal tracks across the whole PCB between FPGAs. Thirdly, both the eFEX and jFEX have very high power consumption, approaching 400 W per module. Cooling design will also be very challenging, as all the power consumed turns into heat. To meet these challenges, the systematic PCB design method, which was developed with great success in the HSD project, has been adopted in both the eFEX and jFEX PCB designs. Notably, in addition to signal-integrity simulation, power-integrity simulation (as shown in Figure 10) is particularly important for these modules. The power rails for the FPGA cores and MGTs on both the eFEX and jFEX need to carry more than 100 A; the voltage drops across the power distribution networks thus become significant design constraints.




Figure 10. eFEX MGT power plane DC voltage drop simulation at 35A
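To see why DC voltage drop becomes a design constraint at these currents, a back-of-the-envelope Ohm's-law estimate is useful. The sketch below is not the power-integrity simulation used for the modules. The rail voltage, path resistance and tolerance are hypothetical values chosen only to show the scale, and the function names are illustrative.

```python
def ir_drop_v(current_a, resistance_ohm):
    """DC voltage drop across a distribution path (Ohm's law)."""
    return current_a * resistance_ohm


def max_path_resistance_ohm(current_a, rail_v, allowed_fraction):
    """Largest path resistance that keeps the drop within the allowed fraction of the rail."""
    return allowed_fraction * rail_v / current_a


# Hypothetical numbers, chosen only to show the scale of the problem.
current_a = 100.0        # the text quotes rails carrying more than 100 A
rail_v = 1.0             # assumed nominal rail voltage
resistance_ohm = 0.5e-3  # assumed 0.5 milliohm plane-plus-via resistance

drop_v = ir_drop_v(current_a, resistance_ohm)
budget_ohm = max_path_resistance_ohm(current_a, rail_v, 0.03)
print(f"drop = {drop_v * 1e3:.0f} mV, budget = {budget_ohm * 1e3:.2f} mOhm")
```

With these assumed values, half a milliohm of distribution resistance already costs 50 mV, and a 3% tolerance on a 1 V rail leaves a budget of only 0.3 mΩ. This is why the copper planes and via arrays need to be simulated rather than estimated.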

# **3. Prototype test**

After passing initial power-on and boundary scan tests smoothly, the eFEX prototype (Figure 6) was tested with the FTM and the LAr Digital Processing System (DPS) prototype, in a systematic check of all the eFEX high-speed input and output links.

In order to validate the TDR baseline link speed and test the upper limit of the possible link speeds, all eFEX high-speed links were tested at three different speeds (6.4 Gb/s, 11.2 Gb/s and 12.8 Gb/s). The decision on the link speed has a significant impact on the FEX architecture, especially for the jFEX, where different link speeds require completely different partitioning.

The test setup for these link speed tests is very close to the final system as shown in Figure 1. For example, a FOX demonstrator was used to mimic the complex fibre mapping and insertion loss between the LAr DPS and the eFEX.

For the link tests with the LAr DPS, the link sources were Altera Arria 10 FPGAs [17] with MGTs capable of up to 14 Gb/s. The first eFEX prototype is fitted with Xilinx speed grade -2 Virtex-7 FPGAs, with MGTs specified up to 11.3 Gb/s. The FTM is fitted with Xilinx speed grade -3 Virtex-7 FPGAs, with MGTs specified up to 13.1 Gb/s.

## **3.1 Link speed test results**

The test results obtained for the TDR baseline link speed of 6.4 Gb/s are extremely good, with wide-open 2-D eye scans and Bit Error Rates of less than $10^{-14}$ (no error over $3 \times 10^{14}$ bits) for 257 out of the 264 input links on the eFEX prototype.
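The link between "no error over $3 \times 10^{14}$ bits" and a Bit Error Rate below $10^{-14}$ can be made explicit. For zero observed errors in $N$ bits, the standard upper limit at confidence level $CL$ is $-\ln(1 - CL)/N$. The sketch below assumes a 95% confidence level, which the text does not state, and the function names are illustrative.

```python
import math


def ber_upper_limit(bits_tested, confidence=0.95):
    """Upper limit on the bit error rate when zero errors are seen in bits_tested bits."""
    return -math.log(1.0 - confidence) / bits_tested


def run_time_seconds(bits_tested, line_rate_bps):
    """Time needed to transmit bits_tested bits on one link at the given line rate."""
    return bits_tested / line_rate_bps


limit = ber_upper_limit(3e14)                           # just under 1e-14
hours_at_6g4 = run_time_seconds(3e14, 6.4e9) / 3600.0   # about 13 hours
hours_at_11g2 = run_time_seconds(3e14, 11.2e9) / 3600.0 # about 7.4 hours
print(f"BER < {limit:.2e}; {hours_at_6g4:.1f} h at 6.4 Gb/s, {hours_at_11g2:.1f} h at 11.2 Gb/s")
```

Under this assumption, an error-free run of $3 \times 10^{14}$ bits is the shortest test that supports the quoted limit. Tightening the limit by another order of magnitude would need a run ten times longer per link.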

At 11.2 Gb/s, the opening of the 2-D eye scans on the eFEX is still very good, as shown in Figure 11. Figure 12 shows the overall statistics of the open areas of the 2-D eye scans for all eFEX input links at 11.2 Gb/s. The Bit Error Rate on 257 out of 264 eFEX input channels is less than $10^{-14}$ (no error over $3 \times 10^{14}$ bits). Of the other 7 links, 4 are correlated with less-optimal PCB routing (which can be improved in the next PCB iteration), and 3 are due to a bad high-speed fan-out buffer (which can be repaired).

Figure 11. Typical eFEX input link 2-D eye-scan @ 11.2Gb/s




Figure 12. Open Area of 2-D eye scan for all eFEX input links

At 12.8 Gb/s, many links on the eFEX prototype still work, but a significant number fail, as this speed is outside the specified range of the FPGA MGTs. In order to evaluate the eFEX performance at 12.8 Gb/s, another eFEX prototype will be fitted with Xilinx speed grade -3 Virtex-7 FPGAs, with MGTs capable of running up to 13.1 Gb/s.

## **3.2 Link speed decision**

Based on the above excellent test results, and previous test results between the LAr DPS and the L1Calo gFEX, 11.2 Gb/s has been adopted as the new baseline link speed between LAr and L1Calo. This has greatly simplified the jFEX architecture, increased the dynamic range of the calorimeter data received by L1Calo, and simplified the link protocol, all of which improve L1Calo trigger performance.
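One way to see how a faster link increases the data available per event is to count the raw line bits that fit into one bunch crossing. The sketch below assumes a nominal 40 MHz bunch-crossing rate, which is not stated in the text, and ignores line-encoding and protocol overhead, so the usable payload is smaller than the figures it computes. The names are illustrative.

```python
BUNCH_CROSSING_HZ = 40e6  # assumption: nominal LHC bunch-crossing rate


def raw_bits_per_crossing(line_rate_gbps):
    """Raw serial bits carried by one link during one bunch crossing."""
    return round(line_rate_gbps * 1e9 / BUNCH_CROSSING_HZ)


bits = {rate: raw_bits_per_crossing(rate) for rate in (6.4, 11.2, 12.8)}
gain = bits[11.2] / bits[6.4]
print(bits, gain)
```

Under these assumptions, moving from 6.4 Gb/s to 11.2 Gb/s raises the raw capacity per link and bunch crossing from 160 to 280 bits, a factor of 1.75, which can be spent on more bits per calorimeter value or on fewer fibres.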

# **4. Conclusion**

The ATLAS Level-1 Calorimeter Trigger will be upgraded as part of the ATLAS Phase-I upgrades for 2019. Development of both the eFEX and jFEX is well under way, with prototypes under test or in manufacture. A systematic PCB design method, centred on PCB simulation and validation, has been used in developing these high-speed, high-density and high-power modules. The test results of the first eFEX prototype are very good, as a result of which the baseline for the speed of links into L1Calo has been increased from 6.4 Gb/s to 11.2 Gb/s, simplifying the architecture and improving the performance of the trigger.

# **References**

- [1] Lyndon Evans and Philip Bryant, *LHC Machine*, 2008 *JINST* **3** S08001. http://iopscience.iop.org/1748-0221/3/08/S08001
- [2] ATLAS collaboration, *The ATLAS Experiment at the CERN Large Hadron Collider*, 2008 *JINST* **3** S08003. http://iopscience.iop.org/1748-0221/3/08/S08003
- [3] ATLAS collaboration, *ATLAS Level-1 Trigger: TDR*, CERN/LHCC/98-14. http://atlas.web.cern.ch/Atlas/GROUPS/DAQTRIG/TDR/tdr.html
- [4] ATLAS collaboration, *Technical Design Report for the Phase-I Upgrade of the ATLAS TDAQ System*, CERN-LHCC-2013-018. https://cds.cern.ch/record/1602235/files/ATLAS-TDR-023.pdf
- [5] R. Achenbach et al., *The ATLAS Level-1 Calorimeter Trigger*, 2008 *JINST* **3** P03001. http://iopscience.iop.org/1748-0221/3/03/P03001
- [6] Weihao Wu et al., *The development of the Global Feature Extractor for the LHC Run-3 upgrade of the L1 Calorimeter trigger system*, ATL-DAQ-PROC-2016-010. https://cds.cern.ch/record/2162162/
- [7] https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/UPGRADE/CERN-LHCC-2013-018/fig_19.png
- [8] https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/UPGRADE/CERN-LHCC-2013-018/fig_20.png
- [9] https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/UPGRADE/CERN-LHCC-2013-017/fig_11.png
- [10] https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/UPGRADE/CERN-LHCC-2013-017/fig_14b.png
- [11] https://atlas.web.cern.ch/Atlas/GROUPS/PHYSICS/UPGRADE/CERN-LHCC-2013-018/fig_26.png
- [12] *AdvancedTCA PICMG 3.0 Short Form Specification*. http://www.picmg.org/pdf/PICMG\_3\_0\_Shortform.pdf
- [13] https://www.xilinx.com/products/silicon-devices/fpga.html
- [14] *MicroPOD<sup>TM</sup> and MiniPOD<sup>TM</sup> 120G Transmitters/Receivers*. http://www.avagotech.com/pages/minipod\_micropod
- [15] http://www.onsemi.com/PowerSolutions/product.do?id=NB7VQ14M
- [16] W. Qian, *ATLAS level-1 calorimeter trigger upgrade for phase-I*, 2013 *JINST* **8** C01039. http://iopscience.iop.org/article/10.1088/1748-0221/8/01/C01039
- [17] https://www.altera.com/products/fpga/arria-series/arria-10/overview.html