PREPARED FOR SUBMISSION TO JINST

Topical Workshop on Electronics for Particle Physics 19–23 September 2022 Bergen, Norway

# Verification of simulated ASIC functionality and radiation tolerance for the HL-LHC ATLAS ITk Strip Detector

W.J. Ashmanskas, J.R. Dandoy,<sup>1</sup> N.C. Dressnandt, P.T. Keener, J. Kroll, E. Lipeles, S. Lu, F.M. Newcomer, A. Nikolica, B.J. Rosser,<sup>2</sup> E. Thomson

Department of Physics and Astronomy, University of Pennsylvania, Philadelphia, Pennsylvania, USA

*E-mail:* jeff.dandoy@cern.ch

ABSTRACT: The verification of ASICs through simulation is critical to ensure their successful operation in particle physics detectors and to minimize the number of long and expensive production cycles required. Three radiation-tolerant ASICs (HCC, AMAC, and ABC) will perform the front-end readout, monitoring, and control of the ITk Strip charged-particle tracker for the ATLAS detector at the HL-LHC. The Python-based cocotb verification framework is used to design sophisticated tests with contributions from ASIC verification non-experts and students. The verification program includes interactions between multiple ASICs, realistic data flows, operational stress tests, and a focus on mitigation of disruptive Single Event Effects due to radiation.

KEYWORDS: Simulation methods and programs; Particle tracking detectors; Si microstrip and pad detectors; Radiation-hard detectors

© 2022 CERN for the benefit of the ATLAS Collaboration. Reproduction of this article or parts of it is allowed as specified in the CC-BY-4.0 license.

<sup>1</sup>Corresponding author.

<sup>2</sup>Now at Enrico Fermi Institute, University of Chicago, Chicago, Illinois, USA

# Contents

- 1 ATLAS Inner Tracker Upgrade for the HL-LHC
- 2 ASICs of the ITk Strip subdetector
- 3 ASIC verification simulations
  - 3.1 Multi-level triggering
- 4 Irradiation simulations
- 5 Conclusions and outlook

# 1 ATLAS Inner Tracker Upgrade for the HL-LHC

The high-luminosity upgrade to the Large Hadron Collider (HL-LHC) is expected to begin operation in 2029, targeting peak instantaneous luminosities of up to  $7.5 \times 10^{34}$  cm<sup>-2</sup>s<sup>-1</sup> and delivering more than 3000 fb<sup>-1</sup> of high-energy proton-proton collisions to experiments. It will enable the experiments to expand their physics goals, including precision measurements of Higgs boson properties and an extensive search program for physics beyond the Standard Model.

The upgrade of the ATLAS experiment [1] for the HL-LHC includes the replacement of the inner charged-particle tracking detector with the all-silicon Inner Tracker (ITk), consisting of the inner Pixel and outer Strip subdetectors. The ITk introduces an improved granularity to handle the busier environment, a faster full-detector readout rate of 1 MHz, and the ability to withstand higher levels of radiation. The coverage of silicon-based tracking in ATLAS will increase from a radius of r < 514 mm to r < 1000 mm, with an expanded pseudorapidity coverage from  $|\eta| \le 2.6$  to  $|\eta| \le 4.0$ .

The ITk Strip subdetector [2] consists of four double-sided layers in the detector barrel and six double-sided disks in each detector end-cap. It will utilize approximately 280,000 ASICs to read out 165 m<sup>2</sup> of silicon sensors through 60 million channels. The ASIC designs are described in Section 2. The simulation of the designs is discussed in Section 3, with a particular focus on interactions between multiple ASICs. Simulation of radiation sensitivities is detailed in Section 4, while conclusions and outlook are given in Section 5.

# 2 ASICs of the ITk Strip subdetector

Three ASIC designs contribute to the readout, control, and monitoring of the ITk Strip detector. The ATLAS Binary Chip (ABC) ASIC provides pre-amplification and discrimination of signals from 256 strips of n-in-p type sensors. Sensor data are stored in a pipeline of fixed latency and copied to memory upon reception of a synchronous first-level trigger (L0). Upon reception of a subsequent asynchronous second-level trigger (L1), the data are compressed into 12-bit clusters and transmitted.

The Hybrid Controller Chip (HCC) ASIC provides control of up to 11 ABCs and forwards L0 trigger requests. In the nominal single-level trigger mode, the HCC will auto-generate an L1 trigger for each L0 trigger. It can internally queue up to 128 pending L1 triggers, regulating the flow of requests to the ABCs. The HCC combines the physics data from all connected ABCs into a single event packet, serializes it, and transmits it off-detector at 640 Mbps. One HCC and the connected ABCs form a readout hybrid.

The Autonomous Monitoring And Control (AMAC) ASIC provides voltage control and monitors currents, voltages, and temperatures. If a critical value goes out of range the AMAC autonomously interrupts connected ABCs, HCCs, the DC/DC regulator, and HV switches for protection. The AMAC is integrated into a power board, which together with one or two hybrids and the silicon sensor strips forms a module.

The verification described here was developed to test the significant changes required for an increased readout rate in design versions ABCStarV0, HCCStarV0, and AMACV2a, as well as increased radiation tolerance in the final design versions ABCStarV1, HCCStarV1, and AMACStar.

# 3 ASIC verification simulations

Verification simulations were performed separately using the HCC and AMAC designs, as well as with multiple ASICs at the hybrid level (1 HCC with up to 11 ABCs) and module level (1 AMAC, 2 HCCs, and up to 22 ABCs). The tests were written in Python and interfaced to the Cadence Incisive and Xcelium simulators with the cocotb (Coroutine Cosimulation testbench) [3, 4] verification framework, an open-source Python library. This allowed the easy development of complex tests which benefited from the extensive Python ecosystem and its relative ease of use.

Simulations were performed on the designs at the register-transfer level, after logic synthesis, and after place and route. Hundreds of unit tests were developed for the individual HCC and AMAC designs and focused on exhaustive coverage of each ASIC's logic and on correct functionality [5]. Tests were run nightly in a Continuous Integration (CI) system.

Realistic simulations were performed at the hybrid and module levels to verify interactions between ASICs under expected HL-LHC conditions. The trigger rate was randomized (nominally 1 MHz) and constrained to follow the expected proton beam structure of the HL-LHC. The number of physics clusters per event was Poisson-sampled with a mean occupancy  $\lambda$  informed by HL-LHC expectations. Tests typically simulated conditions in the highest-occupancy modules, including the innermost Layer 0 (L0) module of the detector barrel region ( $\lambda = 10$  clusters across 10 ABCs in each hybrid) and the farthest-forward Disk 5 Ring 1 (D5R1) module of the detector end-cap region ( $\lambda = 18.9$  clusters across 9 ABCs in each hybrid). These simulations validated communication between ASICs and performance in the most demanding part of the detector, gave insight into long-term operating behavior, and helped inform quality testing of production ASICs [6–8].
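The stimulus described above can be sketched in plain Python. This is only an illustration of the idea, not the cocotb test code used for the verification: the 25 ns bunch spacing and 3564 slots per orbit are standard LHC values, whereas the fill pattern, the per-crossing accept probability, and all function names are assumptions made for the example. The value `lam = 10.0` mirrors the barrel L0 occupancy quoted above.

```python
import math
import random

BUNCH_SPACING_NS = 25.0      # nominal LHC bunch spacing
SLOTS_PER_ORBIT = 3564       # bunch slots in one LHC orbit


def toy_fill_pattern(train_len=72, gap_len=8):
    """Hypothetical fill pattern: trains of filled slots separated by gaps."""
    period = train_len + gap_len
    return [(slot % period) < train_len for slot in range(SLOTS_PER_ORBIT)]


def poisson(rng, lam):
    """Draw one Poisson-distributed integer using Knuth's algorithm."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def generate_events(rng, n_crossings, trigger_rate_hz, lam, filled):
    """Return (time_ns, n_clusters) for each randomly accepted filled crossing."""
    crossing_rate_hz = 1e9 / BUNCH_SPACING_NS
    fill_fraction = sum(filled) / len(filled)
    # Accept probability per filled crossing so the mean rate matches the target.
    p_accept = trigger_rate_hz / (crossing_rate_hz * fill_fraction)
    events = []
    for bx in range(n_crossings):
        if filled[bx % SLOTS_PER_ORBIT] and rng.random() < p_accept:
            events.append((bx * BUNCH_SPACING_NS, poisson(rng, lam)))
    return events


rng = random.Random(1)
filled = toy_fill_pattern()
events = generate_events(rng, 400_000, 1e6, 10.0, filled)
duration_s = 400_000 * BUNCH_SPACING_NS * 1e-9
mean_rate_mhz = len(events) / duration_s / 1e6
mean_clusters = sum(n for _, n in events) / len(events)
print(f"{len(events)} triggers, {mean_rate_mhz:.2f} MHz, {mean_clusters:.2f} clusters/event")
```

Triggers are only issued on filled bunch slots, so the trigger spacing inherits the beam structure while the long-run average stays near the nominal rate.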

Additional tests probed the capabilities of the ITk Strip system at the extremes. Scans for maximal operating limits of trigger rates and cluster occupancies showed hybrid-level performance exceeding design requirements. System behavior and recovery during unexpected noise bursts were studied, including many back-to-back triggers or an anomalously large number of clusters. The L1 trigger queue occupancy in the HCC during an example noise burst test is shown in figure 1. During nominal operation a noise burst of 30 clusters is placed on a single ABC for a period of 60 consecutive triggers, leading to long processing times by the ABC and consequently a buildup of queued L1 triggers in the HCC. After the noise burst ends, the system slowly recovers until the L1 trigger queue is empty. Tests showed the designs were operationally robust against anomalous scenarios, and informed changes to the optimal reset behavior for quick recovery.



**Figure 1**. Simulated time evolution of the internal trigger queue occupancy of the HCC during a noise burst. The simulation reflects the end-cap module D5R1 with highest expected cluster occupancy. During regular operation at a 1 MHz trigger rate, a single ABC receives an anomalous 30 clusters for a period of 60 triggers.
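The qualitative behavior in figure 1 can be reproduced with a toy first-in-first-out queue. The sketch below is not a model of the HCC or ABC logic: the evenly spaced 1 MHz triggers, the two service times, and the function name are all hypothetical values chosen so that nominal events are processed faster than they arrive and burst events are processed slower. Only the queue depth of 128 and the 60-trigger burst length come from the text.

```python
def simulate_queue(n_triggers, burst_start, burst_len,
                   period_ns=1000.0, nominal_service_ns=800.0,
                   burst_service_ns=2500.0, depth=128):
    """Toy FIFO queue: return the occupancy seen at each trigger and the drop count."""
    occupancy = []
    pending = []            # completion times of queued or in-flight L1 requests
    server_free_at = 0.0    # time at which the (hypothetical) readout becomes idle
    dropped = 0
    for i in range(n_triggers):
        now = i * period_ns
        # Discard requests that finished before this trigger arrived.
        pending = [t for t in pending if t > now]
        in_burst = burst_start <= i < burst_start + burst_len
        service = burst_service_ns if in_burst else nominal_service_ns
        if len(pending) < depth:
            start = max(now, server_free_at)
            server_free_at = start + service
            pending.append(server_free_at)
        else:
            dropped += 1    # queue full: the request is lost in this toy model
        occupancy.append(len(pending))
    return occupancy, dropped


occupancy, dropped = simulate_queue(n_triggers=1000, burst_start=100, burst_len=60)
print("peak occupancy:", max(occupancy), "dropped:", dropped)
print("occupancy at end:", occupancy[-1])
```

With these assumed numbers the queue grows while the slow burst events are being worked through and then drains slowly, because the nominal service time is only slightly shorter than the trigger period.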

## 3.1 Multi-level triggering

The realistic simulations allowed for measurements of system performance, helping inform design decisions for the ATLAS Trigger and Data Acquisition (TDAQ) upgrades. The feasibility of an evolution to multi-level hardware triggers [9] with intermediate tracking was studied.

In the multi-level trigger scenario, the rate of the first fixed-latency L0 trigger is increased to 4 MHz. The HCC forwards L0 trigger requests to the ABCs to record physics data but does not generate an L1 trigger as in the single-level trigger scenario. Detector regions of interest are quickly identified by TDAQ systems and a priority R3 trigger is sent to ~10% of the ITk Strip detector. The HCC prioritizes R3 triggers over any pending L1 triggers, ensuring the urgent physics data are read out with latencies below ~6 $\mu$s. This subset of ITk Strip data is used for tracking that informs the final L1 trigger decision. The only prioritization mechanism is in the HCC trigger queue, as the HCC converts R3 triggers into L1 triggers before transmission to the ABCs.

The expected performance of multi-level triggering in the busiest barrel module is given in figure 2, showing the percentage of data loss as a function of the L0 trigger rate and the delay in sending L1 requests to the HCC. Non-negligible data loss above 0.5% can be seen at the highest L0 trigger rates and longest L1 delays, reflecting the overwriting of L0 physics data stored on the ABC before the associated L1 trigger request is received. The ABC memory can store physics data for up to 128 L0 triggers. This data loss is reduced or removed at lower L0 trigger rates and shorter L1 trigger delays. The evolution scenario for multi-level triggering was not adopted [10] but the functionality remains in the ITk Strip system.



**Figure 2**. Simulated data loss during multi-level trigger operation. The simulation reflects the barrel module L0 with highest expected cluster occupancy. Simulations of 50,000 L0 triggers are performed for various L0 trigger rates and various delays of the L1 trigger decision. The R3 trigger rate is set to 10% of the L0 rate, and the combined L1 and R3 trigger rate is set to 1 MHz.
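The onset of data loss follows from the 128-entry ABC memory quoted above: an L0 entry is overwritten once 128 further L0 triggers arrive before its L1 request. A rough estimate, assuming evenly spaced L0 triggers and ignoring queueing in the HCC (both simplifications, with hypothetical function names), is:

```python
ABC_L0_BUFFER_DEPTH = 128   # L0 triggers the ABC memory can hold (from the text)


def max_l1_delay_us(l0_rate_mhz, depth=ABC_L0_BUFFER_DEPTH):
    """Longest L1 delay before the oldest L0 entry is overwritten, for evenly spaced L0s."""
    return depth / l0_rate_mhz


def is_overwritten(l0_rate_mhz, l1_delay_us, depth=ABC_L0_BUFFER_DEPTH):
    """True if more than `depth` L0 triggers arrive, on average, during the L1 delay."""
    return l0_rate_mhz * l1_delay_us > depth


for rate in (1.0, 2.0, 4.0):
    print(f"L0 rate {rate:.0f} MHz -> buffer spans {max_l1_delay_us(rate):.0f} us")
# L0 rate 1 MHz -> buffer spans 128 us
# L0 rate 2 MHz -> buffer spans 64 us
# L0 rate 4 MHz -> buffer spans 32 us
```

Because real triggers are randomly spaced and L1 requests also wait in the HCC queue, losses in the full simulation can appear before this average threshold is reached.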

# 4 Irradiation simulations

The ASICs must operate under radiation doses of up to 50 MRad [2] and while subject to energetic ionizing particles causing disruptive Single Event Effects (SEEs). Sensitivities discovered in the prototype ASICs during radiation tests prompted additional protections in the designs. These include more deglitch circuits and expanded use of triple modular redundancy (TMR) with triplicated voters. The TMR was implemented using the CERN TMRG tool [11], and size considerations required inputs, outputs, and storage paths for physics data to remain non-triplicated.
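The principle behind triple modular redundancy can be shown with a bitwise majority vote. This is a generic illustration with a single voter, not the TMRG-generated logic of the ASICs, where the voters themselves are also triplicated; the function names and the stored value are arbitrary.

```python
def majority(a, b, c):
    """Bitwise majority vote of three register copies."""
    return (a & b) | (b & c) | (a & c)


def flip_bit(value, bit):
    """Model a Single Event Upset by inverting one bit of a stored value."""
    return value ^ (1 << bit)


stored = 0b1011_0010
copies = [stored, stored, stored]
copies[1] = flip_bit(copies[1], 5)       # upset in one copy only
voted = majority(*copies)
print(voted == stored)                   # True: a single upset is outvoted

copies[2] = flip_bit(copies[2], 5)       # same bit upset in a second copy
print(majority(*copies) == stored)       # False: two matching upsets win the vote
```

The second case shows why two upsets close together in time on the same triplicated bit can still produce an error.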

A test program with SEE simulations has been performed for the HCC at hybrid and module level using netlists after logic synthesis or after place and route. A similar test program was first performed for the individual HCC and AMAC designs [5]. Two types of SEEs are considered in the irradiation simulations. Single Event Upsets (SEUs) change the state of flip-flops, and the Verilog gate libraries were modified to support simulated SEU injections. Single Event Transients (SETs) cause temporary voltage pulses in wires, and in simulation wires were inverted via cocotb with randomized timing. The SET width was typically 1 ns, longer than physically expected.
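A randomized injection schedule of this kind can be sketched as follows. The sketch only generates a plan and does not drive a simulator. The 25 ns clock period is an assumption (the text does not state the clock used), one injection per clock cycle is taken as the cadence, and the signal names and dictionary layout are hypothetical.

```python
import random

CLOCK_PERIOD_NS = 25.0   # assumes a 40 MHz clock; not stated in the text
SET_WIDTH_NS = 1.0       # typical SET width quoted in the text


def plan_injections(rng, flip_flops, wires, n_cycles):
    """Pick one random target per clock cycle and describe the injection."""
    targets = [("SEU", name) for name in flip_flops] + [("SET", name) for name in wires]
    plan = []
    for cycle in range(n_cycles):
        kind, name = rng.choice(targets)
        # Randomize the injection time inside the clock cycle.
        offset = rng.uniform(0.0, CLOCK_PERIOD_NS - SET_WIDTH_NS)
        start = cycle * CLOCK_PERIOD_NS + offset
        # An SEU flips stored state until it is overwritten; an SET inverts a wire briefly.
        duration = SET_WIDTH_NS if kind == "SET" else None
        plan.append({"kind": kind, "target": name,
                     "start_ns": start, "duration_ns": duration})
    return plan


plan = plan_injections(random.Random(0), ["ff_a", "ff_b"], ["wire_x"], n_cycles=5)
for entry in plan:
    print(entry)
```

In a cocotb test each planned entry would be turned into a write to the corresponding simulator handle, which is omitted here because it depends on the design hierarchy and the modified gate libraries.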

SEEs are injected once every clock cycle at random on flip-flops and wires in the design. This rate is roughly 1 billion times faster than expected at the HL-LHC to collect sufficient statistics. A verification target of 100 SEEs per flip-flop (~42k) and wire (~250k) was exceeded via multiple daily runs of the CI system. Realistic data flow was simulated during the SEE injections. If a data error was observed, the injection was stopped while the simulation continued. In case of no further errors, the disruptive SEE was categorized as an automatic recovery from a one-off error, commonly caused by bit flips in the non-triplicated physics data path. In situations of continuous errors, an escalating series of recovery attempts was performed using logic, register, and hard resets. The rapid SEE rate and complexity of multi-ASIC interactions could make association of errors to specific SEEs difficult. A statistical analysis correlated errors with nearby injected SEEs, allowing rare sensitivities to be addressed, for example by adding deglitchers to critical reset logic.
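The escalation from automatic recovery through logic, register, and hard resets can be expressed as a short classification routine. This is a sketch of the procedure as described in the text, not the test code itself: `has_errors_after` is a hypothetical callback standing in for the part of the testbench that applies a reset and checks the subsequent data flow.

```python
RESET_LADDER = ["logic reset", "register reset", "hard reset"]


def classify_recovery(has_errors_after):
    """Return the mildest action after which the data flow is error-free.

    `has_errors_after(action)` is a hypothetical callback that applies the
    action to the simulation (None means "just keep running") and reports
    whether data errors persist afterwards.
    """
    if not has_errors_after(None):
        return "automatic recovery"
    for action in RESET_LADDER:
        if not has_errors_after(action):
            return action
    return "unrecovered"


def stuck_until_register_reset(action):
    """Stand-in for a simulation whose errors clear only from a register reset upward."""
    return action not in ("register reset", "hard reset")


print(classify_recovery(stuck_until_register_reset))  # register reset
```

Trying the mildest action first keeps the recorded category equal to the least disruptive recovery that works, which is what a per-category recovery-time distribution requires.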

Of 92 million SEEs simulated on triplicated logic, only 28 instances of one-off physics data errors were observed. These were caused by interactions between two close-by SEEs extending beyond one clock cycle, a consequence of the rapid SEE injection rate that is not expected to occur at the HL-LHC. Of 25 million SEEs simulated on non-triplicated logic, 0.3% led to one-off physics data errors. These were expected, and extrapolations from previous testbeam results with a prototype HCCStarV0 [12] gave a predicted error rate of $\mathcal{O}(1)$ per day per HCC at the HL-LHC. This is 10 orders of magnitude lower than the expected electronic noise of the silicon strips and therefore negligible.

Resets were required for 0.01% of the SEEs on non-triplicated logic, which extrapolates to $\mathcal{O}(1)$ such error every 2 weeks per HCC at the HL-LHC. This is an upper bound, as many instances were caused by an accumulation of effects from multiple SEEs due to the rapid SEE rate. An example distribution of the simulated recovery time is given in figure 3, split by recovery type. The recovery for each reset type is seen to be quick, including for hard resets that were artificially induced using SETs of 5 ns length. Through the course of the irradiation simulations, all sources of hard resets were identified and corrected. After completion of the extensive verification program in simulation and the subsequent fabrication, radiation tests were performed with heavy ion and proton beams [8, 13]. These radiation tests have confirmed the successful operation of the ASICs in a high radiation environment and validated the extrapolated upper bounds on error and reset rates.



**Figure 3**. Simulated recovery time of disruptive Single Event Effects in the non-triplicated HCC logic, categorized by automatic recovery (green), logic reset (blue), register reset (violet), or hard reset (red). The simulation reflects the end-cap module D5R1 with highest expected cluster occupancy and uses 5 ns SETs.

# 5 Conclusions and outlook

Simulations were performed for the three ASICs of the ATLAS ITk Strip detector and their interactions using the Python-based cocotb verification library. Realistic tests of expected HL-LHC operating conditions and novel techniques for studying radiation effects strengthened the designs and ensured the ASICs met technical specifications. The simulations provided key insights into system operation that informed functional, quality, and radiation tests of production ASICs.

# Acknowledgments

The authors thank Jaya John John<sup>1</sup> and Pedro Miguel Vicente Leitao<sup>2</sup> for support on ABC simulations and the CERN CHIPS group for support in reviewing the ASIC designs and simulations.

<sup>1</sup>Department of Physics, University of Oxford

<sup>2</sup>CERN

# References

- [1] ATLAS Collaboration, *The ATLAS Experiment at the CERN Large Hadron Collider*, *JINST* **3** (2008) S08003.
- [2] ATLAS Collaboration, *Technical Design Report for the ATLAS Inner Tracker Strip Detector*, CERN-LHCC-2017-005, 2017.
- [3] C. Higgs (Potential Ventures), *Cocotb*, ORConf2015, CERN, Geneva, Switzerland, Oct 2015.
- [4] B.J. Rosser, *Cocotb: a Python-based digital logic verification framework*, Micro-electronics Section seminar, CERN, Geneva, Switzerland, Dec 2018.
- [5] B.J. Rosser, *Continuing the Search for Nothing: Invisible Higgs Boson Decays and High Luminosity Upgrades at the ATLAS Detector*, CERN-THESIS-2021-333, 2021.
- [6] R.P. McGovern, *Pre-Production Testing of the HCCStar at Penn for the ATLAS ITk Detector*, Topical Workshop on Electronics for Particle Physics, 2022, submitted to JINST.
- [7] T. Gosart, *Pre-Production Testing of the AMACStar ASIC at Penn for the ATLAS ITk Detector*, Topical Workshop on Electronics for Particle Physics, 2022, submitted to JINST.
- [8] L.F. Gutierrez Zagazeta, *Testing of the HCC and AMAC functionality and radiation tolerance for the HL-LHC ATLAS ITk Strip Detector*, Topical Workshop on Electronics for Particle Physics, 2022, submitted to JINST.
- [9] ATLAS Collaboration, *Technical Design Report for the Phase-II Upgrade of the ATLAS TDAQ System*, CERN-LHCC-2017-020, ATLAS-TDR-029, 2017.
- [10] ATLAS Collaboration, *Technical Design Report for the Phase-II Upgrade of the ATLAS Trigger and Data Acquisition System - Event Filter Tracking Amendment*, CERN-LHCC-2022-004, ATLAS-TDR-029-ADD-1, 2022.
- [11] S. Kulis, *Single Event Effects mitigation with TMRG tool*, *JINST* **12** (2017) C01082.
- [12] J.R. Dandoy, *Development and Testing of the ATLAS ITk HCCStar ASIC*, 2019 Meeting of the Division of Particles and Fields of the American Physical Society, Northeastern University, Boston, USA, July 2019.
- [13] A. Wall, *Irradiation testing of ASICs for the ATLAS HL-LHC upgrade*, Topical Workshop on Electronics for Particle Physics, 2022, submitted to JINST.