

27 February 2019 (v5, 22 July 2019)

# The CMS Phase-1 pixel detector – experience and lessons learned from two years of operation

Benedikt Vormwald for the CMS Collaboration

#### Abstract

In 2017, CMS installed a new pixel detector with 124M channels that features full 4-hit coverage in the tracking volume ( $|\eta| < 2.5$ ) and is capable of withstanding instantaneous luminosities of  $2 \cdot 10^{34} \text{ cm}^{-2} \text{s}^{-1}$  and beyond. The detector has now been operated successfully for two years in proton and heavy-ion collisions. During this time, many improvements have been made to the DAQ system, the detector monitoring capabilities, and the modelling of silicon properties. Valuable experience has been gained in running a detector with DC-DC powering and CO<sub>2</sub> cooling, both of which are new core technologies for most of the upcoming detector upgrades at LHC experiments. During the long shutdown of the LHC from 2019 to 2021, the CMS pixel detector will be extracted and the modules of the innermost layer, which suffered the most from radiation damage, will be replaced. On that occasion, an improved readout chip and a new communication chip will be used for these modules, fixing problems observed during operation.

This paper will give an overview of the different improvements that have been made and the challenges that have been faced in the last two years. A special focus will be put on the lessons learned in the light of the design of future detectors. Finally, the planned work on the CMS pixel detector during the LHC shutdown will be outlined.

Presented at PIXEL2018 International Workshop on Semiconductor Pixel Detectors for Particles and Imaging 2018

PUBLISHED BY IOP PUBLISHING FOR SISSA MEDIALAB



Received: *February 28, 2019* Accepted: *June 24, 2019* Published: *July 16, 2019* 

Pixel 2018 International Workshop December 10–14, 2018 Activity Center of Academia Sinica, Taipei, Taiwan

University of Hamburg, Hamburg, Germany

*E-mail:* benedikt.vormwald@cern.ch

KEYWORDS: Large detector systems for particle and astroparticle physics; Particle tracking detectors; Radiation-hard detectors; Solid state detectors

## Contents

- 1 Phase-1 CMS pixel detector
- 2 Operation experience
  - 2.1 Single Event Upsets
  - 2.2 DCDC failure
  - 2.3 Evaporative CO<sub>2</sub>-cooling
- 3 Performance and radiation effects
  - 3.1 Detector timing
  - 3.2 Readout chip thresholds
  - 3.3 Hit efficiency
  - 3.4 Residuals
  - 3.5 Depletion voltage
- 4 Plans and conclusions
  - 4.1 Plans during LS2
  - 4.2 Summary

## 1 Phase-1 CMS pixel detector

In 2017, the Compact Muon Solenoid (CMS) Collaboration [1] installed a new pixel detector. One of the main motivations was the expected dynamic inefficiency of the readout chip in the innermost layer at the LHC luminosities after 2017. These expected inefficiencies were mostly driven by limitations on the readout bandwidth of pixel double-columns as well as buffer sizes in the front-end chip. A typical LHC fill in 2018 started with a luminosity of  $20 \cdot 10^{33} \text{ cm}^{-2} \text{s}^{-1}$ , which would have caused an inefficiency of roughly 30% in the innermost layer. Following the typical burn-down of the proton collisions, the luminosity would still stay above  $8 \cdot 10^{33} \text{ cm}^{-2} \text{s}^{-1}$  for more than 10 hours, corresponding to an inefficiency in hit reconstruction of more than 2%. This clearly shows that an upgrade of the front-end was essential in order to ensure a successful operation of the detector in 2017 and 2018.

Therefore, the main design guidelines for the new detector were to upgrade the readout chip and increase the readout bandwidth. At the same time, the geometry was improved by adding a fourth layer in the barrel part of the detector (BPix) as well as an additional disk per side in the forward region (FPix), resulting in a full 4-hit coverage in the tracking volume of  $|\eta| < 2.5$  [2].

The first layer is located closer to the beam pipe, at r = 2.9 cm, and the fourth layer is closer to the first layer of the silicon strip detector, at r = 16 cm. The detection modules in the forward detector are arranged in a turbine-like structure. This tilted and rotated design optimizes the charge sharing between pixels and is therefore beneficial for the hit resolution in the forward direction.

Since the number of detector channels almost doubled compared to the original pixel detector, but the number of cables running into the detector had to stay the same due to space limitations, DC-DC converters are used close to the actual detector. The DC-DC converter boards are built around the magnetic field resistant and radiation hard DC-DC ASIC named FEAST2 developed by CERN [3, 4]. Service cylinders outside of the tracking volume in the high  $\eta$  region house all the service electronics (optical links, delay chips, PLL), the DC-DC power converters, and act as preheating zone for the evaporative CO<sub>2</sub> cooling [5].

The general design of the detection modules is similar to the Phase-0 pixel detector. The individual modules consist of a 285  $\mu$ m-thick n-in-n sensor with a pixel size of  $150 \times 100 \,\mu$ m<sup>2</sup> that is bump-bonded to 16 readout chips (ROCs). All modules, except the ones for layer 1, use the psi46dig chip [6], which is based on a column-drain readout architecture. The conditions for modules in layer 1, however, are so harsh that a special readout chip (PROC600) [7] has been developed for those modules. This chip is manufactured in the same 250 nm CMOS process as the psi46dig chip. It is capable of dealing with hit rates of up to 600 MHz/cm<sup>2</sup>. The PROC600 chip makes use of the dynamic cluster drain readout architecture. Both readout chips work in a zero-suppression mode and feature an 8-bit on-chip pulse-height digitization. Every module also has at least one communication chip (TBM), which acts as a communication hub, distributes fast signals to the ROCs and orchestrates the serialized readout.

As outlined above, the Phase-1 pixel detector incorporates many key technologies of modern particle detectors. Therefore in the time of converging detector designs for several Phase-2 upgrade projects, it is worth looking back to two years of operation of this detector and discussing the experience with those technologies.

## 2 Operation experience

#### 2.1 Single Event Upsets

In high radiation environments like at the position of the pixel detector, soft errors from corrupted registers in the electronics due to traversing ionizing particles are an expected effect. Within the CMS pixel detector these single event upsets (SEUs) have been observed in almost all electronics components in the detector along the signal path.

The normal procedure to recover from these soft errors is to reprogram the front-end and auxiliary electronics regularly. A special algorithm has been developed that recognizes severe soft errors affecting larger parts of the detector and triggers the recovery procedure, which includes globally stopping the triggers and re-configuring the problematic component as well as all front-end chips.

In addition to the conventional SEUs, single event latch-ups in the finite state machine of the TBM have been found to occur unexpectedly. This failure stops the data transmission of all the ROCs behind the affected token chain and could be traced back to an unprotected register in the TBM. Only a power cycle could bring the TBM back into a functional state. In 2017, the disable/enable functionality of the DC-DC converters was used for that purpose. The intensive usage of this feature, however, revealed a problem in the DC-DC converter, as discussed in the next section.
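The recovery logic described above can be illustrated with a minimal sketch. It is not the CMS implementation: the `ComponentStatus` structure, the action names, and the 5% error-fraction threshold are all hypothetical and chosen only to show the sequence of pausing triggers, re-configuring the affected component and all front-end chips, and resuming.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ComponentStatus:
    """Error summary for one readout component (hypothetical structure)."""
    name: str
    channels_total: int
    channels_in_error: int


def is_severe(status: ComponentStatus, max_error_fraction: float = 0.05) -> bool:
    """Flag a component whose fraction of erroneous channels exceeds the limit.

    The default limit of 5% is an illustrative assumption.
    """
    if status.channels_total <= 0:
        return False
    return status.channels_in_error / status.channels_total > max_error_fraction


def plan_recovery(statuses: List[ComponentStatus],
                  max_error_fraction: float = 0.05) -> List[str]:
    """Return the ordered recovery actions, or an empty list if none are needed."""
    affected = [s.name for s in statuses if is_severe(s, max_error_fraction)]
    if not affected:
        return []
    actions = ["pause_triggers"]  # stop triggers globally first
    actions += [f"reconfigure:{name}" for name in affected]
    # All front-end chips are re-configured as well, as stated in the text.
    actions += ["reconfigure:all_front_end_chips", "resume_triggers"]
    return actions
```

Calling `plan_recovery` with only healthy components returns an empty list, so no triggers are paused unless at least one component crosses the assumed threshold.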

#### 2.2 DCDC failure

On the 5th of October 2017, the first DC-DC converter showed a malfunction: after power-cycling, the converter could not be turned on again. In the following days, many more converters broke. From an extrapolation of the failure rate, it became clear that, without intervention, the tracking performance would be severely affected by mid-2018. For that reason, the detector was extracted in the year-end technical stop (YETS) of the LHC in 2017. On this occasion, all DC-DC converters in the detector were replaced with fresh converters of the same type, but with a larger on-board fuse. The larger fuse was selected to allow operating the DC-DC converters at a lower input voltage, which in turn implies larger input currents. Despite tremendous efforts to reproduce the DC-DC failure outside of CMS, not a single DC-DC converter could be damaged in the lab in a similar way before the detector had to be reinstalled at the end of the YETS. However, the current-voltage (I-V) characteristics of all of the roughly 1200 extracted DC-DC converters have been measured. A total of 65 converters were found to be broken, as expected from the operation period, while 333 converters showed an increased power consumption in the disabled state, which was a surprising result.

Furthermore, it was found that modules behind broken DC-DC converters were also damaged. The turned-off readout chip could not drain the leakage currents of the sensor efficiently when high voltage was applied. This gradually damaged the pre-amplifier. The damage was found to be proportional to the leakage current of the sensor as well as the time the modules spent in the state with high voltage enabled and low voltage off. The most severe damage occurred in eight modules in layer 1, of which, fortunately, six could be accessed and replaced during the YETS.

Only in mid-2018 did the designers of the FEAST2 chip manage to reproduce the breaking symptoms under lab conditions [8]. Shortly afterwards, the cause of the ASIC failure was identified as one transistor that was not sufficiently protected against leakage currents originating from TID damage. In the disabled state, these leakage currents are amplified and charge up a capacitor to the input voltage (11.4 V in 2017) instead of the specified 3.3 V. The discharge when the converter gets enabled can damage the connected electronics irreversibly. Alternatively, a new ohmic path to ground can be formed. The latter case explains the occurrence of converters with higher power consumption in the disabled state. After these findings, operational procedures were adjusted to no longer disable the DC-DC converters, but instead use the power supply itself to power-cycle the latched-up TBM channels. One power supply channel supplies between 3 and 16 modules with low voltage, depending on the position in the detector. Lowering the input voltage to 9 V, which was possible due to the different on-board fuse, also turned out to reduce the risk of damaging converters. The power-cycling was performed exclusively in between two LHC fills. No converters were broken in 2018.
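The link between the lower input voltage and the larger fuse can be made explicit with a rough estimate. Assuming, as a simplification not stated in the text, that a converter delivers the same output power $P_{\text{out}}$ at the same efficiency $\epsilon$ for both input voltages, the input current scales inversely with the input voltage:

$$
I_{\text{in}} = \frac{P_{\text{out}}}{\epsilon \, V_{\text{in}}}, \qquad \frac{I_{\text{in}}(9\,\text{V})}{I_{\text{in}}(11.4\,\text{V})} = \frac{11.4}{9} \approx 1.27 .
$$

Under this assumption, operating at 9 V instead of 11.4 V draws roughly 27% more input current, which is why a fuse with a higher rating is needed. In practice the converter efficiency depends on the input voltage, so the actual increase may differ.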

#### 2.3 Evaporative CO<sub>2</sub>-cooling

The operation of the CO<sub>2</sub> cooling plant was very stable and reliable during 2017 and 2018. No detector downtime has been attributed to the cooling system. In BPix, the cooling loops inside the detector cover in one layer an area of  $\Delta \phi \approx 90^{\circ}$  (layers 1 and 2) or  $\Delta \phi \approx 45^{\circ}$  (layers 3 and 4). The loops meander from one end to the other beneath the carbon fiber support structure of the modules. Along the cooling pipes a significant temperature drop of up to 5 K has been observed. This inhomogeneity has a direct impact on the leakage currents of the silicon sensors within the same layer. In order to mitigate the effect, the CO<sub>2</sub> mass flow has been lowered in two steps from 2.5 g/s to 1.8 g/s during the two years and a slight improvement of the inhomogeneity could be achieved.

**Figure 1**. Measurements in a thermal mock-up of one half of layer 2 showed a similar temperature drop of about 5 K along a cooling pipe as observed in the detector. The temperature at an azimuthal angle  $\phi$  indicates the average of the temperature reading of all eight modules at this  $\phi$  position. A reduced flow can mitigate the effect by 1 K to 1.5 K.
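To give a feeling for why a temperature difference of a few kelvin along a cooling loop matters for the leakage current, the following sketch evaluates the commonly used scaling $I(T) \propto T^2 \exp(-E_{\text{eff}} / 2 k_B T)$ for silicon sensors. The effective energy of 1.21 eV is the value usually quoted for this scaling, and the sensor temperatures passed in are purely illustrative assumptions; neither is taken from this paper.

```python
import math

K_B_EV_PER_K = 8.617333e-5  # Boltzmann constant in eV/K
E_EFF_EV = 1.21             # effective energy commonly used for this scaling (assumption)


def leakage_current_ratio(t_warm_c: float, t_cold_c: float) -> float:
    """Ratio I(t_warm) / I(t_cold) of silicon leakage currents.

    Temperatures are given in degrees Celsius and converted to kelvin.
    Uses I(T) ~ T^2 * exp(-E_eff / (2 k_B T)).
    """
    t_warm = t_warm_c + 273.15
    t_cold = t_cold_c + 273.15
    exponent = -E_EFF_EV / (2.0 * K_B_EV_PER_K) * (1.0 / t_warm - 1.0 / t_cold)
    return (t_warm / t_cold) ** 2 * math.exp(exponent)


# Illustrative only: two sensors on the same loop, 5 K apart.
ratio = leakage_current_ratio(t_warm_c=-5.0, t_cold_c=-10.0)
```

For the assumed temperatures of -10 °C and -5 °C the ratio is roughly 1.7, so a 5 K difference along a loop would translate into a leakage current difference of several tens of percent between otherwise identical sensors.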

A thermal mock-up of one half of layer 2, which is equipped with silicon heater modules as well as a large number of temperature probes, was set into operation in 2018. This mock-up turned out to be a key component in understanding the thermal behaviour of the actual detector. As shown in figure 1, the temperature gradient as well as the dependence of this gradient on the mass flow could be reproduced with a realistic set of parameters for preheating and module power. In a next iteration, the setup will be equipped with additional pressure sensors in order to further improve the thermal model of the actual detector.

Another lesson learned in 2018 is the limited capability of CO<sub>2</sub> to warm up the detector. Evaporative cooling is by definition only efficient in the boiling phase and cannot be used for heating if there is no heat dissipation inside the detector. This is especially important to be aware of for controlled annealing attempts of the silicon sensors and for safety considerations if there is a problem with the dry gas supply.

## 3 Performance and radiation effects

#### 3.1 Detector timing

One of the essential steps during the detector commissioning is the adjustment of the global timing with respect to the LHC collisions. It has been found that the readout chip in layer 1 (PROC600) is half a clock cycle (12.5 ns) faster than the readout chips of the other layers and disks. Unfortunately, layer 1 and layer 2 share by construction the same trigger and clock distribution in a given  $\phi$  region. In figure 2, the measured hit efficiency is shown for different layers, where the global delay has been changed for all layers simultaneously. The fully efficient overlap region of layer 1 and layer 2 is only a few nanoseconds wide. The working point for the delay has been selected such that layer 2 is read out slightly too early and layer 1 slightly too late. This optimizes the charge collection in layer 1, while it is not ideal for layer 2. In a future version of the TBM, which will be used for the rebuild of layer 1 during LHC long shutdown 2 (LS2), individual delays for clock and trigger can be configured for each single TBM. Although the solution of 2017 and 2018 has proven to work reliably, the additional tuning capabilities will allow improving the local reconstruction performance, in particular in layer 2.

**Figure 2**. Measured hit efficiency in different barrel layers for different global delays. The dotted lines indicate the settings selected in 2018. The shaded areas should not be used for further interpretation as in these areas global tracking inefficiencies dominate the observed hit inefficiencies.

#### 3.2 Readout chip thresholds

The thresholds of the readout chips have been adjusted to about 1200 electrons for BPix and 1700 electrons for FPix. In the presence of collisions, additional hits correlated with real hits have been observed in layer 1. This cross-talk is highly rate dependent. In order to suppress this effect, the thresholds in layer 1 have been set to a value well above 2000 electrons. The distribution of the measured thresholds in the barrel pixel detector is depicted in figure 3 (left). The sources of the cross-talk have been identified and the effect can be reduced significantly if the programming sequence of the front-end settings is changed. The changed sequence shuts off some control lines that run over several pixels in the readout chip and otherwise transmit the cross-talk. The effectiveness of this change was demonstrated in one of the last fills of the LHC during Run-2, where thresholds could be lowered significantly. The new version of the PROC600 used in the new layer 1 will be further improved with respect to shielding, such that after LS2 the thresholds in layer 1 are expected to be as low as in the other layers and disks.

#### 3.3 Hit efficiency

Figure 3 (right) shows the hit efficiency as a function of instantaneous luminosity. The overall performance is very good, with hit efficiencies above 99% in layers 2 to 4 and all disks. In layer 1, some inefficiency can be spotted in regions with very low ( $< 1 \cdot 10^{33} \text{ cm}^{-2} \text{s}^{-1}$ ) and high ( $> 14 \cdot 10^{33} \text{ cm}^{-2} \text{s}^{-1}$ ) luminosities. These effects originate from a problem in the buffer logic in the double column periphery of the readout chip, which requires frequent resets. The issue will also be addressed in the next version of PROC600.

**Figure 3**. Left: distribution of measured thresholds in the barrel pixel detector. Layer 1 has a larger threshold in order to mitigate cross-talk effects during collisions. Right: hit efficiency for different instantaneous luminosities. Some inefficiency in layer 1 towards low and high luminosities is caused by a problem in the buffer logic in the double column periphery.

#### 3.4 Residuals

The residuals shown in figure 4 are extracted with a triplet method, in which the measured hit position in a layer is compared with the position interpolated from the neighboring layers. As an example, for layer 2 the residuals are determined to be  $\Delta_{r-\phi}^{\text{layer 2}} = 13 \,\mu\text{m}$  and  $\Delta_z^{\text{layer 2}} = 30 \,\mu\text{m}$ , and in the forward region in disk 2 to be  $\Delta_{r-\phi}^{\text{disk2}} = 12 \,\mu\text{m}$  and  $\Delta_z^{\text{disk2}} = 19 \,\mu\text{m}$ . These numbers do not correspond to the intrinsic hit resolution, as they still include the uncertainty of the measurement method itself. However, they can be understood as an upper limit on the intrinsic hit resolution.
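The idea behind the triplet method, and the reason the residual width is only an upper limit on the intrinsic resolution, can be sketched as follows. This is a simplified straight-line illustration in the r-z plane, not the CMS reconstruction code; the function names and the assumption of uncorrelated Gaussian hit uncertainties are introduced here for illustration only.

```python
import math


def interpolation_fraction(r_inner: float, r_middle: float, r_outer: float) -> float:
    """Relative position of the middle layer between the two outer layers."""
    return (r_middle - r_inner) / (r_outer - r_inner)


def triplet_residual(r1: float, z1: float,
                     r2: float, z2: float,
                     r3: float, z3: float) -> float:
    """Measured z in the middle layer minus the straight-line interpolation."""
    t = interpolation_fraction(r1, r2, r3)
    z_predicted = z1 + t * (z3 - z1)
    return z2 - z_predicted


def residual_width(sigma1: float, sigma2: float, sigma3: float, t: float) -> float:
    """Expected residual width for uncorrelated hit uncertainties.

    The interpolated position is (1 - t) * z1 + t * z3, so its uncertainty
    adds in quadrature to that of the hit in the middle layer.
    """
    return math.sqrt(sigma2 ** 2 + ((1.0 - t) * sigma1) ** 2 + (t * sigma3) ** 2)
```

For three layers with equal resolution and the middle layer halfway between the other two (`t = 0.5`), the residual width is larger than the single-hit resolution by a factor of $\sqrt{3/2} \approx 1.22$, which illustrates why the quoted residuals bound the intrinsic resolution from above.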

#### 3.5 Depletion voltage

In 2018, considerable effort was invested in improving the monitoring of the sensor properties during irradiation. Apart from a few high voltage bias scans on the full detector, a bias scan has been performed every 1–2 weeks on a representative subset of modules in all layers and disks. Thanks to a careful selection of the module groups, avoiding an overlap in the  $\eta - \phi$  region, the scan of all layers could be performed simultaneously without affecting the data quality. Figure 5 (left) shows the average normalized on-track cluster charge as a function of the bias voltage for layer 2. It is apparent that with increasing dose the collected charge for a given bias voltage is reduced. The position of the inflection point of the curves, which is considered to be a good estimate of the depletion voltage, also moves to higher values with increasing irradiation.
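The text does not specify how the turnover point is extracted from a bias scan. One simple possibility, shown here only as a sketch, is to take the scan point where the curve bends most strongly towards the plateau, i.e. where the discrete second derivative of the cluster charge is most negative. The function name and the toy scan used in the usage note are hypothetical.

```python
from typing import Optional, Sequence


def kink_voltage(voltages: Sequence[float],
                 charges: Sequence[float]) -> Optional[float]:
    """Return the scan voltage with the most negative discrete curvature.

    Works for non-uniform voltage steps. Returns None if the curve never
    bends downwards (for example, a straight line).
    """
    if len(voltages) != len(charges) or len(voltages) < 3:
        raise ValueError("need at least three matching scan points")
    best_voltage = None
    best_curvature = 0.0
    for i in range(1, len(voltages) - 1):
        step_left = voltages[i] - voltages[i - 1]
        step_right = voltages[i + 1] - voltages[i]
        slope_left = (charges[i] - charges[i - 1]) / step_left
        slope_right = (charges[i + 1] - charges[i]) / step_right
        # Second-derivative estimate from the change of slope across point i.
        curvature = 2.0 * (slope_right - slope_left) / (step_left + step_right)
        if curvature < best_curvature:
            best_curvature = curvature
            best_voltage = voltages[i]
    return best_voltage
```

As a toy check, a scan that rises linearly and saturates at 200 V, such as voltages `[0, 50, 100, 150, 200, 250, 300]` with charges `[0, 0.25, 0.5, 0.75, 1, 1, 1]`, yields 200. Real scans are noisy and smoother around the turnover, so a fit-based extraction would likely be more robust than this point-wise estimate.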



**Figure 4**. Hit residuals for layer 2 and disk 2. Template reconstruction refers to the standard offline method for reconstructing hits, taking into account a detailed cluster shape simulation predicted by PixelAv [9].

Figure 5 (right) compares the measured depletion voltage (points) with a simulation based on the Hamburg model [10]. This simulation takes into account the complete thermal history of the detector as well as the fluence profile. Although the model needs further fine tuning, predictions and measurements already agree at the level of 20%. It is thus a very valuable tool for detector operation when planning adjustments of operation parameters such as bias settings.

## 4 Plans and conclusions

#### 4.1 Plans during LS2

During the two-year shutdown of the LHC from 2019 to 2021, the detector will be extracted from CMS and stored in cold boxes in surface clean rooms. In order to prevent reverse annealing, the temperature of the detector will be kept around 0  $^{\circ}$ C most of the time.

Of course the detector will also undergo some general maintenance work. All DC-DC converters will be exchanged with a new version of the ASIC that is no longer susceptible to the breaking mechanism explained in section 2.2. Faulty components will be investigated and repaired, if they are accessible. The power supplies will be upgraded such that they overcome their present limit of 600 V and can be operated up to 800 V. This will ensure the efficient operation of the detector until the end of its lifetime.

**Figure 5**. Left: average normalized on-track cluster charge in layer 2 for different bias settings. The inflection point where the cluster charge reaches a plateau moves to higher values with increasing dose. Right: comparison of measured (markers) and simulated (lines) depletion voltage over time.

As foreseen in the TDR, the entire layer 1 will be replaced during LS2. This layer suffered the most from radiation damage due to its proximity to the interaction point. The exchange is necessary in order to guarantee high quality data until the end of LHC Run-3. The replacement is also taken as an opportunity to use new versions of the TBM and the PROC600. The new TBM will address the latch-up issue and will make it possible to adjust the timing of layer 1 and layer 2 relative to each other. The new PROC600 will mitigate the cross-talk problem as well as the inefficiency at low and high instantaneous luminosities.

#### 4.2 Summary

The CMS Phase-1 pixel detector has been operated successfully since 2017. At the end of 2017, the detector had to be extracted due to a massive DC-DC converter failure. Thanks to the right actions, operation in 2018 was smooth. In all cases where unforeseen complications occurred the impact on the data quality was minimal. The detector will be refurbished during LS2 such that it will be in the best possible shape at the beginning of LHC Run-3.

## References

- [1] CMS collaboration, *The CMS Experiment at the CERN LHC*, 2008 *JINST* **3** S08004.
- [2] CMS collaboration, *CMS Technical Design Report for the Pixel Detector Upgrade*, CERN-LHCC-2012-016.
- [3] L. Feld, W. Karpinski, K. Klein, M. Lipinski, M. Preuten, M. Rauch et al., *The DC-DC conversion power system of the CMS Phase-1 pixel upgrade*, 2015 *JINST* **10** C01052.
- [4] http://project-dcdc.web.cern.ch/project-DCDC/.
- [5] J. Daguin, K. Arndt, W. Bertl, J. Noite, P. Petagna, H. Postema et al., *Evaporative CO<sub>2</sub> cooling system for the upgrade of the CMS pixel detector at CERN*, in *Proceedings of the 13<sup>th</sup> InterSociety Conference on Thermal and Thermomechanical Phenomena in Electronic Systems*, San Diego, CA, U.S.A., 30 May – 1 June 2012, pp. 723–731.
- [6] B. Meier, *CMS pixel detector with new digital readout architecture*, 2011 *JINST* **6** C01011.
- [7] CMS collaboration, *High rate capability and radiation tolerance of the PROC600 readout chip for the CMS pixel detector*, 2017 *JINST* **12** C01078.
- [8] https://project-dcdc.web.cern.ch/project-dcdc/public/Documents/SummaryMeasurements18.pdf.
- [9] M. Swartz, *A Detailed Simulation of the CMS Pixel Sensor*, CMS-NOTE-2002-027 (2002).
- [10] M. Moll, *Radiation damage in silicon particle detectors: Microscopic defects and macroscopic properties*, PhD Thesis, Hamburg University (1999) [DESY-THESIS-1999-040].