# A digital ASIC for sub-ns timing with the LHCb RICH detectors in Run 4

C. Arnaboldi<sup>1</sup>, M. Calvi<sup>1</sup>, S. Capelli<sup>1</sup>, P. Carniti<sup>1</sup>, F. Fontanelli<sup>2</sup>, C. Gotti<sup>1</sup>, S. Minutoli<sup>2</sup>, G. Pessina<sup>1</sup>, A. Petrolini<sup>2</sup>, G. Simi<sup>3</sup>

<sup>1</sup> University and INFN Milano-Bicocca
<sup>2</sup> University and INFN Genova
<sup>3</sup> University and INFN Padova

## 1 Introduction

This note complements the LHCb RICH Future Upgrades FTDR [1], which envisions possible evolutions of the LHCb RICH detectors in LHC Run 4 and Run 5. The addition of sub-ns time resolving capability to the LHCb RICH detectors is one of the key points, and would lead to reduced background, pile-up mitigation and ultimately improved PID. Examples of the improvement in signal over background rejection capabilities that can be obtained in Run 4 with the proposed system are described in [2].

This document describes the proposal to develop a fully digital, radiation-hard ASIC to enable single photon counting with sub-ns resolution in the LHCb RICH detectors during Run 4. The proposed scheme is arguably the most cost-effective way to reach the goal of a RICH detector with sub-ns timing resolution, since it aims to make the best use of the timing capabilities of the LHCb RICH photodetectors (MaPMTs) while reusing as much as possible of the front-end electronics installed for Run 3.

## 2 Time resolution of the Elementary Cell

In the LHCb RICH detectors of Run 3, multi-anode photomultipliers (MaPMTs) are read out with the CLARO ASIC, a fast, low power, rad-hard amplifier and discriminator with programmable gain and threshold, specifically tailored for the LHCb RICH detectors [3]. MaPMTs and CLAROs are parts of the so-called Elementary Cell (EC), designed to minimize the distance between MaPMT anodes and CLARO inputs for optimal noise performance [4]. The CLARO output signals are routed from the EC to the digital board (PDMDB), where FPGAs collect data and manage off-detector Gigabit transmission through optical fibers.

The time resolution of the EC was measured with the setup shown in Figure 1, left side. The MaPMT was illuminated with fast laser pulses (70 ps FWHM) from a Hamamatsu PLP-10 with 405 nm head. The output signals from the CLAROs were buffered with wide bandwidth operational amplifiers (LMH6702) on the board attached to the back of the EC, and acquired with a Rohde & Schwarz RTO1044 oscilloscope (4 GHz, 20 GS/s). An STM32 microcontroller on the Nucleo board was used to program the CLARO gain and thresholds via SPI to a typical single photon counting configuration.

For any given single photon rate, 1000 waveforms were acquired, and the timing of the output signals was measured, taking the output trigger signal of the laser driver as reference. The time resolution of the full elementary cell was compared to that obtained from the MaPMT alone, with anodes directly buffered to the oscilloscope with fast operational amplifiers. The average time resolution measured with the MaPMT alone is 174 ps RMS, which becomes 193 ps RMS for the full EC. The CLARO is thus found to contribute only marginally to the overall time resolution of the EC, which is dominated by the transit time spread of the MaPMTs. From the quadratic difference of these two sets of measurements, the average CLARO contribution can be estimated to be about 80 ps RMS. No dependence on signal rate is observed up to photon rates well into the MHz range.
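The quadratic subtraction can be checked with a few lines of Python. This is only a sketch of the arithmetic: it takes the two measured resolutions as 174 ps and 193 ps RMS and assumes that the MaPMT and CLARO contributions are independent, so that they add in quadrature. The function and variable names are illustrative.

```python
import math

def quadrature_difference(total_rms_ps: float, part_rms_ps: float) -> float:
    """RMS of the remaining term, assuming independent contributions that add in quadrature."""
    return math.sqrt(total_rms_ps**2 - part_rms_ps**2)

SIGMA_MAPMT_PS = 174.0  # MaPMT alone (RMS)
SIGMA_EC_PS = 193.0     # full Elementary Cell, MaPMT + CLARO (RMS)

sigma_claro_ps = quadrature_difference(SIGMA_EC_PS, SIGMA_MAPMT_PS)
print(f"Estimated CLARO contribution: {sigma_claro_ps:.1f} ps RMS")
```

The result lies between 83 ps and 84 ps, consistent with the figure of about 80 ps RMS quoted above.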

The time walk of the CLARO was compensated with a time over threshold technique: the duration of each signal was determined and used to correct the leading edge time, which would otherwise be affected by signal amplitude. Figure 2 shows the time over threshold measurement. The time over threshold ranges from 4 ns to 18 ns for typical single photons, with most signals between 10 ns and 18 ns, reflecting the shape of the amplitude spectrum from the MaPMT. At the same time the measured (uncompensated) time of arrival varies by about 5 ns. By interpolating the distribution, time walk can be corrected, to obtain the results of Figure 1.
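The correction procedure can be sketched as follows. This is a minimal illustration on synthetic data, not the analysis code used for the measurement: the linear walk model, the 200 ps jitter, the number of ToT bins and all names are assumptions made for the example. The mean time of arrival is evaluated in bins of time over threshold, interpolated, and subtracted hit by hit.

```python
import numpy as np

def time_walk_correction(toa_ns, tot_ns, n_bins=14):
    """Subtract from each hit the mean ToA expected for its ToT (piecewise-linear interpolation)."""
    toa_ns = np.asarray(toa_ns, dtype=float)
    tot_ns = np.asarray(tot_ns, dtype=float)
    edges = np.linspace(tot_ns.min(), tot_ns.max(), n_bins + 1)
    # Bin index of every hit; clip so that the largest ToT falls in the last bin.
    idx = np.clip(np.digitize(tot_ns, edges) - 1, 0, n_bins - 1)
    centers, means = [], []
    for b in range(n_bins):
        sel = idx == b
        if sel.any():  # skip empty ToT bins
            centers.append(tot_ns[sel].mean())
            means.append(toa_ns[sel].mean())
    walk = np.interp(tot_ns, centers, means)  # interpolated ToA-vs-ToT curve
    return toa_ns - walk

# Synthetic hits (assumed toy model, not measured data).
rng = np.random.default_rng(seed=1)
n_hits = 1000
tot = rng.uniform(4.0, 18.0, n_hits)       # ToT range quoted in the text, in ns
walk_true = 5.0 * (18.0 - tot) / 14.0      # toy model: 5 ns walk across the ToT range
jitter = rng.normal(0.0, 0.2, n_hits)      # toy 200 ps RMS intrinsic jitter
toa = walk_true + jitter

toa_corr = time_walk_correction(toa, tot)
print(f"RMS before: {toa.std():.2f} ns, after: {toa_corr.std():.2f} ns")
```

In this toy model the spread after correction is close to the injected jitter, while the uncorrected spread is dominated by the walk. In a real system the correction curve would be derived from calibration data rather than from the hits being corrected.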



Figure 1: On the left, setup used for timing characterization of the LHCb RICH Elementary Cell. On the right, measured time resolution from just the MaPMT, and from the full Elementary Cell (MaPMT + CLARO). The CLARO time walk was compensated with time over threshold.



Figure 2: Time over threshold measurement, used to compensate the timing results of Figure 1.

In the RICH detectors of Run 3, the CLARO time walk is not compensated. Each photon hit is rendered as a single bit, 1 or 0. Transmitting additional time over threshold information would require additional bandwidth. Furthermore, the smallest sampling bins that can be used in the FPGA are 3.125 ns (already a factor 8 beyond the original design target of 25 ns). As a consequence, the time resolution of the EC is not leveraged to its fullest, and the overall resolution of the system is limited to a few ns.

By replacing the FPGA with a rad-hard digital ASIC capable of sub-nanosecond sampling, and by enabling compensation of the CLARO time walk (online or offline) with time over threshold measurement, these two limitations (uncompensated time walk and coarse sampling) could be eliminated. The intrinsic time resolution of the existing elementary cells ($\sim 200$ ps RMS) could then be fully exploited, an improvement of about one order of magnitude with respect to the conditions of Run 3. This proposal is schematically depicted in Figure 3.

## 3 Effect of binning

The results of the previous section were obtained starting from waveforms acquired by the oscilloscope at 20 ps sampling steps. Such fine sampling is clearly challenging to achieve in an ASIC, and also unnecessary, given that the MaPMTs limit the timing resolution to $\sim 200$ ps RMS in any case.

To study the effect that coarser time bins would have on the effectiveness of time over threshold correction and on the overall time resolution, the acquired waveforms were decimated without interpolation, as if they were acquired at lower sampling rates. The results are shown in Figure 4. The columns show time bins for the falling edge of the signals, while rows show the time bins for the rising edge of the signals. Variations are shown relative to the 20 ps case. As can be seen, the deterioration due to coarser time bins is marginal (below 3%) up to 195 ps time bins for the rising edge and 390 ps for the falling edge. This is not surprising, since 195 ps time bins correspond to an RMS contribution of $195\ \mathrm{ps}/\sqrt{12} \approx 56$ ps, which in quadrature becomes negligible compared to the ~200 ps RMS resolution of the EC. The falling edge is only used for time over threshold correction, hence a 2x coarser time bin compared to the rising edge is acceptable. The deterioration is still small (<10%) when 390 ps bins are used for the rising edge and 780 ps bins are used for the falling edge.
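Decimation without interpolation can be illustrated with a toy waveform. The sketch below is not the analysis code: the pulse shape, the threshold and the helper names are assumptions, and the decimation factors only approximate the bin widths studied here. It keeps every k-th sample of a 20 ps grid and takes the edge times from the first sample above (or back below) threshold, so that each edge is quantized to the coarser bin.

```python
import numpy as np

def edge_times(t_ps, v, threshold):
    """Times of the first sample above threshold and of the first sample back below it."""
    above = np.asarray(v) > threshold
    rise = int(np.argmax(above))                 # first sample above threshold
    fall = rise + int(np.argmax(~above[rise:]))  # first sample below threshold after the rise
    return int(t_ps[rise]), int(t_ps[fall])

def decimate(t_ps, v, factor):
    """Keep every `factor`-th sample, with no interpolation or filtering."""
    return t_ps[::factor], v[::factor]

# Toy pulse on a 20 ps grid: 1 ns linear rise, 10 ns flat top, 2 ns linear fall.
dt_ps = 20
t = np.arange(0, 40_000, dt_ps)
v = np.interp(t, [0, 5_000, 6_000, 16_000, 18_000, 40_000], [0, 0, 1, 1, 0, 0])

rise_ref, fall_ref = edge_times(t, v, threshold=0.5)
for factor in (1, 2, 5, 10, 20, 39):
    td, vd = decimate(t, v, factor)
    rise, fall = edge_times(td, vd, threshold=0.5)
    print(f"bin {factor * dt_ps:3d} ps: rise shift {rise - rise_ref:3d} ps, "
          f"fall shift {fall - fall_ref:3d} ps, ToT {fall - rise} ps")
```

For each decimation factor both edges move later by less than one coarse bin. Over many pulses with random phase this shift is uniformly distributed, which is where the bin width divided by $\sqrt{12}$ comes from.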



Figure 3: Proposed redesign of the digital part of the RICH front-end system, while leaving the Elementary Cells untouched.

| Rise bin [ps] (rows) / Fall bin [ps] (columns) | 20 | 48 | 96 | 195 | 390 | 780 |
|--------------------------------|-----------------|---------|--------|--------|--------|--------|
| 20                             | 215 ps (reference) | 0.00%   | 0.00%  | 1.40%  | 1.40%  | 5.12%  |
| 48                             | -0.47%          | -0.47%  | 0.47%  | 1.86%  | 1.86%  | 5.58%  |
| 96                             | 0.93%           | 0.93%   | 2.33%  | 2.33%  | 1.86%  | 6.05%  |
| 195                            | 1.40%           | 1.40%   | 1.40%  | 1.86%  | 2.79%  | 7.44%  |
| 390                            | 4.19%           | 3.72%   | 4.19%  | 6.05%  | 7.91%  | 9.30%  |
| 780                            | 19.53%          | 19.53%  | 19.53% | 20.00% | 21.86% | 25.58% |

Figure 4: Effect of coarse time bins (lower sampling rates) on the effectiveness of time over threshold correction and overall time resolution.

The two time bin options selected for the rising edge, 195 ps and 390 ps, reflect the clock frequencies that will need to be available in the digital ASIC. Considering a double data rate (DDR) architecture, 390 ps time bins could be readily derived from a 1.28 GHz clock, directly provided by the lpGBT ASICs that would manage data transmission. Time bins down to 195 ps require a 2.56 GHz clock, which could be obtained by including inside the ASIC a rad-hard PLL (IP developed for the lpGBT).
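The relation between clock frequency and time bin in a DDR scheme is simply half a clock period, since both clock edges are used for sampling. A short sketch of this arithmetic follows; the function name is illustrative.

```python
def ddr_time_bin_ps(clock_hz: float) -> float:
    """Time bin when sampling on both clock edges: half a clock period, in ps."""
    return 1e12 / (2.0 * clock_hz)

for f_hz in (1.28e9, 2.56e9):
    print(f"{f_hz / 1e9:.2f} GHz clock -> {ddr_time_bin_ps(f_hz):.1f} ps time bin")
```

The exact values are 390.625 ps and 195.3125 ps, which are quoted as 390 ps and 195 ps in the text.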

Given the overall ranges of time measurement for both the rising and falling edges, the choice of time bins determines the event size of a single hit: 8 bits are required to represent the time information of a single photon hit with 390/780 ps sampling (rise/fall), while 10 bits are required to double the precision to 195/390 ps sampling (rise/fall).
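The hit sizes can be reproduced with a simple bit count. The sketch below rests on two assumptions: the ranges are the gate widths quoted in Section 4 (6.25 ns for the time of arrival, 12.5 ns for the time over threshold), and the time of arrival is encoded in units of the rising-edge bin while the time over threshold is encoded in units of the falling-edge bin. Exact fractions are used to avoid rounding at the bin boundaries; all names are illustrative.

```python
from fractions import Fraction
from math import ceil

TOA_RANGE_PS = Fraction(6250)    # 6.25 ns ToA gate
TOT_RANGE_PS = Fraction(12500)   # 12.5 ns ToT gate

def bits_for(range_ps: Fraction, bin_ps: Fraction) -> int:
    """Bits needed to label every time bin inside the given range."""
    n_bins = ceil(range_ps / bin_ps)
    return max(1, (n_bins - 1).bit_length())

def hit_size_bits(rise_bin_ps: Fraction, fall_bin_ps: Fraction) -> int:
    """ToA counted in rising-edge bins plus ToT counted in falling-edge bins (assumed encoding)."""
    return bits_for(TOA_RANGE_PS, rise_bin_ps) + bits_for(TOT_RANGE_PS, fall_bin_ps)

coarse = hit_size_bits(Fraction("390.625"), Fraction("781.25"))   # 1.28 GHz DDR
fine = hit_size_bits(Fraction("195.3125"), Fraction("390.625"))   # 2.56 GHz DDR
print(f"390/780 ps sampling: {coarse} bits; 195/390 ps sampling: {fine} bits")
```

With these assumptions each field spans 16 bins (4 bits) in the coarse option and 32 bins (5 bits) in the fine option, giving the 8 and 10 bits stated above.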

## 4 ASIC description

Figure 5 shows a block diagram of the proposed digital ASIC. Since the required sampling bins are relatively coarse, the TDC architecture can be simplified by using DDR counters latched by the CLARO digital signals. Coarser time bins also translate directly into lower power consumption and less demanding cooling requirements. A design variant is shown in Figure 6, where the only difference is the use of deserializers at the input in place of counters. The choice between the two input stages is clearly a critical aspect, which will be carefully evaluated during the design phase.

A multiplexer selects the clock source, either an external clock or the internal clock generated by the PLL. The CLARO digital signals are buffered and level-shifted to adapt the voltage level to the one required by the technology. The following blocks (time of arrival, ToA, and time over threshold, ToT, encoders, and the hit builder) apply a programmable gate (shutter) to the digitized time information and perform data compression. On-chip time walk compensation based on a redundant LUT is also being considered.



Figure 5: Block diagram for the digital ASIC for Upgrade 1b (variant 1).



Figure 6: Block diagram for the digital ASIC for Upgrade 1b (variant 2).

Data from different channels is then assembled in eLink frames and sent to the lpGBT for transmission off-detector. A major concern that has to be addressed is related to the high bandwidth required, especially in high occupancy areas of the detector. The hit size can be optimized by selecting only a narrow gate on both the ToA (6.25 ns range with tunable offset) and ToT (12.5 ns range). This choice corresponds to a hit size of 8 bits (1.28 GHz sampling clock) or 10 bits (2.56 GHz sampling clock). Different strategies for data compression are being considered: the more canonical zero suppression and an optimized "zero compression" algorithm, based on simplified (static) lossless Huffman coding, more efficient than zero suppression in high occupancy regions.
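The trade-off between the two compression strategies can be illustrated with a simple bit count. The encodings below are hypothetical and are not the algorithms under study: zero suppression is modelled as a channel address plus payload for each hit, and the alternative is modelled as the simplest static prefix code, with one flag bit per channel (0 for no hit, 1 followed by the payload for a hit). The value of 64 channels per ASIC is the one assumed in the cost estimate; frame headers and other overheads are ignored.

```python
N_CHANNELS = 64                               # channels per ASIC (value assumed in the cost estimate)
ADDRESS_BITS = (N_CHANNELS - 1).bit_length()  # 6 bits to address one channel

def zero_suppressed_bits(n_hits: int, hit_bits: int) -> int:
    """Hypothetical zero suppression: channel address plus payload for each hit."""
    return n_hits * (ADDRESS_BITS + hit_bits)

def flag_coded_bits(n_hits: int, hit_bits: int) -> int:
    """Hypothetical static prefix code: one flag bit per channel, payload only for hits."""
    return N_CHANNELS + n_hits * hit_bits

for n_hits in (2, 10, 11, 32):
    zs = zero_suppressed_bits(n_hits, hit_bits=8)
    fc = flag_coded_bits(n_hits, hit_bits=8)
    print(f"{n_hits:2d} hits ({n_hits / N_CHANNELS:5.1%} occupancy): "
          f"zero suppression {zs} bits, flag coding {fc} bits")
```

In this toy model the flag-based code becomes more compact above 10 hits per 64 channels, which illustrates why a static code can outperform zero suppression in high occupancy regions. The actual crossover depends on the real frame format and code tables.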

The digital ASIC will be designed with radiation-hardening-by-design techniques and will use TSMC 65nm CMOS technology, qualified by CERN and widely used within the HEP community.

## 5 Estimated costs

Table 1 details the preliminary cost estimate for the project. The estimate does not consider the possibility of using multi-layer masksets (MLM), which would allow for substantial cost savings (at least a factor of 2) but are currently unavailable due to foundry overload caused by the global semiconductor shortage. The table also lists the cost of ASIC packaging, ancillary ASICs (most of which can be mounted on modules and reused for Upgrade 2), and printed circuit boards. The cost estimate assumes 64 channels per digital ASIC, with 1 digital ASIC per lpGBT in half of the detector (higher occupancy) and 2 digital ASICs per lpGBT in the other half (lower occupancy), which gives a total of 3072 lpGBT and 768 VTRX+. The unit costs for the ancillary ASICs were assumed to be 130 CHF for the VTRX+ (optical transceiver, 4 TX channels, 1 RX channel) and 25 CHF for the lpGBT (SERDES, 1 TX channel, 1 RX channel).

| ASIC production                                   |                                |             |
|---------------------------------------------------|--------------------------------|-------------|
| 65 nm                                             | Engineering run and wafers     | 600 k€      |
| Packaging                                         | BGA                            | 40 k€       |
| Other costs (packaging, PCBs, ancillary ASICs)    |                                |             |
| lpGBT*                                            |                                | 100 kCHF    |
| VTRX+*                                            |                                | 120 kCHF    |
| bPOL*                                             |                                | 50 kCHF     |
| PCBs                                              |                                | 100 kCHF    |
| (* non recurring in Upg2)                         |                                |             |
| <b>Total</b>                                      |                                | <b>1 MCHF</b> |
| of which non recurring                            |                                | 270 kCHF    |

Table 1: Estimate of the costs involved with the digital ASIC project

## References

- [1] LHCb-PUB-2021-013; CERN-LHCb-PUB-2021-013
- [2] CERN-LHCb-PUB-2021-009
- [3] M. Baszczyk et al 2017 JINST 12 P08019 https://doi.org/10.1088/1748-0221/12/08/P08019
- [4] M.K. Baszczyk et al 2017 JINST 12 P01012 https://doi.org/10.1088/1748-0221/12/01/P01012