# Submission of the First Full Scale Prototype Chip for Upgraded ATLAS Pixel Detector at LHC, FE-I4A

Marlon Barbero<sup>a,1</sup>, David Arutinov<sup>a</sup>, Roberto Beccherle<sup>b</sup>, Giovanni Darbo<sup>b</sup>, Sourabh Dube<sup>c</sup>, David Elledge<sup>c</sup>, Julien Fleury<sup>c,2</sup>, Denis Fougeron<sup>d</sup>, Maurice Garcia-Sciveres<sup>c</sup>, Fabrice Gensolen<sup>d</sup>, Dario Gnani<sup>c</sup>, Vladimir Gromov<sup>e</sup>, Frank Jensen<sup>c</sup>, Tomasz Hemperek<sup>a</sup>, Michael Karagounis<sup>a</sup>, Ruud Kluit<sup>e</sup>, Andre Kruth<sup>a</sup>, Abderrezak Mekkaoui<sup>c</sup>, Mohsine Menouni<sup>d</sup>, Jan David Schipper<sup>e</sup>, Norbert Wermes<sup>a</sup>, Vladimir Zivkovic<sup>e</sup>.

> <sup>a</sup>Physikalisches Institut Universität Bonn, Nussallee 12, 53115 Bonn, Germany
>
> <sup>b</sup>INFN Genova, via Dodecaneso 33, IT-16146 Genova, Italy
>
> <sup>c</sup>Lawrence Berkeley National Laboratory, 1 Cyclotron Road, CA 94720, USA
>
> <sup>d</sup>CPPM Aix-Marseille Université, CNRS/IN2P3, Marseille, France
>
> <sup>e</sup>NIKHEF, Science Park 105, 1098 XG Amsterdam, The Netherlands

#### Abstract

A new ATLAS pixel chip, FE-I4, is being developed for use in upgraded LHC luminosity environments, including the near-term Insertable B-Layer (IBL) upgrade. FE-I4 is designed in a 130 nm CMOS technology, presenting advantages in terms of radiation tolerance and digital logic density compared to the 0.25 $\mu$m CMOS technology used for the current ATLAS pixel IC, FE-I3. The FE-I4 architecture is based on an array of 80×336 pixels, each $50 \times 250\ \mu m^2$, consisting of analog and digital sections.

In summer 2010, a first full scale prototype, FE-I4A, was submitted for an engineering run. This IC features the full scale pixel array as well as the complex periphery of the future full-size FE-I4. The FE-I4A also contains various extra test features which should prove very useful for the chip characterization, but which deviate from the needs for standard operation of the final FE-I4 for IBL. This paper focuses on the various features implemented in the FE-I4A submission, while also underlining the main differences between the FE-I4A IC and the final FE-I4 as envisioned for IBL.

*Keywords:* Pixel detector, ATLAS, upgrade, IBL, FE-I4

## 1. Scope of the project and introduction to FE-I4

In these first years of operational experience with the LHC, the road to higher LHC luminosity is clearing up, allowing the detector communities to devise plans for detector upgrades.

The ATLAS pixel detector will see two major phases of upgrade, phase I during the year 2016 shutdown, and phase II for the High Luminosity upgrade (HL-LHC) in 2020. For the phase I upgrade, the addition of a fourth layer to the pixel system with a smaller beam pipe is foreseen: This project is called the Insertable B-layer (IBL). For the HL-LHC upgrade, a new Inner Tracker will replace the existing one.

The design of the FE-I4 started when it was realized that the FE-I3 IC [2] presently used in the ATLAS pixel detector [3] features an architecture which scales badly with hit rates higher than the ones expected for LHC full design luminosity. The FE-I3 is based on an architecture which requires transfer of every pixel hit down to data buffers belonging to the periphery of the IC. The pixel hit data fill these buffers until expiration of the trigger latency (typically of order $3\ \mu s$) before being transmitted for readout if triggered or, with much higher probability, erased if not triggered. The data transfer mechanism from the pixel to the periphery is time consuming and becomes highly inefficient at higher hit rates (for more information about hit recording inefficiencies at high hit rate in the current pixel FE and in FE-I4, see [4]).

One of the first test chips submitted in the framework of the FE-I4 collaboration at the end of 2006 was an exploratory prototype analog array [5]. This test chip was tested in 2007 and is the basis of the present analog pixel of FE-I4. The analog pixel front-end is designed for low power consumption and is compatible with several sensor candidates as the choice of the sensor technology for IBL is yet to be made. The analog front-end is based on a two-stage architecture with a pre-amplifier AC-coupled to a second stage of amplification. It features a leakage current compensation circuit, local 4-bit pre-amplifier feedback tuning and a discriminator locally adjusted by 5 configuration bits. More information on the analog front-end, test results as well as its behavior after irradiation is given in [6].

<sup>1</sup>Corresponding author: barbero@physik.uni-bonn.de

<sup>2</sup>Visitor from Laboratoire de l'Accélérateur Linéaire, Orsay, FR

*Preprint submitted to Nuclear Instruments and Methods A October 21, 2010*

But the main improvement brought to the FE-I4 with respect to the FE-I3, and the essential reason that triggered a redesign, is on the side of the digital architecture. Taking advantage of the smaller feature size (130 nm for FE-I4 versus 250 nm for FE-I3), a more complex digital architecture can now be implemented while still reducing the pixel size from $50 \times 400\ \mu m^2$ in FE-I3 to $50 \times 250\ \mu m^2$ in FE-I4. The new digital organization scheme is based on a 4-pixel unit called Pixel Digital Region (PDR) allowing for local storage of hits in 5-deep data buffers at pixel level for the duration of the first level trigger (LVL1) latency. This local storage helps overcome the limitations of the current ATLAS pixel chip FE-I3 at high hit rates. The PDR-based digital architecture allows for a sharing of resources at the 4-pixel level, which leads to a power-efficient design and saves space. Physics-based simulations have shown that this structure is also adapted to the clustered nature of real pixel hits, and gives low recording inefficiency. Finally, this structure reduces the problem of time-walk by using the property of pixel hit clusters related to the spatial proximity of pixels recording small charges (thus exhibiting time-walk) to pixels with large electron collection. More details concerning the 4-pixel region and more information on the advantages of such architecture are given in reference [7].
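The local storage principle can be illustrated with a small behavioral toy model. This is only a sketch under simplifying assumptions, not the FE-I4 logic itself: the class name `PixelDigitalRegion`, the `clock` method, the representation of a hit as a 4-bit pattern and the latency value used in the example are all hypothetical. Only the buffer depth of 5 and the store-then-read-or-erase behavior at the end of the LVL1 latency are taken from the text.

```python
from dataclasses import dataclass, field

N_BUFFERS = 5  # buffer depth per 4-pixel region, as quoted in the text


@dataclass
class PixelDigitalRegion:
    """Toy model of one 4-pixel region with local hit storage (hypothetical)."""
    latency: int                                 # LVL1 latency in bunch crossings (assumed)
    buffers: list = field(default_factory=list)  # entries are [age, hit_pattern]
    lost: int = 0                                # hits dropped because all buffers were busy

    def clock(self, hit_pattern=0, trigger=False):
        """Advance one bunch crossing; return the hit pattern read out, or None."""
        for entry in self.buffers:
            entry[0] += 1
        readout = None
        # At most one entry is stored per crossing, so the first entry is the oldest.
        if self.buffers and self.buffers[0][0] == self.latency:
            _, pattern = self.buffers.pop(0)
            if trigger:
                readout = pattern  # triggered: the hit leaves the region for readout
            # not triggered: the hit is erased locally and never moves to the periphery
        if hit_pattern:
            if len(self.buffers) < N_BUFFERS:
                self.buffers.append([0, hit_pattern])
            else:
                self.lost += 1     # all 5 buffers occupied: recording inefficiency
        return readout
```

In this model a hit only leaves the region if a trigger arrives exactly one latency after it was stored, which is the point of the scheme: untriggered hits, the vast majority, never need to be transferred to the periphery.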

## 2. Material reduction for the inner pixel layers and consequences for FE-I4 design

Physics studies considering the improvements brought by the IBL to tracking, vertexing and b-tagging [1] have again underlined the importance not only of a smaller pixel granularity, but also of reducing material in the innermost layers. The thickness of the current pixel detector in units of radiation length is 2.7% $x/X_0$ per layer. The target design for the IBL is 1.5% $x/X_0$. A lot of emphasis is thus put on material reduction (see [8]), which has direct consequences on the design of the FE-I4:

- In order to enhance the active area to inactive area ratio of the front-end, it is beneficial to go to a smaller feature size, which leads to a reduction of the size of the periphery of the IC (from 2.8 mm for FE-I3 to 2 mm for FE-I4), but also to design a big IC: The low resistivity metal layers of the process used allow for an efficient power distribution in the front-end despite approximately doubling the length of the pixel column from FE-I3 to FE-I4. Aiming for a big chip size also has benefits from the module design point of view and leads to more integrated concepts. The $20 \times 19$ mm<sup>2</sup> FE-I4 chip we have finally opted for is close to the maximum reticle size allowed by the vendor. It should be noted that the main cost driver for the current ATLAS pixel module is the flip-chipping cost, and that this cost scales with the number of ICs to handle. Hence big ICs will lead to cost reduction too, particularly when thinking of FE-I4 for large area layers for the second phase of upgrade. But it should be stressed that the design of a big front-end chip is in practice a challenge, not only for architectural aspects (to name a few, power distribution, start-up conditions, clock distribution, signal integrity) but also in designing for high yield and in making efficient use of software tools (exponential increase of the resources needed for the simulation of the big chip, increased complexity of the verification procedures, ...).
- The feature size reduction also allows putting more digital complexity in less area, thus integrating more digital functionality in the IC, and, with respect to the current ATLAS pixel module, leads to a simpler module design with no Module Control Chip at the module level.
- Thinning down the front-end chips to less than 100  $\mu$ m thickness: This action has no direct consequences on FE-I4 design, but has consequences on the mechanical handling of the IC.
- Thinning down the sensor is a line of thought of the ATLAS planar silicon sensor community. As thinned down sensors show signal to noise ratios after irradiation rather similar to sensors of standard thickness, it is beneficial to work with a reduced sensor thickness right from the start. In order to be able to exploit the smaller sensor signal (whether it comes from thinner sensors or from irradiated samples), one has to strive for the lowest attainable discriminator threshold level on the front-end side. Noise in the analog front-end prototype was measured to be low enough to achieve good performance with rather low input signals. In FE-I4A, effort has been put into decoupling the analog pixel array from the digital activity by having a proper powering scheme with different power domains, by allowing the use of various shield configurations in the FE-I4A test chip and by making use of a deep n-well option provided by the vendor to isolate the synthesized digital logic from the analog front-end. The final level of the threshold that can be set during operation of the full-size FE-I4 chip is hard to estimate from simulations because of the manifold sources of cross-coupling induced noise in a complex chip. An answer to this question of the lowest attainable threshold can only be reached by testing sessions with the big test IC bump-bonded to the various sensor flavors, in a setup as close as possible to the final environment from the point of view of the system aspects.

- The main contributors to the power dissipation in the pixel detector are the sensor (leakage current after irradiation for silicon sensors) and the IC. In order to reduce the cooling needs in the IBL and the associated massive cooling pipes, additional effort has been put into reducing FE-I4 power both in the analog domain and the digital domain. In the analog domain, the use of an AC-coupled 2-stage architecture allows the power budget to be kept at the 10 $\mu$A/pixel level. In the digital domain, clock gating techniques are used, and the sharing of digital logic resources that comes with the 4-pixel region scheme also leads to a power efficient design. Furthermore, the use of a digital discriminator in the region makes it possible to recover analog time-walk, which in turn permits a reduction of the current in the analog pixel.
- Besides the efforts of designing the FE-I4 for lower power, new powering schemes will also be tested with the FE-I4A, as a new powering system needs to be put in place for the upgraded ATLAS tracker to answer the problems of cable congestion and high inefficiency of individual module powering schemes for the 130 nm CMOS technology with its low voltage. For this purpose, 2 Shunt-LDO regulators [9] have been implemented that can be used for serial powering schemes, as well as a DC-DC converter. In both schemes, the driving idea is to bring power to the inner detector at higher voltage but with reduced currents, which lowers power losses in the cables, reduces the associated cable cross-section and enhances power efficiency [10].
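The argument behind both powering schemes in the last item is the quadratic dependence of resistive cable losses on current. The sketch below makes this explicit for an idealized, lossless regulation stage. The function name `cable_loss_w` and all numerical values (6 W load, 1.5 V and 12 V supply voltages, 1 Ω cable) are illustrative assumptions and are not FE-I4 or IBL specifications.

```python
def cable_loss_w(load_power_w, supply_voltage_v, cable_resistance_ohm):
    """Resistive loss in the supply cable, P_loss = I^2 * R with I = P / V.

    Assumes an ideal (lossless) conversion stage at the detector end.
    """
    current_a = load_power_w / supply_voltage_v
    return current_a ** 2 * cable_resistance_ohm


# Illustrative numbers only: the same 6 W delivered over the same 1 ohm cable.
low_voltage_loss = cable_loss_w(6.0, 1.5, 1.0)    # direct low-voltage powering, 4 A
high_voltage_loss = cable_loss_w(6.0, 12.0, 1.0)  # 8x higher voltage, 0.5 A
```

Raising the delivery voltage by a factor of 8 reduces the cable current by the same factor and the cable loss by a factor of 64, which is why either fewer losses or thinner cables (or a mix of both) can be obtained.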

## 3. FE-I4A periphery



Figure 1: Overview of the FE-I4A IC

An overview of the FE-I4A is provided in Fig. 1 with focus on the periphery of the IC. The periphery of FE-I4A consists of blocks which were designed in the various collaboration institutes. Some blocks will be needed for the standard operation of the FE-I4 for IBL and have been designed to require only minor modification when going from the FE-I4A version to the final FE-I4 for IBL version. But FE-I4A also includes test features that provide extra functionality to debug the prototype chip or give more testing flexibility. A selection of the main blocks needed for the final FE-I4 is listed below:

- The End of Digital Column is a simple interface between the digital pixel array and the End of Chip Logic block. Logic based on a yield enhancing triplicated token passing scheme from Double-Column to Double-Column handles a priority scheme for pixel data readout to the End of Chip Logic block.

- The End of Chip Logic block consists of a control block which broadcasts the Level 1 Trigger information to the digital regions and organizes data output from the pixel array. The pixel data reaching the End of Chip Logic block is Hamming coded (e.g. see [11]), again for yield enhancement. After decoding, it is reformatted to achieve data bandwidth reduction but also to fit an 8-bit word-based scheme convenient for the next step of encoding (see Data Output Block below). The data are then stored in an asynchronous FIFO before being processed by the Data Output Block.
- The Data Output Block takes the pixel data stored in the End of Chip Logic FIFO and provides 8b10b encoding [12] before streaming out the encoded data at 160 Mb/s. The default output format for the FE-I4 data is 8b10b encoded, which should provide a data stream with proper engineering properties off-detector and in turn should ease clock reconstruction from the data stream. The data streamed out can be either one of the 8b10b commas corresponding to the *Start of Frame*, *End of Frame* or *Idle*, or one of five 30-bit based data words: *Data Header* (including trigger and bunch-crossing ID), *Data Record* (2-pixel data with column address, row address and 2 Time-over-Threshold records), *Address Record* (global register or pixel shift register read back address), *Value Record* (content of either global register or pixel shift register) or *Service Record* (an error message).
- The Clock Generator Block, based on a Phase Locked Loop (PLL) [13], is needed to generate the proper 160 MHz clock for single edge data serialization at 160 Mb/s from the 40 MHz FE-I4A input clock. It has to be remembered that the FE-I4 for IBL will be operated inside the current ATLAS pixel detector, and as such it needs to fit the constraint of adapting to the current system, an input 40 MHz clock being one of these requirements.
- The command decoder accepts input at 40 Mb/s. The fastest decoded command is the Level 1 Trigger request, which is then broadcast to trigger counters in the periphery as well as to the digital pixel region in the array. After decoding, global configuration is stored in a random access memory bank of 32 16-bit triple-redundant SEU-hard cells based on a modified DICE latch [14]. Local pixel configuration is streamed out to one or several out of 40 Double-Column based 672-bit long shift registers and latched to one or several out of 13 pixel local latches. The FE-I4 command decoder shows some similarity with the command decoder used in the current ATLAS pixel Module Control Chip, for which more information can be found in [15].

- Digital to Analog Converters (DACs) can also be found on the periphery, and are used either for the biasing of some peripheral elements, or to generate the bias needed at the analog pixel level.
- The Efuse section of the periphery is dedicated to non volatile storage of information. Efuses can be burnt, allowing to record information belonging to a specific die which the user is never supposed to change again, e.g. chip ID number, or the choice of one of the two redundant shift registers for each double-column which should be tested early on and burnt in (a yield enhancing feature).
- LVDS receiver and transmitter have been designed to cope with the bandwidth requirements of FE-I4A inputs / outputs. The receiver is based on a complementary independent comparator input pair design, one fitting the high common mode input voltage range, the other the low voltage range, to allow for rail to rail operation. The transmitter is based on a standard architecture adapted to the small supply voltage of the 130 nm node. The tristate output signal current is configurable from 0.6 mA to 3.0 mA. More information concerning the FE-I4A LVDS circuits can be found in [10].
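As an illustration of the Hamming coding mentioned for the data reaching the End of Chip Logic block, the following sketch implements the textbook Hamming(7,4) code, which corrects any single bit error in a 7-bit codeword. The text does not state which Hamming code or word length is used in FE-I4A, so the (7,4) choice, the bit ordering and the function names `hamming74_encode` and `hamming74_decode` are assumptions made purely for illustration.

```python
def hamming74_encode(nibble):
    """Encode 4 data bits into a 7-bit codeword (list of bits, positions 1..7)."""
    c = [0] * 8  # index 0 unused so that list indices match bit positions
    c[3], c[5], c[6], c[7] = [(nibble >> i) & 1 for i in range(4)]
    c[1] = c[3] ^ c[5] ^ c[7]  # parity over positions with bit 0 of the index set
    c[2] = c[3] ^ c[6] ^ c[7]  # parity over positions with bit 1 of the index set
    c[4] = c[5] ^ c[6] ^ c[7]  # parity over positions with bit 2 of the index set
    return c[1:]


def hamming74_decode(codeword):
    """Return (data nibble, syndrome); a non-zero syndrome is the corrected position."""
    c = [0] + list(codeword)
    syndrome = 0
    for position in range(1, 8):
        if c[position]:
            syndrome ^= position  # XOR of the positions of all set bits
    if syndrome:
        c[syndrome] ^= 1          # flip the single erroneous bit
    nibble = c[3] | (c[5] << 1) | (c[6] << 2) | (c[7] << 3)
    return nibble, syndrome
```

A single flipped bit anywhere in the codeword, for example from a defective storage cell or line, is located by the syndrome and corrected, which is the sense in which such coding acts as a yield enhancing feature.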

The blocks succinctly described above are blocks that are needed for standard operation of the FE-I4. In FE-I4A, other blocks and test options are implemented which should enable more testing flexibility. Note that a subset of these blocks or testing options might still be required for the future full-size FE-I4 chip.

- A multipurpose input / output multiplexer (IOMuX) has been implemented. Based on a 3-bit selection mechanism, this is the input / output block to several test features. First, this block feeds a parallel configuration memory, decoupled from the main configuration register bank. This bank is simply serially loaded with bits for a subset of FE-I4A configuration registers without any command overhead. It should be noted that switching from one configuration bank to the other, hence from one chip configuration to another, is provided thanks to these two parallel memory banks, which opens testing possibilities. Second, the pixel shift register and the local pixel configuration can also be loaded with this mechanism. Third, the IOMuX provides the path for Efuse programming (see above). Finally, it also provides input / output to the scan chain testing of the Data Output Block, the Command Decoder and the End of Chip Logic blocks, used for digital test purposes. More information on digital testing strategy is provided in [16].

- There is a secondary means to provide the Level 1 Trigger pulse without command overhead through a CMOS input pad.
- For testing purposes, 8b10b encoding mode can be switched off, which will be important to loop back the clock through the data path and will allow tuning of transmission delays at the global system level.
- A new mode of operation with respect to the previous generation ATLAS pixel FE-I3 chip is provided as a "stop mode", which should allow the user to freeze the sending of the clock to the digital array at a certain moment in time. The user can then precisely control the triggering and clocking sequence in the digital array, and effectively read out sequentially all hits recorded during the programmable Level 1 Trigger latency.
- The chip can also be programmed to operate in a self-triggering mode based on the global Hit-OR signal (global OR of all un-masked pixel comparator output).
- In FE-I4A, it was chosen not to hard-wire any specific powering scheme. In the periphery of the IC, there is a DC-DC charge pump converter as well as 2 Shunt-LDO devices, which might be used for serially powered connection schemes but could also be used in a pure Low Drop-Out voltage regulation mode. The choice of a specific powering configuration is left to the user, who can use different wire bonding schemes to connect the power domains in various ways, allowing for high testability.

- At the bottom of the IC, 135 first row pads can be found. Approximately 50 of these pads are used to feed the 7 different power domains of FE-I4A: 2 analog domains, 2 digital domains, a domain used for the Clock Generator Block, one used for the Efuse programming and the last for the T3 isolation well connection. The other pads are either the standard chip inputs / outputs, or provide access to the diverse test features, to various test points, or to bias read and / or overwrite. There are also 9 pads located in a second row (related to one of the two Shunt-LDO devices) and finally 86 test pads located at the very top of the FE-I4A array. A snapshot of FE-I4A layout is shown in Fig. 2.



Figure 2: Snapshot of the layout of the FE-I4A IC
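The dimensions quoted in this paper can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers given in the text (array of 80×336 pixels, $50 \times 250\ \mu m^2$ pixels, 4-pixel regions, 40 Double-Columns, a nominal $20 \times 19$ mm<sup>2</sup> chip). It assumes that the 250 µm pixel side runs across the columns and the 50 µm side along them, and the function name `array_summary` is hypothetical.

```python
N_COLUMNS, N_ROWS = 80, 336          # pixel array size
PITCH_X_UM, PITCH_Y_UM = 250, 50     # assumed: 250 um across columns, 50 um along a column
CHIP_W_MM, CHIP_H_MM = 20, 19        # nominal chip size quoted in the text


def array_summary():
    """Derive array dimensions and the active area fraction from the quoted numbers."""
    active_um2 = (N_COLUMNS * PITCH_X_UM) * (N_ROWS * PITCH_Y_UM)
    return {
        "array_width_mm": N_COLUMNS * PITCH_X_UM / 1000,
        "column_length_mm": N_ROWS * PITCH_Y_UM / 1000,
        "pixels": N_COLUMNS * N_ROWS,
        "four_pixel_regions": N_COLUMNS * N_ROWS // 4,
        "double_columns": N_COLUMNS // 2,
        "pixels_per_double_column": 2 * N_ROWS,
        "active_fraction": active_um2 / (CHIP_W_MM * CHIP_H_MM * 1_000_000),
    }


summary = array_summary()
```

Under these assumptions the 16.8 mm column length plus the 2 mm periphery gives 18.8 mm, consistent with the nominal 19 mm chip height, and each Double-Column holds 672 pixels, matching the 672-bit long shift registers. The active area amounts to roughly 88% of the nominal chip area.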

## 4. Conclusion: From the FE-I4A to a final FE-I4 for the IBL

The FE-I4A chip will be extensively tested at the various participating laboratories during fall 2010 (a user guide to FE-I4A is provided in [17]). A USB-based test system and a common testing platform have been developed in collaboration between the University of Bonn (Germany), CERN, the University of Göttingen (Germany) and the Lawrence Berkeley National Laboratory (USA), and are described in detail in [18]. As soon as basic understanding of the functionalities of the IC has been confirmed and confidence has been established in the testing procedure, a few wafers will be tested and sent out for bump-bonding to various sensor candidates [19]. Testing of the FE-I4A bump-bonded to sensors, in the laboratories, under radiation and in test beams, will take a big fraction of 2011, but will lead to the definition of a sensor baseline for the IBL as well as to refinements of the FE-I4 design if needed.

Implementation of a few extra blocks and minor modifications to the FE-I4A are already scheduled in the community. To name a few, there will be the need to design a 10-bit Analog to Digital Converter for in-situ monitoring of various analog signals or for monitoring of the temperature (these temperature measurement circuits need to be designed too), or to design a circuit which compensates the temperature dependence of the global threshold. On the digital side, it might be useful to allow the programming of an event size limit above which events would be truncated (these events being mostly related to unphysical processes and having the potential to seriously hamper the data acquisition), or to allow the possibility of coding a maximum allowed cluster size (suppression of beam halo tracks in the z direction).

After the design of the final FE-I4 for the IBL, the FE-I4 will also find applications for the HL-LHC upgrade phase of the ATLAS pixel detector, in particular for what concerns the outer pixel layers. For this purpose a re-tuning of the FE-I4 design will take place having in mind the requirements of these outer layers: A higher cost efficiency, the development of new module concepts and the need to make maximum use of the relatively lower data output bandwidth at these higher radii.

- [1] CERN-LHCC-2010-013 / ATLAS-TDR-019, "Insertable B-Layer, Technical Design Report"
- [2] The ATLAS Collaboration, "ATLAS pixel detector electronics and sensor", JINST 3, P07007 (2008)
- [3] The ATLAS Collaboration, "The ATLAS experiment at the Large Hadron Collider", JINST 3, S08003 (2008)
- [4] D. Arutinov *et al.*, "Digital Architecture and Interface of the new ATLAS Pixel Front-End IC for Upgraded Luminosity", IEEE Trans. Nucl. Sci. 56, 2 (2009)
- [5] A. Mekkaoui *et al.*, "FE-I4proto1 user guide", internal document ATLAS pixel upgrade collaboration.
- [6] M. Garcia-Sciveres *et al.*, "The FE-I4 Pixel Readout Integrated Circuit", Proceedings of the Seventh International Hiroshima Symposium on the Development and Application of Semiconductor Tracking Detectors, Aug. 29 - Sept. 1 2009, Hiroshima, Japan.
- [7] M. Barbero *et al.*, "FE-I4 ATLAS Pixel Chip Design", Proceedings of Science (Vertex 2009 Conference in Veluwe, The Netherlands, Sept 13-18 2009) 027.
- [8] L. Gonella *et al.*, "Towards minimum material trackers for high energy physics experiments at upgraded luminosities", these proceedings.
- [9] M. Karagounis *et al.*, "An Integrated Shunt-LDO Regulator for Serial Powered Systems", Proceedings of the 35th European Solid-State Circuits Conference, 2009
- [10] M. Karagounis *et al.*, "Development of the ATLAS FE-I4 pixel readout IC for b-layer Upgrade and Super-LHC", proceedings of TWEPP 2008. CERN-2008-008, 2008. 6pp. Published in Naxos 2008, Electronics for particle physics, 70-75
- [11] T. Moon, "Error Correction Coding: Mathematical Methods and Algorithms", Wiley 2005 (ISBN 0-471-64800-0)
- [12] A. Widmer, P. Franaszek, "A DC-Balanced, Partitioned-Block, 8B/10B Transmission Code", IBM Journal of Research and Development 27 (5), 440 (1983)
- [13] A. Kruth *et al.*, "Charge Pump Clock Generation PLL for the Data Output Block of the Upgraded ATLAS Pixel Front-End in 130 nm CMOS", proceedings TWEPP09
- [14] M. Menouni *et al.*, "Design and measurements of SEU tolerant latches", in Naxos 2008, Electronics for particle physics, CERN-2008-008: 402-405 (2008)
- [15] R. Beccherle *et al.*, "MCC: The Module Controller Chip for the ATLAS Pixel Detector", Nucl. Instrum. Meth. A 492, 117-133 (2002)
- [16] V. Zivkovic *et al.*, "The Design for Test Architecture in Digital Section of the ATLAS FE-I4 Chip", to be submitted to proceedings of TWEPP 2010 conference.
- [17] FE-I4 Collaboration, "The FE-I4A Integrated Circuit Guide", 2010
- [18] M. Backhaus *et al.*, "Development of a versatile and modular test system for ATLAS hybrid pixel detectors", these proceedings.
- [19] F. Hügging *et al.*, "The Insertable B-Layer project", these proceedings.